xml-conduit-1.2.3: Pure-Haskell utilities for dealing with XML with the conduit package.

Safe Haskell: None
Language: Haskell98

Text.XML.Cursor

Description

This module provides for simple DOM traversal. It is inspired by XPath. There are two central concepts here:

  • A Cursor represents a node in the DOM. It also contains information on the node's location. While the Node datatype will only know of its children, a Cursor knows about its parent and siblings as well. (The underlying mechanism allowing this is called a zipper, see http://www.haskell.org/haskellwiki/Zipper and http://www.haskell.org/haskellwiki/Tying_the_Knot.)
  • An Axis, in its simplest form, takes a Cursor and returns a list of Cursors. It is used for selections, such as finding children, ancestors, etc. Axes can be chained together to express complex rules, such as all children named foo.

The terminology used in this module is taken directly from the XPath specification: http://www.w3.org/TR/xpath/. For those familiar with XPath, the one major difference is that attributes are not considered nodes in this module.
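As a minimal sketch of the two concepts above, the following builds a Cursor from a parsed document and applies a chained Axis to it. The sample XML and the names doc, root and fooTexts are made up for illustration; parseText_ and def are assumed to come from Text.XML in the same package.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import Text.XML (Document, def, parseText_)
import Text.XML.Cursor (Cursor, content, element, fromDocument, ($/), (&/))

-- Hypothetical sample document, written without whitespace between tags
-- so that no whitespace-only text nodes appear among the children.
doc :: Document
doc = parseText_ def "<root><foo>one</foo><bar>two</bar><foo>three</foo></root>"

-- A Cursor pointing at the <root> element.
root :: Cursor
root = fromDocument doc

-- "All children named foo", followed by the text nodes directly inside them.
fooTexts :: [Text]
fooTexts = root $/ element "foo" &/ content

main :: IO ()
main = print fooTexts -- prints ["one","three"]
```

The <bar> child is skipped because element "foo" only lets through elements with that exact name.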

Data types

type Cursor = Cursor Node

A cursor: contains an XML Node and pointers to its children, ancestors and siblings.

type Axis = Cursor -> [Cursor]

The type of an Axis that returns a list of Cursors. They are roughly modeled after http://www.w3.org/TR/xpath/#axes.

Axes can be composed with >=>, where e.g. f >=> g means that on all results of the f axis, the g axis will be applied, and all results joined together. Because Axis is just a type synonym for Cursor -> [Cursor], it is possible to use other standard functions like >>= or concatMap similarly.

The operators &|, &/, &// and &.// combine two axes. The right-hand side is applied, respectively, to the results of the first axis themselves, to their children, to their descendants, or to the results together with their descendants.

The operators $|, $/, $// and $.// apply an axis (the right-hand side) to a cursor (the left-hand side). The axis is applied, respectively, to the cursor itself, to its children, to its descendants, or to the cursor together with its descendants.

Note that many of these operators also work on generalised Axes that can return lists of something other than Cursors, for example Content elements.
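The equivalence between >=>, >>=, concatMap and the module's own operators can be sketched as follows. Each definition selects the grandchildren of the root and returns their local names. The sample XML and the helper localName are hypothetical and not part of the library.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import Text.XML (Element (..), Name (..), Node (..), def, parseText_)
import Text.XML.Cursor

root :: Cursor
root = fromDocument (parseText_ def "<a><b><c/><d/></b><b><c/></b></a>")

-- Hypothetical helper: a generalised axis that yields the local name of
-- the cursor's node if it is an element, and nothing otherwise.
localName :: Cursor -> [Text]
localName c = case node c of
  NodeElement e -> [nameLocalName (elementName e)]
  _             -> []

viaKleisli, viaBind, viaConcatMap, viaOperator :: [Text]
-- Kleisli composition in the list monad.
viaKleisli   = (child >=> child >=> localName) root
-- The same chain written with the list monad's bind.
viaBind      = child root >>= child >>= localName
-- The same chain written with concatMap.
viaConcatMap = concatMap localName (concatMap child (child root))
-- Children of the root, then localName on the children of each result.
viaOperator  = root $/ anyElement &/ localName

main :: IO ()
main = mapM_ print [viaKleisli, viaBind, viaConcatMap, viaOperator]
-- each line prints ["c","d","c"]
```

Because localName returns [Text] rather than [Cursor], this also shows the operators working on a generalised axis.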

Production

fromDocument :: Document -> Cursor

Convert a Document to a Cursor. It will point to the document root.

fromNode :: Node -> Cursor

Convert a Node to a Cursor (without parents).

cut :: Cursor -> Cursor

Cut a cursor off from its parent. The idea is to allow restricting the scope of queries on it.
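The effect of cut and fromNode on upward navigation can be sketched by counting ancestors. The sample XML and all names below are made up for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Text.XML (def, parseText_)
import Text.XML.Cursor

root :: Cursor
root = fromDocument (parseText_ def "<a><b><c/></b></a>")

-- Ancestors of <c> when reached from the document root: <b> and <a>.
depthInDocument :: [Int]
depthInDocument = [length (ancestor c) | c <- root $// element "c"]

-- The same <c>, reached through a <b> cursor that was cut off from <a>.
-- Only <b> remains visible as an ancestor.
depthAfterCut :: [Int]
depthAfterCut =
  [length (ancestor c) | b <- root $/ element "b", c <- cut b $/ element "c"]

-- A cursor rebuilt from a bare Node with fromNode has no parent at all.
parentsOfDetached :: Int
parentsOfDetached =
  length [p | c <- root $// element "c", p <- parent (fromNode (node c))]

main :: IO ()
main = do
  print depthInDocument   -- prints [2]
  print depthAfterCut     -- prints [1]
  print parentsOfDetached -- prints 0
```

Queries that walk upward from a cut cursor stop at the cut point, which is what restricts their scope.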

Axes

parent :: Axis node

The parent axis. As described in XPath: the parent axis contains the parent of the context node, if there is one.

Every node but the root element of the document has a parent. Parent nodes will always be NodeElements.

precedingSibling :: Axis node

The preceding-sibling axis. XPath: the preceding-sibling axis contains all the preceding siblings of the context node [...].

followingSibling :: Axis node

The following-sibling axis. XPath: the following-sibling axis contains all the following siblings of the context node [...].

child :: Cursor node -> [Cursor node]

The child axis. XPath: the child axis contains the children of the context node.

node :: Cursor node -> node

The current node.

preceding :: Axis node

The preceding axis. XPath: the preceding axis contains all nodes in the same document as the context node that are before the context node in document order, excluding any ancestors and excluding attribute nodes and namespace nodes.

following :: Axis node

The following axis. XPath: the following axis contains all nodes in the same document as the context node that are after the context node in document order, excluding any descendants and excluding attribute nodes and namespace nodes.

ancestor :: Axis node

The ancestor axis. XPath: the ancestor axis contains the ancestors of the context node; the ancestors of the context node consist of the parent of context node and the parent's parent and so on; thus, the ancestor axis will always include the root node, unless the context node is the root node.

descendant :: Axis node

The descendant axis. XPath: the descendant axis contains the descendants of the context node; a descendant is a child or a child of a child and so on; thus the descendant axis never contains attribute or namespace nodes.

orSelf :: Axis node -> Axis node

Modify an axis by adding the context node itself as the first element of the result list.
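A few of the axes above can be sketched on a small tree. The sample XML and the helper names are hypothetical; names only exists here to make the results printable.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import Text.XML (Element (..), Name (..), Node (..), def, parseText_)
import Text.XML.Cursor

root :: Cursor
root = fromDocument (parseText_ def "<r><a><x/></a><b/><c/></r>")

-- Hypothetical helper: local names of all element cursors in a list.
names :: [Cursor] -> [Text]
names cs = [nameLocalName (elementName e) | c <- cs, NodeElement e <- [node c]]

-- Siblings that come after <a>.
afterA :: [Text]
afterA = names (root $/ element "a" >=> followingSibling)

-- Everything below the root, in document order.
belowRoot :: [Text]
belowRoot = names (descendant root)

-- The same axis with the context node prepended by orSelf.
rootAndBelow :: [Text]
rootAndBelow = names (orSelf descendant root)

-- Ancestors of <x>, nearest first.
aboveX :: [Text]
aboveX = names (root $// element "x" >=> ancestor)

main :: IO ()
main = do
  print afterA       -- prints ["b","c"]
  print belowRoot    -- prints ["a","x","b","c"]
  print rootAndBelow -- prints ["r","a","x","b","c"]
  print aboveX       -- prints ["a","r"]
```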

Filters

check :: Boolean b => (Cursor -> b) -> Axis

Keep only the cursors that pass the check.

checkNode :: Boolean b => (Node -> b) -> Axis

Keep only the cursors whose node passes the check.

checkElement :: Boolean b => (Element -> b) -> Axis

Keep only the elements that pass the check. Non-element nodes are always removed.

checkName :: Boolean b => (Name -> b) -> Axis

Keep only the elements whose name passes the check. Non-element nodes are always removed.

anyElement :: Axis

Remove all non-elements. Compare roughly to XPath: A node test * is true for any node of the principal node type. For example, child::* will select all element children of the context node [...].

element :: Name -> Axis

Select only those elements with a matching tag name. XPath: A node test that is a QName is true if and only if the type of the node (see [5 Data Model]) is the principal node type and has an expanded-name equal to the expanded-name specified by the QName.

laxElement :: Text -> Axis

Select only those elements with a loosely matching tag name. Namespace and case are ignored. XPath: A node test that is a QName is true if and only if the type of the node (see [5 Data Model]) is the principal node type and has an expanded-name equal to the expanded-name specified by the QName.

content :: Cursor -> [Text]

Select only text nodes, and directly give the Content values. XPath: The node test text() is true for any text node.

Note that this is not strictly an Axis, but will work with most combinators.

attribute :: Name -> Cursor -> [Text]

Select attributes on the current element (or nothing if it is not an element). XPath: the attribute axis contains the attributes of the context node; the axis will be empty unless the context node is an element.

Note that this is not strictly an Axis, but will work with most combinators.

Each element of the returned list is the full value of one matching attribute, as a Text.

laxAttribute :: Text -> Cursor -> [Text]

Select attributes on the current element (or nothing if it is not an element). Namespace and case are ignored. XPath: the attribute axis contains the attributes of the context node; the axis will be empty unless the context node is an element.

Note that this is not strictly an Axis, but will work with most combinators.

Each element of the returned list is the full value of one matching attribute, as a Text.

hasAttribute :: Name -> Axis

Select only those element nodes with the given attribute.

attributeIs :: Name -> Text -> Axis

Select only those element nodes containing the given attribute key/value pair.
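The filters above can be sketched against a small document. The sample XML and the result names are hypothetical; note that the second element is deliberately spelled <Item> to show the difference between element and laxElement.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import Text.XML (Name (..), def, parseText_)
import Text.XML.Cursor

root :: Cursor
root = fromDocument (parseText_ def
  "<items><item id=\"1\" kind=\"fruit\">apple</item><Item id=\"2\">pear</Item><note>n</note></items>")

-- Exact name match: only <item> qualifies.
exactIds :: [Text]
exactIds = root $/ element "item" >=> attribute "id"

-- Case-insensitive match: <item> and <Item> both qualify.
laxIds :: [Text]
laxIds = root $/ laxElement "item" >=> attribute "id"

-- Text of the children that carry a kind attribute.
withKind :: [Text]
withKind = root $/ hasAttribute "kind" &/ content

-- Text of the child whose id attribute equals "2".
secondItem :: [Text]
secondItem = root $/ attributeIs "id" "2" &/ content

-- Text of every child element that is not named note.
notNotes :: [Text]
notNotes = root $/ checkName (\n -> nameLocalName n /= "note") &/ content

main :: IO ()
main = do
  print exactIds   -- prints ["1"]
  print laxIds     -- prints ["1","2"]
  print withKind   -- prints ["apple"]
  print secondItem -- prints ["pear"]
  print notNotes   -- prints ["apple","pear"]
```

Since attribute and content return [Text] rather than [Cursor], they can only appear as the last step of a chain.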

Operators

(&|) :: (Cursor node -> [a]) -> (a -> b) -> Cursor node -> [b] infixr 1

Apply a function to the result of an axis.

(&/) :: Axis node -> (Cursor node -> [a]) -> Cursor node -> [a] infixr 1

Combine two axes so that the second works on the children of the results of the first.

(&//) :: Axis node -> (Cursor node -> [a]) -> Cursor node -> [a] infixr 1

Combine two axes so that the second works on the descendants of the results of the first.

(&.//) :: Axis node -> (Cursor node -> [a]) -> Cursor node -> [a] infixr 1

Combine two axes so that the second works on both the result nodes, and their descendants.

($|) :: Cursor node -> (Cursor node -> a) -> a infixr 1

Apply an axis to a Cursor node.

($/) :: Cursor node -> (Cursor node -> [a]) -> [a] infixr 1

Apply an axis to the children of a Cursor node.

($//) :: Cursor node -> (Cursor node -> [a]) -> [a] infixr 1

Apply an axis to the descendants of a Cursor node.

($.//) :: Cursor node -> (Cursor node -> [a]) -> [a] infixr 1

Apply an axis to a Cursor node as well as its descendants.

(>=>) :: Monad m => (a -> m b) -> (b -> m c) -> a -> m c infixr 1

Left-to-right Kleisli composition of monads.
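The difference between the four application operators can be sketched with a document in which every element has the same name, so that only the scope of the query changes the result. The sample XML and the result names are made up for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Text.XML (def, parseText_)
import Text.XML.Cursor

-- Three nested elements, all named a.
root :: Cursor
root = fromDocument (parseText_ def "<a><a><a/></a></a>")

onSelf, onChildren, onDescendants, onSelfAndDescendants :: Int
onSelf               = length (root $|   element "a") -- the root only
onChildren           = length (root $/   element "a") -- direct children
onDescendants        = length (root $//  element "a") -- everything below
onSelfAndDescendants = length (root $.// element "a") -- root and everything below

-- (&|) maps a plain function over the results of an axis:
-- here, the number of ancestors of each matching descendant.
depths :: [Int]
depths = root $// element "a" &| length . ancestor

main :: IO ()
main = do
  print [onSelf, onChildren, onDescendants, onSelfAndDescendants] -- prints [1,1,2,3]
  print depths                                                    -- prints [1,2]
```

All of these operators are declared infixr 1, so a chain such as root $// element "a" &| f groups to the right and needs no parentheses.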

Type classes

class Boolean a where

Something that can be used in a predicate check as a boolean.

Methods

bool :: a -> Bool

Instances
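As a sketch of how the class is used, the following defines a Boolean instance for a hypothetical Verdict type and passes a Verdict-returning predicate to check. Verdict, hasText and the sample XML are assumptions made for illustration and are not part of the library.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import Text.XML (Element (..), Name (..), Node (..), def, parseText_)
import Text.XML.Cursor

-- Hypothetical result type for a predicate.
data Verdict = Keep | Drop

-- Tell the filters how to read a Verdict as a Bool.
instance Boolean Verdict where
  bool Keep = True
  bool Drop = False

root :: Cursor
root = fromDocument (parseText_ def "<r><a>x</a><b/><c>y</c></r>")

-- Keep a cursor only if it has at least one text child.
hasText :: Cursor -> Verdict
hasText c = if null (c $/ content) then Drop else Keep

-- Local names of the root's children that contain text.
withText :: [Text]
withText =
  [ nameLocalName (elementName e)
  | c <- root $/ check hasText
  , NodeElement e <- [node c]
  ]

main :: IO ()
main = print withText -- prints ["a","c"]
```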

Error handling

force :: (Exception e, MonadThrow f) => e -> [a] -> f a

forceM :: (Exception e, MonadThrow f) => e -> [f a] -> f a
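Neither function is described here, so the following is a sketch based on the signatures: it assumes that force returns the first result of a query and throws the given exception when the list is empty, and that forceM does the same for a list of monadic values. It also assumes a MonadThrow instance for Maybe, in which throwing yields Nothing. MissingField and the sample XML are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Exception (Exception)
import Data.Text (Text)
import qualified Data.Text as T
import Text.Read (readMaybe)
import Text.XML (def, parseText_)
import Text.XML.Cursor

-- Hypothetical exception raised when a required field is absent.
data MissingField = MissingField deriving Show
instance Exception MissingField

root :: Cursor
root = fromDocument (parseText_ def "<r><title>Hello</title><n>5</n></r>")

-- The query has a result, so the first one is returned.
title :: Maybe Text
title = force MissingField (root $/ element "title" &/ content)

-- The query is empty, so MissingField is thrown, which is Nothing in Maybe.
subtitle :: Maybe Text
subtitle = force MissingField (root $/ element "subtitle" &/ content)

-- Each result is itself a Maybe Int; forceM returns the first of them.
count :: Maybe Int
count = forceM MissingField (root $/ element "n" &/ content &| readMaybe . T.unpack)

main :: IO ()
main = do
  print title    -- prints Just "Hello"
  print subtitle -- prints Nothing
  print count    -- prints Just 5
```

Running the same queries in a monad such as IO would surface MissingField as a real exception instead of collapsing it to Nothing.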