Safe Haskell | Safe |
---|---|
Language | Haskell98 |
This module combines all representations of a character in XML in one type.
Note that an entity can in principle represent a large text,
so an "XML character" might actually stand for a whole text.
However, the standard entities each consist of a single character.
In contrast to our representation,
HaXml uses Unicode substrings instead of Unicode characters,
which is certainly more efficient for common XML texts
that contain mainly Unicode text and only a few references.
However, our representation is unique,
whereas HaXml's may represent a text either as "abc","def" or as "abcdef".
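The uniqueness argument above can be illustrated with a small, self-contained sketch. The types `XmlChar` and `Chunk`, their constructor names, and the helpers `fromText` and `chunkText` are hypothetical stand-ins chosen for illustration; they are not the definitions used by this module or by HaXml.

```haskell
-- Hypothetical stand-in for the abstract type T; constructor names are assumed.
data XmlChar
  = Unicode Char      -- a literal character
  | CharRef Int       -- a numeric character reference
  | EntityRef String  -- a named entity reference
  deriving (Eq, Show)

-- One list element per character: a text has exactly one representation.
fromText :: String -> [XmlChar]
fromText = map Unicode

-- Hypothetical substring-based representation, for comparison only.
data Chunk = Text String | Ref String
  deriving (Eq, Show)

-- Two different chunk lists that denote the same text.
splitChunks, joinedChunks :: [Chunk]
splitChunks  = [Text "abc", Text "def"]
joinedChunks = [Text "abcdef"]

-- Flatten chunks to plain text, ignoring references for brevity.
chunkText :: [Chunk] -> String
chunkText cs = concat [s | Text s <- cs]
```

In this sketch the two chunk lists compare as unequal although they flatten to the same string, while the per-character lists built in either order are equal.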
Synopsis
- data T
- toUnicode :: T -> Exceptional String Char
- toUnicodeGen :: Map String Char -> T -> Exceptional String Char
- toUnicodeOrFormat :: T -> ShowS
- toUnicodeOrFormatGen :: Map String Char -> T -> ShowS
- fromUnicode :: Char -> T
- fromCharRef :: Int -> T
- fromEntityRef :: String -> T
- maybeUnicode :: T -> Maybe Char
- maybeCharRef :: T -> Maybe Int
- maybeEntityRef :: T -> Maybe String
- isUnicode :: T -> Bool
- isCharRef :: T -> Bool
- isEntityRef :: T -> Bool
- isRef :: T -> Bool
- unicode :: Char -> T
- refC :: Int -> T
- refE :: String -> T
- asciiFromUnicode :: Char -> T
- asciiFromUnicodeGen :: Map Char String -> Char -> T
- minimalRefFromUnicode :: Char -> T
- reduceRef :: T -> T
- reduceRefGen :: Map String Char -> T -> T
- validCharRef :: Int -> Bool
- switchUnicodeRuns :: (String -> a) -> (Int -> a) -> (String -> a) -> [T] -> [a]
Documentation
toUnicode :: T -> Exceptional String Char
If a reference cannot be resolved,
an Exception constructor with an error message is returned.
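The following is a minimal sketch of how such a resolution step might behave. It assumes a hypothetical stand-in type `XmlChar`, uses `Either String Char` in place of `Exceptional String Char`, and works with a deliberately tiny entity table; the error messages and the range check are assumptions, not the library's behaviour.

```haskell
import qualified Data.Map as Map
import Data.Char (chr)

-- Hypothetical stand-in for the abstract type T.
data XmlChar = Unicode Char | CharRef Int | EntityRef String
  deriving (Eq, Show)

-- Small illustrative entity table (assumed, not the module's full table).
entities :: Map.Map String Char
entities = Map.fromList
  [("lt", '<'), ("gt", '>'), ("amp", '&'), ("quot", '"'), ("apos", '\'')]

-- Resolve an XML character, or report why it cannot be resolved.
-- Either String stands in for Exceptional String here.
toUnicodeSketch :: Map.Map String Char -> XmlChar -> Either String Char
toUnicodeSketch _ (Unicode c) = Right c
toUnicodeSketch _ (CharRef n)
  | n >= 0 && n <= 0x10FFFF = Right (chr n)
  | otherwise = Left ("invalid character reference: " ++ show n)
toUnicodeSketch dict (EntityRef name) =
  maybe (Left ("unknown entity: " ++ name)) Right (Map.lookup name dict)
```

Passing the table as an argument mirrors the split between `toUnicode` and `toUnicodeGen` in the synopsis, where the latter takes a `Map String Char`.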
toUnicodeGen :: Map String Char -> T -> Exceptional String Char
toUnicodeOrFormat :: T -> ShowS
If a reference cannot be resolved, then a reference string is returned.
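A sketch of this fallback behaviour, under the same assumptions as before: `XmlChar` is a hypothetical stand-in for `T`, `formatOrResolve` is an invented name, and the exact reference syntax emitted for unresolvable references is assumed to be the usual `&name;` and `&#n;` forms.

```haskell
import qualified Data.Map as Map
import Data.Char (chr)

-- Hypothetical stand-in for the abstract type T.
data XmlChar = Unicode Char | CharRef Int | EntityRef String
  deriving (Eq, Show)

-- Emit the resolved character, or the textual reference if resolution fails.
formatOrResolve :: Map.Map String Char -> XmlChar -> ShowS
formatOrResolve _ (Unicode c) = showChar c
formatOrResolve _ (CharRef n)
  | n >= 0 && n <= 0x10FFFF = showChar (chr n)
  | otherwise = showString "&#" . shows n . showChar ';'
formatOrResolve dict (EntityRef name) =
  case Map.lookup name dict of
    Just c  -> showChar c
    Nothing -> showChar '&' . showString name . showChar ';'

-- Render a whole text by composing the per-character ShowS values.
render :: Map.Map String Char -> [XmlChar] -> String
render dict xs = foldr (formatOrResolve dict) "" xs
```

Returning `ShowS` rather than `String` lets a long text be rendered by function composition without repeated list concatenation.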
fromUnicode :: Char -> T
fromCharRef :: Int -> T
fromEntityRef :: String -> T
isEntityRef :: T -> Bool
asciiFromUnicode :: Char -> T
Convert a Unicode character to an XML character, where the Unicode constructor is used only for ASCII characters. This is achieved by the following decision: If there is an entity reference for the character, use it. If the character is ASCII, represent it as a Char. Otherwise use a character reference.
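The three-way decision described above can be sketched as follows. The type `XmlChar` and the function name `asciiFromUnicodeSketch` are hypothetical; the table argument follows the shape `Map Char String` given for `asciiFromUnicodeGen` in the synopsis, and the two-entry table in the usage note is purely illustrative.

```haskell
import qualified Data.Map as Map
import Data.Char (isAscii, ord)

-- Hypothetical stand-in for the abstract type T.
data XmlChar = Unicode Char | CharRef Int | EntityRef String
  deriving (Eq, Show)

-- Prefer an entity reference, then a plain ASCII character,
-- and fall back to a numeric character reference.
asciiFromUnicodeSketch :: Map.Map Char String -> Char -> XmlChar
asciiFromUnicodeSketch dict c =
  case Map.lookup c dict of
    Just name -> EntityRef name
    Nothing
      | isAscii c -> Unicode c
      | otherwise -> CharRef (ord c)
```

With a table containing only `'<'` mapped to `"lt"` and `'\228'` mapped to `"auml"`, the character `'<'` becomes an entity reference, `'a'` stays a plain character, and `'\8364'` (the euro sign) becomes the character reference 8364.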
minimalRefFromUnicode :: Char -> T
Generate an XML character from a Unicode character
with minimal use of references.
The only references used are the XML entity references
for the characters ', ", &, < and >.
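A minimal sketch of this rule, assuming the hypothetical stand-in type `XmlChar` and the five entity names predefined by XML (`apos`, `quot`, `amp`, `lt`, `gt`). The function name `minimalRefSketch` is invented for illustration.

```haskell
-- Hypothetical stand-in for the abstract type T.
data XmlChar = Unicode Char | CharRef Int | EntityRef String
  deriving (Eq, Show)

-- The five characters that XML predefines entity names for.
predefined :: [(Char, String)]
predefined =
  [('\'', "apos"), ('"', "quot"), ('&', "amp"), ('<', "lt"), ('>', "gt")]

-- Use a reference only where one of the five predefined entities applies;
-- every other character is kept as a literal Unicode character.
minimalRefSketch :: Char -> XmlChar
minimalRefSketch c = maybe (Unicode c) EntityRef (lookup c predefined)
```

Unlike the ASCII-oriented conversion, this sketch leaves non-ASCII characters such as `'\228'` untouched.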
reduceRef :: T -> T
Reduce the use of references. Represent as many characters as possible as Unicode characters, that is, using the Unicode constructor.
reduceRefGen :: Map String Char -> T -> T
Try to convert a reference to the equivalent Unicode character.
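One way to picture this reduction is the sketch below. It assumes the hypothetical stand-in type `XmlChar` and assumes that a reference which cannot be resolved is simply left unchanged; the name `reduceRefSketch` and the range check for character references are illustrative choices, not taken from the library.

```haskell
import qualified Data.Map as Map
import Data.Char (chr)

-- Hypothetical stand-in for the abstract type T.
data XmlChar = Unicode Char | CharRef Int | EntityRef String
  deriving (Eq, Show)

-- Replace a resolvable reference by the character it denotes;
-- keep anything that cannot be resolved as it is.
reduceRefSketch :: Map.Map String Char -> XmlChar -> XmlChar
reduceRefSketch dict x =
  case x of
    CharRef n
      | n >= 0 && n <= 0x10FFFF -> Unicode (chr n)
    EntityRef name -> maybe x Unicode (Map.lookup name dict)
    _ -> x
```

Applied with `map` over a list of XML characters, such a function would normalize a text so that references remain only where the table has no entry.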
validCharRef :: Int -> Bool