Copyright | (c) 2010 Bryan O'Sullivan |
---|---|
License | BSD-style |
Maintainer | bos@serpentine.com |
Stability | experimental |
Portability | GHC |
Safe Haskell | None |
Language | Haskell98 |
String collation functions for Unicode, implemented as bindings to the International Components for Unicode (ICU) libraries.
Synopsis
- data MCollator
- data Attribute
- data AlternateHandling
- data CaseFirst
- data Strength
- open :: LocaleName -> IO MCollator
- collate :: MCollator -> Text -> Text -> IO Ordering
- collateIter :: MCollator -> CharIterator -> CharIterator -> IO Ordering
- getAttribute :: MCollator -> Attribute -> IO Attribute
- setAttribute :: MCollator -> Attribute -> IO ()
- sortKey :: MCollator -> Text -> IO ByteString
- clone :: MCollator -> IO MCollator
- freeze :: MCollator -> IO Collator
Unicode collation API
data Attribute
French Bool | Direction of secondary weights, used in French. |
AlternateHandling AlternateHandling | For handling variable elements. |
CaseFirst (Maybe CaseFirst) | Control the ordering of upper and lower case letters. |
CaseLevel Bool | Controls whether an extra case level (positioned before the third level) is generated. |
NormalizationMode Bool | Controls whether the normalization check and necessary normalizations are performed. |
Strength Strength | |
HiraganaQuaternaryMode Bool | When turned on, this attribute positions Hiragana before all non-ignorables on quaternary level. This is a sneaky way to produce JIS sort order. |
Numeric Bool | When enabled, this attribute generates a collation key for the numeric value of substrings of digits. This is a way to get '100' to sort after '2'. |
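As a minimal sketch of how an attribute from the table above might be applied, the following opens a collator, toggles `Numeric`, and compares `"100"` with `"2"`. It assumes the text-icu package is installed and that `LocaleName` (with a `Root` constructor) is imported from `Data.Text.ICU`, which this page does not show; `compareNumeric` is a hypothetical helper name.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
-- Assumption: LocaleName and its Root constructor come from Data.Text.ICU.
import Data.Text.ICU (LocaleName (..))
import qualified Data.Text.ICU.Collate as C

-- | Hypothetical helper: compare two strings with the Numeric
-- attribute switched on or off.
compareNumeric :: Bool -> Text -> Text -> IO Ordering
compareNumeric numeric a b = do
  coll <- C.open Root                      -- root (locale-neutral) rules
  C.setAttribute coll (C.Numeric numeric)  -- mutate the collator in place
  C.collate coll a b

main :: IO ()
main = do
  compareNumeric False "100" "2" >>= print  -- digit-by-digit comparison
  compareNumeric True  "100" "2" >>= print  -- digit runs compared by value
```

With `Numeric True`, `"100"` should sort after `"2"`, as the attribute description states.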
data AlternateHandling
Control the handling of variable weight elements.
NonIgnorable | Treat all codepoints with non-ignorable primary weights in the same way. |
Shifted | Cause codepoints with primary weights that are equal to or below the variable top value to be ignored on primary level and moved to the quaternary level. |
Instances
data CaseFirst
Control the ordering of upper and lower case letters.
UpperFirst | Force upper case letters to sort before lower case. |
LowerFirst | Force lower case letters to sort before upper case. |
Instances
Bounded CaseFirst | |
Enum CaseFirst | |
Eq CaseFirst | |
Show CaseFirst | |
NFData CaseFirst | |
Defined in Data.Text.ICU.Collate |
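The effect of `UpperFirst` and `LowerFirst` can be sketched as follows. This is an illustration rather than a reference usage: it assumes `LocaleName` and `Root` are imported from `Data.Text.ICU`, and `compareCaseFirst` is a hypothetical helper name.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
-- Assumption: LocaleName and its Root constructor come from Data.Text.ICU.
import Data.Text.ICU (LocaleName (..))
import qualified Data.Text.ICU.Collate as C

-- | Hypothetical helper: compare two strings under a given
-- case-ordering preference.
compareCaseFirst :: Maybe C.CaseFirst -> Text -> Text -> IO Ordering
compareCaseFirst pref a b = do
  coll <- C.open Root
  C.setAttribute coll (C.CaseFirst pref)  -- wrap the preference in the Attribute constructor
  C.collate coll a b

main :: IO ()
main = do
  compareCaseFirst (Just C.UpperFirst) "a" "A" >>= print
  compareCaseFirst (Just C.LowerFirst) "a" "A" >>= print
```

Because the strings differ only in case, the result is decided entirely by the case-ordering preference.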
data Strength
The strength attribute. The usual strength for most locales (except
Japanese) is tertiary. Quaternary strength is useful when combined with
the Shifted setting for the AlternateHandling attribute, and for
JIS X 4061 collation, where it is used to distinguish between Katakana
and Hiragana (this is achieved by setting HiraganaQuaternaryMode to True).
Otherwise, the quaternary level is affected only by the number of
non-ignorable code points in the string. Identical strength is rarely
useful, as it amounts to comparing the code points of the NFD form of
the string.
Instances
Bounded Strength | |
Enum Strength | |
Eq Strength | |
Show Strength | |
NFData Strength | |
Defined in Data.Text.ICU.Collate |
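To illustrate how strength changes a comparison, here is a hedged sketch comparing two strings that differ only in case. It assumes `Strength` has constructors named `Primary` and `Tertiary` (the constructor list does not appear on this page) and that `LocaleName` with `Root` is imported from `Data.Text.ICU`; `compareAt` is a hypothetical helper name.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
-- Assumption: LocaleName and its Root constructor come from Data.Text.ICU.
import Data.Text.ICU (LocaleName (..))
import qualified Data.Text.ICU.Collate as C

-- | Hypothetical helper: compare two strings at a given strength.
-- Assumption: Primary and Tertiary are constructors of C.Strength.
compareAt :: C.Strength -> Text -> Text -> IO Ordering
compareAt strength a b = do
  coll <- C.open Root
  C.setAttribute coll (C.Strength strength)
  C.collate coll a b

main :: IO ()
main = do
  compareAt C.Tertiary "hello" "HELLO" >>= print  -- case is a tertiary difference
  compareAt C.Primary  "hello" "HELLO" >>= print  -- primary strength looks at base letters only
```

At primary strength the two strings should compare as equal, since case differences are not considered at that level.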
Functions
open
:: LocaleName | The locale containing the required collation rules. |
-> IO MCollator |
Open a collator for comparing strings.
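Since `collate` runs in `IO`, it cannot be passed directly to a pure sort. One possible approach, sketched here, is to compute a `sortKey` for each string and sort on the resulting `ByteString` keys. The sketch assumes `LocaleName` and its `Locale` constructor are imported from `Data.Text.ICU`; the `"en"` locale and the name `sortWithCollator` are illustrative choices only.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.List (sortOn)
import Data.Text (Text)
import qualified Data.Text.IO as T
-- Assumption: LocaleName and its Locale constructor come from Data.Text.ICU.
import Data.Text.ICU (LocaleName (..))
import qualified Data.Text.ICU.Collate as C

-- | Hypothetical helper: sort strings by their collation keys.
sortWithCollator :: C.MCollator -> [Text] -> IO [Text]
sortWithCollator coll ts = do
  keys <- mapM (C.sortKey coll) ts           -- one ByteString key per string
  pure (map snd (sortOn fst (zip keys ts)))  -- bytewise key order stands in for collation order

main :: IO ()
main = do
  coll <- C.open (Locale "en")               -- illustrative locale
  sorted <- sortWithCollator coll ["banana", "apple", "cherry"]
  mapM_ T.putStrLn sorted
```

Computing each key once tends to pay off when a list is sorted with many comparisons; for a single comparison, `collate` is the more direct call.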
collateIter :: MCollator -> CharIterator -> CharIterator -> IO Ordering
Compare two CharIterators.
If either iterator was constructed from a ByteString, it does not need
to be copied or converted internally, so this function can be quite
cheap.
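A rough sketch of comparing iterators built from different sources follows. It assumes a `Data.Text.ICU.Iterator` module exporting `fromUtf8 :: ByteString -> CharIterator` and `fromText :: Text -> CharIterator`, neither of which is documented on this page, and that `LocaleName` with `Root` is imported from `Data.Text.ICU`; `compareMixed` is a hypothetical helper name.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.ByteString (ByteString)
import Data.Text (Text)
import Data.Text.Encoding (encodeUtf8)
-- Assumption: LocaleName and its Root constructor come from Data.Text.ICU.
import Data.Text.ICU (LocaleName (..))
import qualified Data.Text.ICU.Collate as C
-- Assumption: these iterator constructors live in Data.Text.ICU.Iterator.
import Data.Text.ICU.Iterator (fromText, fromUtf8)

-- | Hypothetical helper: compare UTF-8 bytes against a Text value
-- without first decoding the bytes to Text.
compareMixed :: ByteString -> Text -> IO Ordering
compareMixed bytes txt = do
  coll <- C.open Root
  C.collateIter coll (fromUtf8 bytes) (fromText txt)

main :: IO ()
main =
  -- encodeUtf8 stands in for bytes that arrived already UTF-8 encoded.
  compareMixed (encodeUtf8 "apple") "banana" >>= print
```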