Safe Haskell | None |
---|---|
Language | Haskell2010 |
Manipulate identifiers and structurally non-complex pieces of text by delimiting word boundaries via a combination of whitespace, control-characters, and case-sensitivity.
Assumptions have been made about word boundary characteristics inherent in predominantly English text. Please see the individual function documentation for further details and behaviour.
- takeWord :: Text -> Text
- dropWord :: Text -> Text
- stripWord :: Text -> Maybe Text
- breakWord :: Text -> (Text, Text)
- splitWords :: Text -> [Text]
- lowerHead :: Text -> Text
- upperHead :: Text -> Text
- mapHead :: (Char -> Char) -> Text -> Text
- indentLines :: Int -> Text -> Text
- prependLines :: Text -> Text -> Text
- toEllipsis :: Int64 -> Text -> Text
- toEllipsisWith :: Int64 -> Text -> Text -> Text
- toAcronym :: Text -> Maybe Text
- toOrdinal :: Integral a => a -> Text
- toTitle :: Text -> Text
- toCamel :: Text -> Text
- toPascal :: Text -> Text
- toSnake :: Text -> Text
- toSpinal :: Text -> Text
- toTrain :: Text -> Text
- isBoundary :: Char -> Bool
- isWordBoundary :: Char -> Bool
Strict vs lazy types
This library provides functions for manipulating both strict and lazy Text types. The strict functions are provided by the Data.Text.Manipulate module, while the lazy functions are provided by the Data.Text.Lazy.Manipulate module.
Unicode
While this library supports Unicode in a similar fashion to the underlying text library, more explicit Unicode specific handling of word boundaries can be found in the text-icu library.
Fusion
Many functions in this module are subject to fusion, meaning that a pipeline of such functions will usually allocate at most one Text value.
Functions that can be fused by the compiler are documented with the phrase Subject to fusion.
Subwords
Removing words
takeWord :: Text -> Text
O(n) Returns the first word, or the original text if no word boundary is encountered. Subject to fusion.
dropWord :: Text -> Text
O(n) Return the suffix after dropping the first word. If no word boundary is encountered, the result will be empty. Subject to fusion.
stripWord :: Text -> Maybe Text
O(n) Return the suffix after removing the first word, or Nothing if no word boundary is encountered.
>>> stripWord "HTML5Spaghetti"
Just "Spaghetti"
>>> stripWord "noboundaries"
Nothing
Breaking on words
breakWord :: Text -> (Text, Text)
Break a piece of text after the first word boundary is encountered.
>>> breakWord "PascalCasedVariable"
("Pascal", "CasedVariable")
>>> breakWord "spinal-cased-variable"
("spinal", "cased-variable")
splitWords :: Text -> [Text]
O(n) Split into a list of words delimited by boundaries.
>>> splitWords "SupercaliFrag_ilistic"
["Supercali","Frag","ilistic"]
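To make the boundary idea concrete, here is a minimal sketch of word splitting written against the plain text package only. It is not the library's implementation: the name splitWordsSketch is hypothetical, and it assumes that every non-alphanumeric character is a separator and that every uppercase letter starts a new word. Runs of uppercase letters such as "HTML5" are therefore split letter by letter, which is cruder than the behaviour suggested by the stripWord example above.

```haskell
import Data.Char (isAlphaNum, isUpper)
import Data.Text (Text)
import qualified Data.Text as T

-- Hypothetical sketch, not the library's source.
-- Assumption: separators are all non-alphanumeric characters, and an
-- uppercase letter always begins a new word.
splitWordsSketch :: Text -> [Text]
splitWordsSketch = concatMap splitOnUpper . T.split (not . isAlphaNum)
  where
    -- Keep extending the current group until the next uppercase letter.
    -- Empty pieces produced by adjacent separators yield no groups.
    splitOnUpper :: Text -> [Text]
    splitOnUpper = T.groupBy (\_ next -> not (isUpper next))
```

Under these assumptions, applying splitWordsSketch to "SupercaliFrag_ilistic" yields the same three words as the documented example.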
Character manipulation
lowerHead :: Text -> Text
Lowercase the first character of a piece of text.
>>> lowerHead "Title Cased"
"title Cased"
upperHead :: Text -> Text
Uppercase the first character of a piece of text.
>>> upperHead "snake_cased"
"Snake_cased"
mapHead :: (Char -> Char) -> Text -> Text
Apply a function to the first character of a piece of text.
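The relationship between the three functions above can be sketched as follows. This is an illustrative re-implementation on strict Text using only the text package; the names ending in Sketch are hypothetical and the library's actual definitions may differ.

```haskell
import Data.Char (toLower, toUpper)
import Data.Text (Text)
import qualified Data.Text as T

-- Hypothetical sketch: apply a function to the first character only,
-- leaving empty input unchanged.
mapHeadSketch :: (Char -> Char) -> Text -> Text
mapHeadSketch f t = case T.uncons t of
  Just (c, rest) -> T.cons (f c) rest
  Nothing        -> t

-- The two casing helpers are then specialisations of the general form.
lowerHeadSketch :: Text -> Text
lowerHeadSketch = mapHeadSketch toLower

upperHeadSketch :: Text -> Text
upperHeadSketch = mapHeadSketch toUpper
```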
Line manipulation
indentLines :: Int -> Text -> Text
Indent newlines by the given number of spaces.
prependLines :: Text -> Text -> Text
Prepend newlines with the given separator.
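The descriptions above leave some details open, so the following is only one plausible reading, sketched with the text package: every line, including the first, receives the prefix, and a trailing newline is not preserved. Both points are assumptions, and the Sketch names are hypothetical rather than part of this module.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import qualified Data.Text as T

-- Hypothetical sketch. Assumption: the prefix is added to every line,
-- and the lines are re-joined with single newlines.
prependLinesSketch :: Text -> Text -> Text
prependLinesSketch prefix = T.intercalate "\n" . map (prefix <>) . T.lines

-- Indentation expressed as prefixing each line with n spaces.
indentLinesSketch :: Int -> Text -> Text
indentLinesSketch n = prependLinesSketch (T.replicate n " ")
```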
Ellipsis
toEllipsis :: Int64 -> Text -> Text
O(n) Truncate text to a specific length. If the text was truncated the ellipsis sign "..." will be appended.
See: toEllipsisWith
toEllipsisWith :: Int64 -> Text -> Text -> Text
O(n) Truncate text to a specific length. If the text was truncated the given ellipsis sign will be appended.
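A minimal sketch of the truncation rule, using strict Text (and therefore Int rather than Int64) from the text package. The Sketch names are hypothetical. It assumes that the length limit applies to the kept text before the ellipsis is appended, that the second argument is the ellipsis, and that no whitespace is trimmed; the library may differ on any of these.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import qualified Data.Text as T

-- Hypothetical sketch: keep at most n characters and append the suffix
-- only when something was actually cut off.
toEllipsisWithSketch :: Int -> Text -> Text -> Text
toEllipsisWithSketch n suffix t
  | T.length t > n = T.take n t <> suffix
  | otherwise      = t

-- The default variant fixes the suffix to three dots.
toEllipsisSketch :: Int -> Text -> Text
toEllipsisSketch n = toEllipsisWithSketch n "..."
```

With these definitions, toEllipsisSketch 5 "Hello, world" evaluates to "Hello...", while text that already fits is returned unchanged.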
Acronyms
toAcronym :: Text -> Maybe Text
O(n) Create an ad hoc acronym from a piece of cased text.
>>> toAcronym "AmazonWebServices"
Just "AWS"
>>> toAcronym "Learn-You Some_Haskell"
Just "LYSH"
>>> toAcronym "this_is_all_lowercase"
Nothing
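The three examples above are consistent with simply collecting the uppercase characters, which can be sketched as below. This is an assumption about the rule rather than the library's definition; in particular, the real function may apply a different threshold for returning Nothing. The name toAcronymSketch is hypothetical.

```haskell
import Data.Char (isUpper)
import Data.Text (Text)
import qualified Data.Text as T

-- Hypothetical sketch. Assumption: the acronym is the sequence of
-- uppercase characters, and Nothing is returned when there are none.
toAcronymSketch :: Text -> Maybe Text
toAcronymSketch t
  | T.null uppers = Nothing
  | otherwise     = Just uppers
  where
    uppers = T.filter isUpper t
```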
Ordinals
toOrdinal :: Integral a => a -> Text
Render an ordinal used to denote the position in an ordered sequence.
>>> toOrdinal (101 :: Int)
"101st"
>>> toOrdinal (12 :: Int)
"12th"
Casing
Boundary predicates
isBoundary :: Char -> Bool
Returns True for any boundary character.
isWordBoundary :: Char -> Bool
Returns True for any boundary or uppercase character.
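The relationship between the two predicates can be sketched as below. The definition of a boundary character is not spelled out on this page, so the sketch assumes it means any non-alphanumeric character, which agrees with the examples that treat spaces, hyphens and underscores as boundaries. The Sketch names are hypothetical and the library's definitions may be narrower or broader.

```haskell
import Data.Char (isAlphaNum, isUpper)

-- Hypothetical sketch. Assumption: a boundary is any character that is
-- neither a letter nor a digit.
isBoundarySketch :: Char -> Bool
isBoundarySketch = not . isAlphaNum

-- A word boundary additionally includes uppercase letters, which is
-- what lets cased identifiers such as "PascalCase" be split.
isWordBoundarySketch :: Char -> Bool
isWordBoundarySketch c = isUpper c || isBoundarySketch c
```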