Safe Haskell: None
This module provides functionality for tokenizing text streams to differentiate between printed characters and structural elements such as newlines. Once tokenized, such text streams can be manipulated with the functions in this module.
- data TextStream a = TS ![TextStreamEntity a]
- data TextStreamEntity a
- data Token a
- tokenize :: Text -> a -> TextStream a
- serialize :: TextStream a -> Text
- tokenLen :: Token a -> Int
- entityToken :: TextStreamEntity a -> Token a
- streamEntities :: TextStream a -> [TextStreamEntity a]
- truncateLine :: Phys -> [Token a] -> [Token a]
- truncateText :: Phys -> Text -> Text
- wrapStream :: Eq a => Phys -> TextStream a -> TextStream a
- findLines :: [TextStreamEntity a] -> [[TextStreamEntity a]]
Documentation
data TextStream a
A text stream is a list of text stream entities. A text stream combines structural elements of the text (e.g., newlines) with the text itself (words, whitespace, etc.).
Constructors
TS ![TextStreamEntity a]
Instances
Eq a => Eq (TextStream a)
Show a => Show (TextStream a)
data TextStreamEntity a
A text stream entity is either a token or a structural element.
Instances
Eq a => Eq (TextStreamEntity a)
Show a => Show (TextStreamEntity a)
data Token a
The type of text tokens. A token should consist of printable characters and NOT presentation characters (e.g., newlines). The contents of each token should be a string of characters that are all of the same kind. Tokens are generalized over an attribute type, which can be used to annotate each token.
To and from strings
tokenize :: Text -> a -> TextStream a
Tokenize a string and apply a default attribute to every token in the resulting text stream.
serialize :: TextStream a -> Text
Given a text stream, serialize the stream to its original textual representation. This discards token attribute metadata.
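As a rough illustration of the round trip described above, the following is a minimal sketch and not this module's implementation. It assumes a simplified model: `String` instead of `Text`, and hypothetical local types `Token` (constructors `S` for printable runs, `WS` for whitespace runs) and `Entity` (constructors `T` and `NL`). The names `tokenizeSketch` and `serializeSketch` are likewise made up for this example.

```haskell
import Data.Char (isSpace)
import Data.List (groupBy)

-- Hypothetical stand-ins for the module's Token and TextStreamEntity types.
data Token a = S String a | WS String a deriving (Eq, Show)
data Entity a = T (Token a) | NL deriving (Eq, Show)

-- Split the input into runs of same-kind characters and attach the default
-- attribute to every token. Each newline becomes its own structural entity.
tokenizeSketch :: String -> a -> [Entity a]
tokenizeSketch str attr = map classify (groupBy sameClass str)
  where
    sameClass x y = x /= '\n' && y /= '\n' && isSpace x == isSpace y
    classify chunk
      | chunk == "\n"     = NL
      | all isSpace chunk = T (WS chunk attr)
      | otherwise         = T (S chunk attr)

-- Rebuild the text; attributes are dropped along the way.
serializeSketch :: [Entity a] -> String
serializeSketch = concatMap render
  where
    render NL           = "\n"
    render (T (S s _))  = s
    render (T (WS s _)) = s
```

Under these assumptions, `serializeSketch (tokenizeSketch s attr)` returns `s` for any input string, because the grouping step only partitions the characters and never discards any.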
Inspection
entityToken :: TextStreamEntity a -> Token a
Gets a Token from an entity or raises an exception if the entity does not contain a token. Used primarily for convenience transformations in which the parameter is known to be a token entity.
streamEntities :: TextStream a -> [TextStreamEntity a]
Get the entities in a stream.
Manipulation
truncateLine :: Phys -> [Token a] -> [Token a]
Given a list of tokens, truncate the list so that its underlying string representation does not exceed the specified column width.
truncateText :: Phys -> Text -> Text
Same as truncateLine but for Text values.
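To make the truncation behaviour more concrete, here is a small sketch under stated assumptions; it is not the module's code. It uses a plain `Int` where the module uses `Phys`, treats every character as one column wide, and assumes that the token crossing the limit is cut rather than dropped. The `Token` type and the names `tokenStr`, `withStr`, and `truncateLineSketch` are hypothetical.

```haskell
-- Hypothetical stand-in for the module's Token type.
data Token a = S String a | WS String a deriving (Eq, Show)

-- The characters carried by a token.
tokenStr :: Token a -> String
tokenStr (S s _)  = s
tokenStr (WS s _) = s

-- Replace a token's characters, keeping its kind and attribute.
withStr :: Token a -> String -> Token a
withStr (S _ a) s  = S s a
withStr (WS _ a) s = WS s a

-- Keep whole tokens while they fit; cut the first token that would overflow.
-- Assumes one column per character, which does not hold for wide characters.
truncateLineSketch :: Int -> [Token a] -> [Token a]
truncateLineSketch _ [] = []
truncateLineSketch width (t : ts)
  | width <= 0   = []
  | len <= width = t : truncateLineSketch (width - len) ts
  | otherwise    = [withStr t (take width (tokenStr t))]
  where
    len = length (tokenStr t)
```

With a width of 7, the tokens for "hello world" come back as "hello", one space, and "w", so the concatenated result is exactly 7 characters long.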
wrapStream :: Eq a => Phys -> TextStream a -> TextStream a
Given a text stream and a wrapping width, return a new TextStream with newlines inserted in appropriate places to wrap the text at the specified column (not character position).
Wrapped lines carry no leading or trailing whitespace, but leading whitespace in the text that was not the cause of a wrap is preserved.
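One way to picture the whitespace rules is a greedy wrapper over a single line of tokens. The sketch below is an assumption-laden illustration, not the algorithm this module uses: it takes an `Int` width instead of `Phys`, counts one column per character, never splits a token, and returns a list of lines rather than a stream with newlines inserted. The `Token` type and the names `tokenStr`, `isWS`, and `wrapLineSketch` are hypothetical.

```haskell
-- Hypothetical stand-in for the module's Token type.
data Token a = S String a | WS String a deriving (Eq, Show)

-- The characters carried by a token.
tokenStr :: Token a -> String
tokenStr (S s _)  = s
tokenStr (WS s _) = s

-- True for whitespace tokens only.
isWS :: Token a -> Bool
isWS (WS _ _) = True
isWS _        = False

-- Greedily fill each line up to the width. When a token does not fit, close
-- the current line without its trailing whitespace and start the next line
-- after skipping the whitespace at the break point.
wrapLineSketch :: Int -> [Token a] -> [[Token a]]
wrapLineSketch width = go [] 0
  where
    -- acc is the current line in reverse order; used is its width so far.
    go acc _ [] = [finish acc]
    go acc used (t : ts)
      | null acc || used + len <= width = go (t : acc) (used + len) ts
      | otherwise =
          case dropWhile isWS (t : ts) of
            []   -> [finish acc]
            rest -> finish acc : go [] 0 rest
      where
        len = length (tokenStr t)
    finish = reverse . dropWhile isWS
```

Because a token is always accepted onto an empty line, whitespace at the very start of the input stays in place, while whitespace at a break point is dropped from both sides of the break. A token wider than the limit ends up alone on an overlong line in this sketch.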
findLines :: [TextStreamEntity a] -> [[TextStreamEntity a]]
Given a list of text stream entities, split up the list wherever newlines occur. Returns a list of lines of entities, such that all entities wrap tokens and none are newlines. (Safe for use with entityToken.)
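The relationship between line splitting and token extraction can be sketched as follows. This is an illustration under assumptions, not the module's source: the `Token` and `Entity` types are simplified local stand-ins, the names `findLinesSketch` and `entityTokenSketch` are hypothetical, and the handling of empty input or a trailing newline (here, an empty line is produced) may differ from the real function.

```haskell
-- Hypothetical stand-ins for the module's Token and TextStreamEntity types.
data Token a = S String a | WS String a deriving (Eq, Show)
data Entity a = T (Token a) | NL deriving (Eq, Show)

-- Split on newline entities; the newlines themselves are discarded, so every
-- entity left in the result wraps a token.
findLinesSketch :: [Entity a] -> [[Entity a]]
findLinesSketch es =
  case break isNL es of
    (line, [])       -> [line]
    (line, _ : rest) -> line : findLinesSketch rest
  where
    isNL NL = True
    isNL _  = False

-- Partial extraction: fails on a newline entity.
entityTokenSketch :: Entity a -> Token a
entityTokenSketch (T t) = t
entityTokenSketch NL    = error "entityTokenSketch: entity is not a token"
```

Since no line returned by `findLinesSketch` contains `NL`, an expression such as `map (map entityTokenSketch) (findLinesSketch es)` never reaches the failing case.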