Copyright | (c) 2008 2009 Tom Harper (c) 2009 2010 Bryan O'Sullivan (c) 2009 Duncan Coutts |
---|---|
License | BSD-style |
Maintainer | bos@serpentine.com |
Stability | experimental |
Portability | GHC |
Safe Haskell | Safe-Inferred |
Language | Haskell2010 |
A module containing private Text
internals. This exposes the
Text
representation and low level construction functions.
Modules which extend the Text
system may need to use this module.
You should not use this module unless you are determined to monkey with the internals, as the functions here do just about nothing to preserve data invariants. You have been warned!
Synopsis
- data Text = Text !Array !Int !Int
- text :: Array -> Int -> Int -> Text
- textP :: Array -> Int -> Int -> Text
- safe :: Char -> Char
- empty :: Text
- empty_ :: Text
- append :: Text -> Text -> Text
- firstf :: (a -> c) -> Maybe (a, b) -> Maybe (c, b)
- mul :: Int -> Int -> Int
- mul32 :: Int32 -> Int32 -> Int32
- mul64 :: Int64 -> Int64 -> Int64
- showText :: Text -> String
- pack :: String -> Text
Types
Internally, the Text
type is represented as an array of Word8
UTF-8 code units. The offset and length fields in the constructor
are in these units, not units of Char
.
Invariants that all functions must maintain:
- Since the
Text
type uses UTF-8 internally, it cannot represent characters in the reserved surrogate code point range U+D800 to U+DFFF. To maintain this invariant, thesafe
function mapsChar
values in this range to the replacement character (U+FFFD, '�'). - Offset and length must point to a valid UTF-8 sequence of bytes. Violation of this may cause memory access violation and divergence.
A space efficient, packed, unboxed Unicode text type.
Instances
Data Text Source # | This instance preserves data abstraction at the cost of inefficiency. We omit reflection services for the sake of data abstraction. This instance was created by copying the updated behavior of
The original discussion is archived here: could we get a Data instance for Data.Text.Text? The followup discussion that changed the behavior of |
Defined in Data.Text gfoldl :: (forall d b. Data d => c (d -> b) -> d -> c b) -> (forall g. g -> c g) -> Text -> c Text # gunfold :: (forall b r. Data b => c (b -> r) -> c r) -> (forall r. r -> c r) -> Constr -> c Text # dataTypeOf :: Text -> DataType # dataCast1 :: Typeable t => (forall d. Data d => c (t d)) -> Maybe (c Text) # dataCast2 :: Typeable t => (forall d e. (Data d, Data e) => c (t d e)) -> Maybe (c Text) # gmapT :: (forall b. Data b => b -> b) -> Text -> Text # gmapQl :: (r -> r' -> r) -> r -> (forall d. Data d => d -> r') -> Text -> r # gmapQr :: forall r r'. (r' -> r -> r) -> r -> (forall d. Data d => d -> r') -> Text -> r # gmapQ :: (forall d. Data d => d -> u) -> Text -> [u] # gmapQi :: Int -> (forall d. Data d => d -> u) -> Text -> u # gmapM :: Monad m => (forall d. Data d => d -> m d) -> Text -> m Text # gmapMp :: MonadPlus m => (forall d. Data d => d -> m d) -> Text -> m Text # gmapMo :: MonadPlus m => (forall d. Data d => d -> m d) -> Text -> m Text # | |
IsString Text Source # | Performs replacement on invalid scalar values:
|
Defined in Data.Text fromString :: String -> Text # | |
Monoid Text Source # | |
Semigroup Text Source # | Since: 1.2.2.0 |
IsList Text Source # | Performs replacement on invalid scalar values:
Since: 1.2.0.0 |
Read Text Source # | |
Show Text Source # | |
PrintfArg Text Source # | Since: 1.2.2.0 |
Defined in Data.Text formatArg :: Text -> FieldFormatter # parseFormat :: Text -> ModifierParser # | |
Binary Text Source # | Since: 1.2.1.0 |
NFData Text Source # | |
Eq Text Source # | |
Ord Text Source # | |
Lift Text Source # | Since: 1.2.4.0 |
type Item Text Source # | |
Construction
:: Array | bytearray encoded as UTF-8 |
-> Int | offset in bytes (not in Char!), pointing to a start of UTF-8 sequence |
-> Int | length in bytes (not in Char!), pointing to an end of UTF-8 sequence |
-> Text |
Construct a Text
without invisibly pinning its byte array in
memory if its length has dwindled to zero.
Safety
Map a Char
to a Text
-safe value.
Unicode Surrogate
code points are not included in the set of Unicode
scalar values, but are unfortunately admitted as valid Char
values by Haskell. They cannot be represented in a Text
. This
function remaps those code points to the Unicode replacement
character (U+FFFD, '�'), and leaves other code points
unchanged.
Code that must be here for accessibility
Utilities
firstf :: (a -> c) -> Maybe (a, b) -> Maybe (c, b) Source #
Apply a function to the first element of an optional pair.
Checked multiplication
mul :: Int -> Int -> Int infixl 7 Source #
Checked multiplication. Calls error
if the result would
overflow.
mul32 :: Int32 -> Int32 -> Int32 infixl 7 Source #
Checked multiplication. Calls error
if the result would
overflow.
mul64 :: Int64 -> Int64 -> Int64 infixl 7 Source #
Checked multiplication. Calls error
if the result would
overflow.