pipes-text-0.0.0.7: Text pipes.

Safe Haskell: None
Language: Haskell2010

Pipes.Text.Encoding

Lens type

The Codec type is just an alias for a type built from standard Prelude pieces. It is more or less the Lens' type of the standard lens libraries, lens and lens-family, so you can use the view or (^.) and zoom functions from those libraries.

Each Codec looks into a byte stream that is expected to contain text. The text stream it exposes ends by returning the rest of the original byte stream, beginning at the point of decoding failure, or an empty byte stream (carrying the original return value) if all of the input decoded. Codecs are named for the expected encoding: utf8, utf16LE, etc.

  view utf8 :: Producer ByteString m r -> Producer Text m (Producer ByteString m r)
  Bytes.stdin ^. utf8 :: MonadIO m => Producer Text m (Producer ByteString m ())
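
To make the Lens' shape concrete, here is a minimal sketch that uses only base. It defines the lens type and a view function by hand, in the way lens libraries usually do, and applies them to a toy lens. The names firstOf and over are illustrative assumptions and are not part of pipes-text; the real Codec has the same shape but focuses on a Producer of Text inside a Producer of ByteString.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- The same shape as Lens' in the lens and lens-family packages.
type Lens' a b = forall f. Functor f => (b -> f b) -> a -> f a

-- view picks the Const functor, so the lens only reads its focus.
view :: Lens' a b -> a -> b
view l = getConst . l Const

-- over picks the Identity functor, so the lens rewrites its focus.
over :: Lens' a b -> (b -> b) -> a -> a
over l f = runIdentity . l (Identity . f)

-- A toy lens (hypothetical, for illustration only): the first
-- component of a pair.
firstOf :: Lens' (a, c) a
firstOf k (a, c) = fmap (\a' -> (a', c)) (k a)

main :: IO ()
main = do
  print (view firstOf (1 :: Int, "rest"))       -- prints 1
  print (over firstOf (+ 1) (1 :: Int, "rest")) -- prints (2,"rest")
```

Because view only needs a Functor, any value of the Codec type can be passed to it directly, which is what view utf8 does above.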

zoom converts a Text parser into a ByteString parser:

  zoom utf8 drawChar :: Monad m => StateT (Producer ByteString m r) m (Maybe Char)

  withNextByte :: Monad m => Parser ByteString m (Maybe Char, Maybe Word8)
  withNextByte = do char_ <- zoom utf8 Text.drawChar  -- consume one decoded Char
                    byte_ <- Bytes.peekByte           -- look at the next byte without consuming it
                    return (char_, byte_)

withNextByte returns the first valid Char in a ByteString and the first byte of the next character, if they exist. Because we draw one and peek at the other, we advance only one Char's length along the byte stream.
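
The draw-then-peek behaviour can be modelled on plain lists, without pipes. The sketch below is an assumption-laden model rather than the library's implementation: drawCharModel, peekByteModel and withNextByteModel are hypothetical names, the input is a list of bytes instead of a Producer, and the UTF-8 decoder is simplified (it checks lead and continuation bytes but not overlong forms).

```haskell
module Main where

import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.Maybe (listToMaybe)
import Data.Word (Word8)

-- Decode one UTF-8 code point from the front of the input.
-- On failure the input is returned untouched.
drawCharModel :: [Word8] -> (Maybe Char, [Word8])
drawCharModel [] = (Nothing, [])
drawCharModel bs@(b : rest)
  | b < 0x80 = (Just (chr (fromIntegral b)), rest)
  | b .&. 0xE0 == 0xC0 = multi 1 (b .&. 0x1F)
  | b .&. 0xF0 == 0xE0 = multi 2 (b .&. 0x0F)
  | b .&. 0xF8 == 0xF0 = multi 3 (b .&. 0x07)
  | otherwise = (Nothing, bs)
  where
    multi n lead =
      let (conts, after) = splitAt n rest
          wellFormed = length conts == n && all isCont conts
          code = foldl step (fromIntegral lead) conts
       in if wellFormed && code <= 0x10FFFF
            then (Just (chr code), after)
            else (Nothing, bs)
    isCont c = c .&. 0xC0 == 0x80
    step :: Int -> Word8 -> Int
    step acc c = (acc `shiftL` 6) .|. fromIntegral (c .&. 0x3F)

-- Look at the next byte without consuming it.
peekByteModel :: [Word8] -> (Maybe Word8, [Word8])
peekByteModel bs = (listToMaybe bs, bs)

-- Draw one Char, then peek at the byte that follows it.
withNextByteModel :: [Word8] -> ((Maybe Char, Maybe Word8), [Word8])
withNextByteModel bs =
  let (char_, afterChar) = drawCharModel bs
      (byte_, remaining) = peekByteModel afterChar
   in ((char_, byte_), remaining)

main :: IO ()
main =
  -- 0xC3 0xA9 encodes 'é'; 0x21 is '!'.
  print (withNextByteModel [0xC3, 0xA9, 0x21])
  -- prints ((Just '\233',Just 33),[33])
```

Only the two bytes of the first character are consumed; the peeked byte is still at the front of the remaining input.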

type Codec = forall f m r. (Functor f, Monad m) => (Producer Text m (Producer ByteString m r) -> f (Producer Text m (Producer ByteString m r))) -> Producer ByteString m r -> f (Producer ByteString m r)

Standard lenses for viewing Text in ByteString

utf8 :: Codec

An improper lens into a byte stream expected to be UTF-8 encoded. The associated text stream ends by returning the original byte stream from the point of failure onward, or an empty byte stream if the text was well encoded.

Non-lens decoding functions

Functions for Latin-1 and ASCII text

The ASCII and Latin-1 encodings cover only a small subset of the characters that Text can represent, so they cannot be handled in the pipes lens style. Instead we simply define a function for each direction.

decodeAscii :: Monad m => Producer ByteString m r -> Producer Text m (Producer ByteString m r)

Reduce a byte stream to a corresponding stream of ASCII chars, returning the unused ByteString upon hitting a non-ASCII byte.
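
The "decode what you can, return the rest" contract can be sketched on lists. This is a pure model under stated assumptions, not the library's code: decodeAsciiModel is a hypothetical name, a list of bytes stands in for the Producer of ByteString, and a String stands in for the Producer of Text.

```haskell
module Main where

import Data.Char (chr)
import Data.Word (Word8)

-- Split the input at the first byte outside the ASCII range (>= 0x80).
-- The prefix is decoded; the suffix is returned untouched as leftovers.
decodeAsciiModel :: [Word8] -> (String, [Word8])
decodeAsciiModel bytes =
  let (good, leftover) = span (< 0x80) bytes
   in (map (chr . fromIntegral) good, leftover)

main :: IO ()
main =
  -- 72 and 105 are 'H' and 'i'; 200 is not ASCII, so decoding stops there.
  print (decodeAsciiModel [72, 105, 200, 33])
  -- prints ("Hi",[200,33])
```

Note that the byte 33 ('!') is valid ASCII but still ends up in the leftovers, because decoding stops at the first failure rather than skipping it.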

encodeIso8859_1 :: Monad m => Producer Text m r -> Producer ByteString m (Producer Text m r)

Reduce as much of your stream of Text as actually is ISO 8859-1 (Latin-1) to a byte stream, returning the rest of the Text upon hitting any non-Latin-1 Char.
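
The encoding direction follows the same pattern with the roles swapped. The sketch below is again a list-based model with a hypothetical name (encodeLatin1Model), assuming that a Char is encodable exactly when its code point is at most 0xFF.

```haskell
module Main where

import Data.Char (ord)
import Data.Word (Word8)

-- Encode the longest prefix whose code points fit in one byte (<= 0xFF).
-- The remaining text, starting at the first unencodable Char, is returned.
encodeLatin1Model :: String -> ([Word8], String)
encodeLatin1Model txt =
  let (good, leftover) = span (<= '\xFF') txt
   in (map (fromIntegral . ord) good, leftover)

main :: IO ()
main =
  -- '\233' is e-acute (Latin-1); '\8364' is the euro sign (not Latin-1).
  print (encodeLatin1Model "caf\233 \8364 5")
  -- prints ([99,97,102,233,32],"\8364 5")
```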

decodeIso8859_1 :: Monad m => Producer ByteString m r -> Producer Text m (Producer ByteString m r)

Reduce a byte stream to a corresponding stream of Latin-1 chars, returning the unused ByteString upon hitting the rare un-latinizable byte.