License | BSD-style |
---|---|
Maintainer | Vincent Hanquez <vincent@snarc.org> |
Stability | experimental |
Portability | portable |
Safe Haskell | None |
Language | Haskell2010 |
Opaque packed String encoded in UTF8.
The type is an instance of IsString and IsList, which allows OverloadedStrings
to be used for string literals, and fromList
to be used to convert a [Char] (Prelude String) to the packed
representation:
{-# LANGUAGE OverloadedStrings #-}
s = "Hello World" :: String
s = fromList ("Hello World" :: Prelude.String) :: String
Each Unicode code point is represented by a variable-length encoding of 1 to 4 bytes.
For more information about UTF8: https://en.wikipedia.org/wiki/UTF-8
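To make the 1-to-4-byte layout concrete, here is a minimal sketch of a UTF-8 encoder for a single code point, using only `base`. The name `encodeUtf8Char` is a hypothetical helper for illustration and is not part of this module; it also does not reject surrogate code points, which a strict encoder would.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- | Encode one code point as UTF-8 bytes.
-- Illustrative helper only (assumption: not part of the documented API).
-- Surrogate code points (U+D800..U+DFFF) are not rejected here.
encodeUtf8Char :: Char -> [Word8]
encodeUtf8Char c
  | n < 0x80    = [fromIntegral n]                              -- 1 byte:  0xxxxxxx
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]                       -- 2 bytes: 110xxxxx 10xxxxxx
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]              -- 3 bytes: 1110xxxx + 2 continuations
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]     -- 4 bytes: 11110xxx + 3 continuations
  where
    n = ord c
    -- Payload bits that go into the leading byte.
    hi s = fromIntegral (n `shiftR` s)
    -- A continuation byte: tag 10 followed by 6 payload bits.
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)
```

Applying `length . encodeUtf8Char` to an ASCII letter, an accented Latin letter, the euro sign and an emoji yields byte counts of 1, 2, 3 and 4 respectively.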
- data String
- data Encoding
- = ASCII7
- | UTF8
- | UTF16
- | UTF32
- | ISO_8859_1
- fromBytes :: Encoding -> UArray Word8 -> (String, Maybe ValidationFailure, UArray Word8)
- fromBytesLenient :: UArray Word8 -> (String, UArray Word8)
- fromBytesUnsafe :: UArray Word8 -> String
- toBytes :: Encoding -> String -> UArray Word8
- data ValidationFailure
- lines :: String -> [String]
- words :: String -> [String]
Documentation
data String
Opaque packed array of characters in the UTF8 encoding.
- IsList String
- Eq String
- Data String
- Ord String
- Show String
- IsString String
- Monoid String
- Collection String
- InnerFunctor String
- Sequential String
- Buildable String
- Zippable String
- Hashable String
- type Item String
- type Element String
- type Mutable String
- type Step String
fromBytes :: Encoding -> UArray Word8 -> (String, Maybe ValidationFailure, UArray Word8)
Convert an array of bytes to a String, assuming a specific encoding.
It returns a 3-tuple of:
- The string that has been successfully converted without any error
- An optional validation error
- The remaining buffer that hasn't been processed (either as a result of an error, or because the encoded sequence is not fully available)
When a stream of data is fetched chunk by chunk, an encoded sequence may straddle a chunk boundary. When converting chunks, if the error is Nothing and the remaining buffer is not empty, this buffer needs to be prepended to the next chunk.
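The chunk-boundary handling described above can be sketched as follows. This is a hedged illustration, not the library's own streaming API: `decodeChunks` and `sampleChunks` are hypothetical names, and the module names `Foundation.String` and `Foundation.Array` are assumptions about where these types are exported from. It relies only on the `fromBytes` signature shown here plus the `Monoid` and `IsList` instances of `UArray`.

```haskell
import Prelude hiding (String)
import Data.Word (Word8)
import Foundation.Array (UArray)   -- assumed module for UArray
import Foundation.String (Encoding (UTF8), String, ValidationFailure, fromBytes)
import GHC.Exts (fromList, toList)

-- | Decode a list of UTF-8 chunks, prepending any unprocessed trailing
-- bytes of one chunk to the next. Stops at the first validation error.
-- Returns the decoded pieces and whatever bytes are still pending at the end.
-- Hypothetical helper, not part of the documented API.
decodeChunks :: [UArray Word8] -> Either ValidationFailure ([String], UArray Word8)
decodeChunks = go mempty
  where
    go pending []       = Right ([], pending)
    go pending (c : cs) =
      case fromBytes UTF8 (pending `mappend` c) of
        (_, Just err, _)   -> Left err
        (s, Nothing, rest) -> do
          -- 'rest' is an incomplete sequence (or empty): carry it forward.
          (ss, final) <- go rest cs
          Right (s : ss, final)

-- | Two chunks where the 2-byte sequence for U+00E9 is split across the boundary.
sampleChunks :: [UArray Word8]
sampleChunks = [fromList [0x48, 0xC3], fromList [0xA9]]
```

A non-empty buffer in the final result means the stream ended in the middle of a sequence; whether to treat that as an error is left to the caller.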
fromBytesLenient :: UArray Word8 -> (String, UArray Word8)
Convert a UTF8 array of bytes to a String.
If there is any error in the stream, invalid sequences are automatically replaced with replacement bytes.
If a sequence is split across two chunks, the remaining buffer should be prepended to the next chunk before resuming parsing.
fromBytesUnsafe :: UArray Word8 -> String
Convert an array of bytes representing UTF8 data directly to a String, without checking for UTF8 validity.
If the input contains invalid sequences, it will trigger runtime async errors when the data is processed.
When in doubt, use fromBytes.
toBytes :: Encoding -> String -> UArray Word8
Convert a String to an array of bytes in a specific encoding.
If the encoding is UTF8, the underlying buffer is returned without extra allocation or any processing.
For any other encoding, some allocation and processing are needed to perform the conversion.
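As a small usage sketch, the following round-trips a String through `toBytes` and `fromBytes` with the UTF8 encoding. The names `greeting`, `utf8Bytes` and `roundTrips` are hypothetical, and the module names `Foundation.String` and `Foundation.Array` are assumptions; only the signatures and instances listed on this page are relied upon.

```haskell
import Prelude hiding (String)
import Data.Word (Word8)
import Foundation.Array (UArray)   -- assumed module for UArray
import Foundation.String (Encoding (UTF8), String, fromBytes, toBytes)
import GHC.Exts (fromList, toList)

-- | A packed String built from a Prelude [Char] via the IsList instance.
greeting :: String
greeting = fromList "Hello World"

-- | The UTF-8 bytes of 'greeting'; for UTF8 this is the underlying buffer.
utf8Bytes :: UArray Word8
utf8Bytes = toBytes UTF8 greeting

-- | True when decoding the bytes gives back the original String,
-- with no validation error and no unprocessed bytes left over.
roundTrips :: Bool
roundTrips =
  case fromBytes UTF8 utf8Bytes of
    (s, Nothing, rest) -> s == greeting && null (toList rest)
    _                  -> False
```

Since the text is pure ASCII, `utf8Bytes` holds one byte per character.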
data ValidationFailure