Safe Haskell | Safe |
---|---|
Language | Haskell2010 |
Since: 0.2.2.0
Synopsis
- decodeUtf8 :: forall xs c t y ys encs. (UnSnoc xs ~ (,) ys y, Superset "r-UTF8" y, encs ~ RemoveRs ys, AllEncodeInto "r-UTF8" encs) => Enc xs c ByteString -> Enc xs c Text
- decodeUtf8_1 :: Superset "r-UTF8" y => Enc '[y] c ByteString -> Enc '[y] c Text
- encodeUtf8 :: forall xs c t y ys encs. (UnSnoc xs ~ (,) ys y, Superset "r-UTF8" y, encs ~ RemoveRs ys, AllEncodeInto "r-UTF8" encs) => Enc xs c Text -> Enc xs c ByteString
- encodeUtf8_1 :: Superset "r-UTF8" y => Enc '[y] c Text -> Enc '[y] c ByteString
Documentation
>>>
:set -XScopedTypeVariables -XOverloadedStrings -XDataKinds -XFlexibleContexts -XTypeApplications
>>>
import Test.QuickCheck
>>>
import Test.QuickCheck.Instances.Text()
>>>
import Test.QuickCheck.Instances.ByteString()
>>>
import Data.TypedEncoding.Instances.Restriction.BoundedAlphaNums()
>>>
import qualified Data.ByteString.Char8 as B8
>>>
import Data.Char
>>>
import Data.Either
>>>
import Data.TypedEncoding
>>>
import Data.TypedEncoding.Conv.Text
>>>
let emptyUTF8B = unsafeSetPayload () "" :: Enc '["r-UTF8"] () B.ByteString
>>>
:{
instance Arbitrary (Enc '["r-UTF8"] () B.ByteString) where arbitrary = fmap (fromRight emptyUTF8B) . flip suchThat isRight . fmap (encodeFAll @'["r-UTF8"] @(Either EncodeEx) @(). toEncoding ()) $ arbitrary instance Arbitrary (Enc '["r-UTF8"] () T.Text) where arbitrary = fmap (unsafeSetPayload ()) arbitrary instance Arbitrary (Enc '["r-ASCII"] () B.ByteString) where arbitrary = fmap (unsafeSetPayload ()) . flip suchThat (B8.all isAscii) $ arbitrary instance Arbitrary (Enc '["r-ASCII"] () T.Text) where arbitrary = fmap (unsafeSetPayload ()) . flip suchThat (T.all isAscii) $ arbitrary :}
decodeUtf8 :: forall xs c t y ys encs. (UnSnoc xs ~ (,) ys y, Superset "r-UTF8" y, encs ~ RemoveRs ys, AllEncodeInto "r-UTF8" encs) => Enc xs c ByteString -> Enc xs c Text Source #
With given constraints decodeUtf8
and encodeUtf8
can be used on subsets of "r-UTF8"
Note: For example, the ByteString
encoding of "xd800"
(11101101 10100000 10000000
ed a0 80
) is considered invalid UTF8 by the Text
library
To be consistent we make the same assumption of also restricting representable Unicode chars as in Unicode.D76.
>>>
TE.decodeUtf8 "\237\160\128"
"*** Exception: Cannot decode byte '\xed': Data.Text.Internal.Encoding.decodeUtf8: Invalid UTF-8 stream
The "xdfff" case (11101101 10111111 10111111
ed bf bf
):
>>> TE.decodeUtf8 "237191191"
"*** Exception: Cannot decode byte '\xed': Data.Text.Internal.Encoding.decodeUtf8: Invalid UTF-8 stream
>>>
displ . decodeUtf8 $ (unsafeSetPayload () "Hello" :: Enc '["r-ASCII"] () B.ByteString)
"Enc '[r-ASCII] () (Text Hello)"
"r-UTF8" is redundant:
>>>
displ . utf8Demote . decodeUtf8 $ (unsafeSetPayload () "Hello" :: Enc '["r-UTF8"] () B.ByteString)
"Enc '[] () (Text Hello)"
decodeUtf8
and encodeUtf8
now form isomorphism
\x -> getPayload x == (getPayload . encodeUtf8 . decodeUtf8 @ '["r-UTF8"] @() $ x)
\x -> getPayload x == (getPayload . decodeUtf8 . encodeUtf8 @ '["r-UTF8"] @() $ x)
These nicely work as iso's for "r-ASCII" subset
\x -> getPayload x == (getPayload . encodeUtf8 . decodeUtf8 @ '["r-ASCII"] @() $ x)
\x -> getPayload x == (getPayload . decodeUtf8 . encodeUtf8 @ '["r-ASCII"] @() $ x)
Similarly to pack
this function makes unverified assumption
that the encoding stack xs
does invalidate UTF8 byte layout. This is safe for any "r-" encoding as well
as any of the "enc-" and "do-" encodings that can be currently found in this library.
Future versions of this method are likely to introduce constraints that guarantee better type safety.
This is technically unsafe (even if we ignore the use of unsafeSetPayload
) of decodeUtf8
since currently "r-ban:999"
does not have ByteString
instances and that violates the assumption of matching encoding/decoding stacks
on both sides.
>>> displ . decodeUtf8 $ (unsafeSetPayload () "123" :: Enc '["r-ban:999"] () B.ByteString)
"Enc '[r-ban:999] () (Text 123)"
See Data.TypedEncoding.Conv for more detailed discussion.
Note: implementation uses the partial decodeUtf8
function but provides type level guarantee that it this function
will not error out unless unsafe combinators were used in constructing the encoded input
Since: 0.4.0.0
decodeUtf8_1 :: Superset "r-UTF8" y => Enc '[y] c ByteString -> Enc '[y] c Text Source #
simplified version of decodeUtf8
that works on single r- encodings
@since 0.5.2.0
encodeUtf8 :: forall xs c t y ys encs. (UnSnoc xs ~ (,) ys y, Superset "r-UTF8" y, encs ~ RemoveRs ys, AllEncodeInto "r-UTF8" encs) => Enc xs c Text -> Enc xs c ByteString Source #
>>>
displ $ encodeUtf8 $ utf8Promote $ toEncoding () ("text" :: T.Text)
"Enc '[r-UTF8] () (ByteString text)"
See decodeUtf8
. Similar type safety concerns apply.
See Data.TypedEncoding.Conv for more detailed discussion.
Since: 0.4.0.0
encodeUtf8_1 :: Superset "r-UTF8" y => Enc '[y] c Text -> Enc '[y] c ByteString Source #
simplified version of decodeUtf8
that works on single r- encodings
@since 0.5.2.0