bytebuild-0.3.11.0: Build byte arrays
Safe HaskellNone
LanguageHaskell2010

Data.Bytes.Builder

Synopsis

Bounded Primitives

data Builder Source #

An unmaterialized sequence of bytes that may be pasted into a mutable byte array.

Instances

Instances details
IsString Builder Source # 
Instance details

Defined in Data.Bytes.Builder.Unsafe

Methods

fromString :: String -> Builder #

Semigroup Builder Source # 
Instance details

Defined in Data.Bytes.Builder.Unsafe

Monoid Builder Source # 
Instance details

Defined in Data.Bytes.Builder.Unsafe

ToBuilder Builder Source #

Identity

Instance details

Defined in Data.Bytes.Builder.Class

fromBounded :: Nat n -> Builder n -> Builder Source #

Convert a bounded builder to an unbounded one. If the size is a constant, use Arithmetic.Nat.constant as the first argument to let GHC conjure up this value for you.

Evaluation

run Source #

Arguments

:: Int

Size of initial chunk (use 4080 if uncertain)

-> Builder

Builder

-> Chunks 

Run a builder.

runOnto Source #

Arguments

:: Int

Size of initial chunk (use 4080 if uncertain)

-> Builder

Builder

-> Chunks 
-> Chunks 

Run a builder. The resulting chunks are consed onto the beginning of an existing sequence of chunks.

reversedOnto Source #

Arguments

:: Int

Size of initial chunk (use 4080 if uncertain)

-> Builder

Builder

-> Chunks 
-> Chunks 

Variant of runOnto that conses the additional chunks in reverse order.

putMany Source #

Arguments

:: Foldable f 
=> Int

Size of shared chunk (use 8176 if uncertain)

-> (a -> Builder)

Value builder

-> f a

Collection of values

-> (MutableBytes RealWorld -> IO b)

Consume chunks.

-> IO () 

Run a builder against lots of elements. This fills the same underlying buffer over and over again. Do not let the argument to the callback escape from the callback (i.e. do not write it to an IORef). Also, do not unsafeFreezeByteArray any of the mutable byte arrays in the callback. The intent is that the callback will write the buffer out.

putManyConsLength Source #

Arguments

:: (Foldable f, MonadIO m) 
=> Nat n

Number of bytes used by the serialization of the length

-> (Int -> Builder n)

Length serialization function

-> Int

Size of shared chunk (use 8176 if uncertain)

-> (a -> Builder)

Value builder

-> f a

Collection of values

-> (MutableBytes RealWorld -> m b)

Consume chunks.

-> m () 

Variant of putMany that prefixes each pushed array of chunks with the number of bytes that the chunks in each batch required. (This excludes the bytes required to encode the length itself.) This is useful for chunked HTTP encoding.

Materialized Byte Sequences

bytes :: Bytes -> Builder Source #

Create a builder from a sliced byte sequence. The variants copy and insert provide more control over whether or not the byte sequence is copied or aliased. This function is preferred when the user does not know the size of the byte sequence.

copy :: Bytes -> Builder Source #

Create a builder from a byte sequence. This always results in a call to memcpy. This is beneficial when the byte sequence is known to be small (less than 256 bytes).

copyCons :: Word8 -> Bytes -> Builder Source #

Variant of copy that additionally pastes an extra byte in front of the bytes.

copy2 :: Bytes -> Bytes -> Builder Source #

Create a builder from two byte sequences. This always results in two calls to memcpy. This is beneficial when the byte sequences are known to be small (less than 256 bytes).

insert :: Bytes -> Builder Source #

Create a builder from a byte sequence. This never calls memcpy. Instead, it pushes a chunk that references the argument byte sequence. This wastes the remaining space in the active chunk, so it may adversely affect performance if used carelessly. See flush for a way to mitigate this problem. This functions is most beneficial when the byte sequence is known to be large (more than 8192 bytes).

byteArray :: ByteArray -> Builder Source #

Create a builder from an unsliced byte sequence. Implemented with bytes.

shortByteString :: ShortByteString -> Builder Source #

Create a builder from a short bytestring. Implemented with bytes.

shortTextUtf8 :: ShortText -> Builder Source #

Create a builder from text. The text will be UTF-8 encoded.

shortTextJsonString :: ShortText -> Builder Source #

Create a builder from text. The text will be UTF-8 encoded, and JSON special characters will be escaped. Additionally, the result is surrounded by double quotes. For example:

  • foo ==> "foo" (no escape sequences)
  • \_"_/ ==> "\\_\"_/" (escapes backslashes and quotes)
  • hello<ESC>world ==> "hello\u001Bworld" (where <ESC> is code point 0x1B)

cstring :: CString -> Builder Source #

Create a builder from a NUL-terminated CString. This ignores any textual encoding, copying bytes until NUL is reached.

cstringLen :: CStringLen -> Builder Source #

Create a builder from a C string with explicit length. The builder must be executed before the C string is freed.

stringUtf8 :: String -> Builder Source #

Create a builder from a cons-list of Char. These must be UTF-8 encoded.

Byte Sequence Encodings

sevenEightRight :: Bytes -> Builder Source #

Encode seven bytes into eight so that the encoded form is eight-bit clean. Specifically segment the input bytes inot 7-bit groups (lowest-to-highest index byte, most-to-least significant bit within a byte), pads the last group with trailing zeros, and forms octects by prepending a zero to each group.

The name was chosen because this pads the input bits with zeros on the right, and also because this was likely the originally-indended behavior of the SMILE standard (see sevenEightSmile). Right padding the input bits to a multiple of seven, as in this variant, is consistent with base64 encodings (which encodes 3 bytes in 4) and base85 (which encodes 4 bytes in 5).

sevenEightSmile :: Bytes -> Builder Source #

Encode seven bytes into eight so that the encoded form is eight-bit clean. Specifically segment the input bytes inot 7-bit groups (lowest-to-highest index byte, most-to-least significant bit within a byte), then pad each group with zeros on the left until each group is an octet.

The name was chosen because this is the implementation that is used (probably unintentionally) in the reference SMILE implementation, and so is expected tp be accepted by existing SMILE consumers.

Encode Integral Types

Human-Readable

word64Dec :: Word64 -> Builder Source #

Encodes an unsigned 64-bit integer as decimal. This encoding never starts with a zero unless the argument was zero.

word32Dec :: Word32 -> Builder Source #

Encodes an unsigned 16-bit integer as decimal. This encoding never starts with a zero unless the argument was zero.

word16Dec :: Word16 -> Builder Source #

Encodes an unsigned 16-bit integer as decimal. This encoding never starts with a zero unless the argument was zero.

word8Dec :: Word8 -> Builder Source #

Encodes an unsigned 8-bit integer as decimal. This encoding never starts with a zero unless the argument was zero.

wordDec :: Word -> Builder Source #

Encodes an unsigned machine-sized integer as decimal. This encoding never starts with a zero unless the argument was zero.

naturalDec :: Natural -> Builder Source #

Encodes an unsigned arbitrary-precision integer as decimal. This encoding never starts with a zero unless the argument was zero.

int64Dec :: Int64 -> Builder Source #

Encodes a signed 64-bit integer as decimal. This encoding never starts with a zero unless the argument was zero. Negative numbers are preceded by a minus sign. Positive numbers are not preceded by anything.

int32Dec :: Int32 -> Builder Source #

Encodes a signed 32-bit integer as decimal. This encoding never starts with a zero unless the argument was zero. Negative numbers are preceded by a minus sign. Positive numbers are not preceded by anything.

int16Dec :: Int16 -> Builder Source #

Encodes a signed 16-bit integer as decimal. This encoding never starts with a zero unless the argument was zero. Negative numbers are preceded by a minus sign. Positive numbers are not preceded by anything.

int8Dec :: Int8 -> Builder Source #

Encodes a signed 8-bit integer as decimal. This encoding never starts with a zero unless the argument was zero. Negative numbers are preceded by a minus sign. Positive numbers are not preceded by anything.

intDec :: Int -> Builder Source #

Encodes a signed machine-sized integer as decimal. This encoding never starts with a zero unless the argument was zero. Negative numbers are preceded by a minus sign. Positive numbers are not preceded by anything.

integerDec :: Integer -> Builder Source #

Encode a signed arbitrary-precision integer as decimal. This encoding never starts with a zero unless the argument was zero. Negative numbers are preceded by a minus sign. Positive numbers are not preceded by anything.

Unsigned Words

64-bit

word64PaddedUpperHex :: Word64 -> Builder Source #

Encode a 64-bit unsigned integer as hexadecimal, zero-padding the encoding to 16 digits. This uses uppercase for the alphabetical digits. For example, this encodes the number 1022 as 00000000000003FE.

32-bit

word32PaddedUpperHex :: Word32 -> Builder Source #

Encode a 32-bit unsigned integer as hexadecimal, zero-padding the encoding to 8 digits. This uses uppercase for the alphabetical digits. For example, this encodes the number 1022 as 000003FE.

16-bit

word16PaddedUpperHex :: Word16 -> Builder Source #

Encode a 16-bit unsigned integer as hexadecimal, zero-padding the encoding to 4 digits. This uses uppercase for the alphabetical digits. For example, this encodes the number 1022 as 03FE.

word16PaddedLowerHex :: Word16 -> Builder Source #

Encode a 16-bit unsigned integer as hexadecimal, zero-padding the encoding to 4 digits. This uses lowercase for the alphabetical digits. For example, this encodes the number 1022 as 03fe.

word16LowerHex :: Word16 -> Builder Source #

Encode a 16-bit unsigned integer as hexadecimal without leading zeroes. This uses lowercase for the alphabetical digits. For example, this encodes the number 1022 as 3fe.

word16UpperHex :: Word16 -> Builder Source #

Encode a 16-bit unsigned integer as hexadecimal without leading zeroes. This uses uppercase for the alphabetical digits. For example, this encodes the number 1022 as 3FE.

8-bit

word8PaddedUpperHex :: Word8 -> Builder Source #

Encode a 8-bit unsigned integer as hexadecimal, zero-padding the encoding to 2 digits. This uses uppercase for the alphabetical digits. For example, this encodes the number 11 as 0B.

word8LowerHex :: Word8 -> Builder Source #

Encode a 16-bit unsigned integer as hexadecimal without leading zeroes. This uses lowercase for the alphabetical digits. For example, this encodes the number 1022 as 3FE.

ascii :: Char -> Builder Source #

Encode an ASCII char. Precondition: Input must be an ASCII character. This is not checked.

ascii2 :: Char -> Char -> Builder Source #

Encode two ASCII characters. Precondition: Must be an ASCII characters. This is not checked.

ascii3 :: Char -> Char -> Char -> Builder Source #

Encode three ASCII characters. Precondition: Must be an ASCII characters. This is not checked.

ascii4 :: Char -> Char -> Char -> Char -> Builder Source #

Encode four ASCII characters. Precondition: Must be an ASCII characters. This is not checked.

ascii5 :: Char -> Char -> Char -> Char -> Char -> Builder Source #

Encode five ASCII characters. Precondition: Must be an ASCII characters. This is not checked.

ascii6 :: Char -> Char -> Char -> Char -> Char -> Char -> Builder Source #

Encode six ASCII characters. Precondition: Must be an ASCII characters. This is not checked.

ascii7 :: Char -> Char -> Char -> Char -> Char -> Char -> Char -> Builder Source #

Encode seven ASCII characters. Precondition: Must be an ASCII characters. This is not checked.

ascii8 :: Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Builder Source #

Encode eight ASCII characters. Precondition: Must be an ASCII characters. This is not checked.

char :: Char -> Builder Source #

Encode a UTF-8 char. This only uses as much space as is required.

Machine-Readable

One

word8 :: Word8 -> Builder Source #

Requires exactly 1 byte.

Big Endian

word256BE :: Word256 -> Builder Source #

Requires exactly 32 bytes. Dump the octets of a 256-bit word in a big-endian fashion.

word128BE :: Word128 -> Builder Source #

Requires exactly 16 bytes. Dump the octets of a 128-bit word in a big-endian fashion.

word64BE :: Word64 -> Builder Source #

Requires exactly 8 bytes. Dump the octets of a 64-bit word in a big-endian fashion.

word32BE :: Word32 -> Builder Source #

Requires exactly 4 bytes. Dump the octets of a 32-bit word in a big-endian fashion.

word16BE :: Word16 -> Builder Source #

Requires exactly 2 bytes. Dump the octets of a 16-bit word in a big-endian fashion.

int64BE :: Int64 -> Builder Source #

Requires exactly 8 bytes. Dump the octets of a 64-bit signed integer in a big-endian fashion.

int32BE :: Int32 -> Builder Source #

Requires exactly 4 bytes. Dump the octets of a 32-bit signed integer in a big-endian fashion.

int16BE :: Int16 -> Builder Source #

Requires exactly 2 bytes. Dump the octets of a 16-bit signed integer in a big-endian fashion.

Little Endian

word256LE :: Word256 -> Builder Source #

Requires exactly 32 bytes. Dump the octets of a 256-bit word in a little-endian fashion.

word128LE :: Word128 -> Builder Source #

Requires exactly 16 bytes. Dump the octets of a 128-bit word in a little-endian fashion.

word64LE :: Word64 -> Builder Source #

Requires exactly 8 bytes. Dump the octets of a 64-bit word in a little-endian fashion.

word32LE :: Word32 -> Builder Source #

Requires exactly 4 bytes. Dump the octets of a 32-bit word in a little-endian fashion.

word16LE :: Word16 -> Builder Source #

Requires exactly 2 bytes. Dump the octets of a 16-bit word in a little-endian fashion.

int64LE :: Int64 -> Builder Source #

Requires exactly 8 bytes. Dump the octets of a 64-bit signed integer in a little-endian fashion.

int32LE :: Int32 -> Builder Source #

Requires exactly 4 bytes. Dump the octets of a 32-bit signed integer in a little-endian fashion.

int16LE :: Int16 -> Builder Source #

Requires exactly 2 bytes. Dump the octets of a 16-bit signed integer in a little-endian fashion.

LEB128

intLEB128 :: Int -> Builder Source #

Encode a signed machine-sized integer with LEB-128. This uses zig-zag encoding.

int32LEB128 :: Int32 -> Builder Source #

Encode a 32-bit signed integer with LEB-128. This uses zig-zag encoding.

int64LEB128 :: Int64 -> Builder Source #

Encode a 64-bit signed integer with LEB-128. This uses zig-zag encoding.

wordLEB128 :: Word -> Builder Source #

Encode a machine-sized word with LEB-128.

word32LEB128 :: Word32 -> Builder Source #

Encode a 32-bit word with LEB-128.

word64LEB128 :: Word64 -> Builder Source #

Encode a 64-bit word with LEB-128.

Many

word8Array :: PrimArray Word8 -> Int -> Int -> Builder Source #

Create a builder from a slice of an array of Word8. There is the same as bytes but is provided as a convenience for users working with different types.

Big Endian

Little Endian

Prefixing with Length

consLength Source #

Arguments

:: Nat n

Number of bytes used by the serialization of the length

-> (Int -> Builder n)

Length serialization function

-> Builder

Builder whose length is measured

-> Builder 

Prefix a builder with the number of bytes that it requires.

consLength32LE :: Builder -> Builder Source #

Variant of consLength32BE the encodes the length in a little-endian fashion.

consLength32BE :: Builder -> Builder Source #

Prefix a builder with its size in bytes. This size is presented as a big-endian 32-bit word. The need to prefix a builder with its length shows up a numbers of wire protocols including those of PostgreSQL and Apache Kafka. Note the equivalence:

forall (n :: Int) (x :: Builder).
  let sz = sizeofByteArray (run n (consLength32BE x))
  consLength32BE x === word32BE (fromIntegral sz) <> x

However, using consLength32BE is much more efficient here since it only materializes the ByteArray once.

consLength64BE :: Builder -> Builder Source #

Prefix a builder with its size in bytes. This size is presented as a big-endian 64-bit word. See consLength32BE.

Encode Floating-Point Types

Human-Readable

doubleDec :: Double -> Builder Source #

Encode a double-floating-point number, using decimal notation or scientific notation depending on the magnitude. This has undefined behavior when representing +inf, -inf, and NaN. It will not crash, but the generated numbers will be nonsense.

Replication

replicate Source #

Arguments

:: Int

Number of times to replicate the byte

-> Word8

Byte to replicate

-> Builder 

Replicate a byte the given number of times.

Control

flush :: Int -> Builder Source #

Push the buffer currently being filled onto the chunk list, allocating a new active buffer of the requested size. This is helpful when a small builder is sandwhiched between two large zero-copy builders:

insert bigA <> flush 1 <> word8 0x42 <> insert bigB

Without flush 1, word8 0x42 would see the zero-byte active buffer that insert returned, decide that it needed more space, and allocate a 4080-byte buffer to which only a single byte would be written.

Rebuild

rebuild :: Builder -> Builder Source #

This function and the documentation for it are copied from Takano Akio's fast-builder library.

rebuild b is equivalent to b, but it allows GHC to assume that b will be run at most once. This can enable various optimizations that greately improve performance.

There are two types of typical situations where a use of rebuild is often a win:

  • When constructing a builder using a recursive function. e.g. rebuild $ foldr ....
  • When constructing a builder using a conditional expression. e.g. rebuild $ case x of ...