Copyright | (c) 2020 Composewell Technologies |
---|---|
License | Apache-2.0 |
Maintainer | streamly@composewell.com |
Stability | experimental |
Portability | GHC |
Safe Haskell | None |
Language | Haskell2010 |
Streaming APIs for LZ4 (https://github.com/lz4/lz4) compression and decompression.
A compressed LZ4 object (e.g. a file) may be represented by a sequence of one or more LZ4 frames defined by the LZ4 frame format. A frame consists of a frame header followed by a number of compressed blocks and a frame footer. The frame header defines the attributes of the compression method and the blocks in the frame. For example, the blocks may be independently compressed or future blocks may depend on the past blocks. It may also describe the maximum size of the blocks in the frame and use of some optional features.
This module exposes combinators to only compress or decompress the stream of blocks in a frame and not the frame itself. See the Streamly.Internal.LZ4 module for an experimental frame parsing function.
How the blocks are encoded depends on the attributes specified in the frame header. We provide a BlockConfig parameter to specify those options when decoding or encoding a stream of blocks. Assuming you have parsed the frame, you can set the BlockConfig accordingly to parse the stream of blocks appropriately.
Please build with fusion-plugin for best performance. See the streamly build guide for more details.
The APIs are not yet stable and may change in the future.
Synopsis
- data BlockConfig
- defaultBlockConfig :: BlockConfig
- data BlockSize
- setBlockMaxSize :: BlockSize -> BlockConfig -> BlockConfig
- compressChunks :: MonadIO m => BlockConfig -> Int -> SerialT m (Array Word8) -> SerialT m (Array Word8)
- decompressChunks :: MonadIO m => BlockConfig -> SerialT m (Array Word8) -> SerialT m (Array Word8)
Configuration
data BlockConfig Source #
Defines the LZ4 compressed block format. Please note that the Uncompressed length field is optional and not in the LZ4 specification.
Compressed length | Uncompressed length | Data | Checksum |
---|---|---|---|
(4 byte) | (4 byte) (optional) | | (4 byte) (optional) |
Compressed length is the length of the Data field only. Uncompressed length is present only when setBlockMaxSize is set to BlockHasSize. Checksum is present when setBlockChecksum is set to True. The 4-byte fields are stored in machine byte order.
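As a rough illustration of the layout described above, the following sketch computes how many bytes one encoded block occupies depending on which optional fields are present. The names fieldSize, headerSize and encodedBlockSize are hypothetical helpers introduced here for illustration; they are not part of this module.

```haskell
-- Hypothetical helpers, not part of Streamly.LZ4. They only restate the
-- block layout described above in executable form.

-- | Size in bytes of each fixed-width length or checksum field.
fieldSize :: Int
fieldSize = 4

-- | Number of header bytes that precede the Data field.
-- The argument says whether the optional Uncompressed length field is present.
headerSize :: Bool -> Int
headerSize hasUncompressedLen
    | hasUncompressedLen = 2 * fieldSize
    | otherwise = fieldSize

-- | Total encoded size of one block, given the optional-field flags and the
-- length of the Data field (the value stored in Compressed length).
encodedBlockSize :: Bool -> Bool -> Int -> Int
encodedBlockSize hasUncompressedLen hasChecksum compressedLen =
    headerSize hasUncompressedLen
        + compressedLen
        + (if hasChecksum then fieldSize else 0)
```

For example, a block carrying 100 bytes of compressed data occupies 104 bytes with neither optional field and 112 bytes with both.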
defaultBlockConfig :: BlockConfig Source #
The default settings are:
data BlockSize Source #
Maximum uncompressed size of a data block.
BlockHasSize | Block header has uncompressed size after the compressed size field. Please note that this option is not in the LZ4 specification. |
BlockMax64KB | |
BlockMax256KB | |
BlockMax1MB | |
BlockMax4MB | |
setBlockMaxSize :: BlockSize -> BlockConfig -> BlockConfig Source #
Set the maximum uncompressed size of the data block.
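A configuration is built by applying setters to defaultBlockConfig. The fragment below is a minimal sketch that uses only names listed on this page; the binding name config is arbitrary, and the choice of BlockMax256KB is just an example value.

```haskell
import Streamly.LZ4
    (BlockConfig, BlockSize(..), defaultBlockConfig, setBlockMaxSize)

-- Start from the defaults and override only the maximum block size.
config :: BlockConfig
config = setBlockMaxSize BlockMax256KB defaultBlockConfig
```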
Combinators
compressChunks :: MonadIO m => BlockConfig -> Int -> SerialT m (Array Word8) -> SerialT m (Array Word8) Source #
compressChunks config speedup stream compresses an input stream of Array Word8 using the configuration defined by config. The resulting stream is of type Array Word8, where each array represents a compressed input block. Each input array becomes one compressed block.
speedup is a compression speedup factor: the higher the value, the faster the compression, but the size of the compressed data may increase. The factor should be between 1 and 65537 inclusive; a value less than 1 is set to 1, and a value greater than 65537 is set to 65537.
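The clamping rule for the speedup factor can be written as a one-line function. This is only a sketch of the rule as stated above; clampSpeedup is a hypothetical name and is not exported by this module.

```haskell
-- Hypothetical helper, not part of Streamly.LZ4: restates the documented
-- clamping of the speedup factor to the range 1..65537.
clampSpeedup :: Int -> Int
clampSpeedup = max 1 . min 65537
```

For instance, clampSpeedup 0 evaluates to 1 and clampSpeedup 100000 evaluates to 65537, while values already in range are returned unchanged.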
LZ4 does not allow an uncompressed block larger than 2,113,929,216 (0x7E000000) bytes (a little less than 2GiB). If the compressed block length exceeds the maximum uncompressed block length (approximately 2GiB), decompression fails with an error.
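A caller that wants to guard against oversized inputs before compressing could check the array length against this limit. The sketch below is an assumption about how such a check might look; lz4MaxInputSize and fitsInOneBlock are hypothetical names, not part of this module.

```haskell
-- Hypothetical helpers, not part of Streamly.LZ4.

-- | Largest uncompressed block size in bytes that LZ4 accepts,
-- as stated in the text above.
lz4MaxInputSize :: Int
lz4MaxInputSize = 0x7E000000

-- | True when an input of the given length (in bytes) is within the limit.
fitsInOneBlock :: Int -> Bool
fitsInOneBlock len = len >= 0 && len <= lz4MaxInputSize
```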
See BlockConfig for more details about the format of the compressed block.
Since 0.1.0
decompressChunks :: MonadIO m => BlockConfig -> SerialT m (Array Word8) -> SerialT m (Array Word8) Source #
Decompress a stream of Array Word8 compressed using LZ4 stream compression. See compressChunks for the format of the input blocks. The input chunks can be of any size; they are resized to the appropriate block size before decompression, based on the block headers. Each decompressed output array corresponds to one compressed block.
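The regrouping of arbitrarily sized input chunks into whole blocks can be pictured with a small list-based sketch. It is not the library's implementation: it works on plain lists rather than arrays, is not incremental, handles only the simplest layout (a 4-byte Compressed length followed by Data, with no optional fields), and assumes a little-endian machine for the length field. The names decodeLen and regroup are hypothetical.

```haskell
import Data.Bits (shiftL, (.|.))
import Data.Word (Word8)

-- | Decode a length field from its bytes, least significant byte first
-- (little-endian byte order is an assumption made for this sketch).
decodeLen :: [Word8] -> Int
decodeLen = foldr (\b acc -> (acc `shiftL` 8) .|. fromIntegral b) 0

-- | Regroup input chunks of any size into whole blocks.
-- Each block is a 4-byte length header followed by that many data bytes.
-- Returns the Data field of every complete block, plus any trailing bytes
-- that do not yet form a complete block.
regroup :: [[Word8]] -> ([[Word8]], [Word8])
regroup = go . concat
  where
    go bytes =
        case splitAt 4 bytes of
            (hdr, rest)
                | length hdr == 4
                , let n = decodeLen hdr
                , (payload, rest') <- splitAt n rest
                , length payload == n ->
                    let (blocks, leftover) = go rest'
                     in (payload : blocks, leftover)
            -- Incomplete header or incomplete payload: keep the bytes as leftover.
            _ -> ([], bytes)
```

Given the chunks [[3,0],[0,0,10,20],[30,1,0,0,0,99,2]], regroup yields the two blocks [10,20,30] and [99] with [2] left over, showing that block boundaries are determined by the headers rather than by how the input happened to be chunked.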
Since 0.1.0