Copyright	(c) 2020 Composewell Technologies
License	Apache-2.0
Maintainer	streamly@composewell.com
Stability	experimental
Portability	GHC
Safe Haskell	None
Language	Haskell2010

Streamly.LZ4

Contents

Configuration
Combinators

Description

Streaming APIs for LZ4 (https://github.com/lz4/lz4) compression and decompression.

A compressed LZ4 object (e.g. a file) may be represented by a sequence of one or more LZ4 frames defined by the LZ4 frame format. A frame consists of a frame header followed by a number of compressed blocks and a frame footer. The frame header defines the attributes of the compression method and the blocks in the frame. For example, the blocks may be independently compressed or future blocks may depend on the past blocks. It may also describe the maximum size of the blocks in the frame and use of some optional features.

This module exposes combinators that compress or decompress only the stream of blocks in a frame, not the frame itself. See the Streamly.Internal.LZ4 module for an experimental frame parsing function.

How the blocks are encoded depends on the attributes specified in the frame header. We provide a BlockConfig parameter to specify those options when decoding or encoding a stream of blocks. Once you have parsed the frame header, you can set the BlockConfig accordingly so that the stream of blocks is parsed correctly.

Please build with fusion-plugin for best performance. See the streamly build guide for more details.

The APIs are not yet stable and may change in the future.

Synopsis

data BlockConfig
defaultBlockConfig :: BlockConfig
data BlockSize
    = BlockHasSize
    | BlockMax64KB
    | BlockMax256KB
    | BlockMax1MB
    | BlockMax4MB
setBlockMaxSize :: BlockSize -> BlockConfig -> BlockConfig
compressChunks :: MonadIO m => BlockConfig -> Int -> SerialT m (Array Word8) -> SerialT m (Array Word8)
decompressChunks :: MonadIO m => BlockConfig -> SerialT m (Array Word8) -> SerialT m (Array Word8)

Configuration

data BlockConfig

Defines the LZ4 compressed block format. Please note that the Uncompressed length field is optional and not in the LZ4 specification.

 ----------------------------------------------------------------------
| Compressed length | Uncompressed length | Data | Checksum            |
|       (4 byte)    | (4 byte) (optional) |      | (4 byte) (optional) |
 ----------------------------------------------------------------------

Compressed length is the length of the Data field only. Uncompressed length is present only when the block size is set to BlockHasSize using setBlockMaxSize. Checksum is present when setBlockChecksum is set to True. The 4-byte fields are stored in machine byte order.
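The header layout above can be illustrated with a small pure sketch that splits the length fields off the front of a block. This is not the library's parser: BlockHeader, word32LE, takeWord32 and parseHeader are hypothetical names, the input is a plain list of bytes rather than an Array, the optional checksum is ignored, and a little-endian machine is assumed for "machine byte order".

```haskell
import Data.Bits (shiftL, (.|.))
import Data.Word (Word8, Word32)

-- Hypothetical record for illustration; not a type exported by Streamly.LZ4.
data BlockHeader = BlockHeader
    { compressedLen :: Word32           -- length of the Data field only
    , uncompressedLen :: Maybe Word32   -- present only with BlockHasSize
    } deriving (Eq, Show)

-- Assemble a 32-bit word from bytes, least significant byte first.
-- Assumption: the data was written on a little-endian machine.
word32LE :: [Word8] -> Word32
word32LE = foldr (\b acc -> (acc `shiftL` 8) .|. fromIntegral b) 0

-- Take one 4-byte field from the front of the input, if available.
takeWord32 :: [Word8] -> Maybe (Word32, [Word8])
takeWord32 xs =
    case splitAt 4 xs of
        (bytes, rest) | length bytes == 4 -> Just (word32LE bytes, rest)
        _ -> Nothing

-- Split the header off a block. The Bool says whether the optional
-- uncompressed-length field is present (the BlockHasSize case).
parseHeader :: Bool -> [Word8] -> Maybe (BlockHeader, [Word8])
parseHeader hasSize input = do
    (cLen, rest1) <- takeWord32 input
    if hasSize
        then do
            (uLen, rest2) <- takeWord32 rest1
            Just (BlockHeader cLen (Just uLen), rest2)
        else Just (BlockHeader cLen Nothing, rest1)
```

With hasSize set to True the sketch consumes 8 header bytes; otherwise it consumes 4. In both cases the remainder starts at the Data field.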

defaultBlockConfig :: BlockConfig

The default settings are:

data BlockSize

Maximum uncompressed size of a data block.

Constructors

BlockHasSize	The block header stores the uncompressed size after the compressed size field. Please note that this option is not in the LZ4 specification.
BlockMax64KB
BlockMax256KB
BlockMax1MB
BlockMax4MB

setBlockMaxSize :: BlockSize -> BlockConfig -> BlockConfig

Set the maximum uncompressed size of the data block.
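As a minimal sketch of how a configuration might be built, the setter can be applied to defaultBlockConfig. Only names from the synopsis are used; config4MB and configWithSize are hypothetical names chosen for illustration.

```haskell
import Streamly.LZ4
    (BlockConfig, BlockSize(..), defaultBlockConfig, setBlockMaxSize)

-- Hypothetical name: blocks of at most 4MB of uncompressed data.
config4MB :: BlockConfig
config4MB = setBlockMaxSize BlockMax4MB defaultBlockConfig

-- Hypothetical name: blocks that carry their uncompressed size in the
-- header (an extension that is not in the LZ4 specification).
configWithSize :: BlockConfig
configWithSize = setBlockMaxSize BlockHasSize defaultBlockConfig
```

The same configuration value would normally be passed to both the compressing and the decompressing side so that they agree on the block format.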

Combinators

compressChunks :: MonadIO m => BlockConfig -> Int -> SerialT m (Array Word8) -> SerialT m (Array Word8)

compressChunks config speedup stream compresses an input stream of Array Word8 using the configuration defined by config. The resulting stream is of type Array Word8, where each array represents a compressed input block. Each input array becomes one compressed block.

speedup is a compression speedup factor: the larger the value, the faster the compression, but the size of the compressed data may increase. The factor should be between 1 and 65537 inclusive; a value less than 1 is set to 1 and a value greater than 65537 is set to 65537.
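The clamping of the speedup factor described above can be sketched as a pure function. This mirrors the documented behaviour only; clampSpeedup is a hypothetical name and not part of the module.

```haskell
-- Hypothetical helper mirroring the documented range of the speedup
-- factor: values below 1 become 1, values above 65537 become 65537.
clampSpeedup :: Int -> Int
clampSpeedup = max 1 . min 65537
```

For example, clampSpeedup 0 evaluates to 1 and clampSpeedup 100000 evaluates to 65537, while a value already inside the range is returned unchanged.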

LZ4 does not allow an uncompressed block larger than 2,113,929,216 (0x7E000000) bytes (a little less than 2 GiB). If the compressed block length is more than the maximum uncompressed block length (approximately 2 GiB), it results in a decompression error.
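Since each input array becomes one block, a caller may want to check chunk sizes against this limit before compressing. The following is a sketch under that assumption; maxBlockInput, fitsInOneBlock and splitLengths are hypothetical helpers, not part of the module.

```haskell
-- The documented LZ4 limit on the uncompressed size of one block.
maxBlockInput :: Int
maxBlockInput = 0x7E000000  -- 2,113,929,216 bytes

-- True when a chunk of the given length can be compressed as one block.
fitsInOneBlock :: Int -> Bool
fitsInOneBlock len = len >= 0 && len <= maxBlockInput

-- Split a total length into piece lengths of at most 'limit' bytes each.
splitLengths :: Int -> Int -> [Int]
splitLengths limit total
    | limit <= 0 || total <= 0 = []
    | total <= limit = [total]
    | otherwise = limit : splitLengths limit (total - limit)
```

In practice input chunks are usually far smaller than this limit, so a check like fitsInOneBlock matters mainly when whole files are read into a single array.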

See BlockConfig for more details about the format of the compressed block.

Since 0.1.0

decompressChunks :: MonadIO m => BlockConfig -> SerialT m (Array Word8) -> SerialT m (Array Word8)

Decompress a stream of Array Word8 compressed using LZ4 stream compression. See compressChunks for the format of the input blocks. The input chunks can be of any size; they are resized to the appropriate block size before decompression, based on the block headers. Each decompressed output array corresponds to one compressed block.
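A round trip through both combinators might look like the sketch below. It assumes a streamly 0.8 series release, where Streamly.Prelude and Streamly.Data.Array.Foreign provide the stream and array functions used here; roundTrip is a hypothetical name, and lists of bytes stand in for real input chunks.

```haskell
import Data.Word (Word8)
import qualified Streamly.Data.Array.Foreign as Array
import qualified Streamly.Prelude as Stream
import Streamly.LZ4 (compressChunks, decompressChunks, defaultBlockConfig)

-- Hypothetical helper: compress each chunk as one block, then decompress
-- the resulting stream of blocks with the same configuration.
roundTrip :: [[Word8]] -> IO [[Word8]]
roundTrip chunks =
      Stream.toList
    $ Stream.map Array.toList
    $ decompressChunks defaultBlockConfig
    $ compressChunks defaultBlockConfig 1   -- speedup factor 1
    $ Stream.map Array.fromList
    $ Stream.fromList chunks
```

The concatenation of the output chunks is expected to equal the concatenation of the input chunks, provided the same BlockConfig is used on both sides.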

Since 0.1.0