Safe Haskell | None |
---|---|
Language | Haskell2010 |
Avro encoding and decoding routines.
This library provides a high level interface for encoding and decoding Haskell values in Apache's Avro serialization format.
Synopsis
- data Schema where
- Null
- Boolean
- Int { }
- Long { }
- Float
- Double
- Bytes { }
- String { }
- Array { }
- Map { }
- NamedType TypeName
- Record { }
- Enum { }
- Union { }
- Fixed {
- name :: TypeName
- aliases :: [TypeName]
- size :: Int
- logicalTypeF :: Maybe LogicalTypeFixed
- pattern Int' :: Schema
- pattern Long' :: Schema
- pattern Bytes' :: Schema
- pattern String' :: Schema
- data Field = Field {
- fldName :: Text
- fldAliases :: [Text]
- fldDoc :: Maybe Text
- fldOrder :: Maybe Order
- fldType :: Schema
- fldDefault :: Maybe DefaultValue
- data Order
- data TypeName = TN {}
- data Decimal = Decimal {}
- newtype LogicalTypeBytes = DecimalB Decimal
- data LogicalTypeFixed
- data LogicalTypeInt
- data LogicalTypeLong
- data LogicalTypeString = UUID
- data ReadSchema
- deconflict :: Schema -> Schema -> Either String ReadSchema
- readSchemaFromSchema :: Schema -> ReadSchema
- encodeValue :: (HasAvroSchema a, ToAvro a) => a -> ByteString
- encodeValueWithSchema :: ToAvro a => Schema -> a -> ByteString
- decodeValue :: forall a. (HasAvroSchema a, FromAvro a) => ByteString -> Either String a
- decodeValueWithSchema :: FromAvro a => ReadSchema -> ByteString -> Either String a
- decodeContainer :: forall a. (HasAvroSchema a, FromAvro a) => ByteString -> [Either String a]
- decodeContainerWithEmbeddedSchema :: forall a. FromAvro a => ByteString -> [Either String a]
- decodeContainerWithReaderSchema :: forall a. FromAvro a => Schema -> ByteString -> [Either String a]
- encodeContainer :: forall a. (HasAvroSchema a, ToAvro a) => Codec -> [[a]] -> IO ByteString
- encodeContainerWithSchema :: ToAvro a => Codec -> Schema -> [[a]] -> IO ByteString
- encodeContainerWithSync :: ToAvro a => Codec -> Schema -> ByteString -> [[a]] -> ByteString
- newSyncBytes :: IO ByteString
- extractContainerValuesBytes :: ByteString -> Either String (Schema, [Either String ByteString])
- decodeContainerValuesBytes :: forall a. FromAvro a => Schema -> ByteString -> Either String (Schema, [Either String (a, ByteString)])
- class ToAvro a
- class FromAvro a
- data Codec
- nullCodec :: Codec
- deflateCodec :: Codec
- class HasAvroSchema a where
- schemaOf :: HasAvroSchema a => a -> Schema
Schema
N.B. It is possible to create a Haskell value (of Schema
type) that is
not a valid Avro schema by violating one of the above or one of the
conditions called out in validateSchema
.
Null | |
Boolean | |
Int | |
Long | |
Float | |
Double | |
Bytes | |
String | |
Array | |
Map | |
NamedType TypeName | |
Record | |
Enum | |
Union | |
Fixed | |
|
Instances
Field | |
|
Instances
Instances
Eq Order Source # | |
Ord Order Source # | |
Show Order Source # | |
Generic Order Source # | |
Lift Order Source # | |
ToJSON Order Source # | |
Defined in Data.Avro.Schema.Schema | |
FromJSON Order Source # | |
NFData Order Source # | |
Defined in Data.Avro.Schema.Schema | |
type Rep Order Source # | |
Defined in Data.Avro.Schema.Schema |
A named type in Avro has a name and, optionally, a namespace.
A name is a string that starts with an ASCII letter or underscore followed by letters, underscores and digits:
name ::= [A-Za-z_][A-Za-z0-9_]*
Examples include "_foo7"
, Bar_
and "x"
.
A namespace is a sequence of names with the same lexical
structure. When written as a string, the components of a namespace
are separated with dots ("com.example"
).
TypeName
represents a fullname—a name combined with a
namespace. These are written and parsed as dot-separated
strings. The TypeName
TN Foo ["com", "example"]
is rendered
as "com.example.Foo"
.
Fullnames have to be globally unique inside an Avro schema.
A namespace of []
or [""]
is the "null namespace". In avro
an explicitly null-namespaced identifier is written as ".Foo"
Instances
Instances
Eq Decimal Source # | |
Ord Decimal Source # | |
Show Decimal Source # | |
Generic Decimal Source # | |
Lift Decimal Source # | |
NFData Decimal Source # | |
Defined in Data.Avro.Schema.Schema | |
type Rep Decimal Source # | |
Defined in Data.Avro.Schema.Schema type Rep Decimal = D1 (MetaData "Decimal" "Data.Avro.Schema.Schema" "avro-0.5.2.0-1fGNgWfYbHIOeLCKAVcy" False) (C1 (MetaCons "Decimal" PrefixI True) (S1 (MetaSel (Just "precision") NoSourceUnpackedness NoSourceStrictness DecidedStrict) (Rec0 Integer) :*: S1 (MetaSel (Just "scale") NoSourceUnpackedness NoSourceStrictness DecidedStrict) (Rec0 Integer))) |
newtype LogicalTypeBytes Source #
Instances
data LogicalTypeFixed Source #
Instances
data LogicalTypeInt Source #
Instances
data LogicalTypeLong Source #
Instances
data LogicalTypeString Source #
Instances
Deconflicting schemas
data ReadSchema Source #
This type represents a deconflicted version of a Schema
.
Schema resolution is described in Avro specification: https://avro.apache.org/docs/current/spec.html#Schema+Resolution
This library represents "normal" schema and "deconflicted" schema as different types to avoid confusion between these two usecases (we shouldn't serialise values with such schema) and to be able to accomodate some extra information that links between how data is supposed transformed between what reader wants and what writer has.
Instances
deconflict :: Schema -> Schema -> Either String ReadSchema Source #
deconflict writer reader
will produce a schema that can decode
with the writer's schema into the form specified by the reader's schema.
Schema resolution rules are described by the specification: https://avro.apache.org/docs/current/spec.html#Schema+Resolution
readSchemaFromSchema :: Schema -> ReadSchema Source #
Converts Schema
into ReadSchema
. This function may be useful when it is known
that the writer and the reader schemas are the same.
Individual values
encodeValue :: (HasAvroSchema a, ToAvro a) => a -> ByteString Source #
Serialises an individual value into Avro using the schema
from its coresponding HasAvroSchema
instance.
encodeValueWithSchema :: ToAvro a => Schema -> a -> ByteString Source #
Serialises an individual value into Avro with the schema provided.
decodeValue :: forall a. (HasAvroSchema a, FromAvro a) => ByteString -> Either String a Source #
Deserialises an individual value from Avro using the schema from its coresponding HasAvroSchema
.
NOTE: This function is only to be used when reader and writes schemas are known to be the same. Because only one schema is known at this point, and it is the reader schema, no decondlicting can be performed.
decodeValueWithSchema :: FromAvro a => ReadSchema -> ByteString -> Either String a Source #
Deserialises an individual value from Avro.
Working with containers
Decoding containers
decodeContainer :: forall a. (HasAvroSchema a, FromAvro a) => ByteString -> [Either String a] Source #
Decodes the container using a schema from HasAvroSchema
as a reader schema.
Errors are reported as a part of the list and the list will stop at first error. This means that the consumer will get all the "good" content from the container until the error is detected, then this error and then the list is finished.
decodeContainerWithEmbeddedSchema :: forall a. FromAvro a => ByteString -> [Either String a] Source #
Decodes the container as a list of values of the requested type.
Errors are reported as a part of the list and the list will stop at first error. This means that the consumer will get all the "good" content from the container until the error is detected, then this error and then the list is finished.
decodeContainerWithReaderSchema :: forall a. FromAvro a => Schema -> ByteString -> [Either String a] Source #
Decodes the container as a list of values of the requested type.
The provided reader schema will be de-conflicted with the schema embedded with the container.
Errors are reported as a part of the list and the list will stop at first error. This means that the consumer will get all the "good" content from the container until the error is detected, then this error and then the list is finished.
encodeContainer :: forall a. (HasAvroSchema a, ToAvro a) => Codec -> [[a]] -> IO ByteString Source #
Encode chunks of values into a container, using 16 random bytes for
the synchronization markers and a corresponding HasAvroSchema
schema.
Blocks are compressed (or not) according to the given Codec
(nullCodec
or deflateCodec
).
encodeContainerWithSchema :: ToAvro a => Codec -> Schema -> [[a]] -> IO ByteString Source #
Encode chunks of values into a container, using 16 random bytes for
the synchronization markers. Blocks are compressed (or not) according
to the given Codec
(nullCodec
or deflateCodec
).
encodeContainerWithSync :: ToAvro a => Codec -> Schema -> ByteString -> [[a]] -> ByteString Source #
Encode chunks of objects into a container, using the provided ByteString as the synchronization markers.
newSyncBytes :: IO ByteString Source #
Generates a new synchronization marker for encoding Avro containers
Extracting containers' data
extractContainerValuesBytes :: ByteString -> Either String (Schema, [Either String ByteString]) Source #
Splits container into a list of individual avro-encoded values.
This is particularly useful when slicing up containers into one or more smaller files. By extracting the original bytestring it is possible to avoid re-encoding data.
decodeContainerValuesBytes :: forall a. FromAvro a => Schema -> ByteString -> Either String (Schema, [Either String (a, ByteString)]) Source #
Splits container into a list of individual avro-encoded values. This version provides both encoded and decoded values.
This is particularly useful when slicing up containers into one or more smaller files. By extracting the original bytestring it is possible to avoid re-encoding data.
Classes
Describes how to encode Haskell data types into Avro bytes
Instances
Descrives how to convert a given intermediate Value
into a Haskell data type.
Instances
Compression
A Codec
allows for compression/decompression of a block in an
Avro container according to the Avro spec.
nullCodec
specifies null
required by Avro spec.
(see https://avro.apache.org/docs/1.8.1/spec.html#null)
deflateCodec :: Codec Source #
deflateCodec
specifies deflate
codec required by Avro spec.
(see https://avro.apache.org/docs/1.8.1/spec.html#deflate)
class HasAvroSchema a where Source #
Instances
schemaOf :: HasAvroSchema a => a -> Schema Source #