bytesmith-0.3.8.0: Nonresumable byte parser
Safe HaskellNone
LanguageHaskell2010

Data.Bytes.Parser.Latin

Description

Parse input as though it were text encoded by ISO 8859-1 (Latin-1). All byte sequences are valid text under ISO 8859-1.

Synopsis

Matching

Required

char :: e -> Char -> Parser e s () Source #

Consume the next character, failing if it does not match the expected value or if there is no more input.

char2 :: e -> Char -> Char -> Parser e s () Source #

Consume the next two characters, failing if they do not match the expected values.

char2 e a b === char e a *> char e b

char3 :: e -> Char -> Char -> Char -> Parser e s () Source #

Consume three characters, failing if they do not match the expected values.

char3 e a b c === char e a *> char e b *> char e c

char4 :: e -> Char -> Char -> Char -> Char -> Parser e s () Source #

Consume four characters, failing if they do not match the expected values.

char4 e a b c d === char e a *> char e b *> char e c *> char e d

char5 :: e -> Char -> Char -> Char -> Char -> Char -> Parser e s () Source #

Consume five characters, failing if they do not match the expected values.

char6 :: e -> Char -> Char -> Char -> Char -> Char -> Char -> Parser e s () Source #

Consume six characters, failing if they do not match the expected values.

char7 :: e -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Parser e s () Source #

Consume seven characters, failing if they do not match the expected values.

char8 :: e -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Parser e s () Source #

Consume eight characters, failing if they do not match the expected values.

char9 :: e -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Parser e s () Source #

Consume nine characters, failing if they do not match the expected values.

char10 :: e -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Parser e s () Source #

Consume ten characters, failing if they do not match the expected values.

char11 :: e -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Parser e s () Source #

Consume eleven characters, failing if they do not match the expected values.

char12 :: e -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Char -> Parser e s () Source #

Consume twelve characters, failing if they do not match the expected values.

Try

trySatisfy :: (Char -> Bool) -> Parser e s Bool Source #

Runs the predicate on the next character in the input. If the predicate is matched, this consumes the character. Otherwise, the character is not consumed. This returns False if the end of the input has been reached. This never fails.

trySatisfyThen Source #

Arguments

:: forall (r :: RuntimeRep) (e :: Type) (s :: Type) (a :: TYPE r). Parser e s a

Default parser. Runs on Nothing or end of input.

-> (Char -> Maybe (Parser e s a))

Parser-selecting predicate

-> Parser e s a 

Runs the function on the next character in the input. If the function returns Just, this consumes the character and then runs the parser on the remaining input. If the function returns Nothing, this does not consume the tested character, and it runs the default parser on the input (which includes the tested character). If there is no input remaining, this also runs the default parser. This combinator never fails.

One Character

any :: e -> Parser e s Char Source #

Consumes and returns the next character in the input.

opt :: Parser e s (Maybe Char) Source #

Consume a character from the input or return Nothing if end of the stream has been reached. Since ISO 8859-1 maps every bytes to a character, this parser never fails.

opt# :: Parser e s (# (# #) | Char# #) Source #

Variant of opt with unboxed result.

Many Characters

takeTrailedBy :: e -> Char -> Parser e s Bytes Source #

Take characters until the specified character is encountered. Consumes the matched character as well. Fails if the character is not present. Visually, the cursor advancement and resulting Bytes for takeTrailedBy 'D' look like this:

 A B C D E F | input
|->->->-|    | cursor
{-*-*-}      | result bytes

Lookahead

peek :: Parser e s (Maybe Char) Source #

Match any character, to perform lookahead. Returns Nothing if end of input has been reached. Does not consume any input.

Note: Because this parser does not fail, do not use it with combinators such as many, because such as many, because such parsers loop until a failure occurs. Careless use will thus result in an infinite loop.

peek' :: e -> Parser e s Char Source #

Match any byte, to perform lookahead. Does not consume any input, but will fail if end of input has been reached.

Skip

skipDigits :: Parser e s () Source #

Skip the characters 0-9 until a non-digit is encountered. This parser does not fail.

skipDigits1 :: e -> Parser e s () Source #

Variant of skipDigits that requires at least one digit to be present.

skipChar :: Char -> Parser e s () Source #

Skip the character any number of times. This succeeds even if the character was not present.

skipChar1 :: e -> Char -> Parser e s () Source #

Skip the character any number of times. It must occur at least once or else this will fail.

skipTrailedBy :: e -> Char -> Parser e s () Source #

Skip all characters until the terminator is encountered and then consume the matching character as well. Visually, skipTrailedBy 'C' advances the cursor like this:

 A Z B Y C X C W
|->->->->-|

This fails if it reaches the end of input without encountering the character.

skipUntil :: Char -> Parser e s () Source #

Skip all characters until the terminator is encountered. This does not consume the terminator. Visually, skipUntil 'C' advances the cursor like this:

 A Z B Y C X C W
|->->->-|

This succeeds if it reaches the end of the input without encountering the terminator. It never fails.

skipWhile :: (Char -> Bool) -> Parser e s () Source #

Skip while the predicate is matched. This is always inlined.

End of Input

endOfInput :: e -> Parser e s () Source #

Fails if there is still more input remaining.

isEndOfInput :: Parser e s Bool Source #

Returns true if there are no more bytes in the input. Returns false otherwise. Always succeeds.

Numbers

Decimal

Unsigned

decWord :: e -> Parser e s Word Source #

Parse a decimal-encoded number. If the number is too large to be represented by a machine word, this fails with the provided error message. This accepts any number of leading zeroes.

decWord8 :: e -> Parser e s Word8 Source #

Parse a decimal-encoded 8-bit word. If the number is larger than 255, this parser fails.

decWord16 :: e -> Parser e s Word16 Source #

Parse a decimal-encoded 16-bit word. If the number is larger than 65535, this parser fails.

decWord32 :: e -> Parser e s Word32 Source #

Parse a decimal-encoded 32-bit word. If the number is larger than 4294967295, this parser fails.

decWord64 :: e -> Parser e s Word64 Source #

Parse a decimal-encoded unsigned number. If the number is too large to be represented by a 64-bit word, this fails with the provided error message. This accepts any number of leading zeroes.

Signed

decUnsignedInt :: e -> Parser e s Int Source #

Parse a decimal-encoded number. If the number is too large to be represented by a machine integer, this fails with the provided error message. This rejects input with that is preceeded by plus or minus. Consequently, it does not parse negative numbers. Use decStandardInt or decSignedInt for that purpose. On a 64-bit platform decWord will successfully parse 9223372036854775808 (i.e. 2 ^ 63), but decUnsignedInt will fail. This parser allows leading zeroes.

decUnsignedInt# :: e -> Parser e s Int# Source #

Variant of decUnsignedInt with an unboxed result.

decSignedInt :: e -> Parser e s Int Source #

Parse a decimal-encoded number. If the number is too large to be represented by a machine integer, this fails with the provided error message. This allows the number to optionally be prefixed by plus or minus. If the sign prefix is not present, the number is interpreted as positive. This allows leading zeroes.

decStandardInt :: e -> Parser e s Int Source #

Parse a decimal-encoded number. If the number is too large to be represented by a machine integer, this fails with the provided error message. This allows the number to optionally be prefixed by minus. If the minus prefix is not present, the number is interpreted as positive. The disallows a leading plus sign. For example, decStandardInt rejects +42, but decSignedInt allows it.

decTrailingInt Source #

Arguments

:: e

Error message

-> Int

Leading digit, should be between 0 and 9.

-> Parser e s Int 

Variant of decUnsignedInt that lets the caller supply a leading digit. This is useful when parsing formats like JSON where integers with leading zeroes are considered invalid. The calling context must consume the first digit before calling this parser. Results are always positive numbers.

decSignedInteger :: e -> Parser e s Integer Source #

Parse a decimal-encoded integer of arbitrary size. This accepts input that begins with a plus or minus sign. Input without a sign prefix is interpreted as positive.

decUnsignedInteger :: e -> Parser e s Integer Source #

Parse a decimal-encoded positive integer of arbitrary size. This rejects input that begins with a plus or minus sign.

decTrailingInteger Source #

Arguments

:: Int

Leading digit, should be between 0 and 9.

-> Parser e s Integer 

Variant of decUnsignedInteger that lets the caller supply a leading digit. This is useful when parsing formats like JSON where integers with leading zeroes are considered invalid. The calling context must consume the first digit before calling this parser. Results are always positive numbers.

Hexadecimal

Variable Length

hexWord8 :: e -> Parser e s Word8 Source #

Parse a hexadecimal-encoded 8-bit word. If the number is larger than 255, this parser fails. This allows leading zeroes and is insensitive to case. For example, 00A, 0a and A would all be accepted as the same number.

hexWord16 :: e -> Parser e s Word16 Source #

Parse a hexadecimal-encoded 16-bit word. If the number is larger than 65535, this parser fails. This allows leading zeroes and is insensitive to case. For example, 0100a and 100A would both be accepted as the same number.

Fixed Length

hexFixedWord8 :: e -> Parser e s Word8 Source #

Parse exactly two ASCII-encoded characters, interpretting them as the hexadecimal encoding of a 8-bit number. Note that this rejects a sequence such as A, requiring 0A instead. This is insensitive to case.

hexFixedWord16 :: e -> Parser e s Word16 Source #

Parse exactly four ASCII-encoded characters, interpreting them as the hexadecimal encoding of a 16-bit number. Note that this rejects a sequence such as 5A9, requiring 05A9 instead. This is insensitive to case. This is particularly useful when parsing escape sequences in C or JSON, which allow encoding characters in the Basic Multilingual Plane as \uhhhh.

hexFixedWord32 :: e -> Parser e s Word32 Source #

Parse exactly eight ASCII-encoded characters, interpreting them as the hexadecimal encoding of a 32-bit number. Note that this rejects a sequence such as BC5A9, requiring 000BC5A9 instead. This is insensitive to case.

hexFixedWord64 :: e -> Parser e s Word64 Source #

Parse exactly 16 ASCII-encoded characters, interpreting them as the hexadecimal encoding of a 64-bit number. Note that this rejects a sequence such as BC5A9, requiring 00000000000BC5A9 instead. This is insensitive to case.

Digit

hexNibbleLower :: e -> Parser e s Word Source #

Consume a single character that is the lowercase hexadecimal encoding of a 4-bit word. Fails if the character is not in the class [a-f0-9].

tryHexNibbleLower :: Parser e s (Maybe Word) Source #

Consume a single character that is the lowercase hexadecimal encoding of a 4-bit word. Returns Nothing without consuming the character if it is not in the class [a-f0-9]. The parser never fails.

hexNibble :: e -> Parser e s Word Source #

Consume a single character that is the case-insensitive hexadecimal encoding of a 4-bit word. Fails if the character is not in the class [a-fA-F0-9].

tryHexNibble :: Parser e s (Maybe Word) Source #

Consume a single character that is the case-insensitive hexadecimal encoding of a 4-bit word. Returns Nothing without consuming the character if it is not in the class [a-fA-F0-9]. This parser never fails.