Safe Haskell | None |
---|---|
Language | Haskell2010 |
Parsing all context-free grammars using Earley's algorithm.
Synopsis
- data Prod r e t a
- terminal :: (t -> Maybe a) -> Prod r e t a
- (<?>) :: Prod r e t a -> e -> Prod r e t a
- data Grammar r a
- rule :: Prod r e t a -> Grammar r (Prod r e t a)
- satisfy :: (t -> Bool) -> Prod r e t t
- token :: Eq t => t -> Prod r e t t
- namedToken :: Eq t => t -> Prod r t t t
- list :: Eq t => [t] -> Prod r e t [t]
- listLike :: (Eq t, ListLike i t) => i -> Prod r e t i
- data Report e i = Report {
- position :: Int
- expected :: [e]
- unconsumed :: i
- data Result s e i a
- type Parser e i a = forall s. i -> ST s (Result s e i a)
- parser :: ListLike i t => (forall r. Grammar r (Prod r e t a)) -> Parser e i a
- allParses :: Parser e i a -> i -> ([(a, Int)], Report e i)
- fullParses :: ListLike i t => Parser e i a -> i -> ([a], Report e i)
- report :: Parser e i a -> i -> Report e i
- type Generator t a = forall s. ST s (Result s t a)
- generator :: (forall r. Grammar r (Prod r e t a)) -> [t] -> Generator t a
- language :: Generator t a -> [(a, [t])]
- upTo :: Int -> Generator t a -> [(a, [t])]
- exactly :: Int -> Generator t a -> [(a, [t])]
Context-free grammars
A production.
The type parameters are:
a
: The return type of the production.
t
for terminal: The type of the terminals that the production operates
on.
e
for expected: The type of names, used for example to report expected
tokens.
r
for rule: The type of a non-terminal. This plays a role similar to the
s
in the type ST s a
. Since the parser
function expects the r
to be
universally quantified, there is not much to do with this parameter other
than leaving it universally quantified.
As an example,
is the type of a production that
returns an Prod
r String
Char
Int
Int
, operates on (lists of) characters and reports String
names.
Most of the functionality of Prod
s is obtained through its instances, e.g.
Functor
, Applicative
, and Alternative
.
Instances
Functor (Prod r e t) Source # | |
Applicative (Prod r e t) Source # | |
Defined in Text.Earley.Grammar | |
Alternative (Prod r e t) Source # | |
(IsString t, Eq t, a ~ t) => IsString (Prod r e t a) Source # | String literals can be interpreted as
|
Defined in Text.Earley.Grammar fromString :: String -> Prod r e t a # | |
Semigroup a => Semigroup (Prod r e t a) Source # | |
Monoid a => Monoid (Prod r e t a) Source # | |
terminal :: (t -> Maybe a) -> Prod r e t a Source #
Match a token for which the given predicate returns Just a
,
and return the a
.
(<?>) :: Prod r e t a -> e -> Prod r e t a infixr 0 Source #
A named production (used for reporting expected things).
A context-free grammar.
The type parameters are:
a
: The return type of the grammar (often a Prod
).
r
for rule: The type of a non-terminal. This plays a role similar to the
s
in the type ST s a
. Since the parser
function expects the r
to be
universally quantified, there is not much to do with this parameter other
than leaving it universally quantified.
Most of the functionality of Grammar
s is obtained through its instances,
e.g. Monad
and MonadFix
. Note that GHC has syntactic sugar for
MonadFix
: use {-# LANGUAGE RecursiveDo #-}
and mdo
instead of
do
.
rule :: Prod r e t a -> Grammar r (Prod r e t a) Source #
Create a new non-terminal by giving its production.
Derived operators
satisfy :: (t -> Bool) -> Prod r e t t Source #
Match a token that satisfies the given predicate. Returns the matched token.
namedToken :: Eq t => t -> Prod r t t t Source #
Match a single token and give it the name of the token.
listLike :: (Eq t, ListLike i t) => i -> Prod r e t i Source #
Match a ListLike
of tokens in sequence.
Parsing
A parsing report, which contains fields that are useful for presenting errors to the user if a parse is deemed a failure. Note however that we get a report even when we successfully parse something.
Report | |
|
Instances
(Eq e, Eq i) => Eq (Report e i) Source # | |
(Ord e, Ord i) => Ord (Report e i) Source # | |
Defined in Text.Earley.Parser.Internal | |
(Read e, Read i) => Read (Report e i) Source # | |
(Show e, Show i) => Show (Report e i) Source # | |
The result of a parse.
Ended (Report e i) | The parser ended. |
Parsed (ST s [a]) Int i (ST s (Result s e i a)) | The parser parsed a number of |
parser :: ListLike i t => (forall r. Grammar r (Prod r e t a)) -> Parser e i a Source #
Create a parser from the given grammar.
allParses :: Parser e i a -> i -> ([(a, Int)], Report e i) Source #
Return all parses from the result of a given parser. The result may
contain partial parses. The Int
s are the position at which a result was
produced.
The elements of the returned list of results are sorted by their position in ascending order. If there are multiple results at the same position they are returned in an unspecified order.
fullParses :: ListLike i t => Parser e i a -> i -> ([a], Report e i) Source #
Return all parses that reached the end of the input from the result of a given parser.
If there are multiple results they are returned in an unspecified order.
Recognition
report :: Parser e i a -> i -> Report e i Source #
See e.g. how far the parser is able to parse the input string before it fails. This can be much faster than getting the parse results for highly ambiguous grammars.
Language generation
generator :: (forall r. Grammar r (Prod r e t a)) -> [t] -> Generator t a Source #
Create a language generator for given grammar and list of allowed tokens.
language :: Generator t a -> [(a, [t])] Source #
Run a generator, returning all members of the language.
The members are returned as parse results paired with the list of tokens used to produce the result. The elements of the returned list of results are sorted by their length in ascending order. If there are multiple results of the same length they are returned in an unspecified order.
upTo :: Int -> Generator t a -> [(a, [t])] Source #
upTo n gen
runs the generator gen
, returning all members of the
language that are of length less than or equal to n
.
The members are returned as parse results paired with the list of tokens used to produce the result. The elements of the returned list of results are sorted by their length in ascending order. If there are multiple results of the same length they are returned in an unspecified order.
exactly :: Int -> Generator t a -> [(a, [t])] Source #
exactly n gen
runs the generator gen
, returning all members of the
language that are of length equal to n
.
The members are returned as parse results paired with the list of tokens used to produce the result. If there are multiple results they are returned in an unspecified order.