grammatical-parsers: parsers that combine into grammars

[ bsd3, library, parsing, program, text ] [ Propose Tags ] [ Report a vulnerability ]
Versions [RSS] 0.1, 0.2, 0.2.1, 0.2.2, 0.3, 0.3.1, 0.3.2, 0.4, 0.4.0.1, 0.4.1, 0.4.1.1, 0.4.1.2, 0.5, 0.5.1, 0.5.2, 0.6, 0.7, 0.7.0.1, 0.7.1, 0.7.2
Change log CHANGELOG.md
Dependencies attoparsec (>=0.13 && <0.15), base (>=4.9 && <5), containers (>=0.5.7.0 && <0.8), grammatical-parsers, input-parsers (<0.4), monoid-subclasses (>=1.0 && <1.3), parsers (<0.13), rank2classes (>=1.4.6 && <1.6), transformers (>=0.5 && <0.7), witherable (>=0.4 && <0.6) [details]
Tested with ghc ==9.8.2, ghc ==9.6.4, ghc ==9.4.8, ghc ==9.2.8, ghc ==9.0.2, ghc ==8.10.7
License BSD-3-Clause
Copyright (c) 2017 Mario Blažević
Author Mario Blažević
Maintainer Mario Blažević <blamario@protonmail.com>
Category Text, Parsing
Home page https://github.com/blamario/grampa/tree/master/grammatical-parsers
Bug tracker https://github.com/blamario/grampa/issues
Source repo head: git clone https://github.com/blamario/grampa
Uploaded by MarioBlazevic at 2024-12-15T15:51:59Z
Distributions
Reverse Dependencies 3 direct, 1 indirect [details]
Executables boolean-transformations, arithmetic
Downloads 8039 total (57 in the last 30 days)
Rating (no votes yet) [estimated by Bayesian average]
Your Rating
  • λ
  • λ
  • λ
Status Docs available [build log]
Last success reported on 2024-12-15 [all 1 reports]

Readme for grammatical-parsers-0.7.2

[back to package description]

Grammatical Parsers

Behold, yet another parser combinator library in Haskell. Except this one is capable of working with grammars rather than mere parsers. A more in-depth description is available in the paper from Haskell Symposium 2017, what follows is a short tutorial.

You can apply the usual Applicative, Alternative, and Monad operators to combine primitive parsers into larger ones. The combinators from the parsers library type classes are also available. Here are some typical imports you may need:

{-# LANGUAGE RecordWildCards, ScopedTypeVariables, TemplateHaskell #-}
module README where
import Control.Applicative
import Data.Char (isDigit)
import Data.Functor.Classes (Show1, showsPrec1)
import Text.Grampa
import Text.Grampa.ContextFree.Memoizing.LeftRecursive (Parser)
import qualified Rank2
import qualified Rank2.TH

What puts this library apart from most is that the parsers it allows you to construct are grammatical, just as the library name says. Instead of writing the parser definitions as top-level bindings, you can and should group them into a grammar record definition, like this:

arithmetic :: Rank2.Apply g => GrammarBuilder Arithmetic g Parser String
arithmetic Arithmetic{..} = Arithmetic{
   sum= product
         <|> string "-" *> (negate <$> product)
         <|> (+) <$> sum <* string "+" <*> product
         <|> (-) <$> sum <* string "-" <*> product,
   product= factor
         <|> (*) <$> product <* string "*" <*> factor
         <|> div <$> product <* string "/" <*> factor,
   factor= read <$> number
           <|> string "(" *> sum <* string ")",
   number= takeCharsWhile1 isDigit <?> "number"}

What on Earth for? One good reason is that these parser definitions can then be left-recursive, which is usually deadly for parser combinator libraries. There are other benefits like memoization and grammar composability, and the main downside is the obligation to declare the grammar record:

data Arithmetic f = Arithmetic{sum     :: f Int,
                               product :: f Int,
                               factor  :: f Int,
                               number  :: f String}

and to make it an instance of several rank 2 type classes:

$(Rank2.TH.deriveAll ''Arithmetic)

Optionally, you may also be inclined to declare a proper Show instance, as it's often handy:

instance Show1 f => Show (Arithmetic f) where
   show Arithmetic{..} =
      "Arithmetic{\n  sum=" ++ showsPrec1 0 sum
           (",\n  product=" ++ showsPrec1 0 factor
           (",\n  factor=" ++ showsPrec1 0 factor
           (",\n  number=" ++ showsPrec1 0 number "}")))

Once that's done, use fixGrammar to, well, fix the grammar

grammar = fixGrammar arithmetic

and then parseComplete or parsePrefix to parse some input.

-- |
-- >>> parseComplete grammar "42"
-- Arithmetic{
--   sum=Compose (Right [42]),
--   product=Compose (Right [42]),
--   factor=Compose (Right [42]),
--   number=Compose (Right ["42"])}
-- >>> parseComplete grammar "1+2*3"
-- Arithmetic{
--   sum=Compose (Right [7]),
--   product=Compose (Left (ParseFailure {failurePosition = Down 4, expectedAlternatives = FailureDescription {staticDescriptions = ["end of input"], literalDescriptions = []}, errorAlternatives = []})),
--   factor=Compose (Left (ParseFailure {failurePosition = Down 4, expectedAlternatives = FailureDescription {staticDescriptions = ["end of input"], literalDescriptions = []}, errorAlternatives = []})),
--   number=Compose (Left (ParseFailure {failurePosition = Down 4, expectedAlternatives = FailureDescription {staticDescriptions = ["end of input"], literalDescriptions = []}, errorAlternatives = []}))}
-- >>> parsePrefix grammar "1+2*3 apples"
-- Arithmetic{
--   sum=Compose (Compose (Right [("+2*3 apples",1),("*3 apples",3),(" apples",7)])),
--   product=Compose (Compose (Right [("+2*3 apples",1)])),
--   factor=Compose (Compose (Right [("+2*3 apples",1)])),
--   number=Compose (Compose (Right [("+2*3 apples","1")]))}

To see more grammar examples, go straight to the examples directory that builds up several smaller grammars and combines them all together in the Combined module.

For more conventional tastes there are monolithic examples of Lua and Oberon grammars as well.