Copyright | ©2019 James Brock |
---|---|
License | BSD2 |
Maintainer | James Brock <jamesbrock@gmail.com> |
Safe Haskell | None |
Language | Haskell2010 |
Replace.Megaparsec is for finding text patterns, and also editing and replacing the found patterns. This activity is traditionally done with regular expressions, but Replace.Megaparsec uses Text.Megaparsec parsers instead for the pattern matching.
Replace.Megaparsec can be used in the same sort of “pattern capture” or “find all” situations in which one would use Python re.findall, or Perl m//, or Unix grep.
Replace.Megaparsec can be used in the same sort of “stream editing” or “search-and-replace” situations in which one would use Python re.sub, or Perl s///, or Unix sed, or awk.
See the replace-megaparsec package README for usage examples.
Synopsis
- sepCap :: forall e s m a. MonadParsec e s m => m a -> m [Either (Tokens s) a]
- findAll :: MonadParsec e s m => m a -> m [Either (Tokens s) (Tokens s)]
- findAllCap :: MonadParsec e s m => m a -> m [Either (Tokens s) (Tokens s, a)]
- streamEdit :: forall e s a. (Ord e, Stream s, Monoid s, Tokens s ~ s, Show s, Show (Token s), Typeable s) => Parsec e s a -> (a -> s) -> s -> s
- streamEditT :: forall e s m a. (Ord e, Stream s, Monad m, Monoid s, Tokens s ~ s, Show s, Show (Token s), Typeable s) => ParsecT e s m a -> (a -> m s) -> s -> m s
Parser combinator
sepCap
:: MonadParsec e s m | |
=> m a | The pattern matching parser |
-> m [Either (Tokens s) a] |
Separate and capture
Parser combinator to find all of the non-overlapping occurrences of the pattern parser sep in a text stream.
The sepCap parser will always consume its entire input and can never fail.
Output
The input stream is separated into a list of sections:
- sections which can be parsed by the pattern sep will be captured as matching sections in Right
- non-matching sections of the stream will be captured in Left.
There are two constraints on the output:
- The output list will be non-empty. If there are no pattern matches, then the entire input stream will be returned as one non-matching Left section. If the input is "" then the output list will be [Left ""].
- The output list will not contain two consecutive Lefts.
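As a minimal sketch of the output shape described above, the following runs sepCap with parseMaybe from Text.Megaparsec. The hexparser pattern, the Parser type alias, and the sample inputs are illustrative assumptions, not part of this module.

```haskell
import Data.Void (Void)
import Replace.Megaparsec (sepCap)
import Text.Megaparsec (Parsec, chunk, parseMaybe)
import Text.Megaparsec.Char.Lexer (hexadecimal)

-- Void error type, as recommended for pattern parsers.
type Parser = Parsec Void String

-- Illustrative pattern (an assumption of this sketch): "0x" then hex digits.
hexparser :: Parser Integer
hexparser = chunk "0x" *> hexadecimal

-- Matches land in Right, the text between them in Left.
sections :: Maybe [Either String Integer]
sections = parseMaybe (sepCap hexparser) "0xA 000 0xFFFF"
-- Just [Right 10,Left " 000 ",Right 65535]

-- With no match, the whole input comes back as a single Left section.
noMatch :: Maybe [Either String Integer]
noMatch = parseMaybe (sepCap hexparser) "no hex here"
-- Just [Left "no hex here"]

main :: IO ()
main = do
  print sections
  print noMatch
```

Because sepCap consumes the entire input, parseMaybe (which also requires end of input) is a convenient way to run it.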
Zero-width matches forbidden
If the pattern matching parser sep would succeed without consuming any input then sepCap will force it to fail.
If we allow sep to match a zero-width pattern, then it can match the same zero-width pattern again at the same position on the next iteration, which would result in an infinite number of overlapping pattern matches.
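To illustrate the rule, here is a small sketch with a pattern that is able to succeed on zero tokens. The maybeDigits parser and the input are hypothetical; the expected result follows from the rule as documented above, where every zero-width success is treated as a failure.

```haskell
import Data.Void (Void)
import Replace.Megaparsec (sepCap)
import Text.Megaparsec (Parsec, many, parseMaybe)
import Text.Megaparsec.Char (digitChar)

type Parser = Parsec Void String

-- Hypothetical pattern: zero or more digits, so it can succeed
-- without consuming any input.
maybeDigits :: Parser String
maybeDigits = many digitChar

-- At 'a' and 'b' the pattern would match zero digits, so sepCap
-- treats those positions as non-matching. Only "12" is a real match.
result :: Maybe [Either String String]
result = parseMaybe (sepCap maybeDigits) "ab12"
-- Just [Left "ab",Right "12"]

main :: IO ()
main = print result
```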
Special accelerated inputs
There are specialization rewrite rules to speed up this function when the input type is Data.Text or Data.ByteString.
Error parameter
The error type parameter e for sep should usually be Void, because sep fails on every token in a non-matching Left section, so parser failures will not be reported.
Notes
This sepCap parser combinator is the basis for all of the other features of this module.
It is similar to the sep* family of functions found in parser-combinators and parsers but, importantly, it returns the parsed result of the sep parser instead of throwing it away, much as manyTill_ keeps the result of its end parser.
findAll
:: MonadParsec e s m | |
=> m a | The pattern matching parser |
-> m [Either (Tokens s) (Tokens s)] |
findAllCap
:: MonadParsec e s m | |
=> m a | The pattern matching parser |
-> m [Either (Tokens s) (Tokens s, a)] |
Find all occurrences, parse and capture pattern matches
Parser combinator for finding all occurrences of a pattern in a stream.
Will call sepCap with the match combinator so that the text which matched the pattern parser sep will be returned in the Right sections, along with the result of the parse of sep.
Definition:
findAllCap sep = sepCap (match sep)
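The difference between findAll and findAllCap shows up in the Right sections. The sketch below assumes an illustrative hexparser pattern and sample input; only findAll, findAllCap, and the Megaparsec imports come from the libraries themselves.

```haskell
import Data.Void (Void)
import Replace.Megaparsec (findAll, findAllCap)
import Text.Megaparsec (Parsec, chunk, parseMaybe)
import Text.Megaparsec.Char.Lexer (hexadecimal)

type Parser = Parsec Void String

-- Illustrative pattern (an assumption of this sketch).
hexparser :: Parser Integer
hexparser = chunk "0x" *> hexadecimal

-- findAll keeps only the matched text in Right.
matchedText :: Maybe [Either String String]
matchedText = parseMaybe (findAll hexparser) "0xA 000 0xFFFF"
-- Just [Right "0xA",Left " 000 ",Right "0xFFFF"]

-- findAllCap keeps the matched text paired with the parsed value.
matchedBoth :: Maybe [Either String (String, Integer)]
matchedBoth = parseMaybe (findAllCap hexparser) "0xA 000 0xFFFF"
-- Just [Right ("0xA",10),Left " 000 ",Right ("0xFFFF",65535)]

main :: IO ()
main = do
  print matchedText
  print matchedBoth
```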
Running parser
streamEdit
:: (Ord e, Stream s, Monoid s, Tokens s ~ s, Show s, Show (Token s), Typeable s) | |
=> Parsec e s a | The parser |
-> (a -> s) | The editor function |
-> s | The input stream of text to be edited. |
-> s |
Stream editor
Also known as “find-and-replace”, or “match-and-substitute”. Finds all of the sections of the stream which match the pattern sep, and replaces them with the result of the editor function.
This function is not a “parser combinator,” it is a “way to run a parser”, like parse or runParserT.
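A minimal sketch of a stream edit, assuming a pattern that parses decimal numbers and an editor that doubles them. The input string and the names Parser and doubled are illustrative.

```haskell
import Data.Void (Void)
import Replace.Megaparsec (streamEdit)
import Text.Megaparsec (Parsec)
import Text.Megaparsec.Char.Lexer (decimal)

type Parser = Parsec Void String

-- Pattern: a decimal number. Editor: render the doubled value.
-- Text that does not match the pattern passes through unchanged.
doubled :: String
doubled = streamEdit (decimal :: Parser Int) (show . (* 2)) "a 21 b 4"
-- "a 42 b 8"

main :: IO ()
main = putStrLn doubled
```

Note that streamEdit is applied directly to the input and returns the edited stream, so no separate call to parse is needed.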
Access the matched section of text in the editor
If you want access to the matched string in the editor function, then combine the pattern parser sep with match.
This will effectively change the type of the editor function to (s,a) -> s.
This allows us to write an editor function which can choose to not edit the match and just leave it as it is. If the editor function returns the first item in the tuple, then streamEdit will not change the matched string.
So, for all sep:
streamEdit (match sep) fst ≡ id
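The identity above, and a selective editor built on match, can be sketched as follows. The hexparser pattern, the threshold of 16, and the input are assumptions chosen for illustration.

```haskell
import Data.Void (Void)
import Replace.Megaparsec (streamEdit)
import Text.Megaparsec (Parsec, chunk, match)
import Text.Megaparsec.Char.Lexer (hexadecimal)

type Parser = Parsec Void String

-- Illustrative pattern (an assumption of this sketch).
hexparser :: Parser Integer
hexparser = chunk "0x" *> hexadecimal

input :: String
input = "0xA 000 0xFFFF"

-- Returning the matched text (fst) leaves the stream untouched.
unchanged :: String
unchanged = streamEdit (match hexparser) fst input

-- Rewrite small values as decimal, keep larger ones as they were.
selective :: String
selective =
  streamEdit (match hexparser) (\(s, r) -> if r <= 16 then show r else s) input
-- "10 000 0xFFFF"

main :: IO ()
main = do
  putStrLn unchanged
  putStrLn selective
```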
Type constraints
The type of the stream of text that is input must be Stream s such that Tokens s ~ s, because we want to output the same type of stream that was input. That requirement is satisfied for all the Stream instances included with Text.Megaparsec: Data.Text, Data.Text.Lazy, Data.ByteString, Data.ByteString.Lazy, and Data.String.
We need the Monoid s instance so that we can mconcat the output stream.
The error type parameter e should usually be Void.
streamEditT
:: (Ord e, Stream s, Monad m, Monoid s, Tokens s ~ s, Show s, Show (Token s), Typeable s) | |
=> ParsecT e s m a | The parser |
-> (a -> m s) | The editor function |
-> s | The input stream of text to be edited. |
-> m s |
Stream editor transformer
Monad transformer version of streamEdit.
Both the parser sep and the editor function run in the underlying monad context.
If you want to do IO operations in the editor function or the parser sep, then run this in IO.
If you want the editor function or the parser sep to remember some state, then run this in a stateful monad.
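As a sketch of the stateful case, the editor below runs in State Int from the mtl package and replaces each match with a running counter. The pattern, the editor, and the input are hypothetical choices for illustration.

```haskell
import Control.Monad.State.Strict (State, evalState, get, put)
import Data.Void (Void)
import Replace.Megaparsec (streamEditT)
import Text.Megaparsec (ParsecT, chunk)

-- Hypothetical pattern: the literal "x", run over the State Int monad.
pat :: ParsecT Void String (State Int) String
pat = chunk "x"

-- Editor: ignore the match and emit the current counter, then bump it.
editor :: String -> State Int String
editor _ = do
  n <- get
  put (n + 1)
  pure (show n)

-- Run the stream edit with the counter starting at 0.
numbered :: String
numbered = evalState (streamEditT pat editor "x-x-x") 0
-- "0-1-2"

main :: IO ()
main = putStrLn numbered
```

The same shape works with IO as the underlying monad, in which case the editor can perform IO actions for each match.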