Copyright | (c) 2015 Oleg Grenrus |
---|---|
License | BSD3 |
Maintainer | Oleg Grenrus <oleg.grenrus@iki.fi> |
Stability | experimental |
Safe Haskell | Safe |
Language | Haskell2010 |
- type RE' a = RE Char a
- data RE s a :: * -> * -> *
- sym :: Char -> RE' Char
- psym :: (Char -> Bool) -> RE' Char
- msym :: (Char -> Maybe a) -> RE' a
- anySym :: RE' Char
- string :: Text -> RE' Text
- reFoldl :: Greediness -> (b -> a -> b) -> b -> RE' a -> RE' b
- data Greediness :: *
- few :: RE' a -> RE' [a]
- withMatched :: RE' a -> RE' (a, Text)
- match :: RE' a -> Text -> Maybe a
- (=~) :: Text -> RE' a -> Maybe a
- replace :: RE' Text -> Text -> Text
- findFirstPrefix :: RE' a -> Text -> Maybe (a, Text)
- findLongestPrefix :: RE' a -> Text -> Maybe (a, Text)
- findShortestPrefix :: RE' a -> Text -> Maybe (a, Text)
- findFirstInfix :: RE' a -> Text -> Maybe (Text, a, Text)
- findLongestInfix :: RE' a -> Text -> Maybe (Text, a, Text)
- findShortestInfix :: RE' a -> Text -> Maybe (Text, a, Text)
- module Control.Applicative
Types
data RE s a :: * -> * -> *
Type of regular expressions that recognize symbols of type s and produce a result of type a.
Regular expressions can be built using the Functor, Applicative and Alternative instances in the following natural way:
- f <$> ra matches iff ra matches, and its return value is the result of applying f to the return value of ra.
- pure x matches the empty string (i.e. it does not consume any symbols), and its return value is x.
- rf <*> ra matches a string iff it is a concatenation of two strings: one matched by rf and the other matched by ra. The return value is f a, where f and a are the return values of rf and ra respectively.
- ra <|> rb matches a string which is accepted by either ra or rb. It is left-biased, so if both can match, the result of ra is used.
- empty is a regular expression which does not match any string.
- many ra matches a concatenation of zero or more strings matched by ra and returns the list of ra's return values on those strings.
- some ra matches a concatenation of one or more strings matched by ra and returns the list of ra's return values on those strings.
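As a minimal sketch of how these combinators compose, the following builds a small key=value matcher and a yes/no matcher. It assumes the module documented here can be imported as Text.Regex.Applicative.Text (the module name does not appear on this page) and that OverloadedStrings is enabled so that the input literals are Text. Pair, pair and flag are hypothetical names used only for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import Data.Char (isAlpha, isDigit)
-- Assumption: the module name is not shown on this page.
import Text.Regex.Applicative.Text

-- Hypothetical result type for the illustration.
data Pair = Pair String Int
  deriving (Eq, Show)

-- One or more letters, a literal '=', then one or more digits.
-- (<*) keeps the result on its left and discards the '='.
pair :: RE' Pair
pair = Pair <$> some (psym isAlpha) <* sym '=' <*> (read <$> some (psym isDigit))

-- (<|>) tries the left alternative first; (<$) replaces the matched text.
flag :: RE' Bool
flag = True <$ string "yes" <|> False <$ string "no"

main :: IO ()
main = do
  print (match pair "width=42")  -- Just (Pair "width" 42)
  print (match pair "width=")    -- Nothing
  print (match flag "no")        -- Just False
```

Because the result type is built by ordinary function application, no capture groups or positional indices are needed to extract the key and the value.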
Smart constructors
psym :: (Char -> Bool) -> RE' Char
Match and return a single Char which satisfies the predicate.
msym :: (Char -> Maybe a) -> RE' a
Like psym, but allows returning a computed value instead of the original symbol.
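A short sketch of the difference from psym: with msym, the function both decides whether the character matches (Just) and supplies the value to return. The import of Text.Regex.Applicative.Text is an assumption (the module name is not shown on this page), and digit is a hypothetical name.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import Data.Char (digitToInt, isDigit)
-- Assumption: the module name is not shown on this page.
import Text.Regex.Applicative.Text

-- Matches a single ASCII digit and returns its numeric value.
-- Returning Nothing means the character is rejected.
digit :: RE' Int
digit = msym $ \c -> if isDigit c then Just (digitToInt c) else Nothing

main :: IO ()
main = do
  print (match digit "7")  -- Just 7
  print (match digit "x")  -- Nothing
```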
string :: Text -> RE' Text
Match and return the given Text.

    import Text.Regex.Applicative

    number = string "one" *> pure 1 <|> string "two" *> pure 2

    main = print $ "two" =~ number
reFoldl :: Greediness -> (b -> a -> b) -> b -> RE' a -> RE' b
Match zero or more instances of the given expression, which are combined using the given folding function.
The Greediness argument controls whether this regular expression should match as many as possible (Greedy) or as few as possible (NonGreedy) instances of the underlying expression.
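For illustration, a left fold over digits can build a number without first collecting a list. This is a sketch under two assumptions: the module documented here is importable as Text.Regex.Applicative.Text, and the Greedy and NonGreedy constructors are taken from Text.Regex.Applicative in the regex-applicative package in case this module does not export them. digit and number are hypothetical names.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import Data.Char (digitToInt, isDigit)
-- Assumption: constructors imported from regex-applicative directly.
import Text.Regex.Applicative (Greediness (..))
-- Assumption: the module name is not shown on this page.
import Text.Regex.Applicative.Text

-- A single ASCII digit, returned as its numeric value.
digit :: RE' Int
digit = digitToInt <$> psym isDigit

-- Fold the digits from left to right, starting from 0.
number :: RE' Int
number = reFoldl Greedy (\acc d -> acc * 10 + d) 0 digit

main :: IO ()
main = do
  print (match number "2015")  -- Just 2015
  print (match number "")      -- Just 0 (zero instances: the seed value)
```

Since reFoldl accepts zero instances, the empty input matches and yields the seed; wrap the first digit separately if at least one is required.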
data Greediness :: *
withMatched :: RE' a -> RE' (a, Text)
Return matched symbols as part of the return value.
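A small sketch of when this is useful: keeping the original spelling of a token next to its parsed value. The import of Text.Regex.Applicative.Text is an assumption (the module name is not shown on this page), and numberWithSource is a hypothetical name.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import Data.Char (isDigit)
import Data.Text (Text)
-- Assumption: the module name is not shown on this page.
import Text.Regex.Applicative.Text

-- Parse one or more digits as an Int and also return the text that was consumed.
numberWithSource :: RE' (Int, Text)
numberWithSource = withMatched (read <$> some (psym isDigit))

main :: IO ()
main =
  print (match numberWithSource "0042")  -- Just (42,"0042")
```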
Basic matchers
match :: RE' a -> Text -> Maybe a
Attempt to match a Text against the regular expression.
Note that the whole string (not just some part of it) should be matched.

    >>> match (sym 'a' <|> sym 'b') "a"
    Just 'a'
    >>> match (sym 'a' <|> sym 'b') "ab"
    Nothing
replace :: RE' Text -> Text -> Text
Replace matches of the regular expression with its value.

    >>> replace ("!" <$ sym 'f' <* some (sym 'o')) "quuxfoofooooofoobarfobar"
    "quux!!!bar!bar"
Advanced matchers
findFirstPrefix :: RE' a -> Text -> Maybe (a, Text)
Find a string prefix which is matched by the regular expression.
Of all matching prefixes, pick one using left bias (prefer the left part of <|> to the right part) and greediness.
This is the match which a backtracking engine (such as Perl's) would find first.
If a match is found, the rest of the input is also returned.

    >>> findFirstPrefix ("a" <|> "ab") "abc"
    Just ("a","bc")
    >>> findFirstPrefix ("ab" <|> "a") "abc"
    Just ("ab","c")
    >>> findFirstPrefix "bc" "abc"
    Nothing
findLongestPrefix :: RE' a -> Text -> Maybe (a, Text)
Find the longest string prefix which is matched by the regular expression.
Submatches are still determined using left bias and greediness, so this is different from POSIX semantics.
If a match is found, the rest of the input is also returned.

    >>> let keyword = "if"
    >>> let identifier = many $ psym isAlpha
    >>> let lexeme = (Left <$> keyword) <|> (Right <$> identifier)
    >>> findLongestPrefix lexeme "if foo"
    Just (Left "if"," foo")
    >>> findLongestPrefix lexeme "iffoo"
    Just (Right "iffoo","")
findShortestPrefix :: RE' a -> Text -> Maybe (a, Text)
Find the shortest prefix (analogous to findLongestPrefix).
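To make the three prefix matchers easier to compare, here is a sketch that runs them on the same input. It assumes the module documented here is importable as Text.Regex.Applicative.Text (the module name is not shown on this page); run and lazyRun are hypothetical names.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

-- Assumption: the module name is not shown on this page.
import Text.Regex.Applicative.Text

-- Zero or more 'a's, as many as possible.
run :: RE' String
run = many (sym 'a')

-- Zero or more 'a's, as few as possible.
lazyRun :: RE' String
lazyRun = few (sym 'a')

main :: IO ()
main = do
  print (findFirstPrefix    run     "aab")  -- Just ("aa","b")
  print (findLongestPrefix  run     "aab")  -- Just ("aa","b")
  print (findShortestPrefix run     "aab")  -- Just ("","aab")
  -- With a non-greedy expression the first match is the empty one,
  -- while the longest-prefix search still consumes both 'a's.
  print (findFirstPrefix    lazyRun "aab")  -- Just ("","aab")
  print (findLongestPrefix  lazyRun "aab")  -- Just ("aa","b")
```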
findFirstInfix :: RE' a -> Text -> Maybe (Text, a, Text)
Find the leftmost substring that is matched by the regular expression.
Otherwise behaves like findFirstPrefix. Returns the result together with the prefix and suffix of the string surrounding the match.
findLongestInfix :: RE' a -> Text -> Maybe (Text, a, Text)
Find the leftmost substring that is matched by the regular expression.
Otherwise behaves like findLongestPrefix. Returns the result together with the prefix and suffix of the string surrounding the match.
findShortestInfix :: RE' a -> Text -> Maybe (Text, a, Text)
Find the leftmost substring that is matched by the regular expression.
Otherwise behaves like findShortestPrefix. Returns the result together with the prefix and suffix of the string surrounding the match.
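A brief sketch of the infix matchers, which search inside the input instead of anchoring at its start. As elsewhere, the import of Text.Regex.Applicative.Text is an assumption (the module name is not shown on this page), and digits is a hypothetical name.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import Data.Char (isDigit)
-- Assumption: the module name is not shown on this page.
import Text.Regex.Applicative.Text

-- One or more ASCII digits.
digits :: RE' String
digits = some (psym isDigit)

main :: IO ()
main = do
  -- The result is (text before the match, match value, text after the match).
  print (findFirstInfix digits "abc123def")  -- Just ("abc","123","def")
  print (findFirstInfix digits "abcdef")     -- Nothing
  -- Same leftmost start, but the shortest match is preferred.
  print (findShortestInfix digits "abc123def")
```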
Module re-exports
module Control.Applicative