Copyright | (c) Chris Penner 2019 |
---|---|
License | BSD3 |
Safe Haskell | None |
Language | Haskell2010 |
Synopsis
- regex :: QuasiQuoter
- match :: IndexedTraversal' [ByteString] Match ByteString
- groups :: IndexedLens' ByteString Match [ByteString]
- group :: Int -> IndexedTraversal' ByteString Match ByteString
- namedGroups :: IndexedLens' ByteString Match (Map ByteString ByteString)
- namedGroup :: ByteString -> IndexedTraversal' ByteString Match ByteString
- matchAndGroups :: Getter Match (ByteString, [ByteString])
- regexing :: Regex -> IndexedTraversal' Int ByteString Match
- mkRegexTraversalQQ :: [PCREOption] -> QuasiQuoter
- data Match
- data Regex
Basics
regex :: QuasiQuoter Source #
Builds a traversal over text using a Regex pattern
It's a QuasiQuoter
which creates a Traversal out of the given regex string.
It's equivalent to calling regexing
on a Regex
created using the
re
QuasiQuoter.
The "real" type is:
regex :: Regex -> IndexedTraversal' Int BS.ByteString Match
It's a traversal which selects Match
es; compose it with match
or groups
to get the relevant parts of your match.
>>>
txt = "raindrops on roses and whiskers on kittens"
Search
>>>
has ([regex|whisk|]) txt
True
Get matches
>>>
txt ^.. [regex|\br\w+|] . match
["raindrops","roses"]
Edit matches
>>>
txt & [regex|\br\w+|] . match %~ Char8.intersperse '-' . Char8.map toUpper
"R-A-I-N-D-R-O-P-S on R-O-S-E-S and whiskers on kittens"
Get Groups
>>>
txt ^.. [regex|(\w+) on (\w+)|] . groups
[["raindrops","roses"],["whiskers","kittens"]]
Edit Groups
>>>
txt & [regex|(\w+) on (\w+)|] . groups %~ reverse
"roses on raindrops and kittens on whiskers"
Get the third match
>>>
txt ^? [regex|\w+|] . index 2 . match
Just "roses"
Edit matches
>>>
txt & [regex|\br\w+|] . match %~ Char8.intersperse '-' . Char8.map toUpper
"R-A-I-N-D-R-O-P-S on R-O-S-E-S and whiskers on kittens"
Get Groups
>>>
txt ^.. [regex|(\w+) on (\w+)|] . groups
[["raindrops","roses"],["whiskers","kittens"]]
Edit Groups
>>>
txt & [regex|(\w+) on (\w+)|] . groups %~ reverse
"roses on raindrops and kittens on whiskers"
Get the third match
>>>
txt ^? [regex|\w+|] . index 2 . match
Just "roses"
Match integers, Read
them into ints, then sort them in-place
dumping them back into the source text afterwards.
>>>
"Monday: 29, Tuesday: 99, Wednesday: 3" & partsOf ([regex|\d+|] . match . from packedChars . _Show @Int) %~ sort
"Monday: 3, Tuesday: 29, Wednesday: 99"
To alter behaviour of the regex you may wish to pass PCREOption
s when compiling it.
The default behaviour may seem strange in certain cases; e.g. it operates in 'single-line'
mode. You can compile
the Regex
separately and add any options you like, then pass the resulting
Regex
into regex
;
Alternatively can make your own version of the QuasiQuoter with any options you want embedded
by using mkRegexQQ
.
match :: IndexedTraversal' [ByteString] Match ByteString Source #
Traverse each match
Stashes any matched groups into the index in case you need them.
Get a match if one exists:
>>>
"find a needle in a haystack" ^? [regex|n..dle|] . match
Just "needle"
Collect all matches
>>>
"one _two_ three _four_" ^.. [regex|_\w+_|] . match
["_two_","_four_"]
You can edit the traversal to perform a regex replace/substitution
>>>
"one _two_ three _four_" & [regex|_\w+_|] . match %~ Char8.map toUpper
"one _TWO_ three _FOUR_"
Here we use the group matches stored in the index to form key-value pairs, replacing the entire match.
>>>
"abc-def, ghi-jkl" & [regex|(\w+)-(\w+)|] . match %@~ \[k, v] _ -> "{" <> k <> ":" <> v <> "}"
"{abc:def}, {ghi:jkl}"
groups :: IndexedLens' ByteString Match [ByteString] Source #
Access all groups of a match as a list. Stashes the full match text as the index in case you need it.
Changing the length of the list has behaviour similar to partsOf
.
Get all matched groups:
>>>
"raindrops on roses and whiskers on kittens" ^.. [regex|(\w+) on (\w+)|] . groups
[["raindrops","roses"],["whiskers","kittens"]]
You can access a specific group combining with ix
, or just use group
instead
>>>
"raindrops on roses and whiskers on kittens" ^.. [regex|(\w+) on (\w+)|] . groups . ix 1
["roses","kittens"]
Editing groups:
>>>
"raindrops on roses and whiskers on kittens" & [regex|(\w+) on (\w+)|] . groups . ix 1 %~ Char8.map toUpper
"raindrops on ROSES and whiskers on KITTENS"
Editing the list rearranges groups
>>>
"raindrops on roses and whiskers on kittens" & [regex|(\w+) on (\w+)|] . groups %~ reverse
"roses on raindrops and kittens on whiskers"
You can traverse the list to flatten out all groups
>>>
"raindrops on roses and whiskers on kittens" ^.. [regex|(\w+) on (\w+)|] . groups . traversed
["raindrops","roses","whiskers","kittens"]
Use indexed helpers to access the full match when operating on a group.
This replaces each group with the full match text wrapped in parens:
>>>
"one-two" & [regex|(\w+)-(\w+)|] . groups <. traversed %@~ \mtch grp -> grp <> ":(" <> mtch <> ")"
"one:(one-two)-two:(one-two)"
group :: Int -> IndexedTraversal' ByteString Match ByteString Source #
Access a specific group of a match. Numbering starts at 0.
Stashes the full match text as the index in case you need it.
See groups
for more info on grouping
>>>
"key:value, a:b" ^.. [regex|(\w+):(\w+)|] . group 0
["key","a"]
>>>
"key:value, a:b" ^.. [regex|(\w+):(\w+)|] . group 1
["value","b"]
>>>
"key:value, a:b" & [regex|(\w+):(\w+)|] . group 1 %~ Char8.map toUpper
"key:VALUE, a:B"
Replace the first capture group with the full match:
>>>
"a, b" & [regex|(\w+), (\w+)|] . group 0 .@~ \i -> "(" <> i <> ")"
"(a, b), b"
namedGroups :: IndexedLens' ByteString Match (Map ByteString ByteString) Source #
Access all the named groups of a match as a Map
. Stashes the full match text as the index in case
you need it.
Note that you can edit the groups through this lens, but the behaviour is undefined when editing inner elements of nested groups. Behaviour is undefined if groups are removed from the map (so don't do that).
NOTE: There's currently some strange behaviour in pcre-heavy where trailing unmatched optional groups are omitted, I'm looking into getting that patched, but for now, note the difference in behaviour:
>>>
"A" ^? [regex|(?<a>A)|(?<b>B)|] . namedGroups
Just (fromList [("a","A")])
>>>
"B" ^? [regex|(?<a>A)|(?<b>B)|] . namedGroups
Just (fromList [("a",""),("b","B")])
Get all matched groups:
>>>
"raindrops on roses and whiskers on kittens" ^.. [regex|(?<first>\w+) on (?<second>\w+)|] . namedGroups
[fromList [("first","raindrops"),("second","roses")],fromList [("first","whiskers"),("second","kittens")]]
You can access a specific group combining with ix
, or just use namedGroup
instead
>>>
"raindrops on roses and whiskers on kittens" ^.. [regex|(?<first>\w+) on (?<second>\w+)|] . namedGroups . ix "second"
["roses","kittens"]
Editing groups:
>>>
"raindrops on roses and whiskers on kittens" & [regex|(?<first>\w+) on (?<second>\w+)|] . namedGroups . ix "second" %~ Char8.map toUpper
"raindrops on ROSES and whiskers on KITTENS"
Use indexed helpers to access the full match when operating on a group.
This replaces the "first" group with the full match text wrapped in parens:
>>>
"one-two" & [regex|(?<first>\w+)-(\w+)|] . namedGroups <. ix "first" %@~ \mtch grp -> grp <> ":(" <> mtch <> ")"
"one:(one-two)-two"
namedGroup :: ByteString -> IndexedTraversal' ByteString Match ByteString Source #
Access a specific named group of a match
See namedGroups
for caveats and more info.
Stashes the full match text as the index in case you need it.
>>>
"key:value, a:b" ^.. [regex|(?<first>\w+):(?<second>\w+)|] . namedGroup "first"
["key","a"]
>>>
"key:value, a:b" ^.. [regex|(?<first>\w+):(?<second>\w+)|] . namedGroup "second"
["value","b"]
>>>
"key:value, a:b" & [regex|(?<first>\w+):(?<second>\w+)|] . namedGroup "second" %~ Char8.map toUpper
"key:VALUE, a:B"
Replace the first capture group with the full match:
>>>
"a, b" & [regex|(?<first>\w+), (?<second>\w+)|] . namedGroup "first" .@~ \i -> "(" <> i <> ")"
"(a, b), b"
matchAndGroups :: Getter Match (ByteString, [ByteString]) Source #
Collect both the match text AND all the matching groups
>>>
"raindrops on roses and whiskers on kittens" ^.. [regex|(\w+) on (\w+)|] . matchAndGroups
[("raindrops on roses",["raindrops","roses"]),("whiskers on kittens",["whiskers","kittens"])]
Compiling regexes to Traversals
regexing :: Regex -> IndexedTraversal' Int ByteString Match Source #
Build a traversal from the provided Regex
, this is handy if you're QuasiQuoter
averse, or if you already have a Regex
object floating around.
Also see mkRegexTraversalQQ
mkRegexTraversalQQ :: [PCREOption] -> QuasiQuoter Source #
Build a QuasiQuoter just like regex
but with the provided PCREOption
overrides.
Types
Match represents an opaque regex match.
You can drill into it using match
, groups
, group
, namedGroup
, namedGroups
or matchAndGroups