Copyright	(c) Chris Penner 2019
License	BSD3
Safe Haskell	None
Language	Haskell2010

Control.Lens.Regex.ByteString

Contents

Basics
Compiling regexes to Traversals
Types

Description

Synopsis

regex :: QuasiQuoter
match :: IndexedTraversal' [ByteString] Match ByteString
groups :: IndexedLens' ByteString Match [ByteString]
group :: Int -> IndexedTraversal' ByteString Match ByteString
namedGroups :: IndexedLens' ByteString Match (Map ByteString ByteString)
namedGroup :: ByteString -> IndexedTraversal' ByteString Match ByteString
matchAndGroups :: Getter Match (ByteString, [ByteString])
regexing :: Regex -> IndexedTraversal' Int ByteString Match
mkRegexTraversalQQ :: [PCREOption] -> QuasiQuoter
data Match
data Regex

Basics

regex :: QuasiQuoter Source #

Builds a traversal over text using a Regex pattern

It's a QuasiQuoter which creates a Traversal out of the given regex string. It's equivalent to calling regexing on a Regex created using the re QuasiQuoter.

The "real" type is:

regex :: Regex -> IndexedTraversal' Int BS.ByteString Match

It's a traversal which selects Matches; compose it with match or groups to get the relevant parts of your match.

>>> txt = "raindrops on roses and whiskers on kittens"

Search

>>> has ([regex|whisk|]) txt
True

Get matches

>>> txt ^.. [regex|\br\w+|] . match
["raindrops","roses"]

Edit matches

>>> txt & [regex|\br\w+|] . match %~ Char8.intersperse '-' . Char8.map toUpper
"R-A-I-N-D-R-O-P-S on R-O-S-E-S and whiskers on kittens"

Get Groups

>>> txt ^.. [regex|(\w+) on (\w+)|] . groups
[["raindrops","roses"],["whiskers","kittens"]]

Edit Groups

>>> txt & [regex|(\w+) on (\w+)|] . groups %~ reverse
"roses on raindrops and kittens on whiskers"

Get the third match

>>> txt ^? [regex|\w+|] . index 2 . match
Just "roses"

Edit matches

>>> txt & [regex|\br\w+|] . match %~ Char8.intersperse '-' . Char8.map toUpper
"R-A-I-N-D-R-O-P-S on R-O-S-E-S and whiskers on kittens"

Get Groups

>>> txt ^.. [regex|(\w+) on (\w+)|] . groups
[["raindrops","roses"],["whiskers","kittens"]]

Edit Groups

>>> txt & [regex|(\w+) on (\w+)|] . groups %~ reverse
"roses on raindrops and kittens on whiskers"

Get the third match

>>> txt ^? [regex|\w+|] . index 2 . match
Just "roses"

Match integers, Read them into ints, then sort them in-place dumping them back into the source text afterwards.

>>> "Monday: 29, Tuesday: 99, Wednesday: 3" & partsOf ([regex|\d+|] . match . from packedChars . _Show @Int) %~ sort
"Monday: 3, Tuesday: 29, Wednesday: 99"

To alter behaviour of the regex you may wish to pass PCREOptions when compiling it. The default behaviour may seem strange in certain cases; e.g. it operates in 'single-line' mode. You can compile the Regex separately and add any options you like, then pass the resulting Regex into regex; Alternatively can make your own version of the QuasiQuoter with any options you want embedded by using mkRegexQQ.

match :: IndexedTraversal' [ByteString] Match ByteString Source #

Traverse each match

Stashes any matched groups into the index in case you need them.

Get a match if one exists:

>>> "find a needle in a haystack" ^? [regex|n..dle|] . match
Just "needle"

Collect all matches

>>> "one _two_ three _four_" ^.. [regex|_\w+_|] . match
["_two_","_four_"]

You can edit the traversal to perform a regex replace/substitution

>>> "one _two_ three _four_" & [regex|_\w+_|] . match %~ Char8.map toUpper
"one _TWO_ three _FOUR_"

Here we use the group matches stored in the index to form key-value pairs, replacing the entire match.

>>> "abc-def, ghi-jkl" & [regex|(\w+)-(\w+)|] . match %@~ \[k, v] _ -> "{" <> k <> ":" <> v <> "}"
"{abc:def}, {ghi:jkl}"

groups :: IndexedLens' ByteString Match [ByteString] Source #

Access all groups of a match as a list. Stashes the full match text as the index in case you need it.

Changing the length of the list has behaviour similar to partsOf.

Get all matched groups:

>>> "raindrops on roses and whiskers on kittens" ^.. [regex|(\w+) on (\w+)|] . groups
[["raindrops","roses"],["whiskers","kittens"]]

You can access a specific group combining with ix, or just use group instead

>>> "raindrops on roses and whiskers on kittens" ^.. [regex|(\w+) on (\w+)|] . groups .  ix 1
["roses","kittens"]

Editing groups:

>>> "raindrops on roses and whiskers on kittens" & [regex|(\w+) on (\w+)|] . groups .  ix 1 %~ Char8.map toUpper
"raindrops on ROSES and whiskers on KITTENS"

Editing the list rearranges groups

>>> "raindrops on roses and whiskers on kittens" & [regex|(\w+) on (\w+)|] . groups %~ reverse
"roses on raindrops and kittens on whiskers"

You can traverse the list to flatten out all groups

>>> "raindrops on roses and whiskers on kittens" ^.. [regex|(\w+) on (\w+)|] . groups . traversed
["raindrops","roses","whiskers","kittens"]

Use indexed helpers to access the full match when operating on a group.

This replaces each group with the full match text wrapped in parens:

>>> "one-two" & [regex|(\w+)-(\w+)|] . groups <. traversed %@~ \mtch grp -> grp <> ":(" <> mtch <> ")"
"one:(one-two)-two:(one-two)"

group :: Int -> IndexedTraversal' ByteString Match ByteString Source #

Access a specific group of a match. Numbering starts at 0.

Stashes the full match text as the index in case you need it.

See groups for more info on grouping

>>> "key:value, a:b" ^.. [regex|(\w+):(\w+)|] . group 0
["key","a"]

>>> "key:value, a:b" ^.. [regex|(\w+):(\w+)|] . group 1
["value","b"]

>>> "key:value, a:b" & [regex|(\w+):(\w+)|] . group 1 %~ Char8.map toUpper
"key:VALUE, a:B"

Replace the first capture group with the full match:

>>> "a, b" & [regex|(\w+), (\w+)|] . group 0 .@~ \i -> "(" <> i <> ")"
"(a, b), b"

namedGroups :: IndexedLens' ByteString Match (Map ByteString ByteString) Source #

Access all the named groups of a match as a Map. Stashes the full match text as the index in case you need it.

Note that you can edit the groups through this lens, but the behaviour is undefined when editing inner elements of nested groups. Behaviour is undefined if groups are removed from the map (so don't do that).

NOTE: There's currently some strange behaviour in pcre-heavy where trailing unmatched optional groups are omitted, I'm looking into getting that patched, but for now, note the difference in behaviour:

>>> "A" ^? [regex|(?<a>A)|(?<b>B)|] . namedGroups
Just (fromList [("a","A")])

>>> "B" ^? [regex|(?<a>A)|(?<b>B)|] . namedGroups
Just (fromList [("a",""),("b","B")])

Get all matched groups:

>>> "raindrops on roses and whiskers on kittens" ^.. [regex|(?<first>\w+) on (?<second>\w+)|] . namedGroups
[fromList [("first","raindrops"),("second","roses")],fromList [("first","whiskers"),("second","kittens")]]

You can access a specific group combining with ix, or just use namedGroup instead

>>> "raindrops on roses and whiskers on kittens" ^.. [regex|(?<first>\w+) on (?<second>\w+)|] . namedGroups .  ix "second"
["roses","kittens"]

Editing groups:

>>> "raindrops on roses and whiskers on kittens" & [regex|(?<first>\w+) on (?<second>\w+)|] . namedGroups . ix "second" %~ Char8.map toUpper
"raindrops on ROSES and whiskers on KITTENS"

Use indexed helpers to access the full match when operating on a group.

This replaces the "first" group with the full match text wrapped in parens:

>>> "one-two" & [regex|(?<first>\w+)-(\w+)|] . namedGroups <. ix "first" %@~ \mtch grp -> grp <> ":(" <> mtch <> ")"
"one:(one-two)-two"

namedGroup :: ByteString -> IndexedTraversal' ByteString Match ByteString Source #

Access a specific named group of a match

See namedGroups for caveats and more info.

Stashes the full match text as the index in case you need it.

>>> "key:value, a:b" ^.. [regex|(?<first>\w+):(?<second>\w+)|] . namedGroup "first"
["key","a"]

>>> "key:value, a:b" ^.. [regex|(?<first>\w+):(?<second>\w+)|] . namedGroup "second"
["value","b"]

>>> "key:value, a:b" & [regex|(?<first>\w+):(?<second>\w+)|] . namedGroup "second" %~ Char8.map toUpper
"key:VALUE, a:B"

Replace the first capture group with the full match:

>>> "a, b" & [regex|(?<first>\w+), (?<second>\w+)|] . namedGroup "first" .@~ \i -> "(" <> i <> ")"
"(a, b), b"

matchAndGroups :: Getter Match (ByteString, [ByteString]) Source #

Collect both the match text AND all the matching groups

>>> "raindrops on roses and whiskers on kittens" ^.. [regex|(\w+) on (\w+)|] . matchAndGroups
[("raindrops on roses",["raindrops","roses"]),("whiskers on kittens",["whiskers","kittens"])]

Compiling regexes to Traversals

regexing :: Regex -> IndexedTraversal' Int ByteString Match Source #

Build a traversal from the provided Regex, this is handy if you're QuasiQuoter averse, or if you already have a Regex object floating around.

Also see mkRegexTraversalQQ

mkRegexTraversalQQ :: [PCREOption] -> QuasiQuoter Source #

Build a QuasiQuoter just like regex but with the provided PCREOption overrides.

Types

data Match Source #

Match represents an opaque regex match. You can drill into it using match, groups, group, namedGroup, namedGroups or matchAndGroups

Instances

(TypeError (Text "You're trying to 'show' a raw 'Match' object." :$$: Text "You likely missed adding a 'match' or 'groups' or 'group' call after your 'regex' call :)") :: Constraint) => Show Match Source #
Instance details Defined in Control.Lens.Regex.ByteString Methods showsPrec :: Int -> Match -> ShowS # show :: Match -> String # showList :: [Match] -> ShowS #

data Regex #

An abstract pointer to a compiled PCRE Regex structure The structure allocated by the PCRE library will be deallocated automatically by the Haskell storage manager.

Instances

Eq Regex
Instance details Defined in Text.Regex.PCRE.Light.Base Methods (==) :: Regex -> Regex -> Bool # (/=) :: Regex -> Regex -> Bool #
Ord Regex
Instance details Defined in Text.Regex.PCRE.Light.Base Methods compare :: Regex -> Regex -> Ordering # (<) :: Regex -> Regex -> Bool # (<=) :: Regex -> Regex -> Bool # (>) :: Regex -> Regex -> Bool # (>=) :: Regex -> Regex -> Bool # max :: Regex -> Regex -> Regex # min :: Regex -> Regex -> Regex #
Show Regex
Instance details Defined in Text.Regex.PCRE.Light.Base Methods showsPrec :: Int -> Regex -> ShowS # show :: Regex -> String # showList :: [Regex] -> ShowS #