License | MIT |
---|---|
Safe Haskell | None |
Language | Haskell98 |
re2 is a regular expression library offering predictable run-time and memory consumption. This package is a binding to re2.
Supported expression syntax is documented at http://code.google.com/p/re2/wiki/Syntax.
$ ghci -XOverloadedStrings ghci> import Regex.RE2 ghci> find "\\w+" "hello world" Just (Match [Just "hello"]) ghci> find "\\w+$" "hello world" Just (Match [Just "world"]) ghci> find "^\\w+$" "hello world" Nothing
- data Pattern
- compile :: ByteString -> Either Error Pattern
- compileWith :: Options -> ByteString -> Either Error Pattern
- patternInput :: Pattern -> ByteString
- patternOptions :: Pattern -> Options
- patternGroups :: Pattern -> Vector (Maybe ByteString)
- data Options
- defaultOptions :: Options
- data Encoding
- optionEncoding :: Options -> Encoding
- optionPosixSyntax :: Options -> Bool
- optionLongestMatch :: Options -> Bool
- optionMaxMemory :: Options -> Int64
- optionLiteral :: Options -> Bool
- optionNeverNewline :: Options -> Bool
- optionNeverCapture :: Options -> Bool
- optionCaseSensitive :: Options -> Bool
- optionPerlClasses :: Options -> Bool
- optionWordBoundary :: Options -> Bool
- optionOneLine :: Options -> Bool
- data Error
- data ErrorCode
- errorMessage :: Error -> String
- errorCode :: Error -> ErrorCode
- data Match
- matchGroup :: Match -> Int -> Maybe ByteString
- matchGroups :: Match -> Vector (Maybe ByteString)
- data Anchor
- match :: Pattern -> ByteString -> Int -> Int -> Maybe Anchor -> Int -> Maybe Match
- find :: Pattern -> ByteString -> Maybe Match
- replace :: Pattern -> ByteString -> ByteString -> (ByteString, Bool)
- replaceAll :: Pattern -> ByteString -> ByteString -> (ByteString, Int)
- extract :: Pattern -> ByteString -> ByteString -> Maybe ByteString
- quoteMeta :: ByteString -> ByteString
Compiling patterns
A pattern is a compiled regular expression plus its compilation options.
Patterns can be created by calling compile
explicitly:
import Data.ByteString.Char8 (pack) p :: Pattern p = casecompile
(pack "^hello world$") of Right ok -> ok Left err -> error ("compilation error: " ++errorMessage
err)
Or by using the IsString
instance:
import Data.String (fromString) p :: Pattern p = fromString "^hello world$"
Or by using the OverloadedStrings
language extension:
{-# LANGUAGE OverloadedStrings #-} p :: Pattern p = "^hello world$"
compile :: ByteString -> Either Error Pattern Source
compile =compileWith
defaultOptions
compileWith :: Options -> ByteString -> Either Error Pattern Source
Compile a regular expression with the given options. If compilation fails,
the error can be inspected with errorMessage
and errorCode
.
Use optionEncoding
to select whether the input bytes should be interpreted
as UTF-8 or Latin1. The default is UTF8.
Pattern properties
patternInput :: Pattern -> ByteString Source
The regular expression originally provided to compileWith
.
patternOptions :: Pattern -> Options Source
The options originally provided to compileWith
.
patternGroups :: Pattern -> Vector (Maybe ByteString) Source
The capturing groups defined within the pattern. Groups are listed
from left to right, and are Nothing
if the group is unnamed.
ghci> patternGroups "(\\d+)|(?P<word>\\w+)" fromList [Nothing,Just "word"]
Options
Options controlling how to compile a regular expression. The fields in this value may be set using record syntax:
compileNoCase :: B.ByteString -> Either ErrorPattern
compileNoCase =compileWith
(defaultOptions
{optionCaseSensitive
= False })
defaultOptions :: Options Source
defaultOptions = Options { optionEncoding = EncodingUtf8 , optionPosixSyntax = False , optionLongestMatch = False , optionMaxMemory = 8388608 -- 8 << 20 , optionLiteral = False , optionNeverNewline = False , optionDotNewline = False , optionNeverCapture = False , optionCaseSensitive = True , optionPerlClasses = False , optionWordBoundary = False , optionOneLine = False }
optionEncoding :: Options -> Encoding Source
optionPosixSyntax :: Options -> Bool Source
optionLongestMatch :: Options -> Bool Source
optionMaxMemory :: Options -> Int64 Source
optionLiteral :: Options -> Bool Source
optionNeverNewline :: Options -> Bool Source
optionNeverCapture :: Options -> Bool Source
optionCaseSensitive :: Options -> Bool Source
optionPerlClasses :: Options -> Bool Source
Only checked in posix mode
optionWordBoundary :: Options -> Bool Source
Only checked in posix mode
optionOneLine :: Options -> Bool Source
Only checked in posix mode
Compilation errors
errorMessage :: Error -> String Source
Matching
A successful match of the pattern against some input. Capturing groups
may be retrieved with matchGroup
or matchGroups
.
matchGroup :: Match -> Int -> Maybe ByteString Source
The capturing group with the given index, or Nothing
if the group was
not set in this match.
The entire match is group 0.
matchGroups :: Match -> Vector (Maybe ByteString) Source
All of the groups in the pattern, with each group being Nothing
if it
was not set in this match. Groups are returned in the same order as
patternGroups
.
The entire match is group 0.
:: Pattern | |
-> ByteString | |
-> Int | Start position |
-> Int | End position |
-> Maybe Anchor | |
-> Int | How many match groups to populate |
-> Maybe Match |
The most general matching function. Attempt to match the pattern to the input within the given constraints.
If the number of match groups to populate is 0, matching can be performed more efficiently.
Searching
find :: Pattern -> ByteString -> Maybe Match Source
Attempt to find the pattern somewhere within the input.
Replacing
:: Pattern | |
-> ByteString | Input |
-> ByteString | Replacement template |
-> (ByteString, Bool) |
Replace the first occurance of the pattern with the given replacement
template. If the template contains backslash escapes such as \1
, the
capture group with the given index will be inserted in their place.
Returns the new bytes, and True
if a replacement occured.
:: Pattern | |
-> ByteString | Input |
-> ByteString | Replacement template |
-> (ByteString, Int) |
Replace every occurance of the pattern with the given replacement
template. If the template contains backslash escapes such as \1
, the
capture group with the given index will be inserted in their place.
Returns the new bytes, and how many replacements occured.
:: Pattern | |
-> ByteString | Input |
-> ByteString | Extraction template |
-> Maybe ByteString |
Attempt to find the pattern somewhere within the input, and extract
it using the given template. If the template contains backslash escapes
such as \1
, the capture group with the given index will be inserted
in their place.
Returns Nothing
if the pattern was not found in the input.
Utility functions
quoteMeta :: ByteString -> ByteString Source
Escapes bytes such that the output is a regular expression which will exactly match the input.