Safe Haskell | Safe |
---|---|
Language | Haskell2010 |
- data DFA c = DFA {
- dfaTransition :: !(IntMap (SF c Int))
- dfaAcceptable :: !IntSet
- dfaBlackholes :: !IntSet
- fromRE :: forall c. (Ord c, Enum c, Bounded c) => RE c -> DFA c
- toRE :: forall c. (Ord c, Enum c, Bounded c) => DFA c -> RE c
- fromERE :: forall c. (Ord c, Enum c, Bounded c) => ERE c -> DFA c
- toERE :: forall c. (Ord c, Enum c, Bounded c) => DFA c -> ERE c
- fromTM :: forall k c. (Ord k, Ord c, TransitionMap c k) => k -> DFA c
- fromTMEquiv :: forall k c. (Ord k, Ord c, TransitionMap c k, Equivalent c k) => k -> DFA c
- toKleene :: forall k c. (Ord c, Enum c, Bounded c, FiniteKleene c k) => DFA c -> k
Documentation
Deterministic finite automaton.
A deterministic finite automaton (DFA) over an alphabet \(\Sigma\) (type
variable c
) is 4-tuple \(Q\), \(q_0\) , \(F\), \(\delta\), where
- \(Q\) is a finite set of states (subset of
Int
), - \(q_0 \in Q\) is the distinguised start state (
0
), - \(F \subset Q\) is a set of final (or accepting) states (
dfaAcceptable
), and - \(\delta : Q \times \Sigma \to Q\) is a function called the state
transition function (
dfaTransition
).
DFA | |
|
Complement c (DFA c) Source # | Complement DFA. Complement of
|
Ord c => Match c (DFA c) Source # | Run Because we have analysed a language, in some cases we can determine an input
without traversing all of the input.
That's not the cases with
Holds:
all (match (fromRE r)) $ take 10 $ RE.generate (curry QC.choose) 42 (r :: RE.RE Char) |
Show c => Show (DFA c) Source # | |
Show c => Pretty (DFA c) Source # | |
Conversions
fromRE :: forall c. (Ord c, Enum c, Bounded c) => RE c -> DFA c Source #
>>>
putPretty $ fromRE $ RE.star "abc"
0+ -> \x -> if | x <= '`' -> 3 | x <= 'a' -> 2 | otherwise -> 3 1 -> \x -> if | x <= 'b' -> 3 | x <= 'c' -> 0 | otherwise -> 3 2 -> \x -> if | x <= 'a' -> 3 | x <= 'b' -> 1 | otherwise -> 3 3 -> \_ -> 3 -- black hole
Everything and nothing result in blackholes:
>>>
traverse_ (putPretty . fromRE) [RE.empty, RE.star RE.anyChar]
0 -> \_ -> 0 -- black hole 0+ -> \_ -> 0 -- black hole
Character ranges are effecient:
>>>
putPretty $ fromRE $ RE.charRange 'a' 'z'
0 -> \x -> if | x <= '`' -> 2 | x <= 'z' -> 1 | otherwise -> 2 1+ -> \_ -> 2 2 -> \_ -> 2 -- black hole
An example with two blackholes:
>>>
putPretty $ fromRE $ "c" <> RE.star RE.anyChar
0 -> \x -> if | x <= 'b' -> 2 | x <= 'c' -> 1 | otherwise -> 2 1+ -> \_ -> 1 -- black hole 2 -> \_ -> 2 -- black hole
toRE :: forall c. (Ord c, Enum c, Bounded c) => DFA c -> RE c Source #
>>>
putPretty $ toRE $ fromRE "foobar"
^foobar$
For string
regular expressions,
:toRE
. fromRE
= id
let s = take 5 s' in RE.string (s :: String) === toRE (fromRE (RE.string s))
But in general it isn't:
>>>
let aToZ = RE.star $ RE.charRange 'a' 'z'
>>>
traverse_ putPretty [aToZ, toRE $ fromRE aToZ]
^[a-z]*$ ^([a-z]|[a-z]?[a-z]*[a-z]?)?$
not-prop> (re :: RE.RE Char) === toRE (fromRE re)
However, they are equivalent
:
>>>
RE.equivalent aToZ (toRE (fromRE aToZ))
True
And so are others
>>>
all (\re -> RE.equivalent re (toRE (fromRE re))) [RE.star "a", RE.star "ab"]
True
expensive-prop> RE.equivalent re (toRE (fromRE (re :: RE.RE Char)))
Note, that
can, and usually makes regexp unrecognisable:toRE
. fromRE
>>>
putPretty $ toRE $ fromRE $ RE.star "ab"
^(a(ba)*b)?$
We can complement
DFA, therefore we can complement RE
.
For example. regular expression matching string containing an a
:
>>>
let withA = RE.star RE.anyChar <> "a" <> RE.star RE.anyChar
>>>
let withoutA = toRE $ complement $ fromRE withA
>>>
putPretty withoutA
^([^a]|[^a]?[^a]*[^a]?)?$
>>>
let withoutA' = RE.star $ RE.REChars $ RSet.complement $ RSet.singleton 'a'
>>>
putPretty withoutA'
^[^a]*$
>>>
RE.equivalent withoutA withoutA'
True
Quite small, for example 2 state DFAs can result in big regular expressions:
>>>
putPretty $ toRE $ complement $ fromRE $ star "ab"
^([^]|a(ba)*(ba)?|a(ba)*([^b]|b[^a])|([^a]|a(ba)*([^b]|b[^a]))[^]*[^]?)$
We can use
to convert toRE
. fromERE
ERE
to RE
:
>>>
putPretty $ toRE $ fromERE $ complement $ star "ab"
^([^]|a(ba)*(ba)?|a(ba)*([^b]|b[^a])|([^a]|a(ba)*([^b]|b[^a]))[^]*[^]?)$
>>>
putPretty $ toRE $ fromERE $ "a" /\ "b"
^[]$
See https://mathoverflow.net/questions/45149/can-regular-expressions-be-made-unambiguous for the description of the algorithm used.
fromERE :: forall c. (Ord c, Enum c, Bounded c) => ERE c -> DFA c Source #
We don't always generate minimal automata:
>>>
putPretty $ fromERE $ "a" /\ "b"
0 -> \_ -> 1 1 -> \_ -> 1 -- black hole
Compare this to an complement
example
Using fromTMEquiv
, we can get minimal automaton, for the cost of higher
complexity (slow!).
>>>
putPretty $ fromTMEquiv $ ("a" /\ "b" :: ERE.ERE Char)
0 -> \_ -> 0 -- black hole
>>>
putPretty $ fromERE $ complement $ star "abc"
0 -> \x -> if | x <= '`' -> 3 | x <= 'a' -> 2 | otherwise -> 3 1+ -> \x -> if | x <= 'b' -> 3 | x <= 'c' -> 0 | otherwise -> 3 2+ -> \x -> if | x <= 'a' -> 3 | x <= 'b' -> 1 | otherwise -> 3 3+ -> \_ -> 3 -- black hole
fromTM :: forall k c. (Ord k, Ord c, TransitionMap c k) => k -> DFA c Source #
Create from TransitionMap
.
See fromRE
for a specific example.
fromTMEquiv :: forall k c. (Ord k, Ord c, TransitionMap c k, Equivalent c k) => k -> DFA c Source #
Create from TransitonMap
minimising states with Equivalent
.
See fromERE
for an example.