Safe Haskell	Safe
Language	Haskell2010

Kleene.DFA

Contents

Conversions

Synopsis

Documentation

data DFA c Source #

Deterministic finite automaton.

A deterministic finite automaton (DFA) over an alphabet $\Sigma$ (type variable c) is 4-tuple $Q$, $q_0$ , $F$, $\delta$, where

$Q$ is a finite set of states (subset of Int),
$q_0 \in Q$ is the distinguised start state (0),
$F \subset Q$ is a set of final (or accepting) states (dfaAcceptable), and
$\delta : Q \times \Sigma \to Q$ is a function called the state transition function (dfaTransition).

Constructors

DFA
Fields dfaTransition :: !(IntMap (SF c Int)) transition function dfaAcceptable :: !IntSet accept states dfaBlackholes :: !IntSet states we cannot escape

Instances

Complement c (DFA c) Source #	Complement DFA. Complement of `DFA` is way easier than of `RE`: complement accept states. `>>>` `let dfa = complement $ fromRE $ RE.star "abc"``>>>` `putPretty dfa`0 -> \x -> if \| x <= '`' -> 3 \| x <= 'a' -> 2 \| otherwise -> 3 1+ -> \x -> if \| x <= 'b' -> 3 \| x <= 'c' -> 0 \| otherwise -> 3 2+ -> \x -> if \| x <= 'a' -> 3 \| x <= 'b' -> 1 \| otherwise -> 3 3+ -> \_ -> 3 -- black hole `>>>` `map (match dfa) ["", "abc", "abcabc", "aa","abca", 'a' : 'a' : undefined]`[False,False,False,True,True,True]
Methods complement :: DFA c -> DFA c Source #
Ord c => Match c (DFA c) Source #	Run `DFA` on the input. Because we have analysed a language, in some cases we can determine an input without traversing all of the input. That's not the cases with `RE` `match`. `>>>` `let dfa = fromRE $ RE.star "abc"``>>>` `map (match dfa) ["", "abc", "abcabc", "aa", 'a' : 'a' : undefined]`[True,True,True,False,False] Holds: `match` (`fromRE` re) xs == `match` re xs all (match (fromRE r)) $ take 10 $ RE.generate (curry QC.choose) 42 (r :: RE.RE Char)
Methods match :: DFA c -> [c] -> Bool Source #
Show c => Show (DFA c) Source #
Methods showsPrec :: Int -> DFA c -> ShowS # show :: DFA c -> String # showList :: [DFA c] -> ShowS #
Show c => Pretty (DFA c) Source #
Methods pretty :: DFA c -> String Source # prettyS :: DFA c -> ShowS Source #

Conversions

fromRE :: forall c. (Ord c, Enum c, Bounded c) => RE c -> DFA c Source #

Convert RE to DFA.

>>> putPretty $ fromRE $ RE.star "abc"
0+ -> \x -> if
    | x <= '`'  -> 3
    | x <= 'a'  -> 2
    | otherwise -> 3
1 -> \x -> if
    | x <= 'b'  -> 3
    | x <= 'c'  -> 0
    | otherwise -> 3
2 -> \x -> if
    | x <= 'a'  -> 3
    | x <= 'b'  -> 1
    | otherwise -> 3
3 -> \_ -> 3 -- black hole

Everything and nothing result in blackholes:

>>> traverse_ (putPretty . fromRE) [RE.empty, RE.star RE.anyChar]
0 -> \_ -> 0 -- black hole
0+ -> \_ -> 0 -- black hole

Character ranges are effecient:

>>> putPretty $ fromRE $ RE.charRange 'a' 'z'
0 -> \x -> if
    | x <= '`'  -> 2
    | x <= 'z'  -> 1
    | otherwise -> 2
1+ -> \_ -> 2
2 -> \_ -> 2 -- black hole

An example with two blackholes:

>>> putPretty $ fromRE $ "c" <> RE.star RE.anyChar
0 -> \x -> if
    | x <= 'b'  -> 2
    | x <= 'c'  -> 1
    | otherwise -> 2
1+ -> \_ -> 1 -- black hole
2 -> \_ -> 2 -- black hole

toRE :: forall c. (Ord c, Enum c, Bounded c) => DFA c -> RE c Source #

Convert DFA to RE.

>>> putPretty $ toRE $ fromRE "foobar"
^foobar$

For string regular expressions, toRE . fromRE = id:

let s = take 5 s' in RE.string (s :: String) === toRE (fromRE (RE.string s))

But in general it isn't:

>>> let aToZ = RE.star $ RE.charRange 'a' 'z'
>>> traverse_ putPretty [aToZ, toRE $ fromRE aToZ]
^[a-z]*$
^([a-z]|[a-z]?[a-z]*[a-z]?)?$

not-prop> (re :: RE.RE Char) === toRE (fromRE re)

However, they are equivalent:

>>> RE.equivalent aToZ (toRE (fromRE aToZ))
True

And so are others

>>> all (\re -> RE.equivalent re (toRE (fromRE re))) [RE.star "a", RE.star "ab"]
True

expensive-prop> RE.equivalent re (toRE (fromRE (re :: RE.RE Char)))

Note, that toRE . fromRE can, and usually makes regexp unrecognisable:

>>> putPretty $ toRE $ fromRE $ RE.star "ab"
^(a(ba)*b)?$

We can complement DFA, therefore we can complement RE. For example. regular expression matching string containing an a:

>>> let withA = RE.star RE.anyChar <> "a" <> RE.star RE.anyChar
>>> let withoutA = toRE $ complement $ fromRE withA
>>> putPretty withoutA
^([^a]|[^a]?[^a]*[^a]?)?$

>>> let withoutA' = RE.star $ RE.REChars $ RSet.complement $ RSet.singleton 'a'
>>> putPretty withoutA'
^[^a]*$

>>> RE.equivalent withoutA withoutA'
True

Quite small, for example 2 state DFAs can result in big regular expressions:

>>> putPretty $ toRE $ complement $ fromRE $ star "ab"
^([^]|a(ba)*(ba)?|a(ba)*([^b]|b[^a])|([^a]|a(ba)*([^b]|b[^a]))[^]*[^]?)$

We can use toRE . fromERE to convert ERE to RE:

>>> putPretty $ toRE $ fromERE $ complement $ star "ab"
^([^]|a(ba)*(ba)?|a(ba)*([^b]|b[^a])|([^a]|a(ba)*([^b]|b[^a]))[^]*[^]?)$

>>> putPretty $ toRE $ fromERE $ "a" /\ "b"
^[]$

See https://mathoverflow.net/questions/45149/can-regular-expressions-be-made-unambiguous for the description of the algorithm used.

fromERE :: forall c. (Ord c, Enum c, Bounded c) => ERE c -> DFA c Source #

Convert ERE to DFA.

We don't always generate minimal automata:

>>> putPretty $ fromERE $ "a" /\ "b"
0 -> \_ -> 1
1 -> \_ -> 1 -- black hole

Compare this to an complement example

Using fromTMEquiv, we can get minimal automaton, for the cost of higher complexity (slow!).

>>> putPretty $ fromTMEquiv $ ("a" /\ "b" :: ERE.ERE Char)
0 -> \_ -> 0 -- black hole

>>> putPretty $ fromERE $ complement $ star "abc"
0 -> \x -> if
    | x <= '`'  -> 3
    | x <= 'a'  -> 2
    | otherwise -> 3
1+ -> \x -> if
    | x <= 'b'  -> 3
    | x <= 'c'  -> 0
    | otherwise -> 3
2+ -> \x -> if
    | x <= 'a'  -> 3
    | x <= 'b'  -> 1
    | otherwise -> 3
3+ -> \_ -> 3 -- black hole

toERE :: forall c. (Ord c, Enum c, Bounded c) => DFA c -> ERE c Source #

Convert DFA to ERE.

fromTM :: forall k c. (Ord k, Ord c, TransitionMap c k) => k -> DFA c Source #

Create from TransitionMap.

See fromRE for a specific example.

fromTMEquiv :: forall k c. (Ord k, Ord c, TransitionMap c k, Equivalent c k) => k -> DFA c Source #

Create from TransitonMap minimising states with Equivalent.

See fromERE for an example.

toKleene :: forall k c. (Ord c, Enum c, Bounded c, FiniteKleene c k) => DFA c -> k Source #

Convert to any Kleene.

See toRE for a specific example.