Text.Regex.TDFA.TNFA converts the CorePattern Q/P data (and its Pattern leafs) to a QNFA tagged non-deterministic finite automata.
This holds every possible way to follow one state by another, while in the DFA these will be reduced by picking a single best transition for each (soure,destination) pair. The transitions are heavily and often redundantly annotated with tasks to perform, and this redundancy is reduced when picking the best transition. So far, keeping all this information has helped fix bugs in both the design and implementation.
The QNFA for a Pattern with a starTraned Q/P form with N one character accepting leaves has at most N+1 nodes. These nodes repesent the future choices after accepting a leaf. The processing of Or nodes often reduces this number by sharing at the end of the different paths. Turning off capturing while compiling the pattern may (future extension) reduce this further for some patterns by processing Star with optimizations. This compact design also means that tags are assigned not just to be updated before taking a transition (PreUpdate) but also after the transition (PostUpdate).
Uses recursive do notation.
- patternToNFA :: CompOption -> (Pattern, (GroupIndex, DoPa)) -> ((Index, Array Index QNFA), Array Tag OP, Array GroupIndex [GroupInfo])
- data QNFA = QNFA {}
- data QT
- type QTrans = IntMap [TagCommand]
- data TagUpdate
Documentation
patternToNFA :: CompOption -> (Pattern, (GroupIndex, DoPa)) -> ((Index, Array Index QNFA), Array Tag OP, Array GroupIndex [GroupInfo])Source
type QTrans = IntMap [TagCommand]Source
Internal type to represent the tagged transition from one QNFA to another (or itself). The key is the Index of the destination QNFA.