Unambiguous automata inference by means of states-merging methods François Coste, Daniel Fredouille {fcoste|dfredoui}@irisa.fr http://www.irisa.fr/symbiose IRISA-INRIA, Campus de Beaulieu 35042 Rennes Cedex France
I - Automata inference Definitions • Alphabet: $\Sigma = \{a,b\}$ • Word: $abbabbabbaaa \in \Sigma^*$ • Language: $L \subseteq \Sigma^*$ • Automaton: L={a+b}*a{a+b} D. Fredouille and F. Coste, Unambiguous Automata Inference
I - Automata inference Classes of automata (1/3) • Nondeterministic Automata (NFA) • Deterministic Automata (DFA) • at most one outgoing transition per state and input symbol L={a+b}*a{a+b}
I - Automata inference Classes of automata (2/3) • Unambiguous Automata (UFA) [SH85] • at most one accepting path per word L={a+b}*a{a+b} • $\text{NFA} \supset \text{UFA} \supset \text{DFA}$
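To make the three classes concrete, the following Python sketch counts the accepting paths of a word: an automaton is unambiguous when no word has more than one. The transition-table encoding, the function names and the two example automata are assumptions made for illustration and are not taken from the slides. The bounded check only explores words up to a given length, so it gives evidence of unambiguity rather than a proof.

```python
from collections import defaultdict
from itertools import product

def count_accepting_paths(delta, initial, final, word):
    """Count accepting paths of `word`; delta maps (state, symbol) -> set of states."""
    paths = defaultdict(int)
    for q in initial:
        paths[q] = 1
    for symbol in word:
        nxt = defaultdict(int)
        for q, n in paths.items():
            for r in delta.get((q, symbol), ()):
                nxt[r] += n
        paths = nxt
    return sum(n for q, n in paths.items() if q in final)

def max_paths_up_to(delta, initial, final, alphabet, max_len):
    """Largest number of accepting paths over all words of length <= max_len."""
    return max(
        count_accepting_paths(delta, initial, final, "".join(w))
        for n in range(max_len + 1)
        for w in product(alphabet, repeat=n)
    )

# Hypothetical 3-state NFA for L={a+b}*a{a+b}: it guesses when the
# second-to-last letter is being read. Initial state 0, final state 2.
ufa_delta = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "a"): {2}, (1, "b"): {2}}

# Hypothetical ambiguous NFA: the word "aa" can leave state 0 at either step.
# Initial state 0, final state 1.
amb_delta = {(0, "a"): {0, 1}, (1, "a"): {1}}

print(count_accepting_paths(ufa_delta, {0}, {2}, "abab"))
print(max_paths_up_to(ufa_delta, {0}, {2}, "ab", 6))
print(count_accepting_paths(amb_delta, {0}, {1}, "aa"))
```

The first automaton is nondeterministic (two `a`-transitions leave state 0) yet every accepted word has a single accepting path, which is the situation the UFA class captures between NFA and DFA.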
I - Automata inference Automata inference • Examples: $S_+ = \{aa, abab\}$ • Counter-examples: $S_- = \{ba, abbb\}$ • L={a+b}*a{a+b}
I - Automata inference Why this study? • State of the art: DFA inference • Our goal: introducing some amount of non-determinism • Why? • NFA << DFA (in size) • inferring with less data • inferring “explicit” representations • Method: • extending the classical DFA inference algorithm
II - Study of the DFA inference framework
II - The DFA search space Search space for NFAs [DMV94] • Figure: lattice of automata, from the MCA (maximal canonical automaton) at the bottom to the UA (universal automaton) at the top
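The two extremes of this search space can be sketched in a few lines of Python. This assumes the usual definitions: the MCA has one separate branch per example word, all sharing a single initial state, and the UA is a single accepting state looping on every symbol. The encoding and the names `build_mca`, `build_ua` and `accepts` are hypothetical.

```python
def build_mca(positive):
    """One disjoint branch per word; all branches start in initial state 0."""
    delta, final, next_state = {}, set(), 1
    for word in positive:
        q = 0
        for symbol in word:
            delta.setdefault((q, symbol), set()).add(next_state)
            q = next_state
            next_state += 1
        final.add(q)
    return delta, {0}, final

def build_ua(alphabet):
    """Single state, initial and final, looping on every symbol."""
    return {(0, s): {0} for s in alphabet}, {0}, {0}

def accepts(delta, initial, final, word):
    """Simulate the automaton on `word` by tracking the set of reachable states."""
    current = set(initial)
    for symbol in word:
        current = {r for q in current for r in delta.get((q, symbol), ())}
    return bool(current & final)

mca = build_mca(["aa", "abab"])   # accepts exactly the examples
ua = build_ua("ab")               # accepts every word over {a, b}

print(accepts(*mca, "abab"), accepts(*mca, "ab"), accepts(*ua, "ab"))
```

Every automaton of the search space lies between these two: merging states of the MCA can only enlarge the accepted language, and merging all of them yields the UA.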
II - The DFA search space Counter-examples: compatibility • $L \cap S_- \neq \emptyset$ (incompatible) • $L \cap S_- = \emptyset$ (compatible) • Figure: search space between MCA and UA, split into compatible and incompatible automata
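The compatibility test amounts to checking that no counter-example is accepted. A minimal Python sketch, using a hypothetical transition-table encoding and the counter-examples given on the earlier slide ({ba, abbb}):

```python
def accepts(delta, initial, final, word):
    """Simulate the automaton on `word` by tracking the set of reachable states."""
    current = set(initial)
    for symbol in word:
        current = {r for q in current for r in delta.get((q, symbol), ())}
    return bool(current & final)

def is_compatible(delta, initial, final, negative):
    """True when the accepted language contains no counter-example."""
    return not any(accepts(delta, initial, final, w) for w in negative)

s_minus = ["ba", "abbb"]

# Hypothetical 3-state NFA for L={a+b}*a{a+b} (initial 0, final 2).
target = ({(0, "a"): {0, 1}, (0, "b"): {0}, (1, "a"): {2}, (1, "b"): {2}}, {0}, {2})
# Universal automaton over {a, b}: it accepts everything.
universal = ({(0, "a"): {0}, (0, "b"): {0}}, {0}, {0})

print(is_compatible(*target, s_minus), is_compatible(*universal, s_minus))
```

Since merging can only enlarge the language, an incompatible automaton stays incompatible after further merges, which is why the counter-examples bound the search from above.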
II - The DFA search space The search space for DFA • Figure: lattice from MCA to UA, contrasting state merging with deterministic merging
II - The DFA search space Merging for determinization procedure • $\forall q_1, q_2 \in Q,\ \forall w \in \Sigma^*:\ w \in \mathrm{pref}(q_1) \wedge w \in \mathrm{pref}(q_2) \Rightarrow \text{state-merging}(q_1, q_2)$
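A possible reading of this procedure in Python is sketched below. It relies on an assumption not stated on the slide: when every state is reachable, repeatedly merging two initial states, or two targets of the same (state, symbol) pair, stops exactly when no two distinct states share a prefix. State names, the encoding and the function names are hypothetical.

```python
def merge_states(delta, initial, final, keep, drop):
    """Rename state `drop` to `keep` everywhere (plain state merging)."""
    def rename(q):
        return keep if q == drop else q
    merged = {}
    for (q, symbol), targets in delta.items():
        merged.setdefault((rename(q), symbol), set()).update(rename(t) for t in targets)
    return merged, {rename(q) for q in initial}, {rename(q) for q in final}

def merge_for_determinization(delta, initial, final):
    """Merge until one initial state and at most one target per (state, symbol) remain."""
    while True:
        if len(initial) > 1:
            keep, drop = sorted(initial)[:2]
        else:
            clash = next((t for t in delta.values() if len(t) > 1), None)
            if clash is None:
                return delta, initial, final
            keep, drop = sorted(clash)[:2]
        delta, initial, final = merge_states(delta, initial, final, keep, drop)

def state_set(delta, initial, final):
    """All states mentioned anywhere in the automaton."""
    found = set(initial) | set(final)
    for (q, _), targets in delta.items():
        found |= {q} | targets
    return found

# MCA({aa, abab}) with hypothetical integer state names; state 0 is initial.
mca = ({(0, "a"): {1, 3}, (1, "a"): {2}, (3, "b"): {4}, (4, "a"): {5}, (5, "b"): {6}},
       {0}, {2, 6})
pta = merge_for_determinization(*mca)

# Deterministic merging: one state merging (0 with 1), then restore determinism.
dfa = merge_for_determinization(*merge_states(*pta, 0, 1))

print(sorted(state_set(*pta)), sorted(state_set(*dfa)))
```

Each merge removes one state, so the loop terminates. In the second call, the single merge of states 0 and 1 triggers a further forced merge, which is the cascade that the deterministic merging operator packages into one step.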
II - The DFA search space Deterministic merging operator = state-merging + merging for determinization • Very commonly used [OG92, LPP98, ...] • Demonstration of formal properties • Merging for determinization • Reaches the “closest” DFA from the original NFA • Deterministic merging • Reaches all DFAs derived from a given DFA • ... (see tech. rep.)
III - From DFA to UFA inference, or how to introduce some amount of non-determinism in inference
III - DFA to UFA inference Inferring non-deterministic representations: the choice of UFA • Why UFA? • unity in the search space (like DFA) • Figure: search space built on MCA({aaaaa}), from MCA to UA, showing the NFA, UFA and DFA regions
III - DFA to UFA inference Merging for disambiguation procedure • $\forall q_1, q_2 \in Q,\ \forall w_1, w_2 \in \Sigma^*:\ w_1 \in \mathrm{pref}(q_1) \wedge w_1 \in \mathrm{pref}(q_2) \wedge w_2 \in \mathrm{suff}(q_1) \wedge w_2 \in \mathrm{suff}(q_2) \Rightarrow \text{state-merging}(q_1, q_2)$
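One way to detect two states sharing both a prefix and a suffix is a reachability search on pairs of states, forward from the initial states and backward from the final ones. The Python sketch below follows that idea; it is an illustration under a hypothetical encoding, not necessarily the procedure used by the authors.

```python
from itertools import product

def merge_states(delta, initial, final, keep, drop):
    """Rename state `drop` to `keep` everywhere (plain state merging)."""
    def rename(q):
        return keep if q == drop else q
    merged = {}
    for (q, symbol), targets in delta.items():
        merged.setdefault((rename(q), symbol), set()).update(rename(t) for t in targets)
    return merged, {rename(q) for q in initial}, {rename(q) for q in final}

def shared_pairs(delta, seeds, alphabet):
    """Pairs of states reachable from `seeds` by reading the same word on both sides."""
    seen, stack = set(seeds), list(seeds)
    while stack:
        p, q = stack.pop()
        for s in alphabet:
            for pair in product(delta.get((p, s), ()), delta.get((q, s), ())):
                if pair not in seen:
                    seen.add(pair)
                    stack.append(pair)
    return seen

def reverse(delta):
    """Transition table with every arrow reversed."""
    rev = {}
    for (q, s), targets in delta.items():
        for t in targets:
            rev.setdefault((t, s), set()).add(q)
    return rev

def ambiguous_pair(delta, initial, final, alphabet):
    """Return two distinct states sharing a prefix and a suffix, or None."""
    same_prefix = shared_pairs(delta, set(product(initial, initial)), alphabet)
    same_suffix = shared_pairs(reverse(delta), set(product(final, final)), alphabet)
    candidates = sorted((p, q) for (p, q) in same_prefix & same_suffix if p < q)
    return candidates[0] if candidates else None

def merge_for_disambiguation(delta, initial, final, alphabet):
    """Merge ambiguous pairs until none is left."""
    pair = ambiguous_pair(delta, initial, final, alphabet)
    while pair is not None:
        delta, initial, final = merge_states(delta, initial, final, *pair)
        pair = ambiguous_pair(delta, initial, final, alphabet)
    return delta, initial, final

# Hypothetical ambiguous NFA: "ab" is accepted through state 1 or through state 2.
amb = ({(0, "a"): {1, 2}, (1, "b"): {3}, (2, "b"): {3}}, {0}, {3})
# Hypothetical nondeterministic but unambiguous NFA for L={a+b}*a{a+b}.
ufa = ({(0, "a"): {0, 1}, (0, "b"): {0}, (1, "a"): {2}, (1, "b"): {2}}, {0}, {2})

fixed = merge_for_disambiguation(*amb, "ab")
untouched = merge_for_disambiguation(*ufa, "ab")
print(fixed[0], untouched[0] == ufa[0])
```

The second example is left unchanged although it is nondeterministic: states 0 and 1 share a prefix but no suffix, so a determinization-based rule would merge them while the disambiguation rule does not.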
III - DFA to UFA inference Unambiguous merging = state-merging + merging for disambiguation • Finer operator than merging for determinization • Demonstration of formal properties • Merging for disambiguation • Reaches the “closest” UFA from the original NFA • Unambiguous merging • Reaches all UFAs derived from a given UFA • ... (see tech. rep.)
IV - Comparative experiments - Inference algorithms - Benchmarks - Experimental results
IV - Comparative experiments Algorithms • UFA • Hill-climbing heuristic • DFA • EDSM heuristic [LPP98] • RFSA • DeLeTe II [DLT01] • Hill-climbing heuristic
IV - Comparative experiments Counter-example use for DFA and UFA inference • Compatibility [DMV94] • generalization of $S_+$, stopped by $S_-$ • Functionality [AS95] • generalization of $S_+$ and $S_-$, stopped by empty intersection
IV - Comparative experiments Benchmarks • [DLT01] • Generation: DFA, NFA, Regular Expression • 4 sizes of training sample • 30 languages generated for each generation mode and sample size • + UFA generator • Evaluation based on • average recognition level on test sets • matches between recognition level
IV - Comparative experiments Results • Best algorithms w.r.t. benchmarks • DFA bench: UFA inference with hill-climbing • UFA bench: DFA inference with hill-climbing, UFA inference with hill-climbing • NFA bench: RFSA inference • Reg. Expr.: RFSA inference, DFA inference with hill-climbing
IV - Comparative experiments Results • Heuristic: • Hill-climbing >> EDSM when inferring DFAs for NFA/Regular Expression/UFA bench. • Counter-examples: • Compatibility Functionality
IV - Comparative experiments Results (matches)
Conclusion • UFA inference • Merging for disambiguation • Heuristic • Comparison with EDSM & DeLeTe II Perspectives • Speeding up the algorithm • Application • Using properties of the DFA/UFA space
References • [AS95] Alquézar, Sanfeliu, “Incremental grammatical inference from positive and negative data using unbiased finite state automata”, SSPR ’94 • [DMV94] Dupont et al., “What is the search space of the regular inference?”, ICGI ’94 • [DLT00] Denis et al., “Learning regular languages using nondeterministic automata”, ICGI ’00 • [SH85] Stearns, Hunt, “On the equivalence and containment problems for unambiguous regular expressions, regular grammars and finite automata”, SIAM vol. 14 • [tech. rep.] Coste, Fredouille, “What is the search space for NFA, UFA and DFA inference?”, IRISA