Introduction to FSM Toolkit - PowerPoint PPT Presentation

addison
Presentation Transcript

  1. Introduction to FSM Toolkit. Examples: Part I. NLP Course 07

  2. Example 1 • Acceptor for “sheeptalk”: /baa+!/

     Text representation (sheep.txt):

         0 1 b
         1 2 a
         2 3 a
         3 3 a
         3 4 !
         4

     Symbols file (S.syms):

         eps 0
         a 1
         b 2
         ! 3
         w 4
         o 5
         u 6
         f 7

     - The symbols w, o, u and f are needed for the second example.
     - The eps symbol is reserved for possible future epsilon transitions.
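To make the text representation concrete, here is a minimal sketch that simulates the acceptor directly from sheep.txt, without the toolkit. It assumes the usual reading of the format (an arc line is "source destination label", a line with a single state is a final state, and the source of the first line is the start state) and it assumes the acceptor is deterministic. The helper name `accepts` is hypothetical and is not part of the FSM Toolkit.

```shell
#!/bin/sh
# Text representation of the sheeptalk acceptor from Example 1.
cat > sheep.txt <<'EOF'
0 1 b
1 2 a
2 3 a
3 3 a
3 4 !
4
EOF

# accepts FILE SYMBOL... : exit status 0 if the symbol sequence is accepted.
# Hypothetical helper; assumes a deterministic acceptor whose start state
# is the source state of the first line of FILE.
accepts() {
    fsm_file=$1
    shift
    printf '%s\n' "$@" | awk -v fsm="$fsm_file" '
        BEGIN {
            have_start = 0
            while ((getline line < fsm) > 0) {
                n = split(line, f, " ")
                if (n == 0) continue
                if (!have_start) { state = f[1]; have_start = 1 }
                if (n >= 3) arc[f[1], f[3]] = f[2]   # arc: source, label -> destination
                else        final[f[1]] = 1          # final state line
            }
            close(fsm)
            dead = 0
        }
        NF == 0 { next }
        {
            # Follow the arc for this symbol, or give up if there is none.
            if (!dead && ((state, $1) in arc)) state = arc[state, $1]
            else dead = 1
        }
        END { exit ((!dead && (state in final)) ? 0 : 1) }
    '
}

accepts sheep.txt b a a a '!' && echo "baaa! accepted"
accepts sheep.txt b a '!'     || echo "ba! rejected"
```

Each symbol is passed as a separate argument because the acceptor works on symbols from S.syms, not on raw characters.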

  3. Example 1
     • fsmcompile -i S.syms sheep.txt > sheep.fsa
     • fsmdraw -i S.syms sheep.fsa | dot -Tps > sheep.ps
     • Image format: PostScript. For JPEG output, write: fsmdraw -i S.syms sheep.fsa | dot -Tjpg > sheep.jpg

  4. Write an acceptor for “dogtalk”: /wouf!/

  5. Example 2 • Acceptor for “dogtalk”: /wouf!/

     Text representation (dog.txt):

         0 1 w
         1 2 o
         2 3 u
         3 4 f
         4 5 !
         5

     Symbols file (S.syms): same as in Example 1 (sheep and dog share the same symbols file):

         eps 0
         a 1
         b 2
         ! 3
         w 4
         o 5
         u 6
         f 7


  6. Example 2
     • fsmcompile -i S.syms dog.txt > dog.fsa
     • fsmdraw -i S.syms dog.fsa | dot -Tps > dog.ps

  7. Given the two FSAs for “sheeptalk” and “dogtalk”, use the appropriate function to generate an acceptor that accepts a “sheeptalk” OR a “dogtalk”.

  8. Example 3
     • fsmunion sheep.fsa dog.fsa > shORdg.fsa
     • fsmdraw -iS.syms < shORdg.fsa | dot -Tps > shORdg.ps

  9. Given the two FSAs for “sheeptalk” and “dogtalk”, use the appropriate function to generate an acceptor that accepts a “sheeptalk” AND a “dogtalk”, with the constraint that the sheep talks first!

  10. Example 4
      • fsmconcat sheep.fsa dog.fsa > shANDdg.fsa
      • fsmdraw -iS.syms < shANDdg.fsa | dot -Tps > shANDdg.ps

  11. But the Society of Animals is always fair! This time, let the dog speak first!
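As a toolkit-independent sanity check, the languages that the union and the two concatenation orders should accept can be written as extended regular expressions and tested with grep. This is only a sketch: the function names are hypothetical, and it checks plain character strings rather than compiled acceptors. The comment about swapping the operands of fsmconcat is an inference from Example 4, where the sheep-first acceptor lists sheep.fsa first.

```shell
#!/bin/sh
# Regular expressions for the two languages from Examples 1 and 2.
SHEEP='baa+!'
DOG='wouf!'

# matches REGEX STRING : exit status 0 if the whole string matches.
matches() { printf '%s\n' "$2" | grep -Eqx -e "$1"; }

# Language expected from: fsmunion sheep.fsa dog.fsa
in_union()     { matches "($SHEEP|$DOG)" "$1"; }

# Language expected from: fsmconcat sheep.fsa dog.fsa (sheep talks first)
in_sheep_dog() { matches "$SHEEP$DOG" "$1"; }

# Dog talks first. With the toolkit this would presumably be the same call
# with the operands swapped, e.g. fsmconcat dog.fsa sheep.fsa > dgANDsh.fsa
# (the output file name dgANDsh.fsa is made up for illustration).
in_dog_sheep() { matches "$DOG$SHEEP" "$1"; }

in_union 'baaa!'          && echo "union accepts baaa!"
in_sheep_dog 'baa!wouf!'  && echo "sheep-then-dog accepts baa!wouf!"
in_dog_sheep 'wouf!baa!'  && echo "dog-then-sheep accepts wouf!baa!"
```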

  12. Example 5 • Generate the following weighted FSM:

  13. Example 5

      Text representation (A.txt):

          0 1 red 0.3
          1 3 blue 0.7
          0 2 green 0.4
          2 3 yellow 0.8
          3 0.3
          4 0.4

      Symbols file (S2.syms):

          eps 0
          red 1
          blue 2
          green 3
          yellow 4

      As before: fsmcompile, fsmdraw.

  14. Which is the path with the lowest cost?

  15. Example 5
      • fsmbestpath A.fsa > B.fsa
      • fsmdraw -iS2.syms < B.fsa | dot -Tps > B.ps
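The lowest-cost path can be checked by hand or with a small script. The sketch below assumes that path costs add up (arc weights plus the final-state weight, as in the tropical semiring) and that all weights are non-negative. The function name `best_path` is hypothetical, and the script works on the text file A.txt, not on the compiled A.fsa. State 4 is kept as transcribed; it has no incoming arc in the listing, so it does not affect the result.

```shell
#!/bin/sh
# Text representation of the weighted FSM from Example 5, as transcribed.
cat > A.txt <<'EOF'
0 1 red 0.3
1 3 blue 0.7
0 2 green 0.4
2 3 yellow 0.8
3 0.3
4 0.4
EOF

# best_path FILE : print "<cost> <labels...>" of the cheapest accepting path.
# Hypothetical helper; uses repeated relaxation over all arcs.
best_path() {
    awk '
        NF >= 3 {                      # arc line: source dest label [weight]
            n++
            src[n] = $1; dst[n] = $2; lab[n] = $3
            cost[n] = (NF >= 4) ? $4 : 0
            if (n == 1) start = $1     # start state: source of the first arc
            next
        }
        NF >= 1 { final[$1] = (NF >= 2) ? $2 : 0 }   # final state [weight]
        END {
            dist[start] = 0
            path[start] = ""
            for (round = 1; round <= n; round++)
                for (i = 1; i <= n; i++) {
                    if (!(src[i] in dist)) continue
                    d = dist[src[i]] + cost[i]
                    if (!(dst[i] in dist) || d < dist[dst[i]]) {
                        dist[dst[i]] = d
                        path[dst[i]] = path[src[i]] " " lab[i]
                    }
                }
            best = -1
            for (s in final)
                if (s in dist) {       # ignore unreachable final states
                    total = dist[s] + final[s]
                    if (best < 0 || total < best) { best = total; labels = path[s] }
                }
            if (best < 0) exit 1
            printf "%.1f%s\n", best, labels
        }
    ' "$1"
}

best_path A.txt   # prints: 1.3 red blue
```

Under these assumptions the path red, blue costs 0.3 + 0.7 + 0.3 = 1.3, while green, yellow costs 0.4 + 0.8 + 0.3 = 1.5.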

  16. Integrating the power of Perl with the FSM Toolkit

  17. Perl & FSM Toolkit
      • Problem definition: the input is a file containing a single sentence of lower-case words: “hi nlp world”
      • Goal: transform these words into upper case using FSMs: “HI NLP WORLD”

  18. Perl & FSM Toolkit
      • A Perl script (composition.pl) that:
        • Extracts the lower-case words from the input file
        • Generates the corresponding transducer
        • Generates a second transducer that transforms each word to its upper-case form
        • Composes the two transducers
        • Projects the output of the resulting transducer
        • Extracts the output of that transducer by reading the appropriate file, and prints the upper-case sentence to the screen

  19. #!/usr/bin/perl
      # read the single input sentence and split it into words
      open (IN, $ARGV[0]) || die "error";
      $rdln = <IN>;
      @in_wrds = split(' ', $rdln);   # ' ' also skips leading whitespace
      close(IN);

      # write the files for the transducers
      open (OUT_T11, ">T11") || die "error";
      open (OUT_T12, ">T12") || die "error";
      @low_up_words = @in_wrds;
      $c = 0;
      foreach $tmp (@in_wrds) {
          print OUT_T11 ($c,"\t",$c+1,"\t",$tmp,"\t",$tmp,"\n");
          print OUT_T12 ($c,"\t",$c+1,"\t",$tmp,"\t",uc($tmp),"\n");
          push (@low_up_words,uc($tmp));   # gather lower and upper case words
          $c++;
      }
      print OUT_T11 ($c,"\n");   # final state
      print OUT_T12 ($c,"\n");
      close(OUT_T11);   # close both files so they are flushed before fsmcompile reads them
      close(OUT_T12);

  20. # write symbols file
      $i = 1;
      open (OUT_S12, ">S12") || die "error";
      foreach $tmp (@low_up_words) {
          print OUT_S12 ($tmp,"\t",$i,"\n");
          $i++;
      }
      close(OUT_S12);

      # call the FSM Library
      system ("fsmcompile -iS12 -oS12 -t < T11 > T11.fst");
      system ("fsmdraw -iS12 -oS12 < T11.fst | dot -Tps > T11.ps");
      system ("fsmcompile -iS12 -oS12 -t < T12 > T12.fst");
      system ("fsmdraw -iS12 -oS12 < T12.fst | dot -Tps > T12.ps");
      system ("fsmcompose T11.fst T12.fst > T12comp.fst");
      system ("fsmdraw -iS12 -oS12 < T12comp.fst | dot -Tps > T12comp.ps");

  21. system ("fsmproject -2 T12comp.fst > final_out.fsa");
      system ("fsmdraw -iS12 < final_out.fsa | dot -Tps > final_out.ps");
      system ("fsmprint -iS12 < final_out.fsa > final_out");

      # Finally, read the resulting file and extract the field of interest
      open (IN2, "final_out") || die "can not open the input file...\n";
      while (defined($rdln2 = <IN2>)) {
          @out_wrds = split(' ', $rdln2);
          # the final-state line has no label field, so skip it
          push (@up_wrds, $out_wrds[2]) if defined $out_wrds[2];
      }
      close(IN2);

      # print the upper case content of the initial input file
      print (join(" ",@up_wrds),"\n");

  22. Perl & FSM Toolkit

      First fst (T11.fst):

          0 1 hi hi
          1 2 nlp nlp
          2 3 world world
          3

      Second fst (T12.fst):

          0 1 hi HI
          1 2 nlp NLP
          2 3 world WORLD
          3

      Symbols file (S12):

          hi 1
          nlp 2
          world 3
          HI 4
          NLP 5
          WORLD 6
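For comparison, the same three text files can be produced from the shell with awk instead of Perl. This is a sketch under the same assumptions as the Perl script (a single input sentence, no repeated words, symbol numbering starting at 1 with the lower-case words first). The function name `make_transducers` is hypothetical.

```shell
#!/bin/sh
# make_transducers "sentence" : write T11, T12 and S12 in the current directory.
# Hypothetical helper mirroring the file-writing part of composition.pl.
make_transducers() {
    printf '%s\n' "$1" | awk '
        {
            for (i = 1; i <= NF; i++) {
                # T11 maps each word to itself, T12 maps it to upper case.
                printf("%d\t%d\t%s\t%s\n", i - 1, i, $i, $i)          > "T11"
                printf("%d\t%d\t%s\t%s\n", i - 1, i, $i, toupper($i)) > "T12"
                word[i] = $i
            }
            n = NF
        }
        END {
            # Final state of both linear transducers.
            print n > "T11"
            print n > "T12"
            # Symbols file: lower-case words first, then their upper-case forms.
            for (i = 1; i <= n; i++) printf("%s\t%d\n", word[i], i)              > "S12"
            for (i = 1; i <= n; i++) printf("%s\t%d\n", toupper(word[i]), n + i) > "S12"
        }
    '
}

make_transducers " hi nlp world"
cat T12
```

The files are tab-separated, as in the Perl version, and can then be passed to fsmcompile exactly as in the script.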

  23. Perl & FSM Toolkit
      • Compose T11.fst and T12.fst:
        system ("fsmcompose T11.fst T12.fst > T12comp.fst");
        system ("fsmdraw -iS12 -oS12 < T12comp.fst | dot -Tps > T12comp.ps");

  24. Perl & FSM Toolkit
      • Project the output of the resulting transducer:
        system ("fsmproject -2 T12comp.fst > final_out.fsa");
      • Draw final_out.fsa:
        system ("fsmdraw -iS12 < final_out.fsa | dot -Tps > final_out.ps");
      • Print a textual description of the above FSA:
        system ("fsmprint -iS12 < final_out.fsa > final_out");
      • Read this textual description using Perl:
        open (IN2, "final_out") || die "can not open the input file...\n";
        $rdln2 = <IN2>;
        . . .

  25. Perl & FSM Toolkit

      Textual description of final_out.fsa:

          0 1 HI
          1 2 NLP
          2 3 WORLD
          3
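The last step of the script, pulling the label field out of this textual description, can also be sketched as an awk one-liner. The sample file below simply reproduces the listing from this slide, and the function name `extract_labels` is hypothetical. Arc lines have at least three fields, so the final-state line is skipped.

```shell
#!/bin/sh
# Sample fsmprint output, copied from the slide above.
cat > final_out <<'EOF'
0 1 HI
1 2 NLP
2 3 WORLD
3
EOF

# extract_labels FILE : join the third field of every arc line with spaces.
extract_labels() {
    awk 'NF >= 3 { out = out sep $3; sep = " " } END { print out }' "$1"
}

extract_labels final_out   # prints: HI NLP WORLD
```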

  26. Simple extra exercises

  27. Extras 1 • Generate the following acceptor, determinize and minimize it

  28. Extras 2 • Generate the following transducers and find their composition