
AutoEval and Missplel: Two Generic Tools for Automatic Evaluation

Johnny Bigert, Linus Ericson, Anton Solis. Nada, KTH, Stockholm, Sweden. Contact: johnny@kth.se, www.nada.kth.se/theory/humanlang/tools.html


Presentation Transcript


  1. AutoEval and Missplel: Two Generic Tools for Automatic Evaluation • Johnny Bigert, Linus Ericson, Anton Solis • Nada, KTH, Stockholm, Sweden • Contact: johnny@kth.se • www.nada.kth.se/theory/humanlang/tools.html

  2. Manual evaluation • Time-consuming, tedious, error-prone • Computers are good at repetitive tasks, humans are not • Unavoidable in some situations

  3. Automatic evaluation • Cheap, fast, accurate, easily reproducible • Incorporated in the development of most NLP systems

  4. Automatic evaluation • AutoEval: simplifies the construction of evaluations of NLP systems • Missplel: introduces human-like errors into text

  5. AutoEval • "I write evaluation code myself in all our NLP projects" • "Why would I need AutoEval?"

  6. AutoEval • Our point exactly. Repetition of: • Input and output file handling • XML parsing and XML output • Error handling, malformed input • Data storage, management and processing

  7. AutoEval Features that avoid repetition: • Handles input (XML or structured plain text) and generates output (XML) • Handles data storage and processing • ...and also: • Generic and extendible script language • Efficient

  8. AutoEval Script language: • Simple C-like syntax • Powerful • Modules and macros in repository files • Extendible, add your own functions

  9. AutoEval Example of configuration and script language:

      <root>
        <files>
          <file format="plain" type="in" name="datafile">TnT.wt</file>
          <file format="xml" type="out" name="outfile">out.xml</file>
        </files>
        <process>
          field(file("datafile"), "\t", "\n", var("word"), var("tag"));
          inc(cnt("tot"));
          inc(cnt(lookup("tag")));
        </process>
        <processonce>
          outputintcon(out("outfile"), cntmap("global"), "global");
        </processonce>
      </root>

  10. AutoEval The result:

      <evaloutput date="Mon May 26 12:37:39 2003">
        <global>
          <var name="tot">14119</var>
          <var name="ab">714</var>
          <var name="ab.kom">44</var>
          <var name="ab.pos">149</var>
          <var name="ab.suv">24</var>
          ...
          <var name="vb.sup.akt">117</var>
          <var name="vb.sup.sfo">35</var>
        </global>
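To make the script on slide 9 concrete, here is a minimal Python sketch of the same idea: read word/tag pairs separated by tabs, count the total number of tokens and the occurrences of each tag, and write the counters as XML. It is not AutoEval itself. The function names `count_tags` and `to_xml` and the three sample tokens are assumptions made for illustration; only the output layout is modelled on slide 10.

```python
import io
from collections import Counter
from xml.etree import ElementTree as ET


def count_tags(lines):
    """Count all tokens ("tot") and each tag in word<TAB>tag lines."""
    counts = Counter()
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        tag = parts[1]
        counts["tot"] += 1
        counts[tag] += 1
    return counts


def to_xml(counts):
    """Serialize the counters in a layout modelled on the slide's output."""
    root = ET.Element("evaloutput")
    group = ET.SubElement(root, "global")
    # Put "tot" first, then the tags in alphabetical order.
    for name in sorted(counts, key=lambda n: (n != "tot", n)):
        var = ET.SubElement(group, "var", name=name)
        var.text = str(counts[name])
    return ET.tostring(root, encoding="unicode")


# Invented sample data standing in for the TnT.wt file.
sample = io.StringIO("inte\tab\nsnabbare\tab.kom\naldrig\tab\n")
counts = count_tags(sample)
print(to_xml(counts))
```

The point of the comparison is the amount of plumbing: file handling, parsing, malformed-input checks and XML output all have to be written by hand here, which is the repetition the slides say AutoEval removes.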

  11. Missplel • Missplel is a highly configurable tool to introduce human-like spelling errors • Language, PoS tag set, character set and keyboard layout independent • All you need is a word/tag/lemma dictionary

  12. Missplel Performance errors – Damerau: • Keyboard mistypes (Damerau, 1964): insertion, deletion, substitution, transposition of letters • Examples: wellcvome, wellcme, wellcpme, wellcmoe • Result: an existing or non-existing word, with or without a change of word class (PoS tag)
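The four Damerau operations are easy to state in code. The sketch below is a minimal illustration, not Missplel's implementation: the function names are hypothetical, the replacement letter is drawn uniformly from a-z rather than from a keyboard confusion matrix, and the base string "wellcome" is an assumption chosen because it reproduces the four examples on the slide.

```python
import random
import string


def insert(word, pos, letter):
    """Insert `letter` before index `pos`."""
    return word[:pos] + letter + word[pos:]


def delete(word, pos):
    """Remove the letter at index `pos`."""
    return word[:pos] + word[pos + 1:]


def substitute(word, pos, letter):
    """Replace the letter at index `pos` with `letter`."""
    return word[:pos] + letter + word[pos + 1:]


def transpose(word, pos):
    """Swap the letters at indices `pos` and `pos + 1`."""
    return word[:pos] + word[pos + 1] + word[pos] + word[pos + 2:]


def random_mistype(word, rng):
    """Apply one randomly chosen operation to a word of length >= 2."""
    op = rng.choice(["ins", "del", "sub", "transp"])
    letter = rng.choice(string.ascii_lowercase)
    if op == "ins":
        return insert(word, rng.randrange(len(word) + 1), letter)
    if op == "del":
        return delete(word, rng.randrange(len(word)))
    if op == "sub":
        # May pick the original letter, in which case the word is unchanged.
        return substitute(word, rng.randrange(len(word)), letter)
    return transpose(word, rng.randrange(len(word) - 1))


# Index 5 of "wellcome" is the letter "o".
print(insert("wellcome", 5, "v"))      # wellcvome
print(delete("wellcome", 5))           # wellcme
print(substitute("wellcome", 5, "p"))  # wellcpme
print(transpose("wellcome", 5))        # wellcmoe
```

Whether the result is an existing word, and whether its PoS tag changes, can only be decided by looking the new string up in a dictionary, which is why the slides list a word/tag/lemma dictionary as the one required resource.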

  13. Missplel Competence errors – split compounds: • May alter the semantics of a sentence • Kycklinglever: "chicken liver" • Kyckling lever: "chicken is alive" • Settings for split compound elements: minimum length, allowed PoS tags, found in dictionary, word class change, etc.
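A compound split can be sketched as a search for a position where both halves satisfy the configured constraints. The Python below is a simplified illustration under stated assumptions: only the "minimum length" and "found in dictionary" settings are modelled, and the function name `split_candidates` and the two-word dictionary are hypothetical.

```python
def split_candidates(word, dictionary, min_part_length=3):
    """Return every (first, second) split where both parts are known words."""
    word = word.lower()
    splits = []
    # Try each split point that leaves both parts long enough.
    for i in range(min_part_length, len(word) - min_part_length + 1):
        first, second = word[:i], word[i:]
        if first in dictionary and second in dictionary:
            splits.append((first, second))
    return splits


# Tiny illustrative dictionary; a real run would use a full word list.
dictionary = {"kyckling", "lever"}
print(split_candidates("Kycklinglever", dictionary))
```

With the example from the slide this yields the single split ("kyckling", "lever"). Checks on PoS tags and word class change would be added as further filters inside the loop.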

  14. Missplel Competence errors – sound errors: • Letter level, e.g. sound-alike errors • Regular expression rules: (.+)ei(.+) → @1ie@2, e.g. receive → recieve
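The rule notation maps directly onto ordinary regular expressions: the pattern captures groups and `@1`, `@2` in the replacement refer back to them. The following Python sketch shows one way to interpret such a rule. The function name `apply_rule` is hypothetical, and matching against the whole word is an assumption about how the rules are applied.

```python
import re


def apply_rule(word, pattern, replacement, escape="@"):
    """Apply a pattern/replacement rule; return None if the word does not match."""
    match = re.fullmatch(pattern, word)
    if match is None:
        return None
    # Replace each @n in the replacement with the text of capture group n.
    reference = re.escape(escape) + r"(\d+)"
    return re.sub(reference, lambda ref: match.group(int(ref.group(1))), replacement)


print(apply_rule("receive", r"(.+)ei(.+)", "@1ie@2"))  # recieve
print(apply_rule("cat", r"(.+)ei(.+)", "@1ie@2"))      # None
```

The `escape` parameter mirrors the `<escapeChar>@</escapeChar>` option shown in the configuration later in the slides.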

  15. Missplel Competence errors – syntax errors: • Word/letter level • Form new words from PoS tags, missing/doubled words etc. • Regular expression rules:

      <rule ex="slutat skrika - slutat skrikit">
        <match>vb\.sup\.akt(.*) vb\.inf.*</match>
        <to>vb.sup.akt@1 vb.sup.akt</to>
      </rule>
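One plausible reading of the rule above is that it matches a sequence of PoS tags, rewrites that sequence, and then re-inflects each word whose tag changed by looking up its lemma with the new tag. The Python sketch below follows that reading. It is an assumption about the mechanism, not Missplel's code: the function name, the two-token window, the tag `vb.inf.akt` for "skrika" and the three-entry `FORMS` dictionary are all hypothetical.

```python
import re

# Hypothetical mini-dictionary: (lemma, tag) -> inflected word form.
FORMS = {
    ("sluta", "vb.sup.akt"): "slutat",
    ("skrika", "vb.inf.akt"): "skrika",
    ("skrika", "vb.sup.akt"): "skrikit",
}


def apply_syntax_rule(tokens, match, to):
    """Rewrite the tag sequence of (word, tag, lemma) tokens and re-inflect."""
    tags = " ".join(tag for _, tag, _ in tokens)
    m = re.fullmatch(match, tags)
    if m is None:
        return None  # the rule does not apply to this tag sequence
    # Expand @n references, then split back into one tag per token.
    new_tags = re.sub(r"@(\d+)", lambda ref: m.group(int(ref.group(1))), to).split(" ")
    if len(new_tags) != len(tokens):
        return None
    result = []
    for (word, tag, lemma), new_tag in zip(tokens, new_tags):
        if new_tag == tag:
            result.append(word)
            continue
        form = FORMS.get((lemma, new_tag))
        if form is None:
            return None  # no word form known for the new tag
        result.append(form)
    return result


tokens = [("slutat", "vb.sup.akt", "sluta"), ("skrika", "vb.inf.akt", "skrika")]
print(apply_syntax_rule(tokens, r"vb\.sup\.akt(.*) vb\.inf.*", "vb.sup.akt@1 vb.sup.akt"))
```

Under these assumptions the output is `['slutat', 'skrikit']`, which matches the example given in the rule's `ex` attribute.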

  16. Missplel Example input and output:

      Input                Output
      Letters  NN2         Litters  NN2  damerau/wordexist-notagchange
      would    VM0         would    VM0  ok
      be       VBI         bee      NN1  sound/wordexist-tagchange
      welcome  AJ0-NN1     welcmoe  ERR  damerau/nowordexist-tagchange

  17. Missplel

      <input>
        <filename>TnT.wt</filename>
        <expression>([^\t]+)\t([^\t]+)([^\r\n]*).*</expression>
      </input>
      <output>
        <filename>output.wte</filename>
        <!-- %1% Word, %2% Tag, %3% Lemma, %4% Rest of line, %5% Error descr -->
        <format>%1% %2% %5%</format>
        <description>
          <noError>ok</noError>
          <existingWord>exist</existingWord>
          <nonExistingWord>noexist</nonExistingWord>
          <wordChange>-wordch</wordChange>
          <noWordChange>-nowordch</noWordChange>
          <tagChange>-tagch</tagChange>
          <noTagChange>-notagch</noTagChange>
        </description>
      </output>
      ...
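The input expression and output format can be read as a parse-then-template step: the expression splits each input line into fields, and the format string selects which fields to write. The Python sketch below illustrates that reading. The function names are hypothetical, and the way `describe` joins the report name with the description fragments is an assumption modelled on the labels shown on slide 16 (the `wordChange` fragments are left out because their position in the label is not shown).

```python
import re

EXPRESSION = re.compile(r"([^\t]+)\t([^\t]+)([^\r\n]*).*")
FORMAT = "%1% %2% %5%"
DESCRIPTION = {
    "noError": "ok",
    "existingWord": "exist",
    "nonExistingWord": "noexist",
    "tagChange": "-tagch",
    "noTagChange": "-notagch",
}


def render(fmt, fields):
    """Replace each %n% placeholder with fields[n], or "" if it is missing."""
    return re.sub(r"%(\d+)%", lambda ref: fields.get(int(ref.group(1)), ""), fmt)


def describe(report_name, word_exists, tag_changed):
    """Build an error label such as "damerau/noexist-tagch" (assumed layout)."""
    existence = DESCRIPTION["existingWord" if word_exists else "nonExistingWord"]
    tag = DESCRIPTION["tagChange" if tag_changed else "noTagChange"]
    return f"{report_name}/{existence}{tag}"


def process_line(line, error_description):
    """Parse one input line and format it with the given error description."""
    m = EXPRESSION.match(line.rstrip("\r\n"))
    if m is None:
        return None
    fields = {1: m.group(1), 2: m.group(2), 5: error_description}
    return render(FORMAT, fields)


print(process_line("would\tVM0\n", DESCRIPTION["noError"]))
print(process_line("welcmoe\tERR\n", describe("damerau", False, True)))
```

Because both the expression and the format live in the configuration, the same tool can read and write different column layouts without code changes, which is what makes it independent of a particular tagger's file format.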

  18. Missplel

      ...
      <options>
        <unknownTag>unknown</unknownTag>
        <unknownLemma>unknownLemma</unknownLemma>
        <escapeChar>@</escapeChar>
        <spaceChar> </spaceChar>
        <wordChar>'</wordChar>
        <sentenceSeparatorTag>mad</sentenceSeparatorTag>
        <maxErrorsInSentence>30</maxErrorsInSentence>
        <configDir>felstava/conf/</configDir>
      </options>
      <wordlist>
        <create>
          <filename>Swedish.cwtl</filename>
          <expression>.+\t([^\t]+)\t([^\t]+)\t+([^\t]+)</expression>
        </create>
        <wordfile>outfile.gz</wordfile>
        <tagfile>tagfile</tagfile>
      </wordlist>
      ...

  19. Missplel

      ...
      <damerau>
        <reportName>damerau</reportName>
        <active>yes</active>
        <probability>10.0</probability>
        <confusionMatrix>confusionfile</confusionMatrix>
        <subst>1</subst>
        <ins>1</ins>
        <del>1</del>
        <transp>1</transp>
        <allowExistingWords>no</allowExistingWords>
        <forceAllowWords>no</forceAllowWords>
        <allowTagChange>yes</allowTagChange>
        <forceAllowTag>no</forceAllowTag>
      </damerau>
      ...
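The slides do not spell out the semantics of these settings. A natural interpretation, assumed here, is that `probability` is the chance in percent that a word receives an error, and that `subst`, `ins`, `del` and `transp` are relative weights for choosing the operation. The Python sketch below illustrates that interpretation only; the function name `pick_operation` is hypothetical.

```python
import random

# Values copied from the <damerau> block; their meaning is an assumption.
CONFIG = {"probability": 10.0, "subst": 1, "ins": 1, "del": 1, "transp": 1}
OPERATIONS = ["subst", "ins", "del", "transp"]


def pick_operation(config, rng):
    """Return an operation name, or None when no error should be introduced."""
    # Treat "probability" as a percentage chance per word.
    if rng.random() * 100.0 >= config["probability"]:
        return None
    # Treat the four settings as relative weights.
    weights = [config[op] for op in OPERATIONS]
    return rng.choices(OPERATIONS, weights=weights, k=1)[0]


rng = random.Random(2003)
picks = [pick_operation(CONFIG, rng) for _ in range(1000)]
print(sum(p is not None for p in picks), "of 1000 words would receive an error")
```

With equal weights every operation is equally likely; setting a weight to 0 would switch that operation off under this reading.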

  20. Missplel

      ...
      <splitCompound>
        <reportName>split</reportName>
        <active>no</active>
        <probability>99.0</probability>
        <splitUnknownWords>yes</splitUnknownWords>
        <splitThreshold>50</splitThreshold>
        <minWordLength>6</minWordLength>
        <minSplitWordLength>3</minSplitWordLength>
        <factors>
          <wordLength>1</wordLength>
          <inDictionaryFirst>10</inDictionaryFirst>
          <inDictionarySecond>10</inDictionarySecond>
          <tagAllowed>10</tagAllowed>
          <tagMatchFirst>0</tagMatchFirst>
          <tagMatchSecond>15</tagMatchSecond>
        </factors>
      </splitCompound>
      ...

  21. Missplel

      ...
      <soundError>
        <reportName>sound</reportName>
        <active>no</active>
        <filename>sound.test</filename>
        <probability>100.0</probability>
        <expression>(.+)\t(.+)\t(.+)</expression>
        <allowExistingWords>yes</allowExistingWords>
        <forceAllowWords>no</forceAllowWords>
        <allowTagChange>yes</allowTagChange>
        <forceAllowTag>no</forceAllowTag>
      </soundError>
      ...

  22. Missplel

      ...
      <syntaxError>
        <reportName>introduced</reportName>
        <active>no</active>
        <filename>error.rules</filename>
        <probability>100.0</probability>
        <allowExistingWords>yes</allowExistingWords>
        <forceAllowWords>no</forceAllowWords>
        <allowTagChange>yes</allowTagChange>
        <forceAllowTag>no</forceAllowTag>
      </syntaxError>

  23. Applications • AutoEval has been used to evaluate: parsers, PoS taggers, PoS majority/ensemble tagging • Missplel has been used to evaluate: spell checkers, grammar checkers, robustness of parsers and taggers

  24. Licence • AutoEval and Missplel are open source under the GNU General Public License • Source code available at www.nada.kth.se/theory/humanlang/tools.html
