
Using Machine Learning to Annotate Data for NLP Tasks Semi-Automatically


Presentation Transcript


  1. Using Machine Learning to Annotate Data for NLP Tasks Semi-Automatically

  2. Overview • Introduction • End-User Requirements • Solution: Design & Implementation • Evaluation • Conclusion

  3. Human Language Technologies • HLTs depend on the availability of linguistic data • Specialized lexicons • Annotated and raw corpora • Formalized grammar rules • Creation of such resources is expensive and protracted • Especially for less-resourced languages

  4. Less-resourced Languages • "languages for which few digital resources exist; and thus, languages whose computerization poses unique challenges. [They] are languages with limited financial, political, and legal resources…" (Garrett, 2006) • Implicit in this definition: • They lack human resources (little attention in research or discussions) • They lack computational linguists working on these languages • Research question: • How could one facilitate the development of linguistic data by enabling non-experts to collaborate in the computerization of less-resourced languages?

  5. Methodology I • Empowering linguists and mother-tongue speakers to deliver annotated data • Of high quality • In the shortest possible time • Accelerating the annotation of linguistic data by mother-tongue speakers • User-friendly environments • Bootstrapping • Machine learning instead of rule-based techniques

  6. Methodology II • The general idea: • Development of gold standards • Development of annotated data • Bootstrapping • With the click of a button: • Annotate data • Train the machine-learning algorithm

  7. Central Point of Departure I • Annotators are invaluable resources • Based on experience with less-resourced languages: • Annotators mostly have word-processing skills • They are used to a GUI-based environment • They usually have limited skills in a computational or programming environment • In the worst cases annotators have difficulties with • File management • Unzipping • Proper encoding of text files

  8. Central Point of Departure II • Aim of this project: enabling annotators to focus on what they are good at: enriching data with expert linguistic knowledge • Training the machine learner occurs automatically

  9. End-user Requirements I • Unstructured interviews with four annotators • What do you find unpleasant about your work as an annotator? • What will make your life as an annotator easier?

  10. End-user Requirements II • What do you find unpleasant about your work as an annotator? • Repetitiveness • Lack of concentration/motivation • Feeling "useless" • Not seeing results

  11. End-user Requirements III • What will make your life as an annotator easier? • A friendly environment (i.e. GUI-based, not lists of words) • Bite-sized chunks of data rather than endless lists • Correcting data rather than annotating from scratch • The program should already suggest a possible annotation • Click or drag • Reference works need to be available • Automatic data management

  12. Solution: TurboAnnotate • User-friendly annotating environment • Bootstrapping with machine learning • Creating gold standards/annotated lists • Inspired by DictionaryMaker (Davel and Peche, 2006) and Alchemist (University of Chicago, 2004)

  13. DictionaryMaker (screenshot)

  14. Alchemist (screenshot)

  15. Simplified Workflow of TurboAnnotate (workflow diagram, steps 1–3)

  16. Step 1: Create Gold Standard • Create a gold standard • Independent test set for evaluating performance • 1,000 random instances used • The annotator only has to select one data file
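A minimal sketch of how such a held-out gold standard could be drawn from the base list; the 1,000-word sample size follows the slide, while the function name, the plain word-list representation and the fixed random seed are illustrative assumptions rather than TurboAnnotate's actual code.

```python
import random

def split_gold_standard(words, sample_size=1000, seed=42):
    """Randomly hold out `sample_size` words as an independent gold-standard
    test set and return (gold_words, remaining_words).
    `words` is assumed to be a plain list of unannotated words."""
    rng = random.Random(seed)
    shuffled = list(words)   # copy so the caller's list stays untouched
    rng.shuffle(shuffled)
    return shuffled[:sample_size], shuffled[sample_size:]
```

The gold words would then be annotated by hand and kept out of all training data, as slide 29 also notes.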

  17. Simplified Workflow of TurboAnnotate (workflow diagram, steps 1–3)

  18. Step 2: Verify Annotations • New data sourced from the base list • Automatically annotated by the classifier • Presented to the annotator in the "Annotate" tab
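A sketch of what this pre-annotation step could look like; the chunking of the base list and the `classifier` callable interface are assumptions made for illustration (the 200-word chunk size comes from slide 21).

```python
def suggest_annotations(base_list, classifier, chunk_size=200):
    """Take the next chunk of unannotated words from the base list and let the
    current classifier propose an annotation for each, ready for verification
    in the "Annotate" tab. `classifier` is a hypothetical callable word -> annotation."""
    chunk, remainder = base_list[:chunk_size], base_list[chunk_size:]
    suggestions = [(word, classifier(word)) for word in chunk]
    return suggestions, remainder
```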

  19. TurboAnnotate: Annotation Environment (screenshot)

  20. Simplified Workflow of TurboAnnotate (workflow diagram, steps 1–3)

  21. Step 3: Verify Annotated Set • Bootstrapping, inspired by DictionaryMaker • 200 words per chunk, trained in the background • The annotator verifies each suggestion • Click "accept" or correct the instance • Verified data serve as training data • Iterative process until the desired results are reached
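Putting the three steps together, the bootstrapping cycle could be sketched roughly as below. The 200-word chunks and the accept-or-correct verification follow the slides; the callables `train`, `classify` and `verify`, and the stopping criterion, are hypothetical stand-ins for TurboAnnotate's internals.

```python
def bootstrap(base_list, gold_standard, train, classify, verify,
              chunk_size=200, target_accuracy=0.98):
    """Iteratively grow the training data: pre-annotate a chunk, let the human
    verify it, retrain, and stop when the classifier is good enough.
    train(examples) -> model, classify(model, word) -> annotation,
    verify(word, suggestion) -> human-approved annotation (all hypothetical)."""
    training_data, model = [], None
    while base_list:
        chunk, base_list = base_list[:chunk_size], base_list[chunk_size:]
        for word in chunk:
            suggestion = classify(model, word) if model else word
            training_data.append(verify(word, suggestion))   # accept or correct
        model = train(training_data)                         # retrain in the background
        correct = sum(classify(model, w) == gold for w, gold in gold_standard)
        if correct / len(gold_standard) >= target_accuracy:  # evaluated on the gold standard
            break
    return model, training_data
```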

  22. The Machine Learning System I • Tilburg Memory-Based Learner (TiMBL) • Wide success and applicability in the field of natural language processing • Available for research purposes • Relatively easy to use • On the downside: • Performs best with large quantities of data • For the tasks of hyphenation and compound analysis, however, TiMBL performs well with small quantities of data
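TiMBL is normally driven from the command line, so a wrapper such as TurboAnnotate can call it non-interactively. The sketch below uses only the -f (training file) and -t (test file) options documented in the TiMBL manual; the binary name, file names and error handling are assumptions about a typical installation.

```python
import subprocess

def run_timbl(train_file, test_file, timbl_binary="Timbl"):
    """Train a memory-based classifier on `train_file` and classify `test_file`.
    TiMBL writes its predictions to an output file alongside the test file; this
    wrapper only checks that the call succeeded and returns TiMBL's log output."""
    result = subprocess.run(
        [timbl_binary, "-f", train_file, "-t", test_file],
        capture_output=True, text=True
    )
    result.check_returncode()   # raise CalledProcessError if TiMBL failed
    return result.stdout + result.stderr
```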

  23. The Machine Learning System II • Default parameter settings used • Task-specific feature selection • Performance is evaluated against the gold standard • For hyphenation and compound analysis, accuracy is determined on word level and not per instance
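Word-level accuracy, as opposed to instance-level accuracy, means a word only counts as correct when every one of its letter-level instances is classified correctly. A minimal sketch, where the grouping of instances per word is an assumed data layout:

```python
def word_level_accuracy(words):
    """`words` is a list with one entry per word; each entry is a list of
    (predicted_class, gold_class) pairs, one pair per instance of that word.
    A word is correct only if all of its instances are predicted correctly."""
    if not words:
        return 0.0
    correct = sum(1 for instances in words
                  if all(pred == gold for pred, gold in instances))
    return correct / len(words)
```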

  24. Features I • All input words are converted to feature vectors • Splitting window • Context of 3 positions (left and right) • Class • Hyphenation: a class indicating a break (plus a no-break class) • Compound analysis: 3 possible classes: "+" indicating a word boundary, "_" indicating a valence morpheme, "=" indicating no break

  25. Features II • Example: eksamenlokaal – 'examination room'
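A rough sketch of how a word could be turned into windowed instances as described on the previous slide: one instance per potential splitting point, with the three letters to the left and the three letters to the right of the split as features. The padding symbol, the exact window layout and the function interface are assumptions; only the 3+3 context follows the slides.

```python
def word_to_instances(word, classes=None, context=3, pad="_"):
    """One instance per potential splitting point: `context` letters on either
    side of the split (padded at the word edges) plus the class label.
    `classes` may supply one gold class per split point (e.g. "=" for no break)."""
    padded = pad * context + word + pad * context
    instances = []
    for i in range(1, len(word)):                      # split points between letters
        left = padded[i:i + context]                   # 3 letters left of the split
        right = padded[i + context:i + 2 * context]    # 3 letters right of the split
        label = classes[i - 1] if classes else "?"
        instances.append(list(left) + list(right) + [label])
    return instances

# The slide's example word: eksamenlokaal ('examination room')
for instance in word_to_instances("eksamenlokaal")[:3]:
    print(" ".join(instance))
```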

  26. Parameter Optimisation I • Large variations in accuracy occur when the parameter settings of MBL algorithms are changed • Finding the best combination of parameters: • Exhaustive searches are undesirable • Slow and computationally expensive

  27. Parameter Optimisation II • Alternative: Paramsearch (Van den Bosch, 2005) delivers combinations of algorithmic parameters that are estimated to perform well • PSearch: our own modification of Paramsearch • Only applied after all data has been annotated • Ensures the best possible classifier
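Paramsearch itself uses a more economical search than brute force; the sketch below only illustrates the basic idea of scoring a handful of candidate parameter settings and keeping the best one. The grid of -k (nearest neighbours) and -w (feature weighting) values is an arbitrary assumption, and `train_and_score` is a hypothetical callable that trains TiMBL with those settings and returns accuracy on the gold standard.

```python
import itertools

def simple_param_sweep(train_and_score):
    """Return the best-scoring (k, w) combination from a small candidate grid.
    This is not Paramsearch/PSearch, only an illustration of the idea of
    selecting parameters by their estimated performance."""
    grid = list(itertools.product([1, 3, 5, 11], [0, 1, 2]))   # candidate (-k, -w) values
    return max(grid, key=train_and_score)
```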

  28. Criteria • Two criteria: • Accuracy • Human effort (time) • Evaluated on the tasks of hyphenation and compound analysis for Afrikaans and Setswana • Four human annotators • Two well experienced in annotating • Two considered novices in the field

  29. Accuracy • Two kinds of accuracy: • Classifier accuracy • Human accuracy • Expressed as the percentage of correctly annotated words over the total number of words • The gold standard is excluded from the training data

  30. Classifier Accuracy (Hyphenation) (results shown on the slide)

  31. Human Accuracy • Two separate unseen datasets of 200 words for each language • The first dataset was annotated in an ordinary text editor • The second dataset was annotated with TurboAnnotate

  32. Human Accuracy (results shown on the slide)

  33. Human Effort I • Two questions: • Is it faster to annotate with TurboAnnotate? • What would the predicted saving in human effort be on a large dataset?

  34. Human Effort II (results shown on the slide)

  35. Human Effort III • 1 minute faster to annotate 200 words with TurboAnnotate • On a larger dataset (40,000 words) this is a difference of only circa 3.5 uninterrupted human hours • The picture changes when the effect of bootstrapping is considered • Extrapolating to 42,967 words: • Saving of 51 hours (68%) for hyphenation • Saving of 9 hours (41%) for compound analysis
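A quick back-of-the-envelope check of those savings, assuming the percentages are relative to fully manual annotation of the 42,967 words (an assumption; the slide does not state the baseline explicitly):

```python
# Derive estimated totals from the savings reported on the slide.
for task, saved_hours, saved_fraction in [("hyphenation", 51, 0.68),
                                          ("compound analysis", 9, 0.41)]:
    manual_hours = saved_hours / saved_fraction   # estimated hours without bootstrapping
    tool_hours = manual_hours - saved_hours       # estimated hours with TurboAnnotate
    print(f"{task}: ~{manual_hours:.0f} h manual vs ~{tool_hours:.0f} h with bootstrapping")
```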

  36. Conclusion • TurboAnnotate helps to increase the accuracy of human annotators • It saves human effort

  37. Future Work • Other lexical annotation tasks: • Creating lexicons for spelling checkers • Creating data for morphological analysis • Stemming • Lemmatization • Improve the GUI • Network solution • Active learning • Experiments with C5.0

  38. Obtaining TurboAnnotate • Requirements: • Linux • Perl 5.8 • Gtk+ 2.10 • TiMBL 5.1 • Open source • Available at http://www.nwu.ac.za/ctext

  39. Acknowledgements • This work was supported by a grant from the South African National Research Foundation (GUN: FA2004042900059). • We also acknowledge the inputs and contributions of: • Ansu Berg • Pieter Nortjé • Rigardt Pretorius • Martin Schlemmer • Wikus Slabbert
