Some Observations on Hindi Dependency Parsing

Presentation Transcript


  1. Some Observations on Hindi Dependency Parsing Samar Husain Language Technologies Research Centre, IIIT-Hyderabad.

  2. Introduction • Parsing a free word order language with (relatively) rich morphology is a challenging task • Methods, problems, causes • Experiments with Hindi

  4. Hindi: Brief overview • malaya ne sameer ko kitaba dii. (Malay ERG Sameer DAT book gave) “Malay gave the book to Sameer” (S-IO-DO-V) • The same sentence is grammatical in the other orders as well: S-DO-IO-V, IO-S-DO-V, IO-DO-S-V, DO-S-IO-V, DO-IO-S-V

  5. Hindi: Brief overview • Inflections • Gender, number, person • Tense, aspect and modality • Agreement • Noun-adjective • Noun-verb

  6. Dependency Grammar • A formalism for linguistic analysis • Dependencies between words are central to the analysis • Different from phrase structure analysis • Example: Abhay ate a mango

  9. Dependency Tree • Root property • Spanning property • Connectedness property • Single-head property • Acyclicity property • Arc size property (Kübler et al., 2009)
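
These properties translate directly into checks on a head vector. A minimal sketch follows (the encoding and function name are illustrative, not from the slides), using the "Abhay ate a mango" example from the previous slides:

```python
def is_well_formed(heads):
    """heads[i] is the head of token i; token 0 is the artificial ROOT."""
    n = len(heads) - 1
    # Root property: exactly one token hangs off the artificial root 0.
    if sum(1 for i in range(1, n + 1) if heads[i] == 0) != 1:
        return False
    # Single-head property holds by construction: one heads[i] per token.
    # Spanning, connectedness, acyclicity: every token's head chain must
    # reach ROOT without revisiting any node.
    for i in range(1, n + 1):
        seen, j = set(), i
        while j != 0:
            if j in seen:        # cycle: the chain never reaches ROOT
                return False
            seen.add(j)
            j = heads[j]
    return True

#        ROOT   Abhay  ate  a  mango   ("ate" is the root of the tree)
heads = [None,  2,     0,   4, 2]
print(is_well_formed(heads))  # True
```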

  10. Dependency Parsing • M = (Γ, λ, h): a dependency parsing model M comprises a set of constraints Γ that define the space of permissible dependency structures, a set of parameters λ, and a parsing algorithm h. Γ maps an arbitrary sentence S and dependency type set R to a set of well-formed dependency trees Gs. • Γ = (Σ, R, C), where Σ is the set of terminal symbols (here, words), R is the label set, and C is the set of constraints. The constraints restrict dependencies between words and the possible heads of a word in well-defined ways. • G = h(Γ, λ, S): given the constraints Γ, the parameters λ, and a new sentence S, how does the system find the most appropriate dependency tree G for that sentence? (Kübler et al., 2009)
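
Read as a programming interface, the decomposition looks roughly like the sketch below. This is purely an illustrative reading of the definition; none of the parsers discussed later expose this API:

```python
# Illustrative reading of M = (Gamma, lambda, h); all names are invented.
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple

Tree = List[Tuple[int, int, str]]        # arcs as (head, dependent, label)
Constraint = Callable[[List[str], Tree], bool]

@dataclass
class Grammar:                           # Gamma = (Sigma, R, C)
    sigma: Set[str]                      # terminal symbols (words)
    labels: Set[str]                     # dependency type set R
    constraints: List[Constraint]        # C restricts word-word dependencies

def h(grammar: Grammar, params: Dict[str, float], sentence: List[str]) -> Tree:
    """G = h(Gamma, lambda, S): search the trees licensed by the constraints
    for the one scored highest by the parameters. Left abstract here; the
    following slides instantiate it as constraint-based or data-driven."""
    raise NotImplementedError
```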

  13. Constraint based vs. data driven • Constraint based: builds on the notion of eliminative parsing, where sentences are analyzed by successively eliminating representations that violate constraints until only valid representations remain • Data driven: the learning problem is the task of learning a parsing model from a representative sample of sentence structures (training data); the parsing problem (or inference/decoding problem) is the task of applying the learned model to the analysis of a new sentence
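
A minimal sketch of the eliminative idea: every word starts with every other position (and ROOT) as a candidate head, and constraints strike out violating candidates. The single constraint here is a toy stand-in for a real grammar:

```python
def eliminative_parse(words, constraints):
    # Candidate heads per word: ROOT (index 0) or any other word.
    candidates = {d: {h for h in range(len(words) + 1) if h != d}
                  for d in range(1, len(words) + 1)}
    changed = True
    while changed:
        changed = False
        for d, heads in candidates.items():
            bad = {h for h in heads
                   if not all(ok(words, h, d) for ok in constraints)}
            if bad and bad != heads:     # never empty a domain entirely
                heads -= bad
                changed = True
    return candidates                    # what survives is the analysis space

# Toy constraint (invented for the demo): only the verb may attach to ROOT.
tags = {1: "NN", 2: "VB", 3: "DT", 4: "NN"}
verb_root_only = lambda words, h, d: not (h == 0 and tags[d] != "VB")
print(eliminative_parse(["Abhay", "ate", "a", "mango"], [verb_root_only]))
```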

  15. Constraint based method • A Two-Stage Generalized Hybrid Constraint Based Parser (GH-CBP) • Incorporates some of the notions of CPG • Uses integer linear programming for constraint satisfaction • Also incorporates ideas from graph-based parsing and labeling for prioritization (Bharati et al., 2009a, 2009b; Husain, 2011)
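
To make the ILP angle concrete, here is a toy version of arc selection posed as an integer linear program, written with the PuLP library. The arc scores are made up, and only single-head and single-root constraints are encoded; GH-CBP's actual constraint set is much richer:

```python
# Toy ILP for dependency arc selection (not GH-CBP's formulation).
from pulp import LpProblem, LpMaximize, LpVariable, LpBinary, lpSum

words = ["malaya", "ne", "sameer", "ko", "kitaba", "dii"]
n = len(words)
# score[(h, d)]: benefit of making word h (0 = ROOT) the head of word d.
score = {(h, d): 1.0 for h in range(n + 1) for d in range(1, n + 1) if h != d}
score[(6, 1)] = 5.0   # dii -> malaya   (hypothetical preferences)
score[(6, 5)] = 5.0   # dii -> kitaba
score[(0, 6)] = 5.0   # ROOT -> dii

prob = LpProblem("dependency_parse", LpMaximize)
x = LpVariable.dicts("arc", list(score), cat=LpBinary)
prob += lpSum(score[a] * x[a] for a in score)                  # objective
for d in range(1, n + 1):                                      # single head
    prob += lpSum(x[(h, d)] for h in range(n + 1) if h != d) == 1
prob += lpSum(x[(0, d)] for d in range(1, n + 1)) == 1         # single root
# Note: acyclicity needs extra (e.g., flow) constraints, omitted here.
prob.solve()
print([(h, d) for (h, d) in score if x[(h, d)].value() == 1])
```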

  16. Quick Illustration

  17. Data driven approaches • Transition based systems • MaltParser • Graph based systems • MSTParser

  18. MaltParser • Malt is a classifier-based shift/reduce parser • It offers the arc-eager, arc-standard, Covington projective, and Covington non-projective parsing algorithms • History-based feature models are used for predicting the next parser action • Support vector machines are used for mapping histories to parser actions (Nivre et al., 2006)
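
A compact sketch of the arc-eager transition system, one of the four algorithms above. The trained SVM is replaced here by a static oracle that reads transitions off a gold tree, which is also how training instances are generated:

```python
def arc_eager(words, next_transition):
    stack, buffer, arcs = [0], list(range(1, len(words) + 1)), {}
    while buffer:
        t = next_transition(stack, buffer, arcs)
        if t == "LEFT-ARC":                  # top of stack <- front of buffer
            arcs[stack.pop()] = buffer[0]
        elif t == "RIGHT-ARC":               # top of stack -> front of buffer
            arcs[buffer[0]] = stack[-1]
            stack.append(buffer.pop(0))
        elif t == "REDUCE":                  # pop a token that has its head
            stack.pop()
        else:                                # SHIFT
            stack.append(buffer.pop(0))
    return arcs                              # maps dependent -> head

def static_oracle(gold):                     # gold: dependent -> head
    # At parse time, Malt's classifier takes this function's place.
    def next_transition(stack, buffer, arcs):
        s, b = stack[-1], buffer[0]
        if s != 0 and gold[s] == b:
            return "LEFT-ARC"
        if gold[b] == s:
            return "RIGHT-ARC"
        if s in arcs and all(gold[w] != s for w in buffer):
            return "REDUCE"
        return "SHIFT"
    return next_transition

gold = {1: 2, 2: 0, 3: 4, 4: 2}              # Abhay ate a mango ("ate" = root)
print(arc_eager(["Abhay", "ate", "a", "mango"], static_oracle(gold)))
# -> {1: 2, 2: 0, 3: 4, 4: 2}
```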

  19. Quick Illustration

  20. MSTParser • MST uses the Chu-Liu-Edmonds maximum spanning tree algorithm for non-projective parsing and Eisner's algorithm for projective parsing • It uses online large-margin learning as the learning algorithm (McDonald et al., 2005a, 2005b)
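
Graph-based parsing in miniature: score every candidate arc, then extract the maximum spanning arborescence. The sketch below uses networkx's implementation of Chu-Liu-Edmonds in place of MSTParser's own decoder, and the arc scores are made up (MSTParser learns them with online large-margin training):

```python
import networkx as nx

words = ["Abhay", "ate", "a", "mango"]
G = nx.DiGraph()
for h in range(len(words) + 1):              # node 0 is the artificial ROOT
    for d in range(1, len(words) + 1):
        if h != d:
            G.add_edge(h, d, weight=0.1)     # low default score for any arc
for h, d, w in [(0, 2, 5.0), (2, 1, 4.0), (2, 4, 4.0), (4, 3, 3.0)]:
    G[h][d]["weight"] = w                    # hypothetical high-scoring arcs

tree = nx.maximum_spanning_arborescence(G)   # Chu-Liu-Edmonds
print(sorted(tree.edges()))                  # [(0, 2), (2, 1), (2, 4), (4, 3)]
```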

  21. Quick Illustration

  22. Hybrid • Constraint parser + MSTParser (Husain et al., 2011b)

  23. Quick Illustration

  26. Use of modularity in parsing

  27. Modularity • Chunk • Local word groups • Local dependencies

  28. Modularity • Clause • Intra-clausal • Inter-clausal

  29. Chunk based parsing (I) • Chunk as hard constraint • Intra-chunk and inter-chunk dependencies identified separately • But uses intra-chunk features • Identifying intra-chunk relations is easy (Ambati et al., 2010b)

  30. Chunk based parsing (II) • Chunk as soft constraint • Intra-chunk and inter-chunk dependencies identified together • Use local morphosyntactic features
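
A sketch of the hard-constraint setup from slide 29: inter-chunk dependencies are predicted over chunk heads only, intra-chunk words are attached locally, and the two analyses are merged. The chunking of the running Hindi example and both stub "parsers" are invented for illustration:

```python
def parse_with_chunks(chunks, inter_parser, intra_parser):
    heads = {}
    chunk_heads = [c["head"] for c in chunks]
    heads.update(inter_parser(chunk_heads))   # stage 1: between chunk heads
    for c in chunks:                          # stage 2: inside each chunk
        heads.update(intra_parser(c))
    return heads                              # maps dependent -> head (0 = ROOT)

# "malaya ne sameer ko kitaba dii": three nominal chunks + a verb chunk.
chunks = [
    {"tokens": [1, 2], "head": 1},   # malaya ne
    {"tokens": [3, 4], "head": 3},   # sameer ko
    {"tokens": [5],    "head": 5},   # kitaba
    {"tokens": [6],    "head": 6},   # dii
]
# Stubs standing in for trained models: all chunk heads attach to the verb,
# and postpositions attach to their chunk head.
inter = lambda hs: {h: (0 if h == 6 else 6) for h in hs}
intra = lambda c: {t: c["head"] for t in c["tokens"] if t != c["head"]}
print(parse_with_chunks(chunks, inter, intra))
# {1: 6, 3: 6, 5: 6, 6: 0, 2: 1, 4: 3}
```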

  31. Clause based parsing (I) (Husain et al., 2009)

  32. Clause based parsing (II) (Husain et al., 2011a)

  33. MaltParser Configuration

  34. Clause based parsing (III) • Similar to parser stacking: ‘guide’ Malt with a 1st-stage parse produced by Malt itself • The additional features added to the 2nd-stage parser during 2-Soft parsing encode the 1st-stage parser’s decisions about the potential arcs and labels considered by the 2nd-stage parser, in particular arcs involving the word currently on top of the stack and the word currently at the head of the input buffer
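
A sketch of the stacking idea: copy the 1st-stage parser's head and label predictions into the FEATS column of the CoNLL-X input so the 2nd-stage parser can condition on them. The column layout is standard CoNLL-X, but the feature names (phead, pdeprel) and helper are illustrative, not Malt's own:

```python
def add_guide_features(rows, first_stage):
    """rows: CoNLL-X lines split into columns; first_stage: id -> (head, deprel)."""
    guided = []
    for cols in rows:
        head, deprel = first_stage[int(cols[0])]
        feats = [] if cols[5] == "_" else cols[5].split("|")
        feats += ["phead=%d" % head, "pdeprel=%s" % deprel]
        guided.append(cols[:5] + ["|".join(feats)] + cols[6:])
    return guided

rows = [["1", "malaya", "malaya", "NN", "NN", "_", "0", "_", "_", "_"],
        ["2", "dii", "dii", "VM", "VM", "_", "0", "_", "_", "_"]]
first = {1: (2, "k1"), 2: (0, "main")}       # hypothetical 1st-stage parse
for cols in add_guide_features(rows, first):
    print("\t".join(cols))
```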

  35. Experimental setup • Parsers: GH-CBP (version 1.6), MaltParser (version 1.3.1), MSTParser (version 0.4b) • Data: ICON10 tools contest • The training set had 3000 sentences, the development set had 500 sentences, and the test set had 300 sentences

  36. Evaluation metric and accuracies • CoNLL dependency parsing shared task 2008 (Nivre et al., 2008) • UAS: unlabeled attachment accuracy • LAS: labeled attachment accuracy • LA: label accuracy • Performance: Constraint based (coarse-grained tagset; oracle): UAS = 88.50, LAS = 79.12; Statistical (fine-grained): UAS = ~91, LAS = ~76
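
The three metrics have direct one-line implementations; a minimal sketch, computing them from gold and predicted (head, label) pairs (the example trees are invented):

```python
def attachment_scores(gold, pred):
    """gold, pred: lists of (head, label) pairs, one per token."""
    n = len(gold)
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred)) / n  # correct heads
    las = sum(g == p for g, p in zip(gold, pred)) / n        # correct head+label
    la  = sum(g[1] == p[1] for g, p in zip(gold, pred)) / n  # correct labels
    return uas, las, la

gold = [(2, "k1"), (0, "main"), (4, "nmod"), (2, "k2")]
pred = [(2, "k1"), (0, "main"), (2, "nmod"), (2, "k4")]
print(attachment_scores(gold, pred))  # (0.75, 0.5, 0.75)
```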

  37. Remarks: Malt • Crucial features: deprel of the partially built tree; conjoined features • Good for short-distance dependencies • The non-projective algorithm doesn’t help • Arc-eager, LIBSVM (Bharati et al., 2008; Ambati et al., 2010a)

  38. Remarks: MSTParser • Crucial features: conjoined features • Modified MST: difficult to incorporate complex features for labeled parsing, so we use MaxEnt as a labeler • Good for long-distance dependencies and for identifying the root • Non-projective performs better • Training: k=5, order=2 (Bharati et al., 2008; Ambati et al., 2010a)

  39. What helps • Morphological features • Local morphosyntactic features • Clausal features • Minimal semantics (Bharati et al., 2008; Ambati et al., 2009; Ambati et al., 2010a, 2010b; Gadde et al., 2010)

  44. Relative comparison • The relative importance of these features over the baseline LAS of MSTParser

  46. What doesn’t help • Gender, number, person

  47. Parsing MOR-FWO languages • Problems in parsing morphologically rich, free word order (MOR-FWO) languages: • The non-configurational nature of these languages • Inherent limitations in the parsing/learning algorithms • Limited amounts of annotated data

  48. Common errors • Simple sentences: the correct identification of the argument structure (labels)

  49. Common errors • Reasons for label errors: • Word order is not strict • Absence of postpositions • Ambiguous postpositions • Ambiguous TAMs • Inability of the parser to exploit agreement features • Inability to always make simple linguistic generalizations

  50. Embedded clauses • Relative clauses • Participles
