
Presentation Transcript


  1. HPSG parser development at U-Tokyo. Takuya Matsuzaki, University of Tokyo

  2. Topics • Overview of U-Tokyo HPSG parsing system • Supertagging with Enju HPSG grammar

  3. Overview of U-Tokyo parsing system
  • Two different algorithms:
    • Enju parser: Supertagging + CKY algorithm for TFS (typed feature structures)
    • Mogura parser: Supertagging + CFG-filtering
  • Two disambiguation models:
    • one trained on PTB-WSJ
    • one trained on PTB-WSJ + Genia (biomedical)

  4. Supertagger-based parsing [Clark and Curran, 2004; Ninomiya et al., 2006]
  • Supertagging [Bangalore and Joshi, 1999]: selecting a few LEs (lexical entries) for each word by using a probabilistic model of P(LE | sentence)
  [Figure: candidate lexical entries for "I like it" — e.g. HEAD noun, SUBJ <>, COMPS <> vs. HEAD verb, SUBJ <NP>, COMPS <NP> — ranked from small to large P]

  5. Supertagger-based parsing [Clark and Curran, 2004; Ninomiya et al., 2006]
  • Ignore the LEs with small probabilities
  • The LEs with P > threshold become the input to the parser
  [Figure: the per-word candidate lists for "I like it", with entries below the probability threshold discarded]
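
The thresholding step is simple to picture in code. A minimal sketch in Python, assuming the supertagger returns one dictionary of P(LE | sentence) per token; the data structures and entry names are invented for the example, not Enju's actual API:

    # Keep, for each token, only the lexical entries whose probability
    # exceeds a threshold, so the parser sees a small multi-tagged input.
    # (Illustrative sketch; not Enju's actual data structures.)
    def prune_supertags(tagged_sentence, threshold=0.01):
        """tagged_sentence: one {lexical_entry: P(LE | sentence)} dict per token."""
        pruned = []
        for le_probs in tagged_sentence:
            kept = {le: p for le, p in le_probs.items() if p > threshold}
            if not kept:  # never leave a token without candidates
                best = max(le_probs, key=le_probs.get)
                kept = {best: le_probs[best]}
            pruned.append(kept)
        return pruned

    # Toy distributions for "I like it":
    sentence = [
        {"noun_SUBJ<>_COMPS<>": 0.90, "verb_SUBJ<NP>_COMPS<NP>": 0.05},
        {"verb_SUBJ<NP>_COMPS<NP>": 0.85, "noun_SUBJ<>_COMPS<>": 0.10},
        {"noun_SUBJ<>_COMPS<>": 0.95, "verb_SUBJ<NP>_COMPS<NP>": 0.02},
    ]
    print(prune_supertags(sentence, threshold=0.08))  # drops the low-P entries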

  6. Flow in Enju parser
  • POS tagging by a CRF-based model
  • Morphological analysis (inflected form → base form) using the WordNet dictionary
  • Multi-supertagging by a MaxEnt model
  • TFS CKY parsing + MaxEnt disambiguation on the multi-supertagged sentence
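
The morphological-analysis step maps an inflected form back to its base form via the WordNet dictionary. A rough illustration using NLTK's WordNet interface (NLTK is an assumption made for the example; Enju ships its own WordNet-derived dictionary):

    # WordNet-based lemmatization sketch (uses NLTK, an assumption here).
    import nltk
    nltk.download("wordnet", quiet=True)
    from nltk.corpus import wordnet as wn

    def base_form(word, ptb_pos):
        # Map a Penn Treebank tag to a WordNet POS, then look up the base form;
        # fall back to the surface form when WordNet has no entry.
        wn_pos = {"V": wn.VERB, "N": wn.NOUN, "J": wn.ADJ, "R": wn.ADV}.get(ptb_pos[:1])
        if wn_pos is None:
            return word
        return wn.morphy(word.lower(), wn_pos) or word

    print(base_form("likes", "VBZ"))     # -> like
    print(base_form("children", "NNS"))  # -> child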

  7. Flow in Mogura parser
  • POS tagging by a CRF-based model
  • Morphological analysis (inflected form → base form) using the WordNet dictionary
  • Supertagging by a MaxEnt model
  • Selection of a (probably) constraint-satisfying supertag assignment
  • TFS shift-reduce parsing on the singly-supertagged sentence

  8. Previous supertagger-based parsing [Clark and Curran, 2004; Ninomiya et al., 2006]
  • Ignore the LEs with small probabilities; the LEs with P > threshold are the input to the parser
  [Figure: the same thresholding diagram as slide 5]

  9. Supertagging is "almost parsing"
  [Figure: the chosen lexical entries for "I like it" already determine most of the derivation, e.g. HEAD verb, SUBJ <NP>, COMPS <NP> for "like" combining with the two HEAD noun entries]

  10. A dilemma in the previous method
  • Fewer LEs → faster parsing, but
  • Too few LEs → more risk of no well-formed parse trees
  [Figure: "I like it" where only HEAD verb, SUBJ <NP>, COMPS <VP> survives for "like", so no well-formed tree exists]

  11. Mogura: overview
  [Figure: input sentence "I like it" → Supertagger → enumeration of LE assignments in probability order → deterministic disambiguation]

  12. Enumeration of the maybe-parsable LE assignments
  [Figure: the supertagging result feeds an enumeration of the highest-probability LE sequences, each of which is checked by the CFG-filter; a sketch of such an enumeration follows]
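
A heap-based best-first search is one standard way to enumerate LE sequences in decreasing probability order. The sketch below is hypothetical (not Mogura's actual implementation); it yields assignments lazily, so enumeration can stop as soon as the CFG-filter accepts one:

    import heapq
    from math import exp, log

    def enumerate_assignments(le_probs_per_token):
        """Yield (prob, [le_1, ..., le_n]) in non-increasing probability order.
        le_probs_per_token: one {lexical_entry: prob} dict per token."""
        # Sort each token's candidates by descending probability.
        cands = [sorted(d.items(), key=lambda kv: -kv[1]) for d in le_probs_per_token]
        def cost(idx):  # total -log prob of the entries picked by the index vector
            return sum(-log(c[i][1]) for c, i in zip(cands, idx))
        start = (0,) * len(cands)
        heap, seen = [(cost(start), start)], {start}
        while heap:
            c, idx = heapq.heappop(heap)
            yield exp(-c), [cands[k][i][0] for k, i in enumerate(idx)]
            # Successors: bump one position to its next-best candidate.
            for k in range(len(idx)):
                if idx[k] + 1 < len(cands[k]):
                    nxt = idx[:k] + (idx[k] + 1,) + idx[k + 1:]
                    if nxt not in seen:
                        seen.add(nxt)
                        heapq.heappush(heap, (cost(nxt), nxt))

Each popped assignment would be handed to the CFG-filter; the first one that parses is passed on to deterministic disambiguation.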

  13. CFG-filter
  • Parsing with a CFG that approximates the HPSG [Kiefer and Krieger, 2000; Torisawa et al., 2000]
  • Approximation = elimination of some constraints in the grammar (long-distance dependencies, number, case, etc.)
  • Covering property: if an LE assignment is parsable by the HPSG, it is also parsable by the approximating CFG
  • CFG parsing is much faster than HPSG parsing
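
The filter itself is ordinary CFG recognition. A toy CKY recognizer, with a hand-written two-rule grammar standing in for the (much larger) CFG compiled from the HPSG:

    def cfg_parsable(tags, binary_rules, start="S"):
        """tags: one CFG category per token (the approximated supertags).
        binary_rules: {(left_child, right_child): set of parent categories}."""
        n = len(tags)
        # chart[i][j] = set of categories spanning tokens i..j (inclusive)
        chart = [[set() for _ in range(n)] for _ in range(n)]
        for i, t in enumerate(tags):
            chart[i][i].add(t)
        for span in range(2, n + 1):
            for i in range(n - span + 1):
                j = i + span - 1
                for k in range(i, j):  # try every split point
                    for b in chart[i][k]:
                        for c in chart[k + 1][j]:
                            chart[i][j] |= binary_rules.get((b, c), set())
        return start in chart[0][n - 1]

    # Toy approximation: NP VP -> S, V NP -> VP.
    rules = {("NP", "VP"): {"S"}, ("V", "NP"): {"VP"}}
    print(cfg_parsable(["NP", "V", "NP"], rules))   # True:  "I like it"
    print(cfg_parsable(["NP", "NP", "NP"], rules))  # False: no well-formed tree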

  14. Results on PTB-WSJ

  15. Supertagging with the Enju grammar
  • Input: POS-tagged sentence
  • Number of supertags (lexical templates): 2,308
  • Current implementation:
    • Classifier: MaxEnt, pointwise prediction (i.e., no dependencies among neighboring supertags)
    • Features: words and POS tags in a -2/+3 window (see the sketch below)
  • 92% token accuracy (1-best, only on covered tokens)
  • It's "almost parsing": 98-99% parsing accuracy (PAS F1) given correct lexical assignments
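
A sketch of what a -2/+3 window of word/POS features might look like; the feature-name scheme is invented for the example, and the actual Enju feature templates differ:

    def window_features(words, pos_tags, i):
        # Pointwise features for position i: words and POS tags at offsets -2..+3.
        feats, n = [], len(words)
        for d in range(-2, 4):
            j = i + d
            w = words[j] if 0 <= j < n else "<PAD>"
            p = pos_tags[j] if 0 <= j < n else "<PAD>"
            feats.append(f"w[{d}]={w}")
            feats.append(f"p[{d}]={p}")
        return feats

    print(window_features(["I", "like", "it"], ["PRP", "VBP", "PRP"], 1))

Each token's feature vector would feed a MaxEnt classifier that picks one of the 2,308 lexical templates independently of its neighbors.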

  16. Pointwise supertagging
  [Figure: three layers per position — Word w1..w8 (input), POS tag p1..p8, Lexical entry s1..s8 (output); each supertag s_i is predicted from the word/POS context around position i]

  17.-21. [Slides 17-21 repeat the same diagram]

  22. Supertagging: future directions
  • Basic strategy: do more work in supertagging (rather than in parsing)
  • Pros:
    • The model/algorithm is simpler → easier error analysis
    • Various features can be added without extending the parsing algorithm
    • Fast trial-and-error cycle for feature engineering
  • Cons:
    • No tree structure → feature design is sometimes tricky/ad hoc: e.g., "nearest preceding verb/noun" instead of "possible modifiee of a PP" (see the sketch below)
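
For instance, the "nearest preceding verb/noun" surrogate mentioned above might be computed as follows; this is a hypothetical illustration of the kind of ad hoc feature the slide describes, not the actual Enju template:

    def nearest_preceding(pos_tags, i, prefixes=("VB", "NN")):
        # Walk left from position i to the closest verb or noun; with no tree
        # available, this stands in for "possible modifiee of a PP".
        for j in range(i - 1, -1, -1):
            if pos_tags[j].startswith(prefixes):
                return j
        return None

    # "I saw a man with a telescope": the feature for "with" (index 4)
    pos_tags = ["PRP", "VBD", "DT", "NN", "IN", "DT", "NN"]
    print(nearest_preceding(pos_tags, 4))  # -> 3 ("man")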

  23. Supertagging: future directions
  • Recovery from POS-tagging errors in the supertagging stage
  • Incorporation of shallow-processing results (e.g., chunking, NER, coordination structure prediction) as new features
  • Comparison across other languages/grammar frameworks

  24. Thank you!

  25. Deterministic disambiguation
  • Implemented as a shift-reduce parser
  • Deterministic parsing: only one analysis at a time
  • The next parsing action is selected with a scoring function: next action = argmax over a in A of F(a, S, Q)
  • F: scoring function, trained with the averaged-perceptron algorithm [Collins and Duffy, 2002]
  • Features are extracted from the stack state S and the lookahead queue Q
  • A: the set of possible actions (a CFG forest is used as a `guide')
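
The control flow fits in a few lines. A minimal sketch, assuming a caller-supplied action proposer (playing the role of the CFG-forest guide) and scoring function (standing in for the trained averaged-perceptron weights); both toy callbacks below are scripted just to reproduce the trace on slides 26-31:

    def parse(queue, possible_actions, score):
        """score(action, stack, queue) -> float; in the real parser this would be
        a dot product of perceptron weights with features of (a, S, Q)."""
        stack = []
        while queue or len(stack) > 1:
            actions = possible_actions(stack, queue)
            best = max(actions, key=lambda a: score(a, stack, queue))
            if best == "SHIFT":
                stack.append(queue.pop(0))
            else:  # a REDUCE action: combine the top two stack elements
                right, left = stack.pop(), stack.pop()
                stack.append((best, left, right))
        return stack[0]

    # Toy run on "I like it" with scripted scores.
    def toy_actions(stack, queue):
        acts = ["SHIFT"] if queue else []
        if len(stack) >= 2:
            acts += ["REDUCE(Head_Comp)", "REDUCE(Subj_Head)"]
        return acts

    def toy_score(a, stack, queue):
        if a == "SHIFT":
            return 1.0                              # shift while input remains
        if a == "REDUCE(Head_Comp)":
            return 0.9 if len(stack) == 3 else 0.1  # attach the object first
        return 0.5                                  # then the subject

    print(parse(["I", "like", "it"], toy_actions, toy_score))
    # -> ('REDUCE(Subj_Head)', 'I', ('REDUCE(Head_Comp)', 'like', 'it'))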

  26. Example: initial state
  [Figure: the queue Q holds the lexical entries for "I" (HEAD noun), "like" (HEAD verb, SUBJ <NP>, COMPS <NP>) and "it" (HEAD noun); the stack S is empty]

  27. argmax F(a, S, Q) = SHIFT
  [Figure: "I" is moved from the queue Q onto the stack S]

  28. argmax F(a, S, Q) = SHIFT
  [Figure: "like" is moved from the queue onto the stack]

  29. argmax F(a, S, Q) = SHIFT
  [Figure: "it" is moved from the queue onto the stack; Q is now empty]

  30. argmax F(a, S, Q) = REDUCE(Head_Comp)
  [Figure: the Head-Comp schema combines "like" (HEAD verb, SUBJ <[1]>, COMPS <NP>) with "it" (HEAD noun), leaving HEAD verb, SUBJ <[1]NP>, COMPS <> on the stack]

  31. argmax F(a, S, Q) = REDUCE(Subj_Head)
  [Figure: the Subj-Head schema combines "I" (HEAD noun) with "like it" (HEAD verb, SUBJ <[1]NP>, COMPS <>), yielding the saturated sign HEAD verb, SUBJ <>, COMPS <>; parsing is complete]
