Parsing to Meaning

Presentation Transcript


  1. Parsing to Meaning Eugene Charniak, Department of Computer Science, Brown Laboratory for Linguistic Information Processing (BLLIP)

  2. Parsing to Meaning: Words → Meaning • I Parsing • II Predicate-Argument Structure • III Noun-Phrase Reference • Lexical Semantic Resources • IV Conclusion

  3. Parsing • The parser maps the sentence “Alice ate yellow squash.” to a parse tree: • (S (NP (N Alice)) (VP (V ate) (NP (Adj yellow) (N squash))))
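
As an illustration, here is a minimal sketch of recovering this parse with NLTK's chart parser; the toy grammar is an assumption written to match the tree above, not the grammar used in the talk.

```python
# Minimal sketch: parse "Alice ate yellow squash" with a toy CFG in NLTK.
# The grammar below is an assumption chosen to reproduce the tree on the slide.
from nltk import CFG, ChartParser

grammar = CFG.fromstring("""
S   -> NP VP
NP  -> N | Adj N
VP  -> V NP
N   -> 'Alice' | 'squash'
Adj -> 'yellow'
V   -> 'ate'
""")

parser = ChartParser(grammar)
for tree in parser.parse("Alice ate yellow squash".split()):
    print(tree)  # (S (NP (N Alice)) (VP (V ate) (NP (Adj yellow) (N squash))))
```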

  4. The Importance of Parsing • In the hotel fake property was sold to tourists. What does “fake” modify? What does “In the hotel” modify?

  5. Ambiguity • “Salesmen sold the dog biscuits” has two parses: • (S (NP (N Salesmen)) (VP (V sold) (NP (Det the) (N dog) (N biscuits)))), i.e., they sold [the dog biscuits] • (S (NP (N Salesmen)) (VP (V sold) (NP (Det the) (N dog)) (NP (N biscuits)))), i.e., they sold [the dog] [biscuits]

  6. Probabilistic Context-free Grammars (PCFGs) • S → NP VP 1.0 • VP → V NP 0.5 • VP → V NP NP 0.5 • NP → Det N 0.5 • NP → Det N N 0.5 • N → salespeople 0.3 • N → dog 0.4 • N → biscuits 0.3 • V → sold 1.0 • Example tree: (S (NP (N Salesmen)) (VP (V sold) (NP (Det the) (N dog) (N biscuits))))
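
To make the representation concrete, here is a minimal sketch of this PCFG stored as a rule-to-probability table, with a check that each left-hand side's rule probabilities sum to one, which is what makes the grammar a probability distribution over rewrites.

```python
from collections import defaultdict

# The grammar of slide 6, keyed by (lhs, rhs) with its probability.
PCFG = {
    ('S',  ('NP', 'VP')):      1.0,
    ('VP', ('V', 'NP')):       0.5,
    ('VP', ('V', 'NP', 'NP')): 0.5,
    ('NP', ('Det', 'N')):      0.5,
    ('NP', ('Det', 'N', 'N')): 0.5,
    ('N',  ('salespeople',)):  0.3,
    ('N',  ('dog',)):          0.4,
    ('N',  ('biscuits',)):     0.3,
    ('V',  ('sold',)):         1.0,
}

# Sanity check: the rules for each left-hand side form a distribution.
totals = defaultdict(float)
for (lhs, _), p in PCFG.items():
    totals[lhs] += p
assert all(abs(t - 1.0) < 1e-9 for t in totals.values())
```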

  7. The Basic Paradigm • A Learner induces a Parser from the training portion of the tree-bank; a Tester then evaluates the parser's output against the testing portion.

  8. “Learning” a PCFG from a Tree-Bank • Tree-bank tree: (S (NP (N Salespeople)) (VP (V sold) (NP (Det the) (N dog) (N biscuits))) (. .)) • Rules read off this tree include S → NP VP . and VP → V NP
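
A minimal sketch of the relative-frequency (maximum-likelihood) estimate implied here: count each rule in the tree-bank and divide by the count of its left-hand side. The nested-list tree encoding is an assumption made for illustration.

```python
from collections import Counter

def rules(tree):
    """Yield (lhs, rhs) for every internal node of a tree given as nested
    lists, e.g. ['S', ['NP', ['N', 'Salespeople']], ...]."""
    label, children = tree[0], tree[1:]
    if all(isinstance(c, list) for c in children):
        yield label, tuple(c[0] for c in children)
        for c in children:
            yield from rules(c)
    # else: a preterminal over a word; lexical rules could be counted the same way

def estimate_pcfg(treebank):
    rule_counts, lhs_counts = Counter(), Counter()
    for tree in treebank:
        for lhs, rhs in rules(tree):
            rule_counts[(lhs, rhs)] += 1
            lhs_counts[lhs] += 1
    # relative-frequency (maximum-likelihood) estimate of p(lhs -> rhs)
    return {r: c / lhs_counts[r[0]] for r, c in rule_counts.items()}

tree = ['S', ['NP', ['N', 'Salespeople']],
             ['VP', ['V', 'sold'],
                    ['NP', ['Det', 'the'], ['N', 'dog'], ['N', 'biscuits']]],
             ['.', '.']]
print(estimate_pcfg([tree]))
```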

  9. Producing a Single “Best” Parse • The parser finds the most probable parse tree t given the sentence s: argmax_t p(t | s) = argmax_t p(t) • For a PCFG we have p(t) = ∏_c p(rule(c)), where c varies over the constituents in the tree t
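
A minimal sketch of this argmax, scoring the two parses of “Salesmen sold the dog biscuits” from slide 5 with the syntactic-rule probabilities of slide 6; the NP → N rule and the omission of lexical rules are simplifying assumptions.

```python
import math

# Syntactic-rule probabilities transcribed from slide 6, plus one assumed
# rule (NP -> N) needed for the bare subject "Salesmen".
P = {
    ('S',  ('NP', 'VP')):      1.0,
    ('VP', ('V', 'NP')):       0.5,
    ('VP', ('V', 'NP', 'NP')): 0.5,
    ('NP', ('Det', 'N')):      0.5,
    ('NP', ('Det', 'N', 'N')): 0.5,
    ('NP', ('N',)):            0.5,   # assumption, not on the slide
}

def tree_prob(rules_used):
    # p(t) is the product of the probabilities of the rules used in t
    return math.prod(P[r] for r in rules_used)

# The two parses of "Salesmen sold the dog biscuits" (slide 5), each listed
# as the syntactic rules it uses (lexical rules omitted for brevity).
reading_1 = [('S', ('NP', 'VP')), ('NP', ('N',)),
             ('VP', ('V', 'NP')), ('NP', ('Det', 'N', 'N'))]
reading_2 = [('S', ('NP', 'VP')), ('NP', ('N',)),
             ('VP', ('V', 'NP', 'NP')), ('NP', ('Det', 'N')), ('NP', ('N',))]

print(tree_prob(reading_1), tree_prob(reading_2))     # 0.125 vs 0.0625
best = max((reading_1, reading_2), key=tree_prob)     # argmax_t p(t)
```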

  10. Evaluating Parsing Accuracy • Few sentences are assigned a completely correct parse by any currently existing parser. Rather, evaluation is done in terms of the percentage of correct constituents. • A constituent is a triple [ label, start, finish ]; all three parts must match the true parse for the constituent to count as correct.

  11. Evaluating Constituent Accuracy • Let C be the number of correct constituents produced by the parser over the test set, M be the total number of constituents produced, and N be the total number of constituents in the correct (tree-bank) parses • Precision = C/M • Recall = C/N • It is possible to artificially inflate either one by itself. I will typically give the average of the two.
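
A minimal sketch of these measures computed over (label, start, finish) triples; the toy parser output and gold sets below are made up for illustration.

```python
def constituent_prf(parsed, gold):
    """Precision, recall, and their average over (label, start, finish) triples.
    `parsed` and `gold` are lists of constituent sets, one set per test sentence."""
    correct  = sum(len(p & g) for p, g in zip(parsed, gold))   # C
    produced = sum(len(p) for p in parsed)                     # M
    in_gold  = sum(len(g) for g in gold)                       # N
    precision, recall = correct / produced, correct / in_gold
    return precision, recall, (precision + recall) / 2

# Toy example: one sentence, the parser gets 2 of 3 gold constituents right.
parsed = [{('S', 0, 5), ('NP', 0, 1), ('VP', 1, 4)}]
gold   = [{('S', 0, 5), ('NP', 0, 1), ('VP', 1, 5)}]
print(constituent_prf(parsed, gold))
```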

  12. Parsing Results • Method Prec/Rec • PCFG 75% • PCFG + simple tuning 78%

  13. Lexicalized Parsing • To do better, it is necessary to condition probabilities on the actual words of the sentence. This makes the probabilities much tighter: • p(VP → V NP NP) = 0.00151 • p(VP → V NP NP | said) = 0.00001 • p(VP → V NP NP | gave) = 0.01980

  14. Lexicalized Probabilities for Heads • p(prices | n-plural) = .013 • p(prices | n-plural, NP) = .013 • p(prices | n-plural, NP, S) = .025 • p(prices | n-plural, NP, S, v-past) = .052 • p(prices | n-plural, NP, S, v-past, fell) = .146
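
In practice such increasingly specific estimates are smoothed together rather than used raw. The sketch below shows simple linear interpolation with made-up weights, which is one standard option, not necessarily the smoothing scheme the parser actually uses.

```python
def interpolated_head_prob(estimates, weights):
    """Smooth a chain of increasingly specific conditional estimates,
    e.g. p(w | tag), p(w | tag, cat), ..., by linear interpolation."""
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(w * p for w, p in zip(weights, estimates))

# Estimates for p(prices | ...) copied from slide 14, most general first;
# the interpolation weights are illustrative assumptions.
estimates = [0.013, 0.013, 0.025, 0.052, 0.146]
weights   = [0.05, 0.05, 0.10, 0.20, 0.60]
print(interpolated_head_prob(estimates, weights))
```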

  15. Parsing Results • Method Prec/Rec • PCFG 75% • PCFG + simple tuning 78% • PCFG with lexicalization 90%

  16. Papers on Parsing • Charniak, “Probabilistic context-free parsing with word statistics,” Proceedings of the AAAI, 1997 • Collins, “Three generative lexicalized statistical parsing models,” Proceedings of the ACL, 1997 • Charniak, “A Maximum-Entropy-Inspired Parser,” Proceedings of the NAACL, 2000

  17. Parsing to Meaning: Words → Meaning • I Parsing • II Predicate-Argument Structure • III Noun-Phrase Reference • IV Conclusion

  18. From Parse Trees To Predicates • (S (NP The dog) (VP ate (NP the meat))) • (ate ate-1) • (arg1 ate-1 “the dog”) • (arg2 ate-1 “the meat”)
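
A minimal sketch of this tree-to-predicate mapping for the simple S(NP, VP(V, NP)) shape shown above; the nested-list tree encoding and the helper function are illustrative assumptions, while the instance naming (ate-1) follows the slide.

```python
def pred_args(tree):
    """Map a tree of shape ['S', ['NP', ...], ['VP', verb, ['NP', ...]]]
    to predicate-argument tuples like those on the slide."""
    _, np1, vp = tree
    verb, np2 = vp[1], vp[2]
    instance = f"{verb}-1"                       # e.g. ate-1
    return [(verb, instance),
            ('arg1', instance, ' '.join(np1[1:])),
            ('arg2', instance, ' '.join(np2[1:]))]

tree = ['S', ['NP', 'The', 'dog'], ['VP', 'ate', ['NP', 'the', 'meat']]]
print(pred_args(tree))
# [('ate', 'ate-1'), ('arg1', 'ate-1', 'The dog'), ('arg2', 'ate-1', 'the meat')]
```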

  19. Problems in the Translation • Consider the sentence “The dog ate yesterday.” • (S (NP The dog) (VP ate (NP yesterday))) • (ate ate-1) • (arg1 ate-1 “the dog”) • (arg2 ate-1 “yesterday”)

  20. Grammatical Function Tags • (S (NP-SUB The dog) (VP ate (NP the meat))) • (S (NP-SUB The dog) (VP ate (NP-TMP yesterday)))

  21. Multiple Function Tags • The 20 function tags come in four varieties. Nodes in the parse tree may receive at most one tag from each variety, for a maximum of four tags. (In fact, no tree-bank node receives more than three.) • Grammatical (e.g., LGS - Logical Subject) • Form/Function (e.g., LOC - Locative) • Topicalization (TPC) • Miscellaneous (HLN - headline)

  22. Adding Function Tags to Parses • For each node in the parse tree, and for each function-tag variety, we ask which tag (including the null “no tag”) is most probable for that node. • We trained and tested on the usual portions of the tree-bank. • We achieved an 85.6% F-measure. This rises to 98% when “no tag” is counted as a valid choice.
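
A minimal sketch of this per-node decision rule; the feature tuple, tag inventories, and toy probability function are stand-in assumptions, not the trained model of Blaheta and Charniak.

```python
def tag_node(node_features, varieties, prob):
    """For each function-tag variety, choose the argmax over tags of
    p(tag | node features); the null tag (None) is always a candidate."""
    chosen = {}
    for variety, tags in varieties.items():
        best = max([None] + tags, key=lambda t: prob(t, variety, node_features))
        if best is not None:
            chosen[variety] = best
    return chosen

# Illustrative stand-ins: tag names from slides 20-21, toy probabilities.
varieties = {'grammatical': ['SUB', 'LGS'],
             'form/function': ['LOC', 'TMP'],
             'topicalization': ['TPC'],
             'miscellaneous': ['HLN']}

def toy_prob(tag, variety, feats):
    # e.g. an NP directly under S is very likely a subject
    if tag == 'SUB' and feats == ('NP', 'S'):
        return 0.9
    return 0.6 if tag is None else 0.05

print(tag_node(('NP', 'S'), varieties, toy_prob))   # {'grammatical': 'SUB'}
```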

  23. Traces in Predicate-Argument Structure • A similar problem occurs with null or “trace” elements in a parse. • She found the meat that the dog ate. • (NP the meat (SBAR that (S (NP the dog) (VP ate (NP))))) • (ate ate-1) • (arg1 ate-1 “the dog”) • (arg2 ate-1 “the meat”)

  24. Adding Traces to Parses • In much the same way that we added grammatical-function tags to parses, we added traces. • We currently achieve about 92% accuracy in putting trace elements in the right place, and about 85% accuracy if we also require identifying the NP (if any) that the trace denotes.

  25. References • Blaheta and Charniak, “Assigning Function Tags to Parsed Text”, Proceedings of the NAACL-2000. • The work on adding traces to parse trees is by Charniak and Johnson. It has not yet been written up.

  26. Parsing to Meaning: Words → Meaning • I Parsing • II Predicate-Argument Structure • III Noun-Phrase Reference • Lexical Semantic Resources • IV Conclusion

  27. Pronoun Reference • Mr. Smith ate the cookie. Then he got up. • Useful clues include distance to proposed antecedent (d), gender of pronoun (g), number of times the antecedent has been mentioned (m) and many other things.

  28. Combining the Evidence • Let the random variable A denote the referent of the pronoun. We want to find argmax_i p(A = i | d, g, m) ≈ argmax_i p(d | i) p(g | i) p(m | i) • These latter statistics can be collected from a relatively small corpus. Note that I have played somewhat fast and loose with the equations.
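
A minimal sketch of this kind of evidence combination as a product over candidate antecedents; p(d | i) is taken from slide 29, while the gender table, mention-count table, and candidate list are illustrative assumptions.

```python
# Probability tables: p(d | i) from slide 29; the gender and mention-count
# tables below are made-up stand-ins for illustration.
P_DIST    = {1: .61, 2: .11, 3: .09, 4: .05, 5: .03}
P_GENDER  = {('Mr. Smith', 'he'): .95, ('the cookie', 'he'): .01}
P_MENTION = {1: .55, 2: .30, 3: .15}

def resolve(pronoun, candidates):
    """Pick argmax_i p(d|i) * p(g|i) * p(m|i) over candidate antecedents i.
    Each candidate is (noun phrase, distance, mention count)."""
    def score(cand):
        np, dist, mentions = cand
        return (P_DIST.get(dist, .01)
                * P_GENDER.get((np, pronoun), .01)
                * P_MENTION.get(mentions, .01))
    return max(candidates, key=score)[0]

# "Mr. Smith ate the cookie. Then he got up."
print(resolve('he', [('Mr. Smith', 2, 1), ('the cookie', 1, 1)]))  # Mr. Smith
```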

  29. p(d|i) • The random variable d is a distance measure from the pronoun to the proposed antecedent i that takes syntactic structure into account (due to Hobbs). • Distance P(d|i) • 1 .61 • 2 .11 • 3 .09 • 4 .05 • 5 .03

  30. Learning the Gender of Nouns - p(g|i) • One of the knowledge sources used in the pronoun experiments was the typical gender of nouns. In this experiment we attempted to learn this information from unmarked text. • Note that if we know referents of pronouns, then the gender of the pronoun will tell us the gender of the head noun of the noun-phrase for that occurrence.

  31. How to Learn Gender • Thus, to learn gender of nouns, use an inaccurate pronoun resolution program and depend on statistics to weed out the mistakes. • We parsed about 15 million words of Wall Street Journal text and applied the Distance+Syntax algorithm mentioned earlier (about 61% accurate with perfect parses, probably less so in this configuration).
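
A minimal sketch of this noisy-counting idea: resolve pronouns with an imperfect program, tally the pronoun genders each head noun attracts, and keep only nouns where one gender clearly dominates. The thresholds and the reliability test below are assumptions, not the experiment's actual statistics.

```python
from collections import Counter, defaultdict

PRONOUN_GENDER = {'he': 'masc', 'she': 'fem', 'it': 'neut'}

def gender_counts(resolved_pairs):
    """resolved_pairs: (head_noun, pronoun) pairs produced by a noisy
    pronoun-resolution program over a large parsed corpus."""
    counts = defaultdict(Counter)
    for noun, pronoun in resolved_pairs:
        if pronoun in PRONOUN_GENDER:
            counts[noun][PRONOUN_GENDER[pronoun]] += 1
    return counts

def reliable_genders(counts, min_total=20, min_ratio=0.8):
    """Keep a noun only when one gender clearly dominates its counts;
    the thresholds are illustrative, not the experiment's actual test."""
    out = {}
    for noun, c in counts.items():
        total = sum(c.values())
        gender, top = c.most_common(1)[0]
        if total >= min_total and top / total >= min_ratio:
            out[noun] = gender
    return out
```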

  32. Results of Gender Learning Experiment • Here are the top (statistically most reliable) 43 words, shown as word(gender). Forty are correct. • Company(it), woman(she), president(he), group(it), Mr. Regan(he), man(he), President Reagan(he), government(it), U.S.(it), bank(it), mother(she), Col. North(he), Moody(it), spokeswoman(she), Aquino(she), Thatcher(she), GM(it), plan(it), Gorbachev(he), Bork(he), husband(she), Japan(it), agency(it), wife(he), dollar(it), Standard & Poor(it), father(he), utility(it), Trump(he), Baker(he), IBM(it), maker(it), years(he), Meese(he), Brazil(it), spokesman(he), Simon(he), daughter(she), Ford(it), Greenspan(he), AT&T(it), minister(he), judge(he)

  33. Experiment • All referential pronouns (he, she, it, I, you, we, they, plus variants) plus pleonastic “it”s (non-referential usage, as in “it is unlikely that the rock will fall”) were included. • Used 10-way cross validation on 4000 sentences with co-reference marked.

  34. Results by Pronoun Type • Overall 88% • He, she, it 92% • Pleonastic it 70% • I 94% • They, we 77%

  35. Results for He/She/It by Information Source • Distance 42% • Distance + some syntactic considerations 68% • Distance + syntax + gender 81% • Distance + syntax + gender + mention count 92%

  36. Error Analysis • Correct referent not in candidate list 7% • Referent would be found with completely correct gender information 20% • Referent would be found with completely correct selectional-restriction information 19% • Referent seems to be “AI complete” 54%

  37. Full Noun-Phrase Reference • The problem is to recognize when a full (non-pronominal) noun-phrase denotes an earlier-mentioned entity, e.g., “George W. Bush”, “George Bush Jr.”, “Bush”. • We currently achieve about 65% average “link” precision/recall.

  38. Why is Full NP Reference Hard? • Since full NPs have valuable clues (e.g., all of the Bush examples have the word “Bush” in them), one might expect the problem to be easier than pronoun reference. • But unlike pronouns, only 20% of full NPs co-refer. • 80% of our errors are in deciding whether a full NP has an earlier referent or not.

  39. Papers • Lappin and Leass, “An algorithm for pronominal anaphora resolution”, Computational Linguistics, V. 20, 1994 • Ge, Hale, and Charniak, “A statistical approach to pronoun anaphora”, Sixth Workshop on Very Large Corpora, ACL, 1997 • Ge PhD Thesis, Brown University, 2000 • Hall and Charniak on full NP reference, Forthcoming

  40. Parsing to Meaning: Words → Meaning • I Parsing • II Predicate-Argument Structure • III Noun-Phrase Reference • IV Conclusion

  41. Putting Everything Together • We have parsed, function-tagged, and added traces and noun-phrase (pronoun and full NP) co-reference information to 30 million words of Wall Street Journal articles. • This corpus is available from the Linguistic Data Consortium (LDC). Ask for BLLIP WSJ 1987-89 Release 1.

  42. On-going Research • Lexical-semantic research: clustering nouns and finding hypernyms for the clusters. That is, automatically putting “car”, “boat”, etc. into the same cluster, and automatically labeling the cluster “vehicle”. • Parsing speech: e.g., parsing “I think, uh, I thought that he, uh, I meant to say, she”

  43. Conclusion • Natural language processing is alive and well, thanks to corpus-based methods, and more specifically, statistical techniques. • Our current capabilities would have seemed impossible (at least to me) just ten years ago. • The directions for new research seem very rich. • Stay tuned.
