
LING 581: Advanced Computational Linguistics




  1. LING 581: Advanced Computational Linguistics Lecture Notes April 27th

  2. TCE • 2nd last class today…

  3. WordNet Homework • If you haven’t already, you should have emailed me this report

  4. QA Homework
     • Idea: evaluate the feasibility of QA on the web using TREC 9 QA examples
       • programming it up is optional
       • see and appreciate why it’s hard to do...
     • Steps:
       • Pick 3 query groups
       • Simulate (programmatically) QA
       • use the Collins parser and WordNet to find answers to the queries
       • submit report (before final class next week)
     • Example question group:
       • What kind of animal was Winnie the Pooh?
       • Winnie the Pooh is what kind of animal?
       • What species was Winnie the Pooh?
       • Winnie the Pooh is an imitation of which animal?
       • What was the species of Winnie the Pooh?

  5. Example
     • trees
     • reformulate Qs into declarative sentences with missing wh-phrase
       • ____ (kind of animal) is winnie the pooh
       • winnie the pooh is ____ (species)
       • winnie the pooh is an imitation of ____ (animal)
       • the species of winnie the pooh is ____
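The reformulation step can be approximated without a parser. The sketch below is only an illustration: the slides work from parse trees, whereas this uses a handful of hypothetical regular-expression patterns (`PATTERNS` and `reformulate` are names invented here) that happen to cover the five questions in the example group.

```python
import re

# Hypothetical surface patterns (an assumption; the slides use parse trees).
# Each entry pairs a question pattern with a declarative template that
# contains a gap (____) where the wh-phrase was.
PATTERNS = [
    (re.compile(r"^what kind of (?P<type>\w+) (?:was|is) (?P<subj>.+)$"),
     "____ (kind of {type}) is {subj}"),
    (re.compile(r"^(?P<subj>.+) is what kind of (?P<type>\w+)$"),
     "{subj} is ____ (kind of {type})"),
    (re.compile(r"^what (?P<type>\w+) (?:was|is) (?P<subj>.+)$"),
     "{subj} is ____ ({type})"),
    (re.compile(r"^(?P<subj>.+) is an imitation of which (?P<type>\w+)$"),
     "{subj} is an imitation of ____ ({type})"),
    (re.compile(r"^what (?:was|is) the (?P<type>\w+) of (?P<subj>.+)$"),
     "the {type} of {subj} is ____"),
]


def reformulate(question):
    """Return a declarative sentence with a gap, or None if no pattern fits."""
    q = question.strip().rstrip("?").lower()
    for pattern, template in PATTERNS:
        match = pattern.match(q)
        if match:
            return template.format(**match.groupdict())
    return None


if __name__ == "__main__":
    for q in ["What kind of animal was Winnie the Pooh?",
              "What species was Winnie the Pooh?",
              "What was the species of Winnie the Pooh?"]:
        print(reformulate(q))
```

A pattern list like this breaks as soon as the question wording varies, which is part of why the homework asks you to appreciate how hard the task is.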

  6. Example
     • Answers:
       • Winnie the Pooh is such a popular character in Poland
       • Winnie-the-Pooh Is My Co-worker
       • Winnie the Pooh is a little, adorable and cute bear obsessed by honey.
       • Winnie-the-Pooh is so fat.
       • Winnie the Pooh is one of the things most closest to my heart
       • Winnie the Pooh is his usual befuddled self

  7. Example
     • Original declarative form: winnie the pooh is ____ (species)
     • Check semantic relatedness of extracted head words using WordNet: character, co-worker, little, one, self
     • Here, look at shortest paths
     • Summary
         Headword      Length   #nodes
         –bear         6        9258
         –character    6        1072
         one           7        734
         co-worker     7        6488
         self          7        14456
         little        10       28706
     • Constraints: length < #nodes
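A shortest-path check of this kind can be sketched with a breadth-first search. The example below is a minimal illustration, not the course code: `TOY_GRAPH` is a small hand-made graph (not WordNet data), `shortest_path` is a hypothetical helper, and reading "#nodes" as the number of nodes visited during the search is an assumption.

```python
from collections import deque

# Hand-made toy graph (an assumption, not real WordNet links).
TOY_GRAPH = {
    "bear": ["carnivore"],
    "carnivore": ["bear", "mammal"],
    "mammal": ["carnivore", "animal"],
    "animal": ["mammal", "organism"],
    "organism": ["animal", "person"],
    "person": ["organism", "worker"],
    "worker": ["person", "co-worker"],
    "co-worker": ["worker"],
}


def shortest_path(graph, start, goal):
    """Breadth-first search.

    Returns (path length in edges, number of nodes visited),
    or None when goal cannot be reached from start.
    """
    visited = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist, len(visited)
        for neighbour in graph.get(node, ()):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


if __name__ == "__main__":
    for head in ["bear", "co-worker", "self"]:
        print(head, shortest_path(TOY_GRAPH, head, "animal"))
```

Candidate head words can then be ranked by path length first, with the node count as a secondary signal, under whatever tie-breaking rule you choose.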

  8. Other resources • XWN: http://xwn.hlt.utdallas.edu/ glosses in Logical Form

  9. XWN • Applications • The Extended WordNet may be used as a Core Knowledge Base for applications such as Question Answering, Information Retrieval, Information Extraction, Summarization, Natural Language Generation, Inferences, and other knowledge intensive applications. • The glosses contain a part of the world knowledge since they define the most common concepts of the English language.

  10. XWN Example: • Dan Moldovan and Adrian Novischi, Lexical Chains for Question Answering, COLING 2002

  11. COALS • Take a look at an alternative to WordNet for computing similarity • WordNet: handbuilt system • COALS: • the correlated occurrence analogue to lexical semantics • (Rohde et al. 2004) • an instance of a vector-based statistical model for similarity • e.g., see also Latent Semantic Analysis (LSA) • Singular Value Decomposition (SVD) • sort by singular values, take top k and reduce the dimensionality of the co-occurrence matrix to rank k • based on weighted co-occurrence data from large corpora
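In a vector-based model, each word is a row of co-occurrence weights and similarity is a comparison of two rows. The sketch below uses cosine similarity as a stand-in, which is an assumption: the slides do not say which measure COALS applies, and the vectors in `TOY_VECTORS` are invented for illustration, not taken from any corpus.

```python
import math

# Invented co-occurrence vectors (an assumption, not corpus data).
TOY_VECTORS = {
    "zealous": [4, 3, 0, 1],
    "impassioned": [3, 4, 0, 0],
    "ravenous": [0, 0, 5, 2],
}


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    for other in ["impassioned", "ravenous"]:
        print(other, round(cosine(TOY_VECTORS["zealous"], TOY_VECTORS[other]), 3))
```

Reducing the matrix to rank k with SVD would shorten these vectors before the comparison; the comparison itself stays the same.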

  12. COALS • Basic Idea: • compute co-occurrence counts for (open class) words from a large corpus • corpus: • Usenet postings over 1 month • 9 million (distinct) articles • 1.2 billion word tokens • 2.1 million word types • 100,000th word occurred 98 times • co-occurrence counts • based on a ramped weighting system with window size 4 • excluding closed-class items • [Figure: ramped window around wi, with weights 1, 2, 3, 4 for wi-4 to wi-1 and 4, 3, 2, 1 for wi+1 to wi+4]
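The ramped window can be sketched in a few lines. This is a minimal illustration under stated assumptions: `CLOSED_CLASS` is a tiny hypothetical stop list, `ramped_cooccurrence` is a name invented here, and closed-class tokens are assumed to keep their position in the window while contributing no counts.

```python
from collections import defaultdict

# Tiny hypothetical stop list standing in for the closed-class items.
CLOSED_CLASS = {"the", "a", "of", "is", "and", "by"}


def ramped_cooccurrence(tokens, window=4):
    """Weighted co-occurrence counts with a ramped window.

    A neighbour at distance d (1 <= d <= window) adds window - d + 1,
    so with window=4 adjacent words add 4 and words 4 apart add 1.
    """
    counts = defaultdict(int)
    for i, target in enumerate(tokens):
        if target in CLOSED_CLASS:
            continue
        for offset in range(1, window + 1):
            weight = window - offset + 1
            for j in (i - offset, i + offset):
                if 0 <= j < len(tokens) and tokens[j] not in CLOSED_CLASS:
                    counts[(target, tokens[j])] += weight
    return counts


if __name__ == "__main__":
    text = "winnie the pooh is a cute bear obsessed by honey"
    for pair, weight in sorted(ramped_cooccurrence(text.split()).items()):
        print(pair, weight)
```

Each row of the resulting table, collected over a large corpus, becomes the vector for one word.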

  13. COALS • Example:

  14. COALS • available online • http://dlt4.mit.edu/~dr/COALS/similarity.php

  15. Computing Similarity

  16. Worked Example: zealous
      • Old Code: WordNet 1.7.1
      • run connectbf/3
          ?- connectbf(impassioned,zealous,X).
          X = 10 ?
          ?- connectbf(zealous,impassioned,X).
          X = 9 ?
      • compare to b. ravenous
          ?- connectbf(ravenous,zealous,X).
          no
          ?- connectbf(zealous,ravenous,X).
      • shortest link between impassioned and zealous

  17. shortest path between ravenous and zealous Worked Example: zealous

  18. Task: Match each word in the first column with its definition in the second column
        Word          Definition
        accolade      deviation
        aberrant      keen insight
        abate         abolish
        abscond       lessen in intensity
        acumen        sour or bitter
        acerbic       depart secretly
        abscission    building up
        accretion     renounce
        abjure        removal
        abrogate      praise

  19. Task: Match each word in the first column with its definition in the second column
        Word                Definition
        accolade            deviation
        aberrant            keen insight
        abate         2     abolish
        abscond       2     lessen in intensity
        acumen        2     sour or bitter
        acerbic       3     depart secretly
        abscission          building up
        accretion     2     renounce
        abjure        2     removal
        abrogate      3     praise

  20. COALS and the GRE

  21. COALS and the GRE

  22. COALS and the GRE

  23. COALS and the GRE

  24. COALS and the GRE

  25. COALS and the GRE

  26. COALS and the GRE

  27. COALS and the GRE

  28. COALS and the GRE

  29. Task: Match each word in the first column with its definition in the second column
        Word          Definition
        accolade      deviation
        aberrant      keen insight
        abate         abolish
        abscond       lessen in intensity
        acumen        sour or bitter
        acerbic       depart secretly
        abscission    building up
        accretion     renounce
        abjure        removal
        abrogate      praise

  30. Heuristic: competing words, pick the strongest
      • 7 out of 10
        Word          Definition
        accolade      deviation
        aberrant      keen insight
        abate         abolish
        abscond       lessen in intensity
        acumen        sour or bitter
        acerbic       depart secretly
        abscission    building up
        accretion     renounce
        abjure        removal
        abrogate      praise
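One possible reading of this heuristic is a greedy one-to-one assignment: when several words compete for the same definition, the pair with the highest similarity wins and the losers fall back to their next-best definition. The sketch below follows that reading, which is an assumption; `match_strongest` is a name invented here and the scores in `TOY_SCORES` are made up for illustration, not COALS output.

```python
# Made-up similarity scores (an assumption, not COALS output).
TOY_SCORES = {
    ("abrogate", "abolish"): 0.7,
    ("abate", "lessen in intensity"): 0.6,
    ("abjure", "abolish"): 0.5,
    ("abjure", "renounce"): 0.4,
    ("abate", "abolish"): 0.3,
    ("abrogate", "lessen in intensity"): 0.2,
}


def match_strongest(scores):
    """Greedy one-to-one matching of words to definitions.

    Pairs are considered from highest to lowest score; a pair is kept
    only if neither its word nor its definition is already used.
    """
    assigned = {}
    taken = set()
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    for (word, definition), _score in ranked:
        if word not in assigned and definition not in taken:
            assigned[word] = definition
            taken.add(definition)
    return assigned


if __name__ == "__main__":
    for word, definition in match_strongest(TOY_SCORES).items():
        print(word, "->", definition)
```

In the toy data, abjure scores highest with "abolish", but abrogate competes for the same definition with a stronger score, so abjure ends up with "renounce".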
