
Probabilistic Lexical Models for Textual Inference






Presentation Transcript


  1. Probabilistic Lexical Models for Textual Inference. Eyal Shnarch, Ido Dagan, Jacob Goldberger. Bar Ilan University @ IBM, July 2012

  2. The entire talk in a single sentence • lexical textual inference • principled probabilistic model • improves state-of-the-art

  3. Outline • 1: lexical textual inference • 2: principled probabilistic model • 3: improves state-of-the-art

  4. Part 1: lexical textual inference

  5. Textual inference – useful in many NLP apps. Hypothesis: Napoleon was defeated in Belgium. Candidate texts: • At Waterloo Napoleon did surrender... Waterloo - finally facing my Waterloo • Napoleon engaged in a series of wars, and won many • Napoleon was Emperor of the French from 1804 to 1815. • Napoleon was not tall enough to win the Battle of Waterloo • In the Battle of Waterloo, 18 Jun 1815, the French army, led by Napoleon, was crushed.

  6. BIU NLP lab: Chaya Liebeskind

  7. Lexical textual inference • Complex systems use a parser • Lexical inference rules link terms from T to H • Lexical rules come from lexical resources (1st or 2nd order co-occurrence) • H is inferred from T iff all its terms are inferred. Example: Text – In the Battle of Waterloo, 18 Jun 1815, the French army, led by Napoleon, was crushed. Hypothesis – Napoleon was defeated in Belgium.
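A minimal sketch of the strict lexical criterion above (H is inferred iff every H term is covered), assuming a toy term segmentation and rule set; the terms and rules below are illustrative stand-ins for real lexical resources such as WordNet:

```python
# Minimal sketch of rule-based lexical coverage (illustrative only).

def term_covered(h_term, text_terms, rules):
    """An H term is covered if it appears in T or a lexical rule links a T term to it."""
    if h_term in text_terms:
        return True
    return any((t, h_term) in rules for t in text_terms)

def hypothesis_inferred(text_terms, hyp_terms, rules):
    """Strict lexical criterion: H is inferred from T iff every H term is covered."""
    return all(term_covered(h, text_terms, rules) for h in hyp_terms)

text_terms = {"battle of waterloo", "french", "army", "led by", "napoleon", "was crushed"}
hyp_terms = {"battle", "napoleon", "defeated"}
rules = {("battle of waterloo", "battle"),   # e.g. a WordNet-style relation
         ("was crushed", "defeated")}        # e.g. a thesaurus rule

print(hypothesis_inferred(text_terms, hyp_terms, rules))  # True
```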

  8. Textual inference for ranking. Question: In which battle was Napoleon defeated? Candidate answers to rank: • At Waterloo Napoleon did surrender... Waterloo - finally facing my Waterloo • Napoleon engaged in a series of wars, and won many • Napoleon was Emperor of the French from 1804 to 1815. • Napoleon was not tall enough to win the Battle of Waterloo • In the Battle of Waterloo, 18 Jun 1815, the French army, led by Napoleon, was crushed.

  9. Ranking textual inference – prior work • Syntactic-based methods: transform T's parsed tree into H's parsed tree; based on principled ML models (Wang et al. 07, Heilman and Smith 10, Wang and Manning 10) • Heuristic lexical methods: fast, easy to implement, highly competitive; practical across genres and languages (MacKinlay and Baldwin 09, Clark and Harrison 10, Majumdar and Bhattacharyya 10)

  10. Lexical entailment scores – current practice • Count covered/uncovered terms (Majumdar and Bhattacharyya, 2010; Clark and Harrison, 2010) • Similarity estimation (Corley and Mihalcea, 2005; Zanzotto and Moschitti, 2006) • Vector space models (MacKinlay and Baldwin, 2009) • Mostly heuristic
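For contrast with the probabilistic model introduced next, here is a hedged sketch of the "count covered/uncovered" style of heuristic score; the cited systems differ in their exact formulas, so this shows only the general shape, with invented terms and rules:

```python
# Hedged sketch of a coverage-count heuristic entailment score (general shape only).

def coverage_score(text_terms, hyp_terms, rules):
    """Fraction of hypothesis terms covered directly or via a lexical rule."""
    covered = sum(
        1 for h in hyp_terms
        if h in text_terms or any((t, h) in rules for t in text_terms)
    )
    return covered / len(hyp_terms)

# Illustrative terms and rules (not from any real resource).
print(coverage_score({"napoleon", "was crushed"},
                     {"battle", "napoleon", "defeated"},
                     {("was crushed", "defeated")}))  # ~0.67
```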

  11. Part 2: principled probabilistic model

  12. Probabilistic model – overview. T (terms t1...t6): Battle of Waterloo, French army led by Napoleon was crushed. H (terms h1...h3): in which battle was Napoleon defeated. Knowledge integration links T terms to H terms at the term level (hidden variables x1, x2, x3), which feeds a sentence-level decision; annotations are available at the sentence level only.

  13. Knowledge integration • Distinguish resources' reliability levels: WordNet >> similarity-based thesauri (Lin, 1998; Pantel and Lin, 2002) • Consider the length of transitive chains: the longer a chain, the lower its probability • Consider multiple pieces of evidence: more evidence means higher probability. In the running example, some H terms are reached by a single rule, others through a transitive chain of two rules, and some by multiple chains.
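The two effects on this slide can be illustrated numerically. In the sketch below a chain's probability is the product of per-step reliabilities and independent chains are combined with a noisy-OR; the reliability values (0.9 for a WordNet-like resource, 0.6 for a Lin-style thesaurus) are invented for illustration:

```python
# Longer chains -> lower probability; more evidence -> higher probability.

def chain_prob(step_reliabilities):
    """Probability a transitive chain is valid: product of its steps."""
    p = 1.0
    for r in step_reliabilities:
        p *= r
    return p

def combine_evidence(chain_probs):
    """Noisy-OR: probability that at least one chain is valid."""
    p_none = 1.0
    for p in chain_probs:
        p_none *= (1.0 - p)
    return 1.0 - p_none

single_step = chain_prob([0.9])              # 0.90  (direct WordNet-like rule)
two_steps   = chain_prob([0.9, 0.6])         # 0.54  (longer chain, lower probability)
both_chains = combine_evidence([0.9, 0.54])  # 0.954 (more evidence, higher probability)
print(single_step, two_steps, both_chains)
```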

  14. Probabilistic model – term level. Each rule r is valid with a probability given by the reliability level of the resource which suggested r. A hypothesis term ht is inferred if at least one of the rule chains reaching it (possibly through an intermediate term t') is valid: an OR gate over chains. The parameters at this level: one reliability value per input lexical resource (ACL 2011 short paper).
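A sketch of the term-level computation as described above: one reliability parameter per resource, a chain scored by the product of the reliabilities along it, and an OR gate (noisy-OR) over the chains reaching a hypothesis term. The theta values here are hypothetical, not learned values from the paper:

```python
# Term-level probability P(x_t = 1) with one reliability parameter per resource.

theta = {"WordNet": 0.9, "Wikipedia": 0.8, "Lin": 0.6}  # hypothetical values

def term_prob(chains):
    """chains: list of chains, each a list of resource names used along the chain."""
    p_no_chain_valid = 1.0
    for chain in chains:
        p_chain = 1.0
        for resource in chain:
            p_chain *= theta[resource]          # product along the chain
        p_no_chain_valid *= (1.0 - p_chain)      # noisy-OR across chains
    return 1.0 - p_no_chain_valid

# e.g. "defeated" reached by a direct WordNet rule and by a Wikipedia -> Lin chain
print(term_prob([["WordNet"], ["Wikipedia", "Lin"]]))  # ~0.948
```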

  15. Probabilistic model – overview (recap): knowledge integration links T terms to H terms, the term level estimates which H terms are inferred, and the sentence level makes the final decision.

  16. Probabilistic model – sentence level. We define hidden binary random variables: xt = 1 iff ht is inferred from T (zero otherwise). Modeling the final decision y with an AND gate over x1, x2, x3 is the most intuitive choice, but it is too strict and does not model the dependency between terms.
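To see why the AND gate is too strict, the sketch below scores a sentence as the product of its term probabilities; a single poorly covered term drives the whole score toward zero (numbers are illustrative):

```python
# AND-gate sentence-level decision: y = 1 only if every x_t = 1,
# so P(y = 1) is the product of the term probabilities.

def and_gate(term_probs):
    p = 1.0
    for p_t in term_probs:
        p *= p_t
    return p

print(and_gate([0.95, 0.9, 0.85]))  # ~0.73
print(and_gate([0.95, 0.9, 0.05]))  # ~0.04 -- one uncovered term sinks the sentence
```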

  17. Probabilistic model – sentence level. xt = 1 iff ht is inferred from T (zero otherwise). We define another binary random variable yt, the inference decision for the prefix h1...ht; P(yt = 1) depends on yt-1 and xt, and the last yt is the final sentence-level decision. This is the Markovian Probabilistic Lexical Model (M-PLM), with its own parameters at this level.

  18. M-PLM – inference. The probability of the final sentence-level decision can be computed efficiently with a forward algorithm over forward quantities qij(k).
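A hedged sketch of such a forward pass: it marginalizes over x_t and y_{t-1} from left to right and returns the probability of the final decision. The transition table and term probabilities below are invented for illustration, and the paper's exact parameterization of q_ij(k) may differ:

```python
# Forward pass for a Markovian sentence-level model: y_t depends on y_{t-1} and x_t.

# trans[(y_prev, x_t)] = P(y_t = 1 | y_prev, x_t); values are illustrative.
trans = {(1, 1): 0.95, (1, 0): 0.30, (0, 1): 0.20, (0, 0): 0.05}

def forward_sentence_prob(term_probs, p_y0=0.5):
    """Return P(y_n = 1) by marginalizing over x_t and y_{t-1} left to right."""
    p_y = p_y0  # P(y_0 = 1), a prior on the empty prefix
    for p_x in term_probs:
        p_y_next = 0.0
        for y_prev, p_prev in ((1, p_y), (0, 1.0 - p_y)):
            for x, p_x_val in ((1, p_x), (0, 1.0 - p_x)):
                p_y_next += p_prev * p_x_val * trans[(y_prev, x)]
        p_y = p_y_next
    return p_y

print(forward_sentence_prob([0.9, 0.8, 0.7]))
```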

  19. M-PLM – summary • Observed: the lexical rules which link terms, and the annotation of the final sentence-level decision • Hidden: the term-level and prefix variables (xt, yt) • Parameters: one reliability value per resource, plus the sentence-level transition parameters • Learning: we developed an EM scheme to jointly learn all parameters

  20. So how does our model do? Question: In which battle was Napoleon defeated? Candidate answers: • At Waterloo Napoleon did surrender... Waterloo - finally facing my Waterloo • Napoleon engaged in a series of wars, and won many • Napoleon was Emperor of the French from 1804 to 1815. • Napoleon was not tall enough to win the Battle of Waterloo • In the Battle of Waterloo, 18 Jun 1815, the French army, led by Napoleon, was crushed.

  21. Part 3: improves state-of-the-art

  22. Evaluations – data sets • Ranking in passage retrieval for QA (Wang et al. 07): 5700/1500 question-candidate answer pairs from TREC 8-13, manually annotated; a notable line of work from recent years: Punyakanok et al. 04, Cui et al. 05, Wang et al. 07, Heilman and Smith 10, Wang and Manning 10 • Recognizing textual entailment within a corpus: 20,000 text-hypothesis pairs in each of RTE-5 and RTE-6, originally constructed for classification

  23. Evaluations – baselines • Syntactic generative models: require parsing and apply sophisticated machine learning methods (Punyakanok et al. 04, Cui et al. 05, Wang et al. 07, Heilman and Smith 10, Wang and Manning 10) • Lexical model – Heuristically Normalized PLM (HN-PLM): an AND gate for the sentence level plus heuristic normalizations to address its disadvantages (TextInfer workshop 11); performance in line with the best RTE systems

  24. QA results – syntactic baselines

  25. QA results – syntactic baselines + HN-PLM: improvements of +0.7% and +1%

  26. QA results – baselines + M-PLM: improvements of +3.2% and +3.5%

  27. RTE results – M-PLM vs. HN-PLM: improvements of +1.9%, +7.3%, +3.6%, +6.0%

  28. First approach – summary. A clean probabilistic lexical model • usable as a lexical component or as a stand-alone inference system • shows the superiority of principled methods over heuristic ones • an attractive passage retrieval ranking method • code available from the BIU NLP downloads page. M-PLM limits • processing is term-order dependent • lower performance on classification vs. HN-PLM, since it does not normalize well across hypothesis lengths

  29. Outline • 1: lexical textual inference • 2: principled probabilistic model • 3: improves state-of-the-art • 4: a (very) new second approach, resources as observers, with which we address the limits above

  30. Each resource is a witness: every lexical resource that suggests a rule linking a T term to an H term (possibly via an intermediate term t') testifies that the H term is inferred. Running example: T – Battle of Waterloo, French army led by Napoleon was crushed; H – in which battle was Napoleon defeated.

  31. Bottom-up witnesses model. The witnesses' testimonies determine the likelihood of each hidden term variable x1, x2, x3, and the sentence-level decision y is an AND over these term variables.
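The slide gives only the outline of this model, so the sketch below rests on assumptions it does not spell out: each resource is described by a true-positive and a false-positive rate (matching the recall and precision estimation mentioned on the next slide), witness reports are treated as independent given x_t, and the sentence decision is an AND over terms. All parameter values are invented:

```python
# Very rough sketch of the "resources as witnesses" idea (assumption-laden).

witness = {  # resource -> (true-positive rate, false-positive rate), hypothetical
    "WordNet":   (0.80, 0.05),
    "Wikipedia": (0.60, 0.10),
}

def term_posterior(reports, prior=0.5):
    """P(x_t = 1 | witness reports); reports: dict resource -> fired (bool)."""
    like_1, like_0 = prior, 1.0 - prior
    for resource, fired in reports.items():
        tp, fp = witness[resource]
        like_1 *= tp if fired else (1.0 - tp)
        like_0 *= fp if fired else (1.0 - fp)
    return like_1 / (like_1 + like_0)

def sentence_prob(term_reports):
    """AND over hypothesis terms: product of per-term posteriors."""
    p = 1.0
    for reports in term_reports:
        p *= term_posterior(reports)
    return p

print(sentence_prob([{"WordNet": True}, {"WordNet": True, "Wikipedia": False}]))
```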

  32. Advantages of the second approach • Inference: hypothesis length is not an issue • Learns from non-entailing resources • Provides a recall and precision estimation for a resource

  33. (Near) future plans • Context model • There are other languages than English: deploy the new version of a Wikipedia-based lexical resource with the Italian dump; test the probabilistic lexical models for other languages; cross-language textual entailment

  34. Cross Language Textual Entailment. Example: T (English) – Battle of Waterloo, French army led by Napoleon was crushed; H (Italian) – quale battaglia fu sconfitto Napoleone ("in which battle was Napoleon defeated"). Rules come from English monolingual resources, an English-Italian phrase table, and Italian monolingual resources. Thank You.


  36. Demo examples:
  [Bap, WN] no transitivity:
    T: Jack and Jill go_up the hill to fetch a pail of water
    H: Jack and Jill climbed a mountain to get a bucket of fluid
  [WN, Wiki] <show graph>:
    T: Barak Obama's Buick got stuck in Dublin in a large Irish crowd
    H: United_States_President's car got stuck in Ireland, surrounded by many people
    (Barak Obama: WN is out of date, need a new version of Wikipedia)
    T: Bill_Clinton's Buick got stuck in Dublin in a large Irish crowd
    H: United_States_President's car got stuck in Ireland, surrounded by many people
  [Bap, WN] this time with <transitivity & multiple evidence>:
    T: Jack and Jill go_up the hill to fetch a pail of water
    H: Jack and Jill climbed a mountain to get a bucket of fluid
  [VO, WN, Wiki]:
    T: in the Battle_of_Waterloo the French army led by Napoleon was crushed
    H: in which battle Napoleon was defeated?
  [all] ranking:
    1. in the Battle_of_Waterloo the French army led by Napoleon was crushed (72%)
    2. Napoleon was not tall enough to win the Battle_of_Waterloo (47%)
    3. at Waterloo Napoleon did surrender... Waterloo - finally facing my Waterloo (34%)
    4. Napoleon engaged in a series of wars, and won many (47%)
    5. Napoleon was Emperor of the French from 1804 to 1815 (9%)
    (a bit long run)
