
Probabilistic Lexical Models for Textual Inference






Presentation Transcript


  1. Probabilistic Lexical Models for Textual Inference. Eyal Shnarch, Ido Dagan, Jacob Goldberger. Bar Ilan University @ IBM, July 2012

  2. The entire talk in a single sentence • lexical textual inference • principled probabilistic model • improves state-of-the-art

  3. Outline • 1: lexical textual inference • 2: principled probabilistic model • 3: improves state-of-the-art

  4. Part 1: lexical textual inference

  5. Textual inference – useful in many NLP apps. Hypothesis: Napoleon was defeated in Belgium. Candidate texts: • At Waterloo Napoleon did surrender... Waterloo - finally facing my Waterloo • Napoleon engaged in a series of wars, and won many • Napoleon was Emperor of the French from 1804 to 1815. • Napoleon was not tall enough to win the Battle of Waterloo • In the Battle of Waterloo, 18 Jun 1815, the French army, led by Napoleon, was crushed.

  6. BIU NLP lab: Chaya Liebeskind

  7. Lexical textual inference • Complex systems use a parser • Lexical inference rules link terms from T to H • Lexical rules come from lexical resources (1st or 2nd order co-occurrence) • H is inferred from T iff all its terms are inferred. Example: Text – In the Battle of Waterloo, 18 Jun 1815, the French army, led by Napoleon, was crushed. Hypothesis – Napoleon was defeated in Belgium.
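A minimal sketch of the strict lexical criterion above (H is inferred iff every H term is covered), assuming a toy term segmentation and rule set; the terms and rules below are illustrative stand-ins for real lexical resources such as WordNet:

```python
# Minimal sketch of rule-based lexical coverage (illustrative only).

def term_covered(h_term, text_terms, rules):
    """An H term is covered if it appears in T or a lexical rule links a T term to it."""
    if h_term in text_terms:
        return True
    return any((t, h_term) in rules for t in text_terms)

def hypothesis_inferred(text_terms, hyp_terms, rules):
    """Strict lexical criterion: H is inferred from T iff every H term is covered."""
    return all(term_covered(h, text_terms, rules) for h in hyp_terms)

text_terms = {"battle of waterloo", "french", "army", "led by", "napoleon", "was crushed"}
hyp_terms = {"battle", "napoleon", "defeated"}
rules = {("battle of waterloo", "battle"),   # e.g. a WordNet-style relation
         ("was crushed", "defeated")}        # e.g. a thesaurus rule

print(hypothesis_inferred(text_terms, hyp_terms, rules))  # True
```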

  8. Textual inference for ranking. Question: In which battle was Napoleon defeated? Candidate answers to rank: • At Waterloo Napoleon did surrender... Waterloo - finally facing my Waterloo • Napoleon engaged in a series of wars, and won many • Napoleon was Emperor of the French from 1804 to 1815. • Napoleon was not tall enough to win the Battle of Waterloo • In the Battle of Waterloo, 18 Jun 1815, the French army, led by Napoleon, was crushed.

  9. Ranking textual inference – prior work • Syntactic-based methods: transform T's parsed tree into H's parsed tree; based on principled ML models (Wang et al. 07, Heilman and Smith 10, Wang and Manning 10) • Heuristic lexical methods: fast, easy to implement, highly competitive; practical across genres and languages (MacKinlay and Baldwin 09, Clark and Harrison 10, Majumdar and Bhattacharyya 10)

  10. Lexical entailment scores – current practice • Count covered/uncovered terms (Majumdar and Bhattacharyya, 2010; Clark and Harrison, 2010) • Similarity estimation (Corley and Mihalcea, 2005; Zanzotto and Moschitti, 2006) • Vector space models (MacKinlay and Baldwin, 2009) • Mostly heuristic
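For contrast with the probabilistic model introduced next, here is a hedged sketch of the "count covered/uncovered" style of heuristic score; the cited systems differ in their exact formulas, so this shows only the general shape, with invented terms and rules:

```python
# Hedged sketch of a coverage-count heuristic entailment score (general shape only).

def coverage_score(text_terms, hyp_terms, rules):
    """Fraction of hypothesis terms covered directly or via a lexical rule."""
    covered = sum(
        1 for h in hyp_terms
        if h in text_terms or any((t, h) in rules for t in text_terms)
    )
    return covered / len(hyp_terms)

# Illustrative terms and rules (not from any real resource).
print(coverage_score({"napoleon", "was crushed"},
                     {"battle", "napoleon", "defeated"},
                     {("was crushed", "defeated")}))  # ~0.67
```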

  11. Part 2: principled probabilistic model

  12. Probabilistic model – overview. T (terms t1...t6): Battle of Waterloo, French army led by Napoleon was crushed. H (terms h1...h3): in which battle was Napoleon defeated. Knowledge integration links T terms to H terms at the term level (hidden variables x1, x2, x3), which feeds a sentence-level decision; annotations are available at the sentence level only.

  13. Knowledge integration • Distinguish resources' reliability levels: WordNet >> similarity-based thesauri (Lin, 1998; Pantel and Lin, 2002) • Consider the length of transitive chains: the longer a chain, the lower its probability • Consider multiple pieces of evidence: more evidence means higher probability. In the running example, some H terms are reached by a single rule, others through a transitive chain of two rules, and some by multiple chains.
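The two effects on this slide can be illustrated numerically. In the sketch below a chain's probability is the product of per-step reliabilities and independent chains are combined with a noisy-OR; the reliability values (0.9 for a WordNet-like resource, 0.6 for a Lin-style thesaurus) are invented for illustration:

```python
# Longer chains -> lower probability; more evidence -> higher probability.

def chain_prob(step_reliabilities):
    """Probability a transitive chain is valid: product of its steps."""
    p = 1.0
    for r in step_reliabilities:
        p *= r
    return p

def combine_evidence(chain_probs):
    """Noisy-OR: probability that at least one chain is valid."""
    p_none = 1.0
    for p in chain_probs:
        p_none *= (1.0 - p)
    return 1.0 - p_none

single_step = chain_prob([0.9])              # 0.90  (direct WordNet-like rule)
two_steps   = chain_prob([0.9, 0.6])         # 0.54  (longer chain, lower probability)
both_chains = combine_evidence([0.9, 0.54])  # 0.954 (more evidence, higher probability)
print(single_step, two_steps, both_chains)
```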

  14. Probabilistic model – term level. Each rule r is valid with a probability given by the reliability level of the resource which suggested r. A hypothesis term ht is inferred if at least one of the rule chains reaching it (possibly through an intermediate term t') is valid: an OR gate over chains. The parameters at this level: one reliability value per input lexical resource (ACL 2011 short paper).
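A sketch of the term-level computation as described above: one reliability parameter per resource, a chain scored by the product of the reliabilities along it, and an OR gate (noisy-OR) over the chains reaching a hypothesis term. The theta values here are hypothetical, not learned values from the paper:

```python
# Term-level probability P(x_t = 1) with one reliability parameter per resource.

theta = {"WordNet": 0.9, "Wikipedia": 0.8, "Lin": 0.6}  # hypothetical values

def term_prob(chains):
    """chains: list of chains, each a list of resource names used along the chain."""
    p_no_chain_valid = 1.0
    for chain in chains:
        p_chain = 1.0
        for resource in chain:
            p_chain *= theta[resource]          # product along the chain
        p_no_chain_valid *= (1.0 - p_chain)      # noisy-OR across chains
    return 1.0 - p_no_chain_valid

# e.g. "defeated" reached by a direct WordNet rule and by a Wikipedia -> Lin chain
print(term_prob([["WordNet"], ["Wikipedia", "Lin"]]))  # ~0.948
```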

  15. Probabilistic model – overview (recap): knowledge integration links T terms to H terms, the term level estimates which H terms are inferred, and the sentence level makes the final decision.

  16. Probabilistic model – sentence level. We define hidden binary random variables: xt = 1 iff ht is inferred from T (zero otherwise). Modeling the final decision y with an AND gate over x1, x2, x3 is the most intuitive choice, but it is too strict and does not model the dependency between terms.
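To see why the AND gate is too strict, the sketch below scores a sentence as the product of its term probabilities; a single poorly covered term drives the whole score toward zero (numbers are illustrative):

```python
# AND-gate sentence-level decision: y = 1 only if every x_t = 1,
# so P(y = 1) is the product of the term probabilities.

def and_gate(term_probs):
    p = 1.0
    for p_t in term_probs:
        p *= p_t
    return p

print(and_gate([0.95, 0.9, 0.85]))  # ~0.73
print(and_gate([0.95, 0.9, 0.05]))  # ~0.04 -- one uncovered term sinks the sentence
```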

  17. Probabilistic model – sentence level. xt = 1 iff ht is inferred from T (zero otherwise). We define another binary random variable yt, the inference decision for the prefix h1...ht; P(yt = 1) depends on yt-1 and xt, and the last yt is the final sentence-level decision. This is the Markovian Probabilistic Lexical Model (M-PLM), with its own parameters at this level.

  18. M-PLM – inference. The probability of the final sentence-level decision can be computed efficiently with a forward algorithm over forward quantities qij(k).
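A hedged sketch of such a forward pass: it marginalizes over x_t and y_{t-1} from left to right and returns the probability of the final decision. The transition table and term probabilities below are invented for illustration, and the paper's exact parameterization of q_ij(k) may differ:

```python
# Forward pass for a Markovian sentence-level model: y_t depends on y_{t-1} and x_t.

# trans[(y_prev, x_t)] = P(y_t = 1 | y_prev, x_t); values are illustrative.
trans = {(1, 1): 0.95, (1, 0): 0.30, (0, 1): 0.20, (0, 0): 0.05}

def forward_sentence_prob(term_probs, p_y0=0.5):
    """Return P(y_n = 1) by marginalizing over x_t and y_{t-1} left to right."""
    p_y = p_y0  # P(y_0 = 1), a prior on the empty prefix
    for p_x in term_probs:
        p_y_next = 0.0
        for y_prev, p_prev in ((1, p_y), (0, 1.0 - p_y)):
            for x, p_x_val in ((1, p_x), (0, 1.0 - p_x)):
                p_y_next += p_prev * p_x_val * trans[(y_prev, x)]
        p_y = p_y_next
    return p_y

print(forward_sentence_prob([0.9, 0.8, 0.7]))
```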

  19. M-PLM – summary • Observed: the lexical rules which link terms, and the annotation of the final sentence-level decision • Hidden: the term-level and prefix variables (xt, yt) • Parameters: one reliability value per resource, plus the sentence-level transition parameters • Learning: we developed an EM scheme to jointly learn all parameters

  20. So how does our model do? Question: In which battle was Napoleon defeated? Candidate answers: • At Waterloo Napoleon did surrender... Waterloo - finally facing my Waterloo • Napoleon engaged in a series of wars, and won many • Napoleon was Emperor of the French from 1804 to 1815. • Napoleon was not tall enough to win the Battle of Waterloo • In the Battle of Waterloo, 18 Jun 1815, the French army, led by Napoleon, was crushed.

  21. Part 3: improves state-of-the-art

  22. Evaluations – data sets • Ranking in passage retrieval for QA (Wang et al. 07): 5700/1500 question-candidate answer pairs from TREC 8-13, manually annotated; a notable line of work from recent years: Punyakanok et al. 04, Cui et al. 05, Wang et al. 07, Heilman and Smith 10, Wang and Manning 10 • Recognizing textual entailment within a corpus: 20,000 text-hypothesis pairs in each of RTE-5 and RTE-6, originally constructed for classification

  23. Evaluations – baselines • Syntactic generative models: require parsing and apply sophisticated machine learning methods (Punyakanok et al. 04, Cui et al. 05, Wang et al. 07, Heilman and Smith 10, Wang and Manning 10) • Lexical model – Heuristically Normalized PLM (HN-PLM): an AND gate for the sentence level plus heuristic normalizations to address its disadvantages (TextInfer workshop 11); performance in line with the best RTE systems

  24. QA results – syntactic baselines

  25. QA results – syntactic baselines + HN-PLM: improvements of +0.7% and +1%

  26. QA results – baselines + M-PLM: improvements of +3.2% and +3.5%

  27. RTE results – M-PLM vs. HN-PLM: improvements of +1.9%, +7.3%, +3.6%, +6.0%

  28. First approach – summary. A clean probabilistic lexical model • usable as a lexical component or as a stand-alone inference system • shows the superiority of principled methods over heuristic ones • an attractive passage retrieval ranking method • code available from the BIU NLP downloads page. M-PLM limits • processing is term-order dependent • lower performance on classification vs. HN-PLM, since it does not normalize well across hypothesis lengths

  29. Outline • 1: lexical textual inference • 2: principled probabilistic model • 3: improves state-of-the-art • 4: a (very) new second approach, resources as observers, with which we address the limits above

  30. Each resource is a witness: every lexical resource that suggests a rule linking a T term to an H term (possibly via an intermediate term t') testifies that the H term is inferred. Running example: T – Battle of Waterloo, French army led by Napoleon was crushed; H – in which battle was Napoleon defeated.

  31. Bottom-up witnesses model. The witnesses' testimonies determine the likelihood of each hidden term variable x1, x2, x3, and the sentence-level decision y is an AND over these term variables.
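The slide gives only the outline of this model, so the sketch below rests on assumptions it does not spell out: each resource is described by a true-positive and a false-positive rate (matching the recall and precision estimation mentioned on the next slide), witness reports are treated as independent given x_t, and the sentence decision is an AND over terms. All parameter values are invented:

```python
# Very rough sketch of the "resources as witnesses" idea (assumption-laden).

witness = {  # resource -> (true-positive rate, false-positive rate), hypothetical
    "WordNet":   (0.80, 0.05),
    "Wikipedia": (0.60, 0.10),
}

def term_posterior(reports, prior=0.5):
    """P(x_t = 1 | witness reports); reports: dict resource -> fired (bool)."""
    like_1, like_0 = prior, 1.0 - prior
    for resource, fired in reports.items():
        tp, fp = witness[resource]
        like_1 *= tp if fired else (1.0 - tp)
        like_0 *= fp if fired else (1.0 - fp)
    return like_1 / (like_1 + like_0)

def sentence_prob(term_reports):
    """AND over hypothesis terms: product of per-term posteriors."""
    p = 1.0
    for reports in term_reports:
        p *= term_posterior(reports)
    return p

print(sentence_prob([{"WordNet": True}, {"WordNet": True, "Wikipedia": False}]))
```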

  32. Advantages of the second approach • Inference: hypothesis length is not an issue • Learns from non-entailing resources • Provides a recall and precision estimation for a resource

  33. (Near) future plans • Context model • There are other languages than English: deploy the new version of a Wikipedia-based lexical resource with the Italian dump; test the probabilistic lexical models for other languages; cross-language textual entailment

  34. Cross Language Textual Entailment. Example: T (English) – Battle of Waterloo, French army led by Napoleon was crushed; H (Italian) – quale battaglia fu sconfitto Napoleone ("in which battle was Napoleon defeated"). Rules come from English monolingual resources, an English-Italian phrase table, and Italian monolingual resources. Thank You.


  36. Demo examples:
  [Bap, WN] no transitivity:
    T: Jack and Jill go_up the hill to fetch a pail of water
    H: Jack and Jill climbed a mountain to get a bucket of fluid
  [WN, Wiki] <show graph>:
    T: Barak Obama's Buick got stuck in Dublin in a large Irish crowd
    H: United_States_President's car got stuck in Ireland, surrounded by many people
    (Barak Obama: WN is out of date, need a new version of Wikipedia)
    T: Bill_Clinton's Buick got stuck in Dublin in a large Irish crowd
    H: United_States_President's car got stuck in Ireland, surrounded by many people
  [Bap, WN] this time with <transitivity & multiple evidence>:
    T: Jack and Jill go_up the hill to fetch a pail of water
    H: Jack and Jill climbed a mountain to get a bucket of fluid
  [VO, WN, Wiki]:
    T: in the Battle_of_Waterloo the French army led by Napoleon was crushed
    H: in which battle Napoleon was defeated?
  [all] ranking:
    1. in the Battle_of_Waterloo the French army led by Napoleon was crushed (72%)
    2. Napoleon was not tall enough to win the Battle_of_Waterloo (47%)
    3. at Waterloo Napoleon did surrender... Waterloo - finally facing my Waterloo (34%)
    4. Napoleon engaged in a series of wars, and won many (47%)
    5. Napoleon was Emperor of the French from 1804 to 1815 (9%)
    (a bit long run)
