Robust Textual Inference via Graph Matching • Aria Haghighi, Andrew Ng, Christopher Manning
Textual Entailment Examples • TEXT (T): A Filipino hostage in Iraq was released. • HYPOTHESIS (H): A Filipino hostage was freed in Iraq. • Entailed • Only Need Lexical Similarity Matching
Another Example • T: The Psychobiology Institute of Israel was established in 1979. • H: Israel was established in 1979. • Not Entailed • Must go beyond matching only words
The Need For Relations • H: Israel was founded in 1971. • T: The Psychobiology Institute of Israel was founded in 1971. • No match for important relation in H! • Must match words and relations between them
Our Approach • Dependency Graph • Represent words / phrases as vertices and syntactic / semantic relations as edges • Graph Matching • Approximate notion of Isomorphism • H is entailed by T if the cost of matching H to T is low.
Representation Pipeline • Raw Text → Phrase Structure Parse • Example: John’s mother walked to the store. • [Figure: phrase structure parse of the example, with nodes S, NP, VP, PP] • Modified parser of [Klein and Manning ‘03] • Handle collocations: John rang_up Mary
Representation Pipeline • Phrase Structure Parse → Dependency Tree • Example: John’s mother walked to the store. • [Figure: dependency tree of the example, with vertices walked (VBD), mother (NN), store (NN), John (NNP) and edges labeled subj, to, poss] • Modified Collins’ Head Rules • Typed relations via tgrep expressions
Representation Pipeline • Local dependencies not enough • Additional Analysis • Semantic Role Labeling [Toutanova et al ‘05] • Named Entity Recognition: Collapse named entities into single vertex [Finkel et al ‘04] • Coreference Resolution: • T: Since its formation in 1948, Israel … • H: Israel was established in 1948.
Matching Example • [Figure: matching between the Hypothesis graph and the Text graph]
Cost Model • Matching: A mapping from the vertices of H to those of T (or to a NULL vertex) • Cost of matching H to T is the cost of the lowest-cost matching
Vertex Cost Model • Penalize for each vertex substitution
Vertex Substitution • VertexSub(v,M(v)) • Exact Match • Synonym Match • Hypernym Match: v is a “kind of” M(v) • WordNet Similarity (Resnik Measure) • Distributional Similarity • Part-Of-Speech Match
Vertex Weight • Weights for Vertex Importance • Part-Of-Speech • Named Entity Type • TF-IDF
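The vertex cost described on the preceding slides can be sketched in Python. This is a minimal illustration, assuming VertexCost is a weight-normalized sum of per-vertex substitution costs. The names `vertex_sub` and `vertex_cost`, the toy synonym table, and all numeric costs are hypothetical and are not taken from the slides.

```python
from typing import Dict, Optional

# Hypothetical toy synonym table standing in for a WordNet lookup.
SYNONYMS = {("freed", "released"), ("released", "freed")}


def vertex_sub(h_word: str, t_word: Optional[str]) -> float:
    """Illustrative substitution cost: 0.0 is a perfect match, 1.0 is no match."""
    if t_word is None:                  # H vertex mapped to the NULL vertex
        return 1.0
    if h_word == t_word:                # exact match
        return 0.0
    if (h_word, t_word) in SYNONYMS:    # synonym match (assumed cost)
        return 0.2
    return 1.0


def vertex_cost(matching: Dict[str, Optional[str]],
                weight: Dict[str, float]) -> float:
    """Weighted average of substitution costs over the vertices of H."""
    total_weight = sum(weight[v] for v in matching)
    if total_weight == 0:
        return 0.0
    weighted = sum(weight[v] * vertex_sub(v, matching[v]) for v in matching)
    return weighted / total_weight


# Content words of H mapped to content words of T (hostage example).
matching = {"hostage": "hostage", "freed": "released", "Iraq": "Iraq"}
# Assumed importance weights, e.g. a named entity counts double.
weight = {"hostage": 1.0, "freed": 1.0, "Iraq": 2.0}
cost = vertex_cost(matching, weight)
```

Normalizing by the total weight keeps the cost in [0, 1] regardless of hypothesis length, which makes a single entailment threshold easier to apply across pairs.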
Relation Matching • Partial Match (and Stem Match) • T: The Japanese invasion of Manchuria. • H: Japan invaded Manchuria. • Ancestor Match • T: John is studying French farming practices. • H: John is studying French farming.
Relation Cost • For each edge e in H, is the image of e under M a path in T? • Weight each edge according to “importance” of typed relation
Cost Model • PathSub(v → v’, M(v) → M(v’)) • Exact Match: Matching preserves edge and edge label • Partial Match: Matching preserves edge but not label • Ancestor Match: M(v) is an ancestor of M(v’) • Kinked Match: M(v) and M(v’) share a common ancestor • Costs Scale with Length of Path
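The four path-substitution cases can be told apart with a few lookups in the text's dependency tree. The sketch below is illustrative only: it assumes T is a tree stored as a child-to-(head, label) map, and the function names, relation labels, and toy parses are hypothetical. It returns the match category and does not model how costs scale with path length.

```python
from typing import Dict, List, Optional, Tuple

# T as a tree: each dependent maps to (head, relation label).
ParentMap = Dict[str, Tuple[str, str]]


def ancestors(node: str, parent: ParentMap) -> List[str]:
    """Return the heads above `node`, nearest first (assumes no cycles)."""
    chain = []
    while node in parent:
        node = parent[node][0]
        chain.append(node)
    return chain


def classify_path(h_edge: Tuple[str, str, str],
                  matching: Dict[str, Optional[str]],
                  parent: ParentMap) -> str:
    """Classify the image of an H edge (head, dependent, label) in T."""
    head, dep, label = h_edge
    t_head, t_dep = matching.get(head), matching.get(dep)
    if t_head is None or t_dep is None:
        return "none"                       # an endpoint maps to NULL
    if t_dep in parent and parent[t_dep][0] == t_head:
        # The edge itself is preserved; check whether the label is too.
        return "exact" if parent[t_dep][1] == label else "partial"
    dep_ancestors = ancestors(t_dep, parent)
    if t_head in dep_ancestors:
        return "ancestor"                   # M(v) dominates M(v')
    if set(ancestors(t_head, parent)) & set(dep_ancestors):
        return "kinked"                     # shared common ancestor only
    return "none"


# T: "The Japanese invasion of Manchuria." (labels are assumed)
invasion = {"Japanese": ("invasion", "mod"), "Manchuria": ("invasion", "of")}
m1 = {"invaded": "invasion", "Japan": "Japanese", "Manchuria": "Manchuria"}
partial = classify_path(("invaded", "Japan", "subj"), m1, invasion)

# T: "John is studying French farming practices." (labels are assumed)
farming = {"John": ("studying", "subj"), "practices": ("studying", "obj"),
           "farming": ("practices", "mod"), "French": ("practices", "mod")}
identity = {w: w for w in ["John", "studying", "farming", "French"]}
ancestor = classify_path(("studying", "farming", "obj"), identity, farming)
```

On these toy parses, the invasion example falls into the partial-match case and the farming example into the ancestor-match case, mirroring the two examples on the Relation Matching slide.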
Final Cost Model • Combine VertexCost and RelationCost
Finding Minimal Matching • With VertexCost only, minimal matching found with Bipartite Graph Matching • With RelationCost, the problem is NP-hard: RelationCost(M) = 0 if and only if H is isomorphic to a subgraph of T • Approximate Search • Initialize M to best matching using only VertexCost(M) [Bipartite Graph Matching] • Do Greedy Hill-climbing with full cost model • Seems to do well in practice
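The approximate search can be sketched as a greedy local search over single-vertex reassignments. This is a hedged illustration, not the authors' implementation: the initialization below picks the cheapest target for each H vertex independently, which is a simplification swapped in for the bipartite graph matching step, and the function names, toy cost table, and collision penalty are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Matching = Dict[str, Optional[str]]


def hill_climb(h_vertices: List[str], t_vertices: List[str],
               cost: Callable[[Matching], float],
               init: Matching) -> Tuple[Matching, float]:
    """Greedy hill-climbing: apply the best single reassignment until none helps."""
    matching = dict(init)
    best = cost(matching)
    candidates: List[Optional[str]] = list(t_vertices) + [None]  # None = NULL
    while True:
        best_move = None
        for v in h_vertices:
            for t in candidates:
                if t == matching[v]:
                    continue
                trial = dict(matching)
                trial[v] = t
                c = cost(trial)
                if c < best:                # strictly better than anything seen
                    best, best_move = c, (v, t)
        if best_move is None:               # local minimum reached
            return matching, best
        matching[best_move[0]] = best_move[1]


# Toy full cost: vertex costs plus a stand-in relation penalty.
VERTEX = {("a", "x"): 0.0, ("a", "y"): 0.5, ("b", "x"): 0.1, ("b", "y"): 0.3}


def toy_vertex(v: str, t: Optional[str]) -> float:
    return 1.0 if t is None else VERTEX[(v, t)]


def toy_cost(m: Matching) -> float:
    vertex_part = sum(toy_vertex(v, t) for v, t in m.items())
    # Assumed relation penalty: the edge a -> b collapses if both map to one vertex.
    collapsed = m["a"] is not None and m["a"] == m["b"]
    return vertex_part + (1.0 if collapsed else 0.0)


h, t = ["a", "b"], ["x", "y"]
# Simplified initialization using vertex costs only (not bipartite matching).
init = {v: min(t, key=lambda u: toy_vertex(v, u)) for v in h}
final, final_cost = hill_climb(h, t, toy_cost, init)
```

In this toy run the vertex-only initialization sends both hypothesis vertices to `x`, and one hill-climbing move repairs the collapsed edge by moving `b` to `y`. Because each accepted move strictly lowers the cost and the space of matchings is finite, the loop always terminates, though only at a local minimum.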
Learning Weights • Parameterize Substitution Costs • Problem: We don’t know matchings in training data. If we did, training would be easy. • Solution: Alternate between finding matchings and re-estimating parameters
Experiments • Data: Recognizing Textual Entailment ‘05 [Dagan et al, ‘05] • 567 Development Pairs • 800 Test Pairs • CWS = Confidence Weighted Score
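CWS rewards a system for being most confident on the pairs it gets right. A minimal sketch, assuming the RTE definition: rank the pairs by decreasing confidence, then average over every rank i the accuracy among the top i pairs. The function name and the toy predictions are hypothetical.

```python
from typing import List, Tuple


def confidence_weighted_score(predictions: List[Tuple[float, bool]]) -> float:
    """predictions: (confidence, is_correct) for each text/hypothesis pair."""
    if not predictions:
        raise ValueError("need at least one prediction")
    # Most confident judgments first.
    ranked = sorted(predictions, key=lambda p: p[0], reverse=True)
    correct_so_far = 0
    total = 0.0
    for rank, (_, is_correct) in enumerate(ranked, start=1):
        correct_so_far += int(is_correct)
        total += correct_so_far / rank      # accuracy within the top `rank` pairs
    return total / len(ranked)


# Three toy judgments: the least confident one is listed first on purpose.
toy = [(0.6, True), (0.9, True), (0.8, False)]
cws = confidence_weighted_score(toy)
```

A system with the same accuracy scores higher when its errors sit at the low-confidence end of the ranking, which is why CWS is reported alongside plain accuracy.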
Problem Cases • Monotonicity Assumptions • Superlatives • T: Osaka is the tallest tower in western Japan. • H: Osaka is the tallest tower in Japan. • Non-Factive Verbs • T: It is rumored that John is dating Sally. • H: John is dating Sally.
Conclusions • What’s been done • Learned Graph Matching framework • New edge and vertex features • Fast effective search procedure • What’s Needed? More Resources! • Lexical Resources: Problems with Recall • Better Dependency Parsing • Measures of Phrasal Similarity
Thanks! • Aria Haghighi, Andrew Ng, Christopher Manning
Examples • T: C and D Technologies announced that it has closed the acquisition of Datel Inc. • H: Datel acquired C and D Technologies. • Not Entailed • Recognize switch in argument structure. • Note the nominalization
Textual Entailment • Problem Definition • Given text and hypothesis (T,H) • Determine whether H ‘follows’ from T • Not strict logical entailment • Applications • Information Extraction • Question Answering