Textual Entailment
Textual Entailment as a Framework for Applied Semantics

Ido Dagan Bar-Ilan University, Israel

Joint works with:

Oren Glickman, Idan Szpektor, Roy Bar-Haim, Maayan Geffet, Moshe Koppel, Efrat Marmorshtein – Bar-Ilan University; Shachar Mirkin – Hebrew University, Israel

Hristo Tanev, Bernardo Magnini, Alberto Lavelli, Lorenza Romano ITC-irst, Italy

Bonaventura Coppola, Milen Kouylekov University of Trento and ITC-irst, Italy

Danilo Giampiccolo, CELCT, Italy Dan Roth, UIUC

Applied Semantics for Text Understanding/Reading

  • Understanding text meaning refers to the semantic level of language

  • An applied computational framework for semantics is needed

  • Such a common framework is still missing

Desiderata for Modeling Framework

  • A framework for a target level of language processing should provide:

    • Generic module for applications

    • Unified paradigm for investigating language phenomena

    • Unified knowledge representation

  • Most semantics research is scattered

    • WSD, NER, SRL, lexical semantic relations, … (in contrast with, e.g., syntax)

    • Dominating approach - interpretation


  • The textual entailment task – what and why?

  • Evaluation – PASCAL RTE Challenges

  • Modeling approach:

    • Knowledge acquisition

    • Inference (briefly)

    • Application example

  • An alternative framework for investigating semantics

Natural Language and Meaning



Variability of Semantic Expression

The Dow Jones Industrial Average closed up 255

Model variability as relations between text expressions:

  • Equivalence: expr1 ⇔ expr2 (paraphrasing)

  • Entailment: expr1 ⇒ expr2 – the general case

    • Incorporates inference as well

Dow ends up

Dow gains 255 points

Stock market hits a record high

Dow climbs 255

Typical Application Inference

Question: Who bought Overture?  Expected answer form: X bought Overture

Overture’s acquisition by Yahoo

Yahoo bought Overture


hypothesized answer


  • Similar for IE: X buy Y

  • Similar for “semantic” IR: t: Overture was bought …

  • Summarization (multi-document) – identify redundant info

  • MT evaluation (and recent ideas for MT)

  • Educational applications

KRAQ’05 Workshop: Knowledge and Reasoning for Answering Questions (IJCAI-05)


  • Reasoning aspects:
    • information fusion
    • search criteria expansion models
    • summarization and intensional answers
    • reasoning under uncertainty or with incomplete knowledge

  • Knowledge representation and integration:
    • levels of knowledge involved (e.g. ontologies, domain knowledge)
    • knowledge extraction models and techniques to optimize response accuracy

  • … but similar needs arise for other applications – can entailment provide a common empirical task?

Classical Entailment Definition

  • Chierchia & McConnell-Ginet (2001): A text t entails a hypothesis h if h is true in every circumstance (possible world) in which t is true

  • Strict entailment – does not accommodate the uncertainty that applications allow

“Almost Certain” Entailments

t:The technological triumph known as GPS … was incubated in the mind of Ivan Getting.

h: Ivan Getting invented the GPS.

Applied Textual Entailment

  • A directional relation between two text fragments, Text (t) and Hypothesis (h)

  • Operational (applied) definition: t entails h if, typically, a human reading t would infer that h is most likely true

    • Human gold standard - as in NLP applications

    • Assuming common background knowledge – which is indeed expected from applications!

Probabilistic Interpretation


  • t probabilistically entails h if:

    • P(h is true | t) > P(h is true)

      • t increases the likelihood of h being true

      • ≡ positive PMI – t provides information on h’s truth

  • P(h is true | t): entailment confidence

    • The relevant entailment score for applications

    • In practice: “most likely” entailment expected
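As a toy illustration of this probabilistic criterion (not part of the talk; the mini-corpus and numbers below are invented), both probabilities can be estimated from a small annotated sample and compared:

```python
# Toy estimate of the entailment criterion P(h is true | t) > P(h is true).
# The annotated (text, h-is-true) pairs below are invented for illustration;
# h here is "the Dow rose".
pairs = [
    ("Dow gains 255 points", True),
    ("Dow ends up", True),
    ("Stock market hits a record high", True),
    ("Oil prices fall", False),
    ("Euro slides against dollar", False),
]

def p_h_true(pairs):
    """Prior: fraction of contexts in which h is true."""
    return sum(truth for _, truth in pairs) / len(pairs)

def p_h_given_t(pairs, keyword):
    """Conditional: fraction of h-true cases among texts containing keyword."""
    matching = [truth for text, truth in pairs if keyword in text]
    return sum(matching) / len(matching) if matching else 0.0

prior = p_h_true(pairs)             # P(h is true) = 0.6
conf = p_h_given_t(pairs, "Dow")    # P(h is true | t mentions "Dow") = 1.0
assert conf > prior                 # t raises the likelihood of h: entailment signal
```

The gap between `conf` and `prior` plays the role of the entailment confidence that an application would threshold.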

The Role of Knowledge

  • For textual entailment to hold we require:

    • text AND knowledgeh


    • knowledge should not entail h alone

  • Systems are not supposed to validate h’s truth without utilizing t

PASCAL Recognizing Textual Entailment (RTE) Challenges
EU FP-6 Funded PASCAL NOE, 2004–7

Bar-Ilan University; ITC-irst and CELCT, Trento; MITRE; Microsoft Research

Generic Dataset by Application Use

  • 7 application settings in RTE-1, 4 in RTE-2/3

    • QA

    • IE

    • “Semantic” IR

    • Comparable documents / multi-doc summarization

    • MT evaluation

    • Reading comprehension

    • Paraphrase acquisition

  • Most data created from actual application outputs

  • RTE-2: 800 examples in development and test sets

  • 50-50% YES/NO split

Some Examples

Participation and Impact

  • Very successful challenges, world wide:

    • RTE-1 – 17 groups

    • RTE-2 – 23 groups

      • 30 groups in total


    • RTE-3 underway – 25 groups

      • Joint workshop at ACL-07

  • High interest in the research community

    • Papers, conference sessions and areas, PhD’s, influence on funded projects

    • Textual Entailment special issue at JNLE

    • ACL-07 tutorial

Methods and Approaches (RTE-2)

  • Measure similarity match between t and h (coverage of h by t):

    • Lexical overlap (unigram, N-gram, subsequence)

    • Lexical substitution (WordNet, statistical)

    • Syntactic matching/transformations

    • Lexical-syntactic variations (“paraphrases”)

    • Semantic role labeling and matching

    • Global similarity parameters (e.g. negation, modality)

  • Cross-pair similarity

  • Detect mismatch (for non-entailment)

  • Logical interpretation and inference (vs. matching)
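The shallowest of these methods, lexical overlap, can be sketched in a few lines; the threshold and examples below are illustrative assumptions, not tuned RTE settings:

```python
# Minimal lexical-overlap baseline: coverage of h's tokens by t's tokens.
# Threshold 0.75 is an arbitrary illustrative choice.
def coverage(t: str, h: str) -> float:
    t_tokens = set(t.lower().split())
    h_tokens = set(h.lower().split())
    return len(h_tokens & t_tokens) / len(h_tokens)

def entails(t: str, h: str, threshold: float = 0.75) -> bool:
    return coverage(t, h) >= threshold

t = "Yahoo bought Overture for 1.6 billion dollars"
h = "Yahoo bought Overture"
print(coverage(t, h))                    # 1.0 -> judged YES
print(entails(t, "Overture bought Yahoo"))  # also True!
```

The last call exposes the known weakness of such baselines: bag-of-words overlap ignores directionality and word order, which is what the syntactic and semantic methods above try to repair.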

Dominant Approach: Supervised Learning

  • Features model similarity and mismatch

  • Classifier determines relative weights of information sources

  • Train on development set and auxiliary t-h corpora

Similarity features (lexical, n-gram, syntactic, semantic, global) → feature vector → classifier

Accuracy – Average: 60%; Median: 59%
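A minimal sketch of this feature-based supervised setup, assuming invented toy features (lexical overlap, negation mismatch) and a hand-rolled perceptron standing in for a real classifier; all training data is fabricated for illustration:

```python
# Each t-h pair becomes a small feature vector; a perceptron learns the
# relative weights of the information sources. Features and data are toy.
def features(t: str, h: str):
    t_set, h_set = set(t.lower().split()), set(h.lower().split())
    overlap = len(h_set & t_set) / len(h_set)
    neg_mismatch = 1.0 if (("not" in t_set) != ("not" in h_set)) else 0.0
    return [overlap, neg_mismatch, 1.0]  # last element = bias feature

def train_perceptron(data, epochs=20, lr=0.5):
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for t, h, label in data:          # label: +1 entails, -1 not
            x = features(t, h)
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else -1
            if pred != label:
                w = [wi + lr * label * xi for wi, xi in zip(w, x)]
    return w

train = [
    ("Yahoo bought Overture", "Yahoo bought Overture", 1),
    ("Dow gains 255 points", "Dow gains", 1),
    ("Yahoo did not buy Overture", "Yahoo bought Overture", -1),
    ("Oil prices fall", "Dow gains", -1),
]
w = train_perceptron(train)
score = sum(wi * xi for wi, xi in zip(w, features("Dow climbs 255", "Dow climbs")))
print(score > 0)  # learned weights favor high-overlap, negation-consistent pairs
```

Real RTE systems differ in the richness of the features, not in this overall shape: featurize, weight, threshold.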


  • For the first time: deeper methods (semantic/ syntactic/ logical) clearly outperform shallow methods (lexical/n-gram)

Cf. Kevin Knight’s invited talk at EACL-06, titled:

Isn’t Linguistic Structure Important, Asked the Engineer

  • Still, most systems based on deep analysis did not score significantly better than the lexical baseline


  • System reports point at:

    • Lack of knowledge (syntactic transformation rules, paraphrases, lexical relations, etc.)

    • Lack of training data

  • It seems that systems that coped better with these issues performed best:

    • Hickl et al. - acquisition of large entailment corpora for training

    • Tatu et al. – large knowledge bases (linguistic and world knowledge)

Some Suggested Research Directions

  • Knowledge acquisition

    • Unsupervised acquisition of linguistic and world knowledge from general corpora and web

    • Acquiring larger entailment corpora

    • Manual resources and knowledge engineering

  • Inference

    • Principled framework for inference and fusing information levels

    • Are we happy with bags of features?

Complementary Evaluation Modes

  • Entailment subtasks evaluations

    • Lexical, lexical-syntactic, logical, alignment…

  • “Seek” mode:

    • Input: h and corpus

    • Output: All entailing t’s in corpus

    • Captures information seeking needs, but requires post-run annotation (TREC style)

  • Contribution to specific applications!

    • QA – Harabagiu & Hickl, ACL-06; RE – Romano et al., EACL-06

Our Own Research Directions: Acquisition, Inference, Applications

Learning Entailment Rules

Q: What reduces the risk of Heart Attacks?

Hypothesis: Aspirin reduces the risk of Heart Attacks

Text: Aspirin prevents Heart Attacks

Entailment Rule: X prevent Y ⇨ X reduce risk of Y



Need a large knowledge base of entailment rules

TEASE – Algorithm Flow


Input template:




Sample corpus for input template:

Paula Jones accused Clinton…

Sanhedrin accused St.Paul…

Anchor Set Extraction(ASE)

Anchor sets:

{Paula Jonessubj; Clintonobj}

{Sanhedrinsubj; St.Paulobj}

Template Extraction


Sample corpus for anchor sets:

Paula Jones called Clinton indictable…

St.Paul defended before the Sanhedrin


X call Y indictable; Y defend before X; …
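The two alternating steps above (anchor-set extraction, then template extraction) can be caricatured with plain string matching standing in for TEASE’s actual corpus statistics and ranking; the four-sentence corpus is the slide’s own toy example:

```python
# Highly simplified sketch of the TEASE loop: from an input template,
# collect anchor sets (argument pairs), then harvest new templates from
# other sentences containing the same anchors.
import re

corpus = [
    "Paula Jones accused Clinton",
    "Sanhedrin accused St.Paul",
    "Paula Jones called Clinton indictable",
    "St.Paul defended before the Sanhedrin",
]

def anchor_sets(template_verb):
    """ASE step: sentences matching 'X <verb> Y' yield (X, Y) anchor pairs."""
    pat = re.compile(rf"(.+?) {template_verb} (.+)")
    return [(m.group(1), m.group(2)) for s in corpus if (m := pat.fullmatch(s))]

def extract_templates(anchors):
    """TE step: other sentences containing both anchors yield candidate templates."""
    templates = set()
    for x, y in anchors:
        for s in corpus:
            # skip the sentence that instantiated the input template itself
            if x in s and y in s and s != f"{x} accused {y}":
                templates.add(s.replace(x, "X").replace(y, "Y"))
    return templates

anchors = anchor_sets("accused")
print(anchors)   # [('Paula Jones', 'Clinton'), ('Sanhedrin', 'St.Paul')]
print(extract_templates(anchors))
# {'X called Y indictable', 'Y defended before the X'}
```

The real algorithm works over parsed dependency structures and statistical reliability scores rather than raw strings, but the alternation between anchors and templates is the same.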


Sample of Extracted Anchor-Sets for X prevent Y

Sample of Extracted Templates for X prevent Y

Experiment and Evaluation

  • 48 randomly chosen input verbs

  • 1,392 templates extracted; evaluated by human judgments

    Encouraging Results:

  • Future work: precision, estimate probabilities

Acquiring Lexical Entailment Relations

  • COLING-04, ACL-05: Lexical entailment via distributional similarity

    • Individual features characterize semantic properties

    • Obtain characteristic features via bootstrapping

    • Test characteristic feature inclusion (vs. overlap)

  • COLING-ACL-06: Integrate pattern-based extraction

    • NP such as NP1, NP2, …

    • Complementary information to distributional evidence

    • Integration using ML with minimal supervision (10 words)
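The pattern-based side can be sketched with a Hearst-style regular expression; the pattern and sentence below are illustrative only, and a real system would operate on parsed noun phrases rather than single words:

```python
# Sketch of the pattern-based complement to distributional evidence:
# "NP such as NP1, NP2, ..." suggests that each NPi entails NP
# (hyponym -> hypernym).
import re

SUCH_AS = re.compile(r"(\w+) such as ((?:\w+(?:, )?)+)")

def lexical_entailments(sentence):
    pairs = []
    for m in SUCH_AS.finditer(sentence):
        hypernym = m.group(1)
        for hyponym in m.group(2).split(", "):
            pairs.append((hyponym, hypernym))   # hyponym entails hypernym
    return pairs

s = "The fund invests in companies such as airlines, banks, manufacturers"
print(lexical_entailments(s))
# [('airlines', 'companies'), ('banks', 'companies'), ('manufacturers', 'companies')]
```

Such pattern hits are precise but sparse, which is why the slide integrates them with distributional evidence via machine learning.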

Acquisition Example

  • Top-ranked entailments for “company”:

  • firm, bank, group, subsidiary, unit, business,

  • supplier, carrier, agency, airline, division, giant,

  • entity, financial institution, manufacturer, corporation,

  • commercial bank, joint venture, maker, producer, factory …

  • Does not overlap traditional ontological relations

Initial Probabilistic Lexical Co-occurrence Models

  • Alignment-based (RTE-1 & ACL-05 Workshop)

    • The probability that a term in h is entailed by a particular term in t

  • Bayesian classification (AAAI-05)

    • The probability that a term in h is entailed by (fits in) the entire text of t

    • An unsupervised text categorization setting – each term is a category

  • Demonstrate directions for probabilistic modeling and unsupervised estimation

Manual Syntactic Transformations Example: ‘X prevent Y’

  • Sunscreen, which prevents moles and sunburns, …


Syntactic Variability Phenomena

Template: X activate Y


  • Promising potential for creating huge entailment knowledge bases

    • Mostly by unsupervised approaches

    • Manually encoded

    • Derived from lexical resources

  • Potential for uniform representations, such as entailment rules, for different types of semantic and world knowledge

Inference

  • Goal: infer hypothesis from text

    • Match and apply available entailment knowledge

    • Heuristically bridge inference gaps

  • Our approach: mapping language constructs

    • Vs. semantic interpretation

    • Lexical-syntactic structures as meaning representation

      • Amenable to unsupervised learning

    • Entailment rule transformations over syntactic trees

Application: Unsupervised Relation Extraction (EACL 2006)

Relation Extraction

  • Subfield of Information Extraction

  • Identify different ways of expressing a target relation

    • Examples: Management Succession, Birth - Death, Mergers and Acquisitions, Protein Interaction

  • Traditionally performed in a supervised manner

    • Requires dozens-hundreds examples per relation

    • Examples should cover broad semantic variability

  • Costly – is it feasible?

  • Little work on unsupervised approaches

Our Goals

  • Apply the entailment approach to relation extraction

  • Use relation extraction as an evaluation framework for entailment rule acquisition and matching

Proposed Approach

Input template: X prevent Y
  ↓ Entailment Rule Acquisition
X prevention for Y, X treat Y, X reduce Y
  ↓ Syntactic Matcher
Relation instances: <sunscreen, sunburns>
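A toy rendering of this pipeline, with a hard-coded rule list standing in for acquired entailment rules and a regular expression standing in for the syntactic matcher; the rule set, sentences, and un-inflected surface forms are all illustrative assumptions:

```python
# Sketch of the proposed RE pipeline: expand the input template with
# entailment-rule variants, then match each variant against text.
import re

RULES = {  # toy stand-in for rules acquired (e.g. by TEASE) for the template
    "X prevent Y": ["X prevent Y", "X prevention for Y", "X treat Y", "X reduce Y"],
}

def template_to_regex(template):
    return re.compile(template.replace("X", r"(?P<x>\w+)").replace("Y", r"(?P<y>\w+)"))

def extract_instances(input_template, sentences):
    instances = set()
    for variant in RULES[input_template]:
        pat = template_to_regex(variant)
        for s in sentences:
            if m := pat.search(s):
                instances.add((m.group("x"), m.group("y")))
    return instances

sents = ["sunscreen prevent sunburns", "aspirin reduce risk"]  # toy, un-inflected
print(extract_instances("X prevent Y", sents))
# {('sunscreen', 'sunburns'), ('aspirin', 'risk')}
```

The actual system matches templates against dependency parses, which is what lets it handle the syntactic variability catalogued earlier.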


  • Bunescu 2005

    • Recognizing interactions between annotated protein pairs

    • 200 Medline abstracts

    • Gold standard dataset of protein pairs

  • Input template : X interact with Y

Manual Analysis – Results

  • 93% of interacting protein pairs can be identified with lexical-syntactic templates

Number of templates vs. recall (within 93%):

Frequency of syntactic phenomena:

TEASE Output for X interact with Y

A sample of correct templates learned:

TEASE Algorithm – Potential Recall on Training Set

  • Iterative - taking the top 5 ranked templates as input

  • Morph – recognizing morphological derivations (cf. semantic role labeling vs. matching)

Results for Full System

Error sources:

  • Dependency parser and syntactic matching errors

  • No morphological derivation recognition

  • TEASE limited precision (incorrect templates)

Vs. Supervised Approaches

  • 180 training abstracts

Textual Entailment as a Framework for Investigating Semantics

Classical Approach = Interpretation

Stipulated Meaning Representation (by scholar)


Language (by nature)

  • Logical forms, word senses, semantic roles, named entity types, … - scattered tasks

  • Feasible/suitable framework for applied semantics?

Textual Entailment = Text Mapping

Assumed Meaning (by humans)


Language (by nature)

General Case – Inference





Textual Entailment

  • Entailment mapping is the actual applied goal - but also a touchstone for understanding!

  • Interpretation becomes a possible means

Some Perspectives

  • Issues with interpretation approach:

    • Hard to agree on a representation language

    • Costly to annotate semantic representations for training

  • Textual entailment refers to texts

    • Texts are theory neutral

    • Amenable to unsupervised learning

    • “Proof is in the pudding” test

Opens up a Framework for Investigating Semantic Issues

  • Classical problems can be cast (linguistics)

    • All boys are nice ⇒ All tall boys are nice

      But also…

  • A new slant at old problems

  • Exposing many new ones

Making Sense of (Implicit) Senses

  • What is the RIGHT set of senses?

    • Any concrete set is problematic/subjective

    • … but WSD forces you to choose one

  • A lexical entailment perspective:

    • Instead of identifying an explicitly stipulated sense of a word occurrence …

    • identify whether a word occurrence (i.e. its implicit sense) entails another word occurrence, in context

    • ACL-2006

Lexical Matching for Applications

Q: announcement of new models of chairs

  • Sense equivalence

T1: IKEA announced a new comfort chair

T2: MIT announced a new CS chair position

  • Sense entailment in substitution

Q: announcement of new models of furniture

T1: IKEA announced a new comfort chair

T2: MIT announced a new CS chair position

Synonym Substitution

Source = record; Target = disc

This is anyway a stunning disc, thanks to the playing of the Moscow Virtuosi with Spivakov.

He said computer networks would not be affected and copies of information should be made on floppy discs.

Before the dead soldier was placed in the ditch his personal possessions were removed, leaving one disc on the body for identification purposes.




Investigated Methods

  • Matching: indirect / direct;  Learning: supervised / unsupervised;  Task: classification / ranking

Unsupervised Direct: kNN-Ranking

  • Test example score: average cosine similarity of the target example with its k most similar instances of the source word

  • Rationale:

    • positive examples of target will be similar to some source occurrence (of corresponding sense)

    • negative examples won’t be similar to source

  • Rank test examples by score

    • A classification slant on language modeling
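A minimal sketch of kNN-ranking with bag-of-words cosine similarity; the context vectors, word choices, and k below are invented for illustration:

```python
# Score a target occurrence by its average cosine similarity to the k
# nearest source occurrences; vectors are raw context word counts.
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_score(target_ctx, source_ctxs, k=2):
    sims = sorted((cosine(Counter(target_ctx.split()), Counter(s.split()))
                   for s in source_ctxs), reverse=True)
    return sum(sims[:k]) / k

# Source word "record" (music sense); targets: two occurrences of "disc"
source = ["stunning playing moscow virtuosi", "album playing orchestra"]
music_disc = "stunning playing virtuosi"
floppy_disc = "computer networks copies information"
assert knn_score(music_disc, source) > knn_score(floppy_disc, source)
```

Ranking test occurrences by this score realizes the rationale above: targets whose implicit sense matches the source cluster near some source instance, while mismatched senses score near zero.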

Results (for Synonyms): Ranking

  • kNN improves precision by 8–18% at recall levels up to 25%

Other Projected and New Problems

  • Named Entity Classification – by any textual type

    • Which pickup trucks are produced by Mitsubishi?  Magnum ⇒ pickup truck

  • Lexical semantic relationships (e.g. Wordnet)

    • Which relations contribute to entailment inference? How?

  • Semantic role mapping (vs. labeling)

  • Recognize transparent heads

  • Topical entailment – entailing textually defined topics

Textual Entailment as Goal

  • The essence of our proposal:

    • Formulate various semantic problems as entailment tasks

    • Base applied inference on entailment “engines” and KBs

  • Interpretations and mapping methods may compete

  • Open question: which inference

    • can be represented at language level?

    • requires logical or specialized representation and inference? (temporal, spatial, mathematical, …)

Meeting the Knowledge Challenge – by a Coordinated Effort?

  • A vast amount of “entailment rules” needed

  • Speculation: is it possible to have a public effort for knowledge acquisition?

    • Simple, uniform representations

    • Assuming mostly automatic acquisition (millions of rules?)

    • Human Genome Projectanalogy

  • Preliminary: RTE-3 Resources Pool at ACLWiki

Textual Entailment ≈ Human Reading Comprehension

  • From a children’s English learning book (Sela and Greenberg):

    Reference Text:“…The Bermuda Triangle lies in the Atlantic Ocean, off the coast of Florida. …”

    Hypothesis (True/False?):The Bermuda Triangle is near the United States


Optimistic Conclusions: Textual Entailment…

is a promising framework for applied semantics:

  • Defines new semantic problems to work on

  • May be modeled probabilistically

  • Appealing potential for knowledge acquisition

Thank you!