Normalized alignment of dependency trees for detecting textual entailment

Erwin Marsi & Emiel Krahmer

Tilburg University

Wauter Bosma & Mariët Theune

University of Twente

Basic idea

  • A true hypothesis is included in the text, allowing omission and rephrasing

    Text: The Rolling Stones kicked off their latest tour on Sunday with a concert at Boston's Fenway Park.

    Hypothesis: The Rolling Stones have begun their latest tour with a concert in Boston.


  • Omissions:

    • on Sunday

    • Fenway Park

  • Paraphrases:

    • kicked off → begun

    • Boston's Fenway Park → Boston

RTE2 Workshop

Matching surface words alone is not sufficient

  • Variation in surface realization → a perfect word match is no guarantee for entailment

  • Using syntactic analysis

    • for syntactic normalization

    • to match on hierarchical relations among constituents

      Example: “He became a boxing referee in 1964, and became well-known […]” does not entail “He became well-known in 1964”, although every hypothesis word occurs in the text

Preprocessing

  • Input: T-H pairs in XML

  • Processing pipeline:

    • Sentence splitting, MXTERMINATOR (Reynar & Ratnaparkhi, 1997)

    • Tokenization, Penn Treebank SED script

    • POS tagging with PTB POS tags using Mbt (van den Bosch et al.)

    • Lemmatizing using memory-based learning (van den Bosch et al.)

    • Dependency parsing using MaltParser trained on PTB (Nivre & Scholz, 2004)

    • Syntactic normalization

  • Output: T-H dependency tree(s) pairs in XML

Syntactic normalization

  • Three types of syntactic normalization:

    • Auxiliary reduction

    • Passive to active form

    • Copula reduction

Auxiliary reduction

  • Auxiliaries of progressive and perfective tense are removed

  • Their children are attached to the remaining content verb

  • The same goes for modal verbs, and for do in the do-support function.

    Example: “demand for ivory has dropped” → “demand for ivory dropped”

    Example: “legalization does not solve any social problems” → “legalization not solves any social problems”
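
The reduction can be sketched on a toy dependency tree. This is only an illustration: the Node class, the relation labels ("sbj", "vc") and the AUXILIARIES set are assumptions made for the example, not the representation used in the system described here.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    word: str
    rel: str = "root"                      # relation to the parent (hypothetical labels)
    children: list = field(default_factory=list)

# Hypothetical closed list of auxiliaries, modals and do-support forms.
AUXILIARIES = {"has", "have", "had", "is", "are", "was", "were",
               "do", "does", "did", "will", "can", "may", "must"}

def reduce_auxiliary(node):
    """Remove an auxiliary head and attach its other children to the content verb."""
    node.children = [reduce_auxiliary(c) for c in node.children]
    verb = next((c for c in node.children if c.rel == "vc"), None)
    if node.word in AUXILIARIES and verb is not None:
        # Children of the auxiliary (e.g. the subject) move to the content verb.
        verb.children = [c for c in node.children if c is not verb] + verb.children
        verb.rel = node.rel
        return verb
    return node

# "demand for ivory has dropped", with the auxiliary as head of the verb group
tree = Node("has", "root", [
    Node("demand", "sbj", [Node("for", "nmod", [Node("ivory", "pmod")])]),
    Node("dropped", "vc"),
])
reduced = reduce_auxiliary(tree)
print(reduced.word, [c.word for c in reduced.children])   # dropped ['demand']
```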

Passive to active form

  • The passive form auxiliary is removed

  • The original subject becomes object

  • Where possible, a by-phrase becomes the subject

    Example: “Ahmedinejad was attacked by the US” → “the US attacked Ahmedinejad”
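
A minimal sketch of this rewrite on a toy tree follows. It assumes the caller has already identified the head as a passive auxiliary; the Node class and the labels "sbj", "obj" and "vc" are hypothetical, chosen only for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    word: str
    rel: str = "root"                      # relation to the parent (hypothetical labels)
    children: list = field(default_factory=list)

def passive_to_active(aux):
    """Rewrite 'X was V-ed by Y' as 'Y V-ed X'; aux is assumed to be a passive auxiliary."""
    verb = next((c for c in aux.children if c.rel == "vc"), None)
    subj = next((c for c in aux.children if c.rel == "sbj"), None)
    if verb is None or subj is None:
        return aux                         # not the pattern this sketch handles
    by = next((c for c in verb.children if c.word == "by" and c.children), None)
    rest = [c for c in aux.children if c is not verb and c is not subj]
    others = [c for c in verb.children if c is not by]
    subj.rel = "obj"                       # the original subject becomes the object
    if by is not None:
        agent = by.children[0]             # the by-phrase complement becomes the subject
        agent.rel = "sbj"
        verb.children = [agent] + others + rest + [subj]
    else:
        verb.children = others + rest + [subj]
    verb.rel = aux.rel                     # the passive auxiliary itself is dropped
    return verb

tree = Node("was", "root", [
    Node("Ahmedinejad", "sbj"),
    Node("attacked", "vc", [Node("by", "lgs", [Node("US", "pmod", [Node("the", "nmod")])])]),
])
active = passive_to_active(tree)
print(active.word, [(c.word, c.rel) for c in active.children])
```

When no by-phrase is present, the sketch leaves the verb without a subject, which corresponds to the "where possible" in the slide.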

Copula reduction

  • Copular verbs are removed by attaching the predicate as a daughter to the subject

    Example: “Microsoft Corp. is a partner of Intel Corp.” → “Microsoft Corp., a partner of Intel Corp.”
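
As a rough illustration, the copula reduction might look as follows on a toy tree. The Node class, the labels "sbj" and "prd", and the COPULAS set are assumptions for the example; keeping any remaining children of the copula on the subject is also a choice made here, not something the slides specify.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    word: str
    rel: str = "root"                      # relation to the parent (hypothetical labels)
    children: list = field(default_factory=list)

COPULAS = {"is", "are", "was", "were", "be", "been"}   # hypothetical list

def reduce_copula(node):
    """Drop a copular verb and attach its predicate as a daughter of the subject."""
    subj = next((c for c in node.children if c.rel == "sbj"), None)
    pred = next((c for c in node.children if c.rel == "prd"), None)
    if node.word not in COPULAS or subj is None or pred is None:
        return node
    rest = [c for c in node.children if c is not subj and c is not pred]
    subj.children = subj.children + [pred] + rest
    subj.rel = node.rel                    # the subject takes the copula's place
    return subj

tree = Node("is", "root", [
    Node("Microsoft", "sbj"),
    Node("partner", "prd", [Node("a", "nmod"), Node("of", "nmod", [Node("Intel", "pmod")])]),
])
reduced = reduce_copula(tree)
print(reduced.word, [c.word for c in reduced.children])   # Microsoft ['partner']
```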

Alignment of dependency trees

  • Tree alignment algorithm based on Meyers, Yangarber and Grishman (1996)

  • Searches for an optimal alignment of the nodes of the text tree to the nodes of the hypothesis tree

  • Tree alignment is a function of:

    • how well the words of the two nodes match

    • recursively, the weighted alignment score for each of the aligned daughter nodes
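
The recursive scoring idea can be sketched as below. This is a simplified stand-in, not the Meyers et al. algorithm: each hypothesis daughter independently picks its best text daughter (so two of them may pick the same text node), the combination is a plain average, and the names match_here, best_alignment and skip_penalty=0.1 are assumptions for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    word: str
    children: list = field(default_factory=list)

def exact_match(wt, wh):
    """Placeholder word matcher: 1.0 for identical words, else 0.0."""
    return 1.0 if wt.lower() == wh.lower() else 0.0

def match_here(t, h, word_match, skip_penalty):
    """Score for aligning text node t directly with hypothesis node h."""
    own = word_match(t.word, h.word)
    if own == 0.0:
        return 0.0
    # Each hypothesis daughter takes the best-scoring text daughter.
    daughters = [
        max((best_alignment(td, hd, word_match, skip_penalty) for td in t.children),
            default=0.0)
        for hd in h.children
    ]
    return (own + sum(daughters)) / (1 + len(daughters))

def best_alignment(t, h, word_match, skip_penalty=0.1):
    """Best score for aligning h with t or, at a penalty, with a node below t."""
    below = max((best_alignment(td, h, word_match, skip_penalty) for td in t.children),
                default=0.0)
    return max(match_here(t, h, word_match, skip_penalty), below - skip_penalty)

text = Node("began", [
    Node("Stones", [Node("The"), Node("Rolling")]),
    Node("tour", [Node("their"), Node("latest")]),
    Node("on", [Node("Sunday")]),
])
hypothesis = Node("began", [Node("Stones"), Node("tour")])
print(best_alignment(text, hypothesis, exact_match))
```

Text nodes without a counterpart in the hypothesis (here "on Sunday") cost nothing, which reflects the idea that omissions are allowed; skipping a level of the text tree to find a match costs skip_penalty.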

Word matching

  • function WordMatch(wt,wh) -> [0,1] maps text-hypothesis word pairs to a similarity score

  • returns 1 if any of the following holds:

    • wt is identical to wh

    • the lemma of wt is identical to the lemma of wh

    • wt is a synonym of wh (lookup in EuroWordnet with lemma & POS)

    • wh is a hypernym of wt (idem)

  • otherwise returns the similarity from Lin’s automatically derived dependency-based thesaurus, if it exceeds 0.1

  • otherwise returns 0

  • also match on phrasal verbs

    • e.g. “kick off” is a synonym of “begin”
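
The cascade above can be sketched with small in-memory tables standing in for the lexical resources. Everything in LEMMAS, SYNONYMS, HYPERNYMS and THESAURUS is made-up toy data, not output of EuroWordnet or Lin's thesaurus, and the sketch ignores POS tags.

```python
# Toy stand-ins for the lemmatizer, EuroWordnet and Lin's thesaurus (all hypothetical).
LEMMAS = {"begun": "begin", "kicked off": "kick off", "dogs": "dog"}
SYNONYMS = {("kick off", "begin")}
HYPERNYMS = {("dog", "animal")}            # (hyponym, hypernym)
THESAURUS = {("concert", "show"): 0.25, ("tour", "chair"): 0.05}
MIN_SIM = 0.1

def lemma(word):
    return LEMMAS.get(word.lower(), word.lower())

def word_match(wt, wh):
    """Map a text word wt and a hypothesis word wh to a similarity in [0, 1]."""
    lt, lh = lemma(wt), lemma(wh)
    if wt == wh or lt == lh:
        return 1.0
    if (lt, lh) in SYNONYMS or (lh, lt) in SYNONYMS:
        return 1.0
    if (lt, lh) in HYPERNYMS:              # wh is a hypernym of wt, not the reverse
        return 1.0
    sim = THESAURUS.get((lt, lh), THESAURUS.get((lh, lt), 0.0))
    return sim if sim > MIN_SIM else 0.0

print(word_match("kicked off", "begun"))   # 1.0
print(word_match("concert", "show"))       # 0.25
```

Note the asymmetry of the hypernym test: a more general hypothesis word matches a more specific text word, but not the other way around.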

Alignment example

Text: The development of agriculture by early humans, roughly 10,000 years ago, was also harmful to many natural ecosystems as they were systematically destroyed and replaced with artificial versions.

Hypothesis: Humans existed 10,000 years ago.


Alignment example (cont’d)

Entailment prediction

  • Prediction rule:

    IF top node of the hypothesis is aligned AND score > threshold

    THEN entailment = true

    ELSE entailment = false

  • Threshold and parameters of tree alignment algorithm (skip penalty) optimized per task
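
The rule and a possible way of tuning the threshold can be sketched as follows. The slides do not say how the optimization was done; the grid search over candidate values on labelled development pairs, and the names predict_entailment and tune_threshold, are assumptions for the example, as are the numbers in the data.

```python
def predict_entailment(top_node_aligned, score, threshold):
    """Entailment holds if the hypothesis root is aligned and the score exceeds the threshold."""
    return top_node_aligned and score > threshold

def tune_threshold(examples, candidates):
    """Pick the candidate threshold with the highest accuracy on labelled examples.

    examples: list of (top_node_aligned, score, gold_label) triples.
    """
    def accuracy(threshold):
        correct = sum(predict_entailment(aligned, score, threshold) == gold
                      for aligned, score, gold in examples)
        return correct / len(examples)
    return max(candidates, key=accuracy)

# Made-up development data for one task
dev = [(True, 0.9, True), (True, 0.4, False), (False, 0.95, False), (True, 0.7, True)]
best = tune_threshold(dev, [0.3, 0.5, 0.8])
print(best)                                # 0.5
```

Running the same search separately on the development pairs of each task would give the per-task thresholds mentioned above.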

Results

Percentage entailment accuracy (n=800)

Problems

  • Many parses contain errors due to syntactic ambiguity and propagation of

    • Spelling errors

    • Tokenization errors

    • POS errors

    • broken dependency trees

  • Consequently, syntactic normalization & alignment failed

  • Dependency relations did not help

Discussion & Conclusion

  • There are many forms of textual entailment that we cannot recognize automatically...

    • Paraphrasing

    • Co-reference resolution

    • Ellipsis

    • Condition/modality

    • Inference

    • Common sense / world knowledge

  • RTE requires a combination of deep NLP, common sense knowledge and reasoning
