Normalized alignment of dependency trees for detecting textual entailment
Erwin Marsi & Emiel Krahmer

Tilburg University

Wauter Bosma & Mariët Theune

University of Twente

Basic idea

  • A true hypothesis is included in the text, allowing omission and rephrasing

    Text: The Rolling Stones kicked off their latest tour on Sunday with a concert at Boston's Fenway Park.

    Hypothesis: The Rolling Stones have begun their latest tour with a concert in Boston.


  • Omissions:

    • on Sunday

    • Fenway Park

  • Paraphrases:

    • kicked off → begun

    • Boston's Fenway Park → Boston

RTE2 Workshop

Matching surface words alone is not sufficient...

  • Variation in surface realization → perfect word match is no guarantee for entailment

  • Using syntactic analysis

    • for syntactic normalization

    • to match on hierarchical relations among constituents

      Example: “He became a boxing referee in 1964, and became well-known […]” does not entail “He became well-known in 1964”


Preprocessing

  • Input: T-H pairs in XML

  • Processing pipeline:

    • Sentence splitting, MXTERMINATOR (Reynar & Ratnaparkhi, 1997)

    • Tokenization, Penn Treebank SED script

    • POS tagging with PTB POS tags using Mbt (van den Bosch et al.)

    • Lemmatizing using memory-based learning (van den Bosch et al.)

    • Dependency parsing using Maltparser trained on PTB (Nivre & Scholz, 2004)

    • Syntactic normalization

  • Output: T-H dependency tree(s) pairs in XML


Syntactic Normalization

  • Three types of syntactic normalization:

    • Auxiliary reduction

    • Passive to active form

    • Copula reduction


Auxiliary Reduction

  • Auxiliaries of progressive and perfective tense are removed

  • Their children are attached to the remaining content verb

  • The same goes for modal verbs, and for do in the do-support function.

    Example: “demand for ivory has dropped” → “demand for ivory dropped”

    Example: “legalization does not solve any social problems” → “legalization not solves any social problems”
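
The reduction above can be sketched on a toy dependency tree. This is a minimal sketch, not the authors' implementation: the Node class, the relation labels ("sbj", "vc", ...) and the small AUXILIARIES list are assumptions made for illustration. Forms of "be" are left out because a real system would first have to tell progressive "be" apart from passive and copular "be". The sketch also does not move inflection onto the content verb (the "solve" → "solves" step in the second example).

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    word: str
    rel: str = "root"                       # relation to the parent (hypothetical labels)
    children: list = field(default_factory=list)

# Toy list: perfective "have", do-support and a few modals (assumption).
AUXILIARIES = {"has", "have", "had", "do", "does", "did",
               "can", "could", "will", "would", "may", "might", "must"}

def reduce_auxiliaries(node):
    """Remove auxiliary heads and re-attach their children to the content verb."""
    node.children = [reduce_auxiliaries(c) for c in node.children]
    if node.word.lower() in AUXILIARIES:
        # "vc" (verb chain) is assumed to link an auxiliary to its verb.
        verb = next((c for c in node.children if c.rel == "vc"), None)
        if verb is not None:
            others = [c for c in node.children if c is not verb]
            verb.children = others + verb.children
            verb.rel = node.rel             # the verb takes over the auxiliary's position
            return verb
    return node

# "demand for ivory has dropped"
tree = Node("has", "root", [
    Node("demand", "sbj", [Node("for", "nmod", [Node("ivory", "pmod")])]),
    Node("dropped", "vc"),
])
reduced = reduce_auxiliaries(tree)
print(reduced.word, [c.word for c in reduced.children])
```

Because children are processed before their parent, chains such as "would have dropped" collapse to the content verb in a single pass.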


Passive to Active Form

  • The passive form auxiliary is removed

  • The original subject becomes object

  • Where possible, a by-phrase becomes the subject

    Example: “Ahmedinejad was attacked by the US” → “the US attacked Ahmedinejad”
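
A rough sketch of the three steps above, under stated assumptions: the Node class and the labels "sbj", "obj" and "vc" are hypothetical, the caller is assumed to have already decided that the node is a passive auxiliary, and the by-phrase is assumed to hang under the participle. It is an illustration, not the authors' code.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    word: str
    rel: str = "root"                       # relation to the parent (hypothetical labels)
    children: list = field(default_factory=list)

def passive_to_active(aux):
    """Rewrite a passive auxiliary node (assumed to be one) into active form."""
    verb = next((c for c in aux.children if c.rel == "vc"), None)
    if verb is None:
        return aux
    verb.rel = aux.rel                      # step 1: the auxiliary is removed
    moved = [c for c in aux.children if c is not verb]
    for child in moved:
        if child.rel == "sbj":
            child.rel = "obj"               # step 2: original subject becomes object
    by = next((c for c in verb.children
               if c.word.lower() == "by" and c.children), None)
    if by is not None:                      # step 3: by-phrase noun becomes subject
        agent = by.children[0]
        agent.rel = "sbj"
        verb.children = [agent if c is by else c for c in verb.children]
    verb.children = moved + verb.children
    return verb

# "Ahmedinejad was attacked by the US"
tree = Node("was", "root", [
    Node("Ahmedinejad", "sbj"),
    Node("attacked", "vc", [Node("by", "adv", [Node("US", "pmod", [Node("the", "nmod")])])]),
])
active = passive_to_active(tree)
print(active.word, [(c.rel, c.word) for c in active.children])
```

When no by-phrase is present the verb simply ends up without a subject, which matches the "where possible" wording of the slide.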


Copula Reduction

  • Copular verbs are removed by attaching the predicate as a daughter to the subject

    Example: “Microsoft Corp. is a partner of Intel Corp.” → “Microsoft Corp., a partner of Intel Corp.”
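
Copula reduction might look as follows on a toy tree. The Node class, the "sbj" and "prd" labels and the COPULAS list are assumptions for illustration. Attaching any other children of the copula to the subject as well is a choice of this sketch; the slide only mentions the predicate.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    word: str
    rel: str = "root"                       # relation to the parent (hypothetical labels)
    children: list = field(default_factory=list)

COPULAS = {"am", "is", "are", "was", "were", "be", "been"}   # toy list (assumption)

def reduce_copula(node):
    """Drop a copular verb and hang its predicate under the subject."""
    if node.word.lower() not in COPULAS:
        return node
    subj = next((c for c in node.children if c.rel == "sbj"), None)
    pred = next((c for c in node.children if c.rel == "prd"), None)
    if subj is None or pred is None:
        return node                         # not a copular construction we recognise
    subj.rel = node.rel                     # the subject takes over the copula's position
    # Predicate (and, in this sketch, any other child) becomes a daughter of the subject.
    subj.children.extend(c for c in node.children if c is not subj)
    return subj

# "Microsoft Corp. is a partner of Intel Corp."
tree = Node("is", "root", [
    Node("Microsoft Corp.", "sbj"),
    Node("partner", "prd", [Node("a", "nmod"),
                            Node("of", "nmod", [Node("Intel Corp.", "pmod")])]),
])
reduced = reduce_copula(tree)
print(reduced.word, [c.word for c in reduced.children])
```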


Alignment of Dependency Trees

  • Tree alignment algorithm based on (Meyers, Yangarber and Grishman, 1996)

  • Searches for an optimal alignment of the nodes of the text tree to the nodes of the hypothesis tree

  • Tree alignment is a function of:

    • how well the words of the two nodes match

    • recursively, the weighted alignment score for each of the aligned daughter nodes
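
To make the recursive definition concrete, here is a deliberately simplified scoring sketch. It is not the Meyers, Yangarber and Grishman algorithm: it picks the best text daughter for each hypothesis daughter greedily, does not enforce one-to-one alignments, and has no skip penalty. The Node class, the function names and the weight w_node are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    word: str
    children: list = field(default_factory=list)

def align_score(t, h, word_match, w_node=0.5):
    """Score for aligning text node t with hypothesis node h."""
    node_score = word_match(t.word, h.word)
    if not h.children:
        return node_score
    # For every hypothesis daughter, take the best-scoring text daughter.
    daughters = [
        max((align_score(tc, hc, word_match, w_node) for tc in t.children), default=0.0)
        for hc in h.children
    ]
    return w_node * node_score + (1 - w_node) * sum(daughters) / len(daughters)

def iter_nodes(node):
    yield node
    for child in node.children:
        yield from iter_nodes(child)

def best_alignment(text_root, hyp_root, word_match):
    """Search all text nodes for the best match of the hypothesis top node."""
    return max(((align_score(t, hyp_root, word_match), t) for t in iter_nodes(text_root)),
               key=lambda pair: pair[0])

exact = lambda a, b: 1.0 if a == b else 0.0      # stand-in for a real word matcher
text = Node("said", [Node("kicked", [Node("Stones"), Node("tour", [Node("latest")])])])
hyp = Node("kicked", [Node("Stones"), Node("Boston")])
score, node = best_alignment(text, hyp, exact)
print(score, node.word)
```

Here the hypothesis daughter "Boston" finds no counterpart in the text, which lowers the score of the otherwise matching "kicked" subtree.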


Word Matching

  • function WordMatch(wt,wh) -> [0,1] maps text-hypothesis word pairs to a similarity score

  • returns 1 if

    • wt is identical to wh

    • the lemma of wt is identical to the lemma of wh

    • wt is a synonym of wh (lookup in EuroWordnet with lemma & POS)

    • wh is a hypernym of wt (idem)

  • returns similarity from automatically derived thesaurus if > 0.1 (Lin’s dependency-based thesaurus)

  • otherwise returns 0

  • also match on phrasal verbs

    • e.g. “kick off” is a synonym of “begin”
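
The cascade described above can be sketched with small in-memory tables standing in for EuroWordNet and Lin's thesaurus. All table contents, including the similarity values, are invented for illustration, and POS information is ignored to keep the sketch short.

```python
# Toy stand-ins for the real resources (all entries are made up).
LEMMAS = {"begun": "begin", "kicked off": "kick off", "dogs": "dog"}
SYNONYMS = {("kick off", "begin")}
HYPERNYMS = {("dog", "animal")}                       # (hyponym, hypernym)
THESAURUS = {("tour", "concert"): 0.15, ("tour", "park"): 0.05}

def lemma(word):
    return LEMMAS.get(word.lower(), word.lower())

def word_match(wt, wh):
    """Map a text word wt and a hypothesis word wh to a score in [0, 1]."""
    if wt == wh:
        return 1.0
    lt, lh = lemma(wt), lemma(wh)
    if lt == lh:
        return 1.0
    if (lt, lh) in SYNONYMS or (lh, lt) in SYNONYMS:
        return 1.0
    if (lt, lh) in HYPERNYMS:                         # wh is a hypernym of wt
        return 1.0
    sim = THESAURUS.get((lt, lh), THESAURUS.get((lh, lt), 0.0))
    return sim if sim > 0.1 else 0.0                  # thesaurus score only above 0.1

print(word_match("kicked off", "begun"))   # phrasal verb matched as a synonym
print(word_match("dogs", "animal"))        # hypothesis word is a hypernym
print(word_match("animal", "dogs"))        # the reverse direction does not match
```

Note the asymmetry of the hypernym test: a more general hypothesis word is accepted, a more specific one is not.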


Alignment example

Text: The development of agriculture by early humans, roughly 10,000 years ago, was also harmful to many natural ecosystems as they were systematically destroyed and replaced with artificial versions.

Hypothesis: Humans existed 10,000 years ago.



Alignment example (cont’d)


Entailment prediction

  • Prediction rule:

    IF top node of the hypothesis is aligned AND score > threshold

    THEN entailment = true

    ELSE entailment = false

  • Threshold and parameters of tree alignment algorithm (skip penalty) optimized per task
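
The prediction rule and a per-task threshold search could be sketched as below. The grid search over candidate thresholds is one plausible way to optimize the threshold, not necessarily the procedure used for these slides; the function names and the four development examples are made up for illustration.

```python
def predict_entailment(top_aligned, score, threshold):
    """Prediction rule: top hypothesis node aligned AND score above threshold."""
    return top_aligned and score > threshold

def best_threshold(examples, candidates):
    """Pick the candidate threshold with the highest accuracy.

    examples: list of (top_aligned, score, gold_label) tuples for one task.
    """
    def accuracy(threshold):
        hits = sum(predict_entailment(a, s, threshold) == gold
                   for a, s, gold in examples)
        return hits / len(examples)
    return max(candidates, key=accuracy)

# Invented development data for a single task.
dev = [(True, 0.9, True), (True, 0.6, True), (True, 0.4, False), (False, 0.8, False)]
threshold = best_threshold(dev, [0.3, 0.5, 0.7])
print(threshold, predict_entailment(True, 0.65, threshold))
```

Running the search separately on each task's development pairs gives one threshold per task, in line with the bullet above.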


Results

Percentage entailment accuracy (n=800)


Problems

  • Many parses contain errors due to syntactic ambiguity and propagation of

    • Spelling errors

    • Tokenization errors

    • POS errors

    • broken dependency trees

  • Consequently, syntactic normalization & alignment failed

  • Dependency relations did not help


Discussion & Conclusion

  • There are many forms of textual entailment that we cannot recognize automatically...

    • Paraphrasing

    • Co-reference resolution

    • Ellipsis

    • Condition/modality

    • Inference

    • Common sense / world knowledge

  • RTE requires a combination of deep NLP, common sense knowledge and reasoning
