
Textual Entailment as Syntactic Graph Distance: a rule based and a SVM based approach

Fabio Massimo Zanzotto (Dipartimento Informatica Sistemistica e Comunicazione, Università di Milano Bicocca, Italy), Maria Teresa Pazienza and Marco Pennacchiotti

Presentation Transcript


  1. Textual Entailment as Syntactic Graph Distance: a rule based and a SVM based approach. Fabio Massimo Zanzotto, Dipartimento Informatica Sistemistica e Comunicazione, Università di Milano Bicocca, Italy. Maria Teresa Pazienza and Marco Pennacchiotti, Department of Computer Science, Systems and Production, University of Roma “Tor Vergata”.

  2. Classifying Textual Entailment (TE): two dimensions. Semantic dimension • paraphrasing (i.e., synonymy) • strict entailment. Recognition dimension • semantic subsumption: "American Airlines will lay off ..." → "American Airlines will fire ..." • syntactic subsumption: "American Airlines began laying off hundreds of flight attendants on Tuesday" → "American Airlines will fire hundreds of flight attendants" • direct implication: "American Airlines will fire flight attendants" → "hundreds of flight attendants will lose their jobs"

  3. Recognizing Textual Entailment (TE): semantic subsumption and syntactic subsumption between T and H (graph figure not preserved in the transcript). TE is a Graph Matching problem!

  4. Graph Matching (GM): GM is used, for instance, in Image Recognition. One problem: distortions in the input graphs!

  5. Textual Entailment as Graph Matching (GM): known limitations • distortion in the input syntactic/semantic graphs (errors in parsing, word sense disambiguation, etc.) • matching nodes is more complex than simple label matching • syntactic transformations (nominalization, passivization, argument movement, ...) should leave the match invariant • the textual entailment relation is an asymmetric relation. Hence: a Textual Entailment Measure

  6. What’s next. Step 1 • Definition of the syntactic representation model (Extended Dependency Graph, XDG). Step 2: Rule-based Approach • Definition of the Graph Matching measure for the textual entailment relation. Step 3: SVM-based Approach • Using an SVM to estimate the parameters of the Graph Matching measure. Step 4 • Preliminary analysis of the results on the development set

  7. Extended Dependency Graph (XDG) • C are constituents • syntactic head • potential semantic governor • D are dependencies among constituents

  8. GM on XDG: definitions • Isomorphic subsumption if two bijective functions fc and fd exist • Subgraph isomorphic subsumption if it exists so that • Maximal Common Subsumption Subgraph (MCSS) given and , is the MCSS if and then

  9. Finding the bijective function and evaluating the measure • Step 1: Constituent matching (fc: Ch → Ct bijective) • Step 2: Dependency matching (fd: Dh → Dt bijective) • Step 3: Define MCSS using fc and fd • Step 4: Evaluate Similarity Measure on MCSS
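The four steps on slide 9 can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: constituents are plain strings, dependencies are (governor, dependent) pairs, two constituents are assumed to match only when their lowercased text is equal, and the function names `match_constituents`, `match_dependencies` and `coverage` are hypothetical. The anchored constituents and dependencies stand in for the MCSS, and the two coverage ratios stand in for the similarity measure, whose actual formula is not preserved in the transcript.

```python
from typing import Dict, List, Set, Tuple

Dependency = Tuple[str, str]  # (governor constituent, dependent constituent)


def match_constituents(ch: List[str], ct: List[str]) -> Dict[str, str]:
    """Step 1: greedy one-to-one anchoring of H constituents onto T constituents."""
    fc: Dict[str, str] = {}
    used: Set[str] = set()
    for h in ch:
        for t in ct:
            # Assumption: constituents match when their lowercased text is equal.
            if t not in used and h.lower() == t.lower():
                fc[h] = t
                used.add(t)
                break
    return fc


def match_dependencies(
    dh: List[Dependency], dt: List[Dependency], fc: Dict[str, str]
) -> Dict[Dependency, Dependency]:
    """Step 2: anchor an H dependency when both ends are anchored and T links their images."""
    dt_set = set(dt)
    fd: Dict[Dependency, Dependency] = {}
    for gov, dep in dh:
        if gov in fc and dep in fc and (fc[gov], fc[dep]) in dt_set:
            fd[(gov, dep)] = (fc[gov], fc[dep])
    return fd


def coverage(
    ch: List[str],
    dh: List[Dependency],
    fc: Dict[str, str],
    fd: Dict[Dependency, Dependency],
) -> Tuple[float, float]:
    """Steps 3-4: treat the anchored part as the common subgraph and report how much of H it covers."""
    c_cov = len(fc) / len(ch) if ch else 0.0
    d_cov = len(fd) / len(dh) if dh else 0.0
    return c_cov, d_cov


# Toy graphs loosely based on the American Airlines example from slide 2.
ch = ["American Airlines", "will fire", "flight attendants"]
dh = [("will fire", "American Airlines"), ("will fire", "flight attendants")]
ct = ["American Airlines", "will fire", "flight attendants", "on Tuesday"]
dt = dh + [("will fire", "on Tuesday")]

fc = match_constituents(ch, ct)
fd = match_dependencies(dh, dt, fc)
print(coverage(ch, dh, fc, fd))
```

Coverage is computed relative to H only, which keeps the score asymmetric, in line with the limitation noted on slide 5.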

  10. Constituent Similarity • Degree of similarity between constituents of T (t) and H (h) (formula not preserved in the transcript) • Parameter Box: a

  11. AL Dependency Similarity • Degree of similarity (formula not preserved in the transcript) • Parameter Box: a

  12. Textual Entailment Measure: combines the constituent and dependency similarities (formula not preserved in the transcript). Finally: textual entailment holds if the measure > t • Parameter Box: a, d, t
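The decision rule on slide 12 can be illustrated with a small sketch. The transcript does not preserve how the constituent and dependency scores are combined, so the convex combination below, the `weight` parameter and both function names are assumptions made only for illustration; the one element taken from the slide is the final test, a strict comparison against a threshold.

```python
def entailment_measure(
    constituent_score: float, dependency_score: float, weight: float = 0.5
) -> float:
    """Hypothetical combination: a convex mix of the two scores (not the slide's formula)."""
    return weight * dependency_score + (1.0 - weight) * constituent_score


def entails(
    constituent_score: float,
    dependency_score: float,
    threshold: float,
    weight: float = 0.5,
) -> bool:
    """Entailment holds when the combined measure is strictly above the threshold."""
    return entailment_measure(constituent_score, dependency_score, weight) > threshold


print(entails(1.0, 1.0, threshold=0.85))
print(entails(0.5, 0.5, threshold=0.85))
```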

  13. Some more details • Syntactic transformations • nominalization • passive form • Other phenomena • be-sentences vs. appositions, e.g., the president of XYZ is ... • treatment of negation ("not")

  14. Estimating Parameters with SVM • Main idea: divide the Graph Matching measure into many subparts • Assumptions • The hypothesis H is a simple S-V-O sentence • The SVM must learn parameters and thresholds • A possibility: • Feature space divided into three parts: • Subject Related Features • Main Verb Related Features • Object Related Features

  15. Feature Spaces (figure with the T and H graphs not preserved in the transcript)

  16. Feature Spaces • Percent of common tokens and lemmas • Task • Structural (Graph) Features • Subgraph matching indicators • Mean number of commonly anchored dependencies within constituents
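The first feature on slide 16, the percentage of common tokens and lemmas, might be computed along these lines. This is a sketch under stated assumptions: the slide does not say which side the percentage is taken over, so the share of H items that also occur in T is assumed here; tokenization is a simple regular expression; and `normalize` is a hypothetical hook where a lemmatizer could be plugged in (the identity function gives plain token overlap).

```python
import re
from typing import Callable, List


def tokens(text: str) -> List[str]:
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"[A-Za-z0-9']+", text.lower())


def percent_common(
    t: str, h: str, normalize: Callable[[str], str] = lambda w: w
) -> float:
    """Share of distinct H items that also occur in T (assumed direction)."""
    h_items = {normalize(w) for w in tokens(h)}
    t_items = {normalize(w) for w in tokens(t)}
    if not h_items:
        return 0.0
    return len(h_items & t_items) / len(h_items)


# Sentence pair from slide 2.
T = "American Airlines began laying off hundreds of flight attendants on Tuesday"
H = "American Airlines will fire hundreds of flight attendants"
print(percent_common(T, H))  # 6 of the 8 H tokens occur in T -> 0.75
```

Only "will" and "fire" are missing from T, which shows why surface overlap alone cannot capture the lay off / fire relation that the graph-based features are meant to address.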

  17. Used Resources • Chaos: a modular and lexicalised parser for English and Italian (Basili & Zanzotto, 1998, 2002), based on the extended dependency graph (XDG) formalism • WordNet • SVMlight

  18. Preliminary analysis (Rule-based System): after an analysis of a on dev1, we decided on a=0.85, g=0.85, d=0.5

  19. Preliminary analysis (SVM-based system) • Test Bed: dev1+dev2 • Test Method: 3-fold cross validation repeated 10 times • Callout on the results: "winning horse!"
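The test method on slide 19, 3-fold cross validation repeated 10 times, can be sketched as an index splitter. This is an illustrative assumption about the protocol, not the authors' script: the function name `repeated_kfold`, the reshuffling before each repetition and the fixed seed are all hypothetical, and classifier training (SVMlight in the original work) is left out.

```python
import random
from typing import Iterator, List, Tuple


def repeated_kfold(
    n_items: int, k: int = 3, repeats: int = 10, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for k-fold CV repeated with fresh shuffles."""
    rng = random.Random(seed)
    for _ in range(repeats):
        order = list(range(n_items))
        rng.shuffle(order)  # a new random partition for every repetition
        folds = [order[i::k] for i in range(k)]
        for i, test in enumerate(folds):
            train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
            yield train, test


splits = list(repeated_kfold(n_items=12))
print(len(splits))  # 3 folds x 10 repetitions = 30 train/test splits
```

Averaging the accuracy over all 30 splits gives a more stable estimate than a single 3-fold run, which matters on development sets as small as dev1+dev2.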

  20. Out from the Fairy Tale...

  21. ... and back to real life! T: Comdex -- once among the world's largest trade shows, the launching pad for new computer and software products, and a Las Vegas fixture for 20 years -- has been canceled for this year. H: Las Vegas hosted the Comdex trade show for 20 years.
