
Using Maximal Embedded Subtrees for Textual Entailment Recognition



  1. Using Maximal Embedded Subtrees for Textual Entailment Recognition. Sophia Katrenko & Pieter Adriaans, Adaptive Information Disclosure project, Human Computer Studies Laboratory, IvI, University of Amsterdam. katrenko@science.uva.nl

  2. Outline • Task statement • Tree mining: methods • Experiments • Discussion

  3. Why trees? • What do these two pictures have in common? (Scottish handwriting, 17th century) • Complex structure!

  4. Motivation • Idea: trees can be compared in order to find highly similar structures • Tree mining is an intermediate step which allows for frequent subtree discovery • When looking for the most frequent subtrees, we can relax the restrictions on how similar two subtrees should be

  5. What type of trees? (1) • In tree mining, the following types of subtrees are distinguished: • Bottom-up subtrees • Induced subtrees • Embedded subtrees • We use embedded tree mining as described in M. Zaki (2005), “Efficiently Mining Frequent Trees in a Forest: Algorithms and Applications”.

  6. What type of trees? (2) [Figure: two example trees (Tree 1 and Tree 2) with labeled nodes; in Tree 1, red marks an embedded subtree and yellow a bottom-up subtree]
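The distinction the two slides above draw can be made concrete in code. Below is a minimal sketch (not Zaki's TreeMiner algorithm) of a check for whether one labeled, rooted, ordered tree occurs as an *embedded* subtree of another: labels must match, preorder (left-to-right) order must be preserved, and each node's image must be a descendant of its parent's image — unlike an induced subtree, parent edges may stretch to ancestor–descendant paths. Trees here are hypothetical `(label, children)` tuples.

```python
def preorder(tree, parent=-1, out=None):
    """Flatten a (label, children) tree into a preorder list of
    (label, parent_index) pairs."""
    if out is None:
        out = []
    idx = len(out)
    label, children = tree
    out.append((label, parent))
    for child in children:
        preorder(child, idx, out)
    return out

def is_ancestor(nodes, a, d):
    """True if preorder position a is a proper ancestor of position d."""
    while d != -1:
        d = nodes[d][1]
        if d == a:
            return True
    return False

def embedded_match(pattern, subject):
    """Backtracking check: does `pattern` occur as an ordered embedded
    subtree of `subject`? Labels are kept, preorder order is preserved,
    and every pattern parent maps to an ancestor of its child's image."""
    p = preorder(pattern)
    s = preorder(subject)

    def extend(i, mapping):
        if i == len(p):
            return True
        start = mapping[-1] + 1 if mapping else 0
        for j in range(start, len(s)):
            if s[j][0] != p[i][0]:
                continue
            parent = p[i][1]
            if parent != -1 and not is_ancestor(s, mapping[parent], j):
                continue
            if extend(i + 1, mapping + [j]):
                return True
        return False

    return extend(0, [])

# A -> (C, K) is embedded in the subject below (C sits under B, K under D),
# even though it is not an induced subtree:
subject = ("A", [("B", [("C", [])]), ("D", [("K", [])])])
print(embedded_match(("A", [("C", []), ("K", [])]), subject))  # True
print(embedded_match(("B", [("K", [])]), subject))             # False
```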

  7. Methodology • Data • Dependency parsing • Depth-first search (DFS, preorder) • Rooted ordered embedded tree mining • Setting thresholds • Evaluation

  8. Data preprocessing • Each pair of sentences has been parsed with Minipar (Dekang Lin) • Each dependency tree has been transformed by incorporating edge labels into node labels • Each transformed tree has been represented in preorder (DFS order)
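The preprocessing steps above can be sketched as follows: edge (dependency) labels are folded into node labels, and the tree is serialized in a Zaki-style preorder string encoding, emitting -1 on each backtrack. The `rel:word` labeling convention and the toy sentence are illustrative assumptions, not Minipar's actual output format.

```python
def fold_edge_labels(word, rel, children):
    """Build a node whose label merges the incoming dependency relation
    with the word, e.g. 'subj:John' (labeling convention assumed here;
    the root has no incoming edge, so rel is None)."""
    label = word if rel is None else f"{rel}:{word}"
    return (label, children)

def encode(tree):
    """Zaki-style preorder string encoding of a rooted ordered tree:
    node labels in DFS order, with -1 emitted on every backtrack."""
    label, children = tree
    out = [label]
    for child in children:
        out.extend(encode(child))
        out.append(-1)
    return out

# A toy dependency tree for "John sees Mary":
tree = fold_edge_labels("sees", None, [
    fold_edge_labels("John", "subj", []),
    fold_edge_labels("Mary", "obj", []),
])
print(encode(tree))  # ['sees', 'subj:John', -1, 'obj:Mary', -1]
```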

  9. Syntactic matching • Given two sentences S1 and S2 (and, consequently, their trees), let n1 = |S1| and n2 = |S2|, and let m be the size of the rooted maximal embedded subtree. We define the similarity score as the ratio of m to the sizes of the input trees.
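The exact symbols of the slide's ratio did not survive extraction, so the sketch below commits to one plausible reading: the size m of the maximal embedded subtree normalized by the smaller of the two trees, thresholded to yield an entailment decision (the thresholding step appears on the methodology slide; the threshold value here is a hypothetical stand-in).

```python
def similarity(m, n1, n2):
    """Similarity of two trees as a ratio: size m of their rooted maximal
    embedded subtree over the smaller tree size. Normalizing by min(n1, n2)
    is an assumption; the slide's exact denominator is not recoverable."""
    return m / min(n1, n2)

def entails(m, n1, n2, threshold=0.6):
    """Decide entailment by thresholding the similarity score.
    The threshold value is hypothetical (tuned on development data)."""
    return similarity(m, n1, n2) >= threshold

print(similarity(3, 4, 10))  # 0.75
print(entails(3, 4, 10))     # True
```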

  10. Runs • Run 1: syntactic matching (syntactic functions incorporated into the node labels) & lemma overlap • Run 2: lemma overlap (baseline) • Run 3: syntactic matching (without syntactic functions) & lemma overlap

  11. Official results (accuracy) • Run 1 (59%) • QA 60.50% • SUM 69.50% • IR 62.00% • IE 44.00%

  12. Precision vs. Recall

  13. Precision vs. Recall (2)

  14. Conclusions: Does it work? • Syntactic matching improves precision! But… • In some cases it is too flexible (which leads to false positives) • We used ordered trees, so pairs like the following do not get high matching scores: (h) The currency used in China is the Renminbi Yuan. (t) The Renminbi Yuan is the currency used in China.

  15. Possible extensions • Use the synonyms/antonyms from WordNet • Handle situations where there are several maximal subtrees • Use weighting for the tree nodes • Use deep semantic analysis

  16. H: The author expressed his gratitude to the audience. T: Thank you! True or False? True
