
Knowledge and Tree-Edits in Learnable Entailment Proofs


Presentation Transcript


  1. BIUTEE: Knowledge and Tree-Edits in Learnable Entailment Proofs. Asher Stern, Amnon Lotan, Shachar Mirkin, Eyal Shnarch, Lili Kotlerman, Jonathan Berant and Ido Dagan. TAC, November 2011, NIST, Gaithersburg, Maryland, USA. Download at: http://www.cs.biu.ac.il/~nlp/downloads/biutee

  2. RTE • Classify a (T,H) pair as ENTAILING or NON-ENTAILING • Example • T: The boy was located by the police. • H: Eventually, the police found the child.

  3. Matching vs. Transformations • Matching • Sequence of transformations (a proof): T = T0 → T1 → T2 → ... → Tn = H (see the sketch below) • Tree-Edits: complete proofs, confidence can be estimated • Knowledge-based Entailment Rules: linguistically motivated, formalize many types of knowledge
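The transformation chain above can be pictured as a simple data structure. The following is a minimal sketch of a proof as a sequence of transformations; the class and field names are illustrative only, not BIUTEE's actual code, and BIUTEE operates on parse trees where the sketch uses sentence strings for brevity.

```python
# Minimal illustrative sketch: a proof rewrites the text into the
# hypothesis step by step, T = T0 -> T1 -> ... -> Tn = H.
# (Names are hypothetical; BIUTEE works on parse trees, not strings.)
from dataclasses import dataclass

@dataclass
class Transformation:
    name: str      # e.g. "passive-to-active" or "X locate Y -> X find Y"
    result: str    # the sentence (really: tree) after applying this step

@dataclass
class Proof:
    text: str
    steps: list    # list[Transformation], applied left to right

proof = Proof(
    text="The boy was located by the police.",
    steps=[
        Transformation("passive-to-active", "The police located the boy."),
        Transformation("X locate Y -> X find Y", "The police found the boy."),
        Transformation("boy -> child", "The police found the child."),
    ],
)
```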

  4. Transformation-based RTE - Example T = T0 → T1 → T2 → ... → Tn = H Text: The boy was located by the police. Hypothesis: Eventually, the police found the child.

  5. Transformation-based RTE - Example T = T0 → T1 → T2 → ... → Tn = H Text: The boy was located by the police. The police located the boy. The police found the boy. The police found the child. Hypothesis: Eventually, the police found the child.

  6. Transformation-based RTE - Example T = T0 → T1 → T2 → ... → Tn = H

  7. BIUTEE Goals • Tree Edits • Complete proofs • Estimate confidence • Entailment Rules • Linguistically motivated • Formalize many types of knowledge • BIUTEE • Integrates the benefits of both worlds

  8. Challenges / System Components How to • generate linguistically motivated complete proofs? • estimate proof confidence? • find the best proof? • learn the model parameters?

  9. 1. Generate linguistically motivated complete proofs

  10. Entailment Rules • Generic syntactic rules • Lexical-syntactic rules • Lexical rules, e.g. boy → child (a toy sketch follows below) • Bar-Haim et al. 2007. Semantic inference at the lexical-syntactic level.
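As a toy illustration of how a lexical rule such as boy → child fires, here is a hedged sketch; the real rules of Bar-Haim et al. 2007 match and rewrite parse-tree fragments rather than token strings, and the small rule base below is invented for the example.

```python
# Toy sketch of lexical entailment rules; real rules operate on
# parse-tree fragments (Bar-Haim et al. 2007), not token lists.
LEXICAL_RULES = {"boy": "child", "locate": "find"}  # invented toy rule base

def apply_lexical_rule(tokens, lhs):
    """Rewrite every occurrence of the rule's left-hand side."""
    rhs = LEXICAL_RULES[lhs]
    return [rhs if tok == lhs else tok for tok in tokens]

print(apply_lexical_rule("the police found the boy".split(), "boy"))
# -> ['the', 'police', 'found', 'the', 'child']
```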

  11. Extended Tree Edits (On The Fly Operations) • Predefined custom tree edits • Insert node on the fly • Move node / move sub-tree on the fly • Flip part of speech • … • Heuristically capture linguistic phenomena • Operation definition • Features definition
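To make the on-the-fly operations concrete, here is a minimal sketch of insert, move, and part-of-speech-flip edits over a toy dependency node; the Node class and function names are assumptions for illustration, not BIUTEE's API.

```python
# Hedged sketch of extended tree edits on a toy dependency tree.
# Class and function names are illustrative, not BIUTEE's actual API.
class Node:
    def __init__(self, word, pos, children=None):
        self.word, self.pos = word, pos
        self.children = children or []

def insert_node(parent, word, pos):
    """Insert-on-the-fly: add a node required by H but missing from T."""
    parent.children.append(Node(word, pos))

def move_subtree(old_parent, new_parent, child):
    """Move-on-the-fly: re-attach a node (and its sub-tree) elsewhere."""
    old_parent.children.remove(child)
    new_parent.children.append(child)

def flip_pos(node, new_pos):
    """Flip part of speech, keeping the lemma (e.g. noun <-> verb)."""
    node.pos = new_pos

# e.g. inserting the adverb "Eventually" that the hypothesis requires:
root = Node("found", "VERB")
insert_node(root, "Eventually", "ADV")
```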

  12. Proof over Parse Trees - Example T = T0 → T1 → T2 → ... → Tn = H Text: The boy was located by the police. [passive to active] The police located the boy. [X locate Y → X find Y] The police found the boy. [boy → child] The police found the child. [insertion on the fly] Hypothesis: Eventually, the police found the child.

  13. 2. Estimate proof confidence

  14. Cost based Model • Define operation cost • Assesses operation’s validity • Represent each operation as a feature vector • Cost is linear combination of feature values • Define proof cost as the sum of the operations’ costs • Classify: entailment if and only if proof cost is smaller than a threshold
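The cost model on this slide reduces to a few lines of code. The sketch below assumes each operation has already been mapped to a feature vector; the variable names are ours, not BIUTEE's.

```python
# Sketch of the cost-based model: an operation's cost is a linear
# combination w . f, a proof's cost is the sum over its operations,
# and the pair is ENTAILING iff the proof cost is below a threshold b.
import numpy as np

def operation_cost(w, f):
    return float(np.dot(w, f))

def proof_cost(w, operation_vectors):
    return sum(operation_cost(w, f) for f in operation_vectors)

def classify(w, b, operation_vectors):
    return "ENTAILING" if proof_cost(w, operation_vectors) < b else "NON-ENTAILING"
```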

  15. Feature vector representation • Define operation cost • Represent each operation as a feature vector • Features: (Insert-Named-Entity, Insert-Verb, … , WordNet, Lin, DIRT, …) • Example: The police located the boy. → [DIRT: X locate Y → X find Y (score = 0.9)] → The police found the boy. • The operation is represented by a feature vector that is zero everywhere except at the DIRT feature, whose value is a downward function of the rule score: (0,0,…,0.457,…,0)
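The slide does not name the "downward function" that maps a rule score to a feature value; -log(score) is one plausible choice and is used below purely for illustration (it does not reproduce the 0.457 shown on the slide).

```python
# Sketch: an operation's feature vector is all zeros except at the
# feature of the resource that licensed it (here DIRT), where the
# value is a decreasing function of the rule score. -log is only an
# assumed example; the exact function on the slide is unspecified.
import math

FEATURES = ["Insert-Named-Entity", "Insert-Verb", "WordNet", "Lin", "DIRT"]

def operation_feature_vector(feature_name, score):
    vec = [0.0] * len(FEATURES)
    vec[FEATURES.index(feature_name)] = -math.log(score)  # high score -> low cost
    return vec

print(operation_feature_vector("DIRT", 0.9))  # non-zero only in the DIRT slot
```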

  16. Cost based Model • Define operation cost • Cost is linear combination of feature values Cost = weight-vector * feature-vector • Weight-vector is learned automatically

  17. Confidence Model • Define operation cost • Represent each operation as a feature vector • Define proof cost as the sum of the operations' costs: Cost(proof) = w · Σ_i f(o_i) • The summed vector Σ_i f(o_i) represents the proof; w is the weight vector

  18. Feature vector representation - example T = T0 → T1 → T2 → ... → Tn = H Text: The boy was located by the police. [passive to active] The police located the boy. [X locate Y → X find Y] The police found the boy. [boy → child] The police found the child. [insertion on the fly] Hypothesis: Eventually, the police found the child. Each operation contributes one vector: (0,0,…,1,0) + (0,0,…,0.457,…,0,0) + (0,0,…,0.5,…,0,0) + (0,0,1,…,0,0) = (0,0,1,…,0.5,…,0.457,…,1,0)

  19. Cost based Model • Define operation cost • Represent each operation as a feature vector • Define proof cost as the sum of the operations' costs • Classify: "entailing" if and only if proof cost is smaller than a threshold • Both the weight vector and the threshold are learned

  20. 3. Find the best proof

  21. Search the best proof • T → H admits many proofs: Proof #1, Proof #2, Proof #3, Proof #4, …

  22. Search the best proof • T → H admits many proofs: Proof #1, Proof #2, Proof #3, Proof #4, … • Need to find the "best" proof • "Best proof" = proof with lowest cost (assuming a weight vector is given) • Search space is exponential • AI-style search algorithm (see the sketch below)
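The slides only say an "AI-style" search is used; the beam search below is one plausible instantiation, not BIUTEE's actual algorithm, and generate_successors is a hypothetical helper standing in for applying entailment rules and tree edits to the current tree.

```python
# Illustrative beam search for a low-cost proof. This is an assumed
# instantiation of "AI-style search", not BIUTEE's actual algorithm;
# generate_successors() is a hypothetical helper that applies rules
# and tree edits, yielding (operation_cost, new_tree, operation).
import heapq

def best_proof(text_tree, hyp_tree, generate_successors, beam=50, max_steps=20):
    frontier = [(0.0, text_tree, [])]          # (cost so far, tree, operations)
    for _ in range(max_steps):
        candidates = []
        for cost, tree, ops in frontier:
            if tree == hyp_tree:               # reached H: proof complete
                return cost, ops
            for op_cost, new_tree, op in generate_successors(tree, hyp_tree):
                candidates.append((cost + op_cost, new_tree, ops + [op]))
        frontier = heapq.nsmallest(beam, candidates, key=lambda c: c[0])
    return None                                # no proof within the budget
```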

  23. 4. Learn model parameters

  24. Learning • Goal: Learn parameters (w,b) • Use a linear learning algorithm • logistic regression, SVM, etc.
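A minimal sketch of this step using scikit-learn's logistic regression (the slide allows any linear learner, e.g. logistic regression or SVM); the matrix X, holding one summed proof vector per training pair, is a toy example invented here.

```python
# Sketch: learn (w, b) with a linear classifier over proof vectors.
# The data below is a toy example, not real RTE features.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.0, 0.0, 1.0, 0.5, 0.457],   # summed vector of an entailing proof
              [2.0, 1.0, 0.0, 0.0, 0.0]])    # and of a non-entailing one
y = np.array([1, 0])                          # 1 = ENTAILING

clf = LogisticRegression().fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]        # learned weights and bias
```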

  25. Inference vs. Learning [diagram: training samples → best proofs → feature extraction → vector representation → learning algorithm → (w, b)]

  26. Inference vs. Learning [same pipeline diagram as slide 25]

  27. Iterative Learning Scheme [diagram: training samples → best proofs → vector representation → learning algorithm → (w, b)] • 1. Start from a reasonable guess for w • 2. Find the best proofs • 3. Learn new w and b • 4. Repeat from step 2
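Because the best proofs depend on w while w is learned from the best proofs, training alternates the two, as in this hedged sketch; find_best_proofs and learn are hypothetical stand-ins for the search and learning steps sketched above.

```python
# Sketch of the iterative learning scheme. find_best_proofs() and
# learn() are hypothetical stand-ins for the search and the linear
# learner sketched earlier.
import numpy as np

def iterative_training(pairs, labels, find_best_proofs, learn,
                       n_features, iterations=10):
    w, b = np.ones(n_features), 0.0        # 1. reasonable initial guess for w
    for _ in range(iterations):
        X = find_best_proofs(pairs, w)     # 2. best proofs under the current w
        w, b = learn(X, labels)            # 3. learn new w and b
    return w, b                            # steps 2-3 repeat each iteration
```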

  28. Summary- System Components How to • Generate syntactically motivated complete proofs? • Entailment rules • On the fly operations (Extended Tree Edit Operations) • Estimate proof validity? • Confidence Model • Find the best proof? • Search Algorithm • Learn the model parameters? • Iterative Learning Scheme

  29. Results - RTE7 [results table omitted from the transcript]

  30. Conclusions • Inference via sequence of transformations • Knowledge • Extended Tree Edits • Proof confidence estimation • Results • Better than median on RTE7 • Best on RTE6 • Open Source http://www.cs.biu.ac.il/~nlp/downloads/biutee

  31. Thank You http://www.cs.biu.ac.il/~nlp/downloads/biutee
