
Textual Entailment as a Framework for Applied Semantics

Textual Entailment as a Framework for Applied Semantics. Ido Dagan, Bar-Ilan University, Israel. Joint work with: Oren Glickman, Idan Szpektor, Roy Bar Haim, Maayan Geffet, Moshe Koppel, Efrat Marmorshtein (Bar-Ilan University); Shachar Mirkin (Hebrew University, Israel)


Presentation Transcript


  1. Textual Entailment as a Framework for Applied Semantics • Ido Dagan, Bar-Ilan University, Israel • Joint work with: Oren Glickman, Idan Szpektor, Roy Bar Haim, Maayan Geffet, Moshe Koppel, Efrat Marmorshtein, Bar-Ilan University; Shachar Mirkin, Hebrew University, Israel; Hristo Tanev, Bernardo Magnini, Alberto Lavelli, Lorenza Romano, ITC-irst, Italy; Bonaventura Coppola, Milen Kouylekov, University of Trento and ITC-irst, Italy; Danilo Giampiccolo, CELCT, Italy; Dan Roth, UIUC

  2. Applied Semantics for Text Understanding/Reading • Understanding text meaning refers to the semantic level of language • An applied computational framework for semantics is needed • Such a common framework is still missing

  3. Desiderata for Modeling Framework • A framework for a target level of language processing should provide: • Generic module for applications • Unified paradigm for investigating language phenomena • Unified knowledge representation • Most semantics research is scattered • WSD, NER, SRL, lexical semantic relations… (e.g. vs. syntax) • Dominating approach - interpretation

  4. Outline • The textual entailment task – what and why? • Evaluation – PASCAL RTE Challenges • Modeling approach: • Knowledge acquisition • Inference (briefly) • Application example • An alternative framework for investigating semantics

  5. Natural Language and Meaning • Variability: one meaning, many language expressions • Ambiguity: one expression, many meanings

  6. Variability of Semantic Expression • One meaning, many expressions: “The Dow Jones Industrial Average closed up 255”, “Dow ends up”, “Dow gains 255 points”, “Stock market hits a record high”, “Dow climbs 255” • Model variability as relations between text expressions: • Equivalence: expr1 ⇔ expr2 (paraphrasing) • Entailment: expr1 ⇒ expr2 – the general case • Incorporates inference as well

  7. Typical Application Inference: QA • Question: Who bought Overture? • Expected answer form (hypothesis): X bought Overture • Text: “Overture’s acquisition by Yahoo” entails the hypothesized answer “Yahoo bought Overture” • Similar for IE: X buy Y • Similar for “semantic” IR: t: Overture was bought … • Summarization (multi-document) – identify redundant info • MT evaluation (and recent ideas for MT) • Educational applications
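The QA inference on this slide can be sketched as a tiny Python check. The `lexical_entails` helper is a hypothetical stand-in for a real RTE engine (the stop-word list and function names are illustrative, not from any described system):

```python
import re

def lexical_entails(text: str, hypothesis: str) -> bool:
    """Toy entailment check: does the text cover all content words of the
    hypothesis? A crude stand-in for a full RTE system."""
    tokens = lambda s: set(re.findall(r"\w+", s.lower()))
    stop = {"the", "a", "an", "of", "by", "was"}  # illustrative stop list
    return (tokens(hypothesis) - stop) <= (tokens(text) - stop)

def validate_answer(answer_form: str, candidate: str, text: str) -> bool:
    # Instantiate the expected answer form "X bought Overture" with a candidate
    hypothesis = answer_form.replace("X", candidate)
    return lexical_entails(text, hypothesis)

print(validate_answer("X bought Overture", "Yahoo", "Yahoo bought Overture"))
```

Note that this shallow check fails on “Overture’s acquisition by Yahoo”, which is exactly the variability gap that entailment knowledge (e.g. acquisition-of ⇒ buy) is meant to bridge.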

  8. KRAQ'05 Workshop - KNOWLEDGE and REASONING for ANSWERING QUESTIONS (IJCAI-05) CFP: • Reasoning aspects:    * information fusion,    * search criteria expansion models     * summarization and intensional answers,    * reasoning under uncertainty or with incomplete knowledge, • Knowledge representation and integration:    * levels of knowledge involved (e.g. ontologies, domain knowledge),    * knowledge extraction models and techniques to optimize response accuracy… but similar needs for other applications – can entailment provide a common empirical task?

  9. Classical Entailment Definition • Chierchia & McConnell-Ginet (2001): A text t entails a hypothesis h if h is true in every circumstance (possible world) in which t is true • Strict entailment - doesn't account for the degree of uncertainty allowed in applications

  10. “Almost certain” Entailments • t: The technological triumph known as GPS … was incubated in the mind of Ivan Getting. • h: Ivan Getting invented the GPS.

  11. Applied Textual Entailment • Directional relation between two text fragments: Text (t) and Hypothesis (h) • Operational (applied) definition: t entails h if, typically, a human reading t would infer that h is most likely true • Human gold standard - as in NLP applications • Assuming common background knowledge – which is indeed expected from applications!

  12. Probabilistic Interpretation • Definition: t probabilistically entails h if: P(h is true | t) > P(h is true) • t increases the likelihood of h being true • ≡ Positive PMI – t provides information on h’s truth • P(h is true | t): entailment confidence • The relevant entailment score for applications • In practice: “most likely” entailment expected
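The probabilistic criterion can be stated directly in code; the probabilities below are toy numbers for illustration only:

```python
import math

def probabilistically_entails(p_h_given_t: float, p_h: float) -> bool:
    """t probabilistically entails h iff t raises the likelihood that h is true:
    P(h is true | t) > P(h is true)."""
    return p_h_given_t > p_h

def entailment_pmi(p_h_given_t: float, p_h: float) -> float:
    """Equivalent view: positive PMI means t provides information on h's truth."""
    return math.log(p_h_given_t / p_h)

# Toy probabilities (assumed, for illustration)
print(probabilistically_entails(0.9, 0.3))  # True: t raises h's likelihood
```

P(h is true | t) itself then serves as the entailment confidence that an application would rank by.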

  13. The Role of Knowledge • For textual entailment to hold we require: text AND knowledge ⇒ h, but • knowledge alone should not entail h • Systems are not supposed to validate h’s truth without utilizing t

  14. PASCAL Recognizing Textual Entailment (RTE) Challenges • EU FP-6 Funded PASCAL NOE 2004-7 • Bar-Ilan University • ITC-irst and CELCT, Trento • MITRE • Microsoft Research

  15. Generic Dataset by Application Use • 7 application settings in RTE-1, 4 in RTE-2/3 • QA • IE • “Semantic” IR • Comparable documents / multi-doc summarization • MT evaluation • Reading comprehension • Paraphrase acquisition • Most data created from actual applications output • RTE-2: 800 examples in development and test sets • 50-50% YES/NO split

  16. Some Examples

  17. Participation and Impact • Very successful challenges, world wide: • RTE-1 – 17 groups • RTE-2 – 23 groups • 30 groups in total • ~150 downloads! • RTE-3 underway – 25 groups • Joint workshop at ACL-07 • High interest in the research community • Papers, conference sessions and areas, PhD’s, influence on funded projects • Textual Entailment special issue at JNLE • ACL-07 tutorial

  18. Methods and Approaches (RTE-2) • Measure similarity match between t and h (coverage of h by t): • Lexical overlap (unigram, N-gram, subsequence) • Lexical substitution (WordNet, statistical) • Syntactic matching/transformations • Lexical-syntactic variations (“paraphrases”) • Semantic role labeling and matching • Global similarity parameters (e.g. negation, modality) • Cross-pair similarity • Detect mismatch (for non-entailment) • Logical interpretation and inference (vs. matching)
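The shallow end of the methods listed above, lexical overlap, can be sketched in a few lines; the function names and the 0.75 decision threshold are illustrative choices, not taken from any RTE-2 system:

```python
import re

def h_coverage(t: str, h: str) -> float:
    """Unigram lexical-overlap baseline: fraction of hypothesis tokens
    that also appear in the text (coverage of h by t)."""
    tok = lambda s: re.findall(r"\w+", s.lower())
    t_set = set(tok(t))
    h_tokens = tok(h)
    return sum(w in t_set for w in h_tokens) / len(h_tokens)

def predict(t: str, h: str, threshold: float = 0.75) -> str:
    # Threshold is an assumed illustrative value, not tuned on RTE data
    return "YES" if h_coverage(t, h) >= threshold else "NO"

print(predict("Dow gains 255 points", "Dow gains 255"))
```

This is the kind of lexical baseline that, per the analysis slides below, deeper systems only recently began to beat clearly.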

  19. Dominant approach: Supervised Learning • Features model similarity and mismatch • Classifier determines relative weights of information sources • Train on development set and auxiliary t-h corpora • Pipeline: (t, h) → similarity features (lexical, n-gram, syntactic, semantic, global) → feature vector → classifier → YES/NO
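The feature-vector pipeline can be sketched as follows; the two features and the hand-set weights are assumptions standing in for the real feature sets and a classifier trained on the development set:

```python
def features(t: str, h: str) -> list:
    """Toy similarity/mismatch features (illustrative, not the RTE-2 sets)."""
    tw, hw = set(t.lower().split()), set(h.lower().split())
    overlap = len(tw & hw) / len(hw)              # similarity feature
    negation_mismatch = ("not" in tw) != ("not" in hw)  # global mismatch feature
    return [overlap, 1.0 if negation_mismatch else 0.0]

def classify(t: str, h: str, weights=(2.0, -1.5), bias=-1.0) -> str:
    # Hand-set weights stand in for a trained classifier's learned weights
    score = sum(w * f for w, f in zip(weights, features(t, h))) + bias
    return "YES" if score > 0 else "NO"
```

Usage: `classify("Dow gains 255 points", "Dow gains 255")` yields "YES", while a negated text drives the score below zero.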

  20. Results Average: 60% Median: 59%

  21. Analysis • For the first time: deeper methods (semantic/syntactic/logical) clearly outperform shallow methods (lexical/n-gram) • Cf. Kevin Knight’s invited talk at EACL-06, titled: “Isn’t Linguistic Structure Important, Asked the Engineer” • Still, most systems based on deep analysis did not score significantly better than the lexical baseline

  22. Why? • System reports point at: • Lack of knowledge (syntactic transformation rules, paraphrases, lexical relations, etc.) • Lack of training data • It seems that systems that coped better with these issues performed best: • Hickl et al. - acquisition of large entailment corpora for training • Tatu et al. – large knowledge bases (linguistic and world knowledge)

  23. Some suggested research directions • Knowledge acquisition • Unsupervised acquisition of linguistic and world knowledge from general corpora and web • Acquiring larger entailment corpora • Manual resources and knowledge engineering • Inference • Principled framework for inference and fusing information levels • Are we happy with bags of features?

  24. Complementary Evaluation Modes • Entailment subtasks evaluations • Lexical, lexical-syntactic, logical, alignment… • “Seek” mode: • Input: h and corpus • Output: All entailing t’s in corpus • Captures information seeking needs, but requires post-run annotation (TREC style) • Contribution to specific applications! • QA – Harabagiu & Hickl, ACL-06; RE – Romano et al., EACL-06

  25. Our Own Research Directions: Acquisition, Inference, Applications

  26. Learning Entailment Rules • Q: What reduces the risk of Heart Attacks? • Hypothesis: Aspirin reduces the risk of Heart Attacks • Text: Aspirin prevents Heart Attacks • Entailment Rule: X prevent Y ⇨ X reduce risk of Y (template ⇨ template) • Need a large knowledge base of entailment rules
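A minimal sketch of applying such a rule base, with the slide's rule encoded as a surface regex (a real system would match lexical-syntactic templates over parse trees, not surface strings):

```python
import re

# Tiny rule base of entailment rules, LHS template => RHS template.
# "X prevent Y => X reduce risk of Y" is the rule from the slide.
RULES = [(r"(?P<X>\w+) prevents? (?P<Y>[\w ]+)",
          r"\g<X> reduces the risk of \g<Y>")]

def apply_rules(text: str):
    """Yield hypotheses derivable from the text via rule-template matching."""
    for lhs, rhs in RULES:
        m = re.search(lhs, text)
        if m:
            yield m.expand(rhs)

print(list(apply_rules("Aspirin prevents Heart Attacks")))
```

The derived hypothesis matches the expected answer form, which is how the rule licenses “Aspirin” as an answer to the question.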

  27. TEASE – Algorithm Flow • Input template: X subj-accuse-obj Y • Anchor Set Extraction (ASE): sample Web corpus for the input template – “Paula Jones accused Clinton…”, “Sanhedrin accused St. Paul…” → anchor sets: {Paula Jones subj; Clinton obj}, {Sanhedrin subj; St. Paul obj}, … • Template Extraction (TE): sample corpus for the anchor sets – “Paula Jones called Clinton indictable…”, “St. Paul defended before the Sanhedrin…” → templates: X call Y indictable, Y defend before X, … • Iterate
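The ASE/TE loop can be sketched over a toy in-memory corpus. The real TEASE samples the Web and works over parsed text with ranking, so every helper here is a drastic simplification under assumed single-token arguments:

```python
import re

def extract_anchor_sets(template, corpus):
    """ASE step (simplified): find argument pairs instantiating 'X <pred> Y'."""
    pred = template.split()[1]
    anchors = set()
    for sent in corpus:
        m = re.match(rf"(\w+) {pred} (\w+)", sent)
        if m:
            anchors.add((m.group(1), m.group(2)))
    return anchors

def extract_templates(anchors, corpus):
    """TE step (simplified): generalize sentences containing an anchor pair."""
    templates = set()
    for x, y in anchors:
        for sent in corpus:
            m = re.match(rf"{x} (.+) {y}", sent)
            if m:
                templates.add(f"X {m.group(1)} Y")
    return templates

def tease(input_template, corpus, iterations=3):
    """Skeleton of the iterative TEASE flow: ASE, then TE, then repeat."""
    templates = {input_template}
    for _ in range(iterations):
        anchors = set()
        for tpl in templates:
            anchors |= extract_anchor_sets(tpl, corpus)
        new = extract_templates(anchors, corpus)
        if new <= templates:   # no new templates found: converged
            break
        templates |= new
    return templates

corpus = ["Paula accused Clinton", "Paula blamed Clinton", "Sanhedrin accused Paul"]
print(tease("X accused Y", corpus))
```

Starting from "X accused Y", the shared anchor pair (Paula, Clinton) licenses the new template "X blamed Y", illustrating how anchors and templates bootstrap each other.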

  28. Sample of Extracted Anchor-Sets for X prevent Y

  29. Sample of Extracted Templates for X prevent Y

  30. Experiment and Evaluation • 48 randomly chosen input verbs • 1392 templates extracted; human judgments • Encouraging results • Future work: precision, estimate probabilities

  31. Acquiring Lexical Entailment Relations • COLING-04, ACL-05: Lexical entailment via distributional similarity • Individual features characterize semantic properties • Obtain characteristic features via bootstrapping • Test characteristic feature inclusion (vs. overlap) • COLING-ACL-06: Integrate pattern-based extraction • NP such as NP1, NP2, … • Complementary information to distributional evidence • Integration using ML with minimal supervision (10 words)
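The inclusion test (as opposed to symmetric overlap) can be sketched like this; the characteristic feature sets and the 0.9 threshold are made up for illustration, whereas the real method bootstraps characteristic syntactic-context features from a corpus:

```python
def lexical_entailment(entailing_feats: set, entailed_feats: set,
                       threshold: float = 0.9) -> bool:
    """Distributional inclusion test: the characteristic features of the
    entailing (more specific) word should be (mostly) included in the
    feature set of the entailed (more general) word."""
    if not entailing_feats:
        return False
    return len(entailing_feats & entailed_feats) / len(entailing_feats) >= threshold

# Toy characteristic context features (assumed for illustration)
firm = {"acquire_obj", "profit_of", "merge_subj"}
company = {"acquire_obj", "profit_of", "merge_subj", "found_obj"}
print(lexical_entailment(firm, company))  # firm => company holds
```

The test is directional: firm ⇒ company passes, but company ⇒ firm fails, unlike a symmetric overlap score which would treat both directions alike.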

  32. Acquisition Example • Top-ranked entailments for “company”: • firm, bank, group, subsidiary, unit, business, • supplier, carrier, agency, airline, division, giant, • entity, financial institution, manufacturer, corporation, • commercial bank, joint venture, maker, producer, factory … • Does not overlap traditional ontological relations

  33. Initial Probabilistic Lexical Co-occurrence Models • Alignment-based (RTE-1 & ACL-05 Workshop) • The probability that a term in h is entailed by a particular term in t • Bayesian classification (AAAI-05) • The probability that a term in h is entailed by (fits in) the entire text of t • An unsupervised text categorization setting – each term is a category • Demonstrate directions for probabilistic modeling and unsupervised estimation

  34. Manual Syntactic Transformations • Example, for ‘X prevent Y’: “Sunscreen, which prevents moles and sunburns, …” • Relative-clause and conjunction transformations map the parsed sentence onto the template (dependency-tree diagrams not captured in the transcript)

  35. Syntactic Variability Phenomena Template: X activate Y

  36. Takeout • Promising potential for creating huge entailment knowledge bases • Mostly by unsupervised approaches • Manually encoded • Derived from lexical resources • Potential for uniform representations, such as entailment rules, for different types of semantic and world knowledge

  37. Inference • Goal: infer hypothesis from text • Match and apply available entailment knowledge • Heuristically bridge inference gaps • Our approach: mapping language constructs • Vs. semantic interpretation • Lexical-syntactic structures as meaning representation • Amenable to unsupervised learning • Entailment rule transformations over syntactic trees

  38. Application: Unsupervised Relation Extraction (EACL 2006)

  39. Relation Extraction • Subfield of Information Extraction • Identify different ways of expressing a target relation • Examples: Management Succession, Birth - Death, Mergers and Acquisitions, Protein Interaction • Traditionally performed in a supervised manner • Requires dozens to hundreds of examples per relation • Examples should cover broad semantic variability • Costly - Feasible??? • Little work on unsupervised approaches

  40. Our Goals • Entailment approach for relation extraction • Unsupervised relation extraction system • Evaluation framework for entailment rule acquisition and matching

  41. Proposed Approach • Input template: X prevent Y • Entailment rule acquisition (TEASE) → templates: X prevention for Y, X treat Y, X reduce Y • Transformation rules + syntactic matcher → relation instances: <sunscreen, sunburns>
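The pipeline can be sketched end to end with surface-pattern matching standing in for the syntactic matcher. `TEMPLATES` is a hand-picked stand-in for TEASE output (verb templates only, in surface form; the real system matches over dependency parses):

```python
import re

# Stand-in for TEASE output for the input "X prevent Y" (assumed subset)
TEMPLATES = ["X prevents Y", "X reduces Y"]

def extract_instances(templates, corpus):
    """Match each template against the corpus and collect relation instances."""
    instances = set()
    for tpl in templates:
        # Turn the template into a surface regex with argument slots
        pattern = tpl.replace("X", r"(\w+)").replace("Y", r"(\w+)")
        for sent in corpus:
            m = re.search(pattern, sent)
            if m:
                instances.add((m.group(1), m.group(2)))
    return instances

corpus = ["Sunscreen prevents sunburns", "Aspirin reduces inflammation"]
print(extract_instances(TEMPLATES, corpus))
```

Each matched template contributes argument pairs, so acquiring more entailment rules directly widens the extractor's recall.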

  42. Dataset • Bunescu 2005 • Recognizing interactions between annotated protein pairs • 200 Medline abstracts • Gold standard dataset of protein pairs • Input template: X interact with Y

  43. Manual Analysis - Results • 93% of interacting protein pairs can be identified with lexical-syntactic templates • Number of templates vs. recall (within the 93%) and frequency of syntactic phenomena (tables not captured in the transcript)

  44. TEASE Output for X interact with Y A sample of correct templates learned:

  45. TEASE algorithm - Potential Recall on Training Set • Iterative - taking the top 5 ranked templates as input • Morph - recognizing morphological derivations (cf. semantic role labeling vs. matching)

  46. Results for Full System Error sources: • Dependency parser and syntactic matching errors • No morphological derivation recognition • TEASE limited precision (incorrect templates)

  47. Vs Supervised Approaches • 180 training abstracts
