Classification of Discourse Coherence Relations: An Exploratory Study using Multiple Knowledge Sources


Presentation Transcript


  1. Classification of Discourse Coherence Relations: An Exploratory Study using Multiple Knowledge Sources. Ben Wellner†*, James Pustejovsky†, Catherine Havasi†, Anna Rumshisky†, and Roser Saurí†. †Brandeis University; *The MITRE Corporation

  2. Outline of Talk • Overview and Motivation for Modeling Discourse • Background • Objectives • The Discourse GraphBank • Overview • Coherence Relations • Issues with the GraphBank • Modeling Discourse • Machine learning approach • Knowledge Sources and Features • Experiments and Analysis • Conclusions and Future Work

  3. Modeling Discourse: Motivation • Why model discourse? • Dialogue • General text understanding applications • Text summarization and generation • Information extraction • MUC Scenario Template Task • Discourse is vital for understanding how events are related • Modeling discourse generally may aid specific extraction tasks

  4. Background • Different approaches to discourse • Semantics/formalisms: Hobbs [1985], Mann and Thompson [1987], Grosz and Sidner [1986], Asher [1993], others • Different objectives • Informational vs. intentional, dialogue vs. general text • Different inventories of discourse relations • Coarse vs. fine-grained • Different representations • Tree representation vs. graph • Same steps involved: • 1. Identifying discourse segments • 2. Grouping discourse segments into sequences • 3. Identifying the presence of a relation • 4. Identifying the type of the relation

  5. Discourse Steps #1 (example from [Danlos 2004]): "Mary is in a bad mood because Fred played tuba while she was taking a nap." 1. Segment: A, B, C 2. Group 3. Connect segments: r1, r2 4. Relation type: r1 = cause-effect, r2 = elaboration

  6. Discourse Steps #2 (example from [Danlos 2004]): "Fred played the tuba. Next he prepared a pizza to please Mary." 1. Segment: A, B, C 2. Group 3. Connect segments: r1, r2 4. Relation type: r1 = temporal precedence, r2 = cause-effect

  7. Objectives • Our Main Focus: Step 4 - classifying discourse relations • Important for all approaches to discourse • Can be approached independently of representation • But relation types and structure are probably quite dependent • Task will vary with the inventory of relation types • What types of knowledge/features are important for this task? • Can we apply the same approach to Step 3: identifying whether two segment groups are linked?

  8. Discourse GraphBank: Overview • [Wolf and Gibson, 2005] • Graph-based representation of discourse • Tree-representation inadequate: multiple parents, crossing dependencies • Discourse composed of clausal segments • Segments can be grouped into sequences • Relations need not exist between segments within a group • Coherence relations between segment groups • Roughly those of Hobbs [1985] • Why GraphBank? • Similar inventory of relations as SDRT • Linked to lexical representations • Semantics well-developed • Includes non-local discourse links • Existing annotated corpus, unexplored outside of [Wolf and Gibson, 2005]

  9. Resemblance Relations • Similarity (parallel): The first flight to Frankfurt this morning was delayed. The second flight arrived late as well. • Contrast: The first flight to Frankfurt this morning was delayed. The second flight arrived on time. • Example: There have been many previous missions to Mars. A famous example is the Pathfinder mission. • Generalization: Two missions to Mars in 1999 failed. There are many missions to Mars that have failed. • Elaboration*: A probe to Mars was launched from the Ukraine this week. The European-built "Mars Express" is scheduled to reach Mars by Dec. • * The elaboration relation is given one or more sub-types: organization, person, location, time, number, detail

  10. Causal, Temporal and Attribution Relations • Causal: • Cause-effect: There was bad weather at the airport and so our flight got delayed. • Conditional: If the new software works, everyone should be happy. • Violated expectation: The new software worked great, but nobody was happy. • Temporal: • Temporal precedence: First, John went grocery shopping. Then, he disappeared into a liquor store. • Attribution: • Attribution: John said that the weather would be nice tomorrow. • Same: The economy, according to analysts, is expected to improve by early next year.

  11. Some Issues with GraphBank • Coherence relations • Conflation of actual causation and intention/purpose • Granularity • Desirable for relations to hold between eventualities or entities, not necessarily entire clausal segments: • "The university spent $30,000" / "to upgrade lab equipment in 1987" (cause??) • "John pushed the door to open it." (cause) • "...the new policy came about after President Reagan's historic decision in mid-December to reverse the policy of refusing to deal with members of the organization, long shunned as a band of terrorists. Reagan said PLO chairman Yasser Arafat had met US demands." (elaboration)

  12. A Classifier-based Approach • For each pair of discourse segments, classify the relation type between them • For segment pairs on which we know a relation exists • Advantages • Include arbitrary knowledge sources as features • Easier than implementing inference on top of semantic interpretations • Robust performance • Gain insight into how different knowledge sources contribute • Disadvantages • Difficult to determine why mistakes happen • Maximum Entropy • Commonly used discriminative classifier • Allows for a large number of non-independent features
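
A minimal sketch of this pairwise setup, with scikit-learn's LogisticRegression standing in for the MaxEnt classifier (its L2 penalty plays the role of a Gaussian prior); the feature dictionaries and labels below are invented for illustration:

```python
# Pairwise relation classification sketch. Assumes scikit-learn;
# LogisticRegression with an L2 penalty approximates a MaxEnt model
# with a Gaussian prior on the weights.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# One feature dict per segment pair (illustrative values only).
X = [
    {"adjacent": True,  "cue1": "to",   "cue2": "the"},
    {"adjacent": True,  "cue1": "but",  "cue2": "the"},
    {"adjacent": False, "cue1": "then", "cue2": "first"},
]
y = ["cause-effect", "violated-expectation", "temporal-precedence"]

# DictVectorizer expands symbolic features into sparse indicators,
# the usual representation for MaxEnt-style models.
model = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
model.fit(X, y)
print(model.predict([{"adjacent": True, "cue1": "but", "cue2": "the"}]))
```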

  13. Knowledge Sources • Knowledge Sources: • Proximity • Cue Words • Lexical Similarity • Events • Modality and Subordinating Relations • Grammatical Relations • Temporal relations • Associate with each knowledge source • One or more Feature Classes

  14. Example SEG2: "The university spent $30,000" SEG1: "to upgrade lab equipment in 1987"

  15. Proximity • Motivation • Some relations tend to be local, i.e. their arguments appear nearby in the text • Attribution, cause-effect, temporal precedence, violated expectation • Other relations can span larger portions of text • Elaboration • Similar, contrast • Feature Class Proximity: - Whether segments are adjacent or not - Directionality (which argument appears earlier in the text) - Number of intervening segments
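
A sketch of how the Proximity feature class might be computed from segment positions (the function and feature names are illustrative, not from the paper):

```python
def proximity_features(i: int, j: int) -> dict:
    """Proximity features for segments at positions i (SEG1) and j (SEG2)
    in the document's linear segment sequence."""
    first, second = min(i, j), max(i, j)
    return {
        "adjacent": second - first == 1,    # segments directly adjacent?
        "seg1_earlier": i < j,              # directionality in the text
        "intervening": second - first - 1,  # segments in between
    }

print(proximity_features(3, 7))
# {'adjacent': False, 'seg1_earlier': True, 'intervening': 3}
```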

  16. Example SEG2: "The university spent $30,000" SEG1: "to upgrade lab equipment in 1987"

  17. Cue Words • Motivation: • Many coherence relations are frequently identified by a discourse cue word or phrase: “therefore”, “but”, “in contrast” • Cues are generally captured by the first word in a segment • Obviates enumerating all potential cue words • Non-traditional discourse markers (e.g. adverbials or even determiners) may indicate a preference for certain relation types Feature Class Cue Words: - First word in each segment
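
A corresponding sketch of the Cue Words feature class, under the assumption that segments arrive as whitespace-tokenized strings:

```python
def cue_word_features(seg1: str, seg2: str) -> dict:
    """Cue word features: the first word of each segment serves as a
    proxy for an explicit cue-phrase lexicon."""
    return {"cue1": seg1.split()[0].lower(),
            "cue2": seg2.split()[0].lower()}

print(cue_word_features("to upgrade lab equipment in 1987",
                        "The university spent $30,000"))
# {'cue1': 'to', 'cue2': 'the'}
```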

  18. Example SEG2: "The university spent $30,000" SEG1: "to upgrade lab equipment in 1987"

  19. Lexical Coherence • Motivation: • Identify lexical associations, lexical/semantic similarities • E.g. push/fall, crash/injure, lab/university • Brandeis Semantic Ontology (BSO) • Taxonomy of types (i.e. senses) • Includes qualia information for words • Telic (purpose), agentive (creation), constitutive (parts) • Word Sketch Engine (WSE) • Similarity of words as measured by their contexts in a corpus (BNC) Feature Class BSO: - Paths between words up to length 10 WSE: - Number of word pairs with similarity > 0.05, > 0.01 - Segment similarities (sum of word-pair similarities / # words)
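
The BSO and WSE are external lexical resources, so the sketch below abstracts them behind a word_sim(w1, w2) function (an assumption) and derives the thresholded pair counts and normalized segment similarity described above:

```python
from itertools import product

def wse_features(words1, words2, word_sim):
    """WSE-style lexical similarity features over all cross-segment
    word pairs; word_sim is any corpus-derived similarity measure."""
    sims = [word_sim(a, b) for a, b in product(words1, words2)]
    n_words = len(words1) + len(words2)
    return {
        "pairs_sim_gt_0.05": sum(s > 0.05 for s in sims),
        "pairs_sim_gt_0.01": sum(s > 0.01 for s in sims),
        # Segment similarity: sum of word-pair similarities / # words.
        "segment_sim": sum(sims) / n_words if n_words else 0.0,
    }

# Toy similarity function: identical words are maximally similar.
print(wse_features(["lab", "equipment"], ["university", "lab"],
                   lambda a, b: 1.0 if a == b else 0.0))
```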

  20. Example SEG2: "The university spent $30,000" SEG1: "to upgrade lab equipment in 1987"

  21. Events • Motivation: • Certain events and event-pairs are indicative of certain relation types (e.g. “push”-”fall”: cause) • Allow learner to associate events and event-pairs with particular relation types • Evita: EVents In Text Analyzer • Performs domain independent identification of events • Identifies all event-referring expressions (that can be temporally ordered) Feature Class Events: - Event mentions in each segment - Event mention pairs drawn from both segments
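
A sketch of the Events feature class, assuming Evita's event mentions are already available as per-segment lists of strings:

```python
from itertools import product

def event_features(events1, events2):
    """Event features: each segment's event mentions, plus all
    cross-segment event pairs (e.g. push-fall suggesting cause)."""
    feats = {f"ev1={e}": True for e in events1}
    feats.update({f"ev2={e}": True for e in events2})
    feats.update({f"evpair={a}|{b}": True
                  for a, b in product(events1, events2)})
    return feats

print(event_features(["upgrade"], ["spent"]))
# {'ev1=upgrade': True, 'ev2=spent': True, 'evpair=upgrade|spent': True}
```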

  22. Example SEG2: "The university spent $30,000" SEG1: "to upgrade lab equipment in 1987"

  23. Modality and Subordinating Relations • Motivation: • Event modality and subordinating relations are indicative of certain relations • SlinkET [Saurí et al. 2006] • Identifies subordinating contexts and classifies them as: • Factive, counter-factive, evidential, negative evidential, or modal • E.g. evidential => attribution relation • Event class, polarity, tense, etc. Feature Class SlinkET: - Event class, polarity, tense and modality of events in each segment - Subordinating relations between event pairs

  24. Example SEG2: "The university spent $30,000" SEG1: "to upgrade lab equipment in 1987"

  25. Cue Words and Events • Motivation • Certain events (event types) are likely to appear in particular discourse contexts keyed by certain connectives. • Pairing connectives with events captures this more precisely than connectives or events on their own Feature Class CueWords + Events: - First word of SEG1 and each event mention in SEG2 - First word of SEG2 and each event mention in SEG1
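
A sketch of the conjoined CueWords + Events feature class, reusing the assumptions above (first word as cue, event mentions as strings):

```python
def cue_event_features(seg1, seg2, events1, events2):
    """Conjoin each segment's cue (first word) with the other
    segment's event mentions."""
    cue1, cue2 = seg1.split()[0].lower(), seg2.split()[0].lower()
    feats = {f"cue1={cue1}+ev2={e}": True for e in events2}
    feats.update({f"cue2={cue2}+ev1={e}": True for e in events1})
    return feats

print(cue_event_features("to upgrade lab equipment in 1987",
                         "The university spent $30,000",
                         ["upgrade"], ["spent"]))
# {'cue1=to+ev2=spent': True, 'cue2=the+ev1=upgrade': True}
```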

  26. Example SEG2: "The university spent $30,000" SEG1: "to upgrade lab equipment in 1987"

  27. Grammatical Relations • Motivation: • Certain intra-sentential relations captured or ruled out by particular dependency relations between clausal headwords • Identification of headwords also important • Main events identified • RASP parser • Feature Class Syntax: - Grammatical relations between the two segments - GR + SEG1 head word - GR + SEG2 head word - GR + both head words

  28. Example SEG2: "The university spent $30,000" SEG1: "to upgrade lab equipment in 1987"

  29. Temporal Relations • Motivation: • Temporal ordering between events constrains possible coherence relations • E.g. E1 BEFORE E2 => NOT(E2 CAUSE E1) • Temporal Relation Classifier • Trained on TimeBank 1.2 using MaxEnt • See [Mani et al. “Machine Learning of Temporal Relations” ACL 2006] Feature Class TLink: - Temporal Relations holding between segments
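
The TLink classifier itself is a trained model; this sketch only illustrates its downstream use, turning the predicted temporal relation into a feature and into the constraint mentioned above (names are hypothetical):

```python
def tlink_features(tlink: str) -> dict:
    """Feature from the temporal relation predicted between the
    segments' events, e.g. 'BEFORE', 'AFTER', 'SIMULTANEOUS'."""
    return {f"tlink={tlink}": True}

def causally_compatible(tlink: str, e2_causes_e1: bool) -> bool:
    """E1 BEFORE E2 rules out E2 CAUSE E1: a cause cannot follow
    its effect."""
    return not (tlink == "BEFORE" and e2_causes_e1)

print(causally_compatible("BEFORE", e2_causes_e1=True))  # False
```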

  30. Example SEG2: "The university spent $30,000" SEG1: "to upgrade lab equipment in 1987"

  31. Relation Classification • Identify • Specific coherence relation • Ignoring elaboration subtypes (too sparse) • Coarse-grained relation (resemblance, cause-effect, temporal, attributive) • Evaluation Methodology • Used Maximum Entropy classifier (Gaussian prior variance = 2.0) • 8-fold cross validation • Specific relation accuracy: 81.06% • Inter-annotator agreement: 94.6% • Majority class baseline: 45.7% (classifying all relations as elaboration) • Coarse-grained relation accuracy: 87.51%
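
A sketch of this evaluation protocol on synthetic stand-in data (the real instances come from vectorizing GraphBank segment pairs); C=2.0 loosely mirrors the Gaussian prior variance, though the exact correspondence depends on the MaxEnt implementation:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for vectorized GraphBank segment-pair instances.
X, y = make_classification(n_samples=800, n_features=50, n_classes=3,
                           n_informative=10, random_state=0)

# 8-fold cross-validation, as in the evaluation above.
scores = cross_val_score(LogisticRegression(C=2.0, max_iter=1000),
                         X, y, cv=8)
print(f"mean accuracy: {scores.mean():.4f}")
```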

  32. F-Measure Results

  33. Results: Confusion Matrix [matrix figure; axes labeled Hypothesis and Reference]

  34. Feature Class Analysis • What is the utility of each feature class? • Features overlap significantly – highly correlated • How can we estimate utility? • Independently • Start with Proximity feature class (baseline) • Add each feature class separately • Determine improvement over baseline • In combination with other features • Start with all features • Remove each feature class individually • Determine reduction from removal of feature class
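
Both ablation protocols can be expressed as a short loop over a dictionary of feature-class extractors; evaluate() is a placeholder for cross-validated accuracy, and the toy evaluator below exists only to make the sketch runnable:

```python
def ablation(feature_classes, evaluate):
    """feature_classes: name -> extractor; evaluate(extractors) -> accuracy."""
    baseline = evaluate([feature_classes["proximity"]])
    full = evaluate(list(feature_classes.values()))
    for name, fc in feature_classes.items():
        if name == "proximity":
            continue
        # In isolation: gain from adding this class to the baseline.
        gain = evaluate([feature_classes["proximity"], fc]) - baseline
        # In conjunction: loss from removing it from the full set.
        rest = [f for n, f in feature_classes.items() if n != name]
        loss = full - evaluate(rest)
        print(f"{name}: +{gain:.3f} in isolation, -{loss:.3f} removed")

# Toy evaluator: accuracy grows with the number of feature classes.
fcs = {"proximity": None, "cue_words": None, "events": None}
ablation(fcs, evaluate=lambda extractors: 0.5 + 0.1 * len(extractors))
```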

  35. Feature Class Analysis Results [tables: Feature Class Contributions in Isolation; Feature Class Contributions in Conjunction]

  36. Relation Identification • Given • Discourse segments (and segment sequences) • Identify • For each pair of segments, whether a relation (any relation) exists on those segments • Two issues: • Highly skewed classification • Many negatives, few positives • Many of the relations are transitive • These aren’t annotated and will be false negative instances

  37. Relation Identification Results • For all pairs of segment sequences in a document • Used the same features as for classification • Achieved accuracy only slightly above the majority-class baseline • For segment pairs in the same sentence • Accuracy: 70.04% (baseline 58%) • Identification and classification in the same sentence • Accuracy: 64.53% (baseline 58%)

  38. Inter-relation Dependencies • Each relation shouldn't be identified in isolation • When identifying a relation between si and sj, consider the other relations involving si and sj • Include as features the other (gold-standard) relation types both segments are involved in • Adding this feature class improves performance to 82.3% • 6.3% error reduction • Indicates room for improvement with • Collective classification (where outputs influence each other) • Incorporating explicit modeling constraints • Tree-based parsing model • Constrained DAGs [Danlos 2004] • Including or deducing transitive links may help further
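
A sketch of the inter-relation dependency features, assuming gold relations arrive as (i, j, type) triples over segment indices:

```python
def context_relation_features(i, j, gold_relations):
    """Types of the other gold-standard relations that involve segment
    si or sj, excluding the pair (i, j) currently being classified."""
    feats = {}
    for a, b, rtype in gold_relations:
        if {a, b} == {i, j}:
            continue  # skip the relation we are trying to predict
        if i in (a, b):
            feats[f"si_in={rtype}"] = True
        if j in (a, b):
            feats[f"sj_in={rtype}"] = True
    return feats

gold = [(0, 1, "elaboration"), (1, 2, "cause-effect"), (0, 2, "temporal")]
print(context_relation_features(0, 1, gold))
# {'sj_in=cause-effect': True, 'si_in=temporal': True}
```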

  39. Conclusions • Classification approach with many features achieves good performance at classifying coherence relation types • All feature classes helpful, but: • Discriminative power of most individual feature classes is captured by the union of the remaining feature classes • Proximity + CueWords achieves 76.77% • Remaining features reduce error by 23.7% • Classification approach performs less well on the task of identifying the presence of a relation • Using the same features as for classifying coherence relation types • "Parsing" may prove better for local relationships

  40. Future Work • Additional linguistic analysis • Co-reference: both entities and events • Word sense • Lexical similarity is confounded by lexemes with multiple types • Pipelined or 'stacked' architecture • Classify the coarse-grained category first, then the specific coherence relation • Justification: different categories require different types of knowledge • Relational classification • Model decisions collectively • Include constraints on structure • Investigate transitivity of resemblance relations • Consider other approaches for identification of relations

  41. Questions?

  42. Backup Slides

  43. GraphBank Annotation Statistics • Corpus and Annotator Statistics • 135 doubly annotated newswire articles • Identifying discourse segments had high agreement (> 90% from pilot study of 10 documents) • Corpus segments ultimately annotated once (by both annotators together) • Segment grouping - Kappa 0.8424 • Relation identification and typing - Kappa 0.8355

  44. Factors Involved in Identifying Coherence Relations • Proximity • E.g. attribution is local, elaboration non-local • Lexical and phrasal cues • Constrain possible relation types • But => 'contrast', 'violated expectation' • And => 'elaboration', 'similar', 'contrast' • Co-reference • Coherence established with references to mentioned entities/events • Argument structure • E.g. similar => similar/same event and/or participants • Lexical knowledge • Type inclusion, word sense • Qualia (purpose of an object, resulting state of an action), event structure • Paraphrases: delay => arrive late • World knowledge • E.g. Ukraine is part of Europe

  45. Architecture [diagram: Pre-processing feeds Knowledge Sources 1..n; a Feature Constructor turns their output into features, which are used in Training to produce a Model; at Prediction time the Model produces Classifications]

  46. Scenario Extraction: MUC • Pull together relevant facts related to a “complex event” • Management Succession • Mergers and Acquisitions • Natural Disasters • Satellite launches • Requires identifying relations between events: • Parallel, cause-effect, elaboration • Also: identity, part-of • Hypothesis: • Task independent identification of discourse relations will allow rapid development of Scenario Extraction systems

  47. Information Extraction: Current [diagram: separate per-domain pipelines; Pre-process feeds Fact Extraction and Scenario Extraction tasks for Domain 1 (Tasks 1.1..1.N), Domain 2 (Tasks 2.1..2.N), ..., Domain N]

  48. Information Extraction: Future [diagram: a single shared pipeline of Pre-process, Fact Extraction, and Discourse components]
