Discourse Structure and Anaphoric Accessibility

Presentation Transcript


  1. Discourse Structure and Anaphoric Accessibility
     Massimo Poesio and Barbara Di Eugenio, with help from Gerard Keohane

  2. Content
     • Empirical Investigations of Discourse Structure
     • Grosz and Sidner’s theory of the Global Focus
     • Relational Discourse Analysis
     • How we used RDA to study G&S
     • Results
     • Discussion
     Information Structure and Discourse Structure

  3. Empirical Investigations of Discourse Structure: A new opportunity
     • The original proposals concerning the effect of discourse structure on accessibility (Reichman, 1985; Fox, 1987; Grosz and Sidner, 1986) were based on unsystematic analysis of data
     • These days we know more about how to run reliable studies of discourse phenomena (Passonneau and Litman, 1993; Carletta et al., 1997)
     • These new resources have already been used to propose new theories of anaphora and discourse structure, such as Veins Theory (Cristea, Ide, Marcu, et al., 1998, 1999, 2000)
     • The goal of this project: use a reliably annotated corpus (the Sherlock corpus from the University of Pittsburgh; Moser and Moore, 1996; Di Eugenio et al., 1997) to study the claims of G&S

  4. Grosz and Sidner’s Theory of the Global Focus
     • The structure of a discourse is determined by the intentions its utterances are meant to convey (DISCOURSE SEGMENT PURPOSES, or DSPs)
     • INTENTIONAL STRUCTURE: DOMINANCE and SAT-PRECEDES relations between DSPs
     • ATTENTIONAL STRUCTURE: a stack of FOCUS SPACES
       • Focus spaces on the stack contain the accessible discourse entities
       • Presence on the stack reflects the intentional structure
     • The problem: how to identify DSPs in a discourse
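The stack-based attentional state on this slide can be sketched in a few lines of Python. This is a minimal illustration, not Grosz and Sidner's own formulation: the class names, the choice to model a focus space as a plain set of entity names, and the sample entities are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class FocusSpace:
    """Entities introduced while one discourse segment purpose (DSP) is open."""
    dsp: str
    entities: set[str] = field(default_factory=set)


class FocusStack:
    """Attentional state: a stack of focus spaces, innermost segment on top."""

    def __init__(self) -> None:
        self._spaces: list[FocusSpace] = []

    def push(self, dsp: str) -> None:
        # Opening a segment pushes a new, empty focus space.
        self._spaces.append(FocusSpace(dsp))

    def pop(self) -> FocusSpace:
        # Closing a segment removes its focus space together with its entities.
        return self._spaces.pop()

    def mention(self, entity: str) -> None:
        # A new entity is recorded in the focus space of the current segment.
        self._spaces[-1].entities.add(entity)

    def is_accessible(self, entity: str) -> bool:
        # An entity is accessible only while some space on the stack holds it.
        return any(entity in space.entities for space in self._spaces)


# Hypothetical two-segment discourse (entity names are illustrative only).
stack = FocusStack()
stack.push("DSP1")
stack.mention("test package")
stack.push("DSP2")
stack.mention("drawers")
stack.pop()  # DSP2 is closed, so its entities leave the stack
```

After the final `pop`, an entity introduced under DSP1 is still accessible, while one introduced only under DSP2 is not; this is the behaviour the later evaluation slides put to the test.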

  5. Relational Discourse Analysis (RDA)
     • Moore and Pollack, 1992; Moser and Moore, 1996
     • Combines ideas from RST and Grosz and Sidner’s theory
     • From Grosz and Sidner: discourse structure is determined by intentional structure
       • RDA-SEGMENT: a segment expressing an intentional relation
     • From RST: segments have internal structure
       • CORE (cf. NUCLEUS)
       • CONTRIBUTOR (cf. SATELLITE)
     • Both INTENTIONAL and INFORMATIONAL relations
     • A fixed number of intentional relations
     • Has been shown to be usable for reliable analysis

  6. RDA Analysis of an excerpt from a tutorial
     • 1.1 Before troubleshooting inside the test station,
     • 1.2 It’s always best to eliminate both the UUT and the TP
     • 2.1 Since the test package is moved frequently
     • 2.2 It is prone to damage
     • 3.1 Also, testing the test package is much easier and faster
     • 3.2 than opening up test station drawers.
     [Diagram: RDA tree over clauses 1.1 to 3.2. Intentional relations: CONVINCE, CONVINCE, ENABLE. Informational relations: prescribed-act:wrong-act, cause:effect, step1:step2.]

  7. Moser and Moore: mapping between RST relations and G&S
     • Basic principles:
       • Every DSP must be associated with a core
       • Constituents of the RDA structure that do not include cores, such as clusters, do not introduce DSPs
     • Consequences for attentional state:
       • A new focus space is pushed only when a segment is opened
       • Informational relations do not affect the attentional state
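The two consequences for the attentional state can be made concrete with a small traversal. The sketch below assumes a simplified, hypothetical encoding of an RDA analysis in which every node is a `segment` (has a core), a `cluster` (no core), or a `clause`; the tree itself is invented for illustration and is not the analysis shown on the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One constituent of an RDA analysis (hypothetical, simplified encoding)."""
    label: str
    kind: str  # "segment" (has a core), "cluster" (no core), or "clause"
    children: list["Node"] = field(default_factory=list)


def focus_trace(node, stack=None):
    """Yield (clause label, open DSP labels) pairs in text order."""
    stack = [] if stack is None else stack
    if node.kind == "clause":
        yield node.label, tuple(stack)
        return
    # Only constituents that include a core introduce a DSP, so only they
    # push a focus space; clusters leave the attentional state untouched.
    pushes = node.kind == "segment"
    if pushes:
        stack.append(node.label)
    for child in node.children:
        yield from focus_trace(child, stack)
    if pushes:
        stack.pop()


# Invented example: segment S1 contains a clause, a cluster, and a sub-segment.
tree = Node("S1", "segment", [
    Node("c1", "clause"),
    Node("K", "cluster", [Node("c2", "clause"), Node("c3", "clause")]),
    Node("S2", "segment", [Node("c4", "clause")]),
])

trace = dict(focus_trace(tree))
```

In `trace`, the clauses inside the cluster `K` see the same open focus spaces as `c1`, whereas `c4` sees an extra space for the embedded segment `S2`.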

  8. Mapping RDA into Attentional State
     • 1.1 Before troubleshooting inside the test station,
     • 1.2 It’s always best to eliminate both the UUT and the TP
     • 2.1 Since the test package is moved frequently
     • 2.2 It is prone to damage
     • 3.1 Also, testing the test package is much easier and faster
     • 3.2 than opening up test station drawers.
     [Diagram: RDA tree over clauses 1.1 to 3.2 with two discourse segment purposes marked, DSP 1 and DSP 2. Intentional relations: CONVINCE, CONVINCE, ENABLE. Informational relations: prescribed-act:wrong-act, cause:effect, step1:step2.]

  9. Using an RDA-annotated corpus to study anaphoric accessibility
     • The data: the SHERLOCK corpus, already annotated according to the RDA instructions (Moser, 1996)
     • Added anaphoric annotation according to the GNOME instructions (Poesio, 2000), derived from the MATE scheme (Poesio, Bruneseaux, and Romary, 1999)
     • Use the RDA analysis to drive focus space construction
     • Measure:
       • Accessibility
       • Perplexity

  10. The Data: the SHERLOCK corpus
      • 17 tutorial dialogues collected within the Sherlock project (Lesgold et al., 1992)
      • Students solve electronic troubleshooting problems
      • 313 turns, 1333 clauses
      • RDA annotation: Moser and Moore, 1996
      • Reliability verified at different levels
      • Intentional relations: CONCEDE, CONVINCE, ENABLE, JOINT

  11. An example of Sherlock dialogue
      • STUDENT:
        • 1.1 Why isn't measurement signal path green during good test readings (steps)?
      • TUTOR:
        • 2.1 For each step that passed,
        • 2.2 you know the measurement path is good.
        • 2.3 You also know that one of the measurement paths is bad.
        • 2.4 Showing the UUT, Test Package, and measurement section as unknown is correct
        • 2.5 because, you know when you get your fail that something was wrong,
        • 2.6 but you didn't know exactly what.
        • 2.7 The DMM is green
        • 2.8 because it has been working all along.
        • 2.9 The stimulus section is green
        • 2.10 because it was not used
        • 2.11 and is assumed to be good.

  12. Anaphoric Annotation
      • The GNOME scheme (Poesio, 2000)
      • Mark up all NPs as NE elements, with a variety of attributes
        • About 3000 NEs
      • Use a separate ANTE element to mark up anaphoric relations (including bridges)
        • In this annotation: only direct anaphoric relations (about 1500 in total)

  13. Evaluation
      • A Perl script simulates focus space construction and computes accessibility and perplexity
      • Accessibility: whether the antecedent is in the focus stack
      • Perplexity: $\sum_i \frac{1}{d(x_i)}\, m(x_i)$, where $m(x_i) = 1$ if $x_i$ matches the anaphor and $0$ otherwise
      • Parameters for focus space construction:
        • PUSHING:
          • Whenever a relation is encountered (either informational or intentional)
          • Only when the relation is intentional
        • POPPING:
          • As soon as the associated constituent is completed
          • Immediate popping of contributors, delayed popping of cores
          • Delayed popping of contributors
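The two measures can be sketched in Python as follows. This is not the authors' Perl script. The slide does not define $d(x_i)$, so the sketch assumes it is the 1-based position of candidate $x_i$ in the search order; the function names, the focus stack contents, and the toy matcher are likewise assumptions made for illustration.

```python
from typing import Callable, Sequence


def is_accessible(antecedent: str, focus_stack: Sequence[Sequence[str]]) -> bool:
    """Accessibility: is the antecedent in any focus space still on the stack?"""
    return any(antecedent in space for space in focus_stack)


def perplexity(candidates: Sequence[str], matches: Callable[[str], bool]) -> float:
    """Sum of m(x_i) / d(x_i) over the candidates, taken in search order.

    ASSUMPTION: d(x_i) is the 1-based position of x_i in the search order;
    the slide leaves d undefined.
    """
    return sum(1.0 / d for d, x in enumerate(candidates, start=1) if matches(x))


# Hypothetical focus stack, bottom space first; the entities are illustrative.
focus_stack = [["the UUT", "the drawers"], ["the test package"]]

# Search from the top of the stack downwards.
candidates = [entity for space in reversed(focus_stack) for entity in space]

# Toy matcher for the pronoun "it": any singular candidate counts as a match.
singular = {"the UUT", "the test package"}
score = perplexity(candidates, lambda x: x in singular)
```

Under this reading, every matching candidate contributes to the score, and candidates found earlier in the search contribute more than those found later.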

  14. Evaluation I: Intentional vs Informational
      • Accessibility: [chart not preserved in this transcript]
      • Perplexity: All = 0.83, Intentional = 1.23

  15. Complications
      • 24.13a Since S52 puts a return (0 VDC) on its outputs
      • 24.13b when they are active,
      • 24.14 the inactive state must be some other voltage.
      • 24.15 So even though you may not know what the “other” voltage is,
      • 24.16 You can test to ensure that
      • 24.17a the active pins are 0 VDC
      • 24.17b and all the inactive pins are not 0 VDC.
      [Diagram: RDA tree labeled DSP 1 over clauses 24.13a to 24.17b. Intentional relations: ENABLE, CONCEDE, ENABLE. Informational relations: effect:cause, contrast1:contrast2.]

  17. Evaluation II: Delayed Popping
      • Accessibility: [chart not preserved in this transcript]
      • Average perplexity:
        • Immediate popping: 1.23
        • Delayed popping of cores: 1.3
        • Delayed popping of contributors: 1.33

  18. Discussion
      • Accessibility:
        • The intentional vs. informational distinction makes sense (cf. Fox)
        • Want to keep contributors as well as cores on the stack (cf. Veins Theory)
      • An evaluation of Grosz and Sidner’s framework:
        • The most direct implementation makes quite a few discourse entities inaccessible
        • It is difficult to interpret more complex operations in terms of intentional structure
      • Alternative: a cache model (cf. Guindon 1985; Walker 1996, 1998)
        • Version 1 (conservative): cache of focus spaces
        • Version 2: cache of forward-looking centers

  19. Cache-based global focus: a conservative proposal
      • Cache elements are FOCUS SPACES
      • Cache elements are RANKED: current focus space < other constituents of the same segment < dominating segments < focus spaces of contributors to closed spaces (cf. Reichman 1985)
      • Search algorithm: follow the ranking
      • Cache replacement algorithm:
        • Opening an RDA segment: open a new focus space, replace the lowest-ranked element of the cache, and assign the new space the highest rank
        • Closing an RDA segment: assign the lowest rank to embedded contributors
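The replacement and search rules on this slide might be sketched as below. The sketch makes several assumptions the slide does not state: the cache has a fixed `capacity`, the ranking is kept as a list with the highest rank (searched first) at index 0, and the finer four-level ranking is collapsed into "most recently opened first, closed contributors last". The class and method names are hypothetical.

```python
from typing import Optional


class FocusCache:
    """Bounded, ranked cache of focus spaces; index 0 holds the highest rank."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity  # ASSUMPTION: the slide gives no cache size
        self.spaces: list[tuple[str, set[str]]] = []

    def open_segment(self, label: str) -> None:
        # Replace the lowest-ranked space if the cache is full, then give the
        # new focus space the highest rank.
        if len(self.spaces) >= self.capacity:
            self.spaces.pop()
        self.spaces.insert(0, (label, set()))

    def mention(self, entity: str) -> None:
        # Entities are recorded in the highest-ranked (current) focus space.
        self.spaces[0][1].add(entity)

    def close_segment(self, contributors: set[str]) -> None:
        # Embedded contributors stay in the cache but drop to the lowest rank.
        kept = [s for s in self.spaces if s[0] not in contributors]
        demoted = [s for s in self.spaces if s[0] in contributors]
        self.spaces = kept + demoted

    def find(self, entity: str) -> Optional[str]:
        # Search follows the ranking and returns the label of the first hit.
        for label, entities in self.spaces:
            if entity in entities:
                return label
        return None


cache = FocusCache(capacity=2)
cache.open_segment("core")
cache.mention("S52")
cache.open_segment("contributor")
cache.mention("the other voltage")
cache.close_segment({"contributor"})
after_close = cache.find("the other voltage")  # still cached, at lowest rank
cache.open_segment("next")                     # cache full: lowest rank is replaced
after_open = cache.find("the other voltage")
```

Unlike a stack pop, closing a segment here leaves the contributor's entities reachable until a later segment needs the slot, which is one way to keep contributors available a little longer.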
