
Empirical Evaluation of Pronoun Resolution and Clausal Structure


Presentation Transcript


  1. Empirical Evaluation of Pronoun Resolution and Clausal Structure Joel Tetreault and James Allen University of Rochester Department of Computer Science

  2. RST and pronoun resolution • Previous work suggests that breaking utterances apart into clauses (Kameyama, 1998) or assigning them a hierarchical structure (Grosz and Sidner, 1986; Webber, 1988) can aid the resolution of pronouns: • Make search more efficient (fewer entities to consider) • Make search more successful (block competing antecedents) • Empirical work has focused on using segmentation to limit the accessibility space of antecedents • Test the claim by performing an automated study on a corpus (1241-sentence subsection of the Penn Treebank; 454 third-person pronouns)

  3. Rhetorical Structure Theory • A way of organizing and describing natural text (Mann and Thompson, 1988) • It identifies a hierarchical structure • Describes binary relations between text parts
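A minimal sketch of how such an analysis might be represented, assuming a simple recursive node type (the class name, fields, and example relation labels are illustrative, not taken from the paper or from the RST Discourse Treebank encoding):

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class RSTNode:
    nuclearity: str                              # "Nucleus" or "Satellite"
    relation: str                                # relation to the parent, e.g. "Contrast"
    text: Optional[str] = None                   # leaves carry the clause text
    children: Optional[List["RSTNode"]] = None   # internal nodes carry subtrees

# Two clauses of one sentence joined by a multinuclear Contrast relation:
left = RSTNode("Nucleus", "Contrast",
               text="The package was termed excessive by the Bush administration,")
right = RSTNode("Nucleus", "Contrast",
                text="but it also provoked a struggle with influential California lawmakers")
contrast = RSTNode("Nucleus", "Span", children=[left, right])
```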

  4. Experiment • Create a coreference corpus that includes Penn Treebank syntactic trees and RST information • Run pronoun resolution algorithms over this merged data set to determine baseline scores • LRC (Tetreault, 1999) • S-list (Strube, 1998) • BFP (Brennan et al., 1987) • Develop algorithms that use clausal information to compare against the baseline

  5. Corpus • 52 Wall Street Journal articles from the 1995 Penn Treebank • 1273 sentences, 7594 words, 454 third-person pronouns • Pronoun corpus annotated in the same manner as Ge and Charniak (1998) • RST corpus from the RST Discourse Treebank (Marcu et al., 2002)

  6. Pronoun Corpus ( (S (S (NP-SBJ-1#-290~1 (DT The) (NN package) ) (VP (VBD was) (VP (VBN termed) (S (NP-SBJ (-NONE- *-1) ) (ADJP-PRD (JJ excessive) )) (PP (IN by) (NP-LGS (DT the) (NNP Bush) (NN administration) ))))) (, ,) (CC but) (S (NP-SBJ (PRP#OBJREF-290~2 it) ) (ADVP (RB also) ) (VP (VBD provoked) (NP ...

  7. RST Corpus (SATELLITE (SPAN 4 19) (REL2PAR ELABORATION-ADDITIONAL) (SATELLITE (SPAN 4 7) (REL2PAR CIRCUMSTANCE) (NUCLEUS (LEAF 4) (REL2PAR CONTRAST) (TEXT _!The package was termed excessive by the Bush administration,_!)) (NUCLEUS (SPAN 5 7) (REL2PAR CONTRAST) (NUCLEUS (LEAF 5) (REL2PAR SPAN) (TEXT _!but it also provoked a struggle with influential California lawmakers_!))

  8. Baseline Results

  9. LRC Algorithm • While processing the utterance's entities (left to right) do: • Push the entity onto Cf-list-new; if it is a pronoun, attempt to resolve it first: • Search through Cf-list-new (left to right), taking the first candidate that meets gender, agreement, and other constraints • If none is found, search the past utterances' Cf-lists, starting from the previous utterance and working back to the beginning of the discourse (see the sketch below)
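The search order described above can be sketched as follows (a rough sketch, not the authors' implementation; `agrees` stands in for the gender/number agreement filter and all names are illustrative):

```python
def lrc_resolve(pronoun, cf_list_new, past_cf_lists, agrees):
    """Return the first compatible antecedent for `pronoun`, or None.

    cf_list_new   -- entities of the current utterance processed so far (left to right)
    past_cf_lists -- Cf-lists of earlier utterances, most recent first
    """
    # 1. Intrasentential search: left to right over the entities already
    #    pushed onto the current utterance's Cf-list.
    for cand in cf_list_new:
        if agrees(pronoun, cand):
            return cand
    # 2. Intersentential search: walk back from the previous utterance
    #    toward the beginning of the discourse.
    for cf_list in past_cf_lists:
        for cand in cf_list:
            if agrees(pronoun, cand):
                return cand
    return None
```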

  10. LRC Error Analysis (89 errors) • (24) Minimal S • “the committee said the company reneged on its obligations” • (21) Localized Errors • “…to get a customer’s 1100 parcel-a-week load to its doorstep” • (15) Preposed Phrase • “Although he was really tired, John managed to drive 10 hours without sleep”

  11. LRC Errors (2) • (12) Parallelism • “It more than doubled the Federal’s long-term debt to 1.9 billion dollars, thrust the company into unknown territory – heavy cargo – and suddenly expanded its landing rights to 21 countries from 4.” • (11) Competing Antecedents • “The weight of Lebanon’s history was also against him, and it is a history…” • (4) Plurals referring to companies • “The Ministry of Construction spreads concrete…. But they seldom think of the poor commuters.”

  12. LRC Errors (3) • (2) Genitive Errors • “Mr. Richardson wouldn’t offer specifics regarding Atco’s proposed British project, but he said it would compete for customers…”

  13. Advanced Approaches • Grosz and Sidner (1986) – discourse structure depends on intentional structure; attentional state is modeled as a stack of focus spaces that are pushed and popped as the intentional structure changes (see the sketch below) • Veins Theory (Ide and Cristea, 2000) – the position of nuclei and satellites in an RST tree determines the DRA (domain of referential accessibility) for each clause
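A toy sketch of the stack-based attentional state (illustrative only; the decision of when to push or pop is driven by the intentional structure, which is abstracted away here):

```python
class AttentionalState:
    """Stack of focus spaces; only entities in spaces still on the stack are accessible."""

    def __init__(self):
        self.stack = []

    def push_segment(self):
        # Opening a new discourse segment pushes a fresh focus space.
        self.stack.append([])

    def pop_segment(self):
        # Closing a segment pops its focus space; its entities become inaccessible.
        return self.stack.pop()

    def add_entity(self, entity):
        self.stack[-1].append(entity)

    def candidates(self):
        # Walk from the most local focus space outward; the ordering of
        # entities within a space is a design choice left open here.
        for space in reversed(self.stack):
            yield from space
```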

  14. G&S Accessibility (figure: stack of focus spaces containing e1, e2 / e3 / e4 / e5 / e6, p1) • Search order for p1: e6, e5, e4, e1, e2

  15. Veins Theory • Each RST discourse unit (leaf) has an associated vein (Cristea et al., 1998; Ide and Cristea, 2000) • The vein provides a “summary of the discourse fragment that contains that unit” • It contains the salient parts of the RST tree – the preceding nuclei and surrounding satellites • Veins are determined by whether a node is a nucleus or a satellite and by what its left and right children are (see the sketch below)
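A loose reconstruction of the head/vein computation, following the published definitions as I understand them and reusing the RSTNode sketch from above (the `vein` attribute and helper names are introduced here for illustration, and multi-child cases are glossed over):

```python
def head(node):
    # A leaf's head is the leaf itself; an internal node's head is the
    # concatenation of the heads of its nuclear children.
    if not node.children:
        return [node]
    return [u for c in node.children if c.nuclearity == "Nucleus" for u in head(c)]

def assign_veins(node, vein=None):
    # The vein of the root is its head; children inherit and extend it top-down.
    node.vein = head(node) if vein is None else vein
    kids = node.children or []
    for i, child in enumerate(kids):
        if child.nuclearity == "Nucleus":
            # A nucleus with a satellite sibling to its left also carries that
            # satellite's head, but in "marked" (restricted) form.
            marked = [("mark", u) for s in kids[:i]
                      if s.nuclearity == "Satellite" for u in head(s)]
            assign_veins(child, marked + node.vein)
        elif i == 0:
            # A left satellite prepends its own head to the parent's vein.
            assign_veins(child, head(child) + node.vein)
        else:
            # A right satellite sees the parent's vein with marked units removed.
            simplified = [u for u in node.vein
                          if not (isinstance(u, tuple) and u[0] == "mark")]
            assign_veins(child, head(child) + simplified)

# The DRA of a unit is then the set of units in its vein that precede it
# in the text, plus the unit itself.
```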

  16. Veins Algorithm • Use the same data set, augmented with head and vein information (computed automatically) • Exception: the RST data set has some multi-child nodes; assume all extra children are right children • Bonus: areas to the left of the root are potentially accessible – this makes global topics introduced at the beginning accessible • Implementation – search each unit in the entity’s DRA, starting with the most recent unit and moving left to right within each clause; if no antecedent is found, fall back to the LRC search (see the sketch below)
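The resolution strategy itself might then look like this (a sketch assuming each clause carries a `dra` list of accessible units ordered by recency and each unit lists its entities left to right; `lrc_fallback` stands for the LRC search sketched on slide 9, already bound to the current discourse context, and `agrees` is the same morphological filter):

```python
def veins_resolve(pronoun, clause, agrees, lrc_fallback):
    # Search the units in the pronoun's DRA, most recent unit first,
    # left to right within each unit.
    for unit in reversed(clause.dra):
        for cand in unit.entities:
            if agrees(pronoun, cand):
                return cand
    # If nothing in the DRA is compatible, fall back to the plain LRC search.
    return lrc_fallback(pronoun)
```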

  17. Transforms • Goal of the transforms – flatten the corpus slightly to create larger segments, so more entities can be considered • SAT – merge a satellite leaf into its sibling if the sibling is a subtree consisting entirely of leaves • SENT – merge the clauses of a sentence in the RST tree back into a single sentence-level unit • ATT – merge clauses that are in an Attribution relation (see the sketch below)
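A rough sketch of one of these transforms (ATT) over the RSTNode sketch from above; SAT and SENT would follow the same pattern with different merge conditions. This approximates the slide's description rather than the authors' implementation:

```python
def leaf_text(node):
    # Concatenate the clause text of all leaves under a node, in order.
    if not node.children:
        return node.text
    return " ".join(leaf_text(c) for c in node.children)

def att_transform(node):
    # ATT: if this node's children are joined by an Attribution relation,
    # collapse the whole subtree into a single leaf so the clauses form
    # one larger segment.
    kids = node.children or []
    if kids and any(c.relation == "Attribution" for c in kids):
        node.text = leaf_text(node)
        node.children = None
        return
    for c in kids:
        att_transform(c)
```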

  18. Transform Examples (tree diagrams) • (1) ORIG: subtree root dominating a nucleus subtree (nucleus leaf C1, satellite leaf C2) and a satellite leaf C3; C1 and C2 are in an Attribution relation • (2) SAT: subtree root with sibling leaves C1, C2, C3 (the satellite leaf raised to join its sibling’s leaves) • (3) SENT: subtree root with a single merged leaf C1 + C2 + C3 • (4) ATT: subtree root with leaves C1 + C2 and C3

  19. SAT example • ORIGINAL: a nucleus node with two children – a nucleus subtree containing the nucleus leaf “S.A. Brewing would make a takeover offer for all of Bell Resources” and the condition satellite leaf “if it exercises the option”, plus an attribution satellite leaf “according to the commission.” • TRANSFORM: the satellite leaf “according to the commission” is raised so that all three clauses are sibling leaves under the nucleus node

  20. SENT example • ORIGINAL: the sentence is split into a nucleus leaf “Under the plan, Costa Rica will buy back roughly 60% of its bank debt at a deeply discounted price” and an attribution satellite leaf “according to officials involved in the agreement.” • TRANSFORM: the clauses are merged back into a single nucleus leaf “Under the plan, Costa Rica will buy back roughly 60% of its bank debt at a deeply discounted price, according to officials involved in the agreement”

  21. ATT example • ORIGINAL: a satellite (summary) node with a nucleus leaf “Lion Nathan has a concluded contract with Bond and Bell Resources,” and an attribution satellite leaf “said Douglas Myers, Chief Executive of Lion Nathan.” • TRANSFORM: a single satellite leaf (summary) “Lion Nathan has a concluded contract with Bond and Bell Resources, said Douglas Myers, Chief Executive of Lion Nathan”

  22. Results

  23. Long Distance Resolution • 10 cases in the corpus of pronouns with antecedents more than 2 utterances away, most in Attribution relations • LRC gets them all correct, since there are no competing antecedents (“him”, “their”) • Veins (without ATT) gets 6 out of 10 • With the transforms, all algorithms get 100%

  24. Conclusions • Two ways to determine the success of a decomposition strategy: intrasentential and intersentential resolution • Intra: no improvement; better to use grammatical function • Inter: long-distance resolutions… hard to draw concrete conclusions • Need more data to determine whether the transforms give a good approximation of segmentation • Using G&S accessibility of clauses does not seem to work either • At a minimum, even if a method performs the same, it has the advantage of a smaller search space

  25. Future Work • Error analysis shows that determining coherence relations could account for several intrasentential cases • Use the rhetorical relations themselves to constrain the accessibility of entities • Annotate human-human dialogues in the TRIPS 911 domain for reference; these have already been annotated for argumentation acts (Stent, 2001)
