
Empirical Evaluation of Pronoun Resolution and Clausal Structure


Presentation Transcript


  1. Empirical Evaluation of Pronoun Resolution and Clausal Structure Joel Tetreault and James Allen University of Rochester Department of Computer Science

  2. RST and pronoun resolution • Previous work suggests that breaking utterances apart into clauses (Kameyama, 1998) or assigning them a hierarchical structure (Grosz and Sidner, 1986; Webber, 1988) can aid the resolution of pronouns: • Make search more efficient (fewer entities to consider) • Make search more successful (block competing antecedents) • Empirical work has focused on using segmentation to limit the accessibility space of antecedents • Test the claim by performing an automated study on a corpus (1241-sentence subsection of the Penn Treebank; 454 third-person pronouns)

  3. Rhetorical Structure Theory • A way of organizing and describing natural text (Mann and Thompson, 1988) • It identifies a hierarchical structure • Describes binary relations between text parts
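A minimal sketch of how such an analysis might be represented, assuming a simple recursive node type (the class name, fields, and example relation labels are illustrative, not taken from the paper or from the RST Discourse Treebank encoding):

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class RSTNode:
    nuclearity: str                              # "Nucleus" or "Satellite"
    relation: str                                # relation to the parent, e.g. "Contrast"
    text: Optional[str] = None                   # leaves carry the clause text
    children: Optional[List["RSTNode"]] = None   # internal nodes carry subtrees

# Two clauses of one sentence joined by a multinuclear Contrast relation:
left = RSTNode("Nucleus", "Contrast",
               text="The package was termed excessive by the Bush administration,")
right = RSTNode("Nucleus", "Contrast",
                text="but it also provoked a struggle with influential California lawmakers")
contrast = RSTNode("Nucleus", "Span", children=[left, right])
```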

  4. Experiment • Create a coreference corpus that includes Penn Treebank syntactic trees and RST information • Run pronoun resolution algorithms over this merged data set to determine baseline scores • LRC (Tetreault, 1999) • S-list (Strube, 1998) • BFP (Brennan et al., 1987) • Develop algorithms that use clausal information to compare against the baseline

  5. Corpus • 52 Wall Street Journal articles from the 1995 Penn Treebank • 1273 sentences, 7594 words, 454 third-person pronouns • Pronoun corpus annotated in the same manner as Ge and Charniak (1998) • RST corpus from the RST Discourse Treebank (Marcu et al., 2002)

  6. Pronoun Corpus ( (S (S (NP-SBJ-1#-290~1 (DT The) (NN package) ) (VP (VBD was) (VP (VBN termed) (S (NP-SBJ (-NONE- *-1) ) (ADJP-PRD (JJ excessive) )) (PP (IN by) (NP-LGS (DT the) (NNP Bush) (NN administration) ))))) (, ,) (CC but) (S (NP-SBJ (PRP#OBJREF-290~2 it) ) (ADVP (RB also) ) (VP (VBD provoked) (NP ...

  7. RST Corpus (SATELLITE (SPAN 4 19) (REL2PAR ELABORATION-ADDITIONAL) (SATELLITE (SPAN 4 7) (REL2PAR CIRCUMSTANCE) (NUCLEUS (LEAF 4) (REL2PAR CONTRAST) (TEXT _!The package was termed excessive by the Bush administration,_!)) (NUCLEUS (SPAN 5 7) (REL2PAR CONTRAST) (NUCLEUS (LEAF 5) (REL2PAR SPAN) (TEXT _!but it also provoked a struggle with influential California lawmakers_!))

  8. Baseline Results

  9. LRC Algorithm • While processing the utterance's entities (left to right) do: • Push the entity onto Cf-list-new; if it is a pronoun, attempt to resolve it first: • Search through Cf-list-new (left to right), taking the first candidate that meets gender, agreement, and other constraints • If none is found, search the past utterances' Cf-lists, starting from the previous utterance and working back to the beginning of the discourse (see the sketch below)
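The search order described above can be sketched as follows (a rough sketch, not the authors' implementation; `agrees` stands in for the gender/number agreement filter and all names are illustrative):

```python
def lrc_resolve(pronoun, cf_list_new, past_cf_lists, agrees):
    """Return the first compatible antecedent for `pronoun`, or None.

    cf_list_new   -- entities of the current utterance processed so far (left to right)
    past_cf_lists -- Cf-lists of earlier utterances, most recent first
    """
    # 1. Intrasentential search: left to right over the entities already
    #    pushed onto the current utterance's Cf-list.
    for cand in cf_list_new:
        if agrees(pronoun, cand):
            return cand
    # 2. Intersentential search: walk back from the previous utterance
    #    toward the beginning of the discourse.
    for cf_list in past_cf_lists:
        for cand in cf_list:
            if agrees(pronoun, cand):
                return cand
    return None
```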

  10. LRC Error Analysis (89 errors) • (24) Minimal S • “the committee said the company reneged on its obligations” • (21) Localized Errors • “…to get a customer’s 1100 parcel-a-week load to its doorstep” • (15) Preposed Phrase • “Although he was really tired, John managed to drive 10 hours without sleep”

  11. LRC Errors (2) • (12) Parallelism • “It more than doubled the Federal’s long-term debt to 1.9 billion dollars, thrust the company into unknown territory – heavy cargo – and suddenly expanded its landing rights to 21 countries from 4.” • (11) Competing Antecedents • “The weight of Lebanon’s history was also against him, and it is a history…” • (4) Plurals referring to companies • “The Ministry of Construction spreads concrete…. But they seldom think of the poor commuters.”

  12. LRC Errors (3) • (2) Genitive Errors • “Mr. Richardson wouldn’t offer specifics regarding Atco’s proposed British project, but he said it would compete for customers…”

  13. Advanced Approaches • Grosz and Sidner (1986) – discourse structure depends on intentional structure; attentional state is modeled as a stack of focus spaces that are pushed and popped as the intentional structure changes (see the sketch below) • Veins Theory (Ide and Cristea, 2000) – the position of nuclei and satellites in an RST tree determines the DRA (domain of referential accessibility) for each clause
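A toy sketch of the stack-based attentional state (illustrative only; the decision of when to push or pop is driven by the intentional structure, which is abstracted away here):

```python
class AttentionalState:
    """Stack of focus spaces; only entities in spaces still on the stack are accessible."""

    def __init__(self):
        self.stack = []

    def push_segment(self):
        # Opening a new discourse segment pushes a fresh focus space.
        self.stack.append([])

    def pop_segment(self):
        # Closing a segment pops its focus space; its entities become inaccessible.
        return self.stack.pop()

    def add_entity(self, entity):
        self.stack[-1].append(entity)

    def candidates(self):
        # Walk from the most local focus space outward; the ordering of
        # entities within a space is a design choice left open here.
        for space in reversed(self.stack):
            yield from space
```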

  14. G&S Accessibility (figure: stack of focus spaces containing e1, e2 / e3 / e4 / e5 / e6, p1) • Search order for p1: e6, e5, e4, e1, e2

  15. Veins Theory • Each RST discourse unit (leaf) has an associated vein (Cristea et al., 1998; Ide and Cristea, 2000) • The vein provides a “summary of the discourse fragment that contains that unit” • It contains the salient parts of the RST tree – the preceding nuclei and surrounding satellites • Veins are determined by whether a node is a nucleus or a satellite and by what its left and right children are (see the sketch below)
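A loose reconstruction of the head/vein computation, following the published definitions as I understand them and reusing the RSTNode sketch from above (the `vein` attribute and helper names are introduced here for illustration, and multi-child cases are glossed over):

```python
def head(node):
    # A leaf's head is the leaf itself; an internal node's head is the
    # concatenation of the heads of its nuclear children.
    if not node.children:
        return [node]
    return [u for c in node.children if c.nuclearity == "Nucleus" for u in head(c)]

def assign_veins(node, vein=None):
    # The vein of the root is its head; children inherit and extend it top-down.
    node.vein = head(node) if vein is None else vein
    kids = node.children or []
    for i, child in enumerate(kids):
        if child.nuclearity == "Nucleus":
            # A nucleus with a satellite sibling to its left also carries that
            # satellite's head, but in "marked" (restricted) form.
            marked = [("mark", u) for s in kids[:i]
                      if s.nuclearity == "Satellite" for u in head(s)]
            assign_veins(child, marked + node.vein)
        elif i == 0:
            # A left satellite prepends its own head to the parent's vein.
            assign_veins(child, head(child) + node.vein)
        else:
            # A right satellite sees the parent's vein with marked units removed.
            simplified = [u for u in node.vein
                          if not (isinstance(u, tuple) and u[0] == "mark")]
            assign_veins(child, head(child) + simplified)

# The DRA of a unit is then the set of units in its vein that precede it
# in the text, plus the unit itself.
```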

  16. Veins Algorithm • Use the same data set, augmented with head and vein information (computed automatically) • Exception: the RST data set has some multi-child nodes; assume all extra children are right children • Bonus: areas to the left of the root are potentially accessible – this makes global topics introduced at the beginning accessible • Implementation – search each unit in the entity’s DRA, starting with the most recent unit and moving left to right within each clause; if no antecedent is found, fall back to the LRC search (see the sketch below)
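The resolution strategy itself might then look like this (a sketch assuming each clause carries a `dra` list of accessible units ordered by recency and each unit lists its entities left to right; `lrc_fallback` stands for the LRC search sketched on slide 9, already bound to the current discourse context, and `agrees` is the same morphological filter):

```python
def veins_resolve(pronoun, clause, agrees, lrc_fallback):
    # Search the units in the pronoun's DRA, most recent unit first,
    # left to right within each unit.
    for unit in reversed(clause.dra):
        for cand in unit.entities:
            if agrees(pronoun, cand):
                return cand
    # If nothing in the DRA is compatible, fall back to the plain LRC search.
    return lrc_fallback(pronoun)
```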

  17. Transforms • Goal of the transforms – flatten the corpus slightly to create larger segments, so more entities can be considered • SAT – merge a satellite leaf into its sibling if the sibling is a subtree consisting entirely of leaves • SENT – merge the clauses of a sentence in the RST tree back into a single sentence-level unit • ATT – merge clauses that are in an Attribution relation (see the sketch below)
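A rough sketch of one of these transforms (ATT) over the RSTNode sketch from above; SAT and SENT would follow the same pattern with different merge conditions. This approximates the slide's description rather than the authors' implementation:

```python
def leaf_text(node):
    # Concatenate the clause text of all leaves under a node, in order.
    if not node.children:
        return node.text
    return " ".join(leaf_text(c) for c in node.children)

def att_transform(node):
    # ATT: if this node's children are joined by an Attribution relation,
    # collapse the whole subtree into a single leaf so the clauses form
    # one larger segment.
    kids = node.children or []
    if kids and any(c.relation == "Attribution" for c in kids):
        node.text = leaf_text(node)
        node.children = None
        return
    for c in kids:
        att_transform(c)
```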

  18. Transform Examples (tree diagrams) • (1) ORIG: subtree root dominating a nucleus subtree (nucleus leaf C1, satellite leaf C2) and a satellite leaf C3; C1 and C2 are in an Attribution relation • (2) SAT: subtree root with sibling leaves C1, C2, C3 (the satellite leaf raised to join its sibling’s leaves) • (3) SENT: subtree root with a single merged leaf C1 + C2 + C3 • (4) ATT: subtree root with leaves C1 + C2 and C3

  19. SAT example • ORIGINAL: a nucleus node with two children – a nucleus subtree containing the nucleus leaf “S.A. Brewing would make a takeover offer for all of Bell Resources” and the condition satellite leaf “if it exercises the option”, plus an attribution satellite leaf “according to the commission.” • TRANSFORM: the satellite leaf “according to the commission” is raised so that all three clauses are sibling leaves under the nucleus node

  20. SENT example • ORIGINAL: the sentence is split into a nucleus leaf “Under the plan, Costa Rica will buy back roughly 60% of its bank debt at a deeply discounted price” and an attribution satellite leaf “according to officials involved in the agreement.” • TRANSFORM: the clauses are merged back into a single nucleus leaf “Under the plan, Costa Rica will buy back roughly 60% of its bank debt at a deeply discounted price, according to officials involved in the agreement”

  21. ATT example • ORIGINAL: a satellite (summary) node with a nucleus leaf “Lion Nathan has a concluded contract with Bond and Bell Resources,” and an attribution satellite leaf “said Douglas Myers, Chief Executive of Lion Nathan.” • TRANSFORM: a single satellite leaf (summary) “Lion Nathan has a concluded contract with Bond and Bell Resources, said Douglas Myers, Chief Executive of Lion Nathan”

  22. Results

  23. Long Distance Resolution • 10 cases in the corpus of pronouns with antecedents more than 2 utterances away, most in Attribution relations • LRC gets them all correct, since there are no competing antecedents (“him”, “their”) • Veins (without ATT) gets 6 out of 10 • With the transforms, all algorithms get 100%

  24. Conclusions • Two ways to determine the success of a decomposition strategy: intrasentential and intersentential resolution • Intra: no improvement; better to use grammatical function • Inter: long-distance resolutions… hard to draw concrete conclusions • Need more data to determine whether the transforms give a good approximation of segmentation • Using G&S accessibility of clauses does not seem to work either • At a minimum, even if a method performs the same, it has the advantage of a smaller search space

  25. Future Work • Error analysis shows that determining coherence relations could account for several intrasentential cases • Use the rhetorical relations themselves to constrain the accessibility of entities • Annotate human-human dialogues in the TRIPS 911 domain for reference; these have already been annotated for argumentation acts (Stent, 2001)
