Exploiting Syntactic Patterns as Clues in Zero-Anaphora Resolution

Presentation Transcript


  1. Exploiting Syntactic Patterns as Clues in Zero-Anaphora Resolution. Ryu Iida, Kentaro Inui and Yuji Matsumoto, Nara Institute of Science and Technology, {ryu-i,inui,matsu}@is.naist.jp. June 20th, 2006

  2. Zero-anaphora resolution • Zero-anaphor = a gap with an anaphoric function • Zero-anaphora resolution is becoming important in many applications • In Japanese, even obligatory arguments of a predicate are often omitted when they are inferable from the context • 45.5% of the nominative arguments of verbs are omitted in newspaper articles

  3. Zero-anaphora resolution (cont'd) • Three sub-tasks: • Zero-pronoun detection: detect a zero-pronoun • Antecedent identification: identify the antecedent for a given zero-pronoun • Anaphoricity determination
     Example (anaphoric zero-pronoun; antecedent = John):
       Mary-wa John-ni (φ-ga) tabako-o yameru-youni it-ta
       Mary-TOP John-DAT (φ-NOM) smoking-OBJ quit-COMP say-PAST
       [Mary asked John to quit smoking.]

  4. Zero-anaphora resolution (cont'd) • Three sub-tasks: • Zero-pronoun detection: detect a zero-pronoun • Antecedent identification: identify the antecedent from the set of candidate antecedents for a given zero-pronoun • Anaphoricity determination: classify whether a given zero-pronoun is anaphoric or non-anaphoric
     Anaphoric zero-pronoun (antecedent = John):
       Mary-wa John-ni (φ-ga) tabako-o yameru-youni it-ta
       Mary-TOP John-DAT (φ-NOM) smoking-OBJ quit-COMP say-PAST
       [Mary asked John to quit smoking.]
     Non-anaphoric zero-pronoun:
       (φ-ga) ie-ni kaeri-tai
       (φ-NOM) home-DAT want to go back
       [(φ = I) want to go home.]

  5. Previous work on anaphora resolution • The research trend has been shifting from rule-based approaches (Baldwin, 95; Lappin and Leass, 94; Mitkov, 97, etc.) to empirical, learning-based approaches (Soon et al., 01; Ng, 04; Yang et al., 05, etc.) • A cost-efficient solution for achieving performance comparable to the best-performing rule-based systems • Learning-based approaches represent each problem (anaphoricity determination and antecedent identification) as a set of feature vectors and apply machine learning algorithms to them
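
A minimal illustration of what "representing the problem as feature vectors" can look like for antecedent identification. The feature names and the attributes assumed on the mention objects (grammatical_role, sentence_id, is_named_entity) are illustrative assumptions, not the paper's feature set.

```python
# Hypothetical sketch: turn a (zero-pronoun, candidate antecedent) pair into a
# binary feature vector that a standard learner (SVM, maximum entropy, ...)
# could be trained on. Attribute names on the mention objects are assumptions.

def pair_features(zero_pronoun, candidate):
    return {
        "cand_is_subject": candidate.grammatical_role == "subject",
        "same_sentence": candidate.sentence_id == zero_pronoun.sentence_id,
        "cand_is_named_entity": candidate.is_named_entity,
        "distance_le_2": abs(zero_pronoun.sentence_id - candidate.sentence_id) <= 2,
    }

def to_vector(features, feature_names):
    # Fixed feature ordering so every pair maps to a vector of the same length.
    return [1 if features.get(name) else 0 for name in feature_names]
```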

  6. Syntactic pattern features • Useful clues for both anaphoricity determination and antecedent identification
     [Figure: dependency tree of the example sentence: Mary-wa (Mary-TOP), Antecedent John-ni (John-DAT), zero-pronoun φ-ga (φ-NOM), tabako-o (smoking-OBJ), predicate yameru-youni (quit-COMP), predicate it-ta (say-PAST)]

  7. Syntactic pattern features • Useful clues for both anaphoricity determination and antecedent identification • Questions: • How to encode syntactic patterns as features • How to avoid the data sparseness problem
     [Figure: the same dependency tree as in slide 6]

  8. Talk outline • Zero-anaphora resolution: Background • Selection-then-classification model (Iida et al., 05) • Proposed model • Represents syntactic patterns based on dependency trees • Uses a tree mining technique to seek useful sub-trees to solve the data sparseness problem • Incorporates syntactic pattern features in the selection-then-classification model • Experiments on Japanese zero-anaphora • Conclusion and future work

  9. Selection-then-Classification Model (SCM) (Iida et al., 05)
     Example: "A federal judge in Pittsburgh issued a temporary restraining order preventing Trans World Airlines from buying additional shares of USAir Group Inc. The order, requested in a suit filed by USAir, …"
     [Figure: the candidate anaphor USAir and its candidate antecedents (federal judge, order, …, USAir Group Inc, suit) are passed to the tournament model]

  10. Selection-then-Classification Model (SCM) (Iida et al., 05)
     [Figure: the tournament model (Iida et al., 03) compares the candidate antecedents (federal judge, order, USAir Group Inc, suit, …) for the candidate anaphor USAir in a series of pairwise matches; USAir Group Inc wins each match]

  11. Selection-then-Classification Model (SCM) (Iida et al., 05)
     [Figure: the tournament model selects USAir Group Inc as the most likely candidate antecedent for the candidate anaphor USAir]

  12. Selection-then-Classification Model (SCM) (Iida et al., 05)
     [Figure: the anaphoricity determination model scores the pair (USAir, USAir Group Inc): if score ≥ θana, USAir is anaphoric and USAir Group Inc is the antecedent of USAir; if score < θana, USAir is non-anaphoric]

  13. Selection-then-Classification Model (SCM) (Iida et al., 05)
     [Figure: same diagram as slide 12, continued]
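
To make the two-stage decision of slides 9-13 concrete, here is a minimal sketch of the SCM control flow, assuming a trained tournament comparator and a trained anaphoricity scorer are supplied as functions. This is a sketch of the idea, not the authors' code.

```python
# Selection-then-classification, sketched:
#  1) a tournament over candidate antecedents picks the most likely one,
#  2) an anaphoricity determination model accepts or rejects that pair
#     against a threshold theta_ana.

def select_candidate(anaphor, candidates, tournament_prefers_right):
    """Left-to-right tournament; the surviving candidate is the most likely antecedent."""
    winner = candidates[0]
    for challenger in candidates[1:]:
        if tournament_prefers_right(anaphor, winner, challenger):
            winner = challenger
    return winner

def resolve(anaphor, candidates, tournament_prefers_right, anaphoricity_score, theta_ana):
    if not candidates:
        return None                      # nothing to link to: non-anaphoric
    best = select_candidate(anaphor, candidates, tournament_prefers_right)
    if anaphoricity_score(anaphor, best) >= theta_ana:
        return best                      # anaphoric: `best` is the antecedent
    return None                          # score below threshold: non-anaphoric
```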

  14. Training the anaphoricity determination model
     [Figure (NPi: candidate antecedent): Anaphoric instances: an anaphoric noun phrase (ANP) is paired with its antecedent, e.g. NP4 from the candidate set NP1-NP5. Non-anaphoric instances: a non-anaphoric noun phrase (NANP) is paired with the candidate antecedent that the tournament model selects from its candidate set, e.g. NP3 from NP1-NP5]
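
A sketch of how the training pairs described on this slide could be built: anaphoric NPs are paired with their true antecedent (positive), non-anaphoric NPs with the candidate the tournament model would pick (negative). The attribute names (is_anaphoric, antecedent, candidate_antecedents) are assumptions for illustration.

```python
def build_anaphoricity_training_data(noun_phrases, select_most_likely):
    """select_most_likely(np, candidates) stands in for the tournament model's choice."""
    instances = []   # list of ((anaphor, candidate_antecedent), label)
    for np in noun_phrases:
        if np.is_anaphoric:
            # ANP: pair the anaphoric NP with its true antecedent (positive).
            instances.append(((np, np.antecedent), +1))
        elif np.candidate_antecedents:
            # NANP: pair the non-anaphoric NP with the candidate the
            # tournament model would select (negative).
            best = select_most_likely(np, np.candidate_antecedents)
            instances.append(((np, best), -1))
    return instances
```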

  15. Talk outline • Zero-anaphora resolution: Background • Selection-then-classification model (Iida et al., 05) • Proposed model • Represents syntactic patterns based on dependency trees • Uses a tree mining technique to seek useful sub-trees to solve the data sparseness problem • Incorporates syntactic pattern features in the selection-then-classification model • Experiments on Japanese zero-anaphora • Conclusion and future work

  16. New model
     [Figure: the SCM pipeline applied to the candidate anaphor USAir: the tournament model selects USAir Group Inc as the most likely candidate antecedent from {federal judge, order, …, USAir Group Inc, suit}; the anaphoricity determination model then compares its score with θana (score ≥ θana: USAir is anaphoric and USAir Group Inc is its antecedent; score < θana: USAir is non-anaphoric). The new model incorporates syntactic pattern features into this pipeline]

  17. Use of syntactic pattern features • Encoding parse tree features • Learning useful sub-trees

  18. Encoding parse tree features
     [Figure: dependency tree of the example: Mary-wa (Mary-TOP), Antecedent John-ni (John-DAT), zero-pronoun φ-ga (φ-NOM), tabako-o (smoking-OBJ), predicate yameru-youni (quit-COMP), predicate it-ta (say-PAST)]

  19. Encoding parse tree features
     [Figure: the same tree; the nodes carrying the anaphoric relation (Antecedent John-ni, zero-pronoun φ-ga, predicates yameru-youni and it-ta) are distinguished from the remaining nodes (Mary-wa, tabako-o)]

  20. Encoding parse tree features
     [Figure: the content words on those nodes are generalized to their roles: Antecedent, zero-pronoun, predicate, predicate]

  21. Encoding parse tree features
     [Figure: the functional morphemes are kept on the generalized nodes: ni-DAT, ga-CONJ, youni-CONJ, ta-PAST]

  22. Encoding parse trees
     [Figure: three sub-trees are extracted for the tournament comparison: TL contains LeftCand, the zero-pronoun and the predicates; TI contains LeftCand, RightCand and the predicate; TR contains RightCand, the zero-pronoun and the predicates. In the example: LeftCand Mary-wa (Mary-TOP), RightCand John-ni (John-DAT), zero-pronoun φ-ga (φ-NOM), tabako-o (smoking-OBJ), predicate yameru-youni (quit-COMP), predicate it-ta (say-PAST)]
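
One way to read slides 18-22 as code: prune the dependency tree down to the nodes that connect a candidate and the zero-pronoun, then generalize the node labels. This is a rough sketch under assumed node attributes (head, is_predicate, functional_suffix), not the authors' implementation.

```python
def path_to_root(node):
    path = []
    while node is not None:
        path.append(node)
        node = node.head                 # dependency head; None at the root
    return path

def connecting_subtree(candidate, zero_pronoun):
    """Nodes on the paths from the two mentions up to their lowest common head."""
    cand_path = path_to_root(candidate)
    zero_path = path_to_root(zero_pronoun)
    keep = set()
    for node in cand_path:
        keep.add(node)
        if node in zero_path:
            break                        # stop at the lowest common ancestor
    for node in zero_path:
        if node in keep:
            break
        keep.add(node)
    return keep

def abstract_label(node, candidate, zero_pronoun):
    # Generalize content words to roles, keep functional morphemes (slides 20-21).
    if node is candidate:
        return "Cand"
    if node is zero_pronoun:
        return "zero-pronoun"
    if node.is_predicate:
        return "predicate:" + node.functional_suffix   # e.g. "youni-CONJ", "ta-PAST"
    return node.functional_suffix                      # e.g. case particle "ni-DAT"
```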

  23. Encoding parse trees • Antecedent identification
     [Figure: a training instance is a tree whose root dominates the three sub-trees]

  24. Encoding parse trees • Antecedent identification
     [Figure: in addition to the three sub-trees, the root dominates nodes 1, 2, …, n encoding lexical, grammatical, semantic, positional and heuristic binary features]

  25. Encoding parse trees • Antecedent identification
     [Figure: the root node carries the label (left or right) indicating which candidate is correct, and dominates the three sub-trees plus the nodes 1, 2, …, n for the lexical, grammatical, semantic, positional and heuristic binary features]
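
A minimal sketch of the training-instance tree in slides 23-25. The exact layout (tuple of label and child list, "feat" nodes for active binary features) is an assumption made for illustration.

```python
def build_instance_tree(label, t_left, t_inter, t_right, active_features):
    """Root labelled 'left' or 'right', dominating TL, TI, TR and one node per active feature."""
    assert label in ("left", "right")    # which candidate won the comparison
    feature_nodes = [("feat", name) for name in sorted(active_features)]
    return ("root:" + label, [t_left, t_inter, t_right] + feature_nodes)
```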

  26. Learning useful sub-trees • Kernel methods: • Tree kernel (Collins and Duffy, 01) • Hierarchical DAG kernel (Suzuki et al., 03) • Convolution tree kernel (Moschitti, 04) • Boosting-based algorithm: • BACT (Kudo and Matsumoto, 04), which learns a list of weighted decision stumps with the Boosting algorithm

  27. Learning useful sub-trees • Boosting-based algorithm: BACT • Learns a list of weighted decision stumps with Boosting • Classifies a given input tree by weighted voting
     [Figure: from labelled training trees, BACT learns weighted decision stumps (e.g. a sub-tree with weight 0.4 and label positive); applying the stumps to an input tree yields a weighted vote (e.g. score +0.34, classified positive)]
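
A sketch of how a BACT-style classifier scores an input tree by weighted voting over sub-tree stumps. The stump representation and the contains_subtree routine are assumptions standing in for the real sub-tree matching; this is not the BACT implementation itself.

```python
def bact_score(tree, stumps, contains_subtree):
    """stumps: list of (sub_tree, weight); contains_subtree(tree, sub_tree) -> bool."""
    score = 0.0
    for sub_tree, weight in stumps:
        vote = 1.0 if contains_subtree(tree, sub_tree) else -1.0
        score += weight * vote           # each stump casts a weighted vote
    return score                         # classify as positive if score > 0
```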

  28. Overall process (see the sketch below)
     Input: a zero-pronoun φ in sentence S.
     1. Intra-sentential model (uses the syntactic pattern features): if scoreintra ≥ θintra, output the most likely candidate antecedent appearing in S.
     2. Otherwise (scoreintra < θintra), inter-sentential model: if scoreinter ≥ θinter, output the most likely candidate appearing outside of S.
     3. Otherwise (scoreinter < θinter), return "non-anaphoric".
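
The overall decision procedure of slide 28, written as a sketch. The intra-sentential and inter-sentential models and their thresholds are assumed to be given as functions and numbers; the return conventions are illustrative.

```python
def resolve_zero_pronoun(phi, sentence, discourse,
                         intra_model, theta_intra,
                         inter_model, theta_inter):
    # 1. Try candidates in the same sentence S with the intra-sentential model
    #    (the one that uses the syntactic pattern features).
    cand, score = intra_model(phi, sentence)
    if score >= theta_intra:
        return cand
    # 2. Back off to candidates outside S with the inter-sentential model.
    cand, score = inter_model(phi, discourse)
    if score >= theta_inter:
        return cand
    # 3. Otherwise the zero-pronoun is judged non-anaphoric.
    return "non-anaphoric"
```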

  29. Table of contents • Zero-anaphora resolution • Selection-then-classification model (Iida et al., 05) • Proposed model • Parse encoding • Tree mining • Experiments • Conclusion and future work

  30. Experiments • Japanese newspaper article corpus annotated with zero-anaphoric relations: 197 texts (1,803 sentences) • 995 intra-sentential anaphoric zero-pronouns • 754 inter-sentential anaphoric zero-pronouns • 603 non-anaphoric zero-pronouns
     Recall = (# of correctly resolved zero-anaphoric relations) / (# of anaphoric zero-pronouns)
     Precision = (# of correctly resolved zero-anaphoric relations) / (# of anaphoric zero-pronouns the model detected)
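
The two evaluation measures of slide 30, written out as code for clarity.

```python
def recall(n_correct, n_anaphoric):
    # correctly resolved zero-anaphoric relations / anaphoric zero-pronouns
    return n_correct / n_anaphoric

def precision(n_correct, n_detected):
    # correctly resolved zero-anaphoric relations / anaphoric zero-pronouns
    # the model detected (judged anaphoric)
    return n_correct / n_detected
```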

  31. Experimental settings • Five-fold cross validation • Comparison among four models • BM: Ng and Cardie (02)'s model: • Identifies an antecedent with candidate-wise classification • Determines the anaphoricity of a given anaphor as a by-product of the search for its antecedent • BM_STR: BM + syntactic pattern features • SCM: Selection-then-classification model (Iida et al., 05) • SCM_STR: SCM + syntactic pattern features
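
A plain-Python sketch of a five-fold split as used in the experiments. The round-robin fold assignment is an assumption; any disjoint partition of the texts would serve.

```python
def five_fold_indices(n_items, n_folds=5):
    # Assign items to folds round-robin, then yield (train, test) index lists.
    folds = [list(range(i, n_items, n_folds)) for i in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test
```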

  32. Results of intra-sentential ZAR • Antecedent identification (accuracy): the performance of antecedent identification improved by using the syntactic pattern features

  33. Results of intra-sentential ZAR • antecedent identification + anaphoricity determination

  34. Impact on overall ZAR • Evaluate the overall performance for both intra-sentential and inter-sentential ZAR • Baseline model: SCM • resolves intra-sentential and inter-sentential zero-anaphora simultaneously with no syntactic pattern features.

  35. Results of overall ZAR

  36. AUC curve • AUC (Area Under the recall-precision Curve), obtained by plotting recall-precision points while altering θintra • The curve is not peaky, so optimizing the parameter θintra is not difficult
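
A sketch of the evaluation behind this slide: sweep θintra, compute a recall-precision point at each value, and approximate the area under the curve with the trapezoidal rule. The evaluate_at function is a hypothetical stand-in that returns (recall, precision) for one threshold value.

```python
def recall_precision_curve(thresholds, evaluate_at):
    # One (recall, precision) point per threshold, sorted by recall.
    return sorted(evaluate_at(t) for t in thresholds)

def area_under_curve(points):
    # Trapezoidal approximation of the area under the recall-precision curve.
    auc = 0.0
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        auc += (r1 - r0) * (p0 + p1) / 2.0
    return auc
```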

  37. Conclusion • We have addressed the issue of how to use syntactic patterns for zero-anaphora resolution. • How to encode syntactic pattern features • How to seek useful sub-trees • Incorporating syntactic pattern features into our selection-then-classification model improves the accuracy for intra-sentential zero-anaphora, which consequently improves the overall performance of zero-anaphora resolution

  38. Future work • How to find zero-pronouns? • Designing a broader framework to interact with analysis of predicate argument structure • How to find a globally optimal solution to the set of zero-anaphora resolution problems in a given discourse? • Exploring methods as discussed by McCallum and Wellner (03)
