
The extension of the Anaphora Resolution Exercise (ARE) to Spanish and Catalan



1. The extension of the Anaphora Resolution Exercise (ARE) to Spanish and Catalan
Constantin Orasan, University of Wolverhampton, UK
Marta Recasens, Universitat de Barcelona, Spain

2. Structure
• Description of ARE2007
• English corpus used in ARE2007
• The AnCora corpora
• Adapting the AnCora corpora for ARE2009
• Plans for ARE2009

3. The Anaphora Resolution Exercises (AREs)
• the goal of ARE was to “develop discourse anaphora resolution methods and to evaluate them in a common and consistent manner”
• they are organised in conjunction with the DAARC conferences
• conceived as multilingual evaluations
• not meant to be restricted to pronominal and NP coreference
Do we need a roadmap?

4. ARE2007
• organised in conjunction with DAARC2007
• English texts only
• very short time to organise it
• can be considered a dry run for ARE2009
• focused on 4 tasks
• 3 participants, 8 runs submitted
• used the NP4E corpus, a corpus of newswire texts

5. Task 1: Pronominal resolution on pre-annotated texts
• resolve pronouns to NPs
• participants received the pronouns to be resolved and the candidate NPs
Pronouns to resolve: 6 = it
Input text: [Israeli-PLO relations]1 have hit [a new low]2 with [the Palestinian Authority]3 saying [Israel]5 is wrong to think [it]6 can treat [the Authority]7 like [a client militia]8.
Output: (6, 5) = (it, Israel)
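To make the input/output concrete, here is a minimal Python sketch of the example above; the dictionary-based representation is our illustration only, not the official ARE distribution format.

```python
# Hypothetical in-memory view of the Task 1 example
# (the real ARE file format is not shown on the slide).
markables = {
    1: "Israeli-PLO relations",
    2: "a new low",
    3: "the Palestinian Authority",
    5: "Israel",
    6: "it",
    7: "the Authority",
    8: "a client militia",
}
pronouns_to_resolve = [6]

# A system must pair each pronoun id with the id of one candidate NP:
system_output = {6: 5}

for pronoun, antecedent in system_output.items():
    print(f"({pronoun}, {antecedent}) = "
          f"({markables[pronoun]}, {markables[antecedent]})")
# prints: (6, 5) = (it, Israel)
```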

6. Evaluation method for Task 1
• success rate (accuracy), defined as the number of correctly resolved anaphoric pronouns divided by the total number of anaphoric pronouns
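Written as a formula, the slide's definition is:

\[
\text{success rate} = \frac{\#\ \text{correctly resolved anaphoric pronouns}}{\#\ \text{anaphoric pronouns}}
\]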

7. Task 2: Coreferential chain resolution on pre-annotated texts
• assign NPs to chains
• participants received texts in which the NPs belonging to a chain with at least two elements were annotated
Input text: [Israeli-PLO relations]1 have hit a new low with [the Palestinian Authority]2 saying [Israel]3 is wrong to think [it]4 can treat [the Authority]5 like a client militia.
Output:
Chain 1: 3 = Israel, 4 = it
Chain 2: 2 = the Palestinian Authority, 5 = the Authority

8. Evaluation method for Task 2
• precision and recall as defined by MUC
• only one system participated:
  • precision: 53.01%
  • recall: 45.72%
  • f-measure: 48.32%
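For reference, the MUC link-based scores (Vilain et al., 1995) referred to here count, for each key (gold) chain $K_i$, how many links are lost once $K_i$ is partitioned by the response chains into $p(K_i)$:

\[
R = \frac{\sum_i \left( |K_i| - |p(K_i)| \right)}{\sum_i \left( |K_i| - 1 \right)}
\]

Precision is obtained by swapping the roles of key and response, and the f-measure is the harmonic mean of the two.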

9. Task 3: Pronominal resolution on raw texts
• unannotated texts were given to participants
• systems had to:
  • determine the referential pronouns
  • identify the candidate NPs
  • resolve pronouns to NPs
Input text: Japan26 and27 Peru28 on29 Saturday30 took31 a32 tough33 stand34 … their45 accord46 was47 swiftly48 ...
Output: (45 – 45, 26 – 28): (their, Japan and Peru)

10. Task 4: Coreferential chain resolution on raw texts
• unannotated texts were given to participants
• systems had to:
  • determine the coreferential NPs
  • assign them to chains
• the most popular task (3 runs submitted)
Input text: Japan26 and27 Peru28 on29 Saturday30 took31 a32 tough33 stand34 … their45 accord46 was47 swiftly48 ...
Output: (26 – 28, 45 – 45, …): (Japan and Peru, their, …)

11. Overlap measure
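The formula on this slide did not survive the transcript. Purely as an illustration, one plausible token-overlap measure between a system extent and a gold extent (both given as inclusive word-index ranges, as in the Task 3/4 examples) is the Jaccard-style ratio sketched below; the actual ARE2007 definition may differ.

```python
# Hedged sketch of a token-overlap measure between two extents,
# each an inclusive (start, end) pair of word indices.
# This Jaccard-style definition is an assumption, not necessarily
# the metric ARE2007 actually used.
def overlap(system: tuple[int, int], gold: tuple[int, int]) -> float:
    s = set(range(system[0], system[1] + 1))
    g = set(range(gold[0], gold[1] + 1))
    return len(s & g) / len(s | g)

# System extent (26, 27) vs. gold extent (26, 28): 2 shared tokens
# out of 3 in the union, so the overlap is 2/3.
print(overlap((26, 27), (26, 28)))  # 0.666...
```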

12. Evaluation method for Task 3

13. Evaluation of Task 3

14. Evaluation for Task 4
• MUC scores modified to use the overlap metric

15. Rationale for the tasks
• Tasks 1 and 3: evaluation of pronominal anaphora
• Tasks 2 and 4: evaluation of coreference resolution
• Tasks 1 and 2: evaluation of algorithms
• Tasks 3 and 4: evaluation of fully automatic systems

16. Corpus used in ARE2007
• the NP4E corpus (Hasler et al., 2006)
• over 55,000 words
• newswire texts
• five clusters of related documents
• annotation in two steps:
  • identification of markables
  • identification of relations between markables
• annotation done using PALinkA (Orasan, 2003)

17. Markables
• all NPs at all levels, regardless of whether they are coreferential or not
• include all the modifiers (both pre- and post-modifiers)
• possessive pronouns and possessors
• no relative pronouns or relative clauses
• no NPs from fixed expressions (in town, on board, etc.)

18. Coreferential links
• COREF and UCOREF
• only nominal, identity-of-reference, direct anaphoric expressions
• relations marked:
  • identity
  • synonymy
  • generalisation
  • specialisation
• lexical choice rather than concept (e.g. the house … the door)

19. Coreferential links
• definite NPs in copular relation: [the blast] was [the worst attack on [civilians] on [U.S. soil]]
• definite appositives: [Zaire Airlines, [the main commercial airline in [Zaire]]]
• text in brackets
• I, you, we in speech coreferential to their antecedents

20. AnCora corpora
• ANnotated CORporA for Catalan and Spanish
• newspaper and newswire texts
• 500,000 words each
• annotated with:
  • PoS tags and lemmas
  • constituents and functions
  • argument structures, thematic roles
  • named entities
  • nominal WordNet synsets
  • coreference relations

21. AnCora corpora
• XML in-line annotation
• markables (syntactic nodes):
  • NPs  <sn> ... </sn> and <sn elliptic="yes"/>
  • clitics  <v> darles </v>
  • clauses, sentences  <S> ... </S>
• attributes:
  • entity="entity#"
  • coreftype="ident/pred/dx"

22. Example: AnCora-Ca
"[L' aeroport] ha d' anar amb [compte] amb [els sorolls]. [Ø] Ha de comportar -se com [un bon veí]", va recomanar [Morlanes] ... Malgrat [les diferències entre [AENA i veïns de [Gavà_Mar]]]
('"[The airport] has to be [careful] with [the noise]. [Ø] It has to behave like [a good neighbour]," recommended [Morlanes] ... Despite [the differences between [AENA and residents of [Gavà_Mar]]]')

23. Example: AnCora-Ca
"<sn entity="entity3">L' aeroport</sn> ha d' anar amb compte amb els sorolls. <sn elliptic="yes" entity="entity3" coreftype="ident"/> Ha de comportar -se com un bon veí", va recomanar Morlanes ... Malgrat les diferències entre <sn entity="entity3" coreftype="ident">AENA</sn> i veïns de Gavà_Mar
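As an aside, this in-line format is easy to process with standard XML tooling. The snippet below is our own minimal sketch (not official AnCora tooling) that groups markables into entities; the embedded XML is a hypothetical well-formed reconstruction of the slide's example.

```python
# Minimal sketch: collect AnCora-style <sn> markables by entity id.
# The snippet is a hand-made, well-formed version of the slide's example.
import xml.etree.ElementTree as ET
from collections import defaultdict

snippet = """<S>"<sn entity="entity3">L' aeroport</sn> ha d' anar amb
compte amb els sorolls. <sn elliptic="yes" entity="entity3"
coreftype="ident"/> Ha de comportar -se com un bon veí", va recomanar
Morlanes ... Malgrat les diferències entre
<sn entity="entity3" coreftype="ident">AENA</sn> i veïns de Gavà_Mar</S>"""

root = ET.fromstring(snippet)
chains = defaultdict(list)
for sn in root.iter("sn"):
    entity = sn.get("entity")
    if entity is None:
        continue  # markable not assigned to an entity
    if sn.get("elliptic") == "yes":
        chains[entity].append("Ø")  # elliptical subject, no surface form
    else:
        chains[entity].append((sn.text or "").strip())

for entity, mentions in chains.items():
    print(entity, "->", mentions)
# entity3 -> ["L' aeroport", 'Ø', 'AENA']
```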

24. Identity of reference
• identity, synonymy: l'Ajuntament de Tarragona ... l'Ajuntament ('Tarragona City Council ... the Council'); los usuarios de la Red en EEUU ... los internautas estadounidenses ('Internet users in the USA ... American Internet users')
• generalisation, specialisation: los precios del café ... los precios ('coffee prices ... the prices')
• metonymy: los conductores de camiones ... los camiones no hacen caso de los agentes ('the truck drivers ... the trucks ignore the officers')

25. Identity of reference
• different scope of generics: las mujeres de España ... las mujeres ('the women of Spain ... women')
• place boundness: In Garraf, the unemployment rate ... it is higher in Lleida
• time boundness: el Festival de la Música Viva ... aquesta edició ('the Festival de la Música Viva ... this edition')
• unrealized entities: Si hay [un fan de [Georgie_Fame], o de [Gary_Brooker] o de [Albert_Lee]], [Ø] puede estar a [dos metros de [él]] ('If there is [a fan of [Georgie_Fame], or of [Gary_Brooker] or of [Albert_Lee]], [Ø] he can be [two metres from [him]]')

26. AnCora vs. NP4E
Similarities:
• all NPs (modifiers, embedded NPs, coordinated NPs), e.g.
  [passengers on [a flight from [Moscow] to [Nigeria]]]
  [la existencia de [una fuerte división] en [esta institución]] ('the existence of a strong division in this institution')
  [[McVeigh] and [Nichols]] / [[el PP] y [el PSOE]]
  [[Barcelona] airport] vs. [l' aeroport de [Barcelona]]
• no NPs that are part of fixed expressions, e.g. came to power, subió al poder
• identity relation
• predicative relation: copular, apposition
• no identity-of-sense: China-org vs. China-loc
• no pleonastic pronouns

27. AnCora vs. NP4E
Differences:
• zero elements: elliptical subjects
• clitic pronouns, e.g. give them = (Spanish) darles / (Catalan) donar-les
• relative pronouns
• discourse deixis
• no possessive pronouns
• split antecedents

28. Preparation of Catalan and Spanish data for ARE2009
• there are many similarities between the guidelines used for the AnCora and NP4E corpora
• features that are too specific will be discarded
• AnCora will be converted to the light XML annotation used in ARE2007
We hope not to encounter major problems when we do the actual conversion.

29. Lessons learnt from ARE2007
• if possible, more evaluation methods and more baselines
• a better overlap metric
• participants want more time (lots of interest, but the evaluation clashed with some major conferences)
• participants want to be able to publish

30. Better overlap metric
• no head/MIN attribute for the markables
• as a result, the same system obtained better results on Task 4 than on Task 2:
  • Task 2  score 0
  • Task 4  score 0.xxxx
• proposed fix: insert a MIN attribute so that partial matches between the gold standard and the output of the system can be credited

31. Plans for ARE2009
• include 4 languages: Catalan, Dutch, English, and Spanish
• keep the 4 tasks
• include a multilingual task for pronominal anaphora resolution
• evaluate some preprocessing stages for anaphora resolution
• have a real-time task

32. Preprocessing tasks
• identification of pleonastic it pronouns in English texts
• identification of pleonastic het pronouns in Dutch
• identification of elliptical subjects in Spanish and Catalan

33. NP anaphora and NP coreference resolution tasks
        Catalan  Dutch  English  Spanish
Task 1  Yes      No     Yes      Yes
Task 2  Yes      No     Yes      Yes
Task 3  Yes      Yes    Yes      Yes
Task 4  Yes      Yes    Yes      Yes

34. Multilingual task for pronominal anaphora resolution
• Is it possible to have a multilingual system?
• participants get a set of documents with paragraphs in Catalan, Dutch, English, and Spanish
• referential personal pronouns are marked in all the texts
• candidate noun phrases are not annotated
• a modified version of success rate will be used that considers both how correctly pronouns were resolved and in how many languages
• 350+50 pronouns per language
… but more thought is needed
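Since the slide leaves the metric open, here is one possible formalisation, purely our illustration and not the organisers' definition: average the per-language success rates over the four languages, counting a language a system does not handle as 0, so that both accuracy and language coverage are rewarded:

\[
\text{multilingual score} = \frac{1}{4} \sum_{\ell \in \{\text{ca, nl, en, es}\}} \text{success rate}_\ell
\]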

35. Real-time tasks
• invite DAARC participants to take part in a real-time exercise
• the same tasks as for the main ARE2009 exercise, but …
  • participants will need to bring their programs
  • they will have one hour to submit the results
  • the tasks may include some surprise texts
• … subject to interest from participants and the presence of the necessary infrastructure

36. Tentative timescale
14 Nov 2008: Preliminary call for participation
15 Jan 2009: Training data released
4 – 23 May 2009: Test data released (48 hours to submit results after the test data is downloaded)
30 May 2009: Results communicated back to participants
6 June 2009: 4-page technical reports due from participants
20 June 2009: Reviews back to participants
1 July 2009: Final version of technical reports
5 – 6 Nov 2009: DAARC2009, Goa, India

37. Webpage: http://www.anaphora-and-coreference.info/ARE2009
Mailing list: ARE2009-list@anaphora-and-coreference.info
Email address: ARE2009@anaphora-and-coreference.info
Thank you!
