ANNOTATING EVENT ANAPHORA: A CASE STUDY

Tommaso Caselli and Irina Prodanof, ILC-CNR, Pisa (tommaso.caselli@ilc.cnr.it, irina.prodanof@ilc.cnr.it). LREC-10, May 19th, La Valletta, Malta.

Presentation Transcript


  1. ANNOTATING EVENT ANAPHORA: A CASE STUDY Tommaso Caselli and Irina Prodanof ILC-CNR, Pisa tommaso.caselli@ilc.cnr.it irina.prodanof@ilc.cnr.it LREC-10 – May, 19th, La Valletta, Malta

  2. Outline • Motivations • Coreference annotation in TimeML • Annotating event anaphora: a preliminary scheme • Annotation methodology and results • Lessons learned and future work

  3. Motivations • Eventualities are the building blocks of the informative content of a document • Eventualities give rise to relations which create a rich informative network: • temporal relations • sharing of participants • factivity • coreferential relations • Coreferential relations among eventualities play an important role in facilitating access to content and extracting relevant information

  4. Coref. in TimeML • TimeML & ISO-TimeML are standards for the annotation of events, temporal expressions and a set of relations between these entities (temporal, subordinating and aspectual relations) • Main contribution of TimeML: standard definition of event and methodology for its annotation • It-TimeML: Italian adaptation of TimeML (updated version on request) and part of ISO-TimeML • It-TimeML is currently used for the creation of the Italian TimeBank (172 news articles from ISST, PAROLE and Web, 67,140 tokens)

  5. Coref. in TimeML (2) • TimeML tags involved: EVENT and TLINK (temporal link) • TimeML has no specific link for coreference annotation • workaround: use of a special value of the TLINK tag: “identity” • “identity” is used to: • connect two tokens which are part of a single event instance (e.g. light verbs) • mark coreferential relations between events, namely set-subset relations

  6. Coref. in TimeML (3) – Use of “identity”
     fare la spesa [to do shopping].
     <EVENT id="e1">fare</EVENT> la <EVENT id="e2">spesa</EVENT>
     <TLINK lid="l1" eventInstanceID="e1" relatedToEventInstance="e2" relType="IDENTITY"/>

  7. Coref. in TimeML – Use of “identity” (3)
     La sessione privata servira’ a tre [adempimenti]j. Innanzitutto, all’ [approvazione]j della proposta di Abete (ISST sole006).
     The private session will be used for three [fulfillments]j. First, the [approval]j of the proposal of Abete.
     La <EVENT id="e1">sessione</EVENT> privata <EVENT id="e2">servira’</EVENT> a tre <EVENT id="e3">adempimenti</EVENT>. <SIGNAL id="s1">Innanzitutto</SIGNAL>, all’ <EVENT id="e4">approvazione</EVENT> della <EVENT id="e5">proposta</EVENT> di Abete.
     <TLINK lid="l1" eventInstanceID="e4" relatedToEventInstance="e3" relType="IDENTITY"/>
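The “identity” workaround on slides 5 to 7 can be inspected programmatically. The following is a minimal sketch, not part of the original presentation, that reads the slide-7 annotation with Python's standard `xml.etree.ElementTree` and lists the event pairs connected by an IDENTITY TLINK. The `<TimeML>` wrapper element and the function name `identity_pairs` are assumptions added only so the fragment parses as a single XML document.

```python
import xml.etree.ElementTree as ET

# Slide-7 example, wrapped in an assumed <TimeML> root so it is well-formed XML.
SAMPLE = """<TimeML>
La <EVENT id="e1">sessione</EVENT> privata <EVENT id="e2">servira'</EVENT> a tre
<EVENT id="e3">adempimenti</EVENT>. <SIGNAL id="s1">Innanzitutto</SIGNAL>, all'
<EVENT id="e4">approvazione</EVENT> della <EVENT id="e5">proposta</EVENT> di Abete.
<TLINK lid="l1" eventInstanceID="e4" relatedToEventInstance="e3" relType="IDENTITY"/>
</TimeML>"""


def identity_pairs(xml_text):
    """Return (source text, target text) for every TLINK with relType IDENTITY."""
    root = ET.fromstring(xml_text)
    # Map each event id to the token it annotates.
    events = {e.get("id"): (e.text or "").strip() for e in root.iter("EVENT")}
    pairs = []
    for link in root.iter("TLINK"):
        if link.get("relType") == "IDENTITY":
            source = events.get(link.get("eventInstanceID"))
            target = events.get(link.get("relatedToEventInstance"))
            pairs.append((source, target))
    return pairs


print(identity_pairs(SAMPLE))  # [('approvazione', 'adempimenti')]
```

Note that the output alone does not say whether a pair is a light-verb construction (slide 6) or a set-subset relation (slide 7), which is the lack of homogeneity the next slide points out.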

  8. Coref. in TimeML (4) • The use of the value “identity” is not satisfactory since it is NOT homogeneous • During the (current!) annotation effort for the creation of the Italian TimeBank we have observed that this value could be applied to other cases such as: • synonyms • hypernyms • coreference (strict coreference – same referent in the world)

  9. Event Anaphora • Previous work: Hasler et al. 2006; Bejan & Harabagiu 2008 • Hasler et al. 2006: NP coreference only (strict definition); detailed guidelines, but NO specifications for the annotation • which events? ACE event frames (LIFE, CONFLICT, MOVEMENT, JUSTICE, ...) • TimeML compliant • Bejan & Harabagiu 2008: event coreference as a side effect of event structure • Event coreference is considered when two predicates are the same predicate, synonyms or hypernyms and share the same arguments • TimeML compliant

  10. Event Anaphora - Methodology (2) • Our approach: • neither event frames nor event templates; all event instances annotated in the Italian TimeBank (TimeML compliant) • open-domain text/discourse • coarse-grained, bottom-up approach to the definition of the annotation scheme • reduced and limited set of guidelines → active discovery of what is needed through annotation and observations from the data • event anaphora: strict coreference + indirect coreference

  11. Event Anaphora - Annotation scheme (3) <MARKABLE> = <EVENT>, BUT extended → includes annotation of pronouns and adverbs.

  12. Event Anaphora - Annotation scheme (4) <EMPTY> = to annotate cases of zero anaphora and ellipsis (frequent in Italian) <TOPIC> = to annotate entire portions of text; it provides an anchor for those linguistic entities which can refer to a discourse topic
      “[Stiamo ancora parlando, come certamente deve essere, e continueremo a consultarci]”j. James Baker, segretario al Tesoro americano, ha commentato [cosi’]j i risultati dell’assemblea. (ISST els019)
      “[We are still speaking, as it should be, and we will keep consulting]”j. James Baker, the American Treasury Secretary, commented [so]j the results of the assembly.

  13. Event Anaphora - Annotation scheme (4) <EMPTY> = to annotate cases of zero anaphora and ellipsis (frequent in Italian) <TOPIC> = to annotate entire portions of text; it provides an anchor for those linguistic entities which can refer to a discourse topic <LINK> = marks up an anaphoric relation. The attribute “anaphorType” specifies the type of anaphoric relation; “src” marks the anchor

  14. Event Anaphora – Results (5) • Annotation tool: PALinkA (Orasan, 2003) • 3 annotators / 1,792 tokens • no K scores • Low agreement on the identification of anaphora, but relatively good agreement on the anchors • More specific guidelines and information are needed • Event anaphora is a widespread phenomenon
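The slide reports that no K scores were computed. As an illustration only, the sketch below shows one way agreement among three annotators could be measured with Fleiss' kappa; choosing this statistic is an assumption, since the presentation does not name one. The function name `fleiss_kappa`, the label names and the toy judgments are invented for the example and are not the authors' data.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each a list with one label per annotator.

    Assumes every item has the same number of annotators (at least two)
    and that more than one label occurs overall.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({label for item in ratings for label in item})
    counts = [Counter(item) for item in ratings]

    # Observed agreement: share of agreeing annotator pairs, averaged over items.
    per_item = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    observed = sum(per_item) / n_items

    # Expected agreement from the overall proportion of each label.
    proportions = [
        sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories
    ]
    expected = sum(p * p for p in proportions)

    return (observed - expected) / (1 - expected)


# Invented toy judgments: is a given markable an event anaphor or not?
toy = [
    ["anaphor", "anaphor", "anaphor"],
    ["none", "none", "none"],
    ["anaphor", "anaphor", "none"],
    ["none", "none", "anaphor"],
]
print(round(fleiss_kappa(toy), 3))  # 0.333
```

Computing the score separately for anaphor identification and for anchor selection would make it possible to quantify the contrast described on the slide.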

  15. Lessons Learned and Future Work • Event anaphora is a widespread phenomenon which must be addressed in separate tasks • Relations between full event N, V, PP and Adj • no pronominal anaphora • New annotation scheme: • 2 tags: <EVENT> and <AnafLink> • different attributes for <EVENT>: FACTIVITY, GENERICITY, POLARITY • relations between particular events according to the attributes' values • reduced set of anaphora types (two values: direct vs. indirect) • Tracking of the participants: how to? • Event anaphora annotation as a further link in TimeML or as a separate task which can be built upon the TimeML annotation • New tool: BAT (thanks to Marc Verhagen)
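To make the proposed two-tag scheme concrete, here is a hypothetical in-memory model of it, sketched for illustration and not taken from the authors' tooling. Only the tag names, the three <EVENT> attributes and the direct/indirect distinction come from the slide. The Python class and field names, the attribute values ("UNSPECIFIED", "POS") and the decision to label the set-subset example as indirect are all assumptions.

```python
from dataclasses import dataclass

# The two anaphora types named on the slide.
ANAPHOR_TYPES = {"direct", "indirect"}


@dataclass(frozen=True)
class Event:
    """An <EVENT> markable with the three attributes listed on the slide."""
    id: str
    text: str
    factivity: str   # value set not given in the slides; placeholder strings
    genericity: str  # value set not given in the slides; placeholder strings
    polarity: str    # placeholder; "POS" below is an assumed value


@dataclass(frozen=True)
class AnafLink:
    """An <AnafLink> from an anaphoric event to its anchor."""
    anaphor: Event
    anchor: Event
    anaphor_type: str

    def __post_init__(self):
        # Reject anything other than the two values the scheme allows.
        if self.anaphor_type not in ANAPHOR_TYPES:
            raise ValueError(f"unknown anaphor type: {self.anaphor_type!r}")


# Slide-7 events re-encoded; treating set-subset as "indirect" is an assumption.
anchor = Event("e3", "adempimenti", "UNSPECIFIED", "UNSPECIFIED", "POS")
anaphor = Event("e4", "approvazione", "UNSPECIFIED", "UNSPECIFIED", "POS")
link = AnafLink(anaphor=anaphor, anchor=anchor, anaphor_type="indirect")
print(link.anaphor.text, "->", link.anchor.text, f"({link.anaphor_type})")
```

Keeping the link separate from the events mirrors the slide's open question of whether event anaphora should be a further TimeML link or a task layered on top of the TimeML annotation.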

  16. Lessons Learned and Future Work - Example

  17. Lessons Learned and Future Work - Example

  18. Thank you!
