Anaphor Resolution in Norwegian


Presentation Transcript


  1. Anaphor Resolution in Norwegian
     Gordana Ilic Holen
     Institut for lingvistiske fag, Det historisk-filosofiske fakultet, Universitetet i Oslo
     g.i.holen@hfstud.uio.no

  2. Some technical data
     • Hovedfag thesis (hovedfagsoppgave; incl. obligatory courses, a four-semester project)
     • Aim: to build a system for resolving pronominal anaphors in Norwegian
     • Mentor: Janne Bondi Johannessen
     • Implementation in Lisp (CLOS)
     • To be finished by Christmas 2003
     Fefor

  3. Where did it start?
     • Martin Hassel, 2000: built an AR system for the Swedish pronouns han/honom/hans and hon/henne/hennes
     • Differences
       • Planning to cover more pronouns
       • A different theoretical background

  4. The Top List
     • Han/ham/hans and hun/henne/hennes
       • Among the most used; not ambiguous
     • Seg and selv
       • Syntactic solutions
     • Den
       • Ambiguous with the determinative den, as in den (gule bilen), "the (yellow car)"

  5. The Top Wish List
     • De
       • Ambiguous with the determinative de, as in de (gule bilene), "the (yellow cars)"
       • Problems delimiting the antecedent
     • Det
       • Problems in deciding whether det is pronominal
       • det (gule huset), "the (yellow house)"
       • det (regner), "it (is raining)"

  6. Approach
     To be based on
     • Mitkov's Anaphora Resolution System, MARS (Mitkov 1996, 1998)
     and partially on
     • Resolution of Anaphora Procedure, RAP (Lappin & Leass 1994)

  7. Why MARS and RAP
     • Both made for English
     • MARS: intuitive, fully automated
     • RAP: high precision
     • Flexible

  8. MARS
     • No parsing
     • The AR module uses a list of preferences called antecedent indicators
       • Boosting
       • Impeding
     • Fully automatic, not very high precision (60-61%)

  9. MARS: The algorithm
     • The text is POS-tagged.
     • NPs are extracted by an NP extractor.
     • NPs that precede the anaphor (within a two-sentence scope) are located.
     • Gender and number constraints are applied.
     • Antecedent indicators are applied to the antecedent candidates that agree in gender and number. Each indicator assigns a score (2, 1, 0 or -1).
     • The NP with the highest score is proposed as the antecedent.
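
The last four steps of the algorithm above can be sketched roughly as follows. This is a minimal illustration, not Mitkov's implementation: the `NounPhrase` class, the `resolve` function, the gender and number labels, and the toy scores are all hypothetical, and the "two-sentence scope" is assumed here to mean the anaphor's sentence plus up to two preceding sentences. The sketch also assumes that tagging, NP extraction and indicator scoring have already been done, and that the list contains only NPs that precede the anaphor.

```python
from dataclasses import dataclass


@dataclass
class NounPhrase:
    # All fields are illustrative; a real system would fill them
    # from the POS tagger and the NP extractor.
    text: str
    sentence: int   # index of the sentence containing the NP
    gender: str     # toy labels: "m" or "f"
    number: str     # toy labels: "sg" or "pl"
    score: int = 0  # aggregate antecedent-indicator score


def resolve(anaphor_sentence, gender, number, noun_phrases, scope=2):
    """Propose an antecedent among NPs assumed to precede the anaphor."""
    # Keep NPs in the anaphor's sentence or at most `scope` sentences back.
    in_scope = [np for np in noun_phrases
                if 0 <= anaphor_sentence - np.sentence <= scope]
    # Apply the gender and number constraints.
    agreeing = [np for np in in_scope
                if np.gender == gender and np.number == number]
    if not agreeing:
        return None
    # Propose the highest-scoring NP (ties go to the first one listed;
    # this tie-breaking is an assumption of the sketch).
    return max(agreeing, key=lambda np: np.score)


candidates = [
    NounPhrase("Kari", sentence=0, gender="f", number="sg", score=3),
    NounPhrase("Per", sentence=0, gender="m", number="sg", score=1),
    NounPhrase("Ola", sentence=1, gender="m", number="sg", score=2),
]

# Resolving a masculine singular pronoun such as "han" in sentence 2.
best = resolve(2, "m", "sg", candidates)
print(best.text if best else None)
```

"Kari" has the highest score but is filtered out by the gender constraint, so the choice is made between "Per" and "Ola" only.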

  10. MARS: Antecedent indicators (boosting)
     • First noun phrases +1
     • Indicating verbs +1
     • Lexical reiteration +2 / +1
     • Section heading preference +1
     • Collocation match +2
     • Immediate reference +2
     • Sequential instructions +2
     • Term preference +2

  13. MARS: Antecedent indicators (impeding)
     • Indefiniteness -1
     • Prepositional NPs -1
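
As a rough illustration of how the boosting and impeding values listed above might combine into one score per candidate, the sketch below simply sums the values of the indicators that apply. The dictionary keys and the `aggregate_score` function are hypothetical names. The slides give "+2 / +1" for lexical reiteration without saying when each value applies, so the two values appear here as separate entries, which is an assumption of this sketch.

```python
# Indicator values as listed on the slides; the key names are illustrative.
INDICATORS = {
    # Boosting
    "first_noun_phrase": 1,
    "indicating_verb": 1,
    "lexical_reiteration_strong": 2,  # the "+2" case (condition not given)
    "lexical_reiteration_weak": 1,    # the "+1" case (condition not given)
    "section_heading_preference": 1,
    "collocation_match": 2,
    "immediate_reference": 2,
    "sequential_instructions": 2,
    "term_preference": 2,
    # Impeding
    "indefiniteness": -1,
    "prepositional_np": -1,
}


def aggregate_score(fired):
    """Sum the values of the indicators that apply to one candidate NP."""
    return sum(INDICATORS[name] for name in fired)


# A first NP with a collocation match that is also indefinite: 1 + 2 - 1
print(aggregate_score({"first_noun_phrase", "collocation_match",
                       "indefiniteness"}))  # prints 2
```

Deciding which indicators actually fire for a given NP is the hard part and is not modelled here.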

  14. RAP
     • A high-precision system (86% correctly resolved anaphors)
     • Originally based on parsed text, but a version without parsing exists (Kennedy and Boguraev, 1996)
     • The AR module: salience weighting

  15. RAP: Salience weighting
     Salience factors:
     • Sentence recency 100
     • Subject emphasis 80
     • Head noun emphasis 80
     • Existential emphasis 70
     • Accusative emphasis 50
     • Non-adverbial emphasis 50
     • Indirect object and oblique complement emphasis 40
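
A small sketch of how the salience weights above could be used: each candidate collects the weights of the factors that apply to it, and the candidate with the largest total is preferred. The factor keys, the `salience` and `most_salient` functions, and the two toy candidates are hypothetical. The sketch covers only the additive weighting shown on the slide, not the rest of the RAP procedure.

```python
# Weights as listed on the slide; the key names are illustrative.
SALIENCE_WEIGHTS = {
    "sentence_recency": 100,
    "subject_emphasis": 80,
    "head_noun_emphasis": 80,
    "existential_emphasis": 70,
    "accusative_emphasis": 50,
    "non_adverbial_emphasis": 50,
    "indirect_object_oblique_emphasis": 40,
}


def salience(factors):
    """Total salience of one candidate, given the factors that apply."""
    return sum(SALIENCE_WEIGHTS[f] for f in factors)


def most_salient(candidates):
    """Return the candidate (a dict key) with the largest salience total."""
    return max(candidates, key=lambda np: salience(candidates[np]))


# Toy candidates from the current sentence: a subject and a direct object.
toy = {
    "subject NP": ["sentence_recency", "subject_emphasis",
                   "head_noun_emphasis", "non_adverbial_emphasis"],
    "object NP": ["sentence_recency", "accusative_emphasis",
                  "head_noun_emphasis", "non_adverbial_emphasis"],
}

print(salience(toy["subject NP"]))  # 100 + 80 + 80 + 50 = 310
print(salience(toy["object NP"]))   # 100 + 50 + 80 + 50 = 280
print(most_salient(toy))            # subject NP
```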

  16. Modifications
     Both systems exist in versions with and without parsing, so the question of whether to parse is left open.
     Starting point: use the Oslo Corpus for training and tuning.
     • Experiment with the antecedent indicators and adjust them for Norwegian
     • Try to combine them with RAP's salience factors

  17. Open for suggestions
     g.i.holen@hfstud.uio.no
