Anaphora Resolution





  1. Anaphora Resolution Sobha Lalitha Devi AU-KBC Research Centre MIT Campus of Anna University Chennai-44

  2. Contents • Introduction to Anaphora and Anaphora Resolution • Types of Anaphora • Process of Anaphora Resolution • Tools • Applications • References Summer School, IIIT Hyderabad

  3. What is Cohesion COHESION is the internal continuity, or network of points of continuity, within a text. A text is not just a string of sentences. It is not simply a large grammatical unit, “something of the same kind as a sentence, but differing from it in size, a sort of super-sentence”; it is a semantic unit (Halliday & Hasan).

  4. Cohesive Relationships Cohesive relationships between words and sentences have certain definable qualities that allow us to recognize the super-sentence.
     Nature of cohesive relation | Type of cohesion
     Relatedness of form | Substitution and ellipsis; lexical collocation
     Relatedness of reference | Reference; lexical reiteration
     Semantic connection | Conjunction

  5. Relatedness of Form • Substitution: "Nice teapots! I'll take one." • Ellipsis: "Turn on. Tune in. Drop out." ['you' is elided] • Collocation: "John went to the bank. He wanted to swim in the river." ['river' disambiguates 'bank']

  6. Relatedness of Reference Exophora (extra-linguistic reference; deictic markers such as this, that): "What is this?" Anaphora: "I used to have the key. But I lost it." Cataphora: "It is your turn, John." Reiteration: "He speaks only to the Huxleys; the Huxleys speak only to the Darwins; and the Darwins speak only to God."

  7. Semantic Connection Conjunction: "You tell me that you've got ev'rything you want, and your bird can sing, but you don't get me, you don't get me! You say you've seen seven wonders, and your bird is green, but you can't see me, you can't see me! When your prized possessions, start to tear you down, then look in my direction, I'll be round, I'll be round." [Beatles -- Lee Campbell] Summer School, IIIT HYderabad

  8. Comparison Halliday & Hasan also classify comparison as a form of cohesion: "She's more fun than a barrel of monkeys!" "He's as tall as a six foot four inch tree."

  9. CONTEXT DEPENDENCE • The interpretation of most expressions depends on the context in which they are used • Developing methods for interpreting context-dependent expressions is useful in many applications • We focus here on the dependence of nominal expressions on context introduced LINGUISTICALLY, for which we will use the term ANAPHORA

  10. Introduction What is Anaphora Antecedent Anaphora Resolution 1. Sabeer Bhatia arrived at Los Angeles International Airport at 6 p.m. on September 23, 1998. His flight from Bangalore had taken 22 hrs and he was starving. [RD, NOV 2000]

  11. Etymology of Anaphora ANA- Back, Upstream, Back upstream Phora- Act of Carrying Anaphora - Act of Carrying Back

  12. What is Anaphora Anaphora, in discourse, is a device for making an abbreviated reference (containing fewer bits of disambiguating information, rather than being lexically or phonetically shorter) to some entity (or entities) in the expectation that the receiver of the discourse will be able to disabbreviate the reference and, thereby, determine the identity of the entity. (Hirst 1981)

  13. Cataphora • When the anaphor precedes the antecedent: Because she was going to the departmental store, Mary was asked to pick up the vegetables.

  14. Relevance from the Linguistics point of view • Binding Theory is one of the major results of the principles and parameters approach developed in Chomsky (1981) and is one of the mainstays of generative linguistics. • The Binding Theory deals with the relations between nominal expressions and possible antecedents. • It attempts to provide a structural account of the complementarity of distribution between pronouns, reflexives and R-expressions.

  15. Dichotomy Between Linguistics and NLP • The Binding Theory (and its various formulations) deals only with intra-sentential anaphora, • a very small subset of the anaphoric phenomena that practical NLP systems are interested in resolving. • A much larger set of anaphoric phenomena involves the resolution of pronouns inter-sententially. • This problem is dealt with by Discourse Representation Theory and more specifically by Centering Theory (Grosz et al., 1995).

  16. Types of Anaphors The Prime Minister is yet to arrive and he is expected at the central hall at any time. [The Times of India, Feb 2001] This book is about Anaphora Resolution. The book is designed to help beginners in the field and its author hopes that it will be useful. VP Anaphor: John screamed, as did Mary.

  17. Pronominal anaphora Vajpayee hits back forcefully when he told the opposition today “sometimes we fall prey to the media and sometimes you do.” [Indian Express 2001] Possessive: Priyanka eats only chicken sandwiches before going to take any exam; nothing else goes down her gullet that day. [Indian Express, 13 March 2001]

  18. Reflexive Pronoun: Finally, Danian heaved himself up and lay on a waiting stretcher. Demonstrative Pronoun: John had lots of packing to do before he shifted his house. This was something he never liked… Relative Pronoun: Stumper Sameer Dige, who made his test debut, failed to show fast reflexes when it mattered.

  19. Non-Anaphoric Usage of Pronouns: Pleonastic It Cognitive verbs: a. It is believed that… b. It appears that… Modal adjectives: c. It is dangerous… d. It is important… Temporal: e. It is five o’clock. f. It is winter. Weather verbs: g. It is raining. h. It is snowing. Distance: i. How far is it to Chennai?
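The pleonastic categories above can be approximated with a few surface patterns. The following is only a minimal sketch: the word lists and the function name `is_pleonastic` are illustrative assumptions, not part of the lecture, and a practical detector would need much larger lexicons and syntactic context.

```python
import re

# Hypothetical, deliberately tiny word lists (assumptions for illustration only).
COGNITIVE = r"(?:believed|thought|known|assumed|expected)"
MODAL_ADJ = r"(?:dangerous|important|necessary|possible|likely)"
WEATHER = r"(?:raining|snowing)"
TIME = r"(?:winter|summer|\w+ o['’]clock)"

PLEONASTIC_PATTERNS = [
    re.compile(rf"\bit is {COGNITIVE} that\b", re.I),   # It is believed that...
    re.compile(r"\bit (?:appears|seems) that\b", re.I),  # It appears that...
    re.compile(rf"\bit is {MODAL_ADJ}\b", re.I),         # It is dangerous...
    re.compile(rf"\bit is {TIME}\b", re.I),              # It is five o'clock
    re.compile(rf"\bit is {WEATHER}\b", re.I),           # It is raining
    re.compile(r"\bhow far is it\b", re.I),              # How far is it to ...?
]

def is_pleonastic(sentence):
    """Return True if the sentence matches one of the pleonastic-'it' patterns."""
    return any(p.search(sentence) for p in PLEONASTIC_PATTERNS)

print(is_pleonastic("It is raining."))                          # pleonastic
print(is_pleonastic("I used to have the key. But I lost it."))  # anaphoric
```

Pronouns flagged this way would simply be skipped before antecedent search begins.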

  20. Non-anaphoric uses of pronouns He that plants thorns must never expect to gather roses. He who dares wins. Deictic He seems remarkably bright for a child of his age.

  21. Noun Phrase Anaphora Definite descriptions and Proper names Roy Keane has warned Manchester United he may snub their pay deal. United’s skipper is even hinting that unless the future Old Trafford package meets his demands, he could quit the club in June 2000. Irishman Keane, 27, still has 17 months to run on his current 23,000 pound a week contract and wants to commit himself to United for life. Alex Ferguson’s No 1 player confirmed: “If it’s not the contract I want, I won’t sign.”

  22. Coreference Computational linguists from many different countries attended the tutorial. The participants found it hard to cope with the speed of the presentation; nevertheless they managed to take extensive notes.

  23. Coreference Sophia Loren says she will always be grateful to Bono. The actress revealed that the U2 singer helped her calm down when she became scared by a thunderstorm while travelling by a plane. she => Sophia Loren; the actress => Sophia Loren; the U2 singer => Bono; her => Sophia Loren; she => Sophia Loren

  24. Coreference chain Sophia Loren says she will always be grateful to Bono. The actress revealed that the U2 singer helped her calm down when she became scared by a thunderstorm while travelling by a plane. Coreference chains • {Sophia Loren, she, the actress, her, she} • {Bono, the U2 singer}
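Once each anaphor has been linked to an antecedent, the chains are the connected groups of those links. The sketch below shows one way to compute them with a small union-find structure; the function name `build_chains` and the index-based link format are assumptions made for illustration, and the links themselves are entered by hand rather than resolved automatically.

```python
def build_chains(mentions, links):
    """Group mentions into coreference chains.

    mentions: mention strings in text order.
    links: (anaphor_index, antecedent_index) pairs into `mentions`.
    """
    parent = list(range(len(mentions)))

    def find(i):
        # Follow parent pointers to the chain representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for anaphor, antecedent in links:
        parent[find(anaphor)] = find(antecedent)

    chains = {}
    for i, mention in enumerate(mentions):  # text order is preserved inside each chain
        chains.setdefault(find(i), []).append(mention)
    return list(chains.values())

mentions = ["Sophia Loren", "she", "Bono", "the actress", "the U2 singer", "her", "she"]
links = [(1, 0), (3, 0), (4, 2), (5, 0), (6, 0)]  # hand-annotated for this example
for chain in build_chains(mentions, links):
    print(chain)
```

Indices are used instead of strings because the same surface form ("she") occurs more than once in the text.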

  25. Chains of object mentions in text Toni Johnson pulls a tape measure across the front of what was once a stately Victorian home. A deep trench now runs along its north wall, exposed when the house lurched two feet off its foundation during last week's earthquake. Once inside, she spends nearly four hours measuring and diagramming each room in the 80-year-old house, gathering enough information to estimate what it would cost to rebuild it. While she works inside, a tenant returns with several friends to collect furniture and clothing. One of the friends sweeps broken dishes and shattered glass from a countertop and starts to pack what can be salvaged from the kitchen. (WSJ section of Penn Treebank corpus)

  26. What is Anaphora Resolution • The process of finding the antecedent of an anaphor is anaphora resolution • Anaphor: the reference that points back to a previous item • Antecedent: the entity to which the anaphor refers

  27. RESEARCH ON ANAPHORA RESOLUTION: A QUICK SUMMARY
     1970-1995: Primarily theoretical. Emphasis: commonsense knowledge, salience. Exception: Hobbs 1977, Shalom Lappin.
     1995-2005: First annotated corpora to be used to develop, evaluate and compare systems (MUC, Ge and Charniak, ACE). First robust systems. Heuristic-based: Mitkov. ML: Vieira & Poesio 1998, 2000; Soon et al. 2001; Ng and Cardie 2002. Emphasis: surface features. Exceptions: Poesio & Vieira, Harabagiu, Markert.
     2005-present: More sophisticated ML techniques (global models, kernels). Richer features, especially semantic information. First tools.

  28. Applications of Anaphora Resolution
     Tasks that require determining the coherence of (segments of) text: segmentation; post-hoc coherence check in summarization (Steinberger et al., 2007).
     Tasks that require identifying the most important information in a text: sentence selection in summarization (Steinberger et al., 2005, 2007); indexing.
     Information extraction: recognize which expressions refer to objects in the domain; relation extraction from biomedical text (Sanchez-Graillet and Poesio, 2006, 2007).
     Multimodal interfaces: recognize which objects in the visual scene are being referred to.

  29. Different Approaches In Anaphora Resolution • Rule Based • Statistical Based • Machine Learning Based

  30. Rule Based • Hobbs system

  31. Hard Constraints on Coreference • Number agreement • Person and case • Gender Agreement • Syntactic Agreement • Selectional Restrictions

  32. Number agreement John and Mary loaned Sue a cup of coffee. Little did they know the magnitude of her addiction.

  33. Person and Case Agreement

  34. Gender Agreement *John has a coffee machine. She loves it.

  35. Syntactic Agreement • Reflexives (himself, herself…) have strong constraints on what syntactic positions they can appear in: John bought himself a cup of coffee. *John bought him a cup of coffee. [him = John]

  36. Selectional Constraints Jim bought a coffee from the store. He drank it quickly.
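Hard constraints act as a filter: any candidate antecedent that disagrees with the pronoun is rejected before preferences are considered. Below is a minimal sketch of the number and gender checks only, assuming the features are already known for each candidate. The `Mention` class, the `PRONOUNS` table and the feature codes are illustrative assumptions; syntactic and selectional constraints are not modelled.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Mention:
    text: str
    number: str  # "sg" or "pl"
    gender: str  # "m", "f", "n", or "any" when unknown or mixed

# Hypothetical feature table for a handful of pronouns.
PRONOUNS = {
    "he": ("sg", "m"), "him": ("sg", "m"),
    "she": ("sg", "f"), "her": ("sg", "f"),
    "it": ("sg", "n"),
    "they": ("pl", "any"),
}

def agrees(pronoun, candidate):
    """True if the candidate matches the pronoun in number and gender."""
    number, gender = PRONOUNS[pronoun.lower()]
    if number != candidate.number:
        return False
    return "any" in (gender, candidate.gender) or gender == candidate.gender

def filter_candidates(pronoun, candidates):
    """Keep only the candidates that pass the hard agreement constraints."""
    return [c for c in candidates if agrees(pronoun, c)]

machine_example = [Mention("John", "sg", "m"), Mention("a coffee machine", "sg", "n")]
loan_example = [
    Mention("John and Mary", "pl", "any"),
    Mention("Sue", "sg", "f"),
    Mention("a cup of coffee", "sg", "n"),
]
print(filter_candidates("She", machine_example))  # no candidate survives
print(filter_candidates("they", loan_example))    # only "John and Mary" survives
```

An empty result for "She" in the coffee-machine example corresponds to the starred (unacceptable) sentence on the gender agreement slide.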

  37. Also : Preferences • Recency • Grammatical Role • Repeated Mention • Parallelism • Verb Semantics • Based on Salience

  38. Recency John had a pop-tart. Bill had a jelly donut. Mary wanted it. • Recent entities are more salient

  39. Grammatical Role “Sue bought a cup of coffee and a donut from Jane. She met John as she left.” • Entities in subject position are more salient

  40. Repeated Mention John went to the store to buy coffee. He loves coffee. He drinks 5 cups a day. At the store, Bill sold him a cup. He was delighted. • Entities mentioned more frequently are more salient

  41. Parallelism John bought coffee from Jim in the morning. Sue bought coffee from him in the evening. • Even with preferences to the contrary (grammatical role) the syntactic parallelism strongly prefers [him = Jim]

  42. Verb Semantics John telephoned Bill. He was jonesing for coffee. John criticized Bill. He was jonesing for coffee. • Perhaps salience of different elements in the sentence changes with respect to the verb used.

  43. Algorithms: How to integrate these preferences? • Constraints are easy to use: reject all hypotheses that violate the hard constraints (if you can accurately detect the constraints!) • Preferences are more difficult: how can one integrate these different preferences?

  44. Hobbs Tree Search Algorithm • Given parse trees, search them in a specific order to find the most likely referent

  45. Hobbs in Detail (1) Begin at the NP node immediately dominating the pronoun. (2) Go up the tree to the first NP or S. Call this X, and the path p. (3) Traverse all branches below X to the left of p, left-to-right, breadth-first. Propose as antecedent any NP that has an NP or S between it and X. (4) If X is the highest S in the sentence, traverse the parse trees of the previous sentences in order of recency. Traverse each left-to-right, breadth-first. When an NP is encountered, propose it as antecedent. If X is not the highest node, go to step 5.

  46. Hobbs cont. (5) From node X, go up the tree to the first NP or S. Call it X, and the path p. (6) If X is an NP and the path to X did not pass through the nominal that X immediately dominates, propose X as antecedent. (7) Traverse all branches below X to the left of the path, in a left-to-right, breadth-first manner. Propose any NP encountered as the antecedent. (8) If X is an S node, traverse all branches of X to the right of the path, but do not go below any NP or S encountered. Propose any NP as the antecedent. (9) Go to step 4.
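The cross-sentence part of the search (traversing previous sentences in order of recency, left-to-right and breadth-first) is the simplest piece to illustrate. The sketch below covers only that piece, not the full tree walk within the pronoun's own sentence. The nested-tuple tree encoding `(label, child, ...)` and the function names are assumptions for illustration, and the parse trees are written by hand.

```python
from collections import deque

def np_nodes_bfs(tree):
    """Yield NP subtrees of a parse tree in left-to-right, breadth-first order."""
    queue = deque([tree])
    while queue:
        node = queue.popleft()
        if isinstance(node, str):  # a leaf word, nothing to expand
            continue
        label, *children = node
        if label == "NP":
            yield node
        queue.extend(children)

def words(node):
    """Collect the leaf words under a node, left to right."""
    if isinstance(node, str):
        return [node]
    return [w for child in node[1:] for w in words(child)]

def propose_from_previous(previous_trees):
    """Propose NPs from earlier sentences, most recent sentence first."""
    for tree in reversed(previous_trees):
        for np in np_nodes_bfs(tree):
            yield " ".join(words(np))

# Hand-built parses of "John had a pop-tart." and "Bill had a jelly donut."
s1 = ("S", ("NP", "John"),
      ("VP", ("V", "had"), ("NP", ("Det", "a"), ("N", "pop-tart"))))
s2 = ("S", ("NP", "Bill"),
      ("VP", ("V", "had"), ("NP", ("Det", "a"), ("Adj", "jelly"), ("N", "donut"))))

proposals = list(propose_from_previous([s1, s2]))
print(proposals)
```

Each proposal would still have to pass the agreement checks: for the pronoun in "Mary wanted it", the subject NPs "Bill" and "John" are proposed first but would be rejected on gender grounds.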

  47. Lappin and Leass (1994) Anaphora Resolution Algorithm • The Lappin and Leass (1994) anaphora resolution algorithm uses salience weights to determine the antecedent of a pronominal. • It requires as input a fully parsed sentence structure and uses the hierarchy to identify the subject, object, etc. • It uses syntactic criteria to rule out noun phrases that cannot possibly corefer with the pronoun. • The antecedent is then chosen according to a ranking based on salience weights.

  48. Syntactic Filter on Pronoun-NP Coreference A pronoun P is non-coreferential with a (non-reflexive or non-reciprocal) noun phrase N if any of the following conditions hold: (1) P and N have incompatible agreement features. (2) P is in the argument domain of N. (3) P is in the adjunct domain of N. (4) P is an argument of a head H, N is not a pronoun, and N is contained in H. (5) P is in the NP domain of N. (6) P is a determiner of a noun Q, and N is contained in Q.

  49. Examples Condition 1: The woman said that he is funny. Condition 2: She likes her. John seems to want to see him. Condition 3: She sat near her. Condition 4: He believes that the man is amusing. This is the man he said John wrote about. Condition 5: John’s portrait of him is interesting.

  50. Salience Factors and Weights Salience factor types with initial weights:
     Factor type | Initial weight
     Sentence recency | 100
     Subject emphasis | 80
     Existential emphasis | 70
     Accusative emphasis | 50
     Indirect object and oblique complement emphasis | 40
     Head noun emphasis | 80
     Non-adverbial emphasis | 50
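The weights in the table can be combined by summing the factors that apply to each candidate and choosing the highest total. The sketch below does only that. The factor keys, the function names and the factor assignments for the example sentence ("Jim bought a coffee from the store.") are assumptions made by hand for illustration; the syntactic filter and the reduction of salience for older sentences in the full algorithm are left out.

```python
# Initial weights from the table above; the key names are illustrative.
WEIGHTS = {
    "sentence_recency": 100,
    "subject": 80,
    "existential": 70,
    "accusative": 50,
    "indirect_object_or_oblique": 40,
    "head_noun": 80,
    "non_adverbial": 50,
}

def salience(factors):
    """Sum the initial weights of the salience factors that apply to a candidate."""
    return sum(WEIGHTS[f] for f in factors)

def pick_antecedent(candidates):
    """candidates maps a candidate NP to its set of factors; return the top-ranked NP."""
    return max(candidates, key=lambda np: salience(candidates[np]))

# Hand-assigned factors (an assumption, not the output of a parser).
candidates = {
    "Jim": {"sentence_recency", "subject", "head_noun", "non_adverbial"},
    "a coffee": {"sentence_recency", "accusative", "head_noun", "non_adverbial"},
    "the store": {"sentence_recency", "indirect_object_or_oblique",
                  "head_noun", "non_adverbial"},
}

for np, factors in candidates.items():
    print(np, salience(factors))
print(pick_antecedent(candidates))
```

Under these assignments the subject outranks the object and the oblique, so a following "He" would be resolved to "Jim" once agreement has removed the non-human candidates.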