Measuring the Influence of Errors Induced by the Presence of Dialogs in Reference Clustering of Narrative Text. Alaukik Aggarwal, Department of Computer Science and Engineering, MAIT; Pablo Gervás, Instituto de Tecnología del Conocimiento, Universidad Complutense de Madrid

Presentation Transcript

Measuring the Influence of Errors Induced by the Presence of Dialogs in Reference Clustering of Narrative Text

Alaukik Aggarwal, Department of Computer Science and Engineering, MAIT

Pablo Gervás, Instituto de Tecnología del Conocimiento, Universidad Complutense de Madrid

Raquel Hervás, Instituto de Tecnología del Conocimiento, Universidad Complutense de Madrid


Outline of the Problem

  • Coreference Resolution = Anaphoric + Non-anaphoric

  • Different genres of text studied:

    • Text without dialogues (like news articles)

    • Text consisting only of dialogues (conversations)


An Example

  • Sachin Tendulkar has been honoured with the Padma Vibhushan Award. India’s world number one batsman secured 17,000 runs on home soil. Tendulkar has put India in a strong position against Australia in the One-Day Series. The Indian responded to his critics who believed that his career was sliding with his 40th century.

    This is generally the kind of text found in news articles.


Problems in Dialogue - Why?

  • Pronominal Reference within quoted fragments

  • Change in referential value of demonstratives

    • “You take these bags and I’ll take those”

  • Non-NP antecedents or no antecedents at all


Coreference in Narrative

  • Contain many characters and objects

  • Rich in dialogues and coreferences

  • Cover different styles of writing from different authors and time periods


Another Example

  • The two elder sons did not delay but set off at once, and the third and youngest son began pleading. "No, my son, you mustn't leave me, an old man, all alone," said the king. "Please let me go, Father! I do so want to travel over the world and find my mother." The king reasoned with him, but, seeing that he could not stop him from going, said: "Oh, all right then, I suppose it can't be helped. Go and God be with you!"

    An excerpt from Three Kingdoms (by Alexander Afanasiev)



Resolving Coreference in NPs

  • Knowledge-rich and Knowledge-poor

  • Different approaches considered by us:

    • Decision trees

    • C4.5 Machine Learning algorithm

    • Clustering

    • Hybrid


Corpus of Narrative Texts

  • Thirty folk tales in English

  • Different styles, authors and time periods

  • Rich in dialogs between characters

  • Process:

    • Identify references

    • Enrich references with semantic information

    • Coreference resolution using a clustering approach


Step 1: Identifying References

  • GATE (General Architecture for Text Engineering)

    • ANNIE Sentence Splitter

    • ANNIE English Tokeniser

    • ANNIE POS Tagger

    • CREOLE plugin

  • Output in XML format


Step 2: Feature Extraction

  • Position

  • Part of Speech (POS)

  • Article

  • Number

  • Semantic Class

    • WordNet (synsets)

  • Gender

    • A resource of Gender data
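
As a rough illustration of the feature list above, the sketch below stores one noun phrase as a record with one field per feature. The class name `Mention`, the field types, the Penn Treebank-style tags, and the toy heuristics for article and number are all assumptions made for illustration. The slides name WordNet and a gender resource but do not say how they were queried, so semantic class and gender are left as values to be filled in elsewhere.

```python
from dataclasses import dataclass
from typing import Optional

# Toy word lists (assumption, not from the slides).
DEFINITE = {"the"}
INDEFINITE = {"a", "an"}


@dataclass
class Mention:
    """One noun phrase with the features listed on the slide (hypothetical layout)."""
    text: str
    position: int                          # index of the NP in reading order
    pos: str                               # part-of-speech tag of the head word
    article: str                           # "definite", "indefinite" or "none"
    number: str                            # "singular" or "plural"
    semantic_class: Optional[str] = None   # e.g. from WordNet; not computed here
    gender: Optional[str] = None           # e.g. from a gender resource; not computed here


def article_of(np_text: str) -> str:
    """Classify the NP by its first word (toy heuristic)."""
    words = np_text.split()
    first = words[0].lower() if words else ""
    if first in DEFINITE:
        return "definite"
    if first in INDEFINITE:
        return "indefinite"
    return "none"


def number_of(head_tag: str) -> str:
    """Treat the plural noun tags NNS and NNPS as plural, everything else as singular."""
    return "plural" if head_tag in {"NNS", "NNPS"} else "singular"


def make_mention(text: str, position: int, head_tag: str) -> Mention:
    """Build a record from the NP text, its position, and the head word's tag."""
    return Mention(text, position, head_tag, article_of(text), number_of(head_tag))
```

For example, `make_mention("the king", 3, "NN")` yields a definite, singular mention whose semantic class and gender are still unset.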


Annotated Data


Step 3: Algorithm and Working

  • Based on the clustering algorithm of Cardie and Wagstaff (1999)

  • $\mathrm{dist}(NP_i, NP_j) = \sum_{f \in F} w_f \cdot \mathrm{incompatibility}_f(NP_i, NP_j)$

  • Features (f): Position, Pronoun, Article, Word-substring, Number, Semantic Class, Gender
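
The weighted distance above, and a greedy clustering pass that uses it, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the three incompatibility functions, the weights, the radius, and the dictionary layout of a mention are all assumptions, and only a subset of the listed features is covered. The backward pass, which merges two clusters only when no cross-cluster pair is infinitely distant, is a simplified reading of the general scheme.

```python
from typing import Callable, Dict, List, Tuple

Mention = Dict[str, object]                      # hypothetical record, one per NP
Feature = Tuple[float, Callable[[Mention, Mention], float]]
INF = float("inf")


def words_mismatch(a: Mention, b: Mention) -> float:
    """Share of the longer NP's words that the two NPs do not have in common."""
    wa, wb = set(str(a["text"]).lower().split()), set(str(b["text"]).lower().split())
    return 1.0 - len(wa & wb) / max(len(wa), len(wb))


def number_mismatch(a: Mention, b: Mention) -> float:
    return 0.0 if a["number"] == b["number"] else 1.0


def gender_mismatch(a: Mention, b: Mention) -> float:
    known = a["gender"] is not None and b["gender"] is not None
    return 1.0 if known and a["gender"] != b["gender"] else 0.0


# Illustrative weights only; they are not taken from the slides.
FEATURES: List[Feature] = [(10.0, words_mismatch),
                           (INF, number_mismatch),
                           (INF, gender_mismatch)]


def dist(a: Mention, b: Mention, features: List[Feature] = FEATURES) -> float:
    """Weighted sum of per-feature incompatibilities."""
    total = 0.0
    for weight, incompat in features:
        value = incompat(a, b)
        if value:                                # skip zeros so inf * 0 never gives NaN
            total += weight * value
    return total


def cluster(mentions: List[Mention], radius: float) -> List[int]:
    """Return one cluster id per mention, merging from the last NP backwards."""
    cluster_of = list(range(len(mentions)))      # every NP starts in its own cluster
    for j in range(len(mentions) - 1, -1, -1):
        for i in range(j - 1, -1, -1):
            ci, cj = cluster_of[i], cluster_of[j]
            if ci == cj or dist(mentions[i], mentions[j]) >= radius:
                continue
            members_i = [k for k, c in enumerate(cluster_of) if c == ci]
            members_j = [k for k, c in enumerate(cluster_of) if c == cj]
            # Merge only if no pair across the two clusters is incompatible.
            if all(dist(mentions[p], mentions[q]) < INF
                   for p in members_i for q in members_j):
                for k in members_j:
                    cluster_of[k] = ci
    return cluster_of
```

With an infinite weight, a single mismatch in number or gender rules out a merge regardless of the other features, while finite weights such as the one on word overlap only push two noun phrases further apart.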


Evaluation and Results


Evaluation

  • Clustering algorithm run over the tales twice

    • With dialogs

    • Without dialogs

  • Hand correction of the obtained coreferences for comparison

    • Precision and recall
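
One way to compare automatically obtained clusters with hand-corrected ones is to score pairwise coreference links, as sketched below. The slides do not state which scoring scheme was used, so the link-based definition here is an assumption, as are the function names and the representation of a clustering as one cluster id per noun phrase.

```python
from itertools import combinations
from typing import List, Set, Tuple


def links(cluster_of: List[int]) -> Set[Tuple[int, int]]:
    """All pairs of NP indices that share a cluster id."""
    return {(i, j) for i, j in combinations(range(len(cluster_of)), 2)
            if cluster_of[i] == cluster_of[j]}


def precision_recall(predicted: List[int], gold: List[int]) -> Tuple[float, float]:
    """Pairwise precision and recall of predicted links against gold links."""
    p, g = links(predicted), links(gold)
    correct = len(p & g)
    precision = correct / len(p) if p else 0.0   # share of predicted links that are right
    recall = correct / len(g) if g else 0.0      # share of gold links that were found
    return precision, recall
```

Running this once on the output for the full tales and once on the output for the tales with dialogs removed would give the two pairs of figures that the comparison calls for.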


Results

  • Precision and Recall Results with and without dialogues:


Conclusions

  • Nested dialogues reduce precision by 9% and recall by 7%

  • But information is lost if dialogues are removed

    • Dialogs need to be treated separately

  • In addition, we constructed a corpus of tales annotated with coreference information for nominal phrases


Future Work

  • Dialogs could be extracted from the tale and treated as a separate text

    • Information about the characters involved is required

  • Possible improvements in different problems

    • Word Sense Disambiguation

    • Named Entity Recognition
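
The first idea above, pulling dialogs out of a tale so they can be handled separately, might begin with something like the following sketch. It assumes that dialog is marked with straight double quotes and that quotes are never nested, which will not hold for every tale. The function name `split_dialog` is hypothetical. The sketch also does not identify who is speaking, which the slide notes is required.

```python
import re
from typing import List, Tuple

# Assumption: dialog is delimited by straight double quotes, without nesting.
QUOTED = re.compile(r'"[^"]*"')


def split_dialog(text: str) -> Tuple[str, List[str]]:
    """Return the narration with quoted spans removed, plus the quoted spans."""
    dialog = [m.group(0).strip('"') for m in QUOTED.finditer(text)]
    # Replace each quoted span with a space, then collapse runs of whitespace.
    narration = re.sub(r"\s+", " ", QUOTED.sub(" ", text)).strip()
    return narration, dialog
```

Applied to `'"No, my son," said the king. "Please let me go, Father!"'`, it separates the narration `said the king.` from the two quoted fragments.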


Thank You.

