Columbia CCLS: Committed Belief and Dialog Acts

Presentation Transcript


  1. Columbia CCLS: Committed Belief and Dialog Acts Mona Diab Rebecca Passonneau Owen Rambow {mdiab,becky,rambow}@ccls.columbia.edu

  2. Columbia CCLS Activities: Committed Belief • Identify in text what the writer (speaker) actually believes is true • “The Marines attacked rebels in the South” – committed belief: the writer believes this is true • “An Iraqi government spokesman said that the Marines attacked rebels in the South” – non-committed belief: the writer could believe this, but does not indicate that s/he believes it to be true • “I have demanded for days that the Marines attack rebels in the South” – not applicable: this is not something that the writer could believe to be true (in this case, because it is a desired state) • Team leader: Mona Diab, with input from Lori Levin (CMU) and Owen Rambow • 1 summer student
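The three examples above define a three-way label scheme. A minimal sketch of it as data, using the shorthand labels CB (committed belief), NCB (non-committed belief), and NA (not applicable) — the data layout and label abbreviations here are illustrative assumptions, not the project's actual annotation format:

```python
# Three-way committed-belief label scheme from the slide, as data.
# CB / NCB / NA are shorthand for the slide's own label names; this
# representation is a sketch, not the project's annotation format.

EXAMPLES = [
    ("The Marines attacked rebels in the South", "CB"),    # committed belief
    ("An Iraqi government spokesman said that the Marines "
     "attacked rebels in the South", "NCB"),               # non-committed belief
    ("I have demanded for days that the Marines attack "
     "rebels in the South", "NA"),                         # not applicable
]

def label_distribution(examples):
    """Count how many sentences carry each belief label."""
    counts = {}
    for _sentence, label in examples:
        counts[label] = counts.get(label, 0) + 1
    return counts

print(label_distribution(EXAMPLES))  # {'CB': 1, 'NCB': 1, 'NA': 1}
```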

  3. Columbia CCLS Activities: Dialog Acts • Identify dialog acts in spoken and written dialog • Extend existing work by linking dialog acts across greater distance • Extend existing work by annotating complex dialogs including written interactions (email), in which many dialog acts happen in parallel • Address the issue that a single turn (= a coherent utterance in spoken dialog, or a single email) may perform several dialog acts; identify Dialog Function Units (DFUs) • Team leaders: Becky Passonneau and Owen Rambow • 1 Masters student (partially funded by TTO3)
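One way to picture the DFU idea — a single turn carrying several dialog acts, each able to link back to an earlier act — is as a small data structure. The class and field names below are purely hypothetical; the slide does not specify the project's actual schema:

```python
# Hypothetical sketch: one turn holding several Dialog Function
# Units (DFUs), each with an optional link to an earlier act.
# All names here are illustrative assumptions, not the real schema.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class DFU:
    dfu_id: str
    dialog_act: str                 # e.g. "inform", "request"
    text_span: str                  # the words realizing this act
    link_to: Optional[str] = None   # id of a DFU this one responds to

@dataclass
class Turn:
    speaker: str
    dfus: list = field(default_factory=list)  # one turn, several acts

# One email turn performing two dialog acts in parallel:
turn = Turn(speaker="A", dfus=[
    DFU("a1", "inform", "The report is attached."),
    DFU("a2", "request", "Please review it by Friday."),
])
print(len(turn.dfus))  # 2
```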

  4. Committed Belief: Accomplishments • Manual Annotation • Completed double annotation for the basic document collection in English (August 31st, 2008) • Updated manual based on observations by annotators (August 31st, 2008) • Automatic Annotation • Created a preliminary supervised system for the prediction of committed belief (SPCB) • For SPCB, we experimented with the original annotations from April (the new annotations are much cleaner; we will be able to give inter-annotator agreement later — mid-September) • SPCB is an IOB sequence model using YAMCHA SVMs • Split the data 80/10/10 into train, test, and dev • Features: character ngrams, lemma as obtained using lingo, context size, POS tag
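YAMCHA-style SVM taggers consume a column format: one token per line, one feature per column, the gold IOB tag last, and a blank line between sentences. A hedged sketch of what the SPCB training data could look like — the lemma and POS values below are invented for illustration, and the exact columns the project used are not specified on the slide:

```python
# Sketch of an IOB, column-per-feature training file in the style
# YAMCHA consumes: token per line, gold tag in the last column,
# blank line between sentences. Feature values here are made up.

tokens = [
    # (token, lemma, POS, IOB committed-belief tag)
    ("The",      "the",    "DT",  "O"),
    ("Marines",  "marine", "NNS", "O"),
    ("attacked", "attack", "VBD", "B-CB"),  # belief-bearing predicate
    ("rebels",   "rebel",  "NNS", "O"),
]

def to_yamcha(rows):
    """Render rows as tab-separated YAMCHA-style training lines."""
    return "\n".join("\t".join(cols) for cols in rows) + "\n\n"

print(to_yamcha(tokens))
```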

  5. Committed Belief: Preliminary Results for Automatic Annotation
  • Preliminary results on the dev data (all numbers are Precision / Recall / F-measure):
  • #1 Baseline (default YAMCHA settings): 55.80% / 27.37% / 36.73%
  • #2 Best contextual features (window of +1/-1 words, -2/-1 tags before the current word): 57.66% / 34.69% / 43.32%
  • #3 Adding lemma to #2: 60.98% / 33.88% / 43.55%
  • #4 Adding POS tag to #3: 52.94% / 46.34% / 49.42%
  • #5 Adding POS tag to #2: 45.53% / 47.46% / 47.46%
  • #6 Adding ngram features (“only end of word”) to #2: 59.62% / 42.82% / 49.84%
  • #7 Combining #6 and #4: 54.43% / 46.61% / 50.22%
  • #8 Adding up to 4 character ngrams from the beginning and end of words to #2: 57.77% / 46.34% / 51.43%
  • #9 Combining #8 and #4: 55.94% / 48.51% / 51.96%
  • #10 Like #9 but with a different context (-1/0 words, -2/-1 tags before the current word): 57.19% / 48.51% / 52.49%
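The F-measures in the list above are the harmonic mean of precision and recall, F = 2PR / (P + R). A quick check against the baseline row:

```python
# F-measure is the harmonic mean of precision and recall.
# Checking the baseline row (P=55.80, R=27.37) reproduces F=36.73.

def f_measure(p, r):
    """Harmonic mean of precision and recall (both in percent)."""
    return 2 * p * r / (p + r)

print(round(f_measure(55.80, 27.37), 2))  # 36.73, matching the baseline
```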

  6. Committed Belief: Future Work • Manual Annotation (till money runs out) • Calculate inter-annotator agreement • Add more annotations on different genres • Annotate Arabic based on the improved manual • Automatic Annotation (subject to new funding) • Create a supervised system for Arabic committed-belief annotation • Run cross-validation experiments on the supervised system using the new (clean) data • Experiment with more features, such as shallow syntactic features, syntactic dependency features, TAG features, semantic role features, word senses • Experiment with semi-supervised approaches • Bootstrap from multilingual data using parallel corpora

  7. Dialog Acts: Accomplishments • Manual Annotation • Hired and trained new annotators • Started annotation of dialog corpora, email corpora to follow • Note: core data includes little dialog, so we are annotating non-core data • Automatic Annotation • Student read background material over summer • Successful launch meeting with student

  8. Dialog Acts: Future Work • Manual Annotation: Ongoing • Automatic Annotation • Phase I: Basic dialog act tagging by Sep 15 • Phase II: Forward and backward links by Sep 30 • Phase III: Dynamic turn segmentation into Dialog Function Units by Oct 31 • Phase IV: Improvements to all three functionalities by Dec 31 • Note 1: schedule negotiated previously • Note 2: student partially funded by TTO3, partially by other sources
