
Story Segmentation of Broadcast News

Story Segmentation of Broadcast News. Mehrbod Sharifi mehrbod@cs.columbia.edu Thanks to Andrew Rosenberg ~mehrbod/presentations/SSegDec06.pdf. GALE (Global Autonomous Language Exploitation ).


Presentation Transcript


  1. Story Segmentation of Broadcast News Mehrbod Sharifi mehrbod@cs.columbia.edu Thanks to Andrew Rosenberg ~mehrbod/presentations/SSegDec06.pdf

  2. GALE (Global Autonomous Language Exploitation)
     “… to absorb, analyze and interpret huge volumes of speech and text in multiple languages, eliminating the need for linguists and analysts and automatically providing relevant, distilled actionable information …”
     • Transcription Engines (ASR)
     • Translation Engines (MT)
     • Distillation Engines (QA+IR)
     http://projects.ldc.upenn.edu/gale/
     http://www.darpa.mil/ipto/Programs/gale/

  3. Task: Story Segmentation
     • Input:
       • .sph: audio files from the TDT-4 corpus distributed by LDC
       • .rttmx: output from other collaborators on the GALE project (all automated, one word per row)
         • Speaker boundaries (Chuck at ICSI)
         • ASR: words, start and end times, confidence, phone durations (Andreas at SRI/ICSI)
         • Sentence boundary probabilities (Dustin at UW)
       • Gold standard: annotated story boundaries
     • Output:
       • .rttmx files with story boundaries (generated by a method that performs well on unseen data)
     /n/squid/proj/gale1/AA/eng-tdt4/tdt4-eng-rttmx-12192005/README

  4. Task: Story Segmentation
     • Event: a specific thing that happens at a specific time and place, along with all necessary preconditions and unavoidable consequences. Example: for the event “U.S. Marine jet sliced a funicular cable in Italy in February 1998”, the cable car's crash to earth and the subsequent injuries were all unavoidable consequences and thus part of the same event.
     • Topic: an event or activity, along with all directly related events and activities.
     • Story: news stories may be of any length, even fewer than two independent clauses, as long as they constitute a complete, cohesive news report on a particular topic. Note that a single news story may discuss more than one related topic.
     http://www.ldc.upenn.edu/Projects/TDT4/Annotation/annot_task_def_V1.4.pdf

  5. Task: Story Segmentation
     Example: 3898 words / 263 sentences / 26 stories (?: reject or low confidence word)
     • ?
     • ? ? [headlines] ? ?
     • good evening everyone ... [report on war] ... gillian findlay a. b. c. news ?
     • turning to politics ... [election - Gore] ... a. b. c. news ? ?
     • this is ron claiborne ... [election - Bush] ... a. b. c. news ? ?
     • ? as for the two other candidates ... said the same
     • still ahead ... [teaser] ... camera man
     • this is world news ... [commercials] ... was a woman
     • turning to news overseas ... [election] ... no matter what
     • its just days after a deadly ferry sinking in greece ... safety tests
     ~mehrbod/rttmx/eng/20001001_1830_1900_ABC_WNT.rttmx
     ~mehrbod/out/eng.ANC_WNT.txt

  6. Task: Story Segmentation
     • How difficult is it?
       • Topic vs. Story
       • Segment classes
         • New story
         • Teaser
         • Misc.
         • Under-transcribed
       • Error accumulated from previous processes

  7. Current Approach - Summary
     • Align story boundaries with sentence boundaries
     • Extract sentence-level features
       • Lexical
       • Acoustic
       • Speaker-dependent
     • Train and evaluate a decision tree classifier (J48 or JRip)
     http://www1.cs.columbia.edu/~amaxwell/pubs/storyseg-final-hlt.pdf
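
The first step above, aligning story boundaries with sentence boundaries, can be sketched as follows. This is only an illustration under assumptions: the slides do not say how the alignment was done, so the sketch snaps each annotated story-boundary time to the nearest sentence start, and the function names and the time values in the usage note are hypothetical.

```python
from bisect import bisect_left


def align_to_sentences(story_times, sentence_starts):
    """Snap each story-boundary time to the index of the nearest sentence start.

    Assumes sentence_starts is sorted and non-empty (hypothetical input format).
    """
    aligned = set()
    for t in story_times:
        i = bisect_left(sentence_starts, t)
        # Candidates: the sentence start just before t and the one at or after t.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(sentence_starts)]
        aligned.add(min(candidates, key=lambda j: abs(sentence_starts[j] - t)))
    return sorted(aligned)


def boundary_labels(story_times, sentence_starts):
    """One Boolean per sentence: True if a story is taken to start at that sentence."""
    aligned = set(align_to_sentences(story_times, sentence_starts))
    return [i in aligned for i in range(len(sentence_starts))]
```

For example, with sentence starts at 0.0, 4.2, 9.8, 15.1 and 21.0 seconds and story boundaries annotated at 0.0, 10.5 and 20.0 seconds, the boundaries land on sentences 0, 2 and 4. The resulting Boolean labels are the kind of per-sentence target a classifier could be trained on.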

  8. Current Approach - Features
     • Lexical (*: computed over various windows)
       • TextTiling*, LCSeg, keywords*, sentence position and length
     • Acoustic
       • Pitch and intensity: min, max, median, mean, std. dev., mean absolute slope
       • Pause, speaking rate (voiced frames / total frames)
       • Vowel duration: mean vowel length, sentence-final vowel length, sentence-final rhyme length
       • Second order of the above
     • Speaker
       • Speaker distribution, speaker turn, first in the show
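
As an illustration of the pitch and intensity statistics listed above, here is a minimal sketch that summarizes one contour (for example, the pitch samples inside a sentence). The input format (parallel lists of times and values), the choice of population standard deviation, and the function names are assumptions, not details taken from the slides.

```python
import statistics


def contour_stats(times, values):
    """Summary statistics of a pitch or intensity contour.

    times and values are parallel lists; times must be strictly increasing.
    """
    # Absolute slope between each pair of consecutive samples.
    slopes = [
        abs((values[i + 1] - values[i]) / (times[i + 1] - times[i]))
        for i in range(len(values) - 1)
    ]
    return {
        "min": min(values),
        "max": max(values),
        "median": statistics.median(values),
        "mean": statistics.fmean(values),
        "stdev": statistics.pstdev(values),  # population std. dev. (an assumption)
        "mean_abs_slope": statistics.fmean(slopes) if slopes else 0.0,
    }


def speaking_rate(voiced_flags):
    """Ratio of voiced frames to total frames, as described on the slide."""
    return sum(1 for v in voiced_flags if v) / len(voiced_flags)
```

Calling `contour_stats([0, 1, 2, 3], [100, 120, 110, 130])` gives a mean and median of 115 and a mean absolute slope of about 16.7 units per time step.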

  9. Current Approach - Results
     • Reported in the HLT paper for the full feature set at the sentence level, using:
       • Pk (Beeferman et al., 1999)
       • WindowDiff (Pevzner and Hearst, 2002)
       • Cseg (Doddington, 1998)
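
For readers unfamiliar with the first two metrics, the sketch below shows one common way to compute Pk and WindowDiff. It is an illustration under assumptions, not the evaluation code behind the reported results: a segmentation is represented as a set of positions at which a new story starts (1 to n-1 over n units), and the default window k is half the mean reference segment length. Cseg is not covered here.

```python
def _count_between(boundaries, i, j):
    """Number of boundaries b with i < b <= j, i.e. separating unit i from unit j."""
    return sum(1 for b in boundaries if i < b <= j)


def default_k(ref, n):
    """Half the mean reference segment length, at least 1."""
    return max(1, round(n / (len(ref) + 1) / 2))


def pk(ref, hyp, n, k=None):
    """Fraction of windows where ref and hyp disagree on 'same segment or not'."""
    if k is None:
        k = default_k(ref, n)
    errors = sum(
        (_count_between(ref, i, i + k) == 0) != (_count_between(hyp, i, i + k) == 0)
        for i in range(n - k)
    )
    return errors / (n - k)


def window_diff(ref, hyp, n, k=None):
    """Fraction of windows where ref and hyp contain a different number of boundaries."""
    if k is None:
        k = default_k(ref, n)
    errors = sum(
        _count_between(ref, i, i + k) != _count_between(hyp, i, i + k)
        for i in range(n - k)
    )
    return errors / (n - k)
```

With n = 10 units, k = 2, a reference boundary at position 5 and hypothesized boundaries at 5 and 6, Pk is 0.125 while WindowDiff is 0.25: WindowDiff also penalizes the window in which the hypothesis contains two boundaries and the reference only one. Lower is better for both metrics.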

  10. Improvements In Progress
     • Looking for ways to reduce the negative effect of errors inherited from upstream processes (ASR, SU and speaker detection)
     • Adding or modifying features to make them more robust to error
     • Analyzing the current features and discarding those that are not discriminative or descriptive enough
     • Improving the framework for the package

  11. Word Level vs. Sentence Level
     • Pros
       • Eliminates the error from sentence boundary detection (it becomes a feature)
       • No need for story boundary alignment
     • Cons
       • More chance for error and a lower baseline
       • Higher risk of overfitting

  12. Word Level vs. Sentence Level

  13. Word Level - Features
     • Each word's features are augmented with information about a window preceding, surrounding or following it:
       • Acoustic features were computed over windows of five words
       • Similar idea for other features, e.g.,
         • @attribute speaker_boundary {TRUE,FALSE}
         • @attribute same_speaker_5 {TRUE,FALSE}
         • @attribute same_speaker_10 {TRUE,FALSE}
         • @attribute same_speaker_20 {TRUE,FALSE}
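
The speaker attributes above might be derived roughly as in the sketch below. The slides do not define them, so the semantics here are assumptions: `speaker_boundary` is taken to mean that the word's speaker differs from the previous word's, and `same_speaker_N` that all of the N preceding words share the current word's speaker. The function names are hypothetical.

```python
def same_speaker(speakers, i, width):
    """True if every one of the `width` words before word i has word i's speaker."""
    start = max(0, i - width)
    return all(s == speakers[i] for s in speakers[start:i])


def speaker_features(speakers, widths=(5, 10, 20)):
    """One feature dict per word, given a per-word list of speaker labels."""
    rows = []
    for i, spk in enumerate(speakers):
        row = {"speaker_boundary": i > 0 and speakers[i - 1] != spk}
        for w in widths:
            row[f"same_speaker_{w}"] = same_speaker(speakers, i, w)
        rows.append(row)
    return rows
```

Using several widths lets a classifier distinguish a brief interjection from a sustained change of speaker, which is presumably why the attribute appears at 5, 10 and 20 words.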

  14. Word Level - Features • Feature analysis for sentence level features e.g., for ABC show using Weka (ordered list):

  15. Word Level - Features
     • Word ASR confidence: the raw score, a Boolean low-confidence flag (@reject@ or score < 0.8), and a count of flagged words in various window widths
     • Word introduction
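
A minimal sketch of the ASR confidence features described above: a word is flagged when it is the `@reject@` token or its score is below 0.8, and flagged words are counted per window. The window placement (the last N words up to and including the current one), the default widths and the function names are assumptions for illustration.

```python
REJECT = "@reject@"


def is_low_confidence(word, score, threshold=0.8):
    """Boolean feature from the slide: rejected word or score below the threshold."""
    return word == REJECT or score < threshold


def low_confidence_counts(words, scores, widths=(5, 10, 20)):
    """Per word, count flagged words in the window of `width` words ending at it."""
    flags = [is_low_confidence(w, s) for w, s in zip(words, scores)]
    rows = []
    for i in range(len(flags)):
        row = {"low_conf": flags[i]}
        for width in widths:
            start = max(0, i - width + 1)
            row[f"low_conf_count_{width}"] = sum(flags[start:i + 1])
        rows.append(row)
    return rows
```

A high count in a wide window would indicate a stretch of poorly recognized speech, such as the rejected words around headlines and commercials in the example on slide 5.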

  16. Word Level - Results

  17. Future Directions
     • Finding a reasonable segmentation strategy, followed by clustering on features extracted from segments:
       • Sentences => A+L+S
       • Pause => L
       • “acoustic tiling” => L+S
     • Sequential modeling
     • Performing more morphological analysis, particularly in Arabic
     • Using the rest of the story and topic labels
     • Using other parts of the TDT and/or external information for training: WordNet, WSJ, etc.
     • Experimenting with other classifiers: JRip, SVM, Bayesian, GMM, etc.

  18. Thank you. Questions?
