Speech Summarization

Julia Hirschberg (thanks to Sameer Maskey for some slides)

CS4706

Summarization Distillation
  • ‘…the process of distilling the most important information from a source (or sources) to produce an abridged version for a particular user (or users) and task (or tasks)’ [Mani and Maybury, 1999]
  • Why summarize? Too much data!
Types of Summarization
  • Indicative
    • Describes the document and its contents
  • Informative
    • ‘Replaces’ the document
  • Extractive
    • Concatenates pieces of the existing document
  • Generative
    • Creates a new document
  • Document compression
Some Summarization Techniques Based on Text (Lexical Features)
  • Sentence extraction with similarity measures [Salton, et al., 1995]
  • Extraction training with manual summaries [McKeown, et al., 2001]
  • Concept-level extraction of concept units [Hovy & Lin, 1999]
  • Generation of words/phrases [Witbrock & Mittal, 1999]
  • Use of structured data [Maybury, 1995]
Sentence Extraction/Similarity measures [Salton, et al. 1995]
  • Extract sentences by their similarity to a topic sentence and their dissimilarity to sentences already in the summary (Maximal Marginal Relevance; a minimal sketch follows after this list)
  • Similarity measures
    • Cosine Measure
    • Vocabulary Overlap
    • Topic word overlap
    • Content Signatures Overlap
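
A minimal sketch of this style of extraction, assuming a bag-of-words cosine measure and an MMR-style selection loop; the function names and the lambda trade-off weight are illustrative, not taken from the slides.

```python
# Illustrative sketch: cosine similarity plus MMR-style sentence selection.
from collections import Counter
import math

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def mmr_summary(sentences, topic_sentence, k=3, lam=0.7):
    """Pick k sentences similar to the topic sentence, dissimilar to those already chosen."""
    vecs = [Counter(s.lower().split()) for s in sentences]
    topic = Counter(topic_sentence.lower().split())
    chosen = []
    while len(chosen) < min(k, len(sentences)):
        best, best_score = None, float("-inf")
        for i, v in enumerate(vecs):
            if i in chosen:
                continue
            relevance = cosine(v, topic)
            redundancy = max((cosine(v, vecs[j]) for j in chosen), default=0.0)
            score = lam * relevance - (1 - lam) * redundancy
            if score > best_score:
                best, best_score = i, score
        chosen.append(best)
    return [sentences[i] for i in sorted(chosen)]
```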
Concept/content level extraction [Hovy & Lin, 1999]
  • Present key-words as summary
  • Builds concept signatures by finding topic-relevant words in 30,000 WSJ documents, each categorized into a topic (a rough sketch follows after this list)
  • Phrase concatenation of relevant concepts/content
  • Sentence planning for generation
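
A rough sketch of how concept signatures might be collected from topic-labeled documents; the ratio-based relevance score below is a simple stand-in for the statistical scoring used in the paper.

```python
# Illustrative sketch: per-topic "concept signatures" as words over-represented
# in a topic's documents relative to the whole collection.
from collections import Counter

def concept_signatures(docs_by_topic, top_n=20):
    """docs_by_topic maps a topic name to a list of tokenized documents."""
    topic_counts = {t: Counter(w for doc in docs for w in doc)
                    for t, docs in docs_by_topic.items()}
    global_counts = Counter()
    for counts in topic_counts.values():
        global_counts.update(counts)
    global_total = sum(global_counts.values())
    signatures = {}
    for topic, counts in topic_counts.items():
        topic_total = sum(counts.values())
        relevance = {w: (counts[w] / topic_total) / (global_counts[w] / global_total)
                     for w in counts}
        signatures[topic] = sorted(relevance, key=relevance.get, reverse=True)[:top_n]
    return signatures
```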
Feature-based statistical models [Kupiec, et al., 1995]
  • Create manual summaries
  • Extract features
  • Train statistical model using various ML techniques
  • Use the trained model to score each sentence in the test data
  • Extract N highest-scoring sentences
      • where S is the summary, the features are F1 … Fk, and P(Fj) and P(Fj | s ∈ S) can be estimated by counting occurrences in the training data (the scoring equation is reconstructed below)
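
The scoring equation itself was lost in the transcript; a reconstruction of the naive-Bayes sentence score from Kupiec et al. (assuming the usual feature-independence assumption) is:

```latex
P(s \in S \mid F_1, \dots, F_k) \;\approx\;
  \frac{P(s \in S)\, \prod_{j=1}^{k} P(F_j \mid s \in S)}{\prod_{j=1}^{k} P(F_j)}
```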
Structured Database [Maybury, 1995]
  • Summarize text represented in structured form: database, templates
    • E.g. generation of a medical history from a database of medical ‘events’
  • Link analysis (semantic relations within the structure)
  • Domain dependent importance of events
Comparing Speech and Text Summarization
  • Alike
    • Identifying important information
    • Some lexical, discourse features
    • Extraction or generation or compression
  • Different
    • Speech Signal
    • Prosodic features
    • NLP tools?
    • Segments – sentences?
    • Generation?
    • Errors
    • Data size
Text vs. Speech Summarization (NEWS)
  • Text
    • Error-free text or manual transcripts
    • Full set of lexical features
    • Sentence segmentation
    • NLP tools readily available
  • Speech (broadcast news)
    • The speech signal itself
    • Varied speech channels: phone, remote satellite, station
    • Transcripts from ASR or closed captions (errorful)
    • Many speakers and speaking styles
    • Only some lexical features usable
    • Structure: anchor/reporter interaction, story presentation style
    • Prosodic features: pitch, energy, duration
    • Extraneous material: commercials, weather reports
Speech Summarization Today
  • Mostly extractive:
    • Words, sentences, content units
  • Some compression methods
  • Generation-based summarization difficult
    • Text or synthesized speech?
Generation or Extraction?
  • SENT27 a trial that pits the cattle industry against tv talk show host oprah winfrey is under way in amarillo , texas.
  • SENT28 jury selection began in the defamation lawsuit began this morning .
  • SENT29 winfrey and a vegetarian activist are being sued over an exchange on her April 16, 1996 show .
  • SENT30 texas cattle producers claim the activists suggested americans could get mad cow disease from eating beef .
  • SENT31 and winfrey quipped , this has stopped me cold from eating another burger
  • SENT32 the plaintiffs say that hurt beef prices and they sued under a law banning false and disparaging statements about agricultural products
  • SENT33 what oprah has done is extremely smart and there's nothing wrong with it she has moved her show to amarillo texas , for a while
  • SENT34 people are lined up , trying to get tickets to her show so i'm not sure this hurts oprah .
  • SENT35 incidentally oprah tried to move it out of amarillo . she's failed and now she has brought her show to amarillo .
  • SENT36 the key is , can the jurors be fair
  • SENT37 when they're questioned by both sides, by the judge , they will be asked, can you be fair to both sides
  • SENT38 if they say , there's your jury panel
  • SENT39 oprah winfrey's lawyers had tried to move the case from amarillo , saying they couldn't get an impartial jury
  • SENT40 however, the judge moved against them in that matter …

(On the slide, the sentences above are labeled to distinguish the full story from the subset selected as its summary.)

Speech Summarization Techniques
  • Sentence extraction with similarity measures [Christensen et al., 2004]
  • Word scoring with dependency structure [Hori C. et al., 1999, 2002; Hori T. et al., 2003]
  • Classification [Koumpis & Renals, 2004]
  • User access information [He et al., 1999]
  • Removing disfluencies [Zechner, 2001]
  • Weighted finite state transducers [Hori T. et al., 2003]
Content/Context sentence level extraction for speech summary [Christensen et al., 2004]
  • Find sentences similar to the lead topic sentences
  • Use position features to find the relevant nearby sentences after detecting a topic sentence
    • where Sim is a similarity measure between two sentences, or between a sentence and the document D, and E is the set of sentences already in the summary (a reconstruction of the score appears below)
    • Choose a new sentence that is most similar to D and most different from E
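
The original scoring equation did not survive the transcript; an MMR-style reconstruction consistent with the definitions above, with λ an assumed trade-off weight, is:

```latex
\mathrm{score}(s_i) \;=\; \lambda \, \mathrm{Sim}(s_i, D)
  \;-\; (1 - \lambda) \max_{s_j \in E} \mathrm{Sim}(s_i, s_j)
```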
Weighted finite state transducers for speech summarization [Hori T. et al., 2003]
  • Summarization integrates speech recognition, paraphrasing, and sentence compaction into a single weighted finite state transducer (WFST)
  • The decoder can then use all knowledge sources in a one-pass strategy
  • Speech recognition with WFSTs is the composition H ∘ C ∘ L ∘ G
    • where H is the state network of triphone HMMs, C encodes triphone connection rules, L is the pronunciation lexicon, and G is a trigram language model
  • Paraphrasing can be viewed as a kind of machine translation with translation probability P(W|T), where W is the source language and T the target language
  • If S is the WFST representing the translation rules and D is the language model of the target language, speech summarization can be viewed as the composition H ∘ C ∘ L ∘ G ∘ S ∘ D

(Diagram: H ∘ C ∘ L ∘ G forms the speech recognizer, S ∘ D the translator, and their composition the speech translator/summarizer.)

User Access Identifies What to Include [He et al., 1999]
  • Summarize lectures or shows by extracting parts that have been viewed the longest
  • Needs multiple users of the same show, meeting or lecture for training
  • E.g., to summarize a lecture, compute the time users spent on each slide (a minimal sketch follows after this list)
  • Summarizer based on user access logs did as well as summarizers that used linguistic and acoustic features
    • Average score of 4.5 on a scale of 1 to 8 for the summarizer (subjective evaluation)
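
A minimal sketch of the access-log idea, assuming a hypothetical log of (user, segment, seconds viewed) tuples; the function and field names are illustrative, not from the original system.

```python
# Illustrative sketch: rank lecture segments by total time users spent viewing them.
from collections import defaultdict

def summarize_by_access(access_log, k=5):
    """access_log: iterable of (user, segment_id, seconds_viewed) tuples.
    Returns the k segment ids with the highest total viewing time."""
    time_per_segment = defaultdict(float)
    for _user, segment, seconds in access_log:
        time_per_segment[segment] += seconds
    return sorted(time_per_segment, key=time_per_segment.get, reverse=True)[:k]

# Example: viewing times for three users over four slides.
log = [("u1", "slide2", 120), ("u2", "slide2", 90), ("u1", "slide1", 30),
       ("u3", "slide4", 200), ("u2", "slide3", 15)]
print(summarize_by_access(log, k=2))  # ['slide2', 'slide4']
```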
Word level extraction by scoring/classifying words [Hori C. et al., 1999, 2002]
  • Score each word in the sentence and extract a set of words to form a sentence whose total score is the product/sum of the scores of each word
  • Example:
    • Word Significance score (topic words)
    • Linguistic Score (bigram probability)
    • Confidence Score (from ASR)
    • Word Concatenation Score (dependency structure grammar)

where M is the number of words to be extracted and λ_I, λ_C, λ_T are weighting factors balancing the linguistic score L, the significance score I, the confidence score C, and the concatenation score Tr (the full scoring equation is reconstructed below)
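The scoring equation itself is missing from the transcript; a reconstruction along the lines of Hori and Furui's sentence-compaction score, with the λ terms as the weighting factors mentioned above, is roughly:

```latex
S(V) \;=\; \sum_{m=1}^{M} \Big\{ L(v_m) \;+\; \lambda_I\, I(v_m)
  \;+\; \lambda_C\, C(v_m) \;+\; \lambda_T\, T_r(v_{m-1}, v_m) \Big\}
```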

Segmentation Using Discourse Cues [Maybury, 1998]
  • Discourse Cue-Based Story Segmentation
  • Discourse Cues in CNN
    • Start and end of broadcast
    • Anchor/Reporter handoff, Reporter/Anchor handoff
    • Cataphoric Segment (“still ahead …”)
  • Time-Enhanced Finite State Machine representing discourse states such as anchor segment, reporter segment, and advertisement (a toy sketch follows after this list)
  • Other features: named entities, part of speech, discourse shift markers (“>>” speaker change, “>>>” subject change)
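
A toy sketch of cue-driven segmentation in the spirit of this approach; the cue strings, state names, and segmentation policy are illustrative assumptions, not the original system.

```python
# Illustrative sketch: a tiny cue-driven state machine for story segmentation.
CUES = {
    ">>>": "subject_change",   # start a new story segment
    ">>": "speaker_change",    # anchor/reporter handoff
    "still ahead": "teaser",   # cataphoric segment
}

def segment(transcript_lines):
    """Split transcript lines into (speaker_state, lines) segments at subject changes."""
    state, segments, current = "anchor", [], []
    for line in transcript_lines:
        lowered = line.lower()
        cue = next((label for c, label in CUES.items() if c in lowered), None)
        if cue == "subject_change" and current:
            segments.append((state, current))
            current = []
        if cue == "speaker_change":
            state = "reporter" if state == "anchor" else "anchor"
        current.append(line)
    if current:
        segments.append((state, current))
    return segments
```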
CU: Summarization Without Words: Does the Importance of What Is Said Correlate with How It Is Said?
  • Hypothesis: “Speakers change their amplitude, pitch, speaking rate to signify importance of words, phrases, sentences.”
    • If so, then the prediction labels for sentences predicted using acoustic features (A) should correlate with labels predicted using lexical features (L)
    • In fact, this seems to be true (correlation of .74 between the predictions of A and L)
Is It Possible to Build ‘good’ Automatic Speech Summarization Without Any Transcripts?
  • Using just A+S, without any lexical features, gives a 6% higher F-measure and 18% higher ROUGE-avg than the baseline
Evaluation using ROUGE
  • F-measure too strict
    • Predicted summary sentences must exactly match the reference summary sentences
    • What if content is similar but not identical?
  • ROUGE(s)…
ROUGE metric
  • Recall-Oriented Understudy for Gisting Evaluation (ROUGE)
  • ROUGE-N (n-gram overlap for N = 1, 2, 3, 4; a minimal sketch follows after this list)
  • ROUGE-L (longest common subsequence)
  • ROUGE-S (skip bigram)
  • ROUGE-SU (skip bigram counting unigrams as well)
  • Does ROUGE solve the problem?
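
A minimal sketch of recall-oriented ROUGE-N against a single reference summary; real ROUGE handles multiple references, stemming, and stopword options, while this shows only the core n-gram recall.

```python
# Illustrative sketch: ROUGE-N recall against a single reference summary.
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n(candidate, reference, n=1):
    """Overlapping n-grams divided by the number of n-grams in the reference."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    if not ref:
        return 0.0
    overlap = sum(min(cand[g], count) for g, count in ref.items())
    return overlap / sum(ref.values())

print(rouge_n("the cattle industry sued oprah winfrey",
              "texas cattle producers sued talk show host oprah winfrey", n=1))  # ≈ 0.44
```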
Next Class
  • Emotional speech
  • HW 4 assigned