
Early error detection on word level

Gabriel Skantze and Jens Edlund


Centre for Speech Technology

Department of Speech, Music and Hearing

KTH, Sweden

Overview

  • How do we handle errors in conversational human-computer dialogue?
  • Which features are useful for error detection in ASR results?
  • Two studies on selected features:
    • Machine learning
    • Human subjects’ judgement
Error detection
  • Early error detection
    • Detect if a given recognition result contains errors
    • e.g. Litman, D. J., Hirschberg, J., & Swerts, M. (2000).
  • Late error detection
    • Feed back the interpretation of the utterance to the user (grounding)
    • Based on the user’s reaction to that feedback, detect errors in the original utterance
    • e.g. Krahmer, E., Swerts, M., Theune, M. & Weegels, M. E. (2001).
  • Error prediction
    • Detect that errors may occur later on in the dialogue
    • e.g. Walker, M. A., Langkilde-Geary, I., Wright Hastie, H., Wright, J., & Gorin, A. (2002).
Why early error detection?
  • ASR errors reflect errors in acoustic and language models. Why not fix them there?
    • Post-processing can account for systematic errors in the models caused by mismatched training and usage conditions.
    • Post-processing may help to pinpoint the actual problems in the models.
    • Post-processing can include factors not considered by the ASR, such as:
      • Prosody
      • Semantics
      • Dialogue history
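As a toy illustration of this idea (all feature names and values below are invented for the sketch, not taken from the study), a post-processing feature vector for one recognized word could combine the recognizer's own confidence score with factors the ASR never sees:

```python
# Hypothetical sketch: a per-word feature vector for an error classifier,
# combining ASR output with prosody and dialogue history. Names and
# values are illustrative only.

def word_features(word, asr_confidence, pitch_mean, dialogue_history):
    """Build a feature dict for one recognized word."""
    return {
        "word": word,
        "asr_confidence": asr_confidence,             # from the recognizer
        "pitch_mean": pitch_mean,                     # prosodic feature
        "seen_in_history": word in dialogue_history,  # dialogue context
    }

feats = word_features("lawn", 0.42, 187.5, {"lawn", "house"})
print(feats["seen_in_history"])  # True
```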
Corpus collection

  • Example: user utterance vs. ASR result
    • Spoken: I have the lawn on my right and a house with number two on my left
    • Recognized: i have the lawn on right is and a house with from two on left
Study I: Machine learning
  • 4470 words
  • 73.2% correct (baseline)
  • 4/5 training data, 1/5 test data
  • Two ML algorithms tested
    • Transformation-based learning (µ-TBL)
      • Learn a cascade of rules that transforms the classification
    • Memory-based learning (TiMBL)
      • Simply store each training instance in memory
      • Compare the test instance to the stored instances and find the closest match
  • Content words:
    • Baseline: 69.8%, µ-TBL: 87.7%, TiMBL: 87.0%
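The memory-based idea described above can be sketched as a nearest-neighbour classifier over symbolic word features. This is a minimal stand-in for TiMBL, not the study's actual setup; the features and training instances are invented for illustration:

```python
# Minimal memory-based learning sketch (the idea behind TiMBL): store
# training instances, then label a test instance with the label of its
# closest stored neighbour. Features here are invented for illustration.

def overlap_distance(a, b):
    """Count mismatching feature values between two instances."""
    return sum(1 for x, y in zip(a, b) if x != y)

def classify_1nn(memory, instance):
    """Return the label of the stored instance closest to `instance`."""
    _, label = min(memory, key=lambda m: overlap_distance(m[0], instance))
    return label

# Each instance: (confidence_bin, part_of_speech, word_seen_in_context)
memory = [
    (("high", "noun", True),  "correct"),
    (("low",  "noun", False), "error"),
    (("high", "verb", True),  "correct"),
    (("low",  "prep", True),  "error"),
]

print(classify_1nn(memory, ("low", "prep", False)))  # "error"
```

Real memory-based learners add feature weighting and k > 1 neighbours, but the store-and-compare core is the same.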
Study II: Human error detection
  • First 15 user utterances from 4 dialogues with high WER
  • 50% of the words correct (baseline)
  • 8 judges
  • Features were varied for each utterance:
    • ASR information
    • Context information
The judges’ interface

  • Correction field
  • Dialogue so far
  • 5-best list
  • Grey scale reflects word confidence
  • Utterance confidence
Conclusions & Discussion
  • ML can be used for early error detection on word level, especially for content words.
  • Word confidence scores have some use.
  • Utterance context and lexical information improve the ML performance.
  • A rule-learning algorithm such as transformation-based learning can be used to pinpoint the specific problems.
  • N-best lists are useful for human subjects. How do we operationalise them for ML?
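One conceivable way to operationalise n-best lists for ML (an assumption for illustration, not the authors' method) is a per-word support feature: the fraction of hypotheses in which the word occurs, on the intuition that words surviving across hypotheses are more likely correct.

```python
# Illustrative n-best feature: how many of the recognizer's hypotheses
# contain a given word. The hypotheses below are invented examples.

def nbest_support(word, nbest):
    """Fraction of n-best hypotheses that contain `word`."""
    return sum(word in hyp.split() for hyp in nbest) / len(nbest)

nbest = [
    "i have the lawn on right is",
    "i have the lawn on my right",
    "i have a lawn on my right",
]
print(round(nbest_support("lawn", nbest), 2))  # 1.0
print(round(nbest_support("is", nbest), 2))    # 0.33
```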
Conclusions & Discussion
  • ML performance improved only slightly with the discourse context.
    • Further work on operationalising context for ML should focus on the previous utterance.
  • The classifier should be tested together with a parser or keyword spotter to see if it can improve performance.
  • Other features should be investigated, such as prosody. These may improve performance further.

The End

Thank you for your attention!