Early error detection on word level

Gabriel Skantze and Jens Edlund

Centre for Speech Technology

Department of Speech, Music and Hearing

KTH, Sweden


Overview

  • How do we handle errors in conversational human-computer dialogue?

  • Which features are useful for error detection in ASR results?

  • Two studies on selected features:

    • Machine learning

    • Human subjects’ judgement


Error detection

  • Early error detection

    • Detect if a given recognition result contains errors

    • e.g. Litman, D. J., Hirschberg, J., & Swerts, M. (2000).

  • Late error detection

    • Feed back the interpretation of the utterance to the user (grounding)

    • Based on the user’s reaction to that feedback, detect errors in the original utterance

    • e.g. Krahmer, E., Swerts, M., Theune, M. & Weegels, M. E. (2001).

  • Error prediction

    • Detect that errors may occur later on in the dialogue

    • e.g. Walker, M. A., Langkilde-Geary, I., Wright Hastie, H., Wright, J., & Gorin, A. (2002).


Why early error detection?

  • ASR errors reflect errors in acoustic and language models. Why not fix them there?

    • Post-processing can take into account systematic errors in the models that arise from mismatched training and usage conditions.

    • Post-processing may help to pinpoint the actual problems in the models.

    • Post-processing can include factors not considered by the ASR, such as:

      • Prosody

      • Semantics

      • Dialogue history


Corpus collection

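[Figure: corpus collection setup. The user speaks and the operator reads the ASR output; the operator speaks and the user listens through a vocoder.]

Example:

  • User says: "I have the lawn on my right and a house with number two on my left"

  • ASR result: "i have the lawn on right is and a house with from two on left"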


Study I: Machine learning

  • 4470 words

  • 73.2% correct (baseline)

  • 4/5 training data, 1/5 test data

  • Two ML algorithms tested

    • Transformation-based learning (µ-TBL)

      • Learn a cascade of rules that transforms the classification

    • Memory-based learning (TiMBL)

      • Simply store each training instance in memory

      • Compare the test instance to the stored instances and find the closest match
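
The memory-based idea can be illustrated with a minimal Python sketch. This is not TiMBL itself: it is a plain nearest-neighbour lookup with an unweighted overlap distance, and the three features used here (word, confidence bin, previous word) are hypothetical stand-ins, since the feature table is not preserved in this transcript. The `majority_baseline` helper shows how a baseline such as the 73.2% above is presumably obtained, by labelling every word with the most frequent class.

```python
from collections import Counter


def overlap_distance(a, b):
    """Count the feature positions in which two instances differ."""
    return sum(x != y for x, y in zip(a, b))


class MemoryBasedClassifier:
    """Stores all training instances and classifies by the closest match."""

    def __init__(self):
        self.memory = []  # list of (features, label) pairs

    def fit(self, instances, labels):
        # "Training" is nothing more than storing the instances.
        self.memory = list(zip(instances, labels))

    def predict(self, instance):
        # Return the label of the stored instance with the smallest distance.
        _, label = min(
            self.memory, key=lambda item: overlap_distance(item[0], instance)
        )
        return label


def majority_baseline(labels):
    """Return the most frequent label and the accuracy of always guessing it."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


# Hypothetical toy data: (word, confidence bin, previous word).
train_x = [
    ("lawn", "high", "the"),
    ("house", "high", "a"),
    ("is", "low", "right"),
    ("from", "low", "with"),
    ("two", "high", "number"),
]
train_y = ["correct", "correct", "error", "error", "correct"]

clf = MemoryBasedClassifier()
clf.fit(train_x, train_y)
prediction = clf.predict(("from", "low", "house"))
baseline_label, baseline_accuracy = majority_baseline(train_y)
print(prediction, baseline_label, baseline_accuracy)
```

The test instance differs from the stored ("from", "low", "with") instance in only one feature, so it inherits that instance's label. A real memory-based learner would typically weight the features rather than count mismatches equally.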


Features


Results

  • Content-words:

    • Baseline: 69.8%, µ-TBL: 87.7%, TiMBL: 87.0%
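
One way to read these figures, assuming the percentages are classification accuracies, is as a relative error reduction: the share of the baseline's errors that each classifier removes. The short Python sketch below does the arithmetic; the function name is chosen here for illustration.

```python
def relative_error_reduction(baseline_acc, model_acc):
    """Percentage of the baseline's errors removed by the model.

    Both arguments are accuracies in percent.
    """
    baseline_err = 100.0 - baseline_acc
    model_err = 100.0 - model_acc
    return 100.0 * (baseline_err - model_err) / baseline_err


CONTENT_WORD_BASELINE = 69.8
results = {"µ-TBL": 87.7, "TiMBL": 87.0}

reductions = {
    name: round(relative_error_reduction(CONTENT_WORD_BASELINE, acc), 1)
    for name, acc in results.items()
}
for name, value in reductions.items():
    print(f"{name}: {value}% of baseline errors removed")
```

Under that assumption, µ-TBL removes roughly 59% of the baseline errors on content words and TiMBL roughly 57%.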


Rules learned by µ-TBL


Study II: Human error detection

  • First 15 user utterances from 4 dialogues with high WER

  • 50% of the words correct (baseline)

  • 8 judges

  • Features were varied for each utterance:

    • ASR information

    • Context information


Features


The judges’ interface

Correction field

Dialogue so far

5-best list

Grey scale reflects word confidence

Utterance confidence


Results


Conclusions & Discussion

  • ML can be used for early error detection on word level, especially for content words.

  • Word confidence scores have some use.

  • Utterance context and lexical information improve the ML performance.

  • A rule-learning algorithm such as transformation-based learning can be used to pinpoint the specific problems.

  • N-best lists are useful for human subjects. How do we operationalise them for ML?
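
Why a transformation-based rule cascade is easy to inspect can be seen in a small Python sketch. It only applies an ordered list of rules; the learning step that selects the rules is omitted. The two rules, the confidence thresholds and the confidence values are hypothetical and are not the rules learned by µ-TBL in the study.

```python
def apply_rules(words, rules, initial_label="correct"):
    """Label every word with the initial class, then apply each rule in order.

    A rule rewrites the label `from_label` to `to_label` wherever its
    condition holds, so later rules can revise what earlier rules decided.
    """
    labels = [initial_label] * len(words)
    for rule in rules:
        for i in range(len(words)):
            if labels[i] == rule["from_label"] and rule["condition"](words, i):
                labels[i] = rule["to_label"]
    return labels


# Hypothetical rules, written so that each one can be read on its own.
rules = [
    {   # Rule 1: low word confidence suggests an error.
        "from_label": "correct",
        "to_label": "error",
        "condition": lambda ws, i: ws[i]["confidence"] < 0.5,
    },
    {   # Rule 2: trust a word again if the preceding word is very reliable.
        "from_label": "error",
        "to_label": "correct",
        "condition": lambda ws, i: i > 0 and ws[i - 1]["confidence"] >= 0.85,
    },
]

# Words from the ASR example, with made-up confidence scores.
words = [
    {"text": "house", "confidence": 0.9},
    {"text": "with", "confidence": 0.45},
    {"text": "from", "confidence": 0.3},
    {"text": "two", "confidence": 0.8},
]

labels = apply_rules(words, rules)
for word, label in zip(words, labels):
    print(word["text"], label)
```

Because each rule states an explicit condition, a rule that fires often points directly at the situations in which the recogniser tends to fail, which is what makes this kind of learner useful for pinpointing problems.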


Conclusions & Discussion

  • The ML performance improved only slightly when discourse context was added.

    • Further work on operationalising context for ML should focus on the previous utterance.

  • The classifier should be tested together with a parser or keyword spotter to see if it can improve performance.

  • Other features should be investigated, such as prosody. These may improve performance further.


The End

Thank you for your attention!

Questions?

