Belief updating in spoken dialog systems

Dan Bohus

www.cs.cmu.edu/~dbohus

[email protected]

Computer Science Department

Carnegie Mellon University

Pittsburgh, PA, 15217


Problem
problem

spoken language interfaces lack robustness when faced with understanding errors

  • stems mostly from speech recognition

  • spans most domains and interaction types


More concretely
more concretely …

S: What city are you leaving from?

U: Urbana Champaign [OKAY IN THAT SAME PAY]

S: Sorry I’m not sure I understood what you said. What city are you leaving from?

U: Urbana Champaign [FOR MINUTE SINCE HEY]

S: I’m still having trouble understanding you. To go on, I need you to answer the following question. What city are you leaving from?

U: Chicago [CHICAGO]

S: traveling from Chicago. Where would you like to go?

U: Huntsville [SEOUL]

S: traveling to Seoul. What day did you need to travel?

U: no no I’m traveling to Birmingham [THE TRAVELING TO BERLIN P_M]

S: traveling in the afternoon. Okay, what day would you be departing Chicago?

U: /uh/ the tenth of august [AT THE TENTH OF AUGUST]

S: a flight tomorrow… I have a Korean Airlines flight departing Chicago at 1:40pm, arrives Seoul at 5pm the next day. Do you want to take that?

U: my destination is Birmingham [FLIGHT DESTINATION MR WEEKEND]

S: traveling on Saturday, August 12th … I have a flight departing Chicago at 1:40pm arrives Seoul at ………


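Non- and misunderstandings
non- and misunderstandings

  • NON-understanding: the system knows it failed to understand the user (the first two user turns in the example dialog)

  • MIS-understanding: the system acts on an incorrect interpretation (e.g. Huntsville recognized as SEOUL)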


Approaches for increasing robustness
approaches for increasing robustness

  • fix recognition

  • gracefully handle errors through interaction

    • detect the problems

    • develop a set of recovery strategies

    • know how to choose between them (policy)


Six not so easy pieces
six not-so-easy pieces …

  • misunderstandings: detection, strategies, policy

  • non-understandings: detection, strategies, policy


Belief updating
belief updating

  • construct more accurate beliefs by integrating information over multiple turns

  • addresses one of the six pieces: detection of misunderstandings

S: Where would you like to go?

U: Huntsville [SEOUL / 0.65]

destination = {seoul/0.65}

S: traveling to Seoul. What day did you need to travel?

U: no no I’m traveling to Birmingham [THE TRAVELING TO BERLIN P_M / 0.60]

destination = {?}


Belief updating problem statement
belief updating: problem statement

  • given:

    • an initial belief $P_{\text{initial}}(C)$ over concept C

    • a system action SA

    • a user response R

  • construct an updated belief:

    • $P_{\text{updated}}(C) \leftarrow f(P_{\text{initial}}(C), SA, R)$

destination = {seoul/0.65}

S: traveling to Seoul. What day did you need to travel?

[THE TRAVELING TO BERLIN P_M / 0.60]

destination = {?}
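The problem statement above can be written down as a function signature. The following is a minimal sketch, not the system's actual code: the names `Belief`, `SystemAction`, `UserResponse`, `BeliefUpdater` and `keep_initial`, and the action labels, are all hypothetical. It only fixes the inputs and output of f; the placeholder updater leaves the belief unchanged.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# A belief maps each hypothesized value of a concept to a confidence score.
Belief = Dict[str, float]


@dataclass(frozen=True)
class SystemAction:
    kind: str       # hypothetical labels, e.g. "explicit_confirm", "implicit_confirm"
    concept: str    # concept the action is about, e.g. "destination"
    value: str = ""  # value being confirmed, if any


@dataclass(frozen=True)
class UserResponse:
    hypothesis: str    # decoded (recognized) text of the user turn
    confidence: float  # recognition confidence for that turn


# f: (P_initial(C), SA, R) -> P_updated(C)
BeliefUpdater = Callable[[Belief, SystemAction, UserResponse], Belief]


def keep_initial(belief: Belief, action: SystemAction, response: UserResponse) -> Belief:
    """Placeholder f: ignores SA and R and returns a copy of the initial belief."""
    return dict(belief)


# The example from the slide, expressed with these types.
initial = {"seoul": 0.65}
action = SystemAction("implicit_confirm", "destination", "seoul")
response = UserResponse("THE TRAVELING TO BERLIN P_M", 0.60)
updated = keep_initial(initial, action, response)
```

Any real updater, heuristic or learned, would replace `keep_initial` while keeping the same signature.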


Outline
outline

  • related work

  • a restricted version

  • data

  • user response analysis

  • experiments and results

  • some caveats and future work

related work : restricted version : data : user response analysis : experiment & results : caveats & future work


Confidence annotation heuristic updates
confidence annotation + heuristic updates

  • confidence annotation

    • traditionally focused on word-level errors [Chase, Cox, Bansal, Ravishankar]

    • more recently: semantic confidence annotation [Walker, San-Segundo, Bohus]

      • machine learning approach

      • results fairly good, but not perfect

  • heuristic updates

    • explicit confirmation: no → don’t trust ; yes → trust

    • implicit confirmation: no → don’t trust ; otherwise → trust

    • suboptimal for several reasons
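The heuristic update rules listed above can be sketched as a small function. This is an illustration under stated assumptions: "trust" is mapped to confidence 1.0 and "don't trust" to 0.0, and for explicit confirmations a response that is neither yes nor no keeps the initial confidence. The slide does not specify any of these choices, and the function and label names are hypothetical.

```python
def heuristic_update(conf_initial: float, action: str, response_type: str) -> float:
    """Heuristic belief update after a confirmation action.

    action: "explicit_confirm" or "implicit_confirm" (hypothetical labels)
    response_type: "YES", "NO" or "OTHER"
    """
    if action == "explicit_confirm":
        if response_type == "YES":
            return 1.0  # yes -> trust (assumed to mean confidence 1.0)
        if response_type == "NO":
            return 0.0  # no -> don't trust (assumed to mean confidence 0.0)
        return conf_initial  # assumption: other responses leave the belief unchanged
    if action == "implicit_confirm":
        # no -> don't trust ; anything else -> trust
        return 0.0 if response_type == "NO" else 1.0
    raise ValueError(f"not a confirmation action: {action}")
```

The hard 0/1 outputs make one source of suboptimality visible: a misrecognized "no", or a user who ignores an incorrect implicit confirmation, flips the belief completely.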



Correction detection
correction detection

  • detect if the user is trying to correct the system [Litman, Swerts, Hirschberg, Krahmer, Levow]

  • machine learning approach

    • features from different knowledge sources in the system

    • results fairly good, but not perfect



Integration
integration

  • confidence annotation and correction detection are useful tools

  • but separately, neither solves the problem

  • bridge together in a unified approach to accurately track beliefs



Belief updating general form
belief updating: general form

  • given:

    • an initial belief $P_{\text{initial}}(C)$ over concept C

    • a system action SA

    • a user response R

  • construct an updated belief:

    • $P_{\text{updated}}(C) \leftarrow f(P_{\text{initial}}(C), SA, R)$



Restricted version 2 simplifications
restricted version: 2 simplifications

  • compact belief

    • system unlikely to “hear” more than 3 or 4 values

      • single vs. multiple recognition results

    • in our data: max = 3 values, only 6.9% have >1 value

    • confidence score of top hypothesis

  • updates after confirmation actions

  • reduced problem

    • $\text{ConfTop}_{\text{updated}}(C) \leftarrow f(\text{ConfTop}_{\text{initial}}(C), SA, R)$
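The first simplification, reducing a belief to the confidence score of its top hypothesis, can be sketched as follows. The function name `top_hypothesis` and the example scores are hypothetical; the belief is assumed to be a mapping from values to confidence scores.

```python
from typing import Dict, Tuple


def top_hypothesis(belief: Dict[str, float]) -> Tuple[str, float]:
    """Return the highest-scoring value and its confidence (ConfTop)."""
    if not belief:
        raise ValueError("belief has no hypotheses")
    value = max(belief, key=belief.get)
    return value, belief[value]


# Hypothetical belief with two recognized values for the same concept.
value, conf_top = top_hypothesis({"seoul": 0.65, "berlin": 0.20})
```

Under this reduction the updater only has to map one number, the initial top confidence, to an updated one.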



data

  • collected with RoomLine

    • a phone-based mixed-initiative spoken dialog system

    • conference room reservation

      • search and negotiation

  • explicit and implicit confirmations

    • confidence threshold model (+ some exploration)

  • unplanned implicit confirmations

    • S: I found 10 rooms for Friday between 1 and 3 p.m. Would you like a small room or a large one?
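A confidence threshold model with some exploration, as mentioned above, might look like the sketch below. The threshold values, the "accept" outcome for high confidence, the exploration probability and the function name are all assumptions made for illustration; the slide gives none of them.

```python
import random


def choose_confirmation(confidence, low=0.4, high=0.8, explore_prob=0.0, rng=random):
    """Pick a confirmation action from the confidence of the top hypothesis.

    low, high: hypothetical thresholds, not values from RoomLine.
    explore_prob: chance of picking a confirmation type at random instead.
    """
    if rng.random() < explore_prob:
        # exploration: ignore the thresholds for this turn
        return rng.choice(["explicit_confirm", "implicit_confirm"])
    if confidence < low:
        return "explicit_confirm"
    if confidence < high:
        return "implicit_confirm"
    return "accept"  # assumption: no confirmation needed above the high threshold
```

Exploration matters for data collection: without it, each confirmation type would only ever be observed within its own confidence band.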



Corpus
corpus

  • user study

    • 46 participants (naïve users)

    • 10 scenario-based interactions each

    • compensated per task success

  • corpus

    • 449 sessions, 8848 user turns

    • orthographically transcribed

    • rich annotation: correct concepts, corrections, etc.



User response types
user response types

  • following Krahmer and Swerts

    • study on Dutch train-table information system

  • 3 user response types

    • YES: yes, right, that’s right, correct, etc.

    • NO: no, wrong, etc.

    • OTHER

  • cross-tabulated against correctness of confirmations
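The response-type analysis above can be sketched in a few lines. This is a simplified illustration: the keyword lists contain only the words given on the slide, the rule that a NO marker takes priority over a YES marker is an assumption, and the sample data is made up.

```python
import re
from collections import Counter

YES_WORDS = {"yes", "right", "correct"}  # "that's right" is covered by "right"
NO_WORDS = {"no", "wrong"}


def response_type(transcript: str) -> str:
    """Classify a user turn as YES, NO or OTHER by keyword matching."""
    tokens = set(re.findall(r"[a-z]+", transcript.lower()))
    if tokens & NO_WORDS:  # assumption: a NO marker wins if both kinds appear
        return "NO"
    if tokens & YES_WORDS:
        return "YES"
    return "OTHER"


def cross_tabulate(samples):
    """Count (confirmation_correct, response_type) pairs.

    samples: iterable of (transcript, confirmation_correct) tuples.
    """
    return Counter((correct, response_type(text)) for text, correct in samples)


# Made-up sample: user responses to confirmations and whether the
# confirmed value was correct.
table = cross_tabulate([
    ("yes", True),
    ("that's right", True),
    ("no no I'm traveling to Birmingham", False),
    ("my destination is Birmingham", False),
])
```

The same function can be run on transcripts and on decoded (recognized) text to compare the two views of the user's response.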



User responses to explicit confirmations

user responses to explicit confirmations

  • from transcripts [numbers in brackets from Krahmer & Swerts]

  • from decoded

[tables of response type vs. confirmation correctness not recovered; slide callout: ~10%]



Other responses to explicit confirmations
other responses to explicit confirmations

  • ~70%: users repeat the correct value

  • ~15%: users don’t address the question

    • attempt to shift conversation focus



User responses to implicit confirmations
user responses to implicit confirmations

  • from transcripts [numbers in brackets from Krahmer & Swerts]

  • from decoded

[tables of response type vs. confirmation correctness not recovered]



Ignoring errors in implicit confirmations
ignoring errors in implicit confirmations

  • users correct later (40% of 118)

  • users interact strategically

    • correct only if essential



Machine learning approach
machine learning approach

  • need good probability outputs

  • low cross-entropy between model predictions and reality

    • cross-entropy = negative average log posterior

  • logistic regression

    • sample efficient

    • stepwise approach → feature selection

  • logistic model tree for each action

    • root splits on response-type
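The pieces above (probability outputs, cross-entropy as negative average log posterior, logistic regression, a root split on response type) can be sketched in plain Python. This is a toy illustration, not the models from the talk: the single feature, the training data, the learning rate and all function names are hypothetical, and stepwise feature selection is not shown.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, bias, x):
    """P(value is correct | features x) under a logistic model."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def cross_entropy(probs, labels, eps=1e-12):
    """Negative average log posterior assigned to the true labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p_true = p if y == 1 else 1.0 - p
        total += math.log(max(p_true, eps))
    return -total / len(labels)


def fit_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent on cross-entropy."""
    weights, bias, n = [0.0] * len(xs[0]), 0.0, len(xs)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, y in zip(xs, ys):
            err = predict(weights, bias, x) - y
            grad_b += err
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


def predict_with_root_split(models, response_type, x):
    """Logistic model tree sketch: the root picks a model by response type."""
    weights, bias = models[response_type]
    return predict(weights, bias, x)


# Toy data: one feature (initial confidence), label 1 if the value was correct.
xs = [[0.2], [0.3], [0.4], [0.6], [0.8], [0.9]]
ys = [0, 0, 1, 0, 1, 1]
weights, bias = fit_logistic(xs, ys)
fitted_ce = cross_entropy([predict(weights, bias, x) for x in xs], ys)
chance_ce = cross_entropy([0.5] * len(ys), ys)  # uninformed 0.5 predictor
```

Cross-entropy rewards calibrated probabilities rather than only correct 0/1 decisions, which is why it fits the requirement for good probability outputs.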



Features target
features. target.

  • initial situation

    • initial confidence score

    • concept identity, dialog state, turn number

  • system action

    • other actions performed in parallel

  • features of the user response

    • acoustic / prosodic features

    • lexical features

    • grammatical features

    • dialog-level features

  • target: was the value correct?



Baselines
baselines

  • initial baseline

    • accuracy of system beliefs before the update

  • heuristic baseline

    • accuracy of heuristic rule currently used in the system

  • oracle baseline

    • accuracy if we knew exactly when the user is correcting the system
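The three baselines can be compared on the same records with a hard error rate. The sketch below rests on several assumptions: the initial belief counts as "correct" when its confidence is at least 0.5, the heuristic is the explicit-confirmation rule with other responses falling back to the initial belief, and the oracle predicts "incorrect" exactly when the user is correcting. The records and field names are made up.

```python
def hard_error(predictions, truths):
    """Fraction of cases where the predicted correctness is wrong."""
    return sum(p != t for p, t in zip(predictions, truths)) / len(truths)


def initial_prediction(rec):
    # assumption: threshold the initial confidence at 0.5
    return rec["conf_initial"] >= 0.5


def heuristic_prediction(rec):
    # explicit-confirmation heuristic: yes -> trust, no -> don't trust
    if rec["response_type"] == "YES":
        return True
    if rec["response_type"] == "NO":
        return False
    return initial_prediction(rec)  # assumption for other responses


def oracle_prediction(rec):
    # assumption: the value is wrong exactly when the user is correcting
    return not rec["user_corrects"]


# Made-up records, one per confirmation.
records = [
    {"conf_initial": 0.65, "response_type": "NO", "user_corrects": True, "value_correct": False},
    {"conf_initial": 0.80, "response_type": "YES", "user_corrects": False, "value_correct": True},
    {"conf_initial": 0.60, "response_type": "OTHER", "user_corrects": True, "value_correct": False},
    {"conf_initial": 0.55, "response_type": "OTHER", "user_corrects": False, "value_correct": True},
]
truths = [r["value_correct"] for r in records]
errors = {
    "initial": hard_error([initial_prediction(r) for r in records], truths),
    "heuristic": hard_error([heuristic_prediction(r) for r in records], truths),
    "oracle": hard_error([oracle_prediction(r) for r in records], truths),
}
```

On this toy data the third record, a correction phrased without an explicit "no", is the case the heuristic misses and the oracle catches.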



Results explicit confirmation
results: explicit confirmation

[charts: hard error (%) and soft error; values not recovered]


Results implicit confirmation
results: implicit confirmation

[charts: hard error (%) and soft error; values not recovered]


Results unplanned implicit confirmation
results: unplanned implicit confirmation

[charts: hard error (%) and soft error; values not recovered]


Informative features
informative features

  • initial confidence score

  • prosody features

  • barge-in

  • expectation match

  • repeated grammar slots

  • concept id



Eliminate simplification 1
eliminate simplification 1

  • current restricted version

    • belief = confidence score of top hypothesis

    • only 6.9% of cases had more than 1 hypothesis

  • extend to

    • N hypotheses + 1 (other), where N is a small integer (2 or 3)

    • approach: multinomial generalized linear model

    • use information from multiple recognition hypotheses
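The proposed extension produces a distribution over N hypotheses plus an "other" class. One common way to get such a distribution from a multinomial model is a softmax over per-class linear scores; the sketch below assumes that form. The scores, the `<other>` label and the function names are hypothetical, and how the scores would be computed from features is not shown.

```python
import math

OTHER = "<other>"  # hypothetical label for "none of the N hypotheses"


def softmax(scores):
    """Turn real-valued scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multinomial_belief(hypotheses, scores):
    """Belief over N hypotheses + 1 (other).

    scores: N + 1 values, one per hypothesis followed by one for "other".
    """
    if len(scores) != len(hypotheses) + 1:
        raise ValueError("need one score per hypothesis plus one for 'other'")
    probs = softmax(scores)
    belief = dict(zip(hypotheses, probs))
    belief[OTHER] = probs[-1]
    return belief


# Hypothetical scores for two hypotheses and the "other" class.
belief = multinomial_belief(["seoul", "berlin"], [1.2, 0.4, -0.5])
```

The "other" class lets the belief express that the correct value may not have been heard at all.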



Eliminate simplification 2
eliminate simplification 2

  • current restricted version

    • only updates following system confirmation actions

  • users might correct the system at any point

  • extend to

    • updates after all system actions



Shameless self promotion

shameless self promotion

  • misunderstandings / non-understandings: detection, strategies, policy

    • rejection threshold adaptation; nonu impact on performance [Interspeech-05]

    • comparative analysis of 10 recovery strategies [SIGdial-05]

    • wizard experiment; towards learning nonu recovery policies [SIGdial-05]


Shameless cmu promotion
shameless CMU promotion

  • Ananlada (Moss) Chotimongkol

    • automatic concept and task structure acquisition

  • Antoine Raux

    • turn-taking, conversation micro-management

  • Jahanzeb Sherwani

    • multimodal personal information management

  • Satanjeev Banerjee

    • meeting understanding

  • Stefanie Tomko

    • universal speech interface

  • Thomas Harris

    • multi-participant dialog

  • DoD / Young Researchers’ Roundtable



A more subtle caveat
a more subtle caveat

  • distribution of training data

    • confidence annotator + heuristic update rules

  • distribution of run-time data

    • confidence annotator + learned model

  • always a problem when interacting with the world

  • hopefully, distribution shift will not cause large degradation in performance

    • remains to validate empirically

    • maybe a bootstrap approach?

