sorry, I didn’t catch that! – an investigation of non-understandings and recovery strategies. Dan Bohus www.cs.cmu.edu/~dbohus Alexander I. Rudnicky www.cs.cmu.edu/~air Computer Science Department Carnegie Mellon University Pittsburgh, PA, 15213.


sorry, I didn’t catch that! – an investigation of non-understandings and recovery strategies

Dan Bohus www.cs.cmu.edu/~dbohus

Alexander I. Rudnicky www.cs.cmu.edu/~air

Computer Science Department

Carnegie Mellon University

Pittsburgh, PA, 15213


Systems often do not understand correctly

  • non-understandings and misunderstandings

  • MIS-understanding: system extracts incorrect information from the user’s turn

S: What city are you leaving from?

U: Birmingham [BERLIN PM]

  • NON-understanding: system cannot extract any meaningful information from the user’s turn

S: What city are you leaving from?

U: Urbana Champaign [OKAY IN THAT SAME PAY]


Systems often do not understand correctly

  • NON-understanding: system cannot extract any meaningful information from the user’s turn

S: What city are you leaving from?

U: Urbana Champaign [OKAY IN THAT SAME PAY]

  • detection

    • typically trivial; although diagnosis is not

  • strategies

    • large space of strategies

    • tradeoffs between them not well understood

  • policy (knowing how to engage the strategies)

    • simple heuristics: “incremental prompting”


Questions under investigation

  • data

  • what are the main causes of non-understandings?

  • how large is their impact on performance?

  • how do various recovery strategies compare to each other?

  • what are the relationships between strategies and user behaviors?

  • can we improve global dialog performance by using a smarter policy?

  • if yes, can we learn a better policy from data?


Data collection

  • Roomline

    • phone-based, mixed-initiative system

    • conference room reservations

  • experimental design

    • control group: uninformed recovery policy

    • wizard group: recovery policy implemented by wizard

  • 46 participants, first-time users

  • tasks & experimental procedure

    • up to 10 scenario-driven interactions


Non-understanding recovery strategies

S: For when do you need the conference room?

1. ASK REPEAT

Could you please repeat that?

2. ASK REPHRASE

Could you please try to rephrase that?

3. NOTIFY (NTFY)

Sorry, I didn’t catch that ...

4. YIELD TURN (YLD)

5. REPROMPT (RP)

For when do you need the conference room?

6. DETAILED REPROMPT (DRP)

Right now I need to know the date and time for when you need the reservation …

7. MOVE-ON

Sorry, I didn’t catch that. For which day you need the room?

8. YOU CAN SAY (YCS)

Sorry, I didn’t catch that. For when do you need the conference room? You can say something like tomorrow at 10 am …

9. TERSE YOU CAN SAY (TYCS)

Sorry, I didn’t catch that. You can say something like tomorrow at 10 am …

10. FULL HELP (HELP)

Sorry, I didn’t catch that. I am currently trying to make a conference room reservation for you. Right now I need to know the date and time for when you need the reservation. You can say something like tomorrow at 10 am …
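
The ten strategies above can be read as a lookup from strategy name to system prompt. The following is a minimal sketch of that idea, not the Roomline implementation: the dictionary layout, the key names, and the helper `recovery_prompt` are hypothetical, and only the prompt wording is taken from the slide. YIELD TURN is given an empty prompt on the assumption that the slide lists no prompt text for it.

```python
# Illustrative encoding of the ten strategies; structure and names are
# hypothetical, prompt wording follows the slide.
QUESTION = "For when do you need the conference room?"
SORRY = "Sorry, I didn't catch that."
DETAIL = ("Right now I need to know the date and time "
          "for when you need the reservation")
EXAMPLE = "You can say something like tomorrow at 10 am ..."

RECOVERY_PROMPTS = {
    "ASK_REPEAT": "Could you please repeat that?",
    "ASK_REPHRASE": "Could you please try to rephrase that?",
    "NTFY": "Sorry, I didn't catch that ...",
    "YLD": "",  # assumption: no prompt text is shown for YIELD TURN
    "RP": QUESTION,
    "DRP": DETAIL + " ...",
    "MOVE_ON": SORRY + " For which day you need the room?",
    "YCS": " ".join([SORRY, QUESTION, EXAMPLE]),
    "TYCS": " ".join([SORRY, EXAMPLE]),
    "HELP": " ".join([
        SORRY,
        "I am currently trying to make a conference room reservation for you.",
        DETAIL + ".",
        EXAMPLE,
    ]),
}


def recovery_prompt(strategy: str) -> str:
    """Return the system prompt for a strategy key (raises KeyError if unknown)."""
    return RECOVERY_PROMPTS[strategy]


if __name__ == "__main__":
    for name in RECOVERY_PROMPTS:
        print(f"{name:13s} {recovery_prompt(name)!r}")
```

Composing the longer prompts from shared pieces makes the relationship between the strategies visible: TYCS is YCS without the repeated question, and HELP adds the task description and the detailed reprompt.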


Corpus statistics

  • 449 sessions

  • 8278 user turns

  • utterances transcribed and checked

  • manual annotations

    • misunderstandings

    • correct concept values at each turn

    • sources of understanding errors

    • user response-types to recovery strategies




Causes of non-understandings

[Figure: diagram relating user and system across four levels (conversation level, intention level, signal level, channel level); labels: Goal, Interpretation, Semantics, Parsing, Text, Recognition, Audio, channel, End-pointing]


Causes of non-understandings

  • conversation level: out-of-application (16%)

  • intention level: out-of-grammar (16%)

  • signal level: ASR error (62%)

  • channel level: endpointer error




Modeling impact on performance

  • logistic regression

    • $P(\text{Task Success}) = \dfrac{1}{1 + e^{-(\alpha + \beta \cdot \mathrm{FNON})}}$
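
As a rough illustration of the logistic model above, the sketch below evaluates P(Task Success) for a few values of FNON. The slide does not define FNON; it is assumed here to be the fraction of user turns in a session that are non-understandings. The coefficient values are made up for the example and are not results from the study.

```python
import math


def p_task_success(f_non: float, alpha: float, beta: float) -> float:
    """Logistic model: 1 / (1 + exp(-(alpha + beta * f_non)))."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * f_non)))


# Hypothetical coefficients, chosen only to show the shape of the curve.
ALPHA, BETA = 2.0, -5.0

if __name__ == "__main__":
    for f_non in (0.0, 0.2, 0.4, 0.6):
        p = p_task_success(f_non, ALPHA, BETA)
        print(f"FNON={f_non:.1f}  P(Task Success)={p:.3f}")
```

With a negative β, the predicted probability of task success falls as the share of non-understandings rises, which is the kind of relationship the regression is meant to quantify.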




Strategy performance – recovery rate

  • overall logistic ANOVA

    • significant differences in mean recovery rates

[Figure: recovery rate by strategy; strategies shown: Help, Yield, Notify, MoveOn, RePrompt, AskRepeat, YouCanSay, AskRephrase, TerseYouCanSay, DetailedReprompt]

  • all pairs comparison (corrected using FDR)
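
Comparing all pairs of ten strategies means 45 tests, so some correction for multiple comparisons is needed. The slide says only "FDR" and does not name the procedure; the sketch below assumes the Benjamini-Hochberg step-up procedure, which is a common choice. The function name and the example p-values are illustrative, not data from the study.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, aligned with p_values, that is True
    where the corresponding hypothesis is rejected at FDR level q.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest rank k (1-based) such that p_(k) <= (k / m) * q.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_k = rank

    # Reject every hypothesis whose rank is at most max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Made-up p-values for four pairwise comparisons.
    print(benjamini_hochberg([0.04, 0.01, 0.03, 0.5]))
```

Because the procedure is step-up, a p-value that fails its own threshold is still rejected when a larger p-value further down the sorted list passes.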




User response types

  • tagging scheme by Shin

    • also used by Choularton, Raux

  • 5 categories

    • repeat

    • rephrase

    • contradict

    • change

    • other
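
Given turns tagged with the five categories above, the distribution of response types per recovery strategy is a simple tally. The sketch below assumes the annotations are available as (strategy, response type) pairs; that input format, the function name, and the toy data are hypothetical.

```python
from collections import Counter, defaultdict

RESPONSE_TYPES = ("repeat", "rephrase", "contradict", "change", "other")


def response_distribution(tagged_turns):
    """Compute per-strategy response-type fractions.

    tagged_turns: iterable of (strategy, response_type) pairs.
    Returns {strategy: {response_type: fraction}} with all five
    categories present for every strategy seen.
    """
    counts = defaultdict(Counter)
    for strategy, response in tagged_turns:
        if response not in RESPONSE_TYPES:
            raise ValueError(f"unknown response type: {response!r}")
        counts[strategy][response] += 1

    distribution = {}
    for strategy, counter in counts.items():
        total = sum(counter.values())
        distribution[strategy] = {r: counter[r] / total for r in RESPONSE_TYPES}
    return distribution


if __name__ == "__main__":
    # Toy annotations, not data from the corpus.
    toy = [
        ("AskRepeat", "repeat"),
        ("AskRepeat", "repeat"),
        ("AskRepeat", "rephrase"),
        ("Help", "other"),
    ]
    for strategy, dist in response_distribution(toy).items():
        print(strategy, dist)
```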


Response types after non-understanding

[Figure: distribution of user response types (repeat, rephrase, contradict, change, other) after non-understandings, on a 0% to 50% scale, for three systems: Communicator (Shin et al.), Pizza (Choularton & Dale), Roomline (this study)]


User response types by strategy

[Figure: share of user response types (Repeat, Rephrase, Change, Other), on a 0% to 100% scale, for each strategy: Help, Yield, Notify, MoveOn, RePrompt, AskRepeat, YouCanSay, AskRephrase, TerseYouCanSay, DetailedReprompt]


Summary

  • sources of non-understandings

    • ASR, but also “language” errors → more shaping strategies …

  • impact on performance

    • regression model allows better quantitative assessment

  • strategy comparison

    • help, “move-on” → further investigate “move-on”

  • user responses

    • margin for improving control over user responses

  • can we improve global dialog performance by using a smarter policy?

    • yes

  • can we learn a better policy from data?

    • preliminary results promising …



Rejections

[Figure 3. Misunderstandings and non-understandings before and after rejections. Panels: before rejection mechanism, after rejection mechanism; labels: false rejections, correct rejections]


Strategy performance assessment

  • recovery rate

  • recovery utility

    • weighted sum of correctly and incorrectly acquired concepts

    • weights are determined in a data-driven fashion

  • recovery efficiency

    • also takes time to recovery into account
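
The recovery utility described above is a weighted sum over correctly and incorrectly acquired concepts. A minimal sketch follows; the weight values are placeholders, since the slide states only that the weights are determined in a data-driven fashion and gives no numbers. No formula for recovery efficiency is given on the slide, so it is not sketched here.

```python
def recovery_utility(n_correct: int, n_incorrect: int,
                     w_correct: float, w_incorrect: float) -> float:
    """Weighted sum of correctly and incorrectly acquired concepts."""
    return w_correct * n_correct + w_incorrect * n_incorrect


if __name__ == "__main__":
    # Placeholder weights: reward a correct concept, penalize an incorrect one.
    W_CORRECT, W_INCORRECT = 1.0, -2.0
    print(recovery_utility(2, 0, W_CORRECT, W_INCORRECT))
    print(recovery_utility(2, 1, W_CORRECT, W_INCORRECT))
```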


Experimental design: scenarios

  • 10 scenarios, fixed order

  • presented graphically (explained during briefing)


Strategy pair-wise comparison

  • recovery performance ranked list, based on pair-wise t-tests:

  • CER evaluation shows similar results



Impact of recovery rate on performance

  • recovery = next turn is correctly understood

    • $P(\text{Task Success}) = \dfrac{1}{1 + e^{-(\alpha + \beta \cdot \mathrm{RecoveryRate})}}$
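
Using the definition above (recovery = next turn is correctly understood), a per-session recovery rate can be computed from turn-level labels. This is a sketch under assumptions: the label names ('correct', 'mis', 'non') and the function name are hypothetical, and a non-understanding on the final turn is excluded because it has no following turn to judge.

```python
def session_recovery_rate(turn_labels):
    """Fraction of non-understandings followed by a correctly understood turn.

    turn_labels: list of per-turn labels for one session, each one of
    'correct', 'mis', or 'non' (hypothetical label names).
    Returns None when no non-understanding has a following turn.
    """
    # Label of the turn immediately after each non-understanding.
    followers = [
        nxt
        for cur, nxt in zip(turn_labels, turn_labels[1:])
        if cur == "non"
    ]
    if not followers:
        return None
    return sum(nxt == "correct" for nxt in followers) / len(followers)


if __name__ == "__main__":
    # Toy session, not data from the corpus.
    session = ["correct", "non", "correct", "non", "non", "correct"]
    print(session_recovery_rate(session))
```

The resulting value is the kind of quantity that would be plugged in as RecoveryRate in the logistic model, one value per session.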