Presentation Transcript


  1. Outline of Talk • A brief historical perspective • Cognitive User Interfaces • Statistical Dialogue Modelling • Scaling to the Real World • System Architecture • Some Examples and Results • Conclusions and future work.

  2. Why Talk to Machines? • it should be an easy and efficient way of finding out information and controlling behaviour • sometimes it is the only way • hands-busy, e.g. surgeon, driver, package handler, etc. • no internet and no call centres, e.g. areas of the third world • very small devices • one day it might be fun, c.f. Project Natal - Milo

  3. VODIS - circa 1985 • Natural language / mixed-initiative train-timetable inquiry service • 150-word DTW connected speech recognition • 8 x 8086 processors + PDP11/45 with 128k memory and 2 x 5Mb disk • [Block diagram: Logos speech recogniser, frame-based dialogue manager and DecTalk synthesiser, connected by words, recognition grammars and text] • Collaboration between BT, Logica and Cambridge U. (Demo)

  4. Some desirable properties of a Spoken Dialogue System • able to support reasoning and inference • interpret noisy inputs and resolve ambiguities in context • able to plan under uncertainty • clearly defined communicative goals • performance quantified as rewards • plans optimized to maximize rewards • able to adapt on-line • robust to speaker (accent, vocabulary, behaviour, ...) • robust to environment (noise, location, ...) • able to learn from experience • progressively optimize models and plans over time. Together these define a Cognitive User Interface. S. Young (2010). "Cognitive User Interfaces." Signal Processing Magazine 27(3)

  5. Essential Ingredients of a Cognitive User Interface (CUI) • Explicit representation of uncertainty using a probability model over dialogue states e.g. using Bayesian networks • Inputs regarded as observations used to update the posterior state probabilities via inference • Responses defined by plans which map internal states to actions • The system’s design objectives defined by rewards associated with specific state/action pairs • Plans optimized via reinforcement learning • Model parameters estimated via supervised learning and/or optimized via reinforcement learning Partially Observable Markov Decision Process (POMDP)
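A minimal sketch of the belief-update machinery such a CUI relies on, assuming a tiny hand-specified transition and observation model; the state names, probabilities and observation labels are invented for illustration and are not taken from the talk.

```python
import numpy as np

# A toy POMDP-style belief update over two hidden dialogue states.
STATES = ["wants_bar", "wants_restaurant"]

# P(s'|s) under the last system action: users rarely change their goal.
TRANSITION = np.array([[0.95, 0.05],
                       [0.05, 0.95]])

# P(o|s'): likelihood of each (noisy) observation given the true goal.
OBSERVATION = {"heard_bar":        np.array([0.8, 0.2]),
               "heard_restaurant": np.array([0.2, 0.8])}

def belief_update(belief, observation):
    """Exact Bayes update: b'(s') is proportional to P(o|s') * sum_s P(s'|s) b(s)."""
    predicted = TRANSITION.T @ belief               # prediction step
    updated = OBSERVATION[observation] * predicted  # correction step
    return updated / updated.sum()                  # renormalise

b = np.array([0.5, 0.5])                 # uniform prior over the user's goal
b = belief_update(b, "heard_bar")        # a noisy "bar" observation arrives
print(dict(zip(STATES, b.round(3))))     # belief shifts towards wants_bar
```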

  6. A Framework for Statistical Dialogue Management [Block diagram: the user's speech is passed through recognition/understanding to give observation ot; a dialogue model with distribution parameters λ maintains a belief over the distribution of dialogue states st, bt = P(st|ot-1,bt-1; λ); a policy π(at|bt,θ) with policy parameters θ selects action at, which is rendered by response generation; a reward function assigns r(bt,at), and the total return is R = Σt r(bt,at)]
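As a rough illustration of the loop in that diagram, the sketch below strings together a toy belief, a softmax policy π(a|b;θ) and a made-up reward; the action set, reward values and the stand-in belief update are all assumptions, not the Cambridge system's components.

```python
import numpy as np

rng = np.random.default_rng(0)
ACTIONS = ["confirm", "inform", "bye"]

def policy(belief, theta):
    """Stochastic policy pi(a|b; theta): softmax over linear scores of the belief."""
    scores = theta @ belief
    probs = np.exp(scores - scores.max())
    return probs / probs.sum()

def reward(belief, action):
    """Toy reward: -1 per turn, +/-20 for ending the dialogue well or badly."""
    if action == "bye":
        return 20.0 if belief.max() > 0.8 else -20.0
    return -1.0

theta = rng.normal(size=(len(ACTIONS), 2))   # policy parameters (untrained)
belief = np.array([0.5, 0.5])                # belief over two toy user goals
R = 0.0
for turn in range(10):
    probs = policy(belief, theta)
    action = rng.choice(ACTIONS, p=probs)
    R += reward(belief, action)              # accumulate R = sum_t r(b_t, a_t)
    if action == "bye":
        break
    # stand-in for a real belief update from the next (noisy) user observation
    belief = np.array([0.9, 0.1]) if rng.random() < 0.7 else np.array([0.6, 0.4])
print("return R =", R)
```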

  7. Belief Tracking aka Belief Monitoring Belief is updated following each new user input, b'(s') ∝ P(o|s') Σs P(s'|s,a) b(s). However, the state space is huge and this equation is intractable for practical systems, so we approximate: • Hidden Information State system (HIS): track just the N most likely states • Graphical Model system (GMS, aka BUDS): factorise the state space and ignore all but the major conditional dependencies. S. Young (2010). "The Hidden Information State Model." Computer Speech and Language 24(2) B. Thomson (2010). "Bayesian update of dialogue state." Computer Speech and Language 24(4)
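A short sketch of the HIS-style approximation, assuming the belief is held as an explicit mapping from state hypotheses to probabilities; the pruning size and the hypothesis encoding are illustrative.

```python
from heapq import nlargest

def prune_beliefs(hypotheses, n_best=10):
    """Keep only the N most likely dialogue-state hypotheses and renormalise.

    hypotheses: dict mapping a hashable state hypothesis -> unnormalised probability.
    """
    top = nlargest(n_best, hypotheses.items(), key=lambda kv: kv[1])
    total = sum(p for _, p in top)
    return {state: p / total for state, p in top}

# Toy example: partitions of the user goal with unnormalised probabilities.
beliefs = {("food=french", "type=bar"): 0.40,
           ("food=french", "type=restaurant"): 0.35,
           ("food=chinese", "type=bar"): 0.02,
           ("food=none", "type=restaurant"): 0.01}
print(prune_beliefs(beliefs, n_best=2))
```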

  8. Dialogue State • Tourist Information Domain • type = bar, restaurant • food = French, Chinese, none [Bayesian network for the observation at time t: goal nodes gtype, gfood; user-act nodes utype, ufood, linked to the goal by a model of user behaviour; observation nodes otype, ofood, linked to the user act by a model of recognition/understanding errors; history/memory nodes htype, hfood; all connected forward to the next time slice t+1] J. Williams (2007). "POMDPs for Spoken Dialog Systems." Computer Speech and Language 21(2)
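One plausible way to hold such a factored state in code, with a small distribution per goal, user-act and history node for each slot; the slot names and values follow the slide, but the history values and the class itself are illustrative assumptions, not the HIS or BUDS representation.

```python
# Illustrative container for a factored dialogue state, mirroring the slide's
# gtype/gfood (goal), utype/ufood (user act) and htype/hfood (history) nodes.
SLOTS = {"type": ["bar", "restaurant"],
         "food": ["French", "Chinese", "none"]}
HISTORY_VALUES = ["not-mentioned", "mentioned"]   # assumed values, not from the talk

def uniform(values):
    return {v: 1.0 / len(values) for v in values}

class FactoredDialogueState:
    def __init__(self):
        self.goal     = {s: uniform(v) for s, v in SLOTS.items()}        # g-nodes
        self.user_act = {s: uniform(v) for s, v in SLOTS.items()}        # u-nodes
        self.history  = {s: uniform(HISTORY_VALUES) for s in SLOTS}      # h-nodes

state = FactoredDialogueState()
print(state.goal["food"])   # uniform prior over French / Chinese / none
```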

  9. Dialogue Model Parameters (ignoring history nodes for simplicity) [Two-slice dynamic Bayesian network: goal nodes gtype, gfood, user-act nodes utype, ufood and observation nodes otype, ofood, repeated at time t and time t+1, with the model parameters attached to the conditional distributions linking them]

  10. Belief Monitoring (Tracking) [Worked example: at t=1 the decoded user input is inform(food=french) {0.9}; the system responds confirm(food=french), and at t=2 the user input is affirm() {0.9}. Bar charts over gfood (F, C, -) and gtype (B, R) show the belief in food=French sharpening after each turn]
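A hedged sketch of how a confidence-scored act such as inform(food=french) {0.9} might update a single slot's belief. The evidence model used here (trust each hypothesis with weight equal to its confidence and spread the leftover mass uniformly) is a simplification for illustration, not the HIS or BUDS update rule.

```python
def update_slot_belief(belief, hypotheses):
    """Update a per-slot belief from an N-best list of (value, confidence) pairs."""
    leftover = 1.0 - sum(conf for _, conf in hypotheses)
    evidence = {v: leftover / len(belief) for v in belief}   # "nothing useful said"
    for value, conf in hypotheses:
        evidence[value] += conf
    posterior = {v: belief[v] * evidence[v] for v in belief}
    norm = sum(posterior.values())
    return {v: p / norm for v, p in posterior.items()}

# t=1: inform(food=french) with confidence 0.9, as in the slide's example.
food_belief = {"French": 1/3, "Chinese": 1/3, "none": 1/3}
food_belief = update_slot_belief(food_belief, [("French", 0.9)])
print({v: round(p, 2) for v, p in food_belief.items()})  # French now dominates
```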

  11. Belief Monitoring (Tracking) [Second worked example, with a 2-best input: at t=1 the decoded inputs are inform(type=bar, food=french) {0.6} and inform(type=restaurant, food=french) {0.3}; the system responds confirm(type=restaurant, food=french), and at t=2 the user replies affirm() {0.9}. Bar charts show the resulting beliefs over gfood (F, C, -) and gtype (B, R)]

  12. Belief Monitoring (Tracking) [Third worked example: at t=1 the decoded input is inform(type=bar) {0.4}; the system responds select(type=bar, type=restaurant), and at t=2 the input is again inform(type=bar) {0.4}. Bar charts show the belief over gtype (B, R) accumulating evidence for bar from the two low-confidence inputs]

  13. Choosing the next action – the Policy [Illustration: the full beliefs over gtype (B, R) and gfood (F, C, -) are quantized into a sparse summary belief vector; the policy maps this summary state to a distribution over all possible summary actions (inform, select, confirm, etc.); a summary action is sampled, here a = select, and it is then mapped back into the full action select(type=bar, type=restaurant) given the last input inform(type=bar) {0.4}]
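A minimal sketch of that summary-space idea: quantize the full belief into a coarse feature vector, pick a summary action with a softmax policy, then map it back to a full action using the most likely values. The quantization bins, features and mapping rules are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
SUMMARY_ACTIONS = ["inform", "confirm", "select"]

def summarise(belief):
    """Quantize a full per-slot belief into a coarse summary feature vector:
    a bias term plus 3-level bins of the top probability and the top-2 margin."""
    probs = sorted(belief.values(), reverse=True)
    top, second = probs[0], probs[1]
    return np.array([1.0,
                     float(np.digitize(top, [0.5, 0.8])),
                     float(np.digitize(top - second, [0.1, 0.3]))])

def choose_summary_action(features, theta):
    """Softmax policy over the small set of summary actions."""
    scores = theta @ features
    probs = np.exp(scores - scores.max())
    return rng.choice(SUMMARY_ACTIONS, p=probs / probs.sum())

def to_full_action(summary_action, belief):
    """Map the chosen summary action back to the full space via the top values."""
    ranked = sorted(belief, key=belief.get, reverse=True)
    if summary_action == "select":
        return f"select(type={ranked[0]}, type={ranked[1]})"
    return f"{summary_action}(type={ranked[0]})"

type_belief = {"bar": 0.55, "restaurant": 0.45}
theta = rng.normal(size=(len(SUMMARY_ACTIONS), 3))   # untrained policy parameters
a = choose_summary_action(summarise(type_belief), theta)
print(to_full_action(a, type_belief))    # e.g. select(type=bar, type=restaurant)
```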

  14. Policy Optimization Policy parameters θ are chosen to maximize the expected reward J(θ) = E[ Σt r(bt,at) ]. Natural gradient ascent works well: θ is updated in the direction F(θ)^-1 ∇θJ(θ), where F(θ) is the Fisher Information Matrix. The gradient is estimated by sampling dialogues, and in practice the Fisher Information Matrix does not need to be explicitly computed. This is the Natural Actor-Critic algorithm. J. Peters and S. Schaal (2008). "Natural Actor-Critic." Neurocomputing 71(7-9)
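A compact sketch of the episodic natural actor-critic trick on a toy one-step problem: regress episode returns on the summed score functions ∇θ log π, and the regression weights on those features are the natural gradient, so the Fisher matrix never has to be formed explicitly. The environment, features and learning rate are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N_ACTIONS, N_FEATURES = 3, 2

def softmax_policy(theta, x):
    scores = theta @ x
    p = np.exp(scores - scores.max())
    return p / p.sum()

def grad_log_pi(theta, x, a, probs):
    """d/dtheta log pi(a|x) for a linear softmax policy, flattened."""
    g = -np.outer(probs, x)
    g[a] += x
    return g.ravel()

def run_episode(theta):
    """Toy one-step 'dialogue': random state x, one action, noisy reward favouring action 0."""
    x = rng.normal(size=N_FEATURES)
    probs = softmax_policy(theta, x)
    a = rng.choice(N_ACTIONS, p=probs)
    r = (1.0 if a == 0 else 0.0) + 0.1 * rng.normal()
    return grad_log_pi(theta, x, a, probs), r

theta = np.zeros((N_ACTIONS, N_FEATURES))
for it in range(50):
    # Collect a batch of sampled episodes: psi_i = sum_t grad log pi, R_i = return.
    psi, R = zip(*(run_episode(theta) for _ in range(100)))
    Phi = np.column_stack([np.array(psi), np.ones(len(R))])   # add baseline column
    w, *_ = np.linalg.lstsq(Phi, np.array(R), rcond=None)     # least-squares critic
    natural_grad = w[:-1]                                     # weights = natural gradient
    theta += 0.5 * natural_grad.reshape(theta.shape)
print("P(action 0) at x=[1,0]:", softmax_policy(theta, np.array([1.0, 0.0]))[0])
```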

  15. Dialogue Model Parameter Optimization Approximating the belief distribution via feature vectors prevents differentiating the policy w.r.t. the Dialogue Model parameters λ. However, a trick can be used: assume that the parameters λ are drawn from a prior p(λ; α) which is differentiable w.r.t. α, then optimize the reward w.r.t. α and sample p(λ; α) to get λ. This is the Natural Belief-Critic algorithm. It is also possible to do maximum likelihood model parameter estimation using Expectation Propagation. F. Jurcicek (2010). "Natural Belief-Critic." Interspeech 2010. B. Thomson (2010). "Parameter learning for POMDP spoken dialogue models." SLT 2010
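The sketch below illustrates the sample-from-a-prior trick in its simplest form: Dirichlet hyperparameters α are adjusted with a plain score-function (REINFORCE-style) gradient rather than the natural gradient used by the actual Natural Belief-Critic algorithm, and the reward function is a made-up stand-in for running simulated dialogues.

```python
import numpy as np
from scipy.special import digamma

rng = np.random.default_rng(0)

def simulated_reward(lam):
    """Stand-in for 'run some dialogues with model parameters lam and average
    the reward'; the (made-up) best model puts most mass on the first value."""
    target = np.array([0.7, 0.2, 0.1])
    return -np.sum((lam - target) ** 2) + 0.05 * rng.normal()

def grad_log_dirichlet(lam, alpha):
    """d/dalpha of log Dir(lam; alpha), i.e. the score function of the prior."""
    return np.log(lam) - digamma(alpha) + digamma(alpha.sum())

alpha = np.full(3, 5.0)        # prior hyperparameters for one CPT row
baseline = 0.0                 # running reward baseline to reduce variance
for it in range(500):
    lam = np.maximum(rng.dirichlet(alpha), 1e-6)   # sample model parameters
    R = simulated_reward(lam)                      # evaluate them in simulation
    alpha += 0.5 * (R - baseline) * grad_log_dirichlet(lam, alpha)
    alpha = np.maximum(alpha, 0.1)                 # keep the prior valid
    baseline += 0.05 * (R - baseline)
print("E[lambda] under the learned prior:", (alpha / alpha.sum()).round(2))
```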

  16. Performance Comparison in Simulated TownInfo Domain [Plot comparing four configurations: handcrafted model + handcrafted policy, trained model + trained policy, handcrafted model + trained policy, and handcrafted policy + trained model. Reward = 100 for success, minus 1 for each turn taken]

  17. Scaling up to Real World Problems Several of the key ideas have already been covered: • compact representation of dialogue state, e.g. HIS, BUDS • mapping belief states into summary states via quantisation, feature vectors, etc. • mapping actions in summary space back into full space. But inference itself is also a problem …

  18. CamInfo Ontology • Many concepts • Many values per concept • Multiple nodes per concept, giving a complex dialogue state

  19. Belief Propagation Times [Plot: belief update time versus network branching factor for standard loopy belief propagation (LBP), LBP with grouping, and LBP with grouping and a constant probability of change] B. Thomson (2010). "Bayesian update of dialogue state." Computer Speech and Language 24(4)

  20. Architecture of the Cambridge Statistical SDS (Run-time mode) [Block diagram: user speech y → speech recognition p(w|y) → words → semantic decoder p(v|y) → dialogue acts → dialogue manager (HIS or BUDS) → action a → message generator p(m|a) → words → speech synthesiser p(x|a) → speech back to the user. The component models are built from corpus data]

  21. Architecture of the Cambridge Statistical SDS (Training mode) [Block diagram: the dialogue manager (HIS or BUDS) exchanges dialogue acts with a user simulator; the simulator's acts pass through an error model p(v|y) before reaching the manager, which replies with action a. The simulator and error model are built from corpus data]
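A skeletal sketch of that training loop, with stub simulator, error model and dialogue manager standing in for the real HIS/BUDS components; every function, act string and reward value here is a placeholder chosen only to show the data flow.

```python
import random
random.seed(0)

def user_simulator(system_act):
    """Stub user: answers a confirm, otherwise states a goal."""
    return "affirm()" if system_act.startswith("confirm") else "inform(food=french)"

def error_model(user_act, error_rate=0.3):
    """Stub ASR/SLU error channel: corrupt the act with some probability."""
    if random.random() < error_rate:
        return "inform(food=chinese)", 0.4   # confused act, low confidence
    return user_act, 0.9

def dialogue_manager(observed_act, confidence):
    """Stub manager: confirm low-confidence input, otherwise act on it."""
    return "confirm(food)" if confidence < 0.5 else "inform(venue)"

def run_training_dialogue(max_turns=5):
    system_act, reward = "hello()", 0
    for _ in range(max_turns):
        user_act = user_simulator(system_act)
        observed, conf = error_model(user_act)
        system_act = dialogue_manager(observed, conf)
        reward -= 1                       # per-turn cost
        if system_act == "inform(venue)":
            return reward + 20            # success reward
    return reward

rewards = [run_training_dialogue() for _ in range(1000)]
print("average simulated reward:", sum(rewards) / len(rewards))
```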

  22. CMU Let’s Go Spoken Dialogue Challenge • Telephone-based spoken dialog system to provide bus schedule information for the City of Pittsburgh, PA (USA) • Based on an existing system with real users • Two-stage evaluation process • Control Test with recruited subjects given specific known tasks • Live Test with competing implementations switched according to a daily schedule • Full results to be presented at a special session at SLT. Organised by the Dialog Research Center, CMU. See http://www.dialrc.org/sdc/

  23. Let’s Go 2010 Control Test Results [Plot of predicted success rate against word error rate (WER) for all qualifying systems: System X: 65% success, 42% WER; System Y: 75% success, 34% WER; System Z: 89% success, 33% WER. Average success = 64.8%, average WER = 42.4%] B. Thomson (2010). "Bayesian Update of State for the Let's Go Spoken Dialogue Challenge." SLT 2010

  24. CamInfo Demo

  25. Conclusions • End-to-end statistical dialogue systems can be built and are competitive • The core is a POMDP-based dialogue manager which provides an explicit representation of uncertainty, with the following benefits • robust to recognition errors • objective measure of goodness via a reward function • ability to optimize performance against objectives • reduced development costs – no hand-tuning, no complex design processes, easily ported to new applications • natural dialogue – say anything, any time • Still much to do • faster learning, off-policy learning, long-term adaptation, dynamic ontologies, multi-modal input/output • Perhaps talking to machines is within reach …

  26. Credits • EU FP7 Project: Computational Learning in Adaptive Systems for Spoken Conversation • Spoken Dialogue Management using Partially Observable Markov Decision Processes • Past and present members of the CUED Dialogue Systems Group: Milica Gasic, Filip Jurcicek, Simon Keizer, Fabrice Lefevre, Francois Mairesse, Jorge Prombonas, Jost Schatzmann, Matt Stuttle, Blaise Thomson, Karl Weilhammer, Jason Williams, Hui Ye, Kai Yu
