
  1. Machine Learning and the Information State Update Approach to Dialog Management Oliver Lemon and James Henderson University of Edinburgh

  2. TALK project Overall Coordination: Scientific Coordination:

  3. a TALK research theme: Adaptivity and learning • Reinforcement Learning for dialog management • Complex dialog states from the Information State Update (ISU) approach

  4. Initial objectives • Which aspects of dialog management are amenable to learning and what reward functions are needed for these aspects? • What representation of the dialog state best serves this learning? • What Reinforcement Learning methods are tractable with large scale dialog systems?

  5. Information State Update approach with policy learning

  6. Example information state
  output: < hello, welcome to the edinburgh informatics flight booking system. what is your departure city? >
  lastspeaker: user
  recogninput: < edinburgh >
  input: < edinburgh >
  lastmoves: < [edinburgh],u, [([greet],s),([ask_user_start_city],s)] >
  filledslotsvalues: < [([ask_user_start_city],s)],[[edinburgh]] >
  turn: system
  oplansteps: ( [ask_user_destination_city] , [release_turn] )
  nextmoves: < [ask_user_destination_city],s >
  int: < [release_turn] >
  . . .
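The record above can be sketched as a simple Python mapping. This is purely illustrative (the field names follow the slide, but the values are simplified stand-ins, and the `slot_filled` helper is hypothetical, not part of DIPPER):

```python
# Illustrative sketch of an ISU information state as a nested dict.
# Field names follow the slide; values are simplified stand-ins.
info_state = {
    "output": "hello, welcome to the edinburgh informatics flight "
              "booking system. what is your departure city?",
    "lastspeaker": "user",
    "recogninput": ["edinburgh"],
    "input": ["edinburgh"],
    "filledslotsvalues": {"ask_user_start_city": ["edinburgh"]},
    "turn": "system",
    "oplansteps": ["ask_user_destination_city", "release_turn"],
    "nextmoves": [("ask_user_destination_city", "s")],
    "int": ["release_turn"],
}

def slot_filled(state, slot):
    """Check whether a slot has been filled in the information state."""
    return slot in state["filledslotsvalues"]

print(slot_filled(info_state, "ask_user_start_city"))  # True
```

Update rules (next slide) read and modify such a state; the rich structure is exactly what makes the state space large from a learning perspective.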

  7. Baseline system • DIPPER [Bos, Klein, Lemon and Oka 2003] for state specification and dialog management • ATK [Young 2004] for speech recognition • Festival [Taylor, Black and Caley 1998] for speech synthesis • O-Plan [Currie and Tate 1991] for dialog planning and content planning and structure

  8. Example DIPPER update rule
  urule(deliberation,
    [ ;;; CONDITIONS:
      is^lastspeaker=user,
      empty(is^int)
    ],
    [ ;;; EFFECTS:
      ;;; get the next step in the current plan
      solve(next_step(Step)),
      ;;; push it on the intentions stack of the IS
      push(is^int, Step)
    ]
  ).
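The condition/effect pattern of such rules can be mimicked in a few lines of Python. This is a minimal sketch of the general mechanism, not DIPPER's implementation (the real rules are written in DIPPER's own update language, shown above):

```python
# Minimal sketch of an ISU update rule: a list of condition tests and
# a list of effects, applied to a dict-based information state.
def urule(conditions, effects):
    """Build a rule that fires its effects when all conditions hold."""
    def apply(state):
        if all(cond(state) for cond in conditions):
            for effect in effects:
                effect(state)
            return True
        return False
    return apply

# A 'deliberation'-style rule: when the user spoke last and the
# intentions stack is empty, push the next plan step onto it.
deliberation = urule(
    conditions=[
        lambda s: s["lastspeaker"] == "user",
        lambda s: not s["int"],
    ],
    effects=[
        lambda s: s["int"].append(s["plan"].pop(0)),
    ],
)

state = {"lastspeaker": "user", "int": [], "plan": ["ask_user_destination_city"]}
deliberation(state)
print(state["int"])  # ['ask_user_destination_city']
```

A dialog manager of this kind repeatedly tries its rules against the current state; learning a policy amounts to choosing which applicable action to take.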

  9. Initial data collection(from Cambridge) • Tourist information, with route descriptions • Human-human data • On-line transcription of user input with speech recognition error simulation [Williams and Young 2004]

  10. Example dialog data
  hello how can i help
  AH I'M LOOKING FOR A GOOD RESTAURANT IN THE TOWN
  right there's a number of restaurants in town %um what sort of food are you looking to -- to eat
  I'M LOOKING FOR A RESTAURANT NEAR THE CINEMA
  okay there's a restaurant very near the cinema it's a -- a relaxed chinese restaurant called noble nest
  NOBLE NEST AND AH WHERE IS IT EXACTLY
  it's on the corner of north road and fountain road
  AND AH WHAT IS THE PRICE OF FOOD THERE
  %er the food there is %er fourteen pounds per person
  AH IT'S A CHINESE RESTAURANT RIGHT
  that's right yes
  AND HOW TO REACH NOBLE NEST FROM HOTEL ROYAL
  right okay from the hotel royal it would probably be best to catch the bus outside the hotel royal %um which will take you -- probably catch the bus to art square and then walk %um from art square that being the closest bus stop
  . . .

  11. Reinforcement Learning • Dialog is modeled as a Markov Decision Process (MDP) • Choose the action a which maximizes the expected reward Q(s,a) given the state s: Q(s,a) = R(s,a) + Σ_s′ P(s′|s,a) max_a′ Q(s′,a′) • Estimate P(s′|s,a) from users' behavior • Estimate Q(s,a) iteratively from sample dialogs
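The iterative estimation of Q can be sketched as plain value iteration on a toy MDP. The two-state "dialog" below (states, actions, transition probabilities, and rewards) is invented purely for illustration; real dialog MDPs have far larger state spaces, which is exactly the tractability challenge the following slides raise:

```python
# Toy value iteration for Q(s,a) = R(s,a) + sum_{s'} P(s'|s,a) max_{a'} Q(s',a').
# The two-state dialog MDP here is invented purely for illustration.
states = ["need_info", "done"]
actions = ["ask", "confirm"]

# P[(s, a)] -> {s': prob};  R[(s, a)] -> immediate reward
P = {
    ("need_info", "ask"):     {"need_info": 0.3, "done": 0.7},
    ("need_info", "confirm"): {"need_info": 0.9, "done": 0.1},
    ("done", "ask"):          {"done": 1.0},
    ("done", "confirm"):      {"done": 1.0},
}
R = {("need_info", "ask"): -1.0, ("need_info", "confirm"): -1.0,
     ("done", "ask"): 0.0, ("done", "confirm"): 0.0}

# Iterate the Bellman backup to a fixed point.
Q = {(s, a): 0.0 for s in states for a in actions}
for _ in range(50):
    Q = {(s, a): R[(s, a)] + sum(p * max(Q[(s2, a2)] for a2 in actions)
                                 for s2, p in P[(s, a)].items())
         for s in states for a in actions}

# The learned policy picks the Q-maximizing action in each state.
policy = {s: max(actions, key=lambda a: Q[(s, a)]) for s in states}
print(policy["need_info"])  # prints: ask
```

With per-turn cost -1, "ask" (which reaches "done" faster) beats "confirm", as the Bellman fixed point shows. In practice P(s′|s,a) would be estimated from user data rather than specified by hand, as the slide notes.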

  12. Previous applications of RL to dialog management • [Levin, Pieraccini and Eckert 2000], [Singh, Litman, Kearns and Walker 2002] • Choose between a small number of actions • Initiative: system / user / mixed • Confirmation: explicit / none • Have a small number of possible states • Use RL methods which would not scale up to large action sets and large state spaces

  13. Challenges • Tractable Reinforcement Learning with complex actions and large numbers of state features • Learning generic strategies which can be applied to many domains • Discovering useful features of the dialog history • Including partially observable features of the state (using POMDP models)

  14. Example of Combining ML and ISU systems • Utterance classification for Spoken Dialogue Systems [Gabsdil and Lemon, ACL 2004] • Classifying n-best recognition hypotheses: reject, accept, ignore • Based on context features, e.g. dialogue move history • Improves accuracy by 25%
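The classification task on this slide can be sketched as a simple decision function over an n-best list. This is a hypothetical illustration of the task's shape only: the confidence threshold, feature set, and "repeat of the last prompt" heuristic are invented here, not the learned classifier of Gabsdil and Lemon (2004):

```python
# Hypothetical sketch of n-best hypothesis classification: label each
# recognition hypothesis accept / reject / ignore using its confidence
# plus a context feature (the dialogue move history). The threshold and
# heuristic are invented for illustration.
def classify(hypothesis, confidence, dialogue_move_history):
    # Ignore hypotheses that merely echo the system's last prompt.
    if dialogue_move_history and hypothesis == dialogue_move_history[-1]:
        return "ignore"
    # Accept confident hypotheses, reject the rest.
    return "accept" if confidence >= 0.6 else "reject"

nbest = [("edinburgh", 0.8), ("edinboro", 0.4)]
history = ["what is your departure city?"]
labels = [classify(h, c, history) for h, c in nbest]
print(labels)  # ['accept', 'reject']
```

The reported gain comes from replacing such fixed thresholds with a classifier trained on ISU context features like the dialogue move history.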

  15. Summary • Adaptation and learning is an important theme in the TALK project • TALK uses the Information State Update approach, with large state representations • Reinforcement Learning will be used to learn dialog management • Challenges in tractable learning with these large complex representations

  16. Machine Learning and the Information State Update Approach to Dialog Management Oliver Lemon and James Henderson University of Edinburgh

  17. a TALK research theme: Adaptivity and learning • Representations most suited to adaptivity and learning • Parts of information state most amenable to optimisation and reward functions needed • Use of Partially Observable Markov Decision Processes and Bayesian Networks providing tractable learning for ISU-based systems
