Automating Annotation of System-User Dialogue Phenomena with SpeechLogic™

Representing Act–Topic–based Dialogue Phenomena
Hans Dybkjær, SpeechLogic™, Prolog Development Center A/S & Laila Dybkjær, NISLab, University of Southern Denmark

System-user dialogue (Opening)
• S 0.1 – PAUSE1000 – Welcome to FerieKonto’s automatic guide. – You may let the system suggest topics or you can try your own words. – You select topics by speaking to the system. – Choose between: – PAUSE200 – 'employee'. – PAUSE200 – 'On leave'. – PAUSE200 – 'live abroad'. – PAUSE200 – Or 'stopped working' due to 'age' or 'health'. – PAUSE500 – Say a topic, say 'repeat' or ask for 'more options'?
• U 0.1 I’m a student
• S 1.1 – Did you ask for – Main menu
• U 1.1 student
• S 2.1 – If you are a student with SU, you may get your holiday allowance without taking leave from the studies. – But if you work besides the studies, you must take vacation at your work before you can get allowance.
• S 2.2 – If you have work, it is your employer who must sign your FerieKonto certificate, otherwise it is the social authorities in your city.
• S 2.3 – PAUSE500
(Offer) (Request) Start (Check) (Repair) (Information) (Information) Success
A typical system-driven dialogue with meta-communication

Dialogue, continued (Offer)
• S 2.4 – Do you want to ask for more?
• U 2.1 must I write the saturdays
• S 3.1 – From the holiday year 2002 that starts 1 May, the vacation week is changed to 5 vacation days, and Saturdays are thus no longer counted as part of the vacation.
• S 3.2 – Would you like to know more about the 'holiday year'?
• U 3.1 no thanks goodbye
• S 4.1 – Goodbye and thank you for calling us.
• Event 1 nothing recorded
• Event 2 disconnect
Start (Request) (Information) Success (Offer) (Reject + request) Start (Feedback) (Other) (Other) Success
FAQ conversation has many small transactions

Automating annotation
• Key concerns for commercial spoken dialogue systems (SDSs):
  • High transaction success rate
  • Smooth dialogue
• Very time consuming and costly to measure manually
• Manual annotation more or less only possibility today
Two–step approach towards automatic annotation
• Annotate utterances with a basic act–topic structure
  • Can be automated using a parser
• Transform act–topic patterns into transaction segments
  • Using rule engine
Automation only way to serious statistics

First step: annotation
Move = Act {Topic*}
.u: .inform {T.student} “I am a student”
.s: .inform {T.student, T.payment, T.money} “If you are a student living from a grant you may get your holiday allowance while still studying. If you also work you need to take holiday to get your money.”
…
.s: .inform {T.more} “Is there anything else you want to ask about?”
.u: .accept {} “Yes”
…
Turn = Move+ (of same speaker)
Context free annotation, simplistic,
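The Move = Act {Topic*} and Turn = Move+ definitions above can be sketched as a small data model. This is only an illustration in Python, not the SpeechLogic™ notation or implementation: the `Move` class, its field names, and the `turns` helper are hypothetical, and the moves elided with "…" on the slide are simply left out.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Move:
    """One annotated move: Move = Act {Topic*} (field names are assumptions)."""
    speaker: str                    # "s" for system, "u" for user
    act: str                        # basic act, e.g. "inform" or "accept"
    topics: frozenset = frozenset() # zero or more topics
    text: str = ""                  # the transcribed utterance


def turns(moves):
    """Group consecutive moves of the same speaker: Turn = Move+."""
    return [list(group) for _, group in groupby(moves, key=lambda m: m.speaker)]


# The moves shown on the slide; moves elided with "…" are omitted here.
dialogue = [
    Move("u", "inform", frozenset({"student"}), "I am a student"),
    Move("s", "inform", frozenset({"student", "payment", "money"}),
         "If you are a student living from a grant ..."),
    Move("s", "inform", frozenset({"more"}),
         "Is there anything else you want to ask about?"),
    Move("u", "accept", frozenset(), "Yes"),
]

for turn in turns(dialogue):
    print(turn[0].speaker, [m.act for m in turn])
```

Keeping topics as a set mirrors the `{…}` notation on the slide, where an empty set (as in `.accept {}`) is a legitimate annotation.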

Second step: Transformation
• Apply act–topic rules to step one annotated dialogues
• Transform basic acts into composite acts
rule select1
  _y: .select Ts_a
<–
  _x: .inform Ts_a
  _y: .accept {}
where _x != _y
end rule
Formalism designed for the problem
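To make the effect of rule select1 concrete, here is a minimal Python sketch of one way such a rewrite could be applied. It is not the rule engine described in the slides: the function name `apply_select1` and the tuple encoding are hypothetical, and it assumes that the rule matches two adjacent moves and that the matched pair is replaced by the composite act. The slide does not state either of these points.

```python
def apply_select1(moves):
    """Rewrite  _x: inform Ts_a ; _y: accept {}  (with _x != _y)
    into       _y: select Ts_a.

    Each move is a (speaker, act, topics) tuple. Adjacency of the two
    moves and replacement of the matched pair are assumptions.
    """
    out = []
    i = 0
    while i < len(moves):
        if i + 1 < len(moves):
            (x, act_x, topics_x), (y, act_y, topics_y) = moves[i], moves[i + 1]
            if act_x == "inform" and act_y == "accept" and not topics_y and x != y:
                # The accepting speaker selects the topics that were offered.
                out.append((y, "select", topics_x))
                i += 2
                continue
        out.append(moves[i])
        i += 1
    return out


annotated = [
    ("s", "inform", frozenset({"more"})),  # "Is there anything else ...?"
    ("u", "accept", frozenset()),          # "Yes"
]
composite = apply_select1(annotated)
```

The `where _x != _y` constraint is what prevents a speaker from "accepting" their own inform move; in the sketch it is the `x != y` test.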

Rule examples
Exchanges and transactions belong to different levels

Dialogue phenomena investigated
• Different acts, different topics, and different speakers
• Rejecting topics
• Differentiating topic names and topic values
• Sets of topics
• Patterns across turns (move sets) not using all moves in that turn
• Topic relation: (sub-topic) IS–A (topic)
• Meta–communication (repair, clarification, ...)
• Multi–level rules: Some rules only apply after match by other rules
• Summarising feedback
Only structural phenomena – propositional contents not considered
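The IS–A topic relation in the list above could be checked with a simple parent lookup. The sketch below is only an illustration in Python: the topic names and the hierarchy in `PARENT` are invented for the example and do not come from the FerieKonto system, and treating a topic as IS–A itself is an assumption.

```python
# Hypothetical sub-topic -> topic hierarchy, invented for illustration only.
PARENT = {
    "saturday": "holiday_year",
    "holiday_year": "vacation",
}


def is_a(topic, ancestor, parent=PARENT):
    """Return True if `topic` equals `ancestor` or is a (transitive) sub-topic of it."""
    seen = set()
    while topic is not None and topic not in seen:
        if topic == ancestor:
            return True
        seen.add(topic)            # guards against cycles in a malformed hierarchy
        topic = parent.get(topic)  # None once the root is passed
    return False
```

A rule matcher could use such a check so that a rule written for a general topic also matches moves annotated with one of its sub-topics.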

Status and next steps
• Restriction: Task oriented human–computer spoken dialogues
• Declarative rewrite rules with constraints
• Customised language close to dialogue analysis domain
• Smoothness criteria not clearly defined
  • How does smoothness affect transaction counts?
• To obtain a fully automated annotation process
  • Parse dialogues to produce basic act–topic annotation
  • Combine into automatic batch system
• Need for evaluation of method
  • Test on larger number of dialogues of different kinds
  • Establish human coder baseline and compare
Both theoretical work and practical tools needed

Supplementary slides
Warning: you are over time!

Background
• Over–the–phone FAQ SDS on holiday allowance
• General (non–person related) questions, e.g.
  • is Saturday considered a holiday?
• 2700 lines of grammar, 800 (full) words in vocabulary
• 85 semantic concepts in input, 100+ stories in output
• Contractual minimum transaction success rate, but
  • transaction not clearly defined
  • no baseline from human–human dialogues
• Approach to measurement:
  • Transactions defined in terms of patterns of act–topics
  • Manually: 225 test and 217 production system dialogues
  • Created web–based manual annotation tool
Complex domain but simple tasks

Problem in only identifying topics
• Basically only distinguish between two composite acts:
  • select: continue with same topic
  • request: change to new topic
• So cannot distinguish success and failure
More distinction needed

Name and value
• Solution: distinguish
  • topic name N: the mentioning of a topic
  • topic value V: details about a topic
Simple yet powerful distinction – may still be parseable

More rule examples
Many variations possible

Transaction success
• Dialogue level task completion?
  • Works if task is well–defined and goal state clear
  • But many independent tasks
  • So no single clear goal state
• Need to define transaction at sub–task level
• What constitutes a transaction?
  • Initiation and conclusion?
  • Is miscommunication part of a transaction?
  • What about sub–tasks?
• Sub–task level transactions may also show which parts of the system are problematic
• Start and end do not tell about dialogue smoothness
Provide users with required information

Smooth dialogues
• More precise overview of problems and their causes and seriousness
• Same topic may have fail and success in same call
• Few or many repairs
• Distinction between unwanted and erroneous information
  • erroneous information is unacceptable (tomorrow is Friday, phone 36 36 00 01)
  • other information than asked for may be more or less serious (fax instead of phone, fax instead of email)
  • misunderstanding a yes for a no is usually not so serious (repairable) but can be a nuisance
• Misrecognitions
• Information blocks may contain more than asked for
You are way beyond your time frame!
