
Dialog Structure Design and Annotation

Ananlada Chotimongkol

Language Technologies Institute

School of Computer Science

Carnegie Mellon University

Outline

  • Existing Annotation Schemes

    • Linguistic Oriented

    • Engineering Oriented

      • HCRC dialog structure

      • Conversation Acts

      • DAMSL

      • Comparison

  • Form-based dialog structure

Structure of a dialog

  • Explain how the conversation is organized

    • To create a theory of dialog in order to understand the meaning of the dialog

      • Linguistic-Oriented

    • To develop a procedure that supports a computer agent in a dialog system

      • Engineering-Oriented

Linguistic-Oriented

  • Some are extended from discourse structure (focus on monologue text)

  • Provide the basic theory for the engineering-oriented schemes

    • Speech Act Theory: capture speaker’s intention

    • Rhetorical Structure Theory: explain the coherence between parts of text

    • Dialog Grammar: capture regular patterns in the dialog

Engineering-Oriented

  • HCRC structure (Edinburgh)

  • Conversation Acts (Rochester)

  • DAMSL (Multiparty Discourse Group)

HCRC Dialog Structure

Carletta, J., Isard, A., Isard, S., Kowtko, J., Doherty-Sneddon, G., Anderson, A., HCRC dialogue structure coding manual, 1996


  • Domain = map description

  • Focuses on describing the phenomena that occur in the Map Task corpus

    • But claims to be task-independent

  • Focuses on high-level structure

  • Can be used in conjunction with other coding schemes

3-level structure

  • Transaction: a sub-dialog that accomplishes a major goal of the task

    • In Map Task = 1 segment of the route

  • Game (interaction, exchange): a set of utterances composed of an initiation and a sequence of responses that fulfills the initiation’s purpose

  • Move (dialog act): an utterance or part of an utterance that serves a particular purpose, e.g. as an initiation or a response

Move Coding Scheme

  • Tradeoff between semantic distinction and coding consistency

  • 12 moves from 3 categories

    • Initiating Moves: set up an expectation at the beginning of the game

      • Instruct, Explain, Check, Align, Query-YN and Query-W

    • Response: follow the initiation and fulfill the expectation

      • Acknowledge, Reply-Y, Reply-N, Reply-W and Clarify

    • Ready: occurs in the transition between games
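
As a rough illustration, the move inventory above can be written as a small lookup table. This is only a sketch: the names `HCRC_MOVES` and `move_category` are hypothetical and do not come from the HCRC coding manual.

```python
# Hypothetical lookup table for the 12 HCRC moves listed on the slide,
# grouped by the three categories.
HCRC_MOVES = {
    "initiating": ["Instruct", "Explain", "Check", "Align", "Query-YN", "Query-W"],
    "response": ["Acknowledge", "Reply-Y", "Reply-N", "Reply-W", "Clarify"],
    "ready": ["Ready"],
}


def move_category(move: str) -> str:
    """Return the category of a move label, or raise ValueError if unknown."""
    for category, moves in HCRC_MOVES.items():
        if move in moves:
            return category
    raise ValueError(f"unknown move: {move}")


# Sanity check against the slide: 6 initiating + 5 response + 1 ready.
total_moves = sum(len(moves) for moves in HCRC_MOVES.values())
```

A coder's label can then be validated with `move_category("Check")` before it is stored with the utterance.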

Game Coding Scheme

  • A game’s purpose = the name of its initiating move

  • All games begin with an initiating move but not all initiating moves begin games

  • Games can be nested, e.g. contain a clarification sub-dialog

Transaction Coding Scheme

  • Divide the dialog into transactions

    • Differs between the giver’s and the follower’s perspectives

  • For a giver, how he divides a route into sub-tasks

    • 4 types of transactions: normal, review, overview and irrelevant

    • Each transaction (except irrelevant) is associated with a route segment on the map

  • For a follower, how he perceives a segment and performs some actions

    • 2 types of actions: drawing a line and crossing out a line

  • Transactions are not nested (they are too large)
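
The three HCRC levels (transaction, game, move) can be pictured as nested records, with games allowed to embed other games and transactions kept flat. The sketch below is an assumption-laden illustration: the class names, fields, and the example utterances are invented here and are not taken from the coding manual or the Map Task corpus.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Move:
    speaker: str  # e.g. "giver" or "follower"
    label: str    # e.g. "Instruct", "Check", "Reply-Y"
    text: str


@dataclass
class Game:
    # A game is named after its initiating move and may embed other games.
    purpose: str
    parts: List[Union[Move, "Game"]] = field(default_factory=list)

    def depth(self) -> int:
        """Nesting depth: 1 for a flat game, more when games are embedded."""
        nested = [p.depth() for p in self.parts if isinstance(p, Game)]
        return 1 + max(nested, default=0)


@dataclass
class Transaction:
    # Transactions are not nested; they only hold top-level games.
    kind: str  # "normal", "review", "overview" or "irrelevant"
    games: List[Game] = field(default_factory=list)


# Invented utterances, for illustration only.
clarification = Game("Check", [
    Move("follower", "Check", "below the lake?"),
    Move("giver", "Reply-Y", "yes"),
])
instruct = Game("Instruct", [
    Move("giver", "Instruct", "go down past the lake"),
    clarification,
    Move("follower", "Acknowledge", "okay"),
])
transaction = Transaction("normal", [instruct])
```

Here the clarification sub-dialog is an embedded `Check` game inside the `Instruct` game, which is the kind of nesting the game coding scheme allows.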

Discussion

  • No real dialog application. Used as data for analyzing phenomena in dialog

  • Emphasizes how the information is conveyed, e.g. as a question or a response, rather than what information is conveyed (concept)

  • Annotates the purpose of the utterance in general, e.g. instruct, explain, question, rather than the purpose that each utterance serves according to the task, e.g. describe the movement or describe the landmark

Conversation Acts

  • David R. Traum and Elizabeth A. Hinkelman, "Conversation Acts in Task-Oriented Spoken Dialogue", In Computational Intelligence, 8(3):575--599, 1992. Also appears as TR 425, Computer Science Dept.

  • Emphasizes

    • Mutual understanding between participants

    • Dialog mechanisms that serve in coordination and maintenance of the dialog itself rather than the direct task.

Dialog Units

  • Utterance unit (UU)

    • Continuous speech by the same speaker

    • Each speaker turn can contain more than one UU

  • Discourse Unit (DU)

    • A sequence of an initial presentation and subsequent utterances by each party that are needed to make a unit grounded

Classes of Conversation Acts

  • 4 classes

    • Turn-taking acts (sub-UU acts)

    • Grounding acts (UU acts)

    • Core speech acts (DU acts?)

    • Argumentation acts (multiple DUs)

  • More general than speech act theory

Turn-taking Act

  • Can have more than one turn-taking act in an utterance (sub-UU act)

  • Coordinates the control of the speaking channel

  • Types of turn-taking acts

    • take-turn, keep-turn, release-turn, assign-turn and pass-up-turn

  • Turn-taking acts occur all the time

    • Should we annotate all of them?

    • Which one is important?

Grounding Act

  • Corresponds to one utterance unit (UU act)

  • Coordinates mutual understanding

  • Types of grounding acts

    • Initiate (an initial component of a DU)

    • Continue

    • Acknowledge

    • Repair

    • ReqRepair

    • ReqAck

    • Cancel (close off the current DU as ungrounded)
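
To make the role of the grounding acts concrete, the sketch below tracks whether a discourse unit (DU) ends up grounded. It is a deliberately simplified assumption, not the grounding model from the Conversation Acts paper: it ignores which participant performs each act, and the function name `du_status` and the status strings are hypothetical.

```python
def du_status(acts):
    """Return 'empty', 'pending', 'grounded' or 'ungrounded' for a DU.

    Simplifying assumptions (not from the source):
      * Initiate opens the DU and leaves its content pending.
      * Continue, Repair, ReqRepair and ReqAck leave it pending.
      * Acknowledge grounds whatever is pending.
      * Cancel closes the DU as ungrounded.
    """
    status = "empty"
    for act in acts:
        if act == "Initiate":
            status = "pending"
        elif act in ("Continue", "Repair", "ReqRepair", "ReqAck"):
            if status != "empty":
                status = "pending"
        elif act == "Acknowledge":
            if status == "pending":
                status = "grounded"
        elif act == "Cancel":
            return "ungrounded"
        else:
            raise ValueError(f"unknown grounding act: {act}")
    return status
```

For example, `du_status(["Initiate", "ReqRepair", "Repair", "Acknowledge"])` stays pending through the repair and only becomes grounded at the final acknowledgement.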

Core Speech Act

  • Similar to a traditional speech act

  • Coordinates the local flow of changes in belief, intentions and obligations

  • Types of core speech acts:

    • Inform, WHQ, YNQ, Accept, Request, Reject, Suggest, Eval, ReqPerm, Offer, Promise

  • Doesn’t correspond to any of the dialog units?

Argumentation Act

  • Composed of combinations of core speech acts (multiple-DU acts)

  • Coordinates discourse purpose

  • Is at the same level as Rhetorical Relations and Adjacency Pairs

  • Types of argumentation acts: Elaborate, Summarize, Clarify, Q&A, Convince, Find-Plan

  • Builds up a hierarchy within the same class

    • The high-level acts correspond to steps in the task structure (task-dependent?)

    • The lower-level acts, e.g. Q&A

DAMSL (Dialog Act Markup in Several Layers)

  • Coding Dialogs with the DAMSL Annotation Scheme. Mark Core, James Allen. AAAI Fall Symposium on Communicative Action in Humans and Machines, 1997.

  • J. Allen and M. Core. “Draft of DAMSL: Dialog Act Markup in Several Layers”, 1997.

DAMSL Tag Set

  • Developed by the Multiparty Discourse Group

  • Contains primitive communicative actions that manipulate the common ground directly

  • Allows multiple labels in multiple layers

    • Eliminates the restriction in Speech Act Theory

  • Designed to be domain-independent

    • But can add domain relevant acts

  • The annotation can be used to

    • Interpret utterances in dialog

    • Design appropriate dialog strategy

DAMSL Annotation Scheme

  • 3 layers of annotation for each utterance

    • Forward Communicative Functions

    • Backward Communicative Functions

    • Utterance Features

  • These 3 layers are orthogonal

    • But some utterances may not have a label for every layer

    • Can have more than one label in each layer

  • Utterance segmentation is based on the intentions of the speaker

    • An utterance can have several clauses or just an initial word
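
One way to picture the layered scheme is a record with one list of labels per layer, so that a layer may be empty or hold several labels. The class below is a hypothetical sketch, not part of the DAMSL specification; the class name, field names, and the example utterance with its labels are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DamslAnnotation:
    utterance: str
    # Each layer holds zero or more labels.
    forward: List[str] = field(default_factory=list)
    backward: List[str] = field(default_factory=list)
    features: List[str] = field(default_factory=list)

    def labelled_layers(self) -> List[str]:
        """Names of the layers that carry at least one label."""
        return [
            name
            for name in ("forward", "backward", "features")
            if getattr(self, name)
        ]


# Invented example: one utterance with labels in two of the three layers.
ann = DamslAnnotation(
    utterance="okay, go to the station",
    forward=["Influencing Addressee Future Action"],
    backward=["Agreement", "Understanding"],
)
```

In this example the utterance features layer is left empty while the backward layer carries two labels, matching the two sub-bullets about missing and multiple labels.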

Forward Communicative Function

  • Indicates how the current utterance constrains the future beliefs and actions

    • Similar to actions in speech act theory

  • Types of Forward Communicative Functions

    • Statement

    • Influencing Addressee Future Action

    • Committing Speaker Future Action

    • Performative (make a fact true by saying it)

    • Other Forward Function

Backward Communicative Function

  • Indicates how the current utterance relates to the previous dialog

  • Types of Backward Communicative Functions

    • Agreement (accept/reject)

    • Understanding

    • Answer (associated with an info-request act)

    • Information Relation (How this utterance relates to the previous one)

      • Similar to Rhetorical Relations

Utterance Feature

  • Captures the content and form of an utterance

  • The features are

    • Information Level: task, task management, communication management

    • Communicative Status: abandoned, uninterpretable

    • Syntactic Features: conventional form, exclamatory form

Discussion

  • Focuses on the primitive purpose of the utterance

    • Need a more detailed representation to get the key information in the utterance

    • Also need higher level representations such as plans and discourse structures

  • Are these 3 layers orthogonal?

  • Are there too many tags for each utterance?

Comparison: Levels of Annotation

Conversation Acts

  • Argumentation acts

  • Core speech acts

  • Grounding

  • Turn-taking

DAMSL

  • Forward

  • Backward

  • Utterance Features

Comparison: Levels of Annotation

(The same level as all DAMSL tags)

Conversation Acts

  • Argumentation acts

  • (Dialog Unit)

  • Core speech acts

Comparison: Tags for Utterance Level

DAMSL

  • Forward: Statement, Influencing-Addressee-Future-Action, Committing-Speaker-Future-Action, Performative

  • Backward: Agreement (accept/reject), Answer, Information Relation

HCRC

  • Initiation: Instruct, Explain, Check, Align, Query-YN and Query-W

  • Response: Acknowledge, Reply-Y, Reply-N, Reply-W and Clarify

Conversation Acts

  • Inform, Suggest, Offer, Promise, Request, ReqPerm, WHQ, YNQ, Accept, Reject, Eval

Form-based dialog structure

  • Why do we need a new structure?

    • The existing structures are too general

    • Want to capture domain information e.g. task structure, key concepts

    • Want to create a dialog system from a structure

  • Choose to work on a form-based dialog system

  • Represent the structure of a dialog in terms of forms and slots

Three-level organization

  • Task (dialog)

    A task is a subset of conversation that serves a particular goal of a dialog.

  • Episode (sub-task)

    A set of utterances that corresponds to a smaller step in a task

  • Concept

    An important piece of domain information that the participants would like to communicate in the dialog

Form representation

  • A form is a repository of related pieces of information (concepts)

  • A sub-task is equivalent to a form

    • A sub-task is the smallest practical unit

  • A task = collection of forms (sub-tasks)
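
The form representation above maps naturally onto a small data structure: a form holds slots (concepts), and a task is a collection of forms. The following is a minimal sketch under that reading; the class names, the `missing_slots` helper, and the example slot names are assumptions made for illustration, not the author's implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Form:
    """One sub-task: a repository of related concepts stored as slots."""
    name: str
    slots: Dict[str, Optional[str]] = field(default_factory=dict)

    def missing_slots(self) -> List[str]:
        """Slot names that have no value yet."""
        return [slot for slot, value in self.slots.items() if value is None]

    def is_complete(self) -> bool:
        return not self.missing_slots()


@dataclass
class Task:
    """A task is a collection of forms (sub-tasks)."""
    goal: str
    forms: List[Form] = field(default_factory=list)


# Hypothetical forms and slot names, for illustration only.
flight = Form("reserve_flight", {"city": "HOUSTON", "date": None, "time": None})
hotel = Form("reserve_hotel", {"city": None})
task = Task("reserve a flight with an optional hotel", [flight, hotel])
```

A dialog manager built on this structure could ask about `flight.missing_slots()` until `flight.is_complete()` returns true.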

How can the task be accomplished using a form?

  • The sub-task is accomplished by manipulating the form:

    • *Fill in the slots

    • *Execute the form

    • Discuss the result

  • Operations

Operations

  • An operation is an utterance or a part of an utterance (turn) that causes a unique consequence in the conversation

    U:fill_form_info: I'D LIKE TO FLY TO ArLoc:[HOUSTON ]ArLoc:[TEXAS ]

    S: access_DB:
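
The annotated lines above follow a visible pattern: a speaker, an operation name, then the utterance text with concepts marked as `Name:[VALUE]`. A small parser for that pattern might look like the sketch below. The regular expressions are inferred from the examples on these slides and are an assumption about the annotation format, not a published specification.

```python
import re

# speaker : operation : text   (whitespace around the colons is optional)
LINE_RE = re.compile(r"^\s*(\w+)\s*:\s*(\w+)\s*:\s*(.*)$")
# concept annotations such as ArLoc:[HOUSTON ]
CONCEPT_RE = re.compile(r"(\w+):\[([^\]]*)\]")


def parse_operation(line: str) -> dict:
    """Split an annotated line into speaker, operation, text and concepts."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"not an annotated operation: {line!r}")
    speaker, operation, text = match.groups()
    concepts = [
        (name, value.strip()) for name, value in CONCEPT_RE.findall(text)
    ]
    return {
        "speaker": speaker,
        "operation": operation,
        "text": text,
        "concepts": concepts,
    }


parsed = parse_operation(
    "U:fill_form_info: I'D LIKE TO FLY TO ArLoc:[HOUSTON ]ArLoc:[TEXAS ]"
)
```

With the first example line, `parsed["operation"]` is `"fill_form_info"` and `parsed["concepts"]` holds the two `ArLoc` values.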


Question & Answer pair

  • Q&A are separated into 2 operations by a turn boundary

  • The consequence of the answer depends on the question, especially for a yes/no answer


    U: init_form :I NEED A HOTEL IN HOUSTON



    U: respond:YES

Let’s Go

  • Goal: request information about the bus schedule

  • Tasks: (multiple system functions)

    • Ask bus number

    • Ask departure time

    • Ask stop

    • Etc.

  • One form for each task (a simple task)

  • Concepts: bus_number, hour, minute, departure_location

List of Operations

  • Form-filling operations

    • init_form

    • fill_form_info

    • change_form_info

  • Form execution operations

    • access_DB (task-specific)

  • Discuss-result operations

    • inform_result

    • navigate_results
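
The operations listed above could be realized as plain functions over a form, here represented as a dictionary of slots. This is a sketch under several assumptions: `fill_form_info` is taken to set an empty slot and `change_form_info` to overwrite a filled one, the backend is a stand-in callable, and the schedule data is invented. `navigate_results` is left out for brevity.

```python
from typing import Callable, Dict, List

Form = Dict[str, str]


def init_form() -> Form:
    """Start a new, empty form."""
    return {}


def fill_form_info(form: Form, slot: str, value: str) -> Form:
    """Fill a slot that has no value yet (assumed semantics)."""
    if slot in form:
        raise ValueError(f"slot {slot!r} is already filled")
    return {**form, slot: value}


def change_form_info(form: Form, slot: str, value: str) -> Form:
    """Overwrite the value of an already filled slot (assumed semantics)."""
    if slot not in form:
        raise ValueError(f"slot {slot!r} has no value to change")
    return {**form, slot: value}


def access_db(form: Form, backend: Callable[[Form], List[str]]) -> List[str]:
    """Execute the form against a task-specific backend."""
    return backend(form)


def inform_result(results: List[str]) -> str:
    """Turn the backend results into a system utterance."""
    return f"Found {len(results)} result(s): {', '.join(results)}"


# Invented toy schedule standing in for a real database.
TOY_SCHEDULE = {"61C": ["08:00", "08:30"]}


def toy_backend(form: Form) -> List[str]:
    return TOY_SCHEDULE.get(form.get("bus_number", ""), [])


form = init_form()
form = fill_form_info(form, "bus_number", "61A")
form = change_form_info(form, "bus_number", "61C")
results = access_db(form, toy_backend)
message = inform_result(results)
```

Keeping each operation as a pure function that returns a new form makes it easy to replay an annotated dialog operation by operation.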

Air Travel Domain

  • Goal: Reserve a flight with optional hotel and car

  • Tasks:

    • Reserve a flight

    • Reserve a car

    • Reserve a hotel

  • But car and hotel reservations are always part of a flight reservation.

  • So it is better to think of them as sub-tasks

  • One form for each sub-task

  • Concepts: airline, city, date, time

Flight Reservation

  • There are 3 form executions (DB access) in the flight reservation episode

    • Retrieve departure flight

    • Retrieve arrival flight

    • Retrieve fare

  • The fare depends on the flights

  • Embedded forms

[Diagram: embedded "flight info" forms]

Map Task: description

  • Conversation between 2 participants

    • Giver: has a map with a route on it

    • Follower: has a map without a route

  • Task: the giver tells the follower how to draw the route on the follower’s map

  • The maps are not exactly the same

Map Task: Characteristics

  • More casual conversation

    • Disfluency

    • Repetition

    • Anaphora

  • No well-defined form

    • No constraint from the backend

    • There are many ways to describe a segment

  • Need a lot of grounding processes

Map Task: Structure

  • Goal: draw a map from a description

  • Task: draw a line (a route)

  • Sub-task

    • Draw a segment of a line

    • Locate a new landmark (can be embedded)

Grounding Process

  • Create mutual understanding between participants

    • Check understanding, correctness of communication

      • Confirmation and clarification

    • Define a new term

      • Discuss the attributes of the object e.g. check landmark and create landmark

Grounding process in form-based structure

  • Confirmation

    • If ‘yes’, increase the confidence in the slot value

    • If ‘no’, cross the value out of the slot

  • Clarification

    S:ask_fill_form_info:INTO ArLoc:[INTERCONTINENTAL ]AIRPORT OR ArLoc:[HOBBY ]

    U: fill_form_info:AT THE /UH/ ArLoc:[INTERCONTINENTAL ]
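
The confirmation rule above (a "yes" raises confidence, a "no" crosses the value out) can be sketched as a small update function. The numeric confidence scale, the `boost` increment, and the `Slot` class are assumptions introduced for this illustration; the slides do not say how confidence is represented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slot:
    value: Optional[str] = None
    confidence: float = 0.0  # assumed scale: 0.0 (unknown) to 1.0 (certain)


def confirm(slot: Slot, answer: str, boost: float = 0.25) -> Slot:
    """Update a slot after a yes/no confirmation.

    'yes' raises the confidence (capped at 1.0);
    'no' crosses the value out of the slot.
    """
    if answer == "yes":
        return Slot(slot.value, min(1.0, slot.confidence + boost))
    if answer == "no":
        return Slot(None, 0.0)
    raise ValueError(f"expected 'yes' or 'no', got {answer!r}")


arrival = Slot("INTERCONTINENTAL", 0.5)
confirmed = confirm(arrival, "yes")
rejected = confirm(arrival, "no")
```

After a rejection the slot is empty again, so the system would need a clarification question such as the `ask_fill_form_info` example above to refill it.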

Grounding process in form-based structure (2)

  • Define a new term

    • A form is a collection of object attributes

      FOLLOWER: fill_form_info:  but golden beach is away in Loc:[the far right].

      Landmark: golden beach

      Location: the far right

Plane simulation task

  • 3 participants work on the plane simulation

  • Task = take pictures of a list of targets

  • Each participant has a different role: flying the plane, navigating the route, taking a picture

  • There are some restrictions on controlling a plane such as speed, altitude and radius from a destination

Dialog Structure

  • Task: Take pictures of a given list of targets

  • Sub-tasks: Take a picture of one target

  • Concepts:

    • target

    • waypoint

    • distance

    • speed

    • altitude

Task Characteristics

  • 3-party conversation

  • Command & Control style

  • The physical actions have a time constraint

    • Can’t execute the form right away after all the slots get filled

  • The list of the sub-tasks (targets) is not fixed and not known in advance

Sub-task

  • Main sub-task = take a picture of the target

  • Also have to control the plane

    • Set destination, altitude and speed (subject to restrictions)

    • Report the result in terms of the plane status: altitude, speed, destination and the distance from the destination

  • Grounding process

    • Define a landmark as a target or a waypoint

Forms

  • target form (take a picture)

    • target name

    • required distance from target

  • control form: contains only a single slot (fly a plane)

    • Altitude

    • Speed

    • Destination (may have radius)

  • grounding form (grounding process)

    • object name

    • attributes e.g. type of landmark