
Stochastic Language Generation for Spoken Dialog Systems



1. Stochastic Language Generation for Spoken Dialog Systems
Alice Oh (aliceo@cs.cmu.edu)
School of Computer Science, Language Technologies Institute, Carnegie Mellon University

2. Big Question
How can we design a good NLG system for spoken dialog systems?

3. What is NLG?
• Natural Language Understanding (NLU): text → semantic (syntactic) representation
• Natural Language Generation (NLG): semantic (syntactic) representation → text

4. Example of NLG
{ act     query
  content name }
=> What is your full name?
(Carnegie Mellon Communicator)
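The slide does not say how the frame is turned into text; as a rough illustration only, the sketch below represents such a frame as a small attribute-value structure and renders it with a simple lookup, much like the template-based generation discussed on slide 22. The dict keys, the template table, and the function name are all hypothetical, not the Communicator's actual API.

```python
# Illustrative only: a Communicator-style input frame as a dict, rendered by
# a simple template lookup keyed on (act, content). The names and the
# template table are made up for this sketch, not taken from the real system.
frame = {"act": "query", "content": "name"}

TEMPLATES = {
    ("query", "name"): "What is your full name?",
    ("query", "depart_time"): "What time on {depart_date}?",
}

def render(frame):
    """Pick the canned template for this dialog act and content."""
    return TEMPLATES[(frame["act"], frame["content"])]

print(render(frame))  # -> What is your full name?
```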

5. Example of NLG (FUF/SURGE; Elhadad and Robin, 1996)
cat clause
process       type material
              effect-type creative
              lex 'score'
              tense past
participants  agent    cat proper
                       head cat person-name
                            first-name [lex 'Michael']
                            last-name  [lex 'Jordan']
              created  cat np
                       cardinal [value 36]
                       definite no
                       head [lex 'point']
=> Michael Jordan scored 36 points.

6. What is a good NLG system?
• High quality output?
  • Write like Shakespeare?
  • Talk like … the presidential candidates?
  • Or… just produce grammatical sentences?
• Reusable?
• Portable?
• What about development & maintenance?

7. What is a spoken dialog system?
A task-oriented human-computer interaction via natural spoken dialog
• CMU Communicator: complex travel planning system
• Jupiter: worldwide weather reports
What is not a spoken dialog system?
• C-STAR speech-to-speech translation system
• American Airlines flight information system (IVR)

8. NLG in spoken dialog systems
The language is different from that of text-based applications:
• Shorter in length
• Simpler in structure
• Less strict in following grammatical rules
• Lexicon is domain-specific

9. Communicator Project
A spoken dialog system in which users engage in a telephone conversation with the system, using natural language to solve a complex travel reservation task.
Components:
• Sphinx-II speech recognizer
• Phoenix semantic parser
• Domain agents
• Agenda-based dialog manager
• Stochastic natural language generator
• Festival domain-dependent text-to-speech (being integrated)
Want to know more? Call toll-free at 1-877-CMU-PLAN

10. Problem Statement
• Problem: build a generation engine for a dialog system that can combine the advantages, as well as overcome the difficulties, of the two dominant approaches (template-based generation and grammar-rule-based NLG)
• Our approach: design a corpus-driven stochastic generation engine that takes advantage of the characteristics of task-oriented conversational systems, among them:
  • Spoken utterances are much shorter in length
  • There are well-defined subtopics within the task, so the language can be selectively modeled

11. Stochastic NLG: Overview
• Language model: an n-gram language model of a domain expert's language, built from a corpus of travel reservation dialogs
• Generation: given an utterance class, randomly generate a set of candidate utterances based on the LM distributions
• Scoring: score the candidates with a set of heuristics and pick the best one
• Slot filling: substitute slots in the utterance with the appropriate values from the input frame
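A minimal end-to-end sketch of how these four stages fit together, under toy assumptions: the sampler below just picks between two canned surface forms instead of walking an n-gram model, and the penalty only counts unfilled slots. None of the function names, constants, or data come from the actual system; slides 26-28 (and the sketches after them) describe each stage in more detail.

```python
# Toy sketch of the generate-score-fill loop, not the actual Communicator code.
# sample_candidate() stands in for real n-gram sampling (see slide 26) and
# penalty() stands in for the real scoring heuristics (see slide 27).
import random
import re

CANNED_FORMS = ["What time on {depart_date}?",
                "At what time would you be leaving {depart_city}?"]

def sample_candidate(utt_class, lm=None):
    """Stand-in for LM sampling: pick one of two canned surface forms."""
    return random.choice(CANNED_FORMS)

def penalty(candidate, frame):
    """Stand-in scoring: one penalty point per slot with no value in the frame."""
    slots = re.findall(r"\{(\w+)\}", candidate)
    return sum(1 for s in slots if s not in frame)

def fill_slots(candidate, frame):
    """Replace each {slot} with its value from the frame, if one is present."""
    return re.sub(r"\{(\w+)\}",
                  lambda m: str(frame.get(m.group(1), m.group(0))),
                  candidate)

def generate(utt_class, frame, lm=None, max_iters=50):
    """Sample candidates, keep the lowest-penalty one, stop early at penalty 0."""
    best, best_pen = None, float("inf")
    for _ in range(max_iters):
        cand = sample_candidate(utt_class, lm)
        pen = penalty(cand, frame)
        if pen < best_pen:
            best, best_pen = cand, pen
        if best_pen == 0:
            break
    return fill_slots(best, frame)

frame = {"act": "query", "content": "depart_time", "depart_date": "Mon, May 8th"}
print(generate("query_depart_time", frame))
# -> "What time on Mon, May 8th?" (almost always: the other canned form
#    mentions {depart_city}, which has no value in this frame)
```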

12. Stochastic NLG: Overview (pipeline diagram)
Dialog Manager → input frame: { act query, content depart_time, depart_date 20000501 }
Tagged corpora → language models → Generation → candidate utterances:
  "What time on {depart_date}?"
  "At what time would you be leaving {depart_city}?"
Scoring → best utterance: "What time on {depart_date}?"
Slot Filling → complete utterance: "What time on Mon, May 8th?" → TTS

13. (Same pipeline diagram as slide 12.)

14. (System architecture diagram: speech signal → Sphinx ASR → words → Phoenix parser → semantic frames → Dialog Manager; the Dialog Manager exchanges queries and data with the backend modules, and sends output through the Stochastic NLG to Festival synthesis.)

15. Stochastic NLG: Corpora
Human-human dialogs in travel reservations (CMU-Leah, SRI ATIS/American Express dialogs)

16. Example
Utterances in corpus:
  What time do you want to depart {depart_city}?
  What time on {depart_date} would you like to depart?
  What time would you like to leave?
  What time do you want to depart on {depart_date}?
Output (different from corpus):
  What time would you like to depart?
  What time on {depart_date} would you like to depart {depart_city}?
  *What time on {depart_date} would you like to depart on {depart_date}?

17. Evaluation (design diagram)
Transcribed dialogs → batch-mode generation by two engines: Stochastic NLG → dialogs with Output-S; Template NLG → dialogs with Output-T → comparative evaluation

18. Preliminary Evaluation
• Batch-mode generation using the two systems; comparative evaluation of the output by human subjects
• User preferences (49 utterances total): weak preference for Stochastic NLG (p = 0.18)

  subject   stochastic   templates   difference
  1         41           8            33
  2         34           15           19
  3         17           32          -15
  4         32           17           15
  5         30           17           13
  6         27           19            8
  7         8            41          -33
  average   27           21.29         5.71

19. Stochastic NLG: Advantages
• Corpus-driven
• Easy to build (minimal knowledge engineering)
• Fast prototyping
• Minimal input (speech act, slot values)
• Natural output
• Leverages the data-collection/tagging effort

20. Open Issues
• How big a corpus do we need?
• How much of it needs manual tagging?
• How does the n in n-gram affect the output?
• What happens to the output when two different human speakers are modeled in one model?
• Can we replace "scoring" with a search algorithm?

  21. Extra Slides

22. Current Approaches
• Traditional (rule-based) NLG
  • Hand-crafted generation grammar rules and other knowledge
  • Input: a very richly specified set of semantic and syntactic features
  • Example*:
    (h / |possible<latent|
       :domain (h2 / |obligatory<necessary|
          :domain (e / |eat,take in|
             :agent you
             :patient (c / |poulet|))))
    => You may have to eat chicken
• Template-based NLG
  • Simple to build
  • Input: a dialog act and/or a set of slot-value pairs
* from a Nitrogen demo website, http://www.isi.edu/natural-language/projects/nitrogen/

23. If n is set to a large enough number, most utterances generated by the LM-based NLG will be exact duplicates of utterances in the corpus. Stochastic NLG can thus also be thought of as a way to automatically build templates from a corpus.
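A quick check of this claim on an invented three-utterance corpus (the corpus, the enumeration helper, and the choice of n values are assumptions made only for illustration): with bigrams the model can recombine fragments into many novel strings, while with a long enough history every reachable output is a corpus utterance.

```python
# Toy demonstration: count what an n-gram model over a tiny corpus can emit.
from collections import defaultdict

corpus = ["what time do you want to depart",
          "what time would you like to depart",
          "when would you like to return"]

def all_outputs(n, max_words=10):
    """Enumerate every word string the n-gram model over `corpus` can generate."""
    nexts = defaultdict(set)
    for utt in corpus:
        words = ["<s>"] * (n - 1) + utt.split() + ["</s>"]
        for i in range(len(words) - n + 1):
            nexts[tuple(words[i:i + n - 1])].add(words[i + n - 1])
    outputs = set()
    def walk(history, so_far):
        if len(so_far) > max_words:
            return
        for w in nexts[tuple(history)]:
            if w == "</s>":
                outputs.add(" ".join(so_far))
            else:
                walk((history + [w])[-(n - 1):], so_far + [w])
    walk(["<s>"] * (n - 1), [])
    return outputs

for n in (2, 7):
    outs = all_outputs(n)
    dup = sum(1 for o in outs if o in corpus)
    print(f"n={n}: {len(outs)} possible outputs, {dup} are corpus duplicates")
# n=2: 12 possible outputs, 3 are corpus duplicates
# n=7: 3 possible outputs, 3 are corpus duplicates
```

On a realistic corpus some recombination survives even at high n (utterances that share long spans can still cross over), which is why the slide says "most", not "all".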

24. Tagging
• CMU corpus tagged manually
• SRI corpus tagged semi-automatically, using trigram language models built from the CMU corpus
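Tagging here means labeling each utterance with its utterance class and replacing specific values with attribute slots such as {depart_city} (see slide 25). One plausible reading of "semi-automatic tagging with trigram language models" — an assumption, not a description of the actual procedure — is to build a trigram model per class from the hand-tagged CMU data and assign each untagged SRI utterance to the class whose model scores it highest. The toy data, the add-one smoothing, and the function names below are invented for this sketch.

```python
# Hypothetical sketch of class assignment by per-class trigram models.
import math
from collections import Counter

def trigram_counts(utterances):
    """Count word trigrams, with <s> padding and an </s> end token."""
    grams = Counter()
    for utt in utterances:
        words = ["<s>", "<s>"] + utt.split() + ["</s>"]
        grams.update(zip(words, words[1:], words[2:]))
    return grams

def log_score(utterance, grams):
    """Crude add-one-smoothed trigram score: not a proper probability,
    but good enough for ranking classes in this sketch."""
    words = ["<s>", "<s>"] + utterance.split() + ["</s>"]
    denom = sum(grams.values()) + 1
    return sum(math.log((grams[tri] + 1) / denom)
               for tri in zip(words, words[1:], words[2:]))

# Hand-tagged CMU-style data (invented) ...
cmu_tagged = {
    "query_depart_time": ["what time would you like to leave",
                          "what time do you want to depart {depart_city}"],
    "query_return_date": ["what day are you returning",
                          "when would you like to return"],
}
models = {cls: trigram_counts(utts) for cls, utts in cmu_tagged.items()}

# ... used to label an untagged SRI-style utterance with the best-scoring class.
sri_utt = "what time would you like to depart"
print(max(models, key=lambda cls: log_score(sri_utt, models[cls])))
# -> query_depart_time
```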

25. Tags
Utterance classes (29): query_arrive_city, query_arrive_time, query_arrive_time, query_confirm, query_depart_date, query_depart_time, query_pay_by_card, query_preferred_airport, query_return_date, query_return_time, hotel_car_info, hotel_hotel_chain, hotel_hotel_info, hotel_need_car, hotel_need_hotel, hotel_where, inform_airport, inform_confirm_utterance, inform_epilogue, inform_flight, inform_flight_another, inform_flight_earlier, inform_flight_earliest, inform_flight_later, inform_flight_latest, inform_not_avail, inform_num_flights, inform_price, other
Attributes (24): airline, am, arrive_airport, arrive_city, arrive_date, arrive_time, car_company, car_price, connect_airline, connect_airport, connect_city, depart_airport, depart_city, depart_date, depart_time, depart_tod, flight_num, hotel, hotel_city, hotel_price, name, num_flights, pm, price

26. Stochastic NLG: Generation
• Given an utterance class, randomly generate a set of candidate utterances based on the LM distributions
• Generation stops when an utterance has a penalty score of 0 or the maximum number of iterations (50) has been reached
• Average generation time: 75 msec for Communicator dialogs
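A sketch of what sampling from a class-conditioned word n-gram model can look like, using a bigram model for brevity (the slides do not say which n the deployed system uses, and the mini-corpus below is invented):

```python
# Word-by-word sampling from a class-conditioned bigram model (illustrative).
import random
from collections import defaultdict

# Invented mini-corpus for one utterance class; the real tagged corpora are
# the CMU and SRI dialogs mentioned on slides 15 and 24.
TAGGED_CORPUS = {
    "query_depart_time": [
        "what time do you want to depart {depart_city}",
        "what time on {depart_date} would you like to depart",
        "what time would you like to leave",
    ],
}

def build_bigrams(utterances):
    """Collect word -> next-word transitions, with <s>/</s> boundary tokens.
    Keeping next words in a list preserves their corpus frequencies."""
    nexts = defaultdict(list)
    for utt in utterances:
        words = ["<s>"] + utt.split() + ["</s>"]
        for prev, nxt in zip(words, words[1:]):
            nexts[prev].append(nxt)
    return nexts

def sample_utterance(nexts, max_words=20):
    """Random walk through the bigram transitions until </s> or a length cap."""
    word, out = "<s>", []
    while len(out) < max_words:
        word = random.choice(nexts[word])
        if word == "</s>":
            break
        out.append(word)
    return " ".join(out)

bigrams = build_bigrams(TAGGED_CORPUS["query_depart_time"])
print(sample_utterance(bigrams))
# e.g. -> "what time would you like to depart {depart_city}"
#         (a recombination not itself in the corpus, as on slide 16)
```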

27. Stochastic NLG: Scoring
• Assign penalty scores for:
  • Unusual length of utterance (thresholds for too long and too short)
  • A slot in the generated utterance with an invalid (or no) value in the input frame
  • A "new" and "required" attribute in the input frame that is missing from the generated utterance
  • Repeated slots in the generated utterance
• Pick the utterance with the lowest penalty (or stop generating at an utterance with 0 penalty)
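A sketch of these four heuristics as a single penalty function. The weights (one point each), the length thresholds, and the treatment of "new"/"required" attributes as a required_slots set are assumptions; the slide gives no actual values.

```python
# Illustrative penalty function for the four heuristics listed above.
import re

def penalty(candidate, frame, required_slots, min_len=3, max_len=25):
    """Sum of penalty points; weights and thresholds are placeholders."""
    slots = re.findall(r"\{(\w+)\}", candidate)
    words = candidate.split()
    score = 0
    # 1. unusual utterance length (too long or too short)
    if not (min_len <= len(words) <= max_len):
        score += 1
    # 2. a slot whose value in the input frame is missing or empty
    score += sum(1 for s in slots if not frame.get(s))
    # 3. a required attribute in the frame that the utterance never mentions
    score += sum(1 for s in required_slots if s not in slots)
    # 4. repeated slots in the utterance
    score += len(slots) - len(set(slots))
    return score

frame = {"depart_date": "20000501"}
print(penalty("What time on {depart_date} would you like to depart on {depart_date}?",
              frame, required_slots={"depart_date"}))
# -> 1: the repeated {depart_date} slot is penalized, so the starred
#    utterance from slide 16 would lose to a cleaner candidate
```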

28. Stochastic NLG: Slot Filling
• Substitute slots in the utterance with the appropriate values from the input frame
Example:
  What time do you need to arrive in {arrive_city}?
  → What time do you need to arrive in New York?
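A sketch of slot filling, including turning an internal value such as the date code 20000501 into something speakable (as in the pipeline diagram on slide 12). The formatting rules below are guesses for illustration, not the system's actual conventions.

```python
# Illustrative slot filling with simple value formatting for dates.
import re
from datetime import datetime

def speakable(slot, value):
    """Render a raw slot value in a form suitable for synthesis.
    Only dates get special handling in this sketch."""
    if slot.endswith("_date"):
        return datetime.strptime(value, "%Y%m%d").strftime("%a, %B %d")
    return str(value)

def fill_slots(utterance, frame):
    """Substitute each {slot} with the (formatted) value from the input frame."""
    return re.sub(r"\{(\w+)\}",
                  lambda m: speakable(m.group(1), frame[m.group(1)]),
                  utterance)

frame = {"arrive_city": "New York", "depart_date": "20000501"}
print(fill_slots("What time do you need to arrive in {arrive_city}?", frame))
# -> What time do you need to arrive in New York?
print(fill_slots("What time on {depart_date}?", frame))
# -> What time on Mon, May 01?
```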

29. Stochastic NLG: Shortcomings
• What might sound natural (imperfect grammar, intentional omission of words, etc.) for a human speaker may sound awkward (or wrong) for the system.
• It is difficult to define utterance boundaries and utterance classes. Some utterances in the corpus may be a conjunction of more than one utterance class.
• Factors other than the utterance class may affect the words (e.g., discourse history).
• Some sophistication built into traditional NLG engines is not available (e.g., aggregation, anaphorization).

30. Evaluation
• Must be able to evaluate generation independent of the rest of the dialog system
• Comparative evaluation using dialog transcripts
  • Need more subjects
  • 8-10 dialogs; system output generated in batch mode by two different engines
• Evaluation of human travel agent utterances
  • Do users rate them well?
  • Is it good enough to model human utterances?

31. (Extra slide: the "What is NLG?" diagram from slide 3, showing the NLU direction only: text → semantic (syntactic) representation.)

32. (Extra slide: the complete diagram: NLU maps text to a semantic (syntactic) representation; NLG maps a semantic (syntactic) representation to text.)
