
Spoken Language Generation: Information Presentation



  1. Spoken Language Generation:Information Presentation Johanna D. Moore University of Edinburgh Human Communication Research Centre School of Informatics

  2. Introduction • Much of the research in SDS has focused on the information gathering phase of the dialogue • Communicator evaluation indicates that task duration is negatively correlated with user satisfaction (r = -0.31, p < .001) (Walker, Passonneau, Boland, ACL 2001) • What contributes most to dialogue duration?

  3. Communicator 2001 Dataset

  4. Typical information presentation
  SYS: I found 9 round trips from Hartford to Orlando on the 31st and returning to Hartford on November 6th. For option 1, I have an outbound flight with U S Air departing at 7 AM and arriving at 9 49 AM non-stop. A return flight is with U S Air departing at 10 30 AM and arriving at 12 28 AM with 1 stopover. Ticket price is 165 dollars. Please say "next option", or, "flight details", or, "I'll take it".
  USER: NEXT OPTION
  SYS: For option 2, I have an outbound flight with U S Air departing at 3 30 PM and arriving at 6 19 PM nonstop. A return flight is with U S Air departing at 8 25 PM and arriving at 12 28 AM with 1 stopover. Ticket price is 165 dollars. Say "next option", or, "flight details", or, "I'll take it".
  USER: NEXT OPTION
  SYS: . . .

  5. Current spoken dialogue systems • Presents each option and its selected attributes individually • No comparisons • Relies on users to remember trade-offs between multiple options • Places a large memory burden on users • Potentially large payoff for improvements in information presentation

  6. Improving Information Presentation • Apply techniques from text generation • Allows customization to user(s) and discourse history • Improve quality of synthesis by using NL generator to provide info about both meaning and linguistic structure of utterance • Overview of talk: • Case study: FLIGHTS system • Statistical approaches to generation See also Computer Speech and Language (2002) 16. Special Issue on Spoken Language Generation

  7. What NLG Can Do For You…
  User: I want to travel from Edinburgh to Brussels, arriving by 5 pm.
  System (for a starving student): There’s a direct flight on BMI with a good price. It arrives at four ten p.m. and costs one hundred and twelve pounds. The cheapest flight is on Ryanair. It arrives at two p.m. and it costs just fifty pounds, but you’d need to connect in Dublin.
  System (for a business traveller): You can fly business class on British Airways, arriving at four twenty p.m., but you’d need to connect in London Heathrow. There is a direct flight on BMI, arriving at four ten p.m., but there’s no availability in business class.

  8. NLG pipeline
  [Architecture diagram] Communicative goals drive a Text Planner (content selection, discourse planning), which draws on knowledge sources: discourse strategies, the dialogue history, a domain model, and a user model. The resulting text plan goes to a Sentence Planner (aggregation, referring expression generation, lexical choice), which draws on linguistic knowledge sources: aggregation rules, a referring expression generation algorithm, a lexicon, and a grammar. The sentence plan(s) then go to a Sentence Realizer, which produces English.

  9. FLIGHTS architecture
  [Architecture diagram] User input goes to ASR (HTK), which produces a text string for the Natural Language Understander (word spotting); its semantic interpretation goes to the Dialogue Manager (DIPPER). The dialogue manager passes communicative goals to the Response Generator, whose stages are Content Selection, Text Planning (O-Plan), Sentence Planner (XSLT), and Realizer (OpenCCG), consulting the User Model and the Flight DB. The resulting text string with APML markup goes to TTS (Festival), which produces the system response.

  10. Customization happens everywhere • Content selection: what flights and attributes to present to user • Discourse planning: ordering of content, discourse relations • Referring Expression Generation: e.g., The cheapest flight, the five-fifteen, a KLM flight • Aggregation: grouping propositions into clauses and sentences, e.g., There’s a KLM flight arriving Brussels at ten to five, but business class is not available and you’d need to connect in Amsterdam • Discourse cues: e.g., Although, because, but • Scalar Adjectives: e.g., good price, just fifty pounds

  11. Content Selection • Need a domain (or genre) specific method for determining what to say • In FLIGHTS: • Rank options based on predicted utility for the user • Select all options whose value is over a threshold • Select attributes that contribute most to value of selected options (Moore, Foster, Lemon & White, FLAIRS 2004, Carenini & Moore, AI Journal, 2006)
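To make the ranking concrete, here is a minimal sketch of additive multi-attribute utility selection as the slide describes it; the attribute names, weights, threshold, and data structures are illustrative assumptions, not the actual FLIGHTS implementation.

```python
# A minimal sketch of utility-based content selection: rank options by a
# weighted sum of attribute scores under a user model, keep options above a
# threshold, and keep the attributes contributing most to each kept option.

def utility(option, user_model):
    """Additive multi-attribute utility over scores normalised to [0, 1]."""
    return sum(user_model[a] * v for a, v in option["scores"].items())

def select_content(options, user_model, threshold=0.6, n_attrs=2):
    selected = []
    for opt in options:
        u = utility(opt, user_model)
        if u >= threshold:
            # Attributes ranked by their contribution to this option's utility.
            by_contribution = sorted(
                opt["scores"], key=lambda a: user_model[a] * opt["scores"][a],
                reverse=True)
            selected.append((opt["name"], round(u, 2), by_contribution[:n_attrs]))
    return sorted(selected, key=lambda t: t[1], reverse=True)

# Hypothetical "starving student" model: price matters most.
student = {"price": 0.6, "direct": 0.3, "arrival_time": 0.1}
flights = [
    {"name": "Ryanair", "scores": {"price": 1.0, "direct": 0.0, "arrival_time": 0.8}},
    {"name": "BMI",     "scores": {"price": 0.4, "direct": 1.0, "arrival_time": 0.9}},
]
print(select_content(flights, student))
# [('Ryanair', 0.68, ['price', 'arrival_time']), ('BMI', 0.63, ['direct', 'price'])]
```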

  12. Discourse Planning • Using discourse strategies for producing user-adapted recommendations, comparisons • Produces text plans consisting of basic dialogue acts and rhetorical relations • Orders presentation of options • Groups attributes into positive and negative lists for contrasts • Selects attributes to identify flights • a direct flight, the cheapest flight, the KLM flight • Marks items as theme/rheme for information structure
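One of these steps, grouping attributes into positive and negative lists for a contrast, can be sketched as follows; the 0.5 cut-off, the importance ordering, and all names are assumptions for illustration, not the system's rules.

```python
# An illustrative sketch of grouping an option's attributes into positive and
# negative lists for a contrastive presentation ("..., but ...").

def group_for_contrast(option_scores, user_model, cutoff=0.5):
    positives = [a for a, s in option_scores.items() if s >= cutoff]
    negatives = [a for a, s in option_scores.items() if s < cutoff]
    # Mention the attributes the user cares about most first.
    importance = lambda a: -user_model.get(a, 0.0)
    return sorted(positives, key=importance), sorted(negatives, key=importance)

pos, neg = group_for_contrast({"price": 1.0, "direct": 0.0},
                              {"price": 0.6, "direct": 0.3})
print(pos, neg)  # ['price'] ['direct'] -> "It costs just fifty pounds,
                 #                          but you'd need to connect in Dublin."
```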

  13. Information Structure • Theme/Rheme • Theme: part of utterance that connects it to prior discourse • Rheme: part of utterance that advances the discussion by contributing novel information • Theme and rheme phrases marked by distinctive combinations of pitch accents and boundary tones • Focus/Background • Focus: words whose interpretations contribute to distinguishing the theme or rheme from other contextually available alternatives; marked by pitch accents • Background: the unmarked parts of themes and rhemes (Steedman 1991-2002)

  14. Examples
  Ex 1: I know when the Ryanair flight LEAVES, but when does it ARRIVE?
        (The Ryanair flight ARRIVES_focus)_theme (at FIVE_focus)_rheme
         L+H* LH%                                 H* LL%
  Ex 2: I know the KLM flight arrives at FOUR, but which flight arrives at FIVE?
        (The RYANAIR_focus flight)_rheme (arrives at FIVE_focus)_theme
         H* LL%                           L+H* LH%

  15. Assigning theme/rheme in FLIGHTS • First option: all rheme • Subsequent items: identifying, contrastive information is theme • Implements the notion of an implicit Question Under Discussion, e.g., after presenting a flight that’s not direct, there’s an implicit question: Are there any direct flights? You can fly business class on British Airways, arriving at four twenty p.m., but you’d need to connect in Manchester. [There’s a DIRECT flight]_theme on BMI, arriving at four ten p.m., but there’s no availability in business class.
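A toy sketch of this rule, using an invented flat attribute-map representation (the real system operates on text plans, not dictionaries):

```python
# First option: everything is rheme. Later options: the identifying,
# contrastive attribute (the one answering the implicit question) is theme.

def assign_information_structure(options, identifying_attr):
    structures = []
    for i, option in enumerate(options):
        structures.append({
            attr: "theme" if i > 0 and attr == identifying_attr else "rheme"
            for attr in option})
    return structures

flights = [{"airline": "British Airways", "direct": False},
           {"airline": "BMI", "direct": True}]
print(assign_information_structure(flights, identifying_attr="direct"))
# [{'airline': 'rheme', 'direct': 'rheme'}, {'airline': 'rheme', 'direct': 'theme'}]
```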

  16. Controlling Intonation with OpenCCG • OpenCCG realizer adapts previous work on chart realization to CCG, enabling CCG’s unique accounts of coordination and intonation to be employed in NLG systems • Uses information structure to determine types and locations of pitch accents and boundary tones • Scores candidate realizations with an n-gram language model • Treats agenda as priority queue ordered by n-gram scores • Yields best-first anytime algorithm: returns best-scoring realization at “any time”, for interactive applications (White and Baldridge, EWNLG9 2003; White, INLG 2004; White, RLaC 2006)
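The best-first anytime loop itself is compact. The sketch below is a stand-in for OpenCCG's chart realizer, not the real thing: a toy bigram table and expansion function replace grammar-driven edge combination, but the agenda-as-priority-queue idea is the same.

```python
# Partial realisations sit on an agenda (priority queue) ordered by n-gram
# score; the best complete one seen so far can be returned at "any time".

import heapq

def best_first_realise(start, expand, score, is_complete, max_steps=1000):
    agenda = [(-score(start), start)]          # max-heap via negated scores
    best = None
    for _ in range(max_steps):
        if not agenda:
            break
        neg, item = heapq.heappop(agenda)
        if is_complete(item):
            if best is None or -neg > best[0]:
                best = (-neg, item)            # the "anytime" answer so far
            continue                           # complete edges are not expanded
        for nxt in expand(item):
            heapq.heappush(agenda, (-score(nxt), nxt))
    return best

# Toy usage: order a bag of words under an invented bigram table.
bigram = {("the", "cheapest"): 0.9, ("cheapest", "flight"): 0.9,
          ("the", "flight"): 0.4, ("flight", "cheapest"): 0.1}
words = {"the", "cheapest", "flight"}
score = lambda seq: sum(bigram.get(p, 0.01) for p in zip(seq, seq[1:]))
expand = lambda seq: [seq + (w,) for w in words - set(seq)]
print(best_first_realise(("the",), expand, score,
                         is_complete=lambda s: len(s) == 3))
# (1.8, ('the', 'cheapest', 'flight'))
```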

  17. Examples
  The cheapest(L+H*) flight(LH) is on Ryanair(H* LH). It arrives at two p.m.(H* LH) and it costs just fifty(H*) pounds(H* LH), but you’d need to connect(H*) in Dublin(H* LL).
  [audio: unit selection | limited domain with APML markup]
  Even though the first(L+H*) flight is not on BMI(L+H* LH), it is the cheapest(H*) one available(LH).
  [audio: unit selection | limited domain with APML markup]

  18. Examples
  Q: I'd like a cheap flight from Frankfurt to Geneva, please. And I'd prefer to fly direct.
  A: There's a direct flight on Lufthansa with a good price, arriving in Geneva at ten thirty nine am and it costs two hundred and fifty five pounds. The cheapest flight is on Air France arriving at one twenty five pm and it costs only one hundred and five pounds, but it requires a connection in Paris Charles de Gaulle.
  [audio: limited domain | limited domain with APML markup]

  19. Is Tailoring Effective? Evaluation in MATCH Project: • Restaurant recommendation system built using the same user modeling techniques • Subjects heard dialogues where recommendations and comparisons were based on their own user model or a random other model • Subjects judged tailored responses significantly higher • Information quality: System’s response is easy to understand and provides exactly the information I am interested in when choosing a restaurant. • Ranking confidence: Recommended restaurant is somewhere I would like to go. (Walker, Whittaker, Stent, Maloor, Moore, Johnston, Vasireddy, Cognitive Science 28, 2004)

  20. Does Intonation Matter? • Affects meaning • “She only ATE the banana” vs. • “She only ate the BANANA” • Human judgements of output in the travel domain show that, overall, German speech produced with GToBI markup was judged better than default intonation (Kruijff-Korbayova, EACL03) • Naturalness • (Ease of comprehension)

  21. Evaluation • Compared three synthesizers • Unit Selection Multisyn • Limited Domain • Limited Domain APML • Hypotheses: • LD_APML >> USM • LD_APML > LD (Neide Franca Rocha, MSc, 2004)

  22. Results: US vs. LD_APML

  23. Results: LD vs. LD_APML

  24. Using N-gram LM in Generation

  25. UM Approach to Info Pres
  + UM provides the information users need to make choices with high confidence
  + Enables concise presentation of options and their tradeoffs
  + Users prefer recommendations tailored to their model
  - Doesn’t scale to a large number of options
  - Does not provide users with an overview of the options
  - Users may perceive that they’ve missed out on options

  26. Summarize-and-Refine Approach • Clusters options in database • Based on attributes that yield smallest number of clusters • Summarize these clusters • enumerate, majority, count, singleton • User provides additional constraints (Polifroni, Chung & Seneff, Eurospeech 2003, Chung ACL 2004)
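A minimal sketch of that loop: pick the attribute that partitions the current options into the fewest clusters, then pick a summary style. The thresholds and summary templates below are illustrative assumptions, not Polifroni et al.'s.

```python
# Summarize-and-refine in miniature: attribute choice by fewest clusters,
# summary via singleton / enumerate / majority / count.

from collections import Counter

def best_attribute(options, attributes):
    """The attribute whose values partition options into the fewest clusters."""
    return min(attributes, key=lambda a: len({o[a] for o in options}))

def summarize(options, attribute, enum_max=4):
    counts = Counter(o[attribute] for o in options)
    if len(counts) == 1:                        # singleton: one shared value
        return f"All of them are {next(iter(counts))}."
    if len(counts) <= enum_max:                 # enumerate the clusters
        return "They are " + ", ".join(map(str, counts)) + "."
    value, n = counts.most_common(1)[0]
    if n > len(options) / 2:                    # majority value
        return f"Most of them are {value}."
    return f"There are {len(counts)} choices for {attribute}."   # count

restaurants = ([{"city": "Boston", "cuisine": "seafood"}] * 3
               + [{"city": "Cambridge", "cuisine": "italian"}] * 2)
attr = best_attribute(restaurants, ["city", "cuisine"])
print(attr, "->", summarize(restaurants, attr))  # city -> They are Boston, Cambridge.
```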

  27. Example
  User: What restaurants do you have?
  Sys: I found 983 restaurants. Most of them are located in Boston and Cambridge. There are 32 choices for cuisine. I also have information about price range.
  User: Okay, tell me about the ones in Boston.
  Sys: I found 401 restaurants in Boston. There are 29 choices for cuisine.
  User: Do you have any that serve seafood?
  Sys: I have found 19 seafood restaurants. They are predominantly in the Back Bay, the North End, South Boston and the South End.
  User: In the South End.
  Sys: I have found 3 restaurants. Grill Fish on …

  28. SR Approach
  + Useful when users don’t know the structure of the database
  + Domain independent: no fixed set of attributes; makes dynamic decisions based on analysis of the data
  - Suboptimal choice of attribute(s) for summarization
  - Exploration of tradeoffs is difficult
  - May include irrelevant entities

  29. The Combined UM+SR Approach • select relevant options • structure them based on ranking of attributes in user model • automatically determine tradeoffs • tailor summaries • improve overview of options space by briefly summarizing irrelevant options (Demberg and Moore, EACL 2006)
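Compressed into a sketch with assumed interfaces (not the implementation from Demberg and Moore 2006): user-model utility selects and orders the relevant options, and the pruned remainder is summarized in one line so the user keeps an overview of the whole option space.

```python
# The combined strategy in miniature: select relevant options via the user
# model, order them, and briefly summarise the irrelevant rest.

def combined_um_sr(options, user_model, threshold=0.5):
    def utility(o):
        return sum(user_model.get(a, 0.0) * v for a, v in o["scores"].items())
    ranked = sorted(options, key=utility, reverse=True)
    relevant = [o["name"] for o in ranked if utility(o) >= threshold]
    top = max(user_model, key=user_model.get)    # user's top-ranked attribute
    n_rest = len(options) - len(relevant)
    overview = f"All other flights score worse on {top}." if n_rest else ""
    return relevant, overview

student = {"price": 0.6, "direct": 0.3}
flights = [{"name": "Ryanair", "scores": {"price": 1.0, "direct": 0.0}},
           {"name": "BMI",     "scores": {"price": 0.4, "direct": 1.0}},
           {"name": "BA",      "scores": {"price": 0.1, "direct": 0.0}}]
print(combined_um_sr(flights, student))
# (['Ryanair', 'BMI'], 'All other flights score worse on price.')
```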

  30. Example Dialogue Turn
  User profile “student” (attributes ranked by importance): 1 price; 2 number of legs; departure time, arrival time, travel time; 6 airline; fare class; layover airport
  User: I'd like to book a flight from Edinburgh to Brussels for tomorrow.
  Sys: Ryan Air offers the cheapest flights to Brussels. They cost just 49 pounds but you need to connect in Dublin. There are two flights to choose from. There's an early flight leaving Edinburgh at 8:05 am and arriving at 12:45 pm. To leave later, you can take the 1 pm flight arriving Brussels at 5:30 pm. If you want to fly direct, there's a flight on BMI that leaves Edinburgh at 12 pm. It arrives at 1:35 pm and costs 112 pounds. All other flights are more expensive.

  31. Evaluation Within-participants laboratory experiment • 38 subjects • Participants read 6 dialogue pairs (UM+SR vs. SR) Four questions after each dialogue: • Did the system give the information in a way that was easy to understand? • Did the system give X a good overview of the available options? • Do you think there were better options the system did not tell X about? • How quickly did the system allow X to find the optimal flight? Forced-choice question after each pair: • Which system would you recommend to a friend?

  32. Results - Forced Choice Q.

  33. Results - Likert Scale Questions
  [Bar chart: mean Likert scale values for UM+SR vs. SR on Q2 Understandability, Q3 Overview, Q4 Confidence, Q5 Quick access (1-3 scale); y-axis 1.00 to 7.00]
  Significance levels using two-tailed paired t-test: Q2: p = 0.97; Q3: p < 0.0001; Q4: p < 0.0001; Q5: p < 0.001

  34. Exp 2: Overhearer mode
  [Bar chart: mean Likert scale values for UM+SR vs. SR on Q2 Understandability, Q3 Overview, Q4 Confidence, Q5 Quick access; bar values 5.82, 5.68, 5.67, 5.34, 5.24, 4.71, 2.42, 2.31]
  Significance levels using two-tailed paired t-test: Q2: p = 0.24; Q3: p < 0.01; Q4: p < 0.002; Q5: p < 0.10

  35. Summary Integration of UM and clustering allows the system to • navigate through a large set of options • structure options according to users' valuations • present only relevant options • automatically present tradeoffs between options Results in • increased overall user satisfaction • better overview of the options • increased user confidence in the system

  36. Learning Content Selection Rules • Content selection rules for biographical summaries (Duboue & McKeown, EMNLP 2003) • Uses a corpus of textual biographies and a corresponding frame-based knowledge representation • Anchor-based alignment of extracted facts with sentences in the text corpus • Learns whether a semantic unit should be included in the biography • Recall 94%, F-score 51% • Induces rules from the included material

  37. Learning Content Selection Rules • Collective classification for content selection (Barzilay and Lapata, HLT/EMNLP 2005) • Again, a binary classification task • All candidates are considered simultaneously • Improves coherence because semantically related items are often selected together • Evaluation: aligned newswire summaries of NFL games with a database of events • Recall 76.5%, F-score 60.15% • Chosen events are included in the summary (as in extractive summarization)
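As a rough illustration of the framing shared by both papers, content selection reduces to labelling each fact include/exclude. The features and data below are invented, and this plain per-fact classifier ignores the collective link structure that Barzilay and Lapata exploit.

```python
# Each database fact becomes a feature vector, labelled 1 if aligned human
# texts included it; a classifier then decides inclusion for new facts.

from sklearn.linear_model import LogisticRegression

# Toy features: [fact_type_id, normalised_magnitude, is_record_breaking]
X = [[0, 0.9, 1], [0, 0.2, 0], [1, 0.8, 1], [1, 0.1, 0], [0, 0.7, 0], [1, 0.9, 1]]
y = [1, 0, 1, 0, 1, 1]   # 1 = fact appeared in the aligned human summary

clf = LogisticRegression().fit(X, y)
print(clf.predict([[0, 0.85, 1], [1, 0.05, 0]]))   # e.g. [1 0]: include the first
```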

  38. Learning for Sentence Planning & Realization • SPaRKy (Stent, Prasad & Walker, ACL 2004) • Input: a content plan (a set of dialogue acts and the rhetorical relations among them) • Learns sentence plans from a set of human-ranked training examples • Oh & Rudnicky, CS&L, 2002 • Produces surface realizations for sentence plans based on n-gram statistics • Achieves performance comparable to hand-crafted versions
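A simplified overgenerate-and-rank stand-in for the statistical realization idea (Oh & Rudnicky actually generate from class-based n-gram models directly): fill hand-written templates, then pick the candidate an n-gram language model scores highest. The templates and bigram log-probabilities are invented.

```python
# Rank template-generated candidates with a toy bigram language model.

bigram_logprob = {("flights", "from"): -0.5, ("from", "Hartford"): -1.0,
                  ("flights", "leaving"): -1.5, ("leaving", "Hartford"): -0.7,
                  ("depart", "Hartford"): -2.5}

def lm_score(words, unseen=-5.0):
    return sum(bigram_logprob.get(p, unseen) for p in zip(words, words[1:]))

def realize(slots):
    templates = ["flights from {city}", "flights leaving {city}",
                 "flights that depart {city}"]
    candidates = [t.format(**slots).split() for t in templates]
    return " ".join(max(candidates, key=lm_score))

print(realize({"city": "Hartford"}))   # -> flights from Hartford
```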

  39. Credits: The FLIGHTS System Fancy Linguistically Inspired Generation of Highly Tailored Speech. Rob Clark, Steve Conway, Mary Ellen Foster, Kallirroi Georgila, Oliver Lemon, Michael White. Thanks to: UK Engineering and Physical Sciences Research Council
