VUI Evaluation

sancha

Presentation Transcript


  1. VUI Evaluation: PARADISE Framework

  2. PARADISE Paradigm for Dialogue System Evaluation • Goal: Maximize User Satisfaction

  3. PARADISE Paradigm for Dialogue System Evaluation
     • Performance is modeled as a weighted function of a task-based success measure and dialogue-based cost measures, where the weights are estimated by regressing user satisfaction on those measures.
     • Dialogue tasks are represented as attribute value matrices (AVMs), i.e. sets of attribute-value pairs.

  4. PARADISE Paradigm for Dialogue System Evaluation
     • Advantages
        • The PARADISE approach addresses both performance and user satisfaction.
     • Disadvantages
        • Too complex to compute.
        • Needs a large sample size up front.

  5. Alternative Approaches
     • What's important?
        • Maximize User Satisfaction
        • Maximize Task Success

  6. User Satisfaction
     • How do we measure user satisfaction?
        • Questionnaires
        • Interviews
        • Focus Groups

  7. Task Success
     • How do we measure task success?
        • Logging Actual Use
        • Performance Measurement
        • Walkthroughs
        • Pilot Testing

  8. Task Success
     • Establish AVMs for each dialogue and for the entire conversation.
     • Measure task success with respect to:
        • Task completion time
        • Accuracy or errors (e.g. misinterpretations)

  9. Conclusions
     • PARADISE is good, but too complex!
     • Measure user satisfaction and task success.
     • Develop a formula that considers Task Completion Time, Accuracy/Errors and User Satisfaction.
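
The performance function described on slide 3 can be sketched in Python as below. The sketch assumes that task success is a kappa coefficient computed from a confusion matrix over AVM attribute values (here plain Cohen's kappa, with chance agreement taken from row and column totals), that every measure is z-score normalized across dialogues, and that performance is the weighted success term minus the weighted cost terms. The function names, the weights and the sample numbers are illustrative assumptions, not values from the presentation.

```python
from statistics import mean, pstdev


def kappa(confusion):
    """Cohen's kappa for a square confusion matrix.

    Rows are the expected AVM values, columns are the values the
    system actually obtained (an assumed layout for this sketch).
    """
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of cases on the diagonal.
    p_agree = sum(confusion[i][i] for i in range(len(confusion))) / total
    # Chance agreement from row and column totals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    p_chance = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (p_agree - p_chance) / (1 - p_chance)


def z_scores(values):
    """Normalize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def performance(kappas, costs, alpha, weights):
    """Per-dialogue score: alpha * N(kappa) - sum_i w_i * N(cost_i).

    kappas:  one task-success value per dialogue
    costs:   dict mapping a cost name to one value per dialogue
    alpha:   weight on task success (assumed, normally fitted from data)
    weights: dict mapping each cost name to its weight (assumed)
    """
    norm_kappa = z_scores(kappas)
    norm_costs = {name: z_scores(vals) for name, vals in costs.items()}
    scores = []
    for d in range(len(kappas)):
        cost_term = sum(weights[name] * norm_costs[name][d] for name in costs)
        scores.append(alpha * norm_kappa[d] - cost_term)
    return scores


# Hypothetical data: three dialogues, one cost measure (number of turns).
example_scores = performance(
    kappas=[0.2, 0.6, 1.0],
    costs={"turns": [30, 20, 10]},
    alpha=0.5,
    weights={"turns": 0.3},
)
```

With these made-up inputs, the dialogue with the highest kappa and the fewest turns receives the highest score, which is the behavior the weighted function is meant to capture.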

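The presentation does not say what form the formula proposed on slide 9 should take. One possible reading, sketched here under stated assumptions, is a linear model that predicts user satisfaction from task completion time and error count, with weights fitted by ordinary least squares. The helper names and the sample data are hypothetical.

```python
def fit_linear_weights(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.

    rows:    one list of predictor values per dialogue,
             e.g. [completion_time_seconds, error_count]
    targets: one user satisfaction score per dialogue
    Returns [intercept, w_1, ..., w_k].
    """
    X = [[1.0] + [float(v) for v in row] for row in rows]
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; cannot fit weights")
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(n):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]


def predict(weights, row):
    """Predicted satisfaction for one dialogue's predictor values."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], row))


# Hypothetical logs: [completion time in seconds, number of errors].
logged = [[60, 0], [120, 1], [90, 2], [200, 3], [150, 0]]
# Hypothetical questionnaire scores on a 1-5 scale.
satisfaction = [4.4, 3.3, 3.1, 1.5, 3.5]

fitted = fit_linear_weights(logged, satisfaction)
```

The fitted weights indicate how much each extra second or error costs in satisfaction points for this sample. A small pilot study gives noisy weights, so the result should be treated as a starting point rather than a validated metric.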