The Voice-Enabled Web: VoiceXML and Related Standards for Telephone Access to Web Applications

The Voice-Enabled Web: VoiceXML and Related Standards for Telephone Access to Web Applications. 14 Feb. 2002 Christophe Strobbe K.U.Leuven - ESAT-SCD-DocArch. Overview. Voice browsers History of voice markup languages W3C Speech Interface Framework Communication Architecture VoiceXML 2.0


Presentation Transcript


  1. The Voice-Enabled Web: VoiceXML and Related Standards for Telephone Access to Web Applications 14 Feb. 2002 Christophe Strobbe K.U.Leuven - ESAT-SCD-DocArch

  2. Overview • Voice browsers • History of voice markup languages • W3C Speech Interface Framework • Communication Architecture • VoiceXML 2.0 • Grammars • SALT • Not covered: WAP/WML, Voice over IP

  3. Voice Browser Device (hardware and software) that interprets voice markup languages to generate voice output and interpret voice input.

  4. Companies

  5. History 1990s: companies developed their own markup languages: • PhoneML (AT&T) • PhoneML (Lucent) • VoxML (Motorola) • TalkML (HP Labs) • SpeechML (IBM) => VoiceXML Forum : VoiceXML 1.0 • 1998: W3C Voice Browser Workshop

  6. VoiceXML Specification History • April 1999 – Initial spec – Request For Comment • August 1999 – 0.9 Spec released • March 2000 – 1.0 Spec released • October 2001 – 2.0 Working Draft (W3C) • March 2002 – next Working Draft • 4th quarter 2002 – 2.0 Recommendation W3C?

  7. Why Voice Markup Languages? • “Voicifying” web pages by adding a few VoiceXML tags is not feasible: • basic design principles that make a good web page are very different from those that make an efficient voice interface • e.g. Raggett & Ben-Natan: “Voice Browsers” (W3C, 1998) • … unless you want to create a multimodal interface (cf. SALT) ?

  8. Speech Interface Framework [Diagram; only the component labels survive] • Markup languages: N-gram Grammar ML, Speech Recognition Grammar ML, Natural Language Semantics ML, VoiceXML 2.0, Speech Synthesis ML • Resources: Lexicon, Reusable Components • Input components: DTMF tone recognizer, ASR, Language Understanding, Context Interpretation • Dialog Manager • Output components: Media Planning, Language Generation, TTS, Prerecorded audio player • External: User, Telephone System, World Wide Web

  9. Communication Architecture

  10. What is VoiceXML? For creating audio dialogs that include • Synthesized speech • Digitized audio • Recognition of spoken and DTMF key input • Recording of spoken input • Telephony • Mixed-initiative conversations Major goal: bring the advantages of web-based development and content delivery to interactive voice response applications.
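
As a rough illustration of the capabilities listed on this slide, the sketch below combines digitized audio with a synthesized fallback, spoken or DTMF digit input, and recording in a single form. It is written against VoiceXML 2.0 as later published by the W3C, so platforms of the period may need adjustments; the file name `welcome.wav` and the form and field names are hypothetical.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="voicemail">
    <block>
      <prompt>
        <!-- Digitized audio; the enclosed text is synthesized if the file cannot be played. -->
        <audio src="welcome.wav">Welcome to the message service.</audio>
      </prompt>
    </block>
    <!-- Built-in "digits" type: accepts spoken digits or DTMF key presses. -->
    <field name="mailbox" type="digits">
      <prompt>Please say or key in your mailbox number.</prompt>
    </field>
    <!-- Record spoken input; pressing any DTMF key ends the recording. -->
    <record name="message" beep="true" maxtime="30s" dtmfterm="true">
      <prompt>Leave your message after the beep.</prompt>
    </record>
    <block>
      <prompt>
        Your message for mailbox <value expr="mailbox"/> is: <audio expr="message"/>
      </prompt>
    </block>
  </form>
</vxml>
```

The form items are visited in document order, so the caller hears the greeting, gives a mailbox number, records a message, and then hears it played back.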

  11. Advantages of VoiceXML As perceived by Motorola et al: • People want a better mobile user interface while on the go • Device Independent • Open standards create and drive market demand • Easy to program since similar to other XML-based languages • Utilizes existing web infrastructure

  12. Developing applications • To develop VoiceXML applications you have to learn several languages: • VoiceXML • ECMAScript (JavaScript/JScript) • a grammar format (GSL, JSGF, Speech Recognition Grammar Specification) • a back-end scripting language (Perl, Java, …) • Web developers are used to this kind of environment

  13. VoiceXML Basics • XML-based • More structured than HTML (describes structure and semantics of data, not presentation) • Must close all tags (e.g. <prompt> </prompt>) • Structure of language described in a Document Type Definition (DTD)

  14. VoiceXML Applications • An application consists of a single application root document as well as zero or more other documents • The application root document is loaded whenever any other document is accessed • The application root document grammars and variables are visible in other application documents [Diagram: a root document and three other documents]
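
A minimal sketch of the root-document relationship described on this slide, assuming VoiceXML 2.0 syntax. The file names `root.vxml`, `order.vxml` and `help.vxml`, the variable `shop`, and its value are hypothetical; the two documents are shown in one listing but would be stored as separate files.

```xml
<!-- root.vxml: hypothetical application root document -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Variables declared here are visible to every document of the application. -->
  <var name="shop" expr="'Example Cafe'"/>
  <!-- This link's grammar is active in every document of the application. -->
  <link next="help.vxml">
    <grammar type="application/srgs+xml" version="1.0" root="help">
      <rule id="help">help</rule>
    </grammar>
  </link>
</vxml>

<!-- order.vxml: a separate file that names root.vxml as its application root -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" application="root.vxml">
  <form>
    <block>
      <!-- The root document's variable is read through the application scope. -->
      <prompt>Welcome to <value expr="application.shop"/>.</prompt>
    </block>
  </form>
</vxml>
```

Putting shared variables and global commands such as "help" in the root document avoids repeating them in every other document.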

  15. VoiceXML Documents • Documents can contain two types of dialogs: • forms (<form>) • menus (<menu>) • Other elements: • <meta>: metadata, defined as name/value pair • <var>: for declaring variables • <script>: for client-side ECMAScript • <catch>: for catching events • <link>: transitions to other dialogs
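
The document-level elements listed on this slide could fit together roughly as follows. This is a sketch in VoiceXML 2.0 syntax, not taken from the presentation; the variable `misses`, the function `countLabel`, the dialog ids and the e-mail address are hypothetical.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- <meta>: metadata as a name/value pair -->
  <meta name="maintainer" content="maintainer@example.org"/>
  <!-- <var>: a document-scope variable -->
  <var name="misses" expr="0"/>
  <!-- <script>: client-side ECMAScript, callable from expr attributes -->
  <script><![CDATA[
    function countLabel(n) { return n + (n == 1 ? " miss" : " misses"); }
  ]]></script>
  <!-- <catch>: handles nomatch and noinput events raised in any dialog below -->
  <catch event="nomatch noinput">
    <assign name="misses" expr="misses + 1"/>
    <prompt>Sorry, I did not get that. <value expr="countLabel(misses)"/> so far.</prompt>
    <reprompt/>
  </catch>
  <!-- <link>: saying "quit" in any dialog transitions to the goodbye form -->
  <link next="#goodbye">
    <grammar type="application/srgs+xml" version="1.0" root="quit">
      <rule id="quit">quit</rule>
    </grammar>
  </link>
  <form id="confirm">
    <field name="answer" type="boolean">
      <prompt>Shall I continue? Say yes, no, or quit.</prompt>
      <filled><prompt>You answered <value expr="answer"/>.</prompt></filled>
    </field>
  </form>
  <form id="goodbye">
    <block><prompt>Goodbye.</prompt><exit/></block>
  </form>
</vxml>
```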

  16. Forms and menus • Forms may contain zero or more <field> elements • the user must provide a value for the field before proceeding to the next element in the form • each field may specify a grammar that defines the allowable inputs • Menus may contain one or more <choice> elements • a menu presents the user with a choice of options and then transitions to another dialog
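
The slides give form examples but no menu example. A minimal menu might look like the sketch below, again assuming VoiceXML 2.0 syntax; the dialog ids and prompt texts are hypothetical.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <menu id="main">
    <!-- <enumerate/> reads out the text of each <choice> in turn. -->
    <prompt>Please choose one of: <enumerate/></prompt>
    <!-- Each choice names the dialog to transition to when it is selected. -->
    <choice next="#news">news</choice>
    <choice next="#weather">weather</choice>
    <noinput>I did not hear anything. <reprompt/></noinput>
  </menu>
  <form id="news">
    <block><prompt>Here are the headlines.</prompt></block>
  </form>
  <form id="weather">
    <block><prompt>Here is the forecast.</prompt></block>
  </form>
</vxml>
```

Unlike a field, a menu stores no value: its only job is to pick the next dialog.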

  17. VoiceXML Example

      <?xml version="1.0"?>
      <!-- helloworld.vxml (the XML declaration must come before any comment) -->
      <vxml version="1.0">
        <form>
          <block>
            <prompt>
              Hello World!
            </prompt>
          </block>
        </form>
      </vxml>

  18. Example with Grammar

      <vxml version="1.0">
        <meta name="maintainer" content="christophe@docarch.be"/>
        <form id="hello">
          <field name="item">
            <prompt>Would you like coffee, tea, or juice?</prompt>
            <!-- GSL grammar: the square brackets list the alternatives -->
            <grammar type="application/x-gsl">
              [coffee tea juice]
            </grammar>
            <filled>
              <prompt>Your <value expr="item"/>
                will be ready momentarily</prompt>
            </filled>
          </field>
        </form>
      </vxml>
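
For comparison, the same field could be written with an inline grammar in the XML form of the W3C Speech Recognition Grammar Specification, one of the grammar formats named on slide 12. This is a sketch assuming a VoiceXML 2.0 platform that accepts `application/srgs+xml`; the rule name `drink` is hypothetical.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="hello">
    <field name="item">
      <prompt>Would you like coffee, tea, or juice?</prompt>
      <!-- SRGS equivalent of the GSL alternatives [coffee tea juice] -->
      <grammar type="application/srgs+xml" version="1.0" root="drink">
        <rule id="drink">
          <one-of>
            <item>coffee</item>
            <item>tea</item>
            <item>juice</item>
          </one-of>
        </rule>
      </grammar>
      <filled>
        <prompt>Your <value expr="item"/> will be ready momentarily.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```

GSL is specific to one vendor's recognizer, whereas the SRGS form is meant to be portable across platforms.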

  19. Dynamic VoiceXML

      #!perl -w
      # CGI script: send the HTTP header, a blank line, then the VoiceXML document.
      print "Content-type: text/x-vxml\n\n";
      $HOMEBUFFER = '<?xml version="1.0"?>
      <vxml version="1.0">
        <form>
          <block>
            <prompt> Hello World </prompt>
          </block>
        </form>
      </vxml>';
      print $HOMEBUFFER;
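
On the VoiceXML side, a dialog typically reaches such a script through `<submit>`, which sends the collected field values to the server and continues with the document the script returns. The sketch below assumes VoiceXML 2.0 syntax; the URL `http://example.org/cgi-bin/order.pl` and the grammar file `drinks.grxml` are hypothetical.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="order">
    <field name="item">
      <prompt>Would you like coffee, tea, or juice?</prompt>
      <!-- External grammar file (hypothetical) that defines the allowed answers. -->
      <grammar type="application/srgs+xml" src="drinks.grxml"/>
      <filled>
        <!-- Send "item" to the server script, which replies with a new VoiceXML document. -->
        <submit next="http://example.org/cgi-bin/order.pl" namelist="item" method="get"/>
      </filled>
    </field>
  </form>
</vxml>
```

With `method="get"` the script receives the value as an ordinary query-string parameter named `item`, just as it would from an HTML form.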

  20. Other Markup Languages • JSML: JSpeech Markup Language (Sun) • Dialog ML (Dennis Heuer) • SABLE (SABLE Consortium) • DMML (Dialogue Moves Markup Language) • SALT: Speech Application Language Tags (SALT Forum) • (CallXML, Telephony Markup Language, …) Progress since March 2000 (VoiceXML 1.0) ?

  21. SALT • Speech Application Language Tags (SALT Forum) • SALT Forum founded by Microsoft, Intel, …; 15 October 2001 • very simple set of tags for extending existing markup languages (XHTML, XML) • specification available Q1 2002 • specification submitted to standards body (W3C??) mid 2002
