VoiceXML Technology - PowerPoint PPT Presentation

Presentation Transcript

  1. VoiceXML Technology. Andrea Piras, Guido Zucconi (piras@crs4.it, guido@crs4.it). 03/09/2001

  2. Contents. VoiceXML: What’s VoiceXML? • Advantages from the web • Advantages from SR • Advantages from the phone • Architectural Model • VoiceXML enables … • Voice apps • VoiceXML history • Now … • W3C WG

  3. Contents. VoiceXML: VoiceXML Techs • VoiceXML Techs in Italy. VoiceXML Technology: Nuance • Nuance products • SpeechObjects • Installation • Watcher • Processes • Access to Watcher

  4. Contents. VoiceXML Technology: Watchers • Launchpad • About Vocalizer • Standard system • Another system • A complex system • Good or bad? • E-mate and VoiceXML? • Links

  5. What’s VoiceXML? VoiceXML is a Web-based markup language for representing human-computer dialogs that use audio output (computer-synthesized and/or recorded speech) and audio input (voice and/or keypad tones).
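
To make the definition concrete, here is a minimal sketch of a VoiceXML 1.0 document, embedded in a Python string and parsed with the standard library. The form id and the prompt text are invented for illustration; a real voice browser would speak the prompt through TTS, while this sketch only extracts it.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-form dialog; only the element names come from VoiceXML 1.0.
VXML_DOC = """<?xml version="1.0"?>
<vxml version="1.0">
  <form id="greeting">
    <block>
      <prompt>Hello, and welcome to the voice portal.</prompt>
    </block>
  </form>
</vxml>
"""


def prompts(document):
    """Return the text of every <prompt> element, in document order."""
    root = ET.fromstring(document)
    return [p.text.strip() for p in root.iter("prompt") if p.text]


if __name__ == "__main__":
    for line in prompts(VXML_DOC):
        print(line)
```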

  6. Advantages from the web. VoiceXML benefits from progress on the web: • more capable web servers • more powerful browsers • advanced web data representation (XML) • more powerful web application development tools • an internet infrastructure that is improving in performance, bandwidth, and quality of service • the growth of the World-Wide Web and of its capabilities

  7. Advantages from SR. VoiceXML benefits from progress in speech recognition (SR): • better algorithms and acoustic models • less powerful hardware required • speech synthesizers that sound closer to human speech • improvements in computer-based speech recognition and text-to-speech synthesis

  8. Advantages from the phone. VoiceXML benefits from the phone: • widespread • portable • instant-on • usable while driving, with an earphone :-)

  9. Architectural Model

  10. Architectural Model: Document Server. Processes requests from a client; for example, a web server.

  11. Architectural Model: VoiceXML Interpreter. Processes VoiceXML documents and conducts the dialog.

  12. Architectural Model: VoiceXML Interpreter Context. Acquires VoiceXML documents; detects and answers calls.

  13. Architectural Model: Implementation Platform. Controlled by the VoiceXML Interpreter Context and the VoiceXML Interpreter; generates events in response to user actions and system events; requires audio output (TTS, audio files) and audio input (SR, audio recording, DTMF).
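
The way the four components cooperate can be sketched as a toy model. Everything below is an assumption made for illustration: the class and method names are hypothetical, the "platform" only records text instead of driving TTS, and the interpreter handles nothing beyond `<prompt>` elements.

```python
import xml.etree.ElementTree as ET


class DocumentServer:
    """Stands in for a web server: maps a URL to a VoiceXML document."""

    def __init__(self, documents):
        self.documents = documents

    def get(self, url):
        return self.documents[url]


class ImplementationPlatform:
    """Stands in for audio output: records what would be spoken."""

    def __init__(self):
        self.spoken = []

    def play(self, text):
        self.spoken.append(text)


class VoiceXMLInterpreter:
    """Processes a document and conducts the (here, one-way) dialog."""

    def __init__(self, platform):
        self.platform = platform

    def run(self, document):
        root = ET.fromstring(document)
        for prompt in root.iter("prompt"):
            self.platform.play((prompt.text or "").strip())


class InterpreterContext:
    """Answers a call, acquires the document, hands it to the interpreter."""

    def __init__(self, server, interpreter):
        self.server = server
        self.interpreter = interpreter

    def answer_call(self, url):
        self.interpreter.run(self.server.get(url))


if __name__ == "__main__":
    doc = "<vxml version='1.0'><form><block><prompt>Welcome.</prompt></block></form></vxml>"
    platform = ImplementationPlatform()
    context = InterpreterContext(DocumentServer({"/start.vxml": doc}),
                                 VoiceXMLInterpreter(platform))
    context.answer_call("/start.vxml")
    print(platform.spoken)
```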

  14. VoiceXML enables … Voice applications are easy to develop. Applications are easy to deliver because they do not require special web servers. They work with computers and telephones alike.

  15. Voice apps • Information retrieval • Electronic transactions • Telephone services – Call centers • Voice e-mail • Voice Access Control – Voice Recognition • ….

  16. VoiceXML history (timeline diagram). 1995: AT&T Bell Labs, PML / PhoneWeb (Ramming, Rehor, Ladd, Tuckey). 1-2/1999: PML, PML, VoxML, SpeechML.

  17. VoiceXML history (timeline diagram). 3/1999 • VoiceXML 0.9: 8/1999 • VoiceXML 1.0: 3/2000 • accepted: 5/2000

  18. Now … Voice Browser Working Group: Speech Recognition Grammars • Speech Synthesis Markup Language • Natural Language Semantics Markup Language • Multimodal Dialog Markup Language • VoiceXML 2.0

  19. W3C WG

  20. VoiceXML Techs • WebSphere Voice Server SDK (IBM): TTS, ASR, browser • Mya Voice Platforms (Motorola): gateway, TTS, ASR, browser; only the Mobile Applications Development Kit is available for download • Voice Web Application Platform (Telera): voice browser and voice web server; Telera developed TXML before adopting VoiceXML; after registering, VoiceXML code can be checked online; no download; California

  21. VoiceXML Techs • MagicTalk Voice Gateway (General Magic): integrates VoiceXML, speech recognition, and telephony technologies to enable voice access; no download; California • Bevocal Cafe: after registering, VoiceXML code can be checked online; no download; California • Enterprise VoiceXML Server (Tellme): after registering in Tellme Studio, VoiceXML code and grammars can be checked online and the application can be heard by phone; no download; California

  22. VoiceXML Techs • Natural Voices (AT&T Labs): high-quality TTS, testable online; no download • Mosquito (Minde): voice platform; no download; Utah • VoiceGenie: server, browser, and applications; after registering in the Developer Workshop, VoiceXML code and grammars can be checked online and the application can be heard by phone; no download; Toronto

  23. VoiceXML Techs in Italy • VoxNauta (Loquendo): voice platform; no download • VoceViva (Tiscali): voice platform, good TTS and SR; no download

  24. Nuance. Californian software house with a complete suite of VoiceXML products. After registering, it is possible to download almost all products, test VoiceXML code online, access the discussion group, and read the support guides.

  25. Nuance products • Nuance 7.0: distributed-architecture platform used by the other Nuance products; supports 25 languages • Vocalizer: TTS, available in 9 languages • V-Builder: graphical tool for easily creating VoiceXML applications • Verifier: voiceprint identification software • V-Optimizer: tool for analyzing and tuning deployed applications

  26. Nuance products • Voyager: voice browser, 80% compatible with VoiceXML 1.0 • Voice Web Server: web server containing a browser fully compatible with VoiceXML 1.0 • Grammar Builder: graphical tool that enables developers to create, view, edit, manage, and test grammars • Nuance Foundation SpeechObjects: Nuance extension of SpeechObjects

  27. SpeechObjects. Created within the V-Commerce Alliance to bring natural language to e-commerce, SpeechObjects (SO) are Java packages for voice applications. They define the speech channel and grammar handling. The source code is FREE.

  28. Installation. Installing the platform requires:
      • Nuance 7.0.4 Service Pack 9 and Speech Object 1.1. Installation: 308 Mbyte; Installed: 461 Mbyte
      • Vocalizer 1.0 Service Pack 1. Installation: 248 Mbyte; Installed: 273 Mbyte
      • V-Builder 1.2. Installation: 27 Mbyte; Installed: 53 Mbyte
      • Voice Web Server 1.2. Installation: 12 Mbyte; Installed: 26 Mbyte
      TOTAL: 813 Mbyte INSTALLED
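
The total on this slide can be checked with a few lines of arithmetic. The sketch assumes that the four size pairs belong to the four products in the order listed; the slide itself does not state the pairing explicitly.

```python
# (installation size, installed size) in Mbyte, paired with products by
# order of appearance on the slide (an assumption).
SIZES_MB = {
    "Nuance 7.0.4 SP9 + Speech Object 1.1": (308, 461),
    "Vocalizer 1.0 SP1": (248, 273),
    "V-Builder 1.2": (27, 53),
    "Voice Web Server 1.2": (12, 26),
}

installation_total = sum(setup for setup, _ in SIZES_MB.values())
installed_total = sum(installed for _, installed in SIZES_MB.values())

if __name__ == "__main__":
    print(f"installation packages: {installation_total} Mbyte")
    print(f"installed: {installed_total} Mbyte")
```

The installed sizes add up to the 813 Mbyte quoted on the slide.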

  29. Watcher. Watcher is a daemon/service that can start, stop, quiesce, and monitor the processes of the Nuance platform, and get and set their parameters (using port 7890). A Watcher process must run on each machine to be monitored. Watchers can communicate with one another. The processes launched by default are: license manager, resource manager, recognition server, recognition client, compilation server.

  30. Processes • Recognition Server (RecServer.exe): listens for incoming connection requests from recognition clients; starts one thread per CPU; works on port 8200 • Compilation Server (compilation-server.exe): compiles the grammars; works on port 2527 • License Manager (nlm.exe): manages floating licenses across the machines in the network; works on port 8470 • Resource Manager (resource-manager.exe): manages the requests of the other processes; does not connect more than 1000 channels; works on port 7777
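
A quick way to see which of these processes are up is to try a TCP connection to each port. The sketch below is not a Nuance tool; it uses only the port numbers quoted on the Watcher and Processes slides, and it assumes the processes accept plain TCP connections on those ports.

```python
import socket

# Port numbers as quoted on the slides.
NUANCE_PORTS = {
    "watcher": 7890,
    "recognition server": 8200,
    "compilation server": 2527,
    "license manager": 8470,
    "resource manager": 7777,
}


def is_listening(host, port, timeout=0.5):
    """Return True if something accepts a TCP connection on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def status(host="localhost"):
    """Map each process name to whether its port accepts connections."""
    return {name: is_listening(host, port) for name, port in NUANCE_PORTS.items()}


if __name__ == "__main__":
    for name, up in status().items():
        print(f"{name:20s} {'up' if up else 'down'}")
```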

  31. Processes • Recognition Client (RecClient.exe): the point where applications ‘enter’; performs audio playback and recording and controls telephony applications; supports the audio providers native (SB), dialogic (telephony board by Dialogic), nms (telephony hardware), aculab (telephony board by Aculab), and h323 (Voice over IP, VoIP); supports multiple applications and can run applications remotely; the maximum number of threads can be specified; 1 recclient per 10 ports and 1 thread per 4 channels; works on port 9200
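
The sizing rule "1 recclient per 10 ports and 1 thread per 4 channels" can be turned into a small calculation. This is one reading of the slide, under two assumptions: telephony ports and channels are counted one-to-one, and the thread count is a total across all recognition clients. The function name is hypothetical.

```python
import math


def recclient_sizing(channels, ports_per_client=10, channels_per_thread=4):
    """Return (recognition clients, threads) for a number of channels.

    Assumes one telephony port per channel and rounds up, so that a
    partially filled client or thread still counts as a whole one.
    """
    clients = math.ceil(channels / ports_per_client)
    threads = math.ceil(channels / channels_per_thread)
    return clients, threads


if __name__ == "__main__":
    for n in (4, 10, 30):
        clients, threads = recclient_sizing(n)
        print(f"{n} channels: {clients} recclient(s), {threads} thread(s)")
```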

  32. Access to Watcher 7080 7161 7023

  33. Http Watcher

  34. Watchers. Each watcher can communicate with the other watchers on the network.

  35. Launchpad. Launchpad is a graphical tool that can communicate with all watchers through a single interface. To start it:
      >cd %Nuance%/java
      >java -cp launchpad.jar nop.frontend.GUI.GUI

  36. Launchpad

  37. About Vocalizer. By default, Vocalizer works on port 32323. To run several TTS servers on the same machine, each one must be given a port and a name. Example:
      vocalizer tts.resourceName=americanVoice
      vocalizer -language italian tts.ResourceName=italianVoice tts.Port=32324
      vocalizer -language french tts.ResourceName=frenchVoice tts.Port=32325
      Good English and French, bad Italian.
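
The port-and-name convention in this example can be generated rather than typed by hand. The sketch below only builds command strings; it does not run Vocalizer. It assumes consecutive ports starting from the default 32323 and uses the `tts.ResourceName` spelling from the second and third commands on the slide (the first command spells it `tts.resourceName`, and the slide does not say whether case matters).

```python
def vocalizer_commands(voices, base_port=32323):
    """Build one command line per TTS server.

    voices: list of (language or None, resource name) pairs. The first
    server keeps the default port; each later one gets the next port.
    """
    commands = []
    for offset, (language, resource_name) in enumerate(voices):
        args = ["vocalizer"]
        if language is not None:
            args += ["-language", language]
        args.append(f"tts.ResourceName={resource_name}")
        if offset > 0:
            args.append(f"tts.Port={base_port + offset}")
        commands.append(" ".join(args))
    return commands


if __name__ == "__main__":
    voices = [(None, "americanVoice"),
              ("italian", "italianVoice"),
              ("french", "frenchVoice")]
    for command in vocalizer_commands(voices):
        print(command)
```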

  38. Standard system

  39. Another system

  40. A complex system

  41. Good or bad? • High flexibility and scalability • Completely FREE • Many languages supported • Use with telephone boards • High number of ports used • Hardware resources • No telephone simulation • Many variables • Disk space • Speech Recognition • JAVA

  42. E-mate and VoiceXML? Each E-mate service will be able to become a voice service; this can be done by extending the Object Browser to use VoiceXML. Currently the only free voice platform is Nuance. Is it possible to install a platform supporting all of VoiceXML 1.0 during E-mate installation? Not simple, but YES. Is it risky? YES, it requires installing 760 Mbyte of third-party software.

  43. Links • VoiceXML Forum: http://www.voicexml.org • VoiceXML Central: http://www.voicexmlcentral.com • General Magic: http://www.generalmagic.com • IBM WebSphere Voice Server SDK Version 1.5: http://www-4.ibm.com/software/speech/enterprise/ep_11.html • Mobile Application Development Toolkit: http://www.motorola.com/MIMS/ISG/spin/mix/ • Telera: http://www.telera.com • Tellme: http://www.tellme.com, http://studio.tellme.com

  44. Links • VoiceGenie: http://developer.voicegenie.com • V-Commerce Alliance: http://www.v-commerce.org • Natural Voices (AT&T Labs): http://www.naturalvoices.att.com • Mosquito: http://www.minde.com • Bevocal Cafe: http://cafe.bevocal.com • Tiscali VoceViva: http://voceviva.tiscali.it • Loquendo: http://www.loquendo.it

  45. Links • Nuance: http://www.nuance.com, http://extranet.nuance.com • Nuance Vocalizer 1.0: Nuance Vocalizer Developer’s Guide • Nuance Voice Web Server Version 1.2: Installation Guide • Nuance Voyager Version 1.0: Voice Browser Installation Guide • Nuance Speech Recognition System: Application Developer's Guide • Speech Object & VoiceXML