Speech in .NET


Speech in .NET. Sphinx CMU November 2002. Presenter. casey chesnut brains-N-brawn.com Web Services Mobile / Wireless Speech. Audience. Java / C++ / VB / C# ? VoiceXml ? SALT / Speech .NET ?. Outline. MS Technologies VoiceXml Demo Speech .NET Demo Future Questions (throughout)

Presentation Transcript

Speech in .NET

Sphinx CMU

November 2002

Presenter
  • casey chesnut
  • brains-N-brawn.com
    • Web Services
    • Mobile / Wireless
    • Speech
Audience
  • Java / C++ / VB / C# ?
  • VoiceXml ?
  • SALT / Speech .NET ?
Outline
  • MS Technologies
  • VoiceXml
    • Demo
  • Speech .NET
    • Demo
  • Future
  • Questions (throughout)
  • ~25 slides
MS Technologies
  • Tools
  • Devices
    • Phone
    • Desktop PC
    • Pocket PC
    • Tablet PC
Tools
  • MS Agents
  • SAPI / Speech SDK 5.1 (.NET wrappable)
  • Office
  • AutoPC ???
  • ASP .NET (VoiceXml)
  • (beta) Speech .NET / IE Speech Add-In
  • … SALT Telephony gateway (early 2003)
  • … Pocket IE Speech Add-In (mid 2003)
Devices
  • Phone
    • billions of devices that people are comfortable speaking to
  • Desktop PC
    • large market, speech input is slower and uncomfortable
  • Pocket PC
    • small market, opportunities for speech (device limitations)
  • Tablet PC
    • new market, speech friendly (slate models don’t have keyboards)
Phone
  • ASP .NET w/ VoiceXml 2.0
    • Production quality now
    • Multiple vendor support
  • Speech .NET VoiceOnly
    • Currently no way to deploy and test over a phone
    • Speech .NET Beta 2 has telephony simulation
    • MS target market for Speech .NET
Desktop PC
  • Web
    • Speech .NET MultiModal
      • Beta 2 IE Speech Add-In
    • Embedded control w/SAPI
    • MS Agents
  • Fat
    • SAPI
    • MS Agents
Pocket PC
  • Web
    • SALT Pocket IE Speech Add-Ins (mid 2003)
  • Fat
    • 3rd parties only
    • MS Reader does not support TTS
Tablet PC - TODAY!
  • Web
    • … same as desktop PC
    • Beta 2 has added support for Tablet PC
    • Virtual keyboard has speech control
  • Fat
    • … same as desktop PC
    • Virtual keyboard has speech control
    • MS Reader should be able to support TTS
    • Digital Ink is currently more compelling to MS
VoiceXml
  • XML-based language
    • Declarative – XML tags, grammars
    • Procedural – Javascript
      • Telephony Gateway is the client
    • Event driven – Bargein, Goodbye
    • Object oriented – Properties
Usage
  • Input
    • Speech Recognition (Command and Control)
    • DTMF
    • Voice recording and posting to a server
  • Output
    • Text-To-Speech
    • Prerecorded audio files
  • Telephony control
    • Hang-up, Transfers, …
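To make the input, output, and telephony-control bullets concrete, here is a minimal sketch that generates a VoiceXml 2.0 document from Python. It is not taken from the demo: the function name, the greeting, and the choice words are assumptions for illustration. Only the VoiceXml elements themselves (`menu`, `choice`, `prompt`, `enumerate`, `form`, `block`, `disconnect`) come from the VoiceXml 2.0 spec.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"


def build_menu_document(greeting, choices):
    """Build a VoiceXml 2.0 document: one speech/DTMF menu plus one form per choice.

    `choices` maps a spoken word (also used as the form id, so it must be a
    valid XML name) to the text read back before hanging up.
    Hypothetical helper, for illustration only.
    """
    root = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})

    # Input: dtmf="true" lets the caller press 1, 2, ... instead of speaking.
    menu = ET.SubElement(root, "menu", {"id": "main", "dtmf": "true"})

    # Output: the prompt is rendered with Text-To-Speech;
    # <enumerate/> reads the available choices back to the caller.
    prompt = ET.SubElement(menu, "prompt")
    prompt.text = greeting + " "
    ET.SubElement(prompt, "enumerate")

    for word, reply in choices.items():
        choice = ET.SubElement(menu, "choice", {"next": "#" + word})
        choice.text = word

        form = ET.SubElement(root, "form", {"id": word})
        block = ET.SubElement(form, "block")
        ET.SubElement(block, "prompt").text = reply
        # Telephony control: hang up after the reply has been played.
        ET.SubElement(block, "disconnect")

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_menu_document(
        "Say or press a choice.",
        {"sales": "Connecting you to sales.", "support": "Connecting you to support."},
    ))
```

The telephony gateway, not a desktop browser, would fetch and interpret the resulting document, which is why the markup carries no visual elements.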
VoiceXml
  • DEMO
    • /vxml (VS.NET)
    • Mobile ADK (menu1.aspx)
    • BeVocal
VoiceXml - SALT
  • VoiceXml : ??? :: SALT : Speech .NET
    • Nuance has some WYSIWYG
  • SALT is considered lightweight compared to VoiceXml
  • SALT was submitted to W3C August 2002
  • VoiceXml is v2.0 in W3C
    • Mandatory W3C grammar spec
      • Beta 2 Speech .NET has moved to W3C SRGS
  • VoiceXml has complementary specs (ccXml)
  • VoiceXml is moving to MultiModal as well
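Since both VoiceXml 2.0 and Speech .NET Beta 2 rely on the W3C grammar spec (SRGS), a small command-and-control grammar in its XML form may help. The sketch below builds one from Python; the helper name, the rule id, and the phrases are assumptions for illustration, while the `grammar`, `rule`, `one-of`, and `item` elements are from SRGS 1.0.

```python
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"


def build_srgs_grammar(rule_id, phrases, lang="en-US"):
    """Return an SRGS 1.0 XML grammar whose root rule accepts any one phrase.

    Hypothetical helper, for illustration only.
    """
    root = ET.Element("grammar", {
        "xmlns": SRGS_NS,
        "version": "1.0",
        "xml:lang": lang,
        "root": rule_id,  # the rule a recognizer starts matching from
    })

    # A public rule can be referenced from outside this grammar file.
    rule = ET.SubElement(root, "rule", {"id": rule_id, "scope": "public"})

    # <one-of> lists alternatives; exactly one <item> must be spoken.
    one_of = ET.SubElement(rule, "one-of")
    for phrase in phrases:
        ET.SubElement(one_of, "item").text = phrase

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_srgs_grammar("command", ["check balance", "transfer funds"]))
```

Because the same grammar format is accepted on both sides, a grammar like this is one of the few artifacts that could plausibly be shared between a VoiceXml application and a SALT one.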
VoiceXml - SALT
  • VoiceXml = AT&T, Motorola, TellMe, (IBM)
  • SALT = MS, SpeechWorks, Intel, (BeVocal)
  • VoiceXml has multiple vendor support, with venture capital from before the dot-com bubble burst
  • Most vendors will support both specs
  • VoiceXml has ~ 15,000 developers
  • SALT has potentially millions
SALT
  • I have not read the new spec
  • I remember mentally mapping an early spec onto VoiceXml while reading it
  • Why
    • Common spec for MultiModal operation
    • Multiple modes of interaction with the same syntax
    • Speech enabling existing sites
  • Why not VoiceXml
    • Retrofitting VoiceXml for MultiModal is harder than starting over
Speech .NET
  • MS implementation of SALT
  • (VoiceWebSolutions + DreamWeaver MX)
  • Some Beta 1 Speech .NET apps still work, because SALT has not changed much, but Speech .NET Beta 2 controls have
  • VoiceXml is not as portable between vendors as it should be; the Speech .NET controls could help mitigate this for SALT
    • i.e. layer of abstraction for voice browser wars
Code
  • Creating static grammars and prompts
  • Very little server-side code
    • Only dynamic grammars / prompts
    • Server-side code mods to better support speech
  • Mainly setting properties on Speech controls and tying to client-side javascript
  • Tie javascript to mouse-click events to avoid redundant code
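The last bullet describes a pattern rather than an API: the speech path should call the same handler as the mouse-click path, so the application logic is written once. Below is a language-neutral sketch of that idea in Python. In a real Speech .NET page this would be client-side javascript; every name here is hypothetical and none of it is the Speech .NET control API.

```python
def make_multimodal_handlers(place_order):
    """Return (on_click, on_recognized); both funnel into `place_order`.

    `place_order` stands in for whatever the page does when the user
    picks an item. All names are hypothetical.
    """

    def on_click(selected_value):
        # GUI path: the click handler owns the application logic.
        place_order(selected_value)

    def on_recognized(recognized_text):
        # Speech path: normalise the recognised text, then reuse the
        # click handler instead of duplicating its logic.
        on_click(recognized_text.strip().lower())

    return on_click, on_recognized


if __name__ == "__main__":
    orders = []
    on_click, on_recognized = make_multimodal_handlers(orders.append)
    on_click("pizza")          # user clicked the button
    on_recognized(" Pizza ")   # user said the word instead
    print(orders)
```

Keeping the recognition handler as a thin adapter means a MultiModal page and a click-only page can share one code path.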
Impression
  • Separate app layers to reduce complexity
    • Voice UI will be less functional; design is key
  • Learning low level SALT might be easier than high level Speech .NET controls
  • Application controls change this in Beta 2
  • Speech .NET has a great debugger (now server side too), grammar, and prompt tools
  • Speech Control Editor was needed for dev
  • IE Audio meter was needed for MultiModal
  • MultiModal has some time to grow
Speech .NET
  • DEMO
    • Speech .NET Beta 2 (VS .NET)
    • /noHands (VoiceOnly web app)
Industry
  • Wrote 1st VoiceXml article a year ago
    • Received 1st proposal request last month
    • 1 other proposal request since then
  • Wrote 1st Speech .NET article 5 months ago
    • Request for an article from MSDN magazine
Voice Recognition
  • PSTN is less secure than Internet!
    • More accessible, and attacks are easier to automate
  • Traditionally spoken password OR DTMF pin, also #
  • Clients always confuse it with speech recognition
  • Not a part of VoiceXml or SALT specs
    • Telephony gateways proprietary implementations
  • Not useful for identifying somebody
  • Useful for confirming somebody is who they say they are
  • Voice prints have to change when the device changes
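The difference between identifying a caller and confirming a claimed identity can be shown with a toy sketch. Verification is a 1:1 comparison of a voice sample against the print enrolled for the claimed user, and here the print is also keyed by device because prints are device-specific. Everything below is an assumption for illustration: the feature vectors, the cosine-similarity score, and the 0.9 threshold are made up, and real telephony gateways use proprietary engines.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def verify(claimed_id, device, sample, enrolled, threshold=0.9):
    """1:1 check: does `sample` match the print enrolled for this user on this device?

    `enrolled` maps (user id, device) to a toy feature vector.
    Hypothetical data model and threshold, for illustration only.
    """
    voice_print = enrolled.get((claimed_id, device))
    if voice_print is None:
        # No print for this device: the caller must enroll again.
        return False
    return cosine_similarity(sample, voice_print) >= threshold


if __name__ == "__main__":
    enrolled = {("alice", "landline"): [1.0, 0.0, 0.2]}
    print(verify("alice", "landline", [0.9, 0.1, 0.2], enrolled))
    print(verify("alice", "mobile", [0.9, 0.1, 0.2], enrolled))
```

Identification would instead compare the sample against every enrolled print and pick the best match, which scales poorly and is far easier to get wrong.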
Future (MS Speech)
  • SALT Telephony gateways
  • Speech .NET (VoiceOnly then MultiModal)
  • Pocket IE Speech Add-In
  • .NET Fat-client Speech APIs
    • Desktop / Tablet / PPC
  • MS or 3rd party VS .NET VoiceXml controls
  • Possibility for Speech .NET controls to render both SALT and VoiceXml
Future
  • Lots of W3C Voice specs …
  • VoiceXml MultiModal browser
  • Auto (hands-free, navigation, radio)
  • 3G (bridge voice and wireless web)
    • offload Speech processing
    • VOIP or PSTN
    • Pocket PC Phone Edition / SmartPhones
  • IBM recently announced chip for Speech on mobile devices