The Speech Speech

casey chesnut

brains-N-brawn.com

Madison .NET April 2007


Powerpoint

  • Page Up

  • Page Down


brains-N-brawn.com

  • Pervasive Computing

    • Tablet PC (MVP 03)

    • Compact Framework (MVP 04)

    • Advanced Web Services (MVP 05)

    • Media Center (MVP 06)

    • Speech

    • Location Based Services

    • Artificial Intelligence

    • 3D


Outline

  • Speech Overview

  • Vista Speech Recognition

  • SAPI 5.3 / System.Speech

  • Speech Server 2007


Outline : Speech Overview

  • Voice User Interface

  • How does it work?

    • Synthesis (TTS)

    • Recognition (SR)


Overview

  • Speech is just another presentation system

    • Synthesis = Output to user

    • Recognition = User input

  • Voice User Interface (VUI)


VUI Modes

  • Applications

    • Multi-modal

    • Voice-only


VUI Tips

  • Don't replicate the touch-tone-based menu system

  • Restrict options on the main (opening) menu to 4 or fewer

  • Make sure your opening greeting is short

  • Don't design the app solely for the new user

  • Focus on task completion above all

  • What can I say?

    http://blogs.msdn.com/anandis_thoughts/archive/2006/02/08/528181.aspx


Speech Synthesis

  • Text to Speech

    • Dynamic

    • Prompt database


How Synthesis Works

  • Text parsing

    • Sentences, numbers, symbols, pauses

  • Natural language processing

    • Part of speech, tense

  • Phonemes are looked up or sounded out

  • Diphones are appended together

  • Post process audio to add emphasis

  • Play speech audio
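The first four steps above can be sketched in a few lines of Python. This is only a toy illustration, not how SAPI or any real engine is implemented: the lexicon, the letter-to-sound table, and the function names are all hypothetical, and the emphasis and audio playback steps are left out.

```python
import re

# Hypothetical toy lexicon: word -> phonemes (ARPAbet-like symbols).
LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "two": ["T", "UW"],
    "cats": ["K", "AE", "T", "S"],
}

# Hypothetical fallback used to "sound out" words missing from the lexicon.
LETTER_SOUNDS = {"a": "AE", "e": "EH", "i": "IH", "o": "AA", "u": "AH"}

DIGIT_WORDS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}


def normalize(text):
    """Text parsing: lowercase, drop punctuation, spell out single digits."""
    tokens = re.findall(r"[a-z]+|\d", text.lower())
    return [DIGIT_WORDS.get(tok, tok) for tok in tokens]


def to_phonemes(word):
    """Look the word up; if it is unknown, sound it out letter by letter."""
    if word in LEXICON:
        return list(LEXICON[word])
    return [LETTER_SOUNDS.get(ch, ch.upper()) for ch in word]


def to_diphones(phonemes):
    """Pair adjacent phonemes, padding with '_' for leading/trailing silence."""
    padded = ["_"] + phonemes + ["_"]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


def synthesis_plan(text):
    """Return (word, diphones) pairs; a real engine would fetch audio units."""
    return [(w, to_diphones(to_phonemes(w))) for w in normalize(text)]
```

Calling `synthesis_plan("Hello, 2 cats!")` yields one entry per normalized word, each with the diphone names a concatenative engine would then look up and append together.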


How Synthesis Works

  • Demo

    • /xnaSynth app

  • Article

    • http://www.brains-N-brawn.com/ttSpeech/

    • http://www.brains-N-brawn.com/xnaSynth/ (codebase from /ttSpeech)


Speech Recognition

  • Speech to Text

    • Dictation

    • Command and Control


How Recognition Works

  • Audio signal is processed

  • Look for signals which might be speech

  • Phonemes are found in audio signals

  • Phonemes are mapped to a dictionary of words

  • Dictation or grammar-based

  • Apply natural language processing
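Two of the recognition steps lend themselves to a small sketch: finding the parts of the signal that might be speech, and mapping phonemes to words. The Python below is a simplified illustration under assumed names and values (the energy threshold, frame size, and pronunciation table are all made up). Real recognizers use acoustic and language models instead of a fixed threshold and an exact lookup.

```python
def frame_energies(samples, frame_size):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_size]) / frame_size
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]


def speech_frames(samples, frame_size=4, threshold=0.01):
    """Flag frames whose energy exceeds a (hypothetical) threshold."""
    return [e > threshold for e in frame_energies(samples, frame_size)]


# Hypothetical pronunciation dictionary: phoneme sequence -> word.
PRONUNCIATIONS = {
    ("HH", "AH", "L", "OW"): "hello",
    ("T", "UW"): "two",
}


def words_from_phonemes(phonemes):
    """Greedy longest-match lookup of phoneme runs in the dictionary."""
    words, i = [], 0
    max_len = max(len(key) for key in PRONUNCIATIONS)
    while i < len(phonemes):
        for n in range(min(max_len, len(phonemes) - i), 0, -1):
            word = PRONUNCIATIONS.get(tuple(phonemes[i:i + n]))
            if word:
                words.append(word)
                i += n
                break
        else:
            i += 1  # no match starts here; skip this phoneme
    return words
```

A grammar-based recognizer would additionally restrict the candidate words to those the active grammar allows, which is why command-and-control tends to be more accurate than free dictation.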


How Recognition Works

  • Demo

    • /wavReader app

  • Article

    • http://www.brains-N-brawn.com/noReco/

    • http://www.brains-N-brawn.com/speakerVerify/ (codebase from /noReco)


Outline : Vista Speech Recognizer

  • Built-in to Vista’s shell

  • Microphone bar

  • Language support

  • Can be trained to improve accuracy

  • Command-and-control, also Dictation

  • Automagic application support

  • Horrible Office integration

  • UAC problems


Demo

  • Say what you see

  • Show numbers

  • Correct

  • Spell it

  • Mouse grid

    http://www.istartedsomething.com/20060808/vista-speech-recognition-screencast/



Hack

http://news.bbc.co.uk/1/hi/technology/6320865.stm

  • /micBarExtend – tap and talk


Narrator

  • Vista’s screen reader


Outline : SAPI 5.3 / System.Speech

  • Desktop applications

  • SAPI 5.3

  • System.Speech


SAPI 5.3

  • COM based

  • Native applications

  • Managed apps which need more control


System.Speech

  • Part of .NET 3.0 WPF

  • Managed wrapper built on SAPI 5.3

  • Simple API

  • Standards support (SSML, SRGS)

  • Language support

  • Vista Speech Recognition integration

  • Does not work in XBAP


System.Speech.Synthesis

  • SpeechSynthesizer

  • SSML

  • PromptBuilder

  • Voices
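To make the SSML bullet concrete, here is a minimal Python sketch that assembles an SSML document of the kind a synthesizer accepts. It does not use System.Speech. The `build_ssml` helper and its `parts` format are hypothetical names chosen for this illustration, while the `speak`, `emphasis`, and `break` elements and the namespace come from the W3C SSML 1.0 standard.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(parts, lang="en-US"):
    """Build an SSML string.

    parts is a list of ("text", str), ("emphasis", str) or
    ("break", milliseconds) tuples (a format invented for this sketch).
    """
    body = []
    for kind, value in parts:
        if kind == "text":
            body.append(escape(value))
        elif kind == "emphasis":
            body.append(f'<emphasis level="strong">{escape(value)}</emphasis>')
        elif kind == "break":
            body.append(f'<break time="{int(value)}ms"/>')
        else:
            raise ValueError(f"unknown part kind: {kind!r}")
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang={quoteattr(lang)}>'
        + "".join(body)
        + "</speak>"
    )


def emphasized_text(ssml):
    """Parse the SSML back and return the text of every emphasis element."""
    root = ET.fromstring(ssml)
    return [el.text for el in root.iter(f"{{{SSML_NS}}}emphasis")]
```

For example, `build_ssml([("text", "Welcome to "), ("emphasis", "The Speech Speech"), ("break", 500)])` produces markup with one emphasized phrase followed by a half-second pause. PromptBuilder plays a similar role in managed code by generating this kind of markup for you.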


System.Speech.Synthesis

  • Demo

    • /speechSamples - /speechSynth


System.Speech.Recognition

  • SpeechRecognizer / SpeechRecognitionEngine

  • SRGS

  • GrammarBuilder

  • Advanced users

    • Deep-link functionality

    • Mixed initiative
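As an illustration of what an SRGS command-and-control grammar looks like, the Python sketch below generates a one-rule XML grammar from a list of phrases, such as the "Page Up" / "Page Down" commands used to drive the slides. The helper names are hypothetical. The `grammar`, `rule`, `one-of`, and `item` elements and the namespace follow the W3C SRGS 1.0 XML form. GrammarBuilder offers a programmatic way to express the same kind of phrase list.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

SRGS_NS = "http://www.w3.org/2001/06/grammar"


def build_srgs(rule_id, phrases, lang="en-US"):
    """Build an SRGS XML grammar whose root rule accepts any one phrase."""
    items = "".join(f"<item>{escape(p)}</item>" for p in phrases)
    return (
        f'<grammar version="1.0" xmlns="{SRGS_NS}" '
        f"xml:lang={quoteattr(lang)} root={quoteattr(rule_id)}>"
        f"<rule id={quoteattr(rule_id)}><one-of>{items}</one-of></rule>"
        "</grammar>"
    )


def phrases_in(grammar_xml):
    """Parse the grammar back and list the phrases it accepts."""
    root = ET.fromstring(grammar_xml)
    return [item.text for item in root.iter(f"{{{SRGS_NS}}}item")]
```

Because the grammar lists only the phrases the application cares about, the recognizer has far fewer hypotheses to choose from than in dictation mode.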


System.Speech.Recognition

  • Demo

    • /speechSamples - /speechReco


System.Speech

  • Demo

    • /micBarExtend

    • /mceSapiMcpl

  • Article

    • http://www.brains-N-brawn.com/speechSamples/

    • http://www.brains-N-brawn.com/micBarExtend/

    • http://www.brains-N-brawn.com/mceSapi/ (not updated for Vista yet)


What about Mobile Devices?

  • OEMs can add VoiceCommand

    • VoiceCommand is not accessible to developers

  • WindowsMobile has the SAPI API, but no engines

  • PlatformBuilder is supposed to have engines

  • There are 3rd party engines for purchase



Speech Server 2007

  • Telephony Applications

  • Outgoing calls

  • Speaker Independent


Speech Server 2007

  • VOIP

  • Language support

  • VoiceXML / SALT

  • Workflow development model

  • Reports

  • Still in beta


Speech Server 2007

  • Speech Synthesis

    • Inline

    • PromptBuilder

    • SSML

    • Prompt databases

  • Speech Recognition

    • Inline

    • Dynamic Grammar

    • SRGS

    • Conversational Grammar Builder

    • DTMF


VoiceXML

  • Declarative language

  • Article

    • http://www.brains-N-brawn.com/vxml/

    • http://www.brains-N-brawn.com/myVoices/

    • http://www.brains-N-brawn.com/voiceBio/


SALT

  • Yet another declarative language

  • Multimodal support has been dropped

  • Article

    • http://www.brains-N-brawn.com/noHands/

    • http://www.brains-N-brawn.com/speechMulti/

    • http://www.brains-N-brawn.com/tabletWeb/

    • http://www.brains-N-brawn.com/mceSalt/


Speech Workflow

  • Speech Sequence Workflow designer

  • Speech activities

    • Statement

    • QuestionAnswer

  • Debugging tools


Speech Workflow

  • Demo

    • /speechTextAdv

    • /speakerVerify

    • /mobileRecord

  • Article

    • http://www.brains-N-brawn.com/speechTextAdv/

    • http://www.brains-N-brawn.com/speakerVerify/


Where

  • Accessibility

  • Telephony

  • Telematics

  • Home automation

  • Mobile Devices / Tablets

  • Gaming

  • Warehouses


Possible Future

  • Telematics

  • Service Pack for Office Support

  • Exchange Server 2007

  • Speech Server 2007 release

  • Rumors that WindowsMobile will get a public API

  • Dictation has room to improve

  • Hope that System.Speech will ultimately work in XBAP


