speech in, speech out

speech-in components
    Nuance server
    compiled recognition grammar, master language package, licence manager
    Nuance client
WS0607 – elevator

recognition grammar
    anticipate user’s responses
    what pieces of information are needed to complete the dialog?
    in what order will they be requested?
        one piece of information at a time, in a particular order (directed dialog), or
        several pieces at once, in any order, with prompts for missing items (mixed initiative)?

recognition grammar
syntax
    Nuance: Grammar Specification Language (GSL)
    Diamant: Speech Recognition Grammar Format (SRGF)

recognition grammar
GSL grammar: doc
    in a file with .grammar extension; e.g. mygram.grammar (mygram will be the resulting package name)
    contents: GrammarRuleName GrammarDescription
        GrammarRuleName: at least one uppercase character
        GrammarDescription: sequence of words, grammar names, and operators that define a set of recognizable word sequences
        words (terminals) in lower-case
    operators:

recognition grammar
GSL grammar: example expressions
    [morning afternoon evening]
        “morning”, “afternoon”, “evening”
    (good [morning afternoon evening])
        “good morning”, “good afternoon”, “good evening”
    (?good [morning afternoon evening])
        “good morning”, “good afternoon”, “good evening”, “morning”, “afternoon”, “evening”
    (thanks +very much)
        “thanks very much”, “thanks very very much”, ...
    (thanks *very much)
        “thanks much”, “thanks very much”, “thanks very very much”, ...
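The example expressions above can be checked by hand with a small enumerator. The following Python sketch is not part of the Nuance toolkit; it is a minimal illustration that assumes only the five constructs shown on the slide (`[ ]` for alternatives, `( )` for sequences, and the prefix operators `?`, `+`, `*`). The function names `parse`, `expand`, and `phrases`, and the `max_repeat` bound that cuts off `+` and `*`, are hypothetical.

```python
import re
from itertools import product


def parse(tokens, pos=0):
    """Parse one GSL-like expression starting at tokens[pos]; return (node, next_pos)."""
    tok = tokens[pos]
    if tok in "?+*":  # prefix operator applies to the expression that follows
        child, pos = parse(tokens, pos + 1)
        return (tok, [child]), pos
    if tok in "[(":  # [ ... ] = alternatives, ( ... ) = sequence
        close = "]" if tok == "[" else ")"
        kind = "or" if tok == "[" else "seq"
        children, pos = [], pos + 1
        while tokens[pos] != close:
            child, pos = parse(tokens, pos)
            children.append(child)
        return (kind, children), pos + 1
    return ("word", tok), pos + 1  # lower-case terminal


def expand(node, max_repeat=2):
    """Return every word tuple the node can produce, repeating at most max_repeat times."""
    kind, body = node
    if kind == "word":
        return [(body,)]
    parts = [expand(child, max_repeat) for child in body]
    if kind == "or":
        return [p for part in parts for p in part]
    if kind == "seq":
        return [sum(combo, ()) for combo in product(*parts)]
    once = parts[0]
    repeated = [
        sum(combo, ())
        for n in range(1, max_repeat + 1)
        for combo in product(once, repeat=n)
    ]
    if kind == "?":  # zero or one occurrence
        return [()] + once
    if kind == "+":  # one or more occurrences (bounded here)
        return repeated
    return [()] + repeated  # "*": zero or more occurrences (bounded here)


def phrases(expr, max_repeat=2):
    """List the phrases covered by a GSL-like expression string."""
    tokens = re.findall(r"[\[\]()?+*]|[a-z]+", expr)
    tree, _ = parse(tokens)
    return [" ".join(words) for words in expand(tree, max_repeat)]


if __name__ == "__main__":
    for line in phrases("(?good [morning afternoon evening])"):
        print(line)
```

Because `+` and `*` describe unbounded repetition, the sketch stops after `max_repeat` repetitions; a real recognizer has no such limit.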

recognition grammar
example GSL grammar

.grammar file:

    .GO_FLOOR [
        FLOOR:f
        (?the FLOOR:f floor)
        (?the FLOOR:f please)
        (?Filler ?the FLOOR:f floor ?please)
    ] {<floor $f>}

    Filler [
        (i would like to go to)
        (i want to go to)
        (uh)
    ]

    FLOOR [
        first {return("1")}
        second {return("2")}
        third {return("3")}
        fourth {return("4")}
    ]

.slot_definitions file:

    floor
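To make the slot-filling behaviour of this grammar concrete, here is a rough Python emulation. It is only a sketch: it mirrors the four alternatives of .GO_FLOOR with regular expressions and returns the value that would end up in the floor slot. The names `interpret`, `PATTERNS`, and `FLOOR_VALUES` are hypothetical, and a real Nuance package works on recognized speech, not on typed strings.

```python
import re

# Values returned by the FLOOR sub-grammar
FLOOR_VALUES = {"first": "1", "second": "2", "third": "3", "fourth": "4"}

FLOOR = r"(?P<f>first|second|third|fourth)"
FILLER = r"(?:i would like to go to|i want to go to|uh)"

# One pattern per alternative of .GO_FLOOR, in the same order
PATTERNS = [
    re.compile(FLOOR),
    re.compile(rf"(?:the )?{FLOOR} floor"),
    re.compile(rf"(?:the )?{FLOOR} please"),
    re.compile(rf"(?:{FILLER} )?(?:the )?{FLOOR} floor(?: please)?"),
]


def interpret(utterance):
    """Return the filled slots for an utterance, or None if the grammar rejects it."""
    text = utterance.strip().lower()
    for pattern in PATTERNS:
        match = pattern.fullmatch(text)
        if match:
            return {"floor": FLOOR_VALUES[match.group("f")]}
    return None


if __name__ == "__main__":
    print(interpret("i want to go to the third floor please"))
    print(interpret("fifth floor"))
```

Note that, as in the grammar, the filler phrases are only accepted in the last alternative, so "i want to go to third" without "floor" is rejected.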

recognition grammar
    another option: SRGF and export as Nuance GSL
    GrammarTest.bat

recognition grammar
compiling the package (compile-package.bat)
    set PKGHOME=path to your gsl file (w/o extension)
    nuance-compile %PKGHOME% English.America.1.3.0
    (English.America.1.3.0: master recognition package)

recognition grammar
testing the grammar (text)
    parse-tool -package path_to_your_model
    nl-tool -package path_to_your_model -grammar grammar_in_your_model

speech recognition
running Nuance:
    licence manager: lm.bat
    recognition server: rs.bat
        set PKGHOME=path to your compiled model
        recserver -package %PKGHOME% lm.Addresses=localhost config. ...
testing the grammar (speech)
    xapp -package path to your compiled model lm.Addresses=localhost

Diamant with speech-in
running Nuance client
    edit Diamant config file: Clients.ini
    NuanceClient.bat
    (btw, have the licence manager and the server running too... duh!...)

Diamant with speech-in
adding speech-in
    add device as usual
    activate recognition: output <string> “start” (start command) to Nuance client
    read (speech) input from Nuance client into variable as usual
    access recognition confidence (of type Real) like this: var#confidence

speech-out components
    Mary server
        online at DFKI...
    Mary client
        MaryClient.bat

Diamant with speech-out
adding speech-out
    add device as usual
    optionally, set format: {format = <string>} (default plain text) and voice: {voice = <string>}
    in output node, output <string> to Mary client as usual

speech-enabled dialogs
recognition tends to be imperfect...
if recognition confidence is low, then, for example (btw, think: grounding):
    repeat question
    ask for confirmation (“did you say blah?”)
    inform user what they can say (“you can say blah, bloo, and blee, please try again”)
but... don’t let user get stuck in endless clarification dialog either!
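One way to combine the three repair strategies with a guard against endless clarification is a small decision function. The Python sketch below is only an illustration: the thresholds `accept_at` and `confirm_at`, the limit `max_attempts`, and the action names are assumptions, not values taken from Nuance or Diamant. In a Diamant dialog the confidence value would come from var#confidence.

```python
def next_action(confidence, attempts, accept_at=0.7, confirm_at=0.4, max_attempts=3):
    """Choose a dialog move from the recognition confidence (0.0 to 1.0).

    attempts is the number of clarification turns already spent on this question.
    All threshold values are illustrative assumptions.
    """
    if confidence >= accept_at:
        return "accept"    # confident enough: use the recognized value
    if attempts >= max_attempts:
        return "fallback"  # stop clarifying, e.g. hand over or pick a default
    if confidence >= confirm_at:
        return "confirm"   # "did you say blah?"
    if attempts == 0:
        return "repeat"    # first failure: simply ask the question again
    return "help"          # later failures: tell the user what they can say


if __name__ == "__main__":
    # A user who is never understood gets at most three repair turns
    for attempt in range(5):
        print(attempt, next_action(0.2, attempt))
```

The `fallback` branch is what keeps the user from getting stuck: once `max_attempts` repair turns are used up, the dialog moves on instead of asking again.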
