Speaky™

Speaky - PowerPoint PPT Presentation



Speaky™ Media Center. SpeechTEK 2007, New York, August 20-23, 2007. Mediavoice presents Speaky Media Center. Mediavoice is a company founded in 2000 with the mission of developing innovative solutions using state-of-the-art speech technology. The Company has already developed 3 patented solutions.


PowerPoint Slideshow about 'Speaky' - andrew


Presentation Transcript
Slide1 l.jpg

Speaky™ Media Center

SpeechTEK 2007

New York

August 20-23, 2007


Slide2 l.jpg

Mediavoice presents Speaky Media Center

Mediavoice is a company founded in 2000 with the mission of developing innovative solutions using state-of-the-art speech technology

The Company has already developed 3 patented solutions

The solution we present here, which we are about to launch on the market, is

Speaky Media Center

…a completely novel way to interact with the PC, its services and content


Slide3 l.jpg

Speaky Media Center

Speaky Media Center is a new home computer, a new system that is able to interact with people in a natural way, simply using voice. A system that listens to the user, receives his/her voice commands and his/her information and service requests, and speaks the results to the user

An innovative system that brings us into a new dimension of home automation, one that really places the customer, the person, at the center of the digital world, facilitating access to and use of digital services and content

An intelligent speech-based personal assistant that achieves digital convergence through a simple interface for everyone; a system that helps bridge the Digital Divide


Slide4 l.jpg

Speaky Media Center

The patent-pending solution that Mediavoice is presenting is an add-on for PCs running the new Windows Vista operating system. It consists of the intelligent interface software Speaky and a special remote control that adds speech features to the typical features of a Windows Media Center remote control

With Speaky Media Center you can interact with the PC in a natural way, by voice, to watch television, use the video recorder, search for a song and listen to your own music, view your own videos and photos, and browse the Internet; you can manage your house and all its systems; you can call a person by saying his/her name and speaking into the remote control as if it were a telephone; or you can play games, alone or in a group (for example role-playing games, speech quizzes …)

…all of this simply by pressing a button on the remote control and speaking into its microphone!



Slide6 l.jpg

Speaky Media Center: the new Vista-compliant,

Radio-Frequency remote control for data and voice, with microphone, Push-To-Talk button and USB receiver

MICROPHONE

SPEAKY PUSH-TO-TALK BUTTON

RADIO-FREQUENCY USB RECEIVER

Radio-Frequency technology overcomes the 'directionality' of infrared and gives the remote control a larger range


Slide7 l.jpg

Speaky Media Center: who the user is and how Speaky helps him/her

  • Speaky’s mission is to make interaction with the digital world easier for everyone, so that the user can access digital content quickly and easily

  • The user who wants to use a personal computer and its content will no longer need to memorize user guides, complex syntaxes, lists of menu options and impenetrable technical information

  • Speaky is also useful for blind and impaired people


Slide8 l.jpg

Speaky Media Center: issues and solutions during Design and Development

  • Far or close mic?

  • Speaker dependent or independent?

  • Continuous recognition or triggered recognition?

  • Fixed mic sensitivity or Automatic Gain Control use?

  • Infrared, Bluetooth or radio frequency communication?

  • Directed dialog or natural language?

  • How to speak international texts with one voice: Phonetic mapping and language guesser


Slide9 l.jpg

Speaky Media Center: i) Far or close mic?

  • A very close mic has been chosen to get the best ASR performance: we put a mic on the remote control so that the user can give voice commands from a distance of a few centimetres


Slide10 l.jpg

Speaky Media Center: ii) Speaker dependent or independent?

  • Speaky Media Center is first of all a ‘social/family solution’, so it must be open to all users and must be easy and ready to use without any training time. For this reason we have chosen speaker independence


Slide11 l.jpg

Speaky Media Center: iii) Continuous recognition or triggered recognition?

  • We use the Speaky button as a push-to-talk button: when the user pushes it, the system sets the audio volume to zero and starts the ASR session, which is stopped when the user releases the Speaky button

  • In this way we get the best ASR performance because:

    • The ASR starts to recognize just before the user starts speaking

    • The ASR listens only to the user, because all media are silenced while he/she is speaking
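
The push-to-talk behaviour described above can be sketched as a small state machine. This is only an illustrative sketch, not Speaky's actual implementation: `MediaPlayer`, `PushToTalkSession` and the string "frames" are hypothetical names, and restoring the previous volume on release is an assumption (the slide only says the volume is set to zero on press).

```python
from dataclasses import dataclass, field


@dataclass
class MediaPlayer:
    """Hypothetical stand-in for the media center's audio output."""
    volume: int = 70


@dataclass
class PushToTalkSession:
    """Toy push-to-talk controller: silence media on press, stop on release."""
    player: MediaPlayer
    listening: bool = False
    _saved_volume: int = 0
    _frames: list = field(default_factory=list)

    def press(self) -> None:
        # Silence the media first so the mic hears only the user,
        # then open the recognition session.
        if self.listening:
            return
        self._saved_volume = self.player.volume
        self.player.volume = 0
        self._frames.clear()
        self.listening = True

    def feed(self, frame: str) -> None:
        # Audio arriving outside a session is ignored.
        if self.listening:
            self._frames.append(frame)

    def release(self) -> list:
        # Close the session, restore the volume (assumption), and return
        # the captured audio for whatever recognizer is in use.
        if not self.listening:
            return []
        self.listening = False
        self.player.volume = self._saved_volume
        return list(self._frames)
```

Because the session opens on the button press, capture is already running when the first word arrives, which is the point of the first sub-bullet.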


Slide12 l.jpg

Speaky Media Center: iv) Fixed mic sensitivity or Automatic Gain Control use?

  • with a fixed mic sensitivity, the user can speak only within a restricted range of distances from the mic; instead:

  • with the Automatic Gain Control (AGC) feature, the mic sensitivity follows the speaker’s behaviour: if the speaker speaks too loudly or too close to the mic, the sensitivity decreases; if the user speaks from far away or quietly, the mic sensitivity increases

  • with the AGC feature the user is freer and may speak over a wider range of distances, so using Speaky is more natural and ASR performance is better
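
The AGC idea above can be illustrated with a minimal software sketch. It assumes audio arrives as frames of float samples in the range -1.0 to 1.0; the function names and the `target`, `rate`, `min_gain` and `max_gain` parameters are illustrative choices, not values from Speaky, whose AGC may well live in the microphone hardware or driver.

```python
import math


def rms(frame):
    """Root-mean-square level of one audio frame (samples in -1.0..1.0)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def apply_agc(frames, target=0.1, rate=0.5, min_gain=0.1, max_gain=10.0):
    """Scale each frame by a gain that drifts toward target / measured level."""
    gain = 1.0
    out = []
    for frame in frames:
        level = rms(frame)
        if level > 0:
            desired = target / level
            # Move only part of the way toward the desired gain,
            # so the sensitivity does not jump abruptly.
            gain += rate * (desired - gain)
            gain = max(min_gain, min(max_gain, gain))
        out.append([s * gain for s in frame])
    return out
```

A loud, close speaker produces a high measured level, so the gain falls below 1; a quiet or distant speaker produces a low level, so the gain rises, up to the `max_gain` limit.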


Slide13 l.jpg

Speaky Media Center: v) Infrared, Bluetooth or radio frequency communication?

  • there were several ways to make the remote control communicate with the Personal Computer:

    • infrared forces the user to point the remote control at the receiver and to have it in line of sight

    • Bluetooth has a heavy software stack and it is an expensive proprietary protocol;

    • Instead:

    • radio frequency passes through walls, it radiates in all directions so that the user can hold the remote control in any position, it is a low-consumption transmitting technology and it is open to anyone


Slide14 l.jpg

Speaky Media Center: vi) Directed dialog or natural language?

  • This may be the most important question to face when designing a speech-based personal assistant

    • natural language is really rich and complex and it is still very difficult to model

    • Speaky’s scope is very large and rich, so the user may say almost anything when speaking to it

    • moreover, the user may pause and mumble while speaking, and this behaviour seriously harms ASR performance

    • so:

    • We chose an ‘enriched directed dialog’: the user may say anything he/she sees on the screen. In addition, the system recognizes the most common prefixes for each specific area, to make its understanding as wide as possible. This covers the cases in which the user, although knowing that the directed commands alone are enough, says ‘something around’ those commands (for example let me hear, let me see, I want to listen to….)
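
A text-level sketch of the ‘enriched directed dialog’ idea: accept any item currently on screen, optionally preceded by a known prefix. This is an assumption-laden illustration only. A real system would build this into the ASR grammar rather than match strings after recognition, and `match_command` and the item names are hypothetical; the three prefixes are the examples given on the slide.

```python
# Example prefixes taken from the slide; a real list would be per-area.
PREFIXES = ("let me hear", "let me see", "i want to listen to")


def normalize(text):
    """Lower-case and collapse whitespace so comparisons are forgiving."""
    return " ".join(text.lower().split())


def match_command(utterance, on_screen_items, prefixes=PREFIXES):
    """Return the on-screen item the utterance refers to, or None."""
    spoken = normalize(utterance)
    # Strip at most one known prefix, trying the longest first.
    for prefix in sorted(prefixes, key=len, reverse=True):
        if spoken.startswith(prefix + " "):
            spoken = spoken[len(prefix) + 1:]
            break
    # What remains must be exactly one of the visible commands.
    for item in on_screen_items:
        if normalize(item) == spoken:
            return item
    return None
```

For example, with `["Music", "My Photos", "TV"]` on screen, both "music" and "I want to listen to music" resolve to `"Music"`, while an utterance that names nothing on screen is rejected.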


Slide15 l.jpg

Speaky Media Center: vii) How to speak international texts with one voice: Phonetic mapping and language guesser

  • In many Speaky areas the user may interact with international titles, for example in music, videos, TV programs

    • to let the user pronounce the titles in his/her collection, whatever their language, we used a particular feature of Loquendo ASR named ‘phonetic mapping and language guesser’

    • with this innovative feature the system is able to guess the native language of the title and to map its phonemes from the original language to the current ASR language, so that one ASR (using one language) can recognize titles in many different languages
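
The two steps above (guess the title's language, then map its phonemes onto the recognizer's language) can be pictured with a toy sketch. Nothing here reflects Loquendo's actual API or data: the function names, the hint-word lists and the phoneme substitution tables are all made-up illustrations, and the grapheme-to-phoneme step that would produce the input phonemes is assumed to exist elsewhere.

```python
# All tables below are tiny, made-up illustrations, not Loquendo data.
HINT_WORDS = {
    "en": {"the", "of", "and", "you"},
    "it": {"il", "la", "di", "che"},
    "de": {"der", "die", "das", "und"},
}

# Hypothetical nearest-phoneme substitutions into an Italian phoneme set.
# An empty string means "drop this phoneme".
TO_ITALIAN = {
    "en": {"θ": "t", "ð": "d", "æ": "a", "h": ""},
    "de": {"y": "i", "ø": "e", "x": "k", "h": ""},
}


def guess_language(title, default="it"):
    """Pick the language whose hint words appear most often in the title."""
    words = set(title.lower().split())
    scores = {lang: len(words & hints) for lang, hints in HINT_WORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default


def map_phonemes(phonemes, source_lang, table=TO_ITALIAN):
    """Replace phonemes missing from the target language by close ones."""
    mapping = table.get(source_lang, {})
    mapped = (mapping.get(p, p) for p in phonemes)
    return [p for p in mapped if p]
```

The mapped phoneme sequence is what a single-language recognizer would then listen for when the user says a foreign title.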


Slide16 l.jpg

Speaky Media Center: OPEN Speaky

  • Speaky Media Center is an open programming environment with an easy-to-use API, so that any content or software provider can be integrated quickly and simply


Slide17 l.jpg

Speaky Media Center: Partner and First Customer

  • Loquendo is Speaky’s speech-engine partner: visit BOOTH # 509, where you can try Speaky Media Center

  • Olidata, the first Italian PC vendor, is Speaky’s first customer

  • …we are looking for US commercial and technological partners!


Slide18 l.jpg

Speaky Media Center: next steps…

  • much more Speaky news is coming soon… Stay with us!

  • www.mediavoice.it

