
Today and the Future of Wearable Agents. Emmett Coin, Director of Speech Research and Development. SpeechTEK 2007 West, February 21, 2007.


Presentation Transcript


  1. Today and the Future of Wearable Agents • Emmett Coin • Director of Speech Research and Development • SpeechTEK 2007 West, February 21, 2007

  2. what I will talk about • Definition of a wearable voice agent • Overview of voice-based agents in logistics • the subsystems in a real world voice application • Some of the more difficult issues • Futures • Summary • Conclusion

  3. what is a “Wearable Voice Agent”? • NOT just a voice app on a cell phone or PDA • Voice dialing • Stock quote • One shot use • Rather it IS a partner that: • Complements the task (a teammate) • Adds value (faster, more accurate, less injury etc.) • Used for extended periods of time (maybe all day long) • Requires no (or very little) hand/eye time • Is as small as possible • Becomes Invisible (forget that the device is there)

  4. some examples • Currently • Battlefield: Translation • Inspection: Insurance, Q/C • Logistics: Distribution Centers • Consumer: GPS route computers • Very Near Future • Retail: Extend Distribution to the Sales Clerk • Consumer: Organize lists and errands • Industry: Process Control

  5. voice in logistics • Distribution Centers • The way we move the vast majority of products from manufacturer to consumer • Moving from many homogeneous collections to many heterogeneous collections • Many Suppliers (send product TO the Center) • Many Stores (receive product FROM the Center) • A massive repackaging task • A sizable fraction of the cost of retail products • One of the biggest sectors of “wearable voice agents” to date

  6. voice in logistics

  7. a “Selector” talking to Jennifer • The agent tells the human: • where to go • what to “select” • how many to “select” • where to put the item(s) • The human tells the agent: • Location checkstring • Quantity selected • If the bin is empty
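The exchange on this slide can be sketched as a short turn-by-turn routine. This is only an illustration, not the Jennifer implementation: the `Pick` fields, the `run_pick` function, the prompt wording, and the word "empty" as the empty-bin report are all assumptions, and speech recognition is replaced by an iterator of already-recognized utterances.

```python
from dataclasses import dataclass


@dataclass
class Pick:
    """One pick assignment (hypothetical fields, for illustration only)."""
    location: str      # where to go
    checkstring: str   # digits posted at the location, spoken to verify it
    item: str          # what to select
    quantity: int      # how many to select
    destination: str   # where to put the item(s)


def run_pick(pick, heard):
    """Walk through one pick.

    `heard` is an iterator of utterances the recognizer has already decoded.
    Returns (prompts spoken by the agent, quantity picked or None).
    """
    said = [f"Go to {pick.location}."]

    # The human reads back the location checkstring.
    if next(heard) != pick.checkstring:
        said.append("Wrong location.")
        return said, None

    said.append(f"Select {pick.quantity} {pick.item}.")

    # The human reports the quantity selected, or that the bin is empty.
    reply = next(heard)
    if reply == "empty":
        said.append("Bin marked empty.")
        return said, 0

    picked = int(reply)  # assumes the recognizer returns digits as text
    said.append(f"Put {picked} on {pick.destination}.")
    return said, picked


pick = Pick("aisle 12 slot 4", "37", "cases of soup", 3, "pallet A")
prompts, count = run_pick(pick, iter(["37", "3"]))
```

A real agent would also handle repeats, short picks, and recognition errors; those are left out to keep the turn structure visible.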

  8. a “Selector Agent” in the Refrigerator.

  9. some things that just happened • Selector was directed to product • Location was verified • Some product had unique weights entered • Others had expiration dates to verify • Selector needed the agent to repeat • Selector was lifting (80 lbs), walking, driving, reading, etc. while talking

  10. fast interaction • Overlapped dialog • Look ahead • Independent use • Eyes • Hands • Speech • Natural corrections • Low cognitive load

  11. accommodation • Linguistics • Finishing each other's sentences • The classic “barge-in” • The never (but maybe soon) seen “interruption” • Expectation • Predicting dialog flow • When the response is marginal but expected • Response is “legal” but how probable?
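One way to read the "Expectation" bullets is as a rescoring step: weight each recognition hypothesis by how likely the dialog state makes it. The sketch below assumes that reading. The `rescore` function, the `floor` value for unexpected responses, and the example numbers are invented for illustration and do not come from the talk or from any particular recognizer.

```python
def rescore(candidates, expected, floor=0.01):
    """Pick the hypothesis with the best (confidence x expectation) score.

    candidates: list of (text, recognizer_confidence) pairs, confidence in [0, 1]
    expected:   dict mapping text to how probable the dialog state makes it
    floor:      weight given to responses that are "legal" but not expected
    """
    def score(candidate):
        text, confidence = candidate
        return confidence * expected.get(text, floor)

    best_text, _ = max(candidates, key=score)
    return best_text


# The agent just asked for a quantity and the order calls for three.
# "free" scored slightly higher acoustically, but it is not an expected reply.
candidates = [("three", 0.45), ("free", 0.50)]
expected = {"three": 0.6, "two": 0.3, "empty": 0.1}
choice = rescore(candidates, expected)
```

Here the marginal but expected "three" wins over the acoustically stronger "free", which is the behavior the slide describes.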

  12. accommodation example: agent side • In conventional voice applications the prompts need to be clear and unambiguous. • But for an agent “co-worker” this would be tedious. • In the beginning a natural prompt speed is best for learning the routine. • Later, however, “natural” will feel like “slow-mo” and must be “snappier”. • Later still, the human and agent know each other well and just cut to the chase, further shortening the prompt.
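The progression from full prompts to terse ones could be driven by how much experience the worker has with the agent. The following is a minimal sketch under that assumption: the three prompt templates, the `prompt_for` function, and the pick-count thresholds are hypothetical, and prompt speed (also mentioned on the slide) is not modeled.

```python
# Prompt templates from most to least verbose (wording is illustrative only).
PROMPTS = {
    "novice": "Please go to aisle {aisle}, slot {slot}, and select {qty} cases.",
    "practiced": "Aisle {aisle}, slot {slot}, select {qty}.",
    "expert": "{aisle} {slot}, {qty}.",
}


def prompt_for(completed_picks, aisle, slot, qty,
               practiced_after=200, expert_after=2000):
    """Choose a prompt style from the worker's experience.

    The thresholds are made-up defaults; a real system would tune them,
    or let the worker choose.
    """
    if completed_picks >= expert_after:
        level = "expert"
    elif completed_picks >= practiced_after:
        level = "practiced"
    else:
        level = "novice"
    return PROMPTS[level].format(aisle=aisle, slot=slot, qty=qty)


first_day = prompt_for(0, aisle=12, slot=4, qty=3)
veteran = prompt_for(5000, aisle=12, slot=4, qty=3)
```

The same pick is announced as a full sentence on the first day and as a few words once the routine is familiar.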

  13. components of a voice agent • Small device • Light weight, long battery life, rugged • Speech Technologies • Recognition, Text-to-Speech and recorded waves • Multi-Modal fits in here too. • Dialog Management • A core system that controls the goals of the interaction • Connectivity • The “real” work usually involves information external to the agent

  14. simple view of a generic voice platform • Most “PDA”-like platforms run some version of Windows CE or Windows Mobile • They need full-duplex GOOD quality audio IO • Enough “cycles” to do the ASR and TTS • Low level control over “power management”

  15. a more complicated view

  16. some industrial hardware platforms

  17. did I mention they have to be tough……

  18. would regular folks “talk” with a computer? • Obviously Hands and Eyes free • Grocery shopping • Assembling a child’s toy • Cooking a new recipe • We think differently (freely? Innovatively?) when we talk • Talking is a low (perceived) cognitive load • People get “writer’s block” more often than “talker’s block” • To offload and manage the fussy details of our lives

  19. Futures • The latest cell phones have the power to support a voice-based agent. • They cost 1/10th as much as a present-day industrial device • It is just a matter of time before we talk TO our phone as well as ON it.

  20. Summary • Wearable voice agents • Have been here for a while • Proven and make good business sense • Declining in cost • Expanding the range of worker multi-tasking • Can be effortless to use

  21. Conclusions • They are in more places than you think • They are REAL TOOLS not window dressing • They are just in their infancy • I am looking forward to my next new synthetic agent!

  22. Thank you! • Contact: • Emmett Coin • Director of Speech Research and Development • coin@lucasware.com • 724 940 7041 • www.lucasware.com
