
Open Source Telecommunications: Enabling Anyone to Build a Bad Telephony Application

Jeff Dworkin, Segment Marketing Manager, jeff.dworkin@dialogic.com





Presentation Transcript


  1. Open Source Telecommunications: Enabling Anyone to Build a Bad Telephony Application Jeff Dworkin Segment Marketing Manager jeff.dworkin@dialogic.com

  2. Human Factors in Voice Interface Design Jeff Dworkin Segment Marketing Manager jeff.dworkin@dialogic.com

  3. Dialogic at a Glance Mission: To Enable Secure Multimedia Communications Through Any Network To And From Any Endpoint In The World Company Highlights • Privately-held corporation • Headquartered in Montreal, Quebec with over 700 employees, including ~37% in R&D functions • 14 major offices and 27 additional sales locations globally • Industry leader in communications enabling technology solutions • Dialogic is the most recognized name in the converged communications enabling industry and remains the market segment leader • Deployed in over 80% of Fortune 2,000 companies and in the vast majority of service provider networks in over 80 countries • Founded in 1984 • Numerous Industry Firsts in Mobile Video and VoIP • 79 Unique Registered Patents and over 60 Pending Patents • Over 70 Million Ports Shipped Leading Enabling Technology Solutions Thought Leadership Deep Domain Expertise Industry Standard Solutions

  4. Dialogic Evolution • Multimedia/Video VAS enabling leadership • Extend into Web communication innovators • TDM to IP Transition Leadership • HD Voice • Enabling Video IP Streaming Value Added Services • Extends Mobile VAS Segment Leadership • Extends Technology Enabling MSS Leadership • Video Algorithmic and Analytics Leadership • Extends Technology Enabling MSS Leadership • Fax Segment MSS Leadership • Deeper Service Provider Segment Products / Customers • Service Provider gateway and IP media server • Converged Communications Technology Enabling Market Segment Share Leadership • Dialogic “pioneer” history, relationships and patent portfolio • Enterprise Gateway • Established SS7 / Signaling Part of Business • Established HMP as core to Dialogic customer value proposition 2010 2009 2008 2006 2007 “VIDEO IS THE NEW VOICE”™ “VIDEO IS THE NEW VOICE”™

  5. What is Human Factors? • Ergonomics – an applied science concerned with designing and arranging things people use, such that they interact most efficiently and safely. • Ergonomics is the physical part • Human Factors encompasses the physical as well as the mental and emotional. • The Man/Machine Interface

  6. Persistence, Memory and Time PEOPLE JUST DON’T LISTEN!

  7. Persistence, Memory and Time: Telephony Interfaces vs. Visual Interfaces

  8. Persistence In a visual display, data remains on the display until replaced by new data. • This allows users to: • Return to a task after interruption • Review – by scanning back and forth – among several possible menu choices • Eliminate or minimize the effects of time by scrolling freely between the past and the present • Maintain context – even when confronted with multiple tasks

  9. Memory • The serial presentation of auditory information places heavy demands on working memory • More impactful on novice users • More impactful on older users

  10. Time “Time is the enemy of the spoken user interface” (Bruce Balentine/David P. Morgan, How to Build a Speech Recognition Application) • Defeating this enemy requires repeating critical information until it “sticks” • Yet it takes time to say things • “Hold on – I’m writing this down” • Cultural/social issues can cause communication breakdown • Issues of prosody/timing • What’s your phone number? • Is it 973-555-1212 or is it 9735-5-1212?

  11. Machine Output

  12. Machine Spoken Output Prompts – indicate it is time for user input. Feedback – presents the application state that results from user input, allowing the user to compare original intent with final results. Instructions – give information to the user about operating the user interface or understanding the task. Help – offer context sensitive corrective action. Often adopts a separate mode or state aimed at coaching. Application Data – the content or information that the user seeks or intends to modify.

  13. Silence, the Silent Killer • People will wait without feedback for six to eight seconds. • Anything longer than that and callers will think something is wrong • Causes frustration • Causes people to hang up • If a processing delay or a wait in queue lasts more than six seconds, give the caller feedback • Music, information, advertising • If using tones, explain the tone, or callers may think it means they have been disconnected
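The six-second rule on this slide can be sketched as a small scheduling helper. This is only an illustration: the function name, the constant, and the choice to schedule feedback again every six seconds during a long wait are assumptions, not something the slides or any Dialogic API prescribe.

```python
# Hypothetical sketch: when should feedback start during a wait?
SILENCE_LIMIT_S = 6.0  # lower end of the six-to-eight-second tolerance


def feedback_times(wait_s: float, interval_s: float = SILENCE_LIMIT_S) -> list:
    """Return the offsets (seconds into the wait) at which to start feedback.

    Waits no longer than the limit need no feedback at all.
    """
    times = []
    t = interval_s
    while t < wait_s:
        times.append(t)  # play music, information, or an explained tone here
        t += interval_s
    return times


print(feedback_times(4))   # short delay: []
print(feedback_times(15))  # long queue wait: [6.0, 12.0]
```

With continuous music on hold, only the first offset matters. The repeated offsets are more useful for spoken status messages.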

  14. Prompts: Asking the User to Do Something (Machine Output)

  15. Action-Goal vs. Goal-Action • Action-Goal • Press one for sales… • Goal-Action • For sales, press one… • Goal-Action reflects the way people think; using Action-Goal can cause confusion. What you are saying: Press one for sales… Press two for marketing… Press three for support. What is heard: Press one (not heard because the user is not paying attention yet) for sales, press two… for marketing, press three… for support…???
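A minimal sketch of the Goal-Action ordering, assuming menu options are held as a key-to-goal mapping. The function name and data shape are illustrative only.

```python
def goal_action_prompt(options: dict) -> str:
    """Build a menu prompt that names the goal first, then the key to press."""
    return " ".join(f"For {goal}, press {key}." for key, goal in options.items())


menu = {"one": "sales", "two": "marketing", "three": "support"}
print(goal_action_prompt(menu))
# For sales, press one. For marketing, press two. For support, press three.
```

Because the goal comes first, a caller who only starts listening at "sales" still hears the key that goes with it.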

  16. Please, Now and Thank-You • “Social graces” just add to the length of the communication • For sales, please press one now… • For sales, press one… • Many phone-based interfaces are tedious because they unnecessarily put the word “please” in front of every action statement on a menu (e.g., “For more information, please press 4.”) (Schumacher)

  17. Anthropomorphism Definition: The attribution of human characteristics to non-human beings • This is not the same as the system having a “personality” • Experts disagree on the use of anthropomorphism In my opinion: • Avoid anthropomorphism • The more “like” a person people believe the system to be, the more they want to communicate with it as if it were a person. But it is not a person; it is a machine • If you must personify, let the personality be a narrator or guide, not the machine

  18. Compression The speed or tempo at which recordings are played • Should be between 135 words/min and 170 words/min • Software can be used to compress (or speed up) playback while maintaining the pitch of the voice • Faster may seem better, but it can cause errors due to retention issues and response mistakes, especially in older adults (Sharit, 2003) • Faster tempo can cause perceived enunciation errors, or mondegreens • Bad Moon Rising by Creedence Clearwater Revival • Heard: There’s a bathroom on the right • Sung: There’s a bad moon on the rise • Heard: Mairzy doats and dozy doats and liddle lamzy divey • Sung: Mares eat oats and does eat oats and little lambs eat ivy • For information and directions, press 5… • ???
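The 135 to 170 words/min range can be checked for a recording from its transcript and duration. A rough sketch, assuming whitespace-separated words are a good enough count; the function names are made up for illustration.

```python
def words_per_minute(transcript: str, duration_s: float) -> float:
    """Speaking rate of a recording, counting whitespace-separated words."""
    return len(transcript.split()) / (duration_s / 60.0)


def tempo_ok(transcript: str, duration_s: float,
             low: float = 135.0, high: float = 170.0) -> bool:
    """True when the recording falls inside the recommended tempo range."""
    return low <= words_per_minute(transcript, duration_s) <= high


prompt = "For information and directions, press 5"
print(tempo_ok(prompt, 2.4))  # six words in 2.4 s is about 150 words/min
print(tempo_ok(prompt, 1.5))  # six words in 1.5 s is about 240 words/min
```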

  19. Short/Long Prompts vs. Short/Long Recordings • Prevents repeating irrelevant prompts during error correction. • With “Long Recordings” you can end up with this: • “Thank you for calling XYZ, please enter your PIN” • “That was not a correct entry” • “Thank you for calling XYZ, please enter your PIN” With dial-through, dial-ahead and/or barge-in, why is this relevant? • With “Short Recordings” the interaction is better • “Thank you for calling XYZ”… “Please enter your PIN” • “That was not a correct entry” • “Please enter your PIN”
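The “Short Recordings” flow can be sketched by keeping the greeting and the PIN request as separate recordings, so that only the request is replayed after an error. The constant names, the retry limit, and the function are assumptions for illustration.

```python
GREETING = "Thank you for calling XYZ"
PIN_PROMPT = "Please enter your PIN"
ERROR = "That was not a correct entry"


def pin_dialogue(entries, correct_pin, max_attempts=3):
    """Return the recordings played, in order, for a series of caller entries."""
    played = [GREETING]  # played once, never repeated on retry
    for entry in entries[:max_attempts]:
        played.append(PIN_PROMPT)
        if entry == correct_pin:
            return played
        played.append(ERROR)
    return played


for line in pin_dialogue(["0000", "1234"], "1234"):
    print(line)
```

The greeting appears once in the output even though the caller needed two attempts.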

  20. Feedback Presents the application state that results from user input, allowing the user to compare original intent with final results • Echoing user input for confirmation • You entered “ABC”. If this is correct, press 1; if you need to try again, press 2 • You said “ABC”, is this correct? • Do not echo menu choices • For technical support press 1… “Technical Support Menu” • Echoing can be tedious for experienced users; the feedback can be implied in the follow-up prompt: “For new product installation support, press 1. For troubleshooting an existing implementation, press 2”

  21. Instructions vs. Help • Instructions • Give information to the user about operating the user interface or understanding the task. • Often terse and to the point. • Given as part of the operation of the system • “At any time during this call, press the # key to go back one menu” • Help • Offers context-sensitive corrective action. • Often adopts a separate mode or state aimed at coaching. • Perhaps use a different voice for Help. • The user is not trying to operate the system but to learn about it. • “This system allows you to retrieve your account information without having to speak to an agent. You will need your account number and the last four digits of your Social Security number to access your account information.”

  22. Lists, Menus and User Input

  23. Hierarchy vs. Skip and Scan Hierarchy: • For Sales, press 1… For Support, press 2. • There are four matches… For Jan Smith, press 1… For John Smith, press 2… For Ken Smith, press 3. • “Lakeview Terrace”, press 9… “Burn After Reading”, press 10… “Igor”, press 11. Skip and Scan: • Sales. To select this option, press 1. For the next option, press 9. For the previous option, press 7. • Jan Smith. To select this option, press 1. For the next option, press 9. For the previous option, press 7. • “Lakeview Terrace”. To select this option, press 1. For the next option, press 9. For the previous option, press 7.
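Skip and Scan amounts to a cursor over a list, with one key to select and two to move. A minimal sketch using the key assignments from the slide (1, 9, 7). The class name and the choice to stop at either end of the list rather than wrap around are assumptions.

```python
class SkipAndScan:
    SELECT, NEXT, PREVIOUS = "1", "9", "7"

    def __init__(self, items):
        if not items:
            raise ValueError("need at least one item")
        self.items = list(items)
        self.index = 0

    def prompt(self) -> str:
        """Prompt for the item currently under the cursor."""
        return (f"{self.items[self.index]}. To select this option, press 1. "
                "For the next option, press 9. For the previous option, press 7.")

    def press(self, key: str):
        """Handle one key; return the selected item, or None while browsing."""
        if key == self.SELECT:
            return self.items[self.index]
        if key == self.NEXT:
            self.index = min(self.index + 1, len(self.items) - 1)
        elif key == self.PREVIOUS:
            self.index = max(self.index - 1, 0)
        return None


nav = SkipAndScan(["Jan Smith", "John Smith", "Ken Smith"])
nav.press("9")         # move to John Smith
print(nav.prompt())
print(nav.press("1"))  # John Smith
```

Each item has the same three-key prompt, so the caller never has to remember a list-specific digit such as 10 or 11.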

  24. Number of Choices Per Menu • The primacy and recency effects • Designers should also consider the primacy and recency effects, which lead users to remember the first and last options most often. The recency effect makes the last few items presented in a list the easiest to recall. However, a short disturbance or interference can make it difficult to remember the last few items (Baddeley, 1999). • Most people can only remember 5 choices • Some can remember more, some less • More complex instructions are harder to remember • Older users have more difficulty remembering • 5 items ± 2, depending on user base and complexity, is a good rule of thumb • Dynamic menus present only the options that are available to the user based on their permissions
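A dynamic menu can be sketched as a filter over the full option list, followed by a check against the five-item rule of thumb. The option data, the permission names, and both helper functions are hypothetical.

```python
MAX_CHOICES = 5  # rule of thumb from the slide; adjust by user base and complexity


def dynamic_menu(options, permissions):
    """Keep only the options whose required permission the caller holds."""
    return [label for label, required in options if required in permissions]


def within_limit(menu, limit=MAX_CHOICES) -> bool:
    """True when the menu is short enough to hold in working memory."""
    return len(menu) <= limit


all_options = [("sales", "any"), ("billing", "customer"), ("admin tools", "admin")]
menu = dynamic_menu(all_options, {"any", "customer"})
print(menu)                # ['sales', 'billing']
print(within_limit(menu))  # True
```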

  25. Delimiters: To # or not to # • What is that thing (#) called? • Pound, Number Sign, Hash, Octothorpe, Square? • Telling them where it is • The # key is located at the lower right corner of your keypad. • Enter your 4-digit PIN followed by the # sign? • Why require the # if you know the length of the expected input? • Enter your 4-digit PIN • What to do if they enter # anyway?
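One possible answer to the last question is to finish a fixed-length entry as soon as the expected digits arrive and quietly swallow a `#` that follows. The sketch below works on an already buffered string of keys, which is a simplification: a real system receives keys over time and needs an inter-digit timeout. The function name and return shape are assumptions.

```python
from typing import Optional, Tuple


def collect_digits(keys: str, expected_len: Optional[int] = None) -> Tuple[Optional[str], str]:
    """Return (entry, leftover_keys); entry is None when input is invalid or incomplete.

    Fixed length: done after expected_len digits; a '#' right after is swallowed.
    Variable length (expected_len=None): '#' terminates the entry.
    """
    digits = ""
    for i, key in enumerate(keys):
        rest = keys[i + 1:]
        if key == "#":
            if expected_len is None and digits:
                return digits, rest
            return None, rest  # '#' pressed too early, or with nothing entered
        if not key.isdigit():
            return None, rest
        digits += key
        if expected_len is not None and len(digits) == expected_len:
            if rest.startswith("#"):
                rest = rest[1:]  # caller pressed # anyway: ignore it
            return digits, rest
    return None, ""  # ran out of keys before the entry was complete


print(collect_digits("1234", 4))   # ('1234', '')
print(collect_digits("1234#", 4))  # ('1234', '')
print(collect_digits("98765#1"))   # ('98765', '1')
```

Swallowing the stray `#` keeps it from being misread as input to the next prompt.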

  26. Press vs. Enter • Use Press when a single-digit entry is required • Implies that no delimiter (#) is needed • “For Sales, press 1...” • Use Enter when a multi-digit entry is required • Doesn’t matter if it is a fixed-length entry or a variable-length entry • “Enter your 4-digit PIN now” • “Enter your PIN, followed by the # key”

  27. Consistent use of keys • [0] should always be for exiting out to a human being if one is available. • Don’t hide the existence of a human being • Other keys can be used consistently throughout an application • There is no standard for this; it is a design choice • [9] = Always return to the Main Menu • [7] = Jump back one menu
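Consistency is easiest to enforce when the global keys are checked in one place, before any menu-specific handling. A small sketch using the key assignments from the slide; the action names and the function are hypothetical.

```python
# Keys that mean the same thing in every menu (assignments taken from the slide).
GLOBAL_KEYS = {"0": "operator", "9": "main_menu", "7": "previous_menu"}


def route_key(key: str, local_options: dict) -> str:
    """Resolve a key press; global keys always win over menu-local options."""
    if key in GLOBAL_KEYS:
        return GLOBAL_KEYS[key]
    return local_options.get(key, "invalid")


support_menu = {"1": "installation", "2": "troubleshooting"}
print(route_key("0", support_menu))  # operator
print(route_key("2", support_menu))  # troubleshooting
print(route_key("5", support_menu))  # invalid
```

A side effect of this design is that no individual menu can reuse 0, 7, or 9 for something else.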

  28. Other User Inputs • Directional Metaphors • Mnemonics • Alphabetic Input • Two Button – Key then Position • Two Button – Key then Location • Count along the key

  29. DTMF vs. Speech Recognition

  30. DTMF or ASR: Different or Better • DTMF: STRENGTHS • Familiarity • Ubiquity • Speed • Privacy • Efficiency • Availability • Cost • DTMF: WEAKNESSES • Auditory Only • Taxes Working Memory • Limited Input Device • Variability in Equipment • ASR: STRENGTHS • Hands Free in a Mobile World • Flexible • Adaptable • Good for Data Intensive Input • Automated Attendant • Lists • ASR: WEAKNESSES • Cost • Difficult to Recover From Errors • Error Amplification • Regional Issues • Legally Ambiguous

  31. ASR Menus • Don’t mimic DTMF menus • “To pay with Visa, press 1 or say one” • “To pay with Visa, press or say 1” • “To pay with Visa, say Visa” • How about: • “What credit card would you like to use to pay for that?”

  32. Error Correction in Speech Recognition

  33. The Infinite Loop of Misunderstanding – Part One User : “Call Mom at Home” App : Did you say “Call Mom at Home”? User : “Yes” App : Response Not Understood. Please repeat User : “Call Mom at Home” App : Did you say “Call Mom at Home”? User : “Yes” App : Response Not Understood. Please repeat

  34. The Infinite Loop of Misunderstanding – Part Two App: Did you say “Call Mom at Home”? User: “Yes” App: Did you say “Yes” or “No”? User: “Yes” App: Did you say “Yes” or “No”? User: “Yes” App: Did you say “Yes” or “No”?

  35. Breaking the Loop App: Did you say “Call Mom at Home”? User: “Yes” App: Was that a Yes? Case One User: “Yes” First and second utterances match. So the answer is YES. Case Two User: “No” First and second utterances DO NOT match. So the answer is NO.

  36. Grunt Detection – When All Else Fails App: Thank you for calling ABC Company, what would you like to do? User: “UNGSLDFKJ” App: This system can provide NEWS… User: <no response> App: Weather… User: <no response> App : Sports… User: “UNGSLDFKJ” App: Today’s sporting news…
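Grunt detection treats any sound at all as a selection of whichever option was just offered, so the recognizer does not need to understand a word. A sketch under the assumption that the platform can report, per option, whether the caller made a sound. The function and its inputs are illustrative.

```python
def grunt_select(options, heard_sound):
    """Offer options one at a time; return the first one during which the caller
    made any sound, or None if the caller stayed silent throughout.

    heard_sound holds one boolean per option (True = some sound was detected).
    """
    for option, sound in zip(options, heard_sound):
        if sound:
            return option
    return None


# Mirrors the slide: silence after News and Weather, a grunt after Sports.
print(grunt_select(["News", "Weather", "Sports"], [False, False, True]))  # Sports
```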

  37. Implied Yes/No User: “Call Mom at Home” App: Calling Mom at Home User: <no response> PLACE THE CALL User: “Call Mom at Home” App: Calling Mom at Home User: “Yes.” PLACE THE CALL User: “Call Mom at Home” App: Calling Mom at Home User: “Hold it!” DO NOT PLACE THE CALL AND VERIFY
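The three cases on this slide reduce to one rule: after announcing the action, proceed unless the caller objects. A minimal sketch; the list of cancel phrases is an assumed vocabulary, and treating every other reply as consent is a design assumption that goes beyond the three cases shown.

```python
CANCEL_PHRASES = {"no", "stop", "cancel", "wait", "hold it"}  # assumed vocabulary


def should_place_call(reply):
    """Decide after 'Calling Mom at Home': reply is recognized text, or None for silence."""
    if reply is None:
        return True  # silence implies yes
    normalized = reply.strip().lower().rstrip("!.")
    return normalized not in CANCEL_PHRASES


print(should_place_call(None))        # True: place the call
print(should_place_call("Yes."))      # True: place the call
print(should_place_call("Hold it!"))  # False: do not place the call, verify instead
```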

  38. Wrap Up • Telephony user interfaces are significantly different from visual interfaces • Understanding and designing to these differences will play a significant role in the success or failure of the system • When implementing ASR, how you handle what the system DOESN’T understand is often more important than how you handle what it DOES understand. • Understand your users • Understand their goals • Build the system they want to use, not the one you want to build

  39. References • How to Build a Speech Recognition Application • Balentine & Morgan, 2001 • It’s Better to be a Good Machine than a Bad Person • Balentine, 2007 • Increasing the Usability of Interactive Voice Response Systems: Research and Guidelines for Phone-Based Systems • Schumacher, Hardzinski & Schwartz, 1995 • Skip and Scan: Cleaning up Telephone Interfaces • Resnick & Virzi, 1992 • Effects of Age, Speech Rate, and Environmental Support in Using Telephone Voice Menu Systems • Sharit, Czaja, Nair & Lee, 2003

  40. Dialogic, Dialogic Pro, Brooktrout, Diva, Diva ISDN, Making Innovation Thrive, Video is the New Voice, Diastar, Cantata, TruFax, SwitchKit, SnowShore, Eicon, Eicon Networks, NMS Communications, NMS (stylized), Eiconcard, SIPcontrol, TrustedVideo, Exnet, EXS, Connecting to Growth, Fusion, Vision, PacketMedia, NaturalAccess, NaturalCallControl, NaturalConference, NaturalFax and Shiva, among others as well as related logos, are either registered trademarks or trademarks of Dialogic Corporation or its subsidiaries (“Dialogic”). The names of actual companies and products mentioned herein are the trademarks of their respective owners. Dialogic encourages all users of its products to procure all necessary intellectual property licenses required to implement their concepts or applications, which licenses may vary from country to country. Dialogic may make changes to specifications, product descriptions, and plans at any time, without notice. This document discusses one or more open source products, systems and/or releases. Dialogic is not responsible for your decision to use open source in connection with Dialogic products (including without limitation those referred to herein), nor is Dialogic responsible for any present or future effects such usage might have, including without limitation effects on your products, your business, or your intellectual property rights. USE CASE(S)Any use case(s) shown and/or described herein represent one or more examples of the various ways, scenarios or environments in which Dialogic products can be used.  Such use case(s) are non-limiting and do not represent recommendations of Dialogic as to whether or how to use Dialogic products. 06/10 www.dialogic.com
