A Few of Speech Recognition's Greatest Blunders

David Thomson, CTO, SpeechPhone (VoiceXML Tools Committee chair), david@speechphone.com. Over 22 years in the field: some breakthroughs, some disasters.


Presentation Transcript


  1. A Few of Speech Recognition's Greatest Blunders. David Thomson, CTO, SpeechPhone (VoiceXML Tools Committee chair), david@speechphone.com

  2. Over 22 years in the field: some breakthroughs, some disasters.

  3. Field Problem Examples • Germs and money • User training • Echo cancellation • Inexperienced management • Last-minute "improvements" • User interface testing • Half-duplex speakerphones • Ventilation • Fire safety • Leading the market • Offering too much • Component "upgrade" • Tuning

  4. Chapter: Germs and Money

  5. ATM Speaker Verification. Prompt: "Pick up the phone and say the following digit string: 3594." Caller: "3594." • Two levels of security: PIN and voiceprint. • Random digit strings protect against replayed recordings.
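The two bullets on this slide can be sketched in a few lines. This is only an illustration and not the deployed ATM system, which the slides do not describe: the function names, the voiceprint score scale, and the 0.8 threshold are all hypothetical. It shows a fresh random digit challenge per attempt, so a recording of an earlier session does not contain the right digits, and an acceptance rule that requires both the PIN and the voiceprint to pass.

```python
import secrets

DIGITS = "0123456789"


def make_challenge(num_digits: int = 4) -> str:
    """Return a fresh random digit string for the caller to repeat."""
    return "".join(secrets.choice(DIGITS) for _ in range(num_digits))


def accept_caller(challenge: str, recognized: str, pin_ok: bool,
                  voiceprint_score: float, threshold: float = 0.8) -> bool:
    """Accept only if the PIN, the spoken digits, and the voiceprint all pass.

    voiceprint_score and threshold are hypothetical: a similarity score in
    [0, 1] from some speaker-verification engine, not a real API.
    """
    spoke_challenge = recognized == challenge
    voice_matches = voiceprint_score >= threshold
    return pin_ok and spoke_challenge and voice_matches


if __name__ == "__main__":
    challenge = make_challenge()
    print("Please say the following digit string:", challenge)
    # A replayed recording of an older session would carry different digits.
    print(accept_caller(challenge, challenge, pin_ok=True, voiceprint_score=0.93))
```

Using `secrets` rather than `random` is a design choice for this sketch: the challenge should not be predictable by someone preparing a recording in advance.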

  6. Chapter: User Training

  7. MovieFone (777-FILM). Prompt: "Hello and welcome to MovieFone..." • MovieFone w/ASR • MovieFone was the dominant U.S. movie information service, taking over 80,000,000 calls/year. • ASR was overwhelmingly preferred over touch-tone in a caller survey. • Users favored menu-based over spontaneous input.

  8. Example MovieLocator Transaction. Caller: "What science fiction movies are playing?" System: "Near what city?" Caller: "Wheaton." System: "Near Wheaton, Pirates of the Caribbean is playing at the Ogden 6 theater." Caller: "What time is it showing?" System: "At the Ogden 6 theater, Pirates of the Caribbean shows at 7:30." Movie information conversation: the recognizer is designed to understand any reasonable movie information request from the caller.

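  9. Would You Use This To Find Movies? (Total = 22 subjects)

     | Method            | Never | Sometimes | Often | Always |
     |-------------------|-------|-----------|-------|--------|
     | Newspaper         | 0     | 8         | 6     | 7      |
     | Phone the Theater | 11    | 5         | 4     | 1      |
     | MovieFone         | 10    | 10        | 1     | 0      |
     | MovieLocator      | 8     | 5         | 6     | 2      |
     | Menu-based        | 3     | 6         | 8     | 4      |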

  10. ASR vs. Human Attendants • ASR: 96.2% of calls routed correctly • Receptionists: 87% of calls routed correctly • Conditions: callers were greeted with “How may I direct your call?” and were routed to one of over 30 departments. Accuracy was scored by the customer.

  11. Chapter: Echo Cancellation

  12. Echo in an Analog System. [Signal-path diagram; labels in source order: -11 dBm signal, Prompt Generator, Telephone Network, Tip/Ring Card, -15 dB, Hybrid, -6 dB, Echo Canceller, -7 dB, Line: -9 dB, -25 dBm signal, Speech Recognizer. At the recognizer: speech -40 dBm, echo -33 dBm, SNR -7 dB.] Low speech signal strength and strong echoes generated by the local network card conspire to make speech recognition difficult. Speech is up to 9 dB quieter and echoes are about 31 dB louder than in a digital system, for a total signal-to-noise ratio loss of 40 dB.
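The levels on this slide can be checked with simple dB arithmetic. The sketch below makes one assumption that the flattened diagram does not confirm: the -15 dB and -7 dB figures are treated as losses on the echo path, and the -9 dB and -6 dB figures as losses on the speech path. That grouping is chosen only because it reproduces the -33 dBm echo and -40 dBm speech levels the slide states. The function name is illustrative.

```python
from typing import Iterable


def level_dbm(source_dbm: float, gains_db: Iterable[float]) -> float:
    """Level after a chain of gains; in dB, gains and losses simply add."""
    return source_dbm + sum(gains_db)


# Assumed grouping (see lead-in): prompt at -11 dBm, echo path loses 15 dB and 7 dB.
echo_dbm = level_dbm(-11.0, [-15.0, -7.0])

# Assumed grouping: caller speech at -25 dBm, speech path loses 9 dB and 6 dB.
speech_dbm = level_dbm(-25.0, [-9.0, -6.0])

# SNR at the recognizer, treating the residual echo as the noise.
snr_db = speech_dbm - echo_dbm

# Slide's comparison with a digital system: 9 dB quieter speech plus 31 dB louder echo.
total_snr_loss_db = 9.0 + 31.0

if __name__ == "__main__":
    print(f"echo:   {echo_dbm:.0f} dBm")
    print(f"speech: {speech_dbm:.0f} dBm")
    print(f"SNR:    {snr_db:.0f} dB")
    print(f"SNR loss vs. digital: {total_snr_loss_db:.0f} dB")
```

A negative SNR means the recognizer receives the echo of its own prompt at a higher level than the caller's speech.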

  13. Chapter: Inexperienced Management

  14. Voice Verification and Dialing • Panic response to competitor. • No initial business case. • Used unproven SV platform. • Heavy use of inexperienced contractors. • Poor budgeting. • Distributed development organization. • Turf battles, technical disagreements, egos. • Changing feature requirements. • Staff of 60, 4 years, $70M.

  15. Chapter: Last-Minute “Improvements”

  16. Heat Sink Failure. [Photo; label: Epoxy Beads.]

  17. Chapter: User Interface Testing

  18. Multilingual Digit Dialer. Example utterance (German): "Vier drei fünf vier zwei null sechs drei sieben." ("Four three five four two zero six three seven.") • Complex user interface • Language dependencies ignored • No testing on naïve users • User errors exceeded ASR errors • System was deployed, then removed

  19. Chapter: Half-Duplex Speakerphones

  20. Name Dialing - Placing a Call. [Diagram; labels: Telephone Network, Voice Dialer.] (Dial tone) Caller: "Call home." Voice dialer: "Calling 'home.'"

  21. Half-Duplex Speakerphones. [Diagram: a speech recognition system connected to a half-duplex speakerphone with speaker and microphone.] Prompt: "What can I do for you now?" Response: "Call messages." Unless user speech can force the hands-free phone to switch off the prompt, the recognition system hears nothing.

  22. Lesson: Record or die

  23. Unmasking Half-Duplex Equipment. [Diagram: a speakerphone user and a handset user.] "Ready?" "OK." "Go." Each user then counts: "1 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9 - 10."

  24. Chapter: Ventilation

  25. Extreme Temperature Environment. [Room diagram; labels: 120 degrees, Frame 1, Frame 2, Airflow, Fan, Door, Vent, Hall, Window (20 yards).]

  26. A/C Frame cooling example - side view. [Three frames, each with a Monitor, an A/C Unit, and a Master PC: A. Ideal airflow; B. Air leaks; C. Ducted frame.]

  27. Improved Airflow

  28. Chapter: Fire Safety - 1

  29. Example of Flammability Failure (IR view)

  30. Chapter: Fire Safety - 2

  31. Central Office Grade Speech Server. [Photo of CDSUs in a frame; labels: LAN Card, 48V Power.]

  32. Backplane Current Sense Resistors. [Photo; label: Sense Resistors.]

  33. Chapter: Leading the Market

  34. Wi-Fi Voice Dialing. [Architecture diagram; labels: Telco Data Network, Wi-Fi Network, Mobile Device, SoftPhone, VoiceDial, VoIP Gateway, SDK, ASR, TTS.] Example utterance: "Call David Thomson."

  35. Chapter: Offering Too Much

  36. A service that does everything. [Diagram: a phone keypad and a display reading "Connecting 630-555-1212", surrounded by services: Business Directory, Movie Locator, Messages, Shopping, Weather Line, VoiceXML, Voice E-mail, Voice Dialing.] Business Directory example. System: "Welcome to Lucent Technologies Automated Business Call Dialer. Please say the name of the business to call. For information, say ‘help.’" Caller: "United Airlines." System: "Calling United. To cancel, say ‘cancel.’" Businesses may subscribe to be listed in this service.

  37. Privacy Manager

  38. Talking Call Waiting (http://www.ameritech.com/navigation/site/1,1935,150,00.html) • Now, you can HEAR who's behind the call waiting beep. • First, you hear the Call Waiting "beep" and then you hear the name of the second caller. • Once you've heard the name, you decide if you want to "click over" and take the call. It's that simple! • Talking Call Waiting is only $2.50 a month if you currently have Call Waiting on your phone line. • Talking Call Waiting is currently available in our Major Market areas of: Chicago, IL; Indianapolis, IN; Detroit, MI; Akron, OH; Cleveland, OH; Columbus, OH; Dayton, OH; Milwaukee, WI. • Call to order today: 1-888-635-5050.

  39. Chapter: Component “Upgrade”

  40. Processor (before die shrink)

  41. Chapter: Tuning

  42. Field Accuracy Improves Over Time. [Chart of error rate for a wireless digit dialing trial; stages: Lab, 1st Iteration, 2nd Iteration, Final; annotations: Land-Line Models, New Models from Field Data, Final Tuning.]

  43. Other Assorted Field Problems • ASR works, forces touch-tone failures • Late beep causes people to speak early • Voice enhancement wrecked spectrum • Failure to record left developers blind • Speech takes the heat for unrelated bugs

  44. For Slides or More Information: David Thomson, david@speechphone.com, phone 949-655-1693
