Theory and Practice

marja
Presentation Transcript


  1. Theory and Practice • Nothing so challenging as a practical problem • Nothing so practical as a good theory

  2. Two Themes for Talk • Speech Perception as Pattern Recognition • Perception is Multimodal • People as Optimal Parallel Processors • Relationship between theory and practice • Apply theory & technology before it is perfected • Valuable findings in application and evaluation • No shortage of new challenges

  3. Baldi and Language Tutoring • For Hearing-Impaired • Second Language Learning • For Reading Disabled • First Language Learning

  4. Two Principles of Perception • Multimodal Synergy • more sources, better performance • Optimal Parallel Processing • best use of the sources

  5. Two Principles of Learning • Time on Task • more time, better performance • Massed versus Spaced Practice • spaced better than massed

  6. Language Training • Speech Production Deficits • Ear instructs the tongue • Can eye instruct the tongue? • Advantages of Talking Head

  7. Baldi as Language Tutor: Advantages • Computers are popular with kids • One agent for each Student • Perpetual Agent • Extreme Patience • No Intimidation • Can Highlight Critical Organs • Can Hide Noncritical Components • Can Reveal Normally-Hidden Parts

  8. The CSLU Speech Toolkit

  9. CSLU Toolkit Is: • Authoring tools for building and using interactive language systems • Research tools for developing core language technologies and studying human communication • Learning tools supporting all areas of human language technology (25 tutorials) • Available free of charge from the CSLU Web site for research and education

  10. CSLU Speech Toolkit • Free for Education Uses • http://cslu.cse.ogi.edu/toolkit • http://mambo.ucsc.edu/psl/tools

  11. Toolkit Computer Requirements • Minimum Requirements • Intel Pentium machine • 200 MHz processor • 64 MB RAM (128 preferred) • ~75 MB free disk space • Windows 98 or NT 4.0* • Microphone / Speakers (Sound Blaster compatible) *

  12. Main Toolkit Components A programming environment integrating: • Speech Recognition • Text-to-Speech Synthesis • Facial Animation • Natural Language Understanding • Speech Analysis & corpus development • Rapid Application Development

  13. Dialogue Modeling • Rapid Application Developer (RAD) • Graphical drag-and-drop interface • Dialogues are constructed by connecting states into flowcharts • A scripting language (Tcl/Tk) provides flexibility
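
The idea of building a dialogue by connecting states into a flowchart can be illustrated with a small sketch. This is not RAD or Tcl/Tk code and does not reflect the toolkit's actual API; the `State` class, the `run_dialogue` function, and the example prompts are all hypothetical names chosen for illustration, and typed replies stand in for recognized speech.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class State:
    """One node in a dialogue flowchart (hypothetical, not a RAD object)."""
    prompt: str
    # Maps a recognized keyword to the name of the next state.
    transitions: Dict[str, str] = field(default_factory=dict)


def run_dialogue(states: Dict[str, State], start: str, replies: List[str]) -> List[str]:
    """Walk the flowchart and return the prompts spoken, in order."""
    spoken: List[str] = []
    current = start
    reply_iter = iter(replies)
    while True:
        state = states[current]
        spoken.append(state.prompt)
        if not state.transitions:  # terminal state: dialogue ends
            return spoken
        reply = next(reply_iter, None)
        if reply is None:  # no more user input
            return spoken
        # An unrecognized reply keeps the dialogue in the same state (re-prompt).
        current = state.transitions.get(reply.lower(), current)


# Example flowchart: one question state and two terminal states.
STATES = {
    "greet": State(
        "Would you like to practice English or Spanish?",
        {"english": "english", "spanish": "spanish"},
    ),
    "english": State("Let's practice English."),
    "spanish": State("Let's practice Spanish."),
}

if __name__ == "__main__":
    for line in run_dialogue(STATES, "greet", ["french", "Spanish"]):
        print(line)
```

In this sketch the flowchart is plain data, so adding a state or an edge means adding an entry to `STATES`; a drag-and-drop editor would be a graphical way of producing a comparable structure.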

  14. Rapid Application Developer

  15. Speech Recognition • Word Spotting • vocabulary and speaker independent keyword spotter • Alpha-digit and digit recognizers • Rejection of OOV words using garbage model • Optional grammars
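
Rejecting out-of-vocabulary (OOV) words with a garbage model can be sketched as a score comparison: a keyword is accepted only if its model scores better than a catch-all garbage model. The sketch below is a toy illustration under that assumption, not the toolkit's recognizer; the function name `spot_keyword`, the `margin` parameter, and the log-likelihood values are hypothetical.

```python
from typing import Dict, Optional


def spot_keyword(
    keyword_scores: Dict[str, float],
    garbage_score: float,
    margin: float = 0.0,
) -> Optional[str]:
    """Return the best-scoring keyword, or None if the garbage model wins.

    Scores are assumed to be log-likelihoods (higher is better). The best
    keyword must beat the garbage score by more than `margin` to be accepted.
    """
    if not keyword_scores:
        return None
    best = max(keyword_scores, key=keyword_scores.get)
    if keyword_scores[best] - garbage_score > margin:
        return best
    return None  # treated as an OOV word or noise


if __name__ == "__main__":
    # A clear "yes": the keyword model beats the garbage model.
    print(spot_keyword({"yes": -10.0, "no": -25.0}, garbage_score=-18.0))
    # Neither keyword beats the garbage model, so the input is rejected.
    print(spot_keyword({"yes": -30.0, "no": -28.0}, garbage_score=-18.0))
```

Raising `margin` in this sketch makes the spotter more conservative: fewer false accepts of OOV speech, at the cost of rejecting more genuine keywords.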

  16. Applying the Technology • Limitations of Current Technology • Speech Recognition by Machine • Collecting kids' auditory/visual database • Retrain recognizer

  17. Applying the Technology • Limitations of Current Technology • Speech Synthesis • Sounds Robotic • Teachers asked for natural voice

  18. Forced Alignment of Baldi and Natural Speech

  19. Implications of Research Findings • Research Findings: Ease and Efficiency of Multimodal Perception • Research Hypothesis: Production Training should be Multimodal

  20. Training with Baldi’s New Tongue and Palate

  21. Important Questions for Language Training • Can the eye instruct the tongue? • Can the eye/ear in combination do better? • Is valid assessment possible?

  22. Second Language Learning • Urgent Need • Shortage of Instruction • Value of Visible Speech • Individually Guided Instruction • Potential for Interactive Dialog

  23. Learning English

  24. Learning Spanish

  25. Learning to Read • Importance of Spoken Language for reading • Flexibility of Synthetic Speech • Written Language Presentation with Spoken Language • Segmenting and Highlighting Written Patterns • Electronic Textbooks and Books

  26. Learning a First Language • feedback in the crib • virtual caregiver • storybook reader • parent simulator • perpetual companion

  27. Paralinguistic Synthesis • Nonspeech Segments • Breath Noise, Cough, Clear Throat, Laugh, Lip Smack, Sneeze, Tongue Click, Burp

  28. Paralinguistic Synthesis • Suprasegmental Visual Information • Eye and head movements for reference • Eyebrow movements with pitch • Eye widening with pitch • Eye blinks at word onsets

  29. Happy Angry Surprise Fear Sad Disgust

  30. Baldi is growing a body

  31. GUI for Text Markup Language • Prosody • Emotion • Gesture
