
Radboud University Nijmegen

How to integrate automatic speech recognition (ASR) into CALL applications. Helmer Strik, Department of Linguistics, Centre for Language and Speech Technology (CLST), Radboud University Nijmegen, The Netherlands.


Presentation Transcript


  1. How to integrate automatic speech recognition (ASR) into CALL applications. Helmer Strik, Department of Linguistics, Centre for Language and Speech Technology (CLST), Radboud University Nijmegen, The Netherlands

  2. Overview • Introduction • ASR: automatic speech recognition • ASR-based tutoring • ASR-based CALL • ASR-based literacy training • Conclusions LESLLA, Antwerpen, 24-11-2008

  3. Introduction • Students who receive 1-on-1 instruction perform as well as the top two percent of students who receive traditional classroom instruction [Bloom 1984] • A human tutor for every student is not feasible → computer tutors • For language learning: CALL • Many text-based CALL systems • Include speech → speech-based CALL system

  4. Speech inside • Many applications with ‘speech’: • Screen readers • Reading pen • Mobile phone: photo + OCR + TTS • Some also (useful) for CALL

  5. Speech inside (cont’d) • Many applications with ‘speech’: screen readers, reading pen, etc. • Some also (useful) for CALL • However, usually the learner can • only listen (TTS: text-to-speech) • or also speak, but … • no assessment, or • the learner has to carry out the assessment, e.g. by comparing with examples • → use ASR / speech technology • Is it feasible?

  6. ASR: automatic speech recognition • What is ASR? • Speech-to-text conversion • Applications: • Dictation • Command and control • Spoken dialogue systems (information) • etc. • ASR is not flawless, and it will probably never be • esp. for non-native speech • Note: even humans do not recognize speech flawlessly!

  7. Speech Recognition [figure not included in transcript]

  8. ASR-based tutoring • ITS: Intelligent Tutoring Systems • Spoken dialogue system for learning • Subject matter: math, physics, etc. • Examples: • ITSPOKE, Univ. of Pittsburgh, Litman et al. Topic: Physics • SCoT, Stanford Univ., Peters et al. Topic (SCoT-DC): shipboard damage control • Communicate with speech • the subject matter doesn’t have to be speech

  9. ASR-based CALL • The subject matter is speech (language) • Late 1990s: • 1998: STiLL, Marholmen (Sweden); 1st time the CALL and Speech communities met • 1999: Special Issue of CALICO, ‘Tutors that Listen’, focusing on ASR (mainly ‘discrete ASR’)

  10. ASR-based literacy training • What has been done? • Reading tutors (the learner reads, not the PC): • Listen, CMU, Pittsburgh; Mostow et al. (1994) • STAR system, UK; Russell et al. (1996) • SPACE, KU Leuven; Van hamme, Duchateau, et al. • … and many others • FtL: Foundations to Literacy, Boulder; Cole, Wise, et al.

  11. ASR-based literacy training • Foundations to Literacy • Interactive Books • Teach fluent reading & comprehension • Foundational Skills Tutors • Teach underlying reading skills • Phonics

  12. ASR-based literacy training (cont’d) • What has been done? • Reading tutors: • Listen, CMU, Pittsburgh; Mostow et al. (1994) • STAR system, UK; Russell et al. (1996) • SPACE, KU Leuven; Van hamme, Duchateau, et al. • … and many others • FtL: Foundations to Literacy, Boulder; Cole, Wise, et al. • Mostly for children • And for adults? • What is needed? • What is possible, and what is not? • …

  13. ASR-based CALL • ASR is not flawless, and it will probably never be • esp. for non-native speech • Be aware of what is (not) possible with ASR technology • Problematic issues and possible solutions: • Noise, esp. background speech → minimize, head-sets • Disfluencies → minimize, improve automatic handling • Non-native pronunciation • Recognizing utterances → utterance verification • Detect pronunciation errors → classifiers
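The "recognizing utterances → utterance verification" step can be illustrated with a minimal sketch. It assumes the learner is prompted to say one of a small set of expected responses and that the recognizer returns a plain word string. The function name `verify_utterance`, the example sentences, and the 0.8 threshold are hypothetical, and the word-overlap score stands in for the acoustic confidence measures a real system would use.

```python
from difflib import SequenceMatcher


def verify_utterance(recognized, candidates, threshold=0.8):
    """Pick the expected response closest to the ASR output.

    Returns (candidate, score) when the best word-level similarity
    reaches the threshold, otherwise (None, score) so the tutor can
    ask the learner to try again.
    """
    rec_words = recognized.lower().split()
    best, best_score = None, 0.0
    for cand in candidates:
        # Word-level similarity in [0, 1]; a stand-in for a real
        # acoustic confidence measure (assumption).
        score = SequenceMatcher(None, rec_words, cand.lower().split()).ratio()
        if score > best_score:
            best, best_score = cand, score
    if best_score >= threshold:
        return best, best_score
    return None, best_score


# Hypothetical prompt with two allowed answers.
expected = ["ik woon in nijmegen", "ik werk in antwerpen"]
accepted, score = verify_utterance("ik woon nijmegen", expected)
rejected, _ = verify_utterance("het regent vandaag", expected)
```

Restricting the learner to a known set of responses is what keeps the task tractable here: the system only has to decide which expected answer was attempted, and whether it was attempted at all, instead of transcribing free non-native speech.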

  14. ASR-based CALL • Our research: • Non-natives • Assessment of oral proficiency • Dutch-CAPT – pronunciation • ASR / UV – Utterance Verification • PED – Pronunciation Error Detection • DISCO – pronunciation, morphology, syntax • TST-AAP • People with speech disability for training & as communication aid (AAC) • ASR for dysarthric speech • EST: E-learning based Speech Therapy

  15. ASR-based CALL • Project Dutch-CAPT • (Computer Assisted Pronunciation Training)

  17. ASR-based CALL (cont’d) • Project Dutch-CAPT • (CAPT: Computer Assisted Pronunciation Training) • Exp. group: used the Dutch-CAPT system • 2 control groups: didn’t use Dutch-CAPT • The reduction in the number of pronunciation errors made was significantly larger for the exp. group than for the control groups • Training: 4 weeks × 1 session of 30–60 minutes

  18. ASR-based CALL (cont’d) • ASR is not flawless, and it will probably never be • esp. for non-native speech • Be aware of what is (not) possible with ASR technology • Problematic issues and possible solutions: • Noise, esp. background speech → minimize, head-sets • Disfluencies → minimize, improve automatic handling • Non-native pronunciation • Recognizing utterances → utterance verification • Detect pronunciation errors → classifiers • Mix of expertise needed: • ASR technology, language acquisition, pedagogy, design, …

  19. ASR-based literacy training • Demonstration project TST-AAP • Existing course • Add speech technology: • Detect whether words & sounds were pronounced (correctly)

  20. ASR-based literacy training • Listening; PC: produces speech • Text-To-Speech (TTS); quality good enough? • Recorded speech, concatenation • Speaking; PC: recognizes speech • Phonics (see FtL) • PC: Recognize words, utterances: confidence measures (CMs) for utterance verification • PC: Recognize sounds: CMs for Phon. Ver. (contrasts) • Reading (reading tutors) • PC: Recognize words, utterances • PC: Pointer in the text (‘track’ the reader) • PC: Help when encountering problems • PC: Change tempo → read faster
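The "pointer in the text" idea for reading tutors can be sketched as follows. This is an illustration under stated assumptions, not the method of any of the systems named in the talk: the recognizer is assumed to deliver a list of recognized words, and the function name `track_reader`, the example sentence, and the three-word look-ahead window are hypothetical.

```python
def track_reader(text_words, recognized_words, window=3):
    """Advance a pointer through the text as the learner reads aloud.

    Each recognized word is searched for in a small look-ahead window
    starting at the current position. A match moves the pointer past
    that word; text words jumped over are recorded as skipped, and
    recognized words with no match (e.g. disfluencies) are ignored.
    """
    position = 0
    skipped = []
    for word in recognized_words:
        lookahead = text_words[position:position + window]
        if word in lookahead:
            offset = lookahead.index(word)
            # Indices of text words the reader jumped over.
            skipped.extend(range(position, position + offset))
            position += offset + 1
    return position, skipped


# Hypothetical reading attempt: one filler ("uh") and one skipped word ("zit").
text = "de kat zit op de mat".split()
heard = ["de", "kat", "uh", "op", "de", "mat"]
position, skipped = track_reader(text, heard)
```

The returned position is where the on-screen pointer would sit, and the skipped indices mark the words where a tutor could offer help.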

  21. ASR-based CALL • Advantages of using speech (vs. writing) • Self-explanation • Extra information: • Prosody (stress, accent) • Emotions • Confidence • Other useful techniques: • VTH

  22. Conclusions • ASR is not flawless • ASR-based tutoring is possible (restricted domain) • general topics; ITS: ITSPOKE, SCoT • CALL; many systems: non-natives, disabled, etc. • Literacy training • So far mainly for children • And for adults!? • Needed • Mix of expertise: technology, language acquisition, pedagogy, design, … • Improved ASR, speech technology • Projects, funds

  23. Questions? • Why are there so few ASR-based CALL / literacy applications for adults? • What are, in this context, important differences between children & adults? • What is needed? • Listening; PC: produces speech • Speaking; PC: recognizes speech • Phonics • Reading (reading tutors) • What else? THE END
