Audio Visual Speech Recognition

Presentation Transcript


  1. Audio Visual Speech Recognition
     System components: Grammar, Dictionary, Decoder, Feature extraction, Acoustic models
     Research project funded by a GET 2005 incentive grant (Projet de recherche sur crédit incitatif GET 2005)

  2. Audio processing
     • Feature extraction
     • Digit detection
     • Digit recognition:
        • Acoustic parameters: MFCC
        • Context-independent HMMs
        • Decoding: time-synchronous algorithm
     • Sound effect
     • Noise: babble
     • Recognition experiments
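The slide names a time-synchronous decoding algorithm without giving details. A minimal sketch of one common choice, a Viterbi search that advances all HMM states one frame at a time, is shown below. The function name `viterbi` and its arguments are hypothetical, and the log-likelihoods are assumed to be precomputed (for example from MFCC frames scored against the acoustic models); this is an illustration, not the decoder used in the project.

```python
import math

NEG_INF = float("-inf")  # log-probability of an impossible event


def viterbi(log_init, log_trans, log_obs):
    """Time-synchronous Viterbi search over a small HMM.

    log_init[s]     : log-probability of starting in state s
    log_trans[p][s] : log-probability of moving from state p to state s
    log_obs[t][s]   : log-likelihood of frame t under state s
    Returns (best state path, log-score of that path).
    """
    n_states = len(log_init)
    # Scores of the best partial path ending in each state at frame 0.
    scores = [log_init[s] + log_obs[0][s] for s in range(n_states)]
    backptr = []
    # Process frames strictly in time order, updating every state per frame.
    for frame in log_obs[1:]:
        new_scores, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda p: scores[p] + log_trans[p][s])
            new_scores.append(scores[best_prev] + log_trans[best_prev][s] + frame[s])
            pointers.append(best_prev)
        scores = new_scores
        backptr.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: scores[s])
    best_score = scores[state]
    path = [state]
    for pointers in reversed(backptr):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path, best_score
```

In a digit recogniser the states would belong to the concatenated word models allowed by the grammar and dictionary; the toy version above only shows the frame-by-frame recursion.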

  3. Video processing
     • Video extraction
     • Lip localisation
     • Image interpolation (to the same frame rate as the speech stream)
     • Feature extraction
        • DCT and DCT2 (DCT+LDA)
        • Projections: PRO and PRO2 (PRO+LDA)
     • Recognition experiments
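The interpolation step brings the video stream to the same rate as the speech stream so the two can be combined frame by frame. The sketch below shows one simple way to do this, linear interpolation between neighbouring video frames. The function name `interpolate_frames` and the rates used in the usage note (25 video frames per second, 100 audio frames per second) are assumptions for illustration; the slides do not state which interpolation method or rates were used.

```python
def interpolate_frames(video_frames, video_rate, audio_rate):
    """Linearly interpolate per-frame vectors from video_rate to audio_rate.

    video_frames : list of equal-length numeric vectors, one per video frame
                   (flattened pixel values or visual features)
    video_rate   : video frames per second
    audio_rate   : target frames per second (rate of the acoustic features)
    """
    if not video_frames:
        return []
    last = len(video_frames) - 1
    duration = last / video_rate                 # time stamp of the last video frame
    n_out = int(round(duration * audio_rate)) + 1
    out = []
    for i in range(n_out):
        pos = i * video_rate / audio_rate        # fractional video-frame index
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        w = pos - lo                             # weight of the later frame
        out.append([(1 - w) * a + w * b
                    for a, b in zip(video_frames[lo], video_frames[hi])])
    return out
```

For example, `interpolate_frames(frames, 25, 100)` inserts three interpolated vectors between each pair of consecutive video frames.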

  4. Fusion techniques
     • Parameter fusion:
        • Concatenation
        • Dimensionality reduction: Linear Discriminant Analysis (LDA)
        • Modelling: classical single-stream HMM
     • Score fusion: multi-stream HMM
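The two fusion strategies differ in where the audio and visual information is combined. The sketch below contrasts them under stated assumptions: parameter fusion is shown as plain vector concatenation (the LDA step is omitted), and score fusion is shown as a weighted sum of per-stream log-likelihoods with weights summing to one, a common multi-stream convention. All function names and the weight parameter are hypothetical. For simplicity the weighting is applied to whole-word scores, whereas a multi-stream HMM applies it to state emission scores.

```python
def concatenate_features(audio_vec, visual_vec):
    """Parameter fusion: one joint vector per frame, for a single-stream HMM."""
    return list(audio_vec) + list(visual_vec)


def fuse_stream_scores(audio_loglik, visual_loglik, audio_weight):
    """Score fusion: weighted sum of the two streams' log-likelihoods."""
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    return audio_weight * audio_loglik + (1.0 - audio_weight) * visual_loglik


def classify(audio_scores, visual_scores, audio_weight):
    """Pick the label with the highest fused score.

    audio_scores / visual_scores map each candidate label to a log-likelihood.
    """
    fused = {
        label: fuse_stream_scores(audio_scores[label], visual_scores[label], audio_weight)
        for label in audio_scores
    }
    return max(fused, key=fused.get)
```

Lowering `audio_weight` shifts the decision toward the visual stream, which is the usual motivation for score fusion when the audio is noisy.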

  5. Experimental results: parameter fusion

  6. Experimental results: score fusion at -5 dB
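Assuming the -5 dB figure is a signal-to-noise ratio (SNR) obtained by adding babble noise to the clean speech, a noisy test signal at a chosen SNR can be built by scaling the noise before mixing. The sketch below is illustrative only; the function name `mix_at_snr` is hypothetical and the slides do not describe how the noisy data was produced.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so that the result has the requested SNR in dB.

    speech, noise : lists of samples; noise must be at least as long as speech
    snr_db        : target SNR, 10 * log10(speech power / scaled noise power)
    """
    if not speech:
        raise ValueError("speech must not be empty")
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = sum(x * x for x in speech) / len(speech)
    p_noise = sum(x * x for x in noise) / len(noise)
    if p_noise == 0.0:
        raise ValueError("noise has zero power")
    # Gain that brings the noise power to p_speech / 10^(snr_db / 10).
    gain = math.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return [s + gain * n for s, n in zip(speech, noise)]
```

At -5 dB the scaled noise carries roughly three times the power of the speech, which is why the visual stream becomes useful in this condition.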

  7. Bibliography
     • G. Potamianos, C. Neti, G. Gravier, A. Garg, and A. W. Senior, "Recent Advances in the Automatic Recognition of Audiovisual Speech". Proceedings of the IEEE, vol. 91, no. 9, pp. 1306-1326, Sept. 2003.
     • J. N. Gowdy, A. Subramanya, C. Bartels, and J. Bilmes, "DBN-Based Multi-Stream Models for Audio-Visual Speech Recognition". IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Montreal, Canada, May 2004.
     • F. Brugger, L. Zouari, H. Bredin, A. Amehraye, G. Chollet, D. Pastor, and Y. Ni, "Reconnaissance de la parole audiovisuelle par VMike". XVIèmes Journées d'Etude sur la Parole (JEP), Dinard, 2006.