VOICE RECOGNITION USING AN HMM BASED DESIGN

PURE Research Symposium Spring 2009. VOICE RECOGNITION USING AN HMM BASED DESIGN. Richard Muryanto and Nicholas Corso. Mentored by: Sun Yu. Introduction. In engineering applications, voice recognition systems have many diverse uses.


Presentation Transcript


  1. PURE Research Symposium, Spring 2009. VOICE RECOGNITION USING AN HMM BASED DESIGN. Richard Muryanto and Nicholas Corso. Mentored by: Sun Yu.

  2. Introduction • In engineering applications, voice recognition systems have many diverse uses. • Many schemes exist for implementing voice recognition systems (DTW, HMM, ...). • In this educational project we used Hidden Markov Models to implement a real-time voice recognition system.

  3. System Overview http://labrosa.ee.columbia.edu/doc/HTKBook21/img15.gif

  4. Hidden Markov Models • Hidden Markov Models (HMMs) are a way of modeling probabilities involving states of a system that cannot be directly observed. • HMMs can be characterized in terms of a few key parameters. http://www.info.ucl.ac.be/Research/Areas/Images/RT-Pict-HMM.png
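The slide does not list the "key parameters". In the usual formulation (as in the Rabiner tutorial cited in the references) they are a state-transition matrix, an emission distribution per state, and an initial state distribution. The sketch below assumes a discrete-observation HMM. The class name `DiscreteHMM` and its fields are hypothetical and are not taken from the presentation, and a system working on MFCC vectors may well use continuous emission densities instead.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class DiscreteHMM:
    """Hypothetical container for the parameters of a discrete-observation HMM."""
    A: List[List[float]]  # A[i][j] = P(next state is j | current state is i)
    B: List[List[float]]  # B[i][k] = P(observing symbol k | state is i)
    pi: List[float]       # pi[i]   = P(first state is i)

    def __post_init__(self) -> None:
        n = len(self.pi)
        if len(self.A) != n or any(len(row) != n for row in self.A):
            raise ValueError("A must be an N x N matrix")
        if len(self.B) != n or len({len(row) for row in self.B}) != 1:
            raise ValueError("B must have N rows of equal length")
        # Every row of A and B, and pi itself, must be a probability distribution.
        for name, rows in (("A", self.A), ("B", self.B), ("pi", [self.pi])):
            for row in rows:
                if any(p < 0 for p in row) or not math.isclose(sum(row), 1.0, abs_tol=1e-9):
                    raise ValueError(f"each row of {name} must be non-negative and sum to 1")

    @property
    def n_states(self) -> int:
        return len(self.pi)

    @property
    def n_symbols(self) -> int:
        return len(self.B[0])


# A two-state left-to-right model: state 1 can never go back to state 0.
example = DiscreteHMM(
    A=[[0.7, 0.3], [0.0, 1.0]],
    B=[[0.9, 0.1], [0.2, 0.8]],
    pi=[1.0, 0.0],
)
```

Validating the rows at construction time is a design choice made here for illustration: it catches malformed parameters before they reach any of the HMM algorithms.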

  5. Hidden Markov Models: Cont. • Classically there are three main algorithms associated with HMMs: evaluation, decoding, and learning. • For an HMM-based voice recognition system, the Baum-Welch algorithm is pivotal to training the system.
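As a rough illustration of the first two problems, the sketch below implements the forward algorithm (evaluation: how likely is an observation sequence under a model) and the Viterbi algorithm (decoding: which state sequence best explains it) for a discrete-observation HMM. The function names and the toy parameters are assumptions made for this example, not part of the presentation. Baum-Welch (learning) is omitted for brevity; it re-estimates the parameters from forward and backward quantities of the same kind.

```python
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def forward_likelihood(A: Matrix, B: Matrix, pi: Sequence[float], obs: Sequence[int]) -> float:
    """Evaluation: P(obs | model), summed over all hidden state paths."""
    n = len(pi)
    # alpha[i] = P(observations so far, current state is i)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o] for j in range(n)]
    return sum(alpha)


def viterbi_path(A: Matrix, B: Matrix, pi: Sequence[float], obs: Sequence[int]) -> Tuple[List[int], float]:
    """Decoding: the single most probable state path and its joint probability."""
    n = len(pi)
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]
    backptrs: List[List[int]] = []
    for o in obs[1:]:
        step_ptr, new_delta = [], []
        for j in range(n):
            # Best predecessor state for landing in state j at this step.
            best_i = max(range(n), key=lambda i: delta[i] * A[i][j])
            step_ptr.append(best_i)
            new_delta.append(delta[best_i] * A[best_i][j] * B[j][o])
        delta = new_delta
        backptrs.append(step_ptr)
    last = max(range(n), key=lambda i: delta[i])
    path = [last]
    for step_ptr in reversed(backptrs):
        path.append(step_ptr[path[-1]])
    path.reverse()
    return path, delta[last]


# Toy two-state model with two observation symbols (values chosen arbitrarily).
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.5], [0.1, 0.9]]
pi = [0.6, 0.4]
likelihood = forward_likelihood(A, B, pi, [0, 1])
best_path, best_prob = viterbi_path(A, B, pi, [0, 1])
```

This version multiplies raw probabilities, so it underflows on long sequences. Practical implementations scale the forward variables or work with log probabilities.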

  6. System Implementation • Training: Pre-recorded Data → VAD → MFCC Feature Extraction → Training HMM (Baum-Welch) • Recognition: Recorded Data → VAD → MFCC → Compute Likelihood → ML Decision → Display Output
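The "Compute Likelihood" and "ML Decision" steps can be sketched as follows, assuming one trained model per vocabulary word: each model scores the incoming features, and the word whose model gives the highest likelihood wins. The name `ml_decision` and the toy scoring functions are hypothetical stand-ins. In the real system each score would come from evaluating a trained HMM on the MFCC sequence.

```python
from typing import Callable, Dict, Sequence, Tuple

# A "model" here is any callable mapping a feature sequence to a log-likelihood.
ScoreFn = Callable[[Sequence[float]], float]


def ml_decision(features: Sequence[float], word_models: Dict[str, ScoreFn]) -> Tuple[str, Dict[str, float]]:
    """Score the features under every word model and pick the most likely word."""
    if not word_models:
        raise ValueError("at least one word model is required")
    scores = {word: score(features) for word, score in word_models.items()}
    best_word = max(scores, key=scores.get)
    return best_word, scores


def toy_model(mean: float) -> ScoreFn:
    """Hypothetical stand-in for an HMM: scores by closeness to a per-word mean."""
    def score(features: Sequence[float]) -> float:
        return -sum((x - mean) ** 2 for x in features)
    return score


models = {"yes": toy_model(1.0), "no": toy_model(3.0)}
word, scores = ml_decision([1.0, 1.1, 0.9], models)
```

Returning the full score table alongside the winner makes it easy to add a rejection threshold later, for example to ignore utterances that match no word well.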

  7. VAD and MFCC • Voice Activity Detection (VAD) determines which parts of a voice signal are actual speech and which are silence. • The VAD algorithm used here relies on the short-time energy and the zero-crossing rate to decide whether there is voice activity. • Mel-Frequency Cepstral Coefficients (MFCCs) were used to extract characteristic information from the speech vectors.
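A minimal sketch of an energy and zero-crossing-rate VAD is given below. The slide does not say how the two measures are combined, so the decision rule, the frame sizes, and the thresholds are all assumptions made for illustration: a frame counts as speech if its energy is high, or if its energy is moderate and its zero-crossing rate is high (which tends to catch low-energy fricatives).

```python
from typing import List, Sequence


def frame_signal(samples: Sequence[float], frame_len: int, hop: int) -> List[Sequence[float]]:
    """Split the signal into fixed-length frames; a trailing partial frame is dropped."""
    return [samples[s:s + frame_len] for s in range(0, len(samples) - frame_len + 1, hop)]


def short_time_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def detect_voice(
    samples: Sequence[float],
    frame_len: int = 160,          # assumed: 20 ms at 8 kHz
    hop: int = 80,                 # assumed: 50% frame overlap
    energy_thresh: float = 0.01,   # assumed; would normally be tuned to the noise floor
    zcr_thresh: float = 0.3,       # assumed
) -> List[bool]:
    """Return one speech/non-speech flag per frame (assumed decision rule)."""
    flags = []
    for frame in frame_signal(samples, frame_len, hop):
        energy = short_time_energy(frame)
        zcr = zero_crossing_rate(frame)
        flags.append(energy > energy_thresh or (energy > 0.1 * energy_thresh and zcr > zcr_thresh))
    return flags
```

In practice the thresholds are often estimated from a short stretch of audio known to be silence, rather than fixed in advance as they are here.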

  8. Observed Data • Ability of the System to Recognize Training Data

  9. Possible Extensions • With more time, the effects of the environment on the recognition rate could be investigated. • With further investigation, the effects of the parameters in the Baum-Welch algorithm could be explored. • A larger word set could be implemented.

  10. References • Ramírez, J.; Górriz, J. M.; Segura, J. C. (2007). "Voice Activity Detection. Fundamentals and Speech Recognition System Robustness". In M. Grimm and K. Kroschel (eds.), Robust Speech Recognition and Understanding, pp. 1–22. • Rabiner, L. R. (1989). "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition". Proceedings of the IEEE, vol. 77, no. 2, February 1989. • Lu, T.; Zhang, C.; Zhu, D. "Recognition by HMM".
