Which is the best speech recognizer?

Seneff’s Auditory Model. Miriam Cordero Ruiz (SONY Advanced Technology Center Stuttgart), Leuven, July 2002. Outline: Introduction, Auditory System, Seneff’s Model, Stage I, Stage II, Conclusions.

Presentation Transcript


  1. Seneff’s Auditory Model • Miriam Cordero Ruiz (SONY Advanced Technology Center Stuttgart) • Leuven, July 2002

  2. Which is the best speech recognizer?

  3. Introduction • Auditory System • Seneff’s Model • Stage I • Stage II • Conclusions

  4. Human Auditory System • Basilar Membrane • 4 kHz

  5. Human Auditory System • Basilar Membrane • 400 Hz

  6. Human Auditory System • Basilar Membrane • Critical Bands (Zwicker)

  7. Human Auditory System • Inner Hair Cells • Neural • Mechanical

  8. Structure of the model • STAGE I: Critical band filter bank • STAGE II: Hair cell synapse model • STAGE III: Envelope detector (mean rate spectrum) and synchrony detector (synchrony spectrum)

  9. Stage I: Auditory Filter Bank • 40 channels (20 - 6700 Hz) • Bandwidth of one channel = 0.5 Bark

  10. Design of the Auditory Filter Bank • Initial complex zeroes • Cascade of zeros, one per channel • Each zero of the cascade feeds a resonator • Resonator outputs: Channel 1, Channel 2, …, Channel 40 • Frequency axis: f (Hz)

  11. Stage II • Model • Physiological Data • Half Wave Rectification • Harmonics • Firing prob. nerve fiber • Short Term Adaptation • synchrony reduction • smooths saturated stimuli • LP Filter • Synchrony • Automatic Gain Control • Refractory Effect • < 1 kHz

  12. Stages I+II Model • Signal → STAGE I: Critical band filter bank → STAGE II: Half-wave rectification → Short-term adaptation → Low-pass filter → Rapid AGC → Output

  13. Results

  14. Other Peripheral Models • Patterson-Meddis • Gammatone Filterbank • Lyon’s Cochlear Model • Gammatone Filterbank • Adaptation Stage

  15. Conclusions • Based on biological data • Front-End for Speech Processing • Speech Recognition, Speaker ID, Localization…. • Better performance
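The slides on Stage I (40 channels, 20 - 6700 Hz, 0.5 Bark per channel) and on the Stages I+II chain (half-wave rectification, short-term adaptation, low-pass filter, rapid AGC) can be illustrated with a small sketch. This is not Seneff's published implementation. The Bark conversion below is assumed to be the Zwicker-Terhardt approximation, which the slides do not specify, and every function name, filter form, and constant (`rate`, `alpha`, and so on) is a hypothetical placeholder chosen only to show the order of the processing steps. The filter bank itself (zeros and resonators) is not modelled; only its channel layout is.

```python
import math

def hz_to_bark(f_hz):
    """Zwicker-Terhardt approximation of critical-band rate (assumed here)."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)

def bark_to_hz(z, lo=0.0, hi=20000.0):
    """Invert hz_to_bark by bisection; the mapping is monotonic in f."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if hz_to_bark(mid) < z:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def channel_edges(f_lo=20.0, f_hi=6700.0, n_channels=40):
    """Return (Bark width per channel, n_channels + 1 edge frequencies in Hz)."""
    z_lo, z_hi = hz_to_bark(f_lo), hz_to_bark(f_hi)
    step = (z_hi - z_lo) / n_channels
    return step, [bark_to_hz(z_lo + i * step) for i in range(n_channels + 1)]

def half_wave_rectify(x):
    """Keep positive excursions only (plain rectifier, illustrative)."""
    return [max(0.0, v) for v in x]

def short_term_adaptation(x, rate=0.05):
    """Subtract a running average so sustained input decays (hypothetical form)."""
    out, avg = [], 0.0
    for v in x:
        out.append(max(0.0, v - avg))
        avg += rate * (v - avg)
    return out

def low_pass(x, alpha=0.2):
    """One-pole smoother standing in for the synchrony-reducing low-pass filter."""
    out, y = [], 0.0
    for v in x:
        y += alpha * (v - y)
        out.append(y)
    return out

def rapid_agc(x, rate=0.1):
    """Divide by (1 + running level) as a simple gain control (hypothetical form)."""
    out, level = [], 0.0
    for v in x:
        level += rate * (v - level)
        out.append(v / (1.0 + level))
    return out

def stage_two(channel_signal):
    """Apply the four Stage II steps, in slide order, to one channel's signal."""
    y = half_wave_rectify(channel_signal)
    y = short_term_adaptation(y)
    y = low_pass(y)
    return rapid_agc(y)

width, edges = channel_edges()
print(f"{len(edges) - 1} channels, {width:.3f} Bark each")

# A 400 Hz tone at 16 kHz sampling stands in for one filter-bank channel output.
tone = [math.sin(2 * math.pi * 400.0 * n / 16000.0) for n in range(800)]
response = stage_two(tone)
print(f"peak Stage II response: {max(response):.3f}")
```

Under the assumed Bark formula, dividing the 20 - 6700 Hz range into 40 equal steps gives a per-channel width close to the 0.5 Bark stated on the Stage I slide. In a full model, `stage_two` would be applied to each of the 40 filter-bank outputs separately.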
