
BUILDING A HIGHLY ACCURATE MANDARIN SPEECH RECOGNIZER




M.Y. Hwang1, G. Peng1, W. Wang2, A. Faria3, A. Heidel4, M. Ostendorf1
1. Department of Electrical Engineering, University of Washington, Seattle, WA, USA
2. SRI International, Menlo Park, CA, USA
3. ICSI Berkeley, Berkeley, CA, USA
4. Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan

Decoding Architecture
• Front end: acoustic segmentation, then VTLN/CMN/CVN feature normalization, then speaker clustering.
• Three parallel decoding passes: CW MLP SAT + MLLR (LM3); nonCW MLP SI (qLM3); CW PLP SAT + fMPE + MLLR (LM3).
• Each pass produces N-best lists, rescored with qLM4 adaptation.
• Char-level confusion network combination (CNC) merges the systems into the final character sequence.

Topic-based LM Adaptation
• Latent Dirichlet Allocation (LDA) topic model, one topic per sentence.
• Training: assign each training sentence one topic; train 64 topic-dependent 4-gram LMs: LM1, LM2, …, LM64.
• Decoding: select the top n topics per decoded sentence, keeping those whose posterior qi’ exceeds a threshold.

Diphthongs for BC Fast Speech
• Phone-72 vs. Phone-81 on Dev07, nonCW PLP ML, SI, qLM3.

MLP and Pitch Features
• Evaluated on Eval04, nonCW ML (Hub4 training), SI, qLM2.

fMPE for the Weaker CW PLP Model
• Speaker-independent fMPE applied after the speaker-dependent SAT feature transform.
• The HMM used to compute ht is an ML model with 3500 states x 32 Gaussians/state.
• For each frame, Gaussian posteriors are computed over 5 neighboring frames: 3500 x 32 x 5 = 560K dimensions.
• The projection matrix M is MPE-learned.

PLP Models
• Phone-81 pronunciations.
• 3500 states x 128 Gaussians/state, decision-tree based state clustering.
• Gender-independent.
• Cross-word triphone model with SAT feature transform, followed by the fMPE feature transform.
• MPE trained.

Experiments (Character Error Rates)
• LM adaptation and CNC.
• Acoustic segmentation on Eval06, nonCW MLP, SI, qLM3.

Future Work
• Topic-dependent LM adaptation with finer topics, on either word graphs or word lattices.
• Untie the /I/ and /IH/ phones for the PLP model.
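The CMN/CVN step in the front end can be sketched as per-utterance mean and variance normalization of the cepstral features. This is a minimal illustration, not the system's actual front end; the function name and the per-utterance granularity are assumptions:

```python
import numpy as np

def cmn_cvn(cepstra):
    """Per-utterance cepstral mean and variance normalization (CMN/CVN).

    cepstra: (num_frames, num_coeffs) array of cepstral features.
    Returns features normalized to zero mean and unit variance
    per coefficient over the utterance.
    """
    mean = cepstra.mean(axis=0)
    std = cepstra.std(axis=0)
    # Guard against zero variance in a coefficient.
    return (cepstra - mean) / np.maximum(std, 1e-8)
```

In practice such normalization is often applied per speaker cluster rather than per utterance, which is why the pipeline clusters speakers in the same front-end stage.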
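The topic-based LM adaptation step (keep topics whose LDA posterior exceeds a threshold, then combine the topic-dependent LMs) can be sketched as a weighted interpolation. The function name, the renormalization of the surviving weights, and the linear interpolation are assumptions; the poster only specifies the 64 topic LMs and the posterior threshold:

```python
import numpy as np

def adapted_lm_prob(word_prob_by_topic, topic_posteriors, threshold=0.05):
    """Hypothetical sketch of topic-dependent LM interpolation.

    word_prob_by_topic: length-64 array of P_k(w | history) under each
                        topic-dependent 4-gram LM (LM1 ... LM64).
    topic_posteriors:   length-64 array of LDA topic posteriors q_k for
                        the decoded sentence.
    Only topics with q_k above the threshold contribute (the poster's
    "top n topics where qi' > threshold"); their weights are renormalized.
    """
    q = np.asarray(topic_posteriors, dtype=float)
    p = np.asarray(word_prob_by_topic, dtype=float)
    keep = q > threshold          # prune low-posterior topics
    w = q * keep
    w = w / w.sum()               # renormalize surviving weights
    return float(np.dot(w, p))
```

For example, with posteriors [0.9, 0.1] both above a 0.05 threshold, the adapted probability is simply the posterior-weighted mixture of the two topic LMs' predictions.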
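The fMPE computation described above (stack Gaussian posteriors over 5 neighboring frames into ht, then add a learned projection M·ht to the base features) can be sketched as follows. Dimensions are scaled down for illustration; in the poster, the posterior vector per frame is 3500 x 32 = 112K-dimensional, so 5 stacked frames give the 560K-dimensional ht. The edge-clamping behavior and function name are assumptions:

```python
import numpy as np

def fmpe_features(x, posteriors, M, offsets=(-2, -1, 0, 1, 2)):
    """fMPE-style feature transform sketch: y_t = x_t + M h_t.

    x:          (T, D) base features (e.g. PLP after the SAT transform).
    posteriors: (T, G) per-frame Gaussian posteriors from the ML HMM.
    M:          (D, G * len(offsets)) MPE-learned projection matrix.
    h_t stacks the posterior vectors of the 5 neighboring frames.
    """
    T, G = posteriors.shape
    h = np.zeros((T, G * len(offsets)))
    for t in range(T):
        blocks = []
        for o in offsets:
            s = min(max(t + o, 0), T - 1)   # clamp at utterance edges
            blocks.append(posteriors[s])
        h[t] = np.concatenate(blocks)
    return x + h @ M.T
```

With M initialized to zero the transform is the identity, which is the usual starting point before M is discriminatively trained with the MPE criterion.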
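The final char-level confusion network combination can be sketched as posterior voting over aligned character slots. This simplified sketch assumes the systems' confusion networks are already aligned slot-by-slot (the alignment step, the '*DEL*' token, and the unweighted posterior sum are all illustrative assumptions, not the poster's exact procedure):

```python
def cnc_pick(slot_lists):
    """Simplified char-level confusion network combination (CNC).

    slot_lists: one confusion network per system; each network is a
    list of slots, and each slot maps candidate characters (or the
    deletion token '*DEL*') to posterior probabilities.
    Sums posteriors across systems per slot and keeps the best character.
    """
    out = []
    for slots in zip(*slot_lists):
        votes = {}
        for slot in slots:
            for ch, p in slot.items():
                votes[ch] = votes.get(ch, 0.0) + p
        best = max(votes, key=votes.get)
        if best != '*DEL*':       # deletion wins: emit nothing for this slot
            out.append(best)
    return ''.join(out)
```

Combining systems at the character level (rather than the word level) sidesteps Mandarin word-segmentation mismatches between the parallel systems, consistent with reporting character error rates.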
