
Performance Improvement of GMM Computation in Sphinx 3.6



  1. Performance Improvement of GMM Computation in Sphinx 3.6 • Arthur Chan • Carnegie Mellon University • Mar 10, 2005

  2. This seminar • Not very refined; some info is missing. • ~30 slides. • Outline: • Overview of GMM Computation in Sphinx 3.X (X<5) (<- This part is not new.) • 3 Improvements with Experimental Results (<- This part is new.) • Discussion

  3. Mechanism of GMM Computation in S3.X (X<5)

  4. [Diagram: senone (GMM) computation produces scores for the search, and the search feeds back information for pruning the GMM computation. This computation happens at every frame in Sphinx.]

  5. Computation of GMMs in a Continuous HMM ASR system • Order of computation: • #Frames x #GMMs x #Gaussians x Feature length • Typical numbers: • #Frames = 1000 • #GMMs = 5000 • #Gaussians = 8 to 64 • Feature length = 39 • Not practical to compute them fully (see the worked example below).
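
Plugging in the upper end of these typical numbers gives a feel for the scale (a rough worked example; one multiply-add per feature component per Gaussian is assumed):

    1000 frames x 5000 GMMs x 64 Gaussians x 39 components
      = 1.248 x 10^10 multiply-adds

At 100 frames/second, 1000 frames is only about 10 seconds of speech, so full computation costs on the order of 10^10 operations per 10-second utterance.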

  6. Overview of GMM Computation in Sphinx 3.X (X<=5) • Philosophy • No single technique will give the best accuracy/speed trade-off. • Techniques in the literature can be categorized and combined in a systematic manner. • Four-level categorization of GMM computation techniques • Frame-level (down-sampling) • GMM-level (CI-based GMM selection) • Gaussian-level (VQ-based and SVQ-based Gaussian selection) • Component-level (sub-vector quantization) • Sphinx 3.4: 75-80% speed gain with ~5-10% relative degradation.

  7. Fast GMM Computation: Level 1: Frame Selection - Compute GMMs in every other frame only. - Improvement: compute GMMs only when the current frame differs from the previous frame; when it is similar, reuse the previous scores (see the sketch below).
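
A minimal sketch of this frame-level selection in C, assuming a copy-forward policy and a simple Euclidean distance test. This is not the actual Sphinx 3 code: the threshold, the distance measure, and all names here are illustrative.

    #include <string.h>

    #define SKIP_THRESH 1.0f   /* illustrative similarity threshold */

    /* Squared Euclidean distance between two feature vectors. */
    static float frame_dist2(const float *a, const float *b, int dim)
    {
        float d = 0.0f;
        for (int i = 0; i < dim; i++) {
            float diff = a[i] - b[i];
            d += diff * diff;
        }
        return d;
    }

    /* Per frame: if the feature vector is close enough to the last
     * computed one, reuse the old senone scores; otherwise run the
     * full GMM computation.  Returns 1 if computed, 0 if skipped. */
    int maybe_compute_frame(const float *feat, const float *last_feat,
                            int dim, int n_senone,
                            float *scores, const float *last_scores,
                            void (*compute_all_gmms)(const float *, float *))
    {
        if (frame_dist2(feat, last_feat, dim) < SKIP_THRESH) {
            /* Similar frame: copy the most recently computed scores. */
            memcpy(scores, last_scores, n_senone * sizeof(float));
            return 0;
        }
        compute_all_gmms(feat, scores);   /* full per-frame computation */
        return 1;
    }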

  8. Fast GMM Computation: Level 2: Senone/GMM Selection - Compute a GMM only when its base phone is highly likely. - Others are backed off to the base-phone scores. - Similar to: - Julius (Akinobu 1999) - Microsoft's Rich Get Richer (RGR) heuristics

  9. Fast GMM Computation: Level 3: Gaussian Selection [Diagram: GMM -> Gaussian]

  10. Fast GMM Computation: Level 4: LDA [Diagram: Gaussian -> Feature Component]

  11. Frame-level and GMM-level Techniques in S3.X (X<=5) • Frame-level: • Skipping frames: • Only compute GMMs for 1 out of N frames. • Copy the most recently computed scores for the skipped frames. • GMM-level: • Use the CI GMM as an approximate score. • If a CD GMM has a good CI GMM score (within a beam) • Compute the full CD score. • If not • Back off to the CI score. • A good CI GMM score is defined as • being within the beam of the best CI GMM score. • (A sketch of this CIGMMS logic follows below.)
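
A minimal sketch of the CI-based GMM selection (CIGMMS) logic described above, in C. This is not the actual Sphinx 3 code: cd2ci[] (CD senone to base CI senone map), eval_gmm(), and the beam convention (a negative log value added to the best score) are all illustrative assumptions.

    #include <float.h>

    void cigmms_frame(const float *feat,
                      const float *ci_score, int n_ci,  /* CI scores, already computed */
                      const int *cd2ci, int n_cd,       /* CD senone -> base CI senone */
                      float logbeam,                    /* negative, log domain */
                      float *cd_score,
                      float (*eval_gmm)(int cd_senone, const float *feat))
    {
        /* 1. Best CI score of the current frame. */
        float best_ci = -FLT_MAX;
        for (int i = 0; i < n_ci; i++)
            if (ci_score[i] > best_ci)
                best_ci = ci_score[i];

        /* 2. Compute the full CD score only when the base CI score is
         *    within the beam; otherwise back off to the CI score. */
        for (int s = 0; s < n_cd; s++) {
            float base = ci_score[cd2ci[s]];
            if (base >= best_ci + logbeam)
                cd_score[s] = eval_gmm(s, feat);  /* inside beam: full score */
            else
                cd_score[s] = base;               /* outside: CI back-off */
        }
    }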

  12. Weaknesses of the Frame-level and GMM-level Techniques • Frame-level • Degrades performance significantly (>10%). • Hard to tune. • GMM-level • The number of GMMs computed varies from frame to frame. • -> Worst-case performance is poor. • The CI score is used as the back-off. • -> Search performance degrades because many senone scores become identical.

  13. Baseline Experiments

  14. Baseline experiments • Tested on 3 tasks • Tested in a tough condition • Manually tuned • Tuned on the test set (sorry, couldn't get the dev set). • Optimized one dimension at a time. • Very close to optimal. • Goals • Faster. • Graceful degradation (<5% relative).

  15. Tasks evaluated (General Description)

  16. Tasks evaluated (Baseline Speed/Accuracy on 2.2GHz P4)

  17. Proposed Methods

  18. Proposed Methods (At a glance) • The goals of the 3 methods • Method 1: Try to reduce the variance of GMM computation time. • Method 2: Try to make CIGMMS better behaved. • Method 2 and a half: Try to make down-sampling better behaved. • Didn't work. We will try to analyse why. • Method 3: An idea inspired by the analysis.

  19. Method 1: Use a fixed upper bound for GMMs computed in each frame • Only compute the CD scores if • the corresponding CI score is within the CI beam, AND • the number of CD GMMs computed would not exceed a certain number (see the sketch below). • Advantages: • Per-utterance GMM computation becomes more predictable. • Gets a better bargain when trading off computation.
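
As a sketch, Method 1 is a small change to the CIGMMS routine above: a counter enforces a hard per-frame cap on full CD computations. The cap parameter max_cd and the first-come ordering are assumptions; the slide only states that a fixed upper bound is used.

    #include <float.h>

    void cigmms_frame_capped(const float *feat,
                             const float *ci_score, int n_ci,
                             const int *cd2ci, int n_cd,
                             float logbeam, int max_cd,
                             float *cd_score,
                             float (*eval_gmm)(int cd_senone, const float *feat))
    {
        float best_ci = -FLT_MAX;
        for (int i = 0; i < n_ci; i++)
            if (ci_score[i] > best_ci)
                best_ci = ci_score[i];

        int computed = 0;
        for (int s = 0; s < n_cd; s++) {
            float base = ci_score[cd2ci[s]];
            if (base >= best_ci + logbeam && computed < max_cd) {
                cd_score[s] = eval_gmm(s, feat);  /* full CD computation */
                computed++;                       /* counts toward the cap */
            } else {
                cd_score[s] = base;               /* CI back-off, as before */
            }
        }
    }

A real implementation might rank senones by CI score before applying the cap rather than taking them in index order; the slide does not say which is done.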

  20. Method 1: Results

  21. Method 2: Use the best Gaussian index from the previous frame • Best Gaussian index: what does it mean? • The index of the best-scoring Gaussian in a GMM. • Why is it useful? • Two major reasons from the literature: • 1. In practice, the best Gaussian score dominates the GMM score (up to 95-99%). • 2. Usually, the collision rate between the best Gaussian indices of the current and previous frames is quite high (the literature says ~70%). • (Q: Are these assumptions really correct?)

  22. Method 2 (Algorithm) • In CIGMMS, • for each senone that was not computed (i.e. was backed off to CI), • if the best index from the previous frame is available, assume it is the current best index • and compute the GMM score from that Gaussian (see the sketch below). • This improves the smoothing performance of CIGMMS • Better accuracy • We can use a tighter beam.
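
A sketch of the single-Gaussian rescoring Method 2 would use for a backed-off senone, assuming diagonal covariances. The parameter layout (mean, inverse variances, precomputed log normalizer) and all names are illustrative, not the Sphinx 3 API.

    /* Score one diagonal-covariance Gaussian in the log domain.
     * log_norm is assumed to fold in the mixture weight and the
     * Gaussian's normalization constant. */
    float score_single_gaussian(const float *feat, int dim,
                                const float *mean, const float *var_inv,
                                float log_norm)
    {
        float d = 0.0f;
        for (int i = 0; i < dim; i++) {
            float diff = feat[i] - mean[i];
            d += diff * diff * var_inv[i];   /* Mahalanobis term */
        }
        return log_norm - 0.5f * d;
    }

    /* For a senone s that CIGMMS backed off, Method 2 would replace the
     * CI score with the score of the Gaussian indexed by prev_best_idx[s],
     * justified by the two assumptions on the previous slide. */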

  23. Results

  24. Method 2 and a half (Algorithm) • In frame dropping, • when the last best index is available, assume it is the current best index • and compute the GMM score from that Gaussian (sketch below).
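
The same trick under frame dropping, as a sketch: on a frame that would be skipped, rescore each senone's previous best Gaussian against the current feature instead of copying the old scores. gmm_mean(), gmm_varinv(), gmm_lognorm(), and prev_best_idx[] are hypothetical accessors, reusing score_single_gaussian() from the Method 2 sketch.

    /* On a "dropped" frame, approximate every senone score with its
     * previously best Gaussian, rescored on the current feature. */
    for (int s = 0; s < n_senone; s++)
        scores[s] = score_single_gaussian(feat, dim,
                                          gmm_mean(s, prev_best_idx[s]),
                                          gmm_varinv(s, prev_best_idx[s]),
                                          gmm_lognorm(s, prev_best_idx[s]));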

  25. Results • Not shown • Because there is no improvement. • Why doesn't a better approximation give any gain?

  26. Comparison of Different Types of GMM Score Approximation • Use the current best index • Not feasible, because the whole GMM would need to be computed first. • Use the previous score • But the current frame's information is not used. • Use the previous best index • If the two assumptions are true, this is a good method. • Use the corresponding CI score • Replaces the CD score with the CI score; hurts the best-performing senones.

  27. Analysis 1: Log-likelihood distortion if the current best index is used. (Is assumption 1 correct?)

  28. Analysis 2: Is the collision rate always 70%? • On average, YES. • For the top senones in a noisy task, NO. • In the ICSI task, the hit rate for the top 50 senones sometimes drops to 50%.

  29. Analysis 3: Relative magnitude of distortion caused by different approximations • Normalizing the distortion of using the current index to 1: • In frame dropping (significant degradation): • Distortion of using the previous index: • Comm.: 20 (2 mix), 40 (32 mix) • ICSI: 10 (2 mix), 20 (32 mix) • Distortion of using the previous score: • Not tested for lack of time. • Ad-hoc observation: less than using the previous index, • but much better than the CI score. • In CI-GMM selection, not much degradation • But • the distortion of using the CI score is ~100x that of using the previous index • ~200-1000.

  30. Some thoughts • Why does frame dropping fail when its distortion is not that low? • Why does CI GMM selection work when its distortion is even higher? • My answer: • It doesn't matter which approximation is used. • What matters is whether the best scores are computed. • CIGMMS still keeps the best GMM scores. • Frame dropping always throws away the N best GMM scores.

  31. Method 3 • Motivations: • At every frame, the best senone scores still need to be computed, even in frames that are to be ignored. • Concerns: • But how do we preserve the effectiveness of down-sampling?

  32. Method 3 • Another very simple idea. • Trick: Use CIGMMS for **every** frame. • But for alternate frames, or frames we want to "ignore", • multiply the CIGMMS beam by a factor F (0 <= F <= 1) (see the sketch below).
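
A sketch of the beam-tightening trick, assuming log-domain beams (negative values) and a fixed skip period as in plain down-sampling; skip_period and the frame test are illustrative assumptions.

    /* Pick the CIGMMS beam for frame t.  Frames that down-sampling
     * would keep get the full beam; frames it would skip get a beam
     * scaled by F.  Since logbeam is negative, F < 1 tightens it:
     * F = 1 recovers plain CIGMMS, F = 0 lets (almost) nothing pass. */
    float effective_beam(int t, int skip_period, float F, float logbeam)
    {
        return (t % skip_period == 0) ? logbeam : F * logbeam;
    }

This value would then be passed as the logbeam argument of the cigmms_frame() sketch earlier, so CIGMMS runs on every frame with a dynamically tightened beam.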

  33. Method 3 (Results)

  34. Method 3 (Discussion) • Advantages of the scheme • The best senone scores are still computed when F > 0. • More tunable • The tightening factor is a real number. • Preserves the properties of both CIGMMS and down-sampling: • When F=0, equivalent to down-sampling. • When F=1, equivalent to CI-based GMM selection. • A smoothing between the frame level and the GMM level. • The underlying idea is a dynamic beam.

  35. Summary

  36. Conclusion • Only a 20-25% gain was obtained from the 3 computation improvements (vs. 90% last time). • Pruned and non-pruned conditions are different scenarios. • Jointly optimizing two levels would give around a 5-10% solid gain. • It's time to leave GMM computation and work on other things.

  37. Side note: Snapshots of Recent Development of Sphinx 3.6 • The use of the per-frame CI GMM score is still not optimal. • Jim: "Why don't you use lexical retrieval? It's very easy to implement." • Still no improvement in search. • Alex: "Seriously... when can you implement a search using lexical tree copies?" • The ICSI/CALO Meeting task gives us a lot of fun/pain. • Sphinx 3: the 20-30% improvement doesn't always show up. • "Arthur, do you want to say something?" • Some S3 and ST functions look really funny/awful. • Yitao: "*Sigh*". • Dave, Evandro: (shake their heads)

  38. Acknowledgement • Thanks • Ravi • Alex • Evandro • Dave

  39. Q & A
