A New Approach to Utterance Verification Based on Neighborhood Information in Model Space

Authors: Hui Jiang, Chin-Hui Lee. Reporter: 陳燦輝.

Presentation Transcript


  1. A New Approach to Utterance Verification Based on Neighborhood Information in Model Space. Authors: Hui Jiang, Chin-Hui Lee. Reporter: 陳燦輝

  2. References • [1] H. Jiang and C.-H. Lee, "A new approach to utterance verification based on neighborhood information in model space," IEEE Trans. Speech Audio Processing, vol. 11, no. 5, pp. 425-434, 2003. • [2] H. Jiang, K. Hirose, and Q. Huo, "Robust speech recognition based on Bayesian prediction approach," IEEE Trans. Speech Audio Processing, vol. 7, pp. 426-440, July 1999. • [3] N. Merhav and C.-H. Lee, "A minimax classification approach with application to robust speech recognition," IEEE Trans. Speech Audio Processing, vol. 1, pp. 90-100, 1993.

  3. Outline • Introduction • UV based on neighborhood information • Bayes factors: a Bayesian tool for verification problems • Experiments • Summary and Conclusions

  4. Introduction • The major difficulty with likelihood ratio test (LRT)-based utterance verification (UV) is how to model the alternative hypothesis. • It is therefore very important to know the properties of the competing source distributions. • In this paper, we investigate a novel idea: performing utterance verification based on neighborhood information in model space.

  5. UV based on neighborhood information Nested neighborhoods in model space :

  6. UV based on neighborhood information (cont) Nested neighborhoods in model space (cont) : Fig. 1. Illustration of the structure of nested neighborhoods in HMM model space.

  7. UV based on neighborhood information (cont) Nested neighborhoods in model space (cont) :

  8. UV based on neighborhood information (cont) Nested neighborhoods in model space (cont) :

  9. UV based on neighborhood information (cont) For a given speech segment X, assume that an ASR system recognizes it as word W, which is represented by an HMM model. • Traditionally, we formulate UV as a statistical hypothesis testing problem. • Here, we translate the above hypothesis test into the following one, stated in terms of neighborhoods in model space.

  10. UV based on neighborhood information (cont) Fig. 2. Illustration of hypothesis testing in the scenario of detecting speech recognition errors based on the neighborhood information.

  11. Bayes factors • The Bayesian approach to hypothesis testing involves the calculation and evaluation of the so-called Bayes factor. • Given the observation X and two hypotheses, the Bayes factor is computed as the ratio of the marginal likelihoods of X under the two hypotheses.
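
For reference, the standard textbook definition of a Bayes factor can be written as below. The slide's own notation did not survive, so the symbols $H_0$, $H_1$ (the two hypotheses) and $\lambda$ (the HMM model parameters) are assumptions made here for illustration.

```latex
\[
\mathrm{BF}(X)
  \;=\; \frac{p(X \mid H_0)}{p(X \mid H_1)}
  \;=\; \frac{\int p(X \mid \lambda)\, p(\lambda \mid H_0)\, d\lambda}
             {\int p(X \mid \lambda)\, p(\lambda \mid H_1)\, d\lambda}
\]
```

Under this definition, $H_0$ is accepted when $\mathrm{BF}(X)$ exceeds a chosen threshold, which is why the choice of the prior $p(\lambda \mid H_k)$ for each hypothesis matters.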

  12. Bayes factors (cont) • To use Bayes factors to solve the hypothesis testing problem, two important issues must be addressed: • How to properly choose the prior distribution p(.) of the HMM model parameters for each hypothesis. • How to quantitatively define the neighborhoods.

  13. Bayes factors (cont)

  14. Bayes factors (cont)

  15. Bayes factors (cont)

  16. Bayes factors (cont)

  17. Bayes factors (cont)

  18. Bayes factors (cont)

  19. Bayes factors (cont)

  20. Bayes factors (cont)

  21. Bayes factors (cont)

  22. Bayes factors (cont)

  23. Bayes factors (cont)

  24. Bayes factors (cont) • In this paper, in order to balance the contributions from different models in the neighborhood, we introduce an exponential scale factor into the integral calculation. • The exponential scale factor is important for equalizing the contributions from different models in the neighborhood during the computation of the Bayes factor. • Depending on the value chosen for the scale factor, either the models with large likelihood values are emphasized or the models with smaller likelihood values are given more weight.
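
The effect of an exponential scale factor can be illustrated with a small sketch. The slide's exact formula did not survive, so the form used here is an assumption: each model's likelihood is raised to a power `rho`, the results are averaged over a finite set of neighboring models, and the `1/rho` root is taken. The names `rho`, `scaled_log_average`, and `log_bayes_factor` are hypothetical.

```python
import math


def scaled_log_average(log_likelihoods, rho):
    """Log of (mean_i p_i**rho) ** (1/rho), computed in the log domain.

    Assumed form of the exponential scaling, not taken from the slides.
    """
    if rho <= 0:
        raise ValueError("rho must be positive")
    scaled = [rho * ll for ll in log_likelihoods]
    peak = max(scaled)  # subtract the maximum for numerical stability
    log_sum = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return (log_sum - math.log(len(scaled))) / rho


def log_bayes_factor(inner_log_likelihoods, outer_log_likelihoods, rho):
    """Log ratio of the scaled averages over two finite sets of models."""
    return (scaled_log_average(inner_log_likelihoods, rho)
            - scaled_log_average(outer_log_likelihoods, rho))


if __name__ == "__main__":
    log_likelihoods = [-10.0, -20.0, -30.0]
    for rho in (0.5, 1.0, 2.0):
        print(rho, scaled_log_average(log_likelihoods, rho))
```

With this form, a larger `rho` pulls the average toward the best-scoring model in the set, while a smaller `rho` lets the lower-scoring models contribute more.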

  25. Bayes factors (cont)

  26. Bayes factors (cont)

  27. Bayes factors (cont)

  28. Bayes factors (cont)

  29. Experiments • We evaluate the proposed methods on the Bell Labs Communicator system. • In our recognition system, we use a 38-dimensional feature vector consisting of 12 Mel LPCCEP, 12 delta CEP, 12 delta-delta CEP, plus delta and delta-delta log-energy. • The acoustic models are state-tied triphone CDHMMs, which consist of roughly 4K distinct HMM states with an average of 13.2 Gaussian mixture components per state.

  30. Experiments (cont) • A class-based trigram LM including 2600 words is used. • The ASR system achieves 15.8% WER on our independent evaluation set, which includes 1395 utterances in total. • Based on the word and phoneme segmentations generated by the recognizer, we calculate a confidence score for every recognized word.

  31. Experiments (cont) • Baseline system: likelihood ratio test. • New approach with settings in Case I • We choose the neighborhood and a constrained uniform prior distribution. Since we use static, delta, and delta-delta features, we slightly modify the neighborhood definition in (2) as

  32. Experiments (cont) • New approach with settings in Case I (cont) • For the state-dependent setting, we first set one parameter to a small value and the other to a large value. According to (26), we then manually checked the resulting ranges. • New approach with settings in Case II • We choose the delta priors in (27) and (28) at the level of HMM states. • First, for each distinct state, we calculate its distance from all other states. The distance between two HMM states is computed as the minimum Euclidean distance between every possible pair of Gaussian components from these states.
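
The state distance described for Case II can be sketched as follows. The slide does not say how the distance between two Gaussian components is measured, so this sketch assumes it is the Euclidean distance between their mean vectors; the function name `state_distance` and the toy data are hypothetical.

```python
import math


def state_distance(means_a, means_b):
    """Minimum Euclidean distance over all pairs of component means.

    means_a, means_b: non-empty sequences of equal-length mean vectors,
    one vector per Gaussian component of each HMM state (assumed layout).
    """
    return min(math.dist(a, b) for a in means_a for b in means_b)


if __name__ == "__main__":
    # Two toy states with two 2-dimensional Gaussian components each.
    state_a = [(0.0, 0.0), (10.0, 10.0)]
    state_b = [(3.0, 4.0), (20.0, 20.0)]
    print(state_distance(state_a, state_b))
```

Taking the minimum over all pairs makes two states close as soon as any one of their components overlaps, regardless of how the remaining components are placed.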

  33. Experiments (cont) • New approach with settings in Case II (cont) • For each state, we sort all other states according to their distances from the underlying state. • In the first case, denoted as Case II-A, for each underlying HMM state we choose the neighborhood sizes so that each of the two neighborhoods includes an exact, fixed number of other states. • In the second case, denoted as Case II-B, we start from the top 1500 sorted states: one neighborhood includes all other states whose distance is below a threshold, and the other includes the states whose distance lies between two thresholds.
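
The two ways of choosing neighborhoods can be sketched as below. The actual counts and thresholds did not survive in the slide, so `n_inner`, `n_outer`, `d_inner`, and `d_outer` are hypothetical parameters, and treating the second neighborhood as the ring of states just outside the first is an assumption of this sketch.

```python
def sort_neighbors(distances):
    """Sort (state_id, distance) pairs by increasing distance.

    distances: dict mapping every other state id to its distance
    from the underlying state.
    """
    return sorted(distances.items(), key=lambda item: item[1])


def neighborhoods_by_count(distances, n_inner, n_outer):
    """Case II-A style: fixed numbers of nearest states in each set."""
    ranked = [state for state, _ in sort_neighbors(distances)]
    return ranked[:n_inner], ranked[n_inner:n_inner + n_outer]


def neighborhoods_by_threshold(distances, d_inner, d_outer, top_k=1500):
    """Case II-B style: distance thresholds applied to the top_k nearest."""
    ranked = sort_neighbors(distances)[:top_k]
    inner = [state for state, d in ranked if d < d_inner]
    outer = [state for state, d in ranked if d_inner <= d < d_outer]
    return inner, outer


if __name__ == "__main__":
    toy = {"s1": 0.5, "s2": 2.0, "s3": 1.0, "s4": 3.5}
    print(neighborhoods_by_count(toy, n_inner=1, n_outer=2))
    print(neighborhoods_by_threshold(toy, d_inner=1.5, d_outer=3.0))
```

The count-based variant gives every state neighborhoods of the same size, whereas the threshold-based variant lets the size vary with how crowded the model space is around each state.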

  34. Experiments (cont) Table I. Verification performance comparison (equal error rate in %) of the baseline UV method (LRT + anti-models) with the proposed new approach in several different settings. In each case, the best performance of the new approach and its corresponding parameter setting are given. One parameter is always fixed at 1.2.

  35. Experiments (cont) Fig. 3. Comparison of ROC curves for different methods when verifying mis-recognized words against correctly recognized words in ASR outputs.
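
For readers unfamiliar with the metric used in these comparisons, the following is a minimal sketch of how an equal error rate can be estimated from per-word confidence scores. It is not the authors' evaluation code; the function name, the threshold sweep over observed scores, and the toy scores are all assumptions.

```python
def equal_error_rate(correct_scores, error_scores):
    """Return (eer, threshold) from a sweep over the observed scores.

    A word is accepted when its score >= threshold.
    correct_scores: scores of correctly recognized words.
    error_scores:   scores of mis-recognized words.
    """
    best = None
    for threshold in sorted(set(correct_scores) | set(error_scores)):
        # False rejection: a correct word scored below the threshold.
        frr = sum(s < threshold for s in correct_scores) / len(correct_scores)
        # False acceptance: a mis-recognized word scored at or above it.
        far = sum(s >= threshold for s in error_scores) / len(error_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, threshold)
    return best[1], best[2]


if __name__ == "__main__":
    correct = [0.9, 0.8, 0.7, 0.4]
    errors = [0.6, 0.3, 0.2, 0.1]
    print(equal_error_rate(correct, errors))
```

Each threshold in the sweep corresponds to one point on a ROC curve; the equal error rate is the operating point where the false acceptance and false rejection rates coincide (or come closest on a finite sample).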

  36. Summary and Conclusions • The basic idea is to assume that all competing models of a given model sit inside one neighborhood of the underlying model. • More research is still needed to find a better neighborhood definition in the high-dimensional HMM model space. • Another possible direction for future work is to use other tools instead of Bayes factors, such as generalized likelihood ratio testing (GLRT), to implement neighborhood-based UV.
