
Applications of Support Vector Machines to Speech Recognition

Applications of Support Vector Machines to Speech Recognition. Advisor : Dr. Hsu Graduate : Chun Kai Chen Author: Aravind Ganapathiraju, Jonathan E. Hamaker and Joseph Picone. IEEE 2004. Outline. Motivation Objective Introduction Speech Recognition


Presentation Transcript


  1. Applications of Support Vector Machines to Speech Recognition • Advisor: Dr. Hsu • Graduate: Chun Kai Chen • Authors: Aravind Ganapathiraju, Jonathan E. Hamaker and Joseph Picone • IEEE 2004

  2. Outline • Motivation • Objective • Introduction • Speech Recognition • Support Vector Machines • Experimental Results • Conclusions • Personal Opinion

  3. Motivation • A maximum likelihood (ML) formulation has problems in applications such as speech recognition. • On higher-dimensional problems, it will never achieve perfect classification.

  4. Objective • Apply SVMs to overcome higher-dimensional problems and achieve perfect classification. • Apply SVMs to large-vocabulary speech recognition. • Develop and optimize an SVM/HMM hybrid system.

  5. Introduction • Speech Recognition • Speech Recognition Process • Hidden Markov Model • Application of SVMs to Speech Recognition: • Review the SVM approach • Discuss applications to speech recognition • Present experimental results

  6. Speech Recognition

  7. Speech Recognition Process (MFCC)

  8. Hidden Markov Model (1/2) • An HMM is a doubly stochastic process with an underlying stochastic process that is not observable (it is hidden). • It is described as a state-transition process. • For speech modeling applications, the HMM is a generator of vector sequences.

  9. Hidden Markov Model (2/2) • Finite-State Machine + Probability Process
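The "finite-state machine + probability process" view of an HMM as a generator of sequences can be sketched in a few lines. This is only a toy illustration, not the recognizer described in the slides: the function name `sample_hmm`, the three-state left-to-right topology, the one-dimensional Gaussian emissions with unit variance, and all numeric values are assumptions chosen for the example.

```python
import random

def sample_hmm(transitions, emission_means, length, start_state=0, seed=0):
    """Generate a hidden state path and an observation sequence from a toy HMM.

    transitions[i][j] is the probability of moving from state i to state j.
    Each state emits a 1-D Gaussian observation with unit variance (assumed).
    """
    rng = random.Random(seed)
    state = start_state
    states, observations = [], []
    for _ in range(length):
        states.append(state)
        # Probability process: emit an observation from the current state.
        observations.append(rng.gauss(emission_means[state], 1.0))
        # Finite-state machine: pick the next state from the transition row.
        state = rng.choices(range(len(transitions)), weights=transitions[state])[0]
    return states, observations

# Hypothetical three-state left-to-right model (no backward transitions).
TRANSITIONS = [
    [0.6, 0.4, 0.0],
    [0.0, 0.7, 0.3],
    [0.0, 0.0, 1.0],
]
MEANS = [-2.0, 0.0, 2.0]

states, observations = sample_hmm(TRANSITIONS, MEANS, length=8)
```

An observer sees only `observations`; the `states` path is the hidden part of the doubly stochastic process.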

  10. HMM Problems • Maximum likelihood (ML) • Parameters are estimated with methods that guarantee convergence • Expectation–maximization (EM) • Estimation with good convergence properties, although it does not guarantee finding the global maximum • Problems with an ML formulation • Will never achieve perfect classification

  11. Global maximum problem

  12. Support Vector Machines

  13. SVM • The goal of support vector classification is to find a separating hyperplane in a high-dimensional feature space; this separating hyperplane yields the optimal margin. • ERM and SRM are used to find a good hyperplane • ERM: empirical risk minimization • Can be used to find a good hyperplane, although this does not guarantee a unique solution • SRM: structural risk minimization • Can help choose the best hyperplane by ordering the hyperplanes based on the margin • Real-world classification problems • ANNs • Attempt to overcome many of these problems • Slow convergence during training and a tendency to overfit the data

  14. A hyperplane classifier

  15. Kernels • Allow a dot product to be computed in a higher-dimensional space • Linear • Polynomial • Radial basis function (RBF) • Slower than polynomial kernels but with better performance • Sigmoid
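The four kernel families listed above can be written out directly. The sketch below uses their common textbook forms; the parameter names (`degree`, `coef0`, `gamma`, `kappa`, `delta`) and default values are assumptions for illustration and are not taken from the slides. The helper `phi_quadratic` is also hypothetical: it shows, for 2-D inputs, that a degree-2 polynomial kernel equals an ordinary dot product in a higher-dimensional space.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def linear_kernel(x, y):
    return dot(x, y)

def polynomial_kernel(x, y, degree=2, coef0=1.0):
    return (dot(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=0.5):
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def sigmoid_kernel(x, y, kappa=1.0, delta=0.0):
    return math.tanh(kappa * dot(x, y) + delta)

def phi_quadratic(x):
    """Explicit feature map for the homogeneous degree-2 polynomial kernel in 2-D."""
    x1, x2 = x
    return [x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2]

x, y = [1.0, 2.0], [3.0, 0.5]
# Kernel evaluated in the original 2-D space ...
k_implicit = polynomial_kernel(x, y, degree=2, coef0=0.0)
# ... matches the dot product in the explicit 3-D feature space.
k_explicit = dot(phi_quadratic(x), phi_quadratic(y))
```

Here `k_implicit` and `k_explicit` agree up to floating-point rounding, which is the sense in which a kernel computes a dot product in a higher-dimensional space without constructing that space.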

  16. One-against-all method • yi: the class assignments • w: the weight vector defining the classifier • b: a bias term • εi: the slack variables
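The equation these symbols belong to did not survive in the transcript. In the standard soft-margin formulation, which is assumed here, each training example must satisfy yi (w · xi + b) ≥ 1 − εi with εi ≥ 0, so εi measures how far an example falls inside the margin. The sketch below computes that slack for a linear classifier and applies a one-against-all decision by picking the class whose classifier gives the largest decision value. The function names and the toy classifiers are hypothetical.

```python
def decision_value(w, b, x):
    """Signed score w . x + b of a linear classifier."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def slack(w, b, x, y):
    """Smallest slack satisfying y * (w . x + b) >= 1 - slack, with slack >= 0."""
    return max(0.0, 1.0 - y * decision_value(w, b, x))

def one_against_all(classifiers, x):
    """Return the label whose 'this class vs. the rest' classifier scores highest.

    classifiers maps each label to a (w, b) pair.
    """
    return max(classifiers, key=lambda label: decision_value(*classifiers[label], x))

# Toy example with two hypothetical classes.
CLASSIFIERS = {
    "a": ([1.0, 0.0], 0.0),
    "b": ([0.0, 1.0], 0.0),
}
predicted = one_against_all(CLASSIFIERS, [2.0, 0.5])
```

An example with slack 0 lies on or outside the margin on the correct side; a slack above 1 means it is misclassified.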

  17. Applications to speech recognition • Hybrid approaches • SVMs cannot model the temporal structure of speech effectively. • So we still need to use the HMM structure to model temporal evolution. • Use NN only to estimate posterior probabilities

  18. Several issues arise • Posterior estimation • Segmental Modeling • N-best List Rescoring

  19. Posterior estimation • There is significant overlap in the feature space. • SVMs provide a distance or discriminant that can be used to compare classifiers. • Main concern in using SVMs • Lack of a clear relationship between the distance from the margin and the posterior class probability • We used a sigmoid distribution to map the output distances to posteriors

  20. Sigmoid
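The figure for this slide is missing from the transcript. A common way to map an SVM distance f to a posterior is the two-parameter sigmoid p = 1 / (1 + exp(A·f + B)); that form is assumed in the sketch below and is not confirmed by the slides. The parameters A and B would normally be estimated on held-out data, so the defaults here are placeholders only.

```python
import math

def sigmoid_posterior(distance, a=-1.0, b=0.0):
    """Map a signed SVM distance to a value in (0, 1).

    a and b are placeholder values; a negative a makes larger positive
    distances map to posteriors closer to 1.
    """
    return 1.0 / (1.0 + math.exp(a * distance + b))

# A point on the decision boundary maps to 0.5 when b = 0.
on_boundary = sigmoid_posterior(0.0)
far_positive = sigmoid_posterior(3.0)
far_negative = sigmoid_posterior(-3.0)
```

With these placeholder parameters the mapping is monotonic in the distance, so the ranking of hypotheses by distance is preserved while the outputs become comparable across classifiers.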

  21. Segmental Modeling (1/2) • At the frame level, it is still not computationally feasible to train on all the data available in large corpora. • In our work, we have chosen to use a segment-based approach to avoid these issues. • Segmental data takes better advantage of the correlation in adjacent frames of speech data. • A related problem is the variable length or duration problem.

  22. Segmental Modeling (2/2) • A simple but effective approach motivated by the three-state HMMs is to assume that the segments are composed of a fixed number of sections. • The first and third sections model the transition into and out of the segment • The second section models the stable portion of the segment
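One way to turn a variable-length segment into a fixed-length vector under the three-section assumption is to split the frames by a fixed proportion, average the frames in each section, and concatenate the averages. The sketch below does that. The 3-4-3 proportion, the use of a plain mean, and the function names are assumptions for illustration; the slides do not specify them.

```python
def split_bounds(n_frames, proportions):
    """Frame indices that divide n_frames into len(proportions) non-empty sections."""
    total = sum(proportions)
    n_sections = len(proportions)
    bounds, accumulated = [0], 0
    for i, p in enumerate(proportions, start=1):
        accumulated += p
        end = round(n_frames * accumulated / total)
        end = max(end, bounds[-1] + 1)              # at least one frame per section
        end = min(end, n_frames - (n_sections - i)) # leave frames for later sections
        bounds.append(end)
    return bounds

def mean_vector(frames):
    """Dimension-wise mean of a list of equal-length feature vectors."""
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]

def segment_to_fixed_vector(frames, proportions=(3, 4, 3)):
    """Concatenate the per-section mean vectors of a variable-length segment."""
    if len(frames) < len(proportions):
        raise ValueError("segment has fewer frames than sections")
    bounds = split_bounds(len(frames), proportions)
    vector = []
    for start, end in zip(bounds, bounds[1:]):
        vector.extend(mean_vector(frames[start:end]))
    return vector

# Ten 1-D frames with values 0..9 split as 3 + 4 + 3 frames.
segment_vector = segment_to_fixed_vector([[float(i)] for i in range(10)])
```

Whatever the segment duration, the output length is always three times the frame dimension, which is what lets a classifier with fixed-size input handle variable-length segments.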

  23. The concept of segmental probability model (SPM)

  24. N-best List Rescoring • Generate N-best lists using the HMM system • Generate an alignment for each hypothesis in the N-best list using the HMM system. • Segment-level feature vectors are generated from these alignments. • The N-best list is reordered based on the likelihood, and the top hypothesis is used to calibrate the performance of the system.
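The reordering step can be sketched as follows, assuming each hypothesis already carries its aligned segments and that a segment-level scorer returns a log-likelihood that is summed over the hypothesis. The data layout (dictionaries with `text` and `segments` keys), the function name `rescore_nbest`, and the toy scorer are all hypothetical.

```python
def rescore_nbest(nbest, segment_score):
    """Reorder hypotheses by the sum of their segment-level log scores.

    nbest: list of {"text": str, "segments": list} entries (assumed layout).
    segment_score: callable returning a log-likelihood for one segment.
    Returns (total_score, text) pairs, best hypothesis first.
    """
    rescored = [
        (sum(segment_score(seg) for seg in hyp["segments"]), hyp["text"])
        for hyp in nbest
    ]
    rescored.sort(key=lambda pair: pair[0], reverse=True)
    return rescored

# Toy list: each "segment" here is already a log score, so the scorer is the identity.
NBEST = [
    {"text": "a b", "segments": [-1.0, -2.0]},
    {"text": "a d", "segments": [-0.5, -1.0]},
]
ranked = rescore_nbest(NBEST, segment_score=lambda seg: seg)
top_hypothesis = ranked[0][1]
```

In a hybrid system the scorer would be replaced by the classifier's posterior for each segment-level feature vector; only the top entry of `ranked` is then compared against the reference transcription.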

  25. Overview of a hybrid HMM/SVM system

  26. Experimental Results • The Deterding vowel data • Simple but popular static classification task • Used to benchmark nonlinear classifiers • Spoken Letters and Numbers • Spoken letters and numbers recorded over long-distance telephone lines • OGI Alphadigits (AD) • Confusable for telephone-quality speech (e.g. "p" vs "b")

  27. Conclusions • A support vector machine was used as a classifier in a continuous speech recognition system. • A hybrid SVM/HMM system has been developed. • The results obtained in the experiments clearly indicate the classification power of SVMs and affirm the use of SVMs for acoustic modeling. • Further research into the segmentation issue is needed.

  28. Personal Opinion • I need to study more and more… and I wish God could give me more time
