
7- Speech Recognition



  1. 7-Speech Recognition • Speech Recognition Concepts • Speech Recognition Approaches • Recognition Theories • Bayes Rule • Simple Language Model • P(A|W) Network Types

  2. 7-Speech Recognition (Cont’d) • HMM Calculating Approaches • Neural Components • Three Basic HMM Problems • Viterbi Algorithm • State Duration Modeling • Training in HMM

  3. Recognition Tasks • Isolated Word Recognition (IWR), Connected Word (CW), and Continuous Speech Recognition (CSR) • Speaker Dependent, Multiple Speaker, and Speaker Independent • Vocabulary Size • Small: <20 words • Medium: 100–1,000 • Large: 1,000–10,000 • Very Large: >10,000

  4. Speech Recognition Concepts • Speech recognition is the inverse of speech synthesis • Synthesis: Text → NLP → Speech Processing → Speech • Recognition (understanding): Speech → Speech Processing → Phone Sequence → NLP → Text

  5. Speech Recognition Approaches • Bottom-Up Approach • Top-Down Approach • Blackboard Approach

  6. Bottom-Up Approach • Processing chain: Signal Processing → Feature Extraction → Segmentation → Sound Classification → Lexical Access → Language Model → Recognized Utterance • Knowledge sources along the way: voiced/unvoiced/silence detection, sound classification rules, phonotactic rules, lexicon, language model

  7. Top-Down Approach • Knowledge sources: inventory of speech recognition units, word dictionary, task model/grammar • Processing: Feature Analysis → Unit Matching System → Lexical Hypothesis → Syntactic Hypothesis → Semantic Hypothesis → Utterance Verifier/Matcher → Recognized Utterance

  8. Blackboard Approach • Acoustic, lexical, syntactic, semantic, and environmental processes all read from and write to a shared blackboard

  9. Recognition Theories • Articulatory Based Recognition • Uses the articulatory system for recognition • The most successful theory so far • Auditory Based Recognition • Uses the auditory system for recognition • Hybrid Based Recognition • A hybrid of the theories above • Motor Theory • Models the intended gestures of the speaker

  10. Recognition Problem • We have a sequence of acoustic symbols and want to find the words expressed by the speaker • Solution: find the most probable word sequence given the acoustic symbols

  11. Recognition Problem • A : Acoustic Symbols • W : Word Sequence • We should find Ŵ so that Ŵ = argmax_W P(W|A)

  12. Bayes Rule P(W|A) = P(A|W)·P(W) / P(A)

  13. Bayes Rule (Cont’d) • P(A) does not depend on W, so Ŵ = argmax_W P(A|W)·P(W) • P(A|W) is the acoustic model, P(W) is the language model
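A tiny illustration of this decision rule: score each candidate word sequence by P(A|W)·P(W) and take the argmax. The candidate hypotheses and probability values below are made up for illustration only.

```python
# Hypothetical candidate word sequences with acoustic and language-model scores.
candidates = {
    "recognize speech":   {"p_a_given_w": 0.020, "p_w": 0.010},
    "wreck a nice beach": {"p_a_given_w": 0.025, "p_w": 0.001},
}

# W_hat = argmax_W P(A|W) * P(W); P(A) is the same for all candidates, so it is ignored.
best = max(candidates, key=lambda w: candidates[w]["p_a_given_w"] * candidates[w]["p_w"])
print(best)  # "recognize speech": 2.0e-4 beats 2.5e-5
```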

  14. Simple Language Model P(W) = P(w_1, w_2, …, w_n) = P(w_1)·P(w_2|w_1)·…·P(w_n|w_1 … w_{n-1}) Computing this probability directly is very difficult and would need a very large database, so trigram and bigram models are used instead.

  15. Simple Language Model (Cont’d) Trigram: P(w_i | w_{i-2}, w_{i-1}) Bigram: P(w_i | w_{i-1}) Monogram (unigram): P(w_i)

  16. Simple Language Model (Cont’d) Computing method: P(w_3 | w_1, w_2) = Count(w_1 w_2 w_3) / Count(w_1 w_2), i.e. the number of times w_3 follows w_1 w_2 divided by the total number of occurrences of w_1 w_2. Ad hoc method: interpolate the trigram, bigram, and monogram estimates with fixed weights.
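A minimal sketch of the count-ratio estimate above, assuming a whitespace-tokenized training corpus; the toy sentences are made up for illustration.

```python
from collections import Counter

def ngram_counts(sentences, n):
    """Count n-grams over a list of tokenized sentences."""
    counts = Counter()
    for tokens in sentences:
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

def trigram_prob(w1, w2, w3, trigrams, bigrams):
    """P(w3 | w1, w2) = Count(w1 w2 w3) / Count(w1 w2)."""
    denom = bigrams[(w1, w2)]
    return trigrams[(w1, w2, w3)] / denom if denom else 0.0

# Toy corpus (hypothetical example sentences).
corpus = [["the", "cat", "sat"], ["the", "cat", "ran"], ["the", "dog", "sat"]]
bi = ngram_counts(corpus, 2)
tri = ngram_counts(corpus, 3)
print(trigram_prob("the", "cat", "sat", tri, bi))  # 0.5
```

A real system would add smoothing or the interpolation mentioned above, since unseen trigrams otherwise get probability zero.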

  17. Error Production Factors • Prosody (recognition should be prosody independent) • Noise (noise should be suppressed) • Spontaneous speech

  18. P(A|W) Computing Approaches • Dynamic Time Warping (DTW) • Hidden Markov Model (HMM) • Artificial Neural Network (ANN) • Hybrid Systems

  19. Dynamic Time Warping Method (DTW) • To obtain a global distance between two speech patterns, a time alignment must be performed • Ex: a time alignment path between a template pattern “SPEECH” and a noisy input “SsPEEhH”
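A minimal DTW sketch. The local distance here is the absolute difference between scalar frames, which is an assumption for brevity; real recognizers align sequences of feature vectors (e.g. MFCC frames) with a vector distance.

```python
import numpy as np

def dtw_distance(x, y):
    """Global DTW distance between two 1-D sequences."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            # Allowed predecessors: diagonal match, insertion, deletion.
            D[i, j] = cost + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
    return D[n, m]

print(dtw_distance([1, 2, 3, 4], [1, 2, 2, 3, 4]))  # 0.0: the repeated frame is absorbed by the warp
```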

  20. Artificial Neural Network • Figure: simple computational element of a neural network

  21. Artificial Neural Network (Cont’d) • Neural Network Types • Perceptron • Time Delay Neural Network (TDNN) • Figure: TDNN computational element

  22. Artificial Neural Network (Cont’d) • Figure: single layer perceptron

  23. Artificial Neural Network (Cont’d) • Figure: three layer perceptron
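A minimal numpy sketch of a multi-layer perceptron forward pass like the one pictured above. The layer sizes, random weights, and sigmoid activation are assumptions for illustration, not values from the slides.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
# Hypothetical sizes: 13 acoustic features in, two hidden layers, 10 output classes.
sizes = [13, 32, 32, 10]
weights = [rng.standard_normal((a, b)) * 0.1 for a, b in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(b) for b in sizes[1:]]

def forward(x):
    """Propagate one input vector through all layers."""
    h = x
    for W, b in zip(weights, biases):
        h = sigmoid(h @ W + b)
    return h

x = rng.standard_normal(13)   # one frame of acoustic features
print(forward(x).shape)       # (10,)
```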

  24. Hybrid Methods • Hybrid neural network and matched filter for recognition • Figure: speech passes through delays and acoustic feature extraction into a pattern classifier whose output units give the recognition result

  25. Neural Network Properties • The system is simple, but training requires many iterations • Does not impose a specific structure • Despite its simplicity, the results are good • The training set is large, so training should be done offline • Accuracy is relatively good

  26. Hidden Markov Model • Observations: O_1, O_2, … • States in time: q_1, q_2, … • All states: s_1, s_2, …, s_N

  27. Hidden Markov Model (Cont’d) • Discrete Markov Model • First-order Markov model: P(q_t = s_j | q_{t-1} = s_i, q_{t-2} = s_k, …) = P(q_t = s_j | q_{t-1} = s_i)

  28. Hidden Markov Model (Cont’d) • a_ij = P(q_t = s_j | q_{t-1} = s_i): transition probability from s_i to s_j, with a_ij ≥ 0 and Σ_j a_ij = 1

  29. Discrete Markov Model Example • S1: the weather is rainy • S2: the weather is cloudy • S3: the weather is sunny • Figure: three-state transition diagram between rainy, cloudy, and sunny

  30. Hidden Markov Model Example (Cont’d) • Question 1: what is the probability of the sequence Sunny-Sunny-Sunny-Rainy-Rainy-Sunny-Cloudy-Cloudy?

  31. Hidden Markov Model Example (Cont’d) • π_i = P(q_1 = s_i): the probability of being in state i at time t = 1 • Question 2: what is the probability of staying in state s_i for exactly d days? • Answer: p_i(d) = (a_ii)^(d-1) · (1 - a_ii)
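A small sketch of the duration formula above; the self-transition probability a_ii used here is an assumed toy value.

```python
def duration_prob(a_ii, d):
    """Probability of staying in state i for exactly d steps and then leaving."""
    return a_ii ** (d - 1) * (1.0 - a_ii)

a_ii = 0.8                     # assumed self-transition probability
print(duration_prob(a_ii, 3))  # 0.8**2 * 0.2 = 0.128
print(1.0 / (1.0 - a_ii))      # expected duration of the geometric model: 5 steps
```

The geometric shape of this implicit duration model is what motivates the explicit state duration modeling mentioned in the outline.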

  32. Discrete Density HMM Components • N: number of states • M: number of output symbols • A (N×N): state transition probability matrix • B (N×M): output occurrence probability in each state • π (1×N): initial state probability vector • λ = (A, B, π): set of HMM parameters
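A minimal container for λ = (A, B, π) as listed above, with the stochastic constraints checked; the class name, layout, and toy numbers are illustrative assumptions.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class DiscreteHMM:
    A: np.ndarray    # (N, N) state transition probabilities
    B: np.ndarray    # (N, M) output symbol probabilities per state
    pi: np.ndarray   # (N,)  initial state probabilities

    def __post_init__(self):
        # Each row of A and B, and pi itself, must sum to 1.
        assert np.allclose(self.A.sum(axis=1), 1.0)
        assert np.allclose(self.B.sum(axis=1), 1.0)
        assert np.isclose(self.pi.sum(), 1.0)

# Two states, three output symbols (toy numbers).
hmm = DiscreteHMM(
    A=np.array([[0.7, 0.3], [0.4, 0.6]]),
    B=np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]),
    pi=np.array([0.6, 0.4]),
)
```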

  33. Three Basic HMM Problems • Given an HMM λ and an observation sequence O, what is the probability P(O|λ)? • Given a model λ and an observation sequence O, what is the most likely state sequence in the model that produced the observations? • Given a model λ and an observation sequence O, how should we adjust the model parameters in order to maximize P(O|λ)?

  34. First Problem Solution We know that: P(O | q, λ) = b_{q_1}(O_1)·b_{q_2}(O_2)·…·b_{q_T}(O_T) and P(q | λ) = π_{q_1}·a_{q_1 q_2}·a_{q_2 q_3}·…·a_{q_{T-1} q_T}

  35. First Problem Solution (Cont’d) P(O | λ) = Σ_{all q} P(O | q, λ)·P(q | λ) = Σ_{q_1,…,q_T} π_{q_1} b_{q_1}(O_1) a_{q_1 q_2} b_{q_2}(O_2) … a_{q_{T-1} q_T} b_{q_T}(O_T) Computation order: about 2T·N^T operations

  36. Forward Backward Approach Computing α_t(i) = P(O_1 O_2 … O_t, q_t = s_i | λ) 1) Initialization: α_1(i) = π_i·b_i(O_1), 1 ≤ i ≤ N

  37. Forward Backward Approach (Cont’d) 2) Induction: α_{t+1}(j) = [Σ_{i=1..N} α_t(i)·a_ij]·b_j(O_{t+1}) 3) Termination: P(O | λ) = Σ_{i=1..N} α_T(i) Computation order: about N²·T operations

  38. Backward Variable β_t(i) = P(O_{t+1} O_{t+2} … O_T | q_t = s_i, λ) 1) Initialization: β_T(i) = 1 2) Induction: β_t(i) = Σ_{j=1..N} a_ij·b_j(O_{t+1})·β_{t+1}(j)
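A minimal numpy sketch of the forward and backward recursions above. O is a sequence of symbol indices, and the variables are left unscaled, so long sequences would need the usual scaling; the toy λ values are assumptions.

```python
import numpy as np

def forward(A, B, pi, O):
    """alpha[t, i] = P(O_1..O_t, q_t = s_i | lambda)."""
    T, N = len(O), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, O[0]]                      # initialization
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, O[t]]  # induction
    return alpha                                    # P(O|lambda) = alpha[-1].sum()

def backward(A, B, O):
    """beta[t, i] = P(O_{t+1}..O_T | q_t = s_i, lambda)."""
    T, N = len(O), A.shape[0]
    beta = np.zeros((T, N))
    beta[-1] = 1.0                                    # initialization
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, O[t + 1]] * beta[t + 1])  # induction
    return beta

A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])
O = [0, 1, 2]
print(forward(A, B, pi, O)[-1].sum())  # P(O | lambda), termination step
```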

  39. Second Problem Solution • Finding the most likely state sequence • Individually most likely state: γ_t(i) = P(q_t = s_i | O, λ) = α_t(i)·β_t(i) / Σ_{j=1..N} α_t(j)·β_t(j), and q_t = argmax_i γ_t(i)

  40. Viterbi Algorithm • Define: δ_t(i) = max_{q_1,…,q_{t-1}} P(q_1 … q_{t-1}, q_t = s_i, O_1 … O_t | λ), the probability of the most likely state sequence that ends in state s_i at time t and accounts for the first t observations

  41. Viterbi Algorithm (Cont’d) 1) Initialization: δ_1(i) = π_i·b_i(O_1), ψ_1(i) = 0 • ψ_t(i) records the most likely state before state i at time t-1

  42. Viterbi Algorithm (Cont’d) 2) Recursion: δ_t(j) = max_i [δ_{t-1}(i)·a_ij]·b_j(O_t), ψ_t(j) = argmax_i [δ_{t-1}(i)·a_ij]

  43. Viterbi Algorithm (Cont’d) 3) Termination: P* = max_i δ_T(i), q*_T = argmax_i δ_T(i) 4) Backtracking: q*_t = ψ_{t+1}(q*_{t+1}), t = T-1, …, 1
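A minimal sketch of the δ/ψ recursion above, using the same toy λ layout as the forward sketch; log-probabilities (which a practical decoder would use to avoid underflow) are omitted for brevity.

```python
import numpy as np

def viterbi(A, B, pi, O):
    """Return the most likely state sequence for O and its probability."""
    T, N = len(O), len(pi)
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)
    delta[0] = pi * B[:, O[0]]                       # initialization
    for t in range(1, T):
        scores = delta[t - 1][:, None] * A           # delta_{t-1}(i) * a_ij
        psi[t] = scores.argmax(axis=0)               # best predecessor for each j
        delta[t] = scores.max(axis=0) * B[:, O[t]]   # recursion
    path = [int(delta[-1].argmax())]                 # termination: q*_T
    for t in range(T - 1, 0, -1):                    # backtracking
        path.append(int(psi[t][path[-1]]))
    return path[::-1], delta[-1].max()

A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])
print(viterbi(A, B, pi, [0, 1, 2]))  # (state path, P*)
```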

  44. Third Problem Solution • Parameter estimation using the Baum-Welch (Expectation Maximization, EM) approach • Define: ξ_t(i, j) = P(q_t = s_i, q_{t+1} = s_j | O, λ) = α_t(i)·a_ij·b_j(O_{t+1})·β_{t+1}(j) / P(O | λ)

  45. Third Problem Solution (Cont’d) • Σ_{t=1..T-1} γ_t(i): expected number of transitions out of state s_i • Σ_{t=1..T-1} ξ_t(i, j): expected number of transitions from state s_i to state s_j

  46. Third Problem Solution (Cont’d) Re-estimation formulas: π̄_i = γ_1(i) ā_ij = Σ_{t=1..T-1} ξ_t(i, j) / Σ_{t=1..T-1} γ_t(i) b̄_j(k) = Σ_{t: O_t = v_k} γ_t(j) / Σ_{t=1..T} γ_t(j)
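A minimal sketch of one Baum-Welch re-estimation pass over a single symbol sequence, using the unscaled forward/backward recursions from the earlier sketch; a real implementation would add scaling, handle multiple training sequences, and iterate until convergence.

```python
import numpy as np

def baum_welch_step(A, B, pi, O):
    """One EM update of (A, B, pi) from a single observation sequence O."""
    T, N, M = len(O), A.shape[0], B.shape[1]
    # Forward and backward variables (unscaled).
    alpha = np.zeros((T, N))
    beta = np.zeros((T, N))
    alpha[0] = pi * B[:, O[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, O[t]]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, O[t + 1]] * beta[t + 1])
    prob = alpha[-1].sum()                               # P(O | lambda)
    gamma = alpha * beta / prob                          # gamma[t, i]
    # xi[t, i, j] = alpha_t(i) * a_ij * b_j(O_{t+1}) * beta_{t+1}(j) / P(O|lambda)
    xi = (alpha[:-1, :, None] * A[None, :, :]
          * (B[:, O[1:]].T * beta[1:])[:, None, :]) / prob
    # Re-estimation formulas.
    new_pi = gamma[0]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.zeros((N, M))
    for k in range(M):
        new_B[:, k] = gamma[np.array(O) == k].sum(axis=0) / gamma.sum(axis=0)
    return new_A, new_B, new_pi

A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])
print(baum_welch_step(A, B, pi, [0, 1, 2, 0, 1]))
```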

  47. Baum Auxiliary Function Q(λ̄ | λ) = Σ_q P(q | O, λ)·log P(O, q | λ̄) • Maximizing Q over λ̄ increases P(O | λ̄), so this approach converges to a local optimum

  48. Restrictions Of Reestimation Formulas • Σ_{i=1..N} π̄_i = 1 • Σ_{j=1..N} ā_ij = 1 for every state i • Σ_{k=1..M} b̄_j(k) = 1 for every state j • The re-estimation formulas automatically preserve these stochastic constraints

  49. Continuous Observation Density • The emission is a continuous PDF instead of the discrete matrix B: b_j(o) = Σ_{m=1..M} c_jm·N(o; μ_jm, Σ_jm) • c_jm: mixture coefficients, μ_jm: means, Σ_jm: covariances (variances)

  50. Continuous Observation Density • Mixture in HMM: each state s_1, s_2, s_3 has its own set of mixture components M_{m|j} (figure shows mixtures M1..M4 attached to states S1..S3) • Dominant mixture: the component contributing most to b_j(o), argmax_m c_jm·N(o; μ_jm, Σ_jm)
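A minimal sketch of the per-state mixture density b_j(o) defined above, assuming diagonal covariances; the component weights, means, and variances are toy numbers.

```python
import numpy as np

def gmm_density(o, weights, means, variances):
    """b_j(o) = sum_m c_m * N(o; mu_m, diag(var_m)) for one state j."""
    o = np.asarray(o)
    dens = 0.0
    for c, mu, var in zip(weights, means, variances):
        norm = np.prod(1.0 / np.sqrt(2.0 * np.pi * var))        # Gaussian normalizer
        dens += c * norm * np.exp(-0.5 * np.sum((o - mu) ** 2 / var))
    return dens

# One state with M = 2 mixture components over 2-D feature vectors (toy numbers).
weights = np.array([0.6, 0.4])                  # mixture coefficients, sum to 1
means = np.array([[0.0, 0.0], [2.0, 1.0]])      # component means
variances = np.array([[1.0, 1.0], [0.5, 0.5]])  # diagonal covariances
print(gmm_density([0.5, 0.2], weights, means, variances))
```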
