
Viterbi Training


Presentation Transcript


  1. Viterbi Training • It is like Baum-Welch, except that instead of computing the expected transition and emission counts (the A's and E's) over all paths, the most probable path for each training sequence is derived with the Viterbi algorithm and the counts are taken from those paths. • Guaranteed to converge, because the path assignment (and hence the parameter estimate) eventually stops changing. • Maximizes the contribution of the most probable paths, P(x, π*(x) | θ), rather than the true likelihood P(x | θ) that Baum-Welch maximizes; a sketch of the iteration follows.
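
A minimal sketch of Viterbi training for a discrete-emission HMM, assuming a toy numpy-only setup (the function names and pseudocount choice are illustrative, not from the slides):

# Viterbi training sketch: decode the most probable paths, then re-estimate from their counts.
import numpy as np

def viterbi(obs, a, e, pi):
    """Most probable state path for one observation sequence (log space)."""
    n_states, T = a.shape[0], len(obs)
    logv = np.full((T, n_states), -np.inf)
    back = np.zeros((T, n_states), dtype=int)
    logv[0] = np.log(pi) + np.log(e[:, obs[0]])
    for t in range(1, T):
        scores = logv[t - 1][:, None] + np.log(a)   # scores[k, l]: come from k, move to l
        back[t] = scores.argmax(axis=0)
        logv[t] = scores.max(axis=0) + np.log(e[:, obs[t]])
    path = np.zeros(T, dtype=int)
    path[-1] = logv[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path

def viterbi_training(seqs, a, e, pi, n_iter=20, pseudocount=1.0):
    """Re-estimate a and e from counts along the Viterbi paths of the training sequences."""
    for _ in range(n_iter):
        A = np.full_like(a, pseudocount)            # transition counts (pseudocounts keep them > 0)
        E = np.full_like(e, pseudocount)            # emission counts
        for obs in seqs:
            path = viterbi(obs, a, e, pi)
            for t in range(len(obs)):
                E[path[t], obs[t]] += 1
                if t > 0:
                    A[path[t - 1], path[t]] += 1
        a = A / A.sum(axis=1, keepdims=True)
        e = E / E.sum(axis=1, keepdims=True)
    return a, e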

  2. Baum-Welch Example: the generating model and the model estimated from 300 rolls of dice (shown as figures on the slide)

  3. Estimated model from 30,000 rolls of dice (figure on the slide)
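
A hedged sketch of how such an experiment could be reproduced: simulate rolls from an assumed two-state fair/loaded-die generating model and fit an HMM to 300 versus 30,000 of them. The parameter values below are illustrative placeholders, not the slide's numbers:

# Toy simulation of a two-state (fair/loaded die) generating model; numbers are assumptions.
import numpy as np

rng = np.random.default_rng(0)
a = np.array([[0.95, 0.05],      # fair -> fair / loaded
              [0.10, 0.90]])     # loaded -> fair / loaded
e = np.array([[1/6] * 6,                         # fair die
              [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]])   # loaded die favours six

def simulate(n_rolls):
    state, rolls = 0, []
    for _ in range(n_rolls):
        rolls.append(rng.choice(6, p=e[state]))
        state = rng.choice(2, p=a[state])
    return np.array(rolls)

rolls_300 = simulate(300)        # typically yields noisy parameter estimates
rolls_30000 = simulate(30000)    # enough data to recover the generating model closely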

  4. Modeling with labeled sequences

  5. CML (conditional maximum likelihood)
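
For labeled sequences x with known state labels y, the CML criterion in its standard formulation (stated here for completeness; the slide body is not in the transcript) maximises the conditional likelihood of the labels:

  θ_CML = argmax_θ P(y | x, θ) = argmax_θ P(x, y | θ) / P(x | θ)

so the parameters are pushed toward making the correct labelling probable, rather than only making the sequences themselves probable.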

  6. 3.4 HMM model structure • A fully connected model? It almost never works in practice, because estimation gets trapped in local maxima. • In practice, successful models are constructed from knowledge about the problem. • If we set a_kl = 0, it remains 0 throughout Baum-Welch estimation, since its expected count is always 0; structure can therefore be imposed simply by zeroing transitions (see the sketch below). • How do we choose a model structure from our knowledge?
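
A small sketch of why a zeroed transition stays zero: one Baum-Welch re-estimation step on a toy 3-state model (illustrative parameters, assumed for this example) where a[0,2] = 0.

# Expected count for a disabled transition is proportional to a[0,2] itself, hence 0 forever.
import numpy as np

a = np.array([[0.7, 0.3, 0.0],   # transition 0 -> 2 disabled by design
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])
e = np.array([[0.9, 0.1],
              [0.5, 0.5],
              [0.1, 0.9]])
pi = np.array([1/3, 1/3, 1/3])
obs = np.array([0, 1, 1, 0, 1, 1, 1, 0])

T, K = len(obs), len(pi)
f = np.zeros((T, K)); b = np.ones((T, K))       # forward/backward, unscaled (fine for a short toy sequence)
f[0] = pi * e[:, obs[0]]
for t in range(1, T):
    f[t] = (f[t - 1] @ a) * e[:, obs[t]]
for t in range(T - 2, -1, -1):
    b[t] = a @ (e[:, obs[t + 1]] * b[t + 1])
px = f[-1].sum()

# expected transition counts A[k,l] = sum_t f[t,k] * a[k,l] * e[l, obs[t+1]] * b[t+1,l] / P(x)
A = np.zeros((K, K))
for t in range(T - 1):
    A += np.outer(f[t], e[:, obs[t + 1]] * b[t + 1]) * a / px

a_new = A / A.sum(axis=1, keepdims=True)
print(a_new[0, 2])   # exactly 0.0: the disabled transition never reappears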

  7. Duration modeling
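
A minimal sketch of the standard duration-modeling point (assumed here, since the slide body is not in the transcript): a single state with self-transition probability p has a geometric length distribution, while a chain of n copies of the state gives a negative-binomial one, which allows non-geometric durations.

# Duration distributions implied by state topology (standard treatment, illustrative p).
from math import comb

p = 0.9  # self-transition probability of a single state

def geometric_duration(l, p=p):
    """P(stay exactly l steps) for a single state with a self-loop: geometric distribution."""
    return (p ** (l - 1)) * (1 - p)

def negative_binomial_duration(l, n, p=p):
    """P(total duration l) for a chain of n copies of the state, each with self-loop p."""
    return comb(l - 1, n - 1) * ((1 - p) ** n) * (p ** (l - n)) if l >= n else 0.0

# A single state always prefers the shortest stay; a chain of n states peaks at a longer duration.
print(geometric_duration(1), geometric_duration(10))
print(negative_binomial_duration(3, n=3), negative_binomial_duration(20, n=3))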

  8. Silent States • Connecting each of 200 states directly to every later state requires 200·199/2 = 19,900 transitions. • By routing long jumps through silent (non-emitting) states, 200 emitting states need only around 600 transitions (counting sketch below).
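
A rough counting sketch; the "about 3 transitions per emitting state" approximation is an assumption that matches the slide's "around 600" for 200 states:

# Transition counts for allowing arbitrary forward jumps over n emitting states.
def direct_transitions(n):
    return n * (n - 1) // 2   # every state connected to every later state

def with_silent_states(n):
    return 3 * n              # approx.: state -> next state, state -> silent, silent -> next

print(direct_transitions(200))    # 19900
print(with_silent_states(200))    # 600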

  9. For an HMM in which the silent states do not form loops, all the algorithms of Sections 3.2 and 3.3 can be extended: the silent states are processed in a topological order, and their forward (and backward) values are accumulated from states below them without advancing the sequence position. For an HMM with loops consisting entirely of silent states, we can instead eliminate the silent states by calculating the effective transition probabilities between the real states of the model.
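
A minimal sketch of eliminating silent states (the block names R, S_in, S_ss, S_out and the toy numbers are assumed notation, not from the slides): if R holds real-to-real transitions, S_in real-to-silent, S_ss silent-to-silent and S_out silent-to-real, the effective real-to-real transitions sum over all silent-state paths as R + S_in (I − S_ss)⁻¹ S_out.

# Effective transition probabilities between real (emitting) states after summing
# over all paths through silent states.  Toy model: 2 real states, 2 silent states.
import numpy as np

R     = np.array([[0.5, 0.2], [0.1, 0.6]])   # real   -> real
S_in  = np.array([[0.3, 0.0], [0.0, 0.3]])   # real   -> silent
S_ss  = np.array([[0.0, 0.5], [0.0, 0.0]])   # silent -> silent
S_out = np.array([[0.4, 0.1], [0.5, 0.5]])   # silent -> real

# geometric series over silent paths: S_in @ (I + S_ss + S_ss^2 + ...) @ S_out
effective = R + S_in @ np.linalg.inv(np.eye(2) - S_ss) @ S_out
print(effective, effective.sum(axis=1))      # rows still sum to 1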

  10. 3.5 Higher Order Markov Chains • 2nd-order Markov chain
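
In an n-th order Markov chain each symbol depends on the previous n symbols, P(x_i | x_{i-1}, ..., x_{i-n}); a 2nd-order chain over DNA is equivalent to a 1st-order chain over dinucleotides. A small estimation sketch (the training string is a toy assumption):

# Estimating 2nd-order Markov chain probabilities P(next | previous two) from a DNA string.
from collections import Counter

seq = "ATGCGCGATATGCGCATGATGCGC"              # toy training sequence
counts = Counter(seq[i:i + 3] for i in range(len(seq) - 2))

def p_2nd_order(prev2, nxt):
    """P(nxt | prev2), estimated from trinucleotide counts."""
    total = sum(counts[prev2 + b] for b in "ACGT")
    return counts[prev2 + nxt] / total if total else 0.0

print(p_2nd_order("AT", "G"))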


  12. NORF: Non-coding Open Reading Frame


  14. Inhomogeneous Markov Chain • Use three different Markov chains, one per codon position, to model coding regions: Pr(x) = a1[x1, x2] · a2[x2, x3] · a3[x3, x4] · a1[x4, x5] · a2[x5, x6] · …, with the chain used cycling through the three codon positions. • n-th order emission probabilities condition each emission on the n preceding symbols.
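
A sketch of scoring a sequence with three position-dependent chains; the three transition matrices below are random placeholders, not real coding-region statistics:

# Scoring a DNA sequence with an inhomogeneous (codon-position-dependent) Markov chain.
import numpy as np

rng = np.random.default_rng(1)
alphabet = {b: i for i, b in enumerate("ACGT")}
chains = rng.dirichlet(np.ones(4), size=(3, 4))   # chains[c][prev] = P(next | prev) for codon position c

def log_prob(seq, start_frame=0):
    total = 0.0
    for i in range(1, len(seq)):
        c = (start_frame + i) % 3                 # which of the three chains applies at position i
        total += np.log(chains[c][alphabet[seq[i - 1]], alphabet[seq[i]]])
    return total

print(log_prob("ATGGCGTTTAAA"))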

  15. 3.6 Numerical stability of HMM algorithms • Multiplying many small probabilities quickly underflows; there are two standard ways to deal with the problem: • The log transformation (sketch below) • Scaling of probabilities
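
A sketch of the log transformation (toy values assumed): products of probabilities become sums of logs, and where the forward algorithm needs a sum of probabilities, the log-sum-exp trick is used.

# The log transformation: products underflow in double precision, sums of logs do not.
import numpy as np

probs = np.full(1000, 1e-4)
print(np.prod(probs))                  # 0.0 -- underflows
print(np.sum(np.log(probs)))           # about -9210.34 -- same quantity, representable in log space

def logsumexp(log_values):
    """log(sum(exp(v))) computed without overflow/underflow, as needed by the forward algorithm."""
    m = np.max(log_values)
    return m + np.log(np.sum(np.exp(log_values - m)))

print(logsumexp(np.array([-1000.0, -1001.0, -1002.0])))   # about -999.59, even though exp(-1000) underflows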

  16. Scaling of probabilities
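
A sketch of scaling in the forward algorithm (the function name and toy model are assumptions): the forward vector is renormalised at every position, and the scaling factors accumulate the log-likelihood.

# Scaled forward algorithm: renormalise f at each position; log P(x) = sum of log scaling factors.
import numpy as np

def scaled_forward(obs, a, e, pi):
    f = pi * e[:, obs[0]]
    log_px = 0.0
    for t, o in enumerate(obs):
        if t > 0:
            f = (f @ a) * e[:, o]
        s = f.sum()               # scaling factor for this position
        f /= s
        log_px += np.log(s)
    return log_px                 # no underflow, however long obs is

a  = np.array([[0.9, 0.1], [0.2, 0.8]])
e  = np.array([[0.7, 0.3], [0.4, 0.6]])
pi = np.array([0.5, 0.5])
obs = np.zeros(10000, dtype=int)  # a very long sequence; the unscaled forward values would underflow
print(scaled_forward(obs, a, e, pi))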
