
# Hidden Markov Models

Presentation Transcript
1. Hidden Markov Models

2. A Hidden Markov Model consists of • A sequence of states $\{X_t \mid t \in T\} = \{X_1, X_2, \dots, X_T\}$, and • A sequence of observations $\{Y_t \mid t \in T\} = \{Y_1, Y_2, \dots, Y_T\}$.

3. The sequence of states $\{X_1, X_2, \dots, X_T\}$ forms a Markov chain moving among the $M$ states $\{1, 2, \dots, M\}$. • The observation $Y_t$ comes from a distribution that is determined by the current state $X_t$ of the process (or possibly by past observations and past states). • The states $\{X_1, X_2, \dots, X_T\}$ are unobserved (hence hidden).

4. The Hidden Markov Model (diagram: the hidden chain $X_1 \to X_2 \to X_3 \to \dots \to X_T$, with each state $X_t$ emitting the observation $Y_t$)

5. Some basic problems: from the observations $\{Y_1, Y_2, \dots, Y_T\}$ 1. Determine the sequence of states $\{X_1, X_2, \dots, X_T\}$. 2. Determine (or estimate) the parameters of the stochastic process that is generating the states and the observations.

6. Examples

7. Example 1 • A person is rolling two sets of dice (one is balanced, the other is unbalanced). He switches between the two sets of dice using a Markov transition matrix. • The states are the dice. • The observations are the numbers rolled each time.
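A minimal simulation sketch of Example 1. The slides do not give the transition matrix or the face probabilities of the unbalanced dice, so every number below is a hypothetical placeholder, and each "set of dice" is simplified to a single die.

```python
import random

# Hypothetical parameters (not from the slides).
# State 0 = balanced die, state 1 = unbalanced die.
INITIAL = [0.5, 0.5]
TRANSITION = [[0.95, 0.05],
              [0.10, 0.90]]
FACES = [1, 2, 3, 4, 5, 6]
EMISSION = [[1 / 6] * 6,                        # balanced: all faces equally likely
            [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]]     # unbalanced: six is favoured

def simulate_dice_hmm(T, seed=None):
    """Return (hidden states, observed rolls) for T time steps."""
    rng = random.Random(seed)
    states, rolls = [], []
    state = rng.choices([0, 1], weights=INITIAL)[0]
    for _ in range(T):
        states.append(state)
        # The roll depends only on which die is currently in use.
        rolls.append(rng.choices(FACES, weights=EMISSION[state])[0])
        # Switch (or keep) the die according to the Markov transition matrix.
        state = rng.choices([0, 1], weights=TRANSITION[state])[0]
    return states, rolls

states, rolls = simulate_dice_hmm(20, seed=0)
print("hidden states: ", states)
print("observed rolls:", rolls)
```

Only `rolls` would be available to an observer; `states` is the hidden sequence that the later slides try to recover.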

8. Balanced Dice

9. Unbalanced Dice

10. Example 2 • The Markov chain has two states. • The observations (given the states) are independent Normal. • Both the mean and the variance depend on the state. HMM AR.xls

11. Example 3 – Dow Jones

12. Daily Changes Dow Jones

13. Hidden Markov Model??

14. Bear and Bull Market?

15. Speech Recognition • When a word is spoken the vocalization process goes through a sequence of states. • The sound produced is relatively constant when the process remains in the same state. • Recognizing the sequence of states and the duration of each state allows one to recognize the word being spoken.

16. The interval of time when the word is spoken is broken into small (possibly overlapping) subintervals. • In each subinterval one measures the amplitudes of various frequencies in the sound (using Fourier analysis). The vector of amplitudes $Y_t$ is assumed to have a multivariate normal distribution in each state, with the mean vector and covariance matrix being state dependent.

17. Hidden Markov Models for Biological Sequences. Consider the motif: [AT][CG][AC][ACGT]*A[TG][GC]. Some realizations (aligned, with - marking a gap): ACA---ATG, TCAACTATC, ACAC--AGC, AGA---ATC, ACCG--ATC
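As a quick check, the motif can be read as a regular expression and tested against the five realizations once the alignment gaps are removed. The helper name `matches_motif` is introduced here for illustration only.

```python
import re

# The motif from the slide, used directly as a regular expression.
MOTIF = re.compile(r"[AT][CG][AC][ACGT]*A[TG][GC]")

# The five aligned realizations; "-" marks an alignment gap.
ALIGNED = ["ACA---ATG", "TCAACTATC", "ACAC--AGC", "AGA---ATC", "ACCG--ATC"]

def matches_motif(aligned_seq):
    """True if the sequence (gaps removed) matches the whole motif."""
    return MOTIF.fullmatch(aligned_seq.replace("-", "")) is not None

for seq in ALIGNED:
    print(seq, matches_motif(seq))
```

A regular expression only says whether a sequence matches; it cannot say that some matches are more typical than others, which is what the probabilities in a hidden Markov model add.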

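18. Hidden Markov model of the same motif, [AT][CG][AC][ACGT]*A[TG][GC] (diagram). Emission probabilities of the six main states, in order: A .8, T .2; C .8, G .2; A .8, C .2; A 1.0; G .2, T .8; C .8, G .2. Insert state between the third and fourth main states: A .2, C .4, G .2, T .2. Transition probabilities: 1.0 between consecutive main states, except that the third main state moves to the insert state with probability .6 and directly to the fourth main state with probability .4; the insert state repeats with probability .4 and moves on to the fourth main state with probability .6.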
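A sketch of how this model scores a sequence. The emission and transition values are read from the diagram's residue and agree with the frequencies in the five realizations, but treat them as an assumption of this example; the function name `motif_probability` is hypothetical. Because the first three and last three symbols must come from the main states, each sequence has exactly one path through the model.

```python
# Emission tables of the six main states and the insert state (assumed values).
MAIN = [
    {"A": 0.8, "T": 0.2},
    {"C": 0.8, "G": 0.2},
    {"A": 0.8, "C": 0.2},
    {"A": 1.0},
    {"T": 0.8, "G": 0.2},
    {"C": 0.8, "G": 0.2},
]
INSERT = {"A": 0.2, "C": 0.4, "G": 0.2, "T": 0.2}
P_TO_INSERT, P_SKIP_INSERT = 0.6, 0.4      # leaving the third main state
P_STAY_INSERT, P_LEAVE_INSERT = 0.4, 0.6   # leaving the insert state

def motif_probability(seq):
    """Probability that the motif HMM emits seq (a string without gaps)."""
    if len(seq) < 6:
        return 0.0
    head, middle, tail = seq[:3], seq[3:-3], seq[-3:]
    prob = 1.0
    # Main-state emissions; transitions between main states are 1.0.
    for symbol, table in zip(head + tail, MAIN):
        prob *= table.get(symbol, 0.0)
    if not middle:
        return prob * P_SKIP_INSERT
    prob *= P_TO_INSERT
    for k, symbol in enumerate(middle):
        prob *= INSERT.get(symbol, 0.0)
        # Stay in the insert state until the last inserted symbol, then leave.
        prob *= P_STAY_INSERT if k < len(middle) - 1 else P_LEAVE_INSERT
    return prob

for seq in ["ACAATG", "TCAACTATC", "ACACAGC", "AGAATC", "ACCGATC"]:
    print(seq, motif_probability(seq))
```

Longer insertions are penalised geometrically through the .4 self-loop, which is why such scores are often compared on a log scale or normalised by sequence length.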

19. Profile HMMs (diagram: a chain of states running from a Begin state to an End state)

20. Computing Likelihood. Let $p_{ij} = P[X_{t+1} = j \mid X_t = i]$ and $P = (p_{ij})$ = the transition matrix of the Markov chain. Let $\pi_i = P[X_1 = i]$ and $\pi = (\pi_1, \dots, \pi_M)$ = the initial distribution over the states.

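21. Now assume that $P[Y_t = y_t \mid X_1 = i_1, X_2 = i_2, \dots, X_t = i_t] = P[Y_t = y_t \mid X_t = i_t] = p(y_t \mid \theta_{i_t})$, where $\theta_i$ denotes the parameters of the observation distribution in state $i$. Then, with $\pi_{i_1} = P[X_1 = i_1]$, $P[X_1 = i_1, \dots, X_T = i_T, Y_1 = y_1, \dots, Y_T = y_T] = P[X = i, Y = y] = \pi_{i_1}\, p(y_1 \mid \theta_{i_1}) \prod_{t=2}^{T} p_{i_{t-1} i_t}\, p(y_t \mid \theta_{i_t})$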

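22. Therefore $P[Y_1 = y_1, Y_2 = y_2, \dots, Y_T = y_T] = P[Y = y] = \sum_{i_1, i_2, \dots, i_T} \pi_{i_1}\, p(y_1 \mid \theta_{i_1}) \prod_{t=2}^{T} p_{i_{t-1} i_t}\, p(y_t \mid \theta_{i_t})$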
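The likelihood can be computed literally as a sum of joint probabilities over all $M^T$ state sequences. The sketch below does this for a hypothetical two-state model with binary observations (all numbers are illustrative); it is only feasible for very short sequences.

```python
from itertools import product

def joint_probability(path, obs, initial, transition, emission):
    """P[X = path, Y = obs]: initial term times one factor per later step."""
    prob = initial[path[0]] * emission[path[0]][obs[0]]
    for t in range(1, len(obs)):
        prob *= transition[path[t - 1]][path[t]] * emission[path[t]][obs[t]]
    return prob

def likelihood_brute_force(obs, initial, transition, emission):
    """P[Y = obs], summing the joint probability over every state sequence."""
    M = len(initial)
    return sum(joint_probability(path, obs, initial, transition, emission)
               for path in product(range(M), repeat=len(obs)))

# Hypothetical model: 2 hidden states, observations coded 0 or 1.
initial = [0.6, 0.4]
transition = [[0.7, 0.3],
              [0.4, 0.6]]
emission = [[0.9, 0.1],    # emission[i][y] = P[Y_t = y | X_t = i]
            [0.2, 0.8]]
obs = [0, 1, 1]

print(likelihood_brute_force(obs, initial, transition, emission))
```

The number of terms grows as $M^T$, which is the motivation for the more efficient recursive methods.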

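23. In the case when $Y_1, Y_2, \dots, Y_T$ are continuous random variables or continuous random vectors, let $f(y \mid \theta_i)$ denote the conditional density of $Y_t$ given $X_t = i$. Then the joint density of $Y_1, Y_2, \dots, Y_T$ is given by $f(y_1, y_2, \dots, y_T) = f(y) = \sum_{i_1, i_2, \dots, i_T} \pi_{i_1}\, f(y_1 \mid \theta_{i_1}) \prod_{t=2}^{T} p_{i_{t-1} i_t}\, f(y_t \mid \theta_{i_t})$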

24. Efficient Methods for computing Likelihood The Forward Method Consider
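A minimal sketch of the forward method, assuming the standard recursion: define $\alpha_t(i) = P[Y_1 = y_1, \dots, Y_t = y_t, X_t = i]$, start with $\alpha_1(i) = P[X_1 = i]\, P[Y_1 = y_1 \mid X_1 = i]$, and update with $\alpha_{t+1}(j) = \bigl(\sum_i \alpha_t(i)\, p_{ij}\bigr) P[Y_{t+1} = y_{t+1} \mid X_{t+1} = j]$. The symbol $\alpha$ and the two-state model with binary observations are assumptions of this example.

```python
def forward(obs, initial, transition, emission):
    """Return alpha, where alpha[t][i] = P[Y_1..Y_(t+1) = obs[0..t], X_(t+1) = i]."""
    M = len(initial)
    alpha = [[initial[i] * emission[i][obs[0]] for i in range(M)]]
    for t in range(1, len(obs)):
        prev = alpha[-1]
        alpha.append([
            # Sum over the previous state, then emit the current observation.
            sum(prev[i] * transition[i][j] for i in range(M)) * emission[j][obs[t]]
            for j in range(M)
        ])
    return alpha

# Hypothetical model: 2 hidden states, observations coded 0 or 1.
initial = [0.6, 0.4]
transition = [[0.7, 0.3],
              [0.4, 0.6]]
emission = [[0.9, 0.1],
            [0.2, 0.8]]
obs = [0, 1, 1]

alpha = forward(obs, initial, transition, emission)
likelihood = sum(alpha[-1])   # P[Y = y] = sum over i of alpha_T(i)
print(likelihood)
```

The cost is of order $M^2 T$ operations instead of $M^T$. For long sequences the $\alpha$ values underflow, so practical code rescales each step or works with logarithms.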

25. The Backward Procedure
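A minimal sketch of the backward procedure, assuming the standard recursion: define $\beta_t(i) = P[Y_{t+1} = y_{t+1}, \dots, Y_T = y_T \mid X_t = i]$, set $\beta_T(i) = 1$, and work backwards with $\beta_t(i) = \sum_j p_{ij}\, P[Y_{t+1} = y_{t+1} \mid X_{t+1} = j]\, \beta_{t+1}(j)$. The symbol $\beta$ and the example model are assumptions of this sketch.

```python
def backward(obs, transition, emission):
    """Return beta, where beta[t][i] = P[later observations | state i at step t]."""
    M, T = len(transition), len(obs)
    beta = [[1.0] * M for _ in range(T)]      # beta at the final step is 1
    for t in range(T - 2, -1, -1):
        for i in range(M):
            beta[t][i] = sum(transition[i][j] * emission[j][obs[t + 1]] * beta[t + 1][j]
                             for j in range(M))
    return beta

# Hypothetical model: 2 hidden states, observations coded 0 or 1.
initial = [0.6, 0.4]
transition = [[0.7, 0.3],
              [0.4, 0.6]]
emission = [[0.9, 0.1],
            [0.2, 0.8]]
obs = [0, 1, 1]

beta = backward(obs, transition, emission)
# The likelihood can also be recovered from the first backward vector.
likelihood = sum(initial[i] * emission[i][obs[0]] * beta[0][i] for i in range(len(initial)))
print(likelihood)
```

Both procedures give the same likelihood; the backward quantities become useful when combined with the forward ones to make statements about individual hidden states.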

26. Prediction of states from the observations and the model:

27. The Viterbi Algorithm (Viterbi Paths). Suppose that we know the parameters of the Hidden Markov Model. Suppose in addition that we have observed the sequence of observations $Y_1, Y_2, \dots, Y_T$. Now consider determining the sequence of states $X_1, X_2, \dots, X_T$.

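28. Recall that $P[X_1 = i_1, \dots, X_T = i_T, Y_1 = y_1, \dots, Y_T = y_T] = P[X = i, Y = y] = \pi_{i_1}\, p(y_1 \mid \theta_{i_1}) \prod_{t=2}^{T} p_{i_{t-1} i_t}\, p(y_t \mid \theta_{i_t})$. Consider the problem of determining the sequence of states $i_1, i_2, \dots, i_T$ that maximizes the above probability. This is equivalent to maximizing $P[X = i \mid Y = y] = P[X = i, Y = y] / P[Y = y]$, because $P[Y = y]$ does not depend on the state sequence.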

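29. The Viterbi Algorithm. We want to maximize $P[X = i, Y = y] = \pi_{i_1}\, p(y_1 \mid \theta_{i_1}) \prod_{t=2}^{T} p_{i_{t-1} i_t}\, p(y_t \mid \theta_{i_t})$. Equivalently we want to minimize $U(i_1, i_2, \dots, i_T)$, where $U(i_1, i_2, \dots, i_T) = -\ln P[X = i, Y = y] = -\ln\bigl(\pi_{i_1}\, p(y_1 \mid \theta_{i_1})\bigr) - \sum_{t=2}^{T} \ln\bigl(p_{i_{t-1} i_t}\, p(y_t \mid \theta_{i_t})\bigr)$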

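30. Minimization of $U(i_1, i_2, \dots, i_T)$ can be achieved by Dynamic Programming. • This can be thought of as finding the shortest distance through the following grid of points, • by starting at the unique point in stage 0 and moving from a point in stage $t$ to a point in stage $t+1$ in an optimal way. The distances between points in stage $t$ and points in stage $t+1$ are equal to: $-\ln\bigl(\pi_{i_1}\, p(y_1 \mid \theta_{i_1})\bigr)$ from the point in stage 0 to point $i_1$ in stage 1, and $-\ln\bigl(p_{i_t i_{t+1}}\, p(y_{t+1} \mid \theta_{i_{t+1}})\bigr)$ from point $i_t$ in stage $t$ to point $i_{t+1}$ in stage $t+1$, for $t = 1, \dots, T-1$.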

31. Dynamic Programming (diagram: a grid of points arranged in Stage 0, Stage 1, Stage 2, ..., Stage T-1, Stage T)


35. Let i1 = 1, 2, …, M Then and it+1 = 1, 2, …, M; t = 1,…, T-2

36. Finally

37. Summary of calculations of Viterbi Path 1. i1 = 1, 2, …, M 2. it+1 = 1, 2, …, M; t = 1,…, T-2 3.
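The Viterbi calculation can be sketched as a shortest-path search over the stage grid, with distances equal to negative log probabilities. This is a minimal sketch assuming the standard recursion (best distance to each point in stage $t+1$ = minimum over points in stage $t$ of best distance so far plus the step distance); the variable names and the two-state model with binary observations are illustrative.

```python
import math

def cost(p):
    """Distance contributed by a probability: -ln p (infinite if p is 0)."""
    return -math.log(p) if p > 0 else math.inf

def viterbi(obs, initial, transition, emission):
    """Return (most probable state path, its total distance U)."""
    M, T = len(initial), len(obs)
    # Shortest distance from the stage-0 point to each point in stage 1.
    U = [[cost(initial[j]) + cost(emission[j][obs[0]]) for j in range(M)]]
    back = []                                   # back-pointers, one list per step
    for t in range(1, T):
        row, ptr = [], []
        for j in range(M):
            best_i = min(range(M), key=lambda i: U[-1][i] + cost(transition[i][j]))
            row.append(U[-1][best_i] + cost(transition[best_i][j]) + cost(emission[j][obs[t]]))
            ptr.append(best_i)
        U.append(row)
        back.append(ptr)
    # Pick the best end point, then trace the back-pointers to stage 1.
    last = min(range(M), key=lambda j: U[-1][j])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, U[-1][last]

# Hypothetical model: 2 hidden states, observations coded 0 or 1.
initial = [0.6, 0.4]
transition = [[0.7, 0.3],
              [0.4, 0.6]]
emission = [[0.9, 0.1],
            [0.2, 0.8]]
obs = [0, 1, 1]

path, distance = viterbi(obs, initial, transition, emission)
print(path, math.exp(-distance))   # path and its joint probability P[X = path, Y = obs]
```

Working with sums of logarithms rather than products of probabilities also avoids numerical underflow on long sequences.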

38. An alternative approach to prediction of states from the observations and the model: It can be shown that:

39. Backward Probabilities 1. 2. HMM generator (normal).xls
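One standard alternative to the Viterbi path predicts each state separately from its posterior probability, $P[X_t = i \mid Y = y] = \alpha_t(i)\, \beta_t(i) / P[Y = y]$, where $\alpha_t(i) = P[Y_1 = y_1, \dots, Y_t = y_t, X_t = i]$ and $\beta_t(i) = P[Y_{t+1} = y_{t+1}, \dots, Y_T = y_T \mid X_t = i]$. The sketch below assumes this is the approach the slides refer to; the notation and the example model are not from the slides.

```python
def forward(obs, initial, transition, emission):
    """alpha[t][i]: probability of obs[0..t] together with state i at step t."""
    M = len(initial)
    alpha = [[initial[i] * emission[i][obs[0]] for i in range(M)]]
    for t in range(1, len(obs)):
        prev = alpha[-1]
        alpha.append([sum(prev[i] * transition[i][j] for i in range(M)) * emission[j][obs[t]]
                      for j in range(M)])
    return alpha

def backward(obs, transition, emission):
    """beta[t][i]: probability of obs[t+1..] given state i at step t."""
    M, T = len(transition), len(obs)
    beta = [[1.0] * M for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(M):
            beta[t][i] = sum(transition[i][j] * emission[j][obs[t + 1]] * beta[t + 1][j]
                             for j in range(M))
    return beta

def posterior_states(obs, initial, transition, emission):
    """gamma[t][i] = P[state i at step t | all observations]."""
    alpha = forward(obs, initial, transition, emission)
    beta = backward(obs, transition, emission)
    likelihood = sum(alpha[-1])
    return [[a * b / likelihood for a, b in zip(alpha_t, beta_t)]
            for alpha_t, beta_t in zip(alpha, beta)]

# Hypothetical model: 2 hidden states, observations coded 0 or 1.
initial = [0.6, 0.4]
transition = [[0.7, 0.3],
              [0.4, 0.6]]
emission = [[0.9, 0.1],
            [0.2, 0.8]]
obs = [0, 1, 1]

gamma = posterior_states(obs, initial, transition, emission)
predicted = [max(range(len(g)), key=lambda i: g[i]) for g in gamma]
print(gamma)
print(predicted)
```

Picking the most probable state at each time separately need not give the same answer as the Viterbi path, and the resulting sequence can even contain a transition of probability zero.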

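40. Estimation of Parameters of a Hidden Markov Model. If both the sequence of observations $Y_1 = y_1, Y_2 = y_2, \dots, Y_T = y_T$ and the sequence of states $X_1 = i_1, X_2 = i_2, \dots, X_T = i_T$ are observed, then the Likelihood is given by: $L = P[X = i, Y = y] = \pi_{i_1}\, p(y_1 \mid \theta_{i_1}) \prod_{t=2}^{T} p_{i_{t-1} i_t}\, p(y_t \mid \theta_{i_t})$, where $\pi_i = P[X_1 = i]$.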

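41. The log-Likelihood is given by: $\ln L = \ln \pi_{i_1} + \sum_{t=2}^{T} \ln p_{i_{t-1} i_t} + \sum_{t=1}^{T} \ln p(y_t \mid \theta_{i_t})$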

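42. In this case the Maximum Likelihood estimates are: $\hat{p}_{ij} = n_{ij} / \sum_{k} n_{ik}$, where $n_{ij}$ is the number of observed transitions from state $i$ to state $j$, and $\hat{\theta}_i$ = the MLE of $\theta_i$ computed from the observations $y_t$ where $X_t = i$.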
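A sketch of these estimates when the states are observed, assuming normal observations as in Example 2: transition probabilities are estimated by transition frequencies, and each state's mean and variance are computed from the observations made in that state. The function name and the toy data are illustrative.

```python
def mle_known_states(states, obs, M):
    """MLEs of the transition matrix and per-state normal mean and variance."""
    # Count the observed transitions i -> j.
    counts = [[0] * M for _ in range(M)]
    for i, j in zip(states, states[1:]):
        counts[i][j] += 1
    transition = []
    for row in counts:
        total = sum(row)
        transition.append([c / total if total else 0.0 for c in row])

    means, variances = [], []
    for i in range(M):
        ys = [y for x, y in zip(states, obs) if x == i]
        if not ys:                       # state never visited: nothing to estimate
            means.append(float("nan"))
            variances.append(float("nan"))
            continue
        mean = sum(ys) / len(ys)
        means.append(mean)
        # The normal MLE of the variance divides by n, not n - 1.
        variances.append(sum((y - mean) ** 2 for y in ys) / len(ys))
    return transition, means, variances

# Toy data (hypothetical): states are known here.
states = [0, 0, 1, 1, 0]
obs = [1.0, 3.0, 10.0, 14.0, 2.0]
transition, means, variances = mle_known_states(states, obs, M=2)
print(transition)
print(means, variances)
```

With a single observed sequence the initial distribution cannot be estimated beyond putting all its mass on the observed first state, so it is omitted from the sketch.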

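43. MLE (states unknown). If only the sequence of observations $Y_1 = y_1, Y_2 = y_2, \dots, Y_T = y_T$ is observed, then the Likelihood is given by: $L = P[Y = y] = \sum_{i_1, i_2, \dots, i_T} \pi_{i_1}\, p(y_1 \mid \theta_{i_1}) \prod_{t=2}^{T} p_{i_{t-1} i_t}\, p(y_t \mid \theta_{i_t})$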

44. It is difficult to find the Maximum Likelihood Estimates directly from the Likelihood function. • The Techniques that are used are 1. The Segmental K-means Algorithm 2. The Baum-Welch (E-M) Algorithm

45. The Segmental K-means Algorithm. In this method the parameters are adjusted to maximize $P[X = \hat{i}, Y = y]$, where $\hat{i} = (\hat{i}_1, \hat{i}_2, \dots, \hat{i}_T)$ is the Viterbi path.
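A rough sketch of the idea: alternate between computing the Viterbi path under the current parameters and re-estimating the parameters as if that path were the observed state sequence, stopping when the path no longer changes. Several choices here are assumptions of the sketch rather than part of the slides: normal observations, a uniform initial distribution that is not re-estimated, add-one smoothing of the transition counts, and a floor on the variances.

```python
import math

def log_normal_pdf(y, mean, var):
    return -0.5 * (math.log(2 * math.pi * var) + (y - mean) ** 2 / var)

def viterbi_path(obs, log_init, log_trans, means, variances):
    """Most probable state path, maximizing the log joint probability."""
    M = len(means)
    score = [log_init[j] + log_normal_pdf(obs[0], means[j], variances[j]) for j in range(M)]
    back = []
    for y in obs[1:]:
        ptr = [max(range(M), key=lambda i: score[i] + log_trans[i][j]) for j in range(M)]
        score = [score[ptr[j]] + log_trans[ptr[j]][j] + log_normal_pdf(y, means[j], variances[j])
                 for j in range(M)]
        back.append(ptr)
    path = [max(range(M), key=lambda j: score[j])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

def segmental_k_means(obs, start_means, n_iter=20, min_var=1e-3):
    M = len(start_means)
    means, variances = list(start_means), [1.0] * M
    init = [1.0 / M] * M                         # kept uniform (assumption)
    trans = [[1.0 / M] * M for _ in range(M)]
    path = None
    for _ in range(n_iter):
        log_init = [math.log(p) for p in init]
        log_trans = [[math.log(p) for p in row] for row in trans]
        new_path = viterbi_path(obs, log_init, log_trans, means, variances)
        if new_path == path:                     # segmentation is stable: stop
            break
        path = new_path
        # Re-estimate as if the Viterbi path were the true state sequence.
        counts = [[1.0] * M for _ in range(M)]   # add-one smoothing (assumption)
        for i, j in zip(path, path[1:]):
            counts[i][j] += 1
        trans = [[c / sum(row) for c in row] for row in counts]
        for i in range(M):
            ys = [y for x, y in zip(path, obs) if x == i]
            if ys:
                means[i] = sum(ys) / len(ys)
                variances[i] = max(sum((y - means[i]) ** 2 for y in ys) / len(ys), min_var)
    return path, means, variances, trans

# Toy data (hypothetical): a low-mean regime interrupted by a high-mean one.
obs = [0.1, -0.2, 0.0, 5.1, 4.9, 5.2, 0.05, -0.1]
path, means, variances, trans = segmental_k_means(obs, start_means=[0.0, 5.0])
print(path)
print(means, variances)
```

Like ordinary K-means, the result depends on the starting values and the procedure can settle on a local optimum.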