Expectation-Maximization (EM) Algorithm
Presentation Transcript

  1. Expectation-Maximization (EM) Algorithm Original slides from Tatung University (Taiwan) Edited by: Muneem S.

  2. Contents • Introduction • Main Body • Mixture Model • EM-Algorithm on GMM • Appendix: Missing Data

  3. EM Algorithm Introduction

  4. Introduction • EM is typically used to compute maximum likelihood estimates given incomplete samples. • The EM algorithm estimates the parameters of a model iteratively. • Starting from some initial guess, each iteration consists of • an E step (Expectation step) • an M step (Maximization step)
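As a sketch, the iteration can be written as the following loop (Python). The callables e_step, m_step, and log_lik are hypothetical placeholders for the model-specific pieces, not names from the slides:

    import numpy as np

    def em(x, theta0, e_step, m_step, log_lik, tol=1e-6, max_iter=200):
        """Generic EM loop: alternate E and M steps until the observed-data
        log-likelihood stops improving (EM never decreases it)."""
        theta, prev = theta0, -np.inf
        for _ in range(max_iter):
            stats = e_step(x, theta)    # E step: expectations under the current guess
            theta = m_step(x, stats)    # M step: maximize the expected log-likelihood
            cur = log_lik(x, theta)
            if cur - prev < tol:        # monotone increase makes this a safe stop test
                break
            prev = cur
        return theta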

  5. Applications • Discovering the value of latent variables • Estimating the parameters of HMMs • Estimating parameters of finite mixtures • Unsupervised learning of clusters • Filling in missing data in samples • …

  6. EM Algorithm Main Body

  7. Maximum Likelihood
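The slide's formulas are standard; a reconstruction in the usual notation, assuming N i.i.d. samples x = (x₁, …, x_N):

    \mathcal{L}(\Theta \mid \mathbf{x}) = p(\mathbf{x} \mid \Theta) = \prod_{i=1}^{N} p(x_i \mid \Theta),
    \qquad
    \hat{\Theta} = \operatorname*{arg\,max}_{\Theta} \log \mathcal{L}(\Theta \mid \mathbf{x})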

  8. Latent Variables Incomplete data: the observed samples x alone. Complete data: the observed samples together with the latent variables y.

  9. Complete Data → Complete Data Likelihood

  10. Complete Data The complete-data likelihood is a function of the latent variable Y and the parameter Θ. Once Θ is given, it becomes a function of the random variable Y alone, so the result is itself expressed in terms of Y, and it is computable.

  11. Expectation Expectation and conditional expectation:
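The two definitions the slide names, in standard notation for the discrete case:

    E[g(Y)] = \sum_{y} g(y)\, p(y),
    \qquad
    E[g(Y) \mid X = x] = \sum_{y} g(y)\, p(y \mid x)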

  12. Expectation Step Let Θ^(i−1) be the parameter vector obtained at the (i−1)th step. Define Q(Θ, Θ^(i−1)) as the conditional expectation of the log-likelihood of the complete data:
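In symbols:

    Q(\Theta, \Theta^{(i-1)}) = E\left[ \log p(\mathbf{x}, \mathbf{y} \mid \Theta) \,\middle|\, \mathbf{x}, \Theta^{(i-1)} \right]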

  13. Maximization Step Let Θ^(i−1) be the parameter vector obtained at the (i−1)th step. Define Θ^(i) as the maximizer of Q(Θ, Θ^(i−1)):
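In symbols:

    \Theta^{(i)} = \operatorname*{arg\,max}_{\Theta} Q(\Theta, \Theta^{(i-1)})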

  14. EM Algorithm Mixture Model

  15. Mixture Models • If there is a reason to believe that a data set is comprised of several distinct populations, a mixture model can be used. • It has the following form, with mixing weights that sum to one:
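A standard reconstruction of the slide's formula, writing Θ = (α₁, …, α_M, θ₁, …, θ_M):

    p(x \mid \Theta) = \sum_{j=1}^{M} \alpha_j\, p_j(x \mid \theta_j),
    \qquad
    \sum_{j=1}^{M} \alpha_j = 1, \quad \alpha_j \ge 0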

  16.–17. Mixture Models Let yᵢ ∈ {1, …, M} denote the source that generates the data.

  18. Mixture Models

  19. Mixture Models

  20. Mixture Models Given x and Θ, the conditional density of y can be computed.
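By Bayes' rule, using the mixture form above:

    p(y = j \mid x, \Theta) = \frac{\alpha_j\, p_j(x \mid \theta_j)}{\sum_{k=1}^{M} \alpha_k\, p_k(x \mid \theta_k)}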

  21. Complete-Data Likelihood Function
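For i.i.d. pairs (xᵢ, yᵢ), the complete-data log-likelihood takes the standard form:

    \log \mathcal{L}(\Theta \mid \mathbf{x}, \mathbf{y}) = \sum_{i=1}^{N} \log\left( \alpha_{y_i}\, p_{y_i}(x_i \mid \theta_{y_i}) \right)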

  22. Expectation g: Guess

  23. Expectation g: Guess

  24. Expectation The indicator term is zero when yᵢ ≠ l.

  25. Expectation

  26. Expectation

  27. Expectation
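Carrying the expectation through (slides 22–27) yields the standard simplified form:

    Q(\Theta, \Theta^{g}) = \sum_{l=1}^{M} \sum_{i=1}^{N} \log\left( \alpha_l\, p_l(x_i \mid \theta_l) \right) p(l \mid x_i, \Theta^{g})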

  28. Maximization Given the initial guess Θ^g, we want to find the Θ that maximizes the above expectation; in fact, this is done iteratively.

  29. EM Algorithm EM-Algorithm on GMM

  30. The GMM (Gaussian Mixture Model) Gaussian model of a d-dimensional source, say source j, and the GMM with M sources:
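In standard notation, with θⱼ = (μⱼ, Σⱼ):

    p_j(x \mid \mu_j, \Sigma_j) = \frac{1}{(2\pi)^{d/2} |\Sigma_j|^{1/2}} \exp\left( -\tfrac{1}{2} (x - \mu_j)^{\top} \Sigma_j^{-1} (x - \mu_j) \right),
    \qquad
    p(x \mid \Theta) = \sum_{j=1}^{M} \alpha_j\, p_j(x \mid \mu_j, \Sigma_j)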

  31. Goal To maximize the expected complete-data log-likelihood of the mixture model, subject to the constraint that the mixing weights αₗ sum to one.

  32. Goal In the expression to maximize, one sum is correlated with the αₗ only and the other with the θₗ only, so the two parts can be maximized separately, subject to Σₗ αₗ = 1.
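The logarithm factors apart, so Q splits into the two sums just described:

    Q(\Theta, \Theta^{g}) = \sum_{l=1}^{M} \sum_{i=1}^{N} \log(\alpha_l)\, p(l \mid x_i, \Theta^{g})
    + \sum_{l=1}^{M} \sum_{i=1}^{N} \log\left( p_l(x_i \mid \theta_l) \right) p(l \mid x_i, \Theta^{g})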

  33. Finding αₗ Due to the constraint Σₗ αₗ = 1 on the αₗ's, we introduce a Lagrange multiplier λ and solve the following equation.
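A standard reconstruction of the equation being solved:

    \frac{\partial}{\partial \alpha_l} \left[ \sum_{l=1}^{M} \sum_{i=1}^{N} \log(\alpha_l)\, p(l \mid x_i, \Theta^{g}) + \lambda \left( \sum_{l=1}^{M} \alpha_l - 1 \right) \right] = 0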

  34. Finding αₗ Summing the resulting equations over l gives λ = −N, and therefore:
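This is the standard mixing-weight update, which is where the 1/N on the slide comes from:

    \alpha_l = \frac{1}{N} \sum_{i=1}^{N} p(l \mid x_i, \Theta^{g})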

  35. Finding αₗ

  36. Finding μₗ and Σₗ Consider the GMM: only the second sum is related to θₗ = (μₗ, Σₗ), so we only need to maximize this term; the rest is unrelated.

  37. Finding μₗ and Σₗ Therefore, dropping the unrelated part, we want to maximize the remaining term. How? Some knowledge of matrix algebra is needed.

  38. Finding μₗ and Σₗ Therefore, we want to maximize:
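In standard form (constants dropped):

    \sum_{l=1}^{M} \sum_{i=1}^{N} \left( -\tfrac{1}{2} \log |\Sigma_l| - \tfrac{1}{2} (x_i - \mu_l)^{\top} \Sigma_l^{-1} (x_i - \mu_l) \right) p(l \mid x_i, \Theta^{g})

Setting the derivatives with respect to μₗ and Σₗ to zero yields the standard updates:

    \mu_l = \frac{\sum_{i=1}^{N} x_i\, p(l \mid x_i, \Theta^{g})}{\sum_{i=1}^{N} p(l \mid x_i, \Theta^{g})},
    \qquad
    \Sigma_l = \frac{\sum_{i=1}^{N} (x_i - \mu_l)(x_i - \mu_l)^{\top}\, p(l \mid x_i, \Theta^{g})}{\sum_{i=1}^{N} p(l \mid x_i, \Theta^{g})}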

  39. Summary EM algorithm for GMM: given an initial guess Θ^g, find the new Θ using the updates above; while not converged, set Θ^g ← Θ and repeat.
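To make the summary concrete, here is a minimal NumPy sketch of the whole procedure. The function name em_gmm, the initialization choices, and the small ridge added to Σₗ are my own assumptions, not from the slides:

    import numpy as np

    def em_gmm(X, M, n_iter=200, tol=1e-8, seed=0):
        """EM for a Gaussian mixture with M components; X has shape (N, d)."""
        rng = np.random.default_rng(seed)
        N, d = X.shape
        alpha = np.full(M, 1.0 / M)                            # mixing weights alpha_l
        mu = X[rng.choice(N, size=M, replace=False)].copy()    # means mu_l
        Sigma = np.tile(np.cov(X.T).reshape(d, d), (M, 1, 1))  # covariances Sigma_l
        prev_ll = -np.inf
        for _ in range(n_iter):
            # E step: log alpha_l + log N(x_i | mu_l, Sigma_l) for every i, l.
            log_p = np.empty((N, M))
            for l in range(M):
                diff = X - mu[l]
                _, logdet = np.linalg.slogdet(Sigma[l])
                quad = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(Sigma[l]), diff)
                log_p[:, l] = (np.log(alpha[l])
                               - 0.5 * (d * np.log(2 * np.pi) + logdet + quad))
            m = log_p.max(axis=1, keepdims=True)               # log-sum-exp trick
            ll = float(np.sum(m[:, 0] + np.log(np.sum(np.exp(log_p - m), axis=1))))
            r = np.exp(log_p - m)
            r /= r.sum(axis=1, keepdims=True)    # responsibilities p(l | x_i, theta^g)
            # M step: the closed-form updates derived on the previous slides.
            Nl = r.sum(axis=0)                   # effective counts per component
            alpha = Nl / N
            mu = (r.T @ X) / Nl[:, None]
            for l in range(M):
                diff = X - mu[l]
                Sigma[l] = (r[:, l, None] * diff).T @ diff / Nl[l] + 1e-6 * np.eye(d)
            if ll - prev_ll < tol:               # stop once the log-likelihood stalls
                break
            prev_ll = ll
        return alpha, mu, Sigma

For example, alpha, mu, Sigma = em_gmm(X, 3) fits a three-component mixture to data X of shape (N, d).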

  40. Susanna Ricco

  41. APPENDIX

  42. EM Algorithm Example: Missing Data

  43. Univariate Normal Sample Sampling x₁, …, x_N independently from a normal distribution N(μ, σ²).

  44. Maximum Likelihood Given the sample x, the likelihood is a function of μ and σ², and we want to maximize it.
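The likelihood being maximized, in standard form:

    \mathcal{L}(\mu, \sigma^2 \mid \mathbf{x}) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x_i - \mu)^2}{2\sigma^2} \right)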

  45. Log-Likelihood Function Maximize the log-likelihood instead, by setting its derivatives with respect to μ and σ² to zero.
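Its logarithm:

    \ell(\mu, \sigma^2) = -\frac{N}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{N} (x_i - \mu)^2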

  46. Max. the Log-Likelihood Function

  47. Max. the Log-Likelihood Function
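The solutions worked out on these two slides are the familiar maximum-likelihood estimates:

    \hat{\mu} = \frac{1}{N} \sum_{i=1}^{N} x_i,
    \qquad
    \hat{\sigma}^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \hat{\mu})^2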

  48. Missing Data Sampling as before, but some of the data values are missing.

  49.–50. E-Step Let θ^(t) be the estimated parameters at the start of the t-th iteration.
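For a missing xᵢ, the E-step replaces the sufficient statistics xᵢ and xᵢ² by their conditional expectations under the current parameters; a standard reconstruction:

    E\left[ x_i \mid \theta^{(t)} \right] = \mu^{(t)},
    \qquad
    E\left[ x_i^2 \mid \theta^{(t)} \right] = \left( \mu^{(t)} \right)^2 + \left( \sigma^2 \right)^{(t)}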