Expectation-Maximization (EM) Algorithm


1. Expectation-Maximization (EM) Algorithm. Original slides from Tatung University (Taiwan); edited by Muneem S.

2. Contents • Introduction • Main Body • Mixture Model • EM-Algorithm on GMM • Appendix: Missing Data

3. EM Algorithm Introduction

4. Introduction • EM is typically used to compute maximum likelihood estimates from incomplete samples. • The EM algorithm estimates the parameters of a model iteratively. • Starting from some initial guess, each iteration consists of • an E step (Expectation step) • an M step (Maximization step)
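The slides' equations were images and are lost in this transcript. As a concrete illustration of the E-step/M-step alternation described above, here is the classic "two coins" toy problem; the data, function names, and initial guess are mine, not from the slides:

```python
# Classic "two coins" toy problem (my illustration, not from the slides):
# each trial tosses one of two coins, with unknown biases p1 and p2,
# ten times; which coin was tossed is the hidden (latent) variable.
trials = [(5, 10), (9, 10), (8, 10), (4, 10), (7, 10)]  # (heads, tosses)

def lik(h, n, p):
    """Binomial likelihood of h heads in n tosses (constant factor dropped)."""
    return p ** h * (1 - p) ** (n - h)

def em(trials, p1, p2, iters=50):
    for _ in range(iters):
        # E step: posterior probability that each trial used coin 1
        w = [lik(h, n, p1) / (lik(h, n, p1) + lik(h, n, p2))
             for h, n in trials]
        # M step: re-estimate the biases from the expected head counts
        p1 = (sum(wi * h for wi, (h, n) in zip(w, trials))
              / sum(wi * n for wi, (h, n) in zip(w, trials)))
        p2 = (sum((1 - wi) * h for wi, (h, n) in zip(w, trials))
              / sum((1 - wi) * n for wi, (h, n) in zip(w, trials)))
    return p1, p2

p1, p2 = em(trials, 0.6, 0.5)   # starts from the initial guess (0.6, 0.5)
```

Each pass computes soft assignments of trials to coins (E step) and then refits the biases as if those soft assignments were observed counts (M step), exactly the loop the slide describes.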

5. Applications • Discovering the value of latent variables • Estimating the parameters of HMMs • Estimating parameters of finite mixtures • Unsupervised learning of clusters • Filling in missing data in samples • …

6. EM Algorithm Main Body

7. Maximum Likelihood

8. Latent Variables Incomplete Data Complete Data

9. Complete Data → Complete Data Likelihood

10. Complete Data → Complete Data Likelihood. The complete-data likelihood is a function of the latent variable Y and the parameter Θ. If we are given Θ, it becomes a function of the random variable Y, and the result is in terms of Y; for a given Y, it is a computable function of the parameter Θ.

11. Expectation. Expectation: E[Y] = Σy y P(y). Conditional Expectation: E[Y | X = x] = Σy y P(y | X = x).

12. Expectation Step. Let Θ(i−1) be the parameter vector obtained at the (i−1)-th step. Define Q(Θ; Θ(i−1)) = E[ log p(x, Y | Θ) | x, Θ(i−1) ] (the conditional expectation of the log-likelihood of the complete data).

13. Maximization Step. Let Θ(i−1) be the parameter vector obtained at the (i−1)-th step. Define Θ(i) = argmaxΘ Q(Θ; Θ(i−1)).

14. EM Algorithm Mixture Model

15. Mixture Models • If there is reason to believe that a data set comprises several distinct populations, a mixture model can be used. • It has the following form: p(x | Θ) = Σj=1..M αj pj(x | θj), with αj ≥ 0 and Σj αj = 1.

16. Mixture Models. Let yi ∈ {1, …, M} denote the source that generates the data.

17. Mixture Models. Let yi ∈ {1, …, M} denote the source that generates the data.

18. Mixture Models

19. Mixture Models

20. Mixture Models. Given x and Θ, the conditional density of y can be computed: P(y = l | x, Θ) = αl pl(x | θl) / Σj αj pj(x | θj).
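This conditional density (the "responsibility" of each source for each point) can be evaluated directly by Bayes' rule. A minimal numpy sketch for a one-dimensional, two-component mixture; all the parameter values here are illustrative assumptions, not from the slides:

```python
import numpy as np

# Hypothetical current guess Theta for a 1-D mixture of two Gaussians
# (all numbers here are illustrative, not from the slides).
alpha = np.array([0.5, 0.5])     # mixing weights
mu = np.array([0.0, 4.0])        # component means
var = np.array([1.0, 1.0])       # component variances

def normal_pdf(x, m, v):
    """1-D Gaussian density, broadcast over components."""
    return np.exp(-(x - m) ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v)

x = np.array([-0.2, 0.1, 3.9, 4.3])
# Bayes' rule: P(y=j | x, Theta) = alpha_j p_j(x) / sum_k alpha_k p_k(x)
weighted = alpha * normal_pdf(x[:, None], mu, var)      # shape (N, M)
resp = weighted / weighted.sum(axis=1, keepdims=True)
```

Each row of `resp` is a probability distribution over the M sources for one data point, so every row sums to 1.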

21. Complete-Data Likelihood Function

22. Expectation. Θg: the current guess of the parameters.

23. Expectation. Θg: the current guess of the parameters.

24. Expectation. The term is zero when yi ≠ l.

25. Expectation

26. Expectation

27. Maximization. Given the initial guess Θg, we want to find Θ that maximizes the above expectation; in fact, this is done iteratively.

28. EM Algorithm EM-Algorithm on GMM

29. The GMM (Gaussian Mixture Model). Gaussian model of a d-dimensional source, say source j: pj(x | μj, Σj) = (2π)^(−d/2) |Σj|^(−1/2) exp(−½ (x − μj)ᵀ Σj⁻¹ (x − μj)). GMM with M sources: p(x | Θ) = Σj αj pj(x | μj, Σj).
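The d-dimensional Gaussian density above can be written out directly. A small numpy sketch (the function name is mine); the sanity check uses the fact that a standard 1-D Gaussian has density 1/√(2π) at its mean:

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of a d-dimensional Gaussian N(mean, cov) at point x."""
    d = len(mean)
    diff = x - mean
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff) / norm)

# At the mean of a standard 1-D Gaussian the density is 1/sqrt(2*pi)
val = gaussian_pdf(np.zeros(1), np.zeros(1), np.eye(1))
```

For numerical work one would normally use a Cholesky factorization instead of `inv`/`det`, but this form mirrors the slide's formula term by term.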

30. Goal. Mixture Model: to maximize the log-likelihood, subject to Σl αl = 1.

31. Goal. Mixture Model: the expectation splits into one part correlated with the αl's only and one part correlated with the θl's only; each is maximized separately, subject to Σl αl = 1.

32. Finding αl. Due to the constraint on the αl's, we introduce a Lagrange multiplier λ and solve the resulting equation.

33. Finding αl. Solving gives λ = −N, hence αl = (1/N) Σi P(l | xi, Θg).

34. Finding μl.

35. Finding Σl. Consider the GMM: the terms unrelated to Σl can be dropped; we only need to maximize the remaining term.

36. Finding Σl. Therefore, we want to maximize the remaining term. How? Knowledge of matrix algebra is needed.

37. Finding Σl. Therefore, we want to maximize this term.

38. Summary. EM algorithm for GMM: given an initial guess Θg, find Θnew as above; if not converged, set Θg ← Θnew and repeat.
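The loop summarized in this slide can be sketched in numpy for a one-dimensional, two-component GMM. The synthetic data, variable names, and stopping tolerance below are my assumptions, not from the slides; the E and M steps themselves follow the standard closed-form GMM updates:

```python
import numpy as np

# Synthetic 1-D data from two Gaussians; this toy setup (seed, sizes,
# component parameters) is illustrative, not from the slides.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(3.0, 1.0, 300)])

# Initial guess Theta^g
alpha = np.array([0.5, 0.5])
mu = np.array([-1.0, 1.0])
var = np.array([1.0, 1.0])

def pdf(x, m, v):
    """Densities of each 1-D Gaussian component at each point, shape (N, M)."""
    return np.exp(-(x[:, None] - m) ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v)

prev_ll = -np.inf
for _ in range(200):
    # E step: responsibilities P(l | x_i, Theta^g)
    weighted = alpha * pdf(data, mu, var)
    ll = np.log(weighted.sum(axis=1)).sum()        # current log-likelihood
    r = weighted / weighted.sum(axis=1, keepdims=True)
    # M step: closed-form updates (mixing weights, means, variances)
    nl = r.sum(axis=0)
    alpha = nl / len(data)
    mu = (r * data[:, None]).sum(axis=0) / nl
    var = (r * (data[:, None] - mu) ** 2).sum(axis=0) / nl
    if ll - prev_ll < 1e-9:     # "not converged" -> repeat, else stop
        break
    prev_ll = ll
```

With well-separated components the estimated means land near the true values (−2 and 3), and the mixing weights remain a valid distribution throughout.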

39. Susanna Ricco

40. APPENDIX

41. EM Algorithm Example: Missing Data

42. Univariate Normal Sample. Sampling: x1, …, xN drawn from N(μ, σ²).

43. Maximum Likelihood. Sampling: given x, the likelihood is a function of μ and σ², and we want to maximize it.

44. Log-Likelihood Function. Maximize this instead, by setting ∂L/∂μ = 0 and ∂L/∂σ² = 0.

45. Max. the Log-Likelihood Function. Setting the derivative with respect to μ to zero gives μ̂ = (1/N) Σi xi.

46. Max. the Log-Likelihood Function. Setting the derivative with respect to σ² to zero gives σ̂² = (1/N) Σi (xi − μ̂)².
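These closed-form estimators are easy to check numerically. A minimal sketch on a tiny sample (the numbers are mine, not the slides'):

```python
import numpy as np

# Tiny illustrative sample (my numbers, not the slides')
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Closed-form ML estimates obtained by setting the derivatives of the
# log-likelihood to zero:
mu_hat = x.sum() / len(x)                       # sample mean
var_hat = ((x - mu_hat) ** 2).sum() / len(x)    # biased sample variance
```

Note the ML variance estimator divides by N, not N − 1, so it matches numpy's default (biased) `var`, not the unbiased sample variance.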

47. Missing Data. Sampling: some of the xi are missing.

48. E-Step. Let Θ(t) be the parameters estimated at the start of the t-th iteration.

49. E-Step. Let Θ(t) be the parameters estimated at the start of the t-th iteration.
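The missing-data E-step for a univariate normal can be sketched concretely: under the current Θ(t), each missing value contributes its expected sufficient statistics, E[x] = μ and E[x²] = μ² + σ², and the M-step refits μ and σ² from the completed statistics. The data and initial guess below are my illustration, not from the slides:

```python
import numpy as np

# Illustrative sample with two missing entries (np.nan); the data and
# the initial guess are mine, not from the slides.
x = np.array([1.0, 2.0, 3.0, 4.0, np.nan, np.nan])
obs = x[~np.isnan(x)]
n, n_miss = len(x), int(np.isnan(x).sum())

mu, var = 0.0, 1.0                  # initial guess Theta^(0)
for _ in range(100):
    # E step: expected sufficient statistics of the missing values under
    # the current parameters: E[x] = mu and E[x^2] = mu^2 + var
    s1 = obs.sum() + n_miss * mu
    s2 = (obs ** 2).sum() + n_miss * (mu ** 2 + var)
    # M step: re-estimate mu and sigma^2 from the completed statistics
    mu, var = s1 / n, s2 / n - (s1 / n) ** 2
```

The iteration converges to the ML estimates based on the observed values alone (here μ → 2.5, σ² → 1.25), which is exactly what EM should recover when data are missing completely at random.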