
# EM Algorithm and Mixture of Gaussians




Collard Fabien - 20046056

김진식 (Kim Jinsik) - 20043152

주찬혜 (Joo Chanhye) - 20043595

### Summary

• Hidden Factors

• EM Algorithm

• Principles

• Formalization

• Mixture of Gaussians

• Generalities

• Processing

• Formalization

• Other Issues

• Bayesian Network with hidden variables

• Hidden Markov models

• Bayes net structures with hidden variables


### The Problem : Hidden Factors

• Unobservable / Latent / Hidden factors

• Making them explicit variables

• Preserves the simplicity of the model

[Figure : two Bayesian networks for the heart-disease example (each variable has 3 values). Without a hidden variable, Symptom 1, Symptom 2 and Symptom 3 depend directly on Smoking, Diet and Exercise : 2 priors per root variable, then 54, 162 and 486 for the three symptom CPTs, for a total of 708 priors. Adding the hidden node Heart Disease between the roots and the symptoms needs 2 priors per root, 54 for Heart Disease and 6 per symptom : only 78 priors.]

EM Algorithm

• Expectation

• Maximization


### Principles : Generalities

• Given :

• Cause (or Factor / Component)

• Evidence

• Compute :

• Probabilities in the conditional probability tables

### Principles : The two steps

Parameters :

• P(effects | causes)

• P(causes)

E Step : For each evidence (E), use the parameters to compute the probability distribution over the causes : the weighted evidence P(causes | evidence)

M Step : Update the estimates of the parameters, based on the weighted evidence


### Principles : the E-Step

• Perception Step

• For each evidence and cause

• Compute probabilities

• Find probable relationships


### Principles : the M-Step

• Learning Step

• Recompute the probability

• Cause event / Evidence event

• Sum over all evidence events

• Maximize the log-likelihood

• Modify the model parameters

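The two steps above can be sketched on a classic toy problem (not from the slides, and simplified to equal class priors): two coins with unknown head probabilities, where the hidden variable is which coin produced each session of flips. The data and initial guesses below are illustrative.

```python
import math

# Toy EM sketch (illustrative, not from the slides): two coins with
# unknown head probabilities; each of 5 sessions of 10 flips used one
# coin, but we do not observe which one (the hidden variable).
heads = [5, 9, 8, 4, 7]   # heads observed in each 10-flip session
n = 10

def binom_lik(h, theta):
    # Likelihood of h heads in n flips for a coin with bias theta.
    return math.comb(n, h) * theta**h * (1 - theta)**(n - h)

theta_a, theta_b = 0.6, 0.5  # initial parameter guesses
for _ in range(50):
    # E-step: posterior probability that each session came from coin A
    # (weighted evidence P(cause | evidence), with equal priors assumed)
    w = [binom_lik(h, theta_a) /
         (binom_lik(h, theta_a) + binom_lik(h, theta_b)) for h in heads]
    # M-step: re-estimate each coin's bias from the weighted evidence
    theta_a = sum(wi * h for wi, h in zip(w, heads)) / sum(wi * n for wi in w)
    theta_b = sum((1 - wi) * h for wi, h in zip(w, heads)) / sum((1 - wi) * n for wi in w)

print(theta_a, theta_b)
```

Each iteration only reuses the current parameters (E step) and the weighted counts (M step), exactly the alternation described above.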

### Formulae : Notations

• Terms

• θ : the underlying probability distribution

• x : observed data

• z : unobserved data

• h : current hypothesis of θ

• h’ : revised hypothesis

• q : a hidden variable distribution

• Task : estimate θ from the observed data X

• E-step : set q(z) = p(z | x, h)

• M-step : set h' = argmax_h A(q, h)


### Formulae : the Log Likelihood

• L(h) measures the fit of the parameters h to the data x, with the hidden variables z : L(h) = log p(x | h) = log Σ_z p(x, z | h)

• Jensen's inequality, for any distribution q(z) over the hidden states : log Σ_z q(z) [ p(x, z | h) / q(z) ] ≥ Σ_z q(z) log [ p(x, z | h) / q(z) ]

• Defines the auxiliary function A(q, h) = Σ_z q(z) log [ p(x, z | h) / q(z) ]

• Lower bound on the log likelihood

• What we want to optimize

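The Jensen lower bound can be checked numerically; a small sketch with made-up joint probabilities for a single observation x and a binary hidden state z:

```python
import math

# Numeric check (illustrative numbers) of the Jensen bound:
# A(q, h) = sum_z q(z) log(p(x,z|h)/q(z)) <= log sum_z p(x,z|h) = L(h),
# with equality when q(z) is the posterior p(z | x, h).
p_xz = [0.12, 0.28]          # joint p(x, z) for z in {0, 1}, x fixed
L = math.log(sum(p_xz))      # log likelihood of the observed x

def A(q):
    # Auxiliary function for a given hidden-state distribution q(z)
    return sum(qz * math.log(pz / qz) for qz, pz in zip(q, p_xz))

posterior = [p / sum(p_xz) for p in p_xz]
for q in ([0.5, 0.5], [0.9, 0.1], posterior):
    assert A(q) <= L + 1e-12   # A(q, h) never exceeds L(h)
print(L, A(posterior))
```

At the posterior the bound is tight, which is exactly why the E-step chooses q(z) = p(z | x, h).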

### Formulae : the E-step

• Lower bound on the log likelihood : A(q, h) = E_q[ log p(x, z | h) ] + H(q)

• H(q) : entropy of q(z)

• Optimize A(q, h) with respect to q

• By distributing the data over the hidden variables : q(z) = p(z | x, h)


### Formulae : the M-step

• Maximize A(q, h) with respect to h

• By choosing the optimal parameters : h' = argmax_h A(q, h)

• Equivalent to optimizing the likelihood


### Formulae : Convergence (1/2)

• EM increases the log likelihood of the data at every iteration

• Kullback-Leibler (KL) divergence : L(h) = A(q, h) + KL( q(z) || p(z | x, h) )

• Non-negative

• Equals 0 iff q(z) = p(z | x, h)


### Formulae : Convergence (2/2)

• Likelihood increases at each iteration

• Usually, EM converges to a local optimum of L


### Problem of likelihood

• The likelihood can be a high-dimensional integral

• Latent variables → additional dimensions

• The likelihood term can be complicated


### The Issue : Mixture of Gaussians

• Unsupervised clustering

• Set of data points (Evidences)

• Data generated from mixture distribution

• Continuous data : Mixture of Gaussians

• Not easy to handle :

• The number of parameters grows with the square of the dimension


### Gaussian Mixture model (2/2)

• Distribution

• Likelihood of a Gaussian distribution : N(x | μ, σ²) = (1 / √(2πσ²)) exp( −(x − μ)² / (2σ²) )

• Likelihood given a GMM : p(x) = Σi wi N(x | μi, σi²)

• N : number of Gaussians

• wi : the weight of Gaussian i

• All weights positive

• Total weight = 1

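The mixture density above translates directly into code; a minimal sketch (the component weights and parameters below are made up for illustration):

```python
import math

# Sketch of the GMM density: p(x) = sum_i w_i * N(x | mu_i, sigma_i^2),
# with all weights positive and summing to 1 (illustrative parameters).
def gaussian_pdf(x, mu, sigma2):
    # Univariate Gaussian density N(x | mu, sigma^2)
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def gmm_pdf(x, weights, mus, sigma2s):
    assert abs(sum(weights) - 1.0) < 1e-9 and all(w > 0 for w in weights)
    return sum(w * gaussian_pdf(x, m, s2)
               for w, m, s2 in zip(weights, mus, sigma2s))

# Example with two components
print(gmm_pdf(0.0, [0.4, 0.6], [0.0, 3.0], [1.0, 2.0]))
```

Because each component integrates to 1 and the weights sum to 1, the mixture is itself a valid density.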

### EM for Gaussian Mixture Model

• What for ?

• Find parameters:

• Weights: wi = P(C=i)

• Means: μi

• Covariances: Σi

• How ?

• Guess the prior distribution

• Guess components (Classes -or Causes)

• Guess the distribution function


### Processing : EM Initialization

• Initialization :

• Assign random values to the parameters


### Processing : the E-Step (1/2)

• Expectation :

• Pretend the parameters are known

• Assign each data point to a component


### Processing : the E-Step (2/2)

• Competition of Hypotheses

• Compute the expected values pij of the hidden indicator variables

• Each component gives a membership weight to each data point

• Normalization

• Weight = relative likelihood of class membership


### Processing : the M-Step (1/2)

• Maximization :

• Fit each component's parameters to its weighted set of points


### Processing : the M-Step (2/2)

• For each Hypothesis

• Find the new value of parameters to maximize the log likelihood

• Based on

• Weight of points in the class

• Location of the points

• Hypotheses are pulled toward data


### Applied formulae : the E-Step

• Find the generating Gaussian for every data point

• Use Bayes' rule : pij = P(C=i | xj) = wi N(xj | μi, σi²) / Σk wk N(xj | μk, σk²)

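The Bayes-rule E-step can be sketched as follows (the parameters below are illustrative): the responsibility pij is just each component's weighted likelihood, normalized over all components.

```python
import math

# E-step sketch (illustrative parameters): responsibility of Gaussian i
# for point x, via Bayes' rule:
# p_i = w_i N(x | mu_i, s2_i) / sum_k w_k N(x | mu_k, s2_k)
def gaussian_pdf(x, mu, sigma2):
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def responsibilities(x, weights, mus, sigma2s):
    liks = [w * gaussian_pdf(x, m, s2) for w, m, s2 in zip(weights, mus, sigma2s)]
    total = sum(liks)                 # normalization term from Bayes' rule
    return [l / total for l in liks]  # membership weights, summing to 1

r = responsibilities(1.0, [0.5, 0.5], [0.0, 4.0], [1.0, 1.0])
print(r)
```

Here the point at x = 1 is far closer to the component at μ = 0, so its first membership weight dominates.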

### Applied formulae : the M-Step

• Maximize A

• For each parameter of h, search for the zero of the derivative : ∂A/∂parameter = 0

• Results :

• μi* = Σj pij xj / Σj pij

• σi²* = Σj pij (xj − μi*)² / Σj pij

• wi* = Σj pij / M , with M the number of data points
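The closed-form re-estimation formulas can be sketched directly from the responsibilities pij (the data and assignments below are hypothetical; hard 0/1 responsibilities are used as an easy-to-check limiting case):

```python
# M-step sketch: standard 1-D GMM re-estimation from responsibilities.
# resp[i][j] = responsibility p_ij of component i for data point j.
def m_step(xs, resp):
    params = []
    for r_i in resp:
        n_i = sum(r_i)                                   # effective count
        mu = sum(r * x for r, x in zip(r_i, xs)) / n_i   # weighted mean
        s2 = sum(r * (x - mu) ** 2 for r, x in zip(r_i, xs)) / n_i
        w = n_i / len(xs)                                # mixture weight
        params.append((w, mu, s2))
    return params

xs = [0.0, 0.2, 3.8, 4.0]
resp = [[1.0, 1.0, 0.0, 0.0],   # hard assignments as a limiting case
        [0.0, 0.0, 1.0, 1.0]]
print(m_step(xs, resp))
```

With soft responsibilities from the E-step, the same formulas pull each hypothesis toward the points it claims, as the slides describe.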


### Potential problems

• A Gaussian component shrinks

• Variance goes to 0

• Likelihood becomes infinite

• Gaussian Components merge

• Same values

• Share the data points

• A Solution : reasonable prior values


Other Issues


### Hidden Markov models

• Forward-Backward Algorithm

• Smoothing rather than filtering


### Bayes net with hidden variables

• Pretend that data is complete

• Or invent a new hidden variable

• No label or meaning


### Conclusion

• Widely applicable

• Diagnosis

• Classification

• Distribution Discovery

• Does not work well for complex models

• High dimension

• → Structural EM
