
Incomplete Graphical Models


### Incomplete Graphical Models


Nan Hu

Outline

- Motivation
- K-means clustering
- Coordinate descent algorithm

- Density estimation
- EM on unconditional mixture

- Regression and classification
- EM on conditional mixture

- A general formulation of EM Algorithm

K-means clustering

Problem: given a set of observations $\{x_1, \ldots, x_N\}$, how do we group them into $K$ clusters, assuming the value of $K$ is given?

The algorithm alternates between two phases:

- First phase: assign each observation to its nearest cluster mean.
- Second phase: recompute each cluster mean from the observations assigned to it.

K-means clustering

- Coordinate descent algorithm
- The algorithm minimizes the distortion measure

  $$J = \sum_{n=1}^{N} \sum_{k=1}^{K} r_{nk}\, \|x_n - \mu_k\|^2$$

  by alternately setting the partial derivatives with respect to the assignments $r_{nk}$ and the means $\mu_k$ to zero.
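The two phases can be sketched directly in NumPy. This is a minimal illustration, not the presentation's own code; the function name and arguments are hypothetical.

```python
import numpy as np

def kmeans(X, K, n_iters=100, seed=0):
    """Two-phase coordinate descent on the distortion J.

    X is assumed to be an (N, D) array of observations.
    """
    rng = np.random.default_rng(seed)
    # Initialize the means with K distinct data points.
    mu = X[rng.choice(len(X), size=K, replace=False)].astype(float)
    for _ in range(n_iters):
        # First phase: assign each point to its nearest mean (optimize r_nk).
        d = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)  # (N, K) squared distances
        r = d.argmin(axis=1)
        # Second phase: recompute each mean from its assigned points (optimize mu_k).
        new_mu = np.array([X[r == k].mean(axis=0) if np.any(r == k) else mu[k]
                           for k in range(K)])
        if np.allclose(new_mu, mu):
            break  # converged: assignments and means are mutually consistent
        mu = new_mu
    return mu, r
```

Each phase can only decrease $J$, so the loop converges to a local minimum of the distortion.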

Unconditional Mixture

Problem: if the sample data exhibit a multimodal density, how do we estimate the true density?

[Figure: fitting a single density to a bimodal sample]

Although the algorithm converges, the result bears little relationship to the truth.

Unconditional Mixture

- A "divide-and-conquer" way to solve this problem
- Introduce a latent variable $Z$: a multinomial node taking on one of $K$ values

[Graphical model: latent node $Z$ with an arrow to observed node $X$]

Assign a density model to each subpopulation; the overall density is

$$p(x \mid \theta) = \sum_{k=1}^{K} P(z^k = 1 \mid \theta)\, p(x \mid z^k = 1, \theta)$$

Unconditional Mixture

- Gaussian mixture models
- In this model, the mixture components are Gaussian distributions with parameters $\mu_k$, $\Sigma_k$
- Probability model for a Gaussian mixture:

  $$p(x \mid \theta) = \sum_{k=1}^{K} \pi_k\, \mathcal{N}(x \mid \mu_k, \Sigma_k), \qquad \sum_{k=1}^{K} \pi_k = 1$$

Unconditional Mixture

- Posterior probability of the latent variable $Z$ (the responsibility):

  $$\tau_{nk} = P(z_n^k = 1 \mid x_n, \theta) = \frac{\pi_k\, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j\, \mathcal{N}(x_n \mid \mu_j, \Sigma_j)}$$

- Log likelihood:

  $$\ell(\theta \mid x) = \sum_{n=1}^{N} \log \sum_{k=1}^{K} \pi_k\, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)$$
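The responsibilities and the log likelihood can be computed together in a single E step. A minimal sketch for the univariate case (function names are illustrative):

```python
import numpy as np

def gauss_pdf(x, mu, var):
    """Univariate normal density N(x | mu, var)."""
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

def e_step(x, pi, mu, var):
    """Responsibilities tau_nk and the incomplete log likelihood
    for a 1-D Gaussian mixture; x has shape (N,), pi/mu/var shape (K,)."""
    # Weighted component densities pi_k * N(x_n | mu_k, var_k): shape (N, K).
    w = pi[None, :] * gauss_pdf(x[:, None], mu[None, :], var[None, :])
    tau = w / w.sum(axis=1, keepdims=True)   # posterior P(z^k = 1 | x_n)
    loglik = np.log(w.sum(axis=1)).sum()     # sum_n log sum_k pi_k N(...)
    return tau, loglik
```

Note that each row of `tau` sums to one, matching the normalization in the posterior above.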

Unconditional Mixture

- Partial derivative of $\ell$ with respect to $\pi_k$, using a Lagrange multiplier for the constraint $\sum_k \pi_k = 1$
- Solving it, we have

  $$\pi_k = \frac{1}{N} \sum_{n=1}^{N} \tau_{nk}$$

Unconditional Mixture

- Partial derivative of $\ell$ with respect to $\mu_k$
- Setting it to zero, we have

  $$\mu_k = \frac{\sum_{n=1}^{N} \tau_{nk}\, x_n}{\sum_{n=1}^{N} \tau_{nk}}$$

Unconditional Mixture

- Partial derivative of $\ell$ with respect to $\Sigma_k$
- Setting it to zero, we have

  $$\Sigma_k = \frac{\sum_{n=1}^{N} \tau_{nk}\, (x_n - \mu_k)(x_n - \mu_k)^T}{\sum_{n=1}^{N} \tau_{nk}}$$

Unconditional Mixture

- EM algorithm from the expected complete log likelihood point of view
  Suppose we observed the latent variables $z_n$; the data set then becomes completely observed, and the likelihood is defined as the complete log likelihood

  $$\ell_c(\theta \mid x, z) = \sum_{n=1}^{N} \sum_{k=1}^{K} z_n^k \left[ \log \pi_k + \log \mathcal{N}(x_n \mid \mu_k, \Sigma_k) \right]$$

Unconditional Mixture

We treat the as random variables and take expectations conditioned on X and .

Note are binary r.v., where

Use this as the “best guess” for , we have

Expected complete log likelihood

Unconditional Mixture

- Maximizing the expected complete log likelihood by setting the derivatives to zero recovers the same updates for $\pi_k$, $\mu_k$, and $\Sigma_k$ as above.
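The E step (responsibilities) and M step (the closed-form updates above) assemble into the full EM loop. A univariate sketch, not the presentation's own code; the quantile-based initialization is an assumption added here to break symmetry deterministically:

```python
import numpy as np

def em_gmm_1d(x, K, n_iters=50):
    """EM for a univariate Gaussian mixture, implementing the updates above:
    pi_k = (1/N) sum_n tau_nk, mu_k = weighted mean, var_k = weighted variance."""
    N = len(x)
    pi = np.full(K, 1.0 / K)
    # Spread the initial means across the data range (heuristic initialization).
    mu = np.quantile(x, np.linspace(0.1, 0.9, K))
    var = np.full(K, x.var())
    for _ in range(n_iters):
        # E step: responsibilities tau_nk = posterior of z given x and theta.
        w = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        tau = w / w.sum(axis=1, keepdims=True)
        Nk = tau.sum(axis=0)
        # M step: maximize the expected complete log likelihood.
        pi = Nk / N
        mu = (tau * x[:, None]).sum(axis=0) / Nk
        var = (tau * (x[:, None] - mu) ** 2).sum(axis=0) / Nk
    return pi, mu, var
```

On bimodal data this recovers one component per mode, unlike the single-density fit from the earlier slide.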

Conditional Mixture

- Graphical model for regression and classification

[Graphical model: observed input node $X$ with arrows to latent node $Z$ and to output node $Y$; $Z$ also has an arrow to $Y$]

- Latent variable $Z$: a multinomial node taking on one of $K$ values
- The relationship between $X$ and $Z$ can be modeled in a discriminative classification way, e.g. with a softmax function

Conditional Mixture

- By marginalizing over $Z$,

  $$p(y \mid x, \theta) = \sum_{k=1}^{K} P(z^k = 1 \mid x, \theta)\, p(y \mid z^k = 1, x, \theta)$$

- $X$ is taken to be always observed. The posterior probability of the latent variable is defined as

  $$\tau_{nk} = P(z_n^k = 1 \mid x_n, y_n, \theta) = \frac{P(z^k = 1 \mid x_n, \theta)\, p(y_n \mid z^k = 1, x_n, \theta)}{\sum_{j=1}^{K} P(z^j = 1 \mid x_n, \theta)\, p(y_n \mid z^j = 1, x_n, \theta)}$$

Conditional Mixture

- Some specific choices of mixture components
- Gaussian components (for regression):

  $$p(y \mid z^k = 1, x, \theta) = \mathcal{N}(y \mid \beta_k^T x, \sigma_k^2)$$

- Logistic components (for binary classification):

  $$p(y \mid z^k = 1, x, \theta) = \mu(\beta_k^T x)^{y} \left[1 - \mu(\beta_k^T x)\right]^{1 - y}$$

  where $\mu(\cdot)$ is the logistic function:

  $$\mu(\eta) = \frac{1}{1 + e^{-\eta}}$$
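A small sketch of the logistic component (names here are illustrative, not from the slides):

```python
import numpy as np

def logistic(eta):
    """The logistic function mu(eta) = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + np.exp(-eta))

def logistic_component_lik(y, x, beta):
    """Component likelihood mu^y (1 - mu)^(1 - y) for a binary label y,
    with mu = logistic(beta^T x)."""
    mu = logistic(x @ beta)
    return mu ** y * (1.0 - mu) ** (1 - y)
```

Each component is thus a logistic-regression model for $y$, and the mixture softly switches between $K$ such models.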

Conditional Mixture

- Parameter estimation via EM
  Complete log likelihood:

  $$\ell_c(\theta \mid x, y, z) = \sum_{n=1}^{N} \sum_{k=1}^{K} z_n^k \left[ \log P(z^k = 1 \mid x_n, \theta) + \log p(y_n \mid z^k = 1, x_n, \theta) \right]$$

  Using the expectation $E[z_n^k \mid x_n, y_n, \theta] = \tau_{nk}$ as the "best guess" for $z_n^k$, we have

Conditional Mixture

- The expected complete log likelihood can then be written as

  $$\langle \ell_c \rangle = \sum_{n=1}^{N} \sum_{k=1}^{K} \tau_{nk} \left[ \log P(z^k = 1 \mid x_n, \theta) + \log p(y_n \mid z^k = 1, x_n, \theta) \right]$$

- Take partial derivatives and set them to zero to find the update formulas for EM.

Conditional Mixture

Summary of the EM algorithm for a conditional mixture

- (E step): Calculate the posterior probabilities $\tau_{nk}$.
- (M step): Use the IRLS algorithm to update the gating parameter $\xi$, based on the data pairs $(x_n, \tau_{nk})$.
- (M step): Use the weighted IRLS algorithm to update the component parameters $\theta_k$, based on the data points $(x_n, y_n)$, with weights $\tau_{nk}$.
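The summary above can be sketched for linear-Gaussian experts with softmax gating. Two simplifications relative to the slides, stated plainly: the expert M step is closed-form weighted least squares (what weighted IRLS reduces to for linear-Gaussian components), and the gating M step takes a few gradient steps on the expected complete log likelihood instead of full IRLS; all names and the block-wise initialization are assumptions of this sketch.

```python
import numpy as np

def softmax(A):
    A = A - A.max(axis=1, keepdims=True)
    e = np.exp(A)
    return e / e.sum(axis=1, keepdims=True)

def moe_em(X, y, K, n_iters=50, gate_steps=10, lr=0.1):
    """EM for a mixture of linear-Gaussian experts with softmax gating."""
    N, D = X.shape
    xi = np.zeros((D, K))      # gating parameters
    beta = np.zeros((D, K))    # expert regression weights
    sigma2 = np.empty(K)
    # Deterministic init: fit each expert to one block of the data
    # ordered by the first feature (a heuristic to break symmetry).
    for k, idx in enumerate(np.array_split(np.argsort(X[:, 0]), K)):
        beta[:, k] = np.linalg.lstsq(X[idx], y[idx], rcond=None)[0]
        sigma2[k] = ((y[idx] - X[idx] @ beta[:, k]) ** 2).mean() + 1e-6
    for _ in range(n_iters):
        # E step: tau_nk proportional to P(z^k=1|x_n) * N(y_n | beta_k^T x_n, sigma2_k).
        gate = softmax(X @ xi)
        resid = y[:, None] - X @ beta
        lik = np.exp(-0.5 * resid ** 2 / sigma2) / np.sqrt(2 * np.pi * sigma2)
        tau = gate * lik
        tau /= tau.sum(axis=1, keepdims=True)
        # M step for the experts: weighted least squares per component.
        for k in range(K):
            W = tau[:, k]
            G = X.T @ (W[:, None] * X) + 1e-8 * np.eye(D)
            beta[:, k] = np.linalg.solve(G, X.T @ (W * y))
            sigma2[k] = max((W * (y - X @ beta[:, k]) ** 2).sum() / W.sum(), 1e-6)
        # M step for the gating: gradient ascent on sum_nk tau_nk log gate_nk.
        for _ in range(gate_steps):
            xi += lr * X.T @ (tau - softmax(X @ xi)) / N
    return xi, beta, sigma2
```

On piecewise-linear data the experts recover the two regression lines and the gating learns which input region each expert owns.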

General Formulation

- $X$: all observable variables
- $Z$: all latent variables
- $\theta$: all parameters

Suppose $Z$ were observed; the ML estimate would be

$$\hat{\theta}_{ML} = \arg\max_{\theta}\, \log p(x, z \mid \theta)$$

However, $Z$ is in fact not observed.

Complete log likelihood: $\ell_c(\theta \mid x, z) = \log p(x, z \mid \theta)$

Incomplete log likelihood: $\ell(\theta \mid x) = \log p(x \mid \theta) = \log \sum_z p(x, z \mid \theta)$

General Formulation

- Suppose $p(x, z \mid \theta)$ factors in some way; the complete log likelihood then turns into a sum of local terms, one per factor.
- Since $Z$ is unknown, it is not clear how to solve this ML estimation directly. However, we can average over the random variable $Z$ using a distribution $q(z \mid x)$.

General Formulation

- Using $q(z \mid x)$ as an estimate of the distribution of $Z$, the complete log likelihood becomes the expected complete log likelihood

  $$\langle \ell_c(\theta \mid x, z) \rangle_q = \sum_z q(z \mid x)\, \log p(x, z \mid \theta)$$

- This expected complete log likelihood is solvable, and hopefully maximizing it will also improve the incomplete log likelihood in some way. (This is the basic idea behind EM.)

General Formulation

- Define the auxiliary function

  $$\mathcal{L}(q, \theta) = \sum_z q(z \mid x)\, \log \frac{p(x, z \mid \theta)}{q(z \mid x)}$$

- Given $q$, maximizing $\mathcal{L}(q, \theta)$ over $\theta$ is equal to maximizing the expected complete log likelihood, since the entropy term $-\sum_z q(z \mid x) \log q(z \mid x)$ does not depend on $\theta$.

General Formulation

- From the above, at every step of EM we maximize $\mathcal{L}(q, \theta)$.
- However, how do we know that the finally maximized $\mathcal{L}(q, \theta)$ also maximizes the incomplete log likelihood $\ell(\theta \mid x)$?
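This question has a standard resolution via Jensen's inequality, sketched here with the auxiliary function $\mathcal{L}(q, \theta) = \sum_z q(z \mid x) \log \frac{p(x, z \mid \theta)}{q(z \mid x)}$:

```latex
\begin{align*}
\ell(\theta \mid x)
  &= \log \sum_z p(x, z \mid \theta)
   = \log \sum_z q(z \mid x)\, \frac{p(x, z \mid \theta)}{q(z \mid x)} \\
  &\ge \sum_z q(z \mid x)\, \log \frac{p(x, z \mid \theta)}{q(z \mid x)}
   = \mathcal{L}(q, \theta),
\end{align*}
```

with equality exactly when $q(z \mid x) = p(z \mid x, \theta)$, the E-step choice. So the E step makes the lower bound tight at the current $\theta$, the M step raises it, and therefore the incomplete log likelihood $\ell(\theta \mid x)$ never decreases across EM iterations.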

General Formulation

- EM and alternating minimization
- Recall that maximizing the likelihood is exactly the same as minimizing the KL divergence between the empirical distribution and the model.
- Including the latent variable $Z$, the KL divergence becomes a "complete KL divergence" between joint distributions on $(X, Z)$.
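A sketch of this complete KL divergence, writing the empirical distribution as $\tilde{p}(x)$ (the notation here follows the standard alternating-minimization view and is not necessarily the slides' own symbols):

```latex
D\Bigl(\tilde{p}(x)\, q(z \mid x) \,\Big\|\, p(x, z \mid \theta)\Bigr)
  = \sum_{x, z} \tilde{p}(x)\, q(z \mid x)\,
    \log \frac{\tilde{p}(x)\, q(z \mid x)}{p(x, z \mid \theta)}
```

The E step minimizes this divergence over $q$ with $\theta$ held fixed, and the M step minimizes it over $\theta$ with $q$ held fixed, so EM is coordinate descent on the complete KL divergence.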

Summary

- Unconditional mixture
- Graphical model
- EM algorithm

- Conditional mixture
- Graphical model
- EM algorithm

- A general formulation of the EM algorithm
- Maximizing the auxiliary function
- Minimizing the "complete KL divergence"

Thank You!
