Lecture 11 Generalizations of EM
Last Time
• Example of Gaussian mixture model.
• E-step: compute sufficient statistics w.r.t. posterior.
• M-step: maximize Q.
• MoG_demo
Generalizations
• MAP-EM: include a prior on the parameters. EM then computes the maximum a posteriori (MAP) estimate of the parameters instead of the maximum-likelihood estimate.
• By interchanging the roles of x and the parameters, we can also compute the most likely configuration of x under P(x).
• “Generalized EM” (GEM): we only need to do partial M-steps, i.e. steps that increase Q without fully maximizing it.
• We can apply EM to maximize positive functions of a special form.
• We can do partial E-steps as well!
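As a rough illustration of the MAP-EM bullet, the sketch below compares the usual maximum-likelihood M-step for the mixing weights of a mixture model with a MAP version. It assumes a Dirichlet prior on the weights, which is a choice made here for illustration and not something the lecture specifies. The function names, the responsibilities `resp`, and the prior parameter `alpha` are all hypothetical.

```python
import numpy as np

def m_step_weights_ml(resp):
    """Maximum-likelihood update: average responsibility per component."""
    return resp.sum(axis=0) / resp.shape[0]

def m_step_weights_map(resp, alpha):
    """MAP update under an assumed Dirichlet(alpha) prior on the weights.

    This is the mode of the Dirichlet posterior over the weights; it is
    only a valid probability vector when every alpha_k >= 1.
    """
    counts = resp.sum(axis=0)  # expected number of points per component
    return (counts + alpha - 1.0) / (counts.sum() + alpha.sum() - alpha.size)

# Hypothetical E-step output: 4 data points, 2 components.
resp = np.array([[0.9, 0.1],
                 [0.8, 0.2],
                 [0.7, 0.3],
                 [1.0, 0.0]])

pi_ml = m_step_weights_ml(resp)                          # [0.85, 0.15]
pi_map = m_step_weights_map(resp, np.array([2.0, 2.0]))  # pulled toward [0.5, 0.5]
pi_flat = m_step_weights_map(resp, np.array([1.0, 1.0])) # flat prior: same as ML
```

The E-step is unchanged; only the M-step objective gains the log-prior term. With a flat prior (all `alpha` equal to 1) the MAP update reduces to the maximum-likelihood one.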
Variational EM (VEM)
• EM can be viewed as coordinate ascent on Q(theta,q), where q(y) ranges over a parameterized family of distributions.
• The optimal value for q is the posterior: q(y) = p(y|x,theta).
• But the allowed family does not even have to contain that optimal solution. In that case we maximize a lower bound on the log-likelihood, which is still a sensible objective.
• This approximate EM algorithm can be very helpful in making an intractable E-step tractable (at the expense of accuracy).
• A simple example is k-means, where we choose q(y) to be a delta peak on a single cluster, the one with the nearest mean.
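The bound and the k-means example can be checked numerically. The sketch below is a minimal illustration, assuming a one-dimensional mixture of two Gaussians with equal weights and a shared, fixed standard deviation. Under those assumptions the most probable component for a point is the one with the nearest mean. The objective is written as F(q, theta) = E_q[log p(x, y | theta)] + H(q), which is one common way to write the quantity the slide calls Q(theta,q). The data and all function names are made up for the example.

```python
import numpy as np

def log_joint(x, means, sigma, weights):
    """log p(x_n, y_n = k | theta) for every point n and component k."""
    diff = x[:, None] - means[None, :]
    log_gauss = -0.5 * (diff / sigma) ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))
    return np.log(weights)[None, :] + log_gauss

def log_likelihood(lj):
    """log p(x | theta) = sum_n log sum_k p(x_n, y_n = k | theta)."""
    m = lj.max(axis=1, keepdims=True)
    return float(np.sum(m[:, 0] + np.log(np.exp(lj - m).sum(axis=1))))

def bound(lj, q):
    """F(q, theta) = E_q[log p(x, y | theta)] + entropy of q."""
    safe_q = np.where(q > 0, q, 1.0)  # treat 0 * log 0 as 0
    return float(np.sum(q * (lj - np.log(safe_q))))

def exact_posterior(lj):
    """Full E-step: q(y_n = k) = p(y_n = k | x_n, theta)."""
    m = lj.max(axis=1, keepdims=True)
    p = np.exp(lj - m)
    return p / p.sum(axis=1, keepdims=True)

def hard_assignment(lj):
    """Restricted E-step: q is a delta peak on the most probable component."""
    q = np.zeros_like(lj)
    q[np.arange(lj.shape[0]), lj.argmax(axis=1)] = 1.0
    return q

# Hypothetical data and parameters.
x = np.array([-2.1, -1.9, -0.2, 0.3, 1.8, 2.2])
means = np.array([-2.0, 2.0])
lj = log_joint(x, means, sigma=1.0, weights=np.array([0.5, 0.5]))

ll = log_likelihood(lj)
f_exact = bound(lj, exact_posterior(lj))  # equals ll: the bound is tight
f_hard = bound(lj, hard_assignment(lj))   # strictly below ll: a looser bound
```

With the exact posterior the bound touches the log-likelihood. With delta-peak q it stays strictly below, which is the accuracy given up in exchange for a cheaper E-step.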