Understanding Exponential Family Distributions and Iterative Scaling Algorithms in Machine Learning

Lecture 7: Generalized Iterative Scaling and Exponential Family Distributions

Iterative Scaling
• Two bounds for convex (concave) functions: the Jensen bound and the variational bound.
• We have seen MaxEnt models in the unsupervised setting. In the supervised setting, we can again take either the discriminative or the generative path.
• Discriminative: conditional random fields.
• GIS (generalized iterative scaling): a parallel bound-optimization algorithm for (conditional) random fields and MaxEnt distributions.
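To make the GIS bullet concrete, here is a minimal sketch of the standard GIS update for an unconditional MaxEnt model on a small finite domain. It is not taken from the lecture: the function name `gis`, the four-state toy domain, the target moments, and the slack feature are all illustrative assumptions. The sketch assumes nonnegative features whose sum is the same constant C for every state, which is why a slack feature is appended. Every parameter is updated at the same time from the same model expectations, which is the sense in which the algorithm is parallel.

```python
import numpy as np

def gis(features, target_mean, n_iters=200):
    """Sketch of GIS for p(x) proportional to exp(sum_i lam_i * f_i(x)).

    features:    (n_states, n_features) nonnegative array; every row must
                 sum to the same constant C (add a slack feature if needed).
    target_mean: (n_features,) strictly positive target expectations.
    """
    row_sums = features.sum(axis=1)
    assert np.allclose(row_sums, row_sums[0]), "rows must share one feature sum"
    C = row_sums[0]

    def model(lam):
        # Normalized distribution over the enumerated states.
        scores = features @ lam
        p = np.exp(scores - scores.max())
        return p / p.sum()

    lam = np.zeros(features.shape[1])
    for _ in range(n_iters):
        model_mean = model(lam) @ features
        # Parallel update: all parameters move together.
        lam += np.log(target_mean / model_mean) / C
    return lam, model(lam)

# Hypothetical toy problem: two binary variables, features x1 and x2,
# plus a slack feature so that every row sums to C = 2.
states = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
slack = 2.0 - states.sum(axis=1, keepdims=True)
features = np.hstack([states, slack])
target_mean = np.array([0.7, 0.4, 0.9])  # E[x1], E[x2], E[slack]

lam, p = gis(features, target_mean)
print(p @ features)  # model moments after fitting
```

In this toy problem the fitted distribution should match the target moments, and because the only constraints are the two marginals, the MaxEnt solution is the product of independent Bernoulli distributions.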

Exponential Family Distributions
• Exponential family distributions have the same form as the feature representation of undirected graphical models.
• Examples: multinomial, Bernoulli, Gaussian, Poisson, ...
• The mean is the first derivative of log Z with respect to the canonical parameters.
• The variance is the second derivative of log Z.
• log Z is a convex function of the parameters, with a one-to-one correspondence between value and derivative.
• Value = canonical parameters; derivative = moments.
• These two representations are duals of each other.
• The sufficient statistics determine the parameter values completely.
• With multiple data cases, the sufficient statistic is the sum over the cases.
• Next week: ML learning and IRLS.
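The derivative claims above can be checked numerically on one of the listed examples. The following is a small sketch, not part of the lecture, using the Bernoulli distribution in canonical form, where log Z(theta) = log(1 + exp(theta)). The helper names and the value theta = 0.8 are arbitrary illustrative choices. Finite differences of log Z are compared with the known Bernoulli mean and variance.

```python
import math

def log_partition(theta):
    # log Z for a Bernoulli with canonical parameter theta (the log-odds).
    return math.log1p(math.exp(theta))

def first_derivative(f, x, h=1e-4):
    # Central finite difference.
    return (f(x + h) - f(x - h)) / (2 * h)

def second_derivative(f, x, h=1e-3):
    # Central second-order finite difference.
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

theta = 0.8  # arbitrary illustrative value
mu = 1.0 / (1.0 + math.exp(-theta))  # closed-form mean (the moment parameter)

mean_fd = first_derivative(log_partition, theta)
var_fd = second_derivative(log_partition, theta)

print(mean_fd, mu)             # first derivative vs. mean
print(var_fd, mu * (1 - mu))   # second derivative vs. variance

# Duality: the canonical parameter is recovered from the moment.
theta_back = math.log(mu / (1 - mu))
print(theta_back, theta)
```

The last lines illustrate the one-to-one correspondence: the map from theta to the mean is the sigmoid, and its inverse (the logit) maps the moment back to the canonical parameter.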
