Posterior Regularization for Structured Latent Variable Models Li Zhonghua I2R SMT Reading Group
Outline • Motivation and Introduction • Posterior Regularization • Application • Implementation • Some Related Frameworks
Motivation and Introduction Prior Knowledge We possess a wealth of prior knowledge about most NLP tasks.
Motivation and Introduction Leveraging Prior Knowledge Possible approaches and their limitations
Motivation and Introduction--Limited Approach Bayesian Approach: Encode prior knowledge with a prior on the parameters • Limitation: Our prior knowledge is not about parameters! • Parameters are difficult to interpret; it is hard to get the desired effect.
Motivation and Introduction--Limited Approach Augmenting the Model: Encode prior knowledge with additional variables and dependencies. • Limitation: may make exact inference intractable.
Posterior Regularization • A declarative language for specifying prior knowledge -- Constraint Features & Expectations • Methods for learning with knowledge in this language -- EM-style learning algorithm
Posterior Regularization Original Objective:
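Following the PR paper this talk is based on (Ganchev et al., JMLR 2010), the original objective is the (penalized) marginal log-likelihood, and posterior regularization penalizes the KL distance from the posterior to a constraint set defined by feature expectations; a standard statement, restated here for reference:

$$\mathcal{L}(\theta) = \log p(\theta) + \sum_{x} \log \sum_{y} p_\theta(x, y)$$

$$J_Q(\theta) = \mathcal{L}(\theta) - \sum_x \min_{q \in Q_x} \mathrm{KL}\big(q(y) \,\|\, p_\theta(y \mid x)\big), \qquad Q_x = \{\, q(y) : \mathbb{E}_q[\phi(x, y)] \le \mathbf{b} \,\}$$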
Posterior Regularization EM-style learning algorithm
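As a sketch (following the PR paper; the slide's exact presentation may differ), the algorithm alternates a modified E-step, which projects the model posterior onto the constraint set, with the usual M-step:

$$\text{E-step:}\quad q^{t+1}_x = \arg\min_{q \in Q_x} \mathrm{KL}\big(q(y) \,\|\, p_{\theta^t}(y \mid x)\big)$$

$$\text{M-step:}\quad \theta^{t+1} = \arg\max_\theta \; \log p(\theta) + \sum_x \mathbb{E}_{q^{t+1}_x}\big[\log p_\theta(x, y)\big]$$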
Posterior Regularization Computing the Posterior Regularizer
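The projection in the E-step is usually computed via its dual (a sketch following the standard derivation): maximize over non-negative multipliers $\lambda$

$$\max_{\lambda \ge 0} \; -\mathbf{b}^\top \lambda - \log \sum_y p_\theta(y \mid x) \exp\{-\lambda^\top \phi(x, y)\}$$

and recover the projected posterior as

$$q^*(y) \propto p_\theta(y \mid x) \exp\{-\lambda^{*\top} \phi(x, y)\}.$$

When the constraint features decompose the same way as the model, the sum over y is computed with the model's own dynamic program (e.g. forward-backward for an HMM).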
Application Statistical Word Alignments IBM Model 1 and HMM
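For reference, a standard parameterization of the HMM alignment model, with e the source sentence, f the target sentence, and y_j the source position aligned to target position j (IBM Model 1 is the special case with a uniform distortion distribution); the notation here is the usual one, not necessarily the slide's:

$$p(\mathbf{f}, \mathbf{y} \mid \mathbf{e}) = \prod_{j=1}^{|\mathbf{f}|} p_d(y_j \mid y_{j-1}) \; p_t(f_j \mid e_{y_j})$$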
Application One feature for each source word m that counts how many times it is aligned to a target word in the alignment y.
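Written out (a sketch of the bijectivity constraint as it is usually formulated in this line of work; the notation is mine, not necessarily the slide's):

$$\phi_m(x, y) = \sum_j \mathbf{1}[\, y_j = m \,], \qquad \mathbb{E}_q[\phi_m(x, y)] \le 1 \;\; \text{for every source position } m,$$

so that, in expectation, each source word is aligned to at most one target word.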
Application Define a feature for each target-source position pair (i, j). The feature takes the value zero in expectation if the word pair (i, j) is aligned with equal probability in both directions.
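One way to write this symmetry constraint (a sketch consistent with the B-HMM/S-HMM setup; the exact definition on the slide may differ): let q be a joint distribution over the forward alignment $\overrightarrow{y}$ and the backward alignment $\overleftarrow{y}$, and define

$$\phi_{ij}(x, \overrightarrow{y}, \overleftarrow{y}) = \mathbf{1}[\, \overrightarrow{y}_j = i \,] - \mathbf{1}[\, \overleftarrow{y}_i = j \,], \qquad \mathbb{E}_q[\phi_{ij}] = 0,$$

which is zero in expectation exactly when the pair (i, j) is aligned with equal probability in the two directions.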
Application Graça, Ganchev, and Taskar, "Learning Tractable Word Alignment Models with Complex Constraints", Computational Linguistics, 2010
Application • Six language pairs • Both types of constraints improve over the HMM in terms of both precision and recall • They improve over the HMM by 10% to 15% • S-HMM performs slightly better than B-HMM • S-HMM performs better than B-HMM in 10 out of 12 cases • Improves over IBM Model 4 in 9 out of 12 cases
Implementation • http://code.google.com/p/pr-toolkit/
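To give a feel for what the E-step projection involves, here is a small, self-contained NumPy sketch of the KL projection computed by gradient ascent on the dual. It enumerates a tiny label space explicitly and is purely illustrative: it is not the pr-toolkit's API, and the function and variable names are made up for this example.

import numpy as np

def pr_project(p, phi, b, step=0.1, iters=1000):
    # Project a base posterior p(y) onto {q : E_q[phi(y)] <= b} in KL divergence.
    # p   : array (Y,)   base posterior p_theta(y | x) over Y enumerated outcomes
    # phi : array (Y, K) constraint feature values phi_k(y)
    # b   : array (K,)   constraint bounds
    lam = np.zeros(phi.shape[1])
    for _ in range(iters):
        # The current dual iterate defines q_lam(y) proportional to p(y) * exp(-lam . phi(y)).
        logq = np.log(p) - phi @ lam
        q = np.exp(logq - logq.max())
        q /= q.sum()
        # The dual gradient is E_q[phi] - b; take an ascent step and keep lam non-negative.
        lam = np.maximum(0.0, lam + step * (q @ phi - b))
    logq = np.log(p) - phi @ lam
    q = np.exp(logq - logq.max())
    return q / q.sum()

# Toy usage: three outcomes, one constraint feature, expectation bounded by 1.
p = np.array([0.6, 0.3, 0.1])           # base posterior
phi = np.array([[2.0], [1.0], [0.0]])   # feature value of each outcome
b = np.array([1.0])                     # require E_q[phi] <= 1
print(pr_project(p, phi, b))            # projected distribution q*

For a structured model such as the HMM aligner, the explicit enumeration over y would be replaced by the model's dynamic program (forward-backward).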
More info: http://sideinfo.wikkii.com (many of these slides were taken from there). Thanks!