Variational Inference and Variational Message Passing
John Winn, Microsoft Research, Cambridge
12th November 2004, Robotics Research Group, University of Oxford
Overview • Probabilistic models & Bayesian inference • Variational Inference • Variational Message Passing • Vision example
Bayesian networks • Directed graph • Nodes represent variables • Links show dependencies • Conditional distributions at each node • Defines a joint distribution. For the example network with object class C, lighting colour L, surface colour S and image colour I: P(C,L,S,I) = P(C) P(L) P(S|C) P(I|L,S)
Bayesian inference • Observed variables V and hidden variables H (in the example network, the image colour I is observed; C, L and S are hidden). • Hidden variables include parameters and latent variables. • Learning/inference involves finding the posterior P(H1, H2, … | V).
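As a toy numeric sketch (all probabilities below are invented for illustration), the factorised joint of the example network can be coded directly, and the exact posterior over the hidden variables recovered by enumeration:

```python
import itertools

# Invented CPTs: C and L are root nodes, S depends on C,
# and the image colour I depends on L and S.
P_C = {0: 0.7, 1: 0.3}
P_L = {0: 0.6, 1: 0.4}
P_S_given_C = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}
P_I_given_LS = {(0, 0): {0: 0.95, 1: 0.05}, (0, 1): {0: 0.3, 1: 0.7},
                (1, 0): {0: 0.6, 1: 0.4}, (1, 1): {0: 0.1, 1: 0.9}}

def joint(c, l, s, i):
    # P(C,L,S,I) = P(C) P(L) P(S|C) P(I|L,S)
    return P_C[c] * P_L[l] * P_S_given_C[c][s] * P_I_given_LS[(l, s)][i]

# A valid joint distribution must sum to one over all configurations.
total = sum(joint(*cfg) for cfg in itertools.product([0, 1], repeat=4))

# Exact inference by enumeration: P(C, L, S | I=1) is proportional to
# the joint with I clamped to the observed value.
evidence = sum(joint(c, l, s, 1)
               for c, l, s in itertools.product([0, 1], repeat=3))
posterior = {(c, l, s): joint(c, l, s, 1) / evidence
             for c, l, s in itertools.product([0, 1], repeat=3)}
print(round(total, 10))                    # 1.0
print(round(sum(posterior.values()), 10))  # 1.0
```

Enumeration is exponential in the number of hidden variables, which is exactly why approximate schemes such as variational inference are needed.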
Bayesian inference vs. ML/MAP • Consider learning one parameter θ. How should we represent the posterior distribution? • MAP: take the maximum of P(V|θ) P(θ), giving a point estimate θMAP. [Plot: posterior density with its mode marked at θMAP] • But the point of highest probability density need not lie where the high probability mass is. [Plot: skewed posterior; the mode and the bulk of the mass differ] • Sampling: represent the posterior by a set of samples, rather than a point estimate θML. [Plot: samples drawn from P(V|θ) P(θ)] • Variational methods: approximate the posterior by a tractable distribution. [Plot: variational approximation fitted to P(V|θ) P(θ)]
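The gap between the highest-density point and the bulk of the probability mass can be seen numerically. A minimal sketch using an invented skewed posterior p(θ) ∝ θ·exp(−θ), i.e. a Gamma(2,1) density chosen purely for illustration:

```python
import math

# Discretise the skewed toy posterior on a grid over (0, 20].
grid = [i * 0.01 for i in range(1, 2001)]
dens = [t * math.exp(-t) for t in grid]
Z = sum(dens) * 0.01                      # normalising constant

theta_map = grid[dens.index(max(dens))]   # the MAP point (the mode)
theta_mean = sum(t * d for t, d in zip(grid, dens)) * 0.01 / Z

print(round(theta_map, 2))    # 1.0: mode of Gamma(a,b) is (a-1)/b
print(round(theta_mean, 2))   # mean is a/b = 2: most mass lies away from the mode
```

A point estimate at θ = 1 would ignore the fact that the posterior mass is centred around θ = 2, which is why representing the whole distribution matters.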
Overview • Probabilistic models & Bayesian inference • Variational Inference • Variational Message Passing • Vision example
Variational Inference (in three easy steps…) • Choose a family of variational distributions Q(H). • Use Kullback-Leibler divergence KL(Q||P) as a measure of ‘distance’ between P(H|V) and Q(H). • Find Q which minimises divergence.
KL Divergence • Minimising KL(Q||P): exclusive, Q locks onto one mode of P. [Plot: narrow Q inside a broader P] • Minimising KL(P||Q): inclusive, Q spreads to cover all of P's mass. [Plot: broad Q covering P]
Minimising the KL divergence • For arbitrary Q(H), the log evidence decomposes as ln P(V) = L(Q) + KL(Q||P), where L(Q) = Σ_H Q(H) ln [ P(H,V) / Q(H) ]. • Since ln P(V) is fixed, maximising L(Q) is equivalent to minimising KL(Q||P). • We choose a family of Q distributions where L(Q) is tractable to compute.
Minimising the KL divergence KL(Q || P) maximise ln P(V) fixed L(Q)
Minimising the KL divergence KL(Q || P) maximise ln P(V) fixed L(Q)
Minimising the KL divergence KL(Q || P) maximise ln P(V) fixed L(Q)
Minimising the KL divergence KL(Q || P) maximise ln P(V) fixed L(Q)
Minimising the KL divergence KL(Q || P) maximise ln P(V) fixed L(Q)
Factorised Approximation • Assume Q factorises: Q(H) = Π_i Q_i(H_i). • Optimal solution for one factor given by ln Q_i*(H_i) = ⟨ln P(H,V)⟩_{Q \ Q_i} + const. • No further assumptions are required!
Example: Univariate Gaussian • Likelihood function: Gaussian with mean μ and precision γ. • Conjugate prior: Normal-Gamma distribution over (μ, γ). • Factorised variational distribution: Q(μ,γ) = Q_μ(μ) Q_γ(γ).
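A minimal coordinate-update sketch of this example. To keep the updates short it uses independent Normal and Gamma priors rather than the full conjugate Normal-Gamma, and all hyperparameter values are illustrative:

```python
import math
import random

# Synthetic data from a Gaussian with known ground truth.
random.seed(0)
true_mu, true_prec = 2.0, 4.0
x = [random.gauss(true_mu, 1 / math.sqrt(true_prec)) for _ in range(500)]
N, sx, sxx = len(x), sum(x), sum(v * v for v in x)

# Priors (assumed): mu ~ N(m0, 1/tau0), gamma ~ Gamma(a0, b0).
m0, tau0, a0, b0 = 0.0, 1e-3, 1e-3, 1e-3
E_gamma = 1.0                              # initial guess for <gamma>
for _ in range(50):                        # alternate the two factor updates
    lam = tau0 + N * E_gamma               # Q(mu) = N(m, 1/lam)
    m = (tau0 * m0 + E_gamma * sx) / lam
    a = a0 + N / 2                         # Q(gamma) = Gamma(a, b)
    # <(x_n - mu)^2> under Q(mu) uses both the mean m and the variance 1/lam:
    b = b0 + 0.5 * (sxx - 2 * m * sx + N * (m * m + 1 / lam))
    E_gamma = a / b
print(round(m, 2), round(E_gamma, 1))      # close to the true mean and precision
```

Each update is the optimal-factor formula from the previous slide applied to one factor while the other is held fixed.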
Overview • Probabilistic models & Bayesian inference • Variational Inference • Variational Message Passing • Vision example
Variational Message Passing • VMP makes it easier and quicker to apply factorised variational inference. • VMP carries out variational inference using local computations and message passing on the graphical model. • The modular algorithm allows models to be modified, extended or combined.
Local Updates • For factorised Q, the update for each factor depends only on that factor's Markov blanket, so updates can be carried out locally at each node.
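A sketch of the Markov-blanket computation (parents, children, and the children's other co-parents) for the earlier toy network, where C → S and {L, S} → I:

```python
# Parent lists define the directed graph: C and L are roots,
# S has parent C, and I has parents L and S.
parents = {"C": [], "L": [], "S": ["C"], "I": ["L", "S"]}

def markov_blanket(node):
    # Markov blanket = parents + children + children's other parents.
    children = [n for n, ps in parents.items() if node in ps]
    coparents = {p for c in children for p in parents[c]} - {node}
    return set(parents[node]) | set(children) | coparents

print(sorted(markov_blanket("S")))  # ['C', 'I', 'L']
print(sorted(markov_blanket("L")))  # ['I', 'S']
```

Updating Q_S thus needs only quantities stored at C, I and L, regardless of how large the rest of the graph is.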
VMP I: The Exponential Family • Conditional distributions expressed in exponential family form: ln P(X|θ) = θᵀ u(X) + g(θ) + f(X), where θ is the 'natural' parameter vector and u(X) is the sufficient statistics vector. • For example, the Gaussian distribution: ln P(X|μ,γ) = [γμ, −γ/2]ᵀ [X, X²] + ½(ln γ − γμ²) − ½ ln 2π.
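The Gaussian case above can be checked numerically: evaluating the exponential-family form with natural parameters [γμ, −γ/2] and sufficient statistics [X, X²] reproduces the usual log density (the values below are arbitrary):

```python
import math

mu, gamma, xv = 1.5, 2.0, 0.7
theta = (gamma * mu, -gamma / 2)     # natural parameter vector
u = (xv, xv * xv)                    # sufficient statistics vector
g = 0.5 * (math.log(gamma) - gamma * mu * mu) - 0.5 * math.log(2 * math.pi)

# Exponential-family form: theta . u(X) + g(theta)   (f(X) = 0 here)
lp_expfam = theta[0] * u[0] + theta[1] * u[1] + g
# Standard Gaussian log density with precision gamma:
lp_direct = 0.5 * math.log(gamma / (2 * math.pi)) - 0.5 * gamma * (xv - mu) ** 2

print(abs(lp_expfam - lp_direct) < 1e-9)  # True: the two forms agree
```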
VMP II: Conjugacy • Parents and children are chosen to be conjugate, i.e. to have the same functional form: ln P(X|θ) = θᵀ u(X) + g(θ) + f(X) and ln P(Z|X,Y) = φ(Y,Z)ᵀ u(X) + g′(X) + f′(Y,Z), both linear in the same sufficient statistics u(X). • Examples: • Gaussian for the mean of a Gaussian • Gamma for the precision of a Gaussian • Dirichlet for the parameters of a discrete distribution
VMP III: Messages • Conditionals: ln P(X|θ) = θᵀ u(X) + g(θ) + f(X) and ln P(Z|X,Y) = φ(Y,Z)ᵀ u(X) + g′(X) + f′(Y,Z). • Messages: • Parent to child (X→Z): the expected sufficient statistics ⟨u(X)⟩ under Q. • Child to parent (Z→X): the expected natural parameter ⟨φ(Y,Z)⟩ under Q.
VMP IV: Update • Optimal Q(X) has the same form as P(X|θ), but with an updated parameter vector θ* computed from the messages sent by X's parents and children. • These messages can also be used to calculate the bound on the evidence L(Q) – see Winn & Bishop, 2004.
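A sketch of one such update for Q(μ) in the Gaussian case, assuming observed children x_n and a current estimate ⟨γ⟩ taken from Q(γ) (all numbers invented). Each child sends the natural-parameter increment (⟨γ⟩·x_n, −⟨γ⟩/2), and θ* is the prior's natural parameters plus the sum of the messages:

```python
# Invented data and current variational quantities.
x = [2.1, 1.9, 2.4]
m0, tau0, E_gamma = 0.0, 0.1, 4.0          # prior N(m0, 1/tau0); <gamma> from Q(gamma)

prior_nat = (tau0 * m0, -tau0 / 2)          # prior's natural parameters
msgs = [(E_gamma * xn, -E_gamma / 2) for xn in x]   # child-to-parent messages

# theta* = prior natural parameters + sum of child messages.
theta_star = (prior_nat[0] + sum(m[0] for m in msgs),
              prior_nat[1] + sum(m[1] for m in msgs))

# Recover the moments of the updated Q(mu) = N(m, 1/lam).
lam = -2 * theta_star[1]                    # lam = tau0 + N * <gamma>
mean = theta_star[0] / lam
print(round(lam, 2), round(mean, 3))        # 12.1 2.116
```

The same additive structure holds for any conjugate-exponential node, which is what makes the algorithm purely local.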
VMP Example • Learning the parameters of a Gaussian from N data points. [Graph: mean μ and precision γ (inverse variance) as parents of the data x, in a plate of size N] • Step 1: message from γ to all x (needs an initial Q(γ)). • Step 2: messages from each xₙ to μ; update the Q(μ) parameter vector. • Step 3: message from the updated μ to all x. • Step 4: messages from each xₙ to γ; update the Q(γ) parameter vector.
Features of VMP • Graph does not need to be a tree – it can contain loops (but not cycles). • Flexible message passing schedule – factors can be updated in any order. • Distributions can be discrete or continuous, multivariate, truncated (e.g. rectified Gaussian). • Can have deterministic relationships (A=B+C). • Allows for point estimates e.g. of hyper-parameters
VMP Software: VIBES Free download from vibes.sourceforge.net
Overview • Probabilistic models & Bayesian inference • Variational Inference • Variational Message Passing • Vision example
Flexible sprite model • Proposed by Jojic & Frey (2001). • Data: a set of images x, e.g. frames from a video (plate N). • Sprite appearance f and shape π. • Sprite transform T for each image (discretised), giving a mask m for each image. • Background b and noise β.
VMP applied to the full model (π, f, b, T, m, β, x; plate N) – Winn & Blake (NIPS 2004)
Results of VMP on hand video Original video Learned appearance and mask Learned transforms (i.e. motion)
Conclusions • Variational Message Passing allows approximate Bayesian inference for a wide range of models. • VMP simplifies the construction, testing, extension and comparison of models. • You can try VMP for yourself at vibes.sourceforge.net