Variational Inference for the Indian Buffet Process. Finale Doshi-Velez, Kurt T. Miller, Jurgen Van Gael and Yee Whye Teh AISTATS 2009 Presented by: John Paisley, Duke University, Dept. of ECE.
Introduction
• This paper provides variational inference equations for the stick-breaking construction of the Indian buffet process (IBP). It also gives bounds on the error incurred when the infinite stick-breaking IBP is approximated by a truncated one.
Outline of Presentation
• Review of the IBP and its stick-breaking construction
• Variational inference for the IBP
• Truncation error bounds for variational inference
• Results on a linear-Gaussian model for toy and real data
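Indian Buffet Process
• The first customer selects $\mathrm{Poisson}(\alpha)$ features.
• The $i$th customer selects each existing feature $k$ with probability $m_k / i$, where $m_k$ is the number of previous customers who selected this feature.
• The $i$th customer then selects $\mathrm{Poisson}(\alpha / i)$ new features.
Below is the probability of the binary matrix $Z$ (up to column ordering):
$$P([Z]) = \frac{\alpha^{K}}{\prod_{h \in \{0,1\}^N \setminus \mathbf{0}} K_h!} \exp(-\alpha H_N) \prod_{k=1}^{K} \frac{(N - m_k)!\,(m_k - 1)!}{N!}$$
Here $N$ is the number of customers, $K$ the number of selected dishes, $m_k$ the total number of customers who selected dish $k$, $K_h$ the number of columns of $Z$ equal to the binary vector $h$, and $H_N = \sum_{j=1}^{N} 1/j$. In the leading fraction, the top term comes from the probability of $K$ dishes and the bottom term accounts for permutations of identical columns.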
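The restaurant metaphor above can be sketched as a short sampler. This is a minimal illustration, not code from the paper: the function names `sample_poisson` and `sample_ibp` are hypothetical, and the Poisson draw uses Knuth's multiplication method, which is only sensible for small rates.

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw from Poisson(lam) with Knuth's multiplication method (small lam only)."""
    threshold = math.exp(-lam)
    count, prod = 0, rng.random()
    while prod > threshold:
        count += 1
        prod *= rng.random()
    return count

def sample_ibp(num_customers, alpha, seed=0):
    """Return a binary matrix Z (list of rows) drawn from the IBP restaurant process."""
    rng = random.Random(seed)
    dish_counts = []  # m_k: number of customers so far who took dish k
    rows = []
    for i in range(1, num_customers + 1):
        # Existing dishes: customer i takes dish k with probability m_k / i.
        row = [1 if rng.random() < m / i else 0 for m in dish_counts]
        # New dishes: Poisson(alpha / i) of them, each taken by customer i.
        num_new = sample_poisson(alpha / i, rng)
        row.extend([1] * num_new)
        # Update counts for old dishes, then append counts for the new ones.
        dish_counts = [m + z for m, z in zip(dish_counts, row)] + [1] * num_new
        rows.append(row)
    num_dishes = len(dish_counts)
    # Pad earlier rows with zeros so that Z is rectangular.
    return [r + [0] * (num_dishes - len(r)) for r in rows]

if __name__ == "__main__":
    for row in sample_ibp(num_customers=6, alpha=2.0, seed=1):
        print("".join(str(z) for z in row))
```

Each row is one customer and each column one dish; columns appear in the order the dishes were first sampled, which is one representative of the equivalence class $[Z]$.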
The Stick-Breaking Construction of the IBP
• Rather than marginalizing out $\pi_k$, the probability of selecting dish $k$, a stick-breaking construction can be used.* (Note: The generative process shown on the slide was written by the presenter. The paper presents the probability values in decreasing order: $\pi_{(k)} = \prod_{i=1}^{k} v_i$ with $v_i \sim \mathrm{Beta}(\alpha, 1)$.)
• This stick-breaking representation holds for this specific parameterization of the beta distribution: $\pi_k \sim \mathrm{Beta}(\alpha / K, 1)$, in the limit $K \to \infty$.
* Y.W. Teh, D. Görür & Z. Ghahramani (2007). Stick-breaking construction for the Indian buffet process. 11th AISTATS.
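A truncated version of the stick-breaking construction can be sketched as follows, assuming the form from Teh et al. in which $v_k \sim \mathrm{Beta}(\alpha, 1)$ and $\pi_k = \prod_{i \le k} v_i$. The function name and the `truncation` argument are illustrative choices, not names from the paper.

```python
import random

def sample_stick_breaking_ibp(num_customers, alpha, truncation, seed=0):
    """Sample feature probabilities pi and a binary matrix Z, truncated at `truncation` features."""
    rng = random.Random(seed)
    pi, stick = [], 1.0
    for _ in range(truncation):
        stick *= rng.betavariate(alpha, 1.0)  # v_k ~ Beta(alpha, 1)
        pi.append(stick)                       # pi_k = v_1 * ... * v_k
    # Each customer takes feature k independently with probability pi_k.
    Z = [[1 if rng.random() < p else 0 for p in pi] for _ in range(num_customers)]
    return pi, Z

if __name__ == "__main__":
    pi, Z = sample_stick_breaking_ibp(num_customers=5, alpha=3.0, truncation=8, seed=2)
    print([round(p, 3) for p in pi])
    for row in Z:
        print(row)
```

Because every $v_k$ lies in $[0, 1]$, the sampled $\pi_k$ are automatically non-increasing, which is the decreasing ordering the note above refers to.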
VB Inference for the Stick-Breaking Construction
The focus is on inference for the parameters of this construction.
A lower bound approximation needs to be made for one of the terms, $E_v[\log(1 - \prod_{m=1}^{k} v_m)]$. This is given at right, where the authors introduce a multinomial distribution, $q$, and optimize for this parameter (lower right).
This term arises in the likelihood of $z$; the posterior of $v$ is more complicated. Using this multinomial lower bound, "terms decompose independently for each $v_m$ and we get a closed form exponential family update."
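The multinomial lower bound can be checked numerically. The sketch below assumes the variational factors are $q(v_m) = \mathrm{Beta}(\tau_{m1}, \tau_{m2})$ and uses the telescoping identity $1 - \prod_{m \le k} v_m = \sum_{y \le k} (1 - v_y) \prod_{m < y} v_m$ followed by Jensen's inequality with weights $q_y$. All function names are hypothetical, and the digamma routine is a simple stand-in (recurrence plus asymptotic series) rather than a library call.

```python
import math
import random

def digamma(x):
    """Digamma for x > 0: upward recurrence, then an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return result + math.log(x) - 0.5 / x - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))

def multinomial_lower_bound(tau):
    """tau: list of (tau_m1, tau_m2), m = 1..k. Returns (bound at optimal q, optimal q)."""
    scores, sum_log_v = [], 0.0  # sum_log_v accumulates E[log v_m] for m < y
    for t1, t2 in tau:
        total = digamma(t1 + t2)
        scores.append(digamma(t2) - total + sum_log_v)  # E[log(1 - v_y)] + sum_{m<y} E[log v_m]
        sum_log_v += digamma(t1) - total
    top = max(scores)
    log_norm = top + math.log(sum(math.exp(s - top) for s in scores))
    q = [math.exp(s - log_norm) for s in scores]  # optimal q_y is proportional to exp(score_y)
    return log_norm, q  # at the optimal q the bound equals the log-sum-exp of the scores

def monte_carlo_expectation(tau, num_samples=20000, seed=0):
    """Monte Carlo estimate of E[log(1 - prod_m v_m)] under the Beta factors."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        prod = 1.0
        for t1, t2 in tau:
            prod *= rng.betavariate(t1, t2)
        total += math.log1p(-prod)
    return total / num_samples

if __name__ == "__main__":
    tau = [(2.0, 1.0), (1.5, 2.0), (3.0, 1.0)]
    bound, q = multinomial_lower_bound(tau)
    print("lower bound:", bound, "q:", q)
    print("Monte Carlo:", monte_carlo_expectation(tau))
```

For $k = 1$ the bound is exact, since the sum has a single term; for larger $k$ the Monte Carlo estimate should sit at or above the bound, up to sampling noise.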
Truncation Error for VB Inference
Given a truncation of the stick-breaking construction at level $K$, how close are we to the infinite model? A bound is given using the same motivation as Ishwaran & James* in their calculation for the Dirichlet process.
After deriving approximations, an upper bound is
$$\frac{1}{4} \int |m_K(X) - m_\infty(X)|\, dX \le 1 - \exp\left(-N \alpha \left(\frac{\alpha}{1 + \alpha}\right)^{K}\right)$$
where $m_K$ and $m_\infty$ are the marginal distributions of the data $X$ under the truncated and infinite models.
At right is a comparison of this bound with an estimate of this value from 1000 Monte Carlo simulations for $N = 30$, $\alpha = 5$.
* H. Ishwaran & L.F. James (2001). Gibbs sampling methods for stick-breaking priors. JASA.
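A rough numerical check of the truncation behaviour is sketched below. It assumes the bound has the form $1 - \exp(-N\alpha(\alpha/(1+\alpha))^K)$ and that it controls the probability that some customer uses a feature with index beyond $K$. The simulation estimates that probability, not the L1 distance plotted in the paper, and it stands in for the infinite model with a large finite cutoff `k_max`; both choices, and the function names, are assumptions of this sketch.

```python
import math
import random

def truncation_bound(num_customers, alpha, K):
    """Assumed form of the bound: 1 - exp(-N * alpha * (alpha / (1 + alpha))^K)."""
    return 1.0 - math.exp(-num_customers * alpha * (alpha / (1.0 + alpha)) ** K)

def estimate_truncation_probability(num_customers, alpha, K,
                                    num_sims=1000, k_max=200, seed=0):
    """Monte Carlo estimate of Pr(some customer uses a feature with index > K)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_sims):
        stick = 1.0
        for k in range(1, k_max + 1):
            stick *= rng.betavariate(alpha, 1.0)  # pi_k = v_1 * ... * v_k
            if k > K:
                # Given pi_k, at least one of N customers uses feature k
                # with probability 1 - (1 - pi_k)^N.
                if rng.random() < 1.0 - (1.0 - stick) ** num_customers:
                    hits += 1
                    break
    return hits / num_sims

if __name__ == "__main__":
    N, alpha = 30, 5.0
    for K in (10, 20, 30, 40, 50):
        est = estimate_truncation_probability(N, alpha, K, num_sims=500)
        print(K, round(truncation_bound(N, alpha, K), 4), round(est, 4))
```

The bound decays geometrically in $K$ at rate $\alpha / (1 + \alpha)$, so larger $\alpha$ calls for a higher truncation level to reach the same error.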
Results: Synthetic Data
(lower left) The authors randomly generated data and calculated the log-likelihood of test data under the inferred models as a function of time. The plot indicates that variational inference is both better and faster.
(right) More information about speed for the toy problem.
Results: Two Real Datasets
• Yale Faces: 721 images (32 x 32) of 14 people with different expressions and lighting.
• Speech Data: 245 observations from 10 microphones and 5 speakers.
• At right, we can see that the variational inference methods outperform and are faster than Gibbs sampling for the Yale Faces.
• Performance and speed are worse for the speech dataset. A reason is that this dataset is only 10-dimensional, while Yale is 1024-dimensional (32 x 32). In this low-dimensional case, inference is fast for MCMC and the error of the VB approximation becomes apparent.