Accelerated Sampling for the Indian Buffet Process

Finale Doshi-Velez and Zoubin Ghahramani

ICML 2009

Presented by: John Paisley, Duke University

Introduction


  • The IBP is a nonparametric prior for inferring the number of underlying features in a dataset, as well as which subset of these features is used by a given observation, e.g., the number of underlying notes in a piece of music and which notes are played at what times.
  • Existing Gibbs samplers each have a drawback: the uncollapsed Gibbs sampler is slow to mix, while the collapsed Gibbs sampler is slow to complete an iteration when the number of observations is large.
  • This paper presents an accelerated Gibbs sampling method for the linear-Gaussian model that is fast in both respects.
The IBP and Linear-Gaussian Model
  • The IBP story:
    • The first customer walks into the buffet and samples Poisson(α) dishes.
    • The Nth customer samples each previously tasted dish k with probability m_k/N, where m_k is the number of previous customers who have sampled that dish, and then samples Poisson(α/N) new dishes.

This prior can be used with the linear-Gaussian model, X = ZA + ε, to infer the number of underlying vectors (the rows of A) that are linearly combined, as selected by the binary matrix Z, to construct the data matrix X.
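The buffet story and the linear-Gaussian model can be simulated in a few lines. The following is a minimal sketch, not the authors' code: the function names `sample_ibp` and `sample_linear_gaussian`, the parameter values, and the use of NumPy are all assumptions made for illustration.

```python
import numpy as np

def sample_ibp(num_customers, alpha, rng):
    """Draw a binary feature matrix Z by following the buffet story."""
    counts = []  # counts[k] = number of customers so far who took dish k
    rows = []    # rows[n] = dish indices taken by customer n
    for n in range(1, num_customers + 1):
        # An existing dish k is taken with probability m_k / n.
        taken = [k for k, m in enumerate(counts) if rng.random() < m / n]
        # The customer then tries Poisson(alpha / n) brand-new dishes.
        num_new = rng.poisson(alpha / n)
        taken.extend(range(len(counts), len(counts) + num_new))
        counts.extend([0] * num_new)
        for k in taken:
            counts[k] += 1
        rows.append(taken)
    Z = np.zeros((num_customers, len(counts)), dtype=int)
    for n, taken in enumerate(rows):
        for k in taken:
            Z[n, k] = 1
    return Z

def sample_linear_gaussian(Z, D, sigma_a, sigma_x, rng):
    """Generate X = Z A + noise with Gaussian loadings A (assumed priors)."""
    K = Z.shape[1]
    A = rng.normal(0.0, sigma_a, size=(K, D))
    X = Z @ A + rng.normal(0.0, sigma_x, size=(Z.shape[0], D))
    return X, A

# Illustrative values only; none of these settings come from the slides.
rng = np.random.default_rng(0)
Z = sample_ibp(num_customers=50, alpha=2.0, rng=rng)
X, A = sample_linear_gaussian(Z, D=10, sigma_a=1.0, sigma_x=0.1, rng=rng)
print(Z.shape, X.shape)
```

The number of columns of Z is random, which is what lets the model infer the number of features rather than fix it in advance.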

IBP Calculations for Z
  • Integrating out A, the posterior for each entry of Z combines the IBP prior with the collapsed likelihood P(X | Z), with one expression for existing features and another for new features.
  • With the loadings matrix A collapsed out, the likelihood term P(X | Z) depends on all rows of Z through a matrix determinant and a matrix inverse, which significantly increases the amount of computation needed to evaluate it.
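For reference, the collapsed quantities for the linear-Gaussian IBP are usually written as below. This is a sketch in assumed notation rather than a copy of the slide: X is N×D, Z is N×K, A has prior variance σ_a², the noise variance is σ_x², α is the IBP concentration parameter, m_{-n,k} counts the observations other than n that use feature k, and Z_new denotes Z with k_new extra columns that are active only for observation n.

```latex
\begin{align}
P(Z_{nk} = 1 \mid Z_{-nk}, X) &\propto \frac{m_{-n,k}}{N}\, P(X \mid Z), \\
P(k_{\mathrm{new}} \mid Z, X) &\propto \mathrm{Poisson}\!\left(k_{\mathrm{new}};\ \tfrac{\alpha}{N}\right) P(X \mid Z_{\mathrm{new}}), \\
P(X \mid Z) &= \frac{\exp\!\left\{-\frac{1}{2\sigma_x^2}\,
  \mathrm{tr}\!\left(X^{\top}\!\left(I - Z\left(Z^{\top} Z + \tfrac{\sigma_x^2}{\sigma_a^2} I\right)^{-1}\! Z^{\top}\right) X\right)\right\}}
  {(2\pi)^{ND/2}\,\sigma_x^{(N-K)D}\,\sigma_a^{KD}\,\left|Z^{\top} Z + \tfrac{\sigma_x^2}{\sigma_a^2} I\right|^{D/2}}.
\end{align}
```

The trace term touches every row of X each time a single entry of Z changes, which is where the per-iteration cost of the collapsed sampler comes from.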

IBP Calculations for Z
  • When A is not integrated out, inference is faster because matrices do not need to be inverted. Also, when finding the posterior for a value in the nth row of Z, we don’t need to consider the other rows of Z, since the likelihood factorizes over rows: P(X | Z, A) = ∏_n P(x_n | z_n, A).
  • For the linear-Gaussian model, the posterior of A given X and Z is Gaussian, with a mean and covariance available in closed form.
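The two points above can be sketched in code. Under the usual assumption of independent zero-mean Gaussian priors on the entries of A (variance σ_a²) and Gaussian noise (variance σ_x²), the posterior mean of A is (ZᵀZ + (σ_x²/σ_a²)I)⁻¹ZᵀX and each column of A shares the covariance σ_x²(ZᵀZ + (σ_x²/σ_a²)I)⁻¹. The function names below are hypothetical, and the data are synthetic.

```python
import numpy as np

def posterior_A(X, Z, sigma_x, sigma_a):
    """Posterior mean (K x D) and shared column covariance (K x K) of A."""
    K = Z.shape[1]
    M = np.linalg.inv(Z.T @ Z + (sigma_x**2 / sigma_a**2) * np.eye(K))
    mean = M @ Z.T @ X
    cov = sigma_x**2 * M  # identical for every column of A
    return mean, cov

def row_log_likelihood(x_n, z_n, A, sigma_x):
    """log P(x_n | z_n, A); it involves row n only, not the rest of Z."""
    D = x_n.shape[0]
    resid = x_n - z_n @ A
    return (-0.5 * D * np.log(2 * np.pi * sigma_x**2)
            - resid @ resid / (2 * sigma_x**2))

# Synthetic check with made-up sizes and noise levels.
rng = np.random.default_rng(1)
Z = rng.integers(0, 2, size=(200, 3))
A_true = rng.normal(size=(3, 4))
X = Z @ A_true + 0.1 * rng.normal(size=(200, 4))
mean, cov = posterior_A(X, Z, sigma_x=0.1, sigma_a=1.0)
total = sum(row_log_likelihood(X[n], Z[n], mean, 0.1) for n in range(200))
```

Because the uncollapsed likelihood is a sum of per-row terms, resampling one row of Z costs the same regardless of N; the price is that A must be sampled explicitly, which is what slows mixing.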
Accelerated Gibbs Sampling
  • Goal: Develop a sampler that mixes like the collapsed sampler, but has the speed of the uncollapsed sampler.
  • To do this, select a window W of observations and split the data into two groups: the rows inside the window, X_W, and the remaining rows, X_-W.
Accelerated Gibbs Sampling
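  • By splitting the data this way, the probabilities for Z inside the window can be written so that the data outside the window, X_-W, enters only through the posterior of A given that data.
  • We therefore don’t need to revisit X_-W when calculating likelihoods.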
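A minimal sketch of this idea for a window containing a single observation n: once the posterior of A given all other rows is in hand, the likelihood of x_n under any candidate z_n is a Gaussian predictive density, and no other row of X is needed. The function names, sizes, and noise levels are assumptions for illustration, and the IBP prior term m_-n,k/N that a full sampler would multiply in is left out.

```python
import numpy as np

def posterior_A(X, Z, sigma_x, sigma_a):
    """Posterior mean and shared column covariance of A given X and Z."""
    K = Z.shape[1]
    M = np.linalg.inv(Z.T @ Z + (sigma_x**2 / sigma_a**2) * np.eye(K))
    return M @ Z.T @ X, sigma_x**2 * M

def row_log_predictive(x_n, z_n, mean_A, cov_A, sigma_x):
    """log P(x_n | z_n, data outside the window), with A integrated out."""
    D = x_n.shape[0]
    var = z_n @ cov_A @ z_n + sigma_x**2  # same variance in every dimension
    resid = x_n - z_n @ mean_A
    return -0.5 * D * np.log(2 * np.pi * var) - resid @ resid / (2 * var)

# Made-up data; the window is the single row n = 0.
rng = np.random.default_rng(2)
Z = rng.integers(0, 2, size=(30, 3))
X = Z @ rng.normal(size=(3, 5)) + 0.5 * rng.normal(size=(30, 5))
sigma_x, sigma_a = 0.5, 1.0
n = 0
rest = np.arange(30) != n
mean_A, cov_A = posterior_A(X[rest], Z[rest], sigma_x, sigma_a)

# Likelihood part of the log-odds for Z[n, 0]; X[rest] is not touched again.
z_on, z_off = Z[n].copy(), Z[n].copy()
z_on[0], z_off[0] = 1, 0
log_odds = (row_log_predictive(X[n], z_on, mean_A, cov_A, sigma_x)
            - row_log_predictive(X[n], z_off, mean_A, cov_A, sigma_x))
```

The predictive density equals the ratio of collapsed likelihoods with and without row n, so the sampler still behaves like a collapsed sampler for that row while only doing work proportional to the window.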
Accelerated Gibbs Sampling

We can efficiently compute the posterior mean and covariance of A with the data in the window removed.

We can then efficiently update the mean and covariance to reflect all of the data, using the statistics for the window.

Rank-one updates can be used when updating the matrix inverse one observation at a time.
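One way to realize such updates is the Sherman-Morrison identity, sketched below under assumptions: the sampler is taken to track M = (ZᵀZ + λI)⁻¹ with λ = σ_x²/σ_a², together with h = ZᵀX, so that the posterior mean of A is M h and its covariance is σ_x² M. The helper names `remove_row` and `add_row` are hypothetical.

```python
import numpy as np

def remove_row(M, h, z, x):
    """Downdate M and h after deleting one observation (z, x)."""
    Mz = M @ z
    M_new = M + np.outer(Mz, Mz) / (1.0 - z @ Mz)
    return M_new, h - np.outer(z, x)

def add_row(M, h, z, x):
    """Update M and h after adding one observation (z, x)."""
    Mz = M @ z
    M_new = M - np.outer(Mz, Mz) / (1.0 + z @ Mz)
    return M_new, h + np.outer(z, x)

# Made-up data; lam stands in for sigma_x**2 / sigma_a**2.
rng = np.random.default_rng(3)
Z = rng.integers(0, 2, size=(40, 4)).astype(float)
X = rng.normal(size=(40, 6))
lam = 0.25
M = np.linalg.inv(Z.T @ Z + lam * np.eye(4))
h = Z.T @ X

M_minus, h_minus = remove_row(M, h, Z[0], X[0])  # statistics without row 0
M_back, h_back = add_row(M_minus, h_minus, Z[0], X[0])  # row 0 restored
mean_without_row0 = M_minus @ h_minus
```

Each call costs O(K²) rather than the O(K³) of a fresh inversion. Repeated updates accumulate floating-point error, so recomputing `np.linalg.inv` from scratch every so often is a reasonable safeguard.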

Experiments (Synthetic Data)

Data are generated from the linear-Gaussian model, with Z drawn from the IBP prior and D = 10.

Experiments (Real Data)
  • Experiments were run on several real datasets for 500 iterations or 150 hours, whichever came first. Data set information is below. Per-iteration run times and performance measures are at right.
Discussion and Conclusion
  • The accelerated Gibbs sampler achieved performance similar to that of the other two sampling methods (collapsed and uncollapsed Gibbs), but ran much faster.
  • Rank-one updates accumulate round-off error, so it is important to occasionally recompute the full matrix inverse.
  • The method is also faster than slice sampling, and it does not rely on proposals (as Metropolis-Hastings does) or particle counts (as particle filters do).
  • Efficient computations were obtained by collapsing locally on a window using posteriors calculated from data outside of this window.