Efficient Gaussian Process Regression for large Data Sets


Presentation Transcript


  1. Efficient Gaussian Process Regression for large Data Sets ANJISHNU BANERJEE, DAVID DUNSON, SURYA TOKDAR Biometrika, 2013

  2. Introduction We have noisy observations y_i = f(x_i) + epsilon_i, i = 1, ..., n, from the unknown function f, observed at locations x_1, ..., x_n respectively. The unknown function f is assumed to be a realization of a GP. The prediction for a new input x is E{f(x) | y} = K_{x,f} (K_{f,f} + sigma^2 I)^{-1} y, where K_{f,f} is the n x n covariance matrix. Problem: the necessary matrix inversions cost O(n^3), with n denoting the number of data points.
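The O(n^3) bottleneck is easy to see in code. Below is a minimal sketch of exact GP prediction (not the paper's code; the kernel, noise level, and all names are illustrative): the Cholesky factorization of the n x n matrix K_{f,f} + sigma^2 I dominates the cost.

    import numpy as np

    def rbf_kernel(A, B, length_scale=1.0):
        # squared-exponential covariance between rows of A and rows of B
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / length_scale ** 2)

    def gp_predict(X, y, X_new, sigma2=0.1):
        K_ff = rbf_kernel(X, X)                                  # n x n
        K_xf = rbf_kernel(X_new, X)                              # n* x n
        L = np.linalg.cholesky(K_ff + sigma2 * np.eye(len(X)))   # the O(n^3) step
        alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
        return K_xf @ alpha                                      # posterior mean

    rng = np.random.default_rng(0)
    X = rng.uniform(0, 1, (200, 1))
    y = np.sin(6 * X[:, 0]) + 0.1 * rng.standard_normal(200)
    print(gp_predict(X, y, np.array([[0.5]])))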

  3. Key idea • Existing solutions are "knot"- or "landmark"-based • Determining the number, location and spacing of the knots is difficult, and the choice has a substantial impact on performance • Methods have been proposed for allowing uncertain numbers and locations of knots in the predictive process using reversible jump MCMC • Unfortunately, such free-knot methods increase the computational burden substantially, partially eliminating the computational savings of a low-rank method • Motivated by the literature on compressive sensing, the authors propose an alternative: a random projection of all the data points onto a lower-dimensional subspace

  4. Nystrom Approximation (Williams and Seeger, 2000) Choose a random subset of m << n training points and use it to approximate the eigendecomposition of the full n x n Gram matrix.

  5. Nystrom Approximation (Williams and Seeger, 2000) The resulting reduced-rank approximation is K_{f,f} ≈ K_{n,m} K_{m,m}^{-1} K_{m,n}, where K_{n,m} is the covariance between all n points and the m selected points; with the Woodbury identity, the required inversions then cost O(nm^2) rather than O(n^3).
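A minimal sketch of the matrix-level Nystrom approximation under the formula above (landmark indices chosen uniformly at random; the jitter term and all names are my choices):

    import numpy as np

    def nystrom(K, idx, jitter=1e-10):
        # idx indexes the m landmark rows/columns of the full kernel matrix K
        K_nm = K[:, idx]                                   # n x m
        K_mm = K[np.ix_(idx, idx)]                         # m x m
        K_mm_inv = np.linalg.pinv(K_mm + jitter * np.eye(len(idx)))
        return K_nm @ K_mm_inv @ K_nm.T                    # rank-m approximation

    rng = np.random.default_rng(0)
    X = rng.uniform(0, 1, (500, 1))
    K = np.exp(-0.5 * (X - X.T) ** 2 / 0.1 ** 2)           # squared-exponential
    idx = rng.choice(500, size=50, replace=False)          # m = 50 landmarks
    K_hat = nystrom(K, idx)
    print(np.linalg.norm(K - K_hat) / np.linalg.norm(K))   # relative error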

  6. Landmark-based Method • Let X* = {x*_1, ..., x*_m} be a set of m knots, and let f* = f(X*). Defining f~(x) = K_{x,*} K_{*,*}^{-1} f*, we obtain an approximation f~ to f (the predictive process).
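The same idea at the level of the process, as a hedged sketch: condition f on its values at the knots (the squared-exponential kernel, jitter, and names are illustrative, not the paper's choices).

    import numpy as np

    def k(A, B, ls=0.2):
        # squared-exponential covariance between rows of A and rows of B
        return np.exp(-0.5 * ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1) / ls ** 2)

    def predictive_process_mean(X_star, f_star, X_new, jitter=1e-8):
        # f~(x) = K_{x,*} K_{*,*}^{-1} f*
        K_ss = k(X_star, X_star) + jitter * np.eye(len(X_star))
        return k(X_new, X_star) @ np.linalg.solve(K_ss, f_star)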

  7. Random Projection Method • The key idea of the random projection is to use f_Φ = Φf instead of f* = f(X*), where Φ is an m x n random projection matrix with m << n. Let f~_Φ be the random projection approximation to f, obtained by conditioning on Φf: f~_Φ(x) = K_{x,f} Φ^T (Φ K_{f,f} Φ^T)^{-1} Φf.
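A hedged sketch of the induced low-rank covariance K_{f,f} Φ^T (Φ K_{f,f} Φ^T)^{-1} Φ K_{f,f}, using a Gaussian Φ as one possible choice (the jitter term and all names are mine):

    import numpy as np

    def projected_covariance(K, Phi, jitter=1e-10):
        B = Phi @ K                                            # m x n
        M = Phi @ K @ Phi.T + jitter * np.eye(Phi.shape[0])    # m x m
        return B.T @ np.linalg.solve(M, B)   # K Phi^T (Phi K Phi^T)^{-1} Phi K

    rng = np.random.default_rng(0)
    n, m = 400, 40
    X = rng.uniform(0, 1, (n, 1))
    K = np.exp(-0.5 * (X - X.T) ** 2 / 0.2 ** 2)
    Phi = rng.standard_normal((m, n)) / np.sqrt(m)             # random projection
    print(np.linalg.norm(K - projected_covariance(K, Phi)) / np.linalg.norm(K))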

  8. Some properties • When m = n, Φ is almost surely of full rank, so K_{f,f} Φ^T (Φ K_{f,f} Φ^T)^{-1} Φ K_{f,f} = K_{f,f}, and we get back the original process with a full-rank random projection • Relation to Nystrom approximation: approximations in the machine learning literature were viewed as reduced-rank approximations to the covariance matrices • It is easy to see that choosing Φ to be m rows of the n x n identity matrix (i.e., selecting m of the data points) corresponds to a Nystrom approximation to K_{f,f}
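A quick numerical check of the m = n property (my toy example; a generic well-conditioned covariance matrix is used only to keep the linear solve stable):

    import numpy as np

    rng = np.random.default_rng(1)
    n = 50
    A = rng.standard_normal((n, n))
    K = A @ A.T / n + np.eye(n)              # generic well-conditioned covariance
    Phi = rng.standard_normal((n, n))        # square Phi, full rank almost surely
    B = Phi @ K
    K_rec = B.T @ np.linalg.solve(Phi @ K @ Phi.T, B)
    print(np.allclose(K_rec, K))             # True: the original K is recovered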

  9. Choice of Φ

  10. Low-distortion Embeddings • Embed the matrix K into an m-dimensional space using a random projection matrix • Embeddings with low-distortion properties have been well studied, and Johnson-Lindenstrauss (J-L) transforms are among the most popular
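An illustrative J-L check (my example, not the paper's): a Gaussian map from R^n to R^m with m on the order of log(N)/eps^2 nearly preserves the norms of N fixed points.

    import numpy as np

    rng = np.random.default_rng(0)
    n, m, N = 2000, 200, 50
    V = rng.standard_normal((N, n))                  # N points in R^n
    Phi = rng.standard_normal((m, n)) / np.sqrt(m)   # J-L map, E||Phi v||^2 = ||v||^2
    ratios = np.linalg.norm(V @ Phi.T, axis=1) / np.linalg.norm(V, axis=1)
    print(ratios.min(), ratios.max())                # both close to 1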

  11. Given a fixed rank m, what is the near-optimal projection for that rank m?
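Assuming the slide follows the randomized range-finding ideas of Halko, Martinsson and Tropp (2011), one standard answer looks like the sketch below: oversample, run a few power iterations, then truncate to rank m (parameters and names are my choices).

    import numpy as np

    def near_optimal_projection(K, m, oversample=10, n_power=2, seed=0):
        rng = np.random.default_rng(seed)
        Y = K @ rng.standard_normal((K.shape[1], m + oversample))
        Q, _ = np.linalg.qr(Y)
        for _ in range(n_power):               # power iterations sharpen the basis
            Q, _ = np.linalg.qr(K.T @ Q)
            Q, _ = np.linalg.qr(K @ Q)
        # project K into the sketched subspace and keep its top-m directions
        U, _, _ = np.linalg.svd(Q.T @ K, full_matrices=False)
        return (Q @ U)[:, :m]                  # its transpose serves as Phi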

  12. Finding the range for a given target error condition
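Again assuming the Halko et al. framework, a hedged sketch of an adaptive range finder: grow the basis in blocks until a probe-based estimate of ||K - Q Q^T K|| falls below a target tolerance (block size and probe count are arbitrary choices).

    import numpy as np

    def adaptive_range(K, tol, block=10, n_probe=10, seed=0):
        rng = np.random.default_rng(seed)
        n = K.shape[1]
        Q = np.zeros((K.shape[0], 0))
        while Q.shape[1] < n:
            R = K @ rng.standard_normal((n, n_probe))
            R -= Q @ (Q.T @ R)                       # residuals of random probes
            # high-probability bound on ||K - Q Q^T K|| (Halko et al., 2011)
            if 10 * np.sqrt(2 / np.pi) * np.linalg.norm(R, axis=0).max() < tol:
                break
            Y = K @ rng.standard_normal((n, block))
            Y -= Q @ (Q.T @ Y)                       # deflate the current basis
            Q = np.linalg.qr(np.hstack([Q, np.linalg.qr(Y)[0]]))[0]
        return Q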

  13. Results • Condition numbers • The full covariance matrix of a smooth GP tracked at a dense set of locations will be ill-conditioned and nearly rank-deficient in practice • Its inverses may be highly unstable and severely degrade inference
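A small illustration of the conditioning point (grid, length-scale, and sizes are arbitrary): the dense squared-exponential covariance matrix is numerically rank-deficient, while the m x m matrix Φ K Φ^T that the projection method actually inverts is far better behaved.

    import numpy as np

    rng = np.random.default_rng(0)
    n, m = 500, 10
    x = np.linspace(0, 1, n)[:, None]
    K = np.exp(-0.5 * (x - x.T) ** 2 / 0.1 ** 2)     # dense smooth-GP covariance
    Phi = rng.standard_normal((m, n)) / np.sqrt(m)
    print(np.linalg.cond(K))                         # astronomically large
    print(np.linalg.cond(Phi @ K @ Phi.T))           # orders of magnitude smaller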

  14. Results • Parameter estimation (toy data)

  15. Results • Parameter estimation (real data)
