Gaussian Processes
Li An, [email protected]


The Plan

  • Introduction to Gaussian Processes

  • Revisit Linear regression

    • Linear regression updated by Gaussian Processes

  • Gaussian Processes for Regression

  • Conclusion

Why GPs?

  • Here are some data points! What function did they come from?

    • I have no idea.

  • Oh. Okay. Uh, you think this point is likely in the function too?

    • I have no idea.

Why GPs?

  • You can’t get anywhere without making some assumptions

  • GPs are a nice way of expressing this ‘prior on functions’ idea.

  • Can do a bunch of cool stuff

    • Regression

    • Classification

    • Optimization

Gaussian

  • Unimodal

  • Concentrated

  • Easy to compute with

    • Sometimes

  • Tons of crazy properties

    Linear Regression Revisited

    • Linear regression model: a combination of M fixed basis functions φ(x) = (φ_1(x), ..., φ_M(x))^T, so that y(x) = w^T φ(x)

    • Prior distribution p(w) = N(w | 0, α^-1 I)

    • Given training data points x_1, ..., x_N, what is the joint distribution of y(x_1), ..., y(x_N)?

    • y is the vector with elements y_n = y(x_n); this vector is given by

      y = Φw

      where Φ is the design matrix with elements Φ_nk = φ_k(x_n)
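A minimal numpy sketch of this basis-function setup; the Gaussian basis functions, their centres, widths, and the data sizes below are illustrative choices, not taken from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# M = 5 fixed Gaussian basis functions phi_k(x) = exp(-(x - c_k)^2 / (2 s^2))
centers = np.linspace(-1.0, 1.0, 5)   # centres c_k (illustrative)
s = 0.3                               # shared width (illustrative)

def phi(x):
    """Design-matrix rows: phi(x)[n, k] = phi_k(x_n)."""
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * s**2))

x_train = rng.uniform(-1.0, 1.0, size=8)   # N = 8 training inputs
Phi = phi(x_train)                         # design matrix, shape (N, M)

w = rng.normal(size=centers.size)          # one weight vector
y = Phi @ w                                # y(x_n) = w^T phi(x_n) for every x_n
```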

    Linear Regression Revisited

    • Since y = Φw, y is a linear combination of Gaussian distributed variables given by the elements of w, hence y is itself Gaussian.

    • Find its mean and covariance: E[y] = Φ E[w] = 0 and cov[y] = E[y y^T] = Φ E[w w^T] Φ^T = α^-1 Φ Φ^T = K
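A quick sketch of the mean-and-covariance result under the weight prior w ~ N(0, α⁻¹I), checked by Monte Carlo; the value of α and the matrix sizes are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)
alpha = 2.0                            # weight-prior precision (illustrative)
Phi = rng.normal(size=(4, 3))          # any fixed design matrix, N = 4, M = 3

# E[y] = Phi E[w] = 0 and cov[y] = Phi E[w w^T] Phi^T = (1/alpha) Phi Phi^T
K = Phi @ Phi.T / alpha

# Monte-Carlo check: empirical covariance of y = Phi w over many prior draws
W = rng.normal(scale=alpha**-0.5, size=(3, 200_000))  # columns w ~ N(0, alpha^-1 I)
Y = Phi @ W
K_emp = Y @ Y.T / W.shape[1]
```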

    Definition of GP

    • A Gaussian process is defined as a probability distribution over functions y(x), such that the set of values of y(x) evaluated at an arbitrary set of points x_1, ..., x_n jointly has a Gaussian distribution.

      • Probability distribution indexed by an arbitrary set

      • Any finite subset of indices defines a multivariate Gaussian distribution

    • Input space X, for each x the distribution is a Gaussian, what determines the GP is

      • The mean function µ(x) = E(y(x))

      • The covariance function (kernel) k(x,x')=E(y(x)y(x'))

      • In most applications, we take µ(x)=0. Hence the prior is represented by the kernel.

    Linear regression updated by GP

    • Specific case of a Gaussian Process

    • It is defined by the linear regression model y(x) = w^T φ(x)

      with a weight prior p(w) = N(w | 0, α^-1 I);

      the kernel function is given by k(x, x') = α^-1 φ(x)^T φ(x')

    Kernel function

    • We can also define the kernel function directly.

    • The figure shows samples of functions drawn from Gaussian processes for two different choices of kernel function
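Such samples can be drawn as follows; the squared-exponential kernel and the two length scales are illustrative stand-ins, since the slides do not say which kernels the original figure used:

```python
import numpy as np

rng = np.random.default_rng(2)

def rbf(xa, xb, length_scale):
    """Squared-exponential kernel k(x, x') = exp(-(x - x')^2 / (2 l^2))."""
    return np.exp(-((xa[:, None] - xb[None, :]) ** 2) / (2 * length_scale**2))

x = np.linspace(0.0, 1.0, 100)
samples = {}
for ell in (0.05, 0.3):                         # two length scales (illustrative)
    K = rbf(x, x, ell) + 1e-8 * np.eye(x.size)  # jitter keeps K numerically PD
    L = np.linalg.cholesky(K)                   # K = L L^T
    samples[ell] = L @ rng.normal(size=x.size)  # one draw from N(0, K)
```

Shorter length scales give wigglier sample functions, which is the qualitative contrast such figures usually illustrate.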

    GP for Regression

    Taking account of the noise on the observed target values, which are given by

      t_n = y(x_n) + ε_n, with Gaussian noise p(t_n | y_n) = N(t_n | y_n, β^-1)

    GP for regression

    • From the definition of GP, the marginal distribution p(y) is given by p(y) = N(y | 0, K)

    • The marginal distribution of t is given by p(t) = ∫ p(t | y) p(y) dy = N(t | 0, C)

    • where the covariance matrix C has elements C(x_n, x_m) = k(x_n, x_m) + β^-1 δ_nm

    GP for Regression

    • The sampling of data points t

    GP for Regression

    • We’ve used GP to build a model of the joint distribution over sets of data points

    • Goal: predict the target t_{N+1} for a new input x_{N+1}, i.e. find p(t_{N+1} | t)

    • To find p(t_{N+1} | t), we begin by writing down the joint distribution p(t_{N+1}) = N(t_{N+1} | 0, C_{N+1}), where

      C_{N+1} = [ C_N  k ; k^T  c ]

      with k_n = k(x_n, x_{N+1}) and c = k(x_{N+1}, x_{N+1}) + β^-1

    GP for Regression

    • The conditional distribution p(t_{N+1} | t) is a Gaussian distribution with mean and covariance given by

      m(x_{N+1}) = k^T C_N^-1 t

      σ²(x_{N+1}) = c - k^T C_N^-1 k

    • These are the key results that define Gaussian process regression.

    • The predictive distribution is a Gaussian whose mean and variance both depend on the new input x_{N+1}
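These key results can be sketched in numpy; the kernel, the noise precision β, and the sine target function are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
beta = 25.0                                   # noise precision (illustrative)

def rbf(xa, xb):
    return np.exp(-((xa[:, None] - xb[None, :]) ** 2) / 0.02)

# Noisy targets from an illustrative underlying function
x_train = rng.uniform(0.0, 1.0, 20)
t = np.sin(2 * np.pi * x_train) + rng.normal(scale=beta**-0.5, size=20)

C = rbf(x_train, x_train) + np.eye(20) / beta   # C_N = K + beta^-1 I

x_star = np.array([0.5])                        # new input x_{N+1}
k = rbf(x_train, x_star)[:, 0]                  # k_n = k(x_n, x_star)
c = 1.0 + 1.0 / beta                            # c = k(x*, x*) + beta^-1

mean = k @ np.linalg.solve(C, t)                # m(x*)   = k^T C_N^-1 t
var = c - k @ np.linalg.solve(C, k)             # s^2(x*) = c - k^T C_N^-1 k
```

Both quantities depend on the new input x* only through the vector k and the scalar c, exactly as the slide states.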

    GP for Regression

    • The only restriction on the kernel is that the covariance matrix given by

      C(x_n, x_m) = k(x_n, x_m) + β^-1 δ_nm

      must be positive definite.

    • GP regression involves a matrix of size n × n, which requires O(n³) computations.
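In practice the n × n system is usually handled with a single Cholesky factorization that is then reused for every right-hand side; a sketch using scipy, with an arbitrary kernel and noise level:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(5)
n = 200
x = rng.normal(size=(n, 1))
K = np.exp(-((x - x.T) ** 2) / 2.0)          # squared-exponential Gram matrix
C = K + 0.1 * np.eye(n)                      # C = K + beta^-1 I with beta = 10

# The O(n^3) factorization is done once ...
factor = cho_factor(C)

# ... and each subsequent solve costs only O(n^2) via triangular solves
t = rng.normal(size=n)
a = cho_solve(factor, t)                     # a = C^-1 t, reused for predictions
```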

    Conclusion

    • Distribution over functions

    • Jointly have a Gaussian distribution

    • Index set can be pretty much whatever

      • Reals

      • Real vectors

      • Graphs

      • Strings

    • Most interesting structure is in k(x,x’), the ‘kernel.’

    • Used for regression to predict the target for a new input

    Questions

    • Thank you!