Kernel Methods – Gaussian Processes

Presented by Shankar Bhargav

Arizona State University DMML


Gaussian Processes

  • Extending the role of kernels to probabilistic discriminative models leads to the framework of Gaussian processes

  • Linear regression model

    • Evaluate the posterior distribution over the weights w

  • Gaussian processes: define a probability distribution over functions directly

Linear regression

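Model: $y(\mathbf{x}) = \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x})$

x - input vector

w - M-dimensional weight vector

φ(x) - vector of M basis functions evaluated at x

The prior distribution over w is a zero-mean Gaussian.

The prior distribution over w induces a probability distribution over the function y(x).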

Linear regression

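y is a linear combination of the Gaussian-distributed elements of w, and is therefore itself Gaussian:

$\mathbf{y} = \Phi \mathbf{w}$

where $\mathbf{y} = (y(\mathbf{x}_1), \ldots, y(\mathbf{x}_N))^T$ and $\Phi$ is the design matrix with elements $\Phi_{nk} = \phi_k(\mathbf{x}_n)$.

We need only the mean and covariance to specify the joint distribution of y:

$E[\mathbf{y}] = \mathbf{0}, \quad \mathrm{cov}[\mathbf{y}] = K$

where K is the Gram matrix with elements $K_{nm} = k(\mathbf{x}_n, \mathbf{x}_m)$.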
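
The link between the weight prior and the Gram matrix can be checked numerically. The sketch below is only an illustration under stated assumptions: an isotropic prior p(w) = N(w | 0, α⁻¹ I) and Gaussian basis functions. Neither the precision α nor the choice of basis is given in the slides. Under that prior the Gram matrix is K = Φ Φᵀ / α, and the sample covariance of y = Φ w over many draws of w should approach it.

```python
import numpy as np

def design_matrix(x, centers, width=0.5):
    """Return Phi with Phi[n, k] = phi_k(x_n), using Gaussian bumps as basis functions."""
    return np.exp(-0.5 * ((x[:, None] - centers[None, :]) / width) ** 2)

rng = np.random.default_rng(0)
alpha = 2.0                                  # assumed prior precision of w
x = np.linspace(-1.0, 1.0, 5)                # N = 5 input points
centers = np.linspace(-1.0, 1.0, 8)          # M = 8 basis functions (illustrative)

Phi = design_matrix(x, centers)              # shape (N, M)
K = Phi @ Phi.T / alpha                      # Gram matrix implied by the assumed prior

# Monte Carlo check: draw many weight vectors, map each to y = Phi w,
# and compare the sample covariance of y (mean is zero) with K.
W = rng.normal(scale=alpha ** -0.5, size=(200_000, centers.size))
Y = W @ Phi.T                                # each row is one draw of y
K_empirical = Y.T @ Y / Y.shape[0]
max_abs_error = np.max(np.abs(K_empirical - K))
```

The point of the comparison is that the distribution of y depends on the basis functions only through K, which is what allows the kernel to be specified directly.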

Gaussian Processes

  • Definition: a Gaussian process is a probability distribution over functions y(x) such that the values of y(x) evaluated at an arbitrary set of points $\mathbf{x}_1, \ldots, \mathbf{x}_N$ jointly have a Gaussian distribution

    • Mean is assumed zero

    • Covariance of y(x) evaluated at any two values of x is given by the kernel function: $E[y(\mathbf{x}_n)\, y(\mathbf{x}_m)] = k(\mathbf{x}_n, \mathbf{x}_m)$
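
Because only the kernel is needed, functions can be sampled from a Gaussian process prior without any weight vector. The following is a minimal sketch; the squared-exponential kernel, its length scale, and the small jitter added for numerical stability are assumptions chosen for illustration, not taken from the slides.

```python
import numpy as np

def squared_exponential(xa, xb, length_scale=0.3, variance=1.0):
    """Illustrative kernel: k(x, x') = variance * exp(-(x - x')^2 / (2 * length_scale^2))."""
    d = xa[:, None] - xb[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 50)               # points at which y(x) is evaluated
K = squared_exponential(x, x)                # covariance of (y(x_1), ..., y(x_N))

# If z ~ N(0, I) and L L^T = K, then L z ~ N(0, K).
# The jitter keeps the Cholesky factorization numerically stable.
L = np.linalg.cholesky(K + 1e-6 * np.eye(x.size))
samples = L @ rng.standard_normal((x.size, 3))   # three sample functions, one per column
```

Each column of `samples` is one function drawn from the zero-mean prior, evaluated on the grid `x`.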

Gaussian Processes for regression

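To apply Gaussian process models to regression we need to take account of the noise on the observed target values:

$t_n = y_n + \epsilon_n$, where $y_n = y(\mathbf{x}_n)$

Consider noise processes with a Gaussian distribution:

$p(\mathbf{t} \mid \mathbf{y}) = \mathcal{N}(\mathbf{t} \mid \mathbf{y}, \beta^{-1} I_N)$

with

$p(\mathbf{y}) = \mathcal{N}(\mathbf{y} \mid \mathbf{0}, K)$

where β is the precision of the noise. To find the marginal distribution over t we need to integrate over y:

$p(\mathbf{t}) = \int p(\mathbf{t} \mid \mathbf{y})\, p(\mathbf{y})\, d\mathbf{y} = \mathcal{N}(\mathbf{t} \mid \mathbf{0}, C)$

where the covariance matrix C has elements

$C_{nm} = k(\mathbf{x}_n, \mathbf{x}_m) + \beta^{-1} \delta_{nm}$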

Gaussian Processes for regression

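Joint distribution over $t_1, \ldots, t_{N+1}$ is given by

$p(\mathbf{t}_{N+1}) = \mathcal{N}(\mathbf{t}_{N+1} \mid \mathbf{0}, C_{N+1})$, with $C_{N+1} = \begin{pmatrix} C_N & \mathbf{k} \\ \mathbf{k}^T & c \end{pmatrix}$

Conditional distribution of $t_{N+1}$ given $\mathbf{t} = (t_1, \ldots, t_N)^T$ is a Gaussian distribution with mean and covariance given by

$m(\mathbf{x}_{N+1}) = \mathbf{k}^T C_N^{-1} \mathbf{t}$

$\sigma^2(\mathbf{x}_{N+1}) = c - \mathbf{k}^T C_N^{-1} \mathbf{k}$

where $\mathbf{k}$ is the vector with elements $k(\mathbf{x}_n, \mathbf{x}_{N+1})$, c is the prior variance of $t_{N+1}$ (the kernel value $k(\mathbf{x}_{N+1}, \mathbf{x}_{N+1})$ plus the noise variance), and $C_N$ is the N×N covariance matrix.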
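
The predictive mean and variance can be sketched in a few lines. This is an illustrative implementation under assumptions: a squared-exponential kernel, a fixed noise variance `noise_var`, and a hypothetical helper name `gp_predict`; none of these come from the slides. The Cholesky factor of C_N is used in place of an explicit inverse.

```python
import numpy as np

def squared_exponential(xa, xb, length_scale=0.5, variance=1.0):
    """Illustrative kernel choice (an assumption, not specified in the slides)."""
    d = xa[:, None] - xb[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_predict(x_train, t_train, x_test, noise_var=0.01):
    """Predictive mean and variance of t at each test input."""
    C = squared_exponential(x_train, x_train) + noise_var * np.eye(x_train.size)
    k = squared_exponential(x_train, x_test)                 # column j is k for test point j
    c = squared_exponential(x_test, x_test).diagonal() + noise_var
    L = np.linalg.cholesky(C)                                # C = L L^T
    a = np.linalg.solve(L.T, np.linalg.solve(L, t_train))    # a = C^{-1} t
    v = np.linalg.solve(L, k)                                # v^T v = k^T C^{-1} k
    mean = k.T @ a                                           # k^T C^{-1} t
    var = c - np.sum(v * v, axis=0)                          # c - k^T C^{-1} k
    return mean, var

x_train = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
t_train = np.sin(np.pi * x_train)
x_test = np.array([0.0, 0.25, 5.0])
mean, var = gp_predict(x_train, t_train, x_test)
```

Far from the training data (here x = 5) the vector k is close to zero, so the prediction falls back to the prior: mean near zero and variance near the kernel variance plus the noise variance.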

Learning the hyperparameters

  • Rather than fixing the covariance function, we can use a parametric family of functions and then infer the parameter values from the data

  • This requires evaluating the likelihood function $p(\mathbf{t} \mid \theta)$, where θ denotes the hyperparameters of the Gaussian process model

  • The simplest approach is to make a point estimate of θ by maximizing the log likelihood function
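
For a Gaussian process regression model the log likelihood has the closed form ln p(t | θ) = −½ ln|C_N| − ½ tᵀ C_N⁻¹ t − (N/2) ln 2π. The sketch below evaluates it and picks a point estimate by a simple grid search. It assumes a squared-exponential kernel whose only free hyperparameter is the length scale, with the noise variance held fixed; a gradient-based optimizer would be the more usual choice in practice.

```python
import numpy as np

def squared_exponential(xa, xb, length_scale, variance=1.0):
    """Illustrative kernel family, parameterized by its length scale."""
    d = xa[:, None] - xb[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def log_marginal_likelihood(x, t, length_scale, noise_var=0.01):
    """ln p(t | theta) for zero-mean GP regression with Gaussian noise."""
    C = squared_exponential(x, x, length_scale) + noise_var * np.eye(x.size)
    L = np.linalg.cholesky(C)
    a = np.linalg.solve(L.T, np.linalg.solve(L, t))    # C^{-1} t
    half_log_det = np.sum(np.log(np.diag(L)))          # equals 0.5 * ln|C|
    return -half_log_det - 0.5 * t @ a - 0.5 * x.size * np.log(2.0 * np.pi)

rng = np.random.default_rng(2)
x = np.linspace(-1.0, 1.0, 20)
t = np.sin(np.pi * x) + 0.1 * rng.standard_normal(x.size)   # synthetic data

grid = np.logspace(-2, 1, 30)                                # candidate length scales
scores = np.array([log_marginal_likelihood(x, t, ls) for ls in grid])
best_length_scale = grid[np.argmax(scores)]
```

Very short length scales explain the data as noise and very long ones cannot follow the sine curve, so the maximum lies strictly inside the grid.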

Gaussian Process for classification

  • We can adapt Gaussian processes to classification problems by transforming the output using an appropriate nonlinear activation function

    • Define a Gaussian process over a function a(x) and transform it using the logistic sigmoid function, $y = \sigma(a)$; this gives a non-Gaussian stochastic process over functions $y(\mathbf{x}) \in (0, 1)$


Figure: the left plot shows a sample from the Gaussian process prior over functions a(x) on a one-dimensional input space. The right plot shows the result of transforming this sample using a logistic sigmoid function.

The probability distribution over the target variable t is given by the Bernoulli distribution

$p(t \mid a) = \sigma(a)^t (1 - \sigma(a))^{1 - t}$
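
The construction described in the figure can be reproduced as a sketch: draw a(x) from a Gaussian process prior, squash it with the logistic sigmoid, and draw binary targets from the resulting Bernoulli probabilities. The squared-exponential kernel and its parameters are assumptions for illustration only.

```python
import numpy as np

def sigmoid(a):
    """Logistic sigmoid, mapping the real line to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-a))

def squared_exponential(xa, xb, length_scale=0.3, variance=4.0):
    """Illustrative kernel choice (an assumption, not specified in the slides)."""
    d = xa[:, None] - xb[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

rng = np.random.default_rng(3)
x = np.linspace(-1.0, 1.0, 100)                  # one-dimensional input space
K = squared_exponential(x, x)
L = np.linalg.cholesky(K + 1e-6 * np.eye(x.size))

a = L @ rng.standard_normal(x.size)              # sample from the GP prior over a(x)
y = sigmoid(a)                                   # transformed sample, p(t = 1 | a)
t = rng.binomial(1, y)                           # Bernoulli targets, one per input
```

Here `a` corresponds to the left plot, `y` to the right plot, and `t` is one possible set of class labels generated by the model.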

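Gaussian Process for classification

  • To determine the predictive distribution $p(t_{N+1} \mid \mathbf{t}_N)$

    we introduce a Gaussian process prior over the vector $\mathbf{a}_{N+1} = (a(\mathbf{x}_1), \ldots, a(\mathbf{x}_{N+1}))^T$. The Gaussian prior takes the form

    $p(\mathbf{a}_{N+1}) = \mathcal{N}(\mathbf{a}_{N+1} \mid \mathbf{0}, C_{N+1})$

    The predictive distribution is given by

    $p(t_{N+1} = 1 \mid \mathbf{t}_N) = \int p(t_{N+1} = 1 \mid a_{N+1})\, p(a_{N+1} \mid \mathbf{t}_N)\, da_{N+1}$

    where $p(t_{N+1} = 1 \mid a_{N+1}) = \sigma(a_{N+1})$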

Gaussian Process for classification

  • The integral is analytically intractable, so it may be approximated using sampling methods

  • Alternatively, techniques based on analytical approximation can be used

    • Variational Inference

    • Expectation propagation

    • Laplace approximation
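
As an illustration of the last option, here is a minimal sketch of the Laplace approximation for binary Gaussian process classification. It follows the standard textbook recipe: Newton iterations locate the mode of the posterior over a_N, a Gaussian is fitted at that mode, and the sigmoid-Gaussian integral is approximated with the usual probit-based rescaling. The kernel, the jitter, the iteration count, and the helper name `laplace_gp_classify` are assumptions made for this example.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def squared_exponential(xa, xb, length_scale=0.5, variance=4.0):
    """Illustrative kernel choice (an assumption, not specified in the slides)."""
    d = xa[:, None] - xb[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def laplace_gp_classify(x_train, t_train, x_test, n_iter=20, jitter=1e-6):
    """Approximate p(t = 1 | x_test, data) with a Laplace approximation."""
    n = x_train.size
    C = squared_exponential(x_train, x_train) + jitter * np.eye(n)
    a = np.zeros(n)
    for _ in range(n_iter):
        # Newton update for the posterior mode: a <- C (I + W C)^{-1} (t - sigma + W a)
        s = sigmoid(a)
        W = np.diag(s * (1.0 - s))
        a = C @ np.linalg.solve(np.eye(n) + W @ C, t_train - s + W @ a)
    s = sigmoid(a)
    w = s * (1.0 - s)
    k = squared_exponential(x_train, x_test)
    c = squared_exponential(x_test, x_test).diagonal() + jitter
    mean = k.T @ (t_train - s)                                   # mean of a at the test inputs
    var = c - np.sum(k * np.linalg.solve(np.diag(1.0 / w) + C, k), axis=0)
    kappa = 1.0 / np.sqrt(1.0 + np.pi * var / 8.0)               # probit-based rescaling
    return sigmoid(kappa * mean)

x_train = np.array([-1.0, -0.8, -0.6, 0.6, 0.8, 1.0])
t_train = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
x_test = np.array([-0.9, 0.9, 0.0])
p = laplace_gp_classify(x_train, t_train, x_test)
```

In this toy problem the two classes are mirror images of each other, so the predicted probability at x = 0 is 0.5, while test points inside each cluster are pulled towards the label of that cluster.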


Illustration of Gaussian process for classification

Optimal decision boundary: green

Decision boundary from the Gaussian process classifier: black

Connection to Neural Networks

  • For a broad class of prior distributions over w, the distribution of functions generated by a neural network will tend to a Gaussian process as the number of hidden units M → ∞

  • In this Gaussian process limit the output variables of the neural network become independent.
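
One consequence of this limit can be checked empirically: the output of a random network at a fixed input should look more Gaussian as M grows. The sketch below is a rough illustration, not a proof. It assumes a one-hidden-layer tanh network with standard normal input weights and biases and with output weights scaled by 1/√M, and it uses excess kurtosis (zero for a Gaussian) as the measure of non-Gaussianity. It looks only at the marginal distribution at a single input.

```python
import numpy as np

def network_outputs(n_hidden, n_networks, x=1.0, seed=0):
    """Outputs at input x of n_networks random one-hidden-layer tanh networks."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal((n_networks, n_hidden))                     # input weights
    b = rng.standard_normal((n_networks, n_hidden))                     # hidden biases
    v = rng.standard_normal((n_networks, n_hidden)) / np.sqrt(n_hidden) # scaled output weights
    return np.sum(v * np.tanh(u * x + b), axis=1)

def excess_kurtosis(samples):
    """Fourth standardized moment minus 3; zero for a Gaussian."""
    z = samples - samples.mean()
    return np.mean(z ** 4) / np.mean(z ** 2) ** 2 - 3.0

narrow = excess_kurtosis(network_outputs(n_hidden=1, n_networks=20_000))
wide = excess_kurtosis(network_outputs(n_hidden=100, n_networks=20_000))
```

With a single hidden unit the output distribution is clearly heavy-tailed (`narrow` is well above zero), whereas with 100 hidden units `wide` is close to zero, consistent with the Gaussian limit.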

Thank you
