Kernel methods - overview

  • Kernel smoothers

  • Local regression

  • Kernel density estimation

  • Radial basis functions

Data Mining and Statistical Learning - 2008


Introduction

Kernel methods are regression techniques used to estimate a response function from noisy data

Properties:

  • Different models are fitted at each query point, and only those observations close to that point are used to fit the model

  • The resulting function is smooth

  • The models require only a minimum of training

A simple one-dimensional kernel smoother

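[The smoother's formula and the definitions of its terms appeared as images on this slide and are not preserved in the transcript.]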

Kernel methods, splines and ordinary least squares regression (OLS)

  • OLS: A single model is fitted to all data

  • Splines: Different models are fitted to different subintervals (cuboids) of the input domain

  • Kernel methods: Different models are fitted at each query point

Kernel-weighted averages and moving averages

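The Nadaraya-Watson kernel-weighted average

$$\hat f(x_0) = \frac{\sum_{i=1}^N K_\lambda(x_0, x_i)\, y_i}{\sum_{i=1}^N K_\lambda(x_0, x_i)}, \qquad K_\lambda(x_0, x) = D\left(\frac{|x - x_0|}{\lambda}\right)$$

where λ indicates the window size and the function D shows how the weights change with distance within this window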

The estimated function is smooth!

K-nearest neighbours

The estimated function is piecewise constant!
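
As a rough illustration of the two estimators on this slide, the following Python sketch computes a kernel-weighted average and a k-nearest-neighbour average at a single query point. The Epanechnikov form of D, the function names and the toy data are assumptions made for the example; they do not come from the slides.

```python
def epanechnikov(t):
    """Assumed weight function D(t) = 3/4 (1 - t^2) for |t| <= 1, else 0."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0


def nadaraya_watson(x0, xs, ys, lam, D=epanechnikov):
    """Kernel-weighted average of ys at x0 with window size lam."""
    weights = [D(abs(x - x0) / lam) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no observations inside the window around x0")
    return sum(w * y for w, y in zip(weights, ys)) / total


def knn_average(x0, xs, ys, k):
    """Unweighted mean of the responses of the k points nearest to x0."""
    nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - x0))[:k]
    return sum(y for _, y in nearest) / k


# Toy data (hypothetical): y = x^2 on a small grid.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 1.0, 4.0, 9.0, 16.0]
print(nadaraya_watson(2.0, xs, ys, lam=1.5))
print(knn_average(2.0, xs, ys, k=3))
```

The kernel version changes continuously as x0 moves because the weights fade out smoothly, while the nearest-neighbour version jumps whenever the set of k neighbours changes.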

Examples of one-dimensional kernel smoothers

  • Epanechnikov kernel

  • Tri-cube kernel
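
For reference, the two kernels named on this slide are usually written as weight functions D(t) of the scaled distance t = |x - x0| / λ. The sketch below uses the standard textbook definitions; the function names are chosen for the example only.

```python
def epanechnikov(t):
    """D(t) = 3/4 (1 - t^2) for |t| <= 1, and 0 otherwise."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0


def tricube(t):
    """D(t) = (1 - |t|^3)^3 for |t| <= 1, and 0 otherwise."""
    return (1.0 - abs(t) ** 3) ** 3 if abs(t) <= 1.0 else 0.0


# Compare the weights at a few scaled distances.
for t in (0.0, 0.5, 1.0, 1.5):
    print(t, epanechnikov(t), tricube(t))
```

Both kernels have compact support, so observations further than λ from the query point receive zero weight. The tri-cube kernel is flatter near t = 0 and goes to zero more smoothly at the edge of the window.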

Issues in kernel smoothing

  • The smoothing parameter λ has to be defined

  • When there are ties at x_i: compute the average y value and introduce weights representing the number of tied points

  • Boundary issues

  • Varying density of observations:

    • bias is constant

    • the variance is inversely proportional to the density

Boundary effects of one-dimensional kernel smoothers

Locally weighted averages can be badly biased at the boundaries if the response function has a significant slope there. Remedy: apply local linear regression.

Local linear regression

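Find the intercept and slope parameters solving

$$\min_{\alpha(x_0),\,\beta(x_0)} \sum_{i=1}^N K_\lambda(x_0, x_i)\,\bigl[y_i - \alpha(x_0) - \beta(x_0)\, x_i\bigr]^2$$

The solution is a linear combination of the y_i:

$$\hat f(x_0) = \hat\alpha(x_0) + \hat\beta(x_0)\, x_0 = \sum_{i=1}^N l_i(x_0)\, y_i$$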
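
A minimal Python sketch of local linear regression at one query point follows. It solves the kernel-weighted least-squares problem for an intercept and a slope in closed form. The tri-cube weights, the function names and the noise-free toy data are assumptions for illustration only.

```python
def tricube(t):
    """Assumed weight function: (1 - |t|^3)^3 inside the window, 0 outside."""
    return (1.0 - abs(t) ** 3) ** 3 if abs(t) <= 1.0 else 0.0


def kernel_average(x0, xs, ys, lam, D=tricube):
    """Locally weighted average (local constant fit) at x0."""
    w = [D(abs(x - x0) / lam) for x in xs]
    return sum(wi * y for wi, y in zip(w, ys)) / sum(w)


def local_linear(x0, xs, ys, lam, D=tricube):
    """Weighted least-squares line fitted around x0, evaluated at x0."""
    w = [D(abs(x - x0) / lam) for x in xs]
    sw = sum(w)
    xbar = sum(wi * x for wi, x in zip(w, xs)) / sw
    ybar = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - xbar) ** 2 for wi, x in zip(w, xs))
    if sxx == 0.0:
        raise ValueError("need at least two distinct x values in the window")
    sxy = sum(wi * (x - xbar) * (y - ybar) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx
    return ybar + slope * (x0 - xbar)


# Noise-free straight line y = 2x + 1, queried at the left boundary x0 = 0.
xs = [i / 10 for i in range(11)]
ys = [2.0 * x + 1.0 for x in xs]
print(kernel_average(0.0, xs, ys, lam=0.35))  # pulled upwards by the slope
print(local_linear(0.0, xs, ys, lam=0.35))    # recovers the line at x0 = 0
```

At the boundary all neighbours lie on one side of the query point, so the weighted average is pulled towards their responses, whereas the local line reproduces a linear trend.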

Kernel smoothing vs local linear regression

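Kernel smoothing

Solve the minimization problem

$$\min_{\theta} \sum_{i=1}^N K_\lambda(x_0, x_i)\,(y_i - \theta)^2$$

Local linear regression

Solve the minimization problem

$$\min_{\alpha(x_0),\,\beta(x_0)} \sum_{i=1}^N K_\lambda(x_0, x_i)\,\bigl[y_i - \alpha(x_0) - \beta(x_0)\, x_i\bigr]^2$$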

Properties of local linear regression

  • Automatically modifies the kernel weights to correct for bias

  • Bias depends only on the terms of order higher than one in the expansion of f.

Local polynomial regression

  • Fitting polynomials instead of straight lines

    [Figure: behavior of the estimated response function]

Polynomial vs local linear regression

Advantages:

  • Reduces the “trimming of hills and filling of valleys”

Disadvantages:

  • Higher variance (tails are more wiggly)

Selecting the width of the kernel

Bias-Variance tradeoff:

A narrow window gives high variance and low bias, whereas a wide window gives high bias and low variance.

Selecting the width of the kernel

  • Automatic selection (cross-validation)

  • Fixing the degrees of freedom
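
One way automatic selection could look in practice is leave-one-out cross-validation over a small grid of candidate widths. The sketch below assumes a Gaussian-weighted kernel average as the smoother; the candidate grid, the simulated data and all function names are hypothetical.

```python
import math
import random


def gaussian(t):
    """Unnormalised Gaussian weight; the constant cancels in the average."""
    return math.exp(-0.5 * t * t)


def kernel_average(x0, xs, ys, lam):
    """Kernel-weighted average at x0 with window width lam."""
    w = [gaussian((x - x0) / lam) for x in xs]
    return sum(wi * y for wi, y in zip(w, ys)) / sum(w)


def loocv_error(xs, ys, lam):
    """Mean squared error when each point is predicted from all the others."""
    total = 0.0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        total += (ys[i] - kernel_average(xs[i], rest_x, rest_y, lam)) ** 2
    return total / len(xs)


def select_bandwidth(xs, ys, candidates):
    """Return the candidate width with the smallest leave-one-out error."""
    return min(candidates, key=lambda lam: loocv_error(xs, ys, lam))


# Simulated data: a sine curve plus Gaussian noise (fixed seed).
rng = random.Random(0)
xs = [i / 20 for i in range(41)]
ys = [math.sin(3.0 * x) + rng.gauss(0.0, 0.2) for x in xs]
candidates = [0.05, 0.1, 0.2, 0.5, 1.0]
best = select_bandwidth(xs, ys, candidates)
print("selected width:", best)
```

Very wide windows are penalised because they flatten the curve (bias), and very narrow ones because they chase the noise (variance), which is the tradeoff described for the kernel width.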

Local regression in R^p

The one-dimensional approach is easily extended to p dimensions by

  • Using the Euclidean norm as a measure of distance in the kernel.

  • Modifying the polynomial
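
The first point can be sketched as follows: the one-dimensional kernel-weighted average carries over to p dimensions once |x - x0| is replaced by the Euclidean distance. The Epanechnikov weights, the function names and the toy data are assumptions for the example, and only the local constant fit is shown, not the modified polynomial.

```python
import math


def epanechnikov(t):
    """Assumed weight function: 3/4 (1 - t^2) inside the window, 0 outside."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0


def kernel_average_p(x0, X, y, lam, D=epanechnikov):
    """Kernel-weighted average at x0, using Euclidean distance in R^p."""
    w = [D(math.dist(x, x0) / lam) for x in X]
    total = sum(w)
    if total == 0.0:
        raise ValueError("no observations inside the window around x0")
    return sum(wi * yi for wi, yi in zip(w, y)) / total


# Four corners of the unit square with response y = x1 + x2.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
y = [0.0, 1.0, 1.0, 2.0]
print(kernel_average_p((0.5, 0.5), X, y, lam=1.0))
```

Because the Euclidean norm treats all coordinates alike, the variables would normally be put on a comparable scale before this distance is used.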

Local regression in R^p

“The curse of dimensionality”

  • The fraction of points close to the boundary of the input domain increases with its dimension

  • Observed data do not cover the whole input domain

Structured local regression models

Structured kernels (standardize each variable)

$$K_{\lambda,A}(x_0, x) = D\left(\frac{(x - x_0)^T A\,(x - x_0)}{\lambda}\right)$$

Note: A is positive semidefinite

Structured local regression models

Structured regression functions

  • ANOVA decompositions (e.g., additive models)

    Backfitting algorithms can be used

  • Varying coefficient models (partition X)

  • INSERT FORMULA 6.17

Structured local regression models

Varying coefficient models (example)

Local methods

  • Assumption: model is locally linear → maximize the log-likelihood locally at x0: $l(\beta(x_0)) = \sum_{i=1}^N K_\lambda(x_0, x_i)\, l\bigl(y_i, x_i^T \beta(x_0)\bigr)$

  • Autoregressive time series: $y_t = \beta_0 + \beta_1 y_{t-1} + \dots + \beta_k y_{t-k} + e_t$, i.e. $y_t = z_t^T \beta + e_t$. Fit by local least squares with kernel $K(z_0, z_t)$

Kernel density estimation

  • Straightforward estimates of the density are bumpy

  • Instead, Parzen’s smooth estimate is preferred:

    $$\hat f_X(x_0) = \frac{1}{N\lambda} \sum_{i=1}^N K_\lambda(x_0, x_i)$$

    Normally, Gaussian kernels are used
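
A small Python sketch of a Parzen-type estimate with a Gaussian kernel is given below. It places a normal density with standard deviation λ on every observation and averages them. The sample values, the width and the function names are made up for the example.

```python
import math


def gaussian_pdf(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def parzen_density(x0, xs, lam):
    """Average of Gaussian bumps of width lam centred at the observations."""
    return sum(gaussian_pdf((x0 - x) / lam) for x in xs) / (len(xs) * lam)


# Hypothetical sample and width.
xs = [1.0, 1.2, 1.9, 2.1, 2.3, 4.0]
lam = 0.5

# Riemann-sum check that the estimate integrates to about one.
step = 0.01
grid = [-5.0 + step * i for i in range(1501)]
area = sum(parzen_density(g, xs, lam) for g in grid) * step
print(parzen_density(2.0, xs, lam), area)
```

Each observation contributes a smooth bump instead of a discrete jump, which is what removes the bumpiness of the straightforward estimate.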

Radial basis functions and kernels

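Using the idea of basis expansion, we treat kernel functions as basis functions:

$$f(x) = \sum_{j=1}^M K_{\lambda_j}(\xi_j, x)\,\beta_j = \sum_{j=1}^M D\left(\frac{\lVert x - \xi_j \rVert}{\lambda_j}\right)\beta_j$$

where ξ_j is the prototype parameter and λ_j is the scale parameter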

Radial basis functions and kernels

Choosing the parameters:

  • Estimate {λj,ξj} separately from βj (often by using the distribution of X alone) and solve least-squares.
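
The two-stage approach can be sketched in Python as follows, under the assumption that the prototypes ξ_j and scales λ_j have already been fixed, so that only the coefficients β_j remain to be found by ordinary least squares. Gaussian basis functions, the hand-picked centres and the small linear solver are illustrative choices, not taken from the slides.

```python
import math


def rbf_features(x, centers, scales):
    """Gaussian radial basis functions evaluated at a scalar x."""
    return [math.exp(-0.5 * ((x - c) / s) ** 2) for c, s in zip(centers, scales)]


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - tail) / M[r][r]
    return z


def fit_rbf(xs, ys, centers, scales):
    """Least-squares coefficients via the normal equations H'H beta = H'y."""
    H = [rbf_features(x, centers, scales) for x in xs]
    m = len(centers)
    HtH = [[sum(h[j] * h[k] for h in H) for k in range(m)] for j in range(m)]
    Hty = [sum(h[j] * y for h, y in zip(H, ys)) for j in range(m)]
    return solve(HtH, Hty)


def predict_rbf(x, beta, centers, scales):
    """Evaluate the fitted basis expansion at x."""
    return sum(b * h for b, h in zip(beta, rbf_features(x, centers, scales)))


# Hypothetical setup: three fixed prototypes and a common scale.
centers = [0.0, 1.0, 2.0]
scales = [0.5, 0.5, 0.5]
true_beta = [1.0, -2.0, 0.5]
xs = [i / 10 for i in range(-10, 31)]
ys = [predict_rbf(x, true_beta, centers, scales) for x in xs]
beta = fit_rbf(xs, ys, centers, scales)
print(beta)
```

Once the prototypes and scales are fixed, the model is linear in β_j, which is why the second stage reduces to a standard least-squares fit.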
