Kernel methods - overview

- Kernel smoothers
- Local regression
- Kernel density estimation
- Radial basis functions

Data Mining and Statistical Learning - 2008

Introduction

Kernel methods are regression techniques used to estimate a response function from noisy data

Properties:

- Different models are fitted at each query point, and only those observations close to that point are used to fit the model
- The resulting function is smooth
- The models require only a minimum of training

Kernel methods, splines and ordinary least squares regression (OLS)

- OLS: A single model is fitted to all data
- Splines: Different models are fitted to different subintervals (cuboids) of the input domain
- Kernel methods: Different models are fitted at each query point

Kernel-weighted averages and moving averages

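The Nadaraya-Watson kernel-weighted average

$$\hat f(x_0) = \frac{\sum_{i=1}^{N} K_\lambda(x_0, x_i)\, y_i}{\sum_{i=1}^{N} K_\lambda(x_0, x_i)}, \qquad K_\lambda(x_0, x) = D\!\left(\frac{|x - x_0|}{\lambda}\right)$$

where $\lambda$ indicates the window size and the function $D$ shows how the weights change with distance within this window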

The estimated function is smooth!

K-nearest neighbours

The estimated function is piecewise constant!
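
The contrast between the two estimators can be sketched in a few lines of NumPy. This is only an illustration under assumptions that are not in the slides: the simulated data, the window size `lam=0.2`, the neighbourhood size `k=30`, and the choice of an Epanechnikov weight function are all hypothetical.

```python
import numpy as np

def epanechnikov(t):
    """D(t) = 3/4 (1 - t^2) for |t| <= 1, zero otherwise."""
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0)

def nadaraya_watson(x0, x, y, lam):
    """Kernel-weighted average of y at the query point x0."""
    w = epanechnikov(np.abs(x - x0) / lam)
    return np.sum(w * y) / np.sum(w)  # assumes some point lies inside the window

def knn_average(x0, x, y, k):
    """Unweighted mean of the k responses whose inputs are nearest to x0."""
    nearest = np.sort(np.argsort(np.abs(x - x0))[:k])
    return y[nearest].mean()

# Hypothetical noisy data, used only for illustration.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 1.0, 100))
y = np.sin(4.0 * x) + rng.normal(0.0, 0.3, 100)

grid = np.linspace(0.05, 0.95, 200)
nw_fit = np.array([nadaraya_watson(g, x, y, lam=0.2) for g in grid])
knn_fit = np.array([knn_average(g, x, y, k=30) for g in grid])

# The k-NN fit only changes when the neighbour set changes, so it takes
# far fewer distinct values than there are query points.
print(len(np.unique(knn_fit)), "distinct k-NN values on", len(grid), "query points")
```

The weights of the kernel average fade continuously to zero at the edge of the window, which is why that fit is smooth, whereas a point entering or leaving the neighbour set makes the k-NN average jump.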

Examples of one-dimensional kernel smoothers

Epanechnikov kernel: $D(t) = \tfrac{3}{4}(1 - t^2)$ if $|t| \le 1$, and 0 otherwise

Tri-cube kernel: $D(t) = (1 - |t|^3)^3$ if $|t| \le 1$, and 0 otherwise
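
As a small sketch of how these two weight functions behave, the code below evaluates both and turns them into normalised weights around a query point. The helper name `kernel_weights`, the grid of inputs and the window size `lam=0.25` are assumptions made for illustration.

```python
import numpy as np

def epanechnikov(t):
    """3/4 (1 - t^2) on [-1, 1], zero outside."""
    t = np.abs(np.asarray(t, dtype=float))
    return np.where(t <= 1.0, 0.75 * (1.0 - t**2), 0.0)

def tricube(t):
    """(1 - |t|^3)^3 on [-1, 1], zero outside; flatter near 0 than Epanechnikov."""
    t = np.abs(np.asarray(t, dtype=float))
    return np.where(t <= 1.0, (1.0 - t**3) ** 3, 0.0)

def kernel_weights(x0, x, lam, D):
    """Weights D(|x_i - x0| / lam), normalised to sum to one (hypothetical helper)."""
    w = D(np.abs(np.asarray(x, dtype=float) - x0) / lam)
    return w / w.sum()

x = np.linspace(0.0, 1.0, 11)
w_epa = kernel_weights(0.5, x, lam=0.25, D=epanechnikov)
w_tri = kernel_weights(0.5, x, lam=0.25, D=tricube)
print(np.round(w_epa, 3))
print(np.round(w_tri, 3))
```

Both kernels have compact support, so only the observations within distance `lam` of the query point receive a non-zero weight.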

Issues in kernel smoothing

- The smoothing parameter λ has to be defined
- When there are ties at $x_i$: compute an average y value and introduce weights representing the number of points
- Boundary issues
- Varying density of observations:
  - the bias is constant
  - the variance is inversely proportional to the density

Boundary effects of one-dimensional kernel smoothers

Locally-weighted averages can be badly biased on the boundaries if the response function has a significant slope. Remedy: apply local linear regression.

Local linear regression

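Find the intercept and slope parameters solving

$$\min_{\alpha(x_0),\,\beta(x_0)} \sum_{i=1}^{N} K_\lambda(x_0, x_i)\,[y_i - \alpha(x_0) - \beta(x_0)\,x_i]^2$$

The solution is a linear combination of $y_i$:

$$\hat f(x_0) = \hat\alpha(x_0) + \hat\beta(x_0)\,x_0 = \sum_{i=1}^{N} l_i(x_0)\, y_i$$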
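
A minimal sketch of this weighted least-squares fit, compared with a plain locally weighted average at a boundary point. The tri-cube weights, the window size `lam=0.3` and the noise-free straight-line data are assumptions chosen so that any error is pure bias.

```python
import numpy as np

def tricube(t):
    t = np.abs(np.asarray(t, dtype=float))
    return np.where(t <= 1.0, (1.0 - t**3) ** 3, 0.0)

def local_average(x0, x, y, lam):
    """Locally weighted average (local constant fit) at x0."""
    w = tricube((x - x0) / lam)
    return np.sum(w * y) / np.sum(w)

def local_linear(x0, x, y, lam):
    """Weighted least-squares line around x0, evaluated at x0."""
    w = tricube((x - x0) / lam)
    B = np.column_stack([np.ones_like(x), x])   # rows (1, x_i)
    BtW = B.T * w                               # B^T W without forming diag(w)
    alpha, beta = np.linalg.solve(BtW @ B, BtW @ y)
    return alpha + beta * x0

x = np.linspace(0.0, 1.0, 51)
y = 2.0 * x + 1.0   # straight line without noise: f(0) = 1

print(local_average(0.0, x, y, lam=0.3))  # pulled upward, all neighbours lie to the right
print(local_linear(0.0, x, y, lam=0.3))   # reproduces f(0) up to rounding error
```

At the left boundary every neighbour lies to the right of the query point, so the weighted average overshoots, while the local line extrapolates the slope and removes that first-order bias.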

Kernel smoothing vs local linear regression

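Kernel smoothing

Solve the minimization problem

$$\min_{\alpha(x_0)} \sum_{i=1}^{N} K_\lambda(x_0, x_i)\,[y_i - \alpha(x_0)]^2$$

Local linear regression

Solve the minimization problem

$$\min_{\alpha(x_0),\,\beta(x_0)} \sum_{i=1}^{N} K_\lambda(x_0, x_i)\,[y_i - \alpha(x_0) - \beta(x_0)\,x_i]^2$$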

Properties of local linear regression

- Automatically modifies the kernel weights to correct for bias
- Bias depends only on the terms of order higher than one in the expansion of f.
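
The second property can be motivated by a Taylor expansion. The sketch below assumes that the estimate is written as $\hat f(x_0) = \sum_i l_i(x_0)\,y_i$ with $E[y_i] = f(x_i)$, and that $f$ is smooth enough to be expanded around $x_0$; $R$ collects the terms of third and higher order.

```latex
\begin{aligned}
E\,\hat f(x_0) &= \sum_{i=1}^{N} l_i(x_0)\, f(x_i) \\
&= f(x_0)\sum_{i=1}^{N} l_i(x_0)
 + f'(x_0)\sum_{i=1}^{N} (x_i - x_0)\, l_i(x_0)
 + \frac{f''(x_0)}{2}\sum_{i=1}^{N} (x_i - x_0)^2\, l_i(x_0) + R
\end{aligned}
```

For local linear regression the weights satisfy $\sum_i l_i(x_0) = 1$ and $\sum_i (x_i - x_0)\, l_i(x_0) = 0$, so the first term equals $f(x_0)$, the second vanishes, and the bias $E\,\hat f(x_0) - f(x_0)$ is left with the second-order term and $R$ only.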

Local polynomial regression

- Fitting polynomials instead of straight lines
Behavior of estimated response function:

Polynomial vs local linear regression

Advantages:

- Reduces the "trimming of hills and filling of valleys"

Disadvantages:

- Higher variance (tails are more wiggly)

Selecting the width of the kernel

Bias-Variance tradeoff:

A narrow window leads to high variance and low bias, whereas a wide window leads to high bias and low variance.

Selecting the width of the kernel

- Automatic selection (cross-validation)
- Fixing the degrees of freedom
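
Both options can be sketched for a kernel-weighted average. The example assumes a Gaussian weight function (so that no window is ever empty), simulated data and a hypothetical candidate grid `lams`; the effective degrees of freedom are taken as the trace of the smoother matrix, the usual definition for linear smoothers.

```python
import numpy as np

def gaussian_kernel_matrix(x, lam):
    """Matrix of unnormalised Gaussian weights between all pairs of inputs."""
    d = (x[:, None] - x[None, :]) / lam
    return np.exp(-0.5 * d**2)

def smoother_matrix(x, lam):
    """S such that the fitted values are S @ y; each row sums to one."""
    K = gaussian_kernel_matrix(x, lam)
    return K / K.sum(axis=1, keepdims=True)

def loocv_error(x, y, lam):
    """Leave-one-out mean squared prediction error of the kernel average."""
    K = gaussian_kernel_matrix(x, lam)
    np.fill_diagonal(K, 0.0)              # exclude each point from its own fit
    pred = (K @ y) / K.sum(axis=1)
    return np.mean((y - pred) ** 2)

# Hypothetical data and candidate window sizes.
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 1.0, 100))
y = np.sin(4.0 * x) + rng.normal(0.0, 0.3, 100)

lams = np.array([0.02, 0.05, 0.1, 0.2, 0.5])
cv = np.array([loocv_error(x, y, lam) for lam in lams])
df = np.array([np.trace(smoother_matrix(x, lam)) for lam in lams])
best_lam = lams[np.argmin(cv)]

for lam, e, d in zip(lams, cv, df):
    print(f"lambda={lam:4.2f}  LOOCV error={e:.3f}  effective df={d:.1f}")
print("selected lambda:", best_lam)
```

The degrees of freedom shrink as the window widens, so fixing a target value for them is an alternative way of pinning down $\lambda$ without cross-validation.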

Local regression in $\mathbb{R}^p$

The one-dimensional approach is easily extended to p dimensions by

- Using the Euclidean norm as a measure of distance in the kernel.
- Modifying the polynomial
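
A sketch of the two changes for a local linear fit: the kernel argument becomes the Euclidean distance to the query point, and the local polynomial gets one slope per coordinate. The function name `local_linear_p`, the Epanechnikov weights and the simulated plane are assumptions for illustration.

```python
import numpy as np

def epanechnikov(t):
    t = np.abs(np.asarray(t, dtype=float))
    return np.where(t <= 1.0, 0.75 * (1.0 - t**2), 0.0)

def local_linear_p(x0, X, y, lam):
    """Local linear fit in p dimensions, evaluated at the query point x0."""
    dist = np.linalg.norm(X - x0, axis=1)          # Euclidean distance to x0
    w = epanechnikov(dist / lam)
    B = np.column_stack([np.ones(len(X)), X])      # rows (1, x_i1, ..., x_ip)
    BtW = B.T * w
    coef = np.linalg.solve(BtW @ B, BtW @ y)       # assumes enough points in the ball
    return coef[0] + x0 @ coef[1:]

# Hypothetical noise-free plane in two dimensions.
rng = np.random.default_rng(2)
X = rng.uniform(0.0, 1.0, size=(200, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]

centre = np.array([0.5, 0.5])
corner = np.array([0.0, 0.0])
print(local_linear_p(centre, X, y, lam=0.4))   # close to 1 + 1 - 1.5 = 0.5
print(local_linear_p(corner, X, y, lam=0.4))   # close to 1.0, also at the boundary
```

Because the Euclidean norm treats all coordinates alike, the inputs are usually brought to a common scale before such a fit.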

Local regression in $\mathbb{R}^p$

"The curse of dimensionality"

- The fraction of points close to the boundary of the input domain increases with its dimension
- Observed data do not cover the whole input domain
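
The first point can be made concrete under a simple assumption that is not in the slides: inputs uniformly distributed on the unit cube $[0,1]^p$, with "close to the boundary" meaning within a hypothetical margin `eps` of some face. The fraction of such points is then $1 - (1 - 2\,\text{eps})^p$.

```python
import numpy as np

def boundary_fraction(p, eps=0.05):
    """Probability that a uniform point in [0,1]^p lies within eps of a face."""
    return 1.0 - (1.0 - 2.0 * eps) ** p

def simulated_boundary_fraction(p, eps=0.05, n=20000, seed=0):
    """Monte Carlo check of the same quantity."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(n, p))
    near_edge = np.any((X < eps) | (X > 1.0 - eps), axis=1)
    return near_edge.mean()

for p in (1, 2, 5, 10, 20):
    print(p, round(boundary_fraction(p), 3), round(simulated_boundary_fraction(p), 3))
```

With a margin of 0.05 the fraction is 10% in one dimension but about 65% in ten dimensions, so boundary bias affects most query points once $p$ is moderately large.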

Structured local regression models

Structured kernels (standardize each variable)

$$K_{\lambda,A}(x_0, x) = D\!\left(\frac{(x - x_0)^T A\,(x - x_0)}{\lambda}\right)$$

Note: A is positive semidefinite

Structured local regression models

Structured regression functions

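- ANOVA decompositions (e.g., additive models); backfitting algorithms can be used
- Varying coefficient models (partition X into predictors $X_1, \dots, X_q$ and the remaining variables $Z$): $f(X) = \alpha(Z) + \beta_1(Z)X_1 + \dots + \beta_q(Z)X_q$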

Structured local regression models

Varying coefficient models (example)

Local methods

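- Assumption: the model is locally linear -> maximize the log-likelihood locally at $x_0$: $l(\beta(x_0)) = \sum_{i=1}^{N} K_\lambda(x_0, x_i)\, l(y_i, x_i^T\beta(x_0))$
- Autoregressive time series: $y_t = \beta_0 + \beta_1 y_{t-1} + \dots + \beta_k y_{t-k} + e_t$ -> $y_t = z_t^T\beta + e_t$ with $z_t = (1, y_{t-1}, \dots, y_{t-k})^T$. Fit by local least squares with kernel $K(z_0, z_t)$.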

Kernel density estimation

- Straightforward estimates of the density are bumpy
- Instead, Parzen’s smooth estimate is preferred: $\hat f_X(x_0) = \frac{1}{N\lambda}\sum_{i=1}^{N} K_\lambda(x_0, x_i)$

Normally, Gaussian kernels are used
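
A minimal sketch of a Parzen estimate with a Gaussian kernel, where $K_\lambda(x_0, x) = \phi(|x - x_0|/\lambda)$ and $\phi$ is the standard normal density. The sample, the bandwidth `lam=0.3` and the evaluation grid are hypothetical.

```python
import numpy as np

def parzen_density(x0, x, lam):
    """Gaussian kernel density estimate at the query point(s) x0."""
    z = (np.atleast_1d(x0)[:, None] - x[None, :]) / lam
    phi = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return phi.sum(axis=1) / (len(x) * lam)

# Hypothetical sample from a standard normal distribution.
rng = np.random.default_rng(3)
x = rng.normal(0.0, 1.0, 200)

grid = np.linspace(-6.0, 6.0, 1201)
dens = parzen_density(grid, x, lam=0.3)

dx = grid[1] - grid[0]
print("area under the estimate:", np.sum(dens) * dx)   # should be close to 1
```

Each observation contributes one smooth bump of width `lam`, so the estimate is non-negative and integrates to one by construction.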

Radial basis functions and kernels

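Using the idea of basis expansion, we treat kernel functions as basis functions:

$$f(x) = \sum_{j=1}^{M} K_{\lambda_j}(\xi_j, x)\,\beta_j = \sum_{j=1}^{M} D\!\left(\frac{\|x - \xi_j\|}{\lambda_j}\right)\beta_j$$

where $\xi_j$ is the prototype parameter and $\lambda_j$ is the scale parameter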

Radial basis functions and kernels

Choosing the parameters:

- Estimate $\{\lambda_j, \xi_j\}$ separately from $\beta_j$ (often by using the distribution of X alone), then solve a least-squares problem for $\beta_j$.
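
One way this two-step strategy could look is sketched below. Several choices are assumptions made for illustration, not prescribed by the slides: Gaussian basis functions, prototypes placed at quantiles of the inputs (so they depend on X alone), a single hypothetical scale `lam=0.15` shared by all basis functions, and simulated data.

```python
import numpy as np

def gaussian_rbf_design(x, centers, lam):
    """Design matrix with one Gaussian basis function per prototype."""
    return np.exp(-0.5 * ((x[:, None] - centers[None, :]) / lam) ** 2)

# Hypothetical noisy data.
rng = np.random.default_rng(4)
x = np.sort(rng.uniform(0.0, 1.0, 150))
y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.1, 150)

# Step 1: choose prototypes and scale from the inputs only.
centers = np.quantile(x, np.linspace(0.05, 0.95, 8))
lam = 0.15

# Step 2: with the basis fixed, the coefficients solve ordinary least squares.
H = gaussian_rbf_design(x, centers, lam)
beta, *_ = np.linalg.lstsq(H, y, rcond=None)

fit = H @ beta
rmse = np.sqrt(np.mean((y - fit) ** 2))
print("training RMSE:", rmse)
```

Fixing the prototypes and scales first keeps the second step linear in the coefficients, which avoids a non-convex joint optimisation over all parameters.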
