## Introduction to Smoothing Splines


Outline

- Introduction
- Linear and polynomial regression, and interpolation
- Roughness penalties
- Interpolating and Smoothing splines
- Cubic splines
- Interpolating splines
- Smoothing splines
- Natural cubic splines
- Choosing the smoothing parameter
- Available software

Key Words

- roughness penalty
- penalized sum of squares
- natural cubic splines

Motivation

[Figure: Spline(y18), a smoothing spline fitted to the example data y18]

Introduction

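- Linear and polynomial regression:
- Global influence: an individual observation can affect the fitted curve far from where it was observed
- The polynomial degree can only be increased in discrete steps, so the flexibility of the fit cannot be controlled continuously
- Interpolation:
- Unsatisfactory as an explanation of the given data, since the curve follows every observation exactly, noise included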

Roughness penalty approach

- A method for relaxing the model assumptions in classical linear regression along lines a little different from polynomial regression.

Roughness penalty approach

- Aims of curve fitting
- A good fit to the data
- A curve estimate that does not display too much rapid fluctuation
- Basic idea: make the necessary compromise between these two rather different aims of curve estimation

Roughness penalty approach

- Quantifying the roughness of a curve
- An intuitive way: the integrated squared second derivative

$$\int_a^b \{g''(t)\}^2 \, dt$$

(g: a twice-differentiable curve)

- Motivation from a formalization of a mechanical device: if a thin piece of flexible wood, called a spline, is bent to the shape of the graph of g, then the leading term in the strain energy is proportional to $\int g''^2$.
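As a small numerical illustration of this roughness measure, the integrated squared second derivative $\int_a^b \{g''(t)\}^2\,dt$ can be approximated in R with `integrate()`. This is only a sketch: the helper name `roughness` and the three test curves are assumptions made for the example, not part of the original slides.

```r
# Roughness of a curve: integral of the squared second derivative over [a, b].
# `g2` is the second derivative of g, supplied as a vectorised R function.
roughness <- function(g2, a, b) {
  integrate(function(t) g2(t)^2, lower = a, upper = b)$value
}

# A straight line has zero second derivative, hence zero roughness.
rough_line <- roughness(function(t) 0 * t, 0, pi)

# g(t) = sin(t) has g''(t) = -sin(t); the exact roughness on [0, pi] is pi / 2.
rough_sin <- roughness(function(t) -sin(t), 0, pi)

# g(t) = sin(3 t) fluctuates faster: g''(t) = -9 sin(3 t),
# so the exact roughness on [0, pi] is 81 * pi / 2.
rough_sin3 <- roughness(function(t) -9 * sin(3 * t), 0, pi)
```

The faster-oscillating curve has a much larger roughness value, which is the behaviour the penalty is meant to capture.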

Roughness penalty approach

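- Penalized sum of squares

$$S(g) = \sum_{i=1}^{n} \{Y_i - g(t_i)\}^2 + \alpha \int_a^b \{g''(t)\}^2 \, dt$$

- g: any twice-differentiable function on [a,b]
- $\alpha$: smoothing parameter ('rate of exchange' between residual error and local variation)
- Penalized least squares estimator: $\hat{g} = \arg\min_g S(g)$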
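The trade-off encoded in $S(g) = \sum_i \{Y_i - g(t_i)\}^2 + \alpha \int_a^b \{g''(t)\}^2\,dt$ can be seen by evaluating it for two candidate curves. The following is a minimal sketch on simulated data; the function `penalized_ss`, the data, and the two candidates are all assumptions chosen for illustration.

```r
# Penalized sum of squares for a candidate curve g with second derivative g2
# (both vectorised functions), design points tt and responses Y.
penalized_ss <- function(g, g2, tt, Y, alpha, a = min(tt), b = max(tt)) {
  rss   <- sum((Y - g(tt))^2)
  rough <- integrate(function(s) g2(s)^2, lower = a, upper = b)$value
  rss + alpha * rough
}

set.seed(1)                                   # illustrative simulated data
tt <- seq(0, 1, length.out = 25)
Y  <- sin(2 * pi * tt) + rnorm(25, sd = 0.2)

# Candidate 1: a sine curve (fits well, but has large curvature).
g_sin  <- function(s) sin(2 * pi * s)
g2_sin <- function(s) -(2 * pi)^2 * sin(2 * pi * s)

# Candidate 2: the least-squares straight line (zero roughness, poorer fit).
cf     <- unname(coef(lm(Y ~ tt)))
g_lin  <- function(s) cf[1] + cf[2] * s
g2_lin <- function(s) 0 * s

S_small <- c(sine = penalized_ss(g_sin, g2_sin, tt, Y, alpha = 1e-6),
             line = penalized_ss(g_lin, g2_lin, tt, Y, alpha = 1e-6))
S_large <- c(sine = penalized_ss(g_sin, g2_sin, tt, Y, alpha = 1),
             line = penalized_ss(g_lin, g2_lin, tt, Y, alpha = 1))
```

With a tiny $\alpha$ the sine curve has the smaller criterion value because fit dominates; with a large $\alpha$ the straight line wins because roughness dominates.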

Roughness penalty approach

Curve for a large value of $\alpha$: the roughness penalty dominates and the estimate approaches the linear regression line.

Roughness penalty approach

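Curve for a small value of $\alpha$: the residual sum of squares dominates and the estimate tracks the data closely, approaching the interpolating curve.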
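The effect of a large versus a small smoothing parameter can be reproduced with `smooth.spline()`, which accepts the penalty weight directly through its `lambda` argument. This is a sketch on simulated data; note that R applies `lambda` to an internally rescaled criterion, so it plays the role of $\alpha$ only up to that scaling (see `?smooth.spline`). The data and the two `lambda` values are assumptions for illustration.

```r
set.seed(42)                                   # illustrative simulated data
x <- seq(0, 1, length.out = 40)
y <- sin(2 * pi * x) + rnorm(40, sd = 0.3)

fit_large <- smooth.spline(x, y, lambda = 1)      # heavy roughness penalty
fit_small <- smooth.spline(x, y, lambda = 1e-9)   # light roughness penalty

# Residual sum of squares of a fit at the observed design points.
rss <- function(fit) sum((y - predict(fit, x)$y)^2)

summary_tab <- rbind(
  large = c(df = fit_large$df, rss = rss(fit_large)),
  small = c(df = fit_small$df, rss = rss(fit_small))
)
```

The heavily penalized fit uses few equivalent degrees of freedom and has a larger residual sum of squares; the lightly penalized fit does the opposite. To see the two curves, one could call `plot(x, y); lines(fit_large, col = "blue"); lines(fit_small, col = "red", lty = 2)`.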

Interpolating and Smoothing Splines

- Cubic splines
- Interpolating splines
- Smoothing splines
- Choosing the smoothing parameter

Cubic Splines

- Given $a < t_1 < t_2 < \dots < t_n < b$, a function g on [a,b] is a cubic spline if
- On each interval (a,t1), (t1,t2), …, (tn,b), g is a cubic polynomial
- The polynomial pieces fit together at the points ti (called knots) such that g itself and its first and second derivatives are continuous at each ti, and hence on the whole of [a,b]

Cubic Splines

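- How to specify a cubic spline

$$g(t) = d_i (t - t_i)^3 + c_i (t - t_i)^2 + b_i (t - t_i) + a_i, \qquad t_i \le t \le t_{i+1}, \quad i = 0, \dots, n,$$

with $t_0 = a$ and $t_{n+1} = b$.

- Natural cubic spline (NCS): a cubic spline whose second and third derivatives are zero at a and b, which implies d0=c0=dn=cn=0, so that g is linear on the two extreme intervals [a,t1] and [tn,b].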
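The defining properties of a natural cubic spline can be checked numerically with base R's `splinefun(..., method = "natural")`, which returns a function that also evaluates derivatives. The knots and values below are made up for illustration, and the variable names are assumptions of this sketch.

```r
tk   <- c(0, 1, 2.5, 3, 4.2, 5)        # knots t_1 < ... < t_n (illustrative)
vals <- c(1, 3, 2, 4, 3.5, 5)          # values at the knots (illustrative)

g <- splinefun(tk, vals, method = "natural")

# The spline passes through the given values at the knots ...
max_interp_err <- max(abs(g(tk) - vals))

# ... and its second derivative vanishes at the two extreme knots.
end_curvature <- g(range(tk), deriv = 2)

# Across an interior knot, the first and second derivatives are continuous,
# while the third derivative (piecewise constant) generally jumps.
eps   <- 1e-6
jump1 <- g(2.5 + eps, deriv = 1) - g(2.5 - eps, deriv = 1)
jump2 <- g(2.5 + eps, deriv = 2) - g(2.5 - eps, deriv = 2)
jump3 <- g(2.5 + eps, deriv = 3) - g(2.5 - eps, deriv = 3)
```

Here `jump1` and `jump2` are numerically negligible, whereas `jump3` is typically not, which reflects the fact that a cubic spline is only twice continuously differentiable at its knots.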

Natural Cubic Splines

Value-second derivative representation

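- We can specify an NCS by giving its value and second derivative at each knot ti.
- Define $g_i = g(t_i)$ and $\gamma_i = g''(t_i)$ for $i = 1, \dots, n$; by the definition of an NCS, $\gamma_1 = \gamma_n = 0$. Collect these into the vectors

$$\mathbf{g} = (g_1, \dots, g_n)^T, \qquad \boldsymbol{\gamma} = (\gamma_2, \dots, \gamma_{n-1})^T,$$

which specify the curve g completely.

- However, not all possible vectors $\mathbf{g}$ and $\boldsymbol{\gamma}$ represent a natural spline!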

Natural Cubic Splines

Value-second derivative representation

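- Theorem 2.1

The vectors $\mathbf{g}$ and $\boldsymbol{\gamma}$ specify a natural spline g if and only if

$$Q^T \mathbf{g} = R \boldsymbol{\gamma},$$

where Q and R are band matrices determined by the knot spacings $h_i = t_{i+1} - t_i$.

Then the roughness penalty will satisfy

$$\int_a^b \{g''(t)\}^2 \, dt = \boldsymbol{\gamma}^T R \boldsymbol{\gamma} = \mathbf{g}^T K \mathbf{g}, \qquad K = Q R^{-1} Q^T.$$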

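Natural Cubic Splines

Value-second derivative representation

- Let $h_i = t_{i+1} - t_i$ for $i = 1, \dots, n-1$.
- Q is the $n \times (n-2)$ matrix with entries $q_{ij}$, $i = 1, \dots, n$, $j = 2, \dots, n-1$:

$$q_{j-1,j} = h_{j-1}^{-1}, \qquad q_{jj} = -h_{j-1}^{-1} - h_j^{-1}, \qquad q_{j+1,j} = h_j^{-1}, \qquad q_{ij} = 0 \ \text{for } |i-j| \ge 2.$$

- R is the symmetric $(n-2) \times (n-2)$ matrix with entries $r_{ij}$, $i, j = 2, \dots, n-1$:

$$r_{ii} = \tfrac{1}{3}(h_{i-1} + h_i), \qquad r_{i,i+1} = r_{i+1,i} = \tfrac{1}{6} h_i, \qquad r_{ij} = 0 \ \text{for } |i-j| \ge 2.$$

- R is strictly diagonally dominant, i.e. $|r_{ii}| > \sum_{j \ne i} |r_{ij}|$ for each i.

R is positive definite, so we can define

$$K = Q R^{-1} Q^T.$$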
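The band matrices can be built directly from the knot spacings and the relation $Q^T \mathbf{g} = R \boldsymbol{\gamma}$ of Theorem 2.1 checked against a natural cubic spline produced by base R. This is a sketch under the standard tridiagonal definitions of Q and R from Green and Silverman, with $h_i = t_{i+1} - t_i$; the knots, values, and variable names are assumptions for illustration.

```r
tk <- c(0, 0.7, 1.5, 2.1, 3.4, 4)     # knots t_1 < ... < t_n (illustrative)
n  <- length(tk)
h  <- diff(tk)                        # h[i] = t_{i+1} - t_i

Q <- matrix(0, n, n - 2)
R <- matrix(0, n - 2, n - 2)
for (k in seq_len(n - 2)) {           # column k corresponds to interior knot k + 1
  Q[k, k]     <- 1 / h[k]
  Q[k + 1, k] <- -1 / h[k] - 1 / h[k + 1]
  Q[k + 2, k] <- 1 / h[k + 1]
  R[k, k]     <- (h[k] + h[k + 1]) / 3
  if (k < n - 2) R[k, k + 1] <- R[k + 1, k] <- h[k + 1] / 6
}
K <- Q %*% solve(R, t(Q))             # K = Q R^{-1} Q^T

# A natural cubic spline through arbitrary values, from splinefun().
gvals <- c(2, 1, 3, 2.5, 4, 3)
g     <- splinefun(tk, gvals, method = "natural")
gamma <- g(tk, deriv = 2)[2:(n - 1)]  # second derivatives at interior knots

# Theorem 2.1: Q^T g = R gamma, and the two roughness expressions agree.
gap         <- max(abs(crossprod(Q, gvals) - R %*% gamma))
rough_gamma <- drop(t(gamma) %*% R %*% gamma)
rough_K     <- drop(t(gvals) %*% K %*% gvals)

# Strict diagonal dominance of R: diagonal entry exceeds the off-diagonal row sum.
diag_dominant <- all(diag(R) > rowSums(abs(R)) - diag(R))
```

A successful call to `chol(R)` would additionally confirm that R is positive definite for these knots.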

Interpolating Splines

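- To find a smooth curve that interpolates the points (ti,zi), i.e. g(ti)=zi for all i.
- Theorem 2.2

Suppose $n \ge 2$ and $t_1 < \dots < t_n$. Given any values $z_1, \dots, z_n$, there is a unique natural cubic spline g with knots at the points ti satisfying g(ti)=zi for i=1,…,n.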

Interpolating Splines

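- The natural cubic spline interpolant is the unique minimizer of $\int_a^b \{g''(t)\}^2 \, dt$ over all functions in S2[a,b] that interpolate the data, where S2[a,b] is the space of functions on [a,b] that are differentiable and have an absolutely continuous first derivative.
- Theorem 2.3

Suppose $n \ge 2$ and g is the natural cubic spline interpolant to the values z1,…,zn at the points t1<…<tn in [a,b], and let $\tilde{g}$ be any function in S2[a,b] with $\tilde{g}(t_i) = z_i$ for all i,

then

$$\int_a^b \{\tilde{g}''(t)\}^2 \, dt \ge \int_a^b \{g''(t)\}^2 \, dt,$$

with equality only if $\tilde{g}$ and g are identical.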
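The minimal-roughness property of the natural cubic spline interpolant can be illustrated by comparing it with another smooth interpolant of the same points. In this sketch the competitor is the cubic spline with Forsythe-Malcolm-Moler end conditions (`method = "fmm"` in `splinefun()`), the interval is taken as $[a,b] = [t_1, t_n]$, and the data and helper name are assumptions for illustration.

```r
tk <- c(0, 1, 2, 3, 4, 5)              # interpolation points (illustrative)
z  <- c(0, 2, 1, 3, 0, 1)              # values to interpolate (illustrative)

g_nat <- splinefun(tk, z, method = "natural")  # natural cubic spline interpolant
g_fmm <- splinefun(tk, z, method = "fmm")      # a different smooth interpolant

# Roughness over [t_1, t_n], integrated piece by piece between consecutive
# knots so that every integrand is a polynomial on its interval.
roughness <- function(g, knots) {
  lo <- knots[-length(knots)]
  hi <- knots[-1]
  pieces <- mapply(function(l, u) {
    integrate(function(s) g(s, deriv = 2)^2, lower = l, upper = u)$value
  }, lo, hi)
  sum(pieces)
}

rough_nat <- roughness(g_nat, tk)
rough_fmm <- roughness(g_fmm, tk)
```

Both curves pass through all the points, but `rough_nat` is the smaller of the two values, as the theorem predicts for any competing interpolant.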

Smoothing Splines

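- Penalized sum of squares

$$S(g) = \sum_{i=1}^{n} \{Y_i - g(t_i)\}^2 + \alpha \int_a^b \{g''(t)\}^2 \, dt$$

- g: any twice-differentiable function on [a,b]
- $\alpha$: smoothing parameter ('rate of exchange' between residual error and local variation)
- Penalized least squares estimator: $\hat{g} = \arg\min_g S(g)$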

Smoothing Splines

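1. The curve estimator $\hat{g}$ is necessarily a natural cubic spline with knots at ti, for i=1,…,n.

Proof: suppose g is any curve that is not an NCS with knots at the ti, and let $\bar{g}$ be the NCS interpolant to the values g(ti). Both curves have the same residual sum of squares, while Theorem 2.3 gives $\int \bar{g}''^2 < \int g''^2$, so $S(\bar{g}) < S(g)$. Hence the minimizer of S must be an NCS.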

Smoothing Splines

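2. Theorem 2.4

Let $\hat{g}$ be the natural cubic spline with knots at ti for which $\mathbf{g} = (I + \alpha K)^{-1} \mathbf{Y}$. Then for any g in S2[a,b],

$$S(\hat{g}) \le S(g),$$

with equality only if g and $\hat{g}$ are identical.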
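For natural cubic splines with knots at the ti, the criterion reduces to $S(g) = \|\mathbf{Y} - \mathbf{g}\|^2 + \alpha\, \mathbf{g}^T K \mathbf{g}$, so the minimizer $\mathbf{g} = (I + \alpha K)^{-1}\mathbf{Y}$ can be computed and checked directly. The sketch below uses dense matrices and simulated data; the design points, $\alpha$, and all names are assumptions for illustration, with $K = Q R^{-1} Q^T$ built from the knot spacings as in the value-second derivative representation.

```r
set.seed(7)                            # illustrative simulated data
tk <- (0:11 / 11)^1.5                  # unequally spaced design points
n  <- length(tk)
h  <- diff(tk)
Y  <- sin(2 * pi * tk) + rnorm(n, sd = 0.2)

# Band matrices Q and R from the knot spacings, then K = Q R^{-1} Q^T.
Q <- matrix(0, n, n - 2)
R <- matrix(0, n - 2, n - 2)
for (k in seq_len(n - 2)) {
  Q[k, k]     <- 1 / h[k]
  Q[k + 1, k] <- -1 / h[k] - 1 / h[k + 1]
  Q[k + 2, k] <- 1 / h[k + 1]
  R[k, k]     <- (h[k] + h[k + 1]) / 3
  if (k < n - 2) R[k, k + 1] <- R[k + 1, k] <- h[k + 1] / 6
}
K <- Q %*% solve(R, t(Q))

alpha <- 0.01
g_hat <- solve(diag(n) + alpha * K, Y)          # knot values of the estimate

# Criterion for an NCS with knot values g.
S <- function(g) sum((Y - g)^2) + alpha * drop(t(g) %*% K %*% g)

S_hat       <- S(g_hat)
S_interp    <- S(Y)                             # NCS that interpolates the data
S_perturbed <- replicate(200, S(g_hat + rnorm(n, sd = 0.05)))
```

Every perturbed set of knot values, and the interpolating spline itself, gives a criterion value no smaller than `S_hat`.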

Smoothing Splines

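3. The Reinsch algorithm

Since $(I + \alpha K)\mathbf{g} = \mathbf{Y}$ and $Q^T \mathbf{g} = R \boldsymbol{\gamma}$, we have $\mathbf{g} = \mathbf{Y} - \alpha Q \boldsymbol{\gamma}$; substituting this back into $Q^T \mathbf{g} = R \boldsymbol{\gamma}$ gives the linear system

$$(R + \alpha Q^T Q)\, \boldsymbol{\gamma} = Q^T \mathbf{Y}.$$

The matrix $R + \alpha Q^T Q$ has bandwidth 5 and is symmetric and strictly positive-definite, therefore it has a Cholesky decomposition

$$R + \alpha Q^T Q = L D L^T,$$

where D is a strictly positive diagonal matrix and L is a lower triangular band matrix with unit diagonal.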

Smoothing Splines

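3. The Reinsch algorithm for spline smoothing

Step 1: Evaluate the vector $Q^T \mathbf{Y}$.

Step 2: Find the non-zero diagonals of $R + \alpha Q^T Q$, and hence the Cholesky decomposition factors L and D.

Step 3: Solve $L D L^T \boldsymbol{\gamma} = Q^T \mathbf{Y}$ for $\boldsymbol{\gamma}$ by forward and back substitution.

Step 4: Find $\mathbf{g}$ by $\mathbf{g} = \mathbf{Y} - \alpha Q \boldsymbol{\gamma}$.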
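The four steps can be sketched in R as follows. This is not a banded implementation: it uses dense matrices and R's `chol()`, which returns an upper triangular U with $U^T U = R + \alpha Q^T Q$ (equivalent to the $L D L^T$ form with $L D^{1/2} = U^T$), so it shows the structure of the algorithm rather than its linear-time cost. The function name `reinsch` and the test data are assumptions for illustration.

```r
reinsch <- function(tk, Y, alpha) {
  n <- length(tk)
  h <- diff(tk)
  # Band matrices Q (n x (n-2)) and R ((n-2) x (n-2)) from the knot spacings.
  Q <- matrix(0, n, n - 2)
  R <- matrix(0, n - 2, n - 2)
  for (k in seq_len(n - 2)) {
    Q[k, k]     <- 1 / h[k]
    Q[k + 1, k] <- -1 / h[k] - 1 / h[k + 1]
    Q[k + 2, k] <- 1 / h[k + 1]
    R[k, k]     <- (h[k] + h[k + 1]) / 3
    if (k < n - 2) R[k, k + 1] <- R[k + 1, k] <- h[k + 1] / 6
  }
  b <- crossprod(Q, Y)                          # Step 1: Q^T Y
  M <- R + alpha * crossprod(Q)                 # Step 2: R + alpha Q^T Q
  U <- chol(M)                                  #         Cholesky factor, M = U^T U
  gamma <- backsolve(U, forwardsolve(t(U), b))  # Step 3: forward, then back substitution
  g <- Y - alpha * Q %*% gamma                  # Step 4: g = Y - alpha Q gamma
  list(g = drop(g), gamma = drop(gamma), Q = Q, R = R, M = M)
}

set.seed(11)                                    # illustrative simulated data
tk  <- (0:11 / 11)^1.5
Y   <- cos(3 * tk) + rnorm(12, sd = 0.1)
fit <- reinsch(tk, Y, alpha = 0.005)

# Direct solution g = (I + alpha K)^{-1} Y for comparison.
K        <- fit$Q %*% solve(fit$R, t(fit$Q))
g_direct <- solve(diag(12) + 0.005 * K, Y)
```

The vector `fit$g` agrees with `g_direct` up to rounding error, and `fit$M` has non-zero entries only within two positions of the diagonal, which is the bandwidth-5 structure a banded solver would exploit.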

Smoothing Splines

4. Some concluding remarks

- The minimizing curve $\hat{g}$ essentially does not depend on a and b, as long as all the data points lie between a and b.
- If n=2, for any $\alpha$, setting $\hat{g}$ to be the straight line through the two points (t1,Y1) and (t2,Y2) will reduce S(g) to zero.
- If n=1, the minimizer is no longer unique, since any straight line through (t1,Y1) will yield a zero value of S(g).

Choosing the Smoothing Parameter

- Two different philosophical approaches
- Subjective choice
- Automatic method – chosen by data
- Cross-validation
- Generalized cross-validation

Choosing the Smoothing Parameter

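- Cross-validation: choose $\alpha$ to minimize

$$CV(\alpha) = n^{-1} \sum_{i=1}^{n} \{Y_i - \hat{g}^{(-i)}(t_i; \alpha)\}^2 = n^{-1} \sum_{i=1}^{n} \left( \frac{Y_i - \hat{g}(t_i)}{1 - A_{ii}(\alpha)} \right)^2,$$

where $\hat{g}^{(-i)}$ is the estimate computed without the i-th observation and $A(\alpha) = (I + \alpha K)^{-1}$ is the hat matrix.

- Generalized cross-validation: choose $\alpha$ to minimize

$$GCV(\alpha) = \frac{n^{-1} \sum_{i=1}^{n} \{Y_i - \hat{g}(t_i)\}^2}{\{1 - n^{-1}\,\mathrm{tr}\,A(\alpha)\}^2}.$$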
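Both scores can be computed from the hat matrix $A(\alpha) = (I + \alpha K)^{-1}$ and minimized over a grid of $\alpha$ values. The sketch below uses the standard leave-one-out shortcut $\{Y_i - \hat{g}(t_i)\}/\{1 - A_{ii}(\alpha)\}$ for cross-validation and the trace form for generalized cross-validation; the helper names `penalty_K` and `scores`, the grid, and the simulated data are assumptions for illustration, and dense matrices are used for clarity rather than speed.

```r
# K = Q R^{-1} Q^T for design points tk, built from the knot spacings.
penalty_K <- function(tk) {
  n <- length(tk)
  h <- diff(tk)
  Q <- matrix(0, n, n - 2)
  R <- matrix(0, n - 2, n - 2)
  for (k in seq_len(n - 2)) {
    Q[k, k]     <- 1 / h[k]
    Q[k + 1, k] <- -1 / h[k] - 1 / h[k + 1]
    Q[k + 2, k] <- 1 / h[k + 1]
    R[k, k]     <- (h[k] + h[k + 1]) / 3
    if (k < n - 2) R[k, k + 1] <- R[k + 1, k] <- h[k + 1] / 6
  }
  Q %*% solve(R, t(Q))
}

set.seed(3)                                    # illustrative simulated data
n  <- 30
tk <- (0:(n - 1) / (n - 1))^1.2
Y  <- sin(2 * pi * tk) + rnorm(n, sd = 0.25)
K  <- penalty_K(tk)

# CV and GCV scores for one value of alpha.
scores <- function(alpha) {
  A   <- solve(diag(n) + alpha * K)            # hat matrix A(alpha)
  res <- Y - drop(A %*% Y)                     # residuals Y_i - g_hat(t_i)
  c(cv  = mean((res / (1 - diag(A)))^2),
    gcv = mean(res^2) / (1 - sum(diag(A)) / n)^2)
}

alphas    <- 10^seq(-6, 0, by = 0.5)           # grid of candidate values
tab       <- t(sapply(alphas, scores))
alpha_cv  <- alphas[which.min(tab[, "cv"])]
alpha_gcv <- alphas[which.min(tab[, "gcv"])]
```

In practice a finer grid or a one-dimensional optimizer such as `optimize()` on $\log \alpha$ could replace the coarse grid used here.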

Available Software

smooth.spline in R

- Description:

Fits a cubic smoothing spline to the supplied data.

- Usage:

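plot(cars$speed, cars$dist)

## default fit: smoothing parameter chosen by generalized cross-validation
cars.spl <- with(cars, smooth.spline(speed, dist))

## fit with 10 equivalent degrees of freedom
cars.spl2 <- with(cars, smooth.spline(speed, dist, df = 10))

lines(cars.spl, col = "blue")

lines(cars.spl2, lty = 2, col = "red")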

Available Software

Example 1

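## smooth.spline() is in the stats package, which is attached by default
## (the former modreg package was merged into stats in R 1.9.0)

y18 <- c(1:3, 5, 4, 7:3, 2*(2:5), rep(10, 4))

xx <- seq(1, length(y18), length.out = 201)   # fine grid for prediction

(s2 <- smooth.spline(y18))                    # smoothing parameter chosen by GCV

(s02 <- smooth.spline(y18, spar = 0.2))       # fixed spar, a rougher fit

plot(y18, main = deparse(s2$call), col.main = 2)

lines(s2, col = "blue")                       # fitted values at the data points

lines(s02, col = "orange")

lines(predict(s2, xx), col = 2)               # fits evaluated on the fine grid

lines(predict(s02, xx), col = 3)

mtext(deparse(s02$call), col = 3)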


Available Software

Example 2

data(cars)   ## N = 50 observations, n = 19 distinct values of speed

plot(dist ~ speed, data = cars, main = "data(cars) & smoothing splines")

cars.spl <- with(cars, smooth.spline(speed, dist))   # default: GCV

cars.spl2 <- with(cars, smooth.spline(speed, dist, df = 10))

lines(cars.spl, col = "blue")

lines(cars.spl2, lty = 2, col = "red")

## spar: rescaled smoothing parameter, typically in (0,1]; the penalty
## weight increases monotonically with spar

lines(smooth.spline(cars, spar = 0.1))

legend(5, 120, c(paste("default [GCV] => df =", round(cars.spl$df, 1)), "s( * , df = 10)"), col = c("blue", "red"), lty = 1:2, bg = 'bisque')


Extensions of Roughness penalty approach

- Semiparametric modeling: a simple application to multiple regression
- Generalized linear models (GLM)
- Allowing all the explanatory variables to enter the model nonlinearly
- Additive model approach

Reference

- P.J. Green and B.W. Silverman (1994) Nonparametric Regression and Generalized Linear Models. London: Chapman & Hall
