  1. Our final topics this semester are density estimation and nonparametric regression, in sections 10.1 and 10.2...
  • First we'll estimate a probability density function, f(x), using a sample of size n from the population with that density: X1, X2, ..., Xn. The simplest example of a density estimator is the old-fashioned histogram. Divide the range of values into intervals (usually of equal length) with endpoints a1 < a2 < ... < ak. Then we may estimate f(x) for ai < x <= ai+1 by the formula sketched below.
  • Use between 5 and 10 intervals in the histogram, though one recommendation ties the interval width directly to the sample size.
  • A better method is the kernel method of nonparametric density estimation; this method basically uses points near x to estimate f(x). Let w(z) be a probability density function symmetric around 0 (we'll use the standard normal). Here w is called the kernel, with bandwidth Δ, and our kernel estimate of f(x) is given below.
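The formula images from this slide are not preserved in the transcript; the standard forms consistent with the definitions above are, for the histogram (with ai < x <= ai+1),

\hat{f}(x) = \frac{\#\{\,j : a_i < X_j \le a_{i+1}\,\}}{n\,(a_{i+1} - a_i)},

and for the kernel estimate with kernel w and bandwidth Δ,

\hat{f}(x) = \frac{1}{n\Delta} \sum_{j=1}^{n} w\!\left(\frac{x - X_j}{\Delta}\right).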

  2. One suggestion for choosing the bandwidth is to let Δ be given by the rule sketched below.
  • As with histograms, smaller values of Δ will yield estimates that fit more details in the data, while larger values lead to smoother estimates. Our textbook says "one should not be timid about varying delta" – try several values around the suggested one to see which gives a better-fitting estimate...
  • Two other methods of bandwidth selection are:
  • cross-validation (hcv – see below)
  • the Sheather-Jones approach (hsj – see below)
  • Besides the R function density, which gives smooth estimates of a univariate distribution of data, there is another R package that adds several functions of interest to us, including the two bandwidth selectors above. It is called sm (for "smoothing"). Load it in R with library(sm); if your version of R doesn't have it available, go to CRAN and download it... I'll show you how in class.
  • Another function that helps with density estimation is rug, in R's base graphics... see the help file and try it on the faithful data, as in the sketch below.
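The bandwidth formula image from this slide is not preserved; a common normal-reference suggestion of this kind (not necessarily the textbook's exact rule) is

\Delta = 1.06\, s\, n^{-1/5},

where s is the sample standard deviation. A minimal R sketch of the tools named above, assuming the sm package is installed (the choice of the faithful eruptions column is ours, for illustration):

    library(sm)                  # Bowman & Azzalini's smoothing package
    x <- faithful$eruptions      # Old Faithful eruption durations
    plot(density(x))             # base R kernel estimate, default bandwidth
    rug(x)                       # tick marks showing the raw data
    hcv(x)                       # cross-validation bandwidth
    hsj(x)                       # Sheather-Jones bandwidth
    sm.density(x, h = hsj(x))    # kernel estimate with the SJ bandwidth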

  3. The use of regression methods to model two-dimensional data is widespread, but the usual parametric assumptions of normality are often not applicable, and nonparametric regression attempts to fill the gap... Also, we sometimes just want a smooth curve (rather than a line) to guide our exploration of a scatterplot.
  • The basic idea is to use the (x,y) data to fit a more general function than a linear one: fit y = m(x) + e instead of y = ax + b + e. Here y is the response, x is the explanatory variable, m(x) is the (unknown) underlying function relating x to y, and e is assumed to be an independent error term with mean 0 and standard deviation σ (see the simulated example below).
  • Just as density estimation tries to "average" counts of data locally, so does nonparametric regression; i.e., the estimated curve at x is made of "averages" of the response in a neighborhood of x – local regression.
  • There are many approaches... a simple kernel method is called the local mean estimator.
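As a concrete illustration of the model y = m(x) + e (the function m and all names here are invented for the example, not taken from the slides):

    set.seed(1)
    n <- 100
    x <- runif(n, 0, 2*pi)
    m <- function(x) sin(x)           # the "unknown" true regression function
    e <- rnorm(n, mean = 0, sd = 0.3) # independent errors, mean 0, s.d. sigma = 0.3
    y <- m(x) + e
    plot(x, y)                        # a smooth curve, not a line, fits this pattern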

  4. Here's the local mean estimator, sketched below (w is the kernel function, h is the bandwidth; frequently w is taken to be a normal kernel with mean 0 and s.d. h).
  • Note that this is just a weighted mean of the response values, with the highest weights assigned to the data values closest to the x where m(x) is being estimated.
  • As before, the smoothing parameter (bandwidth) h controls the width of the kernel and hence the amount of smoothing. When h is the s.d. of a normal kernel, observations within about 3h of x will contribute to the estimate. As h increases, some characteristics of the relationship between x and y are missed; as h decreases, the estimate tracks the data too closely and the result is a "wiggly" estimate.
  • Another, more powerful regression estimator is the local linear estimator – it is implemented in R in the sm package as sm.regression(x,y), and a similar function called loess is built into R... we'll see now how these work...
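The formula image is not preserved in the transcript; the standard local mean (Nadaraya-Watson) form, consistent with the weighted-mean description above, is

\hat{m}(x) = \frac{\sum_{i=1}^{n} w\!\left(\frac{x - x_i}{h}\right) y_i}{\sum_{i=1}^{n} w\!\left(\frac{x - x_i}{h}\right)}.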

  5. The local linear regression function sm.regression uses the usual least squares fitting, with weights applied by the kernel function with bandwidth h: find the values of a and b that minimize the weighted sum of squares sketched below.
  • In sm.regression, the default kernel is the normal kernel with s.d. h and mean 0. Note that this can be considered an extension of the usual least squares regression, in the sense that as h gets larger and larger, local linear regression approaches ordinary least squares (the fit from lm(y ~ x) matches sm.regression(x, y) when h is very large).
  • The following page gives some R code that applies the sm.regression function to a dataset of Bowman and Azzalini's involving carbon dating... Their provide.data function puts the dataset into the workspace...
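The minimization criterion image is not preserved; the standard weighted least squares objective matching the description above is

\sum_{i=1}^{n} \left\{\, y_i - a - b\,(x_i - x) \,\right\}^2 w\!\left(\frac{x_i - x}{h}\right),

and the local linear estimate at x is then \hat{m}(x) = \hat{a}.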

  6. provide.data(radioc)  # loads the radiocarbon data (assumes library(sm) from slide 2);
                           # Cal.age is the "true" calendar age, Rc.age is the carbon-dated age
     par(mfrow=c(1,2))
     plot(Cal.age, Rc.age); abline(0, 1, lty=2)  # note the 2 ages don't agree
     # look at those points whose calendar age is between 2000 and 3000
     ind <- (Cal.age > 2000 & Cal.age < 3000)    # logical vector, TRUE when both conditions hold
     ind
     # now subset with respect to ind and do least squares and
     # local linear regression and compare...
     cal.age <- Cal.age[ind]; rc.age <- Rc.age[ind]
     sm.regression(cal.age, rc.age, h=30)
     sm.regression(cal.age, rc.age, h=1000, lty=2, add=T)
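To see slide 5's claim in action – that local linear regression with very large h approaches ordinary least squares – one could overlay the lm fit on the same plot (a sketch; the line type is our choice):

    fit <- lm(rc.age ~ cal.age)  # ordinary least squares on the subset
    abline(fit, lty=3)           # should nearly coincide with the h=1000 curve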
