The examination of residuals
Download
1 / 67

The Examination of Residuals - PowerPoint PPT Presentation


  • 240 Views
  • Updated On :

The Examination of Residuals. where is an observation and is the corresponding fitted value obtained by use of the fitted model. The residuals are defined as the n differences :.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'The Examination of Residuals' - xue


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

Slide2 l.jpg

  • The residuals are defined as the n differences :


Slide3 l.jpg

  • Namely; the random departures are assumed

  • i) to have zero mean,

  • ii) to have a constant variance, s2,

  • iii) independent, and

  • iv) follow a normal distribution.


Slide4 l.jpg

Thus if the fitted model is correct, nonlinear regression analysis are based certain assumptions about the random departures from the proposed model.

  • the residuals should exhibit tendencies that tend to confirm the above assumptions, or at least, should not exhibit a denial of the assumptions.


Slide5 l.jpg

1. Overall. nonlinear regression analysis are based certain assumptions about the random departures from the proposed model.

The principal ways of plotting the residuals ei are:

2. In time sequence, if the order is known.

3. Against the fitted values

4. Against the independent variables xij for each value of j

In addition to these basic plots, the residuals should also be plotted

5. In any way that is sensible for the particular problem under consideration,


Overall plot l.jpg

Overall Plot nonlinear regression analysis are based certain assumptions about the random departures from the proposed model.

The residuals can be plotted in an overall plot in several ways.


1 the scatter plot l.jpg

2. The histogram. nonlinear regression analysis are based certain assumptions about the random departures from the proposed model.

1. The scatter plot.

3. The box-whisker plot.

4. The kernel density plot

5. a normal plot or a half normal plot on standard probability paper.


Slide8 l.jpg

1. The Kolmogorov-Smirnov test.

2. The Chi-square goodness of fit test


Slide9 l.jpg

The Kolmogorov-Smirnov test

  • The empirical distribution function is defined below for n random observations

Fn(x) = the proportion of observations in the sample that are less than or equal to x.


Slide10 l.jpg

Let F distribution function as a tool for testing the goodness of fit of a distribution. 0(x) denote the hypothesized cumulative distribution function of the population (Normal population if we were testing normality)

If F0(x) truly represented distribution of observations in the population than Fn(x) will be close to F0(x) for all values of x.


Slide11 l.jpg

The Kolmogorov-Smirinov test statistic is : distribution function as a tool for testing the goodness of fit of a distribution.

= the maximum distance between Fn(x) and F0(x).

  • If F0(x) does not provide a good fit to the distributions of the observation - Dn will be large.

  • Critical values for are given in many texts


Slide12 l.jpg

The Chi-square goodness of fit test

  • Let fi denote the observed frequency in each of the class intervals of the histogram.

  • Let Ei denote the expected number of observation in each class interval assuming the hypothesized distribution.


Slide13 l.jpg

The hypothesized distribution is rejected if the statistic: the goodness of fit of a distribution.

  • is large. (greater than the critical value from the chi-square distribution with m - 1 degrees of freedom.

  • m = the number of class intervals used for constructing the histogram).


Slide14 l.jpg

The in the above tests it is assumed that the residuals are independent with a common variance of s2.

Note.

This is not completely accurate for this reason:

Although the theoretical random errors ei are all assumed to be independent with the same variance s2, the residuals are not independent and they also do not have the same variance.


Slide15 l.jpg

They will however be approximately independent with common variance if the sample size is large relative to the number of parameters in the model.

It is important to keep this in mind when judging residuals when the number of observations is close to the number of parameters in the model.


Slide16 l.jpg

Time Sequence Plot variance if the sample size is large relative to the number of parameters in the model.

The residuals should exhibit a pattern of independence.

If the data was collected in time there could be a strong possibility that the random departures from the model are autocorrelated.


Slide17 l.jpg

Namely the random departures for observations that were taken at neighbouring points in time are autocorrelated.

This autocorrelation can sometimes be seen in a time sequence plot.

The following three graphs show a sequence of residuals that are respectively

i) positively autocorrelated ,

ii) independent and

iii) negatively autocorrelated.


I positively auto correlated residuals l.jpg
i) Positively auto-correlated residuals taken at neighbouring points in time are autocorrelated.


Ii independent residuals l.jpg
ii) Independent residuals taken at neighbouring points in time are autocorrelated.


Iii negatively auto correlated residuals l.jpg
iii) Negatively auto-correlated residuals taken at neighbouring points in time are autocorrelated.


Slide21 l.jpg

There are several statistics and statistical tests that can also pick out autocorrelation amongst the residuals. The most common are:

i) The Durbin Watson statistic

ii) The autocorrelation function

iii) The runs test


Slide22 l.jpg

The Durbin-Watson statistic which is used frequently to detect serial correlation is defined by the following formula:

The Durbin Watson statistic:

If the residuals are serially correlated the differences, ei - ei+1, will be stochastically small. Hence a small value of the Durbin-Watson statistic will indicate positive autocorrelation. Large values of the Durbin-Watson statistic on the other hand will indicate negative autocorrelation. Critical values for this statistic, can be found in many statistical textbooks.


Slide23 l.jpg

The autocorrelation function at lag k is defined by detect serial correlation is defined by the following formula::

The autocorrelation function:

This statistic measures the correlation between residuals the occur a distance k apart in time. One would expect that residuals that are close in time are more correlated than residuals that are separated by a greater distance in time. If the residuals are independent than rk should be close to zero for all values of k A plot of rk versus k can be very revealing with respect to the independence of the residuals. Some typical patterns of the autocorrelation function are given below:


Slide24 l.jpg

This statistic measures the correlation between residuals the occur a distance k apart in time.

One would expect that residuals that are close in time are more correlated than residuals that are separated by a greater distance in time.

If the residuals are independent than rk should be close to zero for all values of k A plot of rk versus k can be very revealing with respect to the independence of the residuals.


Slide25 l.jpg

Auto correlation pattern for independent residuals the occur a distance k apart in time.

Some typical patterns of the autocorrelation function are given below:



Slide28 l.jpg

This test uses the fact that the residuals will oscillate about zero at a “normal” rate if the random departures are independent.

The runs test:

If the residuals oscillate slowly about zero, this is an indication that there is a positive autocorrelation amongst the residuals.

If the residuals oscillate at a frequent rate about zero, this is an indication that there is a negative autocorrelation amongst the residuals.


Slide29 l.jpg

In the “runs test”, one observes the time sequence of the “sign” of the residuals:

+ + + - - + + - - - + + +

and counts the number of runs (i.e. the number of periods that the residuals keep the same sign).

This should be low if the residuals are positively correlated and high if negatively correlated.


Slide30 l.jpg

Plot Against fitted values the “sign” of the residuals:and the Predictor Variables Xij

If we "step back" from this diagram and the residuals behave in a manner consistent with the assumptions of the model we obtain the impression of a horizontal "band " of residuals which can be represented by the diagram below.


Slide31 l.jpg

Individual observations lying considerably outside of this band indicate that the observation may be and outlier.

An outlier is an observation that is not following the normal pattern of the other observations.

Such an observation can have a considerable effect on the estimation of the parameters of a model.

Sometimes the outlier has occurred because of a typographical error. If this is the case and it is detected than a correction can be made.

If the outlier occurs for other (and more natural) reasons it may be appropriate to construct a model that incorporates the occurrence of outliers.


Slide32 l.jpg

If our "step back" view of the residuals resembled any of those shown below we should conclude that assumptions about the model are incorrect. Each pattern may indicate that a different assumption may have to be made to explain the “abnormal” residual pattern.

b)

a)


Slide33 l.jpg

Pattern a) indicates that the variance the random departures is not constant (homogeneous) but increases as the value along the horizontal axis increases (time, or one of the independent variables).

This indicates that a weighted least squares analysis should be used.

The second pattern, b) indicates that the mean value of the residuals is not zero.

This is usually because the model (linear or non linear) has not been correctly specified.

Linear and quadratic terms have been omitted that should have been included in the model.


Example analysis of residuals l.jpg

Example – Analysis of Residuals is not constant (homogeneous) but increases as the value along the horizontal axis increases (time, or one of the independent variables).

Motor Vehicle Data

Dependent = mpg

Independent = Engine size, horsepower and weight


Slide35 l.jpg

When a linear model was fit and residuals examined graphically the following plot resulted:


Slide36 l.jpg

The pattern that we are looking for is: graphically the following plot resulted:


Slide37 l.jpg

The pattern that was found is: graphically the following plot resulted:

This indicates a nonlinear relationship:

This can be handle by adding polynomial terms (quadratic, cubic, quartic etc.) of the independent variables or transforming the dependent variable


Slide38 l.jpg

Performing the log transformation on the dependent variable (mpg) results in the following residual plot

There still remains some non linearity


The log transformation l.jpg
The log transformation (mpg) results in the following residual plot


The box cox transformations l.jpg
The Box-Cox transformations (mpg) results in the following residual plot

l = 2

l = 1

l = 0

l = -1

l = -1


Slide41 l.jpg

The log ( (mpg) results in the following residual plotl = 0) transformation was not totally successful - try moving further down the staircase of the family of transformations

(l = -0.5)




Slide44 l.jpg

This corresponds to the model transformations (

or

and


Slide45 l.jpg

Checking normality with a transformations (P-P plot


Example l.jpg

Example transformations (

Non-Linear Regression


Slide47 l.jpg

In this example we are measuring the amount of a compound in the soil:

  • 7 days after application

  • 14 days after application

  • 21 days after application

  • 28 days after application

  • 42 days after application

  • 56 days after application

  • 70 days after application

  • 84 days after application


Slide48 l.jpg

This is carried out at two test plot locations the soil:

  • Craik

  • Tilson

6 measurements per location are made each time


The data l.jpg
The data the soil:


Graph l.jpg
Graph the soil:





Slide54 l.jpg

ANOVA Table (Craik) error by Excel

Parameter Estimates (Craik)


Slide55 l.jpg

Testing Hypothesis error by Excel: similar to linear regression

Caution: This statistic has only an approximate F – distribution when the sample size is large


Slide56 l.jpg

Example: Suppose we want to test error by Excel

H0: c = 0 against HA: c ≠ 0

Complete model

Reduced model


Slide57 l.jpg

ANOVA Table (Complete model) error by Excel

ANOVA Table (Reduced model)


Slide58 l.jpg

The Test error by Excel


Use of dummy variables l.jpg

Use of Dummy Variables error by Excel

Non Linear Regression


Slide60 l.jpg

The Model: error by Excel

or

where


Slide61 l.jpg

The data file error by Excel



Slide63 l.jpg

ANOVA Table error by Excel

Parameter Estimates


Slide64 l.jpg

Testing Hypothesis error by Excel:

Suppose we want to test

H0: Da = a1 – a2 = 0 and Dk = k1 – k2 = 0


Slide65 l.jpg

The Reduced Model: error by Excel

or


Slide66 l.jpg

ANOVA Table error by Excel

Parameter Estimates


Slide67 l.jpg

The error by ExcelF Test

Thus we accept the null Hypothesis that the reduced model is correct


ad