PowerPoint Slideshow about ' Residuals' - jessamine-french


Residuals

A continuation of regression analysis

slide2

Lesson Objectives

  • Continue to build on regression analysis.
  • Learn how residual plots help identify problems with the analysis.
slide3
Example 1: Sample of n = 5 students, Y = Weight in pounds, X = Height in inches.

Case    X     Y
1      73   175
2      68   158
3      67   140
4      72   207
5      62   115

Prediction equation:

$\widehat{Wt} = -332.73 + 7.189\,Ht$

r-square = ?

Std. error = ?

To be found later.

continued …

slide4

Example 1, continued

[Scatterplot: WEIGHT (axis 100 to 220) versus HEIGHT (axis 60 to 76), showing the five cases and the line $\hat{Y} = -332.7 + 7.189X$.]

Residuals = distance from point to line, measured parallel to Y-axis.

slide5
Calculation: For each case,

residual = observed value - estimated mean

For the ith case,

$e_i = y_i - \hat{y}_i$

slide6
Example 1, continued

Compute the fitted value and residual for the 4th person in the sample; i.e., X = 72 inches, Y = 207 lbs.

fitted value = $\hat{y}_4$ = -332.73 + 7.189( ) = _________

residual = $e_4 = y_4 - \hat{y}_4$ =          = __________
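The blanks in this exercise can be checked with a few lines of Python. This is only a sketch: the coefficients are the rounded values printed on the slides, and the function names `fitted_value` and `residual` are illustrative, not part of the presentation.

```python
# Coefficients of the prediction equation as printed on the slides (rounded).
A, B = -332.73, 7.189


def fitted_value(height):
    """Estimated mean weight (lbs) for a given height (inches)."""
    return A + B * height


def residual(height, weight):
    """Observed value minus fitted value."""
    return weight - fitted_value(height)


# 4th person in the sample: X = 72 inches, Y = 207 lbs.
x4, y4 = 72, 207
y4_hat = fitted_value(x4)
e4 = residual(x4, y4)

print(f"fitted value = {y4_hat:.2f} lbs")
print(f"residual e4  = {e4:+.2f} lbs")
```

The printed residual agrees with the value e4 = +22.12 quoted on the later slides.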

residual plots
Residual Plots

Scatterplot of residuals vs. the predicted means of Y, $\hat{Y}$; or an X-variable.

slide8

Example 1, continued

[Scatterplot: WEIGHT (axis 100 to 220) versus HEIGHT (axis 60 to 76), showing the five cases and the line $\hat{Y} = -332.7 + 7.189X$; the residual of the 4th case is marked, $e_4 = +22.12$.]

Residuals = distance from point to line, measured parallel to Y-axis.

slide9

Example 1, continued

Residual Plot

[Scatterplot: Residuals (axis -24 to 24) versus HEIGHT (axis 60 to 76), with a horizontal line at 0.]

$e_4$ is the residual for the 4th case, = +22.12.

Regression line from previous plot is rotated to horizontal.

residual plot
Residual Plot

Scatterplot of residuals versus the predicted means of Y, $\hat{Y}$; or an X-variable, or Time.

Expect random dispersion around a horizontal line at zero.

Problems occur if there are:
  • Unusual patterns
  • Unusual cases
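As a rough illustration, the points of a residual plot for Example 1 can be computed and drawn as text. This sketch uses the slides' rounded coefficients; the helper name `fitted` and the crude character plot are assumptions made for illustration, and a real analysis would use a plotting package.

```python
# Example 1 data from the slides: X = height (inches), Y = weight (lbs).
heights = [73, 68, 67, 72, 62]
weights = [175, 158, 140, 207, 115]


def fitted(x, a=-332.73, b=7.189):
    """Fitted value from the slides' prediction equation."""
    return a + b * x


# Each plotted point is (X, residual); the reference line is residual = 0.
points = sorted((x, y - fitted(x)) for x, y in zip(heights, weights))

for x, e in points:
    # Crude text rendering: one character column per 2 lbs of residual.
    offset = round(e / 2)
    row = [" "] * 25
    row[12] = "|"            # the horizontal line at zero, drawn vertically here
    row[12 + offset] = "*"   # the residual for this case
    print(f"{x:>3}  {e:+7.2f}  {''.join(row)}")
```

With only five cases there is little to judge, but the same (X, residual) pairs are what a residual plot displays for larger samples.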

slide11

Residuals versus X

[Residual plot: Residuals versus X, or time, with a horizontal line at 0.]

Good random pattern

slide12

Residuals versus X

[Residual plot: Residuals versus X, or time, with a horizontal line at 0.]

Outliers?

Next step: ________ to determine if a recording error has occurred.

slide13

Residuals versus X

[Residual plot: Residuals versus X, or time, with a horizontal line at 0.]

Nonlinear relationship

Next step: Add a “quadratic term,” or use “______.”

slide14

Residuals versus X

[Residual plot: Residuals versus X, or time, with a horizontal line at 0.]

Variance is increasing

Next step: Stabilize variance by using “________.”

residual plots help identify
Residual Plots help identify

Unusual patterns:

  • Possible curvature in the data.
  • Variances that are not constant as X changes.

Unusual cases:

  • Outliers
  • High leverage cases
  • Influential cases
three properties of residuals
Three properties of Residuals

illustrated with some computations.

slide17
Property 1.

Y = Weight, X = Height

$\hat{Y} = -332.73 + 7.189X$

Residuals: $e = Y - \hat{Y}$

 X     Y    $\hat{Y}$      $e$
73   175    192.07    -17.07
68   158    156.12      1.88
67   140    . . .
72   207
62   115

Find the sum of the residuals: .01 (round-off error)

slide19
Property 2.

Y = Weight, X = Height

$\hat{Y} = -332.73 + 7.189X$

Residuals: $e = Y - \hat{Y}$

 X     Y    $\hat{Y}$      $e$      $e^2$
73   175    192.07    -17.07    291.38
68   158    156.12      1.88      3.53
67   140    148.93     -8.93     79.74
72   207    184.88     22.12    489.29
62   115    112.99      2.01      4.04
               Sum:      .01    867.98

Find the sum of squares of the residuals.
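The two sums can be reproduced with a short script. This sketch rounds fitted values and residuals to two decimals, as the slide's table appears to do; the variable names are illustrative.

```python
# Example 1 data and the slides' rounded prediction equation.
heights = [73, 68, 67, 72, 62]
weights = [175, 158, 140, 207, 115]
A, B = -332.73, 7.189

rows = []
for x, y in zip(heights, weights):
    y_hat = round(A + B * x, 2)      # fitted value, kept to two decimals
    e = round(y - y_hat, 2)          # residual = observed - fitted
    rows.append((x, y, y_hat, e, round(e * e, 2)))

sum_e = sum(r[3] for r in rows)      # Property 1: sum of residuals
sse = sum(r[4] for r in rows)        # Property 2: sum of squared residuals

print(" X    Y    Y-hat       e      e^2")
for x, y, y_hat, e, e2 in rows:
    print(f"{x:>2}  {y:>3}  {y_hat:7.2f}  {e:6.2f}  {e2:7.2f}")
print(f"sum of residuals = {sum_e:.2f}, SSE = {sse:.2f}")
```

The residuals sum to .01 rather than exactly zero only because the coefficients in the prediction equation are rounded.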

properties of least squares line1
Properties of Least Squares Line

1. Residuals always sum to zero.

2. This “least squares” line produces a smaller “Sum of squared residuals” than any other straight line can.

$\sum e_i^2 = \mathrm{SSE} = 867.98 <$ “SSE for any other line”.

slide21

Property 3.

[Scatterplot: WEIGHT (axis 100 to 220) versus HEIGHT (axis 60 to 76) with the fitted line; the point $(\bar{X}, \bar{Y})$ is marked, with $\bar{X} = 68.4$, $\bar{Y} = 159$.]

properties of least squares line2
Properties of Least Squares Line

1. Residuals always sum to zero.

2. This “least squares” line produces a smaller “Sum of squared residuals” than any other straight line can.

3. Line always passes through the point $(\bar{x}, \bar{y})$.
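The three properties can be checked numerically on the Example 1 data. The sketch below uses the standard closed-form least squares formulas for a straight line, which the slides do not show; the function names `least_squares` and `sse` and the particular "other lines" tried are illustrative choices.

```python
heights = [73, 68, 67, 72, 62]
weights = [175, 158, 140, 207, 115]


def least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def sse(a, b, xs, ys):
    """Sum of squared residuals for the line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


a, b = least_squares(heights, weights)
residuals = [y - (a + b * x) for x, y in zip(heights, weights)]
x_bar = sum(heights) / len(heights)
y_bar = sum(weights) / len(weights)
best = sse(a, b, heights, weights)

print(f"a = {a:.2f}, b = {b:.3f}")
print(f"1. sum of residuals = {sum(residuals):.6f}")
for da, db in [(5, 0), (-5, 0), (0, 0.1), (0, -0.1)]:
    other = sse(a + da, b + db, heights, weights)
    print(f"2. SSE with shift ({da:+}, {db:+}) = {other:.2f} > {best:.2f}")
print(f"3. fitted value at x-bar = {a + b * x_bar:.2f}, y-bar = {y_bar:.2f}")
```

Because the coefficients here are not rounded, they agree with the slides' equation only up to rounding, and the SSE differs slightly from 867.98 for the same reason.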
slide23

Illustration of unusual cases:

  • Outliers
  • Leverage
  • Influential
slide24

[Scatterplot of Y versus X; one point is labeled “outlier”.]

“Unusual point” does not follow pattern. It’s near the X-mean; the entire line pulled toward it.

slide25

[Scatterplot of Y versus X; one point is labeled “outlier”.]

“Unusual point” does not follow pattern. The line is pulled down and twisted slightly.

slide26

[Scatterplot of Y versus X; one point is labeled “High leverage”.]

“Unusual point” is far from the X-mean, but still follows the pattern.

slide27

[Scatterplot of Y versus X; one point is labeled “leverage & outlier, influential”.]

“Unusual point” is far from the X-mean, but does not follow the pattern. Line really twists!

slide28
Definitions:

Outlier:

An unusual y-value relative to the pattern of the other cases. Usually has a large residual.

High Leverage Case:

An extreme X value relative to the other X values.

slide30
Definitions: continued

Conclusion:

High leverage: potentially influential.

High leverage & Outlier: influential!!
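One way to see the difference between these cases is to refit the line with a single extra point added and watch the slope. In the sketch below the base data are the five cases from Example 1, while the three extra points are hypothetical values invented for illustration; the function name `slope_intercept` is also an assumption.

```python
def slope_intercept(points):
    """Least squares slope and intercept for a list of (x, y) pairs."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Example 1 cases (height, weight).
base = [(73, 175), (68, 158), (67, 140), (72, 207), (62, 115)]

# Hypothetical extra cases, one per type of unusual point.
extra = {
    "outlier near X-mean": (68, 250),
    "high leverage, on pattern": (80, 242),
    "high leverage & outlier": (80, 120),
}

base_slope, _ = slope_intercept(base)
slope_change = {}
for label, point in extra.items():
    slope, intercept = slope_intercept(base + [point])
    slope_change[label] = slope - base_slope
    print(f"{label:<26} slope = {slope:6.3f} (change {slope_change[label]:+.3f})")
```

With these made-up points, the high leverage case that follows the pattern barely moves the slope, the outlier near the X-mean changes it modestly, and the point that is both high leverage and an outlier twists the line far more.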

slide33

Lesson Objectives

  • Learn two ways to use Minitab to run a regression analysis.
  • Learn how to read output from Minitab.
slide34

Example 3, continued …

Can height be predicted using shoe size?

Step 1?

DTDP

slide35

Example 3, continued …

Can height be predicted using shoe size?

Graph

Scatterplot

Plot …

[Scatterplot; legend: Female, Male.]

“Jitter” added in X-direction.

The scatter for each subpopulation is about the same; i.e., there is “constant variance.”

slide36

Example 3, continued …

Method 1

Stat > Regression > Regression …

Y = a + bX

slide37

Example 3, continued …

Can height be predicted using shoe size?

Copied from “Session Window.”

Regression Analysis: Height versus Shoe Size

The regression equation is
Height = 50.5 + 1.87 Shoe Size

Predictor      Coef   SE Coef      T      P
Constant    50.5230    0.5912  85.45  0.000
Shoe Siz    1.87241   0.06033  31.04  0.000

S = 1.947   R-Sq = 79.1%   R-Sq(adj) = 79.0%

Analysis of Variance

Source       DF      SS      MS       F      P
Regression    1  3650.0  3650.0  963.26  0.000
Error       255   966.3     3.8
Total       256  4616.3

slide38

Example 3, continued …

Can height be predicted using shoe size?

[Minitab output repeated from slide37.]

Least squares estimated coefficients: the “Coef” column (Constant 50.5230, Shoe Siz 1.87241).

Total “Degrees of Freedom” = Number of cases - 1

slide39

Example 3, continued …

Can height be predicted using shoe size?

[Minitab output repeated from slide37.]

$\text{R-Sq} = \dfrac{\mathrm{SSR}}{\mathrm{TSS}} = \dfrac{3650.0}{4616.3}$

slide40

Example 3, continued …

Can height be predicted using shoe size?

[Minitab output repeated from slide37.]

$S = \sqrt{\mathrm{MSE}} = \sqrt{3.8}$

Standard Error of Regression. Measure of variation around the regression line.

Sum of squared residuals (Error SS = 966.3)

Mean Squared Error, MSE (Error MS = 3.8)
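The two summary measures can be recomputed from the Analysis of Variance table. This is a small sketch using only the numbers printed in the Minitab output; the variable names are illustrative.

```python
import math

# Values read from the Analysis of Variance table in the Minitab output.
ssr = 3650.0       # Regression SS
sse = 966.3        # Error SS (sum of squared residuals)
tss = 4616.3       # Total SS
df_error = 255     # Error degrees of freedom

r_sq = ssr / tss           # share of total variation explained by the line
mse = sse / df_error       # Mean Squared Error
s = math.sqrt(mse)         # Standard Error of Regression

print(f"R-Sq = {100 * r_sq:.1f}%")
print(f"MSE  = {mse:.1f}")
print(f"S    = {s:.3f}")
```

Note that S comes from the unrounded MSE (966.3 / 255); taking the square root of the rounded 3.8 gives a slightly different third decimal.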

slide41

Example 3, continued …

Can height be predicted using shoe size?

Are there any problems visible in this plot?

___________

No “Jitter” added.

slide42

Example 3, continued …

Can height be predicted using shoe size?

Least squares regression equation:

Height = 50.52 + 1.872 Shoe

Std. error = 1.947 inches, r-square = 79.1%

The two summary measures that should always be given with the equation.

slide43

Example 3, continued …

Can height be predicted using shoe size?

Method 2

Stat > Regression > Fitted Line Plot …

Y = a + bX

This program gives a scatterplot with the regression superimposed on it.

slide44

Example 3, continued …

Can height be predicted using shoe size?

The fit looks

slide45

Example 3, continued …

Can height be predicted using shoe size?

[Minitab output repeated from slide37.]

What information do these values provide?

slide46

1

How do you determine if the X-variable is a useful predictor?

Use the “t-statistic” or the F-stat.

“t” measures how many standard errors the estimated coefficient is from “zero.”

$F = t^2$ for simple regression.
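Both statistics can be recomputed from the Minitab output for Example 3. The sketch below divides the Shoe Size coefficient by its standard error and compares t squared with F from the Analysis of Variance table; the variable names are illustrative.

```python
# Shoe Size row of the coefficient table in the Minitab output.
coef = 1.87241
se_coef = 0.06033

# t = how many standard errors the estimated coefficient is from zero.
t = coef / se_coef

# F = Regression MS / Error MS, using the unrounded Error MS (966.3 / 255).
msr = 3650.0
mse = 966.3 / 255
f_stat = msr / mse

print(f"t   = {t:.2f}")
print(f"t^2 = {t * t:.1f}")
print(f"F   = {f_stat:.1f}")
```

The values of t squared and F match the printed F = 963.26 only approximately, because the output rounds the sums of squares and coefficients.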

slide47

2

How do you determine if the X-variable is a useful predictor?

A “P-value” is associated with “t” and “F”.

The further “t” and “F” are from zero, in either direction, the smaller the corresponding P-value will be.

P-value: a measure of the “likelihood that the true coefficient IS ZERO.” More precisely, it is the probability of getting a “t” or “F” at least this far from zero if the true coefficient were zero.

slide48

3

If the P-value IS SMALL (typically “< 0.10”), then conclude:

1. It is unlikely that the true coefficient is really zero, and therefore,
2. The X variable IS a useful predictor for the Y variable. Keep the variable!

If the P-value is NOT SMALL (i.e., “> 0.10”), then conclude:

1. For all practical purposes the true coefficient MAY BE ZERO; therefore
2. The X variable IS NOT a useful predictor of the Y variable. Don’t use it.
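The decision rule can be written as a short function. This sketch approximates the two-sided P-value with the standard normal distribution instead of Student's t, which is an assumption that is reasonable only when the error degrees of freedom are large (255 in Example 3); Minitab's own calculation is not reproduced here, and the function names are illustrative.

```python
from statistics import NormalDist


def approx_two_sided_p(t):
    """Two-sided P-value, using the standard normal in place of Student's t."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


def useful_predictor(p_value, cutoff=0.10):
    """Apply the slides' rule: a small P-value means keep the variable."""
    return p_value < cutoff


# Shoe Size coefficient and standard error from the Minitab output.
t_shoe = 1.87241 / 0.06033
p_shoe = approx_two_sided_p(t_shoe)
print(f"shoe size: t = {t_shoe:.2f}, approx P = {p_shoe:.3f}, "
      f"useful = {useful_predictor(p_shoe)}")

# A hypothetical coefficient only one standard error from zero, for contrast.
p_weak = approx_two_sided_p(1.0)
print(f"t = 1.00: approx P = {p_weak:.3f}, useful = {useful_predictor(p_weak)}")
```

The 0.10 cutoff is the one used on the slides; other courses and fields often use 0.05.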

slide49

Example 3, continued …

Can height be predicted using shoe size?

Could “shoe size” have a true coefficient that is actually “zero”?

[Minitab output repeated from slide37.]

“t” measures how many standard errors the estimated coefficient is from “zero.”

P-value: a measure of the likelihood that the true coefficient is “zero.”

The P-value for Shoe Size IS SMALL (< 0.10).

Conclusion:

The “shoe size” coefficient is NOT zero! “Shoe size” IS a useful predictor of the mean of “height”.

slide50

The logic just explained is statistical inference.

This will be covered in more detail during the last three weeks of the course.
