chapter 5
Download
Skip this Video
Download Presentation
Chapter 5

Loading in 2 Seconds...

play fullscreen
1 / 23

Chapter 5 - PowerPoint PPT Presentation


  • 95 Views
  • Uploaded on

Chapter 5. Residuals, Residual Plots, & Influential points. Residuals (error) -. The vertical deviation between the observations & the LSRL the sum of the residuals is always zero error = observed - expected. Residual plot . A scatterplot of the ( x , residual) pairs.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about ' Chapter 5' - trevor-mccray


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
chapter 5

Chapter 5

Residuals, Residual Plots, & Influential points

residuals error
Residuals (error) -
  • The vertical deviation between the observations & the LSRL
  • the sum of the residuals is always zero
  • error = observed - expected
residual plot
Residual plot
  • A scatterplot of the (x, residual) pairs.
  • Residuals can be graphed against other statistics besides x
  • Purpose is to tell if a linear association exist between the x & y variables
slide4

60

64

68

Consider a population of adult women. Let’s examine the relationship between their height and weight.

Weight

Height

slide5

Residuals

Weight

60

64

68

Height

Suppose we now take a random sample from our population of women.

residual plot1
Residual plot
  • A scatterplot of the (x, residual) pairs.
  • Residuals can be graphed against other statistics besides x
  • Purpose is to tell if the model is an appropriate fit between the x & y variables
  • If no pattern exists between the points in the residual plot, then the model is appropriate.
slide7

Model is appropriate

Model is NOT appropriate

slide8

Residuals

x

Age Range of Motion

35 154

24 142

40 137

31 133

28 122

25 126

26 135

16 135

14 108

20 120

21 127

30 122

One measure of the success of knee surgery is post-surgical range of motion for the knee joint following a knee dislocation. Is there a linear relationship between age & range of motion?

Sketch a residual plot.

Since there is no pattern in the residual plot, there is a linear relationship between age and range of motion

slide9

Residuals

Age Range of Motion

35 154

24 142

40 137

31 133

28 122

25 126

26 135

16 135

14 108

20 120

21 127

30 122

Plot the residuals against the y-hats. How does this residual plot compare to the previous one?

slide10

Residuals

Residuals

x

Residual plots are the same no matter if plotted against x or y-hat.

coefficient of determination
Coefficient of determination-
  • r2
  • the proportion of variation in y that can be attributed to a approximate linear relationship between x & y
  • remains the same no matter which variable is labeled x
slide12

Age Range of Motion

35 154

24 142

40 137

31 133

28 122

25 126

26 135

16 135

14 108

20 120

21 127

30 122

Let’s examine r2.

Suppose you were going to predict a future y but you didn’tknow the x-value. Your best guess would be the overall mean of the existing y’s.

Total sum of the squared deviations -

Total variation

SStotal = 1564.917

slide13

Age Range of Motion

35 154

24 142

40 137

31 133

28 122

25 126

26 135

16 135

14 108

20 120

21 127

30 122

Now suppose you were going to predict a future y but you DO know the x-value. Your best guess would be the point on the LSRL for that x-value (y-hat).

Sum of the squared residuals using the LSRL.

SSResid = 1085.735

slide14

SSTotal = 1564.917

Age Range of Motion

35 154

24 142

40 137

31 133

28 122

25 126

26 135

16 135

14 108

20 120

21 127

30 122

SSResid = 1085.735

By what percent did the sum of the squared error go down when you went from just an “overall mean” model to the “regression on x” model?

This is r2 – the amount of the variation in the y-values that is explained by the x-values.

slide15

Age Range of Motion

35 154

24 142

40 137

31 133

28 122

25 126

26 135

16 135

14 108

20 120

21 127

30 122

How well does age predict the range of motion after knee surgery?

30.6% of the variation in range of motion after knee surgery can be explained by the approximate linear regression of age and range of motion.

slide16

Interpretation of r2

r2% of the variation in y can be explained by the approximate linear regression of x & y.

slide17

Computer-generated regression analysis of knee surgery data:

Predictor Coef Stdev T P

Constant 107.58 11.12 9.67 0.000

Age 0.8710 0.4146 2.10 0.062

s = 10.42 R-sq = 30.6% R-sq(adj) = 23.7%

Be sure to convert r2 to decimal beforetaking the square root!

NEVER use adjusted r2!

What is the equation of the LSRL?

Find the slope & y-intercept.

What are the correlation coefficient and the coefficient of determination?

influential point
Influential point-
  • A point that influences where the LSRL is located
  • If removed, it will significantly change the slope of the LSRL
slide20

Racket Resonance Acceleration

(Hz) (m/sec/sec)

1 105 36.0

2 106 35.0

3 110 34.5

4 111 36.8

5 112 37.0

6 113 34.0

7 113 34.2

8 114 33.8

9 114 35.0

10 119 35.0

11 120 33.6

12 121 34.2

13 126 36.2

14 189 30.0

One factor in the development of tennis elbow is the impact-induced vibration of the racket and arm at ball contact.

Sketch a scatterplot of these data.

Calculate the LSRL & correlation coefficient.

Does there appear to be an influential point? If so, remove it and then calculate the new LSRL & correlation coefficient.

which of these measures are resistant
Which of these measures are resistant?
  • LSRL
  • Correlation coefficient
  • Coefficient of determination

NONE – all are affected by outliers

ad