Simple Linear Regression


PowerPoint Slideshow about 'Simple Linear Regression' - teneil


Presentation Transcript
slide2

Data available: (X, Y)

Goal: To predict the response Y (i.e., to obtain the fitted response function f(X)).

How to determine this regression function? (We need to estimate the parameters.)

Least Squares Fitting Method

Least Squares Regression Function:

Least Squares Estimates
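The formulas on this slide did not survive in the transcript. As a minimal sketch, assuming the standard closed-form least squares estimates for a straight line (slope = Sxy / Sxx, intercept = mean(y) - slope * mean(x)), the fit can be reproduced with the midterm/final grade data listed later in this transcript. The function name `least_squares` is a hypothetical helper and is not part of the slides.

```python
# Grade data from the transcript (MIDTERM = x, FINAL = y).
midterm = [68, 49, 60, 68, 97, 82, 59, 50, 73, 39,
           71, 95, 61, 72, 87, 40, 66, 58, 58, 77]
final = [75, 63, 57, 88, 88, 79, 82, 73, 90, 62,
         70, 96, 76, 75, 85, 40, 74, 70, 75, 72]


def least_squares(x, y):
    """Return (b0, b1) minimizing sum((y_i - b0 - b1 * x_i) ** 2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Corrected sum of squares of x and corrected cross-product of x and y.
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx           # slope estimate
    b0 = y_bar - b1 * x_bar    # intercept estimate
    return b0, b1


b0, b1 = least_squares(midterm, final)
print(f"fitted function: FINAL = {b0:.5f} + {b1:.5f} * MIDTERM")
```

The two estimates should agree with the Intercept and MIDTERM rows of the PROC REG parameter table in the transcript.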

slide7

Terminology

Fitted model

True model

Fitted regression function

slide13

REGRESSION ON MIDTERM GRADE

Obs  MIDTERM  FINAL
  1       68     75
  2       49     63
  3       60     57
  4       68     88
  5       97     88
  6       82     79
  7       59     82
  8       50     73
  9       73     90
 10       39     62
 11       71     70
 12       95     96
 13       61     76
 14       72     75
 15       87     85
 16       40     40
 17       66     74
 18       58     70
 19       58     75
 20       77     72

Figure 1.4 SAS PROC PRINT output for the grade data problem.

slide14

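TITLE 'REGRESSION ON MIDTERM GRADE';
/* Read the 20 (MIDTERM, FINAL) pairs. */
DATA;
INPUT MIDTERM FINAL;
CARDS;
68 75
49 63
60 57
. .
77 72
;
PROC PRINT;
/* Fit FINAL on MIDTERM; keep the predicted values and residuals. */
PROC REG;
MODEL FINAL=MIDTERM / P;
OUTPUT PREDICTED=PRED RESIDUAL=RESID;
/* Overlay observed ('O') and predicted ('P') values against MIDTERM. */
PROC PLOT;
PLOT FINAL*MIDTERM='O' PRED*MIDTERM='P' / OVERLAY;
LABEL FINAL='FINAL';
/* Van der Waerden normal scores of the residuals. */
PROC RANK NORMAL=VW;
VAR RESID;
RANKS NSCORE;
/* Normal probability plot of the residuals. */
PROC PLOT;
PLOT RESID*NSCORE='R';
LABEL NSCORE='NORMAL SCORE';
RUN;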

slide15

REGRESSION ON MIDTERM GRADE

Model: MODEL1
Dependent Variable: FINAL

Analysis of Variance

                           Sum of         Mean
Source            DF      Squares       Square   F Value   Pr > F
Model              1   1774.44117   1774.44117     24.26   0.0001
Error             18   1316.55883     73.14216
Corrected Total   19   3091.00000

Root MSE          8.55232   R-Square   0.5741
Dependent Mean   74.50000   Adj R-Sq   0.5504
Coeff Var        11.47962

Parameter Estimates

                 Parameter    Standard
Variable    DF    Estimate       Error   t Value   Pr > |t|
Intercept    1    34.56757     8.32984      4.15     0.0006
MIDTERM      1     0.60049     0.12192      4.93     0.0001
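To see where the entries of the output above come from, the following sketch recomputes the analysis-of-variance quantities and the standard errors from the grade data. It assumes the usual simple-regression formulas (MSE = SSE / (n - 2), F = MSR / MSE, SE(b1) = sqrt(MSE / Sxx)); the variable names are illustrative and do not come from the slides.

```python
import math

# Grade data from the transcript (MIDTERM = x, FINAL = y).
midterm = [68, 49, 60, 68, 97, 82, 59, 50, 73, 39,
           71, 95, 61, 72, 87, 40, 66, 58, 58, 77]
final = [75, 63, 57, 88, 88, 79, 82, 73, 90, 62,
         70, 96, 76, 75, 85, 40, 74, 70, 75, 72]

n = len(midterm)
x_bar = sum(midterm) / n
y_bar = sum(final) / n
s_xx = sum((x - x_bar) ** 2 for x in midterm)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(midterm, final))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Sums of squares: corrected total, error, and model.
fitted = [b0 + b1 * x for x in midterm]
ss_total = sum((y - y_bar) ** 2 for y in final)
ss_error = sum((y - f) ** 2 for y, f in zip(final, fitted))
ss_model = ss_total - ss_error

# Mean squares and the F statistic (1 and n - 2 degrees of freedom).
ms_model = ss_model / 1
ms_error = ss_error / (n - 2)
f_value = ms_model / ms_error
root_mse = math.sqrt(ms_error)

# Standard errors and t statistics of the two parameter estimates.
se_b1 = math.sqrt(ms_error / s_xx)
se_b0 = math.sqrt(ms_error * (1 / n + x_bar ** 2 / s_xx))
t_b0 = b0 / se_b0
t_b1 = b1 / se_b1

print(f"SS model={ss_model:.5f}  SS error={ss_error:.5f}  F={f_value:.2f}")
print(f"Root MSE={root_mse:.5f}  SE(b0)={se_b0:.5f}  SE(b1)={se_b1:.5f}")
```

The p-values are left out because they need the F and t distribution functions, which are not in the Python standard library.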

slide16

       Dep Var   Predicted
Obs      FINAL       Value    Residual
  1    75.0000     75.4007     -0.4007
  2    63.0000     63.9915     -0.9915
  3    57.0000     70.5968    -13.5968
  4    88.0000     75.4007     12.5993
  5    88.0000     92.8149     -4.8149
  6    79.0000     83.8076     -4.8076
  7    82.0000     69.9963     12.0037
  8    73.0000     64.5920      8.4080
  9    90.0000     78.4032     11.5968
 10    62.0000     57.9866      4.0134
 11    70.0000     77.2022     -7.2022
 12    96.0000     91.6139      4.3861
 13    76.0000     71.1973      4.8027
 14    75.0000     77.8027     -2.8027
 15    85.0000     86.8100     -1.8100
 16    40.0000     58.5871    -18.5871
 17    74.0000     74.1998     -0.1998
 18    70.0000     69.3959      0.6041
 19    75.0000     69.3959      5.6041
 20    72.0000     80.8051     -8.8051

Sum of Residuals                          0
Sum of Squared Residuals         1316.55883
Predicted Residual SS (PRESS)    1668.47241
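The three summary lines above can be checked with a short sketch. It assumes the standard leave-one-out identity for PRESS, in which each residual is divided by 1 - h_i and the leverage of a simple regression is h_i = 1/n + (x_i - mean(x))^2 / Sxx. The slides do not show this formula, so treat it as an assumption about how the figure was obtained.

```python
# Grade data from the transcript (MIDTERM = x, FINAL = y).
midterm = [68, 49, 60, 68, 97, 82, 59, 50, 73, 39,
           71, 95, 61, 72, 87, 40, 66, 58, 58, 77]
final = [75, 63, 57, 88, 88, 79, 82, 73, 90, 62,
         70, 96, 76, 75, 85, 40, 74, 70, 75, 72]

n = len(midterm)
x_bar = sum(midterm) / n
y_bar = sum(final) / n
s_xx = sum((x - x_bar) ** 2 for x in midterm)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(midterm, final))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Predicted values and ordinary residuals, as in the table.
predicted = [b0 + b1 * x for x in midterm]
residuals = [y - p for y, p in zip(final, predicted)]

# Leverage of each observation in a simple linear regression.
leverage = [1 / n + (x - x_bar) ** 2 / s_xx for x in midterm]

sse = sum(e ** 2 for e in residuals)
# PRESS: sum of squared leave-one-out prediction errors e_i / (1 - h_i).
press = sum((e / (1 - h)) ** 2 for e, h in zip(residuals, leverage))

print(f"sum of residuals = {sum(residuals):.6f}")
print(f"SSE = {sse:.5f}, PRESS = {press:.5f}")
```

PRESS is always at least as large as the ordinary sum of squared residuals, because each residual is inflated by its own leverage.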

slide17

[Plot omitted: FINAL (vertical axis, 40 to 100) versus MIDTERM (horizontal axis, 30 to 100), with observed values plotted as 'o' and predicted values as 'p'. NOTE: 6 obs hidden.]

Figure 1.6 Output for the first PROC PLOT step for the grade data problem.

slide18

[Plot omitted: Residual (vertical axis, -20 to 20, reference line at 0) versus Predicted Value of FINAL (horizontal axis, 55 to 95), points plotted as 'R'.]

Figure 1.7 The remainder of the output from the first PROC PLOT step.

slide19

[Plot omitted: Residual (vertical axis, -20 to 20) versus NORMAL SCORE (horizontal axis, -2.0 to 2.0), points plotted as 'R'.]
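The horizontal axis of this plot holds the normal scores produced by PROC RANK NORMAL=VW. A minimal sketch of the van der Waerden score, the standard normal quantile of rank / (n + 1), is given here. Giving tied values their average rank is an assumption about tie handling, and the three residuals in the example are hypothetical, not taken from the grade data.

```python
from statistics import NormalDist


def normal_scores_vw(values):
    """Van der Waerden normal scores: inverse normal CDF of rank / (n + 1)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        # Average of the 1-based ranks i + 1 .. j + 1 (assumed tie rule).
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf(r / (n + 1)) for r in ranks]


# Hypothetical residuals, for illustration only.
example_residuals = [3.0, -1.5, 0.2]
scores = normal_scores_vw(example_residuals)
print(scores)
```

Plotting the residuals against these scores gives the normal probability plot: points close to a straight line are consistent with normally distributed errors.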

slide21

*Pearson’s Correlation Coefficient

Goal: To measure the degree of linear correlation between two variables.

The range lies between -1 and 1.
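The defining formula is missing from the transcript. A minimal sketch, assuming the usual definition r = Sxy / sqrt(Sxx * Syy), applied to the grade data from earlier in the deck, is shown here. The function name `pearson_r` is a hypothetical helper.

```python
import math

# Grade data from the transcript (MIDTERM = x, FINAL = y).
midterm = [68, 49, 60, 68, 97, 82, 59, 50, 73, 39,
           71, 95, 61, 72, 87, 40, 66, 58, 58, 77]
final = [75, 63, 57, 88, 88, 79, 82, 73, 90, 62,
         70, 96, 76, 75, 85, 40, 74, 70, 75, 72]


def pearson_r(x, y):
    """Pearson correlation: Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(midterm, final)
print(f"r = {r:.4f}, r squared = {r * r:.4f}")
```

For a simple linear regression the square of r equals the R-Square value reported by PROC REG.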

slide22

*Coefficient of Determination: the fraction of the variance in y that is explained by regression on x.

Goal: May be used as an index of linearity for the relation of y to x.

Definition:
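The definition itself was an image that did not survive. Assuming the usual form, R-Square = 1 - SSE / SST, the value can be recovered from the two sums of squares printed in the PROC REG output for the grade data. The adjusted version shown here is the standard degrees-of-freedom correction; it is included as an assumption about how the Adj R-Sq entry is computed.

```python
# Sums of squares copied from the PROC REG output for the grade data.
ss_error = 1316.55883   # Error sum of squares (18 degrees of freedom)
ss_total = 3091.00000   # Corrected total sum of squares (19 degrees of freedom)
n = 20                  # number of observations
p = 1                   # number of predictors (MIDTERM)

# Fraction of the variance in y explained by the regression on x.
r_square = 1 - ss_error / ss_total

# Adjusted for degrees of freedom (assumed standard correction).
adj_r_square = 1 - (1 - r_square) * (n - 1) / (n - p - 1)

print(f"R-Square = {r_square:.4f}, Adj R-Sq = {adj_r_square:.4f}")
```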

slide23

[Plot omitted: PRESSUR (vertical axis, 20 to 120) versus VOLUME (horizontal axis, 10 to 50), points plotted as 'o'.]

Figure 3.3: A plot of the air pressure data (an example of residual analysis).

slide24

[Plot omitted: Residual (vertical axis, -10 to 30, reference line at 0) versus Predicted Value of P (horizontal axis, 16.357 to 94.210), points plotted as '*'.]

Figure 3.4 The residual on fit plot after fitting the model P = a + bV + e to the air pressure data.

slide25

[Plot omitted: Residual (vertical axis, -0.75 to 0.50, reference line at 0) versus Predicted Value of P (horizontal axis, 20 to 120), points plotted as '*'.]

Figure 3.5 The residual on fit plot using the model P = a + b/V + e for the air pressure data.
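The model P = a + b/V + e is still linear in its parameters, so it can be fitted by ordinary least squares after replacing V with 1/V. The sketch below illustrates this with hypothetical, noise-free data generated from P = 5 + 1200/V. These numbers are not the air pressure data from the slides (most of that data set is elided in the transcript), and `fit_line` is an illustrative helper.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx
    return y_bar - b * x_bar, b


# Hypothetical volumes and pressures, generated exactly from P = 5 + 1200 / V.
volume = [12.0, 16.0, 24.0, 48.0]
pressure = [5.0 + 1200.0 / v for v in volume]

# Transform the predictor, then fit a straight line in the new variable.
inv_volume = [1.0 / v for v in volume]
a, b = fit_line(inv_volume, pressure)
print(f"P = {a:.3f} + {b:.3f} / V")
```

With real data the residuals from this fit would then be plotted against the fitted values, as in the figure above, to check whether the curvature has been removed.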

slide26

Weighted Regression

Problem: (unequal variance)

Model:

Claim: minimize

Ordinary Regression

Model:

Claim: minimize

slide27

How to determine the weights?

So the optimal weights are inversely proportional to the variances of the y.
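As a minimal sketch of that rule, the function below minimizes the weighted sum of squares, sum of w_i * (y_i - a - b * x_i)^2, using the closed-form solution built on weighted means, with weights set to 1 / variance. The data and the variances are hypothetical, chosen only for illustration, and `weighted_least_squares` is not a name from the slides.

```python
def weighted_least_squares(x, y, w):
    """Return (a, b) minimizing sum(w_i * (y_i - a - b * x_i) ** 2)."""
    w_sum = sum(w)
    # Weighted means of x and y.
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / w_sum
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / w_sum
    # Weighted corrected sum of squares and cross-products.
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar)
               for wi, xi, yi in zip(w, x, y))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b


# Hypothetical data whose error variance grows with x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.5]
variances = [0.1, 0.1, 0.4, 0.4, 1.6]      # assumed known for illustration
weights = [1.0 / v for v in variances]     # inversely proportional to variance

a_w, b_w = weighted_least_squares(x, y, weights)
a_o, b_o = weighted_least_squares(x, y, [1.0] * len(x))  # equal weights = OLS
print(f"weighted: y = {a_w:.4f} + {b_w:.4f} x")
print(f"ordinary: y = {a_o:.4f} + {b_o:.4f} x")
```

Setting every weight to the same value reproduces ordinary regression, which is why the two fits differ only when the variances are unequal.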

slide28

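/* Read volume V and pressure P; VI = 1/V is the regressor. */
DATA;
INPUT V P;
VI=1/V;
CARDS;
48 29.1
.
.
.
12 117.6
;
/* Unweighted fit, used only to obtain fitted values for the weights. */
PROC REG;
MODEL P=VI;
OUTPUT P=LSFIT;
DATA;
SET;
W=1/LSFIT;
/* Weighted fit using the weights W. */
PROC REG;
MODEL P=VI;
WEIGHT W;
OUTPUT P=FIT R=RES;
/* Weighted residuals. */
DATA;
SET;
WRES=SQRT(W)*RES;
PROC RANK NORMAL=VW;
VAR WRES;
RANKS NSCORE;
PROC PLOT;
PLOT WRES*FIT='*' / VREF=0 VPOS=30;
PLOT WRES*NSCORE='*' / VPOS=30;
LABEL WRES='WEIGHTED RESIDUAL' NSCORE='NORMAL SCORE';
RUN;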

slide29

[Plot omitted: WEIGHTED RESIDUAL (vertical axis, -0.075 to 0.050, reference line at 0) versus Predicted Value of P (horizontal axis, 20 to 120), points plotted as '*'.]

Figure 3.13 Weighted residual plot for a weighted fit of the model P = a + b/V + e to the air pressure data.

slide30

[Plot omitted: Residual (vertical axis, -0.0003 to 0.0002, reference line at 0) versus Predicted Value of PT (horizontal axis, -0.034 to -0.009), points plotted as '*'.]

Figure 3.17 Residual on fit plot for the model –1/P = α + βV + e for the air pressure data.

slide31

[Plot omitted: Residual (vertical axis, -0.0003 to 0.0002) versus NORMAL SCORE (horizontal axis, -2 to 2), points plotted as '*'.]

Figure 3.18 Residual normal probability plot for the model –1/P = α + βV + e for the air pressure data.

slide32

[Plot omitted: Residual (vertical axis, -0.00015 to 0.0001, reference line at 0) versus Predicted Value of PT (horizontal axis, -0.033 to -0.007), points plotted as '*'.]

Figure 3.19 Residual on fit plot for the model –1/P = α + βV + e in Example 3.4 after deleting the first data point.

slide33

[Plot omitted: Residual (vertical axis, -0.00015 to 0.0001) versus NORMAL SCORE (horizontal axis, -2 to 2), points plotted as '*'.]

Figure 3.20 Residual normal probability plot for the model –1/P = α + βV + e in Example 3.4 after deleting the first data point.

slide34

How to determine the weights of transformation T

such that

(assuming T is monotonically increasing)