By **gil**

## Chapter 11 (Continued)

### Linear Multiple Regression Model

### Linear Regression Model Specification

### Parameter Estimation: You gather the observations for all variables and estimate model parameters

### Model Specification Example

### Model Evaluation

### Variation Measures

### Testing Overall Significance of regression parameters

### Testing Overall Significance Example

EPI809/Spring 2008

Types of Regression Models

Learning Objectives:

This part focuses on the linear multiple regression model. After studying the materials in this section, you should be able to:

- Understand the general concepts behind the linear multiple regression model
- Fit a linear multiple regression model and interpret the computer output
- Perform model diagnosis: test the overall and partial significance of a multiple regression model, and perform residual analysis
- Describe linear regression pitfalls

Regression Modeling Steps

- Specify the model and estimate all unknown parameters
- Evaluate Model
- Use Model for Prediction & Estimation

Decide what you want to do and select the dependent variable

List all potential independent variables for your model

Linear Multiple Regression Model

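1. Relationship between 1 dependent & 2 or more independent variables is a linear function

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i$$

- $Y_i$: dependent (response) variable
- $X_{1i}, \ldots, X_{ki}$: independent (explanatory) variables
- $\beta_0$: population Y-intercept
- $\beta_1, \ldots, \beta_k$: population slopes
- $\varepsilon_i$: random error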

Linear Regression Assumptions

- Mean of Distribution of Error Is 0
- Distribution of Error Has Constant Variance
- Distribution of Error is Normal
- Errors Are Independent

Extremely Important

Interpretation of Estimated Coefficients

1. Slope ($\hat{\beta}_k$)

- Estimated average Y changes by $\hat{\beta}_k$ for each 1-unit increase in $X_k$, holding all other variables constant
- Example from textbook: if $\hat{\beta}_1 = 0.13$, then the systolic blood pressure (Y) is expected to increase by 0.13 for each 1-unit increase in birthweight ($X_1$), given fixed age ($X_2$)

2. Y-intercept ($\hat{\beta}_0$): predicted average value of Y when all $X_k$ are set to 0

Variance of Error estimate

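- Assuming the model is correctly specified…
- Best (unbiased) estimator of $\sigma^2$ is $s^2 = \dfrac{SSE}{n-k-1}$
- It is used in the formula for computing the standard errors of the coefficient estimates, $s_{\hat{\beta}_i}$
- Exact formula is too complicated to show
- But a higher value for $s$ leads to a higher $s_{\hat{\beta}_i}$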

Parameter Estimation Example

- You’re a Vet epidemiologist for the county cooperative. You gather the following data:

| Milk | Food | weight |
|------|------|--------|
| 1 | 1 | 2 |
| 4 | 8 | 8 |
| 1 | 3 | 1 |
| 3 | 5 | 7 |
| 2 | 6 | 4 |
| 4 | 10 | 6 |

- What is the linear relationship between cows’ food intake, weight and milk yield?

Dependent variable is milk yield (lb)

Independent variables for our model are Food intake (lb.) and weight (X100 lb.)

Sample SAS code for plotting the data

Data Cow; /*Reading data in SAS*/

input Milk Food weight@@;

cards;

1 1 2 4 8 8 1 3 1

3 5 7 2 6 4 4 10 6

;

run;

proc gplot data=Cow; /* Scatter plots of milk yield against each predictor */

plot milk*food milk*weight;

run;

Some plots

Sample SAS code for fitting a multiple linear regression

PROC REG data=Cow;

model milk = food weight;

run;
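
For readers without SAS, the least-squares fit requested by the regression step above can be sketched in plain Python by solving the normal equations $X'Xb = X'y$. This is only an illustration on the six cows from the slides, not a description of how SAS computes its estimates; the helper names `solve` and `fit_ols` are hypothetical and belong to no library.

```python
# Cow data from the slides: (milk yield lb, food intake lb, weight x100 lb)
DATA = [(1, 1, 2), (4, 8, 8), (1, 3, 1), (3, 5, 7), (2, 6, 4), (4, 10, 6)]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination (hypothetical helper)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry of this column to the diagonal
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(y, xs):
    """Return [b0, b1, ..., bk] that minimise the sum of squared errors."""
    rows = [[1.0] + list(x) for x in xs]  # leading 1 for the intercept
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


milk = [d[0] for d in DATA]
predictors = [(d[1], d[2]) for d in DATA]
b0, b1, b2 = fit_ols(milk, predictors)
print(f"Intercept={b0:.5f}  Food={b1:.5f}  weight={b2:.5f}")
```

The three printed values can be compared with the Parameter Estimate column of the SAS output.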

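Parameter Estimation SAS Output

Parameter Estimates

| Variable | DF | Parameter Estimate | Standard Error | t Value | Pr > \|t\| |
|----------|----|--------------------|----------------|---------|------------|
| Intercept | 1 | 0.06397 | 0.25986 | 0.25 | 0.8214 |
| Food | 1 | 0.20492 | 0.05882 | 3.48 | 0.0399 |
| weight | 1 | 0.28049 | 0.06860 | 4.09 | 0.0264 |

The Parameter Estimate column gives $\hat{\beta}_0$, $\hat{\beta}_1$ and $\hat{\beta}_2$; the Standard Error column gives the corresponding $s_{\hat{\beta}_p}$.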

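Parameter Estimation SAS Output

Analysis of Variance

| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
|--------|----|----------------|-------------|---------|--------|
| Model | 2 | 9.24974 | 4.62487 | 55.44 | 0.0043 |
| Error | 3 | 0.25026 | 0.08342 | | |
| Corrected Total | 5 | 9.50000 | | | |

| Statistic | Value | Statistic | Value |
|-----------|-------|-----------|-------|
| Root MSE | 0.28883 | R-Square | 0.9737 |
| Dependent Mean | 2.50000 | Adj R-Sq | 0.9561 |
| Coeff Var | 11.55309 | | |

Root MSE is $s$, the estimate of the standard deviation of the error term.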

Interpretation of Coefficients Solution

1. Slope ($\hat{\beta}_1$)

- Milk yield is expected to increase by 0.2049 lb. for each 1 lb. increase in food intake, holding weight constant

2. Slope ($\hat{\beta}_2$)

- Milk yield is expected to increase by 0.2805 lb. for each 1-unit (100 lb.) increase in weight, holding food intake constant
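
The "holding the other variable constant" reading of each slope can be checked numerically. The sketch below plugs the estimates from the SAS output into the fitted equation and changes one predictor at a time. The function name `predict_milk` and the starting point (food = 5 lb, weight = 700 lb, which is one of the observed cows) are choices made for illustration only.

```python
# Estimates copied from the SAS Parameter Estimates output in the slides
B0, B1, B2 = 0.06397, 0.20492, 0.28049


def predict_milk(food_lb, weight_100lb):
    """Predicted average milk yield (lb) from the fitted equation (hypothetical helper)."""
    return B0 + B1 * food_lb + B2 * weight_100lb


base = predict_milk(5, 7)
more_food = predict_milk(6, 7)  # +1 lb food intake, same weight
heavier = predict_milk(5, 8)    # +100 lb weight, same food intake

print(f"baseline prediction: {base:.4f}")
print(f"change for +1 lb food: {more_food - base:.5f}")
print(f"change for +100 lb weight: {heavier - base:.5f}")
```

Each printed change equals the corresponding slope, because the other predictor's contribution cancels in the difference.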

Evaluating Multiple Regression Models

1. Examine Variation Measures

2. Test Significance of Overall Model, portions of overall model and Individual Coefficients

3. Check conditions of a multiple linear regression model using Residuals

4. Assess multicollinearity among independent variables

Coefficient of Multiple Determination

- Proportion of Variation in Y ‘Explained’ by All X Variables Taken Together
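
As a worked illustration, the coefficient of multiple determination can be written in terms of the sums of squares reported in the SAS Analysis of Variance output for the Vet example. The notation $SS(\text{Model})$, $SSE$ and $SS(\text{Total})$ is assumed here to correspond to the Model, Error and Corrected Total rows.

```latex
R^2 \;=\; \frac{SS(\text{Model})}{SS(\text{Total})}
    \;=\; 1 - \frac{SSE}{SS(\text{Total})}
    \;=\; \frac{9.24974}{9.50000}
    \;\approx\; 0.9737
```

This agrees with the R-Square value printed by SAS.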

Check Your Understanding

- If you add a variable to the model, how will that affect the R-squared value for the model?

Adjusted $R^2$

- $R^2$ never decreases when a new X variable is added to the model (a disadvantage when comparing models)
- Solution: adjusted $R^2$
- Each additional variable reduces adjusted $R^2$, unless SSE goes down enough to compensate
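
The penalty for extra variables is visible in the usual formula for adjusted $R^2$, shown here with the Vet example values ($n = 6$, $k = 2$) taken from the SAS output. The symbol $R^2_{adj}$ is notation assumed for this sketch.

```latex
R^2_{adj} \;=\; 1 - \frac{SSE/(n-k-1)}{SS(\text{Total})/(n-1)}
          \;=\; 1 - \frac{0.25026/3}{9.50000/5}
          \;\approx\; 0.9561
```

Adding a variable raises $k$ and so shrinks the error degrees of freedom $n-k-1$; the ratio only improves if SSE falls by enough to offset that.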

Check Your Understanding

Using the Vet example: if you add a variable to the model, how will that affect R-squared and the estimate of the standard deviation of the error term?

Check Your Understanding: solution

- Model with food intake only: S = 0.64126, R-Square = 0.8269, Adj R-Sq = 0.7836
- Model with food intake and weight: S = 0.28883, R-Square = 0.9737, Adj R-Sq = 0.9561
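
The two sets of figures above can be reproduced from the raw data. The following Python sketch fits both models by least squares and computes S, R-Square and Adj R-Sq for each; it assumes the six observations from the slides, and the helper names `solve` and `fit_summary` are hypothetical.

```python
import math

# Cow data from the slides: (milk yield lb, food intake lb, weight x100 lb)
DATA = [(1, 1, 2), (4, 8, 8), (1, 3, 1), (3, 5, 7), (2, 6, 4), (4, 10, 6)]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_summary(y, xs):
    """Fit y on xs by least squares; return (s, R-square, adjusted R-square)."""
    rows = [[1.0] + list(x) for x in xs]
    p = len(rows[0])  # k + 1 estimated parameters
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, r)) for r in rows]
    n = len(y)
    mean_y = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    df_error = n - p  # n - k - 1
    s = math.sqrt(sse / df_error)
    r2 = 1 - sse / sst
    adj_r2 = 1 - (sse / df_error) / (sst / (n - 1))
    return s, r2, adj_r2


milk = [d[0] for d in DATA]
food_only = fit_summary(milk, [(d[1],) for d in DATA])
food_weight = fit_summary(milk, [(d[1], d[2]) for d in DATA])

for label, (s, r2, adj_r2) in [("food only", food_only), ("food + weight", food_weight)]:
    print(f"{label}: S={s:.5f}  R-Square={r2:.4f}  Adj R-Sq={adj_r2:.4f}")
```

In this example adding weight lowers S and raises both R-Square and Adj R-Sq, so the extra variable more than pays for the lost degree of freedom.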

Testing Overall Significance

- Tests if there is a linear relationship between all X variables together & Y
- Hypotheses
  - $H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$ (no linear relationship)
  - $H_a$: at least one coefficient is not 0 (at least one X variable linearly affects Y)
- Uses F test statistic
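
For reference, the F test statistic for overall significance can be written in terms of the mean squares, or equivalently in terms of $R^2$. The notation below assumes $k$ predictors and $n$ observations, so the statistic has $k$ and $n-k-1$ degrees of freedom.

```latex
F \;=\; \frac{MS(\text{Model})}{MS(\text{Error})}
  \;=\; \frac{SS(\text{Model})/k}{SSE/(n-k-1)}
  \;=\; \frac{R^2/k}{(1-R^2)/(n-k-1)}
```

The last form follows from dividing numerator and denominator by $SS(\text{Total})$.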

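Overall Significance Rejection Rule

- Reject $H_0$ in favor of $H_a$ if $f_{calc}$ falls in the colored area, that is, if $f_{calc} > F_{(k,\, n-k-1,\, 1-\alpha)}$
- Reject $H_0$ for $H_a$ if P-value $= P(F > f_{calc}) < \alpha$

[Figure: F distribution starting at 0, with the "Do Not Reject $H_0$" region to the left of the critical value $F_{(k,\, n-k-1,\, 1-\alpha)}$ and the colored "Reject $H_0$" region to its right]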

Testing Overall Significance Example

- Using the data from the Parameter Estimation Example (milk yield, food intake and weight for six cows): are cows’ food intake and weight both linearly related to cows’ milk yield? Test at the 5% significance level

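Model: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$, where $Y$ is milk yield, $X_1$ is food intake and $X_2$ is weight

Hypotheses

$H_0: \beta_1 = \beta_2 = 0$ (no linear relationship)

$H_a$: at least one coefficient is not 0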

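Testing Overall Significance SAS Computer Output

Analysis of Variance

| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
|--------|----|----------------|-------------|---------|--------|
| Model | 2 | 9.24974 | 4.62487 | 55.44 | 0.0043 |
| Error | 3 | 0.25026 | 0.08342 | | |
| Corrected Total | 5 | 9.50000 | | | |

- Model DF = $k$ = 2, Error DF = $n - k - 1$ = 3, Corrected Total DF = $n - 1$ = 5
- F Value = MS(Model) / MS(Error)
- Pr > F is the P-value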
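
The F value and P-value in this output can be recomputed by hand. The Python sketch below starts from the sums of squares reported by SAS and applies the rejection rule at the 5% level. It relies on one assumption worth flagging: the closed-form tail probability used here is valid only when the numerator degrees of freedom equal 2, which happens to be the case in this example.

```python
# Values copied from the SAS Analysis of Variance output in the slides
ss_model, ss_error = 9.24974, 0.25026
k, n = 2, 6          # number of predictors, number of cows
alpha = 0.05         # significance level stated in the example

ms_model = ss_model / k
ms_error = ss_error / (n - k - 1)
f_calc = ms_model / ms_error

# Special case: with numerator df = 2, the upper tail of the F distribution is
# P(F > f) = (1 + 2 f / d2) ** (-d2 / 2), where d2 is the error df.
# This shortcut does NOT apply for other numerator df.
d2 = n - k - 1
p_value = (1 + 2 * f_calc / d2) ** (-d2 / 2)

reject_h0 = p_value < alpha
print(f"F = {f_calc:.2f}, P-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Since the P-value is below 0.05, the rule leads to rejecting $H_0$: at least one of food intake and weight is linearly related to milk yield.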

Thinking Challenge

- k=18, n=20, R-squared=.95
- Would need an F-value >247.3 to reject the null hypothesis!
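
To see why a high R-squared is not enough here, the F statistic implied by these numbers can be computed from the identity $F = \dfrac{R^2/k}{(1-R^2)/(n-k-1)}$. The sketch below is an illustration only; the function name `f_from_r2` is hypothetical, and the critical value 247.3 is simply the figure quoted on the slide.

```python
def f_from_r2(r2, k, n):
    """Overall F statistic implied by R-squared with k predictors and n observations."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


f_calc = f_from_r2(0.95, k=18, n=20)
critical = 247.3  # critical value quoted on the slide

print(f"F = {f_calc:.4f}, error df = {20 - 18 - 1}")
print("reject H0" if f_calc > critical else "do not reject H0")
```

With 18 predictors and only 20 observations there is a single error degree of freedom, so the calculated F is close to 1 and falls far short of the critical value despite R-squared being 0.95.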

Thinking challenge

- F-test for model is significant
- Does the model have the best available predictors for y?
- Are all the terms in the model important for predicting y?
- Or what?
