## PowerPoint Slideshow about ' Multiple Linear Regression.' - robert-larson


Presentation Transcript

Multiple Linear Regression.

- Concept and uses.
- Model and assumptions.
- Intrinsically linear models.
- Model development and validation.
- Problem areas.
- Non-normality.
- Heterogeneous variance.
- Correlated errors.
- Influential points and outliers.
- Model inadequacies.
- Collinearity.
- Errors in X variables.

AGR206

Concept & Uses.

Did you know?

ANOVA

REGRESSION

- Description restricted to the data set: did biomass increase with pH in the sample?
- Prediction of Y: how much biomass do we expect to find under given soil conditions?
- Extrapolation to new conditions: can we predict biomass in other estuaries?
- Estimation and understanding: how much does biomass change per unit change in pH, controlling for other factors?
- Control of a process (requires causality): can we create sites with a certain biomass by changing the pH?

Body fat example in JMP.

- Three variables (X1, X2, X3) were measured to predict body fat % (Y) in people.
- Random sample of people.
- Y was measured by an expensive and very accurate method (assume it reveals true %fat).
- X1: thickness of triceps skinfold
- X2: thigh circumference
- X3: midarm circumference.
- Bodyfat.jmp

H0’s or values “of interest”

- Does thickness of triceps skinfold contribute significantly to predicting fat content?
- What is the CI for fat content for a person whose X’s have been measured?
- Do I have more or less fat than last summer?
- Do I have more fat than recommended?

Model and Assumptions.

$Y_i = b_0 + b_1 X_{i1} + \dots + b_p X_{ip} + e_i$

In matrix notation the model and solution are exactly the same as for SLR:

$Y = Xb + e$

$\hat{b} = (X'X)^{-1} X'Y$

All equations from SLR apply without change.

- Linear, additive model to relate Y to p independent variables.
- Note: here, p is number of variables, but some authors use p for number of parameters, which is one more than variables due to the intercept.

- In the model, the $e_i$ are normal and independent random variables with common variance $s^2$.
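The matrix solution can be sketched in a few lines of plain Python. This is only an illustration: the data are made up (Y is built exactly as 1 + 2 X1 + 3 X2 with no error term), the helper names `transpose`, `matmul`, `solve` and `ols` are hypothetical, and the normal equations (X'X) b = X'Y are solved by elimination rather than by forming the inverse explicitly.

```python
def transpose(m):
    """Return the transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two list-of-rows matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def solve(a, rhs):
    """Solve a @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in a[i]] + [float(rhs[i])] for i in range(n)]
    for col in range(n):
        # Bring the largest remaining entry of this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ols(x_rows, y):
    """Least-squares estimates from the normal equations (X'X) b = X'Y."""
    X = [[1.0] + list(row) for row in x_rows]   # leading column of 1s: intercept
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    XtY = [sum(x * yi for x, yi in zip(col, y)) for col in Xt]
    return solve(XtX, XtY)


# Made-up data (an assumption for this example): Y = 1 + 2*X1 + 3*X2 exactly.
x_rows = [(0, 1), (1, 0), (2, 2), (3, 1), (4, 3)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in x_rows]

b_hat = ols(x_rows, y)
print(b_hat)  # intercept first, then the coefficients of X1 and X2
```

Because the example has no error term, the estimates should recover the generating coefficients up to rounding; with real data the same calls return the least-squares fit.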

Linear models

- Linear, and intrinsically linear models.
- Linearity refers to the parameters. The model can involve any functions of the X’s, as long as those functions contain no parameters that have to be estimated.
- A linear model does not always produce a hyperplane.
- $Y_i = b_0 + b_1 f_1(X_{i1}) + \dots + b_p f_p(X_{ip}) + e_i$

- Polynomial regression is a special case where the functions are powers of X.
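A few concrete forms may help separate the three cases. The specific equations below are chosen for illustration and are not taken from the slides: a quadratic in one X is linear in the parameters, a multiplicative power model becomes linear after taking logs (intrinsically linear), and an exponential with an unknown rate inside the function is not linear.

```latex
\begin{align*}
Y_i &= b_0 + b_1 X_i + b_2 X_i^2 + e_i
  && \text{linear in } b_0, b_1, b_2 \text{ (polynomial regression)} \\
Y_i &= b_0 \, X_i^{b_1} \, e_i
  \;\Longrightarrow\;
  \ln Y_i = \ln b_0 + b_1 \ln X_i + \ln e_i
  && \text{intrinsically linear} \\
Y_i &= b_0 + b_1 \exp(b_2 X_i) + e_i
  && \text{not linear: } b_2 \text{ sits inside the function}
\end{align*}
```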

Matrix Equations

Extra Sum of Squares

- Effects of order of entry on SS.
- The 4 types of SS.
- Partial correlation.
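For two predictors, the standard textbook definitions behind these bullets can be written as follows. The notation SSR(·), SSE(·) and the partial coefficient symbol are the usual conventions, assumed here since the slide's own equations did not survive in the transcript.

```latex
\begin{align*}
SSR(X_2 \mid X_1) &= SSR(X_1, X_2) - SSR(X_1) = SSE(X_1) - SSE(X_1, X_2) \\
SSR(X_1, X_2) &= SSR(X_1) + SSR(X_2 \mid X_1) = SSR(X_2) + SSR(X_1 \mid X_2) \\
r^2_{Y2 \cdot 1} &= \frac{SSR(X_2 \mid X_1)}{SSE(X_1)}
\end{align*}
```

The second line shows why order of entry matters: unless X1 and X2 are orthogonal, SSR(X1) and SSR(X1 | X2) differ, so the sum of squares credited to a variable depends on what is already in the model.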

Response plane and error

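[Figure: response plane over X1 and X2 with vertical axis Y, showing an observation Yi and its expected value E{Yi} on the plane.]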

The response surface in more than 3D is a hyperplane.

Model development

- What variables to include.
- Depends on objective:
- Description: no need to reduce the number of variables.
- Prediction and estimation of Yhat: OK to reduce for economical use.
- Estimation of b and understanding: sensitive to deletions; may bias MSE and b. No real solution other than getting more data from a better experiment. (Sorry!)


Variable Selection

- Effects of elimination of variables:
- MSE is positively biased unless true b for variables eliminated is 0.
- bhat and Yhat are biased unless previous condition or variables eliminated are orthogonal to those retained.
- Variance of estimated parameters and predictions is usually lower.
- There are conditions under which the mean squared error of the reduced model (variance plus $\text{bias}^2$) is smaller than that of the full model.
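The trade-off in the last bullet rests on the usual decomposition of mean squared error. Here θ stands for any quantity being estimated (a coefficient or a predicted Y); the symbol is introduced for this note only.

```latex
\[
E\!\left[(\hat{\theta} - \theta)^2\right]
  = \operatorname{Var}(\hat{\theta})
  + \left[E(\hat{\theta}) - \theta\right]^2
\]
```

Dropping variables usually lowers the variance term and may raise the squared-bias term, so the reduced model wins whenever the drop in variance exceeds the bias it introduces.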

Criteria for variable selection

- $R^2$ - Coefficient of determination.
- $R^2 = SS_{Reg}/SS_{Total}$

- MSE or MSRes - Mean squared residuals.
- If all the X’s are in the model, it estimates $s^2$.

- $R^2_{adj}$ - Adjusted $R^2$.
- $R^2_{adj} = 1 - MSE/MS_{To} = 1 - [(n-1)/(n-p)]\,(SSE/SS_{To})$

- Mallows’ Cp
- $C_p = SS_{Res}/MSE_{Full} + 2p - n$ (in both formulas, p = number of parameters)
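As a quick check of these formulas, the sketch below plugs in the sums of squares from the Spartina ANOVA output that appears later in this transcript. The values n = 45 and p = 16 are inferred from that output (C Total DF + 1 and Model DF + 1), and the variable names are made up for the example.

```python
# Sums of squares copied from the Spartina ANOVA output in this transcript.
ss_model = 16369583.2
ss_error = 2801379.9
ss_total = 19170963.2
mse_full = 96599.307   # error mean square of the full model

n = 45   # observations: C Total DF (44) + 1
p = 16   # parameters: 15 predictors + intercept

# Coefficient of determination.
r2 = ss_model / ss_total

# Adjusted R2, penalising the number of parameters.
r2_adj = 1 - (n - 1) / (n - p) * (ss_error / ss_total)

# Mallows' Cp evaluated for the full model itself.
cp_full = ss_error / mse_full + 2 * p - n

print(f"R2     = {r2:.4f}")
print(f"R2adj  = {r2_adj:.4f}")
print(f"Cp     = {cp_full:.3f}")
```

The first two values should agree with the R-square and Adj R-sq lines of the SAS listing. For the full model Cp equals p by construction, so the criterion is only informative when comparing reduced models against it.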

Example

Checking assumptions.

- Note that although we have many X’s, errors are still in a single dimension.
- Residual analysis is performed as for SLR, sometimes repeated over different X’s.
- Normality. Use the NORMAL option of PROC UNIVARIATE. Transform.
- Homogeneity of variance. Plot residuals vs. each X. Transform. Weighted least squares.
- Independence of errors.
- Adequacy of model. Plot residuals. Lack-of-fit (LOF) test.

- Influence and outliers. Use influence option in proc reg.
- Collinearity. Use collinoint option of proc reg.

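code for PROC REG

/* Add a variable that is nearly a linear combination of ph, acid and sal, */
/* so that the collinearity diagnostics have something to detect.          */
data s00.spart2;
  set s00.spartina;
  colin = 2*ph + 0.5*acid + sal + rannor(23);  /* standard normal noise, seed 23 */
run;

/* Full model with residual, influence and collinearity diagnostics. */
proc reg data=s00.spart2;
  model bmss = colin h2s sal eh7 ph acid
               p k ca mg na mn zn cu nh4 /
        r influence vif collinoint stb partial;
run;
  /* PROC REG is interactive, so a second model can follow the first RUN. */
  model colin = ph sal acid;
run;
quit;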

Spartina ANOVA output

Model: MODEL1

Dependent Variable: BMSS

Analysis of Variance

| Source | DF | Sum of Squares | Mean Square | F Value | Prob>F |
|---|---|---|---|---|---|
| Model | 15 | 16369583.2 | 1091305.552 | 11.297 | 0.0001 |
| Error | 29 | 2801379.9 | 96599.307 | | |
| C Total | 44 | 19170963.2 | | | |

Root MSE 310.80429 R-square 0.8539

Dep Mean 1000.80000 Adj R-sq 0.7783

C.V. 31.05558
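The summary lines of this listing follow directly from the ANOVA table. The sketch below recomputes them from the printed mean squares and the dependent mean; the variable names are made up for the example, and the formulas are the usual definitions (F as the ratio of mean squares, Root MSE as the square root of the error mean square, C.V. as Root MSE relative to the mean, in percent).

```python
import math

# Values copied from the ANOVA output above.
ms_model = 1091305.552
ms_error = 96599.307
dep_mean = 1000.8

f_value = ms_model / ms_error          # overall F test for the model
root_mse = math.sqrt(ms_error)         # residual standard deviation
cv = 100 * root_mse / dep_mean         # coefficient of variation, in percent

print(f"F        = {f_value:.3f}")
print(f"Root MSE = {root_mse:.5f}")
print(f"C.V.     = {cv:.5f}")
```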

Parameters and VIF

Parameter Estimates

| Variable | DF | Parameter Estimate | Standard Error | T for H0: Parameter=0 | Prob > \|T\| | Standardized Estimate | Variance Inflation |
|---|---|---|---|---|---|---|---|
| INTERCEP | 1 | 3809.233562 | 3038.081 | 1.254 | 0.2199 | 0.00000000 | 0.00000000 |
| COLIN | 1 | -178.317065 | 58.718 | -3.037 | 0.0050 | -1.06227792 | 24.28364757 |
| H2S | 1 | 0.336242 | 2.656 | 0.127 | 0.9001 | 0.01563626 | 3.02785626 |
| SAL | 1 | 150.513276 | 61.960 | 2.429 | 0.0216 | 0.84818417 | 24.19556405 |
| EH7 | 1 | 2.288694 | 1.785 | 1.282 | 0.2099 | 0.12813770 | 1.98216733 |
| PH | 1 | 486.417077 | 306.756 | 1.586 | 0.1237 | 0.91891994 | 66.64921013 |
| ACID | 1 | -24.816449 | 109.856 | -0.226 | 0.8229 | -0.09422943 | 34.53131689 |
| P | 1 | 0.153015 | 2.417 | 0.063 | 0.9500 | 0.00639498 | 2.02507775 |
| K | 1 | -0.733250 | 0.439 | -1.668 | 0.1061 | -0.33059243 | 7.79660017 |
| CA | 1 | -0.137163 | 0.111 | -1.230 | 0.2286 | -0.35706572 | 16.72702792 |
| MG | 1 | -0.318586 | 0.243 | -1.308 | 0.2010 | -0.45340287 | 23.82835726 |
| NA | 1 | -0.005294 | 0.022 | -0.239 | 0.8127 | -0.05520175 | 10.57323219 |
| MN | 1 | -4.279887 | 4.836 | -0.885 | 0.3835 | -0.15872971 | 6.38589662 |
| ZN | 1 | -26.270852 | 19.452 | -1.351 | 0.1873 | -0.32953283 | 11.81574077 |
| CU | 1 | 346.606818 | 99.295 | 3.491 | 0.0016 | 0.54452366 | 4.82931410 |
| NH4 | 1 | 0.539373 | 3.061 | 0.176 | 0.8614 | 0.03862822 | 9.53842459 |
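One way to read the variance inflation column is to convert each VIF back into the R2 obtained when that predictor is regressed on all the others, using the standard relation VIF = 1/(1 - R2). The sketch below does this for the values printed above. The cutoff of 10 is a common rule of thumb rather than something stated in the slides, and the names `vif`, `r2_on_others` and `flagged` are made up for the example.

```python
# Variance inflation factors copied from the parameter table above.
vif = {
    "COLIN": 24.28364757, "H2S": 3.02785626, "SAL": 24.19556405,
    "EH7": 1.98216733, "PH": 66.64921013, "ACID": 34.53131689,
    "P": 2.02507775, "K": 7.79660017, "CA": 16.72702792,
    "MG": 23.82835726, "NA": 10.57323219, "MN": 6.38589662,
    "ZN": 11.81574077, "CU": 4.82931410, "NH4": 9.53842459,
}

# VIF_j = 1 / (1 - R2_j), so R2_j = 1 - 1 / VIF_j, where R2_j comes from
# regressing predictor j on all the other predictors.
r2_on_others = {name: 1 - 1 / value for name, value in vif.items()}

# Rule-of-thumb cutoff (an assumption, not from the slides).
CUTOFF = 10
flagged = sorted(name for name, value in vif.items() if value > CUTOFF)

for name in flagged:
    print(f"{name:6s} VIF = {vif[name]:6.2f}  R2 on other X's = {r2_on_others[name]:.3f}")
```

COLIN, PH, SAL and ACID all appear in the flagged list, which is expected because COLIN was constructed in the DATA step as a noisy linear combination of the other three.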
