
# Lecture 6 Notes



### Lecture 6 Notes

• Note: I will e-mail homework 2 tonight. It will be due next Thursday.

• The Multiple Linear Regression model (Chapter 4.1)

• Inferences from multiple regression analysis (Chapter 4.2)

• In multiple regression analysis, we consider more than one independent variable $x_1,\dots,x_K$. We are interested in the conditional mean of $y$ given $x_1,\dots,x_K$.

### Automobile Example

• A team charged with designing a new automobile is concerned about the gas mileage that can be achieved. The design team is interested in two things:

(1) Which characteristics of the design are likely to affect mileage?

(2) A new car is planned to have the following characteristics: weight – 4000 lbs, horsepower – 200, cargo – 18 cubic feet, seating – 5 adults. Predict the new car’s gas mileage.

• The team has available information about gallons per 1000 miles and four design characteristics (weight, horsepower, cargo, seating) for a sample of cars made in 1989. Data is in car89.JMP.

### Best Single Predictor

• To obtain the correlation matrix and pairwise scatterplots, click Analyze, Multivariate Methods, Multivariate.

• If we use simple linear regression with each of the four independent variables, which provides the best predictions?

### Best Single Predictor

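• Answer: The simple linear regression with the highest $R^2$ gives the best predictions, because recall that

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

The denominator (the total sum of squares of $y$) is the same for every candidate predictor, so the highest $R^2$ corresponds to the smallest sum of squared prediction errors.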

• Weight gives the best predictions of GP1000MHWY based on simple linear regression.

• But we can obtain better predictions by using more than one of the independent variables.
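The comparison of single predictors by $R^2$ can be sketched in a few lines of Python. This is only an illustration: the numbers below are invented toy values (not the car89.JMP data), and the helper name `r_squared` is hypothetical.

```python
def r_squared(x, y):
    """R^2 = 1 - SSE/SST for the least squares line of y on a single x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx                 # least squares slope
    b0 = y_bar - b1 * x_bar        # least squares intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - sse / sst


# Toy values invented for illustration; NOT taken from car89.JMP.
toy_predictors = {
    "weight": [2.0, 2.5, 3.0, 3.5, 4.0],
    "horsepower": [90, 150, 110, 200, 140],
}
toy_gpm = [30.0, 36.0, 41.0, 47.0, 51.0]

r2 = {name: r_squared(x, toy_gpm) for name, x in toy_predictors.items()}
best = max(r2, key=r2.get)  # predictor with the highest R^2
print(r2, best)
```

Because the total sum of squares is shared by all candidates, ranking by $R^2$ is the same as ranking by the smallest sum of squared prediction errors.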

### Multiple Linear Regression Model

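• Model: $y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_K x_{iK} + e_i$, where $e_i$ is the disturbance for observation $i$.

• The expected value of the disturbances is zero for each $i$, i.e., $E(e_i) = 0$.

• The variance of each $e_i$ is equal to $\sigma_e^2$, i.e., $\mathrm{Var}(e_i) = \sigma_e^2$.

• The $e_i$ are normally distributed.

• The $e_i$ are independent.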

### Point Estimates for Multiple Linear Regression Model

• We use the same least squares procedure as for simple linear regression.

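• Our estimates $b_0, b_1, \dots, b_K$ of $\beta_0, \beta_1, \dots, \beta_K$ are the coefficients that minimize the sum of squared prediction errors:

$$\sum_{i=1}^{n} \left( y_i - b_0 - b_1 x_{i1} - \cdots - b_K x_{iK} \right)^2$$

• Least squares in JMP: click Analyze, Fit Model, put the dependent variable into Y, and add the independent variables to the Construct Model Effects box.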
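For readers without JMP, the least squares step can be sketched in plain Python by solving the normal equations. This is a minimal sketch, not what JMP does internally: the function names `fit_least_squares` and `predict` are hypothetical, and the data are toy values built to follow $y = 1 + 2x_1 + 3x_2$ exactly.

```python
def fit_least_squares(rows, y):
    """Return [b0, b1, ..., bK] minimizing the sum of squared prediction errors.

    rows is a list of [x_i1, ..., x_iK]; an intercept column of 1s is added here.
    The normal equations (A^T A) b = A^T y are solved by Gauss-Jordan elimination.
    """
    design = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(design[0])
    # Augmented matrix [A^T A | A^T y].
    aug = []
    for r in range(p):
        lhs = [sum(d[r] * d[c] for d in design) for c in range(p)]
        rhs = sum(d[r] * yi for d, yi in zip(design, y))
        aug.append(lhs + [rhs])
    for col in range(p):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[r][p] / aug[r][r] for r in range(p)]


def predict(b, x_new):
    """Predicted mean of y at x_new = [x_1, ..., x_K]."""
    return b[0] + sum(bk * xk for bk, xk in zip(b[1:], x_new))


# Toy data (invented): y = 1 + 2*x1 + 3*x2 with no noise.
toy_rows = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]]
toy_y = [9, 8, 19, 18, 29]

b = fit_least_squares(toy_rows, toy_y)
y_new = predict(b, [6, 5])
print(b, y_new)
```

The same `predict` step is how a fitted model would be applied to a planned design such as the new car, once coefficients have been estimated from real data.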

### Root Mean Square Error

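• Estimate of $\sigma_e$:

$$s_e = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n-(K+1)}}$$

• $s_e$ = Root Mean Square Error in JMP

• For the simple linear regression of GP1000MHWY on weight, $s_e$ = [value not shown]. For the multiple linear regression of GP1000MHWY on weight, horsepower, cargo, and seating, $s_e = 3.54$.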
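As a small worked example, the root mean square error can be computed from observed and predicted values by dividing the sum of squared prediction errors by $n-(K+1)$ and taking the square root. The function name and the numbers are hypothetical, chosen so every residual is $\pm 1$.

```python
import math


def root_mean_square_error(y, y_hat, K):
    """s_e = sqrt(SSE / (n - (K + 1))) for a regression with K independent variables."""
    n = len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return math.sqrt(sse / (n - (K + 1)))


# Invented values for illustration: six observations, each residual is +1 or -1.
toy_y = [10, 12, 15, 11, 14, 18]
toy_y_hat = [11, 11, 14, 12, 15, 17]

rmse_k1 = root_mean_square_error(toy_y, toy_y_hat, K=1)  # SSE = 6, df = 4
rmse_k3 = root_mean_square_error(toy_y, toy_y_hat, K=3)  # SSE = 6, df = 2
print(rmse_k1, rmse_k3)
```

With the same residuals, a model with more independent variables has fewer degrees of freedom, so the RMSE only falls when the extra variables reduce the sum of squared errors enough to offset that.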

### Residuals and Root Mean Square Errors

• Residual for observation $i$ = prediction error for observation $i$ = $y_i - \hat{y}_i$, where $\hat{y}_i$ is the predicted value for observation $i$.

• Root mean square error = Typical size of absolute value of prediction error

• As with the simple linear regression model, if the multiple linear regression model holds:

• About 95% of the observations will be within two RMSEs of their predicted value.

• For the car data, about 95% of the time the actual GP1000M will be within $2 \times 3.54 = 7.08$ GP1000M of the GP1000M predicted from the car's weight, horsepower, cargo and seating.
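The "two RMSEs" rule can be checked with a quick simulation. This sketch assumes the prediction errors are exactly normal with standard deviation equal to the RMSE of 3.54 quoted above, and it ignores the uncertainty in the estimated coefficients. The function name is hypothetical.

```python
import random

RMSE = 3.54  # root mean square error quoted in the notes for the car data


def fraction_within_two_rmse(n_draws, rmse, seed=0):
    """Share of simulated normal prediction errors with |error| <= 2 * rmse."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(abs(rng.gauss(0.0, rmse)) <= 2 * rmse for _ in range(n_draws))
    return hits / n_draws


half_width = 2 * RMSE                           # 7.08 gallons per 1000 miles
frac = fraction_within_two_rmse(100_000, RMSE)  # should be close to 0.95
print(half_width, frac)
```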

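### Inferences from Multiple Regression Analysis

• Confidence intervals: a $100(1-\alpha)\%$ confidence interval for $\beta_k$ is

$$b_k \pm t_{\alpha/2,\,n-(K+1)}\, s_{b_k}$$

Degrees of freedom for $t$ equal $n-(K+1)$. The standard error of $b_k$, $s_{b_k}$, is found in the JMP output.

• Hypothesis test: $H_0: \beta_k = \beta_k^*$ vs. $H_a: \beta_k \neq \beta_k^*$

Decision rule for the test: Reject $H_0$ if $t > t_{\alpha/2,\,n-(K+1)}$ or $t < -t_{\alpha/2,\,n-(K+1)}$,

where $t = \dfrac{b_k - \beta_k^*}{s_{b_k}}$.

The p-value for testing $H_0: \beta_k = 0$ vs. $H_a: \beta_k \neq 0$ is printed in the JMP output under Prob>|t|.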
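The interval and the two-sided t test for a single coefficient reduce to a few arithmetic steps once the estimate and its standard error are read off the regression output. The sketch below uses hypothetical values (an estimate of 0.5 with standard error 0.2 from $n = 25$ observations and $K = 4$ independent variables); the critical value 2.086 is the tabled $t_{0.025}$ for 20 degrees of freedom. The function name is an assumption for illustration.

```python
def coefficient_inference(b_k, se_b_k, t_crit, beta_null=0.0):
    """Return (t statistic, confidence interval, reject H0?) for one coefficient."""
    t_stat = (b_k - beta_null) / se_b_k
    ci = (b_k - t_crit * se_b_k, b_k + t_crit * se_b_k)
    reject = abs(t_stat) > t_crit  # two-sided decision rule
    return t_stat, ci, reject


n, K = 25, 4                 # hypothetical sample size and number of x variables
df = n - (K + 1)             # degrees of freedom for t: 20
T_CRIT_DF20 = 2.086          # t table value for alpha/2 = 0.025, df = 20

# Hypothetical estimate and standard error, not taken from any JMP output.
t_stat, ci, reject = coefficient_inference(0.5, 0.2, T_CRIT_DF20)
print(df, t_stat, ci, reject)
```

Here the 95% interval excludes zero exactly when the test of a zero coefficient rejects at the 0.05 level, which is why the two procedures always agree.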

### Inference Examples

• Find a 95% confidence interval for ?

• Is seating of any help in predicting gas mileage once horsepower, weight and cargo have been taken into account? Carry out a test at the 0.05 significance level.

### Partial Slopes vs. Marginal Slopes

• Multiple Linear Regression Model: $E(y \mid x_1,\dots,x_K) = \beta_0 + \beta_1 x_1 + \cdots + \beta_K x_K$

• The coefficient $\beta_k$ is a partial slope. It indicates the change in the mean of $y$ that is associated with a one-unit increase in $x_k$ while holding all other variables fixed.

• A marginal slope is obtained when we perform a simple regression with only one X, ignoring all other variables. Consequently the other variables are not held fixed.

### Partial Slopes vs. Marginal Slopes Example

• In order to evaluate the benefits of a proposed irrigation scheme in a certain region, suppose that the relation of yield Y to rainfall R is investigated over several years.

• Data is in rainfall.JMP.

• Higher rainfall is associated with lower temperature.

• Rainfall is estimated to be beneficial once temperature is held fixed.

• Multiple regression provides a better picture of the benefits of an irrigation scheme because temperature would be held fixed in an irrigation scheme.
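The difference between the two kinds of slope can be reproduced on a toy version of the rainfall example. The numbers are invented (they are not the rainfall.JMP data): yield is constructed as exactly $2 \times \text{rain} + 3 \times \text{temp}$, and rain is negatively associated with temperature. The helper names are hypothetical, and the two-predictor formulas used are the standard closed-form least squares solution.

```python
def centered_cross(u, v):
    """Sum of (u_i - mean(u)) * (v_i - mean(v))."""
    u_bar = sum(u) / len(u)
    v_bar = sum(v) / len(v)
    return sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))


def marginal_slope(x, y):
    """Slope from a simple regression of y on x alone."""
    return centered_cross(x, y) / centered_cross(x, x)


def partial_slopes(x1, x2, y):
    """Slopes (b1, b2) from the multiple regression of y on x1 and x2."""
    s11, s22 = centered_cross(x1, x1), centered_cross(x2, x2)
    s12 = centered_cross(x1, x2)
    s1y, s2y = centered_cross(x1, y), centered_cross(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Invented toy data: wetter years are cooler, and yield = 2*rain + 3*temp exactly.
rain = [1, 2, 3, 4, 5, 6]
temp = [29, 29, 27, 27, 25, 25]
crop_yield = [2 * r + 3 * t for r, t in zip(rain, temp)]

marginal_rain = marginal_slope(rain, crop_yield)          # negative
partial_rain, partial_temp = partial_slopes(rain, temp, crop_yield)  # 2 and 3
print(marginal_rain, partial_rain, partial_temp)
```

In this toy setup the marginal slope of rain is negative because wetter years are also cooler, while the partial slope recovers the positive effect of rain with temperature held fixed.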