
Multiple regression



Presentation Transcript


  1. Multiple regression
  • More than one predictor variable may be responsible for the variation we see in the response.
  • Gas mileage is a function of weight, horsepower, use of air conditioning, etc.
  • Metal fatigue in airplanes is a function of the number of takeoffs and landings, climb-out speed, landing speed, etc.
  • Incidence of heart attack is a function of age, BMI, cholesterol levels, etc.
  • If the function that relates the predictor variables to the response is linear, then we have multiple linear regression, i.e., Y = β0 + β1x1 + β2x2 + … + βkxk + ε
  • If a polynomial relationship between predictors and response is the best fit, then we have polynomial regression, e.g., Y = β0 + β1x + β2x² + ε
  ETM 620 - 09U

  2. Multiple linear regression: Matrix approach
  • The viscosity of a slurry is believed to be a function of the temperature and the feed rate. A number of readings were taken.
  • Hypothesize the relationship, Y = β0 + β1x1 + β2x2 + ε, and calculate the estimate, ŷ = b0 + b1x1 + b2x2.

  3. Matrix form of the equation
  • Define the matrices: Y is the n × 1 vector of responses, X is the n × 3 design matrix whose rows are (1, x1i, x2i), β is the 3 × 1 vector of coefficients, and ε is the n × 1 vector of errors, so the model is Y = Xβ + ε.

  4. General Matrix Form
  • We obtain the least squares estimates (b0, b1, b2) of (β0, β1, β2) by solving the matrix equation XᵀXb = XᵀY for b, or b = (XᵀX)⁻¹XᵀY.
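The normal-equation solution on this slide can be sketched directly in NumPy. The slurry readings below are illustrative values we made up, since the slides' actual data table is not reproduced in the transcript:

```python
import numpy as np

# Illustrative readings (not the slides' data): temperature x1,
# feed rate x2, and viscosity response y.
x1 = np.array([80.0, 93.0, 100.0, 82.0, 90.0, 99.0])
x2 = np.array([8.0, 9.0, 10.0, 12.0, 11.0, 8.0])
y = np.array([2256.0, 2340.0, 2426.0, 2293.0, 2330.0, 2368.0])

# Design matrix X: a column of ones for b0, then one column per regressor.
X = np.column_stack([np.ones_like(x1), x1, x2])

# Normal equations (X'X) b = X'Y, i.e., b = (X'X)^-1 X'Y.
b = np.linalg.solve(X.T @ X, X.T @ y)
print(b)  # [b0, b1, b2]
```

Using `np.linalg.solve` on the normal equations mirrors the slide's algebra; in practice `np.linalg.lstsq` is numerically safer when XᵀX is nearly singular.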

  5. From Excel, compute XᵀX, (XᵀX)⁻¹, and XᵀY, and from these b = (XᵀX)⁻¹XᵀY.

  6. Or, use Excel's built-in regression analysis tool.

  7. How do we interpret these results?
  • R² – the degree to which the variability of the data is accounted for by the model; it will naturally increase as the number of regressor variables increases.
  • Adjusted R² – adjusted to reflect how well the addition of new regressors improves the model's ability to account for the variability in the data: adjusted R² increases if the new term significantly decreases MSE; adjusted R² << R² if the model includes terms that are not significant.
  • In our example, R² = _______________ ; adj R² = ________________ . Interpretation?
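The R² vs. adjusted-R² distinction is easy to compute by hand; this is a minimal sketch (the function name `r_squared_stats` and the demo data are ours, not from the slides):

```python
import numpy as np

def r_squared_stats(y, y_hat, p):
    """Return (R^2, adjusted R^2) for a fit with p parameters, incl. intercept."""
    n = len(y)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R^2 divides each sum of squares by its degrees of freedom,
    # so every added parameter must "pay for itself" by reducing MSE.
    adj_r2 = 1.0 - (ss_res / (n - p)) / (ss_tot / (n - 1))
    return r2, adj_r2

# Demo: made-up points lying close to a straight line.
x = np.array([1.0, 2, 3, 4, 5, 6])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
X = np.column_stack([np.ones_like(x), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
r2, adj = r_squared_stats(y, X @ b, p=2)
```

Because the (n − 1)/(n − p) factor is at least 1, adjusted R² never exceeds R², and adding a useless regressor can lower it even while plain R² creeps up.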

  8. Confidence intervals around β values …
  • Calculated by bj ± t(α/2, n−p) √(σ̂² Cjj), where Cjj is the jth diagonal element of (XᵀX)⁻¹ and σ̂² is estimated by the MSE.
  • Given in the regression results … Interpretation?
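The interval formula can be sketched numerically. The data below are the same illustrative values we invented earlier (the slides' actual slurry readings are not in the transcript), and the t critical value is taken from a standard t table:

```python
import numpy as np

# 95% CIs for the coefficients of Y = b0 + b1*x1 + b2*x2, using
# illustrative (made-up) data in place of the slides' slurry readings.
x1 = np.array([80.0, 93.0, 100.0, 82.0, 90.0, 99.0])
x2 = np.array([8.0, 9.0, 10.0, 12.0, 11.0, 8.0])
y = np.array([2256.0, 2340.0, 2426.0, 2293.0, 2330.0, 2368.0])
X = np.column_stack([np.ones_like(x1), x1, x2])
n, p = X.shape                              # n = 6 observations, p = 3 parameters

C = np.linalg.inv(X.T @ X)                  # (X'X)^-1
b = C @ X.T @ y                             # least squares estimates
mse = np.sum((y - X @ b) ** 2) / (n - p)    # sigma-hat^2
t_crit = 3.182                              # t_{0.025, 3} from a t table (n - p = 3)
half = t_crit * np.sqrt(mse * np.diag(C))   # CI half-width for each b_j
for bj, h in zip(b, half):
    print(f"{bj:.3f} ± {h:.3f}")
```

The diagonal entries Cjj scale the common error variance into a per-coefficient standard error, which is exactly what Excel and Minitab report next to each coefficient.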

  9. A trickier example… The gas mileage for a passenger automobile is believed to be a function of the weight of the car and the horsepower of the engine. Several cars were tested with the following results:

  10. Regression results from Excel …

  11. Let’s try it in Minitab … What do the residuals look like? What does the output of the regression tell us? What do we get if we try “Stepwise Regression”?

  12. Polynomial regression … Example: The expected yield of a crop of marigolds is hypothesized to be a function of the days after the first bloom. Yield (in number of blooms) from a given plot was counted in one growing season with the results as given in the data file. Step 1: plot the data …

  13. Plot of the data …

  14. Fitting the polynomial … Hypothesize the model, Y = β0 + β1x + β2x² + ε, then fit it in Excel or in Minitab.
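Polynomial regression is still linear in the coefficients, so the same least-squares machinery applies. This sketch uses hypothetical bloom counts, since the slides' marigold data file is not included:

```python
import numpy as np

# Hypothetical data (not the slides' marigold file):
# x = days after first bloom, y = number of blooms on the plot.
x = np.array([2.0, 4, 6, 8, 10, 12, 14, 16])
y = np.array([5.0, 21, 40, 47, 51, 46, 35, 18])

# Quadratic model Y = b0 + b1*x + b2*x^2: polynomial regression just
# adds an x^2 column to the design matrix and fits as usual.
X = np.column_stack([np.ones_like(x), x, x ** 2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
```

For yield data that rises and then falls, the fitted x² coefficient comes out negative (a concave curve), which is the visual check Step 1's plot is meant to motivate.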

  15. Indicator variables
  • Allow us to include qualitative factors in regression analysis, e.g., machine type, grade of fuel, operator.
  • Example: In addition to SAT scores, an admissions officer is concerned that whether or not a student attended private high school might affect the freshman GPA. Data from 20 students is given in the data file. Conduct the analysis and interpret the results …
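A qualitative factor enters the model as a 0/1 column. The eight-student data set below is invented for illustration (the slides' 20-student file is not reproduced here):

```python
import numpy as np

# Hypothetical admissions data: SAT score, private-school indicator
# (1 = private, 0 = public), and freshman GPA.
sat = np.array([1100.0, 1200, 1300, 1050, 1250, 1400, 1150, 1350])
private = np.array([0.0, 1, 0, 0, 1, 1, 0, 1])
gpa = np.array([2.8, 3.3, 3.4, 2.6, 3.5, 3.8, 2.9, 3.6])

# The indicator's coefficient estimates the GPA shift for private-school
# students at a fixed SAT score (a common-slope, shifted-intercept model).
X = np.column_stack([np.ones_like(sat), sat, private])
b = np.linalg.solve(X.T @ X, X.T @ gpa)
```

In this toy data the indicator coefficient is positive, i.e., the fitted line for private-school students sits above the public-school line by a constant amount.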

  16. Problems in multiple regression • Multicollinearity • Influential observations • Autocorrelation
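A standard diagnostic for the first problem, multicollinearity, is the variance inflation factor; this helper (our name and implementation, not from the slides) regresses each column on the others:

```python
import numpy as np

def vif(X):
    """Variance inflation factors for the columns of regressor matrix X
    (regressor columns only, no intercept column). VIF_j = 1 / (1 - R_j^2),
    where R_j^2 comes from regressing column j on the remaining columns."""
    n, k = X.shape
    out = []
    for j in range(k):
        yj = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])   # intercept + other columns
        b, *_ = np.linalg.lstsq(A, yj, rcond=None)
        resid = yj - A @ b
        r2 = 1.0 - resid @ resid / np.sum((yj - yj.mean()) ** 2)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)
```

A common rule of thumb treats VIF above about 10 as a sign of serious multicollinearity, which is one reason stepwise regression (slide 11) may drop a regressor that looks useful on its own.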
