Ch 8 Linear Regression

Ch 8 Linear Regression. AP Statistics Mrs Johnson. 3 Ways to Write the Least Squares Regression Line. From Data Using the calculator, you input data into lists, run a linear regression through the data From statistics The LSRL runs through the centroid

zulema

Presentation Transcript


  1. Ch 8 Linear Regression AP Statistics Mrs Johnson

  2. 3 Ways to Write the Least Squares Regression Line (LSRL) • From data • Using the calculator, you input data into lists and run a linear regression on the data • From statistics • The LSRL runs through the centroid (x-bar, y-bar) • Using the statistics r, sx, sy, and the means of x and y, we can write the equation of the LSRL from formulas GIVEN on the AP exam • From computer output • You will often be given computer output; the slope and y-intercept always appear in that output

  3. Interpreting SLOPE in a problem: • When asked to interpret slope, remember that slope is the change in y over the change in x • State the following: As the ________ (explanatory variable) increases by 1 _______ (insert unit), the __________ (response variable) is predicted to increase/decrease (choose the word that matches the sign of the slope) by _______ (insert slope and units). • Example: As the caloric content of a burger increases by 1 calorie, the fat content of the burger is PREDICTED to increase by _____ grams.
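
The fill-in-the-blank sentence above can be sketched as a small helper. This is only an illustration: the function name `interpret_slope` and the slope value 0.07 are made up for the example and are not from the slides.

```python
def interpret_slope(explanatory, x_unit, response, slope, y_unit):
    """Build the slope-interpretation sentence from its five blanks."""
    # Pick the verb from the sign of the slope, as the template requires.
    direction = "increase" if slope >= 0 else "decrease"
    return (
        f"As the {explanatory} increases by 1 {x_unit}, the {response} "
        f"is predicted to {direction} by {abs(slope)} {y_unit}."
    )

# Hypothetical slope of 0.07 grams of fat per calorie (not a value from the slides).
print(interpret_slope("caloric content of a burger", "calorie",
                      "fat content", 0.07, "grams"))
```

Keeping the word "predicted" in the sentence matters: the slope describes the model, not any individual burger.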

  4. Interpreting y-intercepts: • The y-intercept occurs when the explanatory variable is 0. • Interpretation depends on the example; often there is no real application for the y-intercept. • When the explanatory variable is 0, the response variable is predicted to be _____. (substitute 0 for x in the equation and solve)

  5. Coefficient of Determination – R2 • R2 is the square of the correlation coefficient r • Gives the proportion (often stated as a percentage) of the variation in the response variable accounted for by the model • R2 = 0 would mean NONE of the variation in the data is accounted for by the model, so the model is useless • R2 = 1 would mean ALL of the variation in the data is accounted for by the model

  6. Coefficient of Determination – R2 • Example: • A given data set has a correlation coefficient, r, of 0.8. • R2 = 0.64. Interpretation: 64% of the variation in the data is accounted for by our model • A given data set has a correlation coefficient, r, of 0.4. • R2 = 0.16. Interpretation: 16% of the variation in the data is accounted for by our model
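
The two examples on this slide are just r squared. A minimal Python sketch of that arithmetic (the helper name `r_squared` is chosen here for illustration):

```python
def r_squared(r):
    """Coefficient of determination for a simple linear model: the square of r."""
    return r ** 2

# The two correlation coefficients used on the slide.
for r in (0.8, 0.4):
    r2 = r_squared(r)
    print(f"r = {r}: R^2 = {r2:.2f}, so about {r2:.0%} of the variation is accounted for")
```

Note that halving r from 0.8 to 0.4 cuts R2 to a quarter, from 0.64 to 0.16, because the relationship is quadratic.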

  7. Coefficient of Determination – R2 • NOTE: When interpreting R2, use this fill-in-the-blank: • According to the linear model, _______ (insert R2 value as a percentage) of the variability in the response variable is accounted for by the variation in the explanatory variable.

  8. Predicting with LSRL • Using the LSRL, we can predict y values from given x values • CAUTION: only use the LSRL to predict within the range of x values in your data • Do NOT extrapolate beyond the data • Only interpolate within the given data set • Using the LSRL from the previous example, determine the predicted fat content for a burger with 550 calories.
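
One way to picture the "do not extrapolate" rule is a prediction function that refuses x values outside the observed range. The sketch below is an illustration only: the intercept, slope, and calorie range are hypothetical placeholders, since the transcript does not give the burger regression line or the data.

```python
def predict_within_range(x, b0, b1, x_min, x_max):
    """Return y-hat = b0 + b1*x, but refuse to extrapolate outside [x_min, x_max]."""
    if not (x_min <= x <= x_max):
        raise ValueError(
            f"x = {x} is outside the data range [{x_min}, {x_max}]; do not extrapolate"
        )
    return b0 + b1 * x

# Hypothetical line fat-hat = -10 + 0.07 * calories and a hypothetical
# observed calorie range of 400 to 700. None of these come from the slides.
B0, B1 = -10.0, 0.07
CAL_MIN, CAL_MAX = 400, 700

# 550 calories is inside the range, so a prediction is allowed.
print(round(predict_within_range(550, B0, B1, CAL_MIN, CAL_MAX), 1))

# 1200 calories is outside the range, so the function refuses.
try:
    predict_within_range(1200, B0, B1, CAL_MIN, CAL_MAX)
except ValueError as err:
    print("Refused:", err)
```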

  9. Example – Fat / Calorie Content • Finding the LSRL from given data using the calculator • Enter data into L1 (fat) and L2 (cal) • Go to STAT – CALC #8 – LinReg(a+bx) • Select the appropriate lists and STORE the regression line • Write the regression line using WORDS as variables
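
For readers without the calculator, the same least-squares fit can be sketched in plain Python. The four (fat, calories) pairs below are made up for illustration, because the burger data are not included in the transcript; `fit_lsrl` is a name chosen here, not a calculator or library function.

```python
def fit_lsrl(x, y):
    """Least-squares line y-hat = b0 + b1*x, returned as (b0, b1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: sum of cross-deviations divided by sum of squared x deviations.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: chosen so the line passes through the centroid (x_bar, y_bar).
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical data standing in for L1 (fat, grams) and L2 (calories).
fat = [10, 20, 30, 40]
calories = [300, 410, 490, 600]

b0, b1 = fit_lsrl(fat, calories)
print(f"calories-hat = {b0:.1f} + {b1:.1f} * fat")
```

Writing the result with the words "calories" and "fat" rather than y and x follows the slide's advice to use words as variables.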

  10. Interpret Slope: • As the fat content of a burger increases by 1 gram, the caloric content is PREDICTED to increase by _____ calories. • What is the y-intercept in the burger example? • A burger with 0 grams of fat is predicted to have _____ calories. • Interpret r • Interpret r-squared • Predict the calories for a burger with 35 grams of fat

  11. Residual • The difference between the actual value from a data point, y, and the predicted value, y-hat: residual = (actual y) - (predicted y-hat) • Residual plots • Important tool for determining whether a line is an appropriate fit for the data • A line is a good fit according to the residual plot IF: • No apparent pattern: no direction or shape • Scattered horizontally, with no major gaps or outliers

  12. Residual Plots • No pattern: indicates a line is a good fit • U-shaped pattern: indicates a non-linear model would be a better fit • Upside-down U-shaped pattern: indicates a non-linear model would be a better fit

  13. Residual Plot of Example: • Once you run a regression on your calculator, the residuals are created automatically and are ready to display • From STAT PLOT, keep the x list as L1, then go to the y list and select RESID from the list menu • Zoom 9 will show you the residual plot • Back to the burger data: what is the residual for your burger with 35 grams of fat? • Does our line OVER or UNDER predict? • Negative residuals mean our line OVER predicts • Positive residuals mean our line UNDER predicts
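
The sign rule for residuals follows from residual = actual minus predicted. A short sketch, using a hypothetical regression line and a hypothetical 35-gram burger (neither is from the slides):

```python
def residual(actual_y, predicted_y):
    """Residual = actual value minus predicted value."""
    return actual_y - predicted_y

def over_or_under(res):
    """Translate the sign of a residual into the slide's wording."""
    if res < 0:
        return "line OVER predicts"
    if res > 0:
        return "line UNDER predicts"
    return "line predicts exactly"

# Hypothetical line calories-hat = 205 + 9.8 * fat, and a hypothetical
# burger with 35 grams of fat that actually has 530 calories.
predicted = 205 + 9.8 * 35
res = residual(530, predicted)
print(f"residual = {res:.1f} calories: {over_or_under(res)}")
```

Here the actual value sits below the line, so the residual is negative and the line over predicts.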

  14. Set 2: Writing the Line of Best Fit from statistics given about the data • The line of best fit will be written in the form: y-hat = b0 + b1x • y-hat = predicted value • b0 = y-intercept • b1 = slope • Finding the slope of the best fit line: b1 = r(sy / sx) • sy = standard deviation of response variable • sx = standard deviation of explanatory variable • r = correlation coefficient

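  15. Finding the y intercept • Finding the y-intercept of the best fit line: • Start from the equation for the predicted value of y, y-hat = b0 + b1x • The line passes through the point of means (x-bar, y-bar), so use the given mean values of x and y • Use the value of b1 (slope) calculated from the statistics r, sx, sy • Substitute the point (x-bar, y-bar) and solve for b0: b0 = y-bar - b1(x-bar)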
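
The two formulas, b1 = r(sy / sx) and b0 = y-bar - b1(x-bar), can be sketched as follows. The summary statistics in the example are invented for illustration and are not the values from exercise #36.

```python
def lsrl_from_stats(r, s_x, s_y, x_bar, y_bar):
    """Return (b0, b1) for y-hat = b0 + b1*x from summary statistics."""
    b1 = r * s_y / s_x          # slope from r and the two standard deviations
    b0 = y_bar - b1 * x_bar     # intercept: the line passes through (x_bar, y_bar)
    return b0, b1

# Hypothetical summary statistics (NOT the textbook's values).
b0, b1 = lsrl_from_stats(r=0.9, s_x=10, s_y=100, x_bar=20, y_bar=400)
print(f"y-hat = {b0:.1f} + {b1:.1f} x")

# Sanity check: plugging in x_bar should give back y_bar.
print(f"prediction at x-bar: {b0 + b1 * 20:.1f}")
```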

  16. Example: #36 pg 193 • Given that a line is an appropriate model for a set of data comparing fat and calories for 11 brands of fast food chicken sandwiches, and given the summary statistics:

  17. Example #36 pg 193 continued • Write the equation for line of best fit. • Interpret the slope in the context of the problem • Explain the meaning of the y intercept • What does it mean if a sandwich has a negative residual? • If a sandwich had 23 grams of fat, what is the predicted value for calories?

  18. Method #3 – Writing the LSRL from Computer Output • Given the following data set – comparing height (ft) and weight (lb) for 10 people in a weight loss program • Describe and interpret the correlation

  19. Reading Computer Output for Line of Best Fit
      Dependent Variable is: Weight
      R-Squared = 0.91
      Variable    Coefficient    SE (Coeff)
      Constant    -289.5         2.606
      Height      86.1           .0013
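
Reading the output, the "Constant" row is the y-intercept and the "Height" row is the slope, giving weight-hat = -289.5 + 86.1(height). A sketch of using those two numbers follows; the 5.5 ft height is an arbitrary illustration and is assumed, not known, to fall within the range of the data.

```python
import math

# Values read directly from the computer output on the slide.
INTERCEPT = -289.5   # "Constant" row
SLOPE = 86.1         # "Height" row, in lb per foot
R_SQUARED = 0.91

def predicted_weight(height_ft):
    """weight-hat = intercept + slope * height (height in feet)."""
    return INTERCEPT + SLOPE * height_ft

# r is the square root of R-squared, carrying the same sign as the slope.
r = math.copysign(math.sqrt(R_SQUARED), SLOPE)

print(f"weight-hat = {INTERCEPT} + {SLOPE} * height")
print(f"r is about {r:.3f}")
# 5.5 ft is a hypothetical height chosen only for illustration.
print(f"predicted weight at 5.5 ft: {predicted_weight(5.5):.1f} lb")
```

The sign step matters because the output reports only R-squared; the positive slope is what indicates the correlation is positive.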

  20. Linear Regression Line

  21. Residual Plot – Is Line a Good Fit?

  22. Interpreting Slope / y intercepts • Slope interpretation: As the height of a participant in the weight loss program increases by 1 foot, the predicted weight of the participant increases by approximately 86 lbs. OR As the height of the participant increases by 1 inch, the predicted weight of the participant increases by approximately 7 lbs (86.1 / 12).
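
The per-inch version of the slope is a unit conversion of the per-foot slope from the computer output. A minimal sketch of that arithmetic:

```python
SLOPE_LB_PER_FT = 86.1   # slope from the computer output, in lb per foot
INCHES_PER_FOOT = 12

# One inch is 1/12 of a foot, so the predicted change per inch is 1/12 as large.
slope_lb_per_inch = SLOPE_LB_PER_FT / INCHES_PER_FOOT
print(f"{slope_lb_per_inch:.3f} lb per inch")
```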

  23. Interpreting y-intercepts: • The y-intercept occurs when the explanatory variable is 0. • What is the y-intercept in this example? • -289.5 • No real-life interpretation, but the literal interpretation is that for a participant in the weight loss program who is 0 feet tall, the predicted weight would be -289.5 lbs.
