Ch 8 Linear Regression AP Statistics Mrs Johnson
3 Ways to Write the Least Squares Regression Line • From Data • Using the calculator, you input data into lists and run a linear regression on the data • From statistics • The LSRL runs through the centroid, the point (mean of x, mean of y) • Using the statistics r, sx, sy, and the means of x and y, we can write the equation of the LSRL from formulas GIVEN on the AP exam. • From computer output • Many times you will be given computer output – the slope and y intercept always appear in this output
Interpreting SLOPE in a problem: • When asked to interpret slope – remember that slope is the change in y over the change in x • State the following: As the ________ (explanatory variable) increases by 1 _______ (insert unit) the __________ (response variable) is predicted to increase/decrease (use appropriate word given sign of slope) by _______ (insert slope here and units). • As the caloric content of a burger increases by 1 calorie, the fat content of the burger is PREDICTED to increase by _____ grams.
Interpreting y-intercepts: • The y intercept occurs when the explanatory variable is 0. • Interpretation depends on the example – oftentimes there is no real application for the y-intercept. • When the explanatory variable is 0, the response variable is predicted to be _____. (substitute 0 for x in the equation and solve)
Coefficient of Determination – R² • R² is the square of the correlation coefficient r • Gives the proportion (often stated as a percentage) of the data’s variation accounted for by the model • R² = 0 would mean NONE of the variation in the data is accounted for by the model, so the model is useless. • R² = 1 would mean ALL of the variation in the data is accounted for by the model
Coefficient of Determination – R² • Example: • A given data set has a correlation coefficient, r, of 0.8. • R² = 0.64. Interpretation: 64% of the variation in the data is accounted for by our model • A given data set has a correlation coefficient, r, of 0.4. • R² = 0.16. Interpretation: 16% of the variation in the data is accounted for by our model
Coefficient of Determination – R² • NOTE: When interpreting R², use this fill in the blank: • According to the linear model, _______ (insert R² value as a percentage) of the variability in the response variable is accounted for by the variation in the explanatory variable.
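The arithmetic in the two examples above can be checked with a short Python sketch. The function name `r_squared_percent` is made up for this illustration; the values 0.8 and 0.4 come from the example slide.

```python
def r_squared_percent(r: float) -> float:
    """Return R^2 as a percentage, given the correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and 1")
    return r ** 2 * 100


# The two correlation coefficients from the example slide
for r in (0.8, 0.4):
    pct = r_squared_percent(r)
    print(f"r = {r}: about {pct:.0f}% of the variability is accounted for by the model")
```

Because r is squared, r = -0.8 gives the same 64% as r = 0.8: R² says how much variation the model accounts for, not the direction of the relationship.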
Predicting with LSRL • Using the LSRL – we can predict y values given x values • CAUTION – only use the LSRL to predict behavior within the bounds of your data • Do NOT extrapolate beyond the data • Only interpolate within the given data set • Using the LSRL from the previous example, predict the fat content for a burger with 550 calories.
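One way to picture the "do not extrapolate" caution is a prediction function that refuses x values outside the range of the data. This is only a sketch: the intercept, slope, and data range below are made-up numbers, not the burger data.

```python
def predict(x, intercept, slope, x_min, x_max):
    """Predict y from the LSRL, refusing to extrapolate outside [x_min, x_max]."""
    if not x_min <= x <= x_max:
        raise ValueError(
            f"x = {x} is outside the data range [{x_min}, {x_max}]; do not extrapolate"
        )
    return intercept + slope * x


# Hypothetical line and data range, for illustration only
intercept, slope = 2.0, 0.5
x_min, x_max = 10, 40

print(predict(20, intercept, slope, x_min, x_max))  # inside the data: allowed

try:
    predict(100, intercept, slope, x_min, x_max)  # outside the data: refused
except ValueError as err:
    print("Refused:", err)
```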
Example – Fat / Calorie Content Finding the LSRL from given data – using calculator • Insert data into L1 (fat) and L2 (cal) • Go to Stat – Calc #8 – LinReg(a+bx) • Select appropriate lists and STORE regression line • Write regression line using WORDS as variables
Interpret Slope: • As the fat content in a burger increases by 1 gram, the caloric content is PREDICTED to increase by _____ calories. • What is the y intercept in the burger example? • A burger with 0 grams of fat is predicted to have _____ calories. • Interpret r • Interpret r-squared • Predict the calories for a burger with 35 grams of fat
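For readers without the calculator, the same least squares line can be computed by hand or in a few lines of Python. The sketch below uses the standard least squares formulas; the four (fat, calories) pairs are invented for illustration and are not the class data set.

```python
from statistics import mean


def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least squares regression line of y on x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar  # forces the line through the centroid
    return intercept, slope


# Hypothetical data: fat in grams (explanatory), calories (response)
fat = [10, 20, 30, 40]
calories = [310, 390, 510, 590]

intercept, slope = least_squares_line(fat, calories)

# Write the regression line using WORDS as variables
print(f"predicted calories = {intercept:.1f} + {slope:.1f} * (grams of fat)")
```

With these made-up numbers the slope works out to 9.6, so the slope sentence would read: as fat increases by 1 gram, calories are predicted to increase by 9.6.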
Residual • The difference between the actual value from a data point, y, and the predicted value, y-hat: residual = y – y-hat • Residual plots • An important tool for determining whether a line is an appropriate fit for the data • A line is a good fit according to the residual plot IF: • No apparent pattern – no direction or shape • Scattered horizontally, with no major gaps or outliers
Residual Plots • No pattern – indicates a line is a good fit • U-shaped pattern – indicates a non-linear model would be a better fit • Upside-down U-shaped pattern – indicates a non-linear model would be a better fit
Residual Plot of Example: • Once you run a regression in your calculator, the residuals are created automatically and are ready for you to display • From STAT PLOT, keep the x list as L1, then go to the y list and find RESID in the list menu • Zoom 9 will show you the residual plot • Back to the burger data – what is the residual for your burger with 35 grams of fat? • Does our line OVER or UNDER predict? • Negative residuals mean our line OVER predicts • Positive residuals mean our line UNDER predicts
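The sign rule for residuals follows directly from residual = actual – predicted. A minimal sketch, assuming a made-up line (intercept 200, slope 10) and a made-up burger with 35 grams of fat and 520 actual calories:

```python
def residual(actual_y, predicted_y):
    """Residual = actual value minus predicted value."""
    return actual_y - predicted_y


def over_or_under(res):
    """Translate the sign of a residual into what the line did."""
    if res < 0:
        return "line OVER predicts"
    if res > 0:
        return "line UNDER predicts"
    return "line predicts exactly"


# Hypothetical line and data point, for illustration only
intercept, slope = 200, 10
fat_grams, actual_calories = 35, 520

predicted_calories = intercept + slope * fat_grams
res = residual(actual_calories, predicted_calories)
print(f"predicted = {predicted_calories}, residual = {res}: {over_or_under(res)}")
```

Here the prediction (550) is larger than the actual value (520), so the residual is negative and the line over predicts.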
Set 2: Writing the Line of Best Fit – from statistics given about data • The line of best fit will be written in the form: y-hat = b0 + b1x • y-hat = predicted value • b0 = y intercept • b1 = slope • Finding the slope of the best fit line: b1 = r(sy / sx) • sy = standard deviation of response variable • sx = standard deviation of explanatory variable • r = correlation coefficient
Finding the y intercept • Finding the y intercept of the best fit line: • From the equation for the predicted value of y, y-hat = b0 + b1x • Given the mean values for x and y: the point (x-bar, y-bar) is always on the LSRL • Given the value of b1 – slope – calculated from statistics r, sx, sy • Substitute the point (x-bar, y-bar) and solve for b0: b0 = y-bar – b1(x-bar)
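The two steps (slope from r, sx, sy, then intercept from the centroid) can be sketched in Python. All five summary statistics below are invented for illustration; they are not the statistics from the textbook exercise.

```python
def lsrl_from_statistics(r, s_x, s_y, x_bar, y_bar):
    """Return (b0, b1) for y-hat = b0 + b1*x from summary statistics."""
    b1 = r * s_y / s_x        # slope: b1 = r * (sy / sx)
    b0 = y_bar - b1 * x_bar   # intercept: the line passes through (x-bar, y-bar)
    return b0, b1


# Hypothetical summary statistics, for illustration only
r = 0.9
s_x, s_y = 8.0, 80.0
x_bar, y_bar = 25.0, 450.0

b0, b1 = lsrl_from_statistics(r, s_x, s_y, x_bar, y_bar)
print(f"y-hat = {b0:.1f} + {b1:.1f}x")

# Check: plugging in x-bar should give back y-bar (the centroid is on the line)
print(f"prediction at x-bar: {b0 + b1 * x_bar:.1f}")
```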
Example: #36 pg 193 • Given that a line is the form of best fit for a set of data which compares fat and calories on 11 brands of fast food chicken sandwiches, and given the summary statistics:
Example #36 pg 193 continued • Write the equation for line of best fit. • Interpret the slope in the context of the problem • Explain the meaning of the y intercept • What does it mean if a sandwich has a negative residual? • If a sandwich had 23 grams of fat, what is the predicted value for calories?
Method #3 – Writing the LSRL from Computer Output • Given the following data set – comparing height (ft) and weight (lb) for 10 people in a weight loss program • Describe and interpret the correlation
Reading Computer Output for Line of Best Fit
Dependent Variable is: Weight
R-Squared = 0.91
Variable    Coefficient    SE (Coeff)
Constant    -289.5         2.606
Height      86.1           .0013
Interpreting Slope / y intercepts Slope interpretation: As the height of the participant in the weight loss program increases by 1 foot, the predicted weight of the participant increases by approximately 86 lbs. OR As the height of the participant increases by 1 inch, the predicted weight of the participant increases by approximately 7 lbs (86.1 / 12).
Interpreting y-intercepts: • The y intercept occurs when the explanatory variable is 0. • What is the y intercept in this example: • -289.5 • No real-life interpretation, since no participant can be 0 feet tall – but the literal interpretation is that for a participant in the weight loss program who is 0 feet tall, the predicted weight would be -289.5 lbs.
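The numbers read off the computer output can be turned into a line and checked in Python. The coefficients and R-Squared below come from the output shown earlier; the 5.5 ft height is a made-up input, and whether it falls inside the range of the original data is an assumption.

```python
import math

# Values read from the computer output
intercept = -289.5       # "Constant" row
slope_per_foot = 86.1    # "Height" row
r_squared = 0.91


def predicted_weight(height_ft):
    """Predicted weight (lb) from height (ft): weight-hat = -289.5 + 86.1 * height."""
    return intercept + slope_per_foot * height_ft


# Slope per inch: divide the per-foot slope by 12
slope_per_inch = slope_per_foot / 12

# r is the square root of R^2, taking the same sign as the slope
r = math.copysign(math.sqrt(r_squared), slope_per_foot)

print(f"slope per inch: about {slope_per_inch:.1f} lb")
print(f"r: about {r:.2f}")
print(f"predicted weight at 5.5 ft (hypothetical): {predicted_weight(5.5):.1f} lb")
print(f"predicted weight at 0 ft (the y intercept): {predicted_weight(0):.1f} lb")
```

The positive slope means r is positive, so r is about 0.95: a strong, positive, linear association between height and weight.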