Chapter 8: Linear Regression—Part A

A.P. Statistics


Linear Model

  • Making a scatterplot allows you to describe the relationship between the two quantitative variables.

  • However, it is often more useful to use that linear relationship to predict or estimate values based on the real data.

  • We use the Linear Model to make those predictions and estimations.


Linear Model

Normal Model

  • Allows us to make predictions and estimations about the population and future events.

  • It is a model of real data, as long as that data has a nearly symmetric distribution.

Linear Model

  • Allows us to make predictions and estimations about the population and future events.

  • It is a model of real data, as long as that data has a linear relationship between two quantitative variables.


Linear Model and the Least Squares Regression Line

  • To make this model, we need to find a line of best fit.

  • This line of best fit is the “predictor line”: it is how we predict or estimate our response variable, given our explanatory variable.

  • The line of best fit is chosen by how well it minimizes the residuals.


Residuals and the Least Squares Regression Line

  • The residual is the difference between the observed value and the predicted value: residual = observed - predicted, or $e = y - \hat{y}$.

  • It tells us how far off the model’s prediction is at that point.

  • Negative residual: predicted value is too big (overestimation)

  • Positive residual: predicted value is too small (underestimation)
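The sign convention above can be sketched in a few lines of Python. The observed and predicted values here are hypothetical, chosen only to show one overestimate, one exact prediction, and one underestimate; the helper names `residual` and `describe` are illustrative, not from the slides.

```python
# Hypothetical observed values and model predictions (not from the slides).
observed = [12.0, 20.0, 31.0, 25.0]
predicted = [14.5, 20.0, 27.0, 26.5]

def residual(obs, pred):
    """Residual = observed value minus predicted value."""
    return obs - pred

def describe(res):
    """Label a residual by its sign, following the bullets above."""
    if res < 0:
        return "overestimate (prediction too big)"
    if res > 0:
        return "underestimate (prediction too small)"
    return "exact"

residuals = [residual(o, p) for o, p in zip(observed, predicted)]
for o, p, e in zip(observed, predicted, residuals):
    print(f"observed={o:5.1f} predicted={p:5.1f} residual={e:+5.1f} -> {describe(e)}")
```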


Residuals


Least Squares Regression Line

  • The LSRL is the line for which the sum of the squared residuals is the smallest.

  • Why not just find a line where the sum of the residuals is the smallest?

    • The sum of the residuals is zero for any line through the point of averages, because positive and negative residuals cancel, so it cannot single out one line

    • By squaring residuals, we get all nonnegative values, which can be added without cancelling

    • Squaring emphasizes the large residuals, which have a big impact on the correlation and the regression line
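A small sketch may help show why the plain sum of residuals cannot pick a line while the sum of squares can. The data below are hypothetical, and the slope formula used for the least squares line is the standard one (sum of cross-deviations over sum of squared x-deviations), which the slides do not state explicitly.

```python
from statistics import mean

# Hypothetical (x, y) data, not taken from the slides.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]

x_bar, y_bar = mean(xs), mean(ys)

def residuals_for(slope):
    """Residuals for the line with the given slope through (x_bar, y_bar)."""
    intercept = y_bar - slope * x_bar
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

def sse(res):
    """Sum of squared residuals."""
    return sum(e * e for e in res)

# Least squares slope: cross-deviations divided by squared x-deviations.
ls_slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
            / sum((x - x_bar) ** 2 for x in xs))

# Every line below passes through the point of averages, so each has a
# residual sum of (numerically) zero, but the SSE values differ.
for slope in (0.0, 1.0, ls_slope, 3.0):
    res = residuals_for(slope)
    print(f"slope={slope:5.2f}  sum of residuals={sum(res):+.6f}  SSE={sse(res):8.3f}")
```

Among the slopes tried, the least squares slope gives the smallest SSE, even though all four lines tie on the plain sum of residuals.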


Scatterplot of Math and Verbal SAT scores


Scatterplot of Math and Verbal SAT scores with incorrect LSRL


Scatterplot of Math and Verbal SAT scores with correct LSRL


Correlation and the Line (Standardized Data)

  • LSRL passes through $\bar{z}_x = 0$ and $\bar{z}_y = 0$, the origin of the standardized scatterplot

  • LSRL equation is: $\hat{z}_y = r\,z_x$

    “Moving one standard deviation from the mean in x, we can expect to move about r standard deviations from the mean in y.”
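As a rough check of the standardized form, the sketch below converts hypothetical paired data to z-scores and fits a least squares line to them. It assumes the sample standard deviation (dividing by n - 1) is used for standardizing; the helper name `z_scores` is illustrative.

```python
from statistics import mean, stdev

# Hypothetical paired data, not taken from the slides.
xs = [2.0, 4.0, 6.0, 8.0, 10.0]
ys = [1.0, 3.0, 2.0, 5.0, 4.0]

def z_scores(values):
    """Standardize: subtract the mean, divide by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

zx, zy = z_scores(xs), z_scores(ys)
n = len(xs)

# Correlation r: sum of products of z-scores, divided by n - 1.
r = sum(a * b for a, b in zip(zx, zy)) / (n - 1)

# Least squares slope for the z-scores. Both z-score means are 0, so the
# fitted line passes through the origin and only the slope is needed.
slope_z = sum(a * b for a, b in zip(zx, zy)) / sum(a * a for a in zx)

print(f"r = {r:.4f}, slope on standardized data = {slope_z:.4f}")

# Predicted z-score of y for a point one standard deviation above the mean in x.
print(f"predicted z_y at z_x = 1: {r * 1:.4f}")
```

The slope fitted to the standardized data matches r, which is what the quoted interpretation describes.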


Interpreting Standardized Slope of LSRL

LSRL of scatterplot: $\hat{z}_{fat} = 0.83\,z_{protein}$

For every standard deviation above (below) the mean a sandwich is in protein, we’ll predict that its fat content is 0.83 standard deviations above (below) the mean.


LSRL that models data in real units

Protein (x) and Fat (y)

LSRL Equation: $\widehat{fat} = 6.8 + 0.97\,protein$


Interpreting LSRL

Slope: On average, each additional gram of protein is associated with an additional 0.97 grams of fat.

y-intercept: An item that has zero grams of protein is predicted to have 6.8 grams of fat.

ALWAYS CHECK TO SEE IF y-INTERCEPT MAKES SENSE IN THE CONTEXT OF THE PROBLEM AND DATA
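The slope and y-intercept interpretations can be tried out with a short sketch. The coefficients 0.97 and 6.8 come from the interpretation above; the function name `predict_fat` and the 30 g input are hypothetical choices for illustration.

```python
# Coefficients from the slide's interpretation: slope 0.97, y-intercept 6.8.
INTERCEPT = 6.8   # predicted grams of fat at 0 grams of protein
SLOPE = 0.97      # additional predicted grams of fat per gram of protein

def predict_fat(protein_grams):
    """Predicted fat (grams) from protein (grams) using the LSRL."""
    return INTERCEPT + SLOPE * protein_grams

# 30 g of protein is a hypothetical input chosen for illustration.
print(f"predicted fat at 30 g protein: {predict_fat(30):.1f} g")

# The slope is the change in the prediction for one extra gram of protein.
print(f"change per extra gram: {predict_fat(31) - predict_fat(30):.2f} g")

# The intercept is the prediction at zero protein; whether a zero-protein
# item is plausible depends on the context and the range of the data.
print(f"predicted fat at 0 g protein: {predict_fat(0):.1f} g")
```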


Properties of the LSRL

The fact that the Sum of Squared Errors (SSE, the least squares sum) is as small as possible means that for this line:

  • The sum and the mean of the residuals are both 0

  • The variation in the residuals is as small as possible

  • The line contains the point of averages, $(\bar{x}, \bar{y})$
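These properties can be checked numerically. The sketch below fits a least squares line to hypothetical data using the standard slope and intercept formulas (not stated on the slides) and then inspects the residuals and the point of averages.

```python
from statistics import mean

# Hypothetical (x, y) data, not taken from the slides.
xs = [3.0, 5.0, 8.0, 10.0, 14.0]
ys = [9.0, 12.0, 14.0, 18.0, 22.0]

x_bar, y_bar = mean(xs), mean(ys)

# Least squares slope and intercept.
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar

residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Both values should be zero up to floating-point rounding.
print(f"sum of residuals:  {sum(residuals):.10f}")
print(f"mean of residuals: {mean(residuals):.10f}")

# The prediction at x_bar equals y_bar, so the line contains the point of averages.
print(f"prediction at x_bar = {intercept + slope * x_bar:.4f}, y_bar = {y_bar:.4f}")
```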


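Assumptions and Conditions for using LSRL

  • Quantitative Variable Condition: both variables must be quantitative

  • Straight Enough Condition: the scatterplot should look roughly linear; if not, re-express the data

  • Outlier Condition: check for outliers; if any are present, consider the regression with and without them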


Residuals and LSRL

  • Residuals should be used to check whether a linear model is appropriate and whether the LSRL that was calculated fits the data well

  • Residuals are the part of the data that has not been modeled by our linear model


Residuals and LSRL

What to Look for in a Residual Plot to Satisfy Straight Enough Condition:

The plot should show no patterns and no interesting features (like direction or shape). It should stretch horizontally with about the same scatter throughout, with no bends or outliers.

The distribution of residuals should be symmetric if the original data is straight enough.

Looking at a scatterplot of the residuals vs. the x-value is a good way to check the Straight Enough Condition, which determines if a linear model is appropriate.
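To see what a failed Straight Enough Condition looks like, the sketch below fits a line to deliberately curved, hypothetical data (y equal to x squared) and prints the residuals against x as a text stand-in for a residual plot.

```python
from statistics import mean

# Hypothetical curved data (y = x squared), chosen to fail the Straight Enough Condition.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [x * x for x in xs]

x_bar, y_bar = mean(xs), mean(ys)

# Least squares slope and intercept (standard formulas).
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar

residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# A text stand-in for the residual plot: residual vs. x, with its sign.
for x, e in zip(xs, residuals):
    sign = "+" if e > 0 else "-" if e < 0 else "0"
    print(f"x={x:3.0f}  residual={e:+7.2f}  {sign}")
```

The residuals are positive at both ends and negative in the middle. That bend is the kind of pattern that suggests a linear model is not appropriate for these data.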


Residuals, again


When analyzing the relationship between two variables (thus far)

ALWAYS:

  • Plot the data and describe the relationship*

  • Check the three regression assumptions/conditions

  • Compute the correlation coefficient

  • Compute the Least Squares Regression Line

  • Check the residual plot (again)

  • Interpret the relationship (intercept, slope, correlation and general conclusion)

    * Calculate the mean and standard deviation for each variable, if possible

