Chapter 7

PowerPoint Slideshow about ' Chapter 7' - nora-tyler



Correlation, Bivariate Regression, and Multiple Regression


Pearson’s Product Moment Correlation

  • Correlation measures the association between two variables.

  • Correlation quantifies the extent to which the mean, variation & direction of one variable are related to another variable.

  • r ranges from +1 to -1.

  • Correlation can be used for prediction.

  • Correlation does not indicate the cause of a relationship.
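As a minimal sketch of how r can be computed, the function below uses the deviation-score form of the Pearson formula. The function name and the data are invented for illustration and are not taken from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson's product moment correlation between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of the deviations from each mean
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has no variation")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, invented for illustration only
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]     # perfect positive linear relationship
y_down = [10, 8, 6, 4, 2]   # perfect negative linear relationship

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)
print(r_up, r_down)
```

The two invented data sets sit at the extremes of the range, +1 and -1; real data fall somewhere in between.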


Scatter Plot

  • Scatter plot gives a visual description of the relationship between two variables.

  • The line of best fit is defined as the line that minimizes the sum of the squared vertical deviations (up or down) from the data points to the line.
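The least-squares idea can be sketched in a few lines. This is an illustrative implementation under the usual closed-form solution for one predictor; the function names and the data are assumptions, not part of the slides.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line Y = mX + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of squared vertical distances from each point to the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, invented for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept = fit_line(x, y)
best = sum_squared_residuals(x, y, slope, intercept)
# Any other line, for example one with a slightly steeper slope, fits worse
nudged = sum_squared_residuals(x, y, slope + 0.1, intercept)
print(slope, intercept, best, nudged)
```

For these invented points the fitted line is Y = 0.6 X + 2.2, and nudging the slope increases the sum of squared deviations.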





Will a Linear Fit Work?

y = 0.5246x - 2.2473

R² = 0.4259


Linear Fit

y = 0.0012x - 1.0767

R² = 0.0035


Evaluating the Strength of a Correlation

  • For predictions, an absolute value of r below .7 may produce unacceptably large errors, especially if the SD of X, Y, or both is large.

  • As a general rule:

    • An absolute value of r greater than or equal to .9 is good.

    • An absolute value of r from .7 to .8 is moderate.

    • An absolute value of r from .5 to .7 is low.

    • Values of r below .5 give R² below .25 (25%); they are poor and thus not useful for predicting.
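The rule of thumb can be written as a small lookup. This is only a sketch: the slide does not say how to label values between .8 and .9 or exactly at .7, so the code assumes "moderate" for everything from .7 up to .9. The function name is hypothetical.

```python
def describe_strength(r):
    """Return (label, r_squared) for a correlation, using the rule of thumb."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)  # the sign gives direction only, not strength
    if size >= 0.9:
        label = "good"
    elif size >= 0.7:
        # Assumption: the unlabelled .8 to .9 range is treated as moderate
        label = "moderate"
    elif size >= 0.5:
        label = "low"
    else:
        label = "poor"
    return label, size ** 2

for r in (0.95, -0.845, 0.5, 0.205):
    label, r_squared = describe_strength(r)
    print(f"r = {r:+.3f}: {label}, R^2 = {r_squared:.3f}")
```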


Significant Correlation?

If N is large (N=90) then a .205 correlation is significant.

ALWAYS THINK ABOUT R²

How much variance in Y is X accounting for?

r = .205

R² = .042, thus X is accounting for 4.2% of the variance in Y.

This will lead to poor predictions.

A 95% confidence interval will also show how poor the prediction is.


Venn diagram shows (R²) the amount of variance in Y that is explained by X.

R² = .64 (64%): variance in Y that is explained by X

Unexplained variance in Y: (1 - R²) = .36 (36%)
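The split into explained and unexplained variance can be checked directly from data. The sketch below fits a least-squares line and compares the residual sum of squares with the total sum of squares; the function name and data are invented for illustration.

```python
def variance_explained(xs, ys):
    """Return (explained, unexplained) shares of the variance in Y."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Total variation of Y around its own mean
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    # Variation left over after predicting Y from X
    ss_residual = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    explained = 1 - ss_residual / ss_total
    return explained, 1 - explained

# Hypothetical data, invented for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

explained, unexplained = variance_explained(x, y)
print(explained, unexplained)
```

For these invented points the explained share is .60 and the unexplained share is .40, which plays the same role as the .64 and .36 in the diagram.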


The vertical distance (up or down) from a data point to the line of best fit is a RESIDUAL.

r = .845

R² = .714 (71.4%)

Y = mX + b

Y = .72 X + 13


Standard Error of Estimate (SEE)

SD of Y

Prediction Errors

The SEE is the SD of the prediction errors (residuals) when predicting Y from X. SEE is used to make a confidence interval for the prediction equation.
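A minimal sketch of the SEE as the SD of the residuals follows. It assumes the common convention of dividing by N - 2 (two parameters, slope and intercept, are estimated); the slides do not state which divisor they use. A widely used shortcut, SD of Y times the square root of (1 - r²), gives a close value when N is large. Function name and data are hypothetical.

```python
import math

def standard_error_of_estimate(xs, ys):
    """SD of the prediction errors when predicting Y from X (N - 2 divisor)."""
    n = len(xs)
    if n < 3:
        raise ValueError("need at least 3 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual: vertical distance from each data point to the fitted line
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    # Assumption: divide by N - 2 because slope and intercept were estimated
    return math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

# Hypothetical data, invented for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

see = standard_error_of_estimate(x, y)
print(see)
```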


Bivariate Linear Regression


Linear Regression: Statistics

Enter the variables

Click Statistics Button


Linear Regression: Statistics Settings


Linear Regression: Output

71.5% of the variance in Y is explained by X.

Correlation: r = .845 between X and Y.


Regression Output

Prediction Equation

Y = .726 (X) + 12.859

95% CI

Y = .726 (X) + 12.859 ± 1.96 (6.06)
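The prediction equation and its 95% interval can be applied to a single X value as sketched below. The slope, intercept, SEE, and 1.96 multiplier come from the slide; the function name and the example value X = 50 are assumptions made for illustration.

```python
SLOPE = 0.726      # from the regression output on the slide
INTERCEPT = 12.859 # from the regression output on the slide
SEE = 6.06         # standard error of estimate from the slide
Z_95 = 1.96        # multiplier for a 95% interval, as used on the slide

def predict_with_interval(x, slope=SLOPE, intercept=INTERCEPT, see=SEE, z=Z_95):
    """Return (lower, predicted, upper) for Y = slope * X + intercept ± z * SEE."""
    point = slope * x + intercept
    margin = z * see
    return point - margin, point, point + margin

# X = 50 is a hypothetical input, not a value from the slides
low, point, high = predict_with_interval(50)
print(f"predicted Y = {point:.3f}, 95% interval {low:.2f} to {high:.2f}")
```

With X = 50 the predicted Y is 49.159 and the interval runs from about 37.28 to 61.04, a width of nearly 24 units, which shows how much the SEE matters.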


The SEE is used to compute confidence intervals for the prediction equation.


Example of a 95% confidence interval.

Both r and SDY are critical in accuracy of prediction.

If SDY is small and r is big, prediction errors will be small.

If SDY is big and r is small, prediction errors will be large.

We are 95% sure the mean falls between 45.1 and 67.3.


Multiple Regression

  • Multiple regression is used to predict one Y (dependent) variable from two or more X (independent) variables.

  • The advantages of multivariate over bivariate regression are:

    • Provides lower standard error of estimate

    • Determines which variables contribute to the prediction and which do not.


Multiple Regression

  • b1, b2, b3, … bn are coefficients that give weight to the independent variables according to their relative contribution to the prediction of Y.

  • X1, X2, X3, … Xn are the predictors (independent variables).

  • C is a constant, similar to Y intercept.

  • Body Fat = Abdominal + Tricep + Thigh
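The bullets above describe an equation that is assumed here to have the form Y = b1X1 + b2X2 + … + bnXn + C. The sketch below estimates the coefficients by ordinary least squares using the normal equations, in plain Python. All function names and the data are invented for illustration; statistical packages use more robust numerical methods.

```python
def solve_linear_system(a, b):
    """Solve a·w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations update both sides together
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (for example, collinear predictors)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - known) / m[i][i]
    return w

def fit_multiple_regression(rows, ys):
    """Return ([b1, ..., bn], C) that minimize the squared prediction errors."""
    # A trailing 1 in every row carries the constant C
    design = [list(row) + [1.0] for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    weights = solve_linear_system(xtx, xty)
    return weights[:-1], weights[-1]

def predict(row, coefficients, constant):
    """Apply Y = b1*X1 + ... + bn*Xn + C to one set of predictor values."""
    return sum(b * x for b, x in zip(coefficients, row)) + constant

# Hypothetical data generated from Y = 2*X1 + 3*X2 + 1, so the fit is exact
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
ys = [2 * x1 + 3 * x2 + 1 for x1, x2 in rows]

coefficients, constant = fit_multiple_regression(rows, ys)
print(coefficients, constant, predict((6, 2), coefficients, constant))
```

Because the invented Y values contain no noise, the fit recovers the weights 2 and 3 and the constant 1 up to rounding error.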


List the variables and order to enter into the equation

  • X2 has the biggest area (C), so it comes in first.

  • X1 comes in next because area (A) is bigger than area (E). Both A and E are unique, not common to C.

  • X3 comes in next; it uniquely adds area (E).

  • X4 is not related to Y so it is NOT in the equation.


Ideal Relationship Between Predictors and Y

Each variable accounts for unique variance in Y

Very little overlap of the predictors

Order to enter?

X1, X3, X4, X2, X5
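When the predictors barely overlap, the order of entry can be approximated by ranking each predictor on the share of variance in Y it explains by itself (its squared correlation with Y). The sketch below assumes that ideal case. The correlations are invented, chosen only so that the ranking reproduces the order listed above; the actual areas in the figure are not available. With overlapping predictors a stepwise procedure based on unique variance would be needed instead.

```python
def entry_order(correlations_with_y):
    """Rank predictors by squared correlation with Y, largest first.

    Assumes the predictors are nearly uncorrelated with one another, so each
    squared correlation approximates that predictor's unique contribution.
    """
    shares = {name: r ** 2 for name, r in correlations_with_y.items()}
    return sorted(shares, key=lambda name: shares[name], reverse=True)

# Hypothetical correlations with Y, invented for illustration only
hypothetical_r = {"X1": 0.5, "X2": 0.2, "X3": 0.4, "X4": 0.3, "X5": 0.1}

order = entry_order(hypothetical_r)
print(order)
```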

