## PowerPoint Slideshow about ' Chapter 7' - nora-tyler

### Chapter 7

Correlation, Bivariate Regression, and Multiple Regression

Pearson’s Product Moment Correlation

- Correlation measures the association between two variables.
- Correlation quantifies the extent to which the mean, variation & direction of one variable are related to another variable.
- r ranges from +1 to -1.
- Correlation can be used for prediction.
- Correlation does not indicate the cause of a relationship.
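As a minimal sketch of how r is computed, the function below applies the standard product-moment formula (cross-products of deviations divided by the square root of the product of the squared deviations). The function name `pearson_r` and the five data points are made up for illustration and do not come from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the two means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 3))  # 0.775
```

A perfectly increasing linear relationship gives r = +1 and a perfectly decreasing one gives r = -1, which matches the range stated above.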

Scatter Plot

- Scatter plot gives a visual description of the relationship between two variables.
- The line of best fit is defined as the line that minimizes the sum of the squared vertical deviations (up or down) from the data points to the line.
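The least-squares idea can be sketched in a few lines. The helper names `fit_line` and `sum_squared_deviations` and the data are hypothetical; the slope and intercept use the usual closed-form least-squares solution for one predictor.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line for Y on X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sum_squared_deviations(xs, ys, slope, intercept):
    """Sum of squared vertical distances from each point to the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept = fit_line(x, y)
best = sum_squared_deviations(x, y, slope, intercept)
# Any other line, such as one with a slightly steeper slope, does worse
worse = sum_squared_deviations(x, y, slope + 0.1, intercept)
print(best < worse)  # True
```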

Line of Best Fit Minimizes Squared Deviations from a Data Point to the Line

Always do a Scatter Plot to Check the Shape of the Relationship

Will a Linear Fit Work?

Evaluating the Strength of a Correlation Relationship

- For predictions, an absolute value of r below .7 may produce unacceptably large errors, especially if the SDs of either or both X & Y are large.
- As a general rule:
  - Absolute value of r greater than or equal to .9 is good.
  - Absolute value of r from .7 to .8 is moderate.
  - Absolute value of r from .5 to .7 is low.
  - Values of r below .5 give R2 below .25 (25%); these are poor and thus not useful for predicting.

Significant Correlation?

If N is large (N=90) then a .205 correlation is significant.

ALWAYS THINK ABOUT R2

How much variance in Y is X accounting for?

r = .205

R2 = .042, thus X is accounting for 4.2% of the variance in Y.

This will lead to poor predictions.
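The arithmetic behind this is simply squaring r. A small sketch follows; the function name `variance_explained` is made up for illustration.

```python
def variance_explained(r):
    """Proportion of variance in Y accounted for by X (R2 = r squared)."""
    return r ** 2

r = 0.205
r_squared = variance_explained(r)
unexplained = 1 - r_squared
print(f"R2 = {r_squared:.3f}")            # R2 = 0.042
print(f"Unexplained = {unexplained:.3f}")  # Unexplained = 0.958
```

Even though the correlation is statistically significant, almost 96% of the variance in Y is left unexplained.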

A 95% confidence interval will also show how poor the prediction is.

Venn diagram shows (R2) the amount of variance in Y that is explained by X.

R2=.64 (64%) Variance in Y that is explained by X

Unexplained Variance in Y. (1-R2) = .36, 36%

The vertical distance (up or down) from a data point to the line of best fit is a RESIDUAL.

r = .845

R2 = .714 (71.4%)

Y = mX + b

Y = .72 X + 13

Standard Error of Estimate (SEE)

SD of Y

Prediction Errors

The SEE is the SD of the prediction errors (residuals) when predicting Y from X. SEE is used to make a confidence interval for the prediction equation.
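A minimal sketch of computing the SEE directly from the residuals is shown below. It assumes the common n - 2 denominator for a one-predictor regression; the slides do not state which denominator is used. The function name, the data, and the line Y = 0.6 X + 2.2 (the least-squares line for this data) are hypothetical.

```python
import math

def standard_error_of_estimate(xs, ys, slope, intercept):
    """SD of the residuals when predicting Y from X with the given line."""
    # Residual = observed Y minus predicted Y
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    n = len(residuals)
    # Assumption: n - 2 degrees of freedom (slope and intercept are estimated)
    return math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

# Hypothetical data and its least-squares line, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
see = standard_error_of_estimate(x, y, slope=0.6, intercept=2.2)
print(round(see, 3))  # 0.894
```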

Bivariate Linear Regression

Linear Regression: Statistics

Enter the variables

Click Statistics Button

Linear Regression: Statistics Settings

Linear Regression: Output

71.5% of the variance in Y is explained by X.

Correlation: r = .845 between X and Y.

Regression Output

Prediction Equation

Y = .726 (X) + 12.859

95% CI

Y = .726 (X) + 12.859 ± 1.96 (6.06)
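The interval above can be sketched as a small calculation. The slope, intercept, and SEE are the values from the regression output; the input X = 60 and the function name `predict_with_interval` are hypothetical, chosen only to show the arithmetic.

```python
SLOPE = 0.726       # from the prediction equation
INTERCEPT = 12.859  # from the prediction equation
SEE = 6.06          # standard error of estimate
Z_95 = 1.96         # multiplier for a 95% interval

def predict_with_interval(x):
    """Return (lower, predicted, upper) for a given X."""
    predicted = SLOPE * x + INTERCEPT
    margin = Z_95 * SEE
    return predicted - margin, predicted, predicted + margin

# X = 60 is a made-up input value, not taken from the slides
low, predicted, high = predict_with_interval(60)
print(f"{predicted:.1f} (95% CI {low:.1f} to {high:.1f})")
# 56.4 (95% CI 44.5 to 68.3)
```

The width of the interval depends only on the SEE here, so a large SEE gives a wide interval no matter what value of X is entered.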

The SEE is used to compute confidence intervals for the prediction equation.

Example of a 95% confidence interval.

Both r and SDY are critical in accuracy of prediction.

If SDY is small and r is big, prediction errors will be small.

If SDY is big and r is small, prediction errors will be large.

We are 95% sure the mean falls between 45.1 and 67.3
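The way r and SDY jointly drive prediction error can be sketched with the large-sample approximation SEE ≈ SDY × √(1 − r²). This ignores the small-sample correction factor, and the SD and r values below are made up for illustration.

```python
import math

def approx_see(sd_y, r):
    """Approximate SEE from the SD of Y and the correlation r.

    Assumption: large-sample form, SEE ~ SD_Y * sqrt(1 - r^2).
    """
    return sd_y * math.sqrt(1 - r ** 2)

# Hypothetical scenarios, for illustration only
small_error = approx_see(sd_y=5.0, r=0.9)   # small SD, big r
large_error = approx_see(sd_y=15.0, r=0.3)  # big SD, small r
print(round(small_error, 2), round(large_error, 2))  # 2.18 14.31
```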

Multiple Regression

- Multiple regression is used to predict one Y (dependent) variable from two or more X (independent) variables.
- The advantages of multiple regression over bivariate regression are:
  - It provides a lower standard error of estimate.
  - It determines which variables contribute to the prediction and which do not.

Multiple Regression

Y = b1X1 + b2X2 + b3X3 + … + bnXn + C

- b1, b2, b3, … bn are coefficients that give weight to the independent variables according to their relative contribution to the prediction of Y.
- X1, X2, X3, … Xn are the predictors (independent variables).
- C is a constant, similar to Y intercept.
- Body Fat = Abdominal + Tricep + Thigh
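A multiple regression prediction is a weighted sum of the predictors plus the constant. The sketch below shows that form for the body fat example; the coefficient values, the constant, and the skinfold measurements are entirely hypothetical and are not estimates from any data set.

```python
def predict(coefficients, predictors, constant):
    """Weighted sum of predictors plus a constant: b1*X1 + ... + bn*Xn + C."""
    if len(coefficients) != len(predictors):
        raise ValueError("need one coefficient per predictor")
    return sum(b * x for b, x in zip(coefficients, predictors)) + constant

# Hypothetical weights for abdominal, tricep, and thigh skinfolds
b = [0.25, 0.30, 0.20]
# Hypothetical skinfold measurements in millimetres
skinfolds = [20.0, 12.0, 18.0]
c = 2.0  # hypothetical constant

body_fat = predict(b, skinfolds, c)
print(round(body_fat, 1))  # 14.2
```

In practice the coefficients and constant come from the regression output rather than being chosen by hand.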

List the variables and order to enter into the equation

- X2 has biggest area (C), it comes in first.
- X1 comes in next because area (A) is bigger than area (E). Both A and E are unique, not common to C.
- X3 comes in next, it uniquely adds area (E).
- X4 is not related to Y so it is NOT in the equation.

Ideal Relationship Between Predictors and Y

Each variable accounts for unique variance in Y

Very little overlap of the predictors

Order to enter?

X1, X3, X4, X2, X5
