Psychology 340 Spring 2010

Statistics for the Social Sciences

Prediction with multiple variables

- Multiple regression
- Comparing models, Delta r2
- Using SPSS

- Typically researchers are interested in predicting with more than one explanatory variable
- In multiple regression, an additional predictor variable (or set of variables) is used to predict the residuals left over from the first predictor.

- Bi-variate regression prediction models

Y = intercept + slope (X) + error

Here intercept + slope (X) is the “fit” and the error term is the “residual”.
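The split of each score into "fit" plus "residual" can be sketched in a few lines of Python. This is only an illustration: the study hours and test scores are made-up numbers, and `fit_line` is a hypothetical helper name, not something from the course materials.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for predicting y from a single x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # slope = co-variation of x and y divided by the variation in x
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: hours studied and test scores for five students
study_hours = [1, 2, 3, 4, 5]
test_scores = [52, 60, 61, 70, 72]

intercept, slope = fit_line(study_hours, test_scores)

# "fit" is the predicted score; "residual" is whatever is left over
fit = [intercept + slope * h for h in study_hours]
residual = [score - f for score, f in zip(test_scores, fit)]

for score, f, e in zip(test_scores, fit, residual):
    print(f"Y = {score}  fit = {f:.1f}  residual = {e:+.1f}")
```

Each observed Y equals its fit plus its residual, and the residuals sum to zero (up to rounding) for a least-squares line.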

- Multiple regression prediction models

Y = intercept + slope1 (X1) + slope2 (X2) + slope3 (X3) + slope4 (X4) + error

Here X1 through X4 are the first, second, third, and fourth explanatory variables, and the error term is whatever variability is left over.

- Predict test performance based on:

- Study time

- Test time

- What you eat for breakfast

- Hours of sleep

- Typically your analysis consists of testing multiple regression models to see which “fits” best (comparing r2s of the models)

- For example:

Response variable: test performance (total variability in test performance)

Model #1: Some co-variance between the two variables (total study time and test performance, r = .6)

- If we know the total study time, we can predict 36% of the variance in test performance

R2 for Model = .36 (64% of the variance unexplained)

Model #2: Add test time to the model (test time, r = .1)

- Little co-variance between test performance and test time
- We can explain more of the variance in test performance

R2 for Model = .49 (51% of the variance unexplained)

Model #3: Add breakfast food to the model (breakfast, r = .0)

- No co-variance between test performance and breakfast food
- Not related, so we can NOT explain more of the variance in test performance

R2 for Model = .49 (51% of the variance still unexplained)

Model #4: Add hours of sleep to the model (hours of sleep, r = .45)

- Some co-variance between test performance and hours of sleep
- We can explain more of the variance
- But notice what happens with the overlap (covariation between explanatory variables): we can’t just add r’s or r2’s

R2 for Model = .60 (40% of the variance unexplained)
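To see why the r2's cannot simply be added, here is a small Python sketch using the standard formula for R2 with exactly two predictors, written in terms of the three pairwise correlations. It uses the r values for study time (.6) and hours of sleep (.45) from the example, but the correlation between the two predictors is not given on the slides, so the values 0 and .5 below are assumptions chosen only for illustration. The function name is hypothetical, and this two-predictor model is not the same as the four-variable Model #4.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R^2 for a regression with two predictors.

    r_y1, r_y2: correlation of each predictor with the response variable
    r_12:       correlation between the two predictors (their overlap)
    """
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

r_study = 0.60  # study time with test performance (from the example)
r_sleep = 0.45  # hours of sleep with test performance (from the example)

# Assumed: the two predictors are unrelated, so the r2's simply add
no_overlap = r_squared_two_predictors(r_study, r_sleep, 0.0)

# Assumed: the two predictors correlate .5 with each other
with_overlap = r_squared_two_predictors(r_study, r_sleep, 0.5)

print(f"Sum of the separate r2's:   {r_study ** 2 + r_sleep ** 2:.4f}")
print(f"R2 with no overlap:         {no_overlap:.4f}")
print(f"R2 with overlap (r12 = .5): {with_overlap:.4f}")
```

Under these assumed values, the model with overlapping predictors explains about 39% of the variance rather than the roughly 56% that adding the two r2's would suggest, because part of what sleep explains was already explained by study time.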

Setup as before: Variables (explanatory and response) are entered into columns

- A couple of different ways to use SPSS to compare different models

- Method 1: enter all the explanatory variables together
- Analyze: Regression, Linear
- Enter:
- Predicted (criterion) variable into Dependent Variable field
- All of the predictor variables into the Independent Variable field

The output shows:

- The variables in the model
- r for the entire model
- r2 for the entire model
- Unstandardized coefficients: coefficient for var1 (var name), coefficient for var2 (var name)
- Standardized coefficients: coefficient for var1 (var name), coefficient for var2 (var name)

- Which β to use, standardized or unstandardized?

- Unstandardized β’s are easier to use if you want to predict a raw score based on raw scores (no z-scores needed).
- Standardized β’s are nice to directly compare which variable is most “important” in the equation
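The difference between the two kinds of coefficients can be sketched in plain Python for the two-predictor case. Everything here is an assumption made for illustration: the SAT and GPA numbers are invented, the function names are hypothetical, and the code labels unstandardized coefficients `b` and standardized ones `beta` only to keep them apart. It is not a reproduction of the SPSS output.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def sd(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def pearson_r(a, b):
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))

def fit_two_predictors(x1, x2, y):
    """Least-squares regression of y on x1 and x2 (two predictors only)."""
    r_y1, r_y2, r_12 = pearson_r(y, x1), pearson_r(y, x2), pearson_r(x1, x2)
    # Standardized coefficients: slopes when every variable is a z-score
    beta1 = (r_y1 - r_y2 * r_12) / (1 - r_12 ** 2)
    beta2 = (r_y2 - r_y1 * r_12) / (1 - r_12 ** 2)
    # Unstandardized coefficients: the same slopes rescaled to raw units
    b1 = beta1 * sd(y) / sd(x1)
    b2 = beta2 * sd(y) / sd(x2)
    intercept = mean(y) - b1 * mean(x1) - b2 * mean(x2)
    return {"intercept": intercept, "b": (b1, b2), "beta": (beta1, beta2)}

# Hypothetical data for six students
math_sat = [520, 600, 480, 700, 650, 560]
verbal_sat = [500, 580, 530, 640, 600, 610]
gpa = [2.6, 3.1, 2.5, 3.8, 3.4, 3.0]

model = fit_two_predictors(math_sat, verbal_sat, gpa)

# Unstandardized coefficients predict a raw GPA directly from raw SAT scores
b1, b2 = model["b"]
predicted_gpa = model["intercept"] + b1 * 620 + b2 * 590
print(f"Predicted GPA for math 620, verbal 590: {predicted_gpa:.2f}")

# Standardized coefficients are on a common scale, so they can be compared
print("Standardized coefficients (math, verbal):",
      tuple(round(v, 3) for v in model["beta"]))
```

The unstandardized values are tiny because they are in GPA points per SAT point, which is why they are handy for prediction but awkward for judging relative importance.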

- Method 2: enter first model, then add another variable for second model, etc.
- Enter:
- Predicted (criterion) variable into Dependent Variable field
- First Predictor variable into the Independent Variable field
- Click the Next button
- Second Predictor variable into the Independent Variable field

- Method 2 cont:
- Click Statistics
- Click the ‘R squared change’ box

- Shows the results of two models
- The variables in the first model (math SAT)
- The variables in the second model (math and verbal SAT)
- r2 for the first model
- r2 for the second model
- Model 1: Coefficients for var1 (var name)
- Model 2: Coefficients for var1 (var name) and var2 (var name)
- Change statistics: is the change in r2 from Model 1 to Model 2 statistically significant?

- Multiple Regression

- We can test hypotheses about the overall model

- Null Hypotheses

- H0: University GPA is not predicted by SAT verbal or SAT Math scores

- p < 0.05, so reject H0: SAT math and verbal scores together predict University GPA

- Multiple Regression

- We can test hypotheses about each of the explanatory variables within a regression model
- So it’ll tell us whether that variable is explaining a “significant” amount of the variance in the response variable

- Null Hypotheses

- H0: Coefficient for var1 = 0
- p < 0.05, so reject H0, var1 is a significant predictor
- H0: Coefficient for var2 = 0
- p > 0.05, so fail to reject H0, var2 is not a significant predictor

- We can also use hypothesis testing to examine if the change in r2 is statistically significant

- Shows the results of two models
- The variables in the first model (math SAT)
- The variables in the second model (math and verbal SAT)
- Change statistics: is the change in r2 from Model 1 to Model 2 statistically significant?
- The 0.002 change in r2 is not statistically significant (p = 0.46)
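The change statistics rest on an F test for the increase in r2 between nested models. The standard formula can be sketched in Python as below. The r2 values of .49 and .60 come from the test-performance example (Models #3 and #4), but the sample size of 50 is an assumption, since the slides do not report n, and the function name is hypothetical. The sketch computes only the F statistic and its degrees of freedom; SPSS reports the matching p-value.

```python
def r2_change_f(r2_reduced, r2_full, n, k_reduced, k_full):
    """F statistic for the change in r2 between two nested regression models.

    r2_reduced, r2_full: r2 of the smaller and the larger model
    n:                   number of participants
    k_reduced, k_full:   number of predictors in each model
    """
    df_change = k_full - k_reduced   # number of predictors added
    df_error = n - k_full - 1        # error df of the larger model
    f = ((r2_full - r2_reduced) / df_change) / ((1 - r2_full) / df_error)
    return f, df_change, df_error

# r2 values from the test-performance example; n = 50 is assumed
f_change, df1, df2 = r2_change_f(0.49, 0.60, n=50, k_reduced=3, k_full=4)
print(f"F change({df1}, {df2}) = {f_change:.3f}")
```

A larger F means the added predictor explains more variance than would be expected by chance. With the same change in r2, a smaller sample gives a smaller F, which is one reason a tiny change such as .002 often fails to reach significance.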

- Bivariate prediction models rarely reported
- Multiple regression results commonly reported

- We can use as many predictors as we wish but we should be careful not to use more predictors than is warranted.
- Simpler models are more likely to generalize to other samples.
- If you use as many predictors as you have participants in your study, you can predict 100% of the variance. Although this may seem like a good thing, it is unlikely that your results would generalize to any other sample and thus they are not valid.
- You probably should have at least 10 participants per predictor variable (and probably should aim for about 30).
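The warning about using too many predictors can be demonstrated with pure noise. In the sketch below, five hypothetical participants get a random response score and four random predictors that have nothing to do with it. Counting the intercept, the model has as many parameters as participants, so it reproduces every score exactly. The data, seed, and function names are all assumptions made for illustration.

```python
import random

def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations apply to both sides at once
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution, from the last unknown to the first
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(m[row][c] * x[c] for c in range(row + 1, n))
        x[row] = (m[row][n] - known) / m[row][row]
    return x

def r_squared(y, predicted):
    mean_y = sum(y) / len(y)
    ss_total = sum((v - mean_y) ** 2 for v in y)
    ss_resid = sum((v - p) ** 2 for v, p in zip(y, predicted))
    return 1 - ss_resid / ss_total

random.seed(0)
n_participants = 5

# Pure noise: the predictors have no real relationship to the response
y = [random.gauss(0, 1) for _ in range(n_participants)]
design = [[1.0] + [random.gauss(0, 1) for _ in range(n_participants - 1)]
          for _ in range(n_participants)]  # leading 1.0 is the intercept column

coefs = solve_linear_system(design, y)
predicted = [sum(c * v for c, v in zip(coefs, row)) for row in design]
overfit_r2 = r_squared(y, predicted)
print(f"r2 with 4 noise predictors and 5 participants: {overfit_r2:.6f}")
```

The r2 comes out at essentially 1 even though nothing real is being predicted, so the same coefficients would be useless in a new sample.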