
Multiple Regression



  1. Multiple Regression

  2. Multiple Regression • Regression • Attempts to predict one criterion variable using one predictor variable • Addresses the question: Does the predictor significantly predict the criterion?

  3. Multiple Regression • Multiple Regression • Attempts to predict one criterion variable using 2+ predictor variables • Addresses the questions: Do the predictors significantly predict the criterion? If so, which predictor is best? • Allows for variance to be removed from one predictor prior to evaluating the rest (like ANCOVA)

  4. Multiple Regression • How to compare the predictive value of 2+ predictors • When comparing multiple predictors within an experiment • Use standardized b (β) • β = b(sx/sy) • β works like a z-score: it lets you compare 2 variables with different metrics by expressing performance relative to a sample mean & SD
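A minimal sketch of this computation (NumPy, with made-up data): the standardized slope β equals b scaled by sx/sy, which matches the slope obtained after z-scoring both variables.

```python
# Hedged sketch (NumPy, fabricated data): beta = b * (s_x / s_y),
# which equals the slope obtained after z-scoring both variables.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 * x + rng.normal(size=100)

b = np.polyfit(x, y, 1)[0]                    # unstandardized slope
beta = b * x.std(ddof=1) / y.std(ddof=1)      # standardized slope

# Same value from regressing z-scores on z-scores
zx = (x - x.mean()) / x.std(ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)
print(beta, np.polyfit(zx, zy, 1)[0])         # the two agree
```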

  5. Multiple Regression • How to compare the predictive value of 2+ predictors • When comparing multiple predictors between experiments • Use b • SE highly variable between experiments → the SE from Exp. 1 ≠ the SE from Exp. 2 → β’s from both experiments not comparable • Can’t compare the z-score of your Stats grade from this semester with your Stats grade if you take the class again next semester • If next semester’s class is especially dumb, you appear to have gotten much smarter

  6. Multiple Regression • Magnitude of the relationship between one predictor and a criterion (b/β) in a model depends upon the other predictors in that model • The relationship between IQ and SES (with College GPA and Parents’ SES also in the model) will be different if more, fewer, or different predictors are included in the model

  7. Multiple Regression • When comparing the results of 2 experiments using regression, coefficients (b/β) will not be the same • Will be similar to the extent that the regression models are similar • Why not?

  8. Multiple Regression • Coefficients (b/β) represent partial and semipartial (part) correlations, not traditional Pearson’s r • Partial Correlation – the correlation between 2 variables with the variance from one or more variables removed • I.e. correlation between the residuals of both variables, once variance from one or more covariates has been removed

  9. Multiple Regression • Partial Correlation = the proportion of the criterion variance left unexplained by the covariate(s) that is associated with the predictor

  10. Multiple Regression • Semipartial/Part Correlation – the correlation between 2 variables with the variance from one or more variables removed from the predictor only (i.e. not the criterion) • I.e. the correlation between the criterion and the residuals of the predictor, once variance from one or more covariates has been removed

  11. Multiple Regression • Part Correlation = the amount of variance that a predictor explains in a criterion once variance from the covariates has been removed • I.e. the percentage of the total variance left unexplained by the covariate that the predictor accounts for • Since the variance that is removed from the criterion depends on the other predictors in the model, different models yield different regression coefficients

  12. [Venn diagram slide, figure not reproduced: A and B are regions of variance shared with the criterion] • Partial Correlation = B • Part Correlation = B/(A + B)
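As a concrete illustration of the residual definitions above, a hedged sketch in Python (NumPy, fabricated data; variable names are invented): the covariate z is partialled out of both variables for the partial correlation, but only out of the predictor for the part correlation.

```python
# Partial vs. semipartial (part) correlation computed via residuals.
import numpy as np

def residuals(a, b):
    """Residuals of a after regressing a on b (with intercept)."""
    X = np.column_stack([np.ones_like(b), b])
    coef, *_ = np.linalg.lstsq(X, a, rcond=None)
    return a - X @ coef

rng = np.random.default_rng(1)
z = rng.normal(size=200)                  # covariate
x = z + rng.normal(size=200)              # predictor
y = x + z + rng.normal(size=200)          # criterion

rx, ry = residuals(x, z), residuals(y, z)
partial = np.corrcoef(rx, ry)[0, 1]       # z removed from both x and y
part = np.corrcoef(rx, y)[0, 1]           # z removed from x only
print(partial, part)
```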

  13. Multiple Regression • How to compare the predictive value of 2+ predictors • Remember: Regression coefficients are very unstable from sample to sample, so interpret large differences in coefficients only (> ~.2)

  14. Multiple Regression • Like regression, tests: • Ability of each predictor to predict the criterion variable (tests b’s/β’s) • Overall ability of the model (all predictors combined) to predict the criterion variable (Model R2) • Model R2 = total % variance in criterion accounted for by predictors • Model R = the multiple correlation, i.e. the correlation between the criterion and the values predicted from all predictors • Also can test: • If one or more predictors can predict the criterion if variance from one or more other predictors is removed • If each predictor significantly increases the Model R2
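A minimal sketch of these tests (Python/statsmodels, simulated data): per-predictor t-tests on the b’s, the Model R2, and the overall model F-test.

```python
# Fit a two-predictor regression and inspect the standard tests.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 150
X = rng.normal(size=(n, 2))               # two predictors
y = 1.0 + 0.8 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(size=n)

model = sm.OLS(y, sm.add_constant(X)).fit()
print(model.rsquared)                      # Model R^2
print(model.params, model.pvalues)         # b's and their significance
print(model.fvalue, model.f_pvalue)        # overall model test
```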

  15. Multiple Regression • Predictors are evaluated with variance from other predictors removed • More than one way to remove this variance • Examine all predictors en masse with variance from all other predictors removed • Remove variance from one or more predictors first, then look at second set • Like in factorial ANCOVA

  16. Multiple Regression • This is done by specifying different selection methods • Selection method = method of inputting predictors into a regression equation • Four most commonly used methods • (“Commonly used” = the only 4 methods offered by SPSS)

  17. Multiple Regression • Selection Methods • Simultaneous – Adds all predictors at once & is therefore the lack of a selection method • Good if there is no theory to guide which predictors should be entered first • But when does this ever happen?

  18. Multiple Regression • Selection Methods • All Subsets – The computer finds the combination of predictors that maximizes the overall Model R2 • But SPSS doesn’t offer it, and it finds the best subset in your particular dataset – since data, not theory, guide the selection, there is no guarantee that the model will generalize to other datasets, particularly in smaller samples

  19. Multiple Regression • Selection Methods • Backward Elimination – Starts with all predictors in the model and iteratively eliminates the predictor with the least unique variance related to the criterion until all remaining predictors are significant • Iterative = process involving several steps • It begins with all predictors, so the predictors whose variance overlaps most with the other predictors (i.e. whose contribution would be partialled out) are removed first • But, also atheoretical/based on data only

  20. Multiple Regression • Selection Methods • Forward Selection – the opposite of backward elimination – starts with the predictor most strongly related to the criterion and iteratively adds the predictor next most strongly related to the criterion until a nonsignificant predictor is found • Step 1: predictor most correlated with the criterion (P1) → Step 2: add the strongest predictor once P1 is partialled out • But also atheoretical (see the sketch below)
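A hedged sketch of forward selection (Python/statsmodels, invented data): this version adds the predictor that most improves R2 and stops when the newest predictor is nonsignificant, which approximates but does not exactly reproduce SPSS’s entry criteria.

```python
# Forward selection on R^2 with a significance stopping rule (approximate).
import numpy as np
import statsmodels.api as sm

def forward_select(X, y, alpha=0.05):
    remaining, chosen = list(range(X.shape[1])), []
    while remaining:
        # Try each remaining predictor alongside those already chosen
        fits = {j: sm.OLS(y, sm.add_constant(X[:, chosen + [j]])).fit()
                for j in remaining}
        best = max(fits, key=lambda j: fits[j].rsquared)
        if fits[best].pvalues[-1] >= alpha:   # newest predictor n.s. -> stop
            break
        chosen.append(best)
        remaining.remove(best)
    return chosen

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 4))
y = 2 * X[:, 0] + 0.5 * X[:, 2] + rng.normal(size=200)
print(forward_select(X, y))                   # e.g. [0, 2]
```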

  21. Multiple Regression • Selection Methods • Stepwise • Technically, any selection method that proceeds iteratively (in steps) is stepwise (i.e. both backward elimination and forward selection) • However, usually refers to a method where the order of predictors is determined in advance by the researcher based upon theory

  22. Multiple Regression • Selection Method • Stepwise • Why would you use it? • Same reason as covariates in ANCOVA • Want to know if Measure A of treatment adherence is better than Measure B? Run stepwise regression and enter Measure B first, then Measure A with treatment outcome as the criterion. • Addresses the question: Does Measure A predict treatment outcome even when variance from Measure B has already been removed (i.e. above and beyond Measure B)?
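A minimal sketch of this Measure A vs. Measure B example (Python/statsmodels, fabricated data; measure_a, measure_b, and outcome are hypothetical variables): enter Measure B in step 1, add Measure A in step 2, and test the R2 increase with a nested-model F-test.

```python
# Does measure_a predict outcome above and beyond measure_b?
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 120
measure_b = rng.normal(size=n)
measure_a = 0.6 * measure_b + rng.normal(size=n)
outcome = 0.5 * measure_a + 0.3 * measure_b + rng.normal(size=n)

step1 = sm.OLS(outcome, sm.add_constant(measure_b)).fit()
step2 = sm.OLS(outcome, sm.add_constant(
    np.column_stack([measure_b, measure_a]))).fit()

f, p, df = step2.compare_f_test(step1)     # tests the R^2 increase
print(step2.rsquared - step1.rsquared, f, p)
```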

  23. Multiple Regression • Selection Method • Stepwise • Why would you use it? • Running a repeated-measures design and want to make sure your groups are equal on pre-test scores? Enter the pre-test into the first step of your regression.

  24. Multiple Regression • Assumptions • Linearity of Regression • Variables linearly related to one another • Normality in Arrays • Actual values of DV normally distributed around predicted values (i.e. regression line) – AKA regression line is good approximation of population parameter • Homogeneity of Variance in Arrays • Assumes that variance of criterion is equal for all levels of predictor(s)

  25. Multiple Regression • Issues to be aware of: • Range Restriction • Heterogeneous Subsamples • Outliers • With multiple predictors, must be aware of both univariate outliers (unusual values on one variable) as well as multivariate outliers (unusual values on two or more variables)

  26. Multiple Regression • Outliers • Univariate outlier – a man weighing 500 lbs. • Multivariate outlier – a man who is 6’ tall and weighs 120 lbs. – Note neither value is a univariate outlier, but both together are quite odd • Three quantities define the presence of an outlier in multiple regression: • Distance – distance from the regression line • Leverage – distance from the predictor mean • Influence – a combination of distance and leverage

  27. [Scatterplot slide illustrating outlier diagnostics] • Distance – distance from the regression line (point A in the figure) • Leverage – distance from the predictor mean (point B in the figure) • Influence – a combination of distance and leverage
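These three diagnostics can be computed directly; a sketch using statsmodels’ influence measures on simulated data (here Cook’s distance stands in for “influence,” a common operationalization that combines distance and leverage).

```python
# Studentized residuals (distance), hat values (leverage), Cook's D (influence).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
X = rng.normal(size=(50, 2))
y = X @ np.array([1.0, -0.5]) + rng.normal(size=50)

infl = sm.OLS(y, sm.add_constant(X)).fit().get_influence()
distance = infl.resid_studentized_internal   # distance from regression line
leverage = infl.hat_matrix_diag              # distance from predictor means
cooks_d, _ = infl.cooks_distance             # influence (combines both)
print(np.argmax(cooks_d), cooks_d.max())     # most influential case
```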

  28. Multiple Regression • Degree of Overlap in Predictors • Adding predictors is like adding covariates in ANCOVA: In adding one that correlates too highly with others, Model R2 remains unchanged but df decreases, making the regression less powerful • Tolerance = 1 – the multiple R2 from regressing a predictor on all other predictors – want this to be high (low tolerance indicates multicollinearity) • Examine bivariate correlations between predictors; if a correlation exceeds the measures’ internal consistency (α), get rid of one of them
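A sketch of the tolerance computation (Python/statsmodels, simulated collinear data): tolerance for each predictor is 1 − R2 from regressing it on the others, and VIF = 1 / tolerance.

```python
# Tolerance and VIF for each predictor (constant column excluded).
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(6)
x1 = rng.normal(size=100)
x2 = x1 + 0.3 * rng.normal(size=100)        # highly collinear with x1
x3 = rng.normal(size=100)
X = sm.add_constant(np.column_stack([x1, x2, x3]))

for i in range(1, X.shape[1]):              # skip the constant
    vif = variance_inflation_factor(X, i)
    print(f"x{i}: tolerance = {1 / vif:.3f}, VIF = {vif:.2f}")
```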

  29. Multiple Regression • Multiple regression can also test for more complex relationships, such as mediation and moderation • Mediation – when one variable (predictor) operates on another variable (criterion) via a third variable (mediator)

  30. Math self-efficacy mediates the relationship between math ability and interest in a math major • Must establish paths A & B, and that path C is smaller when paths A & B are included in the model (i.e. math self-efficacy accounts for variance in interest in a math major above and beyond math ability)

  31. Find significant correlations between the predictor and mediator (path A) and mediator and criterion (path B) • Run a stepwise regression with the predictor entered first, then the predictor and mediator entered together in step 2

  32. Multiple Regression • The mediator should be a significant predictor of the criterion in step 2 • The predictor-criterion relationship (b/β) should decrease from step 1 to step 2 • Full mediation: If this relationship is significant in step 1, but nonsignificant in step 2 • Partial mediation: This relationship is significant in step 1, and smaller, but still significant, in step 2

  33. Multiple Regression • Partial mediation • Sobel’s test (1982): tests the statistical significance of this mediation relationship • Regress the mediator on the predictor (path A) and the criterion on the mediator (path B) in 2 separate regressions • Calculate sβ for paths A & B, where sβ = β/t • Calculate a t-statistic, where df = n – 3 and t = (βA·βB) / √(βB²·sβA² + βA²·sβB²)
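A hedged sketch of Sobel’s test (Python/statsmodels, fabricated mediation data): the large-sample z form of the statistic above, with path B estimated while the predictor is also in the model, the usual practice.

```python
# Sobel test from the two path coefficients and their standard errors.
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(7)
n = 200
ability = rng.normal(size=n)                        # predictor
efficacy = 0.5 * ability + rng.normal(size=n)       # mediator (path A)
interest = 0.6 * efficacy + rng.normal(size=n)      # criterion (path B)

path_a = sm.OLS(efficacy, sm.add_constant(ability)).fit()
path_b = sm.OLS(interest, sm.add_constant(
    np.column_stack([ability, efficacy]))).fit()

a, sa = path_a.params[1], path_a.bse[1]
b, sb = path_b.params[2], path_b.bse[2]             # mediator's coefficient
z = (a * b) / np.sqrt(b**2 * sa**2 + a**2 * sb**2)  # Sobel statistic
print(z, 2 * (1 - stats.norm.cdf(abs(z))))          # two-tailed p
```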

  34. Multiple Regression • Multiple regression can also test for more complex relationships, such as mediation and moderation • Moderation (in regression) – when the strength of a predictor-criterion relationship changes as a result of a third variable (moderator) • Interaction (in ANOVA) – when the strength of the relationship between an IV and DV changes as a function of the levels of another IV

  35. Multiple Regression • Moderation • Unlike in ANOVA, you have to create a moderator term yourself by multiplying the predictor and moderator • In SPSS, go to Transform → Compute • Typical to enter the predictor and moderator in the first step of a regression and the interaction term in the second step to determine the contribution of the moderator above and beyond the main effect terms • Just like how variance is partitioned in a factorial ANOVA
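A minimal sketch of this two-step moderation test (Python/statsmodels, invented data; centering the predictor and moderator before forming the product is a common, though optional, step).

```python
# Step 1: main effects only. Step 2: add the interaction (product) term.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
n = 250
pred = rng.normal(size=n)
mod = rng.normal(size=n)
y = 0.4 * pred + 0.2 * mod + 0.5 * pred * mod + rng.normal(size=n)

pred_c, mod_c = pred - pred.mean(), mod - mod.mean()   # centering
step1 = sm.OLS(y, sm.add_constant(np.column_stack([pred_c, mod_c]))).fit()
step2 = sm.OLS(y, sm.add_constant(
    np.column_stack([pred_c, mod_c, pred_c * mod_c]))).fit()

print(step2.pvalues[-1])                   # significance of the interaction
print(step2.rsquared - step1.rsquared)     # R^2 change due to moderation
```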

  36. Logistic Regression • Logistic Regression = used to predict a dichotomous criterion variable (only 2 levels) with 1+ continuous or discrete predictors • Can’t use linear regression with a dichotomous criterion because: • A dichotomous criterion can’t be normally distributed around the predicted values (i.e. the assumption of normality in arrays is violated)

  37. Can’t use linear regression with a dichotomous criterion because: • Regression line fits data more poorly when predictor = 0 (i.e. assumption of homogeneity of variance arrays is violated)

  38. Logistic Regression • Logistic Regression • Interpreting coefficients • In logistic regression, b represents the change in the log odds of the criterion with a one-point increase in the predictor • Exponentiate b (compute e^b) to find the odds ratio – b = –.0812 → e^–.0812 = .9220

  39. Logistic Regression • Logistic Regression • Interpreting coefficients • Continuous predictor: a one-pt. increase in the predictor corresponds to decreasing (because b is negative) the odds of the criterion by a factor of .922 (i.e. roughly an 8% decrease in the odds) • Dichotomous predictor: the odds in one group vs. the other group (sign indicates increase or decrease)
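A sketch of this interpretation (Python/statsmodels, simulated data): exponentiating b gives the multiplicative change in the odds per one-point increase in the predictor.

```python
# Fit a logistic regression and convert the slope to an odds ratio.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
n = 500
x = rng.normal(size=n)
p = 1 / (1 + np.exp(-(-0.2 + 0.8 * x)))    # true log-odds model
y = rng.binomial(1, p)

logit = sm.Logit(y, sm.add_constant(x)).fit(disp=0)
b = logit.params[1]
print(b, np.exp(b))                        # log-odds slope and odds ratio
```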
