
Multivariate Regression



Presentation Transcript


  1. Multivariate Regression 11/19/2013

  2. Readings • Chapter 8 (pp 187-206) • Chapter 9 Dummy Variables and Interaction Effects (Pollock Workbook)

  3. Opportunities to discuss course content

  4. Office Hours For the Week • When • Tuesday 10-12 • Thursday 8-12 • And by appointment

  5. Course Learning Objectives • Students will be able to interpret and explain empirical data. • Students will achieve competency in conducting statistical data analysis using the SPSS software program.

  6. Ratio and Intervals! Bivariate Regression analysis

  7. Bivariate Linear Regression • Bivariate linear regression is an important statistical technique in the social sciences. It allows us to measure the effect of an independent variable on a dependent variable. • It regresses all the values onto a line that best describes the relationship.
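The line-fitting idea can be sketched in a few lines of Python with NumPy (made-up data for illustration, not from the course): the fitted intercept and slope recover the line that best describes a noisy linear relationship.

```python
import numpy as np

# Made-up data: a roughly linear relationship with some noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 + 0.5 * x + rng.normal(scale=0.3, size=x.size)

# np.polyfit with degree 1 returns [slope, intercept] of the
# least-squares line Y = a + bX.
slope, intercept = np.polyfit(x, y, 1)
```

The estimates land close to the true constant (2.0) and slope (0.5) used to generate the data.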

  8. Rules for Regression • You need a ratio/interval dependent variable that takes on at least 11 values (nominal data will not work) • You need 30 or more cases (N > 30) • You need a linear relationship; regression will not work with curvilinear or exponential relationships.

  9. The Regression Equation! Y = a + bX • Y: the dependent variable • a: the constant, where the line crosses the y-axis • X: the independent variable • b: the slope and direction of the line

  10. Weight and MPG • What is the constant? • A positive or negative relationship? • Is it a significant relationship, and why? • What is the predicted MPG of a car that weighs 2,000 lbs?
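The slide's SPSS output is not reproduced in this transcript, so as a purely hypothetical illustration of the prediction step, suppose the fitted line were MPG = 45 - 0.007 * weight:

```python
# Hypothetical coefficients for illustration only -- the slide's actual
# SPSS output is not reproduced in this transcript.
a = 45.0      # constant: predicted MPG at a weight of zero
b = -0.007    # slope: MPG change per additional pound (negative)

weight = 2000
mpg_hat = a + b * weight    # plug into Y-hat = a + bX, about 31 MPG
```

Whatever the real coefficients are, the prediction is always the same plug-and-chug step.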

  11. Multiple regression

  12. What we can do with it • Test the significance, strength and direction of more than one independent variable on the dependent variable, while controlling for the other independent variables. • We can compare the strength of each independent variable against each other • We can examine an entire model at one time!

  13. The Model Y = a + b1X1 + b2X2 + … • Y is the dependent variable • a is the constant • b1X1: the first beta coefficient and the first independent variable • b2X2: the second beta coefficient and the second independent variable
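A minimal sketch of this additive model (synthetic data with assumed coefficients, not from the course): ordinary least squares recovers each slope while holding the other variable constant.

```python
import numpy as np

# Synthetic data with known coefficients: a = 1, b1 = 2, b2 = -3.
rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 3.0 * x2 + rng.normal(scale=0.5, size=n)

# Design matrix: a leading column of 1s carries the constant a.
X = np.column_stack([np.ones(n), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
a_hat, b1_hat, b2_hat = coef   # each slope controls for the other variable
```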

  14. This allows us to model additive relationships

  15. Computing a Multiple Regression • You put more than one independent variable where you say "independents" • D.V.: Women09 (% of women in parliament) • IV1: womenyear2 (date of enfranchisement) • IV2: pr-sys (PR system) • IV3: pmat12_3 (postmaterialism)

  16. Regression Outputs • These have 3 parts • The Model Summary • ANOVA • The Variables/Model

  17. Part I Things that Begin with “r”

  18. With So Many, How do we know? • There are many R's out there: • lower case "r" for correlation • upper case "R" for regression

  19. Correlation (small r) • r: Pearson's product-moment correlation coefficient • r2: the squared Pearson correlation coefficient
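A quick sketch with a small made-up sample, just to show the two quantities side by side:

```python
import numpy as np

# Small made-up sample with a near-perfect linear relationship.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

r = np.corrcoef(x, y)[0, 1]   # Pearson's product-moment correlation
r2 = r ** 2                   # squared correlation coefficient
```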

  20. The R-Square (large R) • This is a measure of association for the entire model • This is a PRE measure that tells us what percent of the total variation in the dependent variable is explained by our model • The higher the number, the better our model predicts • We can increase the R-Square of our model by adding variables, even insignificant ones!

  21. Adjusted R-Square • This "adjusts" for the addition of independent variables. In equations with more than one independent variable, it will always be smaller than the R-Square. • This is the preferred measure, and also a PRE measure
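Both points can be demonstrated with made-up data: adding a purely random "junk" variable can never lower the R-Square, while the Adjusted R-Square stays below it because of the penalty for the extra variable.

```python
import numpy as np

def r_square(X, y):
    """R-Square of an OLS fit of y on X (X includes a constant column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1 - resid @ resid / np.sum((y - y.mean()) ** 2)

def adj_r_square(X, y):
    n, k = X.shape     # k counts the constant plus the predictors
    r2 = r_square(X, y)
    return 1 - (1 - r2) * (n - 1) / (n - k)

rng = np.random.default_rng(2)
n = 100
x = rng.normal(size=n)
junk = rng.normal(size=n)               # unrelated to y by construction
y = 1.0 + 2.0 * x + rng.normal(size=n)

X_small = np.column_stack([np.ones(n), x])
X_big = np.column_stack([np.ones(n), x, junk])

r2_small = r_square(X_small, y)
r2_big = r_square(X_big, y)             # never smaller than r2_small
```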

  22. What the R's Look Like • The R Square • The Adj R-Square, the preferred measure

  23. Part II The Analysis of variance (ANOVA)

  24. ANOVA • A way of testing the null hypothesis for the entire model; we look at the F-Score • H0: there is no relationship between our variables and the dependent variable • HA: there is at least one significant variable in the model

  25. What the F-Score Tells Us • It is like a chi-square for regression: the F-Score tells us whether we have a significant regression model • If the F-Score is not significant, we fail to reject the null hypothesis (no relationship) • A significant F-Score tells us at least one of our variables is significant • It is a way of examining the entire regression at once

  26. The F-Score • We look at the Sig value and use the p<.05 measurement • In the model above, our p value is .001 • We Reject the null hypothesis • At least one variable is significant
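As a sketch (synthetic data, not the model in the slides), the F-Score can be computed from the model's R-Square and compared to its critical value:

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 100, 2                     # n cases, p independent variables
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 1.5 * x1 + rng.normal(size=n)   # x2 has no real effect

X = np.column_stack([np.ones(n), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ coef
r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)

# F tests H0 (no slope matters) against HA (at least one does).
F = (r2 / p) / ((1 - r2) / (n - p - 1))
# Compare F to the critical value (about 3.09 at p < .05 for df = 2, 97);
# here F is far larger, so we reject the null even though x2 is junk.
```

Note that one strong variable is enough to make the whole-model F significant.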

  27. Part III The Model

  28. The Model • What it tells us • Variable relationships and direction • Variable significance • Variable Strength

  29. Old Friends • T-Tests: test the significance of each independent variable on the dependent variable; accept or reject the null for that variable • Beta Values: measure the change in the dependent variable; show the direction of the relationship

  30. Standardized Beta Coefficients • They show us the variables which have the greatest influence. • These are measured in absolute value • The larger the standardized beta, the more influence it has on the dependent variable.
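A sketch with made-up data of why we compare standardized betas rather than raw slopes: a variable recorded in tiny units gets a huge raw slope, but standardizing puts both predictors on the same footing.

```python
import numpy as np

def zscore(v):
    return (v - v.mean()) / v.std()

# x1's true effect (2.0) is larger than x2's (0.5) on comparable scales,
# but x2 is recorded in tiny units, so its raw slope looks huge (~50).
rng = np.random.default_rng(3)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n) / 100
y = 2.0 * x1 + 0.5 * (x2 * 100) + rng.normal(size=n)

# Raw (unstandardized) slopes: x2's is bigger purely because of units.
X = np.column_stack([np.ones(n), x1, x2])
raw = np.linalg.lstsq(X, y, rcond=None)[0]

# Standardized betas: regress z-scored Y on z-scored predictors.
Z = np.column_stack([zscore(x1), zscore(x2)])
beta = np.linalg.lstsq(Z, zscore(y), rcond=None)[0]
# In absolute value, x1's standardized beta is the larger one,
# correctly ranking its influence above x2's.
```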

  31. Looking at Our Model • Beta Values • T-Score and Significance

  32. Trying it out

  33. Turning Texas Blue (through Turnout) • Data • Dependent Variable: Turnout 2012 • Independent Variables: • COLGRAD (college %) • HISPPER (Hispanic percent) • BLACKPER (African-American percent)

  34. Another One • D.V.: Palin_therm-post (feeling thermometer for Palin, 0-100) • IVs: • enviro_jobs (environment vs. jobs tradeoff; 0=envir, 1=middle, 2=jobs) • educ_r (education in years) • Gunown (do you own a gun? 1=yes, 5=no) • relig_bible_word (Is the Bible the actual word of God? 1=yes, 0=no)

  35. Another One from the States • Gay rights involves many concepts. The Lax-Phillips index uses content validity to address this issue at the state level. It examines support for the following issues: • Adoption • Hate Crimes legislation • Health Benefits • Housing Discrimination • Job Discrimination • Marriage Laws • Sodomy Laws • Civil Unions • It then averages these to get a statewide score

  36. State Example • Dependent Variable: gay_support (higher is more supportive on Lax-Phillips) • Independent Variables: • relig_import (% of people in the state who say religion provides a great deal of guidance) • south (1=South, 0=non-South) • abortlaw (restrictions on abortion)

  37. Tautology • It is tempting to use independent variables that are actually components of the dependent variable. • How you will notice this: • If the variables seem to be measures of each other (human development vs. education, or female literacy vs. the overall literacy rate), they probably are • Suspiciously high Adj. R-Squares (above .900)

  38. Multicollinearity • Your independent variables should not only be independent of the D.V. (non-tautological); they should also be independent of each other! • Multicollinearity means picking independent variables that are very closely related, or are actually part of the same measure. What can happen here is that these variables will negate each other's influence on the dependent variable.

  39. Symptoms of Multicollinearity • The multiple regression equation is statistically significant (big R values, even a significant ANOVA), but none of the t-ratios are statistically significant • The addition of the collinear independent variable radically changes the values of the standardized beta coefficients (they go from positive to negative, or weak to strong) without a corresponding change in the Adj. R-Square • Variables that you would swear on a stack of Bibles should be related are not

  40. Solving Tautology and Multicollinearity • Solving tautology: drop the offending independent variable • What to do about multicollinearity: • Run bivariate correlations on each pair of your independent variables. If the r-square value is >.60, drop one of the variables or combine them into a single measure.
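The bivariate-correlation screen described above can be sketched as follows (made-up variables; the .60 cutoff is the slide's rule of thumb):

```python
import numpy as np

# Made-up example: x2 is nearly a copy of x1; x3 is unrelated to either.
rng = np.random.default_rng(4)
n = 300
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)

# Pairwise Pearson r between candidate independent variables.
r12 = np.corrcoef(x1, x2)[0, 1]
r13 = np.corrcoef(x1, x3)[0, 1]

collinear_pair = r12 ** 2 > 0.60   # flag: drop one variable or combine them
healthy_pair = r13 ** 2 <= 0.60    # fine to keep both in the model
```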

  41. Data collection

  42. Collecting Primary Data • Direct Observation • Document Analysis • Interview Data

  43. Document Analysis

  44. Document Analysis (The Written Record) • What it is • When to use it

  45. Types of Document Analysis • The Episodic Record • The Running Record

  46. Limitations and Advantages
