
Bivariate Relationships

Bivariate Relationships. Plotting a Line. Review: Covariance. When it tends to be the case that x is greater than the mean when y is greater than the mean AND x is lower than the mean when y is lower than the mean, then there is a positive covariation. Plot showing positive covariance.

geoff

Presentation Transcript


  1. Bivariate Relationships: Plotting a Line

  2. Review: Covariance • When x tends to be above its mean whenever y is above its mean, AND x tends to be below its mean whenever y is below its mean, then x and y covary positively
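The definition above can be checked numerically. The following is a minimal sketch that borrows the six (x, y) pairs from the deck's small X and Y example; the function name `sample_covariance` is illustrative, not something the slides define.

```python
# The six pairs from the deck's small X and Y example.
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 6, 5]


def sample_covariance(xs, ys):
    """Sum of cross-products of deviations from the means, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    return cross / (n - 1)


cov_xy = sample_covariance(x, y)
# A positive value means x and y tend to sit on the same side of their means.
print(round(cov_xy, 2))
```

Four of the six points lie on the same side of both means, which is why the sum of cross-products comes out positive.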

  3. Plot showing positive covariance [Figure: plot of urban % and female literacy, with reference lines at mean urban % and mean female literacy]

  4. Expected value • We may want more specific knowledge than that: the expected value of y at each value of x • I may know the mean height of everyone in class • But if I also know gender, I can generate two expected values, one per group • Remember, we are always trying to do better than the mean
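A small sketch of the "two expected values" idea follows. The height records are hypothetical, invented purely for illustration (the deck gives no class data), and the variable names are assumptions as well.

```python
from statistics import mean

# Hypothetical (gender, height in cm) records, invented for illustration only.
records = [("F", 160), ("F", 165), ("F", 170),
           ("M", 172), ("M", 178), ("M", 184)]

# One expected value for everyone: the overall mean.
overall_mean = mean(h for _, h in records)

# Two expected values: one mean per gender.
heights_by_gender = {}
for gender, height in records:
    heights_by_gender.setdefault(gender, []).append(height)
group_means = {g: mean(hs) for g, hs in heights_by_gender.items()}

# Squared prediction error under each approach.
sse_overall = sum((h - overall_mean) ** 2 for _, h in records)
sse_group = sum((h - group_means[g]) ** 2 for g, h in records)
print(overall_mean, group_means, sse_overall, sse_group)
```

With these made-up numbers the group means leave less squared error than the overall mean, which is the sense in which knowing x lets us "do better than the mean".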

  5. Regression analysis: important to know the substantive effect • For every 10K dollars given in humanitarian aid, there is a 3K increase in spending on weapons • That is different from a .5K increase in weapons spending for every 10K dollars given in humanitarian aid • And different again from an 8K increase in weapons spending for every 10K dollars given in humanitarian aid • Unit of analysis?

  6. Regression equation • y = a + bx + e • ŷ = a + bx • ŷ is also known as yhat • y is the observed value of the dependent variable • yhat is the predicted value • a is the intercept • b is the slope • e is the error (the residual, y - yhat)
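As a sketch of how a and b can be obtained, the block below applies the usual least-squares formulas (b = Sxy / Sxx, a = mean of y minus b times mean of x) to the deck's six-point X and Y example. The helper name `fit_line` is an assumption made for this illustration.

```python
# The six pairs from the deck's small X and Y example.
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 6, 5]


def fit_line(xs, ys):
    """Return (a, b) for the least-squares line yhat = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))
    sxx = sum((xi - mean_x) ** 2 for xi in xs)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    return a, b


a, b = fit_line(x, y)
y_hat = [a + b * xi for xi in x]                      # predicted values
residuals = [yi - yh for yi, yh in zip(y, y_hat)]     # e = y - yhat
print(round(a, 4), round(b, 7))
```

The intercept and slope agree with the `_cons` and `x` coefficients in the deck's regression output, and the residuals sum to zero, as they must for a least-squares line with an intercept.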

  7. X and Y
        Y   X
        2   1
        1   2
        4   3
        3   4
        6   5
        5   6

  8. X and Y

  9. Predicted values [Figure: y plotted against x (both axes 1.00 to 6.00) with the fitted line; the predicted values are 1.43, 2.26, 3.09, 3.91, 4.74, and 5.57 at x = 1 through 6]

  10. Residual values [Figure: y plotted against x (both axes 1.00 to 6.00) with each point's residual marked; the residuals are .57, -1.26, .91, -.91, 1.26, and -.57 at x = 1 through 6]

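  11. Descriptives
        y   x   pred     res    exp   unexp    tot
        2   1   1.43    0.57   4.29    0.33   2.25
        1   2   2.26   -1.26   1.54    1.58   6.25
        4   3   3.09    0.91   0.17    0.84   0.25
        3   4   3.91   -0.91   0.17    0.84   0.25
        5   6   5.57   -0.57   4.29    0.33   2.25
        6   5   4.74    1.26   1.54    1.58   6.25

      * The columns above show squared deviations before dividing by n - 1.
      * Total: squared deviation of y from its mean (3.5)
      generate totvar = ((y - 3.5)^2)/(_N - 1)
      * Explained: squared deviation of the prediction from the mean
      generate exp = ((pred - 3.5)^2)/(_N - 1)
      * Unexplained: squared residual (y - pred)
      generate unexp = (res^2)/(_N - 1)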
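The split into explained, unexplained, and total variation can be sketched as follows. This assumes the standard decomposition (total SS = explained SS + residual SS, with the residual part being the sum of squared residuals) and reuses the six data points; the variable names are illustrative.

```python
# The six pairs from the deck's small X and Y example.
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 6, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope and intercept.
b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
     / sum((xi - mean_x) ** 2 for xi in x))
a = mean_y - b * mean_x
pred = [a + b * xi for xi in x]

ss_total = sum((yi - mean_y) ** 2 for yi in y)                 # total
ss_explained = sum((p - mean_y) ** 2 for p in pred)            # explained
ss_unexplained = sum((yi - p) ** 2 for yi, p in zip(y, pred))  # residual

r_squared = ss_explained / ss_total
print(round(ss_total, 4), round(ss_explained, 4),
      round(ss_unexplained, 4), round(r_squared, 4))
```

These three sums line up with the Total, Model, and Residual rows of the regression output for the same data, and their ratio gives the reported R-squared.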

  12. Descriptive statistics

  13. Bivariate regression
      . regr y x, beta

            Source |       SS       df       MS              Number of obs =       6
      -------------+------------------------------           F(  1,     4) =    8.76
             Model |  12.0142857     1  12.0142857           Prob > F      =  0.0416
          Residual |  5.48571429     4  1.37142857           R-squared     =  0.6865
      -------------+------------------------------           Adj R-squared =  0.6082
             Total |        17.5     5         3.5           Root MSE      =  1.1711

      ------------------------------------------------------------------------------
                 y |      Coef.   Std. Err.      t    P>|t|                     Beta
      -------------+----------------------------------------------------------------
                 x |   .8285714   .2799417     2.96   0.042                 .8285714
             _cons |         .6   1.090216     0.55   0.611                        .
      ------------------------------------------------------------------------------
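To see where the summary figures in this output come from, here is a hedged sketch that rebuilds them from the sums of squares and the slope printed above, using the textbook formulas for a bivariate regression. The variable names are illustrative, and the inputs are copied from the output rather than re-estimated.

```python
from math import sqrt

# Values copied from the regression output above.
n = 6
ss_model = 12.0142857
ss_resid = 5.48571429
slope = 0.8285714
x = [1, 2, 3, 4, 5, 6]

df_model = 1
df_resid = n - 2

ms_model = ss_model / df_model
ms_resid = ss_resid / df_resid
f_stat = ms_model / ms_resid                       # F(1, 4)

r_squared = ss_model / (ss_model + ss_resid)
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / df_resid
root_mse = sqrt(ms_resid)

# Standard error of the slope: root MSE divided by sqrt(Sxx).
mean_x = sum(x) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
se_slope = root_mse / sqrt(sxx)
t_slope = slope / se_slope

print(round(f_stat, 2), round(r_squared, 4), round(adj_r_squared, 4),
      round(root_mse, 4), round(se_slope, 7), round(t_slope, 2))
```

With a single predictor, the F statistic is the square of the slope's t statistic, which is why both carry essentially the same p-value in the output.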

  14. Why is the unstandardized slope the same as the standardized slope? • In other words, what would have to be the case for this to be true?

  15. Standard deviations are the same
      Descriptive Statistics
              Mean   Std. Deviation   N
        y   3.5000          1.87083   6
        x   3.5000          1.87083   6
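The link between the two slopes can be sketched with the standard conversion beta = b × (sd of x / sd of y). The block below applies it to the six-point example, taking the unstandardized slope from the deck's regression output; variable names are illustrative.

```python
from statistics import stdev

# The six pairs from the deck's small X and Y example.
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 6, 5]

b = 0.8285714          # unstandardized slope from the regression output

sd_x = stdev(x)        # sample standard deviation (n - 1 in the denominator)
sd_y = stdev(y)

# Standardized slope: rescale b by the ratio of the standard deviations.
beta = b * sd_x / sd_y
print(round(sd_x, 5), round(sd_y, 5), round(beta, 7))
```

Because x and y contain the same six values in a different order, their standard deviations are identical, the ratio is 1, and beta equals b.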

  16. Correlations and other statistics

  17. Another example

  18. Descriptives Syntax
      . sum happy

          Variable |       Obs        Mean    Std. Dev.       Min        Max
      -------------+--------------------------------------------------------
             happy |        11    1.909091     .700649          1          3

      . sum occpres

          Variable |       Obs        Mean    Std. Dev.       Min        Max
      -------------+--------------------------------------------------------
           occpres |        11    1.909091    .8312094          1          3

  19. Life Happiness and Occupational Prestige

  20. Tabulation of General Happiness by occpres (counts, with column percentages beneath)
            General |             occpres
          Happiness |         1          2          3 |     Total
      --------------+---------------------------------+----------
      Not Too Happy |         0          1          2 |         3
                    |      0.00      25.00      66.67 |     27.27
      --------------+---------------------------------+----------
       Pretty Happy |         3          3          0 |         6
                    |     75.00      75.00       0.00 |     54.55
      --------------+---------------------------------+----------
         Very Happy |         1          0          1 |         2
                    |     25.00       0.00      33.33 |     18.18
      --------------+---------------------------------+----------
              Total |         4          4          3 |        11
                    |    100.00     100.00     100.00 |    100.00

      Kendall's tau-b = -0.3689  ASE = 0.318
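The tau-b figure can be reproduced from the cell counts alone. The sketch below uses the usual definition (concordant minus discordant pairs, divided by the geometric mean of the pairs not tied on each variable); the function name `kendall_tau_b` is an assumption for this illustration, not Stata's implementation.

```python
from math import sqrt

# Cell counts from the table above.
# Rows: Not Too Happy, Pretty Happy, Very Happy; columns: occpres 1, 2, 3.
table = [
    [0, 1, 2],
    [3, 3, 0],
    [1, 0, 1],
]


def kendall_tau_b(t):
    """Kendall's tau-b computed from a two-way table of counts."""
    n_rows, n_cols = len(t), len(t[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):      # rows below cell (i, j)
                for m in range(n_cols):
                    if m > j:                   # below and to the right
                        concordant += t[i][j] * t[k][m]
                    elif m < j:                 # below and to the left
                        discordant += t[i][j] * t[k][m]
    n = sum(map(sum, t))
    all_pairs = n * (n - 1) // 2
    row_ties = sum(r * (r - 1) // 2 for r in map(sum, t))
    col_ties = sum(c * (c - 1) // 2 for c in map(sum, zip(*t)))
    return (concordant - discordant) / sqrt(
        (all_pairs - row_ties) * (all_pairs - col_ties))


tau_b = kendall_tau_b(table)
print(round(tau_b, 4))
```

There are 7 concordant and 21 discordant pairs in this table, so the statistic is negative: higher prestige categories go with lower happiness codes more often than not.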

  21. Correlation between happiness and occupational prestige
      . corr happy prestg80
      (obs=11)

                   |    happy prestg80
      -------------+------------------
             happy |   1.0000
          prestg80 |  -0.5181   1.0000

  22. Correlation between happiness and categorical occupational prestige
      . corr happy occpres
      (obs=11)

                   |    happy  occpres
      -------------+------------------
             happy |   1.0000
           occpres |  -0.3590   1.0000
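The correlation with the three-category prestige variable can be recovered from the happiness by occpres cell counts shown earlier in the deck. This sketch assumes happiness is coded 1 = Not Too Happy, 2 = Pretty Happy, 3 = Very Happy, an assumption inferred from the reported mean of 1.909 rather than stated on the slides.

```python
from math import sqrt

# Cell counts from the happiness-by-occpres tabulation.
# Rows: happy = 1, 2, 3 (assumed coding); columns: occpres = 1, 2, 3.
table = [
    [0, 1, 2],
    [3, 3, 0],
    [1, 0, 1],
]

# Expand the table into one (happy, occpres) pair per respondent.
pairs = [(h, o)
         for h, row in enumerate(table, start=1)
         for o, count in enumerate(row, start=1)
         for _ in range(count)]


def pearson_r(points):
    """Pearson correlation of a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    sxy = sum((px - mean_x) * (py - mean_y) for px, py in points)
    sxx = sum((px - mean_x) ** 2 for px, _ in points)
    syy = sum((py - mean_y) ** 2 for _, py in points)
    return sxy / sqrt(sxx * syy)


r = pearson_r(pairs)
print(len(pairs), round(r, 4))
```

Collapsing prestige into three categories discards information, which is one reason this correlation is weaker than the one computed with the original prestg80 scores.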

  23. Life Happiness and Prestige

  24. Regression Syntax • Syntax is regr DV IV • regr happy prestg80, beta • Reports beta coefficients; with only one independent variable, beta is the same as Pearson's r • regr happy prestg80 • Reports confidence intervals instead of betas

  25. Regression results
            Source |       SS       df       MS              Number of obs =      11
      -------------+------------------------------           F(  1,     9) =    3.30
             Model |  1.31753739     1  1.31753739           Prob > F      =  0.1026
          Residual |  3.59155351     9  .399061502           R-squared     =  0.2684
      -------------+------------------------------           Adj R-squared =  0.1871
             Total |  4.90909091    10  .490909091           Root MSE      =  .63171

      ------------------------------------------------------------------------------
             happy |      Coef.   Std. Err.      t    P>|t|                     Beta
      -------------+----------------------------------------------------------------
          prestg80 |  -.0380391   .0209348    -1.82   0.103                 -.518061
             _cons |   3.330371   .8050567     4.14   0.003                        .
      ------------------------------------------------------------------------------
      • Why are we not that confident in our results?
      • Why is the beta so much larger than the coefficient for the slope?

  26. Regression results
            Source |       SS       df       MS              Number of obs =      11
      -------------+------------------------------           F(  1,     9) =    3.30
             Model |  244.378788     1  244.378788           Prob > F      =  0.1026
          Residual |  666.166667     9  74.0185185           R-squared     =  0.2684
      -------------+------------------------------           Adj R-squared =  0.1871
             Total |  910.545455    10  91.0545455           Root MSE      =  8.6034

      ------------------------------------------------------------------------------
          prestg80 |      Coef.   Std. Err.      t    P>|t|                     Beta
      -------------+----------------------------------------------------------------
             happy |  -7.055556    3.88302    -1.82   0.103                 -.518061
             _cons |   50.83333   7.853795     6.47   0.000                        .
      ------------------------------------------------------------------------------
      • Why is the coefficient so much bigger?
      • What happens to the confidence?
      • Why is the beta the same?
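One way to work through the questions on these two regression slides is to rescale each slope by the standard deviations reported in the deck's summary output. The sketch below uses only figures printed in the deck and the standard relation beta = b × (sd of the predictor / sd of the outcome); variable names are illustrative.

```python
# Figures copied from the deck's output.
b_happy_on_prestige = -0.0380391   # regr happy prestg80
b_prestige_on_happy = -7.055556    # regr prestg80 happy
sd_happy = 0.700649                # sum happy
sd_prestige = 9.542251             # sum prestg80

# Standardized slope = raw slope * sd(predictor) / sd(outcome).
beta_1 = b_happy_on_prestige * sd_prestige / sd_happy
beta_2 = b_prestige_on_happy * sd_happy / sd_prestige

# The product of the two raw slopes equals r squared.
r_squared = b_happy_on_prestige * b_prestige_on_happy

print(round(beta_1, 4), round(beta_2, 4), round(r_squared, 4))
```

Prestige varies over a range roughly 13.6 times wider than happiness, so a one-unit change in prestige is a small step while a one-unit change in happiness is a large one. That accounts for the very different raw slopes, while the standardized slope (and the t statistic) is the same in both directions.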

  27. [Figure: "Are you happy?" (mean = 1.9; axis 0 to 3.5) plotted against Occupational Prestige (mean = 37; axis 0 to 60), with the fitted line] Life happiness = 3.33 - .038 × Occupational Prestige

  28. What happens to the confidence if we keep the slope the same but double the n?

  29. What happens to the confidence if we keep the doubled n but decrease the variance of occupational prestige?
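A rough numerical sketch of the two preceding questions follows. It uses the textbook standard error of a bivariate slope, sqrt(MS residual / Sxx), with the figures from the happy-on-prestg80 regression. Both scenarios are assumptions made for illustration: "doubling n" is modeled as duplicating every observation, and "less variance" as halving the standard deviation of prestige while leaving the residual sum of squares unchanged.

```python
from math import sqrt


def slope_se(ss_resid, n, sxx):
    """Standard error of a bivariate OLS slope: sqrt((SS_resid / (n - 2)) / Sxx)."""
    return sqrt((ss_resid / (n - 2)) / sxx)


# Figures copied from the deck's output.
slope = -0.0380391
n = 11
ss_resid = 3.59155351
sd_prestige = 9.542251
sxx = (n - 1) * sd_prestige ** 2       # sum of squared deviations of x

se_base = slope_se(ss_resid, n, sxx)

# Scenario 1 (assumption): every observation appears twice, so the slope is
# unchanged while SS_resid, Sxx, and n all double.
se_doubled = slope_se(2 * ss_resid, 2 * n, 2 * sxx)

# Scenario 2 (assumption): same doubled sample, but the standard deviation of
# prestige is halved, so Sxx shrinks to a quarter of its doubled value.
se_low_var = slope_se(2 * ss_resid, 2 * n, 2 * sxx / 4)

for label, se in [("base", se_base), ("doubled n", se_doubled),
                  ("doubled n, less x variance", se_low_var)]:
    print(label, round(se, 5), round(slope / se, 2))
```

Under these assumptions, more observations shrink the standard error and push the t statistic further from zero, while squeezing the spread of the predictor inflates the standard error again.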

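  30. Syntax
      * Collapse prestg80 into three categories: low, med, high
      generate occpres = 1 if prestg80 < 33
      * Stata uses & for "and"
      replace occpres = 2 if prestg80 > 32 & prestg80 < 45
      * >= keeps a score of exactly 45 from being left unassigned; missing values stay missing
      replace occpres = 3 if prestg80 >= 45 & !missing(prestg80)
      label define highmedlow 1 "low" 2 "med" 3 "high"
      label values occpres highmedlow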
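The same three-way recode can be sketched outside Stata. The function name `occpres_category` is hypothetical, and assigning a score of exactly 45 to the high group is an assumption, since the cut points on the slide leave that value ambiguous.

```python
def occpres_category(prestg80):
    """Map an occupational prestige score to 1 (low), 2 (med), or 3 (high)."""
    if prestg80 < 33:
        return 1
    if prestg80 < 45:
        return 2
    return 3   # assumption: 45 and above counts as high


LABELS = {1: "low", 2: "med", 3: "high"}

# A few illustrative scores, including the observed minimum (22) and maximum (51).
for score in (22, 33, 44, 45, 51):
    print(score, LABELS[occpres_category(score)])
```

Checking the boundary values explicitly, as the loop does, is a quick way to confirm that every score lands in exactly one category.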

  31. . sum prestg80

          Variable |       Obs        Mean    Std. Dev.       Min        Max
      -------------+--------------------------------------------------------
          prestg80 |        11    37.36364    9.542251         22         51
