
Using Quasi-variance to Communicate Sociological Results from Statistical Models

Vernon Gayle & Paul S. Lambert, University of Stirling. Gayle and Lambert (2007) Sociology, 41(6):1191-1208.


  1. Using Quasi-variance to Communicate Sociological Results from Statistical Models. Vernon Gayle & Paul S. Lambert, University of Stirling. Gayle and Lambert (2007) Sociology, 41(6):1191-1208.

  2. “One of the useful things about mathematical and statistical models [of educational realities] is that, so long as one states the assumptions clearly and follows the rules correctly, one can obtain conclusions which are, in their own terms, beyond reproach. The awkward thing about these models is the snares they set for the casual user; the person who needs the conclusions, and perhaps also supplies the data, but is untrained in questioning the assumptions….

  3. …What makes things more difficult is that, in trying to communicate with the casual user, the modeller is obliged to speak his or her language – to use familiar terms in an attempt to capture the essence of the model. It is hardly surprising that such an enterprise is fraught with difficulties, even when the attempt is genuinely one of honest communication rather than compliance with custom or even subtle indoctrination” (Goldstein 1993, p. 141).

  4. A little biography (or narrative)… • Since being at the Centre for Applied Stats in 1998/9 I have been thinking about the issue of model presentation • Done some work on Sample Enumeration Methods with Richard Davies • Summer 2004 (with David Steele’s help) began to think about “quasi-variance” • Summer 2006 began writing a paper with Paul Lambert

  5. The Reference Category Problem • In standard statistical models the effects of a categorical explanatory variable are assessed by comparison to one category (or level) that is set as a benchmark against which all other categories are compared • The benchmark category is usually referred to as the ‘reference’ or ‘base’ category

  6. The Reference Category Problem: an example using some English Government Office Regions • 0 = North East of England (reference category) • 1 = North West England • 2 = Yorkshire & Humberside • 3 = East Midlands • 4 = West Midlands • 5 = East of England
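As a rough illustration of how a reference category enters a model, the sketch below dummy-codes the six region codes listed on this slide. The function name `dummy_code` and the layout of the indicator vector are assumptions made for the example; they are not taken from the presentation.

```python
# Region codes as listed on the slide; code 0 (North East) is the reference.
REGIONS = {
    0: "North East of England",
    1: "North West England",
    2: "Yorkshire & Humberside",
    3: "East Midlands",
    4: "West Midlands",
    5: "East of England",
}
REFERENCE = 0


def dummy_code(region_code):
    """Return one 0/1 indicator per non-reference region (hypothetical helper)."""
    if region_code not in REGIONS:
        raise ValueError(f"unknown region code: {region_code}")
    return [1 if region_code == code else 0
            for code in sorted(REGIONS) if code != REFERENCE]


for code, name in REGIONS.items():
    print(f"{name:<25} {dummy_code(code)}")
```

The reference region is the row of all zeros, which is why its coefficient is fixed at zero and every other coefficient is read as a contrast with it.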

  7. Government Office Region

  8. Table 1: Logistic regression prediction that self-rated health is ‘good’ (parameter estimates for Model 1)

  9. Conventional Confidence Intervals • Since these confidence intervals overlap we might be beguiled into concluding that the two regions are not significantly different to each other • However, this conclusion represents a common misinterpretation of regression estimates for categorical explanatory variables • These confidence intervals are not estimates of the difference between the North West and Yorkshire and Humberside, but instead they indicate the difference between each category and the reference category (i.e. the North East) • Critically, there is no confidence interval for the reference category because it is forced to equal zero
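The overlap described on this slide can be checked with a small sketch. It assumes the rounded estimates quoted on slide 17 (0.09 for the North West, 0.12 for Yorkshire & Humberside) and takes the two variances from the first two non-zero diagonal entries of the matrix on slide 20; the mapping of matrix rows to regions and the use of a 1.96 multiplier are assumptions of this example.

```python
import math

# Assumed inputs: rounded estimates from slide 17, variances from slide 20.
estimates = {"North West": 0.09, "Yorkshire & Humberside": 0.12}
variances = {"North West": 0.00010483, "Yorkshire & Humberside": 0.00011543}
Z_95 = 1.96  # normal multiplier for a 95% interval (assumption)


def conf_interval(estimate, variance, z=Z_95):
    """Conventional interval: estimate +/- z * standard error."""
    se = math.sqrt(variance)
    return estimate - z * se, estimate + z * se


intervals = {r: conf_interval(estimates[r], variances[r]) for r in estimates}
nw_low, nw_high = intervals["North West"]
yh_low, yh_high = intervals["Yorkshire & Humberside"]

# Two intervals overlap when each one starts before the other ends.
intervals_overlap = nw_low <= yh_high and yh_low <= nw_high

for region, (low, high) in intervals.items():
    print(f"{region}: [{low:.4f}, {high:.4f}]")
print("Intervals overlap:", intervals_overlap)
```

Both intervals describe a contrast with the North East, so their overlap says nothing direct about the North West versus Yorkshire & Humberside contrast.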

  10. Formally Testing the Difference Between Parameters - The banana skin is here!

  11. Standard Error of the Difference: $\text{s.e. difference} = \sqrt{\text{var}_{NW} + \text{var}_{YH} - 2\,\text{cov}_{NW,YH}}$, where $\text{var}_{NW}$ is the variance for North West (s.e.²) and $\text{var}_{YH}$ is the variance for Yorkshire & Humberside (s.e.²)

  12. The covariance $\text{cov}_{NW,YH}$ is only available in the variance-covariance matrix

  13. Standard Error of the Difference: $\text{s.e. difference} = \sqrt{\text{var}_{NW} + \text{var}_{YH} - 2\,\text{cov}_{NW,YH}} = 0.0083$
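A minimal sketch of this calculation follows. It assumes the North West and Yorkshire & Humberside entries are the first two non-reference rows of the variance-covariance matrix shown on slide 20; the function name `se_difference` is introduced here for illustration only.

```python
import math

# Assumed to be the North West and Yorkshire & Humberside entries
# of the variance-covariance matrix on slide 20.
var_nw = 0.00010483
var_yh = 0.00011543
cov_nw_yh = 0.00007543


def se_difference(var_a, var_b, cov_ab):
    """Standard error of (b_a - b_b): sqrt(var_a + var_b - 2 * cov_ab)."""
    return math.sqrt(var_a + var_b - 2.0 * cov_ab)


se_diff = se_difference(var_nw, var_yh, cov_nw_yh)
print(round(se_diff, 4))  # 0.0083
```

Under that assumption the result matches the 0.0083 reported on the slide.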

  14. Formal Tests: t = -0.03 / 0.0083 = -3.6; Wald χ² = (-0.03 / 0.0083)² = 12.97; p = 0.0003. Remember: earlier, because the two sets of confidence intervals overlapped, we could wrongly have concluded that the two regions were not significantly different to each other
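The test statistics can be reproduced approximately from the rounded figures on the slides. This sketch uses only the standard library and the identity that the upper tail of a chi-squared distribution with one degree of freedom equals erfc(sqrt(x / 2)). Because the inputs here are rounded (0.09, 0.12, 0.0083), the Wald statistic comes out slightly above the slide's 12.97, which was presumably computed from unrounded values.

```python
import math

# Rounded inputs as quoted on the slides (assumption: North West minus
# Yorkshire & Humberside, with the standard error of the difference 0.0083).
diff = 0.09 - 0.12
se_diff = 0.0083

t_stat = diff / se_diff   # ratio of the difference to its standard error
wald = t_stat ** 2        # Wald chi-squared statistic on 1 degree of freedom

# Upper-tail probability of a chi-squared(1) variable: erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(wald / 2.0))

print(f"t = {t_stat:.2f}, Wald chi2 = {wald:.2f}, p = {p_value:.4f}")
```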

  15. Comment • Only the primary analyst has the opportunity to make formal comparisons • Reporting the variance-covariance matrix is seldom, if ever, feasible in paper-based publications • In a model with q parameters there would, in general, be ½q (q-1) covariances to report
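To give a sense of scale for the ½q (q-1) figure, here is a short sketch; the function name `n_covariances` is hypothetical, and the values of q are chosen only for illustration.

```python
def n_covariances(q):
    """Number of distinct covariances among q parameters: q * (q - 1) / 2."""
    if q < 0:
        raise ValueError("q must be non-negative")
    return q * (q - 1) // 2


for q in (2, 5, 9, 20):
    print(f"q = {q:>2}: {n_covariances(q)} covariances")
```

Nine estimated parameters already imply 36 covariances, far more than a results table can reasonably carry.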

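  16. Firth’s Method (made simple): $\text{s.e. difference} \approx \sqrt{\text{qv}_{NW} + \text{qv}_{YH}}$, where $\text{qv}_{NW}$ and $\text{qv}_{YH}$ are the quasi-variances of the two categories being compared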

  17. Firth’s Method (made simple): s.e. difference ≈ 0.0083; t = (0.09 - 0.12) / 0.0083 = -3.6; Wald χ² = (-0.03 / 0.0083)² = 12.97; p = 0.0003. These results are identical to the results calculated by the conventional method

  18. The QV based ‘comparison intervals’ no longer overlap

  19. Firth QV Calculator (on-line)

  20. Information from the Variance-Covariance Matrix Entered into the Data Window (Model 1)

      0
      0 0.00010483
      0 0.00007543 0.00011543
      0 0.00007543 0.00007543 0.00012312
      0 0.00007543 0.00007543 0.00007543 0.00011337
      0 0.00007544 0.00007543 0.00007543 0.00007543 0.00011480
      0 0.00007545 0.00007544 0.00007544 0.00007544 0.00007545 0.00010268
      0 0.00007544 0.00007543 0.00007544 0.00007543 0.00007544 0.00007546 0.00011802
      0 0.00007552 0.00007548 0.00007550 0.00007547 0.00007554 0.00007572 0.00007558 0.00015002
      0 0.00007547 0.00007545 0.00007546 0.00007545 0.00007548 0.00007555 0.00007549 0.00007598 0.00012356
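The sketch below shows why quasi-variances can work well for a matrix like this one. It is not Firth's estimator and not the on-line calculator: it uses a simplifying shortcut, assumed for this example, that treats all covariances among the non-reference categories as one common value c, so that the quasi-variance is c for the reference category and (variance - c) for each other category. The ordering of rows (reference category first, then North West, then Yorkshire & Humberside) is also an assumption.

```python
import math
from itertools import combinations

# Lower-triangular entries from slide 20 (first row: reference category).
LOWER = [
    [0.0],
    [0.0, 0.00010483],
    [0.0, 0.00007543, 0.00011543],
    [0.0, 0.00007543, 0.00007543, 0.00012312],
    [0.0, 0.00007543, 0.00007543, 0.00007543, 0.00011337],
    [0.0, 0.00007544, 0.00007543, 0.00007543, 0.00007543, 0.00011480],
    [0.0, 0.00007545, 0.00007544, 0.00007544, 0.00007544, 0.00007545,
     0.00010268],
    [0.0, 0.00007544, 0.00007543, 0.00007544, 0.00007543, 0.00007544,
     0.00007546, 0.00011802],
    [0.0, 0.00007552, 0.00007548, 0.00007550, 0.00007547, 0.00007554,
     0.00007572, 0.00007558, 0.00015002],
    [0.0, 0.00007547, 0.00007545, 0.00007546, 0.00007545, 0.00007548,
     0.00007555, 0.00007549, 0.00007598, 0.00012356],
]

n = len(LOWER)
# Expand the lower triangle into a full symmetric matrix.
V = [[LOWER[max(i, j)][min(i, j)] for j in range(n)] for i in range(n)]


def exact_se(i, j):
    """Conventional s.e. of the difference, using the covariance."""
    return math.sqrt(V[i][i] + V[j][j] - 2.0 * V[i][j])


# Shortcut (assumption, NOT Firth's method): one common covariance c,
# taken as the mean covariance among the non-reference categories.
off_diag = [V[i][j] for i, j in combinations(range(1, n), 2)]
c = sum(off_diag) / len(off_diag)
quasi_var = [c] + [V[i][i] - c for i in range(1, n)]


def quasi_se(i, j):
    """Approximate s.e. of the difference from two quasi-variances."""
    return math.sqrt(quasi_var[i] + quasi_var[j])


max_rel_error = max(abs(quasi_se(i, j) / exact_se(i, j) - 1.0)
                    for i, j in combinations(range(n), 2))

print(f"exact s.e. (rows 1 vs 2): {exact_se(1, 2):.4f}")
print(f"quasi s.e. (rows 1 vs 2): {quasi_se(1, 2):.4f}")
print(f"largest relative error over all pairs: {max_rel_error:.4%}")
```

Because the covariances in this matrix are nearly constant, a single extra column of quasi-variances reproduces every pairwise standard error closely. Firth's method chooses the quasi-variances by minimising the approximation error rather than by this shortcut.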

  21. Benefits • Overcomes the reference category problem when presenting models • Provides reliable results (even though based on an approximation) • Easy(ish) to calculate • Has extensions to other models. Costs • Extra column in results • Time convincing colleagues that this is a good thing. Conclusion: we should start using this method

  22. Conclusion – Why have we told you this… • Categorical X vars are ubiquitous • Interpretation of coefficients is critical to sociological analyses • Subtleties / slipperiness • (cf. in Economics where emphasis is often on precision rather than communication)
