Basic epidemiologic analysis with Stata

marcy

Presentation Transcript


  1. Basic epidemiologic analysis with Stata Biostatistics 212 Lecture 5

  2. Housekeeping • Questions about Lab 4? • Lab 3 issues • Categorizing continuous variables (21-30 v 20-29) • Include p-values when appropriate • Don’t forget the missing values! Check your work with a cross tabulation, i.e. tab genderp female, missing • Next week we’ll start with the Final Project! • What data will you use? • Explore and clean your data • Start planning tables and figures

  3. Today... • What’s the difference between epidemiologic and statistical analysis? • Interaction and confounding with 2 x 2’s • Stata’s “Epitab” commands • Adjusting for many things at once • Logistic regression • Testing for trends

  4. Epi vs. Biostats • Epidemiologic analysis – Analyzing and interpreting clinical research data in the context of scientific knowledge • Biostatistical analysis – Evaluating the role of chance

  5. Epi vs. Biostats • Epi – Confounding, interaction, and causal diagrams. • What to adjust for? • What do the adjusted estimates mean? [Figure: two causal diagrams relating A, B, and C]

  6. 2 x 2 Tables • “Contingency tables” are the traditional analytic tool of the epidemiologist
                      Outcome +   Outcome -
       Exposure +         a           b
       Exposure -         c           d
     OR = (a/b) / (c/d) = ad/bc
     RR = [a/(a+b)] / [c/(c+d)]

  7. 2 x 2 Tables • Example
                            Coronary calcium +   Coronary calcium -   Total
       Binge drinking +            106                  585             691
       Binge drinking -            186                 2165            2351
       Total                       292                 2750            3042
     OR = 2.1 (1.6 – 2.7)
     RR = 1.9 (1.6 – 2.4)
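
As a check on the arithmetic, the estimates on slide 7 can be reproduced by hand from the four cell counts. The sketch below is in Python rather than Stata. The function name `two_by_two` and the choice of large-sample (log-scale Wald) confidence intervals are assumptions made for illustration; Stata's commands may use other interval methods, so the last digit can differ slightly.

```python
import math

def two_by_two(a, b, c, d, z=1.96):
    """Crude OR and RR from a 2 x 2 table laid out as on slide 6:
    a, b = exposed with / without the outcome,
    c, d = unexposed with / without the outcome.
    (Hypothetical helper, not a Stata or library function.)"""
    odds_ratio = (a * d) / (b * c)
    risk_ratio = (a / (a + b)) / (c / (c + d))

    # Standard errors on the log scale (large-sample approximations).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))

    def wald_ci(estimate, se):
        # exp(log(estimate) -/+ z * se)
        return estimate * math.exp(-z * se), estimate * math.exp(z * se)

    return {
        "or": odds_ratio,
        "or_ci": wald_ci(odds_ratio, se_log_or),
        "rr": risk_ratio,
        "rr_ci": wald_ci(risk_ratio, se_log_rr),
    }

# Cell counts from slide 7 (binge drinking by coronary calcium).
result = two_by_two(106, 585, 186, 2165)
print("OR = %.2f (%.2f to %.2f)" % (result["or"], *result["or_ci"]))
print("RR = %.2f (%.2f to %.2f)" % (result["rr"], *result["rr_ci"]))
```

The printed values can be compared with the OR and RR reported on slide 7.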

  8. 2 x 2 Tables • There is a statistically significant association, but is it causal? • Does male gender confound the association? [Diagram: Male, Binge drinking, Coronary calcium]

  9. 2 x 2 Tables • Men more likely to binge • 34% of men, 14% of women • Men have more coronary calcium • 15% of men, 7% of women

  10. 2 x 2 Tables • But what does confounding look like in a 2x2 table? • And how do you adjust for it?

  11. 2 x 2 Tables • First, stratify…
      Crude (CAC by binge):           RR = 1.94 (1.55-2.42)
      In men (34% binge, 15% CAC):    RR = 1.50 (1.16-1.93)
      In women (14% binge, 7% CAC):   RR = 1.57 (0.94-2.62)

  12. 2 x 2 Tables • …compare strata-specific estimates… • (they’re about the same)
      In men (34% binge, 15% CAC):    RR = 1.50 (1.16-1.93)
      In women (14% binge, 7% CAC):   RR = 1.57 (0.94-2.62)

  13. 2 x 2 Tables • …compare to the crude estimate
      Crude (CAC by binge):           RR = 1.94 (1.55-2.42)
      In men (34% binge, 15% CAC):    RR = 1.50 (1.16-1.93)
      In women (14% binge, 7% CAC):   RR = 1.57 (0.94-2.62)

  14. 2 x 2 Tables • …and then adjust the summary estimate.
      In men:     RR = 1.50 (1.16-1.93)
      In women:   RR = 1.57 (0.94-2.62)
      Adjusted:   RRadj = 1.51 (1.21-1.89)
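
To make the adjustment step concrete, here is a minimal Python sketch of one standard way to pool stratum-specific risk ratios, Mantel-Haenszel weighting. Whether this matches the exact method behind the RRadj on the slide is an assumption. The transcript does not preserve the stratum cell counts, so the counts below are HYPOTHETICAL and chosen only so that the crude and adjusted estimates visibly differ; the function names are also made up for this example.

```python
def risk_ratio(a, n1, c, n0):
    """a of n1 exposed and c of n0 unexposed have the outcome."""
    return (a / n1) / (c / n0)

def mantel_haenszel_rr(strata):
    """Mantel-Haenszel pooled risk ratio.
    Each stratum is (a, n1, c, n0) as in risk_ratio()."""
    numerator = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata)
    denominator = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata)
    return numerator / denominator

# HYPOTHETICAL counts (not the lecture's data): two strata of a confounder.
strata = [
    (20, 100, 10, 100),  # stratum 1: exposure common, outcome common
    (5, 50, 10, 200),    # stratum 2: exposure rare, outcome rare
]

stratum_rrs = [risk_ratio(*s) for s in strata]

# Crude estimate: collapse the strata into a single table first.
crude_rr = risk_ratio(
    sum(s[0] for s in strata), sum(s[1] for s in strata),
    sum(s[2] for s in strata), sum(s[3] for s in strata),
)
adjusted_rr = mantel_haenszel_rr(strata)

print("stratum RRs:", stratum_rrs)
print("crude RR:", crude_rr)
print("M-H adjusted RR:", adjusted_rr)
```

In this made-up example the stratum-specific ratios agree with each other and with the pooled estimate, while the crude ratio is larger. That is the same pattern the slides describe: similar strata-specific estimates that differ from the crude one suggest confounding rather than interaction.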

  15. Crude (CAC by binge):           RR = 1.94 (1.55-2.42)
      In men (34% binge, 15% CAC):    RR = 1.50 (1.16-1.93)
      In women (14% binge, 7% CAC):   RR = 1.57 (0.94-2.62)
      Adjusted:                       RRadj = 1.51 (1.21-1.89)

  16. 2 x 2 Tables • How do we do this with Stata? • Tabulate – output not exactly what we want. • The “epitab” commands • Stata’s answer to stratified analyses: cs, cc; csi, cci; tabodds, mhodds

  17. 2 x 2 Tables • Example – demo using Stata
      cs cac binge
      cs cac binge, by(male)
      cs cac modalc
      cs cac modalc, by(racegender)
      cc cac binge

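  18. 2 x 2 Tables • Immediate commands • csi, cci • No dataset required – just 2x2 cell frequencies
      csi a b c d
      csi 106 186 585 2165 (for cac binge)
      Note: csi reads the cells in the order exposed cases, unexposed cases, exposed noncases, unexposed noncases, so b and c swap places relative to the table on slide 6.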

  19. Multivariable adjustment • Binge drinking appears to be associated with coronary calcium • Association partially due to confounding by gender • What about race? Age? SES? Smoking?

  20. Multivariable adjustment: manual stratification
      Adjustment                             # 2x2 tables
      Crude association                      1
      Adjust for gender                      2
      Adjust for gender, race                4
      Adjust for gender, race, age           68
      Adjust for “” + income, education      816
      Adjust for “” + “” + smoking           2448
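
The table counts on slide 20 grow multiplicatively: every added covariate multiplies the number of strata by its number of categories. A small Python sketch of that bookkeeping follows. The category counts for age (17), income plus education combined (12), and smoking (3) are not stated on the slide; they are inferred here from the ratios of successive rows (68/4, 816/68, 2448/816) and should be read as assumptions.

```python
from itertools import accumulate
from operator import mul

# (covariate, number of categories). Gender and race are taken as binary;
# the remaining counts are INFERRED from the slide's totals, not given there.
levels = [
    ("gender", 2),
    ("race", 2),
    ("age", 17),
    ("income + education", 12),
    ("smoking", 3),
]

# Running product = number of 2 x 2 tables after each added covariate.
tables_needed = list(accumulate((n for _, n in levels), mul))

print("crude association: 1 table")
for (name, _), count in zip(levels, tables_needed):
    print("after adding %s: %d tables" % (name, count))
```

With roughly 3,000 participants spread over thousands of strata, most tables would be nearly empty, which is why manual stratification stops being practical after a few covariates.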

  21. Multivariable adjustment: cs command • cs command • Does manual stratification for you • Lists results from every stratum • Tests for overall homogeneity • Adjusted and crude results • Demo: cs cac binge, by(male black age)

  22. Multivariable adjustment: cs command • cs command • Does manual stratification for you • Lists results from every stratum • Tests for overall homogeneity • Adjusted and crude results • Demo: cs cac binge, by(male black age) • Can’t interpret interactions!

  23. Multivariable adjustment: mhodds command • mhodds allows you to look at specific interactions, adjusted for multiple covariates • Does same stratification for you • Adjusted results for each interaction variable • P-value for specific interaction (homogeneity) • Summary adjusted result • Demo: mhodds cac binge age, by(racegender)

  24. Multivariable adjustment: mhodds command • mhodds allows you to look at specific interactions, adjusted for multiple covariates • Does same stratification for you • Adjusted results for each interaction variable • P-value for specific interaction (homogeneity) • Summary adjusted result • Demo: mhodds cac binge age, by(racegender) • But strata get thin!

  25. Multivariable adjustment: logistic command • Assumes logit model • Await biostats class for details! • Coefficients estimated, no actual stratification • Continuous variables used as they are

  26. Multivariable adjustment: logistic command • Basic syntax:
      logistic outcomevar [predictorvar1 predictorvar2 predictorvar3…]

  27. Multivariable adjustment: logistic command • If using any categorical predictors:
      xi: logistic outcomevar [i.catvar var2…]
      • Creates “dummy variables” on the fly • If you forget, Stata won’t know the predictors are categorical and will treat them as continuous, and you’ll get the wrong answer!

  28. Multivariable adjustment: logistic command • Demo
      logistic cac binge
      logistic cac binge male
      logistic cac binge male black
      logistic cac binge male black age
      xi: logistic cac binge male black age i.smoke

  29. Multivariable adjustment: logistic command • Pros • Provides all ORs in the model • Accepted approach • Can deal with continuous variables • Better estimation for large models? • Cons • Interaction testing more cumbersome, less automatic • More assumptions • Harder to test for trends

  30. Testing for trend • Alcohol consumption can be a lot or a little • Does association increase with larger amounts of consumption? • (no j-shaped curve) • Test of trend? • Look through epitab suite

  31. Testing for trends: tabodds command • chi2 test of trend • tabodds cac alccat • Look at output • Adjustment for multiple variables possible • tabodds cac alccat, adjust(age male black)
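
For intuition about what a chi2 test of trend measures, the following Python sketch computes a Cochran-Armitage style score statistic (1 degree of freedom) for a binary outcome across ordered exposure categories. It is an illustration of the general idea and is not claimed to be the exact statistic that tabodds reports. The function name, the category scores 0, 1, 2, and the counts are all HYPOTHETICAL.

```python
def trend_chi2(scores, cases, totals):
    """Score test for a linear trend in the proportion with the outcome.
    scores: numeric score per ordered category (e.g. 0, 1, 2)
    cases:  number with the outcome in each category
    totals: number of people in each category"""
    n_all = sum(totals)
    p = sum(cases) / n_all  # overall proportion with the outcome

    # Observed minus expected cases, weighted by the category score.
    u = sum(x * (d - n * p) for x, d, n in zip(scores, cases, totals))

    # Variance of u under the null hypothesis of no trend.
    sum_nx = sum(n * x for x, n in zip(scores, totals))
    sum_nx2 = sum(n * x * x for x, n in zip(scores, totals))
    v = p * (1 - p) * (sum_nx2 - sum_nx ** 2 / n_all)

    return u * u / v

# HYPOTHETICAL data: outcome proportion rises 10%, 20%, 30% across categories.
rising = trend_chi2([0, 1, 2], [10, 20, 30], [100, 100, 100])
# Same totals, but a flat 20% in every category: no trend at all.
flat = trend_chi2([0, 1, 2], [20, 20, 20], [100, 100, 100])

print("rising proportions: chi2 =", rising)
print("flat proportions:   chi2 =", flat)
```

A statistic above 3.84 corresponds to p < 0.05 on 1 degree of freedom. Note that a J-shaped relationship can yield a small trend statistic even when the categories clearly differ, so it is worth looking at the category-specific odds as well.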

  32. Approaching your analysis • Number of potential models/analyses is daunting • Where do you start? How do you finish? • My suggestion • Explore • Plan definitive analysis, make dummy tables/figures • Do analysis (do/log files), fill in tables/figures • Show to collaborators, iterate as needed • Write paper

  33. Summary • Make sure you understand confounding and interaction with 2x2 tables in Stata • Epitab commands are a great way to explore your data • Emphasis on interaction • Logistic regression is a more general approach, ubiquitous, but testing for interactions and trends is more difficult

  34. In lab today… • Lab 5 • Epi analysis of coronary calcium dataset • Walks you through evaluation of confounding and interaction • Judgment calls – often no right answer, just focus on reasoning.
