Lecture 6 Comparison of logistic regression and stratified analyses

suchin
Presentation Transcript


  1. Lecture 6: Comparison of logistic regression and stratified analyses. Biost 536 Thompson Part 2

  3. . lincom _Itype_1+ _ItypXsmo_1_1
      ( 1)  _Itype_1 + _ItypXsmo_1_1 = 0
     ------------------------------------------------------------------------------
              chd | Odds Ratio  Std. Err.        z   P>|z|     [95% Conf. Interval]
     -------------+----------------------------------------------------------------
              (1) |   1.947875   .5067205     2.56   0.010     1.169848    3.243343
     ------------------------------------------------------------------------------
     . lincom _Itype_1+ _ItypXsmo_1_2
      ( 1)  _Itype_1 + _ItypXsmo_1_2 = 0
     ------------------------------------------------------------------------------
              chd | Odds Ratio  Std. Err.        z   P>|z|     [95% Conf. Interval]
     -------------+----------------------------------------------------------------
              (1) |   1.952703   .6272995     2.08   0.037     1.040376    3.665067
     ------------------------------------------------------------------------------
     . lincom _Itype_1+ _ItypXsmo_1_3
      ( 1)  _Itype_1 + _ItypXsmo_1_3 = 0
     ------------------------------------------------------------------------------
              chd | Odds Ratio  Std. Err.        z   P>|z|     [95% Conf. Interval]
     -------------+----------------------------------------------------------------
              (1) |   1.714465   .6671239     1.39   0.166     .7996751    3.675732
     ------------------------------------------------------------------------------
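
The z statistic and confidence limits in a lincom odds-ratio row can be reproduced by hand from the odds ratio and its standard error. The following is a minimal sketch in Python (not part of the original Stata session). It assumes the reported standard error is on the odds-ratio scale (delta method) and uses a normal critical value of about 1.96; the function name is hypothetical.

```python
import math

def or_confidence_interval(odds_ratio, se_or, z_crit=1.959964):
    """Wald z and confidence limits for an odds ratio.

    se_or is assumed to be the standard error on the odds-ratio scale,
    so the log-scale standard error is se_or / odds_ratio.
    """
    log_or = math.log(odds_ratio)
    se_log = se_or / odds_ratio
    z = log_or / se_log
    lower = math.exp(log_or - z_crit * se_log)
    upper = math.exp(log_or + z_crit * se_log)
    return z, lower, upper

# First lincom row: _Itype_1 + _ItypXsmo_1_1
z, lower, upper = or_confidence_interval(1.947875, 0.5067205)
print(round(z, 2), round(lower, 3), round(upper, 3))
```

The interval is symmetric on the log scale, which is why the reported limits are not symmetric around the odds ratio itself.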

  4. What null hypothesis is this LRT assessing?

  5. Some Stata language for recoding variables:
     Categorical variable “age” coded 1,2,3,4,5,6
     . generate agegp=recode(age,2,4,6)
     . * All obsns with age <= 2 have agegp=2, all with age >2 and <=4
     . * have agegp=4 and all with age > 4 and <=6 have agegp=6
     . * Change the coding to 1,2,3
     . recode agegp 2=1 4=2 6=3
     . table age
     ----------+-----------
        Age in |
         years |      Freq.
     ----------+-----------
         25-34 |        116
         35-44 |        199
         45-54 |        213
         55-64 |        242
         65-74 |        161
           75+ |         44
     ----------+-----------

  6. . drop agegp
     . gen agegp=recode(age,2,4)
     . table agegp
     -------+-----------
      agegp |      Freq.
     -------+-----------
          2 |        315
          4 |        660
     -------+-----------
     . * All observations that are not <= a number in the list are given the last value in the list
     . drop agegp
     . gen agegp=1+(age>2)+(age>4)
     . table agegp
     ----------+-----------
         agegp |      Freq.
     ----------+-----------
             1 |        315
             2 |        455
             3 |        205
     ----------+-----------
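
The two recoding idioms can be checked against the frequency tables with a short Python sketch (not Stata; the helper `stata_recode` is a hypothetical stand-in that mimics the behaviour described in the comments above: the first listed value that is at least x, otherwise the last listed value).

```python
from collections import Counter

def stata_recode(x, *cuts):
    """Mimic recode(x, c1, ..., cn): first cut with x <= cut, else the last cut."""
    for c in cuts:
        if x <= c:
            return c
    return cuts[-1]

# Frequencies of the age codes 1..6, taken from the `table age` output
age_freq = {1: 116, 2: 199, 3: 213, 4: 242, 5: 161, 6: 44}

two_cut = Counter()      # agegp = recode(age, 2, 4)
three_group = Counter()  # agegp = 1 + (age > 2) + (age > 4)
for age, n in age_freq.items():
    two_cut[stata_recode(age, 2, 4)] += n
    # Python booleans add as 0/1, just like Stata's logical expressions
    three_group[1 + (age > 2) + (age > 4)] += n

print(dict(two_cut))
print(dict(three_group))
```

Summing logical expressions gives consecutive codes directly, which avoids the second `recode` step needed after `recode(age,2,4,6)`.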

  7. Effect of linear transformations of covariates

  8. Dose response models
     Consider the role of alcohol in the esophageal cancer study, with age as a potential confounder (alcohol consumption 0-39, 40-79, 80-119, 120+ g/day; age 25-34, 35-44, 45-54, 55-64, 65-74, 75+).
     1. Dummy variable coding
     • What is the interpretation of β1, β2, β3?
     • How do we state the assumption of no association between alcohol consumption and disease risk in terms of model parameters?
     • What does H0: β2 = 0 mean?

  9. Dose response models
     • What is the interpretation of β1?
     • How would you put H0: β1 = 0 into words?

  10. Dose response models
      Comparing dummy variable and grouped linear dose-response
      The two models are nested. The dummy variable model is a reparameterization of a model that adds terms to the grouped linear model.
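
The nesting can be written out explicitly. The slide's own equations did not survive in this transcript, so the following is a sketch under assumed notation: a four-level exposure scored x = 0, 1, 2, 3, indicator variables D_j for level j, and confounder terms omitted for brevity.

```latex
\begin{align*}
\text{Grouped linear:}\quad
  & \operatorname{logit}(p) = \alpha + \beta x, \qquad x \in \{0,1,2,3\} \\
\text{Dummy variable:}\quad
  & \operatorname{logit}(p) = \alpha + \beta_1 D_1 + \beta_2 D_2 + \beta_3 D_3 \\
\text{Grouped linear as a constraint:}\quad
  & \beta_2 = 2\beta_1, \qquad \beta_3 = 3\beta_1 \\
\text{Reparameterization:}\quad
  & \operatorname{logit}(p) = \alpha + \beta_1 x + \gamma_2 D_2 + \gamma_3 D_3,
    \qquad \gamma_j = \beta_j - j\,\beta_1
\end{align*}
```

Setting both γ terms to zero recovers the grouped linear model, so with four exposure levels the likelihood-ratio comparison of the two models has 2 degrees of freedom.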

  11. Dose response models
      Consider the following coding in a model where smoking status (cigs/day) is a risk factor:
      • What is the interpretation of H0: β1 = 0?
      • What is the interpretation of H0: β2 = 0?
      Note: the grouped linear model for smoking is nested in this model; comparing the two models provides a test of H0: β1 = β2.

  12. Stata analysis

  15. Fit a model without alcohol:

  16. Test significance of dummy variable model ORs
      Compare dummy variable and grouped linear models
      Create plots of the fitted values for grouped linear and dummy variable models:

  18. Example from the Framingham study
      Assume that cholesterol is the risk factor of interest for CHD and that age and sex are regarded as possible confounders.
      Coding:
        Sex:  Male = 0, Female = 1
        Age:  30-49 yrs = 0, 50-62 yrs = 1
        Chol: 0 = <190 mg/100ml, 1 = 190-219 mg/100ml, 2 = 220-249 mg/100ml, 3 = 250+ mg/100ml
      . infile sex age chol case count using "p:\536\framingham.txt"
      . gen sa=sex*age
      . logistic case sex age sa [freq=count]
      Logistic regression                               Number of obs   =       4856
                                                        LR chi2(3)      =     223.78
                                                        Prob > chi2     =     0.0000
      Log likelihood = -1238.1973                       Pseudo R2       =     0.0829
      ------------------------------------------------------------------------------
              case | Odds Ratio  Std. Err.        z   P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               sex |   .2343622   .0436654    -7.79   0.000     .1626654    .3376601
               age |   2.708977   .3646438     7.40   0.000     2.080792    3.526809
                sa |   2.170123   .5172673     3.25   0.001      1.36017    3.462386
      ------------------------------------------------------------------------------
      . est store A

  19. . logistic case sex age [freq=count]
      Logistic regression                               Number of obs   =       4856
                                                        LR chi2(2)      =     212.83
                                                        Prob > chi2     =     0.0000
      Log likelihood = -1243.6693                       Pseudo R2       =     0.0788
      ------------------------------------------------------------------------------
              case | Odds Ratio  Std. Err.        z   P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               sex |   .3673749    .041778    -8.81   0.000     .2939751    .4591012
               age |   3.516703   .3839374    11.52   0.000     2.839262     4.35578
      ------------------------------------------------------------------------------
      . est store B
      . lrtest A B
      Likelihood-ratio test                                  LR chi2(1)  =     10.94
      (Assumption: B nested in A)                            Prob > chi2 =    0.0009
      Now introduce cholesterol as a dummy variable, without and then with confounder adjustment.
      . xi: logistic case i.chol [freq=count]
      i.chol            _Ichol_0-3          (naturally coded; _Ichol_0 omitted)
      Logistic regression                               Number of obs   =       4856
                                                        LR chi2(3)      =      85.86
                                                        Prob > chi2     =     0.0000
      Log likelihood = -1307.1541                       Pseudo R2       =     0.0318
      ------------------------------------------------------------------------------
              case | Odds Ratio  Std. Err.        z   P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
          _Ichol_1 |   1.408998   .2849726     1.70   0.090     .9478795    2.094438
          _Ichol_2 |   2.361255    .446123     4.55   0.000     1.630502    3.419514
          _Ichol_3 |   3.811035   .6825005     7.47   0.000     2.682905    5.413532
      ------------------------------------------------------------------------------
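
The `lrtest A B` result follows directly from the two log likelihoods. A minimal Python sketch (not Stata), assuming the models differ by exactly one parameter, as they do here (the `sa` interaction term); the function name is hypothetical.

```python
import math

def lr_test_1df(loglik_full, loglik_reduced):
    """Likelihood-ratio statistic and p-value for nested models that differ by one parameter."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    # With 1 degree of freedom the chi-square upper tail equals erfc(sqrt(stat / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Model A (sex, age, sa) against model B (sex, age)
stat, p_value = lr_test_1df(-1238.1973, -1243.6693)
print(round(stat, 2), round(p_value, 4))
```

The same statistic is the difference between the two models' own `LR chi2` values (223.78 and 212.83), because both are measured against the same intercept-only model.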

  20. . xi: logistic case i.chol sex age sa [freq=count]
      i.chol            _Ichol_0-3          (naturally coded; _Ichol_0 omitted)
      Logistic regression                               Number of obs   =       4856
                                                        LR chi2(6)      =     278.82
                                                        Prob > chi2     =     0.0000
      Log likelihood = -1210.675                        Pseudo R2       =     0.1033
      ------------------------------------------------------------------------------
              case | Odds Ratio  Std. Err.        z   P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
          _Ichol_1 |   1.265023   .2604385     1.14   0.253     .8449991     1.89383
          _Ichol_2 |   1.959199   .3786927     3.48   0.001     1.341375    2.861587
          _Ichol_3 |   3.039625   .5653248     5.98   0.000     2.111102    4.376539
               sex |   .2504792   .0468737    -7.40   0.000     .1735726    .3614615
               age |   2.649839    .360558     7.16   0.000     2.029543    3.459718
                sa |     1.6341   .3962956     2.02   0.043     1.015894    2.628504
      ------------------------------------------------------------------------------
      . est store B
      . lrtest B A
      Likelihood-ratio test                                  LR chi2(3)  =     55.04
      (Assumption: A nested in B)                            Prob > chi2 =    0.0000
      Now explore the dose-response for cholesterol. Consider merging the two lower categories.
      . gen chol2=(chol>1)+(chol>2)

  21. . xi: logistic case age sex sa i.chol2 [freq=count]
      i.chol2           _Ichol2_0-2         (naturally coded; _Ichol2_0 omitted)
      Logistic regression                               Number of obs   =       4856
                                                        LR chi2(5)      =     277.50
                                                        Prob > chi2     =     0.0000
      Log likelihood = -1211.3366                       Pseudo R2       =     0.1028
      ------------------------------------------------------------------------------
              case | Odds Ratio  Std. Err.        z   P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               age |   2.657913   .3615256     7.19   0.000     2.035924    3.469924
               sex |   .2494536   .0466714    -7.42   0.000      .172876    .3599521
                sa |   1.646246   .3992246     2.06   0.040     1.023466     2.64799
         _Ichol2_1 |   1.702258   .2465811     3.67   0.000     1.281517    2.261134
         _Ichol2_2 |   2.638887   .3545856     7.22   0.000     2.027895    3.433968
      ------------------------------------------------------------------------------
      . est store C
      . lrtest B C
      Likelihood-ratio test                                  LR chi2(1)  =      1.32
      (Assumption: C nested in B)                            Prob > chi2 =    0.2500
      We might also consider a grouped linear model:
      . logistic case sex age sa chol [freq=count]
      Logistic regression                               Number of obs   =       4856
                                                        LR chi2(4)      =     278.11
                                                        Prob > chi2     =     0.0000
      Log likelihood = -1211.0318                       Pseudo R2       =     0.1030
      ------------------------------------------------------------------------------
              case | Odds Ratio  Std. Err.        z   P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               sex |   .2510722    .046962    -7.39   0.000     .1740143    .3622532
               age |   2.647062   .3599985     7.16   0.000     2.027689    3.455627
                sa |     1.6385   .3971248     2.04   0.042      1.01892    2.634833
              chol |   1.484784   .0821353     7.15   0.000     1.332222    1.654817
      ------------------------------------------------------------------------------

  22. . est store D
      . lrtest B D
      Likelihood-ratio test                                  LR chi2(2)  =      0.71
      (Assumption: D nested in B)                            Prob > chi2 =    0.7000
      Or a grouped linear model based on 3 categories:
      . logistic case sex age sa chol2 [freq=count]
      Logistic regression                               Number of obs   =       4856
                                                        LR chi2(4)      =     277.36
                                                        Prob > chi2     =     0.0000
      Log likelihood = -1211.4088                       Pseudo R2       =     0.1027
      ------------------------------------------------------------------------------
              case | Odds Ratio  Std. Err.        z   P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               sex |   .2488943   .0465438    -7.44   0.000     .1725196    .3590802
               age |   2.656132   .3612456     7.18   0.000     2.034617    3.467503
                sa |   1.648407   .3997793     2.06   0.039     1.024772    2.651562
             chol2 |   1.621059   .1080207     7.25   0.000     1.422585    1.847223
      ------------------------------------------------------------------------------
      . est store E
      . lrtest C E
      Likelihood-ratio test                                  LR chi2(1)  =      0.14
      (Assumption: E nested in C)                            Prob > chi2 =    0.7040
      Using a grouped linear model with three cholesterol categories, we next proceed to explore possible interactions between the confounders and cholesterol.

  23. . gen sc=sex*chol2
      . gen ac=age*chol2
      . logistic case sex age sa chol2 sc [freq=count]
      Logistic regression                               Number of obs   =       4856
                                                        LR chi2(5)      =     279.08
                                                        Prob > chi2     =     0.0000
      Log likelihood = -1210.5474                       Pseudo R2       =     0.1034
      ------------------------------------------------------------------------------
              case | Odds Ratio  Std. Err.        z   P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               sex |   .2962674   .0673835    -5.35   0.000     .1897079    .4626819
               age |   2.656331   .3622607     7.16   0.000     2.033285    3.470291
                sa |   1.773859   .4424029     2.30   0.022     1.087998    2.892078
             chol2 |   1.720257   .1388469     6.72   0.000     1.468555    2.015098
                sc |   .8292774   .1178365    -1.32   0.188      .627694    1.095599
      ------------------------------------------------------------------------------
      . est store F
      . lrtest E F
      Likelihood-ratio test                                  LR chi2(1)  =      1.72
      (Assumption: E nested in F)                            Prob > chi2 =    0.1893
      . logistic case sex age sa chol2 ac [freq=count]
      Logistic regression                               Number of obs   =       4856
                                                        LR chi2(5)      =     291.83
                                                        Prob > chi2     =     0.0000
      Log likelihood = -1204.1695                       Pseudo R2       =     0.1081
      ------------------------------------------------------------------------------
              case | Odds Ratio  Std. Err.        z   P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               sex |    .254191   .0477456    -7.29   0.000     .1759042    .3673198
               age |   4.490304   .8809846     7.66   0.000     3.056839    6.595975
                sa |   1.799092   .4374102     2.42   0.016     1.117126    2.897374
             chol2 |    2.11201   .2059725     7.67   0.000     1.744548    2.556871
                ac |   .6036774   .0801734    -3.80   0.000      .465327    .7831619
      ------------------------------------------------------------------------------
      . est store G

  24. . lrtest E G
      Likelihood-ratio test                                  LR chi2(1)  =     14.48
      (Assumption: E nested in G)                            Prob > chi2 =    0.0001
      . lincom 2*chol2
       ( 1)  2 chol2 = 0
      ------------------------------------------------------------------------------
              case | Odds Ratio  Std. Err.        z   P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               (1) |   4.460585   .8700318     7.67   0.000     3.043448    6.537589
      ------------------------------------------------------------------------------
      . lincom chol2+ac
       ( 1)  chol2 + ac = 0
      ------------------------------------------------------------------------------
              case | Odds Ratio  Std. Err.        z   P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               (1) |   1.274972   .1149389     2.69   0.007     1.068476    1.521376
      ------------------------------------------------------------------------------
      . lincom 2*chol2+2*ac
       ( 1)  2 chol2 + 2 ac = 0
      ------------------------------------------------------------------------------
              case | Odds Ratio  Std. Err.        z   P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               (1) |   1.625555   .2930878     2.69   0.007     1.141642    2.314586
      ------------------------------------------------------------------------------

  25. Logistic models
      Logit(p) = β0 + β1 chol2 + β2 sex + β3 age + β4 sex*age + β5 chol2*age
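
Under this model the cholesterol odds ratio depends on age group: exp(β1) per category in the younger group and exp(β1 + β5) per category in the older group. The Python sketch below (not Stata) rebuilds the lincom point estimates from the two fitted odds ratios of model G; the function and constant names are hypothetical.

```python
# Fitted odds ratios from `logistic case sex age sa chol2 ac`
OR_CHOL2 = 2.11201    # exp(beta1): per-category cholesterol OR when age = 0
OR_AC = 0.6036774     # exp(beta5): multiplicative change in that OR when age = 1

def chol_odds_ratio(levels, age):
    """OR for an increase of `levels` chol2 categories within an age group (0 or 1)."""
    per_level = OR_CHOL2 * (OR_AC ** age)
    return per_level ** levels

for age in (0, 1):
    for levels in (1, 2):
        print("age =", age, "levels =", levels, round(chol_odds_ratio(levels, age), 2))
```

Only the point estimates can be rebuilt this way. The confidence limits from lincom also need the covariance between the two coefficients, which the output tables do not show.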

  26. Cholesterol odds ratios from the fitted model (table layout lost in extraction): 2.11  4.46  2.11  1.27  1.63

  27. Dose response models
      3. Continuous X

  28. . logistic low lwt
      Logistic regression                               Number of obs   =        189
                                                        LR chi2(1)      =       5.98
                                                        Prob > chi2     =     0.0145
      Log likelihood = -114.34533                       Pseudo R2       =     0.0255
      ------------------------------------------------------------------------------
               low | Odds Ratio  Std. Err.        z   P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               lwt |   .9860401   .0060834    -2.28   0.023     .9741886    .9980358
      ------------------------------------------------------------------------------
      . lincom 10*lwt, or
       ( 1)  10 lwt = 0
      ------------------------------------------------------------------------------
               low | Odds Ratio  Std. Err.        z   P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               (1) |   .8688519   .0536044    -2.28   0.023     .7698929    .9805307
      ------------------------------------------------------------------------------
      . logit low lwt
      Logistic regression                               Number of obs   =        189
                                                        LR chi2(1)      =       5.98
                                                        Prob > chi2     =     0.0145
      Log likelihood = -114.34533                       Pseudo R2       =     0.0255
      ------------------------------------------------------------------------------
               low |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               lwt |  -.0140583   .0061696    -2.28   0.023    -.0261504   -.0019661
             _cons |   .9983143   .7852889     1.27   0.204    -.5408235    2.537452
      ------------------------------------------------------------------------------
      . predict lp, p
      . est store A
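
With a continuous covariate the odds ratio depends on the size of the increment: a 10 lb difference corresponds to exp(10·β), which is what `lincom 10*lwt, or` reports. A minimal Python sketch (not Stata) using the `logit low lwt` coefficients; the function names are hypothetical, and the weight of 130 lb is only an illustrative input.

```python
import math

B_LWT = -0.0140583   # log-odds per pound of pre-pregnancy weight
B_CONS = 0.9983143   # intercept

def odds_ratio_per(delta_lbs):
    """Odds ratio for a difference of delta_lbs pounds."""
    return math.exp(delta_lbs * B_LWT)

def predicted_probability(lwt):
    """Fitted probability of low birthweight, the quantity `predict lp, p` stores."""
    eta = B_CONS + B_LWT * lwt
    return 1.0 / (1.0 + math.exp(-eta))

print(round(odds_ratio_per(1), 4))    # per pound
print(round(odds_ratio_per(10), 4))   # per 10 pounds
print(round(predicted_probability(130), 3))
```

The z statistic and p-value are identical for the 1 lb and 10 lb odds ratios because rescaling the covariate rescales the coefficient and its standard error by the same factor.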

  30. . centile lwt, c(20,40,60,80)
                                                         -- Binom. Interp. --
      Variable |     Obs  Percentile      Centile        [95% Conf. Interval]
      ---------+-------------------------------------------------------------
           lwt |     189         20          107        102.8034         110
               |                 40          120             115         120
               |                 60          130             123         132
               |                 80          150             140         160
      . gen lwtc=(lwt>107)+(lwt>120)+(lwt>130)+(lwt>150)
      . xi: logistic low i.lwtc
      i.lwtc            Ilwtc_0-4           (naturally coded; Ilwtc_0 omitted)
      Logit estimates                                   Number of obs   =        189
                                                        LR chi2(4)      =      11.04
                                                        Prob > chi2     =     0.0261
      Log likelihood = -111.81371                       Pseudo R2       =     0.0471
      ------------------------------------------------------------------------------
           low | Odds Ratio   Std. Err.       z     P>|z|       [95% Conf. Interval]
      ---------+--------------------------------------------------------------------
       Ilwtc_1 |   .3410256   .1523973     -2.407   0.016       .1420374     .818788
       Ilwtc_2 |      .5225   .2578529     -1.315   0.188       .1986185    1.374526
       Ilwtc_3 |   .2891304   .1554239     -2.308   0.021       .1008149    .8292065
       Ilwtc_4 |   .2293103   .1213334     -2.783   0.005       .0812893    .6468657
      ------------------------------------------------------------------------------

  32. Smoothing
      . lowess low lwt, gen(lows)
      . twoway (scatter low lwt) (line lows lwt, sort lcol(red)) (line lp lwt, sort ), scheme(s1mono) legend(off) xtitle(Pre-pregnancy wt (lbs)) ytitle(Prob of low birthweight)

  33. Splines
      Instead of a simple continuous function or a step function, fit a function that is linear / quadratic / cubic within group categories, but constrained to join "nicely" at the boundaries.
      • Advantages:
        • Individual data points have a strong influence on the shape of the curve only in the interval in which they lie
        • More plausible than a step function
        • Flexibility
        • Relatively easy to fit
      • Disadvantages:
        • As with step function: interval choice is subjective.
        • Hard to report results succinctly any way but graphically.
        • More parameters to be fitted

  34. Linear spline
      • Divide the observed values of X into k+1 categories
      • Define:
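
The definition itself is missing from this transcript. Judging from the later commands of the form `gen t1=(lwt>100)*(lwt-100)`, each spline term is the amount by which X exceeds a knot, and zero below the knot. The Python sketch below (not Stata) builds that basis under this assumption; the function names are hypothetical, and the knots are the ones used in the `mkspline` call on a later slide.

```python
def linear_spline_basis(x, knots):
    """Return [x, (x - k1)+, ..., (x - kK)+], where (u)+ means max(u, 0)."""
    return [x] + [max(x - k, 0.0) for k in knots]

def linear_spline_predictor(x, intercept, coefs, knots):
    """Linear predictor: intercept plus the sum of coefficient times basis term."""
    basis = linear_spline_basis(x, knots)
    return intercept + sum(b * t for b, t in zip(coefs, basis))

knots = [107, 120, 130, 150]
print(linear_spline_basis(125, knots))
```

Each knot coefficient is a change in slope rather than a slope, so the fitted line stays continuous at every knot, and a test that all knot coefficients are zero is a test against the single straight line.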

  35. Stata example
      . logit low lwt s1-s4
      Logistic regression                               Number of obs   =        189
                                                        LR chi2(5)      =       9.90
                                                        Prob > chi2     =     0.0781
      Log likelihood = -112.38526                       Pseudo R2       =     0.0422
      ------------------------------------------------------------------------------
               low |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               lwt |  -.0260585   .0409074    -0.64   0.524    -.1062356    .0541185
                s1 |   -.055212    .081114    -0.68   0.496    -.2141925    .1037685
                s2 |   .1424752   .1001686     1.42   0.155    -.0538516     .338802
                s3 |  -.0732678   .0852176    -0.86   0.390    -.2402912    .0937556
                s4 |  -.0004225   .0449774    -0.01   0.993    -.0885765    .0877316
             _cons |   2.493599   4.084743     0.61   0.542     -5.51235    10.49955
      ------------------------------------------------------------------------------
      . predict lsp, p
      . est store B
      . lrtest B A
      Likelihood-ratio test                                  LR chi2(4)  =      3.92
      (Assumption: A nested in B)                            Prob > chi2 =    0.4169

  36. Alternative Stata code
      . mkspline l1 107 l2 120 l3 130 l4 150 l5=lwt, marginal
      . logit low l1-l5
      Logistic regression                               Number of obs   =        189
                                                        LR chi2(5)      =       9.90
                                                        Prob > chi2     =     0.0781
      Log likelihood = -112.38526                       Pseudo R2       =     0.0422
      ------------------------------------------------------------------------------
               low |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
                l1 |  -.0260585   .0409074    -0.64   0.524    -.1062356    .0541185
                l2 |   -.055212    .081114    -0.68   0.496    -.2141925    .1037685
                l3 |   .1424752   .1001686     1.42   0.155    -.0538516     .338802
                l4 |  -.0732678   .0852176    -0.86   0.390    -.2402912    .0937556
                l5 |  -.0004225   .0449774    -0.01   0.993    -.0885765    .0877316
             _cons |   2.493599   4.084743     0.61   0.542     -5.51235    10.49955
      ------------------------------------------------------------------------------
      . est store C
      . lrtest C A
      Likelihood-ratio test                                  LR chi2(4)  =      3.92
      (Assumption: A nested in B)                            Prob > chi2 =    0.4169

  37. Sensitivity to choice of intervals
      . gen t1=(lwt>100)*(lwt-100)
      . gen t2=(lwt>125)*(lwt-125)
      . gen t3=(lwt>150)*(lwt-150)
      . gen t4=(lwt>175)*(lwt-175)
      . logistic low lwt t1 t2 t3 t4
      . logit low lwt t1 t2 t3 t4
      Logistic regression                               Number of obs   =        189
                                                        LR chi2(5)      =       8.70
                                                        Prob > chi2     =     0.1216
      Log likelihood = -112.98503                       Pseudo R2       =     0.0371
      ------------------------------------------------------------------------------
               low |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               lwt |  -.0209016    .062629    -0.33   0.739    -.1436521     .101849
                t1 |   -.023866   .0785323    -0.30   0.761    -.1777865    .1300544
                t2 |    .065341   .0469319     1.39   0.164    -.0266438    .1573259
                t3 |  -.0494432   .0615913    -0.80   0.422    -.1701599    .0712735
                t4 |   .0204834   .0587547     0.35   0.727    -.0946738    .1356405
             _cons |   1.985505   6.026281     0.33   0.742     -9.82579     13.7968
      ------------------------------------------------------------------------------
      . predict lsp2, p
      . twoway (scatter low lwt) (line lp lwt, sort lcol(red)) (line lsp lwt, sort clpat(dash) lcol(green) ) (line lsp2 lwt, sort clpat(dash_dot) lcol(blue) ), scheme(s1mono) legend(off) xtitle(Pre-pregnancy wt (lbs)) ytitle(Prob of low birthweight)

  39. Cubic spline
      . spline low lwt, knots(107,120,130,150) regress(logit) gen(csp)
      . twoway (scatter low lwt) (line lp lwt, sort lcol(red)) (line lsp lwt, sort clpat(dash_dot) lcol(green)) (line csp lwt, sort clpat(dash)) , scheme(s1mono) legend(off) xtitle(Pre-pregnancy wt (lbs)) ytitle(Prob of low birthweight)

  40. Sensitivity to choice of intervals
      . spline low lwt, n(3) regress(logistic) gen(csp2)
      . twoway (scatter low lwt) (line lp lwt, sort lcol(red)) (line csp2 lwt, sort clpat(dash_dot) lcol(pink)) (line csp lwt, sort clpat(dash)) , scheme(s1mono) legend(off) xtitle(Pre-pregnancy wt (lbs)) ytitle(Prob of low birthweight)

  41. Fractional polynomials
      Royston P, Ambler G, Sauerbrei W. The use of fractional polynomials to model continuous risk variables in epidemiology. Int J Epidemiol, 1999; 28: 964-974.
      Royston P, Altman DG. Regression using fractional polynomials of continuous covariates: parsimonious modelling. Applied Statistics, 1994; 43: 429-467.
      Sauerbrei W, Royston P. Building multivariable prognostic and diagnostic models: transformation of the predictors by using fractional polynomials. J R Statist Soc A, 1999; 162: 71-94.

  42. Fractional polynomials
      . fracpoly logistic low lwt
      ........
      -> gen double Ilwt__1 = X^-2-.5934053858 if e(sample)
      -> gen double Ilwt__2 = X^-2*ln(X)-.1548424581 if e(sample)
         (where: X = lwt/100)
      Logistic regression                               Number of obs   =        189
                                                        LR chi2(2)      =       7.51
                                                        Prob > chi2     =     0.0234
      Log likelihood = -113.58167                       Pseudo R2       =     0.0320
      ------------------------------------------------------------------------------
               low | Odds Ratio  Std. Err.        z   P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
           Ilwt__1 |   3.044759   5.104863     0.66   0.507     .1138735    81.41099
           Ilwt__2 |   .2034832   .9615213    -0.34   0.736     .0000193    2141.511
      ------------------------------------------------------------------------------
      Deviance: 227.16. Best powers of lwt among 44 models fit: -2 -2.
      . predict fp,p
      . twoway (scatter low lwt, symbol(x)) (line lp lwt, sort lcol(red))(line fp lwt, sort clpat(dash_dot) lcol(green)) (line csp lwt, sort clpat(dash)), scheme(s1mono) legend(off) xtitle(Pre-pregnancy wt (lbs)) ytitle(Prob of low birthweight)
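
The two generated covariates show how a degree-2 fractional polynomial with a repeated power is built: the first term is X^p and the second is X^p·ln(X), each shifted by a constant. The Python sketch below (not Stata) evaluates both terms. The function name is hypothetical, and reading the constants as centering values is an interpretation, checked only by confirming that both terms vanish at the same weight.

```python
import math

C1 = 0.5934053858   # constant subtracted from X^-2 in the fracpoly output
C2 = 0.1548424581   # constant subtracted from X^-2 * ln(X)

def fp2_repeated_power_terms(lwt, power=-2, scale=100.0):
    """Shifted FP2 terms for a repeated power p, with X = lwt / scale."""
    x = lwt / scale
    xp = x ** power
    return xp - C1, xp * math.log(x) - C2

# Back-solve the weight at which the first term is exactly zero
centre_lwt = 100.0 * C1 ** (-0.5)
term1, term2 = fp2_repeated_power_terms(centre_lwt)
print(round(centre_lwt, 1), term1, term2)
```

Both terms are essentially zero at the same weight (about 130 lb), which is consistent with the covariates being centered at a single reference value of lwt, plausibly the sample mean.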

  44. . fracpoly logistic low lwt, degree(3) compare
      ............................................
      -> gen double Ilwt__1 = X^3-2.187624479 if e(sample)
      -> gen double Ilwt__2 = X^3*ln(X)-.5708359916 if e(sample)
      -> gen double Ilwt__3 = X^3*ln(X)^2-.1489532287 if e(sample)
         (where: X = lwt/100)
      Logistic regression                               Number of obs   =        189
                                                        LR chi2(3)      =       9.32
                                                        Prob > chi2     =     0.0253
      Log likelihood = -112.67397                       Pseudo R2       =     0.0397
      ------------------------------------------------------------------------------
               low | Odds Ratio  Std. Err.        z   P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
           Ilwt__1 |   .0065153   .0166058    -1.97   0.048     .0000441    .9625743
           Ilwt__2 |   46656.28   277007.2     1.81   0.070     .4122545    5.28e+09
           Ilwt__3 |   .0013849   .0055176    -1.65   0.099     5.63e-07    3.408812
      ------------------------------------------------------------------------------
      Deviance: 225.35. Best powers of lwt among 164 models fit: 3 3 3.
      Fractional polynomial model comparisons:
      ---------------------------------------------------------------
      lwt               df       Deviance     Gain   P(term) Powers
      ---------------------------------------------------------------
      Not in model       0        234.672      --     --
      Linear             1        228.691    0.000    0.014  1
      m = 1              2        227.276    1.414    0.234  -2
      m = 2              4        227.163    1.527    0.945  -2 -2
      m = 3              6        225.348    3.343    0.403  3 3 3
      ---------------------------------------------------------------
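
The P(term) column can be reproduced, to rounding, by comparing each row's deviance with the row above it and referring the difference to a chi-square distribution on the difference in df. This reading is inferred from the numbers in the table rather than stated on the slide. A Python sketch (not Stata), with a hypothetical helper that covers only the 1 df and 2 df cases needed here:

```python
import math

def chi2_tail(stat, df):
    """Upper-tail chi-square probability, using closed forms for df = 1 and df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch handles only df = 1 or df = 2")

# (label, df, deviance) rows from the comparison table
rows = [("Not in model", 0, 234.672), ("Linear", 1, 228.691),
        ("m = 1", 2, 227.276), ("m = 2", 4, 227.163), ("m = 3", 6, 225.348)]

p_terms = {}
for (_, df0, dev0), (label, df1, dev1) in zip(rows, rows[1:]):
    # Deviance drop relative to the next-simpler model
    p_terms[label] = chi2_tail(dev0 - dev1, df1 - df0)
    print(label, round(dev0 - dev1, 3), round(p_terms[label], 3))
```

Only the linear term clears a conventional 0.05 threshold here; none of the fractional polynomial steps improves the deviance appreciably, which matches the small values in the Gain column.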

  45. Selection of variables

  46. Modeling associations

  47. Strategies
      • Fit all known confounders
      • Significance testing
      • Change of estimate rule
      A sensible strategy:
      • All variables that are logically confounders should be examined for their effects as confounders by controlling for them.
      • If the confounder alters the estimate of interest or its standard error to an important degree, include it in the model.
      • If the confounder does not appreciably alter the estimate, then include it in the model if:
        • it is traditional (e.g. gender or age)
        • it is statistically significant
        • the estimates are believable
        • there are not too many other confounders in the model
      • Assess the role of exposure and possible effect modification by means of hypothesis testing.
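
The change-of-estimate rule can be illustrated with the Framingham cholesterol odds ratios reported earlier, comparing the model without confounders to the model adjusted for sex, age and their interaction. This is a Python sketch (not Stata). The 10% cut-off is an assumption used only for illustration, since the slide gives no threshold, and the function name is hypothetical.

```python
def percent_change(crude_or, adjusted_or):
    """Percent change in the odds ratio, relative to the adjusted estimate."""
    return 100.0 * (crude_or - adjusted_or) / adjusted_or

# (crude OR, OR adjusted for sex, age and sex*age), from the Stata output
estimates = {
    "_Ichol_1": (1.408998, 1.265023),
    "_Ichol_2": (2.361255, 1.959199),
    "_Ichol_3": (3.811035, 3.039625),
}

THRESHOLD = 10.0  # assumed cut-off in percent; not taken from the lecture

for name, (crude, adjusted) in estimates.items():
    change = percent_change(crude, adjusted)
    print(name, round(change, 1), abs(change) > THRESHOLD)
```

All three cholesterol odds ratios move by more than the assumed threshold, so under this rule the confounders would stay in the model regardless of their own significance tests.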
