
Modeling the association between a binary outcome, Y, and an “exposure”, X



Presentation Transcript


  1. Modeling the association between a binary outcome, Y, and an “exposure”, X Slides are from Research Professor M. Thompson BIOST 536 Thompson

  2. We might want to model pX = P(Y=1 | X). What are the characteristics of pX?
     • 0 ≤ pX ≤ 1
     • pX is possibly monotone in X

  3. [Figure: the logit and probit transforms of p (vertical axis "Transform of p", from -5 to 5) plotted against Probability (horizontal axis, 0 to 1).] Model: g(pX) = β0 + β1 X
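The two link functions named on this slide can be tabulated directly. The following is a minimal Python sketch (the slides themselves use Stata); the function names `logit` and `probit` are chosen here for illustration, and the probit is taken to be the inverse standard normal CDF.

```python
import math
from statistics import NormalDist


def logit(p):
    """Log odds: ln(p / (1 - p)), defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def probit(p):
    """Inverse standard normal CDF, defined for 0 < p < 1."""
    return NormalDist().inv_cdf(p)


if __name__ == "__main__":
    # Both transforms map (0, 1) onto the whole real line and are increasing in p.
    for p in (0.01, 0.2, 0.4, 0.5, 0.6, 0.8, 0.99):
        print(f"p={p:4.2f}  logit={logit(p):7.3f}  probit={probit(p):7.3f}")
```

Both transforms are zero at p = 0.5 and unbounded as p approaches 0 or 1, which is why a linear predictor β0 + β1 X on the transformed scale always yields a probability between 0 and 1.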

  4. Logistic regression with a single binary risk factor

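  5. Cohort or cross-sectional study. In a 2×2 table with cells a (Y=1, X=1), b (Y=1, X=0), c (Y=0, X=1), d (Y=0, X=0) and column totals m1 = a + c, m0 = b + d:
     • a/m1 estimates P(Y=1 | X=1)
     • b/m0 estimates P(Y=1 | X=0)
     • ad/(bc) estimates the odds ratio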

  6. Under the logistic model: logit(P(Y=1|X)) = β0 + β1 X
     ln(OR) = ln(Ψ) = logit(P(Y=1|X=1)) - logit(P(Y=1|X=0)) = β0 + β1 - β0 = β1, i.e. Ψ = exp(β1)
     And: logit(P(Y=1|X=0)) = β0, so P(Y=1|X=0) is estimated by exp(β̂0) / (1 + exp(β̂0))

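  7. The logistic equations: $P(Y=1 \mid X) = \dfrac{\exp(\beta_0 + \beta_1 X)}{1 + \exp(\beta_0 + \beta_1 X)}$. For binary X: $P(Y=1 \mid X=1) = \dfrac{\exp(\beta_0 + \beta_1)}{1 + \exp(\beta_0 + \beta_1)}$ and $P(Y=1 \mid X=0) = \dfrac{\exp(\beta_0)}{1 + \exp(\beta_0)}$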

  8. Case-control study. Let Z = 1 if the individual was sampled, Z = 0 otherwise. Define π1 = P(Z=1 | Y=1) and π0 = P(Z=1 | Y=0). Let pZ(X) = P(Y=1 | X, Z=1).

  9. We can model: logit(pZ(X))

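  10. If we model logit(pZ(X)) = α + β1 X, then ln(Ψ) = β1, or Ψ = exp(β1), as before. But: α = β0 + ln(π1/π0), so the intercept no longer equals β0, and P(Y=1 | X) cannot be recovered from a case-control sample unless the sampling fractions π1 and π0 are known.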
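The relation between the case-control intercept and β0 follows from Bayes' rule. The steps below are a sketch under the assumption, implicit in the definitions of π1 and π0, that the chance of being sampled depends on Y only and not on X; p(X) is shorthand for P(Y=1 | X) under the population model logit(p(X)) = β0 + β1 X.

```latex
\begin{align*}
p_Z(X) = P(Y=1 \mid X, Z=1)
  &= \frac{\pi_1\, p(X)}{\pi_1\, p(X) + \pi_0\,[1 - p(X)]} \\[4pt]
\frac{p_Z(X)}{1 - p_Z(X)}
  &= \frac{\pi_1}{\pi_0} \cdot \frac{p(X)}{1 - p(X)} \\[4pt]
\operatorname{logit}\bigl(p_Z(X)\bigr)
  &= \ln\!\left(\frac{\pi_1}{\pi_0}\right) + \beta_0 + \beta_1 X
   = \alpha + \beta_1 X,
  \qquad \alpha = \beta_0 + \ln\!\left(\frac{\pi_1}{\pi_0}\right)
\end{align*}
```

The slope β1, and therefore the odds ratio exp(β1), is unchanged by the sampling; only the intercept is shifted.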

  11. Parameter estimation: maximum likelihood. We choose the estimate of the parameters that makes the observed data most likely to have occurred. Take the simple setting of a cross-sectional study in which we want to estimate the prevalence of a disease. Say we take a random sample of N individuals and w of them have the disease. The common-sense estimate of the prevalence of disease is p̂ = w/N.

  12. The likelihood. Let w = number diseased among N independent individuals and let the true disease prevalence in the population be p. Then the likelihood of observing w diseased individuals in N is given by: $L(p) = \binom{N}{w}\, p^{w} (1-p)^{N-w}$

  13. We want to choose the value of p that maximizes the likelihood or, equivalently, the log of the likelihood: $l(p) = \ln\binom{N}{w} + w \ln p + (N-w)\ln(1-p)$. Taking the derivative of l with respect to p: $\dfrac{dl}{dp} = \dfrac{w}{p} - \dfrac{N-w}{1-p}$. Setting the derivative equal to zero and solving for p: $\hat{p} = w/N$

  14. In a study involving 53 men with prostate cancer, 20 of the men had nodal involvement. How do we estimate the chance of nodal involvement?
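For this example the maximum likelihood estimate is simply 20/53. The Python sketch below (not part of the original Stata-based slides) checks this numerically by evaluating the binomial log likelihood on a grid; the names `log_lik`, `p_hat` and `p_grid` are illustrative.

```python
import math


def log_lik(p, w, n):
    """Binomial log likelihood: ln C(n, w) + w ln p + (n - w) ln(1 - p)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(w + 1) - math.lgamma(n - w + 1)
    return log_choose + w * math.log(p) + (n - w) * math.log(1.0 - p)


w, n = 20, 53            # men with nodal involvement, total men in the study
p_hat = w / n            # closed-form MLE from setting dl/dp = 0

# Brute-force check: the grid point with the largest log likelihood
# should sit next to w/n (grid spacing is 0.001).
grid = [i / 1000 for i in range(1, 1000)]
p_grid = max(grid, key=lambda p: log_lik(p, w, n))

if __name__ == "__main__":
    print(f"closed form: {p_hat:.4f}   grid search: {p_grid:.3f}")
```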

  15. Using MLE in the logistic regression setting with a single covariate, X: say we have N observations (Yi, Xi), i=1,2,…,N, where Y denotes disease status (0 = non-diseased, 1 = diseased) and X is a risk factor of interest. Let p(X) denote P(Y=1 | X). Then the contribution of observation i to the likelihood is $p(X_i)^{Y_i}\,[1 - p(X_i)]^{1-Y_i}$

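  16. $L = \prod_{i=1}^{N} p(X_i)^{Y_i}\,[1 - p(X_i)]^{1-Y_i}$ and $l = \ln(L) = \sum_{i=1}^{N} \{ Y_i \ln p(X_i) + (1 - Y_i) \ln[1 - p(X_i)] \}$. Alternative (binomial) formulation: if X takes on n different values, Xj, j=1,2,…,n, and, for each Xj, there are nj subjects, where $\sum_{j=1}^{n} n_j = N$, of whom yj are “diseased”, we can represent the log likelihood (up to an additive constant) as $l = \sum_{j=1}^{n} \{ y_j \ln p(X_j) + (n_j - y_j) \ln[1 - p(X_j)] \}$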

  17. If we model logit(p(X)) = β0 + β1 X then, for a single dichotomous risk factor, X, as in Table A, the maximum likelihood estimate of β0 is ln(b/d) and of β1 is ln(ad/bc); hence the maximum likelihood estimate of P(Y=1 | X=1) is a/m1 and of P(Y=1 | X=0) is b/m0.

  18. Hypothesis testing and confidence intervals. Say we want to establish whether tumor size affects the chance of nodal involvement in men with prostate cancer.
     Nodal       |        Tumor
     involvement |    large     small |     Total
     ------------+--------------------+----------
     Yes         |       15         5 |        20
                 |      56%       19% |       38%
     ------------+--------------------+----------
     No          |       12        21 |        33
                 |      44%       81% |       62%
     ------------+--------------------+----------
     Total       |       27        26 |        53

  19. Consider logit(P(nodal involvement | tumor size = X)) = β0 + β1 X. The maximum likelihood estimate of β1 is ln(15×21/(5×12)) = ln(5.25) = 1.66. Hence the OR is estimated by e^1.66 = 5.25 (= 15×21/(5×12)). How do we test the statistical significance of the OR? How do we calculate a confidence interval?
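The closed-form estimates for a single binary risk factor can be checked against the tumor-size table. This is a small Python sketch rather than the slides' Stata; the cell names a, b, c, d follow the ln(b/d) and ln(ad/bc) formulas given earlier in the deck, and `expit` is an illustrative helper name for the inverse logit.

```python
import math

# Cells of the 2x2 table: nodal involvement (Y) by tumor size (X = 1 for large).
a, b = 15, 5      # Y = 1: X = 1, X = 0
c, d = 12, 21     # Y = 0: X = 1, X = 0


def expit(z):
    """Inverse of the logit: exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + math.exp(-z))


beta0 = math.log(b / d)                # log odds of Y = 1 when X = 0
beta1 = math.log((a * d) / (b * c))    # log odds ratio
odds_ratio = math.exp(beta1)

if __name__ == "__main__":
    print(f"beta0 = {beta0:.6f}, beta1 = {beta1:.6f}, OR = {odds_ratio:.2f}")
    print(f"P(Y=1 | X=1) = {expit(beta0 + beta1):.4f}  (a/m1 = {a / (a + c):.4f})")
    print(f"P(Y=1 | X=0) = {expit(beta0):.4f}  (b/m0 = {b / (b + d):.4f})")
```

The fitted probabilities reproduce the observed column proportions exactly because the model has as many parameters as there are exposure groups.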

  20. H0: β1 = 0 <=> H0: OR = Ψ = 1

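  21. The deviance compares observed to predicted values via the likelihood: D = -2 ln[ L(fitted model) / L(saturated model) ], where the saturated model is the one that reproduces the observed data exactly. To assess the role of X in the logistic model logit(P(Y=1|X)) = β0 + β1 X, we can consider G = D(model without X) - D(model with X) = -2 ln[ L(model without X) / L(model with X) ]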

  22. Let Y = nodal involvement in prostate cancer, X = tumor size. We estimate logit(P(Y=1|X)) = -1.44 + 1.66 X and OR = Ψ = 5.25, with ln L = -31.276. Under the null model, logit(P(Y=1)) = constant, ln L = -35.126. Under the hypothesis H0: β1 = 0, G has a χ² distribution with 1 degree of freedom. Here G = -2 × (-35.126 + 31.276) = 7.7.
     LR test: P(χ²₁ > 7.70) = .0055
     Score test: P(χ²₁ > 7.44) = .0064
     Wald test: P(χ²₁ > 6.92) = .0085 (Stata reports the rounded value 0.009)
     • Stata gives the LR test for the fitted model versus the null model
     • Stata does not do the Score test easily
     • Stata gives the single-parameter Wald test
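The three test statistics quoted here can be recomputed from the four cell counts. The Python sketch below makes two assumptions that are not stated on the slides: the Wald standard error uses the familiar sqrt(1/a + 1/b + 1/c + 1/d) form for a 2×2 table, and the score statistic is taken to be Pearson's chi-square without continuity correction, which reproduces the 7.44 quoted above. All variable names are illustrative.

```python
import math

a, b = 15, 5      # nodal involvement: large tumor (X = 1), small tumor (X = 0)
c, d = 12, 21     # no involvement:    large tumor (X = 1), small tumor (X = 0)
m1, m0, n = a + c, b + d, a + b + c + d


def chi2_1_upper_tail(x):
    """P(chi-square with 1 df > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))


# Log likelihoods with fitted probabilities equal to the observed proportions.
ll_full = (a * math.log(a / m1) + c * math.log(c / m1)
           + b * math.log(b / m0) + d * math.log(d / m0))
ll_null = (a + b) * math.log((a + b) / n) + (c + d) * math.log((c + d) / n)

g_lr = -2.0 * (ll_null - ll_full)                      # likelihood ratio statistic

beta1 = math.log(a * d / (b * c))
se_beta1 = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)    # assumed 2x2 shortcut
wald = (beta1 / se_beta1) ** 2                         # Wald statistic

score = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * m1 * m0)  # Pearson form

if __name__ == "__main__":
    for name, stat in (("LR", g_lr), ("Score", score), ("Wald", wald)):
        print(f"{name:5s} chi2(1) = {stat:.2f}   p = {chi2_1_upper_tail(stat):.4f}")
```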

  23. Stata code
     . logistic node tumor
     Logistic regression                               Number of obs   =         53
                                                       LR chi2(1)      =       7.70
                                                       Prob > chi2     =     0.0055
     Log likelihood = -31.276312                       Pseudo R2       =     0.1096
     ------------------------------------------------------------------------------
             node | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
     -------------+----------------------------------------------------------------
            tumor |       5.25   3.310487     2.63   0.009      1.52552    18.06761
     ------------------------------------------------------------------------------
     . logit
     ------------------------------------------------------------------------------
             node |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
     -------------+----------------------------------------------------------------
            tumor |   1.658228    .630569     2.63   0.009     .4223355    2.894121
            _cons |  -1.435085   .4976116    -2.88   0.004    -2.410385   -.4597837
     ------------------------------------------------------------------------------
     Pseudo R2 = 1 - lM/l0, where lM and l0 are the log likelihoods of the fitted and null models (here 1 - 31.276/35.126 = 0.1096)

  24. The information matrix. Maximum likelihood theory states that variance estimators for maximum likelihood estimates can be derived from the matrix of second partial derivatives of the log likelihood. The negative of this matrix is called the information matrix, I, and the estimated variances and covariances of the parameter estimates are obtained from the inverse of I.

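  25. Let X be the N × 2 design matrix whose i'th row is (1, Xi), let β = (β0, β1)', and let V = diag{p̂i (1 - p̂i)}, the N × N diagonal matrix in which p̂i is the fitted value of P(Y=1 | Xi).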

  26. Then I = X' V X and it can be shown that, approximately, β̂ ~ N(β, I⁻¹), and so an approximate 95% CI for, e.g., β1 is given by β̂1 ± 1.96 SE(β̂1), where SE(β̂1) is the square root of the corresponding diagonal element of I⁻¹. Hence a 95% CI for the OR is obtained by exponentiating the CI for β1.
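For the tumor-size example the information matrix is only 2 × 2, so I = X' V X can be built and inverted by hand. The Python sketch below assumes the usual logistic form for V, with diagonal entries p̂(1 - p̂), and groups the 53 subjects by exposure level; the variable names are illustrative and the printed values can be compared with the Stata output shown earlier.

```python
import math
from statistics import NormalDist

# (x, number of subjects, number with Y = 1) for each exposure group.
groups = [(1, 27, 15), (0, 26, 5)]

# Accumulate I = X'VX, where each subject contributes p(1 - p) * [[1, x], [x, x^2]].
i00 = i01 = i11 = 0.0
for x, n_j, y_j in groups:
    p_hat = y_j / n_j                  # fitted probability = observed proportion
    v = n_j * p_hat * (1.0 - p_hat)
    i00 += v
    i01 += v * x
    i11 += v * x * x

# Invert the symmetric 2x2 matrix [[i00, i01], [i01, i11]].
det = i00 * i11 - i01 ** 2
se_beta0 = math.sqrt(i11 / det)
se_beta1 = math.sqrt(i00 / det)

beta1 = math.log((15 * 21) / (5 * 12))
z = NormalDist().inv_cdf(0.975)        # about 1.96
ci_beta1 = (beta1 - z * se_beta1, beta1 + z * se_beta1)
ci_or = (math.exp(ci_beta1[0]), math.exp(ci_beta1[1]))

if __name__ == "__main__":
    print(f"SE(beta0) = {se_beta0:.6f}, SE(beta1) = {se_beta1:.6f}")
    print(f"95% CI for beta1: ({ci_beta1[0]:.4f}, {ci_beta1[1]:.4f})")
    print(f"95% CI for OR:    ({ci_or[0]:.4f}, {ci_or[1]:.4f})")
```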

  27. Interpretation of coefficients. Dichotomous X (coded 0 or 1): here OR = exp(β1), or β1 = ln(OR). The interpretation of β0 depends on the study design.

  28. Polytomous X

  29. Polytomous X with k categories. We define X1, X2, …, Xk-1 as dummy 0-1 design variables and consider the model: logit(P(Y=1 | X)) = β0 + β1 X1 + β2 X2 + … + βk-1 Xk-1. Then exp(βj) is the odds ratio for the j'th category of X relative to the baseline category.

  30. Stata code:
     . input chd smoke count
       1 3 39
       1 2 50
       1 1 70
       1 0 98
       0 3 253
       0 2 355
       0 1 735
       0 0 1554
     . end

  31. . xi: logit chd i.smoke [fweight = count]
     i.smoke           _Ismoke_0-3         (naturally coded; _Ismoke_0 omitted)
     Iteration 0:   log likelihood = -890.62187
     Iteration 1:   log likelihood = -876.52013
     Iteration 2:   log likelihood = -875.84853
     Iteration 3:   log likelihood = -875.84738
     Logistic regression                               Number of obs   =       3154
                                                       LR chi2(3)      =      29.55
                                                       Prob > chi2     =     0.0000
     Log likelihood = -875.84738                       Pseudo R2       =     0.0166
     ------------------------------------------------------------------------------
              chd |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
     -------------+----------------------------------------------------------------
        _Ismoke_1 |   .4122448   .1627693     2.53   0.011     .0932229    .7312667
        _Ismoke_2 |   .8035253   .1834786     4.38   0.000     .4439138    1.163137
        _Ismoke_3 |   .8937922   .2010989     4.44   0.000     .4996455    1.287939
            _cons |   -2.76362   .1041517   -26.53   0.000    -2.967754   -2.559486
     ------------------------------------------------------------------------------

  32. . xi: logistic chd i.smoke [fweight=count]
     i.smoke           _Ismoke_0-3         (naturally coded; _Ismoke_0 omitted)
     Logistic regression                               Number of obs   =       3154
                                                       LR chi2(3)      =      29.55
                                                       Prob > chi2     =     0.0000
     Log likelihood = -875.84738                       Pseudo R2       =     0.0166
     ------------------------------------------------------------------------------
              chd | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
     -------------+----------------------------------------------------------------
        _Ismoke_1 |   1.510204   .2458148     2.53   0.011     1.097706    2.077711
        _Ismoke_2 |     2.2334   .4097812     4.38   0.000     1.558796    3.199955
        _Ismoke_3 |   2.444382   .4915626     4.44   0.000     1.648137    3.625307
     ------------------------------------------------------------------------------

  33. . expand count
     (3146 observations created)
     . xi: logit chd i.smoke
     Logistic regression                               Number of obs   =       3154
                                                       LR chi2(3)      =      29.55
                                                       Prob > chi2     =     0.0000
     Log likelihood = -875.84738                       Pseudo R2       =     0.0166
     ------------------------------------------------------------------------------
              chd |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
     -------------+----------------------------------------------------------------
        _Ismoke_1 |   .4122448   .1627693     2.53   0.011     .0932229    .7312667
        _Ismoke_2 |   .8035253   .1834786     4.38   0.000     .4439138    1.163137
        _Ismoke_3 |   .8937922   .2010989     4.44   0.000     .4996455    1.287939
            _cons |   -2.76362   .1041517   -26.53   0.000    -2.967754   -2.559486
     ------------------------------------------------------------------------------
     . xi: logistic chd i.smoke
     ------------------------------------------------------------------------------
              chd | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
     -------------+----------------------------------------------------------------
        _Ismoke_1 |   1.510204   .2458148     2.53   0.011     1.097706    2.077711
        _Ismoke_2 |     2.2334   .4097812     4.38   0.000     1.558796    3.199955
        _Ismoke_3 |   2.444382   .4915626     4.44   0.000     1.648137    3.625307
     ------------------------------------------------------------------------------

  34. . lincom _Ismoke_2 - _Ismoke_1, or
      ( 1)  - _Ismoke_1 + _Ismoke_2 = 0
     ------------------------------------------------------------------------------
              chd | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
     -------------+----------------------------------------------------------------
              (1) |   1.478873   .2900367     2.00   0.046     1.006916    2.172044
     ------------------------------------------------------------------------------
     . lincom _Ismoke_3 - _Ismoke_2, or
      ( 1)  - _Ismoke_2 + _Ismoke_3 = 0
     ------------------------------------------------------------------------------
              chd | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
     -------------+----------------------------------------------------------------
              (1) |   1.094466   .2505588     0.39   0.693      .698771    1.714234
     ------------------------------------------------------------------------------
     . lincom _Ismoke_3 - _Ismoke_1, or
      ( 1)  - _Ismoke_1 + _Ismoke_3 = 0
     ------------------------------------------------------------------------------
              chd | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
     -------------+----------------------------------------------------------------
              (1) |   1.618577   .3442644     2.26   0.024     1.066809    2.455728
     ------------------------------------------------------------------------------
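Because the model with dummy variables for every smoking category is saturated, the fitted odds ratios equal the ratios of observed odds, so the Stata results above can be reproduced from the eight counts alone. This is a Python sketch with illustrative names (`counts`, `odds`, `odds_ratio`); the comparison between two non-baseline categories corresponds to what `lincom ..., or` reports.

```python
# Counts keyed by (chd, smoke), as entered with Stata's -input- command.
counts = {
    (1, 3): 39, (1, 2): 50, (1, 1): 70, (1, 0): 98,
    (0, 3): 253, (0, 2): 355, (0, 1): 735, (0, 0): 1554,
}


def odds(level):
    """Observed odds of CHD in one smoking category."""
    return counts[(1, level)] / counts[(0, level)]


def odds_ratio(level, reference=0):
    """Odds ratio for `level` relative to `reference` (baseline 0 by default)."""
    return odds(level) / odds(reference)


if __name__ == "__main__":
    for level in (1, 2, 3):
        print(f"smoke {level} vs 0: OR = {odds_ratio(level):.6f}")
    # Contrasts between non-baseline categories, as in the lincom output.
    for hi, lo in ((2, 1), (3, 2), (3, 1)):
        print(f"smoke {hi} vs {lo}: OR = {odds_ratio(hi, lo):.6f}")
```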

  35. Continuous X. Here the interpretation of β1 depends on the units of X. If the logit is linear in X, then β1 represents the change in log odds for a 1-unit increase in X, and exp(β1) is the odds ratio corresponding to a 1-unit increase in X.

  36. Example: effect of age on nodal involvement in prostate cancer
     . logistic node age
     Logit estimates                                   Number of obs   =         53
                                                       LR chi2(1)      =       1.09
                                                       Prob > chi2     =     0.2965
     Log likelihood = -34.581125                       Pseudo R2       =     0.0155
     ------------------------------------------------------------------------------
         node | Odds Ratio   Std. Err.       z     P>|z|       [95% Conf. Interval]
     ---------+--------------------------------------------------------------------
          age |   .9526993   .0445086     -1.037   0.300       .8693389    1.044053
     ------------------------------------------------------------------------------
     . logit
     ------------------------------------------------------------------------------
         node |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
     ---------+--------------------------------------------------------------------
          age |   -.048456   .0467184     -1.037   0.300      -.1400223    .0431104
        _cons |   2.366605   2.770912      0.854   0.393      -3.064283    7.797493
     ------------------------------------------------------------------------------

  37. NOTES
     • The OR for nodal involvement corresponding to a ten-year age difference is exp(10 βAGE), estimated by .953^10 = .62
     • The 95% CI for 10 βAGE (the log of the 10-year OR) is given by 10 × (-.1400, .0431) = (-1.40, .43)
     • Hence the 95% CI for the 10-year OR is given by (.25, 1.54)
     • This OR is the same comparing 40-year-olds with 30-year-olds as comparing 60-year-olds with 50-year-olds, etc.
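The ten-year odds ratio and its interval follow from rescaling the age coefficient and its standard error before exponentiating. A short Python sketch, using the coefficient and standard error printed in the Stata output for age (the variable names are illustrative):

```python
import math
from statistics import NormalDist

beta_age = -0.048456     # log odds ratio per 1-year increase in age
se_age = 0.0467184       # its standard error
years = 10

z = NormalDist().inv_cdf(0.975)

# Scale on the log-odds scale first, then exponentiate.
log_or_10 = years * beta_age
ci_log_or_10 = (years * (beta_age - z * se_age), years * (beta_age + z * se_age))

or_10 = math.exp(log_or_10)
ci_or_10 = (math.exp(ci_log_or_10[0]), math.exp(ci_log_or_10[1]))

if __name__ == "__main__":
    print(f"10-year OR = {or_10:.2f}")
    print(f"95% CI for 10*beta: ({ci_log_or_10[0]:.2f}, {ci_log_or_10[1]:.2f})")
    print(f"95% CI for 10-year OR: ({ci_or_10[0]:.2f}, {ci_or_10[1]:.2f})")
```

Exponentiating the one-year interval and then multiplying by ten would be wrong; the scaling has to happen on the log-odds scale.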

  38. Multiple logistic regression: logit(P(Y=1 | X1, X2, …, Xk)) = β0 + β1 X1 + β2 X2 + … + βk Xk

  39. Estimation. Assume we have N observations (Yi, Xi1, Xi2, …, Xik), i=1,2,…,N. As before, we can use maximum likelihood to obtain estimates of β0, β1, β2, …, βk that maximize the likelihood $L = \prod_{i=1}^{N} p_i^{Y_i} (1 - p_i)^{1 - Y_i}$, where $p_i = P(Y=1 \mid X_{i1}, \ldots, X_{ik})$, and we can estimate the variances and covariances of the estimates from the inverse of the information matrix, I.

  40. Hypothesis testing. The Wald, likelihood ratio and score tests generalize to the case of k X variables. In general:
     Full model: logit(p) = β0 + β1 X1 + β2 X2 + … + βk Xk
     Reduced model: logit(p) = β0 + β1 X1 + β2 X2 + … + βp Xp, p < k
     H0: βp+1 = βp+2 = … = βk = 0
     Ha: βj ≠ 0 for at least one j in p+1, …, k

  41. Likelihood ratio test. LR statistic = -2[ln L(reduced) - ln L(full)] = Deviance(reduced) - Deviance(full). Approximate distribution under H0: χ² with k-p degrees of freedom. We must fit two models to calculate the LR statistic. Stata provides the LR test of the current model relative to the null model: H0: β1 = β2 = … = βk = 0

  42. Score test
     • If H0 implies β = β*, then Score statistic = S(β*)' I⁻¹ S(β*), where S denotes the score vector (the first derivatives of the log likelihood) and I denotes the information matrix, both evaluated at β*
     • Approximate distribution under H0: χ² with k-p degrees of freedom
     • Only the reduced model needs to be fitted to calculate the Score statistic
     • Stata does not perform the Score test easily

  43. Wald test
     • For a single parameter: β̂j / SE(β̂j) ~ N(0,1) under H0: βj = 0
     • The Wald test can be generalized to multiple parameters, where it also follows a χ² distribution with k-p degrees of freedom under H0
     • Most confidence intervals are based on the Wald test statistic

  44. LR tests using Stata. In general, fit the "full" model, then:
     . est store A
     This saves the log-likelihood from the most recently fitted model and labels it "A". Fit the reduced model, then:
     . est store B
     This saves the log-likelihood from the most recently fitted model and labels it "B". Carry out the LR test comparing the "full" model (A) with the reduced model (B):
     . lrtest A B, stats

  45. Example: prostate cancer study

  46. Fitting “full” model:
     . logistic node tsize xray
     Logistic regression                               Number of obs   =         53
                                                       LR chi2(2)      =      16.90
                                                       Prob > chi2     =     0.0002
     Log likelihood = -26.676709                       Pseudo R2       =     0.2405
     ------------------------------------------------------------------------------
         node | Odds Ratio   Std. Err.       z     P>|z|       [95% Conf. Interval]
     ---------+--------------------------------------------------------------------
        tsize |   4.895297   3.426809      2.269   0.023       1.241425    19.30357
         xray |   8.326496   6.218498      2.838   0.005       1.926448     35.9888
     ------------------------------------------------------------------------------
     . logit
     ------------------------------------------------------------------------------
         node |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
     ---------+--------------------------------------------------------------------
        tsize |   1.588275   .7000206      2.269   0.023       .2162598     2.96029
         xray |   2.119443   .7468325      2.838   0.005       .6556779    3.583208
        _cons |  -2.044627   .6099686     -3.352   0.001      -3.240144   -.8491109
     ------------------------------------------------------------------------------
     . est store A

  47. Fitting “reduced” model:
     . logistic node tsize
     Logistic regression                               Number of obs   =         53
                                                       LR chi2(1)      =       7.70
                                                       Prob > chi2     =     0.0055
     Log likelihood = -31.276312                       Pseudo R2       =     0.1096
     ------------------------------------------------------------------------------
         node | Odds Ratio   Std. Err.       z     P>|z|       [95% Conf. Interval]
     ---------+--------------------------------------------------------------------
        tsize |       5.25   3.310487      2.630   0.009        1.52552    18.06761
     ------------------------------------------------------------------------------
     . logit
     ------------------------------------------------------------------------------
         node |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
     ---------+--------------------------------------------------------------------
        tsize |   1.658228    .630569      2.630   0.009       .4223355    2.894121
        _cons |  -1.435085   .4976116     -2.884   0.004      -2.410385   -.4597837
     ------------------------------------------------------------------------------
     . est store B

  48. Likelihood ratio test: comparing models for nodal involvement with and without the effect of xray
     . lrtest A B, stats
     Likelihood-ratio test                                 LR chi2(1)  =      9.20
     (Assumption: B nested in A)                           Prob > chi2 =    0.0024
     ------------------------------------------------------------------------------
            Model |    Obs    ll(null)   ll(model)     df          AIC         BIC
     -------------+----------------------------------------------------------------
                B |     53   -35.12608   -31.27631      2     66.55262    70.49321
                A |     53   -35.12608   -26.67671      3     59.35342    65.26429
     ------------------------------------------------------------------------------
     What hypothesis is this testing?
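The numbers in this table can be rebuilt from the two stored log likelihoods. The Python sketch below assumes the standard definitions AIC = -2 ll + 2 df and BIC = -2 ll + df ln(N), which reproduce the values Stata prints here; the names are illustrative.

```python
import math

n_obs = 53
ll_a, df_a = -26.676709, 3     # "full" model A: tsize and xray
ll_b, df_b = -31.276312, 2     # reduced model B: tsize only

# Likelihood ratio statistic on df_a - df_b = 1 degree of freedom.
lr = -2.0 * (ll_b - ll_a)
p_value = math.erfc(math.sqrt(lr / 2.0))   # upper tail of chi-square with 1 df


def aic(ll, df):
    return -2.0 * ll + 2.0 * df


def bic(ll, df, n):
    return -2.0 * ll + df * math.log(n)


if __name__ == "__main__":
    print(f"LR chi2(1) = {lr:.2f}, Prob > chi2 = {p_value:.4f}")
    print(f"B: AIC = {aic(ll_b, df_b):.5f}, BIC = {bic(ll_b, df_b, n_obs):.5f}")
    print(f"A: AIC = {aic(ll_a, df_a):.5f}, BIC = {bic(ll_a, df_a, n_obs):.5f}")
```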

  49. Fitted probabilities in the “full” model (raw observed proportions in parentheses):
     . predict pnode, p
     P(node | tumor=0, xray=0) = .1146 (.1429)
     P(node | tumor=1, xray=0) = .3879 (.3529)
     P(node | tumor=0, xray=1) = .5187 (.4000)
     P(node | tumor=1, xray=1) = .8407 (.9000)
     Note: the fitted values are slightly different from what we would get if we used the raw data without modelling. Why?
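The four fitted probabilities come from applying the inverse logit to the coefficients of the "full" model. A minimal Python sketch, using the coefficients from the Stata `logit` output for that model (the function names `expit` and `fitted` are illustrative):

```python
import math

# Coefficients from the "full" model: logit(p) = b0 + b_tsize*tsize + b_xray*xray
b0, b_tsize, b_xray = -2.044627, 1.588275, 2.119443


def expit(z):
    """Inverse logit: exp(z) / (1 + exp(z))."""
    return 1.0 / (1.0 + math.exp(-z))


def fitted(tsize, xray):
    """Fitted P(node = 1) for a given tumor size and x-ray result (each 0 or 1)."""
    return expit(b0 + b_tsize * tsize + b_xray * xray)


if __name__ == "__main__":
    for tsize, xray in ((0, 0), (1, 0), (0, 1), (1, 1)):
        print(f"tumor={tsize}, xray={xray}: fitted P(node) = {fitted(tsize, xray):.4f}")
```

This model has three parameters for four covariate patterns, so it cannot reproduce all four raw proportions exactly; that is the source of the differences noted on the slide.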

  50. Confidence intervals • A 100(1-α)% Likelihood Ratio based confidence region for β is given by: • Stata provides Wald-based CIs for individual parameters • CIs for odds ratios can be obtained by exponentiation
