Logistic Regression. November 2, 2004 Curtis A. Parvin, Ph.D. Associate Professor and Director of Informatics and Statistics Division of Laboratory Medicine Phone: 454-8699 email: parvin@wustl.edu. Regression.

Logistic Regression

November 2, 2004

Curtis A. Parvin, Ph.D.

Associate Professor and Director of Informatics and Statistics

Division of Laboratory Medicine

Phone: 454-8699 email: parvin@wustl.edu

Regression
  • Relate one or more independent (predictor) variables to a dependent (outcome) variable
    • Ordinary linear regression
      • Continuous outcome variable
      • Determine the relationship between a continuous outcome variable and the predictor variable(s)
    • Logistic regression
      • Binary outcome variable
      • Determine the relationship between the probability of the outcome occurring and the predictor variable(s)
Example: Relationship between gestational age at birth and whether an infant is breast feeding at time of hospital discharge
Probability, Odds, and the Logit Transform
  • Probabilities range between zero and one
  • Odds = P/(1-P)
  • Odds range between zero and infinity
  • Logit = ln(P/(1-P))
  • The logit transform ranges between negative infinity and infinity
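The three quantities above can be checked numerically. The following is a minimal sketch; the function names `odds` and `logit` are illustrative and do not come from the slides.

```python
import math


def odds(p):
    """Odds = P / (1 - P); requires 0 <= p < 1."""
    return p / (1.0 - p)


def logit(p):
    """Logit = natural log of the odds; requires 0 < p < 1."""
    return math.log(odds(p))


# Probabilities near 0 or 1 map to large negative or positive logits.
for p in (0.01, 0.10, 0.50, 0.80, 0.99):
    print(f"P={p:.2f}  odds={odds(p):8.3f}  logit={logit(p):+.3f}")
```

A probability of 0.5 corresponds to odds of 1 and a logit of 0, which is why the logit scale is symmetric around even odds.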
Logistic Regression
  • Model the logarithm of the odds of an outcome as a linear combination of predictor variables
  • Logit = ln(P/(1-P)) = b0+b1X1+b2X2+. . .
  • Estimate the coefficients b0, b1, b2 based on a random sample of subjects’ data
  • Determine which of the predictors are “good”
  • Assess model fit
  • Use the model to predict future cases
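To use the fitted model for prediction, the logit equation has to be solved for P. The steps below are standard algebra rather than something shown on the slide, and they use the same symbols (b0, b1, X1, ...):

```latex
\begin{aligned}
\ln\frac{P}{1-P} &= b_0 + b_1 X_1 + b_2 X_2 + \cdots \\
\frac{P}{1-P} &= e^{\,b_0 + b_1 X_1 + b_2 X_2 + \cdots} \\
P &= \frac{1}{1 + e^{-(b_0 + b_1 X_1 + b_2 X_2 + \cdots)}}
\end{aligned}
```

Whatever value the linear combination takes, the resulting P stays strictly between zero and one.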
Odds and Odds Ratios
  • Odds is the probability of an event occurring divided by the probability of the event not occurring
  • An odds ratio is the ratio of the odds for two different groups
    • An odds ratio = 1 implies equal risk in the two groups
  • Example: the calculated odds ratio for breast feeding at hospital discharge for GA=32 compared to GA=28 is 4.0/0.5 = 8.0
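The arithmetic in the example can be sketched as follows. The two odds values (4.0 and 0.5) are taken from the slide; the helper names are illustrative, and the conversion back to probabilities is added only to show what those odds mean.

```python
def probability_from_odds(odds_value):
    """Invert Odds = P / (1 - P) to recover P."""
    return odds_value / (1.0 + odds_value)


def odds_ratio(odds_group_1, odds_group_2):
    """Ratio of the odds in group 1 to the odds in group 2."""
    return odds_group_1 / odds_group_2


odds_ga32 = 4.0  # odds of breast feeding at discharge, GA = 32 (from the slide)
odds_ga28 = 0.5  # odds of breast feeding at discharge, GA = 28 (from the slide)

print(odds_ratio(odds_ga32, odds_ga28))             # 8.0
print(round(probability_from_odds(odds_ga32), 3))   # 0.8
print(round(probability_from_odds(odds_ga28), 3))   # 0.333
```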
Logistic Regression Coefficients and Odds Ratios
  • If ln(P/(1-P)) = b0+b1X1+b2X2+. . ., then b1, b2, … are slope coefficients reflecting rates of change
    • ln(odds(X1+1)) – ln(odds(X1)) = b1
    • ln(odds(X1+1)/odds(X1)) = b1
    • odds(X1+1)/odds(X1) = exp(b1)
  • exp(b1) represents the odds ratio associated with a 1 unit increase in X1
  • exp(k*b1) = odds ratio for a k unit increase in X1
    • Breast feeding example: the odds of breast feeding at hospital discharge increase by a factor of exp(.577) = 1.78 for each additional week of GA
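A short sketch of the exp(k*b1) rule, using the coefficient 0.577 from the breast feeding example. The function name is illustrative.

```python
import math


def odds_ratio_for_increase(coefficient, k=1.0):
    """Odds ratio for a k-unit increase in a predictor: exp(k * b)."""
    return math.exp(k * coefficient)


b1 = 0.577  # coefficient for gestational age (weeks), from the slide

print(round(odds_ratio_for_increase(b1), 2))       # 1.78 per additional week
print(round(odds_ratio_for_increase(b1, k=4), 2))  # 4-week increase
```

Because the rule is multiplicative, the 4-week odds ratio equals the 1-week odds ratio raised to the fourth power. A model-based figure like this need not equal an odds ratio calculated directly from the odds at two specific gestational ages.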
One Binary Outcome and One Binary Predictor
  • Case-Control Study

                         Disease
                      Cases   Controls
    Risk Factor  Yes    a        b
                 No     c        d

  • Odds Ratio (OR) = (a/c)/(b/d) = (a/b)/(c/d) = ad/(bc)
Example: CHD and Age (Dichotomized at 55 Years)

2X2 Table calculation: OR = (21/22)/(6/51) = 8.11

Logistic Regression: ln(odds of CHD) = -0.841 + 2.094 * Age (Age coded 0 or 1 for the two age groups)

OR = exp(2.094) = 8.11
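The agreement between the 2X2 table and the regression output can be checked directly. In this sketch the cell counts are inferred from the slide's expression (21/22)/(6/51), assuming the layout a, b, c, d of the case-control table (cases and controls with the risk factor, then without); the variable names are illustrative.

```python
import math

# Cell counts inferred from (21/22)/(6/51); assumed layout:
# a = cases with the risk factor, b = controls with the risk factor,
# c = cases without it,          d = controls without it.
a, b, c, d = 21, 6, 22, 51

odds_ratio = (a / c) / (b / d)      # same as (a * d) / (b * c)
slope = math.log(odds_ratio)        # coefficient on the 0/1 age indicator
intercept = math.log(c / d)         # log odds of CHD when the indicator is 0

print(round(odds_ratio, 2))   # 8.11
print(round(slope, 3))
print(round(intercept, 3))
```

With a single binary predictor, the slope is the log of the table's odds ratio and the intercept is the log odds in the reference group, which is why the two calculations give the same answer.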

Multiple Predictor Variables
  • The independent variables (predictors, risk factors) can be categorical or continuous
  • Example: TDx-FLM II and gestational age as predictors of risk for respiratory distress syndrome (RDS)
    • TDx-FLM II measures mg surfactant/g of albumin in amniotic fluid
Logistic Regression Parameter Estimates

------------------------------------------------------------------------------
         rds |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
      tdxflm |  -.1136873   .0159786    -7.11   0.000    -.1450048   -.0823699
          ga |  -.2912549   .1129665    -2.58   0.010    -.5126652   -.0698446
       _cons |    12.8149   3.879407     3.30   0.001     5.211399    20.41839
------------------------------------------------------------------------------

ln(P(RDS)/(1-P(RDS))) = 12.81 - 0.114*TDxFLM - 0.291*GA

Odds Ratio for a 1 mg/g increase in TDxFLM: exp(-0.114) = 0.89

Odds Ratio for a 1 week increase in GA: exp(-0.291) = 0.75

------------------------------------------------------------------------------
         rds | Odds Ratio   Std. Err.       z    P>|z|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
      tdxflm |    .892537   .0142615    -7.11   0.000     .8650182    .9209313
          ga |   .7473252   .0844227    -2.58   0.010     .5988973    .9325387
------------------------------------------------------------------------------
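The odds ratio table can be reproduced from the coefficient table by exponentiating. The sketch below uses the numbers printed above; the dictionary layout and function name are illustrative. The standard error formula (odds ratio times the coefficient's standard error) is the usual delta-method approximation and appears consistent with the printed output, but the slides do not state how it was computed.

```python
import math

# Values copied from the coefficient table above.
estimates = {
    "tdxflm": {"coef": -0.1136873, "se": 0.0159786, "ci": (-0.1450048, -0.0823699)},
    "ga":     {"coef": -0.2912549, "se": 0.1129665, "ci": (-0.5126652, -0.0698446)},
}


def to_odds_ratio_scale(estimate):
    """Exponentiate a coefficient and its confidence limits."""
    odds_ratio = math.exp(estimate["coef"])
    lower, upper = estimate["ci"]
    return {
        "odds_ratio": odds_ratio,
        "se": odds_ratio * estimate["se"],  # delta-method approximation (assumed)
        "ci": (math.exp(lower), math.exp(upper)),
    }


for name, estimate in estimates.items():
    result = to_odds_ratio_scale(estimate)
    low, high = result["ci"]
    print(f"{name:>6}  OR={result['odds_ratio']:.4f}  "
          f"SE={result['se']:.4f}  95% CI=({low:.4f}, {high:.4f})")
```

The z statistics and P values are the same in both tables because the test is carried out on the coefficient scale.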

Using the Logistic Model to Predict Risk of RDS
  • We can use the logistic model equation to:
    • Identify variables that are significant predictors
    • Calculate the absolute risk (probability) of RDS (may give biased estimates)
    • Calculate the relative risk (odds ratio) of RDS
    • Develop a classifier for diagnosing RDS
Logistic Regression Predicted Probabilities and Classification with 0.20 cutoff

TDxFLM   GA   RDS   Logistic P   Classify   Result
  75     30    0     .0115517       0         TN
   7     31    1     .9521286       1         TP
  14.8   31    1     .8912354       1         TP
  18.3   31    1     .8462539       1         TP
  27     31    1     .6718219       1         TP
  22     31    0     .7832782       1         FP
  29     31    0     .6198854       1         FP
 135     31    0     .0000095       0         TN
   4     32    1     .9543484       1         TP
  15     32    1     .8568574       1         TP
  16.5   32    1     .8346432       1         TP
  25     32    1     .6575863       1         TP
  44.2   32    1     .1779585       0         FN
  35.5   32    0     .3679177       1         FP
  41     32    0     .2374989       1         FP
  48     32    0     .1232235       0         TN
  49     32    0     .1114575       0         TN
  55.8   32    0     .0547323       0         TN
  59     32    0     .0386864       0         TN
  59     32    0     .0386864       0         TN
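The predicted probabilities and classifications in this table can be recomputed from the fitted coefficients (12.8149, -.1136873, -.2912549) and the 0.20 cutoff. This is a minimal sketch: the function names are illustrative, and treating a probability exactly equal to the cutoff as positive is an assumption (no case in the table falls on the boundary).

```python
import math

# Coefficients as printed in the parameter estimates table.
B0, B_TDXFLM, B_GA = 12.8149, -0.1136873, -0.2912549
CUTOFF = 0.20


def predicted_probability(tdxflm, ga):
    """P(RDS) from the logistic model: 1 / (1 + exp(-log odds))."""
    log_odds = B0 + B_TDXFLM * tdxflm + B_GA * ga
    return 1.0 / (1.0 + math.exp(-log_odds))


def classify(tdxflm, ga, cutoff=CUTOFF):
    """1 (predict RDS) if P >= cutoff, else 0; the >= is an assumption."""
    return 1 if predicted_probability(tdxflm, ga) >= cutoff else 0


# (TDxFLM, GA, observed RDS) for the 20 cases listed in the table.
cases = [
    (75, 30, 0), (7, 31, 1), (14.8, 31, 1), (18.3, 31, 1), (27, 31, 1),
    (22, 31, 0), (29, 31, 0), (135, 31, 0), (4, 32, 1), (15, 32, 1),
    (16.5, 32, 1), (25, 32, 1), (44.2, 32, 1), (35.5, 32, 0), (41, 32, 0),
    (48, 32, 0), (49, 32, 0), (55.8, 32, 0), (59, 32, 0), (59, 32, 0),
]

counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
for tdxflm, ga, rds in cases:
    predicted = classify(tdxflm, ga)
    # True/False says whether the prediction matched; P/N is the prediction.
    label = ("T" if predicted == rds else "F") + ("P" if predicted == 1 else "N")
    counts[label] += 1

print(counts)  # {'TP': 8, 'FP': 4, 'FN': 1, 'TN': 7}
```

Lowering the cutoff below 0.5 trades extra false positives for fewer missed cases of RDS: among these 20 cases only one infant with RDS is classified as negative.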

Other Prediction Methods
  • Artificial Neural Networks
    • Tu JV. Advantages and disadvantages of using artificial neural networks versus logistic regression for predicting medical outcomes. J Clin Epidemiol 1996;49:1225-31.
  • Linear or Quadratic Discriminant Analysis
  • Classification and Regression Trees (CART)
  • Multivariate Adaptive Regression Splines (MARS)
Other Flavors of Logistic Regression
  • Ordinal Logistic Regression
    • More than two ordered groups
  • Multinomial Logistic Regression
    • (Polychotomous, Polytomous, Discrete Choice)
    • More than two unordered groups
  • Conditional Logistic Regression
    • Matched pairs data (1:1 or 1:M matching)