Stats 330: Lecture 23

Multiple Logistic Regression (Cont.)

Plan of the day

In today’s lecture we continue our discussion of the multiple logistic regression model

Topics covered

  • Models and submodels
  • Residuals for Multiple Logistic Regression
  • Diagnostics in Multiple Logistic Regression
  • No analogue of R2

Reference: Coursebook, section 5.2.3

Comparison of models
  • Suppose model 1 and model 2 are two models, with model 2 a submodel of model 1
  • If model 2 is in fact correct, then the difference in the deviances has approximately a chi-squared distribution
  • The df equals the difference in the df of the two models
  • The approximation is OK for both grouped and ungrouped data (see the sketch below)
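A minimal R sketch of this comparison for two nested fitted GLMs; the names model1 (the bigger model) and model2 (the submodel) are placeholders, not objects from the lecture:

# Deviance comparison of nested models: the drop in deviance is approximately
# chi-squared (on the difference in df) if the submodel is correct
dev.diff <- deviance(model2) - deviance(model1)        # drop in deviance
df.diff  <- df.residual(model2) - df.residual(model1)  # difference in df
p.value  <- 1 - pchisq(dev.diff, df.diff)              # approximate p-value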
Example: kyphosis data
  • Is age alone an adequate model?

> age.glm <- glm(Kyphosis ~ Age + I(Age^2), family=binomial, data=kyphosis.df)

Null deviance: 83.234 on 80 degrees of freedom

Residual deviance: 72.739 on 78 degrees of freedom

AIC: 78.739

Full model has deviance 54.428 on 76 df

Chisq is 72.739 - 54.428 = 18.311 on 78-76=2 df

> 1-pchisq(18.311,2)

[1] 0.0001056372

Highly significant: we need at least one of Start and Number

Anova in R

The two-model form of anova compares the two models directly:

> anova(age.glm, kyphosis.glm, test="Chi")

Analysis of Deviance Table

Model 1: Kyphosis ~ Age + I(Age^2)

Model 2: Kyphosis ~ Age + I(Age^2) + Start + Number

      Resid. Df  Resid. Dev  Df  Deviance  P(>|Chi|)
1            78      72.739
2            76      54.428   2    18.311  0.0001056 ***

Residuals
  • Two kinds of residuals
    • Pearson residuals
      • useful for grouped data only
      • similar to residuals in linear regression, actual minus fitted value
    • Deviance residuals
      • useful for grouped and ungrouped data
      • Measure contribution of each covariate pattern to the deviance
Pearson residuals

The Pearson residual for covariate pattern $i$ is

$$ r_i = \frac{y_i - n_i\hat{p}_i}{\sqrt{n_i\hat{p}_i(1-\hat{p}_i)}}, $$

where $\hat{p}_i$ is the probability predicted by the model. The residuals are standardized to have approximately unit variance, so a residual is big if it is more than 2 in absolute value.

Deviance residuals (i)
  • For grouped data, the deviance is

$$ D = 2\sum_{i=1}^{M}\left[ y_i \log\frac{y_i}{n_i\hat{p}_i} + (n_i - y_i)\log\frac{n_i - y_i}{n_i - n_i\hat{p}_i} \right] = \sum_{i=1}^{M} d_i^2, $$

where the deviance residual for pattern $i$ is

$$ d_i = \mathrm{sign}(y_i - n_i\hat{p}_i)\sqrt{2\left[ y_i \log\frac{y_i}{n_i\hat{p}_i} + (n_i - y_i)\log\frac{n_i - y_i}{n_i - n_i\hat{p}_i} \right]}. $$
Deviance residuals (ii)
  • Thus, the deviance can be written as the sum of squares of M quantities $d_1, \dots, d_M$, one for each covariate pattern
  • Each $d_i$ is the contribution to the deviance from the $i$th covariate pattern
  • If a deviance residual is big (more than about 2 in magnitude), then the covariate pattern has a big influence on the likelihood, and hence on the estimates
Calculating residuals

> pearson.residuals <- residuals(budworm.glm, type="pearson")
> deviance.residuals <- residuals(budworm.glm, type="deviance")
> par(mfrow=c(1,2))
> plot(pearson.residuals, ylab="residuals", main="Pearson")
> abline(h=0, lty=2)
> plot(deviance.residuals, ylab="residuals", main="Deviance")
> abline(h=0, lty=2)
Diagnostics: outlier detection
  • Large residuals indicate covariate patterns poorly fitted by the model
  • Large Pearson residuals indicate a poor match between the “maximum model probabilities” and the logistic model probabilities, for grouped data
  • Large deviance residuals indicate influential points
  • Example: budworm data
Diagnostics: detecting non-linear regression functions
  • For a single x, plot the logits of the maximal model probabilities against x
  • For multiple x's, plot the Pearson residuals against the fitted probabilities and against the individual x's
  • If most of the $n_i$'s are equal to 1, so the data can't be grouped, try a gam (cf. the kyphosis data); see the sketch below
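A minimal sketch of the gam approach, assuming the mgcv package and the kyphosis.df data frame used earlier (the lecture does not say which gam implementation it used):

library(mgcv)   # assumption: mgcv's gam; the course may have used a different package
kyphosis.gam <- gam(Kyphosis ~ s(Age) + Start + Number,
                    family=binomial, data=kyphosis.df)
plot(kyphosis.gam)   # curvature in the smooth of Age suggests a non-linear term, e.g. I(Age^2)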
Example: budworms
  • Plot the Pearson residuals versus dose; the plot shows a curve, indicating a non-linear effect (a sketch of the plot follows)
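A sketch of such a plot, assuming budworm.df contains the dose variable that budworm.glm was fitted to (the variable names are not given in the transcript):

# Pearson residuals against dose: a curved pattern suggests a non-linear dose effect
plot(budworm.df$dose, residuals(budworm.glm, type="pearson"),
     xlab="dose", ylab="Pearson residuals")
abline(h=0, lty=2)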
Diagnostics: influential points

We will look at three diagnostics:

  • Hat matrix diagonals
  • Cook’s distance
  • Leave-one-out Deviance Change
Example: vaso-constriction data

Data from study of reflex vaso-constriction (narrowing of the blood vessels) of the skin of the fingers

  • Can be caused by a sharp intake of breath
Example: vaso-constriction data

Variables measured:

  • Response: 1 = vaso-constriction occurs, 0 = doesn't occur
  • Volume: volume of air breathed in
  • Rate: rate of intake of breath

Data
     Volume   Rate  Response
  1    3.70  0.825         1
  2    3.50  1.090         1
  3    1.25  2.500         1
  4    0.75  1.500         1
  5    0.80  3.200         1
  6    0.70  3.500         1
  7    0.60  0.750         0
  8    1.10  1.700         0
  9    0.90  0.750         0
 10    0.90  0.450         0
 11    0.80  0.570         0
 12    0.55  2.750         0
 13    0.60  3.000         0
 ...                          (39 obs in all)
Plot of data

> plot(Rate, Volume, type="n", cex=1.2)
> text(Rate, Volume, 1:39,
       col=ifelse(Response==1, "red", "blue"), cex=1.2)
> text(2.3, 3.5, "blue: no VS", col="blue", adj=0, cex=1.2)
> text(2.3, 3.0, "red: VS", col="red", adj=0, cex=1.2)

Enhanced residual plots

> vaso.glm <- glm(Response ~ log(Volume) + log(Rate), family=binomial, data=vaso.df)
> pear.r <- residuals(vaso.glm, type="pearson")
> dev.r <- residuals(vaso.glm, type="deviance")
> par(mfrow=c(1,2))
> plot(pear.r, ylab="residuals", main="Pearson", type="n")
> text(pear.r, cex=0.7)
> abline(h=0, lty=2)
> abline(h=2, lty=2, lwd=2)
> abline(h=-2, lty=2, lwd=2)
> plot(dev.r, ylab="residuals", main="Deviance", type="h")
> text(dev.r, cex=0.7)
> abline(h=0, lty=2)
> abline(h=2, lty=2, lwd=2)
> abline(h=-2, lty=2, lwd=2)

Diagnostics: Hat matrix diagonals
  • We can define hat matrix diagonals (HMDs) much as in linear models
  • An HMD is big if HMD > 3p/M, where p is the number of coefficients and M is the number of covariate patterns
  • Draw an index plot of the HMDs
Plotting HMD’s

> HMD <- hatvalues(vaso.glm)
> plot(HMD, ylab="HMD's", type="h")
> text(HMD, cex=0.7)
> abline(h=3*3/39, lty=2)   # cutoff 3p/M, with p = 3 coefficients and M = 39 patterns

Hat matrix diagonals
  • In ordinary regression, the hat matrix diagonals measure how “outlying” the covariates for an observation are
  • In logistic regression, the HMDs measure the same thing, but are down-weighted according to the estimated probability for the observation. The weights get small if the probability is close to 0 or 1 (see the note below).
  • In the vaso-constriction data, points 1, 2 and 17 had very small weights, since the probabilities are close to 1 for these points.
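For reference (a standard GLM result, not from the slides): the logistic-regression hat matrix is $H = W^{1/2}X(X^TWX)^{-1}X^TW^{1/2}$, where $W$ is diagonal with entries $w_i = n_i\hat{p}_i(1-\hat{p}_i)$. These weights shrink to zero as $\hat{p}_i$ approaches 0 or 1, which is why such points get small HMDs.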
Diagnostics: Cook's distance
  • We can define an analogue of Cook's distance for each point:

$$ CD = \frac{(\text{Pearson residual})^2 \times \text{HMD}}{p\,(1-\text{HMD})^2}, $$

where p is the number of coefficients.

  • CD is big if it is more than about the 10% quantile of the chi-squared distribution on p df, divided by p
  • Calculate the cutoff with qchisq(0.1, p)/p
  • But this is not that reliable as a measure
Cook's D: calculating and plotting

p <- 3   # number of coefficients
CD <- cooks.distance(vaso.glm)
plot(CD, ylab="Cook's D", type="h",
     main="Index plot of Cook's distances")
text(CD, cex=0.7)
bigcook <- qchisq(0.1, p)/p
abline(h=bigcook, lty=2)

Diagnostics: leave-one-out deviance change
  • If the ith covariate pattern is left out, the change in the deviance is approximately

$$ \Delta D_i \approx (\text{deviance residual})^2 + \frac{(\text{Pearson residual})^2 \times \text{HMD}}{1-\text{HMD}} $$

  • This is big if it is more than about 4

Deviance change: calculating and plotting

> dev.r <- residuals(vaso.glm, type="deviance")
> Dev.change <- dev.r^2 + pear.r^2*HMD/(1-HMD)
> plot(Dev.change, ylab="Deviance change", type="h")
> text(Dev.change, cex=0.7)
> bigdev <- 4
> abline(h=bigdev, lty=2)

All together

influenceplots(vaso.glm)   # presumably one of the course's supplied “330 functions”, like HLstat below

Should we delete points?
  • How influential are the three flagged points (31, 4 and 18)?
  • We can delete each in turn and examine the changes in the coefficients and predicted probabilities (a refitting sketch follows the table)
  • First, the coefficients:

Deleting:       None      31       4      18    All 3
(Intercept)   -2.875  -3.041  -5.206  -4.758  -24.348
log(Volume)    5.179   4.966   8.468   7.671   39.142
log(Rate)      4.562   4.765   7.455   6.880   31.642
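A minimal sketch of how such a refit can be produced, reusing the vaso.df data frame and vaso.glm fit from earlier; rows 31, 4 and 18 are the flagged points:

# Refit with all three flagged points removed, then compare coefficients
vaso.del <- update(vaso.glm, data = vaso.df[-c(31, 4, 18), ])
rbind(None = coef(vaso.glm), All.3 = coef(vaso.del))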

Should we delete points (2)?
  • Next, the fitted probabilities (see the table and the predict sketch below):
  • Conclusion: points 4 and 18 have a big effect.

Deleted points:   None      31       4      18  4 and 18   All 3
Fitted at 31     0.722   0.627   0.743   0.707     0.996   0.996
Fitted at 4      0.075   0.073   0.010   0.015     0.000   0.000
Fitted at 18     0.106   0.100   0.018   0.026     0.000   0.000
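The deleted-point probabilities can be recomputed with predict, reusing the hypothetical vaso.del fit sketched above:

# Fitted probabilities at the three flagged points after deleting all three
predict(vaso.del, newdata = vaso.df[c(31, 4, 18), ], type = "response")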

Should we delete points (3)?
  • Should we delete them?
  • They could be genuine: there is no real evidence they are wrong
  • If we delete them, we increase the regression coefficients and make the fitted probabilities more extreme
  • This would overstate the predictive ability of the model
Residuals for ungrouped data
  • If all cases have distinct covariate patterns, then the residuals lie along two curves (corresponding to success and failure) and have little or no diagnostic value.
  • Thus, there is a pattern even if everything is OK (see the sketch below).
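A quick sketch of this pattern, using the ungrouped kyphosis fit from earlier:

# For ungrouped data the residuals fall on two curves (cases with y=0 and y=1),
# so a pattern appears even when the model is fine
plot(fitted(kyphosis.glm), residuals(kyphosis.glm, type="pearson"),
     xlab="fitted probability", ylab="Pearson residual")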
Formulas
  • Pearson residuals: for ungrouped data, the residual for the $i$th case is

$$ r_i = \frac{y_i - \hat{p}_i}{\sqrt{\hat{p}_i(1-\hat{p}_i)}} $$
Formulas (cont)
  • Deviance residuals: for ungrouped data, the residual for the $i$th case is

$$ d_i = \mathrm{sign}(y_i - \hat{p}_i)\sqrt{-2\left[\, y_i \log \hat{p}_i + (1-y_i)\log(1-\hat{p}_i) \right]} $$
Use of plot function

plot(kyphosis.glm)   # gives the standard diagnostic plots: residuals vs fitted, Q-Q, scale-location, residuals vs leverage

Analogue of R2?
  • There is no satisfactory analogue of R2 for logistic regression.
  • For the “small m big n” situation we can use the residual deviance, since we can obtain an approximate p-value.
  • For other situations we can use the Hosmer-Lemeshow statistic (next slide)
Hosmer-Lemeshow statistic
  • How can we judge goodness of fit for ungrouped data?
  • Can use the Hosmer-Lemeshow statistic, which groups the cases into classes having similar fitted probabilities:
    • Sort the cases in increasing order of fitted probability
    • Divide them into 10 (almost) equal groups
    • Do a chi-squared test to see if the number of successes in each group matches the number expected under the fitted probabilities (a rough sketch of the computation follows)
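The course's HLstat function (used below) implements this; the following is only a rough sketch of the idea for an ungrouped binomial glm fit, not the course's code, and its handling of tied probabilities is naive:

# Sketch of the Hosmer-Lemeshow statistic for an ungrouped logistic fit
HL.sketch <- function(model, g = 10) {
  p <- fitted(model)                  # fitted probabilities
  y <- model$y                        # observed 0/1 responses
  breaks <- quantile(p, seq(0, 1, length.out = g + 1))
  grp <- cut(p, breaks, include.lowest = TRUE)   # g classes of similar probability
  obs  <- tapply(y, grp, sum)         # observed 1's per class
  expd <- tapply(p, grp, sum)         # expected 1's per class
  n    <- tapply(p, grp, length)      # class sizes
  HL <- sum((obs - expd)^2 / (expd * (1 - expd/n)))
  c(HL = HL, p.value = 1 - pchisq(HL, g - 2))    # conventional df = g - 2
}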
Kyphosis data

Divide the fitted probabilities into 10 classes: lowest 10%, next 10%, and so on.

               Class 1  Class 2  Class 3  Class 4  Class 5
Observed 0's         9        8        8        7        8
Observed 1's         0        0        0        1        0
Total obs            9        8        8        8        8
Expected 1's     0.022    0.082    0.199    0.443    0.776

               Class 6  Class 7  Class 8  Class 9  Class 10
Observed 0's         8        5        5        3        3
Observed 1's         0        3        3        5        5
Total obs            8        8        8        8        8
Expected 1's     1.023    1.639    2.496    3.991    6.328

Note: Expected = Total obs × average fitted probability in the class

In R, using the kyphosis data

Result of fitting model

> HLstat(kyphosis.glm)

Value of HL statistic = 6.498

P-value = 0.592

A p-value of less than 0.05 indicates problems. No problem is indicated for the kyphosis data: the logistic model appears to fit OK.

The function HLstat is in the “330 functions” file.