Logistic and Poisson Regression: Modeling Binary and Count Data
LISA Short Course Series
Mark Seiss, Dept. of Statistics
Presentation Outline
1. Introduction to Generalized Linear Models
2. Binary Response Data 
Logistic Regression Model
3. Count Response Data 
Poisson Regression Model
3 Components of a Generalized Linear Model
Random – identifies the response Y and its probability distribution
Systematic – explanatory variables in a linear predictor function (Xβ)
Link function – function g(.) that links the mean of the response (E[Yi] = μi) to the systematic component
Model
$g(\mu_i) = x_i\beta = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip}$ for i = 1 to n
Generalized Linear Models
Linear regression assumes that the response is normally distributed
GLMs allow us to analyze the linear relationship between predictor variables and a function of the mean of the response variable when it is not reasonable to assume the data are normally distributed.
Generalized Linear Models
Two Types: Continuous and Categorical
Continuous Predictor Variables
Examples – Time, Grade Point Average, Test Score, etc.
Coded with one parameter – βixi
Categorical Predictor Variables
Examples – Sex, Political Affiliation, Marital Status, etc.
Actual value assigned to a category is not important
Ex) Sex – Male/Female, M/F, 1/2, 0/1, etc.
Coded differently than continuous variables
Generalized Linear Models
Consider a categorical predictor variable with L categories
One category is selected as the reference category
Assignment of the reference category is arbitrary
Variable is represented by L − 1 dummy variables
Using only L − 1 dummy variables keeps the model identifiable
Two types of coding – Dummy and Effect
Generalized Linear Models
Dummy Coding (Used in R)
xk = 1 if predictor variable is equal to category k
0 otherwise
xk = 0 for all k if predictor variable equals the reference category L
Effect Coding (Used in JMP)
xk = 1 if predictor variable is equal to category k
0 otherwise
xk = −1 for all k if predictor variable equals the reference category L
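The two coding schemes can be inspected directly in base R. This is a minimal sketch, not part of the original course material: the factor `party` and its level names are hypothetical. Note that `contr.treatment()` takes the first level as reference by default, while `contr.sum()` puts the −1 row on the last level.

```r
# Hypothetical three-level factor; the level names are illustrative only.
party <- factor(c("Dem", "Ind", "Rep", "Ind"), levels = c("Dem", "Ind", "Rep"))

# Dummy (treatment) coding: the reference level gets a row of zeros.
# Base R uses the first level as the reference by default.
dummy <- contr.treatment(levels(party))

# Effect (sum-to-zero) coding: the reference level gets a row of -1.
# contr.sum() places the -1 row on the last level.
effect <- contr.sum(levels(party))

print(dummy)
print(effect)

# Design matrix that a model would use under effect coding.
X_effect <- model.matrix(~ party, contrasts.arg = list(party = "contr.sum"))
print(X_effect)
```

In both schemes a factor with L = 3 levels produces L − 1 = 2 columns, which is what keeps the model identifiable alongside the intercept.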
Generalized Linear Models
Saturated Model
Contains a separate indicator parameter for each observation
Perfect fit: μ = y
Not useful since there is no data reduction, i.e. the number of parameters equals the number of observations.
Gives the maximum achievable log likelihood – a baseline for comparison to other model fits
Generalized Linear Models
Let L(μ; y) = maximum of the log likelihood for the model
L(y; y) = maximum of the log likelihood for the saturated model
Deviance: $D(y; \mu) = -2\,[L(\mu; y) - L(y; y)]$
Likelihood ratio statistic for testing the null hypothesis that the model is a good alternative to the saturated model
The likelihood ratio statistic has an asymptotic chi-squared distribution with N – p degrees of freedom, where N is the number of observations and p is the number of parameters in the model.
Allows for the comparison of one model to another using the likelihood ratio test.
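As a rough illustration of the deviance definition, the sketch below computes it by hand for a Poisson model and compares it with the value `glm()` reports. The data are simulated for the example and are not the course data.

```r
set.seed(1)
n <- 50
x <- runif(n)
y <- rpois(n, lambda = exp(0.5 + 1.0 * x))   # simulated counts (assumption)

fit <- glm(y ~ x, family = poisson(link = "log"))

# Log likelihood of the fitted model, L(mu; y)
ll_model <- sum(dpois(y, lambda = fitted(fit), log = TRUE))

# Log likelihood of the saturated model, L(y; y):
# every fitted mean is set equal to its own observation
ll_sat <- sum(dpois(y, lambda = y, log = TRUE))

dev_by_hand <- -2 * (ll_model - ll_sat)

print(c(by_hand = dev_by_hand, from_glm = deviance(fit)))
```

The two numbers should agree up to floating-point error, which is a quick check that the saturated model really is the baseline R uses.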
Generalized Linear Models
Model 1 – model with p predictor variables {X1, X2, X3, …, Xp} and vector of fitted values μ1
Model 2 – model with q < p predictor variables {X1, X2, X3, …, Xq} and vector of fitted values μ2
Model 2 is nested within Model 1 if all predictor variables found in Model 2 are included in Model 1,
i.e. the set of predictor variables in Model 2 is a subset of the set of predictor variables in Model 1
Model 2 is a special case of Model 1 – all the coefficients associated with Xq+1, Xq+2, Xq+3, …, Xp are equal to zero
Generalized Linear Models
Null Hypothesis: There is not a significant difference between the fit of the two models.
Null Hypothesis for Nested Models: The predictor variables in Model 1 that are not found in Model 2 are not significant to the model fit.
Alternate Hypothesis for Nested Models: The predictor variables in Model 1 that are not found in Model 2 are significant to the model fit.
Likelihood Ratio Statistic: $LRT = -2\,[L(\mu_2; y) - L(\mu_1; y)] = D(y; \mu_2) - D(y; \mu_1)$
Difference of the deviances of the two models
Always D(y; μ2) ≥ D(y; μ1), which implies LRT ≥ 0
Under the null hypothesis, the LRT is asymptotically chi-squared distributed with p − q degrees of freedom
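The nested-model test can be sketched in R as a difference of deviances. The data below are simulated for illustration (the variable `x2` has no true effect), and the names `model1` and `model2` simply mirror the notation above.

```r
set.seed(2)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.0 * x1))  # simulated (assumption)

model1 <- glm(y ~ x1 + x2, family = binomial)  # larger model
model2 <- glm(y ~ x1,      family = binomial)  # nested within model1

# Likelihood ratio statistic = D(y; mu2) - D(y; mu1)
lrt <- deviance(model2) - deviance(model1)

# Degrees of freedom = number of coefficients set to zero in the nested model
df <- df.residual(model2) - df.residual(model1)

p_value <- pchisq(lrt, df = df, lower.tail = FALSE)
print(c(LRT = lrt, df = df, p = p_value))

# anova() carries out the same comparison
print(anova(model2, model1, test = "Chisq"))
```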
Generalized Linear Models
Later, we will use the Likelihood Ratio Test to test the significance of variables in Logistic and Poisson regression models.
Generalized Linear Models
3 predictor variables – 1 continuous (X1), 1 categorical with 4 categories (X2, X3, X4), 1 categorical with 2 categories (X5)
Model 1 – predictor variables {X1, X2, X3, X4, X5}
Model 2 – predictor variables {X1, X5}
Null Hypothesis – the variable with 4 categories is not significant to the model (β2 = β3 = β4 = 0)
Alternate Hypothesis – the variable with 4 categories is significant
Likelihood Ratio Statistic = D(y; μ2) − D(y; μ1)
Difference of the deviance statistics from the two models
Chi-squared distribution with 5 − 2 = 3 degrees of freedom
Generalized Linear Models
2 Goals: Complex enough to fit the data well
Simple to interpret, does not overfit the data
Study the effect of each predictor on the response Y
Continuous Predictor – Graph P[Y=1] versus X
Discrete Predictor – Contingency table of P[Y=1] versus the categories of X
Unbalanced Data – Few responses of one type
Guideline – at least 10 outcomes of each type for each X term
Example – Y=1 for only 30 observations out of 1000
Model should contain no more than 3 X terms
Generalized Linear Models
Multicollinearity
Correlations among predictors increase the variance of the coefficient estimates
The larger standard errors make individual variables appear less significant
More likely to occur when several predictor variables are used in the model
Determining Model Fit
Other criteria besides significance tests (e.g. Likelihood Ratio Test) can be used to select a model
Generalized Linear Models
Determining Model Fit cont.
Akaike Information Criterion (AIC)
Penalizes model for having many parameters
AIC = −2 Log L + 2*p where p is the number of parameters in the model; this equals Deviance + 2*p up to a constant that is the same for every model fit to the same data
Bayesian Information Criterion (BIC)
BIC = −2 Log L + ln(n)*p where p is the number of parameters in the model and n is the number of observations
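Both criteria can be reproduced from the log likelihood. The following is a minimal sketch on simulated Poisson data (an assumption, not the course data), checked against R's own `AIC()` and `BIC()`.

```r
set.seed(3)
n <- 100
x <- rnorm(n)
y <- rpois(n, lambda = exp(0.3 + 0.5 * x))   # simulated counts (assumption)

fit <- glm(y ~ x, family = poisson)

log_lik <- as.numeric(logLik(fit))   # maximized log likelihood
p       <- length(coef(fit))         # number of estimated parameters

aic_by_hand <- -2 * log_lik + 2 * p
bic_by_hand <- -2 * log_lik + log(n) * p

print(c(aic_by_hand = aic_by_hand, AIC = AIC(fit),
        bic_by_hand = bic_by_hand, BIC = BIC(fit)))
```

Because ln(n) exceeds 2 once n is 8 or more, BIC penalizes each extra parameter more heavily than AIC in most practical data sets.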
Generalized Linear Models
Selection Algorithms
Best subset – Tests all combinations of predictor variables to find best subset
Algorithmic – Forward, Backward and Stepwise Procedures
Generalized Linear Models
Best Subset: run the model with all possible combinations of the predictor variables
Number of possible models is equal to 2^p where p is the number of predictor variables
Dummy variables for a categorical predictor are considered together
Ex) For a set of predictors {X1, X2, X3}
runs models with sets of predictors {X1, X2, X3}, {X1, X2},
{X2, X3}, {X1, X3}, {X1}, {X2}, {X3}, and no predictor variables.
2^3 = 8 possible models
Most programs only allow for a small set of predictor variables
For larger sets the search cannot be run in a reasonable amount of time
2^10 = 1024 models run for a set of 10 predictor variables
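To make the 2^p count concrete, the sketch below enumerates every candidate formula for three predictors. The response name `y` and the predictor names are placeholders; no models are actually fit.

```r
predictors <- c("X1", "X2", "X3")
p <- length(predictors)

# Each row of the grid switches every predictor on (TRUE) or off (FALSE).
grid <- expand.grid(rep(list(c(FALSE, TRUE)), p))

# Translate each row into the set of predictors it selects.
subsets <- lapply(seq_len(nrow(grid)), function(i) predictors[unlist(grid[i, ])])

# Build one model formula (as text) per subset; the empty set is the null model.
formulas <- vapply(subsets, function(s) {
  if (length(s) == 0) "y ~ 1" else paste("y ~", paste(s, collapse = " + "))
}, character(1))

print(formulas)
length(formulas)   # 8
```

Each string could be passed to `as.formula()` and then to `glm()`; the doubling with every added predictor is what makes the exhaustive search impractical for large p.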
Generalized Linear Models
Forward Selection – Idea: Start with no variables in the model and add one at a time
Step One: Fit a model with each single predictor variable and determine fit
Step Two: Select the predictor variable with the best fit and add it to the model
Step Three: Add each remaining variable to the model one at a time and determine fit
Step Four: If at least one variable produces a better fit, return to Step Two
If no variable produces a better fit, stop and use the current model
Drawback: Variables added to the model cannot be taken out.
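The forward procedure can be sketched as a short loop. This is an illustrative implementation under stated assumptions, not the algorithm used by any particular package: fit is measured by AIC, the family is Poisson, the data are simulated, and `forward_select` is a hypothetical helper name.

```r
set.seed(4)
n <- 300
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
# Simulated response (assumption): x1 and x2 matter, x3 is pure noise.
dat$y <- rpois(n, lambda = exp(0.2 + 0.6 * dat$x1 - 0.4 * dat$x2))

# Hypothetical helper: forward selection using AIC as the fit criterion.
forward_select <- function(data, response, candidates) {
  selected <- character(0)
  current  <- glm(reformulate("1", response), family = poisson, data = data)
  repeat {
    remaining <- setdiff(candidates, selected)
    if (length(remaining) == 0) break
    # Fit one model per remaining variable added to the current set.
    aics <- vapply(remaining, function(v) {
      AIC(glm(reformulate(c(selected, v), response),
              family = poisson, data = data))
    }, numeric(1))
    # Stop when no single addition improves on the current model.
    if (min(aics) >= AIC(current)) break
    selected <- c(selected, remaining[which.min(aics)])
    current  <- glm(reformulate(selected, response),
                    family = poisson, data = data)
  }
  current
}

final <- forward_select(dat, "y", c("x1", "x2", "x3"))
print(formula(final))
```

Once a variable is in `selected` the loop never reconsiders it, which is exactly the drawback noted above.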
Generalized Linear Models
Backward Selection – Idea: Start with all variables in the model and take out one at a time
Step One: Fit all predictor variables in the model and determine fit
Step Two: Delete one variable at a time and determine fit
Step Three: If the deletion of at least one variable produces a better fit, remove the variable that produces the best fit when deleted and return to Step Two
If the deletion of a variable does not produce a better fit, stop and use the current model
Drawback: Variables taken out of the model cannot be added back in.
Generalized Linear Models
Stepwise Selection – Idea: Combination of forward and backward selection
Forward step, then backward step
Step One: Fit each predictor variable as a single predictor variable and determine fit
Step Two: Select the variable that produces the best fit and add it to the model
Step Three: Add each remaining predictor variable one at a time to the model and determine fit
Step Four: Select the variable that produces the best fit and add it to the model
Step Five: Delete each variable in the model one at a time and determine fit
Step Six: Remove the variable that produces the best fit when deleted
Step Seven: Return to Step Three
Loop until no variable added or deleted improves the fit.
Generalized Linear Models
3 Components of the GLM
Random (Y)
Link Function (g(E[Y]))
Systematic ($x^{T}\beta$)
Continuous and Categorical Predictor Variables
Coding Categorical Variables – Effect and Dummy Coding
Likelihood Ratio Test for Nested Models
Test the significance of a predictor variable or set of predictor variables in the model.
Model Selection – Best Subset, Forward, Backward, Stepwise
Binary Response Data
Variable with two outcomes
One outcome represented by a 1 and the other represented by a 0
Examples:
Does the person have a disease? Yes or No
Who is the person voting for? McCain or Obama
Outcome of a baseball game? Win or loss
Logistic Regression
Response Variable –> Admission to Grad School (Admit)
0 if admitted, 1 if not admitted
Predictor Variables
GRE Score (gre)
Continuous
University Prestige (topnotch)
1 if prestigious, 0 otherwise
Grade Point Average (gpa)
Continuous
Logistic Regression
ADMIT   GRE   TOPNOTCH   GPA
1       380   0          3.61
0       660   1          3.67
0       800   1          4
0       640   0          3.19
1       520   0          2.93
0       760   0          3
0       560   0          2.98
1       400   0          3.08
0       540   0          3.39
1       700   1          3.92
Logistic Regression
Linear probability model: $\pi(x_i) = E[y_i \mid x_i] = x_i\beta$
where yi = response for observation i
xi = 1x(p+1) matrix of covariates for observation i
p = number of covariates
GLM with binomial random component and identity link g(μ) = μ
Issue: π(xi) can take on values less than 0 or greater than 1
Issue: Predicted probabilities for some subjects fall outside of the [0,1] range.
Logistic Regression
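Logistic regression model: $\operatorname{logit}(\pi(x_i)) = \log\!\left(\frac{\pi(x_i)}{1 - \pi(x_i)}\right) = x_i\beta$
GLM with binomial random component and logit link g(μ) = logit(μ)
Range of values for π(xi) is 0 to 1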
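A quick numerical check shows why the logit link solves the range problem. The values of the linear predictor below are arbitrary; `plogis()` and `qlogis()` are base R's inverse logit and logit.

```r
eta <- c(-10, -2, 0, 2, 10)   # arbitrary values of the linear predictor

p_identity <- eta             # identity link: "probabilities" escape [0, 1]
p_logit    <- plogis(eta)     # inverse logit: exp(eta) / (1 + exp(eta))

print(rbind(eta, p_identity, p_logit))

# qlogis() is the logit itself, so it undoes plogis()
print(all.equal(qlogis(p_logit), eta))
```

However extreme the linear predictor becomes, the inverse logit keeps the predicted probability strictly between 0 and 1.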
Logistic Regression
[Figure: predicted probabilities for different grade point averages under the logistic regression model and the linear probability model]
Important Note: JMP models P(Y=0) and effect coding is used for categorical variables
Logistic Regression
The odds ratio is a statistic that compares the odds of one event to the odds of another event.
Say the probability of Event 1 is π1 and the probability of Event 2 is π2. Then the odds ratio of Event 1 to Event 2 is:
$$\text{Odds Ratio} = \frac{\pi_1 / (1 - \pi_1)}{\pi_2 / (1 - \pi_2)}$$
Values of the odds ratio range from 0 to infinity
Values between 0 and 1 indicate the odds of Event 2 are greater
Values between 1 and infinity indicate the odds of Event 1 are greater
A value equal to 1 indicates the events are equally likely
Logistic Regression
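Link to Logistic Regression: the logit is the log of the odds, $\log\!\left(\frac{\pi}{1 - \pi}\right) = x\beta$, so the odds are $\frac{\pi}{1 - \pi} = \exp(x\beta)$
Thus the odds ratio between two events with covariates $x_1$ and $x_2$ is $\exp(x_1\beta) / \exp(x_2\beta) = \exp((x_1 - x_2)\beta)$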
Logistic Regression
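Consider Event 1 is Y=0 given X+1 and Event 2 is Y=0 given X
From our logistic regression model, the odds of Y=0 given X are $\exp(\beta_0 + \beta_1 X)$
Thus the ratio of the odds of Y=0 at X+1 to the odds of Y=0 at X is $\exp(\beta_0 + \beta_1 (X + 1)) / \exp(\beta_0 + \beta_1 X) = \exp(\beta_1)$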
Logistic Regression
Generalized Linear Model Fit
Response: Admit
Modeling P(Admit=0)
Distribution: Binomial
Link: Logit
Observations (or Sum Wgts) = 400
Whole Model Test
Model        -LogLikelihood   LR ChiSquare   DF   Prob>ChiSq
Difference   6.50444839       13.0089        1    0.0003
Full         243.48381
Reduced      249.988259
Goodness Of Fit Statistic   ChiSquare   DF    Prob>ChiSq
Pearson                     401.1706    398   0.4460
Deviance                    486.9676    398   0.0015
Logistic Regression
Effect Tests
Source   DF   LR ChiSquare   Prob>ChiSq
GPA      1    13.008897      0.0003
Parameter Estimates
Term        Estimate     Std Error   LR ChiSquare   Prob>ChiSq   Lower CL    Upper CL
Intercept   -4.357587    1.0353175   19.117873      <.0001       -6.433355   -2.367383
GPA         1.0511087    0.2988695   13.008897      0.0003       0.4742176   1.6479411
Interpretation of the Parameter Estimate:
Exp{1.0511087} = 2.86 = odds ratio between the odds at x+1 and the odds at x, for all x
The ratio of the odds of being admitted for a person with a 3.0 GPA to the odds for a person with a 2.0 GPA is 2.86; equivalently, the odds for the person with the 3.0 are 2.86 times the odds for the person with the 2.0.
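The odds ratio and its confidence limits follow from exponentiating the GPA estimate and its limits. The sketch below only reuses the numbers printed in the JMP output; the half-point comparison at the end is an added illustration.

```r
beta_gpa <- 1.0511087                   # GPA estimate from the output above
ci_gpa   <- c(0.4742176, 1.6479411)     # Lower CL and Upper CL for GPA

odds_ratio    <- exp(beta_gpa)          # odds ratio for a 1-point GPA increase
odds_ratio_ci <- exp(ci_gpa)            # confidence limits on the odds-ratio scale

print(round(odds_ratio, 2))
print(odds_ratio_ci)

# Odds ratio for a smaller change, e.g. half a GPA point (illustrative)
print(exp(0.5 * beta_gpa))
```

Because the model is linear on the log-odds scale, the same ratio applies between any two GPAs that differ by one point.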
Logistic Regression
Generalized Linear Model Fit
Response: Admit
Modeling P(Admit=0)
Distribution: Binomial
Link: Logit
Observations (or Sum Wgts) = 400
Whole Model Test
Model        -LogLikelihood   LR ChiSquare   DF   Prob>ChiSq
Difference   3.53984692       7.0797         1    0.0078
Full         246.448412
Reduced      249.988259
Goodness Of Fit Statistic   ChiSquare   DF    Prob>ChiSq
Pearson                     400.0000    398   0.4624
Deviance                    492.8968    398   0.0008
Logistic Regression
Effect Tests
Source     DF   LR ChiSquare   Prob>ChiSq
TOPNOTCH   1    7.0796939      0.0078
Parameter Estimates
Term          Estimate    Std Error   LR ChiSquare   Prob>ChiSq   Lower CL    Upper CL
Intercept     -0.525855   0.138217    14.446085      0.0001       -0.799265   -0.255667
TOPNOTCH[0]   -0.371705   0.138217    7.0796938      0.0078       -0.642635   -0.099011
Interpretation of the Parameter Estimate:
Exp{2*(-0.371705)} = 0.4755 = odds ratio between the odds of admittance for a student from a less prestigious university and the odds of admittance for a student from a more prestigious university. The factor of 2 comes from effect coding, where the two categories are coded +1 and −1.
The odds of being admitted from a less prestigious university are 0.48 times the odds of being admitted from a more prestigious university.
Logistic Regression
Consider the model with GPA, GRE, and Top Notch as predictor variables
Generalized Linear Model Fit
Response: Admit
Modeling P(Admit=0)
Distribution: Binomial
Link: Logit
Observations (or Sum Wgts) = 400
Whole Model Test
Model        -LogLikelihood   LR ChiSquare   DF   Prob>ChiSq
Difference   10.9234504       21.8469        3    <.0001
Full         239.064808
Reduced      249.988259
Goodness Of Fit Statistic   ChiSquare   DF    Prob>ChiSq
Pearson                     396.9196    396   0.4775
Deviance                    478.1296    396   0.0029
Logistic Regression
Effect Tests
Source     DF   LR ChiSquare   Prob>ChiSq
TOPNOTCH   1    2.2143635      0.1367
GPA        1    4.2909753      0.0383
GRE        1    5.4555484      0.0195
Parameter Estimates
Term          Estimate    Std Error   LR ChiSquare   Prob>ChiSq   Lower CL    Upper CL
Intercept     -4.382202   1.1352224   15.917859      <.0001       -6.657167   -2.197805
TOPNOTCH[0]   -0.218612   0.1459266   2.2143635      0.1367       -0.503583   0.070142
GPA           0.6675556   0.3252593   4.2909753      0.0383       0.0356956   1.3133755
GRE           0.0024768   0.0010702   5.4555484      0.0195       0.0003962   0.0046006
Logistic Regression
Stepwise Fit
Response: Admit
Stepwise Regression Control
Prob to Enter 0.250
Prob to Leave 0.100
Direction:
Rules:
Current Estimates
-LogLikelihood   RSquare
239.06481        0.0437
Logistic Regression
Parameter       Estimate     nDF   Wald/Score ChiSq   "Sig Prob"
Intercept[1]    4.3821986    1     0                  1.0000
GRE             0.00247683   1     5.356022           0.0207
GPA             0.66755511   1     4.212258           0.0401
TOPNOTCH{1-0}   0.21861181   1     2.244286           0.1341
Step History
Step   Parameter       Action    LR ChiSquare   "Sig Prob"   RSquare   p
1      GRE             Entered   13.92038       0.0002       0.0278    2
2      GPA             Entered   5.712157       0.0168       0.0393    3
3      TOPNOTCH{1-0}   Entered   2.214363       0.1367       0.0437    4
Logistic Regression
Start by selecting to enter all variables into the model
Stepwise Fit
Response: Admit
Stepwise Regression Control
Prob to Enter 0.250
Prob to Leave 0.100
Direction: Backward
Rules: Combine
Logistic Regression
Current Estimates
-LogLikelihood   RSquare
240.17199        0.0393
Parameter       Estimate     nDF   Wald/Score ChiSq   "Sig Prob"
Intercept[1]    4.9493751    1     0                  1.0000
GRE             0.00269068   1     6.473978           0.0109
GPA             0.75468641   1     5.576461           0.0182
TOPNOTCH{1-0}   0            1     2.259729           0.1328
Step History
Step   Parameter       Action    LR ChiSquare   "Sig Prob"   RSquare   p
1      TOPNOTCH{1-0}   Removed   2.214363       0.1367       0.0393    3
Logistic Regression
Stepwise Fit
Response: Admit
Stepwise Regression Control
Prob to Enter 0.250
Prob to Leave 0.250
Direction: Mixed
Rules: Combine
Current Estimates
-LogLikelihood   RSquare
239.06481        0.0437
Logistic Regression
Parameter       Estimate     nDF   Wald/Score ChiSq   "Sig Prob"
Intercept[1]    4.3821986    1     0                  1.0000
GRE             0.00247683   1     5.356022           0.0207
GPA             0.66755511   1     4.212258           0.0401
TOPNOTCH{1-0}   0.21861181   1     2.244286           0.1341
Step History
Step   Parameter       Action    LR ChiSquare   "Sig Prob"   RSquare   p
1      GRE             Entered   13.92038       0.0002       0.0278    2
2      GPA             Entered   5.712157       0.0168       0.0393    3
3      TOPNOTCH{1-0}   Entered   2.214363       0.1367       0.0437    4
Logistic Regression
Introduction to the Logistic Regression Model
Interpretation of the Parameter Estimates β – Odds Ratio
Variable Significance – Likelihood Ratio Test
Model Selection
Forward
Backward
Stepwise
Count Response Data
Response variable is the number of occurrences in a given time frame.
Outcomes equal to 0, 1, 2, ….
Examples:
Number of penalties during a football game.
Number of customers who shop at a store on a given day.
Number of car accidents at an intersection.
Poisson Regression
Response Variable –> Number of Days Absent – Integer
Predictor Variables
Gender – 1 if Female, 2 if Male
Ethnicity – 6 Ethnic Categories
School – 1 if School 1, 2 if School 2
Math Test Score – Continuous
Language Test Score – Continuous
Bilingual Status – 4 Bilingual Categories
Poisson Regression
     GENDER  ethnicity  school.1.or.2  ctbs.math.nce  ctbs.lang.nce  bilingual.status  number.days.absent
1    2       4          1              56.988830      42.45086       2                 4
2    2       4          1              37.094160      46.82059       2                 4
3    1       4          1              32.275460      43.56657       2                 2
4    1       4          1              29.056720      43.56657       2                 3
5    1       4          1              6.748048       27.24847       3                 3
6    1       4          1              61.654280      48.41482       0                 13
7    1       4          1              56.988830      40.73543       2                 11
8    2       4          1              10.390490      15.35938       2                 7
9    2       4          1              50.527950      52.11514       2                 10
10   2       6          1              49.472050      42.45086       0                 9
Poisson Regression
Poisson linear model: $Y_i \sim \text{Poisson}(\mu_i)$ with $\mu_i = x_i\beta$
where Yi = response for observation i
xi = 1x(p+1) matrix of covariates for observation i
p = number of covariates
μi = expected number of events given xi
GLM with Poisson random component and identity link g(μ) = μ
Issue: Predicted values range from −∞ to +∞, but an expected count cannot be negative
Poisson Regression
Poisson regression model: $\log(\mu_i) = x_i\beta$
GLM with Poisson random component and log link g(μ) = log(μ)
Predicted response values fall between 0 and +∞
In the case of a single predictor, an increase of one unit in x multiplies μ by exp(β)
Poisson Regression
[Figure: predicted values from the Poisson regression model and the Poisson linear model]
Poisson Regression
> fitline <- glm(number.days.absent ~ ctbs.math.nce, data = poisson_data, family = poisson(link = log))
> summary(fitline)
Call:
glm(formula = number.days.absent ~ ctbs.math.nce, family = poisson(link = log), data = poisson_data)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-4.4451  -2.5583  -1.0842   0.6647  12.4431
Coefficients:
               Estimate Std. Error z value Pr(>|z|)
(Intercept)    2.302100   0.062776  36.671   <2e-16 ***
ctbs.math.nce -0.011568   0.001294  -8.939   <2e-16 ***

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Poisson Regression
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 2409.8 on 315 degrees of freedom
Residual deviance: 2330.6 on 314 degrees of freedom
AIC: 3196
Number of Fisher Scoring iterations: 6
Interpretation of the parameter estimate:
Exp{−0.011568} = 0.9885 = multiplicative effect on the expected number of days absent for an increase of 1 in the Math Score
Fabricated Example – If a student with a math score of 50 is expected to miss 5 days, then another student with a math score of 51 is expected to miss 5 × 0.9885 ≈ 4.94 days
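The multiplicative interpretation can be checked with a few lines of R. This sketch assumes the math score estimate is −0.011568, a negative slope consistent with the effect below 1 reported above. The 5-day baseline is the fabricated value from the example, and the 10-point comparison is an added illustration.

```r
beta_math <- -0.011568            # math score estimate (negative slope assumed)

rate_ratio <- exp(beta_math)      # multiplicative effect of a 1-point increase
print(rate_ratio)

baseline_days <- 5                # fabricated baseline at a math score of 50
print(baseline_days * rate_ratio) # expected days absent at a math score of 51

# Effects multiply, so a 10-point increase scales the mean by rate_ratio^10
print(rate_ratio^10)
print(exp(10 * beta_math))        # same value, computed on the log scale
```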
Poisson Regression
> fitline <- glm(number.days.absent ~ factor(GENDER), data = poisson_data, family = poisson(link = log))
> summary(fitline)
Call:
glm(formula = number.days.absent ~ factor(GENDER), family = poisson(link = log), data = poisson_data)
Deviance Residuals:
   Min      1Q  Median      3Q     Max
-3.660  -2.755  -1.128   0.902   9.738
Coefficients:
                Estimate Std. Error z value Pr(>|z|)
(Intercept)      1.90174    0.03036  62.644  < 2e-16 ***
factor(GENDER)2 -0.31729    0.04747  -6.684 2.32e-11 ***

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Poisson Regression
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 2409.8 on 315 degrees of freedom
Residual deviance: 2364.5 on 314 degrees of freedom
AIC: 3229.9
Number of Fisher Scoring iterations: 5
Important Note: The function factor(categorical variable) uses dummy coding
Interpretation of the parameter estimate:
Exp{−0.31729} = 0.7281 = multiplicative effect on the expected number of days absent of being male rather than female
If a female student is expected to miss X days, then a male student is expected to miss 0.7281*X days.
Poisson Regression
Model with all variables
> fitline <- glm(number.days.absent ~ factor(GENDER) + factor(school.1.or.2) + ctbs.math.nce + ctbs.lang.nce + factor(bilingual.status) +
factor(ethnicity), data = poisson_data, family = poisson(link = log))
> summary(fitline)
Call:
glm(formula = number.days.absent ~ factor(GENDER) + factor(school.1.or.2) +
ctbs.math.nce + ctbs.lang.nce + factor(bilingual.status) +
factor(ethnicity), family = poisson(link = log), data = poisson_data)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-4.5222  -2.1863  -0.9622   0.7454  10.4077
Poisson Regression
Model with all variables Cont
Coefficients:
                           Estimate Std. Error z value Pr(>|z|)
(Intercept)                2.972325   0.424645   7.000 2.57e-12 ***
factor(GENDER)2            0.401980   0.048954   8.211  < 2e-16 ***
factor(school.1.or.2)2     0.582321   0.070717   8.235  < 2e-16 ***
ctbs.math.nce              0.001043   0.001845   0.565  0.57181
ctbs.lang.nce              0.003048   0.002003   1.521  0.12822
factor(bilingual.status)1  0.344696   0.083754   4.116 3.86e-05 ***
factor(bilingual.status)2  0.282194   0.070846   3.983 6.80e-05 ***
factor(bilingual.status)3  0.053406   0.081850   0.652  0.51409
factor(ethnicity)2         0.131202   0.420704   0.312  0.75515
factor(ethnicity)3         0.434061   0.418013   1.038  0.29909
factor(ethnicity)4         0.326230   0.419158   0.778  0.43639
factor(ethnicity)5         0.876270   0.416398   2.104  0.03534 *
factor(ethnicity)6         1.188835   0.457470   2.599  0.00936 **

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Poisson Regression
Model with all variables Cont
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 2409.8 on 315 degrees of freedom
Residual deviance: 1909.2 on 303 degrees of freedom
AIC: 2796.6
Number of Fisher Scoring iterations: 6
Poisson Regression
Model with all variables except Ethnicity
> fitline <- glm(number.days.absent ~ factor(GENDER) + factor(school.1.or.2) + ctbs.math.nce + ctbs.lang.nce + factor(bilingual.status),
data = poisson_data, family = poisson(link = log))
> summary(fitline)
Call:
glm(formula = number.days.absent ~ factor(GENDER) + factor(school.1.or.2) + ctbs.math.nce + ctbs.lang.nce + factor(bilingual.status),
family = poisson(link = log), data = poisson_data)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-4.6955  -2.3130  -0.9115   0.7527  11.4247
Poisson Regression
Model with all variables except Ethnicity
Coefficients:
                            Estimate Std. Error z value Pr(>|z|)
(Intercept)                2.5741133  0.0838754  30.690  < 2e-16 ***
factor(GENDER)2            0.4212841  0.0484383   8.697  < 2e-16 ***
factor(school.1.or.2)2     0.8242109  0.0570241  14.454  < 2e-16 ***
ctbs.math.nce              0.0008193  0.0018278   0.448  0.65398
ctbs.lang.nce              0.0050753  0.0019380   2.619  0.00882 **
factor(bilingual.status)1  0.3080131  0.0762534   4.039 5.36e-05 ***
factor(bilingual.status)2  0.1815997  0.0581877   3.121  0.00180 **
factor(bilingual.status)3  0.0363656  0.0686396   0.530  0.59625

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Poisson Regression
Model with all variables except Ethnicity
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 2409.8 on 315 degrees of freedom
Residual deviance: 1984.1 on 308 degrees of freedom
AIC: 2861.5
Number of Fisher Scoring iterations: 6
Poisson Regression
Model 1 with All Variables – Deviance = −2 Log L = 1909.2 with df = 303
Model 2 without Ethnicity – Deviance = −2 Log L = 1984.1 with df = 308
Likelihood Ratio Test = Deviance (Model 2) – Deviance (Model 1)
= 1984.1 – 1909.2 = 74.9
Likelihood Ratio Test ~ Chi-Square with 308 − 303 = 5 degrees of freedom
P-Value < .0001
There is significant evidence to conclude that ethnicity is a significant predictor variable.
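The arithmetic of this test can be reproduced from the two residual deviances alone. A small sketch using only the figures quoted above:

```r
dev_model1 <- 1909.2; df_model1 <- 303   # all variables
dev_model2 <- 1984.1; df_model2 <- 308   # without ethnicity

lrt <- dev_model2 - dev_model1           # likelihood ratio statistic
df  <- df_model2 - df_model1             # 5 ethnicity dummy variables dropped

# Upper-tail chi-squared probability
p_value <- pchisq(lrt, df = df, lower.tail = FALSE)

print(c(LRT = lrt, df = df))
print(p_value < 1e-4)
```

With both fitted model objects available, `anova(model2, model1, test = "Chisq")` would give the same comparison without copying numbers by hand.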
Poisson Regression
Forward Selection
> fitline <- glm(number.days.absent ~ 1, data = data1, family = poisson(link = log))
> step(fitline,scope = list(upper = ~factor(GENDER)+factor(school.1.or.2)+ctbs.math.nce+ctbs.lang.nce+factor(bilingual.status)+factor(ethnicity), lower = ~1),direction="forward")
Start: AIC=3273.22
number.days.absent ~ 1
Df Deviance AIC
+ factor(school.1.or.2) 1 2103.7 2969.1
+ factor(ethnicity) 5 2095.9 2969.3
+ ctbs.lang.nce 1 2311.7 3177.0
+ ctbs.math.nce 1 2330.6 3196.0
+ factor(bilingual.status) 3 2339.2 3208.6
+ factor(GENDER) 1 2364.5 3229.9
<none> 2409.8 3273.2
Poisson Regression
Forward Selection cont.
Step: AIC=2969.12
number.days.absent ~ factor(school.1.or.2)
Df Deviance AIC
+ factor(ethnicity) 5 2018.7 2894.1
+ factor(GENDER) 1 2029.3 2896.7
+ factor(bilingual.status) 3 2066.0 2937.4
+ ctbs.lang.nce 1 2092.7 2960.1
+ ctbs.math.nce 1 2096.7 2964.1
<none> 2103.7 2969.1

Poisson Regression
Forward Selection cont.
Step: AIC=2894.07
number.days.absent ~ factor(school.1.or.2) + factor(ethnicity)
Df Deviance AIC
+ factor(GENDER) 1 1951.3 2828.7
+ factor(bilingual.status) 3 1981.6 2863.0
+ ctbs.math.nce 1 2011.1 2888.5
+ ctbs.lang.nce 1 2012.5 2889.9
<none> 2018.7 2894.1
Step: AIC=2828.67
number.days.absent ~ factor(school.1.or.2) + factor(ethnicity) + factor(GENDER)
Df Deviance AIC
+ factor(bilingual.status) 3 1915.3 2798.8
+ ctbs.lang.nce 1 1938.5 2817.8
+ ctbs.math.nce 1 1942.3 2821.7
<none> 1951.3 2828.7
Poisson Regression
Forward Selection cont.
Step: AIC=2798.75
number.days.absent ~ factor(school.1.or.2) + factor(ethnicity) + factor(GENDER) + factor(bilingual.status)
Df Deviance AIC
+ ctbs.lang.nce 1 1909.5 2794.9
+ ctbs.math.nce 1 1911.5 2796.9
<none> 1915.3 2798.8
Step: AIC=2794.89
number.days.absent ~ factor(school.1.or.2) + factor(ethnicity) + factor(GENDER) + factor(bilingual.status) + ctbs.lang.nce
Df Deviance AIC
<none> 1909.5 2794.9
+ ctbs.math.nce 1 1909.2 2796.6
Poisson Regression
Forward Selection cont.
Call: glm(formula = number.days.absent ~ factor(school.1.or.2) + factor(ethnicity) + factor(GENDER) + factor(bilingual.status) + ctbs.lang.nce, family = poisson(link = log), data = data1)
Coefficients:
(Intercept) factor(school.1.or.2)2 factor(ethnicity)2 factor(ethnicity)3 factor(ethnicity)4
2.948689 0.586678 0.126806 0.423376 0.313360
factor(ethnicity)5 factor(ethnicity)6 factor(GENDER)2 factor(bilingual.status)1 factor(bilingual.status)2
0.862743 1.175574 0.404215 0.343907 0.284027
factor(bilingual.status)3 ctbs.lang.nce
0.051558 0.003763
Degrees of Freedom: 315 Total (i.e. Null); 304 Residual
Null Deviance: 2410
Poisson Regression
Backward Selection
> fitline <- glm(number.days.absent ~ factor(GENDER) + factor(school.1.or.2) + ctbs.math.nce + ctbs.lang.nce + factor(bilingual.status) +
factor(ethnicity), data = poisson_data, family = poisson(link = log))
> backwards <- step(fitline, direction = "backward")
Start: AIC=2796.57
number.days.absent ~ factor(GENDER) + factor(school.1.or.2) + ctbs.math.nce + ctbs.lang.nce + factor(bilingual.status) +
factor(ethnicity)
Df Deviance AIC
- ctbs.math.nce 1 1909.5 2794.9
<none> 1909.2 2796.6
- ctbs.lang.nce 1 1911.5 2796.9
- factor(bilingual.status) 3 1937.8 2819.2
- factor(ethnicity) 5 1984.1 2861.5
- factor(GENDER) 1 1977.8 2863.2
- factor(school.1.or.2) 1 1983.6 2869.0
Poisson Regression
Backward Selection cont.
Step: AIC=2794.89
number.days.absent ~ factor(GENDER) + factor(school.1.or.2) + ctbs.lang.nce + factor(bilingual.status) + factor(ethnicity)
Df Deviance AIC
<none> 1909.5 2794.9
- ctbs.lang.nce 1 1915.3 2798.8
- factor(bilingual.status) 3 1938.5 2817.8
- factor(ethnicity) 5 1984.3 2859.7
- factor(GENDER) 1 1979.4 2862.8
- factor(school.1.or.2) 1 1986.5 2869.9
Poisson Regression
Stepwise Selection
> fitline <- glm(number.days.absent ~ 1, data = data1, family = poisson(link = log))
> step(fitline,scope = list(upper=~factor(GENDER)+factor(school.1.or.2)+ctbs.math.nce+ctbs.lang.nce+factor(bilingual.status)+factor(ethnicity), lower = ~1),direction="both")
Start: AIC=3273.22
number.days.absent ~ 1
Df Deviance AIC
+ factor(school.1.or.2) 1 2103.7 2969.1
+ factor(ethnicity) 5 2095.9 2969.3
+ ctbs.lang.nce 1 2311.7 3177.0
+ ctbs.math.nce 1 2330.6 3196.0
+ factor(bilingual.status) 3 2339.2 3208.6
+ factor(GENDER) 1 2364.5 3229.9
<none> 2409.8 3273.2
Poisson Regression
Stepwise Selection cont.
Step: AIC=2969.12
number.days.absent ~ factor(school.1.or.2)
Df Deviance AIC
+ factor(ethnicity) 5 2018.7 2894.1
+ factor(GENDER) 1 2029.3 2896.7
+ factor(bilingual.status) 3 2066.0 2937.4
+ ctbs.lang.nce 1 2092.7 2960.1
+ ctbs.math.nce 1 2096.7 2964.1
<none> 2103.7 2969.1
- factor(school.1.or.2) 1 2409.8 3273.2
Poisson Regression
Stepwise Selection cont.
Step: AIC=2894.07
number.days.absent ~ factor(school.1.or.2) + factor(ethnicity)
Df Deviance AIC
+ factor(GENDER) 1 1951.3 2828.7
+ factor(bilingual.status) 3 1981.6 2863.0
+ ctbs.math.nce 1 2011.1 2888.5
+ ctbs.lang.nce 1 2012.5 2889.9
<none> 2018.7 2894.1
- factor(ethnicity) 5 2103.7 2969.1
- factor(school.1.or.2) 1 2095.9 2969.3
Poisson Regression
Stepwise Selection cont.
Step: AIC=2828.67
number.days.absent ~ factor(school.1.or.2) + factor(ethnicity) + factor(GENDER)
Df Deviance AIC
+ factor(bilingual.status) 3 1915.3 2798.8
+ ctbs.lang.nce 1 1938.5 2817.8
+ ctbs.math.nce 1 1942.3 2821.7
<none> 1951.3 2828.7
- factor(GENDER) 1 2018.7 2894.1
- factor(ethnicity) 5 2029.3 2896.7
- factor(school.1.or.2) 1 2050.5 2925.9
Poisson Regression
Stepwise Selection cont.
Step: AIC=2798.75
number.days.absent ~ factor(school.1.or.2) + factor(ethnicity) + factor(GENDER) + factor(bilingual.status)
Df Deviance AIC
+ ctbs.lang.nce 1 1909.5 2794.9
+ ctbs.math.nce 1 1911.5 2796.9
<none> 1915.3 2798.8
- factor(bilingual.status) 3 1951.3 2828.7
- factor(GENDER) 1 1981.6 2863.0
- factor(ethnicity) 5 1993.4 2866.8
- factor(school.1.or.2) 1 2003.4 2884.8
Poisson Regression
Stepwise Selection cont.
Step: AIC=2794.89
number.days.absent ~ factor(school.1.or.2) + factor(ethnicity) + factor(GENDER) + factor(bilingual.status) + ctbs.lang.nce
Df Deviance AIC
<none> 1909.5 2794.9
+ ctbs.math.nce 1 1909.2 2796.6
- ctbs.lang.nce 1 1915.3 2798.8
- factor(bilingual.status) 3 1938.5 2817.8
- factor(ethnicity) 5 1984.3 2859.7
- factor(GENDER) 1 1979.4 2862.8
- factor(school.1.or.2) 1 1986.5 2869.9
Poisson Regression
Stepwise Selection cont.
Call: glm(formula = number.days.absent ~ factor(school.1.or.2) + factor(ethnicity) + factor(GENDER) + factor(bilingual.status) + ctbs.lang.nce, family = poisson(link = log), data = data1)
Coefficients:
(Intercept) factor(school.1.or.2)2 factor(ethnicity)2 factor(ethnicity)3 factor(ethnicity)4
2.948689 0.586678 0.126806 0.423376 0.313360
factor(ethnicity)5 factor(ethnicity)6 factor(GENDER)2 factor(bilingual.status)1 factor(bilingual.status)2
0.862743 1.175574 0.404215 0.343907 0.284027
factor(bilingual.status)3 ctbs.lang.nce
0.051558 0.003763
Degrees of Freedom: 315 Total (i.e. Null); 304 Residual
Null Deviance: 2410
Residual Deviance: 1909 AIC: 2795
Poisson Regression
Taking the sample mean and sample variance of the response for intervals of Math Scores
Poisson Regression
For Yi ~ Poisson(λi), E[Yi] = Var[Yi] = λi
In these data, the variance of the response is much larger than the mean.
A variance larger than the mean is known as overdispersion
Consequences: Parameter estimates are still consistent
Standard errors are inconsistent (typically too small)
Remedy: Negative Binomial model
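One common way to check for overdispersion is to divide the Pearson chi-square by the residual degrees of freedom; values well above 1 suggest the Poisson variance assumption fails. The sketch below uses simulated negative binomial counts (an assumption, not the course data). It also shows a quasi-Poisson fit, which is a different remedy from the negative binomial model named above and is swapped in here only because it needs nothing beyond base R. A negative binomial fit would typically use `glm.nb()` from the MASS package.

```r
set.seed(5)
n  <- 500
x  <- rnorm(n)
mu <- exp(1 + 0.3 * x)
# Simulated overdispersed counts: variance = mu + mu^2 / size, which exceeds mu
y  <- rnbinom(n, size = 1.5, mu = mu)

fit_pois <- glm(y ~ x, family = poisson)

# Pearson chi-square divided by residual degrees of freedom
dispersion <- sum(residuals(fit_pois, type = "pearson")^2) / df.residual(fit_pois)
print(dispersion)

# Quasi-Poisson keeps the same coefficients but rescales the standard errors
fit_quasi <- glm(y ~ x, family = quasipoisson)

se_pois  <- summary(fit_pois)$coefficients[, "Std. Error"]
se_quasi <- summary(fit_quasi)$coefficients[, "Std. Error"]
print(rbind(se_pois, se_quasi))
```

The coefficient estimates are identical in the two fits; only the standard errors change, inflated by the square root of the estimated dispersion.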
Poisson Regression
Introduction to the Poisson Regression Model
Interpretation of β
Variable Significance – Likelihood Ratio Test
Model Selection
Forward
Backward
Stepwise
Overdispersion