Negative Binomial Regression


NASCAR Lead Changes 1975-1979


Data Description
• Units – 151 NASCAR races during the 1975-1979 Seasons
• Response - # of Lead Changes in a Race
• Predictors:
• # Laps in the Race
• # Drivers in the Race
• Track Length (Circumference, in miles)
• Models:
• Poisson (assumes E(Y) = V(Y))
• Negative Binomial (Allows for V(Y) > E(Y))
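The difference between the two variance assumptions can be seen numerically. The sketch below assumes the common negative binomial form V(Y) = μ + μ²/k, which is consistent with the later remark that SAS and Stata estimate 1/k. The function names and the example values of μ and k are illustrative and are not taken from the slides.

```python
def poisson_variance(mu):
    """Poisson: the variance equals the mean."""
    return mu


def negbin_variance(mu, k):
    """Negative binomial (assumed form): V(Y) = mu + mu**2 / k, with k > 0."""
    return mu + mu ** 2 / k


# Hypothetical mean and dispersion values, for illustration only.
mu = 20.0
for k in (1.0, 5.0, 100.0):
    print(k, poisson_variance(mu), negbin_variance(mu, k))
```

As k grows, the extra term μ²/k shrinks and the negative binomial variance approaches the Poisson variance.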
Poisson Regression
• Random Component: Poisson Distribution for # of Lead Changes
• Systematic Component: Linear function with Predictors: Laps, Drivers, Trklength
• Link Function: log: g(μ) = ln(μ)
Regression Coefficients – Z-tests
• Note: All predictors are highly significant.
• Holding all other factors constant:
• As # of laps increases, lead changes increase
• As # of drivers increases, lead changes increase
• As Track Length increases, lead changes increase
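With a log link, each coefficient acts multiplicatively on the mean: raising a predictor by one unit multiplies the expected number of lead changes by exp(coefficient). The sketch below illustrates this. The coefficient values are hypothetical placeholders, because the estimates from the slide's table are not reproduced in this transcript.

```python
import math

# Hypothetical coefficients (intercept, Laps, Drivers, Trklength).
# These are NOT the fitted values from the slides.
beta = {"intercept": -0.5, "laps": 0.002, "drivers": 0.05, "trklength": 0.5}


def expected_lead_changes(laps, drivers, trklength):
    """Mean under the log link: mu = exp(linear predictor)."""
    eta = (beta["intercept"] + beta["laps"] * laps
           + beta["drivers"] * drivers + beta["trklength"] * trklength)
    return math.exp(eta)


base = expected_lead_changes(laps=200, drivers=35, trklength=1.5)
more = expected_lead_changes(laps=200, drivers=36, trklength=1.5)

# One more driver multiplies the mean by exp(beta_drivers).
print(more / base, math.exp(beta["drivers"]))
```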
Testing Goodness-of-Fit
• Break races down into 10 groups of approximately equal size based on their fitted values
• The Pearson residuals are obtained by computing:
• Under the hypothesis that the model is adequate, X² is approximately chi-square with 10-4=6 degrees of freedom (10 cells, 4 estimated parameters).
• The critical value for an α=0.05 level test is 12.59.
• The data (next slide) clearly are not consistent with the model.
• Note that the variances within each group are orders of magnitude larger than the mean.
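The grouping procedure above can be sketched as follows. The formula on the slide is not reproduced in this transcript, so the sketch assumes the usual grouped Pearson form: within each cell, compare the total observed count O with the total fitted count E and sum (O - E)²/E over the cells. The function name and the data are hypothetical.

```python
def grouped_pearson_chi_square(observed, fitted, n_groups=10):
    """Sort races by fitted value, split them into n_groups cells of roughly
    equal size, and sum (O - E)^2 / E over the cells (assumed form)."""
    pairs = sorted(zip(fitted, observed))
    n = len(pairs)
    x2 = 0.0
    for g in range(n_groups):
        cell = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        expected = sum(f for f, _ in cell)   # total fitted count in the cell
        obs = sum(o for _, o in cell)        # total observed count in the cell
        x2 += (obs - expected) ** 2 / expected
    return x2


# Hypothetical observed counts and fitted means for 20 races.
observed = [3, 9, 1, 14, 6, 22, 4, 30, 8, 2, 17, 5, 40, 11, 7, 25, 3, 19, 10, 33]
fitted = [5.0, 7.5, 4.0, 10.0, 6.0, 15.0, 5.5, 20.0, 8.0, 4.5,
          12.0, 6.5, 24.0, 9.0, 7.0, 16.0, 5.2, 13.0, 9.5, 22.0]

CRITICAL_VALUE = 12.59  # chi-square, 6 df, alpha = 0.05 (from the slide)
x2 = grouped_pearson_chi_square(observed, fitted, n_groups=10)
print(x2, x2 > CRITICAL_VALUE)
```

The function needs at least as many races as groups, so that no cell is empty.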
Testing Goodness-of-Fit

107.4 >> 12.59, so the data are not consistent with the Poisson model

Negative Binomial Regression

Random Component: Negative Binomial Distribution for # of Lead Changes

Systematic Component: Linear function with Predictors: Laps, Drivers, Trklength

Link Function: log: g(μ) = ln(μ)

Regression Coefficients – Z-tests

Note that SAS and STATA estimate 1/k in this model.

Goodness-of-Fit Test
• Clearly this model fits better than the Poisson regression model.
• For the negative binomial model, SD/mean is estimated to be 0.43 = sqrt(1/k).
• For these 10 cells, ratios range from 0.24 to 0.67, consistent with that value.
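The relation 0.43 = sqrt(1/k) can be inverted to recover k. The sketch below does that arithmetic and, assuming the variance form V(Y) = μ + μ²/k, shows why sqrt(1/k) is the large-mean limit of SD/mean: under that assumption SD/mean = sqrt(1/μ + 1/k). The function name and the example means are illustrative.

```python
import math

sd_over_mean = 0.43              # value reported on the slide
k = 1.0 / sd_over_mean ** 2      # invert 0.43 = sqrt(1/k)
print(k)


def sd_to_mean_ratio(mu, k):
    """SD/mean under the assumed variance V(Y) = mu + mu**2 / k."""
    return math.sqrt(1.0 / mu + 1.0 / k)


# Hypothetical means: the ratio approaches sqrt(1/k) as the mean grows.
for mu in (5.0, 20.0, 100.0):
    print(mu, sd_to_mean_ratio(mu, k))
```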
Computational Aspects - I

k is restricted to be positive, so we estimate k* = log(k), which can take on any value. Note that software packages estimating 1/k are estimating exp(-k*), so the log of their parameter is -k*.

Likelihood Function:

Log-Likelihood Function:
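The formulas from these slides are not reproduced in the transcript. For reference, a standard form of the negative binomial likelihood in the mean/dispersion parameterization is given below. It is assumed here, not confirmed by the slides, that this is the form used: mean μ_i = exp(x_i'β), variance μ_i + μ_i²/k, and k = exp(k*).

```latex
\begin{align*}
L(k,\beta) &= \prod_{i=1}^{n} \frac{\Gamma(y_i+k)}{\Gamma(k)\,y_i!}
  \left(\frac{k}{k+\mu_i}\right)^{k}
  \left(\frac{\mu_i}{k+\mu_i}\right)^{y_i},
  \qquad \mu_i = \exp(x_i'\beta),\quad k = \exp(k^*) \\[4pt]
\ell(k^*,\beta) &= \sum_{i=1}^{n} \Big[ \ln\Gamma(y_i+k) - \ln\Gamma(k) - \ln(y_i!)
  + k\ln k + y_i\ln\mu_i - (k+y_i)\ln(k+\mu_i) \Big]
\end{align*}
```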

Computational Aspects - II

Derivatives with respect to k* and β:
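The derivative formulas from this slide are not reproduced in the transcript. The sketch below assumes the standard negative binomial log-likelihood with mean μ_i = exp(x_i'β) and variance μ_i + μ_i²/k, where k = exp(k*). It codes the resulting first derivatives and checks them against central finite differences. All function names and data are hypothetical.

```python
import math


def mean_i(beta, x_i):
    """Log link: mu_i = exp(x_i' beta)."""
    return math.exp(sum(b * x for b, x in zip(beta, x_i)))


def nb_loglik(k_star, beta, X, y):
    """Assumed NB log-likelihood, parameterized by k* = log(k)."""
    k = math.exp(k_star)
    total = 0.0
    for x_i, y_i in zip(X, y):
        mu = mean_i(beta, x_i)
        total += (math.lgamma(y_i + k) - math.lgamma(k) - math.lgamma(y_i + 1)
                  + k * math.log(k / (k + mu)) + y_i * math.log(mu / (k + mu)))
    return total


def nb_score(k_star, beta, X, y):
    """First derivatives of nb_loglik with respect to k* and each beta_j."""
    k = math.exp(k_star)
    d_k_star = 0.0
    d_beta = [0.0] * len(beta)
    for x_i, y_i in zip(X, y):
        mu = mean_i(beta, x_i)
        # digamma(y + k) - digamma(k) is a finite sum for integer y.
        digamma_diff = sum(1.0 / (k + j) for j in range(y_i))
        # Chain rule: d/dk* = k * d/dk.
        d_k_star += k * (digamma_diff + math.log(k / (k + mu))
                         + (mu - y_i) / (k + mu))
        weight = k * (y_i - mu) / (k + mu)
        for j, x in enumerate(x_i):
            d_beta[j] += weight * x
    return d_k_star, d_beta


# Hypothetical design matrix (intercept plus one predictor) and counts.
X = [[1.0, 0.5], [1.0, 1.0], [1.0, 1.5], [1.0, 2.0], [1.0, 2.5]]
y = [3, 8, 6, 15, 30]
k_star, beta = 0.3, [1.0, 0.8]

d_k, d_b = nb_score(k_star, beta, X, y)
h = 1e-6
fd_k = (nb_loglik(k_star + h, beta, X, y)
        - nb_loglik(k_star - h, beta, X, y)) / (2 * h)
print(d_k, fd_k)
```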

Computational Aspects - III

Newton-Raphson Algorithm Steps:

Step 1: Set k*=0 (k=1) and iterate to obtain an estimate of β:

Step 2: Set β' = [1 0 0 0] and iterate to obtain an estimate of k*:

Step 3: Use the results from steps 1 and 2 as starting values (software packages seem to use a different intercept) to obtain estimates of k* and β

Step 4: Back-transform k* to get estimate of k: k=exp(k*)
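Steps 2 and 4 can be sketched in one dimension: hold the fitted means fixed, run Newton-Raphson on k* starting from k*=0, then back-transform. This is a simplified illustration and not the slides' implementation. It assumes the standard negative binomial score for k*, approximates the second derivative by a finite difference of the score, caps each step at 1 in absolute value, and uses hypothetical counts with a common mean (an intercept-only fit) in place of β' = [1 0 0 0].

```python
import math


def score_k_star(k_star, y, mu):
    """d(log-likelihood)/d(k*) for the assumed NB form, with means mu fixed."""
    k = math.exp(k_star)
    total = 0.0
    for y_i, mu_i in zip(y, mu):
        # digamma(y + k) - digamma(k) is a finite sum for integer y.
        digamma_diff = sum(1.0 / (k + j) for j in range(y_i))
        total += (digamma_diff + math.log(k / (k + mu_i))
                  + (mu_i - y_i) / (k + mu_i))
    return k * total


def newton_k_star(y, mu, k_star=0.0, tol=1e-9, max_iter=100, h=1e-5):
    """Newton-Raphson on k*; starts at k* = 0 (k = 1) as in Step 1."""
    for _ in range(max_iter):
        g = score_k_star(k_star, y, mu)
        if abs(g) < tol:
            break
        # Second derivative via central difference of the score (a shortcut).
        g_prime = (score_k_star(k_star + h, y, mu)
                   - score_k_star(k_star - h, y, mu)) / (2 * h)
        # Newton step where the curvature is negative; otherwise move uphill.
        step = -g / g_prime if g_prime < 0 else math.copysign(0.5, g)
        k_star += max(-1.0, min(1.0, step))  # cap the step for stability
    return k_star


# Hypothetical overdispersed counts; all means fixed at the sample mean.
y = [2, 9, 4, 15, 7, 1, 22, 5, 11, 3]
mu = [sum(y) / len(y)] * len(y)

k_star_hat = newton_k_star(y, mu)
k_hat = math.exp(k_star_hat)  # Step 4: back-transform k* to k
print(k_star_hat, k_hat)
```

Working on the k* scale keeps every iterate valid, since exp(k*) is positive for any real k*.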