# Agenda - PowerPoint PPT Presentation

Two Distribution Families for Modelling Over- and Underdispersed Binomial Frequencies. Feirer V., Hirn U., Friedl H., Bauer W. Institute for Paper, Pulp and Fiber Technology & Institute for Statistics, Graz University of Technology.

Presentation Transcript

Two Distribution Families for Modelling Over- and Underdispersed Binomial Frequencies

Feirer V., Hirn U., Friedl H., Bauer W.

Institute for Paper, Pulp and Fiber Technology & Institute for Statistics, Graz University of Technology

Agenda
• Motivation
• Generalized Linear Models
• Multiplicative Binomial Distribution
• Double Binomial Distribution
• Application of the Two Distributions
• Summary
Motivation
• consider the problem of successful ink transfer on paper
(no. of data points in sample: roughly 9106; sample size: 3 × 6 mm²)
• explain the occurrence of unprinted regions

…part of a larger, industry-funded project at the IPZ.

Predictor Variables
• Topography
• Formation …the way fibres are arranged

Response
• true colour image

Distribution of the Response
• the distribution of the response is part of the Exponential Family
• here: the binomial distribution, whose probability parameter is the probability for successful ink transmission

The Generalized Linear Model*
• the linear predictor is linked to the mean by a link function
• advantages over a linear model:
• distribution of the relative frequencies is a member of the Exponential Family
• mean lies between 0 and 1

* Nelder & Wedderburn (1972). Generalized Linear Models. Journal of the Royal Statistical Society, Series A, 135, 370-384
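How a link function keeps the mean between 0 and 1 can be sketched in a few lines. The example below assumes the logistic link, which the summary slides name for the probability parameter; the coefficient and covariate values are hypothetical and only serve as an illustration.

```python
import math

def logit(mu: float) -> float:
    """Logistic link: map a mean in (0, 1) to the real line."""
    return math.log(mu / (1.0 - mu))

def inv_logit(eta: float) -> float:
    """Inverse link: map any linear predictor back into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients and covariate values, for illustration only.
beta = [-0.5, 1.2, 0.8]   # intercept, topography, formation
x = [1.0, 0.3, -0.4]      # the leading 1.0 multiplies the intercept

eta = sum(b * v for b, v in zip(beta, x))   # linear predictor
mu = inv_logit(eta)                         # fitted success probability
print(f"eta = {eta:.3f}, mu = {mu:.3f}")
```

Whatever value the linear predictor takes, `inv_logit` returns a number strictly between 0 and 1, which a plain linear model cannot guarantee.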

Model Deviance

…a test for goodness-of-fit

Deviance = -2 × ( maximized log-likelihood of considered model – maximized log-likelihood of saturated model )

Under certain regularity conditions, the deviance is approximately χ²-distributed.

• Underdispersion: variance of data smaller than assumed by the model
• Overdispersion: variance of data larger than assumed by the model
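The deviance definition can be made concrete for a binomial model. The sketch below is a minimal illustration, not the authors' code: the counts are made up, the model is an intercept-only fit with one common probability, and the comparison of the deviance with its degrees of freedom is used only as an informal indicator of dispersion.

```python
import math

def xlogy(a: float, b: float) -> float:
    """a * log(a / b) with the convention 0 * log(0) = 0."""
    return 0.0 if a == 0 else a * math.log(a / b)

def binomial_deviance(y, n, p_hat):
    """Deviance of a binomial model against the saturated model.

    y     : observed success counts
    n     : number of trials per observation
    p_hat : fitted success probabilities (strictly between 0 and 1)
    """
    d = 0.0
    for yi, pi in zip(y, p_hat):
        mu = n * pi
        d += xlogy(yi, mu) + xlogy(n - yi, n - mu)
    return 2.0 * d

# Made-up counts out of n = 36, far more spread out than a binomial allows.
n = 36
y = [2, 35, 10, 30, 5, 33, 18, 1]
p_common = sum(y) / (n * len(y))      # intercept-only fit
dev = binomial_deviance(y, n, [p_common] * len(y))
df = len(y) - 1                       # observations minus fitted parameters
print(f"deviance = {dev:.1f} on {df} df, ratio = {dev / df:.1f}")
```

A ratio far above 1 points towards overdispersion and a ratio far below 1 towards underdispersion, relative to the binomial assumption.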

Deviances of the Printability Datasets

…values from 11 different data sets

distinct deviations from a binomial variance!

[Figure: deviances of the data sets, ranging from many to few unprinted areas]

Definition
• introduced by Altham* as a "multiplicative generalization of the binomial distribution"
• considers litters of rabbits: animals within one litter are treated with the same dose of a certain drug
• n… litter size
• y… number of surviving animals
• outcomes from animals within one litter are not mutually independent
• Altham introduces an interaction parameter ω

* Altham (1978). Two Generalizations of the Binomial Distribution. Journal of the Royal Statistical Society, Series C (Applied Statistics), 27, 162-167

Properties
• Member of the 2-parameter Exponential Family
• For ω=1, it corresponds to the Binomial Distribution
• For n=1, it reduces to the Bernoulli distribution
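These properties can be checked numerically. The sketch below assumes the form in which the multiplicative binomial is commonly written, with probabilities proportional to C(n, y) · p^y · (1 − p)^(n − y) · ω^(y(n − y)); the name `p` for the probability parameter is my own choice, and the slides may use a different parametrization.

```python
import math

def multiplicative_binomial_pmf(n: int, p: float, omega: float):
    """Return [P(Y=0), ..., P(Y=n)] for the multiplicative binomial.

    Assumed form: P(Y=y) proportional to
        C(n, y) * p**y * (1 - p)**(n - y) * omega**(y * (n - y)),
    normalised by summing over y = 0..n.
    """
    weights = [
        math.comb(n, y) * p**y * (1 - p) ** (n - y) * omega ** (y * (n - y))
        for y in range(n + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]

def mean_var(pmf):
    """Mean and variance of a distribution on 0..len(pmf)-1."""
    m = sum(y * q for y, q in enumerate(pmf))
    v = sum((y - m) ** 2 * q for y, q in enumerate(pmf))
    return m, v

# With p = 0.5 the distribution is symmetric, so the mean stays at n / 2
# and any change in the variance is due to omega alone.
n, p = 36, 0.5
for omega in (0.95, 1.0, 1.05):
    m, v = mean_var(multiplicative_binomial_pmf(n, p, omega))
    print(f"omega = {omega}: mean = {m:.2f}, variance = {v:.2f}")
```

Under this assumed form, ω = 1 removes the extra factor and leaves the binomial, and for n = 1 the exponent y(n − y) is always 0, so ω has no effect. In the symmetric case shown, ω below 1 spreads the probability mass out and ω above 1 concentrates it.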
Comparison With Classic Binomial pdf

n = 36

probability parameter = 0.8

ω = 1 gives the classic binomial distribution

Comparison of the Variances

n = 36

ω=1 gives the classic binomial distribution

Integration into GLM Context

log-likelihood function of distribution

parameter ranges: ω > 0, 0 < probability parameter < 1

Definition
• introduced by Efron* as part of the Double Exponential Family
• a second parameter, the dispersion parameter, allows variation of the variance:
• variance is smaller than binomial if the dispersion parameter lies between 0 and 1
• variance is larger than binomial if the dispersion parameter is greater than 1
• a dispersion parameter of 1 gives the classic binomial distribution

* Efron (1986). Double Exponential Families and their Use in Generalized Linear Regression. Journal of the American Statistical Association, 81, 709-721
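A numerical sketch of the double binomial follows. It uses Efron's construction, a weighted geometric mean of the fitted binomial and the saturated binomial, normalised here by direct summation. The names `p` and `psi` are my own, and `psi` is assumed to be the reciprocal of Efron's θ so that values above 1 inflate the variance as the slide describes; the slides' actual symbol and parametrization are not recoverable from the transcript.

```python
import math

def xlogx(a: float) -> float:
    """a * log(a) with the convention 0 * log(0) = 0."""
    return 0.0 if a == 0 else a * math.log(a)

def double_binomial_pmf(n: int, p: float, psi: float):
    """Return [P(Y=0), ..., P(Y=n)] for a double binomial distribution.

    Weights are (fitted binomial)**theta * (saturated binomial)**(1 - theta)
    with theta = 1 / psi (assumed), normalised by summing over y = 0..n.
    """
    theta = 1.0 / psi
    log_w = []
    for y in range(n + 1):
        log_comb = math.log(math.comb(n, y))
        # binomial log-probability at the model's probability p
        log_fitted = log_comb + y * math.log(p) + (n - y) * math.log(1 - p)
        # binomial log-probability at the saturated estimate y / n
        log_saturated = log_comb + xlogx(y) + xlogx(n - y) - xlogx(n)
        log_w.append(theta * log_fitted + (1.0 - theta) * log_saturated)
    top = max(log_w)                       # subtract the maximum for stability
    weights = [math.exp(v - top) for v in log_w]
    total = sum(weights)
    return [w / total for w in weights]

def mean_var(pmf):
    """Mean and variance of a distribution on 0..len(pmf)-1."""
    m = sum(y * q for y, q in enumerate(pmf))
    v = sum((y - m) ** 2 * q for y, q in enumerate(pmf))
    return m, v

n, p = 36, 0.5
for psi in (0.5, 1.0, 2.0):
    m, v = mean_var(double_binomial_pmf(n, p, psi))
    print(f"psi = {psi}: mean = {m:.2f}, variance = {v:.2f}")
```

With `psi = 1` the saturated term drops out and the weights are exactly the binomial probabilities. The symmetric choice p = 0.5 keeps the mean at n / 2, so the printed variances isolate the effect of the second parameter.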

Comparison With Classic Binomial pdf

n = 36

probability parameter = 0.8

a dispersion parameter of 1 gives the classic binomial distribution

Comparison of the Variances

n = 36

a dispersion parameter of 1 gives the classic binomial distribution

Integration into GLM Context

member of the 2-parameter exponential family

log-likelihood function of distribution

parameter ranges: dispersion parameter > 0, 0 < probability parameter < 1

Response and Explanatory Variables

occurrence of unprinted areas ~ topography + formation

("~" reads as "explained by…")

Comparison of the Means

The second parameter influences the mean, too.

Comparison of the Variances
• binomial Std. Dev. at n=36: cannot be larger than 3
• empirical Std. Deviations: up to 11
• Multiplicative and Double Binomial Standard Deviations fit the empirical results much better
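The bound of 3 follows from the binomial standard deviation √(n·p·(1 − p)), which is largest at p = 0.5. A quick check over a grid of probabilities (the grid and the name `p` are my own choices):

```python
import math

n = 36
# Binomial standard deviation sqrt(n * p * (1 - p)) on a grid of p values.
grid = [i / 1000 for i in range(1001)]
sd = [math.sqrt(n * p * (1 - p)) for p in grid]

p_at_max = grid[sd.index(max(sd))]
print(p_at_max, max(sd))   # prints: 0.5 3.0
```

At p = 0.5 the variance is 36 · 0.25 = 9, so a binomial model with n = 36 cannot reproduce empirical standard deviations anywhere near 11.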

Summary

Two generalizations of the binomial distribution might compensate for over- or underdispersion relative to the classic binomial distribution.

Multiplicative Binomial Distribution (Altham, 1978)
• second parameter ω
• in GLM context: model the probability parameter with the logistic link and ω with the log-linear link function

Summary 2

Double Binomial Distribution (Efron, 1986)
• second parameter: the dispersion parameter
• in GLM context: model the probability parameter with the logistic link and the dispersion parameter with the log-linear link function