- 124 Views
- Uploaded on
- Presentation posted in: General

Data: Crab mating patterns

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Data: Challenger

(Binomial with random effects)

Data: Crab mating patterns

(Poisson Regression,

ZIP model, Negative Binomial)

Flu Data:

(Binomial with random effects)

Data: Typists

(Poisson with random effects)

D.A.D.

Mixed (not generalized) Models:

Fixed Effects and Random Effects

D.A.D.

SAS Global Forum 2010

“Generalized” non normal distribution

Binary for probabilities: Y=0 or 1

Mean p

Pr{Y=j}= pj(1-p)(1-j)

Link: L=ln(p/(1-p)) = “Logit”

Range (over all L): 0<p<1

Poisson for counts: Y in {0,1,2,3,4, ….}

Mean count l

Pr{Y=j} = exp(- l )(lj)/(j!)

Link: L = log(l)

Range (over all L): l >0

D.A.D.

SAS Global Forum 2010

Generalized (not mixed) linear models.

Use link L = g(E{Y}), e.g. ln(p/(1-p)) =ln(E{Y}/(1-E{Y})

Assume L is linear model in the inputs with fixed effects.

Estimate model for L, e.g. L=g(E{Y})=bo + b1 X

Use maximum likelihood

Example:

L = -1 + .18*dose

Dose = 10, L=0.8,

p=exp(0.8)/(1+exp(0.8))=

inverse link = 0.86

D.A.D.

SAS Global Forum 2010

Challenger was mission 24

From 23 previous launches we have:

6 O-rings per mission

Y=0 no damage, Y=1 erosion or blowby

p = Pr {Y=1} = f{mission, launch temperature)

Features: Random mission effects

Logistic link for p

procglimmix data=O_ring;

class mission;

model fail = temp/dist=binomial s;

random mission;

run;

- Generalized
- Mixed

Demo

O_rings.sas

D.A.D.

SAS Global Forum 2010

Estimated G matrix is not positive definite.

Covariance Parameter Estimates

Cov Standard

Parm Estimate Error

mission 2.25E-18 .

Solutions for Fixed Effects

Effect Estimate Error DF t Value Pr > |t|

Intercept 5.0850 3.0525 21 1.67 0.1106

temp -0.1156 0.04702 115 -2.46 0.0154

Just logistic regression – no mission variance component

D.A.D.

SAS Global Forum 2010

Flu Data CDC

Active Flu Virus

Weekly Data % positive

data FLU;

input fluseasn year t week pos specimens;

pct_pos=100*pos/specimens; logit=log(pct_pos/100/(1+(pct_pos/100)));

label pos = "# positive specimens";

label pct_pos="% positive specimens";

label t = "Week into flu season (first = week 40)";

label week = "Actual week of year";

label fluseasn = "Year flu season started";

logit pct. pos.

Demo

Get_Flu.sas

“Sinusoids”

S(j) = sin(2pjt/52) C(j)=cos(2pjt/52)

PROCGLM DATA=FLU;

class fluseasn;

model logit = s1 c1

fluseasn*s1 fluseasn*c1 fluseasn*s2 fluseasn*c2

fluseasn*s3 fluseasn*c3 fluseasn*s4 fluseasn*c4;

output out=out1 p=p;

data out1; set out1; P_hat = exp(p)/(1+exp(p));

label P_hat = "Pr{pos. sample} (est.)"; run;

(1) GLM all effects fixed

(harmonic main effects

insignificant)

Logit scale

Demo

Flu_GLM.sas

PROCMIXED DATA=FLU method=ml; ** reduced model;

class fluseasn;

model logit = s1 c1 /outp=outp outpm=outpm ddfm=kr;

random intercept/subject=fluseasn;

random s1 c1/subject=fluseasn type=toep(1);

random s2 c2/subject=fluseasn type=toep(1);

random s3 c3/subject=fluseasn type=toep(1);

random s4 c4/subject=fluseasn type=toep(1); run;

(2) MIXED analysis

on logits

Random harmonics.

Normality assumed

Logit scale

Probability scale

Demo

Flu_MIXED.sas

PROCGLIMMIX DATA=FLU; title2 "GLIMMIX Analysis";

class fluseasn;

model pos/specimens = s1 c1 ; * s2 c2 s3 c3 s4 c4;

random intercept/subject=fluseasn;

random s1 c1/subject=fluseasn type=toep(1);

random s2 c2/subject=fluseasn; ** Toep(1) - no converge;

random s3 c3/subject=fluseasn type=toep(1);

random s4 c4/subject=fluseasn type=toep(1);

random _residual_;

output out=out2 pred(ilink blup)=pblup

pred(ilink noblup) overallpearson = p_resid; run;

(3) GLIMMIX

analysis

Random harmonics.

Binomial assumed

(overdispersed –

lab effects?)

Mean – no BLUPs

Demo

Flu_GLIMMIX.sas

D.A.D.

SAS Global Forum 2010

Flu data

Binomial

random _residual_ does not affect

the fit (just standard errors)

Could try Beta distribution instead:

PROCGLIMMIX DATA=FLU; title2 "GLIMMIX Analysis";

class fluseasn;

model f = s1 c1 /dist=beta link=logit s;

random intercept/subject=fluseasn;

random s1 c1/subject=fluseasn type=toep(1);

random s2 c2/subject=fluseasn type=toep(1);

random s3 c3/subject=fluseasn type=toep(1);

random s4 c4/subject=fluseasn type=toep(1);

output out=out3 pred(ilink blup)=pblup

pred(ilink noblup)=overall

pearson=p_residbeta; run;

D.A.D.

SAS Global Forum 2010

Horseshoe Crab study

(reference: SAS GLIMMIX course notes):

Female nests have “satellite” males

Count data – Poisson?Generalized Linear

Features (predictors):

Carapace Width, Weight, Color, Spine condition

Random Effect: SiteMixed Model

Demo

Get_Horseshoe.sas

procglimmix data=crab;

class site;

model satellites = weight width / dist=poi solution ddfm=kr;

random int / subject=site;

output out=overdisp pearson=pearson; run;

procmeans data=overdisp n mean var;

var pearson; run;

procunivariate data=crab normal plot;

var satellites; run;

N Mean Variance

---------------------------

173 -0.0258264 2.6737114

---------------------------

Histogram # Boxplot

15.5+* 1 0

.* 1 0

.

12.5+* 1 |

.* 1 |

.** 3 |

9.5+** 3 |

.*** 6 |

.** 4 |

6.5+******* 13 |

.******** 15 +-----+

.********** 19 | |

3.5+********** 19 | |

.***** 9 *--+--*

.******** 16 | |

0.5+******************************* 62 +-----+

----+----+----+----+----+----+-

* may represent up to 2 counts

Fit Statistics

Gener. Chi-Square / DF 2.77

Cov Parm Subject Estimate

Intercept site 0.1625

Effect Estimate Pr > |t|

Intercept -1.1019 0.2527

weight 0.5042 0.0035

width 0.0318 0.5229

Zero Inflated ?

Demo

Crabs_OVERDISP.sas

D.A.D.

SAS Global Forum 2010

Zero Inflated Poisson (ZIP)

D.A.D.

SAS Global Forum 2010

Zero Inflated Poisson (ZIP)

procnlmixed data=crab;

parms b0=0 bwidth=0 bweight=0 c0=-2 c1=0s2u1=1 s2u2=1;

x=c0+c1*width+u1; p0 = exp(x)/(1+exp(x));

eta= b0+bwidth*width +bweight*weight +u2;

lambda=exp(eta);

if satellites=0 then

loglike = log(p0 +(1-p0)*exp(-lambda));

else loglike =

log(1-p0)+satellites*log(lambda)-lambda-lgamma(satellites+1);

expected=(1-p0)*lambda; id p0 expected lambda;

model satellites~general(loglike);

Random U1 U2~N([0,0],[s2u1,0,s2u2]) subject=site;

predict p0+(1-p0)*exp(-lambda) out=out1; run;

D.A.D.

SAS Global Forum 2010

Zero Inflated Poisson (ZIP)

Parameter Estimates

Parameter Estimate t Pr>|t| Lower Upper

b0 2.7897 2.55 0.0268 0.3853 5.1942

bwidth -0.0944 -1.65 0.1267-0.2202 0.0314

bweight 0.4649 2.38 0.0366 0.0347 0.8952

c0 13.3739 4.42 0.0010 6.7078 20.0401

c1 -0.5447 -4.61 0.0008 -0.8049 -0.2844

s2u1 0.5114 1.12 0.2852 -0.4905 1.5133

s2u2 0.1054 1.67 0.1239 -0.0339 0.2447

Demo

Crabs_ZIP.sas

D.A.D.

SAS Global Forum 2010

From fixed part of model, compute

Pr{count=j} and plot (3D) versus

Weight, Carapace width

D.A.D.

SAS Global Forum 2010

Another possibility: Negative binomial

Number of failures until kth success ( p=Prob{success} )

D.A.D.

SAS Global Forum 2010

3 trials before first success

Negative Binomial

Crab

beer

Crab

beer

Satellites

D.A.D.

Negative binomial:

In SAS, k is our 1/k

SAS Global Forum 2010

procglimmix data=crab; class site;

model satellites = weight width / dist=nb solution ddfm=kr;

random int / subject=site; run;

Fit Statistics

-2 Res Log Pseudo-Likelihood 539.06

Generalized Chi-Square 174.83

Gener. Chi-Square / DF 1.03

Covariance Parameter Estimates

Cov Parm Subject Estimate Std. Error

Intercept site 0.09527 0.07979

Scale 0.7659 0.1349

Standard

Effect Estimate Error DF t Value Pr > |t|

Intercept -1.2022 1.6883 168.5 -0.71 0.4774

weight 0.6759 0.3239 156.6 2.09 0.0386

width 0.01905 0.08943 166.2 0.21 0.8316

Num Den

Effect DF DF F Value Pr > F

weight 1 156.6 4.35 0.0386

width 1 166.2 0.05 0.8316

Demo

Crabs_NEGBIN.sas

Population average model vs. Individual Specific Model

8 typists

Y=Error counts (Poisson distributed)

ln(li)= ln(mean of Poisson) = m+Ui for typist i.

conditionally (individual specific)

Distributions for Y, U~N(0,1) and m=1

l=em=e1=2.7183

= mean for

“typical”

typist

D.A.D.

SAS Global Forum 2010

Population average model

Expectation ||||| | | of individual distributions averaged

across population of all typists.

Run same simulation for 8000 typists, compute mean

of conditional population means, exp(m+U).

The MEANS Procedure

Variable N Mean Std Dev Std Error

ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ

lambda 8000 4.4280478 6.0083831 0.067175

ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ

Z=(4.428-2.7183)/0.06718 = 25.46 !!

Population mean is not em

Conditional means, m+U, are lognormal.

Log(Y)~N(1,1) E{Y}=exp(m+0.5s2) = e1.5 = 4.4817

Demo

Typists.sas

D.A.D.

SAS Global Forum 2010

- Main points:
- Generalized linear models with random effects are subject specific models.
- Subject specific models have fixed effects that represent an individual with random effects 0 (individual at the random effect distributional means).
- Subject specific models when averaged over the subjects do not give the model fixed effects.
- Models with only fixed effects do give the fixed effect part of the model when averaged over subjects and are thus called population average models.