Different Distributions

David Purdie


Topics

Application of GEE to:

  • Binary outcomes:

    • logistic regression

  • Events over time (rate):

    • Poisson regression

  • Survival data:

    • Cox regression


General form for distributions from the exponential family

Outcome for subject $i$ at time $j$: $Y_{ij}$

$E(Y_{ij}) = \mu_{ij}$

Generalized linear model:

$g(\mu_{ij}) = X_i \beta$

where $X_i = (x_{i1}, \ldots, x_{ij})$ is the matrix of covariates for subject $i$


Binary outcomes: logistic regression

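Outcome:

$\Pr(Y_{ij} = 1) = \mu_{ij}$ (probability of an event)

$\Pr(Y_{ij} = 0) = 1 - \mu_{ij}$

Logit link function: $g(\mu_{ij}) = \log\{\mu_{ij} / (1 - \mu_{ij})\}$

Logistic model: $\log\{\mu_{ij} / (1 - \mu_{ij})\} = X_i \beta$

where $\mu_{ij} = E(Y_{ij} \mid X_i)$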
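As a small numerical sketch of the logit link, the following Python snippet maps a probability to the log-odds scale and back. Python is used here only for illustration; the function names and the coefficient values are hypothetical and do not come from the slides.

```python
import math

def logit(mu):
    """Log-odds of a probability mu, with 0 < mu < 1."""
    return math.log(mu / (1.0 - mu))

def inv_logit(eta):
    """Inverse of the logit: maps a linear predictor back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical intercept, slope and covariate value (illustration only).
beta0, beta1 = -1.0, 0.5
x = 2.0

eta = beta0 + beta1 * x        # linear predictor on the log-odds scale
mu = inv_logit(eta)            # implied probability of an event
odds_ratio = math.exp(beta1)   # odds ratio for a one-unit increase in x

print(eta, mu, odds_ratio)
```

On the logit scale a coefficient is a log odds ratio, which is why exponentiating it gives an odds ratio.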


Events over time: Poisson regression

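  • Outcome: $Y_i$ = number of events in time period $t_i$

  • $E(Y_i) = \lambda_i t_i$ and $\mathrm{Var}(Y_i) = \lambda_i t_i$ (where $\lambda_i$ is the event rate)

  • Log link function: $\log(\lambda_i)$

  • Poisson model: $\log(\lambda_i) = X_i \beta$, equivalently $\log E(Y_i) = \log(t_i) + X_i \beta$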
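The relationship between the event rate, the length of the time period and the expected count can be sketched in a few lines of Python. The function names and the numbers are hypothetical; the only point is that modelling the log rate is the same as modelling the log mean count with the log of the time period added.

```python
import math

def poisson_mean(rate, time):
    """Expected number of events: rate per unit time multiplied by time."""
    return rate * time

def log_mean(log_rate, time):
    """Log of the expected count: log(time) plus the log rate."""
    return math.log(time) + log_rate

# Hypothetical values (illustration only): 0.2 events per month over 5 months.
rate, time = 0.2, 5.0

mean_direct = poisson_mean(rate, time)
mean_via_log = math.exp(log_mean(math.log(rate), time))

print(mean_direct, mean_via_log)
```

For a Poisson outcome the variance equals this mean, which is the assumption relaxed by a scale parameter in the SAS examples.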


Survival data: Cox regression

Parameter: $t_{ij}$ (time to event $y_{ij}$)

Outcome: $T_{ij}$ = time until event $y_{ij}$

Based on a hazard function: $h_t$

Log link function: $\log(h_t)$

Cox model: $\log(h_t) = \log(\lambda_t) + X_i \beta$

where $\lambda_t$ is the baseline hazard rate.


Alternating logistic regression

If the responses are binary, it may make more sense to use a matrix of odds ratios rather than correlations.

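Replace $\mathrm{corr}(Y_{ij}, Y_{ik})$ with the pairwise odds ratio:

$\mathrm{OR}(Y_{ij}, Y_{ik}) = \dfrac{\Pr(Y_{ij}=1, Y_{ik}=1)\,\Pr(Y_{ij}=0, Y_{ik}=0)}{\Pr(Y_{ij}=1, Y_{ik}=0)\,\Pr(Y_{ij}=0, Y_{ik}=1)}$

The ALR algorithm models $\gamma_{ijk} = \log\{\mathrm{OR}(Y_{ij}, Y_{ik})\}$ as:

$\gamma_{ijk} = z_{ijk}^\top \alpha$

where $\alpha$ are regression parameters and $z_{ijk}$ is fixed and needs to be specified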
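To make the pairwise odds ratio concrete, the sketch below computes an empirical log odds ratio for one pair of binary responses from a 2x2 table of counts. This is only a descriptive illustration in Python with a hypothetical function name and made-up counts; it is not the ALR estimation algorithm itself.

```python
import math

def pairwise_log_odds_ratio(n11, n10, n01, n00):
    """Empirical log odds ratio for two binary responses.

    n11: both responses equal 1, n00: both equal 0,
    n10 and n01: the two discordant patterns.
    All four counts must be positive.
    """
    return math.log((n11 * n00) / (n10 * n01))

# Made-up counts (illustration only): responses that tend to agree.
gamma = pairwise_log_odds_ratio(20, 10, 5, 40)

# A table with no association has an odds ratio of 1, so a log odds ratio of 0.
gamma_null = pairwise_log_odds_ratio(10, 10, 10, 10)

print(gamma, gamma_null)
```

A positive value indicates that the two responses tend to agree, which plays the role that a positive correlation plays for continuous outcomes.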


Mixed Models for Non-Normal Data

  • $E(y \mid u) = \mu$, $\mathrm{var}(y \mid u) = V(\mu)$, $g(\mu) = X\beta + Zu$

  • Random coefficients $u$ have distribution $f(u)$

  • y|u has the usual glm distribution

  • Binary outcome:

    • binomial for y|u and beta for u

  • Count outcome:

    • Poisson for y|u and gamma for u


Example - binary

  • Study of bladder cancer

  • All patients had superficial bladder tumours on entry which were removed

  • Two randomly allocated treatments (group):

    • Placebo (n=47), Thiotepa (n=38)

  • Multiple recurrences of tumours were common

  • Month is month since treatment (1 to 53)

  • Baseline covariates of number of initial tumours (number) & size of largest tumour (size)

  • Lots of missing data: 3585 out of 4505 potential observations (80%) are missing

  • Model missing data (yes/no) using a binomial GEE with a logit link function to assess whether the data are missing at random

Names in parentheses are the variable names in the data set.
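The counts quoted for the missing data can be checked with simple arithmetic. The sketch below assumes that a "potential observation" means one visit per subject per month, which is a reading of the slide rather than something it states; that reading does reproduce the 4505 figure.

```python
# Group sizes and follow-up length as given on the slide.
placebo, thiotepa = 47, 38
months = 53                      # month runs from 1 to 53

subjects = placebo + thiotepa
potential = subjects * months    # assumed: one potential visit per subject-month
missing = 3585                   # as given on the slide
observed = potential - missing
percent_missing = 100 * missing / potential

print(potential, observed, round(percent_missing, 1))
# prints: 4505 920 79.6
```

So roughly 920 subject-months were actually observed, and the 80% on the slide is 79.6% rounded.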


Visits per subject


Plot of missing proportion over time


Format for the data in SAS


Logistic GEE in SAS

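proc genmod data=tumour_miss descending;
  /* descending: model the higher response level (missing=1 if coded 0/1) */
  class group subject month;
  model missing = group month size number
        / dist=binomial type3;
  /* independence working correlation, visits ordered by month */
  repeated subject=subject
        / type=ind corrw within=month;
  /* exp reports the group contrast as an odds ratio */
  estimate 'effect of thiotepa' group -1 1 / exp;
run;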
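The exp option on the estimate statement reports the contrast on the ratio scale. As a rough sketch of what that transformation involves, the Python snippet below exponentiates a log-scale estimate and its Wald confidence limits. The function name and the input values are hypothetical and are not results from this analysis.

```python
import math

def ratio_with_ci(estimate, std_err, z=1.959964):
    """Exponentiate a log-scale estimate and its Wald confidence limits.

    z defaults to the normal quantile for a 95% interval.
    Returns (ratio, lower limit, upper limit).
    """
    ratio = math.exp(estimate)
    lower = math.exp(estimate - z * std_err)
    upper = math.exp(estimate + z * std_err)
    return ratio, lower, upper

# Hypothetical log odds ratio and standard error (illustration only).
ratio, lower, upper = ratio_with_ci(-0.4, 0.25)
print(round(ratio, 3), round(lower, 3), round(upper, 3))
```

Because the limits are computed on the log scale and then exponentiated, the interval is not symmetric around the ratio. Which group is the numerator depends on how the levels of group are ordered in the class statement.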


ORs for group (Thiotepa vs placebo)


Example - Poisson

  • Response: number of new tumours (count)

  • Month is month since treatment (1 to 53)

  • Baseline covariates of number of initial tumours (number) & size of largest tumour (size)

  • Timesince is the number of months since the last visit

  • Missing data are dependent upon treatment group and time

  • Model new tumour counts using a Poisson GEE to assess treatment effect (log link function)

Names in parentheses are the variable names in the data set.


Count of tumours by treatment group


New tumour counts over time by treatment group


Plot of observed means over time


Poisson GEE in SAS

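proc genmod data=tumour_count;
  class group subject month;
  model count = group size number timesince
        / dist=poisson scale=deviance;
  /* exchangeable working correlation, visits ordered by month */
  repeated subject=subject
        / type=exch withinsubject=month corrw;
  /* exp reports the group contrast as a rate ratio */
  estimate 'effect of thiotepa' group -1 1 / exp;
run;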


RRs for group (Thiotepa vs placebo)

*WARNING: The number of response pairs for estimating correlation is less than or equal to the number of regression parameters. A simpler correlation model might be more appropriate.


Using an offset

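/* offset: log of months since the last visit;
   the +1 keeps the log defined when timesince is 0 */
data tumour_count;
  set tumour_count;
  off = log(timesince + 1);
run;

proc genmod data=tumour_count;
  class group subject month;
  model count = group size number
        / dist=poisson scale=deviance offset=off type3;
  /* unstructured working correlation, visits ordered by month */
  repeated subject=subject
        / type=unstr withinsubject=month;
  estimate 'effect of thiotepa' group -1 1 / exp;
run;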
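The effect of the offset can be sketched numerically. With a log link, adding log(timesince + 1) to the linear predictor multiplies the expected count by (timesince + 1), so the regression coefficients describe a rate per unit of time rather than a raw count. The Python below is an illustration only; the function names and the log rate are hypothetical and not fitted values.

```python
import math

def offset(timesince):
    """Offset as defined in the data step: log(timesince + 1)."""
    return math.log(timesince + 1)

def expected_count(timesince, linear_predictor):
    """Mean count under a log link with the offset added to the linear predictor."""
    return math.exp(offset(timesince) + linear_predictor)

# Hypothetical log rate per unit time (illustration only).
log_rate = math.log(0.25)

for months in (0, 1, 3):
    # The expected count scales in proportion to (months + 1).
    print(months, round(expected_count(months, log_rate), 4))
```

Unlike the earlier model, where timesince entered as a covariate with its own coefficient, an offset fixes that coefficient at 1.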


RRs for group (Thiotepa vs placebo)

*WARNING: The number of response pairs for estimating correlation is less than or equal to the number of regression parameters. A simpler correlation model might be more appropriate.


Interpretation and Presentation

  • Descriptive: plots of means or tables of means (percentages, etc.)

  • Tables of parameter estimates and confidence intervals (odds ratios or relative risks)

  • P-values for effects or interactions (possibly just in the text)

  • Emphasize results from descriptive analysis and effect estimates.


Statistical Methods

  • What is the distribution of the outcome?

  • How were the data summarized?

  • Due to the repeated nature of the data, a generalized estimating equations (GEE) approach was used to estimate parameters and test for differences between groups.

  • What was the form of the correlation structure?

  • What hypotheses were being tested?

  • How were missing data handled?

  • How were variances calculated?

  • What statistical package was used?


Example: Statistical Methods

Mean numbers of new tumours were used to summarise the data. Poisson regression was used to model tumour counts, using the time between successive observations as an offset. Due to the repeated nature of the data, a generalized estimating equations (GEE) approach was used to estimate parameters and test for differences between groups. The main hypothesis being tested was whether Thiotepa affected the numbers of new tumours. The correlation between successive observations was examined and an appropriate correlation structure was specified. Dropouts and non-attendance were examined to assess differences between the treatment groups. Robust variance estimation techniques were used to calculate standard errors and confidence intervals. All analyses were performed using SAS version 8.2.

