Selecting the Appropriate Statistical Distribution for a Primary Analysis

P. Lachenbruch


A Study of Xeroderma Pigmentosum (XP)

  • A characteristic of XP is the formation of actinic keratoses (AKs)

  • Multiple lesions appear haphazardly on a patient’s back

  • The rate of appearance may not be the same for different patients


Background

  • Planned analysis: rank-sum test.

  • Late in the study, the Statistical Analysis Plan (SAP) was amended to use Poisson regression.

  • It is unclear whether stepwise selection of covariates was planned a priori.


Study Results

  • Poisson regression showed a highly significant treatment difference (p = 0.009) after adjusting for baseline AK, age, and the age × treatment interaction (covariates chosen by stepwise selection)

  • All these effects were highly significant.

  • Substantial outlier problem


Assumptions

  • Each patient has the same incidence rate, λ, per unit area.

  • Chance of more than one AK in small area unit is negligible.

  • Non-overlapping lesions are independent, that is, lesions occurring in one area of the body are not affected by those occurring in another area.
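
Taken together, these assumptions are the usual conditions for a Poisson count model. As a sketch, writing Y_i for patient i's lesion count and A_i for the skin area observed (both symbols are notation introduced here, not the presenter's), they would imply:

```latex
Y_i \sim \mathrm{Poisson}(\lambda A_i), \qquad
\Pr(Y_i = k) = \frac{(\lambda A_i)^{k}\, e^{-\lambda A_i}}{k!}, \quad k = 0, 1, 2, \dots, \qquad
\mathrm{E}[Y_i] = \mathrm{Var}[Y_i] = \lambda A_i .
```

The equality of mean and variance is the property that heavy outliers, or rates that differ from patient to patient, would violate.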


Outliers

  • Outliers are observations that are jarringly different from the remainder of the data

    • May be multiple outliers

    • If outliers are frequent, this may be evidence of a mixture distribution.

  • Can substantially affect analysis


Analyses

Two-sample Wilcoxon rank-sum (Mann-Whitney) test

     trt |  obs   rank sum   expected
---------+---------------------------
       0 |    9        158        135
       1 |   20        277        300
---------+---------------------------
combined |   29        435        435

unadjusted variance     450.00
adjustment for ties     -15.07
                      --------
adjusted variance       434.93

Ho: ak12tot(trt==0) = ak12tot(trt==1)
         z =  1.103
Prob > |z| =  0.2701
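
The arithmetic behind this output can be retraced by hand. The sketch below is a minimal illustration in Python (not the Stata code used in the talk); it applies the normal approximation with a tie correction to the 12-month counts transcribed from "The Data" slide. The function names and the variable names `trt0` and `trt1` are illustrative choices, not part of the original analysis. On the transcribed data it should reproduce the rank sum, adjusted variance, z and p shown above.

```python
import math
from collections import Counter

# 12-month AK totals transcribed from "The Data" slide (29 patients with data).
trt0 = [5, 1, 1, 0, 15, 3, 193, 2, 13]                       # trt == 0
trt1 = [71, 0, 1, 0, 37, 0, 2, 10, 0, 2,
        0, 0, 10, 0, 4, 3, 0, 2, 7, 14]                      # trt == 1


def midranks(values):
    """Return ({value: average rank}, {value: tie count}) for the pooled sample."""
    counts = Counter(values)
    ranks, next_rank = {}, 1
    for v in sorted(counts):
        t = counts[v]
        # Tied values share the average of ranks next_rank .. next_rank + t - 1.
        ranks[v] = next_rank + (t - 1) / 2
        next_rank += t
    return ranks, counts


def rank_sum_test(x, y):
    """Wilcoxon rank-sum test, normal approximation with tie correction."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks, counts = midranks(x + y)
    w = sum(ranks[v] for v in x)                 # rank sum of the first group
    expected = n1 * (n + 1) / 2
    var_unadj = n1 * n2 * (n + 1) / 12
    tie_adj = n1 * n2 * sum(t**3 - t for t in counts.values()) / (12 * n * (n - 1))
    var_adj = var_unadj - tie_adj
    z = (w - expected) / math.sqrt(var_adj)
    p = math.erfc(abs(z) / math.sqrt(2))         # two-sided normal p-value
    return w, expected, var_adj, z, p


w, expected, var_adj, z, p = rank_sum_test(trt0, trt1)
print(f"rank sum = {w}, expected = {expected}")
print(f"adjusted variance = {var_adj:.2f}, z = {z:.3f}, p = {p:.4f}")
```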


Distribution of AK Data at Baseline (Stem and Leaf) (Yarosh et al., Lancet)

Lead | Trailing digits
  0* | 00000000000000000011223335
  //
  4* | 27
  //
 10* | 0   oops!


Distribution of 12 Month AK Total Data (Stem and Leaf)

. stem ak12tot, w(10)

Lead | Trailing digits
  0* | 000000001111222233457
  1* | 00345
  2* |
  3* | 7
  //
  7* | 1
  8* | 9
  //
 19* | 3   (same patient, in placebo group)


Results of Poisson Analyses

Poisson regression                  Number of obs =      29
                                    LR chi2(3)    = 1044.65
                                    Prob > chi2   =  0.0000
Log likelihood = -127.46684         Pseudo R2     =  0.8038
------------------------------------------------------------------
 ak12tot |   Coef.   Std. Err.       z   P>|z|   [95% Conf. Interval]
---------+--------------------------------------------------------
     age |    .017      .0056     3.00   0.003     .0058     .0276
     trt |    .532      .167      3.20   0.001     .2061     .859
     akb |    .045      .0019    23.10   0.000     .0409     .0485
   _cons |    .658      .219      3.00   0.003     .2282    1.0878
------------------------------------------------------------------

  • G-O-F in control group: χ² = 1222.5 with 8 d.f.

  • G-O-F in treatment group: χ² = 682.5 with 19 d.f.
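
The slide does not say which goodness-of-fit statistic was used. One candidate, offered here purely as an assumption, is the Pearson index-of-dispersion statistic, the sum of (y - mean)² / mean within a group, referred to a chi-square distribution with n - 1 degrees of freedom. The sketch below computes it on the counts transcribed from "The Data" slide; `poisson_dispersion` is an illustrative name. On that data it comes out close to the reported values, with matching degrees of freedom.

```python
# 12-month AK totals transcribed from "The Data" slide.
trt0 = [5, 1, 1, 0, 15, 3, 193, 2, 13]                       # trt == 0
trt1 = [71, 0, 1, 0, 37, 0, 2, 10, 0, 2,
        0, 0, 10, 0, 4, 3, 0, 2, 7, 14]                      # trt == 1


def poisson_dispersion(counts):
    """Pearson dispersion statistic for a single Poisson mean, and its d.f."""
    n = len(counts)
    mean = sum(counts) / n
    # Under a common Poisson mean the variance equals the mean, so each
    # squared deviation is scaled by the mean itself.
    chi2 = sum((y - mean) ** 2 for y in counts) / mean
    return chi2, n - 1


chi2_0, df_0 = poisson_dispersion(trt0)
chi2_1, df_1 = poisson_dispersion(trt1)
print(f"trt == 0: chi2 = {chi2_0:.1f} on {df_0} d.f.")
print(f"trt == 1: chi2 = {chi2_1:.1f} on {df_1} d.f.")
```

A statistic this far above its degrees of freedom indicates counts far more variable than a Poisson model allows.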


Permutation Test

  • Procedure: Scramble treatment codes and redo analysis. Repeat many (5,000?) times.

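  • Count the number of times the treatment coefficient from the scrambled data is at least as extreme as the observed value; that count divided by the number of replications is the permutation p-value.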
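
The procedure can be sketched in a few lines of Python. To stay self-contained, this sketch swaps in a simpler statistic, the difference in mean 12-month AK count between arms, in place of the Poisson regression coefficient used in the talk, and it omits the covariates. The seed, the function names, and the data layout are illustrative assumptions.

```python
import random

# 12-month AK totals transcribed from "The Data" slide.
trt0 = [5, 1, 1, 0, 15, 3, 193, 2, 13]                       # trt == 0
trt1 = [71, 0, 1, 0, 37, 0, 2, 10, 0, 2,
        0, 0, 10, 0, 4, 3, 0, 2, 7, 14]                      # trt == 1


def mean(xs):
    return sum(xs) / len(xs)


def permutation_test(x, y, reps=5000, seed=12345):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    pooled = x + y                     # new list; the inputs are not modified
    n1 = len(x)
    observed = mean(x) - mean(y)
    count = 0
    for _ in range(reps):
        rng.shuffle(pooled)            # scramble the treatment assignment
        stat = mean(pooled[:n1]) - mean(pooled[n1:])
        if abs(stat) >= abs(observed): # at least as extreme as observed
            count += 1
    return observed, count, count / reps


observed, c, p = permutation_test(trt0, trt1)
print(f"T(obs) = {observed:.3f}, c = {c}, p = c/n = {p:.4f}")
```

Because the reference distribution is built from the data themselves, the resulting p-value does not depend on the Poisson assumption being correct.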


Command and Output

. permute trt "permpois trt ak12tot age akb" rtrt=rtrt rage=rage rakb=rakb, reps(5000) d

command:      permpois trt ak12tot age akb
statistics:   rtrt = rtrt
              rage = rage
              rakb = rakb
permute var:  trt

Monte Carlo permutation statistics        Number of obs =   30
                                          Replications  = 5000
----------------------------------------------------------
T            |     T(obs)      c      n    p=c/n    SE(p)
-------------+--------------------------------------------
rtrt         |   .5324557   2660   5000   0.5320   0.0071
rage         |   .0167116   3577   5000   0.7154   0.0064
rakb         |   .0446938   1118   5000   0.2236   0.0059
----------------------------------------------------------
Note: c = #{|T| >= |T(obs)|}

I deleted the confidence intervals for the proportions.


Permutation Tests (2)

  • Poisson with 5000 Replications

  • Treatment: p = 0.57

  • Age: p = 0.62

  • AK Baseline: p = 0.28

  • All significant results disappear


Results of Poisson Analysis

  • The sponsor found that all terms were highly significant (including the treatment × age interaction).

  • We reproduced this analysis.

  • We also did a Poisson goodness-of-fit test that strongly rejected the assumption of a Poisson distribution.

  • What does a highly significant result mean when the model is wrong?


Conclusions

  • The data are poorly fit by both Poisson and Negative Binomial distributions

    • Permutation tests suggest no treatment effect unless treatment by age interaction is included

  • Justification of interaction term by stepwise procedure is exploratory

  • Outliers are a problem and can affect the conclusions.


Conclusions (2)

  • The results of the study are based on exploratory data analysis.

  • The analysis rests on incorrect assumptions about the data.

  • Our analyses based on distribution-free tests do not agree with the sponsor’s results.

  • The results based on appropriate assumptions do not support approval of the product.


Suggestions

  • Conduct a phase II study to determine appropriate covariates.

  • Need to use appropriate inclusion / exclusion criteria.

  • Stratification.

  • A priori specification of the full analysis.


Reference

Yarosh D, et al. "Effect of topically applied T4 endonuclease V in liposomes on skin cancer in xeroderma pigmentosum: a randomised study." Lancet 357:926-929, 2001.


The End


Grid on “Back”


The Data

+---------------------------+
| sex   trt   akb   ak12tot |
|---------------------------|
|   F     0     0         5 |
|   M     0     0         1 |
|   F     0     0         1 |
|   F     0     0         0 |
|   F     0     1        15 |
|---------------------------|
|   M     0     0         3 |
|   F     0   100       193 |
|   M     0     0         2 |
|   M     0     2        13 |
|   M     1    47        71 |
|---------------------------|
|   F     1     0         0 |
|   F     1     0         1 |
|   F     1     0         0 |
|   F     1    42        37 |
|   F     1     2         0 |
|---------------------------|
|   F     1     3         2 |
|   F     1     0        10 |
|   M     1     0         0 |
|   F     1     0         2 |
|   M     1     0         0 |
|---------------------------|
|   F     1     0         0 |
|   F     1     3        10 |
|   F     1     1         0 |
|   F     1     0         4 |
|   F     1     5         3 |
|---------------------------|
|   M     1     0         0 |
|   F     1     0         2 |
|   F     1     0         7 |
|   F     1     3        14 |
|   M     .     .         . |
+---------------------------+


Descriptive Statistics (1)


Descriptive Statistics (2)


Negative Binomial Model

  • Need a model that allows for individual variability.

  • The negative binomial distribution assumes that each patient's count is Poisson, but that the incidence rate varies across patients according to a gamma distribution.

  • Treatment: p = 0.64

  • Age: p = 0.45

  • AK Baseline: p = 0.0001

  • Age × Treatment: p < 0.001

    • The main effect of treatment is not interpretable. Need to look at treatment effects separately by age.
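
The gamma-mixed Poisson described in this slide can be written out as follows. The symbols μ (mean count) and α (overdispersion parameter), and the shape/scale parameterization of the gamma, are notation assumed here; the slides do not name one.

```latex
Y_i \mid \lambda_i \sim \mathrm{Poisson}(\lambda_i), \qquad
\lambda_i \sim \mathrm{Gamma}\!\left(\text{shape} = \tfrac{1}{\alpha},\ \text{scale} = \alpha\mu\right)
\;\Longrightarrow\;
\mathrm{E}[Y_i] = \mu, \qquad
\mathrm{Var}[Y_i] = \mu + \alpha\mu^{2}.
```

As α approaches 0 the variance collapses to μ and the model reduces to the Poisson, so a test of α = 0 is one way to test the Poisson assumption within this family.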


Negative Binomial Results

  • This model shows that only the baseline AK and age × treatment effects are significant factors.

  • It also gives a test for whether the data are Poisson; the test rejects the Poisson distribution (p < 0.0005).

  • A chi-square goodness-of-fit test (observed vs. expected counts) suggests that these data are not negative binomial either.

