
2/27/03 Outline PowerPoint PPT Presentation





Presentation Transcript



2/27/03 Outline

  • Part I: Misc. Statistical Issues

    • Multiple comparisons in clinical trials

    • Multiple endpoints

    • Subgroups

    • Adverse experience categorization

    • Multivariate adjustment

  • Part II: Multi-center trials and working with industry (Cummings left over)



Multiple comparisons

  • The general problem

    • Each statistical test has a 5% chance of Type I error

    • When there is no true effect, we are wrong 1 time out of 20

    • Easy to come up with spurious results

  • Take a worthless drug (placebo 2) and compare it to placebo 1

    • 1 study: P(type I error)= 5%

    • 2 studies: P(1 or 2 type I errors)= almost 10%

    • 20 studies: P(at least one significant)=64%

  • Publication bias
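
The percentages on this slide follow from the chance of at least one false positive across several tests. A minimal sketch, assuming the studies are independent and every null hypothesis is true; the function name `familywise_error` is an illustrative choice, not something from the slides:

```python
def familywise_error(alpha: float, n_tests: int) -> float:
    """P(at least one Type I error) across n independent tests of true nulls."""
    # Each test avoids a false positive with probability (1 - alpha);
    # all n tests avoid one with probability (1 - alpha) ** n.
    return 1.0 - (1.0 - alpha) ** n_tests


# Study counts taken from the slide: 1, 2, and 20 placebo-vs-placebo studies.
for k in (1, 2, 20):
    print(f"{k:>2} studies: {familywise_error(0.05, k):.1%}")
```

If the tests are positively correlated, the true probability is lower than this formula gives, so it is best read as an upper-end illustration.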



Multiple comparisons: solutions?

  • Bonferroni

    • Divide the overall significance level (alpha) by the number of tests

    • Unacceptable losses of power

  • Use common sense/Bayesian

    • Does result make sense?

    • Biologic plausibility

    • Is result supported by previous data?

    • Was the analysis defined a priori?

  • Examples of problem in clinical trials
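
The power cost of Bonferroni can be made concrete. A rough sketch, assuming a single two-sided z-test originally designed for 80% power at an overall level of 0.05; the function names and the design numbers are illustrative assumptions, not from the slides:

```python
from statistics import NormalDist


def bonferroni_alpha(overall_alpha: float, n_tests: int) -> float:
    """Per-test significance level under the Bonferroni rule."""
    return overall_alpha / n_tests


def power_after_bonferroni(n_tests: int,
                           overall_alpha: float = 0.05,
                           design_power: float = 0.80) -> float:
    """Approximate power of one test once its level is cut to alpha / n_tests."""
    norm = NormalDist()
    # Standardized effect giving `design_power` for ONE two-sided test at overall_alpha.
    delta = norm.inv_cdf(1 - overall_alpha / 2) + norm.inv_cdf(design_power)
    # Stricter critical value after the Bonferroni correction.
    crit = norm.inv_cdf(1 - bonferroni_alpha(overall_alpha, n_tests) / 2)
    # The negligible chance of rejecting in the wrong direction is ignored.
    return norm.cdf(delta - crit)


for k in (1, 4, 20):
    print(f"{k:>2} tests: per-test alpha = {bonferroni_alpha(0.05, k):.4f}, "
          f"power = {power_after_bonferroni(k):.2f}")
```

Under these assumptions, a study sized for 80% power on one test retains well under half of that when the same data must clear a 20-test Bonferroni threshold.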


Multiple comparisons in RCTs are pervasive

  • Monitoring of trials: look at results as they accumulate

    • Lots of statistical machinery

  • Multiple endpoints in a trial

    • Primary endpoint: “all fractures” but also found significant reductions in hip fractures

    • Primary endpoint: fractures, significant reductions in breast cancer

    • Safety

  • Subgroup analyses

  • Multivariate analysis (adjustment) for BL covariates



No Adjustment for Multiple Comparisons?

  • Rothman, 1990

    • Adjustments for multiple comparisons lead to type II errors

    • A policy of not making adjustments is preferable

  • “Scientists should not be so reluctant to explore leads that may turn out to be wrong that they penalize themselves for missing possibly important findings”



Multiple Endpoints: Making a Mountain Out of a Molehill

  • Multiple Outcomes of Raloxifene Evaluation (MORE) trial

  • Main outcome: vertebral fractures

  • Secondary outcome: non-vertebral fractures

    • Main osteoporotic subtypes: hip, wrist

  • Overall, no effect of raloxifene on NV fractures

  • Looked at 14 subtypes of fractures

  • One significant: ankle. Wanted to title paper: “Raloxifene reduces ankle fractures”
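
How surprising is one "significant" subtype out of 14 when the drug does nothing? A small simulation sketch, assuming the 14 subtype tests are independent and that p-values are uniform under the null; the function name, trial count, and seed are illustrative assumptions:

```python
import random


def share_with_false_positive(n_tests: int = 14,
                              alpha: float = 0.05,
                              n_trials: int = 20_000,
                              seed: int = 0) -> float:
    """Fraction of simulated null trials with at least one p-value below alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        # Under a true null, each p-value is uniform on (0, 1).
        if any(rng.random() < alpha for _ in range(n_tests)):
            hits += 1
    return hits / n_trials


print(f"Null trials with >= 1 'significant' subtype: {share_with_false_positive():.0%}")
```

Under these assumptions roughly half of all null trials would show at least one nominally significant fracture subtype, which is why a lone ankle-fracture finding deserves skepticism.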



Multiple Endpoints in PEPI: Strict Bonferroni Rule

  • Postmenopausal Estrogen/Progestin Interventions (PEPI) trial

  • 4 treatment groups, several primary outcomes: all continuous

  • Adjust all p-values to account for multiple comparisons

    • Multiple primary endpoints (4)

    • Within each endpoint, adjust for 4 treatments



Multiple endpoints

  • Often many ways to slice the outcome pie

    • Different subgroups of endpoints

    • Fractures: all, leg, arm, rib, etc. (MORE)

    • Multiple comparisons problems

  • Some solutions

    • Very explicit predefinition of endpoints

    • Limit number of endpoints

    • FDA: single endpoint only



Subgroups

  • After primary analysis, want to look at subgroups

  • Does effectiveness vary by subgroup?

  • If drug effective, is it more effective in some populations?

  • If results overall show no effect, does drug work in subgroup of participants?

  • Are adverse effects concentrated in some subgroups?



Example: Efficacy of alendronate

  • FIT II: Women with BMD T-score < -1.6 (osteopenic--only 1/3 osteoporotic)

    • Women without existing vertebral fractures (2)

  • Overall results: 14% reduction, p=.07

  • Wimpy



RR for clinical fracture of alendronate (FIT II, Cummings, JAMA 1999)

[Figure: relative risk of clinical fracture, alendronate vs. placebo. Overall: RR 0.86 (0.73 - 1.01), P=0.07]



RR for clinical fracture of alendronate by baseline BMD groups

[Figure: relative risk of clinical fracture by baseline femoral neck BMD T-score]

Overall: 0.86 (0.73 - 1.01)

T < -2.5: 0.64 (0.50 - 0.82)

T > -2.5 (two groups, -2.5 < T < -2.0 and T > -2.0): 1.03 (0.77 - 1.39) and 1.14 (0.82 - 1.60)



What to Do With an Unexpected Subgroup Finding

  • Is this a real finding? (not really specified a priori)

  • Has this been previously observed?

    • Increase prior probability

  • Ways to verify

    • Examine for other similar subgrouping variables (BMD at hip, spine, radius)

    • Examine for other similar endpoints (hip fractures, etc.)

    • Most important: look at other trials, if possible and available

    • Examine biologic plausibility



Effect of alendronate on hip fx depends on baseline hip BMD

Baseline BMD T-score    Relative Hazard (95% CI)

-1.6 to -2.5            1.84 (0.7, 5.4)

< -2.5                  0.44 (0.18, 0.97)

Overall                 0.79 (0.43, 1.44)



Fosamax International Trial (FOSIT)

  • 1908 women, 34 countries

  • Lumbar spine BMD T-score < -2

  • Alendronate (10 mg) vs. placebo

  • One year follow-up

  • BMD main endpoint

  • 47% reduction in all clinical fractures (p<.05)



FOSIT: Relative risk alendronate vs. placebo within BMD subgroups

Baseline hip BMD T-score    N       RR*     95% CI

Overall                     1908    0.53    (0.3, 0.9)

> -2                        955     1.2     (0.5, 2.9)

-2 to -2.5                  279     0.32    (0.07, 1.5)

< -2.5                      674     0.26    (0.1, 0.7)



Subgroup analysis in HERS

  • Overall no effect of HRT or perhaps harm in year 1

  • Is there a subgroup who benefit?

  • Is there subgroup with significant harm?

  • Look at relative hazard (RH) within subgroups defined by baseline variables

    • Medication use at baseline

    • Prior disease

    • Health habits

    • Compare RH in those with and without risk factor

      • RH in those using beta blockers compared to those not using

      • RH > 1 ==> harm

      • Get p-value for significance of the difference in RH between those with and without the risk factor
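
One common way to get such a p-value from published estimates is to compare the two log relative hazards directly. A minimal sketch, assuming the two subgroups are independent, the reported intervals are 95% CIs that are symmetric on the log scale, and a normal approximation holds; the function names are illustrative, and the inputs are the alendronate hip-fracture estimates quoted earlier in these slides, not HERS data:

```python
import math


def se_from_ci(lower: float, upper: float, z: float = 1.96) -> float:
    """Standard error of log(RH), recovered from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def interaction_test(rh_a, ci_a, rh_b, ci_b):
    """Return (ratio of RHs, z statistic, two-sided p) for subgroup A vs. B."""
    diff = math.log(rh_a) - math.log(rh_b)
    se = math.sqrt(se_from_ci(*ci_a) ** 2 + se_from_ci(*ci_b) ** 2)
    z = diff / se
    # Two-sided normal p-value: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return math.exp(diff), z, p


# Hip fracture RH by baseline BMD: T-score -1.6 to -2.5 vs. T-score < -2.5.
ratio, z, p = interaction_test(1.84, (0.7, 5.4), 0.44, (0.18, 0.97))
print(f"ratio of RHs = {ratio:.2f}, z = {z:.2f}, p = {p:.3f}")
```

Because the published intervals are rounded, the recovered standard errors are approximate; a trial's own analysis would instead fit an interaction term in the survival model.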



HERS: 4 years of HRT increased then decreased CHD Events

Year     E + P    Placebo    RH     p-value

1        57       38         1.5    .04

2        47       48         1.0    1.0

3        35       41         0.9    .6

4 + 5    33       49         0.7    .07

> 5      ???

P for trend = 0.009
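
As a sanity check on the yearly event counts on this slide, the crude ratio of events reproduces the reported RH to one decimal. This is only a sketch: it assumes the two arms are about the same size with similar follow-up in each period, which the slide does not state, and the real RH comes from a survival model rather than a simple ratio.

```python
# CHD events per period as (E + P, placebo), copied from the slide.
events_by_year = {
    "1": (57, 38),
    "2": (47, 48),
    "3": (35, 41),
    "4 + 5": (33, 49),
}


def crude_ratio(active_events: int, placebo_events: int) -> float:
    """Event-count ratio; approximates RH only if arms have equal person-time."""
    return active_events / placebo_events


for year, (active, placebo) in events_by_year.items():
    print(f"Year {year}: crude ratio = {crude_ratio(active, placebo):.2f}")
```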



Subgroups: the final frontier in HERS

Relative hazard (E vs. placebo)

Subgroup                      N (%)        Within subgroup    Among others    p*

history of smoking            1712 (62)    1.01               3.39            .01

current smoker                360 (13)     0.55               1.92            .03

digitalis use                 275 (10)     4.98               1.26            .04

>= 3 live births              1616 (58)    1.09               2.72            .04

lives alone                   775 (28)     2.97               1.14            .05

prior mi by chart review      1409 (51)    2.14               0.93            .05

beta-blocker use              899 (33)     2.89               1.15            .06

age >= 70 at randomization    1019 (37)    2.65               1.14            .06

* Statistical significance of interaction


Lots of subgroups were analyzed in HERS

Columns: subgroup, N (%), RH within subgroup, RH among others, ratio of the two RHs, p for interaction

  • history of smoking (at rv) 1712 (62) 1.01 3.39 0.30 .01

  • current smoker (at rv) 360 (13) 0.55 1.92 0.29 .03

  • digitalis use (at rv) 275 (10) 4.98 1.26 3.96 .04

  • >= 3 live births 1616 (58) 1.09 2.72 0.40 .04

  • lives alone (at rv) 775 (28) 2.97 1.14 2.60 .05

  • prior mi by chart review (cr) 1409 (51) 2.14 0.93 2.30 .05

  • beta-blocker use (at rv) 899 (33) 2.89 1.15 2.51 .06

  • age >= 70 at randomization 1019 (37) 2.65 1.14 2.32 .06

  • prior mi in most distant tertile 447 (16) 2.64 0.93 2.82 .07

  • walk 10m or in exercise program (at rv) 1770 (64) 2.35 1.11 2.12 .08

  • prior ptca by chart review (cr) 1189 (43) 0.92 1.98 0.46 .08

  • prior mi within 2 years 420 (15) 3.20 1.28 2.50 .11

  • tg > median (at rv) 1377 (50) 2.02 1.05 1.93 .12

  • rales in the lungs (at rv) 80 ( 3) 0.43 1.65 0.26 .13

  • digitalis or ace-inhibitor use (at rv) 653 (24) 2.33 1.24 1.88 .16

  • previous ert for >= 12 months 302 (11) 4.19 1.41 2.98 .18

  • serious medical conditions 1028 (37) 1.05 1.81 0.58 .21

  • age >= 53 at lmp 578 (21) 3.19 1.38 2.31 .23

  • hdl > median (at rv) 1315 (48) 1.18 1.95 0.61 .24

  • lp(a) > median (at rv) 1378 (50) 1.26 2.08 0.60 .25

  • use of non-statin llm (at rv) 420 (15) 0.89 1.69 0.52 .25

  • married (at rv) 1588 (57) 1.26 1.98 0.64 .29

  • lvef <= 40% 178 ( 6) 2.16 1.01 2.13 .31

  • prior mi within 4 years 765 (28) 2.07 1.32 1.57 .32

  • previous ert use for >= 1 year 327 (12) 2.86 1.41 2.03 .32

  • prior mi within 1 year 194 ( 7) 2.88 1.43 2.02 .33

  • chest pain (at rv) 982 (36) 1.25 1.88 0.67 .33

  • dbp >= 90 mmhg (at rv) 149 ( 5) 0.91 1.62 0.56 .35

  • prior ptca within 1 year 206 ( 7) 3.94 1.46 2.71 .38

  • prior mi within 3 years 612 (22) 2.05 1.37 1.50 .40

  • prior ptca within 4 years 838 (30) 1.15 1.70 0.68 .40

  • use of any llm (at rv) 1296 (47) 1.23 1.76 0.70 .40

  • diuretic use (at rv) 775 (28) 1.89 1.33 1.42 .41

  • signs and symptoms of chf (at rv) 118 ( 4) 0.94 1.60 0.58 .42

  • ace inhibitor use (at rv) 483 (17) 2.05 1.40 1.46 .44

  • total cholesterol > median (at rv) 1377 (50) 1.32 1.80 0.74 .47

  • l-thyroxine use (at rv) 414 (15) 2.29 1.43 1.60 .47

  • poor/fair self-rated health (at rv) 665 (24) 1.30 1.72 0.76 .51

  • heart murmur (at rv) 540 (20) 1.89 1.42 1.34 .53

  • sbp >= 140 mmhg (at rv) 1051 (38) 1.37 1.72 0.80 .59

  • prior ptca within 3 years 695 (25) 1.27 1.61 0.78 .62

  • s3 heart sounds (at rv) 19 ( 1) 2.74 1.50 1.82 .63

  • htn by physical exam (at rv) 557 (20) 1.32 1.62 0.81 .64

  • >= 2 severely obstructed main vessels 1312 (47) 1.53 1.26 1.22 .69

  • statin use (at rv) 1004 (36) 1.34 1.59 0.84 .71

  • have you ever been pregnant 2564 (93) 1.55 1.15 1.35 .72

  • calcium-channel blocker (at rv) 1511 (55) 1.61 1.38 1.17 .73

  • previous hrt for >= 12 months 132 ( 5) 1.24 1.60 0.78 .77

  • ldl > median (at rv) 1373 (50) 1.44 1.63 0.89 .77

  • prior ptca within 2 years 475 (17) 1.35 1.56 0.87 .81

  • baseline left bundle branch block 212 ( 8) 1.31 1.55 0.85 .82

  • white 2451 (89) 1.48 1.62 0.92 .88

  • ever told you had diabetes 634 (23) 1.48 1.53 0.97 .94

  • aspirin use (at rv) 2183 (79) 1.51 1.56 0.97 .95

  • any alcohol consumption (at rv) 1081 (39) 1.54 1.57 0.98 .97

  • gallstones or gallbladder dis. 633 (23) 1.55 1.52 1.02 .97

  • baseline atrial fibrillation/flutter 33 ( 1) - 1.50 - -

Total subgroups examined: 102

Total subgroups with p< .05: 6
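
The two totals above can be compared with what chance alone would produce. A minimal sketch, assuming all 102 interaction tests are independent and no true interaction exists (the real subgroups overlap, so this is only a rough guide); the function name is an illustrative choice:

```python
from math import comb

N_SUBGROUPS = 102   # total subgroups examined (from the slide)
N_SIGNIFICANT = 6   # subgroups with p < .05 (from the slide)
ALPHA = 0.05


def prob_at_least(k: int, n: int, p: float) -> float:
    """Binomial P(X >= k) for n independent tests, each 'significant' with prob p."""
    below = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - below


expected_by_chance = N_SUBGROUPS * ALPHA
tail = prob_at_least(N_SIGNIFICANT, N_SUBGROUPS, ALPHA)
bonferroni_threshold = ALPHA / N_SUBGROUPS

print(f"Expected significant by chance: {expected_by_chance:.1f}")
print(f"P(at least {N_SIGNIFICANT} significant by chance): {tail:.2f}")
print(f"Bonferroni per-test threshold: {bonferroni_threshold:.5f}")
```

Under these assumptions about five significant interactions are expected by chance alone, so six is unremarkable, and the smallest listed p-value (.01) is far above the Bonferroni threshold.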



Subgroups: conclusions

  • Subgroups are full of statistical problems

    • Multiple comparisons may lead to erroneous conclusions

  • Limited power for subgroup analyses

  • Subgroups based on baseline variables are less bad

  • Subgroups based on post-randomization variables are more problematic



Safety assessment

  • Often many categories (FIT: 200 or more)

  • Some are rare

  • Ex: Risedronate and lung cancer

  • How to control for spurious findings?

  • P-values almost meaningless



Categorization of Adverse Experiences

  • AEs collected as “open text”

  • Need to categorize and compare by treatment

  • Options:

    • Many categories: few events per treatment, low power

    • Few categories: heterogeneous, may miss important effects

    • No correct solution

  • MedDRA coding

    • ~15,000 standard clinical terms (“specific terms”)

    • Various levels of grouping

    • May be nonsensical in some situations



Categorization of Adverse Experiences: Sellmeyer solution



Multivariable adjustment

  • Sometimes adjust for baseline variables

  • Especially those that are maldistributed

  • If algorithm for adjustment not pre-defined, adds subjective element to “objective” RCT

  • Given ineffective treatment, with enough fiddling with adjustments, can come up with significant effect (Paul Meier)

  • Conclusions: Many argue that one should NEVER adjust in RCTs

  • If adjusting, pre-specify the plan and severely limit it



Statistical issues: Summary

  • ITT (from 1/30 lecture):

    • All participants remain on medication

    • All participants are followed until end of study

    • Pre-planned analysis

  • Multiple comparisons are ubiquitous

    • Monitoring

    • Subgroup analyses

    • Safety analyses

  • Where possible, minimize subjectivity and ad hoc analyses

  • Use judgement

