
The changing landscape of interim analyses for efficacy / futility

Marc Buyse, ScD

IDDI, Louvain-la-Neuve, Belgium

marc.buyse@iddi.com

Massachusetts Biotechnology Council

Cambridge, Mass

June 2, 2009


Reasons for Interim Analyses

Early stopping for:

  • safety
  • extreme efficacy
  • futility

Adaptation of design based on observed data to:

  • play the winner / drop the loser
  • maintain power
  • make any adaptation, for whatever reason and whether or not data-derived, whilst controlling the Type I error (α)


Methods for Interim Analyses

Multi-stage designs / seamless transition designs

Group-sequential designs

Stochastic curtailment

Sample size adjustments

Adaptive (« flexible ») designs

Early Stopping
  • Helsinki Declaration:

“Physician should cease any investigation if the hazards are found to outweigh the potential benefits.”(« Primum non nocere »)

  • Trials with serious, irreversible endpoints should be stopped if one treatment is “proven” to be superior, and such potential stopping should be formally pre-specified in the trial design.
The Cost of Delay

« Blockbusters » reach sales > 500 M$ a year (> 1 M$ a day)

Fixed Sample Size Trials…

1 – the sample size is calculated to detect a given difference at a given significance level and power
2 – the required number of patients is accrued
3 – patient outcomes are analyzed at the end of the trial, after observation of the pre-specified number of events

…vs (Group) Sequential Trials…

1 – the sample size is calculated to detect a given difference at a given significance level and power
2 – patients are accrued until a pre-planned interim analysis of patient outcomes takes place
3a – the trial is terminated early, or
3b – the trial continues unchanged
4 – patient outcomes are analyzed at the end of the trial, after observation of the pre-specified number of events

…vs Adaptive Trials

1 – the sample size is calculated to detect a given difference at a given significance level and power
2 – patients are accrued until a pre-planned interim analysis of patient outcomes takes place
3a – the trial is terminated early, or
3b – the trial continues unchanged, or
3c – the trial continues with adaptations
4 – patient outcomes are analyzed at the end of the trial, after observation of the pre-specified or modified number of events

Randomized phase II trial with continuation as phase III trial

Simultaneous screening of several treatment groups with continuation as phase III trial:

[Diagram: Arms 1, 2 and 3 are compared in the phase II stage, with early stopping of one or more arms; the remaining arms continue into the phase III comparison of the arms.]

Phase III trial with interim analysis

Phase III trial with interim look at data:

[Diagram: Arms 1, 2 and 3 undergo an interim comparison of the arms during the phase III stage, then continue to the final comparison of the arms.]

Seamless transition designs (e.g. for dose selection)

Designs can be operationally seamless (the two stages are run as a single trial but analyzed separately) or inferentially seamless (data from both stages are combined in the final analysis).

Group Sequential Trials
  • If several analyses are carried out, the Type I error is inflated if each analysis is carried out at the target level of significance.
  • So, the interim analyses must use an adjusted level of significance so as to preserve the overall type I error.
Inflation of α with multiple analyses

With 5 analyses each performed at level 0.05, the overall Type I error rate inflates to approximately 0.14

Adjusting α for multiple analyses

The 5 analyses must each be performed at level 0.0159 in order to preserve an overall level of 0.05
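The inflation and the adjustment can be checked by simulation. A minimal Monte Carlo sketch (a one-sample z-test on accumulating data; the function name and simulation settings are illustrative assumptions, not from the presentation):

```python
import numpy as np
from scipy.stats import norm

def overall_type1_error(per_look_alpha, n_looks=5, n_per_stage=50,
                        n_sim=100_000, seed=12345):
    """Monte Carlo estimate of the overall Type I error when a two-sided
    z-test is repeated on accumulating data at n_looks equally spaced
    analyses, each at nominal level per_look_alpha (one-sample, known SD)."""
    rng = np.random.default_rng(seed)
    crit = norm.ppf(1 - per_look_alpha / 2)          # per-look critical value
    rejected = np.zeros(n_sim, dtype=bool)
    cum_sum, cum_n = np.zeros(n_sim), 0
    for _ in range(n_looks):
        cum_sum += rng.standard_normal((n_sim, n_per_stage)).sum(axis=1)
        cum_n += n_per_stage
        z = cum_sum / np.sqrt(cum_n)                 # z-statistic under H0
        rejected |= np.abs(z) >= crit
    return rejected.mean()

print(overall_type1_error(0.05))     # ~0.14: inflation from 5 unadjusted looks
print(overall_type1_error(0.0159))   # ~0.05: the adjusted per-look level
```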

Group sequential designs
  • Test H0: Δ = 0 vs. HA: Δ ≠ 0
  • m pts. accrued to each arm between analyses
  • Use standardized test statistic Zk, k=1,...,K
Group-Sequential Designs – Type I Error
  • Probability of wrongly stopping/rejecting H0 at analysis k

PH0(|Z1| < c1, ..., |Zk-1| < ck-1, |Zk| ≥ ck) = πk

    • “Type I error spent at stage k”
  • P(Type I error) = ∑πk
  • Choose the ck’s so that ∑πk = α
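As an illustration of the error "spent" at each stage, a simulation sketch (the helper name and settings are assumptions; equal information increments) that estimates each πk for a given set of two-sided boundaries:

```python
import numpy as np

def error_spent_per_stage(boundaries, n_sim=200_000, seed=2024):
    """Monte Carlo estimate of pi_k = P_H0(first boundary crossing at stage k)
    for equally spaced looks and two-sided boundaries (length K)."""
    rng = np.random.default_rng(seed)
    K = len(boundaries)
    # Canonical joint distribution of (Z_1, ..., Z_K) under H0:
    # Z_k = S_k / sqrt(k), where S_k is a sum of k iid N(0,1) stage increments.
    S = np.cumsum(rng.standard_normal((n_sim, K)), axis=1)
    Z = S / np.sqrt(np.arange(1, K + 1))
    pi = np.zeros(K)
    still_going = np.ones(n_sim, dtype=bool)
    for k in range(K):
        cross = still_going & (np.abs(Z[:, k]) >= boundaries[k])
        pi[k] = cross.mean()                 # error spent at stage k
        still_going &= ~cross
    return pi

pi_k = error_spent_per_stage([2.413] * 5)    # Pocock boundaries for K=5, alpha=0.05
print(pi_k, pi_k.sum())                       # stagewise error spent; total ~0.05
```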
Group-Sequential Designs – Type II Error
  • Probability of Type II error is

1 − PHA( ∪k {|Z1| < c1, ..., |Zk-1| < ck-1, |Zk| ≥ ck} )

  • Depends on K, α, β, and the ck’s
  • Given these values, the required sample size can be computed
    • it can be expressed as R × (fixed sample size)
Pocock Boundaries
  • Reject H0 if | Zk| > cP(K,α)
    • cP(K,α) chosen so that P(Type I error) = α
  • All analyses are carried out at the same adjusted significance level
  • The probability of early rejection is high but the power at the final analysis may be compromised
Pocock Boundaries
  • p-values for Zk (two-sided) per interim analysis (K=5)
O’Brien-Fleming Boundaries
  • Reject H0 if | Zk | > cOBF(K,α)√(K / k)
    • for k = K we get | ZK | > cOBF(K,α)
    • cOBF(K,α) chosen so that P(Type I error) = α
  • Early analyses are carried out at extreme adjusted significance levels
  • The probability of early rejection is low but the power at the final analysis is almost unaffected
O’Brien-Fleming Boundaries
  • p-values for Zk (two-sided) per interim analysis (K=5)
Wang & Tsiatis Boundaries
  • Wang & Tsiatis (1987):

Reject H0 if | Zk | > cWT(K,α,θ) (k / K)θ - ½

    • θ = 0.5 gives Pocock’s test; θ = 0, O’Brien-Fleming
    • implemented in some software (e.g. EaSt)
  • Can accommodate any intermediate choice between Pocock and O’Brien-Fleming
Wang & Tsiatis Boundaries
  • p-values for Zk (two-sided) per interim analysis (K=5) with θ = 0.2
Haybittle & Peto Boundaries
  • Haybittle & Peto (1976):

Reject H0 if | Zk | > 3 for k = 1,...,K-1

Reject H0 if | Zk | > cHP(K,α) for k = K

    • | Zk | > 3 corresponds to using p < 0.0026
  • Early analyses are carried out at extreme, yet reasonable adjusted significance levels
  • Intuitive and easily implemented if correction to final significance level is ignored (pragmatic approach)
Haybittle & Peto Boundaries
  • p-values for Zk (two-sided) per interim analysis (K=5)
Boundaries compared
  • p-values for Zk (two-sided) per interim analysis (K=5)
Boundaries compared
  • Zk per interim analysis (K=5)
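A small sketch (not from the presentation; the boundary constants for K = 5 and α = 0.05 are commonly tabulated approximations, treat them as illustrative) printing the Zk boundaries compared above:

```python
import numpy as np

# Two-sided boundary constants for K = 5 looks and alpha = 0.05, as commonly
# tabulated (illustrative): Pocock ~2.413, O'Brien-Fleming final value ~2.040;
# Haybittle-Peto uses |Z| > 3 before the final look.
K = 5
k = np.arange(1, K + 1)
boundaries = {
    "Pocock":          np.full(K, 2.413),                 # constant at every look
    "O'Brien-Fleming": 2.040 * np.sqrt(K / k),            # very strict early, ~2.04 at k = K
    "Haybittle-Peto":  np.r_[np.full(K - 1, 3.0), 1.96],  # pragmatic: ~unadjusted final look
}
# A Wang-Tsiatis boundary with 0 < theta < 0.5 and its own constant c_WT(K, alpha, theta)
# falls between the Pocock and O'Brien-Fleming shapes.
for name, b in boundaries.items():
    print(f"{name:>16}: " + "  ".join(f"{x:.2f}" for x in b))
```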
Potential savings / costs in using group sequential designs

Expected sample sizes for different designs (K=5):
  • outcomes normally distributed with σ = 2
  • α = 0.05
  • β = 0.1 for μA − μB = 1
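To give a feel for the potential savings, a Monte Carlo sketch under the slide's planning values (the function name and the maximum of 105 patients per arm are illustrative assumptions) estimating the expected per-arm sample size of a K = 5 Pocock design:

```python
import numpy as np

def expected_sample_size(boundaries, n_max_per_arm, delta, sigma, n_sim=20_000, seed=3):
    """Monte Carlo expected per-arm sample size of a two-arm group-sequential
    trial with K equally sized stages, two-sided boundaries, true difference
    delta, and common standard deviation sigma."""
    rng = np.random.default_rng(seed)
    K = len(boundaries)
    m = n_max_per_arm // K                      # patients per arm per stage
    sizes = np.full(n_sim, K * m)
    diff_sum = np.zeros(n_sim)
    stopped = np.zeros(n_sim, dtype=bool)
    for k in range(1, K + 1):
        a = rng.normal(delta, sigma, (n_sim, m)).sum(axis=1)   # stage sums, arm A
        b = rng.normal(0.0, sigma, (n_sim, m)).sum(axis=1)     # stage sums, arm B
        diff_sum += a - b
        z = (diff_sum / (k * m)) / (sigma * np.sqrt(2.0 / (k * m)))
        cross = ~stopped & (np.abs(z) >= boundaries[k - 1]) & (k < K)
        sizes[cross] = k * m                    # early stopping saves the remaining stages
        stopped |= cross
    return sizes.mean()

# sigma = 2, alpha = 0.05, beta = 0.1 for a difference of 1: fixed-sample size ~85 per arm;
# a K=5 Pocock design needs a larger maximum (~1.2x) but stops early on average.
print(expected_sample_size([2.413] * 5, n_max_per_arm=105, delta=1.0, sigma=2.0))
print(expected_sample_size([2.413] * 5, n_max_per_arm=105, delta=0.0, sigma=2.0))
```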

Error-Spending Approach
  • Removing the requirement of a fixed number of equally-spaced analyses
  • Lan & DeMets (1983): two-sided tests “spending” Type I error.
  • Maximum information design:
    • Error-spending function f(t), where t = I/Imax is the information fraction
    • Defines the boundaries
    • Accept H0 if Imax is attained without rejecting the null
Error-Spending Approach
  • f(t) = min(2 − 2Φ(z1-α/2 / √t), α) yields ≈ O’Brien-Fleming boundaries
  • f(t) = min(α ln(1 + (e − 1)t), α) yields ≈ Pocock boundaries
  • f(t) = min(αtθ, α):
    • θ = 1 or 3 corresponds approximately to Pocock and O’Brien-Fleming, respectively
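A brief sketch (function names are assumptions) evaluating these spending functions at five equally spaced looks; the increments show how much of α is "spent" at each analysis:

```python
import numpy as np
from scipy.stats import norm

ALPHA = 0.05

# Error-spending functions from the slide (two-sided alpha); f(t) is the
# cumulative Type I error allowed to be spent by information fraction t.
def f_obf(t):       # approximates O'Brien-Fleming boundaries
    return np.minimum(2 - 2 * norm.cdf(norm.ppf(1 - ALPHA / 2) / np.sqrt(t)), ALPHA)

def f_pocock(t):    # approximates Pocock boundaries
    return np.minimum(ALPHA * np.log(1 + (np.e - 1) * t), ALPHA)

def f_power(t, theta):   # power family: theta=1 ~ Pocock, theta=3 ~ O'Brien-Fleming
    return np.minimum(ALPHA * t ** theta, ALPHA)

t = np.linspace(0.2, 1.0, 5)            # five equally spaced looks
for name, f in [("OBF-like", f_obf(t)), ("Pocock-like", f_pocock(t)),
                ("power, theta=3", f_power(t, 3))]:
    increments = np.diff(np.r_[0.0, f])  # error newly spent at each look
    print(name, np.round(f, 4), "increments:", np.round(increments, 4))
```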
How Many Interim Analyses?
  • One or two interim analyses give most benefit in terms of a reduction of the expected sample size
  • Not much gain from going beyond 5 analyses
When to Conduct Interim Analyses?
  • With error-spending, full flexibility as to number and timing of analyses
    • First analysis should not be “too early” (often at ≈ 50% of information time)
    • Equally-spaced analyses advisable
  • In principle, strategy/timing should not be chosen based on the observed results
Who conducts interim analyses?
  • Independent Data Monitoring Committee
  • Experts from different disciplines (clinicians, statisticians, ethicists, patient advocates, …)
  • Reviews trial conduct, safety and efficacy data
  • Recommends
    • Stopping the trial
    • Continuing the trial unchanged
    • Amending the trial
Sample Size Re-Estimation
  • Assume normally distributed endpoints
  • Sample size depends on σ2
  • If σ2 is misspecified, nI can be too small
  • Idea: internal pilot study
    • estimate σ2 based on early observed data
    • compute new sample size, nA
    • if necessary, accrue extra patients above nI
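A minimal sketch of the internal-pilot idea (names and planning numbers are illustrative; two-sample comparison of means with a z-approximation):

```python
import numpy as np
from scipy.stats import norm

def reestimated_sample_size(pilot_obs, delta, alpha=0.05, power=0.90, n_initial=100):
    """Internal pilot: re-estimate sigma^2 from early data and recompute the
    per-arm sample size for a two-sample comparison of means targeting `delta`."""
    sigma2_hat = np.var(pilot_obs, ddof=1)                  # re-estimated variance
    z_a, z_b = norm.ppf(1 - alpha / 2), norm.ppf(power)
    n_adjusted = int(np.ceil(2 * sigma2_hat * (z_a + z_b) ** 2 / delta ** 2))
    return max(n_adjusted, n_initial)                       # only revise upwards

rng = np.random.default_rng(7)
pilot = rng.normal(loc=0.0, scale=2.4, size=60)             # true SD larger than planned
print(reestimated_sample_size(pilot, delta=1.0))            # nA > nI when sigma was underestimated
```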
Early Stopping for Futility
  • Stopping to reject H0 of no treatment difference
    • Avoids exposing further patients to the inferior treatment
    • Appropriate if no further checks are needed on, e.g., treatment safety or long-term effects.
  • Stopping to accept H0 of no treatment difference
    • Stopping “for futility” or “abandoning a lost cause”
    • Saves time and effort when a study is unlikely to lead to a positive conclusion.
Stochastic Curtailment

Idea:

  • Terminate the trial for efficacy if there is high probability of rejecting the null, given the current data and assuming the null is true among future patients
  • Conversely, terminate the trial for futility if there is low probability of rejecting the null, given the current data and assuming the alternative is true among future patients
Conditional Power
  • At the interim analysis k, define

pk(Δ) = PΔ(test will reject H0 at the final analysis | current data)

  • A high value of pk(0) suggests the test will reject H0 whatever the true effect in future patients
    • terminate the trial & reject H0 if pk(0) > ξ
    • terminate the trial & accept H0 if 1 − pk(Δ) > ξ’ (one-sided)
    • probabilities of error: Type I ≤ α/ξ, Type II ≤ β/ξ’

Note: ξ and ξ’ are typically chosen ≥ 0.8

Conditional Power
  • Unconditional power for α=0.05 and β=0.1 at Δ=0.2
  • Conditional power for a mid-trial analysis with an estimate of Δ of 0.1
    • probability of rejecting the null at the end of the trial has been reduced from 0.9 to 0.1
Conditional Power

B(t) = Z(t) t1/2, with E[B(t)] = θt (a straight line with slope θ)

Conditional Power

Slope = assumed treatment effect in future patients

Conditional Power

Crosshatched area = conditional power
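A hedged sketch of the conditional-power computation in this B-value formulation (function name and planning values are illustrative assumptions; the exact figures quoted on the slides depend on the information fraction and the assumed future effect):

```python
import math
from scipy.stats import norm

def conditional_power(z_interim, t, theta_future, alpha=0.05):
    """Conditional power via the B-value B(t) = Z(t)*sqrt(t): probability that
    Z(1) exceeds the final critical value, given the interim statistic observed
    at information fraction t, assuming drift (slope) theta_future thereafter."""
    b_t = z_interim * math.sqrt(t)                  # current B-value
    mean_b1 = b_t + theta_future * (1 - t)          # expected B(1) given the future slope
    sd_b1 = math.sqrt(1 - t)                        # independent-increments variance
    z_crit = norm.ppf(1 - alpha / 2)                # final two-sided boundary, earlier looks ignored
    return 1 - norm.cdf((z_crit - mean_b1) / sd_b1)

# Planned drift for 90% power at two-sided alpha = 0.05: theta = z_0.975 + z_0.90
theta_planned = norm.ppf(0.975) + norm.ppf(0.90)
# Mid-trial (t = 0.5) with the interim estimate at half the planned effect:
z_half = 0.5 * theta_planned * math.sqrt(0.5)
print(conditional_power(z_half, 0.5, 0.0))                  # future effect = 0 (null)
print(conditional_power(z_half, 0.5, 0.5 * theta_planned))  # future effect = current trend
print(conditional_power(z_half, 0.5, theta_planned))        # future effect = as planned
```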

Predictive Power
  • Problem with the conditional power approach: it is computed assuming a value of Δ that may not be supported by the current data.
  • A solution: average across the plausible values of Δ
  • “Predictive power”: Pk = ∫ pk(Δ) π(Δ | data) dΔ
  • π(Δ | data) is the posterior density of Δ
  • Termination against H0 if Pk > ξ, etc.
  • What prior?
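A minimal sketch of predictive power under an assumed normal prior on the drift θ (all names and prior values are illustrative assumptions):

```python
import math
from scipy.stats import norm

def predictive_power(z_interim, t, prior_mean=0.0, prior_sd=10.0,
                     alpha=0.05, n_grid=2001):
    """Predictive power: conditional power averaged over the posterior of the
    drift theta, using theta ~ N(prior_mean, prior_sd^2) and Z(t) ~ N(theta*sqrt(t), 1)."""
    # Normal prior + normal likelihood => normal posterior for theta
    prec_post = 1.0 / prior_sd**2 + t
    mean_post = (prior_mean / prior_sd**2 + z_interim * math.sqrt(t)) / prec_post
    sd_post = math.sqrt(1.0 / prec_post)
    # Average the conditional power over a grid of posterior theta values
    z_crit, b_t = norm.ppf(1 - alpha / 2), z_interim * math.sqrt(t)
    total = 0.0
    for i in range(n_grid):
        theta = mean_post + sd_post * (-4 + 8 * i / (n_grid - 1))   # +/- 4 posterior SDs
        cp = 1 - norm.cdf((z_crit - b_t - theta * (1 - t)) / math.sqrt(1 - t))
        total += cp * norm.pdf(theta, mean_post, sd_post)
    return total * 8 * sd_post / (n_grid - 1)      # grid-based numerical integration

print(predictive_power(z_interim=1.15, t=0.5))     # mid-trial example with a vague prior
```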
Adaptive Designs
  • Based on combining p-values from different analyses
  • Allow for flexible designs
    • sample size re-calculation
    • any changes to the design (including endpoint, test, etc!)
Adaptive Designs
  • Lehmacher and Wassmer (1999):

At stage k, combine the one-sided stagewise p-values p1, ..., pk:

Lk = k-1/2 ∑i=1,...,k Φ-1(1 − pi)

  • Use any group-sequential design for Lk
  • Slight power loss as compared to a group-sequential plan
  • Flexibility as to design modifications: OK for control of type I error, BUT…
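A short sketch of the inverse-normal combination statistic (the function name is an assumption; the boundaries against which Lk is compared come from the chosen group-sequential design):

```python
import math
from scipy.stats import norm

def inverse_normal_combination(p_values):
    """Combine one-sided stagewise p-values p_1, ..., p_k into
    L_k = k^(-1/2) * sum_i Phi^(-1)(1 - p_i), which is N(0,1) under H0
    when the stagewise p-values are independent and uniform."""
    k = len(p_values)
    return sum(norm.ppf(1 - p) for p in p_values) / math.sqrt(k)

# Two-stage example: refer L_2 to the stage-2 boundary of the pre-specified design
stage_p = [0.08, 0.02]                    # one-sided p-values from the two stages
print(round(inverse_normal_combination(stage_p), 3))
```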
Potential concerns with adaptive designs
  • Major changes between cohorts make clinical interpretation difficult
  • If eligibility / endpoint are changed, what is the adequate label?
  • Temporal trends
  • Operational bias
  • Less efficient than group sequential for sample size adjustments
  • Modest gains (in general), high risks