1. Trial Objectives Superiority, Non-inferiority, and Equivalence
2. Questions of Interest Is the new treatment better than the control treatment that I am using now? (superiority trial)
If it is not better, is the new treatment as good as (i.e., not unacceptably worse than) the control treatment that I am using now? (non-inferiority trial)
Can I use the new treatment and the control treatment interchangeably? (equivalence trial)
3. Definitions (ICH Guidelines – E9) Superiority trial – a trial with the primary objective of showing that the response to the investigational product is superior to a comparative agent (active or placebo control).
Equivalence trial – a trial with primary objective of showing that the response to two or more treatments differs by an amount which is clinically unimportant (active control).
Non-inferiority trial – a trial with the primary objective of showing that the response to the investigational product is not clinically inferior to a comparative agent (active or placebo control but usually active) – very common in the regulatory setting.
4. Examples – Non-Inferiority - 1 Is a new left ventricular assist device that provides a “bridge” to heart transplant as effective in keeping patients alive until a heart becomes available as one of the FDA-approved devices?
Is a new vaccine for pertussis (whooping cough) that has an improved safety profile as effective in preventing whooping cough as the currently licensed vaccine?
Is a single dose of a drug (low dose) equivalent to a twice a day dose (high dose)?
5. Examples – Non-Inferiority - 2 Is a short course of treatment for latent TB infection (3 months of INH plus rifapentine) as effective as 9 months of INH in preventing active TB?
6. Example - HIV Trial: Abacavir-Lamivudine-Zidovudine vs Indinavir-Lamivudine-Zidovudine JAMA 2001;285:1155-1163. “The study was powered to assess treatment equivalence for the primary endpoint (i.e., a plasma HIV RNA level <= 400 copies/mL at week 48 for the intent-to-treat population). For the primary end point, treatments were considered equivalent if the 95% confidence interval was within the bound -12% to 12%.”
7. Motivation for Non-Inferiority and Equivalence Trials New Treatment
More convenient to use (e.g., short course of prophylaxis for TB)
Lower risk of side effects (e.g., pertussis vaccine)
But is it as effective?
8. Superiority and Non-Inferiority in One Trial (Usually concurrent placebo arm is absent)
9. In the absence of a concurrent placebo arm, one has to provide assurance that the active control would have been superior to placebo had it been used, and that the test treatment would have beaten placebo had it been used (indirect inference).
10. Non-inferiority or Equivalence Trials: Key Features Efficacy of reference or control treatment (anchor) must be clearly established (control is better than nothing).
Target population and outcome measures must be similar to the trial that established efficacy of control (constancy).
Margin of non-inferiority/equivalence must be stated a priori, clinically relevant, and chosen to ensure new treatment is better than “imputed” nothing (non-inferiority margin).
11. Assay Sensitivity and Constancy are Critical Assumptions in Interpreting Non-inferiority and Equivalence Trials Assay Sensitivity (def.) – ability to demonstrate a difference if one exists or, in this case, to demonstrate similarity if that is the case
How do you tell the difference between a good trial that establishes the treatments to be similar and a bad trial that fails to find a true difference?
External evidence: historical data that the control treatment is effective
Internal evidence: a high quality trial
Historical data showing that the control treatment is effective (better than placebo) must hold in the setting of the current non-inferiority trial (constancy)
12. Historical Evidence Concerning Efficacy of Active Control One trial
Meta-analysis or overview of trials
Point-estimate or lower bound of 95% CI
Retention of certain fraction of superiority of active control over placebo (e.g., 50%)
True probability of event for active control and placebo are 20% and 30%
Show probability of event with new treatment is smaller than 25% (a difference, or non-inferiority margin, between new treatment and active control of 5%)
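The retention example above can be checked with a few lines of arithmetic. This is a minimal sketch; the function name `noninferiority_margin` is hypothetical, and it assumes the margin is defined on the absolute risk-difference scale, as in the 20% / 30% example.

```python
def noninferiority_margin(p_control, p_placebo, fraction_retained):
    """Absolute margin that preserves a given fraction of the
    active control's benefit over placebo (lower event probability is better)."""
    benefit = p_placebo - p_control          # control's advantage over placebo
    return (1.0 - fraction_retained) * benefit

# Values from the slide: control 20%, placebo 30%, retain 50% of the benefit
margin = noninferiority_margin(0.20, 0.30, 0.50)
threshold = 0.20 + margin                    # new treatment must be shown to be below this
print(round(margin, 3), round(threshold, 3))
```

With the slide's inputs, the margin works out to 5 percentage points and the threshold to 25%.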
13. PROBLEM How do you prove two treatments are equal?
Cannot prove H0: Δ = 0
14. “It is never correct to claim that treatments have no effect or that there is no difference in the effects of treatments. It is impossible to prove … that two treatments have the same effect. There will always be uncertainty surrounding estimates of treatment effects, and a small difference can never be excluded… An analysis of 45 reports of trials purporting to test equivalence found that only a quarter set boundaries on their equivalence.”
Alderson P, Chalmers I. BMJ 2003;326:1691-8.
15. Relationship Between Significance Tests and Confidence Intervals
16. Superiority Trial – ALLHAT: Lisinopril vs Chlorthalidone for CHD Incidence, CVD Composite Outcome, and ESRD*
17. Interpretation of Head to Head (Equivalence) Trials: CONVINCE and CAPPP
18. Example: 2NN Study A study of first-line antiretroviral therapy in HIV
Main comparison between nevirapine twice daily and efavirenz (plus stavudine and lamivudine) in terms of ‘treatment failure’ (based on virology, disease progression, therapy change)
Primary objective was to establish the non-inferiority of nevirapine twice daily (d =10%)
19. Results: 2NN Study Confidence intervals for failure rates (EFV-NVP)
All data (-12.8%, 0.9%)
Those starting med. (-14.6%, -0.8%)
Neither interval is completely above d value of -10%; one interval also excludes zero.
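The reading of the two intervals can be made explicit in code. This is a sketch only: the function name `classify` is hypothetical, and it assumes the difference is EFV minus NVP failure rate in percentage points, so that nevirapine is non-inferior only if the whole interval lies above -d.

```python
def classify(ci_lower, ci_upper, margin):
    """Return (non_inferior, excludes_zero) for a CI on the
    EFV - NVP failure-rate difference; margin is given as a positive number."""
    non_inferior = ci_lower > -margin             # entire interval above -d
    excludes_zero = ci_upper < 0 or ci_lower > 0  # conventional significant difference
    return non_inferior, excludes_zero

all_data = classify(-12.8, 0.9, 10.0)      # all randomized patients
started_med = classify(-14.6, -0.8, 10.0)  # those who started medication
print(all_data, started_med)
```

Under these assumptions neither analysis meets the non-inferiority criterion, and the second interval also excludes zero.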
20. Conclusions: 2NN Study BUT, the authors concluded:
‘Antiviral therapy with nevirapine or efavirenz showed similar efficacy, so triple-drug regimens with either … are valid for first-line treatment’
21. Interpretation of Non-Inferiority Trials: 6 Examples (A – F): Hazard ratio (Test Drug/Standard) and 95% CI
28. Possible Reasons for Non-Significant Difference Small sample size
Poor compliance to study treatments
29. Non-Inferiority and Equivalence Trials Considerations Cannot prove Pe = Pc or µ1 = µ2. Therefore, testing Ho: d < 0 versus HA: d > 0 is not correct: a small, underpowered study could incorrectly lead to a claim of equivalence (absence of evidence is not evidence of absence), and if power is too high, Ho may be rejected when the difference is not important.
Since Ho cannot be accepted, either reverse the roles of type 1 and 2 errors (i.e., rejection of Ho implies equivalence) or focus on confidence intervals
Treatment difference must be chosen not only to rule out smallest clinically meaningful difference, but also to be sure new treatment is better than no treatment
Consensus on what equivalence means, especially in a broad sense, is hard to achieve
30. 1-Sided Hypothesis Testing (Non-inferiority)
31. Parallel Group Studies with Continuous Outcomes: Sample Size Formula is the Same Except for d0
32. Example Non-Inferiority Trial for New BP Lowering Drug
4     0     132
4    +2     525
4    -2      58
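The column headers of this table did not survive, so the following is a sketch under stated assumptions rather than the slide's own calculation. It assumes the three columns are the non-inferiority margin d0 (mmHg), the true difference (new minus standard, positive meaning the new drug is worse), and the sample size per group, and it uses the standard normal-approximation formula n = 2σ²(zα + zβ)² / (d0 − Δ)². The values σ = 10 mmHg, one-sided α = 0.025, and power = 0.90 are guesses chosen because they give numbers close to the tabled ones.

```python
from statistics import NormalDist

def n_per_group(margin, true_diff, sigma=10.0, alpha=0.025, power=0.90):
    """Approximate per-group sample size for a non-inferiority comparison
    of two means. sigma, alpha and power defaults are assumptions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # one-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    return 2 * sigma**2 * (z_alpha + z_beta)**2 / (margin - true_diff)**2

for diff in (0, 2, -2):
    print(diff, round(n_per_group(4, diff), 1))
```

The pattern is the point: when the new drug is truly somewhat worse (+2), the distance to the margin halves and the required sample size roughly quadruples; when it is truly somewhat better (-2), far fewer patients are needed.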
33. Confidence Interval Approach Example of Type I Error
34. Confidence Interval Approach Example of Type II Error
35. Sample Size for Equivalence Design Based on CI Limits A = New Treatment; B = Standard
36. Sample Size for Equivalence Design Based on CI Limits (cont.) A = New Treatment; B = Standard
37. For Proportions and Relative Risks, Farrington and Manning’s Approach is Better Problem arises because of estimation of variance under the null hypothesis.
Farrington and Manning (Stat Med 1990) have shown that their maximum likelihood approach is better particularly for small values of pc and pe.
Algorithm can be easily programmed.
39. Sample Size for Proportions for Non-Inferiority Trial: Makuch and Simon versus Farrington and Manning (PA = PB)*
PA      PB      Margin   Makuch and Simon   Farrington and Manning
0.05    0.05    0.01     9,972              10,032
0.10    0.10    0.05     756                775
0.15    0.15    0.05     1,071              1,080
0.20    0.20    0.05     1,344              1,348
0.20    0.20    0.10     336                340
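The smaller of the two sample sizes in each row can be approximated with the simple unpooled-variance formula usually associated with Makuch and Simon, n = (zα + zβ)² · 2p(1 − p) / margin² when PA = PB = p. This is a sketch, not the Farrington and Manning algorithm (which requires restricted maximum likelihood estimates and is not shown). The function name is hypothetical, and the z values 1.96 and 1.28 are assumptions inferred from the tabled numbers.

```python
def makuch_simon_n(p, margin, z_alpha=1.96, z_beta=1.28):
    """Approximate per-group sample size when both arms share the
    event probability p. z_alpha and z_beta defaults are assumptions."""
    variance_sum = 2 * p * (1 - p)            # unpooled variance, equal rates
    return (z_alpha + z_beta) ** 2 * variance_sum / margin ** 2

rows = [(0.05, 0.01), (0.10, 0.05), (0.15, 0.05), (0.20, 0.05), (0.20, 0.10)]
for p, margin in rows:
    print(p, margin, round(makuch_simon_n(p, margin), 1))
```

Under these assumptions the results fall within one participant of the first sample-size column in each row.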
40. Sample Size for Proportions for Non-Inferiority Trial: Makuch and Simon versus Farrington and Manning (PA = or ≠ PB)*
0.10    0.10    0.05     756                775
0.125   0.10    0.05     3,343              3,379
0.10    0.125   0.05     371                384
41. Sample Size for Proportions: Superiority Trial with Specified Delta or Inferiority with Farrington and Manning (1:1 allocation and 1-β = 0.90)
0.05    0.05    0.01     9,021    10,032    8,174
0.10    0.10    0.05     581      775       630
0.15    0.15    0.05     917      1,080     880
0.20    0.20    0.05     1,211    1,349     1,099
0.20    0.20    0.10     266      340       277
43. Example: CPCRA Study of Nelfinavir (NFV) and Ritonavir (RTV) for Patients with CD4+ <100 In a placebo-controlled trial, RTV reduced the rate of progression to AIDS by 50%
The hypothesized relative risk (NFV/RTV) was chosen to correspond to a 33% loss of efficacy of RTV versus putative placebo
44. Confidence Interval Approach to Monitoring RR (NFV/RTV)
45. Non-inferiority and superiority
46. Non-inferiority and Inferiority
47. CONVINCE Design Based on the findings from 17 trials with over 50,000 participants, the CVD risk reduction associated with BP lowering by diuretics and beta-blockers was estimated as 24%.
Equivalence margin was set to ensure that there would be no more than a 50% loss of efficacy based on this point estimate.
Upper bound = 1.16 = 0.88 (12% reduction)/ 0.76 (24% reduction).
Lower bound = 1/1.16 = 0.86.
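The CONVINCE bounds follow from simple arithmetic on the relative-risk scale, sketched below. The function name `relative_risk_margin` is hypothetical; the calculation assumes, as the slide does, that "loss of efficacy" is measured on the percentage risk reduction.

```python
def relative_risk_margin(rr_control_vs_placebo, fraction_lost):
    """Upper and lower equivalence bounds for RR (new / control) that
    allow at most `fraction_lost` of the control's risk reduction to be lost."""
    reduction = 1.0 - rr_control_vs_placebo              # e.g. 0.24
    rr_new_vs_placebo = 1.0 - (1.0 - fraction_lost) * reduction
    upper = rr_new_vs_placebo / rr_control_vs_placebo    # 0.88 / 0.76
    return upper, 1.0 / upper

upper, lower = relative_risk_margin(0.76, 0.50)
print(round(upper, 2), round(lower, 2))
```

With a 24% risk reduction and a 50% allowable loss, this gives bounds that round to 1.16 and 0.86.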
48. Another Example Treatment of Acute MI See Editorial NEJM 337:Oct. 16, 1997 Gusto I Study (N = 41,021)
49. Two New Studies Cobalt Study
50. Summary - Determining Equivalence First step in establishing equivalence - define ‘limits of equivalence’ (± d)
Having conducted the trial, calculate the 95% confidence intervals for the difference between the control and the new treatment
If the confidence interval is entirely within ± d then equivalence is established
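The three steps can be sketched in code. This is an illustration only: the counts are hypothetical, the function names are invented for the example, and the interval is a simple Wald interval for a difference in proportions, which is one of several possible choices.

```python
from math import sqrt
from statistics import NormalDist

def diff_ci(success_control, n_control, success_new, n_new, level=0.95):
    """Wald confidence interval for (control - new) response proportion."""
    p_c = success_control / n_control
    p_n = success_new / n_new
    se = sqrt(p_c * (1 - p_c) / n_control + p_n * (1 - p_n) / n_new)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    diff = p_c - p_n
    return diff - z * se, diff + z * se

def is_equivalent(ci, d):
    """Equivalence is declared only if the whole interval lies within +/- d."""
    lower, upper = ci
    return -d < lower and upper < d

# Hypothetical trial: 250/500 responders on control, 260/500 on new treatment
ci = diff_ci(250, 500, 260, 500)
print(ci, is_equivalent(ci, 0.12), is_equivalent(ci, 0.05))
```

The same data can meet a wide limit (12 percentage points) and fail a narrow one (5 percentage points), which is why the limits must be fixed before the trial.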
51. Summary - Determining Non-inferiority Equivalence requires that the difference (control - new intervention) is both > -d and < d; the new treatment must be neither worse nor better than the control by more than a fixed amount.
In contrast to equivalence, with non-inferiority we are only interested in determining whether the new treatment is no worse than the control by more than an amount d.
52. Analysis of Non-inferiority/Equivalence Trials Superiority trials are analysed by intention-to-treat (ITT) because it is the most conservative and least likely to be biased.
ITT analysis of non-inferiority trials is not conservative - there is a bias towards no difference.
Per Protocol analysis is biased since not all randomised patients are included.
Recommendation: Analyze by both ITT and per protocol (need to ensure power for both).
53. Equivalence/Non-Inferiority Trials Summary Equivalence is “in the eyes of the beholder”
The absence of a significant difference in a superiority trial does not imply equivalence
Need to be sure about the efficacy of the control treatment based on earlier trials.
Sloppy trials yield “equivalent” results
Because of difficulty of interpretation, equivalence and non-inferiority trials should be used cautiously for licensure.
More head to head comparisons of approved treatments are needed.
54. Quality of Reporting of Non-inferiority and Equivalence Trials (JAMA 2006;295:1147-1151) Margin defined in most trials, but rationale for margin missing in majority of studies
About 25% of reports did not give sample size justification in sufficient detail to reproduce
Less than 50% described both intention to treat and per protocol analysis
About 15% of reports did not state confidence intervals.
55. Guidelines for Reporting Non-inferiority and Equivalence Trials+ (JAMA 2006;295:1152-1160) Specification of whether the trial is a non-inferiority study
Sample size details (specification and rationale for non-inferiority margin)
Use of 1- or 2-sided confidence interval
Nature of analysis: intention to treat, per protocol or both
Presentation of results: confidence intervals
56. Checklist for Information Concerning Sample Size in 81 Trials Planned sample size 30
Type I error rate 21
Power or Type II error rate 26
1-sided or 2-sided test 7
Hypothesized treatment difference 26
Planned duration of follow-up 75
57. Sample Size Recommendations 1. Specify in advance in protocol
2. Inflate sample size to account for dropouts and dropins because analysis is “intent-to-treat”
3. Sample size may also have to be inflated to account for:
Pattern of events in control group
Medical exclusions, “healthy worker” effect
4. Plot power curve (power vs. Δ) for fixed N to assess impact of mis-specification of k (Pe)
5. Monitor parameters which influence sample size during study; modify sample size if necessary
6. Report parameters used for sample size in trial publication
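For recommendation 4, a power curve for fixed N can be tabulated before plotting. The sketch below assumes a two-sided superiority comparison of two proportions with a normal approximation and ignores the negligible opposite tail; the function name and all numeric inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_two_proportions(p_control, p_experimental, n_per_group, alpha=0.05):
    """Approximate power to detect a difference between two proportions
    with n_per_group patients in each arm (two-sided test)."""
    se = sqrt(p_control * (1 - p_control) / n_per_group
              + p_experimental * (1 - p_experimental) / n_per_group)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(p_control - p_experimental) / se - z_crit)

# Hypothetical design: control event rate 30%, 500 patients per group
for p_e in (0.20, 0.22, 0.24, 0.26):
    print(p_e, round(power_two_proportions(0.30, p_e, 500), 2))
```

Tabulating power over a range of experimental-arm rates shows how quickly it falls when the true treatment effect is smaller than the one assumed at the design stage.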