
Biomarker Adaptive Threshold Design



Presentation Transcript


  1. It is difficult to have the right single completely defined predictive biomarker identified and analytically validated by the time the pivotal trial of a new drug is ready to start accrual • Changes in the way we do phase II trials • Adaptive methods for the refinement and evaluation of predictive biomarkers in the pivotal trials in a non-exploratory manner • Use of archived tissues in focused “prospective-retrospective” designs based on randomized pivotal trials

  2. Biomarker Adaptive Threshold Design Wenyu Jiang, Boris Freidlin & Richard Simon JNCI 99:1036-43, 2007

  3. Biomarker Adaptive Threshold Design • Randomized trial of T vs C • Have identified a univariate biomarker index B thought to be predictive of patients likely to benefit from T relative to C • Eligibility not restricted by biomarker • No threshold for biomarker determined • Biomarker value scaled to range (0,1) • Time-to-event data

  4. Procedure A: Fallback Procedure • Compare T vs C for all patients • If results are significant at level .04 claim broad effectiveness of T • Otherwise proceed as follows

  5. Procedure A • Test T vs C restricted to patients with biomarker B > b • Let S(b) be log likelihood ratio statistic • Repeat for all values of b • Let S* = max{S(b)} • Compute null distribution of S* by permuting treatment labels • If the data value of S* is significant at 0.01 level, then claim effectiveness of T for a patient subset • Compute point and interval estimates of the threshold b
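The two slides on Procedure A can be sketched in code as follows. This is a minimal illustration, not the authors' implementation: it swaps in a two-group log-rank chi-square for the log likelihood ratio statistic S(b), uses a small grid of candidate thresholds instead of all values of b, and omits the point and interval estimates of the threshold. The function names, the `min_events` guard and the default number of permutations are assumptions made for the example.

```python
import random
from math import sqrt
from statistics import NormalDist


def logrank_chi2(times, events, arms):
    """Two-group log-rank chi-square (1 df). arms: 0 = control C, 1 = treatment T."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    n_risk, n1_risk = len(times), sum(arms)
    obs_minus_exp, var = 0.0, 0.0
    i = 0
    while i < len(order):
        t = times[order[i]]
        d = d1 = leaving = leaving1 = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(order) and times[order[i]] == t:
            k = order[i]
            leaving += 1
            leaving1 += arms[k]
            if events[k]:
                d += 1
                d1 += arms[k]
            i += 1
        if d > 0 and n_risk > 1:
            frac1 = n1_risk / n_risk
            obs_minus_exp += d1 - d * frac1
            var += d * frac1 * (1 - frac1) * (n_risk - d) / (n_risk - 1)
        n_risk -= leaving
        n1_risk -= leaving1
    return obs_minus_exp ** 2 / var if var > 0 else 0.0


def max_stat(times, events, arms, marker, thresholds, min_events=5):
    """S* = max over b of the statistic in patients with marker > b."""
    best = 0.0
    for b in thresholds:
        idx = [i for i in range(len(marker)) if marker[i] > b]
        if sum(events[i] for i in idx) < min_events:
            continue  # assumption: skip subsets with too few events
        s = logrank_chi2([times[i] for i in idx],
                         [events[i] for i in idx],
                         [arms[i] for i in idx])
        best = max(best, s)
    return best


def procedure_a(times, events, arms, marker, thresholds, n_perm=1000, seed=0):
    """Overall test at 0.04; otherwise permutation test of S* at 0.01."""
    overall = logrank_chi2(times, events, arms)
    # Two-sided p-value of a 1-df chi-square via the normal distribution.
    p_overall = 2 * (1 - NormalDist().cdf(sqrt(overall)))
    if p_overall < 0.04:
        return "broad effectiveness", p_overall
    s_obs = max_stat(times, events, arms, marker, thresholds)
    rng = random.Random(seed)
    shuffled = list(arms)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute treatment labels
        if max_stat(times, events, shuffled, marker, thresholds) >= s_obs:
            hits += 1
    p_subset = (hits + 1) / (n_perm + 1)
    decision = "effectiveness in a subset" if p_subset < 0.01 else "no claim"
    return decision, p_subset
```

Because the chi-square form is two-sided, a real analysis would also check the direction of the effect before claiming benefit. Note that resolving a permutation p-value below 0.01 needs well over 100 permutations.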

  6. Sample Size Planning (A) • Standard broad eligibility trial is sized for 80% power to detect reduction in hazard D at significance level 5% • Biomarker adaptive threshold design is sized for 80% power to detect same reduction in hazard D at significance level 4% for overall analysis

  7. Estimated Power of Broad Eligibility Design (n=386 events) vs Adaptive Design A (n=412 events) 80% power for 30% hazard reduction

  8. Estimation of Threshold

  9. 506 prostate cancer patients were randomly allocated to one of four arms. Placebo and 0.2 mg of diethylstilbestrol (DES) were combined as control arm C; 1.0 mg DES and 5.0 mg DES were combined as E. The end-point was overall survival (death from any cause). Covariates: • Age: in years • Performance status (pf): not bed-ridden at all vs other • Tumor size (sz): size of the primary tumor (cm²) • Index of a combination of tumor stage and histologic grade (sg) • Serum prostatic acid phosphatase levels (ap)

  10. Prostate Cancer Data

  11. Prostate Cancer Data

  12. Procedure B • S(b) = log likelihood ratio statistic for treatment effect in subset of patients with B > b • T = max{S(0)+R, max{S(b)}} • Compute null distribution of T by permuting treatment labels • If the data value of T is significant at 0.05 level, then reject null hypothesis that T is ineffective • Compute point and interval estimates of the threshold b

  13. Sample Size Planning (B) • Estimate power of procedure B relative to standard broad eligibility trial based on Table 1 for the row corresponding to the expected proportion of sensitive patients and the target hazard ratio for sensitive patients • e.g. a sensitive proportion of 25% and a hazard ratio of .4 give RE=.429/.641=.67 • When B has power 80%, overall test has power of about 80%*.67=53% • Use formula B.2 to determine the approximate number of events needed for overall test to have power 53% for detecting a hazard ratio of .4 limited to 25% of patients

  14. Example Sample Size Planning for Procedure B • Design a trial to detect a hazard ratio of 0.4 (60% reduction) limited to 25% of patients • Relative efficiency from Table 1: .429/.641=.67 • When procedure B has power 80%, standard test has power of about 80%*.67=53% • Formula B.2 gives D’=230 events to have 53% power for overall test and thus approximate 80% power for B • Overall test needs D=472 events for 80% power for detecting the diluted treatment effect
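The relative-efficiency arithmetic in this example can be checked with a few lines. This is only a sketch of the arithmetic quoted on the slide: the two Table 1 values are copied from the slide, and reading .429 as the overall-test power and .641 as the Procedure B power is an interpretation, not something the slide states. Formula B.2 itself is not reproduced here because the transcript does not give it.

```python
# Table 1 entries as quoted on the slide (interpretation of which is which is assumed).
overall_test_power_table = 0.429
procedure_b_power_table = 0.641

# Relative efficiency of the overall test compared with Procedure B.
relative_efficiency = overall_test_power_table / procedure_b_power_table

# If Procedure B is to have 80% power, plan the overall test for this power.
target_power_b = 0.80
implied_overall_power = target_power_b * relative_efficiency

print(f"relative efficiency   = {relative_efficiency:.3f}")
print(f"implied overall power = {implied_overall_power:.3f}")
```

The unrounded product is a little above 0.53, which the slide reports as 53%.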

  15. Events Needed to Detect a Given Hazard Ratio with Proportional Hazards

  16. Events (D’) Needed for Overall Test to Detect a Hazard Ratio Limited to a Fraction of Patients

  17. Multiple Biomarker Design: A Generalization of the Biomarker Adaptive Threshold Design • Have identified K candidate binary classifiers B1, …, BK thought to be predictive of patients likely to benefit from T relative to C • RCT comparing new treatment T to control C • Eligibility not restricted by candidate classifiers • Let the B0 classifier classify all patients positive

  18. Test T vs C restricted to patients positive for Bk for k=0,1,…,K • Let S(Bk) be a measure of treatment effect in patients positive for Bk • Let S* = max{S(Bk)} , k* = argmax{S(Bk)} • S* is the largest treatment effect observed • k* is the marker that identifies the patients where the largest treatment effect is observed

  19. For a global test of significance • Randomly permute the treatment labels and repeat the process of computing S* for the shuffled data • Repeat this to generate the distribution of S* under the null hypothesis that there is no treatment effect for any subset of patients • The statistical significance level is the area in the tail of the null distribution beyond the value of S* obtained for the unshuffled data • If the data value of S* is significant at 0.05 level, then claim effectiveness of T for patients positive for marker k*

  20. Repeating the analysis for bootstrap samples of cases provides • an estimate of the stability of k* (the indication)
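The multiple-biomarker procedure on the preceding slides (S*, k*, the permutation test and the bootstrap for k*) can be sketched as below. The slides leave the treatment-effect measure S(Bk) open; this illustration assumes a binary response outcome and uses the raw difference in response rates, which is a stand-in chosen for brevity. A standardized statistic would usually be preferable, since raw differences are noisier in small subsets. All function names are hypothetical.

```python
import random


def effect(responses, arms, positive):
    """Response-rate difference (T minus C) among classifier-positive patients."""
    t = [r for r, a, p in zip(responses, arms, positive) if p and a == 1]
    c = [r for r, a, p in zip(responses, arms, positive) if p and a == 0]
    if not t or not c:
        return float("-inf")  # cannot estimate an effect without both arms
    return sum(t) / len(t) - sum(c) / len(c)


def best_marker(responses, arms, classifiers):
    """Return (S*, k*). classifiers[0] should be B0, positive for everyone."""
    scores = [effect(responses, arms, cl) for cl in classifiers]
    k_star = max(range(len(scores)), key=scores.__getitem__)
    return scores[k_star], k_star


def global_p_value(responses, arms, classifiers, n_perm=1000, seed=0):
    """Permutation p-value for S*: shuffle treatment labels, recompute S*."""
    s_obs, k_star = best_marker(responses, arms, classifiers)
    rng = random.Random(seed)
    shuffled = list(arms)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if best_marker(responses, shuffled, classifiers)[0] >= s_obs:
            hits += 1
    return (hits + 1) / (n_perm + 1), k_star


def k_star_stability(responses, arms, classifiers, n_boot=200, seed=0):
    """Fraction of bootstrap samples of cases in which each marker is k*."""
    rng = random.Random(seed)
    n = len(responses)
    counts = [0] * len(classifiers)
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample patients
        r = [responses[i] for i in idx]
        a = [arms[i] for i in idx]
        cls = [[cl[i] for i in idx] for cl in classifiers]
        counts[best_marker(r, a, cls)[1]] += 1
    return [c / n_boot for c in counts]
```

A marker that is selected as k* in most bootstrap samples gives more confidence in the indication than one that alternates with other candidates.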

  21. Adaptive Signature Design An adaptive design for generating and prospectively testing a gene expression signature for sensitive patients Boris Freidlin and Richard Simon Clinical Cancer Research 11:7872-8, 2005

  22. Adaptive Signature Design: End of Trial Analysis • Compare E to C for all patients at significance level 0.04 • If overall H0 is rejected, then claim effectiveness of E for eligible patients • Otherwise

  23. Otherwise: • Using only the first half of patients accrued during the trial, develop a binary classifier that predicts the subset of patients most likely to benefit from the new treatment E compared to control C • Compare E to C for patients accrued in second stage who are predicted responsive to E based on classifier • Perform test at significance level 0.01 • If H0 is rejected, claim effectiveness of E for subset defined by classifier
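The end-of-trial decision rule on these two slides reduces to a split of the significance level, which the short sketch below makes explicit. The function name and return strings are hypothetical; the two p-values are assumed to come from whatever overall and subset tests the trial uses. Spending 0.04 on the overall test and 0.01 on the subset test keeps the total type I error at or below 0.05.

```python
def end_of_trial_claim(p_overall, p_subset=None,
                       alpha_overall=0.04, alpha_subset=0.01):
    """Return the claim supported by the two pre-specified tests.

    p_overall: p-value comparing E to C in all patients.
    p_subset:  p-value comparing E to C in second-stage patients predicted
               responsive by the classifier built on first-stage patients;
               only consulted when the overall test is not significant.
    """
    if p_overall < alpha_overall:
        return "effective for all eligible patients"
    if p_subset is not None and p_subset < alpha_subset:
        return "effective for classifier-positive subset"
    return "no claim"


# Usage with made-up p-values:
print(end_of_trial_claim(0.20, 0.004))
```

The same function covers other splits of the 0.05 level by changing `alpha_overall` and `alpha_subset`.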

  24. Treatment effect restricted to subset. 10% of patients sensitive, 10 sensitivity genes, 10,000 genes, 400 patients.

  25. Overall treatment effect, no subset effect. 10% of patients sensitive, 10 sensitivity genes, 10,000 genes, 400 patients.

  26. True Model

  27. Classifier Development • Using data from stage 1 patients, fit all single gene logistic models (j=1,…,M) • Select genes with interaction significant at a prespecified level

  28. Classification of Stage 2 Patients • For i’th stage 2 patient, selected gene j votes to classify patient as preferentially sensitive to T if

  29. Classification of Stage 2 Patients • Classify i’th stage 2 patient as differentially sensitive to T relative to C if at least G selected genes vote for differential sensitivity of that patient
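The voting step for stage 2 patients can be sketched as follows. The transcript does not preserve the per-gene voting condition, so the sketch takes it as a caller-supplied function; the example rule shown (expression above a per-gene cutoff) is purely hypothetical and is not the rule from the paper. The names `classify_stage2`, `votes` and `cutoffs` are likewise assumptions.

```python
from typing import Callable, Sequence


def classify_stage2(expression: Sequence[float],
                    selected_genes: Sequence[int],
                    votes: Callable[[int, float], bool],
                    G: int) -> bool:
    """True if at least G selected genes vote for differential sensitivity.

    expression:     expression values of one stage 2 patient, indexed by gene.
    selected_genes: indices of genes selected in stage 1.
    votes:          votes(j, x) says whether gene j votes, given expression x.
    """
    n_votes = sum(1 for j in selected_genes if votes(j, expression[j]))
    return n_votes >= G


# Hypothetical voting rule: gene j votes when expression exceeds its cutoff.
cutoffs = {0: 1.0, 2: 0.5, 3: 2.0}


def example_vote(j: int, x: float) -> bool:
    return x > cutoffs[j]


patient = [1.4, 0.0, 0.7, 1.1]
print(classify_stage2(patient, selected_genes=[0, 2, 3], votes=example_vote, G=2))
```

Keeping the vote rule separate from the counting makes it clear that G and the per-gene rule are tuning choices fixed before stage 2 patients are classified.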

  30. Empirical Power: Response Rate for Control Patients 25%

  31. Adaptive Signature Design for Clinical Trial of Advanced Prostate Cancer Richard Simon, D.Sc. Chief, Biometric Research Branch, National Cancer Institute http://brb.nci.nih.gov

  32. Cancers of a primary site often represent a heterogeneous group of diverse molecular diseases which vary fundamentally with regard to • the oncogenic mutations that cause them • their responsiveness to specific drugs

  33. How can we develop new drugs in a manner more consistent with modern tumor biology and obtain reliable information about what regimens work for what kinds of patients?

  34. Developing a drug with a companion test increases complexity and cost of development but should improve chance of success and has substantial benefits for patients and for the economics of medical care

  35. Although the randomized clinical trial remains of fundamental importance for predictive genomic medicine, some of the conventional wisdom of how to design and analyze RCTs requires re-examination • The concept of doing an RCT of thousands of patients to answer a single question about average treatment effect for a target population presumed homogeneous with regard to the direction of treatment efficacy in many cases no longer has an adequate scientific basis

  36. Predictive biomarkers • Measured before treatment to identify who will benefit from a particular treatment

  37. Prospective Co-Development of Drugs and Companion Diagnostics in Ideal Settings • Develop a completely specified classifier identifying the patients most likely to benefit from a new drug, based on biology, pre-clinical data and phase I-II studies • Establish analytical validity of the classifier • Design and analyze a focused clinical trial to evaluate effectiveness of the new treatment and how it relates to the classifier

  38. Cancer biology is complex and it is not always possible to have the right single completely defined predictive classifier identified and analytically validated by the time the pivotal trial of a new drug is ready to start accrual • Adaptive methods for the refinement and evaluation of predictive biomarkers in the pivotal trials in a non-exploratory manner • Use of archived tissues in focused “prospective-retrospective” designs based on previously conducted randomized pivotal trials (Simon, Paik, Hayes; JNCI 101:1-7, 2009)

  39. Adaptive Signature Design Boris Freidlin and Richard Simon Clinical Cancer Research 11:7872-8, 2005

  40. Adaptive Signature Design: End of Trial Analysis • Compare X to C for all patients at significance level 0.01 • If overall H0 is rejected, then claim effectiveness of X for eligible patients • Otherwise, compare X to C in adaptively defined subset of patients using threshold of statistical significance 0.04

  41. Divide the patients randomly into a training set T and a validation set V. The training set will contain one-third of the patients. • Using the biomarker information, treatment and outcome for the patients in T, develop a binary classifier that identifies the subset of patients who appear most likely to benefit from the new treatment X compared to control C • f(B1,B2,B3,B4) = log hazard ratio of death for X relative to C as a function of biomarker values • If f(B1,B2,B3,B4)/ser < c then Classifier(B1,B2,B3,B4)=X • If f(B1,B2,B3,B4)/ser > c then Classifier(B1,B2,B3,B4)=C • Cutpoint c optimized

  42. Use the classifier developed in training set T to classify the patients in the validation set V. • Let VX denote the subset of patients in V who are classified as likely to benefit from X • Compare survivals of patients who received X to survivals of those who received C for patients accrued in VX • If the difference in survival is significant at level 0.04, then the new treatment is more effective than the control for patients with biomarker values for which Classifier(B1,B2,B3,B4)=X.
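The split into training and validation sets and the construction of VX can be sketched as below. This is an illustration under stated assumptions: "ser" is read as the standard error of the estimated log hazard ratio, the per-patient values `log_hr[i]` and `se[i]` are taken as already computed from a model fitted on the training set, ties at exactly c are assigned to C (the slide does not say), and all function names are hypothetical. How the cutpoint c is optimized is not shown because the slides do not describe it.

```python
import random


def split_training_validation(patient_ids, seed=0):
    """Randomly place one-third of patients in training, the rest in validation."""
    rng = random.Random(seed)
    ids = list(patient_ids)
    rng.shuffle(ids)
    n_train = len(ids) // 3
    return ids[:n_train], ids[n_train:]


def classifier(log_hr, se, c):
    """'X' when the standardized log hazard ratio is below the cutpoint c."""
    return "X" if log_hr / se < c else "C"


def predicted_benefit_subset(validation_ids, log_hr, se, c):
    """VX: validation patients classified as likely to benefit from X."""
    return [i for i in validation_ids
            if classifier(log_hr[i], se[i], c) == "X"]


# Usage with made-up values for three validation patients:
log_hr = {"p1": -0.9, "p2": 0.1, "p3": -0.4}
se = {"p1": 0.3, "p2": 0.3, "p3": 0.3}
print(predicted_benefit_subset(["p1", "p2", "p3"], log_hr, se, c=-1.5))
```

The survival comparison of X against C is then carried out only among the patients returned by `predicted_benefit_subset`.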

  43. This approach can also be used to identify the subset of patients who don’t benefit from X in cases where X is superior to C overall at the 0.01 level. The patients in VC= V – VX are predicted not to benefit from X. Survivals of X vs C can be examined for patients in that subset and a confidence interval for the hazard ratio calculated.

  44. This design has improved statistical power for identifying treatments that benefit a subset of patients in molecularly heterogeneous diseases • It has greater specificity than the standard approach which results in over-treatment of vast numbers of patients with approved drugs that do not benefit them

  45. Sample Size Planning for Advanced Prostate Cancer Trial • Survival endpoint • Final analysis when there are 700 deaths total • 90% power for detecting a 25% overall reduction in hazard at two-sided 0.01 significance level (increase in median from 9 months to 12 months) • Power for evaluating treatment in adaptively determined subset • 157 deaths required for 80% power to detect 37% reduction in hazard at two-sided 0.04 significance level. • If one-third of patients in the validation set are classifier positive, then to have 157 deaths in the subset we need 157*3=471 deaths in the validation set. Since the validation set is two-thirds of the total, we require 707 total deaths. • To have 700 deaths at final analysis, 935 patients will be accrued and followed till the event rate is 75%
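The death counts on this slide can be approximately reproduced with the standard events formula for a two-arm comparison with equal allocation, D = 4(z_{1-α/2} + z_{power})² / (ln HR)². The slide does not say which formula was used, so treating it as this approximation is an assumption; it does return about 157 deaths for the subset analysis. The function name is hypothetical.

```python
from math import ceil, log
from statistics import NormalDist


def events_needed(hazard_ratio, alpha_two_sided, power):
    """Approximate number of events for a 1:1 two-arm comparison."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha_two_sided / 2)
    z_power = z.inv_cdf(power)
    return 4 * (z_alpha + z_power) ** 2 / log(hazard_ratio) ** 2


# Subset analysis: 37% hazard reduction, two-sided 0.04, 80% power.
subset_deaths = events_needed(0.63, 0.04, 0.80)

# One-third of validation patients classifier positive: 157 * 3 deaths in V.
validation_deaths = 157 * 3

# Validation set is two-thirds of the trial: scale up by 3/2 and round up.
total_deaths = ceil(validation_deaths * 3 / 2)

# Overall analysis: 25% hazard reduction, two-sided 0.01, 90% power.
overall_deaths = events_needed(0.75, 0.01, 0.90)

print(round(subset_deaths), validation_deaths, total_deaths, round(overall_deaths))
```

For the overall analysis the same approximation gives a figure somewhat above 700, so the planned 700 deaths presumably reflect slightly different assumptions or rounding.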
