
Model and Variable Selections for Personalized Medicine






Presentation Transcript


  1. Model and Variable Selections for Personalized Medicine Lu Tian (Northwestern University) Hajime Uno (Kitasato University) Tianxi Cai, Els Goetghebeur, L.J. Wei (Harvard University)

  2. Outline • Background and motivation • Developing and evaluating prediction rules based on a set of markers, for continuous or binary outcomes and for censored event-time outcomes • Evaluating the incremental value of a biomarker over the entire population and within various sub-populations • Incorporating the patient-level precision of the prediction: prediction intervals/sets • Remarks

  3. Background and Motivation (figure: diagnosis, prognosis, treatment) • Personalized medicine: using information about a person's biological and genetic make-up to tailor strategies for the prevention, detection, and treatment of disease • Important step: developing prediction rules that can accurately predict the health outcome or the diagnosis of a clinical phenotype

  4. Background and Motivation (figure: predictor Z: subject characteristics, biomarkers, genetic markers; outcome Y: disease status, time to event, treatment response) • Accurate prediction of disease outcome and treatment response, however, is a complex and difficult task • Developing prediction rules involves: identifying important predictors; evaluating the accuracy of the prediction; evaluating the incremental value of new markers

  5. Background and Motivation: AIDS Clinical Trial ACTG320 (diagram: predictors Z = age, CD4 at weeks 0 and 8, RNA at weeks 0 and 8; outcome Y = CD4 at week 24) • Study objective: compare the 3-drug regimen (n = 579): Zidovudine + Lamivudine + Indinavir, with the 2-drug regimen (n = 577): Zidovudine + Lamivudine • Identify biomarkers for predicting the treatment response • How well can we predict the treatment response? • Is RNA needed?

  6. Background and Motivation: Is RNA needed? (table: regression analysis of CD4 at week 24 on the predictors; are the coefficients for RNA significant?)

  7. Coefficient for RNAweek 8highly significant  RNA needed for a more precise prediction of responses?? Background and Motivation AIDS Clinical Trial Regression Coefficient

  8. Background and Motivation: Is RNA needed? • Y = CD4 at week 24, Z = predictors • Prediction rule: Ŷ obtained from the prediction procedure based on regression models • Does adding RNA improve the prediction, i.e., shorten the distance between Ŷ and Y?

  9. Developing Prediction Rules Based on a Set of Markers • Regression approach to approximate Y | Z • Continuous or binary outcome: generalized linear regression • Survival outcome: proportional hazards model, or time-specific prediction models • Regression modeling as a vehicle: the procedure has to be valid even when the imposed statistical model is not the true model!
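To make the regression-as-a-vehicle idea concrete, here is a minimal sketch (in Python, with simulated stand-in data) of fitting a working linear model E(Y | Z) = β'Z and using it as the prediction rule; the variable and function names are illustrative and this is not the authors' code.

```python
# A minimal sketch of a *working* linear regression model used as a prediction rule.
# The data below are simulated stand-ins; the procedure does not require the
# working model to be correctly specified.
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 5
Z = rng.normal(size=(n, p))                                   # hypothetical markers
Y = 1.0 + Z @ np.array([1.0, 0.5, 0.0, 0.0, 2.0]) + rng.normal(size=n)

def fit_working_model(Z, Y):
    """Least-squares fit of the working model E(Y | Z) = b'Z (intercept included)."""
    X = np.column_stack([np.ones(len(Y)), Z])
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return beta

def predict(beta, Z):
    """Predicted response Yhat for covariates Z under the fitted working model."""
    return np.column_stack([np.ones(len(Z)), Z]) @ beta

beta_hat = fit_working_model(Z, Y)
Y_hat = predict(beta_hat, Z)
```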

  10. Developing and Evaluating Prediction Rules • Predict Y with Z based on the prediction model, giving the prediction Ŷ • Evaluate the performance of the prediction by the average "distance" between Ŷ and Y • The utility or cost of predicting Y as Ŷ is a distance D(Y, Ŷ) • The average "distance" is D = E{D(Y, Ŷ)} • Examples: absolute prediction error D = E|Y − Ŷ|; total "cost" of risk stratification (for a binary outcome, Y = 0 vs. Y = 1)
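A minimal sketch of the corresponding estimate of the average distance D, using the absolute prediction error as the default distance; `average_distance` is a hypothetical helper, and the `dist` argument can be swapped for any other cost function (for example a cost-weighted misclassification rate for a binary Y).

```python
# Apparent (in-sample) estimate of D = E{D(Y, Yhat)}; by default D(Y, Yhat) = |Y - Yhat|.
# Y and Y_hat are arrays such as those produced by the previous sketch.
import numpy as np

def average_distance(Y, Y_hat, dist=lambda y, yhat: np.abs(y - yhat)):
    """Empirical average of a generic distance/cost function over the sample."""
    return float(np.mean(dist(np.asarray(Y), np.asarray(Y_hat))))

# Example: average_distance(Y, Y_hat) is the apparent absolute prediction error.
```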

  11. Evaluating and Comparing Prediction Rules • The performance of the prediction model/rule can be estimated by the empirical average distance D̂ • Prediction model/rule comparison: prediction with E(Y | Z) = g1(a'Z) vs. E(Y | W) = g2(b'W) • Compare the two models/rules by comparing D̂1 and D̂2
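A sketch of the comparison step: compute D̂ for each rule on the same outcomes (for example, the models with and without the RNA markers) and look at the difference D̂1 − D̂2. The helper name `compare_rules` is illustrative.

```python
# Compare two prediction rules by the difference in their estimated average distances.
import numpy as np

def compare_rules(Y, Y_hat1, Y_hat2, dist=lambda y, yhat: np.abs(y - yhat)):
    """Return (D1, D2, D1 - D2) for two sets of predictions of the same outcomes."""
    Y = np.asarray(Y)
    d1 = float(np.mean(dist(Y, np.asarray(Y_hat1))))
    d2 = float(np.mean(dist(Y, np.asarray(Y_hat2))))
    return d1, d2, d1 - d2
```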

  12. Variability in the Estimated Prediction Performance Measures • Variability in the prediction errors: is an estimate D̂ = 50 with SE = 1, or with SE = 50? • Inference about D and Δ = D1 − D2 • Confidence intervals based on large-sample approximations to the distribution of the estimators
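The slide appeals to large-sample approximations for the distribution of D̂; as a simple stand-in, the sketch below uses a nonparametric bootstrap over subjects to obtain a standard error and interval for D̂ (or, applied to the difference, for Δ̂). A more faithful version would refit the prediction rule within each resample.

```python
# Bootstrap over subjects as a stand-in for the large-sample variance of D-hat.
import numpy as np

def bootstrap_D(Y, Y_hat, dist=lambda y, yhat: np.abs(y - yhat),
                n_boot=1000, alpha=0.05, seed=0):
    """Standard error and (1 - alpha) percentile interval for the estimated distance."""
    rng = np.random.default_rng(seed)
    Y, Y_hat = np.asarray(Y), np.asarray(Y_hat)
    n = len(Y)
    draws = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)        # resample subjects with replacement
        draws[b] = np.mean(dist(Y[idx], Y_hat[idx]))
    se = float(draws.std(ddof=1))
    lo, hi = np.quantile(draws, [alpha / 2, 1 - alpha / 2])
    return se, (float(lo), float(hi))
```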

  13. Bias Correction • Bias issue in the apparent-error type estimators • Bias correction via cross-validation: partition the data into Tk and Vk; for each partition, fit the rule on the observations in Tk and evaluate the distance on the observations in Vk; obtain the cross-validated estimator by averaging over partitions • The apparent and cross-validated estimators have the same limiting distribution
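A sketch of the cross-validation scheme described above: split the data into K partitions, fit the rule on each training part Tk, evaluate the distance on the held-out part Vk, and average. The `fit` and `predict` arguments are callables such as the hypothetical ones sketched earlier.

```python
# K-fold cross-validated estimate of the prediction error (bias correction of the
# apparent error).
import numpy as np

def cross_validated_D(Z, Y, fit, predict, dist=lambda y, yhat: np.abs(y - yhat),
                      K=5, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(Y))
    folds = np.array_split(idx, K)
    errs = []
    for k in range(K):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(K) if j != k])
        rule = fit(Z[train], Y[train])                         # fitted on T_k only
        errs.append(np.mean(dist(Y[test], predict(rule, Z[test]))))  # evaluated on V_k
    return float(np.mean(errs))
```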

  14. Example: AIDS Clinical Trial • Objective: identify biomarkers to predict the treatment response • Outcome: Y = CD4 at week 24 • Predictors Z: age, CD4 at week 0, CD4 at week 8, RNA at week 0, RNA at week 8 • Working model: E(Y | Z) = β'Z

  15. Example: AIDS Clinical Trial: Incremental Value of RNA (table: estimates with 95% C.I.; * denotes standard-error estimates)

  16. Incremental Value of RNA within Various Sub-populations

  17. Trandolapril Cardiac Evaluation Study (Kober et al 2005, NEJM) • Prognostic importance of left ventricular dysfunction • Thune et al (2005): DIAMOND study • TRACE study (Kober et al 2005, NEJM) • Designed to determine whether patients with left ventricular dysfunction soon after myocardial infarction benefit from long-term oral ACE inhibition • Between 1990 and 1992, a total of 6676 patients with myocardial infarction were screened with echocardiography • A total of 5921 subjects had available data

  18. Trandolapril Cardiac Evaluation Study (Kober et al 2005, NEJM) • Routine markers include: age, creatinine (CRE), occurrence of heart failure (CHF), history of diabetes (DIA), history of hypertension (HYP), cardiogenic shock after MI (KS) • We are interested in evaluating the incremental value of the wall motion index (WMI)

  19. Trandolapril Cardiac Evaluation Study (Kober et al 2005, NEJM) • Does WMI improve the prediction of 5-year survival?

  20. Population-Average Incremental Value of WMI: Predicting 5-Year Survival (5-year mortality rate = 42%)

  21. (figure: D1 and D2)

  22. Gain Due to WMI

  23. Gain Due to WMI with Respect to D (figure: panels with the parameter set to 1, 4, and 9)

  24. Example: Breast Cancer Gene Expression Study • Objective: construct a new classifier that can accurately predict future disease outcome • van't Veer et al (2002) established a classifier based on a 70-gene profile: a good- or poor-prognosis signature defined by the correlation of a tumor's profile with the previously determined average profile of tumors from patients with good prognosis • Classify subjects as good prognosis if the gene score > cut-off, and as poor prognosis if the gene score < cut-off • van de Vijver et al (2002) evaluated the accuracy of this classifier using hazard ratios and signature-specific Kaplan-Meier curves

  25. Example: Breast Cancer Gene Expression Study • Data consist of 295 subjects • Outcome T: time to death • Predictors: lymph-node status, estrogen-receptor status, gene score • We are interested in: constructing prediction rules for identifying subjects who would survive t years, i.e., Y = I(T ≥ t) = 1; and evaluating the incremental value of the gene score
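Because Y = I(T ≥ t) is not observed for subjects censored before t, some handling of the censoring is needed to build and evaluate t-year prediction rules. The sketch below shows one common device, inverse-probability-of-censoring weighting with a Kaplan-Meier estimate of the censoring distribution; this is an illustrative stand-in, not necessarily the estimator used in the talk, and ties and left-limits are glossed over.

```python
# IPCW weights for the t-year survival indicator Y = I(T >= t) under right censoring.
import numpy as np

def km_censoring_survival(X, delta):
    """Kaplan-Meier estimate of the censoring survival function G(u) = P(C > u).
    X: follow-up time min(T, C); delta: 1 if the event was observed, 0 if censored."""
    order = np.argsort(X)
    X, cens = X[order], (1 - delta[order])      # censoring plays the role of the "event"
    n = len(X)
    at_risk = n - np.arange(n)
    surv = np.cumprod(1.0 - cens / at_risk)
    def G(u):
        idx = np.searchsorted(X, u, side="right") - 1
        return float(surv[idx]) if idx >= 0 else 1.0
    return G

def ipcw_weights(X, delta, t):
    """Weight for each subject when estimating quantities involving Y = I(T >= t)."""
    X = np.asarray(X, float)
    delta = np.asarray(delta, int)
    G = km_censoring_survival(X, delta)
    w = np.zeros(len(X))
    for i, (x, d) in enumerate(zip(X, delta)):
        if x >= t:                              # known to survive past t
            w[i] = 1.0 / G(t)
        elif d == 1:                            # event observed before t
            w[i] = 1.0 / G(x)
        # censored before t: outcome unknown, weight stays 0
    return w
```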

  26. Example: Breast Cancer Data: Predicting 10-Year Survival

  27. Evaluating the Prediction Rule Based on Various Accuracy Measures • For a future patient with T0 and Z0, we predict Ŷ0 • Classification accuracy measures: sensitivity and specificity • Prediction accuracy measures: positive and negative predictive values
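A minimal sketch of these accuracy measures for a rule of the form Ŷ = I(score ≥ cut-off); the function name is illustrative, and censoring is ignored here (the talk's survival setting would need weighting such as the IPCW sketch above).

```python
# Classification and predictive accuracy measures for a thresholded risk score.
import numpy as np

def accuracy_measures(Y, score, cutoff):
    """Y: observed binary outcome; score: estimated risk or score; cutoff: threshold."""
    Y = np.asarray(Y, dtype=bool)
    Y_hat = np.asarray(score) >= cutoff
    return {
        "sensitivity": float(np.mean(Y_hat[Y])),    # P(Yhat = 1 | Y = 1)
        "specificity": float(np.mean(~Y_hat[~Y])),  # P(Yhat = 0 | Y = 0)
        "PPV": float(np.mean(Y[Y_hat])),            # P(Y = 1 | Yhat = 1)
        "NPV": float(np.mean(~Y[~Y_hat])),          # P(Y = 0 | Yhat = 0)
    }
```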

  28. Example: Breast Cancer Data: Predicting 10-Year Survival (figure: naïve, clinical, and clinical + gene rules, with the van de Vijver classifier marked)

  29. Example: Breast Cancer Data • To compare Model II: g(a + Node + ER) with Model III: g(a + Node + ER + Gene) • Choosing cut-off values for each model to achieve sensitivity SE = 69%, a value attainable for Model II, gives • Model II: SP = 0.45, PPV = 0.35, NPV = 0.77 • Model III: SP = 0.75, PPV = 0.54, NPV = 0.85 • 95% CIs for the differences: SP [0.11, 0.45], PPV [0.01, 0.24], NPV [0.06, 0.19]
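A sketch of the cut-off selection behind this kind of comparison: for each model, pick the threshold whose sensitivity still reaches the common target (69% here), then compare specificity, PPV, and NPV at those model-specific thresholds. The helper name is hypothetical.

```python
# Choose, for a given model's risk score, the largest threshold whose sensitivity
# still reaches a target value, so the two models are compared at equal sensitivity.
import numpy as np

def cutoff_for_sensitivity(Y, score, target=0.69):
    """Largest threshold whose sensitivity is still >= the target."""
    Y = np.asarray(Y, dtype=bool)
    score = np.asarray(score)
    best = None
    for c in np.sort(np.unique(score)):   # sensitivity is non-increasing in the cutoff
        if np.mean(score[Y] >= c) >= target:
            best = c
        else:
            break
    return best

# Usage idea: feed the chosen cutoff into accuracy_measures(Y, score, cutoff)
# from the earlier sketch and compare SP/PPV/NPV across models.
```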

  30. Prediction Interval: Accounting for the Precision of the Prediction • Based on a prediction model, predict the response and summarize the corresponding population-average accuracy • What if the population-average accuracy of 70% is not satisfactory? How can 90% accuracy be achieved? • What if the rule can predict Y0 precisely for certain Z0 but fails to predict Y0 accurately for others? • How can we account for the precision of the prediction and identify patients who need further assessment?

  31. Classic rule: risk of death < 0.50 → survivor {Y = 0}; risk of death ≥ 0.50 → non-survivor {Y = 1} (figure: a patient with predicted risk 0.04 is classified {0}; a patient with predicted risk 0.51 is classified {1})

  32. Prediction Interval • To account for patient-level prediction error, one may instead predict a set or interval for Y0 such that it contains Y0 with a prescribed probability (e.g., 90%) • The optimal interval for the sub-population with covariates Z0 is built from the estimated conditional density function of Y given Z0
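For a binary outcome this idea simplifies considerably. The sketch below is a simplified illustration: it returns a singleton set only when the estimated conditional probability of that outcome already exceeds 1 − α, and {0, 1} otherwise; the talk's construction, based on the estimated conditional distribution, also allows the empty set.

```python
# Simplified patient-level (1 - alpha) prediction set for a binary outcome.
def prediction_set(p_hat, alpha=0.10):
    """p_hat: estimated P(Y0 = 1 | Z0). Returns the prediction set for Y0."""
    if p_hat >= 1 - alpha:
        return {1}
    if 1 - p_hat >= 1 - alpha:
        return {0}
    return {0, 1}

# e.g. prediction_set(0.04) -> {0}, while prediction_set(0.51) -> {0, 1},
# matching the two patients shown on the next slide.
```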

  33. Example: Breast Cancer Study • Data: 295 patients • Response: 10-year survival • Predictors: lymph-node status, estrogen-receptor status, gene score • Model • Possible prediction sets: {}, {0}, {1}, {0, 1} • Classic prediction considers {0} and {1} only

  34. (figure: a patient with predicted risk 0.04 receives the 90% prediction set {0}; a patient with predicted risk 0.51 receives the 90% prediction set {0, 1})

  35. Example: Breast Cancer Study: Prediction Sets Based on Clinical + Gene Score (table: proportions of patients assigned to each prediction set; values shown: (0%) 4%, (63%) 39%, (37%) 57%)

  36. Remarks • Proper choice of the accuracy/cost measure • Classification accuracy vs. predictive values • Utility function: what is the consequence of predicting a subject with outcome Y as Ŷ? • With an expensive or invasive marker: should it be applied to the entire population? Is it helpful for a certain sub-population? Should the cost of the marker be considered when evaluating its value?
