
Variable Selection for Optimal Decision Making

Susan Murphy & Lacey Gunter, University of Michigan Statistics Department. Artificial Intelligence Seminar. Joint work with Ji Zhu.


Presentation Transcript


  1. Variable Selection for Optimal Decision Making • Susan Murphy & Lacey Gunter, University of Michigan Statistics Department • Artificial Intelligence Seminar • Joint work with Ji Zhu

  2. Simple Motivating Example • Nefazodone - CBASP Trial: patients randomized (R) to Nefazodone or to Nefazodone + Cognitive Behavioral-analysis System of Psychotherapy (CBASP) • 50+ baseline covariates, both categorical and continuous

  3. Complex Motivating Example

  4. Outline • Framework and notation for decision making • Need for variable selection • Variables that are important to decision making • Introduce a new technique • Simulated and real data results • Future work

  5. Optimal Decision Making • 3 components: observations X = (X1, X2,…, Xp), action, A, and reward, R • A policy, π, maps observations, X, to actions, A • Policies compared via expected mean reward, Vπ = Eπ[R], called the Value of π (Sutton & Barto, 1998) • Long Term Goal: find a policy, π*, for which $V^{\pi^*} = \max_{\pi} V^{\pi}$

  6. Some Working Assumptions • Data collection is difficult and expensive • limited number of trajectories (<1000) • training set with randomized actions • many observations • Finite horizon (only 1-4 time points) • we will initially work with just one time point • Noisy data with little knowledge about underlying system dynamics • Little knowledge about which variables are most important for decision making

  7. Simple Example • A clinical trial to test two alternative drug treatments • The goal: to discover which treatment is optimal for any given future patient

  8. Variable Selection • Multiple reasons for variable selection in decision making, for example • Better performance: avoid inclusion of spurious variables that lead to bad policies • Limited resources: only small number of variables can be collected when enacting policies in a real world setting • Interpretability: policies with fewer variables are easier to understand

  9. What are people currently using? • Variable selection for reinforcement learning in medical settings predominantly guided by expert opinion • Predictive selection techniques, such as Lasso (Loth et al., 2006) and decision trees (Ernst et al., 2005) have been proposed • Good predictive variables are useful in decision making, but are only a small part of the puzzle • Need variables that help determine optimal actions, variables that qualitatively interact with the action

  10. Qualitative Interactions • What is a qualitative interaction? • X qualitatively interacts with A if at least two distinct, non-empty sets exist within the space of X for which the optimal action is different (Peto, 1982) • [Figure: three panels: No Interaction; Non-qualitative Interaction; Qualitative Interaction] • Qualitative interactions tell us which actions are optimal

  11. Qualitative Interactions • We focus on two important factors • The magnitude of the interaction between the variable and the action • The proportion of patients whose optimal choice of action changes given knowledge of the variable • [Figure: three panels: big interaction, big proportion; small interaction, big proportion; big interaction, small proportion]

  12. Variable Ranking for Qualitative Interactions • We propose ranking the variables in X based on potential for a qualitative interaction with A • We give a score for ranking the variables • Given data on i = 1,…, n subjects with j = 1,…, p variables in X, along with an action, A, and a reward, R, for each subject • For Ê[R| A=a] an estimator of E[R| A=a], define

  13. Variable Ranking Components • Ranking score based on 2 usefulness factors • Interaction Factor (worked example): min = 0.3 – 0.7 = –0.4; max = 1 – 0 = 1; Dj = max – min = 1 – (–0.4) = 1.4
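The arithmetic on this slide can be reproduced with a short sketch. It assumes two actions (coded 1 and 0) and that the interaction factor is the range, across subjects, of the difference in estimated reward between the two actions. The function name and the two-subject input are hypothetical, chosen only to match the numbers shown above.

```python
def interaction_factor(est_reward_a1, est_reward_a0):
    """Range (max minus min) of the per-subject difference in estimated reward.

    est_reward_a1[i] and est_reward_a0[i] are assumed to be estimates of
    E[R | Xj = x_ij, A = a] for a = 1 and a = 0 (an assumption of this sketch).
    """
    diffs = [r1 - r0 for r1, r0 in zip(est_reward_a1, est_reward_a0)]
    return max(diffs) - min(diffs)


# Hypothetical estimates for two subjects, matching the slide's arithmetic:
# subject 1: 0.3 - 0.7 = -0.4 (the min); subject 2: 1 - 0 = 1 (the max).
est_a1 = [0.3, 1.0]
est_a0 = [0.7, 0.0]
d_j = interaction_factor(est_a1, est_a0)
print(round(d_j, 2))
```

A variable whose estimated treatment difference barely moves across subjects gets a factor near 0, whichever action is better overall.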

  14. Variable Ranking Components • Proportion Factor: 2 out of 7 subjects would change choice of optimal action given Xj
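A minimal sketch of the proportion factor, under the assumption that it is the fraction of subjects whose best action given Xj differs from the action that is best overall. The seven-subject data set, the function name, and the tie-breaking rule (ties go to action 0) are all hypothetical.

```python
def proportion_factor(est_reward_a1, est_reward_a0, overall_best_action):
    """Fraction of subjects whose best action given Xj differs from the overall best."""
    changed = 0
    for r1, r0 in zip(est_reward_a1, est_reward_a0):
        best_given_x = 1 if r1 > r0 else 0  # assumption: ties go to action 0
        if best_given_x != overall_best_action:
            changed += 1
    return changed / len(est_reward_a1)


# Hypothetical estimates for 7 subjects; action 1 is assumed best overall.
est_a1 = [1.0, 0.9, 0.8, 0.7, 0.6, 0.3, 0.4]
est_a0 = [0.0, 0.2, 0.3, 0.4, 0.5, 0.7, 0.6]
p_j = proportion_factor(est_a1, est_a0, overall_best_action=1)
print(p_j)  # 2 of the 7 subjects do better under action 0
```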

  15. Ranking Score • Ranking Score: • Score Uj, j = 1,…, p, can be used to rank the p variables in X based on their potential for a qualitative interaction with A
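The score formula itself is not preserved in this transcript. As an illustration only, the sketch below assumes one plausible way to combine the two factors: scale each by its maximum across variables and multiply. The function names, variable names, and input values are hypothetical.

```python
def ranking_scores(interaction, proportion):
    """Combine the two factors into one score per variable.

    ASSUMPTION (not taken from the slides): score = (D_j / max D) * (P_j / max P).
    """
    d_max = max(interaction.values())
    p_max = max(proportion.values())
    if d_max <= 0 or p_max <= 0:
        return {name: 0.0 for name in interaction}
    return {
        name: (interaction[name] / d_max) * (proportion[name] / p_max)
        for name in interaction
    }


def top_k(scores, k):
    """Names of the k highest-scoring variables, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical factors for three variables, loosely mirroring the earlier panels:
# x1: big interaction, big proportion; x2: small interaction; x3: small proportion.
D = {"x1": 1.4, "x2": 0.7, "x3": 1.4}
P = {"x1": 2 / 7, "x2": 2 / 7, "x3": 1 / 7}
U = ranking_scores(D, P)
selected = top_k(U, k=1)
```

Under this assumed form, a variable ranks highly only when both factors are large, which matches the emphasis on magnitude and proportion together.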

  16. Variable Selection Algorithm • Step 1: Select important main effects of X on R using some predictive variable selection method; choose the tuning parameter value that gives the best predictive model • Step 2: Rank variables in X using score Uj; select top k in rank • Step 3: Again use a predictive variable selection method, this time selecting among main effects of X from step 1, main effect of A, and ranked interactions from step 2; choose the tuning parameter value such that the total subset of variables selected leads to a policy with the highest estimated Value

  17. Simulation • Data simulated under wide variety of scenarios (with and without qualitative interactions) • Used observation matrix, X, and actions, A, from a real data set • Generated new rewards, R, based on several different realistic models • Compared new ranking method Uj versus a standard method • 1000 repetitions: recorded percentage of time each interaction was selected for each method

  18. Methods Used in Simulation • Standard Method: Lasso on (X, A, XA) (Tibshirani, 1996) • The standard Lasso minimization criterion is $\sum_{i=1}^{n} (R_i - Z_i^{T}\beta)^2 + \lambda \sum_{j} |\beta_j|$, where Zi is the vector of predictors for observation i and λ is a penalty parameter • Coefficient for A, βp+1, not included in penalty term • Value of λ chosen by cross-validation on the prediction error

  19. Methods Used in Simulation • New Method: • Step 1: Select important main effects of X on R using Lasso; choose λ value by cross-validation on prediction error • Step 2: Rank variables in X using score Uj; select top k in rank • Step 3: Use Lasso to select among main effects of X chosen in step 1, main effect of A, and interactions chosen in step 2; choose λ value such that the total subset of variables selected leads to a policy with the highest estimated Value

  20. Simulation Results • [Plots: binary qualitative interaction (×) vs. spurious interactions; continuous qualitative interaction (×) vs. spurious interactions]

  21. Simulation Results • [Plots: binary qualitative interaction (×) vs. non-qualitative and spurious interactions; continuous qualitative interaction (×) vs. non-qualitative and spurious interactions]

  22. Depression Study Analysis • Data from a randomized controlled trial comparing treatments for chronic depression (Keller et al., 2000) • n = 440 patients, p = 64 observation variables in X, actions A = Nefazodone or A = Nefazodone + Cognitive psychotherapy (CBASP) • Reward, R = Hamilton Rating Scale for Depression score

  23. Depression Study Results • Ran both methods on 1000 bootstrap samples • Resulting selection percentages: [Plots; labeled variables: OCD, ALC2, ALC2, Som Anx, ALC1]

  24. Inclusion Thresholds • Based on previous plots, which variables should we select? • Need inclusion thresholds • Idea: remove effect of X on R from data, then run algorithm to determine maximum percentage of selections • this tells us the noise threshold • variables with percentages above this threshold are selected

  25. Inclusion Thresholds • Do 100 times • Randomly assign the observed rewards to different subjects given a particular action • Run the methods on new data • Record the variables that were selected by each method • Threshold: largest percentage of time a variable was selected over the 100 iterations
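The permutation procedure on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: `select_variables` is a hypothetical callable standing in for either selection method, and the threshold is returned as a percentage.

```python
import random
from collections import Counter


def permute_rewards_within_action(actions, rewards, rng):
    """Shuffle the observed rewards among subjects who received the same action."""
    permuted = list(rewards)
    for action in set(actions):
        idx = [i for i, a in enumerate(actions) if a == action]
        shuffled = [rewards[i] for i in idx]
        rng.shuffle(shuffled)
        for i, r in zip(idx, shuffled):
            permuted[i] = r
    return permuted


def inclusion_threshold(X, actions, rewards, select_variables, n_iter=100, seed=0):
    """Largest percentage of iterations in which any single variable was selected."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_iter):
        permuted = permute_rewards_within_action(actions, rewards, rng)
        # select_variables is a hypothetical stand-in for the selection method;
        # it is assumed to return an iterable of selected variable names.
        counts.update(set(select_variables(X, actions, permuted)))
    if not counts:
        return 0.0
    return 100.0 * max(counts.values()) / n_iter
```

Shuffling within each action group breaks any link between X and R while leaving the reward distribution for each action intact, so whatever is still selected reflects noise.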

  26. Thresholds for Depression Study • We should disregard any interactions selected 6% of the time or less when using either method

  27. Threshold on Results • New method U includes 2 indicator variables for Alcohol problems and Somatic Anxiety Score • Standard Lasso includes 39 variables! • [Plot; labeled variables: ALC2, Som Anx, ALC1]

  28. Future Work • Extend algorithm to select variables for multiple time points • How best to do this? • What rewards to use at each time point? • Do we need to adjust the distribution of our X based on prior actions? • In what order should variable selection be done?

  29. Other Issues To Think About • Do we need to account for variability in our estimate of E[R| Xj, A=a] over different Xj? • Can we reasonably estimate the value of a derived policy from a fixed data set collected under random actions when the number of time points gets larger? • Any other issues?

  30. References & Acknowledgements • For more information see: L. Gunter, J. Zhu, S.A. Murphy (2007). Variable Selection for Optimal Decision Making. Technical Report, Department of Statistics, University of Michigan. • This work was partially supported by NIH grants R21 DA019800, K02 DA15674, P50 DA10075 • Technical and data support: • A. John Rush, MD, Betty Jo Hay Chair in Mental Health at the University of Texas Southwestern Medical Center, Dallas • Martin Keller and the investigators who conducted the trial 'A Comparison of Nefazodone, the Cognitive Behavioral-analysis System of Psychotherapy, and Their Combination for Treatment of Chronic Depression'

  31. Addressing Concerns • Much of the biostatistics literature discourages looking for qualitative interactions and is very skeptical when new interactions are found. Why is this? • Qualitative interactions are hard to find and have small effects • Too many people fishing without disclosing • Strict entry criteria for most clinical trials mean small variability in X, which precludes looking at interesting subgroups • How are we addressing these concerns? • Testing new algorithms in multiple settings where no qualitative interactions exist

  32. No Interaction: What can we expect? • [Plots, all for scenarios with no qualitative interactions: No relationship between (X, A, X*A) and R; Main effects of X & moderate effect of A only; Main effects of X only; Everything but qualitative interactions]

  33. Estimating the Value • Fit selected variables into chosen estimator, Ê • Estimate optimal policy: • Estimate Value of the estimated policy by:
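The formulas for this slide are not preserved in the transcript. As a hedged illustration of the general idea, the sketch below takes the estimated policy to be the action with the highest estimated reward, then estimates its Value with a normalized inverse-probability-weighted average over the subjects whose randomized action agrees with the policy. This is a standard option for randomized data and may differ from the authors' estimator. The names `est_reward`, `greedy_policy`, `ipw_value`, and the 0.5 randomization probability are assumptions.

```python
def greedy_policy(est_reward, actions=(0, 1)):
    """Policy that picks the action with the largest estimated reward.

    est_reward(x, a) is assumed to return an estimate of E[R | X = x, A = a].
    """
    def policy(x):
        return max(actions, key=lambda a: est_reward(x, a))
    return policy


def ipw_value(X, actions, rewards, policy, propensity=0.5):
    """Normalized inverse-probability-weighted estimate of a policy's Value.

    Only subjects whose randomized action matches the policy contribute.
    propensity is the assumed known probability of the action received.
    """
    numerator = 0.0
    denominator = 0.0
    for x, a, r in zip(X, actions, rewards):
        if policy(x) == a:
            numerator += r / propensity
            denominator += 1.0 / propensity
    return numerator / denominator if denominator else float("nan")


# Hypothetical fitted model: action 1 helps when x[0] > 0, action 0 otherwise.
def est_reward(x, a):
    return x[0] if a == 1 else -x[0]


X = [[-1.0], [-1.0], [1.0], [1.0]]
A = [0, 1, 0, 1]
R = [1.0, 0.0, 0.0, 2.0]
pi_hat = greedy_policy(est_reward)
value_hat = ipw_value(X, A, R, pi_hat)
```

With equal randomization the weights cancel, and the estimate reduces to the mean reward among subjects who happened to receive the action the policy recommends.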

  34. Estimating the Value (2 time points) • Estimate of the optimal policy: • Estimate Value of the estimated policy by:
