
A Strategy for Making Predictions under Manipulation

Ioannis Tsamardinos, Assistant Professor, Computer Science Department, University of Crete; ICS, Foundation for Research and Technology - Hellas. Laura E. Brown, Ph.D. Candidate, Dept. Biomedical Inf., Vanderbilt Univ.



Presentation Transcript


  1. A Strategy for Making Predictions under Manipulation • Ioannis Tsamardinos, Assistant Professor, Computer Science Department, University of Crete; ICS, Foundation for Research and Technology - Hellas • Laura E. Brown, Ph.D. Candidate, Dept. Biomedical Inf., Vanderbilt Univ.

  2. Selecting a Formulation of Causality • Causal Bayesian Networks • Cross-sectional data • No explicit notion of time • No feedback cycles allowed • Edges express causal relations • Distribution expressed as P(V) = ∏i P(Vi | Pa(Vi)), where Pa(Vi) are the parents of Vi [Figure: causal graph over V1 to V6 and T] I. Tsamardinos, CSD, University of Crete

  3. Effect of Manipulation • Manipulate V1, V5 [Figure: causal graph over V1 to V6 and T]

  4. Effect of Manipulation • Manipulate V1, V5 • External manipulator E [Figure: original graph beside the graph with the external manipulator E added]

  5. Effect of Manipulation • Manipulate V1, V5 • Other parents are removed [Figure: original graph beside the manipulated graph with external manipulator E]
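
The removal of "other parents" can be sketched as graph surgery on a parent map. This is a minimal sketch, not the authors' code: the edge set in `PARENTS` is hypothetical (the figure's actual edges are not recoverable from the transcript), and the function name `manipulate` is introduced here for illustration.

```python
# Hypothetical example graph; each node maps to its direct causes (parents).
PARENTS = {
    "V1": {"V2"},
    "V2": set(),
    "V3": set(),
    "T": {"V1", "V3"},
    "V4": {"T"},
    "V5": {"T", "V4"},
    "V6": {"V5"},
}


def manipulate(parents, manipulated):
    """Return a new parent map in which every manipulated node
    keeps only the external manipulator E as its parent."""
    surgered = {}
    for node, pa in parents.items():
        surgered[node] = {"E"} if node in manipulated else set(pa)
    surgered["E"] = set()  # E itself has no causes in the model
    return surgered


after = manipulate(PARENTS, {"V1", "V5"})
print(after["V1"], after["V5"])  # both now depend only on E
```

The original map is left untouched, so the same graph can be reused for several candidate manipulation sets.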

  6. Effect of Manipulation • M: the set of manipulated variables [Figure: manipulated graph with external manipulator E] J. Pearl. Causality: Models, Reasoning, and Inference, 2000.
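
The slide's formula is not preserved in the transcript. Under the assumption that it follows Pearl's truncated factorization, with each manipulated variable depending only on the external manipulator E, the manipulated distribution would read as below. Pa(V_i), the parent set of V_i, is notation introduced here.

```latex
P_M(V) \;=\; \prod_{V_i \notin M} P\bigl(V_i \mid \mathrm{Pa}(V_i)\bigr)
        \;\prod_{V_i \in M} P_M\bigl(V_i \mid E\bigr)
```

The first product reuses the conditionals of the unmanipulated distribution; only the factors of the manipulated variables change.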

  7. Types of Predictive Tasks • No manipulations • Known set of manipulated variables M: from data following P(V), predict data following PM(V); the way manipulations are performed is unknown, i.e. the PM(Vi | E) are unknown • Unknown M

  8. The Markov Blanket of T • The set of direct causes, direct effects, and direct causes of direct effects [Figure: causal graph over V1 to V6 and T]
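
The definition above translates directly into set operations on a parent map. A minimal sketch, assuming a hypothetical edge set (the figure's edges are not recoverable from the transcript); `markov_blanket` is a name introduced for illustration.

```python
# Hypothetical example graph; each node maps to its direct causes (parents).
PARENTS = {
    "V1": {"V2"},
    "V2": set(),
    "V3": set(),
    "T": {"V1", "V3"},
    "V4": {"T"},
    "V5": {"T", "V4"},
    "V6": {"V5"},
}


def markov_blanket(parents, target):
    """Direct causes, direct effects, and direct causes of direct effects."""
    children = {node for node, pa in parents.items() if target in pa}
    spouses = set()
    for child in children:
        spouses |= parents[child]  # other direct causes of each effect
    return (parents[target] | children | spouses) - {target}


mb = markov_blanket(PARENTS, "T")
print(sorted(mb))  # ['V1', 'V3', 'V4', 'V5']
```

In this assumed graph V4 is both a direct effect of T and a direct cause of the effect V5, while V2 and V6 fall outside the blanket.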

  9. The Manipulated Markov Blanket of T • The set of direct causes, direct effects, and direct causes of direct effects in the manipulated distribution • E.g. with V1 and V5 manipulated [Figure: causal graph over V1 to V6 and T]
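
One way to compute the manipulated blanket is to cut the incoming edges of the manipulated nodes and then apply the ordinary Markov blanket definition. The sketch below assumes a hypothetical edge set (the figure's edges are not recoverable from the transcript); the helper names are illustrative only.

```python
# Hypothetical example graph; each node maps to its direct causes (parents).
PARENTS = {
    "V1": {"V2"},
    "V2": set(),
    "V3": set(),
    "T": {"V1", "V3"},
    "V4": {"T"},
    "V5": {"T", "V4"},
    "V6": {"V5"},
}


def manipulate(parents, manipulated):
    """Manipulated nodes keep only the external manipulator E as a parent."""
    surgered = {n: ({"E"} if n in manipulated else set(pa))
                for n, pa in parents.items()}
    surgered["E"] = set()
    return surgered


def markov_blanket(parents, target):
    """Direct causes, direct effects, and direct causes of direct effects."""
    children = {n for n, pa in parents.items() if target in pa}
    spouses = set()
    for child in children:
        spouses |= parents[child]
    return (parents[target] | children | spouses) - {target}


mb_m = markov_blanket(manipulate(PARENTS, {"V1", "V5"}), "T")
print(sorted(mb_m))  # ['V1', 'V3', 'V4']
```

In this assumed graph, manipulating V5 removes it from the blanket because it is no longer an effect of T, while the manipulated V1 stays because it is still a direct cause of T.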

  10. Properties of MB(T) • The smallest-size, most-predictive subset of variables • All and only the variables we need for building optimal predictive models I. Tsamardinos and C. F. Aliferis. Towards principled feature selection: Relevancy, Filters and Wrappers. AI & Statistics, 2003.

  11. A. No Manipulations • Find the MB(T) • Fit a model from training data for P(T | MB(T)), using only the variables of the MB(T)

  12. B. Known M • Find the MBM(T) • Fit a model from training data, using only the variables of the MBM(T) • Proposition: PM(T | MBM(T)) = P(T | MBM(T)), provided that no manipulated spouse of T is a descendant of T in the unmanipulated distribution

  13. Can Be Fit From Unmanipulated Data • M = {V1, V5} • PM(T | MBM(T)) = P(T | MBM(T)) [Figure: causal graph over V1 to V6 and T]

  14. Cannot Be Fit From Unmanipulated Data • M = {V1, V4} • PM(T | MBM(T)) ≠ P(T | MBM(T)) [Figure: causal graph over V1 to V6 and T]
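
The condition in the proposition for known M can be checked mechanically once a graph is given. The sketch below assumes a hypothetical edge set, chosen so that it reproduces the two cases above (the figure's actual edges are not recoverable from the transcript). The function names are illustrative, and the check only tests the stated sufficient condition.

```python
# Hypothetical example graph; each node maps to its direct causes (parents).
PARENTS = {
    "V1": {"V2"},
    "V2": set(),
    "V3": set(),
    "T": {"V1", "V3"},
    "V4": {"T"},
    "V5": {"T", "V4"},
    "V6": {"V5"},
}


def descendants(parents, node):
    """All nodes reachable from `node` along directed edges."""
    found, frontier = set(), [node]
    while frontier:
        current = frontier.pop()
        for child, pa in parents.items():
            if current in pa and child not in found:
                found.add(child)
                frontier.append(child)
    return found


def condition_holds(parents, target, manipulated):
    """True if no manipulated spouse of the target (in the manipulated graph)
    is a descendant of the target in the unmanipulated graph."""
    surgered = {n: (set() if n in manipulated else set(pa))
                for n, pa in parents.items()}
    children = {n for n, pa in surgered.items() if target in pa}
    spouses = set()
    for child in children:
        spouses |= surgered[child] - {target}
    offending = spouses & set(manipulated) & descendants(parents, target)
    return not offending


print(condition_holds(PARENTS, "T", {"V1", "V5"}))  # True
print(condition_holds(PARENTS, "T", {"V1", "V4"}))  # False
```

In the second call V4 loses its edge from T but remains a spouse through V5, which is exactly the situation the proposition excludes.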

  15. Unknown Manipulations M • Find the direct causes of T • Fit a model from training data, using only the variables that are direct causes of T • Only the direct causes remain in MBM(T) under any manipulation
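
The last claim can be checked by brute force on a small graph: enumerate every possible manipulation set, compute the manipulated blanket each time, and intersect the results. This is a minimal sketch on a hypothetical edge set (the figure's edges are not recoverable from the transcript), and it assumes T itself is never manipulated.

```python
from itertools import combinations

# Hypothetical example graph; each node maps to its direct causes (parents).
PARENTS = {
    "V1": {"V2"},
    "V2": set(),
    "V3": set(),
    "T": {"V1", "V3"},
    "V4": {"T"},
    "V5": {"T", "V4"},
    "V6": {"V5"},
}


def manipulated_blanket(parents, target, manipulated):
    """Markov blanket of the target after cutting the incoming edges
    of every manipulated node."""
    surgered = {n: (set() if n in manipulated else set(pa))
                for n, pa in parents.items()}
    children = {n for n, pa in surgered.items() if target in pa}
    spouses = set()
    for child in children:
        spouses |= surgered[child]
    return (surgered[target] | children | spouses) - {target}


others = sorted(set(PARENTS) - {"T"})
blankets = [
    manipulated_blanket(PARENTS, "T", set(subset))
    for size in range(len(others) + 1)
    for subset in combinations(others, size)
]
always_present = set.intersection(*blankets)
print(sorted(always_present))  # ['V1', 'V3']
```

Effects and spouses drop out of the blanket for some choice of M, so only the direct causes V1 and V3 survive the intersection.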

  16. Learning Bayesian Networks • Many algorithms exist that can learn the network • Discrete data: MMHC [1] • Mixed: Bach [2] • Find the graph, find the MBM(T), fit a model and you are done • … or are you? 1. I. Tsamardinos, L. E. Brown, and C. F. Aliferis. Machine Learning, 65(1):31, 2006. 2. F. R. Bach and M. I. Jordan. NIPS-02.

  17. Faithfulness and Parity Functions • All BN methods assume Faithfulness • Causes and effects have detectable conditional pairwise associations with T • T = V1 XOR V3 • No pairwise association between T and V1 [Figure: graph over V1, V3, and T]
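
The missing pairwise association can be verified by enumerating the four equally likely input combinations. A minimal sketch, assuming V1 and V3 are independent fair binary variables (an assumption not stated on the slide); `covariance` is a helper written here.

```python
from itertools import product


def covariance(pairs):
    """Population covariance of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in pairs) / n


# Every (V1, V3) combination is equally likely; T is their parity.
rows = [(v1, v3, v1 ^ v3) for v1, v3 in product((0, 1), repeat=2)]

# Marginally, V1 carries no information about T.
cov_marginal = covariance([(v1, t) for v1, _, t in rows])

# Conditioned on V3 = 0, T is a copy of V1.
cov_given_v3 = covariance([(v1, t) for v1, v3, t in rows if v3 == 0])

print(cov_marginal, cov_given_v3)  # 0.0 0.25
```

The dependence only appears once V3 is conditioned on, which is why a pairwise test can fail to flag V1.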

  18. Parity Functions in Feature Space • T = V1 XOR V2 • No pairwise association between T and V1 • Construct new feature V1V2 • Pairwise associations become apparent [Figure: graph over V1, V2, and T beside the graph with the constructed feature V1V2]
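
The effect of the constructed feature can be seen on the same four-row enumeration. A minimal sketch, assuming V1 and V2 are independent fair binary variables coded as 0/1 and that the new feature is their product (both are assumptions made for illustration).

```python
from itertools import product


def covariance(pairs):
    """Population covariance of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in pairs) / n


# Columns: V1, V2, constructed feature V1*V2, target T = V1 XOR V2.
rows = [(v1, v2, v1 * v2, v1 ^ v2) for v1, v2 in product((0, 1), repeat=2)]

cov_v1_t = covariance([(v1, t) for v1, _, _, t in rows])
cov_feature_t = covariance([(f, t) for _, _, f, t in rows])

print(cov_v1_t, cov_feature_t)  # 0.0 -0.125
```

The original variable shows no pairwise covariance with T, while the product feature does, so a pairwise test run in the expanded feature space has something to detect.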

  19. Feature Space Markov Blanket • Map Data to Feature Space • Learn the Markov Blanket in Feature Space

  20. Feature Space Markov Blanket • Map Data to Feature Space: brute force is inefficient; indirectly map to feature space using an SVM; assume that a low SVM weight of a feature implies low association of the feature with T; produce only the top-weighted features (recently developed heuristic method) • Learn the Markov Blanket in Feature Space: run HITON [1] 1. C. F. Aliferis, I. Tsamardinos, and A. Statnikov. AMIA 2003.

  21. Inducing the MB(T) • Run MMMB [1], RFE [2], FSMB [3], and no feature selection • Build predictive models • If there is a large discrepancy in predictive performance, consult FSMB • If there are “parity”-like variables, add the corresponding constructed features to the data before learning the network 1. I. Tsamardinos, C. F. Aliferis, and A. Statnikov. KDD 2003. 2. I. Guyon et al. Machine Learning, 46(1-3):389-422, 2002. 3. Submitted for publication.

  22. Hidden Variables and Confounding • H1, H2: hidden variables • Dashed edges appear in the marginal network • Marginal MB(T) shown in green [Figure: causal graph over V1 to V6, T, H1, and H2]

  23. Hidden Variables and Confounding • H1, H2: hidden variables • Dashed edges appear in the marginal network • Reddish edges are “removed” by manipulations • Manipulations of V5, V3 lead to errors in estimating MBM(T) (bluish nodes) [Figure: causal graph over V1 to V6, T, H1, and H2]

  24. Finding Non-Confounded Edges Proposition: V = O ∪ H, where O are observable and H are not. P(V) is faithful to a Causal Bayesian Network. If • ∀ S ⊆ O, I(V1 ; T | S) • ∀ S ⊆ O, I(V3 ; T | S) • ∀ S ⊆ O, I(V5 ; T | S) • ∃ Z1 ⊆ O, s.t. I(V1 ; V3 | S) • ∃ Z2 ⊆ O, s.t. I(V1 ; V5 | S) • I(V1 ; V3 | Z1 ∪ {T}) • I(V1 ; V5 | Z2 ∪ {T}) Then there is a causal path from T to V5 (edge T → V5 is causal) [Figure: graph over V1, V2, V3, V5, V6, and T]

  25. Finding Non-Confounded Edges Proposition: V = O ∪ H, where O are observable and H are not. P(V) is faithful to a Causal Bayesian Network. If • ∀ S ⊆ O, I(V1 ; T | S) • ∀ S ⊆ O, I(V3 ; T | S) • ∀ S ⊆ O, I(V5 ; T | S) • ∃ Z1 ⊆ O, s.t. I(V1 ; V3 | S) • ∃ Z2 ⊆ O, s.t. I(V1 ; V5 | S) • I(V1 ; V3 | Z1 ∪ {T}) • I(V1 ; V5 | Z2 ∪ {T}) Then there is a causal path from T to V5 (edge T → V5 is causal) [Figure: graph over V1, V2, V3, V5, V6, T, and a hidden variable H]

  26. Finding Non-Confounded Edges • Use the test to: orient some edges; find truly causal (non-confounded) edges • Extension of the basic idea presented in [1] 1. S. Mani, P. Spirtes, and G. F. Cooper. UAI 2006.

  27. Finding the MBM(T) • Edge existence: BN learning algorithm • Edge orientation: learn the network, convert to PDAG, obtain compelled edges; confounding test • Edge confounding: confounding test • Weigh evidence and decide on orientation and absence of confounding

  28. Finding the MBM(T) • Are V7, V3 part of MBM(T)? • Is V4 part of MBM(T)? [Figure: graph over V1 to V7, Vi, and T; legend: non-confounded edges, oriented edges that could be confounded, undirected edges, manipulated nodes]

  29. Results

  30. Limitations • Most time spent on REGED • Conditional independence tests were sometimes inappropriate • New methods not optimized or fully tested • Model averaging should be used • Formal methods for weighing the evidence are needed

  31. Conclusions • General basis of theory and algorithms for predictions under manipulation • New algorithms for addressing lack of faithfulness and hidden confounding variables • The strategy can be implemented using the new and existing algorithms • Many open directions/problems: faithfulness, acyclicity, hidden variables, timed data
