A Short Tutorial on Causal Network Modeling and Discovery

A Short Tutorial on Causal Network Modeling and Discovery. Greg Cooper Department of Biomedical Informatics University of Pittsburgh. Modeling the World’s Systems 5/22/2018. Outline. Brief background on causal network discovery Introduction to a method for causal network discovery

Presentation Transcript


  1. A Short Tutorial on Causal Network Modeling and Discovery Greg Cooper Department of Biomedical Informatics University of Pittsburgh Modeling the World’s Systems 5/22/2018

  2. Outline • Brief background on causal network discovery • Introduction to a method for causal network discovery • Software tools for causal network discovery

  3. Causal modeling and discovery are at the core of much of science, engineering, medicine, business, and other domains

  4. Current Circumstances Favorable for Causal Network Discovery from Data • Abundance of data + Dramatic increases in computing power + Algorithmic advances in causal network discovery

  5. Basic Causal Discovery Workflow [Diagram with the following elements: Prior Knowledge, Experiments, Data, Causal Analysis, Causal Networks, Causal Hypotheses; arrows not recoverable from the transcript]

  6. Types of Data Include … • Experimental data – controlled manipulation of some variables and observation of the others • Observational data – observation only, with no manipulation

  8. Basic Components Needed to Learn Causal Networks from Observational Data • Causal network representation • Causal network search • Causal network evaluation

  9. Causal Bayesian Networks (CBNs) • CBN structure: a directed acyclic graph • Nodes represent variables • Arcs represent direct causation • A variable is modeled as independent of its non-effects, given its causal parents • Example: A → B → C • CBN parameters: the structure implies a factorization of the joint probability distribution • Example: P(A, B, C) = P(A) P(B | A) P(C | B)
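The factorization on this slide can be checked numerically. The following is a minimal sketch, assuming binary variables and made-up conditional probability tables (the numbers are illustrative assumptions, not from the presentation). It builds the joint distribution for the chain A → B → C and confirms that C is independent of its non-effect A once its causal parent B is given.

```python
from itertools import product

# Hypothetical CBN parameters for the chain A -> B -> C (all values are assumptions).
p_a = {0: 0.7, 1: 0.3}                                      # P(A)
p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}    # P(B | A), indexed [a][b]
p_c_given_b = {0: {0: 0.95, 1: 0.05}, 1: {0: 0.4, 1: 0.6}}  # P(C | B), indexed [b][c]


def joint(a, b, c):
    """P(A, B, C) = P(A) P(B | A) P(C | B), as implied by the structure."""
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]


def p_c_given_ab(c, a, b):
    """P(C | A, B), computed from the joint distribution."""
    return joint(a, b, c) / sum(joint(a, b, k) for k in (0, 1))


states = list(product((0, 1), repeat=3))

# The factorized joint is a proper distribution.
total = sum(joint(a, b, c) for a, b, c in states)

# Causal Markov condition for C: P(C | A, B) equals P(C | B) for every state.
markov_holds = all(
    abs(p_c_given_ab(c, a, b) - p_c_given_b[b][c]) < 1e-12 for a, b, c in states
)

print(total, markov_holds)
```

Only the graph fixes which conditional tables are needed; the table entries are the CBN parameters and would normally be estimated from data.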

  10. Methods for Learning CBNs from Observational Data • Constraint-based • Bayesian • Other

  12. The Constraint-Based Method • Determine constraints that hold among the nodes (e.g., independence conditions based on statistical tests) • Use the patterns of constraints to narrow the causal possibilities
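One way to obtain such constraints is a chi-square test of independence on contingency tables. The sketch below is an illustration under assumptions, not the method used in the presentation: the counts are invented, the significance level is fixed at 0.05, and the conditional test simply sums the per-stratum statistics (one degree of freedom per 2×2 stratum). The critical values 3.841 and 5.991 are the standard 0.05 chi-square thresholds for 1 and 2 degrees of freedom.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


CRITICAL_DF1 = 3.841  # chi-square 0.05 critical value, 1 degree of freedom
CRITICAL_DF2 = 5.991  # chi-square 0.05 critical value, 2 degrees of freedom

# Hypothetical counts of (X, Z) within each stratum of Y; rows are X, columns are Z.
xz_given_y0 = [[36, 4], [9, 1]]
xz_given_y1 = [[2, 8], [8, 32]]

# Pooling the strata gives the marginal (X, Z) table.
xz_pooled = [
    [a + b for a, b in zip(row0, row1)]
    for row0, row1 in zip(xz_given_y0, xz_given_y1)
]

marginal_stat = chi_square(xz_pooled)
conditional_stat = chi_square(xz_given_y0) + chi_square(xz_given_y1)

dep_x_z = marginal_stat > CRITICAL_DF1               # reject independence: dep(X, Z)
ind_x_z_given_y = conditional_stat <= CRITICAL_DF2   # fail to reject: ind(X, Z | Y)

print(dep_x_z, ind_x_z_given_y)
```

Failing to reject independence is weaker evidence than rejecting it, so in practice the independence constraints are the more fragile ones.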

  13. A Hypothetical Example of the Constraint-Based Method • Three binary variables X, Y, Z • The following is known: X occurs before Y; X occurs before Z • For instance, X: gene mutation status; Y: gene expression level; Z: disease status • Question: Does Y cause Z?

  14. A Hypothetical Example of the Constraint-Based Method • Suppose statistical testing yields the following constraints: dep(X, Y), dep(Y, Z), dep(X, Z), ind(X, Z | Y) • Consider the consistency of these constraints with respect to the following causal networks: [Figure: candidate causal networks over X, Y, and Z, some including a hidden variable H; edges not recoverable from the transcript] • 94 additional causal networks: none of these satisfy all four tests
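The consistency check can be sketched for a simplified version of this example. The code below assumes there are no hidden variables and that the constraints reflect the graph exactly (faithfulness), so it covers only 12 candidate graphs rather than the larger set with hidden variables H considered on the slides. It enumerates every DAG over X, Y, Z in which X has no parents (X occurs first) and keeps those whose d-separation relations match the four constraints. All function names are illustrative.

```python
from itertools import product


def descendants(edges, node):
    """All nodes reachable from `node` along directed edges."""
    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        for parent, child in edges:
            if parent == current and child not in found:
                found.add(child)
                stack.append(child)
    return found


def simple_paths(edges, src, dst, path=None):
    """Yield every simple path from src to dst, ignoring edge direction."""
    path = path or [src]
    if src == dst:
        yield path
        return
    for a, b in edges:
        for here, there in ((a, b), (b, a)):
            if here == src and there not in path:
                yield from simple_paths(edges, there, dst, path + [there])


def d_separated(edges, a, b, cond):
    """True if every path between a and b is blocked given the set `cond`."""
    for path in simple_paths(edges, a, b):
        blocked = False
        for prev, mid, nxt in zip(path, path[1:], path[2:]):
            collider = (prev, mid) in edges and (nxt, mid) in edges
            if collider:
                # A collider blocks unless it, or one of its descendants, is conditioned on.
                if mid not in cond and not (descendants(edges, mid) & cond):
                    blocked = True
            elif mid in cond:
                # A chain or fork node blocks when it is conditioned on.
                blocked = True
        if not blocked:
            return False
    return True


def candidate_dags():
    """DAGs over X, Y, Z in which X has no parents (no hidden variables)."""
    yz_options = (None, ("Y", "Z"), ("Z", "Y"))
    for xy, xz, yz in product((False, True), (False, True), yz_options):
        edges = set()
        if xy:
            edges.add(("X", "Y"))
        if xz:
            edges.add(("X", "Z"))
        if yz:
            edges.add(yz)
        yield frozenset(edges)


# (a, b, conditioning set, independent?) for dep(X,Y), dep(Y,Z), dep(X,Z), ind(X,Z|Y)
CONSTRAINTS = [
    ("X", "Y", set(), False),
    ("Y", "Z", set(), False),
    ("X", "Z", set(), False),
    ("X", "Z", {"Y"}, True),
]

consistent = [
    g for g in candidate_dags()
    if all(d_separated(g, a, b, c) == ind for a, b, c, ind in CONSTRAINTS)
]

for g in consistent:
    print(sorted(g))
```

Under these simplifying assumptions the only surviving graph is the chain X → Y → Z, which contains the edge Y → Z.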

  15. The Three Accepted Causal Networks Are No Longer Accepted If There Is Hidden Confounding between Y and Z [Figure: causal networks over X, Y, and Z with hidden variables H; edges not recoverable from the transcript]

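  16. The Only Three Networks Consistent with the Four Constraints [Figure: three causal networks over X, Y, and Z, with a hidden variable H appearing in the figure; edges not recoverable from the transcript]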

  18. Summary of the Constraint-Based Causal Discovery Method • Reduces a large number of causal network possibilities to just those networks consistent with the constraints obtained from the data • Looks for causal relationships that are common across those networks (e.g., Y → Z) • Generalizes to many variables (1000s) and to more complex patterns of constraints that support causal relationships

  19. A Real Application of the Simple Constraint-Based Method* • Model: X → Y → Z • 354 rheumatoid arthritis (RA) cases and 337 controls from Sweden • X: SNPs measured with Illumina Human Hap chip • Y: Differentially methylated positions (DMPs) measured with an Illumina HumanMethylation450 array • Z: RA status (yes/no) • Core discovery algorithm: For all combinations of SNPs (X) and DMPs (Y), output Y → Z whenever the 4 statistical tests (above) hold • Results • Found 9 DMPs (Y) that the data support as causally influencing RA (Z) • A validation study using a separate set of 24 total cases found changes in the 9 DMPs that were consistent with the original study * Epigenome-wide association data implicate DNA methylation as an intermediary of genetic risk in rheumatoid arthritis. Liu Y, Aryee MJ, Padyukov L, et al. Nature Biotechnology 31 (2013) 142-147.
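The core discovery loop described on this slide can be sketched as follows. This is only an illustration of the slide's one-sentence description, not the published analysis pipeline: the function name, the `dep` and `ind_given` callbacks, and the toy SNP and DMP labels are all hypothetical, and a real run would back the callbacks with statistical tests on the genotype, methylation, and case/control data.

```python
def discover_causal_dmps(snps, dmps, dep, ind_given, outcome="Z"):
    """Return each DMP y for which some SNP x passes all four tests.

    dep(a, b) should return True when a and b test as dependent;
    ind_given(a, b, c) should return True when a and b test as
    independent given c. Both are supplied by the caller.
    """
    supported = []
    for y in dmps:
        for x in snps:
            if (
                dep(x, y)
                and dep(y, outcome)
                and dep(x, outcome)
                and ind_given(x, outcome, y)
            ):
                supported.append(y)  # the data support y -> outcome
                break                # one qualifying SNP is enough for this DMP
    return supported


# Toy stand-in for test results (hypothetical labels, not real data).
dependent_pairs = {
    frozenset(pair)
    for pair in [
        ("snp1", "dmp1"), ("dmp1", "Z"), ("snp1", "Z"),
        ("snp2", "Z"), ("dmp2", "Z"),
    ]
}
screened_off = {("snp1", "Z", "dmp1")}


def toy_dep(a, b):
    return frozenset((a, b)) in dependent_pairs


def toy_ind_given(a, b, c):
    return (a, b, c) in screened_off


result = discover_causal_dmps(["snp1", "snp2"], ["dmp1", "dmp2"], toy_dep, toy_ind_given)
print(result)
```

Passing the tests in as callbacks keeps the search loop separate from the choice of statistical test and of multiple-testing correction, both of which matter a great deal at genome scale.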

  20. Center for Causal Discovery www.ccd.pitt.edu

  21. Suggested Reading Lagani V, Triantafillou S, Ball G, Tegner J, Tsamardinos I. Probabilistic computational causal discovery for systems biology. In: Uncertainty in Biology: A Computational Modeling Approach. Editors: Geris L, Gomez-Cabrero D (2016, Springer) Available at: mensxmachina.org under Publications for 2016

  22. Acknowledgements The Center for Causal Discovery is supported by grant U54HG008540 awarded by the National Human Genome Research Institute through funds provided by the trans-NIH Big Data to Knowledge (BD2K) initiative (www.bd2k.nih.gov). The content of this presentation is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. 

  23. Thank you gfc@pitt.edu
