
Objective Bayesian Nets for Integrating Cancer Knowledge

Objective Bayesian Nets for Integrating Cancer Knowledge. Sylvia Nagl PhD Cancer Systems Biology & Biomedical Informatics UCL London. caOBNET: Overview. Knowledge integration by objective Bayesian networks (obNETS) Maximum entropy method An integrated clinico-genomic obNET for breast cancer


Presentation Transcript


  1. Objective Bayesian Nets for Integrating Cancer Knowledge Sylvia Nagl PhD Cancer Systems Biology & Biomedical Informatics UCL London

  2. caOBNET: Overview • Knowledge integration by objective Bayesian networks (obNETS) • Maximum entropy method • An integrated clinico-genomic obNET for breast cancer • Conclusions

  3. Bayesian networks • Graphical models • directed acyclic graph (DAG) • Joint multivariate probability distribution • with conditional independencies between variables • Given the data, optimal network topology can be estimated • heuristic search algorithms and scoring criteria • Statistical significance of edge strengths • Bayesian methods • bootstrapping • Figure: Apolipoprotein E gene SNPs and plasma apoE level (Rodin & Boerwinkle 2005)
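
The link between the DAG and the joint distribution on slide 3 can be written out as the standard Bayesian network factorisation. The symbols below are notation chosen here, not taken from the slides: x_1, ..., x_n are the variables and pa(x_i) denotes the parents of x_i in the graph.

```latex
P(x_1, \dots, x_n) \;=\; \prod_{i=1}^{n} P\bigl(x_i \mid \mathrm{pa}(x_i)\bigr)
```

Each factor corresponds to one conditional probability table (CPT), and the conditional independencies mentioned on the slide are what allow the joint distribution to be stored in this compact form.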

  4. Knowledge integration • Cancer treatment decisions should be based on all available knowledge • Knowledge is complex and varied: Patient's symptoms, expert knowledge, clinical databases relating to past patients, molecular databases, scientific papers, medical informatics systems • Generated by independent studies with diverse protocols

  5. Knowledge integration • Diverse data types Genomic, transcriptomic, proteomic, SNPs, tissue microarray, histopathology, clinical etc. • New data types, e.g., epigenetic data • All data types capture different characteristics of a dynamic complex system • At different spatial and temporal scales • Cell, tumour, patient, and therapeutic system of patient-therapy interactions • How can this disparate data be used for an integrated understanding on which to base our actions?

  6. Objective Bayesianism • Data and knowledge impinge on belief – we try to find a coherent set of beliefs with best fit • Beliefs based on undefeated items of knowledge • In case of conflict, try to find compromise beliefs • Objective Bayesianism offers a formalism for determining the beliefs that best fit background knowledge • Applying Bayesian theory, an agent’s degree of belief should be representable by a probability function p • Empirical knowledge imposes quantitative constraints on p • Represented in an obNET (learnt from database)

  7. obNETS for prediction • Standard algorithms can be used to calculate the probability of a specific outcome • A direct link between variables may suggest a causal connection

  8. Bayesian networks • Can BNs be integrated? Spanning genetic/molecular and clinical levels • obNETS offer a principled path to knowledge integration

  9. Maximum entropy principle • From all the probability functions p that satisfy the constraints, adopt the one that is maximally equivocal • Williamson, J. (2002): Maximising Entropy Efficiently. • Williamson, J. (2005a): Bayesian Nets and Causality. • Williamson, J. (2005b): Objective Bayesian nets. www.kent.ac.uk/secl/philosophy/jw/

  10. Example • Two items of empirical knowledge may conflict: • Study 1: Cancer will recur in 50% of patients with a given set of characteristics, so the degree of belief in recurrence in an individual patient = 0.5 • Study 2: Frequency of recurrence is 30% • Degree of belief will be constrained to the closed interval [0.3, 0.5] • In general: • Belief function will lie within a closed set of probability functions • There will be a unique function that maximises entropy
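
For the two-study example on slide 10, the entropy maximiser can be worked out by hand. This is a sketch under the assumption that recurrence is treated as a single binary variable with p = P(recurrence); the entropy formula is the standard Shannon entropy (natural logarithm), not something stated on the slide.

```latex
H(p) = -p \log p - (1 - p) \log (1 - p), \qquad p \in [0.3,\, 0.5]

H(0.3) \approx 0.611, \qquad H(0.5) = \log 2 \approx 0.693

\arg\max_{p \in [0.3,\, 0.5]} H(p) = 0.5
```

Because H is concave with its global maximum at p = 0.5, the maximally equivocal belief inside any closed interval is the point of that interval closest to 0.5. Here that is 0.5 itself.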

  11. obNet integration

  12. obNet integration Original obNETs provide probability distributions

  13. obNET integration

  14. obNET integration

  15. obNET integration n number of nets

  16. obNET integration Maximum entropy principle If CPTs for merged nodes disagree on probabilities, assign closed interval and take least committal value in that range
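
The merging rule on slide 16 can be illustrated for the simplest case. This is a minimal sketch, assuming binary nodes only and assuming that "least committal" means the value in the interval closest to 0.5; the function name `merge_binary_cpts` and the example numbers are hypothetical, and this is not the procedure actually implemented in caOBNET.

```python
def merge_binary_cpts(cpt_a, cpt_b):
    """Merge two CPTs for the same binary node.

    Each CPT maps a parent configuration to P(node = 1 | parents).
    Where the two sources disagree, the belief is constrained to the
    closed interval between them, and the value closest to 0.5
    (the entropy maximiser for a binary variable) is chosen.
    """
    merged = {}
    # Only parent configurations present in both CPTs are merged here.
    for config in cpt_a.keys() & cpt_b.keys():
        lo, hi = sorted((cpt_a[config], cpt_b[config]))
        merged[config] = min(max(0.5, lo), hi)  # clamp 0.5 into [lo, hi]
    return merged


# Hypothetical entries: two nets disagree on every parent configuration.
net_1 = {"parent=0": 0.5, "parent=1": 0.7, "parent=2": 0.2}
net_2 = {"parent=0": 0.3, "parent=1": 0.9, "parent=2": 0.8}
merged = merge_binary_cpts(net_1, net_2)
```

With these inputs, the interval [0.3, 0.5] yields 0.5, [0.7, 0.9] yields 0.7, and [0.2, 0.8] yields 0.5. For nodes with more than two states the constraint set is a region of the probability simplex rather than an interval, and a numerical optimiser would be needed.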

  17. obNET integration: Proof of principle Two obNETs from breast cancer knowledge domain • Genomic: Comparative genome hybridisation (CGH) data - progenetix database • Subset of bands with 3 or more genes implicated in tumour progression and response to cytotoxic therapies (28 bands) • Clinical: American Surveillance, Epidemiology and End Results (SEER) database

  18. Clinical and genomic nets (Hugin 6.6) SEER database 4731 cases progenetix database 28 bands/502 cases ?

  19. obNet integration obNet learnt from 2nd progenetix dataset - 119 cases with clinical annotation (lymph node status, tumour size, grade)
      CPT
      22q12:    -1       0       1
      LN = 0    0.148    0.5     0.148
      LN = 1    0.852    0.5     0.852
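
The CPT on slide 19 can be used for a small prediction by hand. The sketch below reads each 22q12 column as a conditional distribution over lymph node status (LN), which is an interpretation based on each column summing to 1. The states -1, 0, 1 are kept as given (presumably loss, no change, gain; the slide does not say). The prior over 22q12 is a made-up assumption for illustration and does not come from the progenetix data.

```python
# P(LN | 22q12), values taken from the slide 19 CPT.
cpt_ln_given_band = {
    -1: {0: 0.148, 1: 0.852},
    0: {0: 0.5, 1: 0.5},
    1: {0: 0.148, 1: 0.852},
}

# ASSUMPTION: hypothetical prior over the 22q12 states, for illustration only.
band_prior = {-1: 0.2, 0: 0.6, 1: 0.2}


def marginal_ln(prior, cpt):
    """P(LN) obtained by summing out the 22q12 state."""
    out = {0: 0.0, 1: 0.0}
    for band, p_band in prior.items():
        for ln, p_ln in cpt[band].items():
            out[ln] += p_band * p_ln
    return out


def posterior_band(ln, prior, cpt):
    """P(22q12 | LN = ln) by Bayes' rule."""
    joint = {band: prior[band] * cpt[band][ln] for band in prior}
    total = sum(joint.values())
    return {band: p / total for band, p in joint.items()}


p_ln = marginal_ln(band_prior, cpt_ln_given_band)
p_band_given_ln1 = posterior_band(1, band_prior, cpt_ln_given_band)
```

Under the assumed prior, P(LN = 1) comes out at about 0.641, and observing LN = 1 shifts belief away from the unchanged state 0 towards the -1 and 1 states. In a full obNET the same calculation is carried out by a standard propagation algorithm over all nodes rather than by enumeration.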

  20. Additional empirical knowledge chr. 22 Fridlyand et al. 2006

  21. obNet integration chr. 22 Fridlyand et al. 2006 CPT

  22. obNet integration chr. 22 Fridlyand et al. 2006 CPT

  23. Metastasis-associated genes KREMEN1 MYH9 cadherin11 CD97 BMP7, ELMO2, BCAS1, BCAS4, ZNF217

  24. KREMEN1 Howard et al., 2003 Biological knowledge suggests possible causal link (in context of whole obNET – HR status!)

  25. Knowledge integration Multi-scale obNETs Cancer clinical data & epidemiology Translation of clinical data to genomics research Predictive markers Molecular profiling of tumours

  26. Acknowledgements • Jon Williamson (Philosophy, University of Kent) www.kent.ac.uk/secl/philosophy/jw/ • Matt Williams (Cancer Research UK) • Nadjet El-Mehidi (Cancer Systems Biology, UCL) • Vivek Patkar (Cancer Research UK) • Contact: s.nagl@ucl.ac.uk

  27. obNET integration: Proof of principle • Two obNETs • Non-independent rearrangements at chromosomal locations in breast cancer from comparative genome hybridisation (CGH) data - progenetix database • Subset of bands with 3 or more genes implicated in tumour progression and response to cytotoxic therapies (28 bands) • Probabilistic dependencies between clinical parameters from the American Surveillance, Epidemiology and End Results (SEER) database

  28. HR status link

  29. Genomic systems • Genomes are dynamic molecular systems • Selection acts on unstable cancer genomes as integrated wholes, not just on individual oncogenes or tumour suppressors. • A multitude of ways to ‘solve the problems’ of achieving a survival advantage in cancer cells: • Irreversible evolutionary processes • Randomness of mutation • Modularity and redundancy of complex systems

  30. Genome-wide rearrangements • Can we identify probabilistic dependency networks in large sample sets of genomic data from individual tumours? • If so, under which conditions may these be interpreted as causal networks? • Can we identify probabilistic dependency networks involving molecular and clinical levels?

  31. Systems Biology and Causation • Profound conceptual challenge regarding physical causation in complex biological systems • Mutual dependence of physical causes • The biological relevance of any factor, and therefore “the information” it conveys, is jointly determined, frequently in a statistically interactive fashion, by that factor and the system state (Susan Oyama, The Ontogeny of Information, 2000) • The influence of a gene, or a genetic mutation, depends on the context, such as availability of other molecular agents and the state of the biological system, including the rest of the genome

  32. System state agents Cell networks are dynamically instantiated – genes for components are switched on or off in response to signals and cell state

  33. System state Cell networks are reconfigured in response to changes in environment or cell’s internal state

  34. System state Cell computation networks are reconfigured in response to changes in environment or cell’s internal state

  35. Cancer: Genome instability re-programs cell networks Selection for increased proliferation, resistance, invasiveness etc. Driven by tumour cell – tissue interactions

  36. Genome-wide rearrangements • Can we identify probabilistic dependency networks in large sample sets of genomic data from individual tumours? • Can we identify probabilistic dependency networks involving molecular and clinical levels?

  37. Proof of principle • Screen the whole genome for chromosomal abnormalities in one experiment Cytogenetics • Comparative genomic hybridization (CGH) • Fluorescence in situ hybridization (FISH) and multicolour fluorescence in situ hybridization (MFISH) • Detection of allelic instabilities, loss of heterozygosity (LOH)
