Learning Bayesian networks from postgenomic data with an improved structure MCMC sampling scheme

Presentation Transcript

Learning Bayesian networks from postgenomic data with an improved structure MCMC sampling scheme

Dirk Husmeier

Marco Grzegorczyk

1) Biomathematics & Statistics Scotland

2) Centre for Systems Biology at Edinburgh


Systems Biology


Protein activation cascade

[Diagram: activation cascade from the cell membrane via phosphorylation to transcription factors (TF) in the nucleus → cell response]


Raf signalling network

From Sachs et al., Science, 2005


[Diagram: unknown regulatory network → high-throughput experiments → postgenomic data → machine learning / statistical methods]


Differential equation models

  • Multiple parameter sets can offer equally plausible solutions.

  • Multimodality in parameters space: point estimates become meaningless.

  • Overfitting problem → not suitable for model selection.

  • Bayesian approach: computing the marginal likelihood is computationally challenging.


Bayesian networks

  • Marriage between graph theory and probability theory.

  • Directed acyclic graph (DAG) representing conditional independence relations.

  • It is possible to score a network in light of the data: P(D|M), D: data, M: network structure.

  • We can infer how well a particular network explains the observed data.

[Diagram: example DAG with nodes A, B, C, D, E, F and directed edges]
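As a toy illustration of the conditional-independence idea (not from the slides; the network and numbers below are invented), a three-node DAG A → C ← B factorises the joint distribution into local conditional probabilities:

```python
# Toy DAG A -> C <- B: the joint factorises as
# P(A, B, C) = P(A) * P(B) * P(C | A, B).
p_a = {0: 0.4, 1: 0.6}                      # P(A)
p_b = {0: 0.7, 1: 0.3}                      # P(B)
p_c = {(a, b): {0: 0.9, 1: 0.1} if a == b else {0: 0.2, 1: 0.8}
       for a in (0, 1) for b in (0, 1)}     # P(C | A, B)

def joint(a, b, c):
    """Joint probability from the DAG factorisation."""
    return p_a[a] * p_b[b] * p_c[(a, b)][c]

# The factorised joint is a proper distribution: it sums to 1.
total = sum(joint(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1))
print(total)  # 1.0 (up to floating point)
```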


Learning Bayesian networks

P(M|D) = P(D|M) P(M) / Z

M: network structure, D: data, Z: normalisation constant
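Sampling from this posterior over structures can be sketched with Metropolis-Hastings on DAGs: propose a single-edge toggle (a symmetric proposal), reject proposals that create a cycle, and accept with probability min(1, posterior ratio). The "true" graph and the scoring function below are invented for illustration; they stand in for the real marginal likelihood P(D|M):

```python
import math, random

random.seed(0)
N = 3  # tiny network so the sketch stays readable

def is_dag(adj):
    """Acyclicity check via Kahn's algorithm (topological sort)."""
    indeg = [sum(adj[i][j] for i in range(N)) for j in range(N)]
    queue = [j for j in range(N) if indeg[j] == 0]
    seen = 0
    while queue:
        i = queue.pop()
        seen += 1
        for j in range(N):
            if adj[i][j]:
                indeg[j] -= 1
                if indeg[j] == 0:
                    queue.append(j)
    return seen == N

# Hypothetical unnormalised log posterior log P(D|M) + log P(M):
# it peaks at an invented "true" structure A -> B -> C.
TRUE = ((0, 1, 0), (0, 0, 1), (0, 0, 0))
def log_score(adj):
    diff = sum(adj[i][j] != TRUE[i][j] for i in range(N) for j in range(N))
    return -2.0 * diff

def mcmc_step(adj):
    """One structure-MCMC step: toggle one edge, reject cyclic graphs,
    then Metropolis-Hastings accept/reject."""
    i, j = random.sample(range(N), 2)
    prop = [list(row) for row in adj]
    prop[i][j] = 1 - prop[i][j]
    prop = tuple(tuple(row) for row in prop)
    if not is_dag(prop):
        return adj  # cyclic proposal: stay where we are
    delta = log_score(prop) - log_score(adj)
    if delta >= 0 or random.random() < math.exp(delta):
        return prop
    return adj

adj = ((0, 0, 0),) * 3  # start from the empty graph
counts = {}
for _ in range(20000):
    adj = mcmc_step(adj)
    counts[adj] = counts.get(adj, 0) + 1

most_visited = max(counts, key=counts.get)
print(most_visited)  # should match TRUE, the mode of the toy posterior
```

The single-edge toggle is the classic structure-MCMC move; the slides' point is that this chain can mix slowly, which motivates the new proposal moves discussed later.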


MCMC in structure space

Madigan & York (1995), Giudici & Castelo (2003)


Alternative paradigm: order MCMC


Instead of MCMC in structure space: MCMC in order space



[Diagram sequence: two-node example over A and B, building up a distribution with weights 0.5 and 0.25 over the possible structures and node orders; it illustrates how order MCMC weights a structure by the number of node orders it is compatible with]

Current work with Marco Grzegorczyk

Proposed new paradigm

  • MCMC in structure space rather than order space.

  • Design new proposal moves that achieve faster mixing and convergence.


First idea

Propose new parents from the distribution:

  • Identify those new parents that are involved in the formation of directed cycles.

  • Orphan them, and sample new parents for them subject to the acyclicity constraint.


1) Select a node

2) Sample new parents

3) Find directed cycles

4) Orphan “loopy” parents

5) Sample new parents for these parents
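Step 3 needs a test for whether a freshly sampled parent closes a directed cycle. A minimal sketch (graph and helper name invented): a candidate parent p of child c is "loopy" exactly when c already reaches p along existing edges:

```python
def creates_cycle(adj, child, parent, n):
    """Would adding the edge parent -> child close a directed cycle?
    True iff child already reaches parent via existing edges (DFS)."""
    stack, seen = [child], set()
    while stack:
        v = stack.pop()
        if v == parent:
            return True
        if v in seen:
            continue
        seen.add(v)
        stack.extend(w for w in range(n) if adj[v][w])
    return False

# Toy graph 0 -> 1 -> 2: proposing 2 as a new parent of 0 is "loopy".
n = 3
adj = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
print(creates_cycle(adj, 0, 2, n))  # True: 0 reaches 2, so 2 -> 0 closes a cycle
print(creates_cycle(adj, 2, 0, n))  # False: the edge 0 -> 2 would be safe
```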


Path via illegal structure

Problem: This move is not reversible


Devise a simpler move that is reversible

  • Identify a pair of nodes connected by an edge X → Y.

  • Orphan both nodes.

  • Sample new parents from the “Boltzmann distribution”, subject to the acyclicity constraint, such that the inverse edge Y → X is included.

[Diagram: partition functions C1, C2 and C1,2 of the constrained parent-set distributions]
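The constrained resampling step can be sketched as follows (toy local score; in the real method the weights come from the marginal likelihood, and the normalising sum is one of the partition functions that enter the Hastings factor): enumerate the acyclic parent sets of the orphaned node that contain the required inverse edge, and draw one with Boltzmann weights:

```python
import itertools, math, random

random.seed(1)

def is_dag(adj, n):
    """Acyclicity check via Kahn's algorithm."""
    indeg = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    queue = [j for j in range(n) if indeg[j] == 0]
    seen = 0
    while queue:
        i = queue.pop()
        seen += 1
        for j in range(n):
            if adj[i][j]:
                indeg[j] -= 1
                if indeg[j] == 0:
                    queue.append(j)
    return seen == n

def sample_parents(adj, node, must_include, local_score, n):
    """Draw a parent set for `node` from a Boltzmann distribution over
    all acyclic parent sets that contain `must_include`."""
    others = [v for v in range(n) if v not in (node, must_include)]
    candidates, weights = [], []
    for k in range(len(others) + 1):
        for extra in itertools.combinations(others, k):
            parents = set(extra) | {must_include}
            trial = [row[:] for row in adj]
            for j in range(n):                  # replace node's parent set
                trial[j][node] = 1 if j in parents else 0
            if is_dag(trial, n):
                candidates.append(parents)
                weights.append(math.exp(local_score(node, parents)))
    # sum(weights) is the partition function of this constrained distribution
    return random.choices(candidates, weights=weights)[0]

# Toy run: the edge 0 -> 1 was selected and node 0 orphaned; resample the
# parents of node 0 so that the reversed edge 1 -> 0 is present.
toy_score = lambda node, parents: -0.5 * len(parents)  # invented local score
adj = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]                # remaining edge 0 -> 2
new_parents = sample_parents(adj, node=0, must_include=1,
                             local_score=toy_score, n=3)
print(new_parents)  # the constraint forces the reversed edge 1 -> 0
```

In this toy graph the set {1, 2} is excluded because 2 → 0 together with the existing 0 → 2 would close a cycle, so only {1} survives the acyclicity constraint.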


1) Select an edge

2) Orphan the nodes involved

3) Constrained resampling of the parents





Simple idea, mathematical challenge:

  • Show that condition of detailed balance is satisfied.

  • Derive the Hastings factor …

  • … which is a function of various partition functions



Ergodicity

  • The new move is reversible but …

  • … not irreducible

[Diagram: two-node structures over A and B showing that the new move alone cannot reach every structure]

  • Theorem: A mixture with an ergodic transition kernel gives an ergodic Markov chain.

  • REV-MCMC: at each step randomly switch between a conventional structure MCMC step and the proposed new move.


Evaluation

  • Does the new method avoid the bias intrinsic to order MCMC?

  • How do convergence and mixing compare to structure and order MCMC?

  • What is the effect on the network reconstruction accuracy?


Results

  • Analytical comparison of the convergence properties

  • Empirical comparison of the convergence properties

  • Evaluation of the systematic bias

  • Molecular regulatory network reconstruction with prior knowledge




Analytical comparison of the convergence properties

  • Generate data from a noisy XOR

  • Enumerate all 3-node networks

  • Compute the posterior distribution p°

  • Compute the Markov transition matrix A for the different MCMC methods

  • Compute the Markov chain p(t+1) = A p(t)

  • Compute the (symmetrized) KL divergence KL(t) = KL(p(t), p°)

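The recursion p(t+1) = A p(t) and the symmetrised KL divergence can be computed directly. A minimal sketch with an invented 3-state transition matrix (the real A runs over all network structures):

```python
import math

# Toy 3-state transition matrix A (columns sum to 1, so p(t+1) = A p(t)).
A = [[0.8, 0.1, 0.1],
     [0.1, 0.8, 0.1],
     [0.1, 0.1, 0.8]]
p = [1.0, 0.0, 0.0]            # start with all mass on one structure
p_star = [1/3, 1/3, 1/3]       # stationary distribution of this symmetric A

def step(A, p):
    """One application of the transition matrix: p(t+1) = A p(t)."""
    return [sum(A[i][j] * p[j] for j in range(len(p))) for i in range(len(p))]

def sym_kl(p, q, eps=1e-12):
    """Symmetrised KL divergence KL(p||q) + KL(q||p)."""
    kl = lambda a, b: sum(x * math.log((x + eps) / (y + eps))
                          for x, y in zip(a, b))
    return kl(p, q) + kl(q, p)

divs = []
for t in range(30):
    p = step(A, p)
    divs.append(sym_kl(p, p_star))
print(divs[0], divs[-1])  # the divergence decays towards 0 as the chain converges
```

Plotting KL(t) against t for each sampler yields exactly the convergence curves compared on the next slide.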


Solid line: REV-MCMC. Other lines: structure MCMC and different versions of inclusion-driven MCMC.


Results


Empirical comparison of the convergence and mixing properties

  • Standard benchmark data:

    Alarm network (Beinlich et al. 1989) for monitoring patients in intensive care

  • 37 nodes, 46 directed edges

  • Generate data sets of different size

  • Compare the three MCMC algorithms under the same computational costs:

    structure MCMC (1.0E6 steps)

    order MCMC (1.0E5 steps)

    REV-MCMC (1.0E5 steps)


What are the implications for network reconstruction?

ROC curves and the area under the ROC curve (AUROC)

[Figure: example ROC curves with AUC = 1, AUC = 0.75 and AUC = 0.5]
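The AUROC can be computed from the edge posterior probabilities ranked against the gold-standard edges. A minimal sketch using the rank-sum (Mann-Whitney) identity; the numbers below are invented:

```python
def auroc(scores, labels):
    """AUROC via the rank-sum identity: the probability that a randomly
    chosen true edge scores higher than a randomly chosen non-edge,
    counting ties as 1/2."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy edge posteriors against a gold-standard edge list (1 = true edge).
posteriors = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
truth      = [1,   1,   0,   1,   0,   0]
print(auroc(posteriors, truth))   # 0.8888...: one non-edge outranks a true edge
print(auroc(truth, truth))        # 1.0: a perfect ranking
print(auroc([0.5] * 6, truth))    # 0.5: uninformative scores, i.e. random
```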


Conclusion

  • Structure MCMC has convergence and mixing difficulties.

  • Order MCMC and REV-MCMC show a similar (and much better) performance.


  • How about the bias?


Results


Evaluation of the systematic bias using standard benchmark data

  • Standard machine learning benchmark data: FLARE and VOTE

  • Restriction to 5 nodes → complete enumeration possible (~ 1.0E4 structures)

  • The true posterior probabilities of edge features can be computed

  • Compute the difference between the true scores and those obtained with MCMC




Results


Raf regulatory network

From Sachs et al., Science, 2005


Raf signalling pathway

  • Cellular signalling network of 11 phosphorylated proteins and phospholipids in human immune system cells

  • Deregulation → carcinogenesis

  • Extensively studied in the literature → gold-standard network


Data and prior knowledge


Flow cytometry data

  • Intracellular multicolour flow cytometry experiments: concentrations of 11 proteins

  • 5400 cells have been measured under 9 different cellular conditions (cues)

  • Downsampling to 10 & 100 instances (5 separate subsets): indicative of microarray experiments




Biological prior knowledge matrix

Biological prior knowledge is encoded in a matrix B (for “belief”): entry B_ij indicates some knowledge about the relationship between genes i and j.

Define the energy of a graph G


Energy of a network

Prior distribution over networks
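The slides do not reproduce the formulas, but a common choice in this line of work is the energy E(G) = Σ_ij |B_ij − G_ij| with a Gibbs prior P(G|β) ∝ exp(−βE(G)); treat the exact form here as an assumption. A minimal sketch:

```python
import math

def energy(G, B):
    """Energy of graph G relative to the belief matrix B, assumed here to be
    E(G) = sum_ij |B_ij - G_ij|: low energy = good agreement with the prior."""
    n = len(G)
    return sum(abs(B[i][j] - G[i][j]) for i in range(n) for j in range(n))

def log_prior(G, B, beta):
    """Unnormalised log prior: log P(G | beta) = -beta * E(G) + const."""
    return -beta * energy(G, B)

# Toy 2-node example: prior belief 0.9 in the edge 0 -> 1.
B = [[0.0, 0.9], [0.0, 0.0]]
G_with    = [[0, 1], [0, 0]]
G_without = [[0, 0], [0, 0]]
print(energy(G_with, B), energy(G_without, B))  # ~0.1 vs 0.9
# The graph matching the prior belief gets the higher prior probability:
print(log_prior(G_with, B, beta=1.0) > log_prior(G_without, B, beta=1.0))  # True
```

The hyperparameter β controls how strongly the prior knowledge is weighted against the data.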


Prior knowledge (Sachs et al.)

Prior belief values (each edge/non-edge pair sums to 1):

Edge: 0.9, 0.6, 0.55
Non-edge: 0.1, 0.4, 0.45


AUROC scores


Conclusion

  • True prior knowledge that is strong → no significant difference

  • True prior knowledge that is weak → order MCMC leads to a slight yet significant deterioration (significant at the p = 0.01 level in a paired t-test).


Prior knowledge from KEGG


Flow cytometry data and KEGG


Conclusions

  • The new method avoids the bias intrinsic to order MCMC.

  • Its convergence and mixing are similar to order MCMC; both methods outperform structure MCMC.

  • We can get an improvement over order MCMC when using explicit prior knowledge.


Thank you!

Any questions?

