Problem - PowerPoint PPT Presentation


PowerPoint Slideshow about 'Problem' - josie


Presentation Transcript
problem
Problem
  • Limited number of experimental replications.
  • Postgenomic data intrinsically noisy.
  • Poor network reconstruction.
problem1
Problem
  • Limited number of experimental replications.
  • Postgenomic data intrinsically noisy.
  • Can we improve the network reconstruction by systematically integrating different sources of biological prior knowledge?
slide9
Which sources of prior knowledge are reliable?
  • How do we trade off the different sources of prior knowledge against each other and against the data?
overview of the talk
Overview of the talk
  • Revision: Bayesian networks
  • Integration of prior knowledge
  • Empirical evaluation
slide12

Bayesian networks

  • Marriage between graph theory and probability theory.
  • Directed acyclic graph (DAG) representing conditional independence relations.
  • It is possible to score a network in light of the data: P(D|M), where D is the data and M the network structure.
  • We can infer how well a particular network explains the observed data.
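As a small illustration of the "directed acyclic graph" requirement above, the following sketch checks whether a set of directed edges is free of cycles. The function name `is_dag` and the example edges are illustrative assumptions, not taken from the talk.

```python
from collections import deque

def is_dag(nodes, edges):
    """Return True if the directed graph has no cycle (Kahn's algorithm)."""
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
    # Start from nodes without parents and peel the graph layer by layer.
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    # Nodes on a cycle never reach indegree 0, so they are never visited.
    return visited == len(nodes)

nodes = ["A", "B", "C"]
print(is_dag(nodes, [("A", "B"), ("A", "C")]))               # hypothetical valid structure
print(is_dag(nodes, [("A", "B"), ("B", "C"), ("C", "A")]))   # hypothetical cyclic structure
```

Only structures that pass such a check are valid Bayesian network candidates.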

[Figure: example network with nodes A to F; labels mark the nodes and the edges.]

slide14

Bayesian networks versus causal networks

Bayesian networks represent conditional (in)dependence relations - not necessarily causal interactions.

slide15

Bayesian networks versus causal networks

[Figure: two graphs over nodes A, B and C, labelled "True causal graph" and "Node A unknown".]

slide16

Bayesian networks versus causal networks

[Figure: four graphs over nodes A, B and C.]

  • Equivalence classes: networks with the same scores: P(D|M).
  • Equivalent networks cannot be distinguished in light of the data.
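The idea of equivalent networks can be checked numerically on the smallest case, two binary variables. The sketch below compares the maximised log-likelihood of A → B with that of B → A on made-up counts. It uses the maximised likelihood as a stand-in for the score; the marginal likelihood P(D|M) is also equal for the two structures only under a suitable parameter prior, which is an assumption beyond what the slides state.

```python
import math
from collections import Counter

def max_loglik_parent_child(samples):
    """Maximised log-likelihood of (parent, child) pairs under parent -> child.

    The model factorises as P(parent) * P(child | parent), with both
    distributions set to their empirical frequencies.
    """
    n = len(samples)
    joint = Counter(samples)
    parent_counts = Counter(p for p, _ in samples)
    loglik = 0.0
    for (p, c), n_pc in joint.items():
        loglik += n_pc * math.log(parent_counts[p] / n)      # P(parent)
        loglik += n_pc * math.log(n_pc / parent_counts[p])   # P(child | parent)
    return loglik

# Made-up observations of (A, B); not data from the talk.
data = [(0, 0)] * 30 + [(0, 1)] * 10 + [(1, 0)] * 5 + [(1, 1)] * 25

score_a_to_b = max_loglik_parent_child(data)
score_b_to_a = max_loglik_parent_child([(b, a) for a, b in data])
print(score_a_to_b, score_b_to_a)
```

Both orientations reach the same value, so the data alone cannot tell them apart.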

slide17

Symmetry breaking

[Figure: four graphs over nodes A, B and C; one is labelled "Prior knowledge".]

P(M|D) = P(D|M) P(M) / Z

D: data. M: network structure

slide19

P(M)

Prior knowledge:

B is a transcription factor with binding sites in the upstream regions of A and C

slide21

Learning Bayesian networks

P(M|D) = P(D|M) P(M) / Z

M: Network structure. D: Data
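A minimal sketch of this formula for a handful of candidate structures, where Z is simply the sum of P(D|M) P(M) over the candidates. The log-likelihoods and prior values are invented for illustration; the first two structures are given equal likelihood so that only the prior separates them.

```python
import math

def posterior(log_likelihoods, priors):
    """Normalise P(D|M) P(M) over a finite set of candidate structures."""
    log_joint = {m: log_likelihoods[m] + math.log(priors[m]) for m in priors}
    # Log-sum-exp keeps the normalisation stable for very negative values.
    peak = max(log_joint.values())
    log_z = peak + math.log(sum(math.exp(v - peak) for v in log_joint.values()))
    return {m: math.exp(v - log_z) for m, v in log_joint.items()}

# Hypothetical numbers, not results from the talk.
log_lik = {"A->B->C": -120.0, "A<-B->C": -120.0, "A->B<-C": -123.0}
prior = {"A->B->C": 0.2, "A<-B->C": 0.6, "A->B<-C": 0.2}

post = posterior(log_lik, prior)
for model, prob in post.items():
    print(model, round(prob, 3))
```

With equal likelihoods, the posterior ratio of the first two structures equals their prior ratio, which is how prior knowledge breaks the symmetry.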

overview of the talk2
Overview of the talk
  • Revision: Bayesian networks
  • Integration of prior knowledge
  • Empirical evaluation
slide28

Biological prior knowledge matrix

Indicates some knowledge about the relationship between genes i and j

Biological Prior Knowledge

Define the energy of a Graph G
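The formula itself is not in the transcript. A minimal sketch, assuming the energy is the total absolute difference between the prior knowledge matrix (entries between 0 and 1, here called `belief`) and the adjacency matrix of the graph. This definition, the function name and the numbers are assumptions for illustration.

```python
def energy(belief, graph):
    """Assumed energy: sum over i != j of |belief[i][j] - graph[i][j]|."""
    n = len(belief)
    return sum(
        abs(belief[i][j] - graph[i][j])
        for i in range(n)
        for j in range(n)
        if i != j
    )

# Hypothetical prior knowledge: edge 0 -> 1 is believed, edge 0 -> 2 is doubted,
# and 0.5 marks pairs about which nothing is known.
belief = [
    [0.0, 0.9, 0.1],
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
]
agrees = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]     # contains only the believed edge
disagrees = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]  # contains only the doubted edge

print(energy(belief, agrees), energy(belief, disagrees))
```

Under this assumed definition, a graph that agrees with the prior knowledge has lower energy than one that contradicts it.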

notation
Notation
  • Prior knowledge matrix:

P → B (for “belief”)

  • Network structure:

G (for “graph”) or M (for “model”)

  • P: Probabilities
slide30

Energy of a network

Prior distribution over networks
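The equations on this slide are missing from the transcript. One common way to turn an energy into a prior, assumed here rather than taken from the slides, is a Gibbs distribution with a trade-off hyperparameter \beta and a normalising partition function Z(\beta):

```latex
P(G \mid \beta) = \frac{e^{-\beta E(G)}}{Z(\beta)},
\qquad
Z(\beta) = \sum_{G'} e^{-\beta E(G')}
```

Under this form, \beta = 0 gives a flat prior over graphs, and larger \beta concentrates the prior on graphs with low energy.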

slide31

Sample networks and hyperparameters from the posterior distribution

  • Capture intrinsic inference uncertainty
  • Learn the trade-off parameters automatically

P(M|D) = P(D|M) P(M) / Z

slide33

Energy of a network

Rewriting the energy

slide34

Approximation of the partition function

Partition function of a perfect gas
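A sketch of how such an approximation can work, under assumptions that are not stated in the transcript: if the energy decomposes into per-node terms \mathcal{E}(n, \pi_n) that depend only on node n and its parent set \pi_n, and the acyclicity constraint is ignored, the sum over graphs factorises over nodes, much as the partition function of a perfect gas factorises over non-interacting particles.

```latex
Z(\beta) = \sum_{G} e^{-\beta E(G)}
\;\approx\; \prod_{n} \sum_{\pi_n} e^{-\beta\, \mathcal{E}(n, \pi_n)},
\qquad
E(G) = \sum_{n} \mathcal{E}(n, \pi_n)
```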

slide37

Sample networks and hyperparameters from the posterior distribution

Proposal probabilities

Metropolis-Hastings scheme
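A minimal Metropolis-Hastings sketch for sampling graph structures. It is a simplification, not the scheme from the talk: the target is an assumed prior proportional to exp(-β E(G)) with a fixed β and no data term, the energy is an assumed absolute-difference form, and the proposal toggles one uniformly chosen edge, which is symmetric so the proposal probabilities cancel. Cyclic proposals are rejected.

```python
import math
import random

def energy(belief, graph):
    """Assumed energy: sum over i != j of |belief[i][j] - graph[i][j]|."""
    n = len(belief)
    return sum(abs(belief[i][j] - graph[i][j])
               for i in range(n) for j in range(n) if i != j)

def is_acyclic(graph):
    """Repeatedly strip nodes without incoming edges; a cycle leaves some behind."""
    remaining = set(range(len(graph)))
    while remaining:
        roots = [j for j in remaining if not any(graph[i][j] for i in remaining)]
        if not roots:
            return False
        remaining -= set(roots)
    return True

def sample_graphs(belief, beta, steps, seed=0):
    """Sample DAGs with probability proportional to exp(-beta * energy)."""
    rng = random.Random(seed)
    n = len(belief)
    graph = [[0] * n for _ in range(n)]        # start from the empty graph
    current = energy(belief, graph)
    samples = []
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)         # uniformly chosen ordered pair, i != j
        graph[i][j] ^= 1                       # propose toggling edge i -> j
        proposed = energy(belief, graph)
        ratio = math.exp(-beta * (proposed - current))
        if is_acyclic(graph) and rng.random() < ratio:
            current = proposed                 # accept the move
        else:
            graph[i][j] ^= 1                   # reject: undo the toggle
        samples.append([row[:] for row in graph])
    return samples

# Hypothetical prior knowledge favouring edge 0 -> 1 over edge 1 -> 0.
belief = [[0.0, 0.9, 0.5], [0.1, 0.0, 0.5], [0.5, 0.5, 0.0]]
samples = sample_graphs(belief, beta=3.0, steps=2000, seed=1)
freq_01 = sum(g[0][1] for g in samples) / len(samples)
freq_10 = sum(g[1][0] for g in samples) / len(samples)
print(freq_01, freq_10)
```

In a full scheme the acceptance ratio would also include the likelihood ratio P(D|M') / P(D|M), the hyperparameters would be sampled as well, and a proposal restricted to valid neighbouring DAGs would require the proposal probabilities named on the slide.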

slide38

Bayesian networks with biological prior knowledge

  • Biological prior knowledge: information about the interactions between the nodes.
  • We use two distinct sources of biological prior knowledge.
  • Each source of biological prior knowledge is associated with its own trade-off parameter: β1 and β2.
  • The trade-off parameter indicates how much biological prior information is used.
  • The trade-off parameters are inferred. They are not set by the user!
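One way to write a prior with two sources, assuming each source k has its own energy E_k(G) and that the prior takes a Gibbs form; the two trade-off parameters are written \beta_1 and \beta_2 here, and the formula is an assumption rather than a transcription of the slides:

```latex
P(G \mid \beta_1, \beta_2)
= \frac{\exp\!\bigl(-\beta_1 E_1(G) - \beta_2 E_2(G)\bigr)}{Z(\beta_1, \beta_2)}
```

In this form a source whose inferred parameter is close to zero has almost no influence on the prior, while a large parameter means the corresponding source is weighted heavily.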
slide39

Bayesian networks with two sources of prior

[Figure: schematic in which Data, Source 1 and Source 2 enter "BNs + MCMC", which returns the recovered networks and the trade-off parameters β1 and β2.]

overview of the talk3
Overview of the talk
  • Revision: Bayesian networks
  • Integration of prior knowledge
  • Empirical evaluation
evaluation
Evaluation
  • Can the method automatically evaluate how useful the different sources of prior knowledge are?
  • Do we get an improvement in the regulatory network reconstruction?
  • Is this improvement optimal?
slide44

Raf regulatory network

From Sachs et al Science 2005

evaluation raf signalling pathway
Evaluation: Raf signalling pathway
  • Cellular signalling network of 11 phosphorylated proteins and phospholipids in human immune system cells
  • Deregulation → carcinogenesis
  • Extensively studied in the literature → gold standard network
slide48

Flow cytometry data

  • Intracellular multicolour flow cytometry experiments: concentrations of 11 proteins
  • 5400 cells have been measured under 9 different cellular conditions (cues)
  • Downsampling to 100 instances (5 separate subsets): indicative of microarray experiments
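The downsampling step can be sketched as drawing row indices at random. The transcript does not say whether the five subsets overlap; this sketch assumes they are disjoint, and the function name and seed are illustrative.

```python
import random

def downsample(n_cells, subset_size, n_subsets, seed=0):
    """Draw n_subsets disjoint index sets of subset_size cells each."""
    rng = random.Random(seed)
    # Sampling without replacement guarantees that no cell is used twice.
    chosen = rng.sample(range(n_cells), subset_size * n_subsets)
    return [chosen[k * subset_size:(k + 1) * subset_size] for k in range(n_subsets)]

subsets = downsample(n_cells=5400, subset_size=100, n_subsets=5)
print([len(s) for s in subsets])
```

Each index list would then select the rows of the measurement table that form one small data set.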
slide49

Microarray example

  • Spellman et al (1998): cell cycle, 73 samples
  • Tu et al (2005): metabolic cycle, 36 samples

[Figure: two panels, one per data set, with axes labelled "Genes" and "time".]

slide51

Flow cytometry data and KEGG

http://www.genome.jp/kegg/

KEGG PATHWAYS are a collection of manually drawn pathway maps representing our knowledge of molecular interactions and reaction networks.

evaluation1
Evaluation
  • Can the method automatically evaluate how useful the different sources of prior knowledge are?
  • Do we get an improvement in the regulatory network reconstruction?
  • Is this improvement optimal?
slide64

Learning the trade-off hyperparameter

Mean and standard deviation of the sampled trade off parameter

  • Repeat MCMC simulations for large set of fixed hyperparameters β
  • Obtain AUC scores for each value of β
  • Compare with the proposed scheme in which β is automatically inferred.
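A sketch of how an AUC score can be obtained for one run, assuming AUC means the area under the ROC curve and that each candidate edge has a score (for example its posterior probability) and a gold standard label. The scores and labels below are invented.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one true edge and one non-edge")
    # Fraction of (true edge, non-edge) pairs in which the true edge scores higher.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical edge scores and gold standard labels (1 = edge in the gold standard).
edge_scores = [0.9, 0.8, 0.2, 0.1]
gold_labels = [1, 1, 0, 0]
print(auc(edge_scores, gold_labels))
```

Repeating this for every fixed value of β gives the reference curve against which the run with automatically inferred β is compared.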
conclusion
Conclusion
  • Bayesian scheme for the systematic integration of different sources of biological prior knowledge.
  • The method can automatically evaluate how useful the different sources of prior knowledge are.
  • We get an improvement in the regulatory network reconstruction.
  • This improvement is close to optimal.