
Temporal Logic Constraints in the Biochemical Abstract Machine BIOCHAM. François Fages, Project-team: Contraintes, INRIA Rocquencourt, France, http://contraintes.inria.fr/



  1. Temporal Logic Constraints in the Biochemical Abstract Machine BIOCHAM • François Fages, Project-team: Contraintes, INRIA Rocquencourt, France • http://contraintes.inria.fr/ • Joint work with: Nathalie Chabrier-Rivier, Sylvain Soliman, Laurence Calzone • 2002-2004: ARC CPBIO “Process Calculi and Biology of Molecular Networks” • A. Bockmayr, LORIA, V. Danos, CNRS PPS, V. Schächter, Genoscope Evry

  2. Systems Biology? • Multidisciplinary field aiming to get over the complexity walls in order to reason about biological processes at the system level. • Virtual cell: emulate high-level biological processes in terms of their biochemical basis at the molecular level (in silico experiments) • Bioinformatics: end of the 90s, genomic sequences → post-genomic data (RNA expression, protein synthesis, protein-protein interactions, …) • Need for a strong effort on: • - the formal representation of biological processes, • - formal tools for modeling and reasoning about their global behavior.

  3. Language Approach to Cell Systems Biology • Qualitative models: from diagrammatic notation to • Boolean networks [Thomas 73] • Petri Nets [Reddy 93] • Milner’s π-calculus [Regev-Silverman-Shapiro 99-01, Nagasaki et al. 00] • Bio-ambients [Regev-Panina-Silverman-Cardelli-Shapiro 03] • Pathway logic [Eker-Knapp-Laderoute-Lincoln-Meseguer-Sonmez 02] • Transition systems [Chabrier-Chiaverini-Danos-Fages-Schachter 04] • Biochemical abstract machine BIOCHAM-1 [Chabrier-Fages 03] • Quantitative models: from differential equation systems to • Hybrid Petri nets [Hofestadt-Thelen 98, Matsuno et al. 00] • Hybrid automata [Alur et al. 01, Ghosh-Tomlin 01] • Hybrid concurrent constraint languages [Bockmayr-Courtois 01] • Rules with continuous dynamics BIOCHAM-2 [Chabrier-Fages-Soliman 04]

  4. Outline of the Presentation • Introduction • Biocham Rule Language for Modeling Biochemical Systems • Syntax of objects and reactions • Semantics at 3 abstraction levels: Boolean, Concentrations, Populations • Biocham Temporal Logic for Formalizing Biological Properties • CTL for Boolean semantics • Constraint LTL for Concentration semantics • Learning Rules and Parameters from Temporal Properties • Learning reaction rules from CTL specification • Learning kinetic parameter values from Constraint-LTL specification • Conclusion and collaborations

  5. 2. Modeling Biochemical Systems • Small molecules: covalent bonds (outer electrons shared) 50-200 kcal/mol • 70% water • 1% ions • 6% amino acids (20), nucleotides (5), fats, sugars, ATP, ADP, … • Macromolecules: hydrogen bonds, ionic, hydrophobic, van der Waals 1-5 kcal/mol • Stability and bindings determined by the number of weak bonds: 3D shape • 20% proteins (50-10^4 amino acids) • RNA (10^2-10^4 nucleotides AGCU) • DNA (10^2-10^6 nucleotides AGCT)

  6. Formal Proteins • Cyclin dependent kinase 1: Cdk1 (free, inactive) • Complex Cdk1-Cyclin B: Cdk1-CycB (low activity) • Phosphorylated form at site threonine 161: Cdk1~{thr161}-CycB (high activity), also called Mitosis Promotion Factor MPF

  7. BIOCHAM Syntax of Objects • E ::= compound | E-E | E~{p1,…,pn} • Compound: molecule, #gene binding site, abstract @process, … • - : binding operator for protein complexes, gene binding sites, … Associative and commutative. • ~{…} : modification operator for phosphorylated sites, … Set of modified sites (associative, commutative, idempotent). • O ::= E | E::location • Location: symbolic compartment (nucleus, cytoplasm, membrane, …) • S ::= _ | O+S • + : solution operator (associative, commutative, neutral element _)
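The algebraic properties of the two operators can be illustrated with a minimal Python sketch. It is not BIOCHAM code: the helper names `complex_of` and `modified` are hypothetical, and the choice of a multiset for binding and a set for modification sites is an assumption that simply mirrors the stated laws (associative and commutative for `-`, and additionally idempotent for `~{…}`).

```python
from collections import Counter

def complex_of(*parts):
    """Binding operator "-": order and grouping do not matter (multiset of parts)."""
    return frozenset(Counter(parts).items())

def modified(compound, *sites):
    """Modification operator "~{...}": a set of sites, so repeats collapse."""
    return (compound, frozenset(sites))

# cdk1-cycB and cycB-cdk1 denote the same complex.
same_complex = complex_of("cdk1", "cycB") == complex_of("cycB", "cdk1")
# Listing a site twice, or in another order, gives the same modified form.
same_sites = modified("cdk1", "thr14", "tyr15") == modified("cdk1", "tyr15", "thr14", "thr14")
```

A multiset is used for binding because a complex may contain the same compound twice, whereas a phosphorylation site is either modified or not.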

  8. Seven Main Rule Schemas • Complexation: A + B => A-B. Decomplexation: A-B => A + B • cdk1+cycB => cdk1-cycB • Phosphorylation: A =[C]=> A~{p}. Dephosphorylation: A~{p} =[C]=> A • Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB • Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB • Synthesis: _ =[C]=> A. Degradation: A =[C]=> _. • _ =[#Ge2-E2f13-Dp12]=> cycA • cycE =[@UbiPro]=> _ (not for cycE-cdk2 which is stable) • Transport: A::L1 => A::L2 • Cdk1~{p}-CycB::cytoplasm => Cdk1~{p}-CycB::nucleus

  9. BIOCHAM Syntax of Reaction Rules • R ::= S=>S | S=[O]=>S | S<=>S | S<=[O]=>S • where A=[C]=>B stands for A+C=>B+C • A<=>B stands for A=>B and B=>A, etc. • N ::= expr for R (import/export SBML format) • Three abstraction levels: • Boolean Semantics: presence-absence of molecules • Concurrent Transition System (asynchronous, non-deterministic) • Concentration Semantics: number / volume of diffusion • Ordinary Differential Equations (deterministic) • Population of molecules: number of molecules • Stochastic Multiset Rewriting
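The two abbreviations on this slide can be unfolded mechanically. The following Python sketch shows one way to do it; the function `expand` and the representation of a rule as a pair of multisets are assumptions made for illustration, not BIOCHAM's internal representation.

```python
from collections import Counter

def expand(left, right, catalyst=None, reversible=False):
    """Unfold rule shorthand into plain (reactants, products) pairs.

    left, right: lists of molecule names; catalyst: a name or None.
    Hypothetical helper, not part of BIOCHAM.
    """
    if catalyst is not None:
        # A =[C]=> B stands for A + C => B + C
        left = left + [catalyst]
        right = right + [catalyst]
    rules = [(Counter(left), Counter(right))]
    if reversible:
        # A <=> B stands for A => B and B => A
        rules.append((Counter(right), Counter(left)))
    return rules

# Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB from the rule schemas
phosphorylation = expand(["Cdk1-CycB"], ["Cdk1~{thr161}-CycB"], catalyst="Myt1")
```

Multisets are used on both sides because the solution operator `+` is associative and commutative.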

  10. Cell Cycle: G1 → DNA Synthesis → G2 → Mitosis • G1: Cdk4-CycD, Cdk6-CycD, Cdk2-CycE • S: Cdk2-CycA • G2, M: Cdk1-CycA, Cdk1-CycB (MPF)

  11. Mammalian Cell Cycle Model [Kohn 99]

  12. Boolean Semantics • Associate: • Boolean state variables to molecules, denoting the presence/absence of molecules in the cell or compartment • A finite concurrent transition system [Shankar 93] to rules (asynchronous), over-approximating the set of all possible behaviors • A reaction A+B=>C+D is translated into 4 transition rules for the possibly complete consumption of reactants: • A+B → A+B+C+D • A+B → ¬A+B+C+D • A+B → A+¬B+C+D • A+B → ¬A+¬B+C+D
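The translation of one reaction into several Boolean transitions can be sketched in Python as follows. This is an illustration under stated assumptions, not BIOCHAM's implementation: a state is taken to be the set of molecules present, and the function name `boolean_successors` is hypothetical. Each reactant may or may not be completely consumed, while the products are always present afterwards.

```python
from itertools import combinations

def boolean_successors(state, reactants, products):
    """Successor states of one reaction under the Boolean semantics.

    state: set of molecules currently present.
    Returns a set of frozensets, one per choice of consumed reactants.
    """
    state = frozenset(state)
    reactants, products = frozenset(reactants), frozenset(products)
    if not reactants <= state:
        return set()  # the reaction cannot fire
    result = set()
    for k in range(len(reactants) + 1):
        for consumed in combinations(sorted(reactants), k):
            # remove the consumed reactants, then make every product present
            result.add((state - frozenset(consumed)) | products)
    return result

# A+B=>C+D from a state where A and B are present: four successors.
succ = boolean_successors({"A", "B"}, {"A", "B"}, {"C", "D"})
```

Because products are added after the removal step, a molecule that appears on both sides of a rule (a catalyst) stays present in every successor.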

  13. Concentration Semantics • k1cc for _=>preMPF. • k3cc*[C25~{s1,s2}]*[preMPF] for preMPF=[C25~{s1,s2}]=>MPF. • (k14cc*[CKI]*[MPF],k15cc*[CKI-MPF]) for CKI+MPF<=>CKI-MPF. • k2cc*[preMPF] for preMPF=>_. • k2cc*[MPF] for MPF=>_. • k2u*[APC]*[MPF] for MPF=[APC]=>_. • k4cc*[Wee1]*[MPF] for MPF=[Wee1]=>preMPF. • … • parameter(k1cc,0.25). • … • present({preMPF, Wee1m}). • Compiles into an ODE system (or a Stochastic Process under the Population semantics)
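How kinetic expressions attached to rules yield an ODE system can be sketched in Python on two of the rules above. Several things here are assumptions for illustration: the helper names, the value of `k2cc` (only `k1cc` is given on the slide), and the use of a fixed-step Euler scheme in place of the adaptive integrators a real tool would use.

```python
def derivatives(conc, reactions):
    """d[X]/dt for every molecule: each reaction subtracts its rate from
    its reactants and adds it to its products."""
    d = {m: 0.0 for m in conc}
    for rate, reactants, products in reactions:
        v = rate(conc)
        for m in reactants:
            d[m] -= v
        for m in products:
            d[m] += v
    return d

def euler(conc, reactions, dt, steps):
    """Fixed-step Euler integration (a simplification, chosen for brevity)."""
    conc = dict(conc)
    for _ in range(steps):
        d = derivatives(conc, reactions)
        for m in conc:
            conc[m] += dt * d[m]
    return conc

k1cc = 0.25  # from parameter(k1cc,0.25)
k2cc = 0.5   # assumed value, not given on the slide
reactions = [
    (lambda c: k1cc, [], ["preMPF"]),                # k1cc for _=>preMPF.
    (lambda c: k2cc * c["preMPF"], ["preMPF"], []),  # k2cc*[preMPF] for preMPF=>_.
]
final = euler({"preMPF": 0.0}, reactions, dt=0.01, steps=5000)
```

With only these two rules the system reduces to d[preMPF]/dt = k1cc - k2cc*[preMPF], whose steady state is k1cc/k2cc.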

  14. 3. Formalizing Biological Properties in Temporal Logics • Boolean Semantics: Computation Tree Logic CTL • [Figure: CTL operators, combining path quantifiers E, A (non-determinism) with time operators F, G, U, e.g. EF, EU, AG]

  15. Biological Properties formalized in CTL [Chabrier Fages 03] • About reachability: • Can the cell produce some protein P? reachable(P) == EF(P)

  16. Biological Properties formalized in CTL [Chabrier Fages 03] • About reachability: • Can the cell produce some protein P? reachable(P) == EF(P) • About pathways: • Is it possible to produce P without having Q? E(¬Q U P) • Is state s2 a necessary checkpoint for reaching state s? • checkpoint(s2,s) == ¬E(¬s2 U s)

  17. Biological Properties formalized in CTL [Chabrier Fages 03] • About reachability: • Can the cell produce some protein P? reachable(P) == EF(P) • About pathways: • Is it possible to produce P without having Q? E(¬Q U P) • Is state s2 a necessary checkpoint for reaching state s? • checkpoint(s2,s) == ¬E(¬s2 U s) • About stationarity: • Is a (partially described) state s a stable state? stable(s) == AG(s) • Is s a steady state (with possibility of escaping)? steady(s) == EG(s) • Can the cell reach a stable state? EF(stable(s))

  18. Biological Properties formalized in CTL [Chabrier Fages 03] • About reachability: • Can the cell produce some protein P? reachable(P) == EF(P) • About pathways: • Is it possible to produce P without having Q? E(¬Q U P) • Is state s2 a necessary checkpoint for reaching state s? • checkpoint(s2,s) == ¬E(¬s2 U s) • About stationarity: • Is a (partially described) state s a stable state? stable(s) == AG(s) • Is s a steady state (with possibility of escaping)? steady(s) == EG(s) • Can the cell reach a stable state? EF(stable(s)) • About oscillations (approximation without strong fairness): • Can the system exhibit a cyclic behavior w.r.t. the presence of P? oscillation(P) == EG((P ⇒ EF ¬P) ^ (¬P ⇒ EF P))
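The CTL queries listed above can be evaluated on a small explicit transition system with the usual fixpoint computations. The Python sketch below is a toy explicit-state checker written for illustration; BIOCHAM itself delegates this to the NuSMV symbolic model checker. The transition system `trans` is a made-up example, every state must appear as a key, and every state must have at least one successor.

```python
def pre(trans, target):
    """States with at least one successor in target."""
    return {s for s, succs in trans.items() if succs & target}

def EU(trans, hold, goal):
    """E(hold U goal), computed as a least fixpoint."""
    result = set(goal)
    while True:
        new = result | (pre(trans, result) & hold)
        if new == result:
            return result
        result = new

def EF(trans, goal):
    """reachable: EF(goal) is E(true U goal)."""
    return EU(trans, set(trans), goal)

def EG(trans, hold):
    """steady: EG(hold), computed as a greatest fixpoint."""
    result = set(hold)
    while True:
        new = result & pre(trans, result)
        if new == result:
            return result
        result = new

def AG(trans, hold):
    """stable: AG(hold) is the complement of EF(not hold)."""
    states = set(trans)
    return states - EF(trans, states - hold)

def checkpoint(trans, s2, s):
    """States where s2 is a necessary checkpoint for s: not E(not s2 U s)."""
    states = set(trans)
    return states - EU(trans, states - s2, s)

# Toy system: 0 -> 1 -> 3 (loops on itself), 0 -> 2 (loops on itself).
trans = {0: {1, 2}, 1: {3}, 2: {2}, 3: {3}}
can_reach_3 = EF(trans, {3})
must_pass_1 = checkpoint(trans, {1}, {3})
```

Each function returns the set of states where the formula holds, so a query such as `0 in must_pass_1` answers the question for a given initial state.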

  19. Cell Cycle Model-Checking • biocham: check_reachable(cdk46~{p1,p2}-cycD~{p1}). • Ei(EF(cdk46~{p1,p2}-cycD~{p1})) is true • biocham: check_checkpoint(cdc25C~{p1,p2}, cdk1~{p1,p3}-cycB). • Ai(!(E(!(cdc25C~{p1,p2}) U cdk1~{p1,p3}-cycB))) is true • biocham: nusmv(Ai(AG(!(cdk1~{p1,p2,p3}-cycB) -> checkpoint(Wee1, cdk1~{p1,p2,p3}-cycB)))). • Ai(AG(!(cdk1~{p1,p2,p3}-cycB)->!(E(!(Wee1) U cdk1~{p1,p2,p3}-cycB)))) is false • biocham: why. • -- Loop starts here • cycB-cdk1~{p1,p2,p3} is present • cdk7 is present • cycH is present • cdk1 is present • Myt1 is present • cdc25C~{p1} is present • rule_114 cycB-cdk1~{p1,p2,p3}=[cdc25C~{p1}]=>cycB-cdk1~{p2,p3}. • cycB-cdk1~{p2,p3} is present • cycB-cdk1~{p1,p2,p3} is absent • rule_74 cycB-cdk1~{p2,p3}=[Myt1]=>cycB-cdk1~{p1,p2,p3}. • cycB-cdk1~{p2,p3} is absent • cycB-cdk1~{p1,p2,p3} is present

  20. Cell Cycle Model-Checking • 800 rules, 165 proteins and genes, 500 variables. • BIOCHAM-NuSMV symbolic model-checker time in seconds:

  21. Concentration Semantics: Constraint LTL • Constraints over concentrations and derivatives as FOL formulae over the reals: • [M] > 0.2 • [M]+[P] > [Q] • d([M])/dt < 0 • Constraint LTL operators for time F, U, G (no non-determinism). • F([M]>0.2) • FG([M]>0.2) • F ([M]>2 & F (d([M])/dt<0 & F ([M]<2 & d([M])/dt>0 & F(d([M])/dt<0)))) • oscil(M,n)= F (d([M])/dt>0 & F(d([M])/dt<0 & … )) • Language to formalize the relevant properties observed in experiments
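The operators F and G over constraints can be evaluated directly on a finite list of states. The Python sketch below assumes a finite-trace reading (a formula is checked from position i to the end of the list) and a hypothetical state layout with keys `"M"` for [M] and `"dM"` for d([M])/dt. The combinator names are illustrative, not BIOCHAM's.

```python
def atom(pred):
    """Lift a constraint on one state to a formula on (trace, position)."""
    return lambda trace, i: pred(trace[i])

def AND(phi, psi):
    return lambda trace, i: phi(trace, i) and psi(trace, i)

def F(phi):
    """F phi: phi holds at some position j >= i."""
    return lambda trace, i: any(phi(trace, j) for j in range(i, len(trace)))

def G(phi):
    """G phi: phi holds at every position j >= i."""
    return lambda trace, i: all(phi(trace, j) for j in range(i, len(trace)))

# Made-up trace: [M] rises above 0.2, then starts to decrease.
trace = [
    {"M": 0.10, "dM": 1.0},
    {"M": 0.30, "dM": 0.5},
    {"M": 0.25, "dM": -0.2},
]
above = atom(lambda s: s["M"] > 0.2)         # [M] > 0.2
decreasing = atom(lambda s: s["dM"] < 0)     # d([M])/dt < 0

eventually_above = F(above)(trace, 0)                   # F([M]>0.2)
stays_above = F(G(above))(trace, 0)                     # FG([M]>0.2)
peak = F(AND(above, F(decreasing)))(trace, 0)           # F([M]>0.2 & F(d([M])/dt<0))
```

Nesting F inside F, as in the last formula, is how the `oscil` pattern on the slide chains successive sign changes of the derivative.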

  22. Traces from Numerical Simulation • From a system of Ordinary Differential Equations • dX/dt = f(X) • Numerical integration produces a discretization of time (adaptive step size Runge-Kutta and Rosenbrock method for stiff systems) • The trace is a linear Kripke structure: • (t_0,X_0), (t_1,X_1), …, (t_n,X_n), … • the derivatives can be added to the trace • (t_0,X_0,dX_0/dt), (t_1,X_1,dX_1/dt), …, (t_n,X_n,dX_n/dt), … • Equality x=v true if x_i ≤ v & x_{i+1} ≥ v or if x_i ≥ v & x_{i+1} ≤ v
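The crossing rule for equality on a discretized trace is easy to state in code. In this Python sketch the function name `equals_on_trace` is hypothetical; it returns the indices i at which x = v counts as true because the sampled values at steps i and i+1 lie on either side of v.

```python
def equals_on_trace(xs, v):
    """Indices i where x = v holds by the crossing rule:
    x_i <= v <= x_{i+1} or x_i >= v >= x_{i+1}."""
    return [
        i
        for i in range(len(xs) - 1)
        if (xs[i] <= v <= xs[i + 1]) or (xs[i] >= v >= xs[i + 1])
    ]

# The value 0.2 is never sampled exactly, but it is crossed twice.
crossings = equals_on_trace([0.0, 0.3, 0.1], 0.2)
```

Testing for a crossing rather than for exact equality is what makes x = v meaningful on a trace whose time points are chosen by the integrator.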

  23. 4. Learning Kinetic Parameters with Constraint-LTL • parameter(k3cc,0.1). • k3cc*[MPF~{p}]*[cdc25C~{p1,p2}] for MPF~{p}=[cdc25C~{p1,p2}]=>MPF. • biocham: trace_get([k3cc],[(0,5)],20, oscil(MPF,4)&F([MPF]>1),100). • Found parameters that make oscil(MPF,4) & F([MPF]>1) true: • parameter(k3cc,2.5).
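The idea of searching for parameter values that make a temporal specification true can be sketched as a scan over an interval: simulate for each candidate value and keep the first one whose trace satisfies the formula. The Python code below is a toy illustration only. The one-variable model, the Euler integrator and the function names are all assumptions, and nothing here is claimed about how `trace_get` interprets its arguments.

```python
def simulate(k, dt=0.01, horizon=10.0):
    """Euler trace of the toy model d[X]/dt = k - [X], starting from [X] = 0."""
    x, trace = 0.0, [0.0]
    for _ in range(int(horizon / dt)):
        x += dt * (k - x)
        trace.append(x)
    return trace

def search_parameter(lo, hi, steps, satisfies):
    """Scan steps+1 evenly spaced values in [lo, hi]; return the first one
    whose simulated trace satisfies the specification, or None."""
    for n in range(steps + 1):
        k = lo + (hi - lo) * n / steps
        if satisfies(simulate(k)):
            return k
    return None

# Specification F([X] > 1): some point of the trace exceeds 1.
spec = lambda trace: any(x > 1.0 for x in trace)
found = search_parameter(0.0, 5.0, 20, spec)
```

In this toy model [X] tends to k from below, so the specification can only hold for k greater than 1, and the scan stops at the first grid value above that threshold.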

  24. Learning Reaction Rules from CTL Specification • The biological properties of the system are added as CTL formulas • biocham: add_spec({reachable(MPF),checkpoint(cdc25C~{p1,p2},MPF),...}). • Suppose that the MPF activation rule is missing in the model • biocham: delete_rule(MPF~{p}=[cdc25C~{p1,p2}]=>MPF). • biocham: check_all. • The specification is not satisfied. • This formula is the first not verified: Ei(EF(MPF)) • Rules can be searched to correct the model w.r.t. specification: • biocham: learn_one_rule(all_elementary_interaction_rules). • Possible rules to be added: 3 • _=[cdc25C~{p1,p2}]=>MPF • MPF~{p}=[cdc25C~{p1,p2}]=>MPF • CKI+MPF~{p}=[cdc25C~{p1,p2}]=>CKI-MPF
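The search for a rule that repairs a violated reachability property can be sketched as generate-and-test: try each candidate rule in turn and keep those for which the property holds again. The Python code below is a toy version under stated assumptions. The function names reuse the slide's vocabulary but are not BIOCHAM's implementation, the initial state is made up, and the third candidate (using `Wee1`) is a hypothetical rule added only to show a rejected case. Reachability of a single molecule is checked along the branch of the Boolean semantics where no reactant is ever consumed, which suffices here because that branch accumulates every molecule that can appear at all.

```python
def reachable(initial, rules, target):
    """EF(target) for one molecule, following the no-consumption branch."""
    present = set(initial)
    changed = True
    while changed:
        changed = False
        for reactants, products in rules:
            if set(reactants) <= present and not set(products) <= present:
                present |= set(products)
                changed = True
    return target in present

def learn_one_rule(initial, rules, candidates, target):
    """Candidate rules whose addition makes reachable(target) true."""
    return [c for c in candidates if reachable(initial, rules + [c], target)]

C25 = "cdc25C~{p1,p2}"
initial = {"MPF~{p}", C25}            # assumed initial state
candidates = [
    ([C25], ["MPF", C25]),            # _=[cdc25C~{p1,p2}]=>MPF
    (["MPF~{p}", C25], ["MPF", C25]), # MPF~{p}=[cdc25C~{p1,p2}]=>MPF
    (["Wee1"], ["MPF"]),              # hypothetical candidate, Wee1 is absent
]
repairs = learn_one_rule(initial, [], candidates, "MPF")
```

A full implementation would re-check the whole CTL specification for each candidate rather than a single reachability query.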

  25. Learning Reaction Rules from CTL Specification • Example: finding an intermediary step between MPF and APC activation • biocham: absent(X). add_rule(_=>X). add_rule(X=>_). • biocham: add_specs({ Ei(reachable(X)), Ai(oscil(X)), • Ai(AG(!APC->checkpoint(X,APC))), • Ai(AG(!X->checkpoint(MPF,X))) }). • biocham: check_all. • The specification is not satisfied. • This formula is the first not verified: Ai(AG(!APC->!(E(!X U APC)))) • Biocham searches for revisions of the model satisfying the specification • biocham: revise_model. • Deletion(s): _=[MPF]=>APC. _=>X. • Addition(s): _=[X]=>APC. _=[MPF]=>X.

  26. Conclusion • The biochemical abstract machine BIOCHAM implements: • A simple rule-based language for modeling biochemical processes with three abstraction levels: • Boolean semantics: presence/absence of molecules • Molecule Concentration semantics (ODE) • Molecule Population semantics (stochastic) • A powerful temporal logic language for formalizing biological properties • CTL (implemented with NuSMV model checker) • Constraint LTL (implemented in Prolog) • Machine learning techniques • Reaction rule discovery from CTL specification • Parameter estimation from constraint LTL specification • Issue of compositionality: model reuse in different contexts • Issue of abstraction/refinement: model simplification/decomposition

  27. Collaborations • STREP APRIL 2: Applications of probabilistic inductive logic programming • Luc de Raedt, Freiburg, Stephen Muggleton, Imperial College London, … • Learning in a probabilistic logic setting • NoE REWERSE: Reasoning on the web with rules and semantics • François Bry, Munich, Rolf Backofen, Jena, Mike Schroeder, Dresden, … • Connecting Biocham to the semantic web: gene and protein ontologies • INRIA Bang, Jean Clairambault, Benoît Perthame • INSERM, Villejuif, Francis Lévi “Cancer chronotherapies” • ULB, Albert Goldbeter, Bruxelles • Coupled models of cell cycle, circadian cycle, cytotoxic drugs.
