
Temporal Logic Constraints in the Biochemical Abstract Machine BIOCHAM. François Fages, Project-team: Contraintes, INRIA Rocquencourt, http://contraintes.inria.fr/



  1. Temporal Logic Constraints in the Biochemical Abstract Machine BIOCHAM • François Fages, Project-team: Contraintes, INRIA Rocquencourt • http://contraintes.inria.fr/ • Joint work with: Nathalie Chabrier-Rivier, Sylvain Soliman, Laurence Calzone • 2002-2004: ARC CPBIO “Process Calculi and Biology of Molecular Networks” • Bockmayr, LORIA, V. Danos, CNRS PPS, V. Schächter, Genoscope Evry

  2. Systems Biology? • Multidisciplinary field aiming at getting over the complexity walls to reason about biological processes at the system level. • Virtual cell: emulate high-level biological processes in terms of their biochemical basis at the molecular level (in silico experiments) • Beyond providing tools to biologists, Computer Science has much to offer in terms of concepts and methods. • Bioinformatics: end of the 1990s, genomic sequences → post-genomic data (RNA expression, protein synthesis, protein-protein interactions, …) • Need for a strong effort on: • - the formal representation of biological processes, • - formal tools for modeling and reasoning about their global behavior.

  3. Language Approach to (Cell) Systems Biology • Qualitative models: from diagrammatic notation to • Boolean networks [Thomas 73] • Milner’s π-calculus [Regev-Silverman-Shapiro 99-01, Nagasaki et al. 00] • Concurrent transition systems [Chabrier-Chiaverini-Danos-Fages-Schachter 04] • Biochemical abstract machine BIOCHAM-1 [Chabrier-Fages 03] • Pathway logic [Eker-Knapp-Laderoute-Lincoln-Meseguer-Sonmez 02] • Bio-ambients [Regev-Panina-Silverman-Cardelli-Shapiro 03] • Quantitative models: from differential equation systems to • Hybrid Petri nets [Hofestadt-Thelen 98, Matsuno et al. 00] • Hybrid automata [Alur et al. 01, Ghosh-Tomlin 01] • Hybrid concurrent constraint languages [Bockmayr-Courtois 01] • Rule-based language BIOCHAM-2 [Chabrier-Fages-Soliman 04]

  4. Plan of the Presentation • Introduction • Biocham Rule Language for Modeling Biochemical Systems • Syntax of objects and reactions • Semantics at 3 abstraction levels: Boolean, concentrations, populations • Biocham Temporal Logic for Formalizing Biological Properties • Computation Tree Logic for Boolean semantics • Constraint Linear Time Logic for concentration semantics • Machine Learning Rules and Parameters from Temporal Properties • Learning kinetic parameter values • Learning reaction rules • Conclusion, collaborations

  5. 2. Objects in the Cell • Small molecules: covalent bonds (outer electrons shared) 50-200 kcal/mol • 70% water • 1% ions • 6% amino acids (20), nucleotides (5), • fats, sugars, ATP, ADP, … • Macromolecules: hydrogen bonds, ionic, hydrophobic, van der Waals 1-5 kcal/mol • Stability and bindings determined by the number of weak bonds: 3D shape • 20% proteins (50-10^4 amino acids) • RNA (10^2-10^4 nucleotides AGCU) • DNA (10^2-10^6 nucleotides AGCT)

  6. Formal Proteins • Cyclin dependent kinase 1: Cdk1 (free, inactive) • Complex Cdk1-Cyclin B: Cdk1-CycB (low activity) • Phosphorylated form at site threonine 161: Cdk1~{thr161}-CycB (high activity), the mitosis promotion factor MPF

  7. Formal Genes and RNA • Genes = parts of DNA #ERCC1 • Gene transcription: RNA copying from a gene • RNA expression: Protein synthesis from an RNA • #ERCC1-(PRB-JUN-CFOS)

  8. BIOCHAM Syntax of Objects • E ::= compound | E-E | E~{p1,…,pn} • O ::= E | E::location • S ::= _ | O+S • Location: symbolic compartment (nucleus, cytoplasm, membrane, …) • Compound: molecule, #gene binding site, abstract @process… • - : binding operator for protein complexes, gene binding sites, … • Associative and commutative. • ~{…}: modification operator for phosphorylated sites, … • Set of modified sites (Associative, Commutative, Idempotent). • + : solution operator (Associative, Commutative, Neutral _)

  9. Elementary Reaction Rule Schemas • Complexation: A + B => A-B Decomplexation: A-B => A + B • Cdk1+CycB => Cdk1-CycB • Phosphorylation: A =[C]=> A~{p} Dephosphorylation: A~{p} =[C]=> A • Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB • Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB • Synthesis: _ =[C]=> A. • _ =[#Ge2-E2f13-Dp12]=> CycA • Degradation: A =[C]=> _. • CycE =[@UbiPro]=> _ (not for CycE-Cdk2 which is stable)

  10. BIOCHAM Syntax of Reaction Rules • N ::= expr for R (import/export SBML,…) • R ::= S=>S | S=[O]=>S | S<=>S | S<=[O]=>S • where A=[C]=>B stands for A+C=>B+C • A<=>B stands for A=>B and B=>A, etc. • Three abstraction levels: • Boolean Semantics: presence-absence of molecules • Concurrent Transition System (asynchronous, non-deterministic) • Concentration Semantics: number / volume • Ordinary Differential Equations (deterministic) • ( Population of molecules: number of molecules ) • Stochastic Multiset Rewriting
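
The two abbreviations on this slide (A=[C]=>B for A+C=>B+C, and A<=>B for a pair of opposite rules) can be illustrated with a small string rewriter. This is only a sketch in Python, not BIOCHAM's own parser: the function name `expand_rule` is hypothetical, and it assumes well-formed rules with at most one catalyst and no kinetic expression in front.

```python
def expand_rule(rule: str) -> list[str]:
    """Expand the '=[C]=>' and '<=>' shorthands into plain one-way rules.

    Illustrative only: assumes a well-formed rule string without a
    'expr for' kinetic prefix.
    """
    rule = rule.strip().rstrip(".")
    reversible = "<=" in rule
    if "=[" in rule:
        # Catalysed form: left =[catalyst]=> right (possibly with a leading '<').
        left, rest = rule.split("=[", 1)
        catalyst, right = rest.split("]=>", 1)
        left = left.rstrip("<")
    else:
        left, right = rule.split("<=>" if reversible else "=>", 1)
        catalyst = None

    def side(solution: str) -> str:
        # '_' is the empty solution; the catalyst is added to both sides.
        parts = [] if solution.strip() == "_" else [solution.strip()]
        if catalyst:
            parts.append(catalyst.strip())
        return "+".join(parts) or "_"

    rules = [f"{side(left)}=>{side(right)}"]
    if reversible:
        rules.append(f"{side(right)}=>{side(left)}")
    return rules


print(expand_rule("MPF=[C25~{s1,s2}]=>MPF~{s}."))
print(expand_rule("CKI+MPF~{s}<=>CKI-MPF~{s}."))
```

The catalyst appears on both sides of the expanded rule, which is what makes it a catalyst: it is required for the reaction but not consumed by it.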

  11. Cell Cycle: G1 → DNA Synthesis → G2 → Mitosis • G1: Cdk4-CycD • Cdk6-CycD • Cdk2-CycE • S: Cdk2-CycA • G2 • M: Cdk1-CycA • Cdk1-CycB • (MPF)

  12. Cell Cycle Example [Qu 97]: Concentration Semantics • parameter(k1cc,0.25). • … • k1cc for _=>MPF. • k3cc*[C25~{s1,s2}]*[MPF] for • MPF=[C25~{s1,s2}]=>MPF~{s}. • (k14cc*[CKI]*[MPF~{s}],k15cc*[CKI-MPF~{s}]) for • CKI+MPF~{s}<=>CKI-MPF~{s}. • k2cc*[MPF] for MPF=>_. • k2cc*[MPF~{s}] for MPF~{s}=>_. • k2u*[APC]*[MPF~{s}] for MPF~{s}=[APC]=>_. • k4cc*[Wee1]*[MPF~{s}] for MPF~{s}=[Wee1]=>MPF. • present({MPF, Wee1m}).

  13. Mass Action Law Kinetics • Law: The number of reactions is proportional to the number of reactants. • $A + B \xrightarrow{k} C$ • proportionality factor $k$ • reaction rate $= kAB = dC/dt$, $dA/dt = -kAB$, $dB/dt = -kAB$ • $E + S \xrightarrow{k_1} C \xrightarrow{k_2} E + P$ • $E + S \xleftarrow{k_3} C$ • $dE/dt = -k_1 ES + (k_2 + k_3)C$ • $dS/dt = -k_1 ES + k_3 C$ • $dC/dt = k_1 ES - (k_2 + k_3)C$ • $dP/dt = k_2 C$ • Compositionality: The dynamics of a complex system is the composition of the dynamics of the elementary reactions under mass action law (at given temperature, pH, …).
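
As an illustration of these mass-action equations, the four ODEs for E, S, C and P can be integrated numerically. The sketch below uses a fixed-step classical Runge-Kutta scheme in plain Python rather than the adaptive integrators mentioned later in the talk. The rate constants and initial concentrations are arbitrary illustrative values, not taken from any model in the presentation.

```python
def enzyme_rhs(state, k1, k2, k3):
    """Right-hand side of the mass-action ODEs for E+S <-> C -> E+P."""
    E, S, C, P = state
    binding = k1 * E * S
    return (
        -binding + (k2 + k3) * C,  # dE/dt
        -binding + k3 * C,         # dS/dt
        binding - (k2 + k3) * C,   # dC/dt
        k2 * C,                    # dP/dt
    )


def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    a = f(state)
    b = f(tuple(x + 0.5 * dt * d for x, d in zip(state, a)))
    c = f(tuple(x + 0.5 * dt * d for x, d in zip(state, b)))
    d = f(tuple(x + dt * e for x, e in zip(state, c)))
    return tuple(
        x + dt / 6.0 * (da + 2 * db + 2 * dc + dd)
        for x, da, db, dc, dd in zip(state, a, b, c, d)
    )


def simulate(state, k1, k2, k3, dt=0.01, steps=2000):
    """Return the trace [(t, (E, S, C, P)), ...] of the integration."""
    f = lambda s: enzyme_rhs(s, k1, k2, k3)
    trace = [(0.0, state)]
    for i in range(1, steps + 1):
        state = rk4_step(f, state, dt)
        trace.append((i * dt, state))
    return trace


# Arbitrary illustrative values: E=1, S=10, no complex, no product.
trace = simulate((1.0, 10.0, 0.0, 0.0), k1=1.0, k2=1.0, k3=0.5)
t_end, (E, S, C, P) = trace[-1]
print(f"t={t_end:.1f}  E={E:.3f}  S={S:.3f}  C={C:.3f}  P={P:.3f}")
```

Two sanity checks follow directly from the equations: E + C stays constant (the enzyme is only bound or free), and S + C + P stays constant (substrate is only converted).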

  14. Boolean Semantics • Associate: • Boolean state variables to molecules • denoting the presence/absence of molecules in the cell or compartment • A finite concurrent transition system [Shankar 93] to rules (asynchronous) over-approximating the set of all possible behaviors • A reaction A+B=>C+D is translated into 4 transition rules taking into account the possible consumption of reactants: • A+B → A+B+C+D • A+B → ¬A+B+C+D • A+B → A+¬B+C+D • A+B → ¬A+¬B+C+D
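
The four transition rules can be enumerated mechanically: the products become present, and each reactant independently either remains or is completely consumed. The following is a minimal Python sketch. The function name `boolean_successors` is hypothetical, and the treatment of a molecule that appears on both sides (kept present, as for a catalyst) is an assumption of this sketch.

```python
from itertools import product


def boolean_successors(state, reactants, products):
    """Successor states of one firing of `reactants => products`.

    `state` is a frozenset of the molecules currently present.
    Returns an empty set when some reactant is absent.
    """
    if not set(reactants) <= state:
        return set()
    successors = set()
    # Each reactant is independently kept (True) or fully consumed (False).
    for kept in product([True, False], repeat=len(reactants)):
        nxt = set(state) | set(products)
        for molecule, keep in zip(reactants, kept):
            # Assumption: a molecule that is also a product stays present.
            if not keep and molecule not in products:
                nxt.discard(molecule)
        successors.add(frozenset(nxt))
    return successors


for s in sorted(boolean_successors(frozenset("AB"), ["A", "B"], ["C", "D"]), key=sorted):
    print(sorted(s))
```

With n reactants this yields up to 2^n successors per rule, which is why the Boolean transition system is non-deterministic and over-approximates the possible behaviors.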

  15. Cell Cycle Example [Qu 97]: Boolean Semantics • _=>MPF. • MPF=[C25~{s1,s2}]=>MPF~{s}. • CKI+MPF~{s}<=>CKI-MPF~{s}. • MPF=>_. • MPF~{s}=>_. • MPF~{s}=[APC]=>_. • MPF~{s}=[Wee1]=>MPF. • … • present({MPF, Wee1m}).

  16. Mammalian Cell Cycle Model [Kohn 99]

  17. Detail for Cdk2 • Complexation with CycA and CycE • Phosphorylation sites PY15 and P • Biocham Rules: • cdk2~$P + cycA-$C => cdk2~$P-cycA-$C • where $C in {_,cks1} . • cdk2~$P + cycE~$Q-$C => cdk2~$P-cycE~$Q-$C • where $C in {_,cks1} . • p57 + cdk2~$P-cycA-$C => p57-cdk2~$P-cycA-$C • where $C in {_, cks1}. • cycE-$C =[cdk2~{p2}-cycE-$S]=> cycE~{T380}-$C • where $S in {_, cks1} and $C in {_, cdk2~?, cdk2~?-cks1} • 147-2733 rules, 165 proteins and genes, 500 variables, 2^500 states.

  18. Plan • Biocham Rule Language for Modeling Biochemical Systems • Syntax of objects and reactions • Semantics at 3 abstraction levels: Boolean, concentrations, populations • Biocham Temporal Logic for Formalizing Biological Properties • Computation Tree Logic for Boolean semantics • Constraint Linear Time Logic for concentration semantics • Machine Learning Rules and Parameters from Temporal Properties • Learning kinetic parameter values • Learning reaction rules • Conclusion, collaborations

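  19. 2. Formalizing Biological Properties in Temporal Logics • Boolean Semantics: Computation Tree Logic • [Figure: computation tree. Path quantifiers E, A range over non-determinism; temporal operators F, G, U range over time; illustrated combinations: AG, EU, EF]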

  20. Biological Properties formalized in CTL [Chabrier Fages 03] • About reachability: • Can the cell produce some protein P? reachable(P) == EF(P) • About pathways: • Is it possible to produce P without using or creating Q? E(¬Q U P) • Is state s2 a necessary checkpoint for reaching state s? • checkpoint(s2,s) == ¬E(¬s2 U s) • About stationarity: • Is a (partially described) state s a stable state? stable(s) == AG(s) • Is s a steady state (with possibility of escaping)? steady(s) == EG(s) • Can the cell reach a stable state? EF(stable(s)) • About oscillations: • Can the system exhibit a cyclic behavior w.r.t. the presence of P? oscillation(P) == EG((P ⇒ EF ¬P) ∧ (¬P ⇒ EF P))
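
To make the CTL operators used here concrete, the sketch below evaluates EU, EF and EG on a small explicit transition graph by fixpoint iteration, and derives the checkpoint query from EU. This is not how BIOCHAM works internally (the talk states that it delegates to the NuSMV symbolic model checker). The function names and the toy graph are hypothetical, and the graph is assumed to be total (every state has a successor).

```python
def pre_exists(transitions, target):
    """States having at least one successor inside `target`."""
    return {s for s, succs in transitions.items() if succs & target}


def check_EU(transitions, hold, goal):
    """States satisfying E(hold U goal), as a least fixpoint."""
    result = set(goal)
    while True:
        new = result | (pre_exists(transitions, result) & hold)
        if new == result:
            return result
        result = new


def check_EF(transitions, goal):
    """EF(goal) is E(true U goal)."""
    return check_EU(transitions, set(transitions), goal)


def check_EG(transitions, hold):
    """States satisfying EG(hold), as a greatest fixpoint."""
    result = set(hold)
    while True:
        new = result & pre_exists(transitions, result)
        if new == result:
            return result
        result = new


def checkpoint(transitions, s2, s):
    """States satisfying not E(not s2 U s)."""
    everything = set(transitions)
    return everything - check_EU(transitions, everything - s2, s)


# Toy graph: 0 -> 1 -> 3 (self-loop on 3), 0 -> 2 (self-loop on 2).
toy = {0: {1, 2}, 1: {3}, 2: {2}, 3: {3}}
print(check_EF(toy, {3}))         # states from which 3 is reachable
print(check_EG(toy, {0, 2}))      # states with a path staying inside {0, 2}
print(checkpoint(toy, {1}, {3}))  # states where 1 is a checkpoint for 3
```

In the toy graph every path from state 0 to state 3 passes through state 1, so state 0 satisfies the checkpoint query.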

  21. Temporal Logic Queries in Cell Cycle Model • Is C25~{s1,s2} a checkpoint for activating MPF? • biocham: nusmv(Ai(checkpoint(C25~{s1,s2},MPF~{s}))). • Ai(!E(!C25~{s1,s2} U MPF~{s})) is false • biocham: why. • MPF is present • Wee1m is present • 6 MPF=>MPF~{s}. • MPF~{s} is present • biocham: nusmv(Ai(loop(MPF,MPF~{s}))). • Ai(AG(MPF->EF(MPF~{s})&(MPF~{s}->EF(MPF)))) is true • biocham: nusmv(Ai(oscil(C25))). • Ai(AG(C25->EF(!(C25))&(!(C25)->EF(C25)))) is true

  22. Cell Cycle Benchmark with Kohn’s Model • 147-2733 rules, 165 proteins and genes, 500 variables, 2^500 states. • BIOCHAM NuSMV symbolic model-checker time in seconds:

  23. Plan • Biocham Rule Language for Modeling Biochemical Systems • Syntax of objects and reactions • Semantics at 3 abstraction levels: Boolean, concentrations, populations • Biocham Temporal Logic for Formalizing Biological Properties • Computation Tree Logic for Boolean semantics • Constraint Linear Time Logic for concentration semantics • Machine Learning Rules and Kinetics from Temporal Properties • Learning kinetic parameter values • Learning reaction rules • Conclusion, collaborations

  24. 3. Learning Rules from Temporal Properties • Theory T: BIOCHAM model • molecule declarations • reaction rules: complexation, phosphorylation, etc… • Training Examples φ: biological properties in temporal logic • Reachability • Checkpoints • Stable states • Oscillations • Bias P: Rule pattern or parameter range • Kind of reaction rules to learn • Find R in P such that T,R |= φ • Theory Revision framework [de Raedt 92]

  25. Learning Interaction Rules in the Boolean Semantics • Example: MPF degradation rules erased • biocham: delete_rules({MPF~{s}=>_. , MPF=>_. , • MPF~{s}=>MPF. , MPF=>MPF~{s}.}). • biocham: absent(IE). add_rule(_=>IE). add_rule(IE=>_). • biocham: add_specs({ Ei(reachable(IE)), Ai(oscil(IE)), • Ai(AG((!(APC))->checkpoint(IE,APC))), • Ai(AG((!(IE))->checkpoint(MPF,IE))) }). • biocham: check_all. • Specification not satisfied: Ai(AG(!(APC)->!(E(!(IE) U APC)))) is false • biocham: revise_model. • Deletion(s): _=[MPF]=>APC. _=>IE. • Addition(s): _=[IE]=>APC. _=[MPF]=>IE.

  26. Theory Revision Algorithm • General idea of constraint programming: replace a generate-and-test algorithm by a constrain-and-generate algorithm. • Anticipate whether one has to add or remove a rule? • ACTL formulae contain only A quantifiers: checkpoint, … • If false, remains false after adding a rule → delete a rule • Remove a rule on the path given by the model checker (why command) • ECTL formulae contain only E quantifiers: reachability, oscillation, … • If false, remains false after deleting a rule → add a rule • Unclassified CTL formulae • Mixed E and A quantifiers
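
The classification step can be sketched as a small recursive function over a formula tree. This is an illustration of the idea only, not the algorithm implemented in BIOCHAM. It assumes that quantifiers are counted after pushing negations inward (so that checkpoint, written ¬E(¬s2 U s), counts as an A-only formula, as the slide states). The tuple encoding of formulae and the function names are hypothetical.

```python
CTL_OPERATORS = {"AG", "AF", "AX", "AU", "EG", "EF", "EX", "EU"}


def quantifier_kinds(formula, positive=True):
    """Set of effective path quantifiers ('A', 'E') of a formula.

    A formula is an atom (str) or a tuple (operator, arg, ...).
    A quantifier under an odd number of negations is counted as its dual.
    """
    if isinstance(formula, str):
        return set()
    op, *args = formula
    if op == "not":
        return quantifier_kinds(args[0], not positive)
    if op == "implies":
        # The left-hand side of an implication is implicitly negated.
        return (quantifier_kinds(args[0], not positive)
                | quantifier_kinds(args[1], positive))
    kinds = set()
    if op in CTL_OPERATORS:
        quantifier = op[0]
        if not positive:
            quantifier = "A" if quantifier == "E" else "E"
        kinds.add(quantifier)
    for arg in args:
        kinds |= quantifier_kinds(arg, positive)
    return kinds


def revision_action(formula):
    """Suggested repair when `formula` is false in the current model."""
    kinds = quantifier_kinds(formula)
    if kinds == {"A"}:
        return "delete a rule"  # adding rules cannot make an ACTL formula true
    if kinds == {"E"}:
        return "add a rule"     # deleting rules cannot make an ECTL formula true
    return "unclassified"


checkpoint = ("not", ("EU", ("not", "s2"), "s"))
reachable = ("EF", "P")
print(revision_action(checkpoint), "|", revision_action(reachable))
```

The reasoning behind the two cases: adding a rule only adds paths, so a violated universal property stays violated, and deleting a rule only removes paths, so an unsatisfied existential property stays unsatisfied.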

  27. Constraint LTL Logic for Concentration Semantics • Constraints over concentrations and derivatives as FOL formulae over the reals: • [M] > 0.2 • [M]+[P] > [Q] • d([M])/dt < 0 • Constraint LTL operators for time F, U, G (no non-determinism). • F([M]>0.2) • FG([M]>0.2) • F ([M]>2 & F (d([M])/dt<0 & F ([M]<2 & d([M])/dt>0 & F(d([M])/dt<0)))) • oscil(M,n) • Language to formalize the relevant properties observed in experiments

  28. Example of Parameter Learning in Cell Cycle • k1cc for _=>MPF~{s}. • biocham: parameter(k1cc,1). • biocham: numerical_simulation(100). • Simulation time: 2.123s • biocham: plot. • biocham: trace_get([k1cc],[(0,1)],20, • oscil(MPF,5),100). • Found parameter(k1cc,0.25).
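
The idea behind this kind of parameter search can be sketched as a scan over candidate values, keeping the first one whose simulated trace satisfies the property. The exact meaning of the `trace_get` arguments is not spelled out on the slide, so the reading used here (parameter interval, number of values tried, property, time horizon) is an assumption. The sketch also replaces the cell cycle model with a toy stand-in, samples of sin(k·t), and approximates an oscillation property by counting local maxima. All names are hypothetical.

```python
import math


def toy_trace(k, horizon=100.0, dt=0.1):
    """Hypothetical stand-in for a numerical simulation: samples of sin(k*t)."""
    n = int(round(horizon / dt))
    return [math.sin(k * i * dt) for i in range(n + 1)]


def count_peaks(values):
    """Number of strict local maxima in a sampled trace."""
    return sum(
        1
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] > values[i + 1]
    )


def scan_parameter(simulate, satisfies, low, high, steps):
    """Try steps+1 evenly spaced values in [low, high]; return the first hit."""
    for i in range(steps + 1):
        value = low + i * (high - low) / steps
        if satisfies(simulate(value)):
            return value
    return None


# Look for a value of k in [0, 1] giving at least 5 peaks before t=100.
found = scan_parameter(toy_trace, lambda tr: count_peaks(tr) >= 5, 0.0, 1.0, 20)
print(found)
```

The value found for the toy model has no relation to the `k1cc` value reported on the slide; only the search pattern is illustrated.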

  29. Traces from Numerical Simulation • From a system of Ordinary Differential Equations • dX/dt = f(X) • Numerical integration produces a discretization of time (adaptive step size Runge-Kutta and Rosenbrock method for stiff systems) • The trace is a linear Kripke structure: • (t_0,X_0), (t_1,X_1), …, (t_n,X_n)… • the derivatives can be added to the trace • (t_0,X_0,dX_0/dt), (t_1,X_1,dX_1/dt), …, (t_n,X_n,dX_n/dt)… • Equality x=v true if x_i ≤ v & x_{i+1} ≥ v or if x_i ≥ v & x_{i+1} ≤ v
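
The equality rule in the last bullet treats x=v as true wherever two consecutive samples bracket v, since a sampled trace rarely hits a value exactly. A minimal Python sketch of that test follows (the function name is hypothetical):

```python
def crossing_indices(values, v):
    """Indices i such that v lies between samples values[i] and values[i+1]."""
    return [
        i
        for i in range(len(values) - 1)
        if values[i] <= v <= values[i + 1] or values[i] >= v >= values[i + 1]
    ]


samples = [0.0, 1.0, 3.0, 2.0, 0.5]
print(crossing_indices(samples, 2.5))
```

In this example the value 2.5 is bracketed once on the way up (between 1.0 and 3.0) and once on the way down (between 3.0 and 2.0).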

  30. Constraint-Based LTL (Forward) Model Checking • Hypothesis 1: the initial state is completely known • Hypothesis 2: the formula can be checked over a finite period of time [0,T] • Simple algorithm based on the trace of the numerical simulation: • Run the numerical simulation from 0 to T producing values at a finite sequence of time points • Iteratively label the time points with the sub-formulae of f that are true: • Add f to the time points where a FOL formula f is true, • Add F f to the previous time points labeled by f, • Add f1 U f2 to the predecessor time points of f2 while they satisfy f1, • (Add G f to the states satisfying f until T (optimistic abstraction…))
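
The labeling procedure can be sketched with one backward pass over the trace per temporal operator. The talk states that the actual implementation is in Prolog, so the Python version below is only an illustration of the idea. The function names and the sample values are hypothetical, and each formula is represented simply as a list of truth values, one per time point.

```python
def label_atom(trace, predicate):
    """Truth value of a constraint (e.g. [M] > 0.2) at each time point."""
    return [predicate(state) for state in trace]


def label_F(f):
    """F f holds at i if f holds at some time point j >= i."""
    out, seen = [False] * len(f), False
    for i in reversed(range(len(f))):
        seen = seen or f[i]
        out[i] = seen
    return out


def label_G(f):
    """G f holds at i if f holds at every recorded point from i to T.

    Optimistic: nothing is known beyond the end of the trace.
    """
    out, all_true = [False] * len(f), True
    for i in reversed(range(len(f))):
        all_true = all_true and f[i]
        out[i] = all_true
    return out


def label_U(f1, f2):
    """f1 U f2 holds at i if f2 holds at i, or f1 holds at i and f1 U f2 at i+1."""
    out, nxt = [False] * len(f1), False
    for i in reversed(range(len(f1))):
        nxt = f2[i] or (f1[i] and nxt)
        out[i] = nxt
    return out


# Hypothetical concentrations of M at successive time points.
trace = [{"M": 0.1}, {"M": 0.3}, {"M": 0.15}, {"M": 0.25}, {"M": 0.3}]
high = label_atom(trace, lambda s: s["M"] > 0.2)
print(label_F(high)[0])           # F([M]>0.2) at the initial time point
print(label_F(label_G(high))[0])  # FG([M]>0.2) at the initial time point
```

A formula is considered satisfied by the trace when its label is true at the first time point; nested formulae are handled by feeding the output of one labeling function into another.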

  31. Conclusion • The biochemical abstract machine BIOCHAM offers: • A simple rule-based language for modeling biochemical processes • Molecule concentration semantics (ODE) • Boolean semantics: presence/absence of molecules • A powerful temporal logic language for formalizing biological properties • CTL (implemented with NuSMV model checker) • Constraint LTL (implemented in Prolog) • An original machine learning system • Interaction rule discovery (from CTL specification) • Parameter estimation (from constraint LTL specification) • A repository of models: cell-cycle control, signaling pathways… (SBML) • http://contraintes.inria.fr/CMBSlib

  32. Collaborations • STREP APRIL 2: Applications of probabilistic inductive logic programming • Luc de Raedt, Freiburg, Stephen Muggleton, Imperial College London, … • Learning in a probabilistic logic setting • NoE REWERSE: Reasoning on the web with rules and semantics • François Bry, Munich, Rolf Backofen, Jena, Mike Schroeder, Dresden, … • Connecting Biocham to the semantic web: gene and protein ontologies • INRIA Bang, Jean Clairambault, Benoît Perthame • INSERM, Villejuif, Francis Lévi “Cancer chronotherapies” • ULB, Albert Goldbeter, Bruxelles • Coupled BIOCHAM models of cell cycle, circadian cycle, drugs.
