Methods for Construction and Analysis of Computational Models in S ystems Biology

Methods for Construction and Analysis of Computational Models in Systems Biology AndrzejMizera Institute of Fundamental Technological Research Polish Academy of Sciences amizera@ippt.gov.pl University of Luxembourg

Outline • Introduction: Systems Biology • Part I: Model construction • Parameter Estimation • Model validation • Model identifiability • Part II: Higher-level model analysis methodologies • Quantitative submodel comparison • Techniques for model modifications (extensions & simplifications) • model refinement of ODE-based self-assembly models • Summary University of Luxembourg

Introduction ATP synthase — a marvellous rotary engine of the cell Nature Reviews Molecular Cell Biology 2, 669-677 (September 2001) http://micro.magnet.fsu.edu/cells/animals/animalmodel.html Systems biology essentiallyadvocates a departure from the reductionist viewpoint, while putting emphasison the holistic approach towards the analysis of a biological system. Action of muscle Ca2+ ATPase Molecular Cell Biology. 4th edition.Lodish H, Berk A, Zipursky SL, et al.New York: W. H. Freeman; 2000. University of Luxembourg Biological processes have long been seen as static systems comprisinga vast number of loosely linked, highly detailed, molecular devices.

Introduction Systems biology: • emerging research field, • aims to study biochemical and biological systems from a holistic perspective, • with the goal toprovide a comprehensive, system-level understanding of cellular behaviour. “... the objective of systems biology is defined as the understandingof network behaviour, and in particular their dynamicaspects, which requires the utilization of mathematical modellingtightly linked to experiment.” - WTEC Panel Report on InternationalResearch and Development in Systems Biology, 2005- University of Luxembourg

Introduction • Systems biology, unlike “traditional” biology, focuses on high-level concepts. • The very terminology of systems biology is “foreign” to “traditional”biology and it marks its drastic shift in research paradigm. Systems biology network signalling synchronisation control regulation robustness competition hierarchical design parallelism efficiency component University of Luxembourg

Introduction The key concepts in systems biology have already beenstudied for a long time in computer science (albeit from different perspectives). A key contribution that computer science brings to systemsbiology is the ability to • manipulate, • analyse, and • reason about such system-levelconcepts and structures. • For example, mathematical modelling, formalsystem specifications, control design, and others are by now mainstreamtechniques in systems biology. University of Luxembourg

Introduction • My research concerns a number of challenges of computational modelling in Systems Biology. • It is focused on the development and utilization of different methodologies having their origins in the fields of computer science and mathematics. • The presented methodologies are applied in the modelling of two biological processes chosen as modelling case studies: • the heat shock response mechanism in eukaryotic cells (HSR) • the in vitro self-assembly of intermediate filaments from tetramericvimentin. • The developed methodologies are general in nature: can be applied for the modelling of other biological processes as well. University of Luxembourg

PART I: Model construction University of Luxembourg

The heat shock response • Cell’s response to elevated temperatures • Intense research on HSR in the last years • HSR is a very well-conserved regulatory network across all eukaryotes (bacteria have a similar mechanism) • Good candidate for deciphering the engineering principles of regulatory networks • Heat shock proteins are very potent chaperones (sometimes called the “master proteins” of the cell) • Involved in a large number of regulatory processes • Also in anti-inflammatory processes • Found in extra-cellular environment, which may suggest they are used for signaling • Major role in the resilience of cancer cells; attractive as targets for cancer treatment • Tempting for a biomodelling, SysBio project because it involves relatively few main actors (at least in a first, simplified presentation) University of Luxembourg

Heat shock response: main actors • Heat shock proteins (HSP) • Very potent chaperones • Main task: assist the refolding of misfolded proteins • Several types of them, we treat them all uniformly in our model with hsp70 as a base denominator • Heat shock elements (HSE) • Several copies found upstream of the HSP-encoding gene, used for the transactivation of the HSP-encoding genes • Treat uniformly all HSEs of all HSP-encoding genes • Heat shock factors (HSF) • Proteins acting as transcription factors for the HSP-encoding gene • Trimerize, then bind to HSE to promote gene transcription • Generic proteins • Consider them in two states: correctly folded and misfolded • Under elevated temperatures, proteins tend to misfold, exhibit their hydrophobic cores, form aggregates, lead eventually to cell death (see Alzheimer, vCJ, and other diseases) • Various bonds between these metabolites University of Luxembourg

HSE HSE Heat shock gene Heat shock gene MFP MFP MFP MFP MFP The molecular model for HSR 37C HSP HSP RNA pol HSP HSP HSP:HSF HSF 42C University of Luxembourg

The new molecular model • Transcription • HSF+HSF  HSF2 • HSF+HSF2 HSF3 • HSF3+HSE  HSF3:HSE • HSF3:HSE  HSF3:HSE+HSP • Backregulation • HSP+HSF  HSP:HSF • HSP+HSF2 HSP:HSF+HSF • HSP+HSF3 HSP:HSF+2HSF • HSP+HSF3:HSE  HSP:HSF+2HSF+HSE • Response to stress • PROT  MFP • HSP+MFP  HSP:MFP • HSP:MFP  HSP+PROT • Protein degradation • HSP   University of Luxembourg

The new molecular model • Several criteria were followed when introducing this molecular model: • as few reactions and reactants as possible; • include the temperature-induced protein misfolding; • include HSF in all its three forms: monomers, dimers, and trimers; • include the HSP-backregulation of the transactivation of the HSP-encoding gene; • include the chaperon activity of HSP; • include only well-documented, textbook-like reactions and reactants. University of Luxembourg

The Law of Mass Action • Introduced by Guldberg and Waage in 1864: Waage, P.; Guldberg, C. M. Forhandlinger: Videnskabs-Selskabeti Christiana1864, 35 The rate of any given reaction is proportional to the concentration of reactants. For example: SA + SB P • In modern chemistry derived using statistical mechanics University of Luxembourg

The mathematical model University of Luxembourg

The mathematical model Modelling of the heat-induced misfolding • Adapted from Pepper et al (1997), based on studies of Lepock (1989, 1992) on differential calorimetry (T)=(1-0.4/eT-37) 14.5 10-71.4T-37 s-1 • Formula valid for temperatures between 37 °C and 45 °C, gives a generic protein misfolding rate per second. University of Luxembourg

The mathematical model Our model contains 3 mass-conservation relations: • The total number of HSF molecules is constant, • The total number of heat shock elements (HSE) is constant, • and the total number of Protein molecules (PROT) is conserved. University of Luxembourg

Parameter estimation • Data readily available for the goal: Kline, Morimoto (1997) – heat shock of HeLa cells at 42 °C for up to 4 hours, data on DNA binding (HSF3:HSE) • Requirements for the model: • 17 kinetic constants, 10initial values to estimate, but: • 3 conservation relations available; • the initial value of the variables of the model is a steady state for temperature set to 37 °C, which gives 7more independent algebraic equations (each of them quadratic). • Altogether: 17 independent values • Other conditions: total HSF somewhat low, refolding a fast reaction, HSPs long-lived proteins University of Luxembourg

Parameter estimation • Parameter estimation performed with (www.copasi.org) Hoops, S., Sahle, S., Gauges, R., Lee, C., Pahle, J., Simus, N., Singhal, M., Xu, L., Mendes, P., and Kummer, U. (2006). COPASI — a COmplexPAthwaySImulator. Bioinformatics22, 3067-74. • User-friendly • Stochastic and deterministic time course simulation • Steady state analysis • Sensitivity analysis • Metabolic control analysis • Mass conservation analysis • Optimization of arbitrary objective functions • SBML-based • Excellent for parameter estimation • FREE for non-commercial use University of Luxembourg

Strategy for parameter estimation • Ideal approach: • Solve analytically the steady state equations at 37 °C. • Use the solution to decrease the number of parameters and initial valuesto the independent ones. • Do parameter estimation on the remaining independent variables to fit the model based on the data at 42 °C. • Problem: the steady state (37 °C) equations cannot be solved because of they high overall degree. • Solution: use the Kline-Morimoto experimental data to fit parameters and ask also that the fluctuations at 37°C are (close to) 0. • Duplicate the model (the same parameter values!) and run both at the same time (37 & 42 °C). University of Luxembourg

Parameter estimation results University of Luxembourg

All data is in relative terms with respect to the highest value in the graph so that it can be easily compared Predictions and validation • Higher the temperature, higher the response • Prolonged transcription at 43 °C confirmed • Unlike previous models • Heat shock removed at the peak of the response confirms a more rapid attenuation phase University of Luxembourg

Predictions and validation • Experiment: two waves of heat shock, the second applied after the level of HSP has peaked • Observation: the second heat shock response much milder than the first • The reason is that the cell is better prepared to deal with the second heat shock • Therapeutic consequences have been suggested: “train” the cell for heat shock by an initial milder heat shock • The model prediction is in line with the experimental observation • Dotted line: heat shock at 42 °C for two hours, behavior followed up to 20 hours • Continuous line: heat shock at 42 °C for two hours, followed by a second wave of heat shock after the level of HSP has peaked University of Luxembourg

Model identifiability • The problem of model identifiability: the question of the uniqueness of the set of parameters that fulfil the imposed conditions. • By repeating from scratch the whole parameter estimation procedure, we obtained several different sets of parameter values that result both in a good fit of the model to the experimental data, as well as in initial values that are steady-states of the model at 37 °C. • However, all these parameter sets failed the model validation tests with respect to the qualitative observations concerning the behaviour of cells under stress. • A new thorough method of searching for alternative numerical model fits • based on systematic parameter scan in the space determined by the considered ranges of parameter valuesLatin Hypercube Sampling- provides samples which are uniformly distributed over each parameter while the number of samples is independent of the number of parameters University of Luxembourg

Model identifiability Strategy: look for alternative models fits that are in agreement with the experimental data of Kline and Morimoto (1997), and satisfy the steady-state condition for the initial values. • Sampled N = 100.000 sets of parameter values • For each set, we estimated numerically the steady state of the model for a temperature value of 37 °C. We then set the initial state of the model as the calculated steady state. • Simulated the model for 14400s at a temperature of 42 °C. • Classified as non-responsive those parameter samples that led to low DNA binding level at the peak of the response, and excluded them from further analysis. • Only 31.506 out of the 100.000 samples were responsive, already a result pointing to difficulties in finding satisfactory alternative numerical fits. • For each model, we made a scatter plot for each variable and each parameter where we plotted the steady state values of each variable at 37 °C, against the values of the parameter. University of Luxembourg

Model identifiability 37 °C : most of the alternative fits predicted high levels of gene transcription in the absence of the heat shock none of the sampled models reached such a low level of HSF as the reference model the reference model is one of the very few models in which most of the HSF molecules are sequestered by HSP, i.e. at 37°C the response mechanism is turned off the basic model reaches the lowest valuesfor HSF dimers – in agreement with the observation that HSF dimersare unnoticeable in biological experiments University of Luxembourg

Model identifiability 42 °C : Conclusions While it is likely that a model of this size is not uniquely identifiable, our parameter scan showed that finding parameter values satisfying our model constraints is far from being easy. Our reference model obtained the lowest score of around 12, while the 13 best fits of the sampled models were in the range between 300 and 1000. All the other models had much worse scores, of more than 1000. University of Luxembourg

PART II: Higher-level model analysis methodologies University of Luxembourg

Model decomposition and quantitative submodel comparison • Various experimental investigations of a given biochemical system often lead to generation of a large variety of alternative molecular designs and models. • How to compare their functionality, efficiency, and robustness? • Comparing alternative models for a given biochemical system is, in general, a very difficult problem: • the models may focus on different aspects of the same system, • may consist of very different species and reactions, • numerical setups of the associated computational models play a crucial role in the quantitative comparison. It involves a deep analysis of the underlying network of reactions, the biological assumptions as well as the numerical setup. • Unbiased methods to objectively compare the alternative designs are needed. University of Luxembourg

Model decomposition and quantitative submodel comparison • The problem becomes somewhat simpler when the alternative designs are actually submodels of a larger model: • the underlying networks are similar, although not identical, and • the biological constrains are given by the larger model. • E.g., submodels are considered when striving to obtain a global picture of a large system’s architecture, i.e. understanding the interactions between various components. • Similar problems have been encountered in engineering sciences, e.g., in control theory. • One strategy is to adapt specific methods originating from these disciplines. University of Luxembourg

Model decomposition and quantitative submodel comparison Three methods were developed: • Local submodelscomparisonEl. Czeizler, Eu. Czeizler, R.-J. Back, and I. Petre. Control strategies for the regulation of the eukaryotic heat shock response. In P. Degano and R. Gorrieri (Eds.), Computational Methods in Systems Biology, LNCS, vol. 5688, pp. 111-125, Heidelberg, 2009. Springer-Verlag. • Discrete approach for comparing continuous submodelsEl. Czeizler, A. Mizera, and I. Petre. A Boolean approach for disentangling the numerical contribution of modules to the system-level behavior of a biomodel. Submitted, 2011. • A statistical method for quantitative submodel comparisonA. Mizera, El. Czeizler, and I. Petre. Methods for biochemical model decomposition and quantitative submodel comparison. Israel Journal of Chemistry, 51(1):151-164, 2011. University of Luxembourg

Model decomposition and quantitative submodel comparison These methods are: • using the control-based decomposition of the reference model, • comparing knockdown mutants of the reference model, i.e., submodelsmissing one or several of the modules, • illustrated on the eukaryotic heat shock response case study. University of Luxembourg

Model decomposition University of Luxembourg

Local submodels comparison • Aim: biologically unbiased analysis, i.e., elimination of accidental differences between models. • Question: initial conditions in the alternative models? • Taking them from the reference model biased comparison • Local submodel comparison: initial setup of each submodel constitutes a steady state in the absence of a trigger unbiased comparison • The method: Two initial biological constraints are imposed on the models: • identical chemistry (reaction rates) and mass constants (conservation relations) • initial values form a steady state under physiological conditions University of Luxembourg

Local submodels comparison Comparison of the mutants • Based on the numerical observations one can summarize the contribution of the modules to some performance indicators of the network.E.g., in the case of the HSR model these are: • economical use of the cellular resources, • speed of response to a heat shock, • effectiveness of the response, • scalability. University of Luxembourg

Discrete approach for comparing continuous submodels • Three conditions are imposed on the each knockdown mutant model: • kinetic rate constants: the numerical prediction of the mutant model fits in with the experimental data • initial conditions: form a steady state under physiological conditions • mass constants: are chosen to be identical to those of the reference model • All three constraints come as natural consequences of the fact that we consider all knockdown mutants as viable alternatives. University of Luxembourg

Discrete approach for comparing continuous submodels • For each knockdown mutant a Boolean formula is associated that describes the mutant’s control architecture (negation and conjunction of variables associated to regulating mechanisms). • Boolean formulas describing all mutants that show given behavioural properties are written.Example: which knockdown mutants can be both effective and economic? • The numerical comparison of the mutants is then performed by analysing the Boolean formulas associated with various behavioural properties. University of Luxembourg

Statistical method for quantitative submodel comparison • Statistical sampling of the reference model and mutant behaviour is performed: • the parameter value space of the reference model is scanned (e.g. Latin Hypercube Sampling) • each coordinate of the sampled parameter vectors is associated with one of the parameters in the reference model • For each sampled parameter vector: • parameters of the reference model and the submodel are set in accordance with the considered vector, • initial conditions of the reference model and the submodel are determined independently of each other by a systemic property (e.g. being in a steady state in a given setup). University of Luxembourg

Statistical method for quantitative submodel comparison • For each sampled parameter vector: • one numerical instance for the mutant and one for the reference model • numerical simulations are run for both of them in order to evaluate their functional effectiveness • The obtained results for each variant are summarized by use of some statistical measures (e.g. the moving median technique). University of Luxembourg

Refinement of ODE-based self-assembly models University of Luxembourg

Refinement of ODE-based self-assembly models Challenge: construction of models capable of capturing the evolution offilaments of certain length up to some arbitrarily chosen value. R. Kirmse et al. A Quantitative Kinetic Model for the in Vitro Assembly of Intermediate Filaments from TetramericVimentin. Journal of Biological Chemistry, 282(25):18563-18572, 2007. monomer simple model coiled-coil dimer tetramer extended model ULF University of Luxembourg

Refinement of ODE-based self-assembly models • The proposed methodology of self-assembly model refinement is a particularinstance of formal model refinement – a topic extensively studied in ComputerScience, especially in connection to formal software specifications. • Model refinement: • starting from an abstract model of a system, a more detailed model isconstructed • the refinement mechanism preserves already proven systemic quantitativeproperties of the original model (e.g. model fit, stochastic semantics) University of Luxembourg

Refinement of ODE-based self-assembly models Our method: an instance of data refinement, where one replaces a variable with a set of other variables in a way that introduces more details into the model, while keeping the model constraints unchanged. A. Mizera, Eu. Czeizler, and I. Petre. Self-assembly models of variable resolution. To appear in Transactions on Computational Systems Biology, 2012. • a generic formal modelfor the process of self-assembly is presented • the notion of model resolutionis introduced –a self-assembly mathematicalmodel is of resolution n if it allows for capturing the dynamics of the number(or concentration) of components that are exactly of size i, where 0 ≤ i ≤ n. • model refinement procedures preserving the fit to experimental data for a family of self-assembly ODE models are developed • to the best of our knowledge, this is the first time formal refinement is considered in the context of ODE-based mathematical models University of Luxembourg

Refinement of ODE-based self-assembly models • increasing the model resolution: an exact, constructive method based on analytical investigations of the ODEs is developed • decreasing the model resolution is more challenging: method based on symbolical computations but requires numerical investigations and simulations of the models • case study: the in vitro self-assembly of intermediate filaments University of Luxembourg

Summary University of Luxembourg

Summary • Issues related to model construction methodologies:parameter estimation, model validation, model identifiability,choice of deterministic vs stochastic modelling framework • Discuss and develop new techniques for the problem of modelcomparison. • methodologies for model decomposition • special case: comparison between submodels of a certain model • The problem of model modifications: various techniques useful forapplying simplifications or extensions to an already fitted and validatedmathematical model in such a way that the desired properties of themodel are retained (fitting experimental data is computationallyexpensive!). • computational heuristics for simplifying a biological model • model refinement University of Luxembourg

Summary • prof. Ion Petre Computational Biomodelling Laboratory ÅboAkademi University, Department of Information Technologies Turku Centre for Computer Science • prof. Barbara Gambin Institute of Fundamental Technological Research Polish Academy of Sciences University of Luxembourg

Methods for Construction and Analysis of Computational Models in S ystems Biology

Methods for Construction and Analysis of Computational Models in S ystems Biology

Presentation Transcript

Computational Models of Discourse Analysis

Computational Models of Discourse Analysis

Models and methods in systems biology

HIDDEN MARKOV MODELS IN COMPUTATIONAL BIOLOGY

Computational Models of Discourse Analysis

Hidden Markov models in Computational Biology

Hidden Markov models in Computational Biology

Computational Models of Discourse Analysis

Computational Models of Discourse Analysis

Computational Models of Discourse Analysis

Computational Models of Discourse Analysis

Computational Models of Discourse Analysis

Computational Models of Discourse Analysis

Computational Models of Discourse Analysis

Introduction to Probabilistic Models for Computational Biology

Computational Methods in Systems Biology

Computational Methods for Data Analysis

Computational Models of Discourse Analysis

Computational Models of Discourse Analysis

Computational Models of Discourse Analysis

Computational Models of Discourse Analysis

Computational Models of Discourse Analysis