Special Topics in Computational Biology: Formal Methods in Systems Biology

Spring, 2008. Chris Langmead, Department of Computer Science, Carnegie Mellon University. James Faeder, Department of Computational Biology, University of Pittsburgh School of Medicine.

Presentation Transcript


  1. Special Topics in Computational Biology: Formal Methods in Systems Biology Spring, 2008 Chris Langmead Department of Computer Science Carnegie Mellon University James Faeder Department of Computational Biology University of Pittsburgh School of Medicine

  2. General Info • Course Numbers: • CMU 15-872(A) • CMU 02-730 • Pitt CMPBIO 2045 (Arts & Sciences) • Pitt MSCBIO 2045 (School of Medicine) • Location: Newell-Simon Hall (NSH) 3002 - OK? • Time: Tu, Th 1:30-2:50 PM • Instructors • Chris Langmead (cjl@cs.cmu.edu) • Jim Faeder (faeder@pitt.edu) • Office Hours: By appointment (please email) • Course Wiki: http://bionetgen.org/index.php/Formal_Methods_in_Systems_Biology (email Jim for account)

  3. Course Format: An Informal Course about Formal Methods • Introductory lectures (two weeks) • Students will read and present research papers • Sign up for open dates on the wiki (25 - projects) • Students will design and complete a course project on a subject of special interest • Grading is based on completion of work • Flexibility depending on course enrollment • Journal club • Focused project • Review article

  4. Encouragement • Opportunity to learn about new areas and methods that will be of direct interest in your research. • (True for the “instructors” as well) • We will operate as a multi-disciplinary team • Computer Scientists, Physicists, Chemists, Engineers, Mathematicians, …, Biologists • Good communication essential

  5. Products of the Course • Comprehensive bibliography in wiki format • Research projects leading to publishable results in the field • Review article (?) • Improved organization and presentation skills • Participation on a multi-disciplinary team

  6. Introductions • Your name • Your university, department, research area(s) and research advisor • Your educational background • Computer Science, Math, Physics, etc. • Your goals in taking the course

  7. Outline of Today’s Lecture • Definition of terms • Goals • Examples of Successful Abstractions • Flux Balance Analysis • Mass Action Kinetics • Brief survey of topics

  8. Importance of Symbols • Invention of symbol for zero and decimal system for writing numbers “among the greatest human inventions.” • 3 known independent inventions • In each case, development took centuries • Major impact on trade, culture, and philosophy. • Celebration of zero dot in Sanskrit poetry: “The dot on her forehead / Increases her beauty tenfold, / Just as a zero dot [sunya-bindu] / Increases a number tenfold.” - Biharilal

  9. Key Definitions - Formal Methods • In computer science and software engineering, formal methods are mathematically-based techniques for the specification, development and verification of software and hardware systems. • The use of formal methods for software and hardware design is motivated by the expectation that, as in other engineering disciplines, performing appropriate mathematical analyses can contribute to the reliability and robustness of a design. • However, the high cost of using formal methods means that they are usually only used in the development of high-integrity systems, where safety or security is important. - WIKIPEDIA

  10. Expanded View of Formal Methods • Formal abstractions that may be used to model the system of interest • In addition to systems that can be formally analyzed, we will consider representations that can only be fully explored by simulations.

  11. Key Definitions - Systems Biology • Systems biology is a relatively new biological study field that focuses on the systematic study of complex interactions in biological systems, thus using a new perspective (integration instead of reduction) to study them. • Particularly from 2000 onwards, the term is used widely in the biosciences, and in a variety of contexts. • Because the scientific method has been used primarily toward reductionism, one of the goals of systems biology is to discover new emergent properties that may arise from the systemic view used by this discipline in order to understand better the entirety of processes that happen in a biological system. - WIKIPEDIA

  12. Origin of Systems Biology • Completion of genome projects is major inspiration • Provided “parts list” for the cell • Next obvious step is to ask how parts work together to carry out function?

  13. Vision for Role of Computer Science in Systems Biology • “Computer science could provide the abstraction[s] needed for consolidating knowledge of biomolecular systems” • “...the abstractions, tools and methods used to specify and study computer systems should illuminate our accumulated knowledge about biomolecular systems.” Regev and Shapiro, “Cells as Computation,” Nature (2002).

  14. Abstract Representations in Biology • DNA sequence represented by strings with 4 letter alphabet (ATGC) • Protein sequence and structure • Strings with 20 letter alphabet • Set of 3D atomic coordinates (PDB file) The KaiC hexamer, a Circadian clock protein. From pdb.org.

  15. (Some) Desirable Properties of an Abstract Representation • Relevant / accurate • Computable • Understandable • Extensible • Scalable • Modular • Hierarchical (Properties 1-4 from Regev and Shapiro, “Cells as Computation,” Nature (2002).)

  16. An Irony • CS community aims to provide powerful abstract representations to improve understanding of systems. • Manner of reporting results - technical reports in conference proceedings - presents major barrier to wider adoption by science and engineering communities. • There is a need for better communication among disciplines!

  17. Sometimes formalism creates a barrier

  18. Example: Red blood cell model

  19. Agenda • We are looking for useful abstractions that can improve our understanding of how biological systems behave

  20. Goals • Language(s) for constructing whole-cell models (comprehensive, system-wide) • Formal analysis (reasoning) of such models • Simulation of models on distributed systems • Combination of analysis and simulation to predict behavior of models • genotype → phenotype

  21. Challenges • Accuracy • Missing interactions • Computability • Requirement to perform simulations for many properties of interest • Poor scaling of simulations • Understanding • Problem of network visualization • Extensibility • Missing biophysics • Scalability • Need to compute behavior on multiple scales, e.g. tissue, cell, cytoplasm, nucleus

  22. Mathematical vs. Computational Models • Consider an elementary chemical reaction r1: A + B -> C • Mathematical: (equation not preserved in this transcript) • Computational: module A : [0..N] init N; [r1] (A > 0) -> k*A*B: (A' = A - 1); … endmodule • How important is this distinction? Fisher & Henzinger, Nat. Biotechnol. (2007).
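A minimal Python sketch of the distinction raised on slide 22, for the reaction r1: A + B -> C. It is only an illustration under stated assumptions: the ODE form, the updates for B and C, the exponential waiting time, and all function names and parameter values are hypothetical additions, since the slide gives only the guard and update for A.

```python
import random

def ode_euler(a0, b0, k, t_end, dt=1e-3):
    """Mathematical view (assumed form): dA/dt = dB/dt = -k*A*B, dC/dt = k*A*B,
    integrated with a fixed-step forward Euler scheme over real-valued amounts."""
    a, b, c = float(a0), float(b0), 0.0
    for _ in range(int(round(t_end / dt))):
        rate = k * a * b
        a -= rate * dt
        b -= rate * dt
        c += rate * dt
    return a, b, c

def execute_r1(a0, b0, k, t_end, rng):
    """Computational view: a state machine over integer molecule counts.
    The guard (A > 0 and B > 0) enables r1; firing applies A' = A - 1,
    and (by assumption) B' = B - 1 and C' = C + 1. The waiting time is
    drawn from an exponential distribution with rate k*A*B."""
    a, b, c, t = a0, b0, 0, 0.0
    while a > 0 and b > 0:
        t += rng.expovariate(k * a * b)
        if t > t_end:
            break
        a, b, c = a - 1, b - 1, c + 1
    return a, b, c

if __name__ == "__main__":
    print("ODE:       ", ode_euler(50, 30, 0.01, 1.0))
    print("Executable:", execute_r1(50, 30, 0.01, 1.0, random.Random(0)))
```

Both views conserve A + C and B + C; the executable version makes the discreteness of the state and the guard condition explicit, which is the kind of restriction slide 23 refers to.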

  23. Tension between Accuracy and Computability • Application of formal methods requires that elements of representation be relatively simple. • For example, a representation that includes all analytical functions in mathematics might not be useful - impossible to make predictions. • In general, increasing the complexity of the representation limits ability for analysis. • Representations are sometimes chosen for amenability to analysis rather than realism - e.g. boolean networks. • Computational (“executable”) models tend to make restrictions explicit.

  24. Some successful abstractions in systems biology • Flux Balance Analysis • Genome-wide models of metabolism • Mass Action Kinetics • Cell-cycle model • Growth factor signaling model

  25. Network Reconstruction (2D Annotation) B. O. Palsson, Nature Biotechnology 22, 1218-1219 (2004)

  26. Network Reconstruction (cont.) • Wiring diagram for the components in a cell • Elements are • Molecular Components (Species) • Interactions (Reactions) • Additional detail can be added. • Genome-wide reconstructions for metabolism are available for many model organisms (including Homo sapiens!) • “All such interactions are ultimately represented by a genome-scale stoichiometric matrix—a two-dimensional genome annotation.” B. O. Palsson, Nature Biotechnology 22, 1218-1219 (2004)

  27. Overview of Flux Balance Analysis • Genome-wide reconstruction of metabolic network • Assume steady state • Assume optimal growth (biomass production)
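The three assumptions on slide 27 are commonly combined into a linear program. The formulation below is a sketch in standard notation, not taken from the slides: S is the stoichiometric matrix from slide 26, while the flux vector v, the objective weights c (selecting the biomass reaction), and the flux bounds are symbols assumed here for illustration.

```latex
\begin{aligned}
\text{steady state:}\quad & \frac{d\mathbf{x}}{dt} = S\,\mathbf{v} = \mathbf{0} \\
\text{optimal growth:}\quad & \max_{\mathbf{v}} \; \mathbf{c}^{\mathsf{T}} \mathbf{v} \\
\text{subject to:}\quad & S\,\mathbf{v} = \mathbf{0}, \qquad
  \mathbf{v}^{\min} \le \mathbf{v} \le \mathbf{v}^{\max}
\end{aligned}
```

Because both the objective and the constraints are linear in v, the problem can be handed to a standard Linear Programming solver.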

  28. Genome-Wide Reconstruction of Haemophilus influenzae Edwards, J. S. et al. J. Biol. Chem. 1999;274:17410-17416

  29. Single and double deletion in the central metabolic pathways of H. influenzae Edwards, J. S. et al. J. Biol. Chem. 1999;274:17410-17416

  30. What Accounts for Success? • Knowledge Base • Metabolic chemistry known from >50 years of biochemistry and genome sequence • Simple Abstraction • Biochemistry reduced to list of reaction stoichiometries • Powerful Computation Method • Highly optimized solvers for Linear Programming problem • Extensibility • Non-optimal growth in mutants • Constraints arising from molecular crowding

  31. Cellular Signal Transduction • Figure labels: ligand, receptor, kinase, adaptor, SH2 domain, SH3 domain, plasma membrane, signaling complex • Processes shown: ligand-receptor binding, aggregation, transphosphorylation

  32. Mass Action Kinetics Differential Equations

  33. Reaction Network Model of Signaling Kholodenko et al., J. Biol. Chem. 274, 30169 (1999)

  34. Comparing Model and Experiment Experimental Data Simulation Results

  35. Benefits of Mass Action Kinetic Modeling • Large knowledge base of signaling biochemistry • Models dynamical behavior • Computational Methods Well Established • ODE solvers for continuous systems • Nonlinear Dynamics Theory • Extensibility • Stochastic Simulation Algorithm for discrete systems • Spatially-resolved models can be built on same mass action equations

  36. Limitations of Mass Action Kinetic Modeling • Rapidly expanding knowledge base • Many components and interactions unknown • Lack of precision • ad hoc assumptions to limit combinatorial explosion (next lecture) • Large sets of nonlinear ODEs are difficult to simulate or analyze • No comprehensive models yet

  37. Map of Signaling Initiated by a Single Family of Receptors Oda and Kitano (2006) Mol. Syst. Biol.

  38. Map of Signaling Initiated by a Single Family of Receptors Analysis is limited to simple graph theoretic measures and qualitative discussions of architecture. Oda and Kitano (2006) Mol. Syst. Biol.

  39. (Partial) List of Topics • Boolean Networks • Petri Nets • Statecharts • Process Algebras • Agent-Based Modeling • Hybrid Systems • Model Checking • Simulation Algorithms

  40. Brief Overview of Two Useful Abstractions • Boolean Networks • Petri Nets • Statecharts • Process Algebras • Agent-Based Modeling • Hybrid Systems • Model Checking • Simulation Algorithms

  41. Boolean Networks • BN model of cell cycle in budding yeast (figure label: G1) Li, F., et al. PNAS 101, 4781–4786 (2004).

  42. Boolean Networks • BN model of cell cycle in budding yeast (figure label: G1) • Update: (rule not preserved in this transcript) Li, F., et al. PNAS 101, 4781–4786 (2004).

  43. Boolean Networks • BN model of cell cycle in budding yeast (figure label: G1) • Update: (rule not preserved in this transcript) • Blue arrows form stable basin of attraction Li, F., et al. PNAS 101, 4781–4786 (2004).
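To make the ideas of an update rule and a basin of attraction concrete, here is a small Python sketch. It assumes a synchronous threshold update (a node switches on when its net input is positive, off when negative, and otherwise keeps its value). The three-node wiring, the node names, and the function names are hypothetical and are not the yeast cell-cycle network from the slide.

```python
from itertools import product

# Hypothetical 3-node wiring, NOT the yeast network shown on the slide.
# WEIGHTS[i][j] is the effect of node j on node i: +1 activates, -1 inhibits.
NODES = ("X", "Y", "Z")
WEIGHTS = (
    (0, 0, -1),  # Z inhibits X
    (1, 0, 0),   # X activates Y
    (0, 1, 0),   # Y activates Z
)

def update(state):
    """Synchronous threshold update: on if net input > 0, off if < 0, else unchanged."""
    nxt = []
    for i, row in enumerate(WEIGHTS):
        total = sum(w * s for w, s in zip(row, state))
        nxt.append(1 if total > 0 else 0 if total < 0 else state[i])
    return tuple(nxt)

def attractor(state):
    """Iterate until a state repeats; return the cycle (length 1 means a fixed point)."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = update(state)
    return tuple(seen[seen.index(state):])

def basins():
    """Group all 2^n initial states by the attractor they eventually reach."""
    result = {}
    for s in product((0, 1), repeat=len(NODES)):
        cyc = attractor(s)
        # Canonical rotation so the same cycle always yields the same key.
        key = min(cyc[i:] + cyc[:i] for i in range(len(cyc)))
        result.setdefault(key, []).append(s)
    return result

if __name__ == "__main__":
    for att, members in basins().items():
        print(att, "basin size:", len(members))
```

In this toy network, six of the eight states flow to the fixed point (0, 1, 1), a small analogue of one dominant basin of attraction.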

  44. Balance Sheet for BNs • Pro • Models may be constructed on basis of scant data* • Fast computation • Strong analysis tools (?) • Good for reasoning about stability and robustness • Con • Two levels may not be enough • Lack of compositionality • Not hierarchical, but may be embedded in more complex models. *Li S, Assmann SM, Albert R (2006) Predicting Essential Components of Signal Transduction Networks: A Dynamic Model of Guard Cell Abscisic Acid Signaling. PLoS Biol 4(10): e312

  45. Petri Nets • Figure labels: tokens, places, transitions Chaouiya, C. Petri net modelling of biological networks. Brief. Bioinform. 8, 210–219 (2007).

  46. Petri Nets • Time Evolution • Figure labels: tokens, places, transitions Chaouiya, C. Petri net modelling of biological networks. Brief. Bioinform. 8, 210–219 (2007).

  47. Petri Nets Generalize Network Reconstruction • Figure labels: p3, t2, p4 • Incidence matrix C corresponds to stoichiometric matrix S Chaouiya, C. Brief. Bioinform. 8, 210–219 (2007).

  48. Some useful formal properties of PNs • P-invariants (x^T C = 0) ~ Mass Conservation • T-invariants (C y = 0) ~ Loops / Elementary Modes • Reachability - whether a state can be reached • Liveness - whether a transition can be fired
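The properties above can be illustrated on a one-transition Petri net for the reaction A + B -> C from slide 22. This Python sketch assumes the usual conventions (incidence matrix C with one row per place and one column per transition; a P-invariant x satisfies x^T C = 0). The data layout and function names are hypothetical.

```python
# One-transition Petri net for r1: A + B -> C. Names and layout are illustrative.
PLACES = ("A", "B", "C")
TRANSITIONS = ("r1",)
PRE = {"r1": {"A": 1, "B": 1}}   # tokens consumed when r1 fires
POST = {"r1": {"C": 1}}          # tokens produced when r1 fires

def incidence():
    """Incidence matrix C[p][t] = produced - consumed (rows: places, columns: transitions)."""
    return [[POST[t].get(p, 0) - PRE[t].get(p, 0) for t in TRANSITIONS] for p in PLACES]

def enabled(marking, t):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking[p] >= n for p, n in PRE[t].items())

def fire(marking, t):
    """Return the marking after firing t (time evolution by one step)."""
    if not enabled(marking, t):
        raise ValueError(f"transition {t} is not enabled")
    new = dict(marking)
    for p, n in PRE[t].items():
        new[p] -= n
    for p, n in POST[t].items():
        new[p] += n
    return new

def is_p_invariant(x):
    """x holds one weight per place; it is a P-invariant when x^T C = 0."""
    c = incidence()
    return all(
        sum(x[i] * c[i][j] for i in range(len(PLACES))) == 0
        for j in range(len(TRANSITIONS))
    )

def reachable(m0):
    """All markings reachable from m0 by exhaustive firing (terminates only for bounded nets)."""
    start = tuple(m0[p] for p in PLACES)
    seen, frontier = {start}, [start]
    while frontier:
        m = dict(zip(PLACES, frontier.pop()))
        for t in TRANSITIONS:
            if enabled(m, t):
                nxt = tuple(fire(m, t)[p] for p in PLACES)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
    return seen

if __name__ == "__main__":
    m0 = {"A": 2, "B": 1, "C": 0}
    print(incidence(), fire(m0, "r1"), is_p_invariant((1, 0, 1)), reachable(m0))
```

Here the weight vector (1, 0, 1) is a P-invariant, expressing that the token count A + C is conserved by every firing, which is the mass-conservation reading given on the slide.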

  49. Overview of PNs • PNs are graphs, and provide tight connection between visualization and modeling • PN formalism is isomorphic to network reconstruction formalism (reaction networks) • Many extensions are possible to overcome limitations • Colored Petri Nets, Hierarchical CPNs, Multi-level PN, Stochastic PNs, etc. • Extensions provide further modeling capabilities at the expense of analysis.