Existing Standards in Systems Biology

Anatoly Sorokin, Computation Systems Biology Group, University of Edinburgh.

Presentation Transcript


  1. Existing Standards in Systems Biology • Anatoly Sorokin, Computation Systems Biology Group, University of Edinburgh

  2. Standard • 2000-2010 is the decade of standards in biology • 31 MIBBI standards • 56 OBO ontologies • About 80 exchange formats • Scope of interest • Language • Controlled vocabulary

  3. Standards and Languages • CML – description of chemical structure • MathML – representation of mathematical formulas • PSI – standard description of protein interaction data • AnatML – language to describe interaction at organ level • GeneOntology – standard and ontology to describe gene function and regulation

  4. Standards for Computational Systems Biology • BioPAX – language for exchanging biological network data between databases • SBML – language for exchanging biochemical models • CellML – language to describe mathematical models • SBGN – visual language for biological model description

  5. MI standards • Reporting guidelines specify the minimum amount of metadata (information) and data required to meet a specific aim • The aim is to provide enough metadata and data to enable the unambiguous reproduction and interpretation of an experiment • Normally informal, human-readable specifications that inform the development of formal data models (e.g. XML or UML) and data exchange formats

  6. Exchange format • Strict structure for exchanging data or models • Mainly XML • Well-defined meta-model, often supported by a software API

  7. Ontologies • “ontology deals with questions concerning what entities exist or can be said to exist, and how such entities can be grouped, related within a hierarchy, and subdivided according to similarities and differences” (Wikipedia) • Often used as a controlled vocabulary and description support framework • GeneOntology

  8. BioPAX • “Biological PAthway eXchange - A data exchange ontology and format for biological pathway integration, aggregation and inference”

  9. BioPAX Goals • BioPAX = Biological PAthway eXchange • Data exchange format for pathway data • Include support for these pathway types: • Metabolic pathways • Signaling pathways • Protein-protein, molecular interactions • Gene regulatory pathways • Genetic interactions • Accommodate representations used in existing databases such as BioCyc, BIND, WIT, aMAZE, KEGG, Reactome, etc. • PathwayCommons – collection of pathways in BioPAX • http://www.pathwaycommons.org

  10. BioPAX • BioPAX ontology and format in OWL (XML) • Ontology built using GKB Editor and Protégé • Semantic mapping still an issue • Level 1 represents metabolic pathway data • Level 2 adds support for molecular interactions, post-translational modifications and experimental description from the PSI-MI model (backwards compatible) • Level 3 adds support for generics and protein states, and rearranges the reaction representation

  11. BioPAX Ontology: Top Level [Diagram: Entity, Pathway, Interaction, Physical Entity; legend: subclass (is a), contains (has a)] • Pathway: a set of interactions, e.g. Glycolysis, MAPK, Apoptosis • Interaction: a set of entities and some relationship between them, e.g. Reaction, Molecular Association, Catalysis • Physical Entity: a building block of simple interactions, e.g. Small molecule, Protein, DNA, RNA

  12. BioPAX Ontology: Interactions [Class tree: Interaction > Physical Interaction > Control (Catalysis, Modulation) and Conversion (ComplexAssembly, BiochemicalReaction, Transport, TransportWithBiochemicalReaction)]
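
The interaction hierarchy on this slide can be sketched as plain Python classes. This is only an illustration: it assumes the BioPAX Level 2 arrangement, in which Control and Conversion sit under PhysicalInteraction and TransportWithBiochemicalReaction inherits from both Transport and BiochemicalReaction. The Python classes and the `lineage` helper are hypothetical and do not belong to any BioPAX library.

```python
# Hypothetical class sketch of the BioPAX interaction hierarchy (Level 2 layout assumed).
class Interaction: ...
class PhysicalInteraction(Interaction): ...

class Control(PhysicalInteraction): ...
class Catalysis(Control): ...
class Modulation(Control): ...

class Conversion(PhysicalInteraction): ...
class ComplexAssembly(Conversion): ...
class BiochemicalReaction(Conversion): ...
class Transport(Conversion): ...

# Assumption: this class combines both parents, as its name suggests.
class TransportWithBiochemicalReaction(Transport, BiochemicalReaction): ...


def lineage(cls):
    """Return the class names from cls up to the root, most specific first."""
    return [c.__name__ for c in cls.__mro__ if c is not object]


if __name__ == "__main__":
    print(" > ".join(reversed(lineage(Catalysis))))
```

Walking the lineage this way mirrors the "is a" relation from the ontology: every Catalysis is a Control, which is a PhysicalInteraction, which is an Interaction.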

  13. BioPAX Ontology: Physical Entities [Class tree: PhysicalEntity > Protein, Small Molecule, Complex, RNA, DNA]

  14. BioPAX and other standards [Diagram: coverage of database exchange formats (BioPAX, PSI-MI 2) and simulation model exchange formats (SBML, CellML) across molecular interactions (Pro:Pro, All:All), interaction networks (molecular, non-molecular), metabolic pathways, regulatory pathways (Pro:Pro, TF:Gene), genetic interactions, small molecules, biochemical reactions and rate formulas, each from low to high detail]

  15. Simulation-related standards [Diagram: columns Model, Simulation, Result; rows Minimal Requirements, Exchange format, Ontology; visible labels: SED-ML, SBRML, “?”; arrows labelled “implements” and “makes sense of”]

  16. SBML • “The Systems Biology Markup Language (SBML) is a computer-readable format for representing models of biochemical reaction networks. SBML is applicable to metabolic networks, cell-signaling pathways, regulatory networks, and many others.”

  17. SBML • Reaction • container for rate law • Species • reactants, products, or modifiers of reaction • Compartment • container for species • Parameter, Rule, Event

  18. Characteristics of SBML • Many top-level types, little nesting • Units, Compartment, Species, Parameter, Reaction, Rule, Function, Event • Non-modular structure • Next SBML ‘Level’ (3) will introduce modularity • Emphasis on reactions • Some math implicit • Explicit rate equations; implicit integration • Implicit concentration conversion between compartments • Compartments are physical containers for species • Spatial dimensions (volume, surface)
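
The split between “explicit rate equations” and “implicit integration” can be illustrated with a small sketch: the model states only the rate law, and the simulator decides how to integrate it. The reaction A → B, the rate constant and the forward Euler scheme below are assumptions chosen for illustration; SBML itself does not prescribe an integration method.

```python
import math


def rate_law(k, a):
    """Explicit rate equation, as a model would state it: v = k * [A]."""
    return k * a


def simulate(a0, b0, k, dt, steps):
    """Forward Euler integration of A -> B; the method is the simulator's choice."""
    a, b = a0, b0
    for _ in range(steps):
        v = rate_law(k, a)
        a -= v * dt  # A is consumed at rate v
        b += v * dt  # B is produced at the same rate
    return a, b


if __name__ == "__main__":
    a, b = simulate(a0=1.0, b0=0.0, k=0.5, dt=0.01, steps=1000)
    # Compare with the analytical solution [A](t) = [A]0 * exp(-k * t) at t = 10.
    print(a, math.exp(-0.5 * 10.0))
```

Swapping the Euler loop for another solver would change `simulate` only; the rate law, which is the part a model encodes, stays untouched.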

  19. Structure of SBML

  20. Structure of SBML • The notes field of SBase is intended to store information for humans to read • The annotation field of SBase provides a container for software-generated annotations that are not intended to be seen by humans • The id field is required for most structures and is used to identify a component within the model definition • The name field is optional and provides a human-readable label for the component
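
A rough sketch of how the id and name fields behave, using only Python's standard `xml.etree.ElementTree` rather than an SBML library. The toy model, its identifiers and the `labels` helper are invented for illustration, and the document built here is not checked against the SBML schema; the namespace is the SBML Level 2 Version 4 one.

```python
import xml.etree.ElementTree as ET

SBML_NS = "http://www.sbml.org/sbml/level2/version4"


def tag(name):
    """Qualify an element name with the SBML namespace."""
    return f"{{{SBML_NS}}}{name}"


def build_model():
    """Build a tiny SBML-like tree: one compartment, two species."""
    sbml = ET.Element(tag("sbml"), {"level": "2", "version": "4"})
    model = ET.SubElement(sbml, tag("model"), {"id": "toy_model", "name": "Toy model"})
    compartments = ET.SubElement(model, tag("listOfCompartments"))
    ET.SubElement(compartments, tag("compartment"), {"id": "cell", "name": "Cell", "size": "1"})
    species = ET.SubElement(model, tag("listOfSpecies"))
    ET.SubElement(species, tag("species"),
                  {"id": "glc", "name": "Glucose", "compartment": "cell", "initialConcentration": "1"})
    # The name attribute is optional, so this species carries only an id.
    ET.SubElement(species, tag("species"),
                  {"id": "atp", "compartment": "cell", "initialConcentration": "2"})
    return sbml


def labels(root):
    """Map each id to its human-readable name, falling back to the id itself."""
    return {el.get("id"): el.get("name", el.get("id"))
            for el in root.iter() if el.get("id") is not None}


if __name__ == "__main__":
    ET.register_namespace("", SBML_NS)
    root = build_model()
    print(ET.tostring(root, encoding="unicode"))
    print(labels(root))
```

The fallback in `labels` reflects the division of roles on the slide: software refers to components by id, while name exists only for display.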

  21. [Diagram repeated: Model, Simulation, Result against Minimal Requirements, Data model (SED-ML, SBRML) and Ontology]

  22. MIRIAM • Model descriptions require extra information • Biological • Description of elements of the model • Mathematical • Definition of math concepts • Referential • Author name • Paper reference etc. • http://www.ebi.ac.uk/compneur-srv/miriam/

  23. Reference correspondence • The model must be encoded in a public, standardized, machine-readable format (SBML, CellML, GENESIS ...) • The model must comply with the standard in which it is encoded! • The model must be clearly related to a single reference description. If a model is composed from different parts, there should still be a description of the derived/combined model. • The encoded model structure must reflect the biological processes listed in the reference description. • The model must be instantiated in a simulation: all quantitative attributes have to be defined, including initial conditions. • When instantiated, the model must be able to reproduce all results given in the reference description within an epsilon (algorithms, rounding errors)
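
The last requirement, reproducing the reference results “within an epsilon”, amounts to a tolerance comparison. A minimal sketch follows, assuming the results are available as equally long sequences of numbers; the function name and the default tolerance are arbitrary choices, since MIRIAM does not fix a value for epsilon.

```python
import math


def reproduces(reference, simulated, epsilon=1e-6):
    """Return True if every simulated value matches its reference value within epsilon.

    epsilon is applied both as a relative and as an absolute tolerance, so that
    values close to zero are still compared sensibly.
    """
    if len(reference) != len(simulated):
        return False
    return all(
        math.isclose(r, s, rel_tol=epsilon, abs_tol=epsilon)
        for r, s in zip(reference, simulated)
    )


if __name__ == "__main__":
    published = [1.0, 0.5, 0.25]
    rerun = [1.0000001, 0.5, 0.25]
    print(reproduces(published, rerun))
```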

  24. Attribution annotation • The model has to be named. • A citation of the reference description must be attached (complete citation, unique identifier, unambiguous URL). The citation should make it possible to identify the authors of the model. • The name and contact details of the model creators must be attached. • The date and time of creation and last modification should be specified. A history is useful but not required. • The model should be linked to a precise statement about the terms of distribution. MIRIAM does not require “freedom of use” or “no cost”.

  25. External resource annotation • The annotation must make it possible to unambiguously relate a piece of knowledge to a model constituent. • The referenced information should be described using a triplet {data-type, identifier, qualifier} • The data-type should be written as a Uniform Resource Identifier (URI) • The identifier is analysed within the framework of the data-type. • Data-type and identifier can be combined in a single URI, e.g. http://www.myResource.org/#myIdentifier or urn:lsid:myResource.org:myIdentifier • Qualifiers (optional) should refine the link between the model constituent and the piece of knowledge: “has a”, “is version of”, “is homolog to” etc.
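
The {data-type, identifier, qualifier} triplet and the two combined URI forms from this slide can be sketched as a small data class. The class name, the field names and the joining rule (`#` for URL-style data-types, `:` for URN-style ones) are assumptions inferred from the two example URIs on the slide, not an official MIRIAM API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResourceAnnotation:
    data_type: str                   # URI of the resource, e.g. "http://www.myResource.org/"
    identifier: str                  # interpreted within the data-type
    qualifier: Optional[str] = None  # optional, e.g. "is version of"

    def combined_uri(self) -> str:
        """Combine data-type and identifier into a single URI."""
        if self.data_type.startswith("urn:"):
            # URN form: urn:lsid:<authority>:<identifier>
            return f"{self.data_type}:{self.identifier}"
        # URL form: <data-type>/#<identifier>
        return f"{self.data_type.rstrip('/')}/#{self.identifier}"


if __name__ == "__main__":
    url_style = ResourceAnnotation("http://www.myResource.org/", "myIdentifier", "is version of")
    urn_style = ResourceAnnotation("urn:lsid:myResource.org", "myIdentifier")
    print(url_style.combined_uri())
    print(urn_style.combined_uri())
```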

  27. [Diagram repeated: Model, Simulation, Result against Minimal Requirements, Data model (SED-ML, SBRML) and Ontology]

  28. SBO • Part of OBO Foundry • Assigns meanings to mathematical elements of SBML • Allows automatic validation of the semantic consistency of the math part of a model • http://www.ebi.ac.uk/sbo

  29. SBO • Types and roles of reaction participants, including terms like “substrate”, “catalyst” etc., but also “macromolecule” or “channel”. • Parameters used in quantitative models. This vocabulary includes terms like “Michaelis constant”, “forward unimolecular rate constant” etc. A term may contain a precise mathematical expression stored as a MathML lambda function. The variables refer to other parameters. • Mathematical expressions. Examples of terms are “mass action kinetics”, “Henri-Michaelis-Menten equation” etc. A term may contain a precise mathematical expression stored as a MathML lambda function. The variables refer to the other vocabularies. • Modelling framework, which specifies how to interpret the rate law, e.g. “continuous modelling”, “discrete modelling” etc. • Event type, such as “catalysis” or “addition of a chemical group”.
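
To make the idea of a term carrying “a precise mathematical expression stored as a lambda function” concrete, here is a sketch in Python rather than MathML. The dictionary keyed by term name is a hypothetical stand-in for the vocabulary; no SBO identifiers are given, and the mass action entry assumes the simplest unimolecular case.

```python
# Hypothetical mapping from term names to the expressions they stand for.
RATE_LAWS = {
    # Assumption: first-order (unimolecular) mass action, v = k * [S].
    "mass action kinetics": lambda k, s: k * s,
    # Henri-Michaelis-Menten: v = Vmax * [S] / (Km + [S]).
    "Henri-Michaelis-Menten equation": lambda vmax, km, s: vmax * s / (km + s),
}


def evaluate(term, **values):
    """Look up the expression attached to a term and evaluate it."""
    return RATE_LAWS[term](**values)


if __name__ == "__main__":
    # At [S] = Km the Michaelis-Menten rate is half of Vmax.
    print(evaluate("Henri-Michaelis-Menten equation", vmax=2.0, km=0.5, s=0.5))
```

The arguments of each lambda play the role of the parameter terms (“Michaelis constant” and so on), which is how one vocabulary can refer to another.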

  30. SBO

  31. [Diagram repeated: Model, Simulation, Result against Minimal Requirements, Data model (SED-ML, SBRML) and Ontology]

  32. MIASE • Minimum Information About a Simulation Experiment • What base model to use and which modifications to apply • What simulation task to run on those models (algorithms, see KiSAO; simulation parameters) • How to post-process the numerical results and how to present them • http://www.ebi.ac.uk/compneur-srv/miase/ • A subset of MIASE could be encoded in SED-ML
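
The three MIASE questions (which model and modifications, which simulation task, which post-processing and presentation) can be pictured as a simple record with a completeness check. This is a hypothetical sketch: the class, its field names and the example values are invented for illustration and are not SED-ML syntax.

```python
from dataclasses import dataclass, field


@dataclass
class SimulationExperiment:
    model_source: str = ""                              # which base model to use
    modifications: dict = field(default_factory=dict)   # changes applied to the model
    algorithm: str = ""                                 # which simulation algorithm to run
    parameters: dict = field(default_factory=dict)      # simulation settings
    outputs: list = field(default_factory=list)         # post-processing and presentation

    def missing_items(self):
        """List the MIASE questions this description leaves unanswered."""
        missing = []
        if not self.model_source:
            missing.append("model")
        if not self.algorithm:
            missing.append("simulation algorithm")
        if not self.outputs:
            missing.append("post-processing and output")
        return missing


if __name__ == "__main__":
    experiment = SimulationExperiment(
        model_source="toy_model.xml",           # hypothetical file name
        modifications={"k1": 0.5},
        algorithm="deterministic ODE solver",   # a KiSAO term would normally go here
        parameters={"start": 0.0, "end": 10.0},
        outputs=["plot of [A] against time"],
    )
    print(experiment.missing_items())
```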

  33. Description of models

  34. Description of models

  35. Simulations

  36. Simulation task

  37. Data generation

  38. Data generation

  39. Production of results

  40. [Diagram repeated: Model, Simulation, Result against Minimal Requirements, Data model (SED-ML, SBRML) and Ontology]

  41. KiSAO • Kinetic Simulation Algorithm Ontology • Classification of simulation algorithms and methods • Definition, literature references • Relations between different simulation algorithms and methods • http://www.ebi.ac.uk/compneur-srv/kisao/index.html

  42. KiSAO • http://bioportal.bioontology.org/visualize/40844

  43. [Diagram repeated: Model, Simulation, Result against Minimal Requirements, Data model (SED-ML, SBRML) and Ontology]

  44. SBRML • Systems Biology Results Markup Language • A new markup language for specifying the results of operations on SBML models • http://www.comp-sys-bio.org/tiki-index.php?page=SBRML

  45. SBRML

  46. SBRML

  50. Dimension example
