
How Ontologies Add Value BioPAX: Biological Pathway Data Exchange Ontology

Joanne Luciano, BioPAX Workgroup (biopax.org), BioPathways Consortium Liaison (biopathways.org). 3 May 2005, KM Pro Forum, Bentley College, Waltham MA, USA.


Presentation Transcript


  1. How Ontologies Add Value BioPAX: Biological Pathway Data Exchange Ontology Joanne Luciano BioPAX Workgroup (biopax.org) BioPathways Consortium Liaison (biopathways.org) 3 May 2005 KM Pro Forum Bentley College, Waltham MA, USA

  2. Introduction BioPAX = Biopathway Exchange Language Emerged at ISMB • conceived at ISMB ’01 • born at ISMB ’02 • crawling at ISMB ’03 (Level 0.5) • walking at ISMB ’04 (Level 1.0) • now in the “terrible twos”

  3. Ontology Intro • Natural language does a poor job at conveying complex information without ambiguity • Ontologies provide a means to give concise meanings to pieces of data from a particular domain • Thereby facilitating computational operations on the data • Ontologies are becoming increasingly common in the biological community • See http://obo.sourceforge.net/obo.htm

  4. Ontology: Components • Class hierarchy: chemical  protein • Relations & attributes: fields (slots) on the classes, can be other classes • Constraints: Define allowable values and connections within an ontology • Objects: instances of classes • Values: occupy slots • Controlled vocabularies (CVs) • BioPAX will use class, attributes, constraints, values and CVs. Objects are user responsibility * From Peter Karp, “Ontologies: Definitions, Components, Subtypes”, SRI International, presentation available at http://www.biopax.org
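The components listed on this slide can be sketched in ordinary Python. This is only an illustration: it assumes the garbled class-hierarchy example means that protein is a subclass of chemical, and every class name, slot name, and vocabulary term below is hypothetical rather than taken from the BioPAX schema.

```python
from dataclasses import dataclass, field

# Controlled vocabulary (CV): the only values allowed in one slot.
# These terms are illustrative, not taken from BioPAX.
LOCATION_CV = {"cytoplasm", "nucleus", "membrane"}


@dataclass
class Chemical:
    """Root of a toy class hierarchy (hypothetical, not the BioPAX schema)."""
    name: str                                     # attribute (slot) holding a value
    synonyms: list = field(default_factory=list)  # another slot, empty by default


@dataclass
class Protein(Chemical):
    """Subclass: every Protein is also a Chemical."""
    cellular_location: str = "cytoplasm"          # slot constrained by the CV

    def __post_init__(self):
        # Constraint: reject values outside the controlled vocabulary.
        if self.cellular_location not in LOCATION_CV:
            raise ValueError(f"{self.cellular_location!r} is not in the CV")


@dataclass
class Binding:
    """A relation whose slots are filled by instances of other classes."""
    left: Chemical
    right: Chemical


# Objects (instances) are supplied by the user, not by the ontology itself.
kinase = Protein(name="example kinase", cellular_location="nucleus")
ligand = Chemical(name="example ligand")
complex_formation = Binding(left=kinase, right=ligand)
```

Constructing `Protein(name="x", cellular_location="mars")` raises `ValueError`, which is the kind of check a constraint plus a controlled vocabulary makes possible.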

  5. What is a Pathway? Depends on who you ask! • Metabolic Pathways (Glycolysis) • Molecular Interaction Networks (Protein-Protein) • Signaling Pathways (Apoptosis) • Gene Regulation (Lac Operon)

  6. Genetics Microarray High Throughput Experimental Methods Mass Spectrometry Two-Hybrid Protein modifications Interaction Data Expression Function Existing Literature PubMed Multiple Pathway Databases Integration Nightmare!

  7. So many pathway databases… Each has its own data model, format, and data access methods. Source: Pathway Resource List (http://cbio.mskcc.org/prl/)

  8. Research Community Needs Semantic Aggregation, Integration, Inference (Pedantic Aggravation, Irritation, and Interference) Pathway Databases: WIT, BioCyc, Reactome, aMAZE, KEGG, BIND, DIP, HPRD, MINT, IntAct, PSI format, CSNDB, TRANSPATH, TRANSFAC, PubGene, GeneWays

  9. A Common Exchange Language Promotes collaboration (big science), accessibility Application Database User Without BioPAX With BioPAX Over 170 DBs and tools Common “computable semantic” enables scientific discovery
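One common way to read the "Without BioPAX / With BioPAX" contrast is as a count of the converters that have to be written. The sketch below assumes that reading, which the slide itself does not spell out: without a shared format every resource needs a converter to every other resource, while with a shared format each resource needs only one exporter and one importer. The function names are made up for illustration.

```python
def pairwise_converters(n: int) -> int:
    """Directed converters needed when every resource translates to every other."""
    return n * (n - 1)


def hub_converters(n: int) -> int:
    """Converters needed when each resource only exports to and imports from one shared format."""
    return 2 * n


n = 170  # "Over 170 DBs and tools" from the slide
print(pairwise_converters(n))  # 28730
print(hub_converters(n))       # 340
```

Under these assumptions the pairwise count grows quadratically with the number of resources, while the shared-format count grows linearly.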

  10. Closes Gaps in Pathway Data Space [Figure: Exchange Language Domain. Labels as extracted: Molecular Interactions; Pro:Pro; All:All; Metabolic Pathways (Low Detail to High Detail); Interaction Networks; Molecular; Non-molecular; Pro:Pro; TF:Gene; Genetic Regulatory Pathways (Low Detail to High Detail); Small Molecules (Low Detail to High Detail); Database Exchange Formats; Simulation Model Exchange Formats; BioPAX; SBML, CellML; Genetic Interactions; PSI-MI 2; Rate Formulas; Biochemical Reactions]

  11. Design Goals • Encapsulation: An entire pathway in one record • Compatible: Use existing standards wherever possible • Computable: From file reading to logical inference • Successful: Buy-in from the research community

  12. Technical Goals Interoperability • Integration and exchange of pathway data • Interchange through a common (standard) representation • accommodate existing database representations • provide a basis for future databases • enable development of tools for searching and reasoning over the database Development of tools and API to facilitate conversion (libBioPAX)

  13. Technical Goals (cont’d) Why OWL? Why OWL DL? Expressivity (biology = “complex relationships”) • W3C Standard (use existing standards) “Semantic Web enabled” • XML based (the exchange language in computing) • Machine Computable • Facilitate integration of knowledge, data, tool development • Uncover inconsistencies and new knowledge • OWL DL • Enable full reasoning capability for users from file reading to logical inference • Complete: all conclusions are guaranteed to be computed • Decidable: all computations will finish in finite time (with OWL Lite, short amount of time)
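To make "uncover inconsistencies and new knowledge" concrete, here is a toy inference step in plain Python. It is not an OWL DL reasoner and uses no OWL library; it only propagates class membership up a subclass hierarchy and checks one disjointness axiom. The class names, the disjointness axiom, and the individuals are all hypothetical.

```python
# Toy schema: (subclass, superclass) pairs. All names are hypothetical.
SUBCLASS_OF = {
    ("protein", "physicalEntity"),
    ("smallMolecule", "physicalEntity"),
    ("physicalEntity", "entity"),
}
# Pairs of classes that may not share an individual.
DISJOINT = {frozenset({"protein", "smallMolecule"})}


def superclasses(cls):
    """Return cls plus every class reachable by following subclass links upward."""
    seen = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in SUBCLASS_OF:
            if sub == current and sup not in seen:
                seen.add(sup)
                frontier.append(sup)
    return seen


def infer_types(asserted):
    """Map each individual to its asserted classes plus all inferred superclasses."""
    return {
        individual: set().union(*(superclasses(c) for c in classes))
        for individual, classes in asserted.items()
    }


def inconsistencies(inferred):
    """List individuals that belong to two classes declared disjoint."""
    problems = []
    for individual, classes in sorted(inferred.items()):
        for pair in DISJOINT:
            if pair <= classes:
                problems.append((individual, tuple(sorted(pair))))
    return problems


asserted = {
    "glucose": {"smallMolecule"},
    "mystery": {"protein", "smallMolecule"},  # deliberately contradictory
}
inferred = infer_types(asserted)
print(sorted(inferred["glucose"]))  # ['entity', 'physicalEntity', 'smallMolecule']
print(inconsistencies(inferred))    # [('mystery', ('protein', 'smallMolecule'))]
```

The first result is new knowledge (glucose is an entity, although nobody stated it); the second is an inconsistency. Because the hierarchy is finite and each class is visited at most once, the loop always terminates, which loosely mirrors the decidability point on the slide.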

  14. Social Logistics Get organized Make the decision & commitment 2 or 3 dedicated individuals to be the contact points Small core group • Bi-weekly conference calls, bi-monthly F2F • Commitment & resources • Participants willing and able to cover their costs • Outside funding (DOE) Special interests and needs form subgroup task forces • Core group member(s) • Outside experts International representation & participation (Outreach & Community Building) • conferences and mailing lists • follow-up and individual Collaborate with complementary/competing representations

  15. Social Logistics (cont’d) How we engendered buy-in from the field, which made life much easier Take things in steps: • Pathway Database vision -> Data Exchange Format as 1st step • Data Exchange Format -> Release in Levels of increasing complexity Level 1 supports Metabolic pathways, Level 2 Early success leads to early adoption, which leads to increased probability of overall project success. Get “buy in” and get involvement - leads to acceptance later • Support the existing databases (BioCyc, WIT, BIND, etc.) • Got database sources to agree to participate in the development to ensure that their DBs will be properly represented • Got database sources to agree to export in the new format once it is defined

  16. Social Logistics (cont’d) Get “buy in” (continued) • Community Involvement and Support Core group (represents voice of community, small, committed) Mailing List User community interaction (BioPAX-Boston) Subgroups • International Meetings and Presentations Tool developers Modelers Users (researchers) Ontology developers Database providers Complementary representations (SBML, CellML) Like minds General Community

  17. Implementation of BioPAX Designed using GKB Editor and Protégé BioPAX uses OWL to define the “Schema” BioPAX Instances to store the data Technically, an ontology with instance data is a knowledge base

  18. BioPAX – Ontology Level 1: Metabolic Pathways

  19. Creating and Editing

  20. Mapping Pathways to BioPAX OWL (schema) Instances (Individuals) data

  21. Mapping Pathways to BioPAX

  22. Challenges & Bottlenecks • Scientific • What’s a pathway? Depends on who you ask. • Technical • Each database has its own syntax & semantics • Immaturity of tools for data integration • Social / Logistical • Community organization and adoption • Financial • Mostly volunteer effort by stakeholders • Dept of Energy

  23. Bridging Chemistry and Molecular Biology • Different Views have different semantics: Lenses • When there is a correspondence between objects, a semantic binding is possible Uniprot:P49841 Apply Correspondence Rule: if ?target.xref.lsid == ?bpx:prot.xref.lsid then ?target.correspondsTo.?bpx:prot Source: Eric Neumann
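The correspondence rule on this slide can be sketched as a join on the cross-reference identifier. This is a minimal illustration, not the mechanism the slide's author used: the `Record` class, its field names, and the sample records are hypothetical, and the string "Uniprot:P49841" is used exactly as it appears on the slide rather than as a properly formed LSID.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """A hypothetical object from either view, carrying one cross-reference ID."""
    name: str
    xref_lsid: str
    corresponds_to: list = field(default_factory=list)


def apply_correspondence(targets, bpx_proteins):
    """If target.xref_lsid == protein.xref_lsid, record that the target corresponds to it."""
    # Index the pathway-view proteins by identifier so each target is one lookup.
    by_lsid = {}
    for protein in bpx_proteins:
        by_lsid.setdefault(protein.xref_lsid, []).append(protein)
    for target in targets:
        for protein in by_lsid.get(target.xref_lsid, []):
            target.corresponds_to.append(protein)
    return targets


chem = Record("chemistry-view target", "Uniprot:P49841")
other = Record("unrelated target", "Uniprot:EXAMPLE")  # made-up identifier
prot = Record("pathway-view protein", "Uniprot:P49841")

apply_correspondence([chem, other], [prot])
print([p.name for p in chem.corresponds_to])   # ['pathway-view protein']
print([p.name for p in other.corresponds_to])  # []
```

The binding depends only on the shared identifier, so the two views can keep their own semantics and still be linked wherever their cross-references agree.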

  24. Enables Computable Biology BioPAX increases collaboration and accessibility to the field and enables 'big science' because it delivers a scalable solution Capture the complex relationships inherent in Biology Solves some nasty integration problems Saves a lot of time and money

  25. BioPAX Supporting Groups Databases • BioCyc (www.biocyc.org) • BIND (www.bind.ca) • WIT (wit.mcs.anl.gov/WIT2) • PharmGKB (www.pharmgkb.org) Grants • Department of Energy (Workshop) Groups • Memorial Sloan-Kettering Cancer Center: G. Bader, M. Cary, J. Luciano, C. Sander • SRI Bioinformatics Research Group: P. Karp, S. Paley, J. Pick • University of Colorado Health Sciences Center: I. Shah • BioPathways Consortium: J. Luciano, E. Neumann, A. Regev, V. Schachter • Argonne National Laboratory: N. Maltsev, E. Marland • Samuel Lunenfeld Research Institute: C. Hogue • Harvard Medical School: E. Brauner, D. Marks, J. Luciano, A. Regev • NIST: R. Goldberg • Stanford: T. Klein • Columbia: A. Rzhetsky • Dana Farber Cancer Institute: J. Zucker Collaborating Organizations: • Proteomics Standards Initiative (PSI) • Systems Biology Markup Language (SBML) • CellML • Chemical Markup Language (CML) The BioPAX Community

  26. Thank you! Questions?
