Basic Building Blocks for Biomedical Ontologies


  1. Basic Building Blocks for Biomedical Ontologies Barry Smith

  2. Problems with UMLS-style approaches • let a million ontologies bloom, each one close to the terminological habits of its authors • in concordance with the “not invented here” syndrome • then map these ontologies, and use these mappings to integrate your different pots of data

  3. Mappings are hard • They create an N² problem; they are fragile and expensive to maintain • They need new authorities to maintain them (one for each pair of mapped ontologies), yielding a new risk of forking – who will police the mappings? • The goal should be to minimize the need for mappings by avoiding redundancy in the first place – one ontology for each domain • Invest resources in disjoint ontology modules which work well together – reduce the need for mappings to the minimum possible
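
The quadratic growth behind the "N² problem" can be checked with a little arithmetic. The sketch below assumes one mapping per unordered pair of ontologies; the function name `pairwise_mappings` is ours, for illustration only. If mappings are maintained in both directions, the count doubles.

```python
def pairwise_mappings(n: int) -> int:
    """Number of unordered pairs among n ontologies: n * (n - 1) / 2."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n * (n - 1) // 2


# Growth of the mapping burden as ontologies multiply.
for n in (2, 10, 100):
    print(n, "ontologies ->", pairwise_mappings(n), "mappings")
```

With one ontology per domain the ontologies are disjoint, so the number of mappings needed stays near zero however many modules are added.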

  4. Why should you care? • you need to create systems for data mining and text processing which will yield useful digitally coded output • if the codes you use are constantly in need of ad hoc repair, huge resources will be wasted • serious investment in annotation will be defeated from the start • relevant data will not be found, because it will be lost in multiple semantic cemeteries

  5. How to do it right? • how to create an incremental, evolutionary process, where what is good survives, and what is bad fails • where the number of ontologies needing to be used together is small – integration = addition • where these ontologies are stable • by creating a scenario in which people will find it profitable to reuse ontologies, terminologies and coding systems which have been tried and tested

  6. Reasons why GO has been successful • It is a system for prospective standardization built with a coherent top level but with content contributed and monitored by domain specialists • Based on community consensus • Updated every night • Clear versioning principles ensure backwards compatibility; prior annotations do not lose their value • Initially low-tech to encourage users, with movement to more powerful formal approaches (including OWL-DL – though still proceeding with caution)

  7. GO has learned the lessons of successful cooperation • Clear documentation • The terms chosen are already familiar • Fully open source (allows thorough testing in manifold combinations with other ontologies) • Subjected to considerable third-party critique • Tracker for user input and help desk with rapid turnaround

  8. GO has been amazingly successful in overcoming the data balkanization problem, but it covers only generic biological entities of three sorts: • cellular components • molecular functions • biological processes. It does not cover diseases, symptoms, disease biomarkers, protein interactions, experimental processes …

  9. OBO (Open Biomedical Ontology) Foundry proposal (Gene Ontology in yellow)

  10. Environment Ontology (ENVO)

  11. Population-level ontologies

  12. The OBO Foundry: a step-by-step, evidence-based approach to expanding the GO • Developers commit to working to ensure that, for each domain, there is community convergence on a single ontology • and agree in advance to collaborate with developers of ontologies in adjacent domains. http://obofoundry.org

  13. OBO Foundry Principles • Common governance (coordinating editors) • Common training • Common architecture: • simple shared top level ontology (BFO) • shared Relation Ontology: www.obofoundry.org/ro

  14. Open Biomedical Ontologies Foundry Seeks to create high quality, validated terminology modules across all of the life sciences which will • provide one ontology for each domain, so no need for mappings • be close to the language use of experts • be evidence-based • incorporate a strategy for motivating potential developers and users • be revisable as science advances

  15. Principles http://obofoundry.org/wiki/index.php/OBO_FoundryPrinciples

  16. OBO Foundry coverage [Figure: coverage organized by relation to time and granularity]

  17. ORTHOGONALITY • modularity ensures • annotations can be additive • division of labor amongst domain experts • high value of training in any given module • lessons learned in one module can benefit work on other modules • incentivization of those responsible for individual modules
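
The claim that annotations can be additive can be made concrete with a small sketch. It assumes two annotation sets drawn from disjoint ontology modules; the record names and term identifiers (`PROC:0001`, `ANAT:0007`) are hypothetical placeholders, not real ontology IDs. Because the modules do not overlap, combining them is a plain set union and no term-to-term mapping table is consulted.

```python
from collections import defaultdict


def combine(*annotation_sets):
    """Merge annotation sets by plain union; no mapping between terms is needed."""
    merged = defaultdict(set)
    for annotations in annotation_sets:
        for record_id, terms in annotations.items():
            merged[record_id] |= set(terms)
    return dict(merged)


# Hypothetical annotations from two disjoint modules.
process_annotations = {"sample-1": {"PROC:0001"}, "sample-2": {"PROC:0002"}}
anatomy_annotations = {"sample-1": {"ANAT:0007"}}

combined = combine(process_annotations, anatomy_annotations)
print(combined)
```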

  18. Benefits of coordination • Can more easily reuse what is made by others • Can more easily inspect and criticize what is made by others • Leads to innovations (e.g. MIREOT strategy for importing terms into ontologies)

  19. Current Foundry members in yellow

  20. Foundry ontologies currently under review • Plant Ontology (PO) • Ontology for Biomedical Investigations (OBI) • Ontology for General Medical Science (OGMS) • Infectious Disease Ontology (IDO)

  21. OBO Foundry Modular Organization • top level: Basic Formal Ontology (BFO) • mid-level • domain level

  22. OBI • The Ontology for Biomedical Investigations • http://purl.org/obo/OBI_0000225

  23. Purpose of OBI • To provide a resource for the unambiguous description of the components of biomedical investigations such as the design, protocols and instrumentation, material, data and types of analysis and statistical tools applied to the data • NOT designed to model biology

  24. OBI Collaborating Communities • Crop sciences Generation Challenge Programme (GCP), • Environmental genomics MGED RSBI Group, www.mged.org/Workgroups/rsbi • Genomic Standards Consortium (GSC), www.genomics.ceh.ac.uk/genomecatalogue • HUPO Proteomics Standards Initiative (PSI), psidev.sourceforge.net • Immunology Database and Analysis Portal, www.immport.org • Immune Epitope Database and Analysis Resource (IEDB), http://www.immuneepitope.org/home.do • International Society for Analytical Cytology, http://www.isac-net.org/ • Metabolomics Standards Initiative (MSI), • Neurogenetics, Biomedical Informatics Research Network (BIRN), • Nutrigenomics MGED RSBI Group, www.mged.org/Workgroups/rsbi • Polymorphism • Toxicogenomics MGED RSBI Group, www.mged.org/Workgroups/rsbi • Transcriptomics MGED Ontology Group

  25. Ontology for General Medical Science • http://code.google.com/p/ogms/ • (OBO) http://purl.obolibrary.org/obo/ogms.obo • (OWL) http://purl.obolibrary.org/obo/ogms.owl

  26. OGMS-based initiatives • Vital Signs Ontology (VSO) (Welch Allyn) • EHR / Demographics Ontology • Infectious Disease Ontology • Mental Health Ontology • Emotion Ontology

  27. Ontology for General Medical Science • Jobst Landgrebe (then Co-Chair of the HL7 Vocabulary Group): • “the best ontology effort in the whole biomedical domain by far”

  28. How to keep clear about the distinction • processes of observation, • results of such processes (measurement data) • the entities observed

  29. How is the OBO Foundry organized? • Top-Level: Basic Formal Ontology (BFO) • Mid-Level: IAO, OBI, OGMS ... • Domain-Level: Foundry Bio-Ontologies

  30. OBO Foundry Modular Organization • top level: Basic Formal Ontology (BFO) • mid-level • domain level

  31. BFO: the very top • Continuant (Independent Continuant, Dependent Continuant) • Occurrent (Process, Event)

  32. [Figure: grid organized by relation to time and granularity; source: obofoundry.org]

  33. BFO & GO • continuant • independent continuant: cellular component • dependent continuant: molecular function • occurrent: biological processes
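
The alignment of the three GO branches with BFO on slide 33 can be written down as a small lookup structure. This is a minimal sketch: the dictionary and function names are ours, and only the categories named on the slides are included.

```python
# Parent links for the BFO categories named on the slides (child -> parent).
BFO_PARENT = {
    "continuant": None,
    "occurrent": None,
    "independent continuant": "continuant",
    "dependent continuant": "continuant",
}

# Alignment of the three GO branches with BFO, as given on slide 33.
GO_TO_BFO = {
    "cellular component": "independent continuant",
    "molecular function": "dependent continuant",
    "biological process": "occurrent",
}


def bfo_ancestry(go_branch: str) -> list:
    """Return the BFO categories above a GO branch, most specific first."""
    category = GO_TO_BFO[go_branch]
    path = []
    while category is not None:
        path.append(category)
        category = BFO_PARENT[category]
    return path


for branch in GO_TO_BFO:
    print(branch, "->", bfo_ancestry(branch))
```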

  34. Basic Formal Ontology • types: Continuant (Independent Continuant: thing; Dependent Continuant: quality); Occurrent (process, event) • instances

  35. Experience with BFO in building ontologies provides • a community of skilled ontology developers and users (user group has 120 members) • associated logical tools • documentation for different types of users • a methodology for building conformant ontologies by starting with BFO and populating downwards

  36. Example: The Cell Ontology

  37. How to build an ontology • import BFO into an ontology editor such as Protégé • work with domain experts to create an initial mid-level classification • find the ~50 most commonly used terms corresponding to types in reality • arrange these terms into an informal is_a hierarchy according to this universality principle: A is_a B means that every instance of A is an instance of B • fill in missing terms to give a complete hierarchy • (leave it to domain experts to populate the lower levels of the hierarchy)
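
The universality principle on slide 37 can be used as a sanity check on a draft hierarchy. The sketch below is illustrative only: it tests asserted is_a links against a small set of known sample instances, with hypothetical type names and instance labels. A type in reality is not exhausted by a finite sample, so passing the check does not prove a link correct; failing it does show that a link is wrong.

```python
def violations(asserted_links, instances):
    """Return the asserted (A, B) links where some known instance of A is not an instance of B."""
    return [
        (a, b)
        for a, b in asserted_links
        if not set(instances[a]) <= set(instances[b])
    ]


# Hypothetical sample instances for three draft terms.
instances = {
    "cell": {"c1", "c2", "c3"},
    "neuron": {"c1"},
    "organism": {"o1"},
}

# Draft is_a links: the second one is deliberately wrong.
asserted_links = [("neuron", "cell"), ("cell", "organism")]

print(violations(asserted_links, instances))
```

Here only the second link is reported, because the sample cells are not organisms.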

  38. Users of BFO • PharmaOntology (W3C HCLS SIG) • MediCognos / Microsoft Healthvault • Cleveland Clinic Semantic Database in Cardiothoracic Surgery • Major Histocompatibility Complex (MHC) Ontology (NIAID) • Neuroscience Information Framework Standard (NIFSTD) and Constituent Ontologies • Interdisciplinary Prostate Ontology (IPO) • Nanoparticle Ontology (NPO): Ontology for Cancer Nanotechnology Research • Neural Electromagnetic Ontologies (NEMO) • ChemAxiom – Ontology for Chemistry

  39. Users of BFO • GO Gene Ontology • CL Cell Ontology • SO Sequence Ontology • ChEBI Chemical Ontology • PATO Phenotype (Quality) Ontology • FMA Foundational Model of Anatomy Ontology • ChEBI Chemical Entities of Biological Interest • PRO Protein Ontology • Plant Ontology • Environment Ontology • Ontology for Biomedical Investigations • RNA Ontology

  40. Users of BFO • Ontology for Risks Against Patient Safety (RAPS/REMINE) • eagle-i and VIVO (NCRR) • IDO Infectious Disease Ontology (NIAID) • National Cancer Institute Biomedical Grid Terminology (BiomedGT) • US Army Biometrics Ontology • US Army Command and Control Ontology • Sleep Domain Ontology • Subcellular Anatomy Ontology (SAO) • Translational Medicine On (VO) • Yeast Ontology (yOWL) • Zebrafish Anatomical Ontology (ZAO)

  41. Basic Formal Ontology • continuant • independent continuant (e.g. organism) • dependent continuant • occurrent

  42. Continuants • continue to exist through time, preserving their identity while undergoing different sorts of changes • independent continuants – objects, things, ... • dependent continuants – qualities, attributes, shapes, potentialities ...

  43. Occurrents • processes, events, happenings • your life • this process of accelerated cell division

  44. Qualities • temperature, blood pressure, mass ... are continuants • they exist through time while undergoing changes

  45. Qualities • temperature / blood pressure / mass ... are dimensions of variation within the structure of the entity • a quality is something which can change while its bearer remains one and the same
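
The point that a quality can change while its bearer remains one and the same can be sketched in a few lines. The class name `Bearer`, the field names and the values are hypothetical, chosen only for illustration; object identity in Python stands in, loosely, for the identity of the bearer through time.

```python
from dataclasses import dataclass, field


@dataclass
class Bearer:
    """An independent continuant, e.g. a patient (hypothetical class for illustration)."""
    name: str
    qualities: dict = field(default_factory=dict)  # dependent continuants, keyed by dimension


patient = Bearer("patient-1", {"temperature_celsius": 36.8})
identity_before = id(patient)

# The quality takes a new value along its dimension of variation ...
patient.qualities["temperature_celsius"] = 38.5

# ... while the bearer is still the very same object.
print(id(patient) == identity_before, patient.qualities)
```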
