The CROP (Common Reference Ontologies for Plants) Initiative, Barry Smith
Presentation Transcript

  1. The CROP (Common Reference Ontologies for Plants) Initiative Barry Smith September 13, 2013

  2. Agenda • The OBO Foundry • Principles • Reference ontologies vs. application ontologies • Other ontology consortia • The CROP Initiative • Examples of ontologies within CROP

  3. On June 22, 1799, in Paris, everything changed

  4. International System of Units

  5. How to find data? How to find other people’s data? How to reason with data when you find it? How to work out what data does not yet exist?

  6. How to solve the problem of making the data we find queryable and re-usable by others? Part of the solution must involve: standardized terminologies and coding schemes

  7. But there are multiple kinds of standardization for biological data, and they do not work well together Proposed solution: Ontology-based annotation of data

  8. ontologies = standardized labels designed for use in annotations to make the data cognitively accessible to human beings and algorithmically accessible to computers

  9. ontologies = high quality controlled structured vocabularies for the annotation (description) of data, images, journal articles …

  10. Ramirez et al. Linking of Digital Images to Phylogenetic Data Matrices Using a Morphological Ontology Syst. Biol. 56(2):283–294, 2007

  11. ontologies used in curation of literature what cellular component? what molecular function? what biological process?

  12. Proposed framework: the Semantic Web • HTML demonstrated the power of the Web to allow sharing of information • can we use semantic technology to create a Web 2.0 which would allow algorithmic reasoning with online information based on a common Web Ontology Language (OWL)? • can we use netcentricity, common URLs, to break down silos and create useful integration of online data and information?

  13. Ontology success stories, and some reasons for failure A fragment of the “Linked Open Data” in the biomedical domain


  15. The more ontology-building is successful, the more it fails • OWL breaks down data silos via controlled vocabularies for the description of data dictionaries • Unfortunately, the very success of this approach led to the creation of multiple new semantic silos, because multiple ontologies are being created in ad hoc ways

  16. Many ontologies in BioPortal are created by importing content from existing ontologies and giving the imported terms new names and new IDs. The result is chaos, with bits and pieces of the same ontologies chopped up and scattered across multiple different places. This leads to massively redundant effort, forking and doom

  17. A standard engineering methodology • It is easier to write useful software if one works with a simplified model • (“…we can’t know what reality is like in any case; we only have our concepts…”) • This looks like a useful model to me • (One week goes by:) This other thing looks like a useful model to him • Data in Pittsburgh does not interoperate with data in Vancouver • Science is siloed

  18. A good solution to this silo problem must be: • modular • incremental • independent of hardware and software • bottom-up • evidence-based • revisable; and it must incorporate a strategy for motivating potential developers and users

  19. Uses of ‘ontology’ in PubMed abstracts

  20. Main reason for GO’s success: Gene Ontology and associated databases “make it possible to systematically dissect large gene lists in an attempt to assemble a summary of the most enriched and pertinent biology” (PMC2615629)

  21. GO provides a controlled system of terms for use in annotating (describing, tagging) data • multi-species, multi-disciplinary, open source • contributing to the cumulativity of scientific results obtained by distinct research communities • compare use of kilograms, meters, seconds in formulating experimental results

  22. GO is three ontologies • cellular component • molecular function • biological process

  23. Top-Level Architecture (diagram) • Continuant: Independent Continuant, Dependent Continuant • Occurrent (Process, Event) • shown at the level of universals and at the level of instances

  24. Problem with the GO • it covers only three types of entities • no diseases • no laboratory artifacts • no anatomy (above the cell) • only species-terms for development • no phenotypes

  25. The Open Biomedical Ontologies (OBO) Foundry

  26. Rationale of OBO Foundry coverage (table; axes: relation to time, granularity)

  27. First step (2001): a shared portal for (so far) 58 ontologies (low regimentation): NCBO BioPortal

  28. OBO builds on the principles successfully implemented by the GO, recognizing that ontologies need to be developed in tandem

  29. Second step (2006): The OBO Foundry

  30. Building out from the original GO

  31. Initial OBO Foundry coverage (table; axes: relation to time, granularity)

  32. OBO Foundry Principles • common formal architecture • clearly delineated content (redundant: overlaps with orthogonality) • the ontology is well-documented (overlaps with rules for definitions; needs expanding: for developers, for users, minimal metadata) • plurality of independent users • single locus of authority, trackers, help desk

  33. OBO Foundry Principles • textual definitions plus formal definitions • all definitions should be of the genus-species form: A =def. a B which Cs, where B is the parent term of A in the ontology hierarchy • formal definitions use OBO format or OWL
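
The genus-species definition pattern can be illustrated with a minimal Python sketch. The `Term` class, its field names, and the example terms and wording ("cell", "neuron", "transmits nerve impulses") are hypothetical illustrations, not content taken from any Foundry ontology; the point is only that the definition of A is assembled from its parent term B and a differentiating condition C.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Term:
    """A term with an optional parent (the genus B) and differentia (the C)."""
    label: str
    parent: Optional["Term"] = None
    differentia: Optional[str] = None

    def definition(self) -> str:
        # Root terms have no genus, so no genus-species definition is built.
        if self.parent is None or self.differentia is None:
            return f"{self.label} (no genus-species definition)"
        # Pattern from the slide: A =def. a B which Cs
        return f"{self.label} =def. a {self.parent.label} which {self.differentia}"


# Hypothetical example terms, for illustration only.
cell = Term("cell")
neuron = Term("neuron", parent=cell, differentia="transmits nerve impulses")

print(neuron.definition())
# prints: neuron =def. a cell which transmits nerve impulses
```

Because the genus is read off the hierarchy, the textual definition cannot drift away from the term's position in the ontology.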

  34. Orthogonality • For each domain, there should be convergence upon a single ontology that is recommended for use by those who wish to become involved with the Foundry initiative • Part of the goal here is to avoid the need for mappings – which are in any case too expensive, too fragile, too difficult to keep up-to-date as mapped ontologies change • Orthogonality means: • everyone knows where to look to find out how to annotate each kind of data • everyone knows where to look to find content for application ontologies

  35. Orthogonality = non-redundancy for the reference ontologies inside the Foundry • application ontologies can overlap, but then only in those areas where common coverage is supplied by a reference ontology
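
One way to picture orthogonality as non-redundancy is a check that no term label is claimed by more than one reference ontology. The sketch below assumes each ontology is reduced to a set of labels; the function name `find_overlaps` and the ontology names and contents are invented for illustration and do not describe any actual Foundry tooling.

```python
from itertools import combinations


def find_overlaps(ontologies):
    """Return {(name_a, name_b): shared_labels} for every overlapping pair."""
    overlaps = {}
    # Sort by name so the pair keys are deterministic.
    for (a, terms_a), (b, terms_b) in combinations(sorted(ontologies.items()), 2):
        shared = terms_a & terms_b
        if shared:
            overlaps[(a, b)] = shared
    return overlaps


# Hypothetical reference ontologies reduced to sets of term labels.
reference_ontologies = {
    "ONT_A": {"cell", "nucleus"},
    "ONT_B": {"glucose", "nucleus"},
    "ONT_C": {"leaf"},
}

print(find_overlaps(reference_ontologies))
# prints: {('ONT_A', 'ONT_B'): {'nucleus'}}
```

A real check would compare definitions and identifiers rather than bare labels, since the same label can be used for different entities.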

  36. PRINCIPLES • COMMON FORMAL ARCHITECTURE: The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the Basic Formal Ontology (BFO) • ‘formal’ = domain-neutral

  37. Basic Formal Ontology (diagram) • Continuant: Independent Continuant (cell component), Dependent Continuant (molecular function) • Occurrent (biological process)

  38. OBO Foundry provides guidelines (traffic laws) to new groups of ontology developers in ways which can counteract current dispersion of effort

  39. New principle: employ the methodology of cross-products. Compound terms in ontologies are to be defined as cross-products of simpler terms, e.g. elevated blood glucose is a cross-product of PATO: increased concentration with FMA: blood and ChEBI: glucose. = factoring out of ontologies into discipline-specific modules (orthogonality)
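
The cross-product idea can be sketched in Python using the slide's own example. The `TermRef` and `CrossProduct` classes are hypothetical names introduced for illustration; real cross-product definitions would be written in OBO format or OWL with term identifiers, which are not given here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TermRef:
    """A reference to a simple term in a named source ontology."""
    ontology: str
    label: str


@dataclass(frozen=True)
class CrossProduct:
    """A compound term defined as a combination of simpler terms."""
    label: str
    components: Tuple[TermRef, ...]

    def source_ontologies(self):
        # The discipline-specific modules the compound term is factored into.
        return sorted({c.ontology for c in self.components})


# The example from the slide: elevated blood glucose.
elevated_blood_glucose = CrossProduct(
    label="elevated blood glucose",
    components=(
        TermRef("PATO", "increased concentration"),
        TermRef("FMA", "blood"),
        TermRef("ChEBI", "glucose"),
    ),
)

print(elevated_blood_glucose.source_ontologies())
# prints: ['ChEBI', 'FMA', 'PATO']
```

Since the compound term only points at terms maintained elsewhere, a revision in any of the three source ontologies reaches it without a separate mapping.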

  40. The methodology of cross-products • enforcing use of common relations in linking terms drawn from Foundry ontologies serves to ensure that the ontologies are maintained and revised in tandem • logically defined relations serve to bind terms in different ontologies together to create a network

  41. Building out from the original GO

  42. Population-level ontologies

  43. Environments: Environment Ontology

  44. Extension Strategy + Modular Organization • top level: Basic Formal Ontology (BFO) • mid-level • domain level