
Building a Suite of Biomedical Ontologies



Presentation Transcript


  1. Building a Suite of Biomedical Ontologies Barry Smith

  2. Problems with UMLS-style approaches • let a million ontologies bloom, each one close to the terminological habits of its authors • in concordance with the “not invented here” syndrome • then map these ontologies, and use these mappings to integrate your different pots of data

  3. Mappings are hard • They create an N² problem; they are fragile and expensive to maintain • They need new authorities to maintain them (one for each pair of mapped ontologies), yielding a new risk of forking – who will police the mappings? • The goal should be to minimize the need for mappings by avoiding redundancy in the first place – one ontology for each domain • Invest resources in disjoint ontology modules which work well together – reduce the need for mappings to the minimum possible
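
The N² claim on this slide can be made concrete with a minimal sketch. It assumes one mapping per unordered pair of ontologies, as the slide's "one for each pair" suggests; the function name `pairwise_mappings` is hypothetical and used only for illustration.

```python
from math import comb


def pairwise_mappings(n_ontologies: int) -> int:
    """Count the unordered pairs of ontologies; each pair needs its own mapping."""
    return comb(n_ontologies, 2)  # n * (n - 1) / 2, which grows like n squared


for n in (3, 10, 100):
    print(f"{n} ontologies -> {pairwise_mappings(n)} mappings to maintain")
```

Under this assumption, going from 10 to 100 ontologies raises the number of mappings from 45 to 4950, which is why the slide argues for one ontology per domain instead.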

  4. How to do it right? • how to create an incremental, evolutionary process, where what is good survives, and what is bad fails • where the number of ontologies needing to be used together is small – integration = addition • where these ontologies are stable • by creating a scenario in which people will find it profitable to reuse ontologies, terminologies and coding systems which have been tried and tested

  5. Modularity • modularity ensures • annotations can be additive • division of labor amongst domain experts • high value of training in any given module • lessons learned in one module can benefit work on other modules • incentivization of those responsible for individual modules

  6. Reasons why GO has been successful • It is a system for prospective standardization built with a coherent top level but with content contributed and monitored by domain specialists • Based on community consensus • Updated every night • Clear versioning principles ensure backwards compatibility; prior annotations do not lose their value • Initially low-tech to encourage users, with movement to more powerful formal approaches (including OWL-DL – though still proceeding with caution)

  7. GO has learned the lessons of successful cooperation • Clear documentation • The terms chosen are already familiar • Fully open source (allows thorough testing in manifold combinations with other ontologies) • Subjected to considerable third-party critique • Tracker for user input and help desk with rapid turnaround

  8. GO has been amazingly successful in overcoming the data balkanization problem but it covers only generic biological entities of three sorts: • cellular components • molecular functions • biological processes no diseases, symptoms, disease biomarkers, protein interactions, experimental processes …

  9. How to create a disease ontology? • One option: a flat list • One option: template approach • Cancer • Infectious Disease • Diabetes • Autoimmune Disease • To make this work: think very hard about what a disease is

  10. Aristotelian definitions • To define a term ‘A’ in an ontology identify the parent term ‘B’ and start your definition: • An A is a B which … Cs …. A = species B = genus C = differentia

  11. Cancer disease is a disease which … • Genetic disease is a disease which … • Infectious disease is a disease which …
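
The "An A is a B which Cs" template from the two slides above can be sketched as a small data structure. This is only an illustration: the class name `AristotelianDefinition` is hypothetical, and the triangle example is a toy stand-in chosen because the slides leave the differentiae of the disease terms open.

```python
from dataclasses import dataclass


def _article(word: str) -> str:
    """Pick 'a' or 'an' by the first letter (a rough heuristic, not real grammar)."""
    return "an" if word[0].lower() in "aeiou" else "a"


@dataclass(frozen=True)
class AristotelianDefinition:
    species: str      # A: the term being defined
    genus: str        # B: its parent term in the ontology
    differentia: str  # C: what marks out As among the Bs

    def render(self) -> str:
        start = _article(self.species).capitalize()
        return (f"{start} {self.species} is {_article(self.genus)} "
                f"{self.genus} which {self.differentia}.")


# Toy example, not taken from the slides.
toy = AristotelianDefinition("triangle", "polygon", "has exactly three sides")
print(toy.render())
```

Forcing every definition through the same three slots makes it hard to define a term without first naming its parent, which is the point of the template.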

  12. Basic Formal Ontology (BFO)

  13. Basic Formal Ontology (BFO) top level mid-level domain level OBO Foundry Modular Organization

  14. Ontology for General Medical Science • http://code.google.com/p/ogms/ • (OBO) http://purl.obolibrary.org/obo/ogms.obo • (OWL) http://purl.obolibrary.org/obo/ogms.owl

  15. OGMS-based initiatives • Vital Signs Ontology (VSO) • EHR / Demographics Ontology • Infectious Disease Ontology (IDO) • Psychology Ontology (PSY) • Emotion Ontology (PSY-EM) • … • Genetic Disease Ontology • Cancer Ontology

  16. BFO: the very top Continuant Occurrent (Process, Event) Independent Continuant Dependent Continuant

  17. BFO & GO continuant occurrent biological processes independent continuant cellular component dependent continuant molecular function

  18. Basic Formal Ontology types Continuant Occurrent process, event Independent Continuant thing Dependent Continuant quality instances

  19. Experience with BFO in building ontologies provides • a community of skilled ontology developers and users (user group has 120 members) • associated logical tools • documentation for different types of users • a methodology for building conformant ontologies by starting with BFO and populating downwards

  20. Example: The Cell Ontology

  21. How to build an ontology • import BFO into an ontology editor such as Protégé • work with domain experts to create an initial mid-level classification • find the ~50 most commonly used terms corresponding to types in reality • arrange these terms into an informal is_a hierarchy according to this universality principle • A is_a B means: every instance of A is an instance of B • fill in missing terms to give a complete hierarchy • (leave it to domain experts to populate the lower levels of the hierarchy)
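
The universality principle behind is_a can be sketched in a few lines. The hierarchy below reuses the BFO and GO terms from slides 16 and 17; the dictionary layout, the function names, and the instance "this mitochondrion" are assumptions made for illustration, not part of any ontology tool.

```python
# child term -> parent term (each term has at most one parent in this sketch)
PARENTS = {
    "independent continuant": "continuant",
    "dependent continuant": "continuant",
    "cellular component": "independent continuant",
    "molecular function": "dependent continuant",
    "biological process": "occurrent",
}

# Hypothetical instance data: instance -> the most specific type asserted for it.
ASSERTED_TYPE = {"this mitochondrion": "cellular component"}


def ancestors(term: str) -> list[str]:
    """Walk up the is_a hierarchy and collect every more general term."""
    found = []
    while term in PARENTS:
        term = PARENTS[term]
        found.append(term)
    return found


def is_instance_of(instance: str, type_: str) -> bool:
    """An instance of A is also an instance of every B such that A is_a B."""
    asserted = ASSERTED_TYPE[instance]
    return asserted == type_ or type_ in ancestors(asserted)


print(ancestors("cellular component"))
print(is_instance_of("this mitochondrion", "continuant"))
```

If an instance of a child term ever fails this check against one of its ancestors, the hierarchy contains an is_a link that violates the principle and should be revisited.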

  22. Basic Formal Ontology continuant occurrent independent continuant dependent continuant organism

  23. Continuants • continue to exist through time, preserving their identity while undergoing different sorts of changes • independent continuants – objects, things, ... • dependent continuants – qualities, attributes, shapes, potentialities ...

  24. Occurrents • processes, events, happenings • your life • this process of accelerated cell division

  25. Qualities temperature blood pressure mass ... are continuants they exist through time while undergoing changes

  26. Qualities temperature / blood pressure / mass ... are dimensions of variation within the structure of the entity a quality is something which can change while its bearer remains one and the same

  27. A Chart representing how John’s temperature changes

  28. A Chart representing how John’s temperature changes

  29. John’s temperature, the temperature he has throughout his entire life, cycles through different determinate temperatures from one time to the next • John’s temperature, in thus changing, exerts an influence on other dimensions of variation in the physiology of the organism through time

  30. BFO: The Very Top continuant occurrent independent continuant dependent continuant quality temperature

  31. Blinding Flash of the Obvious independent continuant dependent continuant quality organism temperature types instances John John’s temperature

  32. Blinding Flash of the Obvious independent continuant dependent continuant quality organism temperature types instances John John’s temperature

  33. Blinding Flash of the Obvious inheres_in . organism temperature types instances John John’s temperature

  34. temperature types 37ºC 37.1ºC 37.2ºC 37.3ºC 37.4ºC 37.5ºC instantiates at t1 instantiates at t2 instantiates at t3 instantiates at t4 instantiates at t5 instantiates at t6 John’s temperature instances

  35. human types embryo fetus neonate infant child adult instantiates at t1 instantiates at t2 instantiates at t3 instantiates at t4 instantiates at t5 instantiates at t6 John instances
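
Slides 34 and 35 describe one instance that instantiates different types at different times while remaining the same entity. A minimal sketch of that idea follows, using the type names and time labels from the slides; the `Continuant` class and its methods are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Continuant:
    """One entity that keeps its identity while the type it instantiates changes."""
    name: str
    history: dict[str, str] = field(default_factory=dict)  # time label -> type

    def instantiate(self, time: str, type_: str) -> None:
        self.history[time] = type_

    def type_at(self, time: str) -> str:
        return self.history[time]


times = ["t1", "t2", "t3", "t4", "t5", "t6"]

john = Continuant("John")
for t, stage in zip(times, ["embryo", "fetus", "neonate", "infant", "child", "adult"]):
    john.instantiate(t, stage)

temperature = Continuant("John's temperature")
for t, value in zip(times, ["37ºC", "37.1ºC", "37.2ºC", "37.3ºC", "37.4ºC", "37.5ºC"]):
    temperature.instantiate(t, value)

print(john.type_at("t3"), temperature.type_at("t6"))
```

The design choice worth noting is that the changing type lives in a time-indexed record on a single object, rather than in six separate objects, which mirrors the slides' claim that the bearer remains one and the same.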

  36. whole plant continuants mature whole plant zygote pro-embryo globular embryo bilateral embryo ... fertilization first cell division becomes reproductively able occurrents

  37. child transformation_of fetus

  38. Temperature subtypes • Development-stage subtypes • are threshold divisions (hence we do not have sharp boundaries, and we have a certain degree of choice, e.g. in how many subtypes to distinguish, though not in their ordering)

  39. independent continuant dependent continuant quality organism temperature types instances John John’s temperature

  40. independent continuant dependent continuant occurrent process quality organism course of temperature changes temperature John John’s temperature John’s temperature history

  41. independent continuant dependent continuant occurrent process quality organism life of an organism temperature John John’s temperature John’s life

  42. BFO: The Very Top continuant occurrent independent continuant dependent continuant quality disposition

  43. BFO: The Very Top continuant occurrent independent continuant dependent continuant quality function role disposition

  44. disposition - of a glass vase, to shatter if dropped - of a human, to eat - of a banana, to ripen - of John, to lose hair

  45. disposition if it ceases to exist, then its bearer and/or its immediate surrounding environment is physically changed its realization occurs when its bearer is in some special physical circumstances its realization is what it is in virtue of the bearer’s physical make-up

  46. function - of liver: to store glycogen - of birth canal: to enable transport - of eye: to see - of mitochondrion: to produce ATP • not optional; reflection of physical makeup of bearer; subtype of disposition

  47. independent continuant dependent continuant occurrent process function eye process of seeing to see John’s eye function of John’s eye: to see John seeing

  48. OGMS Ontology for General Medical Science http://code.google.com/p/ogms

  49. Physical Disorder
