Basic Building Blocks for Biomedical Ontologies

Barry Smith

Problems with UMLS-style approaches

  • let a million ontologies bloom, each one close to the terminological habits of its authors

  • in concordance with the “not invented here” syndrome

  • then map these ontologies, and use these mappings to integrate your different pots of data


Mappings are hard

They create an N² problem, are fragile, and are expensive to maintain

Need new authorities to maintain them (one for each pair of mapped ontologies), yielding a new risk of forking – who will police the mappings?

The goal should be to minimize the need for mappings, by avoiding redundancy in the first place – one ontology for each domain

Invest resources in disjoint ontology modules which work well together – reduce need for mappings to minimum possible
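
The N² growth can be checked with a line of arithmetic. A minimal sketch, assuming one mapping is maintained per unordered pair of ontologies (the function name is illustrative and not from the talk):

```python
from math import comb


def pairwise_mappings(n_ontologies: int) -> int:
    """Count unordered pairs of ontologies, assuming one mapping per pair."""
    return comb(n_ontologies, 2)


if __name__ == "__main__":
    for n in (3, 10, 100):
        # n(n-1)/2 grows quadratically in n
        print(n, "ontologies ->", pairwise_mappings(n), "mappings")
```

Under this assumption, 100 overlapping ontologies already imply 4950 mappings to keep in sync, which is the maintenance burden described above.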


Why should you care?

  • you need to create systems for data mining and text processing which will yield useful digitally coded output

  • if the codes you use are constantly in need of ad hoc repair, huge resources will be wasted

  • serious investment in annotation will be defeated from the start

  • relevant data will not be found, because it will be lost in multiple semantic cemeteries


How to do it right?

  • how to create an incremental, evolutionary process, where what is good survives, and what is bad fails

  • where the number of ontologies needing to be used together is small – integration = addition

  • where these ontologies are stable

  • by creating a scenario in which people will find it profitable to reuse ontologies, terminologies and coding systems which have been tried and tested


Reasons why GO has been successful

  • It is a system for prospective standardization built with a coherent top level but with content contributed and monitored by domain specialists

  • Based on community consensus

  • Updated every night

  • Clear versioning principles ensure backwards compatibility; prior annotations do not lose their value

  • Initially low-tech to encourage users, with movement to more powerful formal approaches (including OWL-DL – though still proceeding with caution)


GO has learned the lessons of successful cooperation

  • Clear documentation

  • The terms chosen are already familiar

  • Fully open source (allows thorough testing in manifold combinations with other ontologies)

  • Subjected to considerable third-party critique

  • Tracker for user input and help desk with rapid turnaround


GO has been amazingly successful in overcoming the data balkanization problem

but it covers only generic biological entities of three sorts:

  • cellular components

  • molecular functions

  • biological processes

    no diseases, symptoms, disease biomarkers, protein interactions, experimental processes …


OBO (Open Biomedical Ontology) Foundry proposal

(Gene Ontology in yellow)


Environment Ontology (ENVO)


Population-level ontologies


The OBO Foundry: a step-by-step, evidence-based approach to expanding the GO

  • Developers commit to working to ensure that, for each domain, there is community convergence on a single ontology

  • and agree in advance to collaborate with developers of ontologies in adjacent domains.

    http://obofoundry.org


OBO Foundry Principles

  • Common governance (coordinating editors)

  • Common training

  • Common architecture:

    • simple shared top level ontology (BFO)

    • shared Relation Ontology: www.obofoundry.org/ro


Open Biomedical Ontologies Foundry

Seeks to create high quality, validated terminology modules across all of the life sciences which will be

  • one ontology for each domain, so no need for mappings

  • close to language use of experts

  • evidence-based

  • incorporate a strategy for motivating potential developers and users

  • revisable as science advances


Principles

http://obofoundry.org/wiki/index.php/OBO_FoundryPrinciples


OBO Foundry coverage

(figure: ontologies arranged by relation to time and by granularity)


Orthogonality

  • modularity ensures

    • annotations can be additive

    • division of labor amongst domain experts

    • high value of training in any given module

    • lessons learned in one module can benefit work on other modules

    • incentivization of those responsible for individual modules


Benefits of coordination

Can more easily reuse what is made by others

Can more easily inspect and criticize what is made by others

Leads to innovations (e.g. the MIREOT strategy for importing terms into ontologies)



Foundry ontologies currently under review

Plant Ontology (PO)

Ontology for Biomedical Investigations (OBI)

Ontology for General Medical Science (OGMS)

Infectious Disease Ontology (IDO)


OBO Foundry Modular Organization

(figure: Basic Formal Ontology (BFO) at the top level, above the mid-level and domain-level ontologies)


OBI

  • The Ontology for Biomedical Investigations

  • http://purl.org/obo/OBI_0000225


Purpose of OBI

  • To provide a resource for the unambiguous description of the components of biomedical investigations such as the design, protocols and instrumentation, material, data and types of analysis and statistical tools applied to the data

    • NOT designed to model biology


OBI Collaborating Communities

  • Crop sciences Generation Challenge Programme (GCP),

  • Environmental genomics MGED RSBI Group, www.mged.org/Workgroups/rsbi

  • Genomic Standards Consortium (GSC), www.genomics.ceh.ac.uk/genomecatalogue

  • HUPO Proteomics Standards Initiative (PSI), psidev.sourceforge.net

  • Immunology Database and Analysis Portal, www.immport.org

  • Immune Epitope Database and Analysis Resource (IEDB), http://www.immuneepitope.org/home.do

  • International Society for Analytical Cytology, http://www.isac-net.org/

  • Metabolomics Standards Initiative (MSI),

  • Neurogenetics, Biomedical Informatics Research Network (BIRN),

  • Nutrigenomics MGED RSBI Group, www.mged.org/Workgroups/rsbi

  • Polymorphism

  • Toxicogenomics MGED RSBI Group, www.mged.org/Workgroups/rsbi

  • Transcriptomics MGED Ontology Group


Ontology for General Medical Science

  • http://code.google.com/p/ogms/

  • (OBO) http://purl.obolibrary.org/obo/ogms.obo

  • (OWL) http://purl.obolibrary.org/obo/ogms.owl


OGMS-based initiatives

  • Vital Signs Ontology (VSO) (Welch Allyn)

  • EHR / Demographics Ontology

  • Infectious Disease Ontology

  • Mental Health Ontology

  • Emotion Ontology


Ontology for General Medical Science

  • Jobst Landgrebe (then Co-Chair of the HL7 Vocabulary Group):

  • “the best ontology effort in the whole biomedical domain by far”



How is the OBO Foundry organized?

  • Top-Level: Basic Formal Ontology (BFO)

  • Mid-Level: IAO, OBI, OGMS ...

  • Domain-Level: Foundry Bio-Ontologies



BFO: the very top

  • Continuant

    • Independent Continuant

    • Dependent Continuant

  • Occurrent (Process, Event)


OBO Foundry coverage by relation to time and by granularity (figure; see obofoundry.org)


BFO & GO

  • continuant

    • independent continuant: cellular component

    • dependent continuant: molecular function

  • occurrent: biological processes
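
The correspondence between the three GO branches and the BFO categories on this slide can be written as a small lookup table. A minimal sketch; the dictionary and function names are assumptions made for illustration, not part of GO or BFO:

```python
# Illustrative lookup only: names and string labels are assumptions.
GO_BRANCH_TO_BFO = {
    "cellular component": "independent continuant",
    "molecular function": "dependent continuant",
    "biological process": "occurrent",
}


def bfo_category(go_branch: str) -> str:
    """Return the BFO top-level category under which a GO branch falls."""
    return GO_BRANCH_TO_BFO[go_branch.strip().lower()]


if __name__ == "__main__":
    for branch in GO_BRANCH_TO_BFO:
        print(branch, "->", bfo_category(branch))
```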


Basic Formal Ontology

types:

  • Continuant

    • Independent Continuant (thing)

    • Dependent Continuant (quality)

  • Occurrent (process, event)

instances: ...


Experience with BFO in building ontologies provides

  • a community of skilled ontology developers and users (user group has 120 members)

  • associated logical tools

  • documentation for different types of users

  • a methodology for building conformant ontologies by starting with BFO and populating downwards


Example: The Cell Ontology


How to build an ontology

  • import BFO into ontology editor such as Protégé

  • work with domain experts to create an initial mid-level classification

  • find ~50 most commonly used terms corresponding to types in reality

  • arrange these terms into an informal is_a hierarchy according to the universality principle:

  • A is_a B means that every instance of A is an instance of B

  • fill in missing terms to give a complete hierarchy

  • (leave it to domain experts to populate the lower levels of the hierarchy)
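
The universality principle for is_a can be tested against known instances. A minimal sketch, assuming each type is represented by the set of its recorded instances (the data below is hypothetical). Note that a finite sample can only refute a proposed is_a link; it cannot prove that the link holds for every instance:

```python
from typing import Dict, Set


def is_a(instances_of: Dict[str, Set[str]], a: str, b: str) -> bool:
    """True if every recorded instance of type a is also an instance of type b."""
    return instances_of[a] <= instances_of[b]


# Hypothetical toy data: type name -> identifiers of known instances.
instances_of = {
    "cell": {"c1", "c2", "c3"},
    "neuron": {"c1", "c2"},
    "organism": {"john"},
}

print(is_a(instances_of, "neuron", "cell"))      # True
print(is_a(instances_of, "cell", "neuron"))      # False
print(is_a(instances_of, "neuron", "organism"))  # False
```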


Users of BFO

PharmaOntology (W3C HCLS SIG)

MediCognos / Microsoft Healthvault

Cleveland Clinic Semantic Database in Cardiothoracic Surgery

Major Histocompatibility Complex (MHC) Ontology (NIAID)

Neuroscience Information Framework Standard (NIFSTD) and Constituent Ontologies

Interdisciplinary Prostate Ontology (IPO)

Nanoparticle Ontology (NPO): Ontology for Cancer Nanotechnology Research

Neural Electromagnetic Ontologies (NEMO)

ChemAxiom – Ontology for Chemistry


Users of BFO

GO Gene Ontology

CL Cell Ontology

SO Sequence Ontology

ChEBI Chemical Ontology

PATO Phenotype (Quality) Ontology

FMA Foundational Model of Anatomy Ontology

ChEBI Chemical Entities of Biological Interest

PRO Protein Ontology

Plant Ontology

Environment Ontology

Ontology for Biomedical Investigations

RNA Ontology


Users of BFO

Ontology for Risks Against Patient Safety (RAPS/REMINE)

eagle-i and VIVO (NCRR)

IDO Infectious Disease Ontology (NIAID)

National Cancer Institute Biomedical Grid Terminology (BiomedGT)

US Army Biometrics Ontology

US Army Command and Control Ontology

Sleep Domain Ontology

Subcellular Anatomy Ontology (SAO) 

Translational Medicine On (VO)

Yeast Ontology (yOWL)

Zebrafish Anatomical Ontology (ZAO)


Basic Formal Ontology

  • continuant

    • independent continuant

      • organism

    • dependent continuant

  • occurrent


Continuants

  • continue to exist through time, preserving their identity while undergoing different sorts of changes

  • independent continuants – objects, things, ...

  • dependent continuants – qualities, attributes, shapes, potentialities ...


Occurrents

  • processes, events, happenings

    • your life

    • this process of accelerated cell division


Qualities

temperature, blood pressure, mass ...

are continuants: they exist through time while undergoing changes


Qualities

temperature / blood pressure / mass ...

are dimensions of variation within the structure of the entity

a quality is something which can change while its bearer remains one and the same



A chart representing how John’s temperature changes


BFO: The Very Top

  • continuant

    • independent continuant

    • dependent continuant

      • quality

        • temperature

  • occurrent


Blinding Flash of the Obvious

types:

  • independent continuant

    • organism

  • dependent continuant

    • quality

      • temperature

instances:

  • John (instance of organism)

  • John’s temperature (instance of temperature)



Blinding Flash of the Obvious

  • types: temperature inheres_in organism

  • instances: John’s temperature inheres_in John


temperature

  • types: 37ºC, 37.1ºC, 37.2ºC, 37.3ºC, 37.4ºC, 37.5ºC

  • instances: John’s temperature, which instantiates one of these types at each of the times t1, t2, t3, t4, t5, t6


human

  • types: embryo, fetus, neonate, infant, child, adult

  • instances: John, who instantiates one of these types at each of the times t1, t2, t3, t4, t5, t6


Temperature subtypes and development-stage subtypes

are threshold divisions (hence we do not have sharp boundaries, and we have a certain degree of choice, e.g. in how many subtypes to distinguish, though not in their ordering)
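
A threshold division can be sketched as a lookup over ordered cut-off values. This is a minimal illustration, assuming made-up cut-offs and labels (none of them come from the talk): the number of bands is the modeller's choice, while their ordering is fixed. The code imposes sharp cut-offs purely as a convention, whereas the slide stresses that the real boundaries are not sharp.

```python
from bisect import bisect_right

# Hypothetical cut-offs in degrees Celsius and hypothetical band labels.
BOUNDS = [36.0, 37.5, 39.0]
LABELS = ["low", "normal", "elevated", "high"]


def temperature_subtype(celsius: float) -> str:
    """Return the band whose range contains the given temperature."""
    return LABELS[bisect_right(BOUNDS, celsius)]


if __name__ == "__main__":
    for value in (35.2, 37.0, 38.1, 40.3):
        print(value, "->", temperature_subtype(value))
```

Adding or removing a cut-off changes how many subtypes are distinguished, but a higher temperature can never land in an earlier band.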



  • independent continuant

    • organism (instance: John)

  • dependent continuant

    • quality

      • temperature (instance: John’s temperature)

  • occurrent

    • process

      • course of temperature changes (instance: John’s temperature history)


  • independent continuant

    • organism (instance: John)

  • dependent continuant

    • quality

      • temperature (instance: John’s temperature)

  • occurrent

    • process

      • life of an organism (instance: John’s life)


BFO: The Very Top

  • continuant

    • independent continuant

    • dependent continuant

      • quality

      • disposition

  • occurrent


BFO: The Very Top

  • continuant

    • independent continuant

    • dependent continuant

      • quality

      • function

      • role

      • disposition

  • occurrent


Disposition

- of a glass vase, to shatter if dropped

- of a human, to eat

- of a banana, to ripen

- of John, to lose hair


Disposition

if it ceases to exist, then its bearer and/or its immediate surrounding environment is physically changed

its realization occurs when its bearer is in some special physical circumstances

its realization is what it is in virtue of the bearer’s physical make-up


  • independent continuant

    • eye (instance: John’s eye)

  • dependent continuant

    • function

      • to see (instance: function of John’s eye: to see)

  • occurrent

    • process

      • process of seeing (instance: John seeing)


OGMS

Ontology for General Medical Science

http://code.google.com/p/ogms


Ontology for General Medical Science (OGMS)

  • ontology for the representation of

    • diseases, signs, symptoms

    • clinical processes

    • diagnosis, treatment and outcomes

  • fundamental idea:

    • a disease is a disposition rooted in some (physical) disorder in the organism


Motivation

  • Clarity about:

    • disease etiology and progression

    • disease and the diagnostic process

    • phenotype and signs/symptoms

    • entities in reality and observations of such entities


Physical Disorder

– independent continuant (fiat object part)

A causally linked combination of physical components of the extended organism that is clinically abnormal.


Clinically abnormal

  • (1) not part of the life plan for an organism of the relevant type (unlike aging or pregnancy),

  • (2) causally linked to an elevated risk either of pain or other feelings of illness, or of death or dysfunction, and

  • (3) such that the elevated risk exceeds a certain threshold level.*

    *Compare: baldness
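
The three criteria can be read as a conjunction of tests. A minimal sketch, assuming a hypothetical numeric risk score and threshold (the talk gives neither); the class, field names and example values are illustrative only:

```python
from dataclasses import dataclass


@dataclass
class BodilyFeature:
    name: str
    part_of_life_plan: bool  # criterion (1)
    risk_increase: float     # criteria (2) and (3): hypothetical score, 0.0 = no causal link


def clinically_abnormal(feature: BodilyFeature, threshold: float = 0.1) -> bool:
    """Apply the three criteria; the threshold value is an arbitrary assumption."""
    return (not feature.part_of_life_plan) and feature.risk_increase > threshold


# Hypothetical example values, chosen only to exercise each criterion.
pregnancy = BodilyFeature("pregnancy", part_of_life_plan=True, risk_increase=0.2)
baldness = BodilyFeature("baldness", part_of_life_plan=False, risk_increase=0.01)
necrotic_liver = BodilyFeature("necrotic liver", part_of_life_plan=False, risk_increase=0.8)

for f in (pregnancy, baldness, necrotic_liver):
    print(f.name, "->", clinically_abnormal(f))
```

Pregnancy fails criterion (1), and baldness, as the footnote suggests, falls below the threshold in criterion (3).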


Big Picture


Pathological Process

Pathological process =def. A bodily process that is a manifestation of a disorder and is clinically abnormal.

Disease =def. – A disposition to undergo pathological processes that exists in an organism because of one or more disorders in that organism.


Cirrhosis - environmental exposure

  • Etiological process - phenobarbital-induced hepatic cell death

    • produces

  • Disorder - necrotic liver

    • bears

  • Disposition (disease) - cirrhosis

    • realized_in

  • Pathological process - abnormal tissue repair with cell proliferation and fibrosis that exceed a certain threshold; hypoxia-induced cell death

    • produces

  • Abnormal bodily features

    • recognized_as

  • Symptoms - fatigue, anorexia

  • Signs - jaundice, enlarged spleen
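
The chain on this slide can be stored as subject, relation, object triples and walked from the etiological process onward. A minimal sketch using the slide's own relation names; the triple layout, the shortened entity labels and the `follow` helper are assumptions for illustration, not OGMS tooling:

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Entity labels abbreviated from the slide.
CIRRHOSIS: List[Triple] = [
    ("phenobarbital-induced hepatic cell death", "produces", "necrotic liver"),
    ("necrotic liver", "bears", "cirrhosis"),
    ("cirrhosis", "realized_in", "abnormal tissue repair"),
    ("abnormal tissue repair", "produces", "abnormal bodily features"),
    ("abnormal bodily features", "recognized_as", "symptoms and signs"),
]


def follow(triples: List[Triple], start: str) -> List[Tuple[str, str]]:
    """Walk the chain from start, returning each (relation, target) step in order."""
    index = {subject: (relation, obj) for subject, relation, obj in triples}
    path = []
    node = start
    while node in index:
        relation, node = index[node]
        path.append((relation, node))
    return path


if __name__ == "__main__":
    for relation, target in follow(CIRRHOSIS, CIRRHOSIS[0][0]):
        print(relation, "->", target)
```

The same structure fits the other disease slides, since only the entity labels change while the sequence of relations stays the same.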


Dispositions and Predispositions

All diseases are dispositions; not all dispositions are diseases.

Predisposition to Disease

=def. – A disposition in an organism that constitutes an increased risk of the organism’s subsequently developing some disease.


HNPCC - genetic pre-disposition

  • Etiological process - inheritance of a mutant mismatch repair gene

    • produces

  • Disorder - chromosome 3 with abnormal hMLH1

    • bears

  • Disposition (disease) - Lynch syndrome

    • realized_in

  • Pathological process - abnormal repair of DNA mismatches

    • produces

  • Disorder - mutations in proto-oncogenes and tumor suppressor genes with microsatellite repeats (e.g. TGF-beta R2)

    • bears

  • Disposition (disease) - non-polyposis colon cancer

    • realized_in

  • Symptoms (including pain)


Huntington’s Disease - genetic

  • Symptoms & Signs

    • used_in

  • Interpretive process

    • produces

  • Hypothesis - rule out Huntington’s

    • suggests

  • Laboratory tests

    • produces

  • Test results - molecular detection of the HTT gene with >39 CAG repeats

    • used_in

  • Interpretive process

    • produces

  • Result - diagnosis that patient X has a disorder that bears the disease Huntington’s disease

  • Etiological process - inheritance of >39 CAG repeats in the HTT gene

    • produces

  • Disorder - chromosome 4 with abnormal mHTT

    • bears

  • Disposition (disease) - Huntington’s disease

    • realized_in

  • Pathological process - accumulation of mHTT protein fragments, abnormal transcription regulation, neuronal cell death in striatum

    • produces

  • Abnormal bodily features

    • recognized_as

  • Symptoms - anxiety, depression

  • Signs - difficulties in speaking and swallowing


HNPCC - genetic pre-disposition

  • Etiological process - inheritance of a mutant mismatch repair gene

    • produces

  • Disorder - chromosome 3 with abnormal hMLH1

    • bears

  • Disposition (disease) - Lynch syndrome

    • realized_in

  • Pathological process - abnormal repair of DNA mismatches

    • produces

  • Disorder - mutations in proto-oncogenes and tumor suppressor genes with microsatellite repeats (e.g. TGF-beta R2)

    • bears

  • Disposition (disease) - non-polyposis colon cancer


Cirrhosis - environmental exposure

  • Symptoms & Signs

    • used_in

  • Interpretive process

    • produces

  • Hypothesis - rule out cirrhosis

    • suggests

  • Laboratory tests

    • produces

  • Test results - elevated liver enzymes in serum

    • used_in

  • Interpretive process

    • produces

  • Result - diagnosis that patient X has a disorder that bears the disease cirrhosis

  • Etiological process - phenobarbital-induced hepatic cell death

    • produces

  • Disorder - necrotic liver

    • bears

  • Disposition (disease) - cirrhosis

    • realized_in

  • Pathological process - abnormal tissue repair with cell proliferation and fibrosis that exceed a certain threshold; hypoxia-induced cell death

    • produces

  • Abnormal bodily features

    • recognized_as

  • Symptoms - fatigue, anorexia

  • Signs - jaundice, splenomegaly


Systemic arterial hypertension

  • Symptoms & Signs

    • used_in

  • Interpretive process

    • produces

  • Hypothesis - rule out hypertension

    • suggests

  • Laboratory tests

    • produces

  • Test results -

    • used_in

  • Interpretive process

    • produces

  • Result - diagnosis that patient X has a disorder that bears the disease hypertension

  • Etiological process – abnormal reabsorption of NaCl by the kidney

    • produces

  • Disorder – abnormally large scattered molecular aggregate of salt in the blood

    • bears

  • Disposition (disease) - hypertension

    • realized_in

  • Pathological process – exertion of abnormal pressure against arterial wall

    • produces

  • Abnormal bodily features

    • recognized_as

  • Symptoms - headaches, dizziness

  • Signs – elevated blood pressure


Type 2 Diabetes Mellitus

  • Symptoms & Signs

    • used_in

  • Interpretive process

    • produces

  • Hypothesis - rule out diabetes mellitus

    • suggests

  • Laboratory tests – fasting serum blood glucose, oral glucose challenge test, and/or blood hemoglobin A1c

    • produces

  • Test results -

    • used_in

  • Interpretive process

    • produces

  • Result - diagnosis that patient X has a disorder that bears the disease type 2 diabetes mellitus

  • Etiological process –

    • produces

  • Disorder – abnormal pancreatic beta cells and abnormal muscle/fat cells

    • bears

  • Disposition (disease) – diabetes mellitus

    • realized_in

  • Pathological processes – diminished insulin production, diminished muscle/fat uptake of glucose

    • produces

  • Abnormal bodily features

    • recognized_as

  • Symptoms – polydipsia, polyuria, polyphagia, blurred vision

  • Signs – elevated blood glucose and hemoglobin A1c


Type 1 hypersensitivity to penicillin

  • Symptoms & Signs

    • used_in

  • Interpretive process

    • produces

  • Hypothesis -

    • suggests

  • Laboratory tests –

    • produces

  • Test results – occasionally, skin testing

    • used_in

  • Interpretive process

    • produces

  • Result - diagnosis that patient X has a disorder that bears the disease type 1 hypersensitivity to penicillin

  • Etiological process – sensitizing of mast cells and basophils during exposure to penicillin-class substance

    • produces

  • Disorder – mast cells and basophils with epitope-specific IgE bound to Fc epsilon receptor I

    • bears

  • Disposition (disease) – type I hypersensitivity

    • realized_in

  • Pathological process – type I hypersensitivity reaction

    • produces

  • Abnormal bodily features

    • recognized_as

  • Symptoms – pruritus, shortness of breath

  • Signs – rash, urticaria, anaphylaxis


Disease vs. Disease course

Disease =def. – A disposition to undergo pathological processes that exists in an organism because of one or more disorders in that organism.

Disease course =def. – The aggregate of processes in which a disease disposition is realized.
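
The distinction can be mirrored in a data model where the disease is a persisting disposition and the course is derived from the processes that realize it. A minimal sketch; the class names, the integer time index and the example process are hypothetical, not OGMS terms:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Process:
    name: str
    start: int  # hypothetical time index


@dataclass
class Disease:
    """A disposition: it exists even while no realizing process is under way."""
    name: str
    disorder: str
    realizations: List[Process] = field(default_factory=list)

    def course(self) -> List[Process]:
        """The disease course: the aggregate of realizing processes, in time order."""
        return sorted(self.realizations, key=lambda p: p.start)


flu = Disease("flu", disorder="viable cells with influenza virus")
print(flu.course())  # [] : the disposition exists before any realization
flu.realizations.append(Process("acute inflammation", start=2))
print([p.name for p in flu.course()])
```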


coronary heart disease

  • disease associated with early lesions and small fibrous plaques

  • disease associated with asymptomatic (‘silent’) infarction

  • disease associated with surface disruption of plaque

  • unstable angina

  • stable angina

John’s coronary heart disease instantiates one of these types at each of the times t1, t2, t3, t4, t5


  • independent continuant

    • disorder (instance: John’s disordered heart)

  • dependent continuant

    • disposition

      • disease (instance: John’s coronary heart disease)

  • occurrent

    • process

      • course of disease (instance: course of John’s disease)


Examples of ontology terms


IDO (Infectious Disease Ontology) Core

Follows GO strategy of providing a canonical ontology of what is involved in every infectious disease – host, pathogen, vector, virulence, vaccine, transmission – accompanied by IDO Extensions for specific diseases, pathogens and vectors

Provides common terminology resources and tested common guidelines for a vast array of different disease communities


Infectious Disease Ontology Consortium

  • MITRE, Mount Sinai, UTSouthwestern – Influenza

  • IMBB/VectorBase – Vector borne diseases (A. gambiae, A. aegypti, I. scapularis, C. pipiens, P. humanus)

  • Colorado State University – Dengue Fever

  • Duke University – Tuberculosis, Staph. aureus

  • Cleveland Clinic – Infective Endocarditis

  • University of Michigan – Brucellosis

  • Duke University, University at Buffalo – HIV


Influenza - infectious

  • Etiological process - infection of airway epithelial cells with influenza virus

    • produces

  • Disorder - viable cells with influenza virus

    • bears

  • Disposition (disease) - flu

    • realized_in

  • Pathological process - acute inflammation

    • produces

  • Abnormal bodily features

    • recognized_as

  • Symptoms - weakness, dizziness

  • Signs - fever


Influenza – disease course

  • Etiological process - infection of airway epithelial cells with influenza virus

    • produces

  • Disorder - viable cells with influenza virus

    • bears

  • Disposition (disease) - flu

    • realized_in

  • Pathological process - acute inflammation

    • produces

  • Abnormal bodily features

    • recognized_as

  • Symptoms - weakness, dizziness

  • Signs - fever

The disorder also induces normal physiological processes (immune response) that can result in the elimination of the disorder (transient disease course).


Big Picture