Basic Building Blocks for Biomedical Ontologies

Barry Smith


Problems with UMLS-style approaches

  • let a million ontologies bloom, each one close to the terminological habits of its authors

  • in concordance with the “not invented here” syndrome

  • then map these ontologies, and use these mappings to integrate your different pots of data


Mappings are hard

They create an N² problem, are fragile, and are expensive to maintain

Need new authorities to maintain them (one for each pair of mapped ontologies), yielding a new risk of forking – who will police the mappings?

The goal should be to minimize the need for mappings, by avoiding redundancy in the first place – one ontology for each domain

Invest resources in disjoint ontology modules which work well together – reduce need for mappings to minimum possible
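
To make the N² claim concrete: if N overlapping ontologies each need a mapping to every other one, there are N(N-1)/2 mappings to build and police. The Python sketch below is illustrative only; the function names are hypothetical and do not come from the talk.

```python
from itertools import combinations


def pairwise_mappings(ontologies):
    """Return every unordered pair of ontologies that would need its own mapping."""
    return list(combinations(ontologies, 2))


def mapping_count(n):
    """Number of pairwise mappings among n overlapping ontologies: n(n-1)/2."""
    return n * (n - 1) // 2


if __name__ == "__main__":
    # The count grows quadratically with the number of ontologies.
    for n in (3, 10, 100):
        print(n, "ontologies ->", mapping_count(n), "mappings to maintain")
```

With disjoint modules (one ontology per domain) no such pairs arise, which is the point of the last two statements above.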


Why should you care?

  • you need to create systems for data mining and text processing which will yield useful digitally coded output

  • if the codes you use are constantly in need of ad hoc repair, huge resources will be wasted

  • serious investment in annotation will be defeated from the start

  • relevant data will not be found, because it will be lost in multiple semantic cemeteries


How to do it right?

  • how to create an incremental, evolutionary process, where what is good survives, and what is bad fails

  • where the number of ontologies needing to be used together is small – integration = addition

  • where these ontologies are stable

  • by creating a scenario in which people will find it profitable to reuse ontologies, terminologies and coding systems which have been tried and tested


Reasons why GO has been successful

  • It is a system for prospective standardization built with coherent top level but with content contributed and monitored by domain specialists

  • Based on community consensus

  • Updated every night

  • Clear versioning principles ensure backwards compatibility; prior annotations do not lose their value

  • Initially low-tech to encourage users, with movement to more powerful formal approaches (including OWL-DL – though still proceeding with caution)


GO has learned the lessons of successful cooperation

  • Clear documentation

  • The terms chosen are already familiar

  • Fully open source (allows thorough testing in manifold combinations with other ontologies)

  • Subjected to considerable third-party critique

  • Tracker for user input and help desk with rapid turnaround


GO has been amazingly successful in overcoming the data balkanization problem

but it covers only generic biological entities of three sorts:

  • cellular components

  • molecular functions

  • biological processes

    no diseases, symptoms, disease biomarkers, protein interactions, experimental processes …


OBO (Open Biomedical Ontology) Foundry proposal

(Gene Ontology in yellow)


Environment Ontology (ENVO)



Population-level ontologies


The OBO Foundry: a step-by-step, evidence-based approach to expanding the GO

  • Developers commit to working to ensure that, for each domain, there is community convergence on a single ontology

  • and agree in advance to collaborate with developers of ontologies in adjacent domains.

    http://obofoundry.org


OBO Foundry Principles

  • Common governance (coordinating editors)

  • Common training

  • Common architecture:

    • simple shared top level ontology (BFO)

    • shared Relation Ontology: www.obofoundry.org/ro


Open Biomedical Ontologies Foundry

Seeks to create high quality, validated terminology modules across all of the life sciences which will be

  • one ontology for each domain, so no need for mappings

  • close to language use of experts

  • evidence-based

  • incorporate a strategy for motivating potential developers and users

  • revisable as science advances


Principles

http://obofoundry.org/wiki/index.php/OBO_FoundryPrinciples


OBO Foundry coverage (figure; axes: relation to time, granularity)


ORTHOGONALITY

  • modularity ensures

    • annotations can be additive

    • division of labor amongst domain experts

    • high value of training in any given module

    • lessons learned in one module can benefit work on other modules

    • incentivization of those responsible for individual modules


Benefits of coordination

Can more easily reuse what is made by others

Can more easily inspect and criticize what is made by others

Leads to innovations (e.g. the MIREOT strategy for importing terms into ontologies)


Current Foundry members in yellow


Foundry ontologies currently under review

Plant Ontology (PO)

Ontology for Biomedical Investigations (OBI)

Ontology for General Medical Science (OGMS)

Infectious Disease Ontology (IDO)


OBO Foundry Modular Organization

  • top level: Basic Formal Ontology (BFO)

  • mid-level

  • domain level


OBI

  • The Ontology for Biomedical Investigations

  • http://purl.org/obo/OBI_0000225


Purpose of OBI

  • To provide a resource for the unambiguous description of the components of biomedical investigations such as the design, protocols and instrumentation, material, data and types of analysis and statistical tools applied to the data

    • NOT designed to model biology


OBI Collaborating Communities

  • Crop sciences Generation Challenge Programme (GCP),

  • Environmental genomics MGED RSBI Group, www.mged.org/Workgroups/rsbi

  • Genomic Standards Consortium (GSC), www.genomics.ceh.ac.uk/genomecatalogue

  • HUPO Proteomics Standards Initiative (PSI), psidev.sourceforge.net

  • Immunology Database and Analysis Portal, www.immport.org

  • Immune Epitope Database and Analysis Resource (IEDB), http://www.immuneepitope.org/home.do

  • International Society for Analytical Cytology, http://www.isac-net.org/

  • Metabolomics Standards Initiative (MSI),

  • Neurogenetics, Biomedical Informatics Research Network (BIRN),

  • Nutrigenomics MGED RSBI Group, www.mged.org/Workgroups/rsbi

  • Polymorphism

  • Toxicogenomics MGED RSBI Group, www.mged.org/Workgroups/rsbi

  • Transcriptomics MGED Ontology Group


Ontology for General Medical Science

  • http://code.google.com/p/ogms/

  • (OBO) http://purl.obolibrary.org/obo/ogms.obo

  • (OWL) http://purl.obolibrary.org/obo/ogms.owl


OGMS-based initiatives

  • Vital Signs Ontology (VSO) (Welch Allyn)

  • EHR / Demographics Ontology

  • Infectious Disease Ontology

  • Mental Health Ontology

  • Emotion Ontology


Ontology for General Medical Science

  • Jobst Landgrebe (then Co-Chair of the HL7 Vocabulary Group):

  • “the best ontology effort in the whole biomedical domain by far”


  • How to keep clear about the distinction between

    • processes of observation,

    • results of such processes (measurement data)

    • the entities observed


How is the OBO Foundry organized?

  • Top-Level: Basic Formal Ontology (BFO)

  • Mid-Level: IAO, OBI, OGMS ...

  • Domain-Level: Foundry Bio-Ontologies


BFO: the very top

  • Continuant

    • Independent Continuant

    • Dependent Continuant

  • Occurrent (Process, Event)


Figure axes: relation to time, granularity (obofoundry.org)


BFO & GO

  • continuant

    • independent continuant: cellular component

    • dependent continuant: molecular function

  • occurrent: biological processes


Basic Formal Ontology

types

  • Continuant

    • Independent Continuant (thing)

    • Dependent Continuant (quality)

  • Occurrent (process, event)

instances: .... ..... .......


Experience with BFO in building ontologies provides

  • a community of skilled ontology developers and users (user group has 120 members)

  • associated logical tools

  • documentation for different types of users

  • a methodology for building conformant ontologies by starting with BFO and populating downwards


Example: The Cell Ontology


How to build an ontology

  • import BFO into ontology editor such as Protégé

  • work with domain experts to create an initial mid-level classification

  • find ~50 most commonly used terms corresponding to types in reality

  • arrange these terms into an informal is_a hierarchy according to the following universality principle:

  • A is_a B =def. every instance of A is an instance of B

  • fill in missing terms to give a complete hierarchy

  • (leave it to domain experts to populate the lower levels of the hierarchy)
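
The universality principle behind is_a can be checked mechanically on sample data. The Python sketch below is a minimal illustration, assuming a hypothetical table of instances per type (the names and instances are invented for the example); it treats A is_a B as holding when every recorded instance of A is also a recorded instance of B.

```python
# Hypothetical sample data: each type name maps to the set of its known instances.
INSTANCES = {
    "organism": {"John", "Rex the dog"},
    "human": {"John"},
    "temperature": {"John's temperature"},
}


def is_a(instances, a, b):
    """True when every recorded instance of type a is also an instance of type b."""
    return instances[a] <= instances[b]  # subset test on the two instance sets


if __name__ == "__main__":
    print(is_a(INSTANCES, "human", "organism"))     # human is_a organism
    print(is_a(INSTANCES, "organism", "human"))     # the converse fails
    print(is_a(INSTANCES, "temperature", "human"))  # unrelated types
```

A check like this can only refute a proposed is_a link on the data at hand; it cannot establish that the link holds for all instances, which remains a judgment for domain experts.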


Users of BFO

PharmaOntology (W3C HCLS SIG)

MediCognos / Microsoft Healthvault

Cleveland Clinic Semantic Database in Cardiothoracic Surgery

Major Histocompatibility Complex (MHC) Ontology (NIAID)

Neuroscience Information Framework Standard (NIFSTD) and Constituent Ontologies

Interdisciplinary Prostate Ontology (IPO)

Nanoparticle Ontology (NPO): Ontology for Cancer Nanotechnology Research

Neural Electromagnetic Ontologies (NEMO)

ChemAxiom – Ontology for Chemistry


Users of BFO

GO Gene Ontology

CL Cell Ontology

SO Sequence Ontology

ChEBI Chemical Ontology

PATO Phenotype (Quality) Ontology

FMA Foundational Model of Anatomy Ontology

ChEBI Chemical Entities of Biological Interest

PRO Protein Ontology

Plant Ontology

Environment Ontology

Ontology for Biomedical Investigations

RNA Ontology


Users of BFO

Ontology for Risks Against Patient Safety (RAPS/REMINE)

eagle-i and VIVO (NCRR)

IDO Infectious Disease Ontology (NIAID)

National Cancer Institute Biomedical Grid Terminology (BiomedGT)

US Army Biometrics Ontology

US Army Command and Control Ontology

Sleep Domain Ontology

Subcellular Anatomy Ontology (SAO) 

Translational Medicine On (VO)

Yeast Ontology (yOWL)

Zebrafish Anatomical Ontology (ZAO)


Basic Formal Ontology

  • continuant

    • independent continuant

      • organism

    • dependent continuant

  • occurrent


Continuants

  • continue to exist through time, preserving their identity while undergoing different sorts of changes

  • independent continuants – objects, things, ...

  • dependent continuants – qualities, attributes, shapes, potentialities ...


Occurrents

  • processes, events, happenings

    • your life

    • this process of accelerated cell division


Qualities

temperature, blood pressure, mass ...

are continuants: they exist through time while undergoing changes


Qualities

temperature / blood pressure / mass ...

are dimensions of variation within the structure of the entity

a quality is something which can change while its bearer remains one and the same


A Chart representing how John’s temperature changes


BFO: The Very Top

  • continuant

    • independent continuant

    • dependent continuant

      • quality

        • temperature

  • occurrent


Blinding Flash of the Obvious

types

  • independent continuant

    • organism

  • dependent continuant

    • quality

      • temperature

instances

  • John

  • John’s temperature


Blinding Flash of the Obvious

types

  • temperature inheres_in organism

instances

  • John’s temperature inheres_in John
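
The type/instance picture can be sketched as a small class hierarchy. The Python below is only an illustration, not BFO itself or any official API: the class names follow the slide labels, and the `inheres_in` attribute and the `celsius` field are assumptions made for the example.

```python
class Entity:
    def __init__(self, name):
        self.name = name


class Continuant(Entity):
    """Exists through time, preserving its identity while undergoing change."""


class Occurrent(Entity):
    """Processes, events, happenings."""


class IndependentContinuant(Continuant):
    """Objects and things, e.g. an organism."""


class DependentContinuant(Continuant):
    """Depends on a bearer; the inheres_in link is required at construction."""

    def __init__(self, name, bearer):
        if not isinstance(bearer, IndependentContinuant):
            raise TypeError("a dependent continuant must inhere in an independent continuant")
        super().__init__(name)
        self.inheres_in = bearer


class Organism(IndependentContinuant):
    pass


class Quality(DependentContinuant):
    pass


class Temperature(Quality):
    def __init__(self, name, bearer, celsius):
        super().__init__(name, bearer)
        self.celsius = celsius  # hypothetical numeric value of the quality


# Instances: John, and John's temperature, which inheres in John.
john = Organism("John")
johns_temperature = Temperature("John's temperature", john, 37.0)

# The quality changes while its bearer remains one and the same.
johns_temperature.celsius = 37.5
print(johns_temperature.name, "inheres_in", johns_temperature.inheres_in.name)
```

Making the bearer a constructor argument is one way to express that a quality cannot exist without something to inhere in.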


temperature (type), with subtypes 37ºC, 37.1ºC, 37.2ºC, 37.3ºC, 37.4ºC, 37.5ºC

John’s temperature (instance):

  • instantiates 37ºC at t1

  • instantiates 37.1ºC at t2

  • instantiates 37.2ºC at t3

  • instantiates 37.3ºC at t4

  • instantiates 37.4ºC at t5

  • instantiates 37.5ºC at t6


human (type), with subtypes embryo, fetus, neonate, infant, child, adult

John (instance):

  • instantiates embryo at t1

  • instantiates fetus at t2

  • instantiates neonate at t3

  • instantiates infant at t4

  • instantiates child at t5

  • instantiates adult at t6


Temperature subtypes and development-stage subtypes

are threshold divisions (hence we do not have sharp boundaries, and we have a certain degree of choice, e.g. in how many subtypes to distinguish, though not in their ordering)
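
The idea of threshold divisions can be sketched in Python. Everything specific here is an assumption made for illustration: the boundary values, the band labels and the readings at t1 to t6 are hypothetical. The sketch shows that one can choose how many subtypes to distinguish (a coarse or a fine division) while the ordering of the subtypes stays fixed.

```python
from bisect import bisect_right


def subtype(value, boundaries, labels):
    """Return the label of the threshold band that value falls into."""
    assert len(labels) == len(boundaries) + 1
    assert boundaries == sorted(boundaries)
    return labels[bisect_right(boundaries, value)]


# Hypothetical readings of one quality instance (John's temperature), in ºC.
readings = {"t1": 37.0, "t2": 37.1, "t3": 37.2, "t4": 37.3, "t5": 37.4, "t6": 37.5}

# Two hypothetical ways of dividing the same range: the number of bands differs,
# but in both cases the bands are ordered from lower to higher values.
COARSE = ([37.25], ["lower band", "upper band"])
FINE = ([37.15, 37.35], ["low band", "middle band", "high band"])

for t, value in readings.items():
    print(t, value, subtype(value, *COARSE), subtype(value, *FINE))
```

The same instance is classified under different subtypes at different times, without any change in which instance it is.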


types

  • independent continuant

    • organism

  • dependent continuant

    • quality

      • temperature

  • occurrent

    • process

      • course of temperature changes

instances

  • John

  • John’s temperature

  • John’s temperature history


types

  • independent continuant

    • organism

  • dependent continuant

    • quality

      • temperature

  • occurrent

    • process

      • life of an organism

instances

  • John

  • John’s temperature

  • John’s life


BFO: The Very Top

  • continuant

    • independent continuant

    • dependent continuant

      • quality

      • disposition

  • occurrent


BFO: The Very Top

  • continuant

    • independent continuant

    • dependent continuant

      • quality

      • function

      • role

      • disposition

  • occurrent


disposition

- of a glass vase, to shatter if dropped

- of a human, to eat

- of a banana, to ripen

- of John, to lose hair


disposition

if it ceases to exist, then its bearer and/or its immediate surrounding environment is physically changed

its realization occurs when its bearer is in some special physical circumstances

its realization is what it is in virtue of the bearer’s physical make-up


types

  • independent continuant

    • eye

  • dependent continuant

    • function

      • to see

  • occurrent

    • process

      • process of seeing

instances

  • John’s eye

  • function of John’s eye: to see

  • John seeing


OGMS

Ontology for General Medical Science

http://code.google.com/p/ogms


Ontology for General Medical Science (OGMS)

  • ontology for the representation of

    • diseases, signs, symptoms

    • clinical processes

    • diagnosis, treatment and outcomes

  • fundamental idea:

    • a disease is a disposition rooted in some (physical) disorder in the organism


Motivation

  • Clarity about:

    • disease etiology and progression

    • disease and the diagnostic process

    • phenotype and signs/symptoms

    • entities in reality and observations of such entities


Physical Disorder

– independent continuant

fiat object part

A causally linked combination of physical components of the extended organism that is clinically abnormal.


Clinically abnormal

  • (1) not part of the life plan for an organism of the relevant type (unlike aging or pregnancy),

  • (2) causally linked to an elevated risk either of pain or other feelings of illness, or of death or dysfunction, and

  • (3) such that the elevated risk exceeds a certain threshold level.*

    *Compare: baldness
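
The three criteria can be read as a conjunction. The Python sketch below is a rough illustration under loudly stated assumptions: risk is reduced to a single hypothetical number, the threshold value is invented, and treating baldness as a case that falls below the threshold is one reading of the "Compare: baldness" footnote, not something the slide states.

```python
from dataclasses import dataclass

RISK_THRESHOLD = 0.1  # hypothetical cut-off for criterion (3)


@dataclass
class BodilyFeature:
    name: str
    part_of_life_plan: bool  # criterion (1): aging and pregnancy are part of the life plan
    elevated_risk: float     # criterion (2): hypothetical score, 0.0 means no elevated risk


def clinically_abnormal(feature, threshold=RISK_THRESHOLD):
    """All three criteria must hold: (1) not part of the life plan,
    (2) linked to an elevated risk, (3) that risk exceeds the threshold."""
    return (
        not feature.part_of_life_plan
        and feature.elevated_risk > 0.0
        and feature.elevated_risk > threshold
    )


# Hypothetical example values, chosen only to exercise each criterion.
pregnancy = BodilyFeature("pregnancy", part_of_life_plan=True, elevated_risk=0.3)
baldness = BodilyFeature("baldness", part_of_life_plan=False, elevated_risk=0.01)
necrotic_liver = BodilyFeature("necrotic liver", part_of_life_plan=False, elevated_risk=0.8)

for f in (pregnancy, baldness, necrotic_liver):
    print(f.name, clinically_abnormal(f))
```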


Big Picture


Pathological Process

=def. A bodily process that is a manifestation of a disorder and is clinically abnormal.

Disease =def. – A disposition to undergo pathological processes that exists in an organism because of one or more disorders in that organism.


Cirrhosis - environmental exposure

  • Etiological process - phenobarbital-induced hepatic cell death

    • produces

  • Disorder - necrotic liver

    • bears

  • Disposition (disease) - cirrhosis

    • realized_in

  • Pathological process - abnormal tissue repair with cell proliferation and fibrosis that exceed a certain threshold; hypoxia-induced cell death

    • produces

  • Abnormal bodily features

    • recognized_as

  • Symptoms - fatigue, anorexia

  • Signs - jaundice, enlarged spleen
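
The chain on this slide can be written down as subject, relation, object triples and then traversed. The Python sketch below is an illustration only: the triple list is transcribed from the slide (with shortened labels), while the `downstream` helper is a hypothetical name, not part of OGMS or any ontology tool.

```python
from collections import deque

# (subject, relation, object) triples transcribed from the cirrhosis slide.
TRIPLES = [
    ("phenobarbital-induced hepatic cell death", "produces", "necrotic liver"),
    ("necrotic liver", "bears", "cirrhosis"),
    ("cirrhosis", "realized_in", "abnormal tissue repair"),
    ("abnormal tissue repair", "produces", "abnormal bodily features"),
    ("abnormal bodily features", "recognized_as", "symptoms: fatigue, anorexia"),
    ("abnormal bodily features", "recognized_as", "signs: jaundice, enlarged spleen"),
]


def downstream(start, triples):
    """Follow the relations forward from start; return what is reached, breadth-first."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        for subject, _relation, obj in triples:
            if subject == node and obj not in seen:
                seen.add(obj)
                order.append(obj)
                queue.append(obj)
    return order


if __name__ == "__main__":
    for entity in downstream("phenobarbital-induced hepatic cell death", TRIPLES):
        print(entity)
```

Starting from the disorder instead of the etiological process gives the part of the chain that a diagnosis of cirrhosis is about.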


Dispositions and Predispositions

All diseases are dispositions; not all dispositions are diseases.

Predisposition to Disease

=def.– A disposition in an organism that constitutes an increased risk of the organism’s subsequently developing some disease.


HNPCC - genetic pre-disposition

  • Etiological process - inheritance of a mutant mismatch repair gene

    • produces

  • Disorder - chromosome 3 with abnormal hMLH1

    • bears

  • Disposition (disease) - Lynch syndrome

    • realized_in

  • Pathological process - abnormal repair of DNA mismatches

    • produces

  • Disorder - mutations in proto-oncogenes and tumor suppressor genes with microsatellite repeats (e.g. TGF-beta R2)

    • bears

  • Disposition (disease) - non-polyposis colon cancer

    • realized_in

  • Symptoms (including pain)


Huntington’s Disease - genetic

  • Symptoms & Signs

    • used_in

  • Interpretive process

    • produces

  • Hypothesis - rule out Huntington’s

    • suggests

  • Laboratory tests

    • produces

  • Test results - molecular detection of the HTT gene with >39 CAG repeats

    • used_in

  • Interpretive process

    • produces

  • Result - diagnosis that patient X has a disorder that bears the disease Huntington’s disease

  • Etiological process - inheritance of >39 CAG repeats in the HTT gene

    • produces

  • Disorder - chromosome 4 with abnormal mHTT

    • bears

  • Disposition (disease) - Huntington’s disease

    • realized_in

  • Pathological process - accumulation of mHTT protein fragments, abnormal transcription regulation, neuronal cell death in striatum

    • produces

  • Abnormal bodily features

    • recognized_as

  • Symptoms - anxiety, depression

  • Signs - difficulties in speaking and swallowing


HNPCC - genetic pre-disposition

  • Etiological process - inheritance of a mutant mismatch repair gene

    • produces

  • Disorder - chromosome 3 with abnormal hMLH1

    • bears

  • Disposition (disease) - Lynch syndrome

    • realized_in

  • Pathological process - abnormal repair of DNA mismatches

    • produces

  • Disorder - mutations in proto-oncogenes and tumor suppressor genes with microsatellite repeats (e.g. TGF-beta R2)

    • bears

  • Disposition (disease) - non-polyposis colon cancer


Cirrhosis - environmental exposure

  • Symptoms & Signs

    • used_in

  • Interpretive process

    • produces

  • Hypothesis - rule out cirrhosis

    • suggests

  • Laboratory tests

    • produces

  • Test results - elevated liver enzymes in serum

    • used_in

  • Interpretive process

    • produces

  • Result - diagnosis that patient X has a disorder that bears the disease cirrhosis

  • Etiological process - phenobarbital-induced hepatic cell death

    • produces

  • Disorder - necrotic liver

    • bears

  • Disposition (disease) - cirrhosis

    • realized_in

  • Pathological process - abnormal tissue repair with cell proliferation and fibrosis that exceed a certain threshold; hypoxia-induced cell death

    • produces

  • Abnormal bodily features

    • recognized_as

  • Symptoms - fatigue, anorexia

  • Signs - jaundice, splenomegaly


Systemic arterial hypertension

  • Symptoms & Signs

    • used_in

  • Interpretive process

    • produces

  • Hypothesis - rule out hypertension

    • suggests

  • Laboratory tests

    • produces

  • Test results -

    • used_in

  • Interpretive process

    • produces

  • Result - diagnosis that patient X has a disorder that bears the disease hypertension

  • Etiological process – abnormal reabsorption of NaCl by the kidney

    • produces

  • Disorder – abnormally large scattered molecular aggregate of salt in the blood

    • bears

  • Disposition (disease) - hypertension

    • realized_in

  • Pathological process – exertion of abnormal pressure against arterial wall

    • produces

  • Abnormal bodily features

    • recognized_as

  • Symptoms - headaches, dizziness

  • Signs – elevated blood pressure


Type 2 Diabetes Mellitus

  • Symptoms & Signs

    • used_in

  • Interpretive process

    • produces

  • Hypothesis - rule out diabetes mellitus

    • suggests

  • Laboratory tests – fasting serum blood glucose, oral glucose challenge test, and/or blood hemoglobin A1c

    • produces

  • Test results -

    • used_in

  • Interpretive process

    • produces

  • Result - diagnosis that patient X has a disorder that bears the disease type 2 diabetes mellitus

  • Etiological process –

    • produces

  • Disorder – abnormal pancreatic beta cells and abnormal muscle/fat cells

    • bears

  • Disposition (disease) – diabetes mellitus

    • realized_in

  • Pathological processes – diminished insulin production, diminished muscle/fat uptake of glucose

    • produces

  • Abnormal bodily features

    • recognized_as

  • Symptoms – polydipsia, polyuria, polyphagia, blurred vision

  • Signs – elevated blood glucose and hemoglobin A1c


Type 1 hypersensitivity to penicillin

  • Symptoms & Signs

    • used_in

  • Interpretive process

    • produces

  • Hypothesis -

    • suggests

  • Laboratory tests –

    • produces

  • Test results – occasionally, skin testing

    • used_in

  • Interpretive process

    • produces

  • Result - diagnosis that patient X has a disorder that bears the disease type 1 hypersensitivity to penicillin

  • Etiological process – sensitizing of mast cells and basophils during exposure to penicillin-class substance

    • produces

  • Disorder – mast cells and basophils with epitope-specific IgE bound to Fc epsilon receptor I

    • bears

  • Disposition (disease) – type I hypersensitivity

    • realized_in

  • Pathological process – type I hypersensitivity reaction

    • produces

  • Abnormal bodily features

    • recognized_as

  • Symptoms – pruritus, shortness of breath

  • Signs – rash, urticaria, anaphylaxis


Disease vs. Disease course

Disease =def. – A disposition to undergo pathological processes that exists in an organism because of one or more disorders in that organism.

Disease course =def. – The aggregate of processes in which a disease disposition is realized.
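
The difference between the two definitions can be sketched in Python: the disease is a disposition (a continuant with a bearer and an underlying disorder), while the disease course is the collection of processes that realize it. The class and field names below are hypothetical, and the time indices are invented for illustration; the cirrhosis labels come from the earlier slide.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    name: str
    start: int  # hypothetical time index
    end: int    # hypothetical time index


@dataclass
class Disease:
    """A disposition that exists in an organism because of a disorder."""
    name: str
    bearer: str
    disorder: str
    realizations: list = field(default_factory=list)

    def realize(self, process):
        """Record a pathological process in which the disposition is realized."""
        self.realizations.append(process)

    def course(self):
        """Disease course: the aggregate of realizing processes, ordered by start."""
        return sorted(self.realizations, key=lambda p: p.start)


cirrhosis = Disease("cirrhosis", bearer="patient X", disorder="necrotic liver")
print(len(cirrhosis.course()))  # the disposition exists before any realization

cirrhosis.realize(Process("hypoxia-induced cell death", start=5, end=9))
cirrhosis.realize(Process("abnormal tissue repair", start=2, end=7))
print([p.name for p in cirrhosis.course()])
```

Keeping the disposition and its realizations in separate objects mirrors the continuant/occurrent split: the disease can exist while its course is still empty.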


coronary heart disease (type), with subtypes:

  • disease associated with early lesions and small fibrous plaques

  • disease associated with asymptomatic (‘silent’) infarction

  • disease associated with surface disruption of plaque

  • unstable angina

  • stable angina

John’s coronary heart disease (instance) instantiates these subtypes over time, at t1, t2, t3, t4 and t5


types

  • independent continuant

    • disorder

  • dependent continuant

    • disposition

      • disease

  • occurrent

    • process

      • course of disease

instances

  • John’s disordered heart

  • John’s coronary heart disease

  • course of John’s disease


Examples of ontology terms


IDO (Infectious Disease Ontology) Core

Follows GO strategy of providing a canonical ontology of what is involved in every infectious disease – host, pathogen, vector, virulence, vaccine, transmission – accompanied by IDO Extensions for specific diseases, pathogens and vectors

Provides common terminology resources and tested common guidelines for a vast array of different disease communities


Infectious Disease Ontology Consortium

  • MITRE, Mount Sinai, UTSouthwestern – Influenza

  • IMBB/VectorBase – Vector borne diseases (A. gambiae, A. aegypti, I. scapularis, C. pipiens, P. humanus)

  • Colorado State University – Dengue Fever

  • Duke University – Tuberculosis, Staph. aureus

  • Cleveland Clinic – Infective Endocarditis

  • University of Michigan – Brucellosis

  • Duke University, University at Buffalo – HIV


Influenza - infectious

  • Etiological process - infection of airway epithelial cells with influenza virus

    • produces

  • Disorder - viable cells with influenza virus

    • bears

  • Disposition (disease) - flu

    • realized_in

  • Pathological process - acute inflammation

    • produces

  • Abnormal bodily features

    • recognized_as

  • Symptoms - weakness, dizziness

  • Signs - fever


Influenza – disease course

  • Etiological process - infection of airway epithelial cells with influenza virus

    • produces

  • Disorder - viable cells with influenza virus

    • bears

  • Disposition (disease) - flu

    • realized_in

  • Pathological process - acute inflammation

    • produces

  • Abnormal bodily features

    • recognized_as

  • Symptoms - weakness, dizziness

  • Signs - fever

The disorder also induces normal physiological processes (immune response) that can result in the elimination of the disorder (transient disease course).


Big Picture

