Experience with Using the UMLS Semantic Network to Coordinate Controlled Terminologies for a Large Clinical Data Repository. James J. Cimino Department of Biomedical Informatics Columbia University College of Physicians and Surgeons National Library of Medicine, April 8, 2005. Overview.


Experience with Using the UMLS Semantic Network to Coordinate Controlled Terminologies for a Large Clinical Data Repository

James J. Cimino

Department of Biomedical Informatics

Columbia University College of Physicians and Surgeons

National Library of Medicine, April 8, 2005

Overview
  • Background
  • History
  • General principles
  • Empiric observations: Semantic Network in the Medical Entities Dictionary
  • Lessons to be learned
Clinical Data Architecture
  • Central repository to collect data from myriad sources
  • Myriad users of data - some not yet imagined
New York Presbyterian Hospital Clinical Information Systems Architecture
[Architecture diagram. Labeled components: Medical Logic Modules, Clinical Database, Alerts & Reminders, Database Monitor, Results Review, Database Interface, Administrative, Medical Entities Dictionary, Research, three Reformatters, and Radiology, Discharge Summaries, Laboratory, . . .]
Clinical Data Architecture
  • Central repository to collect data from myriad sources
  • Myriad users of data - some not yet imagined
  • Patient-oriented, not visit oriented, database
  • Relational, not hierarchical, model
  • Entity-attribute-value model
  • Coded data wherever possible
  • Unify terminology
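
As an illustration of the entity-attribute-value idea in the list above, here is a minimal sketch using Python's built-in sqlite3 module. It is not the repository's actual schema: the table name `observation`, its columns, the attribute codes `TEST` and `RESULT`, and the patient and event numbers are all hypothetical. The point is only that each fact is one row keyed by patient, so new kinds of data need new coded attributes rather than new columns.

```python
import sqlite3

# Hypothetical EAV table: one row per (patient, event, attribute) fact.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE observation ("
    " patient_id INTEGER, event_id INTEGER,"
    " attribute_code TEXT, value TEXT)"
)
conn.executemany(
    "INSERT INTO observation VALUES (?, ?, ?, ?)",
    [
        (1001, 1, "TEST", "K#1"),     # coded test term (hypothetical data)
        (1001, 1, "RESULT", "4.2"),
        (1001, 2, "TEST", "K#2"),
        (1001, 2, "RESULT", "3.2"),
        (2002, 3, "TEST", "K#1"),
        (2002, 3, "RESULT", "3.3"),
    ],
)


def patient_events(patient_id):
    """Collect all rows for one patient, grouped by event."""
    rows = conn.execute(
        "SELECT event_id, attribute_code, value FROM observation"
        " WHERE patient_id = ? ORDER BY event_id, attribute_code",
        (patient_id,),
    )
    events = {}
    for event_id, attribute, value in rows:
        events.setdefault(event_id, {})[attribute] = value
    return events
```

Calling `patient_events(1001)` gathers both of that patient's events regardless of visit, which is the patient-oriented access pattern the slide describes.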
MED Structure
[Semantic network diagram. Nodes: Medical Entity, Substance, Chemical, Carbohydrate, Glucose, Bioactive Substance, Anatomic Substance, Plasma, Laboratory Specimen, Plasma Specimen, Event, Diagnostic Procedure, Laboratory Procedure, Laboratory Test, Plasma Glucose Test, CHEM-7. Link labels: Substance Sampled, Has Specimen, Part of, Substance Measured.]
Communicating Terminology Changes
[Diagram. Results: K#1 = 4.2, K#1 = 3.3, K#2 = 3.2, K#1 = 3.0, K#3 = 2.6. Terms: K#1, K#2, K#3.]

Solution: Hierarchical Integration
[Diagram. Results: K#1 = 4.2, K#1 = 3.3, K#2 = 3.2, K#1 = 3.0, K#3 = 2.6. Terms: K, K#1, K#2, K#3.]
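
A minimal Python sketch of the hierarchical integration idea, assuming the diagram places the three test terms K#1, K#2, and K#3 under a single class K. The result values are the ones on the slide. The parent mapping and the function names are assumptions made for illustration.

```python
# Assumed hierarchy: each local test term is a child of the class K.
PARENT = {"K#1": "K", "K#2": "K", "K#3": "K"}

# Results as they appear on the slide, in order.
RESULTS = [("K#1", 4.2), ("K#1", 3.3), ("K#2", 3.2), ("K#1", 3.0), ("K#3", 2.6)]


def is_a(code, cls):
    """True if code equals cls or has cls among its ancestors."""
    while code is not None:
        if code == cls:
            return True
        code = PARENT.get(code)
    return False


def results_for(cls):
    """Return every result whose term falls under the given class."""
    return [value for code, value in RESULTS if is_a(code, cls)]
```

A query written against K#1 alone stops seeing data once the laboratory switches to K#2 or K#3, whereas `results_for("K")` keeps returning all five values because new terms are added beneath the class rather than beside it.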


Use of the UMLS in Patient Care

James J. Cimino, M.D.

Center for Medical Informatics

Columbia University

Mont Pelerin, Switzerland 1994

UMLS Semantic Network
  • Strict hierarchy
  • Semantic types: 132 (135)
  • Semantic relations: 46 (53)
  • Inheritance of relations: 6233 (6700)
UMLS Metathesaurus
  • Terms from 22 (100+) controlled vocabularies
  • Total source terms: 311,046
  • Total strings: 279,237 (5,000,000)
  • Total concepts: 152,444 (1,000,000)
  • Relationships: 1,484,994 (16,000,000)
Medical Entities Dictionary
  • Semantic Network
  • Sources: 5
  • Strings: 108,492
  • Concepts: 35,281
  • Semantic relations: 23 pairs
  • Semantic Links: 145,672
Comparisons - Methods
  • CPMC Entities vs. UMLS Semantic Types
  • MED Classes vs. UMLS Semantic Types
  • MED Semantic Links vs. UMLS Semantic Relations
  • MED Concepts vs. Metathesaurus Concepts
  • MED Semantic Links vs. Meta Relations
Comparisons - Results
  • CPMC DB Entities vs. UMLS Semantic Types: ++++
  • MED Classes vs. UMLS Semantic Types: +++
  • MED Links vs. UMLS Semantic Relations: ++
  • MED Concepts vs. UMLS Concepts: +++
  • MED Links vs. Meta Links: +/-

Summary
  • Semantic Types provide good coverage
  • Concepts provide good coverage in certain domains
  • No technical reason why UMLS could not incorporate clinical vocabulary
Where We Are Today - Repository
  • Patients: 2.6 million
  • Visits: >10 million since 1996 with archives going back to 1979
  • Visit diagnoses, locations, procedures, providers, insurance
  • Lab procedures: 16 million with 130 million results (to 1989)
  • Radiology procedures reports: 5.7 million
  • Pathology: 1.4 million
  • Cardiology procedures: 1.5 million
  • Resident signout notes: 760,000
  • Operative Notes: 426,000
  • Clinical Notes: 400,000
  • Discharge Summaries: 420,000
  • Medication orders: >60 million
  • ObGyn Procedure Reports: 241,000
  • GI Procedure Reports: 101,000
  • Neurology Procedure Reports: 54,000
  • Ideatel BP’s: 215,000
  • Ideatel Glucose: 650,000
  • Consult Events: 18,000
  • HEENT Events: 13,000
  • Hospitalist Notes: 30,000
  • PFT: 25,000
  • Provider profiles: 11,000
  • IDX: 1.4 million
  • East Campus
Where We Are Today - MED
  • Domains: 7++ (5)
    • HP lab terms
    • Misys lab terms
    • Cerner lab terms
    • Misys Radiology
    • Digimedix drugs
    • Cerner Drugs
    • ICD9-based problem list terms
    • Other applications
    • Knowledge terms
  • Size:
    • Concept-based: 95,641 (35,281)
    • Multiple hierarchy: 141,306
    • Synonyms: 239,581 (108,492)
    • Translations: 141,717
    • Semantic link pairs: 52 (23)
    • Semantic links: 225,698 (145,672)
    • Attributes: 210,456
What does this have to do with the SN?
  • MED was initially based on UMLS design (creationism)
  • UMLS SN was the “starter set”
  • MED is “local UMLS” for CPMC
  • General principles were established
  • MED has developed without further conscious attention to the SN (evolution)
  • MED content represents real-world terminology
  • What follows are empiric observations, open to criticism; perhaps indefensible
General Principles
  • Everything is a class
  • Multiple hierarchy
  • Some relations are definitional
  • At most, one part of relation pair is definitional
  • Properties introduced at single points
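
The principles above (everything is a class, multiple hierarchy, properties introduced at single points) can be sketched in a few lines of Python. The class and property names are taken from later slides in this deck, but the placement of Glucose under both Chemical and Measureable Entity, the reduced property lists, and the function names are assumptions made for illustration only.

```python
# Every entry is a class; a class may have more than one parent.
PARENTS = {
    "Medical Entity": [],
    "Chemical": ["Medical Entity"],
    "Measureable Entity": ["Medical Entity"],
    "Glucose": ["Chemical", "Measureable Entity"],  # assumed placement
}

# Each property is introduced at exactly one class and inherited below it.
INTRODUCED = {
    "Medical Entity": ["NAME"],
    "Chemical": ["PHARMACEUTIC-COMPONENT-OF"],
    "Measureable Entity": ["MEASURED-BY-PROCEDURE"],
}


def properties(cls):
    """Collect the properties a class introduces or inherits."""
    found = set(INTRODUCED.get(cls, []))
    for parent in PARENTS[cls]:
        found |= properties(parent)
    return found


def introduction_point(prop):
    """Return the single class that introduces a property."""
    points = [cls for cls, props in INTRODUCED.items() if prop in props]
    if len(points) != 1:
        raise ValueError(f"{prop} must be introduced at exactly one class")
    return points[0]
```

With this layout, Glucose picks up NAME once even though it reaches Medical Entity through two parents, and a property that appeared at two classes would be rejected by `introduction_point`.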
Observations on the SN in the MED
  • Arrangement of SN in MED
  • Multiple hierarchy of STs
  • Size of ST classes in MED (vs Meta?)
  • STs as introduction points
  • Intersections
UMLS Semantic Net in the MED

A: T071: Medical Entity [94729]

. A1: T072: Physical Object [5618]

. +*A1.2: T017: Anatomical Structure [577]

. A2: T077: Conceptual Entity [77861]

. *B: T051: Event [55450]

Key:

“A1.2”: UMLS Tree address

“T071”: Semantic type ID

“Event”: MED Name

“+”: Multiple locations

“*”: Discontinuous tree address

“[577]”: Number of MED concepts
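
The key above is regular enough to apply mechanically. The following Python sketch parses one listing line into its parts and checks whether a tree address is a direct child of its MED parent's address. It assumes that each leading ". " marks one level of MED depth and that "discontinuous" means the UMLS tree address is not a direct child of the MED parent's address. Neither assumption is stated on the slides, and the function names are hypothetical.

```python
import re

# Depth dots, optional +/* flags, tree address, type ID, name, [count].
LINE = re.compile(
    r"^(?P<dots>(?:\. )*)(?P<flags>[+*]*)(?P<addr>[A-Z][0-9.]*): "
    r"(?P<tui>T\d{3}): (?P<name>.+) \[(?P<count>\d+|\?+)\]$"
)


def parse_line(line):
    """Split one listing line into the fields named in the key."""
    m = LINE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized listing line: {line!r}")
    count = m["count"]
    return {
        "depth": m["dots"].count("."),
        "multiple": "+" in m["flags"],
        "discontinuous": "*" in m["flags"],
        "address": m["addr"],
        "type_id": m["tui"],
        "name": m["name"],
        # A count shown as question marks is kept as unknown.
        "concepts": int(count) if count.isdigit() else None,
    }


def is_direct_child(parent_addr, child_addr):
    """True if child_addr extends parent_addr by exactly one level."""
    # Top-level addresses ("A", "B") take a digit directly: A -> A1.
    separator = "" if len(parent_addr) == 1 else "."
    prefix = parent_addr + separator
    return child_addr.startswith(prefix) and child_addr[len(prefix):].isdigit()
```

Under this reading, Anatomical Structure (A1.2) and Event (B) are both flagged because neither address is a direct child of A, the address of Medical Entity, while Physical Object (A1) and Conceptual Entity (A2) are not.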

UMLS Semantic Net in the MED

A: T071: Medical Entity [94729]

. A1: T072: Physical Object [5618]

. . A1.1: T001: Organism [3153]

. . . A1.1.1: T002: Plant [1]

. . . . A1.1.1.1: T003: Alga [0]

. . . A1.1.2: T004: Fungus [273]

. . . A1.1.3: T005: Virus [169]

. . . A1.1.4: T006: Rickettsia or Chlamydia [5]

. . . A1.1.5: T007: Bacterium [992]

. . . A1.1.6: T194: Archaeon [0]

. . . A1.1.7: T008: Animal [93]

. . . . A1.1.7.1: T009: Invertebrate [85]

. . . . A1.1.7.2: T010: Vertebrate [6]

. . . . . A1.1.7.2.1: T011: Amphibian [0]

. . . . . A1.1.7.2.2: T012: Bird [0]

. . . . . A1.1.7.2.3: T013: Fish [0]

. . . . . A1.1.7.2.4: T014: Reptile [0]

. . . . . A1.1.7.2.5: T015: Mammal [1]

. . . . . . A1.1.7.2.5.1: T016: Human [0]

UMLS Semantic Net in the MED

A: T071: Medical Entity [94729]

. +*A1.2: T017: Anatomical Structure [577]

. . A1.2.3: T021: Fully Formed Anatomical Structure [230]

. . . A1.2.3.1: T023: Body Part, Organ, or Organ Component [204]

. . . *A1.2.1: T018: Embryonic Structure [2]

. . . *A1.2.2: T190: Anatomical Abnormality [20]

. . . . A1.2.2.1: T019: Congenital Abnormality [0]

. . . . A1.2.2.2: T020: Acquired Abnormality [18]

. . *A1.2.3.2: T024: Tissue [66]

. . *A1.2.3.3: T025: Cell [61]

. . *A1.2.3.4: T026: Cell Component [11]

. . *A1.2.3.5: T028: Gene or Genome [0]

. . *A1.4.2: T031: Body Substance [56]

. . +*A2.1.4.1: T022: Body System [65]

. . +*A2.1.5.1: T030: Body Space or Junction [43]

. . +*A2.1.5.2: T029: Body Location or Region [117]

. . *A1.3: T073: Manufactured Object [16]

. . . A1.3.1: T074: Medical Device [6]

. . . A1.3.2: T075: Research Device [0]

. . . A1.3.3: T200: Clinical Drug [0]

. . A1.4: T167: Substance [???]

. . . A1.4.1: T103: Chemical [1942]

. . . . A1.4.1.1: T120: Chemical Viewed Functionally [1828]

. . . . . A1.4.1.1.1: T121: Pharmacologic Substance [1468]

. . . . . . +*A1.4.1.1.3.4: T127: Vitamin [20]

. . . . . . A1.4.1.1.1.1: T195: Antibiotic [130]

. . . . . A1.4.1.1.3: T123: Biologically Active Substance [530]

. . . . . . +A1.4.1.1.3.4: T127: Vitamin [20]

Property Introduction Points

1: Medical Entity [T071]

MED-CODE

UMLS-CODE

NAME

SUBCLASS-OF -> SUBCLASS (1: Medical Entity [T071])

SUBCLASS -> SUBCLASS-OF (1: Medical Entity [T071])

SYNONYMS

PRINT-NAME

HAS-PARTS -> PART-OF (1: Medical Entity [T071])

PART-OF -> HAS-PARTS (1: Medical Entity [T071])

DEFINITION

MAIN-MESH

SUPPLEMENTARY-MESH

NAME-TOKEN

DEFAULT-SHORT-DISPLAY-NAME

DEFAULT-DISPLAY-NAME

SPEECH-SYNONYM

SPEECH-SYNTHESIS-NAME

ENTITY-(HAS-RELATED)-PAGER-NUMBER

ENTITY-(HAS)-MEDLEE-TARGET-TERM

HIERARCHY-SELECTOR

Medical Properties

7: Body System [T022]

ACTION-SITE-OF -> ACTION-SITE (98: Health Care Activity (Procedure) [T058])

14: Anatomical Structure [T017]

SITE-OF-PROBLEM -> HAS-PROBLEM-SITE (30007: Patient Problem)

OBSERVATION-SITE-OF -> OBSERVATION-SITE (94: Diagnostic Procedure [T060])

43: Chemical [T103]

PHARMACEUTIC-COMPONENT-OF -> PHARMACEUTIC-COMPONENT (28103: Pharmacy Items (Drugs and Nondrugs))

50: Measureable Entity

MEASURED-BY-PROCEDURE -> ENTITY-MEASURED (64964: Assessment Procedures)

LOINC-ANALYTE-NAME

76: Disease or Syndrome [C0391828]

ETIOLOGY -> CAUSES-DISEASES (135: Etiologic Agent)

IS-HISTORIC-DISEASE-FOR -> HISTORIC-DISEASE (56164: Factors Related to Past Disease Influencing Health Status)

Medical Properties

83: Laboratory Finding or Test Result [T034]

RESULT-TYPE-->TESTS -> TEST-->RESULT-TYPE (94: Diagnostic Procedure [T060])

86: Finding [T033]

FINDING-(REFERS-TO)->ORGANISM

93: Laboratory Diagnostic Procedure

COLLECTED-BY -> COLLECTED-FOR (33023: Specimen Collection [C0200345])

94: Diagnostic Procedure [T060]

UNITS

TEST-->RESULT-TYPE -> RESULT-TYPE-->TESTS (83: Laboratory Finding or Test Result [T034])

OBSERVATION-SITE -> OBSERVATION-SITE-OF (14: Anatomical Structure [T017])

TEST-(HAS)-ABNORMAL-FLAG -> ABNORMAL-FLAG-(FOR)-TEST (77746: Abnormal Flag Value)

98: Health Care Activity (Procedure) [T058]

PROCEDURE-(INDICATES)->PT-PROBLEM -> PT-PROBLEM-(INDICATED-BY)->PROCEDURE (30007: Patient Problem)

ACTION-SITE -> ACTION-SITE-OF (7: Body System [T022])

Medical Properties

135: Etiologic Agent

CAUSES-DISEASES -> ETIOLOGY (76: Disease or Syndrome [C0391828])

1181: Antibiotic Sensitivity Tests

SENSITIVITY-ANALYTE -> SENSITIVITY-ANALYTE-OF (44440: Antibiotic or Bacterial Enzyme Inhibitor)

32291: Sampleable Entity

SAMPLED-BY -> SYSTEM-SAMPLED (64970: Sample Entity)

LOINC-SYSTEM-CODE

44440: Antibiotic or Bacterial Enzyme Inhibitor

SENSITIVITY-ANALYTE-OF -> SENSITIVITY-ANALYTE (1181: Antibiotic Sensitivity Tests)
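
The entries on the Medical Properties slides appear to follow the pattern "PROPERTY -> RECIPROCAL (target class)", with the reciprocal listed again under the target class. Assuming that reading, the Python sketch below checks a handful of entries copied from the slides for a matching reciprocal. The tuple layout and the function name are hypothetical.

```python
# (introducing class, property, reciprocal property, target class),
# copied from the Medical Properties slides by MED code.
LINKS = [
    (7, "ACTION-SITE-OF", "ACTION-SITE", 98),
    (98, "ACTION-SITE", "ACTION-SITE-OF", 7),
    (14, "OBSERVATION-SITE-OF", "OBSERVATION-SITE", 94),
    (94, "OBSERVATION-SITE", "OBSERVATION-SITE-OF", 14),
    (76, "ETIOLOGY", "CAUSES-DISEASES", 135),
    (135, "CAUSES-DISEASES", "ETIOLOGY", 76),
    (1181, "SENSITIVITY-ANALYTE", "SENSITIVITY-ANALYTE-OF", 44440),
    (44440, "SENSITIVITY-ANALYTE-OF", "SENSITIVITY-ANALYTE", 1181),
    (14, "SITE-OF-PROBLEM", "HAS-PROBLEM-SITE", 30007),
]


def missing_inverses(links):
    """Return the entries whose reciprocal entry is not in the list."""
    listed = set(links)
    return [
        (source, prop, inverse, target)
        for source, prop, inverse, target in links
        if (target, inverse, prop, source) not in listed
    ]
```

Here the only entry reported is SITE-OF-PROBLEM, because class 30007 (Patient Problem) has no property list of its own on these slides. That says nothing about whether the reciprocal exists in the MED itself.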

Data Dictionary Properties

59511: Clinical Repository Table

TABLE-HAS-COLUMN -> COLUMN-IS-IN-TABLE (59512: Clinical Repository Column)

59512: Clinical Repository Column

COLUMN-IS-IN-TABLE -> TABLE-HAS-COLUMN (59511: Clinical Repository Table)

59528: Generic Column

COLUMN-HAS-PERMITTED-VALUES -> IS-PERMITTED-VALUE-FOR-COLUMN (67164: Verification Concept for Generic Column)

59729: Data Entry Form Component

REPEAT-TYPE(DATA-ENTRY-COMPONENT)

NUMBER-REPEATS(DATA-ENTRY-COMPONENT)

REPEAT-LAYOUT-TYPE(DATA-ENTRY-COMPONENT)

59732: Form Field Allowable Values

ALLOWABLE-VALUE-(FOR)->DATA-ENTRY-FIELD -> DATA-ENTRY-FIELD-(HAS)->ALLOWABLE-VALUE (42646: Data Entry Form Field)

Controlled Terminology Properties

21762: ICD9 Element

ICD9-CODE

ICD9-ENTRY-CODE

OLD-ICD9-CODE

ICD9-NAME

23147: American Hospital Formulary Service Class

AHFS-CLASS-CODE

28104: Drug Enforcement Administration (DEA) Controlled Substance Category

DEA-CODE

Data Modeling Properties

1178: Number or String Result

EVENT-ID-OF -> EVENT-ID (9876: CPMC Event)

EVENT-PATIENT-ID-OF -> EVENT-PATIENT-ID (9876: CPMC Event)

EVENT-ORGANIZATION-OF -> EVENT-ORGANIZATION (9876: CPMC Event)

EVENT-LOCATION-OF -> EVENT-LOCATION (9876: CPMC Event)

PARTICIPANT-ID-OF -> PARTICIPANT-ID (30352: Medical Event Participant)

9876: CPMC Event

EVENT-ID -> EVENT-ID-OF (1178: Number or String Result)

EVENT-DATE -> EVENT-DATE-OF (30349: Date Result)

EVENT-PATIENT-ID -> EVENT-PATIENT-ID-OF (1178: Number or String Result)

EVENT-PARTICIPANT -> PARTICIPANT-OF (30352: Medical Event Participant)

EVENT-ORGANIZATION -> EVENT-ORGANIZATION-OF (1178: Number or String Result)

EVENT-LOCATION -> EVENT-LOCATION-OF (1178: Number or String Result)

EVENT-STATUS -> STATUS-OF (30355: CPMC Status Term)

EVENT-(HAS)-ORGANIZATION -> ORGANIZATION-(FOR)-EVENT (81475: CPMC Coded Organizations)

30344: CPMC Order

ORDER-QUANTITY -> ORDER-QUANTITY-OF (30350: Quantity Result)

ORDER-FREQUENCY -> ORDER-FREQUENCY-OF (32504: Order Frequency)

ORDER-START-DATE -> ORDER-START-DATE-OF (30349: Date Result)

ORDER-STOP-DATE -> ORDER-STOP-DATE-OF (30349: Date Result)

30352: Medical Event Participant

PARTICIPANT-OF -> EVENT-PARTICIPANT (9876: CPMC Event)

PARTICIPANT-ID -> PARTICIPANT-ID-OF (1178: Number or String Result)

PARTICIPANT-NAME -> PARTICIPANT-NAME-OF (32653: ID Number Plus Text Result)

Application Properties

40441: Display Information [C0010996]

DEFAULT-DISPLAY-FOR -> HAS-DEFAULT-DISPLAYS (94: Diagnostic Procedure [T060])

DISPLAYS-ELEMENTS-OF -> ELEMENTS-DISPLAYED-BY (94: Diagnostic Procedure [T060])

HAS-DISPLAY-PARAMETERS -> IS-DISPLAY-PARAMETER-OF (94: Diagnostic Procedure [T060])

DISPLAY-PARAMETER-ORDER

Document Properties

42645: Data Entry Form

FORM-(IS-PART-OF)->FORMSET -> FORMSET-(CONTAINS)->FORM (66436: Data Entry Form Sets)

42646: Data Entry Form Field

DATA-ENTRY-FIELD-(HAS)->ALLOWABLE-VALUE -> ALLOWABLE-VALUE-(FOR)->DATA-ENTRY-FIELD (59732: Form Field Allowable Values)

FORM-FIELD-(HAS)->FIELD-TYPE -> FIELD-TYPE-(FOR)->FORM-FIELD (66295: Data Entry Field Type)

FORM-FIELD-(OBEYS)->PREFILL-RULE -> PREFILL-RULE-(FOR)->FORM-FIELD (66311: Prefill Rules)

FORM-FIELD-MAXIMUM-VALUE

FORM-FIELD-MINIMUM-VALUE

FORM-FIELD-MAXIMUM-CHARACTER-COUNT

59732: Form Field Allowable Values

ALLOWABLE-VALUE-(FOR)->DATA-ENTRY-FIELD -> DATA-ENTRY-FIELD-(HAS)->ALLOWABLE-VALUE (42646: Data Entry Form Field)

66295: Data Entry Field Type

FIELD-TYPE-(FOR)->FORM-FIELD -> FORM-FIELD-(HAS)->FIELD-TYPE (42646: Data Entry Form Field)

66308: Layout Type

LAYOUT-TYPE-(FOR)->FORM-STRUCTURE -> FORM-STRUCTURE-(HAS)->LAYOUT-TYPE (66405: Data Entry Form Structure)

207 Intersection Classes

Chemical [T103]

Measureable Entity

Etiologic Agent

1780 cases.

Measureable Entity

Laboratory Finding or Test Result [T034]

Finding [T033]

Etiologic Agent

Microbiology Result

Patient Problem

Laboratory Results

1399 cases.

Laboratory Finding or Test Result [T034]

Finding [T033]

Patient Problem

Laboratory Results

3309 cases.

Laboratory Finding or Test Result [T034]

Finding [T033]

Patient Problem

Laboratory Results

New York Hospital (NYH) Laboratory Nomenclature Term

1601 cases.

207 Intersection Classes

Laboratory Finding or Test Result [T034]

Finding [T033]

Patient Problem

New York Hospital (NYH) Laboratory Nomenclature Term

2906 cases.

Laboratory Diagnostic Procedure

Diagnostic Procedure [T060]

Health Care Activity (Procedure) [T058]

Event [T051]

Laboratory Diagnostic Batteries

Single-Result Laboratory Test

New York Hospital (NYH) Laboratory Concept

Assessment Procedures

1197 cases.

Laboratory Diagnostic Procedure

Diagnostic Procedure [T060]

Health Care Activity (Procedure) [T058]

Event [T051]

Laboratory Diagnostic Batteries

New York Hospital (NYH) Laboratory Concept

Assessment Procedures

1822 cases.

207 Intersection Classes

Laboratory Diagnostic Procedure

Diagnostic Procedure [T060]

Health Care Activity (Procedure) [T058]

Event [T051]

Single-Result Laboratory Test

New York Hospital (NYH) Laboratory Concept

Assessment Procedures

3200 cases.

Laboratory Diagnostic Procedure

Diagnostic Procedure [T060]

Health Care Activity (Procedure) [T058]

Event [T051]

Single-Result Laboratory Test

CPMC Single-Result Laboratory Test

Assessment Procedures

3197 cases.

Health Care Activity (Procedure) [T058]

Event [T051]

ICD9 Element

Verification Concept for Generic Column

10048 cases.
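
One reading of the intersection slides is that each group of class names is a distinct combination of classes, and the case count is the number of MED concepts that belong to exactly that combination. Under that assumption, the counts could be produced as in the Python sketch below. The concept identifiers and their class memberships are invented for illustration. Only the class names come from the slides.

```python
from collections import Counter

# Hypothetical concepts mapped to the set of classes each belongs to.
CONCEPT_CLASSES = {
    "concept-1": {"Chemical [T103]", "Measureable Entity", "Etiologic Agent"},
    "concept-2": {"Chemical [T103]", "Measureable Entity", "Etiologic Agent"},
    "concept-3": {
        "Health Care Activity (Procedure) [T058]",
        "Event [T051]",
        "ICD9 Element",
    },
}


def intersection_counts(concept_classes):
    """Count how many concepts share each distinct combination of classes."""
    return Counter(frozenset(classes) for classes in concept_classes.values())
```

Each key of the returned counter corresponds to one group on the slides, and its value to that group's number of cases.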

Revisiting Recommendations - General
  • Make “Event” a temporal concept
  • Conceptual vs. Physical polarization
  • Directed Acyclic Graph
  • Merge Network and Metathesaurus
Revisiting Recommendations - Specific
  • Tests have Specimens
  • Tests have Parts
  • Separate Medications from Chemicals
  • Liberalize assignment of Relations
Revisiting Summary
  • Semantic Types provide good coverage
  • Concepts provide good coverage in certain domains
  • No technical reason why UMLS could not incorporate clinical vocabulary
Lessons to be Learned
  • The MED is representative of clinical care
  • MED classes work well as introduction points
  • Multiple hierarchy works
  • Semantic Network is largely intact
  • Unifying organization for anatomy needed
  • Further study of MED will suggest additional types and relations