National Cancer Institute Enterprise Vocabulary Services Overview and Plans for 2011

January 19, 2011

Sherri de Coronado, Semantic Services

Center for Bioinformatics and Information Technology

Interoperability

  • Interoperability:

    The ability of a system to use the parts or equipment of another system

    Source: Merriam-Webster web site

  • Interoperability:

    The ability of two or more systems or components to exchange information and to use the information that has been exchanged.

    Source: IEEE Standard Computer Dictionary, 1990



Extending Interoperability Beyond the Enterprise

  • cancer Biomedical Informatics Grid (caBIG)

    • Shared infrastructure, applications and data

    • Enable cancer research community to focus on innovation and move research from bench to bedside and back

    • Shared vocabulary, data elements, data models facilitate information exchange

    • Interoperable applications developed to common standard

    • Making research data available for mining and integration

  • Several new ARRA initiatives leveraging this infrastructure to extend interoperability principles to the broader healthcare community

Semantic Infrastructure Futures

  • Evolution, not Revolution

    • Still gathering requirements and defining approaches

    • Aim: support interoperability with a broader range of partners

  • Service-Oriented Architecture (SOA) approach.

  • Technology-independent specifications that enable others to build interoperable components.

  • Design, develop and deploy software components defined as business capabilities rather than monolithic applications.

Still Required: Terminology Services for NCI & Collaborators

  • Terminology Editing Software (NCI Protégé – Protégé 3.4 with extensions)

  • Terminology Content (NCI Thesaurus), published monthly

  • Terminology Content (NCI Metathesaurus)

  • Terminology Content (other standard terminologies like LOINC)

  • Terminology Server (LexEVS and OWL/RRF/LexGrid/OBO loaders, and now Value Set, Pick List and Mapping support/export), accessible via APIs

  • Browsers for NCI Thesaurus, NCI Metathesaurus and other terminologies served via LexEVS, now being modified for value set query, resolution and export, and for mapping query/review

High Value Use Cases

  • EVS Used Directly for Drug and Clinical Information Integration

    • Agents, Clinical Trials and Adverse Events

      • CTEP and DCP clinical trials

      • PDQ Cancer Clinical Trials Registry & NCI Drug Dictionary

      • Federal Medication Terminologies (FMT) - Interagency

      • FDA Structured Product Labeling

      • NCPDP (SCRIPT Standard for e-prescribing)

  • caBIG infrastructure and application use cases

    • Infrastructure providing semantic interoperability

    • caTIES/caTissueCore/caMOD/caNanolab

  • FDA/NCI/CDISC/RCRIM: standards harmonization and development

NCI Thesaurus (NCIt)

  • Standard reference terminology/ontology used by NCI and caBIG; underpins caCORE/caBIG/caGRID semantics

  • A Federal Standard Terminology

  • Built using description logics (OWL-DL)

  • Published monthly, with concept history

  • Public domain, open content license

  • Used by many public and private partners, nationally and internationally

What’s in NCIt?

  • Events, Entities, Processes

  • 89,000+ concepts

  • Hierarchical arrangement

  • Preferred Names, Synonyms & Definitions

  • Concept relationships & properties

  • Permanent identifier codes

Semantic Diversity

[Figure: semantic types shown include medical device, embryonic structure, laboratory tests, anatomical structure, anatomical abnormality, body parts & organs, congenital abnormality, clinical drug, regulation or law, sign or symptoms, nucleic acid, geographic area, research activity, cells, genetic function, family group, molecular sequence, disease or syndrome, neoplastic process, educational activity, mental process, natural phenomenon, experimental model of disease, therapeutic or preventative procedure, health care activity, laboratory procedure, quantitative concept]

FDA-NCI Memorandum of Understanding

  • Significance of MOU

    • Avoids expenditure at FDA to replicate existing, available resources at NCI

  • Leverages multiple efforts

    • Complementary to the CDISC/NCI collaborations on terminology requirements for CDISC models such as the Study Data Tabulation Model (SDTM)

    • FDA and NCI coordinate regarding relevant terminology standards and standards development efforts such as those of the HL7 RCRIM technical committee

    • FDA and NCI seek to identify opportunities to employ consistent terminology and terminology practices, for example in support of FHA/ONC initiatives and goals such as eGOV

NCI-FDA Terminology Collaboration

  • 2002- partnership and agreements in several terminology areas.

    • Structured Product Labeling (SPL)

    • Unique Ingredient Identifier (UNII)

    • Regulated Product Submission (RPS)

    • Individual Case Safety Report (ICSR)

    • Center for Devices and Radiological Health (CDRH)


    “For terminology standards, the FDA partners with the National Cancer Institute Enterprise Vocabulary Services (EVS). The NCI EVS hosts the FDA terminologies and makes them freely available to the public.”

  • FDA terminology resources are available on the NCI portal website:

FDA Structured Product Labels

  • Pharmaceutical Companies must provide information for electronic labels to FDA using controlled terminology

  • FDA needs rapid-turnaround terminology for the content of labels but doesn’t want to be in the terminology business.

  • FDA requests terminology in various areas related to product labels. NCI editors work with FDA, integrate the terms into NCI Thesaurus, and tag them with subset properties. FDA publishes the lists on its website and provides links to NCI Thesaurus.

    • Examples

      • Route of Administration

      • Unit of Presentation (Potency)

      • Dosage Form

      • Package Type

  • FDA SPL Web page:

SPL in NCIt

  • For solid oral dosage form appearance

    • SPL Color – BLUE C48333

    • SPL Shape - ROUND C48348

  • For drug interactions

    • Contributing Factor - General - FOOD OR FOOD PRODUCT C1949

    • Type of Drug Interaction Consequence - PHARMACOKINETIC EFFECT C54386

    • Pharmacokinetic Effect Consequence - INCREASED DRUG LEVEL C54355

    • Limitation of Use – CONTRAINDICATION C50646

    • Sex – FEMALE C16576

    • Race - ASIAN C41259

  • Other

    • SPL DEA Schedule - Controlled Substances – (e.g. CII C48675)
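
The subset tagging listed on this slide can be pictured as a small lookup table. The sketch below is illustrative only: the subset names, terms and codes are the ones shown on the slide, while the `SPL_TERMS` structure and the two helper functions are assumptions made for this example and are not part of any EVS or FDA API.

```python
# Illustrative only: (subset name, term) -> NCIt concept code, as listed on the slide.
SPL_TERMS = {
    ("SPL Color", "BLUE"): "C48333",
    ("SPL Shape", "ROUND"): "C48348",
    ("Contributing Factor - General", "FOOD OR FOOD PRODUCT"): "C1949",
    ("Type of Drug Interaction Consequence", "PHARMACOKINETIC EFFECT"): "C54386",
    ("Pharmacokinetic Effect Consequence", "INCREASED DRUG LEVEL"): "C54355",
    ("Limitation of Use", "CONTRAINDICATION"): "C50646",
    ("Sex", "FEMALE"): "C16576",
    ("Race", "ASIAN"): "C41259",
    ("SPL DEA Schedule", "CII"): "C48675",
}


def code_for(subset, term):
    """Return the NCIt code for a term within a subset, or None if it is not listed."""
    return SPL_TERMS.get((subset, term.upper()))


def terms_in_subset(subset):
    """Return the terms tagged with the given subset name, sorted alphabetically."""
    return sorted(term for (name, term) in SPL_TERMS if name == subset)


if __name__ == "__main__":
    print(code_for("SPL Color", "blue"))
    print(terms_in_subset("Sex"))
```

Keying on the (subset, term) pair rather than on the term alone reflects that the same word could be tagged into more than one subset.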

CDISC Terminology

  • Clinical Data Interchange Standards Consortium (CDISC) is an international, non-profit organization that develops and supports global data standards for medical research.

  • FDA points to CDISC as key provider of clinical & preclinical standards: “The foundation for the standardized clinical content is the Clinical Data Interchange Standards Consortium (CDISC) Study Data Tabulation Model (SDTM).”


  • EVS is partnered with CDISC to support and publish SDTM and other CDISC terminology including SEND (animal studies), Glossary, CDASH

  • CDISC terminology also published on NCI portal website:

NCI Thesaurus (http://ncit.nci.nih.gov)

Search Box

Choices, choices...

Version information

Term search

Search on term - mg - 5 results

Code Search

Search on Code - 1 result



Concept details from Browser


Semantic Type

Concept Relationships & Associations

Subset Associations: How concepts are "bundled"

NCI Metathesaurus

  • Purpose: Integrating 76 biomedical national and international sources into one database.

  • UMLS based. About 3.6 million terms/ 1.4 million concepts

  • Provides a mapped overlap and partial inter-relation of current versions of the vocabularies required by NCI and its partners, for example the ICDs, MedDRA, SNOMED, MeSH (NLM Medical Subject Headings), HCPCS (procedures), LOINC (lab values), and drug terminologies (VA NDF-RT, AOD, RxNorm, Multum, NCI Thesaurus drugs, etc.)

  • Used as online dictionary and thesaurus, for mapping and document indexing.

  • Major releases at least twice a year, minor releases with NCIt and other updates several more times a year.

NCI Metathesaurus (https://ncim.nci.nih.gov)

3,600,000 terms

76 Sources

1,400,000 concepts

NCI Metathesaurus

Choose your source

11 Sources

NCI Term Browser (http://nciterms.nci.nih.gov)


EVS Products & Services Are Open

  • NCI Thesaurus is Open Content

  • NCI Metathesaurus is Mostly Open Source

    (See Each Source’s License)

  • NCI EVS Servers Are Freely Accessible

    • On the Web:

    • Via API:

  • All Software Developed by NCI EVS is Public Open Source and Free for the Asking:

Methods of Content Retrieval

  • NCI ftp site:

  • NCI partner web sites (CDISC, FDA, etc.)

  • Request a report from NCI staff:

  • NCIt Browser by subset:




NCIt ftp site (http://evs.nci.nih.gov/ftp1)

You can download the entire NCIt in various formats

Shared Content Standards

[Figure: diagram labels include NIH “Roadmap”, Admin Procedures, and Therapeutic Area Standards]

Consolidated Content Services




NCIt Editing Priorities for 2011

  • Terminology associated with standardized case report forms of all kinds

  • Safety reporting (drug, device, food)

  • EHR related terminology for the caCIS project, which is an ambulatory oncology EHR extension.

  • Terminology in support of the NCPDP SCRIPT standard for e-prescribing

  • Structured product labeling

  • Nanotechnology (nanoparticle characterization, etc.)

  • Imaging (probably a device-related expansion of SDTM)

  • caHUB (Cancer Human Biobank project)

NCIt Management Goals for 2011

  • Publish better terminology provenance information

    • Example: What was done to a terminology when it was loaded into LexEVS for a particular purpose

  • Terminology Metadata: Continued progress on ongoing terminology metadata collaboration with NCBO and NCRI, with goals of

    • adopting a core of metadata about terminology, and (for NCI) implementing on caBIG Vocab Knowledge Center

    • creating better ways of disseminating info that helps people choose what terminologies to use

LexEVS Terminology Server

  • Hosts multiple coding schemes/ terminologies including NCIt

  • Uses LexGrid Model (now extended to comply with the draft CTS2 spec)

  • OWL, RRF and other loaders to convert and load terminologies

  • LexGrid 6.0 just released, adds value set, pick list and mapping capabilities

  • Documentation, see: LexEVS on caBIG Vocabulary Knowledge Center Wiki

LexEVS Terminology Server: Ver 5.1

Includes the following components:

  • Java API - Java interface based on LexGrid 5.0 Object Model

  • REST/HTTP Interface - Offers an HTTP based query mechanism. Results are returned in either XML or HTML formats

  • SOAP/Web Services Interface - Provides a programming language neutral Service-Oriented Architecture (SOA)

  • Distributed LexBIG (DLB) API - A Java interface that relies on a LexEVS Proxy and Distributed LexEVS Adapter to provide remote clients access to the native LexEVS API

  • LexEVS 5.0 Grid Service - An interface which uses the caGRID infrastructure to provide access to the native LexEVS API via the caGRID Services

LexEVS 6.0 / CTS2 – What is CTS2?

  • Common Terminology Services - Release 2 specifies a set of service interfaces to standardize necessary functional operations of a terminology service.

    • Administration

    • Search/ Query

    • Mapping Support

    • Authoring/ Maintenance

  • Focused on extending the existing Health Level 7 (HL7) Common Terminology Services (CTS) specification based on consensus requirements from the user community (including LexEVS users).

  • Developed as an HL7 Service Functional Model (SFM); accepted as an HL7 draft standard for trial use (DSTU) and currently an Object Management Group (OMG) RFP. OMG vote expected in June 2011.

What’s new in LexEVS 6.0

  • LexEVS 6.0 will add comprehensive support for CTS2 functionalities that are either partially supported or unsupported in LexEVS 5.1.

    • Provide expanded support for value sets

    • Develop the ability to provide local extensions to code sets

    • Provide expanded mapping ability among code sets

    • Develop other capabilities called for in the CTS2 specification

LexEVS 6.0 Updates (highlights)

  • Association/Mapping Functionality

    • Association Administrative Functionality

    • Association Search / Query Functionality

    • Association Author / Curation Functionality

  • Search / Query Functionality

    • Value Set Search / Query

    • Concept Domain Search / Query

    • Local Extension Search / Query

  • Authoring / Curation Functionality

    • Code System Authoring / Curation

    • Value Set Authoring / Curation

    • Concept Domain and Usage Context Authoring / Curation

LexEVS 6.0: Mapping Implementation

Mapping capabilities in LexEVS are supported by associations, stored in a coding scheme format like other vocabularies.

A map relates a single specific coded concept within a specified code system (source) to a corresponding single specific coded concept (target) within the same or another code system.

A mapping coding scheme does not contain concept details; it relies on the participating source and target schemes to provide that information.

Mapping Implementation (Continued)

If the map relates to code systems available in LexEVS, then the map contains resolvable concepts.

Other mappings, as defined by users and communities, are loaded as mapping schemes.

[Figure: an ICD9 to SNOMED CT mapping scheme and an NCIt to ICD9 mapping scheme shown with NCI Thesaurus. Concept mappings are resolvable due to internal links to live terminologies, with relationship links to other NCIt, ICD9 and SNOMED CT concepts.]
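
The mapping design described on these two slides can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the LexEVS implementation: the `MapEntry` class, the `LOADED_SCHEMES` dictionary, the scheme names and all codes are hypothetical placeholders. It shows a mapping scheme that stores only source and target references, and a map entry that is resolvable only when both participating code systems are loaded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MapEntry:
    """One association: a single source concept mapped to a single target concept."""
    source_scheme: str
    source_code: str
    target_scheme: str
    target_code: str


# Hypothetical coding schemes loaded in the server: scheme name -> {code: preferred name}.
LOADED_SCHEMES = {
    "SourceScheme": {"S1": "source concept one", "S2": "source concept two"},
    "TargetScheme": {"T1": "target concept one"},
}

# The mapping scheme holds associations only; it carries no concept details itself.
MAPPING = [
    MapEntry("SourceScheme", "S1", "TargetScheme", "T1"),
    MapEntry("SourceScheme", "S2", "UnloadedScheme", "X9"),
]


def resolve(entry, schemes):
    """Fetch names for both ends from the participating schemes (None if unavailable)."""
    source_name = schemes.get(entry.source_scheme, {}).get(entry.source_code)
    target_name = schemes.get(entry.target_scheme, {}).get(entry.target_code)
    return source_name, target_name


def is_resolvable(entry, schemes):
    """A map entry is resolvable only if both ends exist in loaded schemes."""
    source_name, target_name = resolve(entry, schemes)
    return source_name is not None and target_name is not None


if __name__ == "__main__":
    for entry in MAPPING:
        print(entry, is_resolvable(entry, LOADED_SCHEMES))
```

Because concept details live only in the source and target schemes, updating either scheme changes what a resolved map displays without touching the mapping scheme itself.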

Early Mapping Implementation in Browser

(Screen shot continued)

Example: NCIt to ICD9CM Mapping (Early Implementation)

(Screen shot continued)

LexEVS 6.0: Value Set Services: What Is a Value Set Definition?

A Value Set represents a uniquely identifiable set of concept codes grouped for a specific purpose.

The Value Set Definition is the mechanism for describing the contents of a Value Set.

The contents are concept codes defined in a referenced Code System.

A Value Set can contain concept codes from one or more Code Systems.
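
As a rough picture of these statements, a value set definition can be modelled as an identified, purpose-labelled collection of (code system, code) references. The sketch below is an assumption-laden illustration and not the LexGrid model: the class names, the example URI, and the "LocalCodes" system with its code are hypothetical. Only the NCIt code C16576 (FEMALE) is taken from this presentation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CodeRef:
    """A concept code together with the code system that defines it."""
    code_system: str
    code: str


@dataclass
class ValueSetDefinition:
    """Describes the contents of one uniquely identifiable value set."""
    uri: str       # unique identifier (hypothetical format)
    purpose: str   # why these codes are grouped together
    entries: list = field(default_factory=list)

    def code_systems(self):
        """Return the code systems this definition draws on, sorted by name."""
        return sorted({entry.code_system for entry in self.entries})


sex_values = ValueSetDefinition(
    uri="urn:example:valueset:sex",              # hypothetical identifier
    purpose="Allowed values for a 'sex' field",
    entries=[
        CodeRef("NCI Thesaurus", "C16576"),      # FEMALE, as listed in this deck
        CodeRef("LocalCodes", "UNK"),            # hypothetical local extension code
    ],
)

if __name__ == "__main__":
    print(sex_values.code_systems())
```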

LexEVS 6.0: Value Set Services: Value Set Representation / Resolution

  • Content of Value Sets

    • Code system – all concept codes in the referenced code system

    • Value Set Definition – all concept codes defined in the referenced Value Set Definition

    • Code system/concept code – individual code

    • Code system/concept code + relationship + additional rules (leaf only, targetToSource, ...)

    • Code System/Property Name or Value match – all concept codes in the referenced code system that match a property name or property value.

    • Combination of any of the above with or/and/difference operators

  • Resolution

    • A value set definition has to be made against a specific version of a code system. (But it doesn’t have to be resolved against the same version.)

    • Philosophy: Even a simple list (a, b, c, d) needs to be resolved as, at some future date, “c” might be retired.

    • Resolution does not create a static artifact.
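
The content rules and the resolution philosophy above can be sketched as composable rules that are evaluated against one version of a code system at a time. Everything here is a hypothetical illustration, not the LexEVS API: the rule helpers, the `active` and `kind` property names, and the two toy versions `V1` and `V2` are assumptions. The point it demonstrates is that even the simple list (a, b, c, d) yields a different result once "c" is retired.

```python
# Two hypothetical versions of one code system: code -> properties.
V1 = {
    "a": {"active": True, "kind": "x"},
    "b": {"active": True, "kind": "y"},
    "c": {"active": True, "kind": "x"},
    "d": {"active": True, "kind": "y"},
}
# In the later version, concept "c" has been retired.
V2 = {**V1, "c": {"active": False, "kind": "x"}}


def all_codes():
    """Rule: every concept code in the code system."""
    return lambda content: set(content)


def codes(*listed):
    """Rule: an explicit enumeration of codes that exist in the code system."""
    return lambda content: set(listed) & set(content)


def property_equals(name, value):
    """Rule: all codes whose property `name` has the given value."""
    return lambda content: {c for c, props in content.items() if props.get(name) == value}


def union(left, right):
    return lambda content: left(content) | right(content)


def intersection(left, right):
    return lambda content: left(content) & right(content)


def difference(left, right):
    return lambda content: left(content) - right(content)


def resolve(rule, content):
    """Evaluate a rule against one code system version, dropping retired codes."""
    active = {c for c, props in content.items() if props.get("active", True)}
    return sorted(rule(content) & active)


simple_list = codes("a", "b", "c", "d")
not_kind_y = difference(all_codes(), property_equals("kind", "y"))

if __name__ == "__main__":
    print(resolve(simple_list, V1), resolve(simple_list, V2))
    print(resolve(not_kind_y, V1), resolve(not_kind_y, V2))
```

Since `resolve` recomputes the member list on every call, nothing static is stored; the same definition can be resolved against whichever version is current.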

Value Set Viewer: Design Stage

Search, Browse, Resolve, View, Export

Another Critical Need: Value Set Editor

  • Started: tool to create and edit value set definitions

  • Steps (current design)

    • Enter metadata

    • Define components (code, name, property, relationship, enumeration of codes, or entire vocabulary) with a presentation property (e.g. preferred name), coding scheme, matching criteria, etc., using and/or

    • Preview value set using “Resolve”

  • For example, one could create the FDA subsets as value sets using the association “concept in subset”, or, in theory, create an anatomy subset that includes the is-a and part-of relationships.
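
A relationship-based definition of the kind mentioned in this example could work roughly as sketched below: start from one concept, follow the chosen relationships from target back to source, and optionally keep only the leaves. This is a sketch under stated assumptions. The edge list, the concept names and both helper functions are hypothetical and are not NCIt content or LexEVS calls.

```python
# Hypothetical relationships, stored as (source, relation, target).
EDGES = [
    ("thumb", "is_a", "finger"),
    ("finger", "part_of", "hand"),
    ("hand", "part_of", "upper limb"),
    ("toe", "part_of", "foot"),
]


def descendants(root, relations, edges):
    """Collect every concept that reaches `root` through the allowed relations."""
    found = set()
    frontier = [root]
    while frontier:
        current = frontier.pop()
        for source, relation, target in edges:
            if target == current and relation in relations and source not in found:
                found.add(source)
                frontier.append(source)
    return found


def leaves_only(concepts, relations, edges):
    """Keep only concepts that are not the target of any allowed relation."""
    has_children = {target for _, relation, target in edges if relation in relations}
    return {concept for concept in concepts if concept not in has_children}


if __name__ == "__main__":
    both = {"is_a", "part_of"}
    members = descendants("upper limb", both, EDGES)
    print(sorted(members))
    print(sorted(leaves_only(members, both, EDGES)))
```

Restricting `relations` to a single association name would correspond to the simpler “concept in subset” style of definition.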

Other LexEVS work slated for this year

  • Better REST/SOAP APIs

    • Enable through these, for example:

      • Getting relationship information

      • Getting a version of a value set

      • Getting a change set of the differences between two versions

  • OWL 2 support for loaders and exports

  • Probably a patch release in spring, and a 6.1 release towards the end of the year.
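
The "change set" idea in the list above amounts to a set comparison between two resolved versions of a value set. The function below is a minimal sketch of that comparison; its name, its return shape and the sample codes are assumptions for illustration and do not describe the planned API.

```python
def change_set(old_codes, new_codes):
    """Compare two resolved versions of a value set, given as iterables of codes."""
    old, new = set(old_codes), set(new_codes)
    return {
        "added": sorted(new - old),      # present only in the newer version
        "removed": sorted(old - new),    # present only in the older version
        "unchanged": sorted(old & new),  # present in both versions
    }


if __name__ == "__main__":
    # Hypothetical codes: "c" was dropped and "e" was added between versions.
    print(change_set(["a", "b", "c", "d"], ["a", "b", "d", "e"]))
```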

Contact Information

Sherri de Coronado

Acting Director

Semantic Services

[email protected]

Larry Wright

Associate Director

Enterprise Vocabulary Services

[email protected]

Margaret Haber

Associate Director

Enterprise Vocabulary Services

[email protected]