
National Cancer Institute Enterprise Vocabulary Services Overview and Plans for 2011

January 19, 2011. Sherri de Coronado, Semantic Services, Center for Bioinformatics and Information Technology.


Presentation Transcript


  1. National Cancer Institute Enterprise Vocabulary Services Overview and Plans for 2011 January 19, 2011 Sherri de Coronado, Semantic Services Center for Bioinformatics and Information Technology

  2. Interoperability • Interoperability: The ability of a system...to use the parts or equipment of another system. Source: Merriam-Webster web site • Interoperability: The ability of two or more systems or components to exchange information and to use the information that has been exchanged. Source: IEEE Standard Computer Dictionary, 1990 • Syntactic interoperability • Semantic interoperability

  3. Extending Interoperability Beyond the Enterprise • cancer Biomedical Informatics Grid (caBIG) • Shared infrastructure, applications and data • Enable cancer research community to focus on innovation and move research from bench to bedside and back • Shared vocabulary, data elements, data models facilitate information exchange • Interoperable applications developed to common standard • Making research data available for mining and integration • Several new ARRA initiatives leveraging this infrastructure to extend interoperability principles to the broader healthcare community

  4. Semantic Infrastructure Futures • Evolution, not Revolution • Still gathering requirements and defining approaches • Aim: support interoperability with a broader range of partners • Services-Oriented Architecture (SOA) approach. • Technology-independent specifications that enable others to build interoperable components. • Design, develop and deploy software components defined as business capabilities rather than monolithic applications.

  5. Still Required: Terminology Services for NCI & Collaborators • Terminology Editing Software (NCI Protégé – Protégé 3.4 with extensions) • Terminology Content (NCI Thesaurus), published monthly • Terminology Content (NCI Metathesaurus) • Terminology Content (other standard terminologies like LOINC) • Terminology Server (LexEVS and OWL/ RRF/ LexGrid/ OBO loaders, and now Value Set, Pick List and Mapping support/ export) accessible via APIs • Browsers for NCI Thesaurus, NCI Metathesaurus and other terminologies served via LexEVS, now being modified for value set query, resolution and export, and mapping query/ review

  6. High Value Use Cases • EVS Used Directly for Drug and Clinical Information Integration • Agents, Clinical Trials and Adverse Events • CTEP and DCP clinical trials • PDQ Cancer Clinical Trials Registry & NCI Drug Dictionary • Federal Medication Terminologies (FMT) - Interagency • FDA Structured Product Labeling • NCPDP (SCRIPT Standard for e-prescribing) • caBIG infrastructure and application use cases • Infrastructure providing semantic interoperability • caTIES/caTissueCore/caMOD/caNanolab • FDA/NCI/CDISC/RCRIM – harmonization/ development - standards

  7. NCI Thesaurus (NCIt) • Standard reference terminology/ontology used by NCI, caBIG; underpins caCORE/caBIG/caGRID semantics • A Federal Standard Terminology • Built using description logics (OWL-DL) • Published monthly, with concept history • Public domain, open content license • Used by many public and private partners, nationally and internationally

  8. What's in NCIt? Events, Entities, Processes • More than 89,000 concepts • Hierarchical arrangement • Preferred Names, Synonyms & Definitions • Concept relationships & properties • Unique, permanent identifier codes

  9. Semantic Diversity (semantic types shown on the slide): eukaryote, plants, fungus, virus, bacterium, archaeon, animal, mammal, vertebrates, amphibian, bird, fish, reptile, human, medical device, embryonic structure, laboratory tests, anatomical structure, anatomical abnormality, body parts & organs, congenital abnormality, language, clinical drug, regulation or law, tissue, sign or symptoms, nucleic acid, gene, findings, geographic area, research activity, cells, genetic function, family group, molecular sequence, disease or syndrome, neoplastic process, educational activity, mental process, natural phenomenon, event, experimental model of disease, therapeutic or preventative procedure, organization, behavior, health care activity, activity, laboratory procedure, quantitative concept, element, ion, isotope

  10. Terminology Subsets

  11. FDA-NCI Memorandum of Understanding • Significance of MOU • Avoids expenditure at FDA to replicate existing, available resources at NCI • Leverages multiple efforts • Complementary to the CDISC/NCI collaborations on terminology requirements for CDISC models such as the Study Data Tabulation Model (SDTM) • FDA and NCI coordinate regarding relevant terminology standards and standards development efforts such as those of the HL7 RCRIM technical committee • FDA and NCI seek to identify opportunities to employ consistent terminology and terminology practices, for example in support of FHA/ONC initiatives and goals such as eGOV

  12. NCI-FDA Terminology Collaboration • Since 2002: partnership and agreements in several terminology areas • Structured Product Labeling (SPL) • Unique Ingredient Identifier (UNII) • Regulated Product Submission (RPS) • Individual Case Safety Report (ICSR) • Center for Devices and Radiological Health (CDRH) • FDA PDUFA IV IT Plan: “For terminology standards, the FDA partners with the National Cancer Institute Enterprise Vocabulary Services (EVS). The NCI EVS hosts the FDA terminologies and makes them freely available to the public.” • FDA terminology resources are available on the NCI portal website: http://www.cancer.gov/cancertopics/terminologyresources/FDA

  13. FDA Structured Product Labels • Pharmaceutical companies must provide information for electronic labels to FDA using controlled terminology • FDA needs rapid-turnaround terminology for the content of labels but doesn’t want to be in the terminology business. • FDA requests terminology in various areas related to product labels; NCI editors work with FDA, integrate the terms into NCI Thesaurus, and tag them with subset properties. FDA publishes the lists on its website and provides links to NCI Thesaurus. • Examples • Route of Administration • Unit of Presentation (Potency) • Dosage Form • Package Type • FDA SPL Web page: http://www.fda.gov/oc/datacouncil/spl.html

  14. SPL in NCIt • For solid oral dosage form appearance • SPL Color – BLUE C48333 • SPL Shape - ROUND C48348 • For drug interactions • Contributing Factor - General - FOOD OR FOOD PRODUCT C1949 • Type of Drug Interaction Consequence - PHARMACOKINETIC EFFECT C54386 • Pharmacokinetic Effect Consequence - INCREASED DRUG LEVEL C54355 • Limitation of Use – CONTRAINDICATION C50646 • Sex – FEMALE C16576 • Race - ASIAN C41259 • Other • SPL DEA Schedule - Controlled Substances – (e.g. CII C48675)

  15. CDISC Terminology • Clinical Data Interchange Standards Consortium (CDISC) is an international, non-profit organization that develops and supports global data standards for medical research. • FDA points to CDISC as key provider of clinical & preclinical standards: “The foundation for the standardized clinical content is the Clinical Data Interchange Standards Consortium (CDISC) Study Data Tabulation Model (SDTM).” FDA PDUFA IV IT Plan • EVS is partnered with CDISC to support and publish SDTM and other CDISC terminology including SEND (animal studies), Glossary, CDASH • CDISC terminology also published on NCI portal website: http://www.cancer.gov/cancertopics/terminologyresources/CDISC

  16. NCI Thesaurus http://ncit.nci.nih.gov Search Box Choices, choices... Version information

  17. Term search Search on term - mg - 5 results

  18. Code Search Search on Code - 1 result 6 sources

  19. Concept details from Browser Definition(s) Semantic Type

  20. Concept Relationships & Associations Subset Associations: How concepts are "bundled"

  21. NCI Metathesaurus • Purpose: Integrating 76 biomedical national and international sources into one database. • UMLS based. About 3.6 million terms/ 1.4 million concepts • Provides a mapped overlap and partial inter-relation of current versions of vocabularies required by NCI and its partners, for example the ICDs, MedDRA, SNOMED, MeSH (NLM Medical Subject Headings), HCPCS (procedures), LOINC (lab values), drug terminologies (VA NDF-RT, AOD, RxNORM, Multum, NCI Thesaurus drugs, etc.) • Used as online dictionary and thesaurus, for mapping and document indexing. • Major releases at least twice a year, minor releases with NCIt and other updates several more times a year.

  22. NCI Metathesaurus https://ncim.nci.nih.gov 3,600,000 terms 76 Sources 1,400,000 concepts

  23. NCI Metathesaurus Choose your source 11 Sources

  24. NCI Term Browser http://nciterms.nci.nih.gov Sources

  25. EVS Products & Services Are Open • NCI Thesaurus is Open Content http://evs.nci.nih.gov/terminologies • NCI Metathesaurus is Mostly Open Source (See Each Source’s License) http://ncim.nci.nih.gov/ncimbrowser/pages/source_help_info.jsf • NCI EVS Servers Are Freely Accessible • On the Web: http://nciterms.nci.nih.gov http://ncimeta.nci.nih.gov • Via API: https://cabig.nci.nih.gov/tools/LexEVS_API https://cabig.nci.nih.gov/workspaces/Architecture/caGrid • All Software Developed by NCI EVS is Public Open Source and Free for the Asking: http://ncicb.nci.nih.gov/download/#ETools

  26. Methods of Content Retrieval • NCI ftp site: http://evs.nci.nih.gov/ftp1/FDA • NCI partner web sites (CDISC, FDA, etc.) • Request a report from NCI staff: http://ncit.nci.nih.gov/ncitbrowser/pages/contact_us • NCIt Browser by subset : • http://ncit.nci.nih.gov/pages/subset.jsf • Cancer.gov: • http://www.cancer.gov/cancertopics/terminologyresources

  27. NCIt ftp site http://evs.nci.nih.gov/ftp1 You can download the entire NCIt in various formats

  28. Shared Content Standards NICHD NHLBI NINDS NLM NIH “Roadmap” caBIG UNIIs ICSR SPL RPS CDRH Admin Procedures Other SDTM CDASH SEND ADaM Glossary SHARE Therapeutic Area Standards

  29. Consolidated Content Services FedMed SNOMED CT® UCUM

  30. NCIt Editing Priorities for 2011 • Terminology associated with standardized case report forms of all kinds • Safety reporting (drug, device, food) • EHR related terminology for the caCIS project, which is an ambulatory oncology EHR extension. • Terminology in support of the NCPDP SCRIPT standard for e-prescribing • Structured product labeling • Nanotechnology (Nanoparticle characterization etc) • Imaging (Probably device related expansion of SDTM) • caHUB (Cancer Human Biobank project)

  31. NCIt Management Goals for 2011 • Publish better terminology provenance information • Example: What was done to a terminology when it was loaded into LexEVS for a particular purpose • Terminology Metadata: Continued progress on ongoing terminology metadata collaboration with NCBO and NCRI, with goals of • adopting a core of metadata about terminology, and (for NCI) implementing on caBIG Vocab Knowledge Center • creating better ways of disseminating info that helps people choose what terminologies to use

  32. LexEVS Terminology Server • Hosts multiple coding schemes/ terminologies including NCIt • Uses LexGrid Model (now extended to comply with the draft CTS2 spec) • OWL, RRF and other loaders to convert and load terminologies • LexGrid 6.0 just released, adds value set, pick list and mapping capabilities • Documentation, see: LexEVS on caBIG Vocabulary Knowledge Center Wiki

  33. LexEVS Terminology Server: Ver 5.1 Includes the following components: • Java API - Java interface based on LexGrid 5.0 Object Model • REST/HTTP Interface - Offers an HTTP based query mechanism. Results are returned in either XML or HTML formats • SOAP/Web Services Interface - Provides a programming language neutral Service-Oriented Architecture (SOA) • Distributed LexBIG (DLB) API - A Java interface that relies on a LexEVS Proxy and Distributed LexEVS Adapter to provide remote clients access to the native LexEVS API • LexEVS 5.0 Grid Service - An interface which uses the caGRID infrastructure to provide access to the native LexEVS API via the caGRID Services

  34. LexEVS 6.0 / CTS2 – What is CTS2? • Common Terminology Services - Release 2 specifies a set of service interfaces to standardize necessary functional operations of a terminology service. • Administration • Search/ Query • Mapping Support • Authoring/ Maintenance • Focused on extending existing Health Level 7 (HL7) Common Terminology Services (CTS) specification based on consensus requirements from the user community (including LexEVS users). • Developed as an HL7 Service Functional Model (SFM); accepted as an HL7 draft standard for trial use (DSTU) and is currently an Object Management Group (OMG) RFP. OMG vote expected in June 2011

  35. What’s new in LexEVS 6.0 • LexEVS 6.0 will add comprehensive support for CTS 2 functionalities that are either partially supported or unsupported in LexEVS 5.1. • Provide expanded support for value sets • Develop the ability to provide local extensions to code sets • Provide expanded mapping ability among code sets • Develop other capabilities called for in the CTS 2 specification

  36. LexEVS 6.0 Updates (highlights) • Association/Mapping Functionality: Association Administrative Functionality; Association Search / Query Functionality; Association Author / Curation Functionality • Search / Query Functionality: Value Set Search / Query; Concept Domain Search / Query; Local Extension Search / Query • Authoring / Curation Functionality: Code System Authoring / Curation; Value Set Authoring / Curation; Concept Domain and Usage Context Authoring / Curation

  37. LexEVS 6.0: Mapping Implementation • Mapping capabilities in LexEVS are supported by associations, stored in a coding scheme format like other vocabularies. • A map relates a single specific coded concept within a specified code system (source) to a corresponding single specific coded concept (target) within the same or another code system. • A mapping coding scheme does not contain concept details; it relies on the participating source and target schemes to provide that information.

  38. Mapping Implementation (Continued) • If the map relates to code systems available in LexEVS, then the map contains resolvable concepts. • Other mappings, as defined by users and communities, are loaded as mapping schemes. • Diagram: an NCIT to ICD9 Mapping Scheme and an ICD9 to SNOMED CT Mapping Scheme connect three concepts: NCI Thesaurus: C27469 Disseminated Carcinoma; ICD9: 199 Disseminated malignant neoplasm; SNOMED CT: 307593001 Disseminated carcinomatosis. The concept mappings are resolvable due to internal links to live terminologies. Each concept has its own Properties and Associations, plus relationship links to other concepts in its own terminology (NCIT, ICD9, or SNOMED CT).
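
The mapping design on slides 37 and 38 can be illustrated with a minimal Python sketch. This is not the LexEVS API: the `Concept` and `MapEntry` classes, the `CODE_SYSTEMS` dictionary, and the `resolve` function are hypothetical names chosen for illustration. The sketch only shows the idea that a mapping scheme stores source and target codes, while concept details come from the participating code systems, so an entry is resolvable only when those systems are loaded. The codes and names are the ones shown on slide 38.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Concept:
    code: str
    name: str


@dataclass(frozen=True)
class MapEntry:
    # A map entry holds only codes; it carries no concept details itself.
    source_scheme: str
    source_code: str
    target_scheme: str
    target_code: str


# Hypothetical in-memory stand-in for the code systems loaded on a server.
# SNOMED CT is deliberately left out to show an unresolvable target.
CODE_SYSTEMS = {
    "NCI Thesaurus": {"C27469": Concept("C27469", "Disseminated Carcinoma")},
    "ICD9": {"199": Concept("199", "Disseminated malignant neoplasm")},
}

NCIT_TO_ICD9 = MapEntry("NCI Thesaurus", "C27469", "ICD9", "199")
ICD9_TO_SNOMED = MapEntry("ICD9", "199", "SNOMED CT", "307593001")


def resolve(entry: MapEntry) -> Tuple[Optional[Concept], Optional[Concept]]:
    """Look up both ends of a map entry; None means the scheme is not loaded."""
    source = CODE_SYSTEMS.get(entry.source_scheme, {}).get(entry.source_code)
    target = CODE_SYSTEMS.get(entry.target_scheme, {}).get(entry.target_code)
    return source, target


source, target = resolve(NCIT_TO_ICD9)
print(source.name, "->", target.name)
```

In this sketch, resolving `ICD9_TO_SNOMED` returns `None` for the target because no SNOMED CT code system is loaded, which mirrors the slide's point that resolvability depends on the live terminologies being available.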

  39. Early Mapping Implementation in Browser

  40. (Screen shot continued)

  41. Example: NCIt to ICD9CM Mapping Early Implementation

  42. Example: NCIt to ICD9CM Mapping Early Implementation

  43. (Screen shot continued)

  44. LexEVS 6.0: Value Set Services • What Is A Value Set Definition? A Value Set represents a uniquely identifiable set of concept codes grouped for a specific purpose. The Value Set Definition is the mechanism for describing the contents of a Value Set. The contents are concept codes defined in the referencing Code System. A Value Set can contain concept codes from one or more Code Systems.

  45. LexEVS 6.0: Value Set Services Value Set Representation / Resolution • Content of Value Sets • Code system – all concept codes in referencing code system • Value Set Definition – all concept codes defined in referencing Value Set Def • Code system/concept code – individual code • Code system/concept code + relationship + additional rules (leaf only, targetToSource, ...) • Code System/Property Name or Value match – all concept codes in the referencing code system that match a property name or property value. • Combination of any of the above with or/and/difference operators • Resolution • A value set definition has to be made against a specific version of a code system. (But it doesn’t have to be resolved against the same version.) • Philosophy: Even a simple list (a, b, c, d) needs to be resolved as, at some future date, “c” might be retired. • Resolution does not create a static artifact.
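
The resolution philosophy on slide 45 can be sketched in a few lines of Python. This is an illustration only, not LexEVS code: the version labels `v1` and `v2`, the extra code `e`, and all function names are assumptions. A definition is modeled as a function that is evaluated against the active codes of whichever code system version is chosen at resolution time, and definitions combine with or/and/difference operators.

```python
# Toy code system: version -> {concept code: is the concept still active?}
# Versions and the code "e" are invented; (a, b, c, d) comes from the slide.
CODE_SYSTEM_VERSIONS = {
    "v1": {"a": True, "b": True, "c": True, "d": True},
    "v2": {"a": True, "b": True, "c": False, "d": True, "e": True},
}


def all_codes():
    """Definition: every active concept code in the code system."""
    return lambda active: set(active)


def enumerated(*codes):
    """Definition: an explicit list of codes, filtered to those still active."""
    return lambda active: set(codes) & active


def union(left, right):
    """The 'or' operator."""
    return lambda active: left(active) | right(active)


def intersection(left, right):
    """The 'and' operator."""
    return lambda active: left(active) & right(active)


def difference(left, right):
    """The 'difference' operator."""
    return lambda active: left(active) - right(active)


def resolve(definition, version):
    """Evaluate a definition against one version; nothing is cached or stored."""
    concepts = CODE_SYSTEM_VERSIONS[version]
    active = {code for code, is_active in concepts.items() if is_active}
    return definition(active)


simple_list = enumerated("a", "b", "c", "d")
everything_but_a = difference(all_codes(), enumerated("a"))

print(sorted(resolve(simple_list, "v1")))
print(sorted(resolve(simple_list, "v2")))
```

Because `resolve` recomputes the member set on every call, the same definition yields different members once "c" is retired in the later version, which is the reason the slide gives for resolving even a simple list.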

  46. Value Set Viewer: Design Stage Search, Browse, Resolve, View, Export

  47. Another Critical Need: Value Set Editor • Started: tool to create and edit value set definitions • Steps (current design) • Enter metadata • Define components (code, name, property, relationship, enumeration of codes, or entire vocabulary) with a presentation property (e.g. preferred name), coding scheme, matching criteria, etc., using and/or • Preview value set using “Resolve” • Example: one could create the FDA subsets as value sets, using the association “concept in subset”, or in theory, create an anatomy subset that includes the is-a and part-of relationships.
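
The two examples on slide 47 (FDA subsets via "concept in subset", and an anatomy subset via is-a and part-of) both amount to defining a value set by following associations toward a target. A rough Python sketch follows, under stated assumptions: the `ASSOCIATIONS` list, the subset identifiers "SPL Color" and "SPL Shape", the anatomy terms, and the function name are all hypothetical; only the codes C48333 and C48348 come from slide 14.

```python
from collections import defaultdict

# (source, association name, target) triples. The anatomy rows and the
# subset identifiers are invented for illustration.
ASSOCIATIONS = [
    ("C48333", "concept in subset", "SPL Color"),
    ("C48348", "concept in subset", "SPL Shape"),
    ("Thumb", "is-a", "Finger"),
    ("Finger", "part-of", "Hand"),
    ("Hand", "part-of", "Upper Extremity"),
]


def resolve_by_association(target, association_names, transitive=False):
    """Collect sources that point at `target` through the named associations.

    With transitive=True the walk continues from each source it finds,
    so indirect members are included as well.
    """
    incoming = defaultdict(set)
    for source, name, tgt in ASSOCIATIONS:
        if name in association_names:
            incoming[tgt].add(source)

    found = set()
    frontier = [target]
    while frontier:
        node = frontier.pop()
        for source in incoming[node] - found:
            found.add(source)
            if transitive:
                frontier.append(source)
    return found


spl_color = resolve_by_association("SPL Color", {"concept in subset"})
anatomy = resolve_by_association(
    "Upper Extremity", {"is-a", "part-of"}, transitive=True
)
print(sorted(spl_color))
print(sorted(anatomy))
```

The `transitive` flag stands in for the kind of "additional rules" a definition might carry; a real editor would presumably expose more options, such as leaf-only filtering.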

  48. Other LexEVS work slated for this year • Better REST/ SOAP APIs • These should enable, for example: • Getting relationship information • Getting a version of a value set • Getting a change set of the differences between two versions • OWL 2 support for loaders and exports • Probably a patch release in spring, and a 6.1 release towards the end of the year.
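
As a rough illustration of what "a change set of the differences between two versions" could contain, here is a minimal Python sketch. The `change_set` function and its output shape are assumptions, not a planned LexEVS interface, and the two versions are invented sets of concept codes.

```python
def change_set(old_members, new_members):
    """Describe how a resolved value set changed between two versions."""
    old_members, new_members = set(old_members), set(new_members)
    return {
        "added": sorted(new_members - old_members),
        "removed": sorted(old_members - new_members),
        "unchanged": sorted(old_members & new_members),
    }


# Hypothetical resolved members of the same value set at two points in time.
version_1 = {"a", "b", "c", "d"}
version_2 = {"a", "b", "d", "e"}

changes = change_set(version_1, version_2)
print(changes)
```

Comparing resolved member sets is only one possible reading; a change set could also describe edits to the value set definition itself.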

  49. Contact Information Sherri de Coronado Acting Director Semantic Services decorons@mail.nih.gov Larry Wright Associate Director Enterprise Vocabulary Services lwright@mail.nih.gov Margaret Haber Associate Director Enterprise Vocabulary Services mhaber@mail.nih.gov
