Strategic Health IT Advanced Research Projects (SHARP) Area 4: Secondary Use



Presentation Transcript

Strategic Health IT Advanced Research Projects (SHARP) Area 4: Secondary Use

Dr. Friedman on-site visit, Mayo Clinic

3 September 2010

SHARP: Area 4: Secondary Use of EHR Data
  • 14 academic and industry partners
  • Develop tools and resources that influence and extend secondary uses of clinical data
  • Cross-integrated suite of projects and products
    • Clinical Data Normalization
    • Natural Language Processing (NLP)
    • Phenotyping (cohorts and eligibility)
    • Common pipeline tooling (UIMA) and scaling
    • Data Quality (metrics, missing value management)
    • Evaluation Framework (population networks)

© 2009 Mayo Clinic


  • Agilex Technologies
  • CDISC (Clinical Data Interchange Standards Consortium)
  • Centerphase Solutions
  • Deloitte
  • Group Health, Seattle
  • IBM Watson Research Labs
  • University of Utah
  • Harvard Univ. & i2b2
  • Intermountain Healthcare
  • Mayo Clinic
  • Minnesota HIE (MNHIE)
  • MIT and i2b2
  • SUNY and i2b2
  • University of Pittsburgh
  • University of Colorado
Major Achievements
  • Foster social connections across projects
  • Recognition by team members that not all problems must be solved within their team
    • NLP and phenotypes
    • Phenotypes and CEM normalization
  • Shared responsibility for overlapping dependencies

The bookends - Projects 1 & 6: Data Normalization & Evaluation

Christopher G. Chute

Stan Huff (Peter Haug)

  • Build generalizable data normalization pipeline
  • Establish a globally available resource for health terminologies and value sets
  • Establish and expand modular library of normalization algorithms
  • Iteratively test normalization pipelines, including NLP where appropriate, against normalized forms, and tabulate discordance.
  • Use cohort identification algorithms in both EMR data and EDW data. (normalize against CEMs)
  • Designation of Clinical Element Models (CEMs) as canonical form
  • Utilizing use case scenarios (PAD, CPNA, etc.) for CEM normalization.
  • Exploration into generalizable CEM models – diagnosis, medications, labs.
  • Development of processes/tools to identify relevant existing CEM models within CEM libraries
  • Development of processes to identify missing CEMs for data (and classes of data) in use-cases
  • Preliminary population of phenotype use-cases
  • Adopt eMERGE EleMap tooling for CEMs to populate canonical model
  • Formalize Meaningful Use vocabularies into LexGrid server
  • Design other components of Data Normalization framework (Terminology Services - NHIN connections)
  • Model end-to-end flow needed to produce normalized data from structured data and unstructured (natural language) data:
    • High level description of process for taking “wild-type” data instances to canonical CEM instances
    • Applicability to use-case data as well as to general classes of data
  • Adopt UIMA data flows for normalization services
  • Examine Regenstrief and SHARP 3 modules
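
The step from "wild-type" data instances to canonical instances described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `CanonicalLab` record is a hypothetical stand-in and not the actual CEM schema, the alias table and field names are invented, and the glucose conversion factor is approximate.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CanonicalLab:
    """Hypothetical canonical lab record; NOT the real CEM definition."""
    patient_id: str
    analyte: str
    value: float
    unit: str


# Illustrative alias table mapping local ("wild-type") names to one canonical name.
ANALYTE_ALIASES = {
    "glu": "glucose",
    "gluc, serum": "glucose",
    "glucose": "glucose",
}

# (analyte, source unit) -> (canonical unit, multiplier).
# 1 mmol/L of glucose is approximately 18.016 mg/dL.
UNIT_CONVERSIONS = {
    ("glucose", "mg/dl"): ("mg/dL", 1.0),
    ("glucose", "mmol/l"): ("mg/dL", 18.016),
}


def normalize_lab(raw: dict) -> Optional[CanonicalLab]:
    """Map one raw lab result to the canonical form, or None if it cannot be mapped."""
    analyte = ANALYTE_ALIASES.get(raw["name"].strip().lower())
    if analyte is None:
        return None  # unknown local name: candidate for a missing-model report
    conversion = UNIT_CONVERSIONS.get((analyte, raw["unit"].strip().lower()))
    if conversion is None:
        return None  # unknown unit for this analyte
    unit, factor = conversion
    value = round(float(raw["value"]) * factor, 2)
    return CanonicalLab(raw["patient_id"], analyte, value, unit)


if __name__ == "__main__":
    raw_rows = [
        {"patient_id": "p1", "name": "GLU", "value": "5.0", "unit": "mmol/L"},
        {"patient_id": "p2", "name": "Gluc, serum", "value": "101", "unit": "mg/dL"},
        {"patient_id": "p3", "name": "XYZ", "value": "1", "unit": "mg/dL"},
    ]
    for row in raw_rows:
        print(normalize_lab(row))
```

Returning `None` for unmapped inputs, rather than raising, keeps a record of what could not be normalized, which is one simple way to surface data with no matching canonical model.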

Overarching goal

High-throughput phenotype extraction from clinical free text based on standards and the principle of interoperability


Information extraction (IE): transformation of unstructured text into structured representations (CEMs)

Merging clinical data extracted from free text with structured data


Detailed 4-year project plan

Tasks in execution:

Investigative tasks: (1) defining CEMs and attributes as normalization targets for NLP, (2) defining set of clinical named entities and their attributes, (3) methods for cNE

Engineering tasks: (1) defining users, (2) incorporating site NLP tools into cTAKES and UIMA, (3) common conventions and requirements, (4) de-identification flow and data sharing

Forging cross-SHARP collaborations (SHARP 3, PI Kohane and Mandl)



Gold standard for cNEs, relations and CEMs

Focus on methods for cNE discovery and populating relevant CEMs (many subtasks)

Projected module releases:

Medication extraction (Nov’10)

CEM OrderMedAmb population (Mar’11)

Deep parser for cTAKES (Nov’10)

Dependency parser for cTAKES (Jan’11)

Collaboration with SHARP 3 by providing medication extraction capabilities for the medication SMaRT app

  • Overarching goal
    • To develop techniques and algorithms that operate on normalized EMR data to identify cohorts of potentially eligible subjects on the basis of disease, symptoms, or related findings
  • Focus
    • Portability of phenotyping algorithms
    • Representation of phenotyping logic
    • Measure goodness of EMR data


© 2010 Mayo Clinic



Explored use case phenotypes from eMERGE network for HTP process validation

Representation of phenotype descriptions and data elements using Clinical Element Models

Preliminary execution of phenotyping algorithms (Peripheral Arterial Disease) to compare aggregate data


Interaction and collaboration with Data Normalization and NLP teams to develop “data collection widgets”

Representation of phenotyping execution logic in a machine processable format/language

Development of machine learning methods for semi-automatic cohort identification
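
Representing phenotyping logic in a machine-processable format, as listed above, might look like the following minimal sketch. The rule structure, field names, diagnosis labels and threshold are all hypothetical; this is not the eMERGE Peripheral Arterial Disease algorithm, only an illustration of keeping the criteria as data so the same evaluator can run at different sites.

```python
import operator

# Comparison operators the declarative criteria are allowed to use.
OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
    "in": lambda observed, allowed: observed in allowed,
}

# Hypothetical phenotype definition: every criterion in "all_of" must hold.
EXAMPLE_PHENOTYPE = {
    "name": "example_phenotype",
    "all_of": [
        {"field": "diagnosis", "op": "in", "value": ["dx_a", "dx_b"]},
        {"field": "index_measure", "op": "<", "value": 0.9},
    ],
}


def matches(patient: dict, phenotype: dict) -> bool:
    """Return True when the patient satisfies every criterion of the phenotype."""
    for criterion in phenotype["all_of"]:
        observed = patient.get(criterion["field"])
        if observed is None:
            return False  # missing data never satisfies a criterion
        if not OPERATORS[criterion["op"]](observed, criterion["value"]):
            return False
    return True


def find_cohort(patients: list, phenotype: dict) -> list:
    """Return the ids of patients who satisfy the phenotype, in input order."""
    return [p["patient_id"] for p in patients if matches(p, phenotype)]


if __name__ == "__main__":
    patients = [
        {"patient_id": "p1", "diagnosis": "dx_a", "index_measure": 0.7},
        {"patient_id": "p2", "diagnosis": "dx_c", "index_measure": 0.6},
        {"patient_id": "p3", "diagnosis": "dx_b"},
    ]
    print(find_cohort(patients, EXAMPLE_PHENOTYPE))
```

Because the criteria are plain data, the definition could be serialized (for example as JSON) and shared, while each site supplies patient records already normalized to the agreed field names.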


Project 4: Infrastructure & Scalability

Jeff Ferraro

Marshall Schor

Calvin Beebe

UIMA exploitation

Some initial discussions on UIMA were held in a meeting at MIT attended by Peter Szolovits (MIT) and Guergana Savova (Harvard) and some of their team members.

A plan is underway for a UIMA "deep dive" for other members from Intermountain Healthcare and Mayo.

A discussion is pending to understand how UIMA might fit with RPE (in particular, BPEL).

RPE = Retrieve Process for Execution: an IHE (Integrating the Healthcare Enterprise) profile to automate collaborative workflow between healthcare and secondary use domains.

Infrastructure Progress
  • Code repository – Reviewed requirements (e.g. SVN), need pre-release work areas for project teams, bulk of materials will all be in public repository.
  • Licensing compatibility discussion.

Initial discussions on Open Source licensing that is consistent with UIMA and other project teams' tooling. Will need to survey teams.

  • Initial platform discussions

Still working on Sandbox (“Shared”) environment, need to consider Cloud in later phases of project.

  • Review repository options with:
    • ONC, Source Forge, Open Health Tools
  • Need to establish straw man proposal for Sandbox configuration.
  • Conduct cross-project discussions
    • Inventory tools that can be shared.
    • Inventory data that can be shared.
    • Identify shared environment site location.
    • Initiate high-level requirements gathering.

Project 5: Data Quality

Dr. Kent Bailey

(Kim Lemmerman)

  • Support data quality and ascertain data quality issues across projects
  • Deploy and enhance methods for missing or conflicting data resolution
  • Integrate methods into UIMA pipelines
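
As a rough illustration of the missing and conflicting data handling named above, the sketch below computes a per-field completeness metric and resolves conflicting observations by keeping the most recent non-missing value. The record layout, the field names and the "latest value wins" policy are assumptions chosen for the example, not methods described by the project.

```python
def completeness(records: list, fields: list) -> dict:
    """Fraction of records with a non-missing value, per field."""
    total = len(records)
    result = {}
    for field in fields:
        present = sum(1 for record in records if record.get(field) is not None)
        result[field] = present / total if total else 0.0
    return result


def resolve_conflicts(observations: list) -> dict:
    """Keep the most recent non-missing value per (patient_id, field).

    Dates are assumed to be ISO 8601 strings (YYYY-MM-DD), which sort
    chronologically when compared as plain strings.
    """
    latest = {}
    for obs in observations:
        if obs["value"] is None:
            continue  # a missing value never overrides a recorded one
        key = (obs["patient_id"], obs["field"])
        if key not in latest or obs["date"] > latest[key]["date"]:
            latest[key] = obs
    return {key: obs["value"] for key, obs in latest.items()}


if __name__ == "__main__":
    records = [
        {"patient_id": "p1", "smoker": "yes", "height_cm": None},
        {"patient_id": "p2", "smoker": None, "height_cm": None},
    ]
    print(completeness(records, ["smoker", "height_cm"]))

    observations = [
        {"patient_id": "p1", "field": "smoker", "value": "yes", "date": "2009-01-10"},
        {"patient_id": "p1", "field": "smoker", "value": "no", "date": "2010-06-02"},
        {"patient_id": "p1", "field": "smoker", "value": None, "date": "2010-08-15"},
    ]
    print(resolve_conflicts(observations))
```

Other resolution policies (majority vote, source priority, flagging for manual review) would slot into the same place; the choice depends on the data quality plan each project agrees on.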
Progress & Planned
  • Integrate across projects and gather requirements and standards to establish data quality plan and metrics
  • Compare expected quality of data to actual data quality
  • Provide recommendation and methods to improve data quality and/or possible outcomes
  • Started early with face-to-face collaboration; cross-knowledge pollination
  • Individual project efforts synergized, with timelines in sync; use cases vetted and determined for the first six months of focus.
  • IRB and data sharing issues have been raised; best practices were shared and an inventory of existing agreements between institutions was reviewed.
  • Best practices for IRB submissions and template protocol material will be made available w/ applicable state implications
  • Data use agreements will be completed across sites where needed in short term; effort for ‘consortium’ agreement will commence for long-term data sharing needs

Cross-ONC Efforts

Dr. Christopher Chute

SHARP Area Synergies
  • Security: ensure the integrity of pipelined data cannot be compromised
  • Cognitive: explore how normalized data and phenotypes can contribute to decisions
  • Applications: Potential for shared architectural strategies

Beacon Synergies
  • High-throughput data normalization and phenotyping (SHARP)
  • Applied to population laboratory (Beacon)
  • Validate on consented sub-samples
  • Potential to include ALL patients in population area – regardless of provider

SE MN Beacon: More information…
