
Kai Zheng, PhD, Qiaozhu Mei, PhD, David A. Hanauer, MD University of Michigan - PowerPoint PPT Presentation



Developing an Intelligent and Socially Oriented Search Query Recommendation Service for Facilitating Information Retrieval in Electronic Health Records. Kai Zheng, PhD, Qiaozhu Mei, PhD, David A. Hanauer, MD University of Michigan.


Developing an Intelligent and Socially Oriented Search Query Recommendation Service for Facilitating Information Retrieval in Electronic Health Records

Kai Zheng, PhD, Qiaozhu Mei, PhD, David A. Hanauer, MD

University of Michigan

- On Behalf of William Wilcox, Danny Wu, and Lei Yang


Information Retrieval in EHR

Millions of patient records

Specialized language

Rich, implicit intra/inter document structures

Deep NLP/Text Mining is necessary

Complicated information needs

Privacy is a big concern


Problem Statement

Electronic health records (EHRs), through their capability to acquire and store vast volumes of data, provide great potential to help create a “rapid learning” healthcare system

However, retrieving information from narrative documents stored in EHRs is extraordinarily challenging, e.g., due to frequent use of non-standard terminologies and acronyms


Problem Statement (Cont.)

Similar to how Google has changed the way people find information on the web, a Google-like, full-text search engine can be a viable solution to increasing the value of unstructured clinical narratives stored in EHRs

However, average users are often unable to construct effective and inclusive search queries due to their lack of search expertise and/or domain knowledge


Proposed Solution

An intelligent query recommendation service that can be used by any EHR search engine to

  • Artificial Intelligence: augment human cognition so that average users can quickly construct high quality queries in their EHR search 

  • Collective (social) Intelligence: engender a collaborative and participatory culture among users so that search queries can be socially formulated and refined, and search expertise can be preserved and diffused across people and domains


A Typical IR System Architecture

[Figure: IR system diagram with the components Documents, Indexing, Doc Rep, Query, Query Rep, Interface, Searching, Ranking, Results, Users, Feedback, and Query Modification]


EMERSE

EMERSE - Electronic Medical Record Search Engine

Full-text search engine

Created by David Hanauer

Widely used in UMHS since 2005 (and VA)

Boolean keyword queries

Routinely utilized by frontline clinicians, medical coding personnel, quality officers, and researchers at the University of Michigan Health System

The test platform for the solutions being built through this project


Specific Aims of the Project

Aim #1: Developing AI-based Query Recommendation Algorithms

Aim #2: Leveraging Social Intelligence to Enhance EHR Search

Aim #3: Defining a Flexible Service Architecture


Aim #1: Developing AI-based Query Recommendation Algorithms

Clinicians have great difficulty formulating queries that express their information needs

EMERSE provides “semi-automatic” query suggestions (synonyms, spelling, etc.)

Example: uti → uti "urinary tract infection"

25% adoption rate!

Text mining/machine learning methods to automatically select alternative query terms

Technical details are left for later in the talk
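
The "uti" example can be illustrated with a small synonym lookup. This is only a sketch: the `SYNONYMS` table and the `suggest_expansion` function are hypothetical names introduced here, and the slides do not describe how EMERSE actually produces its suggestions.

```python
# Hypothetical synonym table; the real suggestion source is not described in the slides.
SYNONYMS = {
    "uti": ["urinary tract infection"],
    "mi": ["myocardial infarction"],
}


def suggest_expansion(query, synonyms=SYNONYMS):
    """Return the query with synonym phrases appended after each known term."""
    parts = []
    for term in query.split():
        parts.append(term)
        for phrase in synonyms.get(term.lower(), []):
            # Multi-word phrases are quoted so they are searched as a unit.
            quoted = f'"{phrase}"' if " " in phrase else phrase
            if quoted not in parts:
                parts.append(quoted)
    return " ".join(parts)


print(suggest_expansion("uti"))  # uti "urinary tract infection"
```

In a semi-automatic setting the expanded string would be shown to the user as a suggestion to accept or reject, not applied silently.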


Aim #2: Leveraging Social Intelligence to Enhance EHR Search

Enhancing AI-based algorithms with social intelligence:

  • Allow users to bundle search terms and share

  • Social appraisal

  • Classifying search term bundles for easy retrieval

  • Other community features

  • Enhancing collaboration among user communities across institutions


Aim #3: Defining a Flexible Service Architecture

A service-oriented architecture serving general search knowledge

Locally implementable APIs

Implementation of the community features


System Architecture


To Challenge Us – Why Bother?

Q1: Is this different from PubMed?

  • EHRs have very different properties

Q2: Is this different from Google?

  • Very different information needs in EHR search

Q3: Could “social search” even work?


Dictated Notes vs Typed Notes

Hypothesis: dictated and typed notes differ considerably in their lexical and structural properties. Such differences could significantly affect the performance of natural language processing tools, so the two types of documents may need to be treated differently

Data: 30,000 dictated notes and 30,000 typed notes of deceased patients, randomly sampled

Same genre: encounter notes that physicians composed to describe an outpatient encounter or to communicate with other clinicians regarding patient conditions


Comparison: Vocabulary

[Figure: vocabulary comparison. Surviving labels, in extraction order: 64,487; > 80%; OHSUMED; 172]

UMLS+: English dictionaries + commonly used medical terminologies + all concepts/terms in UMLS


Comparison: Acronym Usage


Comparative Analysis: Perplexity

Fewer occurrences

Sparser information!

Fewer functional words

Words repeat less

Higher perplexity/randomness

* Typed notes have higher variance of almost all document measures
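
To make the link between "words repeat less" and "higher perplexity" concrete, here is a minimal sketch that scores a token list under its own maximum-likelihood unigram model. The choice of a unigram model and the function name `unigram_perplexity` are assumptions made for illustration; the slides do not state which language model was used in the study.

```python
import math
from collections import Counter


def unigram_perplexity(tokens):
    """Perplexity of a token sequence under its own maximum-likelihood unigram model."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    counts = Counter(tokens)
    total = len(tokens)
    # Average negative log-likelihood per token (natural log).
    avg_nll = -sum(c * math.log(c / total) for c in counts.values()) / total
    return math.exp(avg_nll)


repetitive = "pain pain pain pain".split()
varied = "pain fever cough rash".split()
print(unigram_perplexity(repetitive))  # lowest possible value: every token is the same
print(unigram_perplexity(varied))      # higher: four distinct, equally likely tokens
```

Under this toy model, a text that never repeats a word has perplexity equal to its vocabulary size, which is why sparser notes score higher.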


Lessons Learned

Clinical notes are much noisier than biomedical literature

Among clinical notes, those typed by physicians are much noisier and sparser than dictated notes.

What about different genres of notes?

These differences in linguistic properties imply potential difficulties for natural language processing


Analysis of EMERSE Query Log

[Figures: query log distributions over hours of a day and days of a week (Mon - Sun)]

202,905 queries collected over 4 years

533 users (medical professionals in UMHS)

35,928 user sessions (sequences of queries)


Query Distribution – Not a Power Law!

Long tail, but no fat head


A Categorization of EHR Search Queries

Almost no navigational queries; most queries are informational/transactional

Using the top-level concepts of SNOMED CT


Comparison to Web Search

Almost no navigational queries (Web: ~30%)

Average query length (Web: 2.3):

  • User typed in: 1.7

  • All together (typed in + query suggestions + bundles): 5.0

Queries with acronyms: 18.9% (Web: ~5%)

Dictionary coverage: 68% (Web: 85%-90%)

Average length of session: 5.64 queries (Web: 2.8)

Query suggestions adopted: 25.9% (Web: < 10%)


Lessons Learned

Question: Can the users help each other to formulate queries?

Medical search is much more challenging than Web search

  • More complicated information needs

  • Longer queries, more noise

Users have substantial difficulty formulating their queries

  • Longer search sessions

  • High adoption rate of system-generated suggestions


“Social” (Collaborative) Search in EMERSE

- Zheng, Mei, Hanauer. Collaborative search in electronic health records. JAMIA 2011

Changing a search experience into a social experience

Users create search bundles (bundled queries)

  • Collection of keywords that are found effective as a query

  • Reuse search bundles

  • Share them with other users

Public sharing vs. private sharing

Search knowledge diffuses from bundle creators to bundle users
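
As a rough illustration of the bundle idea, a search bundle can be modeled as a named term collection with a sharing scope. Every name below (`SearchBundle`, its fields, and its methods) is hypothetical and chosen only for this sketch; it does not reflect how EMERSE stores bundles.

```python
from dataclasses import dataclass, field


@dataclass
class SearchBundle:
    """A named, reusable collection of search terms (all field names are hypothetical)."""
    name: str
    creator: str
    terms: list
    public: bool = False                            # public sharing: visible to every user
    shared_with: set = field(default_factory=set)   # private sharing: named users only

    def visible_to(self, user):
        """True if the user may see and reuse this bundle."""
        return self.public or user == self.creator or user in self.shared_with

    def matches(self, note_text):
        """True if any bundled term occurs in the note (case-insensitive substring match)."""
        text = note_text.lower()
        return any(term.lower() in text for term in self.terms)


gvhd = SearchBundle(
    name="GVHD",
    creator="alice",
    terms=["GVHD", "graft versus host disease", "graft vs host"],
    shared_with={"bob"},
)
print(gvhd.visible_to("bob"), gvhd.visible_to("carol"))
print(gvhd.matches("History of graft versus host disease."))
```

Keeping the sharing scope on the bundle itself makes the public/private distinction a single flag plus a recipient set; a production system would more likely keep it in a separate access-control table.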


Example: a Search Bundle


Share a Bundle Publicly/Privately


The Effectiveness of Collaborative Search

Search bundles (as of Dec. 2009):

  • 702 bundles

  • 58.7% of active users

  • Almost half of the pageviews

  • 19.3% of all queries (as of Dec. 2010)

  • 27.7% of search sessions ended with a search bundle (as of Dec. 2010)

  • Bundle creators: 188

  • Bundle sharers: 91

  • Bundle leechers: 77


Example Bundles

GVHD: "GVHD" "GVH" "Graft-Versus-Host-Disease" "Graft-Versus-Host Disease" "Graft Versus Host Disease" "Graft Versus Host" "Graft-Versus-Host" "Graft vs. Host Disease" "Graft vs Host Disease" "Graft vs. Host" "Graft vs Host"


Example Bundle (cont.)

Myocardial infarction:

NSTEMI STEMI ~AMI "non-stelevation" "non stelevation" "st elevation MI" "stelevation" "acute myocardial infarction" "myocardial infarction" "myocardial infarct" "anterior infarction" "anterolateralinfarction" "inferior infarction" "lateral infarction" "anteroseptalinfarction" "anterior MI" "anterolateralMI" "inferior MI" "lateral MI" "anteroseptalMI" infarcted infarction infarct infract "Q wave MI" "Q-wave MI" "Q wave" "Q-wave" "st segment depression" "t wave inversion" "t-wave inversion" "acute coronary syndrome" "non-specific ST wave abnormality" "non specific ST wave abnormality" "ST wave abnormality" "ST-wave abnormality" "CPK-MB" "CPK MB" "troponin" ~^MI -$"MI \s*\d{5}" -systemic


Bundle Sharing Across Departments


Bundle Sharing Across Individual Users

Red links: cross department links


Bundle Sharing Facilitated Diffusion of Information

Quantitative network analysis of search knowledge diffusion networks

Giant component exists

Small world (high clustering coefficient & short paths)

Publicly shared bundles better facilitate knowledge diffusion

  • Privately shared bundles add on top of public bundles

Users tend to share bundles with people in the same department, but specialty is a more natural representation of communities (based on modularity)
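
The two small-world indicators named on this slide, clustering coefficient and path length, can be computed on any undirected sharing graph. The sketch below uses a four-user toy graph; the graph, the user names, and both function names are hypothetical, and the slides do not say how the actual diffusion network was constructed or measured.

```python
from collections import deque


def average_clustering(adj):
    """Mean local clustering coefficient of an undirected graph given as {node: set(neighbors)}."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbors contribute 0
        nb = list(nbrs)
        links = sum(1 for i in range(k) for j in range(i + 1, k) if nb[j] in adj[nb[i]])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length over reachable ordered pairs, using BFS from every node."""
    dist_sum, pairs = 0, 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        dist_sum += sum(dist.values())
        pairs += len(dist) - 1
    return dist_sum / pairs


# Toy sharing graph: a, b, c share with each other; d only shares with c.
graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
print(average_clustering(graph))
print(average_path_length(graph))
```

A network is usually called small-world when its clustering is much higher, and its average path length about as short, as in a random graph with the same number of nodes and edges.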


Lessons Learned

Medical search is much more challenging than Web search

Users have substantial difficulty formulating their queries

  • Longer search sessions

  • High adoption rate of system-generated suggestions

  • High usage of search bundles

Collaborative search has facilitated the sharing/diffusion of search knowledge

  • Public bundles are more effective than private

  • 30% of bundle users are leechers; half of the bundle creators don’t share


Automatic Query Recommendation: Methods

Similarity based (kNN)

Pseudo-feedback

Semantic term expansion

Network-based ranking

Learning to rank (much labeled training data needed)


Automatic Query Recommendation: Available Information

Information to leverage:

  • Co-occurrence within queries

  • Transition in query sessions

  • Co-occurrence within clinical documents

  • Annotation by ontological concepts

  • Ontology structures

  • Morphological closeness

  • Clickthrough


A Network View


Random Walk and Hitting Time

[Figure: a random walk at vertex i moves to vertex k with probability 0.3 and to vertex j with probability 0.7; A is the target vertex set]

Hitting Time

  • T_A: the first time that the random walk is at a vertex in A

Mean Hitting Time

  • h_i^A: expectation of T_A given that the walk starts from vertex i


Computing Hitting Time

h_i^A = 0.7 h_j^A + 0.3 h_k^A + 1

  • T_A: the first time that the random walk is at a vertex in A

  • h_i^A: expectation of T_A given that the walk starts from vertex i

Apparently, h_i^A = 0 for those i in A

Iterative Computation
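
The equation on this slide is one instance of the standard first-step recurrence for mean hitting time. Writing p_{ij} for the transition probability from vertex i to vertex j (a symbol assumed here; the slide only shows the values 0.7 and 0.3), the general form and one common fixed-point iteration can be written as follows. Whether this is exactly the iteration the authors had in mind is an assumption.

```latex
% Mean hitting time: zero inside the target set, one step plus the
% expected remaining time elsewhere.
h_i^A =
\begin{cases}
0, & i \in A, \\[4pt]
1 + \sum_{j} p_{ij}\, h_j^A, & i \notin A.
\end{cases}

% Iterative computation: start from h^{(0)} = 0 and repeat until the
% values stop changing.
h_i^{(t+1)} =
\begin{cases}
0, & i \in A, \\[4pt]
1 + \sum_{j} p_{ij}\, h_j^{(t)}, & i \notin A.
\end{cases}
```

With p_{ij} = 0.7 and p_{ik} = 0.3, the second case reduces to the slide's equation for vertex i.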


Generate Query Suggestion

  • Construct a (kNN) subgraph centered on the query term(s)

  • Could be bipartite

  • Compute transition probabilities (based on co-occurrence/similarity)

  • Compute hitting time h_i^A

  • Rank candidate queries using h_i^A

[Figure: bipartite graph linking query terms (uti, bacterial, urinary tract infection) to notes/concepts/sessions (D1, D2, D3…) through weighted edges (e.g., 4, 2)]
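
The steps on this slide can be sketched end to end on a toy bipartite graph. Everything below is illustrative: the co-occurrence counts, the extra term "fracture", and the function names are hypothetical, the kNN pruning step is skipped because the toy graph is already tiny, and the target set A is taken to be the single query term.

```python
def build_bipartite(cooccurrence):
    """Turn {term: {doc: count}} into a symmetric weighted adjacency dict."""
    graph = {}
    for term, docs in cooccurrence.items():
        for doc, count in docs.items():
            graph.setdefault(term, {})[doc] = count
            graph.setdefault(doc, {})[term] = count
    return graph


def hitting_times(graph, targets, iterations=200):
    """Estimate the mean hitting time from every node to the target set by fixed-point iteration."""
    h = {node: 0.0 for node in graph}
    for _ in range(iterations):
        new_h = {}
        for node, edges in graph.items():
            if node in targets:
                new_h[node] = 0.0
                continue
            total = sum(edges.values())
            # Transition probability = edge weight / total weight leaving the node.
            new_h[node] = 1.0 + sum(w / total * h[nbr] for nbr, w in edges.items())
        h = new_h
    return h


def suggest(cooccurrence, query_term, top_k=3):
    """Rank the other terms by ascending hitting time to the query term."""
    graph = build_bipartite(cooccurrence)
    h = hitting_times(graph, {query_term})
    candidates = [t for t in cooccurrence if t != query_term]
    return sorted(candidates, key=lambda t: h[t])[:top_k]


# Hypothetical term-to-note co-occurrence counts.
counts = {
    "uti": {"D1": 4, "D2": 2},
    "urinary tract infection": {"D1": 3, "D2": 1},
    "bacterial": {"D2": 1, "D3": 5},
    "fracture": {"D3": 2},
}
print(suggest(counts, "uti", top_k=2))  # ['urinary tract infection', 'bacterial']
```

Terms that share many notes with the query reach it in few steps and rank first; a fixed iteration count is used for brevity where a convergence check would be more careful.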


Other Network-based Methods

Stationary distribution

Absorbing probability

Commute time

Other measures

More general: network regularization


Ranking with Multiple Networks

[Figure: several networks over the same nodes (A, B, C, D, ……), labeled Query transitions, Distributional similarity, and Ontology structures, leading to Suggested Queries]

Ranking/Transductive Learning with Multiple Views (e.g., Zhou et al. 2007, Muthukrishnan et al. 2010)


Evaluation

Cranfield evaluation (adopted by TREC)

  • Sample information needs → queries

  • Fixed test document collection

  • Pool results of multiple candidate systems

  • Human annotation of relevance judgments

  • IR Evaluation (e.g., MAP, NDCG)

Direct rating by users (bucket testing)
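
For readers unfamiliar with the two metrics named above, here is a minimal sketch of MAP and NDCG computed from per-query relevance judgments. The function names are chosen for this example, and the simplifications (average precision normalized by the relevant documents present in the list, NDCG ideal ordering taken from the same list, log2 discount) are assumptions, not a description of the project's evaluation setup.

```python
import math


def average_precision(ranked_relevance):
    """AP for one query; ranked_relevance[i] is 1 if the result at rank i+1 is relevant.

    Assumes every relevant document appears somewhere in the ranked list.
    """
    hits, precision_sum = 0, 0.0
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(runs):
    """MAP: the mean of per-query average precision."""
    return sum(average_precision(r) for r in runs) / len(runs)


def ndcg(gains, k=None):
    """NDCG@k with a log2 discount; the ideal ordering is the same gains sorted descending."""
    k = len(gains) if k is None else k

    def dcg(values):
        return sum(g / math.log2(i + 2) for i, g in enumerate(values[:k]))

    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


runs = [[1, 0, 1, 0], [0, 1]]   # binary relevance judgments for two queries
print(mean_average_precision(runs))
print(ndcg([3, 2, 1]))          # already in ideal order
print(ndcg([0, 1]))             # the only relevant result is ranked second
```

MAP works with binary judgments, while NDCG accepts graded relevance, which matters when annotators rate notes on more than a yes/no scale.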


Towards the Next Generation EHR Search Engine

Better understanding of information needs by medical professionals

  • frontline clinicians, administrative personnel, and clinical/translational researchers

Better natural language processing for patient records

Better mechanisms of automatic query recommendation in the medical context

Better ways to facilitate collaborative search and preserve search knowledge

Better ways to improve the comprehensibility of medical data by patients and families (future)


Publications to Date

Kai Zheng, Qiaozhu Mei, David A. Hanauer. Collaborative search in electronic health records. JAMIA. 2011;18(3):282–91.

Lei Yang, Qiaozhu Mei, Kai Zheng, David A. Hanauer. Query log analysis of an electronic health record search engine. AMIA Annual Symposium Proc. 2011. (forthcoming)

Kai Zheng, Qiaozhu Mei, Lei Yang, Frank J. Manion, Balis UJ, David A. Hanauer. Voice-dictated versus typed-in clinician notes: Linguistic properties and the potential implications on natural language processing. AMIA Annual Symposium Proc. 2011. (forthcoming)


Thanks!

