

Real-Time Clinical Warning for Hospitalized Patients via Data Mining (数据挖掘实现的住院病人的实时预警)

Department of Computer Science and Engineering

Yixin Chen (陈一昕), Yi Mao, Minmin Chen, Rahav Dor, Greg Hackermann, Zhicheng Yang, Chengyang Lu

School of Medicine

Kelly Faulkner, Kevin Heard, Marin Kollef, Thomas Bailey

  • Direct ICU costs per day for survivors are between six and seven times those of non-ICU care.
  • Unlike ICU patients, general hospital ward (GHW) patients are not under extensive electronic monitoring and nurse care.
  • Clinical studies have found that 4–17% of patients undergo cardiopulmonary or respiratory arrest while in the GHW.
Project mission

Sudden deteriorations (e.g. septic shock, cardiopulmonary or respiratory arrest) of GHW patients can often be severe and life threatening.

Goal: Provide early detection and intervention based on data mining to prevent these serious, often life-threatening events.

Using both clinical data and wireless body sensor data

An NIH-ICTS-funded project, currently in clinical trials at Barnes-Jewish Hospital, St. Louis, MO


What exactly do we predict?

Is he going to die?

Is he going to the ICU?

System Architecture

  • Tier 1: EWS (early warning system)
    • Clinical data, lab tests, manually collected, low frequency
  • Tier 2: RDS (real-time data sensing)
    • Body sensor data, automatically collected, wirelessly transmitted, high frequency





Early warning system (EWS)

Background and overview

Real-time data sensing (RDS)

Future work


Medical Record (34 vital signs: pulse, temperature, oxygen saturation, shock index, respirations, age, blood pressure …)



Related Work

Acute Physiology Score, Chronic Health Score, and APACHE score are used to predict renal failures.

Modified Early Warning Score (MEWS)






Main problem: Most previous work uses a snapshot method that takes all the features at a given time as input to a model, discarding the temporal evolution of the data.

Overview of EWS

Goal: Design a data mining algorithm that can automatically identify patients at risk of clinical deterioration based on the time series in their existing electronic medical records.


  • Classification of high-dimensional time series data
  • Irregular data gaps
  • Measurement errors
  • Class imbalance
Key Techniques in the EWS Algorithm

Temporal bucketing

Discriminative classification

Bootstrap aggregating (bagging)

Exploratory under-sampling

Exponential moving average smoothing

Kernel-density estimation

Data Preprocessing

Outlier removal


Temporal Bucketing

Bucket 1 | Bucket 2 | Bucket 3 | Bucket 4 | Bucket 5 | Bucket 6

We retain data in a sliding window covering the last 24 hours and divide it evenly into 6 buckets.

To capture temporal variations, we compute several feature values for each bucket, including the minimum, maximum, and average.
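The bucketing step can be sketched as follows. This is a minimal illustration, not the project's code: the function name `bucket_features`, the hour-based timestamps, and the choice to return `None` for an empty bucket are all assumptions.

```python
from statistics import mean

def bucket_features(readings, now, window_hours=24.0, n_buckets=6):
    """Summarize one vital sign over the last `window_hours` hours.

    readings: list of (timestamp_in_hours, value) pairs.
    Returns one (min, max, avg) tuple per bucket, oldest bucket first.
    A bucket with no readings yields None (an assumed convention).
    """
    width = window_hours / n_buckets
    start = now - window_hours
    buckets = [[] for _ in range(n_buckets)]
    for t, value in readings:
        if start < t <= now:
            # Map the timestamp to a bucket; t == now is clamped into the last bucket.
            idx = min(int((t - start) / width), n_buckets - 1)
            buckets[idx].append(value)
    return [(min(b), max(b), mean(b)) if b else None for b in buckets]

# Hypothetical pulse readings taken at irregular times during the last day.
pulse = [(1.0, 80), (3.0, 84), (9.5, 90), (23.0, 110), (24.0, 118)]
features = bucket_features(pulse, now=24.0)
```

Empty buckets are a direct consequence of the irregular data gaps in ward records, so a real pipeline would need some imputation rule in place of `None`.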

Discriminative Classification
  • Logistic regression (LR)
  • Support vector machine (SVM)
  • Use the max, min, and avg of each bucket and each vital sign as the input features (~400 features in total).
  • Use the training data to learn the model parameters.

Clinical data → Data preprocessing → Temporal bucketing → Classification algorithm → Output model, threshold

Bootstrap Aggregating (Bagging)


1. Handles outliers

2. Avoids over-fitting

3. Better model quality
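A minimal sketch of bagging, under stated assumptions: to stay self-contained it uses a one-feature threshold stump as the base learner instead of the logistic regression used in the slides, and the helper names and sample data are hypothetical. Each model is trained on a bootstrap sample drawn with replacement, and predictions are averaged.

```python
import random

def fit_stump(samples):
    """Assumed base learner: threshold halfway between the two class means."""
    pos = [x for x, y in samples if y == 1]
    neg = [x for x, y in samples if y == 0]
    if not pos or not neg:  # bootstrap sample missed one class entirely
        return None
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def bagged_stumps(samples, n_models=25, seed=0):
    """Train each stump on a bootstrap sample of the training data."""
    rng = random.Random(seed)
    models = []
    while len(models) < n_models:
        boot = [rng.choice(samples) for _ in samples]  # draw with replacement
        threshold = fit_stump(boot)
        if threshold is not None:
            models.append(threshold)
    return models

def predict(models, x):
    """Aggregate by averaging the votes; returns a score in [0, 1]."""
    return sum(1 for t in models if x > t) / len(models)

# Hypothetical (pulse, deteriorated?) training pairs.
samples = [(72, 0), (75, 0), (78, 0), (80, 0), (85, 0),
           (100, 1), (105, 1), (110, 1), (120, 1)]
models = bagged_stumps(samples)
```

Because every stump sees a slightly different resample, a single extreme reading shifts only some of the thresholds, which is one way to read the "handles outliers" point above.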

Evaluation Criteria

AUC (area under the receiver operating characteristic (ROC) curve) represents the probability that a randomly chosen positive example is correctly rated with greater suspicion than a randomly chosen negative example.
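The probabilistic reading of AUC can be computed directly by comparing every positive example with every negative one. The sketch below does exactly that; counting ties as half a win is a common convention assumed here, and the function name is illustrative.

```python
def auc(scores, labels):
    """Fraction of (positive, negative) pairs in which the positive scores higher.

    Ties count as half a win (an assumed, commonly used convention).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

perfect = auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
mixed = auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])
```

The pairwise loop is quadratic in the number of examples; it is meant to mirror the definition, not to be efficient.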

Results on Historical Database

At specificity=0.95

1: bucketing + logistic regression

2: bucketing + logistic regression + bagging

3: bucketing + logistic regression + bucket bagging

4: bucketing + logistic regression + biased bucket bagging

5: bucketing + logistic regression + biased bucket bagging + exploratory undersampling
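The slides do not spell out the exploratory undersampling procedure. One simplified reading, assumed here, is to draw several random subsets of the majority class, pair each with all minority cases, and train one model per balanced subset. The sketch below only builds the subsets; label 1 is assumed to be the minority class.

```python
import random

def balanced_subsets(samples, n_subsets=5, seed=0):
    """Build balanced training sets by undersampling the majority class.

    samples: list of (features, label) pairs; label 1 is assumed to be
    the rare class (e.g. patients who deteriorated).
    """
    rng = random.Random(seed)
    minority = [s for s in samples if s[1] == 1]
    majority = [s for s in samples if s[1] == 0]
    subsets = []
    for _ in range(n_subsets):
        # A different random slice of the majority class each time.
        drawn = rng.sample(majority, len(minority))
        subsets.append(minority + drawn)
    return subsets

# Hypothetical data: 20 stable patients, 4 who deteriorated.
samples = [(i, 0) for i in range(20)] + [(100 + i, 1) for i in range(4)]
subsets = balanced_subsets(samples, n_subsets=3)
```

Using several subsets means most majority cases are seen by at least one model, which plain one-shot undersampling would discard.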


Clinical Trial at Barnes-Jewish Hospital

Alerts have already triggered early interventions that may have prevented deaths.






Background & Related work

Future work

Early warning system (EWS)

Real-time data sensing (RDS)


Overview of RDS

  • A challenging problem
  • Classification based on multiple high-frequency real-time time-series (heart rate, pulse, oxygen sat., CO2, temperature, etc.)
Overview of Learning Algorithm

Key techniques:

Feature extraction from multiple time series

Feature selection


Exploratory undersampling


A Large Pool of Features


  • Detrended fluctuation analysis (DFA) features
  • Approximate entropy (ApEn)
  • Spectral features
  • First-order features
  • Second-order features
  • Cross-sign features

Detrended Fluctuation Analysis (DFA)

DFA is a method for quantifying the statistical self-affinity of a time-series signal. (See, e.g., Peng et al. 1994)

Applicable to both pulse rate and SpO2
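A compact sketch of first-order DFA, under assumptions: the window sizes, the use of non-overlapping windows, and the function names are illustrative choices, not taken from the slides. The series is integrated, a line is fitted in each window, and the scaling exponent is the slope of log fluctuation against log window size.

```python
import math
import random

def _residual_sq(ys):
    """Sum of squared residuals of a least-squares line fitted to ys."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(ys))
    slope = sxy / sxx
    return sum((y - (y_mean + slope * (i - x_mean))) ** 2
               for i, y in enumerate(ys))

def dfa_exponent(series, scales=(8, 16, 32, 64)):
    """Estimate the DFA scaling exponent of a series."""
    mean = sum(series) / len(series)
    profile, total = [], 0.0
    for x in series:                      # integrate the mean-centered series
        total += x - mean
        profile.append(total)
    log_n, log_f = [], []
    for n in scales:
        n_windows = len(profile) // n
        sq = sum(_residual_sq(profile[w * n:(w + 1) * n])
                 for w in range(n_windows))
        fluctuation = math.sqrt(sq / (n_windows * n))
        log_n.append(math.log(n))
        log_f.append(math.log(fluctuation))
    # Slope of log F(n) versus log n by least squares.
    a = sum(log_n) / len(log_n)
    b = sum(log_f) / len(log_f)
    return (sum((u - a) * (v - b) for u, v in zip(log_n, log_f))
            / sum((u - a) ** 2 for u in log_n))

rng = random.Random(1)
noise = [rng.gauss(0, 1) for _ in range(1024)]   # uncorrelated signal
walk, level = [], 0.0
for step in noise:                               # strongly correlated signal
    level += step
    walk.append(level)
alpha_noise = dfa_exponent(noise)
alpha_walk = dfa_exponent(walk)
```

Uncorrelated noise is expected to give an exponent near 0.5 and a random walk a clearly larger one, so the exponent acts as a single-number summary of long-range correlation in a vital sign.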


Spectral Analysis (FFT)

Used component values of VLF (<0.04 Hz), LF (0.04–0.15 Hz), HF (0.15–0.4 Hz), and the ratio LF/HF for each signal.
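The band features can be sketched as below. To avoid external libraries this uses a direct discrete Fourier transform, not an FFT; the result is the same, only slower. The function name, the handling of band edges, and the test signal are assumptions.

```python
import cmath
import math

def band_powers(signal, fs):
    """Sum spectral power in the VLF, LF and HF bands and report LF/HF.

    signal: evenly sampled values; fs: sampling rate in Hz.
    Uses a direct O(n^2) DFT for clarity.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]     # drop the constant component
    bands = {"VLF": 0.0, "LF": 0.0, "HF": 0.0}
    for k in range(1, n // 2 + 1):
        freq = k * fs / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        power = abs(coeff) ** 2
        if freq < 0.04:
            bands["VLF"] += power
        elif freq < 0.15:
            bands["LF"] += power
        elif freq <= 0.4:
            bands["HF"] += power
    bands["LF/HF"] = (bands["LF"] / bands["HF"]
                      if bands["HF"] > 0 else float("inf"))
    return bands

# Hypothetical signal: a 0.1 Hz oscillation sampled once per second.
fs = 1.0
signal = [math.sin(2 * math.pi * 0.1 * i / fs) for i in range(200)]
bands = band_powers(signal, fs)
```

With a 0.1 Hz test tone, nearly all of the power should land in the LF band.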

Other Features

Approximate Entropy (ApEn): It quantifies the unpredictability of fluctuations in a time series.

A low value → deterministic

A high value → unpredictable
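A short sketch of approximate entropy in its usual form, where template vectors of length m and m + 1 are compared under a tolerance r. The parameter defaults and the absolute (not standard-deviation-scaled) tolerance are assumptions made for this example.

```python
import math
import random

def approximate_entropy(series, m=2, r=0.2):
    """ApEn(m, r): low for regular series, higher for irregular ones.

    r is an absolute tolerance here (an assumption); it is often set
    relative to the standard deviation of the series.
    """
    def phi(dim):
        n_vec = len(series) - dim + 1
        vectors = [series[i:i + dim] for i in range(n_vec)]
        total = 0.0
        for a in vectors:
            # Count templates within tolerance r (Chebyshev distance),
            # including the self-match, so the count is never zero.
            matches = sum(
                1 for b in vectors
                if max(abs(p - q) for p, q in zip(a, b)) <= r
            )
            total += math.log(matches / n_vec)
        return total / n_vec
    return phi(m) - phi(m + 1)

periodic = [0.0, 1.0] * 50                      # perfectly regular
rng = random.Random(0)
noisy = [rng.random() for _ in range(100)]      # irregular
apen_periodic = approximate_entropy(periodic)
apen_noisy = approximate_entropy(noisy)
```

The alternating series is fully predictable from its recent past, so its value should sit near zero, while the random series should score noticeably higher.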

First Order Features:

Mean, standard deviation, skewness (asymmetry of the distribution), kurtosis (peakedness of the distribution)
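These four statistics can be computed from central moments. The sketch uses population (divide-by-n) moments and non-excess kurtosis, both of which are assumptions since the slides do not say which variants were used.

```python
import math

def first_order_features(series):
    """Mean, standard deviation, skewness and kurtosis of a series.

    Uses population moments; kurtosis is the raw fourth standardized
    moment (3.0 for a normal distribution). A constant series would
    divide by zero and needs separate handling.
    """
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series) / n
    std = math.sqrt(var)
    skew = sum((x - mean) ** 3 for x in series) / n / std ** 3
    kurt = sum((x - mean) ** 4 for x in series) / n / std ** 4
    return {"mean": mean, "std": std, "skewness": skew, "kurtosis": kurt}

stats = first_order_features([1, 2, 3, 4, 5])
```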

Second Order Features: related to co-occurrence of patterns

First quantize a time series into Q discrete bins, then construct a pattern matrix.

Features: energy (E), entropy (S), correlation (COR), inertia (F), local homogeneity (LH)
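The slides do not define the pattern matrix. The sketch below assumes it is a co-occurrence matrix of successive quantized values, and computes only energy and entropy using their usual texture-analysis definitions; correlation, inertia, and local homogeneity are left out.

```python
import math

def cooccurrence_features(series, q=4):
    """Quantize into q equal-width bins and summarize successive-value patterns.

    Returns (pattern_matrix, energy, entropy), where pattern_matrix[a][b]
    is the relative frequency of level a being followed by level b.
    """
    lo, hi = min(series), max(series)
    span = (hi - lo) or 1.0                      # guard against a flat series
    levels = [min(int((x - lo) / span * q), q - 1) for x in series]
    counts = [[0] * q for _ in range(q)]
    for a, b in zip(levels, levels[1:]):
        counts[a][b] += 1
    total = len(levels) - 1
    p = [[c / total for c in row] for row in counts]
    energy = sum(v * v for row in p for v in row)
    entropy = -sum(v * math.log(v) for row in p for v in row if v > 0)
    return p, energy, entropy

# Hypothetical alternating signal quantized into two levels.
matrix, energy, entropy = cooccurrence_features([0, 1, 0, 1, 0, 1, 0, 1, 0], q=2)
```

For the alternating signal only two transitions ever occur, each half of the time, so energy is 0.5 and entropy is ln 2.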

Cross-sign features: link multiple vital signs together

Correlation: the degree of departure of two signals from independence

Coherence: amplitude and phase about the frequencies held in common between two signals
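As one concrete cross-sign feature, the sketch below computes the Pearson correlation between two vital-sign series. Treating "correlation" as the Pearson coefficient is an assumption; it captures linear dependence only. Coherence is not shown, and the sample values are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (norm_a * norm_b)

heart_rate = [70, 75, 80, 90, 100]   # hypothetical readings
spo2 = [98, 97, 96, 94, 92]          # falls as heart rate rises
r = pearson(heart_rate, spo2)
```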


Forward Feature Selection

1. Start with an empty feature set.

2. Evaluate each of the remaining features against the current feature set.

3. Pick one feature to add into the set.

4. Repeat steps 2 and 3; if there is no improvement, output the final feature set.
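The greedy loop can be sketched as follows. The scoring function is a stand-in: in practice it would be something like cross-validated AUC of a model trained on the candidate set, and the feature names and gain values here are hypothetical.

```python
def forward_select(features, score):
    """Greedy forward selection.

    features: candidate feature names.
    score: callable taking a list of features and returning a quality
           measure where higher is better.
    """
    selected = []
    best = score(selected)
    remaining = list(features)
    while remaining:
        # Evaluate each remaining feature added to the current set.
        scored = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(scored, key=lambda pair: pair[0])
        if top_score <= best:            # no improvement: stop
            break
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected, best

# Toy score: each feature adds a fixed gain, minus a small cost per feature.
gains = {"hr_dfa": 0.10, "spo2_dfa": 0.08, "hr_mean": 0.0}

def toy_score(subset):
    return 0.5 + sum(gains[f] for f in subset) - 0.01 * len(subset)

chosen, final_score = forward_select(list(gains), toy_score)
```

The loop stops as soon as the best remaining candidate fails to raise the score, so uninformative features such as `hr_mean` in the toy example are never added.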

Experimental Setup

Dataset: MIMIC-II (Multiparameter Intelligent Monitoring in Intensive Care II): A public-access ICU database

The data model can be used for both GHW patients with sensors and ICU patients

Our data: between 2001 and 2008 from a variety of ICUs (medical, surgical, coronary care, and neonatal)

Prediction goal: death or survival

Real-time vital signs: heart rate and oxygen saturation rate

Class imbalance: most patients survived

Evaluation: Based on a 10-fold cross validation


Result – Linear and Nonlinear Classification

LSVM: Linear SVM

LR: Logistic Regression


1: DFA of Heart Rate

2: DFA of Oxygen Saturation

Result – Feature Selection

LR is our first choice: better AUC, interpretability, efficiency

Result – Our Final Model

Method 1: Logistic Regression + all features

Method 2: Logistic Regression + all features + exploratory undersampling

Method 3: Logistic Regression + feature selection + exploratory undersampling

Current Work: Density-based LR

Standard logistic regression φk(x) = xk:

P(y=1|x) = 1/(1 + exp( - ∑ wk xk))

Probability of an event (e.g., ICU, death) grows or decreases monotonically with each feature

Not true in many cases: e.g., ICU transfer rate vs. age

Idea: transform each feature xk


Use a kernel-density estimator to estimate p(xk, y=1) and p(xk, y=0) for each feature xk

This results in a nonlinear separating surface that conforms to the true distribution of the data.
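The slides do not give the exact transform. One plausible form, assumed here, replaces each feature value with the log ratio of the two estimated densities, so that a plain logistic regression on the transformed feature can follow a non-monotonic risk pattern. The sketch estimates class-conditional densities with a Gaussian kernel; these differ from the joint densities named above only by the class priors, which contribute a constant offset. The bandwidth and data are hypothetical.

```python
import math

def gaussian_kde(points, bandwidth):
    """Return a function estimating the density of `points` at x."""
    norm = len(points) * bandwidth * math.sqrt(2 * math.pi)
    def density(x):
        return sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2)
                   for p in points) / norm
    return density

def density_transform(values, labels, bandwidth=5.0, eps=1e-12):
    """Build phi(x) = log(density of x among positives / among negatives).

    eps avoids log(0) far from all training points (an assumed safeguard).
    """
    pos = gaussian_kde([v for v, y in zip(values, labels) if y == 1], bandwidth)
    neg = gaussian_kde([v for v, y in zip(values, labels) if y == 0], bandwidth)
    return lambda x: math.log((pos(x) + eps) / (neg(x) + eps))

# Hypothetical non-monotonic pattern: events at both young and old ages.
ages = [20, 22, 25, 80, 82, 85, 45, 48, 50, 52, 55, 58]
event = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
phi = density_transform(ages, event)
```

Here `phi` is positive at both ends of the age range and negative in the middle, a shape that no single weight on the raw age could produce.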

Advantages over KLR, SVM

Efficiency, interpretability

Example of Density-based LR



Original LR

Density-based LR

Future Work

Distance-based classification algorithms for multi-dimensional time-series

Dynamic time warping, information distance

Combination of feature-based and distance-based classification algorithms

Include distance information in the objective function

Combining Tier-1 and Tier-2 data

Multi-kernel methods

Interpretation of alerts

Based on the magnitude and sign of model coefficients


(Assuming feature


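Why Bagging Works?

Let each L be the bucket sample that is independently drawn from the distribution P. φ(x, L) is the predictor.

The aggregated predictor is: φA(x) = E_L[φ(x, L)]

The average prediction error in φ(x, L) is: e = E_L E_{x,y}[(y − φ(x, L))²]

The error in the aggregated predictor is: eA = E_{x,y}[(y − φA(x))²]

Using the inequality (E[Z])² ≤ E[Z²] gives us eA ≤ e.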

Algorithm details – Biased Bucket bagging (BBB)

Standard deviation

A critical factor in deciding how much bagging will improve accuracy is the variance of these bootstrap models. We see that BBB with 4 buckets has the largest difference between and . Besides this, BBB with 4 buckets also has the highest standard deviation in its prediction results. So we choose BBB with 4 buckets as the final method.

Result on Real-Time System

We can see that all cases attain their best performance when the EMA parameter is around 0.06, showing that the choice of this parameter is robust. This small optimal value shows that historical records play an important role in prediction.

Cross validation for the EMA parameter
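A minimal sketch of exponential moving average smoothing, assuming the common form s_t = α·x_t + (1 − α)·s_{t−1}. The slides do not state which form or symbol was used, so both the formula and the name `alpha` are assumptions. Under this form a small α keeps most of the weight on earlier values.

```python
def ema(values, alpha=0.06):
    """Exponential moving average: s_t = alpha * x_t + (1 - alpha) * s_{t-1}.

    The first smoothed value equals the first observation.
    """
    smoothed = []
    state = values[0]
    for x in values:
        state = alpha * x + (1 - alpha) * state
        smoothed.append(state)
    return smoothed

# A sudden jump from 0 to 100 moves the smoothed value by only 6% per step.
trace = ema([0, 100, 100, 100], alpha=0.06)
```

With α near 0.06, a new reading shifts the running score only slightly, so the output is dominated by the patient's accumulated history.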