Discovery of Temporal Patterns in Course-of-Disease Medical Data

Jorge C. G. Ramirez

Ph.D. Candidate

Lynn L. Peterson and Diane J. Cook

Supervising Professors

Overview
  • Objective
  • Contributions
  • Approach
  • TEMPADIS
  • Summary and Conclusions
Objective
  • Discover patterns that represent groups of patients that had a similar course of disease for a catastrophic or chronic illness
  • Motivation
    • Medical
    • AI
Contributions
  • Data Preprocessing
    • Normalization
    • Learning Missing Data
    • Learning Implicit Knowledge
  • Exploratory Analysis
    • Event Set Sequence Approach
Contributions
  • Domain Understanding
    • New perspective on mass of data
    • Identify groups of patients for further medical study
Approach
  • Example Events
    • Laboratory Results
      • 461 L WBC 2.70
      • 461 L HCT 40.10
      • 461 L PLT 239.00
      • 461 L CD4% 19.00
      • 461 L CD4A 188.00
Approach
  • Example Events
    • Visits
      • 468 C CV
    • Diagnoses
      • 468 D 043.9 AIDS-RELATED COMPLEX, UNSPECIFIED
    • Pharmacy
      • 469 P CTM 60 CO-TRIMOXAZOLE DS
      • 469 P AZT 200 ZIDOVUDINE 100MG
Approach
  • Event Set Sequences
    • Events
      • Value Event: laboratory test result, visit
      • Duration Event: pharmacy, diagnosis
    • Event Set is all Events that occur in a window of time
    • Event Set Sequence is all Event Sets that occur over a long period of time

Approach

  • Example Event Set
    • 461 L WBC 2.70
    • 461 L HCT 40.10
    • 461 L PLT 239.00
    • 461 L CD4% 19.00
    • 461 L CD4A 188.00
    • 468 C CV
    • 468 D 043.9 AIDS-RELATED COMPLEX, UNSPECIFIED
    • 469 P CTM 60 CO-TRIMOXAZOLE DS
    • 469 P AZT 200 ZIDOVUDINE 100MG
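The grouping of events into Event Sets can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the leading number of each event is a day count, that the second field is an event-type code, and that a set collects every event falling within a fixed number of days of the set's first event. The slides do not state the window size, so `window_days` is a hypothetical parameter, as are the names `Event`, `parse_event`, and `build_event_sets`.

```python
from collections import namedtuple

# Hypothetical record type: day number, event-type code, remaining raw text.
Event = namedtuple("Event", ["day", "kind", "detail"])

def parse_event(line):
    """Split 'day kind detail...' into an Event; detail is kept as raw text."""
    day, kind, detail = line.split(maxsplit=2)
    return Event(int(day), kind, detail)

def build_event_sets(events, window_days):
    """Group events by time window (assumed rule).

    A new Event Set starts when an event falls window_days or more
    after the first event of the current set.
    """
    event_sets = []
    for ev in sorted(events, key=lambda e: e.day):
        if event_sets and ev.day - event_sets[-1][0].day < window_days:
            event_sets[-1].append(ev)
        else:
            event_sets.append([ev])
    return event_sets

lines = [
    "461 L WBC 2.70",
    "461 L HCT 40.10",
    "468 C CV",
    "468 D 043.9 AIDS-RELATED COMPLEX, UNSPECIFIED",
    "469 P AZT 200 ZIDOVUDINE 100MG",
]
events = [parse_event(line) for line in lines]
sets_14 = build_event_sets(events, window_days=14)  # one Event Set
sets_7 = build_event_sets(events, window_days=7)    # two Event Sets
```

With a 14-day window all five events land in a single Event Set, as in the example slide; a 7-day window would split the laboratory results from the visit, diagnosis, and pharmacy events. The ordered list returned by `build_event_sets` plays the role of an Event Set Sequence.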
Approach
  • Normalization
    • Normal for each patient is different
    • Especially when affected by a catastrophic or chronic illness
    • Example: CD4A
      • General Population Normal: 416 - 1751
      • Well HIV-positive patient: 200 - 350
      • Severely immune-compromised patient: 0 - 50
Approach
  • Normalization (continued)
    • Scale to -4…0…+4
      • 0 is normal
      • Each number represents a deviation from normal
      • 1 and 2 are noticeable but not severe
      • 3 is severe
      • 4 is very severe
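A patient-relative scale of this kind might be computed as in the sketch below. The slides give the target range (-4 to +4, with 0 as normal) but not the mapping itself, so the rule used here is an assumption for illustration only: a value inside the patient's own normal range maps to 0, and each further half-width of that range adds one step of deviation, capped at 4. The function name `normalize` and its parameters are hypothetical.

```python
def normalize(value, normal_low, normal_high):
    """Map a lab value to an integer in -4..+4 relative to a patient-specific range.

    Assumed rule (not from the slides): 0 inside the range; outside it,
    one step per half range-width of deviation, capped at 4.
    """
    if normal_high <= normal_low:
        raise ValueError("normal_high must be greater than normal_low")
    if normal_low <= value <= normal_high:
        return 0
    step = (normal_high - normal_low) / 2.0
    if value < normal_low:
        return -min(int((normal_low - value) / step) + 1, 4)
    return min(int((value - normal_high) / step) + 1, 4)

# The same CD4A count reads differently against different patients' normals.
well_patient = normalize(30, 200, 350)       # -3: severe drop for a well HIV-positive patient
compromised_patient = normalize(30, 0, 50)   # 0: normal for a severely immune-compromised patient
```

The point of the example is the one made on the previous slide: the same absolute value can be unremarkable for one patient and a severe deviation for another, so the scale has to be anchored to each patient's own normal.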
Approach
  • Replace Missing Data
    • Diagnosis data very incomplete
    • Learn severity of condition from pharmacy data
    • Induce decision tree to classify conditions
Approach
  • Create Health Status Categories
    • 1 = HIV-positive asymptomatic
    • 2 = Asymptomatic, on anti-HIV therapy
    • 3 = Immune-compromised, on prophylactic therapy
    • 4 = Active illness
    • 5 = Severe active illness
Approach
  • Learn Implicit Knowledge
    • Need to augment explicit knowledge
    • Recovery time is expert’s implicit knowledge
    • Use neural network to learn recovery time function
      • 0 = Nothing to recover from
      • 1-4 = weeks to recover
      • 5 = 5 or more weeks to recover
Approach
  • Categorize Pharmacy Data
    • A myriad of drugs prescribed
    • Need to understand significance
    • Categorize by use
Approach
  • Categories
    • Nucleoside Analogs
    • Protease Inhibitors
    • Prophylaxis Therapies
    • Intravenous antibiotics
    • Anti-virals
    • Anti-PCP/Toxoplasmosis
    • Anti-mycobacterials
Approach
  • Categories (continued)
    • Anti-wasting syndrome
    • Anti-fungals
    • Chemotherapies
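The event sets shown later in the deck carry a ten-character `onD` string, which matches the ten categories listed here. The sketch below assumes one flag per category, in the order the categories appear on these two slides; that ordering is an inference, not something the slides state. The drug-to-category lookup is hypothetical and covers only the two drug codes that appear in the example events.

```python
# Category order as listed on the slides (assumed to match onD flag positions).
CATEGORIES = [
    "Nucleoside Analogs",
    "Protease Inhibitors",
    "Prophylaxis Therapies",
    "Intravenous antibiotics",
    "Anti-virals",
    "Anti-PCP/Toxoplasmosis",
    "Anti-mycobacterials",
    "Anti-wasting syndrome",
    "Anti-fungals",
    "Chemotherapies",
]

# Hypothetical lookup for illustration; the real categorization is not given in the slides.
DRUG_CATEGORY = {
    "AZT": "Nucleoside Analogs",     # zidovudine
    "CTM": "Prophylaxis Therapies",  # co-trimoxazole, assumed to be prescribed as prophylaxis
}

def on_drug_flags(drug_codes):
    """Return a 10-character string with '1' for each category the patient is on."""
    flags = ["0"] * len(CATEGORIES)
    for code in drug_codes:
        category = DRUG_CATEGORY.get(code)
        if category is not None:  # unknown codes are ignored in this sketch
            flags[CATEGORIES.index(category)] = "1"
    return "".join(flags)

flags = on_drug_flags(["CTM", "AZT"])
```

Collapsing individual prescriptions into category flags is what lets two patients on different but clinically equivalent drugs produce the same pattern element.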
Approach
  • Result: Understandable representation of patient data
  • 861 C 1.1 26.1 167 0.0 0 16 0
  • 862 0.0 0.0 0 0.0 0 0 2 24: 30 38: 50
  • 867 H 4.3 19.2 144 0.0 0 11 3 0: 3 22: 1 35: 2
  • 868 H 2.2 26.2 144 0.0 0 5 3 0: 3 22: 1 35: 2
  • 869 0.0 0.0 0 0.0 0 0 1 35: 60
  • 874 C 1.3 32.4 0 0.0 0 17 0
  • 889 C 1.1 30.4 154 0.0 0 36 0
  • 890 0.0 0.0 0 0.0 0 0 3 22: 30 38: 50 39:480
  • 923 0.0 0.0 0 0.0 0 0 1 39:480
  • 933 H 3.6 20.4 182 0.0 0 11 3 0: 2 22: 1 39: 12
Approach
  • Result: Understandable representation of patient data
  • 861 C 3 1 -4 -3 0 -9 -9 -1 0 0 2 0 0 0 0 0 0 0
  • 867 H 4 4 0 -4 -1 -9 -9 -2 0 0 2 0 0 0 1 1 0 0
  • 868 H 4 1 -2 -3 -1 -9 -9 -4 0 0 2 0 0 0 1 1 0 0
  • 874 C 4 3 -4 -1 -9 -9 -9 0 0 0 2 0 0 0 1 1 0 0
  • 889 C 4 2 -4 -2 -1 -9 -9 2 0 0 2 0 0 0 1 1 0 0
  • 933 H 4 4 0 -4 0 -9 -9 -2 0 0 1 0 0 0 0 2 0 0
Approach
  • Result: Understandable representation of patient data
  • < { (EV C)(HS 3)(RT 1)(WBC -4)(HCT -3)(PLT 0)(LMPH -1)(onD 0010000000) }
  • { (EV H)(HS 4)(RT 4)(WBC 0)(HCT -4)(PLT -1)(LMPH -2)(onD 0010001100) }
  • { (EV H)(HS 4)(RT 1)(WBC -2)(HCT -3)(PLT -1)(LMPH -4)(onD 0010001100) }
  • { (EV C)(HS 4)(RT 3)(WBC -4)(HCT -1)(LMPH 0)(onD 0010001100) }
  • { (EV C)(HS 4)(RT 2)(WBC -4)(HCT -2)(PLT -1)(LMPH 2)(onD 0010001100) }
  • { (EV H)(HS 4)(RT 4)(WBC 0)(HCT -4)(PLT 0)(LMPH -2)(onD 0010000100) } >
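For readers who want to work with this notation, the sketch below reads a sequence written as `< { (NAME value)... } ... >` into a list of dictionaries. It is a convenience parser written for this transcript, not part of TEMPADIS; the function names are hypothetical. It treats `onD` as a string so its leading zeros survive, converts other numeric values to integers, and skips any pair whose value is blank.

```python
import re

# One "(NAME value)" pair; the value is any run of non-space, non-")" characters.
PAIR = re.compile(r"\(\s*(\w+)\s+([^)\s]+)\s*\)")

def parse_event_set(text):
    """Return {feature: value} for one '{ (NAME value)... }' event set."""
    features = {}
    # Slide text sometimes uses an en dash where a minus sign is meant.
    for name, raw in PAIR.findall(text.replace("\u2013", "-")):
        if name == "onD":
            features[name] = raw  # keep the flag string, including leading zeros
            continue
        try:
            features[name] = int(raw)
        except ValueError:
            features[name] = raw  # e.g. the event-type code in EV
    return features

def parse_sequence(text):
    """Return a list of event-set dicts for a '< { ... } { ... } >' sequence."""
    return [parse_event_set(body) for body in re.findall(r"\{([^}]*)\}", text)]

sequence = parse_sequence(
    "< { (EV C)(HS 3)(RT 1)(WBC -4)(HCT -3)(PLT 0)(LMPH -1)(onD 0010000000) } "
    "{ (EV H)(HS 4)(RT 4)(WBC 0)(HCT -4)(PLT -1)(LMPH -2)(onD 0010001100) } >"
)
```

Features absent from an event set are simply missing from its dictionary, which mirrors how the slides omit unmeasured values rather than printing a placeholder.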
Approach
  • Inexact Match
    • Use set difference
      • Partial match, feature by feature
      • Assumes default partial match for missing data
    • Use weakest-link/average-link
      • Require minimum degree of match
      • Require average degree of match
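The matching criteria above can be illustrated with the sketch below. It is one possible reading, not the authors' algorithm: the slides do not give the per-feature scoring, the default score for missing data, or the thresholds, so `DEFAULT_PARTIAL`, the `scale` of 8 (the width of the -4 to +4 range), and the threshold arguments are all assumptions. The sketch also compares sequences position by position and requires equal length, which is a simplification.

```python
DEFAULT_PARTIAL = 0.5  # assumed score when a feature is missing on either side

def feature_match(a, b, scale=8.0):
    """Degree of match in 0..1 for one feature; None means missing."""
    if a is None or b is None:
        return DEFAULT_PARTIAL
    if isinstance(a, int) and isinstance(b, int):
        return max(0.0, 1.0 - abs(a - b) / scale)  # closer values match better
    return 1.0 if a == b else 0.0                  # codes must match exactly

def event_set_match(x, y):
    """Average feature-by-feature match over every feature seen in either set."""
    names = set(x) | set(y)
    scores = [feature_match(x.get(n), y.get(n)) for n in names]
    return sum(scores) / len(scores)

def sequences_match(seq_a, seq_b, min_link, avg_link):
    """Weakest-link and average-link test over aligned event sets."""
    if not seq_a or len(seq_a) != len(seq_b):
        return False
    degrees = [event_set_match(a, b) for a, b in zip(seq_a, seq_b)]
    weakest = min(degrees)
    average = sum(degrees) / len(degrees)
    return weakest >= min_link and average >= avg_link

a = {"EV": "C", "HS": 3, "WBC": -4}
b = {"EV": "C", "HS": 4, "WBC": -2, "PLT": 0}
degree = event_set_match(a, b)  # 0.78125 under the assumptions above
accepted = sequences_match([a, a], [a, b], min_link=0.75, avg_link=0.85)
rejected = sequences_match([a, a], [a, b], min_link=0.80, avg_link=0.85)
```

The weakest-link threshold keeps a single badly mismatched event set from hiding inside an otherwise good sequence, while the average-link threshold controls the overall quality of the match.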
TEMPADIS
  • Diagram components
    • Raw Target Data
    • Data Cleaning
    • Data Normalization
    • Normalized Database
TEMPADIS
  • Diagram components
    • Decision Tree
    • Normalized Database
    • Reduced, Knowledge-Added Data
    • Neural Net
TEMPADIS
  • Diagram components
    • Knowledge-Added Database
    • Sequence Builder
    • Temporal Patterns

Results

  • Validation
    • Results are temporal patterns showing that groups of patients had similar experiences during the course of disease
    • Only medical experts can assess validity of discovered patterns
    • These results have been validated by the experts in the HIV Clinical Research Group
Results
  • Given a database of patients followed for 4 to 9 years
    • Discovered interesting patterns
    • Interestingness has multiple dimensions
      • Length
      • Data that appears in the patterns
      • Data that does not appear in the patterns
Results
  • Advanced patients, subject to various OIs
  • < { (EV C)(HS 3)(RT 0)(WBC 0)(HCT -1)(PLT 0)(LMPH -3)(onD 0000000000) }
  • { (EV E)(HS 3)(RT 2)(WBC 3)(HCT -1)(PLT 1)(LMPH 4)(onD 0000000000) }
  • { (EV C)(HS 3)(RT 0)(WBC 1)(HCT 0)(PLT 0)(CD4P -3)(CD4A -1)(LMPH 0)(onD 1010000000) }
  • { (EV C)(HS 3)(RT 1)(WBC -1)(HCT -1)(PLT 1)(LMPH 2)(onD 1010000000) }
  • { (EV E)(HS 3)(RT 1)(WBC 2)(HCT -1)(PLT 1)(LMPH 4)(onD 0000000000) }
  • { (EV C)(HS 3)(RT 1)(WBC 1)(HCT 0)(PLT 0)(CD4P -3)(CD4A -2)(LMPH 0)(onD 1010000000) } >

Advanced patients, fairly stable

  • < { (EV C)(HS 3)(RT 0)(WBC -1)(HCT -1)(PLT 1)(CD4P -4)(CD4A -4)(LMPH 0)(onD 0010000000) }
  • { (EV C)(HS 3)(RT 0)(WBC 0)(HCT 0)(PLT -1)(CD4P -4)(CD4A -4)(LMPH 0)(onD 1010000000) }
  • { (EV C)(HS 3)(RT 0)(onD 1010000000) }
  • { (EV C)(HS 3)(RT 0)(WBC -2)(HCT 0)(PLT -1)(CD4P -4)(CD4A -4)(LMPH 0)(onD 0010000000) }
  • { (EV C)(HS 4)(RT 1)(WBC 1)(HCT -4)(PLT 0)(CD4P -4)(CD4A -4)(LMPH -4)(onD 0011001000) }
  • { (EV C)(HS 3)(RT 3)(onD 0010000000) }
  • { (EV )(HS 3)(RT 1)(WBC 0)(HCT 0)(PLT 0)(LMPH 0)(onD 0000000000) }
  • { (EV C)(HS 3)(RT 0)(CD4A -4)(onD 0010000000) } >

Asymptomatic period

  • < { (EV C)(HS 1)(RT 0)(onD 0000000000) }
  • { (EV C)(HS 1)(RT 0)(onD 0000000000) }
  • { (EV C)(HS 1)(RT 0)(onD 0000000000) }
  • { (EV C)(HS 1)(RT 0)(onD 0000000000) }
  • { (EV C)(HS 1)(RT 0)(onD 0000000000) }
  • { (EV C)(HS 1)(RT 0)(onD 0000000000) }
  • { (EV C)(HS 1)(RT 0)(onD 0000000000) }
  • { (EV C)(HS 1)(RT 0)(onD 0000000000) }
  • { (EV C)(HS 1)(RT 0)(onD 0000000000) }
  • { (EV C)(HS 1)(RT 1)(onD 0000000000) }
  • { (EV C)(HS 1)(RT 0)(onD 0000000000) }
  • { (EV E)(HS 1)(RT 0)(WBC -1)(HCT 0)(PLT 1)(CD4P -1)(CD4A -2)(LMPH 0)(onD 0000000000) }
  • { (EV C)(HS 1)(RT 0)(onD 0000000000) }
  • { (EV C)(HS 1)(RT 0)(onD 0000000000) }
  • { (EV C)(HS 1)(RT 0)(CD4A 0)(onD 0010000000) }
  • { (EV C)(HS 1)(RT 0)(CD4A 0)(onD 0010000000) }
  • { (EV E)(HS 1)(RT 0)(WBC 1)(HCT 0)(PLT 0)(CD4P 0)(CD4A 0)(LMPH 0)(onD 0000000000) }
  • { (EV C)(HS 1)(RT 0)(onD 0000000000) }
  • { (EV C)(HS 1)(RT 0)(onD 0000000000) }
  • { (EV C)(HS 1)(RT 0)(onD 0000000000) } >
Summary
  • Nine Steps of KDD
    • Identify goal
    • Identify target data set
    • Data cleaning and preprocessing
    • Data reduction and projection
    • Identify data mining method
Summary
  • Nine Steps of KDD
    • Exploratory Analysis
    • Data Mining
    • Interpretation of Mined Patterns
    • Acting on Discovered Knowledge
Conclusions
  • Objective Met with Contributions
    • Patterns discovered representing groups of patients with similar experience in course of disease
    • This perspective on the data has not previously been produced
    • This kind of computation on this kind of data has not previously been performed
Future Work
  • Improve discovery algorithm
    • Backtracking is a barrier to overcome
  • Improve search control
  • Develop heuristic for measuring interestingness
  • Add ability to identify clinically identical/similar patterns
Future Work
  • Move database to new Intelligent Systems in Medicine and Biology Lab
  • Bring database up to date
  • Include more domain data in Event Sets
  • Explore impact of new developments in HIV treatment