Discovery of Temporal Patterns in Course-of-Disease Medical Data

Jorge C. G. Ramirez

Ph.D. Candidate

Lynn L. Peterson and Diane J. Cook

Supervising Professors


Overview

  • Objective

  • Contributions

  • Approach

  • TEMPADIS

  • Summary and Conclusions


Objective

  • Discover patterns that represent groups of patients that had a similar course of disease for a catastrophic or chronic illness

  • Motivation

    • Medical

    • AI


Contributions

  • Data Preprocessing

    • Normalization

    • Learning Missing Data

    • Learning Implicit Knowledge

  • Exploratory Analysis

    • Event Set Sequence Approach


Contributions

  • Domain Understanding

    • New perspective on mass of data

    • Identify groups of patients for further medical study


Approach

  • Example Events

    • Laboratory Results

      • 461 L WBC 2.70

      • 461 L HCT 40.10

      • 461 L PLT 239.00

      • 461 L CD4% 19.00

      • 461 L CD4A 188.00


Approach

  • Example Events

    • Visits

      • 468 C CV

    • Diagnoses

      • 468 D 043.9 AIDS-RELATED COMPLEX, UNSPECIFIED

    • Pharmacy

      • 469 P CTM 60 CO-TRIMOXAZOLE DS

      • 469 P AZT 200 ZIDOVUDINE 100MG


Approach

  • Event Set Sequences

    • Events

      • Value Event: laboratory test result, visit

      • Duration Event: pharmacy, diagnosis

    • Event Set is all Events that occur in a window of time

    • Event Set Sequence is all Event Sets that occur over a long period of time


Approach

  • Example Event Set

  • 461 L WBC 2.70

  • 461 L HCT 40.10

  • 461 L PLT 239.00

  • 461 L CD4% 19.00

  • 461 L CD4A 188.00

  • 468 C CV

  • 468 D 043.9 AIDS-RELATED COMPLEX, UNSPECIFIED

  • 469 P CTM 60 CO-TRIMOXAZOLE DS

  • 469 P AZT 200 ZIDOVUDINE 100MG
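The grouping of Events into Event Sets, and of Event Sets into an Event Set Sequence, can be sketched as follows. This is a minimal illustration rather than the TEMPADIS implementation: the slides do not state the window size or how windows are anchored, so `window_days`, the `Event` class, and the reading of the leading number as a day offset are all assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    day: int      # leading number in each record, assumed to be a day offset
    kind: str     # L = laboratory, C = visit, D = diagnosis, P = pharmacy
    detail: str   # remainder of the record, kept as free text

def build_event_sets(events, window_days):
    """Group events into Event Sets: a new set starts whenever an event
    falls window_days or more after the first event of the current set.
    The returned list of sets is the Event Set Sequence."""
    event_sets = []
    current, start = [], None
    for ev in sorted(events, key=lambda e: e.day):
        if start is None or ev.day - start >= window_days:
            if current:
                event_sets.append(current)
            current, start = [], ev.day
        current.append(ev)
    if current:
        event_sets.append(current)
    return event_sets

events = [
    Event(461, "L", "WBC 2.70"),
    Event(461, "L", "CD4A 188.00"),
    Event(468, "C", "CV"),
    Event(468, "D", "043.9 AIDS-RELATED COMPLEX, UNSPECIFIED"),
    Event(469, "P", "AZT 200 ZIDOVUDINE 100MG"),
    Event(500, "C", "CV"),   # hypothetical later visit, not from the slides
]
sequence = build_event_sets(events, window_days=14)
```

With a 14-day window, the records from days 461 to 469 fall into one Event Set and the hypothetical day-500 visit starts a second one.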


Approach

  • Normalization

    • Normal for each patient is different

    • Especially when affected by a catastrophic or chronic illness

    • Example: CD4A

      • General Population Normal: 416 - 1751

      • Well HIV-positive patient: 200 - 350

      • Severely immune-compromised patient: 0 - 50


Approach

  • Normalization (continued)

    • Scale to -4…0…+4

      • 0 is normal

      • Each number represents a deviation from normal

      • 1 and 2 are noticeable but not severe

      • 3 is severe

      • 4 is very severe
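The slides do not give the function that maps a raw value to the -4 to +4 scale, so the following is only a sketch of one way such a per-patient normalization could work. The `normalize` function, its `step` parameter (raw units per scale point), and the linear banding are assumptions made for illustration.

```python
import math

def normalize(value, normal_low, normal_high, step):
    """Map a raw lab value onto the -4..+4 deviation scale.

    normal_low, normal_high: the patient's own normal range (assumed known).
    step: raw units per scale point (hypothetical parameter).
    Returns 0 inside the normal range, otherwise the signed number of
    steps outside it, clamped to 4.
    """
    if normal_low <= value <= normal_high:
        return 0
    if value < normal_low:
        points = math.ceil((normal_low - value) / step)
        return -min(4, points)
    points = math.ceil((value - normal_high) / step)
    return min(4, points)

# The CD4A reading of 188 from the example events, judged against two baselines:
vs_well_hiv_positive = normalize(188, 200, 350, step=50)     # -1
vs_general_population = normalize(188, 416, 1751, step=50)   # -4
```

The same reading is a mild deviation for a well HIV-positive patient and a very severe one against the general-population range, which is the reason the slides give for normalizing per patient.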


Approach

  • Replace Missing Data

    • Diagnosis data very incomplete

    • Learn severity of condition from pharmacy data

    • Induce decision tree to classify conditions


Approach

  • Create Health Status Categories

    • = HIV-positive asymptomatic

    • = Asymptomatic, on anti-HIV therapy

    • = Immune-compromised, on prophylactic therapy

    • = Active illness

    • = Severe active illness


Approach

  • Learn Implicit Knowledge

    • Need to augment explicit knowledge

    • Recovery time is expert’s implicit knowledge

    • Use neural network to learn recovery time function

      • 0 = Nothing to recover from

      • 1-4 = weeks to recover

      • 5 = 5 or more weeks to recover
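The six recovery-time values above would be the target labels for the learned function. As a small sketch, a hypothetical helper `recovery_class` could encode an expert's estimate in weeks into those labels; rounding partial weeks up is an assumption, and the network itself is not shown.

```python
import math

def recovery_class(weeks):
    """Encode a recovery-time estimate (in weeks) as a label 0..5.

    0   = nothing to recover from
    1-4 = weeks to recover (partial weeks rounded up, an assumption)
    5   = 5 or more weeks to recover
    """
    if weeks <= 0:
        return 0
    return min(5, math.ceil(weeks))
```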


Approach

  • Categorize Pharmacy Data

    • A myriad of drugs prescribed

    • Need to understand significance

    • Categorize by use


Approach

  • Categories

    • Nucleoside Analogs

    • Protease Inhibitors

    • Prophylaxis Therapies

    • Intravenous antibiotics

    • Anti-virals

    • Anti-PCP/Toxoplasmosis

    • Anti-mycobacterials


Approach

  • Categories (continued)

    • Anti-wasting syndrome

    • Anti-fungals

    • Chemotherapies


Approach

  • Result: Understandable representation of patient data

  • 861 C 1.1 26.1 167 0.0 0 16 0

  • 862 0.0 0.0 0 0.0 0 0 2 24: 30 38: 50

  • 867 H 4.3 19.2 144 0.0 0 11 3 0: 3 22: 1 35: 2

  • 868 H 2.2 26.2 144 0.0 0 5 3 0: 3 22: 1 35: 2

  • 869 0.0 0.0 0 0.0 0 0 1 35: 60

  • 874 C 1.3 32.4 0 0.0 0 17 0

  • 889 C 1.1 30.4 154 0.0 0 36 0

  • 890 0.0 0.0 0 0.0 0 0 3 22: 30 38: 50 39:480

  • 923 0.0 0.0 0 0.0 0 0 1 39:480

  • 933 H 3.6 20.4 182 0.0 0 11 3 0: 2 22: 1 39: 12


Approach

  • Result: Understandable representation of patient data

  • 861 C 3 1 -4 -3 0 -9 -9 -1 0 0 2 0 0 0 0 0 0 0

  • 867 H 4 4 0 -4 -1 -9 -9 -2 0 0 2 0 0 0 1 1 0 0

  • 868 H 4 1 -2 -3 -1 -9 -9 -4 0 0 2 0 0 0 1 1 0 0

  • 874 C 4 3 -4 -1 -9 -9 -9 0 0 0 2 0 0 0 1 1 0 0

  • 889 C 4 2 -4 -2 -1 -9 -9 2 0 0 2 0 0 0 1 1 0 0

  • 933 H 4 4 0 -4 0 -9 -9 -2 0 0 1 0 0 0 0 2 0 0


Approach

  • Result: Understandable representation of patient data

  • < { (EV C)(HS 3)(RT 1)(WBC -4)(HCT -3)(PLT 0)(LMPH -1)(onD 0010000000) }

  • { (EV H)(HS 4)(RT 4)(WBC 0)(HCT -4)(PLT -1)(LMPH -2)(onD 0010001100) }

  • { (EV H)(HS 4)(RT 1)(WBC -2)(HCT -3)(PLT -1)(LMPH -4)(onD 0010001100) }

  • { (EV C)(HS 4)(RT 3)(WBC -4)(HCT -1)(onD 00010001100) }

  • { (EV C)(HS 4)(RT 2)(WBC -4)(HCT -2)(PLT -1)(LMPH 2)(onD 0010001100) }

  • { (EV H)(HS 4)(RT 4)(WBC 0)(HCT -4)(PLT 0)(LMPH -2)(onD 0010000100) } >
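To make the notation concrete, here is a small sketch that parses an Event Set Sequence written in this form into Python dictionaries. The parser is illustrative only. In particular, reading the ten `onD` digits as the ten pharmacy categories in the order the slides list them is an assumption; the slides do not state the bit order.

```python
import re

# Pharmacy categories in slide order. Using this as the onD bit order
# is an assumption, not something the slides confirm.
DRUG_CATEGORIES = [
    "Nucleoside Analogs", "Protease Inhibitors", "Prophylaxis Therapies",
    "Intravenous antibiotics", "Anti-virals", "Anti-PCP/Toxoplasmosis",
    "Anti-mycobacterials", "Anti-wasting syndrome", "Anti-fungals",
    "Chemotherapies",
]

FEATURE = re.compile(r"\((\w+)\s+([^()\s]*)\)")   # matches "(NAME value)"
EVENT_SET = re.compile(r"\{([^{}]*)\}")           # matches "{ ... }"

def parse_event_set(text):
    """Turn '(EV C)(HS 3)...' into a dict; numeric values become ints,
    while onD stays a bit string so leading zeros are preserved."""
    features = {}
    for name, value in FEATURE.findall(text):
        if name != "onD" and re.fullmatch(r"-?\d+", value):
            features[name] = int(value)
        else:
            features[name] = value
    return features

def parse_sequence(text):
    """Parse '< { ... } { ... } >' into a list of feature dicts."""
    return [parse_event_set(body) for body in EVENT_SET.findall(text)]

def decode_drugs(bits):
    """List the categories whose bit is '1' (under the assumed bit order)."""
    return [cat for bit, cat in zip(bits, DRUG_CATEGORIES) if bit == "1"]

seq = parse_sequence(
    "< { (EV C)(HS 3)(RT 1)(WBC -4)(HCT -3)(PLT 0)(LMPH -1)(onD 0010000000) } "
    "{ (EV H)(HS 4)(RT 4)(WBC 0)(HCT -4)(PLT -1)(LMPH -2)(onD 0010001100) } >"
)
```

Under the assumed bit order, `decode_drugs(seq[0]["onD"])` gives `["Prophylaxis Therapies"]`.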


Approach

  • Inexact Match

    • Use set difference

      • Partial match, feature by feature

      • Assumes default partial match for missing data

    • Use weakest-link/average-link

      • Require minimum degree of match

      • Require average degree of match
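One possible reading of the inexact-match bullets is sketched below: each Event Set pair is scored feature by feature, a missing feature receives a default partial score, and two sequences match only if both the weakest and the average Event Set score clear a threshold. The scoring formula, the `MISSING_DEFAULT` value, the `scale` divisor, and the choice to require both thresholds together are assumptions; the slides do not specify them.

```python
MISSING_DEFAULT = 0.5   # assumed partial-match score when a feature is missing

def feature_match(a, b, scale=8):
    """Degree of match in [0, 1] for one feature.
    Numeric features lose credit in proportion to their distance
    (scale=8 is the width of the -4..+4 range); others must be equal."""
    if a is None or b is None:
        return MISSING_DEFAULT
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return max(0.0, 1.0 - abs(a - b) / scale)
    return 1.0 if a == b else 0.0

def event_set_match(x, y):
    """Average feature-by-feature match over the union of feature names."""
    names = sorted(set(x) | set(y))
    if not names:
        return 1.0
    scores = [feature_match(x.get(n), y.get(n)) for n in names]
    return sum(scores) / len(scores)

def sequences_match(p, q, min_link, avg_link):
    """Weakest-link: every aligned pair scores at least min_link.
    Average-link: the mean pair score is at least avg_link."""
    if len(p) != len(q) or not p:
        return False
    scores = [event_set_match(x, y) for x, y in zip(p, q)]
    return min(scores) >= min_link and sum(scores) / len(scores) >= avg_link

a = {"HS": 3, "WBC": -4}
b = {"HS": 3, "WBC": 0}
c = {"HS": 3}            # WBC missing, so it receives the default score
```

Here `event_set_match(a, b)` and `event_set_match(c, b)` both come to 0.75: the first because WBC differs by 4 of 8 scale points, the second because the missing WBC gets the default partial score.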


TEMPADIS

  • Raw Target Data → Data Cleaning → Data Normalization → Normalized Database


TEMPADIS

  • Normalized Database → Decision Tree and Neural Net → Reduced, Knowledge-Added Data


TEMPADIS

  • Knowledge-Added Database → Sequence Builder → Temporal Patterns


Results

  • Validation

    • Results are temporal patterns showing that groups of patients had similar experiences during the course of disease

    • Only medical experts can assess validity of discovered patterns

    • These results have been validated by the experts in the HIV Clinical Research Group


Results

  • Given a database of patients followed for 4 to 9 years

    • Discovered interesting patterns

    • Interestingness has multiple dimensions

      • Length

      • Data that appears in the patterns

      • Data that does not appear in the patterns


Results

  • Advanced patients, subject to various opportunistic infections (OIs)

  • < { (EV C)(HS 3)(RT 0)(WBC 0)(HCT -1)(PLT 0)(LMPH -3)(onD 0000000000) }

  • { (EV E)(HS 3)(RT 2)(WBC 3)(HCT -1)(PLT 1)(LMPH 4)(onD 0000000000) }

  • { (EV C)(HS 3)(RT 0)(WBC 1)(HCT 0)(PLT 0)(CD4P -3)(CD4A -1)(LMPH 0)(onD 1010000000) }

  • { (EV C)(HS 3)(RT 1)(WBC -1)(HCT -1)(PLT 1)(LMPH 2)(onD 1010000000) }

  • { (EV E)(HS 3)(RT 1)(WBC 2)(HCT -1)(PLT 1)(LMPH 4)(onD 0000000000) }

  • { (EV C)(HS 3)(RT 1)(WBC 1)(HCT 0)(PLT 0)(CD4P -3)(CD4A -2)(LMPH 0)(onD 1010000000) } >


  • < { (EV C)(HS 3)(RT 0)(WBC -1)(HCT -1)(PLT 1)(CD4P -4)(CD4A -4)(LMPH 0)(onD 0010000000) }

  • { (EV C)(HS 3)(RT 0)(WBC 0)(HCT 0)(PLT -1)(CD4P -4)(CD4A -4)(LMPH 0)(onD 1010000000) }

  • { (EV C)(HS 3)(RT 0)(onD 1010000000) }

  • { (EV C)(HS 3)(RT 0)(WBC -2)(HCT 0)(PLT -1)(CD4P -4)(CD4A -4)(LMPH 0)(onD 0010000000) }

  • { (EV C)(HS 4)(RT 1)(WBC 1)(HCT -4)(PLT 0)(CD4P -4)(CD4A -4)(LMPH -4)(onD 0011001000) }

  • { (EV C)(HS 3)(RT 3)(onD 0010000000) }

  • { (EV )(HS 3)(RT 1)(WBC 0)(HCT 0)(PLT 0)(LMPH 0)(onD 0000000000) }

  • { (EV C)(HS 3)(RT 0)(CD4A -4)(onD 0010000000) } >


  • < { (EV C)(HS 1)(RT 0)(onD 0000000000) }

  • { (EV C)(HS 1)(RT 0)(onD 0000000000) }

  • { (EV C)(HS 1)(RT 0)(onD 0000000000) }

  • { (EV C)(HS 1)(RT 0)(onD 0000000000) }

  • { (EV C)(HS 1)(RT 0)(onD 0000000000) }

  • { (EV C)(HS 1)(RT 0)(onD 0000000000) }

  • { (EV C)(HS 1)(RT 0)(onD 0000000000) }

  • { (EV C)(HS 1)(RT 0)(onD 0000000000) }

  • { (EV C)(HS 1)(RT 0)(onD 0000000000) }

  • { (EV C)(HS 1)(RT 1)(onD 0000000000) }

  • { (EV C)(HS 1)(RT 0)(onD 0000000000) }

  • { (EV E)(HS 1)(RT 0)(WBC -1)(HCT 0)(PLT 1)(CD4P -1)(CD4A -2)(LMPH 0)(onD 0000000000) }

  • { (EV C)(HS 1)(RT 0)(onD 0000000000) }

  • { (EV C)(HS 1)(RT 0)(onD 0000000000) }

  • { (EV C)(HS 1)(RT 0)(CD4A 0)(onD 0010000000) }

  • { (EV C)(HS 1)(RT 0)(CD4A 0)(onD 0010000000) }

  • { (EV E)(HS 1)(RT 0)(WBC 1)(HCT 0)(PLT 0)(CD4P 0)(CD4A 0)(LMPH 0)(onD 0000000000) }

  • { (EV C)(HS 1)(RT 0)(onD 0000000000) }

  • { (EV C)(HS 1)(RT 0)(onD 0000000000) }

  • { (EV C)(HS 1)(RT 0)(onD 0000000000) } >


Summary

  • Nine Steps of KDD

    • Identify goal

    • Identify target data set

    • Data cleaning and preprocessing

    • Data reduction and projection

    • Identify data mining method


Summary

  • Nine Steps of KDD

    • Exploratory Analysis

    • Data Mining

    • Interpretation of Mined Patterns

    • Acting on Discovered Knowledge


Conclusions

  • Objective Met with Contributions

    • Patterns discovered representing groups of patients with similar experience in course of disease

    • This perspective on the data has not previously been produced

    • This kind of computation on this kind of data has not previously been produced


Future Work

  • Improve discovery algorithm

    • Backtracking is a barrier to overcome

  • Improve search control

  • Develop heuristic for measuring interestingness

  • Add ability to identify clinically identical/similar patterns


Future Work

  • Move database to new Intelligent Systems in Medicine and Biology Lab

  • Bring database up to date

  • Include more domain data in Event Sets

  • Explore impact of new developments in HIV treatment

