Hospitalization Prediction From Health Care Claims
Adithya Renduchintala, Benjamin Martin, & Lance Legel
University of Colorado Boulder
Data Mining, Spring 2012
Overview
• Why hospitalization?
• What will we do?
• What is our data?
• How will we evaluate?
• How will we research?
• How will we implement?
• When are our milestones?
WHY HOSPITALIZATION?
• 70 million Americans hospitalized / year
• 5 million / year preventable
• $30 billion / year
• Data mining can help!
WHAT WILL WE DO?
1. Analyze health care data on 76,000 people over a 3-year period with 2.6 million events
2. Correlate events and hospitalization outcomes to train prediction algorithms
3. Predict the number of days a person will be hospitalized next year, given new event data
WHAT IS OUR DATA?
• 2,600,000 event records for 76,000 unique members
Plus, for each member:
• Member sex and age group
• Number of drugs prescribed per member
• Number of laboratory and pathology tests per member
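The per-member counts listed here can be derived from the raw event records. The following is a minimal sketch of that aggregation, not the project's actual pipeline: the record layout (`(member_id, event_type)` pairs), the event type labels `"claim"`, `"drug"`, and `"lab"`, and the function name `build_features` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_features(events, members):
    """Aggregate raw events into one feature row per member.

    events:  iterable of (member_id, event_type) pairs; the event_type
             labels "claim", "drug", and "lab" are hypothetical.
    members: dict mapping member_id -> {"sex": ..., "age_group": ...}.
    """
    counts = defaultdict(Counter)
    for member_id, event_type in events:
        counts[member_id][event_type] += 1

    features = {}
    for member_id, info in members.items():
        c = counts[member_id]  # empty Counter if the member has no events
        features[member_id] = {
            "sex": info["sex"],
            "age_group": info["age_group"],
            "n_claims": c["claim"],  # Counter returns 0 for missing keys
            "n_drugs": c["drug"],
            "n_labs": c["lab"],
        }
    return features
```

Members with no events still get a row with zero counts, which keeps the feature table aligned with the member list.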
HOW WILL WE EVALUATE?
i = current member
n = number of members
p = predicted days in hospital for i
a = actual days in hospital for i
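The slide defines the symbols but not how they combine. One metric consistent with these definitions is the root mean squared logarithmic error (RMSLE), sqrt((1/n) * sum over i of (log(p + 1) - log(a + 1))^2). Treating RMSLE as the metric is an assumption here, not something stated on the slide, and the function name `rmsle` is illustrative.

```python
import math


def rmsle(predicted, actual):
    """Root mean squared logarithmic error over n members.

    ASSUMPTION: RMSLE is used as the evaluation metric; the slide only
    defines i, n, p, and a.
    predicted: p for each member i (predicted days in hospital, >= 0)
    actual:    a for each member i (actual days in hospital, >= 0)
    """
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("need two equal-length, non-empty sequences")
    n = len(predicted)
    # log1p(x) computes log(x + 1), which is defined for zero-day members.
    total = sum(
        (math.log1p(p) - math.log1p(a)) ** 2
        for p, a in zip(predicted, actual)
    )
    return math.sqrt(total / n)
```

Under this metric, predicting 1 day for a member who stayed 0 days is penalized more than predicting 11 days for a member who stayed 10, which suits a target where most members have zero days.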
HOW WILL WE RESEARCH?
1. “Data mining and clinical data repositories: Insights from a 667,000 patient data set”
2. “Introduction to neural networks in health care”
3. “Stock market prediction system with modular neural networks”
HOW WILL WE IMPLEMENT?
• Support vector machine to classify members as “yes” or “no” for being hospitalized
• Neural network to quantify number of days “yes” members are hospitalized
• Feature engineering of domain model knowledge into learning algorithms
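The classify-then-quantify design can be sketched as a two-stage predictor. This is a minimal illustration of how the two models might compose, not the project's implementation: `predict_days` and its two callable parameters are hypothetical names, and the callables stand in for a trained support vector machine and a trained neural network.

```python
def predict_days(features, will_be_hospitalized, days_if_hospitalized):
    """Two-stage prediction of days in hospital for one member.

    features:             feature row for the member (any object the
                          two models accept)
    will_be_hospitalized: hypothetical stand-in for the classifier;
                          returns True ("yes") or False ("no")
    days_if_hospitalized: hypothetical stand-in for the regressor;
                          returns a number of days for "yes" members
    """
    # Stage 1: members classified "no" are predicted to spend 0 days.
    if not will_be_hospitalized(features):
        return 0.0
    # Stage 2: quantify days only for "yes" members; clamp at 0 because
    # a negative number of days is not meaningful.
    return max(0.0, float(days_if_hospitalized(features)))
```

Splitting the problem this way lets the second model train only on members who were hospitalized, so it is not dominated by the large number of zero-day members.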