Role of Big Data Analytics in Healthcare
16th January 2013
Prof. Indrajit Bhattacharya
Professor, Healthcare IT, IIHMR, New Delhi
Member, Standardization Committee for Electronic Medical Record (EMR), MoHFW, Govt. of India
India – Key Health Challenges
Dual Disease Burden – NCD, a major challenge!
Abbreviations: cardiovascular diseases (CVD); chronic obstructive pulmonary disease (COPD)
* CVD/diabetes data from 2005; COPD from 2006; cancer from 2004.
** Projected data for CVD/diabetes is for 2015; COPD is for 2016; cancer is for 2014.
Source: World Health Organization, World Health Statistics 2010
Health Goals (XIth Plan)
• Reduce IMR to 28 and MMR to 1 per 1000 live births, and TFR to 2.1
• Provide drinking water and eliminate malnutrition
• Reduce anemia in women by 50%
• Raise the sex ratio (0–6 years) to 935 (2011–12) and to 950 (2016–17)
• Women and child empowerment
Source: Planning Commission (XIth Plan on Health)
Changing Disease Trends
65* 43* 212*
Source: WHO, World Health Statistics, 2010
Total Fertility Rate (TFR) estimates have declined from 2.9 in 2005 to 2.5 in 2010.*
* Source: http://nrhm-mis.nic.in/, accessed on 2nd Jan 2013
Trend 1/3: New Data Growing at 60% Y/Y
Exabytes of information stored: 20 zettabytes by 2015, 1 yottabyte by 2030
Yes, you are part of the yotta generation… even more true for health data
Data sources: audio, digital TV, digital photos, camera phones, RFID, medical imaging, sensors, satellite images, games, scanners, Twitter, CAD/CAM, appliances, videoconferencing, digital movies
Source: The Information Explosion, 2009
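As a rough worked example of what 60% year-over-year growth implies, the sketch below computes the doubling time and the cumulative growth factor under a constant rate. The function names are illustrative only; the 60% rate is the only input taken from the slide, and constant growth is an assumption.

```python
import math


def doubling_time_years(annual_growth_rate: float) -> float:
    """Years needed for a quantity to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth_rate)


def growth_factor(annual_growth_rate: float, years: float) -> float:
    """Cumulative multiplier after the given number of years of compound growth."""
    return (1 + annual_growth_rate) ** years


rate = 0.60  # 60% Y/Y, as stated on the slide
print(f"Doubling time: {doubling_time_years(rate):.2f} years")  # Doubling time: 1.47 years
print(f"Growth over 5 years: {growth_factor(rate, 5):.1f}x")    # Growth over 5 years: 10.5x
```

At this rate the stored volume would roughly double every year and a half and grow about tenfold in five years.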
How Big is Big Data?
Reference: http://highscalability.com/blog/2012/9/11/how-big-is-a-petabyte-exabyte-zettabyte-or-a-yottabyte.html
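To make the scale concrete, here is a minimal sketch that converts between storage units. It assumes SI (decimal) prefixes, where each step is a factor of 1000; the helper name `to_bytes` is illustrative and not taken from the referenced article.

```python
# SI (decimal) storage units: each step up the list is a factor of 1000.
UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def to_bytes(value: int, unit: str) -> int:
    """Convert a value in the given SI unit to bytes."""
    return value * 1000 ** UNITS.index(unit)


for unit in ("PB", "EB", "ZB", "YB"):
    print(f"1 {unit} = 10^{3 * UNITS.index(unit)} bytes")

# How many petabytes fit in one zettabyte?
print(to_bytes(1, "ZB") // to_bytes(1, "PB"))  # 1000000
```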
“Today, I believe we are witnessing one of the most important inflection points in the history of technology, driven by the convergence of emerging technologies such as Big Data and business analytics, and new cloud, mobile and social computing platforms that are really transforming how individuals work and companies compete.” – Jeff Henley, Chairman, Oracle
……translate that knowledge into improved decision making and performance.
Healthcare is Next Frontier for Big Data • Big Data—the ability to collect, process, and interpret massive amounts of information—is one of today’s most important technological drivers. One of the biggest potential areas of application for society is healthcare. • — January 19, 2012
Trend 3/3: Value from Health Data Exceeds Hardware Cost
• Value from the health intelligence of data analytics now outstrips the cost of hardware
• Hadoop enables the use of 10x lower cost hardware
• Hardware cost is halving every 18 months
Chart (value vs. cost): Big Iron: $40k/CPU; Commodity Cluster: $1k/CPU
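The "halving every 18 months" claim can be illustrated with a small exponential-decay sketch. The function name and the choice of the $1k/CPU commodity figure as a starting point are illustrative assumptions, and a steady halving period is assumed throughout.

```python
def projected_cost(initial_cost: float, months: float,
                   halving_period_months: float = 18.0) -> float:
    """Cost after `months`, assuming it halves every `halving_period_months`."""
    return initial_cost * 0.5 ** (months / halving_period_months)


# Start from the slide's commodity-cluster figure of $1k per CPU.
for months in (0, 18, 36, 54):
    print(f"{months:>2} months: ${projected_cost(1000.0, months):,.0f} per CPU")
```

Under these assumptions the per-CPU cost falls to one eighth of its starting value in four and a half years.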
A Holistic View of a Big Data System
• Real-time streams
• Real-time processing (S4, Storm)
• Real-time structured database (HBase, GemFire, Cassandra)
• Big SQL (Greenplum, Aster Data, etc.)
• Batch processing
• Unstructured data (HDFS)
• ETL
• Analytics
“Big Data is all about finding a needle of value in a haystack of unstructured information, to help predict the future.” – IB, 2013
Prof. Indrajit Bhattacharya (IB)
Key Terms
• Hadoop: An open source technology from Apache that provides the ability to cheaply process large amounts of data, regardless of its structure. Historically, enterprise data warehouses were designed to process and store structured data, but were not equipped for the agile exploration of the massive amounts of unstructured data that exist today in many forms. Hadoop allows organizations to cost-effectively derive value from previously unused data sources.
• Structured data: Data that resides in fixed fields within a record or file. Relational databases and spreadsheets are examples of structured data.
• Unstructured data: Data that does not reside in fixed locations (e.g., free-form text, images, audio, video files). Examples are chart notes, discharge summaries, PDF files, email messages, radiology images, blogs, and web pages.
• MapReduce: A programming model for processing large data sets, typically used for distributed computing on clusters of computers.
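The MapReduce model defined above can be sketched in plain Python as a word count over free-text notes. This is a single-process illustration only: it does not use Hadoop's API, and the sample notes and function names are hypothetical. On a real cluster the map and reduce phases would run in parallel across many machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical free-text chart notes (unstructured data); not real patient data.
NOTES = [
    "patient reports chest pain and fatigue",
    "follow up for diabetes and chest pain",
    "diabetes well controlled no fatigue",
]


def map_phase(note: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one note."""
    for word in note.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(notes: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence over all notes."""
    mapped = (pair for note in notes for pair in map_phase(note))
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(NOTES)
print(counts["chest"], counts["diabetes"], counts["patient"])  # 2 2 1
```

Because each note is mapped independently and each key is reduced independently, the same logic can be spread across commodity machines without changing the map or reduce functions.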
What is Hadoop?
• Large-scale batch data processing system
• MapReduce for computation and the Hadoop Distributed File System (HDFS) for storage
• Big Data
• Hadoop scales by using commodity hardware
Traditional vs Big Data (open source software framework Hadoop)
“…with Big Data, value is in the eyes of the data scientist”
Big Data defined (4Vs)
“A new generation of technologies and architectures designed to extract value economically from very large volumes of a wide variety of data by enabling high-velocity capture, discovery, and/or analysis.” – IDC
• Data Volume (Large): Electronic Medical Records (EMRs) and other systems are driving the conversion of paper to digital form.
• Data Variety (Unstructured): Data exists in different formats and sub-applications, separated in space and time.
• Rapid Velocity (Complex): Frequent data updates for highly transient events are a serious concern.
• Data Variability (Fuzziness): The same data element may mean different things as the context changes, and may be transient yet important.
Case Study: Yahoo Front Page
• Personalized for each visitor
• Result: twice the engagement
• Personalized modules: News Interests, Top Searches, Recommended links
• +43% clicks vs. editor selected
• +79% clicks vs. randomly selected
• +160% clicks vs. one size fits all
The India Health Portal (IHP), now being planned, needs to be personalized for every user using Big Data analytics in healthcare!
Big Data opportunities in health
• Increased awareness of consumer trends
• Treatment planning
• Health and social services continuity planning
• Waste and fraud detection
• Population health management
• Surveillance and health management
• Improved research
Measuring Performance: NQF-Endorsed® Standards
NQF (National Quality Forum) Reports
Big Data opportunities in health
• Correlational data: geo-environmental, weather patterns, etc.
• Clinical data: EMRs, diagnostic images
• Claims & cost data: claims, revenue cycle
• Patient & consumer data: purchasing patterns, social media
• Pharma & life science data: clinical trials, genomics
Technology Trends 2013
6. Big Data …
Source: Microsoft survey of industry analysts and customers
Conclusion
• Big Data analytics in healthcare requires skills in IT, mathematics and medicine. India is a leader in all three and has the potential to be a leader in this field.
• Standards in healthcare IT and the India Health Information Network Development (i-HIND) need to be in place urgently for its benefits to be realized.
“An investment in Big Data Analytics in healthcare can lead to better outcomes at lower costs”
email@example.com
References: IAMI, IBM, Google, Microsoft, VMware, Oracle, Ideal Analytics, IDC, Yahoo, Apache, Fidelity, MoHFW, Wipro, McKinsey & Co., HIMSS