
June 13, 2019

Adding Value to Registries through Geospatial Big Data Fusion: Geospatial Health Context Big Table Facilitating Geospatial Analysis in Health Research. Tim Haithcoat & Chi-Ren Shyu, University of Missouri Informatics Institute. June 13, 2019.


Presentation Transcript


  1. Adding Value to Registries through Geospatial Big Data Fusion: Geospatial Health Context Big Table Facilitating Geospatial Analysis in Health Research. Tim Haithcoat & Chi-Ren Shyu, University of Missouri Informatics Institute. June 13, 2019

  2. THE GOAL: Develop robust processes for health researchers and practitioners to more easily incorporate spatially integrated health, social, cultural, access, infrastructure, and environmental parameters/factors and spatial context in their research using scalable geospatially enabled databases, analytics, and visualizations.

  3. Unique Infrastructure Typical Relational DB Typical Geospatial DB

  4. Tessellation over Census blocks: Block centroids = 343,565 points

  5. Tessellation with Census Centroids: Thiessen Proximal Polygons
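The defining property of a Thiessen (Voronoi) polygon is that every location inside it is closer to its own centroid than to any other. The slides do not show how the tessellation was built, so the following is only a minimal sketch of that property: it assigns arbitrary points to their nearest centroid. The centroid ids, coordinates, and the helper `nearest_centroid` are hypothetical, and planar (projected) coordinates are assumed.

```python
import math

def nearest_centroid(point, centroids):
    """Return the id of the centroid closest to `point` (planar distance).

    A point belongs to the Thiessen polygon of its nearest centroid,
    so this lookup is equivalent to a point-in-polygon test against
    the tessellation (ties are broken by dictionary order).
    """
    return min(centroids, key=lambda cid: math.dist(point, centroids[cid]))

# Hypothetical block centroids in projected (x, y) coordinates.
centroids = {
    "block_A": (0.0, 0.0),
    "block_B": (10.0, 0.0),
    "block_C": (0.0, 10.0),
}

sample_points = [(1.0, 2.0), (8.0, 1.0), (2.0, 9.0)]
assignments = [nearest_centroid(p, centroids) for p in sample_points]
print(assignments)
```

A brute-force scan like this is only practical for small inputs; at the scale the slides describe, a spatial index or a dedicated Voronoi routine would be the more likely choice.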

  6. Extent of the Data Table
     • Defined a point file with 318 million points for the contiguous 48 states.
     • How many columns (attributes)? Projection: 10,000+
     • How many data sets? US Data.gov – Federal GIS > 1,000
     • What is the size of the table? 1.5 GB/attribute; growth projection: 90 TB
     • Using Spark big data ecosystem
     • Australian Cancer Atlas
     • Determined Main Common Keys
       • Census Geography
       • Zip Code
       • Watershed
       • Etc.
     • Created point summary counts for all geographies to use for analytics
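The last bullet on this slide, point summary counts per geography, can be illustrated with a small sketch. It assumes each point row already carries one value per common key; the field names (`tract`, `zip`, `watershed`), the sample rows, and the helper `summary_counts` are hypothetical. The slides mention Spark for the real table; plain Python is used here only to show the idea.

```python
from collections import Counter

# Hypothetical point rows: each point carries one value per common key.
points = [
    {"point_id": 1, "tract": "T1", "zip": "65201", "watershed": "W1"},
    {"point_id": 2, "tract": "T1", "zip": "65201", "watershed": "W2"},
    {"point_id": 3, "tract": "T2", "zip": "65203", "watershed": "W2"},
    {"point_id": 4, "tract": "T2", "zip": "65201", "watershed": "W2"},
]

KEYS = ("tract", "zip", "watershed")

def summary_counts(rows, keys):
    """Count how many points fall in each unit of each geography key."""
    return {key: Counter(row[key] for row in rows) for key in keys}

counts = summary_counts(points, KEYS)
for key in KEYS:
    print(key, dict(counts[key]))
```

Counts like these give a denominator for each geography, so that a per-point attribute can later be turned into a share or a rate for a tract, Zip Code, or watershed.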

  7. Establishing Context • Inter-layer Distance measures • Coded 1st & 2nd Order Relationships
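The slide does not define these two measures, so the sketch below rests on two labeled assumptions: that an inter-layer distance is the distance from a point to the nearest feature in another layer, and that 1st and 2nd order relationships mean units one and two adjacency steps away. All names and data are hypothetical, and planar coordinates are assumed.

```python
import math

def nearest_feature_distance(point, features):
    """Distance from `point` to the closest feature in another layer."""
    return min(math.dist(point, feature) for feature in features)

def neighbors_by_order(adjacency, start):
    """Return (first_order, second_order) neighbor sets of `start`.

    First order: units directly adjacent to `start`.
    Second order: units adjacent to a first-order neighbor,
    excluding `start` and the first-order neighbors themselves.
    """
    first = set(adjacency.get(start, ())) - {start}
    second = set()
    for unit in first:
        second.update(adjacency.get(unit, ()))
    second -= first | {start}
    return first, second

# Hypothetical adjacency between geographic units (A-B-C-D in a chain).
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
first, second = neighbors_by_order(adjacency, "A")

# Hypothetical "hospital" layer measured against one grid point.
hospitals = [(3.0, 4.0), (10.0, 10.0)]
distance = nearest_feature_distance((0.0, 0.0), hospitals)
print(first, second, distance)
```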

  8. Registry Data Loading Registry Data Records

  9. Leveraging Geospatial in Registries
     • Geocoding of Registry
       • Attach an X,Y coordinate to each record with associated confidence (strongest)
       • Attach a primary key(s) (i.e. Census ID, Zip Code Tabulation Area) based on geocode of address to create ‘easy’ linkage to associated data when needed.
       • Use geocoded location to determine association with a primary key to move attributes of interest directly to the registry record.
     • Determine what information, and at what geographic summarization level, registry data gets shared
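The key-based linkage described on this slide amounts to a lookup join: once a registry record carries a primary key from geocoding, attributes of interest can be copied onto it. The following is a minimal sketch under that reading. The record fields, the key name `tract_id`, the attribute values, and the helper `attach_context` are all hypothetical and the data is made up.

```python
# Hypothetical geocoded registry records (coordinates and confidence
# would come from a geocoder; the values here are invented).
registry = [
    {"record_id": "r1", "x": -92.33, "y": 38.95, "confidence": 0.97, "tract_id": "T1"},
    {"record_id": "r2", "x": -92.30, "y": 38.90, "confidence": 0.64, "tract_id": "T2"},
    {"record_id": "r3", "x": -92.10, "y": 38.70, "confidence": 0.88, "tract_id": "T9"},
]

# Hypothetical context table keyed by the same primary key.
context_by_tract = {
    "T1": {"median_income": 52000, "broadband": "fiber"},
    "T2": {"median_income": 41000, "broadband": "dsl"},
}

def attach_context(records, context_by_key, key_field, attributes):
    """Copy selected context attributes onto each record via its key.

    Records whose key has no match keep the attribute with value None,
    so unmatched rows stay visible instead of being dropped.
    """
    enriched = []
    for record in records:
        context = context_by_key.get(record.get(key_field), {})
        row = dict(record)  # leave the original record untouched
        for attribute in attributes:
            row[attribute] = context.get(attribute)
        enriched.append(row)
    return enriched

enriched = attach_context(registry, context_by_tract, "tract_id", ["median_income"])
for row in enriched:
    print(row["record_id"], row["median_income"])
```

Passing an explicit attribute list mirrors the slide's final bullet: only the information chosen for sharing is moved onto the registry record.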

  10. Using the Big Data Table (Geospatial Health Context Big Table)
     • Data Required: Socio-Economic, Demographic, Infrastructure, Environmental, Cultural, Derived, Physical, Modeled
     • User Data: Address, Zip Code, Tract, County
     • Inquiry Type: Exploratory, Simple Question, Complex Question, Complex Question w/ Temporal
     • Aggregation Unit: Zip Code, Tract, Block Group, County, Watershed, School Dist, Health Service Area
     • Chart: LIFESTYLE 50%, HEALTH CARE 25%, BIOLOGY 15%, ENVIRON 10%

  11. Choose an Issue
     Right-Sizing Care: Over the next decade, the aging American population is expected to place increased demands on the U.S. healthcare system. For older Americans, a review of medical records found that 38% of doctor visits, including 27% of Emergency Room (E.R.) visits, could have been replaced with telemedicine.
     • The Data Needed: Census age & gender; Hospital locations; Attributed road network; Broadband attributes; Census geography
     • Query Elements: Age > 60 years; Gender; Hospital Service Area; Broadband Service
     • Effort Required: Census data tables (2 hrs); Census geography (1 hr); Hospital types (2 hrs); Road network zones (time and/or distance) (1 week); Broadband type (2 hrs)
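The query elements on this slide (age over 60, gender, hospital service area, broadband service) can be read as a filter followed by a grouped count. The sketch below assumes one row per person with those four fields already attached. The rows, field names, and helper functions are hypothetical and the data is invented; the slides do not show the actual query.

```python
from collections import Counter

# Hypothetical person-level rows with the four query elements attached.
rows = [
    {"age": 72, "gender": "F", "hsa": "HSA-1", "broadband": True},
    {"age": 65, "gender": "M", "hsa": "HSA-1", "broadband": False},
    {"age": 45, "gender": "F", "hsa": "HSA-1", "broadband": True},
    {"age": 81, "gender": "F", "hsa": "HSA-2", "broadband": True},
    {"age": 61, "gender": "M", "hsa": "HSA-2", "broadband": True},
]

def select(rows, min_age=60, gender=None, broadband=None):
    """Keep rows older than `min_age`; optionally filter gender/broadband."""
    selected = []
    for row in rows:
        if row["age"] <= min_age:
            continue
        if gender is not None and row["gender"] != gender:
            continue
        if broadband is not None and row["broadband"] != broadband:
            continue
        selected.append(row)
    return selected

def count_by_hsa(rows):
    """Count selected rows per hospital service area."""
    return Counter(row["hsa"] for row in rows)

# Older residents with broadband, per hospital service area.
telemedicine_ready = count_by_hsa(select(rows, broadband=True))
print(dict(telemedicine_ready))
```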

  12. GeoHCBT: A case study of Leukemia

  13. Example Complex Questions
     • What factors in different demographic groups or locations discourage people from cancer treatment?
     • How can we update our healthcare delivery strategy based on availability of medical services with relation to cancer risk based on population growth, ageing, and cancer type?
     • Can we identify any new relationships between cancer occurrence and environmental, socio-cultural, infrastructural, or other data to explore or generate new hypotheses?
     • What is the magnitude of population cancer disparities in an area, where are they located, and what factors might be creating these ‘hot spots’?

  14. Relevance
     The Geospatial Health Context Big Table provides:
     • Cancer Researchers an integrated big data repository to:
       • Search - Enable stronger research designs (i.e. develop sampling / surveillance approaches).
       • Explore - Understand spatial interaction of a multitude of attributes.
       • Ability to add contextual information based on neighborhood
     • Decision Makers with a new tool to evaluate policy implications and focus on areas / populations affected.
     • Public Health Professionals an ability to identify, mitigate, and potentially prevent health disparities in cancer incidence.

  15. Acknowledgments
     Collaborators: Chi-Ren Shyu, PhD; Richard D. Hammer, M.D.; Tim Matisziw, PhD; Iris Zachary, PhD; Eileen Avery, PhD; Kelly Bowers, D.O.; Mirna Becevic, PhD
     This work is supported by the NIH BD2K T32 Training grant (5T32LM012410-02).
     The Big Data ecosystem is supported by the NSF CNS-1429294.
     Looking for research collaborations: Contact: HaithcoatT@missouri.edu
