Published by Gladys Hill. Modified over 5 years ago.
Adding Value to Registries through Geospatial Big Data Fusion
Geospatial Health Context Big Table
Facilitating Geospatial Analysis in Health Research
Tim Haithcoat & Chi-Ren Shyu
University of Missouri Informatics Institute
June 13, 2019
THE GOAL
Develop robust processes for health researchers and practitioners to more easily incorporate spatially integrated health, social, cultural, access, infrastructure, and environmental parameters/factors and spatial context in their research using scalable geospatially enabled databases, analytics, and visualizations.
Unique Infrastructure
Typical Relational DB
Typical Geospatial DB
Talk about the current state of the art and the shortcomings of raster and vector approaches.
Why it was designed this way.
Column-based: why not row-based?
Why did we pick this method and design?
Tessellation over Census blocks
Block centroids = 343,565 points
Thiessen Proximal Polygons
Tessellation with Census Centroids
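The defining property of a Thiessen (Voronoi) proximal polygon is that every location inside it is closer to its own centroid than to any other. The sketch below illustrates only that property, by assigning a location to its nearest centroid with a brute-force search. It does not build polygon geometry and is not the authors' implementation; the block ids and coordinates are made up, and planar (projected) coordinates are assumed.

```python
import math

def nearest_centroid(x, y, centroids):
    """Return the id of the centroid closest to (x, y).

    centroids maps a block id to an (x, y) tuple in a planar (projected)
    coordinate system, so Euclidean distance is meaningful.
    """
    return min(centroids, key=lambda cid: math.dist((x, y), centroids[cid]))

# Hypothetical block centroids (ids and coordinates are illustrative only).
centroids = {
    "block_A": (0.0, 0.0),
    "block_B": (10.0, 0.0),
    "block_C": (0.0, 10.0),
}

# Any location is assigned to the proximal polygon of its nearest centroid.
print(nearest_centroid(2.0, 1.0, centroids))  # block_A
print(nearest_centroid(8.0, 3.0, centroids))  # block_B
```

A brute-force scan is fine for a handful of centroids; at the scale of hundreds of thousands of block centroids a spatial index would be needed.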
Extent of the Data Table
Defined a point file with 318 million points for the contiguous 48 states.
How many columns (attributes)? Projection: 10,000+
How many data sets? US Data.gov (Federal GIS): > 1,000
What is the size of the table? 1.5 Gb/attribute. Growth projection: 90 Tb
Using Spark big data ecosystem
Australian Cancer Atlas
Determined main common keys: Census Geography, Zip Code, Watershed, etc.
Created point summary counts for all geographies to use for analytics
How to deal with various resolutions and scales across the database
How and why regions were set up to speed query and retrieval
How mapped?
KD Tree: how to make it work and make it efficient
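The slide mentions a KD tree for mapping locations onto the point table without saying how it was built. As a minimal sketch of the general technique only, the pure-Python 2-D k-d tree below stores points together with a payload of geography keys and answers nearest-point queries. The point coordinates, key names (`tract`, `zip`), and key values are hypothetical, and a production system on Spark would look quite different.

```python
import math

def build_kdtree(points, depth=0):
    """Build a 2-D k-d tree from a list of (x, y, payload) tuples."""
    if not points:
        return None
    axis = depth % 2  # alternate between x (0) and y (1)
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def nearest(node, target, best=None):
    """Return the stored (x, y, payload) tuple closest to target = (x, y)."""
    if node is None:
        return best
    point = node["point"]
    if best is None or math.dist(target, point[:2]) < math.dist(target, best[:2]):
        best = point
    axis = node["axis"]
    diff = target[axis] - point[axis]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, target, best)
    # Visit the far side only if the splitting line is closer than the best hit.
    if abs(diff) < math.dist(target, best[:2]):
        best = nearest(far, target, best)
    return best

# Hypothetical table points, each carrying the keys of the geographies it falls in.
points = [
    (2.0, 3.0, {"tract": "T1", "zip": "Z1"}),
    (5.0, 4.0, {"tract": "T1", "zip": "Z1"}),
    (9.0, 6.0, {"tract": "T2", "zip": "Z3"}),
    (4.0, 7.0, {"tract": "T2", "zip": "Z2"}),
    (8.0, 1.0, {"tract": "T3", "zip": "Z3"}),
    (7.0, 2.0, {"tract": "T3", "zip": "Z3"}),
]
tree = build_kdtree(points)

x, y, keys = nearest(tree, (9.0, 2.0))
print((x, y), keys)  # nearest stored point is (8.0, 1.0)
```

The pruning test is what makes the tree efficient: whole subtrees are skipped whenever the splitting line is farther away than the best match found so far.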
Establishing Context
Inter-layer Distance Measures
Coded 1st & 2nd Order Relationships
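The slide does not define these terms, so the sketch below rests on two assumptions: that an inter-layer distance measure is the distance from a unit in one layer to the nearest feature in another layer, and that 1st and 2nd order relationships mean adjacent units and neighbors of neighbors. All unit ids, coordinates, and the hospital layer are hypothetical.

```python
import math

# Hypothetical adjacency between areal units (which units share a boundary).
adjacency = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C"],
}

# Hypothetical unit centroids and a second layer of hospital locations,
# both in the same planar coordinate system.
unit_centroids = {"A": (0.0, 0.0), "B": (5.0, 0.0), "C": (10.0, 0.0), "D": (15.0, 0.0)}
hospitals = [(4.0, 3.0), (15.0, 4.0)]

def first_order(adjacency, unit):
    """Units directly adjacent to the given unit."""
    return set(adjacency.get(unit, ()))

def second_order(adjacency, unit):
    """Neighbors of neighbors, excluding the unit itself and its first-order neighbors."""
    first = first_order(adjacency, unit)
    second = set()
    for neighbor in first:
        second |= first_order(adjacency, neighbor)
    return second - first - {unit}

def distance_to_nearest(origin, features):
    """Euclidean distance from origin to the closest feature in another layer."""
    return min(math.dist(origin, feature) for feature in features)

print(first_order(adjacency, "A"))    # {'B'}
print(second_order(adjacency, "A"))   # {'C'}
print(distance_to_nearest(unit_centroids["A"], hospitals))  # 5.0
```

Precomputing and storing such measures as columns would let later queries read context directly instead of recomputing spatial relationships each time.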
Registry Data
Loading Registry Data Records
Leveraging Geospatial in Registries
Geocoding of Registry
Attach an X,Y coordinate to each record with associated confidence (strongest).
Attach one or more primary keys (e.g. Census ID, Zip Code Tabulation Area) based on the geocode of the address to create an 'easy' linkage to associated data when needed.
Use the geocoded location to determine association with a primary key and move attributes of interest directly to the registry record.
Determine what information, and at what geographic summarization level, registry data gets shared.
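The linkage options above can be sketched as follows, assuming each registry record has already been geocoded to planar X,Y coordinates. The function `link_record`, the field names (`tract`, `zcta`, `pct_broadband`, `geocode_confidence`), and all values are hypothetical and chosen only for illustration; the nearest context point is found by brute force to keep the example short.

```python
import math

# Hypothetical context points: coordinates, geography keys, and one attribute.
context_points = [
    {"x": 0.0, "y": 0.0, "tract": "T1", "zcta": "Z1", "pct_broadband": 82.0},
    {"x": 10.0, "y": 0.0, "tract": "T2", "zcta": "Z1", "pct_broadband": 55.0},
    {"x": 0.0, "y": 10.0, "tract": "T3", "zcta": "Z2", "pct_broadband": 31.0},
]

def link_record(record, points, keys=("tract", "zcta"), attributes=()):
    """Return a copy of a geocoded record with geography keys attached.

    keys are always copied, giving an 'easy' join to other data later;
    attributes are optionally copied straight onto the record.
    """
    nearest = min(
        points,
        key=lambda p: math.dist((record["x"], record["y"]), (p["x"], p["y"])),
    )
    linked = dict(record)  # leave the original registry record untouched
    for name in (*keys, *attributes):
        linked[name] = nearest[name]
    return linked

# A hypothetical geocoded registry record.
record = {"id": 1, "x": 9.0, "y": 1.5, "geocode_confidence": "rooftop"}

keys_only = link_record(record, context_points)
with_attribute = link_record(record, context_points, attributes=("pct_broadband",))

print(keys_only["tract"], keys_only["zcta"])   # T2 Z1
print(with_attribute["pct_broadband"])         # 55.0
```

Attaching only keys keeps the registry record small and defers the join, while copying attributes directly trades storage for simpler downstream analysis.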
Using the Big Data Table
Geospatial Health Context Big Table
Data Required: Socio-Economic, Demographic, Infrastructure, Environmental, Cultural, Derived, Physical, Modeled
LIFESTYLE 50%, HEALTH CARE 25%, BIOLOGY 15%, ENVIRON 10%
User Data: Address, Zip Code, Tract, County
Inquiry Type: Exploratory, Simple Question, Complex Question, Complex Question w Temporal
Aggregation Unit: Zip Code, Tract, Block Group, County, Watershed, School Dist, Health Service Area
Choose an Issue
Right-Sizing Care: Over the next decade, the aging American population is expected to place increased demands on the U.S. healthcare system. For older Americans, a review of medical records found that 38% of doctor visits, including 27% of Emergency Room (E.R.) visits, could have been replaced with telemedicine.
Effort Required:
Census data tables (2 hrs)
Census geography (1 hr)
Hospital types (2 hrs)
Road network zones (time and/or distance) (1 week)
Broadband type (2 hrs)
Query Elements:
Age > 60 years
Gender
Hospital Service Area
Broadband Service
The Data Needed:
Census age & gender
Hospital locations
Attributed road network
Broadband attributes
Census geography
Select & retrieval
MAUP can be addressed simply
Key geographies
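As a rough illustration of how such a query might run once the data sit in one point-level table, the sketch below filters rows to a hospital service area and totals the population over 60 with and without broadband. Because every row carries keys for several geographies, the same rows can be re-aggregated by tract or by county, which is one simple way to examine the modifiable areal unit problem (MAUP). The row layout, field names, and numbers are all hypothetical.

```python
from collections import defaultdict

# Hypothetical point-level rows carrying keys for more than one geography.
rows = [
    {"tract": "T1", "county": "C1", "pop_over_60": 120, "has_broadband": True, "in_hsa": True},
    {"tract": "T1", "county": "C1", "pop_over_60": 80, "has_broadband": False, "in_hsa": True},
    {"tract": "T2", "county": "C1", "pop_over_60": 50, "has_broadband": True, "in_hsa": True},
    {"tract": "T3", "county": "C2", "pop_over_60": 200, "has_broadband": False, "in_hsa": True},
    {"tract": "T3", "county": "C2", "pop_over_60": 40, "has_broadband": True, "in_hsa": False},
]

def summarize(rows, unit):
    """Total the over-60 population inside the service area, grouped by unit."""
    totals = defaultdict(lambda: {"pop_over_60": 0, "with_broadband": 0})
    for row in rows:
        if not row["in_hsa"]:
            continue  # outside the hospital service area of interest
        group = totals[row[unit]]
        group["pop_over_60"] += row["pop_over_60"]
        if row["has_broadband"]:
            group["with_broadband"] += row["pop_over_60"]
    return dict(totals)

# The same rows, summarized at two different aggregation units.
by_tract = summarize(rows, "tract")
by_county = summarize(rows, "county")

print(by_tract["T1"])    # {'pop_over_60': 200, 'with_broadband': 120}
print(by_county["C1"])   # {'pop_over_60': 250, 'with_broadband': 170}
```

Comparing the two summaries shows how the apparent broadband coverage of older residents shifts with the choice of aggregation unit.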
GeoHCBT: A case study of Leukemia
Example Complex Questions
What factors in different demographic groups or locations discourage people from cancer treatment?
How can we update our healthcare delivery strategy based on availability of medical services with relation to cancer risk based on population growth, ageing, and cancer type?
Can we identify any new relationships between cancer occurrence and environmental, socio-cultural, infrastructural, or other data to explore or generate new hypotheses?
What is the magnitude of population cancer disparities in an area, where are they located, and what factors might be creating these 'hot spots'?
Relevance
The Geospatial Health Context Big Table provides:
Cancer Researchers an integrated big data repository to:
Search - Enable stronger research designs (e.g. develop sampling / surveillance approaches).
Explore - Understand spatial interaction of a multitude of attributes.
Ability to add contextual information based on neighborhood.
Decision Makers with a new tool to evaluate policy implications and focus on areas / populations affected.
Public Health Professionals an ability to identify, mitigate, and potentially prevent health disparities in cancer incidence.
Acknowledgments
Collaborators: Chi-Ren Shyu, PhD; Richard D. Hammer, M.D.; Tim Matisziw, PhD; Iris Zachary, PhD; Eileen Avery, PhD; Kelly Bowers, D.O.; Mirna Becevic, PhD
This work is supported by the NIH BD2K T32 Training grant (5T32LM )
The Big Data ecosystem is supported by the NSF CNS
Looking for research collaborations: Contact: