

Income Inequality and Health: Expanding our Understanding of State Level Effects by using a Geospatial Big Data Approach
Tim Haithcoat (1), Eileen Avery (1,2), Kelly Bowers (1,3), Richard D. Hammer (1,3), and Chi-Ren Shyu (1,4)
(1) Informatics Institute; (2) Department of Sociology; (3) Department of Pathology & Laboratory Medicine; (4) Department of Electrical Engineering
This work is supported by the NIH BD2K T32 Training grant (5T32LM012410-02). The Big Data ecosystem is supported by the NSF CNS-1429294.
Prepared for BigSurv18, Barcelona, Spain, October 27, 2018

Motivation
New directions in big data technology allow scholars to answer new research questions, or revisit existing ones, in unique ways.
The team is currently working on a big data tool, the "Geospatial Health Context Big Table" (GeoHCBT).
The table contains, or will contain, variables that include decennial census and American Community Survey data, land use/greenspace, pollution/exposures, crime, and so forth.
Here it is used to examine the relationship between income inequality and health in a unique way.

Unique Infrastructure
Uses the Spark big data ecosystem, run on clusters.
Defined a point file with 318 million points for the contiguous 48 states.
Determined the main common keys: census geography, zip code, watershed, school district, etc.
Created point summary counts for all geographies to use for analytics.
[Figure: comparison with a typical geospatial DB and a typical relational DB]
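The point summary counts can be pictured with a small sketch. This is plain Python rather than Spark, and the field names (`county`, `zip_code`) and the sample codes are hypothetical; it only illustrates the idea of tagging every point with several geography keys and counting points per unit of each geography.

```python
from collections import Counter


def summarize_points(points, key):
    """Count how many points fall in each unit of one geography key."""
    return Counter(point[key] for point in points)


# Hypothetical sample: each point carries one code per geography key.
points = [
    {"county": "29019", "zip_code": "65201"},
    {"county": "29019", "zip_code": "65203"},
    {"county": "29051", "zip_code": "65101"},
]

county_counts = summarize_points(points, "county")
zip_counts = summarize_points(points, "zip_code")
print(county_counts)
print(zip_counts)
```

Because every point carries all keys, the same point file can be summarized to any of the geographies without a spatial join at query time, which is presumably the reason for the design.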

Relevance
The Geospatial Health Context Cube provides:
Health researchers with an integrated big data repository to search (enable stronger research designs, i.e. develop sampling / surveillance approaches), explore (understand spatial interaction models), and add contextually derived characteristics.
Decision makers with a new tool to evaluate policy implications and focus on areas / populations affected.
Public health professionals with an ability to identify, mitigate, and potentially prevent health disparities.

Income Inequality and Health
Income inequality hypothesis: strong and weak versions.
Individual-level hypotheses (absolute and relative income, deprivation, relative position).
Mechanisms.
Issues with geography.
Our focus is on ecological income inequality, or the extent of inequality that exists in a given place.

Current Study
In this research, we utilize advances in geospatial big data tools and apply them to traditional survey data in order to examine the extent to which the following are associated with health outcomes in the Behavioral Risk Factor Surveillance System (BRFSS):
(1) overall income inequality in states, as captured by the Gini coefficient;
(2) the overall uniformity of this measure within states across counties;
(3) the extent to which this inequality is more uniformly high or low.
Results add to a better understanding of the ways that the relationship plays out across space within higher levels of geography such as large political units.

Health Outcomes
Mental Health: Diagnosed with depression (including depression, major depression, or minor depression). Difficulty concentrating if "yes" to: "Because of a physical, mental, or emotional condition, do you have serious difficulty concentrating, remembering, or making decisions?"
Accessibility: Restriction to care due to cost (care too expensive) if "yes" to: "Was there a time in the past 12 months when you needed to see a doctor but could not because of cost?"
Physical Health: Obese if the respondent's body mass index (BMI) is 30 or above. Diagnosis of chronic obstructive pulmonary disease (COPD). Diagnosis of cardiovascular disease (CVD). Fair or poor self-rated health (versus excellent, very good, or good).

Gini Coefficient and Uniformity Measures
The Gini index is a measure of statistical dispersion intended to represent the income or wealth distribution of a unit's residents, and is the most commonly used measurement of inequality.
Examples: United States (41.5 [2016]); Spain (36.2 [2015]); UK (33.2 [2015]); Brazil (51.3 [2015]); South Africa (63 [2014]); China (42.2 [2012]); Ukraine (25.5 [2015]); Sweden (29.2 [2014]).
It was developed by the Italian statistician and sociologist Corrado Gini and published in his 1912 paper Variability and Mutability.
Uniformity measures: uniformity level overall, uniformly high, uniformly low.
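As a rough illustration of how a Gini coefficient can be computed from a list of incomes, here is a minimal sketch using the standard rank-based formula. The function name and the toy inputs are assumptions for illustration; the study itself presumably takes published Gini values from census sources rather than computing them this way.

```python
def gini(incomes):
    """Return the Gini coefficient of non-negative incomes.

    0 means perfect equality; values approach 1 as one unit holds
    everything. Multiply by 100 for the 0-100 index scale.
    """
    values = sorted(incomes)
    n = len(values)
    total = sum(values)
    if n == 0 or total <= 0 or values[0] < 0:
        raise ValueError("need non-negative incomes with a positive total")
    # Rank-weighted sum over the sorted incomes (ranks start at 1).
    weighted = sum(rank * v for rank, v in enumerate(values, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


equal = gini([1, 1, 1, 1])       # everyone has the same income
unequal = gini([0, 0, 0, 1])     # one person holds all income
print(equal, unequal)
```

With four units, the most unequal case gives 0.75 rather than 1 because the maximum for n units is (n - 1) / n.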

State Level Gini Distribution

County Level Gini Distribution

Measure of Spatial Association, Local Moran's I
Local Moran's I is given as:
$$I_i = \frac{g_i - \bar{G}}{S_i^2} \sum_{j=1,\, j \neq i}^{n} d_{i,j}\,(g_j - \bar{G})$$
where $g_i$ is the Gini index for county $i$, $\bar{G}$ is the mean of the Gini index across all counties ($n$), $d_{i,j}$ is the spatial weight (distance) between county $i$ and county $j$, and:
$$S_i^2 = \frac{\sum_{j=1,\, j \neq i}^{n} (g_j - \bar{G})^2}{n - 1}$$
with $n$ equal to the total number of counties.
Positive value: neighboring county features have similarly high or similarly low Gini indexes, making this county a member of a cluster.
Negative value: neighboring features have dissimilar values, which flags this county feature as an outlier.
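The statistic can be sketched in a few lines of plain Python. This follows the standard Anselin form of Local Moran's I (deviation of county i from the mean, scaled by the variance of the other counties, times the weighted sum of neighbor deviations). The function name, the binary adjacency weights, and the toy values are assumptions; the slides describe distance-based weights.

```python
def local_morans_i(values, weights):
    """Local Moran's I for each unit.

    values  : list of n attribute values (e.g. county Gini indexes)
    weights : n x n nested list; weights[i][j] is the spatial weight
              between units i and j (the diagonal is ignored)
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    result = []
    for i in range(n):
        # Variance of all *other* units around the global mean.
        s2 = sum(dev[j] ** 2 for j in range(n) if j != i) / (n - 1)
        if s2 == 0:
            raise ValueError("no variation among the other units")
        # Weighted sum of neighbor deviations.
        lag = sum(weights[i][j] * dev[j] for j in range(n) if j != i)
        result.append(dev[i] / s2 * lag)
    return result


# Four hypothetical counties in a row; adjacent counties are neighbors.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = local_morans_i([0.50, 0.48, 0.40, 0.38], adjacency)
alternating = local_morans_i([1, 0, 1, 0], adjacency)
print(clustered)
print(alternating)
```

In the clustered arrangement every value is positive (similar neighbors), while in the alternating arrangement every value is negative (dissimilar neighbors), matching the interpretation given on the slide.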

Moran's I and Correlation Coefficient r: Differences and Similarities
Correlation coefficient r: measures the relationship between two variables (figure example: Education and Income).
Moran's I: involves one variable only, and is the correlation between a variable, X, and the spatial lag of X formed by averaging all the values of X for the neighboring polygons (figure example: Grocery Store Density and Grocery Store Density Nearby).
[Figure: scatter plots for the two cases; one plot is annotated r = -0.71]
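To make the "spatial lag" idea concrete, the sketch below averages each unit's neighbors and then computes global Moran's I from the deviations and their lags. It assumes row-standardized weights (each unit's neighbors weighted equally, every unit with at least one neighbor), in which case Moran's I reduces to the ratio shown. The function names and the neighbor lists are hypothetical.

```python
def spatial_lag(values, neighbors):
    """Average the values of each unit's neighbors."""
    return [
        sum(values[j] for j in neighbors[i]) / len(neighbors[i])
        for i in range(len(values))
    ]


def morans_i(values, neighbors):
    """Global Moran's I with row-standardized (equal-share) weights."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    lag_z = spatial_lag(z, neighbors)
    # Cross-product of each deviation with its neighbors' mean deviation,
    # scaled by the total squared deviation.
    return sum(a * b for a, b in zip(z, lag_z)) / sum(a * a for a in z)


# Four hypothetical polygons in a row; adjacent polygons are neighbors.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
i_clustered = morans_i([1, 1, 0, 0], neighbors)
i_alternating = morans_i([1, 0, 1, 0], neighbors)
print(i_clustered, i_alternating)
```

Unlike r, only one variable enters the calculation; the "second variable" is that same variable observed at neighboring locations.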

Clustering and Outliers
A cluster is identified by evaluating each county's Gini value against its neighborhood of counties within a specified distance threshold. A statistically significant cluster of Gini values represents a regionalized area where surrounding counties share similar values. A county with a high Gini index surrounded by other high values is labeled HH, as a member of a high Gini index cluster; a county with a low Gini index surrounded by other low values is labeled LL.
An outlier is defined relative to a cluster: it is a county whose Gini index falls within the space of an assembled cluster but is significantly dissimilar to that cluster. A county with a high Gini index is labeled HL if its surrounding counties are primarily low values, and a county with a low value surrounded primarily by high values is labeled LH. Statistical significance for this assessment was set at the 95% confidence level.
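The labeling rule can be sketched as a small function. It assumes the caller already has a p-value for the county's local statistic (for example from a permutation test) and compares both the county and its neighborhood average against the overall mean; the function name, the "NS" label for non-significant counties, and the sample numbers are hypothetical.

```python
def classify_county(gini, neighbor_mean, overall_mean, p_value, alpha=0.05):
    """Return HH, LL, HL, LH, or NS (not significant) for one county.

    The first letter describes the county itself, the second its
    neighborhood, each relative to the overall mean Gini index.
    alpha=0.05 corresponds to the 95% confidence level.
    """
    if p_value >= alpha:
        return "NS"
    own = "H" if gini > overall_mean else "L"
    surrounding = "H" if neighbor_mean > overall_mean else "L"
    return own + surrounding


# Hypothetical counties, with an overall mean Gini of 0.44.
labels = [
    classify_county(0.52, 0.50, 0.44, p_value=0.01),  # high among highs
    classify_county(0.36, 0.38, 0.44, p_value=0.02),  # low among lows
    classify_county(0.52, 0.39, 0.44, p_value=0.03),  # high among lows
    classify_county(0.37, 0.51, 0.44, p_value=0.04),  # low among highs
    classify_county(0.52, 0.50, 0.44, p_value=0.20),  # not significant
]
print(labels)
```

HH and LL are cluster members, while HL and LH are outliers.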

Clustering and Outliers

Uniformity Index

Uniformity Index High

Uniformity Index Low

Controls and Analytic Strategy
Controlled for MHI, health insurance (state and individual), % on SNAP, age, race, ethnicity, education, income, relationship status, and health behaviors.
Hierarchical logistic regression models with random intercepts; individuals nested within states.
Weights utilized.
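A generic random-intercept logistic model of the kind described can be written as follows. The notation is an assumption, not taken from the slides: $y_{ij}$ is a binary health outcome for individual $i$ in state $j$, $\mathbf{x}_{ij}$ holds individual-level controls, $\mathbf{z}_j$ holds state-level measures (such as the Gini coefficient and the uniformity measures), and $u_j$ is the state-specific random intercept.

```latex
\operatorname{logit}\,\Pr(y_{ij} = 1)
  = \beta_0
  + \boldsymbol{\beta}^{\top}\mathbf{x}_{ij}
  + \boldsymbol{\gamma}^{\top}\mathbf{z}_{j}
  + u_j,
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2)
```

Under this form, $\exp(\gamma_k)$ is the odds ratio associated with a one-unit increase in the $k$-th state-level measure, holding the other terms fixed.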

Descriptive Statistics for all Variables (n = 954,671 individuals / 48 states)

Hierarchical Logistic Regressions: Health Outcomes on Measures of Inequality and Uniformity in Inequality

Conclusions
Income inequality, as captured by the Gini coefficient, did not significantly increase the odds of any outcome.
However, Gini reduced the odds of obesity and depression, and residents of states with more uniformly low inequality were more likely to be obese.
Residents of states with more uniformly high levels of inequality across space are more likely to report below average health, cardiovascular disease, difficulty concentrating, and lack of access to care due to cost.
These findings, while disputing the income inequality hypothesis (IIH), suggest that inequality, and its distribution across space, matters differently for different health outcomes.
The nature of the dispersion of inequality across geographies is an important variable to consider when evaluating the IIH.

Future Directions
Grouping analysis based on positive and negative variable correlations / associations with the Gini index.
Explore other inequality measures.
Explore the stability of these relationships across various geographic levels.
[Figure: variables grouped by negative and positive association with the Gini index]