Geospatial Statistics LearnR! Fall 2014 Nathaniel MacNell
What do we do with spatial data?
What approach should I use?
- Start: the [spatial] “support” of the data
  - What type of spatial data do you have?
    - Points (e.g. GPS coordinates) per observation
    - Polygons (“areal units”) per observation
  - Could be different for exposure/covariates/outcome
- Also: background information
  - What is the mechanism of action?
- Also: hypothesis/research questions
  - What are you trying to show?
Today: Spatial Dependence
- A dataset where observations are polygons
  - Census data
  - SEER (cancer) data
  - Patients coded to areas
- Variety of designs
  - Cross-sectional
  - Cohort
  - Time series
Example: Income & Cancer in NC
- Research question: do NC counties with lower mean income have higher rates of lung cancer?
- Poverty → Lung Cancer
Potential Spatial Mechanisms:
- Effects of the “space” itself
  - Core/periphery areas (agriculture → poverty)
  - Supply of tobacco (agriculture → smoking)
  - Often: “unmeasured [spatial] confounders”
- Effect of neighbors
  - “Contagiousness” of smoking/income behavior
  - Social norms (you smoking → me smoking)
  - Inheritance of poverty (parent poverty → child poverty)
To Modeling Approaches
- Spatial Error Model
  - “Residuals” of nearby observations are not independent
  - I.e. effects of unmeasured spatial factors
- Spatial Lag Model
  - Observations affected by nearby observations
    - Value of independent variable
    - Value of dependent variable
  - I.e. effects of “echoes” of measured spatial factors
Spatial Models Mathematically
- Linear model (non-spatial)
  Y = Xβ + ε
  - Y: outcome vector
  - X: covariate matrix (including exposure)
  - β: effect vector (slopes)
  - ε: residual (“error”) vector
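Spatial Models Mathematically
- Spatial Error Model
  Y = Xβ + u
  u = λWu + ε
  - u: spatially correlated error; λ: strength of the spatial dependence in the errors
- Spatial Lag Model
  Y = Xβ + ρWY + ε
  - ρ: strength of the spatial dependence in the outcome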
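A minimal sketch of what the spatial lag equation implies, using only the Python standard library. The four-area map, the covariate values, β = 2 and ρ = 0.5 are all made up for illustration, and `solve_spatial_lag` is a hypothetical helper, not a function from any package. It solves Y = Xβ + ρWY + ε by repeated substitution, which converges when W is row-standardized and |ρ| < 1.

```python
def matvec(w, v):
    """Multiply matrix w (list of rows) by vector v."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]


def solve_spatial_lag(w, xb, eps, rho, tol=1e-12, max_iter=10_000):
    """Solve y = xb + rho * W y + eps by fixed-point iteration.

    Assumes w is row-standardized and |rho| < 1, so each pass shrinks
    the remaining error and the iteration converges.
    """
    base = [a + e for a, e in zip(xb, eps)]
    y = base[:]
    for _ in range(max_iter):
        wy = matvec(w, y)  # average outcome among each area's neighbors
        y_new = [b + rho * lag for b, lag in zip(base, wy)]
        if max(abs(a - b) for a, b in zip(y_new, y)) < tol:
            return y_new
        y = y_new
    raise RuntimeError("did not converge; check that |rho| < 1")


# Hypothetical map: areas A-B-C-D in a chain, row-standardized weights.
W = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]
beta = 2.0
x = [1.0, 0.0, 0.0, 0.0]          # only area A is "exposed"
xb = [beta * xi for xi in x]
eps = [0.0, 0.0, 0.0, 0.0]        # no noise, to make the echo visible

y = solve_spatial_lag(W, xb, eps, rho=0.5)
for name, value in zip("ABCD", y):
    print(name, round(value, 4))
```

Areas B, C and D have x = 0 but still end up with non-zero outcomes: the exposure in A “echoes” through the neighbors. The same helper also gives the spatial error term u = λWu + ε if it is called with `xb` set to zeros and `rho` set to λ.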
What is W?
- Spatial Weights Matrix
  - Who are my neighbors?
  - How “close” am I to each one? (measure of impact)
- Many different coding schemes
  - Binary: all neighbors affect me equally
  - Row-standardized: each area's neighbor weights add up to 1
Example W
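As an illustration, here is a small W built both ways for a hypothetical four-area map (A, B and C in a row, with D touching only C). The map and the helper names `binary_weights` and `row_standardize` are assumptions made for this sketch.

```python
# Hypothetical neighbor lists: who touches whom.
neighbors = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C"],
}
areas = sorted(neighbors)


def binary_weights(neighbors, areas):
    """W[i][j] = 1 if area j is a neighbor of area i, else 0."""
    return [[1.0 if b in neighbors[a] else 0.0 for b in areas] for a in areas]


def row_standardize(w):
    """Divide each row by its sum so an area's weights add up to 1.

    Rows with no neighbors (sum 0) are left as all zeros.
    """
    out = []
    for row in w:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


W_binary = binary_weights(neighbors, areas)
W_row = row_standardize(W_binary)

print("binary:")
for name, row in zip(areas, W_binary):
    print(" ", name, row)
print("row-standardized:")
for name, row in zip(areas, W_row):
    print(" ", name, row)
```

In the binary version, B's two neighbors each get weight 1. After row-standardizing, each gets 0.5, so WY for B is the average of its neighbors' values rather than their sum.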
How to get W
- Option 1: Define it (educated guess)
  - E.g. social network analysis
- Option 2: Figure out something empirically
  - Find all my neighbors in space
  - Choose a coding scheme (still educated guess!)
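A rough sketch of Option 2 for polygon data: call two areas neighbors if their boundaries share points. The toy squares below are invented, and `contiguity_neighbors` is a hypothetical helper. Counting shared vertices only works when adjacent polygons store identical coordinates along their common border, which real shapefiles do not always guarantee.

```python
from itertools import combinations

# Hypothetical polygons, each a list of (x, y) corner points.
polygons = {
    "A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "B": [(1, 0), (2, 0), (2, 1), (1, 1)],   # shares an edge with A
    "C": [(2, 0), (3, 0), (3, 1), (2, 1)],   # shares an edge with B
    "D": [(3, 1), (4, 1), (4, 2), (3, 2)],   # touches C at one corner only
    "E": [(8, 8), (9, 8), (9, 9), (8, 9)],   # island: touches nothing
}


def contiguity_neighbors(polygons, min_shared=1):
    """Return {area: sorted list of neighbors}.

    min_shared=1 treats any shared corner as contact ("queen" style).
    min_shared=2 requires two shared corners, which for these simple
    toy shapes means a shared edge ("rook" style).
    """
    result = {name: [] for name in polygons}
    for a, b in combinations(polygons, 2):
        shared = set(polygons[a]) & set(polygons[b])
        if len(shared) >= min_shared:
            result[a].append(b)
            result[b].append(a)
    return {name: sorted(found) for name, found in result.items()}


queen = contiguity_neighbors(polygons, min_shared=1)
rook = contiguity_neighbors(polygons, min_shared=2)
print("queen:", queen)
print("rook: ", rook)
```

The two rules disagree about C and D, and E has no neighbors under either rule. Both are choices the analyst has to make before picking a coding scheme, including what to do with areas that have no neighbors at all.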
To the lab!
- Import spatial data
- Build a neighbors object
- Build some weights matrices
- Try spatial lag and spatial error models