Spatial Point Pattern Analysis. GRAD6104/8104 INES 8090 Spatial Statistics, Spring 2017
Image source: http://kids. nationalgeographic
Image source: http://cdn4. lacdn
Image source: http://www.airpano.ru/files/flamingo_01_big.jpg
What about… Trees? Pedestrians? Cars? … Image: Chinese New Year in China. Source: http://i.usatoday.net/news/_photos/2008/02/01/china-top.jpg
What about… Trees? Pedestrians? Cars? … Image: Traffic in Los Angeles. Source: http://blogs.kcrw.com/whichwayla/wp-content/uploads/2015/03/la-traffic.jpg
Spatial Point Patterns (SPP): Spatial characteristics of point patterns: random, clustered, or regular? Spatial scales at which these characteristics are observed.
Spatial Point Patterns: Points are associated with events of a point process. Sampled point pattern: events are partially observed. Mapped point pattern: all events of a realization are observed.
Spatial Point Patterns: Complete Spatial Randomness (CSR). The average number of events per unit area (intensity) is homogeneous throughout the spatial domain. Events are independent of each other.
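Spatial Point Patterns: First-order properties: intensity function λ(s), the average number of events per unit area: $\lambda(s) = \lim_{|ds| \to 0} \frac{E[N(ds)]}{|ds|}$, where ds is an infinitesimal region at location s, |ds| is its area, and N(A) is the number of events in an arbitrary region A. Second-order properties: second-order intensity $\lambda_2(s, t) = \lim_{|ds|, |dt| \to 0} \frac{E[N(ds)\,N(dt)]}{|ds|\,|dt|}$.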
Spatial Point Patterns: Indicator function. [Figure labeled with h.]
Spatial Point Patterns: The homogeneous Poisson process (HPP) is the stochastic representation of Complete Spatial Randomness. A statistical test compares an observed point pattern against the pattern expected under an HPP.
Homogeneous Poisson process: Attributes: stationary and isotropic; intensity $\lambda(s) = \lambda$; second-order intensity $\lambda_2(s, t) = \lambda^2$.
Homogeneous Poisson process: Computer simulation (conditioned on n events in D). For a unit square: (1) generate 2 independent uniform random variates; (2) pair them up to get the coordinates of a single event; (3) repeat this independently n times. For a rectangle: rescale the coordinates. For an irregularly shaped study area D: simulate on a rectangle containing D and retain only those events that lie within D.
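The simulation steps above can be sketched in Python as follows. This is a minimal illustration using only the standard library; the function names, the `inside` membership test, and the unit-disc study area are assumptions made for the example, not part of the course material. For the irregular region, candidates are drawn until n events have been retained, so that the result is still conditioned on n events in D.

```python
import random

def simulate_csr_rectangle(n, width=1.0, height=1.0, rng=None):
    """Place n events uniformly and independently on [0, width] x [0, height]."""
    if rng is None:
        rng = random.Random()
    # Draw two uniform variates on the unit square, then rescale to the rectangle.
    return [(rng.random() * width, rng.random() * height) for _ in range(n)]

def simulate_csr_region(n, inside, bbox, rng=None):
    """Place n uniform events in an irregular region D by rejection.

    inside(x, y) -> bool is a hypothetical membership test for D;
    bbox = (xmin, ymin, xmax, ymax) is a rectangle containing D.
    """
    if rng is None:
        rng = random.Random()
    xmin, ymin, xmax, ymax = bbox
    events = []
    while len(events) < n:
        x = xmin + rng.random() * (xmax - xmin)
        y = ymin + rng.random() * (ymax - ymin)
        if inside(x, y):  # retain only events that lie within D
            events.append((x, y))
    return events

def in_unit_disc(x, y):
    """Example irregular study area: the disc of radius 1 centered at the origin."""
    return x * x + y * y <= 1.0

rng = random.Random(42)
square_events = simulate_csr_rectangle(100, 2.0, 1.0, rng)
disc_events = simulate_csr_region(100, in_unit_disc, (-1.0, -1.0, 1.0, 1.0), rng)
```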
Exploratory Analysis: First-order properties: kernel density estimation, quadrat analysis. Second-order properties: nearest neighbor distance, Ripley’s K function.
Exploratory Analysis: Area-based approaches: quadrat analysis. Distance-based approaches: kernel density estimation, nearest neighbor distance, Ripley’s K function.
Exploratory Analysis: Kernel Density Estimation. [Figure 4a,b from Delmelle et al. (2014); h marks the bandwidth.] Delmelle, E. M., Zhu, H., Tang, W., & Casas, I. (2014). A web-based geospatial toolkit for the monitoring of dengue fever. Applied Geography, 52, 144-152.
Exploratory Analysis: Kernel Density Estimation. Estimation of the first-order property (intensity) of a point pattern. Nonparametric approach, defined in terms of h: bandwidth; I(.): indicator function; k(.): kernel weight function. [Figure 1 from Gatrell et al. (1996).] Gatrell, A. C., Bailey, T. C., Diggle, P. J., & Rowlingson, B. S. (1996). Spatial point pattern analysis and its application in geographical epidemiology. Transactions of the Institute of British Geographers, 256-274.
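As an illustration of how h, I(.), and k(.) fit together, here is a minimal sketch of a kernel intensity estimate at a single location. It assumes the two-dimensional quartic kernel, which is one common choice and not necessarily the form used on the slide; the function name and the toy events are made up for the example, and no edge correction is applied.

```python
import math

def kernel_intensity(s, events, h):
    """Quartic-kernel estimate of the intensity at location s with bandwidth h."""
    sx, sy = s
    total = 0.0
    for ex, ey in events:
        d2 = (sx - ex) ** 2 + (sy - ey) ** 2
        if d2 <= h * h:  # indicator I(d_i <= h): only events within h contribute
            # Quartic kernel weight, scaled so each event contributes total mass 1.
            total += 3.0 / (math.pi * h * h) * (1.0 - d2 / (h * h)) ** 2
    return total

events = [(0.2, 0.2), (0.25, 0.3), (0.8, 0.7)]
near = kernel_intensity((0.22, 0.25), events, h=0.2)  # two events within h
far = kernel_intensity((0.5, 0.95), events, h=0.2)    # no events within h
```

Evaluating the function over a regular grid of locations gives the smoothed intensity surface usually shown as a density map.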
Exploratory Analysis: Kernel Density Estimation. Kernel weight functions: uniform, Gaussian, Epanechnikov (1969), … Effect of the selection of bandwidth h. Source: http://en.wikipedia.org/wiki/Kernel_%28statistics%29
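The three kernels named above, in their standard one-dimensional forms, can be written down directly. The sketch below also shows the effect of bandwidth on a toy data set; the data values, grid, and helper name `kde_1d` are assumptions for illustration only.

```python
import math

def uniform(u):
    return 0.5 if abs(u) <= 1 else 0.0

def epanechnikov(u):
    return 0.75 * (1 - u * u) if abs(u) <= 1 else 0.0

def gaussian(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)

def kde_1d(x, data, h, kernel):
    """Kernel density estimate at x: (1 / (n h)) * sum of k((x - x_i) / h)."""
    return sum(kernel((x - xi) / h) for xi in data) / (len(data) * h)

data = [1.0, 1.2, 1.3, 3.0, 3.1]          # two groups with a gap around x = 2
grid = [i * 0.1 for i in range(45)]
for h in (0.2, 1.0):
    curve = [kde_1d(x, data, h, epanechnikov) for x in grid]
    # A small h gives a spiky curve; a large h smooths the two groups together.
    print(h, round(max(curve), 3), round(kde_1d(2.0, data, h, epanechnikov), 3))
```

With the small bandwidth the estimate drops to zero in the gap between the two groups, while the large bandwidth fills the gap in, which is the trade-off that bandwidth selection has to balance.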
Quadrat Analysis: Divide the spatial domain D into non-overlapping regions (quadrats) of equal size: r by c quadrats. Count the events within each quadrat for the SPP with n events. Expected number of events per quadrat: $\bar{n} = n/(rc)$.
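Counting events per quadrat can be sketched as below for a rectangular domain. The function name, the rectangular domain anchored at the origin, and the toy events are assumptions for the example.

```python
def quadrat_counts(events, r, c, width=1.0, height=1.0):
    """Count events in an r-by-c grid of equal quadrats over [0, width] x [0, height]."""
    counts = [[0] * c for _ in range(r)]
    for x, y in events:
        # min(...) assigns events on the upper or right boundary to the last quadrat.
        row = min(int(y / height * r), r - 1)
        col = min(int(x / width * c), c - 1)
        counts[row][col] += 1
    return counts

events = [(0.1, 0.1), (0.2, 0.4), (0.6, 0.2), (0.9, 0.9), (0.7, 0.8), (0.8, 0.6)]
counts = quadrat_counts(events, r=2, c=2)
expected = len(events) / (2 * 2)   # n / (r * c)
```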
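Quadrat Analysis: Chi-square statistic for the goodness-of-fit test (a.k.a. index of dispersion): $X^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(n_{ij} - \bar{n})^2}{\bar{n}}$, where $n_{ij}$ is the count in quadrat (i, j) and $\bar{n} = n/(rc)$. Under CSR, the distribution of $X^2$ is approximately $\chi^2_{rc-1}$, provided that $\bar{n}$ is not too small, say ≥ 5. Obtain $\Pr(\chi^2_{rc-1} \le X^2)$; in R, use pchisq(q, df) with q = $X^2$ and df = rc - 1. The test is two-sided. Large $X^2$: aggregation. Small $X^2$: regularity.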
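A minimal sketch of the statistic computed from a grid of quadrat counts follows. The function name and the two toy count grids are assumptions; the p-value step is left to a chi-square distribution function such as R's pchisq mentioned above.

```python
def quadrat_chi_square(counts):
    """Return (X2, df) for a grid of quadrat counts under the CSR null."""
    flat = [k for row in counts for k in row]
    mean = sum(flat) / len(flat)               # expected count n / (r * c)
    x2 = sum((k - mean) ** 2 / mean for k in flat)
    return x2, len(flat) - 1                   # df = rc - 1

clustered = [[14, 2], [3, 1]]   # most events fall in one quadrat
even = [[5, 5], [5, 5]]         # perfectly even counts
x2_clustered, df = quadrat_chi_square(clustered)
x2_even, _ = quadrat_chi_square(even)
```

The clustered grid yields a large statistic and the even grid yields zero, matching the two-sided reading: large values point to aggregation, values near zero to regularity.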
Quadrat Analysis Example from the textbook
Quadrat Analysis: Criticisms. Insensitive to regular departures from CSR. The conclusion may depend on quadrat size and shape: MAUP (Modifiable Areal Unit Problem). Too much information is lost by reducing the pattern to areal counts.
Quadrat Analysis: Scale analysis. Vary the quadrat size and plot $X^2$ against block size. Peaks or troughs: evidence of scales of pattern.
Distance-based Methods: Clark-Evans test; Diggle’s refined NN analysis; Ripley’s K function.
Distance-based Methods: Clark-Evans test (Clark and Evans, 1954). Based on the mean nearest-neighbor (NN) distance: $\bar{d} = \frac{1}{n} \sum_{i=1}^{n} d_i$, where $d_i$ is the distance from event i to its nearest neighboring event. Small $\bar{d}$: aggregation. Large $\bar{d}$: regularity. Clark, P. J., & Evans, F. C. (1954). Distance to nearest neighbor as a measure of spatial relationships in populations. Ecology, 445-453.
Distance-based Methods: Clark-Evans test (Clark and Evans, 1954). Test statistic: $CE = \frac{\bar{d} - 1/(2\sqrt{\lambda})}{\sqrt{(4 - \pi)/(4 \pi \lambda n)}}$, where $\bar{d}$ is the mean NN distance, λ = n/|D|, and |D| is the area of the study region D. Under CSR, CE is approximately N(0,1), assuming edge and overlap effects are ignored. Powerful for detecting aggregation and regularity; weak at detecting heterogeneity.
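The standardized Clark-Evans statistic can be sketched as follows, using the CSR mean 1/(2√λ) and variance (4 − π)/(4πλn) of the mean NN distance and ignoring edge and overlap effects. The function names and the two toy patterns (a regular lattice and a tight cluster on the unit square) are assumptions for illustration.

```python
import math

def nn_distances(events):
    """Distance from each event to its nearest neighboring event."""
    return [min(math.dist(p, q) for j, q in enumerate(events) if j != i)
            for i, p in enumerate(events)]

def clark_evans(events, area):
    """Standardized Clark-Evans statistic, ignoring edge and overlap effects."""
    n = len(events)
    lam = n / area                                   # estimated intensity
    mean_nn = sum(nn_distances(events)) / n          # observed mean NN distance
    expected = 1.0 / (2.0 * math.sqrt(lam))          # mean NN distance under CSR
    std_err = math.sqrt((4.0 - math.pi) / (4.0 * math.pi * lam * n))
    return (mean_nn - expected) / std_err

# 5 x 5 regular lattice on the unit square (large NN distances)
lattice = [((i + 0.5) / 5, (j + 0.5) / 5) for i in range(5) for j in range(5)]
# 25 events packed into a small patch of the unit square (small NN distances)
cluster = [(0.5 + 0.01 * i, 0.5 + 0.01 * j) for i in range(5) for j in range(5)]

ce_lattice = clark_evans(lattice, area=1.0)   # positive: regularity
ce_cluster = clark_evans(cluster, area=1.0)   # negative: aggregation
```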
Distance-based Methods: Diggle’s refined NN analysis. A test based on the entire empirical distribution function (EDF) of the NN distances: $\hat{G}(h) = \frac{1}{n} \sum_{i=1}^{n} I(d_i \le h)$, where I(.) is the indicator function and $d_i$ is the NN distance of event i. Under CSR, $G(h) = 1 - \exp(-\lambda \pi h^2)$, h ≥ 0. A plot of $\hat{G}(\cdot)$ against $G(\cdot)$ is a straight line under independence. $\hat{G}(h) > G(h)$ for small h: aggregation. $\hat{G}(h) < G(h)$ for small h: regularity.
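A minimal sketch comparing the empirical NN distribution function with its CSR counterpart is given below. The function names and the toy lattice and cluster patterns are assumptions, and no edge correction is applied.

```python
import math

def g_hat(events, h):
    """Empirical distribution function of the NN distances, evaluated at h."""
    nn = [min(math.dist(p, q) for j, q in enumerate(events) if j != i)
          for i, p in enumerate(events)]
    return sum(1 for d in nn if d <= h) / len(nn)

def g_csr(h, lam):
    """Theoretical NN distance distribution under CSR with intensity lam."""
    return 1.0 - math.exp(-lam * math.pi * h * h)

lattice = [((i + 0.5) / 5, (j + 0.5) / 5) for i in range(5) for j in range(5)]
cluster = [(0.5 + 0.01 * i, 0.5 + 0.01 * j) for i in range(5) for j in range(5)]
lam = 25 / 1.0   # n / |D| on the unit square

# Clustered pattern: empirical value above the CSR value at small h.
print(g_hat(cluster, 0.02), g_csr(0.02, lam))
# Regular pattern: empirical value below the CSR value at small h.
print(g_hat(lattice, 0.1), g_csr(0.1, lam))
```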
Distance-based Methods: Diggle’s refined NN analysis. A test based on the entire empirical distribution function (EDF) of the NN distances. Significance testing: Monte Carlo test. Recommended number of runs: 99 for the 5% level; 999 for the 1% level. Simulation envelopes: indicate the distances at which the pattern departs from CSR.
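Simulation envelopes can be sketched as the pointwise minimum and maximum of the empirical NN distribution function over repeated CSR simulations. The sketch below assumes a unit-square study area, 99 runs, a fixed seed, and a toy clustered pattern as the observed data; all names are hypothetical.

```python
import math
import random

def g_hat(events, hs):
    """Empirical NN distance distribution function at each distance in hs."""
    nn = [min(math.dist(p, q) for j, q in enumerate(events) if j != i)
          for i, p in enumerate(events)]
    return [sum(1 for d in nn if d <= h) / len(nn) for h in hs]

def csr_envelope(n, hs, runs=99, seed=0):
    """Pointwise min and max of g_hat over `runs` CSR patterns on the unit square."""
    rng = random.Random(seed)
    sims = []
    for _ in range(runs):
        pattern = [(rng.random(), rng.random()) for _ in range(n)]
        sims.append(g_hat(pattern, hs))
    lower = [min(col) for col in zip(*sims)]
    upper = [max(col) for col in zip(*sims)]
    return lower, upper

hs = [0.02, 0.05, 0.1, 0.15]
observed = [(0.5 + 0.01 * i, 0.5 + 0.01 * j) for i in range(5) for j in range(5)]
lower, upper = csr_envelope(len(observed), hs)
g_obs = g_hat(observed, hs)
# Distances at which the observed curve leaves the envelope.
outside = [h for h, g, lo, up in zip(hs, g_obs, lower, upper) if g < lo or g > up]
```

Distances listed in `outside` are those where the observed curve falls outside the simulated range, which is how the envelope points to the scale of the departure from CSR.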
Distance-based Methods Diggle’s refined NN analysis Figure 3.2 Location of lightning strikes
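Distance-based Methods: Diggle’s refined NN analysis. Similarly, tests using point-to-nearest-event distances can be applied. The m sample points are generated randomly or systematically: $\hat{F}(h) = \frac{1}{m} \sum_{j=1}^{m} I(d_j \le h)$, where $d_j$ is the distance from sample point j to the nearest event.
Distance-based Methods: Diggle’s refined NN analysis. Both $\hat{G}$ and $\hat{F}$ are used in Diggle’s refined NN analysis.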
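The point-to-nearest-event version can be sketched in the same way, here with systematically generated sample points on a regular grid. The function names, the 10 x 10 sample grid, and the toy clustered pattern are assumptions; under CSR the theoretical curve has the same form, 1 − exp(−λπh²), as for NN distances.

```python
import math

def f_hat(events, sample_points, h):
    """Empirical distribution function of point-to-nearest-event distances at h."""
    d = [min(math.dist(s, e) for e in events) for s in sample_points]
    return sum(1 for x in d if x <= h) / len(d)

def f_csr(h, lam):
    """Theoretical point-to-nearest-event distribution under CSR."""
    return 1.0 - math.exp(-lam * math.pi * h * h)

# m = 100 systematic sample points on the unit square
k = 10
sample_points = [((i + 0.5) / k, (j + 0.5) / k) for i in range(k) for j in range(k)]
cluster = [(0.5 + 0.01 * i, 0.5 + 0.01 * j) for i in range(5) for j in range(5)]

f_obs = f_hat(cluster, sample_points, 0.1)
f_exp = f_csr(0.1, lam=25.0)
```

For a clustered pattern most sample points lie far from any event, so the empirical curve sits below the CSR curve, the reverse of the direction seen with event-to-event NN distances.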
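Distance-based Methods: Ripley’s K Function. Second-moment cumulative function: $\lambda K(h)$ is the expected number of further events within distance h of an arbitrary event. For an HPP, $K(h) = \pi h^2$ and $L(h) = h - \sqrt{K(h)/\pi} = 0$. L(h) < 0 ($K(h) > \pi h^2$) for small h: aggregation. L(h) > 0 ($K(h) < \pi h^2$) for small h: regularity. Note that the L(h) function may be defined differently.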
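Distance-based Methods: Ripley’s K Function. Estimator: $\hat{K}(d) = \frac{A}{n^2} \sum_{i=1}^{n} \sum_{j \ne i} w_{ij} I_{ij}(d)$, where A: the area of the study area; n: number of events; $I_{ij}(d)$: indicator function (1 if $d_{ij} < d$; otherwise 0); $w_{ij}$: edge-correction weights (equal to 1 when no edge correction is applied).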
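A minimal sketch of the K estimate with all edge weights taken as 1 (that is, with no edge correction) is shown below, compared against the CSR value πd². The function name and the toy lattice and cluster patterns are assumptions for illustration.

```python
import math

def k_hat(events, d, area):
    """Naive K estimate at distance d with every weight w_ij set to 1."""
    n = len(events)
    # Count ordered pairs (i, j), i != j, with d_ij < d.
    pairs = sum(1 for i, p in enumerate(events)
                for j, q in enumerate(events)
                if i != j and math.dist(p, q) < d)
    return area * pairs / (n * n)

lattice = [((i + 0.5) / 5, (j + 0.5) / 5) for i in range(5) for j in range(5)]
cluster = [(0.5 + 0.01 * i, 0.5 + 0.01 * j) for i in range(5) for j in range(5)]

k_cluster = k_hat(cluster, 0.05, area=1.0)   # well above pi * 0.05**2
k_lattice = k_hat(lattice, 0.1, area=1.0)    # below pi * 0.1**2
```

Without edge correction the estimate is biased downward for events near the boundary, which is what the correction weights are meant to address.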
Distance-based Methods: Ripley’s K Function. Estimator: correction of edge effect. Ripley’s circumference; toroidal shift; guard (inner | outer).
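Two of these corrections are simple to sketch for a rectangular study area. The toroidal approach is illustrated here by measuring distances as if opposite edges were joined, and the inner guard approach by restricting which events lie at least a guard distance from every edge. The function names and the rectangular domain anchored at the origin are assumptions; this is an illustration of the ideas, not a full edge-corrected estimator.

```python
import math

def toroidal_distance(p, q, width=1.0, height=1.0):
    """Distance between p and q when the rectangle is wrapped onto a torus."""
    dx = abs(p[0] - q[0])
    dy = abs(p[1] - q[1])
    dx = min(dx, width - dx)     # wrap around the left and right edges
    dy = min(dy, height - dy)    # wrap around the bottom and top edges
    return math.hypot(dx, dy)

def in_inner_region(p, guard, width=1.0, height=1.0):
    """True if p is at least `guard` away from every edge of the rectangle."""
    return guard <= p[0] <= width - guard and guard <= p[1] <= height - guard

a, b = (0.05, 0.5), (0.95, 0.5)
plain = math.dist(a, b)               # about 0.9 across the interior
wrapped = toroidal_distance(a, b)     # about 0.1 across the joined edges
```

With an inner guard area, only events for which `in_inner_region` is true would be used as the centers when counting neighbors, while events in the guard strip still count as neighbors.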
Distance-based Methods: Ripley’s K Function. Estimator: significance test by Monte Carlo simulation. Source: http://resources.esri.com/help/9.3/arcgisengine/java/Gp_ToolRef/spatial_statistics_tools/multi_distance_spatial_cluster_analysis_ripley_s_k_function_spatial_statistics_.htm
Reading Assignment: Gatrell, A. C., Bailey, T. C., Diggle, P. J., & Rowlingson, B. S. (1996). Spatial point pattern analysis and its application in geographical epidemiology. Transactions of the Institute of British Geographers, 256-274. Diggle, P. J. (1983). Chapter 2: Preliminary testing for mapped patterns. In Statistical Analysis of Spatial Point Patterns. Academic Press, London.