Spatio-Temporal Cluster Detection Using AMOEBA
Jimmy Kroon, Pennsylvania State University
Advisor: Dr. Frank Hardisty
This is a parody – Original Art: http://projectswordtoys. blogspot
Outline
- Introduction: Clustering and Project Direction
- The Spatial Scan Statistic and SaTScan
- AMOEBA
- Proposed Spatio-Temporal AMOEBA Method
- Software, Data, and Progress
Cluster Detection
Cluster: “a geographically and/or temporally bounded group of occurrences of sufficient size and concentration to be unlikely to have occurred by chance” (Knox, 1989)
Two Typical Uses
- Disease Surveillance: week of 2/7/2010 (data: Google Flu Trends; analysis: GeoDa)
- Epidemiological Studies: brain cancer in NM (Kulldorff et al. 1998)
Clusters : SaTScan : AMOEBA : ST AMOEBA : Progress
Time in Spatial Analysis
Time matters:
- Many geographic phenomena are dynamic.
- The spatial patterns we see probably change over time.
- The American Association of Geographers describes temporal geography as a ‘frontier’ of GIScience.
Spatio-temporal clusters may exhibit behaviors not seen in purely spatial clusters:
- Growth
- Movement
- Splits / Joins
Research Problem
- Primary: No method exists for determining the true extent of irregularly shaped clusters in spatio-temporal datasets.
- Secondary: Spatial AMOEBA has not been implemented in R.
Project Goals
- A demonstration of spatio-temporal cluster detection based on the AMOEBA procedure.
- R scripts for running spatial and spatio-temporal AMOEBA will be contributed to the R community.
The Spatial Scan Statistic
- Scan the data with a moving ‘window’, calculating a likelihood ratio statistic for the spatial units that fall within each window.
- Select the window(s) with the highest statistic as the most likely cluster(s).
- The spatial scan statistic is by far the most popular cluster detection technique, largely due to the availability of the SaTScan software by Martin Kulldorff.
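The scanning idea can be illustrated with a minimal sketch. It assumes centroid points with case counts and populations, circular windows grown outward from each centroid, and a Poisson log-likelihood ratio as the window score. The names `poisson_llr` and `circular_scan`, the 50% population cap, and the toy grid are hypothetical choices for illustration; this is not SaTScan's implementation.

```python
import math

def poisson_llr(cases, expected, total_cases):
    """Poisson log-likelihood ratio for one window (0 if no excess of cases)."""
    if cases <= expected:
        return 0.0
    rest = total_cases - cases
    inside = cases * math.log(cases / expected)
    outside = rest * math.log(rest / (total_cases - expected)) if rest > 0 else 0.0
    return inside + outside

def circular_scan(points, max_pop_fraction=0.5):
    """points: list of (x, y, cases, population).
    Returns (best score, sorted member indices) over all circular windows."""
    total_cases = sum(p[2] for p in points)
    total_pop = sum(p[3] for p in points)
    best = (0.0, [])
    for cx, cy, _, _ in points:
        # Growing the circle is equivalent to adding points in order of distance.
        order = sorted(range(len(points)),
                       key=lambda i: math.hypot(points[i][0] - cx, points[i][1] - cy))
        cases = pop = 0
        members = []
        for i in order:
            cases += points[i][2]
            pop += points[i][3]
            if pop > max_pop_fraction * total_pop:
                break  # window too large to be a meaningful cluster
            members.append(i)
            expected = total_cases * pop / total_pop
            score = poisson_llr(cases, expected, total_cases)
            if score > best[0]:
                best = (score, sorted(members))
    return best

# Toy 3x3 grid: two adjacent cells carry an excess of cases.
grid = [(x, y, 10 if (x, y) in {(0, 0), (1, 0)} else 2, 100)
        for y in range(3) for x in range(3)]
score, members = circular_scan(grid)
print(members, round(score, 2))
```

Because every candidate window is a circle, a long or L-shaped cluster can only be approximated by circles that either include background cells or miss part of the cluster.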
Drawbacks of the Spatial Scan Statistic
Clusters that are not similar in shape to the scanning window can produce errors:
- False inclusions
- False exclusions
- Thin clusters identified as multiple small clusters
- Holes in clusters cannot be detected
The Elliptical Spatial Scan Statistic
- Shapes must be chosen a priori to avoid pre-selection bias.
- See Kulldorff et al. 2006.
AMOEBA Clusters (A Multidirectional Optimum Ecotope-Based Algorithm)
- Ecotope-Based: Regions of contiguous spatial units that are related in terms of z-value.
- Multidirectional: Search in all directions.
- Optimum: The procedure takes place at the finest spatial scale possible and is capable of revealing all spatial association present in the dataset (Aldstadt and Getis, 2006).
Defining an Ecotope
1. Add a seed location (one polygon) to the ecotope.
2. Calculate Gi* (the Getis-Ord local autocorrelation statistic).
3. Search in all directions for contiguous polygons.
4. Add those that increase Gi* to the growing ecotope for that seed location.
5. Keep searching for more neighbors, growing the ecotope until Gi* no longer increases.
6. Repeat, creating an ecotope for each polygon in the dataset.
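The ecotope-growing steps can be sketched as follows. This is a simplified, greedy illustration, not the published AMOEBA procedure or the author's R code: it assumes binary weights in Gi*, tries frontier polygons one at a time from highest to lowest value, and only grows high-value ecotopes. The function names, the dictionary-based neighbor list, and the toy data are hypothetical.

```python
import math

def g_star(values, members):
    """Getis-Ord Gi* with binary weights for the set `members` (seed included)."""
    n, k = len(values), len(members)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    den = s * math.sqrt((n * k - k * k) / (n - 1))
    if den == 0:
        return 0.0  # undefined when all values are equal or the set covers everything
    return (sum(values[i] for i in members) - mean * k) / den

def grow_ecotope(values, neighbors, seed):
    """Grow an ecotope from `seed`, adding contiguous units while Gi* increases."""
    ecotope = {seed}
    best = g_star(values, ecotope)
    while True:
        # Units contiguous to the current ecotope but not yet part of it.
        frontier = {j for i in ecotope for j in neighbors[i]} - ecotope
        added = False
        for j in sorted(frontier, key=lambda u: values[u], reverse=True):
            trial = g_star(values, ecotope | {j})
            if trial > best:
                ecotope.add(j)
                best = trial
                added = True
        if not added:
            return ecotope, best

# Toy example: six polygons in a row, with high values in the middle.
values = [1, 1, 9, 10, 8, 1]
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
ecotope, score = grow_ecotope(values, neighbors, seed=3)
print(sorted(ecotope), round(score, 3))
```

Calling `grow_ecotope` once per polygon yields one ecotope per seed, which is the input to the cluster selection step.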
The R Neighbor Object
Finding an Ecotope with AMOEBA
From Ecotopes to Clusters
1. Rank ecotopes by final Gi*.
2. Select the ecotope with the highest Gi* as a cluster.
3. Eliminate intersecting ecotopes.
4. Select the ecotope with the next highest Gi* as a second cluster.
5. Repeat.
The statistical significance of the clusters can be tested using Monte Carlo simulation.
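A minimal sketch of the selection and testing steps, under stated assumptions: ecotopes arrive as `(Gi*, set of unit ids)` pairs, and the Monte Carlo test is a generic permutation test in which `statistic` is any function of the value list. For AMOEBA, `statistic` would regrow ecotopes on the shuffled values and return the highest Gi*. Both function names and the toy inputs are hypothetical.

```python
import random

def select_clusters(ecotopes):
    """Pick ecotopes in descending Gi* order, skipping any that intersect
    an ecotope already selected as a cluster."""
    clusters, used = [], set()
    for score, members in sorted(ecotopes, key=lambda e: e[0], reverse=True):
        if used.isdisjoint(members):
            clusters.append((score, members))
            used |= members
    return clusters

def monte_carlo_p(values, statistic, n_sims=999, seed=0):
    """Pseudo p-value: share of random permutations of `values` whose
    statistic is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = statistic(values)
    shuffled = list(values)
    exceed = 0
    for _ in range(n_sims):
        rng.shuffle(shuffled)
        if statistic(shuffled) >= observed:
            exceed += 1
    return (exceed + 1) / (n_sims + 1)

ecotopes = [(1.9, {3, 4}), (2.2, {2, 3, 4}), (1.2, {8, 9}), (1.5, {7, 8})]
print(select_clusters(ecotopes))

# Stand-in statistic for the demo: the sum of the first three units.
print(monte_carlo_p([10, 9, 8, 1, 1, 1, 1, 1, 1, 1], lambda v: sum(v[:3])))
```

With 999 simulations the smallest reportable pseudo p-value is 0.001, so the number of simulations sets the resolution of the test.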
Incorporating Time into AMOEBA
Remember: spatio-temporal clusters may exhibit behaviors not seen in purely spatial clusters.
- Growth
- Movement
- Splits / Joins
Visualize temporal data as layers of data, with time extending vertically through the layers.
Each spatio-temporal unit has spatial neighbors and temporal neighbors.
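One way to picture the layered structure is to expand a spatial neighbor list into a spatio-temporal one. The sketch below is an assumption about how that could be done, not the author's R implementation: spatial units are numbered 0..n-1, unit i at time step t gets the id t * n + i, and the only temporal neighbors are the same unit one step earlier and one step later. The function name `st_neighbors` is hypothetical.

```python
def st_neighbors(spatial_neighbors, n_times):
    """Expand {unit: [spatial neighbors]} into a space-time neighbor list.

    Unit i at time step t is given the id t * n + i, where n is the
    number of spatial units (assumed to be numbered 0..n-1).
    """
    n = len(spatial_neighbors)
    st = {}
    for t in range(n_times):
        for i, nbrs in spatial_neighbors.items():
            links = [t * n + j for j in nbrs]      # neighbors in the same layer
            if t > 0:
                links.append((t - 1) * n + i)      # same unit, previous layer
            if t < n_times - 1:
                links.append((t + 1) * n + i)      # same unit, next layer
            st[t * n + i] = sorted(links)
    return st

# Three polygons in a row, observed over three time steps.
spatial = {0: [1], 1: [0, 2], 2: [1]}
st = st_neighbors(spatial, n_times=3)
print(st[4])  # neighbors of polygon 1 at the middle time step
```

With a neighbor list of this form, an ecotope can grow through space and time using the same search that the spatial procedure uses, since each space-time unit is simply another node with neighbors.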
The Spatio-Temporal Scan Statistic
See Kulldorff et al. 1998.
Spatio-Temporal AMOEBA
Software Environment and Test Data
The R Project
- Free, open source statistical software
- Extendable with user-contributed packages
- www.r-project.org
Google Flu Trends
- Estimates flu incidence levels using aggregated data about user searches for certain keywords
- 90% accurate compared to CDC data
- State-level data, updated daily
- www.google.org/googleflu
SEER (Surveillance, Epidemiology, and End Results)
- National Cancer Institute incidence, survival, and mortality data
AMOEBA ArcToolbox for ArcGIS
Python scripts by Jared Aldstadt and Yeming Fan (Aldstadt, 2010)
Google Flu Trends, Feb 1, 2009
Spatio-Temporal AMOEBA in Python: 2009 Flu Epidemic
Hmmm…
R Programming Progress
Complete …
- Geoprocessing tasks
- Create spatio-temporal neighbor list
- Delineate ecotopes
- Sort and eliminate intersecting ecotopes
- Returns primary cluster PolyIDs that match the Python results
To Do …
- Monte Carlo simulation
- Process results and add to the output shapefile
- Test, test, test
References
Aldstadt, Jared, and Arthur Getis. 2006. Using AMOEBA to Create a Spatial Weights Matrix and Identify Spatial Clusters. Geographical Analysis 38: 327-343.
Aldstadt, Jared. 2010. Spatial Analysis Tools (ArcGIS). Spatial Analysis Tools. http://www.acsu.buffalo.edu/~geojared/tools.htm.
Bellec, S, D Hémon, J Rudant, A Goubin, and J Clavel. 2006. Spatial and space–time clustering of childhood acute leukaemia in France from 1990 to 2000: a nationwide study. British Journal of Cancer.
Duczmal, Luiz, Martin Kulldorff, and Lan Huang. 2006. Evaluation of Spatial Scan Statistics for Irregularly Shaped Clusters. Journal of Computational and Graphical Statistics 15(2): 428-442.
Knox, G. 1989. Detection of Clusters. In Methodology of Enquiries into Disease Clustering, ed. P Elliott, 17-22. London: Small Area Health Statistics Unit.
Kulldorff, Martin, William Athas, Eric Feuer, Barry Miller, and Charles Key. 1998. Evaluating cluster alarms: A space-time scan statistic and brain cancer in Los Alamos, New Mexico. American Journal of Public Health 88(9): 1377-1380.
Kulldorff, Martin, Lan Huang, Linda Pickle, and Luiz Duczmal. 2006. An elliptic spatial scan statistic. Statistics in Medicine 25(22): 3929.
Kulldorff, Martin. 1999. Geographic Information Systems (GIS) community health: Some statistical issues. Journal of Public Health Management and Practice 5(2): 100-106.
Original artwork for parody title slide: http://projectswordtoys.blogspot.com/2009/05/project-sword-annual-1967.html