Geographic Information Science

Slides:



Advertisements
Similar presentations
Summary of A Spatial Scan Statistic by M. Kulldorff Presented by Gauri S. Datta SAMSI September 29, 2005.
Advertisements

ANALYSIS OF VARIANCE (ONE WAY)
Mustafa Cayci INFS 795 An Evaluation on Feature Selection for Text Clustering.
Global Clustering Tests. Tests for Spatial Randomness H 0 : The risk of disease is the same everywhere after adjustment for age, gender and/or other covariates.
Learning Objectives Copyright © 2004 John Wiley & Sons, Inc. Data Processing, Fundamental Data Analysis, and Statistical Testing of Differences CHAPTER.
Statistical approaches for detecting clusters of disease. Feb. 26, 2013 Thomas Talbot New York State Department of Health Bureau of Environmental and Occupational.
Statistical Issues in Research Planning and Evaluation
Early Detection of Disease Outbreaks Prospective Surveillance.
Zakaria A. Khamis GE 2110 GEOGRAPHICAL STATISTICS GE 2110.
Empirical/Asymptotic P-values for Monte Carlo-Based Hypothesis Testing: an Application to Cluster Detection Using the Scan Statistic Allyson Abrams, Martin.
University of Wisconsin-Milwaukee Geographic Information Science Geography 625 Intermediate Geographic Information Science Instructor: Changshan Wu Department.
Topic 6: Introduction to Hypothesis Testing
Chapter 10 Quality Control McGraw-Hill/Irwin
Smoothed Maps. This is a Smoothed Map Ideas Behind Smoothing To avoid arbitrary political boundaries To adjust unstable estimates towards a global mean.
Detecting Spatial Clustering in Matched Case-Control Studies Andrea Cook, MS Collaboration with: Dr. Yi Li November 4, 2004.
 Statistical approaches for detecting unexplained clusters of disease.  Spatial Aggregation Thomas Talbot New York State Department of Health Environmental.
Chapter 4 Probability Distributions
Lecture 9: One Way ANOVA Between Subjects
CHAPTER 6 Statistical Analysis of Experimental Data
Cumulative Geographic Residual Test Example: Taiwan Petrochemical Study Andrea Cook.
Why Geography is important.
Advanced GIS Using ESRI ArcGIS 9.3 Arc ToolBox 5 (Spatial Statistics)
PSY 307 – Statistics for the Behavioral Sciences Chapter 8 – The Normal Curve, Sample vs Population, and Probability.
Chapter 14 Inferential Data Analysis
Experimental Research
University of Wisconsin-Milwaukee Geographic Information Science Geography 625 Intermediate Geographic Information Science Instructor: Changshan Wu Department.
Mapping Rates and Proportions. Incidence rates Mortality rates Birth rates Prevalence Proportions Percentages.
Ch 5 Practical Point Pattern Analysis Spatial Stats & Data Analysis by Magdaléna Dohnalová.
University of Wisconsin-Milwaukee Geographic Information Science Geography 625 Intermediate Geographic Information Science Instructor: Changshan Wu Department.
1 of 27 PSYC 4310/6310 Advanced Experimental Methods and Statistics © 2013, Michael Kalsher Michael J. Kalsher Department of Cognitive Science Adv. Experimental.
Using ArcGIS/SaTScan to detect higher than expected breast cancer incidence Jim Files, BS Appathurai Balamurugan, MD, MPH.
The Spatial Scan Statistic. Null Hypothesis The risk of disease is the same in all parts of the map.
Epidemiology The Basics Only… Adapted with permission from a class presentation developed by Dr. Charles Lynch – University of Iowa, Iowa City.
Spatial Statistics Applied to point data.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Review and Preview This chapter combines the methods of descriptive statistics presented in.
Slide 1 Copyright © 2004 Pearson Education, Inc..
Educational Research: Competencies for Analysis and Application, 9 th edition. Gay, Mills, & Airasian © 2009 Pearson Education, Inc. All rights reserved.
Dynamic Lines. Dynamic analysis n Health of people and activity of medical establishments change in time. n Studying of dynamics of the phenomena is very.
Geographic Information Science
Extending Spatial Hot Spot Detection Techniques to Temporal Dimensions Sungsoon Hwang Department of Geography State University of New York at Buffalo DMGIS.
University of Wisconsin-Milwaukee Geographic Information Science Geography 625 Intermediate Geographic Information Science Instructor: Changshan Wu Department.
BINOMIALDISTRIBUTION AND ITS APPLICATION. Binomial Distribution  The binomial probability density function –f(x) = n C x p x q n-x for x=0,1,2,3…,n for.
Zakaria A. Khamis GE 2110 GEOGRAPHICAL STATISTICS GE 2110.
BLAST: Basic Local Alignment Search Tool Altschul et al. J. Mol Bio CS 466 Saurabh Sinha.
BPS - 5th Ed. Chapter 221 Two Categorical Variables: The Chi-Square Test.
Statistical Inference for the Mean Objectives: (Chapter 9, DeCoursey) -To understand the terms: Null Hypothesis, Rejection Region, and Type I and II errors.
Relative Values. Statistical Terms n Mean:  the average of the data  sensitive to outlying data n Median:  the middle of the data  not sensitive to.
KNR 445 Statistics t-tests Slide 1 Introduction to Hypothesis Testing The z-test.
Point Pattern Analysis Point Patterns fall between the two extremes, highly clustered and highly dispersed. Most tests of point patterns compare the observed.
So, what’s the “point” to all of this?….
Methods for point patterns. Methods consider first-order effects (e.g., changes in mean values [intensity] over space) or second-order effects (e.g.,
Point Pattern Analysis
1 CLUSTER VALIDITY  Clustering tendency Facts  Most clustering algorithms impose a clustering structure to the data set X at hand.  However, X may not.
A Spatial-Temporal Model for Identifying Dynamic Patterns of Epidemic Diffusion Tzai-Hung Wen Associate Professor Department of Geography,
Hypothesis Testing Introduction to Statistics Chapter 8 Feb 24-26, 2009 Classes #12-13.
Statistical Significance: Tests for Spatial Randomness.
Zakaria A. Khamis GE 2110 GEOGRAPHICAL STATISTICS GE 2110.
BPS - 5th Ed. Chapter 221 Two Categorical Variables: The Chi-Square Test.
Quality Control Copyright © 2015 McGraw-Hill Education. All rights reserved. No reproduction or distribution without the prior written consent of McGraw-Hill.
BIOL 582 Lecture Set 2 Inferential Statistics, Hypotheses, and Resampling.
Spatial Scan Statistic for Geographical and Network Hotspot Detection C. Taillie and G. P. Patil Center for Statistical Ecology and Environmental Statistics.
NURS 306, Nursing Research Lisa Broughton, MSN, RN, CCRN RESEARCH STATISTICS.
Criminal Justice and Criminology Research Methods, Second Edition Kraska / Neuman © 2012 by Pearson Higher Education, Inc Upper Saddle River, New Jersey.
Methods of Presenting and Interpreting Information Class 9.
Density Estimation Converts points to a raster
Cases and controls A case is an individual with a disease, whose location can be represented by a point on the map (red dot). In this table we examine.
Spatial analysis Measurements - Points: centroid, clustering, density
Types of Errors Type I error is the error committed when a true null hypothesis is rejected. When performing hypothesis testing, if we set the critical.
Elementary Statistics
Simulation Supplement B.
Presentation transcript:

Geographic Information Science Geography 625 Intermediate Geographic Information Science Week5: Practical Point Pattern Analysis Instructor: Changshan Wu Department of Geography The University of Wisconsin-Milwaukee Fall 2006

Outline Point pattern analysis versus cluster detection Extensions to point pattern measures Multiple sets of events Space-time analysis

Point Pattern Analysis Versus Cluster Detection The application of pure spatial statistical analysis to real-world data and problems is only rarely possible or useful IRP/CSR is rarely an adequate null hypothesis underlying population distribution 2. The pattern-process comparison approach is a global technique, concerned with the overall characteristics of a pattern and saying little about where the pattern deviates from expectations. 3. Cluster detection is often the most important task, because identifying locations where there are more events than expected may be an important first step in determining what causes them.

Cluster Detection Whether leukemia disease has a connection with nuclear plant Draw circles centered at the plant (1km, 2 km, …) Count the number of disease events within 1, 2,…, 10 km of the nuclear plant Count the total population at risk in these bands Determine the rates of occurrence for each band Compare the rates to expected values generated either analytically or by simulation

Cluster Detection Problems The boundaries given by the distance bands are arbitrary and because they can be varied, are subject to the MAUP The test is a post hoc, after the fact, test. It is unfair to choose only the plant as the center of the circles.

Cluster Detection - Geographical Analysis Machine (GAM) Developed by Openshaw et al (1987) A brave but controversial attempt GAM is an automated cluster detector for point patterns that made an exhaustive search using all possible centers of all possible clusters GAM was originally developed to study clustering of certain cancers, especially childhood leukemia, around nuclear facilities in England

Cluster Detection - Geographical Analysis Machine (GAM) Step 1: Lay out a two-dimensional grid over the study region. Step 2: Treat each grid point as the center of a series of search circles Step 3: generate circles of a defined sequence of radii (e.g. 1.0, 2.0, …, 20 km)

Cluster Detection - Geographical Analysis Machine (GAM) Step 4: for each circle, count the number of events and population at risk falling within it Step 5: Determine whether or not this exceeds a specified density threshold. Step 6. If the incidence rate in a circle exceeds some threshold, draw the circle on a map.

Cluster Detection - Geographical Analysis Machine (GAM) Threshold levels for determining significant circles were assessed using Monte Carlo simulation. Calculate the average incidence rate Randomly assign Leukemia to census enumeration districts (ED) such that each ED has same rate. Run 99 times The actual count of Leukemia cases in each circle was compared to the count that would have been observed for each of the 99 simulated outcomes. Any circle whose observed count was highest among this set of 100 patterns was highlighted p= 0.01

Cluster Detection - Geographical Analysis Machine (GAM) Example: the result of an analysis of clustering of childhood acute lymphoblastic leukemia in a study region in England using GAM (Openshaw et al. 1988)

Cluster Detection - Geographical Analysis Machine (GAM) Explanations for clusters Excess risk among children whose fathers worked at the nuclear facilities, especially those fathers who were exposed to a high dose of ionizing before the children’s conception (Wakeford 1990). Others?

Cluster Detection - Geographical Analysis Machine (GAM) Limitations GAM lacks a clear statistical yardstick for evaluating the number of significant circles that appear on the map. Because the circles overlap, many significant circles often contain the same cluster of cases (Poisson tests are not independent) The GAM maps often give the appearance of excess clustering, with a high percentage of “false positive” circles.

Cluster Detection - Spatial Scan Statistic Developed by Kulldorff (1997) Similar to GAM, the method utilized a field approach and searches over a regular grid using circles of different sizes For each circle, the method computes the likelihood that the risk of disease is elevated inside the circle compared to outside the circle. The circle with the highest likelihood value is the circle that has the highest probability of containing a disease cluster Software can be downloaded from NIH web (http://dcp.nci.nih.gov/bb/satscan.html)

Cluster Detection - Spatial Scan Statistic Applied to breast cancer mortality in the northeast United States. Found one statistically significant cluster extending from the New York metropolitan area through parts of New Jersey to Philadelphia.

Cluster Detection - Rushton and Lolonis’s Method It uses a window of constant size to scan the study area for clusters It provides information about the likelihood that a cluster might have occurred by chance Monte Carlo procedures are used to simulate possible spatial patterns of health events

Cluster Detection - Rushton and Lolonis’s Method Analyze spatial clustering of birth defects in Des Moines, Iowa.

Cluster Detection - An Object Oriented Approach Besag and Newell (1991) devised a spatial clustering method that only searches for clusters around cases. Their method adopts an “object” approach, treating health events as objects, instead of field approach of the previous methods. This greatly reduces the amount of spatial search and computation.

Cluster Detection - An Object Oriented Approach Step1: Specify k as the minimum event number of a cluster Step2: for an event i, find the nearest k events Step3: Identify the geographic area Mi that contains these k events Step 4: For Mi, calculate the total event number and the population Step 5: compare the incidence rate to the average rate (simulation). K = 5

Cluster Detection - An Object Oriented Approach Applications Timander & McLafferty (1998) applied this method to search for spatial clustering of breast cancer cases among long-term residents of West Islip, New York.

Extensions to Point Pattern Measures - Multiple Sets of Events Two or more point patterns Do the two point patterns differ significantly? Does one point pattern influence another? Two approaches 1) Contingency table analysis 2) Distance cross functions

Extensions to Point Pattern Measures - Multiple Sets of Events 1) Contingency table analysis A B

Extensions to Point Pattern Measures - Multiple Sets of Events 2) Distance cross functions K12(d) is large, means clustered or dispersed? Event 1 Event 2

Extensions to Point Pattern Measures - Multiple Sets of Events 2) Distance cross functions If pattern 1 represents cases of a disease and pattern 2 represents the at-risk population If D(d) > 0, then ? If D(d) = 0, then ? Event 1 Event 2

Extensions to Point Pattern Measures - Space-time analysis 1. Knox test: for n events we form the n(n-1)/2 possible pairs and find for each their distance in both time and space 2. Decide thresholds for time (near-far) and distance (close-distant) 3. Put them in the contingency table

Extensions to Point Pattern Measures - Space-time analysis Space-time K function λK(d,t) = E (no. events within distance d and time t of an arbitrary event) If there is no space-time interaction Diggle test (1995)