Methods for point patterns

Methods consider first-order effects (e.g., changes in the mean value [intensity] over space) or second-order effects (e.g., spatial interaction between events). Edge effects can bias the results, so either use a guard region (a buffer around the edge) or a toroidal shift technique.
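One way to picture the toroidal idea is to wrap a rectangular study area so that opposite edges meet, and to measure distances on the wrapped surface. The sketch below assumes a rectangular region with its origin at (0, 0); the function name `toroidal_distances` and the sample coordinates are illustrative and do not come from the slides.

```python
import numpy as np

def toroidal_distances(points, width, height):
    """Pairwise distances on a width x height rectangle wrapped into a torus."""
    pts = np.asarray(points, dtype=float)
    # Absolute coordinate differences for every pair of points: shape (n, n, 2).
    diff = np.abs(pts[:, None, :] - pts[None, :, :])
    size = np.array([width, height], dtype=float)
    # On a torus the shorter way round may cross an edge of the rectangle.
    diff = np.minimum(diff, size - diff)
    return np.sqrt((diff ** 2).sum(axis=-1))

# Two events near opposite edges of a unit square, plus one in the middle.
pts = np.array([[0.05, 0.5], [0.95, 0.5], [0.5, 0.5]])
d = toroidal_distances(pts, 1.0, 1.0)
print(d.round(3))
```

With wrapping, the two edge events become close neighbours, which is how the technique compensates for neighbours that would lie just outside the study area. A guard region would instead restrict the analysis to events further than some buffer distance from the edge.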

Exploring point patterns

Quadrat analysis: either completely cover the study area with quadrats (e.g., lay a grid over it) or scatter quadrats at random (e.g., in a field study). Problems: the choice of quadrat size, the loss of relative locations within each quadrat, and orientation (the results vary under rotation of the grid).
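A minimal sketch of the grid version of quadrat counting, assuming a rectangular study area and synthetic uniform points. The helper `quadrat_counts` and the two grid sizes are illustrative choices, picked to show how strongly the counts depend on quadrat size.

```python
import numpy as np

def quadrat_counts(points, nx, ny, extent=(0.0, 1.0, 0.0, 1.0)):
    """Count events in an nx-by-ny grid of quadrats covering the extent."""
    pts = np.asarray(points, dtype=float)
    xmin, xmax, ymin, ymax = extent
    counts, _, _ = np.histogram2d(
        pts[:, 0], pts[:, 1],
        bins=[nx, ny],
        range=[[xmin, xmax], [ymin, ymax]],
    )
    return counts.astype(int)

rng = np.random.default_rng(0)
pts = rng.random((200, 2))           # synthetic events in the unit square

coarse = quadrat_counts(pts, 2, 2)   # 4 large quadrats
fine = quadrat_counts(pts, 10, 10)   # 100 small quadrats
print(coarse)
print(fine.mean(), fine.var(ddof=1))
```

Once the counts are taken, the positions of events inside each quadrat are no longer available, which is the loss of relative location noted above.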

Kernel estimation: obtain a smooth estimate of the probability density (in effect a smoothed histogram). Issues: the choice of ‘kernel’ and of bandwidth (a fixed distance, or adaptive [a fixed # of points]).

Kernel estimation weights points that are further away less than those that are close: in the figure, Point A will count less than Point B. Tau (τ) is the bandwidth and determines the resolution. Edge corrections should be used if points are near the edge.
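As a sketch of how the distance weighting works, the code below uses a quartic kernel with a fixed bandwidth `tau`. The quartic form is an assumption (the slides do not name a kernel), and no edge correction is applied.

```python
import numpy as np

def quartic_intensity(points, grid_xy, tau):
    """Kernel intensity estimate at each grid location (quartic kernel, fixed bandwidth tau)."""
    pts = np.asarray(points, dtype=float)
    grid = np.asarray(grid_xy, dtype=float)
    # Squared distance from every grid location to every event: shape (n_grid, n_events).
    d2 = ((grid[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1)
    u2 = d2 / tau ** 2
    # Weight falls smoothly from its maximum at distance 0 to zero at distance tau.
    w = np.where(u2 <= 1.0, 3.0 / (np.pi * tau ** 2) * (1.0 - u2) ** 2, 0.0)
    return w.sum(axis=1)

event = np.array([[0.5, 0.5]])
# Three locations: on the event, 0.1 away, and 0.3 away (beyond the bandwidth).
grid = np.array([[0.5, 0.5], [0.6, 0.5], [0.8, 0.5]])
est = quartic_intensity(event, grid, tau=0.2)
print(est)
```

The event contributes most at its own location, less at a distance of 0.1, and nothing beyond τ. A larger τ gives a smoother, lower-resolution surface.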

Nearest neighbour distances explore second-order properties. There are two basic types: the nearest neighbour event-event distance, G(w), and the nearest neighbour point-event distance, F(x), where the point is a randomly selected location in the study area.

NN results are typically plotted as a cumulative function of distance class: the # of nearest neighbour distances less than or equal to the distance class, divided by the number of events (or random points sampled). For G(w), a curve that rises rapidly at the beginning suggests clustering of events, and one that rises late suggests regularity; F(x) responds in the opposite direction.
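The two cumulative curves can be sketched as follows, assuming synthetic events in a unit square and ignoring edge correction. The helpers `nn_distances` and `ecdf`, and the distance classes in `bins`, are illustrative names and values.

```python
import numpy as np

def nn_distances(from_pts, to_pts, exclude_self=False):
    """Distance from each point in from_pts to its nearest point in to_pts."""
    d = np.sqrt(((from_pts[:, None, :] - to_pts[None, :, :]) ** 2).sum(axis=-1))
    if exclude_self:
        # Event-event case: an event is not its own neighbour.
        np.fill_diagonal(d, np.inf)
    return d.min(axis=1)

def ecdf(distances, bins):
    """Proportion of distances less than or equal to each distance class."""
    return np.array([(distances <= b).mean() for b in bins])

rng = np.random.default_rng(1)
events = rng.random((100, 2))
random_pts = rng.random((100, 2))    # random locations used for F
bins = np.linspace(0.0, 0.3, 16)

G = ecdf(nn_distances(events, events, exclude_self=True), bins)   # event-event
F = ecdf(nn_distances(random_pts, events), bins)                  # point-event
print(np.column_stack([bins, G, F]).round(2))
```

For a pattern like this one, generated without interaction, the two columns should be broadly similar. Plotting `G` against `F` gives the diagnostic described on the next slide.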

You could also plot G(w) against F(x). If there is no interaction, the plot should be roughly a straight line. If the events are clustered, the G(w) values should be higher than the F(x) values.

K functions: basically, the expected number of events within distance h of an arbitrary event. Consider drawing buffers around each event (distance 1, 2, 3, …), counting the cumulative # of other events within each buffer, and taking the average for each distance class, scaled by area / n (so the summed counts are scaled by area / n²). One could then plot K against h.
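The buffer-counting description can be sketched directly. This version assumes the scaling is area / n² applied to the summed counts, applies no edge correction, and uses a tiny hand-made pattern so the numbers can be checked by eye; `k_function` is an illustrative name.

```python
import numpy as np

def k_function(points, hs, area):
    """Naive K estimate: area / n^2 times the number of ordered pairs within h."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    d = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1))
    np.fill_diagonal(d, np.inf)      # an event does not count itself
    counts = np.array([(d <= h).sum() for h in hs])
    return area * counts / n ** 2

# Four events on the corners of a unit cell inside a 10 x 10 study area.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
hs = np.array([0.5, 1.0, 1.5])
K = k_function(pts, hs, area=100.0)
print(K)   # [ 0. 50. 75.]
```

At h = 1 each event has two neighbours (8 ordered pairs), and at h = 1.5 the diagonal neighbours are included as well (12 ordered pairs).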

K functions are not typically plotted as such; they are normally transformed into L plots. L values are K values ‘normalized’ by taking into account the expected number of points per distance class, assuming a homogeneous point pattern with no spatial dependence.
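The slides do not give the transformation itself. A commonly used form, assumed here, is L(h) = sqrt(K(h) / π) – h, which is zero when K(h) takes the value πh² expected under a homogeneous pattern with no spatial dependence.

```python
import numpy as np

def l_function(k_values, hs):
    """Assumed normalization: L(h) = sqrt(K(h) / pi) - h, which is 0 when K(h) = pi * h^2."""
    k_values = np.asarray(k_values, dtype=float)
    hs = np.asarray(hs, dtype=float)
    return np.sqrt(k_values / np.pi) - hs

hs = np.linspace(0.01, 0.25, 25)
k_homogeneous = np.pi * hs ** 2           # theoretical K with no spatial dependence
L_homogeneous = l_function(k_homogeneous, hs)
L_clustered = l_function(2.0 * k_homogeneous, hs)   # twice as many neighbours as expected
print(L_homogeneous.round(6)[:3], L_clustered.round(3)[:3])
```

The first curve is flat at zero and the second is positive at every distance, which is why an L plot is easier to read than the raw K curve.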

L(h) plots: peaks indicate clustering and troughs regularity, as a function of distance (h). K functions are the preferred means of examining point patterns because they consider all scales (not just the nearest neighbours) and have a theoretical basis.

Modelling spatial point patterns

The standard model for complete spatial randomness (CSR) is the homogeneous Poisson process. Poisson distributions have mean = variance, which enables us to test some of the exploratory point pattern statistics just introduced. This is the most commonly assumed process.

Alternatives to CSR include: the heterogeneous Poisson process (the intensity of the process varies across space [e.g., a trend surface]); the Cox process (the intensity of the process varies randomly across space); and the Poisson cluster process (a two-level Poisson process: parent points are generated under CSR, a random number of offspring are placed around each parent, and only the offspring remain).
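A rough simulation sketch of the homogeneous Poisson process and of the Poisson cluster process. The Gaussian scatter of offspring around each parent, the parameter values, and the function names are assumptions made for illustration; the slides only say that offspring are placed around each parent.

```python
import numpy as np

def simulate_csr(intensity, rng, width=1.0, height=1.0):
    """Homogeneous Poisson process: Poisson number of events, uniform locations."""
    n = rng.poisson(intensity * width * height)
    return rng.random((n, 2)) * [width, height]

def simulate_poisson_cluster(parent_intensity, mean_offspring, spread, rng):
    """Parents under CSR, a Poisson number of offspring around each, parents discarded."""
    parents = simulate_csr(parent_intensity, rng)
    offspring = []
    for parent in parents:
        k = rng.poisson(mean_offspring)
        # Assumption: offspring are displaced from the parent by Gaussian noise.
        offspring.append(parent + rng.normal(0.0, spread, size=(k, 2)))
    if not offspring:
        return np.empty((0, 2))
    return np.vstack(offspring)      # only the offspring remain

rng = np.random.default_rng(3)
csr = simulate_csr(200, rng)
clustered = simulate_poisson_cluster(10, 20, 0.02, rng)
print(len(csr), len(clustered))
```

Offspring near the boundary can fall outside the unit square in this sketch; a fuller version would clip them or simulate parents in an enlarged region.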

Simple inhibition processes (SIP): a hard core process, in which a CSR process is thinned by removing all pairs of points less than distance d apart. Markov point processes: similar to SIP, but allowing for the possibility that point pairs could be found less than distance d apart.
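The hard core thinning rule can be sketched as below: any event with another event closer than d is removed, so both members of a close pair disappear. The function name `hard_core_thin` and the test values are illustrative.

```python
import numpy as np

def hard_core_thin(points, d):
    """Remove every event that has another event closer than distance d."""
    pts = np.asarray(points, dtype=float)
    if len(pts) < 2:
        return pts
    dist = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)
    keep = dist.min(axis=1) >= d     # both members of a close pair are dropped
    return pts[keep]

# Small check: the first two events are 0.01 apart, so both are removed.
small = np.array([[0.0, 0.0], [0.01, 0.0], [0.5, 0.5]])
small_kept = hard_core_thin(small, 0.05)

# Thinning a simulated CSR pattern.
rng = np.random.default_rng(4)
csr = rng.random((200, 2))
thinned = hard_core_thin(csr, 0.03)
print(len(small_kept), len(csr), len(thinned))
```

The surviving pattern is more regular than CSR at distances below d, because no two retained events can be closer than d.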

Quadrat analyses: compute a chi-square value by multiplying the variance of the quadrat counts by (n – 1) and dividing by the mean, where n is the number of quadrats. Significantly large values imply clustering; significantly small values imply regularity. An index of cluster size (ICS) = (variance / mean) – 1. If CSR holds, E(ICS) = 0. If ICS > 0 then clustering is implied; if ICS < 0, regularity is implied.
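A small worked sketch of the two statistics, using made-up counts for four quadrats and the sample variance (divisor n – 1). The function name `dispersion_test` is illustrative.

```python
import numpy as np

def dispersion_test(counts):
    """Chi-square statistic (n - 1) * variance / mean and ICS = variance / mean - 1."""
    c = np.asarray(counts, dtype=float).ravel()
    mean = c.mean()
    var = c.var(ddof=1)              # sample variance of the quadrat counts
    chi2 = (c.size - 1) * var / mean
    ics = var / mean - 1.0
    return chi2, ics

# Eight events spread evenly over four quadrats versus all eight in one quadrat.
chi2_regular, ics_regular = dispersion_test([2, 2, 2, 2])
chi2_clustered, ics_clustered = dispersion_test([8, 0, 0, 0])
print(chi2_regular, ics_regular)       # 0.0 -1.0
print(chi2_clustered, ics_clustered)   # 24.0 7.0
```

Under CSR the chi-square value is usually referred to a chi-square distribution with n – 1 degrees of freedom, so 24 on 3 degrees of freedom would point clearly to clustering.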

Nearest neighbour tests: based on CSR we can derive a theoretical distribution of NN distances. Edge effects are problematic when working with theoretical distributions, so normally a computationally intensive (Monte Carlo) approach is taken.

The method: simulate m patterns of n events under CSR and determine the NN distribution for each. For each distance class, determine the mean as well as the min and max values observed across the simulations. The simulated values (mean, min, max) are then plotted against the observed values.

If CSR holds, the plot should be roughly linear at a 45° angle. If clustering is present the plot will lie above the line; if regularity is present it will lie below the line (assuming the simulated values are plotted on the x-axis).
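The Monte Carlo procedure for nearest neighbour distances can be sketched as follows. The sketch assumes a unit-square study area, m = 99 simulations, and a synthetic clustered pattern standing in for the observed data. All names and parameter values are illustrative.

```python
import numpy as np

def g_function(points, bins):
    """Cumulative proportion of event-event nearest neighbour distances per distance class."""
    d = np.sqrt(((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1))
    np.fill_diagonal(d, np.inf)
    nn = d.min(axis=1)
    return np.array([(nn <= b).mean() for b in bins])

def csr_envelope(n, m, bins, rng):
    """Mean, min and max of G over m simulations of n uniform events."""
    sims = np.array([g_function(rng.random((n, 2)), bins) for _ in range(m)])
    return sims.mean(axis=0), sims.min(axis=0), sims.max(axis=0)

rng = np.random.default_rng(42)
bins = np.linspace(0.0, 0.2, 21)

# Synthetic "observed" pattern: 5 tight clusters of 10 events each.
centres = rng.random((5, 2)) * 0.6 + 0.2
observed = np.vstack([c + rng.normal(0.0, 0.02, size=(10, 2)) for c in centres])

g_obs = g_function(observed, bins)
g_mean, g_min, g_max = csr_envelope(len(observed), 99, bins, rng)
above = g_obs > g_max     # distance classes where the observed curve exceeds every simulation
print(bins[above])
```

Plotting `g_obs` against `g_mean` (with `g_min` and `g_max` as the envelope) gives the 45° diagnostic. Because the same unit square is used for every simulation, edge effects apply to the observed and simulated values alike.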

K functions: similar to the NN method. Produce m simulations and plot the observed L values and the min/max envelopes against the distance (h).
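The same envelope idea applied to L values, sketched here with a synthetic regular pattern (a slightly jittered 7 x 7 lattice) as the observed data. The form L(h) = sqrt(K(h) / π) – h, the lack of edge correction, and all names and parameters are assumptions for illustration.

```python
import numpy as np

def l_function(points, hs, area=1.0):
    """Naive L estimate: sqrt(K / pi) - h, with K = area / n^2 * (ordered pairs within h)."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    d = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1))
    np.fill_diagonal(d, np.inf)
    counts = np.array([(d <= h).sum() for h in hs])
    return np.sqrt(area * counts / n ** 2 / np.pi) - hs

rng = np.random.default_rng(11)
hs = np.linspace(0.02, 0.2, 10)

# Synthetic "observed" pattern: a 7 x 7 lattice with a little jitter.
gx, gy = np.meshgrid((np.arange(7) + 0.5) / 7, (np.arange(7) + 0.5) / 7)
observed = np.column_stack([gx.ravel(), gy.ravel()]) + rng.normal(0.0, 0.005, (49, 2))

l_obs = l_function(observed, hs)
sims = np.array([l_function(rng.random((49, 2)), hs) for _ in range(99)])
l_min, l_max = sims.min(axis=0), sims.max(axis=0)
below = l_obs < l_min     # distances where the observed L lies under the envelope
print(hs[below])
```

Troughs below the lower envelope at short distances are the signature of regularity; a clustered pattern would instead rise above `l_max`.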

Comparing multiple types of events

Say we have two types of events (contaminated / clean wells, cases of larynx and lung cancers, crimes committed by blacks / whites) and want to examine whether one type of event is related to the other, i.e., whether one ‘explains’ the other (dependence).

Simplest approach: quadrat analysis. Count the # of events of each type in each quadrat, classify each type as absent or present, and produce a 2 × 2 contingency table (chi-squared statistic) that cross-classifies Type 1 (absent / present) against Type 2 (absent / present).

NN distances: if the two types are independent, then for each distance class the event i – event j NN values and the random point – event j NN values should be equal.

Multivariate (cross) K functions are the preferred means of exploring the relation among multivariate point patterns. The cross K function is E(# of type j events ≤ h from an arbitrary type i event), again normalized, this time by area / (n_i n_j). Interestingly, the expected value of K is not affected if the i or j events are clustered, random or regular when considered separately.
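A sketch of the cross K estimate using the area / (n_i n_j) scaling, with a small hand-made pattern in a 10 x 10 study area so the counts can be verified by eye. No edge correction is applied and `cross_k` is an illustrative name.

```python
import numpy as np

def cross_k(pts_i, pts_j, hs, area):
    """Naive cross K: area / (n_i * n_j) times the number of (i, j) pairs within h."""
    pts_i = np.asarray(pts_i, dtype=float)
    pts_j = np.asarray(pts_j, dtype=float)
    d = np.sqrt(((pts_i[:, None, :] - pts_j[None, :, :]) ** 2).sum(axis=-1))
    counts = np.array([(d <= h).sum() for h in hs])
    return area * counts / (len(pts_i) * len(pts_j))

type_i = np.array([[2.0, 2.0], [8.0, 8.0]])
type_j = np.array([[2.0, 3.0], [8.0, 7.0], [5.0, 5.0]])
hs = np.array([0.5, 1.0, 5.0])
K_ij = cross_k(type_i, type_j, hs, area=100.0)
print(K_ij.round(2))   # [ 0.   33.33 66.67]
```

Each type i event has one type j event exactly 1 unit away and another about 4.24 units away, giving 0, 2 and 4 pairs at the three distances.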

As with K functions, L plots are produced. This time we plot L_ii(h), L_jj(h) and L_ij(h) simultaneously, which reveals whether i or j individually depart from CSR, as well as whether i and j appear to attract (positive peaks) or repel (negative troughs in the plot) each other.

To determine the significance of the i-j L plot, the normal approach is to randomly shift one set of events (a toroidal shift) many times and record the min/max simulated values; if the observed values lie outside this simulation envelope, the peaks / troughs are assumed to be significant.
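A sketch of the toroidal shift test on a unit square. It assumes the cross L form sqrt(K_ij(h) / π) – h, 99 random shifts, and synthetic data in which each type j event sits close to a type i event. Distances are plain Euclidean (not wrapped), which is a simplification.

```python
import numpy as np

def cross_l(pts_i, pts_j, hs, area=1.0):
    """Naive cross L: sqrt(K_ij / pi) - h with K_ij = area / (n_i * n_j) * pair count."""
    d = np.sqrt(((pts_i[:, None, :] - pts_j[None, :, :]) ** 2).sum(axis=-1))
    counts = np.array([(d <= h).sum() for h in hs])
    k = area * counts / (len(pts_i) * len(pts_j))
    return np.sqrt(k / np.pi) - hs

def toroidal_shift(points, rng):
    """Shift all events by one random vector, wrapping around the unit square."""
    return (points + rng.random(2)) % 1.0

rng = np.random.default_rng(7)
type_i = rng.random((60, 2))
# Synthetic attraction: every type j event lies close to a type i event.
type_j = (type_i + rng.normal(0.0, 0.01, size=type_i.shape)) % 1.0

hs = np.linspace(0.01, 0.15, 15)
l_obs = cross_l(type_i, type_j, hs)
sims = np.array([cross_l(type_i, toroidal_shift(type_j, rng), hs) for _ in range(99)])
l_min, l_max = sims.min(axis=0), sims.max(axis=0)
significant = (l_obs > l_max) | (l_obs < l_min)
print(hs[significant])
```

The shift keeps the internal structure of the type j pattern intact and breaks only its relationship to type i, which is why it suits a test of independence between the two types.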

What if you would like to correct for spatial variation in the population at risk? That is, CSR is not a viable null model, since some natural spatial variation in the intensity of the process is expected (e.g., population density and disease incidence, crime patterns).

The simplest approach is to use kernel smoothing, dividing the surface of the variable of interest (say cancer victims, crime incidents) by a surface of the population at risk (say total population, housing density).
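A sketch of the ratio-of-surfaces idea, assuming a quartic kernel, the same fixed bandwidth for both surfaces, and synthetic data in which about 10% of a uniform population are cases. The names, grid and parameters are illustrative.

```python
import numpy as np

def quartic_surface(points, grid, tau):
    """Quartic-kernel intensity estimate at each grid location (no edge correction)."""
    d2 = ((grid[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    u2 = d2 / tau ** 2
    w = np.where(u2 <= 1.0, 3.0 / (np.pi * tau ** 2) * (1.0 - u2) ** 2, 0.0)
    return w.sum(axis=1)

rng = np.random.default_rng(5)
population = rng.random((1000, 2))               # population at risk
cases = population[rng.random(1000) < 0.1]       # roughly 10% become cases

gx, gy = np.meshgrid(np.linspace(0.1, 0.9, 9), np.linspace(0.1, 0.9, 9))
grid = np.column_stack([gx.ravel(), gy.ravel()])

case_surface = quartic_surface(cases, grid, tau=0.15)
pop_surface = quartic_surface(population, grid, tau=0.15)

# Ratio of the two surfaces; left undefined (NaN) where there is no population.
rate = np.divide(case_surface, pop_surface,
                 out=np.full_like(case_surface, np.nan),
                 where=pop_surface > 0)
print(np.nanmin(rate).round(3), np.nanmax(rate).round(3))
```

The ratio is unstable where the population surface is close to zero, so in practice the bandwidth for the denominator is often chosen with some care.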

A more sophisticated approach is the case / control approach: use the observed events from a second spatial process that is representative of the variation in the population at risk (the control process).

K functions are used (similar to the cross K functions described above). To determine the significance of the observed pattern, we randomly relabel the combined events as cases and controls. The values of K_cases(h) – K_controls(h) are plotted against h, along with the min/max of the randomly labelled simulation results.
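The random labelling test can be sketched as follows, assuming a unit-square study area, 99 relabellings, uniform controls and one tight synthetic cluster of cases. No edge correction is applied and the names are illustrative.

```python
import numpy as np

def k_function(points, hs, area=1.0):
    """Naive K estimate: area / n^2 times the number of ordered pairs within h."""
    n = len(points)
    d = np.sqrt(((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1))
    np.fill_diagonal(d, np.inf)
    counts = np.array([(d <= h).sum() for h in hs])
    return area * counts / n ** 2

def k_difference(cases, controls, hs):
    return k_function(cases, hs) - k_function(controls, hs)

rng = np.random.default_rng(9)
hs = np.linspace(0.02, 0.2, 10)
controls = rng.random((150, 2))                                        # population pattern
cases = np.array([0.5, 0.5]) + rng.normal(0.0, 0.03, size=(50, 2))     # one tight cluster

d_obs = k_difference(cases, controls, hs)

# Random labelling: pool the events, then draw 50 "cases" at random each time.
combined = np.vstack([cases, controls])
sims = []
for _ in range(99):
    order = rng.permutation(len(combined))
    sims.append(k_difference(combined[order[:50]], combined[order[50:]], hs))
sims = np.array(sims)
d_min, d_max = sims.min(axis=0), sims.max(axis=0)
outside = (d_obs > d_max) | (d_obs < d_min)
print(hs[outside])
```

Because the locations are held fixed and only the labels are shuffled, the envelope reflects the underlying population pattern, and any excess in `d_obs` is clustering of cases over and above that of the controls.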

Methods for point patterns

I have just touched on a few of the methods that can be used to examine point patterns. There is a similarly wide variety of methods for linear and areal data, and once you get into GLMs you will encounter many sophisticated solutions.