METU, GGIT 538 CHAPTER V MODELING OF POINT PATTERNS.


1 METU, GGIT 538 CHAPTER V MODELING OF POINT PATTERNS

2 METU, GGIT 538 OUTLINE (Last Week) ANALYSIS OF POINT PATTERNS 4.1. Introduction 4.2. Case Studies 4.3. Visualizing Spatial Point Patterns 4.4. Exploring Spatial Point Patterns 4.4.1. Quadrat Methods 4.4.2. Kernel Estimation 4.4.3. Nearest Neighbor Distance 4.4.4. The K Function

3 METU, GGIT 538 OUTLINE MODELING OF POINT PATTERNS 5.1. Complete Spatial Randomness (CSR) 5.2. Simple Quadrat Tests for CSR 5.3. Nearest Neighbor Tests for CSR 5.3.1. Testing for CSR Based on Various Summary Statistics 5.3.2. Testing for CSR Based on Distribution Function 5.4. The K Function Tests for CSR

4 METU, GGIT 538 5.1. Introduction Exploratory analysis is often insufficient on its own, and it may be necessary to go further: to test explicit hypotheses or to construct specific models that explain the observed point pattern. The term modeling refers to statistically comparing various summary measures computed from the observed distribution of events with what a hypothesized model predicts, which leads to designing and testing hypotheses. The most common model used is “complete spatial randomness (CSR)”.

5 METU, GGIT 538 Reasons for Testing Against CSR
- Rejection of CSR is a prerequisite for any serious attempt to model an observed pattern
- Tests are used to explore a set of data and assist in the formulation of alternatives to CSR
- CSR operates as a dividing hypothesis between regular and clustered patterns

6 METU, GGIT 538 5.1. Complete Spatial Randomness (CSR) CSR is the standard model and states that the events follow a homogeneous Poisson process over the study region. In this model, the point pattern is described by the number of events occurring in arbitrary sub-regions or areas, A, of the whole study region R.
- The spatial point process is defined by: $\{Y(A),\ A \subseteq R\}$, where $Y(A)$ is the number of events occurring in the area A.

7 METU, GGIT 538 The hypothesis of complete spatial randomness for a spatial point pattern $\{Y(A),\ A \subseteq R\}$ asserts that:
- The number of events in any planar region with area A follows a Poisson distribution with mean $\lambda A$. → implies constant intensity (no first order effects)
- Given n events in A, the events are an independent random sample from a uniform distribution on A. → implies no spatial interaction

8 METU, GGIT 538 In other words:
1. Any event has an equal probability of occurring at any position in R.
2. The position of any event is independent of the position of any other, i.e. events do not interact with one another.

9 METU, GGIT 538 Therefore, n events from such a process can be simulated by enclosing R in a rectangle, i.e. generating events with x coordinates from a uniform distribution on $(x_1, x_2)$ and y coordinates from a uniform distribution on $(y_1, y_2)$, and keeping only those that fall inside R. The observed pattern of points can then be compared with the patterns simulated under CSR, i.e. CSR represents a baseline hypothesis against which to assess whether observed patterns are regular, clustered or random.
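The simulation step described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `simulate_csr` and the optional `inside` predicate (used to reject points that fall in the enclosing rectangle but outside R) are hypothetical and do not come from the slides.

```python
import random

def simulate_csr(n, x1, x2, y1, y2, inside=None, seed=None):
    """Simulate n events under CSR inside the rectangle (x1, x2) x (y1, y2).

    `inside` is an optional, hypothetical predicate taking an (x, y) tuple;
    candidates for which it returns False are rejected, so that R may be
    any region enclosed by the rectangle.
    """
    rng = random.Random(seed)
    events = []
    while len(events) < n:
        # x and y are drawn independently from uniform distributions
        candidate = (rng.uniform(x1, x2), rng.uniform(y1, y2))
        if inside is None or inside(candidate):
            events.append(candidate)
    return events

# Usage: 100 events in a 10 x 5 rectangular study region
events = simulate_csr(100, 0.0, 10.0, 0.0, 5.0, seed=42)
```

Summary measures computed from many such simulated patterns give a reference against which the same measures from the observed pattern can be judged.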

10 METU, GGIT 538 5.2. Simple Quadrat Tests for CSR The quadrat counts can be tested for CSR by using the so-called index of dispersion test. Let $(x_1, \ldots, x_m)$ be the counts of the number of events in m quadrats, either randomly scattered in R or forming a regular grid covering the whole of R. Randomness can then be tested on the idea that, if these counts follow a Poisson distribution, their mean and variance are expected to be equal, i.e. the variance-mean ratio (VMR) is expected to be 1. $H_0$: the point pattern is random and the mean of the counts equals their variance, $\bar{x} = s^2$

11 METU, GGIT 538 When the test is applied to a particular set of observations, the number of points and the number of grid squares are fixed; consequently the mean will be constant irrespective of whether the points are clustered, random or regular. It is therefore differences in the variance that indicate the nature of the point pattern.
- If the VMR is significantly greater than 1.0, clustering of the points is indicated, whereas a value significantly lower than 1.0 denotes regularity.

12 METU, GGIT 538 $s^2/\bar{x}$ is called the index of dispersion (I) and $s^2/\bar{x} - 1$ is called the index of cluster size (ICS).
- E(ICS) = 0 → CSR
- E(ICS) > 0 → Clustering (extra events)
- E(ICS) < 0 → Regularity (insufficient events)
The index of dispersion test is advantageous since it can be applied in conjunction with the sampling of point patterns. In this case m quadrats are randomly scattered in R and the events are exhaustively counted in each quadrat. Such a sampling scheme can also be used to estimate the intensity, λ, of the events in R.

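13 METU, GGIT 538 The test statistic for I is defined as follows: $X^2 = \frac{(m-1)\,s^2}{\bar{x}} = \frac{\sum_{i=1}^{m}(x_i - \bar{x})^2}{\bar{x}}$ where $\bar{x}$ = mean of the observed counts, $s^2$ = observed variance of the counts, m = number of quadrats (grid squares). Under CSR the theoretical chi-square distribution is: $X^2 \sim \chi^2_{m-1}$, for $m > 6$ and $\bar{x} > 1$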

14 METU, GGIT 538 Properties of Quadrat Tests for CSR
- Under CSR the test statistic for I is distributed as $\chi^2_{m-1}$
- Compare the test statistic with percentage points of $\chi^2_{m-1}$
- Significantly large values indicate clustering
- Significantly small values indicate regularity

15 METU, GGIT 538
# of events/quadrat (n) | # of quadrats with n events (q) | Total # of events in quadrats (X) | $X^2$
0 | 70 | 0 | 0
1 | 42 | 42 | 42
2 | 26 | 52 | 104
3 | 17 | 51 | 153
4 | 3 | 12 | 48
5 | 1 | 5 | 25
6 | 1 | 6 | 36
sum | 160 | 168 | 408
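As a worked illustration, the quantities from the preceding slides can be computed from this frequency table. The sketch assumes the usual form of the index of dispersion test statistic, $(m-1)s^2/\bar{x}$ with $m-1$ degrees of freedom, and reads the row for n = 1 as 42 quadrats, which is the value the column sums 160, 168 and 408 require. The function name `dispersion_summary` is hypothetical.

```python
# Frequency table from the slide: events per quadrat -> number of quadrats
freq = {0: 70, 1: 42, 2: 26, 3: 17, 4: 3, 5: 1, 6: 1}

def dispersion_summary(freq):
    """Return mean, variance, VMR, ICS and the chi-square statistic."""
    m = sum(freq.values())                              # number of quadrats
    total = sum(n * q for n, q in freq.items())         # sum of counts
    total_sq = sum(n * n * q for n, q in freq.items())  # sum of squared counts
    mean = total / m
    variance = (total_sq - m * mean ** 2) / (m - 1)     # sample variance s^2
    vmr = variance / mean                               # variance-mean ratio
    return {
        "m": m,
        "mean": mean,
        "variance": variance,
        "vmr": vmr,
        "ics": vmr - 1.0,                # index of cluster size
        "chi_square": (m - 1) * vmr,     # assumed test statistic
        "df": m - 1,
    }

summary = dispersion_summary(freq)
print(summary)
```

For these data the mean is 168/160 = 1.05 and the VMR comes out at about 1.39, i.e. above 1, which points towards clustering; whether the departure is significant depends on comparing the chi-square statistic (about 220.6) with the percentage points of the chi-square distribution with 159 degrees of freedom.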

16 METU, GGIT 538 If λ is assumed to be constant and CSR holds, the estimate of λ is given by: $\hat{\lambda} = \bar{x}/Q$ where Q is the area of each quadrat. Then the 95 % confidence interval of λ can be estimated by: Where;

17 METU, GGIT 538 Problems Encountered
1. Problem of overlapping quadrats: If randomly scattered quadrats are used, they may overlap each other. Frequent overlap is a problem, since the $x_i$ counts will then not be independent. This can be overcome by using a sampling scheme that guarantees disjoint quadrats.
2. Problem of quadrats overlapping the edge of R: If the quadrats overlap the edge of R, introducing a guard area inside the perimeter of R can be a solution. In this case the quadrats are randomly scattered only throughout the part of R that is not in the guard area, while events in the guard area are still counted in any quadrat that overlaps into it.

18 METU, GGIT 538
3. Problem of choosing an appropriate quadrat size: An empirical suggestion is to aim for a mean quadrat count of about 1.6.
4. Problem of quadrat position: Usually no account is taken of the relative position of quadrats or the relative position of events within a quadrat. One common method that considers the relative position of quadrats is the Greig-Smith procedure:
a. Calculate the variance of the quadrat counts for the original grid
b. Combine adjacent quadrats of the original grid successively into blocks of increasing size, and calculate the variance of the counts at each block size
c. Plot the variance estimates against block size; peaks and troughs indicate the scale of the pattern
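Steps a to c can be illustrated with a deliberately simplified sketch. It only aggregates a square grid of counts into b x b blocks (b = 1, 2, 4, ...) and reports the sample variance of the block totals; it is not the full nested analysis of variance used in the Greig-Smith procedure. The function name `block_variances` and the assumption that the grid side is a power of two are illustrative choices, not taken from the slides.

```python
from statistics import variance

def block_variances(grid):
    """Sample variance of block totals for block sides 1, 2, 4, ...

    `grid` is a square list of lists of quadrat counts whose side length
    is assumed to be a power of two.
    """
    side = len(grid)
    result = {}
    b = 1
    while side // b >= 2:  # keep at least 2 x 2 blocks so a variance exists
        totals = [
            sum(grid[r][c]
                for r in range(r0, r0 + b)
                for c in range(c0, c0 + b))
            for r0 in range(0, side, b)
            for c0 in range(0, side, b)
        ]
        result[b] = variance(totals)
        b *= 2
    return result

# Usage: counts concentrated in two 2 x 2 patches of a 4 x 4 grid
counts = [
    [4, 4, 0, 0],
    [4, 4, 0, 0],
    [0, 0, 4, 4],
    [0, 0, 4, 4],
]
print(block_variances(counts))
```

Plotting the returned values against block size gives the kind of curve described in step c.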

19 METU, GGIT 538 Table 5.1. Available Indexes for testing CSR

20 METU, GGIT 538 5.3. Nearest Neighbor Tests for CSR In order to test for CSR using nearest neighbor distances, the distribution functions G(w) and F(x) under CSR must be known for the specific study area. However, it is usually impossible to obtain G(w) and F(x) exactly because of edge effects, since they depend on the particular shape of R. On the other hand, it is possible to derive theoretical distribution results for W and X if the edge effects are ignored. There are two ways of testing for CSR using nearest neighbor distances:
- Testing based on various summary statistics
- Testing based on the distribution function

21 METU, GGIT 538 5.3.1. Testing for CSR Based on Various Summary Statistics Let the mean density of events per unit area be λ. If CSR holds, events are independent and the number of events in any area is Poisson distributed.
- The probability that no events fall within a circle of radius x around any randomly chosen point is: $e^{-\lambda \pi x^2}$
- The distribution function F(x) of nearest neighbor point-event distances X for CSR is given by: $F(x) = P(X \le x) = 1 - e^{-\lambda \pi x^2}$, $x \ge 0$
This implies that $\pi X^2$ follows an exponential distribution with parameter λ, i.e. $2\pi\lambda X^2$ is distributed as $\chi^2_2$.

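22 METU, GGIT 538 Then it may be deduced that: $E(X) = \frac{1}{2\sqrt{\lambda}}$, $VAR(X) = \frac{4 - \pi}{4\pi\lambda}$. If $X_1, \ldots, X_n$ are independent nearest neighbor distances, then $2\pi\lambda \sum_{i=1}^{n} X_i^2$ is distributed as $\chi^2_{2n}$.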
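The CSR results for nearest neighbor distances (ignoring edge effects) are standard: $F(x) = 1 - e^{-\lambda\pi x^2}$, $E(X) = 1/(2\sqrt{\lambda})$ and $VAR(X) = (4-\pi)/(4\pi\lambda)$. The sketch below writes them as small Python functions and checks the mean numerically using $E(X) = \int_0^\infty (1 - F(x))\,dx$. All function names are hypothetical.

```python
import math

def csr_nn_cdf(x, lam):
    """F(x): probability that the nearest event lies within distance x."""
    return 1.0 - math.exp(-lam * math.pi * x * x)

def csr_nn_mean(lam):
    """E(X) under CSR."""
    return 1.0 / (2.0 * math.sqrt(lam))

def csr_nn_variance(lam):
    """VAR(X) under CSR."""
    return (4.0 - math.pi) / (4.0 * math.pi * lam)

def chi_square_statistic(distances, lam):
    """2*pi*lam*sum(x_i^2); chi-square with 2n d.f. for n independent distances."""
    return 2.0 * math.pi * lam * sum(d * d for d in distances)

def numeric_mean(lam, steps=100000):
    """Midpoint-rule approximation of the integral of 1 - F(x)."""
    upper = 10.0 / math.sqrt(lam)  # the integrand is negligible beyond this
    dx = upper / steps
    return sum(1.0 - csr_nn_cdf((k + 0.5) * dx, lam) for k in range(steps)) * dx

# Usage: with lam = 4 events per unit area the mean distance is 0.25
print(csr_nn_mean(4.0), numeric_mean(4.0))
```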

23 METU, GGIT 538 The same arguments apply to the nearest neighbor event-event distances for a CSR process, i.e. under CSR the distribution function G(w) is: $G(w) = P(W \le w) = 1 - e^{-\lambda \pi w^2}$, $w \ge 0$. E(W) and VAR(W) are the same as for X. It is now possible to derive the sampling distributions under CSR of various summary statistics of the observed nearest neighbor distances.

24 METU, GGIT 538 Distribution theory for these tests is based on the assumption that the n nearest neighbor measurements randomly sampled from the study region R are independent. This assumption of independence may be violated when the number of events is small and a large proportion of them is used.

25 METU, GGIT 538 Basic Assumptions:
1. The nearest neighbor distances used to compute the summary statistics must be independently sampled from the study region. Therefore independence is assured for a large number of events.
- Rule of thumb: The number, m, of nearest neighbor measurements sampled should satisfy $m < 0.1n$, where n is the total number of events.
Remark: The general effect of a lack of independence is that the test statistics will have a larger variance than their theoretical values under independence. This implies that the standard tests may show significant departures from CSR which would not be significant if the dependence were taken into account.

26 METU, GGIT 538
2. The nearest neighbor distances used to compute the summary statistics must not be biased by edge effects.
There are various tests suggested to detect departures from CSR based on summary statistics of m randomly sampled nearest neighbor event-event distances $(w_1, \ldots, w_m)$ or point-event distances $(x_1, \ldots, x_m)$. The most commonly used are:
- Clark-Evans
- Hopkins
- Byth and Ripley

27 METU, GGIT 538 Clark-Evans: It compares $\bar{w} = \frac{1}{m}\sum_{i=1}^{m} w_i$ with percentage points of the distribution: $N\left(\frac{1}{2\sqrt{\lambda}},\ \frac{4-\pi}{4\pi\lambda m}\right)$ Basic Properties:
- The test is based on event-event distances.
- It requires an enumerated point pattern to be available, from which events can be randomly sampled and their nearest neighbor distances determined.
- λ is unknown and needs to be replaced by an appropriate estimate, which is $\hat{\lambda} = n/R$ (n is the number of events in R, and R here stands for the area of the study region).
- If an estimate of λ is used, it is desirable to use all n event-event distances, if possible, rather than a sample of m of them.

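28 METU, GGIT 538 For the case m = n: $E(\bar{w}) = 0.5\sqrt{A/n} + \left(0.0514 + 0.041/\sqrt{n}\right)\frac{P}{n}$, $VAR(\bar{w}) = 0.0703\,\frac{A}{n^2} + 0.037\,P\sqrt{A/n^5}$ where P is the perimeter of the study region, which has area A.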
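A minimal sketch of the Clark-Evans comparison follows. It assumes the basic form of the test, in which the mean of m sampled event-event distances is referred to a normal distribution with mean $1/(2\sqrt{\lambda})$ and variance $(4-\pi)/(4\pi\lambda m)$, and it ignores the edge-corrected moments for the case m = n. The function name `clark_evans_z` is hypothetical.

```python
import math

def clark_evans_z(w, lam):
    """Standardized mean nearest neighbor distance under CSR.

    w   : list of m sampled event-event nearest neighbor distances
    lam : intensity (events per unit area), e.g. estimated as n / area
    """
    m = len(w)
    w_bar = sum(w) / m
    expected = 1.0 / (2.0 * math.sqrt(lam))                 # E(W) under CSR
    variance = (4.0 - math.pi) / (4.0 * math.pi * lam * m)  # VAR of the mean
    return (w_bar - expected) / math.sqrt(variance)

# Usage: four distances of 0.25 when lam = 1 (expected mean is 0.5)
z = clark_evans_z([0.25, 0.25, 0.25, 0.25], 1.0)
print(round(z, 3))  # about -1.913
```

A clearly negative value means the sampled distances are shorter than CSR predicts, which suggests clustering; a clearly positive value suggests regularity.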


30 Hopkins: It compares $H = \sum_{i=1}^{m} x_i^2 \big/ \sum_{i=1}^{m} w_i^2$ with percentage points of the $F_{2m,2m}$ distribution. The physical implication of the test is that in clustered patterns the point-event distances $x_i$ will be large relative to the event-event distances $w_i$, and vice versa in a regular pattern. Basic Properties:
- The test requires complete enumeration of all n events in the study region since it uses $w_i$, so that event-event distances can be randomly sampled.
- The above rule can be relaxed, and the test can be applied in conjunction with sampling of point patterns, if a “semi-systematic” sampling scheme is employed, whereby a regular grid of study points is used for calculating the point-event distances $x_i$.

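31 METU, GGIT 538 Byth & Ripley: It compares $\frac{1}{m}\sum_{i=1}^{m} \frac{x_i^2}{x_i^2 + w_i^2}$ with percentage points of $N\left(\frac{1}{2},\ \frac{1}{12m}\right)$, where the $x_i$ values are randomly paired with the $w_i$ values.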
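The two statistics can be sketched as follows, assuming their commonly quoted forms: the Hopkins statistic $\sum x_i^2 / \sum w_i^2$, referred to $F_{2m,2m}$, and the Byth and Ripley statistic $\frac{1}{m}\sum x_i^2/(x_i^2 + w_i^2)$, referred to $N(1/2, 1/(12m))$. The function names are hypothetical, and the code returns only the statistics (plus a z-score for Byth and Ripley), not p-values.

```python
import math

def hopkins_statistic(x, w):
    """Ratio of summed squared point-event to event-event distances."""
    return sum(xi * xi for xi in x) / sum(wi * wi for wi in w)

def byth_ripley_statistic(x, w):
    """Mean of x_i^2 / (x_i^2 + w_i^2) over the paired distances."""
    m = len(x)
    return sum(xi * xi / (xi * xi + wi * wi) for xi, wi in zip(x, w)) / m

def byth_ripley_z(x, w):
    """Standardize against N(1/2, 1/(12m)), the assumed CSR reference."""
    m = len(x)
    return (byth_ripley_statistic(x, w) - 0.5) / math.sqrt(1.0 / (12.0 * m))

# Usage: point-event distances twice as large as event-event distances
x = [2.0, 2.0]
w = [1.0, 1.0]
print(hopkins_statistic(x, w), byth_ripley_statistic(x, w))  # 4.0 0.8
```

Values of the Hopkins statistic well above 1, or of the Byth and Ripley statistic well above 1/2, correspond to point-event distances that are large relative to event-event distances, i.e. to clustering.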

32 METU, GGIT 538 Table 5.1. Available statistics for testing CSR in nearest neighbor distances

33 METU, GGIT 538 5.3.2. Testing for CSR Based on Distribution Function Looking at the complete estimated distribution function of W or X, rather than just a single summary statistic, is an alternative way of testing for CSR. The basic question is: can we construct a formal method for comparing the whole of the distribution function with its theoretical form under CSR? The theoretical distributions for G(w) and F(x) under CSR are: $G(w) = 1 - e^{-\lambda \pi w^2}$, $F(x) = 1 - e^{-\lambda \pi x^2}$

34 METU, GGIT 538 Then the plots of the theoretical distributions G(w) and F(x) are compared with the estimated $\hat{G}(w)$ and $\hat{F}(x)$. However, there is still no formal way of assessing the significance of differences in the plots. A more satisfactory approach is to compare the estimated functions with a simulation estimate of their theoretical distributions.

35 METU, GGIT 538 The simulation estimate for G(w) under CSR is calculated as: $\bar{G}(w) = \frac{1}{m}\sum_{i=1}^{m} \hat{G}_i(w)$ where $\hat{G}_i(w)$ (i = 1, …, m) are empirical distribution functions, each estimated from one of m independent simulations of n events under CSR, i.e. n events independently and uniformly distributed in R.

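36 METU, GGIT 538 To assess the significance of departures between the simulated CSR distribution, $\bar{G}(w)$, and the distribution actually observed, $\hat{G}(w)$, it is also necessary to define upper and lower simulation envelopes: $U(w) = \max_{i=1,\ldots,m} \{\hat{G}_i(w)\}$, $L(w) = \min_{i=1,\ldots,m} \{\hat{G}_i(w)\}$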

37 METU, GGIT 538 When $\hat{G}(w)$ is plotted against $\bar{G}(w)$, and U(w) and L(w) are added to the plot:
- If the data are compatible with CSR, the plot of $\hat{G}(w)$ vs $\bar{G}(w)$ should be roughly linear and at 45°.
- If clustering is present, the plot will lie above the line.
- If regularity is present, the plot will lie under the line.

38 METU, GGIT 538 U(w) and L(w) help to assess the significance of departures from the 45° line in the plot since they have the following property: $\Pr(\hat{G}(w) > U(w)) = \Pr(\hat{G}(w) < L(w)) = \frac{1}{m+1}$ This also indicates the number of simulations required in order to detect a departure at a specified significance level.
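The simulation estimate and its envelopes can be sketched as below. Several things here are assumptions made for illustration: the study region is taken to be the unit square, no edge correction is applied, and the names `nn_distances`, `ecdf` and `csr_envelopes` are hypothetical. With m = 19 simulations the envelope property above gives a one-sided level of 1/(19 + 1) = 0.05.

```python
import math
import random

def nn_distances(events):
    """Event-event nearest neighbor distance for every event (no edge correction)."""
    result = []
    for i, (xi, yi) in enumerate(events):
        result.append(min(math.hypot(xi - xj, yi - yj)
                          for j, (xj, yj) in enumerate(events) if j != i))
    return result

def ecdf(values, grid):
    """Empirical distribution function of `values` evaluated at each w in `grid`."""
    n = len(values)
    return [sum(v <= w for v in values) / n for w in grid]

def csr_envelopes(n, grid, m=19, seed=0):
    """Mean, upper and lower envelopes of G from m CSR simulations of n events."""
    rng = random.Random(seed)
    sims = []
    for _ in range(m):
        pattern = [(rng.random(), rng.random()) for _ in range(n)]  # unit square
        sims.append(ecdf(nn_distances(pattern), grid))
    columns = list(zip(*sims))  # one tuple of m simulated values per grid point
    mean = [sum(col) / m for col in columns]
    upper = [max(col) for col in columns]
    lower = [min(col) for col in columns]
    return mean, upper, lower

# Usage: envelopes for 30 events at distances 0.02, 0.04, ..., 0.20
grid = [0.02 * k for k in range(1, 11)]
mean_g, upper_g, lower_g = csr_envelopes(30, grid, m=19, seed=1)
```

The observed $\hat{G}(w)$, computed with the same `ecdf` and `nn_distances` helpers, would then be plotted against `mean_g` together with `upper_g` and `lower_g`.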

39 METU, GGIT 538 5.4. The K Function Tests for CSR Under CSR the expected number of events within a distance h of a randomly chosen event is: $\lambda \pi h^2$ Hence theoretically under CSR: $K(h) = \pi h^2$

40 METU, GGIT 538 Hence the K function estimated from the observed data, $\hat{K}(h)$, is compared with the theoretical one. One way of doing this is to plot $\hat{L}(h) = \sqrt{\hat{K}(h)/\pi} - h$ against h and compare it with its theoretical value of zero under CSR:
- Positive peaks → Clustering
- Negative troughs → Regularity

41 METU, GGIT 538 The formal assessment of the significance of observed peaks and troughs requires knowledge of the sampling distributions of $\hat{K}(h)$ and $\hat{L}(h)$ under CSR. These are unknown and complex because of the edge corrections built into $\hat{K}(h)$. However, it is possible to use an approach analogous to that used for nearest neighbor distances.

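42 METU, GGIT 538 The method involves:
- Obtaining a simulation estimate of the sampling distributions
- Constructing upper and lower simulation envelopes: $U(h) = \max_{i=1,\ldots,m} \{\hat{L}_i(h)\}$, $L(h) = \min_{i=1,\ldots,m} \{\hat{L}_i(h)\}$, where $\hat{L}_i(h)$ is the estimate from the i-th of m independent simulations of n events under CSR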

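43 METU, GGIT 538
- Plotting $\hat{L}(h)$ vs h together with the plots of the U(h) and L(h) envelopes
- Assessing the significance of peaks and troughs on the basis of: $\Pr(\hat{L}(h) > U(h)) = \Pr(\hat{L}(h) < L(h)) = \frac{1}{m+1}$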
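For illustration, a naive estimate of the K function and the transformed function $\sqrt{\hat{K}(h)/\pi} - h$ can be sketched as follows. This version applies no edge correction, unlike the estimator referred to in the slides, and the names `k_hat` and `l_hat` are hypothetical.

```python
import math

def k_hat(events, area, h):
    """Naive K estimate: (area / n^2) * number of ordered pairs within distance h."""
    n = len(events)
    pairs = 0
    for i, (xi, yi) in enumerate(events):
        for j, (xj, yj) in enumerate(events):
            if i != j and math.hypot(xi - xj, yi - yj) <= h:
                pairs += 1
    return area * pairs / (n * n)

def l_hat(events, area, h):
    """sqrt(K_hat / pi) - h, which is close to zero under CSR."""
    return math.sqrt(k_hat(events, area, h) / math.pi) - h

# Usage: four events on the corners of the unit square
corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(k_hat(corners, 1.0, 1.0))  # 0.5: each event has two neighbors within h = 1
print(l_hat(corners, 1.0, 1.0))
```

Applying `l_hat` to m patterns simulated under CSR and taking the maximum and minimum at each h would give simulation envelopes of the kind described above.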

44 METU, GGIT 538 Alternate Models to CSR
- For clustered patterns
  - First order effects only:
    - Heterogeneous Poisson Process
    - Cox Process
  - Second order effects only:
    - Poisson Cluster Process
- For regular patterns
  - Simple Inhibition Process
  - Markov Point Processes
- Either
  - Markov Point Processes

