Pattern Statistics Michael F. Goodchild University of California Santa Barbara.

Outline
• Some examples of analysis
• Objectives of analysis
• Cross-sectional analysis
• Point patterns

What are we trying to do?
• Infer process
  – processes leave distinct fingerprints on the landscape
  – several processes can leave the same fingerprints
    – enlist time to resolve ambiguity
    – invoke Occam's Razor
    – confirm a previously identified hypothesis

Alternatives
• Expose aspects of pattern that are otherwise invisible
  – Openshaw
  – Cova
• Expose anomalies, patterns
• Convince others of the existence of patterns, problems, anomalies

Cross-sectional analysis
• Social data collected in cross-section
  – longitudinal data are difficult to construct
  – difficult for bureaucracies to sustain
  – compare temporal resolution of process to temporal resolution of bureaucracy
• Cross-sectional perspectives are rich in context
  – can never confirm process
  – though they can perhaps falsify
  – useful source of hypotheses, insights

What kinds of patterns are of interest?
• Unlabeled objects
  – how does density vary?
  – do locations influence each other?
  – are there clusters?
• Labeled objects
  – is the arrangement of labels random?
  – or do similar labels cluster?
  – or do dissimilar labels cluster?

First-order effects
• Random process (CSR, complete spatial randomness)
  – all locations are equally likely
  – an event does not make other events more likely in the immediate vicinity
• First-order effect
  – events are more likely in some locations than others
  – events may still be independent
  – varying density
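The contrast between CSR and a first-order effect can be illustrated with a small simulation. This is a minimal sketch, not part of the original slides: the unit-square study area, the linear trend in x, and the function names `simulate_csr` and `simulate_first_order` are all assumptions made for illustration. In both cases events are placed independently; only the density differs.

```python
import random


def simulate_csr(n, rng):
    """Place n independent events uniformly over the unit square (CSR)."""
    return [(rng.random(), rng.random()) for _ in range(n)]


def simulate_first_order(n, rng):
    """Place n independent events whose density rises linearly with x.

    Rejection sampling: propose a uniform location and keep it with
    probability x. Events remain independent of each other, but some
    locations are more likely than others (a first-order effect).
    """
    points = []
    while len(points) < n:
        x, y = rng.random(), rng.random()
        if rng.random() < x:
            points.append((x, y))
    return points


rng = random.Random(42)  # fixed seed so the sketch is repeatable
csr = simulate_csr(2000, rng)
trend = simulate_first_order(2000, rng)

mean_x_csr = sum(x for x, _ in csr) / len(csr)
mean_x_trend = sum(x for x, _ in trend) / len(trend)
print(f"mean x under CSR:              {mean_x_csr:.3f}")
print(f"mean x with first-order trend: {mean_x_trend:.3f}")
```

Under CSR the mean x coordinate should sit near 0.5; with the assumed trend it should sit near 2/3, even though no event influences any other.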

Second-order effects
• Event makes others more or less likely in the immediate vicinity
  – clustering
  – but is a cluster the result of first- or second-order effects?
  – is there a prior reason to expect variation in density?

Testing methods
• Counts by quadrat
  – Poisson distribution

Deaths by horse-kick in the Prussian army
• Mean m = 0.61, n = 200

Deaths per yr              0      1      2      3      4
Probability                0.543  0.331  0.101  0.021  0.003
Number of years expected   109.0  66.3   20.2   4.1    0.6
Number of years observed   109    65     22     3      1
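The probabilities and expected counts on this slide can be recomputed from the Poisson formula P(k) = e^(−m) m^k / k!. The sketch below assumes the slide's figures read as five classes (0 to 4 deaths) with observed counts 109, 65, 22, 3, 1; the variable and function names are illustrative only.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k events under a Poisson distribution."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


observed = [109, 65, 22, 3, 1]  # years with 0, 1, 2, 3, 4 deaths
n_years = sum(observed)         # 200

# The mean is estimated from the data: total deaths / number of years.
mean_deaths = sum(k * count for k, count in enumerate(observed)) / n_years

probabilities = [poisson_pmf(k, mean_deaths) for k in range(len(observed))]
expected = [n_years * p for p in probabilities]

for k, (p, e, o) in enumerate(zip(probabilities, expected, observed)):
    print(f"{k} deaths: P = {p:.3f}, expected = {e:.1f}, observed = {o}")
```

The recomputed probabilities agree with the slide to three decimals. The expected count for zero deaths comes out near 108.7, marginally below the 109.0 shown.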

Towns in Iowa
• 1173 towns, 154 quadrats 20 mi by 10 mi

Towns per quadrat   Observed   Expected
0                   3          2.4
1                   10         9.9
2                   11         20.6
3                   31         28.7
4                   35         30.0
5                   28         25.0
6                   23         17.4
7                   6          10.4
8                   6          5.4
9+                  1          4.0

• Chi-square with 8 df = 12.7
• Accept H0
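The goodness-of-fit statistic can be recomputed from the tabulated counts. This sketch assumes the slide's figures read as (towns per quadrat, observed quadrats, expected quadrats), an interpretation supported by the observed column summing to 154. Because the expected counts are rounded on the slide, the recomputed value (about 12.0) does not exactly reproduce the quoted 12.7; both lie below 15.507, the 5% critical value of chi-square with 8 degrees of freedom, so the conclusion is the same.

```python
observed = [3, 10, 11, 31, 35, 28, 23, 6, 6, 1]
expected = [2.4, 9.9, 20.6, 28.7, 30.0, 25.0, 17.4, 10.4, 5.4, 4.0]


def chi_square(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


statistic = chi_square(observed, expected)

# Degrees of freedom: classes minus 1, minus 1 more because the
# Poisson mean was estimated from the same data.
df = len(observed) - 2

CRITICAL_5PCT_8DF = 15.507  # chi-square, 8 df, upper 5% point
reject_h0 = statistic > CRITICAL_5PCT_8DF

print(f"chi-square = {statistic:.2f} with {df} df, reject H0: {reject_h0}")
```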

Distance to nearest neighbor
• Observed mean distance r_o
• Expected mean distance r_e = 1 / (2√d)
  – where d is density per unit area
• Test statistic:

Towns in Iowa
• 622 points tested
• 643 per unit area
• Observed mean distance 3.52
• Expected mean distance 3.46
• Test statistic 0.82
• Accept H0
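The figures above can be reproduced with a short calculation. Two assumptions are made here that the transcript does not state: that the test statistic is the Clark and Evans form z = (r_o − r_e) / SE with SE = 0.26136 / √(n d), and that "643 per unit area" means 643 towns over the 154 quadrats of 20 mi by 10 mi, giving d in towns per square mile. Under those assumptions the sketch returns r_e ≈ 3.46 and z ≈ 0.82, matching the slide. The function names are illustrative.

```python
import math


def mean_nearest_neighbor_distance(points):
    """Observed mean distance r_o from each point to its nearest neighbor."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    return total / len(points)


def nearest_neighbor_test(r_observed, n, density):
    """Return (r_e, z) for n tested points at the given density.

    Assumes the Clark and Evans standard error 0.26136 / sqrt(n * d).
    """
    r_expected = 1.0 / (2.0 * math.sqrt(density))
    std_error = 0.26136 / math.sqrt(n * density)
    return r_expected, (r_observed - r_expected) / std_error


# Assumed density: 643 towns over 154 quadrats of 20 mi x 10 mi.
density = 643 / (154 * 20 * 10)
r_e, z = nearest_neighbor_test(r_observed=3.52, n=622, density=density)
print(f"r_e = {r_e:.2f} mi, z = {z:.2f}")
```

A |z| below 1.96 is consistent with accepting the null hypothesis of randomness at the 5% level.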

But what about scale?
• A pattern can be clustered at one scale and random or dispersed at another
• Poisson test
  – scale reflected in quadrat size
• Nearest-neighbor test
  – scale reflected in choosing nearest neighbor
  – higher-order neighbors could be analyzed

Weaknesses of these simple methods
• Difficulty of dealing with scale
• Second-order effects only
  – density assumed uniform
• Better methods are needed

K-function analysis
• K(h) = expected number of events within h of an arbitrarily chosen event, divided by d
• How to estimate K?
  – take an event i
  – for every event j lying within h of i: score 1
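The scoring procedure above can be sketched as a naive estimator. This is an illustration under stated assumptions, not the slides' implementation: it applies no edge correction (every qualifying pair scores exactly 1), the study-region `area` must be supplied by the caller, and the name `k_estimate` is hypothetical.

```python
import math


def k_estimate(points, h, area):
    """Naive K(h): mean number of other events within h of an event,
    divided by the density d = n / area. No edge correction."""
    n = len(points)
    score = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= h:
                score += 1  # event j lies within h of event i
    density = n / area
    return (score / n) / density


# Tiny worked example: three events close together, one far away,
# in a 10 x 10 study region.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
for h in (1.0, 1.5):
    print(f"K({h}) = {k_estimate(points, h, area=100.0):.1f}, "
          f"CSR value pi*h^2 = {math.pi * h ** 2:.1f}")
```

Under CSR the theoretical value is πh², so estimates well above that value suggest clustering at distance h and estimates well below it suggest dispersion. The double loop is O(n²), which is adequate for a sketch but slow for large point sets.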

Allowing for edge effects score < 1

The K function
• In CSR K(h) = πh²
• So instead plot:

What about labeled points?
• How are the points located?
  – random, clustered, dispersed
• How are the values assigned among the points?
  – among possible arrangements
    – random
    – clustered
    – dispersed

Moran and Geary indices
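The slide names the two indices without giving formulas. The sketch below uses the standard textbook definitions, which may differ in detail from what was presented: Moran's I = (n / S0) Σ w_ij (x_i − x̄)(x_j − x̄) / Σ (x_i − x̄)², and Geary's c = ((n − 1) / (2 S0)) Σ w_ij (x_i − x_j)² / Σ (x_i − x̄)², where S0 is the sum of all weights. The binary adjacency weights and the helper `chain_weights` are assumptions chosen only to keep the example small.

```python
def moran_geary(values, weights):
    """Return (Moran's I, Geary's c) for values and a weights matrix w[i][j]."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)   # sum of all weights
    sum_sq = sum(d * d for d in dev)        # sum of squared deviations
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    sq_diff = sum(weights[i][j] * (values[i] - values[j]) ** 2
                  for i in range(n) for j in range(n))
    moran_i = (n / s0) * cross / sum_sq
    geary_c = ((n - 1) / (2 * s0)) * sq_diff / sum_sq
    return moran_i, geary_c


def chain_weights(n):
    """Hypothetical weights: n points in a row, adjacent neighbors weigh 1."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


w = chain_weights(4)
similar = moran_geary([1, 2, 3, 4], w)       # similar values are adjacent
dissimilar = moran_geary([1, -1, 1, -1], w)  # neighbors always differ
print("similar neighbors:    I = %.3f, c = %.3f" % similar)
print("dissimilar neighbors: I = %.3f, c = %.3f" % dissimilar)
```

Positive I (and c below 1) indicates that similar values cluster; negative I (and c above 1) indicates that dissimilar values are adjacent, which corresponds to the clustered and dispersed arrangements of labels discussed earlier.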
