Published by Derrick Farmer. Modified over 8 years ago.
1
Mining Statistically Significant Co-location and Segregation Patterns
2
Outline
–Motivation
–Key concepts
–Problem definition
–Related works
–Challenges
–Contribution
–Validation
3
Motivation
Finding co-located events provides useful evidence for decision making and scientific research:
–Ecology
–Biology
–Epidemiology
–…
Co-location patterns caused by randomness need attention:
–Presence of spatial autocorrelation
–Abundance of feature instances
–…
5
Key Concept (1)
6
Key Concept (2)
Null hypothesis
–A hypothesis that one tries to disprove given the observations in the data set.
Alternative hypothesis
–The complement of the null hypothesis, which is accepted when the null hypothesis is rejected.
7
Key Concept (2)
Null hypothesis
–For a co-location pattern C, a participation index at least as high as the observed one can be obtained under a random feature distribution (with spatial autocorrelation taken into account).
–For a segregation pattern C, a participation index at least as low as the observed one can be obtained under a random feature distribution.
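The transcript does not preserve the slide that defines the participation index, so the following is only a minimal sketch under an assumed definition: the one commonly used in the co-location mining literature, where the participation ratio of a feature is the fraction of its instances that have an instance of the other feature within a neighbourhood radius, and the participation index is the smallest such ratio. The function name `participation_index`, the `radius` parameter, and the restriction to a two-feature pattern are illustrative choices, not taken from the slides.

```python
from math import hypot


def participation_index(points_a, points_b, radius):
    """Participation index of the pair pattern {A, B} (assumed definition).

    The participation ratio of a feature is the fraction of its instances
    with at least one instance of the other feature within `radius`.
    The index is the minimum of the two ratios.
    """
    if not points_a or not points_b:
        return 0.0

    def ratio(src, dst):
        # Count instances of `src` that have a neighbour in `dst`.
        hits = sum(
            1
            for (x, y) in src
            if any(hypot(x - u, y - v) <= radius for (u, v) in dst)
        )
        return hits / len(src)

    return min(ratio(points_a, points_b), ratio(points_b, points_a))


# Toy data: two of three A instances and one of two B instances participate.
a = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
b = [(0.5, 0.0), (9.0, 9.0)]
pi_ab = participation_index(a, b, radius=1.0)  # min(2/3, 1/2) = 0.5
```

A high index suggests co-location and a low index suggests segregation, but the value alone does not say whether it could have arisen by chance, which is what the null hypothesis above addresses.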
8
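Key Concept (3)
Statistical significance
–Significance is judged against the significance level α (the Type I error rate), which is the probability of rejecting the null hypothesis given that it is true.
–For each observed pattern, the p-value is the probability, under the null hypothesis, of obtaining a participation index at least as extreme as the observed one; the pattern is reported as significant when the p-value does not exceed α.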
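As an illustration of how a p-value can be estimated when the null distribution is only available through simulation, here is a small sketch. It assumes the usual Monte Carlo convention of counting the observed value as one of the samples, giving (k + 1) / (R + 1) for R simulations of which k are at least as extreme. The function name and the `pattern` argument are hypothetical, and the simulated values are made-up numbers for demonstration only.

```python
def monte_carlo_p_value(observed, simulated, pattern="colocation"):
    """Estimate a p-value from participation indexes simulated under the null.

    For a co-location pattern, "extreme" means at least as high as observed;
    for a segregation pattern, at least as low. The +1 terms count the
    observed value itself as one sample (a common convention, assumed here).
    """
    if pattern == "colocation":
        extreme = sum(1 for s in simulated if s >= observed)
    elif pattern == "segregation":
        extreme = sum(1 for s in simulated if s <= observed)
    else:
        raise ValueError("pattern must be 'colocation' or 'segregation'")
    return (extreme + 1) / (len(simulated) + 1)


# Made-up null-model indexes, for illustration only.
simulated = [0.10, 0.20, 0.25, 0.30, 0.40, 0.45, 0.50, 0.55, 0.60]
p_co = monte_carlo_p_value(0.58, simulated)                   # (1 + 1) / 10
p_seg = monte_carlo_p_value(0.12, simulated, "segregation")   # (1 + 1) / 10
```

With only nine simulations the smallest attainable p-value is 0.1, so testing at α = 0.05 needs at least 19 simulations under this estimator.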
9
Key Concept (4)
11
Problem Definition
13
Related work
Work | Co-location Patterns | Segregation Patterns | Significance Test
Spatial Co-location Patterns Detection | √ | |
Spatial Segregation Patterns Detection | | √ |
Mining Statistically Significant Co-location and Segregation Patterns | √ | √ | √
15
Challenge
–Co-location/segregation patterns selected with a manually set threshold produce false positives and are sensitive to the data set.
–No probability model is available to compute the p-value in closed form.
–Testing significance through Monte Carlo simulation is computationally expensive.
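To make the cost of the Monte Carlo route concrete, the following sketch runs a complete test for a two-feature pattern. It is not the authors' procedure: the null model here keeps feature A fixed and redraws feature B uniformly in a square, which ignores spatial autocorrelation, and the names `pair_pi`, `monte_carlo_test`, `n_sim`, and `size` are all hypothetical. The point is only that every simulated data set requires recomputing the participation index from scratch.

```python
import random
from math import hypot


def pair_pi(a, b, r):
    """Participation index of the pair {A, B} with neighbourhood radius r."""
    def ratio(src, dst):
        hits = sum(any(hypot(x - u, y - v) <= r for u, v in dst) for x, y in src)
        return hits / len(src)
    return min(ratio(a, b), ratio(b, a))


def monte_carlo_test(a, b, r, n_sim=199, size=10.0, seed=0):
    """Return (observed index, estimated p-value) for a co-location pattern.

    Assumed null model: A stays fixed, B is redrawn uniformly in a
    size x size square. Each run costs up to 2 * len(a) * len(b)
    distance evaluations, so the total grows linearly with n_sim.
    """
    rng = random.Random(seed)
    observed = pair_pi(a, b, r)
    at_least = 0
    for _ in range(n_sim):
        b_null = [(rng.uniform(0.0, size), rng.uniform(0.0, size)) for _ in b]
        if pair_pi(a, b_null, r) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (n_sim + 1)


# Synthetic example: every B instance sits right next to an A instance.
rng = random.Random(1)
a = [(rng.uniform(0.0, 10.0), rng.uniform(0.0, 10.0)) for _ in range(20)]
b = [(x + 0.05, y) for x, y in a]
observed, p_value = monte_carlo_test(a, b, r=0.1)
```

Even in this toy setting the work is `n_sim` times that of a single index computation, and it multiplies again with the number of candidate patterns, which is why reducing the per-simulation cost matters.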
17
Contributions
Incorporates a statistical significance test into co-location and segregation pattern detection, which reduces spurious patterns caused by randomness.
Proposes three approaches for accelerating the algorithm:
–a subset-based filter
–a grid-based sampling framework
–a spatial-join based pruning technique
18
Subset-based Filter
19
Grid-based Sampling
20
Spatial-join Based Pruning
22
Quality of Approximation – Grid-based Participation Index
23
Inhibition (synthetic data set)
24
Auto-correlation (synthetic data set)
25
Mixed Spatial Interactions (synthetic data set)
26
Runtime Comparison (1): fixed total number of clusters for each auto-correlated feature
27
Runtime Comparison (2): varying total number of clusters for each auto-correlated feature
28
Experiments (real data sets)
–Ants
–Bramble Canes
–Lansing Woods
–Toronto address repository
29
Ants Data