1
On Discovering Moving Clusters in Spatio-temporal Data
Panos Kalnis, National University of Singapore
Nikos Mamoulis, University of Hong Kong
Spiridon Bakiras, Hong Kong University of Science and Technology
2
What is a Moving Cluster?
- Dense clusters of objects that move similarly for a long time period
- Not necessarily the same objects during the lifetime of the cluster
- Examples: migrating animals, convoys of cars, military applications
- Solutions: efficient exact and approximate algorithms
3
Problem Formulation
[Figure: example of a moving cluster]
4
Related Work (Static)
- Partition-based clustering (k-medoids)
- Hierarchical clustering (BIRCH, CURE)
- Density-based clustering (DBSCAN)
[Figure: DBSCAN neighborhoods of radius ε, MinPts=3]
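As a reminder of how the density-based option works, here is a minimal sketch of textbook DBSCAN in Python. It is not taken from the presentation: the function names are illustrative, and the neighborhood search is a plain linear scan, which is only sensible for small inputs.

```python
from math import dist

def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (i included)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id >= 0, or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        labels[i] = cluster_id         # i is a core point: start a cluster
        frontier = list(neighbors)
        while frontier:
            j = frontier.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                frontier.extend(j_neighbors)  # j is core: keep expanding
        cluster_id += 1
    return labels
```

With ε = 1.5 and MinPts = 3, two tight groups of three points each form two clusters and an isolated point is labeled noise.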
5
Related Work (Moving Objects)
- Grouping trajectories [Vlachos et al., ICDE 02]
  - Trajectory cluster: constant set of objects throughout its lifetime
  - Only similar movement; no spatial proximity
- Dense areas over time [Hadjieleftheriou et al., SSTD 03]
  - Static dense regions
  - No common objects between regions in sequence
- Incremental DBSCAN/OPTICS [Ester et al., VLDB 98]
  - Only a small percentage of objects moves
- Maintaining Data Bubbles [Nassar et al., SIGMOD 04]
  - Redistributes updated objects in existing bubbles
6
MC1: The Straightforward Approach
- G: set of moving clusters
- Apply clustering to the next timeslice S_i
- Expand the moving clusters in G
- Add new moving clusters to G
- Report ending clusters
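The steps on this slide can be sketched as one function that advances G by a single timeslice. This is a hedged illustration, not the authors' code: it assumes that a moving cluster is extended when the Jaccard overlap between its most recent cluster and a cluster of the new timeslice is at least θ (the slides refer to a θ criterion without spelling it out), and it assumes the per-timeslice clusters are already available as non-empty sets of object ids. The names `jaccard` and `mc1_step` are hypothetical.

```python
def jaccard(a, b):
    """Overlap ratio |a ∩ b| / |a ∪ b| of two non-empty sets."""
    return len(a & b) / len(a | b)

def mc1_step(active, clusters, theta):
    """Advance the set G of moving clusters by one timeslice.

    active:   list of moving clusters, each a list of sets (one set per timeslice)
    clusters: list of sets of object ids found by clustering the new timeslice
    Returns (new_active, ended).
    """
    new_active, ended = [], []
    used = set()                      # indices of clusters that extended something
    for g in active:
        grew = False
        for k, c in enumerate(clusters):   # checks every combination
            if jaccard(g[-1], c) >= theta:
                new_active.append(g + [c])
                used.add(k)
                grew = True
        if not grew:
            ended.append(g)           # report the ending cluster
    for k, c in enumerate(clusters):
        if k not in used:
            new_active.append([c])    # start a new moving cluster
    return new_active, ended
```

Calling `mc1_step` once per timeslice, starting from an empty `active` list, reproduces the loop on the slide; the nested loop over `active` and `clusters` is the all-pairs check that the later slides try to avoid.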
7
Hash-based DBSCAN
- Memory: 10M objects with 1 GB RAM
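One common way to make DBSCAN hash-based is to replace the spatial index with a uniform grid stored in a hash table, so that a neighborhood query only inspects nearby cells. The sketch below illustrates that idea under assumptions of its own: 2D points, a cell side equal to ε, and a 3x3 block of cells per query. The slide does not state the cell size or layout the authors used, and `build_grid` and `region_query` are hypothetical names.

```python
from collections import defaultdict
from math import dist, floor

def build_grid(points, eps):
    """Hash every point index into the grid cell that contains it."""
    grid = defaultdict(list)
    for i, (x, y) in enumerate(points):
        grid[(floor(x / eps), floor(y / eps))].append(i)
    return grid

def region_query(points, grid, i, eps):
    """Indices of points within eps of points[i], scanning only the 3x3 cell block."""
    x, y = points[i]
    cx, cy = floor(x / eps), floor(y / eps)
    result = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            # With cell side eps, every point within eps lies in this block.
            for j in grid.get((cx + dx, cy + dy), ()):
                if dist(points[i], points[j]) <= eps:
                    result.append(j)
    return result
```

The grid stores one integer per object plus one hash entry per occupied cell, which is consistent with the slide's emphasis on a small memory footprint.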
8
MC1 is inefficient!
1. Checks all possible combinations of clusters in consecutive timeslices
2. Performs clustering for every timeslice
9
MC2: Minimizing Redundant Checks
- Clustering in every timeslice
- Select a random object in c_1
- Search for the object in S_2
- Repeat for the remaining objects
- Max: (1-θ)|c_i| objects
[Figure: c_1 c_2 is a moving cluster]
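The probing idea can be sketched as follows, again assuming a Jaccard overlap test against θ and assuming that clusters within one timeslice are disjoint. If a successor c' satisfies the test, at most (1-θ)|c| objects of c can lie outside c', so probing one object more than that bound is guaranteed to hit c'. The function name, the small tolerance added before rounding, and the extra probe beyond the slide's stated maximum are choices of this sketch rather than details from the slides.

```python
import random
from math import floor

def jaccard(a, b):
    """Overlap ratio |a ∩ b| / |a ∪ b| of two non-empty sets."""
    return len(a & b) / len(a | b)

def find_successors(c, next_clusters, theta, rng=None):
    """Indices of clusters in the next timeslice that can extend cluster c."""
    rng = rng or random.Random()
    # Object id -> index of the cluster that contains it in the next timeslice.
    owner = {obj: k for k, cl in enumerate(next_clusters) for obj in cl}
    # The tolerance guards against float rounding; a larger budget is always safe.
    budget = min(len(c), floor((1 - theta) * len(c) + 1e-9) + 1)
    tested, matches = set(), []
    for obj in rng.sample(sorted(c), budget):
        k = owner.get(obj)
        if k is None or k in tested:
            continue                  # object is noise, or its cluster was checked
        tested.add(k)
        if jaccard(c, next_clusters[k]) >= theta:
            matches.append(k)
    return matches
```

Only the clusters that actually contain a probed object are compared, instead of every cluster of the next timeslice. The function returns a list because more than one successor may qualify when θ is small.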
10
Ambiguity Cases: θ<0.5
- Two possible outcomes: {c_0 c_1, c_2} or {c_0 c_2, c_1}
11
MC3: Approximate Moving Clusters
- Intuition: many clusters will remain the same even if objects move
- Avoid performing clustering in every timeslice
- For an object o:
  - If o belongs to cluster c in timeslice S_i
  - Assume that o also belongs to c in the next timeslice (note: objects may have moved)
12
Refine clusters
- Hash new clusters in a grid
- Legal cluster:
  - Does not meet/intersect with other clusters
  - It is connected (cells meet)
- Objects in legal clusters are not considered further
- For the rest of the objects, perform clustering
- Possible inaccuracies!
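The legality test can be sketched as follows. The sketch interprets "meet" as two grid cells being identical or adjacent (including diagonally) and treats the cell size as a free parameter; neither detail is stated on the slide, and the function names are hypothetical. Each object keeps the cluster label carried over from the previous timeslice and is hashed by its new position.

```python
from collections import defaultdict, deque
from math import floor

# Offsets of a cell and its eight neighbors.
BLOCK = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]

def is_connected(cells):
    """True if the set of cells forms one component under 8-adjacency."""
    start = next(iter(cells))
    seen, queue = {start}, deque([start])
    while queue:
        cx, cy = queue.popleft()
        for dx, dy in BLOCK:
            n = (cx + dx, cy + dy)
            if n in cells and n not in seen:
                seen.add(n)
                queue.append(n)
    return len(seen) == len(cells)

def legal_clusters(positions, labels, cell):
    """Labels of carried-over clusters that are isolated and connected."""
    cells = defaultdict(set)          # label -> occupied cells
    owners = defaultdict(set)         # cell  -> labels occupying it
    for (x, y), lab in zip(positions, labels):
        key = (floor(x / cell), floor(y / cell))
        cells[lab].add(key)
        owners[key].add(lab)
    legal = set()
    for lab, cs in cells.items():
        touches_other = any(
            owners.get((cx + dx, cy + dy), set()) - {lab}
            for cx, cy in cs for dx, dy in BLOCK
        )
        if not touches_other and is_connected(cs):
            legal.add(lab)
    return legal
```

Objects whose label is not in the returned set are the ones that would be passed on to exact clustering.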
13
Minimize Error
- Perform exact clustering to absorb (may not eliminate) the accumulated error
- Period for exact clustering: grows linearly, drops exponentially
- Exact clustering: if more than α|G| clusters have been added/removed
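One way to read "grows linearly, drops exponentially" is as an update rule for the number of timeslices between two exact clusterings. The sketch below is an interpretation, not the authors' rule: the increment of one timeslice, the halving, and the name `next_period` are assumptions. Only the α|G| threshold comes from the slide.

```python
def next_period(period, changed, num_clusters, alpha, min_period=1):
    """Timeslices to wait before the next exact clustering.

    changed:      clusters added or removed since the last exact clustering
    num_clusters: |G|, the current number of moving clusters
    """
    if changed > alpha * num_clusters:
        # Too much accumulated error: drop exponentially (assumed: halve).
        return max(min_period, period // 2)
    # Error within tolerance: grow linearly (assumed: one extra timeslice).
    return period + 1
```

For example, with α = 0.1 and |G| = 100, a period of 8 shrinks to 4 after 30 changes and grows to 9 after 5 changes.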
14
Experimental Evaluation
- 10K-50K objects per timeslice
- 50-100 timeslices, up to 5M objects
- Linux, C++, 1.3 GHz CPU, 1.2 GB RAM
- Generator: clusters move/rotate, objects appear/disappear
15
Varying data size (10K-50K per timeslice)
- θ=0.9, α=0.1
- Avg: 87%
- Larger dataset: larger clusters, more interactions
16
Varying number of clusters (100-800 per timeslice)
- 5M objects, θ=0.9, α=0.1
- Many clusters: reaches error threshold fast
[Plot annotations: 96%, 87%, 73%]
17
Varying α
- 5M objects, θ=0.9, 800 clusters
- α small: may not recover!
18
Varying α for different agilities
- Low agility: fewer errors, faster
19
MC3 for varying θ
- 5M objects, α=0.1, 800 clusters
- θ large: incorrect clusters are pruned for not satisfying the θ criterion
20
Conclusions
- Moving clusters: objects may move/change
- Exact and approximate solutions
- Future work:
  - Automatic setting of parameter α
  - Better error estimation
  - Constraints (e.g., a moving cluster must span at least k timeslices)
21
Questions?