Swarm: Mining Relaxed Temporal Moving Object Clusters


Swarm: Mining Relaxed Temporal Moving Object Clusters. Zhenhui (Jessie) Li, Bolin Ding, Jiawei Han (University of Illinois at Urbana-Champaign); Roland Kays (New York State Museum). VLDB conference, Singapore, September 15, 2010. Work supported by NSF, ARL (NS-CTA), AFOSR (MURI), NASA, and Boeing.

Outline: Motivation, Problem Definition, Algorithm, Experiment, Summary, Discussion.


Widely Available Moving Object Data. Animal movement data: biological studies; data collected by tags, sensors, and GPS; MoveBank.org: 173 animal datasets (bear, buffalo, deer, fish, coyote, ...). Human movement data: location-based services; data collected by vehicle GPS and cell phones; GeoLife project at MSRA: ~200 human trajectories.

Mining the Relationships of Moving Objects. The most basic relationship between moving objects is being together: animals in the same herd; humans related as husband and wife, colleagues, or friends. A single snapshot only gives the objects' locations at one moment (figure: snapshots at 10:00, 11:00, 12:00, and 13:00 along a time axis), so a relationship can only be detected dynamically, over time.

“Moving Cluster”: Moving Together for “Consecutive Times”? Flock [Gudmundsson, GIS’06]: objects are within a circle for k consecutive timestamps. Convoy [Jeung, VLDB’08]: objects are within a cluster for k consecutive timestamps. (Figure from [Jeung, VLDB’08].) Flock fails to detect clusters of arbitrary shape. Convoy fails to detect moving clusters whose members are together at non-consecutive times.

Relaxing the Temporal Constraint: Essential for Detecting Moving Relationships. Reason 1: in real applications, objects can meet and depart. Examples: people travel, alternating between group and individual activity; animals migrate, alternating between moving together and hunting for food. Reason 2: it makes moving object cluster detection less sensitive to the “closeness” parameter. Example: is “5 meters” really “close enough”? (Figure: pairwise distances of 3 m, 3.5 m, 4 m, and 5.1 m; is 5.1 m not close?)


Swarm: A New Definition of Moving Object Cluster. Given the clusters of moving objects at each time snapshot, a set of objects O and a set of timestamps T form a swarm (O, T) if: |O| ≥ min_o; |T| ≥ min_t; and for each timestamp t in T, all objects in O are in the same cluster. Example: with min_o = 2 and min_t = 3, O = {o1, o2, o4} and T = {t1, t2, t4} form a swarm (O, T).
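The three conditions of the definition can be checked directly. The sketch below is only an illustration: the per-timestamp clusters in `SNAPSHOTS` are hypothetical (chosen to agree with the slide's example, not taken from the slides), objects and timestamps are plain integers, and the function names are made up for this example.

```python
# Hypothetical toy data: timestamp -> list of clusters (sets of object ids).
SNAPSHOTS = {
    1: [{1, 2, 4}, {3}],
    2: [{1, 2, 4}, {3}],
    3: [{1, 3}, {2, 4}],
    4: [{1, 2, 4}, {3}],
}


def together(objects, clusters):
    """True if a single cluster of this snapshot contains every object."""
    return any(objects <= cluster for cluster in clusters)


def is_swarm(objects, times, snapshots, min_o, min_t):
    """Check the swarm conditions: |O| >= min_o, |T| >= min_t, and the
    objects share one cluster at every timestamp in T."""
    objects, times = set(objects), set(times)
    if len(objects) < min_o or len(times) < min_t:
        return False
    return all(together(objects, snapshots[t]) for t in times)


# O = {o1, o2, o4}, T = {t1, t2, t4}: the timestamps need not be consecutive.
print(is_swarm({1, 2, 4}, {1, 2, 4}, SNAPSHOTS, min_o=2, min_t=3))
# At t3 the three objects are split across two clusters.
print(is_swarm({1, 2, 4}, {1, 2, 3}, SNAPSHOTS, min_o=2, min_t=3))
```

On this toy data the first call should report a swarm and the second should not, because o1 is separated from o2 and o4 at t3.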

Closed Swarm: Reducing Redundancy (min_o = 2, min_t = 3). A swarm (O, T) is time-closed if there is no swarm (O, T’) with T’ ⊃ T. Example: ({o1, o2}, {t1, t2}) is NOT time-closed; ({o1, o2}, {t1, t2, t4}) is time-closed. A swarm (O, T) is object-closed if there is no swarm (O’, T) with O’ ⊃ O. Example: ({o1, o2}, {t1, t2, t4}) is NOT object-closed; ({o1, o2, o4}, {t1, t2, t4}) is object-closed. A closed swarm is both time-closed and object-closed.
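A brute-force check of the two closedness notions may help make them concrete. This is a sketch under assumptions: the `SNAPSHOTS` data is hypothetical (built to match the slide's example), the helper names are invented here, and both checks assume that (O, T) already satisfies the "same cluster at every timestamp in T" condition.

```python
# Hypothetical toy data: timestamp -> list of clusters (sets of object ids).
SNAPSHOTS = {
    1: [{1, 2, 4}, {3}],
    2: [{1, 2, 4}, {3}],
    3: [{1, 3}, {2, 4}],
    4: [{1, 2, 4}, {3}],
}


def times_together(objects, snapshots):
    """All timestamps at which the objects share a single cluster."""
    return {t for t, clusters in snapshots.items()
            if any(objects <= c for c in clusters)}


def is_time_closed(objects, times, snapshots):
    """No timestamp can be added to T without breaking togetherness."""
    return set(times) == times_together(set(objects), snapshots)


def is_object_closed(objects, times, snapshots):
    """No object can be added to O while keeping every timestamp in T.
    Testing one extra object at a time is enough: if a larger superset
    worked, so would any single one of its extra objects."""
    objects, times = set(objects), set(times)
    universe = set().union(*(c for cl in snapshots.values() for c in cl))
    return not any(times <= times_together(objects | {o}, snapshots)
                   for o in universe - objects)


print(is_time_closed({1, 2}, {1, 2}, SNAPSHOTS))           # t4 can be added
print(is_time_closed({1, 2}, {1, 2, 4}, SNAPSHOTS))
print(is_object_closed({1, 2}, {1, 2, 4}, SNAPSHOTS))      # o4 can be added
print(is_object_closed({1, 2, 4}, {1, 2, 4}, SNAPSHOTS))
```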


Swarm Mining: A Challenging Problem. It is very hard to detect swarms manually (figures: trajectories plotted; movement animated). The number of possible (O, T) combinations is huge: e.g., $2^{32} \times 2^{90}$ for 32 bears in Alaska tracked from May to September 2000.

Why Not Traditional Frequent Pattern Mining? FP mining problem: each transaction is a set of objects. Swarm mining problem: each timestamp is a set of clusters, where each cluster is itself a set of objects.

ObjectGrowth: Depth-First Search Based on Objects. The naïve approach enumerates every combination (O, T); its search space is $2^{\text{number of objects}} \times 2^{\text{number of times}}$. We only need to enumerate objectsets, which reduces the search space to $2^{\text{number of objects}}$. Example: if O = {o1, o2}, (O, T) can be time-closed only when T = {t1, t2, t4}. Such a T is called the maximal timeset of O: Tmax(O) = {t1, t2, t4}.
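As a rough illustration of the maximal timeset, the following sketch computes Tmax(O) and compares the two search-space sizes. The cluster assignments in `SNAPSHOTS` are hypothetical (invented so that Tmax({o1, o2}) matches the slide's example), and `max_timeset` is a name chosen for this example only.

```python
# Hypothetical toy data: timestamp -> list of clusters (sets of object ids).
SNAPSHOTS = {
    1: [{1, 2, 4}, {3}],
    2: [{1, 2, 4}, {3}],
    3: [{1, 3}, {2, 4}],
    4: [{1, 2, 4}, {3}],
}


def max_timeset(objects, snapshots):
    """Tmax(O): every timestamp at which all objects in O share a cluster."""
    return {t for t, clusters in snapshots.items()
            if any(objects <= c for c in clusters)}


# For O = {o1, o2} only T = Tmax(O) can make (O, T) time-closed.
print(sorted(max_timeset({1, 2}, SNAPSHOTS)))

n_objects, n_times = 4, len(SNAPSHOTS)
naive_space = 2 ** n_objects * 2 ** n_times   # every (O, T) pair
objectset_space = 2 ** n_objects              # objectsets only
print(naive_space, objectset_space)
```

Because Tmax(O) is determined by O, the timeset never has to be enumerated separately; on this toy input the count drops from 256 candidate pairs to 16 objectsets.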

ObjectGrowth (Initial Illustration). The search is based on objectsets and maintains the maximal timeset of each; nodes are visited in depth-first order (numbered 1 to 6 in the figure). The search space is still huge in the worst case, $2^{\text{number of objects}}$, so pruning rules are needed!

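ObjectGrowth: Apriori Pruning (min_o = 2, min_t = 2). If |Tmax(O)| < min_t, node O and its subtree are pruned: adding objects to O can only shrink the maximal timeset, so no superset of O can form a swarm.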
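The Apriori rule rests on the fact that Tmax can only shrink as objects are added. A minimal sketch, assuming hypothetical cluster data in `SNAPSHOTS` and illustrative function names:

```python
# Hypothetical toy data: timestamp -> list of clusters (sets of object ids).
SNAPSHOTS = {
    1: [{1, 2, 4}, {3}],
    2: [{1, 2, 4}, {3}],
    3: [{1, 3}, {2, 4}],
    4: [{1, 2, 4}, {3}],
}


def max_timeset(objects, snapshots):
    """Tmax(O): every timestamp at which all objects in O share a cluster."""
    return {t for t, clusters in snapshots.items()
            if any(objects <= c for c in clusters)}


def apriori_pruned(objects, snapshots, min_t):
    """True if the node for `objects` and its whole subtree can be skipped."""
    return len(max_timeset(objects, snapshots)) < min_t


# Tmax({o1, o3}) = {t3}, which is too short for min_t = 2.
print(apriori_pruned({1, 3}, SNAPSHOTS, min_t=2))
# Any superset has a maximal timeset contained in that of {o1, o3}.
print(max_timeset({1, 3, 4}, SNAPSHOTS) <= max_timeset({1, 3}, SNAPSHOTS))
# Tmax({o1, o2}) = {t1, t2, t4} is long enough, so this node is kept.
print(apriori_pruned({1, 2}, SNAPSHOTS, min_t=2))
```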

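ObjectGrowth: Backward Pruning. Tmax({o1, o4}) = {t1, t2, t4} equals Tmax({o1, o2, o4}) = {t1, t2, t4}, so node {o1, o4} and its subtree are pruned: no objectset in that subtree can be object-closed, since o2 can always be added without shrinking the timeset.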
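One way to phrase the backward rule in code is sketched below. It assumes objects are enumerated in increasing id order, so "backward" means objects with a smaller id than the last one added; the data in `SNAPSHOTS` is hypothetical and the function names are made up for this illustration, so the paper's exact formulation may differ.

```python
# Hypothetical toy data: timestamp -> list of clusters (sets of object ids).
SNAPSHOTS = {
    1: [{1, 2, 4}, {3}],
    2: [{1, 2, 4}, {3}],
    3: [{1, 3}, {2, 4}],
    4: [{1, 2, 4}, {3}],
}
ALL_OBJECTS = [1, 2, 3, 4]


def max_timeset(objects, snapshots):
    """Tmax(O): every timestamp at which all objects in O share a cluster."""
    return {t for t, clusters in snapshots.items()
            if any(objects <= c for c in clusters)}


def backward_pruned(objects, snapshots, all_objects):
    """True if some skipped object (id below the last added one) can join
    O without shrinking Tmax(O); then O and its subtree are not closed."""
    tmax = max_timeset(objects, snapshots)
    last = max(objects)
    return any(max_timeset(objects | {o}, snapshots) == tmax
               for o in all_objects
               if o < last and o not in objects)


# {o1, o4}: adding the skipped o2 keeps Tmax = {t1, t2, t4}, so prune.
print(backward_pruned({1, 4}, SNAPSHOTS, ALL_OBJECTS))
# {o2, o4}: adding o1 or o3 shrinks the timeset, so the node is kept.
print(backward_pruned({2, 4}, SNAPSHOTS, ALL_OBJECTS))
```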

ObjectGrowth: Forward Closure Checking. Nodes that pass the Apriori and Backward pruning rules are NOT necessarily closed swarms. For example, ({o1, o2}, {t1, t2, t4}) is not a closed swarm because there is a (closed) swarm with the same timeset in its subtree.
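The forward check looks in the other direction: at objects that could still be added further down the subtree. The sketch below uses hypothetical data and illustrative names, and again assumes objects are enumerated in increasing id order.

```python
# Hypothetical toy data: timestamp -> list of clusters (sets of object ids).
SNAPSHOTS = {
    1: [{1, 2, 4}, {3}],
    2: [{1, 2, 4}, {3}],
    3: [{1, 3}, {2, 4}],
    4: [{1, 2, 4}, {3}],
}
ALL_OBJECTS = [1, 2, 3, 4]


def max_timeset(objects, snapshots):
    """Tmax(O): every timestamp at which all objects in O share a cluster."""
    return {t for t, clusters in snapshots.items()
            if any(objects <= c for c in clusters)}


def passes_forward_check(objects, snapshots, all_objects):
    """False if some later object (id above the last added one) can join O
    without shrinking Tmax(O); the node is then not object-closed, but its
    subtree must still be explored."""
    tmax = max_timeset(objects, snapshots)
    last = max(objects)
    return not any(max_timeset(objects | {o}, snapshots) == tmax
                   for o in all_objects if o > last)


# {o1, o2}: o4 can be added while keeping {t1, t2, t4}, so not closed.
print(passes_forward_check({1, 2}, SNAPSHOTS, ALL_OBJECTS))
# {o1, o2, o4}: nothing later can be added, so the check passes.
print(passes_forward_check({1, 2, 4}, SNAPSHOTS, ALL_OBJECTS))
```

Unlike the two pruning rules, a failed forward check only suppresses output for the node itself; the closed swarm it points to lives in the node's own subtree.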

ObjectGrowth: Identification of Closed Swarms. A closed swarm must pass all the rules: the Apriori, Backward, and Forward rules. Conversely, must a node that passes all the rules be a closed swarm? YES, if |O| ≥ min_o. With this theorem, closed swarms can be output on the fly during the search.

ObjectGrowth: Summary (min_o = 2, min_t = 2). Start with the empty objectset. Node annotations in the search-tree figure: not a closed swarm by Forward Closure Checking; pruned by Apriori (four nodes); pruned by the Backward pruning rule; passed all the rules with |O| ≥ 2, so output as a closed swarm (two nodes). Two closed swarms detected.
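Putting the three rules together, the search can be sketched as a short depth-first procedure. This is a simplified illustration rather than the authors' implementation: the cluster data is hypothetical, Tmax is recomputed from scratch at every node instead of being maintained incrementally, and all names are invented for this example.

```python
# Hypothetical toy data: timestamp -> list of clusters (sets of object ids).
SNAPSHOTS = {
    1: [{1, 2, 4}, {3}],
    2: [{1, 2, 4}, {3}],
    3: [{1, 3}, {2, 4}],
    4: [{1, 2, 4}, {3}],
}


def max_timeset(objects, snapshots):
    """Tmax(O): every timestamp at which all objects in O share a cluster."""
    return {t for t, clusters in snapshots.items()
            if any(objects <= c for c in clusters)}


def object_growth(snapshots, min_o, min_t):
    """Depth-first search over objectsets; returns closed swarms as
    (sorted objects, sorted timestamps) pairs."""
    all_objects = sorted(set().union(
        *(c for clusters in snapshots.values() for c in clusters)))
    closed_swarms = []

    def dfs(objects, last_idx):
        tmax = max_timeset(objects, snapshots)
        # Apriori pruning: supersets can only have smaller timesets.
        if len(tmax) < min_t:
            return
        # Backward pruning: a skipped earlier object fits for free.
        for o in all_objects[:max(last_idx, 0)]:
            if o not in objects and max_timeset(objects | {o}, snapshots) == tmax:
                return
        # Forward closure checking, combined with growing the objectset.
        closed = True
        for idx in range(last_idx + 1, len(all_objects)):
            child = objects | {all_objects[idx]}
            if max_timeset(child, snapshots) == tmax:
                closed = False
            dfs(child, idx)
        if closed and len(objects) >= min_o:
            closed_swarms.append((sorted(objects), sorted(tmax)))

    dfs(frozenset(), -1)  # start with the empty objectset
    return closed_swarms


for swarm in object_growth(SNAPSHOTS, min_o=2, min_t=2):
    print(swarm)
```

On this toy input the search reports two closed swarms, ({o1, o2, o4}, {t1, t2, t4}) and ({o2, o4}, {t1, t2, t3, t4}), each emitted as soon as its node has been fully explored.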


SWARM: A Component in MoveMine. dm.cs.uiuc.edu/movemine. Zhenhui Li et al., “MoveMine: Mining Moving Object Databases” (system demo), SIGMOD’10.

Effectiveness Testing on Real Data. Raw buffalo data: 165 buffalo tracked from 2000 to 2006. DBSCAN is used to preprocess the data (minPts = 5, eps = 0.001).

Swarms Mined from Buffalo Data. Parameters: min_o = 2, min_t = 0.5 (half of the time span). Result: 66 swarms. The timestamps at which the objects are in the same cluster are NOT consecutive.

Comparing with Convoy Mining. Parameters: min_o = 2, min_t = 0.5 (half of the time span). Result: 0 convoys! Parameters: min_o = 2, min_t = 0.2 (20% of the time span, a lower temporal constraint). Result: 1 convoy (figure: the convoy next to a swarm). This convoy is only a subset of one swarm, covering a period of consecutive time.
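The gap between the two results comes down to how the temporal constraint is counted: a swarm counts all timestamps at which the objects are together, whereas a convoy needs one unbroken run. A small sketch with made-up timestamps (not the buffalo data) and illustrative function names:

```python
def longest_consecutive_run(times):
    """Length of the longest run of consecutive integer timestamps."""
    best = run = 0
    prev = None
    for t in sorted(set(times)):
        run = run + 1 if prev is not None and t == prev + 1 else 1
        best = max(best, run)
        prev = t
    return best


def satisfies_swarm(times, min_t):
    """Swarm-style constraint: total number of timestamps together."""
    return len(set(times)) >= min_t


def satisfies_convoy(times, k):
    """Convoy-style constraint: k consecutive timestamps together."""
    return longest_consecutive_run(times) >= k


# Hypothetical timestamps at which one group of objects shares a cluster.
together_at = [1, 2, 4, 5, 6, 9, 10]

print(len(together_at), longest_consecutive_run(together_at))
print(satisfies_swarm(together_at, min_t=5))   # 7 timestamps in total
print(satisfies_convoy(together_at, k=5))      # longest run is only 3
print(satisfies_convoy(together_at, k=3))      # a weaker k is met
```

Lowering the threshold lets the convoy criterion succeed, but only on the consecutive stretch t4 to t6, which is a subset of the timestamps the swarm covers.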

Efficiency: Test on Synthetic Data (varying the database size). Number of objects: 500; number of timestamps: $10^5$. Parameters: min_o = 0.01, min_t = 0.01. VG-Growth is DFS with the Apriori pruning rule only. ObjectGrowth+ is for probabilistic data (see the paper's Appendix).

Efficiency: Test on Synthetic Data (varying the parameter). Number of objects: 500; number of timestamps: $10^5$. Parameters: min_o = 0.01, min_t = 0.01. VG-Growth is DFS with the Apriori pruning rule only. ObjectGrowth+ is for probabilistic data (see the paper's Appendix).


Summary. Our goal is to detect moving object clusters. Swarm, by relaxing the temporal constraint, can discover moving object clusters in real scenarios. The ObjectGrowth algorithm is proposed to mine all closed swarms, using the Apriori pruning rule, the Backward pruning rule, and Forward Closure checking.


Discussion. Missing data interpolation. Different time constraints: A and B are together for 12 days in a year, versus A and B are together for one day in each month. Swarm ranking: A and B form a swarm, and C and D form a swarm; which pair has the closer relationship?

THANKS! http://www.cs.uiuc.edu/homes/zli28 zli28@uiuc.edu