On Discovery of Gathering Patterns from Trajectories

Slides:



Advertisements
Similar presentations
Xiaolei Li, Zhenhui Li, Jiawei Han, Jae-Gil Lee. 1. Motivation 2. Anomaly Definitions 3. Algorithm 4. Experiments 5. Conclusion.
Advertisements

Incremental Clustering for Trajectories
An Interactive-Voting Based Map Matching Algorithm
Swarm: Mining Relaxed Temporal Moving Object Clusters
Mining Compressed Frequent- Pattern Sets Dong Xin, Jiawei Han, Xifeng Yan, Hong Cheng Department of Computer Science University of Illinois at Urbana-Champaign.
Discovering Lag Interval For Temporal Dependencies Larisa Shwartz Liang Tang, Tao Li, Larisa Shwartz1 Liang Tang, Tao Li
A Framework for Clustering Evolving Data Streams Charu C. Aggarwal, Jiawei Han, Jianyong Wang, Philip S. Yu Presented by: Di Yang Charudatta Wad.
Konstanz, Jens Gerken ZuiScat An Overview of data quality problems and data cleaning solution approaches Data Cleaning Seminarvortrag: Digital.
Trajectory Pattern Mining ACMGIS’2011 Hoyoung Jeung† Man Lung Yiu‡ Christian S. Jensen* † Ecole Polytechnique F´ed´erale de Lausanne (EPFL) ‡ Hong Kong.
Visual Event Detection & Recognition Filiz Bunyak Ersoy, Ph.D. student Smart Engineering Systems Lab.
Constructing Popular Routes from Uncertain Trajectories Ling-Yin Wei 1, Yu Zheng 2, Wen-Chih Peng 1 1 National Chiao Tung University, Taiwan 2 Microsoft.
Spatio-Temporal Outlier Detection in Precipitation Data
Continuous Data Stream Processing  Music Virtual Channel – extensions  Data Stream Monitoring – tree pattern mining  Continuous Query Processing – sequence.
Cascading Spatio-Temporal Pattern Discovery P. Mohan, S.Shekhar, J. Shine, J. Rogers CSci 8715 Presented by: Atanu Roy Akash Agrawal.
Generic Object Detection using Feature Maps Oscar Danielsson Stefan Carlsson
Reza Sherkat ICDE061 Reza Sherkat and Davood Rafiei Department of Computing Science University of Alberta Canada Efficiently Evaluating Order Preserving.
Visualization and graphics research group CIPIC January 21, 2003Multiresolution (ECS 289L) - Winter Surface Simplification Using Quadric Error Metrics.
1 Preserving Privacy in GPS Traces via Uncertainty-Aware Path Cloaking by: Baik Hoh, Marco Gruteser, Hui Xiong, Ansaf Alrabady ACM CCS '07 Presentation:
Reducing Uncertainty of Low-sampling-rate Trajectories Kai Zheng, Yu Zheng, Xing Xie, Xiaofang Zhou University of Queensland & Microsoft Research Asia.
Tokyo Research Laboratory © Copyright IBM Corporation 2009 | 2009/04/03 | SDM 09 / Travel-Time Prediction Travel-Time Prediction using Gaussian Process.
A Geometric Framework for Unsupervised Anomaly Detection: Detecting Intrusions in Unlabeled Data Authors: Eleazar Eskin, Andrew Arnold, Michael Prerau,
Time-focused density-based clustering of trajectories of moving objects Margherita D’Auria Mirco Nanni Dino Pedreschi.
1 Verifying and Mining Frequent Patterns from Large Windows ICDE2008 Barzan Mozafari, Hetal Thakkar, Carlo Zaniolo Date: 2008/9/25 Speaker: Li, HueiJyun.
Spatial-Temporal Models in Location Prediction Jingjing Wang 03/29/12.
Discovering Deformable Motifs in Time Series Data Jin Chen CSE Fall 1.
Efficient OLAP Operations for Spatial Data Using P-Trees Baoying Wang, Fei Pan, Dongmei Ren, Yue Cui, Qiang Ding William Perrizo North Dakota State University.
Spatio-temporal Pattern Queries M. Hadjieleftheriou G. Kollios P. Bakalov V. J. Tsotras.
Christoph F. Eick Questions and Topics Review November 11, Discussion of Midterm Exam 2.Assume an association rule if smoke then cancer has a confidence.
August 30, 2004STDBM 2004 at Toronto Extracting Mobility Statistics from Indexed Spatio-Temporal Datasets Yoshiharu Ishikawa Yuichi Tsukamoto Hiroyuki.
Monitoring k-NN Queries over Moving Objects Xiaohui Yu University of Toronto Joint work with Ken Pu and Nick Koudas.
Accelerating Dynamic Time Warping Clustering with a Novel Admissible Pruning Strategy Nurjahan BegumLiudmila Ulanova Jun Wang 1 Eamonn Keogh University.
Trajectory Data Mining Dr. Yu Zheng Lead Researcher, Microsoft Research Chair Professor at Shanghai Jiao Tong University Editor-in-Chief of ACM Trans.
Trajectory Data Mining
Big traffic data processing framework for intelligent monitoring and recording systems 學生 : 賴弘偉 教授 : 許毅然 作者 : Yingjie Xia a, JinlongChen a,b,n, XindaiLu.
Database Management Systems, R. Ramakrishnan 1 Algorithms for clustering large datasets in arbitrary metric spaces.
Extracting stay regions with uncertain boundaries from GPS trajectories a case study in animal ecology Haidong Wang.
1 Complex Spatio-Temporal Pattern Queries Cahide Sen University of Minnesota.
03/02/20061 Evaluating Top-k Queries Over Web-Accessible Databases Amelie Marian Nicolas Bruno Luis Gravano Presented By: Archana and Muhammed.
An Energy-Efficient Approach for Real-Time Tracking of Moving Objects in Multi-Level Sensor Networks Vincent S. Tseng, Eric H. C. Lu, & Kawuu W. Lin Institute.
Graph Indexing From managing and mining graph data.
Similarity Measurement and Detection of Video Sequences Chu-Hong HOI Supervisor: Prof. Michael R. LYU Marker: Prof. Yiu Sang MOON 25 April, 2003 Dept.
Clustering Microarray Data based on Density and Shared Nearest Neighbor Measure CATA’06, March 23-25, 2006 Seattle, WA, USA Ranapratap Syamala, Taufik.
University at BuffaloThe State University of New York Pattern-based Clustering How to cluster the five objects? qHard to define a global similarity measure.
Privacy Vulnerability of Published Anonymous Mobility Traces Chris Y. T. Ma, David K. Y. Yau, Nung Kwan Yip (Purdue University) Nageswara S. V. Rao (Oak.
Mining Coherent Dense Subgraphs across Multiple Biological Networks Vahid Mirjalili CSE 891.
Presented by Niwan Wattanakitrungroj
Managing Massive Trajectories on the Cloud
Tian Xia and Donghui Zhang Northeastern University
An Efficient Algorithm for Incremental Update of Concept space
What Is Cluster Analysis?
T-Share: A Large-Scale Dynamic Taxi Ridesharing Service
Urban Sensing Based on Human Mobility
G10 Anuj Karpatne Vijay Borra
Distance Computation “Efficient Distance Computation Between Non-Convex Objects” Sean Quinlan Stanford, 1994 Presentation by Julie Letchner.
Mining Spatio-Temporal Reachable Regions over Massive Trajectory Data
Clustering Uncertain Taxi data
Mining the Most Influential k-Location Set from Massive Trajectories
Spatio-temporal Pattern Queries
Mining Frequent Itemsets over Uncertain Databases
Yongli Zhang and Christoph F. Eick University of Houston, USA
Finding Fastest Paths on A Road Network with Speed Patterns
On Discovery of Traveling Companions from Streaming Trajectories
Temporal Databases.
Presented by: Mahady Hasan Joint work with
Ying Dai Faculty of software and information science,
Pei Lee, ICDE 2014, Chicago, IL, USA
Continuous Motion Pattern Query
Continuous Density Queries for Moving Objects
Learning from Data Streams
Chapter 11: Indexing and Hashing
Presentation transcript:

On Discovery of Gathering Patterns from Trajectories Kai Zheng, Yu Zheng, Jing Yuan, Shuo Shang ICDE 2013, Brisbane, Australia 12/26/2018

Background Prevalence of trajectory data Usefulness of trajectory data Advance of location-acquistion technology Easier to track moving objects Huge amount, diverse type Usefulness of trajectory data Understand periodic/regular/specific movement behaviour of individuals Obtain travel patterns of groups in certain areas (traffic analysis) Detect abnormality/event that trigger lots of movements Trajectory data analysis Storage and indexing (e.g., spatio-temporal indexing) Query processing (e.g., range, knn, etc) Mining and pattern discovery (e.g., clustering, co-travellers patterns) 12/26/2018

Co-travellers Pattern A group of objects that move together for sufficiently long time Representative definitions Flock (2008 Benkert et al, 2009 Vieira et al, 2006 Gudmundsson et al) Convoy (2008 Jeung et al, 2012 Tang et al) Swarm (2010 Li et al) Difference Pattern Definition of group Time requirement Flock Disc-based cluster Consecutive Convoy Density-based cluster Swarm Non-consecutive strict flexible 12/26/2018

Co-travellers Pattern (cont.) Threshold of co-travelling time period: 2 timestamps Flock: O2, O3, O4 O5 is a co-traveller, but no included Convoy: O2, O3, O4, O5 O1 is not in cluster when t=2 Swarm: all five objects 12/26/2018

Gathering Pattern: intuition Informally, it represents some group events or incidents that involves congregation of moving objects For example: Celebrations Parades Protests Traffic jams/accidents Motivation Sensing, monitoring, predicating significant event Making plans early Quick emergency response 12/26/2018

Attributes of gathering pattern Scale Typically involves a large number of individuals Density High density (but with arbitrary shapes) Durability Last for sufficiently long time period Stationariness The geometric properties (e.g., shape, location) do not change rapidly Existing co-travellers patterns do not desire this Commitment Dedicated members exist, who participate the event for certain (possibly non-consecutive) time period To exclude the dense areas (e.g., busy road intersections) 12/26/2018

Problem we studied Formulating the definition of gathering pattern How to find patterns quickly from a large trajectory set How to find patterns incrementally when new trajectory data come 12/26/2018

Definition of crowd Input Thresholds Trajectories of moving objects Odb within certain time domain Assumption: location snapshot is available for each object at each time instant (some pre-processing may be required such as interpolation) Snapshot cluster: density-based cluster of location snapshot Thresholds Support threshold mc: how many individuals it contains Variation threshold 𝛿 : how rapidly it changes Lifetime threshold kc: How long it will last for Crowd: a consecutive sequence of snapshot clusters At least mc individuals at any time Distance between any consecutive clusters ≤𝛿 Lifetime > kc 12/26/2018

How to measure distance of clusters A cluster is a set of points Hausdorff distance Metric for point set, polygons Small Hausdorff distance means the location and shape of the cluster do not change much Intuitively, the Hausdorff distance is the longest distance one can be forced to travel by an adversary who chooses a point in one of the two sets, from where you must travel to the other set Exactly what we desire! 12/26/2018

Definition of gathering pattern Crowd captures the attributes 1 – 4 No requirement for the commitment of individuals (dedicated member) Participator An object that appears at least kp times in a crowd It may not stay in the crowd continuously Flexible for real scenarios Gathering A crowd that has at least mp participators at any time 12/26/2018

Example Two crowds Participators (kp = 2) Gathering (mp = 3) Cra: C1, C2, C4 Crb: C1, C3, C4 C5 moves too fast Participators (kp = 2) Cra: O1, O2, O3, O4 Crb: O2, O3, O5 Gathering (mp = 3) Cra 12/26/2018

Approach overview Find snapshot clusters at each time point Perform cluster on simplified trajectories first [2008 Jeung et al] Output: a database of snapshot clusters, CDB Detect all the crowds Deal with large number of clusters at each time instant Hausdorff distance evaluation is expensive Discover gathering patterns Check the occurrences of large number of objects in many crowds Downward-closure does not hold (a gathering doesn’t imply its sub-sequences are all gathering) 12/26/2018

Crowd detection algorithm Extend until fail Starting from a cluster at current time Extend it with one of the clusters at the next time instant, which is close to it Continue until cannot extend any more Downward-closure property guarantees the correctness Finding the close clusters is quite costly R-tree and grid index for clusters Prune irrelevant clusters based on fast calculation of distance bound Grid index can also reduce the cost of Hausdorff distance evaluation 12/26/2018

Gathering discovery Check each crowd if it is a gathering pattern Extend-until-fail does not work Test-and-divide framework Test: if the current sequence is a gathering; if not Divide: remove the invalid clusters Performed recursively until no more crowd exists in any subsequence Example All thresholds are set to 3 Test: C5 only has 2 participators Divide: C1,C2,C3,C4 and C6,C7,C8 Output: C1,C2,C3,C4 Time Need to repeatedly count the occurrences in LONG sequence! 12/26/2018

Fast counting Bit Vector Signature Hamming weight Compactly represents the existence of object in each cluster Hamming weight The number of 1 bits in a bit vector Within log2(n) bit operations x = B(o1) mask vector The decimal value is number of 1s in B(o1) 12/26/2018

Incremental discovery algorithm New batch of trajectories are collected periodically Keep the results up-to-date Do not re-compute everything from scratch Incremental crowd extension Only certain old crowds are extensible Incremental gathering update Only need to update in the crowd that has just been extended Re-use the information in the old part of the crowd Terminate the test-and-divide process early 12/26/2018

Experiment Dataset Effectiveness Efficiency 33,000 taxicabs in Beijing 120K trajectories 3 months – March, April and May in 2009 132,480 time instants (minute) Effectiveness Capture the traffic condition Efficiency Crowd detection Gathering discovery Incremental update 12/26/2018

Experiment (cont.) #convoy, swarm, crowd, gathering/day Time of a day Peak time – 6am to 10am and 5pm to 8pm Work time – 10am to 5pm Casual time – 8pm to 5am Weather of a day More gatherings in rainy and snowy weather Moving slowly and minor accidents -- Convoy and swarm find co-travellers -- Gatherings find dense group of low-speed objects Many crowds, few gathering More gathering in peak time Many convoys and swarms during peak and casual time 12/26/2018

Experiment (cont.) Crowd detection Gathering discovery Two pruning strategies with R-tree Grid-based pruning Gathering discovery TAD: test and divide TAD*: with fast counting Incremental update Divide whole dataset into five groups Sequentially append each group 12/26/2018

Conclusion Define the gathering pattern to capture the group events or incidents in real life Efficient discovery algorithms Incremental update upon new arrivals of trajectories Case study for effectiveness of the concept Performance evaluation based on real trajectory dataset 12/26/2018