On Discovery of Traveling Companions from Streaming Trajectories


On Discovery of Traveling Companions from Streaming Trajectories Lu-An Tang, Yu Zheng, Jing Yuan, Jiawei Han, Alice Leung, Chih-Chieh Hung and Wen-Chih Peng

Outline: Introduction; Related Works; Companion Discovery Framework; The Buddy-based Approach; Experiments and Conclusion

Trajectory Data Streams: Technical advances in mobile and tracking devices have led to huge volumes of trajectory data. Trajectory stream: the devices report object locations with timestamps in sequence. Examples: taxi traces by GPS; animal movements; military trajectories on battlefields; check-in sequences in location-based social networks.

Motivation: It is interesting and useful to study partnership in trajectory streams, that is, to discover groups of objects that move together (traveling companions). Applications: animal behavior analysis and migration path study; traffic jam detection and smart driving direction recommendation; anti-crime and anti-terrorism work, battlefield surveillance and control; location-based social networks and online game play.

A Motivation Example: size threshold = 4, time threshold = 4 snapshots. {o1, o2, o3, o4} is the traveling companion.

Problem Formulation: Let δs be the size threshold and δt be the duration threshold. A group of objects q is called a traveling companion if: (1) the members of q are density-connected to one another for a period t, where t ≥ δt; (2) size(q) ≥ δs. Let the trajectory stream be S = {s1, s2, …, si, …}, where each snapshot si = {(o1, x1,i, y1,i), (o2, x2,i, y2,i), …, (on, xn,i, yn,i)}. The task is to discover the traveling companion set Q.

The Challenges: Key issue: objects that travel together are in the same cluster, and the cluster may have an arbitrary shape, so density-based clusters are needed. Efficiency: discover the companions along the data stream (the whole dataset cannot be scanned); scale to a large number of objects and long-lasting trajectories. Effectiveness: report the large and long-lasting companions, rather than small and short-lasting ones.

Our Contributions: Introduce the framework to discover companions by clustering-and-intersection. Improve the technique with smart intersection and closed companions. Propose the buddy-based approach to discover companions with higher efficiency. Evaluate the performance on both real and synthetic datasets.

Outline: Introduction; Related Works; Companion Discovery Framework; The Buddy-based Approach; Experiments and Conclusion

Related Studies: Moving group discovery, Kalnis et al., 2005: two consecutive clusters with similar contents. Flock, Gudmundsson et al., 2004: a group of objects that move together within a circle of user-given radius r, i.e., a disc. Spatio-temporal joins, Bakalov et al., 2005: a pair of objects (only two) that travel together. TraCluster, Lee et al., 2007: the clusters that represent the main moving direction of sub-trajectories.

Convoy Query and Swarm Query: Convoy, Jeung et al., 2008: a group of objects that traveled together continuously for a period of time. Swarm, Li et al., 2010: relaxed temporal moving object clusters. Why don't they work on trajectory streams? Efficiency: high time or I/O costs. Effectiveness: the cluster must be round, i.e., a disc. They generate results only after scanning the entire dataset, which suits static datasets but not data streams.

Outline: Introduction; Related Works; Companion Discovery Framework; The Buddy-based Approach; Experiments and Conclusion

The Framework: Clustering-and-Intersection. A two-step process to retrieve the traveling companions: (1) cluster the objects in each snapshot; (2) intersect the clusters to generate companion candidates. If a candidate meets the size and time thresholds, output it as a companion.
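The two-step process above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes the density-based clustering of each snapshot has already been done and is supplied as sets of object IDs, it measures duration in snapshots, and the names `ci_step` and `discover` are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

# Hypothetical representation: object set -> duration (in snapshots).
Candidates = Dict[FrozenSet[str], int]


def ci_step(candidates: Candidates, clusters: Iterable[Iterable[str]],
            size_thr: int, time_thr: int) -> Tuple[Candidates, List[FrozenSet[str]]]:
    """Process one snapshot: intersect every candidate with every cluster."""
    clusters = [frozenset(c) for c in clusters]
    next_candidates: Candidates = {}

    def keep(objs: FrozenSet[str], duration: int) -> None:
        # Keep only candidates that meet the size threshold; if the same
        # object set is produced twice, remember the longer duration.
        if len(objs) >= size_thr and duration > next_candidates.get(objs, 0):
            next_candidates[objs] = duration

    # Intersection step: every (candidate, cluster) pair is examined.
    for objs, duration in candidates.items():
        for cluster in clusters:
            keep(objs & cluster, duration + 1)

    # Assumption: each cluster of the current snapshot also starts a new candidate.
    for cluster in clusters:
        keep(cluster, 1)

    companions = [o for o, d in next_candidates.items() if d >= time_thr]
    return next_candidates, companions


def discover(stream: Iterable[Iterable[Iterable[str]]],
             size_thr: int, time_thr: int) -> Set[FrozenSet[str]]:
    """Run ci_step over a stream of per-snapshot clusterings."""
    candidates: Candidates = {}
    found: Set[FrozenSet[str]] = set()
    for clusters in stream:
        candidates, companions = ci_step(candidates, clusters, size_thr, time_thr)
        found.update(companions)
    return found
```

Because every candidate is intersected with every cluster, the cost per snapshot grows with the product of the two counts, which is the weakness the later slides address.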

Example of CI: δt=40m, δs=4

Analysis of Clustering-and-Intersection: Pros: guaranteed not to miss any companions. Cons: high costs in both the clustering and the intersection steps. In each snapshot, the intersection is carried out for every pair of candidate and cluster. Some redundant and unnecessary candidates are stored.

The Smart Intersection and Closed Candidates: Can we stop the intersection earlier? Smart intersection: if the objects of a candidate have already been found in a cluster, there is no need to intersect the candidate further with other clusters. Can we add only the necessary candidates? Closed candidate: for a new candidate ri, if there already exists another candidate rj such that ri ⊆ rj and duration(rj) ≥ duration(ri), then ri does not need to be added to memory.
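A hedged Python sketch of these two ideas follows. It assumes clusters within one snapshot are disjoint sets of object IDs and that duration is counted in snapshots; the function name `sc_step`, the returned intersection counter, and the rule of stopping once the leftover objects are too few to meet the size threshold are illustrative choices, not details taken from the slides.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

Candidates = Dict[FrozenSet[str], int]  # object set -> duration (snapshots)


def sc_step(candidates: Candidates, clusters: Iterable[Iterable[str]],
            size_thr: int, time_thr: int
            ) -> Tuple[Candidates, List[FrozenSet[str]], int]:
    """One snapshot with smart intersection and closed-candidate filtering."""
    clusters = [frozenset(c) for c in clusters]
    stored: Candidates = {}

    def add_if_closed(objs: FrozenSet[str], duration: int) -> None:
        if len(objs) < size_thr:
            return
        # Not closed: a stored candidate covers these objects for at least as long.
        if any(objs <= o and d >= duration for o, d in stored.items()):
            return
        # The new candidate makes smaller, shorter-lived stored ones redundant.
        for o in [o for o, d in stored.items() if o <= objs and d <= duration]:
            del stored[o]
        stored[objs] = duration

    intersections = 0
    # Visit long-lasting candidates first so they can suppress redundant ones.
    for objs, duration in sorted(candidates.items(), key=lambda kv: -kv[1]):
        remaining = objs
        for cluster in clusters:
            intersections += 1
            add_if_closed(remaining & cluster, duration + 1)
            remaining = remaining - cluster
            # Smart intersection: stop once the leftover objects can no longer
            # form a candidate (this includes the case where none are left).
            if len(remaining) < size_thr:
                break

    for cluster in clusters:
        add_if_closed(cluster, 1)

    companions = [o for o, d in stored.items() if d >= time_thr]
    return stored, companions, intersections
```

With one candidate and three clusters, the loop can stop after the first cluster if that cluster already contains enough of the candidate's objects, instead of performing three intersections.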

Example of Smart Intersection and Closed Candidates: Once r1's objects are found in c1, stop the intersection; do not add the non-closed candidates.

Outline: Introduction; Related Works; Companion Discovery Framework; The Buddy-based Approach; Experiments and Conclusion

The Bottleneck of Companion Discovery: The clustering step: a density-based clustering algorithm costs O(n²) time without a spatial index. It is costly to maintain a spatial index in each snapshot, since the object locations change a lot [Lee et al., 2003]. The clustering step is indeed the bottleneck.

The Buddy-based Approach: Intuition: speed up the clustering step by reusing the information from previous clusters. Observation: people, animals and other creatures like to travel in small groups (the buddies). Couples and close friends like to travel together; animals migrate in families.

The Buddy Maintenance: Although the buddies may not be large enough to be companions, they can still be used to improve clustering efficiency. The buddy only stores the relationships of objects. The maintenance cost of buddies is low: with the buddy radius, size and center, it is easy to update the buddy's information when member objects are added or removed.
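One way to keep a buddy's size, center and radius up to date is sketched below. This is an assumption-laden illustration: the class name `Buddy` and its fields are hypothetical, the center is taken to be the mean of the member positions, and the radius is recomputed from the members on demand rather than maintained incrementally, which is a simplification.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Buddy:
    """Hypothetical buddy record: members plus running coordinate sums."""
    members: Dict[str, Tuple[float, float]] = field(default_factory=dict)
    sum_x: float = 0.0
    sum_y: float = 0.0

    def add(self, oid: str, x: float, y: float) -> None:
        # Re-adding a known object is treated as a position update.
        if oid in self.members:
            self.remove(oid)
        self.members[oid] = (x, y)
        self.sum_x += x
        self.sum_y += y

    def remove(self, oid: str) -> None:
        x, y = self.members.pop(oid)
        self.sum_x -= x
        self.sum_y -= y

    @property
    def size(self) -> int:
        return len(self.members)

    @property
    def center(self) -> Tuple[float, float]:
        # Constant-time thanks to the running sums; undefined for an empty buddy.
        n = len(self.members)
        return (self.sum_x / n, self.sum_y / n)

    @property
    def radius(self) -> float:
        # Distance from the center to the farthest member (recomputed here).
        cx, cy = self.center
        return max(math.hypot(x - cx, y - cy) for x, y in self.members.values())
```

Keeping running sums means that adding or removing a member changes the center without revisiting the other members.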

The Buddy-based Clustering: How can the buddies help the clustering process? The principles (Lemmas 2 to 4): If a buddy is tight (large enough size with a small radius), all the members of the buddy are density-connected. If the distance between two buddies' centers is large, then the two buddies cannot be directly density-connected. Lemma 4: If two tight buddies are close, then all their members are density-connected.
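The principles above can be turned into cheap buddy-level tests, sketched here in Python. The exact conditions of Lemmas 2 to 4 are not given on the slide, so the thresholds below are conservative bounds derived from the triangle inequality and should be read as assumptions: `eps` and `min_pts` stand for the usual density-based clustering parameters (with `min_pts` counting the point itself), and the function names are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def is_tight(size: int, radius: float, eps: float, min_pts: int) -> bool:
    """Assumed tightness test: with radius <= eps / 2, any two members are at
    most eps apart, so with enough members each one is a core point."""
    return size >= min_pts and radius <= eps / 2


def buddy_pair_relation(c1: Point, r1: float, tight1: bool,
                        c2: Point, r2: float, tight2: bool,
                        eps: float) -> str:
    """Classify two buddies without looking at individual members."""
    d = math.dist(c1, c2)
    # Even the closest possible pair of members is more than eps apart.
    if d - r1 - r2 > eps:
        return "separated"
    # Even the farthest possible pair is within eps; both buddies consist of
    # core points, so all members are density-connected (conservative bound).
    if tight1 and tight2 and d + r1 + r2 <= eps:
        return "connected"
    # Otherwise the member-level distance check is still required.
    return "unknown"
```

Only pairs classified as "unknown" need the object-by-object distance computations, which is where the savings over plain density-based clustering would come from.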

The Buddy-based Companion Discovery: The buddies can also be used to help companion discovery. Construct a buddy index {BID, ObjectSet, CanIDs}. If a buddy stays unchanged, the system only needs to check the buddy ID, without looking at object details in the intersection process, which reduces the number of intersections.
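A rough Python sketch of how such an index might be used is given below. The field names mirror the {BID, ObjectSet, CanIDs} triple on the slide, but the class `BuddyEntry`, the helper functions, and the simplifying assumption that both the candidate and the cluster are unions of unchanged buddies are illustrative only.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Set, Tuple


@dataclass
class BuddyEntry:
    """One row of a hypothetical buddy index, keyed by buddy ID (BID)."""
    objects: FrozenSet[str]                                # ObjectSet
    candidate_ids: Set[int] = field(default_factory=set)   # CanIDs: candidates containing this buddy


def expand(buddy_ids: Set[int], index: Dict[int, BuddyEntry]) -> Set[str]:
    """Resolve buddy IDs back to object IDs, only when details are needed."""
    objects: Set[str] = set()
    for bid in buddy_ids:
        objects |= index[bid].objects
    return objects


def intersect_by_buddy(candidate_bids: Set[int], cluster_bids: Set[int],
                       index: Dict[int, BuddyEntry]) -> Tuple[Set[int], Set[str]]:
    """Intersect a candidate and a cluster by comparing buddy IDs only.

    Assumes every buddy involved is unchanged since the previous snapshot.
    """
    common = candidate_bids & cluster_bids
    return common, expand(common, index)
```

Comparing a handful of buddy IDs replaces comparing every object ID, so the saving grows with the average buddy size.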

Example: The Buddy-based Approach

Outline: Introduction; Related Works; Companion Discovery Framework; The Buddy-based Approach; Experiments and Conclusion

Experiment Setup: Four datasets: two real, two synthetic. The smart-and-closed (SC) and buddy-based (BU) methods are compared with clustering-and-intersection (CI), trajectory clustering (TC) and swarm pattern (SW).

Efficiency Study I: BU costs only 10-20% of the time of CI. SC costs 20-30% of the time of CI. The larger δt is, the less time is needed.

Efficiency Study II: The larger δs is, the fewer companion candidates there are and the less time is needed. If the average buddy size is larger than 2.5, BU outperforms density-based clustering.

Effectiveness Study: CI's precision is low because it reports too many non-closed companions. TC (trajectory clustering) may miss some companions.

Conclusion: We have investigated the problem of companion discovery on streaming trajectories. The clustering-and-intersection framework is introduced as the baseline, and the smart-intersection and closed-candidate improvements are proposed. The buddy-based companion algorithm is proposed for efficient companion discovery. Thank You Very Much! Any Questions?