Travel Time Estimation of a Path using Sparse Trajectories

Slides:



Advertisements
Similar presentations
When Urban Air Quality Meets Big Data
Advertisements

An Interactive-Voting Based Map Matching Algorithm
Driving with Knowledge from the Physical World Jing Yuan, Yu Zheng Microsoft Research Asia.
Urban Computing with Taxicabs
Probabilistic Skyline Operator over Sliding Windows Wenjie Zhang University of New South Wales & NICTA, Australia Joint work: Xuemin Lin, Ying Zhang, Wei.
Query Optimization of Frequent Itemset Mining on Multiple Databases Mining on Multiple Databases David Fuhry Department of Computer Science Kent State.
User Oriented Trajectory Similarity Search ACM SIGKDD International Workshop on Urban Computing (UrbComp 2012), August 12, Beijing, China Haibo.
Learning Location Correlation From GPS Trajectories Yu Zheng Microsoft Research Asia March 16, 2010.
4/15/2017 Using Gaussian Process Regression for Efficient Motion Planning in Environments with Deformable Objects Barbara Frank, Cyrill Stachniss, Nichola.
Constructing Popular Routes from Uncertain Trajectories Authors of Paper: Ling-Yin Wei (National Chiao Tung University, Hsinchu) Yu Zheng (Microsoft Research.
Constructing Popular Routes from Uncertain Trajectories Ling-Yin Wei 1, Yu Zheng 2, Wen-Chih Peng 1 1 National Chiao Tung University, Taiwan 2 Microsoft.
Adaptive Fastest Path Computation on a Road Network : A Traffic Mining Approach Hector Gonzalez Jiawei Han Xiaolei Li Margaret Myslinska John Paul Sondag.
Critical Analysis Presentation: T-Drive: Driving Directions based on Taxi Trajectories Authors of Paper: Jing Yuan, Yu Zheng, Chengyang Zhang, Weilei Xie,
T-Drive : Driving Directions Based on Taxi Trajectories Microsoft Research Asia University of North Texas Jing Yuan, Yu Zheng, Chengyang Zhang, Xing Xie,
Computing with Spatial Trajectories
Trajectories Simplification Method for Location-Based Social Networking Services Presenter: Yu Zheng on behalf of Yukun Cheng, Kai Jiang, Xing Xie Microsoft.
Dieter Pfoser, LBS Workshop1 Issues in the Management of Moving Point Objects Dieter Pfoser Nykredit Center for Database Research Aalborg University, Denmark.
Route Planning Vehicle navigation systems, Dijkstra’s algorithm, bidirectional search, transit-node routing.
Abstract Shortest distance query is a fundamental operation in large-scale networks. Many existing methods in the literature take a landmark embedding.
1 Vehicular Sensor Networks for Traffic Monitoring In proceedings of 17th International Conference on Computer Communications and Networks (ICCCN 2008)
Learning Transportation Mode from Raw GPS Data for Geographic Applications on the Web Yu Zheng, Like Liu, Xing Xie Microsoft Research.
Urban Computing with Taxicabs Yu Zheng Microsoft Research Asia.
Dan Schonfeld Co-Director, Multimedia Communications Laboratory Professor, Departments of ECE, CS & Bioengineering University of Illinois at Chicago.
Reducing Uncertainty of Low-sampling-rate Trajectories Kai Zheng, Yu Zheng, Xing Xie, Xiaofang Zhou University of Queensland & Microsoft Research Asia.
CrowdAtlas: Self-Updating Maps for Cloud and Personal Use Mike Lin.
Graph Data Management Lab, School of Computer Science gdm.fudan.edu.cn XMLSnippet: A Coding Assistant for XML Configuration Snippet.
Mining Interesting Locations and Travel Sequences from GPS Trajectories IDB & IDS Lab. Seminar Summer 2009 강 민 석강 민 석 July 23 rd,
« Performance of Compressed Inverted List Caching in Search Engines » Proceedings of the International World Wide Web Conference Commitee, Beijing 2008)
On Graph Query Optimization in Large Networks Alice Leung ICS 624 4/14/2011.
Determination of Number of Probe Vehicles Required for Reliable Travel Time Measurement in Urban Network.
LECTURE 13. Course: “Design of Systems: Structural Approach” Dept. “Communication Networks &Systems”, Faculty of Radioengineering & Cybernetics Moscow.
Efficient Route Computation on Road Networks Based on Hierarchical Communities Qing Song, Xiaofan Wang Department of Automation, Shanghai Jiao Tong University,
Zibin Zheng DR 2 : Dynamic Request Routing for Tolerating Latency Variability in Cloud Applications CLOUD 2013 Jieming Zhu, Zibin.
Siyuan Liu *#, Yunhuai Liu *, Lionel M. Ni *# +, Jianping Fan #, Minglu Li + * Hong Kong University of Science and Technology # Shenzhen Institutes of.
Mingyang Zhu, Huaijiang Sun, Zhigang Deng Quaternion Space Sparse Decomposition for Motion Compression and Retrieval SCA 2012.
Spatio-temporal Pattern Queries M. Hadjieleftheriou G. Kollios P. Bakalov V. J. Tsotras.
August 30, 2004STDBM 2004 at Toronto Extracting Mobility Statistics from Indexed Spatio-Temporal Datasets Yoshiharu Ishikawa Yuichi Tsukamoto Hiroyuki.
Yu Zheng Microsoft Research, Beijing, China
Efficient Data Compression in Location Based Services Yuni Xia, Yicheng Tu, Mikhail Atallah, Sunil Prabhakar.
Trajectory Data Mining Dr. Yu Zheng Lead Researcher, Microsoft Research Chair Professor at Shanghai Jiao Tong University Editor-in-Chief of ACM Trans.
Trajectory Data Mining Dr. Yu Zheng Lead Researcher, Microsoft Research Chair Professor at Shanghai Jiao Tong University Editor-in-Chief of ACM Trans.
Trajectory Data Mining Dr. Yu Zheng Lead Researcher, Microsoft Research Chair Professor at Shanghai Jiao Tong University Editor-in-Chief of ACM Trans.
Trajectory Data Mining Dr. Yu Zheng Lead Researcher, Microsoft Research Chair Professor at Shanghai Jiao Tong University Editor-in-Chief of ACM Trans.
Trajectory Data Mining
Forecasting Fine-Grained Air Quality Based on Big Data Date: 2015/10/15 Author: Yu Zheng, Xiuwen Yi, Ming Li1, Ruiyuan Li1, Zhangqing Shan, Eric Chang,
Trajectory Data Mining
Graph Indexing From managing and mining graph data.
Location-based Social Networks 6/11/20161 CENG 770.
Presented by: Siddhant Kulkarni Spring Authors: Publication:  ICDE 2015 Type:  Research Paper 2.
Computational Challenges in BIG DATA 28/Apr/2012 China-Korea-Japan Workshop Takeaki Uno National Institute of Informatics & Graduated School for Advanced.
Trajectory Data Mining Dr. Yu Zheng Lead Researcher, Microsoft Research Chair Professor at Shanghai Jiao Tong University Editor-in-Chief of ACM Trans.
ST-MVL: Filling Missing Values in Geo-sensory Time Series Data
A Flexible Spatio-temporal indexing Scheme for Large Scale GPS Tracks Retrieval Yu Zheng, Longhao Wang, Xing Xie Microsoft Research.
Spatial Approximate String Search. Abstract This work deals with the approximate string search in large spatial databases. Specifically, we investigate.
Managing Massive Trajectories on the Cloud
Optimal Acceleration and Braking Sequences for Vehicles in the Presence of Moving Obstacles Jeff Johnson, Kris Hauser School of Informatics and Computing.
T-Share: A Large-Scale Dynamic Taxi Ridesharing Service
Pagerank and Betweenness centrality on Big Taxi Trajectory Graph
Urban Sensing Based on Human Mobility
DNN-Based Urban Flow Prediction
Urban Water Quality Prediction based on Multi-task Multi-view Learning
Mining Spatio-Temporal Reachable Regions over Massive Trajectory Data
Mining the Most Influential k-Location Set from Massive Trajectories
Spatio-temporal Pattern Queries
Chao Zhang1, Yu Zheng2, Xiuli Ma3, Jiawei Han1
Predicting Traffic Dmitriy Bespalov.
CSE572, CBS572: Data Mining by H. Liu
Showcasing work by Jing Yuan, Yu Zheng, Xing Xie, Guangzhou Sun
Topological Signatures For Fast Mobility Analysis
Overview: Chapter 2 Localization and Tracking
Presentation transcript:

Travel Time Estimation of a Path using Sparse Trajectories Dr. Yu Zheng yuzheng@microsoft.com Lead Researcher, Microsoft Research Chair Professor at Shanghai Jiao Tong University

Goal Estimate the travel time of any given path on road network instantly using historical and current trajectories generated by a sample of vehicles D S

Challenges Data sparsity Trajectory concatenation Multiple ways to combine sub-trajectories Length of a sub-trajectory and its support Scalability and efficiency A citywide estimation E.g. Beijing has over 100,000 road segments S 𝑟 1 → 𝑟 2 → 𝑟 3 → 𝑟 4 𝑟 1 + 𝑟 2 + 𝑟 3 ( 𝑟 1 → 𝑟 2 )+ 𝑟 3 𝑟 1 +( 𝑟 2 → 𝑟 3 ) D

Methodology Framework Context-Aware Tensor Decomposition (CATD) Optimal Concatenation (OC)

Methodology Supplementing missing values

Methodology Supplementing missing values ℒ 𝑆, 𝑅,𝑈,𝑇,𝐹,𝐺 = 1 2 𝒜−𝑆 × 𝑅 𝑅 × 𝑈 𝑈 × 𝑇 𝑇 2 + 𝜆 1 2 𝑋−𝑇𝐺 2 + 𝜆 2 2 𝑌−𝑅𝐹 2 + 𝜆 3 2 𝑆 2 + 𝑅 2 + 𝑈 2 + 𝑇 2 + 𝐹 2 + 𝐺 2

Methodology Framework

Methodology Optimal Concatenation Suppose 𝑷 is decomposed as 𝑃 1 || 𝑃 2 ||⋯|| 𝑃 𝑘 true travel time: 𝜇 𝑷 . estimated travel time: 𝑡 𝑃 1 + 𝑡 𝑃 2 +…+ 𝑡 𝑃 𝑘 𝐿𝑆𝐸 𝑷, 𝑃 1 , 𝑃 2 ,⋯, 𝑃 𝑘 ≜ 𝐸 𝜇 𝑷 − 𝑡 𝑃 1 − 𝑡 𝑃 2 −…− 𝑡 𝑃 𝑘 2 argmin 𝑃 1 , 𝑃 2 ,⋯, 𝑃 𝑘 𝐿𝑆𝐸 𝑷, 𝑃 1 , 𝑃 2 ,⋯, 𝑃 𝑘 , subject to 𝑃 1 || 𝑃 2 ||⋯|| 𝑃 𝑘 =𝑷 𝑡 𝑃 1 S 𝑡 𝑃 2 𝑡 𝑃 𝑘 D 𝑃 1 𝑃 2 𝑃 𝑘 𝜇 𝑷

Methodology Optimal Concatenation 𝐿𝑆𝐸 𝑷, 𝑃 1 , 𝑃 2 ,⋯, 𝑃 𝑘 = 𝐸 𝜇 𝑷 − 𝑡 𝑃 1 − 𝑡 𝑃 2 −…− 𝑡 𝑃 𝑘 2 =𝐸 𝜇 𝑃 1 + 𝜇 𝑃 2 +… 𝜇 𝑃 𝑘 − 𝑡 𝑃 1 − 𝑡 𝑃 2 −…− 𝑡 𝑃 𝑘 2 =𝐸 𝑖=1 𝑘 𝜇 𝑃 𝑖 − 𝑡 𝑃 𝑖 2 + 𝑖=1 𝑘 𝑗=1 𝑘 ( 𝜇 𝑃 𝑖 − 𝑡 𝑃 𝑖 )( 𝜇 𝑃 𝑗 − 𝑡 𝑃 𝑗 ) = 𝑖=1 𝑘 𝐸( 𝜇 𝑃 𝑖 − 𝑡 𝑃 𝑖 ) 2 + 𝑖=1 𝑘 𝑗=1 𝑘 𝐸 ( 𝜇 𝑃 𝑖 − 𝑡 𝑃 𝑖 )( 𝜇 𝑃 𝑗 − 𝑡 𝑃 𝑗 ) assuming 𝑡 𝑃 𝑖 and 𝑡 𝑃 𝑗 are independent 𝐸 ( 𝜇 𝑃 𝑖 − 𝑡 𝑃 𝑖 )( 𝜇 𝑃 𝑗 − 𝑡 𝑃 𝑗 ) =𝐸 𝜇 𝑃 𝑖 − 𝑡 𝑃 𝑖 𝐸( 𝜇 𝑃 𝑗 − 𝑡 𝑃 𝑗 )=0 𝐿𝑆𝐸 𝑷, 𝑃 1 , 𝑃 2 ,⋯, 𝑃 𝑘 = 𝑖=1 𝑘 𝐸( 𝜇 𝑃 𝑖 − 𝑡 𝑃 𝑖 ) 2 𝐸( 𝜇 𝑃 𝑖 − 𝑡 𝑃 𝑖 ) 2 =𝐸( 𝜇 𝑃 𝑖 − 1 𝑛 𝑃 𝑖 𝑗=1 𝑛 𝑃 𝑖 𝑡 𝑃 𝑖 ,𝑗 ) 2 = 1 𝑛 𝑃 𝑖 2 𝐸 𝑗=1 𝑛 𝑃 𝑖 ( 𝜇 𝑃 𝑖 − 𝑡 𝑃 𝑖 ,𝑗 ) 2 = 1 𝑛 𝑃 𝑖 2 𝑗=1 𝑛 𝑃 𝑖 𝐸( 𝜇 𝑃 𝑖 − 𝑡 𝑃 𝑖 ,𝑗 ) 2 = 𝟏 𝒏 𝑷 𝒊 𝑽𝒂𝒓( 𝒕 𝑷 𝒊 ,𝒋 )

Methodology Solved by dynamic programming Support 𝑛 𝑃 𝑖 vs. Variance 𝑉𝑎𝑟 𝑡 𝑃 𝑖 ,𝑗 The bigger the support is, the smaller the error is The bigger the variance the bigger the error Solved by dynamic programming Denote 𝑔 𝑃 𝑖 = 1 𝑛 𝑃 𝑖 𝑉𝑎𝑟( 𝑡 𝑃 𝑖 ,𝑗 ) argmin 𝑃 1 , 𝑃 2 ,⋯, 𝑃 𝑘 𝑖=1 𝑘 1 𝑛 𝑃 𝑖 𝑉𝑎𝑟( 𝑡 𝑃 𝑖 ,𝑗 ) subject to 𝑃 1 || 𝑃 2 ||⋯|| 𝑃 𝑘 =𝑷 argmin 𝑃 1 , 𝑃 2 ,⋯, 𝑃 𝑙 𝑗=1 𝑙 𝑔 𝑃 𝑗 subject to 𝑃 1 || 𝑃 2 ||⋯|| 𝑃 𝑙 =𝑃′. 𝑜𝑝𝑡 𝑛 = min 1≤𝑖<𝑛 𝑜𝑝𝑡 𝑖 +𝑔( 𝑃 𝑟 𝑖+1 || 𝑟 𝑖+2 ⋯ ||𝑟 𝑛 )

Methodology Make optimal concatenation more efficient Frequent trajectory pattern mining Not necessary to check every combination Using suffix-tree-based method

Methodology Combining the suffix tree with tensors Searching for frequent trajectory pattern from the suffix tree Find the travel time of a particular user from the tensor

Methodology Deal with efficiency and scalability data-driven space partition an element-wise optimization algorithm Use trajectory patterns as concatenation candidates Indexing recent trajectories for a fast online retrieval

Experiments Datasets Download data here Taxi Trajectories: Generated by 32,670 taxicabs in Beijing From Sept. 1 to Oct. 31, 2013. 673+ million GPS points Total length: over 26 million km. Sampling rate: 96 seconds per point. Road networks: 148,110 nodes and 196,307 edges. Covers a 40×50km spatial range Total length of road segments: 21,985km. POIs: Th273,165 POIs of Beijing 195 tier two categories. Chose the top 10 categories that occur around road segments

Experiments Performance of CATD 𝒜 r /(5×5): 4,736×12,674×4 0.09%   MAE (min) RMSE 𝑇𝐷 0.747 1.646 𝑇𝐷+𝐻 0.732 1.629 𝐶𝐴𝑇𝐷 (𝑇𝐷+𝐻+𝐶) 0.714 1.613 Number of time slots to formulate a tensor. tradeoff The number of partition. Tradeoff Metrics 𝑀𝐴𝐸= 𝑖 | 𝑦 𝑖 − 𝑦 𝑖 | 𝑛 𝑀𝑅𝐸= 𝑖 | 𝑦 𝑖 − 𝑦 𝑖 | 𝑖 𝑦 𝑖 𝑅𝑀𝑆𝐸= 𝑖 ( 𝑦 𝑖 − 𝑦 𝑖 ) 2 𝑛

Experiments Query paths From taxi trajectories In the field study 12,384 queries, 50 paths per hour/day Traveled by at least two drivers Total length 76,412.6km Effective time span: 2,734 hours In the field study 114 queries from Sept. 1 to Oct. 30, 2013 Total length 999km Effective time span: 62 hours We select over 12,000 query paths, each of which has been entirely traveled at least twice by taxis. The average travel time of these taxis on such a route is used as the ground truth for the travel time estimation. We guarantee these queries have a wide a coverage in geospatial and temporal spaces.

Experiments Baselines Query paths from taxi data In-the-field study Speed-Constraint-based (SC) method Trajectory-based Simple Concatenation (TSC) method. Optimal Concatenation with Historical Travel Time (OC+H) method. Optimal Concatenation with Nonnegative Matrix Factorization (OC+MF) Query paths from taxi data   MAE (min) MRE MAE/L (min/km) 𝑆𝐶 8.808 0.665 1.428 𝑇𝑆𝐶 5.244 0.396 0.850 𝑂𝐶+𝐻 3.245 0.245 0.526 𝑂𝐶+𝑀𝐹 3.061 0.231 0.496 𝑷𝑻𝑻𝑬 2.545 0.192 0.412 In-the-field study   MAE (min) MRE MAE/L (min/km) 𝑆𝐶 18.193 0.561 2.075 𝑇𝑆𝐶 11.300 0.349 1.289 𝑂𝐶+𝐻 4.990 0.154 0.569 𝑂𝐶+𝑀𝐹 4.052 0.125 0.462 𝑷𝑻𝑻𝑬 3.771 0.116 0.430

Deal with missing values (𝐶𝐴𝑇𝐷) Optimal Concatenation (OC) Experiments Efficiency   Components Time Memory (MB) Deal with missing values (𝐶𝐴𝑇𝐷) Building matrix 𝑋, 𝑌 34ms 9 Tensor construction 𝒜 𝑟 44ms 4.4 𝒜 ℎ 233ms 14.6 Decomposition 5×5 6.31min 116 Total 6.4min 144 Optimal Concatenation (OC) Best OC 2.3ms 995 w/o trajectory patterns 8.6ms 877 w/o index 12.2s 714

Download data and codes Conclusion A very fundamental but challenging task Data sparsity Trade off between length of a path and support of trajectories Efficiency and scalability Our method Context-Aware Tensor Decomposition Optimal Concatenation Results Effectiveness 12,384 query paths and 114 in the field study Relative error ratio: 19% and 11.6% Efficiency Partition a city into grids Suffix-tree-based pattern mining and index Infer the travel time of a path in 2.3ms Download data and codes

Search for “Urban Computing” Thanks! Yu Zheng yuzheng@microsoft.com Homepage

Experiments Weekdays Weekends

An element-wise optimization algorithm ×: the matrix multiplication; × 𝑅 : the tensor-matrix multiplication, where the subscript 𝑅 stands for the direction, e.g., 𝐻=𝑆 × 𝑅 𝑅 is 𝐻 𝑖𝑗𝑘 = 𝑖=1 𝑑 𝑅 𝑆 𝑖𝑗𝑘 × 𝑅 𝑖𝑗 ; ⊗ is the tensor outer product (also called Kronecker product); the entries of the 𝑖th row of matrix 𝑅 are represented as 𝑅 𝑖∗ .

Experiments Datasets Download data here Taxi Trajectories: Generated by 32,670 taxicabs in Beijing From Sept. 1 to Oct. 31, 2013. 673+ million GPS points Total length: over 26 million km. Sampling rate: 96 seconds per point. Road networks: 148,110 nodes and 196,307 edges. Covers a 40×50km spatial range Total length of road segments: 21,985km. POIs: Th273,165 POIs of Beijing 195 tier two categories. Chose the top 10 categories that occur around road segments