Passenger Demand Prediction with Cellular Footprints

Passenger Demand Prediction with Cellular Footprints. Jing Chu1, Kun Qian1, Xu Wang1, Lina Yao2, Fu Xiao3, Jianbo Li4, Xin Miao1 and Zheng Yang1. 1School of Software, Tsinghua University; 2School of Computer Science and Engineering, The University of New South Wales; 3School of Computer Science, Nanjing University of Posts and Telecommunications; 4College of Computer Science and Technology, Qingdao University. Presenter: Jing Chu

Motivation [diagram labels: Imbalance; Passenger Demand Prediction; Passenger Waiting Time; Driver Profit]

Motivation: Cellular Data. Research uses of cellular data: user location, mobility modelling, data traffic engineering, user portrait. Compared with the data from online car-hailing platforms, cellular data can also provide crowd-related data.

Related Work: Data sparsity problem: existing work targets the number of pick-ups, ignoring potential passengers who eventually give up taking taxis. Lack of crowd flow analysis: crowd flow reflects potential passenger demand. Improper handling of spatial relations: fixed grid regions fail to capture the complex spatial dependency. Our approach: use a deep learning model to accurately predict citywide passenger demand with cellular data on a flexible region partition.

Challenges: Challenge 1: How to identify region characteristics? Challenge 2: How to realize a reasonable region partition? Challenge 3: How to capture spatial and temporal correlation?

System Overview: Database server module; Pre-processing module; Deep learning module; Visualization and evaluation module

System Overview: Database server module (stores the big cellular data and provides retrieval and aggregation services for fast pre-processing); Pre-processing module; Deep learning module; Visualization and evaluation module

Data Source: Cellular data: collected by a major cellular carrier of China in Shenyang City, China, from Dec 5th, 2016 to Feb 4th, 2017; 290 billion cellular records, 8000 cell towers, 1.5 million covered users. Record fields: unique anonymized user ID, create time, cell tower ID, App ID, URL. Weather data: from the Dark Sky API; weather states, temperature, wind speed, visibility.

System Overview: Database server module; Pre-processing module (passenger information extraction and flexible region partition); Deep learning module; Visualization and evaluation module

Passenger Demand Extraction: Interception and analysis of network data packets: use an HTTP proxy tool to understand the meaning behind the URLs of DiDi Chuxing, then extract passenger demand from the cellular data.

Region Partition: Partition the city by the primary road network from OpenStreetMap. Finer-grained partition: partition the city by the main secondary roads if the passenger demand of any block is too high.

Crowd Outflow Extraction: Crowd outflow is the number of people leaving the region. [Figure labels: Outflow; Start; A region; Other regions]
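
The outflow definition can be illustrated with a short sketch. This is not the authors' pipeline: it assumes every record has already been mapped from a cell tower to a region, and the tuple layout `(user_id, timestamp, region_id)` and the function name `count_outflow` are hypothetical.

```python
from collections import defaultdict

def count_outflow(records):
    """Count, per region, how many times a user leaves it.

    records: iterable of (user_id, timestamp, region_id) tuples.
    The layout is hypothetical; tower-to-region mapping is assumed
    to have been done upstream.
    """
    # Group each user's sightings so they can be ordered in time.
    by_user = defaultdict(list)
    for user_id, timestamp, region_id in records:
        by_user[user_id].append((timestamp, region_id))

    outflow = defaultdict(int)
    for sightings in by_user.values():
        sightings.sort()
        # A change of region between consecutive sightings is one departure.
        for (_, prev_region), (_, next_region) in zip(sightings, sightings[1:]):
            if prev_region != next_region:
                outflow[prev_region] += 1
    return dict(outflow)
```

To obtain an hourly outflow sequence, the same count could be run separately on the records of each hour.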

System Overview: Database server module; Pre-processing module; Deep learning module (the deep learning architecture FlowFlexDP models and predicts the passenger demand for each region); Visualization and evaluation module

Deep Learning Architecture: FlowFlexDP predicts the passenger demand for each region.

Graph Convolutional Neural Network: Model the spatial dependency with a Graph Convolutional Neural Network (GCNN). Building blocks: spectral graph theory; the normalized graph Laplacian; the convolution theorem; a polynomial filter, which considers k-order neighbors.
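
The transcript does not preserve the slide's equations, so the following is only a sketch of the standard construction. It uses the normalized Laplacian L = I - D^{-1/2} A D^{-1/2} and a plain monomial filter y = sum_k theta_k L^k x; the authors may well use a different polynomial basis (Chebyshev polynomials are a common choice). The function names and the use of NumPy are assumptions made for illustration.

```python
import numpy as np

def normalized_laplacian(adj):
    """L = I - D^{-1/2} A D^{-1/2} for a symmetric adjacency matrix."""
    deg = adj.sum(axis=1)
    # Isolated nodes have degree 0; keep their scaling at 0 to avoid dividing by zero.
    inv_sqrt = np.zeros(deg.shape, dtype=float)
    nonzero = deg > 0
    inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    return np.eye(adj.shape[0]) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]

def polynomial_filter(adj, x, theta):
    """Apply y = sum_k theta[k] * L^k x.

    The term of order k only mixes information from nodes at most
    k hops away, which is what "k-order neighbors" refers to.
    """
    lap = normalized_laplacian(adj)
    term = np.asarray(x, dtype=float)  # L^0 x
    y = theta[0] * term
    for coeff in theta[1:]:
        term = lap @ term              # next power of L applied to x
        y = y + coeff * term
    return y
```

On a three-region path graph with a signal placed on the first region, a first-order filter leaves the third region at zero, while a second-order filter reaches it.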

GCNN with Residual Learning: Capture the long-distance spatial dependency by applying residual learning to GCNN. [Figure labels: Residual learning; The residual unit of GCNN]
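
A residual unit adds an identity shortcut around a small stack of layers, so the output is x + F(x). The sketch below assumes F is two (ReLU, graph convolution) steps; the slide's figure is not preserved, so this ordering, the parameter names and the use of NumPy are illustrative assumptions rather than the authors' exact unit.

```python
import numpy as np

def residual_unit(x, graph_op, w1, w2):
    """Return x + F(x), with F = two (ReLU -> graph convolution) steps.

    x:        (regions, channels) feature matrix
    graph_op: (regions, regions) propagation matrix, e.g. a graph filter
    w1, w2:   (channels, channels) weight matrices
    All names and shapes are hypothetical.
    """
    h = graph_op @ np.maximum(x, 0.0) @ w1
    h = graph_op @ np.maximum(h, 0.0) @ w2
    return x + h  # identity shortcut around the two layers
```

Because each unit only has to learn a correction to its input, many units can be stacked, and each extra graph convolution widens the neighborhood a region can draw on.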

Sequences Fusion: Passenger demand = hourly sequence (closeness) + daily sequence (periodicity). Crowd outflow sequences fusion: cross correlation coefficient [figure panels: Office Region; Residential Region]
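
The slide names a cross correlation coefficient without giving its formula. One common reading is the Pearson correlation between the outflow sequence and the demand sequence shifted by a time lag; the sketch below follows that reading, and both the pairing of sequences and the function name are assumptions.

```python
from math import sqrt

def cross_correlation(outflow, demand, lag=0):
    """Pearson correlation between outflow[t] and demand[t + lag], lag >= 0."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    shifted = list(demand)[lag:]
    n = min(len(outflow), len(shifted))
    if n < 2:
        raise ValueError("need at least two overlapping samples")
    a, b = list(outflow)[:n], shifted[:n]
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant sequence carries no correlation information
    return cov / sqrt(var_a * var_b)
```

Scanning a few lags per region shows whether outflow leads demand there, which is one way to compare region types such as office and residential areas.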

Parametric-matrix Based Fusion and External Factors Fusion [diagram labels: Passenger Demand; Crowd Outflow; Parametric-matrix Based Fusion; Weather & Time Metadata (examples: Holiday, Snowy Day); Σ; Regression; Predicted Passenger Demand]
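
Parametric-matrix based fusion is usually written as an element-wise weighted sum, where each branch has its own learnable weight matrix of the same shape as its output. The sketch below assumes that form and adds the external-factor term before any final regression step; the slide's exact wiring is not preserved, so the function name, the argument layout and the use of NumPy are hypothetical.

```python
import numpy as np

def parametric_matrix_fusion(branches, weights, external=None):
    """Fuse branch outputs as sum_i W_i * X_i (element-wise product).

    branches: list of arrays with identical shape, e.g. demand and outflow outputs
    weights:  list of arrays of the same shape (learned in a real model)
    external: optional array of the same shape from weather/time metadata
    """
    fused = sum(w * x for w, x in zip(weights, branches))
    if external is not None:
        fused = fused + external
    return fused
```

Per-element weights let each region decide how much it trusts each branch, for example leaning more on outflow where outflow tracks demand closely.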

System Overview: Database server module; Pre-processing module; Deep learning module; Visualization and evaluation module

Experimental Setting: Dataset split: the first 65% for training, the next 10% for validation and the remaining 25% for testing. Spatio-temporal sequences: Min-Max normalization to [0, 1]. External factors: one-hot coding for day-of-week, time-of-day, holidays and weather state; Min-Max normalization for temperature, wind speed and visibility. Length of the hourly sequence: {3, 4, 5, 6, 7, 8}. Length of the daily sequence: {1, 2, 3, 4, 5, 6, 7, 8}.
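
The preprocessing steps listed here can be sketched in a few lines. The helper names are hypothetical, and the slide does not say which split supplies the minimum and maximum for normalization; taking them from the training portion, as in the usage note, is a common convention and an assumption here.

```python
def chronological_split(series, train=0.65, val=0.10):
    """Split a time-ordered sequence into train/validation/test parts."""
    n = len(series)
    train_end = int(round(n * train))
    val_end = int(round(n * (train + val)))
    return series[:train_end], series[train_end:val_end], series[val_end:]

def min_max_scale(values, lo, hi):
    """Map values linearly so that lo -> 0 and hi -> 1."""
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]

def one_hot(index, size):
    """Encode a category index (e.g. day-of-week 0..6) as a 0/1 vector."""
    if not 0 <= index < size:
        raise ValueError("index out of range")
    return [1 if i == index else 0 for i in range(size)]
```

For example, `train, val, test = chronological_split(demand)` followed by `min_max_scale(test, min(train), max(train))` scales the test data with training statistics only; test values outside the training range then fall slightly outside [0, 1].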

Evaluation: Baselines: HA, the Historical Average model; ARIMA, the Autoregressive Integrated Moving Average model; SARIMA, the Seasonal Autoregressive Integrated Moving Average model; VAR, the Vector Auto-Regressive model; LSTM, Long Short-Term Memory, a recurrent neural network architecture; ANN, the Artificial Neural Network. Metric: RMSE.
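
The metric and the simplest baseline are easy to state in code. RMSE is the square root of the mean squared error. For HA, the sketch assumes the usual variant that averages past values from the same slot of a repeating cycle (for instance the same hour of the day); the slide does not spell out which variant was used, and the function names are hypothetical.

```python
from math import sqrt

def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def historical_average(history, period):
    """Predict the next value as the mean of past values in the same slot.

    history: past observations in time order
    period:  cycle length in steps (e.g. 24 for hourly data with a daily cycle)
    """
    phase = len(history) % period      # slot of the step being predicted
    same_slot = history[phase::period]
    if not same_slot:
        raise ValueError("no past observation for this slot")
    return sum(same_slot) / len(same_slot)
```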

Results: Performance Evaluation. ResGCNN: combines GCNN and residual learning. ResGCNN-D: further adds the passenger demand daily sequence. ResGCNN-DO: further fuses the crowd outflow hourly and daily sequences. FlowFlexDP: our final model. All variants achieve better performance than the baseline by at least 12.99%.

Visualization: Our prediction results are very close to the real state.

Conclusion: We demonstrate cellular data as a rich data source for passenger demand prediction, which has been largely overlooked and unexplored previously. It is the first work that uses crowd flow information from cellular data for passenger demand prediction. We propose a deep learning model, FlowFlexDP, which uses GCNN and residual learning to predict passenger demand on a flexible region partition. Evaluation results on a large-scale real data set show that our model outperforms existing models.

Thanks. Q&A