Published by Ronald Gray. Modified over 6 years ago.
1
Passenger Demand Prediction with Cellular Footprints
Jing Chu (1), Kun Qian (1), Xu Wang (1), Lina Yao (2), Fu Xiao (3), Jianbo Li (4), Xin Miao (1) and Zheng Yang (1)
(1) School of Software, Tsinghua University
(2) School of Computer Science and Engineering, The University of New South Wales
(3) School of Computer Science, Nanjing University of Posts and Telecommunications
(4) College of Computer Science and Technology, Qingdao University
Presenter: Jing Chu
2
Motivation
Imbalance
Passenger Demand Prediction
Passenger Waiting Time
Driver Profit
3
Motivation
Cellular Data: User Location, Mobility Modelling, Data Traffic Engineering, User Portrait Research
Unlike the data from online car-hailing platforms, cellular data can also provide crowd-related data
4
Related Work
Data sparsity problem: Existing work targets the number of pick-ups, ignoring potential passengers who eventually give up taking taxis
Lack of crowd flow analysis: Crowd flow reflects potential passenger demand
Improper handling of spatial relations: Fixed grid regions fail to capture the complex spatial dependency
Our goal: Use a deep learning model to accurately predict citywide passenger demand from cellular data on a flexible region partition
5
Challenges
Challenge 1: How to identify region characteristics?
Challenge 2: How to realize a reasonable region partition?
Challenge 3: How to capture spatial and temporal correlation?
6
System Overview
Database server module
Pre-processing module
Deep learning module
Visualization and evaluation module
7
System Overview
Database server module: Stores the big cellular data and provides retrieval and aggregation services for fast preprocessing
Pre-processing module
Deep learning module
Visualization and evaluation module
8
Data Source
Cellular data: Collected by a major cellular carrier of China
Record: Unique anonymized user ID, create time, cell tower ID, App ID, URL
Period: Dec 5th, 2016 to Feb 4th, 2017
Scale: 290 Billion Cellular Records, 8000 Cell Towers, 1.5 Million Covered Users
Location: Shenyang City, China
Weather data: From Dark Sky API
Weather fields: Weather States, Temperature, Wind Speed, Visibility
9
System Overview
Database server module
Pre-processing module: Passenger information extraction and flexible region partition
Deep learning module
Visualization and evaluation module
10
Passenger Demand Extraction
Intercepting analysis of network data packets: Use an HTTP proxy tool to understand the meaning behind the URLs of DiDi Chuxing
Extract passenger demand from cellular data
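A minimal sketch of the counting step, assuming each cellular record has already been reduced to (user ID, hour, region, URL) and that a ride request can be recognized by a substring of the URL. The marker string below is a hypothetical placeholder, since the slides do not give the actual DiDi Chuxing URL patterns, and counting each user at most once per region and hour is likewise an assumption.

```python
from collections import Counter

# Hypothetical placeholder: the real request URL pattern is not given in the slides.
REQUEST_MARKER = "/hypothetical/ride-request"


def count_demand(records, marker=REQUEST_MARKER):
    """Count distinct requesting users per (region, hour).

    records: iterable of (user_id, hour, region_id, url) tuples.
    A user is counted at most once per region and hour (an assumption).
    """
    seen = set()
    demand = Counter()
    for user_id, hour, region_id, url in records:
        key = (user_id, hour, region_id)
        if marker in url and key not in seen:
            seen.add(key)
            demand[(region_id, hour)] += 1
    return demand


records = [
    ("u1", 8, "A", "http://example.invalid/hypothetical/ride-request?x=1"),
    ("u1", 8, "A", "http://example.invalid/hypothetical/ride-request?x=2"),  # repeat by the same user
    ("u2", 8, "A", "http://example.invalid/hypothetical/ride-request"),
    ("u3", 8, "B", "http://example.invalid/news"),  # unrelated traffic
]
demand = count_demand(records)
print(demand)
```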
11
Region Partition
Partition the city by the primary road network from OpenStreetMap
Finer-grained partition: If the passenger demand of any block is too high, partition further by the main secondary roads
12
Crowd Outflow Extraction
Crowd outflow: The number of people leaving the region
[Figure: outflow starting from a region and going to other regions]
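The definition of crowd outflow can be sketched as follows, assuming each user has already been assigned one region per time slot (for example from the cell tower IDs). The function name, the dictionary layout and the choice to ignore users with no record in the next slot are assumptions for illustration, not details taken from the slides.

```python
from collections import Counter


def crowd_outflow(regions_now, regions_next):
    """Count, per region, users present now who are in a different region next slot.

    Both arguments map user_id -> region_id for one time slot.
    Users with no record in the next slot are ignored (an assumption).
    """
    outflow = Counter()
    for user, region in regions_now.items():
        next_region = regions_next.get(user)
        if next_region is not None and next_region != region:
            outflow[region] += 1
    return outflow


at_8 = {"u1": "A", "u2": "A", "u3": "B", "u4": "A"}
at_9 = {"u1": "B", "u2": "A", "u3": "A"}  # u4 has no record at 9
outflow = crowd_outflow(at_8, at_9)
print(outflow)
```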
13
System Overview
Database server module
Pre-processing module
Deep learning module: Deep learning architecture FlowFlexDP to model and predict the passenger demand for each region
Visualization and evaluation module
14
Deep Learning Architecture FlowFlexDP
Predict the passenger demand for each region
15
Graph Convolutional Neural Network
Model spatial dependency: Graph Convolutional Neural Network
Spectral Graph Theory
The normalized graph Laplacian
Convolution theorem
Polynomial filter: Consider k-order neighbors
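To illustrate why a polynomial filter considers k-order neighbors, here is a small NumPy sketch using the standard normalized graph Laplacian L = I - D^{-1/2} A D^{-1/2} and a filter of the form sum_k theta_k L^k x. The toy path graph and the coefficient values are made up for illustration, and the exact filter parameterization used in FlowFlexDP is not given in the slides.

```python
import numpy as np


def normalized_laplacian(adj):
    """Return L = I - D^{-1/2} A D^{-1/2} for a symmetric adjacency matrix."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg, dtype=float)
    nonzero = deg > 0
    d_inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    return np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]


def polynomial_filter(lap, x, theta):
    """Apply y = sum_k theta[k] * L^k x, which mixes signals up to len(theta)-1 hops away."""
    y = np.zeros_like(x, dtype=float)
    lk_x = x.astype(float)  # L^0 x
    for coeff in theta:
        y += coeff * lk_x
        lk_x = lap @ lk_x  # advance to the next power of L applied to x
    return y


# Toy graph: four regions on a path 0 - 1 - 2 - 3.
adj = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
lap = normalized_laplacian(adj)
x = np.array([1.0, 0.0, 0.0, 0.0])  # signal only on region 0

y1 = polynomial_filter(lap, x, theta=[0.5, 0.5])       # order 1: reaches 1-hop neighbors
y2 = polynomial_filter(lap, x, theta=[0.5, 0.3, 0.2])  # order 2: reaches 2-hop neighbors
print(y1, y2)
```

With the signal placed on region 0, the order-1 filter leaves regions 2 and 3 untouched, while the order-2 filter also reaches region 2.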
16
GCNN with Residual Learning
Capture the long-distance spatial dependency: Apply residual learning to GCNN
Residual learning
The residual unit of GCNN
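A residual unit adds its input back to the output of a small stack of layers, so the unit computes x + F(x). The sketch below assumes F is two polynomial graph convolutions, each preceded by a ReLU. That layout and the coefficients are illustrative assumptions, since the slides do not spell out the internals of the unit.

```python
import numpy as np


def graph_conv(lap, x, theta):
    """Polynomial graph convolution: sum_k theta[k] * L^k x."""
    return sum(c * np.linalg.matrix_power(lap, k) @ x for k, c in enumerate(theta))


def residual_unit(lap, x, theta1, theta2):
    """Return x + F(x), where F is ReLU -> conv -> ReLU -> conv (assumed layout)."""
    h = graph_conv(lap, np.maximum(x, 0.0), theta1)
    h = graph_conv(lap, np.maximum(h, 0.0), theta2)
    return x + h


# Normalized Laplacian of a 3-node path graph 0 - 1 - 2.
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
d = adj.sum(axis=1) ** -0.5
lap = np.eye(3) - d[:, None] * adj * d[None, :]

x = np.array([1.0, -2.0, 3.0])
identity_out = residual_unit(lap, x, theta1=[0.0, 0.0], theta2=[0.0, 0.0])
out = residual_unit(lap, x, theta1=[0.5, 0.5], theta2=[0.5, 0.5])
print(identity_out, out)
```

When F contributes nothing, the unit reduces to the identity mapping, which is what makes stacking many units practical. Each stacked unit also widens the neighborhood a region can draw on.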
17
Sequences Fusion
Passenger demand = Hourly + Daily
Hourly sequence: Closeness
Daily sequence: Periodicity
Crowd Outflow Sequences Fusion
Cross Correlation Coefficient: Office Region, Residential Region
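A sketch of how the hourly (closeness) and daily (periodicity) input sequences could be cut from one region's hourly series. It assumes that closeness means the most recent hours and periodicity means the same hour on previous days. The function name and the 24-hour period are illustrative choices, not details stated in the slides.

```python
import numpy as np


def build_sequences(series, t, n_hourly, n_daily, period=24):
    """Return (closeness, periodicity) inputs for predicting series[t].

    closeness: the n_hourly values immediately before t.
    periodicity: the values at the same hour on the n_daily previous days.
    """
    if t - n_hourly < 0 or t - n_daily * period < 0:
        raise ValueError("not enough history before t")
    closeness = series[t - n_hourly:t]
    periodicity = series[[t - d * period for d in range(n_daily, 0, -1)]]
    return closeness, periodicity


series = np.arange(24 * 4, dtype=float)  # four days of hourly values: 0, 1, ..., 95
closeness, periodicity = build_sequences(series, t=80, n_hourly=3, n_daily=2)
print(closeness, periodicity)
```

The same slicing can be applied to the crowd outflow series to obtain its hourly and daily sequences.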
18
External Factors Fusion
Parametric-matrix Based Fusion: Passenger Demand, Crowd Outflow
External factors: Weather & Time Metadata
Weather: Snowy Day
Time Metadata: Holiday
Σ Regression: Predicted Passenger Demand
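Parametric-matrix based fusion is commonly written as an element-wise weighted sum of branch outputs, sum_i W_i ∘ X_i, where each W_i has one learnable weight per region. The sketch below assumes that form and simply adds the external-factor term afterwards. All weights and values are made up, and the exact combination used in FlowFlexDP is not specified in the slides.

```python
import numpy as np


def parametric_matrix_fusion(branches, weights):
    """Element-wise weighted sum of branch outputs: sum_i W_i * X_i."""
    return sum(w * x for w, x in zip(weights, branches))


# Outputs of the two branches for three regions (illustrative numbers).
demand_branch = np.array([1.0, 2.0, 3.0])
outflow_branch = np.array([4.0, 5.0, 6.0])

# One weight per region and branch; in a real model these would be learned.
w_demand = np.array([0.5, 1.0, 0.0])
w_outflow = np.array([0.5, 0.0, 1.0])

fused = parametric_matrix_fusion([demand_branch, outflow_branch], [w_demand, w_outflow])

# Assumed: the effect of weather and time metadata is added to the fused output.
external_effect = np.array([0.1, 0.1, 0.1])
prediction = fused + external_effect
print(fused, prediction)
```

Per-region weights let the model rely more on crowd outflow in some regions and more on past demand in others.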
19
System Overview
Database server module
Pre-processing module
Deep learning module
Visualization and evaluation module
20
Experimental Setting
Dataset split: The first 65% for training, the next 10% for validation and the remaining 25% for testing
Spatio-temporal sequence: Min-Max normalization to [0,1]
External factors: One-hot coding for day-of-week, time-of-day, holidays and weather state; Min-Max normalization for temperature, wind speed and visibility
The length of hourly sequence: {3, 4, 5, 6, 7, 8}
The length of daily sequence: {1, 2, 3, 4, 5, 6, 7, 8}
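The preprocessing steps above can be sketched in a few lines of NumPy. Fitting the Min-Max bounds on the training portion only is an assumption made here to avoid leaking test statistics. The slides do not say how the bounds were computed, and the helper names are illustrative.

```python
import numpy as np


def chronological_split(n, train=0.65, val=0.10):
    """Return slices for the first 65%, the next 10% and the remaining 25% of n samples."""
    train_end = round(n * train)
    val_end = train_end + round(n * val)
    return slice(0, train_end), slice(train_end, val_end), slice(val_end, n)


def min_max_scale(x, lo, hi):
    """Map values to [0, 1] given the bounds lo and hi."""
    return (np.asarray(x, dtype=float) - lo) / (hi - lo)


def one_hot(index, size):
    """Return a vector of zeros with a single 1 at the given index."""
    v = np.zeros(size)
    v[index] = 1.0
    return v


data = np.arange(100.0)  # stand-in for a time-ordered demand series
train_idx, val_idx, test_idx = chronological_split(len(data))
train, val, test = data[train_idx], data[val_idx], data[test_idx]

lo, hi = train.min(), train.max()  # bounds taken from the training part only (assumption)
scaled_train = min_max_scale(train, lo, hi)

day_of_week = one_hot(2, 7)  # e.g. the third day of the week
print(len(train), len(val), len(test), day_of_week)
```

With bounds fitted on the training part, later values can fall slightly outside [0, 1], which is usually acceptable.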
21
Evaluation
Baselines
HA: The Historical Average model
ARIMA: The Autoregressive Integrated Moving Average model
SARIMA: The Seasonal Autoregressive Integrated Moving Average model
VAR: The Vector Auto-Regressive model
LSTM: Long Short-Term Memory, a Recurrent Neural Network architecture
ANN: The Artificial Neural Network
Metric: RMSE (Root Mean Square Error)
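For reference, RMSE is the square root of the mean squared difference between predicted and actual demand. The sketch below also includes one simple reading of the Historical Average baseline, where each time-of-day slot is predicted by its mean over the training days. The slides do not define HA precisely, so that variant is an assumption, and a period of 3 is used only to keep the example small.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root Mean Square Error between two equally sized sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def historical_average(train, period):
    """Predict each slot of the period by its mean over the training data.

    Assumes len(train) is a multiple of period.
    """
    train = np.asarray(train, dtype=float)
    return train.reshape(-1, period).mean(axis=0)


train = [1, 2, 3, 3, 4, 5]      # two "days" with three slots each
actual_next_day = [2, 3, 6]

prediction = historical_average(train, period=3)
error = rmse(actual_next_day, prediction)
print(prediction, error)
```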
22
Results
Performance Evaluation
ResGCNN: Combines GCNN and residual learning
ResGCNN-D: Further adds the passenger demand daily sequence
ResGCNN-DO: Further fuses the crowd outflow hourly and daily sequences
FlowFlexDP: Our final model
All variants outperform the baseline by at least 12.99%
23
Visualization
Our prediction results are very close to the actual passenger demand
24
Conclusion
We demonstrate cellular data as a rich data source for passenger demand prediction, which has been largely overlooked and unexplored previously
It is the first work that uses crowd flow information from cellular data for passenger demand prediction
We propose a deep learning model, FlowFlexDP, which uses GCNN and residual learning to predict passenger demand on a flexible region partition
Evaluation results on a large-scale real data set show that our model outperforms existing models
25
Thanks Q&A