Lessons Learned Applying Deep Learning Approaches to Forecasting Complex Seasonal Behavior
Andrew Karl, Ph.D.; James Wisnowski, Ph.D.; Lambros Petropoulos
Adsurgo, LLC / USAA

Forecasting Call Center Arrivals
- Process under study: call volumes over 30-minute periods for each of several "skills" at the USAA call center.
- The process mean depends on day of week, period of day, holiday, skill, etc.
- Short- and long-term correlations are present, both within and between days.
- Traditional approaches: ARIMA, Winters' additive smoothing.
- Current best practice: a linear mixed model (LMM) with correlated random day-level intercepts and correlated within-day residuals (a doubly stochastic model); a simplified sketch follows this list.
- Research question: can recurrent neural networks (RNNs) produce better predictions than the LMM?
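A rough sketch of a doubly stochastic fit using R's nlme package; the data frame call_data, its column names, and the AR(1) within-day correlation structure are illustrative assumptions, not the production USAA model:

```r
library(nlme)

# Hypothetical data: one row per 30-minute period, with calendar-effect
# columns, a date identifier, and a within-day period index.
fit_lmm <- lme(
  calls ~ day_of_week + period_of_day + holiday + skill,  # seasonal mean structure
  random = ~ 1 | date,                                    # random day-level intercepts
  correlation = corAR1(form = ~ period_index | date),     # correlated within-day residuals
  data = call_data
)

# Population-level (level = 0) forecasts for future periods
lmm_pred <- predict(fit_lmm, newdata = future_periods, level = 0)
```

The doubly stochastic model described in the talk additionally lets the day-level intercepts be correlated across days; this sketch only captures the within-day residual correlation.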

Recurrent Neural Networks
- A traditional (dense) neural network performs a gradient update as each observation is processed in random order; no time ordering of the observations is assumed.
- In an RNN, the output of each node is fed back as an input to that same node when the next observation is processed; an additional parameter is fit as a coefficient on the node's previous state.
- Useful in time series applications and language processing.
- Google uses an RNN with seven layers to drive Google Translate.
- We used the R interface to the Keras Python package; a minimal example follows.
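A minimal sketch of an Elman-style recurrent layer with the R interface to Keras; the shapes timesteps and n_features and the training arrays x_train and y_train are hypothetical placeholders:

```r
library(keras)

# h_t = tanh(W x_t + U h_{t-1} + b): the extra weight U on the previous
# hidden state is the coefficient a dense layer does not have.
model <- keras_model_sequential() %>%
  layer_simple_rnn(units = 25, input_shape = c(timesteps, n_features)) %>%
  layer_dense(units = 1)   # one-step-ahead call volume forecast

model %>% compile(loss = "mse", optimizer = "adam")
model %>% fit(x_train, y_train, epochs = 20, batch_size = 32)
```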

Tuning the RNN Configuration
- Full factorial experiment on:
  - RNN type (LSTM, GRU, Elman, Dense)
  - Number of layers (1, 2)
  - Number of nodes (25, 50, 75, 100)
  - L2 kernel regularization (0, 0.001)
  - Use of LMM predictions as an input to the RNN (TRUE, FALSE)
- Analyzed with a loglinear variance model: significant differences in variability from RNN type, number of nodes, and inclusion of the LMM predictions.
- The optimal configuration was a GRU RNN (using the LMM predictions as an input) with 50 nodes on each of 2 layers and the kernel L2 regularization parameter set to 0.0001; a sketch of this architecture follows.
- The same configuration was also optimal when the LMM input was disallowed.
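A hedged sketch of the reported optimal architecture in R Keras: two stacked GRU layers with 50 units each and an L2 kernel penalty, with the LMM prediction carried as one of the input features. The input shapes, feature construction, and training settings here are assumptions, not the exact USAA code:

```r
library(keras)

l2_penalty <- 1e-4  # kernel regularization value reported for the tuned model

# x_train: array of shape (samples, timesteps, n_features); one feature can
# carry the LMM prediction for the corresponding period. y_train: observed volumes.
model <- keras_model_sequential() %>%
  layer_gru(units = 50, return_sequences = TRUE,
            kernel_regularizer = regularizer_l2(l2_penalty),
            input_shape = c(timesteps, n_features)) %>%
  layer_gru(units = 50,
            kernel_regularizer = regularizer_l2(l2_penalty)) %>%
  layer_dense(units = 1)

model %>% compile(loss = "mse", optimizer = "adam")
model %>% fit(x_train, y_train, epochs = 50, batch_size = 32, validation_split = 0.2)
```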

Conclusions
- For short-term training windows (five weeks) on moderate-to-large call volumes, the RNNs were unable to match the performance of the LMM. This is consistent with Uber's findings for ride volumes.
- The RNNs did show some improvement over the LMM for splits with sparse call volumes (only a few calls received per day).
- Allowing the RNN to use the LMM predictions as an input significantly improved the RNN's performance, but the resulting predictions were no different from the original LMM predictions.
- RNNs also add programming and maintenance complexity.