Lessons Learned Applying Deep Learning Approaches to Forecasting Complex Seasonal Behavior
Andrew Karl, Ph.D.; James Wisnowski, Ph.D.; Lambros Petropoulos
Adsurgo, LLC / USAA

Forecasting Call Center Arrivals
- Process under study: call volumes over 30-minute periods for each of several "skills" at the USAA call center.
- The process mean depends on day of week, period of day, holiday, skill, etc.
- Short- and long-term correlations are present, both within and between days.
- Traditional approaches: ARIMA, Winters' additive smoothing.
- Current best practice: a linear mixed model (LMM) with correlated random day-level intercepts and correlated within-day residuals (a doubly stochastic model); a simplified sketch follows this list.
- Research question: can recurrent neural networks (RNNs) produce better predictions than the LMM?
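A rough sketch of a doubly stochastic fit using R's nlme package; the data frame call_data, its column names, and the AR(1) within-day correlation structure are illustrative assumptions, not the production USAA model:

```r
library(nlme)

# Hypothetical data: one row per 30-minute period, with calendar-effect
# columns, a date identifier, and a within-day period index.
fit_lmm <- lme(
  calls ~ day_of_week + period_of_day + holiday + skill,  # seasonal mean structure
  random = ~ 1 | date,                                    # random day-level intercepts
  correlation = corAR1(form = ~ period_index | date),     # correlated within-day residuals
  data = call_data
)

# Population-level (level = 0) forecasts for future periods
lmm_pred <- predict(fit_lmm, newdata = future_periods, level = 0)
```

The doubly stochastic model described in the talk additionally lets the day-level intercepts be correlated across days; this sketch only captures the within-day residual correlation.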

Recurrent Neural Networks
- A traditional (dense) neural network performs a gradient update as each observation is processed in random order; no time ordering of the observations is assumed.
- In an RNN, the output of each node is fed back as an input to that same node when the next observation is processed; an additional parameter is fit as a coefficient on the node's previous state.
- Useful in time series applications and language processing.
- Google uses an RNN with seven layers to drive Google Translate.
- We used the R interface to the Keras Python package; a minimal example follows.
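A minimal sketch of an Elman-style recurrent layer with the R interface to Keras; the shapes timesteps and n_features and the training arrays x_train and y_train are hypothetical placeholders:

```r
library(keras)

# h_t = tanh(W x_t + U h_{t-1} + b): the extra weight U on the previous
# hidden state is the coefficient a dense layer does not have.
model <- keras_model_sequential() %>%
  layer_simple_rnn(units = 25, input_shape = c(timesteps, n_features)) %>%
  layer_dense(units = 1)   # one-step-ahead call volume forecast

model %>% compile(loss = "mse", optimizer = "adam")
model %>% fit(x_train, y_train, epochs = 20, batch_size = 32)
```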

Tuning the RNN Configuration
- Full factorial experiment on:
  - RNN type (LSTM, GRU, Elman, Dense)
  - Number of layers (1, 2)
  - Number of nodes (25, 50, 75, 100)
  - L2 kernel regularization (0, 0.001)
  - Use of LMM predictions as an input to the RNN (TRUE, FALSE)
- Analyzed with a loglinear variance model: significant differences in variability from RNN type, number of nodes, and inclusion of the LMM predictions.
- The optimal configuration was a GRU RNN (using the LMM predictions as an input) with 50 nodes on each of 2 layers and the kernel L2 regularization parameter set to 0.0001; a sketch of this architecture follows.
- The same configuration was also optimal when the LMM input was disallowed.
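A hedged sketch of the reported optimal architecture in R Keras: two stacked GRU layers with 50 units each and an L2 kernel penalty, with the LMM prediction carried as one of the input features. The input shapes, feature construction, and training settings here are assumptions, not the exact USAA code:

```r
library(keras)

l2_penalty <- 1e-4  # kernel regularization value reported for the tuned model

# x_train: array of shape (samples, timesteps, n_features); one feature can
# carry the LMM prediction for the corresponding period. y_train: observed volumes.
model <- keras_model_sequential() %>%
  layer_gru(units = 50, return_sequences = TRUE,
            kernel_regularizer = regularizer_l2(l2_penalty),
            input_shape = c(timesteps, n_features)) %>%
  layer_gru(units = 50,
            kernel_regularizer = regularizer_l2(l2_penalty)) %>%
  layer_dense(units = 1)

model %>% compile(loss = "mse", optimizer = "adam")
model %>% fit(x_train, y_train, epochs = 50, batch_size = 32, validation_split = 0.2)
```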

Conclusions
- For short-term training windows (five weeks) on moderate-to-large call volumes, the RNNs were unable to match the performance of the LMM. This is consistent with Uber's findings for ride volumes.
- The RNNs did show some improvement over the LMM for splits with sparse call volumes (only a few calls received per day).
- Allowing the RNN to use the LMM predictions as an input significantly improved the RNN's performance, but the resulting predictions were no different from the original LMM predictions.
- RNNs also add programming and maintenance complexity.