Lessons Learned Applying Deep Learning Approaches to Forecasting Complex Seasonal Behavior
Andrew Karl, Ph.D.; James Wisnowski, Ph.D.; Lambros Petropoulos
Adsurgo, LLC; USAA
Forecasting Call Center Arrivals
- Process under study: call volumes over 30-minute periods for each of several “skills” at the USAA call center
- Process mean depends on day of week, period of day, holiday, skill, etc.
- Short- and long-term correlations: within and between days
- Traditional approaches: ARIMA, Winters additive smoothing
- Current best practice: linear mixed model (LMM) with correlated random day-level intercepts and correlated within-day residuals (a doubly stochastic model)
- Research question: can recurrent neural networks (RNNs) produce better predictions than the LMM?
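To make the two layers of correlation concrete, here is a minimal simulation sketch. It is not the LMM fitted in this study: the AR(1) correlation structure, the constant mean, and every name and parameter value (`simulate_volumes`, `day_rho`, `resid_rho`, and so on) are illustrative assumptions. A correlated random intercept is drawn for each day, and a correlated residual is drawn for each period within the day.

```python
import math
import random


def simulate_volumes(n_days=5, periods_per_day=4, mean=20.0,
                     day_sd=3.0, day_rho=0.6,
                     resid_sd=2.0, resid_rho=0.5, seed=0):
    """Simulate volumes as mean + day-level intercept + within-day residual.

    Both random components follow an AR(1) process (an assumption made
    for illustration; the slides do not state the correlation structure).
    """
    rng = random.Random(seed)
    # Innovation SDs chosen so each AR(1) series keeps a constant variance.
    day_innov_sd = day_sd * math.sqrt(1.0 - day_rho ** 2)
    resid_innov_sd = resid_sd * math.sqrt(1.0 - resid_rho ** 2)

    volumes = []
    day_effect = rng.gauss(0.0, day_sd)
    for day in range(n_days):
        if day > 0:
            # Between-day correlation: today's intercept depends on yesterday's.
            day_effect = day_rho * day_effect + rng.gauss(0.0, day_innov_sd)
        # Residuals restart each day, so they are correlated only within a day.
        resid = rng.gauss(0.0, resid_sd)
        row = []
        for period in range(periods_per_day):
            if period > 0:
                # Within-day correlation: this period depends on the previous one.
                resid = resid_rho * resid + rng.gauss(0.0, resid_innov_sd)
            row.append(mean + day_effect + resid)
        volumes.append(row)
    return volumes


sim = simulate_volumes()
print(len(sim), len(sim[0]))  # number of days, number of periods per day
```

In the process described above, the mean would also vary with day of week, period of day, holiday, and skill; the constant `mean` here only keeps the sketch short.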
Recurrent Neural Networks
- Traditional (dense) neural networks perform a gradient update as each observation is processed in random order; no time order is assumed for the observations
- In an RNN, the output of each node is used as an input to that same node when the next observation is processed; an additional parameter is fit as the coefficient for the previous state in each node
- Useful in time series applications and language processing
- Google uses an RNN with seven layers to drive Google Translate
- We used the R interface to the Keras Python package
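The recurrence described above can be sketched for a single node in plain Python. This is a simplified Elman-style update, not the Keras implementation used in the study; the function name `elman_node_states`, the tanh activation, and the weight values are assumptions chosen for illustration. The `w_recurrent` argument plays the role of the additional coefficient on the node's previous state.

```python
import math


def elman_node_states(inputs, w_input, w_recurrent, bias=0.0):
    """Return the state of one recurrent node after each observation.

    state_t = tanh(w_input * x_t + w_recurrent * state_{t-1} + bias)
    """
    state = 0.0  # the node starts with no memory
    states = []
    for x in inputs:
        # The previous output feeds back into the same node.
        state = math.tanh(w_input * x + w_recurrent * state + bias)
        states.append(state)
    return states


# With a nonzero recurrent weight, the order of the observations matters.
forward = elman_node_states([1.0, 0.0], w_input=1.0, w_recurrent=0.5)
reverse = elman_node_states([0.0, 1.0], w_input=1.0, w_recurrent=0.5)
print(forward[-1] != reverse[-1])
```

Setting `w_recurrent` to 0 removes the memory entirely, and the node then behaves like a dense unit applied to each observation independently.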
Tuning the RNN Configuration
- Full factorial experiment on:
  - RNN type (LSTM, GRU, Elman, Dense)
  - Number of layers (1, 2)
  - Number of nodes (25, 50, 75, 100)
  - L2 kernel regularization (0, 0.001)
  - Use LMM predictions as input for RNN (TRUE, FALSE)
- Analyzed with a loglinear variance model
- Significant differences in variability due to RNN type, number of nodes, and inclusion of LMM predictions
- Optimal RNN configuration: GRU (using LMM predictions as input) with 50 nodes on each of 2 layers and the kernel.L2.reg complexity parameter set to 0.0001
- The same configuration was also optimal when LMM input was disallowed
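As a rough illustration of the size of this design, the factor levels listed above can be enumerated with the Python standard library. The dictionary keys (`rnn_type`, `n_layers`, and so on) are hypothetical labels, and the sketch only builds the grid of configurations; it does not fit any model or reflect how the experiment was actually run.

```python
from itertools import product

# Factor levels as listed on the slide; key names are hypothetical labels.
factors = {
    "rnn_type": ["LSTM", "GRU", "Elman", "Dense"],
    "n_layers": [1, 2],
    "n_nodes": [25, 50, 75, 100],
    "kernel_l2": [0, 0.001],
    "use_lmm_input": [True, False],
}


def full_factorial(factors):
    """Return one dict per combination of factor levels."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


design = full_factorial(factors)
print(len(design))  # 4 * 2 * 4 * 2 * 2 = 128 configurations
```

Each entry of `design` is one configuration to train and score, for example `design[0]` is an LSTM with 1 layer, 25 nodes, no L2 penalty, and LMM predictions as input.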
Conclusions
- For short-term training windows (five weeks) on moderate to large call volumes, RNNs were unable to match the performance of the LMM. This is consistent with Uber's findings for ride volumes.
- RNNs did show some improvement over the LMM for splits with sparse call volumes (only a few calls received per day).
- Allowing the RNN to use the LMM predictions as an input significantly improved the RNN's performance, but the resulting predictions were no different from the original LMM predictions.
- RNNs also add programming and maintenance complexity.