A comparison of automatic model selection procedures for seasonal adjustment
Cathy Jones
Motivation
Automatic forecasting procedures are required for the seasonal adjustment of official statistics time series.
Current seasonal adjustment software offers two automatic procedures.
The aim of this work is to compare the performance of these procedures in the context of seasonal adjustment.
Overview
Time series background
What is seasonal adjustment?
X-13ARIMA-SEATS
ARIMA models
Forecasting
Methods for automatic model selection
Results
Future work
Time Series Background Data source: ONS
Seasonal Adjustment
What is seasonal adjustment? The “process of removing from a time series variations associated with the time of year and/or the arrangement of the calendar”.
Why seasonally adjust? The primary interest of users is movements in the time series; seasonal adjustment removes predictable variation in order to aid interpretation.
Seasonal Adjustment Data source: ONS
X-13ARIMA-SEATS
Chosen as the seasonal adjustment software for use in official statistics (agreed by the Statistical Policy and Standards Committee in 2012).
Produced by the US Census Bureau.
RegARIMA modelling is used to ‘clean’ the series.
Can seasonally adjust with the X-11 algorithm or with SEATS.
RegARIMA models
Time series regression models with ARIMA errors, used to deal with autocorrelation.
Used to forecast/backcast: this reduces revisions to the seasonally adjusted series caused by the asymmetric moving averages used in the X-11 method, and the forecasts are also used for some components of the National Accounts.
‘Cleans’ the time series: estimation and removal of outliers, level shifts, etc.
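For illustration, here is a minimal sketch of a regression model with ARIMA errors fitted and used to forecast. It assumes statsmodels' SARIMAX as a stand-in for the X-13ARIMA-SEATS regARIMA routine; the simulated series and the month-length regressor are placeholders, not ONS data or the actual trading-day regressors.

```python
# Minimal regARIMA-style sketch: a regression effect with ARIMA(0,1,1)(0,1,1)12 errors,
# then forecasts to extend the series. SARIMAX is a stand-in for the X-13 regARIMA routine;
# the series and the month-length regressor are placeholders.
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(0)
idx = pd.date_range("1995-01-01", periods=238, freq="MS")   # Jan 1995 to Oct 2014
y = pd.Series(np.cumsum(rng.normal(size=238)) + 10 * np.sin(2 * np.pi * idx.month / 12),
              index=idx)

# Crude calendar regressor: the length of each month (placeholder for trading-day terms).
month_length = np.asarray(idx.days_in_month, dtype=float).reshape(-1, 1)

model = SARIMAX(y, exog=month_length, order=(0, 1, 1), seasonal_order=(0, 1, 1, 12))
res = model.fit(disp=False)

# Forecast one year ahead, supplying the regressor values for the forecast period.
future_idx = pd.date_range("2014-11-01", periods=12, freq="MS")
future_exog = np.asarray(future_idx.days_in_month, dtype=float).reshape(-1, 1)
print(res.get_forecast(steps=12, exog=future_exog).predicted_mean)
```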
End point problem Data source: ONS
Forecasting Data source: ONS
Why selecting a good model is important
Forecasts are needed to deal with the end point problem (asymmetric averages give an implied forecast).
Good forecasts minimise revisions to the seasonally adjusted estimates.
Selection and estimation of regressors.
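To make the end point problem concrete, the toy sketch below applies a centred moving average to a simulated series with and without first extending it by ARIMA forecasts. The 13-term filter, the data and the (0,1,1)(0,1,1)12 model are illustrative assumptions, not the X-11 filters themselves.

```python
# End point problem, illustrated: a 13-term centred moving average (a simple stand-in for
# the X-11 trend filter) cannot be computed for the last six observations, so the series
# is first extended with ARIMA forecasts. Data and model are placeholders.
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(1)
idx = pd.date_range("2005-01-01", periods=120, freq="MS")
y = pd.Series(50 + 0.1 * np.arange(120) + 5 * np.sin(2 * np.pi * idx.month / 12)
              + rng.normal(scale=0.5, size=120), index=idx)

# Without forecasts: the centred filter leaves the last six values undefined.
trend_plain = y.rolling(window=13, center=True).mean()

# With forecasts: extend the series by one year, then apply the same filter.
res = SARIMAX(y, order=(0, 1, 1), seasonal_order=(0, 1, 1, 12)).fit(disp=False)
extended = pd.concat([y, res.forecast(steps=12)])
trend_extended = extended.rolling(window=13, center=True).mean()[y.index]

print(trend_plain.tail(3))      # NaN at the end of the observed span
print(trend_extended.tail(3))   # filled in using the implied forecasts
```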
Why is automatic model selection needed?
ONS seasonally adjusts thousands of time series: last year TSAB reviewed over 13,500 series.
Manual selection of ARIMA models is very time consuming; we'd never get through them all.
Automatic model selection
X-13ARIMA-SEATS provides two automatic model selection routines: Pickmdl and Automdl.
Pickmdl was used in older versions of the software (from X-11-ARIMA onwards).
Automdl is based on the routine from TRAMO (available from X-12-ARIMA onwards).
Pickmdl
Chooses the first model in the following list that satisfies a number of tests:
(0,1,1)(0,1,1)s
(0,1,2)(0,1,1)s
(2,1,0)(0,1,1)s
(0,2,2)(0,1,1)s
(2,1,2)(0,1,1)s
The tests are:
the average absolute percent error of the forecasted values is within certain limits
the residuals are not correlated
no sign of over-differencing
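A simplified sketch of a Pickmdl-style loop follows: it fits each candidate in order and accepts the first one whose held-back forecast error and residual autocorrelation checks pass (the over-differencing check is omitted). The 15% error limit, holdout length and Ljung-Box cut-off are illustrative assumptions, not the X-13ARIMA-SEATS defaults.

```python
# Simplified Pickmdl-style selection: try the candidate models in order and return the
# first one passing the checks. The error limit, holdout length and Ljung-Box cut-off
# are illustrative, not the X-13ARIMA-SEATS defaults.
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX
from statsmodels.stats.diagnostic import acorr_ljungbox

# Non-seasonal parts of the candidates; each is combined with a (0,1,1)s seasonal part.
CANDIDATES = [(0, 1, 1), (0, 1, 2), (2, 1, 0), (0, 2, 2), (2, 1, 2)]

def pickmdl_sketch(y, s=12, holdout=36, max_avg_abs_pct_error=15.0):
    """y is a pandas Series; returns (order, seasonal_order) of the first passing model."""
    train, test = y.iloc[:-holdout], y.iloc[-holdout:]
    for order in CANDIDATES:
        res = SARIMAX(train, order=order, seasonal_order=(0, 1, 1, s)).fit(disp=False)
        fct = np.asarray(res.forecast(steps=holdout))
        err = 100 * np.mean(np.abs((np.asarray(test) - fct) / np.asarray(test)))
        lb_p = acorr_ljungbox(res.resid, lags=[2 * s], return_df=True)["lb_pvalue"].iloc[0]
        if err <= max_avg_abs_pct_error and lb_p > 0.05:  # forecasts OK, residuals look white
            return order, (0, 1, 1, s)
    return None  # no candidate passed; X-13 would fall back to a default model
```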
Automdl
Fits a default (0,1,1)(0,1,1)s model.
Identification of differencing orders by empirical unit root tests.
Iterative procedure to determine the ARMA model orders (maximum orders are set by default and can be changed).
The identified model is compared to the default, followed by final model checks.
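The sketch below captures only the order-selection flavour of this routine under simplifying assumptions: differencing is fixed at d = 1, D = 1 rather than identified by unit root tests, and candidate ARMA orders are compared by BIC instead of the iterative TRAMO procedure, with the winner then checked against the default airline model.

```python
# Rough sketch of automatic ARMA order selection: with differencing fixed at d=1, D=1,
# compare orders up to small maxima by BIC and keep the best, then check it against the
# default airline model. This mimics the flavour of Automdl only; it is not the
# TRAMO-based routine (no unit root tests, no iterative refinement, no final checks).
import itertools
from statsmodels.tsa.statespace.sarimax import SARIMAX

def automdl_sketch(y, s=12, max_p=3, max_q=3, max_P=1, max_Q=1):
    best = None
    for p, q, P, Q in itertools.product(range(max_p + 1), range(max_q + 1),
                                        range(max_P + 1), range(max_Q + 1)):
        try:
            res = SARIMAX(y, order=(p, 1, q), seasonal_order=(P, 1, Q, s)).fit(disp=False)
        except Exception:
            continue  # skip orders that fail to estimate
        if best is None or res.bic < best[0]:
            best = (res.bic, (p, 1, q), (P, 1, Q, s))
    # Compare the identified model against the default (0,1,1)(0,1,1)s airline model.
    airline = SARIMAX(y, order=(0, 1, 1), seasonal_order=(0, 1, 1, s)).fit(disp=False)
    if best is None or airline.bic <= best[0]:
        return (0, 1, 1), (0, 1, 1, s)
    return best[1], best[2]
```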
Data
Monthly GDP; 176 monthly series and 40 quarterly series.
Data spans from January 1995 to October 2014.
Automatic detection of calendar effects and outliers.
Series that included seasonal breaks were removed.
Results
Stability
Forecast errors
Forecast differences
Differences in seasonal adjustment estimates
Stability of model selected: 5 years, 7 years, 9 years, 11 years, full series
Stability of the model selected
Forecasts Data source: ONS
Average absolute percentage error in within-sample forecast values
Automdl lower 30% of the time; Pickmdl lower 28% of the time; the same 42% of the time.
On average, Automdl's error is roughly 8% lower than Pickmdl's.
Average absolute percentage error in within-sample forecast values (last three years)
Average absolute percentage error in within-sample forecast values (last three years)
Automdl lower 35% of the time; Pickmdl lower 23% of the time; the same 42% of the time.
On average, Automdl's error is roughly 7% lower than Pickmdl's.
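For reference, the error measure compared above can be computed as in the short snippet below; the two input arrays are placeholders standing for the held-back observations and a model's forecasts of them.

```python
# Average absolute percentage error of within-sample forecasts, as compared above.
# `actual` and `forecast` are placeholders: held-back observations and their forecasts.
import numpy as np

def avg_abs_pct_error(actual, forecast):
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return 100.0 * np.mean(np.abs((actual - forecast) / actual))

print(avg_abs_pct_error([100, 102, 98], [101, 100, 99]))  # about 1.3
```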
Forecast results
F = Failures (%), AFD = Average Absolute Forecast Difference (%), BFP = Best Forecast Performance (%), LMV = Lowest Model Variance (%)

Monthly      Automdl   Pickmdl
F            0         27.6
AFD          1         1.54
BFP          47        53
LMV          61        36

Quarterly    Automdl   Pickmdl
F            0         5
AFD
BFP          58        42
LMV          68        32
Seasonal adjustment estimates Data source: ONS
Seasonal adjustment estimates
Pickmdl produced the better adjustment in 6% of cases; Automdl was better in 33% of cases; 41% were exactly the same; 20% showed very little difference.
Data source: ONS
Revision analysis (spans 1 to 5)
Seasonal adjustment revisions
Conclusions
Pickmdl performs better when considering model stability.
Automdl has a lower average absolute percentage error in within-sample forecast values.
Out-of-sample forecast performance: Pickmdl appears slightly better for monthly series; Automdl appears much better for quarterly series.
Little difference in seasonal adjustment revisions between the methods.
Future work
SEATS
Series with differing volatility
Simulated series
How different model selection impacts regressor identification and estimation
Errors on shorter spans
Sliding spans stability
Using current regressors
Thanks for listening!