Big Data at Home Depot
KSU – Big Data Survey Course
Steve Einbender
Advanced Analytics Architect
2 Time Series Concept
3 Operationalizing the Concept
p - AutoRegressive (Auto Correlation)
d - Integrated (Stationarity / Trend)
q - Moving Average (Shocks / Error)
P - Seasonal Auto Correlation
D - Seasonal Trend
Q - Seasonal Error
Seasonal effects: If there are spikes in the data every four periods for quarterly data, or every 12 periods for monthly data, there is a seasonal effect.
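One way to make the seasonal-spike rule concrete is to compute the sample autocorrelation at several lags and see where it peaks. The sketch below is only an illustration: the `quarterly_sales` values and the helper name `autocorrelation` are made up for this example and are not part of the course material.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    deviations = [value - mean for value in series]
    denominator = sum(d * d for d in deviations)
    numerator = sum(deviations[t] * deviations[t - lag] for t in range(lag, n))
    return numerator / denominator


# Hypothetical quarterly series: a spike in every fourth period, six years long.
quarterly_sales = [10, 10, 10, 20] * 6

acf_by_lag = {lag: autocorrelation(quarterly_sales, lag) for lag in range(1, 9)}
seasonal_lag = max(acf_by_lag, key=acf_by_lag.get)

for lag, value in acf_by_lag.items():
    print(f"lag {lag}: {value:+.3f}")
print("strongest autocorrelation at lag", seasonal_lag)
```

For this made-up series the autocorrelation is strongest at lag 4, which is the pattern the slide describes for quarterly data; monthly data with a yearly cycle would be expected to peak at lag 12 instead.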
4 Time Series Parameter Specifications
ARIMA modeling involves three stages:
(1) Identification of the initial p, d, and q parameters
    Autoregressive component (p). Usually 0, 1, or 2
    Integrated component (d). Usually 0, 1, or 2
    Moving average component (q). Usually 0, 1, or 2
(2) Estimation of the p (autoregressive) and q (moving average) components to see if they contribute significantly to the model or if one or the other should be dropped
(3) Diagnosis of the residuals to see if they are random and normally distributed, indicating a good model
An ARIMA(0,1,1) model means no autoregressive component, differencing one time to remove linear trends, and a lag 1 moving average component.
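The ARIMA(0,1,1) description above can be sketched in a few lines. This is a minimal illustration, not an estimation routine: it assumes the sign convention y[t] - y[t-1] = e[t] - theta * e[t-1], no constant term, and a moving average parameter `theta` that is already known. The function names and the sample numbers are hypothetical.

```python
def difference(series, d=1):
    """Apply first differencing d times (the 'I' in ARIMA)."""
    for _ in range(d):
        series = [curr - prev for prev, curr in zip(series, series[1:])]
    return series


def arima011_forecast(series, theta):
    """One-step-ahead forecast from an ARIMA(0,1,1) model without a constant.

    Assumed model: y[t] - y[t-1] = e[t] - theta * e[t-1].
    theta is treated as known here; in practice it is estimated.
    """
    forecast = series[0]  # start-up assumption: first forecast equals first value
    for observed in series[1:]:
        error = observed - forecast          # one-step forecast error e[t]
        forecast = observed - theta * error  # forecast for the next period
    return forecast


linear_trend = [3, 5, 7, 9, 11]
print(difference(linear_trend))  # one difference leaves a constant: [2, 2, 2, 2]

demand = [10, 12, 11, 13, 12, 14]  # made-up values for illustration
print(arima011_forecast(demand, theta=0.6))
```

Under this convention, theta = 0 reduces the forecast to the last observed value (a random walk), and the recursion is the same as simple exponential smoothing with smoothing weight 1 - theta.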
5 Time Series Forecasting System (TSFS) Demo
Data Range identification
View Series graphically
What functions and tests do we use to derive the most accurate Time Series model possible?
    Autocorrelation Function (ACF)
    Partial Autocorrelation Function (PACF)
    Patterns in the ACF/PACF can be used to suggest different models to use.
    White Noise Test
    Dickey-Fuller Unit Root / Stationarity Test
After a candidate set of models is identified, the models are estimated and their fit is assessed.
The best-fitting model is used to generate a forecast.
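To show what two of the tests listed above compute, here is a rough, tool-independent sketch. It is not how TSFS implements them. The white noise check uses the Ljung-Box Q statistic, and the unit root check is the simplest Dickey-Fuller regression (with a constant, no trend, no lagged differences). The two series, the function names, and the hard-coded critical values are assumptions chosen for illustration.

```python
import math
import random


def acf(series, lag):
    """Sample autocorrelation at the given lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    return sum(dev[t] * dev[t - lag] for t in range(lag, n)) / sum(d * d for d in dev)


def ljung_box_q(series, max_lag):
    """Ljung-Box Q statistic; large values reject the white-noise hypothesis."""
    n = len(series)
    return n * (n + 2) * sum(acf(series, k) ** 2 / (n - k) for k in range(1, max_lag + 1))


def dickey_fuller_t(series):
    """t-statistic on gamma in: diff(y)[t] = alpha + gamma * y[t-1] + e[t]."""
    lagged = series[:-1]
    change = [curr - prev for prev, curr in zip(series, series[1:])]
    m = len(lagged)
    mean_x = sum(lagged) / m
    mean_y = sum(change) / m
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, change))
    gamma = sxy / sxx
    alpha = mean_y - gamma * mean_x
    sse = sum((y - alpha - gamma * x) ** 2 for x, y in zip(lagged, change))
    std_error = math.sqrt(sse / (m - 2) / sxx)
    return gamma / std_error


random.seed(42)  # fixed seed so the sketch is repeatable
noise = [random.gauss(0, 1) for _ in range(200)]  # stationary, uncorrelated
trend = [float(t) for t in range(50)]             # strongly autocorrelated

CHI2_95_DF5 = 11.07  # 5% critical value, chi-square with 5 degrees of freedom
DF_5PCT = -2.86      # approximate large-sample 5% critical value (constant, no trend)

print("Ljung-Box Q, trending series:", round(ljung_box_q(trend, 5), 1))
print("Ljung-Box Q, noise series:   ", round(ljung_box_q(noise, 5), 1))
print("Dickey-Fuller t, noise series:", round(dickey_fuller_t(noise), 2))
```

A Q value above the chi-square critical value suggests the series is not white noise, and a Dickey-Fuller t-statistic below the critical value suggests the series is stationary. The Dickey-Fuller statistic does not follow the usual t distribution, which is why it needs its own critical values.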
6 ARIMA with Dynamic Regression
Another use of Time Series is the introduction of Covariates/Predictors.
    An extension of ordinary Regression
    One or more of the Independent Variables (i.e., predictors) are correlated with the Dependent Variable at non-concurrent time lags.
Intervention Analysis
    Two basic activities
        Identify the Functional Form of the Intervention
        Assess the Statistical Significance of the Intervention
Let's look at how we build a Time Series ADS…
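The idea of a predictor that is correlated with the dependent variable at a non-concurrent lag can be illustrated by correlating the two series at several lags and picking the strongest. The sketch below is a simplified stand-in for the identification step of a dynamic regression, not the full model. The `promo` and `sales` series are invented so that sales respond to promotions exactly two periods later.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def lagged_correlation(predictor, response, lag):
    """Correlation between predictor[t - lag] and response[t]."""
    return pearson(predictor[:len(predictor) - lag], response[lag:])


# Hypothetical predictor (e.g. promotion intensity per period).
promo = [1, 4, 2, 8, 5, 7, 3, 9, 6, 2, 8, 1, 5, 3, 7, 4, 9, 6, 2, 5]

# Hypothetical response: from period 2 onward, sales depend on promo two
# periods earlier. The first two values are arbitrary start-up values.
sales = [13, 9] + [3 + 2 * promo[t - 2] for t in range(2, len(promo))]

correlations = {lag: lagged_correlation(promo, sales, lag) for lag in range(5)}
best_lag = max(correlations, key=correlations.get)

for lag, value in correlations.items():
    print(f"lag {lag}: {value:+.3f}")
print("predictor leads the response by", best_lag, "periods")
```

In a real dynamic regression the lagged predictor would then enter the model alongside the ARIMA error structure. An intervention can be handled in a similar way by using a step or pulse indicator series as the predictor.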