Published by Daniela Cain. Modified over 8 years ago.
1
Resampling and its Role in Time Series: Getting more out of your model. Credit: /u/undercome http://www.autobox.com/resample.pptx
2
About Our Presenter
B.B.A. in Statistics from CCNY, M.S. in Statistics from Villanova, A.B.D. in Applied Economics from Penn.
40 years of experience in statistical consulting.
Developed the expert modeling package AUTOBOX as PhD research.
CrossValidated: IrishStat
3
About our Company
Launched 1976
Consulting and Software
Software on Different Platforms
Innovated Intervention Modeling
Only Software to recognize lead/lag on causals automatically
4
Our Agenda
Preface
A Quick Refresher: Assumptions
More Refreshments: OLS and Transfer-Functions
Business Cases
– Non-Causal Example
– Qualitative Example
– Causal Example
Reflection
Q&A
5
PREFACE: WHY WE MODEL
6
A Scientific Approach
“To do science is to search for repeated patterns. To detect anomalies is to identify values that do not follow repeated patterns. For whoever knows the ways of Nature will more easily notice her deviations and, on the other hand, whoever knows her deviations will more accurately describe her ways. One learns the rules by observing when the current rules fail.” - Francis Bacon
Photo Credit: Markus Spiske
7
With time series, our assumptions are often important:
Increasing Volatility
Outlier Nonexistence
Bimodal Distribution
Simpson’s Paradox
8
A TIME SERIES MODEL
9
Two Standard Assumptions:
1. The confidence intervals take into account the variance of the model errors (N.B. not the individual errors).
2. Auto-projective dependence on previous forecasts is incorporated for multi-period forecasts using the ψ-weighted method of interval creation.
10
Confidence Interval Creation Amounts to:
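In the standard Box-Jenkins form, which the wording of this slide appears to refer to, the h-step-ahead interval is built from the ψ weights and the standard deviation of the model errors. The symbols below (the forecast, the error standard deviation, and the normal quantile z) are notation chosen for this sketch and are not taken from the slide:

```latex
\hat{y}_{t+h} \;\pm\; z_{\alpha/2}\,\hat{\sigma}_a \sqrt{\sum_{j=0}^{h-1} \psi_j^{2}}, \qquad \psi_0 = 1
```

The interval widens with the horizon h only through the accumulated ψ weights, so it reflects error variance but not parameter uncertainty or future outliers.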
11
It all stems from the ARIMA estimation process. Computing the psi weights:
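One common way to obtain the ψ weights is the recursion that follows from matching coefficients in ψ(B)φ(B) = θ(B). The sketch below assumes the Box-Jenkins sign convention, and the function names `psi_weights` and `interval_half_width` are hypothetical; it illustrates the idea and is not AUTOBOX's implementation.

```python
def psi_weights(phi, theta, n_weights):
    """Return psi_0 .. psi_{n_weights-1} for phi(B) y_t = theta(B) a_t.

    Sign convention assumed in this sketch (Box-Jenkins):
      phi(B)   = 1 - phi_1 B - ... - phi_p B^p
      theta(B) = 1 - theta_1 B - ... - theta_q B^q
    For an ARIMA model, phi must already include the differencing
    factor, e.g. (1 - 0.5B)(1 - B) gives phi = [1.5, -0.5].
    """
    psi = [1.0]
    for j in range(1, n_weights):
        # The MA term contributes only while j <= q.
        value = -theta[j - 1] if j <= len(theta) else 0.0
        # The AR terms feed earlier psi weights back in.
        for i in range(1, min(j, len(phi)) + 1):
            value += phi[i - 1] * psi[j - i]
        psi.append(value)
    return psi


def interval_half_width(psi, sigma, horizon, z=1.96):
    """Half-width of the h-step interval: z * sigma * sqrt(sum of psi_j^2 for j < h)."""
    return z * sigma * sum(w * w for w in psi[:horizon]) ** 0.5


# AR(1) with phi_1 = 0.5: the weights decay geometrically.
print(psi_weights([0.5], [], 4))
```

For a random walk (phi = [1.0]) every weight equals 1, which is why its interval grows with the square root of the horizon.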
12
Moving Forward
A unified approach is to require an error distribution that is normal (no fat tails) while allowing for forecast uncertainties to be based upon an error distribution with fat tails. In this case, your outlier will not skew the coefficients too much. The coefficients will be close to those estimated with the outlier removed. However, the outlier effects should be considered in the forecast error distribution. Essentially, you'll end up with wider and more realistic forecast confidence bands.
13
A Quick Refresher! Photo Credit: Viktor Hanacek
14
Standard Regression
15
The Standard Regression: Confidence Interval Creation
Incorporates standard errors (uncertainty) of the coefficients.
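For the one-predictor case, the textbook prediction interval makes the role of coefficient uncertainty visible. The notation (s for the residual standard error, x_0 for the new input value) is chosen for this sketch and does not come from the slide:

```latex
\hat{y}_0 \;\pm\; t_{\alpha/2,\,n-2}\; s \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^{2}}{\sum_{i=1}^{n} (x_i - \bar{x})^{2}}}
```

The leading 1 is the variance of a new error; the remaining two terms come from the uncertainty in the estimated intercept and slope, and they grow as x_0 moves away from the sample mean.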
16
Three Unstated Assumptions:
1. Estimated model parameters are presumed to be the global parameters.
2. We are not accounting for uncertainty in future values for causal input series. This only applies to causal models.
3. Anomalies are not expected or accounted for in the future.
17
Some more Refreshments!
18
Refresher: Causal Modeling with OLS
A Simple OLS Model
– Contemporaneous effects only, no lags.
– Y(t) = a + b*X(t)
– Or maybe you recall:
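As a concrete reminder, the coefficients of Y(t) = a + b*X(t) have a closed-form least-squares solution. The sketch below uses a hypothetical helper `fit_ols` and made-up data purely for illustration.

```python
def fit_ols(x, y):
    """Least-squares intercept a and slope b for y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


# Hypothetical data lying exactly on y = 1 + 2x.
a, b = fit_ols([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
print(a, b)
```

Note that each Y(t) is paired only with the X(t) from the same period, so any lead or lag relationship between the series is ignored.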
19
Refresher: Causal Modeling with Transfer Functions (TFs)
Generally more robust to stochastic patterns.
Practically an aggressive form of ARIMA incorporating supporting series.
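For orientation, the general single-input transfer-function model in Box-Jenkins notation is shown below. This is the textbook form, not a model taken from the slides, and the symbols (delay b, polynomials ω, δ, θ, φ in the backshift operator B) are the usual conventions:

```latex
Y_t = c + \frac{\omega(B)}{\delta(B)}\, B^{b} X_t + \frac{\theta(B)}{\phi(B)}\, a_t
```

Setting ω(B) to a constant, δ(B) = 1, b = 0, and θ(B) = φ(B) = 1 recovers the contemporaneous OLS model Y(t) = a + b*X(t) with independent errors.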
20
Because they can look like this:
21
BUSINESS CASE OVERVIEW
22
Business Case: Meet The Players
The Price of Oil (X_COMMON)
Sales (Y)
23
The Standardized Plot
24
The Price of Oil (X_COMMON)
Demand Planners at DepotCo are planning a product launch. They would like to determine the feasibility of this action, given market conditions. Market research tells them that the price of oil is a determinant of whether or not they can enter the market unimpeded. We will perform three univariate analyses to model/predict the behavior of X.
25
OR: The Impact of a Marketing Forecast on Expected System Performance
Marketing Department Planners at XYZ Cellular are providing estimates for the number of future customers based upon promotions and product life cycles. System performance depends critically on the number of customers. How do we incorporate the uncertainty in the exogenous forecast of the number of customers?
26
BUSINESS CASE PART ONE A Univariate Example for the price of oil.
27
The Scenarios We Will Use
Approach | Description
The Regular Case (A) | No adjustment for the 3 assumptions.
The Resampled Case (B) | Model errors are incorporated.
The Robust Case (C) | Model errors and outliers are considered.
We will show three different methods of dealing with the assumptions that were previously mentioned (i.e. global parameters, the CIs' reliance on ψ-weights, and that outliers will not happen in the future).
28
Predicting the Predictor (A)
Basic ARIMA Incorporating Outliers using the Psi Weights
29
Predicting the Predictor (B)
Adjusted for model errors with probability sampling, i.e. resampling the model errors to allow for uncertainty in the parameters.
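A minimal sketch of what resampling model errors can look like, assuming a zero-mean AR(1) model and a plain residual bootstrap. The slides do not describe the exact procedure used, so the model, the function names, and the sample data are all illustrative assumptions.

```python
import random


def fit_ar1(series):
    """Least-squares AR(1) coefficient (no intercept) and its residuals."""
    num = sum(cur * prev for cur, prev in zip(series[1:], series[:-1]))
    den = sum(prev * prev for prev in series[:-1])
    phi = num / den
    residuals = [cur - phi * prev for cur, prev in zip(series[1:], series[:-1])]
    return phi, residuals


def bootstrap_forecasts(series, horizon, n_sims, seed=0):
    """Simulated future paths reflecting both error and parameter uncertainty."""
    rng = random.Random(seed)
    phi, residuals = fit_ar1(series)
    paths = []
    for _ in range(n_sims):
        # Rebuild a pseudo-history from resampled residuals and re-estimate phi.
        pseudo = [series[0]]
        for _t in range(len(series) - 1):
            pseudo.append(phi * pseudo[-1] + rng.choice(residuals))
        phi_star = fit_ar1(pseudo)[0]
        # Project forward from the last real observation with resampled errors.
        level, path = series[-1], []
        for _h in range(horizon):
            level = phi_star * level + rng.choice(residuals)
            path.append(level)
        paths.append(path)
    return paths


def percentile_band(paths, step, lower=0.025, upper=0.975):
    """Empirical interval for one forecast step (0-based) across all paths."""
    values = sorted(path[step] for path in paths)
    last = len(values) - 1
    return values[int(lower * last)], values[int(upper * last)]


# Hypothetical mean-centered series, for illustration only.
history = [1.0, 0.6, -0.2, 0.5, 0.9, 0.1, -0.4, -0.1, 0.7, 0.3]
paths = bootstrap_forecasts(history, horizon=4, n_sims=500)
print(percentile_band(paths, step=3))
```

Because the coefficient is re-estimated on every pseudo-history, the resulting band no longer treats the fitted parameter as the global one, and it does not rely on a normality assumption for the errors.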
30
Predicting the Predictor (C)
Used probability sampling to incorporate model errors as well as the possibility of future outliers.
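One plausible way to let future outliers enter the simulation is sketched below. It assumes an AR(1) model with a known coefficient, and it assumes a pulse occurs at any future step with the same frequency as in the history; the slides do not state the mechanism actually used, so every name and number here is hypothetical.

```python
import random


def simulate_with_pulses(last_value, phi, residuals, pulses, n_obs,
                         horizon, n_sims, seed=0):
    """Future AR(1) paths where each step may also receive a historical pulse.

    Assumptions of this sketch: a pulse occurs at any future step with
    probability len(pulses) / n_obs, its size is drawn from the past
    pulses, and it affects that period only (a one-time additive outlier).
    """
    rng = random.Random(seed)
    pulse_prob = len(pulses) / n_obs
    paths = []
    for _ in range(n_sims):
        level, path = last_value, []
        for _h in range(horizon):
            level = phi * level + rng.choice(residuals)
            observed = level
            if pulses and rng.random() < pulse_prob:
                observed += rng.choice(pulses)
            path.append(observed)
        paths.append(path)
    return paths


# Hypothetical inputs: outlier-adjusted residuals and two detected pulses
# in a history of 40 observations.
clean_residuals = [-0.3, 0.1, 0.2, -0.1, 0.3, -0.2]
past_pulses = [2.5, -1.8]
robust_paths = simulate_with_pulses(0.3, 0.5, clean_residuals, past_pulses,
                                    n_obs=40, horizon=4, n_sims=500)
final_step = [p[-1] for p in robust_paths]
print(min(final_step), max(final_step))
```

Keeping the pulses out of the residual pool but inside the simulation matches the idea that outliers should not distort the coefficients yet should still widen the forecast band.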
31
(B) (C) (A)
32
COMPARISON OF FORECAST TABLES FROM (A) AND (C)
Regular Case (A)
Robust Case (C)
33
Comparing Simulated Residuals for (C)
35
BUSINESS CASE PART TWO Let’s Consider a Qualitative Example.
36
Step 3: Work PDF into Model
Build a model for the series.
Generate a distribution based on the PDF containing SME input; this represents their qualitative analysis.
Resample from the distribution.
Incorporate the pseudo-qualitative confidence interval into the model.
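The resampling step can be sketched as drawing price changes from the experts' stated probabilities and adding them to the model forecast. The outcomes, probabilities, forecast value, and the function name `sme_interval` below are all hypothetical; the slides do not give the actual figures or procedure.

```python
import random


def sme_interval(model_forecast, outcomes, probabilities, n_draws, seed=0,
                 lower=0.025, upper=0.975):
    """Empirical interval after adding resampled SME price changes to a forecast."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("SME probabilities must sum to 1")
    rng = random.Random(seed)
    draws = rng.choices(outcomes, weights=probabilities, k=n_draws)
    values = sorted(model_forecast + change for change in draws)
    last = n_draws - 1
    return values[int(lower * last)], values[int(upper * last)]


# Hypothetical expert view: price changes in dollars and their probabilities.
changes = [1.00, 0.50, 0.00, -0.10]
chances = [0.10, 0.30, 0.40, 0.20]
print(sme_interval(50.0, changes, chances, n_draws=1000))
```

With only a handful of discrete outcomes the interval endpoints can only land on those outcomes; a finer-grained elicitation gives a smoother band.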
37
BUSINESS CASE PART THREE Let’s do some causal modeling
38
Sales (Y) DepotCo wants the model for sales to include the price of oil. Using the three univariate cases for X from before, we will provide potential scenarios to help drive their business decisions.
39
OLS between X and Y for Comparison
No probabilistic methods were employed. Fixed forecasts for the next 4 periods, taken from (A), were used.
40
OLS with Fixed Input Forecasts from (A)
This should serve as a gentle reminder as to the nature of causal modeling with OLS.
1. The future values for X are pre-specified, because of prior analysis.
2. Forecasts for X are detached from the analysis which created them.
3. With no uncertainties around the forecasts, the values for X are ultimately a vector.
4. In short, the uncertainties in variable X are unaccounted for in the analysis of Y.
5. Both fixed-input forecasting and OLS can be a bad move.
41
Transfer-Function (ARMAX) Solution Our Model:
42
The Regular Case: Transfer-Function X(A)|Y
The regular case using ARMAX generates less model error, and the ψ-weighted method of interval creation leads to potentially unreasonably tight intervals.
43
The Resampled Case: Transfer-Function X(B)|Y Incorporating uncertainty in X leads to a wider probability space. Pulses have not been resampled from X.
44
The Robust Case: Transfer-Function X(C)|Y
Resampling with errors in X as well as outlier impacts leads to a wider CI that is more representative of the uncertainty within X, as Pulses are included in the resampling.
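The effect of letting X vary can be sketched by pushing simulated X paths through the Y model instead of a single fixed forecast. For brevity this sketch substitutes the contemporaneous relation Y(t) = a + b*X(t) + e(t) for the full transfer function, and the coefficients, residuals, and X paths are hypothetical.

```python
import random


def propagate_input_uncertainty(x_paths, intercept, slope, y_residuals, seed=0):
    """Turn simulated X paths into simulated Y paths for Y(t) = a + b*X(t) + e(t)."""
    rng = random.Random(seed)
    return [
        [intercept + slope * x + rng.choice(y_residuals) for x in path]
        for path in x_paths
    ]


def band(paths, step, lower=0.025, upper=0.975):
    """Empirical interval for one forecast step (0-based) across simulated paths."""
    values = sorted(path[step] for path in paths)
    last = len(values) - 1
    return values[int(lower * last)], values[int(upper * last)]


# Hypothetical inputs: one fixed X forecast versus a spread of simulated X values.
y_resid = [-1.0, -0.5, 0.0, 0.5, 1.0]
fixed_x = [[60.0] for _ in range(200)]
spread = random.Random(1)
uncertain_x = [[60.0 + spread.uniform(-5.0, 5.0)] for _ in range(200)]

print(band(propagate_input_uncertainty(fixed_x, 10.0, 2.0, y_resid), 0))
print(band(propagate_input_uncertainty(uncertain_x, 10.0, 2.0, y_resid), 0))
```

With the fixed input, the band for Y reflects only the Y-model errors; with simulated inputs, the variation in X is scaled by the slope and added on top.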
45
The Resampled Case (B) | The Robust Case (C) | The Regular Case (A)
46
DepotCo Consults with Industry Experts In order to have a better understanding of their causal variables, DepotCo decides to consult with industry experts. These functional SMEs have qualitative opinions that can improve DepotCo’s analysis. How do we incorporate industry knowledge into the existing analysis? Photo credit: Sebastiaan ter Burg
47
Step 1: Collect SME knowledge
DepotCo hires SMEs for input.
They interview experts for knowledge gathering.
Request that their analysis be numerical in nature, containing a component where their assessment comes down to probability, e.g. What is the percent chance of a $1.00 increase in price? What about 50¢? A 10¢ decrease?
48
Step 2: Establish probabilities of certain outcomes based on their knowledge. PDF – Probability Density Function
49
REFLECTION What have we learned?
50
We have learned…
There are explicit and implicit assumptions in Confidence Interval creation:
1. It will take into account the variance of the model errors.
2. Auto-projective dependence on previous forecasts is incorporated for multi-period forecasts using the ψ-weighted method of interval creation.
3. Estimated model parameters are presumed to be the global parameters.
4. We are not accounting for uncertainty in future values for causal input series. This only applies to causal models.
5. Anomalies are not expected or accounted for in the future.
We can defeat the negative effects of these assumptions by incorporating probabilistic sampling methods into our modeling process.
We can incorporate model errors, outliers, and even qualitative inputs into these intervals.
Resampling can help model causal behavior more accurately, by allowing for uncertainty in the causal variables. Previous methods do not allow for this.
There is no harm in doing this, because if the resampled results are non-normal, we have learned more about our dataset. If they are normal, we lose nothing.
51
Any Questions?
53
Photo Credits https://goo.gl/HW1vdw http://temporausch.com/01/ https://www.viktorhanacek.com/ https://www.flickr.com/photos/ter-burg/ http://www.gratisography.com/
54
Comparing Model Residuals to Simulated Residuals for TF(C)
56
Agenda
We Can Do Better
Why This is the Case
How We Can Improve
The Probabilistic Solution
How this is Accomplished
Walkthrough
Q/A
57
We can do better than we are currently doing.