Published by Maude Palmer. Modified over 9 years ago.
1
Development of a Macro Editing Approach
Work Session on Statistical Data Editing, Topic (v): Editing based on results
21-23 April 2008
WP 30
2
Overview
Introduction to the series of surveys that measure U.S. petroleum product supplied
Limitations of micro editing and the need for an edit approach at the aggregate level
Approach considered for macro editing and the three types of models developed, using one product as an example
In-sample forecast results and out-of-sample forecast performance results
Summary and conclusions
3
The PSRS and Micro Edit Limitations
The surveys, respondents and data collected
–WPSRS: Weekly, six cut-off sample surveys
–MPSRS: Monthly, nine population census surveys
–PSA: Annual of revised monthly estimates, population census
Limitations
–Variability of responses
–Lagged population coverage
Corrective Measures
–Micro editing
–Imputation
4
The Approach
Purpose of Study
–Develop point and interval forecasts at national and regional levels
–One-month-ahead forecasts
Approach
–Econometric time-series models
–Three models: Base, ARMA, and Supplemental
–Micro editing enhanced by the capability to identify outliers at the aggregate level
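As a rough illustration of a one-month-ahead point forecast from a trend-plus-seasonal regression, the sketch below fits an intercept, a linear trend, and 11 monthly dummies by ordinary least squares. The slides do not state the estimation software or method, so least squares, the helper names (`solve`, `design_row`, `fit_base_model`, `forecast_next`), and the synthetic demand series are all assumptions made for this example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def design_row(t):
    """Regressors for month index t: intercept, trend, 11 monthly dummies (month 0 is the baseline)."""
    month = t % 12
    return [1.0, float(t)] + [1.0 if month == k else 0.0 for k in range(1, 12)]


def fit_base_model(y):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    rows = [design_row(t) for t in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def forecast_next(beta, n_obs):
    """Point forecast for the month immediately after the estimation sample."""
    return sum(b * x for b, x in zip(beta, design_row(n_obs)))


# Synthetic, noise-free series (made-up numbers, not EIA data).
seasonal = [300, 250, 150, 0, -100, -150, -150, -100, 0, 100, 200, 280]
history = [3000 + 5 * t + seasonal[t % 12] for t in range(48)]

beta = fit_base_model(history)
next_month = forecast_next(beta, len(history))
print(round(next_month, 2))
```

Because the synthetic series contains no noise, the fitted trend coefficient should recover the slope of 5 used to build it; with real survey data the residual variation is what drives the interval forecast.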
5
Model Development
Model at product level
–Distillate (Low Sulfur, High Sulfur, Total)
–Gasoline
Model at two geographic levels
–National
–Regional (PADD)
6
Model Forms
Base Model: trends and seasonal factors (equation omitted in source)
ARMA Model: Box-Jenkins approach utilizing AR and MA terms to capture the variation and seasonal pattern (equation omitted in source)
Supplemental Model: Base Model with exogenous variables (equation omitted in source)
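The equations for the three model forms did not survive in this transcript. The block below gives generic textbook forms that match the verbal descriptions, as an assumption rather than the authors' exact specification. Here $y_t$ is monthly demand, $D_{m,t}$ are monthly dummy variables, $x_{k,t}$ are exogenous variables, and all coefficient symbols are notation introduced for this sketch.

```latex
% Base Model: linear trend plus seasonal dummies (generic form, assumed)
y_t = \alpha + \beta t + \sum_{m=1}^{11} \gamma_m D_{m,t} + \varepsilon_t

% ARMA Model: generic ARMA(p, q) form (assumed; the lag orders are not given in the slides)
y_t = c + \sum_{i=1}^{p} \phi_i \, y_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j}

% Supplemental Model: Base Model plus K exogenous variables (generic form, assumed)
y_t = \alpha + \beta t + \sum_{m=1}^{11} \gamma_m D_{m,t} + \sum_{k=1}^{K} \delta_k \, x_{k,t} + \varepsilon_t
```

The best-model coefficient table at the end of the deck also lists level-shift, trend-shift, AR, and MA terms, so the estimated models evidently extend these basic forms.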
7
US Distillate Demand: 1996-2006
10
In-Sample One-Month-Out Forecast Evaluation Statistics

Total Distillate Models
          Base      ARMA      Suppl.
RMSE      100.55    126.48    89.36
MAE       83.03     97.32     73.96
MAPE      2.22      2.59      1.98

HSD Models
          Base      ARMA      Suppl.
RMSE      83.48     109.39    71.22
MAE       63.67     84.96     54.99
MAPE      5.76      7.59      5.13

LSD Models
          Base      ARMA      Suppl.
RMSE      74.74     94.84     74.36
MAE       60.22     76.42     59.85
MAPE      2.28      2.93      2.26

Note: There is no evidence of bias in any of the models
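For readers unfamiliar with the three evaluation statistics, the sketch below computes them with their standard definitions, along with the mean error as a simple check on bias. The function name `forecast_errors` and the sample values are made up for illustration, and MAPE is expressed in percent, which is an assumption about how the table reports it.

```python
import math


def forecast_errors(actual, forecast):
    """Standard one-step forecast evaluation statistics."""
    errors = [f - a for a, f in zip(actual, forecast)]
    n = len(errors)
    return {
        # Root mean squared error: penalizes large misses more heavily.
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        # Mean absolute error: average size of a miss, in the units of the data.
        "mae": sum(abs(e) for e in errors) / n,
        # Mean absolute percentage error, in percent of the actual value.
        "mape": 100.0 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n,
        # Mean error: a value far from zero would suggest systematic bias.
        "mean_error": sum(errors) / n,
    }


# Made-up values, not taken from the presentation.
actual = [4000.0, 4100.0, 3900.0, 4200.0]
forecast = [4100.0, 4000.0, 3950.0, 4150.0]
stats = forecast_errors(actual, forecast)
print(stats)
```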
11
U.S. Distillate Demand Best Model Summary Statistics
                      Total     HSD       LSD
Adjusted R^2          0.909     0.898     0.949
S.E. of Regression    96.58     76.98     79.55
Note: Estimation period Jan 1996 through Dec 2006
12
In-Sample Model Fit: Best Model 2000-2006 (± 2 forecast standard errors)
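The band of two forecast standard errors around the point forecast is what turns the model into an edit check: a reported aggregate outside the band is a candidate for review. A minimal sketch of that check follows. The function name `review_flag` and the sample numbers are hypothetical, and the standard error used here is only a stand-in, since the forecast standard error generally differs from the regression standard error.

```python
def review_flag(reported, forecast, forecast_se, width=2.0):
    """Flag a reported aggregate that falls outside forecast +/- width * standard error."""
    lower = forecast - width * forecast_se
    upper = forecast + width * forecast_se
    return {
        "lower": lower,
        "upper": upper,
        "flag": not (lower <= reported <= upper),
    }


# Hypothetical values for one month; 96.58 is used only as a stand-in standard error.
inside = review_flag(reported=4200.0, forecast=4100.0, forecast_se=96.58)
outside = review_flag(reported=4350.0, forecast=4100.0, forecast_se=96.58)
print(inside["flag"], outside["flag"])
```

A flag does not say what went wrong. It only tells the reviewer that the aggregate deserves a closer look before release.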
15
Out-of-Sample Forecast Results: Best Model 2006-2007
16
Out-of-Sample Forecast Results: Best Model 2006-2007, HSD
17
Out-of-Sample Results: Best Model 2006-2007, LSD
18
Regional Models
Regions: Petroleum Administration for Defense District (PADD)
Identify exogenous variables to explain regional patterns of distillate demand
–Residential heating in the Northeast (PADD 1): Heating Degree-Days
–Agriculture in the Midwest (PADD 2): Precipitation

Variable definitions
HDD DEV         Population-Weighted Heating Degree-Days: Deviation from Normal
PRECIP DEV      Area-Weighted Precipitation: Deviation from Long-Term Normal
EMP TRANS       Employment in Transportation Industries
IPI MFG         Index of Industrial Production for Durable Goods
FREIGHT INDX    Transportation Services Index for Freight
PRICE RATIO     Average monthly spot price ratio: No.2 Fuel Oil / Natural Gas

Exogenous Variables Used in Supplemental Distillate Models
Columns: PADD 1, PADD 2, PADD 3, PADD 5, NATIONAL, each split into HSD, LSD, TOT
(The column position of each X was lost in extraction; only the marks per row survive.)
HDD DEV         X X X X X
PRECIP DEV      X X X X
EMP TRANS       X X
IPI MFG         X
FREIGHT INDX    X X X X
PRICE RATIO     X
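To make the HDD DEV variable concrete, the sketch below computes heating degree-days for two areas, weights them by population, and subtracts a normal value. The slides do not describe the calculation, so the 65°F base temperature (the usual U.S. convention), the function names, and all numbers are assumptions for illustration only.

```python
def heating_degree_days(daily_mean_temps_f, base_f=65.0):
    """Sum of degrees by which each day's mean temperature falls below the base."""
    return sum(max(0.0, base_f - t) for t in daily_mean_temps_f)


def population_weighted(values, populations):
    """Population-weighted average of per-area values."""
    total = sum(populations[k] for k in values)
    return sum(values[k] * populations[k] for k in values) / total


# Made-up daily mean temperatures (deg F) and population weights for two areas.
temps = {"area_a": [30.0, 40.0, 70.0], "area_b": [50.0, 60.0, 65.0]}
populations = {"area_a": 3.0, "area_b": 1.0}

hdd = {k: heating_degree_days(v) for k, v in temps.items()}
weighted_hdd = population_weighted(hdd, populations)

normal_hdd = 45.0  # hypothetical long-term normal for the same period
hdd_dev = weighted_hdd - normal_hdd
print(hdd, weighted_hdd, hdd_dev)
```

A positive deviation indicates colder-than-normal weather, which is the situation in which heating demand for distillate would be expected to run above trend.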
19
Regional Model Details: In-Sample Model Fit, PADD 1, HSD
20
Regional Model Details: In-Sample Model Fit, PADD 1, LSD
21
Regional Model Details: In-Sample Model Fit, PADD 2, HSD
22
Regional Model Details: In-Sample Model Fit, PADD 2, LSD
23
Regional Model Details: Out-of-Sample Forecast Results, PADD 1, HSD
24
Regional Model Details: Out-of-Sample Forecast Results, PADD 1, LSD
25
Regional Model Details: Out-of-Sample Forecast Results, PADD 2, HSD
26
Regional Model Details: Out-of-Sample Forecast Results, PADD 2, LSD
27
Benefits & Limitations
How does this improve EIA’s current activities?
–Establishes a range of expected results at the aggregate level that will alert a reviewer when to investigate possible anomalies in the respondent data
–Can identify the region that contributes most to a deviation, guiding further editing and imputation activities prior to data release
–Reduces risk of revisions to released data
Limitations of Modeling
–Reasons for deviations are not always readily apparent: respondent error, structural shifts in consumption, or failure of the model to respond to external influences
–Regional-level models provide guidance, but not necessarily answers
–Ranges may be too large
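One simple way to find the region that contributes most to a deviation is to compare each region's reported value with its own model forecast and rank the differences. The sketch below does this with made-up numbers. The function name `regional_deviations` is hypothetical, and summing regional deviations to a total assumes the regional figures add up to the aggregate of interest, which the slides do not confirm.

```python
def regional_deviations(reported, forecast):
    """Deviation of reported from forecast by region, plus the largest contributor."""
    dev = {region: reported[region] - forecast[region] for region in reported}
    total = sum(dev.values())
    largest = max(dev, key=lambda region: abs(dev[region]))
    return dev, total, largest


# Made-up values for illustration, not EIA data.
reported = {"PADD 1": 1500.0, "PADD 2": 1250.0, "PADD 3": 700.0, "PADD 5": 560.0}
forecast = {"PADD 1": 1300.0, "PADD 2": 1230.0, "PADD 3": 710.0, "PADD 5": 550.0}

dev, total, largest = regional_deviations(reported, forecast)
print(dev, total, largest)
```

In this example most of the total deviation comes from PADD 1, so a reviewer would start with respondents in that region.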
28
Future Plans
Model improvements
–Dynamic adjustments to known issues like shifts
–Better exogenous variables
Automation of gathering and formatting model inputs
–Weather Data
–Economic Data
–Forecast generation
Expand to other key petroleum products
–Gasoline and gasoline subcomponents (currently underway)
–Residual fuel oil
29
US Distillate Demand: Best Model

Total Distillate
Variable       Coefficient    Std. Error    Prob.
C              3034.42        50.14         0.000
@TREND         5.86           0.86          0.000
MONTH ****                                  0.000
L_SEP01        -304.63        45.27         0.000
T_MAY99        9.33           2.02          0.000
T_JAN03        -8.79          1.69          0.000
HDD_DEV        1.08           0.18          0.000
TSI_FRT        12.90          5.82          0.029
AR(4)          -0.32          0.09          0.001
Adj. R^2: 0.909    SE Reg: 96.583    D-W: 2.169

HSD
Variable       Coefficient    Std. Error    Prob.
C              1047.59        45.16         0.000
T_JAN03        -2.78          0.76          0.000
L_OCT01        -60.83         21.06         0.005
MONTH ****                                  0.000
HDD_DEV        0.99           0.13          0.000
PR_RAT(-1)     -48.48         23.84         0.044
AR(7)          -0.22          0.10          0.022
AR(1)          0.16           0.09          0.092
Adj. R^2: 0.898    SE Reg: 76.979    D-W: 1.975

LSD
Variable       Coefficient    Std. Error    Prob.
C              2066.32        63.13         0.000
@TREND         5.69           1.31          0.000
MONTH ****                                  0.000
L_NOV01        -113.44        31.99         0.001
T_MAY98        4.66           1.56          0.004
TSI_FRT(-1)    10.52          6.46          0.106
AR(10)         -0.32          0.10          0.001
MA(3)          0.25           0.10          0.009
Adj. R^2: 0.949    SE Reg: 79.551    D-W: 1.791

Additional term (model column not recoverable from the source layout)
MA(2)          0.19           0.09          0.043

* The variable “MONTH” indicates 11 monthly dummy variables to account for seasonality in demand. In each of the models the probability value is obtained from the results of an F-test of the collective significance of the seasonal dummy variables.
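The D-W column reports the Durbin-Watson statistic, which checks regression residuals for first-order autocorrelation. Values near 2 suggest little autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation. The sketch below applies the standard formula to made-up residual series; the function name is chosen for this example.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences over sum of squares."""
    num = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    den = sum(e * e for e in residuals)
    return num / den


# Made-up residual series for illustration.
alternating = [1.0, -1.0, 1.0, -1.0]  # strongly negatively autocorrelated
constant = [1.0, 1.0, 1.0, 1.0]       # strongly positively autocorrelated

dw_alternating = durbin_watson(alternating)
dw_constant = durbin_watson(constant)
print(dw_alternating, dw_constant)
```

The reported values of 2.169, 1.975, and 1.791 all lie fairly close to 2.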