CO2 LEVEL RISE OVER 26 YEARS
Ian Newcombe
DATASET
Quarterly Mauna Loa, HI CO2 record
Quarterly US gasoline sales
Quarterly US car and light truck sales
Quarterly US GDP
Quarterly US electricity usage
REGRESSION
The overall model is significant (p-values < 0.05).
The gas and cars variables appear to be insignificant.
The GDP and elec variables are both significant.
VARIABLE ANALYSIS
All variables are significantly related to one another (all p-values < 0.05).
MODIFIED REGRESSION
If I remove the gas and cars variables, the model improves slightly.
Root MSE goes from [value missing] to [value missing].
Adjusted R-squared goes from [value missing] to [value missing].
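Root MSE and adjusted R-squared can be computed from any set of actual and fitted values. The sketch below is a minimal illustration and not the SAS output: the function name `fit_summary` and the sample numbers are hypothetical, and it assumes a regression that includes an intercept.

```python
import math

def fit_summary(actual, fitted, n_predictors):
    """Return (root_mse, adjusted_r2) for a regression with an intercept."""
    n = len(actual)
    mean_y = sum(actual) / n
    # Residual and total sums of squares
    sse = sum((y - f) ** 2 for y, f in zip(actual, fitted))
    sst = sum((y - mean_y) ** 2 for y in actual)
    # Residual degrees of freedom: observations minus predictors minus intercept
    dof = n - n_predictors - 1
    root_mse = math.sqrt(sse / dof)
    adjusted_r2 = 1.0 - (sse / dof) / (sst / (n - 1))
    return root_mse, adjusted_r2

# Hypothetical data, for illustration only
root_mse, adjusted_r2 = fit_summary(
    [1.0, 2.0, 3.0, 4.0, 5.0], [1.1, 1.9, 3.2, 3.8, 5.0], n_predictors=1
)
```

Because both statistics divide by the residual degrees of freedom, dropping a predictor that explains almost nothing can lower Root MSE and raise adjusted R-squared even though the raw residual sum of squares cannot go down.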
RESIDUAL ANALYSIS: Normality
The assumption of normally distributed residuals is violated.
Visually, the residuals do not appear normal.
The Shapiro-Wilk test confirms that the residuals are not normal (p-value < 0.05).
RESIDUAL ANALYSIS: Constant Variance
This assumption holds.
White's test shows no evidence of heteroscedasticity (p-value > 0.05).
RESIDUAL ANALYSIS: No Autocorrelation
The assumption that the residuals are not autocorrelated is violated.
The residuals are not white noise (p-values < 0.05).
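One common way to check whether residuals are white noise is the Ljung-Box Q statistic, which pools the squared sample autocorrelations up to a chosen lag. The slide does not say which test produced its p-values, so treat this as a sketch under that assumption; the function names and the sample series are illustrative.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom

def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q = n(n+2) * sum_{k=1..h} r_k^2 / (n - k)."""
    n = len(residuals)
    total = sum(
        autocorrelation(residuals, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )
    return n * (n + 2) * total

# Hypothetical residuals that flip sign every period (strongly autocorrelated)
example_residuals = [1.0, -1.0] * 10
q_stat = ljung_box_q(example_residuals, max_lag=1)
# Compare Q with a chi-square critical value with max_lag degrees of freedom
# (3.841 at the 5% level for 1 degree of freedom); a larger Q rejects white noise.
rejects_white_noise = q_stat > 3.841
```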
SELECTED REGRESSION MODEL
With squared and interaction variables included, this is the best model according to SAS.
C(p) = 4.59
Root MSE = [value missing]
gd = interaction between gasoline sales and GDP
gde = interaction between GDP and electricity
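For reference, the C(p) statistic reported here is Mallows' C(p). A standard form of it is given below, where SSE_p is the residual sum of squares of a candidate model with p parameters (counting the intercept), \hat{\sigma}^2 is the mean squared error of the full model with all candidate variables, and n is the number of observations. The symbols are the usual textbook ones, not taken from the slides.

```latex
C_p = \frac{\mathrm{SSE}_p}{\hat{\sigma}^2} - n + 2p
```

A candidate model with little bias tends to have C_p close to or below p, which is why a small C(p) is used as a selection criterion.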
REGRESSION PREDICTIONS
Normal regression (GDP, elec): 0.98% MAPE
Selected regression (gas, GDP, elec, gas2, GDP2, gd, gde): 0.93% MAPE
Plots: actual vs. predicted for each model
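MAPE (mean absolute percentage error) is the average of the absolute errors expressed as a percentage of the actual values. A minimal sketch follows; the function name `mape` and the sample values are hypothetical and are not taken from the dataset.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be nonzero."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)

# Hypothetical actual and predicted values, for illustration only
example_mape = mape([400.0, 402.0], [396.0, 402.0])
```

Because each error is scaled by the actual value, MAPE lets models fitted to the same series be compared on one percentage scale.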
INDICATOR VARIABLE MODEL
The three indicator variables are quarters 1-3.
Quarter 4 is the reference period.
When all other variables are 0, CO2 will equal [value missing].
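The quarter coding described here can be sketched as follows. The function name `quarter_indicators` is hypothetical; the only thing taken from the slide is that quarters 1-3 get indicators and quarter 4 is the reference period.

```python
def quarter_indicators(quarter):
    """Return [q1, q2, q3] dummy variables; quarter 4 maps to all zeros."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3, or 4")
    return [1 if quarter == q else 0 for q in (1, 2, 3)]

# One row of indicators per observation in a two-year quarterly series
rows = [quarter_indicators(q) for q in (1, 2, 3, 4, 1, 2, 3, 4)]
```

With this coding, each indicator's coefficient measures the difference between that quarter and quarter 4, and the intercept carries the quarter 4 level.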
POLYNOMIAL MODELS
Order 5 model: all orders appear to be insignificant.
Corrected model: by removing orders 1 and 5, I can achieve a significant model.
CYCLICAL MODEL
CYCLICAL MODEL
An F-test shows that sin52 needs to stay in the model.
All included variables are highly significant.
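A cyclical model of this kind uses sine and cosine functions of time as regressors. The sketch below shows how such terms are typically built; the function name `harmonic_terms` and the period of 4 quarters are assumptions for illustration, and the slides do not define how variables such as sin52 were constructed.

```python
import math

def harmonic_terms(t, period):
    """Return (sin, cos) regressors for time index t and a cycle of the given length."""
    angle = 2.0 * math.pi * t / period
    return math.sin(angle), math.cos(angle)

# Assumed period of 4 observations (one year of quarterly data)
regressors = [harmonic_terms(t, period=4) for t in range(8)]
```

Regressing the series on a sine and cosine pair of the same period fits both the amplitude and the phase of the cycle without estimating a nonlinear model.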
PREDICTIONS
Models shown: Indicator Variable Model, Polynomial Model (actual vs. predicted)
MAPE: 0.45%, 0.11%
PREDICTIONS
Cyclical model (actual vs. predicted): 0.40% MAPE
MA MODEL
I took the 4th difference of the variable.
The ACF cuts off after lag 2.
The PACF decays.
I will fit an MA(3) model.
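Differencing can be sketched in a few lines. Here the "4th difference" is read as a lag-4 (seasonal) difference of quarterly data, which is an assumption; the function name `difference` and the synthetic series are illustrative only.

```python
def difference(series, lag=1):
    """Return the lag-`lag` difference: y[t] - y[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Synthetic quarterly series: linear trend plus a repeating 4-quarter pattern
pattern = [0.0, 3.0, 1.0, -2.0]
series = [0.5 * t + pattern[t % 4] for t in range(12)]

seasonal_diff = difference(series, lag=4)       # removes the 4-quarter pattern
both_diffs = difference(seasonal_diff, lag=1)   # then removes the remaining trend
```

In this synthetic case the lag-4 difference leaves only the constant yearly increase, and the additional first difference reduces it to zero. The ACF and PACF are then examined on the differenced series.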
MA MODEL
All lags appear to be significant.
The residuals are white noise.
SARIMA MODEL
The data are quarterly and there is a very obvious seasonal component.
I fit an ARIMA(0,1,1)x(0,1,1)4 model (seasonal period 4).
The residuals are white noise.
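Written out with the backshift operator B (where B y_t = y_{t-1}), an ARIMA(0,1,1)x(0,1,1) model with seasonal period 4 has the form below. Here y_t is the CO2 series, a_t is white noise, and \theta and \Theta are the nonseasonal and seasonal moving-average parameters. The symbols are standard notation rather than the slides' own, and the sign on the moving-average terms depends on the software convention.

```latex
(1 - B)(1 - B^{4})\, y_t = (1 - \theta B)(1 - \Theta B^{4})\, a_t
```

The left side applies one regular and one seasonal difference, and the right side is a product of a lag-1 and a lag-4 moving-average factor.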
SARIMA WITH RESIDUALS
The residuals from the regression are not white noise.
I differenced at lags 1 and 4 (a first difference and a seasonal difference).
ARIMA(1,1,4)x(1,1,4)4
SARIMA VS MA RESIDUAL
Panels: SARIMA; SARIMA with Residuals
MODEL COMPARISONS
SARIMA (actual vs. predicted): MAPE = 0.06%
MA(3) (actual vs. predicted): MAPE = 0.10%
MODEL COMPARISONS
ARIMA residual model (actual vs. predicted): MAPE = 0.08%
OVERALL COMPARISON
The best regression model was the backward-selected model: Root MSE = 1.59 vs. [value missing].
The best deterministic time series model was the cyclical model: Root MSE = [value missing] vs. [value missing] vs. 2.30.
The best time series model was the SARIMA model: std. error = 0.37 vs. 0.39 (MA) vs. 0.91 (ARIMA residual).
The best overall, therefore, is the cyclical model.
PREDICTIONS COMPARISON
The best predictions were made with the SARIMA model (MAPE = 0.06%).
Plot: actual vs. predicted
CONCLUSIONS
CO2 levels can be predicted very well if you know how much gasoline was bought, how many cars were sold, US GDP, and electricity usage.
CO2 levels follow a seasonal pattern (summer/winter in the Northern Hemisphere).
2014 GDP
QUESTIONS?