Time Series Analysis Lecture 11

Slides:



Advertisements
Similar presentations
SMA 6304 / MIT / MIT Manufacturing Systems Lecture 11: Forecasting Lecturer: Prof. Duane S. Boning Copyright 2003 © Duane S. Boning. 1.
Advertisements

Cointegration and Error Correction Models
Autocorrelation Functions and ARIMA Modelling
Part II – TIME SERIES ANALYSIS C4 Autocorrelation Analysis © Angel A. Juan & Carles Serrat - UPC 2007/2008.
Using SAS for Time Series Data
Part II – TIME SERIES ANALYSIS C5 ARIMA (Box-Jenkins) Models
Nonstationary Time Series Data and Cointegration Prepared by Vera Tabakova, East Carolina University.
Unit Roots & Forecasting
Slide 1 DSCI 5340: Predictive Modeling and Business Forecasting Spring 2013 – Dr. Nick Evangelopoulos Lecture 7: Box-Jenkins Models – Part II (Ch. 9) Material.
Time Series Analysis Materials for this lecture Lecture 5 Lags.XLS Lecture 5 Stationarity.XLS Read Chapter 15 pages Read Chapter 16 Section 15.
Time Series Building 1. Model Identification
R. Werner Solar Terrestrial Influences Institute - BAS Time Series Analysis by means of inference statistical methods.
Objectives (BPS chapter 24)
How should these data be modelled?. Identification step: Look at the SAC and SPAC Looks like an AR(1)- process. (Spikes are clearly decreasing in SAC.
Multiple Regression Forecasts Materials for this lecture Demo Lecture 2 Multiple Regression.XLS Read Chapter 15 Pages 8-9 Read all of Chapter 16’s Section.
Modeling Cycles By ARMA
Time Series Model Estimation Materials for this lecture Read Chapter 15 pages 30 to 37 Lecture 7 Time Series.XLS Lecture 7 Vector Autoregression.XLS.
CHAPTER 4 ECONOMETRICS x x x x x Multiple Regression = more than one explanatory variable Independent variables are X 2 and X 3. Y i = B 1 + B 2 X 2i +
Cycles and Exponential Smoothing Models Materials for this lecture Lecture 4 Cycles.XLS Lecture 4 Exponential Smoothing.XLS Read Chapter 15 pages
ARIMA Forecasting Lecture 7 and 8 - March 14-16, 2011
Materials for Lecture 13 Purpose summarize the selection of distributions and their appropriate validation tests Explain the use of Scenarios and Sensitivity.
Non-Seasonal Box-Jenkins Models
AGEC 622 Mission is prepare you for a job in business Have you ever made a price forecast? How much confidence did you place on your forecast? Was it correct?
Introduction to Regression Analysis, Chapter 13,
Simple Linear Regression Analysis
Business Forecasting Chapter 5 Forecasting with Smoothing Techniques.
Ordinary Least Squares
12 Autocorrelation Serial Correlation exists when errors are correlated across periods -One source of serial correlation is misspecification of the model.
BOX JENKINS METHODOLOGY
AR- MA- och ARMA-.
Inference for regression - Simple linear regression
Regression Method.
Time Series Model Estimation Materials for lecture 6 Read Chapter 15 pages 30 to 37 Lecture 6 Time Series.XLSX Lecture 6 Vector Autoregression.XLSX.
Slide 1 DSCI 5340: Predictive Modeling and Business Forecasting Spring 2013 – Dr. Nick Evangelopoulos Lecture 8: Estimation & Diagnostic Checking in Box-Jenkins.
#1 EC 485: Time Series Analysis in a Nut Shell. #2 Data Preparation: 1)Plot data and examine for stationarity 2)Examine ACF for stationarity 3)If not.
Lecture 7: Forecasting: Putting it ALL together. The full model The model with seasonality, quadratic trend, and ARMA components can be written: Ummmm,
Copyright © 2014, 2011 Pearson Education, Inc. 1 Chapter 27 Time Series.
Lecture 8 Simple Linear Regression (cont.). Section Objectives: Statistical model for linear regression Data for simple linear regression Estimation.
Inference for Regression Simple Linear Regression IPS Chapter 10.1 © 2009 W.H. Freeman and Company.
Cointegration in Single Equations: Lecture 6 Statistical Tests for Cointegration Thomas 15.2 Testing for cointegration between two variables Cointegration.
Univariate Linear Regression Problem Model: Y=  0 +  1 X+  Test: H 0 : β 1 =0. Alternative: H 1 : β 1 >0. The distribution of Y is normal under both.
Big Data at Home Depot KSU – Big Data Survey Course Steve Einbender Advanced Analytics Architect.
Copyright © 2011 Pearson Education, Inc. Time Series Chapter 27.
Lecture 12 Time Series Model Estimation
1 Another useful model is autoregressive model. Frequently, we find that the values of a series of financial data at particular points in time are highly.
Chapter 22: Building Multiple Regression Models Generalization of univariate linear regression models. One unit of data with a value of dependent variable.
How do we identify non-stationary processes? (A) Informal methods Thomas 14.1 Plot time series Correlogram (B) Formal Methods Statistical test for stationarity.
Correlation & Regression Analysis
The Box-Jenkins (ARIMA) Methodology
Seasonal ARIMA FPP Chapter 8.
Cycles and Exponential Smoothing Models Materials for this lecture Lecture 10 Cycles.XLS Lecture 10 Exponential Smoothing.XLSX Read Chapter 15 pages
Economics 20 - Prof. Anderson1 Time Series Data y t =  0 +  1 x t  k x tk + u t 1. Basic Analysis.
AUTOCORRELATION 1 Assumption B.5 states that the values of the disturbance term in the observations in the sample are generated independently of each other.
Subodh Kant. Auto-Regressive Integrated Moving Average Also known as Box-Jenkins methodology A type of linear model Capable of representing stationary.
Lecturer: Ing. Martina Hanová, PhD.. Regression analysis Regression analysis is a tool for analyzing relationships between financial variables:  Identify.
EC 827 Module 2 Forecasting a Single Variable from its own History.
Statistics for Business and Economics Module 2: Regression and time series analysis Spring 2010 Lecture 8: Time Series Analysis and Forecasting 2 Priyantha.
Lecture 12 Time Series Model Estimation Materials for lecture 12 Read Chapter 15 pages 30 to 37 Lecture 12 Time Series.XLSX Lecture 12 Vector Autoregression.XLSX.
Lecture 12 Time Series Model Estimation
Nonstationary Time Series Data and Cointegration
Financial Econometrics Lecture Notes 2
Kakhramon Yusupov June 15th, :30pm – 3:00pm Session 3
Chapter 6: Autoregressive Integrated Moving Average (ARIMA) Models
Cycles and Exponential Smoothing Models
Autocorrelation.
Introduction to Time Series
Autocorrelation.
BOX JENKINS (ARIMA) METHODOLOGY
Chap 7: Seasonal ARIMA Models
Presentation transcript:

Time Series Analysis Lecture 11 Materials for this lecture Lecture 11 Lags.XLSX Lecture 11 Stationarity.XLSX Read Chapter 15 pages 30-37 Read Chapter 16 Section 15

Forecast a variable based only on its past values Time Series Analysis Forecast a variable based only on its past values Called autoregressive models We are going to focus on the application and less on the estimation calculations because they are just simply OLS Simetar estimates TS models easily with a menu and provides forecasts of the time series model

Time Series Analysis TS is a comprehensive forecasting methodology for univariate distributions Best for variables unrelated to other variables in a structural model Autoregressive process in the simplest form is a regression model such as: Ŷt = f(Yt-1, Yt-2, Yt-3, Yt-4, … )

Time Series Analysis Steps for estimating a times series model are: Graph the data series to see what patterns are present trend, cycle, seasonal, and random Test data for Stationarity with a Dickie Fuller Test If original series is not stationary then difference it until it is Number of Differences (p) to make a series stationary is determined using the D-F Test Use the stationary (differenced) data series to determine the number of Lags that best forecasts the historical period Given a stationary series, use the Schwarz Criteria (SIC) or autocorrelation table to determine the best number of lags (q) to include when estimating the model Estimate the AR(p,q) Model and make a forecast using the Chain Rule

Two Time Series Analysis Assumptions Assume series you are forecasting is really random No trend, seasonal, or cyclical pattern remains in series Series is stationary or “white noise” To guarantee this assumption we must make the data stationary Assume the variability is constant, i.e., the same for the future as for the past, in other words σ2T+i = σ2t Can test for constant variance by testing correlation over time Crucial assumption because if σ2 changes overtime, then forecast will explode overtime

First Make the Data Series Stationary Take differences of the data to make it stationary First difference of raw data in Y to calculate D1,t D1,t = Yt – Yt-1 Calculate first difference of D1,t for a Second Difference of Y or D2,t = D1,t – D1,t-1 Calculate first difference of D2,t for a Third Difference of Y or D3,t = D2,t – D2,t-1 Stop differencing data when series is stationary

Make Data Series Stationary Example the Difference table for a time series data set D1t in period 1 is 0.41= 71.47 – 71.06 D1t in period 2 is -1.41= 70.06 – 71.47 D2t in period 1 is -1.82 = -1.41 – 0.41 D2t in period 2 is 1.66 = 0.25 – (– 1.41) Etc.

Make Data Series Stationary Example the Difference table for a time series data set D1t in period 1 is 0.41= 71.47 – 71.06 D1t in period 2 is -1.41= 70.06 – 71.47 D2t in period 1 is -1.82 = -1.41 – 0.41 D2t in period 2 is 1.66 = 0.25 – (– 1.41) Etc.

Dickie-Fuller Test for stationarity First D-F test First is original data stationary? Estimate regression Ŷ = a + b D1,t Calculate the t-ratio for b to see if a significant slope exists. D-F Test statistic is the t-statistic; if more negative than -2.9, we declare the dependent series (Ŷ in this case) stationary at alpha 0.05 level (-2.90 is a 1 tailed t test critical value) Thus a D-F statistic of -3.0 is more negative than -2.90 so the original series is stationary and differencing is not necessary

Next Level of Testing for Stationarity Second D-F Test for stationarity of the D1,t series, in other words Here we are testing if the Y series will be stationary with one differencing? Or we are asking if the D1,t series is stationary Estimate regression for D1,t = a + b D2,t t-statistic on slope b is the second D-F test statistic

Next Level of Testing for Stationarity Third D-F Test for stationarity of the D2,t series In other words will the Y series be stationary with two differences? So we are testing if D2,t is stationary Estimate regression for D2,t = a + b D3,t t-statistic on slope b is the third D-F test statistic Continue with a fourth and fifth DF test if necessary

Test for Stationarity How does D-F test work? A series is stationary, if it oscillates about the mean and has a slope of zero If a 1 period lag Yt-1 is used to predict Yt , they are good predictors of each other because they have the same mean. The t-statistic on b will be significant for D1,t = a + b D2,t Lagging the data 1 period causes the two series to be inversely related so the slope is negative in the OLS equation. That explains why the “t” coefficient of -2.9 is negative. The -2.9 represents a 5% 1 tail test statistic.

Test for Stationarity Estimated regression for: Ŷt = a + b D1,t DF is -1.868. You can see that by examining the t ratio on slope See a trend in original data and no trend in the residuals Intercept not zero so mean of Y is not constant

Test for Stationarity Estimated regression for: D1,t = a + b D2,t DF is -12.948, which is the t ratio on the slope parameter “b” See the residuals oscillate about a mean of zero, no trend in either series Intercept is 0.121 or about zero, so the mean is more likely to be constant

Test for Stationarity DF is -24.967 the t ratio on the slope variable Estimated regression for: D2,t = a + b D3,t DF is -24.967 the t ratio on the slope variable Intercept is about zero

DF Stationarity Test in Simetar Dickie Fuller (DF) function in Simetar =DF ( Data Series, Trend, No Lags, No. Diff to Test) Where: Data series is the location of the data, Trend is True or False No. Lags is optional, set to ZERO No. Diff is the number of differences to test

Test for Stationarity Number of differences that make the data series stationary is the one that has a DF test statistic more negative than -2.9 Here it is 1 difference no matter if trend included or not Lecture 5

Summarize Stationarity Yt is the original data series Di,t is the ith difference of the Yt series We difference the data to make it stationary To guarantee the assumption that the mean is constant Dickie Fuller test in Simetar to determine the no. of differences needed to make series stationary =DF(Data, Trend, , No. of Differences) Test 0 to 6 differences with and without trend and zero lags using =DF() Select the first difference with a DF test statistic more negative than -2.90

NEXT We Determine the Number of Lags Once we have determined the number of differences we know what form the auto regressive (AR) model will be fit AR ( No. Differences, No. Lags) or AR(p,q) If the series is stationary with 1 difference, estimate the OLS model D1,t = a + b1 D1,t-1 + b2 D1,t-2 + … The only question that remains is how many lags (q) of D1,t will we need to forecast the series To determine the number of lags we use several tests

Number of Lags for AR(p,q) Model Partial autocorrelation coefficients used to estimate number of lags for Di,t in model. If D1,t is stationary then: Test for one lag use b1 from OLS regression model D1,t = a + b1 D1,t-1 Test for two lags use b2 from OLS regression model D1,t = a + b1 D1,t-1 + b2 D1,t-2 Test for three lags use b3 from OLS regression model D1,t = a + b1 D1,t-1 + b2 D1,t-2 + b3 D1,t-3 After each regression we only use the beta (bi) for the longest number of lags, i.e., the bold ones above Use the t stat on the bi to determine contribution of the last lag to explaining D1,t So this calls for running lots of regressions with different numbers of lags, right?

Determining No. of Lags with Simetar Find the optimal number of lags the easy way Use Simetar to build a Sample Autocorrelation Table (SAC) =AUTOCORR(Data Series, No. Lags, No. Diff) Pick best no. of lags based on the last lag with a statistically significant t value

Number of Lags for Time Series Model Bar chart of autocorrelation coefficients in Sample AUTOCORR() Table The explanatory power of the distant lags is not large enough to warrant including in the model, based on their t stats, so do not include them.

Autocorrelation Charts of Sample Autocorrelation Coefficients (SAC) Dying down in a dampened exponential fashion – oscillation 1 -1 Lag k Dying down quickly Dying down extremely slowly 1 -1 Lag k A SAC that cuts off Dying down in a dampened exponential fashion – no oscillation

Number of Lags for Time Series Model Schwarz Criteria (SIC) tests for the optimal number of lags in a TS model Objective is to find the number of lags which minimizes the SIC In Simetar use the ARLAG() function which returns the optimal number of lags based on SIC test =ARLAG(Data Series, Constant, No. of Differences)

Schwarz Criteria and ARLAG Tables ARLAG indicates no. of lags to minimize SC –> 1 lag, given the no. of differences. Pick the no. of lags to Minimize the SC –> 1 lag Autocorr finds no. of lags where AC drops –> 1 lag

Summarize Stationarity/Lag Determination Make the data series stationary by differencing the data Use the Dickie-Fuller Test (df < -2.90) to find Di series that is stationary Use the =DF() function in Simetar Determine the number of Lags for the Di series Use the sample autocorrelation coefficients for alternative lag models =AUTOCORR() function in Simetar Minimize the Schwarz Criteria =ARLAG() or =ARSCHWARZ() functions in Simetar

Note: Partial vs. Sample Autocorrelation Partial autocorrelation coefficients (PAC) show the contribution of adding one more lag. It takes into consideration the impacts of lower order lags A beta for the 3rd lag shows the contribution of 3rd lag after having lags 1-2 in place D1,t = a + b1 D1,t-1 + b2 D1,t-2 + b3 D1,t-3 Sample autocorrelation coefficients (SAC) show contribution of adding a particular lag. A SAC for 3 lags shows the contribution of just the 3rd lag. Thus the SAC does not equal the PAC

Number of Lags for Time Series Model Some authors suggest using SAC to determine the number of differences to achieve stationarity If the SAC cuts off or dies down rapidly it is an indicator that the series is stationary If the SAC dies down very slowly, the series is not stationary This is a good check of the DF test, but we will rely on the DF test for stationarity