STAT 497 LECTURE NOTE 12 COINTEGRATION

Multivariate Unit Root Processes Generally we cannot reject the null hypothesis of a unit root for many time series. For example, log consumption and log output are both nonstationary, but log consumption − log output is stationary. This situation is called cointegration. The practical problem is that when we have cointegration, the asymptotics change completely. Furthermore, we usually do not have enough data to tell definitively whether or not the series are cointegrated.

Multivariate Unit Root Processes A univariate nonstationary time series Yt is said to be integrated of order d, I(d), if its (d−1)th difference is nonstationary but its d-th difference is stationary. If Yt is nonstationary but ΔYt = (1−B)Yt is stationary, then Yt is integrated of order 1: Yt ~ I(1) but ΔYt ~ I(0). For example, a random walk Yt = Yt−1 + at is I(1), since ΔYt = at is white noise.

Multivariate Unit Root Processes In many applications, integrated processes are considered together because they form equilibrium relationships: short-term and long-term interest rates, income and consumption, and so on. This leads to the concept of cointegration. The idea behind cointegration is that although the multivariate time series is integrated, certain linear combinations of its components may be stationary.

Granger Causality Tests According to Granger, causality can be further subdivided into long-run and short-run causality. This requires the use of error correction models (ECMs) or VECMs, depending on the approach used to determine causality. Long-run causality is determined by the error correction term: if it is significant, there is evidence of long-run causality from the explanatory variable to the dependent variable. Short-run causality is determined as before, with a test of the joint significance of the lagged explanatory variables, using an F-test or Wald test.

Granger Causality Tests Before the ECM can be formed, there first has to be evidence of cointegration. Given that cointegration implies a significant error correction term, cointegration can be viewed as an indirect test of long-run causality. It is possible to have evidence of long-run causality but not short-run causality, and vice versa. In multivariate causality tests, testing long-run causality between two variables is more problematic, as it is impossible to tell which explanatory variable is driving the causality through the error correction term.
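
As a sketch of how both tests might be run in SAS (hypothetical data set and variable names; PROC VARMAX is the procedure used later in these notes), the COINTEG statement's EXOGENEITY option addresses the long-run (weak exogeneity) side, and the CAUSAL statement gives the short-run Granger causality Wald test:

proc varmax data=mydata;
   * fit a VECM(2) with one cointegrating relation, normalized on y;
   model y x / p=2 ecm=(rank=1 normalize=y);
   * long-run: is each variable weakly exogenous (error correction term insignificant)?;
   cointeg rank=1 exogeneity;
   * short-run: does x Granger-cause y? (null: y is influenced only by itself);
   causal group1=(y) group2=(x);
run;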

SPURIOUS REGRESSION If we regress a series y with a unit root on regressors that also have unit roots, the usual t tests on regression coefficients show statistically significant relationships even when none exist. The spurious regression problem can also appear with I(0) series (see Granger, Hyung and Jeon (1998)). This is telling us that the problem is generated by using WRONG CRITICAL VALUES! In a spurious regression the errors are correlated and the standard t-statistic is wrongly calculated because the variance of the errors is not consistently estimated. In the I(0) case the solution is to estimate that variance consistently, e.g., with autocorrelation-robust standard errors.

SPURIOUS REGRESSION How do we detect a spurious regression (between I(1) series)? By looking at the correlogram of the residuals and by testing for a unit root in them. How do we convert a spurious regression into a valid regression? By taking differences. Does this solve the spurious regression problem? It solves the statistical problems, but not the economic interpretation of the regression. Note that by taking differences we are losing information: a regression involving growth rates does not contain the same information as a regression involving the levels of the variables.
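
A minimal simulation sketch in SAS (hypothetical data set and variable names): generate two independent random walks and regress one on the other. Expect a sizable R-square and a "significant" t-value despite no true relationship, together with a very low Durbin-Watson statistic and a unit root in the residuals.

data spurious;
   call streaminit(12345);
   x = 0; y = 0;
   do t = 1 to 200;
      x = x + rand('normal');   /* two independent I(1) random walks */
      y = y + rand('normal');
      output;
   end;
run;

proc reg data=spurious;
   model y = x / dw;            /* DW option prints the Durbin-Watson statistic */
   output out=res r=uhat;
run;

proc arima data=res;
   identify var=uhat stationarity=(adf=(0,1));  /* correlogram + ADF test on residuals */
run;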

SPURIOUS REGRESSION Typical symptom: "High R2, t-values, F-value, but low DW"
1. Egyptian infant mortality rate (Y), 1971-1990, annual data, on gross aggregate income of American farmers (I) and total Honduran money supply (M):
   Ŷ = 179.9 − .2952 I − .0439 M, R2 = .918, DW = .4752, F = 95.17
       (16.63)  (−2.32)   (−4.26)
   Corr = .8858, −.9113, −.9445
2. US Export Index (Y), 1960-1990, annual data, on Australian males' life expectancy (X):
   Ŷ = −2943. + 45.7974 X, R2 = .916, DW = .3599, F = 315.2
       (−16.70)  (17.76)
   Corr = .9570

SPURIOUS REGRESSION
3. US Defense Expenditure (Y), 1971-1990, annual data, on population of South Africa (X):
   Ŷ = −368.99 + .0179 X, R2 = .940, DW = .4069, F = 280.69
       (−11.34)  (16.75)
   Corr = .9694
4. Total crime rates in the US (Y), 1971-1991, annual data, on life expectancy of South Africa (X):
   Ŷ = −24569 + 628.9 X, R2 = .811, DW = .5061, F = 81.72
       (−6.03)  (9.04)
   Corr = .9008
5. Population of South Africa (Y), 1971-1990, annual data, on total R&D expenditure in the US (X):
   Ŷ = 21698.7 + 111.58 X, R2 = .974, DW = .3037, F = 696.96
       (59.44)  (26.40)
   Corr = .9873

SPURIOUS REGRESSION Does a regression between two I(1) variables make sense? Yes, if the regression errors are I(0). Can this be possible? David Hendry asked Clive Granger this same question some time ago. Clive answered NO WAY!!!!! but he also said that he would think about it. On the plane trip back home to San Diego, Clive thought about it and concluded that YES, IT IS POSSIBLE. It is possible when both variables share the same source of the I(1)'ness (co-I(1)), when both variables move together in the long run (co-move)... when both variables are COINTEGRATED!

COINTEGRATION An m×1 vector time series Yt is said to be cointegrated of order (d, b), CI(d, b) with 0 < b ≤ d, if each of its component series Yit is I(d) but some linear combination β′Yt is I(d−b) for some nonzero constant vector β. β is the cointegrating vector, or long-run parameter, and it is not unique. The most common case is d = b = 1.

COINTEGRATION More generally, if the m×1 vector series Yt contains more than two components, each being I(1), then there may exist k (< m) linearly independent 1×m vectors β1′, β2′, …, βk′ such that A′Yt is a stationary k×1 vector process, where A′ = (β1, β2, …, βk)′ is a k×m cointegrating matrix. The number of linearly independent cointegrating vectors is called the cointegrating rank ⇒ Yt is cointegrated of rank k.

EXAMPLE Consider the following system of processes
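
As an assumed illustration of such a system (hypothetical, in place of the slide's original equations), consider
Y1t = γY2t + e1t,
Y2t = Y2,t−1 + e2t,
where e1t and e2t are independent white noise series. Y2t is a random walk, so Y2t ~ I(1), and therefore Y1t ~ I(1) as well; but the linear combination Y1t − γY2t = e1t is stationary. Hence β′ = (1, −γ) is a cointegrating vector and (Y1t, Y2t)′ is CI(1,1).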

VAR with Cointegration Let Yt be m×1 and suppose we estimate the VAR(p)
Yt = Φ1Yt−1 + … + ΦpYt−p + at.
Say we have a unit root. Then, writing Φ(1) = Φ1 + … + Φp, we can rewrite the system as
ΔYt = (Φ(1) − I)Yt−1 + Γ1ΔYt−1 + … + Γp−1ΔYt−p+1 + at.
This is like a multivariate version of the augmented Dickey-Fuller test.

VAR with Cointegration Rearranging the equation, Rank(Φ(1) − I) < m, and there are two cases:
(1) Φ(1) = I: we have m independent unit roots, there is no cointegration, and we should run the VAR in differences.
(2) 0 < Rank(Φ(1) − I) = k < m: we can write Φ(1) − I = αβ′, where α and β are m×k. The equation becomes
ΔYt = αβ′Yt−1 + Γ1ΔYt−1 + … + Γp−1ΔYt−p+1 + at.
This is called a vector error correction model (VECM).
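
To make the structure concrete, take an assumed bivariate case with m = 2 and k = 1: let β′ = (1, −β2) and α = (α1, α2)′. The VECM equations are then
Δy1t = α1(y1,t−1 − β2 y2,t−1) + lagged differences + a1t,
Δy2t = α2(y1,t−1 − β2 y2,t−1) + lagged differences + a2t,
where β′Yt−1 = y1,t−1 − β2 y2,t−1 is the lagged equilibrium error and α1, α2 measure the speed of adjustment of each variable toward the long-run relation.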

VAR with Cointegration Note that if you run OLS in differences when the series are cointegrated, the model is misspecified and the results will be biased. What can you do? (a) If you know the location of the unit roots and the cointegrating relations, then you can run the VECM by OLS of ΔYt on lags of ΔY and β′Yt−1. (b) If you know nothing, then you can either (i) run OLS in levels, or (ii) test (many times) to estimate the cointegrating relations and then run the VECM. The problem with the latter approach is that you are testing many times and estimating the cointegrating relationships, which leads to poor finite sample properties.

Residual Based Tests of the Null of No Cointegration These procedures are designed to distinguish a system without cointegration from a system with at least one cointegrating relationship; they do not estimate the number of cointegrating vectors (k). The tests are conditional on a pretest for unit roots in each of the variables. When the cointegration vector is known: construct the hypothesized stationary linear combination, treat it as data, and apply a Dickey-Fuller unit root test to it. The null hypothesis is that there is a unit root, i.e., no cointegration.

Residual Based Tests of the Null of No Cointegration When the cointegration vector is not known: assume that, if there exists a cointegrating relation, the coefficient on Y1t is nonzero, allowing us to express the "static regression equation" as Y1t = β2Y2t + … + βmYmt + ut. You can apply a unit root test to the estimated OLS residual from this equation, but: include a constant in the static regression if the alternative allows for a nonzero mean in ut; include a trend in the static regression if the alternative is stochastic cointegration, i.e., a nonzero trend for A′Yt.

Residual Based Tests of the Null of No Cointegration The first step in testing cointegration is to test the null hypothesis of a unit root in each component series Yit individually, using the univariate unit root tests. If that hypothesis is not rejected, the next step is to test for cointegration among the components, i.e., to test whether β′Yt is stationary.

Residual Based Tests of the Null of No Cointegration In practice, the cointegration vector is unknown. One way to test the existence of cointegration is the regression method (Engle & Granger, 1986, 1987). If Yt = (Y1t, Y2t, …, Ymt)′ is cointegrated, β′Yt is stationary, where β = (β1, β2, …, βm)′. Then (1/β1)β is also a cointegrating vector, provided β1 ≠ 0.

Residual Based Tests of the Null of No Cointegration Consider the regression model for Y1t, Y1t = β2Y2t + … + βmYmt + εt, and check whether εt is I(1) or I(0). If εt ~ I(1), then Yt is not cointegrated. If εt ~ I(0), then Yt is cointegrated with normalized cointegrating vector β′ = (1, −β2, …, −βm).

Residual Based Tests of the Null of No Cointegration In testing the error series for nonstationarity: calculate the OLS estimates and obtain the residual series ε̂t, then apply the standard ADF or PP test to ε̂t. Yt is cointegrated if ε̂t ~ I(0). Test H0: φ = 1 vs H1: φ < 1 for the model ε̂t = φε̂t−1 + ut, or equivalently H0: δ = 0 vs H1: δ < 0 for the model Δε̂t = δε̂t−1 + ut.

Residual Based Tests of the Null of No Cointegration t-statistic: the critical values are obtained by simulation (Engle & Granger, 1987).

Level of significance     1%       5%
p = 1                   −4.07    −3.37
p > 1                   −3.73    −3.17

If T < critical value, reject H0 ⇒ cointegration exists.
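
A minimal sketch of the two-step Engle-Granger procedure in SAS, assuming two I(1) series y1 and y2 in a data set called pair (hypothetical names):

proc reg data=pair;
   model y1 = y2;               /* Step 1: static (cointegrating) regression */
   output out=resid r=ehat;     /* save the residual series */
run;

proc arima data=resid;
   /* Step 2: ADF-type test on the residuals. The printed Dickey-Fuller
      p-values do not apply to estimated residuals; compare the tau
      statistic with the Engle-Granger critical values tabulated above. */
   identify var=ehat stationarity=(adf=(0,1));
run;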

The Johansen Trace and Maximal Eigenvalue Tests To test whether the variables are cointegrated or not, one of the well-known tests is the Johansen trace test. The Johansen test for the existence of cointegration is based on estimating the ECM by maximum likelihood, under various assumptions about the trend or intercept parameters and the number k of cointegrating vectors, and then conducting likelihood ratio tests.

The Johansen Trace and Maximal Eigenvalue Tests Assuming that the ECM errors are independent Nm(0, Σ), and given the cointegrating restrictions on the trend or intercept parameters, the maximum likelihood Lmax(k) is a function of the cointegration rank k. The trace test is based on the log-likelihood ratio ln[Lmax(k)/Lmax(m)] and is conducted sequentially for k = 0, 1, …, m−1. The name comes from the fact that the test statistic is the trace (the sum of the diagonal elements) of a diagonal matrix of generalized eigenvalues. The test examines the null hypothesis that the cointegration rank is less than or equal to k against the alternative that it is greater than k. If the trace statistic is greater than the critical value for a given rank, the null hypothesis that the cointegration rank is at most k is rejected.

The Johansen Trace and Maximal Eigenvalue Tests Consider a nonstationary cointegrated VAR(p) model Yt = Φ1Yt−1 + … + ΦpYt−p + at, where the at are normally distributed with mean 0 and covariance matrix Σ. In a series of influential papers, Johansen (1988, 1991) and Johansen and Juselius (1990) proposed practical full maximum likelihood estimation and testing approaches based on the error correction representation (ECM).

The Johansen Trace and Maximal Eigenvalue Tests Consider the ECM
ΔYt = Γ1ΔYt−1 + … + Γp−1ΔYt−p+1 + BA′Yt−1 + Φdt + at,
where dt is a vector of deterministic variables, such as a constant and seasonal dummy variables, the Γi are m×m, A and B are m×k parameter matrices, the at are i.i.d. Nm(0, Σ) errors, and det(Γ(z)), with Γ(z) = I − Γ1z − … − Γp−1z^(p−1), has all of its roots outside the unit circle.

The Johansen Trace and Maximal Eigenvalue Tests This ECM is based on the Engle-Granger (1987) error correction representation theorem for cointegrated systems, and the asymptotic inference involved is related to the work of Sims, Stock, and Watson (1990). By step-wise concentrating all of the parameter matrices out of the likelihood function except the matrix A, Johansen shows that the maximum likelihood estimator of A can be derived as the solution of a generalized eigenvalue problem. Likelihood ratio tests of hypotheses about the number of cointegrating vectors can then be based on these eigenvalues. Moreover, Johansen (1988) also proposes likelihood ratio tests for linear restrictions on the cointegrating vectors.

The Johansen Trace and Maximal Eigenvalue Tests The Johansen test for the existence of cointegration is based on estimating the above ECM by maximum likelihood and is used to test the hypothesis H(k): rank(Π) ≤ k, i.e., Π = BA′, where k is less than m. This formulation shows that the I(1) models form a nested sequence H(0) ⊂ H(1) ⊂ … ⊂ H(m), where H(m) is the unrestricted VAR or I(0) model and H(0) corresponds to the restriction Π = 0, the VAR model in differences. Since Π = BA′, it is equivalent to test that A and B are of full column rank k, the number of independent cointegrating vectors that form the matrix A. The test has been named the Johansen trace test because the likelihood ratio test statistic is the trace of a diagonal matrix of generalized eigenvalues.

The Johansen Trace and Maximal Eigenvalue Tests Sequential tests:
i. H0: k = 0 (at most zero cointegrating vectors): cannot be rejected → stop; rejected → next test.
ii. H0: k ≤ 1 (at most one cointegrating vector): cannot be rejected → stop, k = 1; rejected → next test.
iii. H0: k ≤ 2 (at most two cointegrating vectors): cannot be rejected → stop, k = 2; rejected → next test.
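
As a concrete reading of this sequence, the trace column of the I(1) rank test in the SAS example later in these notes is 55.9633 (k = 0), 20.6542 (k ≤ 1), 2.6477 (k ≤ 2), 0.0149 (k ≤ 3), with 5% critical values 47.21, 29.38, 15.34, 3.84: the first test rejects k = 0 (55.9633 > 47.21), the second cannot reject k ≤ 1 (20.6542 < 29.38), so the procedure stops with k = 1.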

The Johansen Trace and Maximal Eigenvalue Tests (i) Rank k = m: all variables in x are I(0); not an interesting case to start with. (ii) Rank k = 0: there are no linear combinations of x that are I(0), no cointegration exists, and Π is full of zeros; model the differenced series. (iii) Rank k ≤ (m−1): up to (m−1) cointegration relationships A′xt−k, i.e., k ≤ (m−1) rows of A′ form k linearly independent combinations of the variables in x, each of which is I(0); alternatively, the remaining (m−k) nonstationary vectors form I(1) stochastic trends.

The Johansen Trace and Maximal Eigenvalue Tests Under some regularity conditions, we can write the cointegrated process as an error correction model (ECM)
ΔYt = Γ1ΔYt−1 + … + Γp−1ΔYt−p+1 + BA′Yt−1 + at,
where Δ is the difference operator (ΔYt = Yt − Yt−1) and the at's are i.i.d. Nm(0, Σ).

The Johansen Trace and Maximal Eigenvalue Tests We can write this ECM compactly as Z0t = BA′Z1t + ΨZ2t + at, where Z0t = ΔYt, Z1t = Yt−1, and Z2t stacks the lagged differences ΔYt−1, …, ΔYt−p+1 (and any deterministic terms). The likelihood ratio statistic for the hypothesis H(k) is
LRtrace(k) = −T Σ(i=k+1 to m) ln(1 − λ̂i),
where the λ̂i denote the generalized eigenvalues from the reduced rank regression, ordered by λ̂1 ≥ λ̂2 ≥ … ≥ λ̂m.
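
A worked numerical illustration with assumed values: suppose m = 4, T = 100, and the estimated eigenvalues are λ̂ = (0.30, 0.15, 0.05, 0.01). Then LRtrace(0) = −100[ln(0.70) + ln(0.85) + ln(0.95) + ln(0.99)] ≈ 58.1, LRtrace(1) ≈ 22.4, LRtrace(2) ≈ 6.1, and LRtrace(3) ≈ 1.0; the corresponding maximal eigenvalue statistic (introduced below) for k = 0 is −100 ln(0.70) ≈ 35.7. Each value is then compared with the critical value for that rank.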

The Johansen Trace and Maximal Eigenvalue Tests If the test statistic is greater than the critical value for rank k, then the null hypothesis that the cointegration rank is at most k is rejected.

The Johansen Trace and Maximal Eigenvalue Tests The statistic LRtrace(k) has a limiting distribution that can be expressed in terms of an (m−k)-dimensional standard Brownian motion W as
tr{ ∫(dW)W′ [∫WW′ dr]⁻¹ ∫W(dW)′ }.
The percentiles of the asymptotic distribution of the trace statistic are tabulated in Johansen (1988, Table 1) using simulation analysis.

The Johansen Trace and Maximal Eigenvalue Tests An alternative LR statistic, λmax(k) = −T ln(1 − λ̂k+1), called the maximal eigenvalue statistic, examines the null hypothesis of k cointegrating vectors against the alternative of k+1 cointegrating vectors. The asymptotic distribution of this statistic is given by the maximum eigenvalue of the stochastic matrix appearing in the limiting distribution above.

Analysis of U.S. Economic Variables (From SAS Online Doc) Consider the following four-dimensional system of U.S. economic variables. Quarterly data for the years 1954 to 1987 are used (Lütkepohl 1993, Table E.3). The following statements plot the series and proceed with the VARMAX procedure.

SAS Code

symbol1 v=none height=1 c=black;
symbol2 v=none height=1 c=black;
title 'Analysis of U.S. Economic Variables';

data us_money;
   date=intnx('qtr', '01jan54'd, _n_-1);
   format date yyq.;
   input y1 y2 y3 y4 @@;
   y1=log(y1);
   y2=log(y2);
   label y1='log(real money stock M1)'
         y2='log(GNP in bil. of 1982 dollars)'
         y3='Discount rate on 91-day T-bills'
         y4='Yield on 20-year Treasury bonds';
datalines;
... data lines omitted ...
;

legend1 across=1 frame label=none;

SAS Code (Contd.)

proc gplot data=us_money;
   symbol1 i=join l=1;
   symbol2 i=join l=2;
   axis2 label=(a=-90 r=90 " ");
   plot y1*date=1 y2*date=2 / overlay vaxis=axis2 legend=legend1;
run;
   plot y3*date=1 y4*date=2 / overlay vaxis=axis2 legend=legend1;
run;

proc varmax data=us_money;
   id date interval=qtr;
   model y1-y4 / p=2 lagmax=6 dftest print=(iarr(3))
                 cointtest=(johansen=(iorder=2))
                 ecm=(rank=1 normalize=y1);
   cointeg rank=1 normalize=y1 exogeneity;
run;

SAS Output This example performs the Dickey-Fuller test for stationarity, the Johansen cointegration test allowing for integration of order 2, and the exogeneity test. A VECM(2) is fitted to the data. From the output shown below, you can see that each series has a unit root and that the system is cointegrated of rank 1 with integration of order 1. The fitted VECM(2) coefficient estimates are given in the output that follows.

SAS Output

Dickey-Fuller Unit Root Tests
Variable  Type            Rho   Pr < Rho    Tau   Pr < Tau
y1        Zero Mean      0.05     0.6934   1.14     0.9343
          Single Mean   -2.97     0.6572  -0.76     0.8260
          Trend         -5.91     0.7454  -1.34     0.8725
y2        Zero Mean      0.13     0.7124   5.14     0.9999
          Single Mean   -0.43     0.9309  -0.79     0.8176
          Trend         -9.21     0.4787  -2.16     0.5063
y3        Zero Mean     -1.28     0.4255  -0.69     0.4182
          Single Mean   -8.86     0.1700  -2.27     0.1842
          Trend        -18.97     0.0742  -2.86     0.1803
y4        Zero Mean      0.40     0.7803   0.45     0.8100
          Single Mean   -2.79     0.6790  -1.29     0.6328
          Trend        -12.12     0.2923  -2.33     0.4170

SAS Output

Cointegration Rank Test for I(2)
r\k-r-s           4          3          2         1   Trace of I(1)   5% CV of I(1)
0         384.60903  214.37904  107.93782  37.02523         55.9633           47.21
1                    219.62395   89.21508  27.32609         20.6542           29.38
2                                73.61779  22.13279          2.6477           15.34
3                                          38.29435          0.0149            3.84
5% CV I(2) 47.21000   29.38000   15.34000   3.84000

SAS Output

Long-Run Parameter Beta Estimates When RANK=1
Variable         1
y1         1.00000
y2        -0.46458
y3        14.51619
y4        -9.35520

Adjustment Coefficient Alpha Estimates When RANK=1
Variable         1
y1        -0.01396
y2        -0.02811
y3        -0.00215
y4         0.00510
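
Read with the normalization on y1, the beta estimates give the long-run relation y1 = 0.46458 y2 − 14.51619 y3 + 9.35520 y4, i.e., β′Yt = y1 − 0.46458 y2 + 14.51619 y3 − 9.35520 y4 is stationary, and the alpha estimates measure how strongly each equation adjusts to deviations from this relation.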

Diagnostic Checks Schematic representation of cross correlations of residuals, by variable and lag (lags 1-6; + is > 2*std error, − is < −2*std error, . is in between).

Portmanteau Test for Cross Correlations of Residuals
Up To Lag   DF   Chi-Square   Pr > ChiSq
    3       16        53.90       <.0001
    4       32        74.03       <.0001
    5       48       103.08       <.0001
    6       64       116.94       <.0001

Diagnostic Checks

Univariate Model ANOVA Diagnostics
Variable   R-Square   Standard Deviation   F Value   Pr > F
y1           0.6754              0.00712     32.51   <.0001
y2           0.3070              0.00843      6.92   <.0001
y3           0.1328              0.00807      2.39   0.0196
y4           0.0831              0.00403      1.42   0.1963

Univariate Model White Noise Diagnostics
                            Normality                 ARCH
Variable   Durbin Watson   Chi-Square   Pr > ChiSq   F Value   Pr > F
y1               2.13418         7.19       0.0275      1.62   0.2053
y2               2.04003         1.20       0.5483      1.23   0.2697
y3               1.86892       253.76       <.0001      1.78   0.1847
y4               1.98440       105.21       <.0001     21.01   <.0001

Diagnostic Checks

Univariate Model AR Diagnostics
                 AR1               AR2               AR3               AR4
Variable   F Value  Pr > F   F Value  Pr > F   F Value  Pr > F   F Value  Pr > F
y1            0.68  0.4126      2.98  0.0542      2.01  0.1154      2.48  0.0473
y2            0.05  0.8185      0.12  0.8842      0.41  0.7453      0.30  0.8762
y3            0.56  0.4547      2.86  0.0610      4.83  0.0032      3.71  0.0069
y4            0.01  0.9340      0.16  0.8559      1.21  0.3103      0.95  0.4358

Diagnostic Checks

Testing Weak Exogeneity of Each Variable
Variable   DF   Chi-Square   Pr > ChiSq
y1          1         6.55       0.0105
y2          1        12.54       0.0004
y3          1         0.09       0.7695
y4          1         1.81       0.1786

This tests whether each variable is weakly exogenous with respect to the other variables. The variable y1 is not weakly exogenous with respect to y2, y3, and y4; the variable y2 is not weakly exogenous with respect to y1, y3, and y4. If a variable can be taken as "given" without losing information for the purpose of statistical inference, it is called weakly exogenous. Weak exogeneity ⇒ long-run noncausality.