Generalized method of moments estimation FINA790C HKUST Spring 2006.


Outline
Basic motivation and examples
First case: linear model and instrumental variables
 – Estimation
 – Asymptotic distribution
 – Test of model specification
General case of GMM
 – Estimation
 – Testing

Basic motivation
Method of moments: suppose we want to estimate the population mean μ and population variance σ² of a random variable v_t.
These two parameters satisfy the population moment conditions
 E[v_t] – μ = 0
 E[v_t²] – (μ² + σ²) = 0

Method of moments
So let’s think of the analogous sample moment conditions:
 T⁻¹ Σ v_t – μ = 0
 T⁻¹ Σ v_t² – (μ² + σ²) = 0
This implies
 μ* = T⁻¹ Σ v_t
 σ*² = T⁻¹ Σ (v_t – μ*)²
Basic idea: population moment conditions provide information which can be used for estimation of parameters.
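A minimal sketch (in Python, not from the slides) of solving these two sample moment conditions on simulated data; the variable names are illustrative.

```python
# Method-of-moments estimates of the mean and variance of v_t,
# obtained by solving the two sample moment conditions above.
import numpy as np

rng = np.random.default_rng(0)
v = rng.normal(loc=1.0, scale=2.0, size=1000)      # simulated sample of v_t

mu_star = v.mean()                                  # solves T^-1 sum(v_t) - mu = 0
sigma2_star = np.mean((v - mu_star) ** 2)           # solves T^-1 sum(v_t^2) - (mu^2 + sigma^2) = 0
print(mu_star, sigma2_star)
```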

Example
Simple supply and demand system:
 q_t^D = α p_t + u_t^D
 q_t^S = β₁ n_t + β₂ p_t + u_t^S
 q_t^D = q_t^S = q_t
q_t^D, q_t^S are quantity demanded and supplied; p_t is price and q_t equals quantity produced.
Problem: how to estimate α given q_t and p_t. The OLS estimator runs into problems because p_t is determined jointly with q_t and so is correlated with the demand shock u_t^D.

Example
One solution: find z_t^D such that cov(z_t^D, u_t^D) = 0.
Then cov(z_t^D, q_t) – α cov(z_t^D, p_t) = 0.
So if E[u_t^D] = 0 then E[z_t^D q_t] – α E[z_t^D p_t] = 0.
Method of moments leads to
 α* = (T⁻¹ Σ z_t^D q_t) / (T⁻¹ Σ z_t^D p_t)
which is the instrumental variables estimator of α with instrument z_t^D.
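A minimal sketch (not from the slides) of this simple IV estimator; q, p and z are assumed to be arrays holding the quantity, price and instrument series.

```python
import numpy as np

def iv_estimate(q: np.ndarray, p: np.ndarray, z: np.ndarray) -> float:
    """alpha* = sum(z_t q_t) / sum(z_t p_t): the IV estimator of alpha in q_t = alpha p_t + u_t."""
    return float(np.sum(z * q) / np.sum(z * p))
```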

Method of moments
Population moment condition: a vector of observed variables v_t and a vector of p parameters θ satisfy a p×1 vector of conditions
 E[f(v_t, θ₀)] = 0 for all t
The method of moments estimator θ*_T solves the analogous sample moment conditions
 g_T(θ*_T) = T⁻¹ Σ f(v_t, θ*_T) = 0   (1)
where T is the sample size.
Intuitively, θ*_T converges in probability to θ₀, the value that satisfies the population moment condition.

Generalized method of moments
Now suppose f is a q×1 vector with q > p.
The GMM estimator chooses the value of θ which comes closest to satisfying (1) as the estimator of θ₀, where “closeness” is measured by
 Q_T(θ) = {T⁻¹ Σ f(v_t, θ)}′ W_T {T⁻¹ Σ f(v_t, θ)} = g_T(θ)′ W_T g_T(θ)
where W_T is positive semi-definite and plim(W_T) = W, a positive definite matrix.
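A minimal sketch (not from the slides) of this criterion function; f is any user-supplied moment function returning a q-vector for one observation, data is an iterable of observations v_t, and W is a q×q weighting matrix.

```python
import numpy as np

def gmm_criterion(theta, f, data, W):
    """Q_T(theta) = g_T(theta)' W g_T(theta)."""
    moments = np.array([f(v_t, theta) for v_t in data])   # T x q matrix of f(v_t, theta)
    g_T = moments.mean(axis=0)                            # sample moment g_T(theta)
    return float(g_T @ W @ g_T)
```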

GMM example 1
Power utility based asset pricing model:
 E_t[δ (C_t+1/C_t)^(–γ) R_it+1 – 1] = 0
with unknown parameters δ and γ.
The population unconditional moment conditions are
 E[(δ (C_t+1/C_t)^(–γ) R_it+1 – 1) z_jt] = 0 for j = 1,…,q
where z_jt is in the information set.
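A minimal sketch (not from the slides, with assumed variable names) of the corresponding moment function; cons_growth, returns and instruments hold C_t+1/C_t, R_it+1 and the q instruments z_jt.

```python
import numpy as np

def euler_moments(theta, cons_growth, returns, instruments):
    """Moment observations (delta*(C_{t+1}/C_t)^(-gamma)*R_{it+1} - 1)*z_jt, one row per period."""
    delta, gamma = theta
    pricing_error = delta * cons_growth ** (-gamma) * returns - 1.0   # length-T pricing errors
    return instruments * pricing_error[:, None]                       # T x q matrix of moments
```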

GMM example 2
CAPM says E[R_it(1 – β_i) – β_i R_mt+1] = 0.
Market efficiency: E[(R_it(1 – β_i) – β_i R_mt+1) z_jt] = 0.

GMM example 3
Suppose the conditional probability density function of the continuous stationary random vector v_t, given V_t-1 = {v_t-1, v_t-2, …}, is p(v_t; θ₀, V_t-1).
The MLE of θ₀ based on the conditional log likelihood function is the value of θ which maximizes
 L_T(θ) = Σ ln{p(v_t; θ, V_t-1)},
i.e. which solves ∂L_T(θ)/∂θ = 0.
This implies that the MLE is just the GMM estimator based on the population moment condition
 E[∂ln{p(v_t; θ₀, V_t-1)}/∂θ] = 0

GMM estimation
The GMM estimator θ*_T = argmin θ∈Θ Q_T(θ) generates the first order conditions
 {T⁻¹ Σ ∂f(v_t, θ*_T)/∂θ′}′ W_T {T⁻¹ Σ f(v_t, θ*_T)} = 0   (2)
where ∂f(v_t, θ*_T)/∂θ′ is the q×p matrix with (i, j) element ∂f_i(v_t, θ*_T)/∂θ_j.
There is typically no closed form solution for θ*_T, so it must be obtained through numerical optimization methods.
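A minimal sketch (not from the slides) of such a numerical optimization, using scipy.optimize.minimize as an assumed tool choice; f, data and W are as in the criterion sketch above, and theta0 is a starting value.

```python
import numpy as np
from scipy.optimize import minimize

def gmm_estimate(f, data, W, theta0):
    """Return theta*_T = argmin_theta g_T(theta)' W g_T(theta)."""
    def criterion(theta):
        g_T = np.mean([f(v_t, theta) for v_t in data], axis=0)   # g_T(theta)
        return float(g_T @ W @ g_T)                              # Q_T(theta)
    return minimize(criterion, np.asarray(theta0, dtype=float), method="Nelder-Mead").x
```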

Example: instrumental variable estimation of a linear model
Suppose we have y_t = x_t′θ₀ + u_t, t = 1,…,T,
where x_t is a p×1 vector of observed explanatory variables, y_t is an observed scalar, and u_t is an unobserved error term with E[u_t] = 0.
Let z_t be a q×1 vector of instruments such that E[z_t u_t] = 0 (contemporaneously uncorrelated).
The problem is to estimate θ₀.

IV estimation
The GMM estimator is θ*_T = argmin θ∈Θ Q_T(θ), where
 Q_T(θ) = {T⁻¹ u(θ)′Z} W_T {T⁻¹ Z′u(θ)}
The FOCs are
 (T⁻¹ X′Z) W_T (T⁻¹ Z′y) = (T⁻¹ X′Z) W_T (T⁻¹ Z′X) θ*_T
When the number of instruments q equals the number of parameters p (“just-identified”) and (T⁻¹ Z′X) is nonsingular, then
 θ*_T = (T⁻¹ Z′X)⁻¹ (T⁻¹ Z′y)
independently of the weighting matrix W_T.

IV estimation
When the number of instruments q exceeds the number of parameters p (“over-identified”), then
 θ*_T = {(T⁻¹ X′Z) W_T (T⁻¹ Z′X)}⁻¹ (T⁻¹ X′Z) W_T (T⁻¹ Z′y)
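A minimal sketch (not from the slides) of this closed-form estimator for data arrays y (length T), X (T×p), Z (T×q) and a weighting matrix W (q×q); in the just-identified case it reduces to (Z′X)⁻¹Z′y.

```python
import numpy as np

def linear_iv_gmm(y, X, Z, W):
    """theta*_T = {(T^-1 X'Z) W (T^-1 Z'X)}^-1 (T^-1 X'Z) W (T^-1 Z'y)."""
    T = len(y)
    XZ = X.T @ Z / T          # T^-1 X'Z
    ZX = Z.T @ X / T          # T^-1 Z'X
    Zy = Z.T @ y / T          # T^-1 Z'y
    return np.linalg.solve(XZ @ W @ ZX, XZ @ W @ Zy)
```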

Identifying and over-identifying restrictions
Go back to the first-order conditions
 (T⁻¹ X′Z) W_T (T⁻¹ Z′u(θ*_T)) = 0
These imply that GMM is a method of moments estimator based on the population moment conditions
 E[x_t z_t′] W E[z_t u_t(θ₀)] = 0
When q = p, GMM is the method of moments estimator based on E[z_t u_t(θ₀)] = 0.
When q > p, GMM sets p linear combinations of E[z_t u_t(θ₀)] equal to 0.

Identifying and over-identifying restrictions: details
From (2), GMM is a method of moments estimator based on the population moments
 {E[∂f(v_t, θ₀)/∂θ′]}′ W {E[f(v_t, θ₀)]} = 0   (3)
Let F(θ₀) = W^½ E[∂f(v_t, θ₀)/∂θ′] and rank(F(θ₀)) = p. (The rank condition is necessary for identification of θ₀.)
Rewrite (3) as F(θ₀)′ W^½ E[f(v_t, θ₀)] = 0, or
 F(θ₀)[F(θ₀)′F(θ₀)]⁻¹ F(θ₀)′ W^½ E[f(v_t, θ₀)] = 0   (4)
This says that the least squares projection of W^½ E[f(v_t, θ₀)] onto the column space of F(θ₀) is zero. In other words the GMM estimator is based on rank{F(θ₀)[F(θ₀)′F(θ₀)]⁻¹ F(θ₀)′} = p restrictions on the (transformed) population moment condition W^½ E[f(v_t, θ₀)]. These are the identifying restrictions; GMM chooses θ*_T to satisfy them.

Identifying and over-identifying restrictions: details
The restrictions that are left over are
 {I_q – F(θ₀)[F(θ₀)′F(θ₀)]⁻¹ F(θ₀)′} W^½ E[f(v_t, θ₀)] = 0
This says that the projection of W^½ E[f(v_t, θ₀)] onto the orthogonal complement of F(θ₀) is zero, generating q – p restrictions on the transformed population moment condition. These over-identifying restrictions are ignored by the GMM estimator, so they need not be satisfied in the sample.
From (2),
 W_T^½ T⁻¹ Σ f(v_t, θ*_T) = {I_q – F_T(θ*_T)[F_T(θ*_T)′F_T(θ*_T)]⁻¹ F_T(θ*_T)′} W_T^½ T⁻¹ Σ f(v_t, θ*_T)
where F_T(θ) = W_T^½ T⁻¹ Σ ∂f(v_t, θ)/∂θ′. Therefore Q_T(θ*_T) is like a sum of squared residuals, and can be interpreted as a measure of how far the sample is from satisfying the over-identifying restrictions.

Asymptotic properties of the GMM instrumental variables estimator
It can be shown that:
 θ*_T is consistent for θ₀
 (θ*_T,i – θ_0,i) / √(V*_T,ii/T) ~ N(0,1) asymptotically, where
 – V*_T = (X′Z W_T Z′X)⁻¹ X′Z W_T S*_T W_T Z′X (X′Z W_T Z′X)⁻¹
 – S*_T is a consistent estimator of S = lim_T→∞ Var[T^(–½) Z′u]

Covariance matrix estimation
Assuming u_t is serially uncorrelated,
 Var[T^(–½) Z′u] = E[{T^(–½) Σ z_t u_t}{T^(–½) Σ z_t u_t}′] = T⁻¹ Σ E[u_t² z_t z_t′]
Therefore we can use
 S*_T = T⁻¹ Σ u_t(θ*_T)² z_t z_t′
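A minimal sketch (not from the slides) of this heteroskedasticity-robust estimate, with residuals u (length T) and instruments Z (T×q).

```python
import numpy as np

def s_hat_white(u, Z):
    """S*_T = T^-1 * sum_t u_t^2 * z_t z_t' for serially uncorrelated errors."""
    return (Z * (u ** 2)[:, None]).T @ Z / len(u)
```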

Two-step estimator
Notice that the asymptotic variance depends on the weighting matrix W_T.
The optimal choice is W_T = S*_T⁻¹, which gives V*_T = (X′Z S*_T⁻¹ Z′X)⁻¹.
But we need θ*_T to construct S*_T. This suggests a two-step (iterative) procedure:
 –(a) estimate with a sub-optimal W_T, get θ*_T(1), get S*_T(1)
 –(b) estimate with W_T = S*_T(1)⁻¹
 –(c) iterated GMM: from (b) get θ*_T(2), get S*_T(2), set W_T = S*_T(2)⁻¹, repeat
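A minimal sketch (not from the slides) of this two-step/iterated procedure for the linear IV model; starting from the weighting matrix (Z′Z/T)⁻¹ is one common sub-optimal first-step choice, an assumption rather than something the slides prescribe.

```python
import numpy as np

def iterated_iv_gmm(y, X, Z, n_steps=2):
    T = len(y)
    W = np.linalg.inv(Z.T @ Z / T)                           # sub-optimal first-step weighting matrix
    for _ in range(n_steps):
        XZ, ZX, Zy = X.T @ Z / T, Z.T @ X / T, Z.T @ y / T
        theta = np.linalg.solve(XZ @ W @ ZX, XZ @ W @ Zy)    # current GMM estimate
        u = y - X @ theta                                    # residuals u_t(theta*_T)
        S = (Z * (u ** 2)[:, None]).T @ Z / T                # S*_T = T^-1 sum u_t^2 z_t z_t'
        W = np.linalg.inv(S)                                 # optimal weighting for the next step
    return theta, S
```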

GMM and instrumental variables
If u_t is homoskedastic, var[u] = σ² I_T, then S*_T = σ*² T⁻¹ Z′Z, where Z is the T×q matrix with t-th row z_t′ and σ*² is a consistent estimate of σ².
Choosing this as the weighting matrix, W_T = S*_T⁻¹, gives
 θ*_T = {X′Z(Z′Z)⁻¹Z′X}⁻¹ X′Z(Z′Z)⁻¹Z′y = (X^′X^)⁻¹ X^′y
where X^ = Z(Z′Z)⁻¹Z′X is the predicted value of X from a regression of X on Z.
This is two-stage least squares (2SLS): first regress X on Z, then regress y on the fitted value of X from the first stage.
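A minimal sketch (not from the slides) of 2SLS computed this way.

```python
import numpy as np

def two_stage_least_squares(y, X, Z):
    """2SLS: regress X on Z, then regress y on the fitted values X_hat."""
    X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)            # first stage: X_hat = Z(Z'Z)^-1 Z'X
    return np.linalg.solve(X_hat.T @ X_hat, X_hat.T @ y)     # second stage: (X_hat'X_hat)^-1 X_hat'y
```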

Model specification test
The identifying restrictions are satisfied in the sample regardless of whether the model is correct.
The over-identifying restrictions are not imposed in the sample.
This gives rise to the over-identifying restrictions test
 J_T = T Q_T(θ*_T) = {T^(–½) u(θ*_T)′Z} S*_T⁻¹ {T^(–½) Z′u(θ*_T)}
Under H₀: E[z_t u_t(θ₀)] = 0, J_T asymptotically follows a χ²(q – p) distribution.
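A minimal sketch (not from the slides) of this test for the linear IV case, assuming theta was estimated with the optimal weighting matrix and S is the corresponding estimate of S*_T; scipy.stats.chi2 supplies the p-value.

```python
import numpy as np
from scipy.stats import chi2

def j_test(y, X, Z, theta, S):
    """Return the J_T statistic and its asymptotic chi-squared(q - p) p-value."""
    T, q = Z.shape
    p = X.shape[1]
    Zu = Z.T @ (y - X @ theta) / np.sqrt(T)        # T^(-1/2) Z'u(theta*_T)
    J = float(Zu @ np.linalg.solve(S, Zu))         # J_T = T * Q_T(theta*_T)
    return J, chi2.sf(J, df=q - p)
```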

(From last time)
In this case the MRS is m_t,t+j = δ^j (C_t+j/C_t)^(–γ).
Substituting in above gives:
 E_t[ln R_it,t+j] = –j ln δ + γ E_t[Δln c_t+j] – ½γ² var_t[Δln c_t+j] – ½ var_t[ln R_it,t+j] + γ cov_t[Δln c_t+j, ln R_it,t+j]

Example: power utility + lognormal distributions
The first order condition from an investor maximizing power utility, together with the log-normal distribution, gives:
 ln R_it+1 = μ_i + γ Δln C_t+1 + u_it+1
The error term u_it+1 could be correlated with Δln C_t+1, so we can’t use OLS.
However, E_t[u_it+1] = 0 means u_it+1 is uncorrelated with any z_t known at time t.

Instrumental variables regressions for returns and consumption growth
[Table: first-stage and second-stage IV estimates of γ and the J_T test, for the commercial paper (CP) and stock returns]
Annual US data on growth in log real consumption of nondurables & services, the log real return on the S&P 500, and the real return on 6-month commercial paper. Instruments are 2 lags each of: the real commercial paper rate, the real consumption growth rate, and the log dividend-price ratio.

GMM estimation – general case
Go back to GMM estimation, but let f be a vector of continuous nonlinear functions of the data and unknown parameters.
In our case we have N assets, and the moment condition is that
 E[(m(x_t, θ₀) R_it – 1) z_jt-1] = 0
using instruments z_jt-1, for each asset i = 1,…,N and each instrument j = 1,…,q.
Collect these as f(v_t, θ) = z_t-1′ ⊗ u_t(x_t, θ), where z_t-1 is a 1×q vector of instruments and u_t is an N×1 vector. f is a column vector with qN elements – it contains the cross-product of each instrument with each element of u.
The population moment condition is E[f(v_t, θ₀)] = 0.
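A minimal sketch (not from the slides, with assumed variable names) of stacking these moments: each row of the result is f(v_t, θ) = z_t-1′ ⊗ u_t.

```python
import numpy as np

def stacked_moments(pricing_errors, instruments):
    """pricing_errors: T x N array of u_t(x_t, theta); instruments: T x q array of z_{t-1}.
    Returns a T x (q*N) array whose rows are the stacked moments f(v_t, theta)."""
    T = pricing_errors.shape[0]
    return np.vstack([np.kron(instruments[t], pricing_errors[t]) for t in range(T)])
```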

GMM estimation – general case
As before, let g_T(θ) = T⁻¹ Σ f(v_t, θ). The GMM estimator is θ*_T = argmin θ∈Θ Q_T(θ).
The FOCs are
 G_T(θ*_T)′ W_T g_T(θ*_T) = 0
where G_T(θ) is the matrix of partial derivatives with (i, j) element ∂g_Ti(θ)/∂θ_j.

GMM asymptotics
It can be shown that:
 θ*_T is consistent for θ₀
 the asymptotic variance of θ*_T is V = M S M′, where
 – M = (G₀′WG₀)⁻¹ G₀′W
 – G₀ = E[∂f(v_t, θ₀)/∂θ′]
 – S = lim_T→∞ Var[√T g_T(θ₀)]

Covariance matrix estimation for GMM
In practice V*_T = M*_T S*_T M*_T′ is a consistent estimator of V, where
 M*_T = (G_T(θ*_T)′ W_T G_T(θ*_T))⁻¹ G_T(θ*_T)′ W_T
and S*_T is a consistent estimator of S.
The estimator of S depends on the time series properties of f(v_t, θ₀). In general
 S = Γ₀ + Σ_i≥1 (Γ_i + Γ_i′)
where Γ_i = E[{f_t – E(f_t)}{f_t-i – E(f_t-i)}′] = E[f_t f_t-i′] is the i-th autocovariance matrix of f_t = f(v_t, θ₀).

GMM asymptotics
So (θ*_T,i – θ_0,i) / √(V*_T,ii/T) ~ N(0,1) asymptotically.
As before, we can choose W_T = S*_T⁻¹, and then
 – V*_T = (G_T(θ*_T)′ S*_T⁻¹ G_T(θ*_T))⁻¹
 – a test of the model’s over-identifying restrictions is given by J_T = T Q_T(θ*_T), which is asymptotically χ²(qN – p)

Covariance matrix estimation
We can consistently estimate S with
 S*_T = Γ*_0(θ*_T) + Σ_i (Γ*_i(θ*_T) + Γ*_i(θ*_T)′)
where Γ*_i(θ*_T) = T⁻¹ Σ_t f(v_t, θ*_T) f(v_t-i, θ*_T)′.
If theory implies that the autocovariances of f(v_t, θ₀) are zero beyond some lag, then we can exclude those terms from S*_T.
 – e.g. if the u_t are serially uncorrelated, S*_T = T⁻¹ Σ_t {u_t(v_t, θ*_T) u_t(v_t, θ*_T)′ ⊗ (z_t′ z_t)}
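A minimal sketch (not from the slides) of estimating S from the sample autocovariances of the moment series f_t, truncated at a chosen maximum lag; the optional Bartlett (Newey-West) weights, which keep the estimate positive semi-definite, are an assumption beyond what the slides state.

```python
import numpy as np

def s_hat(f, max_lag, bartlett=True):
    """f: T x (qN) array of moment observations f(v_t, theta*_T)."""
    T = f.shape[0]
    S = f.T @ f / T                                   # Gamma*_0
    for i in range(1, max_lag + 1):
        Gamma_i = f[i:].T @ f[:-i] / T                # Gamma*_i = T^-1 sum_t f_t f_{t-i}'
        weight = 1.0 - i / (max_lag + 1) if bartlett else 1.0
        S += weight * (Gamma_i + Gamma_i.T)
    return S
```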

GMM adjustments
Iterated GMM is recommended in small samples.
More powerful tests are obtained by subtracting the sample means of f_t(v_t, θ*_T) when calculating Γ*_i(θ*_T).
Asymptotic standard errors may be understated in small samples: multiply the asymptotic variances by a “degrees of freedom adjustment” such as T/(T – p) or (N+q)T/{(N+q)T – k}, where k = p + ((Nq)² + Nq)/2.