3SLS


3SLS 3SLS is the combination of 2SLS and SUR. It is used in a system of equations which are endogenous, i.e. in each equation there are endogenous variables on both the left- and right-hand sides of the equation. THAT IS THE 2SLS PART. But the error terms in each equation are also correlated with those in the other equations, and efficient estimation requires that we take account of this. THAT IS THE SUR (SEEMINGLY UNRELATED REGRESSIONS) PART. Hence in the regression for the ith equation there are endogenous (Y) variables on the right-hand side AND the error term is correlated with the error terms in the other equations.
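As a sketch of the notation (this two-equation version is my own illustration, not from the slides):

$$
\begin{aligned}
y_{1} &= \gamma_{1}\, y_{2} + \beta_{1}' x_{1} + u_{1} \\
y_{2} &= \gamma_{2}\, y_{1} + \beta_{2}' x_{2} + u_{2}, \qquad \operatorname{Cov}(u_{1}, u_{2}) = \sigma_{12} \neq 0
\end{aligned}
$$

The endogenous y on the right-hand side of each equation is the 2SLS part; the non-zero σ12 is the SUR part.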

3SLS log using "g:summ1.log" If you type the above then a log is created on drive g (on my computer this is the flash drive; on yours you may need to specify another drive). The name summ1 can be anything, but the suffix must be .log. At the end you can close the log by typing: log close So open a log now and you will have a record of this session.
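As a minimal sketch of the whole pattern (the drive letter and file name are just examples; change them to suit your machine):

* open a log file to record everything typed and printed this session
log using "g:summ1.log"
* ... the session's commands go here ...
* close the log at the end of the session
log close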

3SLS Load Data
clear
use http://www.ats.ucla.edu/stat/stata/examples/greene/TBL16-2
THAT link no longer works. But the following does:
webuse klein
In order to get the rest to work:
rename consump c
rename capital1 k1
rename invest i
rename profits p
rename govt g
rename wagegovt wg
rename taxnetx t
rename wagepriv wp
generate x=totinc

*generate variables
generate w = wg+wp
generate k = k1+i
generate yr=year-1931
* [_n-1] refers to the previous observation, so the data must be sorted by year for these to be lags
generate p1 = p[_n-1]
generate x1 = x[_n-1]
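An equivalent way to build the lags, as a sketch, is to use Stata's time-series operators instead of the [_n-1] subscripts above (this assumes the data are yearly):

* declare year as the time variable, then L. gives the one-period lag
tsset year
generate p1 = L.p
generate x1 = L.x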

OLS Regression regress c p p1 w Regresses c on p, p1 and w (what this equation means is not so important here).

Usual output

reg3 With the reg3 command, Stata estimates a system of structural equations in which some equations contain endogenous variables among the explanatory variables. Estimation is via three-stage least squares (3SLS). Typically, the endogenous regressors are dependent variables from other equations in the system. reg3 can also estimate systems of equations by seemingly unrelated regression (SURE), multivariate regression (MVREG), and equation-by-equation ordinary least squares (OLS) or two-stage least squares (2SLS).
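As a sketch, the same three-equation system that is estimated below can be passed to reg3 with these other estimators (the ols version is used again for the Hausman test at the end of this session):

* equation-by-equation OLS: ignores both the endogeneity and the error correlation
reg3 (c p p1 w) (i p p1 k1) (wp x x1 yr), ols
* seemingly unrelated regression: handles the error correlation but not the endogeneity
reg3 (c p p1 w) (i p p1 k1) (wp x x1 yr), sure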

2SLS Regression reg3 (c p p1 w), 2sls inst(t wg g yr p1 x1 k1) Regresses c on p, p1 and w. The instruments (i.e. the predetermined or exogenous variables in this equation and the rest of the system) are t wg g yr p1 x1 k1. This means that p and w (which are not included in the instruments) are endogenous.

The output is as before, but it confirms what the exogenous and endogenous variables are.

2SLS Regression ivreg c p1 (p w = t wg g yr p1 x1 k1) This is an alternative command to do the same thing. Note that the endogenous variables on the right-hand side of the equation are specified before the = sign inside the parentheses, as in (p w = ...), and the instruments follow the = sign.
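In more recent versions of Stata, ivreg has been superseded by ivregress. A sketch of the same regression, assuming Stata 10 or later; note that with ivregress the included exogenous regressor p1 is listed outside the parentheses and only the excluded instruments go after the = sign:

* 2SLS via the newer command; p1 is automatically used as an instrument for itself
ivregress 2sls c p1 (p w = t wg g yr x1 k1)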

The results are identical.

3SLS Regression reg3 (c p p1 w) (i p p1 k1) (wp x x1 yr), 3sls inst(t wg g yr p1 x1 k1) This format does two new things. First, it specifies all three equations in the system. Note it has to do this, because it needs to calculate the covariances between the error terms, and for this it needs to know what the equations – and hence the errors – are. Secondly, it says 3sls not 2sls.

All 3 equations are printed out. This tells us what these equations look like.

Let's compare the three different sets of estimates. Look at the coefficient on w. In OLS it is very significant, in 2SLS it is not significant, but in 3SLS it is back to being similar to the OLS estimate and significant. That is odd: I would expect that if 2SLS is different because of bias, then 3SLS should be too. As it stands it suggests that OLS is closer to 3SLS than 2SLS is to 3SLS, which does not make an awful lot of sense. But we do not have many observations; perhaps that is partly why.

3SLS Regression reg3 (c p p1 w) (i p p1 k1) (wp x x1 yr), 3sls inst(t wg g yr p1 x1 k1) matrix sig=e(Sigma) Now this command stores the variances and covariances between the error terms in a matrix I call sig. You have used generate to generate variables and scalar to generate scalars; similarly, matrix produces a matrix. e(Sigma) stores this variance-covariance matrix from the previous regression.
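A quicker way to see the whole matrix at once, rather than element by element, is matrix list:

* print the stored variance-covariance matrix of the errors in one command
matrix list sig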

3SLS Regression
reg3 (c p p1 w) (i p p1 k1) (wp x x1 yr), 3sls inst(t wg g yr p1 x1 k1)
matrix sig=e(Sigma)
display sig[1,1], sig[1,2], sig[1,3]
display sig[2,1], sig[2,2], sig[2,3]
display sig[3,1], sig[3,2], sig[3,3]
The output:
. display sig[1,1], sig[1,2], sig[1,3]
1.0440596 .43784767 -.3852272
. display sig[2,1], sig[2,2], sig[2,3]
.43784767 1.3831832 .19260612
. display sig[3,1], sig[3,2], sig[3,3]
-.3852272 .19260612 .47642626
Here sig[1,1] is the variance of the 1st error term, and sig[2,3] is the covariance of the error terms from equations 2 and 3.

3SLS Regression This relates to the variance-covariance matrix in the lecture. Hence 0.437848 relates to σ12 and, because the matrix is symmetric, also to σ21. This matrix is Σ.

3SLS Regression display sig[1,2]/( sig[1,1]^0.5 * sig[2,2]^0.5) Now this should give the correlation between the error terms from equations 1 and 2. It is this formula: Correlation(x, y) = σxy /(σx σy). When we do this we get approximately 0.364.
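As a sketch, the same calculation can be stored in a scalar for reuse (scalar and the sqrt() function are both built in):

* correlation of the errors from equations 1 and 2, kept as a named scalar
scalar rho12 = sig[1,2]/(sqrt(sig[1,1])*sqrt(sig[2,2]))
display "correlation of errors 1 and 2 = " rho12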

Let's check
reg3 (c p p1 w) (i p p1 k1) (wp x x1 yr), 3sls inst(t wg g yr p1 x1 k1)
matrix sig=e(Sigma)
matrix cy= e(b)
generate rc=c-(cy[1,1]*p+ cy[1,2]*p1+ cy[1,3]*w+cy[1,4])
generate ri=i-(cy[1,5]*p+ cy[1,6]*p1+ cy[1,7]*k1+ cy[1,8])
correlate ri rc
matrix cy= e(b) stores the coefficients from the regression in a row vector we call cy. cy[1,1] is the first coefficient, on p, in the first equation; cy[1,4] is the fourth coefficient in the first equation (the constant term); cy[1,5] is the first coefficient, on p, in the second equation. Note this is cy[1,5] NOT cy[2,1].

Let's check
reg3 (c p p1 w) (i p p1 k1) (wp x x1 yr), 3sls inst(t wg g yr p1 x1 k1)
matrix sig=e(Sigma)
matrix cy= e(b)
generate rc=c-(cy[1,1]*p+ cy[1,2]*p1+ cy[1,3]*w+cy[1,4])
generate ri=i-(cy[1,5]*p+ cy[1,6]*p1+ cy[1,7]*k1+ cy[1,8])
correlate ri rc
Thus cy[1,1]*p+ cy[1,2]*p1+ cy[1,3]*w+cy[1,4] is the predicted value from the first equation, and i-(cy[1,5]*p+ cy[1,6]*p1+ cy[1,7]*k1+ cy[1,8]) is the actual minus the predicted value, i.e. the residual (estimated error term) from the 2nd equation. correlate ri rc prints out the correlation between the two error terms.
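The same residuals can also be obtained with reg3's predict command rather than built by hand; a sketch, where equation(#1) refers to the first equation in the system:

* after the reg3 above, get the residuals of equations 1 and 2 directly
predict rc2, equation(#1) residuals
predict ri2, equation(#2) residuals
correlate ri2 rc2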

Let's check
reg3 (c p p1 w) (i p p1 k1) (wp x x1 yr), 3sls inst(t wg g yr p1 x1 k1)
matrix sig=e(Sigma)
matrix cy= e(b)
generate rc=c-(cy[1,1]*p+ cy[1,2]*p1+ cy[1,3]*w+cy[1,4])
generate ri=i-(cy[1,5]*p+ cy[1,6]*p1+ cy[1,7]*k1+ cy[1,8])
correlate ri rc
The correlation is 0.30, close to what we had before but not the same. Now the main purpose of this class is to illustrate commands, so it's not too important. I suspect the difference arises because Stata calculates the e(Sigma) matrix by dividing by n rather than by n-k.

Let's check Click on Help (on the toolbar at the top of the screen, to the right). Click on 'Stata command'. In the dialogue box type reg3. Move down towards the end of the file and you get the following.

Some important retrievables
e(mss_#) model sum of squares for equation #
e(rss_#) residual sum of squares for equation #
e(r2_#) R-squared for equation #
e(F_#) F statistic for equation # (small)
e(rmse_#) root mean squared error for equation #
e(ll) log likelihood
where # is an equation number, e.g. 2 means equation 2.
And matrices:
e(b) coefficient vector
e(Sigma) Sigma hat matrix
e(V) variance-covariance matrix of the estimators

The Hausman Test Again We looked at this with respect to panel data, but it is a general test that allows us to compare an equation which has been estimated by two different techniques. Here we apply the technique to comparing OLS with 3SLS.
reg3 (c p p1 w) (i p p1 k1) (wp x x1 yr), ols
est store EQNols
reg3 (c p p1 w) (i p p1 k1) (wp x x1 yr), 3sls inst(t wg g yr p1 x1 k1)
est store EQN3sls
hausman EQNols EQN3sls

The Hausman Test Again Below we run the three regressions specifying ols and store the results as EQNols:
reg3 (c p p1 w) (i p p1 k1) (wp x x1 yr), ols
est store EQNols
Then we run the three regressions specifying 3sls and store the results as EQN3sls:
reg3 (c p p1 w) (i p p1 k1) (wp x x1 yr), 3sls inst(t wg g yr p1 x1 k1)
est store EQN3sls
Then we do the Hausman test:
hausman EQNols EQN3sls
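After hausman runs, the test statistic and its p-value are also stored in r(), so they can be printed directly; a sketch:

* retrieve the stored chi-squared statistic and p-value from the last hausman
display "Hausman chi2 = " r(chi2) "  p-value = " r(p)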

The Results The table prints out the two sets of coefficients and their difference. The Hausman test statistic is 0.06 and the significance level (p-value) is 0.9963. This is clearly very far from being significant at the 10% level.

The Hausman Test Again Hence it would appear that the coefficients from the two regressions are not significantly different. If OLS were giving biased estimates that 3SLS corrects, they would be different. Hence we would conclude that there is no endogeneity requiring techniques that allow for it. But because the error terms do appear correlated, SUR is probably the appropriate technique, as it produces more efficient estimates.

Tasks 1. Using the display command, e.g. display e(mss_2), print on the screen some of the retrievables from each regression (the above is the model sum of squares for the second equation). 2. Let's look at the display command. Type: display "The model sum of squares =" e(mss_2)

Tasks
display "The model sum of squares =" e(mss_2), "and the R2 =" e(r2_2)
display _column(20) "The model sum of squares =" e(mss_2), _column(50) "and the R2 =" e(r2_2)
display _column(20) "The model sum of squares =" e(mss_2), _column(60) "and the R2 =" e(r2_2)
display _column(20) "The model sum of squares =" e(mss_2), _column(60) "and the R2 =" _skip(5) e(r2_2)
display _column(20) "The model sum of squares =" e(mss_2), _column(60) "and the R2 =" _skip(10) e(r2_2)
Here _column(#) moves the output to column # and _skip(#) inserts # blank spaces, so each variant changes the alignment of the printed line.

Tasks Close the log: log close And have a look at it in Word.

webuse klein
In order to get the rest to work:
rename consump c
rename capital1 k1
rename invest i
rename profits p
rename govt g
rename wagegovt wg
rename taxnetx t
rename wagepriv wp
generate x=totinc
generate w = wg+wp
generate k = k1+i
generate yr=year-1931
generate p1 = p[_n-1]
generate x1 = x[_n-1]
reg3 (c p p1 w), 2sls inst(t wg g yr p1 x1 k1)
reg3 (c p p1 w) (i p p1 k1) (wp x x1 yr), 3sls inst(t wg g yr p1 x1 k1)