Econometrics I Professor William Greene Stern School of Business Department of Economics

Econometrics I Part 8 – Interval Estimation and Hypothesis Testing

Interval Estimation b = point estimator of β. We acknowledge the sampling variability: b = β + sampling variability induced by ε, and we estimate the sampling variance.

A point estimate is only the best single guess. Form an interval, or range, of plausible values. Plausible = likely values with an acceptable degree of probability. To assign probabilities, we require a distribution for the variation of the estimator. Hence the role of the normality assumption for ε.

Confidence Interval bk = the point estimate. Std.Err[bk] = sqrt{[σ2(X'X)-1]kk} = vk. Assume normality of ε for now: bk ~ N[βk, vk2] for the true βk, so (bk - βk)/vk ~ N[0,1]. Consider a range of plausible values of βk given the point estimate bk: bk ± sampling error, measured in standard error units, |(bk - βk)/vk| < z*. Larger z* implies greater probability ("confidence"). Given normality, e.g., z* = 1.96 gives 95%, z* = 1.645 gives 90%. The plausible range for βk is then bk ± z* vk.

Computing the Confidence Interval Assume normality of ε for now: bk ~ N[βk, vk2] for the true βk, so (bk - βk)/vk ~ N[0,1]. But vk2 = [σ2(X'X)-1]kk is not known because σ2 must be estimated. Using s2 instead of σ2, (bk - βk)/Est.(vk) ~ t[N-K]. (Proof: the ratio of a normal to the square root of a chi-squared divided by its degrees of freedom is pursued in your text.) Use critical values from the t distribution instead of the standard normal; they will be essentially the same as the normal values if N > 100.

Confidence Interval Critical t[.975, 29] = 2.045. Confidence interval based on t: 1.27365 ± 2.045 × .1501. Confidence interval based on the normal: 1.27365 ± 1.960 × .1501.
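As a quick check of the arithmetic, a minimal Python sketch (using scipy, with the values from this slide: b = 1.27365, standard error = .1501, 29 degrees of freedom):

from scipy import stats

b, se, df = 1.27365, 0.1501, 29            # point estimate, standard error, N - K (slide values)
t_crit = stats.t.ppf(0.975, df)            # about 2.045
z_crit = stats.norm.ppf(0.975)             # about 1.960
print(b - t_crit * se, b + t_crit * se)    # t-based 95% interval, roughly (0.967, 1.581)
print(b - z_crit * se, b + z_crit * se)    # normal-based 95% interval, roughly (0.979, 1.568)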

Bootstrap Confidence Interval For a Coefficient

Bootstrap CI for Least Squares

Bootstrap CI for Least Absolute Deviations
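The bootstrap slides are computed in the course software; a minimal Python sketch of a pairs (case) bootstrap percentile interval for a least squares coefficient, run here on simulated data rather than the course data set:

import numpy as np

def bootstrap_ci(X, y, k, reps=1000, level=0.95, seed=0):
    # Pairs bootstrap: resample (x, y) rows, re-estimate by least squares, take percentiles.
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = np.empty(reps)
    for r in range(reps):
        idx = rng.integers(0, n, size=n)
        b_r, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        draws[r] = b_r[k]
    tail = 100 * (1 - level) / 2
    return np.percentile(draws, [tail, 100 - tail])

rng = np.random.default_rng(1)
x = rng.normal(size=200)
X = np.column_stack([np.ones(200), x])
y = 0.5 + 1.0 * x + rng.normal(scale=0.8, size=200)   # true slope = 1
print(bootstrap_ci(X, y, k=1))                        # interval should cover 1

The same resampling idea, with the estimator replaced by least absolute deviations, underlies the LAD interval on the next slide.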

Econometrics I Part 8.1 – Hypothesis Testing

Testing a Hypothesis Using a Confidence Interval Given the range of plausible values, testing the hypothesis that a coefficient equals zero (or some other particular value) amounts to asking: Is the hypothesized value in the confidence interval, i.e., within the range of plausible values? If not, reject the hypothesis.

Test a Hypothesis About a Coefficient

Classical Hypothesis Testing We are interested in using the linear regression to support or cast doubt on the validity of a theory about the real world counterpart to our statistical model. The model is used to test hypotheses about the underlying data generating process.

Types of Tests Nested models: a restriction on the parameters of a particular model, e.g., y = β1 + β2x + β3z + ε with β3 = 0. Nonnested models: e.g., different RHS variables, yt = β1 + β2xt + β3xt-1 + εt vs. yt = γ1 + γ2xt + γ3yt-1 + wt. Specification tests: ε ~ N[0, σ2] vs. some other distribution.

Methodology Bayesian: Prior odds compare the strength of prior beliefs in two states of the world; posterior odds compare the revised beliefs. Symmetrical treatment of competing ideas, but not generally practical to carry out in meaningful situations. Classical: The "null" hypothesis is given prominence; we propose to "reject" it in favor of the default "alternative." Asymmetric treatment of null and alternative, with a huge gain in practical applicability.

Neyman – Pearson Methodology Formulate null and alternative hypotheses Define “Rejection” region = sample evidence that will lead to rejection of the null hypothesis. Gather evidence Assess whether evidence falls in rejection region or not.

Inference in the Linear Model Formulating hypotheses: linear restrictions as a general framework. Hypothesis testing: J linear restrictions. Analytical framework: y = Xβ + ε. Hypothesis: Rβ - q = 0. Substantive restrictions: What is a "testable hypothesis?" A substantive restriction on the parameters: it reduces the dimension of the parameter space, and imposing the restriction degrades the estimation criterion.

Testable Implications of a Theory Investors care about nominal interest rates and expected inflation: I = b1 + b2 r + b3 dp + e. Investors care about real interest rates: I = c1 + c2(r - dp) + c3 dp + e. No testable restrictions implied: this is just a reparameterization, with c1 = b1, c2 = b2, c3 = b2 + b3. Investors care only about real interest rates: I = f1 + f2(r - dp) + f3 dp + e, with f3 = 0.

The General Linear Hypothesis: H0: Rβ - q = 0 A unifying departure point: regardless of the hypothesis, least squares is unbiased, E[b] = β. The hypothesis makes a claim about the population: Rβ - q = 0. Then, if the hypothesis is true, E[Rb - q] = 0. The sample statistic Rb - q will not equal zero. Two possibilities: Rb - q is small enough to attribute to sampling variability, or Rb - q is too large (by some measure) to be plausibly attributed to sampling variability. Large Rb - q defines the rejection region.

Approaches to Defining the Rejection Region (1) Imposing the restrictions leads to a loss of fit: R2 must go down. Does it go down "a lot" (i.e., significantly)? With Ru2 = unrestricted model fit and Rr2 = restricted model fit, F = [(Ru2 - Rr2)/J] / [(1 - Ru2)/(N-K)] ~ F[J, N-K]. (2) Is Rb - q close to 0? Base the test on the discrepancy vector m = Rb - q. Using the Wald criterion, m'(Var[m])-1m has a chi-squared distribution with J degrees of freedom. But Var[m] = R[σ2(X'X)-1]R'. If we use our estimate of σ2, we get an F[J, N-K] instead. (Note: this is based on using e'e/(N-K) to estimate σ2.) These two approaches are the same for the linear model.
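Both forms are easy to compute once the restricted and unrestricted fits are available. A small Python sketch (illustrative helper functions, not part of the course software):

def f_from_r2(r2_u, r2_r, J, N, K):
    # Loss-of-fit form: F = [(Ru2 - Rr2)/J] / [(1 - Ru2)/(N - K)]
    return ((r2_u - r2_r) / J) / ((1.0 - r2_u) / (N - K))

def f_from_sse(sse_r, sse_u, J, N, K):
    # Equivalent form in sums of squared residuals (same LHS in both regressions)
    return ((sse_r - sse_u) / J) / (sse_u / (N - K))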

Testing Fundamentals - I SIZE of a test = Probability it will incorrectly reject a “true” null hypothesis. This is the probability of a Type I error. Under the null hypothesis, F(3,100) has an F distribution with (3,100) degrees of freedom. Even if the null is true, F will be larger than the 5% critical value of 2.7 about 5% of the time.

Testing Procedures How to determine if the statistic is 'large.' Need a 'null distribution.' If the hypothesis is true, then the statistic will have a certain distribution. This tells you how likely certain values are, and in particular, if the hypothesis is true, then 'large values' will be unlikely. If the observed statistic is too large, conclude that the assumed distribution must be incorrect and the hypothesis should be rejected. For the linear regression model with normally distributed disturbances, the distribution of the relevant statistic is F with J and N-K degrees of freedom.

Distribution Under the Null

A Simulation Experiment
sample    ; 1 - 100 $
matrix    ; fvalues = init(1000,1,0) $
proc $
create    ; fakelogc = rnn(-.319557, 1.54236) $      (true coefficients all = 0)
regress   ; quietly ; lhs = fakelogc
          ; rhs = one, logq, logpl_pf, logpk_pf $    (compute regression)
calc      ; fstat = (rsqrd/3)/((1-rsqrd)/(n-4)) $    (compute F)
matrix    ; fvalues(i) = fstat $                     (save 1000 Fs)
endproc
execute   ; i = 1,1000 $                             (1000 replications)
histogram ; rhs = fvalues ; title = F Statistic for H0:b2=b3=b4=0 $
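A Python re-creation of the same Monte Carlo experiment (stand-in regressors are generated here, since the slide uses variables from the course cost data set; under the null the dependent variable is pure noise, so the choice of regressors does not matter):

import numpy as np

rng = np.random.default_rng(42)
N, reps = 100, 1000
X = np.column_stack([np.ones(N), rng.normal(size=(N, 3))])   # stand-in for one, logq, logpl_pf, logpk_pf
K = X.shape[1]

fvals = np.empty(reps)
for i in range(reps):
    y = rng.normal(-0.319557, 1.54236, size=N)    # fake dependent variable: true slopes all zero
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    r2 = 1.0 - (e @ e) / ((y - y.mean()) @ (y - y.mean()))
    fvals[i] = (r2 / (K - 1)) / ((1.0 - r2) / (N - K))        # F for H0: b2 = b3 = b4 = 0

print((fvals > 2.70).mean())    # share above the 5% critical value; should be near 0.05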

Simulation Results 48 outcomes to the right of 2.7 in this run of the experiment.

Testing Fundamentals - II POWER of a test = the probability that it will correctly reject a "false null" hypothesis. This is 1 minus the probability of a Type II error. The power of a test depends on the specific alternative.

Power of a Test Null: mean = 0. Reject if the observed mean is < -1.96 or > +1.96. Prob(Reject null | mean = 0) = 0.05. Prob(Reject null | mean = .5) = 0.07902. Prob(Reject null | mean = 1) = 0.170066. The power increases as the (alternative) mean rises.
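These rejection probabilities can be reproduced directly, assuming the test statistic is distributed N(mean, 1):

from scipy import stats

def power(mean, crit=1.96):
    # P(statistic < -crit) + P(statistic > +crit) when the statistic is N(mean, 1)
    return stats.norm.cdf(-crit - mean) + 1.0 - stats.norm.cdf(crit - mean)

for m in (0.0, 0.5, 1.0):
    print(m, power(m))    # approximately 0.05, 0.079, 0.170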

Test Statistic For the fit measures, use a normalized measure of the loss of fit: F = [(Ru2 - Rr2)/J] / [(1 - Ru2)/(N-K)].

Test Statistics Forming test statistics: for distance measures, use a Wald-type distance measure, W = m'[Est.Var(m)]-1m. An important relationship between t and F: for a single restriction, m = r'b - q, the variance is r'(Var[b])r, and the distance measure is m divided by the standard error of m.

An important relationship between t and F For a single restriction, F[1,N-K] is the square of the t ratio.

Gasoline Market
Regression Analysis: logG versus logIncome, logPG
The regression equation is
logG = - 0.468 + 0.966 logIncome - 0.169 logPG

Predictor      Coef       SE Coef      T       P
Constant     -0.46772     0.08649    -5.41   0.000
logIncome     0.96595     0.07529    12.83   0.000
logPG        -0.16949     0.03865    -4.38   0.000

S = 0.0614287   R-Sq = 93.6%   R-Sq(adj) = 93.4%

Analysis of Variance
Source            DF      SS        MS        F        P
Regression         2    2.7237    1.3618   360.90    0.000
Residual Error    49    0.1849    0.0038
Total             51    2.9086

Application Time series regression: logG = β1 + β2 logY + β3 logPG + β4 logPNC + β5 logPUC + β6 logPPT + β7 logPN + β8 logPD + β9 logPS + ε. Period = 1960 - 1995. Note that all coefficients in the model are elasticities.

Full Model
----------------------------------------------------------------------
Ordinary least squares regression ............
LHS=LG     Mean                 =      5.39299
           Standard deviation   =       .24878
           Number of observs.   =           36
Model size Parameters           =            9
           Degrees of freedom   =           27
Residuals  Sum of squares       =       .00855  <*******
           Standard error of e  =       .01780  <*******
Fit        R-squared            =       .99605  <*******
           Adjusted R-squared   =       .99488  <*******
--------+-------------------------------------------------------------
Variable| Coefficient    Standard Error  t-ratio  P[|T|>t]   Mean of X
--------+-------------------------------------------------------------
Constant|   -6.95326***      1.29811     -5.356    .0000
      LY|    1.35721***       .14562      9.320    .0000      9.11093
     LPG|    -.50579***       .06200     -8.158    .0000       .67409
    LPNC|    -.01654          .19957      -.083    .9346       .44320
    LPUC|    -.12354*         .06568     -1.881    .0708       .66361
    LPPT|     .11571          .07859      1.472    .1525       .77208
     LPN|    1.10125***       .26840      4.103    .0003       .60539
     LPD|     .92018***       .27018      3.406    .0021       .43343
     LPS|   -1.09213***       .30812     -3.544    .0015       .68105

Test About One Parameter Is the price of public transportation really relevant? H0: β6 = 0. Confidence interval: b6 ± t(.95,27) × standard error = .11571 ± 2.052(.07859) = .11571 ± .16127 = (-.045557, .27698). The interval contains 0.0, so do not reject the hypothesis. Does the regression fit drop if LPPT is excluded? Without LPPT, R-squared = .99573; with it, R-squared was .99605. F(1,27) = [(.99605 - .99573)/1]/[(1 - .99605)/(36 - 9)] = 2.187 ≈ 1.472², the square of the t ratio (with some rounding difference).
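The same numbers in a short Python check (values taken from the regression output above):

from scipy import stats

b6, se6, df = 0.11571, 0.07859, 27
t_crit = stats.t.ppf(0.975, df)                      # about 2.052
print(b6 - t_crit * se6, b6 + t_crit * se6)          # interval contains 0
F = ((0.99605 - 0.99573) / 1) / ((1 - 0.99605) / (36 - 9))
print(F, (b6 / se6) ** 2)                            # about 2.19 vs. 2.17 (squared t ratio)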

Hypothesis Test: Sum of Coefficients = 1? (The unrestricted regression results are repeated from the Full Model slide above.)

Imposing the Restriction
----------------------------------------------------------------------
Linearly restricted regression
LHS=LG     Mean                 =     5.392989
           Standard deviation   =     .2487794
           Number of observs.   =           36
Model size Parameters           =            8   <*** 9 - 1 restriction
           Degrees of freedom   =           28
Residuals  Sum of squares       =     .0112599   <*** With the restriction
Residuals  Sum of squares       =     .0085531   <*** Without the restriction
Fit        R-squared            =     .9948020
Restrictns. F[  1,   27] (prob) =   8.5 (.01)
Not using OLS or no constant. R2 & F may be < 0
--------+-------------------------------------------------------------
Variable| Coefficient    Standard Error  t-ratio  P[|T|>t]   Mean of X
--------+-------------------------------------------------------------
Constant|  -10.1507***       .78756     -12.889    .0000
      LY|    1.71582***       .08839     19.412    .0000      9.11093
     LPG|    -.45826***       .06741     -6.798    .0000       .67409
    LPNC|     .46945***       .12439      3.774    .0008       .44320
    LPUC|    -.01566          .06122      -.256    .8000       .66361
    LPPT|     .24223***       .07391      3.277    .0029       .77208
     LPN|    1.39620***       .28022      4.983    .0000       .60539
     LPD|     .23885          .15395      1.551    .1324       .43343
     LPS|   -1.63505***       .27700     -5.903    .0000       .68105

F = [(.0112599 - .0085531)/1] / [.0085531/(36 - 9)] = 8.544691

Joint Hypotheses Joint hypothesis: income elasticity = +1, own price elasticity = -1. The hypothesis implies that logG = β1 + logY - logPg + β4 logPNC + ... Strategy: regress logG - logY + logPg on the other variables and compare the sums of squares.
With the two restrictions imposed: Residuals Sum of squares = .0286877, Fit R-squared = .9979006.
Unrestricted: Residuals Sum of squares = .0085531, Fit R-squared = .9960515.
F = ((.0286877 - .0085531)/2) / (.0085531/(36 - 9)) = 31.779951.
The critical F for 95% with (2,27) degrees of freedom is 3.354. The hypothesis is rejected.
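The F statistic and its critical value, recomputed in Python from the sums of squares above:

from scipy import stats

sse_u, sse_r, J, N, K = 0.0085531, 0.0286877, 2, 36, 9
F = ((sse_r - sse_u) / J) / (sse_u / (N - K))
print(F, stats.f.ppf(0.95, J, N - K))    # about 31.78 vs. critical value 3.354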

Basing the Test on R2 After building the restrictions into the model and computing restricted and unrestricted regressions, basing the test on the R2s gives F = ((.9960515 - .997096)/2)/((1-.9960515)/(36-9)) = -3.571166 (!) What's wrong? The unrestricted model used LHS = logG; the restricted one used a different dependent variable (logG - logY + logPg), so the two R2s are not comparable. The calculation is safe only when it uses the sums of squared residuals.

Wald Distance Measure Testing more generally about a single parameter. The sample estimate is bk; the hypothesized value is βk. How far is βk from bk? If too far, the hypothesis is inconsistent with the sample evidence. Measure the distance in standard error units: t = (bk - βk)/Est.(vk). If t is "large" (larger than the critical value), reject the hypothesis.

The Wald Statistic

Robust Tests The Wald test, when properly constructed, will generally be more robust to failures of the narrow model assumptions than the t or F tests. Reason: it can be based on "robust" variance estimators and asymptotic results that hold in a wide range of circumstances. Analysis: later in the course, after developing the asymptotics.

Particular Cases Some particular cases: One coefficient equals a particular value: F = [(b - value)/standard error of b]2, the square of the familiar t ratio. The relationship is F[1, d.f.] = t2[d.f.]. A linear function of the coefficients equals a particular value: F = (linear function of coefficients - value)2 / Variance of the linear function. Note the square of the distance in the numerator. If the linear function is Σk wk bk, its variance is Σk Σl wk wl Cov[bk, bl]. This is the Wald statistic, and also the square of the somewhat familiar t statistic. Several linear functions: use the Wald or F statistic; loss-of-fit measures may be easier to compute.

Hypothesis Test: Sum of Coefficients Do the three aggregate price elasticities sum to zero? H0: β7 + β8 + β9 = 0. R = [0, 0, 0, 0, 0, 0, 1, 1, 1], q = [0].
Variable| Coefficient    Standard Error  t-ratio  P[|T|>t]   Mean of X
--------+-------------------------------------------------------------
     LPN|    1.10125***       .26840      4.103    .0003       .60539
     LPD|     .92018***       .27018      3.406    .0021       .43343
     LPS|   -1.09213***       .30812     -3.544    .0015       .68105

Wald Test

Using the Wald Statistic
--> Matrix ; R = [0,1,0,0,0,0,0,0,0 / 0,0,1,0,0,0,0,0,0]$
--> Matrix ; q = [1/-1]$
--> Matrix ; list ; m = R*b - q $
Matrix M has 2 rows and 1 columns.
               1
  +-------------+
 1|      .35721
 2|      .49421
--> Matrix ; list ; vm = R*varb*R' $
Matrix VM has 2 rows and 2 columns.
               1             2
  +-------------+-------------+
 1|      .02120        .00291
 2|      .00291        .00384
--> Matrix ; list ; w = m'<vm>m $
Matrix W has 1 rows and 1 columns.
               1
  +-------------+
 1|    63.55962
Joint hypothesis: b(LY) = 1, b(LPG) = -1
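The same Wald computation in numpy, using the m and Var[m] printed above (the small difference from 63.55962 comes from rounding of the displayed values):

import numpy as np

m = np.array([0.35721, 0.49421])               # Rb - q
Vm = np.array([[0.02120, 0.00291],
               [0.00291, 0.00384]])            # R * Var[b] * R'
W = m @ np.linalg.solve(Vm, m)                 # m' (Var[m])^{-1} m
print(W)                                       # about 63.6, chi-squared[2] under H0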

Application: Cost Function

Regression Results
----------------------------------------------------------------------
Ordinary least squares regression ............
LHS=C      Mean                 =      3.07162
           Standard deviation   =      1.54273
           Number of observs.   =          158
Model size Parameters           =            9
           Degrees of freedom   =          149
Residuals  Sum of squares       =      2.56313
           Standard error of e  =       .13116
Fit        R-squared            =       .99314
           Adjusted R-squared   =       .99277
--------+-------------------------------------------------------------
Variable| Coefficient    Standard Error  t-ratio  P[|T|>t]   Mean of X
--------+-------------------------------------------------------------
Constant|    5.22853         3.18024      1.644    .1023
       K|    -.21055          .21757      -.968    .3348      4.25096
       L|    -.85353***       .31485     -2.711    .0075      8.97280
       F|     .18862          .20225       .933    .3525      3.39118
       Q|    -.96450***       .36798     -2.621    .0097      8.26549
      Q2|     .05250***       .00459     11.430    .0000      35.7913
      QK|     .04273          .02758      1.550    .1233      35.1677
      QL|     .11698***       .03728      3.138    .0021      74.2063
      QF|     .05950**        .02478      2.401    .0176      28.0108
Note: ***, **, * = Significance at 1%, 5%, 10% level.

Price Homogeneity: Only Price Ratios Matter  β2 + β3 + β4 = 1
----------------------------------------------------------------------
Linearly restricted regression ....................
LHS=C      Mean                 =      3.07162
           Standard deviation   =      1.54273
           Number of observs.   =          158
Model size Parameters           =            7
           Degrees of freedom   =          151
Residuals  Sum of squares       =      2.85625
           Standard error of e  =       .13753
Fit        R-squared            =       .99236
Restrictns. F[  2,  149] (prob) =   8.5 (.0003)
Not using OLS or no constant. Rsqrd & F may be < 0
--------+-------------------------------------------------------------
Variable| Coefficient    Standard Error  t-ratio  P[|T|>t]   Mean of X
--------+-------------------------------------------------------------
Constant|   -7.24078***      1.01411     -7.140    .0000
       K|     .38943**        .16933      2.300    .0228      4.25096
       L|     .19130          .19530       .979    .3289      8.97280
       F|     .41927**        .20077      2.088    .0385      3.39118
       Q|     .45889***       .12402      3.700    .0003      8.26549
      Q2|     .06008***       .00441     13.612    .0000      35.7913
      QK|    -.02954          .02125     -1.390    .1665      35.1677
      QL|    -.00462          .02364      -.195    .8454      74.2063
      QF|     .03416          .02489      1.373    .1720      28.0108

Imposing the Restrictions Alternatively, compute the restricted regression by converting to price ratios and imposing the restrictions directly. This is a regression of log(c/pf) on log(pk/pf), log(pl/pf), etc.
----------------------------------------------------------------------
Ordinary least squares regression ............
LHS=LOGC_PF   Mean               =      -.31956
              Standard deviation =      1.54236
              Number of observs. =          158
Model size    Parameters         =            7
              Degrees of freedom =          151
Residuals     Sum of squares     =      2.85625  (restricted)
Residuals     Sum of squares     =      2.56313  (unrestricted)
--------+-------------------------------------------------------------
Variable| Coefficient    Standard Error  t-ratio  P[|T|>t]   Mean of X
--------+-------------------------------------------------------------
Constant|   -7.24078***      1.01411     -7.140    .0000
    LOGQ|     .45889***       .12402      3.700    .0003      8.26549
  LOGQSQ|     .06008***       .00441     13.612    .0000      35.7913
LOGPK_PF|     .38943**        .16933      2.300    .0228       .85979
LOGPL_PF|     .19130          .19530       .979    .3289      5.58162
LQ_LPKPF|    -.02954          .02125     -1.390    .1665      7.15696
LQ_LPLPF|    -.00462          .02364      -.195    .8454      46.1956

F Statistic = ((2.85625 - 2.56313)/2) / (2.56313/(158 - 9)) = 8.52
(This is a problem. The underlying theory requires linear homogeneity in prices.)

Wald Test of the Restrictions Chi squared = J*F
wald ; fn1 = b_k + b_l + b_f - 1
     ; fn2 = b_qk + b_ql + b_qf - 0 $
-----------------------------------------------------------
WALD procedure. Estimates and standard errors for nonlinear
functions and joint test of nonlinear restrictions.
Wald Statistic             =     17.03978
Prob. from Chi-squared[ 2] =       .00020
--------+--------------------------------------------------
Variable| Coefficient    Standard Error   b/St.Er.  P[|Z|>z]
--------+--------------------------------------------------
 Fncn(1)|   -1.87546***      .45621       -4.111     .0000
 Fncn(2)|     .21921***      .05353        4.095     .0000
Note: ***, **, * = Significance at 1%, 5%, 10% level.
Recall: F Statistic = ((2.85625 - 2.56313)/2) / (2.56313/(158 - 9)) = 8.52
The chi squared statistic is J×F. Here, 2F = 17.04 = the chi squared.

Test of Homotheticity: Cross Product Terms = 0 Homothetic production: β7 = β8 = β9 = 0. With linear homogeneity in prices already imposed, this is β6 = β7 = 0 in the restricted model.
-----------------------------------------------------------
WALD procedure. Estimates and standard errors for nonlinear
functions and joint test of nonlinear restrictions.
Wald Statistic             =      2.57180
Prob. from Chi-squared[ 2] =       .27640
--------+--------------------------------------------------
Variable| Coefficient    Standard Error   b/St.Er.  P[|Z|>z]
--------+--------------------------------------------------
 Fncn(1)|    -.02954          .02125      -1.390     .1644
 Fncn(2)|    -.00462          .02364       -.195     .8451
The F test would produce F = ((2.90490 - 2.85625)/2)/(2.85625/(158 - 7)) = 1.285.

Econometrics I Part 8.2 – Restrictions

Linear Least Squares Subject to Restrictions Restrictions: theory imposes certain restrictions on the parameters. Some common applications: Dropping variables from the equation, i.e., certain coefficients in b forced to equal 0 (probably the most common testing situation: "Is a certain variable significant?"). Adding-up conditions: sums of certain coefficients must equal fixed values, e.g., adding-up conditions in demand systems, or constant returns to scale in production functions. Equality restrictions: certain coefficients must equal other coefficients, e.g., using real vs. nominal variables in equations. General formulation for linear restrictions: minimize the sum of squares, e'e, subject to the linear constraint Rb = q.

Restricted Least Squares

Restricted Least Squares Solution General approach: programming problem. Minimize over β: L = (y - Xβ)'(y - Xβ) subject to Rβ = q. Each row of R contains the K coefficients in one restriction; there are J restrictions, hence J rows.
β3 = 0:             R = [0, 0, 1, 0, …],   q = (0)
β2 = β3:            R = [0, 1, -1, 0, …],  q = (0)
β2 = 0, β3 = 0:     R = [0, 1, 0, 0, … / 0, 0, 1, 0, …],   q = (0, 0)'

Solution Strategy Quadratic program: minimize a quadratic criterion subject to linear restrictions, all of which are binding. Solve using the Lagrangean formulation: minimize over (β, λ), L* = (y - Xβ)'(y - Xβ) + 2λ'(Rβ - q). (The 2 is for convenience - see below.)

Restricted LS Solution

Restricted Least Squares

Aspects of Restricted LS 1. b* = b - Cm, where m = the "discrepancy vector" Rb - q. Note what happens if m = 0. What does m = 0 mean? 2. λ = [R(X'X)-1R']-1(Rb - q) = [R(X'X)-1R']-1m. When does λ = 0? What does this mean? 3. Combining results: b* = b - (X'X)-1R'λ. How could b* = b?
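A compact numpy sketch of these formulas (an illustration, not the course software):

import numpy as np

def restricted_ls(X, y, R, q):
    # Minimize (y - Xb)'(y - Xb) subject to Rb = q.
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y                          # unrestricted least squares
    m = R @ b - q                                  # discrepancy vector
    lam = np.linalg.solve(R @ XtX_inv @ R.T, m)    # Lagrange multipliers
    b_star = b - XtX_inv @ R.T @ lam               # restricted estimator b* = b - Cm
    return b_star, lam, m

If m = 0, the unrestricted estimates already satisfy the restrictions, λ = 0, and b* = b.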

Example: Panel Data on Spanish Dairy Farms N = 247 farms, T = 6 years (1993-1998)

Variable        Units                                               Mean      Std. Dev.   Minimum    Maximum
Output: Milk    Milk production (liters)                           131,107     92,584     14,410     727,281
Input:  Cows    # of milking cows                                    22.12      11.27        4.5        82.3
Input:  Labor   # man-equivalent units                                1.67       0.55        1.0         4.0
Input:  Land    Hectares of land devoted to pasture and crops        12.99       6.17        2.0        45.1
Input:  Feed    Total amount of feedstuffs fed to dairy cows (Kg)   57,941     47,981    3,924.14    376,732

Application y = log output. x = Cobb-Douglas production terms: x = (1, x1, x2, x3, x4) = constant and logs of the 4 inputs (5 terms). z = translog terms: x1², x2², etc., and all cross products x1x2, x1x3, x1x4, x2x3, etc. (10 terms). w = (x, z) (all 15 terms). The null hypothesis is Cobb-Douglas; the alternative is translog = Cobb-Douglas plus the second order terms.

Translog Regression Model H0: the coefficients on z all equal 0 (Cobb-Douglas).

F test based on the change in fit

Chi squared test based on the Wald distance (from 0) Close to 0?

Likelihood Ratio Test The normality assumption: does it work "approximately?" For any regression model yi = h(xi, β) + εi where εi ~ N[0, σ2] (linear or nonlinear), at the linear (or nonlinear) least squares estimator, however computed, with or without restrictions, the maximized log likelihood is logL = -(N/2)[1 + ln 2π + ln(e'e/N)]. This forms the basis for likelihood ratio tests.
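With this form of the log likelihood, the LR statistic for restrictions that raise the sum of squared residuals from e'e (unrestricted) to e*'e* (restricted) reduces to N ln(e*'e*/e'e). A small Python sketch, illustrated with the gasoline-market sums of squares from the earlier joint test:

import numpy as np

def loglike_ls(sse, N):
    # Maximized log likelihood of a regression with normal disturbances
    return -0.5 * N * (1.0 + np.log(2 * np.pi) + np.log(sse / N))

def lr_stat(sse_r, sse_u, N):
    # LR = 2(logL_unrestricted - logL_restricted) = N ln(sse_r / sse_u)
    return 2.0 * (loglike_ls(sse_u, N) - loglike_ls(sse_r, N))

print(lr_stat(0.0286877, 0.0085531, 36))   # about 43.6; compare with chi-squared[2]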

The LM Test Theory: 1. b* = b - Cm, where m = the "discrepancy vector" Rb - q. 2. λ = [R(X'X)-1R']-1(Rb - q) = [R(X'X)-1R']-1m. When does λ = 0? What does this mean? 3. Test λ = 0 using a Wald test. Practice: 1. Compute the residuals from the restricted regression, e0. 2. Regress e0 on the unrestricted specification. 3. LM = NR2 = N(1 - Residual SS / e0'e0).

LM test: Chi squared = N × (uncentered R2 of e0, the short-regression residuals, on [X, Z]) = N × [regression (explained) SS / e0'e0] = 1482 × [0.812115/29.0957] = 41.36.
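A minimal Python version of this calculation (a sketch; e0 is the vector of residuals from the short, Cobb-Douglas regression and W is the full regressor matrix [X, Z]):

import numpy as np

def lm_stat(e0, W):
    # LM = N * uncentered R-squared from regressing e0 on the full regressor set
    c, *_ = np.linalg.lstsq(W, e0, rcond=None)
    u = e0 - W @ c
    r2_uncentered = 1.0 - (u @ u) / (e0 @ e0)
    return len(e0) * r2_uncentered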