1
Econometrics
Chengyaun Yin, School of Mathematics, SHUFE
2
6. Finite Sample Properties of the Least Squares Estimator
Applied Econometrics
3
Terms of Art
Estimates and estimators
Properties of an estimator: the sampling distribution
"Finite sample" properties as opposed to "asymptotic" or "large sample" properties
4
The Statistical Context of Least Squares Estimation
The sample of data from the population
The stochastic specification of the regression model
Endowment of the stochastic properties of the model upon the least squares estimator
5
Least Squares
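The formulas for this slide did not survive extraction; for reference, the standard statement of the least squares problem and its solution:

```latex
% Least squares chooses b to minimize the sum of squared residuals:
\min_{b}\;(y - Xb)'(y - Xb)
% First-order (normal) equations and their solution:
X'(y - Xb) = 0 \quad\Longrightarrow\quad b = (X'X)^{-1}X'y
```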
6
Deriving the Properties
So, b = a parameter vector + a linear combination of the disturbances, each times a vector. Therefore, b is a vector of random variables. We analyze it as such.
The assumption of nonstochastic regressors: how it is used at this point.
We do the analysis conditional on an X, then show that the results do not depend on the particular X in hand, so the result must be general, i.e., independent of X.
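Written out, the substitution the slide describes:

```latex
b = (X'X)^{-1}X'y
  = (X'X)^{-1}X'(X\beta + \varepsilon)
  = \beta + (X'X)^{-1}X'\varepsilon
```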
7
Properties of the LS Estimator
Expected value and the property of unbiasedness: E[b|X] = β = E[b]. Prove this result.
A Crucial Result About Specification: y = X₁β₁ + X₂β₂ + ε. Two sets of variables. What if the regression is computed without the second set of variables? What is the expectation of the "short" regression estimator b₁ = (X₁′X₁)⁻¹X₁′y?
8
The Left Out Variable Formula
(This is a VVIR, a very, very important result!) E[b₁] = β₁ + (X₁′X₁)⁻¹X₁′X₂β₂. The (truly) short regression estimator is biased.
Application: Quantity = β₁Price + β₂Income + ε. If you regress Quantity on Price and leave out Income, what do you get? (Application below)
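The formula follows by substituting the true model into the short regression estimator and taking expectations conditional on X:

```latex
b_1 = (X_1'X_1)^{-1}X_1'y
    = (X_1'X_1)^{-1}X_1'(X_1\beta_1 + X_2\beta_2 + \varepsilon)
    = \beta_1 + (X_1'X_1)^{-1}X_1'X_2\beta_2 + (X_1'X_1)^{-1}X_1'\varepsilon
% Taking expectations, using E[\varepsilon | X] = 0:
E[b_1 \mid X] = \beta_1 + (X_1'X_1)^{-1}X_1'X_2\beta_2
```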
9
The Extra Variable Formula
A Second Crucial Result About Specification: y = X₁β₁ + X₂β₂ + ε, but β₂ really is 0. Two sets of variables; one is superfluous. What if the regression is computed with it anyway?
The Extra Variable Formula: (This is a VIR!) E[b₁.₂ | β₂ = 0] = β₁. The long regression estimator in a short regression model is unbiased.
Extra variables in a model do not induce biases. Why not just include them, then? We'll pursue this later.
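Why this holds (standard argument): the long regression is unbiased for the whole coefficient vector, so its first subvector is unbiased for β₁ whatever β₂ is, in particular when β₂ = 0:

```latex
E\left[\begin{pmatrix} b_{1.2} \\ b_{2.1} \end{pmatrix} \middle|\, X\right]
  = \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix}
\quad\Longrightarrow\quad
E[b_{1.2} \mid X] = \beta_1 \text{ even when } \beta_2 = 0 .
```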
10
Application: Left out Variable
Leave out Income. What do you get? E[b₁] = β₁ + (X₁′X₁)⁻¹X₁′X₂β₂.
In time series data, β₁ < 0, β₂ > 0 (usually), and Cov[Price, Income] > 0. So, the short regression will overestimate the price coefficient, as the simulation below illustrates.
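A minimal numpy sketch of this bias, under hypothetical numbers (price and income positively correlated, true price effect negative, income effect positive):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Income and price positively correlated, as in typical time series data
income = rng.normal(100.0, 10.0, n)
price = 0.1 * income + rng.normal(0.0, 0.5, n)   # Cov[Price, Income] > 0

# True model: beta1 < 0 (price), beta2 > 0 (income), plus a constant
beta0, beta1, beta2 = 10.0, -2.0, 0.5
quantity = beta0 + beta1 * price + beta2 * income + rng.normal(0.0, 1.0, n)

ones = np.ones(n)

# Long regression: Quantity on a constant, Price, and Income
X_long = np.column_stack([ones, price, income])
b_long = np.linalg.lstsq(X_long, quantity, rcond=None)[0]

# Short regression: Quantity on a constant and Price (Income omitted)
X_short = np.column_stack([ones, price])
b_short = np.linalg.lstsq(X_short, quantity, rcond=None)[0]

print("long  regression price coefficient:", b_long[1])   # near -2.0
print("short regression price coefficient:", b_short[1])  # biased upward
```

With these numbers the bias term β₂·Cov[Price, Income]/Var[Price] is large enough to flip the estimated price coefficient's sign from negative to positive.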
11
Estimated ‘Demand’ Equation
Simple regression of G on a constant and PG. The price coefficient should be negative? As the previous slide argues, the omitted income variable biases it upward, which can produce a wrong (positive) sign.
12
Multiple Regression
13
Variance of the Least Squares Estimator
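The formula for this slide did not survive extraction; the standard result, assuming Var[ε|X] = σ²I:

```latex
% Since b - \beta = (X'X)^{-1}X'\varepsilon,
\mathrm{Var}[b \mid X] = (X'X)^{-1}X'\,\mathrm{Var}[\varepsilon \mid X]\,X(X'X)^{-1}
                       = \sigma^2 (X'X)^{-1}
```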
14
Gauss-Markov Theorem
A theorem of Gauss and Markov: Least Squares is the MVLUE (minimum variance linear unbiased estimator)
1. Linear estimator
2. Unbiased: E[b|X] = β
3. Minimum variance, comparing positive definite matrices: Var[c|X] - Var[b|X] is nonnegative definite for any other linear and unbiased estimator c.
What are the implications?
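A sketch of the variance comparison (the standard argument, for reference):

```latex
% Let c = Ay be linear and unbiased: E[c|X] = AX\beta = \beta for all \beta, so AX = I.
% Write A = (X'X)^{-1}X' + D; then AX = I forces DX = 0, and
\mathrm{Var}[c \mid X] = \sigma^2 A A' = \sigma^2\left[(X'X)^{-1} + DD'\right]
% (the cross terms vanish because DX = 0), hence
\mathrm{Var}[c \mid X] - \mathrm{Var}[b \mid X] = \sigma^2 DD' \succeq 0 .
```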
15
Aspects of the Gauss-Markov Theorem
Indirect proof: any other linear unbiased estimator has a larger covariance matrix.
Direct proof: find the minimum variance linear unbiased estimator.
Other estimators:
Biased estimation: a minimum mean squared error estimator. Is there a biased estimator with a smaller 'dispersion'?
Normally distributed disturbances: the Rao-Blackwell result. (General observation: for normally distributed disturbances, 'linear' is superfluous.)
Nonnormal disturbances: Least Absolute Deviations and other nonparametric approaches.
16
Fixed X or Conditioned on X?
The role of the assumption of nonstochastic regressors
Finite sample results: conditional vs. unconditional results
The importance of this assumption in the asymptotic results
17
Specification Errors-1
Omitting relevant variables: Suppose the correct model is y = X₁β₁ + X₂β₂ + ε, i.e., two sets of variables. Compute least squares omitting X₂.
Some easily proved results: Var[b₁] is smaller than Var[b₁.₂]. (The latter is the northwest submatrix of the full covariance matrix. The proof uses the residual maker, again; see the sketch after this slide.) I.e., you get a smaller variance when you omit X₂.
(One interpretation: omitting X₂ amounts to using extra information, β₂ = 0. Even if the information is wrong (see the next result), it reduces the variance. This is an important result.)
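The variance comparison via the residual maker M₂:

```latex
% With M_2 = I - X_2(X_2'X_2)^{-1}X_2', the long-regression subvector satisfies
\mathrm{Var}[b_{1.2} \mid X] = \sigma^2 (X_1'M_2X_1)^{-1},
\qquad
\mathrm{Var}[b_1 \mid X] = \sigma^2 (X_1'X_1)^{-1}.
% Since X_1'X_1 - X_1'M_2X_1 = X_1'(I - M_2)X_1 is nonnegative definite,
% (X_1'M_2X_1)^{-1} - (X_1'X_1)^{-1} is nonnegative definite as well.
```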
18
Omitted Variables (No free lunch)
E[b₁] = β₁ + (X₁′X₁)⁻¹X₁′X₂β₂ ≠ β₁. So, b₁ is biased(!!!). The bias can be huge; it can reverse the sign of a price coefficient in a "demand equation."
b₁ may be more "precise."
Precision = mean squared error = variance + squared bias. Smaller variance but positive bias. If the bias is small, we may still favor the short regression.
(Free lunch?) Suppose X₁′X₂ = 0. Then the bias goes away. Interpretation: the information (β₂ = 0) is not "right," it is irrelevant. b₁ is the same as b₁.₂.
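The precision comparison in matrix form:

```latex
\mathrm{MSE}[b_1] = \mathrm{Var}[b_1 \mid X]
  + \left(\mathrm{Bias}[b_1]\right)\left(\mathrm{Bias}[b_1]\right)',
\qquad
\mathrm{Bias}[b_1] = (X_1'X_1)^{-1}X_1'X_2\beta_2 .
% The short regression wins when the squared bias is small relative to
% the variance saved by omitting X_2.
```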
19
Specification Errors-2
Including superfluous variables: just reverse the results. Including superfluous variables increases variance (the cost of not using information). It does not cause a bias, because if the variables in X₂ are truly superfluous, then β₂ = 0, so E[b₁.₂] = β₁. The simulation below illustrates the variance cost.
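A companion numpy sketch of the variance cost (hypothetical setup: x₂ is truly irrelevant, β₂ = 0, but correlated with x₁):

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 100, 2000

b_short, b_long = [], []
for _ in range(reps):
    x1 = rng.normal(0.0, 1.0, n)
    x2 = 0.8 * x1 + rng.normal(0.0, 0.6, n)   # correlated with x1, but irrelevant
    y = 1.0 * x1 + rng.normal(0.0, 1.0, n)    # true beta1 = 1, beta2 = 0

    # Short regression (correctly omits x2) vs. long regression (includes it)
    b_short.append(np.linalg.lstsq(x1.reshape(-1, 1), y, rcond=None)[0][0])
    X = np.column_stack([x1, x2])
    b_long.append(np.linalg.lstsq(X, y, rcond=None)[0][0])

# Both are unbiased for beta1 = 1, but the long regression is noisier
print("short: mean %.3f, sd %.4f" % (np.mean(b_short), np.std(b_short)))
print("long:  mean %.3f, sd %.4f" % (np.mean(b_long), np.std(b_long)))
```

Both estimators center on β₁ = 1; with this design the long regression's sampling standard deviation is larger by roughly the factor 1/√(1 − ρ²), where ρ is the correlation between x₁ and x₂.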
20
Other Models
Looking ahead to nonlinear models: neither of the preceding results extends beyond the linear regression model.
In a nonlinear model, lack of multicollinearity among the variables is no guarantee that a similar phenomenon will not reappear for other functions of the x's.
Omitting relevant variables from a model is always costly (no exceptions), and this negative result generally extends to nonlinear models. The benign result for extra variables, however, almost never carries over to more involved nonlinear models.