Presentation transcript:

MEASUREMENT ERROR 1 In this sequence we will investigate the consequences of measurement errors in the variables in a regression model. To keep the analysis simple, we will confine it to the simple regression model.

2 We will start with measurement errors in the explanatory variable. Suppose that Y is determined by a variable Z, but Z is subject to measurement error, w. We will denote the measured explanatory variable X.

3 Substituting for Z from the second equation, we can rewrite the model as shown.

4 We are thus able to express Y as a linear function of the observable variable X, with the disturbance term being a compound of the disturbance term in the original model and the measurement error.
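The slide equations are not reproduced in this transcript. A reconstruction consistent with the text, assuming the original disturbance term is called v (the name used on slide 20), the compound disturbance term is called u (slide 6), and the parameters are $\beta_1$ and $\beta_2$, would be:

```latex
\[
\begin{aligned}
Y &= \beta_1 + \beta_2 Z + v, \qquad X = Z + w \\
Y &= \beta_1 + \beta_2 (X - w) + v \;=\; \beta_1 + \beta_2 X + u,
\qquad u = v - \beta_2 w
\end{aligned}
\]
```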

5 However, if we fit this model using OLS, Assumption B.7 will be violated. X has a random component, the measurement error w.

6 And w is also one of the components of the compound disturbance term. Hence u is not distributed independently of X.

7 We will demonstrate that the OLS estimator of the slope coefficient is inconsistent and that in large samples it is biased downwards if $\beta_2$ is positive, and upwards if $\beta_2$ is negative.

8 We begin by writing down the OLS estimator and substituting for Y from the true model. In this case there are alternative versions of the true model. The analysis is simpler if you use the equation relating Y to X.

9 Simplifying, we decompose the slope coefficient into the true value and an error term as usual.
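Written out, the decomposition described in slides 8 and 9 would take the following standard form. This is a sketch that assumes the fitted model is $Y = \beta_1 + \beta_2 X + u$, with u denoting the compound disturbance term:

```latex
\[
b_2 \;=\; \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}
\;=\; \beta_2 + \frac{\sum_{i=1}^{n} (X_i - \bar{X})(u_i - \bar{u})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}
\]
```

The second term is the error term whose expectation and probability limit are at issue.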

10 We have reached this point many times before. We would like to investigate whether $b_2$ is biased. This means taking the expectation of the error term.

11 However, it is not possible to obtain a closed-form expression for the expectation of the error term. Both its numerator and its denominator are functions of w and there are no expected value rules that allow us to simplify it.

12 As a second-best measure, we take plims and investigate what would happen in large samples. The plim rules often allow us to obtain analytical results when the expected value rules do not.

13 We focus on the error term. We would like to use the plim quotient rule: $\operatorname{plim}(A/B) = \operatorname{plim} A / \operatorname{plim} B$ if A and B have probability limits and plim B is not 0. In words, the plim of a quotient is the plim of the numerator divided by the plim of the denominator, provided that both of these limits exist.

14 However, as the expression stands, the numerator and the denominator of the error term do not have limits. The denominator increases indefinitely as the sample size increases. The numerator has no particular limit.

15 To deal with this problem, we divide both the numerator and the denominator by n.

16 It can be shown that the limit of the numerator is the covariance of X and u and the limit of the denominator is the variance of X.

17 Hence the numerator and the denominator of the error term have limits and we are entitled to implement the plim quotient rule. We need var(X) to be non-zero, but this will be the case assuming that there is some variation in X.
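The steps in slides 13 to 17 can be summarized in one line. The notation cov(X, u) and var(X) for the population covariance and variance is an assumption carried over from the wording of the text:

```latex
\[
\operatorname{plim} b_2
\;=\; \beta_2 + \frac{\operatorname{plim} \frac{1}{n} \sum (X_i - \bar{X})(u_i - \bar{u})}
                      {\operatorname{plim} \frac{1}{n} \sum (X_i - \bar{X})^2}
\;=\; \beta_2 + \frac{\operatorname{cov}(X, u)}{\operatorname{var}(X)}
\]
```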

18 We can decompose both the numerator and the denominator of the error term. We will start by substituting for X and u in the numerator.

19 We expand the expression using the first covariance rule.

20 If we assume that Z, v, and w are distributed independently of each other, the first 3 terms are 0. The last term gives us $-\beta_2 \sigma_w^2$.

21 We next expand the denominator of the error term. The first two terms are variances. The covariance is 0 if we assume w is distributed independently of Z.

22 Thus in large samples, $b_2$ is biased towards 0 and the size of the bias depends on the relative sizes of the variances of w and Z.
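A sketch of the expansion in slides 18 to 22, assuming $X = Z + w$ and $u = v - \beta_2 w$ and writing $\sigma_Z^2$ and $\sigma_w^2$ for the population variances of Z and w. The ordering of the covariance terms is chosen so that the last one is the non-zero term:

```latex
\[
\begin{aligned}
\operatorname{cov}(X, u) &= \operatorname{cov}(Z + w,\; v - \beta_2 w) \\
  &= \operatorname{cov}(Z, v) + \operatorname{cov}(w, v)
     - \beta_2 \operatorname{cov}(Z, w) - \beta_2 \operatorname{cov}(w, w)
   \;=\; -\beta_2 \sigma_w^2 \\
\operatorname{var}(X) &= \operatorname{var}(Z + w)
   = \sigma_Z^2 + \sigma_w^2 + 2 \operatorname{cov}(Z, w)
   = \sigma_Z^2 + \sigma_w^2 \\
\operatorname{plim} b_2 &= \beta_2 - \frac{\beta_2 \sigma_w^2}{\sigma_Z^2 + \sigma_w^2}
   \;=\; \beta_2 \, \frac{\sigma_Z^2}{\sigma_Z^2 + \sigma_w^2}
\end{aligned}
\]
```

Because the ratio $\sigma_Z^2 / (\sigma_Z^2 + \sigma_w^2)$ lies between 0 and 1, the limit has the same sign as $\beta_2$ but is smaller in absolute value, which is the sense in which the estimator is biased towards 0.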

23 Since $b_2$ is an inconsistent estimator, it is safe to assume that it is biased in finite samples as well.

24 If our assumptions concerning Z, v, and w are incorrect, $b_2$ would almost certainly still be an inconsistent estimator, but the expression for the large-sample bias would be more complicated.

25 A further consequence of the violation of Assumption B.7 is that the standard errors, t tests, and F test are invalid.

26 The analysis will be illustrated with a simulation. The true model is Y = Z + u, with the values of Z drawn randomly from a normal distribution with mean 10 and variance 4, and the values of u being drawn from a normal distribution with mean 0 and variance 4.

27 X = Z + w, where w is drawn from a normal distribution with mean 0 and variance 1. With this information, we are able to determine plim $b_2$.
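With the variances given, the standard large-sample result $\operatorname{plim} b_2 = \beta_2 - \beta_2 \sigma_w^2 / (\sigma_Z^2 + \sigma_w^2)$ can be evaluated directly. The arithmetic below assumes a true slope of $\beta_2 = 0.8$. That value is not stated in the transcript; it is the one implied by the figure of 0.64 quoted on slide 31, so treat it as an inference rather than a quotation.

```latex
\[
\operatorname{plim} b_2
\;=\; \beta_2 - \frac{\beta_2 \sigma_w^2}{\sigma_Z^2 + \sigma_w^2}
\;=\; 0.8 - \frac{0.8 \times 1}{4 + 1}
\;=\; 0.8 - 0.16
\;=\; 0.64
\]
```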

28 The figure shows the distributions of $b_2$ for sample size 20 and sample size 1,000, for 10 million samples. For both sample sizes, the distributions reveal that the OLS estimator is biased downwards.

29 Further, the figure suggests that, if the sample size were increased, the distribution would contract to the limiting value of 0.64.

30 There remains the question of whether the limiting value provides guidance to the mean of the distribution for a finite sample. In general, the mean will be different from the limiting value, but will approach it as the sample size increases.

31 In the present case, however, the mean of the distribution is almost exactly equal to 0.64, even for sample size 20.
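A small-scale version of this experiment can be sketched in Python using only the standard library. Several things here are assumptions rather than details taken from the slides: the slope of 0.8 (the value implied by the limit of 0.64), the intercept of 2.0 (purely illustrative, since the slope estimator does not depend on it), the name `v` for the disturbance of the true model, the seed, and the number of replications, which is far smaller than the 10 million samples used for the figure.

```python
import random
import statistics


def ols_slope(x, y):
    """OLS slope of y on x: sum of cross-deviations over sum of squared deviations."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def simulate_slopes(n, replications, rng, beta1=2.0, beta2=0.8):
    """Return the OLS slope from each of `replications` samples of size n.

    beta1 and beta2 are assumed values (see the text above), not figures
    quoted in the slides.
    """
    slopes = []
    for _ in range(replications):
        z = [rng.gauss(10.0, 2.0) for _ in range(n)]  # true regressor, variance 4
        w = [rng.gauss(0.0, 1.0) for _ in range(n)]   # measurement error, variance 1
        v = [rng.gauss(0.0, 2.0) for _ in range(n)]   # disturbance, variance 4
        x = [zi + wi for zi, wi in zip(z, w)]         # observed regressor X = Z + w
        y = [beta1 + beta2 * zi + vi for zi, vi in zip(z, v)]
        slopes.append(ols_slope(x, y))                # regress Y on X, not on Z
    return slopes


rng = random.Random(12345)  # arbitrary seed, for reproducibility only
mean_small = statistics.fmean(simulate_slopes(20, 2000, rng))
mean_large = statistics.fmean(simulate_slopes(1000, 200, rng))
plim_b2 = 0.8 * 4.0 / (4.0 + 1.0)  # beta2 * var(Z) / (var(Z) + var(w))

print(f"mean of b2, n = 20:   {mean_small:.3f}")
print(f"mean of b2, n = 1000: {mean_large:.3f}")
print(f"plim b2:              {plim_b2:.3f}")
```

Both sample means should come out close to the limiting value, with the n = 20 estimates much more widely dispersed around it than the n = 1,000 estimates.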

32 Measurement error in the dependent variable has less serious consequences. Suppose that the true dependent variable is Q, that the measured variable is Y, and that the measurement error is r.

33 We can rewrite the model in terms of the observable variables by substituting for Q from the second equation.
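The equations for this case are not reproduced in the transcript. A reconstruction consistent with the text, assuming the disturbance term of the original model is v and that of the revised model is u (the names used on slide 34), would be:

```latex
\[
\begin{aligned}
Q &= \beta_1 + \beta_2 X + v, \qquad Y = Q + r \\
Y &= \beta_1 + \beta_2 X + v + r \;=\; \beta_1 + \beta_2 X + u,
\qquad u = v + r
\end{aligned}
\]
```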

34 In this case the presence of the measurement error does not lead to a violation of Assumption B.7. If v satisfies that assumption in the original model, u will satisfy it in the revised one, unless for some strange reason r is not distributed independently of X.

35 The standard errors and tests will remain valid. However, the standard errors will tend to be larger than they would have been if there had been no measurement error, reflecting the fact that the variances of the coefficients are larger.
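The increase in variance can be made explicit. The following is a sketch that assumes $u = v + r$, that v and r are distributed independently of each other, and that the variance is taken conditional on the observed values of X:

```latex
\[
\sigma_u^2 = \sigma_v^2 + \sigma_r^2,
\qquad
\sigma_{b_2}^2 \;=\; \frac{\sigma_u^2}{\sum (X_i - \bar{X})^2}
\;=\; \frac{\sigma_v^2 + \sigma_r^2}{\sum (X_i - \bar{X})^2}
\]
```

The term $\sigma_r^2$ in the numerator is the contribution of the measurement error. Without it, the variance would be $\sigma_v^2 / \sum (X_i - \bar{X})^2$.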

Copyright Christopher Dougherty. These slideshows may be downloaded by anyone, anywhere for personal use. Subject to respect for copyright and, where appropriate, attribution, they may be used as a resource for teaching an econometrics course. There is no need to refer to the author. The content of this slideshow comes from Section 8.4 of C. Dougherty, Introduction to Econometrics, fourth edition 2011, Oxford University Press. Additional (free) resources for both students and instructors may be downloaded from the OUP Online Resource Centre. Individuals studying econometrics on their own who feel that they might benefit from participation in a formal course should consider the London School of Economics summer school course EC212 Introduction to Econometrics or the University of London International Programmes distance learning course EC2020 Elements of Econometrics.