THE DUMMY VARIABLE TRAP

Presentation transcript:

1 Suppose that you have a regression model with Y depending on a set of ordinary variables X2, ..., Xk and a qualitative variable.

2 Suppose that the qualitative variable has s categories. We choose one of them as the omitted category (without loss of generality, category 1) and define dummy variables D2, ..., Ds for the rest.
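
As a minimal sketch of this setup, the following Python snippet builds the dummy variables for a qualitative variable with s = 4 categories, omitting category 1 as the reference. The function name make_dummies and the sample data are hypothetical and serve only as an illustration.

```python
def make_dummies(categories, levels, omit=None):
    """Return {level: 0/1 column} for every level except the omitted one."""
    return {
        level: [1 if c == level else 0 for c in categories]
        for level in levels
        if level != omit
    }


# Hypothetical sample: the category of each of six observations.
categories = [1, 2, 3, 4, 2, 1]

# Category 1 is the omitted (reference) category, so only D2, D3, D4 exist.
dummies = make_dummies(categories, levels=[1, 2, 3, 4], omit=1)

for level, column in dummies.items():
    print(f"D{level}: {column}")
```

Observations in category 1 have all dummies equal to 0, which is what makes category 1 the basis for comparison.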

3 What would happen if we did not drop the reference category, and instead defined a dummy variable D1 for it and included it in the specification?

4 We would fall into the dummy variable trap. It would be impossible to fit the model as specified.

5 We will start with an intuitive explanation. The coefficient of each dummy variable represents the increase in the intercept relative to that for the basic (omitted) category. But if every category has its own dummy variable, there is no basic category for such a comparison.

6 β1 represents the fixed component of Y for the basic category. But again, there is no basic category. Thus the model does not have any logical interpretation.

7 Mathematically, we have a special case of exact multicollinearity. If there is no omitted category, there is an exact linear relationship between X1 and the dummy variables. The table gives an example where there are 4 categories.

[Table with columns: Observation, Category, X1, D1, D2, D3, D4]
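
The exact linear relationship can be written out for the four-category case. Each observation belongs to exactly one category, so exactly one dummy variable equals 1 and the rest equal 0, while X1 equals 1 throughout:

```latex
X_1 = D_1 + D_2 + D_3 + D_4
\quad\Longleftrightarrow\quad
X_1 - D_1 - D_2 - D_3 - D_4 = 0
\qquad \text{in every observation.}
```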

8 X1 is the variable whose coefficient is β1. It is equal to 1 in all observations. Usually we do not write it explicitly because there is no need to do so.

9 If there is an exact linear relationship among a set of the variables, it is impossible in principle to estimate the separate coefficients of those variables. To understand this properly, one needs to use linear algebra.
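
The linear algebra can be illustrated numerically. The sketch below, using hypothetical data and a helper function rank written for this example, computes the rank of the matrix of explanatory variables by exact Gaussian elimination. With X1 and all four dummies there are 5 columns but the rank is only 4, so the coefficients are not separately identified. Dropping D1 leaves 4 columns of rank 4.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix (given as a list of rows) by exact Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    found = 0  # number of pivots found so far
    for col in range(len(rows[0])):
        # Look for a nonzero entry in this column at or below row `found`.
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot: this column depends on earlier columns
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(len(rows)):
            if r != found and rows[r][col] != 0:
                factor = rows[r][col] / rows[found][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
    return found


# Hypothetical sample: the category of each of eight observations.
categories = [1, 2, 3, 4, 2, 1, 3, 4]

# Columns X1 (always 1), D1, D2, D3, D4: the dummy variable trap.
trap = [[1] + [1 if c == j else 0 for j in (1, 2, 3, 4)] for c in categories]

# Columns X1, D2, D3, D4: category 1 omitted.
ok = [[1] + [1 if c == j else 0 for j in (2, 3, 4)] for c in categories]

trap_rank = rank(trap)
ok_rank = rank(ok)
print(len(trap[0]), trap_rank)  # number of columns, rank
print(len(ok[0]), ok_rank)
```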

10 If you tried to run the regression anyway, the regression application should detect the problem and do one of two things. It may simply refuse to run the regression.

11 Alternatively, it may run it, dropping one of the variables in the linear relationship, effectively defining the omitted category by itself.
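
One way such an automatic drop could work is sketched below. This is an assumption made for illustration, not a description of any particular regression application: the hypothetical function pivot_columns scans the variables from left to right and keeps only those that are not exact linear combinations of the ones already kept. Which variable gets dropped depends on the ordering, and real applications may choose differently.

```python
from fractions import Fraction


def pivot_columns(matrix):
    """Indices of columns that are not linear combinations of earlier columns."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    pivots = []
    for col in range(len(rows[0])):
        start = len(pivots)
        pivot = next((r for r in range(start, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue  # redundant column: skip it
        rows[start], rows[pivot] = rows[pivot], rows[start]
        for r in range(len(rows)):
            if r != start and rows[r][col] != 0:
                factor = rows[r][col] / rows[start][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[start])]
        pivots.append(col)
    return pivots


# Hypothetical sample with X1 and a full set of four dummies.
names = ["X1", "D1", "D2", "D3", "D4"]
categories = [1, 2, 3, 4, 2, 1, 3, 4]
design = [[1] + [1 if c == j else 0 for j in (1, 2, 3, 4)] for c in categories]

kept_idx = pivot_columns(design)
kept = [names[i] for i in kept_idx]
dropped = [n for i, n in enumerate(names) if i not in kept_idx]
print("kept:", kept)
print("dropped:", dropped)
```

With this left-to-right ordering the last dummy is the one dropped, so category 4 becomes the omitted category by default.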

12 There is another way of avoiding the dummy variable trap. That is to drop the intercept (and X1). There is no longer a problem because there is no longer an exact linear relationship linking the variables.

13 The δ parameters are now the intercepts in the relationship for the individual categories. For example, if the observation relates to category 2, all the dummy variables except D2 will be equal to 0. D2 = 1, and hence the relationship for that observation has intercept δ2.
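
A small numerical sketch of the no-intercept specification follows. It assumes the simplest possible case, with hypothetical data and no ordinary X variables, so that Y is regressed on D1, ..., D4 only. In that special case the normal equations decouple and each estimated δ is just the mean of Y within its category. With other explanatory variables present, the δ parameters remain the category intercepts but are no longer simple means.

```python
from fractions import Fraction

# Hypothetical data: category and Y value for eight observations.
categories = [1, 2, 3, 4, 2, 1, 3, 4]
y = [10, 14, 9, 20, 16, 12, 11, 22]
levels = [1, 2, 3, 4]

# Full set of dummies, no intercept column.
D = [[1 if c == j else 0 for j in levels] for c in categories]
k = len(levels)

# Normal equations (D'D) delta = D'y.
dtd = [[sum(row[a] * row[b] for row in D) for b in range(k)] for a in range(k)]
dty = [sum(row[a] * yi for row, yi in zip(D, y)) for a in range(k)]

# D'D is diagonal here (category counts), so each equation solves on its own.
delta = [Fraction(dty[a], dtd[a][a]) for a in range(k)]

for j, d in zip(levels, delta):
    print(f"delta{j} = {d}")
```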

Copyright Christopher Dougherty. These slideshows may be downloaded by anyone, anywhere for personal use. Subject to respect for copyright and, where appropriate, attribution, they may be used as a resource for teaching an econometrics course. There is no need to refer to the author.

The content of this slideshow comes from Section 5.2 of C. Dougherty, Introduction to Econometrics, fourth edition 2011, Oxford University Press. Additional (free) resources for both students and instructors may be downloaded from the OUP Online Resource Centre.

Individuals studying econometrics on their own who feel that they might benefit from participation in a formal course should consider the London School of Economics summer school course EC212 Introduction to Econometrics or the University of London International Programmes distance learning course EC2020 Elements of Econometrics.