9.1 Chapter 9: Dummy Variables

A dummy variable is a variable that can take on only two possible values, for example: yes/no, up/down, male/female, union member/non-union member.

Dummy variables provide a method for "quantifying" a "qualitative" variable. The variable is coded D = 1 if yes and D = 0 if no.

It doesn't matter which category gets the 0 and which gets the 1; the choice only determines which category serves as the reference group.
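The coding rule can be sketched in a few lines of Python. The answers below are made-up illustration data, and the names `union_status`, `D`, and `D_rev` are hypothetical, not taken from the course material.

```python
# Hypothetical survey answers (made-up data, for illustration only).
union_status = ["member", "non-member", "member", "non-member", "non-member"]

# D = 1 for union members, D = 0 for non-members.
D = [1 if s == "member" else 0 for s in union_status]

# The reverse coding carries exactly the same information: D_rev = 1 - D.
D_rev = [1 - d for d in D]

print(D)      # [1, 0, 1, 0, 0]
print(D_rev)  # [0, 1, 0, 1, 1]
```

Either coding separates the two groups equally well; only the group labelled 0 (the reference group) changes.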
9.2 Estimation with Dummy Variables

If the dummy variable is the only independent variable:

  $Y_t = \beta_1 + \beta_2 D_t + e_t$

If D = 0: $Y_t = \beta_1 + e_t$
If D = 1: $Y_t = (\beta_1 + \beta_2) + e_t$

Example: Wage data (see class handout)

FE = 0 if the person is male
FE = 1 if the person is female

  $Wage_t = \beta_1 + \beta_2 FE_t + e_t$

Least squares regression will produce estimates $b_1$ and $b_2$ such that:

$b_1$ = the mean of the Wage values for the FE = 0 observations
$b_1 + b_2$ = the mean of the Wage values for the FE = 1 observations
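The group-mean property can be checked numerically. The sketch below uses six made-up wage values (not the class handout data) and NumPy's `np.linalg.lstsq` as the least squares solver; the variable names are illustrative assumptions.

```python
import numpy as np

# Synthetic hourly wages (made-up numbers, NOT the class handout data).
wage = np.array([12.0, 15.0, 18.0, 10.0, 11.0, 15.0])
fe = np.array([0, 0, 0, 1, 1, 1])  # FE = 1 if female, FE = 0 if male

# Design matrix: a column of ones (intercept) and the dummy variable.
X = np.column_stack([np.ones_like(wage), fe])

# Least squares estimates (b1, b2) of Wage = beta1 + beta2 * FE + e.
b, *_ = np.linalg.lstsq(X, wage, rcond=None)
b1, b2 = b

mean_male = wage[fe == 0].mean()    # 15.0
mean_female = wage[fe == 1].mean()  # 12.0

# b1 matches the FE = 0 mean; b1 + b2 matches the FE = 1 mean
# (up to floating-point rounding).
print(b1, mean_male)
print(b1 + b2, mean_female)
```

With these numbers, b2 is the female mean minus the male mean, so a negative b2 simply says the FE = 1 group has the lower average wage in this toy sample.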
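9.3 Estimation with Dummy Variables

If there is one continuous explanatory variable and one dummy variable:

  $Y_t = \beta_1 + \beta_2 X_t + \delta D_t + e_t$

If D = 0: $Y_t = \beta_1 + \beta_2 X_t + e_t$
If D = 1: $Y_t = (\beta_1 + \delta) + \beta_2 X_t + e_t$

Suppose that $\beta_1 > 0$, $\beta_2 > 0$, $\delta > 0$. It is as though we have two regression lines that have the same slope coefficient but different intercepts.

[Figure: two parallel lines with X on the horizontal axis and Y on the vertical axis; intercepts $\beta_1$ and $\beta_1 + \delta$, common slope $\beta_2$.]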
9.4 Estimation with Dummy Variables

Example: Wage data (see class handout)

FE = 0 if the person is male
FE = 1 if the person is female

  $Wage_t = \beta_1 + \beta_2 ED_t + \beta_3 FE_t + e_t$

We estimate this model as an ordinary multiple regression model. Our estimate $b_3$ measures the difference in expected wages between females (FE = 1) and males (FE = 0) who have the same level of education.
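A minimal sketch of this estimation follows. The data are generated from assumed coefficient values with no error term, so least squares recovers them exactly; the values 2.0, 1.5, and -3.0 and all variable names are hypothetical and have nothing to do with the class handout.

```python
import numpy as np

# Assumed "true" coefficients (hypothetical values for illustration).
beta1, beta2, beta3 = 2.0, 1.5, -3.0

# Synthetic sample: years of education and the female dummy.
ed = np.array([10.0, 12.0, 14.0, 16.0, 10.0, 12.0, 14.0, 16.0])
fe = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Noise-free wages, so the fit reproduces the assumed coefficients.
wage = beta1 + beta2 * ed + beta3 * fe

# Ordinary multiple regression: intercept, ED, FE.
X = np.column_stack([np.ones_like(ed), ed, fe])
b, *_ = np.linalg.lstsq(X, wage, rcond=None)
b1, b2, b3 = b

# Two parallel fitted lines: same slope b2, intercepts b1 and b1 + b3.
print("male line:   intercept", b1, "slope", b2)
print("female line: intercept", b1 + b3, "slope", b2)
```

At any given level of ED, the two fitted wages differ by b3, which is the sense in which the estimate controls for education.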
9.5 Interaction Terms

An interaction term is an independent variable that is the product of two other independent variables. These independent variables can be continuous or dummy variables.

  $Y_t = \beta_1 + \beta_2 X_t + \beta_3 Z_t + \beta_4 X_t Z_t + e_t$

In this model, the effect of X on Y will depend on the level of Z.
In this model, the effect of Z on Y will depend on the level of X.
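To see why, write the four coefficients as $\beta_1, \dots, \beta_4$, treat X and Z as continuous, and assume $E(e_t) = 0$ (an assumption added here for the derivation). Differentiating the expected value of Y with respect to each variable gives:

$$
\frac{\partial E(Y_t)}{\partial X_t} = \beta_2 + \beta_4 Z_t,
\qquad
\frac{\partial E(Y_t)}{\partial Z_t} = \beta_3 + \beta_4 X_t
$$

The coefficient $\beta_4$ on the product term appears in both derivatives. If $\beta_4 = 0$, the effects reduce to the constants $\beta_2$ and $\beta_3$ and no longer depend on the other variable.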
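9.6 Interaction Terms Involving Dummy Variables

  $Y_t = \beta_1 + \beta_2 X_t + \beta_3 D_t + \beta_4 D_t X_t + e_t$

If D = 0: $Y_t = \beta_1 + \beta_2 X_t + e_t$
If D = 1: $Y_t = (\beta_1 + \beta_3) + (\beta_2 + \beta_4) X_t + e_t$

Suppose that $\beta_1 > 0$, $\beta_2 > 0$, $\beta_3 > 0$, $\beta_4 > 0$. It is as though we have two regression lines that have different slope coefficients and different intercepts.

[Figure: two non-parallel lines with X on the horizontal axis and Y on the vertical axis; the D = 0 line has intercept $\beta_1$ and slope $\beta_2$, the D = 1 line has intercept $\beta_1 + \beta_3$ and slope $\beta_2 + \beta_4$.]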
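The "two regression lines" reading can be checked on simulated data. The sketch below assumes hypothetical coefficient values and a seeded random sample; it fits the pooled model with the dummy and the interaction term, then compares it with a separate straight-line fit for each group using `np.polyfit`.

```python
import numpy as np

# Simulated data under assumed coefficients (hypothetical values).
rng = np.random.default_rng(0)
n = 40
x = rng.uniform(0, 10, size=n)
d = np.repeat([0, 1], n // 2)  # first half D = 0, second half D = 1
y = 1.0 + 0.5 * x + 2.0 * d + 0.8 * d * x + rng.normal(0, 1, size=n)

# Pooled model: intercept, X, D, and the interaction D * X.
X = np.column_stack([np.ones(n), x, d, d * x])
b1, b2, b3, b4 = np.linalg.lstsq(X, y, rcond=None)[0]

# Separate simple regressions for each group.
# np.polyfit with degree 1 returns (slope, intercept).
slope0, int0 = np.polyfit(x[d == 0], y[d == 0], 1)
slope1, int1 = np.polyfit(x[d == 1], y[d == 1], 1)

# D = 0 line: intercept b1, slope b2.
print(np.allclose([b1, b2], [int0, slope0]))
# D = 1 line: intercept b1 + b3, slope b2 + b4.
print(np.allclose([b1 + b3, b2 + b4], [int1, slope1]))
```

Because both the intercept and the slope are allowed to differ by group, the pooled coefficient estimates coincide with those from fitting each group on its own. Standard errors can still differ, since the pooled model assumes a common error variance.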