Published by Francis Logan. Modified over 6 years ago.
1
Lecture 13 Preview: Dummy and Interaction Variables
Preliminary Mathematics: Averages and Regressions Including Only a Constant
An Example: Discrimination in Academe
  Average Salaries
  Dummy Variables
  Models
    Type 1 Models: No explanatory variables; only a constant.
    Type 2 Models: A constant and a single dummy explanatory variable denoting sex.
    Type 3 Models: A constant, a dummy explanatory variable denoting sex, and other explanatory variable(s).
  Beware of Implicit Assumptions
  Interaction Variables: Sex and Experience
  Conclusions
    Beware of Averages
    Power of Multiple Regression Analysis
    Flexibility of Multiple Regression Analysis
An Example: Internet and Television Use – An International Comparison
  Internet and Television Use: Similarities and Differences
  Interaction Variable: Economic and Political Interaction
2
Preliminary Mathematics: Averages and Regressions
Model: $y_t = \beta_{Const} + e_t$
  $\beta_{Const}$ = the actual constant
  $e_t$ = the error term
Estimate: $Esty_t = b_{Const}$
  $b_{Const}$ estimates the actual value of the constant, $\beta_{Const}$.
Residual: $Res_t = y_t - Esty_t$
  The actual value of y less the estimated value of y.

Sum of squared residuals for three observations:
  $SSR = (y_1 - Esty_1)^2 + (y_2 - Esty_2)^2 + (y_3 - Esty_3)^2$
  $SSR = (y_1 - b_{Const})^2 + (y_2 - b_{Const})^2 + (y_3 - b_{Const})^2$
Set the derivative with respect to $b_{Const}$ equal to 0:
  $\frac{dSSR}{db_{Const}} = -2(y_1 - b_{Const}) - 2(y_2 - b_{Const}) - 2(y_3 - b_{Const}) = 0$
  $(y_1 - b_{Const}) + (y_2 - b_{Const}) + (y_3 - b_{Const}) = 0$
  $y_1 + y_2 + y_3 = 3 b_{Const}$
  $\frac{y_1 + y_2 + y_3}{3} = b_{Const}$
  $\bar{y} = b_{Const}$
When a regression only includes a constant (that is, when there are no explanatory variables), the ordinary least squares estimate of the constant equals the average value of y.

An Example: Discrimination in Academe
Data for 200 Faculty Members. Full Disclosure: The data are artificially generated.
  Salary_t: Salary in dollars
  Experience_t: Years of experience
  Articles_t: Number of articles published
  SexM1_t: 1 if male, 0 if female
Dummy variables take on only two values: 0 or 1. They divide the observations into two disjoint groups, two mutually exclusive categories.
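The result above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the three y values are hypothetical and are not taken from the lecture's data. It regresses y on a column of ones and compares the estimated constant with the sample average.

```python
import numpy as np

# Hypothetical sample of three observations (illustrative values only).
y = np.array([4.0, 7.0, 10.0])

# A regression with only a constant: the design matrix is one column of ones.
X = np.ones((len(y), 1))
coef = np.linalg.lstsq(X, y, rcond=None)[0]
b_const = float(coef[0])

# The least squares estimate of the constant and the average should agree.
print("b_const:", b_const)
print("average:", y.mean())
```

Any other sample can be substituted for `y`; the two printed values should match up to floating-point error.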
3
Discrimination in Academe
Average Salaries
  Both males and females   82,802
  Males only               91,841
  Females only             63,148
  Difference               28,693
On average, men earn $28,693 more than women.
Question: Should we conclude that discrimination is present?
Theory: Discrimination against women exists in academe.

Type 1 Models – No explanatory variables: constant only
Model: $Salary_t = \beta_{Const} + e_t$

EViews regressions of Salary on a constant alone:
  Sample                       Number of Observations   Prob (Const)
  All faculty members          200                      0.0000
  Men only (SexM1 = 1)         137                      0.0000
  Women only (SexM1 = 0)       63                       0.0000

We have confirmed that running a regression that includes only a constant is equivalent to calculating the average.
4
Type 2 Models – Dummy explanatory variable: constant and a dummy denoting sex
Step 0: Construct a model reflecting the theory to be tested.
  SexM1_t = 1 if male, 0 if female
  Model: $Salary_t = \beta_{Const} + \beta_{SexM1} SexM1_t + e_t$
  Theory: $\beta_{SexM1} > 0$

Step 1: Collect data, run the regression, and interpret the estimates.
  EViews: Dependent Variable: Salary; Number of Observations: 200
    SexM1   Estimate 28,693   Prob 0.0000
    Const   Estimate 63,148
  EstSalary = 63,148 + 28,693 SexM1
  For men (SexM1 = 1): EstSalaryMen = 63,148 + 28,693 = 91,841
  For women (SexM1 = 0): EstSalaryWomen = 63,148 + 0 = 63,148

Interpretation: SexM1 coefficient estimate = 28,693. We estimate that men earn $28,693 more than women. This evidence lends support to our theory.

Running a regression that includes only a constant and a dummy variable is equivalent to comparing averages. Recall the salary averages:
  Males only     91,841
  Females only   63,148
  Difference     28,693
The difference is the SexM1 coefficient estimate.
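The equivalence between a constant-plus-dummy regression and a comparison of averages can be sketched as follows. This assumes NumPy; the five salaries are hypothetical and are not the lecture's 200 observations. The constant should reproduce the women's average and the dummy coefficient the men's average minus the women's average.

```python
import numpy as np

# Hypothetical salaries (illustrative only, not the lecture's data set).
salary = np.array([90000.0, 94000.0, 91000.0, 62000.0, 64000.0])
sex_m1 = np.array([1.0, 1.0, 1.0, 0.0, 0.0])  # 1 if male, 0 if female

# Design matrix: a constant column and the SexM1 dummy.
X = np.column_stack([np.ones_like(sex_m1), sex_m1])
coef = np.linalg.lstsq(X, salary, rcond=None)[0]
b_const, b_sex_m1 = float(coef[0]), float(coef[1])

# Group averages computed directly from the data.
mean_men = salary[sex_m1 == 1].mean()
mean_women = salary[sex_m1 == 0].mean()

print("b_const:", b_const, "women's average:", mean_women)
print("b_sex_m1:", b_sex_m1, "difference in averages:", mean_men - mean_women)
```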
5
Type 2 Models – Dummy explanatory variables: constant and a different dummy
SexF1_t = 1 if female, 0 if male, so SexF1_t = 1 − SexM1_t
  For women, SexM1_t = 0, so SexF1_t = 1 − 0 = 1
  For men, SexM1_t = 1, so SexF1_t = 1 − 1 = 0

Step 0: Construct a model reflecting the theory to be tested.
  Model: $Salary_t = \beta_{Const} + \beta_{SexF1} SexF1_t + e_t$
  Theory: $\beta_{SexF1} < 0$

Step 1: Collect data, run the regression, and interpret the estimates.
  EViews: Dependent Variable: Salary; Number of Observations: 200
    SexF1   Estimate −28,693   Prob 0.0000
    Const   Estimate 91,841
  EstSalary = 91,841 − 28,693 SexF1
  For men (SexF1 = 0): EstSalaryMen = 91,841 − 0 = 91,841
  For women (SexF1 = 1): EstSalaryWomen = 91,841 − 28,693 = 63,148

Interpretation: SexF1 coefficient estimate = −28,693. We estimate that women earn $28,693 less than men.

Conclusions:
  A regression that includes only a constant and a dummy variable is equivalent to comparing averages. Recall the salary averages:
    Males only     91,841
    Females only   63,148
    Difference     28,693
  The choice of which group to denote by 0 and which group by 1 does not affect the conclusions.
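The claim that the coding of the dummy does not matter can be illustrated with a short sketch. It assumes NumPy, uses hypothetical salaries, and introduces a helper named `fit_constant_and_dummy` purely for illustration. Recoding the dummy as 1 − SexM1 should flip the sign of the dummy coefficient and move the constant from the women's average to the men's average.

```python
import numpy as np

def fit_constant_and_dummy(y, dummy):
    """Regress y on a constant and one dummy; return (b_const, b_dummy)."""
    X = np.column_stack([np.ones_like(dummy), dummy])
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    return float(coef[0]), float(coef[1])

# Hypothetical salaries (illustrative only).
salary = np.array([90000.0, 94000.0, 91000.0, 62000.0, 64000.0])
sex_m1 = np.array([1.0, 1.0, 1.0, 0.0, 0.0])  # 1 if male, 0 if female
sex_f1 = 1.0 - sex_m1                          # 1 if female, 0 if male

const_m, b_m = fit_constant_and_dummy(salary, sex_m1)
const_f, b_f = fit_constant_and_dummy(salary, sex_f1)

print("SexM1 coding:", const_m, b_m)
print("SexF1 coding:", const_f, b_f)
```

Both codings imply the same estimated salary for each group, so the conclusion about the salary gap is unchanged.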
6
Model: $Salary_t = \beta_{Const} + \beta_{SexF1} SexF1_t + e_t$
SexF1_t = 1 if female, 0 if male
Theory: $\beta_{SexF1} < 0$

EViews: Dependent Variable: Salary; Number of Observations: 200
  SexF1   Estimate −28,693   Prob 0.0000
  Const   Estimate 91,841
Interpretation: SexF1 coefficient estimate = −28,693. Women earn $28,693 less than men.

Steps 2, 3, 4, and 5
  H0: $\beta_{SexF1} = 0$   No discrimination
  H1: $\beta_{SexF1} < 0$   Discrimination against women
Prob[Results IF H0 True]: The estimate was −28,693. What is the probability that the coefficient estimate in one regression would be −28,693 or less, if H0 were true (if the actual coefficient, $\beta_{SexF1}$, equaled 0; that is, if there were no discrimination)?
Question: Can we use the tails probability? Answer: Yes.
  Prob[Results IF H0 True] < .0001

Do the regression results provide convincing evidence of gender discrimination?
On the one hand, yes. The dummy variable coefficient estimate suggests that women earn $28,693 less than men, and the estimate is very significant: Prob[Results IF H0 True] < .0001.
On the other hand, what implicit assumption is our discrimination model making? It implicitly assumes that the only relevant factor in determining faculty salaries is whether the faculty member is a man or a woman. Is this reasonable?
7
Discrimination Theory: $\beta_{SexF1} < 0$; Experience Theory: $\beta_E > 0$
Type 3 Models – Other explanatory variables: constant, dummy, and other factors

Step 0: Construct a model reflecting the theory to be tested.
  Model: $Salary_t = \beta_{Const} + \beta_{SexF1} SexF1_t + \beta_E Experience_t + e_t$
  SexF1_t = 1 if female, 0 if male
  Discrimination Theory: $\beta_{SexF1} < 0$
  Experience Theory: $\beta_E > 0$

Step 1: Collect data, run the regression, and interpret the estimates.
  EViews: Dependent Variable: Salary; Number of Observations: 200
    SexF1        Estimate −2,240   Prob 0.4638
    Experience   Estimate 2,447    Prob 0.0000
    Const        Estimate 42,238

Interpretation: We estimate that
  Women earn about $2,240 less than men after accounting for experience (SexF1 coefficient estimate = −2,240).
  Each additional year of experience results in a $2,447 increase in salary (Experience coefficient estimate = 2,447).
8
For men: SexF1 = 0; for women: SexF1 = 1
EViews: Dependent Variable: Salary; Number of Observations: 200
  SexF1        Estimate −2,240   Prob 0.4638
  Experience   Estimate 2,447    Prob 0.0000
  Const        Estimate 42,238
EstSalary = 42,238 − 2,240 SexF1 + 2,447 Experience

For men (SexF1 = 0):
  EstSalaryMen = 42,238 − 0 + 2,447 Experience
  EstSalaryMen = 42,238 + 2,447 Experience
For women (SexF1 = 1):
  EstSalaryWomen = 42,238 − 2,240 + 2,447 Experience
  EstSalaryWomen = 39,998 + 2,447 Experience

[Figure: EstSalary plotted against Experience. Two parallel lines with slope 2,447: the men's line has intercept 42,238 and the women's line has intercept 39,998, a vertical gap of 2,240.]
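The two group equations follow from plugging SexF1 = 0 or 1 into the estimated equation. A minimal sketch in plain Python, using the coefficient estimates reported on the slide (42,238, −2,240, and 2,447); the function name `est_salary` and the chosen experience levels are illustrative assumptions.

```python
def est_salary(sex_f1: int, experience: float) -> float:
    """Estimated salary from EstSalary = 42,238 - 2,240 SexF1 + 2,447 Experience."""
    return 42_238 - 2_240 * sex_f1 + 2_447 * experience

# Compare men (SexF1 = 0) and women (SexF1 = 1) at a few experience levels.
gaps = []
for years in (0, 10, 20):
    men = est_salary(0, years)
    women = est_salary(1, years)
    gaps.append(men - women)
    print(years, men, women, men - women)
```

Because the model contains no interaction term, the printed gap is the same at every experience level.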
9
Model: $Salary_t = \beta_{Const} + \beta_{SexF1} SexF1_t + \beta_E Experience_t + e_t$
Steps 2, 3, 4, and 5
Discrimination Hypotheses
  H0: $\beta_{SexF1} = 0$   No discrimination
  H1: $\beta_{SexF1} < 0$   Discrimination
Experience Hypotheses
  H0: $\beta_E = 0$   Experience has no effect on salary
  H1: $\beta_E > 0$   Experience increases salary

EViews: Dependent Variable: Salary; Number of Observations: 200
  SexF1        Estimate −2,240   Prob 0.4638
  Experience   Estimate 2,447    Prob 0.0000
  Const        Estimate 42,238

Question: Is this convincing evidence of discrimination?
Prob[Results IF H0 True] for Discrimination: What is the probability that the coefficient estimate in one regression would be −2,240 or less, if H0 were true (if the actual coefficient, $\beta_{SexF1}$, equaled 0; that is, if no discrimination existed)?
Question: Can we use the tails probability? Answer: Yes.
  Prob[Results IF H0 True] = .4638/2 ≈ .23
[Figure: Probability distribution of the coefficient estimate $b_{SexF1}$, with a tail probability of .4638/2 ≈ .23.]
Question: Should we reject H0? Answer: Not at the “traditional” significance levels.
Question: Should we conclude that no discrimination exists?
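The halving of the tails probability can be written out explicitly. This is a sketch under the assumption that the reported Prob value is a two-tailed probability from a symmetric distribution; the function name `one_tailed_prob` and its `h1_sign` argument are hypothetical and are not part of EViews.

```python
def one_tailed_prob(two_tailed_prob: float, estimate: float, h1_sign: int) -> float:
    """Convert a two-tailed Prob value into a one-tailed Prob[Results IF H0 True].

    h1_sign is -1 when H1 says the coefficient is negative and +1 when H1 says
    it is positive. Assumes a symmetric distribution under H0.
    """
    if estimate * h1_sign > 0:
        # The estimate lies on the side predicted by H1: half the tails probability.
        return two_tailed_prob / 2
    # The estimate lies on the opposite side: the complement of the half.
    return 1 - two_tailed_prob / 2

# SexF1: estimate -2,240, reported Prob 0.4638, H1 says the coefficient is negative.
prob_discrimination = one_tailed_prob(0.4638, -2_240, -1)
print(round(prob_discrimination, 2))
```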
10
Taking Stock: Should we conclude that no discrimination exists?
EstSalaryMen = 42,238 + 2,447 Experience
EstSalaryWomen = 39,998 + 2,447 Experience
[Figure: EstSalary plotted against Experience. Two parallel lines with slope 2,447: the men's line has intercept 42,238 and the women's line has intercept 39,998, a vertical gap of 2,240.]

Question: What implicit assumptions are we making?
  Women start off about $2,240 behind and then remain behind by that same $2,240 throughout their careers.
  Each additional year of experience increases the salary of men and women by equal amounts.
Question: Might discrimination take another form?
11
Exper_SexF1 = Experience × SexF1
Different Type of Discrimination
Perhaps men receive higher annual raises than women. How can we model this?
  SexF1_t = 1 if female, 0 if male
  Interaction variable: Exper_SexF1_t = Experience_t × SexF1_t
  Model: $Salary_t = \beta_{Const} + \beta_{SexF1} SexF1_t + \beta_E Experience_t + \beta_{Exper\_Sex} Exper\_SexF1_t + e_t$

EViews: Dependent Variable: Salary; Number of Observations: 200
  SexF1         Estimate 10,970   Prob 0.0490
  Experience    Estimate 2,676    Prob 0.0000
  Exper_SexF1   Estimate −1,135   Prob 0.0050
  Const         Estimate 37,595
EstSalary = 37,595 + 10,970 SexF1 + 2,676 Experience − 1,135 Exper_SexF1

Now focus on SexF1 and Exper_SexF1 = Experience × SexF1:
  For men, SexF1 = 0, so Exper_SexF1 = 0.
  For women, SexF1 = 1, so Exper_SexF1 = Experience.
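Constructing the interaction variable amounts to multiplying the two columns element by element and adding the product as an extra explanatory variable. The sketch below assumes NumPy and uses synthetic, noise-free data generated from the slide's estimated equation, not the lecture's actual workfile, so the regression should simply recover the four coefficients that were put in.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic explanatory variables (assumption: not the lecture's data).
sex_f1 = rng.integers(0, 2, size=n).astype(float)       # 1 if female, 0 if male
experience = rng.integers(0, 40, size=n).astype(float)  # years of experience

# Interaction variable: Exper_SexF1 = Experience x SexF1.
exper_sex_f1 = experience * sex_f1

# Noise-free salaries built from the slide's estimated equation.
salary = 37_595 + 10_970 * sex_f1 + 2_676 * experience - 1_135 * exper_sex_f1

# Regress salary on a constant, SexF1, Experience, and the interaction variable.
X = np.column_stack([np.ones(n), sex_f1, experience, exper_sex_f1])
coef = np.linalg.lstsq(X, salary, rcond=None)[0]

for name, value in zip(["Const", "SexF1", "Experience", "Exper_SexF1"], coef):
    print(name, round(float(value), 2))
```

With real data the salaries would include an error term, and the estimates would differ from the true coefficients by sampling error.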
12
For men, SexF1 = 0 and Exper_SexF1 = 0
EViews: Dependent Variable: Salary; Number of Observations: 200
  SexF1         Estimate 10,970   Prob 0.0490
  Experience    Estimate 2,676    Prob 0.0000
  Exper_SexF1   Estimate −1,135   Prob 0.0050
  Const         Estimate 37,595
EstSalary = 37,595 + 10,970 SexF1 + 2,676 Experience − 1,135 Exper_SexF1

For men (SexF1 = 0, Exper_SexF1 = 0):
  EstSalaryMen = 37,595 + 0 + 2,676 Experience − 0
  EstSalaryMen = 37,595 + 2,676 Experience
For women (SexF1 = 1, Exper_SexF1 = Experience):
  EstSalaryWomen = 37,595 + 10,970 + 2,676 Experience − 1,135 Experience
  EstSalaryWomen = 48,565 + 1,541 Experience

Interpretation: We estimate that
  When first hired (when Experience = 0), women earn $10,970 MORE than men (SexF1).
  For each additional year of experience, women receive a $1,135 LOWER raise (Exper_SexF1).
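The intercept and slope for each group can be derived mechanically from the four estimates. A minimal sketch in plain Python using the slide's coefficient estimates; the helper `group_line` and the crossover calculation are illustrative additions, not part of the lecture. The crossover is simply the experience level at which the two estimated lines meet.

```python
# Coefficient estimates reported on the slide.
B_CONST, B_SEXF1, B_EXPER, B_INTER = 37_595, 10_970, 2_676, -1_135

def group_line(sex_f1: int) -> tuple:
    """Return (intercept, slope) of EstSalary against Experience for one group."""
    intercept = B_CONST + B_SEXF1 * sex_f1
    slope = B_EXPER + B_INTER * sex_f1
    return intercept, slope

men = group_line(0)    # SexF1 = 0, so Exper_SexF1 = 0
women = group_line(1)  # SexF1 = 1, so Exper_SexF1 = Experience

# Years of experience at which the two estimated lines cross:
# the women's starting advantage divided by their yearly raise shortfall.
crossover_years = B_SEXF1 / -B_INTER

print("men:", men)
print("women:", women)
print("crossover (years):", round(crossover_years, 1))
```

Under these estimates, women are estimated to earn more than men before the crossover and less afterward.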
13
EstSalary = 37,595 + 10,970 SexF1 + 2,676 Experience − 1,135 Exper_SexF1
EViews: Dependent Variable: Salary; Number of Observations: 200
  SexF1         Estimate 10,970   Prob 0.0490
  Experience    Estimate 2,676    Prob 0.0000
  Exper_SexF1   Estimate −1,135   Prob 0.0050
  Const         Estimate 37,595
Prob[Results IF H0 True] for SexF1 and Exper_SexF1.

For men, EstSalaryMen = 37,595 + 2,676 Experience
For women, EstSalaryWomen = 48,565 + 1,541 Experience
For each additional year of experience, women receive lower raises.

Question: What does the interaction variable, Exper_SexF1, reflect?
Answer: How experience affects the salary of men and women differently. The interaction variable captures the different impact that experience has on the two different groups.

[Figure: EstSalary plotted against Experience. The men's line has intercept 37,595 and slope 2,676; the women's line has intercept 48,565 and slope 1,541.]
14
Conclusions

Beware of Averages: We should not consider differences in averages, by themselves, as evidence of discrimination. When we just consider average salaries, we are implicitly adopting a model of salary determination that few, if any, people consider realistic. When just looking at averages, we are implicitly assuming that the ONLY factor that determines an individual’s salary is his/her sex. While many would argue that sex is one factor, I know no one who would argue that sex is the only factor.

Power of Multiple Regression Analysis: Since it is naïve to just consider averages, what quantitative tools should we use to assess the presence of discrimination? Multiple regression analysis is the appropriate tool. It allows us to consider the roles played by several factors in the determination of salary by separating out the individual influence of each factor: not only the role that sex may play, but also the role that the other factors may play. Multiple regression analysis sorts out the impact that each individual explanatory variable has on the dependent variable.

Flexibility of Multiple Regression Analysis: Not only does multiple regression analysis allow us to consider the roles played by various factors in salary determination, but it also allows us to consider them in various ways. Our example illustrates how we can assess the possible presence of “lump sum” discrimination and/or “raise” discrimination.
15
An Example: Internet and Television Use – An International Comparison
Internet and TV Workfile: 208 countries from 1995 to 2002
  LogUsersInternet_t: Logarithm of Internet users per 1,000 people
  LogUsersTV_t: Logarithm of television users per 1,000 people
  Year_t: 1995 to 2002
  CapitalHuman_t: Literacy rate (percent of population 15 and over)
  CapitalPhysical_t: Telephone mainlines per 1,000 people
  GdpPC_t: Per capita GDP (1,000’s of “international” dollars)
  Auth_t: Composite index derived from the Freedom House and the Heritage Foundation to measure political and economic authoritarianism. The index ranges from 0 to 10; 0 is the most democratic and 10 the most authoritarian. Canada and the U.S. have a 1.25 rating; Cuba and Libya have a rating.
Models: One regression for each dependent variable, LogUsersInternet and LogUsersTV, each with Year, CapitalHuman, CapitalPhysical, GdpPC, and Auth as explanatory variables.
16
Theory and Hypotheses
Question: How do Year, CapitalHuman, CapitalPhysical, GdpPC, and Auth affect Internet and television use?
A theory and a hypothesis are stated for each explanatory variable, separately for Internet use and for TV use:
  Year: Emerging versus mature technology
  CapitalHuman
  CapitalPhysical
  GdpPC
  Auth: Role of content control. An authoritarian nation has difficulty controlling the content of the Internet, but has complete control over the content of its television network.
17
EViews Dependent Variables
Use logarithms: LogUsersInternet and LogUsersTV. Why? We can interpret the coefficients as percent changes.

EViews: Dependent Variable: LogUsersInternet; Number of Observations: 566
  Explanatory variables: Year, CapitalHuman, CapitalPhysical, GdpPC, Auth, Const
  Year   Prob 0.0000
  Auth   Estimate −.096

EViews: Dependent Variable: LogUsersTV; Number of Observations: 742
  Explanatory variables: Year, CapitalHuman, CapitalPhysical, GdpPC, Auth, Const
  Year              Prob 0.1487
  CapitalHuman      Prob 0.0000
  CapitalPhysical   Prob 0.0002
  Const             Prob 0.1575
18
Summary Table
                    LogUsersInternet    LogUsersTV
  Year              *  (<.0001)            (.1487)
  CapitalHuman      *  (<.0001)         *  (<.0001)
  CapitalPhysical   *  (<.0001)         *  (.0001)
  GdpPC             *                   *
  Auth              −.096*              *
Prob[Results IF H0 True] is in the parentheses.
* Indicates significance at the 1 percent level.

Critical Regression Results: We estimate that
  CapitalHuman, CapitalPhysical, and GdpPC promote both Internet and television use.
  Internet use grows by an estimated 45 percent per year, whereas the growth rate of television use does not differ significantly from 0 at the traditional significance levels.
  An increase in the authoritarian index results in a significant decrease in Internet use, but a significant increase in television use.
19
EViews Interaction Variables
Interaction variables explore how one variable can affect the impact that another variable has on the dependent variable.
Question: Does the political nature of the nation affect the impact of GdpPC on Internet use?
Interaction variable: Auth_GdpPC = Auth × GdpPC

EViews: Dependent Variable: LogUsersInternet; Number of Observations: 566
  Explanatory variables: Year, CapitalHuman, CapitalPhysical, GdpPC, Auth, Auth_GdpPC, Const
  Year    Prob 0.0000
  GdpPC   Prob 0.0236

Focus attention on the effect of GdpPC, which enters the estimated equation through two terms: the GdpPC term and the Auth_GdpPC term.
[Table: Combined effect of GdpPC on EstLogUsersInternet, .033GdpPC plus the Auth_GdpPC term, evaluated at authoritarian index values of 2, 4, 6, and 8.]

What do the estimates suggest? As countries become more authoritarian, an additional $1,000 of per capita GDP will have a greater impact on Internet use. How can we explain this?
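The reasoning behind evaluating the GdpPC effect at different values of the authoritarian index can be written as a short derivation. The symbols $b_{GdpPC}$ and $b_{Auth\_GdpPC}$ are assumed names for the two coefficient estimates, following the $b_{Const}$ convention used earlier; their numerical values are not restated here.

```latex
\begin{aligned}
EstLogUsersInternet &= \dots + b_{GdpPC}\,GdpPC + b_{Auth\_GdpPC}\,Auth\_GdpPC + \dots \\
                    &= \dots + b_{GdpPC}\,GdpPC + b_{Auth\_GdpPC}\,(Auth \times GdpPC) + \dots \\
                    &= \dots + \left(b_{GdpPC} + b_{Auth\_GdpPC}\,Auth\right) GdpPC + \dots
\end{aligned}
```

The estimated effect of an additional unit of GdpPC is therefore $b_{GdpPC} + b_{Auth\_GdpPC}\,Auth$, which depends on Auth. If $b_{Auth\_GdpPC}$ is positive, the effect grows as the authoritarian index rises, which is consistent with the conclusion stated above.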