Prof. Dr. Rainer Stachuletz: Multiple Regression Analysis, $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$. 5. Dummy Variables

Presentation transcript:

Slide 1: Multiple Regression Analysis, $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$. 5. Dummy Variables

Slide 2: Dummy Variables
- A dummy variable is a variable that takes on the value 1 or 0.
- Examples: male (= 1 if male, 0 otherwise), south (= 1 if in the south, 0 otherwise), etc.
- Dummy variables are also called binary variables, because they take only two values.

Slide 3: A Dummy Independent Variable
- Consider a simple model with one continuous variable (x) and one dummy (d): $y = \beta_0 + \delta_0 d + \beta_1 x + u$
- This can be interpreted as an intercept shift.
- If d = 0, then $y = \beta_0 + \beta_1 x + u$
- If d = 1, then $y = (\beta_0 + \delta_0) + \beta_1 x + u$
- The case of d = 0 is the base group.
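
The intercept shift can be checked numerically. The following is a minimal sketch, not part of the original slides: the helper `ols`, the data, and the "true" coefficient values are all made up for illustration, and the data are noise-free so that least squares recovers the coefficients up to rounding error.

```python
def ols(X, y):
    """Least squares via the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination. Hypothetical helper for illustration."""
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        # Partial pivoting: bring the largest remaining entry into the pivot row.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]

# Assumed "true" values, chosen only for this example.
beta0, delta0, beta1 = 1.0, 2.0, 0.5

# Four x values observed in each group (d = 0 is the base group).
data = [(d, x) for d in (0, 1) for x in (0.0, 1.0, 2.0, 3.0)]
y = [beta0 + delta0 * d + beta1 * x for d, x in data]
X = [[1.0, float(d), x] for d, x in data]  # columns: constant, d, x

b0_hat, d0_hat, b1_hat = ols(X, y)
# The d = 1 group has intercept b0_hat + d0_hat; both groups share slope b1_hat.
print(b0_hat, d0_hat, b1_hat)
```

With real data the estimates would of course differ from the true values; in practice one would use a regression library rather than a hand-written solver.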

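Slide 4: Example of $\delta_0 > 0$
[Figure: y plotted against x. Two parallel lines with slope $\beta_1$: the d = 0 line $y = \beta_0 + \beta_1 x$ with intercept $\beta_0$, and the d = 1 line $y = (\beta_0 + \delta_0) + \beta_1 x$ lying above it by the vertical distance $\delta_0$.]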

Slide 5: Dummies for Multiple Categories
- We can use dummy variables to control for something with multiple categories.
- Suppose everyone in your data is either a HS dropout, HS grad only, or college grad.
- To compare HS and college grads to HS dropouts, include 2 dummy variables:
- hsgrad = 1 if HS grad only, 0 otherwise; and colgrad = 1 if college grad, 0 otherwise.

Slide 6: Multiple Categories (cont)
- Any categorical variable can be turned into a set of dummy variables.
- Because the base group is represented by the intercept, if there are n categories there should be n - 1 dummy variables.
- If there are a lot of categories, it may make sense to group some together.
- Example: top 10 ranking, 11 to 25, etc.
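
As an illustration of the n - 1 rule, here is a small sketch that turns a categorical variable into dummy columns while leaving out the base group. The function name `make_dummies` and the sample data are hypothetical, not taken from the slides.

```python
def make_dummies(values, base):
    """Return one 0/1 column per category, omitting the base group.

    The base group is absorbed by the regression intercept, so n categories
    yield n - 1 dummy columns.
    """
    categories = sorted(set(values) - {base})
    return {c: [1 if v == c else 0 for v in values] for c in categories}

# Made-up sample with three education categories.
education = ["dropout", "hsgrad", "colgrad", "hsgrad", "dropout"]
dummies = make_dummies(education, base="dropout")

for name, column in dummies.items():
    print(name, column)
```

Including a dummy for the base group as well would make the dummy columns sum to the constant column, which is why one category is left out.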

Slide 7: Interactions Among Dummies
- Interacting dummy variables is like subdividing the group.
- Example: have dummies for male, as well as hsgrad and colgrad.
- Add male*hsgrad and male*colgrad, for a total of 5 dummy variables, giving 6 categories.
- Base group is female HS dropouts.
- hsgrad is for female HS grads, colgrad is for female college grads.
- The interactions reflect male HS grads and male college grads.

Slide 8: More on Dummy Interactions
Formally, the model is $y = \beta_0 + \delta_1\,\text{male} + \delta_2\,\text{hsgrad} + \delta_3\,\text{colgrad} + \delta_4\,\text{male} \cdot \text{hsgrad} + \delta_5\,\text{male} \cdot \text{colgrad} + \beta_1 x + u$. Then, for example:
- If male = 0, hsgrad = 0 and colgrad = 0: $y = \beta_0 + \beta_1 x + u$
- If male = 0, hsgrad = 1 and colgrad = 0: $y = \beta_0 + \delta_2\,\text{hsgrad} + \beta_1 x + u$
- If male = 1, hsgrad = 0 and colgrad = 1: $y = \beta_0 + \delta_1\,\text{male} + \delta_3\,\text{colgrad} + \delta_5\,\text{male} \cdot \text{colgrad} + \beta_1 x + u$
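
To make the six categories concrete, the intercepts implied by this model can be written out group by group. This is a worked sketch, assuming the intercept is written $\beta_0$ and the dummy coefficients $\delta_1, \dots, \delta_5$ in the order male, hsgrad, colgrad, male*hsgrad, male*colgrad; each line substitutes the 0/1 values of the dummies for that group.

```latex
\begin{aligned}
\text{female, HS dropout (base):} \quad & \beta_0 \\
\text{female, HS grad:} \quad & \beta_0 + \delta_2 \\
\text{female, college grad:} \quad & \beta_0 + \delta_3 \\
\text{male, HS dropout:} \quad & \beta_0 + \delta_1 \\
\text{male, HS grad:} \quad & \beta_0 + \delta_1 + \delta_2 + \delta_4 \\
\text{male, college grad:} \quad & \beta_0 + \delta_1 + \delta_3 + \delta_5
\end{aligned}
```

All six groups share the slope $\beta_1$ on x; only the intercept differs.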

Slide 9: Other Interactions with Dummies
- Can also consider interacting a dummy variable, d, with a continuous variable, x: $y = \beta_0 + \delta_1 d + \beta_1 x + \delta_2\, d \cdot x + u$
- If d = 0, then $y = \beta_0 + \beta_1 x + u$
- If d = 1, then $y = (\beta_0 + \delta_1) + (\beta_1 + \delta_2) x + u$
- This is interpreted as a change in the slope.
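
One consequence of letting both the intercept and the slope differ is that the gap between the two groups depends on x. The short derivation below is a sketch, writing $\delta_1$ for the coefficient on d and $\delta_2$ for the coefficient on d*x, and assuming the error has zero mean conditional on d and x.

```latex
\begin{aligned}
E(y \mid d = 1, x) - E(y \mid d = 0, x)
  &= \bigl[(\beta_0 + \delta_1) + (\beta_1 + \delta_2)\,x\bigr] - \bigl[\beta_0 + \beta_1 x\bigr] \\
  &= \delta_1 + \delta_2\,x .
\end{aligned}
```

If $\delta_2 \neq 0$, the two lines cross at $x^{*} = -\delta_1 / \delta_2$, so $\delta_1$ alone measures the group difference only at x = 0.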

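Slide 10: Example of $\delta_0 > 0$ and $\delta_1 < 0$
[Figure: y plotted against x. The d = 0 line is $y = \beta_0 + \beta_1 x$; the d = 1 line is $y = (\beta_0 + \delta_0) + (\beta_1 + \delta_1) x$, with a higher intercept and a flatter slope. The figure labels the intercept and slope shifts $\delta_0$ and $\delta_1$; they correspond to the coefficients on d and d*x (subscripts 1 and 2) in the equation on the previous slide.]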

Slide 11: Testing for Differences Across Groups
- Testing whether a regression function is different for one group versus another can be thought of as simply testing for the joint significance of the dummy and its interactions with all other x variables.
- So, you can estimate the model with all the interactions and without and form an F statistic, but this could be unwieldy.

Slide 12: The Chow Test
- It turns out you can compute the proper F statistic without running the unrestricted model with interactions with all k continuous variables.
- Run the restricted model for group one to get $SSR_1$, then for group two to get $SSR_2$.
- Run the restricted model for all observations to get $SSR$; then
  $F = \dfrac{SSR - (SSR_1 + SSR_2)}{SSR_1 + SSR_2} \cdot \dfrac{n - 2(k + 1)}{k + 1}$

Slide 13: The Chow Test (continued)
- The Chow test is really just a simple F test for exclusion restrictions, but we've realized that $SSR_{ur} = SSR_1 + SSR_2$.
- Note, we have k + 1 restrictions (each of the slope coefficients and the intercept).
- Note the unrestricted model would estimate 2 different intercepts and 2 different sets of k slope coefficients, so the df is n - 2k - 2.
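
The pieces described above (unrestricted SSR equal to $SSR_1 + SSR_2$, k + 1 restrictions, and n - 2k - 2 degrees of freedom) combine into the usual F ratio. The function below is a minimal sketch of that arithmetic; the name `chow_f` and the input numbers are made up for illustration, and the three SSR values are assumed to come from regressions run elsewhere.

```python
def chow_f(ssr_pooled, ssr1, ssr2, n, k):
    """Chow F statistic from three sums of squared residuals.

    ssr_pooled: SSR from the restricted (pooled) regression on all n observations
    ssr1, ssr2: SSRs from the separate regressions for group one and group two
    k:          number of slope coefficients in each regression
    """
    ssr_ur = ssr1 + ssr2          # unrestricted SSR
    q = k + 1                     # restrictions: k slopes plus the intercept
    df = n - 2 * (k + 1)          # unrestricted degrees of freedom
    if df <= 0:
        raise ValueError("not enough observations for the unrestricted model")
    return ((ssr_pooled - ssr_ur) / q) / (ssr_ur / df)

# Hypothetical numbers: 104 observations, one regressor.
f_stat = chow_f(ssr_pooled=100.0, ssr1=40.0, ssr2=40.0, n=104, k=1)
print(f_stat)  # approximately 12.5, to be compared with an F(2, 100) critical value
```

The statistic is compared with the F distribution with (k + 1, n - 2k - 2) degrees of freedom.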

Slide 14: Linear Probability Model
- $P(y = 1 \mid x) = E(y \mid x)$ when y is a binary variable, so we can write our model as $P(y = 1 \mid x) = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k$
- So, the interpretation of $\beta_j$ is the change in the probability of success when $x_j$ changes by one unit, holding the other variables fixed.
- The predicted y is the predicted probability of success.
- A potential problem is that the predicted probability can fall outside [0,1].

Slide 15: Linear Probability Model (cont)
- Even without predictions outside of [0,1], we may estimate effects that imply a change in x changes the probability by more than +1 or -1, so it is best to evaluate changes near the mean.
- This model violates the assumption of homoskedasticity, because $Var(y \mid x) = P(y = 1 \mid x)\,[1 - P(y = 1 \mid x)]$ depends on x, so the usual inference is affected.
- Despite its drawbacks, it's usually a good place to start when y is binary.
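
The out-of-range problem is easy to reproduce. The sketch below fits a linear probability model with a single regressor using the closed-form simple-regression formulas; the helper `simple_ols` and the eight data points are invented for this example and are not from the slides.

```python
def simple_ols(x, y):
    """Closed-form OLS for y = b0 + b1 * x with one regressor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Made-up data: the binary outcome switches from 0 to 1 as x increases.
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [0, 0, 0, 0, 1, 1, 1, 1]

b0, b1 = simple_ols(x, y)
fitted = [b0 + b1 * xi for xi in x]   # predicted probabilities of success

# Collect the observations whose fitted "probability" is not in [0, 1].
outside = [(xi, round(p, 3)) for xi, p in zip(x, fitted) if p < 0 or p > 1]
print(b0, b1)
print(outside)
```

In this toy sample the fitted line dips below 0 at the smallest x and rises above 1 at the largest, even though every fitted value near the middle of the data is a sensible probability.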

Slide 16: Caveats on Program Evaluation
- A typical use of a dummy variable is when we are looking for a program effect.
- For example, we may have individuals that received job training, or welfare, etc.
- We need to remember that usually individuals choose whether to participate in a program, which may lead to a self-selection problem.

Slide 17: Self-selection Problems
- If we can control for everything that is correlated with both participation and the outcome of interest, then it's not a problem.
- Often, though, there are unobservables that are correlated with participation.
- In this case, the estimate of the program effect is biased, and we don't want to set policy based on it!