Econ 140 Lecture 17: Multiple Regression Applications II & III



Today’s Plan
Two topics and how they relate to multiple regression:
–Multicollinearity
–Dummy variables

Multicollinearity
Suppose we have the following regression equation: $Y = a + b_1 X_1 + b_2 X_2 + e$
Multicollinearity occurs when some or all of the independent X variables are linearly related.
Different forms of multicollinearity:
–Perfect: OLS estimation will not work
–Non-perfect: comes out of applied work

Multicollinearity Example
Again we’ll use returns to education, where:
–the dependent variable Y is (log) wages
–the independent variables (X’s) are age, experience, and years of schooling
Experience is defined as years in the labor force, or the difference between age and years of schooling:
–this can be written: Experience = Age - Years of school
–What’s the problem with this?

Multicollinearity Example (2)
Note that we’ve expressed experience as the difference of two of our other independent variables:
–by constructing experience in this manner we create a collinear dependence between age and experience
–the relationship between age and experience is linear: as age increases, for given years of schooling, experience also increases
We can write our regression equation for this example: $Wages = a + b_1 Experience + b_2 Age + e$

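Multicollinearity Example (3)
Recall that our estimate for $b_1$ is
$$\hat{b}_1 = \frac{\sum x_1 y \sum x_2^2 - \sum x_2 y \sum x_1 x_2}{\sum x_1^2 \sum x_2^2 - \left(\sum x_1 x_2\right)^2}$$
where $x_1$ = experience and $x_2$ = age, both measured as deviations from their means.
The problem is that $x_1$ and $x_2$ are linearly related:
–as we get closer to perfect linearity, the denominator will go to zero
–with perfect linearity the denominator is exactly zero and OLS won’t work!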

Multicollinearity Example (4)
Recall that the estimated variance for $\hat{b}_1$ is:
$$\widehat{Var}(\hat{b}_1) = \hat{\sigma}^2 \, \frac{\sum x_2^2}{\sum x_1^2 \sum x_2^2 - \left(\sum x_1 x_2\right)^2}$$
–So as $x_1$ and $x_2$ approach perfect collinearity, the denominator will go to zero and the estimated variance of $\hat{b}_1$ will increase
Implications:
–with multicollinearity, you will get large standard errors on partial coefficients
–your t-ratios, given the null hypothesis that the value of the coefficient is zero, will be small
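The denominator argument can be checked numerically. The following is a minimal sketch, assuming made-up ages and a made-up `noise` series (neither comes from L16_1.xls); the function names are illustrative only. It computes the shared denominator $\sum x_1^2 \sum x_2^2 - (\sum x_1 x_2)^2$ in deviation form as a second variable moves closer to an exact linear function of age.

```python
# Sketch: the shared denominator of the two-regressor OLS formulas shrinks
# toward zero as the two regressors become more collinear.
# All numbers below are hypothetical.

def deviations(values):
    """Return the values expressed as deviations from their mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def denominator(x1, x2):
    """sum(x1^2) * sum(x2^2) - (sum(x1 * x2))^2, in deviation form."""
    d1, d2 = deviations(x1), deviations(x2)
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    return s11 * s22 - s12 ** 2

age = [25, 30, 35, 40, 45, 50]
# Hypothetical disturbance that keeps the second variable from being exactly age - 20.
noise = [3, -2, 4, -3, 2, -4]

for scale in (1.0, 0.1, 0.0):
    experience = [a - 20 + scale * e for a, e in zip(age, noise)]
    print("noise scale", scale, "denominator", denominator(age, experience))
```

With the noise scaled to zero, the second variable is an exact linear function of age and the denominator is zero, so the slope formulas cannot be evaluated.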

More Multicollinearity Examples
On L16_1.xls we have individual data on age, years of education, weekly earnings, school age, and experience:
–we can perform a regression to calculate returns given age and experience
–we can also estimate bivariate models including only age, only experience, and only years of schooling
–we expect that the problem is that experience is related to age (to test this, we can regress age on experience)
–if the slope coefficient on experience is 1 and the fit is exact ($R^2 = 1$), there is perfect multicollinearity
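The check described above (regressing age on experience) can be sketched as follows. This is only an illustration with made-up ages and schooling values, not the data on L16_1.xls, and `simple_ols` is a hypothetical helper name. When schooling is the same for everyone, experience is age minus a constant, so the slope is 1 and the fit is exact; when schooling varies, the fit is no longer exact.

```python
# Sketch: bivariate regression of age on experience, reporting slope and R^2.
# All data are hypothetical.

def simple_ols(x, y):
    """Return (intercept, slope, r_squared) for the regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared

age = [25, 30, 35, 40, 45, 50]
schooling_constant = [12, 12, 12, 12, 12, 12]
schooling_varied = [12, 16, 12, 18, 10, 16]

# Experience = Age - Years of school, as defined in the lecture.
exp_constant = [a - s for a, s in zip(age, schooling_constant)]
exp_varied = [a - s for a, s in zip(age, schooling_varied)]

fit_constant = simple_ols(exp_constant, age)
fit_varied = simple_ols(exp_varied, age)
print("constant schooling:", fit_constant)
print("varied schooling:  ", fit_varied)
```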

More Multicollinearity Examples (2)
On L16_2.xls there’s a made-up example of perfect multicollinearity:
–OLS is unable to calculate the slope coefficients
–calculating the products and cross-products, we find that the denominator for the slope coefficients is zero, as predicted
If instead we have an applied problem with non-perfect multicollinearity, it has these properties:
1) OLS is still unbiased
2) Large variances and standard errors, which make hypothesis testing difficult
3) Few significant coefficients but a high $R^2$

More Multicollinearity Examples (3)
What to do with L16_2.xls?
–There’s simply not enough variation
–We can collect more data or rethink the model
–We can test for partial correlations between the X variables (as demonstrated on L16_1.xls)

Dummy variables
Dummy variables allow you to include qualitative variables (or variables that otherwise cannot be quantified) in your regression:
–examples include: gender, race, marital status, and religion
–they also become important when looking at “regime shifts”, which may be new policy initiatives, economic change, or seasonality
We will look at some examples:
–using female as a qualitative variable
–using marital status as a qualitative variable
–using the Phillips curve to demonstrate a regime shift

Qualitative example: female
We’ll construct a dummy variable for i = 1, …, n:
$D_i = 0$ if not female
$D_i = 1$ if female
–We can do this with any qualitative variable
–Note: assigning the values for the dummy variable is an arbitrary choice
On L17_1.xls there is a sample from the current CPS:
–to create the dummy variable “female” we assign the values one and zero to the CPS values of two and one for sex, respectively
–we can include the dummy variable in the regression equation like we would any other variable

Qualitative example: female (2)
We estimate an equation of the form $Y_i = a + b D_i + e_i$ [estimated coefficients not preserved]
Now we can ask: what are the expected earnings given that a person is male? Setting $D_i = 0$ gives $E(Y_i | D_i = 0) = a$.
Similarly, what are the expected earnings given that a person is female?
$E(Y_i | D_i = 1) = a + b(1) = a + b = 5.490$
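A small sketch of the same idea, assuming made-up log earnings and sex codes (these are not the CPS sample on L17_1.xls, and `dummy_regression` is a hypothetical helper). It recodes sex into the dummy “female” and shows that in a regression on the dummy alone, the intercept is the mean for males and the intercept plus the slope is the mean for females.

```python
# Sketch: regression of log earnings on a single dummy variable.
# All data are hypothetical.

def dummy_regression(y, d):
    """OLS of y on a constant and one regressor d; returns (a, b)."""
    n = len(y)
    mean_y, mean_d = sum(y) / n, sum(d) / n
    sdd = sum((di - mean_d) ** 2 for di in d)
    sdy = sum((di - mean_d) * (yi - mean_y) for di, yi in zip(d, y))
    b = sdy / sdd
    a = mean_y - b * mean_d
    return a, b

log_earnings = [6.0, 6.2, 5.8, 5.5, 5.4, 5.6, 6.1, 5.3]
sex_code = [1, 1, 1, 2, 2, 2, 1, 2]  # coded as in the lecture: 2 = female, 1 = male

# Recode: sex value 2 becomes 1 (female), sex value 1 becomes 0 (not female).
female = [1 if s == 2 else 0 for s in sex_code]

a, b = dummy_regression(log_earnings, female)
male_mean = sum(y for y, d in zip(log_earnings, female) if d == 0) / female.count(0)
female_mean = sum(y for y, d in zip(log_earnings, female) if d == 1) / female.count(1)

print("E(Y | D = 0) = a     =", a, "  male mean   =", male_mean)
print("E(Y | D = 1) = a + b =", a + b, "  female mean =", female_mean)
```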

Qualitative example: female (4)
We can use other variables to extend our analysis; for example, we can include age to get the equation: $Y = a + b_1 D_i + b_2 X_i + e$
–where $X_i$ can be any or all relevant variables
–$D_i$ and the related coefficient $b_1$ will indicate how much less, on average, females earn than males
–for males the intercept will be $a$
–for females the intercept will be $a + b_1$

Qualitative example: female (5)
The estimated regression found on the spreadsheet is: [equation not preserved]
The expected weekly earnings for men are: [value not preserved]
The expected weekly earnings for women are: [value not preserved]

Qualitative example: female (6)
An important note: We can’t include dummy variables for both male and female in the same regression equation
–suppose we have $Y = a + b_1 D_{1i} + b_2 D_{2i} + e$
–where:
$D_{1i} = 0$ if male, $D_{1i} = 1$ if female
$D_{2i} = 0$ if female, $D_{2i} = 1$ if male
–OLS won’t be able to estimate the regression coefficients because $D_{1i}$ and $D_{2i}$ show perfect multicollinearity with the intercept a ($D_{1i} + D_{2i} = 1$ for every observation)
So if a qualitative variable has m categories, you should include (m-1) dummy variables in the regression equation
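The problem can be seen directly in the cross-product matrix. The sketch below uses a made-up six-observation sample and hypothetical helper names; it shows that with a constant plus both dummies the determinant of X'X is zero (so the normal equations have no unique solution), while dropping one dummy gives a nonzero determinant.

```python
# Sketch: the "dummy variable trap" as a singular cross-product matrix.
# The sample is hypothetical.

def gram_matrix(columns):
    """Return X'X for a design matrix given as a list of columns."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

female = [1, 0, 0, 1, 1, 0]          # D1: 1 if female
male = [1 - f for f in female]       # D2: 1 if male
constant = [1] * len(female)

# Constant + both dummies: the constant column equals D1 + D2.
det_both = det3(gram_matrix([constant, female, male]))
# Constant + one dummy: no exact linear dependence.
det_one = det2(gram_matrix([constant, female]))

print("det(X'X) with both dummies:", det_both)
print("det(X'X) with one dummy:   ", det_one)
```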

Example: marital status
The spreadsheet (L17_1.xls) also estimates the following regression equation using two distinct dummy variables: [equation not preserved]
–where:
$D_{1i} = 0$ if male, $D_{1i} = 1$ if female
$D_{2i} = 0$ if other, $D_{2i} = 1$ if married
Using the regression equation we can create four categories: married males, unmarried males, married females, and unmarried females

Example: marital status (2)
Expected earnings for unmarried males: [value not preserved]
Expected earnings for unmarried females: [value not preserved]
Expected earnings for married males: [value not preserved]
Expected earnings for married females: [value not preserved]
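The estimated equation and the four values are not preserved here. As an illustration only, assume the regression has the additive form $Y_i = a + b_1 D_{1i} + b_2 D_{2i} + e_i$ with no other regressors (this functional form is an assumption, not taken from the spreadsheet). Substituting the four combinations of the dummies would then give:

```latex
\begin{aligned}
\text{Unmarried males:}   \quad & E(Y_i \mid D_{1i}=0,\, D_{2i}=0) = a \\
\text{Unmarried females:} \quad & E(Y_i \mid D_{1i}=1,\, D_{2i}=0) = a + b_1 \\
\text{Married males:}     \quad & E(Y_i \mid D_{1i}=0,\, D_{2i}=1) = a + b_2 \\
\text{Married females:}   \quad & E(Y_i \mid D_{1i}=1,\, D_{2i}=1) = a + b_1 + b_2
\end{aligned}
```

Under that assumed form, the effect of being married is the same ($b_2$) for males and females, because the two dummies shift only the intercept.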

Interactive terms
So far we’ve only used dummy variables to change the intercept.
We can also use dummy variables to alter the partial slope coefficients.
Let’s think about this model: $W_i = a + b_1 Age_i + b_2 Married_i + e_i$
–we could argue that the coefficients would be different for males and females
–we want to think about two sub-sample groups: males and females
–we can test the hypothesis that the partial slope coefficients will be different for these 2 groups

Interactive terms (2)
To test our hypothesis we’ll estimate the regression equation for the whole sample and then for the two sub-sample groups.
We test to see if our estimated coefficients are the same between males and females.
Our null hypothesis is: $H_0: a_M = a_F,\; b_{1M} = b_{1F},\; b_{2M} = b_{2F}$

Interactive terms (3)
We have an unrestricted form and a restricted form:
–unrestricted: used when we estimate for the sub-sample groups separately
–restricted: used when we estimate for the whole sample
What type of statistic will we use to carry out this test?
–F-statistic: $F = \frac{(SSR_R - SSR_U)/q}{SSR_U/(n - 2k)}$
–q = k, the number of parameters in the model
–$n = n_1 + n_2$, where n is the complete sample size

Interactive terms (4)
The sum of squared residuals for the unrestricted form will be: $SSR_U = SSR_M + SSR_F$
On L17_2.xls:
–the data are sorted according to the dummy variable “female”
–there is a second dummy variable for marital status
–there are 3 estimated regression equations, one each for the total sample, male sub-sample, and female sub-sample

Interactive terms (5)
The output allows us to gather the necessary sums of squared residuals and sample sizes to construct the estimate $F^*$ [value not preserved]
–Since $F_{0.05,3,27} = 2.96 > F^*$, we cannot reject the null hypothesis that the partial slope coefficients are the same for males and females
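The calculation can be sketched in a few lines of Python. The sums of squared residuals and sub-sample sizes below are made up (the spreadsheet values are not preserved); only the critical value 2.96 and the degrees of freedom (3, 27) come from the lecture. The statistic is the usual sub-sample F-test, with the unrestricted sum of squared residuals taken as the male plus the female sub-sample values.

```python
# Sketch: F-statistic for testing equality of coefficients across two sub-samples.
# The SSR values and sample sizes are hypothetical.

def chow_f(ssr_restricted, ssr_unrestricted, n, k):
    """F = ((SSR_R - SSR_U) / q) / (SSR_U / (n - 2k)), with q = k."""
    q = k
    return ((ssr_restricted - ssr_unrestricted) / q) / (ssr_unrestricted / (n - 2 * k))

ssr_whole_sample = 12.0        # restricted: one regression on the whole sample
ssr_male, ssr_female = 5.0, 5.5
n_male, n_female = 17, 16      # chosen so that n - 2k = 27, matching F(3, 27)
k = 3                          # intercept, age, married

ssr_unrestricted = ssr_male + ssr_female
f_star = chow_f(ssr_whole_sample, ssr_unrestricted, n_male + n_female, k)
critical_value = 2.96          # F(0.05; 3, 27), as given in the lecture

print("F* =", f_star)
print("reject H0" if f_star > critical_value else "cannot reject H0")
```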

Interactive terms (6)
What if $F^* > F_{0.05,3,27}$? How to read the results?
–There’s a difference between the two sub-samples and therefore we should estimate the wage equations separately
–Or we could interact the dummy variables with the other variables
To interact the dummy variable with the age and marital status variables, we multiply the dummy variable by each of them to get:
$W_i = a + b_1 Age_i + b_2 Married_i + b_3 D_i + b_4 (D_i \cdot Age_i) + b_5 (D_i \cdot Married_i) + e_i$

Interactive terms (7)
Using L17_2.xls you can construct the interactive terms by multiplying the FEMALE column by the AGE and MARRIED columns:
–one way to see if the two sub-samples are different is to look at the t-ratios on the interactive terms
–in this example, neither of the t-ratios is statistically significant, so we can’t reject the null hypothesis
We now know how to use dummy variables to indicate the importance of sub-sample groups within the data
–dummy variables are also useful for testing for structural breaks or regime shifts

Interactive terms (8)
If we want to estimate the equation for the first sub-sample (males) we take the expectation of the wage equation where the dummy variable for female takes the value of zero:
$E(W_i | D_i = 0) = a + b_1 Age_i + b_2 Married_i$
We can do the same for the second sub-sample (females):
$E(W_i | D_i = 1) = (a + b_3) + (b_1 + b_4) Age_i + (b_2 + b_5) Married_i$
We can see that by using only one regression equation, we have allowed the intercept and partial slope coefficients to vary by sub-sample
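The claim that one interacted equation reproduces the separate sub-sample equations can be checked numerically. The sketch below is a simplified version under stated assumptions: it uses made-up data, drops the Married variable to keep the example short, and solves the normal equations with a small hand-written solver (`solve` and `ols` are illustrative helpers, not part of the lecture materials). The pooled coefficients are ordered as intercept, age, female dummy, and female-by-age interaction.

```python
# Sketch: a pooled regression with a dummy and an interaction term gives the
# same intercepts and slopes as separate regressions on each sub-sample.
# Data are hypothetical; Married is omitted for brevity.

def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(map(float, row)) + [float(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def ols(columns, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    xtx = [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]
    xty = [sum(a * b for a, b in zip(ci, y)) for ci in columns]
    return solve(xtx, xty)

age = [25, 30, 35, 40, 45, 28, 33, 38, 43, 48]
female = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
wage = [5.6, 5.9, 6.0, 6.3, 6.4, 5.3, 5.4, 5.6, 5.6, 5.8]

constant = [1.0] * len(age)
interaction = [d * a for d, a in zip(female, age)]
pooled = ols([constant, age, female, interaction], wage)

def subsample_fit(flag):
    """Regress wage on a constant and age within one sub-sample."""
    rows = [(a, w) for a, d, w in zip(age, female, wage) if d == flag]
    ages = [a for a, _ in rows]
    wages = [w for _, w in rows]
    return ols([[1.0] * len(rows), ages], wages)

male_fit = subsample_fit(0)
female_fit = subsample_fit(1)

print("males:   pooled", (pooled[0], pooled[1]), "separate", tuple(male_fit))
print("females: pooled", (pooled[0] + pooled[2], pooled[1] + pooled[3]),
      "separate", tuple(female_fit))
```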

Phillips Curve example
The Phillips curve is an example of a regime shift.
Data points from [years not preserved]: there is a downward-sloping, reciprocal relationship between wage inflation and unemployment.
[Figure: wage inflation (W) plotted against unemployment (UN)]

Phillips Curve example (2)
But if we look at data points from [years not preserved]: from the data we can detect an upward-sloping relationship.
[Figure: wage inflation (W) plotted against unemployment (UN)]

Phillips Curve example (3)
There seems to be a regime shift between the two periods
–note: this is an arbitrary choice of regime shift; it was not dictated by a specific change
We will use the Chow Test (F-test) to test for this regime shift:
–the test will use a restricted form: [equation not preserved]
–it will also use an unrestricted form: [equation not preserved]
–D is the dummy variable for the regime shift, equal to 0 for the first period and 1 for the second period

Phillips Curve example (4)
L17_3.xls estimates the restricted regression equations and calculates the F-statistic for the Chow Test.
The null hypothesis will be: $H_0: b_1 = b_3 = 0$
–we are testing to see if the dummy variable for the regime shift alters the intercept or the slope coefficient
The F-statistic is (* indicates restricted): $F = \frac{(SSR^* - SSR)/q}{SSR/(n - k)}$
where q = 2 and k is the number of parameters in the unrestricted form.

Phillips Curve example (5)
The expectation of wage inflation for the first time period: [equation not preserved]
The expectation of wage inflation for the second time period: [equation not preserved]
You can use the spreadsheet data to carry out these calculations
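The restricted and unrestricted equations are not preserved in this transcript. One specification that would be consistent with the reciprocal relationship between wage inflation and unemployment and with the null hypothesis $H_0: b_1 = b_3 = 0$ is sketched below. The placement of $b_1$ on the dummy and $b_3$ on the interaction is an assumption inferred from that null hypothesis, not a reproduction of the original slides.

```latex
\begin{aligned}
\text{Restricted:}   \quad & W_t = a + b_2 \frac{1}{UN_t} + e_t \\
\text{Unrestricted:} \quad & W_t = a + b_1 D_t + b_2 \frac{1}{UN_t}
                              + b_3 \left( D_t \cdot \frac{1}{UN_t} \right) + e_t \\[6pt]
\text{First period } (D_t = 0): \quad & E(W_t \mid D_t = 0) = a + b_2 \frac{1}{UN_t} \\
\text{Second period } (D_t = 1): \quad & E(W_t \mid D_t = 1) = (a + b_1) + (b_2 + b_3) \frac{1}{UN_t}
\end{aligned}
```

Under this assumed form the restricted equation is the unrestricted one with the two restrictions $b_1 = 0$ and $b_3 = 0$ imposed, which matches q = 2, and the unrestricted form has four parameters.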

What we’ve learned
Multicollinearity:
–linear relationship between independent variables
–examples
Dummy variables:
–way to include qualitative variables in regressions
–examples