17-1 McGraw-Hill/Irwin Copyright © 2013 by The McGraw-Hill Companies, Inc. All rights reserved. Business Statistics: Communicating with Numbers By Sanjiv Jaggia and Alison Kelly


17-1 McGraw-Hill/Irwin Copyright © 2013 by The McGraw-Hill Companies, Inc. All rights reserved. Business Statistics: Communicating with Numbers By Sanjiv Jaggia and Alison Kelly

17-2 Chapter 17 Learning Objectives (LOs)
LO 17.1: Use dummy variables to capture a shift of the intercept.
LO 17.2: Test for differences between the categories of a qualitative variable.
LO 17.3: Use dummy variables to capture a shift of the intercept and/or slope.

17-3 Is There Evidence of Wage Discrimination? Three Seton Hall professors recently learned in a court decision that they could pursue their lawsuit alleging the University paid higher salaries to younger instructors and male professors. Mary Schweitzer works in human resources at another college and has been asked by the college to test for age and gender discrimination in salaries. She gathers data on 42 professors, including the salary, experience, gender, and age of each.

17-4 Is There Evidence of Wage Discrimination? Using this data set, Mary hopes to:
1. Test whether salary differs by a fixed amount between males and females.
2. Determine whether there is evidence of age discrimination in salaries.
3. Determine if the salary difference between males and females increases with experience.

17-5 Dummy Variables In previous chapters, all the variables used in regression applications have been quantitative. In empirical work it is common to have some variables that are qualitative: the values represent categories that may have no implied ordering. We can include these factors in a regression through the use of dummy variables. A dummy variable for a qualitative variable with two categories assigns a value of 1 for one of the categories and a value of 0 for the other. LO 17.1 Use dummy variables to capture a shift of the intercept.

17-6 Variables with Two Categories For example, suppose we are interested in determining the impact of gender on salary. We might first define a dummy variable d (a more descriptive name, e.g., Dgender, is preferable in practice) with the following structure: let d = 1 if gender = “female” and d = 0 if gender = “male.” This allows us to include a measure for gender in a regression model and quantify the impact of gender on salary. LO 17.1
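The coding just described can be sketched in a few lines of Python. This is only an illustration: the function name gender_dummy and the sample labels are assumptions, not part of the slides.

```python
def gender_dummy(gender: str) -> int:
    """Return d = 1 for "female" and d = 0 for "male", as defined on the slide."""
    label = gender.strip().lower()
    if label == "female":
        return 1
    if label == "male":
        return 0
    raise ValueError(f"unexpected gender label: {gender!r}")

# Hypothetical sample of labels, for illustration only.
sample = ["Male", "Female", "male"]
d = [gender_dummy(g) for g in sample]
print(d)
```

Raising an error on an unexpected label, rather than silently coding it as 0, keeps miscoded rows from being absorbed into the reference category.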

17-7 Regression with a Dummy Variable LO 17.1

17-8 Regression with a Dummy Variable LO 17.1

17-9 Regression with a Dummy Variable Graphically, we can see how the dummy variable shifts the intercept of the regression line. LO 17.1

17-10 Salaries, Gender, and Age LO 17.1
d1 = 1 for male and 0 for female; d2 = 0 for young and 1 for old.
Table columns: Salary, Exper, d1, d2, Gender, Age.
Gender and Age entries for the rows shown: Male, Under; Male, Under; Female, Under; Male, Over; Male, Under; Female, Over; Male, Under.

17-11 Estimation Results LO 17.1
The estimated model is ŷ = 40.61 + 1.13x + 13.92d1 + 4.34d2 (salary in $1,000s).
b. The predicted salary of a 50-year-old male professor (d1 = 1 and d2 = 0) with 10 years of experience (x = 10) is ŷ = 40.61 + 1.13(10) + 13.92(1) + 4.34(0) = 65.83, or $65,830. The corresponding salary of a 50-year-old female (d1 = 0 and d2 = 0) is ŷ = 40.61 + 1.13(10) + 13.92(0) + 4.34(0) = 51.91, or $51,910.
The predicted difference in salary between a male and a female professor with 10 years of experience is $13,920 (65,830 − 51,910). This difference can also be inferred from the estimated coefficient of the gender dummy variable d1. Note that the salary difference does not change with experience. For instance, the predicted salary of a 50-year-old male with 20 years of experience is $77,130. The corresponding salary of a 50-year-old female is $63,210, for the same difference of $13,920.

17-12 Estimation Results LO 17.1 c. For a 65-year-old female professor with 10 years of experience (d1 = 0 and d2 = 1), the predicted salary is ŷ = 40.61 + 1.13(10) + 13.92(0) + 4.34(1) = 56.25, or $56,250. Prior to any statistical testing, it appears that an older female professor earns, on average, $4,340 (56,250 − 51,910) more than a younger female professor with the same experience.
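The predictions in parts b and c can be checked with a short Python sketch. The coefficients below are not quoted from the estimation output. They are the rounded values implied by the predictions above: 13.92 from the male-female gap, 4.34 from the age gap, 1.13 per year from the 10-year and 20-year predictions, and 40.61 left over for the intercept. The function name predicted_salary is an assumption.

```python
# Coefficients (in $1,000s) inferred from the predictions quoted on the slides;
# treat them as rounded, illustrative values rather than the exact estimates.
B0, B_EXPER, B_MALE, B_OLD = 40.61, 1.13, 13.92, 4.34

def predicted_salary(exper: float, d1: int, d2: int) -> float:
    """Predicted salary in $1,000s; d1 = 1 for male, d2 = 1 for old."""
    return B0 + B_EXPER * exper + B_MALE * d1 + B_OLD * d2

# The male-female gap is the same at every experience level,
# because d1 shifts only the intercept.
for exper in (10, 20):
    male = predicted_salary(exper, d1=1, d2=0)
    female = predicted_salary(exper, d1=0, d2=0)
    print(exper, round(male, 2), round(female, 2), round(male - female, 2))

# Older female professor with 10 years of experience (part c).
print(round(predicted_salary(10, d1=0, d2=1), 2))
```

Because the dummy enters additively, the gap printed for 10 and 20 years of experience is identical; capturing a gap that grows with experience would require an interaction term, which is the subject of LO 17.3.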

17-13 Testing the Significance of Dummy Variables The statistical tests discussed in Chapter 15 remain valid for dummy variables as well. We can perform a t-test (using p-value) for individual significance, form a confidence interval using the parameter estimate and its standard error, and conduct a partial F test for joint significance. LO 17.2 Test for differences between the categories of a qualitative variable.
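As a rough sketch of the partial F test mentioned above, the statistic compares the sum of squared errors (SSE) of a restricted model, which omits the dummy variables being tested, with that of the unrestricted model. The function name partial_f and the SSE values are hypothetical; only n = 42 professors and k = 3 explanatory variables (experience, d1, d2) follow the salary example.

```python
def partial_f(sse_restricted, sse_unrestricted, num_restrictions, n, k):
    """Partial F statistic with (num_restrictions, n - k - 1) degrees of freedom.

    k is the number of explanatory variables in the unrestricted model.
    """
    df2 = n - k - 1
    numerator = (sse_restricted - sse_unrestricted) / num_restrictions
    denominator = sse_unrestricted / df2
    return numerator / denominator

# Hypothetical sums of squared errors, for illustration only.
f_stat = partial_f(sse_restricted=1200.0, sse_unrestricted=900.0,
                   num_restrictions=2, n=42, k=3)
print(round(f_stat, 2))
```

The resulting value would be compared with the critical value of the F distribution with 2 and 38 degrees of freedom, or converted to a p-value, to decide whether the two dummy variables are jointly significant.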

17-14 Example 17.2 LO 17.2

17-15 Multiple Categories LO 17.2

17-16 Multiple Categories LO 17.2
Category  d1  d2
Public    1   0
Alone     0   1
Carpool   0   0
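The coding in this table can be produced with a small helper, sketched below in Python. The function name encode_dummies and its arguments are assumptions; Carpool is treated as the reference category because it is the row coded with all zeros.

```python
def encode_dummies(value, categories, reference):
    """Return one 0/1 dummy per non-reference category, in the given order."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories if c != reference]

categories = ["Public", "Alone", "Carpool"]
for mode in categories:
    # Prints the (d1, d2) pair for each commuting mode.
    print(mode, encode_dummies(mode, categories, reference="Carpool"))
```

Choosing a different reference category changes which coefficients are reported but not the fitted values: each dummy coefficient is always measured relative to the omitted category.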

17-17 Avoiding the Dummy Variable Trap Given the intercept term, we exclude one of the dummy variables from the regression. If we included as many dummy variables as categories, the dummy variables would sum to 1 for every observation, duplicating the intercept column; this creates perfect multicollinearity in the data, and such a model cannot be estimated. So, we include one fewer dummy variable than the number of categories of the qualitative variable. LO 17.2
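The source of the trap can be seen directly in the data matrix. The Python sketch below uses a hypothetical sample of commuting modes and assumed helper names; it checks whether the intercept column equals the sum of the dummy columns in every row, which is the exact linear dependence that prevents estimation.

```python
# Hypothetical sample of commuting modes, for illustration only.
modes = ["Public", "Alone", "Carpool", "Alone", "Public"]
categories = ["Public", "Alone", "Carpool"]

def design_rows(observations, dummy_categories):
    """Each row is [intercept, one dummy per category in dummy_categories]."""
    return [[1] + [1 if obs == c else 0 for c in dummy_categories]
            for obs in observations]

def intercept_equals_dummy_sum(rows):
    """True when the intercept column is the exact sum of the dummy columns."""
    return all(row[0] == sum(row[1:]) for row in rows)

full = design_rows(modes, categories)          # as many dummies as categories
reduced = design_rows(modes, categories[:-1])  # Carpool dropped as reference

print(intercept_equals_dummy_sum(full))     # True: perfect multicollinearity
print(intercept_equals_dummy_sum(reduced))  # False: the dependence is gone
```

With all three dummies present, every row satisfies intercept = d1 + d2 + d3, so one column is redundant. Dropping the Carpool dummy breaks that identity for the carpool rows.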

17-18 Homework Problem 8 on p the data file (SATdummy) is posted on S: drive. The answers are in the appendix.

17-19 Example 17.3 A recent article suggests that Asian-Americans face serious discrimination in the college admissions process (The Boston Globe, February 8, 2010). Specifically, Asian applicants typically need an extra 140 points on the SAT to compete with white students. Another report suggests that colleges are eager to recruit Hispanic students who are generally underrepresented in applicant pools (USA Today, February 8, 2010). In an attempt to corroborate these claims, a sociologist first wants to determine if SAT scores differ by ethnic background. She collects data on 200 individuals from her city with their recent SAT scores and ethnic background.

17-20 Example 17.3 Use 3, not 4, dummy variables (DV), as follows:
Race      White (d1)  Black (d2)  Asian (d3)
White     1           0           0
Black     0           1           0
Asian     0           0           1
Hispanic  0           0           0

17-21 Example 17.3
b. For an Asian individual, we set d1 = 0, d2 = 0, d3 = 1 and calculate ŷ = … = …. Thus, the predicted SAT score for an Asian individual is approximately …. The predicted SAT score for a Hispanic individual (d1 = d2 = d3 = 0) is ŷ = …, or approximately ….
c. Since the p-values corresponding to d1 and d3 are approximately zero, we conclude at the 5% level that the SAT scores of White and Asian students are different from those of Hispanic students. However, with a p-value of 0.16, we cannot conclude that the SAT scores of Black and Hispanic students are statistically different.

17-22 Homework Problem 11 on p. 524; the data file (Retail Sales) is posted on the S: drive. Do not do part d.

17-23 Problem 11 on page 524 A government researcher is analyzing the relationship between retail sales and the gross national product (GNP). He also wonders whether there are significant differences in retail sales related to the quarters of the year. He collects ten years of quarterly data. A portion is shown in the accompanying table; the complete data set can be found on the text website, labeled Retail Sales.

17-24
a. Estimate y = β0 + β1x + β2d1 + β3d2 + β4d3 + ε, where y is retail sales, x is GNP, d1 is a dummy variable that equals 1 if quarter 1 and 0 otherwise, d2 is a dummy variable that equals 1 if quarter 2 and 0 otherwise, and d3 is a dummy variable that equals 1 if quarter 3 and 0 otherwise.
b. Predict retail sales in quarters 2 and 4 if GNP equals $13,000 billion.
c. Which of the quarterly sales are significantly different from those of the 4th quarter at the 5% level?
d. Use the partial F test to determine if the three seasonal dummy variables used in the model are jointly significant at the 5% level.