business analytics II ▌applications: fuel efficiency


Managerial Economics & Decision Sciences Department
Developed for business analytics II
week 5 ▌applications: fuel efficiency, ads and profitability, slippery soap sales, mba salaries
© 2016 kellogg school of management | managerial economics and decision sciences department | business analytics II

session five ▌applications
learning objectives
► linear regression
  - definition and assumptions of the linear model
  - estimate the model, interpret coefficients
  - understand regression table when provided without data
  - statistical significance, p-value and confidence intervals
► confidence and prediction intervals
  - klincom and kpredint commands
  - interpret and use output from klincom and kpredint commands
  - klincom and kpredint commands: use for change in y and levels of y
► dummy variables
  - definition and interpretation of dummy and slope dummy variables
  - use of dummy and slope dummy regressions in hypothesis testing
readings
► (MSN) Chapter 2-5
► (CS) Fuel Efficiency, Ads and Profitability, Slippery Soap Sales, MBA Salaries

fuel efficiency ▌estimated regression and coefficient interpretation
► Based on the table provided we can write the estimated regression as:
Est.E[MPG] = 60.3810 − 0.2034·HP + 3.0977·Foreign − 3.1329·Repair
where b0 = 60.3810, b1 = −0.2034, b2 = 3.0977 and b3 = −3.1329.
Figure 1. Regression results
     MPG |      Coef.   Std. Err.        t   P>|t|
---------+------------------------------------------
      HP | -0.2034169  0.02846808  -7.1454  0.0000
 Foreign | 3.09772802  0.71535252   4.3304  0.0001
  Repair |  -3.132985  1.79280832  -1.7474  0.0874
   _cons | 60.3810185  2.76109585  21.8685  0.0000
► Coefficient significance: The constant and the coefficients of HP and Foreign are significant at the 5% level, since their p-values are below 5%. Recall that this means the data provide evidence, strong enough to satisfy the 5% significance level, that the true values of these coefficients are different from zero. The coefficient of Repair (p-value 0.0874) is not significant at the 5% level.
► Coefficient interpretation: The estimated coefficient on Foreign is 3.098. This estimates that, holding HP and Repair constant, i.e. considering cars with the same engine power and repair history, foreign-made cars have, on average, higher fuel efficiency than domestic-made cars by 3.098 miles per gallon. Note that engine power and repair history are held fixed in the comparison: the 3.098 does not mean that the average fuel efficiency of all foreign-made cars is estimated to be 3.098 mpg higher than that of all domestic-made cars.

fuel efficiency ▌comparison exercise
► The estimated regression is:
Est.E[MPG] = 60.3810 − 0.2034·HP + 3.0977·Foreign − 3.1329·Repair
► You are comparing two cars: the first is US-made with a 150 hp engine and has been repaired once before the auction; the second is a Japanese-made car with a 200 hp engine that has never been repaired. Which car do you expect to have a higher MPG? By how much?
► Car 1: made in US (Foreign = 0), with a 150 hp engine (HP = 150), repaired once (Repair = 1) gets
MPG1 = 60.3810 − 0.2034·150 + 3.0977·0 − 3.1329·1 = 26.7381
► Car 2: made in Japan (Foreign = 1), with a 200 hp engine (HP = 200), never repaired (Repair = 0) gets
MPG2 = 60.3810 − 0.2034·200 + 3.0977·1 − 3.1329·0 = 22.7987
► The difference is: ΔMPG = MPG1 − MPG2 = 3.9394
We expect the US-made, 150 HP, once-repaired car to have higher MPG, by 3.9394 miles per gallon, than the Japan-made, 200 HP, never-repaired car.
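The comparison above can be reproduced with a few lines of Python. This is only a sketch: the coefficients are the rounded values printed on the slide, and the function and dictionary names are made up for illustration.

```python
# Coefficients as rounded on the slide (estimated MPG regression).
COEF = {"const": 60.3810, "HP": -0.2034, "Foreign": 3.0977, "Repair": -3.1329}

def predict_mpg(hp, foreign, repair):
    """Estimated mean MPG for the given HP, Foreign (0/1) and Repair (0/1)."""
    return (COEF["const"]
            + COEF["HP"] * hp
            + COEF["Foreign"] * foreign
            + COEF["Repair"] * repair)

mpg1 = predict_mpg(hp=150, foreign=0, repair=1)  # US-made, repaired once
mpg2 = predict_mpg(hp=200, foreign=1, repair=0)  # Japan-made, never repaired
diff = mpg1 - mpg2

print(round(mpg1, 4), round(mpg2, 4), round(diff, 4))
```

Using the unrounded coefficients from the regression table would change the later decimals slightly.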

fuel efficiency ▌comparison exercise
► The estimated regression is:
Est.E[MPG] = 60.3810 − 0.2034·HP + 3.0977·Foreign − 3.1329·Repair
► Based on the regression results above, can we infer that an increase in the engine's horse power leads to a decrease in fuel efficiency regardless of whether the car is US or non-US made?
► Since the regression assumes that the dummy variable has only a level effect, i.e. only the dummy variable is included without any slope dummy, the estimated change in mean MPG for a given change in HP is the same whether the Foreign dummy variable is 1 or 0. In other words, the current specification of the regression forces the two regression lines (for US and non-US made cars) to have the same slope (MPG with respect to HP). The common slope is therefore an assumption built into the specification, not a finding of the regression.

fuel efficiency ▌confidence interval
► The estimated regression is:
Est.E[MPG] = 60.3810 − 0.2034·HP + 3.0977·Foreign − 3.1329·Repair
► For a US-made car that was repaired once before the auction we need a 95% confidence interval for the change in fuel efficiency when the engine's horse power is increased by 100. What is the interval?
Figure 2. Regression results
     MPG |      Coef.   Std. Err.        t   P>|t|
---------+------------------------------------------
      HP | -0.2034169  0.02846808  -7.1454  0.0000
 Foreign | 3.09772802  0.71535252   4.3304  0.0001
  Repair |  -3.132985  1.79280832  -1.7474  0.0874
   _cons | 60.3810185  2.76109585  21.8685  0.0000
► The interval refers to a change in MPG given a change in HP. This change is governed by β1, the coefficient of HP in the regression. The interval has the form:
b1 − std.err.(b1)·t(df,α/2) ≤ β1 ≤ b1 + std.err.(b1)·t(df,α/2)
where b1 = −0.2034, std.err.(b1) = 0.02846808 and t(df,α/2) = invttail(46,0.025) = 2.0128956
► The interval for β1 is −0.26072 ≤ β1 ≤ −0.14611 and thus −26.072 ≤ ΔMPG|ΔHP = 100 ≤ −14.611
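The arithmetic behind this interval can be sketched in Python. The critical value is taken directly from the slide (invttail(46,0.025)) rather than computed, and the helper name interval_for_change is hypothetical.

```python
def interval_for_change(b, se, t_crit, delta=1.0):
    """Confidence interval for delta * beta, given estimate b, its standard
    error se and the t critical value; endpoints are returned in order."""
    half_width = t_crit * se
    lo, hi = (b - half_width) * delta, (b + half_width) * delta
    return (min(lo, hi), max(lo, hi))

b1, se_b1 = -0.2034169, 0.02846808   # HP row of the regression table
t_crit = 2.0128956                   # invttail(46, 0.025), as reported on the slide

ci_beta1 = interval_for_change(b1, se_b1, t_crit)             # per 1 HP
ci_100hp = interval_for_change(b1, se_b1, t_crit, delta=100)  # per 100 HP

print(ci_beta1, ci_100hp)
```

Taking min and max keeps the endpoints ordered when delta is negative, for example when the change of interest is a decrease.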

fuel efficiency ▌regression specification: underlying assumptions
► The estimated regression is:
Est.E[MPG] = 60.3810 − 0.2034·HP + 3.0977·Foreign − 3.1329·Repair
► Regarding the HP and Repair variables, what is the main assumption that supports the specification of the regression above? Draw two diagrams, one for non-US cars and one for US cars, representing MPG as a function of HP.
► The analyst did not include a slope dummy variable, thus the analyst assumes that the change in MPG given a change in HP is the same for cars that were repaired before auction and cars that were not, i.e. regardless of whether Repair is 0 or 1. The Repair dummy has only a level effect on MPG. The two lines have the same slope.
Figure 3. Regression lines for non-US and US cars as depending on Repair status
[Two panels plotting MPG against HP. Foreign = 1: parallel lines with slope −0.2034 and intercepts 63.48 (Repair = 0) and 60.35 (Repair = 1). Foreign = 0: parallel lines with slope −0.2034 and intercepts 60.38 (Repair = 0) and 57.25 (Repair = 1), the latter computed as 60.3810 − 3.1329.]

fuel efficiency ▌regression specification: underlying assumptions
► The estimated regression is:
Est.E[MPG] = 60.3810 − 0.2034·HP + 3.0977·Foreign − 3.1329·Repair
► Let's consider US cars. If you suspect that the additional MPG for each extra HP is lower for cars that underwent at least one repair (compared to those with zero repairs), how would you conduct the above analysis? How would you expect the diagram (from iv. above) for US cars to change?
► In this case the change in MPG given a change in HP differs depending on whether or not the car had previous repairs. This implies different slopes for the MPG vs. HP lines for different values of Repair. To capture this effect we add a slope dummy variable defined as RepairHP = Repair·HP, and the new regression is written as
E[MPG] = β0 + β1·HP + β2·Foreign + β3·Repair + β4·RepairHP
► For US cars (Foreign = 0) the regression is
E[MPG|Foreign = 0] = β0 + β1·HP + β3·Repair + β4·RepairHP
and the slopes (for lines representing MPG depending on HP) are
  - for Repair = 0: E[MPG|Foreign = 0] = β0 + β1·HP, slope = β1
  - for Repair = 1: E[MPG|Foreign = 0] = (β0 + β3) + (β1 + β4)·HP, slope = β1 + β4

ads and profitability ▌regression equation
► Jack, an employee, estimates a regression of profit on marketing spending and a dummy for location. He uses monthly data for the past 36 months and, to the surprise of the Management, Jack's results indicate that a change in marketing spending generates the same change in profit regardless of the location. In particular it turns out that for each extra $1 of marketing spending the profit increases by $2.
► Write the regression equation that is estimated and explain whether there are possible problems with the way the regression was designed, i.e. is this specification useful to respond to Management's request?
► It is fairly straightforward to express the (estimated) regression equation as:
Est.E[Profit] = b0 + b1·Location + 2·Marketing
where Location is a dummy variable equal to 1 for location A and 0 for location B, and Marketing is a continuous variable that represents marketing spending.
► Important detail: notice that Management records location-specific information for each month, as in the top table.
Marketing Spending
Month | Location A | Location B
1     | $100,000   | $50,000
► Jack reorganized the data to codify the location through the dummy variable Location. This is shown in the bottom table.
Month | Marketing | Location
1     | $100,000  | 1
1     | $50,000   | 0
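Jack's reorganization, from one row per month with a column per location to one row per month and location with a dummy, can be sketched as follows. The field names and the to_long helper are hypothetical; the only data used is the single month shown on the slide.

```python
# One record per month, one column per location (Management's layout).
wide = [{"month": 1, "A": 100_000, "B": 50_000}]

def to_long(wide_rows):
    """Stack the location columns into rows; location = 1 for A, 0 for B."""
    long_rows = []
    for row in wide_rows:
        for name, dummy in (("A", 1), ("B", 0)):
            long_rows.append({
                "month": row["month"],
                "marketing": row[name],
                "location": dummy,
            })
    return long_rows

long_rows = to_long(wide)
for r in long_rows:
    print(r)
```

With 36 months of data this layout yields 72 rows, two per month, which is the shape a regression with a Location dummy expects.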

ads and profitability ▌slope dummy regression
► Suggest a different, and better, regression equation and explain how it differs from the regression estimated by Jack. What is the new equation? What is the main assumption that would lend support to your specification of the regression rather than to Jack's specification?
► Jack's regression assumes that Location has only a level effect on expected Profit, i.e. the change in mean Profit given a change in Marketing spending is the same regardless of the location. Jack's regression ignores potential interactions between the dummy variable and the continuous variable, i.e. the slope effect.
► It is fairly straightforward to modify the specification of the regression in order to introduce a slope effect:
E[Profit] = β0 + β1·Location + β2·Marketing + β3·LocationMarketing
where Location and Marketing have the same meaning as before and LocationMarketing = Location·Marketing is a slope dummy variable that should capture the interaction between Location and Marketing.
► The slope dummy (if β3 ≠ 0) will capture and quantify how the change in profit for a given change in marketing spending differs by location.

ads and profitability ▌dummy vs. slope dummy interpretation
► After you convinced the Management that your regression specification is the correct one, you were given access to the same data Jack used and obtained a coefficient slightly above 2 for the variable "marketing spending". In an angry tone Jack immediately replies: "I already told you that for each $1 of extra marketing spending you get about $2 extra profit, this new more sophisticated regression simply confirms my initial findings…" Is Jack's statement correct?
► The two regressions are shown below.
Jack: E[Profit] = β0 + β1·Location + β2·Marketing
You: E[Profit] = β0 + β1·Location + β2·Marketing + β3·LocationMarketing
► Jack finds an estimate of exactly 2 for β2 in his regression, while your reported estimate (slightly above 2) refers to β2 in your regression. Even though in both cases the estimate is the coefficient on the same variable, Marketing, the interpretation of each estimate is completely different:
  - Jack's regression: each additional $1 of Marketing spending seems to generate an extra $2 at either location
  - Your regression: each additional $1 of Marketing spending generates slightly more than $2 at location B (Location = 0); at location A the effect is β2 + β3, which the coefficient on Marketing alone does not reveal

slippery soap sales ▌evaluating/testing claims
. regress Sales Price
   Sales |     Coef.   Std.Err.      t   P>|t|   [95% Conf. Interval]
---------+-------------------------------------------------------------
   Price | -.2929416   .0616406  -4.75   0.000   -.4204540   -.165428
   _cons | 5.8291984   .4241016  13.74   0.000   4.9518744  6.7065194
► Is it fair to say that if we decrease price by $10 we might see our sales increasing by at least $5,000?
► Based on the table provided we can write the estimated regression as:
Est.E[Sales] = 5.8291984 − 0.2929416·Price
► The claim has the form: ΔE[Sales] | ΔPrice ≥ Benchmark
► Given that ΔE[Sales] | ΔPrice = β1·ΔPrice we can re-write the claim as β1·ΔPrice ≥ Benchmark
► With ΔPrice = −$10 and Benchmark = $5,000, both expressed in the units of the regression, the claim can be restated in terms of β1 as β1 ≤ −0.5 (dividing by the negative ΔPrice reverses the inequality).

slippery soap sales ▌evaluating/testing claims
► Is it fair to say that if we decrease price by $10 we might see our sales increasing by at least $5,000?
► The claim: β1 ≤ −0.5
  - hypothesis: set the hypotheses H0: β1 ≤ −0.5, Ha: β1 > −0.5
  - test: calculate ttest = (b1 − (−0.5)) / std.err(b1)
  - decision: calculate the (right tail) pvalue = Pr[T ≥ ttest] and reject the null hypothesis if pvalue < α

slippery soap sales ▌evaluating/testing claims
► Is it fair to say that if we decrease price by $10 we might see our sales increasing by at least $5,000?
► The claim: β1 ≤ −0.5
► We calculate the ttest as ttest = (b1 − (−0.5)) / std.err(b1) = (−0.2929416 + 0.5) / 0.0616406 = 3.359124
► The (right tail) pvalue = Pr[T ≥ ttest] = ttail(23,3.359124) = 0.00135708
► We can reject the stated null H0: β1 ≤ −0.5 (equivalently, ΔE[Sales] | ΔPrice = −$10 ≥ Benchmark) for α = 5%, thus it is "not fair" to say that by decreasing price by $10 we might see sales increasing by at least $5,000.
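For readers without Stata, the test statistic and right-tail p-value can be approximated in plain Python. This is a sketch, not Stata's ttail: the tail probability is obtained by integrating the Student t density numerically with Simpson's rule, and the function names are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, n=2000):
    """Approximate Pr[T >= t] using Simpson's rule on [0, |t|] (n even)."""
    a = abs(t)
    h = a / n
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3          # Pr[0 <= T <= |t|]
    tail = 0.5 - central             # Pr[T >= |t|]
    return tail if t >= 0 else 1 - tail

b1, se_b1 = -0.2929416, 0.0616406    # Price row of the regression table
benchmark, df = -0.5, 23             # null value for beta1; df used on the slide

t_test = (b1 - benchmark) / se_b1
p_value = t_upper_tail(t_test, df)

# The slide reports ttest = 3.359124 and ttail(23, 3.359124) = 0.00135708.
print(t_test, p_value)
```

A left-tail p-value, as needed for a test with Ha: β1 < 0, follows as 1 − t_upper_tail(t, df).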

slippery soap sales ▌intervals
► How can I obtain a 90% confidence interval for the change in Sales when Price decreases by $1?
► The change in average Sales for a $1 change in Price is given by the coefficient on Price, so a 90% confidence interval for that change can be derived using the information on the coefficient on Price:
b1 − std.err(b1)·t(df,α/2) ≤ β1 ≤ b1 + std.err(b1)·t(df,α/2)
where b1 = −0.2929416, t(df,α/2) = invttail(23,0.05) = 1.7138715 and std.err(b1) = 0.0616406.
The interval for β1 is thus −0.39858567 ≤ β1 ≤ −0.18729753
For a decrease in Price by $1 the sign flips, and the change in average Sales is, with 90% confidence, in the interval [$1,872.9753, $3,985.8567]

slippery soap sales ▌intervals
► From the table above, can I obtain a 95% confidence interval for the average Sales when the Price is $10?
► From the information provided in the problem it is not possible to derive a confidence interval for the level of average Sales: the standard error of b0 + b1·10 depends on the covariance between b0 and b1, which the regression table does not report. The exception is the particular case Price = 0, in which Est.E[Sales] = b0 (the constant) and a procedure similar to the one above can be applied using the information on b0.

mba salaries ▌simple dummy regression equation
► You have interviewed 100 recent graduates from both programs, and collected data on preMBA (income in the year before beginning the program), postMBA (income in the year after completing the program) and school (dummy equal to 1 for those attending Program K).
► Can you prove that on average students from Program B have higher salaries after graduation than students from Program K?
► Since we are concerned with the difference in postMBA salaries for graduates from Program K vs. Program B we run a simple dummy regression:
E[postMBA] = β0 + β1·school
Figure 4. Results for regression of postMBA on school
 postMBA |     Coef.   Std. Err.      t   P>|t|   [95% Conf. Interval]
---------+--------------------------------------------------------------
  school | -16.99615   5.318982   -3.20   0.002   -27.55149  -6.440801
   _cons |  132.0825   3.761088   35.12   0.000    124.6188   139.5463
► Since school = 1 for students graduating from Program K, the negative coefficient for school indicates that graduates from Program B earn on average about 16.99615 more than graduates from Program K. We still have to test whether graduates from Program B earn more on average than graduates from Program K.

mba salaries ▌simple dummy regression equation
► Can you prove that on average students from Program B have higher salaries after graduation than students from Program K?
► The null/alternative hypotheses and the test are shown below:
  - hypothesis: set the hypotheses H0: β1 ≥ 0, Ha: β1 < 0
  - test: calculate ttest = (b1 − 0) / std.err(b1)
  - decision: calculate the (left tail) pvalue = Pr[T ≤ ttest] and reject the null hypothesis if pvalue < α
► We can calculate immediately:
ttest = (−16.99615 − 0)/5.318982 = −3.20
pvalue = 1 − ttail(98,−3.20) = 0.00092615 < 0.05
thus we reject the null.
► The evidence indicates, at the 5% significance level, that graduates from Program B earn more on average than graduates from Program K.

mba salaries ▌confidence interval
► Provide an interval that you are 80% confident contains the true average post-graduation salary of graduates from Program B.
► Graduates from Program B have school = 0, so their average salary is β0 + β1·0. The simplest way to find the interval is to use klincom:
klincom _b[_cons] + _b[school]*0, level(80)
Figure 5. Results for klincom command
 postMBA |     Coef.   Std. Err.      t   P>|t|   [80% Conf. Interval]
---------+--------------------------------------------------------------
     (1) |  132.0825   3.761088   35.12   0.000    127.2298   136.9353
If Ha: < then Pr(T < t) = 1
If Ha: not = then Pr(|T| > |t|) = 0
If Ha: > then Pr(T > t) = 0
► The 80% confidence interval for postMBA average salaries for graduates from Program B is [127.2298, 136.9353]

mba salaries ▌slope dummy
► Can you prove that the post-MBA salary depends on the pre-MBA salary differently for graduates of Program K compared to those of Program B?
► We are concerned with the difference in postMBA salaries due to school choice and preMBA salaries. We run a slope dummy regression, with schoolpreMBA = school·preMBA:
E[postMBA] = β0 + β1·school + β2·preMBA + β3·schoolpreMBA
Figure 6. Results for regression of postMBA on school, preMBA and slope dummy schoolpreMBA
      postMBA |     Coef.   Std. Err.      t   P>|t|   [95% Conf. Interval]
--------------+--------------------------------------------------------------
       school |  84.75716   16.00051    5.30   0.000    52.99639   116.5179
       preMBA |        .9   .1252686    7.18   0.000    .6513438   1.148656
 schoolpreMBA | -1.020848   .1646557   -6.20   0.000   -1.347687  -.6940094
        _cons |        40   13.17493    3.04   0.003    13.84798   66.15203
► The estimated regression is thus:
Est.E[postMBA] = 40.00 + 84.75·school + 0.90·preMBA − 1.02·schoolpreMBA

mba salaries: slope dummy (page 19)
► Can you prove that the post-MBA salary depends on the pre-MBA salary differently for graduates of Program K compared to those of Program B?
► The estimated regression is thus:
    $\text{Est. } E[\text{postMBA}] = 40.00 + 84.76 \cdot \text{school} + 0.90 \cdot \text{preMBA} - 1.02 \cdot \text{schoolpreMBA}$
► The question is really whether the change in average postMBA salary, for a given change in preMBA salary, differs depending on school.
► We have to test whether the slopes of the two lines (one corresponding to school = 1 and the other to school = 0) are equal. The slopes are $\beta_2 + \beta_3$ and $\beta_2$, so they are equal exactly when $\beta_3 = 0$. In other words:
    H0: $\beta_3 = 0$
    Ha: $\beta_3 \neq 0$
► The result of this test is already in the regression table (Figure 6): for schoolpreMBA, t = -6.20 and the reported p-value is 0.000. Since the p-value < 0.05, we reject the null hypothesis at the 5% level.
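
As a minimal sketch of the test above, the t statistic can be recomputed from the schoolpreMBA row of Figure 6. The critical value 1.985 is an assumption taken from a t table for 96 degrees of freedom (100 observations minus 4 coefficients) at the two-sided 5% level, and the variable names are illustrative.

```python
# schoolpreMBA row of Figure 6
coef = -1.020848
std_err = 0.1646557

# Assumed two-sided 5% critical value for 96 degrees of freedom (t table).
T_CRIT = 1.985

t_stat = coef / std_err            # test statistic for H0: beta_3 = 0
reject_h0 = abs(t_stat) > T_CRIT   # two-sided decision rule

# The same decision read off the confidence interval: reject if it excludes 0.
ci_low = coef - T_CRIT * std_err
ci_high = coef + T_CRIT * std_err
excludes_zero = ci_low > 0 or ci_high < 0

print(f"t = {t_stat:.2f}, reject H0: {reject_h0}, CI excludes 0: {excludes_zero}")
```

Both decision rules agree here, which is expected: a two-sided t-test at 5% rejects exactly when the 95% confidence interval excludes zero.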

mba salaries: slope dummy (page 20)
► The estimated regression is thus:
    $\text{Est. } E[\text{postMBA}] = 40.00 + 84.76 \cdot \text{school} + 0.90 \cdot \text{preMBA} - 1.02 \cdot \text{schoolpreMBA}$
► Setting school = 1 and school = 0 gives one fitted line per program:
    $\text{Est. } E[\text{postMBA} \mid \text{school} = 1] = (40.00 + 84.76) + (0.90 - 1.02) \cdot \text{preMBA}$
    $\text{Est. } E[\text{postMBA} \mid \text{school} = 0] = 40.00 + 0.90 \cdot \text{preMBA}$
Figure 7. Results for regression of postMBA on school, preMBA and slope dummy schoolpreMBA
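
The two fitted lines can be derived mechanically from the coefficients in Figure 6. The following is a small sketch; the constant names and the function fitted_line are hypothetical, not part of the original Stata workflow.

```python
# Coefficients from Figure 6 (salaries in thousands of dollars).
B_CONS = 40.0
B_SCHOOL = 84.75716
B_PRE = 0.9
B_INTERACT = -1.020848   # coefficient on schoolpreMBA = school * preMBA

def fitted_line(school):
    """Return (intercept, slope) of the fitted postMBA line for school = 0 or 1."""
    intercept = B_CONS + B_SCHOOL * school
    slope = B_PRE + B_INTERACT * school
    return intercept, slope

for school, name in [(1, "Program K"), (0, "Program B")]:
    intercept, slope = fitted_line(school)
    print(f"{name}: Est. E[postMBA] = {intercept:.2f} + ({slope:.2f}) * preMBA")
```

The dummy shifts the intercept and the slope dummy shifts the slope, so the Program K line has a slope of about -0.12 against 0.90 for Program B.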

mba salaries: comparative changes (page 21)
► For students from Program K, what is your estimate for the increase in post-MBA salary if pre-MBA salary increases by $1,000?
► The estimated regression is thus:
    $\text{Est. } E[\text{postMBA}] = 40.00 + 84.76 \cdot \text{school} + 0.90 \cdot \text{preMBA} - 1.02 \cdot \text{schoolpreMBA}$
► For students from Program K (school = 1, so schoolpreMBA = preMBA), the estimated change in postMBA salary when preMBA salary increases by $1,000 is given by (below, $\Delta\text{preMBA} = 1$):
    $\Delta\,\text{Est. } E[\text{postMBA}] = 0.90 \cdot \Delta\text{preMBA} - 1.02 \cdot (1 \cdot \Delta\text{preMBA}) = -0.12$
► Salaries are measured in thousands of dollars, so the estimated post-MBA salary falls by about $120 rather than increasing.

mba salaries: intervals (page 22)
► Bill is a student from Program K. Provide an interval that will contain his post-graduation salary with 95% confidence, given that his pre-MBA salary was $60,000.
► The question concerns one individual rather than an average, so this is a straightforward kpredint command (school = 1 and preMBA = 60, hence schoolpreMBA = 60):
    kpredint _b[_cons] + _b[school]*1 + _b[preMBA]*60 + _b[schoolpreMBA]*60, level(95)
Figure 8. Results for kpredint command
    Estimate: 117.50625
    Standard Error of Individual Prediction: 21.895393
    Individual Prediction Interval (95%): [74.044242,160.96827]
► The requested interval is thus [$74,044.24, $160,968.27].
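
To make the arithmetic behind Figure 8 explicit, the sketch below rebuilds the point estimate from the Figure 6 coefficients and then forms the interval from the reported standard error of individual prediction. The critical value 1.985 is an assumption taken from a t table for 96 degrees of freedom, and predict is a hypothetical helper; kpredint itself computes the standard error, which is simply copied here.

```python
# Coefficients from Figure 6 (salaries in thousands of dollars).
B_CONS, B_SCHOOL, B_PRE, B_INTERACT = 40.0, 84.75716, 0.9, -1.020848

SE_PREDICTION = 21.895393   # reported in Figure 8, not recomputed here
T_CRIT = 1.985              # assumed t-table value, 96 df, two-sided 5%

def predict(school, pre_mba):
    """Point estimate of postMBA from the slope dummy regression."""
    return (B_CONS + B_SCHOOL * school + B_PRE * pre_mba
            + B_INTERACT * school * pre_mba)

estimate = predict(school=1, pre_mba=60)   # Bill: Program K, $60,000 pre-MBA
low = estimate - T_CRIT * SE_PREDICTION
high = estimate + T_CRIT * SE_PREDICTION

# Convert from thousands of dollars to dollars for reporting.
print(f"estimate: ${estimate * 1000:,.0f}")
print(f"95% prediction interval: [${low * 1000:,.0f}, ${high * 1000:,.0f}]")
```

Small differences from Figure 8 in the last digits are expected, because the coefficients and the critical value are rounded.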

mba salaries: intervals (page 23)
► For those with a pre-MBA salary of $60,000, provide an interval that will contain, with 95% confidence, the difference between the average post-MBA salary of graduates from Program K and the average salary these graduates would have had if they had enrolled in Program B.
► The question concerns averages: the average salary of Program K graduates with a preMBA salary of $60,000, and the average salary they would have had if they had chosen Program B instead.
► We therefore need a 95% confidence interval for the difference between these two averages (given preMBA = 60).
► Start with the regression equation:
    $E[\text{postMBA}] = \beta_0 + \beta_1 \cdot \text{school} + \beta_2 \cdot \text{preMBA} + \beta_3 \cdot \text{schoolpreMBA}$
and calculate the difference in E[postMBA] between school = 1 and school = 0:
    $\Delta E[\text{postMBA}] = E[\text{postMBA} \mid \text{school} = 1] - E[\text{postMBA} \mid \text{school} = 0] = \beta_1 + \beta_3 \cdot \text{preMBA}$
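
The difference can be written out step by step, holding preMBA fixed and using schoolpreMBA = school·preMBA. This is only an expansion of the subtraction stated on the slide:

```latex
\begin{align*}
E[\text{postMBA} \mid \text{school} = 1]
  &= (\beta_0 + \beta_1) + (\beta_2 + \beta_3) \cdot \text{preMBA} \\
E[\text{postMBA} \mid \text{school} = 0]
  &= \beta_0 + \beta_2 \cdot \text{preMBA} \\
\Delta E[\text{postMBA}]
  &= \bigl[(\beta_0 + \beta_1) - \beta_0\bigr]
   + \bigl[(\beta_2 + \beta_3) - \beta_2\bigr] \cdot \text{preMBA} \\
  &= \beta_1 + \beta_3 \cdot \text{preMBA}
\end{align*}
```

The terms in $\beta_0$ and $\beta_2$ cancel, so the interval only involves a linear combination of the coefficients on school and schoolpreMBA.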

mba salaries: intervals (page 24)
► For those with a pre-MBA salary of $60,000, provide an interval that will contain, with 95% confidence, the difference between the average post-MBA salary of graduates from Program K and the average salary these graduates would have had if they had enrolled in Program B.
► With preMBA = 60, the difference between the two averages is the coefficient on school plus 60 times the coefficient on schoolpreMBA, so this is a straightforward klincom command:
    klincom _b[school]*1 + _b[schoolpreMBA]*60, level(95)
Figure 9. Results for klincom command
     postMBA |     Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
    ---------+----------------------------------------------------------------
         (1) |  23.50625   7.162113     3.28    0.001    9.289569    37.72293
► The requested interval is thus [$9,289.57, $37,722.93].
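
As a numerical check on Figure 9, the sketch below confirms that the gap between the two fitted lines at preMBA = 60 equals the linear combination passed to klincom, and then forms the interval from the reported standard error. The critical value 1.985 is an assumed t-table value for 96 degrees of freedom, and fitted is a hypothetical helper.

```python
# Coefficients from Figure 6 (salaries in thousands of dollars).
B_CONS, B_SCHOOL, B_PRE, B_INTERACT = 40.0, 84.75716, 0.9, -1.020848

SE_DIFF = 7.162113   # standard error reported in Figure 9, not recomputed
T_CRIT = 1.985       # assumed t-table value, 96 df, two-sided 5%

def fitted(school, pre_mba):
    """Estimated average postMBA for a given school and preMBA."""
    return (B_CONS + B_SCHOOL * school + B_PRE * pre_mba
            + B_INTERACT * school * pre_mba)

pre_mba = 60
diff_from_lines = fitted(1, pre_mba) - fitted(0, pre_mba)
diff_from_formula = B_SCHOOL + B_INTERACT * pre_mba   # klincom expression

low = diff_from_formula - T_CRIT * SE_DIFF
high = diff_from_formula + T_CRIT * SE_DIFF

print(f"difference: {diff_from_formula:.3f} (lines give {diff_from_lines:.3f})")
print(f"95% CI: [{low:.3f}, {high:.3f}]")
```

The intercept and the preMBA term cancel in the subtraction, which is why the klincom expression needs only the school and schoolpreMBA coefficients.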