Published by Cecil Lee. Modified over 8 years ago.
1
Managerial Economics & Decision Sciences Department
business analytics II
▌ dummy variables
intro to dummy variables ◄ dummy regressions ◄ slope dummies ◄
© 2016 kellogg school of management | managerial economics and decision sciences department | business analytics II
2
session four: dummy variables

learning objectives
► define a dummy variable
► run dummy regressions
► interpret a regression with dummy variables
► understand and interpret slope dummies

readings
► statistics & econometrics
► (MSN) Chapter 5
► (CS) Web Ads, Pizza Sales
3
introduction: confidence/prediction intervals-based business decisions

► So far we have used continuous variables, i.e. variables that can potentially take on any value in a range, although the number of distinct values in your data sample will, of course, be finite.
► Plenty of variables can take only a few possible values, such as gender, location, "calendar"-related variables, etc. These are called categorical or discrete variables.
► These predictor variables can be included in a regression through the use of dummy variables: variables that take on the values 0 and 1 only.
► We add a dummy variable if we believe that the value of y depends on which of the two values the dummy variable takes:
    Are sales higher on weekends versus weekdays?
    Do women work harder than men?
    Do CEOs of publicly traded firms make more than CEOs of privately held companies?
► We will begin by considering variables that have two categories, e.g. gender, private/public, and then explore how to deal with variables that have more than two categories, e.g. ethnicity, industry.
4
packaging matters: a first view on dummy regressions

► We estimate the following regression model, let's call it a "dummy" regression, where the variable dummy equals 0 for some observations and 1 for the others:
    $E[y] = \beta_0 + \beta_1 \cdot \text{dummy}$
► We can interpret the coefficients as follows:
    $\beta_0$ is the true mean of y when dummy = 0: $\beta_0 = E[y \mid \text{dummy} = 0]$
    $\beta_0 + \beta_1$ is the true mean of y when dummy = 1: $\beta_0 + \beta_1 = E[y \mid \text{dummy} = 1]$
    $\beta_1$ is the difference in the true mean of y for dummy = 1 vs. dummy = 0: $\beta_1 = E[y \mid \text{dummy} = 1] - E[y \mid \text{dummy} = 0]$
► For an example, let's use allpack.dta:
    dummy1 = 0 if packaging option 1 is chosen
    dummy1 = 1 if packaging option 2 is chosen
  The model is (allpack represents sales):
    $E[\text{allpack}] = \beta_0 + \beta_1 \cdot \text{dummy1}$
► How do we estimate the coefficients? Is the average sales different depending on the choice of the packaging option?

Figure 1. Simple plot of categorical data (allpack.dta)
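The interpretation above can be checked numerically. The following is a minimal Python sketch (not the course's Stata workflow) on made-up numbers: the `sales` and `dummy` lists and the helper `simple_ols` are hypothetical and are not taken from allpack.dta. It shows that, with a single 0/1 regressor, the OLS intercept equals the mean of the dummy = 0 group and the OLS slope equals the difference between the two group means.

```python
from statistics import mean

# Hypothetical sales figures (NOT the allpack.dta data), for illustration only.
sales = [300.0, 280.0, 310.0, 250.0, 270.0, 260.0]
dummy = [0, 0, 0, 1, 1, 1]  # 0 = packaging option 1, 1 = packaging option 2


def simple_ols(x, y):
    """Closed-form OLS estimates (b0, b1) for y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = simple_ols(dummy, sales)

# Group means computed directly from the data.
mean_0 = mean(s for s, d in zip(sales, dummy) if d == 0)
mean_1 = mean(s for s, d in zip(sales, dummy) if d == 1)

print("b0 =", b0, "mean when dummy = 0:", mean_0)
print("b1 =", b1, "difference in means:", mean_1 - mean_0)
```

The same equivalence is what lets the regression output be read as a comparison of two group means.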
5
packaging matters: a first view on dummy regressions

► With dummy1 = 0 if packaging option 1 is chosen and dummy1 = 1 if packaging option 2 is chosen, the model is
    $E[\text{allpack}] = \beta_0 + \beta_1 \cdot \text{dummy1}$
► How do we estimate the coefficients?

          Source |       SS       df       MS              Number of obs =      72
    -------------+------------------------------           F(  1,    70) =    5.45
           Model |  13908.3405     1  13908.3405           Prob > F      =  0.0225
        Residual |  178761.353    70  2553.73362           R-squared     =  0.0722
    -------------+------------------------------           Adj R-squared =  0.0589
           Total |  192669.694    71  2713.65766           Root MSE      =  50.534

    ------------------------------------------------------------------------------
         allpack |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          dummy1 |  -27.79722   11.91109    -2.33   0.022    -51.55314   -4.041301
           _cons |   290.5439   8.422413    34.50   0.000     273.7459    307.3419

Figure 2. Regression results allpack on dummy1

► The estimated regression is:
    Est. $E[\text{allpack}] = 290.5439 - 27.79722 \cdot \text{dummy1}$
  with $b_0 = 290.5439$ and $b_1 = -27.79722$.
6
packaging matters: a first view on dummy regressions

► We can interpret the results as follows:
    $b_0 = 290.5439$ is the estimated mean sales when packaging option 1 is chosen
    $b_0 + b_1 = 262.7467$ is the estimated mean sales when packaging option 2 is chosen
    $b_1 = -27.79722$ is the difference in the estimated mean sales when packaging 2 is chosen over packaging 1
► Can we test these coefficients? Yes! Nothing changes relative to the way we have interpreted regressions so far. We can use the regression results:

         allpack |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          dummy1 |  -27.79722   11.91109    -2.33   0.022    -51.55314   -4.041301
           _cons |   290.5439   8.422413    34.50   0.000     273.7459    307.3419

Figure 3. Regression results allpack on dummy1

► Both coefficients are significantly different from zero at 5%. Further, the confidence intervals can be used in the same way as before.
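As a quick arithmetic check of the interpretation, here is a small Python sketch (not part of the course's Stata workflow) that uses only the coefficient and standard error reported in the regression table. The variable names are illustrative.

```python
# Estimates copied from the regression of allpack on dummy1.
b0 = 290.5439      # _cons
b1 = -27.79722     # coefficient on dummy1
se_b1 = 11.91109   # standard error of the dummy1 coefficient

mean_option1 = b0        # estimated mean sales, packaging option 1 (dummy1 = 0)
mean_option2 = b0 + b1   # estimated mean sales, packaging option 2 (dummy1 = 1)
t_b1 = b1 / se_b1        # t statistic against a benchmark of 0

print(round(mean_option1, 4), round(mean_option2, 4), round(t_b1, 2))
```

The t statistic reproduces the -2.33 shown in the table, and the second group mean is simply the sum of the two coefficients.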
7
packaging matters: a first view on dummy regressions

key questions
Based on the estimated regression, can we answer the following two questions?
    Is the mean sales different depending on the choice of the packaging option?
    Is the mean sales higher when packaging 1 is chosen over packaging 2?
Remark. Notice that the questions here are about the true mean of y, not the estimated mean of y.

► We established that $\beta_1$ is the difference in the mean sales when packaging 2 is chosen over packaging 1. Thus the two questions above can be translated into hypotheses about $\beta_1$.

Is the mean sales different depending on the choice of the packaging option?
    hypothesis: $H_0: \beta_1 = 0$ vs. $H_a: \beta_1 \neq 0$
    We already established that $\beta_1$ is significantly different from zero at 5%, thus the difference in the average sales when packaging 2 is chosen over packaging 1 is significantly different from zero.

Is the mean sales higher when packaging 1 is chosen over packaging 2?
    hypothesis: $H_0: \beta_1 \leq 0$ vs. $H_a: \beta_1 > 0$
    Mean sales are higher under packaging 1 exactly when $\beta_1 < 0$, which is the case covered by the null. From the regression table the calculated t statistic (when the benchmark is 0) is -2.33, and here we need the right-tail p-value: ttail(70, -2.33) = 0.988. We cannot reject the null.
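The tail probabilities used above can be reproduced without Stata. The sketch below is a plain-Python approximation, assuming nothing beyond the standard library: `t_pdf` and `t_upper_tail` are hypothetical helper names, and the upper-tail probability (the quantity Stata's `ttail(df, t)` returns) is obtained by Simpson's rule rather than by any library routine.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log(1 + x * x / df))


def t_upper_tail(df, t, steps=2000):
    """Approximate P(T > t) using Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_a = total * h / 3        # P(0 < T < |t|)
    upper = 0.5 - area_0_to_a          # P(T > |t|), by symmetry
    return upper if t >= 0 else 1.0 - upper


t_stat, df = -2.33, 70
p_right = t_upper_tail(df, t_stat)    # right-tail p-value, for Ha: beta1 > 0
p_left = 1.0 - p_right                # left-tail p-value, for Ha: beta1 < 0
p_two_sided = 2 * min(p_left, p_right)

print(round(p_right, 3), round(p_left, 3), round(p_two_sided, 3))
```

The three p-values are tied together: the two one-sided values sum to one, and the two-sided value is twice the smaller of them, which is why the same t statistic supports a rejection in the two-sided test but not in the right-tail test.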
8
packaging matters: a first view on dummy regressions

► What if there are more than just two options (or categories) for the dummy variable?

key concept: construction of dummy variables
► If there are j categories, you create a dummy variable for each category. In the regression you exclude one category, i.e. you use j - 1 dummy variables in the regression.
■ Example: suppose you are recording sales by month (Jan to Dec), thus you have 12 dummy variables, one for each month. However, you will include only 11 of these in the regression (any 11 of the 12, in fact):
    $\text{Sales} = \beta_0 + \beta_1 \cdot \text{Jan} + \beta_2 \cdot \text{Feb} + \dots + \beta_{11} \cdot \text{Nov}$
Remark. No matter how many categories there are, the coefficient on a given dummy is the difference between the estimated average y for the associated category and the estimated average y for the excluded category, with all other x-variables in the regression held fixed. The constant gives the estimated average level of y for the excluded category, i.e. when all included dummies are set to 0.
■ Example: to continue with the above situation, say you exclude the dummy variable for Dec. Then the coefficients on the dummy variables Jan, …, Nov are simply the differences between estimated average sales in each of the months Jan, …, Nov and average sales for Dec:
    $\beta_0$ is mean Sales for Dec ($\overline{\text{Sales}}_{Dec}$)
    $\beta_1$ is the difference between mean Sales for Jan and mean Sales for Dec ($\overline{\text{Sales}}_{Jan} - \overline{\text{Sales}}_{Dec}$)
    …
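The construction can be sketched in a few lines of Python (an illustration only, not the course's Stata code). To keep it short it uses four hypothetical quarters instead of twelve months; the `quarters` and `sales` values are invented. It builds j - 1 dummy columns, sets the constant to the mean of the excluded category and each coefficient to a difference in means, and then confirms these are the OLS estimates by checking that the residuals are orthogonal to every regressor (the normal equations).

```python
from statistics import mean

# Hypothetical quarterly sales (illustrative numbers only, not course data).
quarters = ["Q1", "Q2", "Q3", "Q4", "Q1", "Q2", "Q3", "Q4"]
sales = [100.0, 120.0, 90.0, 150.0, 110.0, 130.0, 70.0, 170.0]

categories = sorted(set(quarters))
excluded = "Q4"  # baseline category left out of the regression
included = [c for c in categories if c != excluded]

# One 0/1 column per included category: j - 1 dummies for j categories.
dummies = {c: [1 if q == c else 0 for q in quarters] for c in included}

# Constant = mean of the excluded category; each coefficient = difference in means.
group_mean = {c: mean(s for s, q in zip(sales, quarters) if q == c)
              for c in categories}
b0 = group_mean[excluded]
coef = {c: group_mean[c] - b0 for c in included}

# OLS check: residuals must be orthogonal to the constant and to each dummy.
fitted = [b0 + sum(coef[c] * dummies[c][i] for c in included)
          for i in range(len(sales))]
resid = [s - f for s, f in zip(sales, fitted)]
orthogonality = [sum(resid)] + [
    sum(r * d for r, d in zip(resid, dummies[c])) for c in included
]

print("constant:", b0)
print("coefficients:", coef)
print("normal-equation residuals:", orthogonality)
```

Choosing a different excluded category changes the constant and every coefficient, but not the fitted group means.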
9
dummies and slope dummies

► Dummy variables indicate whether a certain "condition" is true (dummy variable = 1) or false (dummy variable = 0):
    If the unit of observation is the individual, the dummy might indicate if that individual is male or female.
    If the unit of observation is weekly sales data, the dummy might indicate if the week occurred in the summer.
Remark. By including dummy variables in a regression, you can determine if the dependent variable is larger or smaller when the dummy is "turned on" versus when it is "turned off".
► So far we looked only at a simple equation relating y and a dummy variable. What if, besides the dummy variable, there is a continuous variable, say x, that affects the mean of y? The regression equation would then be:
    $E[y] = \beta_0 + \beta_1 \cdot \text{dummy} + \beta_2 \cdot x$
► For this specification, notice that a one-unit change in x induces the same change in mean y (equal to $\beta_2$) regardless of the value of the dummy variable.
► What if we think that x may have a different impact on y when dummy = 0 versus when dummy = 1? How do we model this relation?

key concept: slope dummy variables
► A slope dummy is an explanatory variable that is formed by multiplying a dummy variable by another variable. If dummy is a dummy variable and x is some other variable, then dummy · x is a slope dummy variable.
10
dummies and slope dummies

► Compare the following two specifications:
    simple dummy: $E[y] = \beta_0 + \beta_1 \cdot \text{dummy} + \beta_2 \cdot x$ (change in y for a change in x is not affected by dummy)
    slope dummy: $E[y] = \beta_0 + \beta_1 \cdot \text{dummy} + \beta_2 \cdot x + \beta_3 \cdot \text{dummy} \cdot x$ (change in y for a change in x is affected by dummy)
Remark. Notice the simple algebraic implications:
    if dummy = 0 then slope dummy = 0
    if dummy = 1 then slope dummy = x
► Does the slope dummy variable allow us to model the situation in which x may have a different impact on y when dummy = 0 versus when dummy = 1? In general the change in mean y, $\Delta E[y]$, for a change in x, $\Delta x$, is given by
    $\Delta E[y] = \beta_2 \cdot \Delta x + \beta_3 \cdot \text{dummy} \cdot \Delta x$
  thus, setting dummy to zero and then to one:
    $\Delta E[y \mid \text{dummy} = 0] = \beta_2 \cdot \Delta x$
    $\Delta E[y \mid \text{dummy} = 1] = \beta_2 \cdot \Delta x + \beta_3 \cdot \Delta x = (\beta_2 + \beta_3) \cdot \Delta x$
► Notice that for the same change of one unit in x, the change in mean y is $\beta_2$ when dummy = 0 but it is $\beta_2 + \beta_3$ when dummy = 1.
11
dummies and slope dummies: graphical comparison

Left diagram (dummy only): $E[y] = \beta_0 + \beta_1 \cdot d + \beta_2 \cdot x$
► Notice that including only a dummy variable results in a change in level only (and the same for any x). The slope is the same, $\beta_2$.
► To see this, set d = 0 and then d = 1:
    d = 0: $E[y] = \beta_0 + \beta_2 \cdot x$
    d = 1: $E[y] = \beta_0 + \beta_1 + \beta_2 \cdot x$

Right diagram (dummy and slope dummy): $E[y] = \beta_0 + \beta_1 \cdot d + \beta_2 \cdot x + \beta_3 \cdot d \cdot x$
► Notice that including both the dummy and slope dummy variables results in a change in level and in slope. The change in slope is exactly $\beta_3$.
► To see this, set d = 0 and then d = 1:
    d = 0: $E[y] = \beta_0 + \beta_2 \cdot x$ (intercept $\beta_0$, slope $\beta_2$)
    d = 1: $E[y] = \beta_0 + \beta_1 + \beta_2 \cdot x + \beta_3 \cdot x$ (intercept $\beta_0 + \beta_1$, slope $\beta_2 + \beta_3$)

Figure 4. Comparison of true relation between mean of y and a dummy variable d, a continuous variable x and a slope dummy
12
Web Ads: the use of dummies

    twoway (scatter hits advatW if advatW == 0) (scatter hits advatW if advatW == 1)

Figure 5. Scatter diagram for hits and advatW

Notice how the if qualifier allows you to plot on the same graph the scatters for y by the values of the dummy variable.

    regress hits advatW

Remark. The regression command is the same whether or not there are dummy variables among the explanatory variables.

       hits |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    --------+----------------------------------------------------------------
     advatW |   108975.8   58124.93     1.87   0.077    -13140.21    231091.7
      _cons |   154809.8   45023.38     3.44   0.003     60219.14    249400.4

Figure 6. Regression results hits on advatW

► The estimated regression is:
    Est. $E[\text{hits}] = 154809.8 + 108975.8 \cdot \text{advatW}$
13
Web Ads: the use of dummies

► Coefficients interpretation:
    $b_0 = 154{,}809.8$: the constant $b_0$ represents the expected number of hits without advertisement, i.e. Est. $E[\text{hits}]$ when advatW = 0. You should expect around 154,810 hits if no advertisement is performed.
    $b_0 + b_1 = 263{,}785.5$: the sum of the coefficients, $b_0 + b_1$, represents the expected number of hits when advertisement is performed, i.e. Est. $E[\text{hits}]$ when advatW = 1. You should expect a total number of hits of 263,786 when you advertise.
    $b_1 = 108{,}975.8$: the coefficient $b_1$ represents the difference in number of hits when advertisement is performed compared to no advertisement, i.e. Est. $E[\text{hits}]$ when advatW = 1 minus Est. $E[\text{hits}]$ when advatW = 0. You should expect the number of hits to increase by 108,976 when you start advertising vs. no advertisement.

       hits |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    --------+----------------------------------------------------------------
     advatW |   108975.8   58124.93     1.87   0.077    -13140.21    231091.7
      _cons |   154809.8   45023.38     3.44   0.003     60219.14    249400.4

Figure 7. Regression results hits on advatW
14
Web Ads: the use of dummies

► There are two indicators of precision: the standard error (a measure of how much the estimates obtained from repeated sampling would spread around the current estimate) and the confidence interval (the range around the current estimate that corresponds to a given confidence level). The link between the two measures is
    $t_{df,\alpha/2} = \text{invttail}(df, \alpha/2)$
    sample estimate $-$ std.error $\cdot\, t_{df,\alpha/2} \;\leq\;$ true value of the coefficient $\;\leq\;$ sample estimate $+$ std.error $\cdot\, t_{df,\alpha/2}$
► The associated 95% confidence interval for the true number of extra hits ranges from -13,140 to 231,092, which is a huge range that even includes zero. Our estimate of the number of extra hits does not appear to be very precise. The calculation is given below:
    lower bound of confidence interval (advatW) = $108975.8 - 2.100 \cdot 58124.93 = -13{,}140.21$
    upper bound of confidence interval (advatW) = $108975.8 + 2.100 \cdot 58124.93 = 231{,}091.7$
  where, for $\alpha = 0.05$ and df = 18, we get $t_{18,0.025} = \text{invttail}(18, 0.025) = 2.100$ (rounded; the bounds above use the unrounded value).
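The interval can be recomputed by hand. The Python sketch below (an illustration, not the course's Stata code) uses the coefficient and standard error from the regression table; the critical value 2.100922 is the t quantile for 18 degrees of freedom carried to more digits than the slide's rounded 2.100, which is an assumption added here so that the bounds match the table.

```python
# Coefficient and standard error on advatW from the regression table.
b1 = 108975.8
se_b1 = 58124.93

# t critical value for alpha = 0.05 and df = 18, i.e. invttail(18, 0.025),
# written with more digits than the slide's rounded 2.100.
t_crit = 2.100922

lower = b1 - t_crit * se_b1
upper = b1 + t_crit * se_b1

print(round(lower, 1), round(upper, 1))
print("interval contains zero:", lower < 0 < upper)
```

Because the interval contains zero, the two-sided test of no advertising effect does not reject at 5%, consistent with the reported p-value of 0.077.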
15
Web Ads: the use of dummies

► For a contract that requires a weekly charge of \$2,000 for advertising space you will need at least 200,000 extra weekly hits to break even (at \$0.01 "revenue" per hit). The estimated extra weekly hits is $b_1 = 108{,}975.8$.
► The current estimate is indeed fairly low, but since this is just an estimate we might want to investigate further the likelihood that the extra weekly hits are at least 200,000. So we would set up:
    hypothesis: $H_0: \beta_1 \geq 200{,}000$ vs. $H_a: \beta_1 < 200{,}000$
► The easiest way to conduct the test is to use the klincom command (as klincom _b[advatW] - 200000). Why klincom and not kpredint? Here we are talking about a change in the y-variable for a change in the x-variable, thus we are looking at a "move along the regression line": a change in the mean (average) of y as we change x. If we were testing the number of hits itself, then we could use kpredint.

       hits |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    --------+----------------------------------------------------------------
        (1) |  -91024.25   58124.93    -1.57   0.135    -213140.2    31091.71
    -------------------------------------------------------------------------
    If Ha: < then Pr(T < t) = .067
    If Ha: not = then Pr(|T| > |t|) = .135
    If Ha: > then Pr(T > t) = .933

Figure 8. Output for klincom command

► We cannot reject the null at 5%.
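The break-even threshold and the test statistic follow from simple arithmetic. This is a small Python sketch (not the klincom command itself) using only numbers quoted in the slide; the variable names are illustrative.

```python
# Contract terms quoted in the slide.
weekly_cost = 2000.0      # dollars per week for advertising space
revenue_per_hit = 0.01    # dollars of "revenue" per hit

break_even_hits = weekly_cost / revenue_per_hit   # extra hits needed to break even

# Estimated extra weekly hits and its standard error (regression of hits on advatW).
b1 = 108975.8
se_b1 = 58124.93

# Same quantity the klincom expression _b[advatW] - 200000 evaluates.
difference = b1 - break_even_hits
t_stat = difference / se_b1

print(round(break_even_hits), round(difference, 1), round(t_stat, 2))
```

The standard error of the difference equals the standard error of the coefficient because the benchmark 200,000 is a constant, which is why the klincom output repeats 58124.93.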
16
Pizza Sales: the use of slope dummies

► Let's start by visualizing the data: pizza sales versus income. Notice that income is the continuous variable here.

    twoway (scatter Sales Income if Competitor == 0, mcolor(red)) (scatter Sales Income if Competitor == 1, mcolor(navy))

Figure 9. Graphical representation of data

To create scatters for different values of the dummy variable and color them differently we can use the if qualifier and the option mcolor(color).

Remark. Notice that sales tend to be higher without competition for the same level of income (red dots); it also seems that the change in sales is greater for the same change in income when there is no competition.
► These observations lead us to consider, eventually, the inclusion of a slope dummy.
17
Pizza Sales: the use of slope dummies

► Regression 1: NO dummy variable
    $E[\text{Sales}] = \beta_0 + \beta_2 \cdot \text{Income}$

    regress Sales Income

      Sales |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    --------+----------------------------------------------------------------
     Income |   2.697244   .2777973     9.71   0.000     2.138695    3.255793
      _cons |   48.72079   87.57192     0.56   0.581    -127.3544    224.7959

Figure 10. Regression results Sales on Income

► The estimated regression equation is:
    Est. $E[\text{Sales}] = 48.72079 + 2.697244 \cdot \text{Income}$
Remark. Since we did not include a dummy for the competition regime, the regression above refers to the estimates across all possible competition regimes, i.e. without controlling for competition.
18
Pizza Sales: the use of slope dummies

► Regression 1: NO dummy variable
    $E[\text{Sales}] = \beta_0 + \beta_2 \cdot \text{Income}$
► Since we are considering the usefulness of this regression before choosing new locations, we have to consider two aspects:
    Since the current regression does not control for the competition regime, we will get the same estimate for a new location whether or not a competitor is present; in this respect the regression is not very useful.
    Since we are looking for estimates of Sales at one new location, and not average Sales from opening several stores at several locations, we will have to use the kpredint command for a typical level of Income, say the sample average (which is 293.86):

    kpredint _b[_cons] + _b[Income]*293.86, level(95)

    Estimate: 841.33286
    Standard Error of Individual Prediction: 226.37558
    Individual Prediction Interval (95%): [386.17426, 1296.4915]
    t-ratio: 3.7165355
    If Ha: < then Pr(T < t) = 1
    If Ha: not = then Pr(|T| > |t|) = .001
    If Ha: > then Pr(T > t) = 0

Figure 11. Output for kpredint command
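The point estimate in the kpredint output can be reproduced from the Regression 1 coefficients. The Python sketch below is only a consistency check, not a reimplementation of kpredint: it recomputes the estimate, confirms that the reported interval is centered on it, and backs out the multiplier implied by the reported bounds and standard error (the degrees of freedom are not stated in the slide, so no critical value is assumed).

```python
# Regression 1 estimates (Sales on Income).
b0 = 48.72079
b_income = 2.697244
income = 293.86            # sample average of Income, from the slide

estimate = b0 + b_income * income

# Values reported by kpredint.
se_pred = 226.37558
lower, upper = 386.17426, 1296.4915

midpoint = (lower + upper) / 2                  # should coincide with the estimate
implied_t = (upper - lower) / (2 * se_pred)     # multiplier implied by the output

print(round(estimate, 2), round(midpoint, 2), round(implied_t, 3))
```

The half-width of roughly 455 around an estimate of 841 is what makes this interval too wide to be useful for choosing locations.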
19
Pizza Sales: the use of slope dummies

► Regression 1: NO dummy variable
    $E[\text{Sales}] = \beta_0 + \beta_2 \cdot \text{Income}$
► The prediction interval at the sample average of Income, from 386.17 to 1296.49, is fairly wide. It leaves quite a wide range for Sales and suggests that this regression will not give us very accurate sales predictions for various locations, and thus will not be that useful for choosing new locations.
► The diagram shows the prediction interval for each level of Income and the estimated regression line.
► Notice that as Income increases the "dots" representing Sales diverge away from the estimated regression line, a sign that we should probably refine our analysis.

Figure 11. Graphical representation of prediction intervals
20
Pizza Sales: the use of slope dummies

► Regression 2: slope dummy variable
    $E[\text{Sales}] = \beta_0 + \beta_1 \cdot \text{Competitor} + \beta_2 \cdot \text{Income} + \beta_3 \cdot \text{Competitor} \cdot \text{Income}$
► A better model can be constructed by taking into account the effect of competition. We can do this by creating a slope dummy variable, IncomeCompetitor, using the command

    generate IncomeCompetitor = Competitor*Income

► We include the dummy variable Competitor and the slope dummy variable IncomeCompetitor in the regression:
    Competitor equals one if your main competitor chain also has a store in the neighborhood and zero otherwise.
    IncomeCompetitor is the product of Income and Competitor, so it equals Income if your main competitor chain also has a store in the neighborhood and zero otherwise.
► It is crucial to introduce the slope dummy variable in this problem because you want your model to reflect that what your competitor takes from you is market share (or a share of the disposable income in the area).
► It is sensible that the existence of a competitor may induce a greater reduction of your sales (in absolute terms) in richer neighborhoods than in poor neighborhoods. Including only a 0-1 dummy variable would restrict the effect of a competitor to taking away a fixed amount of sales, no matter what the income in the neighborhood (shift in level). We need the slope dummy to capture the idea that "when the pie is bigger there is more for a competitor to steal" (shift in level and change in slope).
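For readers working outside Stata, the generate step amounts to an element-wise product. The Python sketch below uses four invented rows (the `income` and `competitor` lists are hypothetical, not the course data) to show the resulting column.

```python
# Hypothetical rows (NOT the course data): neighborhood income and a 0/1 flag
# for whether the main competitor also has a store there.
income = [150.0, 220.0, 310.0, 400.0]
competitor = [0, 1, 0, 1]

# Slope dummy: product of the dummy and the continuous variable.
income_competitor = [c * x for c, x in zip(competitor, income)]

for row in zip(income, competitor, income_competitor):
    print(row)
```

The new column equals Income in rows where Competitor is 1 and zero elsewhere, which is exactly the description of IncomeCompetitor given above.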
21
Pizza Sales: the use of slope dummies

► Regression 2: slope dummy variable
    $E[\text{Sales}] = \beta_0 + \beta_1 \cdot \text{Competitor} + \beta_2 \cdot \text{Income} + \beta_3 \cdot \text{Competitor} \cdot \text{Income}$

    regress Sales Competitor Income IncomeCompetitor

               Sales |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -----------------+----------------------------------------------------------------
          Competitor |  -11.10013    78.6879    -0.14   0.888    -169.4907    147.2904
              Income |   3.266832   .1944854    16.80   0.000     2.875353    3.658311
    IncomeCompetitor |   -1.24186   .2459005    -5.05   0.000    -1.736832   -.7468882
               _cons |   90.70052   63.88796     1.42   0.162    -37.89927    219.3003

Figure 12. Regression results Sales on Competitor, Income and IncomeCompetitor

► The estimated regression equation is:
    Est. $E[\text{Sales}] = 90.701 - 11.100 \cdot \text{Competitor} + 3.267 \cdot \text{Income} - 1.242 \cdot \text{IncomeCompetitor}$
Remark. Since we did include a dummy for the competition regime, the regression above refers to the estimates controlling for competition.
22
Pizza Sales: the use of slope dummies

► Regression 2: slope dummy variable
    $E[\text{Sales}] = \beta_0 + \beta_1 \cdot \text{Competitor} + \beta_2 \cdot \text{Income} + \beta_3 \cdot \text{Competitor} \cdot \text{Income}$
► The estimated regression is
    Est. $E[\text{Sales}] = b_0 + b_1 \cdot \text{Competitor} + b_2 \cdot \text{Income} + b_3 \cdot \text{IncomeCompetitor}$
► For the interpretation of the coefficients let's consider the table below. The inner cells give the Sales corresponding to each of the four possible scenarios (obtained as combinations of Competitor being 0 or 1 and Income being 0 or different from zero). To isolate the coefficients further we calculate the differences row-wise (the effect of a change in Income on Sales in neighborhoods without and with competition) and column-wise (the effect of competition on Sales in neighborhoods with zero or nonzero Income).

    Sales                  | Income = 0 | Income ≠ 0                      | difference row-wise
    -----------------------+------------+---------------------------------+--------------------
    Competitor = 0         | b0         | b0 + b2·Income                  | b2·Income
    Competitor = 1         | b0 + b1    | b0 + b1 + b2·Income + b3·Income | (b2 + b3)·Income
    difference column-wise | b1         | b1 + b3·Income                  | b3·Income

Figure 13. Interpretation of regression coefficients
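The table can be filled in numerically. The Python sketch below (illustrative, not part of the course's Stata code) uses the Regression 2 estimates reported in Figure 12; the income level of 100 is an arbitrary value chosen for the example, and `est_sales` is a hypothetical helper name.

```python
# Regression 2 estimates (Figure 12).
b0, b1, b2, b3 = 90.70052, -11.10013, 3.266832, -1.24186


def est_sales(competitor, income):
    """Estimated mean Sales for a 0/1 Competitor value and an Income level."""
    return b0 + b1 * competitor + b2 * income + b3 * competitor * income


income = 100.0  # illustrative income level (an assumption, not from the slides)

# Row-wise differences: effect of moving Income from 0 to `income`.
row_diff_no_comp = est_sales(0, income) - est_sales(0, 0.0)    # b2 * Income
row_diff_comp = est_sales(1, income) - est_sales(1, 0.0)       # (b2 + b3) * Income

# Column-wise differences: effect of competition at a given Income.
col_diff_zero = est_sales(1, 0.0) - est_sales(0, 0.0)          # b1
col_diff_income = est_sales(1, income) - est_sales(0, income)  # b1 + b3 * Income

print(round(row_diff_no_comp, 4), round(row_diff_comp, 4))
print(round(col_diff_zero, 4), round(col_diff_income, 4))
```

With these estimates the sales lost to a competitor grow with Income, since the column-wise difference adds b3·Income to b1 and b3 is negative.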
23
Pizza Sales: the use of slope dummies

► Regression 2: slope dummy variable
    $E[\text{Sales}] = \beta_0 + \beta_1 \cdot \text{Competitor} + \beta_2 \cdot \text{Income} + \beta_3 \cdot \text{Competitor} \cdot \text{Income}$
► Coefficients interpretation: levels vs. changes
    $b_0$: the expected Sales in a neighborhood without competition and Income = 0
    $b_0 + b_1$: the expected Sales in a neighborhood with competition and Income = 0
    $b_1$: the difference in average Sales between neighborhoods with and without competition, for Income = 0
    $b_2$: the expected change in Sales relative to changes in Income when Competitor = 0
    $b_2 + b_3$: the expected change in Sales relative to changes in Income when Competitor = 1
    $b_3$: the differential effect of the existence of competition on the change in Sales relative to changes in Income
► Coefficients interpretation: intercepts vs. slopes
  Some of the coefficients above correspond to intercept(s) and others to slope(s). How can we figure this out? The intercepts correspond to levels, thus $b_0$ and $b_1$ are candidates for intercepts, while slopes correspond to changes, thus $b_2$ and $b_3$ are candidates for slopes.
24
Pizza Sales: the use of slope dummies

► Regression 2: slope dummy variable
    Est. $E[\text{Sales}] = 90.70 - 11.10 \cdot \text{Competitor} + 3.27 \cdot \text{Income} - 1.24 \cdot \text{IncomeCompetitor}$

    Competitor = 0: Sales $= b_0 + b_2 \cdot \text{Income}$
        intercept $b_0 = 90.70052$, the expected Sales in a neighborhood without competition and Income = 0
        slope $b_2 = 3.26683$
    Competitor = 1: Sales $= (b_0 + b_1) + (b_2 + b_3) \cdot \text{Income}$
        intercept $b_0 + b_1 = 79.60039$, the expected Sales in a neighborhood with competition and Income = 0
        slope $b_2 + b_3 = 2.02497$

Figure 14. Interpretation of regression coefficients
25
Pizza Sales: the use of slope dummies

► We are given a Sales level of 400 and asked for the level of Income that delivers it in a neighborhood with competition.
► With Sales = 400 and Competitor = 1 the equation in Income becomes:
    $400 = 90.70052 - 11.10013 \cdot 1 + 3.266832 \cdot \text{Income} - 1.24186 \cdot \text{Income} \cdot 1$
  and the required Income level is: Income ≈ 158.23
► Of course, the higher the Income the higher the Sales, so a legitimate question would be: how likely is it to get an Income of at least 158.23 in a neighborhood with competition?
► An immediate answer can be obtained using the command

    ci Income if Competitor == 1, level(99)

    Variable |   Obs      Mean   Std. Err.    [99% Conf. Interval]
    ---------+----------------------------------------------------
      Income |    29   281.725   22.21943     220.3269    343.1231

► Above we restricted the statistics for Income to the observations with Competitor = 1. The whole 99% confidence interval for mean Income in neighborhoods with competition lies well above the break-even level of 158.23, so it seems very likely that Income will be higher than that level.
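The required Income follows from solving the Competitor = 1 line for Income. The Python sketch below (illustrative only) uses the Regression 2 coefficients and the lower bound of the 99% confidence interval quoted above; with the coefficients as printed it gives about 158.2, in line with the slide's 158.23 up to rounding.

```python
# Regression 2 estimates.
b0, b1, b2, b3 = 90.70052, -11.10013, 3.266832, -1.24186

target_sales = 400.0

# With Competitor = 1: Sales = (b0 + b1) + (b2 + b3) * Income.
intercept_comp = b0 + b1
slope_comp = b2 + b3
income_needed = (target_sales - intercept_comp) / slope_comp

# Lower bound of the 99% confidence interval for mean Income (Competitor = 1).
ci_lower = 220.3269

print(round(intercept_comp, 5), round(slope_comp, 6), round(income_needed, 2))
print("below the 99% CI lower bound:", income_needed < ci_lower)
```

Plugging the solved Income back into the fitted line returns the target Sales of 400, which is a convenient check on the algebra.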
26
Conclusions: the use of dummies / slope dummies

► Don't add slope dummies just because you can run a regression with many independent variables.
    This adds unwanted complexity to your regression.
    This can unnecessarily decrease the precision of your estimates and make it more difficult to interpret results.
► You should include a slope dummy only if you suspect, or are interested in whether, the effect of some variable x on y depends on the value of some dummy variable d.
    Even then, it is a good idea to also report the results without the slope dummy, so you can get an idea of the overall effect of x across all values of d.
► What if the coefficient on the slope dummy is insignificant?
    This implies that we do not have strong evidence that the effect of x on y depends on the value of the dummy variable d.
    Unless you have strong a priori reasons to keep it, or it is central to the analysis at hand, you can drop the slope dummy in such situations and redo the analysis without it.
► What if the coefficient on the slope dummy is significant but the coefficient on the dummy variable itself (not interacted with x) is insignificant?
    You should nonetheless include the dummy variable to facilitate interpretation of the results.