WEMBA, Regression Analysis Market Intelligence Julie Edell Britton Session 9 October 9, 2009
Today’s Agenda Announcements WEMBA C Multiple Regression Conjoint & dummy variable regression Multiple $R^2_{Y|X_1,X_2}$ vs. $r^2_{Y|X_i}$ Uncorrelated predictors Correlated predictors Promotion analysis Course Evaluations
Announcements Submit Nestle Contadina slides by 8 am tomorrow, Sat., 10/10
4 A Model of School Choice Values Perceptions Individual Differences & Constraints Become a Fuqua Student Assumes that behavior is driven by differences in: Values (Importance of key attributes) Perceptions (Duke and Competition on key attributes) Individual Differences & Constraints (travel, cost, etc.)
5 The Funnel Matriculate Admitted Opt Out Apply Selected Out Attend Information Session Opt Out Do not attend Information Session Opt Out
6 The Analysis Approach Sample groups that differ in behavior Compare the groups on relevant dimensions: Perceptions Values Individual Differences & Constraints Infer that any differences found between groups are partly responsible for differences in behavior
7 Disproportionate Stratified Random Sample Matriculate Admitted Opt Out Apply Selected Out Attend Information Session Do not attend Information Session Not Apply n = 26 (of 158, 16.5%) Not Apply n = 24 (of 173, 14%) n = 30 (of 60, 50%) n = 26 (of 60, 52%) n = 42 (of 56, 75%) n = 12 (of 56, 25%) n = 56
8 What Drives Application to Duke? Compare Applied to Did Not Apply Perceptions (Duke – Competitor) Constraints Financial assistance program % Cost Time to travel to Duke Can’t use attendance at Information Session as a factor here without going to population data What did you learn?
9 What Drives Application to Duke? (* = chi-square test statistic) FACTOR | Applied | Did Not Apply | t-stat | p-value (2-tail) Perceptions (Duke – Competitor): Faculty Reputation, Cost, Mix of Face to Face and Distance Learning. Demographics: % Cost You Pay, How Long to Travel to Duke, Company Helps Pay | 85.7% | 72.0% | 3.038* | .081
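The starred entries are chi-square statistics for comparing two proportions. A minimal Python sketch of that test follows, assuming a plain Pearson chi-square on a 2x2 table with 1 degree of freedom. The cell counts are hypothetical (the slide reports only percentages), and the function names are illustrative rather than taken from the course materials.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def p_value_1df(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Hypothetical counts, NOT the survey data:
# 30 of 50 say yes in one group, 20 of 50 say yes in the other.
chi2_demo = chi_square_2x2(30, 20, 20, 30)
p_demo = p_value_1df(chi2_demo)

# p-value implied by the chi-square statistic reported on the slide (3.038).
p_slide = p_value_1df(3.038)

print(round(chi2_demo, 3), round(p_demo, 4), round(p_slide, 3))
```

With 1 degree of freedom, a statistic of 3.038 corresponds to a p-value of about .081, which agrees with the value reported for Company Helps Pay.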
10 What Drives Acceptance? Compare accepted to did not accept, conditional on having applied Perceptions (Duke – Competitor) Constraints Financial assistance program % Cost paid by employer Time to travel to Duke Can consider info session attendance here What did you learn?
11 What Drives Acceptance? (* = chi-square test statistic) FACTOR | Accepted | Did Not Accept | t-stat | p-value (2-tail) Perceptions (Duke – Competitor): Faculty Reputation, Teaching Quality, Core Quality, Electives, Student Quality, Technology, Advance Career, Cost, Student Reputation. Demographics: % Cost You Pay, How Long to Travel to Duke, Attended an Information Session | 59.5% | 39.7% | 2.406* | .108
12 Acceptance by Sponsorship % paid by company: Student = 53.5 NonStudent = 31.1
13 The Impact of Info Sessions? Compare those who attended to those who did not attend on perceptions of Duke What did you learn?
14 The Impact of Info Sessions? FACTOR | Info Session | No Info Session | t-stat | p-value (2-tail) Perceptions (Duke): Ability to Shift Work, Cost. Half the perceptual factors (9 of 17) were in the wrong direction! Information sessions don’t seem to be doing much good at all. *Only positive evidence is that, in the overall population, the probability of applying was 16.5% for those who attended an info session vs. 9.5% for those who did not, χ² = 8.70, p < .005. *No significant effect on matriculation / acceptance: 59.5% of those who attended accepted vs. 39.7% of those who did not attend, χ² = 2.41, p = .11
15 Who to Target? FACTOR | Applied | Did Not Apply | t-stat | p-value (2-tail): % Cost You Pay, How Long to Travel to Duke, Company Helps Pay | 85.7% | 72.0% | 3.038* | .081. FACTOR | Accepted | Did Not Accept | t-stat | p-value (2-tail): % Cost You Pay, How Long to Travel to Duke
16 What Should be Emphasized in Information Sessions? Attributes that are important and where we do well and/or where our competition does not do well. Importance Focus on attributes that predict applying & accepting Rank order attribute importance Look at important attributes where perceptions (Duke – Comp) is positive.
17 Quasi-MAAM for Communication Content: Demonstrated Importance Duke Advantage Duke Disadvantage Important Faculty Rep, Cost Advance Career, Teaching Quality, Electives, Core Quality, Faculty Rep, Student Quality, School Rep Brag Misperception (fix Marcom) or Real Problem (fix Product)? Unimportant Save Your Breath Ignore
18 Quasi-MAAM for Communication Content: Using Importance Ratings Duke Advantage Duke Disadvantage Important Continue Career, School Rep, Forwarding Career, Teaching Quality, Faculty Reputation, Other Students Brag Misperception (fix Marcom) or Real Problem (fix Product)? Unimportant Save Your Breath Ignore
19 WEMBA Takeaways Be backward in your analysis process too Outline your analysis before you begin Think about tables needed, then do the analysis to make them Real data is imperfect…do the best you can Survey data is correlational, not causal The funnel approach applies to many business problems
20 Multiple Regression Simple linear regression, with more than one predictor: $\hat{y} = a + b_1 x_1 + b_2 x_2 + \dots + b_k x_k$. a = intercept: predicted value of y if $x_1 = x_2 = \dots = x_k = 0$. $b_1$ = slope of y on $x_1$ given that $x_2 \dots x_k$ are already in the equation. R = multiple correlation = correlation of Y-hat with Y ($0 \le R \le 1$). $R^2$ = % of variance in Y explained by the best linear regression equation.
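The definitions on this slide can be checked numerically. The sketch below fits a two-predictor regression by least squares and confirms that the multiple correlation R is the correlation between Y and Y-hat, with R squared equal to the share of variance explained. The data values and helper names (`solve`, `ols`, `corr`) are hypothetical, chosen only for illustration, and the code uses plain Python rather than any particular statistics package.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [v - f * w for v, w in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(xs, y):
    """Least-squares coefficients [a, b1, ..., bk]; xs is a list of predictor columns."""
    cols = [[1.0] * len(y)] + xs
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    return solve(xtx, xty)

def corr(u, v):
    """Pearson correlation between two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    suu = sum((p - mu) ** 2 for p in u)
    svv = sum((q - mv) ** 2 for q in v)
    return suv / (suu * svv) ** 0.5

# Hypothetical data: six observations, two predictors.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [3.0, 4.0, 8.0, 7.0, 12.0, 10.0]

a, b1, b2 = ols([x1, x2], y)
y_hat = [a + b1 * p + b2 * q for p, q in zip(x1, x2)]

R = corr(y, y_hat)                                   # multiple correlation
mean_y = sum(y) / len(y)
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
sst = sum((yi - mean_y) ** 2 for yi in y)
r_squared = 1.0 - sse / sst                          # share of variance explained

print(round(R, 4), round(r_squared, 4))
```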
21 Conjoint as Dummy Variable Regression (Y) Rating, (X 1 ) Size 16oz, (X 2 ) Pepsi, (X 3 ) Caffeine
22 Bivariate Correlation Matrix
23 $R^2_{Y|X_1,X_2,X_3} = r^2_{Y|X_1} + r^2_{Y|X_2} + r^2_{Y|X_3}$ because the dummies are uncorrelated: $.976 = (.327)^2 + (-.873)^2 + (\dots)^2$
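The additivity of squared correlations can be verified on a small balanced design. The sketch below assumes a full 2x2x2 factorial of size, brand, and caffeine dummies, so every pair of dummies is uncorrelated. The ratings are hypothetical and are not the ratings behind the .976 on the slide; the helper names are illustrative.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [v - f * w for v, w in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(xs, y):
    """Least-squares coefficients [a, b1, ..., bk]; xs is a list of predictor columns."""
    cols = [[1.0] * len(y)] + xs
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    return solve(xtx, xty)

def corr(u, v):
    """Pearson correlation between two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    suu = sum((p - mu) ** 2 for p in u)
    svv = sum((q - mv) ** 2 for q in v)
    return suv / (suu * svv) ** 0.5

# Full 2x2x2 factorial: every combination of the three dummies appears once.
size     = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
pepsi    = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0]
caffeine = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
rating   = [7.0, 9.0, 2.0, 3.0, 8.0, 9.0, 3.0, 5.0]   # hypothetical ratings

dummies = [size, pepsi, caffeine]
coef = ols(dummies, rating)
fitted = [coef[0] + sum(b * x[i] for b, x in zip(coef[1:], dummies))
          for i in range(len(rating))]

big_r2 = corr(rating, fitted) ** 2                    # multiple R squared
sum_r2 = sum(corr(rating, x) ** 2 for x in dummies)   # sum of squared simple r's

print(round(big_r2, 4), round(sum_r2, 4))
```

The two printed values agree because the dummies in a balanced factorial design are mutually uncorrelated; the equality generally fails once predictors overlap.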
24 Dummy Variable Regression
25 A Framework for Understanding Multicollinearity Major Problem in Multiple Regression: Assessing the (unique) contribution of the individual predictors. [Diagram: variance in x 1, variance in x 2, and variance in y, with areas labeled a, b, c, e]
26 The area in c causes ambiguity in specifying the contributions of x 1 and x 2 to explaining y. Should we attribute all of this variance to x 1 ? All to x 2 ? Somehow split it? Two different ways exist of assessing the contribution of x1 to explaining y. Multicollinearity
27 2 Ways to Deal with Overlap (see slide 26)
28 Understand Expenditures in Milan Food Run two regressions Variables: Total in household; Weekly food expenditures; Any kids 6-18? (0=no, 1=yes); Annual income
29 But Check Correlations Between Predictors
30 Zero order coefficients tell us: $r_{bc}$ = +.43, $r_{bi}$ = +.40, $r_{bj}$ = +.37. More people in household, more weekly food expenditures. If any kids 6-18, more weekly food expenditures. All strongly statistically significant with 498 df. Higher income, more spent on food. Which two should predict weekly expenditures best?
31 Multicollinearity, $r^2$, and $R^2$ Model | Predictor 1 | Predictor 2 | Sum of $r^2$ | Actual $R^2$: 1 | HHSize | Kid; HHSize | IncomeK; IncomeK | Income$
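For two predictors, the gap between the sum of squared correlations and the actual R squared depends on how strongly the predictors correlate with each other. A minimal sketch using the standard two-predictor formula is below. The .43 and .40 are two of the zero-order correlations quoted on slide 30; the predictor intercorrelation of .5 is an assumption made purely for illustration, and the function name is hypothetical.

```python
def r2_two_predictors(r_y1, r_y2, r_12):
    """R squared of y on x1 and x2, computed from the three pairwise correlations."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2.0 * r_y1 * r_y2 * r_12) / (1.0 - r_12 ** 2)

r_y1, r_y2 = 0.43, 0.40          # zero-order correlations with y
sum_r2 = r_y1 ** 2 + r_y2 ** 2   # what R squared would be with uncorrelated predictors

r2_uncorrelated = r2_two_predictors(r_y1, r_y2, 0.0)
r2_correlated = r2_two_predictors(r_y1, r_y2, 0.5)   # r_12 = .5 is an assumed value

print(round(sum_r2, 4), round(r2_uncorrelated, 4), round(r2_correlated, 4))
```

When the predictors are uncorrelated the formula reduces to the simple sum. With the assumed positive intercorrelation, the shared variance is counted only once and the actual R squared falls below the sum.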
32
33 “Partial Effect” Milan Food Problem Run two regressions Weekly food expenditures Any children under 6? (0=no, 1=yes) Total in household Any children under 6? (0=no, 1=yes)
34
35 Doritos XL Models. Effects of own price (& price promotion) & price promotions of other sizes (SM, XXL, 3XL) on sales of XL size
36
37 Only Own XL Price Significant
38 Drop 3 XL from the model
39 Promotion Models (p. 195) If coupon dummy (1 = yes, 0 = no) and promo dummy (1 = yes, 0 = no) are perfectly correlated and each is correlated, say, r = .5 with weekly sales, $R^2$(sales | coupon, promo) = .25 << .50, the sum of the two squared correlations. Coefficients on coupon and promo would be indistinguishable from zero (nonsignificant), with huge standard errors.
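The standard-error inflation can be illustrated with a small simulation-free example. Perfectly correlated dummies make the regression unsolvable, so this sketch assumes two dummies that differ in just one of twelve weeks. The weekly sales figures are hypothetical, and `ols_with_se` is an illustrative helper written in plain Python, not a library function.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [v - f * w for v, w in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols_with_se(xs, y):
    """Return (coefficients, standard errors) for y = a + b1*x1 + ... + bk*xk."""
    cols = [[1.0] * len(y)] + xs
    k = len(cols)
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(b * c[i] for b, c in zip(beta, cols)) for i in range(len(y))]
    s2 = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / (len(y) - k)
    # j-th diagonal element of (X'X)^-1, obtained by solving against unit vectors.
    diag = [solve(xtx, [1.0 if i == j else 0.0 for i in range(k)])[j] for j in range(k)]
    return beta, [(s2 * d) ** 0.5 for d in diag]

# Hypothetical 12 weeks: the promo runs alongside the coupon in all but week 6.
coupon = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
promo  = [1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
sales  = [150.0, 160.0, 140.0, 155.0, 145.0, 150.0,
          100.0, 110.0, 90.0, 105.0, 95.0, 100.0]

beta_alone, se_alone = ols_with_se([coupon], sales)
beta_both, se_both = ols_with_se([coupon, promo], sales)

print("coupon only:    b = %.1f, SE = %.2f" % (beta_alone[1], se_alone[1]))
print("coupon + promo: b = %.1f, SE = %.2f" % (beta_both[1], se_both[1]))
```

In this constructed example the coupon coefficient is the same in both models, but its standard error roughly doubles once the nearly redundant promo dummy is added. As the two dummies approach perfect correlation, the inflation grows without bound.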
40 Regression Analysis Data Week | Y = Sales | X1 = Coupon | X2 = Promotion …………
41 Promotion Models Omitted Variable Bias Promo, Coupons each boost 1000 units, but Coupon omitted & r = 1 with promo, coefficient for promo will be (P ) Multicollinearity & Overloaded Models
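The omitted variable case can be worked through directly. The sketch below assumes the promo and the coupon always run in the same weeks (r = 1), that each adds 1000 units, and that baseline sales are 5000 units a week. The baseline and the weekly schedule are hypothetical, and there is no noise in the data so the arithmetic stays visible.

```python
def slope(x, y):
    """Simple-regression slope of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

promo = [1, 0, 1, 0, 1, 0, 0, 1]   # hypothetical promotion schedule
coupon = promo[:]                  # coupon always accompanies the promo (r = 1)
base = 5000                        # hypothetical baseline weekly sales

# True model: each lever adds 1000 units.
sales = [base + 1000 * p + 1000 * c for p, c in zip(promo, coupon)]

# Coupon is omitted from the regression, so promo is the only predictor.
b_promo = slope(promo, sales)
print(b_promo)
```

With the coupon left out, the promo coefficient absorbs both effects, so it has to be reinterpreted as the combined lift of the levers that were pulled together.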
42 Sales Promotion Analysis, p. 187
43 Takeaways Multiple Regression Dummy Variable Regression for Conjoint (uncorrelated predictors) Correlated predictors make it difficult to assess each predictor’s unique contribution. Common in promotion analysis because it is common to pull multiple promotional levers simultaneously. 2 Solutions: Drop a predictor (omitted variable bias so reinterpret coefficients) Leave both in (inflated Standard Errors, hard to assess impact of each)