business analytics II ▌ assignment three - solutions: pet food

Managerial Economics & Decision Sciences Department
Developed for business analytics II, week 3 ▌ assignment three - solutions
pet food ▪ soap sales ▪ dvd resales
© 2016 kellogg school of management | managerial economics and decision sciences department | business analytics II

session three: inference, confidence and prediction intervals

learning objectives
► statistics & econometrics
  ▪ definition of confidence and prediction intervals
  ▪ differences between confidence and prediction intervals
►
  ▪ generate confidence and prediction intervals
  ▪ klincom and kpredint commands

readings
► (MSN) Chapter 3
► (CS) Pet Food, Soap Sales, DVD Resales

Pet Food: Visualize Data and Regression

Figure 1. Graphical relation: price and sqfoot

► The first step in analyzing a two-dimensional data problem is to visualize the relation between the variables. To find the parameters of the linear fit (intercept and slope), we run the regression of WeeklySales_hundredsofdollars on ShelfSpace_sqft.

Remark. The linear regression provides estimates b0 and b1 of the true parameters β0 and β1, which are assumed to reflect the relation between the mean of WeeklySales_hundredsofdollars and ShelfSpace_sqft at the population level.

    rename WeeklySales_hundredsofdollars wsales
    rename ShelfSpace_sqft sspace
    regress wsales sspace

Figure 2. Results for linear regression of wsales on sspace

    wsales |   Coef.   Std. Err.      t    P>|t|   [95% Conf. Interval]
    -------+------------------------------------------------------------
    sspace |    .074   .0159081    4.65   0.001    .0385546   .1094454
     _cons |    1.45   .2178302    6.66   0.000    .9646441   1.935356

► The estimated parameters are b0 = 1.45 and b1 = 0.074. STATA stores these values as _b[_cons] and _b[sspace] respectively, and they can be referred to by those names in subsequent calculations.
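As a quick cross-check of how the columns in Figure 2 relate to each other, the sketch below recomputes the t statistics and the 95% confidence interval for the slope from the reported coefficients and standard errors. It is written in Python rather than Stata, and it assumes 10 residual degrees of freedom (the value used in the invttail(10, ...) call later in these solutions); the critical value 2.228139 is the standard t-table entry under that assumption.

```python
# Cross-check of Figure 2 (numbers copied from the regression table).
# Assumption: 10 residual degrees of freedom, so t(10, 0.025) = 2.228139.
T_CRIT_95_DF10 = 2.228139


def t_stat(coef, std_err):
    """t statistic for the null hypothesis that the coefficient is zero."""
    return coef / std_err


def conf_interval(coef, std_err, t_crit):
    """Two-sided confidence interval: coef +/- t_crit * std_err."""
    half_width = t_crit * std_err
    return coef - half_width, coef + half_width


b1, se_b1 = 0.074, 0.0159081   # sspace row
b0, se_b0 = 1.45, 0.2178302    # _cons row

slope_lo, slope_hi = conf_interval(b1, se_b1, T_CRIT_95_DF10)
cons_lo, cons_hi = conf_interval(b0, se_b0, T_CRIT_95_DF10)

print("t(sspace) =", round(t_stat(b1, se_b1), 2))
print("t(_cons)  =", round(t_stat(b0, se_b0), 2))
print("95% CI sspace:", round(slope_lo, 7), round(slope_hi, 7))
print("95% CI _cons :", round(cons_lo, 7), round(cons_hi, 7))
```

Each row of the table is therefore internally consistent: the t column is Coef. divided by Std. Err., and the interval is Coef. plus or minus the critical value times Std. Err.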

Pet Food: Interpret the Linear Regression

► The estimated regression line is:

    Est.E[wsales] = 1.45 + 0.074·sspace

    regress wsales sspace
    generate wsaleshat = _b[_cons] + _b[sspace]*sspace
    twoway (scatter wsales sspace) (connected wsaleshat sspace, sort msymbol(i))

Figure 3. Linear regression graphical representation

▪ In the first step we run the regression, which provides the estimates for the linear coefficients. We then generate the fitted values of wsales for each available observation of sspace. We usually name this fitted value varnamehat, and its calculation is fairly intuitive: use the estimated coefficients and "plug in" the values for sspace.
▪ Add msymbol(i) as an option to the connected graph in order to remove the markers along the line.

Pet Food: Prediction

► For sspace = 8 the average wsales according to the estimated regression is:

    Est.E[ wsales | sspace = 8 ] = 1.45 + 0.074·8 = 2.042 (in hundreds)

▪ Use display _b[_cons] + _b[sspace]*8 to get the result.

Figure 4. Prediction: graphical representation (the marked point is the predicted average wsales for sspace = 8)

Remark. Graphically, the predicted wsales lies on the fitted line corresponding to the estimated regression. Why?
▪ Whenever you "plug" values for the independent variables into the regression equation, you are picking points on the regression line.
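The plug-in calculation can be sketched outside Stata as well. The short Python example below hard-codes the two estimated coefficients from Figure 2; the function name fitted_wsales is just an illustrative choice, not something from the course materials.

```python
# Fitted line from Figure 2: Est.E[wsales] = 1.45 + 0.074 * sspace
# wsales is measured in hundreds of dollars, sspace in square feet.
B0, B1 = 1.45, 0.074


def fitted_wsales(sspace):
    """Point on the estimated regression line for a given shelf space."""
    return B0 + B1 * sspace


pred_8 = fitted_wsales(8)
print("Est.E[wsales | sspace = 8] =", round(pred_8, 3), "(hundreds of dollars)")

# One extra square foot moves the fitted mean by exactly the slope b1.
step = fitted_wsales(9) - fitted_wsales(8)
print("change per extra sq ft =", round(step, 3))
```

Because every value returned by the function is computed from the same intercept and slope, all such predictions lie on the fitted line by construction.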

Pet Food: Intervals

► For the confidence interval we use the klincom command:

    klincom _b[_cons] + _b[sspace]*8, level(90)

Figure 5. Results for klincom command

    wsales |   Coef.   Std. Err.      t    P>|t|   [90% Conf. Interval]
    -------+------------------------------------------------------------
       (1) |   2.042   .1141619   17.89   0.000    1.835086   2.248914

► The confidence interval says that, with 90% confidence, the average weekly sales of pet food for stores with 8 square feet of shelf space for pet food will be between $1.8351 and $2.2489 (in hundreds).

► For the prediction interval we use the kpredint command:

    kpredint _b[_cons] + _b[sspace]*8, level(90)

Figure 6. Results for kpredint command

    Estimate: 2.042
    Standard Error of Individual Prediction: .32853149
    Individual Prediction Interval (90%): [ 1.4465494, 2.6374506 ]
    t-ratio: 6.2155381

► The prediction interval says that, with 90% confidence, the weekly sales of pet food in any individual store with 8 square feet of shelf space for pet food will be between $1.4465 and $2.6375 (in hundreds).

Remark. The key difference is that the confidence interval is about the average, over stores, of weekly sales of pet food for stores having 8 square feet of shelf space, while the prediction interval is about the weekly sales of pet food for an individual store having 8 square feet of shelf space.
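Both intervals have the same form, estimate plus or minus a t critical value times a standard error; only the standard error differs. The Python sketch below reproduces the 90% bounds in Figures 5 and 6 from the reported standard errors. It assumes 10 residual degrees of freedom (as in the invttail(10, ...) call used elsewhere in these solutions), for which the standard table value is t(10, 0.05) = 1.812461.

```python
# Reproduce the 90% intervals at sspace = 8 from the reported standard errors.
# Assumption: 10 residual degrees of freedom, so t(10, 0.05) = 1.812461.
T_CRIT_90_DF10 = 1.812461

estimate = 2.042          # Est.E[wsales | sspace = 8], hundreds of dollars
se_mean = 0.1141619       # Std. Err. from the klincom output (Figure 5)
se_indiv = 0.32853149     # individual-prediction std. error (Figure 6)


def interval(center, std_err, t_crit):
    """Symmetric interval: center +/- t_crit * std_err."""
    half_width = t_crit * std_err
    return center - half_width, center + half_width


ci_lo, ci_hi = interval(estimate, se_mean, T_CRIT_90_DF10)
pi_lo, pi_hi = interval(estimate, se_indiv, T_CRIT_90_DF10)

print("90% confidence interval:", round(ci_lo, 4), round(ci_hi, 4))
print("90% prediction interval:", round(pi_lo, 4), round(pi_hi, 4))
```

The prediction interval is wider only because its standard error is larger; the center and the critical value are identical.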

Pet Food: Intervals (for every value of sspace in the sample)

    generate estwsales = _b[_cons] + _b[sspace]*sspace

▪ This command generates a new variable estwsales by calculating, for each value of sspace in the sample, the corresponding value of the expression _b[_cons] + _b[sspace]*sspace.

    generate tval = invttail(10,0.025)

▪ This command generates a new variable tval that holds the same value, invttail(10,0.025), for every observation. Note that invttail(10,0.025) is the critical value for 95% intervals; the 90% intervals reported in Figures 5, 6 and 7 correspond to invttail(10,0.05).

    predict CIstderror, stdp      (for confidence interval)
    predict PIstderror, stdf      (for prediction interval)

▪ These commands generate new variables CIstderror and PIstderror. For each value of sspace in the sample, CIstderror (option stdp) is the standard error of the estimated mean _b[_cons] + _b[sspace]*sspace, and PIstderror (option stdf) is the standard error of the prediction for an individual observation. Each command generates n observations (the sample size), so afterwards you should have two extra columns in the data set. Since the sample contains different values of sspace, and therefore different estimates of the dependent variable, the standard errors differ across values of sspace.

    generate lbCI = estwsales - tval*CIstderror
    generate ubCI = estwsales + tval*CIstderror
    generate lbPI = estwsales - tval*PIstderror
    generate ubPI = estwsales + tval*PIstderror

▪ These commands generate new variables lbCI, ubCI and lbPI, ubPI by calculating, for each value of sspace in the sample, the corresponding value of the expressions above. These bounds will be different for different values of sspace.
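The gap between the two standard errors has a simple source: Stata's predict computes the forecast standard error (stdf) as the square root of stdp squared plus the residual variance s squared. The Python sketch below uses that identity to back out the residual variance implied by the two standard errors reported at sspace = 8. The resulting root MSE is an implied value, not one that appears in the slides.

```python
import math

# Standard errors reported at sspace = 8 (Figures 5 and 6).
se_mean = 0.1141619       # stdp: standard error of the estimated mean
se_indiv = 0.32853149     # stdf: standard error for an individual store

# Identity used by Stata's predict: stdf^2 = stdp^2 + s^2,
# where s is the root mean squared error of the regression.
residual_var = se_indiv ** 2 - se_mean ** 2
rmse = math.sqrt(residual_var)

print("implied residual variance s^2 =", round(residual_var, 4))
print("implied root MSE s            =", round(rmse, 4))

# Rebuilding stdf from stdp and s returns the reported value.
rebuilt_stdf = math.sqrt(se_mean ** 2 + rmse ** 2)
print("rebuilt stdf                  =", round(rebuilt_stdf, 8))
```

Since s does not shrink as the estimate of the mean gets more precise, the prediction band is always wider than the confidence band at every value of sspace.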

Pet Food: Intervals

Figure 7. Confidence and prediction intervals. The intervals for Shelf Space = 8 sqft are read off the plot at sspace = 8: lbPI = 1.44, lbCI = 1.83, ubCI = 2.24, ubPI = 2.63.

Soap Sales: Estimated Regression

► The regression table provides the following information:

    regress Sales Price

Figure 8. Results for linear regression of Sales on Price

    Sales |     Coef.   Std.Err.       t    P>|t|   [95% Conf. Interval]
    ------+--------------------------------------------------------------
    Price | -.2929416   .0616406   -4.75   0.000   -.4204540   -.165428
    _cons | 5.8291984   .4241016   13.74   0.000   4.9518744  6.7065194

► The estimated regression is thus:

    Est. E[Sales] = 5.8291984 − 0.2929416·Price

► We can use this equation to determine how a change in Price affects the estimated mean Sales:

    Change in Est. E[Sales] = −0.2929416·Change in Price

► A decrease in Price by $0.50 means a change in Price of −0.50, implying a change in estimated mean Sales of:

    Change in Est. E[Sales] = −0.2929416·(−0.50) = 0.1465

That is an increase in estimated mean Sales of $0.1465·1,000 = $146.5.

Soap Sales: Confidence Interval

► Notice that we are asked to provide a confidence interval for the change in estimated mean Sales when the Price decreases by $0.50. We saw that:

    Change in Est. E[Sales] = b1·Change in Price

► Since the change in price is fixed (at −0.50), the uncertainty about the true change in mean Sales comes from the uncertainty about the true value of the parameter β1. In other words, if we are given that the true parameter β1 is in the interval

    lower bound for β1 ≤ β1 ≤ upper bound for β1

then, if Change in Price is positive:

    (lower bound for β1)·Change in Price ≤ Change in E[Sales] ≤ (upper bound for β1)·Change in Price

and if Change in Price is negative:

    (upper bound for β1)·Change in Price ≤ Change in E[Sales] ≤ (lower bound for β1)·Change in Price

► Since the change in price is −0.50 and the lower and upper bounds are given in the regression table (−0.4204540 and −0.165428 respectively), we get the 95% interval for the change in true mean Sales as (in thousands):

    (−0.165428)·(−0.50) ≤ Change in E[Sales] ≤ (−0.4204540)·(−0.50)

that is

    $82.714 ≤ Change in E[Sales] ≤ $210.227
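The sign rule above (a negative multiplier swaps which bound is the lower one) can be checked with a few lines of Python. The helper name scale_interval is only illustrative; the coefficient and its 95% bounds are copied from Figure 8, and Sales are taken to be in thousands of dollars as stated in the slides.

```python
def scale_interval(lower, upper, factor):
    """Interval for factor * beta, given lower <= beta <= upper.

    Multiplying by a negative factor reverses the inequalities, so the
    end points are re-sorted after scaling.
    """
    a, b = lower * factor, upper * factor
    return min(a, b), max(a, b)


b1 = -0.2929416                           # estimated slope on Price
beta1_lo, beta1_hi = -0.4204540, -0.165428  # 95% bounds for beta1
delta_price = -0.50                       # price cut of $0.50

point_change = b1 * delta_price
change_lo, change_hi = scale_interval(beta1_lo, beta1_hi, delta_price)

# Convert from thousands of dollars to dollars for reporting.
print("point estimate: $", round(point_change * 1000, 1))
print("95% interval:   $", round(change_lo * 1000, 3),
      "to $", round(change_hi * 1000, 3))
```

Note that the lower end of the interval for the change comes from the upper bound of β1, exactly as in the negative-change case described above.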

Soap Sales: Hypothesis Testing

► The claim is that a decrease of $0.50 in Price will result in an increase in true mean Sales per store of at least $160. We first set up the null and alternative hypotheses (Δ denotes change; Sales are in thousands):

    H0: ΔE[Sales] | ΔPrice = −0.50 ≥ 0.160
    Ha: ΔE[Sales] | ΔPrice = −0.50 < 0.160

► To get the test, remember that the following equality holds between the change in the true mean of Sales and the true regression parameter β1:

    ΔE[Sales] = β1·ΔPrice

► We can recast the above hypotheses in terms of the true parameter β1:

    H0: β1·ΔPrice ≥ 0.160
    Ha: β1·ΔPrice < 0.160

and, for ΔPrice = −0.50 (dividing by a negative number reverses the inequalities):

    H0: β1 ≤ −0.320
    Ha: β1 > −0.320

► The test then follows the usual steps:
▪ set hypotheses
▪ calculate the ttest
▪ calculate the (right tail) pvalue = Pr[ T ≥ ttest ]
▪ decision: reject the null hypothesis if pvalue < α

Soap Sales: Hypothesis Testing and Prediction

Figure 9. Results for linear regression of Sales on Price

    Sales |     Coef.   Std.Err.       t    P>|t|   [95% Conf. Interval]
    ------+--------------------------------------------------------------
    Price | -.2929416   .0616406   -4.75   0.000   -.4204540   -.165428
    _cons | 5.8291984   .4241016   13.74   0.000   4.9518744  6.7065194

► We calculate the ttest as the distance between the estimate b1 and the hypothesized value −0.320, in units of the standard error of b1:

    ttest = (b1 − (−0.320)) / std.err(b1) = (−0.2929416 + 0.320) / 0.0616406 ≈ 0.44

► The (right tail) pvalue = Pr[ T ≥ ttest ] = ttail(923,0.438) = 0.333

► We cannot reject the stated null H0: ΔE[Sales] | ΔPrice = −0.50 ≥ 0.160 for α = 5% (in fact we cannot reject the null for any choice of α up to about 33%).

Remark. Given that the calculated ttest is fairly close to zero, "flipping" the null and alternative will lead you to calculate the left tail pvalue = 0.667, again concluding that you cannot reject the null, which in this flipped case would be H0: ΔE[Sales] | ΔPrice = −0.50 ≤ 0.160.

► For Price = $9.99 the estimated mean Sales per store is found from the estimated regression as:

    Est. E[Sales | Price = 9.99] = 5.8291984 − 0.2929416·9.99 = 2.902712 (thousands)

► For 2000 stores the estimated mean sales in dollars is simply 2000·2.902712·1,000 = $5,805,423.63.
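The test and the prediction can be retraced numerically. The Python sketch below uses only the standard library, so the right-tail p-value is approximated with the standard normal distribution instead of the t distribution with 923 degrees of freedom; with that many degrees of freedom the two are very close, but the result is an approximation and will not match ttail(923, ...) digit for digit.

```python
import math

b1, se_b1 = -0.2929416, 0.0616406   # Price row of Figure 9
b0 = 5.8291984                      # _cons row of Figure 9
beta1_null = -0.320                 # from beta1 * (-0.50) = 0.160

# Test statistic: distance from the hypothesized value in std. errors.
t_test = (b1 - beta1_null) / se_b1


def upper_tail_normal(z):
    """Pr[Z >= z] for a standard normal Z (approximation to the t tail)."""
    return 0.5 * math.erfc(z / math.sqrt(2))


p_right = upper_tail_normal(t_test)   # for Ha: beta1 > -0.320
p_left = 1 - p_right                  # for the flipped hypotheses

print("ttest              =", round(t_test, 3))
print("right-tail p-value ~", round(p_right, 3))
print("left-tail p-value  ~", round(p_left, 3))
print("reject H0 at 5%?   ", p_right < 0.05)

# Prediction at Price = 9.99: Sales are in thousands per store.
sales_per_store = b0 + b1 * 9.99
total_dollars = 2000 * sales_per_store * 1000
print("Est. E[Sales | Price = 9.99] =", round(sales_per_store, 6))
print("total for 2000 stores        = $", round(total_dollars, 2))
```

Either tail gives a p-value far above 5%, which is why neither version of the null can be rejected.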

DVD Resales: Hypothesis Testing and Prediction

► The regression table provides the following information:

Figure 10. Results for linear regression of DVDs on Gross

     DVDs |   Coef.    Std.Err.      t    P>|t|   [95% Conf. Interval]
    ------+-------------------------------------------------------------
    Gross |  8.0831   .5008435    16.1   0.000    7.057181   9.109037
    _cons |  26.535   11.83184    2.24   0.033    2.298798   50.77149

► The estimated regression equation (in thousands) is:

    Est. E[DVDs] = 26.535 + 8.0831·Gross

with estimated mean resales of DVDs for Gross = 36:

    Est. E[ DVDs | Gross = 36 ] = 26.535 + 8.0831·36 = $317.530 (thousands)

► The prediction interval has the form:

    Est. E[ DVDs | Gross = 36 ] − std.errDVDs·t(df, α/2) ≤ DVDs | Gross = 36 ≤ Est. E[ DVDs | Gross = 36 ] + std.errDVDs·t(df, α/2)

► With t(df, α/2) = invttail(28,0.025) = 2.0484 and std.errDVDs = 49.841 thousands we get:

    317.530 − 2.0484·49.841 ≤ DVDs | Gross = 36 ≤ 317.530 + 2.0484·49.841

that is

    $215.435 ≤ DVDs | Gross = 36 ≤ $419.624
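The interval arithmetic can be retraced in a few lines of Python, using only the numbers quoted above. One caveat: with the rounded coefficients shown in Figure 10 the point estimate evaluates to about 317.527, while the slides report 317.530, presumably from unrounded coefficients. The sketch keeps the reported value as the center of the interval.

```python
# Numbers quoted in the slides (all in thousands of dollars).
b0, b1 = 26.535, 8.0831      # rounded coefficients from Figure 10
t_crit = 2.0484              # invttail(28, 0.025)
std_err_indiv = 49.841       # std. error for an individual prediction

# Point estimate from the rounded coefficients (about 317.527).
point_from_table = b0 + b1 * 36

# Center reported in the slides; assumed to come from unrounded coefficients.
center = 317.530

half_width = t_crit * std_err_indiv
pi_lo, pi_hi = center - half_width, center + half_width

print("point estimate from rounded coefficients:", round(point_from_table, 3))
print("half width:", round(half_width, 3))
print("95% prediction interval:", round(pi_lo, 3), "to", round(pi_hi, 3))
```

The bounds agree with the reported $215.435 and $419.624 up to rounding in the last digit.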

DVD Resales: Confidence Interval

► The regression table provides the following information:

Figure 11. Results for linear regression of DVDs on Gross

     DVDs |   Coef.    Std.Err.      t    P>|t|   [95% Conf. Interval]
    ------+-------------------------------------------------------------
    Gross |  8.0831   .5008435    16.1   0.000    7.057181   9.109037
    _cons |  26.535   11.83184    2.24   0.033    2.298798   50.77149

► The key here is to recognize that when Gross = 0, the only remaining part of the regression equation is b0, since the term b1·Gross "drops out" for Gross = 0. Thus the estimated mean DVD sales, given that Gross = 0, becomes:

    Est. E[ DVDs | Gross = 0 ] = b0

► This implies that, whenever Gross = 0, the uncertainty about the true mean of DVDs comes from the uncertainty about the true parameter β0. The confidence interval for the true mean of DVDs, when Gross = 0, coincides with the confidence interval for the constant (in thousands): [2.298798, 50.77149].