1 Doing Statistics for Business Doing Statistics for Business Data, Inference, and Decision Making Marilyn K. Pelosi Theresa M. Sandifer Chapter 12 Multiple.

Presentation transcript:

1 Doing Statistics for Business Doing Statistics for Business Data, Inference, and Decision Making Marilyn K. Pelosi Theresa M. Sandifer Chapter 12 Multiple Regression Models

2 Doing Statistics for Business Chapter 12 Objectives
• Find the regression equation for a dependent variable Y as a function of a set of independent variables, $X_1, X_2, \ldots, X_k$.
• Determine whether the relationship is significant.
• Determine which variables contribute to the model and which do not.

3 Doing Statistics for Business Chapter 12 Objectives (con’t)
• Analyze the results of a regression analysis to determine whether the model is appropriate.

4 Doing Statistics for Business The dependent variable, Y, is often referred to as the output variable, while the set of independent variables, $X_1, X_2, \ldots, X_k$, are referred to as the input variables.

5 Doing Statistics for Business The true relationship between the dependent variable Y and the set of independent variables, $X_1, X_2, \ldots, X_k$, the multiple regression model, can be described by the equation: $$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$$
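As a rough illustration of how the coefficients in this equation can be estimated, the sketch below fits the model by least squares using the normal equations. It is not the procedure Excel or Minitab runs internally; the function names (`solve_linear_system`, `fit_multiple_regression`, `predict`) and the data are hypothetical, and the data are generated without noise so that the fitted coefficients can be checked by eye.

```python
# Minimal sketch: estimate b0, b1, ..., bk in y = b0 + b1*x1 + ... + bk*xk
# by least squares. All names and data here are illustrative assumptions.

def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations act on both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple_regression(rows, y):
    """Return [b0, b1, ..., bk] solving the normal equations (X'X) b = X'y."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 gives the intercept
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(coefs, row):
    """Predicted y for one observation (x1, ..., xk)."""
    return coefs[0] + sum(b * x for b, x in zip(coefs[1:], row))


# Made-up data generated exactly from y = 2 + 3*x1 - 1*x2 (no error term).
rows = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8)]
y = [2 + 3 * x1 - x2 for x1, x2 in rows]
coefs = fit_multiple_regression(rows, y)
```

With real data the error term is not zero, so the estimates only approximate the true coefficients; each slope is read as the change in predicted Y for a one-unit change in that X with the other variables held fixed.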

6 Doing Statistics for Business Figure 12.1 Computer Output from Excel

7 Doing Statistics for Business Figure 12.1 (con’t) Computer Output from Minitab

8 Doing Statistics for Business TRY IT NOW! Order Filling Finding the Multiple Regression Model A mail-order catalog company is looking at the time it takes to prepare an order for shipping. In particular, the company is looking for the amount of time that is spent collecting the items ordered and packing them. In this operation, an employee (a checker) is given an order to fill. Items are located in bins in one of six different sections of the warehouse. The checkers move around the warehouse retrieving the items and

9 Doing Statistics for Business TRY IT NOW! Order Filling Finding the Multiple Regression Model (con’t) packing them into the shipping cartons. The company has looked at the operation in some detail and believes that three major variables are involved in the process: the number of items ordered, the number of different locations (sections of the warehouse) in which the items are located, and the experience level (in months) of the checker. Data are collected on 45 orders. A portion of the data is shown next:

10 Doing Statistics for Business TRY IT NOW! Order Filling Finding the Multiple Regression Model (con’t)

11 Doing Statistics for Business TRY IT NOW! Order Filling Finding the Multiple Regression Model (con’t) The company wants to know how the time it takes to fill an order is related to the other three variables, so it decides to use a multiple regression model. Write down the equation of the regression model. Interpret the value of each of the coefficients of the model.

12 Doing Statistics for Business TRY IT NOW! Order Filling Finding Predicted Values The mail-order catalog company that is looking at the time it takes to prepare an order for shipping wants to see how well the model it has found predicts the time to fill an order. The data for six of the observations are shown next:

13 Doing Statistics for Business TRY IT NOW! Order Filling Finding Predicted Values (con’t)

14 Doing Statistics for Business TRY IT NOW! Order Filling Finding Predicted Values (con’t) Use the model to find the predicted time to fill an order for each of the six sets of input data you have. Compare the predicted results to the actual data. Do you think that this model does a good job of predicting the time to fill an order? Why or why not?

15 Doing Statistics for Business Figure 12.2 ANOVA Table for Multiple Regression Model

16 Doing Statistics for Business TRY IT NOW! Order Filling Testing the Significance of the Model The mail-order company looking at factors related to the time to fill an order wants to know if its model is significant. Write down the hypotheses that it needs to test. Using the Excel output from your textbook, locate the values for MSR and MSE and their degrees of freedom.

17 Doing Statistics for Business TRY IT NOW! Order Filling Testing the Significance of the Model Use the values of MSR and MSE to calculate the value of the F statistic and compare it to the value in the table. At the 0.05 level of significance, what is the critical value for the test? What can the mail-order company conclude as a result of its test? Verify your answer by using the p value from the printout.
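The calculation described in this exercise can be sketched as follows. The sums of squares below are made-up numbers, not values from the textbook output; only n = 45 orders and k = 3 independent variables come from the example. The function name `f_statistic` is hypothetical.

```python
# Minimal sketch: F = MSR / MSE for the overall significance test.
# SSR and SSE values used below are hypothetical, for illustration only.

def f_statistic(ssr, sse, n, k):
    """Return (F, regression df, error df) for a model with k predictors."""
    df_regression = k          # one degree of freedom per independent variable
    df_error = n - k - 1       # observations minus estimated coefficients
    msr = ssr / df_regression  # mean square due to regression
    mse = sse / df_error       # mean square error
    return msr / mse, df_regression, df_error


f_value, df1, df2 = f_statistic(ssr=300.0, sse=82.0, n=45, k=3)
```

The resulting F value is compared with the critical value from an F table with `df1` and `df2` degrees of freedom at the chosen significance level.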

18 Doing Statistics for Business The Coefficient of Multiple Determination, $R^2$, is a measure of the percentage of the variation in the dependent variable, Y, that can be accounted for by the complete set of independent variables, $X_1, X_2, \ldots, X_k$, in the model.

19 Doing Statistics for Business The Adjusted $R^2$ is the value of the coefficient of multiple determination adjusted to reflect the number of variables in the model.
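A short sketch of the two measures, using the standard definitions in terms of the error and total sums of squares. The input numbers are hypothetical and the function names are illustrative, not taken from the textbook.

```python
# Minimal sketch of R-squared and adjusted R-squared.
# SSE, SST, n, and k below are made-up values for illustration.

def r_squared(sse, sst):
    """Share of the total variation in Y accounted for by the model."""
    return 1.0 - sse / sst


def adjusted_r_squared(r2, n, k):
    """Penalize R-squared for the number of independent variables k."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


r2 = r_squared(sse=20.0, sst=100.0)
adj_r2 = adjusted_r_squared(r2, n=21, k=4)
```

Adding a variable never lowers $R^2$, but it can lower the adjusted value, which is why the adjusted figure is the more useful one when comparing models of different sizes.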

20 Doing Statistics for Business TRY IT NOW! Order Filling The Coefficient of Multiple Determination The mail-order company looking at factors related to the time to fill an order wants to know if the set of independent variables that it selected does a good job of accounting for the variation in the time to fill an order.

21 Doing Statistics for Business TRY IT NOW! Order Filling The Coefficient of Multiple Determination (con’t) Using the output, find the value of the coefficient of multiple determination. If you were the manager of the company, would you be satisfied with this model? Why or why not? Look at the value of adjusted $R^2$. What do you think this value might be telling the company?

22 Doing Statistics for Business TRY IT NOW! Order Filling Testing Individual Regression Coefficients The mail-order company looking at factors related to the time to fill an order wants to know how the individual variables contribute to the model. It decides to look at the computer analysis again. Look at the coefficients of each of the three variables in the model and perform the appropriate hypothesis tests.

23 Doing Statistics for Business TRY IT NOW! Order Filling Testing Individual Regression Coefficients (con’t) Which variable(s) have nonzero coefficients? Which variable(s) have coefficients that are equal to zero? As a result of these tests, what recommendation would you make?
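For reference, the test on an individual coefficient is usually written as below. The notation ($b_j$ for the estimated coefficient, $s_{b_j}$ for its standard error) is an assumption and may differ from the symbols used in the textbook.

$$H_0: \beta_j = 0 \qquad H_A: \beta_j \neq 0 \qquad t = \frac{b_j}{s_{b_j}}, \quad \text{df} = n - k - 1$$

Both $b_j$ and $s_{b_j}$ appear in the coefficient table of the regression output, together with the p value for each test.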

24 Doing Statistics for Business Discovery Exercise 12.1 Find the Best Model The office of Institutional Planning at a university in the Far West is interested in understanding what factors influence the graduation rate, that is, the percentage of entering freshmen who actually graduate from the university. The university planners have collected data on 46 universities with traits similar to theirs and have selected three variables that they think are related to the graduation rate: 25th percentile combined SATs of accepted students, the acceptance rate (percentage of students who apply that are accepted by the university), and educational expenditure per student ($) by the university.

25 Doing Statistics for Business Discovery Exercise 12.1 Find the Best Model (con’t) A sample of the data is shown below:

26 Doing Statistics for Business Discovery Exercise 12.1 Find the Best Model (con’t)
A. Which variable do you think will have the greatest effect on graduation rate? Explain why you chose this variable.
B. Find a simple linear model that predicts graduation rate using the variable that you think is most important. What is the value of $R^2$ for your model? Is the model significant?
C. Do this again using the other two independent variables.

27 Doing Statistics for Business Discovery Exercise 12.1 Find the Best Model (con’t)
D. Which one-variable model do you think is the best? Why?
E. How many different models with two independent variables could you find? List them all.
F. Find the multiple regression models for all of the two-variable models.
G. Which two-variable model do you think is best? Why? Does the model you think is best contain the variable from the best one-variable model? Would you expect it to?

28 Doing Statistics for Business Discovery Exercise 12.1 Find the Best Model (con’t)
H. Find the multiple regression model that predicts graduation rate using all three independent variables. What is $R^2$ for this model? Is it significant?
I. Now, fill in the table in your textbook for each of the “best” models that you have found.
J. If you were going to use a model to predict graduation rate, which of these three models would you choose? Why?

29 Doing Statistics for Business Model Building Techniques are methods used for identifying the best multiple regression model from a set of independent variables. These methods include:
• Forward Selection
• Backward Elimination
• Stepwise Regression
• All Possible Regressions
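To give a sense of the last method in the list, the sketch below enumerates every candidate model that an all possible regressions search would have to fit. It only lists the subsets; it does not fit or rank them. The function name and the variable labels are illustrative, with the labels borrowed from the order-filling example.

```python
# Minimal sketch: list every non-empty subset of the candidate variables.
# With k candidates there are 2**k - 1 such subsets.
from itertools import combinations


def all_candidate_models(variables):
    """Yield each non-empty subset of variables, smallest models first."""
    for size in range(1, len(variables) + 1):
        for subset in combinations(variables, size):
            yield subset


# Labels are illustrative stand-ins for the three order-filling variables.
candidates = ["items", "locations", "experience"]
models = list(all_candidate_models(candidates))
```

Because the count grows as $2^k - 1$, forward selection, backward elimination, and stepwise regression are often preferred when there are many candidate variables, since they examine only some of these subsets.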

30 Doing Statistics for Business TRY IT NOW! GPA Choosing the Independent Variables Many studies have been done on what factors are related to a college student’s grade point average (GPA). These studies often focus on pre-college factors, such as high school performance, SAT or ACT scores and general socioeconomic factors such as family income and race. Students know that once they are in college, many other factors influence their GPA.

31 Doing Statistics for Business TRY IT NOW! GPA Choosing the Independent Variables (con’t) You are going to try to find a model relating current factors to GPA. Make a list of all variables that you think are related to a college student’s GPA. From this list, identify what you think are the five most important variables. How would you go about gathering the data you need to do your study?

32 Doing Statistics for Business TRY IT NOW! Order Filling Forward Selection The mail-order company wants to use a standard method to find the best model from its set of three variables. It decides on forward selection because this method is the easiest to understand. The relevant portions of the computer output are shown in your textbook. What variable is added on the first step of the procedure? What is its coefficient? What is the value of $R^2$ for the first model considered?

33 Doing Statistics for Business TRY IT NOW! Order Filling Forward Selection (con’t) What variable is added on the second step? What is its coefficient? What does $R^2$ change to after this step? Does the coefficient of the first variable change when the second variable is added? If so, what is the new value? Write down the final model.

34 Doing Statistics for Business TRY IT NOW! Order Filling Testing Individual Regression Coefficients The mail-order company looking at the model for order filling decides to use stepwise regression to see if it finds a model that is different from the one found using forward selection. From the Minitab output in your textbook, how many iterations did the procedure take? Which variable was added to the model first? What is its coefficient?

35 Doing Statistics for Business TRY IT NOW! Order Filling Testing Individual Regression Coefficients (con’t) Which variable was added second? What is its coefficient? Did the coefficient for the first variable change? If so, what is its coefficient after the second variable is added? What is the final model from the stepwise procedure?

36 Doing Statistics for Business TRY IT NOW! Order Filling All Possible Regressions The mail-order company is pretty sure that it has identified the best model, but it decides to use the all possible regressions method to make sure that it is not missing something. Based on the output in your textbook, is there a one-variable model that is best on all criteria? If so, what is it? What is the best two-variable model?

37 Doing Statistics for Business TRY IT NOW! Order Filling All Possible Regressions (con’t) Is there any benefit from moving to a three-variable model? Why or why not? What model do you recommend the company use? Why?

38 Doing Statistics for Business TRY IT NOW! Order Filling All Possible Regressions Now that it has a model it likes, the mail-order company would like to make sure that the model does not violate any assumptions of the multiple regression model. The company creates a set of residual plots, which are shown below:

39 Doing Statistics for Business TRY IT NOW! Order Filling All Possible Regressions (con’t) Look at the plot of residuals versus the locations. Does it appear that there is a problem with the assumption of equal variances? Look at the plot of residuals versus the number of items. Does it appear that there is a problem with the assumption of equal variances?

40 Doing Statistics for Business TRY IT NOW! Order Filling All Possible Regressions (con’t) The company also created a normal probability plot of the residuals: From this plot, what can you say about the assumption of normality?

41 Doing Statistics for Business The Cook’s Distance method compares the values of the regression coefficients with all observations to the values when the ith observation is removed from the model.
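One common way to write this comparison as a single number per observation is shown below. The notation is an assumption and may not match the textbook: $\mathbf{b}$ is the vector of estimated coefficients using all observations, $\mathbf{b}_{(i)}$ is the same vector with the ith observation removed, $\mathbf{X}$ is the matrix of input values with a leading column of ones, and MSE is the mean square error of the full model.

$$D_i = \frac{\left(\mathbf{b} - \mathbf{b}_{(i)}\right)^{\top} \mathbf{X}^{\top}\mathbf{X} \left(\mathbf{b} - \mathbf{b}_{(i)}\right)}{(k + 1)\,\mathrm{MSE}}$$

A large $D_i$ relative to the other observations suggests that removing observation i changes the fitted coefficients noticeably, which marks it as a candidate influential observation.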

42 Doing Statistics for Business A multiple-regression model has Multicollinearity when variables in the set of independent variables are correlated with each other.
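A simple first check for this is the correlation between a pair of independent variables. The sketch below computes the Pearson correlation coefficient from its definition; the function name and the two short data lists are made up for illustration and are not the order-filling data.

```python
# Minimal sketch: Pearson correlation between two independent variables.
# The data values are hypothetical and chosen only to show the calculation.
from math import sqrt


def pearson_correlation(x, y):
    """Sample correlation coefficient r between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


items = [2, 4, 6, 8, 10]      # hypothetical values of one input variable
locations = [1, 2, 2, 3, 4]   # hypothetical values of another
r = pearson_correlation(items, locations)
```

A value of r close to +1 or -1 indicates that the two variables carry much the same information, which is the situation the definition above describes.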

43 Doing Statistics for Business TRY IT NOW! Order Filling Checking For Multicollinearity The mail-order company wants to check to see if the model it is thinking about using has any problems with multicollinearity. It calculates the correlation between the two variables that are in the final model and finds that the correlation is – Based on this information do you think that multicollinearity is a problem in their model? Why or why not?

44 Doing Statistics for Business Multiple Regression Models in Excel The tools used in Excel for multiple regression models are the same ones that are used for the simple linear model. The only difference is that the data range for the X variables will cover more than one column. It is very important, however, that the X variable columns be adjacent to each other. If they are not, you will have to rearrange the worksheet before you do the analysis.

45 Doing Statistics for Business Stepwise Regression with KaddStat
• From the Kadd menu select Regression and Correlation > Forward Stepwise and the dialog box shown in Figure 12.3 opens. The dialog box is identical to the one for Single/Multiple regression with two additional inputs.

46 Doing Statistics for Business

47 Doing Statistics for Business
• Just above where you indicate which graphical output you want is a section labeled Select one:.
• This allows you to indicate whether you want to have only the final model output or whether you want each step output.
• In addition, you need to specify the p value to be used for adding and dropping variables from the model.
• KaddStat uses the same p value for both.

48 Doing Statistics for Business Chapter 12 Summary In this chapter you have learned:
• Modeling is an iterative process for which there is no single correct answer.
• There are several steps to the modeling process:
  - Identifying potential independent variables
  - Collecting data
  - Finding a potential model

49 Doing Statistics for Business Chapter 12 Summary (con’t)
• Once a potential model is found, the process of Model Building takes place to find a “best” model.
• The objective is finding a model that does an acceptable job of explaining or predicting the dependent variable with as few independent variables as possible.

50 Doing Statistics for Business Chapter 12 Summary (con’t)
• Some model-building techniques used are:
  - Forward Selection
  - Backward Elimination
  - Stepwise Regression
  - All Possible Regressions

51 Doing Statistics for Business Chapter 12 Summary (con’t)
• Once the “best” model is identified, and before it can be used for decision-making purposes, it must be checked for problems such as:
  - Violation of Assumptions
  - Influential Observations
  - Multicollinearity