© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.

Presentation transcript:

Slide 1: Chapter 16, Regression Analysis: Model Building
- Multiple Regression Approach to Experimental Design
- General Linear Model
- Determining When to Add or Delete Variables
- Variable Selection Procedures
- Autocorrelation and the Durbin-Watson Test

Slide 2: General Linear Model
- Models in which the parameters ($\beta_0, \beta_1, \ldots, \beta_p$) all have exponents of one are called linear models.
- A general linear model involving p independent variables is
  $$y = \beta_0 + \beta_1 z_1 + \beta_2 z_2 + \cdots + \beta_p z_p + \epsilon$$
- Each of the independent variables $z_j$ ($j = 1, 2, \ldots, p$) is a function of $x_1, x_2, \ldots, x_k$ (the variables for which data have been collected).

Slide 3: General Linear Model
- The simplest case is when we have collected data for just one variable $x_1$ and want to estimate y by using a straight-line relationship. In this case $z_1 = x_1$, and the model is
  $$y = \beta_0 + \beta_1 x_1 + \epsilon$$
- This model is called a simple first-order model with one predictor variable.

Slide 4: Modeling Curvilinear Relationships
- To account for a curvilinear relationship, we might set $z_1 = x_1$ and $z_2 = x_1^2$, giving
  $$y = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \epsilon$$
- This model is called a second-order model with one predictor variable.
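As a minimal sketch of the second-order model with one predictor, the fit below uses NumPy's `polyfit` on made-up, noise-free data. The data values and variable names are assumptions for illustration only and do not come from the slides.

```python
import numpy as np

# Made-up data that follow y = 1 + 2*x + 3*x^2 exactly (illustration only).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = 1.0 + 2.0 * x + 3.0 * x**2

# polyfit returns coefficients from the highest power down,
# so the order is (coefficient of x^2, coefficient of x, intercept).
b2, b1, b0 = np.polyfit(x, y, deg=2)

print(f"estimated equation: y = {b0:.3f} + {b1:.3f} x + {b2:.3f} x^2")
```

Although the fitted curve is not a straight line, the model is still linear in the parameters, which is why ordinary least squares applies.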

Slide 5: Interaction
- If the original data set consists of observations for y and two independent variables $x_1$ and $x_2$, we might develop a second-order model with two predictor variables:
  $$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_2^2 + \beta_5 x_1 x_2 + \epsilon$$
- In this model, the variable $z_5 = x_1 x_2$ is added to account for the potential effects of the two variables acting together.
- This type of effect is called interaction.
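The construction of the z variables from the collected x variables can be sketched as below. The helper name `second_order_design`, the synthetic data, and the coefficient values are all hypothetical; the point is only that each column of the design matrix is a function of x1 and x2, including the interaction column x1*x2.

```python
import numpy as np

def second_order_design(x1, x2):
    """Design matrix with columns 1, z1=x1, z2=x2, z3=x1^2, z4=x2^2, z5=x1*x2."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    return np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])

# Synthetic, noise-free data (made up for illustration only).
rng = np.random.default_rng(0)
x1 = rng.uniform(0, 5, size=30)
x2 = rng.uniform(0, 5, size=30)
true_beta = np.array([2.0, 1.0, -0.5, 0.3, 0.1, 0.7])  # last entry: interaction

Z = second_order_design(x1, x2)
y = Z @ true_beta

# Ordinary least squares on the z variables recovers the coefficients.
beta_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)
print(np.round(beta_hat, 4))
```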

Slide 6: Transformations Involving the Dependent Variable
- Often the problem of nonconstant variance can be corrected by transforming the dependent variable to a different scale.
- Most statistical packages provide the ability to apply logarithmic transformations using either base 10 (common log) or base $e \approx 2.71828$ (natural log).
- Another approach, called a reciprocal transformation, is to use 1/y as the dependent variable instead of y.
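The three transformations mentioned above might be applied as in the following sketch. The function name `transform_response` and its `kind` labels are hypothetical, and the sketch assumes every y value is strictly positive.

```python
import numpy as np

def transform_response(y, kind):
    """Return a transformed copy of the dependent variable.

    kind is one of "log10" (common log), "ln" (natural log),
    or "reciprocal" (1/y). Assumes all y values are positive.
    """
    y = np.asarray(y, dtype=float)
    if kind == "log10":
        return np.log10(y)
    if kind == "ln":
        return np.log(y)
    if kind == "reciprocal":
        return 1.0 / y
    raise ValueError(f"unknown transformation: {kind}")

print(transform_response([1, 10, 100], "log10"))
```

The transformed values would then be used as the dependent variable in the regression, and predictions would need to be converted back to the original scale.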

Slide 7: Nonlinear Models That Are Intrinsically Linear
- Models in which the parameters ($\beta_0, \beta_1, \ldots, \beta_p$) have exponents other than one are called nonlinear models.
- In some cases we can perform a transformation of variables that will enable us to use regression analysis with the general linear model.
- The exponential model involves the regression equation
  $$E(y) = \beta_0 \beta_1^{x}$$
- We can transform this nonlinear model to a linear model by taking the logarithm of both sides:
  $$\log E(y) = \log \beta_0 + x \log \beta_1$$
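A minimal sketch of the log transformation for the exponential model follows, assuming the form E(y) = b0 * b1**x and strictly positive y values. The function name `fit_exponential` and the data are made up for illustration.

```python
import numpy as np

def fit_exponential(x, y):
    """Estimate b0 and b1 in E(y) = b0 * b1**x by regressing ln(y) on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones_like(x), x])
    # Linear model on the log scale: ln(y) = ln(b0) + x * ln(b1)
    (ln_b0, ln_b1), *_ = np.linalg.lstsq(A, np.log(y), rcond=None)
    # Back-transform the estimates to the original scale.
    return float(np.exp(ln_b0)), float(np.exp(ln_b1))

# Made-up, noise-free data generated from y = 2 * 1.5**x.
x = np.arange(6)
y = 2.0 * 1.5**x
b0, b1 = fit_exponential(x, y)
print(round(b0, 4), round(b1, 4))
```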

Slide 8: Variable Selection Procedures
- Stepwise Regression
- Forward Selection
- Backward Elimination
- Best-Subsets Regression
The first three procedures are iterative: one independent variable at a time is added or deleted based on the F statistic. They are heuristics and therefore offer no guarantee that the best model will be found. In best-subsets regression, different subsets of the independent variables are evaluated.

Slide 9: Variable Selection: Stepwise Regression
- At each iteration, the first consideration is to see whether the least significant variable currently in the model can be removed because its p-value is greater than the user-specified or default alpha to remove.
- If no variable can be removed, the procedure checks to see whether the most significant variable not in the model can be added because its p-value is less than the user-specified or default alpha to enter.
- If no variable can be removed and no variable can be added, the procedure stops.

Slide 10: Variable Selection: Stepwise Regression (flowchart)
1. Start with no independent variables in the model.
2. Compute the F statistic and p-value for each independent variable in the model.
3. Any p-value > alpha to remove? If yes, the independent variable with the largest p-value is removed from the model and the next iteration begins at step 2. If no, continue to step 4.
4. Compute the F statistic and p-value for each independent variable not in the model.
5. Any p-value < alpha to enter? If yes, the independent variable with the smallest p-value is entered into the model and the next iteration begins at step 2. If no, stop.
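The stepwise loop can be sketched as follows. This is not the slides' exact rule: to stay NumPy-only it compares each partial F statistic with hypothetical thresholds `f_enter` and `f_remove` instead of converting F to a p-value and comparing with alpha. Because each test involves a single variable, a large F corresponds to a small p-value, so the logic runs in the same direction. The data are synthetic and the function names are assumptions.

```python
import numpy as np

def sse(X, y, cols):
    """Error sum of squares for an intercept plus the listed columns of X."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def stepwise(X, y, f_enter=4.0, f_remove=4.0, max_iter=100):
    n, k = X.shape
    model = []
    for _ in range(max_iter):
        # Removal check: least significant variable currently in the model.
        if model:
            sse_full = sse(X, y, model)
            mse = sse_full / (n - len(model) - 1)
            partial_f = {j: (sse(X, y, [c for c in model if c != j]) - sse_full) / mse
                         for j in model}
            weakest = min(partial_f, key=partial_f.get)
            if partial_f[weakest] < f_remove:
                model.remove(weakest)
                continue  # next iteration
        # Entry check: most significant variable not yet in the model.
        sse_cur = sse(X, y, model)
        best_j, best_f = None, f_enter
        for j in range(k):
            if j in model:
                continue
            sse_new = sse(X, y, model + [j])
            f = (sse_cur - sse_new) / (sse_new / (n - len(model) - 2))
            if f > best_f:
                best_j, best_f = j, f
        if best_j is None:
            break  # nothing to remove and nothing to add
        model.append(best_j)
    return model

# Synthetic data: orthogonal +/-1 columns, so the outcome is easy to verify.
H2 = np.array([[1.0, 1.0], [1.0, -1.0]])
H8 = np.kron(np.kron(H2, H2), H2)
X = H8[:, 1:4]                                        # three hypothetical predictors
y = 10 + 3 * X[:, 0] + 2 * X[:, 1] + 0.5 * H8[:, 4]   # column 2 is irrelevant
selected = stepwise(X, y)
print(selected)
```

The sketch assumes the residual sum of squares never reaches exactly zero; a production routine would guard against that and would report p-values.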

Slide 11: Variable Selection: Forward Selection
- This forward-selection procedure starts with no independent variables.
- It adds variables one at a time as long as a significant reduction in the error sum of squares (SSE) can be achieved.
- This procedure is similar to stepwise regression, but does not permit a variable to be deleted.

Slide 12: Variable Selection: Forward Selection (flowchart)
1. Start with no independent variables in the model.
2. Compute the F statistic and p-value for each independent variable not in the model.
3. Any p-value < alpha to enter? If yes, the independent variable with the smallest p-value is entered into the model and the procedure returns to step 2. If no, stop.
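A sketch of forward selection under the same simplifying assumption: a hypothetical F threshold `f_enter` stands in for the "p-value < alpha to enter" test, and the data are synthetic. The entry statistic is the reduction in SSE from adding one variable, divided by the mean square error of the larger model.

```python
import numpy as np

def sse(X, y, cols):
    """Error sum of squares for an intercept plus the listed columns of X."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def forward_selection(X, y, f_enter=4.0):
    n, k = X.shape
    model = []
    while len(model) < k:
        sse_cur = sse(X, y, model)
        best_j, best_f = None, f_enter
        for j in range(k):
            if j in model:
                continue
            sse_new = sse(X, y, model + [j])
            df_error = n - len(model) - 2  # n minus parameters in the larger model
            f = (sse_cur - sse_new) / (sse_new / df_error)
            if f > best_f:
                best_j, best_f = j, f
        if best_j is None:
            break  # no candidate gives a significant reduction in SSE
        model.append(best_j)  # once entered, a variable is never deleted
    return model

# Synthetic data with orthogonal +/-1 predictors (illustration only).
H2 = np.array([[1.0, 1.0], [1.0, -1.0]])
H8 = np.kron(np.kron(H2, H2), H2)
X = H8[:, 1:4]
y = 10 + 3 * X[:, 0] + 2 * X[:, 1] + 0.5 * H8[:, 4]
entered = forward_selection(X, y)
print(entered)
```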

Slide 13: Variable Selection: Backward Elimination
- This procedure begins with a model that includes all the independent variables the modeler wants considered.
- It then attempts to delete one variable at a time by determining whether the least significant variable currently in the model can be removed because its p-value is greater than the user-specified or default alpha to remove.
- Once a variable has been removed from the model it cannot reenter at a subsequent step.

Slide 14: Variable Selection: Backward Elimination (flowchart)
1. Start with all independent variables in the model.
2. Compute the F statistic and p-value for each independent variable in the model.
3. Any p-value > alpha to remove? If yes, the independent variable with the largest p-value is removed from the model and the procedure returns to step 2. If no, stop.
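Backward elimination can be sketched in the same way. As an assumption for illustration, a hypothetical F threshold `f_remove` replaces the "p-value > alpha to remove" test; the variable with the smallest partial F is the one with the largest p-value. The data are synthetic.

```python
import numpy as np

def sse(X, y, cols):
    """Error sum of squares for an intercept plus the listed columns of X."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def backward_elimination(X, y, f_remove=4.0):
    n, k = X.shape
    model = list(range(k))  # start with every candidate variable
    while model:
        sse_full = sse(X, y, model)
        mse = sse_full / (n - len(model) - 1)
        # Partial F: increase in SSE if the variable were dropped, over MSE.
        partial_f = {j: (sse(X, y, [c for c in model if c != j]) - sse_full) / mse
                     for j in model}
        weakest = min(partial_f, key=partial_f.get)
        if partial_f[weakest] >= f_remove:
            break  # every remaining variable is significant
        model.remove(weakest)  # a removed variable never reenters
    return model

# Synthetic data with orthogonal +/-1 predictors (illustration only).
H2 = np.array([[1.0, 1.0], [1.0, -1.0]])
H8 = np.kron(np.kron(H2, H2), H2)
X = H8[:, 1:4]
y = 10 + 3 * X[:, 0] + 2 * X[:, 1] + 0.5 * H8[:, 4]
kept = backward_elimination(X, y)
print(kept)
```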

Slide 15: Variable Selection: Backward Elimination
Example: Clarksville Homes
Tony Zamora, a real estate investor, has just moved to Clarksville and wants to learn about the city's residential real estate market. Tony has randomly selected 25 house-for-sale listings from the Sunday newspaper and collected the data partially listed on the next slide.
Develop, using the backward elimination procedure, a multiple regression model to predict the selling price of a house in Clarksville.

Slide 16: Variable Selection: Backward Elimination
Partial Data
Columns: Segment of City; Selling Price ($000); House Size (00 sq. ft.); Number of Bedrooms; Number of Bathrooms; Garage Size (cars).
Segment of City for the eight rows shown: Northwest, South, Northeast, Northwest, West, South, West, West. [Numeric values not preserved in the transcript.]

Slide 17: Variable Selection: Backward Elimination
Regression Output
Columns: Predictor, Coef, SE Coef, T, p. Rows: Intercept, House Size, Bedrooms, Bathrooms, Cars. [Numeric values not preserved in the transcript.]
Annotation: greatest p-value > .05; variable to be removed.

Slide 18: Variable Selection: Backward Elimination
- Cars (garage size) is the independent variable with the highest p-value (.697), which is greater than .05.
- The Cars variable is removed from the model.
- Multiple regression is performed again on the remaining independent variables.

Slide 19: Variable Selection: Backward Elimination
Regression Output
Columns: Predictor, Coef, SE Coef, T, p. Rows: Intercept, House Size, Bedrooms, Bathrooms. [Numeric values not preserved in the transcript.]
Annotation: greatest p-value > .05; variable to be removed.

Slide 20: Variable Selection: Backward Elimination
- Bedrooms is the independent variable with the highest p-value (.281), which is greater than .05.
- The Bedrooms variable is removed from the model.
- Multiple regression is performed again on the remaining independent variables.

Slide 21: Variable Selection: Backward Elimination
Regression Output
Columns: Predictor, Coef, SE Coef, T, p. Rows: Intercept, House Size, Bathrooms. [Numeric values not preserved in the transcript.]
Annotation: greatest p-value > .05; variable to be removed.

Slide 22: Variable Selection: Backward Elimination
- Bathrooms is the independent variable with the highest p-value (.110), which is greater than .05.
- The Bathrooms variable is removed from the model.
- Multiple regression is performed again on the remaining independent variable.

Slide 23: Variable Selection: Backward Elimination
Regression Output
Columns: Predictor, Coef, SE Coef, T, p. Rows: Intercept, House Size. [Numeric values not preserved in the transcript, apart from the exponent E-09 of the House Size p-value.]
Annotation: greatest p-value is < .05.

Slide 24: Variable Selection: Backward Elimination
- House size is the only independent variable remaining in the model.
- The estimated regression equation therefore has the form $\hat{y} = b_0 + b_1 (\text{House Size})$. [Coefficient values not preserved in the transcript.]

Slide 25: Variable Selection: Best-Subsets Regression
- The three preceding procedures are one-variable-at-a-time methods offering no guarantee that the best model for a given number of variables will be found.
- Some software packages include best-subsets regression, which enables the user to find, given a specified number of independent variables, the best regression model.
- Minitab output identifies the two best one-variable estimated regression equations, the two best two-variable equations, and so on.
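The idea of best-subsets regression can be sketched as an exhaustive search. This is an illustration only and not Minitab's algorithm: it fits every subset of each size, ranks subsets by R-squared (an assumed criterion), and keeps the top two per size, mirroring the "two best" layout described above. The function names and data are hypothetical.

```python
import numpy as np
from itertools import combinations

def sse(X, y, cols):
    """Error sum of squares for an intercept plus the listed columns of X."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def best_subsets(X, y, top=2):
    """Map each subset size to its `top` best (R-squared, columns) pairs."""
    n, k = X.shape
    sst = float(np.sum((y - y.mean()) ** 2))
    results = {}
    for size in range(1, k + 1):
        scored = [(1.0 - sse(X, y, list(cols)) / sst, cols)
                  for cols in combinations(range(k), size)]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        results[size] = scored[:top]
    return results

# Synthetic data with orthogonal +/-1 predictors (illustration only).
H2 = np.array([[1.0, 1.0], [1.0, -1.0]])
H8 = np.kron(np.kron(H2, H2), H2)
X = H8[:, 1:4]
y = 10 + 3 * X[:, 0] + 2 * X[:, 1] + 0.5 * H8[:, 4]

results = best_subsets(X, y)
for size, rows in results.items():
    for r2, cols in rows:
        print(size, cols, round(r2, 3))
```

The number of subsets grows as 2^k, so an exhaustive search like this is only practical for a modest number of candidate variables.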

Slide 26: Variable Selection: Best-Subsets Regression
Example: PGA Tour Data
The Professional Golfers Association keeps a variety of statistics regarding performance measures. Data include the average driving distance, percentage of drives that land in the fairway, percentage of greens hit in regulation, average number of putts, percentage of sand saves, and average score.

Slide 27: Variable-Selection Procedures
Variable Names and Definitions
- Drive: average length of a drive in yards
- Fair: percentage of drives that land in the fairway
- Green: percentage of greens hit in regulation (a par-3 green is "hit in regulation" if the player's first shot lands on the green)
- Putt: average number of putts for greens that have been hit in regulation
- Sand: percentage of sand saves (landing in a sand trap and still scoring par or better)
- Score: average score for an 18-hole round

Slides 28-30: Variable-Selection Procedures
Sample Data (Parts 1-3). Columns: Drive, Fair, Green, Putt, Sand, Score. [Numeric values not preserved in the transcript.]

Slide 31: Variable-Selection Procedures
Sample Correlation Coefficients. Rows: Drive, Fair, Green, Putt, Sand. Columns: Score, Drive, Fair, Green, Putt. [Numeric values not preserved in the transcript.]