McGraw-Hill/Irwin. Copyright © 2013 by The McGraw-Hill Companies, Inc. All rights reserved. A PowerPoint Presentation Package to Accompany Applied Statistics in Business & Economics, 4th edition, by David P. Doane and Lori E. Seward. Prepared by Lloyd R. Jaisingh.

Multiple Regression: Chapter Contents
13.1 Multiple Regression
13.2 Assessing Overall Fit
13.3 Predictor Significance
13.4 Confidence Intervals for Y
13.5 Categorical Predictors
13.6 Tests for Nonlinearity and Interaction
13.7 Multicollinearity
13.8 Violations of Assumptions
13.9 Other Regression Topics

Chapter Learning Objectives
LO13-1: Use a fitted multiple regression equation to make predictions.
LO13-2: Interpret the R² and perform an F test for overall significance.
LO13-3: Test individual predictors for significance.
LO13-4: Interpret confidence intervals for regression coefficients.
LO13-5: Incorporate a categorical variable into a multiple regression model.
LO13-6: Detect multicollinearity and assess its effects.
LO13-7: Analyze residuals to check for violations of residual assumptions.
LO13-8: Identify unusual residuals and high leverage observations.
LO13-9: Explain the role of data conditioning and data transformations.

13.1 Multiple Regression

LO13-1: Use a fitted multiple regression equation to make predictions.

Multiple regression extends simple regression to include more than one independent variable.

Limitations of simple regression:
- often simplistic
- biased estimates if relevant predictors are omitted
- a lack of fit does not show that X is unrelated to Y

Regression Terminology
Y is the response variable and is assumed to be related to the k predictors (X1, X2, …, Xk) by a linear equation called the population regression model:

Y = β0 + β1X1 + β2X2 + … + βkXk + ε

The fitted regression equation is:

ŷ = b0 + b1X1 + b2X2 + … + bkXk
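As a sketch of how a fitted equation like this is obtained and then used for prediction, the least squares coefficients can be computed directly; the data, variable names, and coefficient values below are hypothetical, chosen only for illustration:

```python
import numpy as np

# Hypothetical data: response y related to two predictors x1 and x2.
rng = np.random.default_rng(42)
n = 50
x1 = rng.uniform(0, 10, size=n)
x2 = rng.uniform(0, 5, size=n)
y = 3.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(0, 1, size=n)

# Design matrix with a leading column of ones for the intercept b0.
X = np.column_stack([np.ones(n), x1, x2])

# Ordinary least squares: solve for b = (b0, b1, b2).
b, *_ = np.linalg.lstsq(X, y, rcond=None)

# Fitted equation yhat = b0 + b1*x1 + b2*x2; predict a new observation.
x_new = np.array([1.0, 4.0, 2.0])   # intercept term, x1 = 4, x2 = 2
y_pred = x_new @ b
```

The same pattern extends to any number of predictors: each extra X simply adds a column to the design matrix.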

Fitted Regression: Simple versus Multivariate
(Figure in the original slides comparing the simple and multivariate fitted equations.)

Regression Modeling: Four Criteria for Regression Assessment
- Logic: Is there an a priori reason to expect a causal relationship between the predictors and the response variable?
- Fit: Does the overall regression show a significant relationship between the predictors and the response variable?
- Parsimony: Does each predictor contribute significantly to the explanation? Are some predictors not worth the trouble?
- Stability: Are the predictors related to one another so strongly that regression estimates become erratic?

13.2 Assessing Overall Fit

LO13-2: Interpret the R² and perform an F test for overall significance.

F Test for Significance
For a regression with k predictors, the hypotheses to be tested are
H0: β1 = β2 = … = βk = 0 (all the true coefficients are zero)
H1: At least one of the coefficients is nonzero

Coefficient of Determination (R²)
R², the coefficient of determination, is a common measure of overall fit. It can be calculated in either of two equivalent ways:

R² = SSR/SST = 1 − SSE/SST

For example, either formula can be applied to the home price data.

F Test for Significance
The ANOVA calculations for a k-predictor model can be summarized as:

Source     | Sum of Squares | df        | Mean Square           | F
Regression | SSR            | k         | MSR = SSR/k           | F = MSR/MSE
Error      | SSE            | n − k − 1 | MSE = SSE/(n − k − 1) |
Total      | SST            | n − 1     |                       |

Adjusted R²
It is generally possible to raise the coefficient of determination R² simply by including additional predictors. The adjusted coefficient of determination penalizes the inclusion of useless predictors. For n observations and k predictors,

R²adj = 1 − (1 − R²)(n − 1)/(n − k − 1)
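The overall-fit quantities above (SST, SSE, SSR, R², adjusted R², and the F statistic) can all be computed from a fitted model. This is a minimal sketch on hypothetical data, not the textbook's home price example:

```python
import numpy as np

# Hypothetical data (illustrative only).
rng = np.random.default_rng(1)
n, k = 40, 2
X = np.column_stack([np.ones(n), rng.uniform(0, 10, size=(n, k))])
y = X @ np.array([5.0, 1.2, -0.7]) + rng.normal(0, 2, size=n)

b, *_ = np.linalg.lstsq(X, y, rcond=None)
yhat = X @ b

SST = np.sum((y - y.mean()) ** 2)   # total sum of squares
SSE = np.sum((y - yhat) ** 2)       # error (unexplained) sum of squares
SSR = SST - SSE                     # regression (explained) sum of squares

R2 = SSR / SST                                   # R^2 = SSR/SST = 1 - SSE/SST
R2_adj = 1 - (1 - R2) * (n - 1) / (n - k - 1)    # adjusted R^2
F = (SSR / k) / (SSE / (n - k - 1))              # F = MSR / MSE from the ANOVA table
```

Note that R2_adj can never exceed R2, which is exactly the penalty for extra predictors described above.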

13.3 Predictor Significance

LO13-3: Test individual predictors for significance.

Test each fitted coefficient to see whether it is significantly different from zero. The hypothesis tests for predictor Xj are
H0: βj = 0
H1: βj ≠ 0
If we cannot reject the hypothesis that a coefficient is zero, then the corresponding predictor does not contribute to the prediction of Y.

Test Statistic
The test statistic for the coefficient of predictor Xj is

tcalc = bj / sbj   (with n − k − 1 degrees of freedom)

To reject H0, compare tcalc to tα for the appropriate hypothesis, or reject H0 if the p-value ≤ α.
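A minimal sketch of this t test, using the usual OLS standard errors sbj taken from the estimated covariance matrix s²(X′X)⁻¹. The data are hypothetical, and the informal screen |t| > 2 stands in for an exact t-table lookup:

```python
import numpy as np

# Hypothetical data; the second predictor is deliberately irrelevant (beta = 0).
rng = np.random.default_rng(7)
n, k = 60, 2
X = np.column_stack([np.ones(n), rng.uniform(0, 10, size=(n, k))])
y = X @ np.array([2.0, 3.0, 0.0]) + rng.normal(0, 1, size=n)

b = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ b
s2 = resid @ resid / (n - k - 1)        # estimated error variance
cov_b = s2 * np.linalg.inv(X.T @ X)     # estimated covariance matrix of b
se_b = np.sqrt(np.diag(cov_b))          # standard errors s_bj

t_calc = b / se_b   # one t statistic per coefficient, df = n - k - 1
# Informal screen: |t| > 2 suggests significance at roughly the 5% level;
# a t table or p-value gives the exact cutoff.
```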

13.4 Confidence Intervals for Y

LO13-4: Interpret confidence intervals for regression coefficients.

Standard Error
The standard error of the regression (SE) is another important measure of fit. For n observations and k predictors,

SE = sqrt(SSE / (n − k − 1))

If all predictions were perfect, SE = 0.

The 95% confidence interval for coefficient βj is

bj ± tα/2 sbj   (with n − k − 1 degrees of freedom)

Approximate Confidence and Prediction Intervals for Y
A very quick approximate interval for Y can be formed without using a t table: since t ≈ 2 at the 95 percent level, ŷ ± 2SE serves as a quick prediction interval.
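The quick-interval idea can be sketched as follows; the data are hypothetical, and the factor 2 approximates the 95 percent t value while ignoring the extra uncertainty from estimating the coefficients:

```python
import numpy as np

# Hypothetical data; SE is the standard error of the regression.
rng = np.random.default_rng(2)
n, k = 50, 2
X = np.column_stack([np.ones(n), rng.uniform(0, 10, size=(n, k))])
y = X @ np.array([4.0, 1.0, 2.0]) + rng.normal(0, 3, size=n)

b = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ b
SE = np.sqrt(resid @ resid / (n - k - 1))   # SE = sqrt(SSE / (n - k - 1))

# Quick approximate 95% prediction interval for a new observation:
x_new = np.array([1.0, 5.0, 5.0])   # intercept term, x1 = 5, x2 = 5
yhat = x_new @ b
lo, hi = yhat - 2 * SE, yhat + 2 * SE
```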

13.5 Categorical Predictors

LO13-5: Incorporate a categorical variable into a multiple regression model.

What Is a Binary Predictor?
A binary predictor has two values (usually 0 and 1) to denote the presence or absence of a condition. For example, for n graduates from an MBA program: Employed = 1, Unemployed = 0. These variables are also called dummy or indicator variables. For readability, name the binary variable for the characteristic that corresponds to a value of 1.

Its contribution to the regression is either b1 or nothing, resulting in an intercept of either b0 (when X1 = 0) or b0 + b1 (when X1 = 1). The slope does not change; only the intercept is shifted.

In multiple regression, binary predictors require no special treatment: they are tested like any other predictor, using a t test. Be cautious, however: including binaries for all categories would introduce perfect multicollinearity into the regression estimation, so one category must be omitted.
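A sketch of a dummy variable shifting the intercept but not the slope; all names and numeric values below are hypothetical:

```python
import numpy as np

# Hypothetical MBA-graduate data: Employed = 1, Unemployed = 0.
rng = np.random.default_rng(3)
n = 80
x = rng.uniform(0, 10, size=n)          # a quantitative predictor
employed = rng.integers(0, 2, size=n)   # the binary (dummy) predictor
# True model: same slope everywhere, intercept shifted by +4 when employed.
y = 10.0 + 4.0 * employed + 1.5 * x + rng.normal(0, 1, size=n)

X = np.column_stack([np.ones(n), employed, x])
b0, b1, b2 = np.linalg.lstsq(X, y, rcond=None)[0]

# Two parallel fitted lines:
intercept_unemployed = b0        # when the dummy is 0
intercept_employed = b0 + b1     # when the dummy is 1; slope b2 is shared
```

Only one dummy is included for the two categories, which avoids the perfect-multicollinearity problem noted above.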

13.6 Tests for Nonlinearity and Interaction

Tests for Nonlinearity
Sometimes the effect of a predictor is nonlinear; a simple example is estimating the volume of lumber to be obtained from a tree. To test for suspected nonlinearity of any predictor, we can include its square in the regression.

Tests for Interaction
We can test for interaction between two predictors by including their product in the regression.
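Both tests amount to adding constructed columns to the design matrix; a sketch on hypothetical data with a genuine interaction between the two predictors:

```python
import numpy as np

# Hypothetical response with an interaction term 0.3 * x1 * x2.
rng = np.random.default_rng(5)
n = 100
x1 = rng.uniform(0, 10, size=n)
x2 = rng.uniform(0, 10, size=n)
y = 1.0 + 2.0 * x1 + 0.5 * x2 + 0.3 * x1 * x2 + rng.normal(0, 1, size=n)

# Augment the design matrix with x1**2 (nonlinearity check) and x1 * x2
# (interaction check); each added column is then t-tested like any predictor.
X = np.column_stack([np.ones(n), x1, x2, x1**2, x1 * x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]
```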

13.7 Multicollinearity

LO13-6: Detect multicollinearity and assess its effects.

What Is Multicollinearity?
Multicollinearity occurs when the independent variables X1, X2, …, Xm are intercorrelated instead of independent; one can inspect the correlation matrix to look for it. Collinearity occurs when only two predictors are correlated. The degree of multicollinearity is the real concern.

Variance Inflation
Multicollinearity induces variance inflation when predictors are strongly intercorrelated. This results in wider confidence intervals for the true coefficients β1, β2, …, βm and makes the t statistic less reliable, so the separate contribution of each predictor in "explaining" the response variable is difficult to identify.

Variance Inflation Factor (VIF)
Matrix scatter plots and the correlation matrix only show correlations between pairs of predictors; the variance inflation factor (VIF) is a more comprehensive test for multicollinearity. For a given predictor j, the VIF is defined as

VIFj = 1 / (1 − Rj²)

where Rj² is the coefficient of determination when predictor j is regressed against all the other predictors.
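The definition translates directly into code: regress each predictor on the others and apply 1/(1 − Rj²). A sketch on hypothetical data in which the third predictor is nearly a linear combination of the first two, so its VIF should be large:

```python
import numpy as np

def vif(X):
    """VIF for each column of a predictor matrix X (no intercept column):
    VIF_j = 1 / (1 - Rj^2), where Rj^2 comes from regressing column j on
    the remaining columns plus an intercept."""
    n, m = X.shape
    out = []
    for j in range(m):
        xj = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        b = np.linalg.lstsq(others, xj, rcond=None)[0]
        resid = xj - others @ b
        r2 = 1 - resid @ resid / np.sum((xj - xj.mean()) ** 2)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

# Hypothetical predictors: x3 is almost exactly x1 + x2.
rng = np.random.default_rng(9)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + x2 + rng.normal(scale=0.1, size=200)
vifs = vif(np.column_stack([x1, x2, x3]))
```

Because each predictor sits on both sides of the near-linear relationship, the inflated VIFs flag all three columns, not just x3.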

13.8 Violations of Assumptions

LO13-7: Analyze residuals to check for violations of residual assumptions.
LO13-8: Identify unusual residuals and high leverage observations.

The least squares method makes several assumptions about the (unobservable) random errors εi; clues about these errors may be found in the residuals ei.
Assumption 1: The errors are normally distributed.
Assumption 2: The errors have constant variance (i.e., they are homoscedastic).
Assumption 3: The errors are independent (i.e., they are non-autocorrelated).
Note: Technology can be used to test for violations of these assumptions.

Unusual Observations
An observation may be unusual
1. because the fitted model's prediction is poor (unusual residuals), or
2. because one or more predictors may be having a large influence on the regression estimates (unusual leverage).

Unusual Observations
To check for unusual residuals, simply inspect the residuals to find instances where the model does not predict well. To check for unusual leverage, look at each observation's leverage statistic, which measures how far the observation is from the mean(s) of the predictors. For n observations and k predictors, look for observations whose leverage exceeds 2(k + 1)/n.

13.9 Other Regression Topics

LO13-9: Explain the role of data conditioning and data transformations.

- Outliers
- Missing Predictors
- Ill-Conditioned Data
- Significance in Large Samples
- Model Specification Errors
- Missing Data
- Logistic (Binary) Regression
- Stepwise and Best Subsets Regression
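The leverage statistic for observation i is the i-th diagonal element of the hat matrix H = X(X′X)⁻¹X′. A sketch on hypothetical data with one deliberately extreme observation, flagged by the 2(k + 1)/n rule above:

```python
import numpy as np

# Hypothetical design matrix with one deliberately extreme row.
rng = np.random.default_rng(11)
n, k = 30, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
X[0, 1:] = [8.0, -8.0]   # far from the means of both predictors

# Leverage = diagonal of the hat matrix H = X (X'X)^{-1} X'.
H = X @ np.linalg.inv(X.T @ X) @ X.T
leverage = np.diag(H)

threshold = 2 * (k + 1) / n          # rule of thumb: flag h_i > 2(k+1)/n
flagged = np.where(leverage > threshold)[0]
```

A useful check: the leverages always sum to k + 1 (the trace of H), so they average (k + 1)/n, and the threshold flags observations at more than twice that average.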