Chapter 12 Multiple Regression and Model Building.


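2 Multiple Regression Models
The General Multiple Regression Model
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$
where
- $y$ is the dependent variable
- $x_1, x_2, \ldots, x_k$ are the independent variables
- $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k$ is the deterministic portion of the model
- $\beta_i$ determines the contribution of the independent variable $x_i$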

3 Multiple Regression Models
Analyzing a Multiple Regression Model
1. Hypothesize the deterministic component of the model
2. Use sample data to estimate $\beta_0, \beta_1, \beta_2, \ldots, \beta_k$
3. Specify the probability distribution of $\varepsilon$ and estimate $\sigma$
4. Check that the assumptions on $\varepsilon$ are satisfied
5. Statistically evaluate model usefulness
6. If the model is useful, use it for prediction, estimation, and other purposes

4 Multiple Regression Models
Assumptions about Random Error $\varepsilon$
1. For any given set of values of $x_1, x_2, \ldots, x_k$, the random error has a normal probability distribution with mean 0 and variance $\sigma^2$
2. The random errors are independent

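5 The First-Order Model: Estimating and Interpreting the β-Parameters
For the model
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$
the chosen fitted model
$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \cdots + \hat{\beta}_k x_k$
minimizes
$SSE = \sum (y - \hat{y})^2$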
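
The least squares idea on this slide can be sketched in a few lines of standard-library Python. This is only an illustration, not the software used in the presentation: the helper names `solve`, `fit_least_squares`, and `sse` are hypothetical, the estimates come from solving the normal equations directly, and the data are made up so that they follow $y = 1 + 2x_1 + x_2$ exactly.

```python
def solve(a, b):
    """Solve the square system a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_least_squares(xs, ys):
    """Return the beta estimates that minimize SSE for y = b0 + b1*x1 + ... + bk*xk."""
    design = [[1.0] + list(row) for row in xs]  # prepend the intercept column
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(p)]
    return solve(xtx, xty)  # normal equations: (X'X) beta = X'y


def sse(beta, xs, ys):
    """Sum of squared errors: the sum of (y - y_hat)^2 over all observations."""
    total = 0.0
    for row, y in zip(xs, ys):
        y_hat = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
        total += (y - y_hat) ** 2
    return total


# Hypothetical, noise-free data generated from y = 1 + 2*x1 + x2
xs = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]
ys = [1 + 2 * x1 + x2 for x1, x2 in xs]

beta_hat = fit_least_squares(xs, ys)
print([round(b, 6) for b in beta_hat])
print(round(sse(beta_hat, xs, ys), 6))
```

Because the data contain no random error, the estimates should recover the generating coefficients and SSE should be essentially zero. With real data SSE is positive, and a statistics package would normally do this computation.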

6 The First-Order Model: Estimating and Interpreting the β-Parameters
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$
where
- $y$ = Sales price (dollars)
- $x_1$ = Appraised land value (dollars)
- $x_2$ = Appraised improvements (dollars)
- $x_3$ = Area (square feet)

7 The First-Order Model: Estimating and Interpreting the β-Parameters
Plot of data for sample size $n = 20$

8 The First-Order Model: Estimating and Interpreting the β-Parameters
Fit model to data

9 The First-Order Model: Estimating and Interpreting the β-Parameters
Interpret β estimates
- $E(y)$, the mean sale price of the property, is estimated to increase by $\hat{\beta}_3$ dollars for each additional square foot of living area, holding the other variables constant
- $E(y)$, the mean sale price of the property, is estimated to increase by 0.8204 dollars for every one-dollar increase in appraised improvements, holding the other variables constant
- $E(y)$, the mean sale price of the property, is estimated to increase by 0.8145 dollars for every one-dollar increase in appraised land value, holding the other variables constant

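10 The First-Order Model: Estimating and Interpreting the β-Parameters
Given a model $E(y) = 1 + 2x_1 + x_2$, the effect of $x_2$ on $E(y)$, holding $x_1$ constant, is a straight line with slope 1:
- $x_1 = 0$: $E(y) = 1 + x_2$
- $x_1 = 1$: $E(y) = 3 + x_2$
- $x_1 = 2$: $E(y) = 5 + x_2$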

11 The First-Order Model: Estimating and Interpreting the β-Parameters
The plane $E(y) = 1 + 2x_1 + x_2$

12 Inferences about the β-Parameters and the Overall Model Utility
- Two types of inferences can be made, using either confidence intervals or hypothesis tests
- For any inferences to be made, the assumptions about the random error term $\varepsilon$ (normal distribution with mean 0 and variance $\sigma^2$, independence of errors) must be met

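13 Inferences about the β-Parameters and the Overall Model Utility
A Test of an Individual Parameter Coefficient
One-Tailed Test
- $H_0: \beta_i = 0$
- $H_a: \beta_i < 0$ (or $H_a: \beta_i > 0$)
- Rejection region: $t < -t_\alpha$ (or $t > t_\alpha$ when $H_a: \beta_i > 0$)
Two-Tailed Test
- $H_0: \beta_i = 0$
- $H_a: \beta_i \neq 0$
- Rejection region: $|t| > t_{\alpha/2}$
Test statistic: $t = \hat{\beta}_i / s_{\hat{\beta}_i}$
where $t_\alpha$ and $t_{\alpha/2}$ are based on $n - (k+1)$ degrees of freedom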

14 Inferences about the β-Parameters and the Overall Model Utility
A $100(1-\alpha)\%$ Confidence Interval for a β-Parameter
$\hat{\beta}_i \pm t_{\alpha/2}\, s_{\hat{\beta}_i}$
where $t_{\alpha/2}$ is based on $n - (k+1)$ degrees of freedom and
- $n$ = Number of observations
- $k + 1$ = Number of β parameters in the model
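
As a rough illustration of the t-test and the confidence interval for a single β-parameter, the sketch below does the arithmetic from an estimate and its standard error. The function names and the numbers (an estimate of 0.82 with standard error 0.21, $n = 20$, $k = 3$) are hypothetical, and the critical value 2.120 is the tabled $t_{.025}$ for 16 degrees of freedom, passed in by hand rather than computed.

```python
def degrees_of_freedom(n, k):
    """Error degrees of freedom, n - (k + 1), for a model with k terms plus an intercept."""
    return n - (k + 1)


def t_statistic(beta_hat, std_error):
    """Test statistic for H0: beta_i = 0."""
    return beta_hat / std_error


def confidence_interval(beta_hat, std_error, t_crit):
    """Interval beta_hat +/- t_crit * std_error."""
    half_width = t_crit * std_error
    return beta_hat - half_width, beta_hat + half_width


# Hypothetical values, for illustration only
n, k = 20, 3
beta_hat, std_error = 0.82, 0.21
t_crit = 2.120  # t_{.025} with 16 degrees of freedom, taken from a t table

df = degrees_of_freedom(n, k)
t = t_statistic(beta_hat, std_error)
low, high = confidence_interval(beta_hat, std_error, t_crit)

print(df, round(t, 3), (round(low, 4), round(high, 4)))
print("reject H0 (two-tailed)" if abs(t) > t_crit else "do not reject H0")
```

The two procedures agree by construction: the two-tailed test rejects $H_0$ at level $\alpha$ exactly when the $100(1-\alpha)\%$ interval excludes zero.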

15 Inferences about the β-Parameters and the Overall Model Utility
A Minitab Analysis
[Annotated Minitab printout: one callout marks the output used for confidence intervals, another the output used for hypotheses about parameter coefficients]

16 Inferences about the β-Parameters and the Overall Model Utility
Three tests of overall model utility:
1. Multiple coefficient of determination $R^2$
2. Adjusted multiple coefficient of determination $R_a^2$
3. Global F-test

17 Inferences about the β-Parameters and the Overall Model Utility
Testing Global Usefulness of the Model: The Analysis of Variance F-test
$H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$
$H_a$: At least one $\beta_i \neq 0$
Test statistic: $F = \dfrac{(SS_{yy} - SSE)/k}{SSE/[n-(k+1)]} = \dfrac{R^2/k}{(1-R^2)/[n-(k+1)]}$
where $n$ is the sample size and $k$ is the number of terms in the model
Rejection region: $F > F_\alpha$, with $k$ numerator degrees of freedom and $[n-(k+1)]$ denominator degrees of freedom
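
The three utility measures can be computed from the same two sums of squares. The sketch below assumes the standard definitions $R^2 = 1 - SSE/SS_{yy}$ and $R_a^2 = 1 - \frac{n-1}{n-(k+1)} \cdot \frac{SSE}{SS_{yy}}$; the function name `model_utility` and the input values are hypothetical.

```python
def model_utility(sse, ss_yy, n, k):
    """Return (R^2, adjusted R^2, global F) for a model with k terms fit to n observations."""
    df_error = n - (k + 1)
    r2 = 1 - sse / ss_yy
    r2_adj = 1 - (n - 1) / df_error * (sse / ss_yy)
    f_stat = (r2 / k) / ((1 - r2) / df_error)
    return r2, r2_adj, f_stat


# Hypothetical sums of squares, for illustration only
r2, r2_adj, f_stat = model_utility(sse=20.0, ss_yy=100.0, n=20, k=3)
print(round(r2, 4), round(r2_adj, 4), round(f_stat, 2))
```

The F value would then be compared with $F_\alpha$ on $k$ and $n-(k+1)$ degrees of freedom, read from a table or a statistics package. Note that $R_a^2$ never exceeds $R^2$, since it penalizes the number of terms in the model.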

18 Inferences about the β-Parameters and the Overall Model Utility
Checking the Utility of a Multiple Regression Model
1. Conduct a test of overall model adequacy using the F-test. If $H_0$ is rejected, proceed to step 2
2. Conduct t-tests on the β parameters of particular interest
3. Examine the values of $R_a^2$ and $2s$ to evaluate how well the model fits the data

19 Using the Model for Estimation and Prediction
- As in simple linear regression, the prediction interval for an individual value of y is wider than the confidence interval for the mean E(y) at the same x-values
- Most statistics packages will print out both confidence and prediction intervals
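
To see why the prediction interval is the wider one, the sketch below uses the simple linear regression formulas that the slide alludes to: the half-width for estimating the mean is $t_{\alpha/2}\, s \sqrt{1/n + (x_p - \bar{x})^2 / SS_{xx}}$, and the half-width for predicting a new y adds 1 under the square root. The multiple regression versions need matrix algebra and are not shown. The function name, the x-values, $s = 0.5$, and the tabled critical value 3.182 ($t_{.025}$ with 3 degrees of freedom) are all illustrative assumptions.

```python
from math import sqrt


def interval_half_widths(x_p, xs, s, t_crit):
    """Half-widths of the confidence interval for E(y) and the prediction interval for y at x_p."""
    n = len(xs)
    x_bar = sum(xs) / n
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    spread = 1 / n + (x_p - x_bar) ** 2 / ss_xx
    ci_half = t_crit * s * sqrt(spread)      # estimating the mean E(y)
    pi_half = t_crit * s * sqrt(1 + spread)  # predicting one new y
    return ci_half, pi_half


# Hypothetical inputs: n = 5 observations, so n - 2 = 3 degrees of freedom
xs = [1, 2, 3, 4, 5]
ci_half, pi_half = interval_half_widths(x_p=3, xs=xs, s=0.5, t_crit=3.182)
print(round(ci_half, 3), round(pi_half, 3))
```

The extra 1 reflects the random error in the single new observation, which is why the prediction interval stays wide even when the mean is estimated precisely.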

20 Model Building: Interaction Models
An Interaction Model Relating E(y) to Two Quantitative Independent Variables
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$
where
- $(\beta_1 + \beta_3 x_2)$ represents the change in $E(y)$ for every 1-unit increase in $x_1$, holding $x_2$ fixed
- $(\beta_2 + \beta_3 x_1)$ represents the change in $E(y)$ for every 1-unit increase in $x_2$, holding $x_1$ fixed
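
A small numeric check may make the interaction slopes concrete. The coefficients below are hypothetical, chosen only for illustration, and the function names are not from the presentation.

```python
def mean_response(x1, x2, b0, b1, b2, b3):
    """E(y) for the interaction model b0 + b1*x1 + b2*x2 + b3*x1*x2."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def slope_in_x1(x2, b1, b3):
    """Change in E(y) per 1-unit increase in x1 when x2 is held fixed."""
    return b1 + b3 * x2


# Hypothetical coefficients
coef = dict(b0=1.0, b1=2.0, b2=-1.0, b3=0.5)

for x2 in (0, 2, 4):
    # A 1-unit step in x1 at this fixed x2 should match the formula b1 + b3*x2
    step = mean_response(1, x2, **coef) - mean_response(0, x2, **coef)
    print(x2, step, slope_in_x1(x2, coef["b1"], coef["b3"]))
```

With a nonzero interaction coefficient the slope in $x_1$ changes with $x_2$, so the lines of $E(y)$ against $x_1$ are no longer parallel.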

21 Model Building: Interaction Models
- No interaction: the relationship between $y$ and $x_i$ is not affected by a second $x$
- Interaction: the linear relationship between $y$ and $x_i$ depends on another $x$

22 Model Building: Interaction Models

23 Model Building: Quadratic and other Higher-Order Models
A Quadratic (Second-Order) Model
$E(y) = \beta_0 + \beta_1 x + \beta_2 x^2$
where
- $\beta_0$ is the y-intercept of the curve
- $\beta_1$ is a shift parameter
- $\beta_2$ is the rate of curvature

24 Model Building: Quadratic and other Higher-Order Models
Home Size-Electrical Usage Data
Size of Home, x (sq. ft.)    Monthly Usage, y (kilowatt-hours)
1,290                        1,182
1,350                        1,172
1,470                        1,264
1,600                        1,493
1,710                        1,571
1,840                        1,711
1,980                        1,804
2,230                        1,840
2,400                        1,95
2,930                        1,954

25 Model Building: Quadratic and other Higher-Order Models

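26 Model Building: Quadratic and other Higher-Order Models
A Complete Second-Order Model with Two Quantitative Independent Variables
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \beta_4 x_1^2 + \beta_5 x_2^2$
where
- $\beta_0$ is the y-intercept, the value of $E(y)$ when $x_1 = x_2 = 0$
- Changes in $\beta_1$ and $\beta_2$ cause the surface to shift along the $x_1$ and $x_2$ axes
- $\beta_3$ controls the rotation of the surface
- $\beta_4$ and $\beta_5$ control the type of surface and the rates of curvature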

27 Model Building: Quadratic and other Higher-Order Models

28 Model Building: Qualitative (Dummy) Variable Models
- Dummy variables: coded qualitative variables
- Codes are in the form (1, 0), with 1 indicating the presence of a condition and 0 its absence
- Create dummy variables so that there is one fewer dummy variable than there are categories of the qualitative variable of interest
- Example: a gender dummy variable coded as $x = 1$ if male, $x = 0$ if female
- If the model is $E(y) = \beta_0 + \beta_1 x$, then $\beta_1$ captures the effect of being male on the dependent variable
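
The coding rule above (one fewer dummy variable than categories) can be sketched as follows. The helper name `dummy_code` and its `base_level` argument are hypothetical. The base level is the category coded as all zeros, matching "female" in the slide's example.

```python
def dummy_code(values, base_level):
    """Code a qualitative variable as (1, 0) dummies, one per non-base level."""
    levels = sorted(set(values) - {base_level})
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return levels, rows


# Two categories -> one dummy variable (x = 1 if male, x = 0 if female)
levels, rows = dummy_code(["male", "female", "male"], base_level="female")
print(levels, rows)

# Three hypothetical categories -> two dummy variables
levels3, rows3 = dummy_code(["A", "B", "C", "A"], base_level="A")
print(levels3, rows3)
```

Leaving out the base level keeps the dummies from summing to the intercept column, which would make the β-parameters impossible to estimate separately.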

29 Model Building: Models with both Quantitative and Qualitative Variables
- Start with a first-order model with one quantitative variable: $E(y) = \beta_0 + \beta_1 x_1$
- Adding a qualitative variable (coded by the dummy variables $x_2$ and $x_3$) with no interaction: $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3$

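30 Model Building: Models with both Quantitative and Qualitative Variables
Adding an interaction term:
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_1 x_2 + \beta_5 x_1 x_3$
- Main effect, $x_1$: $\beta_1 x_1$
- Main effect, $x_2$ and $x_3$: $\beta_2 x_2 + \beta_3 x_3$
- Interaction: $\beta_4 x_1 x_2 + \beta_5 x_1 x_3$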

31 Model Building: Comparing Nested Models
- Models are nested if one model contains all the terms of the other model and at least one additional term
- Complete (full) model: the more complex model
- Reduced model: the simpler model

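33 Model Building: Comparing Nested Models
F-Test for Comparing Nested Models
Reduced model: $E(y) = \beta_0 + \beta_1 x_1 + \cdots + \beta_g x_g$
Complete model: $E(y) = \beta_0 + \beta_1 x_1 + \cdots + \beta_g x_g + \beta_{g+1} x_{g+1} + \cdots + \beta_k x_k$
$H_0: \beta_{g+1} = \beta_{g+2} = \cdots = \beta_k = 0$
$H_a$: At least one β under test is nonzero
Test statistic: $F = \dfrac{(SSE_R - SSE_C)/(k-g)}{SSE_C/[n-(k+1)]}$
where $SSE_R$ and $SSE_C$ are the sums of squared errors for the reduced and complete models
Rejection region: $F > F_\alpha$, with $k-g$ numerator degrees of freedom and $[n-(k+1)]$ denominator degrees of freedom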
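
The nested-model statistic is simple arithmetic once both models have been fit. The sketch below assumes the usual form of the statistic, $F = \frac{(SSE_R - SSE_C)/(k-g)}{SSE_C/[n-(k+1)]}$, with degrees of freedom as stated on the slide; the function name and the SSE values are hypothetical.

```python
def nested_f(sse_reduced, sse_complete, n, k, g):
    """F statistic for H0: the k - g extra betas in the complete model are all zero."""
    numerator = (sse_reduced - sse_complete) / (k - g)  # drop in SSE per added term
    denominator = sse_complete / (n - (k + 1))          # error variance estimate, complete model
    return numerator / denominator


# Hypothetical fits: reduced model with g = 3 terms, complete model with k = 5 terms
f_stat = nested_f(sse_reduced=150.0, sse_complete=100.0, n=26, k=5, g=3)
df_num, df_den = 5 - 3, 26 - (5 + 1)
print(f_stat, df_num, df_den)
```

A large F means the extra terms reduce SSE by more than chance alone would explain, which favors the complete model.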

34 Model Building: Stepwise Regression
- Used when there is a large set of candidate independent variables
- Software packages add variables in order of explanatory value
- Decisions are based on the largest t-values at each step
- The procedure is best used as a screening procedure only

35 Residual Analysis: Checking the Regression Assumptions
Regression residual: the difference between an observed y value and its corresponding predicted value, $\hat{\varepsilon} = y - \hat{y}$
Properties of Regression Residuals
- The mean of the residuals equals zero
- The standard deviation of the residuals is equal to the standard deviation of the fitted regression model, $s$
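
Both residual properties can be checked numerically. The sketch below uses a one-variable least squares line to keep the code short; the data are hypothetical, and $s$ is computed as $\sqrt{SSE/[n-(k+1)]}$ with $k = 1$.

```python
from math import sqrt


def fit_line(xs, ys):
    """Least squares intercept and slope for y = b0 + b1*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(xs, ys)
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]  # observed minus predicted

mean_residual = sum(residuals) / len(residuals)
k = 1                                                    # one independent variable
s = sqrt(sum(e ** 2 for e in residuals) / (len(xs) - (k + 1)))

print(round(mean_residual, 10), round(s, 4))
```

The zero mean holds for any least squares fit that includes an intercept, so it cannot by itself reveal a misspecified model. That is why the later slides rely on residual plots.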

36 Residual Analysis: Checking the Regression Assumptions
Analyzing Residuals
- The top plot of residuals reveals a non-random, curved pattern
- The second plot, made after a second-order term is added to the model, shows a random pattern, indicating a better model

37 Residual Analysis: Checking the Regression Assumptions
Identifying Outliers
- Residual plots can reveal outliers
- Outliers need to be checked to try to determine whether an error is involved
- If an error is involved, or the observation is not representative, the analysis can be rerun after deleting the data point to assess the effect

38 Residual Analysis: Checking the Regression Assumptions
Checking for Normal Errors
[Plots: with outlier; without outlier]

39 Residual Analysis: Checking the Regression Assumptions
Checking for Equal Variances
- A pattern in the residuals indicates a violation of the equal variance assumption
- It can point to the use of a transformation on the dependent variable to stabilize the variance

40 Residual Analysis: Checking the Regression Assumptions
Steps in Residual Analysis
1. Check for a misspecified model by plotting residuals against the quantitative independent variables
2. Examine residual plots for outliers
3. Check for non-normal errors using a frequency distribution of the residuals
4. Check for unequal error variances using plots of residuals against predicted values

41 Some Pitfalls: Estimability, Multicollinearity, and Extrapolation
- Estimability: the number of levels of observed x-values must be at least one more than the order of the polynomial in x that you want to fit
- Multicollinearity: when two or more independent variables are correlated

42 Some Pitfalls: Estimability, Multicollinearity, and Extrapolation
Multicollinearity: when two or more independent variables are correlated
- Leads to confusing, misleading results and incorrect parameter estimate signs
- Can be identified by
  - checking correlations among the x’s
  - non-significant t-tests for most/all x’s
  - signs opposite from expected in the estimated β parameters
- Can be addressed by
  - dropping one or more of the correlated variables in the model
  - restricting inferences to the range of the sample data, and not making inferences about individual β parameters based on t-tests
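
The first diagnostic, checking correlations among the x’s, can be sketched as a pairwise scan. The function names, the example columns, and the 0.8 cutoff are illustrative assumptions; the slides do not give a threshold.

```python
from itertools import combinations
from math import sqrt


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    ss_ab = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    ss_aa = sum((x - a_bar) ** 2 for x in a)
    ss_bb = sum((y - b_bar) ** 2 for y in b)
    return ss_ab / sqrt(ss_aa * ss_bb)


def flag_correlated(columns, threshold=0.8):
    """Return (name_a, name_b, r) for each pair with |r| at or above the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson_r(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical independent variables: x2 is nearly a multiple of x1
columns = {
    "x1": [1, 2, 3, 4, 5],
    "x2": [2.1, 3.9, 6.2, 7.8, 10.1],
    "x3": [5, 1, 4, 2, 3],
}
flagged = flag_correlated(columns)
print([(a, b, round(r, 3)) for a, b, r in flagged])
```

Pairwise correlations can miss multicollinearity that involves three or more variables at once, so the other two symptoms listed on the slide remain worth checking.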

43 Some Pitfalls: Estimability, Multicollinearity, and Extrapolation
- Extrapolation: using the model to predict outside the range of the sample data is dangerous
- Correlated errors: most common when working with time series data, where values of y and the x’s are observed over a period of time. The solution is to develop a time series model
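
A simple guard against extrapolation is to compare each new x-value with the range observed in the sample before predicting. The helper name and the data below are hypothetical.

```python
def outside_sample_range(new_point, sample_rows):
    """Return the indices of variables whose new value lies outside the observed min-max range."""
    out_of_range = []
    for j, value in enumerate(new_point):
        column = [row[j] for row in sample_rows]
        if value < min(column) or value > max(column):
            out_of_range.append(j)
    return out_of_range


# Hypothetical sample of (x1, x2) values
sample_rows = [(1290, 2), (1600, 3), (2930, 4)]

inside = outside_sample_range((1500, 3), sample_rows)
beyond = outside_sample_range((3500, 3), sample_rows)
print(inside, beyond)
```

This check is necessary but not sufficient: a point can lie within every variable's individual range and still fall outside the region the sample covers jointly.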