Business Statistics: A First Course (3rd Edition)
Chapter 11: Multiple Regression
© 2003 Prentice-Hall, Inc.


Chapter Topics
- The Multiple Regression Model
- Residual Analysis
- Testing for the Significance of the Regression Model
- Inferences on the Population Regression Coefficients
- Testing Portions of the Multiple Regression Model

Chapter Topics (continued)
- The Quadratic Regression Model
- Dummy Variables
- Using Transformations in Regression Models
- Collinearity
- Model Building
- Pitfalls in Multiple Regression and Ethical Issues

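The Multiple Regression Model
The relationship between 1 dependent and 2 or more independent variables is a linear function:
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i$
- $Y_i$: dependent (response) variable
- $X_{1i}, \ldots, X_{ki}$: independent (explanatory) variables
- $\beta_0$: population Y-intercept
- $\beta_1, \ldots, \beta_k$: population slopes
- $\varepsilon_i$: random error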

Multiple Regression Model
Bivariate model (two explanatory variables): $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$

Multiple Regression Equation
Bivariate model: $\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{2i}$

Multiple Regression Equation
Too complicated to compute by hand! Ouch!

Interpretation of Estimated Coefficients
- Slope ($b_i$): the estimated average value of Y changes by $b_i$ for each 1-unit increase in $X_i$, holding all other variables constant (ceteris paribus).
- Example: if $b_1 = -2$, then fuel oil usage (Y) is expected to decrease by an estimated 2 gallons for each 1-degree increase in temperature ($X_1$), given the inches of insulation ($X_2$).
- Y-intercept ($b_0$): the estimated average value of Y when all $X_i = 0$.
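The least-squares computation that software performs can be sketched as follows. This is a minimal illustration, assuming the coefficients are obtained by solving the normal equations $(X'X)b = X'y$. The function names and the data are hypothetical (the data are generated without noise from known coefficients so the fit can be checked) and are not the heating oil data from the slides.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple_regression(xs, y):
    """Return [b0, b1, ..., bk] minimizing the sum of squared residuals.

    xs is a list of rows, each holding the k explanatory values.
    """
    design = [[1.0] + list(row) for row in xs]  # leading 1 for the intercept
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical, noise-free data generated from Y = 10 - 2*X1 + 3*X2.
xs = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8), (6, 2)]
y = [10 - 2 * x1 + 3 * x2 for x1, x2 in xs]
b0, b1, b2 = fit_multiple_regression(xs, y)
```

Here `b1` recovers the slope of -2: each 1-unit increase in X1 lowers the fitted Y by 2 with X2 held fixed, which is the interpretation given on the slide.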

Multiple Regression Model: Example
Develop a model for estimating heating oil used for a single-family home in the month of January based on average temperature (°F) and amount of insulation in inches.

Multiple Regression Equation: Example (Excel Output)
- For each 1-degree increase in temperature, the estimated average amount of heating oil used decreases by $|b_1|$ gallons, holding insulation constant.
- For each 1-inch increase in insulation, the estimated average use of heating oil decreases by $|b_2|$ gallons, holding temperature constant.

Multiple Regression in PHStat
- PHStat | Regression | Multiple Regression …
- Excel spreadsheet for the heating oil example

Simple and Multiple Regression Compared
- Coefficients in a simple regression pick up the impact of that variable on the dependent variable, plus the impacts of other variables that are correlated with it.
- Coefficients in a multiple regression account for the impacts of the other variables in the equation.

Simple and Multiple Regression Compared: Example
- Two simple regressions: Oil on Temp, and Oil on Insulation
- Multiple regression: Oil on Temp and Insulation together

Venn Diagrams and Explanatory Power of Regression
[Figure: overlapping circles for Oil and Temp.]
- Overlap: variation in Oil explained by Temp (equivalently, variation in Temp used in explaining variation in Oil)
- Oil only: variation in Oil explained by the error term
- Temp only: variation in Temp not used in explaining variation in Oil

Venn Diagrams and Explanatory Power of Regression (continued)
[Figure: overlapping circles for Oil and Temp.]

Venn Diagrams and Explanatory Power of Regression
[Figure: overlapping circles for Oil, Temp, and Insulation.]
- Overlapping variation in both Temp and Insulation is used in explaining the variation in Oil but NOT in the estimation of $\beta_1$ nor $\beta_2$.
- The rest of the Oil circle is variation NOT explained by Temp nor Insulation.

Venn Diagrams and Explanatory Power of Regression
[Figure: overlapping circles for Oil, Temp, and Insulation.]

Coefficient of Multiple Determination
$r^2_{Y \cdot 12 \ldots k} = \frac{SSR}{SST}$
- Proportion of total variation in Y explained by all X variables taken together
- Never decreases when a new X variable is added to the model, which is a disadvantage when comparing models

Adjusted Coefficient of Multiple Determination
$r^2_{adj} = 1 - (1 - r^2)\,\frac{n - 1}{n - k - 1}$
- Proportion of variation in Y explained by all X variables, adjusted for the number of X variables used and the sample size
- Penalizes excessive use of independent variables
- Smaller than $r^2$
- Useful in comparing among models

Coefficient of Multiple Determination: Excel Output
Adjusted $r^2$:
- reflects the number of explanatory variables and sample size
- is smaller than $r^2$

Interpretation of Coefficient of Multiple Determination
- 96.56% of the total variation in heating oil can be explained by temperature and amount of insulation.
- 95.99% of the total variation in heating oil can be explained by temperature and amount of insulation after adjusting for the number of explanatory variables and sample size.
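The adjustment can be checked with a short sketch, assuming the usual formula $r^2_{adj} = 1 - (1 - r^2)(n - 1)/(n - k - 1)$. The sample size n = 15 is not stated on this slide; it is inferred from the degrees of freedom (2 and 12) given for the overall F test, with k = 2 explanatory variables. The function name is illustrative.

```python
def adjusted_r2(r2, n, k):
    """Adjusted coefficient of multiple determination.

    r2: unadjusted r-squared, n: sample size, k: number of X variables.
    """
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# r2 = 0.9656 from the slide; n = 15 inferred from df = 2 and 12.
adj = adjusted_r2(0.9656, n=15, k=2)
```

With these inputs the result rounds to 0.9599, matching the 95.99% quoted on the slide.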

Using the Regression Equation to Make Predictions
Predict the amount of heating oil used for a home if the average temperature is 30°F and the insulation is 6 inches.
The predicted heating oil used is $\hat{Y} = b_0 + b_1(30) + b_2(6)$ gallons.

Predictions in PHStat
- PHStat | Regression | Multiple Regression …
- Check the “Confidence and Prediction Interval Estimate” box
- Excel spreadsheet for the heating oil example

Residual Plots
- Residuals vs. $\hat{Y}$: may need to transform the Y variable
- Residuals vs. $X_1$: may need to transform the $X_1$ variable
- Residuals vs. $X_2$: may need to transform the $X_2$ variable
- Residuals vs. time: may have autocorrelation

Residual Plots: Example
[Figure: two residual plots.] One shows no discernible pattern; the other suggests there may be some nonlinear relationship.

Testing for Overall Significance
- Shows if there is a linear relationship between all of the X variables together and Y
- Use the F test statistic
- Hypotheses:
  $H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$ (no linear relationship)
  $H_1$: at least one $\beta_i \neq 0$ (at least one independent variable affects Y)
- The null hypothesis is a very strong statement
- The null hypothesis is almost always rejected

Testing for Overall Significance (continued)
Test statistic: $F = \frac{MSR}{MSE} = \frac{SSR / k}{SSE / (n - k - 1)}$
where F has k numerator and (n - k - 1) denominator degrees of freedom
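A minimal sketch of the overall F statistic, assuming the standard ANOVA definitions MSR = SSR / k and MSE = SSE / (n - k - 1). The second function uses the equivalent form in terms of $r^2$. The function names and the sums of squares in the example are hypothetical.

```python
def overall_f(ssr, sse, n, k):
    """F statistic for overall significance from the ANOVA sums of squares."""
    msr = ssr / k            # mean square due to regression
    mse = sse / (n - k - 1)  # mean square error
    return msr / mse


def overall_f_from_r2(r2, n, k):
    """Same statistic written in terms of r-squared = SSR / SST."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


# Hypothetical sums of squares: SSR = 80, SSE = 20, so r2 = 0.8.
f_anova = overall_f(ssr=80.0, sse=20.0, n=13, k=2)
f_r2 = overall_f_from_r2(r2=0.8, n=13, k=2)
```

Both forms give the same value, which is compared against the critical F with k and (n - k - 1) degrees of freedom.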

Test for Overall Significance: Excel Output Example
[Excel ANOVA output.] Annotations: regression df = k = 2, the number of explanatory variables; total df = n - 1; p-value of the F test.

Test for Overall Significance: Example Solution
- $H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$
- $H_1$: at least one $\beta_i \neq 0$
- $\alpha = 0.05$; df = 2 and 12
- Critical value: $F_{0.05;\,2,\,12}$
- Test statistic: F from the Excel output
- Decision: reject $H_0$ at $\alpha = 0.05$
- Conclusion: there is evidence that at least one independent variable affects Y

Test for Significance: Individual Variables
- Shows if there is a linear relationship between the variable $X_i$ and Y
- Use the t test statistic
- Hypotheses:
  $H_0: \beta_i = 0$ (no linear relationship)
  $H_1: \beta_i \neq 0$ (linear relationship between $X_i$ and Y)

t Test Statistic: Excel Output Example
[Excel coefficient table.] Annotations: t test statistic for $X_1$ (temperature); t test statistic for $X_2$ (insulation).

t Test: Example Solution
Does temperature have a significant effect on monthly consumption of heating oil? Test at $\alpha = 0.05$.
- $H_0: \beta_1 = 0$
- $H_1: \beta_1 \neq 0$
- df = 12
- Critical values: $\pm t_{0.025,\,12}$
- Test statistic: t for $X_1$ from the Excel output
- Decision: reject $H_0$ at $\alpha = 0.05$
- Conclusion: there is evidence of a significant effect of temperature on oil consumption

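Venn Diagrams and Estimation of Regression Model
[Figure: overlapping circles for Oil, Temp, and Insulation.]
- Overlap of Oil with Temp alone: only this information is used in the estimation of $\beta_1$
- Overlap of Oil with both Temp and Insulation: this information is NOT used in the estimation of $\beta_1$ nor $\beta_2$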

Confidence Interval Estimate for the Slope
Provide the 95% confidence interval for the population slope $\beta_1$ (the effect of temperature on oil consumption):
$b_1 \pm t_{n-k-1}\, S_{b_1}$, giving $-6.17 \le \beta_1 \le -4.70$
The estimated average consumption of oil is reduced by between 4.70 and 6.17 gallons for each increase of 1°F.

Contribution of a Single Independent Variable
Let $X_k$ be the independent variable of interest.
$SSR(X_k \mid \text{all others except } X_k) = SSR(\text{all}) - SSR(\text{all others except } X_k)$
Measures the contribution of $X_k$ in explaining the total variation in Y.

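Contribution of a Single Independent Variable (continued)
Measures the contribution of $X_k$ in explaining Y. Both sums of squares come from the ANOVA section of the regression output: one from the regression with all variables, the other from the regression without $X_k$.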

Coefficient of Partial Determination of $X_k$
$r^2_{Yk \cdot \text{all others}} = \frac{SSR(X_k \mid \text{all others})}{SST - SSR(\text{all}) + SSR(X_k \mid \text{all others})}$
Measures the proportion of variation in the dependent variable that is explained by $X_k$ while controlling for (holding constant) the other independent variables.

Coefficient of Partial Determination for $X_k$ (continued)
Example: model with two independent variables
$r^2_{Y1 \cdot 2} = \frac{SSR(X_1 \mid X_2)}{SST - SSR(X_1 \text{ and } X_2) + SSR(X_1 \mid X_2)}$
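A minimal sketch of the partial determination calculation, assuming the contribution of a variable is the regression sum of squares of the full model minus that of the model without the variable. The function name and the sums of squares are hypothetical.

```python
def partial_r2(sst, ssr_full, ssr_without):
    """Coefficient of partial determination for one variable.

    sst: total sum of squares
    ssr_full: SSR of the model with all variables
    ssr_without: SSR of the model that omits the variable of interest
    """
    contribution = ssr_full - ssr_without  # SSR(Xk | all others)
    return contribution / (sst - ssr_full + contribution)


# Hypothetical values: SST = 100, SSR(X1 and X2) = 90, SSR(X2) = 50.
r2_y1_2 = partial_r2(sst=100.0, ssr_full=90.0, ssr_without=50.0)
```

In this made-up case X1 explains 80% of the variation in Y that X2 leaves unexplained.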

Venn Diagrams and Coefficient of Partial Determination for $X_k$
[Figure: overlapping circles for Oil, Temp, and Insulation, showing the partial determination as a ratio of areas.]

Coefficient of Partial Determination in PHStat
- PHStat | Regression | Multiple Regression …
- Check the “Coefficient of Partial Determination” box
- Excel spreadsheet for the heating oil example

Contribution of a Subset of Independent Variables
Let $X_s$ be the subset of independent variables of interest.
$SSR(X_s \mid \text{all others except } X_s) = SSR(\text{all}) - SSR(\text{all others except } X_s)$
Measures the contribution of the subset $X_s$ in explaining SST.

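Contribution of a Subset of Independent Variables: Example
Let $X_s$ be $X_1$ and $X_3$.
$SSR(X_1 \text{ and } X_3 \mid \text{all others}) = SSR(\text{all}) - SSR(\text{all others except } X_1 \text{ and } X_3)$
Each term comes from the ANOVA section of the corresponding regression.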

Testing Portions of Model
- Examines the contribution of a subset $X_s$ of explanatory variables to the relationship with Y
- Null hypothesis: variables in the subset do not significantly improve the model when all other variables are included
- Alternative hypothesis: at least one variable is significant

Testing Portions of Model (continued)
- One-tailed rejection region
- Requires comparison of two regressions:
  one regression includes everything;
  another regression includes everything except the portion to be tested

Partial F Test for the Contribution of a Subset of X Variables
- Hypotheses:
  $H_0$: variables $X_s$ do not significantly improve the model given all other variables included
  $H_1$: variables $X_s$ significantly improve the model given all others included
- Test statistic: $F = \frac{SSR(X_s \mid \text{all others}) / m}{MSE(\text{all})}$, with df = m and (n - k - 1)
- m = number of variables in the subset $X_s$

Partial F Test for the Contribution of a Single $X_j$
- Hypotheses:
  $H_0$: variable $X_j$ does not significantly improve the model given all others included
  $H_1$: variable $X_j$ significantly improves the model given all others included
- Test statistic: $F = \frac{SSR(X_j \mid \text{all others})}{MSE(\text{all})}$, with df = 1 and (n - k - 1)
- m = 1 here

Testing Portions of Model: Example
Test at the $\alpha = 0.05$ level to determine if the variable of average temperature significantly improves the model given that insulation is included.

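Testing Portions of Model: Example (continued)
- $H_0$: $X_1$ (temperature) does not improve the model with $X_2$ (insulation) included
- $H_1$: $X_1$ does improve the model
- $\alpha = 0.05$; df = 1 and 12; critical value = 4.75
- Test statistic: $F = \frac{SSR(X_1 \mid X_2)}{MSE(X_1 \text{ and } X_2)}$, using the ANOVA output for the model with $X_1$ and $X_2$ and for the model with $X_2$ only
- Conclusion: reject $H_0$; $X_1$ does improve the model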
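The comparison of the two regressions can be sketched as below. This is an illustration assuming the partial F statistic is the extra regression sum of squares per tested variable divided by the MSE of the full model. The sums of squares are hypothetical and are not the heating oil output; only the critical value 4.75 (df = 1 and 12) comes from the slide.

```python
def partial_f(ssr_full, ssr_reduced, sse_full, n, k, m):
    """Partial F statistic for a subset of m variables.

    ssr_full, sse_full: from the regression with all k variables
    ssr_reduced: SSR from the regression without the tested subset
    """
    mse_full = sse_full / (n - k - 1)
    return ((ssr_full - ssr_reduced) / m) / mse_full


# Hypothetical ANOVA values with n = 15, k = 2, testing m = 1 variable.
f_stat = partial_f(ssr_full=90.0, ssr_reduced=50.0, sse_full=12.0, n=15, k=2, m=1)
critical_value = 4.75  # F critical value for df = 1 and 12 at alpha = 0.05
reject_h0 = f_stat > critical_value
```

The rejection region is one-tailed, so the null hypothesis is rejected only when the statistic exceeds the critical value.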

Testing Portions of Model in PHStat
- PHStat | Regression | Multiple Regression …
- Check the “Coefficient of Partial Determination” box
- Excel spreadsheet for the heating oil example

Do We Need to Do This for One Variable?
- The F test for the inclusion of a single variable after all other variables are included in the model is IDENTICAL to the t test of the slope for that variable.
- The only reason to do an F test is to test several variables together.

The Quadratic Regression Model
- The relationship between the response variable and the explanatory variable is a quadratic polynomial function
- Useful when the scatter diagram indicates a nonlinear relationship
- Quadratic model: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{1i}^2 + \varepsilon_i$
- The second explanatory variable is the square of the first variable

Quadratic Regression Model (continued)
A quadratic model may be considered when a scatter diagram takes on one of the following shapes:
[Figure: four scatter diagrams of Y against $X_1$, two with $\beta_2 > 0$ and two with $\beta_2 < 0$.]
$\beta_2$ = the coefficient of the quadratic term

Testing for Significance: Quadratic Model
- Testing for overall relationship: similar to the test for the linear model; F test statistic = $\frac{MSR}{MSE}$
- Testing the quadratic effect: compare the quadratic model with the linear model
- Hypotheses:
  $H_0: \beta_2 = 0$ (no quadratic term)
  $H_1: \beta_2 \neq 0$ (quadratic term is needed)

Heating Oil Example
Determine if a quadratic model is needed for estimating heating oil used for a single-family home in the month of January based on average temperature (°F) and amount of insulation in inches.

Heating Oil Example: Residual Analysis (continued)
[Figure: two residual plots.] One shows no discernible pattern; the other suggests a possible nonlinear relationship.

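Heating Oil Example: t Test for Quadratic Model (continued)
Testing the quadratic effect:
- Model with quadratic insulation term: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{2i}^2 + \varepsilon_i$
- Model without quadratic insulation term: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$
- Hypotheses:
  $H_0: \beta_3 = 0$ (no quadratic term in insulation)
  $H_1: \beta_3 \neq 0$ (quadratic term is needed in insulation)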

Example Solution
Is the quadratic term in insulation needed for monthly consumption of heating oil? Test at $\alpha = 0.05$.
- $H_0: \beta_3 = 0$
- $H_1: \beta_3 \neq 0$
- df = 11
- Critical values: $\pm t_{0.025,\,11}$
- Test statistic: t for the quadratic term from the Excel output
- Decision: do not reject $H_0$ at $\alpha = 0.05$
- Conclusion: there is not sufficient evidence for the need to include the quadratic effect of insulation on oil consumption

Example Solution in PHStat
- PHStat | Regression | Multiple Regression …
- Excel spreadsheet for the heating oil example

Dummy Variable Models
- Categorical explanatory variable with 2 or more levels: yes or no, on or off, male or female
- Use dummy variables (coded as 0 or 1)
- Only intercepts are different
- Assumes equal slopes across categories
- The number of dummy variables needed is (number of levels - 1)
- Regression model has the same form: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i$

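Dummy-Variable Models (with 2 Levels)
Given:
- Y = assessed value of house
- $X_1$ = square footage of house
- $X_2$ = desirability of neighborhood: 0 if undesirable, 1 if desirable
Fitted model: $\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{2i}$
- Desirable ($X_2 = 1$): $\hat{Y}_i = (b_0 + b_2) + b_1 X_{1i}$
- Undesirable ($X_2 = 0$): $\hat{Y}_i = b_0 + b_1 X_{1i}$
Same slopes in both cases.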

Dummy-Variable Models (with 2 Levels) (continued)
[Figure: Y (assessed value) against $X_1$ (square footage), with two parallel lines. The line for the desirable location has intercept $b_0 + b_2$; the line for the undesirable location has intercept $b_0$.]
Same slopes; intercepts different.

Interpretation of the Dummy Variable Coefficient (with 2 Levels)
Example:
- Y: annual salary of college graduate in thousand $
- $X_1$: GPA
- $X_2$: 0 for a non-business degree, 1 for a business degree
- Fitted model: $\hat{Y}_i = b_0 + b_1 X_{1i} + 6 X_{2i}$
With the same GPA, college graduates with a business degree are making an estimated 6 thousand dollars more than graduates with a non-business degree, on average.
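A small sketch of how the dummy coefficient shifts the intercept while leaving the slope unchanged. The premium of 6 (thousand dollars) is from the slide; the intercept and GPA slope defaults are hypothetical placeholders, since the slide transcript does not preserve them.

```python
def predicted_salary(gpa, business_degree, b0=20.0, b1=5.0, b2=6.0):
    """Predicted salary in thousand dollars.

    business_degree is the dummy variable: 1 for business, 0 otherwise.
    b0 and b1 are hypothetical; b2 = 6 is the premium quoted on the slide.
    """
    return b0 + b1 * gpa + b2 * business_degree


# The gap between the two groups is the same at every GPA.
gap_at_3_0 = predicted_salary(3.0, 1) - predicted_salary(3.0, 0)
gap_at_3_8 = predicted_salary(3.8, 1) - predicted_salary(3.8, 0)
```

Because the model has no interaction term, the gap equals `b2` regardless of GPA, which is what "same slopes, different intercepts" means.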

Regression Model Containing an Interaction Term
- Hypothesizes interaction between a pair of X variables: the response to one X variable varies at different levels of another X variable
- Contains a cross-product term: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{1i} X_{2i} + \varepsilon_i$
- Can be combined with other models, e.g., the dummy variable model

Effect of Interaction
Given: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{1i} X_{2i} + \varepsilon_i$
- Without the interaction term, the effect of $X_1$ on Y is measured by $\beta_1$
- With the interaction term, the effect of $X_1$ on Y is measured by $\beta_1 + \beta_3 X_2$
- The effect changes as $X_2$ changes

Interaction Example
$Y = 1 + 2X_1 + 3X_2 + 4X_1 X_2$
- With $X_2 = 1$: $Y = 1 + 2X_1 + 3(1) + 4X_1(1) = 4 + 6X_1$
- With $X_2 = 0$: $Y = 1 + 2X_1 + 3(0) + 4X_1(0) = 1 + 2X_1$
The effect (slope) of $X_1$ on Y depends on the $X_2$ value.
[Figure: Y against $X_1$, showing the two lines.]

Interaction Regression Model Worksheet
- Multiply $X_1$ by $X_2$ to get $X_1 X_2$
- Run the regression with Y, $X_1$, $X_2$, $X_1 X_2$
Worksheet columns: Case i | $Y_i$ | $X_{1i}$ | $X_{2i}$ | $X_{1i} X_{2i}$
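The interaction example and the worksheet step can be sketched together. The model $Y = 1 + 2X_1 + 3X_2 + 4X_1X_2$ is the one from the slides; the function names and the small grid of X values are illustrative only.

```python
def y_from_model(x1, x2):
    """The slide's example model with a cross-product term."""
    return 1 + 2 * x1 + 3 * x2 + 4 * x1 * x2


def slope_of_x1(x2):
    """Effect of a 1-unit increase in X1 at a given X2: beta1 + beta3 * X2."""
    return 2 + 4 * x2


def add_interaction_column(rows):
    """rows holds (y, x1, x2); returns (y, x1, x2, x1 * x2) per case."""
    return [(y, x1, x2, x1 * x2) for y, x1, x2 in rows]


# Illustrative cases on a small grid of X1 and X2 values.
rows = [(y_from_model(x1, x2), x1, x2) for x1 in (0, 1, 2) for x2 in (0, 1)]
worksheet = add_interaction_column(rows)
```

The slope of X1 is 2 when X2 = 0 and 6 when X2 = 1, matching the two lines on the slide; the extra worksheet column is what gets passed to the regression as a fourth variable.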

Interpretation when there are 3+ Levels
- MALE = 0 if female and 1 if male
- MARRIED = 1 if married; 0 if not
- DIVORCED = 1 if divorced; 0 if not
- MALEMARRIED = 1 if male and married; 0 otherwise (MALE times MARRIED)
- MALEDIVORCED = 1 if male and divorced; 0 otherwise (MALE times DIVORCED)

Interpretation when there are 3+ Levels (continued)

Interpreting Results
- Main effects: MALE, MARRIED, and DIVORCED
- Interaction effects: MALEMARRIED and MALEDIVORCED
[Table: estimated equations for single, married, and divorced females and males, with the male-female difference for each marital status.]
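The coding scheme for the main-effect and interaction dummies can be sketched as follows. The variable names follow the slides; the function name and the string labels for sex and marital status are assumptions made for illustration. Single is the baseline level, so it gets no dummy of its own.

```python
def encode(sex, status):
    """Return the dummy and interaction columns for one person.

    sex: "male" or "female"; status: "single", "married", or "divorced".
    """
    male = 1 if sex == "male" else 0
    married = 1 if status == "married" else 0
    divorced = 1 if status == "divorced" else 0
    return {
        "MALE": male,
        "MARRIED": married,
        "DIVORCED": divorced,
        "MALEMARRIED": male * married,    # MALE times MARRIED
        "MALEDIVORCED": male * divorced,  # MALE times DIVORCED
    }


single_female = encode("female", "single")
married_male = encode("male", "married")
```

A single female has every column equal to 0, so her predicted value is the intercept alone; each other group adds the coefficients of whichever columns equal 1.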

Evaluating Presence of Interaction
- Hypothesize interaction between a pair of independent variables
- Contains a cross-product term
- Hypotheses:
  $H_0: \beta_3 = 0$ (no interaction between $X_1$ and $X_2$)
  $H_1: \beta_3 \neq 0$ ($X_1$ interacts with $X_2$)

Using Transformations
- Requires data transformation
- Either or both independent and dependent variables may be transformed
- Can be based on theory, logic, or scatter diagrams

Inherently Nonlinear Models
- Nonlinear models that can be expressed in linear form
- Can be estimated by least squares in linear form
- Require data transformation

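Transformed Multiplicative Model (Log-Log)
- Original model: $Y_i = \beta_0 X_{1i}^{\beta_1} X_{2i}^{\beta_2} \varepsilon_i$
- Transformed into: $\ln Y_i = \ln \beta_0 + \beta_1 \ln X_{1i} + \beta_2 \ln X_{2i} + \ln \varepsilon_i$
[Figure: curves of Y against $X_1$ for different values of $\beta_1$; similarly for $X_2$.]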

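Square Root Transformation
$Y_i = \beta_0 + \beta_1 \sqrt{X_{1i}} + \beta_2 \sqrt{X_{2i}} + \varepsilon_i$
[Figure: curves of Y against $X_1$ for $\beta_1 > 0$ and $\beta_1 < 0$; similarly for $X_2$.]
- Transforms a nonlinear model into one that appears linear
- Often used to overcome heteroscedasticity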

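Exponential Transformation (Log-Linear)
- Original model: $Y_i = e^{\beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i}}\, \varepsilon_i$
- Transformed into: $\ln Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \ln \varepsilon_i$
[Figure: curves of Y against $X_1$ for $\beta_1 > 0$ and $\beta_1 < 0$.]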

Interpretation of Coefficients
Transformed exponential model (the dependent variable is logged):
The coefficient of the independent variable $X_1$ can be approximately interpreted as follows: a 1-unit change in $X_1$ leads to an estimated average change of $100\,b_1$ percent in Y.

Interpretation of Coefficients (continued)
Transformed multiplicative model (both dependent and independent variables are logged):
The coefficient of the independent variable $X_1$ can be approximately interpreted as follows: a 1 percent change in $X_1$ leads to an estimated average change of $b_1$ percent in Y. Therefore $b_1$ is the elasticity of Y with respect to a change in $X_1$.
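The log-log interpretation can be illustrated with one explanatory variable. This is a sketch under simplifying assumptions: the data are hypothetical and noise-free, generated from $Y = 3 X^{1.5}$, and the fit is an ordinary simple regression of ln Y on ln X. The function name is illustrative.

```python
import math


def simple_regression(x, y):
    """Least-squares intercept and slope for one explanatory variable."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical multiplicative data: Y = 3 * X ** 1.5 with no error term.
x = [1.0, 2.0, 4.0, 8.0, 16.0]
y = [3.0 * xi ** 1.5 for xi in x]

# Regress ln(Y) on ln(X); the slope estimates the elasticity.
b0, b1 = simple_regression([math.log(v) for v in x], [math.log(v) for v in y])
multiplier = math.exp(b0)  # back-transform the intercept
```

The slope `b1` recovers the exponent 1.5, read as: a 1 percent change in X goes with roughly a 1.5 percent change in Y.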

Collinearity (Multicollinearity)
- High correlation between explanatory variables
- The coefficient of multiple determination measures the combined effect of the correlated explanatory variables
- Little or no new information is provided
- Leads to unstable coefficients (large standard error)

Venn Diagrams and Collinearity
[Figure: overlapping circles for Oil, Temp, and Insulation, with a large overlap between Temp and Insulation.]
- The large overlap in variation of Temp and Insulation is used in explaining the variation in Oil but NOT in estimating $\beta_1$ and $\beta_2$.
- The large overlap reflects collinearity between Temp and Insulation.

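Detect Collinearity (Variance Inflationary Factor)
- $VIF_j = \frac{1}{1 - R_j^2}$ is used to measure collinearity, where $R_j^2$ is the coefficient of determination from regressing $X_j$ on the other explanatory variables
- If $VIF_j > 5$, $X_j$ is highly correlated with the other explanatory variables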
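A minimal sketch of the variance inflationary factor, assuming the standard definition $VIF_j = 1 / (1 - R_j^2)$, where $R_j^2$ comes from regressing $X_j$ on the other explanatory variables. The threshold of 5 is the one used in the slides; the function names and example values are hypothetical.

```python
def vif(r2_j):
    """Variance inflationary factor from the auxiliary r-squared of X_j."""
    return 1.0 / (1.0 - r2_j)


def is_collinear(r2_j, threshold=5.0):
    """Flag X_j as highly correlated with the other X variables."""
    return vif(r2_j) > threshold


no_correlation = vif(0.0)   # X_j unrelated to the other X variables
at_threshold = vif(0.8)     # auxiliary r-squared of 0.8
```

An auxiliary $R_j^2$ of 0.8 corresponds to a VIF of about 5, so the slide's cutoff flags variables for which more than 80% of the variation is explained by the other X variables.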

Detect Collinearity in PHStat
- PHStat | Regression | Multiple Regression …
- Check the “Variance Inflationary Factor (VIF)” box
- Excel spreadsheet for the heating oil example
- Since there are only two explanatory variables, only one VIF is reported in the Excel spreadsheet
- No VIF is > 5, so there is no evidence of collinearity

Model Building
- Goal is to develop a good model with the fewest explanatory variables
  Easier to interpret
  Lower probability of collinearity
- Stepwise regression procedure
  Provides limited evaluation of alternative models
- Best-subset approach
  Uses the adjusted $r^2$ or $C_p$ statistic
  Selects the model with the largest adjusted $r^2$ or a small $C_p$ near k + 1
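The $C_p$ criterion can be sketched as below. The slides do not give the formula, so this assumes one common form, $C_p = \frac{(1 - R_k^2)(n - T)}{1 - R_T^2} - [n - 2(k + 1)]$, where k is the number of variables in the candidate model and T is the number of parameters (including the intercept) in the full model. The function name and inputs are hypothetical.

```python
def mallows_cp(r2_k, r2_full, n, k, total_params):
    """C_p statistic for a candidate model with k explanatory variables.

    r2_k: r-squared of the candidate model
    r2_full: r-squared of the full model
    total_params: parameters in the full model, including the intercept
    """
    return (1 - r2_k) * (n - total_params) / (1 - r2_full) - (n - 2 * (k + 1))


# Sanity check on the full model itself: 3 variables, so T = 4.
cp_full = mallows_cp(r2_k=0.9, r2_full=0.9, n=20, k=3, total_params=4)

# A hypothetical 2-variable subset with a slightly lower r-squared.
cp_subset = mallows_cp(r2_k=0.89, r2_full=0.9, n=20, k=2, total_params=4)
```

For the full model the statistic equals k + 1 by construction, which is why candidate subsets are judged by how close their $C_p$ is to k + 1.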

Model Building Flowchart
1. Choose $X_1, X_2, \ldots, X_p$.
2. Run the regression to find the VIFs.
3. Any VIF > 5?
   Yes, more than one: remove the variable with the highest VIF and return to step 2.
   Yes, only one: remove this X and return to step 2.
   No: continue to step 4.
4. Run subsets regression to obtain the “best” models in terms of $C_p$.
5. Do the complete analysis.
6. Add a curvilinear term and/or transform variables as indicated.
7. Perform predictions.

Additional Pitfalls and Ethical Issues
- Failing to understand that the interpretation of the estimated regression coefficients is performed holding all other independent variables constant
- Failing to evaluate residual plots for each independent variable
- Failing to evaluate interaction terms

Additional Pitfalls and Ethical Issues (continued)
- Failing to obtain the VIF for each independent variable and remove variables that exhibit high collinearity with other independent variables before performing the significance test on each independent variable
- Failing to examine several alternative models
- Failing to use other methods when the assumptions necessary for least-squares regression have been seriously violated

Chapter Summary
- Developed the multiple regression model
- Discussed residual plots
- Addressed testing the significance of the multiple regression model
- Discussed inferences on population regression coefficients
- Addressed testing portions of the multiple regression model

Chapter Summary (continued)
- Described the quadratic regression model
- Addressed dummy variables
- Discussed using transformations in regression models
- Described collinearity
- Discussed model building
- Addressed pitfalls in multiple regression and ethical issues