Multiple Regression W&W, Chapter 13, 15(3-4)

Introduction Multiple regression is an extension of bivariate regression that takes into account more than one independent variable. The simplest multivariate model can be written as:

Y = β0 + β1X1 + β2X2 + ε

We make the same assumptions about the error term (ε) that we did in the bivariate case.
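
As a rough illustration, here is a minimal Python/NumPy sketch of this model as a data-generating process. The coefficient values, sample size, and variable ranges are assumptions made up for the example; they are not taken from W&W.

```python
import numpy as np

# Hypothetical data generated from Y = b0 + b1*X1 + b2*X2 + error.
# The "true" coefficients (10, 2, 0.5) are assumed for illustration only.
rng = np.random.default_rng(0)
n = 50
x1 = rng.uniform(0, 10, size=n)       # e.g., fertilizer
x2 = rng.uniform(0, 30, size=n)       # e.g., rainfall
eps = rng.normal(0, 1, size=n)        # error with mean zero, constant variance
y = 10 + 2 * x1 + 0.5 * x2 + eps
```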

Example Suppose we examine the impact of fertilizer on crop yields, but this time we want to control for another factor that we think affects yield levels, rainfall. We collect the following data.

Data: Y (yield), X1 (fertilizer), X2 (rainfall)

Multiple Regression Partial Slope Coefficients β1 is interpreted geometrically as the marginal effect of fertilizer (X1) on yield (Y), holding rainfall (X2) constant. The OLS model is estimated as:

Yp = b0 + b1X1 + b2X2 + e

Solving for b0, b1, and b2 is more complicated than in the bivariate model because we have to consider the relationships between X1 and Y, X2 and Y, and X1 and X2.

Finding the Slopes We would solve the following equations simultaneously for this problem:

Σ(X1 − Mx1)(Y − My) = b1 Σ(X1 − Mx1)² + b2 Σ(X1 − Mx1)(X2 − Mx2)
Σ(X2 − Mx2)(Y − My) = b1 Σ(X1 − Mx1)(X2 − Mx2) + b2 Σ(X2 − Mx2)²
b0 = My − b1Mx1 − b2Mx2

These are called the normal or estimating equations.
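
A sketch of setting up and solving these two normal equations numerically, reusing the simulated x1, x2, and y from the sketch in the Introduction; NumPy's linear solver stands in for the by-hand algebra.

```python
import numpy as np

# Deviations from the means (the building blocks of the normal equations).
d1, d2, dy = x1 - x1.mean(), x2 - x2.mean(), y - y.mean()

# Coefficient matrix and right-hand side of the two normal equations.
A = np.array([[np.sum(d1 * d1), np.sum(d1 * d2)],
              [np.sum(d1 * d2), np.sum(d2 * d2)]])
rhs = np.array([np.sum(d1 * dy), np.sum(d2 * dy)])

b1, b2 = np.linalg.solve(A, rhs)                   # partial slope coefficients
b0 = y.mean() - b1 * x1.mean() - b2 * x2.mean()    # intercept from the means
```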

Solution
b1 = {[Σ(X1 − Mx1)(Y − My)][Σ(X2 − Mx2)²] − [Σ(X2 − Mx2)(Y − My)][Σ(X1 − Mx1)(X2 − Mx2)]} / {[Σ(X1 − Mx1)²][Σ(X2 − Mx2)²] − [Σ(X1 − Mx1)(X2 − Mx2)]²}

b2 = {[Σ(X2 − Mx2)(Y − My)][Σ(X1 − Mx1)²] − [Σ(X1 − Mx1)(Y − My)][Σ(X1 − Mx1)(X2 − Mx2)]} / {[Σ(X1 − Mx1)²][Σ(X2 − Mx2)²] − [Σ(X1 − Mx1)(X2 − Mx2)]²}

Good thing we have computers to calculate this for us!
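
The same slopes computed directly from the closed-form expressions above, continuing the previous sketch (d1, d2, and dy are the deviation vectors defined there); the results should match np.linalg.solve to floating-point precision.

```python
import numpy as np

# Deviation sums of squares and cross-products.
S11, S22, S12 = np.sum(d1 * d1), np.sum(d2 * d2), np.sum(d1 * d2)
S1y, S2y = np.sum(d1 * dy), np.sum(d2 * dy)

denom = S11 * S22 - S12 ** 2
b1 = (S1y * S22 - S2y * S12) / denom
b2 = (S2y * S11 - S1y * S12) / denom
```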

Hypothesis Testing for β We can calculate a confidence interval: β = b ± tα/2(se_b), with df = n − k − 1, where k = the number of regressors. We can also use a t-test for each independent variable, where t = bi/(se_bi), to test the following hypotheses (as one- or two-tailed tests):

H0: β1 = 0    H0: β2 = 0
HA: β1 ≠ 0    HA: β2 ≠ 0
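
A sketch of the t-ratio and 95% confidence interval for b1, continuing the same simulated example (n, b1, and the deviation sums S11, S22, S12 come from the earlier blocks). The standard error formula used here is the one given on the "Standard error for bi" slide at the end; SciPy supplies only the critical value of t.

```python
import numpy as np
from scipy import stats

k = 2                                                  # number of regressors
X = np.column_stack([np.ones(n), x1, x2])
resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
s = np.sqrt(np.sum(resid ** 2) / (n - k - 1))          # standard error of the estimate

r12_sq = S12 ** 2 / (S11 * S22)                        # squared correlation of X1 and X2
se_b1 = s / np.sqrt(S11 * (1 - r12_sq))                # standard error of b1

t_stat = b1 / se_b1
t_crit = stats.t.ppf(0.975, df=n - k - 1)              # two-tailed test at alpha = .05
ci_b1 = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
```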

Dropping Regressors We may be tempted to throw out variables that are insignificant, but doing so can bias the remaining coefficients in the model. Such an omission of important variables is called omitted variable bias. If you have a strong theoretical reason to include a variable, then you should keep it in the model. One way to minimize such bias is to use randomized assignment of the treatment variables.
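
A small simulation, not from the text, that illustrates omitted variable bias: w2 is correlated with w1 and affects y, so a regression that drops w2 pushes the estimated coefficient on w1 away from its true value of 2.

```python
import numpy as np

rng = np.random.default_rng(1)
m = 5000
w2 = rng.normal(0, 1, size=m)
w1 = 0.8 * w2 + rng.normal(0, 1, size=m)           # w1 and w2 are correlated
yw = 1 + 2 * w1 + 3 * w2 + rng.normal(0, 1, size=m)

W_full = np.column_stack([np.ones(m), w1, w2])
W_omit = np.column_stack([np.ones(m), w1])          # w2 left out of the model

b_full = np.linalg.lstsq(W_full, yw, rcond=None)[0]
b_omit = np.linalg.lstsq(W_omit, yw, rcond=None)[0]
print(b_full[1], b_omit[1])   # ~2.0 with w2 included; noticeably larger when w2 is omitted
```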

Interpreting the Coefficients In the bivariate regression model, the slope (b) represents the change in Y that accompanies a one-unit change in X. In the multivariate regression model, each slope coefficient (bi) represents the change in Y that accompanies a one-unit change in the regressor (Xi) if all other regressors remain constant. This is like taking a partial derivative in calculus, which is why we refer to these as partial slope coefficients.

Partial Correlation Partial correlation calculates the correlation between Y and Xi with the other regressors held constant:

Partial r = t / √[t² + (n − k − 1)], where t = b/(se_b)
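
Continuing the simulated example, the partial correlation of Y and X1 recovered from the t-ratio, cross-checked against the definition (correlate the parts of Y and X1 that X2 does not explain):

```python
import numpy as np

partial_r1 = t_stat / np.sqrt(t_stat ** 2 + (n - k - 1))

# Cross-check: residualize Y and X1 on X2, then correlate the residuals.
res_y  = y  - np.polyval(np.polyfit(x2, y, 1), x2)
res_x1 = x1 - np.polyval(np.polyfit(x2, x1, 1), x2)
check = np.corrcoef(res_y, res_x1)[0, 1]             # should equal partial_r1
```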

Calculating Adjusted R² R² = SSR/SST. Problem: R² increases as k increases, so some people advocate the use of the adjusted R²:

R²A = [(n − 1)R² − k] / (n − k − 1)

We subtract k in the numerator as a “penalty” for increasing k (the number of regressors).
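
Continuing the simulated example (resid, n, and k come from the inference sketch), R² and the adjusted R² in the form given on the slide, with the more common equivalent form shown for comparison:

```python
import numpy as np

sst = np.sum((y - y.mean()) ** 2)          # total sum of squares
sse = np.sum(resid ** 2)                   # error (residual) sum of squares
r2 = 1 - sse / sst                         # equals SSR / SST

r2_adj = ((n - 1) * r2 - k) / (n - k - 1)              # slide's form
r2_adj_alt = 1 - (1 - r2) * (n - 1) / (n - k - 1)      # equivalent textbook form
```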

Stepwise Regression W&W discuss stepwise regression (pages …). This is an atheoretical procedure that selects variables on the basis of how much they increase R². Don’t use this technique: it is not theoretically driven, and R² is a very problematic statistic (as you will learn later).

Standard error of the estimate A better measure of model fit is the standard error of the estimate:

s = √{[Σ(Y − Yp)²] / (n − k − 1)}

This is the square root of the SSE divided by its degrees of freedom. A model with a smaller standard error of the estimate fits better. See Chris Achen’s Sage monograph on regression for a good discussion of this measure.
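
Continuing the simulated example, a small helper (the name is mine) that computes the standard error of the estimate for any design matrix; here it compares the two-regressor model with one that drops rainfall (X2), and the better-fitting model returns the smaller value.

```python
import numpy as np

def std_error_of_estimate(X, y):
    """sqrt(SSE / (n - k - 1)), where X includes a column of ones."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ b
    n_obs, p = X.shape                      # p = k + 1 (regressors plus intercept)
    return np.sqrt(np.sum(resid ** 2) / (n_obs - p))

X_both = np.column_stack([np.ones(n), x1, x2])
X_one  = np.column_stack([np.ones(n), x1])
print(std_error_of_estimate(X_both, y), std_error_of_estimate(X_one, y))
```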

Multicollinearity An additional assumption we must make in multiple regression is that none of the independent variables is perfectly correlated with another. In the simple multivariate model, Yp = b0 + b1X1 + b2X2 + e, this means r12 ≠ ±1.

Multicollinearity With perfect multicollinearity, you cannot estimate the partial slope coefficients. To see why this is so, rewrite the estimate for b1 in the model with two independent variables as:

b1 = [(ry1 − r12 ry2) / (1 − r12²)] × (sy / s1)

where ry1 = correlation between Y and X1, r12 = correlation between X1 and X2, ry2 = correlation between Y and X2, sy = standard deviation of Y, and s1 = standard deviation of X1.
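
Continuing the simulated example, b1 recovered from the simple correlations and standard deviations exactly as in the expression above; it matches the value obtained from the normal equations.

```python
import numpy as np

r_y1 = np.corrcoef(y, x1)[0, 1]
r_y2 = np.corrcoef(y, x2)[0, 1]
r_12 = np.corrcoef(x1, x2)[0, 1]
s_y = np.std(y, ddof=1)
s_1 = np.std(x1, ddof=1)

b1_from_r = ((r_y1 - r_12 * r_y2) / (1 - r_12 ** 2)) * (s_y / s_1)
```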

Multicollinearity We can see that if r12 = 1 or −1, you are dividing by zero, which is impossible. Often, if r12 is high but not equal to one, you will get a good overall model fit (high R², significant F-statistic) but insignificant t-ratios. You should always examine the correlations between your independent variables to determine whether this might be an issue.

Multicollinearity Multicollinearity does not bias our estimates, but it inflates the variance and thus the standard error of the parameters (i.e., it increases inefficiency). This is why we get insignificant t-ratios: t = b/(se_b), so as se_b is inflated the t-ratio shrinks, making it less likely that we will reject the null hypothesis.
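
An illustrative simulation, not from the text: as the correlation between two regressors rises, the OLS estimate of b1 stays centered on its true value (no bias) while its sampling spread, and hence its standard error, grows.

```python
import numpy as np

rng = np.random.default_rng(2)
n_obs, n_reps = 200, 500
for rho in (0.0, 0.9, 0.99):
    estimates = []
    for _ in range(n_reps):
        z = rng.normal(size=(n_obs, 2))
        v1 = z[:, 0]
        v2 = rho * z[:, 0] + np.sqrt(1 - rho ** 2) * z[:, 1]   # corr(v1, v2) ~ rho
        yv = 1 + 2 * v1 + 2 * v2 + rng.normal(size=n_obs)
        Xv = np.column_stack([np.ones(n_obs), v1, v2])
        estimates.append(np.linalg.lstsq(Xv, yv, rcond=None)[0][1])
    estimates = np.array(estimates)
    # Mean stays near the true value of 2; the spread (a stand-in for the
    # standard error) grows as rho approaches 1.
    print(rho, round(estimates.mean(), 2), round(estimates.std(), 3))
```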

Standard error for bi We can calculate the standard error for b1, for example, as:

se_b1 = s / √{[Σ(X1 − Mx1)²][1 − R1²]}

where R1 is the multiple correlation of X1 with all the other regressors. As R1 increases, the standard error increases. Note that for bivariate regression, the term [1 − R1²] drops out.