Chapter 7: Multiple Regression II Ayona Chatterjee Spring 2008 Math 4813/5813.


Extra Sums of Squares: Example

We have:
Y: amount of body fat
X1: triceps skinfold thickness
X2: thigh circumference
X3: midarm circumference

The study was conducted on 20 healthy females. Note that the predictor variables are easy to measure, while the response variable must be measured using a more complicated procedure.

Example Continued

Regression analyses were carried out using one subset of the predictors at a time, recording the regression and error sums of squares for each fit:

Predictors       SSR                  SSE
X1               SSR(X1)              SSE(X1)
X2               SSR(X2)              SSE(X2)
X1 & X2          SSR(X1, X2)          SSE(X1, X2)
X1, X2 & X3      SSR(X1, X2, X3)      SSE(X1, X2, X3)

Extra Sums of Squares

The difference in the error sums of squares when both X1 and X2 are used in the model, as opposed to only X1, is called an extra sum of squares and is denoted by SSR(X2|X1):

SSR(X2|X1) = SSE(X1) – SSE(X1, X2)

The extra sum of squares SSR(X2|X1) measures the marginal effect of adding X2 to the regression model when X1 is already in the model.

Extra Sums of Squares

Equivalently, we can define

SSR(X3|X1, X2) = SSE(X1, X2) – SSE(X1, X2, X3)

We can also consider the marginal effect of adding several variables at once, say X2 and X3, to a regression model already containing X1:

SSR(X2, X3|X1) = SSE(X1) – SSE(X1, X2, X3)
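As a minimal sketch, not taken from the slides, the following Python/numpy code shows how extra sums of squares such as SSR(X2|X1) and SSR(X2, X3|X1) could be computed from nested least-squares fits. The arrays x1, x2, x3, y are hypothetical stand-ins, not the actual body fat data.

import numpy as np

def sse(y, *predictors):
    # SSE from an OLS fit of y on an intercept plus the given predictor columns
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    return float(resid @ resid)

# hypothetical stand-in data (20 cases, loosely mimicking the body fat setup)
rng = np.random.default_rng(0)
x1 = rng.normal(25.0, 5.0, 20)                 # "triceps"-like predictor
x2 = 0.8 * x1 + rng.normal(0.0, 3.0, 20)       # "thigh"-like predictor, correlated with x1
x3 = rng.normal(28.0, 4.0, 20)                 # "midarm"-like predictor
y = 0.5 * x1 + 0.4 * x2 + rng.normal(0.0, 2.0, 20)

ssr_x2_given_x1 = sse(y, x1) - sse(y, x1, x2)          # SSR(X2 | X1)
ssr_x2x3_given_x1 = sse(y, x1) - sse(y, x1, x2, x3)    # SSR(X2, X3 | X1)
print(ssr_x2_given_x1, ssr_x2x3_given_x1)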

The Basic Idea

An extra sum of squares measures the marginal reduction in the error sum of squares when one or several predictors are added to the regression model, given that other predictors are already in the model. Equivalently, it measures the marginal increase in the regression sum of squares when those predictors are added to the model.

Decomposition of SSR into Extra Sums of Squares

Various decompositions are possible in multiple regression analysis. Suppose we have two predictor variables:

SSTO = SSR(X1) + SSE(X1)

Replacing SSE(X1) by SSR(X2|X1) + SSE(X1, X2) gives

SSTO = SSR(X1) + SSR(X2|X1) + SSE(X1, X2)

or, more simply,

SSTO = SSR(X1, X2) + SSE(X1, X2)

ANOVA Table Containing Decomposition of SSR

ANOVA table with the decomposition of SSR for three predictor variables:

Source of Variation    SS                   df       MS
Regression             SSR(X1, X2, X3)      3        MSR(X1, X2, X3)
  X1                   SSR(X1)              1        MSR(X1)
  X2|X1                SSR(X2|X1)           1        MSR(X2|X1)
  X3|X1, X2            SSR(X3|X1, X2)       1        MSR(X3|X1, X2)
Error                  SSE(X1, X2, X3)      n – 4    MSE(X1, X2, X3)
Total                  SSTO                 n – 1
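As a hedged, self-contained illustration of this table (with made-up data, not the body fat study), the sketch below builds the sequential pieces SSR(X1), SSR(X2|X1), SSR(X3|X1, X2) and SSE(X1, X2, X3) and checks that they add back up to SSTO.

import numpy as np

def sse(y, *xs):
    # SSE from an OLS fit of y on an intercept plus the given predictors
    X = np.column_stack([np.ones(len(y))] + list(xs))
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    r = y - X @ b
    return float(r @ r)

rng = np.random.default_rng(1)
n = 20
x1, x2, x3 = rng.normal(size=(3, n))
y = 1 + 2 * x1 + 0.5 * x2 - 0.3 * x3 + rng.normal(scale=0.5, size=n)

ssto = sse(y)                                        # intercept-only SSE equals SSTO
ssr_x1 = ssto - sse(y, x1)                           # SSR(X1)
ssr_x2_given_x1 = sse(y, x1) - sse(y, x1, x2)        # SSR(X2|X1)
ssr_x3_given_12 = sse(y, x1, x2) - sse(y, x1, x2, x3)  # SSR(X3|X1, X2)
sse_full = sse(y, x1, x2, x3)                        # SSE(X1, X2, X3), df = n - 4

# the pieces reproduce the table: they sum back to SSTO
print(ssr_x1 + ssr_x2_given_x1 + ssr_x3_given_12 + sse_full, ssto)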

Tests for Regression Coefficients

Using the extra sums of squares, we can test whether a single β_k = 0, that is, whether the predictor X_k can be dropped from the regression model. We already know how to test this with the usual t statistic for β_k.

Using Extra Sums of Squares

Consider a first-order regression model with three predictors. We fit the full model and obtain its error sum of squares, written SSE(X1, X2, X3) = SSE(F); here the degrees of freedom are df_F = n – 4.

Using Extra Sums of Squares

We next fit a reduced model without X3 as a predictor and obtain SSE(R) = SSE(X1, X2) with df_R = n – 3 degrees of freedom. We define the test statistic

F* = [ (SSE(R) – SSE(F)) / (df_R – df_F) ] / [ SSE(F) / df_F ]

which is compared with the F(1 – α; df_R – df_F, df_F) distribution; large values of F* lead to rejecting H0: β3 = 0.
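The following is a minimal sketch of this general linear test in Python, under the assumption that numpy and scipy are available; the data are hypothetical, with X3 deliberately generated as an irrelevant predictor.

import numpy as np
from scipy import stats

def sse_df(y, *xs):
    # SSE and its degrees of freedom from an OLS fit with an intercept
    X = np.column_stack([np.ones(len(y))] + list(xs))
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    r = y - X @ b
    return float(r @ r), len(y) - X.shape[1]

rng = np.random.default_rng(2)
n = 20
x1, x2, x3 = rng.normal(size=(3, n))
y = 4 + 1.5 * x1 + 0.8 * x2 + rng.normal(scale=1.0, size=n)   # X3 truly irrelevant here

sse_f, df_f = sse_df(y, x1, x2, x3)   # full model, df = n - 4
sse_r, df_r = sse_df(y, x1, x2)       # reduced model without X3, df = n - 3

f_star = ((sse_r - sse_f) / (df_r - df_f)) / (sse_f / df_f)
p_value = stats.f.sf(f_star, df_r - df_f, df_f)   # compare with F(1 - alpha; 1, n - 4)
print(f_star, p_value)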

Test Whether Several β_k = 0

In multiple regression analysis we may also be interested in whether several terms can be dropped from the regression model at once. The same general linear test applies: fit the full and reduced models and compare SSE(F) and SSE(R) through the F* statistic above.

Example: Body Fat

For the body fat example with all three predictors, we wish to test whether thigh circumference (X2) and midarm circumference (X3) can be dropped from the full regression model. Use α = 0.05.

Coefficients of Partial Determination

A coefficient of partial determination measures the marginal contribution of one X variable when all the others are already included in the model. For a two-predictor model, the coefficient of partial determination between Y and X1, given X2, is

R²_{Y1|2} = [ SSE(X2) – SSE(X1, X2) ] / SSE(X2) = SSR(X1|X2) / SSE(X2)

which measures the proportionate reduction in the variation in Y remaining after X2 is included in the model that is gained by also including X1 in the model.
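A minimal sketch of this quantity, assuming hypothetical data rather than the body fat study, is shown below; the partial correlation is taken here simply as the positive square root.

import numpy as np

def sse(y, *xs):
    # SSE from an OLS fit of y on an intercept plus the given predictors
    X = np.column_stack([np.ones(len(y))] + list(xs))
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    r = y - X @ b
    return float(r @ r)

rng = np.random.default_rng(3)
n = 20
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(scale=0.8, size=n)
y = 2 + 1.0 * x1 + 0.5 * x2 + rng.normal(scale=0.7, size=n)

r2_y1_given_2 = (sse(y, x2) - sse(y, x1, x2)) / sse(y, x2)   # R^2_{Y1|2}
partial_r = np.sqrt(r2_y1_given_2)                           # coefficient of partial correlation (magnitude)
print(r2_y1_given_2, partial_r)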

Formulae

A coefficient of partial determination can take any value between 0 and 1. For example, with three predictors, R²_{Y1|23} = SSR(X1|X2, X3) / SSE(X2, X3). The square root of a coefficient of partial determination is called a coefficient of partial correlation and is denoted by r.

Round-off Errors in Normal Equation Calculations

When a large number of predictor variables is present in the model, serious round-off errors can arise despite the use of many digits in intermediate calculations. One of the main sources of these errors is the calculation of the inverse (X'X)^(-1). The main problems occur when the determinant of X'X is close to zero or when the entries of X'X differ greatly in magnitude. Solution: transform all the entries so that they lie between –1 and 1.
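As a hedged illustration of this point (not from the slides), the sketch below uses the condition number of X'X as one common diagnostic: with hypothetical predictors on wildly different scales the matrix is severely ill-conditioned, and rescaling the predictors to roughly the (–1, 1) range makes it well behaved.

import numpy as np

rng = np.random.default_rng(4)
n = 20
x1 = rng.normal(5000.0, 100.0, n)       # predictor on a very large scale
x2 = rng.normal(0.002, 0.0005, n)       # predictor on a very small scale
X = np.column_stack([np.ones(n), x1, x2])

print(np.linalg.cond(X.T @ X))          # huge condition number: round-off is a real risk

# rescale each predictor via the correlation transformation
z1 = (x1 - x1.mean()) / (np.sqrt(n - 1) * x1.std(ddof=1))
z2 = (x2 - x2.mean()) / (np.sqrt(n - 1) * x2.std(ddof=1))
Z = np.column_stack([z1, z2])
print(np.linalg.cond(Z.T @ Z))          # much smaller: Z'Z is the correlation matrix of X1, X2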

Correlation Transformation

The correlation transformation involves standardizing the variables. We define:

Y_i* = (1 / sqrt(n – 1)) · (Y_i – Ȳ) / s_Y
X_ik* = (1 / sqrt(n – 1)) · (X_ik – X̄_k) / s_k,   k = 1, …, p – 1

where s_Y and s_k are the sample standard deviations of Y and X_k.

Standardized Regression Model

The regression model fitted to the variables transformed by the correlation transformation is called the standardized regression model:

Y_i* = β1* X_i1* + … + β_{p–1}* X_{i,p–1}* + ε_i*

Note that the standardized model has no intercept term, and the original coefficients can be recovered via β_k = (s_Y / s_k) β_k*.
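The sketch below, using hypothetical data, applies the correlation transformation, fits the standardized model without an intercept, and checks that transforming back with b_k = (s_Y / s_k) b_k* reproduces the coefficient from a direct fit on the original scale.

import numpy as np

def standardize(v):
    # correlation transformation of a single variable
    return (v - v.mean()) / (np.sqrt(len(v) - 1) * v.std(ddof=1))

rng = np.random.default_rng(5)
n = 20
x1 = rng.normal(30.0, 5.0, n)
x2 = rng.normal(50.0, 8.0, n)
y = 10 + 0.8 * x1 - 0.3 * x2 + rng.normal(scale=2.0, size=n)

ys, x1s, x2s = standardize(y), standardize(x1), standardize(x2)

# standardized model has no intercept: Y* = b1* X1* + b2* X2* + error
Z = np.column_stack([x1s, x2s])
b_star = np.linalg.lstsq(Z, ys, rcond=None)[0]

# transform back to the original scale and compare with a direct fit
b1 = (y.std(ddof=1) / x1.std(ddof=1)) * b_star[0]
X = np.column_stack([np.ones(n), x1, x2])
b_direct = np.linalg.lstsq(X, y, rcond=None)[0]
print(b1, b_direct[1])   # the two estimates of the X1 coefficient should agree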

Example for Calculations

Suppose the Y values are 174.4, 164.4, …. Compute the sample mean Ȳ and standard deviation s_Y; each standardized value is then Y_i* = (1 / sqrt(n – 1)) · (Y_i – Ȳ) / s_Y. Similarly, suppose the values for the first predictor are 68.5, 45.2, …. Compute X̄_1 and s_1; the first standardized value is then X_11* = (1 / sqrt(n – 1)) · (68.5 – X̄_1) / s_1.

Multicollinearity and Its Effects

Important questions in multiple regression:
– What is the relative importance of the effects of the different predictor variables?
– What is the magnitude of the effect of a given predictor on the response?
– Can we drop a predictor from the model?
– Should we consider other predictors for inclusion in the model?

When the predictor variables are correlated among themselves, intercorrelation or multicollinearity is said to exist.

Uncorrelated Predictor Variables

If the predictor variables are uncorrelated, the effects ascribed to them by the first-order regression model are the same no matter which other of these predictor variables are included in the model. In addition, the marginal contribution of one predictor variable in reducing the error sum of squares when the other predictor variables are in the model is exactly the same as when that predictor variable is in the model alone.

Problem when Predictor Variables Are Perfectly Correlated

Let us consider a small example, with observations (i, X_i1, X_i2, Y_i), to study the problems associated with multicollinearity. When X2 is an exact linear function of X1, infinitely many fitted response functions describe the data equally well, so the individual regression coefficients cannot be determined uniquely.

Effects of Multicollinearity

In real life we rarely have perfectly correlated predictors. We can still obtain a regression equation and make inferences in the presence of multicollinearity; however, interpreting the regression coefficients in the usual way may be incorrect. When the predictor variables are correlated, the regression coefficient of any one variable depends on which other predictor variables are included in the model and which ones are left out.

Effects on Extra Sums of Squares

Suppose X1 and X2 are highly correlated. Then, when X2 is already in the model, the marginal contribution of X1 in reducing the error sum of squares is comparatively small, because X2 already contains most of the information present in X1. Multicollinearity also affects the coefficients of partial determination.
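As a hedged demonstration of these two effects, the sketch below uses hypothetical data in which X2 is nearly an exact copy of X1: the X1 coefficient changes sharply depending on whether X2 is in the model, and SSR(X1|X2) is much smaller than SSR(X1).

import numpy as np

def fit(y, *xs):
    # OLS coefficients (intercept first) and SSE
    X = np.column_stack([np.ones(len(y))] + list(xs))
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    r = y - X @ b
    return b, float(r @ r)

rng = np.random.default_rng(6)
n = 20
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)      # nearly perfectly correlated with x1
y = 3 + 2 * x1 + 2 * x2 + rng.normal(scale=0.5, size=n)

b_alone, sse_x1 = fit(y, x1)
b_both, sse_x1x2 = fit(y, x1, x2)
print(b_alone[1], b_both[1])                  # X1 coefficient shifts markedly

_, sse_x2 = fit(y, x2)
_, sse_none = fit(y)                          # intercept-only SSE = SSTO
print(sse_none - sse_x1,                      # SSR(X1): large
      sse_x2 - sse_x1x2)                      # SSR(X1|X2): comparatively small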

Effects of Multicollinearity

If the predictors are correlated, the estimated regression coefficients may vary greatly depending on which variables are included in the model. More powerful tools are needed for identifying the existence of serious multicollinearity.