1 Psych 5510/6510 Chapter Eight--Multiple Regression: Models with Multiple Continuous Predictors Part 2: Testing the Addition of One Parameter at a Time.


1 Psych 5510/6510 Chapter Eight--Multiple Regression: Models with Multiple Continuous Predictors Part 2: Testing the Addition of One Parameter at a Time Spring, 2009

2 Overall Test
In Part 1 we looked at the overall test of the parameters in Model A:
Model C: Ŷi = β0 (where β0 = μY), PC = 1
Model A: Ŷi = β0 + β1Xi1 + β2Xi2 + … + βp-1Xi,p-1, PA = p

3 Disadvantages
The disadvantages of this overall test are:
1. If some of the parameters in A are worthwhile and some are not, the PRE per parameter added may not be very impressive, with the weaker parameters washing out the effects of the stronger.
2. As with the overall F test in ANOVA, our alternative hypothesis is very vague: at least one of β1 through βp-1 doesn't equal 0. If Model A is worthwhile overall, we don't know which of its individual parameters contributed to that worthwhileness.

4 One parameter test
It is usually more interesting to test adding one parameter at a time (PA − PC = 1) to our model.
Model C: Ŷi = β0 + β1Xi1 + β2Xi2 + … + βp-1Xi,p-1
Model A: Ŷi = β0 + β1Xi1 + β2Xi2 + … + βp-1Xi,p-1 + βpXip
H0: βp = 0
HA: βp ≠ 0
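Conceptually, this one-parameter test is just a model comparison. Below is a minimal Python sketch (not part of the original slides; the function name and example numbers are hypothetical) of how PRE and F* for such a comparison can be computed from the two models' error sums of squares:

```python
from scipy import stats

def model_comparison(sse_c, sse_a, pc, pa, n):
    """Compare a compact Model C to an augmented Model A.

    sse_c, sse_a : error sums of squares for Models C and A
    pc, pa       : number of parameters in Models C and A
    n            : number of observations
    Returns PRE, F*, and the p-value.
    """
    pre = (sse_c - sse_a) / sse_c                 # proportional reduction in error
    df_num = pa - pc                              # parameters added (1 in this chapter)
    df_den = n - pa                               # error df for Model A
    f_star = (pre / df_num) / ((1 - pre) / df_den)
    p = stats.f.sf(f_star, df_num, df_den)        # upper-tail probability
    return pre, f_star, p

# Hypothetical usage: adding one predictor (PA - PC = 1) with n = 100
print(model_comparison(sse_c=250.0, sse_a=210.0, pc=3, pa=4, n=100))
```

When only one parameter is added, F* equals the square of the t statistic that SPSS reports for that coefficient.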

5
Model C: Ŷi = β0 + β1Xi1 + β2Xi2 + … + βp-1Xi,p-1
Model A: Ŷi = β0 + β1Xi1 + β2Xi2 + … + βp-1Xi,p-1 + βpXip
The values of β1 through βp-1 will probably change when βpXip is added to the model (as we will see, this is due to redundancy among the predictor variables). Remember our subscripting, which is useful when the situation is not clear from the context: β4.123 is the value of β4 when variables 1, 2, and 3 are also included in the model.

6 Redundancy
When we use more than one predictor variable in our model, an important issue arises: to what degree are the predictor variables redundant (i.e., to what degree do they share information)? For example, using both a child's age and a child's height to predict their weight is somewhat redundant, as height and age are themselves related. Please review the Venn diagrams on redundancy from Part 1.

7 Redundancy Thus two or more predictor variables are redundant to the degree to which they are correlated. Let’s say we are going to add another predictor variable X p to the model below: Model C: Ŷ i = β 0 + β 1 X i1 +β 2 X i2 + β 3 X i3 and we want to know how redundant X p may be with the X variables that are already in the model. Well, we know how to determine that...

8 Measuring the Redundancy of Xp with X1, X2, and X3
Yes indeed, we know how to measure the relationship between Xp and X1, X2, and X3: regress Xp on them. The R² (i.e. PRE) of moving from Model C (predicting Xp from just its mean) to Model A (predicting Xp from X1, X2, and X3) is the measure of the redundancy between Xp and X1, X2, and X3.

9 Redundancy
We can measure the redundancy between the variable we are going to add (Xp) and the variables already in the model (X1 through Xp-1) by seeing how well those already-included variables can predict the value of Xp. To do this we regress Xp on variables X1 through Xp-1 and look at the resulting PRE. The PRE for regressing Xp on variables X1 through Xp-1 is symbolized as R²p, which is a shorter version of the full symbol, R²p.123…p-1.

10 Tolerance
Conversely, tolerance is a measure of how unique a variable is compared to the other predictor variables already in the model: the tolerance of Xp is 1 − R²p. If tolerance is low, the variable is largely redundant and can add little to the model; if tolerance is high, the variable is not very redundant, and thus has the ability to add significantly to the model (if it is correlated with Y, of course). (For a pictorial representation of these ideas see the handout on 'Tolerance'.)

11 Confidence Intervals
The confidence interval for a partial regression coefficient βp is built from the standard error of bp, and the tolerance of Xp sits in the denominator of that standard error. One standard way to write it:
CI for βp: bp ± t(α/2; n−PA) · sqrt( MSE_A / [ (1 − R²p) · Σ(Xip − X̄p)² ] )
where MSE_A is the mean square error of Model A and (1 − R²p) is the tolerance of Xp.

12 Low Tolerance
The formula for the confidence interval of β includes tolerance in its denominator (look back at that formula). If tolerance is low, the confidence interval of β is wide (and thus rejecting β = 0 becomes unlikely). If tolerance is very low, the confidence interval for β becomes huge, meaning that we become increasingly unable to determine the true value of β, and the accuracy of some computations begins to drop. Because of this, when tolerance is below .01 (or .001) some statistical programs issue a warning message.

13 Variance Inflation Factor
Because a low tolerance makes the confidence interval wider, some programs report the variance inflation factor (VIF), which is the inverse of the tolerance: VIF = 1 / (1 − R²p).
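To see how tolerance and VIF can be computed in practice, here is a minimal Python sketch (made-up predictor data and function names; the course itself reads these values from SPSS output). Each predictor is regressed on the remaining predictors, and its R² is converted into tolerance (1 − R²) and VIF (1/tolerance):

```python
import numpy as np

def tolerance_and_vif(X):
    """For each column of X (one predictor per column), regress it on the
    other columns and return its tolerance (1 - R^2) and VIF (1 / tolerance)."""
    n, k = X.shape
    results = {}
    for j in range(k):
        y = X[:, j]                                # predictor being checked
        others = np.delete(X, j, axis=1)           # remaining predictors
        A = np.column_stack([np.ones(n), others])  # add intercept
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1 - resid.var() / y.var()             # R^2 of predictor j on the others
        tol = 1 - r2
        results[j] = (tol, 1 / tol)
    return results

# Hypothetical data: three predictors, two of them partly redundant
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 0.6 * x1 + rng.normal(size=200)               # partly redundant with x1
x3 = rng.normal(size=200)
X = np.column_stack([x1, x2, x3])
for j, (tol, vif) in tolerance_and_vif(X).items():
    print(f"predictor {j}: tolerance = {tol:.3f}, VIF = {vif:.3f}")
```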

14 Back to the One Parameter Test
We are looking at the PRE of adding one new predictor variable to our model:
Model C: Ŷi = β0 + β1Xi1 + β2Xi2 + … + βp-1Xi,p-1
Model A: Ŷi = β0 + β1Xi1 + β2Xi2 + … + βp-1Xi,p-1 + βpXip
H0: η² = 0, HA: η² > 0, or equivalently, H0: βp = 0, HA: βp ≠ 0

15 Statistical Significance
SPSS makes this easy: simply regress Y on the variables of Model A. For each β in the model SPSS provides its confidence interval, and the values of 't' and 'p' for the test of whether that β = 0. Not only do we get the information needed to decide whether it is worthwhile to add Xp to a model that contains the other variables, we get the same information about adding each variable last to a model that contains the other variables…

16 Significance (cont.)
…for each variable SPSS gives us the PRE for adding that variable to a model containing all of the other variables, and tells us whether or not the β that goes with the variable differs from zero. So in addition to testing βp, similar information is provided for β1 (see below) and all the other β's:
Model C: Ŷi = β0 + β2Xi2 + … + βp-1Xi,p-1 + βpXip
Model A: Ŷi = β0 + β2Xi2 + … + βp-1Xi,p-1 + βpXip + β1Xi1
H0: β1 = 0
HA: β1 ≠ 0
And so on for each β.

17 Coefficient of Partial Determination The PRE from adding a new predictor variable to a model that already contains predictor variables is called the ‘coefficient of partial determination’. It is symbolized as r² Yp.123…p-1 (the PRE of adding variable X p to the model of Y when variables X 1 - X p-1 are already included). See the handout on ‘Partial Correlations’.

18 Partial Correlation Coefficient The square root of the coefficient of partial determination is called the ‘partial correlation coefficient’. It is symbolized as r Yp.123…p-1 It represents the correlation between Y and X p when the influences of the other predictor variables have been removed from both Y and X p.

19 More Descriptions of ‘Partial Correlation Coefficient’ It is the correlation between Y and X p when the other predictor variables are ‘held constant’. It is the correlation between Y and X p for people who have identical scores on the other predictor variables.

20 Part Correlations Another correlation sometimes examined (but not in our approach) is called the ‘part’ or ‘semipartial’ correlation. In this correlation the influence of the other predictor variables (X 1 -X p-1 ) is only removed from X p, rather than from both X p and Y (see the handout on Partial Correlations).
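To make the partial/semipartial distinction concrete, here is a small Python sketch (hypothetical data and variable names, not the course data set) that computes both correlations from residuals:

```python
import numpy as np

def residualize(v, others):
    """Return the residuals from regressing v on `others` (with an intercept)."""
    A = np.column_stack([np.ones(len(v)), others])
    coef, *_ = np.linalg.lstsq(A, v, rcond=None)
    return v - A @ coef

rng = np.random.default_rng(1)
x1 = rng.normal(size=300)
x2 = rng.normal(size=300)
xp = 0.5 * x1 + rng.normal(size=300)          # new predictor, partly redundant with x1
y = x1 + 0.4 * x2 + 0.3 * xp + rng.normal(size=300)

others = np.column_stack([x1, x2])
xp_res = residualize(xp, others)              # Xp with X1, X2 removed
y_res = residualize(y, others)                # Y with X1, X2 removed

partial = np.corrcoef(y_res, xp_res)[0, 1]    # others removed from BOTH Y and Xp
semipartial = np.corrcoef(y, xp_res)[0, 1]    # others removed from Xp only
print(f"partial r = {partial:.3f}, semipartial (part) r = {semipartial:.3f}")
```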

21 Partial This and Partial That
We have three 'partial' terms:
Partial regression coefficient: the value of β (or equivalently est. β = b) that goes with a particular predictor variable.
Partial correlation coefficient: the correlation between Y and a particular predictor variable after the influence of the other predictor variables has been removed from both Y and that variable.
Partial coefficient of determination: the PRE of adding a particular predictor variable to a model that already contains the other predictor variables. It is the (partial correlation coefficient)².
Now let's see how the terms connect.

22 Back to Our Example
Dependent Variable: GPA
Predictor Variables: 1. HS_Rank, 2. SAT_V, 3. SAT_M
Let's look at the various 'partial' values that go with the predictor variable SAT_M.

23 The 'Partial' Plot
1. Use the other predictor variables (HS rank and SAT_V) to predict Yi.
2. Compute the error of those predictions (i.e., create a variable consisting of Yi − Ŷi). This is a variable of residuals (showing how much the actual Y scores vary from what HS rank and SAT_V can predict). Name this variable Yresiduals.

24 The 'Partial' Plot (cont.)
3. Use the other predictor variables (HS rank and SAT_V) to predict SAT_M.
4. Compute the error of those predictions (i.e., create a variable consisting of SAT_Mi actual − SAT_Mi predicted). This is a variable of residuals (showing how much the actual SAT_M scores vary from what HS rank and SAT_V can predict). Name this variable SAT_Mresiduals.

25 The 'Partial' Plot (cont.)
5. Now graph the scatter plot of Yresiduals against SAT_Mresiduals. This shows the relationship between Y and SAT_M after the influence of the other predictor variables has been removed (from both of them), which is equivalent to the relationship between Y and SAT_M when the values of the other variables are held constant.

26 The 'Partial' Plot (cont.) [Scatter plot of Yresiduals against SAT_Mresiduals, with the regression line through the residual points.]

27 The 'Partial' Plot (cont.)
The partial regression coefficient is the slope of that regression line. The partial correlation coefficient is the correlation shown in the plot (the correlation between the Yresiduals and the SAT_Mresiduals). The partial coefficient of determination is the r² of that correlation (how much we gain by using the regression line rather than the mean of the Yresiduals scores to predict the Yresiduals scores). Note that the mean of the Yresiduals scores = 0.
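Here is a rough Python sketch of constructing this 'partial' (residual-on-residual) plot, using made-up stand-ins for GPA, HS_Rank, SAT_V, and SAT_M rather than the actual course data. The slope of the fitted line corresponds to the partial regression coefficient, the correlation of the two residual variables to the partial correlation, and its square to the partial coefficient of determination:

```python
import numpy as np
import matplotlib.pyplot as plt

def residualize(v, others):
    """Residuals from regressing v on `others` (intercept included)."""
    A = np.column_stack([np.ones(len(v)), others])
    coef, *_ = np.linalg.lstsq(A, v, rcond=None)
    return v - A @ coef

# Hypothetical stand-ins for the course data set
rng = np.random.default_rng(2)
hs_rank = rng.uniform(0, 100, 400)
sat_v = rng.normal(500, 100, 400)
sat_m = rng.normal(500, 100, 400)
gpa = 0.02 * hs_rank + 0.002 * sat_v + 0.003 * sat_m + rng.normal(0, 0.5, 400)

others = np.column_stack([hs_rank, sat_v])
y_res = residualize(gpa, others)           # GPA with HS_Rank and SAT_V removed
x_res = residualize(sat_m, others)         # SAT_M with HS_Rank and SAT_V removed

slope, intercept = np.polyfit(x_res, y_res, 1)   # slope = partial regression coefficient
partial_r = np.corrcoef(x_res, y_res)[0, 1]      # partial correlation
print(f"slope = {slope:.4f}, partial r = {partial_r:.3f}, partial r^2 = {partial_r**2:.3f}")

plt.scatter(x_res, y_res, s=10)
plt.plot(np.sort(x_res), intercept + slope * np.sort(x_res))
plt.xlabel("SAT_M residuals")
plt.ylabel("GPA residuals")
plt.show()
```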

28 Back to Our Example (again)
Dependent Variable: GPA
Predictor Variables: 1. HS_Rank, 2. SAT_V, 3. SAT_M
See SPSS printout.

29 Test of Worthwhileness of Overall Model
Y = GPA
Model C: Ŷi = β0 (where β0 is μY)
Model A: Ŷi = β0 + β1(HSRanki) + β2(SAT_Vi) + β3(SAT_Mi)
PRE = .220, F* = …, p < .001, est. η² = .214

30 A Look at Each Predictor Variable We will now examine each predictor variable individually, looking at the analysis of adding each variable last to a model that already contains the other predictor variables.

31 HSRank: Analysis
Model C: Ŷi = β0 + β2(SAT_Vi) + β3(SAT_Mi)
Model A: Ŷi = β0 + β2(SAT_Vi) + β3(SAT_Mi) + β1(HSRanki)
From SPSS: Ŷi = … + .027(HSRanki) + .011(SAT_Vi) + .022(SAT_Mi)
1. b1 = .027; test to determine whether β1 ≠ 0: t = 8.3, p < .001.
2. Partial correlation between HS rank and GPA (i.e. the correlation between those two variables when the other predictor variables are held constant; that is, the influences of the other predictor variables have been removed from HS rank and GPA): 0.38.
3. PRE of adding HS rank to the model (i.e. moving from Model C to Model A): 0.38² = 0.14 (p < .001, same as from part '1' above). Extra parameter of Model A worthwhile.

32 HSRank: Residual Plot
The relationship between HSRank and GPA with the other variables held constant. The slope of the regression line is .027 (i.e. b1); the correlation between HSRank and GPA in this plot is .38 (i.e. the partial correlation); the PRE of using HSRank to predict GPA is .38² = 0.14.

33 SAT_V
Model C: Ŷi = β0 + β1(HSRanki) + β3(SAT_Mi)
Model A: Ŷi = β0 + β1(HSRanki) + β2(SAT_Vi) + β3(SAT_Mi)
As before: Ŷi = … + .027(HSRanki) + .011(SAT_Vi) + .022(SAT_Mi)
1. b2 = .011; test to determine whether β2 ≠ 0: t = 2.5, p = .011.
2. Partial correlation between SAT_V and GPA (i.e. the correlation between those two variables when the other predictor variables are held constant; that is, the influences of the other predictor variables have been removed from SAT_V and GPA): .126.
3. PRE of adding SAT_V to the model (i.e. moving from Model C to Model A): .126² = 0.016 (p = .011, same as from part '1' above). Extra parameter of Model A worthwhile.

34 SAT_V: Residual Plot
The relationship between SAT_V and GPA with the other variables held constant. The slope of the regression line is .011 (i.e. b2); the correlation between SAT_V and GPA in this plot is .126 (i.e. the partial correlation); the PRE of using SAT_V to predict GPA is .126² = 0.016.

35 SAT_M
Model C: Ŷi = β0 + β1(HSRanki) + β2(SAT_Vi)
Model A: Ŷi = β0 + β1(HSRanki) + β2(SAT_Vi) + β3(SAT_Mi)
As before: Ŷi = … + .027(HSRanki) + .011(SAT_Vi) + .022(SAT_Mi)
1. b3 = .022; test to determine whether β3 ≠ 0: t = 4.5, p < .001.
2. Partial correlation between SAT_M and GPA (i.e. the correlation between those two variables when the other predictor variables are held constant; that is, the influences of the other predictor variables have been removed from SAT_M and GPA): .216.
3. PRE of adding SAT_M to the model (i.e. moving from Model C to Model A): .216² = 0.047 (p < .001, same as from part '1' above). Extra parameter of Model A worthwhile.

36 SAT_M: Residual Plot
The relationship between SAT_M and GPA with the other variables held constant. The slope of the regression line is .022 (i.e. b3); the correlation between SAT_M and GPA in this plot is .216 (i.e. the partial correlation); the PRE of using SAT_M to predict GPA is .216² = 0.047.

37 Tolerances
HSRank = .995, SAT_V = .893, SAT_M = .890
The values of the tolerances show that the predictor variables were not very redundant, leaving each with the opportunity to add significantly to the model if its correlation with Y is high.

38 Another Example We are interested in the relationship between unemployment (UN) and industrial production (IP). We expect there to be a negative correlation between the two (the higher the industrial production that year the lower the unemployment, and vice versa).

39 Data
[Data table by year: Year, UN (millions), IP, Year Code.]

40 MODELS
Y = UN, X = IP
MODEL C: Ŷi = β0 = 2.82
MODEL A: Ŷi = β0 + β1Xi = … + …(Xi)
PRE = .098, p = .379
Not only do we not reject H0, but the slope was unexpectedly a positive value!

41 Scatter Plot UN and IP

42 Bringing 'Year' into the Model
Let's take a look at the relationship between unemployment and year (using the 'year codes' of 1 through 10).
Y = UN, X = Year
MODEL C: Ŷi = β0 = 2.82
MODEL A: Ŷi = β0 + β1Xi = … + …(Xi)
PRE = .428, p = .04
We reject H0; it is worthwhile to add year to the model (compared to using just the mean).

43 Scatter Plot UN and Year

44 Year and IP
Let's see year's ability to predict industrial production (IP).
Y = IP, X = Year
MODEL C: Ŷi = β0 = 138
MODEL A: Ŷi = β0 + β1Xi = … + …(Xi)
PRE = .821, p < .001
We reject H0; year is also good for predicting industrial production.

45 Year as a Suppressor Variable Perhaps the variable ‘year’ is having a large effect on both unemployment (UN) and industrial production (IP), and is thus masking the relationship between UN and IP. If this is true then year would be called a suppressor variable.

46 Residuals Let’s take a look at the relationship between unemployment and industrial production when the effects of year are removed from both variables.

47 Residuals from Using Year to Predict UN and IP
[Table by Year Code: UN residuals and IP residuals.]

48 Scatterplot of Residuals

49 UN and IP Residuals
Partial correlation coefficient: …; PRE = .77, p = .002
So there is a negative correlation between UN and IP once the effect of year has been taken out of both UN and IP.

50 SPSS Output We don’t need to actually compute the residuals of using year to predict unemployment, and then using year to predict industrial production, to find the relationship between unemployment and industrial production after the effect of year on both variables has been incorporated into the model. SPSS gives us all that in its computation of the partial regression coefficient and the partial correlation coefficient. See the handout from the course web site.

51 What We Are Doing
MODEL C: Ŷi = β0 + β1Yeari
MODEL A: Ŷi = β0 + β1Yeari + β2IPi
There are a couple of ways of thinking about what we are doing in this example:
1. We are testing to see whether adding IP is worthwhile for a model that already contains year.
2. We are examining the relationship between IP and UN when the effect of Year is held constant.
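In code, this is simply a comparison of the two models. A rough Python sketch using statsmodels, with hypothetical Year, IP, and UN values standing in for the data set on the slides:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical stand-in data: Year drives both IP and UN, while IP and UN
# are negatively related once Year is taken into account (a suppressor pattern).
rng = np.random.default_rng(3)
year = np.arange(1, 11)
ip = 120 + 4 * year + rng.normal(0, 2, 10)
un = 2.0 + 0.10 * year - 0.05 * (ip - ip.mean()) + rng.normal(0, 0.1, 10)
df = pd.DataFrame({"Year": year, "IP": ip, "UN": un})

model_c = smf.ols("UN ~ Year", data=df).fit()          # Model C: Year only
model_a = smf.ols("UN ~ Year + IP", data=df).fit()     # Model A: Year plus IP

# PRE of adding IP to a model that already contains Year
pre = (model_c.ssr - model_a.ssr) / model_c.ssr
print(f"PRE = {pre:.3f}")
print(model_a.summary().tables[1])   # b, t, and p for each predictor added last
```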

52 Review of Terms
MODEL C: Ŷi = β0 + β1Yeari
MODEL A: Ŷi = β0 + β1Yeari + β2IPi
The β's are partial regression coefficients; their values will be influenced both by the relationship between the predictor variable and Y and by the other predictor variables in the model. The relationship between a predictor variable and Y when the effects of the other predictor variables on both are controlled (held constant) is called the partial correlation coefficient. Squaring the partial correlation coefficient gives you the coefficient of partial determination, which is the PRE of adding that predictor variable to a model that already contains the other variables.