Elements of Multiple Regression Analysis: Two Independent Variables Yong Sept. 2010.

Why Multiple Regression?  In the real world, it is rare to use only one predictor (IV) to explain or predict an outcome variable (DV). Usually we need several IVs.  Multiple regression (Pearson, 1908) investigates the relationship between several independent (predictor) variables and a dependent (criterion) variable.

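The prediction equation in multiple regression  $Y' = a + b_1 X_1 + \dots + b_k X_k$, where Y' = predicted Y score, a = intercept, $b_1, \dots, b_k$ = regression coefficients, and $X_1, \dots, X_k$ = scores on the IVs.  With two IVs: $Y' = a + b_1 X_1 + b_2 X_2$.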
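As a minimal sketch, the prediction equation (intercept plus the sum of each coefficient times its IV score) can be written as a small Python function. The function name `predict` and the numbers in the usage lines are illustrative assumptions, not values from the slides.

```python
def predict(a, b, x):
    """Return the predicted score Y' = a + b1*X1 + ... + bk*Xk.

    a: intercept; b: list of regression coefficients; x: list of IV scores.
    """
    if len(b) != len(x):
        raise ValueError("need exactly one coefficient per IV")
    return a + sum(bj * xj for bj, xj in zip(b, x))


# Hypothetical two-IV case: a = 1, b1 = 2, b2 = 3, X1 = 4, X2 = 5.
y_pred = predict(1.0, [2.0, 3.0], [4.0, 5.0])
```

With two IVs the lists simply have length two; the same function covers any number of IVs.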

Calculation of basic statistics 1  Calculation with two IVs is similar to calculation with one IV. It is not hard, but it is tedious.  We need knowledge of matrix operations to perform the calculations with three or more IVs.  The good news is that we can have the computer do the calculations!

Calculation of basic statistics 2

Calculation of basic statistics 3

Why calculations, as always?  To obtain the intercept (a) and the regression coefficients (b's)!
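For reference, the standard least-squares solution for two IVs can be written in deviation-score form. The notation is an assumption, since the formulas on the original slides did not survive: lowercase letters denote deviations from the mean (for example, $x_1 = X_1 - \bar{X}_1$).

```latex
\begin{aligned}
b_1 &= \frac{(\sum x_2^2)(\sum x_1 y) - (\sum x_1 x_2)(\sum x_2 y)}
            {(\sum x_1^2)(\sum x_2^2) - (\sum x_1 x_2)^2} \\
b_2 &= \frac{(\sum x_1^2)(\sum x_2 y) - (\sum x_1 x_2)(\sum x_1 y)}
            {(\sum x_1^2)(\sum x_2^2) - (\sum x_1 x_2)^2} \\
a   &= \bar{Y} - b_1 \bar{X}_1 - b_2 \bar{X}_2
\end{aligned}
```

The shared denominator is zero when the two IVs are perfectly correlated, in which case the b's cannot be estimated.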

Brain exercise  Now we have the regression equation!  What's next?  The predicted Y, or Y'!  Then what?  The deviation due to regression ($Y' - \bar{Y}$) and the regression sum of squares ($ss_{reg} = \sum (Y' - \bar{Y})^2$).  The deviation due to residuals ($Y - Y'$) and the residual sum of squares ($ss_{res} = \sum (Y - Y')^2$).

Sum of squares  Recall that there are plenty of ways to calculate the sums of squares. Some methods allow us to calculate them without using Y'.  Remember, we still need Y' to calculate residuals, which are essential for regression diagnostics (chapter 3).
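The sketch below computes the regression and residual sums of squares in two ways: through the predicted scores Y', and through a shortcut that needs only the b's and the deviation cross-products. The shortcut shown is a standard identity and is assumed, not confirmed, to be the one the slide displayed. The data and the helper names `mean` and `fit_two_ivs` are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def fit_two_ivs(x1, x2, y):
    """Least-squares a, b1, b2 for two IVs, computed from deviation sums."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(u * u for u in d1)
    s22 = sum(u * u for u in d2)
    s12 = sum(u * v for u, v in zip(d1, d2))
    s1y = sum(u * v for u, v in zip(d1, dy))
    s2y = sum(u * v for u, v in zip(d2, dy))
    det = s11 * s22 - s12 ** 2  # zero if X1 and X2 are perfectly collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    a = my - b1 * m1 - b2 * m2
    return a, b1, b2, s1y, s2y


# Hypothetical scores for five cases (illustration only).
X1 = [1.0, 2.0, 3.0, 4.0, 5.0]
X2 = [2.0, 1.0, 4.0, 3.0, 5.0]
Y = [3.0, 4.0, 8.0, 7.0, 11.0]

a, b1, b2, s1y, s2y = fit_two_ivs(X1, X2, Y)
y_bar = mean(Y)
y_pred = [a + b1 * u + b2 * v for u, v in zip(X1, X2)]

# Route 1: through the predicted scores Y'.
ss_reg = sum((p - y_bar) ** 2 for p in y_pred)
ss_res = sum((obs - p) ** 2 for obs, p in zip(Y, y_pred))

# Route 2: without Y', from the b's and the deviation cross-products.
ss_total = sum((obs - y_bar) ** 2 for obs in Y)
ss_reg_short = b1 * s1y + b2 * s2y
ss_res_short = ss_total - ss_reg_short
```

Both routes should agree up to rounding error, and the two components should add up to the total sum of squares of Y.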

Squared multiple correlation coefficient  $R^2$ indicates the proportion of variance of the DV (Y) accounted for by the IVs (X's).  Note that, for two IVs, $R^2$ is written $R^2_{y.12}$.

Test of significance of $R^2$  F test: tests whether $R^2$ is significantly different from 0. Rule of thumb: we reject $H_0$ when the calculated F is greater than the table (critical) value, or when the calculated probability (p) is less than the significance level α.  [Figure: F distribution marked with the critical F, the significance level, the probability p, and the fail-to-reject and reject regions for $H_0$.]
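A minimal sketch of the F ratio for this test, using the usual form with k and N - k - 1 degrees of freedom. The function name and the input values are hypothetical; the critical value still has to come from an F table or statistical software.

```python
def f_for_r_squared(r2, n, k):
    """F ratio for H0: R^2 = 0.

    r2: squared multiple correlation; n: sample size; k: number of IVs.
    Returns (F, numerator df, denominator df).
    """
    df_reg = k
    df_res = n - k - 1
    if df_res <= 0:
        raise ValueError("need n > k + 1")
    f_ratio = (r2 / df_reg) / ((1.0 - r2) / df_res)
    return f_ratio, df_reg, df_res


# Hypothetical values: R^2 = .50 from N = 23 cases and k = 2 IVs.
f_ratio, df1, df2 = f_for_r_squared(0.50, 23, 2)
```

The calculated F is then compared with the critical F for (df1, df2) degrees of freedom at the chosen α.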

Test of significance of individual b's  t test (mostly two-tailed, unless one direction can be ruled out): tests whether b is significantly different from 0. Rule of thumb: we reject $H_0$ when the absolute value of the calculated t is greater than the table (critical) value, or when the calculated probability is less than α.  [Figure: t distribution marked with the fail-to-reject and reject regions for $H_0$.]
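For reference, the t ratio for a coefficient is the coefficient divided by its standard error. The formulas below are the standard ones for two IVs, shown for $b_1$; the symbols $s_{b_1}$, $s^2_{y.12}$ (variance of estimate) and $r_{12}$ (correlation between the two IVs) are notation assumed here, not taken from the slides.

```latex
\begin{aligned}
t &= \frac{b_1}{s_{b_1}}, \qquad df = N - k - 1 \\
s_{b_1} &= \sqrt{\frac{s^2_{y.12}}{\sum x_1^2 \,(1 - r_{12}^2)}},
\qquad s^2_{y.12} = \frac{ss_{res}}{N - k - 1}
\end{aligned}
```

The term $1 - r_{12}^2$ shows why highly correlated IVs inflate the standard error and make individual b's harder to distinguish from 0.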

Test of $R^2$ vs. test of b  The test of $R^2$ is equivalent to testing all the b's simultaneously.  The test of a given b determines whether it differs from 0 while controlling for the effects of the other IVs.  For simple linear regression, the two tests are equivalent ($F = t^2$).
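The equivalence in simple linear regression can be checked numerically: with one IV, the F ratio for the regression equals the square of the t ratio for b. The data and the function name below are hypothetical.

```python
import math


def simple_regression_f_and_t(x, y):
    """Return (F for the regression, t for b) in a one-IV regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    b = sxy / sxx
    ss_reg = b * sxy
    ss_res = syy - ss_reg
    ms_res = ss_res / (n - 2)              # residual df = N - k - 1 with k = 1
    f_ratio = ss_reg / ms_res              # regression df is 1
    t_ratio = b / math.sqrt(ms_res / sxx)  # b divided by its standard error
    return f_ratio, t_ratio


# Hypothetical scores (illustration only).
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 1.0, 4.0, 3.0, 5.0]
f_ratio, t_ratio = simple_regression_f_and_t(X, Y)
```

With two or more IVs the identity no longer holds for the overall F, because F then tests all the b's at once.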

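Confidence interval  Definition: if an experiment were repeated many times and a 100(1-α)% confidence interval were computed each time, 100(1-α)% of these intervals would contain the population parameter (µ for a mean; here, the population regression coefficient).  If the CI for a regression coefficient does not include 0, we reject $H_0$ and conclude that the coefficient differs significantly from 0.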
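For a regression coefficient, the interval takes the usual form of estimate plus or minus a critical t value times the standard error. The symbol $s_b$ for the standard error of b is notation assumed here.

```latex
b \pm t_{(\alpha/2,\; df)} \, s_b, \qquad df = N - k - 1
```

Checking whether this interval contains 0 gives the same decision as the two-tailed t test of b at the same α.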

Test of increments in proportion of variance accounted for ($R^2$ change)  In multiple linear regression, we can test how much $R^2$ increases or decreases when a given IV or a set of IVs is added to or deleted from the regression equation.

Test of increments in proportion of variance accounted for ($R^2$ change)  The test is equivalent to the test of significance of an individual b when a single IV is added to or deleted from the regression equation.  Note that the $R^2$ change attributed to a given IV or set of IVs depends on the order of addition or deletion.
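A minimal sketch of the F ratio for an increment in R², comparing a larger model with a smaller model nested inside it. The function name, argument names and input values are hypothetical.

```python
def f_for_r_squared_change(r2_full, r2_reduced, n, k_full, k_reduced):
    """F ratio for the increment in R^2 when IVs are added to a model.

    r2_full: R^2 with the larger set of k_full IVs.
    r2_reduced: R^2 with the smaller, nested set of k_reduced IVs.
    n: sample size. Returns (F, numerator df, denominator df).
    """
    df_num = k_full - k_reduced
    df_den = n - k_full - 1
    if df_num <= 0 or df_den <= 0:
        raise ValueError("need k_full > k_reduced and n > k_full + 1")
    f_ratio = ((r2_full - r2_reduced) / df_num) / ((1.0 - r2_full) / df_den)
    return f_ratio, df_num, df_den


# Hypothetical values: adding X2 to X1 raises R^2 from .50 to .60, N = 23.
f_change, df1, df2 = f_for_r_squared_change(0.60, 0.50, 23, 2, 1)
```

When exactly one IV is added, the numerator df is 1 and this F equals the squared t ratio for that IV's b in the larger model.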

Commonly used methods of adding or deleting variables  Enter: enter all IVs at once in a single model.  Stepwise: enter IVs one by one in several models, commonly based on $R^2$.  Forward: enter IVs one by one based on the strength of their correlation with the DV.  Backward: enter all IVs, then delete the weakest one unless its removal significantly affects the model.  Hierarchical: enter IVs (one or more at a time) according to a theoretical framework.

Standardized regression coefficient (β, beta)  In SPSS (now PASW) output, the coefficients table reports the standardized coefficients in a column labeled Beta.  Is it a population parameter?

Standardized regression coefficient (β, beta)  The sample unstandardized regression coefficient (b) is the expected change in Y associated with a one-unit change in X, holding the other IVs constant.  The sample standardized regression coefficient (β) is the expected change in Y, in standard deviation units, associated with a change of one standard deviation in X, holding the other IVs constant.

Standardized regression coefficient (β, beta)  The regression equation now is: $z'_y = \beta_1 z_1 + \beta_2 z_2$.  Note that the intercept a disappears because the means of standardized scores are always 0.  β can be used to gauge the relative contribution of each IV to accounting for variance in the DV.
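A standardized coefficient can be recovered from the unstandardized one by rescaling with the sample standard deviations: β = b · s_x / s_y. The sketch below shows only this conversion; the coefficient `b1` is assumed to have been estimated already, and the scores and names are hypothetical.

```python
import statistics


def standardized_beta(b, x, y):
    """Convert an unstandardized coefficient b for IV x into beta = b * s_x / s_y."""
    return b * statistics.stdev(x) / statistics.stdev(y)


# Hypothetical scores and coefficient (illustration only).
X1 = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 1.0, 4.0, 3.0, 5.0]
b1 = 0.8
beta1 = standardized_beta(b1, X1, Y)
```

Because β depends on the sample standard deviations, it is comparable across IVs within one sample, while b is the better choice for comparisons across samples.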

What about the correlation coefficients (r's)?  Later, we will discuss the correlation coefficients in detail, mostly in chapter 7 (Statistical Control: Partial and Semipartial Correlation).

Remarks  Multiple regression is an extension of simple linear regression, and its interpretation is similar.  We need to emphasize the contribution of each individual IV.  Multiple IVs often explain and predict the DV better than a single IV, but this is not always true.
