Regression with 2 IVs: Generalization of Regression from 1 to 2 Independent Variables


Questions
Write a raw score regression equation with 2 IVs in it.
What is the difference in interpretation of b weights in simple regression vs. multiple regression?
What happens to b weights if we add new variables to the regression equation that are highly correlated with ones already in the equation?
Why do we report beta weights (standardized b weights)?

More Questions
Write a regression equation with beta weights in it.
How is it possible to have a significant R-square and non-significant b weights?
What are the three factors that influence the standard error of the b weight?
Describe R-square in two different ways, that is, using two distinct formulas. Explain the formulas.

Equations
1 IV: Y = b0 + b1X + e. Define terms: Y is the DV, X the IV, b0 the intercept, b1 the slope, and e the error.
Multiple IVs: Y = b0 + b1X1 + b2X2 + e. One score, 1 intercept, 1 error, many slopes.
Predicted value: Y' = b0 + b1X1 + b2X2.
Recall the slope and intercept for 1 IV: b1 = SPxy/SSx, where SPxy = Σ(X - Mx)(Y - My) is the sum of cross-products and SSx = Σ(X - Mx)² is the sum of squares; b0 = My - b1*Mx.

Equations (2)
With 2 IVs, the slopes are
b1 = (SSx2*SPx1y - SPx1x2*SPx2y) / (SSx1*SSx2 - SPx1x2²)
b2 = (SSx1*SPx2y - SPx1x2*SPx1y) / (SSx1*SSx2 - SPx1x2²)
Note: the b weights use SSx1, SSx2, and all 3 cross-products (SPx1y, SPx2y, SPx1x2). Unlike the slopes, the intercept is a simple extension of the 1 IV case: b0 = My - b1*Mx1 - b2*Mx2.
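As a concrete illustration of these formulas, here is a minimal Python sketch (NumPy assumed available; the function and variable names are mine, not the lecture's) that computes the two slopes and the intercept from the sums of squares and cross-products:

    import numpy as np

    def two_iv_ols(y, x1, x2):
        """Raw-score slopes and intercept for Y on X1, X2 via the SS/SP formulas."""
        yd, d1, d2 = y - y.mean(), x1 - x1.mean(), x2 - x2.mean()
        ss1, ss2 = (d1 ** 2).sum(), (d2 ** 2).sum()    # SSx1, SSx2
        sp12 = (d1 * d2).sum()                         # SPx1x2
        sp1y, sp2y = (d1 * yd).sum(), (d2 * yd).sum()  # SPx1y, SPx2y
        denom = ss1 * ss2 - sp12 ** 2
        b1 = (ss2 * sp1y - sp12 * sp2y) / denom
        b2 = (ss1 * sp2y - sp12 * sp1y) / denom
        b0 = y.mean() - b1 * x1.mean() - b2 * x2.mean()  # extension of the 1-IV intercept
        return b0, b1, b2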

Numerical Example
Chevy mechanics; mechanical aptitude & conscientiousness. Find sums.
[Data table: columns Job Perf (Y), Mech Apt (X1), Consc (X2), X1*Y, X2*Y, X1*X2, with summary rows for Sum, N (= 20), M, SD, and USS; the values did not survive transcription.]
Note. Only some of the data are shown.

SSCP Matrix
SSCP means sums of squares and cross-products: a symmetric matrix with the sum of squares for each variable (Y, X1, X2) on the diagonal and the cross-products between variables off the diagonal. Here SSy = 29.75; the remaining entries were lost in transcription.

Find Estimates
[SSCP table for Y (Perf), X1 (MA), X2 (Consc); SSy = 29.75, other entries lost in transcription.] Plug the sums of squares and cross-products into the formulas above to estimate the weights: predicted job performance as a function of the two test scores.
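Since the slide's numbers did not survive, here is a hedged sketch of the same steps in Python, with made-up scores standing in for the lost table (the values are hypothetical, not the lecture's data); it builds the deviation SSCP matrix and then solves for the weights with the function above:

    # Hypothetical stand-in data: job performance, mechanical aptitude, conscientiousness.
    perf  = np.array([1., 2., 1., 3., 2., 3., 3., 4., 4., 3.])
    mech  = np.array([40., 45., 38., 50., 48., 55., 53., 55., 58., 40.])
    consc = np.array([25., 20., 30., 30., 28., 30., 34., 36., 32., 34.])

    data = np.column_stack([perf, mech, consc])
    dev = data - data.mean(axis=0)
    sscp = dev.T @ dev                     # SS on the diagonal, cross-products off it
    b0, b1, b2 = two_iv_ols(perf, mech, consc)
    print(np.round(sscp, 2), b0, b1, b2)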

Scatterplots

Scatterplot 2

Scatterplot 3 Predicted Y is a plane.

R²
Y is a linear function of the Xs plus error. Use capital R for multiple regression. R² is the proportion of variance in Y due to regression: R² = SSreg/SSy = 1 - SSres/SSy.
[Table of Y, X1, X2, Y' (predicted), and residual scores, with rows for M, V, and USS; values lost in transcription.]
Note: N = 19 here; lost 1.
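A short continuation of the sketch (same hypothetical data as above) computes predicted values and residuals and shows the two equivalent routes to R²:

    yhat = b0 + b1 * mech + b2 * consc
    resid = perf - yhat
    ss_y = ((perf - perf.mean()) ** 2).sum()
    r2_a = 1 - (resid ** 2).sum() / ss_y         # 1 - SSres/SSy
    r2_b = np.corrcoef(perf, yhat)[0, 1] ** 2    # squared r(Y, Y')
    # For OLS with an intercept, these two agree.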

Correlations Among Data
[Correlation matrix among Y, X1, X2, the predicted values, and the residuals; r(Y, X1) = .73, remaining entries lost in transcription.]

Excel Example
Grab the file from the web under Lecture, Excel Example.

Review
Write a raw score regression equation with 2 IVs in it. Describe terms.
Describe a concrete example where you would use multiple regression to analyze the data.
What does R² mean in multiple regression? For your concrete example, what would an R² of .15 mean?
With 1 IV, the IV and the predicted values correlate 1.0. Not so with 2 or more IVs. Why?

Significance Test for R²
F = (R²/k) / ((1 - R²)/(N - k - 1))
When the null is true, the result is distributed as F with k and (N - k - 1) df.
In our example, R² = .61, k = 2, and N = 20, so F = (.61/2)/((1 - .61)/17) = 13.30, which exceeds the critical value F(α = .05, 2, 17) = 3.59; R² is significant.
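The same test in Python, using the slide's numbers (SciPy assumed available for the p-value and critical value):

    from scipy import stats

    R2, k, N = .61, 2, 20
    F = (R2 / k) / ((1 - R2) / (N - k - 1))    # about 13.30 on (2, 17) df
    p = stats.f.sf(F, k, N - k - 1)            # well below .05
    F_crit = stats.f.ppf(.95, k, N - k - 1)    # 3.59, as on the slide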

The Problem of Variable Importance
With 1 IV, the correlation provides a simple index of the 'importance' of that variable; both r and r² are good indices of importance.
With multiple IVs, total R² will equal the sum of the individual IV r² values if and only if the IVs are mutually uncorrelated, that is, each correlates to some degree with Y but not with the other IVs.
When the IVs are correlated, there are many different statistical indices of the 'importance' of the IVs, and they do not agree with one another. There is no simple answer to questions about the importance of correlated IVs; rather, there are many reasonable answers depending on what you mean by importance.

Venn Diagrams {easy but not always right}
Fig 1. IVs uncorrelated: each IV overlaps with Y but not with the other IV, so R² equals the sum of the squared correlations with Y.
Fig 2. IVs correlated: the IVs overlap with each other as well as with Y, so R² is less than the sum of the squared correlations. What to do with the shared Y variance?
[Two correlation matrices accompanied the figures; most entries, including the R² computations, were lost in transcription.]

More Venn Diagrams
Desired state: each X correlates with Y, but the Xs do not correlate with each other. Typical state: the Xs correlate with Y and with each other.
In a regression problem, we want to predict Y from the Xs as well as possible (maximize R²). To do so, we want X variables correlated with Y but not with each other. Such variables are hard to find; cognitive ability tests, for example, tend to correlate with one another.

Raw & Standardized Regression Weights
Each X has a raw score slope, b. The slope tells the expected change in Y if X changes 1 unit*. Large b weights should indicate important variables, but b depends on the variance of X: the b for height in feet would be 12 times larger than the b for height in inches, because a 1-foot change covers 12 times the distance.
If we standardize X and Y, all the Xs have the same units, and the relative sizes of the b weights become meaningful.
*Strictly speaking, holding the other X variables constant.

Computing Standardized Regression Weights
The standardized regression weight is also known as the beta weight (a poor choice of names & symbols). With 1 IV, β = rxy. If you have a correlation matrix, you can calculate the beta weights for 2 IVs directly:
β1 = (ry1 - ry2*r12) / (1 - r12²)
β2 = (ry2 - ry1*r12) / (1 - r12²)
[The example correlation matrix was lost in transcription.] What is r12? What impact does it have?
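A sketch of the same computation in Python; the three correlations are hypothetical placeholders, since the slide's matrix did not survive:

    ry1, ry2, r12 = .50, .60, .30    # hypothetical r(Y,X1), r(Y,X2), r(X1,X2)
    beta1 = (ry1 - ry2 * r12) / (1 - r12 ** 2)
    beta2 = (ry2 - ry1 * r12) / (1 - r12 ** 2)
    R2 = beta1 * ry1 + beta2 * ry2   # R-square as the sum of beta*r (next slide)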

Calculating R²
Two distinct formulas:
(1) Sum of squared simple (zero-order) correlations, when the IVs are uncorrelated: R² = r²y1 + r²y2.
(2) Sum of products of the standardized regression weight and the correlation: R² = β1*ry1 + β2*ry2, which holds whether or not the IVs are correlated.
This is really interesting because the products add up to R², and because r, β, and the product of the two are all reasonable indices of the importance of the IV.

Calculating R² (2)
[Worked example applying R² = β1*ry1 + β2*ry2 to the sample correlation matrix; values lost in transcription.]

Review
What is the problem with correlated independent variables if we want to maximize the variance accounted for in the criterion?
Why do we report beta weights (standardized b weights)?
Describe R-square in two different ways, that is, using two distinct formulas. Explain the formulas.

Tests of Regression Coefficients (b Weights)
Each slope tells the expected change in Y when X changes 1 unit, with X controlled for all other X variables (consider the Venn diagrams). Standard errors of the b weights with 2 IVs:
SE(b1) = sqrt( S²y.12 / (SSx1 * (1 - r²12)) )
SE(b2) = sqrt( S²y.12 / (SSx2 * (1 - r²12)) )
where S²y.12 is the variance of estimate (variance of the residuals), the first term in the denominator is the sum of squares for X1 or X2, and r²12 is the squared correlation between the predictors.

Tests of b Weights (2)
In our example, SSres = 9.42, so the variance of estimate is S²y.12 = 9.42/(20 - 2 - 1) = .55.
For significance of a b weight, compute a t: t = b / SE(b). Degrees of freedom for each t are N - k - 1 (here 17).
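Continuing the hypothetical-data sketch (reusing names from the earlier blocks), the standard error and t for b1 would be computed as:

    N, k = len(perf), 2
    s2_y12 = (resid ** 2).sum() / (N - k - 1)    # variance of estimate
    r12sq = np.corrcoef(mech, consc)[0, 1] ** 2
    ss_x1 = ((mech - mech.mean()) ** 2).sum()
    se_b1 = np.sqrt(s2_y12 / (ss_x1 * (1 - r12sq)))
    t_b1 = b1 / se_b1
    p_b1 = 2 * stats.t.sf(abs(t_b1), N - k - 1)  # two-tailed p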

Tests of R² vs. Tests of b
Slopes (b) tell about the relation between Y and the unique part of X. R² tells about the proportion of variance in Y accounted for by the set of predictors all together.
Correlations among the X variables increase the standard errors of the b weights but do not affect R². It is therefore possible to get a significant R² but no (or few) significant b weights (see the Venn diagrams). It is possible, but unlikely, to have a significant b but a non-significant R².
Look to R² first. If it is not significant, avoid interpreting the b weights.

Review
How is it possible to have a significant R-square and non-significant b weights?
Write a regression equation with beta weights in it. Describe terms.

Testing Incremental R²
You can start a regression with a set of one or more variables and then add predictors one or more at a time. When you add predictors, R² will never go down; it usually goes up, and you can test whether the increment in R² is significant or likely due to chance:
F = ((R²L - R²S) / (kL - kS)) / ((1 - R²L) / (N - kL - 1))
where R²L is the R-square for the larger model, R²S is the R-square for the smaller model, kL is the number of predictors in the larger model, and kS is the number of predictors in the smaller model. The F has (kL - kS) and (N - kL - 1) df.

Examples of Testing Increments
Suppose we start with 1 variable and R-square is .52. We add a second variable and R-square increases to .67. We have 20 people. Then F = ((.67 - .52)/1) / ((1 - .67)/17) = 7.73 with (1, 17) df, p < .05.
Suppose we start with 3 IVs and R-square is .25. We add 2 more IVs in a block and R-square climbs to .35. We have 100 people. Then F = ((.35 - .25)/2) / ((1 - .35)/94) = 7.23 with (2, 94) df, p < .05.
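Both examples can be reproduced with a small Python helper (the function name is mine):

    def f_increment(r2_large, r2_small, k_large, k_small, n):
        """F test for the increment in R-square when predictors are added."""
        num = (r2_large - r2_small) / (k_large - k_small)
        den = (1 - r2_large) / (n - k_large - 1)
        return num / den

    f1 = f_increment(.67, .52, 2, 1, 20)     # about 7.73 on (1, 17) df
    f2 = f_increment(.35, .25, 5, 3, 100)    # about 7.23 on (2, 94) df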

Another Look at Importance
In regression problems, the most commonly used indices of importance are the correlation, r, and the increment to R-square when the variable of interest is considered last, sometimes called the last-in R-square change. The last-in increment corresponds to the Type III sums of squares and is closely related to the b weight.
The correlation tells about the importance of the variable ignoring all other predictors. The last-in increment tells about the importance of the variable as a unique contributor to the prediction of Y, above and beyond all other predictors in the model.
You can assign shared variance in Y to specific Xs by adding variables to the equation in a chosen order, but then the importance is somewhat arbitrary and under your influence. 'Importance' is not well defined statistically when IVs are correlated; there are many reasonable answers. This treatment doesn't include mediated models (path analysis).

Review
Find the data on the website (Labs, then 2IV example).
Find r, beta, and r*beta for each IV.
Describe their importance.