Multiple Regression David A. Kenny January 12, 2014

2 The Equation
Y = a + bX + cZ + E
Y: the criterion variable
X, Z: the predictor variables
a: the intercept, the predicted value of Y when all the predictors are zero
b, c: the regression coefficients, how much of a difference in Y results from a one-unit difference in X (or Z)
E: the residual variable
Y, X, Z, and E are variables; a, b, and c are coefficients.
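A minimal numpy sketch of this model, assuming made-up coefficient values (a = 1.0, b = 0.5, c = 0.8) and simulated standard-normal predictors:

    import numpy as np

    # Simulate data that follow Y = a + bX + cZ + E,
    # with illustrative coefficients and normal errors.
    rng = np.random.default_rng(0)
    n = 200
    a, b, c = 1.0, 0.5, 0.8
    X = rng.normal(size=n)             # predictor
    Z = rng.normal(size=n)             # predictor
    E = rng.normal(scale=2.0, size=n)  # residual variable
    Y = a + b * X + c * Z + E          # criterion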

3 Y Hat and the Multiple Correlation
The variable Ŷ is the predicted Y given X and Z, or equivalently a + bX + cZ; it is often called "Y hat." Note that E = Y − Ŷ. R is the multiple correlation: the correlation between Y and Ŷ. Note also that R² can be defined as the variance of Ŷ divided by the variance of Y.
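These definitions can be checked numerically; a sketch with simulated data and an OLS fit via numpy (all values are illustrative):

    import numpy as np

    rng = np.random.default_rng(1)
    n = 200
    X, Z = rng.normal(size=n), rng.normal(size=n)
    Y = 1.0 + 0.5 * X + 0.8 * Z + rng.normal(scale=2.0, size=n)

    # Fit by least squares and form Y hat = a + bX + cZ.
    D = np.column_stack([np.ones(n), X, Z])
    coef, *_ = np.linalg.lstsq(D, Y, rcond=None)
    y_hat = D @ coef

    R = np.corrcoef(Y, y_hat)[0, 1]   # multiple correlation
    print(R**2)                       # R squared
    print(np.var(y_hat) / np.var(Y))  # the same value: var(Y hat) / var(Y)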

4 Least Squares The coefficients (a, b, and c) are chosen so that the sum of squared errors is minimized. The estimation technique is then called least squares or ordinary least squares (OLS). Given the criterion of least squares, the mean of the errors is zero and the errors correlate zero with each predictor.
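Both properties are easy to verify; a sketch with simulated data (they hold exactly, up to rounding, whenever an intercept is in the model):

    import numpy as np

    rng = np.random.default_rng(2)
    n = 200
    X, Z = rng.normal(size=n), rng.normal(size=n)
    Y = 1.0 + 0.5 * X + 0.8 * Z + rng.normal(scale=2.0, size=n)

    D = np.column_stack([np.ones(n), X, Z])
    coef, *_ = np.linalg.lstsq(D, Y, rcond=None)  # OLS estimates of a, b, c
    errors = Y - D @ coef

    print(errors.mean())                 # ~0: mean of the errors is zero
    print(np.corrcoef(errors, X)[0, 1])  # ~0: errors uncorrelated with X
    print(np.corrcoef(errors, Z)[0, 1])  # ~0: errors uncorrelated with Z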

5 Standardized Variables If the predictor and criterion variables are all standardized, the regression coefficients are called beta weights. A beta weight equals the correlation when there is a single predictor. If there are two or more predictors, a beta weight can be larger than +1 or smaller than −1.
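A sketch of beta weights; the simulated data are contrived (two nearly collinear predictors with opposite-signed effects) precisely so that the beta weights fall outside ±1:

    import numpy as np

    rng = np.random.default_rng(3)
    n = 500
    X = rng.normal(size=n)
    Z = X + 0.1 * rng.normal(size=n)      # Z nearly collinear with X
    Y = X - Z + 0.5 * rng.normal(size=n)  # opposite-signed effects

    def standardize(v):
        return (v - v.mean()) / v.std()

    Dz = np.column_stack([standardize(X), standardize(Z)])
    betas, *_ = np.linalg.lstsq(Dz, standardize(Y), rcond=None)
    print(betas)  # here both beta weights exceed 1 in magnitude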

6 Order of Entry and Stepwise Regression The predictors in a regression equation have no order, and no predictor can be said to enter before another. Generally, in interpreting a regression equation, it makes no scientific sense to speak of the variance due to a given predictor: such variance measures depend on the order of entry in stepwise regression and on the correlations between the predictors. Likewise, the semi-partial correlation or unique variance has little interpretive utility.

7 Assumptions
For significance testing, the following assumptions are made about the errors, the Es:
1) They have a normal distribution.
2) Their variance is constant and does not depend on the level of any predictor.
3) They are independent of each other, i.e., there is no clustering.
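These assumptions can be probed informally; a rough sketch of two such checks, assuming scipy is available (independence must instead be judged from the design, e.g., whether observations are clustered):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(4)
    n = 200
    X, Z = rng.normal(size=n), rng.normal(size=n)
    Y = 1.0 + 0.5 * X + 0.8 * Z + rng.normal(scale=2.0, size=n)

    D = np.column_stack([np.ones(n), X, Z])
    coef, *_ = np.linalg.lstsq(D, Y, rcond=None)
    errors = Y - D @ coef

    # 1) Normality: Shapiro-Wilk test on the errors.
    print(stats.shapiro(errors).pvalue)

    # 2) Constant variance: |errors| should not track the predictors.
    print(np.corrcoef(np.abs(errors), X)[0, 1])
    print(np.corrcoef(np.abs(errors), Z)[0, 1])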

8 Significance Testing The standard test of a regression coefficient is to determine whether the multiple correlation significantly declines when the predictor variable is removed from the equation while the other predictor variables remain. In most computer programs, this test is given by the t or F next to the coefficient.
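A sketch of this model-comparison logic, dropping Z from a full model; for a single dropped predictor, the t reported next to the coefficient is the square root of this F (scipy supplies the p-value):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(5)
    n = 200
    X, Z = rng.normal(size=n), rng.normal(size=n)
    Y = 1.0 + 0.5 * X + 0.8 * Z + rng.normal(scale=2.0, size=n)

    def sse(design, y):
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        return np.sum((y - design @ coef) ** 2)

    full    = np.column_stack([np.ones(n), X, Z])
    reduced = np.column_stack([np.ones(n), X])  # Z removed

    k = 2  # number of predictors in the full model
    F = (sse(reduced, Y) - sse(full, Y)) / (sse(full, Y) / (n - k - 1))
    p = stats.f.sf(F, 1, n - k - 1)
    print(F, p)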

9 Multicollinearity If two predictors are highly correlated, or if one predictor has a large multiple correlation with the other predictors, there is said to be multicollinearity. With perfect multicollinearity (correlations of plus or minus one), estimation of the regression coefficients is impossible. Multicollinearity results in large standard errors for the coefficients, making a statistically significant regression coefficient difficult to obtain (power is low).

10 Multicollinearity Multicollinearity for a given predictor is typically measured by what is called tolerance, defined as 1 − R², where R² is the squared multiple correlation from a regression in which that predictor becomes the criterion and the remaining predictors are the predictors. Generally, tolerance values below .20 are considered potentially problematic. Another measure is the variance inflation factor (VIF), defined as 1/(1 − R²). Values above 5 are considered potentially problematic.
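A sketch computing tolerance and VIF by regressing each predictor on the others; X3 is built to be collinear, so its tolerance falls below .20 and its VIF above 5:

    import numpy as np

    rng = np.random.default_rng(6)
    n = 300
    X1 = rng.normal(size=n)
    X2 = rng.normal(size=n)
    X3 = X1 + X2 + rng.normal(scale=0.3, size=n)  # collinear by construction
    P = np.column_stack([X1, X2, X3])

    def r_squared(y, design):
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        return 1 - resid.var() / y.var()

    for j in range(P.shape[1]):
        others = np.column_stack([np.ones(n), np.delete(P, j, axis=1)])
        tol = 1 - r_squared(P[:, j], others)  # tolerance
        print(f"X{j + 1}: tolerance = {tol:.3f}, VIF = {1 / tol:.1f}")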

11 Suppression It can occur that a predictor has little or no correlation with the criterion but has a moderate to large regression coefficient. For this to happen, two conditions must co-occur: 1) the predictor must be collinear with one or more of the other predictors, and 2) those predictors must have non-trivial coefficients. With suppression, because the suppressor is correlated with a predictor that has a large effect on the criterion, the suppressor would be expected to correlate with the criterion. To explain the fact that it does not, the suppressor is assumed to have an effect that compensates for the missing correlation.
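A contrived simulation of suppression: Z shares irrelevant variance with X, correlates essentially zero with Y, and yet earns a large negative regression coefficient:

    import numpy as np

    rng = np.random.default_rng(7)
    n = 1000
    T = rng.normal(size=n)  # the part of X that matters for Y
    Z = rng.normal(size=n)  # suppressor: irrelevant to Y...
    X = T + Z               # ...but mixed into X
    Y = T + 0.5 * rng.normal(size=n)

    print(np.corrcoef(Z, Y)[0, 1])  # ~0: Z barely correlates with Y

    D = np.column_stack([np.ones(n), X, Z])
    coef, *_ = np.linalg.lstsq(D, Y, rcond=None)
    print(coef)                     # X's coefficient ~ +1, Z's ~ -1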

12 Advanced Topics
Rescaling
No intercept
Adjusted R²
Bilinear effects

13 No Intercept
It is possible to run a multiple regression equation but fix the intercept to zero. This is done for different reasons:
– There may be a reason to think the intercept is zero, e.g., the criterion is a change score.
– One may want two intercepts, one for each level of a dichotomous predictor: the two-intercept model.
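A sketch of both variants; G is a hypothetical dichotomous predictor, and the two-intercept model simply replaces the constant with one dummy column per group:

    import numpy as np

    rng = np.random.default_rng(8)
    n = 200
    X = rng.normal(size=n)
    G = rng.integers(0, 2, size=n)              # dichotomous predictor (0/1)
    Y = 2.0 * G + 0.5 * X + rng.normal(size=n)

    # No-intercept model: omit the column of ones.
    b_no_int, *_ = np.linalg.lstsq(X[:, None], Y, rcond=None)

    # Two-intercept model: one dummy per group, no common constant.
    D2 = np.column_stack([G == 0, G == 1, X]).astype(float)
    coef, *_ = np.linalg.lstsq(D2, Y, rcond=None)
    print(coef)  # intercept for G=0, intercept for G=1, slope for X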

14 Rescaling Imagine the following equation: Y = a + bX + E. If X is rescaled as X′ = cX + d, then substituting X = (X′ − d)/c gives the new regression equation: Y = (a − bd/c) + (b/c)X′ + E. The slope is divided by c and the intercept shifts by −bd/c. Centering is the special case c = 1 and d = −M_X, where M_X is the mean of X; it changes the intercept to a + bM_X and leaves the slope unchanged.
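A numerical check of this algebra, with arbitrary rescaling constants c = 2 and d = 3:

    import numpy as np

    rng = np.random.default_rng(9)
    n = 200
    X = rng.normal(size=n)
    Y = 1.0 + 0.5 * X + rng.normal(size=n)

    def fit(x, y):
        design = np.column_stack([np.ones(len(x)), x])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        return coef  # (intercept, slope)

    c, d = 2.0, 3.0
    a, b = fit(X, Y)
    a2, b2 = fit(c * X + d, Y)  # regress Y on X' = cX + d
    print(b2, b / c)            # the slope becomes b/c
    print(a2, a - b * d / c)    # the intercept becomes a - bd/c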

15 Adjusted R² The multiple correlation is biased, i.e., too large. We can adjust R² for this bias by computing [R² − k/(N − 1)][(N − 1)/(N − k − 1)], where N is the number of cases and k the number of predictors; this equals the more familiar form 1 − (1 − R²)(N − 1)/(N − k − 1). If the result is negative, the adjusted R² is set to zero. The adjustment is bigger when k is large relative to N. Normally, the adjustment is not made and the regular R² is reported.
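A small sketch of the adjustment, confirming that the slide's formula matches the familiar form (the numeric inputs are arbitrary):

    def adjusted_r2(r2, n, k):
        # Adjusted R^2 as on the slide, floored at zero.
        adj = (r2 - k / (n - 1)) * ((n - 1) / (n - k - 1))
        return max(adj, 0.0)

    r2, n, k = 0.25, 50, 5
    print(adjusted_r2(r2, n, k))                 # 0.1648...
    print(1 - (1 - r2) * (n - 1) / (n - k - 1))  # the same value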

16 Bilinear or Piecewise Regression
Imagine you want the effect of X to change at a given value X0. Create two variables:
X1 = X when X ≤ X0, zero otherwise
X2 = X when X > X0, zero otherwise
Regress Y on X1 and X2: the coefficient for X1 is the slope below X0 and the coefficient for X2 is the slope above it.
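A sketch of this coding with the change point at X0 = 0; the simulated slopes (0.5 below, 2.0 above) are recovered:

    import numpy as np

    rng = np.random.default_rng(10)
    n = 400
    X = rng.uniform(-3, 3, size=n)
    x0 = 0.0
    true_slope = np.where(X <= x0, 0.5, 2.0)
    Y = 1.0 + true_slope * X + 0.3 * rng.normal(size=n)

    X1 = np.where(X <= x0, X, 0.0)  # X at or below the change point
    X2 = np.where(X > x0, X, 0.0)   # X above the change point

    D = np.column_stack([np.ones(n), X1, X2])
    coef, *_ = np.linalg.lstsq(D, Y, rcond=None)
    print(coef)  # ~[1.0, 0.5, 2.0]: intercept, slope below x0, slope above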

17 Example
Consider the hypothetical regression equation in which Age (in years) and Gender (1 = Male and −1 = Female) predict Weight (in pounds):
Weight = 12 + 22(Gender) + 3(Age) + Error

18 Interpretation
We interpret the unstandardized coefficients as follows:
intercept: the predicted weight for people who are zero years of age and halfway between male and female is 12 pounds
gender: a difference between men and women on the gender variable equals 2, so there is a 44 (2 times 22) pound difference between the two groups
age: a difference of one year in age results in a difference of 3 pounds
It is advisable to center the Age variable. To center Age, we would subtract the mean age from Age. Doing so would change the intercept to the predicted score for persons of average age in the study.

19 Rescaling
Note that if we recoded Gender to be 1 = Male and 0 = Female, the new equation would be:
Weight = −10 + 44(Gender) + 3(Age) + Error
intercept: the predicted weight for women who are zero years of age, which is −10 pounds
gender: men weigh on average 44 pounds more than women, controlling for age
age: a difference of one year in age results in a difference of 3 pounds
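A simulation of the example, fitting the same generated data under both gender codings to confirm the two equations (the coefficient values are the slides' hypothetical ones):

    import numpy as np

    rng = np.random.default_rng(11)
    n = 1000
    gender = rng.choice([1, -1], size=n)  # 1 = male, -1 = female
    age = rng.uniform(20, 60, size=n)
    weight = 12 + 22 * gender + 3 * age + rng.normal(scale=10, size=n)

    def fit(g):
        D = np.column_stack([np.ones(n), g, age])
        coef, *_ = np.linalg.lstsq(D, weight, rcond=None)
        return coef

    print(fit(gender))            # ~[12, 22, 3] with the 1/-1 coding
    print(fit((gender + 1) / 2))  # ~[-10, 44, 3] with the 1/0 coding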