Multiple Regression Models: Some Details & Surprises

Multiple Regression Models: Some Details & Surprises
Review of raw & standardized models
Differences between r, b & β
Bivariate & multivariate patterns
Suppressor variables
Collinearity
MR surprises:
– Multivariate power
– Null wash-out
– Extreme collinearity
Missing data

raw score regression: y′ = b₁x₁ + b₂x₂ + b₃x₃ + a
Each b represents the unique and independent contribution of that predictor to the model.
For a quantitative predictor, b tells the expected direction and amount of change in the criterion for a 1-unit change in that predictor, while holding the value of all the other predictors constant.
For a binary predictor (with unit coding -- 0,1 or 1,2, etc.), b tells the direction and amount of the group mean difference on the criterion variable, while holding the value of all the other predictors constant.
a is the expected value of the criterion if all predictors have a value of 0.
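Below is a minimal sketch (not from the original slides) of fitting such a raw-score model with ordinary least squares in Python/NumPy; the predictor names and data are made up for illustration.

```python
import numpy as np

# Simulated data standing in for three quantitative predictors (hypothetical).
rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(3.0, 0.5, n)     # e.g., undergraduate GPA
x2 = rng.normal(500, 100, n)     # e.g., GRE score
x3 = rng.normal(20, 5, n)        # e.g., weekly work hours
y = 0.8 * x1 + 0.002 * x2 - 0.01 * x3 + rng.normal(0, 0.3, n)

# Raw-score model y' = b1*x1 + b2*x2 + b3*x3 + a (the column of ones gives the constant a).
X = np.column_stack([x1, x2, x3, np.ones(n)])
b1, b2, b3, a = np.linalg.lstsq(X, y, rcond=None)[0]

# Each b is the expected change in the criterion for a 1-unit change in that
# predictor, holding the other predictors constant; a is the expected criterion
# value when every predictor is 0.
print(b1, b2, b3, a)
```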

standard score regression: Zy′ = β₁Zx₁ + β₂Zx₂ + β₃Zx₃
For a quantitative predictor, each β tells the expected Z-score change in the criterion for a 1-Z-unit change in that predictor, holding the values of all the other predictors constant.
For a binary predictor, β tells the size/direction of the group mean difference on the criterion variable in Z-units, holding all other variable values constant.
As with the standardized bivariate regression model, there is no "a" or "constant", because the mean of Zy′ always = the mean of Zy = 0.
The most common reason to report standardized weights is when you (or the reader) are unfamiliar with the scale of the criterion. A second reason is to promote comparability of the relative contributions of the various predictors (but see the important caveat to this discussed below!!!).
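As a companion sketch (again illustrative, not the slides' own example), refitting a model on Z-scores returns the β weights directly, and the fitted constant is numerically zero:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
X = rng.normal(size=(n, 3))                              # three quantitative predictors
y = X @ np.array([0.5, 0.3, -0.2]) + rng.normal(0, 1, n)

def z(v):
    """Convert to Z-scores (column-wise for 2-D input)."""
    return (v - v.mean(axis=0)) / v.std(axis=0, ddof=1)

# Standardized model Zy' = B1*Zx1 + B2*Zx2 + B3*Zx3; a constant column is kept
# only to show that it comes out ~0.
ZX = np.column_stack([z(X), np.ones(n)])
weights = np.linalg.lstsq(ZX, z(y), rcond=None)[0]

print("beta weights:", weights[:3])
print("constant (should be ~0):", round(weights[3], 12))
```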

Different kinds of correlations & regression weights
r -- simple correlation: tells the direction and strength of the linear relationship between two variables (r = β for bivariate models)
b -- raw regression weight from a bivariate model: tells the expected change (direction and amount) in the criterion for a 1-unit increase in the predictor
β -- standardized regression weight from a bivariate model: tells the expected change (direction and amount) in the criterion in Z-score units for a 1-Z-score-unit increase in that predictor
bᵢ -- raw regression weight from a multivariate model: tells the expected change (direction and amount) in the criterion for a 1-unit increase in that predictor, holding the value of all the other predictors constant
βᵢ -- standardized regression weight from a multivariate model: tells the expected change (direction and amount) in the criterion in Z-score units for a 1-Z-score-unit change in that predictor, holding the value of all the other predictors constant
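A quick numerical check of the first point, that the standardized weight from a bivariate model is just r (illustrative data, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(size=200)
y = 0.4 * x + rng.normal(0, 1, 200)

# Standardize both variables and take the slope of the bivariate model.
zx = (x - x.mean()) / x.std(ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)
beta = np.polyfit(zx, zy, 1)[0]

# The simple correlation and the bivariate standardized weight coincide.
print(round(np.corrcoef(x, y)[0, 1], 4), round(beta, 4))
```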

What influences the size of bivariate r, b & β?
r -- bivariate correlation: range = -1.00 to +1.00
-- strength of the linear relationship with the criterion
-- sampling "problems" (e.g., range restriction)
b -- raw-score regression weight: range = -∞ to +∞
-- strength of the linear relationship with the criterion
-- scale differences between predictor & criterion
-- sampling "problems" (e.g., range restriction)
β -- standardized regression weight: range = -1.00 to +1.00
-- strength of the linear relationship with the criterion
-- sampling "problems" (e.g., range restriction)

What influences the size of multivariate bᵢ & βᵢ?
bᵢ -- raw-score regression weight: range = -∞ to +∞
-- strength of the linear relationship with the criterion
-- collinearity with the other predictors
-- scale differences between predictor and criterion
-- sampling "problems" (e.g., range restriction)
βᵢ -- standardized regression weight: range = -1.00 to +1.00
-- strength of the relationship with the criterion
-- collinearity with the other predictors
-- sampling "problems" (e.g., range restriction)
Difficulties of determining the "more important contributors":
-- b is not very helpful -- scale differences produce b differences
-- β works better, but it is influenced by sampling variability and measurement influences (range restriction)
Only interpret "very large" β differences as evidence that one predictor is "more important" than another.

[Venn diagram: criterion y and predictors x1, x2, x3 -- the overlap of each predictor with y represents its simple correlation ry,x1, ry,x2, ry,x3]

[Venn diagram, continued: the regions labeled bx1 & βx1, bx2 & βx2, bx3 & βx3]
Remember that the b of each predictor represents the part of that predictor shared with the criterion that is not shared with any other predictor -- the unique contribution of that predictor to the model.

[Venn diagram, continued: R² = the sum of all the regions of y overlapped by any of the predictors]
Remember that R² is the total variance shared between the model (all of the predictors) and the criterion -- not just the accumulation of the parts uniquely attributable to each predictor.

Bivariate vs. Multivariate Analyses & Interpretations
We usually perform both bivariate and multivariate analyses with the same set of predictors. Why? Because they address different questions:
-- correlations ask whether the variables each have a relationship with the criterion
-- bivariate regressions add information about the details of that relationship (how much change in Y for how much change in that X)
-- multivariate regressions tell whether the variables have a unique contribution to a particular model (and if so, how much change in Y for how much change in that X after holding all the other Xs constant)
So, it is important to understand the different outcomes possible when performing both bivariate and multivariate analyses with the same set of predictors.

There are 5 patterns of bivariate/multivariate relationship, comparing each predictor's simple correlation with the criterion to its multiple regression weight:
1. Bivariate relationship and multivariate contribution (to this model) have the same sign
2. Non-contributing -- probably because of a weak relationship with the criterion
3. Non-contributing -- probably because of collinearity with one or more other predictors
4. "Suppressor variable" -- no bivariate relationship, but contributes (to this model)
5. "Suppressor variable" -- bivariate relationship & multivariate contribution (to this model) have different signs
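These patterns can be labeled mechanically from a predictor's simple r, its multiple-regression b, and their p-values. The helper below is my own sketch (α = .05 assumed), not code from the slides; the example call uses the GRE row from the Grad GPA table that follows.

```python
def bivariate_multivariate_pattern(r, p_r, b, p_b, alpha=0.05):
    """Label one of the five bivariate/multivariate patterns for a predictor."""
    r_sig, b_sig = p_r < alpha, p_b < alpha
    if r_sig and b_sig and r * b > 0:
        return "bivariate & multivariate contribution have the same sign"
    if r_sig and not b_sig:
        return "non-contributing -- probably collinear with other predictors"
    if not r_sig and not b_sig:
        return "non-contributing -- weak relationship with the criterion"
    if not r_sig and b_sig:
        return "suppressor -- no bivariate relationship, but contributes"
    return "suppressor -- bivariate & multivariate contribution have different signs"

# GRE in the Grad GPA example below: r = .38 (p = .03), b = .002 (p = .22)
print(bivariate_multivariate_pattern(r=.38, p_r=.03, b=.002, p_b=.22))
```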

Here's a depiction of the two different reasons that a predictor might not be contributing to a multiple regression model:
1. the variable isn't correlated with the criterion
2. the variable is correlated with the criterion, but is collinear with one or more other predictors (we can't tell which), and so has no independent contribution to the multiple regression model
[Venn diagram: x1 has a substantial r with the criterion and a substantial b; x2 has a substantial r with the criterion but a small b, because it is collinear with x1; x3 has neither a substantial r nor a substantial b]

Bivariate & Multivariate contributions -- DV = Grad GPA

predictor    r (p)        b (p)         pattern
age          .11 (.32)    .06 (.67)     non-contributing -- probably a weak relationship with the criterion
UGPA         .45 (.01)    1.01 (.02)    bivariate & multivariate contribution have the same sign
GRE          .38 (.03)    .002 (.22)    non-contributing -- probably collinearity with one or more other predictors
work hrs     -.15 (.29)   .023 (.01)    "suppressor" -- no bivariate relationship, but contributes (to this model)
#credits     .28 (.04)    -.15 (.03)    "suppressor" -- bivariate & multivariate contribution have different signs

Bivariate & Multivariate contributions -- DV = Pet Quality

predictor    r (p)        b (p)         pattern
#fish        -.10 (.31)   -.96 (.03)    "suppressor" -- no bivariate relationship, but contributes (to this model)
#reptiles    .48 (.01)    1.61 (.42)    non-contributing -- probably collinearity with one or more other predictors
ft²          -.28 (.04)   1.02 (.02)    "suppressor" -- bivariate & multivariate contribution have different signs
#employees   .37 (.03)    1.823 (.01)   bivariate & multivariate contribution have the same sign
#owners      -.08 (.54)   -.65 (.83)    non-contributing -- probably a weak relationship with the criterion

How to think about suppressor effects?
To be a suppressor, the variable must contribute to the multivariate model AND either not be correlated with the criterion OR be correlated with the criterion with the opposite sign of its bᵢ.
A suppressor effect means that "the part of the predictor that is not related to the other predictors" is better (or differently) related to the criterion than is "the whole predictor".
ft² from the last example…
-r -- pet quality is negatively correlated with store size
+b in the multiple regression -- pet quality is positively correlated with the part of store size that is not related to #fish, #reptiles, #employees & #owners
(the hard part is figuring out what to call "the part of store size that is not related to #fish, #reptiles, #employees & #owners")

What to do with a suppressor variable??
One common response is to "simplify the model" by dumping any suppressor variables from the model...
Another is to label the suppressor variable and then ignore it...
A much better approach is to determine which other variables in the equation are involved:
-- Look at the collinearities among the predictors (predictors that are positively correlated with some predictors and negatively correlated with others are the most likely to be involved in suppressor effects)
-- Check each 2-predictor, 3-predictor, etc. model (all the ways of including the target variable) to reproduce the suppressor effect -- see the sketch below (this is less complicated with variables you know well)
Then you can (sometimes) figure out an interesting & informative interpretation of the suppression. Suppression often indicates "mediational models" & sometimes interaction/moderation effects.
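Here is one way that sub-model check could be automated; this is a sketch under my own assumptions (a dict of NumPy arrays, hypothetical variable names), not a procedure given in the slides.

```python
from itertools import combinations
import numpy as np

def target_weight(y, data, target, others):
    """Raw b for `target` in a model that also contains the `others` predictors."""
    cols = [data[target]] + [data[o] for o in others]
    X = np.column_stack(cols + [np.ones(len(y))])
    return np.linalg.lstsq(X, y, rcond=None)[0][0]

def suppressor_trace(y, data, target):
    """Compare the target's simple r with its b in every sub-model containing it."""
    rest = [name for name in data if name != target]
    print(f"simple r({target}, y) = {np.corrcoef(data[target], y)[0, 1]:.3f}")
    for k in range(1, len(rest) + 1):
        for others in combinations(rest, k):
            print(f"b for {target} with {others}: "
                  f"{target_weight(y, data, target, others):.3f}")

# Toy usage with made-up variables
rng = np.random.default_rng(2)
d = {"x1": rng.normal(size=80), "x2": rng.normal(size=80), "x3": rng.normal(size=80)}
yv = d["x1"] - d["x2"] + rng.normal(0, 1, 80)
suppressor_trace(yv, d, "x2")
```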

While we're on this collinearity thing… it is often helpful to differentiate between three "levels" of collinearity:
1. Collinearity -- correlations among predictors
-- the stuff of life -- behaviors, attributes and opinions are related to each other
-- consequences -- forces us to carefully differentiate between the question asked of a simple correlation (whether or not a given predictor correlates with the criterion) vs. the question asked by multiple regression (whether or not a given predictor contributes to a particular model of that criterion)
Collinearity can be assessed using the "tolerance" statistic, which, for each predictor, is 1 - R² from predicting that predictor using all the other predictors (larger values are "better"); a computational sketch follows.
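A minimal sketch of the tolerance statistic just described (my own illustration; X is assumed to be a cases-by-predictors NumPy array):

```python
import numpy as np

def tolerances(X):
    """Tolerance for each predictor: 1 - R² from regressing it on all the others."""
    n, k = X.shape
    values = []
    for j in range(k):
        target = X[:, j]
        others = np.column_stack([np.delete(X, j, axis=1), np.ones(n)])
        fitted = others @ np.linalg.lstsq(others, target, rcond=None)[0]
        r2 = 1 - np.sum((target - fitted) ** 2) / np.sum((target - target.mean()) ** 2)
        values.append(1 - r2)          # larger tolerance = less collinearity
    return values
```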

2. Extreme collinearity -- one useful definition is when the collinearities are as large as or larger than the validities (the correlations between the predictors and the criterion)
-- need to consider whether the collinearity is really between the predictor constructs or between the predictor measures (do the predictors have overlapping elements?)
-- may need to "select" or "aggregate" to form a smaller set of predictors
3. Singularity -- when one or more predictors is perfectly correlated with one or more other predictors
-- be sure not to include as predictors both a set of variables and another variable that is their total (or mean)
-- will need to "select" or "aggregate" to form a smaller set of predictors

Another concern we have is "range restriction" -- when the variability of a predictor or criterion variable in the sample is less than the variability of the represented construct in the population. The consequence is that the potential correlation between that variable and others will be less than 1.00.
Two major sources of range restriction:
1. The sample doesn't represent the population of interest (examples -- selection research, analog research)
2. Poor fit between the sample and the measure used -- also called "floor" or "ceiling" effects (examples -- the MMPI with "normals", the BDI with inpatients)
Range restriction will yield a sample correlation that under-estimates the population correlation!!
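A quick simulation (my own illustration, not from the slides) of that attenuation: selecting only the upper half of the predictor's range noticeably shrinks the sample correlation.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=5000)
y = 0.6 * x + rng.normal(0, 0.8, 5000)

r_full = np.corrcoef(x, y)[0, 1]

# Keep only cases above the predictor's median -- a crude "selected" sample.
keep = x > np.median(x)
r_restricted = np.corrcoef(x[keep], y[keep])[0, 1]

print(f"full-range r = {r_full:.2f}, range-restricted r = {r_restricted:.2f}")
```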

Range restriction issues in multiple regression:
-- if the criterion is range restricted, the strength of the model will be underestimated and "good predictors" will be missed (Type II errors)
-- if all the predictors are range restricted -- same as above
The real problem (huge and almost impossible to avoid) is DIFFERENTIAL range restriction among the predictors -- the relative importance of the predictors, both as single predictors and as contributors to multiple regression models, will be misrepresented in the sample. This concern is why we don't just inspect β weights to determine which predictors are "more important" in a multiple regression model.

As we talked about, collinearity among the multiple predictors can produce several patterns of bivariate-multivariate results. There are three specific combinations you should be aware of (none of which is really common, but each can be perplexing if it isn't expected)…
1. Multivariate power -- sometimes a set of predictors, none of which is significantly correlated with the criterion, can produce a significant multivariate model (with one or more contributing predictors).
How's that happen? The error term for the multiple regression model, and for the test of each predictor's b, is related to 1 - R² of the model. Adding predictors will increase the R² and so lower the error term -- sometimes leading to the model, and one or more predictors, being "significant". This happens most often when one or more predictors have "substantial" correlations but the sample power is low.

2. Null wash-out -- sometimes a set of predictors with only one or two significant correlations with the criterion will produce a model that is not significant. Even worse, those significantly correlated predictors may or may not be significant contributors to the non-significant model.
How's that happen? The F-test of the model R² really (mathematically) tests the average contribution of all the predictors in the model:

F = (R² / k) / ((1 - R²) / (N - k - 1))

So, a model dominated by predictors that are not substantially correlated with the criterion might not have a large enough "average" contribution to be statistically significant. This happens most often when the sample power is low and there are many predictors.
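Written out as code (a sketch using SciPy for the p-value; the R², N, and k values are made up), the same R² spread over more predictors gives a smaller "average contribution" and a weaker F:

```python
from scipy import stats

def model_F(r2, n, k):
    """F-test of a model R² with k predictors and N cases."""
    F = (r2 / k) / ((1 - r2) / (n - k - 1))
    p = stats.f.sf(F, k, n - k - 1)
    return F, p

print(model_F(0.15, 60, 2))    # R² = .15 carried by 2 predictors
print(model_F(0.15, 60, 10))   # same R² diluted across 10 predictors
```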

3. Extreme collinearity -- sometimes a set of predictors, all of which are significantly correlated with the criterion, can produce a significant multivariate model in which none (or almost none) of the predictors makes a significant unique contribution.
How's that happen? Remember that in a multiple regression model each predictor's b weight reflects the unique contribution of that predictor to that model. If the predictors are more highly correlated with each other than with the criterion, then the "overlap" each has with the criterion is shared with one or more other predictors, and so no predictor has much unique contribution to that very successful (high R²) model.
[Venn diagram: y overlapped by the heavily overlapping predictors x1, x2, x3 & x4]
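A small simulation sketch of this situation (illustrative only; statsmodels assumed available): four predictors built around one shared component are all related to the criterion, yet their unique (b) contributions tend to be non-significant even when the model itself is clearly significant.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 60
core = rng.normal(size=n)                                   # component shared by all predictors
X = np.column_stack([core + rng.normal(0, 0.3, n) for _ in range(4)])
y = core + rng.normal(0, 1.0, n)                            # criterion relates to the shared part

fit = sm.OLS(y, sm.add_constant(X)).fit()
print("model R² =", round(fit.rsquared, 2), " model p =", round(fit.f_pvalue, 4))
print("p-values of the individual b weights:", np.round(fit.pvalues[1:], 3))
```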

Missing Data
Missing data happen for many different reasons, and how you treat the missing values is likely to change the results you get.
Casewise or listwise deletion -- only cases that have "complete data" on all the variables are used in any of the analyses; which cases those are can change as the variables used in the analysis change.
Pairwise analyses -- each analysis uses whatever cases have "complete data" for that particular analysis; which cases those are can change as the variables used in the analysis change.
In particular, watch for results of different analyses reported with different sample sizes -- or with no sample sizes at all.
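A tiny pandas sketch (hypothetical data) of why this matters: the pairwise correlation matrix and the listwise (complete-case) one are built on different cases and different Ns.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "x1": [1.0, 2.0, np.nan, 4.0, 5.0, 6.0],
    "x2": [2.0, 1.0, 3.0, np.nan, 5.0, 4.0],
    "y":  [1.5, 2.5, 2.0, 3.5, np.nan, 5.0],
})

pairwise = df.corr()              # pandas drops missing values pair by pair
listwise = df.dropna().corr()     # only the rows complete on every variable

# N actually used for each pair of variables under pairwise deletion
n_per_pair = df.notna().astype(int).T @ df.notna().astype(int)

print(n_per_pair)
print(pairwise.round(2))
print(listwise.round(2))
```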