
Getting More out of Multiple Regression
Darren Campbell, PhD

Overview
- View on teaching statistics
- When to apply these techniques
- How to use them & how to interpret the results

Multiple Regression Techniques
1. Centring: removing group-difference confounds
2. Centring: interpreting continuous interactions
3. Spline functions, i.e., piecewise polynomials: estimate a separate slope for each segment of the regression curve

Perks of Multiple Regression
1. Realistic: many influences shape behaviour
2. Control over confounds
3. Test the relative importance of predictors
4. Identify interactions

Why Not Use ANOVAs?
Not realistic: many behaviours and constructs are continuous, e.g., intelligence, personality.
Loss of statistical power: carving continuous scores into categories treats different scores as the same plus error, mixing systematic patterns into the error term.
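A small simulation can make the power loss concrete. This is a sketch with made-up data (assuming numpy and scipy are available), not the presenter's example:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical continuous predictor (e.g., an intelligence score)
# with a modest linear effect on the outcome.
x = rng.normal(100, 15, size=60)
y = 0.05 * x + rng.normal(0, 1, size=60)

# Treating X as continuous: test the relation directly.
r, p_corr = stats.pearsonr(x, y)

# ANOVA-style approach: median-split X into "low" vs "high" groups,
# treating every score within a category as the same.
high = x > np.median(x)
t, p_ttest = stats.ttest_ind(y[high], y[~high])

print(f"continuous:   r = {r:.2f}, p = {p_corr:.4f}")
print(f"median split: t = {t:.2f}, p = {p_ttest:.4f}")  # usually weaker
```

The score differences discarded by the split end up in the error term, so the categorical test is typically the less sensitive one.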

What is Centring?
A simple re-scaling of raw scores: Raw Score minus some constant value.
x1 – 5.1: 4.0 – 5.1 = -1.1
x2 – 29.4: 35.0 – 29.4 = 5.6

A Simple Case for Centring
Babies:
- Cry & fuss: parent-report diary measures
- Flail about: limb movement
Are these 2 infant behaviours related? (Emotional responses & emotion regulation)

A Simple Case for Centring
Age: Moves/Hr, Crying Hrs/Day
6 week olds: 5.1 moves/hr
6 month olds: 29.4 moves/hr
Full sample
Are these 2 infant behaviours related?

6 Week-Olds: r = +.47
Some infants cry more & move more; others cry less & move less.

6 Month-Olds: r = +.38
Some infants cry more & move more; others cry less & move less.
What if we combine the two groups?

Full Sample: r = -0.22
Do we get a significant correlation? If so, what kind?

What Happened with the Correlations?
6 week-olds: r = +0.47
6 month-olds: r = +0.38
6 week & 6 month-olds combined: r = -0.22

Correlations = Grand Mean Centring
1) Compute mean deviations for each variable, X & Y
2) Rank-order the mean deviations
3) Correlate the two rank orders of X & Y
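A quick sketch (made-up data) of the centring part of that claim: Pearson's r is built from grand-mean deviations, so centring either variable at its grand mean leaves r unchanged:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(5.1, 2.0, size=50)        # e.g., limb movements per hour
y = 0.5 * x + rng.normal(0, 1, size=50)  # e.g., crying hours

r_raw = np.corrcoef(x, y)[0, 1]
r_centred = np.corrcoef(x - x.mean(), y - y.mean())[0, 1]
print(r_raw, r_centred)  # identical: r already operates on mean deviations
```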

The Disappearing Correlation Explained
Grand mean centring led to:
- all the older infants being classified as high movers, and the young infants as low movers
- young infants who were high criers & high movers within their group becoming high criers & low movers overall
The large group difference in movement distorted the detection of the within-group r's.
What should we do?

Solution: Create Group Mean Deviations
Re-scale raw scores: Raw – Group Mean
6 week-olds: xs – 5.1
6 month-olds: xs – 29.4
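In code, group-mean centring is one line with pandas; a sketch with hypothetical column names and values:

```python
import pandas as pd

df = pd.DataFrame({
    "age_group": ["6wk", "6wk", "6wk", "6mo", "6mo", "6mo"],
    "moves":     [4.0, 5.5, 5.8, 25.0, 29.0, 34.2],
})

# Subtract each infant's own group mean instead of the grand mean.
df["moves_gc"] = df["moves"] - df.groupby("age_group")["moves"].transform("mean")
print(df)
```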

Solution: Create Group Mean Deviations
(Table: Crying, Raw AL, Group Means, Group Centred AL)

Raw Scores

Group Centred Scores
Group-mean-centred data: r = .41 for the full sample.
Multiple regression could also work on the uncentred variables: Crying = Group + Uncentred AL.
This is not a Group x AL interaction; the relation is the same for both groups.
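A sketch of that uncentred alternative (simulated data and hypothetical variable names; assumes statsmodels): the group dummy absorbs the mean difference that distorted the raw correlation, so uncentred AL can enter directly.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 40
df = pd.DataFrame({
    "age_group": ["6wk"] * n + ["6mo"] * n,
    # 6-month-olds move far more on average (the group confound)...
    "moves": np.concatenate([rng.normal(5.1, 2, n), rng.normal(29.4, 6, n)]),
})
# ...but within each group, moving more goes with crying more.
group_mean = df.groupby("age_group")["moves"].transform("mean")
df["crying"] = 0.3 * (df["moves"] - group_mean) + rng.normal(0, 1, 2 * n)

# Crying = Group + Uncentred AL: the C(age_group) term soaks up the
# group difference, leaving the within-group slope for moves.
model = smf.ols("crying ~ C(age_group) + moves", data=df).fit()
print(model.params)
```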

Centring So Far
1. Centring is magic
2. There are different types of centring, depending on the number used to re-scale the data:
- Grand mean: Pearson correlations
- Group means: infant limb movements

Regression Interactions
Centring is great for interpreting interactions, which are trickier than in ANOVAs: based on 2+ continuous variables, they do not have pre-defined levels or groups.

Multiple Regression - the Basics
The basic equation: Y = a + b1*X1 + b2*X2 + b3*X3 + e
Outcome = Intercept + Beta1*Predictor1 + Beta2*Predictor2 + Beta3*Predictor3 + Error
a = the expected value of Y when all predictors are 0
betas: for every 1-unit change in X, Y changes by beta units
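A minimal fit of that equation (made-up data; assumes statsmodels):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
df = pd.DataFrame(rng.normal(size=(100, 3)), columns=["x1", "x2", "x3"])
df["y"] = 2 + 1.5 * df.x1 - 0.5 * df.x2 + 0.8 * df.x3 + rng.normal(size=100)

fit = smf.ols("y ~ x1 + x2 + x3", data=df).fit()
print(fit.params)  # a (Intercept) plus the three betas
```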

Regression Interactions Centring: Reducing Multicollinearity
The interaction predictor is x1 * x2. With raw scores, x1 & x2 values near 0 stay near 0 and high x1 & x2 values get really high, so the interaction term is highly correlated with the original x1 & x2 variables.
Centring makes each predictor, x1 & x2, take more moderate values: positive and negative numbers above and below zero. This reduces the multiplicative exaggeration between x1 & x2 and the interaction product x1*x2.
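The point is easy to verify numerically; a sketch using simulated predictors with means and SDs like those in the example later in the talk:

```python
import numpy as np

rng = np.random.default_rng(3)
x1 = rng.normal(26.2, 14.5, size=200)  # mostly positive raw scores
x2 = rng.normal(24.8, 27.6, size=200)

r_raw = np.corrcoef(x1, x1 * x2)[0, 1]

x1c, x2c = x1 - x1.mean(), x2 - x2.mean()
r_centred = np.corrcoef(x1c, x1c * x2c)[0, 1]

print(f"uncentred: r(x1,  x1*x2)   = {r_raw:.2f}")      # typically large
print(f"centred:   r(x1c, x1c*x2c) = {r_centred:.2f}")  # near zero
```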

Centring to reduce Multicollinearity

Regression
Y = a + b1*X1 + b2*X2 + b3*X1*X2 + e
How does X2 relate to Y at different levels of X1?
How does predictor 2 (shyness) relate to the outcome (social interactions) at different stress levels (X1)?

Correlation Matrix (** p = .01, * p = .05)
Uncentred data: X1 = 26.2 (14.5), X2 = 24.8 (27.6)
Centred data: X1c = 0.0 (14.5), X2c = 0.0 (27.6)
The uncentred product x1*x2 correlates strongly with x1 and x2; the centred product x1c*x2c correlates far less with x1c and x2c.

Regression Equation Results
No interaction: Y = b0 + b1*X1 + b2*X2
Uncentred and centred fits give the same slopes, b1 = -4 for X1 and a significant X2 coefficient; only the intercept changes.

Regression Equation Results
Interaction term included: Y = b0 + b1*X1 + b2*X2 + b3*X1*X2
Uncentred: Y = 1733 – 19.1*X1 – 31.7*X2** + b3*X1*X2
Centred: Y = b0 + b1*X1 + b2*X2 + b3*X1*X2 (different coefficients, identical overall fit)

But what does it mean… How does X2 relate to Y at different levels of X1? How does predictor 2 (shyness) relate to the outcome (social interactions) at different stress levels (X1)?

Post Hocs
Y = b0 + b1*X1 + b2*X2 + b3*X1*X2
Rearranged: Y = (b0 + b1*X1) + (b2 + b3*X1)*X2
Evaluate the X2 slope at -1 SD below the X1 mean and at +1 SD above the X1 mean.
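A sketch of those post hocs on simulated data (assumes statsmodels): re-centre X1 at the level of interest and re-fit; the coefficient and t-test for X2 then give its simple slope at that level of X1.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
df = pd.DataFrame({"x1": rng.normal(0, 14.5, 200),
                   "x2": rng.normal(0, 27.6, 200)})
df["y"] = 0.5 * df.x1 + 0.3 * df.x2 + 0.02 * df.x1 * df.x2 + rng.normal(0, 5, 200)

sd = df.x1.std()
for label, level in [("-1 SD", -sd), ("mean ", 0.0), ("+1 SD", +sd)]:
    d = df.assign(x1s=df.x1 - level)  # x1s = 0 at the chosen level of x1
    fit = smf.ols("y ~ x1s * x2", data=d).fit()
    print(label, f"x2 slope = {fit.params['x2']:.3f}, p = {fit.pvalues['x2']:.3f}")
```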

Scatterplots: Moving the Y Axis

-1 SD below X1 mean: t(1,196) = -1.40, p = .16
Centred (at the X1 mean): t(1,196) = 0.12, p > .05
+1 SD above X1 mean: X2 slope significant**, t(1,196) = 3.66, p = .001

Regression Interaction Example
Predicting inhibitory ability from motor activity & age:
- Simon Says-like games
- 4 to 6 yr-olds & physical movement
- Move by Age interaction: F(1, 81) = 5.9, p < .02
- Young (-1.5 SD): movement beta significantly positive for inhibition
- Middle (mean): movement beta p = .10, marginal for inhibition
- Older (+1.5 SD): movement beta n.s. for inhibition

Polynomials, Centring, & Spline Functions
Polynomial relations: quadratic, cubic, etc.
Y = a + b1*X1 - b2*X1*X1 + e
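Fitting a quadratic is just regression with X and X squared as predictors; a sketch on made-up data with a negative b2 (an inverted U):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
x = rng.uniform(-3, 3, 150)
y = 1 + 2 * x - 0.8 * x**2 + rng.normal(0, 1, 150)  # inverted-U pattern

X = sm.add_constant(np.column_stack([x, x**2]))  # columns: 1, X, X^2
fit = sm.OLS(y, X).fit()
print(fit.params)  # a, b1, b2 (negative b2: the curve opens downward)
```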

Curvilinear Pattern
A quadratic (X²) term assumes a symmetric pattern, but the real pattern may not be symmetric: e.g., Perceived Control (Y) slowly increases and then declines rapidly in old age.

This Brings Us to Spline Functions
Split the predictor X into 2+ variables, X_Low & X_High:
X_Low = X – the change point (e.g., -5), with values past the change point set to zero; ditto for X_High.
Re-run: Y = a + b1*X_Low + b2*X_High + e
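One common way to build the two spline predictors (a sketch; the knot location and data here are hypothetical): code the slope below the change point in X_Low and the slope above it in X_High.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
x = rng.uniform(20, 90, 300)   # e.g., age in years
knot = 70.0                    # hypothetical change point
# Slow rise before the knot, steep decline after it (asymmetric pattern).
y = (0.05 * np.minimum(x, knot)
     - 0.4 * np.maximum(x - knot, 0)
     + rng.normal(0, 0.5, 300))

x_low = np.minimum(x, knot)        # grows up to the knot, flat afterwards
x_high = np.maximum(x - knot, 0)   # zero before the knot, grows afterwards

fit = sm.OLS(y, sm.add_constant(np.column_stack([x_low, x_high]))).fit()
print(fit.params)  # intercept, slope below the knot, slope above the knot
```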

Perks of Spline Functions
- Estimate the slope anywhere along the range of X
- The relation can be significant on one part of the range and n.s. on another
- Slopes can be steeper or shallower in different segments

Multiple Regression Techniques
1. Centring: removing group-difference confounds
2. Centring: interpreting continuous interactions
3. Spline functions: a more precise understanding of polynomial patterns

Questions
Alpha control procedures for spline functions?
- It could be argued that you are simply describing a pattern already identified.
- Conservatively, you could apply an alpha control procedure; I like the False Discovery Rate procedures.
- Replication is preferred, but not always possible.

Alpha Control Aside
The source of Type 1 errors is typically poorly described.
Typical account: if enough probability tests are run, eventually something becomes significant just by chance.
- But probability is linked to the representativeness of your data, and a Type 1 error is a proxy for how likely your data are to be unrepresentative.
My view: the real source of Type 1 errors is that if you divide the data into enough subgroupings, eventually one of those subgroupings will differ simply because it misrepresents reality.

Standardized vs Centred
Centred: x – x_M
Standardized: (x – x_M) / SD_x
- Makes the variability of each predictor = 1
- Standardized beta = raw b * SD_x / SD_y
- Similar to centring, but the different metric needs to be adjusted for interaction terms
To get comparable results with an interaction term, standardization should be applied to X1 and X2 prior to forming the X1*X2 product; then interpret the "raw" coefficients.
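A sketch of that recipe (made-up data; assumes statsmodels): z-score X1 and X2 first, build the product from the z-scores, and read the resulting "raw" coefficients.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
df = pd.DataFrame({"x1": rng.normal(26, 14, 200),
                   "x2": rng.normal(25, 28, 200)})
df["y"] = 0.4 * df.x1 + 0.2 * df.x2 + 0.01 * df.x1 * df.x2 + rng.normal(0, 5, 200)

# Standardize the predictors *before* forming the product term...
df["z1"] = (df.x1 - df.x1.mean()) / df.x1.std()
df["z2"] = (df.x2 - df.x2.mean()) / df.x2.std()
df["z1z2"] = df.z1 * df.z2  # the product itself is not re-standardized

# ...then interpret the unstandardized coefficients of this model.
fit = smf.ols("y ~ z1 + z2 + z1z2", data=df).fit()
print(fit.params)
```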

Centring and Spline Functions
Relatively simple procedures: old dogs in the statistics world, but new tricks for many.
That's All Folks!