Multiple Regression.

Regression
- Attempts to predict one criterion variable using one predictor variable
- Addresses the question: Does the predictor significantly predict the criterion?

Multiple Regression
- Attempts to predict one criterion variable using 2+ predictor variables
- Addresses the questions: Do the predictors significantly predict the criterion? If so, which predictor is best?
- Allows for variance to be removed from one predictor prior to evaluating the rest (like ANCOVA)

How to compare the predictive value of 2+ predictors
When comparing multiple predictors within an experiment:
- Use the standardized b (β): β = b(s_predictor / s_criterion)
- Like a z-score, β lets you compare performance between 2 variables with different metrics, by expressing performance relative to a sample mean & SD
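A minimal sketch of this conversion in Python (numpy assumed here in place of SPSS; the data and variable names are made up):

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(50, 10, 200)           # predictor on an arbitrary metric
y = 2.0 * x + rng.normal(0, 15, 200)  # criterion

b = np.polyfit(x, y, 1)[0]                 # unstandardized slope
beta = b * x.std(ddof=1) / y.std(ddof=1)   # beta = b * (s_x / s_y)

# Equivalent check: regress z-scored y on z-scored x; that slope is beta.
zx = (x - x.mean()) / x.std(ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)
print(beta, np.polyfit(zx, zy, 1)[0])      # the two values agree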

When comparing multiple predictors between experiments:
- Use the unstandardized b
- SEs are highly variable between experiments: the SE from Exp. 1 ≠ the SE from Exp. 2, so β's from the two experiments are not comparable
- Analogy: you can't compare the z-score of your Stats grade from this semester with your Stats grade if you take the class again next semester; if next semester's class is especially dumb, you appear to have gotten much smarter

The magnitude of the relationship between one predictor and a criterion (b/β) in a model depends on the other predictors in that model
- E.g., the relationship between IQ and SES (with College GPA and Parents' SES in the model) will be different if more, fewer, or different predictors are included in the model

When comparing the results of 2 experiments using regression, the coefficients (b/β) will not be the same; they will be similar only to the extent that the regression models are similar. Why not?

Coefficients (b/β) represent partial and semipartial (part) correlations, not traditional Pearson's r
- Partial Correlation: the correlation between 2 variables with the variance from one or more other variables removed, i.e. the correlation between the residuals of both variables once variance from one or more covariates has been removed

- Partial Correlation = the amount of the variance in a criterion that is associated with a predictor and could not be explained by the covariate(s)

- Semipartial (Part) Correlation: the correlation between 2 variables with the variance from one or more other variables removed from the predictor only (i.e. not the criterion); i.e. the correlation between the criterion and the residuals of the predictor once variance from one or more covariates has been removed

- Part Correlation = the amount of variance that a predictor explains in a criterion once variance shared with the covariates has been removed from the predictor, i.e. the percentage of the criterion's total variance that the predictor uniquely accounts for
- Since the variance that is removed depends on the other predictors in the model, different models yield different regression coefficients

[Venn diagram of overlapping variance regions: Partial Correlation = B; Part Correlation = B/(A + B)]
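Both quantities can be computed from residuals, as described above. A sketch in Python (numpy; simulated data, with c playing the covariate):

import numpy as np

rng = np.random.default_rng(1)
c = rng.normal(size=300)                            # covariate
x = 0.6 * c + rng.normal(size=300)                  # predictor
y = 0.5 * x + 0.4 * c + rng.normal(size=300)        # criterion

def residuals(v, covariate):
    # Residuals of v after removing variance shared with the covariate
    slope, intercept = np.polyfit(covariate, v, 1)
    return v - (slope * covariate + intercept)

x_res = residuals(x, c)
y_res = residuals(y, c)

partial = np.corrcoef(x_res, y_res)[0, 1]    # covariate removed from both
semipartial = np.corrcoef(x_res, y)[0, 1]    # covariate removed from x only
print(partial, semipartial)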

How to compare the predictive value of 2+ predictors
- Remember: regression coefficients are very unstable from sample to sample, so interpret only large differences in coefficients (> ~.2)

Like simple regression, multiple regression tests:
- The ability of each predictor to predict the criterion variable (tests of the b's/β's)
- The overall ability of the model (all predictors combined) to predict the criterion variable (Model R²)
  - Model R² = total % of variance in the criterion accounted for by the predictors
  - Model R = correlation between the predictors and the criterion
It can also test:
- Whether one or more predictors can predict the criterion when variance from one or more other predictors is removed
- Whether each predictor significantly increases the Model R²
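For illustration, a sketch of fitting a two-predictor model and reading off these quantities, using Python's statsmodels (an assumed substitute for SPSS; data simulated):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 2))                             # two predictors
y = 1.0 + 0.8 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(size=100)

model = sm.OLS(y, sm.add_constant(X)).fit()
print(model.rsquared)     # Model R^2: % of criterion variance explained
print(model.params)       # intercept and the two b's
print(model.pvalues)      # significance test for each coefficient
print(model.f_pvalue)     # omnibus test of the whole model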

Predictors are evaluated with variance from the other predictors removed; there is more than one way to remove this variance:
- Examine all predictors en masse, each with variance from all other predictors removed
- Or remove variance from one or more predictors first, then look at the second set (like in factorial ANCOVA)

This is done by specifying different selection methods
- Selection method = the method of entering predictors into a regression equation
- Four most commonly used methods ("commonly used" = the only 4 methods offered by SPSS)

Selection Methods
- Simultaneous: adds all predictors at once & is therefore the lack of a selection method. Good if there is no theory to guide which predictors should be entered first; but when does this ever happen?

- All Subsets: the computer finds the combination of predictors that maximizes the overall Model R². But SPSS doesn't offer it, and it finds the best subset in your particular dataset; since data, not theory, guide the selection, there is no guarantee that the model will generalize to other datasets, particularly with smaller samples

- Backward Elimination: starts with all predictors in the model and iteratively eliminates the predictor with the least unique variance related to the criterion, until all remaining predictors are significant (iterative = a process involving several steps). Because it begins with all predictors, predictors with the least variance not overlapping with other predictors (i.e. that would be partialled out) are removed. But it is also atheoretical/based on data only

- Forward Selection: the opposite of backward elimination; starts with the predictor most strongly related to the criterion and iteratively adds the next most strongly related predictor until a nonsignificant predictor is found. Step 1: enter the predictor most correlated with the criterion (P1). Step 2: add the strongest remaining predictor with P1 partialled out. But also atheoretical; a toy implementation is sketched below
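The toy loop below (Python/statsmodels; an assumed illustration, not SPSS's exact algorithm) shows the idea: at each step, add the candidate whose coefficient has the smallest p-value, and stop when the best remaining candidate is nonsignificant:

import numpy as np
import statsmodels.api as sm

def forward_select(X, y, alpha=0.05):
    selected, remaining = [], list(range(X.shape[1]))
    while remaining:
        best_p, best_j = 1.0, None
        for j in remaining:
            cols = selected + [j]
            fit = sm.OLS(y, sm.add_constant(X[:, cols])).fit()
            p = fit.pvalues[-1]      # p-value of the newest candidate
            if p < best_p:
                best_p, best_j = p, j
        if best_j is None or best_p >= alpha:
            break                    # best candidate is nonsignificant
        selected.append(best_j)
        remaining.remove(best_j)
    return selected

rng = np.random.default_rng(9)
X = rng.normal(size=(150, 4))
y = 0.9 * X[:, 2] + 0.4 * X[:, 0] + rng.normal(size=150)
print(forward_select(X, y))          # typically [2, 0]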

- Stepwise: technically, any selection method that proceeds iteratively (in steps) is stepwise (i.e. both backward elimination and forward selection). However, the term usually refers to the method where the order of predictors is determined in advance by the researcher based upon theory

Stepwise: why would you use it? Same reason as covariates in ANCOVA
- Want to know if Measure A of treatment adherence is better than Measure B? Run a stepwise regression with treatment outcome as the criterion, entering Measure B first and then Measure A. This addresses the question: does Measure A predict treatment outcome even when variance from Measure B has already been removed (i.e. above and beyond Measure B)?

- Running a repeated-measures design and want to make sure your groups are equal on pre-test scores? Enter the pre-test into the first step of your regression
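In regression terms, that two-step entry looks like the following sketch (Python/statsmodels; measure_a, measure_b, and outcome are hypothetical names, data simulated):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
measure_b = rng.normal(size=150)
measure_a = 0.5 * measure_b + rng.normal(size=150)
outcome = 0.4 * measure_a + 0.3 * measure_b + rng.normal(size=150)

step1 = sm.OLS(outcome, sm.add_constant(measure_b)).fit()
X2 = sm.add_constant(np.column_stack([measure_b, measure_a]))
step2 = sm.OLS(outcome, X2).fit()

print(step2.rsquared - step1.rsquared)  # R^2 change from adding measure_a
print(step2.compare_f_test(step1))      # F-test: is that increase significant?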

Assumptions
- Linearity of Regression: variables are linearly related to one another
- Normality in Arrays: actual values of the DV are normally distributed around the predicted values (i.e. the regression line); AKA the regression line is a good approximation of the population parameter
- Homogeneity of Variance in Arrays: the variance of the criterion is equal at all levels of the predictor(s)

Issues to be aware of:
- Range restriction
- Heterogeneous subsamples
- Outliers: with multiple predictors, you must be aware of both univariate outliers (unusual values on one variable) and multivariate outliers (unusual values on two or more variables)

Outliers
- Univariate outlier: a man weighing 500 lbs.
- Multivariate outlier: a man who is 6' tall and weighs 120 lbs. Note that neither value is a univariate outlier, but both together are quite odd
Three quantities define the presence of an outlier in multiple regression:
- Distance: distance from the regression line
- Leverage: distance from the predictor mean
- Influence: a combination of distance and leverage

[Scatterplot: point A illustrates distance (distance from the regression line); point B illustrates leverage (distance from the predictor mean); influence combines the two]
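statsmodels exposes diagnostics for all three; a sketch on simulated data (studentized residuals, hat values, and Cook's distance are standard stand-ins for distance, leverage, and influence):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
X = sm.add_constant(rng.normal(size=(100, 2)))
y = X @ np.array([1.0, 0.5, 0.5]) + rng.normal(size=100)

fit = sm.OLS(y, X).fit()
infl = fit.get_influence()

distance = infl.resid_studentized_internal  # distance from regression line
leverage = infl.hat_matrix_diag             # distance from predictor means
influence = infl.cooks_distance[0]          # combines distance and leverage
print(distance[:3], leverage[:3], influence[:3])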

Degree of Overlap in Predictors
- Adding predictors is like adding covariates in ANCOVA: adding one that correlates too highly with the others leaves the Model R² essentially unchanged but decreases df, making the regression less powerful
- Tolerance = 1 − the multiple R² between a predictor and all other predictors; you want it to be high (low tolerance signals multicollinearity)
- Examine the bivariate correlations between predictors; if a correlation exceeds the measures' internal consistency (α), get rid of one of them
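A sketch of checking tolerance via the variance inflation factor (tolerance = 1/VIF), using statsmodels on deliberately collinear simulated predictors:

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(5)
x1 = rng.normal(size=200)
x2 = 0.9 * x1 + rng.normal(scale=0.3, size=200)   # deliberately collinear
X = sm.add_constant(np.column_stack([x1, x2]))

for j in (1, 2):                                  # skip the constant (col 0)
    vif = variance_inflation_factor(X, j)
    print(f"x{j}: VIF={vif:.2f}, tolerance={1/vif:.2f}")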

Multiple regression can also test for more complex relationships, such as mediation and moderation
- Mediation: when one variable (the predictor) operates on another variable (the criterion) via a third variable (the mediator)

Example: math self-efficacy mediates the relationship between math ability and interest in a math major. You must establish paths A (ability → self-efficacy) & B (self-efficacy → interest), and that the direct path C (ability → interest) is smaller when paths A & B are included in the model (i.e. math self-efficacy accounts for variance in interest in a math major above and beyond math ability)

Steps:
1. Find significant correlations between the predictor and mediator (path A) and between the mediator and criterion (path B)
2. Run a stepwise regression with the predictor entered first, then the predictor and mediator entered together in step 2

- The mediator should be a significant predictor of the criterion in step 2
- The predictor-criterion relationship (b/β) should decrease from step 1 to step 2
- Full mediation: the relationship is significant in step 1 but nonsignificant in step 2
- Partial mediation: the relationship is significant in step 1, and smaller, but still significant, in step 2

Sobel's test (1982) tests the statistical significance of this mediation relationship:
1. Regress the mediator on the predictor (path A) and the criterion on the mediator (path B) in 2 separate regressions
2. Calculate s_β for paths A & B, where s_β = β/t
3. Calculate a t-statistic with df = n − 3:
   t = (β_A × β_B) / √(β_B² × s_A² + β_A² × s_B²)
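A sketch of that recipe in Python (statsmodels; the variable names echo the math-major example above, data simulated):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 200
ability = rng.normal(size=n)                    # predictor
efficacy = 0.5 * ability + rng.normal(size=n)   # mediator
interest = 0.6 * efficacy + rng.normal(size=n)  # criterion

def path(x, y):
    # Standardized slope (beta) and its SE from a simple regression
    zx = (x - x.mean()) / x.std(ddof=1)
    zy = (y - y.mean()) / y.std(ddof=1)
    fit = sm.OLS(zy, sm.add_constant(zx)).fit()
    return fit.params[1], fit.bse[1]

beta_a, s_a = path(ability, efficacy)    # path A: predictor -> mediator
beta_b, s_b = path(efficacy, interest)   # path B: mediator -> criterion

sobel = (beta_a * beta_b) / np.sqrt(beta_b**2 * s_a**2 + beta_a**2 * s_b**2)
print(sobel)   # compare to t with df = n - 3 (per the slides)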

Multiple regression can also test for more complex relationships, such as mediation and moderation
- Moderation (in regression): when the strength of a predictor-criterion relationship changes as a result of a third variable (the moderator)
- Cf. interaction (in ANOVA): when the strength of the relationship between an IV and DV changes as a function of the levels of another IV

Moderation
- Unlike in ANOVA, you have to create the moderator (interaction) term yourself by multiplying the predictor and moderator; in SPSS, go to Transform → Compute
- It is typical to enter the predictor and moderator in the first step of a regression and the interaction term in the second step, to determine the contribution of the interaction above and beyond the main effect terms; just like how variance is partitioned in a factorial ANOVA
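A sketch of that two-step moderation test (Python/statsmodels; the by-hand product term plays the role of SPSS's Transform → Compute, data simulated):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 200
pred = rng.normal(size=n)
mod = rng.normal(size=n)
y = 0.4 * pred + 0.3 * mod + 0.5 * pred * mod + rng.normal(size=n)

interaction = pred * mod                # the moderator (interaction) term

step1 = sm.OLS(y, sm.add_constant(np.column_stack([pred, mod]))).fit()
step2 = sm.OLS(y, sm.add_constant(np.column_stack([pred, mod, interaction]))).fit()
print(step2.rsquared - step1.rsquared)  # unique contribution of the interaction
print(step2.compare_f_test(step1))      # is that contribution significant?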

Logistic Regression
- Logistic Regression = used to predict a dichotomous criterion variable (only 2 levels) with 1+ continuous or discrete predictors
- Can't use linear regression with a dichotomous criterion because: a dichotomous criterion isn't normally distributed around the regression line (i.e. the assumption of normality in arrays is violated)

- And because the regression line fits the data more poorly at some levels of the predictor than others (i.e. the assumption of homogeneity of variance in arrays is violated)

Interpreting coefficients
- In logistic regression, b represents the change in the log odds of the criterion with a one-point increase in the predictor
- Raise e to the power b to find the change in odds: e.g. b = −.0812 → e^−.0812 = .9220

- Continuous predictor: a one-point increase in the predictor corresponds to multiplying the odds of the criterion by .922, a decrease (because b is negative) of about 8%
- Dichotomous predictor: the change in odds for one group vs. the other group (the sign indicates an increase or decrease)
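A sketch of fitting a logistic model and converting b to odds with e^b (Python/statsmodels; simulated data):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
x = rng.normal(size=300)
p = 1 / (1 + np.exp(-(0.5 - 0.8 * x)))   # true log-odds model
y = rng.binomial(1, p)                   # dichotomous criterion

fit = sm.Logit(y, sm.add_constant(x)).fit(disp=0)
b = fit.params[1]
print(b, np.exp(b))   # e.g. b = -.0812 -> e^b = .922: odds fall ~8%
                      # per one-point increase in the predictor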