Linear Regression — 1 Oct 2010, CPSY501, Dr. Sean Ho, Trinity Western University
Please download from "Example Datasets": Record2.sav, ExamAnxiety.sav
Outline for today
- Correlation and Partial Correlation
- Linear Regression (Ordinary Least Squares)
- Using Regression in Data Analysis
- Requirements: Variables
- Requirements: Sample Size
- Assignments & Projects
Causation from correlation?
- Ancient proverb: "correlation ≠ causation"
- But sometimes correlation can suggest causality, in the right situations!
- We need more than 2 variables, and it's:
  - Not the "ultimate" (only, direct) cause, but
  - The degree of influence one variable has on another ("causal relationship", "causality")
- Is a significant fraction of the variability in one variable explained by the other variable(s)?
How to get causality?
- Use theory and prior empirical evidence
  - e.g., temporal sequencing: a later measurement (T2) cannot cause an earlier one (T1)
  - Order of causation? (gender & career aspirations)
- Best is a controlled experiment:
  - Randomly assign participants to groups
  - Change the level of the IV for each group
  - Observe the impact on the DV
- But this is not always feasible or ethical!
- Still need to "rule out" other potential factors
Potential hidden factors
What other variables might explain these correlations?
- Height and hair length: e.g., both are related to gender
- Increased time in bed and suicidal thoughts: e.g., depression can cause both
- Number of pastors and number of prostitutes (!): e.g., city size
To find causality, we must rule out hidden factors (e.g., cigarettes and lung cancer).
Example: immigration study
[diagram: Level of Acculturation → Psychological Well-being, measured at immigration and at a 1-year follow-up]
Partial Correlation
- Purpose: to measure the unique relationship between two variables after the effects of other variables are "controlled for"
- Two variables may have a large correlation, but:
  - It may be largely due to the effects of other moderating variables
  - So we must factor out their influence, and
  - See what correlation remains (the partial correlation)
- SPSS's algorithm assumes parametricity
  - Non-parametric methods exist, though
Visualizing Partial Correlation
[diagram: a moderating variable (e.g., city size) influences both Variable 1 (e.g., pastors) and Variable 2 (e.g., prostitutes); the partial correlation is what remains between Variables 1 and 2 once the moderator is controlled for — direction of causation unknown]
Practise: Partial Correlation
Dataset: Record2.sav
- Analyze → Correlate → Partial
  - Variables: adverts, sales
  - Controlling for: airplay, attract
- Or: Analyze → Regression → Linear
  - Dependent: sales; Independent: all other variables
  - Statistics → "Part and partial correlations"; uncheck everything else
  - Gives all partial r, plus the regular (zero-order) r
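The idea behind partial correlation can also be sketched outside SPSS. A minimal NumPy illustration with made-up data (all variable names and values here are hypothetical, chosen so a lurking variable drives both observed variables): correlate the residuals that remain after regressing each variable on the controls.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical data: a lurking variable z drives both x and y,
# so their zero-order correlation is spurious.
z = rng.normal(size=n)                 # e.g. city size
x = z + rng.normal(scale=0.5, size=n)  # e.g. number of pastors
y = z + rng.normal(scale=0.5, size=n)  # e.g. number of prostitutes

def residuals(a, controls):
    """Residuals of a after regressing it on the control variable(s)."""
    X = np.column_stack([np.ones(len(a)), controls])
    coef, *_ = np.linalg.lstsq(X, a, rcond=None)
    return a - X @ coef

# Partial correlation of x and y, controlling for z:
# correlate what is left of each after z's influence is removed.
r_xy = np.corrcoef(x, y)[0, 1]                                # zero-order r (large)
r_xy_z = np.corrcoef(residuals(x, z), residuals(y, z))[0, 1]  # partial r (near 0)
print(round(r_xy, 2), round(r_xy_z, 2))
```

With these simulated data the zero-order correlation is large while the partial correlation is near zero — the "pastors and prostitutes" pattern from the earlier slide.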
Linear Regression
When we use correlation to try to infer the direction of causation, we call it regression: combining the influence of a number of variables (predictors) to determine their total effect on another variable (the outcome).
Simple Regression: 1 predictor
Simple regression is predicting scores on an outcome variable from a single predictor variable.
- Mathematically equivalent to bivariate correlation
OLS Regression Model
In Ordinary Least-Squares (OLS) regression, the "best" model is defined as the line that minimizes the error between model and data (the sum of squared differences).
Conceptual form of the regression line (General Linear Model):
  Y = b0 + b1 X1 + (b2 X2 + … + bn Xn) + ε
where Y is the outcome, b0 the intercept, b1…bn the gradients (slopes), X1…Xn the predictors, and ε the error.
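A minimal NumPy sketch of the single-predictor case, fitting Y = b0 + b1·X by least squares (the data and the true coefficients are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=200)
Y = 3.0 + 2.0 * X + rng.normal(scale=1.0, size=200)  # true b0 = 3, b1 = 2

# Least-squares solution: lstsq finds the (b0, b1) that minimize
# the sum of squared differences between model and data.
A = np.column_stack([np.ones_like(X), X])
(b0, b1), *_ = np.linalg.lstsq(A, Y, rcond=None)
print(round(b0, 1), round(b1, 1))   # recovers values close to 3 and 2
```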
Assessing Fit of a Model
- R²: proportion of variance in the outcome accounted for by the predictors: SS_model / SS_total
  - Generalization of r² from correlation
- F ratio: variance attributable to the model divided by variance unexplained by the model
  - The F ratio converts to a p-value, which shows whether the "fit" is good
- If the fit is poor, the predictors may not be linearly related to the outcome variable
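These fit statistics can be computed directly from the sums of squares. A sketch on simulated data (the variables and effect size are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
x = rng.normal(size=n)
y = 0.6 * x + rng.normal(scale=0.8, size=n)

# Fit the simple regression and get predicted scores
A = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ coef

# R^2 = SS_model / SS_total; F = (SS_model/df_model) / (SS_resid/df_resid)
ss_total = np.sum((y - y.mean()) ** 2)
ss_resid = np.sum((y - y_hat) ** 2)
ss_model = ss_total - ss_resid

k = 1                                                # number of predictors
r_squared = ss_model / ss_total
f_ratio = (ss_model / k) / (ss_resid / (n - k - 1))
print(round(r_squared, 2), round(f_ratio, 1))
```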
Example: Record Sales
Dataset: Record2.sav
- Outcome variable: record sales
- Predictor: advertising budget
- Analyze → Regression → Linear
  - Dependent: sales; Independent: adverts
  - Statistics → Estimates, Model fit, Part and partial correlations
Record2: Interpreting Results
- Model Summary: R Square = .335
  - adverts explains 33.5% of the variability in sales
- ANOVA: F = , Sig. = .000
  - There is a significant effect
  - Report: R² = .335, F(1, 198) = , p < .001
- Coefficients:
  - (Constant) B = (Y-intercept)
  - Unstandardized adverts B = .096 (slope)
  - Standardized adverts Beta = .578 (uses variables converted to z-scores)
- Linear model: Ŷ = .578 * AB_z + 134
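The relationship between the unstandardized slope B and the standardized Beta can be sketched in NumPy: Beta is B rescaled by the predictor and outcome standard deviations, which is the same slope you get after converting both variables to z-scores. The data below are simulated stand-ins, not the actual Record2.sav values:

```python
import numpy as np

rng = np.random.default_rng(5)
adverts = rng.uniform(0, 2000, size=200)                     # hypothetical budgets
sales = 100 + 0.1 * adverts + rng.normal(scale=60, size=200)

# Unstandardized slope B from the raw-score regression
A = np.column_stack([np.ones(200), adverts])
(b0, b), *_ = np.linalg.lstsq(A, sales, rcond=None)

# Beta = B * sd(predictor) / sd(outcome)
beta = b * adverts.std(ddof=1) / sales.std(ddof=1)

# Same slope, obtained by regressing z-scores on z-scores
z = lambda v: (v - v.mean()) / v.std(ddof=1)
Az = np.column_stack([np.ones(200), z(adverts)])
(b0z, beta_z), *_ = np.linalg.lstsq(Az, z(sales), rcond=None)
print(round(beta - beta_z, 10))   # the two computations agree
```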
General Linear Model
Actually, nearly all parametric methods can be expressed as generalizations of regression!
- Multiple Regression: several scale predictors
- Logistic Regression: categorical outcome
- ANOVA: categorical IV
- t-test: dichotomous IV
- ANCOVA: mix of categorical + scale IVs
- Factorial ANOVA: several categorical IVs
- Log-linear: categorical IVs and DV
We'll talk about each of these in detail later.
Regression Modelling Process
1. Develop RQ: IVs/DVs, metrics; calculate the required sample size and collect data
2. Clean: data-entry errors, missing data, outliers
3. Explore: assess requirements; transform if needed
4. Build model: try several models, see what fits
5. Test model, "diagnostic" issues: multivariate outliers, overly influential cases
6. Test model, "generalizability" issues: regression assumptions; rebuild the model if needed
Selecting Variables
- According to your model or theory, what variables might relate to your outcomes?
- Does the literature suggest important variables?
- Do the variables meet all the requirements for an OLS multiple regression? (more on this next week)
Record sales example:
- DV: what is a possible outcome variable?
- IV: what are possible predictors, and why?
Choosing Good Predictors
- It's tempting to just throw in hundreds of predictors and see which ones contribute most
  - Don't do this! There are requirements on how the predictors interact with each other!
  - Also, more predictors → less power
- Have a theoretical or prior justification for each predictor
- Example: pick a DV you are interested in, then come up with as many good IVs as you can — with a justification for each: background, internal personal, and current external environment
Using Derived Variables
You may want to use derived variables in regression, for example:
- Transformed variables (to satisfy assumptions)
- Interaction terms ("moderating" variables), e.g., Airplay * Advertising Budget
- Dummy variables, e.g., coding for categorical predictors
- Curvilinear variables (non-linear regression), e.g., looking at X² instead of X
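Constructing these derived predictors is just column arithmetic before fitting the model. A sketch in NumPy (the variable names mirror the slide's examples but the data are invented):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
airplay = rng.uniform(0, 40, size=n)               # hypothetical radio plays
adverts = rng.uniform(0, 1000, size=n)             # hypothetical ad budget
province = rng.choice(["BC", "AB", "SK"], size=n)  # hypothetical category

interaction = airplay * adverts         # moderation (interaction) term
adverts_sq = adverts ** 2               # curvilinear term (X^2 instead of X)
is_ab = (province == "AB").astype(int)  # dummy codes, base category BC
is_sk = (province == "SK").astype(int)

# Assemble a design matrix: intercept, raw predictors, derived predictors
design = np.column_stack([np.ones(n), airplay, adverts,
                          interaction, adverts_sq, is_ab, is_sk])
print(design.shape)
```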
Required Sample Size
- Depends on the effect size and the number of predictors
  - Use G*Power to find the exact sample size
  - Rough estimates on pp. of Field
- Consequences of insufficient sample size:
  - The regression model may be overly influenced by individual participants (not generalizable)
  - Can't detect "real" effects of moderate size
- Solutions:
  - Collect data from more participants!
  - Reduce the number of predictors in the model
Requirements on the DV
- Must be interval/continuous; if not, the mathematics simply will not work
  - Solutions: for a categorical DV, use Logistic Regression; for an ordinal DV, use Ordinal Regression, or possibly convert it into categorical form
- Independence of scores (research design); if not, invalid conclusions
  - Solutions: redesign the data set, or use multi-level modelling instead of regression
Requirements on the DV, cont.
- Normal (use normality tests); if not, significance tests may be misleading
  - Solutions: check for outliers, transform the data, use caution in interpreting significance
- Unbounded distribution (obtained range of responses versus possible range of responses); if not, artificially deflated R²
  - Solutions: get more data in the missing range, or use a more sensitive instrument
Requirements on IVs
- Scale-level
  - Can be categorical, too (see next slide)
  - If ordinal: either threshold into categorical form, or treat as if scale (only with a sufficient number of ranks)
- Full range of values
  - If an IV only covers 1–3 on a scale of 1–10, the regression model will predict poorly for values beyond 3
- More requirements on the behaviour of the IVs with each other and with the DV (covered next week)
Categorical Predictors
Regression can work for categorical predictors:
- If dichotomous: code as 0 and 1
  - e.g., 1 dichotomous predictor, 1 scale DV: regression is equivalent to a t-test, and the slope b1 is the difference of means
- If n categories: use "dummy" variables
  - Choose a base category
  - Create n−1 dichotomous variables
  - e.g., {BC, AB, SK}: with base BC, the dummies are isAB and isSK
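The dichotomous case can be verified numerically: with a single 0/1 predictor, the OLS slope equals the difference of the two group means — the same contrast a t-test examines. A sketch with made-up data:

```python
import numpy as np

rng = np.random.default_rng(4)
group = np.repeat([0, 1], 50)                           # dichotomous IV, coded 0/1
y = np.where(group == 1, 5.0, 3.0) + rng.normal(size=100)  # scale DV

# Regress y on the 0/1 code: intercept b0 is the group-0 mean,
# slope b1 is the group-1 mean minus the group-0 mean.
A = np.column_stack([np.ones(100), group])
(b0, b1), *_ = np.linalg.lstsq(A, y, rcond=None)

mean_diff = y[group == 1].mean() - y[group == 0].mean()
print(round(b1 - mean_diff, 10))   # slope equals the mean difference
```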