
Quote of the Day: "Not everything that counts can be counted, and not everything that can be counted counts." (Albert Einstein)

Using Statistics to Evaluate Cause: The Case for Path Analysis
Professor J. Schutte, Psychology 524, April 11, 2017

Two Paradigms for Investigating Cause

Internal Validation (Experimental Control):
- Specifying the conditions (experimental and control groups)
- Testing differences (simple and complex ANOVA)

External Validation (Statistical Control):
- Specifying the people/variables (sampling frames)
- Testing relationships (partial and multiple correlation)

What is Cause in Non-Experimental Settings?
Cause is a philosophical, not a statistical, concept. In non-experimental settings, it is based on:
- Covariation
- Over a valid time frame
- Of a non-spurious nature
- Related through theory or logic

Topics in Using Correlation for Causal Analysis:
I. Statistical Covariation: Pearson's r
II. Third-Variable Effects: Partial Correlation
III. The Logic of Multivariate Relationships
IV. Multiple Correlation and Regression
V. Path Analysis: The Essentials
VI. Using AMOS to Automate Path Analysis

I. Statistical Covariation
Pearson's r: The Bivariate Correlation Coefficient

The Graphical View: A Scatter Diagram
[Scatter plot of weight, Y (WT), against height, X (HT): the points cluster tightly along an upward-sloping line. A HIGH POSITIVE CORRELATION.]

The Graphical View: A Scatter Diagram
[Scatter plot of prejudice (Y) against education (X): the points cluster along a downward-sloping line. A HIGH NEGATIVE CORRELATION.]

The Graphical View: A Scatter Diagram
[Scatter plot of births in Brazil (Y) against rainfall in NYC (X): the points form a shapeless cloud. NO CORRELATION.]

The Algebraic View: Shared Variance
1) Take the variance in X, s²x = Σ(X − X̄)²/N, and the variance in Y, s²y = Σ(Y − Ȳ)²/N.
2) The maximum covariation the two variables could share is the product of their standard deviations: sx·sy = √(s²x · s²y).
3) The covariation they actually share is the covariance: sxy = Σ(X − X̄)(Y − Ȳ)/N.
4) The correlation is simply 3) divided by 2): r = sxy / (sx·sy).

Manually Calculating a Correlation
1) Find the raw scores, means, squared deviations, and cross-products.
2) Enter these sums into the formula r = sxy / (sx·sy), which for these data yields r = .86.
3) Square r to determine the variation explained: r² = .75.
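For readers who want to replicate the hand calculation outside SPSS, here is a minimal Python sketch of the same steps: deviations from the means, squared deviations, cross-products, and the ratio that yields r. The data values below are hypothetical stand-ins, not the slide's scores.

def pearson_r(x, y):
    """Pearson's r from raw scores, following the manual steps above."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))  # cross-products
    sxx = sum((xi - mx) ** 2 for xi in x)                     # squared deviations in X
    syy = sum((yi - my) ** 2 for yi in y)                     # squared deviations in Y
    return sxy / (sxx * syy) ** 0.5

x = [12, 14, 16, 10, 18]  # hypothetical scores on X
y = [11, 15, 15, 9, 17]   # hypothetical scores on Y
r = pearson_r(x, y)
print(round(r, 2), round(r ** 2, 2))  # r, and variation explained r²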

An Example of Calculating a Correlation from SPSS (INPUT)

An Example of Calculating a Correlation from SPSS (OUTPUT)

II. Partial Relationships
Looking for the effects of a third variable

The Partial Correlation Coefficient

Step 1: Determine the zero-order correlations (r). Assume our previous correlation (rxy = .86) is between Mother's Education (X) and Daughter's Education (Y), and that you now want to control for the effects of a third variable, Father's Education (Z), as it relates to the original Mother-Daughter correlation (rxy). You would first calculate the zero-order correlation between Mother's and Father's education (rxz), finding it to be .88, and then the same for Daughter and Father (ryz), finding it to be .87. How much does Father's Education account for the original Mother-Daughter correlation?

Step 2: Calculate the partial correlation (rxy.z):
rxy.z = (rxy − rxz·ryz) / √((1 − r²xz)(1 − r²yz)) = (.86 − (.88)(.87)) / √((1 − .88²)(1 − .87²)) = .094 / .234 ≈ .40

Step 3: Draw conclusions. Before controlling for Z, (rxy)² = .75; after controlling for Z, (rxy.z)² = .16. Therefore, Z accounts for (.59/.75), or 79%, of the covariation between X and Y.
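The same computation is easy to script. Below is a minimal Python sketch of the first-order partial correlation formula, plugged with the slide's values (rxy = .86, rxz = .88, ryz = .87):

from math import sqrt

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y, controlling for Z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

r = partial_r(0.86, 0.88, 0.87)
print(round(r, 2))       # ~0.40, matching the slide
print(round(r ** 2, 2))  # ~0.16, the covariation remaining after controlling for Z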

Using SPSS for Finding Partial Correlation Coefficients (INPUT)

Using SPSS for Finding Partial Correlation Coefficients (OUTPUT)

III. The Logic of Multivariate Partial Relationships
Multiple Correlation and Multiple Regression

Causal Systems: The Logic of Multiple Relationships
One dependent variable, multiple independent variables.
[Venn diagrams of X1, X2, and Y, with the overlapping regions labeled R (redundant) and NR (non-redundant).]
In this diagram the overlap of any two circles can be thought of as the r² between the two variables. When we add a third variable, however, we must "partial out" the redundant overlap of the additional independent variables.

Causal Systems: Multiple Correlation and the Coefficient of Determination
[Two Venn diagrams: in the first, X1 and X2 do not overlap each other; in the second, they do.]
When the independent variables are unrelated: R²y.x1x2 = r²yx1 + r²yx2
When the independent variables are related: R²y.x1x2 = r²yx1 + r²yx2.x1
Notice that when the independent variables are independent of each other, the multiple correlation coefficient, here squared and called the coefficient of determination (R²), is simply the sum of the individual r²'s. But if the independent variables are related, R² is the sum of the zero-order r² of one variable plus the partialled r² of the other(s), that is, the variance in Y that X2 explains after X1's overlap is removed. This compensates for the fact that related independent variables would otherwise be double-counted as explaining the same portion of the dependent variable. Partialling out this redundancy solves the problem.
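This decomposition can be verified numerically. The sketch below (simulated data; variable names are illustrative) residualizes X2 on X1 so that only its non-redundant part remains, and shows that r²yx1 plus the squared correlation of Y with that residual reproduces the R² from the full two-predictor regression:

import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 0.6 * x1 + rng.normal(size=200)  # x2 deliberately related to x1
y = 1.0 * x1 + 0.5 * x2 + rng.normal(size=200)

def r(a, b):
    return np.corrcoef(a, b)[0, 1]

# Residualize x2 on x1, stripping the redundant overlap
slope = r(x1, x2) * x2.std() / x1.std()
x2_res = x2 - (x2.mean() + slope * (x1 - x1.mean()))

R2_sum = r(y, x1) ** 2 + r(y, x2_res) ** 2  # r²yx1 + r²yx2.x1

# Compare with R² from the full regression of y on x1 and x2
X = np.column_stack([np.ones_like(x1), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
R2_full = 1 - (y - X @ coef).var() / y.var()
print(round(R2_sum, 3), round(R2_full, 3))  # the two values agree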

Causal Systems: Multiple Regression
Translated into the language of regression, multiple independent variables that are themselves independent of each other each have their own regression slope and simply appear as another term added to the regression equation:
Y' = a + byx1X1 + byx2X2
or, standardized:
Y' = Byx1X1 + Byx2X2

Causal Systems: Multiple Regression with Related Independent Variables
Once we assume the independent variables are themselves related with respect to the variance explained in the dependent variable, we must distinguish between direct and indirect predictive effects. We use partial regression coefficients to isolate the direct effects:
Y' = a + byx1X1 + byx2.x1X2
or, standardized:
Y' = Byx1X1 + Byx2.x1X2
When standardized, these B values are called "path coefficients" or "beta weights."
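Beta weights are simply ordinary least-squares slopes computed on z-scored variables. A minimal Python sketch (simulated data; names are illustrative, not the slides' variables):

import numpy as np

def beta_weights(y, *xs):
    """Standardized regression coefficients: z-score everything, then solve OLS."""
    z = lambda v: (v - v.mean()) / v.std()
    X = np.column_stack([z(x) for x in xs])  # no intercept needed once standardized
    b, *_ = np.linalg.lstsq(X, z(y), rcond=None)
    return b

rng = np.random.default_rng(1)
x1 = rng.normal(size=100)
x2 = 0.5 * x1 + rng.normal(size=100)
y = 2.0 * x1 + 1.0 * x2 + rng.normal(size=100)
print(beta_weights(y, x1, x2).round(2))  # one beta weight per predictor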

IV. Path Analysis
The Essentials

Path Analysis: The Steps and an Example
1. Input the data
2. Calculate the correlation matrix
3. Specify the path diagram
4. Enumerate the equations
5. Solve for the path coefficients (Betas)
6. Interpret the findings

Path Analysis, Step 1: Input the Data
Assume you have information from ten respondents as to their income, education, parent's education, and parent's income. We would input these ten cases and four variables into SPSS in the usual way. In this analysis we will be trying to explain respondent's income (Y) using the three other variables as independent variables:
Y = DV: income (offspring's income)
X3 = IV: educ (offspring's education)
X2 = IV: pedu (parent's education)
X1 = IV: pinc (parent's income)

Path Analysis, Step 2: Calculate the Correlation Matrix
These correlations are calculated in the usual manner through the "Analyze", "Correlate", "Bivariate" menu clicks.
[Correlation matrix of X1, X2, X3, and Y.]
Notice the zero-order correlations of each IV with the DV. Clearly these IVs must interrelate: otherwise the values of their r²'s would sum to an R² indicating more than 100% of the variance in the DV, which is impossible.
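Outside SPSS, the same zero-order matrix is a single call in Python. The rows below are hypothetical values standing in for the slide's ten cases:

import numpy as np

# Columns: pinc (X1), pedu (X2), educ (X3), income (Y); values are hypothetical
cases = np.array([
    [40, 12, 14, 52], [55, 14, 16, 60], [30, 10, 12, 35],
    [65, 16, 18, 70], [45, 12, 16, 48], [50, 14, 14, 58],
    [35, 10, 14, 40], [60, 16, 16, 66], [42, 12, 12, 45],
    [58, 14, 18, 62],
])
print(np.corrcoef(cases.T).round(2))  # 4 x 4 matrix of zero-order r's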

Path Analysis, Step 3: Specify the Path Diagram
Therefore, we must specify a model that explains the relationships among the variables across time. We start with the dependent variable on the right-most side of the diagram and build the independent variables' relationships to the left, indicating their effects on subsequent variables.
[Path diagram, time-ordered left to right: X1 and X2 at the left, X3 in the middle, Y at the right, with paths labeled a through f and an error term e on Y.]
Y = Offspring's income
X1 = Parent's income
X2 = Parent's education
X3 = Offspring's education

Path Analysis, Step 4: Enumerate the Path Equations
With the diagram specified, we need to articulate the formulae necessary to find the path coefficients (arbitrarily indicated here by letters on each path). The overall correlation between an independent variable and the dependent variable can be separated into its direct effect plus the sum of its indirect effects:
1. ryx1 = a + b·rx3x1 + c·rx2x1
2. ryx2 = c + b·rx3x2 + a·rx1x2
3. ryx3 = b + a·rx1x3 + c·rx2x3
4. rx3x2 = d + e·rx1x2
5. rx3x1 = e + d·rx1x2
6. rx1x2 = f
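Equations 4 and 5 form a small linear system in the unknown paths d and e. Here is a Python sketch solving it with the correlations implied by the later slides (rx1x2 = .68, rx3x1 = .75, rx3x2 = .82); the tiny differences from the slides' .57 and .36 come from the inputs already being rounded:

import numpy as np

r_x1x2, r_x3x1, r_x3x2 = 0.68, 0.75, 0.82
# Equations 4 and 5: [d + e*r_x1x2, d*r_x1x2 + e] = [r_x3x2, r_x3x1]
A = np.array([[1.0, r_x1x2],
              [r_x1x2, 1.0]])
d, e = np.linalg.solve(A, np.array([r_x3x2, r_x3x1]))
print(round(d, 2), round(e, 2))  # ~.58 and ~.36, the paths into X3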

Path Analysis, Step 5: Solve for the Path Coefficients
The easiest way to calculate the B values is to use the Regression module in SPSS. By indicating income as the dependent variable and pinc, pedu, and educ as the independent variables, we can solve for the beta weights, or path coefficients, of each independent variable. In the output these numbers correspond to the Betas for paths a, c, and b, respectively, in the previous path diagram.

Path Analysis, Step 5a: Solving for R²
The SPSS Regression module also calculates R². According to this statistic, for our data, 50% of the variation in respondent's income (Y) is accounted for by respondent's education (X3), parent's education (X2), and parent's income (X1). R² can also be computed by multiplying each path coefficient (Beta) by its respective zero-order correlation and summing across all of the independent variables.

Path Analysis: Checking the Findings
Substituting the correlations and beta weights back into the path equations confirms the solution (within rounding):
ryx1 = a + b·rx3x1 + c·rx2x1: .69 ≈ .63 + (−.21)(.75) + (.31)(.68)
ryx2 = c + b·rx3x2 + a·rx1x2: .57 ≈ .31 + (−.21)(.82) + (.63)(.68)
ryx3 = b + a·rx1x3 + c·rx2x3: .52 ≈ −.21 + (.63)(.75) + (.31)(.82)
The zero-order correlations and path coefficients are:
X1 → Y: r = .69, B = .63 (path a)
X2 → Y: r = .57, B = .31 (path c)
X3 → Y: r = .52, B = −.21 (path b)
X1 → X3: r = .75, B = .36 (path e)
X2 → X3: r = .82, B = .57 (path d)
X1 → X2: r = .68 = f
The values of r and B tell us three things: 1) the value of Beta is the direct effect; 2) dividing Beta by r gives the proportion of the correlation that is a direct effect; and 3) the products of Beta and r, summed across the variables with direct arrows into the dependent variable, give R². The value of 1 − R² is e (here, .50).
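The same bookkeeping can be automated. Here is a Python sketch using the slide's rounded values, checking that each zero-order r is reproduced by the direct effect plus the beta-weighted indirect effects, and that R² = Σ(Beta · r):

betas = {"x1": 0.63, "x2": 0.31, "x3": -0.21}  # paths a, c, b
r_y = {"x1": 0.69, "x2": 0.57, "x3": 0.52}     # zero-order r with Y
r_x = {("x1", "x2"): 0.68, ("x1", "x3"): 0.75, ("x2", "x3"): 0.82}

def r_between(i, j):
    return r_x[(i, j)] if (i, j) in r_x else r_x[(j, i)]

for target in betas:
    indirect = sum(betas[o] * r_between(target, o) for o in betas if o != target)
    implied = betas[target] + indirect
    print(target, "implied r =", round(implied, 2), "observed r =", r_y[target])

R2 = sum(betas[v] * r_y[v] for v in betas)
print("R² =", round(R2, 2))  # ~.50, matching step 5a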

Path Analysis, Step 6: Interpret the Findings
[Final path diagram, time-ordered left to right, with e = .50: X1 → Y = .63, X2 → Y = .31, X3 → Y = −.21, X1 → X3 = .36, X2 → X3 = .57, X1 → X2 = .68.]
Y = Offspring's income; X1 = Parent's income; X2 = Parent's education; X3 = Offspring's education.
With the path coefficients (Betas) specified, several facts become apparent. Among them: Parent's income has the highest percentage of direct effect (.63/.69, or about 91%, of its correlation is a direct effect; the rest is indirect). Moreover, although the overall correlation of education with income is positive, the direct effect of offspring's education, in these data, is actually negative!

V. Using AMOS
Automating Path Analysis

First we input the data into SPSS in the usual way

And save it in the usual way (as an SPSS .sav file)

We then open the AMOS program, having saved the data set

Input and Label each Variable in the Model

Place all of the variables in time sequence order

Next specify the causal and non-causal connections

Then indicate the error terms of the endogenous variables

Identify the dataset from the File > Data Files menu

If it is a new diagram, save it as an .amw file before calculating

Specify the output parameters from the View > Analysis Properties menu

Finally, click the output button at the right of the upper box. The numbers on the arrow lines are the path coefficients.

Assumptions
- Linearity
- Homoscedasticity
- Uncorrelated error terms
- Residuals normally distributed
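Each of these assumptions can be checked on the residuals of a fitted model. A minimal Python sketch with simulated data (the particular checks and data are illustrative, not part of the slides):

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(size=100)
y = 2.0 * x + rng.normal(size=100)

fit = stats.linregress(x, y)
resid = y - (fit.intercept + fit.slope * x)

# Residuals normally distributed: Shapiro-Wilk test
print("Shapiro-Wilk p =", round(stats.shapiro(resid).pvalue, 3))

# Homoscedasticity (rough check): residual spread in low vs high halves of x
lo = resid[x < np.median(x)].std()
hi = resid[x >= np.median(x)].std()
print("spread low/high x:", round(lo, 2), round(hi, 2))

# Uncorrelated error terms: lag-1 autocorrelation of residuals (should be near 0)
print("lag-1 autocorr:", round(float(np.corrcoef(resid[:-1], resid[1:])[0, 1]), 2))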

Example #1 – Age at Marriage

Example #2 – College GPA