
1 G89.2247 Lecture 2
Regression as paths and covariance structure
Alternative “saturated” path models
Using matrix notation to write linear models
Multivariate Expectations
Mediation

2 Question: Does exposure to childhood foster care (X) lead to adverse outcomes (Y)?
Example of purported "causal model": X → Y with path coefficient $B_1$ and residual $e$
$Y = B_0 + B_1 X + e$
Regression approach
- $B_0$ and $B_1$ can be estimated using OLS
- Estimates depend on the sample standard deviations of Y and X, the sample means, and the covariance between Y and X: $B_1 = S_{XY}/S_X^2$, $B_0 = M_Y - B_1 M_X$
- The correlation, $r_{XY} = S_{XY}/(S_X S_Y)$, can be used to estimate the variance of the residual $e$, $V(e)$: $S_e^2 = S_Y^2 (1 - r_{XY}^2) = S_Y^2 - S_{XY}^2/S_X^2$
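The moment formulas on this slide can be checked numerically. The sketch below is only an illustration: the data are made-up toy values, and NumPy is an assumed tool rather than anything the lecture prescribes. It computes $B_1$ and $B_0$ from sample moments, compares them with a generic least-squares fit, and confirms the two expressions for the residual variance (both using the n-1 denominator, as the slide's formula implies).

```python
import numpy as np

# Hypothetical toy data: x = exposure indicator, y = outcome score.
x = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0])
y = np.array([2.0, 3.0, 2.5, 4.0, 5.0, 4.5, 3.5, 3.0])

# Sample moments (ddof=1 gives the usual n-1 denominators).
m_x, m_y = x.mean(), y.mean()
s2_x, s2_y = x.var(ddof=1), y.var(ddof=1)
s_xy = np.cov(x, y, ddof=1)[0, 1]

# Slope and intercept from the moment formulas.
b1 = s_xy / s2_x
b0 = m_y - b1 * m_x

# Residual variance two ways: from r_XY and from the residuals themselves.
r_xy = s_xy / np.sqrt(s2_x * s2_y)
s2_e_from_r = s2_y * (1 - r_xy**2)
resid = y - (b0 + b1 * x)
s2_e_direct = resid.var(ddof=1)

# Cross-check against a generic least-squares line fit.
slope_ls, intercept_ls = np.polyfit(x, y, deg=1)
```

The moment-based estimates and the least-squares fit should agree up to floating-point error.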

3 A Covariance Structure Approach
If we have data on Y and X we can compute a sample covariance matrix, $S$, with elements $S_Y^2$, $S_{XY}$ and $S_X^2$.
This estimates the population covariance structure, $\Sigma$.
- $\sigma_Y^2$ can itself be expressed as $B_1^2 \sigma_X^2 + \sigma_e^2$
- Three statistics in the sample covariance matrix are available to estimate three population parameters

4 Covariance Structure Approach, Continued
A structural model that has the same number of parameters as unique elements in the covariance matrix is "saturated".
Saturated models always fit the sample covariance matrix.
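To see why a saturated model always fits, it may help to work one case through. The sketch below assumes the simple regression model $Y = B_0 + B_1 X + e$ with $e$ uncorrelated with X; the covariance matrix `S` holds hypothetical numbers chosen for illustration. Three sample statistics are turned into three parameter estimates, and the covariance matrix implied by those estimates is rebuilt.

```python
import numpy as np

# Hypothetical sample covariance matrix, variables ordered (Y, X).
S = np.array([[4.0, 1.2],
              [1.2, 0.9]])

# Three statistics -> three parameter estimates.
var_x = S[1, 1]
b1 = S[0, 1] / var_x
var_e = S[0, 0] - b1**2 * var_x

# Covariance matrix implied by Y = B0 + B1*X + e with Cov(X, e) = 0.
Sigma_implied = np.array([
    [b1**2 * var_x + var_e, b1 * var_x],
    [b1 * var_x,            var_x],
])
```

Because there are as many parameters as unique elements of `S`, `Sigma_implied` reproduces `S` exactly, whatever values `S` contains.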

5 Another saturated model: Two explanatory variables
The first model is unlikely to yield an unbiased estimate of the foster care effect because of selection factors (isolation failure).
Suppose we have a measure of family disorganization (Z) that is known to have an independent effect on Y and also to be related to who is assigned to foster care (X).
[Path diagram: X → Y ($\beta_1$), Z → Y ($\beta_2$), X ↔ Z ($\rho_{XZ}$), residual $e$ → Y]
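The bias from leaving Z out can be illustrated with a small simulation. Everything in this sketch is an assumption made for illustration: the data-generating coefficients are arbitrary, X is treated as continuous for simplicity, and the helper `ols` is a hypothetical name. It uses the in-sample identity that the simple slope of Y on X equals the adjusted slope for X plus the adjusted slope for Z times the slope of Z on X.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical data-generating process (illustrative values only).
z = rng.normal(size=n)                       # family disorganization
x = 0.6 * z + rng.normal(size=n)             # placement, related to Z
y = 0.3 * x + 0.5 * z + rng.normal(size=n)   # outcome depends on both

def ols(columns, y):
    """Return OLS coefficients [intercept, slope_1, slope_2, ...]."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

slope_simple = ols([x], y)[1]          # Y on X only
_, b1_hat, b2_hat = ols([x, z], y)     # Y on X and Z
slope_z_on_x = ols([x], z)[1]          # Z on X
```

Here `slope_simple` equals `b1_hat + b2_hat * slope_z_on_x`, so the unadjusted slope absorbs part of Z's effect whenever X and Z are related.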

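6 Covariance Structure Expression
The model: $Y = b_0 + b_1 X + b_2 Z + e$
- If we assume $E(X) = E(Z) = E(Y) = 0$
- and $V(X) = V(Z) = V(Y) = 1$
- then $b_0 = 0$ and the $\beta$'s are standardized
The parameters can be expressed in terms of the correlations:
$\beta_1 = (\rho_{XY} - \rho_{ZY}\rho_{XZ}) / (1 - \rho_{XZ}^2)$, $\beta_2 = (\rho_{ZY} - \rho_{XY}\rho_{XZ}) / (1 - \rho_{XZ}^2)$
- When sample correlations are substituted, these expressions give the OLS estimates of the regression coefficients.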

7 Covariance Structure: 2 Explanatory Variables
In the standardized case the covariance structure is:
$\rho_{XY} = \beta_1 + \beta_2 \rho_{XZ}$, $\rho_{ZY} = \beta_2 + \beta_1 \rho_{XZ}$
Each correlation is accounted for by two components, one direct and one indirect.
There are three regression parameters and three covariances.
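A short numerical sketch of the standardized case follows. It assumes the standard two-predictor formulas $\beta_1 = (\rho_{XY} - \rho_{ZY}\rho_{XZ})/(1 - \rho_{XZ}^2)$ and $\beta_2 = (\rho_{ZY} - \rho_{XY}\rho_{XZ})/(1 - \rho_{XZ}^2)$; the function name `standardized_betas` and the three correlations are hypothetical, chosen only for illustration.

```python
def standardized_betas(r_xy, r_zy, r_xz):
    """Standardized coefficients for Y on X and Z from three correlations."""
    denom = 1.0 - r_xz**2
    beta1 = (r_xy - r_zy * r_xz) / denom
    beta2 = (r_zy - r_xy * r_xz) / denom
    return beta1, beta2

# Hypothetical correlations, chosen only for illustration.
r_xy, r_zy, r_xz = 0.30, 0.40, 0.50
beta1, beta2 = standardized_betas(r_xy, r_zy, r_xz)

# Each correlation splits into a direct and an indirect component.
r_xy_rebuilt = beta1 + beta2 * r_xz
r_zy_rebuilt = beta2 + beta1 * r_xz
```

The rebuilt correlations match the inputs, which is the saturated-model property in the standardized metric.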

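8 The more general covariance matrix for two IV multiple regression
If we do not assume variances of unity the regression model implies
$\sigma_{XY} = b_1 \sigma_X^2 + b_2 \sigma_{XZ}$, $\sigma_{ZY} = b_1 \sigma_{XZ} + b_2 \sigma_Z^2$,
$\sigma_Y^2 = b_1^2 \sigma_X^2 + b_2^2 \sigma_Z^2 + 2 b_1 b_2 \sigma_{XZ} + \sigma_e^2$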
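As a rough sketch of the unstandardized case, the function below builds the covariance matrix of (Y, X, Z) implied by $Y = b_0 + b_1 X + b_2 Z + e$, assuming $e$ is uncorrelated with X and Z. The function name `implied_cov` and all numeric inputs are hypothetical. The last step recovers $b_1$ and $b_2$ from the implied matrix as a consistency check.

```python
import numpy as np

def implied_cov(b1, b2, var_x, var_z, cov_xz, var_e):
    """Covariance matrix of (Y, X, Z) implied by Y = b0 + b1*X + b2*Z + e."""
    cov_xy = b1 * var_x + b2 * cov_xz
    cov_zy = b1 * cov_xz + b2 * var_z
    var_y = b1**2 * var_x + b2**2 * var_z + 2 * b1 * b2 * cov_xz + var_e
    return np.array([
        [var_y,  cov_xy, cov_zy],
        [cov_xy, var_x,  cov_xz],
        [cov_zy, cov_xz, var_z],
    ])

# Hypothetical parameter values.
Sigma = implied_cov(b1=0.5, b2=1.5, var_x=4.0, var_z=1.0, cov_xz=0.8, var_e=2.0)

# Recover b1 and b2 by solving the normal equations on the implied matrix.
b_recovered = np.linalg.solve(Sigma[1:, 1:], Sigma[1:, 0])
```

Solving the predictor block against the predictor-outcome covariances returns the coefficients that generated the matrix.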

9 More Math Review for SEM
Matrix notation is useful

10 A Matrix Derivation of OLS Regression
OLS regression estimates make the sum of squared residuals as small as possible.
- If the model is $Y = XB + e$, then we choose $B$ so that $e'e$ is minimized.
The minimum will occur when the residual vector is orthogonal to the regression plane.
- In that case, $X'e = 0$

11 When will X'e = 0?
Substituting $e = Y - XB$ gives $X'(Y - XB) = 0$, so $X'XB = X'Y$ and $B = (X'X)^{-1} X'Y$.
That is, when e is the residual from an OLS fit.
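The orthogonality condition can be verified directly. This is a minimal sketch assuming the matrix model $Y = XB + e$ and the normal equations $X'XB = X'Y$; the simulated design matrix and coefficients are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50

# Hypothetical design matrix: a column of ones plus two predictors.
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
Y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=n)

# Normal equations: X'X B = X'Y (solved directly rather than by inversion).
B = np.linalg.solve(X.T @ X, X.T @ Y)

e = Y - X @ B             # residual vector
orthogonality = X.T @ e   # numerically zero for an OLS fit
```

Solving the linear system is generally preferred to forming $(X'X)^{-1}$ explicitly, although both give the same estimate for a well-conditioned design.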

12 Multivariate Expectations
There are simple multivariate generalizations of the expectation facts:
- $E(X + k) = E(X) + k = \mu_x + k$
- $E(kX) = kE(X) = k\mu_x$
- $V(X + k) = V(X) = \sigma_x^2$
- $V(kX) = k^2 V(X) = k^2 \sigma_x^2$
Let $X^T = [X_1\ X_2\ X_3\ X_4]$, $\mu^T = [\mu_1\ \mu_2\ \mu_3\ \mu_4]$, and let $k$ be a scalar value.
- $E(kX) = kE(X) = k\mu$
- $E(X + k\mathbf{1}) = E(X) + k\mathbf{1} = \mu + k\mathbf{1}$

13 Multivariate Expectations
In the multivariate case Var(X) is a matrix:
- $V(X) = E[(X - \mu)(X - \mu)^T]$

14 Multivariate Expectations
The multivariate generalizations of
- $V(X + k) = V(X) = \sigma_x^2$
- $V(kX) = k^2 V(X) = k^2 \sigma_x^2$
are:
- $Var(X + k\mathbf{1}) = \Sigma$
- $Var(kX) = k^2 \Sigma$
Let $c^T = [c_1\ c_2\ c_3\ c_4]$; $c^T X$ is a linear combination of the X's.
- $Var(c^T X) = c^T \Sigma c$
This is a scalar value. If it is positive for all nonzero values of $c$, then $\Sigma$ is positive definite.
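These variance rules hold exactly for sample covariance matrices too, which makes them easy to check. The sketch below uses simulated data on four variables; the data, the constant `k`, and the weight vector `c` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

# 200 hypothetical observations on four variables (rows = observations).
X = rng.normal(size=(200, 4))
X[:, 1] += 0.5 * X[:, 0]   # introduce some correlation

Sigma = np.cov(X, rowvar=False)   # 4 x 4 sample covariance matrix
k = 3.0
c = np.array([1.0, -2.0, 0.5, 4.0])

cov_shifted = np.cov(X + k, rowvar=False)   # Var(X + k*1)
cov_scaled = np.cov(k * X, rowvar=False)    # Var(k*X)
var_combo = np.var(X @ c, ddof=1)           # Var(c'X), a scalar
quad_form = c @ Sigma @ c                   # c' Sigma c

# A symmetric matrix is positive definite when all eigenvalues are > 0.
is_pos_def = bool(np.all(np.linalg.eigvalsh(Sigma) > 0))
```

Adding a constant leaves `Sigma` unchanged, scaling multiplies it by `k**2`, and the variance of the linear combination equals the quadratic form.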

15 Semi Partial Regression Adjustment
The multiple regression coefficients are estimated taking all variables into account.
- The model assumes that for fixed X, Z has an effect of magnitude $\beta_Z$.
- Sometimes people say "controlling for X".
The model explicitly notes that Z has two kinds of association with Y:
- A direct association through $\beta_Z$ (X fixed)
- An indirect association through X (magnitude $\beta_X \rho_{XZ}$)

16 Pondering Model 1: Simple Multiple Regression
The semi-partial regression coefficients are often different from the bivariate correlations:
- Adjustment effects
- Suppression effects
Randomization makes $\rho_{XZ} = 0$ in probability.
[Path diagram: X → Y, Z → Y, X ↔ Z ($\rho_{XZ}$), residual $e$ → Y]
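Two of the points on this slide can be made concrete with the standardized two-predictor formulas. The sketch below is an illustration under assumed values: the function name `standardized_betas` and every correlation are hypothetical. The first call shows a suppression pattern (Z is unrelated to Y but related to X), and the second shows what happens when X and Z are uncorrelated, as randomization aims to achieve.

```python
def standardized_betas(r_xy, r_zy, r_xz):
    """Standardized coefficients (beta_X, beta_Z) for Y regressed on X and Z."""
    denom = 1.0 - r_xz**2
    return (r_xy - r_zy * r_xz) / denom, (r_zy - r_xy * r_xz) / denom

# Suppression (hypothetical values): Z is uncorrelated with Y but not with X.
supp_b1, supp_b2 = standardized_betas(r_xy=0.5, r_zy=0.0, r_xz=0.5)

# Randomization (hypothetical values): X is uncorrelated with Z.
rand_b1, rand_b2 = standardized_betas(r_xy=0.5, r_zy=0.4, r_xz=0.0)
```

In the suppression case the coefficient for X exceeds its bivariate correlation and Z receives a negative coefficient despite a zero correlation with Y. With uncorrelated predictors, each coefficient equals the corresponding bivariate correlation.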

17 Mathematically Equivalent Saturated Models
Two variations of the first model suggest that the correlation between X and Z can itself be represented structurally.
[Path diagram, first variation: X → Z, X → Y, Z → Y, with residuals $e_Z$ and $e_Y$]
[Path diagram, second variation: Z → X, X → Y, Z → Y, with residuals $e_X$ and $e_Y$]

18 Representation of Covariance Matrix
Both models imply the same correlation structure.
The interpretation, however, is very different.
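The equivalence can be checked by computing the correlation matrix each model implies. The sketch below assumes a general recursive form $v = Bv + e$ with $Var(e) = \Psi$, so that the implied covariance is $(I - B)^{-1} \Psi (I - B)^{-T}$. This matrix form, the helper name `implied_sigma`, and the target correlations are assumptions introduced for illustration, not taken from the slides. One model has X leading to Z, the other has Z leading to X; both keep the paths X → Y and Z → Y.

```python
import numpy as np

def implied_sigma(B, Psi):
    """Covariance implied by v = B v + e with Var(e) = Psi."""
    inv = np.linalg.inv(np.eye(B.shape[0]) - B)
    return inv @ Psi @ inv.T

# Hypothetical target correlations, variables ordered (X, Z, Y).
r_xz, r_xy, r_zy = 0.5, 0.3, 0.4
beta1 = (r_xy - r_zy * r_xz) / (1 - r_xz**2)   # X -> Y
beta2 = (r_zy - r_xy * r_xz) / (1 - r_xz**2)   # Z -> Y
var_ey = 1 - (beta1 * r_xy + beta2 * r_zy)     # residual variance of Y

# X -> Z version: Z = r_xz * X + e_Z.
B_xz = np.array([[0, 0, 0], [r_xz, 0, 0], [beta1, beta2, 0]], dtype=float)
Psi_xz = np.diag([1.0, 1 - r_xz**2, var_ey])

# Z -> X version: X = r_xz * Z + e_X.
B_zx = np.array([[0, r_xz, 0], [0, 0, 0], [beta1, beta2, 0]], dtype=float)
Psi_zx = np.diag([1 - r_xz**2, 1.0, var_ey])

Sigma_xz = implied_sigma(B_xz, Psi_xz)
Sigma_zx = implied_sigma(B_zx, Psi_zx)
```

Both matrices equal the target correlation matrix, so fit alone cannot distinguish the two causal orderings.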

19 Model 2: X leads to Z and Y
X is assumed to be causally prior to Z.
- The association between X and Z is due to X effects.
Z partially mediates the overall effect of X on Y.
- X has a direct effect $\beta_1$ on Y
- X has an indirect effect on Y through Z (the product of the X → Z and Z → Y path coefficients)
- Part of the bivariate association between Z and Y is spurious (due to common cause X)
[Path diagram: X → Z, X → Y, Z → Y, with residuals $e_Z$ and $e_Y$]

20 Model 3: Z leads to X and Y
Z is assumed to be causally prior to X.
- The association between X and Z is due to Z effects.
X partially mediates the overall effect of Z on Y.
- Z has a direct effect $\beta_2$ on Y
- Z has an indirect effect on Y through X (the product of the Z → X and X → Y path coefficients)
- Part of the bivariate association between X and Y is spurious (due to common cause Z)
[Path diagram: Z → X, X → Y, Z → Y, with residuals $e_X$ and $e_Y$]

21 Choosing between models
Often authors claim a model is good because it fits the data (sample covariance matrix).
- All of these models fit the same (perfectly!)
Logic and theory must establish causal order.
There are other possibilities besides Models 2 and 3:
- In some instances, X and Z are dynamic variables that are simultaneously affecting each other
- In other instances both X and Z are outcomes of an additional variable, not shown.

22 Mediation: A theory approach
Sometimes it is possible to argue on theoretical grounds that
- Z is prior to X and Y
- X is prior to Y
- The effect of Z on Y is completely accounted for by the indirect path through X.
This is an example of total mediation.
If $\beta_2$ is fixed to zero, then Model 3 is no longer saturated.
- Question of fit becomes informative
- Total mediation requires strong theory
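Why fit becomes informative can be seen in the standardized case. If the direct path from Z to Y is fixed at zero in the chain Z → X → Y, the model implies $\rho_{ZY} = \rho_{ZX}\rho_{XY}$, which the observed correlations may or may not satisfy. The sketch below uses hypothetical correlations and a hypothetical function name.

```python
def total_mediation_implied_r_zy(r_zx, r_xy):
    """r_ZY implied by Z -> X -> Y when the direct Z -> Y path is fixed at 0."""
    return r_zx * r_xy

# Hypothetical observed correlations.
r_zx, r_xy, r_zy = 0.5, 0.3, 0.4

implied = total_mediation_implied_r_zy(r_zx, r_xy)
discrepancy = r_zy - implied

# Degrees of freedom: 3 correlations minus 2 free paths (Z -> X, X -> Y).
df = 3 - 2
```

With these illustrative numbers the restricted model implies a Z-Y correlation of 0.15 against an observed 0.40, so the single degree of freedom exposes a misfit that the saturated model could never show.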

23 A Flawed Example
Someone might try to argue for total mediation of the effect of family disorganization on low self-esteem through placement in foster care.
Baron and Kenny (1986) criteria might be met:
- Z is significantly related to Y
- Z is significantly related to X
- When Y is regressed on Z and X, $\beta_1$ is significant but $\beta_2$ is not significant.
Statistical significance is a function of sample size.
Logic suggests that children not assigned to foster care who live in a disorganized family may suffer directly.

24 A More Compelling Example of Complete Mediation
If
- Z is an experimentally manipulated variable such as a prime
- X is a measured process variable
- Y is an outcome logically subsequent to X
It should make sense that X affects Y for all levels of Z.
- E.g. Chen and Bargh (1997): Are participants who have been subliminally primed with negative stereotype words more likely to have partners who interact with them in a hostile manner?

