
1 General Structural Equation (LISREL) Models Week 4 #1
- Non-normal data: summary of approaches
- Missing data approaches: summary, review and computer examples
- Longitudinal data analysis: lagged dependent variables in LISREL models

2 Major approaches:
1. Transform data to normality before using in SEM software
   - Can be done with any stats package
   - Common transformations: log, sqrt, square
2. ADF (also called WLS [in LISREL], AGLS [EQS]) estimation
   - Requires construction of asymptotic covariance matrix
   - Requires large Ns
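The effect of the common transformations can be checked by comparing skewness before and after. The sketch below is illustrative only: the data are simulated (a hypothetical right-skewed variable, not the course data), and `skewness` is a helper written for this example.

```python
import math
import random

def skewness(values):
    """Skewness as the third standardized moment (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

rng = random.Random(2004)
# Hypothetical right-skewed variable (log-normal); an assumption for illustration.
raw = [rng.lognormvariate(0.0, 0.8) for _ in range(2000)]
rooted = [math.sqrt(v) for v in raw]   # sqrt: milder correction
logged = [math.log(v) for v in raw]    # log: stronger correction

skew_raw = skewness(raw)
skew_sqrt = skewness(rooted)
skew_log = skewness(logged)
print(f"raw {skew_raw:.2f}  sqrt {skew_sqrt:.2f}  log {skew_log:.2f}")
```

For a variable like this one, the log transformation should bring skewness close to zero while the square root only reduces it; for left-skewed variables the square would be the analogous choice.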

3 Major approaches to non-normal data
1. Transform data to normality before using in SEM software
2. ADF (also called WLS [in LISREL], AGLS [EQS]) estimation
3. Scaled test statistics (Bentler-Satorra), also referred to as “robust test statistics”
4. Bootstrapping
5. New approaches (Muthen)
6. Polychoric correlations (PM matrix)
Require asymptotic covariance matrix; not suitable for small Ns

4 Scaled test statistics Generate an asymptotic covariance matrix in PRELIS as well as the usual covariance matrix
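To make concrete what the asymptotic covariance matrix contains, here is a minimal sketch of the usual distribution-free estimator based on fourth-order moments, w(ij,kl) = s(ijkl) - s(ij)s(kl). It is not PRELIS's implementation; the function name and the simulated data are assumptions for illustration.

```python
import numpy as np

def asymptotic_cov_of_covariances(data):
    """Distribution-free estimate of N * acov(s), where s holds the
    non-duplicated elements of the sample covariance matrix."""
    x = np.asarray(data, dtype=float)
    n, p = x.shape
    z = x - x.mean(axis=0)                  # centre each variable
    rows, cols = np.tril_indices(p)         # index pairs (i, j) with i >= j
    products = z[:, rows] * z[:, cols]      # n x p(p+1)/2 cross-products
    s = products.mean(axis=0)               # covariances (divisor n)
    dev = products - s
    return dev.T @ dev / n                  # s_ijkl - s_ij * s_kl

rng = np.random.default_rng(1)
sample = rng.chisquare(df=3, size=(500, 4))  # hypothetical skewed data, 4 variables
acm = asymptotic_cov_of_covariances(sample)
print(acm.shape)
```

With 4 variables there are 10 distinct covariances, so the matrix is 10 x 10. With 14 variables, as in the example that follows, there are 105 distinct covariances and 5565 distinct elements in the asymptotic covariance matrix, which is one way to see why large Ns are needed.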

5 Scaled Test Statistics
Added statistics provided when asymptotic covariance matrix specified in LISREL program

Part 2A: ML estimation but scaled chi-square statistic
DA NI=14 NO=1456
CM FI=e:\classes\icpsr2004\Week3Examples\nonnormaldata\relmor1.cov
AC FI=e:\classes\icpsr2004\Week3Examples\nonnormaldata\relmor1.acc
…PROGRAM MATRIX SPECIFICATION LINES
ou me=mll sc nd=3 mi

Degrees of Freedom = 67
Minimum Fit Function Chi-Square = 407.134 (P = 0.0)
Normal Theory Weighted Least Squares Chi-Square = 409.627 (P = 0.0)
Satorra-Bentler Scaled Chi-Square = 319.088 (P = 0.0)
Chi-Square Corrected for Non-Normality = 342.559 (P = 0.0)

6 Scaled Test Statistics
Added statistics provided when asymptotic covariance matrix specified in LISREL program
Caution: the LISREL manual suggests the standard errors are “robust” SEs, but in version 8.54 they are identical to regular ML standard errors. Use nested chi-square LR tests if needed.

Degrees of Freedom = 67
Minimum Fit Function Chi-Square = 407.134 (P = 0.0)
Normal Theory Weighted Least Squares Chi-Square = 409.627 (P = 0.0)
Satorra-Bentler Scaled Chi-Square = 319.088 (P = 0.0)
Chi-Square Corrected for Non-Normality = 342.559 (P = 0.0)
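When nested tests are run on scaled statistics, the simple difference between two scaled chi-squares is not itself chi-square distributed. A minimal sketch of the Satorra-Bentler (2001) scaled difference follows. Two assumptions are made: the minimum fit function chi-square is taken as the unscaled statistic (LISREL may base the scaling on the normal-theory WLS chi-square instead), and the values for the more restricted model are hypothetical.

```python
def scaling_factor(unscaled_chi2, scaled_chi2):
    """Scaling correction c implied by an unscaled and a scaled chi-square."""
    return unscaled_chi2 / scaled_chi2

def scaled_difference(t0, c0, df0, t1, c1, df1):
    """Scaled difference test; model 0 is nested in (more restricted than) model 1.

    t0, t1 are unscaled chi-squares; c0, c1 are their scaling corrections.
    """
    cd = (df0 * c0 - df1 * c1) / (df0 - df1)
    return (t0 - t1) / cd, df0 - df1

# Less restricted model: values reported in the output above (df = 67).
t1, df1 = 407.134, 67
c1 = scaling_factor(407.134, 319.088)

# More restricted model: HYPOTHETICAL values, for illustration only.
t0, df0 = 421.500, 69
c0 = scaling_factor(421.500, 331.000)

diff, diff_df = scaled_difference(t0, c0, df0, t1, c1, df1)
print(f"scaled difference = {diff:.2f} on {diff_df} df")
```

The result is referred to a chi-square distribution with `diff_df` degrees of freedom. The correction `cd` can turn out negative in small samples, in which case this version of the test is not usable.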

7 Categorical Variable Model
Joreskog: with ordinal variables, “no units of measurement... Variances and covariances have no meaning... the only information we have is counts of cases in each cell of a multiway contingency table.”

8 Categorical Variable Model

13 Categorical Variable Model
Bivariate normality: not testable in the 2x2 case
Issue: zero cells (skipped)
- Too many zero cells: imprecise estimates
- Only one non-zero cell in a row or column: estimation breaks down (in tetrachoric, PRELIS replaces 0 with 0.5; will affect estimates)
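The sensitivity to a zero cell can be seen with a closed-form stand-in for the tetrachoric correlation. The sketch below uses the cos-pi approximation, r = cos(pi / (1 + sqrt(ad/bc))), which is not the maximum likelihood estimate PRELIS computes; the table counts are hypothetical.

```python
import math

def cos_pi_tetrachoric(a, b, c, d):
    """Cos-pi approximation for the 2x2 table [[a, b], [c, d]]."""
    if b * c == 0:
        # Odds ratio is infinite (or undefined when a*d is also zero).
        return 1.0 if a * d > 0 else float("nan")
    odds_ratio = (a * d) / (b * c)
    return math.cos(math.pi / (1.0 + math.sqrt(odds_ratio)))

def replace_zero_cells(cells, value=0.5):
    """Replace empty cells with 0.5, as the slide says PRELIS does."""
    return [value if cell == 0 else cell for cell in cells]

table = [40, 0, 10, 50]   # hypothetical counts with one zero cell
r_raw = cos_pi_tetrachoric(*table)
r_adj = cos_pi_tetrachoric(*replace_zero_cells(table))
print(f"zero cell kept: {r_raw:.3f}   zero replaced by 0.5: {r_adj:.3f}")
```

With the zero cell left in place the approximation sits on the boundary at 1. After the replacement it becomes an interior value that depends on the arbitrary constant 0.5, which is the sense in which the replacement affects the estimates.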

14 Categorical Variable Model
Polychoric correlation very robust to violations of underlying bivariate normality
 - (doctoral dissertation, Ana Quiroga, 1992, Uppsala)
LR chi-square very sensitive
RMSEA measure:
 - no serious effects unless RMSEA > 0.1 (PRELIS will issue warning)

15 Categorical Variable Model
What if underlying bivariate normality does not hold approximately?
 - reduce # of categories
 - eliminate offending variables
 - assess whether it holds conditional on covariates

16 Bivariate data patterns not fitting the model
Columns: Agr St, Agree, Neutr, Dis, Dis St
Agr St: 2010 50
Agree: 1020 105
Neut: 4020 10
Dis: 05204010
Dis St: 0551020

17 Insert if time permits: brief overview of LISREL CVM approach Subdirectory Week4Examples\OrdinalData

18 Bootstrapping
- Hasn’t caught on as much as one might have thought
- Sample with replacement, repeat B times, get a set of values for the parameters and observe the distribution across “draws”
- Typically, bootstrap N = sample N (some literature suggesting m<n might be preferred, but n is standard)
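The resampling logic can be sketched in a few lines. Here the "parameter" is simply a mean, standing in for an SEM estimate, and the data are simulated; `bootstrap` is a helper written for this illustration, not an AMOS or LISREL routine.

```python
import random
import statistics

def bootstrap(sample, statistic, n_draws=500, seed=0):
    """Resample with replacement (bootstrap N = sample N), B = n_draws times."""
    rng = random.Random(seed)
    n = len(sample)
    return [statistic(rng.choices(sample, k=n)) for _ in range(n_draws)]

rng = random.Random(42)
data = [rng.gauss(50.0, 10.0) for _ in range(200)]   # hypothetical sample

draws = bootstrap(data, statistics.mean, n_draws=500)
boot_mean = statistics.mean(draws)            # mean across draws
boot_se = statistics.stdev(draws)             # bootstrap standard error
bias = boot_mean - statistics.mean(data)      # mean across draws minus sample estimate
print(f"SE = {boot_se:.3f}  Mean = {boot_mean:.3f}  Bias = {bias:.3f}")
```

The three printed quantities correspond to the SE, Mean and Bias columns that AMOS reports per parameter; in an SEM application `statistic` would refit the model to each resampled data set.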

19 Bootstrapping
Notes on technique: Yung and Bentler in Marcoulides and Schumacker, Advanced SEM (text supp.) + article in Br. J. Math & Stat Psych. 47: 63-84, 1994
Important development: see Bollen and Stine in Long, Testing Structural Equation Models.

20 Bootstrapping in AMOS
Under analysis options, Bootstrapping tab

Iterations  Method 0  Method 1  Method 2
         1         0         0         0
         2         0         0         0
         3         0         0         0
         4         0         0         0
         5         0         0         0
         6         0         7         0
         7         0        99         0
         8         0       206         0
         9         0       114         0
        10         0        46         0
        11         0        18         0
        12         0         7         0
        13         0         2         0
        14         0         0         0
        15         0         1         0
        16         0         0         0
        17         0         0         0
        18         0         0         0
        19         0         0         0
     Total         0       500         0

0 bootstrap samples were unused because of a singular covariance matrix.
0 bootstrap samples were unused because a solution was not found.
500 usable bootstrap samples were obtained.

21 Bootstrapping in AMOS

                         |--------------------
                 207.235 |*
                 224.559 |
                 241.883 |*
                 259.207 |***
                 276.531 |*******
                 293.855 |**********
                 311.179 |*******************
N = 500          328.503 |****************
Mean = 325.643   345.827 |*************
S. e. = 1.627    363.151 |********
                 380.476 |*****
                 397.800 |***
                 415.124 |*
                 432.448 |
                 449.772 |*
                         |--------------------

22 Bootstrapping in AMOS

Parameter          SE   SE-SE    Mean    Bias  SE-Bias
Relig<---V368    .029    .001    .060   -.001     .001
Relig<---V363    .032    .001    .042             .001
Relig<---V356    .032    .001    .082   -.002     .001
Relig<---V355    .028    .001   -.094   -.003     .001
Relig<---V353    .030    .001    .126    .000     .001
Env2<---Relig    .044    .001    .131   -.003     .002
Env1<---Relig    .043    .001   -.084             .002
Env1<---V368     .035    .001   -.010   -.002     .002
Env2<---V368     .036    .001    .070    .000     .002
Env2<---V363     .039    .001    .063    .000     .002
Env1<---V363     .038    .001   -.111    .000     .002
Env1<---V356     .038    .001   -.145             .002
Env2<---V356     .040    .001    .227   -.002     .002
Env1<---V355     .034    .001    .005    .001     .002

23 Missing Data
The major approaches we discussed last class:
- EM algorithm to “replace” case values and estimate Σ, z
- Nearest neighbor imputation
- FIML

24 The “mechanics” of working with missing data in PRELIS/LISREL
Nearest Neighbor: in PRELIS syntax:
IM (V356 SEX) (V147 V176 V355) VR=.5 XN or XL

25 The “mechanics” of working with missing data in PRELIS/LISREL The “matching variables” should have relatively few missing cases (for a given case, imputation will fail if any of the matching variables is missing). Matching variables may include variables in the “imputed variables” list (though if any of these variables has a large number of missing cases, this would not be a good idea).
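The role of the matching variables and of the variance ratio can be sketched as follows. This is a simplified illustration loosely modeled on the behaviour described here, not PRELIS's actual algorithm: the distance measure, the tie rule and the function name `impute_matching` are all assumptions.

```python
import numpy as np

def impute_matching(data, target, matching, vr=0.5):
    """Impute column `target` from the donor cases closest on `matching`."""
    x = np.array(data, dtype=float)
    m = x[:, matching]
    z = (m - np.nanmean(m, axis=0)) / np.nanstd(m, axis=0)   # standardized matching variables
    total_var = np.nanvar(x[:, target])
    # Donors: target observed and every matching variable observed.
    donors = np.flatnonzero(~np.isnan(x[:, target]) & ~np.isnan(m).any(axis=1))
    log = {}
    for i in np.flatnonzero(np.isnan(x[:, target])):
        case = int(i)
        if np.isnan(z[i]).any():
            log[case] = "not imputed: missing values for matching variables"
            continue
        dist = ((z[donors] - z[i]) ** 2).sum(axis=1)
        nearest = donors[np.isclose(dist, dist.min())]       # all equally close donors
        values = x[nearest, target]
        ratio = values.var() / total_var if len(values) > 1 else 0.0
        if ratio < vr:
            x[i, target] = values.mean()
            log[case] = f"imputed (variance ratio {ratio:.3f}, NM={len(values)})"
        else:
            log[case] = f"not imputed (variance ratio {ratio:.3f}, NM={len(values)})"
    return x, log

nan = float("nan")
rows = [                   # hypothetical cases: two matching variables, one target
    [1.0, 1.0, 3.0],
    [1.0, 1.0, nan],       # one closest donor
    [5.0, 5.0, 7.0],
    [5.0, 5.0, 1.0],
    [5.0, 5.0, nan],       # two closest donors that disagree strongly
    [nan, 2.0, nan],       # a matching variable is missing
    [3.0, 3.0, 4.0],
]
completed, log = impute_matching(rows, target=2, matching=[0, 1])
for case, message in log.items():
    print(case, message)
```

The three outcomes mirror the three kinds of message in a PRELIS imputation listing: imputed, not imputed because the donors disagree too much, and not imputed because a matching variable is missing.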

26 PRELIS imputation Can save results of imputation in raw data file

27 Imputation
It is even possible to then re-run PRELIS and do other imputations (the raw data file would need to be read back into PRELIS). Although not advised, a variable that has been imputed can now be used as a “matching variable”. It is also possible to make another attempt at imputation for the same variable using different “matching variables”.

28 Sample listing (IM)
SAMPLE listing:
Case 13 imputed with value 7 (Variance Ratio = 0.000), NM= 1
Case 14 not imputed because of Variance Ratio = 0.939 (NM= 2)
Case 21 not imputed because of missing values for matching variables

Number of Missing Values per Variable After Imputation

       V9     V147     V151     V175     V176     V304     V305     V307
 -------- -------- -------- -------- -------- -------- -------- --------
       16       13       54       38        9       21       35       56

     V308     V309     V310     V355     V356      SEX     OCC1     OCC2
 -------- -------- -------- -------- -------- -------- -------- --------
       32       37       36       29       62       13        0        0

     OCC3     OCC4     OCC5
 -------- -------- --------
        0        0        0

Distribution of Missing Values
Total Sample Size = 1839

Number of Missing Values     0    1    2    3    4    5    6    7    8    9
Number of Cases           1584  162   50   17   10    5    7    2    1    1

29 EM algorithm: PRELIS

30 EM algorithm: PRELIS syntax:
!PRELIS SYNTAX: Can be edited
SY='G:\Missing\USA5.PSF'
SE 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
EM CC = 0.00001 IT = 200
OU MA=CM SM=emcovar1.cov RA=usa6.psf AC=emcovar1.acm XT XM

-------------------------------
EM Algoritm for missing Data:
-------------------------------
Number of different missing-value patterns= 80
Convergence of EM-algorithm in 4 iterations
-2 Ln(L) = 98714.48572
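What the EM step does can be sketched for multivariate normal data: the E-step fills in conditional expectations of the missing values (plus their conditional covariance), and the M-step recomputes the means and covariance matrix. This is a minimal sketch and not the PRELIS or SAS implementation; the function name, the starting values (available-case means and a diagonal covariance matrix) and the simulated data are assumptions. The `tol` and `max_iter` arguments play the role of CC and IT in the syntax above.

```python
import numpy as np

def em_mvn(data, tol=1e-5, max_iter=200):
    """EM estimates of the mean vector and covariance matrix; NaN = missing."""
    x = np.asarray(data, dtype=float)
    n, p = x.shape
    mu = np.nanmean(x, axis=0)                 # available-case means
    sigma = np.diag(np.nanvar(x, axis=0))      # diagonal starting covariance
    for _ in range(max_iter):
        sum_x = np.zeros(p)
        sum_xx = np.zeros((p, p))
        for row in x:
            mis = np.isnan(row)
            obs = ~mis
            filled = row.copy()
            extra = np.zeros((p, p))           # conditional covariance of the missing part
            if mis.any() and obs.any():
                # Regression of the missing on the observed variables.
                coef = np.linalg.solve(sigma[np.ix_(obs, obs)], sigma[np.ix_(obs, mis)]).T
                filled[mis] = mu[mis] + coef @ (row[obs] - mu[obs])
                extra[np.ix_(mis, mis)] = sigma[np.ix_(mis, mis)] - coef @ sigma[np.ix_(obs, mis)]
            elif mis.any():                    # nothing observed for this case
                filled = mu.copy()
                extra = sigma.copy()
            sum_x += filled
            sum_xx += np.outer(filled, filled) + extra
        new_mu = sum_x / n
        new_sigma = sum_xx / n - np.outer(new_mu, new_mu)
        change = max(np.abs(new_mu - mu).max(), np.abs(new_sigma - sigma).max())
        mu, sigma = new_mu, new_sigma
        if change < tol:
            break
    return mu, sigma

rng = np.random.default_rng(3)
mixing = np.array([[1.0, 0.5, 0.2], [0.0, 1.0, 0.3], [0.0, 0.0, 1.0]])
complete = rng.normal(size=(300, 3)) @ mixing   # hypothetical complete data
holed = complete.copy()
holed[rng.random(holed.shape) < 0.1] = np.nan    # delete about 10% of values at random
mu_em, sigma_em = em_mvn(holed)
print(np.round(mu_em, 3))
print(np.round(sigma_em, 3))
```

The resulting covariance matrix divides by N rather than N - 1, and it can be passed to SEM software as if it were an ordinary covariance matrix, which is what the SM= output file above is used for.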

31 Multiple Group Approach
Allison, Soc. Methods & Res., 1987
Bollen, p. 374 (uses old LISREL matrix notation)

32 Multiple Group Approach Note: 13 elements of matrix have “pseudo” values - 13 df

33 Multiple group approach Disadvantage: - Works only with a relatively small number of missing patterns
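Whether the multiple group approach is practical can be judged by counting the distinct missing-data patterns first, since each pattern becomes its own group. A minimal sketch with hypothetical cases (`None` marks a missing value; the helper name is an assumption):

```python
from collections import Counter

def missing_patterns(rows):
    """Count cases per missing-data pattern ('X' = observed, '.' = missing)."""
    return Counter("".join("." if v is None else "X" for v in row) for row in rows)

# Hypothetical cases on three variables.
cases = [
    (1, 4, 2), (2, None, 3), (5, 1, 1),
    (3, None, 2), (None, 2, 4), (4, 4, 4),
]
patterns = missing_patterns(cases)
for pattern, freq in patterns.most_common():
    print(pattern, freq)
print("number of patterns:", len(patterns))
```

With many patterns, several of them held by a single case, the group-per-pattern setup becomes unworkable.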

34 Other missing data option: FIML estimation (LISREL implementation)
LISREL PROGRAM FOR SEXUAL MORALITY AND RELIGIOSITY EXAMPLE
DA NI=19 NO=1839 MA=CM
RA FI='G:\MISSING\USA1.PSF'

--------------------------------
EM Algorithm for missing Data:
--------------------------------
Number of different missing-value patterns= 80
Convergence of EM-algorithm in 5 iterations
-2 Ln(L) = 98714.48567
Percentage missing values= 1.81
Note: The Covariances and/or Means to be analyzed are estimated by the EM procedure and are only used to obtain starting values for the FIML procedure

SE
V9 V151 V175 V176 V147 V304 V305 V307 V308 V309 V310 V355 V356 SEX/
MO NY=11 NE=2 LY=FU,FI PS=SY TE=SY BE=FU,FI NX=3 NK=3 LX=ID C
PH=SY,FR TD=ZE GA=FU,FR
VA 1.0 LY 5 1 LY 8 2
FR LY 1 1 LY 2 1 LY 3 1 LY 4 1
FR LY 11 2 LY 7 2 LY 6 2 LY 9 2 LY 10 2
FR BE 2 1
OU ME=ML MI SC ND=4

35 FIML
GAMMA
                V355       V356        SEX
            --------   --------   --------
   ETA 1     -0.0137     0.0604     0.4172
            (0.0024)   (0.0202)   (0.0828)
             -5.7192     2.9812     5.0358
   ETA 2     -0.0066     0.1583    -0.3198
            (0.0025)   (0.0215)   (0.0871)
             -2.6128     7.3654    -3.6705

GAMMA -- regular ML, listwise
                 AGE       EDUC        SEX
            --------   --------   --------
   ETA 1     -0.0130     0.0732     0.4257
            (0.0025)   (0.0205)   (0.0904)
             -5.2198     3.5626     4.7098
   ETA 2     -0.0076     0.1562    -0.3112
            (0.0028)   (0.0227)   (0.0970)
             -2.7180     6.8715    -3.2087

36 FIML (also referred to as “direct ML”)
- Available in AMOS and in LISREL
- AMOS implementation fairly easy to use (check off means and intercepts, input data with missing cases and … voila!)
- LISREL implementation a bit more difficult: must input raw data from PRELIS into LISREL
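The idea behind direct ML is that each case contributes a normal log-likelihood based only on the variables it has observed, so nothing is imputed or deleted. A minimal sketch of the fit function for given model-implied means and covariances follows; the function name and the tiny data set are assumptions, and in practice the software minimizes this quantity over the model parameters.

```python
import numpy as np

def fiml_minus2_loglik(data, mu, sigma):
    """-2 ln L summed casewise over the observed part of each case (NaN = missing)."""
    x = np.asarray(data, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    total = 0.0
    for row in x:
        obs = ~np.isnan(row)
        k = int(obs.sum())
        if k == 0:
            continue                              # a fully missing case carries no information
        resid = row[obs] - mu[obs]
        s = sigma[np.ix_(obs, obs)]               # implied covariances of the observed variables
        _, logdet = np.linalg.slogdet(s)
        total += k * np.log(2 * np.pi) + logdet + resid @ np.linalg.solve(s, resid)
    return total

nan = float("nan")
cases = [[1.0, nan], [0.0, 0.0]]                  # hypothetical data: first case lacks variable 2
value = fiml_minus2_loglik(cases, mu=[0.0, 0.0], sigma=np.eye(2))
print(round(value, 4))
```

The -2 Ln(L) line in the LISREL output is a quantity of this kind, evaluated at the EM estimates of the unrestricted means and covariances.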

37 FIML

38 FIML

39 FIML

40 (INSERT PRELIS/LISREL DEMO HERE)
- EM covariance matrix
- Nearest neighbour imputation
- FIML

41 EM algorithm: in SAS PROC MI Example: religiosity/morality problem. /Week4Examples/MissingData/SAS SASMIProc1.sas

42 SAS MI procedure
libname in1 'e:\classes\icpsr2005\Week4Examples\MissingData 2\SAS';
data one; set in1.wvssub3a;
proc mi; em outem=in1.cov;
  var V9 V151 V175 V176 V147 V304 V305 V307 V308 V309 V310 v355 v356 SEX;
run;
proc calis data=in1.cov cov mod;
[calis procedure specifications]

43 SAS MI procedure
Data Set                            WORK.ONE
Method                              MCMC
Multiple Imputation Chain           Single Chain
Initial Estimates for MCMC          EM Posterior Mode
Start                               Starting Value
Prior                               Jeffreys
Number of Imputations               5
Number of Burn-in Iterations        200
Number of Iterations                100
Seed for random number generator    1254

Missing Data Patterns
Group   V9 V151 V175 V176 V147 V304 V305 V307 V308 V309 V310 V355 V356  SEX  Freq
    1    X    X    X    X    X    X    X    X    X    X    X    X    X    X  1456
    2    X    X    X    X    X    X    X    X    X    X    X    X    .    X   173
    3    X    X    X    X    X    X    X    X    X    X    X    .    X    X    10

44 SAS MI procedure
Missing Data Patterns
Group   V9 V151 V175 V176 V147 V304 V305 V307 V308 V309 V310 V355 V356  SEX  Freq
    4    X    X    X    X    X    X    X    X    X    X    X    .    .    X    10
    5    X    X    X    X    X    X    X    X    X    X    .    X    X    X     5
    6    X    X    X    X    X    X    X    X    X    .    X    X    X    X     9
    7    X    X    X    X    X    X    X    X    X    .    X    X    .    X     1
    8    X    X    X    X    X    X    X    X    X    .    .    X    X    X     2
    9    X    X    X    X    X    X    X    X    .    X    X    X    X    X     3
   10    X    X    X    X    X    X    X    X    .    X    .    X    X    X     1
   11    X    X    X    X    X    X    X    .    X    X    X    X    X    X    13
   12    X    X    X    X    X    X    X    .    X    X    X    X    .    X     2
   13    X    X    X    X    X    X    X    .    X    X    X    .    .    X     1
   14    X    X    X    X    X    X    X    .    X    X    .    X    X    X     3
   15    X    X    X    X    X    X    X    .    X    X    .    .    X    X     1
   16    X    X    X    X    X    X    X    .    X    .    X    X    X    X     1
   17    X    X    X    X    X    X    X    .    X    .    X    X    .    X     1

45 SAS MI procedure
Initial Parameter Estimates for EM
_TYPE_  _NAME_         V9       V151       V175       V176       V147
MEAN             1.720790   1.174790   1.414770   8.058470   3.958927

Initial Parameter Estimates for EM
      V304       V305       V307       V308       V309       V310       V355
  1.876238   2.151885   3.049916   2.395683   4.001110   4.896284  46.792265

Initial Parameter Estimates for EM
      V356        SEX
  7.775246   0.489396

46 SAS MI procedure
Initial Parameter Estimates for EM
_TYPE_  _NAME_         V9       V151       V175       V176       V147
COV     V9       0.808388          0          0          0          0
COV     V151            0   0.168983          0          0          0
COV     V175            0          0   0.483982          0          0
COV     V176            0          0          0   6.783348          0
COV     V147            0          0          0          0   6.575298

47 SAS MI procedure
EM (MLE) Parameter Estimates
_TYPE_  _NAME_         V9       V151       V175       V176       V147
MEAN             1.721840   1.180968   1.420315   8.046136   3.959583
COV     V9       0.807215   0.184412   0.307067  -1.599731   1.301326
COV     V151     0.184412   0.170271   0.137480  -0.626684   0.454568
COV     V175     0.307067   0.137480   0.485803  -1.073616   0.753307
COV     V176    -1.599731  -0.626684  -1.073616   6.805023  -3.428576
COV     V147     1.301326   0.454568   0.753307  -3.428576   6.567477
COV     V304     0.390792   0.165856   0.263160  -1.368173   1.069671
COV     V305     0.455902   0.114129   0.249936  -1.353161   0.993579

48 SAS PROC mi
Multiple Imputation Variance Information

           ---------------Variance---------------             Relative      Fraction
                                                               Increase       Missing
Variable       Between        Within         Total        DF  in Variance  Information
V9         0.000000239      0.000439      0.000439    1834.4     0.000653     0.000653
V151       0.000002904   0.000092789   0.000096275    1120.1     0.037561     0.036832
V175       0.000002180      0.000264      0.000266    1741.7     0.009913     0.009863
V176       0.000002364      0.003710      0.003713    1834.1     0.000765     0.000764
V147       0.000025982      0.003571      0.003602    1760.1     0.008731     0.008692
V304       0.000002260      0.001621      0.001623    1830.6     0.001674     0.001672
V305       0.000034129      0.001946      0.001987    1509.7     0.021050     0.020824
V307       0.000027451      0.003995      0.004028    1767.2     0.008245     0.008211
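The Total and Relative Increase columns follow from the Between and Within columns by the standard combining rules for multiple imputation: total = within + (1 + 1/m) * between, with m the number of imputations. The sketch below checks this against the V151 row; the function name is an assumption, and the DF and Fraction Missing Information columns involve further adjustments that are not reproduced here.

```python
def combine_variance(between, within, n_imputations):
    """Total variance and relative increase in variance due to missing data."""
    inflation = (1.0 + 1.0 / n_imputations) * between
    total = within + inflation
    relative_increase = inflation / within
    return total, relative_increase

# V151 row of the table above, with 5 imputations.
total_v151, increase_v151 = combine_variance(0.000002904, 0.000092789, 5)
print(f"total = {total_v151:.9f}   relative increase = {increase_v151:.6f}")
```

The small differences from the printed values in the last digits come from rounding of the Between and Within figures in the listing.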

49 SAS PROC mi
SAS log:
115  proc mi; em outem=in1.cov; var
NOTE: This is an experimental version of the MI procedure.
116  V9 V151 V175 V176 V147 V304 V305 V307 V308 V309 V310 v355 v356 SEX; run;
NOTE: The data set IN1.COV has 15 observations and 16 variables.
NOTE: PROCEDURE MI used:
      real time 2.77 seconds
      cpu time 2.65 seconds

50 CALIS (SAS)
proc calis data=in1.cov cov nobs=1836 mod;   /* nobs= not needed if working with raw data */
lineqs
  v9   = 1.0 F1 + e1,
  V175 = b1 F1 + e2,
  V176 = b2 F1 + e3,
  V147 = b3 F1 + e4,
  V304 = 1.0 F2 + e5,
  V305 = b4 F2 + e6,
  V307 = b5 F2 + e7,
  V308 = b6 F2 + e8,
  V309 = b7 F2 + e9,
  V310 = b8 F2 + e10,
  F1 = b9 V355 + b10 V356 + b11 SEX + d1,
  F2 = b12 V355 + b13 V356 + b14 SEX + d2;
std
  e1-e10 = errvar:,   /* special convention for more than 1 at a time (generates warning msg.) */
  v355 = vv355, v356 = vv356, sex = vsex,
  d1 = vd1, d2 = vd2;
cov
  d1 d2 = covD1D2;
run;

51 SAS - CALIS
The CALIS Procedure
Covariance Structure Analysis: Maximum Likelihood Estimation
Manifest Variable Equations with Estimates

V9      =   1.0000 F1  + 1.0000 e1
V175    =   0.6232*F1  + 1.0000 e2
Std Err     0.0223 b1
t Value    27.9048
V176    =  -3.0284*F1  + 1.0000 e3
Std Err     0.0835 b2
t Value   -36.2766
V147    =   2.2839*F1  + 1.0000 e4
Std Err     0.0822 b3
t Value    27.7987
V304    =   1.0000 F2  + 1.0000 e5
V305    =   1.0732*F2  + 1.0000 e6
Std Err     0.0671 b4
t Value    15.9949
V307    =   2.1959*F2  + 1.0000 e7
Std Err     0.1118 b5
t Value    19.6468
V308    =   1.6376*F2  + 1.0000 e8
Std Err     0.0863 b6
t Value    18.9819
V309    =   2.3768*F2  + 1.0000 e9
Std Err     0.1184 b7
t Value    20.0708
V310    =   1.9628*F2  + 1.0000 e10
Std Err     0.1037 b8
t Value    18.9346

52 SAS - CALIS
Variances of Exogenous Variables

                                      Standard
Variable  Parameter     Estimate         Error    t Value
V355      vv355        314.89289       9.85535      31.95
V356      vv356          4.80150       0.14968      32.08
SEX       vsex           0.24989       0.00821      30.44
e1        errvar1        0.27695       0.01365      20.29
e2        errvar2        0.27984       0.01049      26.67
e3        errvar3        1.94179       0.11086      17.52
e4        errvar4        3.80162       0.14232      26.71
e5        errvar5        2.20316       0.07811      28.21
e6        errvar6        2.68880       0.09493      28.32
e7        errvar7        3.58210       0.14913      24.02
e8        errvar8        2.60696       0.10216      25.52
e9        errvar9        3.40464       0.15093      22.56
e10       errvar10       3.81099       0.14885      25.60
d1        vd1            0.50262       0.02572      19.54
d2        vd2            0.70781       0.06601      10.72

53 SAS - CALIS
Lagrange Multiplier and Wald Test Indices _PHI_ [15:15]
Symmetric Matrix
Univariate Tests for Constant Constraints
Lagrange Multiplier or Wald Index / Probability / Approx Change of Value

              V355        V356         SEX          e1          e2
V355     1020.8953      0.0000      0.0000      0.1671      9.0161
                 .      1.0000      1.0000      0.6827      0.0027
                 .      0.0000     -0.0000     -0.1071      0.6931
           [vv355]
V356        0.0000   1029.0776      0.0000      3.6310      0.4885
            1.0000           .      1.0000      0.0567      0.4846
            0.0000           .      0.0000      0.0610     -0.0197
                       [vv356]
SEX         0.0000      0.0000    926.8575      1.3898      0.1477
            1.0000      1.0000           .      0.2384      0.7007
           -0.0000      0.0000           .      0.0089      0.0026
                                    [vsex]
e1          0.1671      3.6310      1.3898    411.7985     29.2403
            0.6827      0.0567      0.2384           .      0.0000
           -0.1071      0.0610      0.0089           .     -0.0528
                                             [errvar1]
e2          9.0161      0.4885      0.1477     29.2403    711.2491
            0.0027      0.4846      0.7007      0.0000           .
            0.6931     -0.0197      0.0026     -0.0528           .
                                                         [errvar2]

54 A general “what to do when” outline (see handout)

55 Longitudinal data
I. Modeling of latent variable mean differences over time
II. More complicated tests (linear growth, quadratic growth, etc.)

56 Applications to longitudinal data
Basic model for assessing latent variable mean change (can run this model on X or Y side in LISREL)
Equations:
X1 = a1 + 1.0 L1 + e1
X2 = a2 + b1 L1 + e2
X3 = a3 + b2 L1 + e3
X4 = a4 + 1.0 L2 + e4
X5 = a5 + b3 L2 + e5
X6 = a6 + b4 L2 + e6
Constraints:
b1=b3, b2=b4 (LX=IN)
a1=a4, a2=a5, a3=a6 (TX=IN)
ka1 = 0, ka2 = (to be estimated)

57 Applications to longitudinal data
Basic model for assessing latent variable mean change (can run this model on X or Y side in LISREL)
Equations:
X1 = a1 + 1.0 L1 + e1
X2 = a2 + b1 L1 + e2
X3 = a3 + b2 L1 + e3
X4 = a4 + 1.0 L2 + e4
X5 = a5 + b3 L2 + e5
X6 = a6 + b4 L2 + e6
Constraints:
b1=b3, b2=b4 (LX=IN)
a1=a4, a2=a5, a3=a6 (TX=IN)
ka1 = 0, ka2 = (to be estimated)
Correlated errors

58 Applications to longitudinal data
Model for assessing latent variable mean change
Usual parameter constraints: TX(1)=TX(4)=TX(7)
LISREL: EQ TX 1 TX 4 TX 7
AMOS: same parameter name

59 Applications to longitudinal data
Model for assessing latent variable mean change
Usual parameter constraints: TX(1)=TX(4)=TX(7)
LISREL: EQ TX 1 TX 4 TX 7
AMOS: same parameter name
KA(1) = 0
KA(2) = mean difference parameter #1
KA(3) = mean difference parameter #2
LISREL: KA=FI group 1, KA=FR groups 2,3
In AMOS:

60 Applications to longitudinal data
Model for assessing latent variable mean change
Usual parameter constraints: TX(1)=TX(4)=TX(7)
LISREL: EQ TX 1 TX 4 TX 7
AMOS: same parameter name
KA(1) = 0
KA(2) = mean difference parameter #1
KA(3) = mean difference parameter #2
LISREL: KA=FI group 1, KA=FR groups 2,3
Some tests (ka1 and ka2 here denote mean difference parameters #1 and #2, i.e. KA(2) and KA(3)):
- Test for change: H0: ka1=ka2=0
- Linear change model: ka2 = 2*ka1
- Quadratic change model: ka2 = 4*ka1
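What these constraints imply for the observed means can be sketched directly: with intercepts and loadings equal across waves, the mean of indicator j at wave t is intercept_j + loading_j * (latent mean at wave t). All numeric values below are hypothetical, and the helper name is an assumption.

```python
def implied_means(intercepts, loadings, latent_means):
    """Indicator means wave by wave: intercept_j + loading_j * latent mean_t."""
    return [[tau + lam * kappa for tau, lam in zip(intercepts, loadings)]
            for kappa in latent_means]

tau = [2.0, 1.5, 3.0]    # hypothetical intercepts, constrained equal across waves
lam = [1.0, 0.8, 1.2]    # hypothetical loadings, equal across waves (first fixed at 1.0)
diff1 = 0.30             # hypothetical mean difference parameter #1

# Unrestricted change: the second difference is free (0.75 is hypothetical).
free = implied_means(tau, lam, [0.0, diff1, 0.75])
# Linear change model: the second difference is constrained to twice the first.
linear = implied_means(tau, lam, [0.0, diff1, 2 * diff1])

for wave, means in enumerate(linear, start=1):
    print(wave, [round(m, 3) for m in means])
```

The linear model has one parameter fewer than the unrestricted one, so the two can be compared with a 1 df chi-square difference test.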

61 As a causal model: Beta 1 “stability coefficient”
- Stability coefficient is high if relative rankings are preserved, even if there has been massive change with respect to means
- In a model with AL(1)=0 and AL(2)=free, can have a high Beta(2,1) with a) AL(1)=AL(2) or b) AL(1) massively different from AL(2)
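A small simulation makes the point that stability and mean change are separate questions. The data are hypothetical, and a plain correlation stands in for the standardized stability coefficient.

```python
import random

def pearson(x, y):
    """Pearson correlation between two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(7)
time1 = [rng.gauss(0.0, 1.0) for _ in range(1000)]
# Time 2 with rankings largely preserved and no change in mean level.
same_mean = [0.9 * v + rng.gauss(0.0, 0.3) for v in time1]
# Same time 2 scores, but everyone shifted up by 5 units.
shifted = [v + 5.0 for v in same_mean]

r_same = pearson(time1, same_mean)
r_shift = pearson(time1, shifted)
mean_change = sum(shifted) / len(shifted) - sum(time1) / len(time1)
print(f"r (no mean change) = {r_same:.3f}   r (mean up by 5) = {r_shift:.3f}")
```

Both correlations are the same, because adding a constant to every time 2 score leaves the rankings untouched; only the latent mean parameter picks up the shift.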

62 Causal models:
- Ksi-2 as lagged (time 1) version of eta-1 (could re-specify as an eta variable)
- Temporal order in Ksi-1 → Eta-1 relationship

63 Causal models: Cross-lagged panel coefficients [Reduced form of model on next slide]

64 Causal models: Reciprocal effects, using lagged values to achieve model identification

65 Causal models: A variant Issue: what does ga(1,1) mean given concern over causal direction?

66 Lagged and contemporaneous effects This model is underidentified

67 Lagged and contemporaneous effects Three wave model with constraints:

68 Lagged effects model Ksi-1 could be an “event” 1/0 dummy variable

69 Lagged effects model

70 Re-expressing parameters: GROWTH CURVE MODELS
- Intercept & linear (& sometimes quadratic) terms
- Exogenous variables
- Alternative: HLM, with subjects as level-2 and observations within subjects as level-1 (mixed models: discussed elsewhere)
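In the SEM version of a growth curve model, the intercept, linear and quadratic terms are latent factors whose loadings are fixed to a time coding rather than estimated. A minimal sketch of that loading pattern and the wave means it implies (the coding 0, 1, 2, ... and all numeric values are assumptions for illustration):

```python
def growth_loadings(n_waves, quadratic=False):
    """Fixed loadings per wave: intercept (1), linear (t), optional quadratic (t*t)."""
    rows = []
    for t in range(n_waves):
        row = [1, t]
        if quadratic:
            row.append(t * t)
        rows.append(row)
    return rows

def implied_wave_means(loadings, factor_means):
    """Mean at each wave = loadings times the growth factor means."""
    return [sum(l * m for l, m in zip(row, factor_means)) for row in loadings]

linear = growth_loadings(4)
quad = growth_loadings(4, quadratic=True)
# Hypothetical factor means: intercept 10, linear slope 2, quadratic term -0.25.
means = implied_wave_means(quad, [10.0, 2.0, -0.25])
print(quad)
print(means)
```

The quadratic loadings 0, 1, 4 are the source of the factor 4 in the quadratic change constraint used for latent mean differences.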
