Session 1
1 Check installations
2 Open Mplus
3 Type basic commands
4 Get data read in, spat out & read in again
5 Run an analysis
6 What has it done?
Session 2
1 Mplus input file command structures
2 Mplus conventions
3 Mplus punctuation: CaSeS ; : (.) & green!
4 Common typos in input
5 Mplus output file structure
6 Output options inc. saves and plots
DATA: Dunn, Statistics in Psychiatry
GHQ12 at T1 and T2, clinical psychology students
General Health Questionnaire
–Short self-report questionnaire
–Used to screen for common mental disorder (anxiety and depression)
Columns in file: Odd1 Even1 Sum1 Case1 Odd2 Even2 Sum2 Case2
Warning, detour ahead …
Getting (past) the basics …
The importance of total control over data
The value of learning simple FORTRAN formatting statements
–LISCOMP, the predecessor to Mplus, and Mplus itself are written in FORTRAN
Reading data in uses simple conventions: F / I / G / E / X / T
–A real number occupying six columns could be read as F6
–92 read as F2.1 would be 9.2
»Width in columns for the real number
»then number of numerals that appear after the decimal point
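The Fw.d rule above can be tried out in a few lines of Python. This is only a sketch of the convention (w columns wide, d digits after an implied decimal point), not Mplus or FORTRAN code; the function name read_f_field is made up for illustration, and blank fields are not handled.

```python
def read_f_field(field: str, decimals: int) -> float:
    """Read one fixed-width field the way an Fw.d descriptor would.

    Illustrative helper only (hypothetical name). If the field already
    contains a decimal point, that point is used; otherwise the last
    `decimals` digits are taken to follow an implied decimal point.
    """
    text = field.strip()
    if "." in text:
        return float(text)
    return int(text) / (10 ** decimals)


print(read_f_field("92", 1))      # two columns, one implied decimal place
print(read_f_field("9.2", 1))     # explicit decimal point is kept as typed
print(read_f_field("   123", 0))  # six columns, no decimals
```

So the two digits 92 read under F2.1 come back as 9.2, while a field that already contains a decimal point is read as typed.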
Overcoming limitations of the Mplus demo
–It has a limit on the number of variables
Depending on what analysis you are doing, these limits are
–4 variables
–6 variables
We shall (largely) work within them
–But we can also beat them to make life easier for you
FORTRAN FORMAT STATEMENTS: I, X, T and /
I1: integer, single digit
4X: skip four columns without reading anything
–Can jump over data in the middle of a file this way
»i.e. columns in text files can be ignored
T10: jump to column ten and start reading there, then process the rest of the format instructions
–Can jump over data at the start of a file this way
/: jump to the next line (when the data have more than one “record”, i.e. line of data, per individual)
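To see what I, X, T and / do to a record, here is a minimal Python sketch that applies a space-separated list of such descriptors to the lines belonging to one individual. It is an illustration of the conventions only, not how Mplus reads data internally; the function name, the example record and the format string are all invented for the demonstration.

```python
import re


def read_record(lines, fmt):
    """Apply descriptors (Iw, nX, Tn, /) to the text lines of one record.

    Hypothetical helper for illustration: `lines` holds the physical
    lines for one individual, `fmt` is a space-separated format.
    """
    values = []
    line_no = 0  # which physical line of the record we are reading
    col = 0      # zero-based column position within that line
    for token in fmt.upper().split():
        if token == "/":
            line_no += 1          # move to the next line of the record
            col = 0
        elif m := re.fullmatch(r"(\d+)X", token):
            col += int(m.group(1))        # skip columns without reading
        elif m := re.fullmatch(r"T(\d+)", token):
            col = int(m.group(1)) - 1     # tab to a 1-based column number
        elif m := re.fullmatch(r"I(\d+)", token):
            width = int(m.group(1))
            values.append(int(lines[line_no][col:col + width]))
            col += width
        else:
            raise ValueError(f"unsupported descriptor: {token}")
    return values


# Made-up record: 9 columns of id text, two I4 fields, 4 ignored columns,
# one more I4 field, then a second line holding a single digit.
line1 = "SUBJ00001" + "   3" + "   4" + "XXXX" + "  12"
line2 = "7"
print(read_record([line1, line2], "T10 I4 I4 4X I4 / I1"))
```

Here T10 jumps over the id text at the start of the line, 4X jumps over the unwanted columns in the middle, and / moves on to the second line of the record.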
Dunn SiP Book GHQ12 from clinical psychology students
Odd1 Even1 Sum1 Case1 Odd2 Even2 Sum2 Case
F5   F5    F5   F6    F5   F6    F5   F5
Numbered columns for field widths (guide for eye and FORTRAN syntax)
Specifications for formatted input, some F5 and some F6, i.e. (3F5,F6,F5,F6,2F5)
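A repeat count such as 3F5 is shorthand for F5 F5 F5. The Python sketch below expands the format (3F5,F6,F5,F6,2F5) into one width per variable and works out which columns each field occupies, which can help when checking a format against the numbered column guide. The helper names are hypothetical and only F descriptors are handled.

```python
import re


def expand_widths(fmt):
    """Expand a format like '(3F5,F6,F5,F6,2F5)' into a list of widths."""
    widths = []
    for item in fmt.strip().strip("()").split(","):
        m = re.fullmatch(r"\s*(\d*)F(\d+)(?:\.\d+)?\s*", item.upper())
        if m is None:
            raise ValueError(f"unsupported descriptor: {item!r}")
        repeat = int(m.group(1) or 1)  # a missing repeat count means 1
        widths.extend([int(m.group(2))] * repeat)
    return widths


def column_ranges(widths):
    """Return 1-based (first, last) column numbers for each field."""
    ranges = []
    start = 1
    for width in widths:
        ranges.append((start, start + width - 1))
        start += width
    return ranges


names = ["Odd1", "Even1", "Sum1", "Case1", "Odd2", "Even2", "Sum2", "Case"]
widths = expand_widths("(3F5,F6,F5,F6,2F5)")
for name, (first, last) in zip(names, column_ranges(widths)):
    print(f"{name:6s} columns {first:2d}-{last:2d}")
```

The eight widths add up to 42 columns, so each record in a file read with this format would need to be at least that long.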
End of detour …
Analysis time
Time to begin!
–A first orientation to Mplus in action: data, input file syntax, output, plot
Actually just doing data transformation at this stage
–Not doing any analysis
DATA: FILE IS "c:\dunn_ghqoddeven12.dat";
      FORMAT IS I4 I4 4x I4 I4 I4 4x I4;
Columns in file: Odd1 Even1 Sum1 Case1 Odd2 Even2 Sum2 Case
This file is dunn_GHQ12_T1T2_ClinPsych_SiP.inp
DATA: FILE IS "c:\dunn_ghqoddeven12.dat";
      FORMAT IS I4 I4 4x I4 I4 I4 4x I4;
DEFINE: sum1 = odd1 + even1;
        diff1 = odd1 - even1;
        !sum2 = odd2 + even2;
        !diff2 = odd2 - even2;
VARIABLE: NAMES ARE odd1 even1 case1 odd2 even2 case2;
          USEVARIABLES ARE sum1 diff1;
Columns in file: Odd1 Even1 Sum1 Case1 Odd2 Even2 Sum2 Case
Note: the file actually contains SUM1 & SUM2 columns; the two 4x entries in FORMAT skip over them, and DEFINE rebuilds sum1
WARNING - more syntax below in the file estimates a correlation and produces a scatter plot
This file is dunn_GHQ12_T1T2_ClinPsych_SiP.inp
ANALYSIS: ESTIMATOR=ML;
MODEL: sum1 with diff1;
       !sum1 with diff1 sum2 diff2;
       ! diff1 with sum2 diff2;
       ! sum2 with diff2;
OUTPUT: STDY SAMPSTAT;
PLOT: TYPE IS PLOT1;
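For readers who want to check what the DEFINE and MODEL steps amount to, the sketch below redoes them by hand in Python: it builds a sum and a difference from odd and even half-scores and computes their Pearson correlation. The scores are invented for illustration and are not the Dunn data. With only two variables and nothing constrained, the standardized (STDY) estimate for sum1 WITH diff1 should correspond to this ordinary correlation.

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical half-scores (NOT the values in dunn_ghqoddeven12.dat)
odd1 = [3, 5, 2, 6, 4, 1]
even1 = [4, 4, 3, 5, 6, 2]

# Same transformations as the DEFINE command
sum1 = [o + e for o, e in zip(odd1, even1)]
diff1 = [o - e for o, e in zip(odd1, even1)]

print("sum1 :", sum1)
print("diff1:", diff1)
print("r(sum1, diff1) =", round(pearson(sum1, diff1), 3))
```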
SUMMARY OF ANALYSIS [dunn_GHQ12_T1T2_ClinPsych_SiP.out]
Number of groups                          1
Number of observations                   12
Number of dependent variables             2
Number of independent variables           0
Number of continuous latent variables     0
Observed dependent variables
  Continuous: SUM1 DIFF1
[dunn_GHQ12_T1T2_ClinPsych_SiP.out]
Estimator                                        ML
Information matrix                               OBSERVED
Maximum number of iterations                     1000
Convergence criterion                            0.500D-04
Maximum number of steepest descent iterations    20
Input data file(s): c:\dunn_ghqoddeven12.dat
Input data format: (I4 I4 4X I4 I4 I4 4X I4)
SAMPLE STATISTICS [dunn_GHQ12_T1T2_ClinPsych_SiP.out]
Means, Covariances and Correlations for SUM1 and DIFF1
THE MODEL ESTIMATION TERMINATED NORMALLY [dunn_GHQ12_T1T2_ClinPsych_SiP.out]
TESTS OF MODEL FIT
Chi-Square Test of Model Fit: Value, Degrees of Freedom (0), P-Value
Chi-Square Test of Model Fit for the Baseline Model: Value, Degrees of Freedom (1), P-Value
CFI/TLI: CFI, TLI
Loglikelihood: H0 Value, H1 Value
MODEL RESULTS [dunn_GHQ12_T1T2_ClinPsych_SiP.out]
Columns: Estimate, S.E., Est./S.E., Two-Tailed P-Value
Rows: SUM1 WITH DIFF1; Means of SUM1 and DIFF1; Variances of SUM1 and DIFF1
STANDARDIZED MODEL RESULTS (STDY Standardization)
Same columns and rows as MODEL RESULTS
[dunn_GHQ12_T1T2_ClinPsych_SiP.out/gph]
Session 2
1 Mplus input file COMMAND STRUCTURES:
2 Mplus conventions
3 Mplus punctuation ; : (.) & green! comment
4 Common typos in input
5 Mplus output file structure
6 Output options inc. saves and plots
Command Structures
TITLE:
DATA:
VARIABLE:
DEFINE:
ANALYSIS:
MODEL:
OUTPUT:
SAVEDATA:
PLOT:
Simple command structures can be built from the GUI
Conventions
Main command structures appear first on lines as BLOCK CAPS followed by a colon:
OTHER COMMANDS THEN FOLLOW either IN CAPS or lowercase;
All statements end with a ; but a statement can run over more than one line; only the command headings end with a colon
We’ve seen this already …
TITLE: DATA: VARIABLE: DEFINE: ANALYSIS: MODEL: OUTPUT: SAVEDATA: PLOT:
Mplus does not mind which order commands come in …
You do not need them all!
Actually you can do a lot with a little!
Mplus is not CaSe SeNsItIvE
! Exclamation marks start comment statements; the editor turns them green
All statements end with a semicolon
The most common typo is probably omitting one of these, or typing two ;;
{ ( [ Mplus parameters ] ) }
Variances or Residual Variances
–Variable name without brackets
[Means] or Thresholds [catvar$1]
–Variable name in square brackets
(round brackets)
–Variable name in round brackets
{Scale factors}
–Variable name in curly brackets
Mplus output file structure - has to be seen to be believed!
OUTPUT: options here govern what you will see in the text output file;
SAVEDATA: options here will determine what else is saved in new text files;
PLOT: options here will enable you to view graphs of certain things;
OUTPUT: and PLOT:
OUTPUT: !many more!
  SAMP
  STAND
  RES !(short for residuals)
  MOD (number)
  CINT !(three types)
  TECHn !(14 types, numbered 1 to 14)
  FSCOEFF
  FSDETERMINACY
  + a few more … don’t forget that final semicolon;
PLOT: TYPE IS PLOT1 PLOT2 or PLOT3; ! That’s about it
SAVEDATA:
You can save
–Your data: either only the subset of variables that you modelled, or these variables plus some more that you want to keep even though you did not use them (e.g. IDVARIABLE for a subject id, or AUXILIARY for other variables such as sex)
You can also save
–Factor scores (appended to your data)
–Latent class memberships
–Cook’s distances or “influence” statistics
There are also other things you can save …
–These depend on what analysis you have constructed
Watch out for the GUI spot here ….. (if you want it)
Watch out for the GUI-2
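GUI doesn’t build this part …
[Path diagram: factors F1 and F2, observed variables V1 to V4, error terms E1 to E4]
GHQ T1 T2 Psychological Distress
[Path diagram: GHQ T1 with indicators Odd 1 and Even 1, GHQ T2 with indicators Odd 2 and Even 2, error terms E1 to E4]
Correlation among GHQ scores at T1 and T2 (could be regression)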
Acronyms / Abbreviations / Fit
Chi-Square [Pearson and Likelihood Ratio]
CFI/TLI
Loglikelihood: H0 Value, H1 Value
Information Criteria
–Akaike (AIC)
–Bayesian (BIC)
–Sample-Size Adjusted BIC (n* = (n + 2) / 24)
RMSEA: Root Mean Square Error Of Approximation
SRMR: Standardized Root Mean Square Residual
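The three information criteria are simple functions of the H0 loglikelihood, the number of free parameters and the sample size. The Python sketch below uses the standard textbook definitions (AIC = -2LL + 2p, BIC = -2LL + p ln n, and the adjusted BIC with n replaced by n* = (n + 2) / 24). The loglikelihood value is made up; the 7 parameters and 12 observations are used only as plausible inputs.

```python
from math import log


def information_criteria(loglik, n_params, n_obs):
    """Return AIC, BIC and sample-size adjusted BIC for a fitted model."""
    minus_2ll = -2.0 * loglik
    n_star = (n_obs + 2) / 24  # adjustment shown on the output: n* = (n + 2) / 24
    return {
        "AIC": minus_2ll + 2 * n_params,
        "BIC": minus_2ll + n_params * log(n_obs),
        "adjusted BIC": minus_2ll + n_params * log(n_star),
    }


# Hypothetical H0 loglikelihood; 7 free parameters, 12 observations
for name, value in information_criteria(-100.0, 7, 12).items():
    print(f"{name:13s} {value:8.3f}")
```

With a sample as small as 12, n* is below 1, so its log is negative and the adjusted BIC falls below -2LL; that is a property of the formula rather than a fault in the output.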
TITLE: Dunn SiP Book GHQ12 from clinical psychology students
       Odd1 Even1 Sum1 Case1 Odd2 Even2 Sum2 Case
DATA: FILE IS "c:\dunn_ghqoddeven12.dat";
      FORMAT IS I4 I4 4x I4 I4 I4 4x I4;
VARIABLE: NAMES ARE odd1 even1 case1 odd2 even2 case2;
          USEVARIABLES ARE odd1 even1 odd2 even2;
ANALYSIS: ESTIMATOR=ML;
MODEL: GHQ_T1 by Odd1*1 (1)
                 Even1 (1);
       GHQ_T2 by Odd2*1 (1)
                 Even2 (1);
       GHQ_T1 with GHQ_T2;
       Odd1 Even1 Odd2 Even2 (2);
OUTPUT: SAMPSTAT RESIDUAL;
PLOT: TYPE IS PLOT3;
Output: INPUT READING TERMINATED NORMALLY
The MODEL statement: basic commands for modelling
BY, as in “latent construct measured BY variable”
–An example would be: Depression BY BDI1-BDI3; ! first loading constrained to 1 by default; else can free it and fix var=1
–… which can be read as “measure the latent factor Depression by three variables from the Beck Depression Inventory” [BDI1-BDI3]
–NB. “through” = hyphen & numbering goes 1-3 not …
ON, as in “regression of y ON x”
–An example would be: Depression ON Sex Age;
–… which can be read as “regress the latent construct Depression (y) on the 1st covariate Sex (x1) and 2nd covariate Age (x2)”
WITH, as in “x1 correlated with y” or “y2 correlated with y3”
–An example would be: Age WITH Depression; or Mental WITH Physical;
MODEL: WITH ON BY
–WITH: correlated with
–ON: regressed on
–BY: measured by
SUMMARY OF ANALYSIS
Number of groups                          1
Number of observations                   12
Number of dependent variables             4
Number of independent variables           0
Number of continuous latent variables     2
Observed dependent variables
  Continuous: ODD1 EVEN1 ODD2 EVEN2
Continuous latent variables: GHQ_T1 GHQ_T2
SAMPLE STATISTICS
Means, Covariances and Correlations for ODD1 EVEN1 ODD2 EVEN2
TESTS OF MODEL FIT
Chi-Square Test of Model Fit: Value, Degrees of Freedom (7), P-Value
Chi-Square Test of Model Fit for the Baseline Model: Value, Degrees of Freedom (6), P-Value
CFI/TLI: CFI, TLI
Loglikelihood: H0 Value, H1 Value
Information Criteria
–Number of Free Parameters: 7
–Akaike (AIC)
–Bayesian (BIC)
–Sample-Size Adjusted BIC (n* = (n + 2) / 24)
RMSEA (Root Mean Square Error Of Approximation): Estimate, 90 Percent C.I., Probability RMSEA <= .05
SRMR (Standardized Root Mean Square Residual): Value 0.097
MODEL RESULTS
Columns: Estimate, S.E., Est./S.E., Two-Tailed P-Value
GHQ_T1 BY: ODD1, EVEN1
GHQ_T2 BY: ODD2, EVEN2
GHQ_T1 WITH: GHQ_T2
Intercepts: ODD1, EVEN1, ODD2, EVEN2
Variances: GHQ_T1, GHQ_T2
Residual Variances: ODD1, EVEN1, ODD2, EVEN2
Ratio of estimate to std error is an approximate z test
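The last two columns of MODEL RESULTS can be reproduced by hand: Est./S.E. is the estimate divided by its standard error, and the two-tailed p-value treats that ratio as a standard normal z. A small Python sketch, with invented numbers rather than values from the output:

```python
from math import erfc, sqrt


def z_test(estimate, std_error):
    """Return (z, two-tailed p) treating estimate/SE as standard normal."""
    z = estimate / std_error
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2))
    p_two_tailed = erfc(abs(z) / sqrt(2))
    return z, p_two_tailed


# Hypothetical estimate and standard error
z, p = z_test(0.84, 0.35)
print(f"Est./S.E. = {z:.3f}, two-tailed p = {p:.3f}")
```

With only 12 observations the normal approximation is rough, which is why the slide calls it an approximate z test.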
RESIDUAL OUTPUT
ESTIMATED MODEL AND RESIDUALS (OBSERVED - ESTIMATED)
For ODD1 EVEN1 ODD2 EVEN2:
–Model Estimated Means/Intercepts/Thresholds
–Standardized Residuals (z-scores) for Means/Intercepts/Thresholds
–Model Estimated Covariances/Correlations/Residual Correlations
–Residuals for Covariances/Correlations/Residual Correlations
A structural model Measurement and structural relations ABBREVIATIONS & FIT STATISTICS
We will see more like that tomorrow, and later today Now for some simple syntax steps INCLUDING REGRESSION CLARIFICATIONS each TYPE
Using Mplus to estimate different types of regression model Output gives Odds ratios
Mplus regressions (1) Y ON X
Linear Regression
  ANALYSIS: TYPE IS GENERAL;
            ESTIMATOR=ML;
  MODEL: Y ON X;
Probit regression
  VARIABLE: CATEGORICAL ARE Y1234;
  ANALYSIS: TYPE IS GENERAL;
            ESTIMATOR=WLS;
  MODEL: Y1234 ON X;
Mplus regressions (2) Y ON X
Binary Logistic Regression
  VARIABLE: CATEGORICAL ARE U01;
  ANALYSIS: TYPE IS GENERAL;
            ESTIMATOR=ML;
  MODEL: U01 ON X;
Ordinal Logistic Regression
  VARIABLE: CATEGORICAL ARE U123;
  ANALYSIS: TYPE IS GENERAL;
            ESTIMATOR=ML;
  MODEL: U123 ON X;
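An odds ratio for a logistic regression is the exponential of the slope. The Python sketch below shows that conversion, together with a conventional Wald-type 95% interval obtained by exponentiating the limits for the coefficient. The coefficient and standard error are made up, and the interval method is an assumption about what one might compute by hand rather than a statement of what Mplus prints.

```python
from math import exp, log


def odds_ratio(coefficient):
    """Odds ratio for a one-unit increase in X, from a logit coefficient."""
    return exp(coefficient)


def odds_ratio_interval(coefficient, std_error, z_crit=1.96):
    """Wald-type interval: exponentiate coefficient +/- z_crit * SE."""
    lower = exp(coefficient - z_crit * std_error)
    upper = exp(coefficient + z_crit * std_error)
    return lower, upper


# Hypothetical estimate for U01 ON X
b, se = 0.405, 0.150
low, high = odds_ratio_interval(b, se)
print(f"OR = {odds_ratio(b):.2f}, approx. 95% CI {low:.2f} to {high:.2f}")
```

A coefficient of zero gives an odds ratio of 1 (no association), and a coefficient of ln 2 gives an odds ratio of 2.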
Out and back: Bob the (re)-builder walk-through
Reading data with a fixed format
–Same kind of data now, bigger dataset
–Do an analysis and save data
–Then read back in formatted
–hals_ighq.inp, then read_savehalstxt.inp
“Passing out” variables [ quick2_rebuild2.inp ]
DATA: FILE IS c:\halsghq3.dat;
VARIABLE: NAMES ARE GHQ22 GHQ24 GHQ28 AGEYRS IDNUM SEXM1F2;
          USEVARIABLES ARE AGEYRS GHQ22 GHQ24 GHQ28;
          CATEGORICAL ARE GHQ22 GHQ24 GHQ28;
          IDVARIABLE = IDNUM;
          AUXILIARY = SEXM1F2;
MODEL: IGHQ BY GHQ22 GHQ24 GHQ28; !define IGHQ measured BY 3 vars
       IGHQ ON AGEYRS; !Here regressing latent factor ON age
SAVEDATA: FILE IS savedata.txt;
The little words IS / ARE / = are interchangeable
HALSGHQ3.DAT
Columns in "halsGHQ3.dat": GHQ22 GHQ24 GHQ28 AGEYRS IDNUM SEXM1F2
Output: SUMMARY OF ANALYSIS
Number of groups                          1
Number of observations                 6553
Number of dependent variables             3
Number of independent variables           1
Number of continuous latent variables     1
Observed dependent variables
  Binary and ordered categorical (ordinal): GHQ22 GHQ24 GHQ28
Observed independent variables: AGEYRS
Observed auxiliary variables: SEXM1F2
Continuous latent variables: IGHQ
Variables with special functions
  ID variable: IDNUM
Output: SAVEDATA INFORMATION
Order and format of variables
  GHQ22    F10.3
  GHQ24    F10.3
  GHQ28    F10.3
  AGEYRS   F10.3
  IDNUM    I6
  SEXM1F2  F10.3
Save file: savedata.txt
Save file format: 4F10.3 I6 F10.3
Save file record length: 5000
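The “out and back” idea can be rehearsed outside Mplus. The Python sketch below writes one record in the reported save format (4F10.3 I6 F10.3) and then reads it back by slicing on the same widths. The values are invented and the function names are hypothetical; it only illustrates why the save format in the output is exactly what the next input file needs.

```python
# Field widths implied by the save file format "4F10.3 I6 F10.3"
WIDTHS = [10, 10, 10, 10, 6, 10]


def write_saved_record(ghq22, ghq24, ghq28, ageyrs, idnum, sex):
    """Format one record as 4F10.3 I6 F10.3 (right-justified fields)."""
    return (f"{ghq22:10.3f}{ghq24:10.3f}{ghq28:10.3f}"
            f"{ageyrs:10.3f}{idnum:6d}{sex:10.3f}")


def read_saved_record(line):
    """Slice a saved record back into its six values using WIDTHS."""
    fields = []
    start = 0
    for width in WIDTHS:
        fields.append(line[start:start + width])
        start += width
    ghq22, ghq24, ghq28, ageyrs, idnum, sex = fields
    return (float(ghq22), float(ghq24), float(ghq28),
            float(ageyrs), int(idnum), float(sex))


# Hypothetical respondent (NOT a row from halsghq3.dat)
line = write_saved_record(1.0, 2.0, 0.0, 43.0, 100234, 2.0)
print(repr(line))
print(read_saved_record(line))
```

Note that the order of variables in the save file (modelled variables, then the ID variable, then auxiliary variables) differs from the order in the original data file, so the NAMES list in the read-back input has to follow the saved order.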
Quick2rebuildout.out
INPUT READING TERMINATED NORMALLY
SUMMARY OF ANALYSIS
Number of groups                          1
Number of observations                 6553
Number of dependent variables             3
Number of independent variables           1
Number of continuous latent variables     1
Observed dependent variables
  Binary and ordered categorical (ordinal): GHQ22 GHQ24 GHQ28
Observed independent variables: AGEYRS
Observed auxiliary variables: SEXM1F2
Continuous latent variables: IGHQ
Variables with special functions
  ID variable: IDNUM
Quick2rebuildout.out (cont)
Estimator                                        WLSMV
Maximum number of iterations                     1000
Convergence criterion                            0.500D-04
Maximum number of steepest descent iterations    20
Parameterization                                 DELTA
Input data file(s): c:\halsghq3.dat
Input data format: FREE
Quick2rebuildout.out
SUMMARY OF CATEGORICAL DATA PROPORTIONS
Four categories each for GHQ22, GHQ24 and GHQ28
Quick2rebuildout.out
MODEL ESTIMATION TERMINATED NORMALLY
TESTS OF MODEL FIT
Chi-Square Test of Model Fit: Value *, Degrees of Freedom 2**, P-Value
Chi-Square Test of Model Fit for the Baseline Model: Value, Degrees of Freedom 4, P-Value
CFI/TLI: CFI, TLI
Number of Free Parameters: 13
RMSEA (Root Mean Square Error Of Approximation): Estimate
WRMR (Weighted Root Mean Square Residual): Value 0.812
Quick2rebuildout.out
MODEL RESULTS
Columns: Estimate, S.E., Est./S.E., Two-Tailed P-Value
IGHQ BY: GHQ22, GHQ24, GHQ28
IGHQ ON: AGEYRS
Thresholds: GHQ22$1 to GHQ22$3, GHQ24$1 to GHQ24$3, GHQ28$1 to GHQ28$3
Residual Variances: IGHQ
Quick2rebuildout.out
R-SQUARE
Observed variables (Estimate, Residual Variance): GHQ22, GHQ24, GHQ28
Latent variable (Estimate): IGHQ
QUALITY OF NUMERICAL RESULTS
Condition Number for the Information Matrix 0.967E-04 (ratio of smallest to largest eigenvalue)
Beginning Time: 05:51:15
Ending Time: 05:51:17
Elapsed Time: 00:00:02
Day 1 Tuesday afternoon
Session 3
1 Amos 1 and 2
2 Amos 3 and 4
3 Amos 5 and 6
4 Amos 7 and 8
5 Amos 9 and 10
6 Amos 11 and 12
Time for some constraints
F by y1*1 (1)
     y2 (1);
F1 by i1 i2 (10)
      i3 i4 (11);
… labels can be words, not only numbers
Some fun to finish … after Jacques Tacq
EFA!
DATA: FILE IS datafile.dat;
      NOBSERVATIONS = 145;
      TYPE=CORRELATION;
…
ANALYSIS: TYPE IS EFA 1 3;
Easy – eh!
Easier!
The Formal Structure of Techniques of Multivariate Analysis
Convergent Causal Structure: Multiple regression analysis [diagram: X1, X2, X3 and Y]
Bivariate Causal Structure: Test of association between two variables [diagram: X and Y]
Partial Correlation Analysis [diagrams: Spurious Causality and Indirect Causality, each with X, Y and Z]
Interactive Structure: Analysis of variance and covariance [diagram: X1, X2 and Y]
Latent Structure: Factor analysis [diagram: X1, X2, ..., Xn]
Canonical Structure: Canonical correlation analysis [diagram: X1, X2, ..., Xp and Y1, Y2, ..., Yq]
Latent Structure of Similarities: Multi-dimensional scale analysis [diagram: X1, X2, ..., Xn]
Multiple regression analysis (convergent causal structure)
Analysis of variance (ANOVA) (the interactive structure)
Analysis of covariance (ANCOVA) (interactive structure)
Partial correlation analysis (spurious or indirect causality) ?
Discriminant analysis (discriminant structure)
Canonical correlation analysis (the canonical structure)
Multivariate multiple regression (convergent causal structure two or several times)
Multivariate analysis of variance (interactive structure two or several times)
Multivariate discriminant analysis (discriminant structure with more than two population groups)