ALISON BOWLING CONFIRMATORY FACTOR ANALYSIS
REVIEW OF EFA Exploratory Factor Analysis (EFA) explores the data. All measured variables are related to every factor by a factor loading estimate. If each measured variable loads highly (> .4) on only one factor, we have simple structure. Factors are derived from statistical results, not from theory, and are named only after the analysis. We do not know initially how many factors there are or which variables belong to which constructs.
CFA AND CONSTRUCT VALIDITY Construct validity is the extent to which a set of measured items actually reflects the theoretical latent construct they are designed to measure. CFA enables us to assess the construct validity of a proposed measurement theory.
INTRODUCTION TO CFA With CFA, the researcher must specify both the number of factors that exist within a set of variables and which factor each variable will load highly on before the results can be computed. Each variable loads on only one factor. Factors are correlated.
TERMINOLOGY Latent variable: an unobserved variable (= factor), displayed as a circle in the model diagram. Observed variables: variables in the data set, displayed as rectangles. Exogenous variables: synonymous with IVs; they do not have arrows pointing to them. Endogenous variables: synonymous with DVs; they have arrows pointing to them and have error variances.
PATH DIAGRAM Mediation model. er1 and er2 are latent (not measured). Emotcope, coghard and ghq are observed variables. Emotcope and ghq are endogenous: they have arrows pointing to them and have error variances. The lines represent predicted relationships.
INFORMATION FOR CFA
TESTS OF GOODNESS OF FIT The reproduced covariance matrix is constructed after estimation of the parameters. This may be compared with the input matrix. Tests of goodness of fit compare these two matrices. Likelihood ratio chi-square: very sensitive, not terribly useful. Goodness-of-fit indices (GFI): the higher the better. Root mean square error of approximation (RMSEA): the lower the better. Incremental fit indices (TLI, etc.): > .9. AIC: can be used to compare models.
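To make the "reproduced covariance matrix" concrete, here is a minimal sketch assuming the standard CFA form Sigma = Lambda * Phi * Lambda' + Theta with uncorrelated errors. The loadings, factor covariances and error variances are hypothetical illustration values, not estimates from any data set in these slides, and the function names are made up for this example.

```python
def implied_covariance(loadings, factor_cov, error_var):
    """Model-implied covariance: Lambda * Phi * Lambda^T + diagonal Theta."""
    n_vars = len(loadings)
    n_factors = len(factor_cov)
    sigma = [[0.0] * n_vars for _ in range(n_vars)]
    for i in range(n_vars):
        for j in range(n_vars):
            total = 0.0
            for a in range(n_factors):
                for b in range(n_factors):
                    total += loadings[i][a] * factor_cov[a][b] * loadings[j][b]
            if i == j:
                total += error_var[i]  # error variances sit on the diagonal
            sigma[i][j] = total
    return sigma


def residuals(sample_cov, sigma):
    """Element-wise difference between the input and reproduced matrices."""
    return [[s - m for s, m in zip(s_row, m_row)]
            for s_row, m_row in zip(sample_cov, sigma)]


# Hypothetical two-factor model with simple structure (4 indicators).
loadings = [[0.8, 0.0], [0.7, 0.0], [0.0, 0.6], [0.0, 0.5]]
factor_cov = [[1.0, 0.5], [0.5, 1.0]]      # factor variances fixed to 1
error_var = [0.36, 0.51, 0.64, 0.75]

sigma = implied_covariance(loadings, factor_cov, error_var)
```

Small residuals between the input matrix and `sigma` indicate that the model reproduces the data well; the fit indices listed above summarise that discrepancy in different ways.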
SIGNIFICANCE OF PARAMETERS Each of the regression weights and other estimated parameters has a critical ratio (CR) test of significance. CR is distributed as z. Therefore a CR greater than 1.96 in absolute value (CR > 1.96 or CR < -1.96) is significant at the .05 level.
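The CR test can be sketched in a few lines, assuming CR is the parameter estimate divided by its standard error and is referred to the standard normal distribution. The estimate and standard error below are hypothetical, not values from the WISC output.

```python
import math


def critical_ratio(estimate, standard_error):
    """Critical ratio: parameter estimate divided by its standard error."""
    return estimate / standard_error


def two_tailed_p(cr):
    """Two-tailed p-value for a CR treated as a standard normal z."""
    return math.erfc(abs(cr) / math.sqrt(2))


# Hypothetical regression weight and standard error.
cr = critical_ratio(0.50, 0.20)
p = two_tailed_p(cr)
significant = abs(cr) > 1.96   # two-tailed test at the .05 level
```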
MODEL BUILDING Error terms: all endogenous variables have error variances associated with them. These represent errors of measurement (observed variables) or errors of prediction (latent variables). Fixing parameters: to avoid having more parameters than data points (an under-identified model), some of the parameters need to be fixed. Regression weights of the error terms are fixed to 1. Factors are unobserved and so have no scale; the regression weight of one indicator variable for each factor is usually fixed to 1.
EXAMPLE (TABACHNICK AND FIDELL) CFA of the WISC: 11 subtests, with two factors. Verbal: information, comprehension, arithmetic, similarities, vocabulary, digit span. Performance: picture completion, picture arrangement, block design, object assembly, coding. Does a 2-factor model with simple structure fit the data? Is there a significant covariance between the Verbal and Performance factors? Datafile: wiscsem.sav
MODEL SPECIFICATION Data points = (11 x 12) / 2 = 66. Parameters to estimate = 1 covariance + 11 regression weights + 11 variances = 23. df = 66 - 23 = 43.
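The counting rule above can be checked with a short sketch. The function names are made up for this example; the numbers are the ones given on the slide.

```python
def data_points(n_observed):
    """Distinct variances and covariances among n observed variables."""
    return n_observed * (n_observed + 1) // 2


def degrees_of_freedom(n_observed, n_parameters):
    """Data points minus the number of freely estimated parameters."""
    return data_points(n_observed) - n_parameters


# WISC example: 1 covariance + 11 regression weights + 11 variances.
wisc_parameters = 1 + 11 + 11
wisc_df = degrees_of_freedom(11, wisc_parameters)
print(data_points(11), wisc_parameters, wisc_df)
```

A negative result from `degrees_of_freedom` would signal an under-identified model, which is why some parameters have to be fixed.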
ANALYSIS PROPERTIES
OUTPUT: STANDARDISED ESTIMATES The model shows a correlation of .59 between Verbal and Performance IQ. It also shows the standardised regression coefficients (loadings) of each of the variables on the two factors.
NOTES FOR MODEL Computation of degrees of freedom (Default model): Number of distinct sample moments: 66. Number of distinct parameters to be estimated: 23. Degrees of freedom (66 - 23): 43. Result (Default model): Minimum was achieved. Chi-square = , Degrees of freedom = 43, Probability level = .005
MODEL FIT First table, columns: Model, NFI Delta1, RFI rho1, IFI Delta2, TLI rho2, CFI; rows: Default model, Saturated model (1.000), Independence model (.000). Second table, columns: Model, RMSEA, LO 90, HI 90, PCLOSE; rows: Default model, Independence model. In general the model fits well. Can it be improved?
IMPROVING MODEL FIT 1. Could there be an additional path in the model? 2. Could a more parsimonious model be obtained by removing coding from the model?
ADDING A PATH Check the modification indices. Add a path in which Performance predicts comprehension (er2). Modification indices table, columns: M.I., Par Change; rows involve er11, er2 (with Performance), er3 and er5.
UPDATED MODEL
MODEL FIT Number of distinct sample moments: 55. Number of distinct parameters to be estimated: 22. Degrees of freedom (55 - 22): 33. Chi-square = , df = 33, Probability level = .079. Chi-square is now non-significant. Other indices have also improved: RMSEA .06 -> .046
MODEL COMPARISON We can compare nested models (the initial model and the model including Performance -> Comprehension) by testing the difference between their chi-square values. Initial chi-square = , df = 43. Chi-square with extra path included = , df = 42. Chi-square difference = 9.94, df = 1, p < .01. Adding the extra path significantly improves model fit.
MODEL COMPARISON: AIC We can use AIC to compare non-nested models. When we delete Coding, the new model is not nested, as we have changed the data from the initial model. AIC for initial model = ; AIC for final model = . The lower the value of AIC, the better the model fit.
CAVEAT The model modifications are post hoc. They may be due to chance. Ideally they should be cross-validated with a new sample.
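The p-value for the chi-square difference can be checked with a small sketch. It relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal z, so it only covers the df = 1 case; the function name is made up for this example.

```python
import math


def chi_square_p_df1(chi_square):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom.

    With df = 1, P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chi_square / 2))


difference = 9.94            # chi-square difference reported on the slide
df_difference = 43 - 42      # difference in degrees of freedom
p_value = chi_square_p_df1(difference)
print(df_difference, p_value < 0.01)
```

For differences with more than 1 degree of freedom, a general chi-square distribution function from a statistics library would be needed instead.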