CFA: Basics
Beaujean Chapter 3
Other readings
– Kline 9: a good reference, but it lumps this entire section into one chapter.
How things are related
[Diagram: SEM, Traditional Multivariate, Canonical Correlation, CFA, IRT, EFA, Path Analysis, Multiple Regression, ANOVA, t-tests, Correlation]
CFA Models
EFA models
– You have a bunch of questions
– You have an idea (or sometimes not!) of how many factors to expect
– You let the questions go where they want
– You remove the bad questions until you get a good fit
CFA Models
CFA models
– You set up the model with specific questions loading onto specific factors
  Forcing the cross-loadings to be zero (draw)
– You test to see if that model fits
– (So the C = Confirming the EFA.)
CFA Models
Reflective: the latent variable causes the manifest variable scores
– Purpose is to understand the relationships between the measured variables
– Same theoretical concept as EFA
CFA Models
Formative: latent variables are the result of manifest variables
– Similar theoretical concept to PCA
– Demographics?
CFA Models
The manifest variables in a CFA are sometimes called indicator variables
– Because they indicate what the latent variable should be, since we don’t directly measure it.
CFA Models
General rules:
– The latents will be correlated
  Similar to an oblique rotation
– Each factor section has to be identified
– Arrows go from latent to measured (reflective)
  We think that the latent caused the measured answers
– Error terms go on the measured variables
CFA Models
Generally, you leave the error terms uncorrelated
– BUT!
– These questions all measure the same factor, right? So answers on some will be tied to answers on others, and the errors may also be correlated.
– You can get away with adding error covariances here if you have strong modification indices or theoretical reasons.
CFA Models
Factor loadings: same idea as EFA, you want the relationship between the latent variable and manifest variable to be strong
– Otherwise, why are you using that item/scale as an indicator of the latent?
CFA Models
Pattern coefficients versus structure coefficients
– Pattern: regression coefficient, how much the manifest variable increases for each one-unit increase in the latent variable
– Structure: correlation between the latent and manifest variable
CFA Models
Pattern = structure when there is only one latent variable.
– Not equal when there are multiple (correlated) latent variables.
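The pattern/structure distinction can be sketched numerically. This is a minimal illustration with made-up standardized loadings and a made-up latent correlation (none of these values come from the class data). It assumes both the latent and manifest variables are standardized, in which case the structure coefficients are the pattern coefficients multiplied by the latent correlation matrix.

```r
# Hypothetical standardized pattern coefficients (made-up values):
# rows are manifest variables, columns are latent variables.
pattern.coef <- matrix(
  c(0.8, 0.7, 0.6, 0.0, 0.0, 0.0,   # loadings on F1
    0.0, 0.0, 0.0, 0.9, 0.8, 0.7),  # loadings on F2
  nrow = 6,
  dimnames = list(paste0("x", 1:6), c("F1", "F2"))
)

# Hypothetical correlation between the two latent variables.
latent.cor <- matrix(c(1.0, 0.5,
                       0.5, 1.0),
                     nrow = 2,
                     dimnames = list(c("F1", "F2"), c("F1", "F2")))

# Structure coefficients = pattern coefficients %*% latent correlations.
# x1 has a pattern coefficient of 0 on F2 but still correlates with F2.
structure.coef <- pattern.coef %*% latent.cor
print(structure.coef)

# With a single latent variable the latent "correlation matrix" is just 1,
# so the pattern and structure coefficients are identical.
one.factor.pattern <- pattern.coef[1:3, "F1", drop = FALSE]
one.factor.structure <- one.factor.pattern %*% matrix(1)
```

In the two-factor case the zero cross-loadings stay zero in the pattern matrix, while the structure matrix picks up nonzero values through the latent correlation.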
CFA Models
Identification rules of thumb (a latent variable should meet one of these):
– At least four indicators
– Three indicators AND error variances do not covary
– Two indicators AND error variances do not covary AND the loadings are set equal to each other
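One way to see where these rules of thumb come from is to count degrees of freedom for a single latent variable. The sketch below assumes one factor, uncorrelated errors, and marker-variable scaling; count.df() is a hypothetical helper written for this illustration, not a lavaan function. A non-negative count is necessary for identification but does not guarantee it.

```r
# Degrees of freedom for a one-factor CFA with uncorrelated errors,
# scaled with a marker variable (first loading fixed to 1).
# count.df() is a hypothetical helper for this illustration only.
count.df <- function(n.indicators, equal.loadings = FALSE) {
  # Known information: unique variances and covariances among the indicators.
  known <- n.indicators * (n.indicators + 1) / 2
  # If loadings are constrained equal, they all equal the marker (1),
  # so no loadings are left to estimate.
  free.loadings <- if (equal.loadings) 0 else n.indicators - 1
  # Free parameters: loadings + error variances + latent variance.
  free <- free.loadings + n.indicators + 1
  known - free
}

count.df(4)                         # 2  (over-identified)
count.df(3)                         # 0  (just-identified)
count.df(2)                         # -1 (under-identified)
count.df(2, equal.loadings = TRUE)  # 0  (just-identified)
```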
CFA Models
Scaling: setting the scale for the latent variable
– This scaling also helps with identification issues.
– Now, we’ll explain why you’ve been using the 1 to set a particular path and explore other options.
CFA Models
Scaling options:
– Standardized latent variable: constrain the latent variable variance to 1.
– What does that do?
  Sets the latent variable’s scale to a z-score.
  Makes the double-headed arrow between latents a correlation.
  Use the unstandardized solution, or you’ve double standardized.
CFA Models
Scaling options:
– Marker variable: fixes one of the factor loadings to 1 (what we’ve been doing)
– Gives the latent variable the scale of that item.
CFA Models
Scaling options:
– Effects coding: estimates all loadings but constrains them to average 1.0 within a latent variable.
CFA Models
These options will give you different loadings but not different model fit.
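A small numeric sketch can show why the fit does not change: each scaling option is a rescaling of the same solution, so the model-implied covariance matrix is identical. The loadings below are made-up values, and implied.cov() is a hypothetical helper for this illustration, assuming one latent variable with uncorrelated errors.

```r
# Hypothetical one-factor model with four indicators (made-up values).
loadings.std <- c(0.8, 0.7, 0.6, 0.5)     # loadings when latent variance is fixed to 1
error.var    <- diag(1 - loadings.std^2)  # uncorrelated error variances

# implied.cov() is a hypothetical helper:
# implied covariance = latent variance * (loadings x loadings') + error variances
implied.cov <- function(loadings, latent.var) {
  latent.var * outer(loadings, loadings) + error.var
}

# Option 1: standardized latent variable (latent variance = 1).
cov.std <- implied.cov(loadings.std, latent.var = 1)

# Option 2: marker variable (first loading = 1); the latent variance absorbs the scale.
loadings.marker <- loadings.std / loadings.std[1]
cov.marker <- implied.cov(loadings.marker, latent.var = loadings.std[1]^2)

# Option 3: effects coding (loadings average 1).
loadings.effects <- loadings.std / mean(loadings.std)
cov.effects <- implied.cov(loadings.effects, latent.var = mean(loadings.std)^2)

# Different loadings under each option...
print(rbind(std = loadings.std, marker = loadings.marker, effects = loadings.effects))

# ...but the same implied covariance matrix, hence the same fit.
all.equal(cov.std, cov.marker)
all.equal(cov.std, cov.effects)
```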
CFA Models
Empirical underidentification
– Happens when the factor loading for a question is very close to 0 or factor correlations are very close to 1.
Let’s try it out!
New stuff:
– Translate a correlation matrix to a covariance matrix.
– Something not super clear from before:
  If you use a correlation matrix as the input:
  – The unstandardized output is technically standardized.
  If you use a covariance table or raw data:
  – The unstandardized output is unstandardized.
Example 1
CFA with one latent variable
New function:
– cor2cov(correlations, SDs)
– Converts correlation tables to covariance tables
– You need a correlation matrix AND a vector of SDs
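The arithmetic behind that conversion can be sketched in base R, so the calculation is visible without loading lavaan. The correlation matrix and SDs below are made-up values, not the class data; cor2cov(correlations, SDs) from the slide is expected to perform the same conversion.

```r
# Hypothetical correlation matrix and SDs (made-up values, not the class data).
var.names <- c("x1", "x2", "x3")
correlations <- matrix(c(1.0, 0.5, 0.3,
                         0.5, 1.0, 0.4,
                         0.3, 0.4, 1.0),
                       nrow = 3, byrow = TRUE,
                       dimnames = list(var.names, var.names))
SDs <- c(2, 3, 4)

# Each covariance is the correlation times the two SDs:
# cov_ij = r_ij * sd_i * sd_j (so the diagonal holds the variances).
covariances <- correlations * outer(SDs, SDs)
print(covariances)

# Base R's cov2cor() goes the other way and recovers the correlations.
print(cov2cor(covariances))
```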
Example 1
=~ symbol in the model description
Before we did
– Y ~ X
– Because all the variables were manifest
Now we use =~ to tell lavaan that the variable on the left is a latent variable measured by the variables on the right.
Example 1
Also new code:
parameterEstimates(wisc4.fit, standardized=TRUE)
– You get the same output as the summary function (parameter-wise).
– This version will give you CIs though!
Example 1
std.lv = FALSE
– FALSE is the default
– Uses the first indicator as the marker variable (i.e., fixes its loading to 1).
std.lv = TRUE
– Standardizes the latent variable: estimates all loadings and constrains the latent variance to 1.
Example 1
What are the new standardized columns?
– Std.lv = standardizes the latent variable but leaves the manifest variables in their original scale
– Std.all = standardizes everything.
Example 1
fitted() function
– Gives you the reproduced (model-implied) covariance table
residuals() function
– Gives you the difference between the actual and reproduced covariance table (use type = "cor" to get it on the correlation scale)
Example 1
fitMeasures()
– How you can get ALL the fit indices!
modificationIndices()
– Get the modification indices separately from all the output.
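The Example 1 functions can be strung together as in the sketch below. It is not the class example: as an assumption, it uses the HolzingerSwineford1939 data that ships with lavaan in place of the WISC-IV covariance matrix, and the latent name g and the object names are made up for illustration.

```r
library(lavaan)

# Assumption: lavaan's built-in HolzingerSwineford1939 data stands in for
# the class data; "g" is a hypothetical latent variable name.
one.factor.model <- '
  g =~ x1 + x2 + x3 + x4 + x5 + x6
'

# std.lv = FALSE is the default, so x1 is the marker (loading fixed to 1).
one.factor.fit <- cfa(one.factor.model, data = HolzingerSwineford1939)

# Parameter estimates plus fit indices in one printout.
summary(one.factor.fit, standardized = TRUE, fit.measures = TRUE)

# Same estimates as a data frame, with confidence intervals.
parameterEstimates(one.factor.fit, standardized = TRUE)

# Model-implied covariance table and correlation residuals.
fitted(one.factor.fit)$cov
residuals(one.factor.fit, type = "cor")

# A few fit indices by name, and the modification indices.
fitMeasures(one.factor.fit, c("chisq", "df", "cfi", "rmsea", "srmr"))
modificationIndices(one.factor.fit)
```

Calling fitMeasures() with no second argument returns every index lavaan computes.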
Example 2
Switching to estimating all loadings, constraining the latent variance instead of an indicator loading.
– Use std.lv=TRUE
Example 3 Let’s specify a two-factor model!
Example 4
A fully latent model!
– Use ~ for latent-to-latent paths
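Examples 2 through 4 might look something like the sketch below. As an assumption, it uses lavaan's built-in HolzingerSwineford1939 data rather than the class data, and the latent names visual and textual are illustrative.

```r
library(lavaan)

# Two-factor CFA: each latent is defined with =~ and cross-loadings are
# left out (fixed to zero). Assumption: built-in data, illustrative names.
two.factor.model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
'

# std.lv = TRUE estimates every loading and fixes the latent variances to 1,
# so the latent covariance is reported as a correlation.
two.factor.fit <- cfa(two.factor.model,
                      data = HolzingerSwineford1939,
                      std.lv = TRUE)
summary(two.factor.fit, standardized = TRUE, fit.measures = TRUE)

# Fully latent model: same measurement part, plus a regression between
# the two latent variables written with ~.
latent.model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  textual ~ visual
'
latent.fit <- sem(latent.model, data = HolzingerSwineford1939)
summary(latent.fit, standardized = TRUE, fit.measures = TRUE)
```

With only two latent variables, the regression path is a re-expression of the latent covariance, so these two models should show the same chi-square and degrees of freedom.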
Things to check out
Heywood cases/logical solution
– Are our variances positive + SMCs ok?
– Are there any crazy SEs?
Estimates
– Did our questions load?
Model fit
– Are the fit indices any good?
Parameters
Remember: you want parameters that make sense
– You can check out the standardized parameters to determine if questions are still loading like they would in an EFA.
– Z = parameter / SE
  Sometimes called a critical ratio.
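The critical ratio is simple enough to compute by hand. The estimate and standard error below are made-up values for illustration; the two-tailed p-value assumes the ratio is compared against a standard normal distribution.

```r
# Hypothetical loading estimate and standard error (made-up values).
estimate  <- 0.85
std.error <- 0.10

# Critical ratio: Z = parameter / SE.
z <- estimate / std.error

# Two-tailed p-value from the standard normal distribution.
p.value <- 2 * pnorm(-abs(z))

print(c(z = z, p = p.value))
```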
Parameters
Standard errors are tricky
– They are based on the scale of the variable
– You do not want them to be zero
  Estimating no variance is bad … some variance is always good!
– You do not want them to be large
  That means you are not estimating very well
Model Fit
See previous notes, but here’s a quick reminder:
– χ² nonsignificant (ha!)
– RMSEA, SRMR = small numbers
– CFI/TLI = large numbers
Model Fit
We talked about modification indices in the last section.
– In this section, think about what they mean before adding paths.
– Usually a CFA is meant to test a specific question/latent combination, so it may not help to add paths or correlate errors across two different latents.
Model Fit
Overfitted model: when you add parameters that help model fit but do not help with theory (and probably won’t replicate)
Compare the models!
Make a chi-square difference table
– Chi-square difference, df difference, critical chi-square, reject?
Look at differences in CFI
– Is the change greater than .01?
Look at the AIC/ECVI
– Which one is lower?
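A chi-square difference table can be built by hand, as in this sketch. The chi-square and df values are made up for illustration and assume two nested models, with the more restricted model having the larger df; alpha = .05 is also an assumption.

```r
# Hypothetical fit results for two nested models (made-up values).
restricted <- list(chisq = 85.3, df = 26)  # fewer free parameters
full       <- list(chisq = 52.1, df = 24)  # more free parameters

# Chi-square difference and df difference.
chisq.diff <- restricted$chisq - full$chisq
df.diff    <- restricted$df - full$df

# Critical chi-square at alpha = .05 for the df difference.
critical <- qchisq(0.95, df = df.diff)

# Reject the restricted model if the difference exceeds the critical value.
reject <- chisq.diff > critical

print(data.frame(chisq.diff, df.diff, critical = round(critical, 2), reject))
```

For models already fitted in lavaan, anova() on the two fit objects reports the same kind of difference test.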