Working with Under-identified Structural Equation Models David A. Kenny University of Connecticut Website: davidakenny.net/kenny.htm Paper download: davidakenny.net/doc/kandm.doc Powerpoint download: davidakenny.net/doc/under.ppt
Introductory Comment Talk is about Structural Equation Models (SEM). Nonetheless, the points apply to many other types of modeling as issues about identification apply to a broad range of models.
Identification in SEM Specify a model. See if it is identified. If identified, estimate it. If under-identified, respecify the model until it is identified. Models that are under-identified are not estimated and are thought to be useless models.
Quote If a model is not identified, it must be made identified by increasing the number of manifest variables or by reducing the number of parameters to be estimated (Blunch, 2008, p. 78).
What To Do with Under-identified Models Make them identified: add variables or impose parameter constraints. Estimate the range of possible values. Sensitivity analysis: fix the “under-identifying parameter” to a range of reasonable values, and examine the solutions.
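A sensitivity analysis of this kind can be sketched in a few lines. The example below is a minimal illustration, not the method used in the talk: it assumes the under-identifying parameter is the reliability of a single-indicator variable and uses the classical correction for attenuation. The observed correlation of .40, the grid of reliabilities, and the function names are all hypothetical.

```python
import math


def disattenuated_correlation(r_observed: float, reliability: float) -> float:
    """Correct an observed correlation for unreliability in one variable.

    Assumes only one of the two variables is measured with error, so the
    corrected value is r_observed / sqrt(reliability).
    """
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must be in (0, 1]")
    return r_observed / math.sqrt(reliability)


def sensitivity_sweep(r_observed, reliabilities):
    """Fix the unknown reliability to each value in turn and record the solution."""
    return [(rel, disattenuated_correlation(r_observed, rel)) for rel in reliabilities]


r_obs = 0.40                        # hypothetical observed correlation
grid = [0.6, 0.7, 0.8, 0.9, 1.0]    # hypothetical "reasonable" reliabilities

results = sensitivity_sweep(r_obs, grid)
for rel, rho in results:
    print(f"reliability = {rel:.1f} -> corrected correlation = {rho:.3f}")
```

If the conclusion of interest holds across the whole grid, the unknown parameter matters little; if it changes, the analysis shows how much depends on the assumed value.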
Some Under-identified Models Contain Useful Information A: Some model parameters can be estimated even if the model as a whole is under-identified. These might be theoretically or practically important parameters. B: Fit can sometimes be evaluated even if the model as a whole is under-identified. This can be a way of ruling out models.
A: Under-identified Models with Identified Parameters A model is under-identified if not all the parameters of the model are identified. However, some of the parameters of the model might be identified. Those parameters may be of interest. Three examples: outcome with a single indicator; stability of personality; growth-curve model with just two waves.
How to Estimate Under-identified Models? Can set one or more of the under-identified parameters to "allowable" values. A fix: turning an under-identified model into an identified model by pretending something is true which is not true. Some programs do compute parameter estimates, even if the model is not identified. With Amos: “Try to fit under-identified models.” Use MIIV.
Outcome with a Single Indicator: Fishbein & Ajzen Despite the model being under-identified, paths a, b, and c (the key parts of the model) are identified.
Usual Fix W = V + E7
What to Do? Use the fix. Model is identified! But the model is wrong! Use the under-identified model. Obvious drawback: The model is under-identified. But it does give information about key parameters. It does not pretend to know something that it does not know.
Stability of Depression in Boys 10 knowns, 11 unknowns: the model is under-identified. Standardized a is identified: a = .561.
Fixes Add a third indicator. Fix one of the free loadings to one (it does not matter which one). Fix both free loadings to one. Model now over-identified and fit may be poor. The under-identified model might be better.
Growth-curve Model with Just Two Waves 20 knowns, 22 unknowns: the model is under-identified. Red paths are identified!
Fix W = U + E1 X = V + E2
Information Lost by Not Having Three or More Waves Slope and intercept variances are not identified. Thus, measures of variance explained are not available. Linearity must be assumed and is not tested.
Identified Parameters in Under-identified Models Best to estimate the under-identified model as it makes clear what is known and what is unknown. One can find a “fix,” but the fix gives the illusion that the model is identified, when in fact it is not. The “fix” might make an unreasonable assumption.
B: Under-identified Models for Which Fit Can Be Evaluated For all models that meet or exceed the minimum condition of identifiability but are under-identified, the fit of the model can be evaluated because all of these models place some sort of constraint on the data. Two examples Longitudinal Models with No Cross-causal Effects
Models that Meet or Exceed the Minimum Condition of Identifiability Minimum condition of identifiability or the t rule: The number of knowns (variances, covariances, and means) must be greater than or equal to the number of unknowns (e.g., paths). Some models that meet this condition are not identified.
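The counting behind the t rule can be written out directly. This is a minimal sketch; the function names are illustrative, and the number of unknowns must still be counted by hand from the model diagram. With k observed variables there are k(k + 1)/2 distinct variances and covariances, plus k means if a mean structure is modeled.

```python
def count_knowns(n_observed: int, include_means: bool = False) -> int:
    """Number of distinct variances and covariances (plus means, if modeled)."""
    knowns = n_observed * (n_observed + 1) // 2
    if include_means:
        knowns += n_observed
    return knowns


def meets_t_rule(n_observed: int, n_unknowns: int, include_means: bool = False) -> bool:
    """Necessary, but not sufficient, condition for identification."""
    return count_knowns(n_observed, include_means) >= n_unknowns


# Four observed variables, no mean structure: 4 * 5 / 2 = 10 knowns.
print(count_knowns(4))        # 10
print(meets_t_rule(4, 10))    # True: the minimum condition is met
print(meets_t_rule(4, 11))    # False: more unknowns than knowns
```

A return value of True only says the minimum condition holds; the model may still be under-identified.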
Non-recursive Model 10 knowns, 10 unknowns: the model is under-identified. 2 df: r23 − r12r13 = 0 and r24 − r12r14 = 0
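The two constraints can be checked numerically against a correlation matrix. The sketch below assumes the subscripts index four observed variables; the correlations are hypothetical values chosen so that both constraints hold, and the helper name is illustrative.

```python
def overidentifying_residuals(r):
    """Return the two quantities the model implies should equal zero.

    `r` maps a pair of 1-based variable indices (i, j), with i < j,
    to the correlation between variables i and j.
    """
    c1 = r[(2, 3)] - r[(1, 2)] * r[(1, 3)]
    c2 = r[(2, 4)] - r[(1, 2)] * r[(1, 4)]
    return c1, c2


# Hypothetical correlations constructed to satisfy both constraints.
corr = {
    (1, 2): 0.50, (1, 3): 0.40, (1, 4): 0.30,
    (2, 3): 0.20, (2, 4): 0.15, (3, 4): 0.35,
}

c1, c2 = overidentifying_residuals(corr)
print(f"r23 - r12*r13 = {c1:.4f}")
print(f"r24 - r12*r14 = {c2:.4f}")
```

With sample data the two quantities will rarely be exactly zero, so in practice they are judged with the model's 2 df test of fit rather than by inspection.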
Some Paths Can Be Estimated and Fit Can Be Evaluated 10 knowns, 10 unknowns: the model is under-identified. Paths a and b are not identified. Paths c, d, and e are identified. Model has 2 df.
Fix
Longitudinal Model of Spuriousness The common model for two-wave data estimates cross-causal effects. Whisman: depression causes marital dissatisfaction vs. marital dissatisfaction causes depression. Alternative model: depression and marital dissatisfaction do not cause each other. The zero-paths model makes a strong and implausible assumption about spuriousness (Dwyer). Better might be an explicit model (under-identified, but testable) of spuriousness.
Model of Spuriousness Four or more measures at two or more times. Assumptions: Spuriousness: the manifest variables are caused by latent variables which explain all the covariation in the variables; no lagged causal effects. Stationarity: after a linear transformation, factor structure and variances are invariant over time.
Dumenci and Windle Example Four measures of depression (CESD) for 16- and 17-year-olds, 372 males and 433 females. Chosen because the measures should not have causal effects between them.
Model Fit (p values)
          Stationarity (df = 2)   Spuriousness (df = 6)
Males     .514                    .079
Females   .273                    .990
As expected, the data are consistent with spuriousness, i.e., no causal effects.
Conclusions Not all under-identified models are hopeless. Sometimes key parameters can be estimated in an under-identified model. Sometimes model fit can be evaluated for an under-identified model, which can be useful in testing the model.
Suggestions SEM programs need to be able to estimate under-identified models. Try to avoid “fixes”; estimate a more realistic model even if part of it is under-identified. Sometimes cheaper designs (e.g., fewer measures or time points) can yield key information with an under-identified model (e.g., two-wave growth models).
Final Suggestion Kenny 1979, p. 40: "Making new specifications just to be able to identify the parameters of a causal model is perhaps the worst sin of causal modelers."
The End Download powerpoint: davidakenny.net/doc/under.ppt Download Kenny & Milan Identification Chapter for the forthcoming Handbook of Structural Equation Modeling (Richard Hoyle, David Kaplan, George Marcoulides, and Steve West, Eds.), New York: Guilford Press: davidakenny.net/doc/kandm.doc