1 Applied Psychometric Strategies Lab Applied Quantitative and Psychometric Series
Joseph H. Hammer, PhD Michael D. Toland, PhD Bifactor Analysis in Mplus November 1, 2016

2 How do I cite and reference this talk?
If you wish to cite the video of this AQPS Talk, please use this reference and citation:
Reference: Hammer, J. H., & Toland, M. D. (2016, November). Bifactor analysis in Mplus [Video file]. Retrieved from
In-text citation: Hammer and Toland (2016) or (Hammer & Toland, 2016)
This PowerPoint handout can be found at the APS Lab website. You can download the de-identified raw data and Mplus input and output syntax used in this talk from the APS Lab website. You are encouraged to adapt our syntax for your own research. Please use this reference and citation when doing so:
Hammer, J. H., & Toland, M. D. (2016). Name of specific syntax file you adapted from us goes here [Data file]. Retrieved from

3 Overview
What are some ways of assessing internal structure?
What is a bifactor model?
What questions can we answer using a bifactor model and ancillary bifactor measures?
Walk through a real example (ISMI-29)

4 Problems with Testing Internal Structure (Dimensionality)
Numerous instruments are created to measure a single latent variable (e.g., self-efficacy, depression).
However, factor analysis studies tend to show conflicting evidence for unidimensionality vs. multidimensionality.
Are we measuring a single latent variable or a variety of different-but-related latent variables?
Bifactor analysis provides helpful information when confronted with these dimensionality debates.

5 How can the dimensionality of instruments be modeled?
Unidimensional model
Correlated factors model
Second-order factor model
Bifactor model
Reise et al. (2007) recommended that researchers complement the analyses of unidimensional and correlated factors models with a bifactor model.

6 Unidimensional Model

7 Correlated Factors Model
Accounts for multiple related factors, which are allowed to correlate with each other (i.e., oblique factors)

8 Second-Order Factor Model
Dimensions (first-order factors) are correlated with a general factor (the second-order factor). Correlations among first-order factors are explained by the second-order factor.

9 Bifactor Model General Factor
A bifactor model is a model with a broader general factor and narrower specific factors (sometimes called group, nuisance, or method factors) that compete with each other to account for variance in the items.
All factors are set orthogonal to each other, meaning they are not allowed to correlate.
Unlike the second-order model, all factors are first-order factors.
To specify a bifactor model, I tell Mplus that each item should load on the general factor as well as its assigned specific factor. Thus, each item loads onto two different factors simultaneously.
Some items mostly tap the general factor, some items mostly tap a given specific factor, and some items tap each factor to a similar degree.

10 Bifactor Model – Example 1 (Intelligence)
Here's a first example. Many researchers believe that overall intelligence is a real construct that can be measured, but that intelligence is also defined by several subdimensions such as verbal comprehension, perceptual reasoning, working memory, and processing speed. The bifactor model can account for the overall intelligence construct using the general factor, while also accounting for these narrower subdimensions using the specific factors.
[Diagram: general Intelligence factor with specific factors for Verbal Comprehension, Perceptual Reasoning, Working Memory, and Processing Speed]

11 Bifactor Model – Example 2 (Depression)
Here's a second example. Depression is thought to be an overall construct that also has several subdimensions that define the construct's content domain. To model this construct with a bifactor model, the general factor might measure a person's overall level of depression, whereas the specific factors might measure specific subdimensions of depression such as affective, cognitive, and somatic symptom clusters. (Brouwer et al., 2013)
[Diagram: General Depression Factor with Affective, Cognitive, and Somatic specific factors]

12 Bifactor Model – Example 3
Here's a third example. The 10-item Rosenberg Self-Esteem Scale has 6 positively-worded items and 4 negatively-worded items that need to be reverse-coded prior to creating a total score for the entire instrument. Even though self-esteem is conceptualized as unidimensional, the positively-worded items and the negatively-worded items tend to form their own method factors (or "nuisance factors"), which forces researchers to use a correlated factors model instead. The bifactor model allows researchers to capture the overall self-esteem construct with the general factor, while also accounting for these method effects by specifying two specific factors, one for the positive items and one for the negative items. (Marsh et al., 2010)

13 What questions can the bifactor model answer?
How unidimensional vs. multidimensional is the instrument?
Is it permissible to model an instrument as unidimensional despite the presence of some multidimensionality?
What is the impact of method effects (e.g., negatively-worded items) on the internal structure of the instrument?
Is the raw total score a reliable-enough measure of the general factor?
Are the raw subscale scores reliable-enough measures of their specific factors? Or do I have to use SEM to obtain latent factor scores instead?

14 When might an instrument best conform to a bifactor structure?
Hints from our anecdotal experience:
Subscales generally inter-correlate > .3
First-order factors generally load on the second-order factor > .5
Ratio of 1st eigenvalue to 2nd eigenvalue is > 3, in the context of a traditional EFA (Cho et al., 2015, p. 554)
Again, this is completely anecdotal.

15 Time for a Real Example
Internalized Stigma of Mental Illness Scale (ISMI-29; Ritsher et al., 2003)
29 items, 5 subdomains
Self-report measure of how much stigma people feel about their own mental illness
4-point Likert-type format: 1 (Strongly disagree) to 4 (Strongly agree)

16 What are the 5 subdomains of the ISMI-29?
Alienation: "Having a mental illness has spoiled my life."
Stereotype Endorsement: "Mentally ill people tend to be violent."
Discrimination Experience: "People discriminate against me because I have a mental illness."
Social Withdrawal: "I don't talk about myself as much because I don't want to burden others with my mental illness."
*Stigma Resistance (*reverse-coded): "I can have a good, fulfilling life, despite my mental illness."

17 What is the problem with how we use ISMI-29?
Conflicting findings regarding dimensionality:
Some support for a correlated factors or second-order model
But these subscales intercorrelate very strongly (r's > .65). Are they really capturing separate subdomains?
Researchers sometimes use the raw total score. This requires evidence of a strong, reliable general factor running through most of the 29 items.
Researchers often use the raw subscale scores. This requires evidence of specific factors that account for substantial reliable variance in their items over and above the general factor.
Bifactor analysis to the rescue!

18 Input: Bifactor Model

19 CFA Results using WLSMV
Greener background indicates better fit.
The bifactor model provides adequate fit to the data.
Time to look at the ancillary bifactor measures!

20 Ancillary Bifactor Measures
Ancillary bifactor measures address either:
Dimensionality of the instrument, or
Model-based reliability of total and/or subscale scores

21

22 Abridged 1 of 2

23 Abridged 2 of 2

24 Ancillary Bifactor Measures with WLSMV
All ancillary bifactor measures based on the (unstandardized) Model Results were similar or identical to those based on the standardized model results.
To get ancillary bifactor measures using standardized estimates, you need to feed the standardized values back into Mplus as start values:
Step 1: Save the standardized values using the SVALUES output option.
Step 2: Rerun the model with the standardized values as start values to obtain the ancillary measures.

25 Abridged Input Using Standardized Values as Start Values for Ancillary Bifactor Measures
All results presented herein are based on standardized WLSMV model results.

26 Regarding Dimensionality
ECVGeneral, ECVSpecific, and IECV
PUC
General vs. Specific loadings
General vs. Unidimensional loadings
Average Relative Parameter Bias
See Rodriguez et al. (2016) for a review.

27 Explained Common Variance (ECVGen)
(Reise, Moore, & Haviland, 2010)
An index of unidimensionality: the proportion of common variance across items explained by the general dimension.
RQ: Is the ISMI-29 "unidimensional enough" that it is permissible to model the ISMI-29 as a unidimensional instrument in a CFA or SEM?

28 ECVGen formula and result
$$\mathrm{ECV}_{\mathrm{Gen}} = \frac{\sum_{j=1}^{J} \lambda_{jG}^{*2}}{\left(\sum_{j=1}^{J} \lambda_{jG}^{*2}\right) + \left(\sum_{s=1}^{S}\sum_{j=1}^{J_s} \lambda_{jS_s}^{*2}\right)}$$
For the ISMI-29, ECVGen = 0.763

29 ECVGen Cutoffs
ECV ≥ .85 suggests the instrument is sufficiently unidimensional to warrant a one-factor model (Stucky et al., 2014, 2015).
For binary data, Quinn (2014) says ECV > .90 suggests a one-factor model; ECV values between .70 and .90 are a gray area; ECV < .70 is multidimensional, and subscores may have value; ECV near 0 suggests the data are completely multidimensional.

30 ECVSpecific formula
$$\mathrm{ECV}_{S_s} = \frac{\sum_{j=1}^{J_s} \lambda_{jS_s}^{*2}}{\left(\sum_{j=1}^{J} \lambda_{jG}^{*2}\right) + \left(\sum_{s=1}^{S}\sum_{j=1}^{J_s} \lambda_{jS_s}^{*2}\right)}$$
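These ECV formulas are straightforward to compute from the standardized loadings of a fitted bifactor model. A minimal Python sketch (the loadings below are illustrative, not the ISMI-29 estimates):

```python
def ecv_general(gen_loadings, spec_loadings):
    """Proportion of common variance explained by the general factor.

    gen_loadings: standardized general-factor loadings, one per item.
    spec_loadings: list of lists, one per specific factor, holding the
    standardized specific-factor loadings for that factor's items.
    """
    gen_var = sum(l ** 2 for l in gen_loadings)
    spec_var = sum(l ** 2 for group in spec_loadings for l in group)
    return gen_var / (gen_var + spec_var)


def ecv_specific(group, gen_loadings, spec_loadings):
    """ECV for one specific factor: its squared loadings over all common variance."""
    gen_var = sum(l ** 2 for l in gen_loadings)
    spec_var = sum(l ** 2 for g in spec_loadings for l in g)
    return sum(l ** 2 for l in group) / (gen_var + spec_var)


# Illustrative 6-item instrument with two 3-item specific factors:
gen = [0.7, 0.6, 0.7, 0.6, 0.7, 0.6]
specs = [[0.3, 0.4, 0.3], [0.4, 0.3, 0.4]]
print(round(ecv_general(gen, specs), 3))  # general factor's share of common variance
```

By construction, ECVGen plus the ECVs of all specific factors sums to 1, which makes for an easy sanity check on any hand computation.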

31 Excel-based PUC Calculator available from DrJosephHammer.com

32 Percent of Uncontaminated Correlations (PUC)
Instruments with more specific factors and fewer items per specific factor have a larger PUC.
Higher PUC = less bias in structural coefficients = more permissible to treat the instrument as unidimensional.
When PUC > .80, ECVGen values are less important in predicting bias.
When PUC < .80, ECVGen values > .60 and ωH > .70 suggest that the presence of some multidimensionality is not severe enough to disqualify the interpretation of the instrument as primarily unidimensional (Reise, Scheines, Widaman, & Haviland, 2013, p. 22).
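PUC depends only on how the items are divided among the specific factors. A small Python helper (the ISMI-29 subscale sizes of 6, 7, 5, 6, 5 items are taken from the published scale, not from this slide):

```python
def puc(items_per_specific):
    """Percent of Uncontaminated Correlations.

    Correlations between items on different specific factors reflect only the
    general factor (uncontaminated); correlations among items sharing a
    specific factor mix general and specific variance (contaminated).
    """
    total_items = sum(items_per_specific)
    total_corrs = total_items * (total_items - 1) / 2
    within_corrs = sum(k * (k - 1) / 2 for k in items_per_specific)
    return (total_corrs - within_corrs) / total_corrs


# ISMI-29: five specific factors with 6, 7, 5, 6, 5 items
print(round(puc([6, 7, 5, 6, 5]), 2))  # → 0.83, matching the value on the next slide
```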

33 PUC and ECVGen for ISMI-29
ISMI-29 PUC = .83
ISMI-29 ECVGen = .763
ISMI-29 ωH = .91
PUC > .80 (and, in any case, ECVGen > .60 and ωH > .70)
Thus, the ISMI-29 is essentially unidimensional.

34 General vs. Specific
Most items were stronger measures of the General factor than the Specific factors. Standardized loadings (General / Specific; a dash marks a value not recovered from the slide):
S1: ismi1 .797/.085, ismi2 .730/.147, ismi3 .532/.040, ismi4 .657/.474, ismi5 .646/.527, ismi6 .697/.393
S2: ismi7 .561/.116, ismi8 .581/.210, ismi9 .267/.439, ismi10 .666/.274, ismi11 .638/.398, ismi12 .521/.563, ismi13 .750/.271
S3: ismi14 .600/.495, ismi15 .595/.477, ismi16 .667/.554, ismi17 .647/.482, ismi18 .787/-.032
S4: ismi19 .172/—, ismi20 .743/.465, ismi21 .785/.082, ismi22 .723/.358, ismi23 .718/.107, ismi24 .735/.218
S5: ismi25 .056/.404, ismi26 .742/—, ismi27 .673/.291, ismi28 .351/.483, ismi29 .206/.349

35 Individual Explained Common Variance (IECVGen)
(Stucky et al., 2014, 2015)
Tells us how strongly each item measures the general dimension.
IECV near 1 indicates an item reflects only the general dimension.
IECV > .50 indicates an item reflects the general dimension more than a specific dimension.
If a goal of the analysis were to select a subset of items that best represent a unidimensional trait, then Stucky and Edelen (2015) suggest selecting items from the bifactor analysis that have both standardized loadings > .85 and IECVGen > .85.

36 IECVGen formula
$$\mathrm{IECV}_{j} = \frac{\lambda_{jG}^{*2}}{\lambda_{jG}^{*2} + \lambda_{jS}^{*2}}$$
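The per-item version follows the same pattern as the ECV. A one-function Python sketch, using an item from the loading table earlier in the talk:

```python
def iecv(gen_loading, spec_loading):
    """Item ECV: share of one item's common variance due to the general factor."""
    g2 = gen_loading ** 2
    s2 = spec_loading ** 2
    return g2 / (g2 + s2)


# ismi1 from the bifactor loading table: general .797, specific .085
print(round(iecv(0.797, 0.085), 3))  # close to 1: almost purely a general-factor item
```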

37 IECV Results
24 of the 29 items measure the general dimension more than the specific dimension.
9 items are very strong measures of the general dimension.

38 Average Relative Parameter Bias (ARPB)
(Rodriguez et al., 2016)
Compare item loadings on the general factor (from the bifactor solution) to loadings on the single factor (from the unidimensional solution).
Use the Excel-based ARPB Calculator from DrJosephHammer.com.
A 10–15% upper limit for ARPB was recommended by Muthén, Kaplan, and Hollis (1987).

39 How similar are Uni loadings to Gen loadings?
Unidimensional / General loadings per item (a dash marks a value not recovered from the slide):
ismi1 .788/.797, ismi2 .730/—, ismi3 .523/.532, ismi4 .699/.657, ismi5 .693/.646, ismi6 .726/.697, ismi7 .556/.561, ismi8 .586/.581, ismi9 .306/.267, ismi10 .678/.666
ismi11 .661/.638, ismi12 .572/.521, ismi13 .758/.750, ismi14 .660/.600, ismi15 .654/.595, ismi16 .738/.667, ismi17 .707/.647, ismi18 .764/.787, ismi19 .583/—, ismi20 .768/.743
ismi21 .773/.785, ismi22 .744/.723, ismi23 .710/.718, ismi24 .739/.735, ismi25 .071/.056, ismi26 .728/.742, ismi27 .665/.673, ismi28 .359/.351, ismi29 .213/.206
Similar loadings: ARPB only 5%, well below the 10–15% upper limit.
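ARPB can also be computed without the Excel calculator. A Python sketch, applied to five (Uni, Gen) pairs from the table above (a subset only, so the result will not exactly match the 5% reported for all 29 items):

```python
def arpb(uni_loadings, gen_loadings):
    """Average Relative Parameter Bias: mean absolute relative difference
    between unidimensional loadings and bifactor general-factor loadings."""
    biases = [abs(u - g) / abs(g) for u, g in zip(uni_loadings, gen_loadings)]
    return sum(biases) / len(biases)


# Five (Uni, Gen) pairs from the loading comparison table:
uni = [0.788, 0.523, 0.699, 0.693, 0.726]
gen = [0.797, 0.532, 0.657, 0.646, 0.697]
print(round(100 * arpb(uni, gen), 1))  # percent bias for these five items only
```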

40 Internal Structure Conclusion
The ISMI-29 is best conceptualized as a primarily unidimensional instrument, despite the presence of some multidimensionality.
Minimal measurement bias would be introduced by treating the ISMI-29 as unidimensional, despite the poor fit of the unidimensional CFA model.

41 Ancillary Bifactor Measures
Ancillary bifactor measures address either:
Dimensionality of the instrument, or
Model-based reliability of total and/or subscale scores

42 Model-Based Reliability
It is necessary to provide model-based reliability evidence that the instrument's total and subscale scores truly represent the target constructs of interest.
In the absence of such evidence, researchers risk misinterpreting the meaning and significance of these scores.

43 Regarding Reliability
Omega (ω) (McDonald, 1999)
Omega Hierarchical (ωH) (McDonald, 1999; Zinbarg, Barlow, & Brown, 1997)
Omega Hierarchical Subscale (ωHS) (Reise et al., 2013)
Percentage of Reliable Variance (PRV; Li et al., in preparation)
See Rodriguez et al. (2016) for a review and calculation formulas.

44 What measures for what scores?
Omega (ω) – total and subscale scores
Omega Hierarchical (ωH) – total score
Omega Hierarchical Subscale (ωHS) – subscale scores
Percentage of Reliable Variance (PRV) – total and subscale scores

45 Omega formula
$$\omega = \frac{\left(\sum_{i=1}^{n}\lambda_{i\,\mathrm{general}}\right)^{2} + \left(\sum_{i}\lambda_{i\,f1}\right)^{2} + \cdots + \left(\sum_{i}\lambda_{i\,fk}\right)^{2}}{\left(\sum_{i=1}^{n}\lambda_{i\,\mathrm{general}}\right)^{2} + \left(\sum_{i}\lambda_{i\,f1}\right)^{2} + \cdots + \left(\sum_{i}\lambda_{i\,fk}\right)^{2} + \sum_{i=1}^{n}\left(1 - h_{i}^{2}\right)}$$

46 Omega Hierarchical formula
$$\omega_{H} = \frac{\left(\sum_{i=1}^{n}\lambda_{i\,\mathrm{general}}\right)^{2}}{\left(\sum_{i=1}^{n}\lambda_{i\,\mathrm{general}}\right)^{2} + \left(\sum_{i}\lambda_{i\,f1}\right)^{2} + \left(\sum_{i}\lambda_{i\,f2}\right)^{2} + \cdots + \sum_{i=1}^{n}\left(1 - h_{i}^{2}\right)}$$

47 Omega for a subscale (f1) formula
$$\omega_{f1} = \frac{\left(\sum_{i \in f1}\lambda_{i\,\mathrm{general}}\right)^{2} + \left(\sum_{i \in f1}\lambda_{i\,f1}\right)^{2}}{\left(\sum_{i \in f1}\lambda_{i\,\mathrm{general}}\right)^{2} + \left(\sum_{i \in f1}\lambda_{i\,f1}\right)^{2} + \sum_{i \in f1}\left(1 - h_{i}^{2}\right)}$$

48 Omega Hierarchical Subscale formula
$$\omega_{HS,f1} = \frac{\left(\sum_{i \in f1}\lambda_{i\,f1}\right)^{2}}{\left(\sum_{i \in f1}\lambda_{i\,\mathrm{general}}\right)^{2} + \left(\sum_{i \in f1}\lambda_{i\,f1}\right)^{2} + \sum_{i \in f1}\left(1 - h_{i}^{2}\right)}$$
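Under a standardized bifactor solution (so each item's uniqueness is 1 − h² with h² = λG² + λS²), the four omega formulas above reduce to a few sums. A Python sketch with illustrative loadings (not the ISMI-29 estimates); items are assumed to be ordered by specific factor so that the general-loading list and the concatenated specific-loading lists align:

```python
def omega_and_omega_h(gen, specs):
    """Return (omega, omega_H) for the total score of a bifactor model.

    gen: general-factor loadings in item order; specs: per-specific-factor
    loading lists whose concatenation matches that item order. Uniqueness is
    taken as 1 - (lambda_G^2 + lambda_S^2), i.e. a standardized solution.
    """
    flat = [s for group in specs for s in group]
    err = sum(1 - (g ** 2 + s ** 2) for g, s in zip(gen, flat))
    gen_sq = sum(gen) ** 2
    spec_sq = sum(sum(group) ** 2 for group in specs)
    denom = gen_sq + spec_sq + err
    return (gen_sq + spec_sq) / denom, gen_sq / denom


def omega_sub_and_omega_hs(gen_f, spec_f):
    """Return (omega_subscale, omega_HS) for one specific factor's items."""
    err = sum(1 - (g ** 2 + s ** 2) for g, s in zip(gen_f, spec_f))
    gen_sq = sum(gen_f) ** 2
    spec_sq = sum(spec_f) ** 2
    denom = gen_sq + spec_sq + err
    return (gen_sq + spec_sq) / denom, spec_sq / denom


# Illustrative 6-item instrument, two 3-item specific factors:
gen = [0.7, 0.7, 0.7, 0.7, 0.7, 0.7]
specs = [[0.3, 0.4, 0.3], [0.4, 0.3, 0.4]]
w, wh = omega_and_omega_h(gen, specs)
ws, whs = omega_sub_and_omega_hs(gen[:3], specs[0])
print(round(w, 3), round(wh, 3), round(ws, 3), round(whs, 3))
```

Note the structure of the outputs: ωH is always at most ω (same denominator, smaller numerator), which is why the PRV ratios discussed later are bounded by 100%.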

49 Omega (ω) for total score
Proportion of total score variance that can be attributed to all common factors (i.e., true score variance, which excludes error variance).
The reliability of the multidimensional composite total score.

50 Omega, for subscale score
Proportion of subscale score variance that can be attributed to all common factors (i.e., the general factor plus the specific factor for that subscale).
The reliability of the multidimensional composite subscale score.

51 Omegas
Total score ω = .96: 96% of the total score variance is due to all common factors (general + 5 specific factors).
Subscale score ω's = .88, .85, .89, .89, .67: 67% to 89% of subscale score variance is due to the general factor plus that specific factor.

52 Omega Hierarchical (ωH)
Proportion of total score variance that can be attributed to the general factor after accounting for all specific factors.
The degree to which the total score reflects the target dimension (i.e., the general factor).

53 Omega H
"Tentatively, we can propose that a minimum would be greater than .50, and values closer to .75 would be much preferred" (Reise et al., 2013, p. 137).
Thus, ωH > .75 would indicate that the ISMI-29's total score predominantly reflects a single general factor, permitting users to interpret the total score as a sufficiently reliable measure of the general factor.
ISMI-29 ωH = .91

54 Omega Hierarchical Subscale (ωHS)
Proportion of subscale score variance that can be attributed to the specific factor after accounting for the general factor.
The degree to which the subscale score reflects the target dimension (i.e., the intended specific factor).

55 Omega HS ωHS < .50 would indicate that the majority of that subscale score’s variance is due to the general factor and that negligible unique variance is due to that specific factor. In other words, that subscale score’s reliability is overwhelmingly inflated (i.e., confounded) by the general factor and does not reliably measure the intended subdomain construct! To interpret such a subscale as capturing something unique could be misleading.

56 Omega HS
ωHS = .13, .21, .24, .09, .26
Lack of evidence in favor of using any of the five subscale scores as measures of their narrower specific factors.
Remember, ωHS > .50 is the minimum, and around .75 or higher is preferred.

57 Percentage of Reliable Variance (PRV)
(Rodriguez et al., 2016)
PRV is more definitive than ωH or ωHS alone because it accounts for ω.
Comes in two flavors: total score PRV and subscale score PRV.

58
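The slide above presumably displayed the PRV formulas. Consistent with the ω, ωH, ωHS, and PRV values reported elsewhere in this talk, total score PRV works out to ωH / ω and subscale score PRV to ωHS / ω(subscale): the reliable variance attributable to the target factor divided by all reliable variance. A quick Python check against the talk's own numbers:

```python
def prv(omega_target, omega_all):
    """Percentage of Reliable Variance: reliable variance attributable to the
    target factor (omega_H or omega_HS) over all reliable variance (omega)."""
    return 100 * omega_target / omega_all


# Total score: omega_H = .91, omega = .96 (values reported in this talk)
print(round(prv(0.91, 0.96)))  # → 95, matching the slide's total score PRV

# First subscale: omega_HS = .13, omega = .88; the slide reports 14%, and the
# two-decimal omegas here round to 15, so small rounding differences are expected.
print(round(prv(0.13, 0.88)))
```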

59 PRV, for total score
No empirically-derived guidelines exist for total score PRV.
Li et al. (in preparation) chose a PRV > 75% cutoff: over 75% of the reliable variance in the total score should be due to the general factor.
This is a prerequisite to using the raw total score.

60 PRV, for subscale score
No empirically-derived guidelines exist for subscale score PRV either.
However, if we extend the logic of Reise and colleagues' (2013) "ωHS > .50 minimum and > .75 preferred" recommendation, then we might tentatively recommend the following…

61 PRV, for subscale score
If subscale score PRV < 50%, then less than 50% of the reliable variance in that subscale score is due to its specific factor, which means the subscale score has questionable added value beyond the total score.
PRV between 50% and 75% is a gray area.
If PRV > 75%, the subscale score is a sufficiently reliable measure of its specific factor and has added value beyond the total score.

62 PRV Results for ISMI-29
Total score PRV = 95%. Meets the > 75% criterion: further evidence in favor of using the total score.
Subscale score PRVs = 14%, 25%, 26%, 10%, 38%. All fail the minimum > 50% criterion: further evidence contraindicating the use of any of the subscale scores.

63 OmegaH(S) and PRV Hint
If the OmegaH (or OmegaHS) meets the criteria, then the corresponding PRV is also likely to meet the criteria. However, in the case of disagreement, the PRV should be given more consideration, because it better accounts for the context (i.e., omega).

64 Model-Based Reliability Conclusion
Use the ISMI-29 raw total score.
Do not use any of the subscale scores. They are mostly re-measuring the general factor, so they don't provide substantial added value beyond what the total score provides.

65 Cautions
Academic consensus regarding bifactor best practices has not yet been reached.
So, take our advice with a grain of salt, and stay up to date on new bifactor literature.

66 References
Brouwer, D., Meijer, R. R., & Zevalkink, J. (2013). On the factor structure of the Beck Depression Inventory–II: G is the key. Psychological Assessment, 25, 136–145.
Cho, S., Wilmer, J., Herzmann, G., Williams McGugin, R., Fiset, D., Van Gulick, A. E., … Gauthier, I. (2015). Item response theory analyses of the Cambridge Face Memory Test (CFMT). Psychological Assessment, 27.
Li, C. R., Toland, M. D., & Usher, E. L. (in preparation). Dimensionality, scoring, and interpretation of the Short Grit Scale.
Marsh, H. W., Scalas, L. F., & Nagengast, B. (2010). Longitudinal tests of competing factor structures for the Rosenberg Self-Esteem Scale: Traits, ephemeral artifacts, and stable response styles. Psychological Assessment, 22.
McDonald, R. P. (1999). Test theory: A unified treatment. Mahwah, NJ: Erlbaum.
Muthén, B. O., Kaplan, D., & Hollis, M. (1987). On structural equation modeling with data that are not missing completely at random. Psychometrika, 52, 431–462.
Quinn, H. O. (2014). Bifactor models, explained common variance (ECV), and the usefulness of scores from unidimensional item response theory analyses (Unpublished master's thesis). University of North Carolina, Chapel Hill.
Reise, S. P. (2012). The rediscovery of bifactor measurement models. Multivariate Behavioral Research, 47.
Reise, S. P., Bonifay, W. E., & Haviland, M. G. (2013). Scoring and modeling psychological measures in the presence of multidimensionality. Journal of Personality Assessment, 95.
Reise, S. P., Moore, T. N., & Haviland, M. G. (2010). Bifactor models and rotations: Exploring the extent to which multidimensional data yield univocal scale scores. Journal of Personality Assessment, 92, 544–559.
Reise, S. P., Morizot, J., & Hays, R. D. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16.
Reise, S. P., Scheines, R., Widaman, K. F., & Haviland, M. G. (2013). Multidimensionality and structural coefficient bias in structural equation modeling: A bifactor perspective. Educational and Psychological Measurement, 73(1), 5–26.
Ritsher, J. B., Otilingam, P. G., & Grajales, M. (2003). Internalized stigma of mental illness: Psychometric properties of a new measure. Psychiatry Research, 121.
Rodriguez, A., Reise, S. P., & Haviland, M. G. (2016). Evaluating bifactor models: Calculating and interpreting statistical indices. Psychological Methods, 21.
Rodriguez, A., Reise, S. P., & Haviland, M. G. (2016). Applying bifactor statistical indices in the evaluation of psychological measures. Journal of Personality Assessment, 98.
Stucky, B. D., Edelen, M. O., Vaughan, C. A., Tucker, J. S., & Butler, J. (2014). The psychometric development and initial validation of the DCI-A short form for adolescent therapeutic community treatment process. Journal of Substance Abuse Treatment, 46, 516–521.
Stucky, B. D., & Edelen, M. O. (2015). Using hierarchical IRT models to create unidimensional measures from multidimensional data. In S. P. Reise & D. A. Revicki (Eds.), Handbook of item response theory modeling: Applications to typical performance assessment. London, UK: Taylor & Francis.
Zinbarg, R. E., Barlow, D. H., & Brown, T. A. (1997). The hierarchical structure and general factor saturation of the Anxiety Sensitivity Index: Evidence and implications. Psychological Assessment, 9.

67 Questions

69 Thank You Joseph H. Hammer, PhD joe.hammer@uky.edu
Michael D. Toland, PhD

