
1 Multiple Imputation in Finite Mixture Modeling
Daniel Lee
Presentation for the MMM conference, May 24, 2016
University of Connecticut

2 Introduction: Finite Mixture Models
A class of statistical models that treats group membership as a latent categorical variable, estimating parameters for a hypothesized number of groups, or classes, from a single data set (McLachlan & Peel, 2000). This usually involves:
– Investigating population heterogeneity in model parameters
– Finding the plausible number of latent groups
– Classifying cases into these groups
– Examining the extent to which auxiliary information can be used to evaluate the classes
Any statistical method that can be formulated as a multiple-group problem can also be formulated as a finite mixture model.
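The core idea on this slide can be illustrated with a minimal sketch (not from the talk): an EM fit of a two-class univariate Gaussian mixture, where class membership is latent and both classes' parameters are estimated from a single data set. All data and starting values below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# A single data set that actually contains two latent groups
x = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(4.0, 1.0, 200)])

# Initial guesses for the hypothesized 2 classes
pi, mu, sigma = np.array([0.5, 0.5]), np.array([-1.0, 5.0]), np.array([1.0, 1.0])

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

for _ in range(100):
    # E-step: posterior class probabilities (responsibilities) for each case
    dens = np.stack([pi[k] * normal_pdf(x, mu[k], sigma[k]) for k in range(2)])
    resp = dens / dens.sum(axis=0)
    # M-step: update mixing weights, class means, and class SDs
    nk = resp.sum(axis=1)
    pi = nk / len(x)
    mu = (resp * x).sum(axis=1) / nk
    sigma = np.sqrt((resp * (x - mu[:, None]) ** 2).sum(axis=1) / nk)

labels = resp.argmax(axis=0)  # modal class assignment for each case
```

After convergence the estimated class means sit near the true group centers (0 and 4), recovering the heterogeneity without any observed group variable.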

3 Introduction: Finite Mixture Models, example (factor mixture models)

4 Introduction: Missing Data in Finite Mixtures
Missing data handling in finite mixture models (Sterba, 2014):
– The strategy by which missingness is handled can interfere with discriminating between latent class and latent continuous models
– MVN MI, FIML-EM, and newer MI approaches were considered
MI strategies for multiple-group SEMs (Enders & Gottschall, 2011):
– Explored two MI methods with multiple groups: separate-group imputation (SGI) and product term imputation (PTI)
– Cautionary note on latent categorical variables (mixture models)

5 Introduction: Missing Data
Missing data handling in practice:
– Listwise/pairwise deletion
– Full information maximum likelihood (FIML)
– Multiple imputation (MI; Rubin, 1976)
Multiple imputation:
– Imputation phase: generate m different data sets, each with slightly different estimates for the missing values
– Analysis phase: the analysis is performed on each of the m data sets and the parameter estimates are averaged across the m results (a special rule for standard errors is provided by Rubin, 1987)
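The analysis phase described above can be sketched as follows. This is a generic implementation of Rubin's (1987) combining rules, not code from the talk, and the example numbers are invented.

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Combine m point estimates and their sampling variances (Rubin, 1987)."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)
    qbar = estimates.mean()           # pooled point estimate
    w = variances.mean()              # within-imputation variance
    b = estimates.var(ddof=1)         # between-imputation variance
    t = w + (1 + 1 / m) * b           # total variance of the pooled estimate
    return qbar, np.sqrt(t)           # pooled estimate and its standard error

# Hypothetical results from m = 5 imputed data sets
est, se = pool_rubin([0.52, 0.48, 0.50, 0.55, 0.45],
                     [0.01, 0.012, 0.011, 0.009, 0.013])
```

The pooled standard error exceeds the average within-imputation SE because the between-imputation term reflects the extra uncertainty due to the missing data.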

6 Introduction: Research Questions
When groups are unknown (mixture models), how will MI perform? From a recent discussion with Craig Enders: "The gist is that standard MI routines will not work for mixtures because they will generate imputations from a single-class model. In effect, MI leaves out the most important variable in the analysis, the latent classes, thereby biasing the resulting estimates toward a single, common class..."
In MI the group structure should be accounted for; otherwise the imputations will be poor, since the entire data set is used to generate them. There is also the label switching problem (Tueller, Drotar, & Lubke, 2011).

7 Methods: Simulation
Three variables were manipulated (12 conditions in total):
– Sample size: 50 and 250
– MCAR missing rates: 5%, 15%, 25% (even benign missing values can cause bias)
– Mahalanobis distances: low ( 4)
100 multivariate normal complete data sets were generated from a 2-group CFA model with 6 indicator variables. Each data set contained data for two groups with distinct population parameters, including a true group variable (e.g., n = 250 was split into two groups of 125 each, with different population values).
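A sketch of this generation step, under assumed population values. The slides do not give the true parameters, so the loadings, variances, and intercepts below are hypothetical; only the design (two groups of 125, six indicators, MCAR deletion) comes from the slide.

```python
import numpy as np

rng = np.random.default_rng(2016)

def gen_group(n, loadings, factor_var, resid_var, intercepts):
    """One group's data from a 1-factor CFA: y = nu + lambda * eta + eps."""
    eta = rng.normal(0.0, np.sqrt(factor_var), size=n)
    eps = rng.normal(0.0, np.sqrt(resid_var), size=(n, len(loadings)))
    return intercepts + np.outer(eta, loadings) + eps

# Hypothetical population values, distinct across the two groups
g1 = gen_group(125, np.full(6, 0.8), 1.0, 0.36, np.zeros(6))
g2 = gen_group(125, np.full(6, 0.8), 2.0, 0.36, np.full(6, 1.0))
data = np.vstack([g1, g2])            # n = 250, split 125/125

# Impose MCAR missingness at a 15% rate: every cell has equal deletion
# probability, unrelated to any observed or unobserved value
mask = rng.random(data.shape) < 0.15
data_mis = np.where(mask, np.nan, data)
```

Because the deletion mask is drawn independently of the data, the missingness is MCAR by construction.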

8 Methods: Data Generating Model
[Path diagrams of the data generating model for Group 1 and Group 2]

9 Methods: Data Analysis
Analysis 1: Used MI with 10 imputations when groups were known (a standard CFA model), following the SGI procedure. Used the built-in Mplus imputation (MI in Mplus; Asparouhov & Muthen, 2010) and a MG-CFA analysis.
– (Open question: what kind of imputation model is used here?)
Analysis 2: Used MI with 10 imputations when groups were unknown (a factor mixture model). Used Mplus for the imputation and the FMM analysis.
– Starting values: the true parameters
Estimates from Analysis 1 and Analysis 2 were compared against the true population parameters using standardized bias estimates; standardized bias greater than 0.40 was considered significant (Collins, Schafer, & Kam, 2001).
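The SGI procedure used in Analysis 1 imputes within each known group, so the completed data sets preserve the group structure. A toy stand-in, not the Mplus routine: the "imputation model" here is just each group's own column means plus noise scaled by that group's column SDs, far simpler than a real MI model, but it shows why imputing per group avoids pulling values toward a single common class.

```python
import numpy as np

def impute_by_group(data, groups, m=10, rng=None):
    """Separate-group imputation sketch: each group's missing cells are
    filled using only that group's observed data, m times over."""
    rng = rng if rng is not None else np.random.default_rng(0)
    imputed = []
    for _ in range(m):
        filled = data.copy()
        for g in np.unique(groups):
            rows = groups == g
            block = data[rows]
            mu = np.nanmean(block, axis=0)     # group-specific column means
            sd = np.nanstd(block, axis=0)      # group-specific column SDs
            miss = np.isnan(block)
            draw = mu + sd * rng.standard_normal(block.shape)
            filled[rows] = np.where(miss, draw, block)
        imputed.append(filled)
    return imputed   # m completed data sets, one analysis each

# Tiny demo: group 0 centered near 0, group 1 near 10, one missing cell each
data = np.array([[0.1, np.nan], [-0.2, 0.3], [10.2, 9.9], [np.nan, 10.1]])
groups = np.array([0, 0, 1, 1])
sets = impute_by_group(data, groups, m=10)
```

In the demo, the missing cell in group 1 is filled from group 1's observed values (near 10), not from the pooled data (near 5), which is the point of SGI.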

10 Label Switching (Tueller, Drotar, & Lubke, 2011)
A common issue in LVMM simulations. Simple example, with true generating values for the factor variances of class 1 = 2 and class 2 = 4:
– Rep. 1 LVMM estimates: class 1 = 3.9 and class 2 = 2.1 (switched)
– Rep. 2 LVMM estimates: class 1 = 1.9 and class 2 = 4.1 (OK)
– Rep. 3 LVMM estimates: class 1 = 2.0 and class 2 = 3.7 (OK)
Problem: aggregating parameter estimates over potentially mislabeled classes.
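A common fix, sketched here on the slide's example: before aggregating, permute each replication's class labels to best match the true values, and flag replications where a non-identity permutation wins. The distance-matching rule below is a standard heuristic, not necessarily the check used in this study.

```python
from itertools import permutations

def align_classes(true_vals, est_vals):
    """Relabel estimated classes by the permutation that minimizes total
    absolute distance to the true class parameters; also report whether
    a label switch was detected (a non-identity permutation won)."""
    k = len(true_vals)
    best = min(permutations(range(k)),
               key=lambda p: sum(abs(est_vals[p[i]] - true_vals[i])
                                 for i in range(k)))
    return [est_vals[i] for i in best], best != tuple(range(k))

# Replications from the slide: true factor variances are (2, 4)
for est in [(3.9, 2.1), (1.9, 4.1), (2.0, 3.7)]:
    aligned, switched = align_classes((2.0, 4.0), est)
```

With more parameters per class, the same idea applies with a vector distance per class; for many classes, a Hungarian-style assignment is used instead of brute-force permutations.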

11 Methods: Evaluation Criteria
Bias: mean(theta_hat) - theta
– |bias| > 0.05 used as the cut-off (Hoogland & Boomsma, 1998)
RMSE: sqrt(mean((theta_hat - theta)^2))
– Expected squared loss around the true parameter
Standard error ratio (e.g., Lee, Poon, & Bentler, 1995): mean(SE(theta_hat)) / SD(theta_hat)
– Values < 1 → inflated Type I error
– Values > 1 → inflated Type II error
– Non-converged replications were omitted
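The three criteria can be computed per parameter across replications; a minimal sketch with invented replication results:

```python
import numpy as np

def evaluate(est, se, theta):
    """Bias, RMSE, and SE ratio for one parameter across R replications."""
    est, se = np.asarray(est, float), np.asarray(se, float)
    bias = est.mean() - theta                       # raw bias
    rmse = np.sqrt(np.mean((est - theta) ** 2))     # root mean squared error
    se_ratio = se.mean() / est.std(ddof=1)          # avg SE vs empirical SD
    return bias, rmse, se_ratio

# Toy replication results for a parameter whose true value is 0.8
bias, rmse, ratio = evaluate([0.78, 0.83, 0.80, 0.79],
                             [0.02, 0.02, 0.02, 0.02], 0.8)
```

Here the ratio falls below 1, meaning the reported SEs understate the true sampling variability, which is the inflated Type I error case from the slide.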

12 Results: Bias

13 Results: Bias

14 Label Switching Check (Tueller, Drotar, & Lubke, 2011)

15 Results: RMSE

16 Results: Standard Error Ratio

17 Discussion and Recommendations (and Issues)
MI is not recommended for finite mixture models.
Other possible solutions:
– Different sample sizes?
– Larger differences in parameters?
Label switching:
– Does it happen at the imputation level or the analysis level?

