Latent Class Analysis Computing examples

1 Latent Class Analysis Computing examples
Karen Bandeen-Roche October 28, 2016

2 Objectives For you to leave here knowing…
How to use the LCR SAS Macro for latent class analysis
Brief introduction to poLCA in R
How to interpret and report output
How to create residuals and conduct model checking with them

3 Basics on using the software
Part I: Basics on using the software

4 SAS MACRO … Beginning of file— basic documentation
/* TITLE: LCR */
/* A SAS Macro for Latent Class Regression using PROC IML */
/* Requires SAS/IML */
/* Please send any suggestions or corrections to */
/* DESCRIPTION */
/* This program contains a macro for fitting LCA and LCR models and an example */
/* To fit a standard latent class model, only include an intercept in the model */
/* The macro uses the algorithm described in */
/* Bandeen-Roche, K; Miglioretti, DL; Zeger, SL; Rathouz, P, */
/* "Latent variable regression for multiple discrete outcomes," */
/* JASA, In Press (1997) */
/* to fit latent class regression models. */

5 SAS MACRO … This text creates output data sets!
create &outlib..beta from beta [colname='value' rowname=betaname];
append from beta [rowname=betaname];
close &outlib..beta;
create &outlib..eta from eta [colname=etaname];
append from eta;
close &outlib..eta;
create &outlib..theta from h [colname=thetanam];
append from h;
close &outlib..theta;
expct = nrow(x)#bottom;
create &outlib..expect from expct [colname='expected'];
append from expct;
close &outlib..expect;
create &outlib..pi from pi [colname=pi2name rowname=varname];
append from pi [rowname=varname];
close &outlib..pi;
create &outlib..var from var [colname=parm rowname=parm];
append from var [rowname=parm];
close &outlib..var;
title;
%mend lcr;
Output data sets include: Posterior probabilities; Expected cell counts
Before anything else, the code must be run through to here (%mend lcr;)

6 Toy example (in software: immediately follows the macro)
BINARY INDICATORS
data dataset;
set a;
if y1=. | y2=. | y3=. | y4=. | y5=. then delete;
int=1;
run;
Create “intercept” (int=1)
Designed for complete data (records with any missing indicator are deleted)

7 SAS Macro Command line format
Name of your dataset
Response variables
Covariates (just intercept for LCA)
Number of classes
Initial parameters (“0” triggers self-initialization)
Iterate to criterion precision

8 SAS Macro Output format
Top: initial estimates, # iterations
Bottom: final estimates, fit criteria

9 SAS Macro Command line format
Example with initial estimates filled in rather than self-initialization
“pi” = as we have defined it (conditional probabilities)
“eta” = our “Pj” (latent class probabilities)
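To make the roles of “pi” and “eta” concrete, here is a minimal Python sketch (not part of the LCR macro) of how a latent class model with binary indicators turns these parameters into pattern probabilities, expected cell counts, and posterior class probabilities. The parameter values, sample size, and function names are hypothetical, chosen only for illustration.

```python
from itertools import product

# Hypothetical 2-class model with 3 binary items (illustrative values only).
eta = [0.6, 0.4]                 # latent class probabilities
pi = [[0.1, 0.2, 0.1],           # pi[j][m] = Pr(item m = 1 | class j)
      [0.8, 0.7, 0.9]]

def class_joint(y, j):
    """Pr(class j and pattern y), assuming conditional independence within class."""
    p = eta[j]
    for m, ym in enumerate(y):
        p *= pi[j][m] if ym == 1 else 1.0 - pi[j][m]
    return p

def pattern_prob(y):
    """Marginal probability of response pattern y (sum over classes)."""
    return sum(class_joint(y, j) for j in range(len(eta)))

def posterior(y):
    """Posterior class membership probabilities given pattern y."""
    total = pattern_prob(y)
    return [class_joint(y, j) / total for j in range(len(eta))]

n = 100  # hypothetical sample size
patterns = list(product([0, 1], repeat=3))
expected = {y: n * pattern_prob(y) for y in patterns}  # expected cell counts
```

The expected count for a pattern is the sample size times its model-implied probability, which is the same kind of quantity the macro saves as “expected.”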

10 R function: poLCA
> poLCA(formula = cbind(Y1, Y2, Y3, Y4, Y5) ~ 1, data = j2, nclass = 2)
Conditional item response (column) probabilities, by outcome variable, for each class (row)
$V1 through $V5: Pr(1) and Pr(2) for class 1 and class 2
Estimated class population shares

11 R function: poLCA
=========================================================
Fit for 2 latent classes:
number of observations: 100
number of estimated parameters: 11
residual degrees of freedom: 20
maximum log-likelihood:
AIC(2):
BIC(2):
G^2(2): (Likelihood ratio/deviance statistic)
X^2(2): (Chi-square goodness of fit)
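The parameter count and residual degrees of freedom in this output can be checked by hand. The Python sketch below assumes binary items and a model with no covariates; the function names are hypothetical, and the AIC/BIC helpers use the standard definitions (-2 log-likelihood plus a penalty).

```python
import math

def lca_param_count(n_items, n_classes):
    """Free parameters with binary items: one conditional probability per
    item per class, plus (n_classes - 1) latent class probabilities."""
    return n_items * n_classes + (n_classes - 1)

def lca_residual_df(n_items, n_classes):
    """Residual df: (number of response patterns - 1) minus free parameters."""
    return (2 ** n_items - 1) - lca_param_count(n_items, n_classes)

def aic(loglik, n_params):
    """Akaike information criterion."""
    return -2.0 * loglik + 2.0 * n_params

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion."""
    return -2.0 * loglik + n_params * math.log(n_obs)

n_params = lca_param_count(5, 2)      # 5 items, 2 classes
resid_df = lca_residual_df(5, 2)
```

With 5 binary items and 2 classes this gives 11 parameters and 20 residual degrees of freedom, matching the output above.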

12 Post-traumatic stress disorder
Part II: Application Post-traumatic stress disorder



15 Data set up (immediately following Macro)
Pull in pre-existing data
A convenient way to code “patterns”
Dataset to pass on to LCA

16 Pattern frequency listing
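pscor = b1 + 10*b4 + 100*b5 + 1000*c… + … + …*d3;
Each of the nine binary symptom indicators occupies one decimal digit of pscor, so the value reads directly as a response pattern:
0 = no symptoms
1 = b1 only
10 = b4 only
11 = b1 & b4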
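The pattern coding and frequency listing can be sketched in Python as follows (the slides do this in SAS). The records are made up, only three indicators are used, and the function name is hypothetical.

```python
from collections import Counter

def pattern_score(indicators):
    """Encode 0/1 indicators as a base-10 number, with the first
    indicator in the ones place, the second in the tens place, and so on."""
    return sum(bit * 10 ** k for k, bit in enumerate(indicators))

# Hypothetical records, indicators ordered (b1, b4, b5).
records = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0), (1, 1, 0)]

# Pattern frequency listing: pattern code -> observed count.
freq = Counter(pattern_score(r) for r in records)
for code in sorted(freq):
    print(code, freq[code])
```

Because every indicator is 0 or 1, each digit of the code shows whether the corresponding symptom is present.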

17 LCA Macro “Call”
Name of dataset
Response variables (9 of them)
Number of classes
Initial parameters
“Canned” initialization and other starts yield the same estimates
I am arranging for the “low” symptom probability class to be the “last” class (relevant for LCR)

18 Output
Class: 1 2 3
Latent class probabilities (e.g., Class 3 prevalence)
Class conditional probabilities (e.g., Pr(B1=1|Class 3))

19 Classes reordered for reporting

20 Revisiting the Model for “Fit”
5 class model
“None,” “PTSD” classes very stable
AIC, BIC: both lower
LR test: better

21 Revisiting the Model for “Fit”
Five class model appears “better”
Trustworthy? Data quite sparse!
Seeing is believing, thus…

22 Checking Fit - Residuals
Standardized residuals (multinomial)
In this case, the raw residual for each response pattern is the observed cell count minus the expected cell count under the fitted model.

23 Expected counts: SAS Macro
Pull the expected values into a dataset. They’re labeled “expected”; rename them to avoid code-word problems
Sort and tabulate to show the pattern, observed count, and expected count

24 Observed vs. Expected Comparison
Three class vs. five class
Cut and paste into Excel, then use Stat/Transfer to move the data to Stata

25 Data structure in Stata
Variables: Pattern, n (observed count), tclass (3-class expected count), fclass (5-class expected count)

26 QQPlot of residuals, 5 vs 3 class
. gen resid3=n-tclass
. gen resid5=n-fclass
. qqplot resid5 resid3

27 QQPlot of standardized residuals, 5 vs 3 class
. gen sresid3 = resid3/sqrt(tclass*(1-tclass/1827))
. gen sresid5 = resid5/sqrt(fclass*(1-fclass/1827))
. qqplot sresid5 sresid3
Favors 5-class model
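The standardization in the Stata commands above divides each raw residual by the multinomial standard deviation of its cell count. A minimal Python sketch of the same formula follows; the observed and expected values are hypothetical, and only the total of 1827 comes from the slides.

```python
import math

def standardized_residual(observed, expected, n_total):
    """(observed - expected) / sqrt(expected * (1 - expected / n_total))."""
    return (observed - expected) / math.sqrt(expected * (1.0 - expected / n_total))

N = 1827  # total sample size used in the Stata commands

# Hypothetical cell: 30 people observed with a pattern, 25 expected.
z = standardized_residual(30, 25.0, N)
print(round(z, 3))
```

For sparse cells the factor (1 - expected / n_total) is close to 1, so the result is nearly the Pearson residual (observed - expected) / sqrt(expected).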

28 Listing–Largest |Standardized Residual| Differences
Negative differences favor the 3-class model. Only a few are large, and these have small n.
. gen sadiff = abs(sresid3)-abs(sresid5)
. sort sadiff
. list Pattern sadiff n tclass fclass sresid3 sresid5

29 Listing–Largest |Standardized Residual| Differences
. gen sadiff = abs(sresid3)-abs(sresid5)
. sort sadiff
. list Pattern sadiff n tclass fclass sresid3 sresid5
Both models underestimate the number having all symptoms.
Positive values favor the 5-class model. A few large values have considerably large n (e.g., 110 = cues create distress, reactivity without re-experiencing).
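The comparison in the listing can be sketched in Python as below. The rows are invented for illustration (they are not the PTSD data), and the names mirror the Stata variables.

```python
import math

N = 1827  # total sample size from the slides

def sresid(obs, exp, n_total=N):
    """Multinomial standardized residual for one response pattern."""
    return (obs - exp) / math.sqrt(exp * (1.0 - exp / n_total))

# Hypothetical rows: (pattern, observed n, 3-class expected, 5-class expected).
rows = [("000", 900, 880.0, 895.0),
        ("110", 60, 40.0, 55.0),
        ("011", 5, 4.0, 8.0)]

table = []
for pattern, n, tclass, fclass in rows:
    s3, s5 = sresid(n, tclass), sresid(n, fclass)
    sadiff = abs(s3) - abs(s5)   # > 0: 5-class fits this cell better
    table.append((pattern, sadiff, n, s3, s5))

table.sort(key=lambda r: r[1])   # most negative (favoring 3-class) first
for row in table:
    print(row[0], round(row[1], 2), row[2])
```

Sorting by sadiff puts the cells where the 3-class model fits better at the top and those where the 5-class model fits better at the bottom, which is how the two listings on these slides are read.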

30 Conclusion
The latent class model fit suggests a nosology with subpopulations exhibiting “few” (just over half), “many” (~14%) and “re-experiencing plus a few other” symptoms.
The conditional independence assumption may not be reasonable for these data.

