Latent Class Regression: Computing Examples
Karen Bandeen-Roche
October 28, 2016
Objectives
For you to leave here knowing…
- How to use the LCR SAS Macro for latent class regression
- How to interpret and report output
- Model checking: how to conduct and interpret pseudo-value analysis
Part I: Basics on using the software
General recommendations
- Fit an LCA
- Add covariates one at a time
- Initialize at prior fits and an “agnostic” value for the coefficient of the newly added covariate
Reminder – Toy Example
Initialization using Previous Fit
Result of the previous fit:

Covariate   beta         sebeta
int         -0.248578    0.3123104
COV1         1.8922928   0.5594732

Use these estimates as starting values, and supply an initial coefficient for the newly added covariate.
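The initialization idea can be sketched in Python. This is only an illustration of the bookkeeping, not the LCR macro's own syntax. The covariate name `COV2` is hypothetical, and 0 is assumed as the “agnostic” starting value.

```python
# Estimates from the previous fit (intercept and COV1), as reported on the slide.
previous_fit = {"int": -0.248578, "COV1": 1.8922928}

# Hypothetical name for the covariate being added; 0.0 is the assumed "agnostic" start.
NEW_COVARIATE = "COV2"
AGNOSTIC_START = 0.0


def build_initial_values(prev, new_name, start=AGNOSTIC_START):
    """Return starting values: previous estimates plus a neutral value for the new term."""
    init = dict(prev)  # copy so the previous fit is left untouched
    init[new_name] = start
    return init


initial_values = build_initial_values(previous_fit, NEW_COVARIATE)
print(initial_values)
```

Starting the new coefficient at 0 means the first iteration begins from the previously fitted model, with the new covariate initially having no effect.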
Part II: Back to the PTSD Example
PTSD Latent Class Analysis
Symptom probability (π) by class; overall symptom prevalence in parentheses

SYMPTOM DOMAIN / symptom              Class 1    Class 2       Class 3
                                      NO PTSD    PRECLINICAL   PTSD
RE-EXPERIENCE (1 of 5)
  Recurrent thoughts (.49)            .20        .74           .96
  Distress to event cues (.42)        .12        .68           .88
  Reactivity to cues (.31)            .05        .51           .77
AVOIDANCE/NUMBING (3 of 7)
  Avoid related thoughts (.28)        .08        .37           .75
  Avoid activities (.24)              --         .34           .66
  Detachment (.15)                    .01        .14           .64
INCREASED AROUSAL (2 of 5)
  Difficulty sleeping (.19)           .02        .18           .78
  Irritability (.21)                  --         .22           .83
  Difficulty concentrating (.25)      .03        .30           .89
MEAN PREVALENCE-BASELINE              .52        .33           --

-- : value missing in the source; the surviving values in that row are aligned by magnitude.

[Omitted: nightmares, flashback; amnesia, interest, affect, short future; hypervigilance, startle]
Report concordance with diagnosis: 79.1 Sens, 93.7 Spec, 63.7 PPV, 97.1 NPV; NOT FIT
Females: 3 times the RP of males; assault: between 2.5 and 10 times the RP of other traumas
LCR Coding – Three Class Model
Covariates:
- (Intercept)
- Sex (1 if female, 0 if male)
- Trauma type: indicators for injury / other shock, trauma to loved one, and death of loved one; personal assault = reference
LCR Coding – Three Class Model
Initialization:
- Intercept and “female” coefficients: taken from the model with only “female” as covariate
- Three trauma type indicators: initialized at 0 for both the 1st vs. 3rd class and the 2nd vs. 3rd class comparisons
Part III: Parameter interpretation, inference
Covariate Coefficient Estimates
Log Relative Prevalence Ratios, “PTSD” vs “NONE”
Example: “FEMALE”
- exp(1.137) = 3.117
- Odds of “PTSD” vs “NONE” are 3.12 times as high in females as in males (holding trauma type constant)
- 95% CI = exp(1.137 ± 1.96 × 0.172) = (2.224, 4.367)
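The interval arithmetic above can be reproduced with a short Python sketch. The coefficient 1.137 and standard error 0.172 are the slide's values; the function name `ratio_with_ci` is made up for illustration.

```python
import math

# Values from the slide: log relative prevalence ratio for FEMALE and its standard error.
coef = 1.137
se = 0.172
Z_975 = 1.96  # normal quantile for a two-sided 95% interval


def ratio_with_ci(beta, se_beta, z=Z_975):
    """Exponentiate a log-scale coefficient and its Wald confidence limits."""
    estimate = math.exp(beta)
    lower = math.exp(beta - z * se_beta)
    upper = math.exp(beta + z * se_beta)
    return estimate, lower, upper


estimate, lower, upper = ratio_with_ci(coef, se)
print(f"ratio = {estimate:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Because the slide's inputs are rounded to three decimals, the recomputed limits may differ from the reported ones in the last digit.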
Data support that females are at substantially higher risk than males, and that persons with traumas other than assault are at substantially lower risk
Covariate Coefficient Estimates
Log Relative Prevalence Ratios, “Some Symptoms” vs “NONE” (Class 2 vs 3)
Covariates: FEMALE, injshock, traumlov, deathlov
- exp(0.109) = 1.115 = relative prevalence of “Some symptoms” vs “None” (Class 2 vs Class 3) in male assault victims (reference group)
- Prevalence of “Some symptoms” in male assault victims = exp(0.109) / [1 + exp(-0.535) + exp(0.109)] = 0.413
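The prevalence calculation generalizes to all three classes. The Python sketch below assumes that -0.535 is the intercept for the class 1 vs. class 3 comparison, as the denominator of the slide's formula suggests. The dictionary keys and function name are illustrative only.

```python
import math

# Intercepts from the slide, on the log relative prevalence scale vs. class 3 ("None").
# The reference group (all covariates 0) is male assault victims.
# Assumption: -0.535 belongs to the class 1 vs. class 3 comparison.
intercepts = {"class1_vs_3": -0.535, "class2_vs_3": 0.109}


def class_prevalences(log_ratios):
    """Convert log relative prevalences (vs. a reference class) to prevalences."""
    # The reference class contributes exp(0) = 1 to the denominator.
    denom = 1.0 + sum(math.exp(v) for v in log_ratios.values())
    prev = {name: math.exp(v) / denom for name, v in log_ratios.items()}
    prev["reference"] = 1.0 / denom
    return prev


prevalences = class_prevalences(intercepts)
for name, value in prevalences.items():
    print(f"{name}: {value:.3f}")
```

For another covariate pattern, the same function applies after adding the relevant coefficients to each intercept.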
Interaction analysis
“PTSD” versus “None”
Interactions for “some symptoms” versus “none”: negligible
Interactions?
- AIC, BIC slightly increased (vs. no interactions)
- -2 log likelihood comparison:

  Model             loglik       df
  No interactions   -14742.54    37
  Interactions      -14731.83    43

- Difference = 10.71 on (43 - 37) = 6 df
- p-value = 0.10 (χ² with 6 df)
- A suggestion that the extent of the F vs. M increase in risk is trauma-type dependent
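The p-value can be checked with a small Python sketch that uses only the standard library. It takes the slide's difference of 10.71 as the chi-square statistic, as the slide does. If the listed values are log-likelihoods rather than -2 log likelihoods, the statistic would be twice as large. The helper `chi2_sf_even_df` is written for this example and relies on the closed form available when the degrees of freedom are even.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variate; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    # P(X > x) = exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    terms = (half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * sum(terms)


statistic = 10.71      # difference reported on the slide, taken as the test statistic
df_diff = 43 - 37      # extra parameters in the interaction model
p_value = chi2_sf_even_df(statistic, df_diff)

# Under the same assumption, AIC(interactions) - AIC(no interactions):
aic_change = 2 * df_diff - statistic
print(f"p = {p_value:.3f}, AIC change = {aic_change:.2f}")
```

A positive AIC change means the interaction model has the slightly larger AIC, in line with the slide's first bullet.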
Part IV: Model Checking
Checking How the Model Fails to Fit
Basic ideas:
- Suppose the model is true
- If we knew persons’ latent class memberships, we would check directly:
  - Within classes:
    - Check correlations or pairwise odds ratios among the item responses (conditional independence)
    - Regress item responses on covariates (non-differential measurement)
  - Regress class memberships on covariates:
    - Hope for similar findings re regression coefficients
    - Hope for no strong effects of outliers
    - Identify strongly nonlinear covariate effects
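The conditional independence check can be sketched in Python as follows. The responses are made up for illustration, and only three items are used (`b1`, `b4`, `b5`). Adding 0.5 to each cell is an assumed continuity correction, not something the slides specify.

```python
from itertools import combinations


def pairwise_odds_ratios(responses, item_names):
    """Odds ratio for every item pair from rows of 0/1 responses (one row per person)."""
    results = {}
    for a, b in combinations(range(len(item_names)), 2):
        # 2x2 cell counts, each started at 0.5 (continuity correction, an assumed choice)
        n11 = n10 = n01 = n00 = 0.5
        for row in responses:
            if row[a] and row[b]:
                n11 += 1
            elif row[a]:
                n10 += 1
            elif row[b]:
                n01 += 1
            else:
                n00 += 1
        results[(item_names[a], item_names[b])] = (n11 * n00) / (n10 * n01)
    return results


# Made-up responses for people in one class, three items only.
items = ["b1", "b4", "b5"]
class_members = [
    (1, 1, 0), (1, 1, 1), (0, 0, 1), (0, 0, 0),
    (1, 1, 0), (0, 0, 1), (1, 0, 0), (0, 1, 1),
]
odds_ratios = pairwise_odds_ratios(class_members, items)
for pair, value in odds_ratios.items():
    print(pair, round(value, 2))
```

If conditional independence holds, within-class odds ratios should be close to 1. Values far from 1 point to item pairs that tend to be reported together, or apart, beyond what class membership explains.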
Checking How the Model Fails to Fit
But in reality, we don’t know the true latent class memberships!
- Latent class memberships must be estimated
- Randomize people into “pseudo” classes C* using their posterior probabilities, or assign each person to the “most likely class” (the one with the highest posterior probability)
- Posterior probability: the probability of membership in each class given the person’s observed item responses and covariates, under the fitted model
- Analyze as described before, except using “pseudo” class memberships rather than true ones
References: Bandeen-Roche, Miglioretti, Zeger & Rathouz, J Am Statist Assoc., 1997; Huang & Bandeen-Roche, 2004; Wang, Brown & Bandeen-Roche, 2005
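For reference, the posterior probability in a latent class regression with binary items usually takes the form below. The slide's own formula is not available, so the notation here is an assumption: $\eta_j(x_i)$ is the prevalence of class $j$ given covariates $x_i$, $\pi_{mj}$ is the probability of a positive response to item $m$ in class $j$, $y_{im}$ is person $i$'s response to item $m$, $M$ is the number of items, and $J$ is the number of classes. It assumes conditional independence and non-differential measurement.

```latex
\theta_{ij} = P(C_i = j \mid Y_i = y_i, x_i)
  = \frac{\eta_j(x_i) \prod_{m=1}^{M} \pi_{mj}^{\,y_{im}} (1 - \pi_{mj})^{1 - y_{im}}}
         {\sum_{k=1}^{J} \eta_k(x_i) \prod_{m=1}^{M} \pi_{mk}^{\,y_{im}} (1 - \pi_{mk})^{1 - y_{im}}}
```

In the implementation slides, these quantities appear to correspond to the variables `theta1`, `theta2`, and `theta3`.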
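PTSD analysis Implementation
Step 1: Obtain posterior probabilities in a dataset

    /* combine posterior probabilities (theta) with the analysis data */
    data post;
      merge theta dataset;
    run;

    /* write posteriors, items, and covariates to a text file */
    data junk;
      set post;
      file 'ptsd1pos.dat';
      put theta1 theta2 theta3 b1 b4 b5 c1 c2 c5 d1 d2 d3
          female injshock traumlov deathlov;
    run;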
PTSD analysis Implementation Step 2: Randomize individuals into pseudo classes C* (see next slide…)
    # read data: 16 columns per person, as written by the SAS step
    data <- matrix(scan("ptsd1pos.dat"), ncol = 16, byrow = TRUE)
    data <- cbind(1:nrow(data), data)

    # establish a randomization vector: one uniform(0,1) draw per person,
    # kept on the 0-1 scale so it can be compared with probabilities
    rvec <- runif(nrow(data))
    data <- cbind(data, rvec)

    # labels
    dimnames(data) <- list(NULL, c("id", "theta1", "theta2", "theta3",
                                   "b1", "b4", "b5", "c1", "c2", "c5",
                                   "d1", "d2", "d3", "female", "injshock",
                                   "traumlov", "deathlov", "random"))
    nrow(data)

    # posterior probabilities
    post <- data[, c("theta1", "theta2", "theta3")]
    print(dim(post))

    # randomization: class j is chosen when rvec falls in the j-th cumulative interval
    cum12 <- post[, 1] + post[, 2]
    ptsd1pos <- matrix(0, ncol = ncol(post), nrow = nrow(post))
    ptsd1pos[, 1] <- 1 * (rvec <= post[, 1])
    ptsd1pos[, 2] <- 1 * ((rvec > post[, 1]) & (rvec <= cum12))
    ptsd1pos[, 3] <- 1 * (rvec > cum12)

    # complete data
    ptsd1pos <- cbind(data, ptsd1pos)
    dimnames(ptsd1pos) <- list(NULL, c(dimnames(data)[[2]],
                                       "class1", "class2", "class3"))
PTSD analysis Implementation
Step 3: Evaluation of [Y|x] “per” C*
- Time-saving method: GEE to analyze [Y|x,c*]
- “GEE2”: Heagerty & Zeger, 1996
- ordgee in R (geepack package)
PTSD analysis Implementation
Two concurrent regressions:
- Mean model: logistic regression of each item Yim on covariates
  - “Basic”: item, pseudo-class indicators, and their interactions (reproduces the measurement model)
  - Item-by-x terms: assess differential measurement
- Association model: describes pairwise odds ratios, the factors by which the odds of a positive response on one item vary by response on another item
  - M-choose-2 “outcomes” per person (pairs)
Model Checking Conclusions
- Differential measurement: symptoms were differentially sensitive to different traumas
  - Within latent classes, those with a non-assaultive trauma were
    - less prone to report distress to cues, reactivity to cues, avoiding thoughts, and avoiding activities
    - more prone to report recurrent thoughts and difficulty concentrating
- Conditional dependence: there was a considerable tendency for symptoms within categories (esp. avoidance) to be reported together
- Concern: criteria may detect psychiatric sequelae of assault less sensitively than sequelae of traumas other than assault
Objectives
For you to leave here knowing…
- How to use the LCR SAS Macro for latent class regression
- How to interpret and report output
- Model checking: how to conduct and interpret pseudo-value analysis
EXTRA SLIDES