Mixed latent Markov models for longitudinal multiple diagnostics data with an application to salmonella in Malawi

Marc Henrion, Angeziwa Chirambo, Tonney S. Nyirenda, Melita Gordon
Malawi-Liverpool-Wellcome Trust Clinical Research Programme
Liverpool School of Tropical Medicine
mhenrion@mlw.mw

JSM 2018 - Vancouver, Canada
30 July 2018

Speaker notes:
Don't talk too much here!

Latent Markov Models

Latent process, observed outcome variables at each time point
Conditional on latent state, outcome variables assumed independent
Salmonella data from Malawi: longitudinal data on multiple (binary) diagnostic tests, no gold standard

[Diagram: latent states D1, D2, ..., DT, starting from the initial probabilities (IP) and linked by the transition probabilities TP1, TP2, ..., TP(T-1); this chain is the STRUCTURAL MODEL. At each time point t the latent state Dt generates the outcomes y1,t, y2,t, ..., yk,t through the conditional response probabilities (CRP); this is the MEASUREMENT MODEL.]

Speaker notes:
LMM = latent variable model
Extends latent class analysis (LCA) to longitudinal data
Structural model + measurement model
Completely specified by IP, TP and CRP
To keep maths tractable, typically assume independence of outcomes conditional on latent state
Application to Salmonella data: 5 tests, namely 4 molecular tests plus the reference stool culture
Conditional independence unlikely to hold: the 4 molecular tests use the same PCR technology and, in pairs, the same primers
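
To make the IP / TP / CRP structure concrete, here is a minimal simulation sketch of a two-state basic LMM. It is an illustration only, not the authors' code: the function name `simulate_lmm`, the time-homogeneous transition probabilities and all numeric values are assumptions chosen for the example (only n = 60, T = 12 and k = 5 echo the application).

```python
import numpy as np

def simulate_lmm(n, T, ip, tp01, tp11, crp0, crp1, seed=0):
    """Simulate latent states d (n x T) and binary test results y (n x T x k).

    ip   : P(D^(1) = 1), initial state probability
    tp01 : P(D^(t) = 1 | D^(t-1) = 0), assumed constant over time here
    tp11 : P(D^(t) = 1 | D^(t-1) = 1), assumed constant over time here
    crp0 : length-k sequence, P(Y_m = 1 | state = 0)
    crp1 : length-k sequence, P(Y_m = 1 | state = 1)
    """
    rng = np.random.default_rng(seed)
    crp = np.vstack([crp0, crp1])          # row d holds P(Y_m = 1 | state = d)
    k = crp.shape[1]

    # Structural model: initial state, then a first-order Markov chain.
    d = np.zeros((n, T), dtype=int)
    d[:, 0] = rng.random(n) < ip
    for t in range(1, T):
        p_one = np.where(d[:, t - 1] == 1, tp11, tp01)
        d[:, t] = rng.random(n) < p_one

    # Measurement model: given the state, each test is drawn independently.
    y = (rng.random((n, T, k)) < crp[d]).astype(int)
    return d, y

# Illustrative values only (not estimates from the Malawi data).
d, y = simulate_lmm(n=60, T=12, ip=0.3, tp01=0.1, tp11=0.8,
                    crp0=[0.05] * 5, crp1=[0.9] * 5, seed=1)
```

The last line of the function is where conditional independence enters: every test result depends on the latent state only, which is exactly the assumption questioned for the PCR tests.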

Relaxing the conditional independence assumption

Basic LMM:

$$
\begin{aligned}
P(\mathbf{Y}=\mathbf{y};\vartheta) = \prod_{i=1}^{n} \sum_{\mathbf{d}_i} \; & P\big(D_i^{(1)} = d_i^{(1)}\big) \\
& \times \prod_{t=2}^{T} \Big[ (1-\tau_{01,t})^{(1-d_i^{(t)})(1-d_i^{(t-1)})} \, \tau_{01,t}^{\,d_i^{(t)}(1-d_i^{(t-1)})} \, (1-\tau_{11,t})^{(1-d_i^{(t)})\,d_i^{(t-1)}} \, \tau_{11,t}^{\,d_i^{(t)} d_i^{(t-1)}} \Big] \\
& \times \prod_{t=1}^{T} \prod_{m=1}^{k} \Big[ \big(\varphi_{10,m}^{(t)}\big)^{y_{im}^{(t)}(1-d_i^{(t)})} \big(1-\varphi_{10,m}^{(t)}\big)^{(1-y_{im}^{(t)})(1-d_i^{(t)})} \big(\varphi_{11,m}^{(t)}\big)^{y_{im}^{(t)} d_i^{(t)}} \big(1-\varphi_{11,m}^{(t)}\big)^{(1-y_{im}^{(t)})\,d_i^{(t)}} \Big]
\end{aligned}
$$

Product over all patients: $\prod_{i=1}^{n}$
Sum over all possible latent state combinations: $\sum_{\mathbf{d}_i}$
Initial state probabilities: $P(D_i^{(1)} = d_i^{(1)})$
Transition probabilities: $\tau_{01,t}$, $\tau_{11,t}$
Conditional response probabilities: $\varphi_{yd,m} = P(Y_m = y \mid \text{state} = d)$, with the superscript $(t)$ denoting the time point

Now add random effect to CRPs:

$$
\varphi_{yd,i,m} = \frac{\exp(\alpha_{d,m} + \beta_m Z_i)}{1 + \exp(\alpha_{d,m} + \beta_m Z_i)}, \qquad Z_i \sim N(0, \sigma^2)
$$

Bayesian approach
Frequentist much harder.
Principled way of dealing with missing values.

Other authors have considered mixed LMMs, but typically consider random effects in the latent process / structural model.

Speaker notes:
Basic LMM likelihood: product over patients of the sum over latent state combinations of the product of IP, TP and CRPs
CRP term further factorises into a product over outcome variables; this is what keeps the maths tractable
Add subject-specific random effects using a logistic function; trivial to add covariates into the CRPs
Now you get large integrals to compute: Bayesian approach much easier, also better for missing data
Could do a similar extension to the transition probabilities, but this has been done before
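
As a sanity check on the basic LMM likelihood, the sketch below evaluates one patient's contribution in two ways: by the literal sum over all latent state paths, and by the standard forward recursion for hidden Markov models. The forward recursion is a swapped-in textbook technique, not something the slides say the authors use. The function names, the time-homogeneous TP and CRP, and all numeric values are assumptions for illustration.

```python
import itertools
import numpy as np

def emission(y_t, crp):
    """P(y_t | state = d) for d = 0, 1 under conditional independence.

    y_t : length-k 0/1 vector of test results at one time point
    crp : 2 x k array, crp[d, m] = P(Y_m = 1 | state = d)
    """
    return np.prod(np.where(y_t == 1, crp, 1.0 - crp), axis=1)

def lik_brute_force(y, ip, tp, crp):
    """Sum over all 2^T latent paths, mirroring the likelihood formula.

    y  : T x k array of 0/1 results for one patient
    ip : P(D^(1) = 1)
    tp : 2 x 2 array, tp[a, b] = P(D^(t) = b | D^(t-1) = a)
    """
    T = y.shape[0]
    init = np.array([1.0 - ip, ip])
    total = 0.0
    for path in itertools.product([0, 1], repeat=T):
        p = init[path[0]]                      # initial state probability
        for t in range(1, T):
            p *= tp[path[t - 1], path[t]]      # transition probabilities
        for t in range(T):
            p *= emission(y[t], crp)[path[t]]  # conditional response probabilities
        total += p
    return total

def lik_forward(y, ip, tp, crp):
    """Same quantity via the forward recursion: O(T) instead of O(2^T)."""
    alpha = np.array([1.0 - ip, ip]) * emission(y[0], crp)
    for t in range(1, y.shape[0]):
        alpha = (alpha @ tp) * emission(y[t], crp)
    return alpha.sum()

# Illustrative values only.
tp = np.array([[0.9, 0.1],     # from state 0: stay 0, move to 1
               [0.2, 0.8]])    # from state 1: move to 0, stay 1
crp = np.array([[0.05, 0.10],  # P(test positive | state 0)
                [0.90, 0.85]]) # P(test positive | state 1)
y = np.array([[1, 0], [1, 1], [0, 0]])
print(lik_brute_force(y, 0.3, tp, crp), lik_forward(y, 0.3, tp, crp))
```

The full-sample likelihood is the product of these per-patient values. With subject-specific random effects in the CRPs, each patient's value additionally has to be integrated over Z_i, which is the source of the "large integrals" in the notes.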

Simulations

Assess convergence, identifiability issues, ...
Converges (up to label switching)
Compare performance of basic and mixed LMMs: DIC, ...

[Figure: two panels, "basic LMM" and "mixed LMM".]

Speaker notes:
Simulations to check: convergence, identifiability issues
Compare performance: DIC, bias of MAP estimates, importance of prior, ...
Results: yes, identifiable and converges, up to the label switching typical for Bayesian mixture models; easily fixed after running MCMC
Less biased MAP (or other) estimates
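
One possible way to fix label switching after running MCMC is sketched below for a two-state LMM. The slides do not say which relabelling rule was used; the convention here (state 1 is whichever state has the higher average probability of a positive test) and the function name `relabel` are assumptions for illustration. Swapping the two labels maps IP to 1 - IP, exchanges the CRP rows, and maps (TP01, TP11) to (1 - TP11, 1 - TP01).

```python
import numpy as np

def relabel(ip, tp01, tp11, crp0, crp1):
    """Undo label switching in posterior draws of a two-state LMM.

    ip, tp01, tp11 : shape (S,), one value per MCMC draw
    crp0, crp1     : shape (S, k), P(Y_m = 1 | state = 0) and (| state = 1)
    Assumed convention: state 1 has the higher mean positive-test probability.
    Draws that violate the convention have their state labels swapped.
    """
    ip, tp01, tp11 = (np.asarray(a, dtype=float) for a in (ip, tp01, tp11))
    crp0 = np.asarray(crp0, dtype=float)
    crp1 = np.asarray(crp1, dtype=float)

    flip = crp1.mean(axis=1) < crp0.mean(axis=1)   # draws with swapped labels
    flip2 = flip[:, None]                          # broadcast over the k tests
    return (
        np.where(flip, 1.0 - ip, ip),       # P(D = 1) becomes P(D = 0)
        np.where(flip, 1.0 - tp11, tp01),   # new 0 -> 1 is old 1 -> 0
        np.where(flip, 1.0 - tp01, tp11),   # new 1 -> 1 is old 0 -> 0
        np.where(flip2, crp1, crp0),
        np.where(flip2, crp0, crp1),
        flip,
    )

# Two draws describing the same model; the second has its labels swapped.
ip_r, tp01_r, tp11_r, crp0_r, crp1_r, flipped = relabel(
    ip=[0.3, 0.7], tp01=[0.1, 0.2], tp11=[0.8, 0.9],
    crp0=[[0.05, 0.10], [0.90, 0.85]],
    crp1=[[0.90, 0.85], [0.05, 0.10]],
)
```

After relabelling, both draws carry identical parameter values, so posterior summaries such as MAP estimates are no longer blurred across the two labellings.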

Application to Malawi salmonella data

4 new molecular PCR tests + stool culture
60 patients observed monthly for 12 months

Results
TTR primer PCR test achieves best sensitivity/specificity trade-off
Data too sparse for time-heterogeneous and mixed LMMs, despite these being biologically more plausible

Speaker notes:
One specific PCR test is best: TTR
Figure shows MAP estimates with 95% highest posterior density credible intervals
Unfortunately [embarrassingly?], even though this data motivated the model development, the more complex models (mixed, time-heterogeneous) do not converge [further work?]
COME AND TALK TO ME THIS AFTERNOON!
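
For readers unfamiliar with highest posterior density intervals, the sketch below shows one common sample-based way to compute them: take the shortest interval that contains 95% of the sorted MCMC draws. How the intervals in the figure were actually computed is not stated; the function name `hpd_interval`, the unimodality assumption and the simulated draws are illustrative assumptions.

```python
import numpy as np

def hpd_interval(samples, mass=0.95):
    """Shortest interval containing `mass` of the draws.

    Assumes the posterior is unimodal, so the HPD region is a single interval.
    """
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    m = int(np.ceil(mass * n))            # number of draws the interval must hold
    widths = x[m - 1:] - x[: n - m + 1]   # width of every window of m sorted draws
    j = int(np.argmin(widths))            # start of the narrowest window
    return x[j], x[j + m - 1]

# Simulated, skewed "posterior" draws, e.g. for a sensitivity close to 1.
rng = np.random.default_rng(0)
draws = rng.beta(30, 2, size=10_000)
lo, hi = hpd_interval(draws)
eq_lo, eq_hi = np.percentile(draws, [2.5, 97.5])
print((lo, hi), (eq_lo, eq_hi))
```

For a skewed posterior, as is typical for a sensitivity or specificity near 0 or 1, the HPD interval is shorter than the equal-tailed interval and sits closer to the boundary.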