Setting Control Limits Using Bayesian Methods When Most Observations Are Below the Limit of Quantitation
  Steven Novick, Harry Yang, and Wei Zhao
  May 18, 2016, MBSW
Manuscript submitted to the PDA Journal of Pharmaceutical Science and Technology
  Awaiting decision…
Aseptic environment regulations
  2004: FDA Guidance for Industry. Sterile Drug Products Produced by Aseptic Processing - Current Good Manufacturing Practice
  2008: EU Guidelines to Good Manufacturing Practice, Medicinal Products for Human and Veterinary Use, Annex 1: Manufacture of Sterile Medicinal Products (corrected version)
  2013: USP "Microbiological Control and Monitoring of Aseptic Processing Environments"
Purpose of the environmental monitoring program
  Oversight of the microbiological cleanliness of the manufacturing operation
  Document the state of control of the facility
Data collection for surface sampling
  Surfaces of equipment may contain microbiological contaminants.
  On a particular testing occasion, several swabs are taken from a piece of equipment and assayed.
  Testing is performed over time.
  [Figure: sequence of testing occasions Test 1, Test 2, …, Test I]
Key to success
  Establishment of alert and action control limits.
  “[…] usually at a distance of ±3 standard deviations […] from the […] mean.”
Some agreement in the environmental monitoring (EM) literature
  Let Y = microbial assay response (from a swab)
  Step 1: Model the data with a distribution
  Step 2: Create control limits from quantiles
    – Alert limit: upper 95% quantile of Y
    – Action limit: upper 99% quantile of Y
  Many Y values < limit of quantitation (LOQ)
Literature
  2003: Christensen et al. "Environmental monitoring based on a hierarchical Poisson-Gamma model", J. Qual. Technol.
  2004: Hoffman, D. "Negative binomial control limits for count data with extra-Poisson variation", Pharmaceutical Statistics.
  2013: Yang et al. "Environmental Monitoring: Setting Alert and Action Limits Based on a Zero-Inflated Model", PDA J. Pharm. Sci. and Tech.
Many Y values < limit of quantitation (LOQ)
  Some authors set Y < LOQ to Y = 0.
  Negative binomial = gamma-Poisson: Y ~ Poisson(λ_i), λ_i ~ gamma(α, β)
  Or zero-inflated NB
  [Figure: plot with the LOQ marked]
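The gamma-Poisson identity named above can be written out explicitly. The sketch below assumes a shape-rate parameterization gamma(α, β) and writes λ for the Poisson mean; the symbol names and the parameterization are assumptions, since the slide text does not state them.

```latex
\begin{aligned}
P(Y = y) &= \int_0^\infty \frac{\lambda^{y} e^{-\lambda}}{y!}\,
            \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,
            \lambda^{\alpha-1} e^{-\beta\lambda}\, d\lambda \\
         &= \frac{\Gamma(y+\alpha)}{\Gamma(\alpha)\, y!}
            \left(\frac{\beta}{1+\beta}\right)^{\alpha}
            \left(\frac{1}{1+\beta}\right)^{y},
            \qquad y = 0, 1, 2, \dots
\end{aligned}
```

Under this parameterization the mean is α/β and the variance is α(1+β)/β², which exceeds the mean; this is the extra-Poisson variation that motivates the negative binomial for count data.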
When Y values are set to 0
  Normal distribution not appropriate: mean ± 3 SD = (-13, 21)
  Use a counting distribution?
  Log-normal not appropriate (must deal with 0s in the data)... or is it?
  [Figure: plot with the LOQ marked]
Tobit likelihood (Tobin, 1958)
  [Equation: likelihood with one factor labeled "Observed responses" and one labeled "Left-censored responses"]
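A minimal sketch of a Tobit log-likelihood, assuming log(Y) is normal with mean mu and standard deviation sigma and that every response below the LOQ is left-censored at the LOQ. The function name, argument names, and the log-normal assumption are illustrative and are not taken from the slide. Observed responses contribute a density term; censored responses contribute the probability of falling below the LOQ.

```python
import math
from statistics import NormalDist

def tobit_loglik(mu, sigma, observed, n_censored, loq):
    """Log-likelihood of log-normal data left-censored at loq.

    observed:   quantified responses (each >= loq), original scale
    n_censored: number of responses reported only as "< LOQ"
    """
    dist = NormalDist(mu, sigma)  # assumed distribution of log(Y)
    # Observed responses: normal density of log(y). The Jacobian term
    # -log(y) is dropped because it does not depend on mu or sigma.
    ll = sum(math.log(dist.pdf(math.log(y))) for y in observed)
    # Left-censored responses: P(log Y < log LOQ), once per response.
    ll += n_censored * math.log(dist.cdf(math.log(loq)))
    return ll

# Hypothetical values, for illustration only.
print(tobit_loglik(mu=0.0, sigma=1.5, observed=[7.2, 11.0], n_censored=38, loq=6.0))
```

Because the censored term depends only on the count of censored responses, values below the LOQ never need to be replaced by 0 or any other number.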
[Figure: plot with the LOQ marked]
Data collection for surface sampling
  Testing is performed over time with sub-sampling
  Suggesting: two variance components (2VC)
    – Testing occasion
    – Swab within testing occasion
  [Figure: sequence of testing occasions Test 1, Test 2, …, Test I]
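To make the two-level structure concrete, here is a small simulation sketch. It assumes a log-scale model log(Y_ij) = mu + t_i + e_ij with normally distributed occasion effects t_i and swab effects e_ij; the function name, argument names, and all numeric values are hypothetical.

```python
import math
import random

def simulate_2vc(n_tests, swabs_per_test, mu, sigma_t, sigma_e, loq, seed=0):
    """Simulate left-censored swab data with two variance components.

    Returns one list per testing occasion; each entry is a float
    (a quantified response) or None (a response below the LOQ).
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_tests):
        t_i = rng.gauss(0.0, sigma_t)  # testing-occasion effect
        occasion = []
        for _ in range(swabs_per_test):
            # swab-within-occasion variation on the log scale
            y = math.exp(mu + t_i + rng.gauss(0.0, sigma_e))
            occasion.append(y if y >= loq else None)
        data.append(occasion)
    return data

# Hypothetical settings, for illustration only.
data = simulate_2vc(n_tests=5, swabs_per_test=4, mu=0.0,
                    sigma_t=0.5, sigma_e=1.0, loq=6.0)
n_total = sum(len(occ) for occ in data)
n_censored = sum(y is None for occ in data for y in occ)
print(f"{n_censored} of {n_total} responses below the LOQ")
```

Swabs from the same occasion share t_i, so they are correlated; that shared term is what the testing-occasion variance component captures.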
Tobit likelihood for 2VC
  [Equation: likelihood with factors labeled "Observed responses" and "Left-censored responses", and the testing-occasion effect labeled "Integrate over this"]
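One way to write a two-variance-component Tobit likelihood is sketched below. It assumes log(Y_ij) = μ + t_i + e_ij with t_i ~ N(0, σ_T²) for testing occasion i and e_ij ~ N(0, σ_e²) for swab j within that occasion, where φ and Φ are the standard normal density and distribution function. The notation is an assumption; the original equation did not survive extraction.

```latex
L(\mu, \sigma_T, \sigma_e)
  = \prod_{i=1}^{I} \int_{-\infty}^{\infty}
    \Biggl[ \prod_{j:\, y_{ij} \ge \mathrm{LOQ}}
      \frac{1}{\sigma_e}\,
      \phi\!\left( \frac{\log y_{ij} - \mu - t_i}{\sigma_e} \right) \Biggr]
    \Biggl[ \prod_{j:\, y_{ij} < \mathrm{LOQ}}
      \Phi\!\left( \frac{\log \mathrm{LOQ} - \mu - t_i}{\sigma_e} \right) \Biggr]
    \frac{1}{\sigma_T}\, \phi\!\left( \frac{t_i}{\sigma_T} \right) dt_i
```

The first bracket is the contribution of the observed responses, the second that of the left-censored responses, and the final factor is the density of the occasion effect that is integrated out.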
Go Bayes!
  Software: Stan via rstan
  Weakly informative priors for (μ, σ_T, σ_e)
  4 independent MCMC chains
    – Burn-in = 20,000
    – Thinning = 25
    – Posterior draws after thinning = 10,000 per chain
  Total posterior draws = 40,000
    – Effective sample sizes all > 10,000
Stan model pseudo-code
  [Code listing with blocks labeled "Integrate over this", "Likelihood for observed responses", and "Likelihood for left-censored responses"]
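The original listing is not recoverable, so what follows is a Python sketch of the same three-part structure, not the authors' Stan code. It assumes a log-normal model in which the testing-occasion effects are explicit parameters; an MCMC sampler integrates over them by drawing them along with the other parameters. All names are hypothetical, and the priors on mu, sigma_t, and sigma_e are omitted because the slides describe them only as weakly informative.

```python
import math
from statistics import NormalDist

def joint_log_density(mu, sigma_t, sigma_e, t, data, loq):
    """Log of (likelihood x occasion-effect density), priors omitted.

    t:    one effect per testing occasion (sampled by MCMC)
    data: one list per occasion; floats are quantified responses,
          None marks a response below the LOQ
    """
    lp = 0.0
    effect = NormalDist(0.0, sigma_t)
    for t_i, occasion in zip(t, data):
        # Density of the occasion effect: the part that is integrated over.
        lp += math.log(effect.pdf(t_i))
        swab = NormalDist(mu + t_i, sigma_e)  # log(Y_ij) given t_i
        for y in occasion:
            if y is None:
                # Likelihood for a left-censored response.
                lp += math.log(swab.cdf(math.log(loq)))
            else:
                # Likelihood for an observed response.
                lp += math.log(swab.pdf(math.log(y)))
    return lp

# Hypothetical data: two occasions, three swabs each.
example = [[None, None, 8.5], [None, None, None]]
print(joint_log_density(0.0, 0.5, 1.0, t=[0.3, -0.2], data=example, loq=6.0))
```

Treating the effects as parameters avoids numerical integration inside the likelihood, at the cost of one extra sampled parameter per testing occasion.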
Sample from the posterior distribution
Example data
  I = 194 testing occasions
  1-16 swabs/test
  ~97.5% of values < LOQ = 6 μg/25 cm²
Alert/action limits
  Lower 95% credible limit of quantiles
    – 95%-ile = 2.6 < LOQ
    – 99%-ile = 6.9 (just barely above LOQ)
    – 99.5%-ile = 9.4 (maybe…)
  [Figure: action limit and alert limit marked]
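A sketch of how a lower credible limit of a quantile could be computed from posterior draws. It assumes each draw is a tuple (mu, sigma_t, sigma_e) and that a future swab satisfies log(Y) ~ N(mu, sqrt(sigma_t² + sigma_e²)). The function name and the simple order-statistic rule for the credible limit are illustrative choices, not the authors' implementation.

```python
import math
import random
from statistics import NormalDist

def limit_from_draws(draws, p, cred=0.95):
    """Lower credible limit of the p-th quantile of Y.

    draws: sequence of (mu, sigma_t, sigma_e) posterior draws
    """
    z = NormalDist().inv_cdf(p)
    # Quantile of Y implied by each draw (total SD combines both components).
    q = sorted(math.exp(mu + z * math.hypot(s_t, s_e)) for mu, s_t, s_e in draws)
    # Lower limit: the (1 - cred) empirical quantile of the per-draw values.
    k = min(int((1.0 - cred) * len(q)), len(q) - 1)
    return q[k]

# Stand-in draws generated at random; NOT real posterior output.
rng = random.Random(1)
fake_draws = [(rng.gauss(-1.0, 0.2), abs(rng.gauss(0.5, 0.1)), abs(rng.gauss(1.2, 0.1)))
              for _ in range(4000)]
print("alert  (95%-ile):", limit_from_draws(fake_draws, 0.95))
print("action (99%-ile):", limit_from_draws(fake_draws, 0.99))
```

Taking the lower credible limit makes the control limit conservative: parameter uncertainty pulls the limit down, so an excursion is flagged sooner than it would be with the posterior median of the quantile.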
97.5% of values < LOQ?
Conditional distribution
  Lower 95% credible limit of conditional quantiles
    – 95%-ile = 18
    – 99%-ile = […]
    – […]%-ile = 35
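The slide does not say what the distribution is conditioned on. The sketch below assumes it is the distribution of Y given that Y is at or above the LOQ, with log(Y) ~ N(mu, sigma_total); both that reading and the function name are assumptions. The conditional quantile solves (F(y) - F(LOQ)) / (1 - F(LOQ)) = p for y.

```python
import math
from statistics import NormalDist

def conditional_quantile(mu, sigma_total, p, loq):
    """p-th quantile of Y given Y >= LOQ, for log(Y) ~ N(mu, sigma_total)."""
    dist = NormalDist(mu, sigma_total)
    f_loq = dist.cdf(math.log(loq))  # P(Y < LOQ)
    # Map the conditional probability p back to an unconditional one.
    return math.exp(dist.inv_cdf(f_loq + p * (1.0 - f_loq)))

# Hypothetical parameter values, for illustration only.
print(conditional_quantile(mu=-1.0, sigma_total=1.4, p=0.95, loq=6.0))
```

Applying this function to every posterior draw and then taking the lower 5% point of the results would give a lower 95% credible limit of the conditional quantile.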
So much data! Do we really need two variance components?
Model log(Y_ij) ~ N(mean = μ, SD = σ_Total)?
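[Figure/table: 2VC vs. 1VC comparison of the 95%-ile, […]%-ile, and […]%-ile; the only surviving values are 35 and 39 for the last percentile, listed in the order 2VC, 1VC]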
Will this hold true with massive left-censoring?
  Short answer: Yes
  Long answer: Yes, but…
Simulations: 1VC
  Generally OK.
  Coverage for alert/action limits (AL) may be inadequate when σ_T is large: the limits are too confident.
  Better for situations with small σ_T.
Simulations: 2VC
  Generally OK.
  When the % of Y < LOQ is very large and the number of test occasions and/or swabs per test is small, there is not enough data to support the model.
  Good for situations with enough data to estimate σ_T and σ_e.
Summary
  Tobit likelihood: models both observed and left-censored observations.
  Bayes: useful for calculating the lower 95% credible limit of alert/action limits.
  1VC may be adequate for some EM data (when σ_T is small).
  2VC works well when the data can support the model (i.e., there must be enough Y ≥ LOQ).
Possible extensions
  For discrete data, instead of a zero-inflated NB, apply the Tobit likelihood to the NB.
  For discrete data, model the components of variability through the mean parameter.
Thank you!