Application of Monte Carlo Methods for Process Modeling John Kauffman, Changning Guo FDA\CDER Division of Pharmaceutical Analysis Jean-Marie Geoffroy Takeda Global Research and Development The opinions expressed in this presentation are those of the authors, and do not necessarily represent the opinions or policies of the FDA.
Outline Why propagate uncertainty in regression-based process models? Why use Monte Carlo (MC) simulation? Why solve regression models with MC?
What is Design Space? Design space is “the multidimensional combination and interaction of input variables and process parameters that have been demonstrated to provide assurance of quality”. (ICH Q8) Process modeling (DOE) is a central component of design space determination.
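Design Space Schematic [Figure: axes Parameter #1 and Parameter #2; regions labeled Knowledge Space and Design Space]
Design Space Schematic with uncertainty [Figure: axes Parameter #1 and Parameter #2; regions labeled Knowledge Space and Design Space]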
Case Study #1: Modeling 45 minute dissolution (D45) of a tableting process
3² Factorial Experimental Design: Granulating Water (GS: 36-38 kg), Granulating Power (P: 18.5-22.5 kW)
Nested Compression Factors: Compression Force (CF: 11.5-17.5 kN), Press Speed (S: 70-110 kTPH)
Least Squares Predictive Model*: $D45 = 68.35 - 1.34(GS) - 2.88(P) - 8.95(CF) + 2.43(GS)^2$
* Parameter values are mean-centered and range-scaled.
In Puerto Rico the granulation endpoint was not power alone, and it was variable. So upon transfer to the UK, GP was selected as an endpoint, and DOE was required to study and validate the process.
Publication Reference: “Application of Quality by Design Knowledge (QbD) From Site Transfers to Commercial Operations Already in Progress,” J. PAT, Jan/Feb, pg. 8, 2006.
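The footnote says the parameter values are mean-centered and range-scaled but does not give the exact coding. A minimal sketch of evaluating the model, assuming each variable is coded as (value - midpoint) / half-range over the ranges listed on this slide. The names `RANGES`, `code`, and `predict_d45` are illustrative, not from the presentation.

```python
# Assumed coding (not stated in the slides): (value - midpoint) / half-range,
# so each studied range maps onto [-1, 1].
RANGES = {"GS": (36.0, 38.0), "P": (18.5, 22.5), "CF": (11.5, 17.5)}

def code(name, value):
    """Convert a physical value to its coded (centered, scaled) value."""
    lo, hi = RANGES[name]
    return (value - (lo + hi) / 2.0) / ((hi - lo) / 2.0)

def predict_d45(gs_kg, p_kw, cf_kn):
    """Evaluate the least squares model from the slide on coded inputs."""
    gs, p, cf = code("GS", gs_kg), code("P", p_kw), code("CF", cf_kn)
    return 68.35 - 1.34 * gs - 2.88 * p - 8.95 * cf + 2.43 * gs**2

# At the midpoint of every range all coded values are 0,
# so the prediction reduces to the intercept.
center = predict_d45(37.0, 20.5, 14.5)
print(round(center, 2))
```

Under this coding the intercept, 68.35, is the predicted D45 at the center of the design.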
Diagram of Experimental Design [Figure: grid of granulation water (36, 37, 38 kg) versus granulation power (18.5, 20.5, 22.5 kW), with compression force and press speed shown as nested factors]
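Experimentation and Process Modeling
Experiment 1: inputs $GS_{\mathrm{exp},1}$, $P_{\mathrm{exp},1}$, $CF_{\mathrm{exp},1}$; response $D45_{\mathrm{exp},1}$
Experiment 2: inputs $GS_{\mathrm{exp},2}$, $P_{\mathrm{exp},2}$, $CF_{\mathrm{exp},2}$; response $D45_{\mathrm{exp},2}$
Experiment 3: inputs $GS_{\mathrm{exp},3}$, $P_{\mathrm{exp},3}$, $CF_{\mathrm{exp},3}$; response $D45_{\mathrm{exp},3}$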
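Experimentation and Process Modeling
$D45_{\mathrm{pred},1} = B_0 + B_1 \cdot GS_{\mathrm{exp},1} + B_2 \cdot P_{\mathrm{exp},1} + B_3 \cdot CF_{\mathrm{exp},1} + B_4 \cdot GS^2_{\mathrm{exp},1}$
$D45_{\mathrm{pred},2} = B_0 + B_1 \cdot GS_{\mathrm{exp},2} + B_2 \cdot P_{\mathrm{exp},2} + B_3 \cdot CF_{\mathrm{exp},2} + B_4 \cdot GS^2_{\mathrm{exp},2}$
$D45_{\mathrm{pred},3} = B_0 + B_1 \cdot GS_{\mathrm{exp},3} + B_2 \cdot P_{\mathrm{exp},3} + B_3 \cdot CF_{\mathrm{exp},3} + B_4 \cdot GS^2_{\mathrm{exp},3}$
Propagation of uncertainty in process model predictions: All model coefficient variances and process variable variances contribute to each predicted response uncertainty in a model-dependent manner.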
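For comparison with the Monte Carlo approach discussed later, the conventional first-order (Taylor series) propagation for this model can be written out term by term. This is a sketch under the assumption that errors in the coefficients and in the process variables are all mutually independent; covariance terms and higher-order terms are omitted, and the $\sigma$ symbols are introduced here for illustration.

$$
\begin{aligned}
\mathrm{Var}(D45_{\mathrm{pred}}) \approx{} & \sigma_{B_0}^2 + GS^2 \sigma_{B_1}^2 + P^2 \sigma_{B_2}^2 + CF^2 \sigma_{B_3}^2 + GS^4 \sigma_{B_4}^2 \\
& + (B_1 + 2 B_4 \, GS)^2 \sigma_{GS}^2 + B_2^2 \sigma_{P}^2 + B_3^2 \sigma_{CF}^2
\end{aligned}
$$

The $(B_1 + 2 B_4 \, GS)^2$ factor shows the model dependence: because of the quadratic term, the contribution of GS variance changes with the GS operating point.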
Propagation of Uncertainty in Regression Modeling What procedures can be used to estimate uncertainty in design space? What is the benefit of propagating uncertainty using Monte Carlo simulation?
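The Process Model: Matrix Representation
$$
\underbrace{\begin{bmatrix} D45_{\mathrm{pred},1} \\ D45_{\mathrm{pred},2} \\ D45_{\mathrm{pred},3} \end{bmatrix}}_{\text{Response matrix } R}
=
\underbrace{\begin{bmatrix}
1 & GS_{\mathrm{exp},1} & P_{\mathrm{exp},1} & CF_{\mathrm{exp},1} & GS^2_{\mathrm{exp},1} \\
1 & GS_{\mathrm{exp},2} & P_{\mathrm{exp},2} & CF_{\mathrm{exp},2} & GS^2_{\mathrm{exp},2} \\
1 & GS_{\mathrm{exp},3} & P_{\mathrm{exp},3} & CF_{\mathrm{exp},3} & GS^2_{\mathrm{exp},3}
\end{bmatrix}}_{\text{Design matrix } D}
\underbrace{\begin{bmatrix} B_0 \\ B_1 \\ B_2 \\ B_3 \\ B_4 \end{bmatrix}}_{B}
$$
$R = DB$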
Least Squares Solution to a Process Model
Matrix representation of process model: $R = DB$
Define the pseudoinverse of D: $D^\dagger = (D^T D)^{-1} D^T$
Solving for the model coefficients: $D^\dagger R = B$
The pseudoinverse solution of a matrix equation gives the least squares best estimates of the B coefficients!
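The pseudoinverse solution can be sketched in NumPy as follows. The nine-run coded design and the noise-free responses are hypothetical stand-ins for the real experimental data, which the slides do not list; the coefficient values are taken from the Case Study #1 model only to generate synthetic responses.

```python
import numpy as np

# Hypothetical coded design: 3 x 3 factorial in GS and P, with CF cycled
# over three levels so that all five model columns are independent.
levels = [-1.0, 0.0, 1.0]
D = np.array([[1.0, gs, p, levels[(i + j) % 3], gs**2]
              for i, gs in enumerate(levels)
              for j, p in enumerate(levels)])      # N x p design matrix

B_true = np.array([68.35, -1.34, -2.88, -8.95, 2.43])
R = D @ B_true                                     # synthetic, noise-free responses

# Pseudoinverse as defined on the slide: (D^T D)^-1 D^T
D_dagger = np.linalg.inv(D.T @ D) @ D.T
B = D_dagger @ R

print(np.allclose(B, B_true))                      # coefficients are recovered
print(np.allclose(D_dagger, np.linalg.pinv(D)))    # agrees with NumPy's pinv
```

In practice `np.linalg.lstsq` or `np.linalg.pinv` is usually preferred over forming the inverse explicitly, since both are more stable when the design matrix is poorly conditioned.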
Estimating Variance in Prediction: The Basis for Uncertainty in Design Space
Response covariance matrix: $\mathrm{Cov}(R) = B\,[\mathrm{Cov}(D)]\,B^T$
Jth experimental variance = Jth diagonal element of $\mathrm{Cov}(R)$
Assumptions: Only D has uncertainty.
Problems: 1.) We know that B has uncertainty. 2.) We know that uncertainties in D will be correlated, but we don’t know Cov(D).
Estimating Variance of Process Model Regression Coefficients
$\mathrm{Cov}(B) = [D^T D]^{-1}\,\delta_R^2$
Response variance (p = # model coefficients, N = # experiments): $\delta_R^2 = \dfrac{\sum_{i=1}^{N} (R_i - \hat{R}_i)^2}{N - p}$
Model coefficient covariance matrix: $\mathrm{Cov}(B) = D^\dagger\,[\mathrm{Cov}(R)]\,D^{\dagger T}$
Jth model coefficient variance = Jth diagonal element of $\mathrm{Cov}(B)$
Assumptions: Only R has uncertainty; errors uncorrelated and constant.
Problems: 1.) We know that D (matrix of input variables) has uncertainty. 2.) We suspect that uncertainties may be correlated.
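A minimal sketch of the regression-based coefficient variance estimate, using a hypothetical nine-run coded design and synthetic responses with an assumed 2% measurement noise (neither is from the presentation). It also checks that the two expressions for Cov(B) on this slide coincide when Cov(R) is a constant times the identity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical coded design and synthetic noisy responses (illustrative only).
levels = [-1.0, 0.0, 1.0]
D = np.array([[1.0, gs, p, levels[(i + j) % 3], gs**2]
              for i, gs in enumerate(levels)
              for j, p in enumerate(levels)])
B_true = np.array([68.35, -1.34, -2.88, -8.95, 2.43])
R = D @ B_true + rng.normal(0.0, 2.0, size=len(D))   # assumed 2% response noise

n_exp, n_coef = D.shape                               # N experiments, p coefficients
B_hat, *_ = np.linalg.lstsq(D, R, rcond=None)
resid = R - D @ B_hat

# Response variance: sum of squared residuals / (N - p)
resp_var = resid @ resid / (n_exp - n_coef)

# Cov(B) = (D^T D)^-1 * response variance
cov_B = np.linalg.inv(D.T @ D) * resp_var
coef_sd = np.sqrt(np.diag(cov_B))                     # Jth coefficient std. dev.

# Same result from D_dagger Cov(R) D_dagger^T with uncorrelated, constant errors
D_dagger = np.linalg.pinv(D)
cov_B_alt = D_dagger @ (resp_var * np.eye(n_exp)) @ D_dagger.T
print(np.allclose(cov_B, cov_B_alt))
print(coef_sd)
```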
Monte Carlo Methods
1. Develop a mathematical model: the process model.
2. Add random variables: replace quantities of interest with random numbers selected from appropriate distribution functions that are expected to describe the variables.
3. Monitor selected output variables: output variables become distributions whose properties are determined by the model and the distributions of the random variables.
Advantage #1: We make no assumptions concerning sources of uncertainty or covariance between variables.
Case Study #1: Influence of Process Parameter Variation on Prediction Model Conditions GS mean = 36 kg P mean = 20 kW CF mean = 14 kN Input parameter standard deviations were varied. Dissolution values were predicted.
Example: Simulation 1
GS: Mean = 36 kg, Std. Dev. = 0.25 kg
P: Mean = 20 kW, Std. Dev. = 1 kW
CF: Mean = 14 kN, Std. Dev. = 1 kN
D45 Simulation Result: Mean = 74.6%, Std. Dev. = 3.70%
$D45 = 68.35 - 1.34(GS) - 2.88(P) - 8.95(CF) + 2.43(GS)^2$
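A sketch of how a simulation like this could be run. It assumes normally distributed inputs and a coding of (value - midpoint) / half-range over the studied ranges; the slides state neither the distributions nor the coding, so both are assumptions, as is the sample count.

```python
import numpy as np

def predict_d45(gs_kg, p_kw, cf_kn):
    # Assumed coding: (value - midpoint) / half-range of the studied ranges
    # GS 36-38 kg, P 18.5-22.5 kW, CF 11.5-17.5 kN.
    gs = (gs_kg - 37.0) / 1.0
    p = (p_kw - 20.5) / 2.0
    cf = (cf_kn - 14.5) / 3.0
    return 68.35 - 1.34 * gs - 2.88 * p - 8.95 * cf + 2.43 * gs**2

rng = np.random.default_rng(1)
n = 100_000                                  # number of Monte Carlo draws (assumed)

# Replace each input with draws from an assumed normal distribution.
gs = rng.normal(36.0, 0.25, n)
p = rng.normal(20.0, 1.0, n)
cf = rng.normal(14.0, 1.0, n)

# The output variable becomes a distribution.
d45 = predict_d45(gs, p, cf)
print(f"mean = {d45.mean():.1f}%, std. dev. = {d45.std(ddof=1):.2f}%")
```

Under these assumptions the output lands close to the reported result; small differences are expected because the authors' exact setup is not given.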
Example: Simulations 1-4
Simulation 1: GS Std. Dev. = 0.25 kg, P Std. Dev. = 1 kW, CF Std. Dev. = 1 kN, D45 Std. Dev. = 0%. D45 Simulation: Mean = 74.6%, Std. Dev. = 3.70%
Simulation 2: GS Std. Dev. = 0.5 kg, P Std. Dev. = 1 kW, CF Std. Dev. = 1 kN, D45 Std. Dev. = 0%. D45 Simulation: Mean = 75.0%, Std. Dev. = 4.59%
Simulation 3: GS Std. Dev. = 1 kg, P Std. Dev. = 1 kW, CF Std. Dev. = 1 kN, D45 Std. Dev. = 0%. D45 Simulation: Mean = 76.9%, Std. Dev. = 7.89%
Simulation 4: GS Std. Dev. = 1 kg, P Std. Dev. = 2 kW, CF Std. Dev. = 1 kN, D45 Std. Dev. = 0%. D45 Simulation: Mean = 76.9%, Std. Dev. = 8.26%
Influence of Process Parameters Variation
Increase in granulation water mass (GS) variance: increases predicted D45 variance; slightly shifts predicted D45 means; skews the predicted D45 distributions.
Increase in granulator power (P) endpoint variance: does not shift predicted D45 means; does not skew the predicted D45 distributions.
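These trends follow from the form of the model: GS enters through a squared term, while P enters only linearly. A sketch of a sensitivity sweep over the four input settings, assuming normal inputs and half-range coding (both assumptions, as are the helper names `simulate` and `cases`):

```python
import numpy as np

def predict_d45(gs_kg, p_kw, cf_kn):
    # Assumed coding: (value - midpoint) / half-range of the studied ranges.
    gs = (gs_kg - 37.0) / 1.0
    p = (p_kw - 20.5) / 2.0
    cf = (cf_kn - 14.5) / 3.0
    return 68.35 - 1.34 * gs - 2.88 * p - 8.95 * cf + 2.43 * gs**2

def simulate(gs_sd, p_sd, cf_sd, n=200_000, seed=0):
    """Return (mean, std. dev., skewness) of simulated D45."""
    rng = np.random.default_rng(seed)
    d45 = predict_d45(rng.normal(36.0, gs_sd, n),
                      rng.normal(20.0, p_sd, n),
                      rng.normal(14.0, cf_sd, n))
    z = (d45 - d45.mean()) / d45.std()
    return d45.mean(), d45.std(ddof=1), (z**3).mean()

# Input standard deviations (GS kg, P kW, CF kN) for Simulations 1-4.
cases = {1: (0.25, 1.0, 1.0), 2: (0.5, 1.0, 1.0),
         3: (1.0, 1.0, 1.0), 4: (1.0, 2.0, 1.0)}
results = {k: simulate(*sds) for k, sds in cases.items()}

for k, (mean, sd, skew) in results.items():
    print(f"Simulation {k}: mean = {mean:.1f}%, sd = {sd:.2f}%, skew = {skew:.2f}")
```

Comparing runs that differ in a single input standard deviation is what lets the contributions be ranked.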
Influence of Dissolution Measurement Error Model Conditions GS mean = 36 kg P mean = 20 kW CF mean = 14 kN Input parameter standard deviations were varied. Dissolution measurement error was added. Dissolution values were predicted.
Example: Simulations 5-7
Control: GS Std. Dev. = 0.5 kg, P Std. Dev. = 1 kW, CF Std. Dev. = 0.5 kN, D45 Std. Dev. = 0%. D45 Simulation: Mean = 75.0%, Std. Dev. = 3.88%
Simulation 5: GS Std. Dev. = 0.5 kg, P Std. Dev. = 1 kW, CF Std. Dev. = 0.5 kN, D45 Std. Dev. = 2%. D45 Simulation: Mean = 75.0%, Std. Dev. = 4.37%
Simulation 6: GS Std. Dev. = 0.5 kg, P Std. Dev. = 1 kW, CF Std. Dev. = 0.5 kN, D45 Std. Dev. = 4%. D45 Simulation: Mean = 75.0%, Std. Dev. = 5.58%
Simulation 7: GS Std. Dev. = 0.5 kg, P Std. Dev. = 1 kW, CF Std. Dev. = 0.5 kN, D45 Std. Dev. = 6%. D45 Simulation: Mean = 75.0%, Std. Dev. = 7.16%
Influence of Dissolution Measurement Error
Increase in D45 measurement variance: does not shift predicted D45 means; does not appear to skew predicted D45 distributions; increases predicted D45 variance.
Advantage #2: We get the distribution, not just the standard deviation.
Advantage #3: Sensitivity analysis allows us to prioritize process improvement.
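One way to read the Simulations 5-7 results is as a quadrature sum, assuming the D45 measurement error is independent of the process-parameter contribution and adds to the control standard deviation of 3.88%:

$$
\sigma_{\mathrm{total}} = \sqrt{3.88^2 + \sigma_{\mathrm{meas}}^2}:
\qquad
\sqrt{3.88^2 + 2^2} \approx 4.37, \quad
\sqrt{3.88^2 + 4^2} \approx 5.57, \quad
\sqrt{3.88^2 + 6^2} \approx 7.15
$$

These values agree closely with the simulated standard deviations of 4.37%, 5.58%, and 7.16%, which is consistent with a symmetric additive error that widens the distribution without shifting or skewing it.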
Measurement Uncertainty and Prediction Uncertainty [Figure; labels: uncertainty (%), Benchmark]
Monte Carlo Prediction Error [Figure; labels: Std. Error, Random Coefficients, Random Response, Random Inputs, Result using estimated coefficient St. Dev.]
Prediction Error Based on Estimated Coefficient Standard Deviations
Estimated model coefficient standard deviations do not predict the observed response uncertainty.
Can we use Monte Carlo simulation to provide better estimates of model coefficient standard deviations?
Cases compared: model coefficient uncertainty only (note: N matters!); coefficient and D uncertainty; coefficient, D, and R uncertainty; measured R.
Can we condense R and D uncertainty into B uncertainty?
Propagation of Uncertainty in Process Modeling
The pseudoinverse of D: $D^\dagger = (D^T D)^{-1} D^T$
Solving for the model coefficients: $D^\dagger R = B$
$B_1 = D^\dagger_{11} \cdot R_{\mathrm{exp},1} + D^\dagger_{12} \cdot R_{\mathrm{exp},2} + D^\dagger_{13} \cdot R_{\mathrm{exp},3} + \dots$
$B_2 = D^\dagger_{21} \cdot R_{\mathrm{exp},1} + D^\dagger_{22} \cdot R_{\mathrm{exp},2} + D^\dagger_{23} \cdot R_{\mathrm{exp},3} + \dots$
1. Assign random variables to dissolution values (R) and use Monte Carlo simulations to propagate error to the model coefficients (B).
2. Assign random variables to process parameters (D) and use Monte Carlo simulations to propagate error to B.
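The two steps can be sketched as repeated regressions on perturbed data. The design, the synthetic responses, the noise levels (given in coded units for the inputs), and the function names below are all hypothetical; the presentation does not give the actual experimental table or simulation settings.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical nominal design in coded GS, P, CF (9 runs).
levels = [-1.0, 0.0, 1.0]
X = np.array([[gs, p, levels[(i + j) % 3]]
              for i, gs in enumerate(levels)
              for j, p in enumerate(levels)])

def design_matrix(x):
    """Build D with columns 1, GS, P, CF, GS^2."""
    gs, p, cf = x[:, 0], x[:, 1], x[:, 2]
    return np.column_stack([np.ones(len(x)), gs, p, cf, gs**2])

B_true = np.array([68.35, -1.34, -2.88, -8.95, 2.43])
R_nominal = design_matrix(X) @ B_true        # synthetic "measured" responses

def mc_coefficients(r_sd, x_sd, n_trials=5000):
    """Refit B = pinv(D) R on randomly perturbed R and/or D."""
    coefs = np.empty((n_trials, 5))
    for t in range(n_trials):
        R = R_nominal + rng.normal(0.0, r_sd, len(R_nominal))
        Xt = X + rng.normal(0.0, x_sd, X.shape)
        coefs[t] = np.linalg.pinv(design_matrix(Xt)) @ R
    return coefs.mean(axis=0), coefs.std(axis=0, ddof=1)

# Step 1: random R only (assumed 2% measurement standard deviation).
mean_R, sd_R = mc_coefficients(r_sd=2.0, x_sd=0.0)
# Step 2: random D only (assumed coded std. devs. for GS, P, CF).
mean_D, sd_D = mc_coefficients(r_sd=0.0, x_sd=[0.25, 0.5, 1 / 3])

print("random R: coefficient std. devs.", np.round(sd_R, 2))
print("random D: coefficient std. devs.", np.round(sd_D, 2))
```

With random R only, the simulated coefficient standard deviations should approach the analytic values from $[D^T D]^{-1}$ times the response variance; with random D there is no comparably simple closed form, which is where the simulation adds information.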
How Do Variances in Process Parameters Influence Model Coefficients? Simulation # 1 (1-0.25-1) Measured D45 means and standard deviations. P 19-23 kW ± 1 kW GS 36-38 kg ± 0.25 kg CF 12-18 ± 1 kN Compare to regression distributions Model coefficient means Model coefficient standard deviations
How Do Variances in Process Parameters Influence Model Coefficients?
[Figure; labels: Ran. Input, Regression, “Bias”; coefficients shown: Power (P), Water (GS), Force (CF), Water²]
Increase in process parameter variance causes a shift in some model coefficients.
Increase in process parameter variance increases model coefficient variance.
How Do Variances in Process Parameters Influence Model Coefficients?
[Figure panels: Simulation # 1 (1-0.25-1), Simulation # 2 (1-0.5-1), Simulation # 3 (1-1-1), Simulation # 4 (2-1-1)]
Increasing input parameter variance: increases variance in the model coefficients; can skew the model coefficient distribution; can shift model coefficient means.
Estimated Model Coefficient Uncertainties from Monte Carlo Simulation Std. Error Tabletting model. Table of conditions, coefficients and predicted vs observed R uncertainties. (N matters!)
Case Study #2: Nasal Spray Performance Models
A nasal spray product is a combination of a therapeutic formulation and a delivery device.
3-level, 4-factor Box-Behnken designs
Pfeiffer nasal spray pump
Placebo formulations (CMC & Tween 80 solutions)
References:
Changning Guo, Keith J. Stine, John F. Kauffman, William H. Doub. 2008. “Assessment of the influence factors on in vitro testing of nasal sprays using Box-Behnken experimental design.” European Journal of Pharmaceutical Sciences 35 (12) 417–426.
Changning Guo, Wei Ye, John F. Kauffman, William H. Doub. “Evaluation of Impaction Force of Nasal Sprays and Metered-Dose Inhalers Using the Texture Analyser.” Journal of Pharmaceutical Sciences. In press.
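For readers unfamiliar with the design type, a 3-level, 4-factor Box-Behnken design can be generated as below. This is a generic sketch of the standard construction: the number of center points (3 here) and the mapping of coded columns to the study's actual factors are assumptions, and `box_behnken` is an illustrative helper name.

```python
from itertools import combinations, product

def box_behnken(n_factors, n_center=3):
    """Coded Box-Behnken runs: each pair of factors at +/-1, the rest at 0."""
    runs = []
    for i, j in combinations(range(n_factors), 2):
        for a, b in product((-1, 1), repeat=2):
            run = [0] * n_factors
            run[i], run[j] = a, b
            runs.append(run)
    # Replicated center points (count is an assumption).
    runs.extend([[0] * n_factors for _ in range(n_center)])
    return runs

design = box_behnken(4)
print(len(design))   # 27 runs: 24 edge-midpoint runs + 3 center points
for run in design[:4]:
    print(run)
```

Every factor appears at three levels (-1, 0, +1), and no run sets all factors to their extremes at once.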
Response Variables: Parameters used to describe the shape of a nasal spray plume: spray pattern area, plume width. Plume geometry: measures the side view of a spray plume at its fully developed phase. Spray pattern: measures the cross-sectional uniformity of the spray.
Response Variables: Droplet Size Distribution (Volume Median Diameter, D50); Impaction Force
Nasal Spray Response Models Optimized regression models
Spray Pattern Model: Variances from input variables and spray pattern area measurements have a similar level of influence on the model coefficients.
Plume Width Model: Variance from plume width measurements has more influence on the model coefficients than variance from the input variables.
Droplet Size Model – D50 [Figure; coefficients: offset, V, C, VC, V²; series: Random R only, Random D only, Random R&D] Variance from input variables has more influence on the model coefficients than variance from the D50 measurements.
Impaction Force Model: Variances from input variables and impaction force measurements have a similar level of influence on the model coefficients.
How Do Variances in Formulation and Actuation Influence Model Coefficients?
The means of the model coefficients show good agreement between regression results and Monte Carlo simulations.
The standard deviations of the model coefficients obtained from regression are larger than those from Monte Carlo simulation.
The estimated standard deviations from regression may overestimate the uncertainties in the model coefficients.
Using regression-based coefficient standard deviations to define design space may result in a smaller range of input variable values that meet the desired confidence level.
Advantages of Monte Carlo Simulation 1. We make no assumptions concerning sources of uncertainty or variable covariance. 2. We see the distribution of output variable values, not just a standard deviation. 3. Sensitivity analysis allows us to prioritize high risk input variables and improve process control.