France Recent advances in Global Sensitivity Analysis techniques S. Kucherenko Imperial College London, UK
Outline
- Introduction of Global Sensitivity Analysis and Sobol’ Sensitivity Indices
- Why Quasi Monte Carlo methods (Sobol’ sequence sampling) are much more efficient than Monte Carlo (random sampling)
- Effective dimensions and their link with Sobol’ Sensitivity Indices
- Classification of functions based on global sensitivity indices
- Link between Sobol’ Sensitivity Indices and Derivative based Global Sensitivity Measures
- Quasi Random Sampling - High Dimensional Model Representation with polynomial approximation
- Application of parametric GSA for optimal experimental design
Propagation of uncertainty
[Figure: the input factors x_1, x_2, ..., x_k are propagated through the model to the output y.]
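Sensitivity Indices (SI)
Consider a model Y = f(x), where x = (x_1, ..., x_n) is a vector of input variables and Y is the model output.
ANOVA decomposition (HDMR): $f(x) = f_0 + \sum_i f_i(x_i) + \sum_{i<j} f_{ij}(x_i, x_j) + \dots + f_{12 \dots n}(x_1, \dots, x_n)$
Variance decomposition: $D = \sum_i D_i + \sum_{i<j} D_{ij} + \dots + D_{12 \dots n}$
Sobol’ SI: $S_{i_1 \dots i_s} = D_{i_1 \dots i_s} / D$, so that $\sum_i S_i + \sum_{i<j} S_{ij} + \dots + S_{12 \dots n} = 1$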
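Sobol’ Sensitivity Indices (SI)
Definition: $S_{i_1 \dots i_s} = D_{i_1 \dots i_s} / D$, where $D_{i_1 \dots i_s}$ are the partial variances and $D$ is the variance.
Calculating all of them requires $2^n$ integral evaluations.
Sensitivity indices for subsets of variables, x = (y, z): $S_y = D_y / D$
Introduction of the total variance: $D_y^{tot} = D - D_z$
Corresponding global sensitivity indices: $S_y^{tot} = D_y^{tot} / D$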
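How to use Sobol’ Sensitivity Indices?
$S_y^{tot} - S_y$ accounts for all interactions between y and z, x = (y, z).
The important indices in practice are $S_i$ and $S_i^{tot}$:
- $S_i^{tot} = 0$: f(x) does not depend on $x_i$;
- $S_i = 1$: f(x) depends only on $x_i$;
- $S_i = S_i^{tot}$ corresponds to the absence of interactions between $x_i$ and other variables.
If $\sum_i S_i = 1$ then the function has additive structure: $f(x) = f_0 + \sum_i f_i(x_i)$
Fixing unessential variables: if $S_z^{tot} \ll 1$, f(x) does not depend on z, so z can be fixed: $f(x) \approx f(y, z_0)$. Complexity reduction: from n to dim(y) variables.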
Evaluation of Sobol’ Sensitivity Indices
Straightforward use of the ANOVA decomposition requires $2^n$ integral evaluations, which is not practical.
There are efficient formulas for evaluation of Sobol’ Sensitivity Indices (Sobol’ 1990):
Evaluation is reduced to high-dimensional integration. The Monte Carlo method is the only way to deal with such problems.
Original vs. improved formulae for evaluation of Sobol’ Sensitivity Indices
Improved formula for Sobol’ Sensitivity Indices
Comparison of deterministic and Monte Carlo integration methods
Monte Carlo integration methods
How to improve MC?
Sobol’ sequences vs. random numbers and regular grid
Unlike random numbers, successive Sobol’ points "know" about the position of previously sampled points and fill the gaps between them.
Quasi random sequences
What is the optimal way to arrange N points in two dimensions?
[Figure panels: Regular Grid; Sobol’ Sequence]
Low dimensional projections of low discrepancy sequences are better distributed than higher dimensional projections.
Comparison between Sobol’ sequences and random numbers
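A minimal sketch of such a comparison, assuming SciPy's `scipy.stats.qmc.Sobol` generator with scrambling and a hypothetical smooth product integrand whose exact integral over the unit cube is 1. It is an illustration only and does not reproduce the figures of the talk.

```python
import numpy as np
from scipy.stats import qmc

def integrand(x):
    # Hypothetical test function: each factor integrates to 1 on [0, 1],
    # so the exact integral over the unit cube is 1.
    return np.prod(1.0 + 0.5 * (x - 0.5), axis=1)

n, m, runs = 5, 12, 20          # dimension, log2 of sample size, repetitions
N = 2 ** m
rng = np.random.default_rng(0)
mc_err, qmc_err = [], []
for k in range(runs):
    # Plain Monte Carlo with pseudo-random points
    mc_err.append(integrand(rng.random((N, n))).mean() - 1.0)
    # Quasi Monte Carlo with a scrambled Sobol' sequence
    sobol = qmc.Sobol(d=n, scramble=True, seed=k)
    qmc_err.append(integrand(sobol.random_base2(m)).mean() - 1.0)

mc_rmse = np.sqrt(np.mean(np.square(mc_err)))
qmc_rmse = np.sqrt(np.mean(np.square(qmc_err)))
print(f"N = {N}: MC RMSE = {mc_rmse:.2e}, QMC RMSE = {qmc_rmse:.2e}")
```

Repeating the run for several values of m and plotting the RMSE against N on a log-log scale shows the different convergence rates.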
Normally distributed Sobol’ sequences
[Figure panels: Normal probability plots; Histograms]
Uniformly distributed Sobol’ sequences can be transformed to any other distribution with a known distribution function.
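The transformation can be sketched with the inverse distribution function. The example assumes SciPy's `qmc.Sobol` generator and `norm.ppf` (the inverse normal CDF); the sample size and dimension are arbitrary choices.

```python
import numpy as np
from scipy.stats import norm, qmc

sobol = qmc.Sobol(d=2, scramble=True, seed=0)
u = sobol.random_base2(10)      # 1024 quasi-random points in the unit square
z = norm.ppf(u)                 # inverse CDF maps uniform points to N(0, 1)

print("sample mean:", z.mean(axis=0))
print("sample std: ", z.std(axis=0))
```

Any other distribution with a known inverse distribution function can be substituted for `norm.ppf`.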
Are QMC methods efficient for high dimensional problems?
"For high-dimensional problems (n > 12), QMC offers no practical advantage over Monte Carlo" (Bratley, Fox, and Niederreiter (1992)) ?!
Discrepancy I. Low Dimensions
Discrepancy II. High Dimensions
MC in high dimensions has smaller discrepancy
Is MC more efficient for high-dimensional problems than QMC?
Pros:
- MC in high dimensions has smaller discrepancy
- Some studies show degradation of the convergence rate of QMC methods in high dimensions to O(1/√N)
Cons:
- Huge success of QMC methods in finance: QMC methods were proven to be much more efficient than MC even for problems with thousands of variables
- Many tests showed superior performance of QMC methods for high-dimensional integration
Effective dimension
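The definitions on this slide were lost in extraction. For orientation, the definitions commonly used in the literature (Caflisch, Morokoff and Owen) are sketched below; the notation $S_u$ for the Sobol’ index of a subset of variables $u$ and the threshold $\varepsilon$ (often taken as 0.01) are assumptions and may differ from the slide.

```latex
% Effective dimension in the superposition sense:
% the smallest integer d_S such that
\sum_{0 < |u| \le d_S} S_u \;\ge\; 1 - \varepsilon

% Effective dimension in the truncation sense:
% the smallest integer d_T such that
\sum_{\emptyset \ne u \subseteq \{1, \dots, d_T\}} S_u \;\ge\; 1 - \varepsilon
```

Here $|u|$ is the number of variables in the subset $u$, so $d_S$ measures the highest interaction order that matters and $d_T$ the number of leading variables that matter.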
Approximation errors
For many problems only low order terms in the ANOVA decomposition are important.
Consider an approximation error
Theorem 1: Link between an approximation error and effective dimension in superposition sense
A set of variables can be regarded as not important if
Consider an approximation error
Theorem 2: Link between an approximation error and effective dimension in truncation sense
Classification of functions
Type A. Variables are not equally important
Type B, C. Variables are equally important
Type B. Dominant low order indices
Type C. Dominant higher order indices
Sensitivity indices for type A functions
Integration error vs. N. Type A
(a) $f(x) = \sum_{i=1}^{n} (-1)^i \prod_{j=1}^{i} x_j$, n = 360; (b) $f(x) = \prod_{i=1}^{s} |4x_i - 2| / (1 + a_i)$, n = 100
Sensitivity indices for type B functions. Dominant low order indices
Integration error vs. N. Type B. Dominant low order indices [Figure panels: (a), (b)]
Sensitivity indices for type C functions. Dominant higher order indices
The integration error vs. N. Type C. Dominant higher order indices [Figure panels: (a), (b)]
The Morris method
Model
Elementary Effect for the i-th input factor at a point X^0
The EE_i is still a local measure
Solution: take the average of several EE
r elementary effects EE_i^1, EE_i^2, …, EE_i^r are computed at X^1, …, X^r and then averaged.
Average of the EE_i's: μ(x_i)
Standard deviation of the EE_i's: σ(x_i)
A graphical representation of results
Factors can be screened on the (μ(x_i), σ(x_i)) plane
Implementation of the Morris method
A trajectory of the EE design
r trajectories of (k+1) sample points are generated, each providing one EE per input
Total cost = r (k + 1); r is in the range 4-10
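The trajectory design can be sketched as follows. This is a simplified illustration, not the original Morris design: base points are drawn uniformly rather than from a grid of levels, every step is +Δ, and the elementary effect is taken as the usual finite difference (y(x + Δ e_i) - y(x)) / Δ. The function name, the default Δ = 0.25 and the test model are hypothetical.

```python
import numpy as np

def morris_screening(f, k, r=10, delta=0.25, seed=0):
    """One-at-a-time trajectories on the unit cube: r trajectories of k + 1 points."""
    rng = np.random.default_rng(seed)
    ee = np.empty((r, k))               # ee[t, i]: elementary effect of input i
    n_evals = 0
    for t in range(r):
        x = rng.random(k) * (1.0 - delta)   # base point, so x + delta stays in [0, 1]
        y_prev = f(x)
        n_evals += 1
        for i in rng.permutation(k):        # move one input at a time, random order
            x_next = x.copy()
            x_next[i] += delta
            y_next = f(x_next)
            n_evals += 1
            ee[t, i] = (y_next - y_prev) / delta
            x, y_prev = x_next, y_next
    mu = ee.mean(axis=0)                    # average of the elementary effects
    mu_star = np.abs(ee).mean(axis=0)       # average of their absolute values
    sigma = ee.std(axis=0, ddof=1)          # standard deviation
    return mu, mu_star, sigma, n_evals

def model(x):
    # Hypothetical model: linear in x0, nonlinear in x1, independent of x2.
    return 2.0 * x[0] + x[1] ** 2

mu, mu_star, sigma, n_evals = morris_screening(model, k=3, r=10)
print("mu* =", mu_star, "sigma =", sigma, "evaluations =", n_evals)
```

A linear input shows up with σ close to zero, a nonlinear or interacting input with σ > 0, and an inactive input with μ* = 0; the evaluation count equals r (k + 1).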
A comparison with variance-based methods: μ*(x_i) is related to S_Ti
Test: the g-function of Sobol’ [Figure panels: a = 99, a = 9, a = 0.9]
μ*(x_i) and S_Ti give similar ranking
Problems: large Δ -> incorrect μ*(x_i)
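For reference, the g-function of Sobol’ has closed-form sensitivity indices, which makes it a convenient test case. The sketch below assumes the standard form g(x) = ∏ (|4x_i - 2| + a_i) / (1 + a_i) and uses the three values a = 0.9, 9, 99 from the slide as the coefficients of a hypothetical three-variable example.

```python
import numpy as np

def g_function(x, a):
    # Sobol' g-function on the unit cube; x has shape (..., n), a has shape (n,)
    return np.prod((np.abs(4.0 * x - 2.0) + a) / (1.0 + a), axis=-1)

def g_function_indices(a):
    """Analytic first-order (S) and total (ST) indices of the g-function."""
    a = np.asarray(a, dtype=float)
    V = 1.0 / (3.0 * (1.0 + a) ** 2)        # variance of each one-dimensional factor
    D = np.prod(1.0 + V) - 1.0              # total variance
    S = V / D
    ST = V * np.prod(1.0 + V) / (1.0 + V) / D   # V_i * prod_{j != i} (1 + V_j) / D
    return S, ST

a = np.array([0.9, 9.0, 99.0])              # assumed coefficients for illustration
S, ST = g_function_indices(a)
print("S  =", np.round(S, 4))
print("ST =", np.round(ST, 4))
```

Small a_i gives an important variable and large a_i an unimportant one, so the ranking by S_Ti follows the ordering of the a_i.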
Derivative based Global Sensitivity Measures
Morris measure in the limit Δ → 0
Sample X^1, …, X^r Sobol’ points; finite differences E_i^1, E_i^2, …, E_i^r are estimated at these points and then averaged.
Average of the E_i's: M*(x_i)
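A minimal sketch of this procedure, assuming forward finite differences with a small step h at scrambled Sobol’ points (SciPy's `qmc.Sobol`). The mean absolute derivative plays the role of M*(x_i); the mean squared derivative `nu` is an additional measure used in the DGSM literature and is included here as an assumption. Function names and the test model are hypothetical.

```python
import numpy as np
from scipy.stats import qmc

def dgsm(f, n, m=10, h=1e-6, seed=0):
    """Derivative based measures from N = 2**m Sobol' points, N * (n + 1) evaluations."""
    x = qmc.Sobol(d=n, scramble=True, seed=seed).random_base2(m)
    x = x * (1.0 - h)                   # keep x + h inside the unit cube
    f0 = f(x)
    m_star = np.empty(n)
    nu = np.empty(n)
    for i in range(n):
        xh = x.copy()
        xh[:, i] += h
        d = (f(xh) - f0) / h            # forward finite difference for df/dx_i
        m_star[i] = np.abs(d).mean()    # average absolute derivative
        nu[i] = np.square(d).mean()     # average squared derivative
    return m_star, nu

def model(x):
    # Hypothetical model: df/dx0 = 3, df/dx1 = 2 * x1, df/dx2 = 0
    return 3.0 * x[:, 0] + x[:, 1] ** 2

m_star, nu = dgsm(model, n=3)
print("M* =", np.round(m_star, 4))
print("nu =", np.round(nu, 4))
```

For this model the exact values are M* = (3, 1, 0) and nu = (9, 4/3, 0).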
The integration error vs. N. Type A. g-function of Sobol’ [Figure panels: (a), (b)]
Comparison of Sobol’ SI and Derivative based Global Sensitivity Measures [Figure panels: (a), (b), (c)]
There is a link between the derivative based measures and Sobol’ SI
1. Small values of the derivative based measures imply small values of the total sensitivity indices.
2. For highly nonlinear functions, the ranking based on global SI can be very different from that based on derivative based sensitivity measures.
Quasi Random Sampling HDMR
For many problems only low order terms in the ANOVA decomposition are important.
High Dimensional Model Representation (HDMR), Rabitz et al.:
It is assumed that the effective dimension in the superposition sense is d_s = 2.
Quasi Random Sampling HDMR is a metamodel.
Sobol’ SI:
Polynomial Approximation
Properties: orthonormal polynomial base
First few Legendre polynomials:
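The polynomials themselves were lost in extraction. A common choice in RS-HDMR, assumed here, is the shifted Legendre basis φ_r(x) = √(2r+1) P_r(2x - 1), which is orthonormal on [0, 1]. The sketch builds it with NumPy and checks orthonormality by Gauss-Legendre quadrature; the helper name `phi` is hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre

def phi(r, x):
    """Shifted orthonormal Legendre polynomial of degree r on [0, 1]."""
    coeffs = np.zeros(r + 1)
    coeffs[r] = 1.0                     # select the degree-r Legendre polynomial P_r
    return np.sqrt(2 * r + 1) * legendre.legval(2.0 * np.asarray(x) - 1.0, coeffs)

# Gauss-Legendre rule mapped from [-1, 1] to [0, 1]
nodes, weights = legendre.leggauss(8)
x = 0.5 * (nodes + 1.0)
w = 0.5 * weights

# Gram matrix of phi_1, phi_2, phi_3: should be the identity
gram = np.array([[np.sum(w * phi(p, x) * phi(q, x)) for q in range(1, 4)]
                 for p in range(1, 4)])
print(np.round(gram, 12))
```

In this basis φ_1(x) = √3 (2x - 1), and each φ_r with r ≥ 1 has zero mean on [0, 1].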
Global Sensitivity Analysis (HDMR)
The number of function evaluations is:
- N(n+2) for the original Sobol’ method
- N for sensitivity indices based on RS-HDMR
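A hedged sketch of why a single set of N evaluations suffices: with an orthonormal polynomial basis, the first-order index follows from the squared expansion coefficients, S_i ≈ Σ_r (α_r^i)² / D, and every coefficient is estimated from the same sample. The shifted Legendre basis, the sample-mean estimator of the coefficients and all function names below are assumptions for illustration.

```python
import numpy as np
from numpy.polynomial import legendre
from scipy.stats import qmc

def phi(r, x):
    # Shifted orthonormal Legendre polynomial of degree r on [0, 1]
    coeffs = np.zeros(r + 1)
    coeffs[r] = 1.0
    return np.sqrt(2 * r + 1) * legendre.legval(2.0 * x - 1.0, coeffs)

def hdmr_first_order(f, n, m=12, order=3, seed=0):
    """First-order indices from one sample of N = 2**m Sobol' points."""
    x = qmc.Sobol(d=n, scramble=True, seed=seed).random_base2(m)
    y = f(x)                            # the only N model evaluations
    yc = y - y.mean()                   # centre the output (removes f_0)
    D = y.var()
    S = np.empty(n)
    for i in range(n):
        # alpha_r^i = integral of f(x) * phi_r(x_i), estimated by the sample mean
        alpha = np.array([np.mean(yc * phi(r, x[:, i]))
                          for r in range(1, order + 1)])
        S[i] = np.sum(alpha ** 2) / D
    return S

def additive_model(x):
    # Hypothetical test model with exact indices (0.2, 0.8, 0)
    return x[:, 0] + 2.0 * x[:, 1]

S_hdmr = hdmr_first_order(additive_model, n=3)
print("S =", np.round(S_hdmr, 4))
```

The cost does not grow with the number of inputs n, in contrast to the N(n+2) evaluations of the direct Sobol’ approach.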
How to define the maximum polynomial order?
Homma-Saltelli function
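One way to look at this question is to watch how the estimated index changes as the polynomial order grows. The sketch assumes that the Homma-Saltelli function is the test function f(x) = sin x_1 + a sin² x_2 + b x_3⁴ sin x_1 on [-π, π]³ (often called the Ishigami function) with the common parameter choice a = 7, b = 0.1; neither the formula nor the parameters appear in the extracted slide. It expands the first-order term of x_2 in orthonormal Legendre polynomials by quadrature.

```python
import numpy as np
from numpy.polynomial import legendre

a, b = 7.0, 0.1     # assumed parameter values

def homma_saltelli(x):
    # x has shape (N, 3) with each input in [-pi, pi]
    return np.sin(x[:, 0]) + a * np.sin(x[:, 1]) ** 2 + b * x[:, 2] ** 4 * np.sin(x[:, 0])

# Analytic total variance and first-order index of x2
D = a ** 2 / 8 + b * np.pi ** 4 / 5 + b ** 2 * np.pi ** 8 / 18 + 0.5
S2_exact = (a ** 2 / 8) / D

# First-order ANOVA term of x2 on quadrature nodes t in [-1, 1], x2 = pi * t
t, w = legendre.leggauss(64)
f2 = a * np.sin(np.pi * t) ** 2 - a / 2

max_order = 12
basis = np.eye(max_order + 1)
alpha = np.array([
    # coefficient w.r.t. the orthonormal basis sqrt(2r + 1) P_r(t), density 1/2
    0.5 * np.sum(w * f2 * np.sqrt(2 * r + 1) * legendre.legval(t, basis[r]))
    for r in range(1, max_order + 1)
])
S2_by_order = np.cumsum(alpha ** 2) / D   # index captured up to each order

for r, s in enumerate(S2_by_order, start=1):
    print(f"order {r:2d}: S2 = {s:.4f}   (exact {S2_exact:.4f})")
```

The order at which the partial sums stop changing gives a practical choice of the maximum polynomial order for that input.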
RMSE for Homma-Saltelli function
Root mean square error:
QMC outperforms MC
RS-HDMR converges faster than the Sobol’ SI method
Sobol’ g-function with 2 important and 8 unimportant variables
QRS-HDMR converges faster
Values of S_i^tot can be inaccurate.
Sobol’ g-function. Function approximation
Error measure:
Computational costs
The QRS-HDMR method requires 10 to 10^3 times fewer model evaluations than the Sobol’ SI method!
Optimal experimental design (OED) for parameter estimation
Find the values of the experimentally manipulable variables (controls) and the time sampling strategy for a set of N_exp experiments that provide maximum information for the subsequent parameter estimation problem
subject to:
- System dynamics (ODEs, DAEs)
- Other algebraic constraints
- Upper and lower bounds:
Non-linear programming problem (NLP) with partial differential-algebraic (PDAEs) constraints
Case study: fed-batch reactor
Biomass:
Substrate:
Reaction rate:
Parameters to be estimated: p_1, p_2; … < p_1 < 0.98, 0.05 < p_2 < 0.98
Control variables: u_1, u_2
Dilution factor: 0.05 < u_1 < 0.5
Feed substrate concentration: 5 < u_2 < 50
OED traditional approach
Fisher Information Matrix (FIM) based criteria:
A criterion =
D criterion =
E criterion =
Modified-E criterion =
Main drawback: based on local SI, with non-realistic linear and local assumptions
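The expressions for the criteria were lost in extraction. The sketch below uses one common set of conventions, which is an assumption and may differ from the slide: A = trace(FIM⁻¹) (minimised), D = det(FIM) (maximised), E = smallest eigenvalue of FIM (maximised), modified-E = ratio of largest to smallest eigenvalue (minimised). The example matrix is hypothetical.

```python
import numpy as np

def fim_criteria(fim):
    """Scalar design criteria of a symmetric positive definite FIM."""
    eig = np.linalg.eigvalsh(fim)               # eigenvalues in ascending order
    return {
        "A": np.trace(np.linalg.inv(fim)),      # assumed convention: trace of the inverse
        "D": np.linalg.det(fim),                # determinant
        "E": eig[0],                            # smallest eigenvalue
        "modified_E": eig[-1] / eig[0],         # condition number
    }

# Hypothetical FIM for two parameters
fim = np.diag([4.0, 1.0])
crit = fim_criteria(fim)
print(crit)
```

All four numbers are computed from sensitivities evaluated at a single nominal parameter value, which is the local assumption the slide points to as the main drawback.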
Parametric GSA
Optimal experimental design: identification of a set of experiments with conditions that deliver measurement data that are the most sensitive to the unknown parameters
Application of Parametric GSA for parameter optimization
Main advantage: based on global SI, which allows a range of values to be considered for the parameters to be estimated
Objective function:
Application of a Global Optimization method
Optimal Experimental Design
Problem constraints:
- Experiment duration: 10 h
- Number of measurement times: 10
- Controls varied every 2 hours
Results: optimal input profile for u_1 and u_2:
Setting of the Parameter Estimation Problem
Steps to find p:
- Take experimental or generated pseudo-experimental points
- Maximum likelihood optimization
subject to:
- System dynamics (ODEs, DAEs)
- Other algebraic constraints
- Upper and lower bounds:
Non-linear programming problem (NLP) with partial differential-algebraic (PDAEs) constraints
Notation: p, the set of parameters to be estimated; the model prediction; the measurement variance; the experimental measurements
Results of parameter estimation
p_1 = 0.37 ± 0.02, p_2 = 0.72 ± 0.12
p_1 = 0.5 ± 0.05, p_2 = 0.5 ± 0.11
Summary
- Quasi MC methods based on Sobol’ sequences outperform MC
- The error introduced by fixing factors is bounded by the total sensitivity index of the fixed factors
- Functions can be classified according to their effective dimension
- The method of derivative based global sensitivity measures (DGSM) is more efficient than the Morris and the Sobol’ SI methods. There is a link between DGSM and Sobol’ SI
- Quasi Random Sampling - High Dimensional Model Representation with polynomial approximation can be orders of magnitude more efficient than Sobol’ SI for evaluation of main effects
- Application of global SI to OED results in a reduction of the required experimental work and increased accuracy of parameter estimation
Thank you for inviting me!
Acknowledgments
Prof. Sobol’
Imperial College London, UK: N. Shah, M. Rodríguez Fernández, B. Feil, W. Mauntz, C. Pantelides
Joint Research Centre, ISPRA, Italy: S. Tarantola, D. Gatelli, M. Ratto
Financial support: EPSRC Grant EP/D506743/1