Recent advances in Global Sensitivity Analysis techniques
S. Kucherenko, Imperial College London, UK
France, 2008

Outline
- Introduction to Global Sensitivity Analysis and Sobol' Sensitivity Indices
- Why Quasi Monte Carlo methods (Sobol' sequence sampling) are much more efficient than Monte Carlo (random sampling)
- Effective dimensions and their link with Sobol' Sensitivity Indices
- Classification of functions based on global sensitivity indices
- Link between Sobol' Sensitivity Indices and Derivative based Global Sensitivity Measures
- Quasi Random Sampling - High Dimensional Model Representation with polynomial approximation
- Application of parametric GSA for optimal experimental design

Propagation of uncertainty
Figure: input factors x_1, x_2, ..., x_k are propagated through the model to produce the output y.

Sensitivity Indices (SI)
Consider a model Y = f(x), where x = (x_1, ..., x_n) is a vector of input variables and Y is the model output.
ANOVA decomposition (HDMR): f(x) = f_0 + Σ_i f_i(x_i) + Σ_{i<j} f_ij(x_i, x_j) + ... + f_{12...n}(x_1, ..., x_n)
Variance decomposition: D = Σ_i D_i + Σ_{i<j} D_ij + ... + D_{12...n}
Sobol' SI: S_{i1...is} = D_{i1...is} / D

Sobol' Sensitivity Indices (SI)
Definition: S_{i1...is} = D_{i1...is} / D, where D_{i1...is} are the partial variances and D is the total variance.
Computing all indices requires 2^n integral evaluations.
Sensitivity indices for subsets of variables: for x = (y, z), S_y = D_y / D.
Introduction of the total variance D_y^tot = D - D_z and of the corresponding global sensitivity indices: S_y^tot = D_y^tot / D = 1 - S_z.

How to use Sobol' Sensitivity Indices?
S_y^tot - S_y accounts for all interactions between y and z, where x = (y, z). The important indices in practice are S_i and S_i^tot:
- S_i^tot = 0 means f does not depend on x_i;
- S_i = 1 means f depends only on x_i;
- S_i^tot = S_i corresponds to the absence of interactions between x_i and the other variables;
- if Σ_i S_i = 1, the function has an additive structure: f(x) = f_0 + Σ_i f_i(x_i).
Fixing unessential variables: if S_z^tot ≈ 0, f effectively does not depend on z, so z can be fixed, giving a complexity reduction from n to n - n_z variables.

Evaluation of Sobol' Sensitivity Indices
Straightforward use of the ANOVA decomposition requires 2^n integral evaluations, which is not practical!
There are efficient formulas for evaluation of Sobol' Sensitivity Indices (Sobol' 1990).
Evaluation is reduced to high-dimensional integration; the Monte Carlo method is the only practical way to deal with such problems.
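As an illustration of how such Monte Carlo formulas are used, here is a minimal sketch (using the standard pick-freeze estimators, not necessarily the exact improved formula shown on the next slides) of first-order and total index estimation for the Sobol' g-function; the function name, sample size and a_i coefficients are illustrative choices.

```python
import numpy as np

def sobol_indices_mc(f, n_dim, n_samples, seed=None):
    """Pick-freeze Monte Carlo estimates of first-order and total Sobol' indices.

    Two independent sample matrices A and B are drawn; column i of A is
    replaced by column i of B to form A_Bi (cost: N*(n+2) model runs).
    """
    rng = np.random.default_rng(seed)
    A = rng.random((n_samples, n_dim))
    B = rng.random((n_samples, n_dim))
    f_A, f_B = f(A), f(B)
    var = np.var(np.concatenate([f_A, f_B]))
    S, S_tot = np.empty(n_dim), np.empty(n_dim)
    for i in range(n_dim):
        A_Bi = A.copy()
        A_Bi[:, i] = B[:, i]
        f_ABi = f(A_Bi)
        S[i] = np.mean(f_B * (f_ABi - f_A)) / var           # first-order index
        S_tot[i] = 0.5 * np.mean((f_A - f_ABi) ** 2) / var  # total index (Jansen form)
    return S, S_tot

# Sobol' g-function: small a_i means an important variable
a = np.array([0.0, 0.5, 3.0, 9.0, 99.0])
g = lambda X: np.prod((np.abs(4 * X - 2) + a) / (1 + a), axis=1)

S, S_tot = sobol_indices_mc(g, n_dim=5, n_samples=2**14, seed=0)
print(np.round(S, 3), np.round(S_tot, 3))
```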

Original vs improved formulae for evaluation of Sobol' Sensitivity Indices

Improved formula for Sobol' Sensitivity Indices

Comparison of deterministic and Monte Carlo integration methods

Monte Carlo integration methods
The integral I[f] = ∫_{H^n} f(x) dx is estimated as I_N = (1/N) Σ_{i=1}^{N} f(x_i); the expected error decreases as O(1/√N), independently of the dimension n.

How to improve MC?
Replace random points with deterministic low discrepancy (quasi-random) sequences, i.e. Quasi Monte Carlo (QMC) sampling.

Sobol' sequences vs random numbers and a regular grid
Unlike random numbers, successive Sobol' points "know" about the positions of previously sampled points and fill the gaps between them.
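A small sketch of this point, assuming SciPy's `scipy.stats.qmc` module is available: the same integral is estimated with pseudo-random points and with a scrambled Sobol' sequence. The test integrand and sample sizes are arbitrary illustrative choices.

```python
import numpy as np
from scipy.stats import qmc

# Integrate f over [0,1]^d with plain MC and with a Sobol' sequence (QMC).
d, m = 10, 12                                        # dimension, 2**m points
f = lambda X: np.prod(np.abs(4 * X - 2), axis=1)     # exact integral = 1

rng = np.random.default_rng(0)
X_mc = rng.random((2**m, d))                                    # pseudo-random points
X_qmc = qmc.Sobol(d=d, scramble=True, seed=0).random_base2(m)   # Sobol' points

print("MC estimate :", f(X_mc).mean())
print("QMC estimate:", f(X_qmc).mean())   # typically closer to the exact value 1.0
```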

Quasi random sequences

What is the optimal way to arrange N points in two dimensions?
Figure panels: Regular Grid vs Sobol' Sequence.
Low dimensional projections of low discrepancy sequences are better distributed than higher dimensional projections.

Comparison between Sobol' sequences and random numbers

Normally distributed Sobol' sequences
Uniformly distributed Sobol' sequences can be transformed to any other distribution with a known distribution function (figure: histograms and normal probability plots of the transformed points).
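A minimal sketch of this inverse-CDF transformation, assuming SciPy is available: uniform Sobol' points are mapped to standard normal samples with `norm.ppf`.

```python
import numpy as np
from scipy.stats import norm, qmc

# Uniform Sobol' points mapped to N(0,1) via the inverse CDF (quantile) transform.
u = qmc.Sobol(d=2, scramble=True, seed=1).random_base2(10)   # uniform on [0,1)^2
z = norm.ppf(u)                                              # standard normal samples
print(z.mean(axis=0), z.std(axis=0))                         # approximately 0 and 1
```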

Are QMC methods efficient for high-dimensional problems?
"For high-dimensional problems (n > 12), QMC offers no practical advantage over Monte Carlo" (Bratley, Fox, and Niederreiter, 1992) ?!

Discrepancy I. Low dimensions
The star discrepancy D*_N measures the deviation of the empirical distribution of N points from the uniform distribution; low discrepancy sequences satisfy D*_N = O((ln N)^n / N).

Discrepancy II. High dimensions
For moderate N, MC points in high dimensions have smaller discrepancy.

Is MC more efficient for high-dimensional problems than QMC?
Pros:
- MC in high dimensions has smaller discrepancy
- some studies show degradation of the convergence rate of QMC methods in high dimensions to O(1/√N)
Cons:
- huge success of QMC methods in finance: QMC methods were proven to be much more efficient than MC even for problems with thousands of variables
- many tests showed superior performance of QMC methods for high-dimensional integration

Effective dimension
The effective dimension in the superposition sense, d_S, is the smallest integer such that the ANOVA terms of order up to d_S explain at least (1 - ε) of the total variance: Σ_{0 < |u| ≤ d_S} D_u ≥ (1 - ε) D.
The effective dimension in the truncation sense, d_T, is the smallest integer such that the terms involving only the first d_T variables explain at least (1 - ε) of the total variance: Σ_{u ⊆ {1,...,d_T}} D_u ≥ (1 - ε) D.

Approximation errors
For many problems only low order terms in the ANOVA decomposition are important.
A set of variables can be regarded as not important if its total sensitivity index is small.
Theorem 1: link between an approximation error and the effective dimension in the superposition sense.
Theorem 2: link between an approximation error and the effective dimension in the truncation sense.
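To make the superposition-sense definition concrete, here is a small sketch that computes d_S from a user-supplied table of partial variances; the helper name and the toy numbers are hypothetical.

```python
def effective_dimension_superposition(D_u, eps=0.01):
    """Smallest order d_s such that ANOVA terms of order <= d_s explain (1-eps) of the variance.

    D_u: dict mapping a tuple of variable indices (a subset u) to its partial variance D_u.
    A sketch assuming all non-negligible partial variances are supplied.
    """
    D = sum(D_u.values())
    max_order = max(len(u) for u in D_u)
    for d_s in range(1, max_order + 1):
        explained = sum(v for u, v in D_u.items() if len(u) <= d_s)
        if explained >= (1.0 - eps) * D:
            return d_s
    return max_order

# Example: a nearly additive 3-variable model
D_u = {(1,): 0.6, (2,): 0.3, (3,): 0.05, (1, 2): 0.04, (1, 2, 3): 0.01}
print(effective_dimension_superposition(D_u))   # -> 2
```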

Classification of functions
Type A: variables are not equally important.
Types B and C: variables are equally important.
Type B: dominant low order indices.
Type C: dominant higher order indices.

Sensitivity indices for type A functions

Integration error vs. N. Type A
(a) f(x) = Σ_{i=1}^{n} (-1)^i Π_{j=1}^{i} x_j, n = 360
(b) f(x) = Π_{i=1}^{n} |4x_i - 2| / (1 + a_i), n = 100

Sensitivity indices for type B functions: dominant low order indices

Integration error vs. N. Type B: dominant low order indices (panels (a) and (b))

Sensitivity indices for type C functions: dominant higher order indices

Integration error vs. N. Type C: dominant higher order indices (panels (a) and (b))

The Morris method
Model y = f(x_1, ..., x_k). Elementary Effect for the i-th input factor at a point X^0:
EE_i(X^0) = [ y(X^0_1, ..., X^0_i + Δ, ..., X^0_k) - y(X^0) ] / Δ

The EE_i is still a local measure. Solution: take the average of several elementary effects.
r elementary effects EE_i^1, EE_i^2, ..., EE_i^r are computed at points X^1, ..., X^r and then averaged:
- average of the EE_i's → μ(x_i)
- standard deviation of the EE_i's → σ(x_i)

A graphical representation of results: factors can be screened on the (μ(x_i), σ(x_i)) plane.

Implementation of the Morris method
r trajectories of (k + 1) sample points are generated, each providing one elementary effect per input (a trajectory of the EE design).
Total cost = r(k + 1) model evaluations; r is typically in the range 4-10.
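A minimal sketch of Morris-style screening, written as a simplified one-at-a-time (radial) variant with random base points rather than the structured levels design of the original method; the function names and test model are illustrative.

```python
import numpy as np

def morris_screening(f, k, r=8, delta=0.25, seed=None):
    """Simplified Morris screening: r one-at-a-time trajectories of k+1 points each.

    Returns mu_star (mean of |EE_i|) and sigma (std of EE_i) for every input.
    Cost: r*(k+1) model evaluations, as on the slide above.
    """
    rng = np.random.default_rng(seed)
    EE = np.empty((r, k))
    for t in range(r):
        x = rng.uniform(0.0, 1.0 - delta, size=k)    # base point of the trajectory
        y0 = f(x)
        for i in range(k):                           # perturb one factor at a time
            x_step = x.copy()
            x_step[i] += delta
            EE[t, i] = (f(x_step) - y0) / delta      # elementary effect of factor i
    mu_star = np.abs(EE).mean(axis=0)
    sigma = EE.std(axis=0)
    return mu_star, sigma

# Example: factors 1 and 2 matter, factor 3 does not
f = lambda x: x[0] ** 2 + 2.0 * x[1] + 0.001 * x[2]
print(morris_screening(f, k=3, seed=0))
```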

A comparison with variance-based methods: μ*(x_i) is related to S_Ti.
Test: the g-function of Sobol' (a = 0.9, 9, 99).
μ*(x_i) and S_Ti give similar rankings.
Problem: a large Δ can give an incorrect μ*(x_i).

Derivative based Global Sensitivity Measures
The Morris measure in the limit Δ → 0.
Sample X^1, ..., X^r Sobol' points, estimate the finite differences E_i^1, E_i^2, ..., E_i^r and then average them. Average of the E_i's → M*(x_i).
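A sketch of this construction, assuming SciPy's Sobol' sampler: finite-difference derivatives are evaluated at quasi-random points and averaged; here the squared derivative is averaged, giving the measure nu_i = E[(df/dx_i)^2] commonly used as a DGSM. The step size and test function are illustrative.

```python
import numpy as np
from scipy.stats import qmc

def dgsm(f, n_dim, m=10, h=1e-4, seed=0):
    """Derivative based global sensitivity measures from finite differences on Sobol' points.

    Estimates nu_i = E[(df/dx_i)^2] over [0,1]^n; assumes f accepts an (N, n_dim) array.
    """
    X = qmc.Sobol(d=n_dim, scramble=True, seed=seed).random_base2(m)
    X = np.clip(X, 0.0, 1.0 - h)              # keep the forward difference inside the cube
    f0 = f(X)
    nu = np.empty(n_dim)
    for i in range(n_dim):
        Xh = X.copy()
        Xh[:, i] += h
        dfi = (f(Xh) - f0) / h                 # finite-difference partial derivative
        nu[i] = np.mean(dfi ** 2)
    return nu

g = lambda X: np.sin(X[:, 0]) + 5.0 * X[:, 1] ** 2 + 0.01 * X[:, 2]
print(dgsm(g, n_dim=3))
```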

Integration error vs. N. Type A: g-function of Sobol' (panels (a) and (b))

Comparison of Sobol' SI and Derivative based Global Sensitivity Measures (panels (a), (b), (c))
There is a link between the Sobol' total sensitivity indices and the derivative based global sensitivity measures.

Comparison of Sobol' SI and Derivative based Global Sensitivity Measures
1. Small values of the derivative based measures imply small values of S_i^tot.
2. For highly nonlinear functions the ranking based on global SI can be very different from that based on derivative based sensitivity measures.

Quasi Random Sampling - HDMR
High Dimensional Model Representation (HDMR), Rabitz et al., is a metamodel. For many problems only low order terms in the ANOVA decomposition are important; it is assumed that the effective dimension in the superposition sense is d_s = 2, so
f(x) ≈ f_0 + Σ_i f_i(x_i) + Σ_{i<j} f_ij(x_i, x_j)
Sobol' SI: S_i = D_i / D with D_i = ∫ f_i^2(x_i) dx_i.

Polynomial approximation
The component functions are approximated in an orthonormal polynomial base.
First few Legendre polynomials: P_0(x) = 1, P_1(x) = x, P_2(x) = (3x^2 - 1)/2, P_3(x) = (5x^3 - 3x)/2.

Global Sensitivity Analysis (HDMR)
The number of function evaluations is N(n + 2) for the original Sobol' method, but only N for sensitivity indices based on RS-HDMR.
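A minimal sketch of the RS-HDMR idea under the assumptions above (first-order terms, fixed polynomial order 3): each component function is projected onto orthonormal shifted Legendre polynomials using plain QMC averages, so the whole set of first-order indices costs a single sample of N model runs. The estimates are approximate because the polynomial order is fixed; names and sizes are illustrative.

```python
import numpy as np
from scipy.stats import qmc

# Orthonormal shifted Legendre polynomials on [0,1] (orders 1..3)
phi = [
    lambda x: np.sqrt(3.0) * (2.0 * x - 1.0),
    lambda x: np.sqrt(5.0) * (6.0 * x**2 - 6.0 * x + 1.0),
    lambda x: np.sqrt(7.0) * (20.0 * x**3 - 30.0 * x**2 + 12.0 * x - 1.0),
]

def hdmr_first_order_indices(f, n_dim, m=12, seed=0):
    """Approximate first-order Sobol' indices from one quasi-random sample via polynomial HDMR."""
    X = qmc.Sobol(d=n_dim, scramble=True, seed=seed).random_base2(m)
    y = f(X)
    y_c = y - y.mean()
    D = y_c.var()
    S = np.empty(n_dim)
    for i in range(n_dim):
        # projection coefficients of f_i(x_i) onto the orthonormal base (QMC averages)
        alpha = np.array([np.mean(y_c * p(X[:, i])) for p in phi])
        S[i] = np.sum(alpha**2) / D          # D_i / D
    return S

a = np.array([0.0, 1.0, 9.0, 99.0])
g = lambda X: np.prod((np.abs(4 * X - 2) + a) / (1 + a), axis=1)
print(np.round(hdmr_first_order_indices(g, n_dim=4), 3))
```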

How to define the maximum polynomial order? Test: the Homma-Saltelli function.

RMSE for the Homma-Saltelli function
Root mean square error: QMC outperforms MC; RS-HDMR converges faster than the Sobol' SI method.

Sobol' g-function
g(x) = Π_{i=1}^{n} (|4x_i - 2| + a_i) / (1 + a_i), with 2 important and 8 unimportant variables.
QRS-HDMR converges faster; values of S_i^tot can be inaccurate.

Function approximation: Sobol' g-function, error measure vs. N.

Computational costs
The QRS-HDMR method requires 10 to 10^3 times fewer model evaluations than the Sobol' SI method!

Optimal experimental design (OED) for parameter estimation
Find the values of the experimentally manipulable variables (controls) and the time sampling strategy for a set of N_exp experiments which provide maximum information for the subsequent parameter estimation problem.
This is a non-linear programming problem (NLP) with partial differential-algebraic (PDAE) constraints, subject to:
- system dynamics (ODEs, DAEs)
- other algebraic constraints
- upper and lower bounds

Case study: fed-batch reactor
Biomass and substrate balances with a reaction rate expression.
Parameters to be estimated: p_1, p_2 with 0.05 < p_1 < 0.98, 0.05 < p_2 < 0.98.
Control variables: u_1, u_2.
Dilution factor: 0.05 < u_1 < 0.5. Feed substrate concentration: 5 < u_2 < 50.

OED: traditional approach
Fisher Information Matrix (FIM) based criteria:
- A criterion: trace of the FIM
- D criterion: determinant of the FIM
- E criterion: smallest eigenvalue of the FIM
- Modified-E criterion: condition number (largest over smallest eigenvalue) of the FIM
Main drawback: based on local SI, i.e. unrealistic linear and local assumptions.
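For concreteness, a small sketch of these criteria computed from a given FIM; sign and normalisation conventions vary in the literature, so this is one common reading rather than necessarily the exact definitions on the slide.

```python
import numpy as np

def oed_criteria(FIM):
    """Classical FIM-based design criteria (one common convention).

    Here larger is better for A, D and E; smaller is better for modified-E.
    """
    eigvals = np.linalg.eigvalsh(FIM)                 # FIM is symmetric positive (semi)definite
    return {
        "A": np.trace(FIM),                           # A-criterion: trace of the FIM
        "D": np.linalg.det(FIM),                      # D-criterion: determinant
        "E": eigvals.min(),                           # E-criterion: smallest eigenvalue
        "modified-E": eigvals.max() / eigvals.min(),  # condition number
    }

FIM = np.array([[4.0, 1.0], [1.0, 2.0]])
print(oed_criteria(FIM))
```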

Parametric GSA
Optimal experimental design: identification of a set of experiments with conditions that deliver the measurement data most sensitive to the unknown parameters.

Application of parametric GSA for parameter optimization
Main advantage: being based on global SI, it allows a range of values to be considered for the parameters to be estimated.
Objective function based on the global sensitivity indices; solved by application of a global optimization method.

Optimal experimental design
Problem constraints:
- experiment duration: 10 h
- number of measurement times: 10
- controls varied every 2 hours
Results: optimal input profiles for u_1 and u_2 (figure).

Setting of the parameter estimation problem
Steps to find p (the set of parameters to be estimated):
1. Take experimental or generated pseudo-experimental measurements.
2. Maximum likelihood optimization of the misfit between the model predictions and the experimental measurements, weighted by the measurement variances.
This is a non-linear programming problem (NLP) with partial differential-algebraic (PDAE) constraints, subject to:
- system dynamics (ODEs, DAEs)
- other algebraic constraints
- upper and lower bounds
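A minimal sketch of step 2 with a hypothetical algebraic model standing in for the fed-batch reactor ODEs: the weighted least-squares objective (Gaussian log-likelihood up to a constant) is minimised with `scipy.optimize.minimize` subject to the parameter bounds from the case study.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(p, t_meas, y_meas, sigma2, model):
    """Weighted least-squares objective (Gaussian maximum likelihood up to a constant)."""
    y_pred = model(p, t_meas)
    return 0.5 * np.sum((y_meas - y_pred) ** 2 / sigma2)

# Hypothetical toy model standing in for the reactor dynamics
model = lambda p, t: p[0] * (1.0 - np.exp(-p[1] * t))

t_meas = np.linspace(0.0, 10.0, 10)                      # 10 measurement times over 10 h
p_true = np.array([0.37, 0.72])
rng = np.random.default_rng(0)
sigma2 = 1e-4
y_meas = model(p_true, t_meas) + rng.normal(0.0, np.sqrt(sigma2), t_meas.size)

res = minimize(neg_log_likelihood, x0=[0.5, 0.5],
               args=(t_meas, y_meas, sigma2, model),
               bounds=[(0.05, 0.98), (0.05, 0.98)])       # bounds from the case study
print(res.x)   # estimated parameters
```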

Results of parameter estimation
p_1 = 0.37 ± 0.02, p_2 = 0.72 ± 0.12
p_1 = 0.5 ± 0.05, p_2 = 0.5 ± 0.11

Publications
- Hung W.Y., Kucherenko S., Samsatli N.J. and Shah N., The Proceedings of the 2003 Summer Computer Simulation Conference, Canada, Simulation Series, V 35, N3 (2003).
- Hung W.Y., Kucherenko S., Samsatli N.J. and Shah N., Journal of the Operational Research Society, 55 (2004).
- Sobol' I., Kucherenko S., Monte Carlo Methods and Simulation, 11, 1, 1-9 (2005).
- Sobol' I., Kucherenko S., Wilmott, 1, 56-61 (2005).
- Kucherenko S., Shah N., Wilmott, 4, 82-91 (2007).
- Sobol' I.M., Tarantola S., Gatelli D., Kucherenko S.S., Mauntz W., Reliability Engineering & System Safety, 92 (2007).
- Rodriguez-Fernandez M., Kucherenko S., Pantelides C., Shah N., Proc. ESCAPE17, V. Plesu and P.S. Agachi (Editors), p. 66-71 (2007).
- Kucherenko S., Mauntz W., submitted to Journal of Comp. Physics (2007).
- Kucherenko S., Fifth International Conference on Sensitivity Analysis of Model Output, Budapest (2007).
- Kucherenko S., Rodriguez-Fernandez M., Pantelides C., Shah N., submitted to Reliability Engineering & System Safety (2007).
- Gatelli D., Kucherenko S., Ratto M., Tarantola S., submitted to Reliability Engineering & System Safety (2007).
- Sobol' I.M., Kucherenko S., submitted to Journal of Comp. Physics (2008).
- Kiparissides A., Rodriguez-Fernandez M., Kucherenko S., Mantalaris A., Pistikopoulos E., Application of Global Sensitivity Analysis to Biological Models, submitted to ESCAPE18 (2008).

Summary
- Quasi MC methods based on Sobol' sequences outperform MC.
- The error generated by fixing factors is bounded by the total sensitivity index of the fixed factors.
- Functions can be classified according to their effective dimension.
- The method of derivative based global sensitivity measures (DGSM) is more efficient than the Morris and Sobol' SI methods. There is a link between DGSM and Sobol' SI.
- Quasi Random Sampling - High Dimensional Model Representation with polynomial approximation can be orders of magnitude more efficient than Sobol' SI for the evaluation of main effects.
- Application of global SI to OED results in a reduction of the required experimental work and increased accuracy of parameter estimation.

Thank you for inviting me!
Acknowledgments
Prof. Sobol'
Imperial College London, UK: N. Shah, M. Rodríguez Fernández, B. Feil, W. Mauntz, C. Pantelides
Joint Research Centre, ISPRA, Italy: S. Tarantola, D. Gatelli, M. Ratto
Financial support: EPSRC Grant EP/D506743/1