1 Andrea Saltelli, Jessica Cariboni and Francesca Campolongo European Commission, Joint Research Centre SAMO 2007 Budapest Accelerating factors screening.

Presentation transcript:

1 Andrea Saltelli, Jessica Cariboni and Francesca Campolongo European Commission, Joint Research Centre SAMO 2007 Budapest Accelerating factors screening

2 Sensitivity analysis at the Joint Research Centre of Ispra
1. Sensitivity analysis web at JRC (software, tutorials, ...)
2. New book on SA with exercises for students, at Wiley for review. Please flag errors!
3. Summer school in 2008, date to be decided

3 Where do we stand in terms of good practices for global SA? Screening: Morris – Campolongo – EE ( ). Quantitative: Sobol’, plus several investigators,

4 Screening: Morris – Campolongo – EE ( ). Good but not so efficient. Quantitative: Sobol’, Saltelli ( ). Efficient for S_i (Mara’ + Tarantola [scrambled FAST], Ratto + Young [SDR] + proximities [Marco’s presentation of yesterday]). Not so efficient for S_Ti (Saltelli 2002).

5 Where to start? From the best available practice in screening: the method of Elementary Effects (Morris 1991; Max Morris, Department of Statistics, Iowa State University). The EE method can be seen as an extension of a derivative-based analysis.

6 The method of Elementary Effects: the model, and the Elementary Effect for the i-th input factor at a point X^0
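
The formulas for this slide can be sketched in the usual Morris (1991) form. The notation is an assumption rather than a copy of the slide: y is the model output, k the number of input factors, Δ the step on the sampling grid, and e_i the i-th unit vector.

```latex
y = f(x_1, \ldots, x_k), \qquad
EE_i(\mathbf{X}^0) = \frac{f(\mathbf{X}^0 + \Delta\,\mathbf{e}_i) - f(\mathbf{X}^0)}{\Delta}
```

Only factor i is moved between the two model evaluations, which is why a single EE_i is a local measure tied to the base point.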

7 Using the EE method: the EE_i is still a local measure. Solution: take the average of several EEs. r elementary effects EE_i^1, EE_i^2, …, EE_i^r are computed at X_1, …, X_r and then averaged. Average of the EE_i’s → μ(x_i). Standard deviation of the EE_i’s → σ(x_i). Factors can be screened on the (μ(x_i), σ(x_i)) plane.
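
A minimal sketch of the two screening statistics, assuming the r elementary effects of one factor are already available as a list of numbers. The function name `ee_statistics` is hypothetical, and the choice of the sample standard deviation (divisor r - 1) is an assumption, since the slide does not say which estimator is used.

```python
from statistics import mean, stdev


def ee_statistics(effects):
    """Return (mu, sigma) for the elementary effects of a single factor.

    effects: list of r elementary effects EE_i^1 ... EE_i^r.
    sigma uses the sample standard deviation (an assumption).
    """
    mu = mean(effects)
    # stdev needs at least two values; one effect carries no spread information
    sigma = stdev(effects) if len(effects) > 1 else 0.0
    return mu, sigma


mu, sigma = ee_statistics([1.0, 2.0, 3.0])
```

A factor with a large μ has a strong overall influence, while a large σ suggests that its effect depends on where in the input space it is measured (non-linearity or interactions).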

8 A graphical representation of results

9 Using the EE method. Each input varies across p possible values (levels, usually quantiles) within its range of variation. For x_i ~ U(0,1) and p = 4, the levels are p_1 = 0, p_2 = 1/3, p_3 = 2/3, p_4 = 1. The optimal choice for Δ is Δ = p / [2(p - 1)]. Sampling the levels uniformly. [Figure: grid in 2D, both axes marked at 0, 1/3, 2/3, 1]
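
The grid arithmetic can be checked with a short sketch using exact fractions. The function names are hypothetical. The restriction of base values to levels from which a step of +Δ stays inside [0, 1] is an assumption borrowed from the usual Morris design, not something stated on the slide.

```python
from fractions import Fraction


def grid_levels(p):
    """The p equally spaced levels 0, 1/(p-1), ..., 1."""
    return [Fraction(j, p - 1) for j in range(p)]


def optimal_delta(p):
    """Step size Delta = p / (2 (p - 1)), as given on the slide."""
    return Fraction(p, 2 * (p - 1))


def admissible_starts(p):
    """Levels from which a step of +Delta remains inside [0, 1] (assumed rule)."""
    delta = optimal_delta(p)
    return [x for x in grid_levels(p) if x + delta <= 1]


levels = grid_levels(4)
delta = optimal_delta(4)
starts = admissible_starts(4)
```

For p = 4 this gives Δ = 2/3, so a step always jumps two levels, for example from 0 to 2/3 or from 1/3 to 1.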

10 Improving the EE (Campolongo et al., … ): - Taking the modulus of the elementary effects in μ(x_i), which gives μ*(x_i), instead of using the couple μ(x_i) and σ(x_i) - Maximizing the spread of the trajectories in the input space - Application to groups of factors. [Figure: points A, B, C and A’, B’, C’ in the (x_1, x_2) plane]
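
A small sketch of why the modulus helps, assuming a factor whose elementary effects change sign across the input space. The function name `mu_and_mu_star` and the sample values are illustrative only.

```python
from statistics import mean


def mu_and_mu_star(effects):
    """Return (mu, mu_star): mean of the effects and mean of their absolute values."""
    mu = mean(effects)
    mu_star = mean([abs(e) for e in effects])
    return mu, mu_star


# Effects of equal size but opposite sign cancel in mu, not in mu_star.
mu, mu_star = mu_and_mu_star([2.0, -2.0, 2.0, -2.0])
```

Here μ is 0 while μ* is 2, so ranking on μ alone would wrongly flag the factor as unimportant; μ* summarizes its influence in a single number.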

11 A comparison with variance-based methods: is μ*(x_i) related to either S_i or S_Ti? Empirical evidence: the g-function of Sobol’, for which S_Ti is available analytically. [Figure: panels for a = 99, a = 9, a = 0.9]
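
For reference, the g-function and its analytic indices in their commonly quoted form; the slide itself does not reproduce them, so the notation below (V_i for the partial variance of factor i, V for the total variance) is an assumption.

```latex
g(x_1, \ldots, x_k) = \prod_{i=1}^{k} \frac{|4x_i - 2| + a_i}{1 + a_i},
\qquad x_i \sim U(0,1), \quad a_i \ge 0

V_i = \frac{1}{3\,(1 + a_i)^2}, \qquad
V = \prod_{j=1}^{k} (1 + V_j) - 1

S_i = \frac{V_i}{V}, \qquad
S_{Ti} = \frac{V_i \prod_{j \ne i} (1 + V_j)}{V}
```

A small a_i makes factor i important and a large a_i makes it nearly inert, which is why values such as a = 0.9, 9 and 99 span the range from influential to negligible.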

12 Empirical evidence: the g-function. A comparison with variance-based methods: μ*(x_i) is a good proxy for S_Ti.
Factor   a(i)
x1       …
x2       89.9
x3       5.54
x4       42.1
x5       0.78
x6       1.26
x7       0.04
x8       0.79
x9       …
x10      …
x11      …
x12      …

13 Implementing the EE method. The original implementation estimates r EEs per input: r trajectories of (k + 1) sample points are generated, each providing one EE per input. Total cost = r (k + 1); r is in the range … Each trajectory gives k elementary effects at the cost of (k + 1) simulations. Efficiency = k/(k + 1) ~ 1. [Figure: a trajectory of the EE design]
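
The trajectory design can be sketched in a few lines of plain Python. This is a simplified illustration, not the authors' implementation: every step is taken in the +Δ direction from a randomly chosen admissible level, whereas the original design also allows decreasing steps. The function names and the `seed` parameter are hypothetical.

```python
import random


def trajectory(k, p, rng):
    """One trajectory of k + 1 points; each step moves exactly one factor by +Delta."""
    delta = p / (2 * (p - 1))
    levels = [j / (p - 1) for j in range(p)]
    # keep only base levels from which +Delta stays inside [0, 1]
    starts = [x for x in levels if x + delta <= 1 + 1e-12]
    point = [rng.choice(starts) for _ in range(k)]
    order = list(range(k))
    rng.shuffle(order)  # random order in which the factors are moved
    points = [point[:]]
    for i in order:
        point[i] += delta
        points.append(point[:])
    return points, order


def elementary_effects(model, k, r, p=4, seed=0):
    """Return (effects, runs): r effects per factor and the number of model runs."""
    rng = random.Random(seed)
    delta = p / (2 * (p - 1))
    effects = [[] for _ in range(k)]
    runs = 0
    for _ in range(r):
        points, order = trajectory(k, p, rng)
        ys = [model(x) for x in points]
        runs += len(points)
        for step, i in enumerate(order):
            effects[i].append((ys[step + 1] - ys[step]) / delta)
    return effects, runs


effects, runs = elementary_effects(lambda x: 2 * x[0] + 3 * x[1] - x[2], k=3, r=5)
```

For a linear model every elementary effect equals the corresponding coefficient, and the run count is r (k + 1), here 5 × 4 = 20 runs for 15 effects.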

14 Conclusion: the EE is a useful method. Is its efficiency k/(k+1) ~ 1 good? We can compare it with the Saltelli 2002 method for calculating the first-order and total-order sensitivity indices:

15 Saltelli 2002. One of these, plus … one of these, plus … k of these. With: … One can compute all first and total effects for k factors.

16 Saltelli 2002. One of these, k of these. Total: N(k + 2) runs, to obtain N·2·k elementary effects (for S_i or S_Ti). Efficiency = 2k/(k + 2) ~ 2. Better than the EE method.
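
The two efficiencies quoted so far can be compared directly. The sketch below only restates the slide arithmetic (effects obtained per model run); the function names are hypothetical.

```python
from fractions import Fraction


def ee_efficiency(k):
    """EE design: k effects from each trajectory of k + 1 runs."""
    return Fraction(k, k + 1)


def saltelli_efficiency(k):
    """Saltelli 2002 design: 2 N k effects from N (k + 2) runs; N cancels out."""
    return Fraction(2 * k, k + 2)


for k in (6, 15, 98):
    print(k, float(ee_efficiency(k)), float(saltelli_efficiency(k)))
```

As k grows the EE design approaches one effect per run, while the Saltelli 2002 design approaches two.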

17 Conclusion: the efficiency of EE might have scope for improvement. The better efficiency of the global method (Saltelli 2002) compared with the screening method (EE) is due to the fact that two effects (one of the first order and one of the total order) are computed from each row of A_i. Can we do the same with EE?

18 Saltelli 2002: from … to … is one step in the non-X_i direction (everything moves but X_i)

19 Saltelli 2002: from … to … is one step in the X_i direction (X_i moves and X_~i does not)

20 How about alternating steps along the X_i axes with steps along the X_~i axes, also for an EE-like screening method? How can we combine steps along the X_i axes with steps along the X_~i axes?

21 Beyond Elementary Effects Method. Can we efficiently generate exploration trajectories in the hyperspace of the input factors where steps in the X_i and X_~i directions are nicely arranged, e.g. in a square?

22 Beyond Elementary Effects Method

23 Our thesis is that (1) both |y_1 - y_3| and |y_2 - y_4| tell me about the first-order effect of X_1

24 … and that: (2) ||y_1 - y_4| - |y_1 - y_2||, ||y_2 - y_3| - |y_2 - y_1||, ||y_3 - y_2| - |y_3 - y_4||, ||y_4 - y_1| - |y_4 - y_3|| all tell me about the total-order effect of a factor.
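
The six quantities of the thesis can be computed from the four outputs at the corners of one square. This sketch follows the run numbering used in the slides (runs 1 and 3, and runs 2 and 4, differ only in X_1; runs 1 and 2, and runs 3 and 4, differ in everything but X_1). The function name `square_effects` is hypothetical, and no scaling by the step size is applied, as on the slides.

```python
def square_effects(y1, y2, y3, y4):
    """Return (first_order, total_order) measures from one closed square of runs."""
    # steps along X_1 only
    first_order = [abs(y1 - y3), abs(y2 - y4)]
    # diagonal step compared with an adjacent side, once per corner
    total_order = [
        abs(abs(y1 - y4) - abs(y1 - y2)),
        abs(abs(y2 - y3) - abs(y2 - y1)),
        abs(abs(y3 - y2) - abs(y3 - y4)),
        abs(abs(y4 - y1) - abs(y4 - y3)),
    ]
    return first_order, total_order


# Additive toy case: moving X_1 adds 2, moving the other factors adds 3.
first, total = square_effects(0, 3, 2, 5)
```

In this additive toy case with contributions of the same sign, all six numbers are equal, as expected when a factor has no interactions.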

25 Before trying to substantiate our thesis, we take a look at how these squares could be built efficiently. Four runs, six factors.

26 Four runs, six factors, six steps along the X_~i directions. We call these four runs ‘base runs’.

27 Base runs and clones. For each step in the X_~i direction we add two in the X_i direction.

28 Base runs and clones. Let’s count: Run 3 is a step away from run 1 in the X_1 direction. Run 4 is a step away from run 2 in the X_1 direction. Run 2 was already a step away from run 1 in the X_~1 direction. Run 4 is also a step away from run 3 in the X_~1 direction … the square is closed.

29 Beyond Elementary Effects Method

30 Base runs and clones. Let us do some more counting. We have 4 base runs, 16 runs in total, six factors and four effects per factor. Efficiency = 24/16 = 3/2.

31 For 6 base runs, we have 15 factors, 36 runs in total, again four effects per factor. Efficiency = 60/36 ≈ 1.67, tending to 2 for an increasing number of factors … It would be nice to stop here! … but let us go back to the 6 factors example.
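
The counting in the two examples can be generalized under an assumption inferred from them: each pair of base runs defines one factor (4 base runs give 6 factors, 6 give 15), and each such pair adds two clone runs. The function name and the general formula are a reading of the slides, not something they state.

```python
from fractions import Fraction
from math import comb


def square_scheme_counts(n_base):
    """Return (factors, runs, effects, efficiency) for n_base base runs."""
    k = comb(n_base, 2)      # one factor per pair of base runs (inferred)
    clones = 2 * k           # two X_i steps added per X_~i step
    runs = n_base + clones   # equals n_base ** 2
    effects = 4 * k          # four effects per factor
    return k, runs, effects, Fraction(effects, runs)


for n in (4, 6, 20):
    print(n, square_scheme_counts(n))
```

Under this reading the efficiency is 2 (n - 1) / n for n base runs, which reproduces 3/2 and 60/36 and approaches 2 as the number of factors grows.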

32 There are many more effects hidden in the scheme: e.g. three more effects for run 16. Most of these effects are of the X_~i type. The number of extra terms is between 2k and 4k.

33 The number of extra terms grows with k. Some of these need only one more point to close a square. Most of these need two extra points to close a square.

34 Let us forget about the additional terms for the moment and let us try screening …

35 Numerical Experiment: g-function where
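
A minimal sketch of the test function, assuming the commonly used form of Sobol’ g-function, g(x) = ∏ (|4 x_i - 2| + a_i) / (1 + a_i) with x_i in [0, 1]; the formula on the slide is not reproduced in the transcript, so this form and the function name are assumptions.

```python
from math import prod


def g_function(x, a):
    """Sobol' g-function in its usual form (assumed); x and a have equal length."""
    if len(x) != len(a):
        raise ValueError("x and a must have the same length")
    return prod((abs(4 * xi - 2) + ai) / (1 + ai) for xi, ai in zip(x, a))


# A small a_i makes a factor influential, a large a_i makes it nearly inert.
a = [0.01, 99]
base = g_function([0.5, 0.5], a)
change_x1 = abs(g_function([1.0, 0.5], a) - base)
change_x2 = abs(g_function([0.5, 1.0], a) - base)
```

Moving the factor with a = 0.01 changes the output far more than moving the factor with a = 99, which is the behaviour a screening method should detect.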

36 Results: g-function (180 runs). [Figure: results by factor, labelled a_1, a_2, …, a_14, …]

37 Number of runs: EE(2007) = 25; EE = 22. K = 10, a = (0.01, 0.02, 0.015, 99, 78, 57, 89, 97, 96, 87)

38 Test function: Book (2007). The last two Z’s and the last two omegas are the most important factors.

39 Number of runs: new method= 64; old method =58

40 g function 25 replicas of EE1(2007)

41 g function 25 replicas of EE2(2007)

42 g function 25 replicas of EE

43 book function 25 replicas of EE2(2007)

44 book function 25 replicas of EE1(2007)

45 book function 25 replicas of EE

46 What next? Good for S_i, S_Ti?

47 [Figure: S_i couples and S_Ti couples marked on the design]

48 Try to exploit this design to improve the Saltelli 2002 method for S_Ti. [Figure: S_i couples and S_Ti couples marked on the design]

49 The number of extra terms grows with k. Some of these need only one more point to close a square. Most of these need two extra points to close a square (closed squares give 4 effects, 2 S_i and 2 S_Ti).

50 Conclusions. The new scheme (aka il matricione) shows promise for EE and S_Ti. Work on the algorithms is needed to make a sizeable difference with respect to best available practices …

51 il matricione