Simulators and Emulators
Tony O'Hagan, University of Sheffield
Southampton workshop, July 2009



Slide 2: Computer models
- In almost all fields of science, technology, industry and policy making, people use mechanistic models to describe complex real-world processes
  - For understanding, prediction, control
- There is a growing realisation of the importance of uncertainty in model predictions
  - Can we trust them?
  - Without any quantification of output uncertainty, it's easy to dismiss them

Slide 3: Examples
- Climate prediction
- Molecular dynamics
- Nuclear waste disposal
- Oil fields
- Engineering design
- Hydrology

Slide 4: Sources of uncertainty
- A computer model takes inputs x and produces outputs y = f(x)
- How might y differ from the true real-world value z that the model is supposed to predict?
  - Error in inputs x
    - Initial values, forcing inputs, model parameters
  - Error in model structure or solution
    - Wrong, inaccurate or incomplete science
    - Bugs, solution errors

Slide 5: Quantifying uncertainty
- The ideal is to provide a probability distribution p(z) for the true real-world value
  - The centre of the distribution is a best estimate
  - Its spread shows how much uncertainty about z is induced by the uncertainties on the previous slide
- How do we get this?
  - Input uncertainty: characterise p(x), propagate through to p(y)
  - Structural uncertainty: characterise p(z - y)
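As a toy numerical illustration of how the two sources combine (not from the talk; the stand-in simulator, the input distribution and the discrepancy scale are all invented for the sketch):

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Hypothetical stand-in simulator: any deterministic function of the input
    return 2.0 * x + 1.0

# Input uncertainty: characterise p(x), then propagate through f to get p(y)
x = rng.normal(loc=3.0, scale=0.5, size=100_000)
y = f(x)

# Structural uncertainty: characterise p(z - y), here a zero-mean discrepancy
z = y + rng.normal(loc=0.0, scale=0.3, size=100_000)

print(z.mean())  # centre of p(z): the best estimate, near f(3) = 7
print(z.std())   # spread: combines input and structural uncertainty
```

The two variances simply add here because the discrepancy is independent of the inputs; real structural error is rarely that convenient.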

Slide 6: Example: UK carbon flux in 2000
- Vegetation model predicts carbon exchange from each of 700 pixels over England & Wales in 2000
  - Principal output is Net Biosphere Production
- Accounting for uncertainty in inputs
  - Soil properties
  - Properties of different types of vegetation
  - Land usage
  - (Not structural uncertainty)
- Aggregated to England & Wales total, allowing for correlations
  - Estimate 7.46 Mt C
  - Std deviation 0.54 Mt C

Slide 7: Maps
(Figure: maps of mean NBP and its standard deviation)

Slide 8: Sensitivity analysis
- Map shows the proportion of overall uncertainty in each pixel that is due to uncertainty in the vegetation parameters
  - As opposed to soil parameters
- Contribution of vegetation uncertainty is largest in grasslands/moorlands
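The "proportion of uncertainty due to one group of inputs" is a variance-based sensitivity measure, Var(E[y|v]) / Var(y). A minimal sketch with an invented two-input toy model (the pick-and-freeze covariance estimator is a standard device; everything else here is made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

def model(v, s):
    # Invented toy model: v plays the role of vegetation inputs, s of soil inputs
    return 3.0 * v + 1.0 * s

v = rng.normal(0.0, 1.0, n)
s = rng.normal(0.0, 1.0, n)
y = model(v, s)

# First-order index for v, Var(E[y|v]) / Var(y), via pick-and-freeze:
# keep v fixed, resample s, and take the covariance of the two output sets
y2 = model(v, rng.normal(0.0, 1.0, n))
S_v = np.cov(y, y2)[0, 1] / y.var()

print(S_v)  # analytically 3^2 / (3^2 + 1^2) = 0.9
```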

Slide 9: England & Wales aggregate
(Table: plug-in estimate (Mt C), mean (Mt C) and variance (Mt C^2) for each plant functional type (Grass, Crop, Deciduous, Evergreen), plus covariances and the total; the numeric values were not preserved in the transcript)

Slide 10: Reducing uncertainty
- To reduce uncertainty, get more information!
- Informal: more/better science
  - Tighten p(x) through improved understanding
  - Tighten p(z - y) through improved modelling or programming
- Formal: using real-world data
  - Calibration: learn about model parameters
  - Data assimilation: learn about the state variables
  - Learn about structural error z - y
  - Validation
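Calibration can be sketched as Bayesian learning about an uncertain model parameter from field observations. Everything here (the one-parameter simulator, the synthetic data, the crude importance-weighting scheme) is invented for illustration; a serious calibration would also model the structural error z - y:

```python
import numpy as np

rng = np.random.default_rng(2)

def f(theta, t):
    # Invented one-parameter simulator
    return theta * t

# Synthetic "real world": data generated with theta = 2 plus measurement noise
t_obs = np.array([1.0, 2.0, 3.0, 4.0])
z_obs = f(2.0, t_obs) + rng.normal(0.0, 0.1, t_obs.size)

# Prior sample for theta, weighted by the Gaussian measurement likelihood
theta = rng.normal(1.0, 1.0, 50_000)
log_lik = (-0.5 * ((z_obs - theta[:, None] * t_obs) / 0.1) ** 2).sum(axis=1)
w = np.exp(log_lik - log_lik.max())
post_mean = np.average(theta, weights=w)

print(post_mean)  # pulled from the prior mean 1.0 towards the true value 2.0
```

Note that every weight evaluation costs one simulator run per observation, which is why the next slide's run-count problem bites.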

Slide 11: So far, so good, but...
- In principle, all this is straightforward
- In practice, there are many technical difficulties
  - Formulating uncertainty on inputs
    - Elicitation of expert judgements
  - Propagating input uncertainty
  - Modelling structural error
  - Anything involving observational data!
    - The last two are intricately linked
  - And computation

Slide 12: The problem of big models
- Tasks like uncertainty propagation and calibration require us to run the model many times
- Uncertainty propagation
  - Implicitly, we need to run f(x) at all possible x
  - Monte Carlo works by taking a sample of x from p(x)
  - Typically needs thousands of model runs
- Calibration
  - Traditionally this is done by searching the x space for good fits to the data
- Both become impractical if the model takes more than a few seconds to run
- We need a more efficient technique
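To see the run-count problem concretely, here is plain Monte Carlo propagation with a call counter on a stand-in simulator (the function and input distribution are invented; imagine each call taking minutes or hours):

```python
import numpy as np

rng = np.random.default_rng(3)
calls = 0

def slow_simulator(x):
    # Stand-in for an expensive code: one counted call per evaluation
    global calls
    calls += 1
    return np.sin(x) + 0.1 * x ** 2

# Plain Monte Carlo: one full simulator run per input sample
xs = rng.normal(1.0, 0.2, size=5_000)
ys = np.array([slow_simulator(x) for x in xs])

print(calls)                 # 5000 runs for a single uncertainty analysis
print(ys.mean(), ys.std())   # the propagated distribution p(y)
```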

Slide 13: Gaussian process representation
- More efficient approach
  - First work in early 1980s (DACE)
- Represent the code as an unknown function
  - f(.) becomes a random process
  - We generally represent it as a Gaussian process (GP)
    - Or its second-order moment representation
- Training runs
  - Run model for a sample of x values
  - Condition GP on observed data
  - Typically requires many fewer runs than Monte Carlo
    - And x values don't need to be chosen randomly

Slide 14: Emulation
- Analysis is completed by prior distributions for, and posterior estimation of, hyperparameters
- The posterior distribution is known as an emulator of the computer code
- Posterior mean estimates what the code would produce for any untried x (prediction)
  - With uncertainty about that prediction given by the posterior variance
- Correctly reproduces the training data
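A minimal emulator sketch: a zero-mean GP with squared-exponential covariance and fixed hyperparameters, conditioned on a handful of training runs. (The full treatment described above would put priors on the hyperparameters and estimate them; the toy simulator and the hyperparameter values here are invented.)

```python
import numpy as np

def emulate(x_train, y_train, x_test, length=0.5, var=1.0, jitter=1e-9):
    # Posterior mean and s.d. of a zero-mean GP with squared-exponential
    # covariance, conditioned on the training runs
    def k(a, b):
        return var * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)
    K = k(x_train, x_train) + jitter * np.eye(x_train.size)
    Ks = k(x_test, x_train)
    mean = Ks @ np.linalg.solve(K, y_train)
    cov = k(x_test, x_test) - Ks @ np.linalg.solve(K, Ks.T)
    return mean, np.sqrt(np.clip(np.diag(cov), 0.0, None))

f = np.sin                                  # cheap stand-in simulator
x_train = np.array([0.0, 1.0, 2.0, 3.0])    # four training runs
mean, sd = emulate(x_train, f(x_train), np.array([1.0, 1.5]))

print(mean[0], sd[0])  # at a training input: reproduces f(1) with near-zero s.d.
print(mean[1], sd[1])  # between runs: a prediction with genuine uncertainty
```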

Slide 15: Code runs
(Figure: emulator fitted to a few code runs)
- Consider one input and one output
- Emulator estimate interpolates data
- Emulator uncertainty grows between data points

Slide 16: Code runs
(Figure: emulator after adding another code run)
- Adding another point changes estimate and reduces uncertainty

Slide 17: Code runs
(Figure: emulator after further code runs)
- And so on
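The shrinking uncertainty can be checked directly: for a GP, the posterior predictive standard deviation depends only on where the code has been run, not on the outputs, so adding a run at a point drives the uncertainty there to zero. A sketch with a unit-variance squared-exponential covariance and invented design points:

```python
import numpy as np

def gp_sd(x_train, x_star, length=0.5):
    # Posterior predictive s.d. at x_star for a unit-variance
    # squared-exponential GP; it depends only on the design, not the outputs
    k = lambda a, b: np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)
    K = k(x_train, x_train) + 1e-9 * np.eye(x_train.size)
    ks = k(np.array([x_star]), x_train)
    var = 1.0 - ks @ np.linalg.solve(K, ks.T)
    return float(np.sqrt(max(var[0, 0], 0.0)))

before = gp_sd(np.array([0.0, 1.0, 2.0]), 1.5)        # midway between two runs
after = gp_sd(np.array([0.0, 1.0, 1.5, 2.0]), 1.5)    # after adding a run at 1.5

print(before, after)  # the new run drives the uncertainty there to ~zero
```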

Slide 18: Then what?
- Given enough training data points we can in principle emulate any model accurately
  - So that posterior variance is small "everywhere"
  - Typically, this can be done with orders of magnitude fewer model runs than traditional methods
    - At least in relatively low-dimensional problems
- Use the emulator to make inference about other things of interest
  - E.g. uncertainty analysis, calibration
- Conceptually very straightforward in the Bayesian framework
  - But of course can be computationally hard
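For instance, uncertainty analysis can push the large input sample through the cheap emulator rather than the expensive simulator. A sketch using the posterior mean of a fixed-hyperparameter GP built from just eight runs of an invented stand-in simulator:

```python
import numpy as np

rng = np.random.default_rng(4)

def simulator(x):
    # Invented cheap stand-in for an expensive code
    return np.sin(3.0 * x) + x

# Build the emulator from only 8 simulator runs
x_train = np.linspace(0.0, 2.0, 8)
y_train = simulator(x_train)
length = 0.4
k = lambda a, b: np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)
alpha = np.linalg.solve(k(x_train, x_train) + 1e-9 * np.eye(8), y_train)
emulator_mean = lambda x: k(x, x_train) @ alpha   # GP posterior mean

# Uncertainty analysis: 100,000 input samples, zero extra simulator runs
x_samples = rng.normal(1.0, 0.2, 100_000)
y_emulated = emulator_mean(x_samples)

print(y_emulated.mean(), y_emulated.std())  # cf. 100,000 runs for direct MC
```

A full analysis would also carry the emulator's posterior variance through, so that the code-uncertainty contributes to the answer rather than being ignored.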

Slide 19: BACCO
- This has led to a wide-ranging body of tools for inference about all kinds of uncertainties in computer models
- All based on building the emulator of the model from a set of training runs
- This area is now known as BACCO
  - Bayesian Analysis of Computer Code Output
- MUCM's objective is to develop BACCO methods into a robust technology that is widely applicable across the spectrum of modelling applications

Slide 20: BACCO includes
- Uncertainty analysis
- Sensitivity analysis
- Calibration
- Data assimilation
- Model validation
- Optimisation
- Etc.
- All within a single coherent framework

Slide 21: MUCM
- Managing Uncertainty in Complex Models
- Large 4-year research grant, June 2006 to September 2010
- Postdoctoral research associates and 4 project PhD students
- Based in Sheffield, Durham, Aston, Southampton, LSE
- MUCM2: New directions for MUCM
  - Smaller 2-year grant to September 2012
  - Scoping and developing research proposals

Slide 22: MUCM workpackages
- Theme 1: High Dimensionality
  - WP1.1: Screening
  - WP1.2: Sparsity and projection
  - WP1.3: Multiscale models
- Theme 2: Using Observational Data
  - WP2.1: Linking models to reality
  - WP2.2: Diagnostics and validation
  - WP2.3: Calibration and data assimilation
- Theme 3: Realising the Potential
  - WP3.1: Experimental design
  - WP3.2: Toolkit
  - WP3.3: Case studies

Slide 23: Primary deliverables
- Methodology and papers moving the technology forward
  - Particularly in Themes 1 and 2
  - Papers both in statistics and application-area journals
- The toolkit
  - Wiki based
  - Documentation of the methods and how to use them
  - With emphasis on what is found to work reliably across a range of modelling areas
- Case studies
  - Three substantial and detailed case studies
  - Showcasing methods and best practice
  - Linked to toolkit
- Workshops
  - Both conceptual and hands-on

Slide 24: Today
- Jeremy Oakley presents our first Case Study
  - Epidemiological model
- Dan Cornford introduces you to the toolkit
  - With live demo!
- Peter Challenor and Ian Vernon tell you about two more substantial applications
  - Rapid climate change
  - Modelling the universe!