Bayesian tools for analysing and reducing uncertainty
Tony O'Hagan, University of Sheffield

Or …

Uncertainty, Complexity and Predictive Reliability of (environmental/biological) process models

Summary
- Uncertainty
- Complexity
- Predictive Reliability

Uncertainty is everywhere …
- Internal parameters
- Initial conditions
- Forcing inputs
- Model structure
- Observational error
- Code uncertainty

Uncertainty (2)
All sources of uncertainty must be recognised and quantified. Otherwise we don't know:
- how good model predictions are
- how to use data

Tasks involving uncertainty
Whether or not we have data:
- Sensitivity analysis
- Uncertainty analysis
Interacting with observational data:
- Calibration
- Data assimilation
- Discrepancy estimation
- Validation

Complexity
This is already a big task. It is massively exacerbated by model complexity:
- High dimensionality
- Long model run times
But there are powerful statistical tools available.

It's a big task
Quantifying uncertainty is often difficult:
- Unfamiliar task
- Need for expert statistical skills: statistical modelling, elicitation
It deserves to be recognised as a task of comparable status to developing the model. And EMS is all about respecting each other's expertise.

Computational complexity
- All the tasks involving uncertainty can be computed by simple (MC)MC methods if the model runs quickly enough (see the sketch below)
- Otherwise emulation is needed: it requires orders of magnitude fewer model runs
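
As a minimal sketch of the simple Monte Carlo route, assuming a hypothetical fast toy simulator and an elicited Gaussian input distribution (neither is from the talk):

```python
import numpy as np

rng = np.random.default_rng(42)

def simulator(x):
    # Hypothetical stand-in for a fast computer model: one input, one output
    return np.sin(3 * x) + 0.5 * x**2

# Input uncertainty, e.g. elicited from experts: x ~ N(0.5, 0.3^2)
x_samples = rng.normal(loc=0.5, scale=0.3, size=10_000)

# Uncertainty analysis: propagate input uncertainty through the model,
# one run per sample -- only feasible when the simulator is cheap
y_samples = simulator(x_samples)

print("mean output:", y_samples.mean())
print("output sd:", y_samples.std(ddof=1))
print("95% interval:", np.percentile(y_samples, [2.5, 97.5]))
```

When a single run takes hours, the 10,000 runs above are out of reach; that is exactly the gap the emulator fills.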

Emulation
- A computer model encodes a function that takes inputs and produces outputs
- An emulator is a statistical approximation of that function
- NOT just an approximation: it estimates what outputs would be obtained from given inputs, with a statistically valid measure of uncertainty

Emulators
- Multiple regression models: do not make valid uncertainty statements
- Neural networks: can make valid uncertainty statements, but complex
- Data-based mechanistic models: do not make valid uncertainty statements
- Gaussian processes

GPs
Gaussian process emulators:
- are nonparametric: they make no assumptions other than smoothness
- estimate the code accurately, with small uncertainty, and run instantly
So we can do uncertainty-based tasks fast and efficiently. Conceptually, we use model runs to learn about the function, then derive any desired properties of the model (see the sketch below).
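
A minimal from-scratch sketch of what that conditioning looks like in one dimension, assuming a squared-exponential covariance with hand-picked hyperparameters and a zero prior mean (a real emulator would estimate these and include a regression mean function); the simulator is again a hypothetical stand-in:

```python
import numpy as np

def sq_exp_cov(a, b, variance=1.0, length_scale=0.4):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_emulate(x_design, y_design, x_new, jitter=1e-10):
    """Condition a zero-mean GP on the design runs; return posterior
    mean and standard deviation at new inputs."""
    K = sq_exp_cov(x_design, x_design) + jitter * np.eye(len(x_design))
    K_star = sq_exp_cov(x_new, x_design)
    alpha = np.linalg.solve(K, y_design)
    mean = K_star @ alpha
    v = np.linalg.solve(K, K_star.T)
    var = sq_exp_cov(x_new, x_new).diagonal() - np.sum(K_star * v.T, axis=1)
    return mean, np.sqrt(np.maximum(var, 0.0))

# A few runs of a (hypothetical) slow simulator
x_design = np.array([0.1, 0.5, 0.9])
y_design = np.sin(6 * x_design)          # pretend each run took hours

x_new = np.linspace(0, 1, 101)
mean, sd = gp_emulate(x_design, y_design, x_new)
# mean interpolates the design runs; sd is zero there and grows between them
```

The next few slides illustrate exactly this behaviour as design points are added.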

2 code runs
Consider one input and one output. The emulator estimate interpolates the data.

2 code runs
Emulator uncertainty grows between data points.

3 code runs
Adding another point changes the estimate and reduces uncertainty.

5 code runs
And so on.

Smoothness
- It is the basic assumption of a (homogeneously) smooth, continuous function that gives the GP its computational advantages
- The actual degree of smoothness concerns how rapidly the function wiggles: a rough function responds strongly to quite small changes in inputs
- We need many more data points to emulate a rough function accurately over a given range

Effect of smoothness
Smoothness determines how fast the uncertainty increases between data points.

Estimating smoothness
- We can estimate the smoothness from the data
- This is obviously a key Gaussian process parameter to estimate, but tricky: we need a robust estimate
- Validate by predicting left-out data points (see the sketch below)
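
As a sketch of one common approach (my choice of tooling, not the MUCM group's own software, and a hypothetical toy simulator): estimate the length-scale, i.e. the smoothness parameter, by maximising the marginal likelihood with scikit-learn, then check it by leave-one-out prediction:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)

def simulator(x):
    # Hypothetical fast stand-in for the real code
    return np.sin(6 * x) + 0.3 * x

X = rng.uniform(0, 1, size=(12, 1))      # design runs
y = simulator(X[:, 0])

kernel = ConstantKernel(1.0) * RBF(length_scale=0.3)

# Fitting maximises the marginal likelihood over the length-scale --
# one common, if sometimes fragile, estimator of smoothness
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=5).fit(X, y)
print("estimated kernel:", gp.kernel_)

# Leave-one-out validation: refit without each point, predict it back
for i in range(len(X)):
    mask = np.arange(len(X)) != i
    gp_i = GaussianProcessRegressor(kernel=kernel,
                                    n_restarts_optimizer=5).fit(X[mask], y[mask])
    m, s = gp_i.predict(X[i:i+1], return_std=True)
    z = (y[i] - m[0]) / s[0]
    print(f"run {i}: standardised error {z:+.2f}")  # most should lie in (-2, 2)
```

Many standardised errors outside roughly (-2, 2) suggest the smoothness estimate, and hence the emulator's uncertainty statements, cannot be trusted.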

Code uncertainty
- Emulation, like MC, is just a computational device, but a highly efficient one!
- Like MC, quantities of interest are computed subject to error: statistically quantifiable and validatable, and reducible if we can do more model runs
- This is code uncertainty (the MC analogy is sketched below)
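
To make the MC analogy concrete, a toy sketch (hypothetical model, not from the talk) showing that the error in an MC-computed quantity of interest is quantifiable and shrinks with more runs, just as code uncertainty shrinks with more emulator training runs:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulator(x):
    # Hypothetical fast model, as before
    return np.sin(3 * x) + 0.5 * x**2

# The MC estimate of the mean output carries its own computational error:
# the standard error, which halves each time the number of runs quadruples
for n in (100, 400, 1600):
    y = simulator(rng.normal(0.5, 0.3, size=n))
    print(f"n={n:4d}: mean = {y.mean():.3f} +/- {y.std(ddof=1) / np.sqrt(n):.3f}")
```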

And finally …
Predictive Reliability

What can we do with observational data?
- Model validation: check observations against predictive distributions based on current knowledge
- Calibration: learn about values of uncertain model parameters (possibly including model structure)
- Data assimilation: for dynamic models, learn about the current value of the state vector
- Model correction: learn about the model discrepancy function
- Do all of these (in one coherent Bayesian system; a calibration sketch follows below)
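
As a minimal sketch of calibration with discrepancy acknowledged: a toy grid posterior in which the simulator, the data and the variance budget are all hypothetical, and the discrepancy is crudely folded into the noise variance rather than given a GP prior, as the full Bayesian treatment (Kennedy and O'Hagan, 2001) would do:

```python
import numpy as np

rng = np.random.default_rng(2)

def simulator(x, theta):
    # Hypothetical fast model: control input x, uncertain parameter theta
    return theta * np.sin(4 * x)

# Synthetic field data: true process = model at theta = 1.3 plus a small
# structural discrepancy, observed with noise
x_obs = np.linspace(0.0, 1.0, 10)
y_obs = 1.3 * np.sin(4 * x_obs) + 0.1 * x_obs + rng.normal(0, 0.05, x_obs.size)

# Grid posterior for theta under y = simulator(x, theta) + delta(x) + error.
# Here delta is absorbed into an inflated variance budget; ignoring it
# entirely tends to produce overconfident, biased parameter estimates.
thetas = np.linspace(0.5, 2.0, 301)
sigma2 = 0.05**2 + 0.05**2               # obs noise + crude discrepancy budget
log_lik = np.array([-0.5 * np.sum((y_obs - simulator(x_obs, t))**2) / sigma2
                    for t in thetas])
post = np.exp(log_lik - log_lik.max())   # flat prior on theta
dtheta = thetas[1] - thetas[0]
post /= post.sum() * dtheta

print("posterior mean of theta:", np.sum(thetas * post) * dtheta)
```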

Doing it all
It's crucial to model uncertainties carefully:
- to avoid using data twice
- to apportion observation error between parameters, state vector and model discrepancy
- to get appropriate learning about all of these
Data assimilation alone is useful only for short-term prediction.

This is challenging
- We (Sheffield and Durham) have developed theory and serious case studies
- Growing practical experience
- But still lots to do, both theoretically and practically: each new model poses new challenges
- Our science is as exciting and challenging as any other

Sorry …
- We are not yet at the stage where implementation is routine: very limited software, and most publications are in the statistics literature
- But we're working on it
- And we're very willing to interact with modellers/users in any discipline, particularly if you have resources!

Who we are
Sheffield: Tony O'Hagan, Marc Kennedy, Stefano Conti, Jeremy Oakley
Durham: Michael Goldstein, Peter Craig, Jonathan Rougier, Alan Seheult