Presentation transcript:

The Estimation of the Net CO2 Flux for England & Wales and its Uncertainty using Emulation
Marc Kennedy, Tony O’Hagan, Clive Anderson, Mark Lomas, John Paul Gosling and Ian Woodward (University of Sheffield); Andreas Heinemeyer (University of York)

Carbon flux
Carbon dioxide (CO2) is one of the principal greenhouse gases driving global warming.
To what extent can vegetation reduce the quantity of CO2 going into the atmosphere? Is it a source or a sink?
Kyoto agreement signatories are required to account for carbon (C) emissions each year.
How can this be estimated? Inventories; models.

Computer models
In almost all fields of science, technology, industry and policy making, people use mechanistic models to describe complex real-world processes, for understanding, prediction and control.
There is growing realisation of the importance of uncertainty in model predictions: can we trust them?
Without any quantification of output uncertainty, it is easy to dismiss them.

Uncertainty analysis
Consider just one source of uncertainty. We have a computer model that produces output y = f(x) when given input x, but for a particular application we do not know x precisely.
So X is a random variable, and therefore so is Y = f(X).
We are interested in the uncertainty distribution of Y. How can we compute it?
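In symbols (standard notation, not taken from the slides): if p(x) denotes the probability density describing input uncertainty, the quantities of interest are integrals of the simulator against that density.

```latex
Y = f(X), \qquad
\mathbb{E}[Y] = \int f(x)\, p(x)\, \mathrm{d}x, \qquad
\operatorname{Var}[Y] = \int \bigl(f(x) - \mathbb{E}[Y]\bigr)^2 p(x)\, \mathrm{d}x .
```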

Monte Carlo
The usual approach is Monte Carlo: sample values xi of x from its distribution, and run the model for each to produce sample values yi = f(xi). These are a sample from the uncertainty distribution of Y.
Neat, but impractical if it takes minutes or hours to run the model, because we can then only make a small number of runs. A minimal sketch of the approach follows.
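A minimal sketch of plain Monte Carlo uncertainty analysis, assuming a toy stand-in simulator f and an illustrative normal input distribution (both hypothetical, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(42)

def f(x):
    """Toy stand-in for an expensive simulator (hypothetical)."""
    return np.sin(3 * x) + 0.5 * x ** 2

# Input uncertainty: X ~ N(0.5, 0.2^2) (an assumed, illustrative distribution)
x_sample = rng.normal(loc=0.5, scale=0.2, size=10_000)

# One simulator run per sampled input -- this is the expensive step
y_sample = f(x_sample)

print("E[Y]  ~", y_sample.mean())
print("sd[Y] ~", y_sample.std(ddof=1))
```

With a simulator that takes minutes or hours per run, the 10,000 evaluations above are infeasible, which is exactly the motivation for emulation.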

Emulation
A computer model encodes a function that takes inputs and produces outputs.
An emulator is a statistical approximation of that function: it estimates what outputs would be obtained from given inputs, with a statistical measure of estimation error.
Given enough training data, the estimation error variance can be made small.

So what?
A good emulator estimates the model output accurately, with small uncertainty, and runs “instantly”. So we can do uncertainty analysis and related calculations quickly and efficiently.
Conceptually, we use model runs to train the emulator, then derive any desired properties of the model from the emulator itself.

Gaussian process
We use Gaussian process (GP) emulation. It is nonparametric, so it can fit any function; its error measures can be validated; and it is analytically tractable, so uncertainty analysis and related calculations can often be done analytically. It is highly efficient for up to 100 inputs.
The method uses Bayesian theory. Formally, the posterior distribution of the function is a GP, and this posterior distribution is the emulator. A sketch of the idea is given below.
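A minimal illustration of the idea (not the authors' actual implementation), using scikit-learn's GaussianProcessRegressor as the emulator and a hypothetical toy simulator. The emulator is trained on a handful of runs and then stands in for the simulator in the uncertainty analysis:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)

def f(x):
    """Toy stand-in for an expensive simulator (hypothetical)."""
    return np.sin(3 * x) + 0.5 * x ** 2

# Small design of training runs -- the only expensive simulator calls
x_train = np.linspace(0.0, 1.5, 8).reshape(-1, 1)
y_train = f(x_train).ravel()

# GP emulator: posterior mean = estimate, posterior sd = estimation error
kernel = ConstantKernel(1.0) * RBF(length_scale=0.3)
emulator = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
emulator.fit(x_train, y_train)

# Emulator-based uncertainty analysis: propagate X ~ N(0.5, 0.2^2)
x_sample = rng.normal(0.5, 0.2, size=10_000).reshape(-1, 1)
mean, sd = emulator.predict(x_sample, return_std=True)

print("E[Y]  ~", mean.mean())                 # uncertainty-analysis mean
print("sd[Y] ~", mean.std(ddof=1))            # input-driven spread
print("typical emulator error ~", sd.mean())  # code (emulation) uncertainty
```

In the full BACCO treatment the uncertainty-analysis integrals are handled analytically against the GP posterior rather than by sampling through the emulator, but the sketch shows the basic division of labour: a few expensive runs, then cheap emulator evaluations.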

BACCO
This has led to a wide-ranging body of tools for inference about all kinds of uncertainties in computer models, all based on building the GP emulator of the model from a set of training runs.
This area is known as BACCO: Bayesian Analysis of Computer Code Output.
It includes not just uncertainty analysis but also sensitivity analysis, calibration, data assimilation, validation, optimisation and more.

CTCD and MUCM
Centre for Terrestrial Carbon Dynamics (CTCD), http://ctcd.group.shef.ac.uk. Mission: to understand C fluxes from vegetation.
Managing Uncertainty in Complex Models (MUCM), http://mucm.group.shef.ac.uk. Mission: to develop robust and widely applicable BACCO methods.

The England & Wales carbon flux in 2000
A recent application of these methods. A dynamic vegetation model (SDGVMd) predicts carbon sequestration and release from vegetation and soils: NBP (net biosphere production) and GPP (gross primary production).
Over 700 pixels across E&W, with 4 plant functional types (PFTs) modelled separately: deciduous broadleaf (DcBl), evergreen needleleaf (EvNl), C3 grasses and crops.

SDGVMd outputs for 2000: plug-in maps.

Outline of analysis
Build emulators for each PFT at a sample of sites.
Identify the most important inputs.
Define distributions to describe uncertainty in the important inputs: analysis of soils data; elicitation of uncertainty in PFT parameters (need to consider correlations).

Carry out uncertainty analysis at each sampled site.
Interpolate across all sites: mean corrections and standard deviations.
Aggregate across sites and PFTs, allowing for correlations (see the note below).
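Why correlations matter in the aggregation step: the E&W total is a weighted sum of site/PFT contributions, and its variance picks up all the covariance terms (notation mine, not from the slides).

```latex
T = \sum_i w_i Y_i, \qquad
\mathbb{E}[T] = \sum_i w_i\, \mathbb{E}[Y_i], \qquad
\operatorname{Var}[T] = \sum_i \sum_j w_i w_j \operatorname{Cov}(Y_i, Y_j).
```

Ignoring the off-diagonal covariances, which here are induced by shared uncertainty about the PFT parameters, would understate the aggregate variance.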

Sensitivity analysis for one pixel/PFT: main effects at site (54.417, -0.75).
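For reference, the main effect of an input plotted in such an analysis is, in the usual variance-based formulation (my notation, not from the slide), the expected output as a function of that input alone; the associated first-order index gives the share of output variance attributable to that input.

```latex
z_i(x_i) = \mathbb{E}\bigl[f(X) \mid X_i = x_i\bigr] - \mathbb{E}\bigl[f(X)\bigr], \qquad
S_i = \frac{\operatorname{Var}_{X_i}\!\bigl(\mathbb{E}[f(X)\mid X_i]\bigr)}{\operatorname{Var}\bigl(f(X)\bigr)} .
```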

Elicitation
Beliefs of the expert (the developer of SDGVMd) regarding plausible values of the PFT parameters were elicited.
It was important to allow for uncertainty about the mix of species in a pixel and for the role of each parameter in the model.
In the case of leaf life span for evergreens, this was more complex.

EvNl leaf life span

Correlations
A PFT parameter in one pixel may differ from that in another, because of variation in the species mix.
Common uncertainty about the average over all species induces correlation, so beliefs about the average over the whole UK were elicited.
The EvNl joint distributions are mixtures of 25 components, with correlation both between and within years.

Mean NBP corrections
Interpolated maps of E(NBP) – plug-in estimates, assuming 100% coverage of each PFT.

NBP standard deviations
Standard deviations of NBP, assuming 100% coverage of each PFT.

Land cover map (from LCM2000).

Aggregate across 4 PFTs
Corrected map of NBP, with standard deviation, after weighted aggregation over the different PFTs.

Sensitivity analysis
The map shows the proportion of overall uncertainty in each pixel that is due to uncertainty in the PFT parameters, as opposed to the soil parameters.
The contribution of PFT uncertainty is largest in grasslands/moorlands.

Aggregate over England & Wales

PFT           Plug-in estimate (Mt C)   Mean (Mt C)   Variance (Mt C²)
Grass               5.279                  4.639           0.269
Crop                0.853                  0.445           0.034
Deciduous           2.132                  1.683           0.013
Evergreen           0.798                  0.781           0.001
Covariances
Total               9.061                  7.548           0.321

Breakdown of the variance in total E&W NBP, and comparison with plug-in totals.

Conclusions
BACCO methods offer a powerful basis for computing uncertainties in model predictions.
The analysis of E&W aggregate NBP in 2000 was a good case study for uncertainty and sensitivity analyses, and involved several technical extensions.
It has important implications for our understanding of C fluxes, and policy implications.