Eawag: Swiss Federal Institute of Aquatic Science and Technology Mechanism-Based Emulation of Dynamic Simulation Models – Concept and Application in Hydrology.


Eawag: Swiss Federal Institute of Aquatic Science and Technology Mechanism-Based Emulation of Dynamic Simulation Models – Concept and Application in Hydrology Peter Reichert Eawag Dübendorf and ETH Zürich Switzerland

Data-driven and physically-based models, IMS, Singapore, Jan

Contents
- Motivation
- Concept of Emulators
- General Concept
- Gaussian Process Emulator
- Dynamic Emulator
- Implementation
- Application
- Discussion and Outlook

Motivation

Problem
- Many important systems-analytical techniques, such as optimization, sensitivity analysis, and statistical inference (e.g. Bayesian inference using MCMC), require a large number of model evaluations.
- Many environmental simulation models are computationally demanding.
- Model-based analysis of environmental systems is often limited by computational requirements.

Solution Strategies
1. Improve the efficiency of the implementation of environmental simulation models.
2. Improve the efficiency of the implementation of systems-analytical techniques.
3. Replace the simulation model by a simplified statistical description, an emulator.
- Obviously, all three strategies must be followed.
- This talk is about recent progress with strategy 3: the construction and use of emulators of dynamic environmental simulation models.

Concept

Concept
Emulator:
An emulator is a statistical approximation of a deterministic simulation model.
It can be used to interpolate between simulation results obtained at carefully chosen design points in model input space.
Replacing the simulation model by the emulator can greatly increase the efficiency of analyses, but it also adds uncertainty.
The emulator provides a deterministic interpolation result as well as a probability distribution that represents our knowledge of the emulation uncertainty.

Concept
Gaussian Process Emulators:
Emulators have been constructed quite successfully by setting up a Gaussian process prior with a mean consisting of a linear combination of basis functions and then conditioning this prior on the design data (O'Hagan 2006).

Concept
Gaussian Process Emulators: Limitations
1. Dense output in the time domain leads to numerical difficulties (large size and poor conditioning of the matrices to be inverted).
2. The knowledge about the mechanisms built into the simulation program is not used. It can be expected that we could build a better emulator when using this knowledge. This is of particular importance if the design set is small.
This raises the question of how to build an emulator of a dynamic model that resolves both of these issues.

Concept
Emulators for Dynamic Models: Three Options
1. Application of Gaussian processes with the time dimension as an additional input. This can lead to very large and poorly conditioned matrices to invert and to numerical problems.
2. For Markovian or state-space models: emulate the transfer function from one state to the next instead of the complete dynamic response.
3. Use a simple dynamic model as a prior and model the innovations as Gaussian processes in the other input dimensions. These Gaussian processes correct for the bias in the simple model.

Concept
Emulators for Dynamic Models:
To my knowledge, none of the emulators proposed so far considers our knowledge about the mechanisms implemented in the simulation model (with the exception of a problem-specific choice of basis functions).
Approach proposed in this talk:
- Use a simplified, linear state-space model to describe the approximate dynamics of the simulation model.
- Formulate the innovations as Gaussian processes of the parameters (and potentially other inputs).
- Derive the emulator (posterior) by Kalman smoothing.

Implementation

Construction of Emulators
We can distinguish five steps of emulator development:
1. Choice of Design Data
2. Choice of a Simplified Probabilistic Model
3. Coupling of Replicated Simplified Models
4. Conditioning the Simplified Model on the Design Data
5. Calculation of Expected Value and Uncertainty

Construction of Emulators
1. Choice of Design Data:
Often, parameter values are chosen by Latin hypercube sampling from reasonable domains of the model parameters. However, adaptive sampling schemes could be used that increase the density of sampling points in regions of high variability of the results.
The design data set consists of these parameter values and the corresponding simulation results:
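
The design step can be sketched as follows. This is a minimal illustration, assuming NumPy and a hypothetical `simulator` callable that maps one parameter vector to a vector of simulation results; the function names, the bounds format, and the `rng` argument are illustrative and not taken from the talk.

```python
import numpy as np

def latin_hypercube(n_points, bounds, rng=None):
    """Draw n_points parameter sets with one point per stratum in each dimension.

    bounds: sequence of (low, high) pairs, one per model parameter (assumed format).
    """
    rng = np.random.default_rng(rng)
    bounds = np.asarray(bounds, dtype=float)
    n_dim = bounds.shape[0]
    # One uniform draw inside each of n_points equal-width strata of [0, 1).
    u = (np.arange(n_points)[:, None] + rng.random((n_points, n_dim))) / n_points
    # Shuffle the strata independently in every dimension.
    for j in range(n_dim):
        u[:, j] = rng.permutation(u[:, j])
    low, high = bounds[:, 0], bounds[:, 1]
    return low + u * (high - low)

def build_design(simulator, n_points, bounds, rng=None):
    """Return the design data set: parameter values and matching simulation results."""
    theta = latin_hypercube(n_points, bounds, rng)
    y = np.array([simulator(t) for t in theta])
    return theta, y
```

An adaptive scheme would replace `latin_hypercube` by a sampler that adds points where the simulation results vary strongly; that variant is not shown here.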

Construction of Emulators
2. Choice of a Simplified Probabilistic Model:
The emulator is based on a simplified probabilistic model M' of the simulation model M. This model expresses our prior beliefs about the behaviour of the deterministic simulation model. Its likelihood function is given by:

Construction of Emulators
3. Coupling of Replicated Simplified Models:
The augmented model consists of n replicates of the simplified model for different parameter values:
These models are stochastically coupled. Probabilities here represent beliefs in a Bayesian sense.
We construct a model with n = n_D + 1 replicates of the simplified model. These correspond to models for the n_D design parameter sets and for the emulation parameter set.

Construction of Emulators
4. Conditioning the Simplified Model on the Design Data:
We calculate the distribution of the last set of components conditional on the results for the first n_D sets of components:
The emulator is obtained by integrating out additional parameters:

Construction of Emulators
5. Calculation of Expected Value and Uncertainty:
The expected value provides the deterministic emulator:
The variance-covariance matrix of the emulator quantifies the emulation uncertainty.

Gaussian Process Emulator
1. Choice of Design Data:
Often, parameter values are chosen by Latin hypercube sampling from reasonable domains of the model parameters. However, adaptive sampling schemes could be used that increase the density of sampling points in regions of high variability of the results.
The design data set consists of these parameter values and the corresponding simulation results:

Gaussian Process Emulator
2. Choice of a Simplified Probabilistic Model:
The simplified probabilistic model consists of a deterministic model plus a multivariate normal error term with mean zero:
The simplified model can contain additional parameters. Often a linear combination of suitably chosen basis functions is used:

Gaussian Process Emulator
3. Coupling of Replicated Simplified Models:
The augmented model consists of independent replications of the deterministic simplified model and error terms that are stochastically coupled:

Gaussian Process Emulator
3. Coupling of Replicated Simplified Models:
A simple stochastic coupling is obtained by:
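
The coupling formula itself is not preserved in this transcript. As an illustration only, the sketch below assumes a squared-exponential covariance between the error terms of two replicates, which decays with the distance between their parameter sets; the function name, `sigma`, and `length_scales` are hypothetical and may differ from the form used in the talk.

```python
import numpy as np

def coupling_covariance(theta, sigma=1.0, length_scales=1.0):
    """Covariance between the error terms of replicates at parameter sets theta.

    theta: array of shape (n, p), one row per replicate.
    Assumed form: sigma^2 * exp(-sum_k ((theta_ik - theta_jk) / l_k)^2).
    """
    theta = np.atleast_2d(np.asarray(theta, dtype=float))
    # Scale each parameter by its (assumed) correlation length.
    scaled = theta / np.asarray(length_scales, dtype=float)
    # Pairwise differences between all replicates, shape (n, n, p).
    diff = scaled[:, None, :] - scaled[None, :, :]
    return sigma**2 * np.exp(-np.sum(diff**2, axis=-1))
```

With such a covariance, replicates with similar parameter values have strongly correlated errors, so the design results inform the emulation most where the new parameter set lies close to design points.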

Gaussian Process Emulator
4. Conditioning the Simplified Model on the Design Data:
The augmented model is then multivariate normal. For this reason, we can apply the standard result for conditioning a multivariate normal distribution on some of its components:
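
The standard conditioning result can be sketched in code. This is a minimal illustration assuming NumPy; the function and argument names are hypothetical, and the notation (index 1 for the unobserved components, index 2 for the observed ones) is chosen here and need not match the slide.

```python
import numpy as np

def condition_mvn(mu, cov, idx_obs, y_obs):
    """Condition N(mu, cov) on the components idx_obs taking the values y_obs.

    Returns mean and covariance of the remaining components:
      mu_1|2 = mu_1 + S12 S22^{-1} (y_2 - mu_2)
      S_1|2  = S11 - S12 S22^{-1} S21
    """
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    idx_obs = np.asarray(idx_obs)
    idx_new = np.setdiff1d(np.arange(mu.size), idx_obs)
    s11 = cov[np.ix_(idx_new, idx_new)]
    s12 = cov[np.ix_(idx_new, idx_obs)]
    s22 = cov[np.ix_(idx_obs, idx_obs)]
    # Solve a linear system instead of forming the inverse of S22 explicitly.
    w = np.linalg.solve(s22, s12.T).T  # equals S12 S22^{-1} because S22 is symmetric
    cond_mean = mu[idx_new] + w @ (np.asarray(y_obs, dtype=float) - mu[idx_obs])
    cond_cov = s11 - w @ s12.T
    return cond_mean, cond_cov
```

In the emulator setting, the observed components would be the simulation results at the n_D design parameter sets and the remaining components the output at the emulation parameter set.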

Gaussian Process Emulator
4. Conditioning the Simplified Model on the Design Data:
This leads to the emulator as a multivariate normal distribution:
with

Gaussian Process Emulator
5. Calculation of Expected Value and Uncertainty (O'Hagan 2006):

Dynamic Emulator
Dynamic models (and their emulators) have a structured output:

Dynamic Emulator
1. Choice of Design Data:
Often, parameter values are chosen by Latin hypercube sampling from reasonable domains of the model parameters. However, adaptive sampling schemes could be used that increase the density of sampling points in regions of high variability of the results.
The design data set consists of these parameter values and the corresponding simulation results:

Dynamic Emulator
2. Choice of a Simplified Probabilistic Model:
Concept: Use of a state-space model, with emulation of the "observed" output only.
Reasons:
- This accounts for the typical "hidden Markov" structure of environmental simulation models.
- It allows us to implement an emulator with a simplified (lower-dimensional) state space.

Dynamic Emulator
2. Choice of a Simplified Probabilistic Model:

Dynamic Emulator
3. Coupling of Replicated Simplified Models:
Augmented Model (1):

Dynamic Emulator
3. Coupling of Replicated Simplified Models:
Augmented Model (2):

Dynamic Emulator
3. Coupling of Replicated Simplified Models:
Augmented Model (3): Stochastic coupling

Dynamic Emulator
4. Conditioning the Simplified Model on the Design Data:
Kalman (forward) filtering (Künsch, 2001):

Dynamic Emulator
4. Conditioning the Simplified Model on the Design Data:
Kalman (backward) smoothing (Künsch, 2001):
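
The filtering and smoothing recursions are not preserved in this transcript. The sketch below shows a generic forward Kalman filter followed by a backward pass in the Rauch-Tung-Striebel form for a linear Gaussian state-space model x_t = F x_{t-1} + w_t, y_t = H x_t + v_t. The model form, all names (`F`, `H`, `Q`, `R`, `m0`, `P0`), and the choice of the Rauch-Tung-Striebel form are assumptions made for illustration; the formulation in the talk may differ.

```python
import numpy as np

def kalman_filter(y, F, H, Q, R, m0, P0):
    """Forward pass. y has shape (T, n_obs); (m0, P0) is the prior of the first state."""
    T, n = len(y), len(m0)
    m_pred = np.zeros((T, n)); P_pred = np.zeros((T, n, n))
    m_filt = np.zeros((T, n)); P_filt = np.zeros((T, n, n))
    for t in range(T):
        if t == 0:
            m_pred[t], P_pred[t] = m0, P0
        else:
            # Propagate the previous filtered state through the linear dynamics.
            m_pred[t] = F @ m_filt[t - 1]
            P_pred[t] = F @ P_filt[t - 1] @ F.T + Q
        S = H @ P_pred[t] @ H.T + R                # innovation covariance
        K = np.linalg.solve(S, H @ P_pred[t]).T    # Kalman gain P H^T S^{-1}
        m_filt[t] = m_pred[t] + K @ (y[t] - H @ m_pred[t])
        P_filt[t] = P_pred[t] - K @ S @ K.T
    return m_filt, P_filt, m_pred, P_pred

def rts_smoother(m_filt, P_filt, m_pred, P_pred, F):
    """Backward pass: condition every state on the complete output series."""
    T = len(m_filt)
    m_s, P_s = m_filt.copy(), P_filt.copy()
    for t in range(T - 2, -1, -1):
        # Smoother gain P_filt F^T P_pred^{-1}.
        G = np.linalg.solve(P_pred[t + 1], F @ P_filt[t]).T
        m_s[t] = m_filt[t] + G @ (m_s[t + 1] - m_pred[t + 1])
        P_s[t] = P_filt[t] + G @ (P_s[t + 1] - P_pred[t + 1]) @ G.T
    return m_s, P_s
```

For a linear Gaussian model, the smoothed moments agree with direct conditioning of the joint multivariate normal distribution, but the recursion avoids building and inverting one large matrix over all time points.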

Dynamic Emulator
5. Calculation of Expected Value and Uncertainty:
Calculation of the expected value and variance-covariance matrix of the last set of components:

Implementation
Due to the dependence on a quantity that depends on the design data as well as on the new parameter values, the smoothing step is very inefficient.
By using a general matrix identity, we are able to separate out the inversion of the large sub-matrix that depends only on the design data.
This makes the procedure much more efficient, as we do not have to perform large matrix inversions when using the emulator at new parameter values.
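
The matrix identity is not preserved in this transcript. One identity that achieves this kind of separation is the block-matrix inversion formula based on the Schur complement, sketched below under that assumption. The names `A_inv`, `B`, `C`, `D` are hypothetical: `A` stands for the large block that depends only on the design data, while the other blocks involve the new parameter values.

```python
import numpy as np

def block_inverse(A_inv, B, C, D):
    """Inverse of the block matrix [[A, B], [C, D]] given a precomputed A_inv.

    Only the small Schur complement D - C A_inv B is inverted here, so the
    expensive inversion of A can be done once and reused.
    """
    AiB = A_inv @ B
    CAi = C @ A_inv
    S_inv = np.linalg.inv(D - C @ AiB)  # inverse of the Schur complement
    top_left = A_inv + AiB @ S_inv @ CAi
    top_right = -AiB @ S_inv
    bottom_left = -S_inv @ CAi
    return np.block([[top_left, top_right], [bottom_left, S_inv]])
```

In use, `A_inv` would be computed once from the design data and stored; each evaluation of the emulator at a new parameter set then only needs matrix products and the inversion of a block whose size does not grow with the design set.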

Application

Hydrological Model
Simple Hydrological Watershed Model (1): Kuczera et al

Hydrological Model
Simple Hydrological Watershed Model (2): Kuczera et al
model parameters, 3 initial conditions, 1 standard dev. of obs. err.
6

Hydrological Model
Simple Hydrological Watershed Model (3):

Model Application
- Data set of the Abercrombie watershed, New South Wales, Australia (2770 km²), kindly provided by George Kuczera (Kuczera et al. 2006).
- Box-Cox transformation applied to model and data to decrease heteroscedasticity of residuals.
- Step function input to account for input data in the form of daily sums of precipitation and potential evapotranspiration.
- Daily averaged output to account for output data in the form of daily averaged discharge.
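
The Box-Cox transformation mentioned above can be sketched as follows. This uses the standard one-parameter form; the talk does not state the value of the transformation parameter or whether an offset was used, so `lam` is a hypothetical argument.

```python
import numpy as np

def box_cox(y, lam):
    """One-parameter Box-Cox transformation for positive y."""
    y = np.asarray(y, dtype=float)
    if lam == 0:
        return np.log(y)  # limiting case for lam -> 0
    return (y**lam - 1.0) / lam

def inv_box_cox(z, lam):
    """Back-transformation from the transformed scale to the original scale."""
    z = np.asarray(z, dtype=float)
    if lam == 0:
        return np.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)
```

The same transformation is applied to simulated and observed discharge, so that residuals are compared on the transformed scale, where their variance depends less on the magnitude of the discharge.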

Linearization
Linearization of model nonlinearities:

Linearization
Derivation of simplified, linear state-space model:

Results
Preliminary results with a simpler model look promising. They demonstrate that the concept works.
Unfortunately, the results for the hydrological model are not yet available.

Discussion

Discussion
We developed a general technique for constructing emulators of dynamic simulation models.
In addition to solving technical problems of Gaussian process emulation of dynamic models, this technique makes it easy to exploit the mechanisms incorporated in the simulation model. It can be expected that this improves the emulation process. This is of particular importance if the design set is small.
There is a need for more research:
- Gaining more experience with our approach.
- Extending the approach to the estimation of additional parameters of the simplified model.
- Learning about the advantages and disadvantages of the different approaches to dynamic emulation.

Acknowledgements
- Collaboration for this paper: Gentry White, Susie Bayarri, Bruce Pitman, Tom Santner, during my stay at SAMSI, NC, USA.
- Hydrological example and data: George Kuczera.
- More interactions at SAMSI: Jim Berger, Fei Liu, Rui Paulo, Robert Wolpert, John Paul Gosling, Tony O'Hagan, and many more.