Options and generalisations

Outline
- Dimensionality
  - Many inputs and/or many outputs
- GP structure
  - Mean and variance functions
  - Prior information
  - Multi-output, dynamic and non-deterministic emulators
- Design
  - Design for building the emulator
  - Design for validation
  - Design for calibration
- Bayes linear emulators

Dimensionality

Many inputs
- This session is about variations and extensions of the methodology illustrated in Session 2
  - A brief look at a number of topics – more detail in the toolkit!
- The first topic is dimensionality
- All serious simulators require more than one input
  - The norm is anything from a few to thousands
- All of the basic emulation theory in the toolkit assumes multiple inputs
  - Even the core problem (e.g. ThreadCoreGP)
- Large numbers of inputs pose computational problems
  - Dimension reduction techniques have been developed
  - Output typically depends principally on a few inputs

Many outputs
- Most simulators also produce multiple outputs
  - For instance, a climate simulator may predict temperature on a grid of pixels, sea level, etc.
- Usually, for any given use of the simulator, we are interested in just one output
  - So we can just emulate that one
- But some problems require multi-output emulation
  - Again, there are dimension reduction techniques
- All described in the toolkit

GP structure

The GP mean function
- We can use this to say what kind of shape we would expect the output to take as a function of the inputs
- Most simulator outputs exhibit some overall trend in response to varying a single input
  - So we usually specify a linear mean function
  - Slopes (positive or negative) are estimated from the training data
- The emulator mean smoothes the residuals after fitting the linear terms
- We can generalise to other kinds of mean function if we have a clear idea of how the simulator will behave
- The better the mean function, the less the GP has to do (a sketch of the idea follows)
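As a minimal sketch of the idea in Python: fit a linear mean h(x)ᵀβ with h(x) = (1, x) by least squares, and look at the residuals that the GP part is left to model. The simulator and all names here are illustrative stand-ins, not MUCM toolkit functions.

```python
import numpy as np

def simulator(x):
    # toy stand-in for an expensive simulator: a trend plus wiggles
    return 2.0 * x + np.sin(5.0 * x)

x_train = np.linspace(0.0, 1.0, 8)
y_train = simulator(x_train)

# Linear mean function h(x)^T beta with h(x) = (1, x); beta by least squares
H = np.column_stack([np.ones_like(x_train), x_train])
beta, *_ = np.linalg.lstsq(H, y_train, rcond=None)

# The GP covariance term only has to explain these residuals
residuals = y_train - H @ beta
print("fitted slope:", beta[1])
print("residual standard deviation:", residuals.std())
```

The smaller these residuals, the less work the GP has to do, which is the point of the last bullet above.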

Example
- The simulator is the solid line
- The dashed line is the linear fit
- Thin blue lines indicate the fitted residuals
- Without the linear mean function, we'd have a horizontal (constant) fit
  - And larger residuals
  - Leading to larger emulator uncertainty

The GP covariance function
- The covariance function determines how 'wiggly' the response is to each input
- There's a lot of flexibility here, but standard covariance functions have a parameter for each input
  - These 'correlation length' parameters are also estimated from the training data
  - But some care is needed
- For predicting output at an untried x, correlation lengths are important
  - They determine how much information comes from nearby training points
  - And hence the emulator accuracy
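A minimal sketch of a standard choice, the squared-exponential covariance with one correlation length per input; the function name and example values are illustrative.

```python
import numpy as np

def sq_exp_cov(X1, X2, sigma2, lengths):
    """k(x, x') = sigma2 * exp(-sum_i ((x_i - x'_i) / delta_i)^2)."""
    d = (X1[:, None, :] - X2[None, :, :]) / lengths  # scaled differences
    return sigma2 * np.exp(-np.sum(d**2, axis=2))

# Two inputs: a short correlation length means the output wiggles quickly
# in input 1; a long one means it varies slowly in input 2
lengths = np.array([0.1, 1.0])
X = np.random.default_rng(1).uniform(size=(5, 2))
K = sq_exp_cov(X, X, sigma2=1.0, lengths=lengths)
print(K.round(3))
```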

Prior distributions
- Prior information enters through the form of the mean function
  - And to a lesser extent the covariance function
- But we can also supply prior information through the prior distributions
  - For slope/regression parameters and correlation lengths
  - Also the overall variance parameter
- Putting in genuine prior information here generally improves emulator performance
  - Compared with standard 'non-informative' priors

Multi-output emulators
- When we need to emulate several simulator outputs, there are a number of available approaches:
  - A single-output GP with added input(s) indexing the outputs
    - For temperature outputs on a grid, make the grid coordinates 2 additional inputs
  - Independent GPs
  - A multivariate GP
  - Independent GPs for a linear transformation
    - E.g. principal components
    - Possibility for dimension reduction
- These are all documented in the toolkit
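A sketch of the last approach, independent GPs on principal components: project the training outputs onto their leading components, emulate each component score separately, then map predictions back. The data here are random placeholders for real training runs.

```python
import numpy as np

Y = np.random.default_rng(3).normal(size=(30, 100))  # 30 runs, 100 outputs
Y_mean = Y.mean(axis=0)
U, s, Vt = np.linalg.svd(Y - Y_mean, full_matrices=False)

k = 3                       # keep 3 components: dimension reduction 100 -> 3
scores = U[:, :k] * s[:k]   # 30 x 3: training targets for 3 independent GPs
basis = Vt[:k]              # 3 x 100: maps component scores back to outputs

# Once each score is emulated, a prediction of all 100 outputs is
#   y_hat = Y_mean + predicted_scores @ basis
print(scores.shape, basis.shape)
```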

[Figure: from ThreadVariantMultipleOutputs]

Dynamic emulation
- Many simulators predict a process evolving in time
  - At each time-step the simulator updates the system state
  - Often driven by external forcing variables at each time-step
  - Climate models are usually dynamic in this sense
- We are interested in emulating the simulator's time series of outputs
- The various forms of multi-output emulation can be used
- Or a dynamic emulator that works by emulating the single time-step
  - And then iterating the emulator
- Also documented in the toolkit
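A minimal sketch of the iteration idea: emulate the one-step state update once, then chain predictions through time. Here `one_step_mean` is a hypothetical stand-in for a trained single-step emulator's posterior mean, and for simplicity only the mean is propagated; a full treatment would propagate the uncertainty too.

```python
def one_step_mean(state, forcing):
    # hypothetical emulator mean for the update s_{t+1} = f(s_t, forcing_t)
    return 0.9 * state + 0.5 * forcing

def emulate_series(s0, forcings):
    # iterate the single-step emulator along the forcing sequence
    states = [s0]
    for f_t in forcings:
        states.append(one_step_mean(states[-1], f_t))
    return states

print(emulate_series(1.0, [0.2, 0.4, 0.1]))
```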

[Figure: from ThreadVariantDynamic]

Stochastic emulation
- Other simulators produce non-deterministic outputs
  - Running a stochastic simulator twice with the same input x produces randomly different outputs
- Different emulation strategies arise depending on what aspect of the output is of interest
  - If interest focuses on the mean: the output has added noise, which we allow for when building the emulator
  - If interest focuses on the risk of exceeding a threshold: emulate the distribution and derive the risk, or emulate the risk directly
- This is not yet covered in the toolkit
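A sketch of the "mean with added noise" strategy: the replicate noise enters the emulator as an extra variance term on the diagonal of the training covariance matrix (often called a nugget). All values here are illustrative, and a zero prior mean is assumed for brevity.

```python
import numpy as np

rng = np.random.default_rng(4)

def k(a, b, delta=0.3):
    # squared-exponential covariance in one input
    return np.exp(-((a[:, None] - b[None, :]) / delta) ** 2)

x = np.linspace(0.0, 1.0, 12)
y = np.sin(6 * x) + rng.normal(scale=0.1, size=x.size)  # stochastic output

noise_var = 0.1 ** 2
K = k(x, x) + noise_var * np.eye(x.size)   # nugget on the diagonal
x_new = np.array([0.37])
# posterior mean of the underlying mean function (zero prior mean assumed)
mean_pred = k(x_new, x) @ np.linalg.solve(K, y)
print(mean_pred)
```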

Design

Training sample design
- To build an emulator, we use a set of simulator runs
  - Our training data are y_1 = f(x_1), ..., y_n = f(x_n)
  - where x_1, x_2, ..., x_n are n different points in the space of possible inputs
- This set of n points is a design
- Traditional Monte Carlo methods use randomly chosen design points
- One reason why emulation can be better is that we can use a carefully chosen design
- A good design will provide us with maximum information about the simulator
  - And hence an emulator that is as good as possible

Design principles
- Design space
  - Over what range of values for the inputs do we want to build the emulator?
  - This is usually a small part of the possible input space
- Filling the space
  - We want to place n points in this space
  - It is generally good practice to spread them quite evenly over the design space
    - Because adding a new point very close to an existing point provides little additional information
  - There are several ways to generate space-filling designs

Factorial designs
- A factorial design is a grid
  - Uses a set of values for each input
  - And the grid is all combinations of these
- A 3x3 = 9 point design for two inputs:

      x x x
      x x x
      x x x

- Disadvantages
  - In higher dimensions it requires too many points
    - E.g. just 3 values for each of 8 inputs means 3^8 = 6561 runs
  - Wasteful when some inputs don't affect the output appreciably
    - The 3x3 design has just 3 distinct points if one input is inactive
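A quick illustration of the combinatorial blow-up, using Python's itertools:

```python
from itertools import product

levels = [0.0, 0.5, 1.0]                      # 3 values per input
design_2d = list(product(levels, repeat=2))   # all combinations
print(len(design_2d))                         # 3^2 = 9 points

print(len(list(product(levels, repeat=8))))   # 3^8 = 6561 runs
```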

Latin hypercube designs
- LHC designs
  - Use n values for each input
  - Combined randomly across inputs
- Advantages
  - Doesn't necessarily require a large number of points
  - Nothing lost if some inputs are inactive
- Disadvantages
  - A random choice may not produce an even spread of points
  - Need to generate many LHC designs and pick the best (see the sketch below)

[Figure: a 9-point Latin hypercube design for two inputs]
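A minimal sketch of generating random Latin hypercubes and keeping the most even one by a maximin-distance criterion; the helper names are illustrative.

```python
import numpy as np

def latin_hypercube(n, dim, rng):
    # one point per stratum for each input, strata matched up at random
    strata = rng.permuted(np.tile(np.arange(n), (dim, 1)), axis=1).T
    return (strata + rng.uniform(size=(n, dim))) / n

def min_distance(X):
    # maximin criterion: a larger smallest pairwise distance = more even
    d = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    return d[d > 0].min()

rng = np.random.default_rng(2)
candidates = [latin_hypercube(20, 2, rng) for _ in range(100)]
best = max(candidates, key=min_distance)
print(min_distance(best))
```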

Some more choices
- Various formulae and algorithms exist to generate space-filling designs for any number of inputs
- The Sobol sequence is often used
  - Quick and convenient
  - Not always good when some inputs are inactive
- Optimal designs maximise/minimise some criterion
  - E.g. maximum entropy designs
  - Can be hard to compute
- Hybrid designs try to satisfy two criteria
  - Space-filling, but also having a few points closer together
  - In order to estimate correlation lengths well
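For example, SciPy's quasi-Monte Carlo module (available from SciPy 1.7) can generate a scrambled Sobol design and rescale it to the design space; the bounds below are illustrative.

```python
from scipy.stats import qmc

sampler = qmc.Sobol(d=3, scramble=True, seed=0)   # 3 inputs
unit_design = sampler.random_base2(m=5)           # 2^5 = 32 points in [0,1]^3

# Rescale to each input's range, i.e. the region we want to emulate over
lower, upper = [0.0, -1.0, 10.0], [1.0, 1.0, 20.0]
design = qmc.scale(unit_design, lower, upper)
print(design.shape)                               # (32, 3)
```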

Design for validation
- Validation checks the outputs from a sample of new simulator runs against the predictions of an emulator
  - Are the true outputs as close to the emulator mean values as the emulator variances say they should be? (a sketch of this check follows)
- What would be good inputs to use for validation?
  - Points close to others already used for training
    - Such points test the correlation lengths
  - Points far from any training sample point
    - These points test the mean function
- Together, the training and validation designs should look space-filling except for some points close together
  - As suggested for hybrid training sample designs
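One common check, sketched here with illustrative names and values: standardised prediction errors should mostly lie within about ±2 if the emulator's variances are credible.

```python
import numpy as np

def standardised_errors(y_valid, emu_mean, emu_sd):
    # (true output - emulator mean) / emulator standard deviation
    return (y_valid - emu_mean) / emu_sd

z = standardised_errors(np.array([1.2, 0.8]),
                        np.array([1.0, 1.0]),
                        np.array([0.15, 0.10]))
print(z)   # values far outside [-2, 2] suggest the emulator is overconfident
```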

Design for calibration
- A very different design problem
  - How should we design an experiment observing the real system?
  - How should we set the values of controllable inputs?
- With the aim of learning most about uncertain (uncontrollable) inputs
  - And model discrepancy
- This topic is as yet largely unexplored!

[Figure: from ThreadTopicExperimentalDesign]

Bayes linear emulation

Bayes linear methods
- The approach in this course is based in the fully Bayesian framework
- But there is an alternative framework – Bayes linear methods
  - Based only on first- and second-order moments
    - Means, variances, covariances
  - Avoids making assumptions about distributions
- Its predictions are also first- and second-order moments
  - Means, variances, covariances, but no distributions
- The toolkit contains theory and procedures for Bayes linear emulators
  - ThreadCoreBL etc.

Bayes linear emulators
- Much of the mathematics is very similar
- A Bayes linear emulator is not a GP, but it gives the same mean and variance predictions for f(x)
  - For given correlation lengths and mean function parameters
    - Although these are handled differently
- But the emulator predictions no longer have distributions
- Compared with GP emulators:
  - Advantages – simpler, and may be feasible for more complex problems
  - Disadvantages – the absence of distributions limits many of the uses of emulators, so compromises are made
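The core of the approach is the Bayes linear adjustment, which uses only means, variances and covariances. A minimal sketch with illustrative names (not toolkit code):

```python
import numpy as np

def bayes_linear_adjust(e_f, v_f, e_D, V_D, C_fD, D):
    """Adjusted expectation and variance of f given data D:
       E_D[f]   = E[f] + Cov[f,D] Var[D]^-1 (D - E[D])
       Var_D[f] = Var[f] - Cov[f,D] Var[D]^-1 Cov[D,f]
    """
    adj_mean = e_f + C_fD @ np.linalg.solve(V_D, D - e_D)
    adj_var = v_f - C_fD @ np.linalg.solve(V_D, C_fD.T)
    return adj_mean, adj_var

# Tiny made-up example: one quantity f, two observed data points
e_f, v_f = 0.0, np.array([[1.0]])
e_D, V_D = np.zeros(2), np.array([[1.0, 0.5], [0.5, 1.0]])
C_fD = np.array([[0.8, 0.3]])
D = np.array([0.9, 0.2])
print(bayes_linear_adjust(e_f, v_f, e_D, V_D, C_fD, D))
```

Note that no distributional assumptions are made anywhere: the adjustment is defined purely in terms of the specified moments.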