Bayesian methods for calibrating and comparing process-based vegetation models Marcel van Oijen (CEH-Edinburgh)


Contents Process-based modelling of forests and uncertainties Bayes’ Theorem (BT) Bayesian Calibration (BC) of process-based models Bayesian Model Comparison (BMC) BC & BMC in NitroEurope Examples of BC & BMC in other sciences BC & BMC as tools to develop theory References, Summary, Discussion

1. Introduction: Process-based modelling of forests and uncertainties

1.1 Forest growth in Europe. Project RECOGNITION (FAIRCT98-4124): 15 partner countries across Europe, 22 sites. Previous observations: forests across Europe have started to grow faster in the 20th century. Causes? Future trend? RECOGNITION combined empirical methods with process-based modelling; modelling groups in the UK, Sweden and Finland (2), coordinated by CEH-Edinburgh.

1.2 Forest growth in Europe. EFM results: NPP before the growth-rate increase (1920). CONCLUSION 20th century: growth accelerated by N-deposition. Environmental change 2000-2080, effects on NPP: [Figure: % change in NPP (-10 to +25) vs latitude for the 22 sites, showing the cumulative effects of CO2, climate and N-deposition]. CONCLUSION 21st century: growth likely to be accelerated by climate change and increasing [CO2].

1.3 Reality check! How reliable is the European forest study? Sufficient data for model parameterization? Sufficient data for model input? Would another model have given different results? In every study using systems analysis and simulation, model parameters, inputs and structure are uncertain. How do we deal with these uncertainties optimally?

1.4 Forest models and uncertainty [Levy et al, 2004]

1.4 Forest models and uncertainty. [Figure: NdepUE (kg C kg-1 N) as simulated by the bgc, century and hybrid models; Levy et al., 2004]

1.5 Model-data fusion. Uncertainties are everywhere: models (environmental inputs, parameters, structure) and data. Uncertainties can be expressed as probability distributions (pdf's). We need methods that: quantify all uncertainties; show how to reduce them; efficiently transfer information: data → models → model application. Calculating with uncertainties (pdf's) = probability theory.

2. Bayes’ Theorem

2.1 Dealing with uncertainty: medical diagnostics. A flu epidemic occurs: one percent of people are ill, P(dis) = 0.01. A diagnostic test is 99% reliable: P(pos|hlth) = 0.01, P(pos|dis) = 0.99. Your test result is positive (bad news!). What is P(diseased|test positive): 0.50, 0.98 or 0.99? Bayes' Theorem: P(dis|pos) = P(pos|dis) P(dis) / P(pos)

2.1 Dealing with uncertainty: medical diagnostics. A flu epidemic occurs: one percent of people are ill, P(dis) = 0.01. A diagnostic test is 99% reliable: P(pos|hlth) = 0.01, P(pos|dis) = 0.99. Your test result is positive (bad news!). What is P(diseased|test positive): 0.50, 0.98 or 0.99? Bayes' Theorem: P(dis|pos) = P(pos|dis) P(dis) / P(pos) = P(pos|dis) P(dis) / [P(pos|dis) P(dis) + P(pos|hlth) P(hlth)]

2.1 Dealing with uncertainty: medical diagnostics. A flu epidemic occurs: one percent of people are ill, P(dis) = 0.01. A diagnostic test is 99% reliable: P(pos|hlth) = 0.01, P(pos|dis) = 0.99. Your test result is positive (bad news!). What is P(diseased|test positive): 0.50, 0.98 or 0.99? Bayes' Theorem: P(dis|pos) = P(pos|dis) P(dis) / P(pos) = P(pos|dis) P(dis) / [P(pos|dis) P(dis) + P(pos|hlth) P(hlth)] = (0.99 × 0.01) / (0.99 × 0.01 + 0.01 × 0.99) = 0.50
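The arithmetic on this slide is easy to verify; here is a minimal Python check (the variable names are mine, not from the slides):

```python
# Bayes' Theorem for the diagnostic-test example:
# P(dis|pos) = P(pos|dis) P(dis) / P(pos)
p_dis = 0.01                      # prior: 1% of people are ill
p_hlth = 1 - p_dis
p_pos_given_dis = 0.99            # test reliability (sensitivity)
p_pos_given_hlth = 0.01           # false-positive rate

# total probability of a positive test result
p_pos = p_pos_given_dis * p_dis + p_pos_given_hlth * p_hlth

# posterior probability of disease given a positive test
p_dis_given_pos = p_pos_given_dis * p_dis / p_pos
print(round(p_dis_given_pos, 2))  # prints 0.5
```

The surprising answer of 0.50 arises because the disease is rare: true positives and false positives are equally numerous.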

2.2 Bayesian updating of probabilities Bayes’ Theorem: Prior probability → Posterior prob. Medical diagnostics: P(disease) → P(disease|test result) Model parameterization: P(params) → P(params|data) Model selection: P(models) → P(model|data) SPAM-killer: P(SPAM) → P(SPAM|E-mail header) Weather forecasting: … Climate change prediction: … Oil field discovery: … GHG-emission estimation: … Jurisprudence: … …

2.2 Bayesian updating of probabilities Bayes’ Theorem: Prior probability → Posterior prob. Model parameterization: P(params) → P(params|data) Model selection: P(models) → P(model|data) Application of Bayes’ Theorem to process-based models (not analytically solvable): Markov Chain Monte-Carlo (Metropolis algorithm)

2.3 What and why? We want to use data and models to explain and predict ecosystem behaviour. Data as well as model inputs, parameters and outputs are uncertain. No prediction is complete without quantifying the uncertainty; no explanation is complete without analysing the uncertainty. Uncertainties can be expressed as probability density functions (pdf's), and probability theory tells us how to work with pdf's. Bayes' Theorem (BT) tells us how a pdf changes when new information arrives: BT: Prior pdf → Posterior pdf. BT: Posterior = Prior × Likelihood / Evidence. BT: P(θ|D) = P(θ) P(D|θ) / P(D). BT: P(θ|D) ∝ P(θ) P(D|θ)
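For one special case the prior → posterior step can be written in closed form, which makes it concrete: with a normal prior and a normal likelihood for a single observation, the posterior is again normal. A sketch (all numbers are invented for illustration):

```python
# Conjugate normal-normal update: prior N(mu0, s0^2), one observation d
# with measurement error sd. The posterior is N(mu1, s1^2) with
#   1/s1^2 = 1/s0^2 + 1/sd^2      (precisions add)
#   mu1 = (mu0/s0^2 + d/sd^2) / (1/s0^2 + 1/sd^2)
mu0, s0 = 5.0, 2.0    # prior: parameter around 5, fairly uncertain
d, sd = 8.0, 1.0      # new datum: 8, measured with error 1

prec1 = 1.0 / s0**2 + 1.0 / sd**2
s1 = prec1 ** -0.5
mu1 = (mu0 / s0**2 + d / sd**2) / prec1

print(mu1, s1)  # posterior mean moves toward the datum, spread shrinks
```

For process-based models no such closed form exists, which is why the following sections turn to sampling methods.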

3. Bayesian Calibration (BC) of process-based models

3.1 Process-based forest models. [Diagram: environmental scenarios, initial values (e.g. soil C) and parameters feed into the model, which outputs e.g. height and NPP]

3.2 Process-based forest model BASFOR: 40+ parameters, 12+ output variables.

3.3 BASFOR: outputs Carbon in trees Volume (standing + thinned) Carbon in soil

3.4 BASFOR: parameter uncertainty

3.5 BASFOR: prior output uncertainty Carbon in trees (standing + thinned) Volume (standing) Carbon in soil

3.6 Data Dodd Wood (R. Matthews, Forest Research) Carbon in trees (standing + thinned) Volume (standing) Carbon in soil

3.7 Using data in Bayesian calibration of BASFOR Prior pdf Data Bayesian calibration Posterior pdf

3.8 Bayesian calibration: posterior uncertainty Carbon in trees (standing + thinned) Volume (standing) Carbon in soil

P(|D) = P() P(D| ) / P(D)  P() P(D|f()) 3.9 How does BC work again? f = the model, e.g. BASFOR P(|D) = P() P(D| ) / P(D)  P() P(D|f()) “Posterior distribution of parameters” “Prior distribution of parameters” “Likelihood” of data, given mismatch with model output

Bayesian calibration in action! Bayes' Theorem: P(θ|D) ∝ P(θ) P(D|f(θ)). [Animation: parameter probability distributions, model output and data]

3.10 Calculating the posterior using MCMC. Bayesian calibration of the forest model BASFOR cannot be done analytically, so we use a Markov Chain Monte Carlo (MCMC) method. MCMC does not give us a formula for the posterior parameter pdf, but it generates a representative sample from the posterior: 10^4–10^5 parameter vectors from P(θ|D). We use the simplest MCMC method, the Metropolis et al. (1953) algorithm, applied to P(θ|D) ∝ P(θ) P(D|f(θ)):
1. Start anywhere in parameter space: p1..39(i=0)
2. Randomly propose p(i+1) = p(i) + δ
3. IF [ P(p(i+1)) P(D|f(p(i+1))) ] / [ P(p(i)) P(D|f(p(i))) ] > Random[0,1] THEN accept p(i+1), ELSE reject it (p(i+1) = p(i)); i = i+1
4. IF i < 10^4 GOTO 2
[MCMC trace plots]
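The Metropolis steps above can be sketched in a few lines of Python. This is a toy illustration, not BASFOR: the "model" is a straight line, and the data, error σ and prior bounds are invented:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy stand-in for a process-based model f(theta): a straight line,
# where BASFOR would be a full forest simulation with 39 parameters.
def f(theta, x):
    return theta[0] + theta[1] * x

x_obs = np.array([1.0, 2.0, 3.0, 4.0])
d_obs = np.array([2.1, 3.9, 6.2, 7.8])   # hypothetical observations
sigma = 0.3                               # assumed measurement error

def log_prior(theta):                     # uniform prior on a box
    return 0.0 if np.all((-10 < theta) & (theta < 10)) else -np.inf

def log_lik(theta):                       # Gaussian iid likelihood
    return -0.5 * np.sum(((d_obs - f(theta, x_obs)) / sigma) ** 2)

# Metropolis random walk, following steps 1-4 on the slide
theta = np.zeros(2)
chain = []
for i in range(10000):
    prop = theta + rng.normal(0, 0.1, size=2)                 # step 2
    log_alpha = (log_prior(prop) + log_lik(prop)
                 - log_prior(theta) - log_lik(theta))
    if np.log(rng.uniform()) < log_alpha:                     # step 3
        theta = prop                                          # accept
    chain.append(theta)                    # on rejection, repeat current

chain = np.array(chain)
print(chain[5000:].mean(axis=0))  # posterior means of intercept and slope
```

The acceptance test is done in log space so that very small likelihoods do not underflow; for a 39-parameter model the same loop applies, only f(θ) and the proposal width change.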

3.11 MCMC in action BC3D.AVI

3.12 Using data in Bayesian calibration of BASFOR Prior pdf Data Bayesian calibration Posterior pdf

3.13 Parameter correlations. [Matrix of posterior correlations among the 39 parameters]

3.14 Continued calibration when new data become available Prior pdf Posterior pdf Bayesian calibration Prior pdf New data

3.14 Continued calibration when new data become available Prior pdf Posterior pdf Prior pdf New data Bayesian calibration

3.15 Bayesian projects at CEH-Edinburgh: Parameterization and uncertainty quantification of the 3-PG model of forest growth & C-stock (Genevieve Patenaude, Ronnie Milne, M. van Oijen); Uncertainty in earth system resilience (Clare Britton & David Cameron); Selection of forest models; Data assimilation of forest EC data (David Cameron, Mat Williams, M. van Oijen); Risk of frost damage in grassland; Uncertainty in UK C-sequestration (Marcel van Oijen, Jonathan Rougier, Ron Smith, Tommy Brown, Amanda Thomson)

3.16 BASFOR: forest C-sequestration 2005-2076. [Maps: change in annual mean temperature (UKCIP), change in potential C-sequestration, and uncertainty in the change of potential C-seq.] The uncertainty shown is due to model parameters only, NOT uncertainty in inputs / upscaling.

3.17 Integrating RS-data (Patenaude et al.): Bayesian calibration (BC) of the 3-PG model using remote-sensing data (hyper-spectral, LiDAR, SAR).

3.18 What kind of measurements would have reduced uncertainty the most ?

3.19 Prior predictive uncertainty & height-data Prior pred. uncertainty Height Biomass Height data Skogaby

3.20 Prior & posterior uncertainty: use of height data Prior pred. uncertainty Height Biomass Posterior uncertainty (using height data) Height data Skogaby

3.20 Prior & posterior uncertainty: use of height data Prior pred. uncertainty Height Biomass Posterior uncertainty (using height data) Height data (hypothet.)

3.20 Prior & posterior uncertainty: use of height data Prior pred. uncertainty Height Biomass Posterior uncertainty (using height data) Posterior uncertainty (using precision height data)

3.21 Summary of the BC procedure: Prior P(θ) + model f + data D ± σ ("error function", e.g. N(0, σ)) → MCMC → samples of θ (10^4–10^5) → samples of f(θ) and P(D|f(θ)) → posterior P(θ|D) → calibrated parameters with covariances (PCC), sensitivity analysis of model parameters, uncertainty of model output.

3.22 Summary: BC vs tuning.
Model tuning: (1) define parameter ranges (permitted values); (2) select the parameter values that give model output closest (r2, RMSE, ...) to the data; (3) do the model study with the tuned parameters, i.e. no model output uncertainty.
Bayesian calibration: (1) define parameter pdf's; (2) define data pdf's (probable measurement errors); (3) use Bayes' Theorem to calculate the posterior parameter pdf; (4) do all future model runs with samples from the parameter pdf, i.e. quantify the uncertainty of model results.
BC can use data to reduce parameter uncertainty for any process-based model.

4. Bayesian Model Comparison (BMC)

4.1 RECOGNITION revisited: model uncertainty. [Figure: EFM predictions plotted against latitude]

4.1 RECOGNITION revisited: model uncertainty. [Figures: predicted changes (Q) plotted against latitude for the EFM, EFIMOD and FinnFor models at the 22 sites]

4.2 Bayesian comparison of two models. Bayes' Theorem for model probabilities: P(M|D) = P(M) P(D|M) / P(D). The "integrated likelihood" P(D|Mi) can be approximated from the MCMC sample of outputs for model Mi, as the harmonic mean of the likelihoods in the MCMC sample (Kass & Raftery, 1995). With equal priors P(M1) = P(M2) = ½: P(M2|D) / P(M1|D) = P(D|M2) / P(D|M1). This "Bayes Factor" P(D|M2) / P(D|M1) quantifies how the data D change the odds of M2 over M1.
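The harmonic-mean approximation (Kass & Raftery, 1995) is easy to compute from the log-likelihoods stored during an MCMC run. A sketch with invented log-likelihood traces for two models (all numbers are illustrative):

```python
import numpy as np

def log_marginal_harmonic(log_liks):
    """Harmonic-mean estimate of log P(D|M) from MCMC likelihood samples
    (Kass & Raftery 1995): P(D|M) is approximated by 1 / mean(1/L_i)."""
    log_liks = np.asarray(log_liks)
    n = len(log_liks)
    # log of the harmonic mean, computed stably in log space:
    # log HM = log n - logsumexp(-log_liks)
    m = np.max(-log_liks)
    return np.log(n) - (m + np.log(np.sum(np.exp(-log_liks - m))))

# Hypothetical log-likelihood traces for two models from their MCMC runs
rng = np.random.default_rng(0)
loglik_m1 = -50.0 + rng.normal(0, 1, 5000)
loglik_m2 = -48.0 + rng.normal(0, 1, 5000)

log_bf = log_marginal_harmonic(loglik_m2) - log_marginal_harmonic(loglik_m1)
print("log Bayes factor (M2 vs M1):", log_bf)
```

Kass & Raftery note that this estimator can be unstable, since it is dominated by the smallest likelihoods in the sample; its appeal is that it comes for free from a calibration that has already been run.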

4.3 BMC: Tuomi et al. 2007

4.4 Bayes Factor for two big forest models Calculation of P(D|BASFOR) MCMC 5000 steps Data Rajec: Emil Klimo Calculation of P(D|BASFOR+) MCMC 5000 steps

4.5 Bayes Factor for two big forest models. Calculation of P(D|BASFOR): MCMC, 5000 steps → P(D|M1) = 7.2e-16. Calculation of P(D|BASFOR+): MCMC, 5000 steps → P(D|M2) = 5.8e-15. Bayes Factor = 7.8, so BASFOR+ is supported by the data. Data Rajec: Emil Klimo.

4.6 Summary of the BMC procedure. Data D. Model 1: prior P(θ1) → MCMC → samples of θ1 (10^4–10^5) → posterior P(θ1|D) (updated parameters) and P(D|M1). Model 2: prior P(θ2) → MCMC → samples of θ2 (10^4–10^5) → posterior P(θ2|D) (updated parameters) and P(D|M2). P(D|M1) and P(D|M2) → Bayes factor → updated model odds.

5. BC & BMC in NitroEurope

5.1 NitroEurope & uncertainty. What is the effect of reactive-nitrogen supply on the direction and magnitude of net greenhouse-gas budgets for Europe? This CEH-coordinated IP builds on CEH's involvement in previous and current European GHG projects such as GREENGRASS, CarboMont and CarboEurope IP.

5.2 NitroEurope & Uncertainty NitroEurope (NEU): non-CO2 GHG Europe experiments at plot-scale, observations at regional scale models at plot- and regional scale protocols for good-modelling practice and for uncertainty quantification and analysis (collab. with CEU in JUTF) Modellers NEU (2006)

5.3 Uncertainty assessment of NEU models. [Table: NEU models and whether they have had Bayesian calibration (BC); a plot-scale forest model, DAYCENT, was added in 2007.]

5.4 NEU – Forest model comparison 2007-8: 4 models (DNDC, BASFOR, COUP, DayCENT); models frozen 30-11-2007; Bayesian Calibration (BC) of the models using data from Höglwald (D), mainly N2O & NO emission rates; Bayesian Model Comparison (BMC) of the models using data from AU & DK.

Bayesian Calibration (BC) and Bayesian Model Comparison (BMC) of process-based models in NitroEurope: Theory, implementation and guidelines

Bayesian Calibration (BC) and Bayesian Model Comparison (BMC) of process-based models in NitroEurope: Theory, implementation and guidelines. Contents: Theory of BC and BMC; Methods for doing BC: MCMC and Accept-Reject (3.1 Standard Metropolis algorithm; 3.2 Metropolis with a modified proposal-generating mechanism, the "Reflection method"; 3.3 Accept-Reject algorithm); FAQ – Bayesian Calibration; References; Appendix 1: MCMC code in MATLAB: the Metropolis algorithm; Appendix 2: MCMC code in MATLAB: Metropolis-with-Reflection; Appendix 3: Accept-Reject code in MATLAB; Appendix 4: MCMC code in R: the Metropolis algorithm.

5.6 BASFOR changes for NEU. Soil temperature calculated. Mineralisation of litter and SOM = f(Tsoil), a Gaussian curve (Tuomi et al. 2007): f = exp[ (T−10)(2Tm−T−10) / (2σ²) ]. N emission split up into N2O and NO via the Hole-In-the-Pipe (HIP) approach (Davidson & Verchot, 2000): fN2O = 1 / ( 1 + exp[−r(WFPS−WFPS50)] ), where WFPS is the water-filled pore space (-).
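The two response functions on this slide can be written down directly. In this sketch the parameter values (Tm, σ, r, WFPS50) are illustrative placeholders, not the values used in BASFOR:

```python
import math

def f_mineralisation(T, Tm=30.0, sigma=10.0):
    """Gaussian temperature response of litter/SOM mineralisation
    (Tuomi et al. 2007), equal to 1 at the reference T = 10 degC.
    Tm and sigma values here are illustrative, not BASFOR's."""
    return math.exp((T - 10.0) * (2.0 * Tm - T - 10.0) / (2.0 * sigma ** 2))

def f_N2O(WFPS, r=10.0, WFPS50=0.55):
    """Hole-In-the-Pipe partitioning (Davidson & Verchot 2000): the
    fraction of N emission released as N2O rises sigmoidally with
    water-filled pore space. r and WFPS50 are illustrative values."""
    return 1.0 / (1.0 + math.exp(-r * (WFPS - WFPS50)))

print(f_mineralisation(10.0))  # 1.0 at the reference temperature
print(f_N2O(0.55))             # 0.5 at WFPS = WFPS50
```

Note the built-in anchor points: mineralisation is normalised to 1 at 10 °C, and exactly half of the N emission leaves as N2O when WFPS equals WFPS50.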

5.12 BC results: Prior & Posterior

5.13 BC results: simulation uncertainty & data

5.15 Data have information content, which is additive.

5.16 BMC: BASFOR vs BASFOR with T-sensitivity.
Data 1983-1997: log P(D) = -614.4 (BASFOR) vs -607.4 (BASFOR with T-sensitivity); BF = 1131.0
Data 1998-2003: log P(D) = -427.6 (BASFOR) vs -428.7 (BASFOR with T-sensitivity); BF = 0.33

6. Examples of BC & BMC in other sciences

6.1 Bayes in other disguises.
Linear regression using least squares = BC, except that uncertainty is ignored. Model: straight line; prior: uniform; likelihood: Gaussian (iid). Note: realising that LS-regression is a special case of BC opens up possibilities to improve on it, e.g. by putting more information in the prior or likelihood (Sivia 2005).
All maximum-likelihood estimation methods can be seen as limited forms of BC where the prior is ignored (uniform) and only the maximum value of the likelihood is identified (ignoring uncertainty).
Hierarchical modelling = BC, e.g. for spatiotemporal stochastic modelling with spatial correlations included in the prior.
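The first equivalence can be demonstrated numerically: with a uniform prior and a Gaussian iid likelihood, the posterior mode coincides with the least-squares fit. A sketch with invented data (the σ value and the grid are arbitrary choices of mine):

```python
import numpy as np

# Hypothetical data for a straight-line fit y = a + b*x
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# Ordinary least-squares estimates
A = np.vstack([np.ones_like(x), x]).T
(a_ls, b_ls), *_ = np.linalg.lstsq(A, y, rcond=None)

# Bayesian view: uniform prior, Gaussian iid likelihood with sigma = 0.5.
# log posterior = const - 0.5 * sum((y - a - b*x)^2) / sigma^2,
# so its maximum sits wherever the sum of squares is minimal.
a_grid = np.linspace(0, 2, 401)
b_grid = np.linspace(1, 3, 401)
aa, bb = np.meshgrid(a_grid, b_grid)
resid = y[None, None, :] - aa[..., None] - bb[..., None] * x[None, None, :]
log_post = -0.5 * np.sum(resid ** 2, axis=-1) / 0.5 ** 2
i, j = np.unravel_index(np.argmax(log_post), log_post.shape)
a_map, b_map = aa[i, j], bb[i, j]

print(a_ls, b_ls)    # least-squares estimates
print(a_map, b_map)  # posterior mode: same values (to grid resolution)
```

The choice of σ rescales the posterior but does not move its mode, which is why least-squares tuning finds the same parameters while discarding the uncertainty information.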

6.2 Bayes in other disguises (cont.) Inverse modelling (e.g. to estimate emission rates from concentrations) Geostatistics, e.g. Bayesian kriging Data Assimilation (KF, EnKF etc.)

6.3 Regional application of plot-scale models. Six upscaling methods, with their implications for model structure and modelling uncertainty:
1. Stratify into homogeneous subregions & apply. Model structure unchanged; P(θ) unchanged; upscaling uncertainty.
2. Apply to selected points (plots) & interpolate. Model unchanged (but extended with a geostatistical model); P(θ) unchanged (Bayesian kriging only); interpolation uncertainty.
3. Reinterpret the model as a regional one & apply. New BC using regional I-O data.
4. Summarise model behaviour & apply exhaustively (deterministic metamodel), e.g. a multivariate regression model or a simple mechanistic one. New BC of the metamodel needed, using plot data.
5. As 4, but with a stochastic emulator, e.g. a Gaussian process emulator. Code uncertainty (Kennedy & O'Hagan).
6. Summarise model behaviour & embed in a regional model. Unrelated new model.

7. References, Summary, Discussion

7.1 Bayesian methods: References.
Bayes' Theorem: Bayes, T. (1763)
MCMC: Metropolis, N. et al. (1953)
BMC: Kass & Raftery (1995)
Forest models: Green, E.J. / MacFarlane, D.W. / Valentine, H.T. / Strawderman, W.E. (1996, 1998, 1999, 2000)
Crop models: Jansen, M. (1997)
Probability theory: Jaynes, E.T. (2003)
Complex process-based models, MCMC: Van Oijen et al. (2005)

7.2 Discussion statements / Conclusions. Uncertainty (= incomplete information) is described by pdf's. Plausible reasoning implies probability theory (PT) (Cox, Jaynes). The main tool from PT for updating pdf's is Bayes' Theorem. Parameter estimation = quantifying the joint parameter pdf. Model evaluation = quantifying a pdf in model space, which requires at least two models.

7.2 Discussion statements / Conclusions. Uncertainty (= incomplete information) is described by pdf's. Plausible reasoning implies probability theory (PT) (Cox, Jaynes). The main tool from PT for updating pdf's is Bayes' Theorem. Parameter estimation = quantifying the joint parameter pdf → BC. Model evaluation = quantifying a pdf in model space, which requires at least two models → BMC. Practicalities: When new data arrive, MCMC provides a universal method for calculating posterior pdf's. Quantifying the prior is not a key issue in environmental science: (1) there are many data, and (2) the prior is the posterior from a previous calibration; MaxEnt can also be used (Jaynes). Defining the likelihood: a normal pdf for measurement error usually describes our prior state of knowledge adequately (Jaynes). The Bayes Factor shows how new data change the odds of models, and is a by-product of Bayesian calibration (Kass & Raftery). Overall: uncertainty quantification often shows that our models are not very reliable.

App2.1 How to do BC. The problem: you have (1) a prior pdf P(θ) for your model's parameters and (2) new data D, and you know how to calculate the likelihood P(D|θ). How do you use Bayes' Theorem to calculate the posterior P(θ|D)?
Methods:
Analytical: only works when the prior and likelihood are conjugate (family-related). For example, if prior and likelihood are normal pdf's, then the posterior is normal too.
Numerical (sampling), three main methods: MCMC (e.g. Metropolis, Gibbs), sampling directly from the posterior, best for high-dimensional problems; Accept-Reject, sampling from the prior and then rejecting some samples using the likelihood, best for low-dimensional problems; model emulation followed by MCMC or A-R.
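The Accept-Reject option can be sketched in a few lines for a toy one-dimensional problem (the prior, datum and σ are invented; a real application would compute the likelihood from model output):

```python
import numpy as np

rng = np.random.default_rng(1)

# Accept-Reject sampling from a posterior: draw candidates from the
# prior and accept each with probability L(theta) / L_max.
# Toy problem: prior theta ~ Uniform(0, 10); one observation d = 4.0
# with Gaussian measurement error sigma = 1.
d, sigma = 4.0, 1.0

def lik(theta):
    return np.exp(-0.5 * ((d - theta) / sigma) ** 2)

L_max = lik(d)  # the likelihood is maximal at theta = d

candidates = rng.uniform(0, 10, size=100000)   # sample from the prior
u = rng.uniform(0, 1, size=candidates.size)
posterior_sample = candidates[u < lik(candidates) / L_max]

print(posterior_sample.mean())   # close to 4.0
print(len(posterior_sample))     # roughly a quarter of the candidates
```

The acceptance rate here is about 25%, and it collapses as the number of parameters grows, which is why A-R is recommended only for low-dimensional problems.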

Should we measure the "sensitive parameters"?
Yes, because the sensitive parameters are obviously important for prediction?
No, because model parameters are correlated with each other (and we do not measure those correlations) and cannot really be measured at all.
So it may be better to measure output variables, because they: are what we are interested in; are better defined, in models and in measurements; help determine parameter correlations if used in Bayesian calibration.
Key question: what data are most informative?