Uncertainty Quantification and Bayesian Model Averaging

Presentation transcript:

Uncertainty Quantification and Bayesian Model Averaging
Witold Nazarewicz (FRIB/MSU)
DOE topical collaboration "Nuclear Theory for Double-Beta Decay and Fundamental Symmetries", February 3-4, 2017, ACFI, UMass Amherst

Outline:
- Perspective
- UQ: statistical aspects
- UQ: systematic aspects
- Bayesian Model Selection and Averaging
- Conclusions and Homework

Current 0νββ predictions
"There is generally significant variation among different calculations of the nuclear matrix elements for a given isotope. For consideration of future experiments and their projected sensitivity it would be very desirable to reduce the uncertainty in these nuclear matrix elements." (Neutrinoless Double Beta Decay NSAC Report 2014)
- Low-resolution and high-resolution models
- Global and local models
- Based on very different assumptions
- Fitted to vastly different observables
- No uncertainties are provided!

The promise... (in our proposal)
The tools that support much of this work have been or are being developed through the SciDAC NUCLEI collaboration. Our collaboration contains all the expertise needed to fully apply those tools, which include innovative methods for estimating uncertainty, to double-beta decay matrix elements. Accurate nuclear matrix elements with quantified uncertainty are perhaps the most urgent project for our collaboration. Our collaboration intends to reduce the uncertainty considerably.

Benchmarking and Uncertainty Quantification
To have confidence in our predictions, we need to quantify both systematic and statistical error. At present, systematic errors on nuclear matrix elements dominate statistical errors, and assigning uncertainty is difficult.

How will the procedure work in detail? First, with each method we will calculate observables that can be compared with experiment: spectra and transitions... All this benchmarking concerns the difficult systematic error. We will also address statistical error, which reflects the degree of variation in the predictions of a single model with parameters that are fit to large amounts of experimental data, and which, fortunately, is easier to assess. Nuclear physicists typically apply linear regression and/or Bayesian inference to quantify statistical error. If one knows the covariance matrix or posterior distribution of model parameters, it is straightforward to estimate an associated error for any observable...

Consider a model described by coupling constants q = (q1, q2, ..., qk). Any predicted expectation value of an observable Yi is a function of these parameters. Since the model space has been optimized to a limited set of observables, there may also exist correlations between model parameters.

Note that a model is defined through:
- mathematical framework (equations, approximations...)
- parameters/coupling constants and active space
- fit-observables

Objective function: χ²(q) = Σ_i [y_i(q) − y_i^exp]² / (Δy_i)², where y_i(q) are the model predictions, y_i^exp are the fit-observables (which may include pseudo-data), and Δy_i are the expected uncertainties.
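As a toy illustration of such an objective function (not the collaboration's actual workflow), the sketch below fits a hypothetical two-parameter model to pseudo-data by minimizing χ² with SciPy; the model, data, and uncertainties are all made up for the example.

```python
import numpy as np
from scipy.optimize import minimize

# Toy "model": predictions y_i(q) as a function of couplings q
# (hypothetical stand-in for a real nuclear-structure calculation).
def model_predictions(q, x):
    return q[0] * x + q[1] * np.sqrt(x)

# Pseudo fit-observables y_i^exp with expected uncertainties dy_i.
x = np.linspace(1.0, 10.0, 20)
rng = np.random.default_rng(0)
y_exp = model_predictions([2.0, -1.0], x) + rng.normal(0.0, 0.1, x.size)
dy = np.full_like(x, 0.1)

def chi2(q):
    """Objective function: sum_i [y_i(q) - y_i^exp]^2 / (dy_i)^2."""
    r = (model_predictions(q, x) - y_exp) / dy
    return np.sum(r**2)

# Maximum-likelihood (least-squares) estimate of the couplings.
fit = minimize(chi2, x0=[1.0, 0.0])
q_mle = fit.x
print("MLE couplings:", q_mle, " chi2/dof:", fit.fun / (x.size - q_mle.size))
```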

Parameter estimation and the set of fit-observables.
[Figure: models M1, M2, M3, ..., Ma, each optimized to its own set of fit-observables Ya via a maximum likelihood estimate (MLE); a common dataset satisfies Y ⊂ Ya ⊂ Ytot, where Ytot denotes the set of all observables.]

[Figure: maximum likelihood estimates of the coupling constants q1(SM), q2(SM), q3(SM), ...]

Minimal nuclear theorist's approach to a statistical model error estimate

Statistical uncertainty in variable A (linear error propagation through the covariance matrix C of the model parameters):
(ΔA)² = Σ_ij (∂A/∂q_i) C_ij (∂A/∂q_j)

Correlation between variables A and B:
cov(A, B) = Σ_ij (∂A/∂q_i) C_ij (∂B/∂q_j)

Product-moment correlation coefficient between two observables/variables A and B:
c_AB = |cov(A, B)| / (ΔA ΔB)
- c_AB = 1: full alignment/correlation
- c_AB = 0: not aligned/statistically independent
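A minimal sketch of this recipe, assuming the parameter covariance matrix C and the numerical gradients of two observables A and B with respect to the parameters have already been obtained from a fit (all arrays below are hypothetical placeholders):

```python
import numpy as np

# Hypothetical ingredients from a completed fit:
# C      - covariance matrix of the k model parameters
# dA_dq  - gradient of observable A with respect to the parameters
# dB_dq  - gradient of observable B with respect to the parameters
C = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, -0.02],
              [0.00, -0.02, 0.16]])
dA_dq = np.array([1.0, -0.5, 0.2])
dB_dq = np.array([0.8, 0.3, -0.1])

def propagated(grad_a, grad_b, cov):
    """sum_ij (dA/dq_i) C_ij (dB/dq_j): covariance of A and B induced by parameter uncertainties."""
    return grad_a @ cov @ grad_b

var_A = propagated(dA_dq, dA_dq, C)        # (Delta A)^2
var_B = propagated(dB_dq, dB_dq, C)        # (Delta B)^2
cov_AB = propagated(dA_dq, dB_dq, C)

c_AB = abs(cov_AB) / np.sqrt(var_A * var_B)  # product-moment correlation coefficient
print(f"Delta A = {np.sqrt(var_A):.3f}, Delta B = {np.sqrt(var_B):.3f}, c_AB = {c_AB:.3f}")
```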

Nuclear charge and neutron radii and nuclear matter: trend analysis in the Skyrme-DFT approach
P.-G. Reinhard and WN, PRC 93, 051303(R) (2016)
14-parameter model, optimized to two different sets of fit-observables: (Y = E, R) and (Y = E).
[Figure: some parameter-space directions are stiff (well constrained by the fit-observables), others are sloppy (poorly constrained).]

P.-G. Reinhard and WN, PRC 93, 051303(R) (2016)
[Figure: comparison of the SV-E and SV-min parameterizations; the lengths of the plotted variations represent the magnitude of the corresponding variations.]
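In this language, stiff directions in parameter space are those along which χ² rises steeply, while sloppy directions leave χ² nearly unchanged. A hedged sketch of how one might diagnose this from the eigenvalue spectrum of the Hessian of the objective function (the matrix below is a made-up placeholder, not from the cited fit):

```python
import numpy as np

# Hypothetical Hessian of chi^2 at the optimum (second derivatives with respect to the parameters).
H = np.array([[120.0, 30.0, 5.0],
              [ 30.0, 12.0, 2.0],
              [  5.0,  2.0, 0.5]])

# Symmetric eigendecomposition: large eigenvalues correspond to stiff directions,
# small eigenvalues to sloppy directions.
eigvals, eigvecs = np.linalg.eigh(H)
order = np.argsort(eigvals)[::-1]            # from stiffest to sloppiest
for i in order:
    lam, vec = eigvals[i], eigvecs[:, i]
    label = "stiff" if lam > 1.0 else "sloppy"   # ad hoc threshold, for illustration only
    print(f"lambda = {lam:10.3f}  ({label})  direction = {np.round(vec, 2)}")
```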

Uncertainty Quantification for Nuclear Density Functional Theory and Information Content of New Measurements, J. McDonnell et al., Phys. Rev. Lett. 114, 122501 (2015). Pilot study applied to UNEDF1:
- Massively parallel approach
- Uniform priors with bounds
- 130 data points (including deformed nuclei)
- Gaussian-process response surface
- 200 test parameter sets (Latin hypercube design)
- No improvement in the model's predictive power over UNEDF1, except for postdictions of additional data (UNEDF1CPT)
See also: Higdon et al., A Bayesian Approach for Parameter Estimation and Prediction Using a Computationally Intensive Model, J. Phys. G 42, 034009 (2015)
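To make the "Gaussian-process response surface" ingredient concrete, here is a generic sketch (not the actual UNEDF1 pipeline) of training a GP emulator on a handful of expensive model evaluations and querying it with predictive uncertainty; the design points and the toy observable are invented, and scikit-learn is used purely for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Hypothetical design: a few parameter sets (rows) at which the expensive model was run.
rng = np.random.default_rng(1)
X_train = rng.uniform(0.0, 1.0, size=(25, 3))            # 25 samples of 3 scaled parameters
y_train = np.sin(3 * X_train[:, 0]) + X_train[:, 1]**2   # stand-in for a computed observable

# Gaussian-process emulator ("response surface") of the observable.
kernel = ConstantKernel(1.0) * RBF(length_scale=[0.3, 0.3, 0.3])
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp.fit(X_train, y_train)

# Cheap surrogate predictions (with uncertainty) at new parameter sets.
X_new = rng.uniform(0.0, 1.0, size=(5, 3))
mean, std = gp.predict(X_new, return_std=True)
for m, s in zip(mean, std):
    print(f"emulated observable = {m:.3f} +/- {s:.3f}")
```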

Naïve nuclear theorist's approach to a systematic (model) error estimate:
- Take a set of reasonable models Mi
- Make a prediction E(y; Mi) = ŷi
- Compute the average and variation within this set
- Compute the rms deviation from existing experimental data
If the number of fit-observables is large, the statistical error is small and the error is predominantly systematic.
Can we do better? Yes!
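A bare-bones sketch of this recipe, with made-up numbers standing in for the predictions ŷi of a handful of models and for the experimental values used in the rms comparison:

```python
import numpy as np

# Hypothetical predictions y_hat_i of the same observable from several "reasonable" models M_i.
y_hat = np.array([4.1, 5.3, 3.8, 4.9, 5.6])

y_bar = y_hat.mean()            # model-averaged prediction
spread = y_hat.std(ddof=1)      # variation within the model set (crude systematic error)

# Where data exist, each model's rms deviation from experiment gauges its quality.
y_exp = np.array([4.5, 4.4, 4.6])                 # hypothetical measured values
y_models = np.array([[4.2, 4.0, 4.4],             # model 1 postdictions
                     [5.0, 4.9, 5.2],             # model 2 postdictions
                     [4.4, 4.3, 4.5]])            # model 3 postdictions
rms = np.sqrt(((y_models - y_exp)**2).mean(axis=1))
print(f"average = {y_bar:.2f}, spread = {spread:.2f}, rms deviations = {np.round(rms, 2)}")
```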

Bayesian Model Averaging (BMA)

Posterior distribution of the quantity of interest y, given the fit-observables (data) Y and the models M1, ..., MK considered:
p(y|Y) = Σ_k p(y|M_k, Y) p(M_k|Y),
where p(y|M_k, Y) is the posterior distribution of y under each model and p(M_k|Y) is the posterior probability of a model.

The posterior probability for model M_k:
p(M_k|Y) = p(Y|M_k) p(M_k) / Σ_l p(Y|M_l) p(M_l),
where p(M_k) is the prior probability that the model M_k is true (!!!) and p(Y|M_k) is the marginal density of the data, i.e., the integrated likelihood of model M_k:
p(Y|M_k) = ∫ p(Y|q_k, M_k) p(q_k|M_k) dq_k,
with p(Y|q_k, M_k) the likelihood and p(q_k|M_k) the prior distribution of parameters.

Model selection and Bayes Factor (BF)
The BF can be used to decide which of two models is more likely given the outcome (data) Y:
B_kl = p(Y|M_k) / p(Y|M_l); B_kl > 1 favors model M_k over M_l.
The posterior mean and variance of y are:
E[y|Y] = Σ_k E[y|Y, M_k] p(M_k|Y),
Var[y|Y] = Σ_k (Var[y|Y, M_k] + E[y|Y, M_k]²) p(M_k|Y) − E[y|Y]².
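A schematic sketch of these averaging formulas, assuming each model's evidence p(Y|M_k), prediction mean, and prediction variance have already been computed elsewhere (the numbers below are placeholders):

```python
import numpy as np

# Hypothetical per-model ingredients (computed elsewhere):
evidence = np.array([1.2e-4, 4.0e-5, 9.0e-5])   # marginal likelihoods p(Y|M_k)
prior    = np.array([1/3, 1/3, 1/3])            # prior model probabilities p(M_k)
mean_k   = np.array([4.2, 5.1, 4.6])            # E[y | Y, M_k]
var_k    = np.array([0.10, 0.25, 0.15])         # Var[y | Y, M_k]

# Posterior model probabilities p(M_k|Y) proportional to p(Y|M_k) p(M_k).
w = evidence * prior
w /= w.sum()

# BMA posterior mean and variance of y.
y_mean = np.sum(w * mean_k)
y_var = np.sum(w * (var_k + mean_k**2)) - y_mean**2

# Bayes factor between the first two models.
B_12 = evidence[0] / evidence[1]
print(f"weights = {np.round(w, 3)}, E[y|Y] = {y_mean:.3f}, Var[y|Y] = {y_var:.3f}, B_12 = {B_12:.2f}")
```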

What is required?
- A common dataset Y (as large as possible) needs to be defined.
- Statistical analysis for the individual models needs to be carried out: priors, posteriors, and likelihoods determined.
- Individual model predictions carried out, including statistical uncertainties.
- A decision should be made on the prior model probability p(Mk).

ISNET website: http://iopscience.iop.org/journal/0954-3899/page/ISNET
INT Program INT-16-2a: http://www.int.washington.edu/PROGRAMS/16-2a/
References: http://bayesint.github.io/references.html
Bayesian Model Averaging refs: http://www.stat.washington.edu/raftery/Research/bma.html
Hoeting BMA review: http://www.stat.washington.edu/www/research/online/hoeting1999.pdf
Wasserman BMA review: https://pdfs.semanticscholar.org/207c/cd4e6514824bf489362799bc138e1ec8ac44.pdf

From Hoeting and Wasserman:
- When faced with several candidate models, the analyst can either choose one model or average over the models. Bayesian methods provide a set of tools for these problems. Bayesian methods also give us a numerical measure of the relative evidence in favor of competing theories.
- Model selection refers to the problem of using the data to select one model from the list of candidate models. Model averaging refers to the process of estimating some quantity under each model and then averaging the estimates according to how likely each model is.
- Bayesian model selection and model averaging is a conceptually simple, unified approach. An intrinsic Bayes factor might also be a useful approach.
- There is no need to choose one model. It is possible to average the predictions from several models. Simulation methods make it feasible to compute posterior probabilities in many problems.
- It should be emphasized that BMA should not be used as an excuse for poor science... BMA is useful after careful scientific analysis of the problem at hand. Indeed, BMA offers one more tool in the toolbox of applied statisticians for improved data analysis and interpretation.