Bayes Factor Based on Han and Carlin (2001, JASA)

Model Selection
–Data: y
–M ∈ {1, …, K}: indexes a finite set of competing models
–θ_j: a distinct unknown parameter vector of dimension n_j corresponding to the jth model
–Prior: p(θ_j | M = j), defined over all possible values for θ_j
–Θ = (θ_1, …, θ_K): collection of all model-specific parameter vectors

Model Selection
Posterior probability of model j:
  p(M = j | y) = p(M = j) p(y | M = j) / Σ_k p(M = k) p(y | M = k)
–Can be used to select a single "best" model, or for model averaging
Bayes factor: choice between two models,
  BF_21 = p(y | M = 2) / p(y | M = 1)

Estimating the Marginal Likelihood
Marginal likelihood: p(y | M = j) = ∫ f(y | θ_j, M = j) p(θ_j | M = j) dθ_j
Estimation
–Ordinary Monte Carlo sampling: difficult to implement for high-dimensional models (see the sketch below)
–MCMC does not provide an estimate of the marginal likelihood directly
–Include the model indicator M as a parameter in the sampling:
 Product space search by Gibbs sampling
 Metropolis-Hastings
 Reversible jump MCMC
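To make the "ordinary Monte Carlo sampling" point concrete, here is a minimal sketch (not from the paper) for a toy one-parameter normal model: draw θ from its prior and average the likelihood. The data, the prior scale, and names such as `log_likelihood` are illustrative assumptions; the estimator is unbiased, but its variance degrades rapidly as the dimension of θ grows, which is why it is rarely usable for realistic models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 20 observations from a normal model with unknown mean theta
y = rng.normal(loc=1.0, scale=1.0, size=20)

def log_likelihood(theta, y):
    """Gaussian log-likelihood with known unit variance and unknown mean theta."""
    return -0.5 * np.sum((y - theta) ** 2) - 0.5 * len(y) * np.log(2 * np.pi)

# Ordinary Monte Carlo: p(y | M) = E_prior[ f(y | theta, M) ],
# so draw theta from the prior and average the likelihood.
n_draws = 100_000
theta_draws = rng.normal(loc=0.0, scale=10.0, size=n_draws)   # prior: N(0, 10^2)
log_liks = np.array([log_likelihood(t, y) for t in theta_draws])

# Average on the log scale (log-sum-exp) for numerical stability
log_marginal = np.logaddexp.reduce(log_liks) - np.log(n_draws)
print("log p(y | M) estimate:", log_marginal)
```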

Product Space Search
Carlin and Chib (1995, JRSSB)
–Data likelihood of model j: f(y | θ_j, M = j)
–Prior of model j: p(θ_j | M = j)
Assumptions:
–M is merely an indicator of which θ_j is relevant to y
–y is independent of {θ_k : k ≠ j} given the model indicator M = j
–Proper priors are required
–Prior independence among the θ_j given M, i.e. p(Θ | M = j) = Π_k p(θ_k | M = j); the densities p(θ_k | M = j), k ≠ j, are the "pseudo-priors" (linking densities)

Product Space Search
–The sampler operates over the product space Θ_1 × … × Θ_K × {1, …, K}
–Marginal likelihood: p(y | M = j) = ∫ f(y | θ_j, M = j) p(θ_j | M = j) dθ_j
–Remark: the pseudo-priors p(θ_k | M = j), k ≠ j, cancel out of the marginal likelihood; they affect only the efficiency of the sampler

Search by Gibbs Sampler
At each iteration, given the current model indicator M = j (a toy sketch follows):
–Draw θ_j from its full conditional, proportional to f(y | θ_j, M = j) p(θ_j | M = j)
–Draw each θ_k, k ≠ j, from its pseudo-prior p(θ_k | M = j)
–Draw the model indicator from p(M = j | Θ, y) ∝ f(y | θ_j, M = j) [Π_k p(θ_k | M = j)] p(M = j)
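A toy sketch of the Carlin-Chib product-space Gibbs sampler, assuming two single-parameter regression models with conjugate normal priors so every full conditional is available in closed form. The simulated data, priors, pseudo-prior, and all variable names are illustrative assumptions, not taken from Han and Carlin's example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: two competing single-covariate regressions through the origin
n = 40
x = rng.normal(size=n)                        # covariate for Model 0
z = 0.7 * x + rng.normal(scale=0.7, size=n)   # covariate for Model 1
y = 1.5 * z + rng.normal(size=n)              # data generated under Model 1

sigma2 = 1.0                        # known error variance in both models
tau2 = 100.0                        # prior variance of each slope: N(0, tau2)
pseudo_mu, pseudo_var = 0.0, 1.0    # common pseudo-prior (ideally tuned per model)
prior_m = np.array([0.5, 0.5])      # prior model probabilities
covs = [x, z]

def loglik(theta, covariate):
    return -0.5 * np.sum((y - theta * covariate) ** 2) / sigma2

def log_prior(theta):
    return -0.5 * theta ** 2 / tau2 - 0.5 * np.log(2 * np.pi * tau2)

def log_pseudo(theta):
    return -0.5 * (theta - pseudo_mu) ** 2 / pseudo_var - 0.5 * np.log(2 * np.pi * pseudo_var)

def draw_full_conditional(covariate):
    # Conjugate normal update for the slope (known variance, N(0, tau2) prior)
    prec = covariate @ covariate / sigma2 + 1.0 / tau2
    mean = (covariate @ y / sigma2) / prec
    return rng.normal(mean, np.sqrt(1.0 / prec))

n_iter = 20_000
M = 0                                 # current model indicator (0-based here)
theta = np.zeros(2)                   # (theta_0, theta_1)
M_chain = np.empty(n_iter, dtype=int)

for t in range(n_iter):
    theta[M] = draw_full_conditional(covs[M])                    # current model: full conditional
    theta[1 - M] = rng.normal(pseudo_mu, np.sqrt(pseudo_var))    # other model: pseudo-prior
    # Model indicator: p(M=j | Theta, y) ∝ f(y|theta_j, j) p(theta_j|j) p(theta_{1-j}|j) p(M=j)
    logw = np.array([loglik(theta[j], covs[j]) + log_prior(theta[j])
                     + log_pseudo(theta[1 - j]) + np.log(prior_m[j]) for j in (0, 1)])
    w = np.exp(logw - logw.max())
    M = rng.choice(2, p=w / w.sum())
    M_chain[t] = M

post = np.bincount(M_chain, minlength=2) / n_iter
print("posterior model probabilities:", post)
# Bayes factor of Model 1 vs Model 0 (0-based indexing in this toy)
print("estimated BF_10:", post[1] / post[0] * (prior_m[0] / prior_m[1]))
```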

Bayes Factor
Provided the sampling chain for the model indicator mixes well, the posterior probability of model j can be estimated by
  p̂(M = j | y) = (1/T) Σ_t 1{M^(t) = j},
and the Bayes factor is estimated by
  BF̂_21 = [p̂(M = 2 | y) / p̂(M = 1 | y)] × [p(M = 1) / p(M = 2)]

Choice of Prior Model Probabilities
In general, the p(M = j) can be chosen arbitrarily
–Their effect is divided out of the estimate of the Bayes factor
Often they are chosen so that the algorithm visits each model in roughly equal proportion
–Allows a more accurate estimate of the Bayes factor
–Preliminary runs are needed to select computationally efficient values

More Remarks
Performance of this method is optimized when the pseudo-priors match the corresponding model-specific posteriors p(θ_j | y, M = j) as nearly as possible
Drawbacks of the method
–A draw must be made from every pseudo-prior at each iteration to produce acceptably accurate results
–If a large number of models is considered, the method becomes impractical

Metropolized Product Space Search
Dellaportas, P., Forster, J.J., and Ntzoufras, I. (2002). On Bayesian Model and Variable Selection Using MCMC. Statistics and Computing, 12.
–A hybrid Gibbs-Metropolis strategy
–The model selection step is based on a proposal that moves between models

Metropolized Carlin and Chib (MCC)
At each iteration, with current state (M = j, θ_j):
–Propose a move to model j′ with probability h(j, j′)
–Draw θ_j′ from its pseudo-prior p(θ_j′ | M = j)
–Accept the move with probability
  α = min{1, [f(y | θ_j′, M = j′) p(θ_j′ | M = j′) p(θ_j | M = j′) p(M = j′) h(j′, j)] / [f(y | θ_j, M = j) p(θ_j | M = j) p(θ_j′ | M = j) p(M = j) h(j, j′)]}

Advantage
Only needs to sample from the pseudo-prior of the proposed model's parameters at each iteration, rather than from every pseudo-prior as in the Gibbs-based product space search (a sketch of one MCC step follows)
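A generic sketch of one MCC model-move step, written under the simplifying assumption that the pseudo-prior for each θ_k does not depend on which model is currently occupied. The `models` dictionary interface and all names are illustrative, not an interface from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

def mcc_model_step(j, theta_j, models, h, prior_m):
    """One Metropolized Carlin-and-Chib model move (a generic sketch).

    models[k] is a dict of callables (assumed interface, for illustration):
      'loglik'(theta)      -- log f(y | theta, M=k)
      'logprior'(theta)    -- log p(theta | M=k)
      'logpseudo'(theta)   -- log pseudo-prior for theta_k (assumed not to
                              depend on which model is current)
      'draw_pseudo'()      -- a draw from that pseudo-prior
    h is the matrix of between-model proposal probabilities h[j][j'],
    prior_m the prior model probabilities.
    """
    K = len(models)
    jp = rng.choice(K, p=h[j])                 # propose model j'
    theta_jp = models[jp]['draw_pseudo']()     # draw theta_j' from its pseudo-prior

    log_num = (models[jp]['loglik'](theta_jp) + models[jp]['logprior'](theta_jp)
               + models[j]['logpseudo'](theta_j)
               + np.log(prior_m[jp]) + np.log(h[jp][j]))
    log_den = (models[j]['loglik'](theta_j) + models[j]['logprior'](theta_j)
               + models[jp]['logpseudo'](theta_jp)
               + np.log(prior_m[j]) + np.log(h[j][jp]))

    # Accept with probability min(1, exp(log_num - log_den))
    if np.log(rng.uniform()) < log_num - log_den:
        return jp, theta_jp
    return j, theta_j
```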

Reversible Jump MCMC
Green (1995, Biometrika)
–This method operates on the union space ∪_j ({j} × Θ_j)
–It generates a Markov chain that can jump between models whose parameter spaces have different dimensions

RJMCMC
At each iteration, with current state (M = j, θ_j):
–Propose a new model j′ with probability h(j, j′)
–Generate u from a proposal density q_{j,j′}(u)
–Set (θ_j′, u′) = g_{j,j′}(θ_j, u), where g_{j,j′} is a deterministic bijection and dimension matching requires n_j + dim(u) = n_j′ + dim(u′)
–Accept the move with the probability given below; otherwise remain at (M = j, θ_j)
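For reference, Green's acceptance probability for this move, in the notation above (h for the model proposal probabilities, q_{j,j′} for the auxiliary-variable density, and the Jacobian of the bijection g_{j,j′}):

```latex
\alpha = \min\!\left\{ 1,\;
  \frac{f(y \mid \theta_{j'}, M=j')\, p(\theta_{j'} \mid M=j')\, p(M=j')\, h(j',j)\, q_{j',j}(u')}
       {f(y \mid \theta_{j},  M=j )\, p(\theta_{j}  \mid M=j )\, p(M=j )\, h(j,j')\, q_{j,j'}(u)}
  \left| \frac{\partial(\theta_{j'}, u')}{\partial(\theta_{j}, u)} \right|
\right\}
```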

Using Partial Analytic Structure (PAS)
Godsill (2001, JCGS)
–Similar setup to the CC method, but allows parameters to be shared between different models
–Avoids dimension matching

PAS

Marginal likelihood estimation (Chib 1995)

Let θ* denote a fixed point of high posterior density (e.g., the posterior mean or mode). The basic marginal likelihood identity is
  m(y) = f(y | θ*) π(θ*) / π(θ* | y)
When all full conditional distributions for the parameters are in closed form, the posterior ordinate π(θ* | y) can be estimated from Gibbs sampler output.

Chib (1995)
For a two-block Gibbs sampler with θ = (θ_1, θ_2), write π(θ* | y) = π(θ_2* | y, θ_1*) π(θ_1* | y), so that
  log m̂(y) = log f(y | θ*) + log π(θ*) − log π(θ_2* | y, θ_1*) − log π̂(θ_1* | y)
–The first three terms on the right side are available in closed form
–The last term on the right side can be estimated from the Gibbs output by Rao-Blackwellization:
  π̂(θ_1* | y) = (1/T) Σ_t π(θ_1* | y, θ_2^(t))
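A generic sketch of the resulting estimator for a two-block sampler. The callable names and their signatures are assumptions made for illustration, not an interface from the paper.

```python
import numpy as np

def chib_log_marglik(log_lik, log_prior, log_fc_theta2, log_fc_theta1,
                     theta1_star, theta2_star, theta2_draws):
    """Chib (1995) marginal likelihood estimate for a two-block Gibbs sampler.

    Assumed (illustrative) interface:
      log_lik(t1, t2)        : log f(y | t1, t2)
      log_prior(t1, t2)      : log pi(t1, t2)
      log_fc_theta2(t2, t1)  : log pi(t2 | y, t1)   (closed-form full conditional)
      log_fc_theta1(t1, t2)  : log pi(t1 | y, t2)   (closed-form full conditional)
      theta1_star, theta2_star : high-posterior-density point (e.g. posterior means)
      theta2_draws           : Gibbs draws of theta_2 from pi(theta_2 | y)
    """
    # Rao-Blackwellized estimate of the posterior ordinate pi(theta_1* | y)
    log_ords = np.array([log_fc_theta1(theta1_star, t2) for t2 in theta2_draws])
    log_post_theta1 = np.logaddexp.reduce(log_ords) - np.log(len(log_ords))

    return (log_lik(theta1_star, theta2_star)
            + log_prior(theta1_star, theta2_star)
            - log_fc_theta2(theta2_star, theta1_star)
            - log_post_theta1)
```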

Chib and Jeliazkov (2001)
–Estimating the last term as above requires the full conditional to be known in closed form, including its normalizing constant
–Hence the approach is not directly applicable to Metropolis-Hastings output
Let the Metropolis-Hastings acceptance probability for a move from θ to θ′, with proposal density q, be
  α(θ, θ′ | y) = min{1, [f(y | θ′) π(θ′) q(θ | θ′)] / [f(y | θ) π(θ) q(θ′ | θ)]}

Chib and Jeliazkov (2001)
The posterior ordinate can then be written as
  π(θ* | y) = E_{π(θ | y)}[α(θ, θ* | y) q(θ* | θ)] / E_{q(θ | θ*)}[α(θ*, θ | y)],
where the numerator expectation is over the posterior and the denominator expectation is over the proposal q(· | θ*); both are estimated by Monte Carlo averages over the MH output and over draws from q(· | θ*), respectively
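A matching sketch of the Chib-Jeliazkov posterior-ordinate estimate; again the callables and their signatures are illustrative assumptions. The marginal likelihood then follows as log m̂(y) = log f(y | θ*) + log π(θ*) minus this ordinate.

```python
import numpy as np

def cj_log_posterior_ordinate(log_alpha, log_q, theta_star,
                              posterior_draws, proposal_draws):
    """Chib-Jeliazkov (2001) estimate of log pi(theta* | y) from MH output.

    Assumed (illustrative) interface:
      log_alpha(a, b)  : log MH acceptance probability for a move from a to b
      log_q(a, b)      : log proposal density q(b | a)
      posterior_draws  : draws theta^(m) from the posterior (MH chain)
      proposal_draws   : draws theta^(l) from q(. | theta_star)
    """
    num = np.array([log_alpha(t, theta_star) + log_q(t, theta_star)
                    for t in posterior_draws])
    den = np.array([log_alpha(theta_star, t) for t in proposal_draws])
    log_num = np.logaddexp.reduce(num) - np.log(len(num))
    log_den = np.logaddexp.reduce(den) - np.log(len(den))
    return log_num - log_den
```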

Example: linear regression model

Two models: two linear regression models of the same dimension, each with its own regression coefficients and error variance

Priors

For MCC, the between-model proposal probabilities are set to h(1,1) = h(1,2) = h(2,1) = h(2,2) = 0.5

For RJMCMC, dimension matching is automatically satisfied because the two models have parameter vectors of the same dimension

Chib’s method Two block gibbs sampler –(regression coefficients, variance)

Results
By numerical integration, the true Bayes factor is 4862 in favor of Model 2