Bayesian Estimation in MARK

Presentation transcript:

Bayesian Estimation in MARK Gary C. White

Bayes' Theorem Bayes' theorem relates the conditional and marginal probabilities of stochastic events A and B: Pr(A|B) = Pr(B|A) Pr(A) / Pr(B) (http://en.wikipedia.org/wiki/Bayes'_theorem)

Derivation From the definition of conditional probability, Pr(A|B) Pr(B) = Pr(A ∩ B) = Pr(B|A) Pr(A); dividing both sides by Pr(B) gives Bayes' theorem.

Example Two cookie bowls. Bowl 1: 10 chocolate-chip, 30 plain; Bowl 2: 20 chocolate-chip, 20 plain. Buck picks a plain cookie from one of the bowls, but which bowl? Pr(A) = Pr(Bowl 1) = 0.5, 1 − Pr(A) = Pr(Bowl 2) = 0.5 Pr(B) = Pr(plain cookie) = 50/80 = 0.625 Pr(B|A) = 30/40 = 0.75 Pr(A|B) = 0.75 × 0.5 / 0.625 = 0.6
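
A quick numeric check of the example (a minimal Python sketch; the numbers come straight from the slide):

```python
# Cookie-bowl example: probability the plain cookie came from Bowl 1.
p_bowl1 = 0.5                  # Pr(A): each bowl equally likely
p_plain_given_bowl1 = 30 / 40  # Pr(B|A): 30 plain of 40 cookies in Bowl 1
p_plain = 50 / 80              # Pr(B): 50 plain of 80 cookies overall

p_bowl1_given_plain = p_plain_given_bowl1 * p_bowl1 / p_plain
print(p_bowl1_given_plain)     # 0.6
```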

Components of Bayesian Inference Prior Distribution – use probability to quantify uncertainty about unknown quantities (parameters) Likelihood – relates all variables into a “full probability model” Posterior Distribution – result of using data to update information about unknown quantities (parameters)

Bayesian inference Prior information p(θ) on parameters θ Likelihood of data given parameter values f(y|θ)

Bayesian inference π(θ|y) = f(y|θ) p(θ) / ∫ f(y|θ) p(θ) dθ, or π(θ|y) ∝ f(y|θ) p(θ): the posterior distribution is proportional to likelihood × prior distribution.

Bayesian inference It is generally not necessary to compute the normalizing integral in the denominator: MCMC methods need the posterior only up to a constant of proportionality.
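
To make "posterior ∝ likelihood × prior" concrete, here is a minimal grid-approximation sketch (the binomial data are hypothetical, not from MARK):

```python
import numpy as np

# Posterior ∝ likelihood × prior, evaluated on a grid for a binomial
# problem: y survivors out of n marked animals, U(0,1) prior.
n, y = 40, 30                                   # hypothetical data
theta = np.linspace(0.001, 0.999, 999)          # grid over the parameter

prior = np.ones_like(theta)                     # p(theta): uniform
likelihood = theta**y * (1 - theta)**(n - y)    # f(y|theta), up to a constant
unnormalized = likelihood * prior               # all that MCMC ever needs

posterior = unnormalized / np.trapz(unnormalized, theta)  # normalize on the grid
print(theta[np.argmax(posterior)])              # posterior mode, near y/n = 0.75
```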

Metropolis-Hastings An algorithm that generates a sequence {θ(0), θ(1), θ(2), …} from a Markov chain whose stationary distribution is π(θ) (i.e., the posterior distribution). Fast computers and recognition of this algorithm have allowed Bayesian estimation to develop.

Metropolis-Hastings Start the Markov chain at an initial value θ(0). Propose a new value θ* from a proposal distribution q(θ*|θ(t)). Accept θ(t+1) = θ* with probability α = min{1, [π(θ*) q(θ(t)|θ*)] / [π(θ(t)) q(θ*|θ(t))]}; otherwise set θ(t+1) = θ(t).

Metropolis-Hastings
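
A minimal random-walk Metropolis sketch of the idea (hypothetical binomial target on the logit scale; MARK's actual implementation differs in detail):

```python
import numpy as np

rng = np.random.default_rng(1)
n, y = 40, 30                     # hypothetical data: y survivors of n animals

def log_posterior(beta):
    # Logit-scale parameter with a flat prior; binomial log-likelihood.
    theta = 1 / (1 + np.exp(-beta))
    return y * np.log(theta) + (n - y) * np.log(1 - theta)

beta = 0.0                        # theta(0): starting value
sd = 0.5                          # proposal SD (MARK tunes this automatically)
chain = []
for _ in range(11_000):
    proposal = beta + rng.normal(0.0, sd)    # symmetric proposal, so q cancels
    log_alpha = log_posterior(proposal) - log_posterior(beta)
    if np.log(rng.uniform()) < log_alpha:    # accept with probability min(1, alpha)
        beta = proposal
    chain.append(beta)

theta = 1 / (1 + np.exp(-np.array(chain[1_000:])))  # drop burn-in, back-transform
print(theta.mean())               # posterior mean survival, near 0.75
```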

MCMC Markov chain Monte Carlo. The sequence {θ(0), θ(1), θ(2), …} is a Markov chain, obtained by Monte Carlo simulation; in MARK, via the Metropolis-Hastings algorithm.

MARK – Defaults – Likelihood The likelihood is determined by the data type of the model – the same likelihood that is used to compute maximum likelihood estimates.

MARK – Prior Distributions It would be logical to use a U(0,1) distribution as the prior on the real scale. However, MARK estimates parameters on the beta scale and transforms them to the real scale. Hence, the prior distribution has to be placed on the beta parameter.

MARK – Defaults – Prior Distribution For the beta parameters with a logit link, a normal prior with mean 0 and SD 1.75 serves as an "uninformative" prior: it induces an approximately uniform distribution on the real (0,1) scale, as the sketch below shows.
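
The sense in which N(0, 1.75) is "uninformative" can be checked by simulation; pushed through the inverse-logit link it comes out close to flat on (0, 1):

```python
import numpy as np

rng = np.random.default_rng(2)
beta = rng.normal(0.0, 1.75, size=100_000)  # draws from the default prior
real = 1 / (1 + np.exp(-beta))              # inverse-logit to the real scale

density, _ = np.histogram(real, bins=10, range=(0, 1), density=True)
print(np.round(density, 2))                 # every bin near 1.0, i.e., roughly U(0,1)
```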

MARK – Defaults – Proposal Distribution Distribution used to propose new values: a normal increment with mean 0 (centered on the current value) and an SD estimated to give a 40–45% acceptance rate. That is, the SD is adjusted during the "tuning" phase until the new proposal is accepted 40–45% of the time.
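
A hypothetical sketch of the tuning logic (not MARK's actual code): after each batch of iterations, shrink or grow the proposal SD until the acceptance rate lands in the target window.

```python
def tune(sd, acceptance_rate, low=0.40, high=0.45):
    # Too few acceptances: steps are too big, shrink them.
    # Too many acceptances: steps are too timid, grow them.
    if acceptance_rate < low:
        return sd * 0.9
    if acceptance_rate > high:
        return sd * 1.1
    return sd

sd = 1.0
for rate in [0.20, 0.30, 0.38, 0.43]:  # acceptance rates from successive batches
    sd = tune(sd, rate)
    print(round(sd, 3))                # SD settles once the rate is in 0.40-0.45
```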

MARK Estimation Defaults Tuning phase – 4000 iterations Burn-in phase – 1000 iterations Sampling phase – 10000 iterations

MARK – Posterior Summaries Mean Median Mode Percentiles 2.5, 5, 10, 20, 50, 80, 90, 95, 97.5
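
Computed from the sampled chain, these summaries look like the following (a Python sketch with a stand-in chain; MARK computes them from its own MCMC output):

```python
import numpy as np

rng = np.random.default_rng(3)
chain = rng.beta(31, 11, size=10_000)       # stand-in for a real-scale MCMC sample

print("mean:  ", chain.mean())
print("median:", np.median(chain))
counts, edges = np.histogram(chain, bins=50)
print("mode:  ", edges[np.argmax(counts)])  # histogram-peak approximation
print("pctls: ", np.percentile(chain, [2.5, 5, 10, 20, 50, 80, 90, 95, 97.5]))
```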

MARK – Assessing Convergence Multiple chains The R statistic (Gelman-Rubin diagnostic), which compares within-chain variance to between-chain variance Graphical evaluation Histograms Trace plots of the chain
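
A sketch of the R statistic (the Gelman-Rubin diagnostic; values near 1 indicate the chains agree):

```python
import numpy as np

def gelman_rubin(chains):
    # chains: array of shape (m chains, n iterations)
    chains = np.asarray(chains)
    m, n = chains.shape
    B = n * chains.mean(axis=1).var(ddof=1)      # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()        # within-chain variance
    var_hat = (n - 1) / n * W + B / n            # pooled variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(4)
chains = rng.normal(0.0, 1.0, size=(4, 5_000))  # four well-mixed chains
print(gelman_rubin(chains))                      # close to 1.0
```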

Hyperdistributions Normal distribution from which a set of beta parameters on the logit scale are assumed to have been sampled. For example, annual survival rates Si, with logit(Si) = βi and βi ~ N(μ, σ²).
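
A simulation sketch of such a hyperdistribution (the values of μ and σ are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(5)
mu, sigma = 1.2, 0.4                    # hyperparameters on the logit scale
beta = rng.normal(mu, sigma, size=10)   # one beta parameter per year
survival = 1 / (1 + np.exp(-beta))      # back-transform to annual survival
print(np.round(survival, 3))            # rates scattered around 0.77
```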

Priors on hyperdistributions Prior on μ ~ N(0, 100), "uninformative" Prior on σ² ~ Inverse Gamma(0.001, 0.001), i.e., 1/σ² = τ ~ Gamma(0.001, 0.001)
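
With these priors the hyperparameter updates are conjugate. A sketch of the standard Gibbs steps given the current β values (textbook normal/gamma results, not MARK's exact code; 100 is taken to be the prior variance):

```python
import numpy as np

rng = np.random.default_rng(6)
beta = rng.normal(1.2, 0.4, size=10)   # current logit-scale parameters
n, tau = len(beta), 1.0                # tau = 1/sigma^2, current precision

# mu | beta, tau  with prior mu ~ N(0, 100): normal-normal conjugacy
prior_prec = 1 / 100
post_prec = prior_prec + n * tau
post_mean = n * tau * beta.mean() / post_prec   # prior mean 0 drops out
mu = rng.normal(post_mean, 1 / np.sqrt(post_prec))

# tau | beta, mu  with prior tau ~ Gamma(0.001, 0.001): gamma conjugacy
shape = 0.001 + n / 2
rate = 0.001 + 0.5 * np.sum((beta - mu) ** 2)
tau = rng.gamma(shape, 1 / rate)       # numpy parameterizes by scale = 1/rate
print(mu, 1 / np.sqrt(tau))            # updated mu and sigma
```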

Multivariate Hyperdistributions Joint distribution of 2 sets of parameters assumed to be multivariate normal, e.g., (β1i, β2i) ~ MVN(μ, Σ), where Σ holds the two variances and their correlation ρ. Prior on the correlation: ρ ~ Uniform(−1, 1).
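
A sketch of drawing linked parameter pairs from such a bivariate normal hyperdistribution, with hypothetical hyperparameter values:

```python
import numpy as np

rng = np.random.default_rng(7)
mu = np.array([1.2, -0.5])     # hyper-means of the two parameter sets (logit scale)
sd = np.array([0.4, 0.3])      # hyper-SDs
rho = 0.6                      # correlation; its prior is Uniform(-1, 1)

cov = np.array([[sd[0] ** 2,          rho * sd[0] * sd[1]],
                [rho * sd[0] * sd[1], sd[1] ** 2         ]])
beta = rng.multivariate_normal(mu, cov, size=10)   # one pair per year
real = 1 / (1 + np.exp(-beta))                     # back to the probability scale
print(np.round(np.corrcoef(real, rowvar=False)[0, 1], 2))  # induced correlation
```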