Bayesian inference combines the likelihood of the observed data with prior probabilities to yield a posterior probability distribution over model parameters (in contrast with maximum likelihood, which seeks only the parameter values that give the observed data the greatest probability).

Presentation transcript:

Bayesian inference combines the likelihood of the observed data with prior probabilities to yield a posterior probability distribution over model parameters (in contrast with maximum likelihood, which seeks only the parameter values that give the observed data the greatest probability).

Thomas Bayes (1701?–1761). Bayesian methods were invented in the 18th century, but their application in phylogenetics dates only from the 1990s.

Bayes’ theorem: Bayes’ theorem links a conditional probability to its inverse:
Prob(H|D) = Prob(H) Prob(D|H) / ∑_H Prob(H) Prob(D|H)

Bayes’ theorem: in the case of two alternative hypotheses, the theorem
Prob(H|D) = Prob(H) Prob(D|H) / ∑_H Prob(H) Prob(D|H)
can be written as
Prob(H1|D) = Prob(H1) Prob(D|H1) / [Prob(H1) Prob(D|H1) + Prob(H2) Prob(D|H2)]

Bayes’ theorem: Bayes for Smarties. The data D are five smarties drawn from one of two bags: four orange and one blue.
H1 = D came from the mainly orange bag (¾ orange, ¼ blue); H2 = D came from the mainly blue bag (¼ orange, ¾ blue).
Prob(D|H1) = ¾ · ¾ · ¾ · ¾ · ¼ · 5 = 405/1024
Prob(D|H2) = ¼ · ¼ · ¼ · ¼ · ¾ · 5 = 15/1024
Prob(H1) = Prob(H2) = ½
Prob(H1|D) = Prob(H1) Prob(D|H1) / [Prob(H1) Prob(D|H1) + Prob(H2) Prob(D|H2)]
           = (½ · 405/1024) / (½ · 405/1024 + ½ · 15/1024) = 405/420 ≈ 0.96
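
A minimal Python sketch of this calculation (not part of the original slides; the hypothesis labels and the likelihood helper are illustrative, and the bag compositions of ¾ and ¼ orange are the ones implied by the slide's formula):

```python
from fractions import Fraction

# Assumed bag compositions: the "mainly orange" bag yields orange with probability 3/4,
# the "mainly blue" bag with probability 1/4.
p_orange = {"H1_mainly_orange": Fraction(3, 4), "H2_mainly_blue": Fraction(1, 4)}
prior = {"H1_mainly_orange": Fraction(1, 2), "H2_mainly_blue": Fraction(1, 2)}

# Observed draw D: 4 orange and 1 blue smartie (5 orderings of the single blue one).
def likelihood(p):
    return 5 * p**4 * (1 - p)

unnormalized = {h: prior[h] * likelihood(p_orange[h]) for h in prior}
normalizer = sum(unnormalized.values())                    # Prob(D)
posterior = {h: v / normalizer for h, v in unnormalized.items()}
print(posterior)                                           # H1: 27/28, H2: 1/28
```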

Bayes’ theorem: a priori knowledge can affect one’s conclusions.
Possible outcomes of a diagnostic test:
              positive test result    negative test result
  ill         true positive           false negative
  healthy     false positive          true negative
Test characteristics:
              positive test result    negative test result
  ill         99%                     1%
  healthy     0.1%                    99.9%
Using the data only, P(ill | positive test result) ≈ 0.99.

Bayes’ theorem: a priori knowledge can affect one’s conclusions.
              positive test result    negative test result
  ill         99%                     1%
  healthy     0.1%                    99.9%
A priori knowledge: 0.1% of the population (n = 100 000) is ill.
                      positive test result    negative test result
  ill (100)           99                      1
  healthy (99 900)    ~100                    ~99 800
With this a priori knowledge, about 99 of the roughly 199 persons with a positive test result are ill, so P(ill | positive result) ≈ 50%.
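
The same calculation for the diagnostic-test example, as a short Python sketch (not from the slides; the variable names are invented, and the numbers are the ones assumed on the slide):

```python
p_ill = 0.001                  # a priori knowledge: 0.1% of the population is ill
p_pos_given_ill = 0.99         # test sensitivity
p_pos_given_healthy = 0.001    # false-positive rate

# Bayes' theorem: P(ill | positive) = P(positive | ill) P(ill) / P(positive)
p_pos = p_pos_given_ill * p_ill + p_pos_given_healthy * (1 - p_ill)
p_ill_given_pos = p_pos_given_ill * p_ill / p_pos
print(round(p_ill_given_pos, 3))   # ~0.498, i.e. about 50%
```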

Bayes’ theorem: a priori knowledge can affect one’s conclusions, as the Monty Hall problem on the following slides illustrates.

Bayes’ theorem: a priori knowledge can affect one’s conclusions. Suppose the player initially picks door 1:
  Behind door 1   Behind door 2   Behind door 3   Result if staying at door 1   Result if switching to the door offered
  Car             Goat            Goat            Car                           Goat
  Goat            Car             Goat            Goat                          Car
  Goat            Goat            Car             Goat                          Car

Bayes’ theorem: a priori knowledge can affect one’s conclusions.
P(C=c | H=h, S=s) = P(H=h | C=c, S=s) P(C=c | S=s) / P(H=h | S=s)
where C = number of the door hiding the car, S = number of the door selected by the player, and H = number of the door opened by the host. This is the probability of finding the car behind door c, given the player’s original selection and the host’s opening of a door.

Bayes’ theorem: a priori knowledge can affect one’s conclusions.
P(C=c | H=h, S=s) = P(H=h | C=c, S=s) P(C=c | S=s) / ∑_{c′=1..3} P(H=h | C=c′, S=s) P(C=c′ | S=s)
where C = number of the door hiding the car, S = number of the door selected by the player, and H = number of the door opened by the host. The host’s behaviour depends on the candidate’s selection and on where the car is.

Bayes’ theorem: a priori knowledge can affect one’s conclusions. If the player selected door 1 and the host opened door 3:
P(C=2 | H=3, S=1) = (1 × 1/3) / (1/2 × 1/3 + 1 × 1/3 + 0 × 1/3) = (1/3) / (1/2) = 2/3
where C = number of the door hiding the car, S = number of the door selected by the player, and H = number of the door opened by the host. Switching to door 2 therefore wins the car with probability 2/3.
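
A small Monte Carlo check of this result (a sketch, not from the slides; it assumes the player always selects door 1 and that the host picks at random when two goat doors are available):

```python
import random

# One round of the game: place the car, then let the host open a goat door
# other than the player's door (door 1), choosing at random if two are possible.
def trial(rng):
    car = rng.randint(1, 3)
    host_options = [d for d in (2, 3) if d != car]
    return car, rng.choice(host_options)

rng = random.Random(0)
results = [trial(rng) for _ in range(100_000)]

# P(C=2 | H=3, S=1): among rounds where the host opened door 3,
# how often is the car behind door 2 (the switching door)?
host_opened_3 = [car for car, host in results if host == 3]
print(sum(car == 2 for car in host_opened_3) / len(host_opened_3))   # ≈ 2/3
```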

Bayes’ theorem: Bayes’ theorem is used to combine a prior probability with the likelihood to produce a posterior probability:
Prob(H|D) = Prob(H) Prob(D|H) / ∑_H Prob(H) Prob(D|H)
where Prob(H) is the prior probability, Prob(D|H) the likelihood, Prob(H|D) the posterior probability, and the denominator the normalizing constant.

Bayesian inference of trees: in BI, the players are the tree topology and branch lengths, the evolutionary model, and the (sequence) data.

Bayesian inference of trees: the posterior probability of a tree is calculated from the prior and the likelihood:
Prob(Tree, Model | Data) = Prob(Tree, Model) Prob(Data | Tree, Model) / Prob(Data)
where Prob(Tree, Model | Data) is the posterior probability of a tree (with its branch lengths and model parameters), Prob(Tree, Model) is its prior probability, Prob(Data | Tree, Model) is the likelihood, and the denominator Prob(Data) involves a summation over all possible trees, branch lengths and model parameter values.

Bayesian inference of trees: the prior probability of a tree is often not known, and therefore all trees are considered equally probable.
[figure: the 15 possible trees for the five taxa A–E, each assigned the same prior probability]

Bayesian inference of trees: the prior probability of a tree is often not known, and therefore all trees are considered equally probable.
Prob(Tree i): prior probability. Prob(Data | Tree i): likelihood. Prob(Tree i | Data): posterior probability.
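
As a toy illustration (the likelihood values below are invented for this sketch, not taken from the slides), with a flat prior the posterior probability of each candidate tree is simply its normalized likelihood:

```python
# Three candidate trees with made-up likelihoods and a flat prior over trees.
likelihoods = {"tree1": 2.0e-10, "tree2": 5.0e-10, "tree3": 3.0e-10}
prior = {t: 1 / len(likelihoods) for t in likelihoods}      # all trees equally probable

unnormalized = {t: prior[t] * likelihoods[t] for t in likelihoods}
normalizer = sum(unnormalized.values())                     # Prob(Data)
posterior = {t: v / normalizer for t, v in unnormalized.items()}
print(posterior)                                            # ≈ tree1: 0.2, tree2: 0.5, tree3: 0.3
```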

Bayesian inference of trees: but prior knowledge of taxonomy could suggest other prior probabilities.
[figure: the 15 five-taxon trees again, with prior probability restricted to the topologies in which C, D and E form a group, labelled “(CDE) constrained”]

Bayesian inference of trees: BI requires summation over all possible trees, branch lengths and model parameter values to evaluate the denominator Prob(Data), which is impossible to do analytically.

Bayesian inference of trees: but Markov chain Monte Carlo (MCMC) allows the posterior probability to be approximated.
1. Start at a random point in parameter space.
2. Make a small random move.
3. Calculate the posterior density ratio r = new state / old state.
4. If r > 1, always accept the move; if r < 1, accept the move with probability r, so slightly downhill moves are often accepted and far downhill moves are rarely accepted.
5. Go to step 2.
The proportion of time that the MCMC spends in a particular region of parameter space is an estimate of that region’s posterior probability.
[figure: posterior probability density over parameter space with peaks for tree 1, tree 2 and tree 3; the chain spends 20%, 48% and 32% of its time in the three regions]
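
The following Python sketch (not part of the original slides; the one-dimensional density, peak locations and region widths are invented for illustration) implements exactly these five steps on a toy posterior with three peaks standing in for tree 1, tree 2 and tree 3; the fraction of iterations the chain spends near each peak should approach the 20%/48%/32% shown in the figure.

```python
import math
import random

# Made-up 1-D "posterior density": three peaks standing in for the regions of
# parameter space favouring tree 1, tree 2 and tree 3 (weights match the slide).
PEAKS = [(2.0, 0.20), (5.0, 0.48), (8.0, 0.32)]   # (location, weight)

def density(x):
    return sum(w * math.exp(-0.5 * ((x - m) / 0.5) ** 2) for m, w in PEAKS)

rng = random.Random(1)
x = rng.uniform(0.0, 10.0)               # 1. start at a random point
samples = []
for _ in range(200_000):
    proposal = x + rng.gauss(0.0, 0.5)   # 2. make a small random move
    r = density(proposal) / density(x)   # 3. posterior density ratio new/old
    if r >= 1 or rng.random() < r:       # 4. accept always if r > 1, with probability r otherwise
        x = proposal
    samples.append(x)                    # 5. go to step 2

# The fraction of time spent near each peak estimates that region's posterior probability.
for i, (m, w) in enumerate(PEAKS, start=1):
    frac = sum(abs(s - m) < 1.5 for s in samples) / len(samples)
    print(f"tree {i}: time fraction {frac:.2f} (target {w:.2f})")
```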

Bayesian inference of trees: Metropolis-coupled Markov chain Monte Carlo (MC³) speeds up the search. Several chains are run in parallel: the cold chain targets P(tree | data), while heated chains target P(tree | data)^β with 0 < β < 1, so the hotter the chain, the flatter its target distribution (the cold chain sees the full posterior; the hottest chain sees an almost flat surface).
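
A minimal Python sketch of the idea (again not from the slides; the two-peak toy density, the single heated chain with β = 0.2, and the swap interval are all assumptions made for illustration): the heated chain samples a flattened version of the density and occasionally exchanges states with the cold chain, helping the cold chain escape local optima.

```python
import math
import random

# Toy unnormalized posterior (made up for this sketch): two well-separated peaks,
# the right one twice as heavy as the left one.
def density(x):
    return math.exp(-0.5 * ((x - 2) / 0.5) ** 2) + 2 * math.exp(-0.5 * ((x - 6) / 0.5) ** 2)

def metropolis_step(x, beta, rng):
    """One Metropolis update targeting density(x)**beta (beta = 1: cold chain; beta < 1: heated)."""
    proposal = x + rng.gauss(0.0, 0.3)
    r = (density(proposal) / density(x)) ** beta
    return proposal if r >= 1 or rng.random() < r else x

rng = random.Random(2)
betas = [1.0, 0.2]                              # cold chain and one heated ("hot scout") chain
states = [rng.uniform(0.0, 8.0) for _ in betas]
cold_samples = []
for step in range(200_000):
    states = [metropolis_step(x, b, rng) for x, b in zip(states, betas)]
    if step % 10 == 0:                          # occasionally propose swapping the chains' states
        swap_r = (density(states[1]) / density(states[0])) ** (betas[0] - betas[1])
        if swap_r >= 1 or rng.random() < swap_r:
            states.reverse()
    cold_samples.append(states[0])              # only the cold chain is sampled

# The heated chain crosses the valley between the peaks easily and, via swaps,
# pulls the cold chain out of local optima.
print(sum(s > 4.0 for s in cold_samples) / len(cold_samples))   # ≈ 2/3 (the heavier peak)
```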

Bayesian inference of trees: Metropolis-coupled Markov chain Monte Carlo speeds up the search.
[cartoon: the cold scout is stuck on a local optimum while the hot scout, exploring the flattened surface, signals a better spot: “Hey! Over here!”]