Bayesian II, Spring 2010. Major issues in phylogenetic Bayesian inference: Have we reached convergence? If so, do we have a large enough sample of the posterior?



Have we reached convergence?
– Look at the trace plots of the posterior probability (not reliable by itself).
Convergence diagnostics:
– Average standard deviation of the split frequencies (ASDSF): compares the node frequencies between the independent runs. The closer to zero, the better, though it is not clear how small it should be.
– Potential scale reduction factor (PSRF): the closer to one, the better.
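The ASDSF can be sketched in a few lines (a minimal illustration, not MrBayes's implementation; the split labels and the dictionaries of sampled split frequencies are hypothetical inputs):

```python
import numpy as np

def asdsf(freqs_run1, freqs_run2):
    """Average standard deviation of split frequencies between two
    independent runs. Each argument maps a split (any hashable label)
    to the frequency with which that split was sampled in that run;
    a split absent from a run has frequency 0."""
    splits = set(freqs_run1) | set(freqs_run2)
    sds = []
    for s in splits:
        f = np.array([freqs_run1.get(s, 0.0), freqs_run2.get(s, 0.0)])
        sds.append(f.std(ddof=0))  # SD of this split's frequency across runs
    return float(np.mean(sds))     # the closer to zero, the better
```

When the two runs sample the same splits at the same frequencies the ASDSF is zero; splits that appear in only one run push it up.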

Convergence diagnostics in MrBayes:
– By default, MrBayes performs two independent analyses starting from different random trees (mcmc nruns=2).
– The average standard deviation of clade frequencies is calculated and presented during the run (mcmc mcmcdiagn=yes diagnfreq=1000) and written to a file (.mcmc).
– The standard deviation of each clade frequency and the potential scale reduction for branch lengths are calculated with sumt.
– The potential scale reduction is calculated for all substitution model parameters with sump.

Have we reached convergence? PSRF (sump command). Example output (numeric values omitted):

Model parameter summaries over all 3 runs sampled in files "Ligia16SCOI28SGulfnoADLGGTRGbygene.nex.run1.p", "Ligia16SCOI28SGulfnoADLGGTRGbygene.nex.run2.p", etc. Summaries are based on the total number of samples pooled across the 3 runs. For each parameter — TL{all}, r(A C){1}, r(A G){1}, r(A T){1}, r(C G){1}, r(C T){1}, r(G T){1}, pi(A){1}, pi(C){1}, pi(G){1}, pi(T){1}, alpha{1} — the table reports the Mean, Variance, 95% credible interval (Lower, Upper), Median, and PSRF.

* Convergence diagnostic (PSRF = potential scale reduction factor [Gelman and Rubin, 1992], uncorrected) should approach 1 as runs converge. The values may be unreliable if you have a small number of samples. PSRF should only be used as a rough guide to convergence, since not all of the assumptions that allow one to interpret it as a scale reduction factor are met in the phylogenetic context.
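The PSRF compares between-run and within-run variance for a scalar parameter. A minimal sketch of the Gelman–Rubin statistic (hypothetical sample arrays, not sump's implementation):

```python
import numpy as np

def psrf(chains):
    """Potential scale reduction factor (Gelman & Rubin 1992) for one
    scalar parameter sampled by several independent runs.

    chains: 2-D array, shape (m runs, n samples per run)."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    B = n * chains.mean(axis=1).var(ddof=1)  # between-run variance
    W = chains.var(axis=1, ddof=1).mean()    # within-run variance
    var_hat = (n - 1) / n * W + B / n        # pooled variance estimate
    return float(np.sqrt(var_hat / W))       # approaches 1 as runs converge
```

Runs sampling the same distribution give a PSRF near 1; runs stuck in different regions inflate the between-run term and push the PSRF well above 1.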

Are We There Yet (AWTY)? Nylander, J. A. A. et al., Bioinformatics, doi:10.1093/bioinformatics/btm388.

Empirical data: two independent runs, 300,000,000 generations, complex model with three partitions (by codon). The bad news, plotted in Tracer.

Empirical data: two independent runs, 300,000,000 generations, complex model with three partitions (by codon). The good news: splits were highly correlated between the two runs. Plotted in AWTY.

Empirical data: two independent runs, 300,000,000 generations, complex model with three partitions (by codon). The good news: splits were highly correlated between the two runs (plotted in AWTY). What caused the difference in posterior probabilities? The estimation of particular parameters.

Yes, we have reached convergence: do we have a large enough sample of the posterior?
– Long runs are better than short ones, but how long is long enough?
– Good mixing: "Examine the acceptance rates of the proposal mechanisms used in your analysis (output at the end of the run). The Metropolis proposals used by MrBayes work best when their acceptance rate is neither too low nor too high. A rough guide is to try to get them within the range of 10% to 70%."

Acceptance rates (end-of-run output; numeric values omitted). The output reports the CPU time used on processor 0 and the likelihood of the best state for the "cold" chain of each of runs 1–4, followed by the acceptance rates for the moves in the "cold" chain of run 1:
– param. 1–3 (revmat) with Dirichlet proposal
– param. 4–6 (state frequencies) with Dirichlet proposal
– param. 7–9 (gamma shape) with multiplier
– param. 10 (rate multiplier) with Dirichlet proposal
– param. 11 (topology and branch lengths) with extending TBR
– param. 11 (topology and branch lengths) with LOCAL

[Figure: the posterior probability distribution plotted over parameter space, with separate peaks corresponding to tree 1, tree 2, and tree 3.]

Metropolis-coupled Markov chain Monte Carlo, a.k.a. MCMCMC, a.k.a. (MC)^3: a "cold" chain samples the posterior while one or more "heated" chains explore a flattened version of it. At intervals, a swap of states between two chains is proposed; swaps may be unsuccessful or successful. Successful swaps let the cold chain jump between peaks that it could not easily cross on its own.

[Animation: successive frames show the cold chain and a hot chain moving over the posterior surface, with an unsuccessful swap followed by several successful swaps between them.]
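The chain swap is itself a Metropolis move on the joint distribution of all chains. A minimal sketch of the swap acceptance probability (the heat values and log posteriors below are hypothetical; in MrBayes the heat of chain i is 1/(1 + temp·(i − 1)), with temp set via mcmcp):

```python
import math

def swap_accept_prob(log_post_cold_state, log_post_hot_state,
                     beta_cold=1.0, beta_hot=1.0 / 1.1):
    """Probability of accepting a proposed state swap between a cold
    chain (heat beta_cold) and a hot chain (heat beta_hot).

    Chain i samples the posterior raised to the power beta_i, so the
    Metropolis ratio for exchanging the two states reduces to
    exp((beta_cold - beta_hot) * (logP(hot state) - logP(cold state)))."""
    log_ratio = (beta_cold - beta_hot) * (log_post_hot_state - log_post_cold_state)
    return math.exp(min(0.0, log_ratio))  # min(1, ratio), computed in log space
```

If the hot chain's current state has the higher posterior, the swap is always accepted; otherwise it is accepted with a probability that shrinks as the gap grows.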

Improving convergence (only if the convergence diagnostics indicate a problem!):
– Change the tuning parameters of the proposals to bring the acceptance rate into the range 10% to 70%.
– Propose changes to "difficult" parameters more often.
– Use different proposal mechanisms.
– Change the heating temperature to bring the acceptance rate of swaps between adjacent chains into the range 10% to 70%.
– Run the chain longer.
– Increase the number of heated chains.
– Make the model more realistic.

[Figure: traces of the sampled value against the target distribution under three proposal regimes.]
– Too modest proposals: acceptance rate too high, poor mixing.
– Too bold proposals: acceptance rate too low, poor mixing.
– Moderately bold proposals: intermediate acceptance rate, good mixing.
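This trade-off is easy to reproduce with a toy random-walk Metropolis sampler on a standard normal target (a hypothetical illustration, unrelated to MrBayes's actual proposals; the step sizes are arbitrary):

```python
import math
import random

def acceptance_rate(step, n=20000, seed=1):
    """Run a random-walk Metropolis sampler on a standard normal target
    and return the fraction of proposals that were accepted."""
    rng = random.Random(seed)
    x, accepted = 0.0, 0
    for _ in range(n):
        y = x + rng.uniform(-step, step)  # symmetric proposal of width 2*step
        # log Metropolis ratio for a N(0, 1) target; 1 - random() avoids log(0)
        if math.log(1.0 - rng.random()) < 0.5 * (x * x - y * y):
            x, accepted = y, accepted + 1
    return accepted / n
```

A tiny step size accepts nearly everything but barely moves; a huge step size rejects nearly everything; a moderate step size lands in the useful intermediate range.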