Species Tree Workshop, January 14, 2012: Practice with BEST. Please download MrBayes 3.2 for Windows, Macintosh, or UNIX from

Agenda
The MrBayes with BEST (v 3.2) implementation (work in progress)
Run the finch example (download finch.nex)
Run a multiple-allele data set (yeast with 4 genes, 22 taxa, 6 species)
…or try your own data

Previous Implementation: MrBayes with BEST
Step 1: Use MrBayes to propose vectors of joint gene trees (unlinked and rooted with an outgroup).
Step 2: Given those gene trees, propose a compatible species tree.
Step 3: Implement the chain fully within MrBayes using the usual properties of the MCMC as proposed by the user.
Program found at

New Implementation: MrBayes 3.2 integrated with BEST
Assumes a molecular clock for gene trees as part of a full model, including the coalescent for gene trees | species tree
Program found at

Implementation: MrBayes 3.2
As always:
Wide variety of nucleotide, amino acid, and codon models
Variety of proposal distribution options
Parallel “hot” and “cold” chains to balance efficiency while covering large tree spaces
Checkpointing to allow stops and starts
New speed improvements:
BEST can use MPI for Mac and UNIX
GPU (NVIDIA graphics card) support
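The “hot” and “cold” chains mentioned on this slide refer to Metropolis-coupled MCMC. The sketch below is a toy illustration in Python, not MrBayes code: it samples a one-dimensional bimodal target instead of tree space. The names heat, swap_accept_prob, log_target, and mc3 are hypothetical, the temp and step values are arbitrary, and the heating formula 1/(1 + temp × i) is assumed here to follow the incremental heating scheme described for MrBayes.

```python
import math
import random


def heat(chain_index, temp=0.1):
    """Power applied to the posterior of chain i (0 is the cold chain)."""
    return 1.0 / (1.0 + temp * chain_index)


def swap_accept_prob(logp_i, logp_j, beta_i, beta_j):
    """Probability of swapping the states of chains i and j."""
    log_ratio = (beta_i - beta_j) * (logp_j - logp_i)
    return 1.0 if log_ratio >= 0 else math.exp(log_ratio)


def log_target(x):
    """Toy bimodal target: equal mixture of normals centred at -4 and +4."""
    a = -0.5 * (x + 4.0) ** 2
    b = -0.5 * (x - 4.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def mc3(n_steps, n_chains=4, temp=0.1, step=1.0, seed=1):
    """Run coupled chains and return the samples of the cold chain."""
    rng = random.Random(seed)
    betas = [heat(i, temp) for i in range(n_chains)]
    states = [-4.0] * n_chains
    cold_samples = []
    for _ in range(n_steps):
        # Ordinary Metropolis update within each chain, on the heated target.
        for k in range(n_chains):
            proposal = states[k] + rng.gauss(0.0, step)
            log_ratio = betas[k] * (log_target(proposal) - log_target(states[k]))
            if rng.random() < math.exp(min(0.0, log_ratio)):
                states[k] = proposal
        # Propose a state swap between two randomly chosen chains.
        i, j = rng.sample(range(n_chains), 2)
        p = swap_accept_prob(log_target(states[i]), log_target(states[j]),
                             betas[i], betas[j])
        if rng.random() < p:
            states[i], states[j] = states[j], states[i]
        cold_samples.append(states[0])
    return cold_samples
```

Heated chains see a flattened target and cross between modes more easily; accepted swaps pass those states to the cold chain, which is the only chain whose samples are kept.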

Steps for any Bayesian Run
1. Read the data
2. Set the model (data | gene tree)
3. Set the prior (including gene | species)
4. Set the MCMC rules
5. Run the MCMC
6. Check convergence
7. Summarize results

Files created
ckp (checkpoint file for restarting)
tree5.run2.t (trees saved for locus 5 in run 2)
tree5.parts (partitions seen for tree 5)
tree5.trprobs (tree probabilities)
tree5.con.tre (consensus tree)
tree5.tstat (partition statistics)
tree5.vstat (branch and node statistics)

Remember
Use a separate folder for each analysis
Remember the “taxset” and “speciespartition” statements in MrBayes, with one or more taxa per species
Remember to allow variable population sizes
With n loci, the species tree shows up as files labeled n+1
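As a small worked example of the n+1 rule, the Python sketch below lists the file names one might look for. It only mirrors the naming pattern shown on the “Files created” slide; the helper names, the default of two runs, and the absence of a data-set prefix are assumptions for illustration, and real output names may differ.

```python
# Summary-file suffixes as listed on the "Files created" slide.
SUMMARY_SUFFIXES = ("parts", "trprobs", "con.tre", "tstat", "vstat")


def species_tree_index(n_loci):
    """With n loci, the species tree is written as tree number n + 1."""
    if n_loci < 1:
        raise ValueError("need at least one locus")
    return n_loci + 1


def expected_files(tree_index, n_runs=2):
    """File names for one tree index (n_runs=2 is an assumed default)."""
    names = [f"tree{tree_index}.run{r}.t" for r in range(1, n_runs + 1)]
    names += [f"tree{tree_index}.{suffix}" for suffix in SUMMARY_SUFFIXES]
    return names


if __name__ == "__main__":
    # The finch example has 30 loci, so the species tree is tree 31.
    for name in expected_files(species_tree_index(30)):
        print(name)
```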

Remember to unlink
Gene tree topologies and branch lengths, for sure: unlink topology=(all) brlens=(all);
Parameters of the model, as appropriate: unlink statefreq=(all) revmat=(all);

Issues
Requiring gene trees to follow a molecular clock is too restrictive
Some outputs still need to be modified for species tree use

Species Tree Notation
[Figure: species tree with tips A, D, C, B]
Topology, branch lengths, & population sizes:
(D:0.035,(C:0.03,(A:0.02,B:0.02):0.01#0.3):0.005#0.2)#0.25
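To make the notation concrete, here is a minimal Python sketch that reads such a string. It assumes the notation is ordinary comma-separated Newick in which “:x” is a branch length and “#y” is the population size attached to that branch; the comma placement in the example string, the Node class, and the function names are assumptions for illustration, not part of MrBayes or BEST.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str = ""
    length: Optional[float] = None   # value after ':'
    theta: Optional[float] = None    # value after '#', read as population size
    children: List["Node"] = field(default_factory=list)


def parse_species_tree(text):
    """Parse Newick extended with '#value' population-size annotations."""
    s = "".join(text.split()).rstrip(";")
    pos = 0

    def read_number():
        nonlocal pos
        start = pos
        while pos < len(s) and (s[pos].isdigit() or s[pos] in ".eE+-"):
            pos += 1
        return float(s[start:pos])

    def read_node():
        nonlocal pos
        node = Node()
        if pos < len(s) and s[pos] == "(":
            pos += 1
            node.children.append(read_node())
            while pos < len(s) and s[pos] == ",":
                pos += 1
                node.children.append(read_node())
            if pos >= len(s) or s[pos] != ")":
                raise ValueError("expected ')' at position %d" % pos)
            pos += 1
        start = pos
        while pos < len(s) and s[pos] not in "(),:#":
            pos += 1
        node.name = s[start:pos]
        if pos < len(s) and s[pos] == ":":
            pos += 1
            node.length = read_number()
        if pos < len(s) and s[pos] == "#":
            pos += 1
            node.theta = read_number()
        return node

    root = read_node()
    if pos != len(s):
        raise ValueError("unexpected text at position %d" % pos)
    return root


def tip_depths(node, above=0.0):
    """Sum of branch lengths from the root down to each tip."""
    here = above + (node.length or 0.0)
    if not node.children:
        return {node.name: here}
    depths = {}
    for child in node.children:
        depths.update(tip_depths(child, here))
    return depths


EXAMPLE = "(D:0.035,(C:0.03,(A:0.02,B:0.02):0.01#0.3):0.005#0.2)#0.25"

if __name__ == "__main__":
    tree = parse_species_tree(EXAMPLE)
    print("root population size:", tree.theta)
    print("tip depths:", tip_depths(tree))
```

Under this reading every tip lies at the same distance from the root, which is what a clock-like species tree should give.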

Three lineages of grassfinches (Poephila)
Long-tailed (acuticauda)
Long-tailed (hecki)
Black-throated (cincta)

30 gene trees from Australian finches
Jennings & Edwards (2005) Evolution 59,
[Figure: gene trees with tips P. acuticauda, P. hecki, P. cincta]

Estimated species tree distribution using BEST

[Figure: tree tip labels hecki, acuticauda, c. atropygialis]