Molecular Clocks: Prediction of time from molecular divergence.



Outline
What is the molecular clock hypothesis?
How do you detect deviations from the molecular clock hypothesis?
Assuming a perfect molecular clock, what are the potential pitfalls in using it for dating?
Dating with “relaxed” clocks.
Cautionary notes.

Molecular Clock
Molecular divergence is ROUGHLY correlated with time since divergence.

Evidence for Rate Constancy in Hemoglobin from Zuckerkandl and Pauling (1965)

Can we date other nodes in the tree?
Given: a phylogenetic tree, branch lengths, and a time estimate for one (or more) nodes (110 MYA in the figure).
Yes... if the rate of molecular change is constant across all branches.

The Molecular Clock Hypothesis
The amount of genetic difference between sequences is a function of time since separation.
The rate of molecular change is constant (enough) to predict times of divergence (within the bounds of particular genes and taxa).

Rate Constancy? (Figure from Page & Holmes p240.)

Rate Heterogeneity
The rate of molecular evolution can differ between nucleotide positions, genes, genomic regions, genomes within species (nuclear vs organelle), and species, and it can change over time.
If not considered, this introduces bias into time estimates.

Rate Heterogeneity among Lineages
Cause | Reason
Repair mechanisms | e.g. RNA viruses have error-prone polymerases
Metabolic rate | A higher metabolic rate produces more free radicals
Generation time | A shorter generation time means DNA is copied more frequently
Population size | Affects the mutation fixation rate

Local Clocks?
Closely related species often share similar properties and are likely to have similar rates. For example:
murid rodents evolve on average 2-6 times faster than apes and humans (Graur & Li p150);
mouse and rat rates are nearly equal (Graur & Li p146).

Rate Changes within a Lineage
Cause | Reason
Population size changes | Genetic drift is more likely to fix neutral alleles in a small population
Strength of selection changes over time | New role/environment; gene duplication; change in another gene

Identifying Rate Heterogeneity
Tests of the molecular clock:
Likelihood ratio test: identifies deviation from the clock but not the deviant sequences.
Relative rates tests: compare rates of sister nodes using an outgroup.
Tajima test: the number of sites at which a character is shared by the outgroup and only one of the two ingroups should be equal for both ingroups.
Branch length test: compares the distance from the root to each leaf with the average root-to-leaf distance.
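The Tajima test described above reduces to counting two kinds of sites and comparing the counts. The sketch below is a minimal illustration, assuming three pre-aligned sequences of equal length with no gaps or ambiguity codes; the function name and the example sequences are hypothetical. It uses the usual chi-squared statistic with one degree of freedom, (m1 - m2)^2 / (m1 + m2).

```python
import math


def tajima_relative_rate(ingroup1, ingroup2, outgroup):
    """Return (m1, m2, chi2, p) for three aligned sequences of equal length.

    m1 counts sites where only ingroup 1 differs from the outgroup;
    m2 counts sites where only ingroup 2 differs from the outgroup.
    """
    if not (len(ingroup1) == len(ingroup2) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    m1 = m2 = 0
    for a, b, o in zip(ingroup1, ingroup2, outgroup):
        if b == o and a != o:
            m1 += 1  # outgroup state shared with ingroup 2 only
        elif a == o and b != o:
            m2 += 1  # outgroup state shared with ingroup 1 only
    if m1 + m2 == 0:
        return m1, m2, 0.0, 1.0  # no informative sites, nothing to test
    chi2 = (m1 - m2) ** 2 / (m1 + m2)
    # Survival function of the chi-squared distribution with 1 df.
    p = math.erfc(math.sqrt(chi2 / 2))
    return m1, m2, chi2, p


# Hypothetical toy alignment, for illustration only.
m1, m2, chi2, p = tajima_relative_rate("CCCCCCAAAA", "AAAAAAAACC", "AAAAAAAAAA")
print(m1, m2, chi2, round(p, 4))
```

A small p-value suggests the two ingroup lineages have accumulated changes at different rates; real data would need gap handling and far more sites than this toy example.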

Likelihood Ratio Test
Estimate a phylogeny under the molecular clock (e.g. root-to-tip distances must be equal) and without it.
Twice the difference in log-likelihood, 2 * (lnL_noclock - lnL_clock), is asymptotically chi-squared distributed with n-2 degrees of freedom (n = # taxa in tree) when the models are nested.
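As a rough illustration of the arithmetic, the sketch below computes the test statistic as twice the log-likelihood difference and a p-value from the chi-squared distribution with n-2 degrees of freedom. The log-likelihood values are made up, the function names are hypothetical, and the chi-squared tail probability is computed with a standard recurrence on the incomplete gamma function so that only the standard library is needed.

```python
import math


def chi2_sf(x, df):
    """Upper tail probability of a chi-squared variable with integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)              # exact tail for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))   # exact tail for df = 1
    # Recurrence Q(a + 1, y) = Q(a, y) + y^a * exp(-y) / Gamma(a + 1).
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def clock_lrt(lnl_clock, lnl_noclock, n_taxa):
    """Return (statistic, df, p) for the clock vs no-clock comparison."""
    stat = 2.0 * (lnl_noclock - lnl_clock)
    df = n_taxa - 2
    return stat, df, chi2_sf(stat, df)


# Hypothetical log-likelihoods for a 6-taxon tree.
stat, df, p = clock_lrt(lnl_clock=-1005.0, lnl_noclock=-1000.0, n_taxa=6)
print(stat, df, round(p, 4))
```

A p-value below the chosen threshold rejects the clock for the data set as a whole; as noted above, it does not say which sequences deviate.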

Relative Rates Tests (Sarich & Wilson 1973; Wu and Li 1985)
Test whether the distances from two taxa to an outgroup are equal (or the average rate of two clades vs an outgroup).
Need to compute the expected variance.
Many triples to consider, and they are not independent (although modifications such as Li & Bousquet 1992 correct for this).
Lack power, especially with short sequences or low rates of change.
Given the length and number of variable sites in typical sequences used for dating, Bromham et al. (2000) say these tests are unlikely to detect moderate rate variation between lineages (1.5-4x), which is likely to result in substantial error in date estimates.

Relative Rates Tests (Sarich & Wilson 1973; Wu and Li 1985)
[Figure: three-taxon trees with Taxon 1, Taxon 2, and Taxon 3 as the outgroup.]

Relative Rates Tests (Sarich & Wilson 1973; Wu and Li 1985)
Node 0 is the common ancestor of Taxon 1 and Taxon 2; Taxon 3 is the outgroup. K01, K02, and K03 are the branch lengths from node 0 to Taxon 1, Taxon 2, and Taxon 3.
H0: K01 = K02, or K01 - K02 = 0
K13 = K01 + K03 (1)
K23 = K02 + K03 (2)
K12 = K01 + K02 (3)
K01 = (K13 + K12 - K23) / 2 (4)
K02 = (K12 + K23 - K13) / 2 (5)
K03 = (K13 + K23 - K12) / 2 (6)
K01 - K02 = K13 - K23
z = (K13 - K23) / [Var(K13 - K23)]^(1/2)
Compare z to the normal distribution.
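The relations among K12, K13, and K23 can be checked numerically. The sketch below is illustrative only: the pairwise distances and the variance of (K13 - K23) are made-up inputs, the function names are hypothetical, and in practice the variance has to come from the substitution model used to estimate the distances.

```python
import math


def branch_lengths(k12, k13, k23):
    """Solve for K01, K02, K03 from the three pairwise distances."""
    k01 = (k13 + k12 - k23) / 2
    k02 = (k12 + k23 - k13) / 2
    k03 = (k13 + k23 - k12) / 2
    return k01, k02, k03


def relative_rate_z(k13, k23, var_diff):
    """Return (z, two-sided p) for H0: K01 = K02.

    var_diff is Var(K13 - K23) and must be supplied by the caller.
    """
    z = (k13 - k23) / math.sqrt(var_diff)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p


# Hypothetical distances and variance.
k01, k02, k03 = branch_lengths(k12=0.20, k13=0.30, k23=0.25)
z, p = relative_rate_z(k13=0.30, k23=0.25, var_diff=0.0004)
print(k01, k02, k03, round(z, 2), round(p, 4))
```

With these inputs K01 - K02 equals K13 - K23, which is why the test can be run on the two outgroup distances without estimating the internal branches.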

Bayesian Relative Rates Test (Wilcox et al. 2004)
Uses MrBayes in conjunction with Cadence; variation is estimated from the posterior distribution.
Cadence summarizes, for all tree samples, the distance between specific taxa and their most recent common ancestor (MRCA).

Measuring Evolutionary Time with a Molecular Clock
Estimate the genetic distance: d = number of amino acid replacements.
Use paleontological data to determine the date of the common ancestor: T = time since divergence.
Estimate the calibration rate (number of genetic changes expected per unit time): r = d / 2T.
Calculate the time of divergence for novel sequences: Tij = dij / 2r.
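A short worked example of the two formulas, r = d / 2T and Tij = dij / 2r, may help. The numbers are invented for illustration and the function names are hypothetical; the factor of 2 reflects that a pairwise distance accumulates along both lineages since their split.

```python
def calibration_rate(d, t):
    """Rate per lineage per unit time from one calibrated pair: r = d / (2T)."""
    return d / (2 * t)


def divergence_time(d_ij, r):
    """Divergence time for a new pair of sequences: Tij = dij / (2r)."""
    return d_ij / (2 * r)


# Hypothetical calibration: distance 0.2 between taxa that split 100 MYA.
r = calibration_rate(d=0.2, t=100.0)
# Hypothetical new pair with distance 0.1.
t_ij = divergence_time(d_ij=0.1, r=r)
print(r, t_ij)
```

Here half the calibrated distance implies half the calibrated time, which is just the strict clock assumption stated as arithmetic.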

Perfect Molecular Clock
Change is a linear function of time (substitutions ~ Poisson); variation is only due to stochastic error.
Rates are constant (across positions and lineages).
The tree is perfect.
Molecular distance is estimated perfectly.
Calibration dates are without error.
Regression (time vs substitutions) is without error.

Poisson Variance (Assuming a Perfect Molecular Clock)
If there is on average one mutation every MY, then under Poisson variance 95% of lineages that are 15 MY old have 8-22 substitutions.
A lineage with 8 substitutions could also be only 5 MY old.
(Molecular Systematics p532)

Estimating Substitution Rate
Calculate a separate rate for each data set (species/genes) using a known date of divergence (from fossils or biogeography).
One calibration point: rate = d/2T.
More than one calibration point: use regression.

Calibration Complexities
Fossils cannot be dated perfectly.
Fossils are usually not direct ancestors: they branched off the tree before (after?) the splitting event.
It is impossible to pinpoint the age of the last common ancestor of a group of living species.

Linear Regression
Fix the intercept at (0,0).
Fit a line between divergence estimates and calibration times.
Calculate regression and prediction confidence limits.
[Figure, Molecular Systematics p536: A = regression line; B1-B2 = 95% CI of regression line; C1-C2 = 95% CI for predicted time values.]
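A regression forced through the origin has a closed-form least-squares slope, sum(x*y) / sum(x*x). The sketch below assumes calibration time is on the x axis and pairwise divergence on the y axis, which is one reading of the slide rather than something it states; the data and function names are hypothetical, and the confidence limits are not computed.

```python
def slope_through_origin(times, distances):
    """Least-squares slope of distance on time with the intercept fixed at 0."""
    sxx = sum(t * t for t in times)
    sxy = sum(t * d for t, d in zip(times, distances))
    return sxy / sxx


def predict_time(distance, slope):
    """Invert the fitted line to estimate a divergence time from a distance."""
    return distance / slope


# Hypothetical calibration points: (time in MY, pairwise distance).
times = [10.0, 20.0, 40.0]
distances = [0.02, 0.04, 0.08]

slope = slope_through_origin(times, distances)  # distance per MY, equal to 2r
print(slope, predict_time(0.05, slope))
```

Because pairwise distance accumulates on both lineages, the fitted slope corresponds to 2r rather than r in the single-calibration formula.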

Molecular Dating: Sources of Error (assuming constant rates)
Both X and Y values are only estimates: the substitution model could be incorrect; the tree could be incorrect; there could be errors in orthology assignment; Poisson variance is large.
Pairwise divergences are correlated (Molecular Systematics p534), which inflates the correlation between divergence and time.
Calibrations are sometimes correlated, if derived calibration points are used.
There is error in inferring the slope.
The confidence interval for predictions is much larger than the confidence interval for the slope.

Working Around Rate Heterogeneity
Identify lineages that deviate and remove them.
Quantify the degree of rate variation to put limits on possible divergence dates: this requires several calibration dates, which are not always available, and gives very conservative estimates of molecular dates.
Explicitly model rate variation (relaxed clocks).

Relaxing the Molecular Clock (Rutschmann 2006, review)
Likelihood analysis: assigning each branch its own rate parameter causes an explosion of parameters and is not realistic. Instead, the user can partition branches based on domain knowledge; rates of partitions are independent.
Nonparametric methods: rate smoothing along the tree, and penalized likelihood (program r8s).
Bayesian approach: a stochastic model of evolutionary change with a prior distribution of rates. Autocorrelated rates: BEAST and Multidivtime. Non-autocorrelated rates: BEAST (which can also incorporate uncertainty in topology).
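The difference between autocorrelated and non-autocorrelated rate priors can be illustrated with a toy simulation. This is a simplified sketch, not the model implemented in BEAST or Multidivtime: it assumes lognormal rates, ignores branch durations, and uses a hypothetical tree encoded as a dictionary of children. In the uncorrelated version each branch draws its rate independently; in the autocorrelated version each branch rate is centered on its parent's rate.

```python
import math
import random


def uncorrelated_rates(children, mean_rate, sigma, rng):
    """Draw every branch rate independently from one lognormal distribution."""
    mu = math.log(mean_rate) - sigma ** 2 / 2  # makes the expected rate mean_rate
    return {child: rng.lognormvariate(mu, sigma)
            for kids in children.values() for child in kids}


def autocorrelated_rates(children, root, root_rate, sigma, rng):
    """Draw each branch rate from a lognormal centered on its parent's rate."""
    rates = {}
    stack = [(root, root_rate)]
    while stack:
        node, parent_rate = stack.pop()
        for child in children.get(node, []):
            mu = math.log(parent_rate) - sigma ** 2 / 2
            rates[child] = rng.lognormvariate(mu, sigma)
            stack.append((child, rates[child]))
    return rates


# Hypothetical tree: root -> (A, X), X -> (B, C). Branches are named by child.
tree = {"root": ["A", "X"], "X": ["B", "C"]}
rng = random.Random(1)
print(uncorrelated_rates(tree, mean_rate=0.01, sigma=0.5, rng=rng))
print(autocorrelated_rates(tree, "root", root_rate=0.01, sigma=0.5, rng=rng))
```

Setting sigma to 0 makes every branch share the same rate in both versions, which recovers the strict clock as a special case.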

Multiple Gene Loci
“Trying to estimate time of divergence from one protein is like trying to estimate the average height of humans by measuring one human” (Molecular Systematics p539)
Ideally: use multiple genes and multiple calibration points.

Even so, be VERY cautious about divergence time inferences
Point estimates are absurd.
Sample errors are often based only on the difference between estimates in the same study.
Even estimates with confidence intervals are unlikely to really capture all sources of variance.

General References
Reviews/Critiques:
Bromham and Penny. The modern molecular clock. Nature Genetics, 2003.
Graur and Martin. Reading the entrails of chickens... the illusion of precision. Trends in Genetics, 2004.
Rutschmann. Molecular dating of phylogenetic trees: A brief review of current methods that estimate divergence times. Diversity and Distributions, 2006.
Textbooks:
Molecular Systematics. 2nd edition. Edited by Hillis, Moritz, and Mable.
Inferring Phylogenies. Felsenstein.
Molecular Evolution, a phylogenetic approach. Page and Holmes.
Chapter 11 of the textbook “The Phylogenetic Handbook”.

Rate Heterogeneity References
Dealing with rate heterogeneity:
Yang and Yoder. Comparison of likelihood and Bayesian methods for estimating divergence times. Syst. Biol., 2003.
Kishino, Thorne, and Bruno. Performance of a divergence time estimation method under a probabilistic model of rate evolution. Mol. Biol. Evol., 2001.
Huelsenbeck, Larget, and Swofford. A compound Poisson process for relaxing the molecular clock. Genetics, 2000.
Testing for rate heterogeneity:
Takezaki, Rzhetsky, and Nei. Phylogenetic test of the molecular clock and linearized trees. Mol. Biol. Evol., 1995.
Bromham, Penny, Rambaut, and Hendy. The power of relative rates test depends on the data. J. Mol. Evol., 2000.
Wilcox, Garcia de Leon, Hendrickson, and Hillis. Convergence among cave catfishes: long-branch attraction and a Bayesian relative rates test. Mol. Phylogenet. Evol. 31:1101-1113, 2004.