Jérôme Chave EDB CNRS-Université Paul Sabatier, Toulouse


Properties of a stochastic model of macroevolution with protracted speciation

The biodiversity challenge Pace, Science (1997)

Problem with counting species. [Figure: 1430 permanent plots; labels 2013 and 2017]

Problem with placing species within a phylogenetic context (and defining species) Sukumaran & Knowles PNAS (2017)

Neutral theory of biodiversity and biogeography. The regional species pool. [Figure labels: probability $1-\nu$; probability $\nu$] $\theta = n \times \nu$, where $n$ is the number of individuals in the regional species pool and $\nu$ is the speciation rate.

Predictions of the neutral theory. The same model predicts the shape of a species abundance distribution (useful to ecologists) and the shape of phylogenetic trees (useful to macroevolutionists). For that reason it is an attractive model in biodiversity research.

Method to construct multispecies trees from the neutral theory. [Figure labels: coalescence event; hits with an intensity $\theta$; incomplete lineage sorting]

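Urn construction of species abundance distributions. Random partitions of samples of individuals in the neutral model are equivalent to random partitions generated by Hoppe's urn scheme (Hoppe 1984). It follows that the expected number of species $S_n$ for a sample of $n$ individuals is $E[S_n] = \sum_{i=0}^{n-1} \frac{\theta}{\theta + i}$, where $\theta$ is the parameter of the urn scheme; this grows logarithmically in $n$ for large samples.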
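
The urn scheme can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard form of Hoppe's urn, in which the individual drawn after $i$ others founds a new species with probability $\theta/(\theta+i)$ and otherwise copies the species of a uniformly chosen earlier individual. The function names `hoppe_urn` and `expected_species` are hypothetical and do not come from the slides.

```python
import random


def hoppe_urn(n, theta, rng):
    """Return the list of species abundances for n individuals (theta > 0)."""
    abundances = []  # abundances[s] = number of individuals of species s
    for i in range(n):
        # With i individuals already present, a new species appears
        # with probability theta / (theta + i) (equal to 1 when i == 0).
        if rng.random() < theta / (theta + i):
            abundances.append(1)
        else:
            # Otherwise copy the species of a uniformly chosen earlier
            # individual, i.e. pick species s with probability n_s / i.
            r = rng.randrange(i)
            cumulative = 0
            for s, count in enumerate(abundances):
                cumulative += count
                if r < cumulative:
                    abundances[s] += 1
                    break
    return abundances


def expected_species(n, theta):
    """Expected number of species: sum of the new-species probabilities."""
    return sum(theta / (theta + i) for i in range(n))


if __name__ == "__main__":
    rng = random.Random(42)
    sample = hoppe_urn(1000, 5.0, rng)
    print(len(sample), "species observed;", expected_species(1000, 5.0), "expected")
```

Averaging `len(hoppe_urn(...))` over many replicates should approach `expected_species(n, theta)`, which is one way to check the simulation against the formula.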

Broken stick construction. Another way to look at the same problem is to consider the forward-in-time neutral model (Wright-Fisher or Moran model with mutation) in the limit $n \to \infty$. In that case the equilibrium distribution of relative species abundances is given by a Dirichlet process (see e.g. Watterson Theoretical Population Genetics 1976). Relative species abundances following a Dirichlet process (i.e. 'drawn' from this process) may be generated through a simple procedure (the Griffiths, Engen & McCloskey, or GEM, construction). This is called the 'broken stick' construction because it is like breaking a stick of unit length at a location given by $W_1$, then cutting the remaining stick (of length $1-W_1$) at a relative location $W_2$ (so the weight is $W_2(1-W_1)$), and so forth. This is mentioned only in passing here, but it is an important property in nonparametric Bayesian estimation, where it is used to define priors in classification/clustering problems (Ferguson 1973).
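
The stick-breaking procedure described above can be sketched as follows. The sketch assumes the usual GEM($\theta$) specification, in which the break points $W_1, W_2, \dots$ are independent Beta($1, \theta$) variables; that distribution, the truncation at `k` pieces, and the name `gem_weights` are assumptions made for illustration, not details taken from the slides.

```python
import random


def gem_weights(theta, k, rng):
    """Break a unit stick k times; return the k pieces and the leftover length."""
    weights = []
    remaining = 1.0  # length of stick not yet assigned to a species
    for _ in range(k):
        w = rng.betavariate(1.0, theta)  # relative break location W_i
        weights.append(w * remaining)    # e.g. W_2 * (1 - W_1) for the 2nd piece
        remaining *= 1.0 - w
    return weights, remaining


if __name__ == "__main__":
    rng = random.Random(0)
    pieces, leftover = gem_weights(theta=5.0, k=50, rng=rng)
    print(sorted(pieces, reverse=True)[:5], leftover)
```

The pieces plus the leftover always sum to one, so truncating at `k` pieces simply lumps all remaining (rare) species into `leftover`.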

An interesting historical paper

Where neutral theory fails. Empirical species accumulation curves are often power-law, not logarithmic. The neutral model with point speciation produces too many rare species, and drift is too slow to wipe away abundant species (Ricklefs 2003, Nee 2005).

Hubbell's proposal. [Figure labels: regional pool, with speciation rate $\theta$; immigration, $m$; (dispersal-limited) local sample] This model does a nice job of predicting species abundance distributions, but still has problems reconciling ecological and macroevolutionary timescales. Hubbell, The unified neutral theory of biodiversity and biogeography (2001). See Harris et al. IEEE (105, 2017) for a nice general treatment of this model.

One solution: protracted speciation. In a forward-in-time (Moran-type) model, $P(j,t)$ is the number of species with abundance $j$ at time $t$, $\nu = \mu/(1+\tau)$ is the speciation rate, and $\tau$ is the time lag to turn a lineage into a proper species. Rosindell et al. Ecol Lett (2010)

Another possible protracted model: urn construction. Pitman, J. (1996). Some developments of the Blackwell-MacQueen urn scheme. Lecture Notes-Monograph Series, 245-267.

Intuitive idea. In a forward-in-time approach, this model allows for the genesis of as many species as in the 'classic' neutral model ($\theta$ is unchanged), but it makes it harder for the rare species to survive (they are picked more rarely, and could go extinct). Note: this model generates a partition structure that is a 'natural' two-parameter generalization of the neutral model [Pitman and Yor 1992].
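
The urn rule itself is not preserved in this transcript. The sketch below assumes the standard two-parameter scheme from Pitman (1996): with $i$ individuals in $k$ species, the next individual founds a new species with probability $(\theta + \sigma k)/(\theta + i)$ and joins existing species $j$ with probability $(n_j - \sigma)/(\theta + i)$, where $0 \le \sigma < 1$. The discount $\sigma$ penalizes rare species most, and $\sigma = 0$ recovers Hoppe's urn. The name `two_parameter_urn` is hypothetical.

```python
import random


def two_parameter_urn(n, theta, sigma, rng):
    """Species abundances for n individuals (theta > 0, 0 <= sigma < 1)."""
    abundances = []
    for i in range(n):
        k = len(abundances)
        # Probability of a new species given i individuals in k species.
        p_new = (theta + sigma * k) / (theta + i)
        if rng.random() < p_new:
            abundances.append(1)
        else:
            # Existing species j is picked with weight n_j - sigma, so a
            # singleton (n_j = 1) is picked less often than under sigma = 0.
            weights = [count - sigma for count in abundances]
            j = rng.choices(range(k), weights=weights)[0]
            abundances[j] += 1
    return abundances


if __name__ == "__main__":
    rng = random.Random(7)
    for sigma in (0.0, 0.3, 0.6):
        sample = two_parameter_urn(2000, 5.0, sigma, rng)
        print(sigma, len(sample), "species")
```

Running the loop for several values of `sigma` gives a quick feel for how the discount changes the number of species in a sample of fixed size.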

Broken stick construction. Reminder (from a few slides above): the construction of a relative species abundance may be done using the following (GEM) procedure [Pitman 1995]. A similar construction exists for the two-parameter model. J. Pitman. Exchangeable and partially exchangeable random partitions. Probability Theory and Related Fields, 102:145–158, 1995.
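
A sketch of the two-parameter construction, assuming the standard specification in which the $i$th break point is drawn independently as $W_i \sim \mathrm{Beta}(1-\sigma, \theta + i\sigma)$ for $i = 1, 2, \dots$ (so that $\sigma = 0$ gives back the one-parameter GEM procedure). The name `two_parameter_gem` and the truncation at `k` pieces are illustrative assumptions.

```python
import random


def two_parameter_gem(theta, sigma, k, rng):
    """Return k stick pieces and the leftover length (0 <= sigma < 1, theta > -sigma)."""
    weights = []
    remaining = 1.0
    for i in range(1, k + 1):
        # The second Beta parameter grows with i, so later breaks take a
        # smaller share of what is left and the tail of rare species is heavier.
        w = rng.betavariate(1.0 - sigma, theta + i * sigma)
        weights.append(w * remaining)
        remaining *= 1.0 - w
    return weights, remaining


if __name__ == "__main__":
    rng = random.Random(1)
    pieces, leftover = two_parameter_gem(theta=5.0, sigma=0.5, k=200, rng=rng)
    ranked = sorted(pieces, reverse=True)  # most to least abundant
    print(ranked[:5], leftover)
```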

Some asymptotic results. [Pitman 1996] As $n \to \infty$, almost surely [Pitman 1996] Let us rank the species from the most to the least abundant in a sample generated by the model above, and write $P(k)$ for the abundance of the $k$th species, then as $n \to \infty$ As $\sigma \to 0$, convergence towards the neutral partition structure (and log-series species abundance distribution) is assured. Pitman, J. (1996). Lecture Notes-Monograph Series, 245-267.

What type of multispecies tree does this model generate? Polytomies are not rare in multispecies trees, and they partly reflect the complex, reticulated history of genes

Realistic multispecies coalescents Degnan and Rosenberg TREE (2009)

One possible idea: synchronous speciation events. [Figure labels: bifurcation, trifurcation, …] The coalescent interpretation means that multiple particles are allowed to collide at once. More precisely, if $b$ particles are present, the rate at which a given set of $k$ of them collides is denoted $\lambda_{k,b}$. In the case $\lambda_{2,b} = 1$ and $\lambda_{k,b} = 0$ for all $k \neq 2$, this is equivalent to the neutral coalescent (see above). Berestycki, N. (2009). Recent progress in coalescent theory. Ensaios Matematicos, 16(1), 1-193.

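Theory for this model. Representation theorem (Pitman 1999). Defining $\Lambda$ to be a finite measure on $[0,1]$, the $\lambda_{k,b}$ are uniquely defined by $\lambda_{k,b} = \int_0^1 x^{k-2} (1-x)^{b-k} \, \Lambda(dx)$. This is called a $\Lambda$-coalescent (equal to the 'neutral' coalescent if $\Lambda = \delta_0$, the Dirac mass at 0).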
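
To make the representation concrete, the sketch below evaluates the collision rates for one particular choice of measure. It assumes $\Lambda$ is the Beta($\sigma, 2-\sigma$) probability distribution with $0 < \sigma < 1$, picked here only because its density behaves like $x^{\sigma-1}$ near 0. Under that assumption the integral $\int_0^1 x^{k-2}(1-x)^{b-k}\,\Lambda(dx)$ reduces to a ratio of Beta functions. The function names are hypothetical.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b) for a, b > 0."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_lambda_rate(k, b, sigma):
    """Rate at which a given k-tuple among b lineages merges.

    Assumes Lambda = Beta(sigma, 2 - sigma), which gives
    B(k - 2 + sigma, b - k + 2 - sigma) / B(sigma, 2 - sigma).
    """
    if not 2 <= k <= b:
        raise ValueError("need 2 <= k <= b")
    return exp(log_beta(k - 2 + sigma, b - k + 2 - sigma)
               - log_beta(sigma, 2 - sigma))


def kingman_rate(k, b):
    """Neutral (Kingman) case, Lambda = Dirac mass at 0: only pairs merge."""
    return 1.0 if k == 2 else 0.0


if __name__ == "__main__":
    for k in (2, 3, 4):
        print(k, beta_lambda_rate(k, 10, 0.5), kingman_rate(k, 10))
```

Any valid family of rates satisfies the consistency relation $\lambda_{k,b} = \lambda_{k,b+1} + \lambda_{k+1,b+1}$, which follows from the integral by writing $1 = (1-x) + x$ and gives a simple numerical check.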

Asymptotic results. Results [Berestycki, Berestycki & Limic 2014]: Assume a $\Lambda$-coalescent is constructed and $n$ leaves are sampled. Assume 'speciation' events are drawn upon the tree at a (Poisson) rate $\theta$. Assume that $\Lambda(dx) \sim A x^{\sigma-1} dx$, with $1 > \sigma \geq 0$. Then: ($B(\sigma)$ is a known function) Berestycki, J., Berestycki, N., & Limic, V. (2014). Asymptotic sampling formulae for $\Lambda$-coalescents. Annales de l'Institut Henri Poincaré, Probabilités et Statistiques, 50(3), 715-731.

Summary. Two-parameter random-partition model; $\Lambda$-coalescent.

Questions. Is there an equivalence between the $\Lambda$-coalescent sampling scheme and the Pitman sampling formula? Is there a generative process for the random partition created in the $\Lambda$-coalescent?

Work ahead: a case for birds. McCormack et al. PLoS Biol (2013)

Exploring the polytomies of a phylogeny Jetz et al. Nature (2012) Jarvis et al. Science (2014)

Idea. Compute the values of $\lambda_{k,b}$ for $k=2$, $k=3$, $k=4$, and infer $\sigma$ by inverting
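
The relation to be inverted is not preserved in this transcript. As one hypothetical illustration of the idea, suppose $\Lambda$ is the Beta($\sigma, 2-\sigma$) distribution. This is an assumption made here, not a choice stated in the slides. Then the ratio of trifurcation to bifurcation rates is $\lambda_{3,b}/\lambda_{2,b} = \sigma/(b-1-\sigma)$, which can be solved for $\sigma$ in closed form. The function names below are likewise hypothetical.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b) for a, b > 0."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_lambda_rate(k, b, sigma):
    """lambda_{k,b} under the assumed Lambda = Beta(sigma, 2 - sigma)."""
    return exp(log_beta(k - 2 + sigma, b - k + 2 - sigma)
               - log_beta(sigma, 2 - sigma))


def sigma_from_ratio(ratio, b):
    """Invert ratio = lambda_{3,b} / lambda_{2,b} = sigma / (b - 1 - sigma)."""
    if b < 3:
        raise ValueError("need at least 3 lineages to observe a trifurcation")
    return ratio * (b - 1) / (1.0 + ratio)


if __name__ == "__main__":
    b, true_sigma = 10, 0.3
    ratio = beta_lambda_rate(3, b, true_sigma) / beta_lambda_rate(2, b, true_sigma)
    print(sigma_from_ratio(ratio, b))  # recovers true_sigma up to rounding
```

With real phylogenies the ratio would have to be estimated from counts of bifurcations and trifurcations, so the recovered $\sigma$ would carry sampling error that this sketch ignores.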

Thank you jerome.chave@univ-tlse3.fr