Species as datapoints Comparative Methods Biology 683 Heath Blackmon

Slides:



Advertisements
Similar presentations
Analysis of variance and statistical inference.
Advertisements

Testing methods & comparing models using continuous and discrete character simulation Liam J. Revell.
Phylogenetic comparative trait and community analyses.
On board do traits fit B.M. model? can we use model fitting to answer evolutionary questions? pattern vs. process table.
New phylogenetic methods for studying the phenotypic axis of adaptive radiation Liam J. Revell University of Massachusetts Boston.
An Introduction to Phylogenetic Methods
Interpreting Cladograms Notes
Phylogenetic reconstruction
Maximum Likelihood. Likelihood The likelihood is the probability of the data given the model.
Molecular Evolution Revised 29/12/06
Current Approaches to Whole Genome Phylogenetic Analysis Hongli Li.
BIOE 109 Summer 2009 Lecture 4- Part II Phylogenetic Inference.
Resampling techniques Why resampling? Jacknife Cross-validation Bootstrap Examples of application of bootstrap.
Distance Methods. Distance Estimates attempt to estimate the mean number of changes per site since 2 species (sequences) split from each other Simply.
Maximum Likelihood Flips usage of probability function A typical calculation: P(h|n,p) = C(h, n) * p h * (1-p) (n-h) The implied question: Given p of success.
Measuring Evolution Evidence for evolution –Estimate natural selection in the wild –Trace fossils –Infer phylogeny from behavior –Infer behavioral evolution.
1 Bayesian inference of genome structure and application to base composition variation Nick Smith and Paul Fearnhead, University of Lancaster.
Introduction to Monte Carlo Methods D.J.C. Mackay.
1 Advances in Statistics Or, what you might find if you picked up a current issue of a Biological Journal.
Phylogeny Estimation: Traditional and Bayesian Approaches Molecular Evolution, 2003
Comparative methods: Using trees to study evolution.
Midterm mean = 38.9 ± 12.9 >50 = A >40 = B>29 = C.
Chapter 14 Monte Carlo Simulation Introduction Find several parameters Parameter follow the specific probability distribution Generate parameter.
Homologous Structures vs. Analogous Structures
Tree Confidence Have we got the true tree? Use known phylogenies Unfortunately, very rare Hillis et al. (1992) created experimental phylogenies using phage.
Bioinformatics 2011 Molecular Evolution Revised 29/12/06.
Why species are not independent data points and what to do about it: correlation on independent contrasts.
GENE 3000 Fall 2013 slides More geologists agree that the age of the Earth is ~4.5 billion years old geneticists have independent data suggesting.
Molecular phylogenetics 4 Level 3 Molecular Evolution and Bioinformatics Jim Provan Page and Holmes: Sections
Introduction to Phylogenetics
Calculating branch lengths from distances. ABC A B C----- a b c.
Patterns of divergent selection from combined DNA barcode and phenotypic data Tim Barraclough, Imperial College London.
Christian Arnold Bioinformatics Group, University of Leipzig Bioinformatics Herbstseminar October 23th, 2009 Three Weeks of Experience at the formatics.
Why phylogenetics? Barbara Holland School of Physical Sciences University of Tasmania.
ONE thing I wish I knew. How can we allow for extinction and non- neutral evolution in using phylogenies?** * ancestral state reconstruction, deriving.
Phylogenetics.
Ayesha M.Khan Spring Phylogenetic Basics 2 One central field in biology is to infer the relation between species. Do they possess a common ancestor?
Review of statistical modeling and probability theory Alan Moses ML4bio.
Statistical Methods. 2 Concepts and Notations Sample unit – the basic landscape unit at which we wish to establish the presence/absence of the species.
Phylogeny.
Bioinf.cs.auckland.ac.nz Juin 2008 Uncorrelated and Autocorrelated relaxed phylogenetics Michaël Defoin-Platel and Alexei Drummond.
Comparative methods wrap-up and “key innovations”.
Building Phylogenies. Phylogenetic (evolutionary) trees Human Gorilla Chimp Gibbon Orangutan Describe evolutionary relationships between species Cannot.
Darwin’s Tree of Life, July million species Phylogenetic inference from genomic.
Phylogenetic comparative methods Comparative studies (nuisance) Evolutionary studies (objective) Community ecology (lack of alternatives)
Lecture 19 – Species Tree Estimation
Section 2: Modern Systematics
Applied statistics Usman Roshan.
Phylogeny and the Tree of Life
Evolutionary genomics can now be applied beyond ‘model’ organisms
The general linear model and Statistical Parametric Mapping
Section 2: Modern Systematics
Bayesian inference Presented by Amir Hadadi
Endeavour to reconstruct the characters of each hypothetical ancestor.
Summary and Recommendations
Statistical Analysis Error Bars
Adaptive Evolution of Gene Expression in Drosophila
Reading Phylogenetic Trees
The general linear model and Statistical Parametric Mapping
Margarita V. Chibalina, Dmitry A. Filatov  Current Biology 
The Most General Markov Substitution Model on an Unrooted Tree
Morphological Phylogenetics in the Genomic Age
Chapter 14 Monte Carlo Simulation
18.2 Modern Systematics I. Traditional ______________
Modern Comparative Methods
Interpreting Cladograms Notes
Phylogeny and the Tree of Life
Summary and Recommendations
1 2 Biology Warm Up Day 6 Turn phones in the baskets
Evolution Biology Mrs. Johnson.
Presentation transcript:

Species as datapoints Comparative Methods Biology 683 Heath Blackmon Joe Felsenstein

The Problem Does body size impact brain size? Does flight effect speciation rate? Does the evolution of some traits make species more likely to go extinct? Do structures in predators and prey coevolve? Are the expression level of some genes maintained over evolutionary time scales? How many times have eyes evolved? Some questions are hard to ask with experimental studies!

The Problem An example from R

Solutions Calculating independent contrasts Only good for two continuous traits Treating a phylogeny as a covariance matrix PGLS Creating a process based model that explicitly incorporates the phylogeny into the likelihood calculation BAMM Diversitree Simulation using a monte carlo approach

Independent Contrasts library(ape) pic(tree, x)

Assumptions If our trait does not fit a Brownian motion model then we may have biased results. If our branch lengths are inaccurate we will have error. If we misplace taxa we will have error and reduced power.

Phylogeny as a covariance matrix B C D 1 0.66 0.33 Species A Species B Species C Species D 0 1 2 3 Mathematically it can be shown that phylogenetic independent contrasts are a special case of this approach

Phylogeny as a covariance matrix library(ape) Library(nlme) pglsModel <- gls(y ~ x, correlation = corBrownian(phy), data = data) This is a less restricted approach because it allows us to fit different correlation structures. Still can be difficult/impossible to deal with discrete variables.

Process based model Process based models can be developed where we calculate the likelihood of our data given a model and the tip states. These models can include things like speciation, extinction, sampling bias, discrete and continuous characters.

Example with chromosomes 10 11 22 23

Example with chromosomes With process based models we are usually interested in the parameters of the model and do model comparison to find the best model or Bayesian analyses where we compare parameters in the model

Example from beetles Single parameter: σ2 the “drift” parameter vs. σ2 σ2 & σ2 From Revell 2008

Effect of wing loss on rate of chromosome evolution Example from beetles Effect of wing loss on rate of chromosome evolution 2 rate model preferred on all 500 trees max p-val < 0.01 Winged Wingless Winged σ2 = 1.5 Wingless σ2 = 9.1

Estimation via Monte Carlo Using regular stats test then simulating a lot of data and getting a null distribution of stats (like the f-statistic if you are doing anova). These are implemented for most things you might want. APE, Geiger, and Phytools Simulating data and getting a null distribution for any value you are interested in.

Estimation via Monte Carlo Using existing stats: library(geiger) fit <- aov.phylo(y~x, phy, nsim=100) Creating your own tests and Monte Carlos isn’t hard

An example from Hymenoptera Hypothesis: Eusociality should lead to selection for higher recombination and by extension higher chromosome number. Sherman 1979

An example from Hymenoptera On average one crossover per arm per meiosis: 1 3 4 5 6 2 With 1 chromosome: 5 Genotypes ♂ ♀ With 3 chromosomes: 1 3 4 5 6 2 8 Genotypes ♂ ♀

An example from Hymenoptera Standard ANOVA p-value < 0.001 Phylo - ANOVA F-statistic = 70.5, p-value = 0.23  Eusocial  Solitary 1 60 Chromosomes

Origins of haplodiploidy Bull’s hypothesis Species with relatively few chromosomes should experience transitions to haplodiploidy more frequently than species with many chromosomes. Bull 1979, 1983

Origins of haplodiploidy 2 XY ♂ ♀ 1 X 2 X Y 16% of genome 1 X 2 ♂ Fusion between chromosome 2 and X; followed by degeneration of neoY 1 XY ♂ ♀ 1 X X Y 50% of genome ♂ 1 X

Origins of haplodiploidy  Diplodiploid  Haplodiploid 4 29 Phylo - ANOVA F-statistic = 97.8, p-value < 0.001

Ancestral condition test 1 15

Ancestral condition test Diplodiploid Haplodiploid

Ancestral condition test 7 8 9 10 11

Section 4: Origins of haplodiploidy P-value = 0.019 Null

Resources R CRAN phylogenetics task view Inferring phylogenies Bodega Bay Phylogenetics Workshop Me if you’ve already googled a little

For Thursday Bring laptop to class! Heath Blackmon BSBW 309 coleoguy@gmail.com @coleoguy