New phylogenetic methods for studying the phenotypic axis of adaptive radiation Liam J. Revell University of Massachusetts Boston.

Presentation transcript:

New phylogenetic methods for studying the phenotypic axis of adaptive radiation Liam J. Revell University of Massachusetts Boston

Outline: 1. The ‘phytools’ package. 2. New approaches for the analysis of quantitative trait data: a) Phylogenetic analysis of the evolutionary correlation. b) Bayesian method for locating rate shifts in the tree. c) Incorporating intraspecific variability in phylogenetic analyses. 3. Luke!

Phylogenetics in R

Major functions of ‘phytools’: simulation; visualization; tree input/output/manipulation; inference; comparative biology.

The evolutionary correlation (Revell & Collar 2009, Evolution)

[Figure omitted; legend: non-piscivorous, piscivorous.]

Likelihood for the two-correlation model (Revell & Collar 2009, Evolution)
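The equation on this slide did not survive in the transcript. As a sketch of what such a likelihood usually looks like for multivariate Brownian motion with two rate matrices, assuming n species, m traits, a stacked trait vector y, a design matrix D, ancestral states a, and matrices C_1 and C_2 holding the shared branch lengths spent in each regime (all of this notation is an assumption, not taken from the slide):

```latex
L(\mathbf{a}, \mathbf{R}_1, \mathbf{R}_2) =
  \frac{\exp\!\left[-\tfrac{1}{2}
        (\mathbf{y} - \mathbf{D}\mathbf{a})^{\top}
        \mathbf{V}^{-1}
        (\mathbf{y} - \mathbf{D}\mathbf{a})\right]}
       {\sqrt{(2\pi)^{nm}\,\lvert \mathbf{V} \rvert}},
\qquad
\mathbf{V} = \mathbf{R}_1 \otimes \mathbf{C}_1 + \mathbf{R}_2 \otimes \mathbf{C}_2
```

Setting R_1 = R_2 recovers the one-matrix model, which is why the two models can be compared with a likelihood ratio test.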

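Table. Model selection for the one and two rate matrix models (Revell & Collar 2009, Evolution). Most numeric entries were lost in extraction.
Columns: Model | r | log(L) | AICc
One matrix model: R = [entries lost]
Two matrix model: R1 (non-piscivory) = [entries lost]; R2 (piscivory) = [entries lost]; one surviving value in this row: 0.779
Likelihood ratio test: -2·log(L1/L2) = 19.94; P(χ², df = 3) < [threshold lost]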
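The P-value bound in the table was lost, but it can be recomputed from the two numbers that survived. The sketch below uses only base R; the variable names are illustrative. Reading df = 3 as the three extra free parameters of a second 2 × 2 rate matrix is an assumption about how the table was built.

```r
# Likelihood ratio statistic and degrees of freedom as reported on the slide
lr_stat <- 19.94
df <- 3

# Upper-tail probability of a chi-square distribution with df degrees of freedom
p_value <- pchisq(lr_stat, df = df, lower.tail = FALSE)
print(p_value)
```

The result is well below 0.001, consistent with the slide's favoring of the two-matrix model.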

Bayesian MCMC method for rate variation (Revell et al. 2012)

[Figures omitted; surviving labels: "x 20" and "? ? ?".]

Bayesian MCMC chain: evol.rate.mcmc()
1. Starting values: the evolutionary rates (σ₁², σ₂²) and the rate-shift location.
2. Proposal: propose a new rate shift (or new rates).
3. Compute the posterior odds ratio of the proposal to the current state, and set α = min(1, posterior odds ratio).
4. Retain the proposal with probability α; reject it with probability 1 - α.
5. Repeat. The chain yields the posterior sample.
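The accept/reject loop can be sketched in base R. This is a generic Metropolis sampler on a toy target, not the internals of evol.rate.mcmc(): the target (a standard normal), the function names, and the symmetric normal proposal are all assumptions chosen for illustration.

```r
set.seed(1)

# Hypothetical target: unnormalized log posterior of a standard normal
log_post <- function(theta) -0.5 * theta^2

run_chain <- function(log_post, start, ngen, prop_sd = 1) {
  chain <- numeric(ngen)
  current <- start
  for (i in seq_len(ngen)) {
    # Symmetric proposal, so the acceptance ratio is just the posterior odds
    proposal <- current + rnorm(1, sd = prop_sd)
    alpha <- min(1, exp(log_post(proposal) - log_post(current)))
    # Retain the proposal with probability alpha, otherwise keep the current state
    if (runif(1) < alpha) current <- proposal
    chain[i] <- current
  }
  chain
}

chain <- run_chain(log_post, start = 3, ngen = 20000)
post <- chain[-(1:1000)]  # discard burn-in
print(c(mean = mean(post), sd = sd(post)))
```

Note that a rejected proposal still contributes a sample: the current state is recorded again.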

Bayesian MCMC chain

MCMC proposal. Each state is a rate-shift point with two evolutionary rates: the rate tipward of the shift (σ₂², in this case) and the rate rootward of it (σ₁²).

MCMC proposal (rates σ₁² and σ₂² on either side of the shift): 1. Propose the shift from an exponential distribution. 2. Go right or left with equal probability; reflect back down the tree from the tips.
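The reflection idea can be sketched on a single root-to-tip path. This simplification is an assumption made for brevity: the real proposal moves on a branching tree and picks a daughter edge at each node, whereas here the "tree" is one line of length path_length, and the function name and rate argument are hypothetical.

```r
set.seed(42)

# Move a shift point along a path of length `path_length`:
# the distance comes from an exponential distribution, the direction
# (tipward or rootward) is chosen with equal probability, and moves
# that overshoot either end are reflected back.
propose_shift <- function(pos, path_length, rate = 1) {
  step <- rexp(1, rate = rate)
  direction <- sample(c(-1, 1), 1)
  new_pos <- pos + direction * step
  repeat {
    if (new_pos > path_length) {
      new_pos <- 2 * path_length - new_pos  # reflect from the tip
    } else if (new_pos < 0) {
      new_pos <- -new_pos                   # reflect from the root
    } else {
      break
    }
  }
  new_pos
}

proposals <- replicate(1000, propose_shift(pos = 0.8, path_length = 1))
print(range(proposals))
```

Reflection keeps every proposal on the path without discarding any draws.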

Averaging the posterior sample: min.split(). To find the median shift point in our sample, we first computed the patristic distance between all the shifts in our posterior sample. We then picked the shift with the lowest summed distance to all the other samples. (We might instead have found the shift with the lowest sum of squared distances, or found a point on the tree that minimized the sum of squared distances.)
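Picking the sample with the lowest summed distance amounts to picking a medoid from a distance matrix. The base R sketch below assumes the pairwise distances are already in a matrix D; the shift positions are made-up points on a line standing in for patristic distances on a tree, and median_shift is a hypothetical helper, not the package function.

```r
# Return the index of the sampled shift that minimizes the summed
# (or summed squared) distance to all other sampled shifts.
median_shift <- function(D, method = c("sum", "sumsq")) {
  method <- match.arg(method)
  total <- if (method == "sum") rowSums(D) else rowSums(D^2)
  which.min(total)
}

# Hypothetical shift positions; dist() stands in for patristic distance
pos <- c(0.1, 0.2, 0.25, 0.3, 2.0)
D <- as.matrix(dist(pos))

median_shift(D)           # criterion used on the slide
median_shift(D, "sumsq")  # squared-distance alternative
```

With these toy positions the two criteria pick different samples, because squaring gives the outlying shift more pull.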

Averaging the posterior sample: min.split(). We can also compute the posterior probability of the shift being on any given edge. We calculate this as the frequency of that edge in the posterior sample.
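As a minimal illustration of the frequency calculation, assuming each posterior sample records the edge on which the shift falls (the edge numbers below are invented):

```r
# Hypothetical edge index of the rate shift in each of 10 posterior samples
sampled_edges <- c(12, 12, 7, 12, 7, 3, 12, 12, 7, 12)

# Posterior probability of each edge = its relative frequency in the sample
edge_pp <- table(sampled_edges) / length(sampled_edges)
print(edge_pp)
```

Edges that never appear in the sample get an estimated posterior probability of zero.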

Averaging the posterior: posterior.evolrate(). Averaging the posterior sample of rates is also non-trivial, because the sample is a mixture of rates that apply to different parts of the tree and to different tips. How can we average the rates from these different samples?

Averaging the posterior: posterior.evolrate()

[Figures omitted: diagrams marking shift points (X, x) and the rates σ₁² and σ₂² on either side of them.]

In this case, one σ₂² equals the other σ₂², while one σ₁² exceeds the other σ₁² (the marks that distinguished the two versions of each rate did not survive).

Identification of the “correct” edge. Somewhat surprisingly, for a given rate shift, identification of the “correct” edge was effectively independent of the number of tips in the tree. However, the relative patristic distance from the true shift point does decline with increasing N.

Estimating the evolutionary rates. Estimates of the evolutionary rates (and of their ratio) become less biased as N increases. For small N the two rates tend to be biased towards each other, which we think is a natural consequence of integrating over uncertainty in the location of the rate shift.

Introduction. Phylogenetic comparative analyses are conducted with species means, but in empirical studies those means are uncertain estimates obtained by measuring one or a few individuals. Ignoring intraspecific variability can cause bias in various types of comparative analysis. Our solution: sample the species means and variances, together with the parameters of the evolutionary model, from their joint posterior distribution using Bayesian MCMC.

First, we need an equation for the likelihood:
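The equation itself is missing from the transcript. One plausible form, offered only as a sketch, multiplies the probability of the individual measurements given the species means and variances by the probability of the species means under Brownian motion. The notation is assumed: x_ij is individual j of species i, x̄_i and v_i are the mean and variance of species i, a is the root state, σ² the Brownian rate, and C the matrix of shared branch lengths.

```latex
L(\bar{\mathbf{x}}, \mathbf{v}, \sigma^2, a \mid \mathbf{x}, \mathbf{C}) =
  \left[\prod_{i=1}^{N} \prod_{j=1}^{n_i}
        \mathcal{N}\!\left(x_{ij};\ \bar{x}_i,\ v_i\right)\right]
  \times
  \mathrm{MVN}\!\left(\bar{\mathbf{x}};\ a\mathbf{1},\ \sigma^2 \mathbf{C}\right)
```

The first factor ties each species mean to its own measurements; the second lets related species inform one another through the tree.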

How does it work?

How does it work? MCMC with phytools::fitBayes produces a posterior sample (Revell & Reynolds, 2012).

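How does it work? [Console output omitted: "> results" printed columns including gen, sig2, a, t1, ...; the numeric values were lost.]

How does it work? [Figure omitted; axis labels: "xbar" and "estimated means".]

How does it work? [Console output omitted: "> SS.bayes" and "> SS.arith" each printed a single value; both values were lost.]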
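The SS.bayes and SS.arith values were lost, but the comparison they represent is presumably a sum of squared differences between each set of estimated means and the generating means. A minimal sketch in base R, with invented numbers and hypothetical variable names (nothing here reproduces the slide's results):

```r
# Sum of squared differences between estimated and generating species means
sum_sq_error <- function(estimate, truth) sum((estimate - truth)^2)

# Hypothetical generating means and two competing sets of estimates
xbar_true <- c(1.0, 2.0, 3.0)
est_a <- c(1.1, 1.9, 3.2)  # stand-in for one estimator
est_b <- c(1.5, 1.4, 3.6)  # stand-in for the other

ss_a <- sum_sq_error(est_a, xbar_true)
ss_b <- sum_sq_error(est_b, xbar_true)
print(c(ss_a = ss_a, ss_b = ss_b))
```

The estimator with the smaller sum of squares tracks the generating means more closely; dividing by the number of species gives the mean square error.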

Is this result general? YES! [Figure omitted: mean square error (MSE) of the Bayesian means and of the arithmetic means, plotted against the generating σ.]
