Phylogenetic comparative methods Comparative studies (nuisance) Evolutionary studies (objective) Community ecology (lack of alternatives)

Presentation transcript:

Current growth of phylogenetic comparative methods: new statistical methods, availability of phylogenies, culture

One of many possible types of problems, or as a special case. This model structure can be used for a variety of types of problems.

Assumptions:
y takes continuous values
x can be a random variable or a set of known values (continuous or not)
y is linearly related to x
ε are random variables with expectation 0 and finite (co)variances that are known

Statistical methods
(P)IC = GLS: phylogenetic independent contrasts, generalized least squares (these are methods, not models)
Other methods for other statistical models: ML, REML, EGLS, GLM, GLMM, GEE, “Bayesian” methods

ε are random variables with expectation 0 and finite (co)variances that are known
Phylogeny provides a hypothesis for these covariances

Close Relatives Tend to Resemble Each Other

What does this represent? How is it constructed? Is it known for certain?

Assume that this represents time and is known without error. Translate it into the pattern of covariances in ε among species: V

Hypothetical trait for a single species under Brownian motion evolution
[Figure: trait value against time, showing several possible courses of evolution]

Brownian motion evolution gives the hypothetical variance of a trait
[Figure: trait value against time; the variance among possible courses of evolution grows with time]
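The idea that the variance of a trait under Brownian motion grows with time can be checked with a small simulation. This is only a sketch: the function names, the rate of 1.0 per unit time, the step size, and the number of simulated paths are all assumptions chosen for illustration, not values from the slides.

```python
import random
import statistics


def simulate_bm(total_time, rate=1.0, steps_per_unit=20, rng=random):
    """Return the final trait value of one Brownian-motion path that starts at 0."""
    dt = 1.0 / steps_per_unit
    n_steps = int(round(total_time * steps_per_unit))
    value = 0.0
    for _ in range(n_steps):
        # Each small step adds a normal deviate with variance rate * dt.
        value += rng.gauss(0.0, (rate * dt) ** 0.5)
    return value


def variance_at(total_time, n_paths=4000, seed=1):
    """Estimate the variance among many independent paths after total_time."""
    rng = random.Random(seed)
    finals = [simulate_bm(total_time, rng=rng) for _ in range(n_paths)]
    return statistics.pvariance(finals)


if __name__ == "__main__":
    for t in (1, 2, 4):
        # The estimated variance should be close to rate * t.
        print(t, round(variance_at(t), 2))
```

Each run of `simulate_bm` is one "possible course of evolution"; the variance across many such runs is what the slide calls the hypothetical variance of the trait.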

Brownian motion evolution of a hypothetical trait during speciation

Variance between species = Time
Total variance = Total time
Covariance = Shared time

Brownian motion
Covariance matrix giving phylogenetic covariances among species:
diagonal elements give the total variance for species i
off-diagonal elements give covariances between species i and species j
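As a rough illustration of how such a matrix could be assembled, the sketch below walks a small tree and records, for each pair of species, the branch length they share from the root. The nested-tuple tree encoding, the function name `bm_covariance`, and the four-species example tree are all hypothetical, and the rate of evolution is assumed to be 1.

```python
def bm_covariance(tree):
    """Return (tip_names, V) for a tree under Brownian motion with rate 1.

    A node is (branch_length, content), where content is either a tip name
    (str) or a list of child nodes. This encoding is an assumption made for
    the example.
    """
    tips = []
    cov = {}

    def visit(node, depth):
        length, content = node
        depth_here = depth + length  # distance from the root to this node
        if isinstance(content, str):
            tips.append(content)
            cov[(content, content)] = depth_here  # total variance = total time
            return [content]
        groups = [visit(child, depth_here) for child in content]
        # Tips in different child groups last shared an ancestor at this node,
        # so their covariance is the shared time from the root to this node.
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                for i in groups[a]:
                    for j in groups[b]:
                        cov[(i, j)] = cov[(j, i)] = depth_here
        return [t for g in groups for t in g]

    visit(tree, 0.0)
    return tips, [[cov[(i, j)] for j in tips] for i in tips]


# Hypothetical tree: ((A:1, B:1):2, (C:2, D:2):1), total height 3.
example_tree = (0.0, [
    (2.0, [(1.0, "A"), (1.0, "B")]),
    (1.0, [(2.0, "C"), (2.0, "D")]),
])

if __name__ == "__main__":
    names, v = bm_covariance(example_tree)
    print(names)
    for row in v:
        print(row)
```

In this example A and B share two units of history, C and D share one, and species on opposite sides of the root share none, so their covariance is zero.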

I am confused by the authors' use of "branch lengths" on page I'm not sure if "different types of branch lengths" means different phylogenetic analyses or something else I'm not aware of.

Digression: non-Brownian models of evolution

Ornstein-Uhlenbeck evolution: stabilizing selection with strength given by d

Variance between species < Time

Total variance << Total time

Ornstein-Uhlenbeck evolution
[Figure: variance against time]
Stabilizing selection means information is “lost” through time
Phylogenetic correlations between species decrease
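The loss of information under stabilizing selection can be sketched with a simple discrete-time model. The parameterization here is an assumption, not taken from the slides: each time step the trait is multiplied by d and a random deviate with variance sigma2 is added, so that d = 1 gives Brownian motion and d < 1 pulls the trait toward 0. The function names are hypothetical.

```python
def ou_variance(steps, d, sigma2=1.0):
    """Variance of the trait after `steps` time steps, starting from a fixed value.

    Assumed model: x[t + 1] = d * x[t] + e[t], with Var(e[t]) = sigma2.
    """
    if d == 1.0:
        return sigma2 * steps  # Brownian motion: variance = time
    return sigma2 * (1 - d ** (2 * steps)) / (1 - d ** 2)


def ou_correlation(shared_steps, total_steps, d):
    """Correlation between two species that split after `shared_steps`."""
    independent_steps = total_steps - shared_steps
    # Variation accumulated before the split is damped by d on each later step
    # in both lineages, hence the factor d ** (2 * independent_steps).
    covariance = d ** (2 * independent_steps) * ou_variance(shared_steps, d)
    return covariance / ou_variance(total_steps, d)


if __name__ == "__main__":
    for d in (1.0, 0.99, 0.9):
        print(d, ou_variance(100, d), ou_correlation(50, 100, d))
```

Under this assumed model the total variance levels off rather than growing with time, and two species that share half of their history are much less correlated than the 0.5 expected under Brownian motion.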

Phylogenetic Signal (Blomberg, Garland, and Ives 2003)  measures the strength of signal OU process

Assumptions:
y takes continuous values
x can be a random variable or a set of known numbers
y is linearly related to x
ε are random variables with expectation 0 and finite (co)variances that are known
If d must be estimated, the data cannot be analyzed using PIC or GLS

If we are dealing with a recent, rapid radiation (a supported clade but with short branches), will the lack of branch length data render any PIC biologically uninformative, because we would expect non-significant probabilities based on the branch lengths alone? (Page 3022, second paragraph.)

Phylogenetic Signal (Blomberg, Garland, and Ives 2003)  measures the strength of signal OU process

Statistical methods
(P)IC = GLS: phylogenetic independent contrasts, generalized least squares (these are methods, not models)
Other methods for other statistical models: ML, REML, EGLS, GLM, GLMM, GEE, “Bayesian” methods

PIC
[Figure: phylogeny with tip values y1, y2, y3, y4]

Regression through the origin
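A compact sketch of how independent contrasts and the regression through the origin could be computed is given below. It follows the standard contrast algorithm for a fully bifurcating tree; the nested-tuple tree encoding, the function names, and the trait values are all hypothetical and chosen only to make the example run.

```python
import math


def contrasts(tree, values):
    """Return standardized independent contrasts for a bifurcating tree.

    A node is (branch_length, content), where content is a tip name (str)
    or a list of exactly two child nodes. `values` maps tip names to traits.
    """
    out = []

    def visit(node):
        length, content = node
        if isinstance(content, str):
            return values[content], length
        (x1, v1), (x2, v2) = visit(content[0]), visit(content[1])
        # Contrast between the two daughters, scaled by its expected SD.
        out.append((x1 - x2) / math.sqrt(v1 + v2))
        # Weighted estimate at the node; closer daughters get more weight.
        estimate = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)
        # Lengthen the branch below the node to reflect estimation error.
        return estimate, length + v1 * v2 / (v1 + v2)

    visit(tree)
    return out


def slope_through_origin(cx, cy):
    """Least-squares slope of cy on cx with the intercept fixed at 0."""
    return sum(a * b for a, b in zip(cx, cy)) / sum(a * a for a in cx)


# Hypothetical tree ((A:1, B:1):2, (C:2, D:2):1) and made-up trait values.
example_tree = (0.0, [
    (2.0, [(1.0, "A"), (1.0, "B")]),
    (1.0, [(2.0, "C"), (2.0, "D")]),
])
x = {"A": 1.0, "B": 3.0, "C": 2.0, "D": 6.0}
y = {"A": 2.1, "B": 5.9, "C": 4.2, "D": 12.1}

if __name__ == "__main__":
    cx, cy = contrasts(example_tree, x), contrasts(example_tree, y)
    print(len(cx), slope_through_origin(cx, cy))
```

Four species give three contrasts. The intercept is fixed at zero because each contrast is a difference with expectation zero, and its sign depends only on the arbitrary order in which the two daughters are subtracted.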

PIC
You could also use different branch lengths for x:
[Figure: branch lengths of y; branch lengths of x]
When could this be justified? Never (?)

Statistical methods
(P)IC = GLS: phylogenetic independent contrasts, generalized least squares (these are methods, not models)
Other methods for other statistical models: ML, REML, EGLS, GLM, GLMM, GEE, “Bayesian” methods

Elements of V are given by shared branch lengths under the assumption of “Brownian motion” evolution

Generalized Least Squares, GLS

Ordinary least squares: V = I

Related to ordinary least squares

Values of are linear combinations of y i
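To make the relation between GLS and ordinary least squares concrete, here is a small sketch that computes the usual GLS estimate, b = (X' V^-1 X)^-1 X' V^-1 y, for a straight-line regression using only the standard library. The helper names, the example covariance matrix, and the data are hypothetical; V is assumed to be known, symmetric, and invertible.

```python
def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination; a and b are lists of rows."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [val / p for val in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def transpose_times(x, m):
    """Return x^T @ m for matrices stored as lists of rows."""
    return [[sum(x[k][i] * m[k][j] for k in range(len(x)))
             for j in range(len(m[0]))] for i in range(len(x[0]))]


def gls(x_values, y_values, v):
    """Return (intercept, slope) of the GLS fit of y on x with covariance v."""
    design = [[1.0, xi] for xi in x_values]
    y_col = [[yi] for yi in y_values]
    vinv_x = solve(v, design)  # V^-1 X without forming V^-1 explicitly
    vinv_y = solve(v, y_col)   # V^-1 y
    coef = solve(transpose_times(design, vinv_x),
                 transpose_times(design, vinv_y))
    return coef[0][0], coef[1][0]


# Hypothetical Brownian-motion covariances for four species, and made-up data.
V = [[3.0, 2.0, 0.0, 0.0],
     [2.0, 3.0, 0.0, 0.0],
     [0.0, 0.0, 3.0, 1.0],
     [0.0, 0.0, 1.0, 3.0]]
IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
xs = [1.0, 3.0, 2.0, 6.0]
ys = [2.1, 5.9, 4.2, 12.1]

if __name__ == "__main__":
    print("GLS:", gls(xs, ys, V))
    print("OLS (V = I):", gls(xs, ys, IDENTITY))
```

Passing the identity matrix for V reduces the estimate to ordinary least squares, which is the sense in which OLS is the special case V = I.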

If IC and GLS can yield identical results and the authors refer to IC as "a special case of GLS models" (p. 3032), in what situation(s) would GLS be a more appropriate method? In other words, why not just use IC?

Divergence time for desert and montane ringtail populations assumed to be 10,000 years

Predicting values for ancestral and new species

Is the prediction of the estimate of y for species I more or less precise than what you would expect from a standard regression analysis?

When dealing with multiple, incongruent gene trees, we can perform PICs on each tree and may or may not find a correlation. How do we know which is the "right" answer?

The three main phylogenetically based statistical methods described in the reading (IC, GLS, and Monte Carlo simulations) rely on correct information about tree topology and branch lengths. If we are unsure of the correctness of these basic assumptions, what is the best way to analyze our data?

I'm unclear how data can be statistically significant when transformed, but not significant otherwise. This seems like cheating/lying.

The paper discussed researchers' decisions about branch lengths, especially in terms of transformations (OU, ACDC). Do researchers use ultrametric trees for these analyses?