Most Likely Rates given Phylogeny L[  |001] =  0 x (P[  A ] + P[  B ]) +  1 x (P[  C ] + P[  D ])

Slides:



Advertisements
Similar presentations
CHAPTER 40 Probability.
Advertisements

PHYLOGENETIC TREES Bulent Moller CSE March 2004.
Lecture 2 - The History of Phylogenetic Inference Haeckel’s evolutionary trees are among the first attempts at phylogeny inference.
Autocorrelation and Linkage Cause Bias in Evaluation of Relational Learners David Jensen and Jennifer Neville.
1 General Phylogenetics Points that will be covered in this presentation Tree TerminologyTree Terminology General Points About Phylogenetic TreesGeneral.
Phylogenetic reconstruction
Maximum Likelihood. Likelihood The likelihood is the probability of the data given the model.
BIOE 109 Summer 2009 Lecture 4- Part II Phylogenetic Inference.
QUANTITATIVE DATA ANALYSIS
Heuristic alignment algorithms and cost matrices
CHAPTER 22 Reliability of Ordination Results From: McCune, B. & J. B. Grace Analysis of Ecological Communities. MjM Software Design, Gleneden Beach,
Cluster Analysis.  What is Cluster Analysis?  Types of Data in Cluster Analysis  A Categorization of Major Clustering Methods  Partitioning Methods.
PSY 1950 Confidence and Power December, Requisite Quote “The picturing of data allows us to be sensitive not only to the multiple hypotheses that.
10/17/071 Read: Ch. 15, GSF Comparing Ecological Communities Part Two: Ordination.
Chapter 7: Variation in repeated samples – Sampling distributions
Slide 1 Statistics Workshop Tutorial 4 Probability Probability Distributions.
Lecture II-2: Probability Review
Chapter 5: Descriptive Research Describe patterns of behavior, thoughts, and emotions among a group of individuals. Provide information about characteristics.
What Is Phylogeny? The evolutionary history of a group.
Separate multivariate observations
Hypothesis Testing and T-Tests. Hypothesis Tests Related to Differences Copyright © 2009 Pearson Education, Inc. Chapter Tests of Differences One.
Terminology of phylogenetic trees
Molecular phylogenetics
Some matrix stuff.
Molecular evidence for endosymbiosis Perform blastp to investigate sequence similarity among domains of life Found yeast nuclear genes exhibit more sequence.
From last lecture (Sampling Distribution): –The first important bit we need to know about sampling distribution is…? –What is the mean of the sampling.
1 Statistical Inference Greg C Elvers. 2 Why Use Statistical Inference Whenever we collect data, we want our results to be true for the entire population.
Chapter 15 Data Analysis: Testing for Significant Differences.
Slides are based on Negnevitsky, Pearson Education, Lecture 12 Hybrid intelligent systems: Evolutionary neural networks and fuzzy evolutionary systems.
1 Dan Graur Molecular Phylogenetics Molecular phylogenetic approaches: 1. distance-matrix (based on distance measures) 2. character-state.
1 2. Independence and Bernoulli Trials Independence: Events A and B are independent if It is easy to show that A, B independent implies are all independent.
Educational Research: Competencies for Analysis and Application, 9 th edition. Gay, Mills, & Airasian © 2009 Pearson Education, Inc. All rights reserved.
 Read Chapter 4.  All living organisms are related to each other having descended from common ancestors.  Understanding the evolutionary relationships.
Molecular phylogenetics 4 Level 3 Molecular Evolution and Bioinformatics Jim Provan Page and Holmes: Sections
Introduction to Phylogenetics
MGS3100_04.ppt/Sep 29, 2015/Page 1 Georgia State University - Confidential MGS 3100 Business Analysis Regression Sep 29 and 30, 2015.
Multiple Alignment and Phylogenetic Trees Csc 487/687 Computing for Bioinformatics.
New Measures of Data Utility Mi-Ja Woo National Institute of Statistical Sciences.
PCB 3043L - General Ecology Data Analysis. PCB 3043L - General Ecology Data Analysis.
Phylogeny Ch. 7 & 8.
Authenticity of results of statistical research. The Normal Distribution n Mean = median = mode n Skew is zero n 68% of values fall between 1 SD n 95%
Inferential Statistics Introduction. If both variables are categorical, build tables... Convention: Each value of the independent (causal) variable has.
Textbook Basics of an Expert System: – “Expert systems: Design and Development,” by: John Durkin, 1994, Chapters 1-4. Uncertainty (Probability, Certainty.
+ Chapter 5 Overview 5.1 Introducing Probability 5.2 Combining Events 5.3 Conditional Probability 5.4 Counting Methods 1.
Evaluating the Fossil Record with Model Phylogenies Cladistic relationships can be determined without ideas about stratigraphic completeness; implied gaps.
Geo479/579: Geostatistics Ch12. Ordinary Kriging (2)
Principal Component Analysis
Statistical stuff: models, methods, and performance issues CS 394C September 3, 2009.
Université d’Ottawa / University of Ottawa 2001 Bio 8100s Applied Multivariate Biostatistics L11.1 Lecture 11: Canonical correlation analysis (CANCOR)
Psychometrics: Exam Analysis David Hope
Direct method of standardization of indices. Average Values n Mean:  the average of the data  sensitive to outlying data n Median:  the middle of the.
PCB 3043L - General Ecology Data Analysis Organizing an ecological study What is the aim of the study? What is the main question being asked? What are.
 Negnevitsky, Pearson Education, Lecture 12 Hybrid intelligent systems: Evolutionary neural networks and fuzzy evolutionary systems n Introduction.
Université d’Ottawa / University of Ottawa 2003 Bio 8102A Applied Multivariate Biostatistics L4.1 Lecture 4: Multivariate distance measures l The concept.
Methods of Presenting and Interpreting Information Class 9.
Statistical analysis.
Causality, Null Hypothesis Testing, and Bivariate Analysis
Keep all significant matches
Statistical analysis.
PCB 3043L - General Ecology Data Analysis.
Lecture 2 - The History of Phylogenetic Inference
Statistical inference: distribution, hypothesis testing
Multiple Alignment and Phylogenetic Trees
Biological Classification: The science of taxonomy
R.J.Watt D.I.Donaldson University of Stirling
Systematics: Tree of Life
Systematics: Tree of Life
Cladistics and Molecular Systematics
Principles of Experimental Design
MGS 3100 Business Analysis Regression Feb 18, 2016
Presentation transcript:

Most Likely Rates given Phylogeny L[  |001] =  0 x (P[  A ] + P[  B ]) +  1 x (P[  C ] + P[  D ])

Most Likely Rates given Phylogeny If  0 =  1 : L[  |001] ∝ P[001| ,  A ] + P[001| ,  B ] + P[001| ,  C ] + P[001| ,  D ]

Most Likely Rates given Phylogeny P[001| ,  A ] = (1-  ) 3 x  1 P[001| ,  B ] = (1-  ) 2 x  2 P[001| ,  C ] = (1-  ) 1 x  3 P[001| ,  D ] = (1-  ) 2 x  2

Most Likely Rates given Phylogeny If  0 =  1 : L[  |001] ∝  -  2

Most Likely Rates given Phylogeny If  0 =  1 : L[  |001] ∝  -  2 dL[  |001] ∝  -2  ∴ L[  |001] maximized at 2  = 

Allowing uncertain ancestry favors high rates P[001| ,  ] =  -  2 i.e., probability maximized when character changes half of the time!

Matrices have obvious structure (25 steps over 16 taxa & 20 characters)…

….. or lack thereof (50 steps over 16 taxa & 20 characters)

Evaluating Matrix Structure: Do the distributions of character states tell us anything prior to phylogenetic analysis? TaxonChar AChar BChar C Alpha000 Beta100 Gamma001 Delta101 Epsilon111 Frack110 Kappa110

Character Compatibility Compatible character pairs: two characters with state combinations that do not necessarily imply homoplasy Incompatible character pairs: two characters with state combinations that do necessarily imply homoplasy.

Evaluating Matrix Structure: TaxonChar AChar BChar C Alpha000 Beta100 Gamma001 Delta101 Epsilon111 Frack110 Kappa110

Compatibility among binary characters Compatible: no homoplasy on some trees AB0 101 Incompatible: homoplasy on all trees AB

Graphical depiction of (in)compatibility: closing the circuit is “bad”

Compatibility among unordered multistates: break characters down into binaries DEDEDEDE = DEDEDEDE = Compatible: all “pairs” must be compatible AND each pair must have unique combinations.

Compatibility among unordered multistates: only one “pair” need be incompatible DFDFDFDF = DFDFDFDF = Incompatible: states 0 and 2 show all combinations.

Compatibility among unordered multistates: sometimes incompatibility is subtle. DGDGDGDG = DGDGDGDG = Incompatible: all pairs compatible, but you cannot have the third pair and the first two without homoplasy.

Graphical depiction of (in)compatibility: circuit completed

Compatibility among Ordered multistates: Order of states must not be broken HIHJ HIHJ HI show no gap in distributions (compatible);

Compatibility among Ordered multistates: Order of states must not be broken HIHJ HIHJ HJ show a gap in distributions (incompatible).

Inferring Phylogeny with Compatibility: Clique Analysis Clique: a group of characters that all are compatible; Take the largest clique and infer phylogeny from those characters –This can produce a general (usually polytomous) tree with no homoplasy; Within each section of the phylogeny, find the largest remaining clique; –Use this to clarify relations among those taxa; –Rinse & repeat…. Note: “That was them, not us…..” (G. Estabrook, a.k.a., “Mr. Compatibilty”

Testing Matrix Structure with Compatibility: Permutation Tests Calculate number of compatible pairs in matrix. Permute matrix, scrambling states within each character; –Each character retains same number of taxa with each state; –Estimates P[observed compatibility] given such high rates of change that there is no inheritance. Calculate compatibility of matrix & characters; –If observed compatibility is within the range of permuted matrix, then the data likely are –If an individual character’s compatibility is with the range of a permuted character, then it is likely useless.

Testing Matrix Structure with Compatibility: Simple Inverse Models Calculate number of compatible pairs in matrix. Evolve a tree and matrix of the same dimensions as the original data; –Use same number of states as seen for each character; –Estimates P[observed compatibility] given particular frequencies of change. Calculate compatibility of matrix & characters; –Tally P[compatibility | overall changes] for matrix or matrix partitions; –Tally P[compatibility | # changes, # derived taxa] for each character.

Probability of Compatibility Given Total Number of Steps As frequencies of change (and thus homoplasy) increase, the expected matrix compatibility drops.

One can test partitions for significant differences in frequencies of change “Slug” characters are significantly less homoplastic than are shell characters among the Rapaninae.

One can test partitions for significant differences in frequencies of change This is much less true for species in the Nassariidae.

Effect of Change on Compatibility Infrequently changing characters typically have high compatibilities frequently changing ones have low compatibilities. Simulation of 32 taxa with 100 binary characters and 200 total changes

Extreme distributions of taxa with derived states: End cases automatically compatible

Effect of Derived Taxa on Compatibility Characters with 1 or 31 taxa with 0 or 1 (= autapomorphic) are automatically compatible Characters with 16 0’s and 1’s have lowest compatibility. Simulation of 32 taxa with 100 binary characters and 200 total changes

Given X steps, compatibility is correlated with the number of derived taxa. 1 Change 2 Changes3 Changes

Effect of Correlated Character Change on Compatibility Simulated case with two suites of characters in which change in one induces a 75% chance of change in others. Elevates compatibility because distributions are so similar.

Testing for Independent Character Change: Mutual Compatibility Mutual compatibility: common compatibilities between two characters. –If character i and j both are compatible with character k, then it is a mutual compatibility. Character suites that exhibit correlated change should share more mutual compatibilities than independently changing characters. –Do characters i & j have more mutual compatibilities than expected given compatibility and independent change?

Testing for Independent Character Change: Mutual Compatibility Multivariate structure among mutual compatibilities clusters correlated suites. –Similarity between each character pair based on proportion of other characters with which both are compatible.

Testing for Independent Character Change: Mutual Compatibility Mutual compatibility: common compatibilities between two characters. –If character i and j both are compatible with character k, then it is a mutual compatibility. Character suites that exhibit correlated change should share more mutual compatibilities than independently changing characters. –Do characters i & j have more mutual compatibilities than expected given compatibility and independent change?

Stratigraphic Compatibility For individual characters: no states with gaps in sampled record For character pairs: compatible pair in which the appearance of character pairs is also consistent with phylogeny.

Compatible Pair Compatible with Stratigraphy Ch 1Ch2FALA No necessary homoplasy, nor any necessary stratigraphic gaps between morphotypes.

Ch 1Ch2FALA Morphotypes 00 and 01 appear out of order given character necessary to avoid homoplasy. Compatible Pair Incompatible with Stratigraphy

Ch 1Ch2FALA Morphotypes appear in the right order, but a gap exists between morphotypes 21 & 22. Compatible Pair Incompatible with Stratigraphy

Ch 1Ch2FALA Morphotypes appear in the right order, but a gap exists within morphotypes 01. Compatible Pair Incompatible with Stratigraphy