SUPPORT and RESAMPLING

Presentation transcript:

SUPPORT and RESAMPLING Thanks to Leandro Gaetano

Support: branch length, Bremer support, bootstrap, jackknife

Bremer Support (decay index) (Bremer 1992): the number of extra steps needed to "collapse" a branch, i.e. the absolute amount of favourable evidence supporting the group.
Having a Bremer support of 2 can have two meanings:
• A branch is supported by two uncontradicted characters. Therefore, it "costs" only two steps not to have that branch.
• A branch is supported by n characters but contradicted by n-2. Therefore, preferring the contradictory branch "costs" only 2 steps.
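The "extra steps" reading of the decay index can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the candidate trees are given as hypothetical (length, clades) pairs that a parsimony search has already produced, and the function name `decay_index` is not taken from any program.

```python
from typing import FrozenSet, List, Set, Tuple

# Hypothetical representation: a tree is (length in steps, set of clades).
Tree = Tuple[int, Set[FrozenSet[str]]]


def decay_index(trees: List[Tree], group: FrozenSet[str]) -> int:
    """Extra steps needed to lose `group`: length of the shortest tree
    lacking the group minus the length of the shortest tree overall."""
    best = min(length for length, _ in trees)
    lacking = [length for length, clades in trees if group not in clades]
    if not lacking:
        # Every candidate tree has the group, so the sample sets no bound.
        raise ValueError("no candidate tree lacks the group")
    return min(lacking) - best
```

The value is only as good as the pool of suboptimal trees: if the search missed a short tree lacking the group, the result is too high.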

Support

Support Bremer support

Bremer Support (figure labels: BS = 2, BS = 1). There are some incongruences between characters. Bremer support is then the difference between the number of characters congruent with the branch (F) and the number of characters that contradict it (C).

Bremer Support: Problems
• If the heuristic search is incomplete and memory is insufficient (i.e. hold is set low), the consensus of the suboptimal trees can be over-resolved and the BS artificially high.
• What BS is considered good? Solution? As a rule of thumb, a Bremer score of 3 is good and a score of 5 suggests that the group is highly supported.

Relative Bremer Support (Goloboff and Farris, 2001)
Relative Bremer support = (F - C) / F
F = number of characters congruent with the branch
C = number of characters contradicting the branch
If the relative Bremer support is 1, the branch is totally supported; if it is 0, it is not supported at all.
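The two measures can be compared with a short worked example. The sketch below follows the F and C definitions given on these slides; the function names are illustrative, not from any package.

```python
def bremer_support(f: int, c: int) -> int:
    """Absolute Bremer support as defined on the slides: F - C."""
    return f - c


def relative_bremer_support(f: int, c: int) -> float:
    """Relative Bremer support: (F - C) / F."""
    if f <= 0:
        raise ValueError("F must be positive")
    return (f - c) / f


# Two branches with the same absolute support but different relative support.
uncontradicted = (bremer_support(2, 0), relative_bremer_support(2, 0))
contested = (bremer_support(10, 8), relative_bremer_support(10, 8))
```

Both branches have an absolute support of 2, but the first has a relative support of 1 and the second only 0.2, because eight characters contradict it.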

Resampling
The intention of the bootstrap is to examine the relative "confidence" associated with portions of a cladogram in relation to character sampling (Siddall, 2001).
Bootstrap: resample characters (with replacement) from the original data matrix to create a pseudoreplicate matrix, then recalculate the most parsimonious tree on the pseudoreplicate data. From Soltis and Soltis (2003)
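The character-resampling step can be sketched as follows. This is a minimal illustration, assuming the data matrix is a dictionary mapping taxon names to equal-length character strings; the function name is hypothetical and the tree search on each pseudoreplicate is deliberately left out.

```python
import random
from typing import Dict


def bootstrap_matrix(matrix: Dict[str, str], rng: random.Random) -> Dict[str, str]:
    """Build one pseudoreplicate by sampling character columns with
    replacement until the matrix has its original number of characters."""
    n_chars = len(next(iter(matrix.values())))
    # The same column indices are applied to every taxon, so each
    # sampled character keeps its states together.
    columns = [rng.randrange(n_chars) for _ in range(n_chars)]
    return {taxon: "".join(row[i] for i in columns) for taxon, row in matrix.items()}


original = {"A": "0011", "B": "0101", "C": "1100"}
pseudoreplicate = bootstrap_matrix(original, random.Random(0))
```

In a full analysis this step would be repeated many times, with a parsimony search run on each pseudoreplicate.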

Stability can be defined only by reference to some factors (e.g., a group stable under addition of characters may be very unstable under addition of taxa or under recoding of some characters). Support depends exclusively on presently available evidence (and, of course, on the assumptions or theories used to interpret that evidence). Resampling evaluates support because the frequency with which replicates display a given group will be determined by the relative amounts of favourable and contradictory evidence.

From Soltis and Soltis (2003)

Bootstrap: Problems
• The bootstrap is not statistically valid for very large data matrices.
• Support is one-sided: recovered groups will be well supported, but groups that are not recovered may also have good support.
• It is not clear what bootstrap value is significant.

Jackknife (resampling)
What would have happened if I had fewer taxa or characters than I presently have, and then added the remainder?
• Taxonomic jackknife
• Character (parsimony) jackknife

TAXONOMIC JACKKNIFE
1) Pseudoreplicates: remove one taxon from the data matrix and search for the most parsimonious tree(s). Repeat, removing each taxon from the matrix in turn (each pseudoreplicate includes all taxa minus one).
2) Build a consensus from the trees obtained from all the pseudoreplicates.
3) The jackknife value is the proportion of trees in which the groups recovered in the shortest tree(s) are represented.
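Steps 1 and 3 above can be sketched in Python. This is only an illustration under stated assumptions: the matrix is a dictionary of taxon names to character strings, each pseudoreplicate tree is a hypothetical set of clades produced by a separate parsimony search, and a group that loses a member is scored against what remains of it. The consensus step is not shown.

```python
from typing import Dict, FrozenSet, Iterator, List, Set, Tuple

Matrix = Dict[str, str]
Clades = Set[FrozenSet[str]]


def leave_one_out(matrix: Matrix) -> Iterator[Tuple[str, Matrix]]:
    """Step 1: yield (removed taxon, matrix without that taxon) for every taxon."""
    for removed in matrix:
        yield removed, {t: row for t, row in matrix.items() if t != removed}


def jackknife_value(group: FrozenSet[str], replicates: List[Tuple[str, Clades]]) -> float:
    """Step 3: proportion of pseudoreplicate trees that display the group.
    If the removed taxon belonged to the group, the rest of the group is
    looked up instead (an assumption of this sketch)."""
    hits = 0
    for removed, clades in replicates:
        if (group - {removed}) in clades:
            hits += 1
    return hits / len(replicates)
```

For a matrix with five taxa, `leave_one_out` yields five pseudoreplicates of four taxa each; the trees found for them are then passed to `jackknife_value` together with the group of interest.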

TAXONOMIC JACKKNIFE From Siddall 2001

Bremer support and jackknifing can give different support values to monophyletic groups. Compare the values for monophyletic groups LM and FG.