The Big Issues in Phylogenetic Reconstruction
Randy Linder
Integrative Biology, University of Texas


Overview for Talk
• Biological issues
  – Practical
    • Time constraints
    • Material constraints
    • Monetary constraints
    • Information constraints
  – Theoretical
    • Models: Processes of evolution
      – How does information move through the system
      – How do nucleotides evolve
      – How do indels evolve
      – Rearrangements and duplications
    • Alignment
    • Reconstruction
    • Comparative methodologies

Overview for Talk (cont.)
• Computational issues
  – Theoretical
    • Graph theory
    • Algorithmic
    • Heuristics
  – Practical
    • Performance
      – Running times
      – Accuracy of results
      – “Efficiency”: accuracy of results for a given amount of data
• Will not address these directly, but they will come up at various points

General Overview of a Systematist’s Work
• Determine the scope of the work
  – Anything from kingdoms to single genera
• Seek funding
• Plan and travel to get materials
  – Costs
  – Politics
  – Time

General Overview of a Systematist’s Work (cont.)
• Extract DNA (in most cases)
  – Sometimes easy (most animals, microbes)
  – Sometimes not (many plants, fungi)
• Determine which DNA regions to use
  – Ones previously determined by other studies
  – Develop new ones
• Amplify (and clone) regions
  – See if they have appropriate variation
  – Do so in all available specimens
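One way to make "appropriate variation" concrete is to count variable and parsimony-informative columns in a trial alignment of a candidate region. The sketch below is only an illustration: the function name `site_variation`, the gap characters it ignores, and the toy sequences are assumptions and do not come from the talk.

```python
from collections import Counter


def site_variation(alignment, gap_chars="-?N"):
    """Count variable and parsimony-informative columns.

    alignment: list of aligned sequences (strings of equal length).
    Characters listed in gap_chars are ignored when counting states.
    """
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all sequences must have the same aligned length")

    variable = 0
    informative = 0
    for column in zip(*alignment):
        counts = Counter(c.upper() for c in column if c.upper() not in gap_chars)
        # Variable: more than one state observed in the column.
        if len(counts) > 1:
            variable += 1
        # Parsimony-informative: at least two states, each seen at least twice.
        if sum(1 for n in counts.values() if n >= 2) >= 2:
            informative += 1
    return variable, informative


# Toy alignment, invented purely for illustration.
toy = [
    "ACGTACGT",
    "ACGTACGA",
    "ACCTACGA",
    "ACCTTCGA",
]
print(site_variation(toy))
```

A region with very few informative columns across the sampled specimens is unlikely to resolve relationships, while one that is variable at nearly every site may be too saturated to align reliably.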

General Overview of a Systematist’s Work (cont.)
• Align sequences
  – Use available algorithms (most often some flavor of Clustal)
  – Hand align where algorithms are too stupid
• If using model-based reconstruction method, determine model to use
• Reconstruct relationships
  – Choose method(s) of reconstruction
    • Time
    • Computational feasibility
    • Quality of result
• Decide if datasets can be combined
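To illustrate why the choice of model matters, the sketch below compares the uncorrected proportion of differing sites (p-distance) with the distance corrected under the Jukes-Cantor (JC69) model, the simplest standard model of nucleotide substitution. JC69 is used here only as an example; the talk does not name a specific model, and the function names are assumptions.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping columns with a gap in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jc69_distance(p):
    """Jukes-Cantor corrected distance: d = -(3/4) * ln(1 - 4p/3)."""
    if p >= 0.75:
        # At or beyond saturation the correction is undefined.
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy sequences differing at 4 of 20 sites (invented for illustration).
a = "ACGTACGTACGTACGTACGT"
b = "ACGAACGTACCTACGTTCGA"
p = p_distance(a, b)
print(p, jc69_distance(p))
```

The corrected distance is always at least as large as the p-distance because the model accounts for multiple substitutions at the same site; richer models change the correction further, which is why model choice precedes reconstruction.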

General Overview of a Systematist’s Work (cont.)
• Assess quality of reconstruction
  – Bootstrap
    • Non-parametric (any method, especially MP)
    • Parametric (ML)
  – Posterior probabilities (Bayesian)
• Perform comparative analyses (often)
  – Examples:
    • Assess character evolution
    • Test for patterns consistent with
      – Types of speciation
      – Biogeographic hypotheses
      – Adaptive hypotheses
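The non-parametric bootstrap mentioned above resamples alignment columns with replacement to build pseudoreplicate datasets of the original length. A minimal sketch of the resampling step follows; the function names, the seed handling, and the toy alignment are assumptions, and the tree-building step that would be run on each replicate is deliberately left out.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one pseudoreplicate by sampling columns with replacement."""
    n_sites = len(alignment[0])
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    # Apply the same column picks to every sequence so columns stay intact.
    return ["".join(seq[i] for i in picks) for seq in alignment]


def bootstrap_replicates(alignment, n_replicates=100, seed=0):
    """Return a list of pseudoreplicate alignments (seeded for repeatability)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


# Toy alignment, invented for illustration.
toy_alignment = ["ACGTAC", "ACGTTC", "TCGAAC"]
replicates = bootstrap_replicates(toy_alignment, n_replicates=5)
print(replicates[0])
```

Each replicate would then be analyzed with the same reconstruction method as the original data, and support for a clade is reported as the fraction of replicate trees that contain it.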

Biological Issues: Practical
• Time constraints
  – Gathering of material for study
  – Finding adequate regions for analysis
  – Cloning (sometimes) and sequencing of information
    • Sometimes gathering of morphological information
  – Run time of phylogenetic analyses
  – Run time of assessments of support
  – Time to conduct comparative analyses

Biological Issues: Practical
• Material constraints
  – Getting all of the taxa desired
• Monetary constraints
  – Travel costs
  – Cloning and sequencing costs
• Information constraints
  – What regions will provide the right amount of variation for my group?
  – What is the best sampling strategy for my group?
  – Do my regions for analysis avoid gene tree/species tree problems?
  – Does my group have any reticulation in it and how will I know?

How Do Practical Issues Translate to Computational Issues?
• Prior to lab work
  – Not much, really
  – Maybe with design of taxon sampling strategies, because getting samples is often a major constraint
• During lab work
  – Help in locating new regions for analysis
  – Help in assessing/developing regions that are free of gene tree/species tree problems

How Do Practical Issues Translate to Computational Issues?
• Post lab work
  – Improved alignment methods
    • Better methods, independent of reconstruction
      – Methods that handle indels better
        » Sequence length differences
        » Non-repeat indels
        » Repeat indels
      – Assessment of quality of alignment in different regions
    • Alignment methods that simultaneously infer phylogeny
      – Need alignment methods that explicitly attempt to infer positional homology according to some optimality criterion

How Do Practical Issues Translate to Computational Issues?
• Post lab work
  – Reconstruction methods
    • Better models of sequence evolution
      – Relaxing of the rates-across-sites assumption
      – Models that make use of information in indels
    • Methods that assess reasons for phylogenetic incongruence and support for different explanations
      – Reticulation above the species level
      – Reticulation below the species level
      – Lineage sorting: alleles, gene duplications
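Before asking why two gene trees disagree, it helps to quantify how much they disagree. One common measure is the Robinson-Foulds distance, the number of bipartitions found in one tree but not the other. The sketch below is an assumption-laden illustration and not a method from the talk: trees are written as nested Python tuples of taxon names, both trees must share the same taxa, and the function names are hypothetical.

```python
def leaf_sets(tree, out):
    """Return the leaves under `tree`, adding each internal node's leaf set to `out`."""
    if isinstance(tree, str):
        return frozenset([tree])
    leaves = frozenset().union(*(leaf_sets(child, out) for child in tree))
    out.add(leaves)
    return leaves


def splits(tree):
    """Non-trivial bipartitions of the tree, treated as unrooted."""
    clades = set()
    all_leaves = leaf_sets(tree, clades)
    reference = min(all_leaves)  # fixed taxon used to pick one side of each split
    result = set()
    for clade in clades:
        # Always store the side that does NOT contain the reference taxon.
        side = clade if reference not in clade else all_leaves - clade
        # Keep only splits with at least two taxa on each side.
        if 1 < len(side) < len(all_leaves) - 1:
            result.add(side)
    return result


def rf_distance(tree_a, tree_b):
    """Robinson-Foulds distance: size of the symmetric difference of the split sets."""
    return len(splits(tree_a) ^ splits(tree_b))


# Toy gene trees on five taxa, invented for illustration.
gene_tree_1 = (("A", "B"), ("C", ("D", "E")))
gene_tree_2 = (("A", "C"), ("B", ("D", "E")))
print(rf_distance(gene_tree_1, gene_tree_2))
```

A nonzero distance only says that the trees conflict; deciding whether the cause is reticulation, lineage sorting, or estimation error is the harder problem the slide calls for.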

How Do Practical Issues Translate to Computational Issues?
• Post lab work
  – Reconstruction methods
    • Methods that statistically assess the tree-space of optimal and nearly optimal solutions to reconstructions
    • Better supertree methods
  – Support methods
    • Develop methods that assess support for alternative explanations:
      – Tree, network, gene tree/species tree, etc.