Introductory Phylogenetic Workflows in the Discovery Environment Sheldon McKay iPlant Collaborative, DNALC, Cold Spring Harbor Laboratory Feb 8, 2012
Why is the tree of life important? “Knowledge of evolutionary relationships is fundamental to biology, yielding new insights across the plant sciences, from comparative genomics and molecular evolution, to plant development, to the study of adaptation, speciation, community assembly, and ecosystem functioning.”
We like to put things into categories
Classifications often represent evolutionary relationships, but not always. [Figure labels: A, B, C, D, E, F]
Phylogenetic trees are representations of evolutionary history
What is the difference between taxonomy and phylogeny?
Consider primates: Do humans make up a monophyletic group? [Figure labels: Hylobatidae, Pongidae, Hominidae; (E) human]
Phylogeny based on a globin pseudogene suggests that humans and chimpanzees make up a single monophyletic group. [Figure label: outgroup]
Trait Evolution (image courtesy of Brian O'Meara)
How can iPlant help with phylogenetic tree building?
[Figure: factorial growth in the number of possible trees, compared with the number of atoms in the universe]
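The factorial growth in the number of trees can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides. It assumes fully bifurcating, unrooted trees with labeled tips, for which the count is the double factorial (2n-5)!!, and it uses the commonly quoted rough estimate of 10^80 atoms in the observable universe. The function names are illustrative.

```python
from math import prod  # requires Python 3.8+

# Assumption: a rough, commonly quoted order-of-magnitude estimate.
ATOMS_IN_UNIVERSE = 10 ** 80


def unrooted_tree_count(n_taxa: int) -> int:
    """Count unrooted, fully bifurcating trees for n labeled taxa: (2n-5)!!."""
    if n_taxa < 3:
        return 1  # one or two taxa admit only a single trivial tree
    # Product of the odd numbers 3, 5, ..., (2n - 5); empty product is 1.
    return prod(range(3, 2 * n_taxa - 4, 2))


def first_n_exceeding(limit: int) -> int:
    """Smallest number of taxa whose tree count is greater than `limit`."""
    n = 3
    while unrooted_tree_count(n) <= limit:
        n += 1
    return n


if __name__ == "__main__":
    for n in (4, 5, 10, 20):
        print(f"{n:>3} taxa: {unrooted_tree_count(n):,} unrooted trees")
    n_big = first_n_exceeding(ATOMS_IN_UNIVERSE)
    print(f"Tree count first exceeds 10^80 at {n_big} taxa")
```

Even a few dozen taxa rule out exhaustive search over all trees, which is why practical methods rely on heuristics or distance-based approaches.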
Big Trees: It can take weeks or months to analyze data sets with more than 100,000 species.
Example of iPlant contribution: NINJA/WINDJAMMER
-- NINJA: 216K species, ~8 days
-- WINDJAMMER: 216K species, ~4 hours
How can we scale up phylogenetic tree visualization? Largest published tree: Goloboff et al. (73,060 species)
HD TV: 1920 × ,533 names; largest computer monitors: 3280 × 2048 (can be tiled); laser printer: effectively 3600 × 4725 (can be tiled)
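To put these display sizes in perspective, the following sketch estimates how many vertically stacked tiles are needed to show one tip label per row for the 73,060-species tree. It is a back-of-the-envelope illustration, not part of the original slides. It assumes a rectangular tree layout with one label per row, a hypothetical minimum legible label height of 8 pixels, and that the second number in each display size is the vertical dimension.

```python
from math import ceil

N_TIPS = 73_060  # species in the largest published tree cited in the slides

# Vertical resolution in pixels, taken from the slide.
# Assumption: the second number of each size is the height.
DISPLAY_HEIGHTS_PX = {
    "large monitor": 2048,
    "laser printer page": 4725,
}

# Assumption: smallest label height that stays legible (hypothetical value).
LABEL_HEIGHT_PX = 8


def tiles_needed(n_tips: int, display_height_px: int,
                 label_height_px: int = LABEL_HEIGHT_PX) -> int:
    """Number of vertically stacked tiles needed to give every tip its own row."""
    labels_per_tile = display_height_px // label_height_px
    return ceil(n_tips / labels_per_tile)


if __name__ == "__main__":
    for name, height in DISPLAY_HEIGHTS_PX.items():
        print(f"{name}: {tiles_needed(N_TIPS, height)} tiles")
```

Under these assumptions, a single screen or page shows only a few hundred tip labels at once, so a tree of this size needs either heavy tiling or an interactive viewer that collapses and zooms.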
Prototype iPlant tree viewer
Scalability of Data
Phylogenetic workflows in the Discovery Environment
Phylogenetic workflows in the Discovery Environment
Overview: The introductory phylogenetic workflows training module is designed to provide hands-on experience using the phylogenetic and related applications of the iPlant Discovery Environment (DE), while also developing familiarity with the general use of the DE user interface.
Question: How are tRNA_leu genes related in species of Magnoliophyta? What are the implications of using gene families for phylogenetic inference?
Specific Objectives: By the end of this module, participants should:
1) Be familiar with the DE user interface
2) Understand the starting data for phylogenetic analysis
3) Be able to perform a multiple sequence alignment in the DE
4) Be able to perform a simple phylogenetic analysis in the DE
5) Be able to use the DE as a portal for visualizing phylogenetic trees