Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

Slides:



Advertisements
Similar presentations
© Wiley Publishing All Rights Reserved. Using Nucleotide Sequence Databases.
Advertisements

Phylogenetically Mapping Liverwort-Fungal Associations Jessica Nelson Duke University Jessica Nelson Duke University.
Lichens and Ascomycota broadly Alternative markers to COI ITS.
 Aim in building a phylogenetic tree is to use a knowledge of the characters of organisms to build a tree that reflects the relationships between them.
LEQ: How do biologist organize living things?
Phylogeny Systematics Cladistics
THE EVOLUTIONARY HISTORY OF BIODIVERSITY
1 General Phylogenetics Points that will be covered in this presentation Tree TerminologyTree Terminology General Points About Phylogenetic TreesGeneral.
Plant Molecular Systematics (Phylogenetics). Systematics classifies species based on similarity of traits and possible mechanisms of evolution, a change.
PHYLOGENY AND SYSTEMATICS
Characterization of Endophytic Cellulose Degrading Fungi at the Sevilleta LTER site Zachary T. Gossage 1, Carolyn Weber 2, Cheryl Kuske 2, and Andrea Porras-Alfaro.
Computational biology and computational biologists Tandy Warnow, UT-Austin Department of Computer Sciences Institute for Cellular and Molecular Biology.
Evolutionary Genome Biology Gabor T. Marth, D.Sc. Department of Biology, Boston College Medical Genomics Course – Debrecen, Hungary, May 2006.
CHAPTER 25 TRACING PHYLOGENY. I. PHYLOGENY AND SYSTEMATICS A.TAXONOMY EMPLOYS A HIERARCHICAL SYSTEM OF CLASSIFICATION  SYSTEMATICS, THE STUDY OF BIOLOGICAL.
Algorithm Animation for Bioinformatics Algorithms.
We are developing a web database for plant comparative genomics, named Phytome, that, when complete, will integrate organismal phylogenies, genetic maps.
Bell Work Dogs of a certain breed can have black fur or white fur. Black fur is dominant, but the breeder only wants puppies with white fur. Cross two.
Utilizing Fuzzy Logic for Gene Sequence Construction from Sub Sequences and Characteristic Genome Derivation and Assembly.
Topic : Phylogenetic Reconstruction I. Systematics = Science of biological diversity. Systematics uses taxonomy to reflect phylogeny (evolutionary history).
The Sorcerer II Global ocean sampling expedition Katrine Lekang Global Ocean Sampling project (GOS) Global Ocean Sampling project (GOS) CAMERA CAMERA METAREP.
Comparative Genomics of Viruses: VirGen as a case study Dr. Urmila Kulkarni-Kale Bioinformatics Centre University of Pune Pune
Dan Masiga Molecular Biology and Biotechnology Department International Centre of Insect Physiology and Ecology, Nairobi, Kenya BARCODE Data Standard The.
Consortium for the Barcode of Life A rapid, cost-effective system for species identification David E. Schindel, Executive Secretary National Museum of.
Computing the Tree of Life The University of Texas at Austin Department of Computer Sciences Tandy Warnow.
Richard White Biodiversity Data. Outline Biodiversity: what is it? – Definitions: is biodiversity: A resource? Something which can be measured? How to.
Title: GeneWiz browser: An Interactive Tool for Visualizing Sequenced Chromosomes By Peter F. Hallin, Hans-Henrik Stærfeldt, Eva Rotenberg, Tim T. Binnewies,
Molecular phylogenetics
DNA Barcoding Amy Driskell Laboratories of Analytical Biology
Arabidopsis Genome Annotation TAIR7 Release. Arabidopsis Genome Annotation  Overview of releases  Current release (TAIR7)  Where to find TAIR7 release.
Systematics the study of the diversity of organisms and their evolutionary relationships Taxonomy – the science of naming, describing, and classifying.
Molecular evidence for endosymbiosis Perform blastp to investigate sequence similarity among domains of life Found yeast nuclear genes exhibit more sequence.
The Evolutionary History of Biodiversity
Classification and Systematics Tracing phylogeny is one of the main goals of systematics, the study of biological diversity in an evolutionary context.
Assembling the Tree of Life Diana Lipscomb Program Director, Systematic Biology Program National Science Foundation.
Synopsis of current BIEN and Enquist projects managed by Martha iPlant 2014.
Yeast genome sequencing: the power of comparative genomics MEDG 505, 03/02/04, Han Hao Molecular Microbiology (2004)53(2), 381 – 389.
Christian Rinke Microbial Genomics DOE, Joint Genome Institute Introduction to ARB (From A User's Perspective)
DNA barcoding: bane or boon (or both) for taxonomy? Donal A. Hickey, Concordia University, Montreal. Collaborators: Mehrdad Hajibabaei and Gregory Singer.
Warm-Up 1.Contrast adaptive radiation vs. convergent evolution? Give an example of each. 2.What is the correct sequence from the most comprehensive to.
Molecular phylogenetics 4 Level 3 Molecular Evolution and Bioinformatics Jim Provan Page and Holmes: Sections
Construction of Substitution Matrices
ARE THESE ALL BEARS? WHICH ONES ARE MORE CLOSELY RELATED?
Gramene Objectives Provide researchers working on grasses and plants in general with a bird’s eye view of the grass genomes and their organization. Work.
Christian Arnold Bioinformatics Group, University of Leipzig Bioinformatics Herbstseminar October 23th, 2009 Three Weeks of Experience at the formatics.
Genomics and Forensics
Bioinformatics Scheme of the sequencing project (Martínez & Figueras, 2007) Construction Bookseller Bases determination Fragments assembly Gene search.
By Chris Paine Genes Essential idea: Every living organism inherits a blueprint for life from its parents. Genes and.
Riccardi: DIALOGUE Workshop August 1, 2005 Supported by NSF BDI 1 Representing and Using Phylogenetic Characters in Morphbank Greg Riccardi, David Gaitros,
Ayesha M.Khan Spring Phylogenetic Basics 2 One central field in biology is to infer the relation between species. Do they possess a common ancestor?
Using Divide-and-Conquer to Construct the Tree of Life Tandy Warnow University of Illinois at Urbana-Champaign.
The Big Issues in Phylogenetic Reconstruction Randy Linder Integrative Biology, University of Texas
Reconstructing and Using Phylogenies 16. Concept 16.1 All of Life Is Connected through Its Evolutionary History All of life is related through a common.
Canadian Bioinformatics Workshops
Systematics of Clavicipitaceous Fungi Associated with Morning Glories Based on rpb1 Sequence Data Saroj Simkhada, Alyssa M. Brown, and Richard E. Miller.
General Microbiology (Micr300)
Darwin’s Tree of Life, July million species Phylogenetic inference from genomic.
Metagenomic Species Diversity.
Introduction to Bioinformatics Resources for DNA Barcoding
Phylogenetic basis of systematics
Research in Computational Molecular Biology , Vol (2008)
Pipelines for Computational Analysis (Bioinformatics)
Taxonomic Placement of the Nidulariaceae of Nebraska and Iowa Based on Molecular and Morphological Data Goodmond H. Danielsen, Mark A. Schoenbeck, P. Roxanne.
Workshop on the analysis of microbial sequence data using ARB
Warm-Up Contrast adaptive radiation vs. convergent evolution? Give an example of each. What is the correct sequence from the most comprehensive to least.
Summary and Recommendations
Explore Evolution: Instrument for Analysis
Cynthia S. Parr, Robert Guralnick, Nico Cellinese, Roderic D.M. Page 
Molecular data assisted morphological analyses
Ch. 17 Biodiversity Mr. D.
Summary and Recommendations
Presentation transcript:

Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham, North Carolina USA

WHERE ARE WE IN ASSEMBLING THE FUNGAL TREE OF LIFE? % of published trees Total # published trees examined 1-locus2-locus3-locus4-locusTotal 560 publications reporting 595 fungal phylogenetic trees: # loci, # species, # orders (Martin Grube, Elizabeth Baloch, Betsy Arnold) Lutzoni et al., 2004, American Journal of Botany

Summary of survey 83.9% of fungal phylogenies are based exclusively on sequences from the ribosomal RNA tandem repeats. Few studies have focused on resolving relationships among orders of Fungi 354 of 595 trees examined (59.5%) conveyed relationships within single orders Although the number of species included in published trees has generally increased over time, most studies have included fewer than 100 species, with an overall mean of 34.2 ± 2.3 species/study (range: 3–1155 species)

Constraints on progress Lack of coordination toward a common goal - poor overlap among genes

Taxon sampling 2) A total of 13,467 GenBank sequences were considered 1) What are the loci that have been sequenced for the highest number of fungal species and appropriate for resolving deep relationhsips within the Fungi? nucSSU and nucLSU rDNA 3) 1010 unique taxa had both sequences available Taxonomic name associated with the sequence 4) Taxa excluded: < 600 bp Many errors in GenBank “definition line” 3) Unpublished sequences added from various projects and AFTOL Total of 573 taxa

Taxon sampling (part 2) 2) nucSSU+nucLSU+mitSSU = 253 taxa (but only Asco- and Basidiomycota) 1) What is the next locus that has been sequenced for the highest number of the 573 species selected and appropriate for resolving deep relationhsips within the Fungi? mitSSU rDNA, RPB2 3) nucSSU+nucLSU+RPB2 = 161 taxa (only Asco- and Basiomycota) 4) nucSSU+nucLSU+mitSSU+RPB2 = 103 taxa (only Asco- and Basiomycota)

Constraints on progress Lack of coordination toward a common goal - poor overlap among genes - sampling bias toward macro fungi

Two-locus Bayesian MCMCMC tree nucSSU+nucLSU 558 species, 430 genera, 68 orders, 5 phyla Ascomycota Basidiomycota Glomeromycota Zygomycota Chytridiomycota

Constraints on progress Lack of coordination toward a common goal - poor overlap among genes - sampling bias toward macro fungi Analytical and bioinformatic advancements in systematic biology were almost exclusively in alignment algorithms, contig assembly and visualization, and in phylogenetics

DATA ACQUISITION PHYLOGENETIC ANALYSES TWO MAIN OPERATIONS NEEDED FOR LARGE-SCALE PHYLOGENETIC ANALYSES TREE OF LIFE

Essence of AFToL in solving data acquisition issues COMMUNICATION/COORDINATION STANDARDIZATION AUTOMATION

PROPOSAL:: *1500 species of fungi for 8 loci (≈ 10 kb) in 4 years! 6 nuclear loci:nucSSU rDNA, nucLSU rDNA, ITS rDNA, RPB1, RPB2, EF-1  2 mitochondrial loci: mitSSU rDNA, ATP6 *Ultrastructural features common to all fungi New data and compilation of all existing data Starting date: January 1, 2003 National Science Foundation Assembling the Tree of Life AToL See for more informationwww.aftol.org

David Hibbett (Clark University): 400 Basidiomycota François Lutzoni (Duke University): 400 Lichen-forming Ascomycota et al. & Bioinformatics David McLaughlin (University of Minnesota): Subcellular characters Joseph Spatafora (Oregon State University): 400 Ascomycota Rytas Vilgalys (Duke University): 300 Chrytridiomycota, Glomeromycota & Zygomycota “ASSEMBLING THE FUNGAL TREE OF LIFE” (AFToL)

One proposal from the entire mycological community “ASSEMBLING THE FUNGAL TREE OF LIFE” (AFToL) 113 collaborators from 23 countries Strong bioinformatic & computational biology component Preceded by an NSF Network Coordination grant “DEEP HYPHA”

AFTOL Duke Clark Oregon State U of Minnesota RPB1 - RPB2 EF1  - ATP6 - mitSSU rDNA nucLSU - SSU - ITS rDNA 1500 species morphological data (SPB, nuclear division,...)

Source Information of DNA Samples DNA SAMPLE Voucher info Culture info Frozen material info Providers of material/collaborators Morpho & Ultrastructural info Team: Clark U Duke L Lab Duke V lab OSU UofM Expected phylum

Entry in DB of source information of DNA + (voucher, cultures, frozen sample, DNA, etc...) + team providing the material + morphological/ultrastructural info + expected phylym sequencing Automated sequencer Raw sequences output Server #1 Clark UDuke LDuke V When sequencing Completed Locus 3 alignLocus 2 alignLocus 1 align Server #2 BLAST (in house) Initiate NJ bootstrap analyses for set of taxa common to more than one locus NJ analysis of gene 1 For taxa in common between DS2 and DS3 NJ analysis of gene 1 For taxa in common between DS2 and DS4 NJ analysis of gene 2 For taxa in common between DS3 and DS2 NJ analysis of gene 2 For taxa in common between DS3 and DS4 NJ analysis of gene 3 For taxa in common between DS4 and DS2 NJ analysis of gene 3 For taxa in common between DS4 and DS2 Congruence test OSU UofM Blast check point Basidiomycete? No Phred & Phrap assembler Yes GenBank Weekly

WWW based interface to AFTOL database Secure access for registered users only Possibility to enter, retrieve and change data Each author views and modifies only his/her own data Dynamic web page construction using Zope

Kauff, Cox & Lutzoni, Systematic Biology (in press)

Future goals of AFToL and interplay with Barcode Continue developing WASABI / export to other projects Less taxa more genes (25 genes for 160 taxa) Because phylogenies will need multiple loci fungal diversity cannot be covered Because the focus of the Barcode initiative is identification and is based only on a single short locus it has the potential to cover all fungal diversity AFToL and Barcode initiatives can mutually benefit

Why ITS for Fungi? Easy to amplify and sequence Mixture of highly conserved (5.8S) and variable sites (ITS 1 and 2) [change definition of DNA barcode] Most broadly used marker for systematic and ecological studies of fungi at the species level The majority of the fungal diversity is unknown Highest overlap with other genes (AFToL) Alignment issues across a broad phylogenetic spectrum can be dealt with

> 95%

Betsy Arnold, University of Arizona, Tucson Jolanta Miadlikowska, Duke University ≈ fungal species Estimated ≈ 1.5 million species Part of this diversity is known (herbaria)

Highest overlap with other genes (AFToL) Beyond a single marker inventory of life What are the putative ecological/physiological roles? What are the most closely related taxa? = phylogenies ≠ similarity Reconstruction of trait evolution on comprehensive phylogenies Branch lengths are important and trees not significantly worse than the optimal tree are needed = Bayesian MCMC searches More genes needed (nucSSU, nucLSU, RPB1, RPB2, mitSSU)

Most endophytes & endolichenic fungi are Ascomycota. Beyond this level, specificity is an open question. Parasitism vs. mutualism? Horizontal vs. vertical transmission?

Acknowledgments Betsy Arnold, University of Arizona at Tucson Cymon Cox, British Museum, England Frank Kauff, University of Kaiserslautern, Germany Jolanta Miadlikowska, Duke University National Science Foundation