New and old regions for barcoding. Teun Boekhout, CBS
Problems with mitochondrial genomes: Taphrinomycotina versus Saccharomycotina versus Pezizomycotina; Cryptococcus gattii
Three published scenarios on fission yeasts. [Figure: three alternative tree topologies, (1), (2) and (3), for Saccharomycotina, Taphrinomycotina and Pezizomycotina. Labels on the slide: most nuclear genes; no support; mitochondrial support.]
Cryptococcus gattii: 4 monophyletic lineages (phylogenetic species); 6 concordant nuclear gene phylogenies; mitochondrial versus nuclear genes incongruent: [ATP6 & MtLrRNA] vs [CNLAC, TEF1α, RPB1, RPB2, ITS, IGS1]; mitochondrial loci incongruent; mitochondrial recombination
Nuclear AFLP/MLST 4 with varying mitochondria. [Figure: mitochondrial (ATP6, MtLrRNA) versus nuclear genotypes of C. gattii strains; mitochondrial types M1 to M5. Colour key: blue = AFLP4; green = AFLP6; purple = AFLP4-like; yellow = AFLP6-like; red = chimera AFLP4/AFLP6. Haplotype combinations shown: ATP6 haplotype 3 + MtLrRNA haplotype 2; ATP6 haplotype 4 + MtLrRNA haplotype 2; ATP6 haplotype 1 + MtLrRNA haplotype 2; ATP6 haplotype 2 + MtLrRNA haplotype 1; ATP6 haplotype 2 + MtLrRNA haplotype 4.]
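The point of this slide, one nuclear genotype carrying several mitochondrial types, can be checked mechanically. The following is only a minimal sketch: the five ATP6/MtLrRNA combinations are the ones listed on the slide, but the isolate names are hypothetical placeholders, since the assignment of individual strains to combinations cannot be recovered from the figure.

```python
from collections import defaultdict

# isolate -> (nuclear genotype, ATP6 haplotype, MtLrRNA haplotype)
# Isolate names are hypothetical; the haplotype pairs are those on the slide.
isolates = {
    "isolate_1": ("AFLP4", 3, 2),
    "isolate_2": ("AFLP4", 4, 2),
    "isolate_3": ("AFLP4", 1, 2),
    "isolate_4": ("AFLP4", 2, 1),
    "isolate_5": ("AFLP4", 2, 4),
}


def mito_types_by_nuclear_group(records):
    """Collect the distinct (ATP6, MtLrRNA) pairs seen in each nuclear genotype."""
    groups = defaultdict(set)
    for nuclear, atp6, mtlrrna in records.values():
        groups[nuclear].add((atp6, mtlrrna))
    return dict(groups)


def groups_with_varying_mitochondria(records):
    """Nuclear genotypes that carry more than one mitochondrial type."""
    by_group = mito_types_by_nuclear_group(records)
    return sorted(g for g, combos in by_group.items() if len(combos) > 1)


mito_types = mito_types_by_nuclear_group(isolates)
flagged = groups_with_varying_mitochondria(isolates)
```

A genotype flagged here is a candidate for mitochondrial/nuclear incongruence; whether recombination is the cause still has to be judged from the gene trees themselves.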
[Figure: mitochondrial gene trees of Cryptococcus gattii strains with Cryptococcus neoformans, panels a and b (ATP6 and MtLrRNA); scale bar 0.1. Clade labels: AFLP4 / VGI / MLST4; AFLP5 / VGIII / MLST5; AFLP6 / VGII / MLST6; AFLP7 / VGIV / MLST7; mitochondrial types (M1) to (M5) marked on both trees.]
The ‘old’ alternative: rDNA. Universally present; ITS + D1/D2 LSU rDNA (Kurtzman, > 20 years); basidiomycetous and ascomycetous yeasts; yeast book chapters: both needed; clinical yeasts (C. glabrata / C. nivariensis / C. bracarensis); large datasets available; proven useful for ID (Kurtzman, Fell, CBS, et al.); Luminex technology (ID, M. Diaz, Miami): Cryptococcus, Trichosporon, Candida, Malassezia
Alternatives 2: ‘unique’ fungal pathways. E.g. ergosterol pathway (membranes); concatenated ERG genes resolve TOL; ERG7 (lanosterol synthase): fungi/animals; ERG1 (squalene epoxidase): fungi/animals/plants; ERG11 (lanosterol C14 demethylase): fungi/animals
Alternatives 3: comparative genomics. 33 fungal genomes; 4852 KOGs; 70 single-protein KOGs (paralogs!); 32 gene KOs not viable (S. cerevisiae) -> essential genes; reference tree based on 531 KOGs; KOG2671 showed the highest correlation (0.97) with the 531-KOG tree; cophenetic correlation between individual protein NJ trees > 0.5 -> 64 KOGs remaining; development of phylogenetic signal
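The selection step on this slide compares individual protein trees through a cophenetic correlation and keeps those above 0.5. A minimal sketch of that idea follows, assuming the pairwise tree distances have already been extracted from each tree. The taxon names, distance values and gene names are invented for illustration, and the comparison against a single reference tree is an assumption about the workflow, not the author's documented pipeline.

```python
from itertools import combinations
from math import sqrt


def pairwise_vector(dist, taxa):
    """Flatten a symmetric distance table into a vector over all taxon pairs."""
    return [dist[frozenset(pair)] for pair in combinations(taxa, 2)]


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def tree_correlation(dist_a, dist_b, taxa):
    """Correlation between the pairwise distances implied by two trees."""
    return pearson(pairwise_vector(dist_a, taxa), pairwise_vector(dist_b, taxa))


def table(values, taxa):
    """Build a distance table keyed by unordered taxon pairs."""
    return {frozenset(p): v for p, v in zip(combinations(taxa, 2), values)}


taxa = ["A", "B", "C", "D"]  # hypothetical taxa
reference = table([0.1, 0.4, 0.5, 0.4, 0.5, 0.2], taxa)  # invented distances
candidates = {
    "gene_scaled": table([0.2, 0.8, 1.0, 0.8, 1.0, 0.4], taxa),      # same shape, faster rate
    "gene_discordant": table([0.5, 0.1, 0.2, 0.4, 0.4, 0.5], taxa),  # conflicting signal
}

# Keep only candidates whose correlation with the reference exceeds 0.5.
kept = [name for name, d in candidates.items()
        if tree_correlation(d, reference, taxa) > 0.5]
```

Because the correlation is scale-invariant, a gene evolving faster but with the same tree shape still scores close to 1, which is the property that makes the measure usable across proteins with different rates.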
64 concatenated proteins. [Figure: tree; scale bar 0.1; asterisks mark branches with > 80%; clade labels I (IA; IB with IB1 to IB4; IC), II (IIA, IIB), III (IIIA). Taxa: Sac. cerevisiae RM11-1a, Sac. cerevisiae S288c, Sac. paradoxus, Sac. mikatae, Sac. kudriavzevii, Sac. bayanus, Sac. castellii, Can. glabrata, Ash. gossypii, Kluyveromyces lactis, Sac. kluyveri, Can. guilliermondii, Debaryomyces hansenii, Can. albicans, Can. lusitaniae, Yarrowia lipolytica, Chaetomium globosum, Neurospora crassa, Magnaporthe grisea, Fusarium graminearum, Sclerotinia sclerotiorum, Botrytis cinerea, Asp. fumigatus, Asp. nidulans, Coccidioides immitis, Stagonospora nodorum, Schizosaccharomyces pombe, Coprinopsis cinerea, Phanerochaete chrysosporium, Cry. neoformans var. neoformans JEC21, Cry. neoformans var. grubii H99, Ustilago maydis, Rhizopus oryzae, Caenorhabditis elegans.]
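Building a tree from concatenated proteins starts by joining the per-protein alignments into one supermatrix. The sketch below shows that step only, under the assumption that each alignment is a mapping from taxon to an already aligned sequence. The taxon keys and toy sequences are invented, and restricting the result to taxa present in every alignment is one possible design choice, not necessarily the one used for the slide.

```python
def concatenate_alignments(alignments):
    """Join per-gene alignments into one supermatrix.

    alignments: list of dicts mapping taxon -> aligned sequence.
    Only taxa present in every alignment are kept.
    """
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) > 1:
            raise ValueError("sequences within one alignment differ in length")

    shared = set(alignments[0])
    for aln in alignments[1:]:
        shared &= set(aln)

    return {taxon: "".join(aln[taxon] for aln in alignments)
            for taxon in sorted(shared)}


# Toy, invented alignments for two hypothetical proteins.
protein_a = {"S_cerevisiae": "MKT-A", "C_albicans": "MKTLA", "N_crassa": "MRTLA"}
protein_b = {"S_cerevisiae": "GGV", "C_albicans": "GAV"}

supermatrix = concatenate_alignments([protein_a, protein_b])
```

Here N_crassa is dropped because it is missing from the second alignment; padding missing genes with gap characters instead would keep more taxa at the cost of missing data.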
Development of phylogenetic signal: mainly ‘Information storage and processing’ and ‘Cellular processes and signaling’
Potential barcoding candidates (top 5 orthologues with high discriminatory potential): KOG2671, putative RNA methylase; KOG0340, ATP-dependent RNA helicase; KOG4089, predicted mitochondrial ribosomal protein L23; KOG S proteasome; KOG2728, uncharacterized conserved protein; etc.
My suggestions: there may be alternative genes; D1/D2 + ITS 1+2!; panfungal / paneukaryotic / single copy / PCRable; need to be investigated further for barcoding (and TOL) potential; option: multilocus barcoding? Webtool: different loci per (higher) taxon; selection depends on taxonomic resolution
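The webtool idea, different loci per (higher) taxon, amounts to a lookup that walks up a lineage until a taxon with its own locus set is found. The following is a rough sketch only: the function name, the table and the locus assignments in it are placeholders made up for illustration (the locus names are taken from earlier slides), not recommendations from the talk.

```python
# Placeholder table: which loci to sequence for which taxon.
# These assignments are illustrative assumptions, not curated advice.
LOCI_BY_TAXON = {
    "Cryptococcus": ["ITS", "IGS1", "RPB2"],
    "Fungi": ["ITS", "D1/D2 LSU"],
}


def select_loci(lineage, table=LOCI_BY_TAXON):
    """Return the loci for the most specific taxon in the lineage that has an entry.

    lineage: taxon names ordered from most specific to most general.
    """
    for taxon in lineage:
        if taxon in table:
            return table[taxon]
    return []  # no entry at any rank


crypto_loci = select_loci(["Cryptococcus gattii", "Cryptococcus", "Basidiomycota", "Fungi"])
candida_loci = select_loci(["Candida glabrata", "Candida", "Ascomycota", "Fungi"])
```

Ordering the lineage from species upward means a genus-level entry overrides the kingdom-level default, which matches the slide's point that locus selection depends on the taxonomic resolution required.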