Gapless genome assembly of Colletotrichum higginsianum reveals chromosome structure and association of transposable elements with secondary metabolite.


Gapless genome assembly of Colletotrichum higginsianum reveals chromosome structure and association of transposable elements with secondary metabolite gene clusters Dallery et al. 2017

Colletotrichum higginsianum
Max Planck Institute
Pathogenic fungus
Infects Brassicaceae crops, and also the model plant Arabidopsis thaliana, in tropical and subtropical regions
Important model pathosystem for studying the molecular basis of fungal pathogenicity and host response
O’Connell et al. 2012

Rationale
Affects crop yields
Previous genome assembly was highly fragmented
Looking for role of transposable elements (TEs) in gene and genome evolution
Better understanding of genome structure of pathogenic fungi

Methods
10 μg genomic DNA → ~20 kb size-selected library
Sequenced on PacBio RS II platform
De novo assembly with the Hierarchical Genome Assembly Process (HGAP) approach
Reads filtered for min. 500 bp length
Genome consensus sequence polished with Quiver
Assembly validated w/ PCR
Illumina sequencing w/ 100 bp paired-end reads (used only to detect sequence polymorphisms)
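The 500 bp read-length filter can be illustrated with a minimal sketch. This is not how HGAP applies the filter internally. It assumes reads are available as plain FASTA text, and `parse_fasta`, `filter_by_length`, and the toy reads are hypothetical names and data introduced only for illustration.

```python
def parse_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def filter_by_length(records, min_length=500):
    """Keep only reads whose sequence is at least min_length bases long."""
    return [(name, seq) for name, seq in records if len(seq) >= min_length]


# Hypothetical toy reads, not data from the study
toy_lines = [">r1", "A" * 600, ">r2", "ACGT", ">r3", "A" * 250, "C" * 250]
kept = filter_by_length(parse_fasta(toy_lines), min_length=500)
print([name for name, _ in kept])
```

Here the cutoff is inclusive, so a read of exactly 500 bp is kept. Whether the original pipeline treats the boundary the same way is not stated on the slide.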

Even More Methods
REPET pipelines to detect and classify TEs and simple sequence repeats
Analysis of repeat-induced point mutations
Gene predictions with MAKER2, SNAP, Augustus from Illumina reads
Functional annotations via BLASTp and Blast2GO
Secondary metabolite cluster predictions from SMURF, antiSMASH v.3.0, SMIPS, and CASSIS
Phylogenetic analysis of secondary metabolism key genes (MEGA6 and Treedyn)

Final Methods Slide
Analysis of distance of TEs to genes and gene clusters
Segmental duplication analysis (SDDetector w/ PacBio unitigs)
Transcriptome analysis (previous RNA-Seq data)
Basically a bunch of experimental validation of transcriptome data

Genome Assembly
7.8 Gb of raw sequence reads
92,834 error-corrected reads
N50 length 16,193 bp
Final edited assembly = 28 unitigs (unitigs = high-confidence contigs)
12 largest unitigs = chromosomes, covering 99.14% of the genome assembly
Total length = 50.82 Mb
Not actually gapless: there is a gap on Chr 7 (liars)

Genome Assembly
Genome assembly compared to previous 2012 assembly

Assembly statistic       2012 assembly    2017 assembly
PacBio read coverage     -                133x
Sanger read coverage     0.2x             -
Illumina read coverage   76x              -
454 read coverage        25x              -
Genome physical size     53.35 Mb         -
Assembly length          49.05 Mb         50.72 Mb
Alignable sequence       77.14 kb         50.38 Mb
Number of contigs        10,259           28
Largest contig           49.23 kb         6.04 Mb
N50 contig length        6.15 kb          5.20 Mb
Complete genes           2946 (79%)       3616 (97%)
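For readers unfamiliar with the N50 statistic used above, here is a minimal sketch of how it is usually computed: sort the contig lengths from longest to shortest and report the length at which the running total first reaches half of the summed length. The function name `n50` and the toy lengths are illustrative assumptions, not values from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half the total is the N50
        if running >= half_total:
            return length


# Hypothetical toy contig lengths (arbitrary units), not taken from the assembly
toy_contigs = [6, 5, 4, 3, 2]
print(n50(toy_contigs))  # prints 5: 6 + 5 = 11 reaches half of the total of 20
```

A higher N50 at a similar total length means the assembly sits in fewer, longer pieces, which is the contrast the table draws between 6.15 kb and 5.20 Mb.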

Results
2699 MAKER2 genes match previous gene models
2289 new genes w/ no match in previous annotation
Includes 132/133 genes on Chr 12

Results
Mini-chromosomes 11 & 12 have half the gene content of the ‘core’ chromosomes
Lower gene expression
Much higher TE content

Results

Results Secondary Metabolite (SM) Gene Clusters

Results
Genes in SM clusters and genes encoding candidate secreted effector proteins were found significantly closer to TEs than random genes across the whole genome
Many copies of large TEs
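One way to test a "closer to TEs than random genes" claim is a permutation comparison, sketched below. The slides do not say which statistical test the authors used, so this is an assumed stand-in rather than their method. It also assumes all features lie on a single chromosome and are given as (start, end) coordinates. Every name and coordinate here is hypothetical.

```python
import random


def interval_distance(a, b):
    """Gap in bp between two (start, end) intervals; 0 if they overlap."""
    (a_start, a_end), (b_start, b_end) = a, b
    return max(0, max(a_start, b_start) - min(a_end, b_end))


def nearest_te_distance(gene, tes):
    """Distance from one gene interval to the closest TE interval."""
    return min(interval_distance(gene, te) for te in tes)


def permutation_p_value(subset_ids, genes, tes, n_perm=1000, seed=0):
    """Compare the mean nearest-TE distance of a gene subset with random
    gene sets of the same size; returns (observed_mean, one_sided_p)."""
    dist = {gid: nearest_te_distance(iv, tes) for gid, iv in genes.items()}
    k = len(subset_ids)
    observed = sum(dist[g] for g in subset_ids) / k
    rng = random.Random(seed)
    ids = sorted(dist)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(ids, k)
        if sum(dist[g] for g in sample) / k <= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly above zero
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical coordinates on one chromosome, not data from the study
toy_genes = {"g1": (100, 200), "g2": (1000, 1100),
             "g3": (5000, 5100), "g4": (9000, 9100)}
toy_tes = [(250, 400)]
observed_mean, p_value = permutation_p_value(["g1"], toy_genes, toy_tes)
print(observed_mean, p_value)
```

A real analysis would handle multiple chromosomes and strand, and would likely use an interval index rather than this all-against-all scan.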

Results
Found 6 segmental duplications
4 of these are at chromosome ends and/or regions of highly similar repeats

Actually Cool Results
Some TE families are subject to Repeat-Induced Point (RIP) mutations
RIP occurs during the sexual cycle (just before meiosis), but this fungus is asexual
So RIP occurred either during an ancestral sexual state, or there is cryptic meiosis happening
~30% of TEs appear active
60% of expressed SM clusters are expressed only during plant infection
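RIP converts C:G to T:A base pairs in repeated sequences, so it leaves a dinucleotide signature that is commonly summarised with two ratios: the product index TpA/ApT and the substrate index (CpA + TpG)/(ApC + GpT). The sketch below computes both for a single sequence. The slides do not say which RIP metrics or tools the authors used, so treat this as an assumed illustration. The function names and the toy sequence are hypothetical.

```python
def count_dinucleotides(seq):
    """Count overlapping dinucleotides in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    counts = {}
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        counts[pair] = counts.get(pair, 0) + 1
    return counts


def rip_indices(seq):
    """Return (product_index, substrate_index) for one sequence.

    product index   = TpA / ApT
    substrate index = (CpA + TpG) / (ApC + GpT)
    A ratio with a zero denominator is reported as NaN.
    """
    c = count_dinucleotides(seq)

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    product = ratio(c.get("TA", 0), c.get("AT", 0))
    substrate = ratio(c.get("CA", 0) + c.get("TG", 0),
                      c.get("AC", 0) + c.get("GT", 0))
    return product, substrate


# Hypothetical toy sequence, far too short for a real RIP analysis
print(rip_indices("CATGACGT"))  # prints (0.0, 1.0)
```

A high product index together with a low substrate index is the usual sign that a repeat family has been RIPped. In practice the indices are computed per TE copy or in sliding windows and compared against non-repetitive control sequence.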

Conclusions
A complete genome assembly is key to analysis of TEs, telomeres, structural rearrangements, and large gene clusters
The mini-chromosomes differ dramatically from the core genome in gene and repeat content
They resemble conditionally dispensable chromosomes and carry pathogenicity-related (?) genes
Repeat-mediated segmental duplication (e.g. via ectopic recombination) likely accelerated the evolution of pathogenicity-related genes
SM gene cluster inventory will help to ID novel bioactive molecules and their biosynthetic pathways

Questions
Would the Illumina library have helped the genome assembly if it had been included?
Are unitigs as good as scaffolds when used in their place?
If the fungus lost its mini-chromosomes, would it be significantly less pathogenic?