From one to many: expanding the Saccharomyces cerevisiae reference genome panel Stacia R. Engel Stanford University.

Presentation transcript:

From one to many: expanding the Saccharomyces cerevisiae reference genome panel Stacia R. Engel Stanford University

From one to many… 1996: first yeast genome; 2006: second yeast genome; 2016: thousands of genome sequences

Expansion strategy: Freeze 1996 genome; Represent sequence variation; Comparison tools for users; Phenotypes, allelic differences; Obtain select genome sequences; Assembly / annotation pipeline; Panel of genomes

Automated AGAPE output (Giltae Song). Song et al. PLoS One 10:e0120671, Figure 1. www.yeastgenome.org

Expansion strategy: Freeze 1996 genome; Represent sequence variation; Comparison tools for users; Phenotypes, allelic differences; Obtain select genome sequences; Assembly and annotation pipeline; Panel of genomes; Curation. www.yeastgenome.org

Manual curation of the automated AGAPE output (Giltae Song). Song et al. PLoS One 10:e0120671, Figure 1.
Phase 1: Starts and stops; Multiple calls; Introns; Paralogs; Superfluous contigs
Phase 2: Chromosomal elements; RNA genes; Supercontigs; Omissions
Unmatched contig sequences
Legend: added, removed, edited, resolved annotations
www.yeastgenome.org

Curation strategy: Starts and stops; Multiple calls; Paralogs; RNA genes; Chromosomal elements; Superfluous contigs; Unmatched; Omissions. Legend: added, removed, edited, resolved

Manual curation of the automated AGAPE output (Olivia Lang)
Phase 1: Starts and stops; Multiple calls; Introns; Paralogs; Superfluous contigs
Phase 2: Chromosomal elements; RNA genes; Supercontigs; Unmatched; Omissions
Values shown: <2%; <1%; 15%; 2%; 1/2; 18%; 80% of ORFs; 5%
Timeline: Sept. 2014; Sept. 2015; work in progress
Legend: added, removed, edited, resolved
www.yeastgenome.org

Boundary differences www.yeastgenome.org

Superfluous contigs: Large number of redundant contigs (~50%); Unnecessarily complicate annotation; Removed from sequence files; No genes called; Short overall length; Ambiguous sequence

Strain      Original set   Curated set
CEN.PK      389            189
D273-10B    403            203
FL100       402            174
JK9-3d      431            197
RM11-1a     325            169
SEY6210     366            183
Σ1278b      451            206
W303        415            236
X2180-1A    409            212
Y55         413            198

www.yeastgenome.org
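
As a rough check of the "~50%" figure, the per-strain contig counts on this slide can be tallied. The following is a minimal Python sketch; the names `CONTIG_COUNTS` and `removed_fraction` are illustrative assumptions and are not part of the presentation or of the AGAPE pipeline.

```python
# Contig counts per strain as listed on the slide: (original set, curated set).
# Variable and function names here are hypothetical, chosen for illustration.
CONTIG_COUNTS = {
    "CEN.PK": (389, 189),
    "D273-10B": (403, 203),
    "FL100": (402, 174),
    "JK9-3d": (431, 197),
    "RM11-1a": (325, 169),
    "SEY6210": (366, 183),
    "Σ1278b": (451, 206),
    "W303": (415, 236),
    "X2180-1A": (409, 212),
    "Y55": (413, 198),
}


def removed_fraction(original, curated):
    """Return the fraction of contigs removed between the two sets."""
    return (original - curated) / original


# Totals across all ten strains.
total_original = sum(orig for orig, _ in CONTIG_COUNTS.values())
total_curated = sum(cur for _, cur in CONTIG_COUNTS.values())
overall = removed_fraction(total_original, total_curated)

# Per-strain and overall summary.
for strain, (orig, cur) in CONTIG_COUNTS.items():
    print(f"{strain:10s} {orig:4d} -> {cur:4d}  removed {removed_fraction(orig, cur):.1%}")
print(f"Overall: {total_original} -> {total_curated}  removed {overall:.1%}")
```

Across the ten strains the removed share ranges from roughly 43% (W303) to roughly 57% (FL100), which agrees with the slide's approximate figure.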

Manual curation of the automated AGAPE output (Giltae Song, Olivia Lang). Song et al. PLoS One 10:e0120671, Figure 1.
Phase 1: Starts and stops; Multiple calls; Introns; Paralogs; Superfluous contigs
Phase 2: Chromosomal elements; RNA genes; Supercontigs; Omissions
Unmatched contig sequences
Legend: added, removed, edited, resolved annotations
www.yeastgenome.org

Future directions… Incorporate into database: curated sequence files, annotations; Submit to NCBI’s GenBank: primary sequence repository; Scripts on GitHub: updates as needed; Expand panel further: emerging, underserved areas of study

Curation adds value. Olivia Lang, Gail Binkley, Shuai Weng, Giltae Song, J. Michael Cherry, Pedro Assis, Sage Hellerstedt, Kalpana Karra, Kevin MacPherson, Stuart Miyasato, Rob Nash, Travis Sheppard, Matt Simison, Marek Skrzypek, Edith Wong. www.yeastgenome.org