Overview. What is Annotation? Annotation is the process of determining the location and function of all identifiable genes in a genome.


Overview

What is Annotation?
- Annotation is the process of determining the location and function of all identifiable genes in a genome.
- Annotation is an important part of bioinformatics:
  - whole-genome shotgun sequencing provides the raw material
  - annotation provides an interpretation of the sequencing results

Figure 1 from Stothard & Wishart (2006) Automated bacterial genome analysis and annotation. Current Opinion in Microbiology 9:
- Verify predicted function based on amino acid sequence homology
- Predict protein structure and localization
1. Find start and stop codons – separated by bp?
2. Find Shine-Dalgarno sequence (RBS) – upstream of start codon?
3. Find core promoter – consensus sequences for -10 & -35?
4. Find rho-independent terminator
5. Predict whether the gene could be organized into an operon – compare chromosomal neighborhood
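The first two checks in the list above (start/stop codons and an upstream Shine-Dalgarno sequence) can be illustrated with a minimal Python sketch. This is not the method used by IMG or by any tool named in these slides. The function names, the `min_codons` cutoff, the 20-base upstream window, the `AGGAGG` motif, and the set of alternative start codons are all assumptions chosen for illustration, and only the forward strand is scanned.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
# Assumption: ATG plus two common bacterial alternative starts.
START_CODONS = {"ATG", "GTG", "TTG"}


def find_orfs(seq, min_codons=30):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    start is the 0-based index of the start codon and end is the index
    just past the stop codon. In each frame, only the first start codon
    seen after the previous stop is reported. min_codons is a
    hypothetical length cutoff (start codon included, stop excluded).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in START_CODONS:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


def has_shine_dalgarno(seq, start, window=20, motif="AGGAGG", min_match=4):
    """Check for a Shine-Dalgarno-like stretch upstream of a start codon.

    Looks in the `window` bases before `start` for any contiguous piece
    of `motif` that is at least `min_match` bases long.
    """
    upstream = seq.upper()[max(0, start - window):start]
    cores = {motif[i:i + min_match] for i in range(len(motif) - min_match + 1)}
    return any(core in upstream for core in cores)


if __name__ == "__main__":
    # Toy sequence: an RBS-like stretch, then a short ORF.
    toy = "CCAGGAGGTTTACC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
    for start, end, frame in find_orfs(toy, min_codons=5):
        print(start, end, frame, has_shine_dalgarno(toy, start))
```

A real gene caller also scans the reverse complement and scores candidates rather than applying fixed cutoffs, which is one reason automated calls still need the manual verification described in these slides.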

What will we be doing?
- Verifying ORF calls
- Verifying function based on sequence conservation
- Verifying function based on structural conservation
- Verifying function based on localization data (insert image of E. coli lac permease)
Insert Figure 8-40 from Microbiology – An Evolving Science © 2009 W.W. Norton & Company, Inc.

Why manually annotate?
- Automated annotations tend to over-predict and produce many false positives
- Automated annotations also miss things
- Accuracy of any annotation is only as good as the quality of annotated genes in reference databases
- High sequencing error rates
- A curated, finished genome has gene calls verified & proteins organized into pathways

Undergraduates provide “human expertise”
GOAL: Demonstrate that student annotations can be accurate, up-to-date, reliable, and useful to the scientific community!
Possible solutions?
Reference paper: Genome re-annotation: a wiki solution? by Steven Salzberg, Genome Biology (2007), 8:102

What is imgACT?
- Web portal to access genome database, img/edu
- Contains wiki-based Lab Notebook & Report Page for organizing annotation data

What is img/edu?
- Simplified database for undergraduate genome annotation
- Features and functions similar to those found in IMG
- Directly linked to imgACT
- IMG companion system

What is IMG? INTEGRATED MICROBIAL GENOMES (IMG)
- Database managed by the U.S. Department of Energy (DOE) Joint Genome Institute (JGI)
- JGI is currently producing ~22% of the reported number of bacterial genome projects worldwide
- Key mission of IMG is to provide a data management platform that supports comprehensive analysis and annotation of all publicly available genomes in a comparative genomics context

What are we annotating? (insert information about organism including location/map of collection site, image and description of organism, etc.)

Why annotate a GEBA organism?
Phylogenetic tree of Bacteria showing established & candidate phyla
Insert Figure 1 from Handelsman (2004) Microbiol. Mol. Biol. Rev. 68:
- Note that genome sequences from members of those phyla in yellow and orange are under-represented relative to those in red
- The goal of GEBA (Genomic Encyclopedia of Bacteria and Archaea) is to sequence genomes from under-represented phyla

What is our goal?
- Annotate genes in pathways & complexes
Insert Figure 2 from Scott KM et al. (2006) The Genome of Deep-Sea Vent Chemolithoautotroph Thiomicrospira crunogena XCL-2. PLoS Biology, 4: 2196

Student Goals: Conceptual
- Apply basic concepts in biochemistry, microbial physiology & ecology, and evolutionary biology
- Question basic assumptions about biochemistry, physiology and evolution
- Understand the power and limitations of bioinformatics

Student Goals: Technical
- Proficiently use multiple database analysis software packages
- Strengthen web-based library search skills (PubMed)
- Develop skills creating hypotheses and designing experiments to test them
- Sharpen skills in analysis, synthesis and presentation of results and data interpretation
- Experience the collaborative nature of science

Annotation Project
- Each team will annotate genes encoding enzymes in a metabolic pathway or components of a cellular complex in [insert organism name]
- Your T.A. or instructor will tell you your specific assignments
- Consult the KEGG map and use orthologous genes in other related organisms to query the genome of [insert organism name] in the IMG/EDU database
- For the best “hit”, complete the corresponding modules of the imgACT lab notebook and lab report for that gene
- Complete the module(s) presented each week. The imgACT online notebook & report for Modules #1 – 8 must be finished for all genes assigned (3 per student).

Annotation Project: Assignments
- Online notebook checks end of weeks:
- Final Report due dates:

How do we get started?
- Click “Create an account”

Register for an img-act account
- Fields: address, First Name, Last Name
- Pick something you can remember
- Specific for our class
- No abbreviations or nicknames
- Click “Register” once information entered

Once registration complete, log in to imgACT

What you should see... If you can’t get this far, tell your instructor immediately! Winter 2010

Next, take pre-annotation survey Cookies must be enabled for survey to work properly.

What next? Practice!
- Explore the imgACT web portal
- “Practice gene”: All students will be assigned at least one gene, which should be used to navigate through the imgACT online lab notebook (Modules #1 – 8) and the lab report
- Note that students are not responsible for annotating this gene. It may be used to help students get used to navigating the web portal.

click

imgACT Lab Notebook The first time you log in to Lab Notebook, you will also need to log in to the wiki. Use the same username & password as created for imgACT account.

imgACT Lab Notebook Only responsible for Modules #1 – 8 in this class

imgACT Lab Report
- Corresponds to modules in Lab Notebook
- To be completed at end of the quarter