Comparison of Gene Ontology Term Annotations Between E. coli K12 Databases. REDDYSAILAJA MARPURI, WESTERN KENTUCKY UNIVERSITY


Why Genome Annotation? Annotation of a genome entails identification of genes in terms of precise start and end sites and description of cellular components, molecular functions and biological processes. Annotation forms the basis for extrapolation of gene functions based on similarity between protein sequences.

Gene: the basic hereditary unit of a living organism. Genome: the entire complement of all genes. Genome annotation: the process of attaching biological information to a genome sequence, in two main steps: 1) gene finding, 2) attaching biological information.

Gene sequence: DNA. Gene annotation by ontology: Gene Ontology is described by a defined library of terms related to the cellular component, biological process and molecular function of a gene in an organism.

The growing wealth of genomic data has led to the development of tools that assist in processing information about genes, their products and functions. Gene Ontology is an important annotation tool that uses a controlled vocabulary and a hierarchy of terms.

What is a cell?

Why E. coli? E. coli K12 is the primary model bacterial organism, with a small genome and much functional information available. The genome of E. coli K12 was sequenced in 1997.

The goal of this project was to compare automated annotation programs with validated hand curation.

Databases compared: EcoCyc and BASys. EcoCyc: most complete and standard; multi-dimensional annotation. BASys (Bacterial Annotation System): automated genome annotation system, built around 30 programs.

Experimental Procedure: GO files were downloaded, and non-redundant gene IDs and GO terms were extracted using a program written in Mathematica.
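The extraction step can be illustrated with a minimal Python sketch. It is not the Mathematica program used in the project. It assumes a hypothetical tab-separated file with a gene ID in the first column and a GO term in the second; the function name `extract_gene_go_pairs` and the sample rows are illustrative only and are not taken from either database.

```python
import csv
import io
from collections import defaultdict

def extract_gene_go_pairs(handle):
    """Return {gene_id: set of GO terms}; using sets drops duplicate pairs."""
    annotations = defaultdict(set)
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines, comment lines, and rows lacking both columns.
        if len(row) < 2 or row[0].startswith("!"):
            continue
        gene_id, go_term = row[0].strip(), row[1].strip()
        if go_term.startswith("GO:"):
            annotations[gene_id].add(go_term)
    return dict(annotations)

# Illustrative input only (assumed layout, made-up rows).
sample = io.StringIO(
    "!hypothetical header line\n"
    "b0001\tGO:0009088\n"
    "b0001\tGO:0009088\n"   # redundant pair, collapsed by the set
    "b0002\tGO:0004072\n"
    "b0002\tGO:0009067\n"
)
pairs = extract_gene_go_pairs(sample)
print(pairs)
```

Storing each gene's terms in a set is what makes the result non-redundant: a pair listed twice in the file is kept once.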

An estimated 4200 genes in E. coli. Of them, … have gene IDs and ontologies assigned by EcoCyc. BASys is a program that automatically annotates genes and assigns Gene Ontology terms; … have IDs and ontologies assigned by BASys. RESULTS

How many genes with ontology assignments from BASys were validated in EcoCyc? 253 genes have ontologies common to both EcoCyc and BASys.

Total GO numbers in BASys and EcoCyc: 2.8% common to both databases, 59.6% unique to BASys, 26.7% unique to EcoCyc, and 10.88% non-assigned (true negatives).
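A breakdown of this kind can be sketched with set operations. The code below is a sketch under assumptions, not the project's method: it treats each annotation as a (gene ID, GO term) pair, uses placeholder identifiers, and leaves out the non-assigned category, which would require the full gene list. The names `compare_annotations` and `percentages` are hypothetical.

```python
def compare_annotations(basys, ecocyc):
    """Partition (gene, GO term) pairs into common and database-specific sets."""
    basys_pairs = {(gene, term) for gene, terms in basys.items() for term in terms}
    ecocyc_pairs = {(gene, term) for gene, terms in ecocyc.items() for term in terms}
    return {
        "common": basys_pairs & ecocyc_pairs,
        "basys_only": basys_pairs - ecocyc_pairs,
        "ecocyc_only": ecocyc_pairs - basys_pairs,
    }

def percentages(parts):
    """Express each partition as a percentage of all distinct pairs."""
    total = sum(len(pairs) for pairs in parts.values())
    if total == 0:
        return {name: 0.0 for name in parts}
    return {name: round(100 * len(pairs) / total, 1) for name, pairs in parts.items()}

# Toy data with placeholder identifiers (not real annotations).
basys = {"g1": {"GO:0000001", "GO:0000002"}, "g2": {"GO:0000003"}}
ecocyc = {"g1": {"GO:0000001"}, "g3": {"GO:0000004"}}

parts = compare_annotations(basys, ecocyc)
print(percentages(parts))
```

Comparing pairs rather than bare GO terms means a term only counts as common when both databases attach it to the same gene.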

Conclusions: 14% of BASys ontology assignments were validated with EcoCyc, but BASys missed 70% of the validated annotations. BASys makes liberal assignments, whereas EcoCyc makes more conservative ones. Liberal assignments indicate which directions to pursue when validating with bench work.
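One plausible way to read these two percentages, stated here as an assumption because the slides do not give the denominators or say whether genes or GO terms were counted, is as set fractions. Let B be the set of BASys assignments and E the set of hand-curated EcoCyc assignments; both symbols are introduced only for this illustration.

```latex
\[
\text{validated fraction} = \frac{|B \cap E|}{|B|},
\qquad
\text{missed fraction} = \frac{|E \setminus B|}{|E|} = 1 - \frac{|B \cap E|}{|E|}
\]
```

Under this reading, a liberal annotator has a large B, which lowers the validated fraction, while the missed fraction depends only on how much of E it recovers. The 14% and 70% figures are not re-derived here.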


Acknowledgements: Dr. Claire Rinehart

Thank you