Department of Genetics • Stanford University School of Medicine


Manually curated and computationally predicted GO annotations at the Saccharomyces Genome Database • http://www.yeastgenome.org/ • Eurie L. Hong, Ph.D. • Department of Genetics • Stanford University School of Medicine

Scientific community • Data from high-throughput experiments • Data from traditional experiments • Genome sequence • Integrated data • Analysis tools
I'll talk a little bit about who we are and what we do. We are a community database that curates sequence, molecular biology, genetic, and biochemical information about the budding yeast S. cerevisiae. All the data available at SGD are generated by the scientific community; this includes high-throughput studies, traditional small-scale experiments, and sequencing efforts. These data are incorporated and integrated into SGD, and we provide search and analysis tools to help the scientific community view the data others have published as well as analyze their own data.

CHS6/YJL099W Locus Summary Page • Nomenclature • Summary of published data • Links to SGD tools and other databases • Curated data from published literature • Sequence information • Data from high-throughput experiments • Links to other databases
All the data at SGD are centrally organized around a chromosomal feature such as a gene, a telomere, or a centromere. All the data associated with that feature are displayed on a Locus Summary Page. Here we are looking at the locus summary page for LEU4. You can view the nomenclature, summaries of published data, and sequence information, and access other databases from a Locus Summary Page. All these data are curated from the literature and updated as the body of literature expands. I'll just highlight a few types of data from this page that might be of interest to you.

Accessing the data via files • ftp://ftp.yeastgenome.org/yeast/
Before I start, I want to emphasize that all our data are publicly available in downloadable files on our FTP site. We also have web interfaces that allow you to download data from searches, which I will point out later.

Display of GO Annotations

Status of GO Annotations at SGD • All protein and RNA gene products have been annotated with GO terms • All GO annotations are manually curated from literature (no IEA) • Genes without published characterization data (from Genome Snapshot, 8/23/2006): Cellular Component, 864 genes (13.7% of all genes); Biological Process, 1448 genes (23.0% of all genes); Molecular Function, 2112 genes (33.6% of all genes)
The scientific literature describing the biological role of a gene product is captured with Gene Ontology terms. Gene Ontology is a controlled vocabulary that contains relationships between the terms. These relationships allow you to do further computational analysis of the knowledge about a gene. GO is also used by other model organism databases. Because these are controlled vocabulary terms with definitions, you know that FlyBase's use of "leucine biosynthesis" is the same as SGD's use of the term.

Sources of Computationally Predicted GO Annotations • InterPro domain matches in S. cerevisiae proteins (source: GOA project) • Integrated analysis of multiple datasets (source: publications, external databases)

CHS6/YJL099W Locus Summary Page

Identifying Types of GO Annotations

CHS6/YJL099W GO Annotation Page • Core GO Annotations • GO Annotations from Large Scale Experiments • Computationally Predicted GO Annotations

Changes to GO Term Finder • Current functionality • Specify background set • Refine annotations used by annotation source or evidence codes
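The two options named on this slide, a user-specified background set and restricting annotations by source or evidence code, can be illustrated with a small Python sketch. Everything here is hypothetical: the gene names, term names, source labels, and function names are made up for illustration. The hypergeometric test is a common choice for term enrichment and is used as an assumption; the talk does not describe how GO Term Finder computes its statistics.

```python
from math import comb

# Toy annotation records: (gene, GO term, evidence code, annotation source).
# All genes, terms, and source labels are invented for illustration.
ANNOTATIONS = [
    ("GENE1", "term_A", "IDA", "manual"),
    ("GENE2", "term_A", "IMP", "manual"),
    ("GENE3", "term_A", "IEA", "computational"),
    ("GENE4", "term_B", "IDA", "high-throughput"),
    ("GENE5", "term_A", "IDA", "manual"),
    ("GENE6", "term_B", "IEA", "computational"),
]


def refine(annotations, evidence_codes=None, sources=None):
    """Keep annotations whose evidence code and source are allowed (None = no restriction)."""
    return [
        a for a in annotations
        if (evidence_codes is None or a[2] in evidence_codes)
        and (sources is None or a[3] in sources)
    ]


def genes_with_term(annotations, term, universe):
    """Genes in `universe` that carry `term` in the given annotation set."""
    return {gene for gene, t, _, _ in annotations if t == term and gene in universe}


def enrichment_p(query, background, annotations, term):
    """P(at least k query genes carry `term`) under a hypergeometric model (an assumption)."""
    background = set(background)
    query = set(query) & background  # query genes outside the background are ignored
    N = len(background)
    K = len(genes_with_term(annotations, term, background))
    n = len(query)
    k = len(genes_with_term(annotations, term, query))
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)


background = ["GENE1", "GENE2", "GENE3", "GENE4", "GENE5", "GENE6"]
query = ["GENE1", "GENE2"]

manual_only = refine(ANNOTATIONS, sources={"manual"})
p_manual = enrichment_p(query, background, manual_only, "term_A")
p_all = enrichment_p(query, background, ANNOTATIONS, "term_A")
print(p_manual, p_all)
```

In this toy data the same query list gives a different p-value depending on which annotations are allowed, which is why the choice of annotation source, evidence codes, and background set matters when interpreting an enrichment result.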

Improving GO Annotations • Computationally predicted GO annotations • Manually curated GO annotations • Computational predictions may indicate publications that were overlooked • Review inconsistencies between computationally predicted and manually curated GO annotations to improve mappings and manually curated annotations • Review inconsistencies between computationally predicted and manually curated GO annotations to improve ontology

Additional Annotations Using Interpro2GO • Information added to genes with no published characterization data: Molecular Function, 468 genes; Biological Process, 316 genes; Cellular Component, 207 genes (from gene_association.goa_uniprot, 7/2006)
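Per-aspect gene counts like these can be derived from a gene association file. The sketch below assumes the tab-delimited GO annotation file layout, in which column 2 holds the object ID, column 7 the evidence code, and column 9 the aspect (F, P, or C). The sample rows, identifiers, and the function name are invented for illustration and are not taken from gene_association.goa_uniprot.

```python
from collections import defaultdict

ASPECT_NAMES = {
    "F": "Molecular Function",
    "P": "Biological Process",
    "C": "Cellular Component",
}


def genes_per_aspect(lines, evidence_code="IEA"):
    """Count distinct annotated objects per aspect for one evidence code."""
    genes = defaultdict(set)
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue  # skip header/comment lines and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete annotation row
        object_id, evidence, aspect = cols[1], cols[6], cols[8]
        if evidence == evidence_code and aspect in ASPECT_NAMES:
            genes[ASPECT_NAMES[aspect]].add(object_id)
    return {name: len(ids) for name, ids in genes.items()}


# Made-up rows (first nine columns only); IDs and GO terms are placeholders.
rows = [
    ["DB", "ID0001", "SYM1", "", "GO:TOY0001", "REF:1", "IEA", "", "F"],
    ["DB", "ID0001", "SYM1", "", "GO:TOY0002", "REF:1", "IEA", "", "F"],  # same gene, counted once
    ["DB", "ID0002", "SYM2", "", "GO:TOY0003", "REF:1", "IEA", "", "P"],
    ["DB", "ID0003", "SYM3", "", "GO:TOY0004", "REF:2", "IDA", "", "C"],  # not IEA, skipped
    ["DB", "ID0003", "SYM3", "", "GO:TOY0005", "REF:1", "IEA", "", "F"],
]
lines = ["!toy header comment"] + ["\t".join(r) for r in rows]

counts = genes_per_aspect(lines)
print(counts)
```

To reproduce a count restricted to genes with no published characterization data, the object IDs would additionally need to be filtered against a list of such genes, which this sketch does not include.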

Preliminary Comparison: Cellular Component Annotations • 5946 IEA • 9059 IC+IDA+IEP+IGI+IMP+IPI+ISS+NAS+RCA+TAS
[Chart. Legend categories: Interpro2go annotation matches curated annotation; Interpro2go annotation is ancestor of curated annotation; Interpro2go annotation for an unknown; Shared parent is child of root term; Shared parent is root term; Other shared parent term; Other. Percentage labels on the chart: 38%, 43%, 15%, 18%, 2%, 4%; their assignment to categories is not recoverable from the transcript.]
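One way to assign a predicted/curated pair of terms to the categories in this comparison is to walk the ontology graph. The following Python sketch uses a toy ontology with made-up term names. The rules for "Other" and for picking the deepest shared parent are assumptions, since the slide does not define how the categories were computed.

```python
# Toy ontology: each term maps to its set of direct parents. Names are illustrative.
PARENTS = {
    "root": set(),
    "A": {"root"},
    "B": {"root"},
    "A1": {"A"},
    "A2": {"A"},
    "A1a": {"A1"},
    "A1b": {"A1"},
    "B1": {"B"},
}
ROOT = "root"


def ancestors(term):
    """All terms reachable by following parent links, excluding the term itself."""
    seen, stack = set(), list(PARENTS[term])
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(PARENTS[t])
    return seen


def depth(term):
    """Length of the longest path from the root to `term`."""
    return 0 if not PARENTS[term] else 1 + max(depth(p) for p in PARENTS[term])


def compare(predicted, curated):
    """Classify a predicted term against a curated term (None = curated as unknown)."""
    if curated is None:
        return "annotation for an unknown"
    if predicted == curated:
        return "matches curated annotation"
    if predicted in ancestors(curated):
        return "ancestor of curated annotation"
    if curated in ancestors(predicted):
        return "other"  # assumption: predicted term is more specific than curated
    shared = ancestors(predicted) & ancestors(curated)
    deepest = max(shared, key=depth)  # most specific shared parent
    if deepest == ROOT:
        return "shared parent is root term"
    if ROOT in PARENTS[deepest]:
        return "shared parent is child of root term"
    return "other shared parent term"


for pair in [("A1", "A1"), ("A", "A1a"), ("A1a", "A1b"), ("A1", "A2"), ("A1", "B1"), ("A1", None)]:
    print(pair, "->", compare(*pair))
```

The sketch assumes a single-rooted ontology, so any two terms share at least the root. A real comparison would load the parent links from the ontology file rather than from a hand-written dictionary.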

Summary • Currently, all GO annotations for S. cerevisiae gene products are manually curated from literature • SGD will incorporate computationally predicted GO annotations that will provide additional information for a gene product's role in biology • Computationally predicted GO annotations will be used to refine and improve manually curated GO annotations at SGD

yeast-curator@genome.stanford.edu