Part II GO-Vocabulary of Genome. S. cerevisiae D. melanogaster.

Part II GO-Vocabulary of Genome

S. cerevisiae

D. melanogaster

C. elegans
Cells that normally survive: CED-9 ON; CED-3, CED-4 OFF
Cells that normally die: CED-9 OFF; CED-3, CED-4 ON

M. musculus

Comparison of sequences from 4 organisms
MCM2, MCM3, CDC54/MCM4, CDC46/MCM5, MCM6, CDC47/MCM7
These proteins form a hexamer in the species that have been examined

A Common Language for Annotation of Genes from Yeast, Flies and Mice The Gene Ontologies …and Plants and Worms …and Humans …and anything else!

Gene Ontology
FlyBase - Drosophila - Cambridge, EBI, Harvard, Berkeley & Bloomington
SGD - Saccharomyces - Stanford
MGI - Mus - Jackson Labs., Bar Harbor

Gene Ontology - now
Fruitfly - FlyBase
Budding yeast - Saccharomyces Genome Database (SGD)
Mouse - Mouse Genome Database (MGD & GXD)
Rat - Rat Genome Database (RGD)
Weed - The Arabidopsis Information Resource (TAIR)
Worm - WormBase
Dictyostelium discoideum - dictyBase
InterPro/UniProt at EBI - InterPro
Fission yeast - Pombase
Human - UniProt, Ensembl, NCBI, Incyte, Celera, Compugen
Parasites (Plasmodium, Trypanosoma, Leishmania) - GeneDB, Sanger
Microbes (Vibrio, Shewanella, B. anthracis, …) - TIGR
Grasses (rice & maize) - Gramene database
Zebrafish - ZFIN

To provide structured controlled vocabularies for the representation of biological knowledge in biological databases.

– Be open source
– Use open standards
– Make data & code available without constraint
– Involve your community

Outline
– Introduction to the Gene Ontologies (GO)
– Annotations to GO terms
– GO Tools
– Applications of GO

Gene Ontology Objectives
GO represents concepts used to classify specific parts of our biological knowledge:
– Biological Process
– Molecular Function
– Cellular Component
GO develops a common language applicable to any organism
GO terms can be used to annotate gene products from any species, allowing comparison of information across species

GO: Three ontologies (for a gene product)
What does it do? Molecular Function
What processes is it involved in? Biological Process
Where does it act? Cellular Component

Example: Gene Product = hammer
Function (what) - Process (why)
Drive nail (into wood) - Carpentry
Drive stake (into soil) - Gardening
Smash roach - Pest Control
Clown’s juggling object - Entertainment

Biological Examples Molecular Function Biological Process Cellular Component

The 3 Gene Ontologies
Molecular Function = elemental activity/task
– the tasks performed by individual gene products; examples are carbohydrate binding and ATPase activity
Biological Process = biological goal or objective
– broad biological goals, such as mitosis or purine metabolism, that are accomplished by ordered assemblies of molecular functions
Cellular Component = location or complex
– subcellular structures, locations, and macromolecular complexes; examples include nucleus, telomere, and RNA polymerase II holoenzyme

Molecular Function
– A single reaction or activity, not a gene product
– A gene product may have several functions
– Sets of functions make up a biological process

Molecular Function

Carbonate dehydratase activity

Biological Process

Gluconeogenesis

Cellular Component where a gene product acts

Mitochondrial membrane

What’s in a GO term?
term: gluconeogenesis
id: GO:
definition: The formation of glucose from noncarbohydrate precursors, such as pyruvate, amino acids and glycerol.
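As a minimal sketch, the three fields of a GO term shown above can be held in a small record type. The class and field names (`GOTerm`, `go_id`, `aspect`) are illustrative choices, not part of any GO software, and the accession is left as a placeholder because the slide text omits it.

```python
from dataclasses import dataclass
from enum import Enum


class Aspect(Enum):
    """The three ontologies named in the slides."""
    MOLECULAR_FUNCTION = "molecular_function"
    BIOLOGICAL_PROCESS = "biological_process"
    CELLULAR_COMPONENT = "cellular_component"


@dataclass(frozen=True)
class GOTerm:
    go_id: str       # accession of the term
    name: str        # the term itself
    definition: str  # free-text definition
    aspect: Aspect   # which of the three ontologies the term belongs to


gluconeogenesis = GOTerm(
    go_id="GO:XXXXXXX",  # placeholder, not a real accession
    name="gluconeogenesis",
    definition=(
        "The formation of glucose from noncarbohydrate precursors, "
        "such as pyruvate, amino acids and glycerol."
    ),
    aspect=Aspect.BIOLOGICAL_PROCESS,
)
```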

What’s in a name?

Content of GO (as of October 2005)
Molecular Function: 7,309 terms
Biological Process: 10,041 terms
Cellular Component: 1,629 terms
Total: 18,975 terms
Definitions: 94.9%
Obsolete terms: 992

What’s in a name?
– Glucose synthesis
– Glucose biosynthesis
– Glucose formation
– Glucose anabolism
– Gluconeogenesis
All refer to the process of making glucose from simpler components
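The point about names can be sketched as a synonym lookup: several labels resolve to one controlled term. The table below uses only the names from the slide; the function names (`build_index`, `canonical_name`) are hypothetical.

```python
# One controlled term and the alternative names listed on the slide.
SYNONYMS = {
    "gluconeogenesis": [
        "glucose synthesis",
        "glucose biosynthesis",
        "glucose formation",
        "glucose anabolism",
    ],
}


def build_index(synonyms):
    """Map every name and synonym (lower-cased) to its controlled term."""
    index = {}
    for name, alternatives in synonyms.items():
        index[name.lower()] = name
        for alt in alternatives:
            index[alt.lower()] = name
    return index


INDEX = build_index(SYNONYMS)


def canonical_name(label):
    """Return the controlled term for a label, or None if it is not known."""
    return INDEX.get(label.strip().lower())
```

With such an index, a search for any of the five names retrieves the same term, which is the purpose of a controlled vocabulary.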

tree directed acyclic graph

Parent-Child Relationships
The cell component term Nucleus has 5 children:
– Nucleoplasm
– Nuclear envelope
– Chromosome
– Perinuclear space
– Nucleolus
A child is a subset of a parent’s elements

Ontology Relationships Directed Acyclic Graph
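The difference between a tree and a directed acyclic graph is that a term in a DAG may have more than one parent. A minimal sketch, using the Nucleus example from the slides: the second parent given to "nucleolus" is added only to illustrate multiple parentage and is not taken from the slides, and all function names are hypothetical.

```python
# child -> list of parents
PARENTS = {
    "nucleus": [],
    "non-membrane-bound organelle": [],  # illustrative second parent, not from the slides
    "nucleoplasm": ["nucleus"],
    "nuclear envelope": ["nucleus"],
    "chromosome": ["nucleus"],
    "perinuclear space": ["nucleus"],
    "nucleolus": ["nucleus", "non-membrane-bound organelle"],
}


def children(term, parents=PARENTS):
    """Direct children of a term, in alphabetical order."""
    return sorted(child for child, ps in parents.items() if term in ps)


def ancestors(term, parents=PARENTS):
    """All terms reachable by following parent links upward."""
    seen = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def is_tree(parents=PARENTS):
    """A tree allows at most one parent per term; a DAG does not have that limit."""
    return all(len(ps) <= 1 for ps in parents.values())
```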

Evidence Codes for GO Annotations

IEA - Inferred from Electronic Annotation
ISS - Inferred from Sequence or Structural Similarity
IEP - Inferred from Expression Pattern
IMP - Inferred from Mutant Phenotype
IGI - Inferred from Genetic Interaction
IPI - Inferred from Physical Interaction
IDA - Inferred from Direct Assay
RCA - Inferred from Reviewed Computational Analysis
TAS - Traceable Author Statement
NAS - Non-traceable Author Statement
IC - Inferred by Curator
ND - No biological Data available

IEA: Inferred from Electronic Annotation
– Sequence similarity (BLAST)
– Automatic transfer from mappings (InterPro2GO, EC2GO, etc.)
-> Not manually reviewed

ISS: Inferred from Sequence or Structural Similarity
– Sequence similarity
– Recognized domains
– Structural similarity
-> Use of ‘with’ column recommended

IEP: Inferred from Expression Pattern
– Transcript levels (Northerns, microarrays)
– Protein levels (Western blots)
-> Timing or localization of expression
-> Biological process annotations

IMP: Inferred from Mutant Phenotype
– Gene mutation/knockout
– Overexpression/ectopic expression
– Anti-sense experiments
– RNAi experiments
– Specific protein inhibitors

IGI: Inferred from Genetic Interaction
– Suppressors, synthetic lethals…
– Functional complementation
– Rescue experiments
-> Use of ‘with’ column recommended

IPI: Inferred from Physical Interaction
– 2-hybrid interactions
– Co-purification
– Co-immunoprecipitation
– Ion/complex/protein binding experiments
-> Use of ‘with’ column recommended

IDA: Inferred from Direct Assay
– Enzyme assays
– In vitro reconstitution (e.g. transcription)
– Immunofluorescence (for cellular component)
– Cell fractionation (for cellular component)
– Physical interaction/binding assay

RCA: Inferred from Reviewed Computational Analysis
– Non-sequence-based computational methods
– Genome-wide analyses (e.g. 2-hybrid)
– Combinations of large-scale experiments

TAS: Traceable Author Statement
– Support from review article
– Textbook ‘common knowledge’
-> Data that can be ‘traced’ back

NAS: Non-traceable Author Statement
– Database entries that don't cite a paper
-> Data that cannot be ‘traced’ back

IC: Inferred by Curator
– Not supported by any direct evidence
– Inferred from other GO annotations
-> GO term in ‘with/from’ column required

ND: No biological Data available
– molecular function unknown GO:
– biological process unknown GO:
– cellular component unknown GO:
Curator found no information supporting any annotation
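The evidence codes listed above lend themselves to a lookup table, and one common use is to set aside annotations that were never manually reviewed (IEA). The sketch below is illustrative only: the annotation tuples use made-up gene and term names, and `check_code` and `manually_reviewed` are hypothetical helpers.

```python
EVIDENCE_CODES = {
    "IEA": "Inferred from Electronic Annotation",
    "ISS": "Inferred from Sequence or Structural Similarity",
    "IEP": "Inferred from Expression Pattern",
    "IMP": "Inferred from Mutant Phenotype",
    "IGI": "Inferred from Genetic Interaction",
    "IPI": "Inferred from Physical Interaction",
    "IDA": "Inferred from Direct Assay",
    "RCA": "Inferred from Reviewed Computational Analysis",
    "TAS": "Traceable Author Statement",
    "NAS": "Non-traceable Author Statement",
    "IC": "Inferred by Curator",
    "ND": "No biological Data available",
}


def check_code(code):
    """Return the code unchanged, or raise if it is not one of the listed codes."""
    if code not in EVIDENCE_CODES:
        raise ValueError(f"unknown evidence code: {code!r}")
    return code


def manually_reviewed(annotations):
    """Keep (gene, term, code) tuples whose code is anything other than IEA."""
    return [a for a in annotations if check_code(a[2]) != "IEA"]


# Made-up annotations for illustration only.
ANNOTATIONS = [
    ("geneA", "term1", "IDA"),
    ("geneA", "term2", "IEA"),
    ("geneB", "term1", "TAS"),
]
```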

Term Hierarchy
TAS/IDA
IMP/IGI/IPI
ISS/IEP
NAS
IEA

Annotation summaries
Meloidogyne incognita: McCarter et al.

Annotation of gene products with GO terms
Mitochondrial P450

Cellular component: mitochondrial inner membrane GO:
Biological process: electron transport GO:
Molecular function: monooxygenase activity GO:
substrate + O2 = CO2 + H2O product

Other gene products annotated to monooxygenase activity (GO: )
- monooxygenase, DBH-like 1 (mouse)
- prostaglandin I2 (prostacyclin) synthase (mouse)
- flavin-containing monooxygenase (yeast)
- ferulate-5-hydroxylase 1 (Arabidopsis)

Annotate to finest granularity
Annotating to GO: automatically annotates to all of its parents; thus a product is annotated to both protein modification AND cytoskeleton organization
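The propagation rule can be sketched as a walk up the parent links: an annotation to one term implies annotation to every ancestor. The child term's ID is elided in the slides, so it appears here under the placeholder name "specific term"; the gene name and the helper functions are likewise hypothetical.

```python
# child -> list of parents; "specific term" stands in for the term whose ID is elided above
PARENTS = {
    "specific term": ["protein modification", "cytoskeleton organization"],
    "protein modification": [],
    "cytoskeleton organization": [],
}


def with_ancestors(term, parents=PARENTS):
    """The term itself plus every term reachable through parent links."""
    result = {term}
    stack = [term]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in result:
                result.add(parent)
                stack.append(parent)
    return result


def propagate(direct):
    """Expand each gene's direct annotations to include all ancestor terms."""
    return {
        gene: set().union(*(with_ancestors(t) for t in terms))
        for gene, terms in direct.items()
    }


# A made-up gene annotated directly to the most specific term only.
implied = propagate({"geneA": ["specific term"]})
```

Annotating at the finest granularity therefore loses nothing: the broader terms follow from the graph.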

Unknown vs. Unannotated
“Unknown” is used when the curator has determined that there is no existing literature to support an annotation.
– Biological process unknown GO:
– Molecular function unknown GO:
– Cellular component unknown GO:
NOT the same as having no annotation at all
– No annotation means that no one has looked yet
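The distinction amounts to three possible states per gene product. A minimal sketch, assuming annotations are stored as a mapping from gene to a list of term names; the gene names and the function `annotation_status` are made up for illustration.

```python
UNKNOWN_TERMS = {
    "biological process unknown",
    "molecular function unknown",
    "cellular component unknown",
}


def annotation_status(gene, annotations):
    """Classify a gene as 'unannotated', 'unknown', or 'annotated'."""
    terms = annotations.get(gene)
    if not terms:
        return "unannotated"  # no one has looked yet
    if all(term in UNKNOWN_TERMS for term in terms):
        return "unknown"      # a curator looked and found nothing
    return "annotated"


# Made-up example data.
EXAMPLE = {
    "geneA": ["monooxygenase activity"],
    "geneB": ["molecular function unknown"],
}
```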

Annotation of a genome
GO annotations are always work in progress
Part of normal curation process
– More specific information
– Better evidence code
Replace obsolete terms
“Last reviewed” date

How to access the Gene Ontology and its annotations
1. Downloads
– Ontologies
– Annotations: gene association files
– Ontologies and Annotations
2. Web-based access
– AmiGO
– QuickGO
among others…
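Gene association files are tab-delimited text with one annotation per line and comment lines starting with "!". As a hedged sketch, the reader below keeps the first nine columns (database, object ID, symbol, qualifier, GO ID, reference, evidence code, with/from, aspect); the sample line is entirely made up, the GO ID is a placeholder, and `parse_gaf` is a hypothetical helper rather than part of any GO tool.

```python
from typing import Iterable, Iterator, NamedTuple


class Annotation(NamedTuple):
    db: str
    object_id: str
    symbol: str
    qualifier: str
    go_id: str
    reference: str
    evidence: str
    with_from: str
    aspect: str  # P (process), F (function), or C (component)


def parse_gaf(lines: Iterable[str]) -> Iterator[Annotation]:
    """Yield one Annotation per data line, skipping blanks and '!' comments."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("!"):
            continue
        fields = line.split("\t")
        if len(fields) < 9:
            raise ValueError(f"expected at least 9 columns, got {len(fields)}")
        yield Annotation(*fields[:9])


# Made-up sample: every value is a placeholder.
SAMPLE = [
    "!gaf-version: 2.0",
    "ExampleDB\tEX0001\tgeneA\t\tGO:XXXXXXX\tEX_REF:1\tIDA\t\tP\t"
    "made-up product\t\tprotein\ttaxon:0\t20050101\tExampleDB",
]
records = list(parse_gaf(SAMPLE))
```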

Gene Ontology :

MicroArray data analysis
…analysis of high-throughput data according to GO
[Figure: expression over time, attacked vs. control. Labeled groups: Puparial adhesion; Molting cycle; hemocyanin; Defense response; Immune response; Response to stimulus; Toll regulated genes; JAK-STAT regulated genes; Immune response; Toll regulated genes; Amino acid catabolism; Lipid metabolism; Peptidase activity; Protein catabolism; Immune response]
Bregje Wertheim at the Centre for Evolutionary Genomics, Department of Biology, UCL and Eugene Schuster Group, EBI.
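One simple way to read high-throughput results "according to GO" is to count how many of the selected genes carry each term. The sketch below does only that tally and makes no statistical claim; the gene names and their term assignments are made up, and `term_counts` is a hypothetical helper.

```python
from collections import Counter

# Made-up gene-to-term assignments; term names are borrowed from the figure labels.
GENE_TO_TERMS = {
    "gene1": ["immune response", "defense response"],
    "gene2": ["immune response"],
    "gene3": ["peptidase activity"],
    "gene4": ["lipid metabolism"],
}


def term_counts(selected_genes, gene_to_terms=GENE_TO_TERMS):
    """Count, per GO term, how many of the selected genes are annotated to it."""
    counts = Counter()
    for gene in selected_genes:
        # set() guards against a term being listed twice for one gene
        counts.update(set(gene_to_terms.get(gene, [])))
    return counts


# Suppose gene1, gene2 and gene3 changed expression in the experiment.
counts = term_counts(["gene1", "gene2", "gene3"])
```

A real analysis would go on to compare these counts against the background of all genes on the array before calling a term over-represented.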

Ontologies: Anatomy, Physiology, Phenotype, Pathway, Disease, Molecular, Metabolic, Developmental Stage