GENE ONTOLOGY FOR THE NEWBIES Suparna Mundodi, PhD The Arabidopsis Information Resource (TAIR), Stanford, CA.


A Common Language for Annotation of Genes from Yeast, Flies and Mice The Gene Ontologies …and Plants and Worms …and Humans …and anything else!

Outline of Topics
- Introduction to the Gene Ontologies (GO)
- Annotations to GO terms
- GO Tools
- Applications of GO

Gene Ontology
- Gene annotation system
- Controlled vocabulary that can be applied to all organisms
- Used to describe gene products

What's in a name? What is a cell?

[Figure: five successive images, each labeled "Cell"; image source not preserved]

Bud initiation?

- bud initiation sensu Metazoa
- bud initiation sensu Saccharomyces
- bud initiation sensu Viridiplantae

What's in a name?
- The same name can be used to describe different concepts

What’s in a name?

- Glucose synthesis
- Glucose biosynthesis
- Glucose formation
- Glucose anabolism
- Gluconeogenesis
All refer to the process of making glucose from simpler components.

What's in a name?
- The same name can be used to describe different concepts
- A concept can be described using different names
=> Comparison is difficult, in particular across species or across databases

What is the Gene Ontology?
A (part of the) solution:
- A controlled vocabulary that can be applied to all organisms
- Used to describe gene products (proteins and RNA) in any organism
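As a minimal sketch of how a controlled vocabulary addresses the naming problem, the Python snippet below maps the glucose synonyms from the earlier slide onto one canonical term. The identifier `EX:0001` and the `canonical_term` helper are hypothetical placeholders for illustration. They are not real GO IDs or part of any GO tool.

```python
# Hypothetical controlled vocabulary: one canonical term per concept.
# "EX:0001" is a placeholder identifier, not a real GO ID.
VOCABULARY = {
    "EX:0001": {
        "name": "gluconeogenesis",
        "synonyms": {
            "glucose synthesis",
            "glucose biosynthesis",
            "glucose formation",
            "glucose anabolism",
        },
    },
}


def canonical_term(label):
    """Return (term_id, canonical_name) for a free-text label, or None."""
    wanted = label.strip().lower()
    for term_id, entry in VOCABULARY.items():
        if wanted == entry["name"] or wanted in entry["synonyms"]:
            return term_id, entry["name"]
    return None


# Different names used by different databases resolve to the same term.
for label in ["Glucose synthesis", "Glucose anabolism", "Gluconeogenesis"]:
    print(label, "->", canonical_term(label))
```

Once every database stores the identifier rather than the free-text name, records can be compared across species and across databases.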

How does GO work?
What information might we want to capture about a gene product?
- What does the gene product do?
- Why does it perform these activities?
- Where does it act?

The 3 Gene Ontologies
- Molecular Function = elemental activity/task: the tasks performed by individual gene products; examples are carbohydrate binding and ATPase activity
- Biological Process = biological goal or objective: broad biological goals, such as mitosis or purine metabolism, that are accomplished by ordered assemblies of molecular functions
- Cellular Component = location or complex: subcellular structures, locations, and macromolecular complexes; examples include nucleus, telomere, and RNA polymerase II holoenzyme

Example: Gene Product = hammer

Function (what)            Process (why)
Drive nail (into wood)     Carpentry
Drive stake (into soil)    Gardening
Smash roach                Pest Control
Clown's juggling object    Entertainment

Ontology Structure
Ontologies can be represented as graphs, where the nodes are connected by edges
- Nodes = concepts in the ontology
- Edges = relationships between the concepts
[Figure: graph with one node and one edge labeled]

- The Gene Ontology is structured as a hierarchical directed acyclic graph (DAG)
- Terms can have more than one parent and zero, one or more children
- Terms are linked by two relationships:
  - is-a
  - part-of

Directed Acyclic Graphs (DAG)
[Figure: DAG fragment with is-a and part-of edges connecting the terms protein complex, [other protein complexes], organelle, [other organelles], mitochondrion, and fatty acid beta-oxidation multienzyme complex]
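The DAG idea can be sketched in a few lines of Python. The terms are taken from the slide, but the specific edges below are assumptions made for illustration, since the figure itself is not preserved. The point is only that a term may have more than one parent, reached through different relationships.

```python
# Each term maps to a list of (relationship, parent) pairs.
# The edges are assumed for illustration; they are not read from the GO files.
PARENTS = {
    "protein complex": [],
    "organelle": [],
    "mitochondrion": [("is-a", "organelle")],
    "fatty acid beta-oxidation multienzyme complex": [
        ("is-a", "protein complex"),
        ("part-of", "mitochondrion"),
    ],
}


def ancestors(term, parents=PARENTS):
    """Collect every term reachable by following parent edges upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relationship, parent in parents[current]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def children(term, parents=PARENTS):
    """Return the terms that list `term` as a direct parent."""
    return {
        child
        for child, links in parents.items()
        if any(parent == term for _relationship, parent in links)
    }


print(sorted(ancestors("fatty acid beta-oxidation multienzyme complex")))
print(sorted(children("organelle")))
```

Storing parents per term, rather than one parent per term, is what distinguishes this structure from a simple tree.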

Parent-Child Relationships
- A child is a subset of a parent's elements
- The cellular component term Nucleus has 5 children: Nucleoplasm, Nuclear envelope, Chromosome, Perinuclear space, Nucleolus

True Path Rule
The path from a child term all the way up to its top-level parent(s) must always be true

cell
[part-of] cytoplasm
[part-of] chromosome
[is-a] nuclear chromosome
[is-a] cytoplasmic chromosome
[is-a] mitochondrial chromosome
[part-of] nucleus
[part-of] nuclear chromosome

Legend: [is-a], [part-of]
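One practical consequence of the true path rule is that an annotation to a child term implies annotation to every term on the path up to the root. The sketch below shows that expansion. The edges and the gene name `geneA` are hypothetical, chosen only to mirror the terms on the slide.

```python
# Assumed parent links for illustration only, using terms from the slide.
PARENTS = {
    "cell": [],
    "nucleus": ["cell"],
    "chromosome": ["cell"],
    "nuclear chromosome": ["chromosome", "nucleus"],
}


def terms_on_paths_to_root(term):
    """Return the term itself plus every ancestor on any path to the root."""
    result = {term}
    for parent in PARENTS[term]:
        result |= terms_on_paths_to_root(parent)
    return result


def implied_annotations(direct):
    """Expand {gene: set_of_terms} so each gene also carries all ancestor terms."""
    expanded = {}
    for gene, terms in direct.items():
        all_terms = set()
        for term in terms:
            all_terms |= terms_on_paths_to_root(term)
        expanded[gene] = all_terms
    return expanded


# "geneA" is a hypothetical gene product annotated directly to one term.
direct = {"geneA": {"nuclear chromosome"}}
print(sorted(implied_annotations(direct)["geneA"]))
```

If any of the implied terms would be false for a gene product, the term is placed incorrectly in the graph and the structure needs to be revised.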

What's in a GO term?
term: gluconeogenesis
id: GO:
definition: The formation of glucose from noncarbohydrate precursors, such as pyruvate, amino acids and glycerol.
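A term record of this shape can be held in a small data structure. The following is a sketch only: the field labels follow the slide rather than any official file format, and `EX:0001` is a placeholder identifier because the real ID is not preserved in the transcript.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    name: str
    term_id: str
    definition: str


def parse_term(text):
    """Parse 'key: value' lines into a Term; keys follow the slide's labels."""
    fields = {}
    for line in text.strip().splitlines():
        # Split on the first colon only, so identifiers may contain colons.
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return Term(fields["term"], fields["id"], fields["definition"])


# "EX:0001" is a hypothetical identifier used only for this example.
RECORD = """
term: gluconeogenesis
id: EX:0001
definition: The formation of glucose from noncarbohydrate precursors, such as pyruvate, amino acids and glycerol.
"""

term = parse_term(RECORD)
print(term.term_id, term.name)
```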

No GO Areas
GO covers 'normal' functions and processes
- No pathological processes
- No experimental conditions
- No evolutionary relationships
- No gene products
- Not a system of nomenclature

Annotation of gene products with GO terms
Example: Mitochondrial P450

- Cellular component: mitochondrial inner membrane (GO: )
- Biological process: electron transport (GO: )
- Molecular function: monooxygenase activity (GO: )
[Figure labels: substrate + O2 = CO2 + H2O; product]

Other gene products annotated to monooxygenase activity (GO: )
- monooxygenase, DBH-like 1 (mouse)
- prostaglandin I2 (prostacyclin) synthase (mouse)
- flavin-containing monooxygenase (yeast)
- ferulate-5-hydroxylase 1 (Arabidopsis)

Two types of GO Annotations:
- Electronic Annotation
- Manual Annotation
All annotations must:
- be attributed to a source
- indicate what evidence was found to support the GO term-gene/protein association

Evidence codes

IEA   Inferred from Electronic Annotation
ISS   Inferred from Sequence Similarity
IEP   Inferred from Expression Pattern
IMP   Inferred from Mutant Phenotype
IGI   Inferred from Genetic Interaction
IPI   Inferred from Physical Interaction
IDA   Inferred from Direct Assay
RCA   Inferred from Reviewed Computational Analysis
TAS   Traceable Author Statement
NAS   Non-traceable Author Statement
IC    Inferred by Curator
ND    No biological Data available
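The two requirements on annotations (a source and an evidence code) can be enforced when a record is created. The sketch below is one possible way to model that. The `Annotation` class, the example evidence code chosen for the P450 record, and the `PMID:placeholder` source are all hypothetical. Treating IEA as the only electronic code is likewise an assumption of this example.

```python
from dataclasses import dataclass

EVIDENCE_CODES = {
    "IEA": "Inferred from Electronic Annotation",
    "ISS": "Inferred from Sequence Similarity",
    "IEP": "Inferred from Expression Pattern",
    "IMP": "Inferred from Mutant Phenotype",
    "IGI": "Inferred from Genetic Interaction",
    "IPI": "Inferred from Physical Interaction",
    "IDA": "Inferred from Direct Assay",
    "RCA": "Inferred from Reviewed Computational Analysis",
    "TAS": "Traceable Author Statement",
    "NAS": "Non-traceable Author Statement",
    "IC": "Inferred by Curator",
    "ND": "No biological Data available",
}


@dataclass(frozen=True)
class Annotation:
    gene_product: str
    term: str
    evidence: str
    source: str

    def __post_init__(self):
        # Reject records that lack supporting evidence or attribution.
        if self.evidence not in EVIDENCE_CODES:
            raise ValueError(f"unknown evidence code: {self.evidence}")
        if not self.source:
            raise ValueError("every annotation must be attributed to a source")

    @property
    def is_electronic(self):
        # Assumption for this sketch: only IEA marks an electronic annotation.
        return self.evidence == "IEA"


# Hypothetical record; the evidence code and source are placeholders.
example = Annotation(
    gene_product="mitochondrial P450",
    term="monooxygenase activity",
    evidence="IDA",
    source="PMID:placeholder",
)
print(example.term, "-", EVIDENCE_CODES[example.evidence])
```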

Ensuring Stability in a Dynamic Ontology
- Terms become obsolete when they are removed or redefined
- GO IDs are never deleted
- For each obsolete term, a comment is added to explain why the term is now obsolete
[Figure: the three ontologies (Biological Process, Molecular Function, Cellular Component), each with a matching Obsolete branch]
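The obsoletion policy can be illustrated with a small registry in which identifiers are flagged rather than removed. This is a sketch of the idea only: the `TermRegistry` class, the `EX:` identifiers, and the term names are hypothetical and do not reflect how the GO files are actually maintained.

```python
class TermRegistry:
    """Toy registry where term IDs are marked obsolete but never deleted."""

    def __init__(self):
        self._terms = {}

    def add(self, term_id, name):
        if term_id in self._terms:
            raise ValueError(f"{term_id} already exists")
        self._terms[term_id] = {"name": name, "obsolete": False, "comment": ""}

    def make_obsolete(self, term_id, comment):
        # A comment explaining the obsoletion is mandatory.
        if not comment:
            raise ValueError("an obsolete term needs an explanatory comment")
        entry = self._terms[term_id]
        entry["obsolete"] = True
        entry["comment"] = comment

    def lookup(self, term_id):
        # Obsolete IDs still resolve, so old annotations remain traceable.
        return self._terms[term_id]

    def current_terms(self):
        return sorted(
            term_id
            for term_id, entry in self._terms.items()
            if not entry["obsolete"]
        )


registry = TermRegistry()
registry.add("EX:0001", "example term A")  # hypothetical IDs and names
registry.add("EX:0002", "example term B")
registry.make_obsolete("EX:0002", "Redefined; meaning was ambiguous.")
print(registry.current_terms())
print(registry.lookup("EX:0002"))
```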

Why modify the GO?
- GO reflects current knowledge of biology
- Adding new organisms can make existing term arrangements incorrect
- Not everything was perfect from the outset

What can scientists do with GO?
- Access gene product functional information
- Find how much of a proteome is involved in a process/function/component in the cell
- Map GO terms and incorporate manual annotations into their own databases
- Provide a link between biological knowledge and:
  - gene expression profiles
  - proteomics data
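As a minimal sketch of the "how much of a proteome" question, the snippet below computes the fraction of gene products annotated to a given term. The gene names and annotations are made up for illustration, and only direct annotations are counted (no propagation to parent terms).

```python
# Hypothetical annotations: gene product -> set of biological process terms.
ANNOTATIONS = {
    "gene1": {"electron transport"},
    "gene2": {"electron transport", "mitosis"},
    "gene3": {"purine metabolism"},
    "gene4": set(),  # no annotation available yet
}


def fraction_annotated(term, annotations=ANNOTATIONS):
    """Fraction of all gene products directly annotated to `term`."""
    if not annotations:
        return 0.0
    hits = sum(1 for terms in annotations.values() if term in terms)
    return hits / len(annotations)


for process in ["electron transport", "mitosis", "purine metabolism"]:
    print(f"{process}: {fraction_annotated(process):.0%}")
```

The same counts, computed for a gene list from an expression experiment and compared with the whole genome, are the starting point for linking GO to expression profiles.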

Whole genome analysis (J. D. Munkvold et al., 2004) Microarray analysis

Beyond GO: Open Biomedical Ontologies
- Orthogonal to existing ontologies to facilitate combinatorial approaches
- Share unique identifier space
- Include definitions
Examples: Anatomies, Cell Types, Sequence Attributes, Temporal Attributes, Phenotypes, Diseases, and more.