Published by Ernest Simpson. Modified over 9 years ago.
1
GENE ONTOLOGY FOR THE NEWBIES
Suparna Mundodi, PhD
The Arabidopsis Information Resource, Stanford, CA
2
The Gene Ontologies
A Common Language for Annotation of Genes from Yeast, Flies and Mice
…and Plants and Worms
…and Humans
…and anything else!
3
Outline of Topics
- Introduction to the Gene Ontologies (GO)
- Annotations to GO terms
- GO Tools
- Applications of GO
4
Gene Ontology
- Gene annotation system
- Controlled vocabulary that can be applied to all organisms
- Used to describe gene products
5
What's in a name?
What is a cell?
6
Cell
7
Cell
8
Cell
9
Cell
10
Cell
Image from http://microscopy.fsu.edu
11
Bud initiation?
12
= bud initiation sensu Metazoa
= bud initiation sensu Saccharomyces
= bud initiation sensu Viridiplantae
13
What's in a name?
- The same name can be used to describe different concepts
14
What’s in a name?
15
- Glucose synthesis
- Glucose biosynthesis
- Glucose formation
- Glucose anabolism
- Gluconeogenesis
All refer to the process of making glucose from simpler components
16
What's in a name?
- The same name can be used to describe different concepts
- A concept can be described using different names
Comparison is difficult, in particular across species or across databases
17
What is the Gene Ontology?
A (part of the) solution:
- A controlled vocabulary that can be applied to all organisms
- Used to describe gene products (proteins and RNA) in any organism
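A controlled vocabulary can be pictured as a lookup from free-text names to one stable identifier. The sketch below is only an illustration: the names are the glucose synonyms listed on the earlier slide and GO:0006094 is the identifier this deck gives for gluconeogenesis, while the table `SYNONYMS` and the function `normalize` are hypothetical helpers, not part of any GO tool.

```python
from typing import Optional

# Hypothetical lookup table: every name from the slides maps to the one
# GO identifier the deck gives for gluconeogenesis.
SYNONYMS = {
    "glucose synthesis": "GO:0006094",
    "glucose biosynthesis": "GO:0006094",
    "glucose formation": "GO:0006094",
    "glucose anabolism": "GO:0006094",
    "gluconeogenesis": "GO:0006094",
}


def normalize(label: str) -> Optional[str]:
    """Return the GO ID for a free-text label, or None if it is unknown."""
    # Lower-case and collapse whitespace so trivial spelling variants match.
    key = " ".join(label.lower().split())
    return SYNONYMS.get(key)


if __name__ == "__main__":
    for name in ("Glucose anabolism", "Gluconeogenesis", "bud initiation"):
        print(name, "->", normalize(name))
```

Two databases that store the identifier instead of the free-text name can then be compared directly, whichever synonym their curators happened to prefer.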
18
How does GO work?
What information might we want to capture about a gene product?
- What does the gene product do?
- Why does it perform these activities?
- Where does it act?
19
The 3 Gene Ontologies
- Molecular Function = elemental activity/task: the tasks performed by individual gene products; examples are carbohydrate binding and ATPase activity
- Biological Process = biological goal or objective: broad biological goals, such as mitosis or purine metabolism, that are accomplished by ordered assemblies of molecular functions
- Cellular Component = location or complex: subcellular structures, locations, and macromolecular complexes; examples include nucleus, telomere, and RNA polymerase II holoenzyme
20
Example: Gene Product = hammer
Function (what)            Process (why)
Drive nail (into wood)     Carpentry
Drive stake (into soil)    Gardening
Smash roach                Pest Control
Clown's juggling object    Entertainment
21
Ontology Structure
Ontologies can be represented as graphs, where the nodes are connected by edges
- Nodes = concepts in the ontology
- Edges = relationships between the concepts
[Figure labels: node, edge]
22
- The Gene Ontology is structured as a hierarchical directed acyclic graph (DAG)
- Terms can have more than one parent and zero, one or more children
- Terms are linked by two relationships:
  - is-a
  - part-of
23
Directed Acyclic Graphs (DAG)
[Figure: example DAG with is-a and part-of edges linking protein complex, organelle, mitochondrion, fatty acid beta-oxidation multienzyme complex, [other protein complexes] and [other organelles]]
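A minimal sketch of such a graph in Python, using the term names from this slide. The slide text does not preserve which edge connects which pair of terms, so the edges in `PARENTS` are an assumption made for illustration, and `ancestors` is a hypothetical helper name.

```python
from typing import Dict, List, Set, Tuple

# child term -> list of (relationship, parent term).
# The edges are assumed for illustration; only the term names and the two
# relationship types come from the slide.
PARENTS: Dict[str, List[Tuple[str, str]]] = {
    "organelle": [],
    "protein complex": [],
    "mitochondrion": [("is-a", "organelle")],
    "fatty acid beta-oxidation multienzyme complex": [
        ("is-a", "protein complex"),
        ("part-of", "mitochondrion"),
    ],
}


def ancestors(term: str) -> Set[str]:
    """Collect every term reachable by following parent edges upwards."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relationship, parent in PARENTS.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


if __name__ == "__main__":
    print(sorted(ancestors("fatty acid beta-oxidation multienzyme complex")))
```

Because a term may list several parents, the structure is a graph and not a tree: the complex reaches `organelle` through its part-of edge and `protein complex` through its is-a edge.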
24
Parent-Child Relationships
A child is a subset of a parent's elements
The cell component term Nucleus has 5 children:
- Nucleoplasm
- Nuclear envelope
- Chromosome
- Perinuclear space
- Nucleolus
25
True Path Rule
The path from a child term all the way up to its top-level parent(s) must always be true
[Figure: cell, cytoplasm, chromosome, nuclear chromosome, cytoplasmic chromosome, mitochondrial chromosome and nucleus linked by is-a and part-of edges]
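One way to see what the rule forbids is to compare two candidate arrangements of the terms on this slide. The sketch below is an illustration under assumptions: the exact edges of the slide's figure are not recoverable from the text, so both `BEFORE` and `AFTER` are hypothetical hierarchies, and relationship types are omitted for brevity.

```python
from typing import Dict, List, Set


def ancestors(parents: Dict[str, List[str]], term: str) -> Set[str]:
    """Collect every term reachable by following parent edges upwards."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Hypothetical arrangement that breaks the rule: placing "chromosome" under
# "nucleus" implies that every chromosome is in the nucleus.
BEFORE = {
    "nucleus": ["cell"],
    "chromosome": ["nucleus"],
    "nuclear chromosome": ["chromosome"],
    "mitochondrial chromosome": ["chromosome"],
}

# Hypothetical arrangement that keeps every upward path true.
AFTER = {
    "nucleus": ["cell"],
    "cytoplasm": ["cell"],
    "chromosome": ["cell"],
    "nuclear chromosome": ["chromosome", "nucleus"],
    "cytoplasmic chromosome": ["chromosome", "cytoplasm"],
    "mitochondrial chromosome": ["cytoplasmic chromosome"],
}

if __name__ == "__main__":
    term = "mitochondrial chromosome"
    print("before:", "nucleus" in ancestors(BEFORE, term))  # untrue path exists
    print("after: ", "nucleus" in ancestors(AFTER, term))   # path removed
```

In the first arrangement a gene product annotated to mitochondrial chromosome would also be implied to sit in the nucleus, which is false; the second arrangement removes that path while still placing nuclear chromosome under nucleus.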
26
What's in a GO term?
term: gluconeogenesis
id: GO:0006094
definition: The formation of glucose from noncarbohydrate precursors, such as pyruvate, amino acids and glycerol.
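The three fields on this slide map naturally onto a small record type. The following is a sketch, not the data model of any GO software: the class name `GOTerm` and its validation are assumptions, relying only on the fact that GO identifiers have the form `GO:` followed by seven digits.

```python
import re
from dataclasses import dataclass

# GO identifiers look like "GO:" followed by seven digits.
GO_ID_PATTERN = re.compile(r"^GO:\d{7}$")


@dataclass(frozen=True)
class GOTerm:
    """Hypothetical record holding the three fields shown on the slide."""

    go_id: str
    name: str
    definition: str

    def __post_init__(self) -> None:
        # Reject identifiers that do not match the GO:0000000 shape.
        if not GO_ID_PATTERN.match(self.go_id):
            raise ValueError(f"not a GO identifier: {self.go_id!r}")


if __name__ == "__main__":
    term = GOTerm(
        go_id="GO:0006094",
        name="gluconeogenesis",
        definition=(
            "The formation of glucose from noncarbohydrate precursors, "
            "such as pyruvate, amino acids and glycerol."
        ),
    )
    print(term.go_id, term.name)
```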
27
No GO Areas
- GO covers 'normal' functions and processes
- No pathological processes
- No experimental conditions
- No evolutionary relationships
- No gene products
- Not a system of nomenclature
28
Annotation of gene products with GO terms
Mitochondrial P450
29
Cellular component: mitochondrial inner membrane (GO:0005743)
Biological process: electron transport (GO:0006118)
Molecular function: monooxygenase activity (GO:0004497)
substrate + O2 = CO2 + H2O product
30
Other gene products annotated to monooxygenase activity (GO:0004497)
- monooxygenase, DBH-like 1 (mouse)
- prostaglandin I2 (prostacyclin) synthase (mouse)
- flavin-containing monooxygenase (yeast)
- ferulate-5-hydroxylase 1 (Arabidopsis)
31
Two types of GO Annotations:
- Electronic Annotation
- Manual Annotation
All annotations must:
- be attributed to a source
- indicate what evidence was found to support the GO term-gene/protein association
32
IEA    Inferred from Electronic Annotation
ISS    Inferred from Sequence Similarity
IEP    Inferred from Expression Pattern
IMP    Inferred from Mutant Phenotype
IGI    Inferred from Genetic Interaction
IPI    Inferred from Physical Interaction
IDA    Inferred from Direct Assay
RCA    Inferred from Reviewed Computational Analysis
TAS    Traceable Author Statement
NAS    Non-traceable Author Statement
IC     Inferred by Curator
ND     No biological Data available
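The two requirements for every annotation (a source and an evidence code) can be enforced when the record is created. The sketch below assumes a hypothetical `Annotation` class; the evidence codes are the ones listed on this slide, the gene product and GO ID come from the P450 example, and the evidence code and source string in the usage example are placeholders, not real curation data.

```python
from dataclasses import dataclass

# Evidence codes as listed on the slide.
EVIDENCE_CODES = {
    "IEA", "ISS", "IEP", "IMP", "IGI", "IPI",
    "IDA", "RCA", "TAS", "NAS", "IC", "ND",
}


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record linking a gene product to a GO term."""

    gene_product: str
    go_id: str
    evidence: str
    source: str

    def __post_init__(self) -> None:
        # Every annotation must say what evidence supports it ...
        if self.evidence not in EVIDENCE_CODES:
            raise ValueError(f"unknown evidence code: {self.evidence!r}")
        # ... and must be attributed to a source.
        if not self.source.strip():
            raise ValueError("an annotation must be attributed to a source")

    @property
    def is_electronic(self) -> bool:
        """True for electronic annotations (IEA), False for all other codes."""
        return self.evidence == "IEA"


if __name__ == "__main__":
    # Placeholder evidence code and source, for illustration only.
    example = Annotation(
        gene_product="Mitochondrial P450",
        go_id="GO:0004497",
        evidence="IDA",
        source="hypothetical reference",
    )
    print(example.is_electronic)
```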
33
Ensuring Stability in a Dynamic Ontology
- Terms become obsolete when they are removed or redefined
- GO IDs are never deleted
- For each term, a comment is added to explain why the term is now obsolete
[Figure: Biological Process, Molecular Function and Cellular Component, each with a matching Obsolete branch]
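The policy on this slide amounts to flagging a term instead of deleting it. A minimal sketch, assuming a hypothetical in-memory store keyed by GO ID; the `Term` class, `mark_obsolete` and the identifier `GO:9999999` are invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Term:
    go_id: str
    name: str
    obsolete: bool = False
    comment: str = ""


def mark_obsolete(terms: Dict[str, Term], go_id: str, reason: str) -> None:
    """Flag a term as obsolete; the ID stays in the store and gains a comment."""
    if not reason.strip():
        raise ValueError("an obsolete term needs a comment explaining why")
    term = terms[go_id]
    term.obsolete = True
    term.comment = reason


if __name__ == "__main__":
    # GO:9999999 is a made-up placeholder, not a real GO term.
    store = {"GO:9999999": Term("GO:9999999", "placeholder term")}
    mark_obsolete(store, "GO:9999999", "Redefined; see replacement terms.")
    print("GO:9999999" in store, store["GO:9999999"].obsolete)
```

Older annotations that still point at the ID can therefore always be resolved, and the comment tells the reader why the term was retired.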
34
Why modify the GO?
- GO reflects current knowledge of biology
- Adding new organisms can make existing term arrangements incorrect
- Not everything was perfect from the outset
35
What can scientists do with GO?
- Access gene product functional information
- Find how much of a proteome is involved in a process/function/component in the cell
- Map GO terms and incorporate manual annotations into own databases
- Provide a link between biological knowledge and …
  - gene expression profiles
  - proteomics data
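The "how much of a proteome" question reduces to counting annotated gene products. The sketch below uses invented toy data: the gene product names are placeholders, the function `fraction_annotated` is a hypothetical helper, and it assumes each gene product's term set already includes any parent terms implied by the ontology.

```python
from typing import Dict, Set


def fraction_annotated(annotations: Dict[str, Set[str]], go_id: str) -> float:
    """Fraction of gene products whose annotation set contains go_id."""
    if not annotations:
        return 0.0
    hits = sum(1 for terms in annotations.values() if go_id in terms)
    return hits / len(annotations)


if __name__ == "__main__":
    # Toy proteome with placeholder gene product names; the GO IDs are the
    # ones used in the P450 example of this deck.
    toy_proteome = {
        "protein_A": {"GO:0006118", "GO:0005743"},
        "protein_B": {"GO:0006118"},
        "protein_C": {"GO:0004497"},
        "protein_D": set(),
    }
    print(fraction_annotated(toy_proteome, "GO:0006118"))
```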
36
Whole genome analysis (J. D. Munkvold et al., 2004) Microarray analysis
37
http://www.geneontology.org/GO.tools
38
Beyond GO: Open Biomedical Ontologies
- Orthogonal to existing ontologies to facilitate combinatorial approaches
- Share unique identifier space
- Include definitions
Anatomies, Cell Types, Sequence Attributes, Temporal Attributes, Phenotypes, Diseases, more…
http://obo.sourceforge.net