2/3/2005
Gene Ontology (GO)
The Gene Ontology (GO) project is a collaborative effort to address the need for consistent descriptions of gene products in different databases. The GO collaborators are developing three structured, controlled vocabularies (ontologies) that describe gene products in terms of their associated biological processes, cellular components and molecular functions in a species-independent manner.
Molecular Function
The biochemical activity or action of the gene product. A molecular function describes a capability that the gene product has, with no reference to where or when that activity actually occurs.
Examples: enzyme, transporter, ligand
Cytochrome c: electron transporter activity
Biological Process
A biological objective to which the gene product contributes. A biological process is accomplished via one or more ordered assemblies of molecular functions. There is generally some temporal aspect to the process, and it often involves the transformation of some physical thing.
Examples: cell growth and maintenance
Cytochrome c: oxidative phosphorylation, induction of cell death
Cellular Component
A component of a cell that is part of some larger object or structure.
Examples: chromosome, nucleus, ribosome
Cytochrome c: mitochondrial matrix, mitochondrial inner membrane
Structure of GO terms
GO: : Gene Ontology
GO: : biological process
GO: : death
GO: : cytolysis
GO: : programmed cell death
GO: : apoptosis
GO: : hypersensitive response
GO: : cellular component
(S.G. Lee, 2004)
GO parent-child relationship
“Child” terms are more specific than “parent” terms.
The relationship is either “is-a” or “part-of”.
A mitotic chromosome is-a chromosome.
A telomere is part-of a chromosome.
Induced Graph: Directed Acyclic Graphs (DAG)
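The parent-child relationships on these slides can be sketched as a small directed acyclic graph. The Python example below is only an illustration, not part of GO or of any GO software: `PARENTS` and `induced_graph` are hypothetical names, the two chromosome edges come from the parent-child slide, and the remaining edges are assumed for the sake of the example. It also assumes that the induced graph of a term means the term together with all of its ancestors.

```python
# Hypothetical toy fragment of GO: child term -> list of (relation, parent term).
PARENTS = {
    "mitotic chromosome": [("is-a", "chromosome")],   # from the slide
    "telomere": [("part-of", "chromosome")],          # from the slide
    # The edges below are assumed for illustration only.
    "nuclear chromosome": [("is-a", "chromosome"), ("part-of", "nucleus")],
    "chromosome": [("is-a", "cellular component")],
    "nucleus": [("is-a", "cellular component")],
    "cellular component": [],
}


def induced_graph(term, parents=PARENTS):
    """Return (nodes, edges) reachable from `term` by following parent links."""
    nodes = set()
    edges = set()
    stack = [term]
    while stack:
        node = stack.pop()
        if node in nodes:
            continue  # a term with several children is visited only once
        nodes.add(node)
        for relation, parent in parents.get(node, []):
            edges.add((node, relation, parent))
            stack.append(parent)
    return nodes, edges


if __name__ == "__main__":
    nodes, edges = induced_graph("nuclear chromosome")
    print(sorted(nodes))
    print(sorted(edges))
```

Because "nuclear chromosome" has two parents in this toy table, the result is a DAG rather than a tree: both paths meet again at "cellular component".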
Some remarks
Genes to GO terms: many to many.
GO itself has no reference to genes.
GO specifies terminology and the relationships between terms.
GO’s real power comes from the annotation of genes at different GO terms.
Each annotation is supported by an evidence code, e.g. TAS (traceable author statement), IEP (inferred from expression pattern), ISS (inferred from sequence or structural similarity).
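As a rough illustration of the many-to-many relationship, the sketch below stores annotations as (gene, term, evidence code) records and indexes them in both directions. The gene names, term names and the `index_annotations` helper are all hypothetical placeholders; real annotations would come from an annotation database.

```python
from collections import defaultdict

# Hypothetical annotation records: (gene, GO term, evidence code).
ANNOTATIONS = [
    ("geneA", "term1", "TAS"),
    ("geneA", "term2", "ISS"),
    ("geneB", "term1", "IEP"),
]


def index_annotations(records):
    """Build gene -> terms and term -> genes lookups from annotation records."""
    gene_to_terms = defaultdict(set)
    term_to_genes = defaultdict(set)
    for gene, term, _evidence in records:
        gene_to_terms[gene].add(term)
        term_to_genes[term].add(gene)
    return dict(gene_to_terms), dict(term_to_genes)


if __name__ == "__main__":
    gene_to_terms, term_to_genes = index_annotations(ANNOTATIONS)
    print(gene_to_terms)   # one gene can carry several terms
    print(term_to_genes)   # one term can annotate several genes
```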
Hypergeometric distribution
N balls: m white, N-m black. Draw k balls from the N without replacement; X of the k balls are white.
$$P(X = i) = \frac{\binom{m}{i}\binom{N-m}{k-i}}{\binom{N}{k}}$$
Are there any GO terms over-represented in our list?
$$\text{P-value} = P(X \ge x) = \sum_{i=x}^{\min(k,m)} \frac{\binom{m}{i}\binom{N-m}{k-i}}{\binom{N}{k}}$$
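The hypergeometric calculation on this slide can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the GOstats implementation. The function names are hypothetical. It assumes the usual reading of the symbols (N genes in total, m of them annotated at the GO term, k genes in our list, x of those annotated at the term) and assumes that the p-value is the upper tail P(X >= x).

```python
from math import comb


def hypergeom_pmf(i, N, m, k):
    """P(X = i): probability of i white balls when drawing k from N (m white)."""
    # Outside this range the event is impossible.
    if i < max(0, k - (N - m)) or i > min(k, m):
        return 0.0
    return comb(m, i) * comb(N - m, k - i) / comb(N, k)


def enrichment_p_value(x, N, m, k):
    """Upper-tail probability P(X >= x), used here as the over-representation p-value."""
    return sum(hypergeom_pmf(i, N, m, k) for i in range(x, min(k, m) + 1))


if __name__ == "__main__":
    # Toy numbers: 10 balls, 4 white, draw 3, observe 2 white.
    print(hypergeom_pmf(1, N=10, m=4, k=3))       # 4 * 15 / 120 = 0.5
    print(enrichment_p_value(2, N=10, m=4, k=3))  # (36 + 4) / 120, about 0.333
```

A small p-value means that seeing x or more annotated genes in a list of size k would be unlikely if the list had been drawn at random from the N genes.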
library("GOstats")
library(hgu95av2)
GOHyperG(myLL, lib="hgu95av2", what="MF")

The call returns a list whose components include $pvalues, $goCounts, $numLL and $numInt; here $numLL is 6309.
Gene: zinc finger protein 261 [Homo sapiens]
Chromosomal location: “Xq13.1”
PubMed, PMID: “ ” “ ” “ ”
Gene symbol: “ZNF261”
GenBank accession #: “X95808”
LocusLink, LocusID: “9203”
Affymetrix identifier, HGU95A chips: “41046_s_at”
GO: "GO: " "GO: " "GO: "
Ensembl project
GenMAPP: more biology
GenMAPP provides users with a tool for visualizing gene expression data along pathways (called MAPPs), creating new pathways, and identifying global biological associations within an expression dataset.