Published by Kellie Scott. Modified over 7 years ago.
1
Introduction to the Gene Ontology and GO Annotation Resources
EBI Bioinformatics Roadshow, 15 March, Düsseldorf, Germany. Rebecca Foulger
2
OUTLINE OF TUTORIAL
PART I: Ontologies and the Gene Ontology (GO)
PART II: GO Annotations
How to access GO annotations
How scientists use GO annotations
GO and GO Annotation, EBI Bioinformatics Roadshow. Düsseldorf. March 2011
3
PART I: Gene Ontology
4
What’s in a name...?
5
Q: What is a cell?
A: It really depends who you ask!
6
Different things can be described by the same name
7
The same thing can be described by different names:
Glucose synthesis
Glucose biosynthesis
Glucose formation
Glucose anabolism
Gluconeogenesis
All refer to the process of making glucose from simpler components.
8
Inconsistency in naming of biological concepts
Same name for different concepts
Different names for the same concept
Comparison is difficult, in particular across species or across databases
Just one reason why the Gene Ontology (GO) is needed…
English is not a very precise language. This makes it difficult to carry out comparisons across species or databases.
9
Why do we need GO?
Inconsistency in naming of biological concepts
Increasing amounts of biological data available
Large datasets need to be interpreted quickly
Large datasets need to be interpreted quickly: we need to organise data, analyse it, and share it to benefit other researchers. People’s time is expensive, so we need a quick and effective way to do this. GO is machine-readable. Increasing amounts of biological data to come.
10
Increasing amounts of biological data available
Search on mesoderm development… you get 9441 results!
Expansion of sequence information
11
What is an ontology? 1606 1700s Dictionary:
A branch of metaphysics concerned with the nature and relations of being (philosophy)
A formal representation of knowledge as a set of concepts within a domain and the relationships between those concepts (computer science)
Barry Smith: The science of what is, of the kinds and structures of objects, properties, events, processes and relations in every area of reality.
12
What is an ontology? More usefully:
An ontology is the representation of something we know about. “Ontologies” consist of a representation of things that are detectable or directly observable, and the relationships between those things.
(Figure label: is part of)
13
What is an ontology?
An ontology is more than just a list of terms (a controlled vocabulary):
A vocabulary of terms
Definitions for those terms
*** Defined logical relationships between the terms ***
Ontologies go a step beyond controlled vocabularies, to model the relationships between the terms.
14
What’s in an Ontology?
15
What is the Gene Ontology (GO)?
A way to capture biological knowledge in a written and computable form
Describes attributes of gene products (RNA and protein)
Biologists spend a lot of time and effort searching for all of the available information about each small area of research. The Gene Ontology (GO) project is a collaborative effort to address the need for consistent descriptions of gene products in different databases.
16
The scope of GO
What information might we want to capture about a gene product?
What does the gene product do?
Where does it act?
How does it act?
17
Biological Process: what does a gene product do?
A commonly recognised series of events
transcription
cell division
18
Cellular Component: where is a gene product located?
plasma membrane
mitochondrion
mitochondrial membrane
mitochondrial matrix
mitochondrial lumen
ribosome
large ribosomal subunit
small ribosomal subunit
19
Molecular Function: how does a gene product act?
An elemental activity or task or job
insulin binding
insulin receptor activity
glucose-6-phosphate isomerase activity
20
Three separate ontologies or one large one?
GO was originally three completely independent hierarchies, with no relationships between them.
As of 2009, GO has started making relationships between biological process and molecular function in the live ontology.
Functions are mini-processes.
21
(Figure labels: Process, Function, Function)
It makes sense to connect the transports to the process they’re involved in.
22
GO is species independent and covers normal processes
GO is NOT:
NO pathological/disease processes
NO experimental conditions
NO evolutionary relationships
NO gene products
NOT a nomenclature system
23
Aims of the GO project
Compile the ontologies
Annotate gene products using ontology terms
Provide a public resource of data and tools
Currently 31,821 and growing daily
24
Anatomy of a GO term
Unique identifier
Term name
Definition
Synonyms
Cross-references
17800 terms in three ontologies
94% of terms defined
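The parts of a GO term listed on this slide can be pictured as a small record. The sketch below is only an illustration: the class name `GOTerm`, its field names and the placeholder identifier `GO:0000000` are assumptions made for the example, not the GO Consortium's own data model. The one format rule it checks is that a GO identifier is `GO:` followed by seven digits.

```python
import re
from dataclasses import dataclass, field

# A GO identifier is "GO:" followed by exactly seven digits.
GO_ID = re.compile(r"GO:\d{7}")


@dataclass
class GOTerm:
    """Hypothetical record holding the parts of a GO term named on the slide."""
    go_id: str
    name: str
    definition: str = ""
    synonyms: list = field(default_factory=list)
    xrefs: list = field(default_factory=list)

    def has_valid_id(self):
        # fullmatch rejects identifiers with too few or too many digits
        return GO_ID.fullmatch(self.go_id) is not None


example = GOTerm(
    go_id="GO:0000000",  # placeholder, NOT the real identifier of this term
    name="gluconeogenesis",
    definition="The process of making glucose from simpler components.",
    synonyms=["glucose synthesis", "glucose biosynthesis",
              "glucose formation", "glucose anabolism"],
)

print(example.has_valid_id())
print(GOTerm("GO:12345", "too few digits").has_valid_id())
```

The synonyms and the definition text are taken from the earlier glucose slide; everything else is illustrative.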
25
Ontology structure
GO is structured as a hierarchical directed acyclic graph (DAG)
Terms can have more than one parent and zero, one or more children
Terms are linked by relationships, which add to the meaning of the term
Nodes = terms in the ontology
Edges = relationships between the concepts
GO isn’t just a flat list of terms: nodes are connected by edges
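A minimal sketch of the DAG idea, assuming a toy set of edges: the term names (`root`, `term_b`, ...) and the helper names `build_graph` and `is_acyclic` are invented for illustration and are not real GO terms or GO software. It shows a term with two parents and a check that the graph contains no cycle.

```python
from collections import defaultdict

# Each edge is (child, relationship, parent). Toy terms, not real GO content.
EDGES = [
    ("term_b", "is_a", "root"),
    ("term_c", "is_a", "root"),
    ("term_d", "is_a", "term_b"),
    ("term_d", "part_of", "term_c"),  # term_d has a second parent
]


def build_graph(edges):
    """Return (parents, children) lookups keyed by term."""
    parents = defaultdict(list)
    children = defaultdict(list)
    for child, relationship, parent in edges:
        parents[child].append((relationship, parent))
        children[parent].append((relationship, child))
    return parents, children


def is_acyclic(edges):
    """Depth-first search towards the parents; a revisit on the current path is a cycle."""
    parents, _ = build_graph(edges)
    on_path, finished = set(), set()

    def visit(node):
        if node in finished:
            return True
        if node in on_path:
            return False
        on_path.add(node)
        ok = all(visit(parent) for _, parent in parents.get(node, []))
        on_path.discard(node)
        if ok:
            finished.add(node)
        return ok

    nodes = {n for child, _, parent in edges for n in (child, parent)}
    return all(visit(n) for n in nodes)


parents, children = build_graph(EDGES)
print(parents["term_d"])
print(is_acyclic(EDGES))
```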
26
27
Relationships between GO terms
GO currently has 4 relationship types:
is_a
part_of
regulates (including positively regulates and negatively regulates)
has_part
28
is_a
If A is a B, then A is a subtype of B
mitotic cell cycle is a cell cycle
lyase activity is a catalytic activity
Transitive relationship: can infer up the graph
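"Can infer up the graph" can be sketched as collecting every ancestor reachable through is_a links. The two child-to-parent pairs below come from the slide; the links from "cell cycle" and "catalytic activity" straight to the ontology roots are a simplification that skips intermediate terms, and the function name `ancestors` is an assumption for the example.

```python
# Child -> list of is_a parents. The first two entries are the slide's examples;
# the last two are simplified (intermediate terms omitted).
IS_A = {
    "mitotic cell cycle": ["cell cycle"],
    "lyase activity": ["catalytic activity"],
    "cell cycle": ["biological_process"],
    "catalytic activity": ["molecular_function"],
}


def ancestors(term, is_a):
    """Return every term reachable by following is_a links upwards."""
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in is_a.get(current, []):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


print(sorted(ancestors("mitotic cell cycle", IS_A)))
print(sorted(ancestors("lyase activity", IS_A)))
```

Because is_a is transitive, anything annotated to "mitotic cell cycle" can also be treated as annotated to each term in the returned set.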
29
part_of
Necessarily part of (figure: B part_of A)
Wherever B exists, it is as part of A. But A does not necessarily have B as a part.
Transitive relationship (can infer up the graph)
E.g. All replication forks are part of a chromosome. Not all chromosomes have replication forks.
is_a and part_of relationships work such that the parent term is broader (more general) than the child term.
30
regulates
One process directly affects another process or quality (figure: B regulates A)
Necessarily regulates: if both A and B are present, B always regulates A, but A may not always be regulated by B
E.g. Cell cycle checkpoints regulate the cell cycle. The cell cycle is not solely regulated by cell cycle checkpoints.
There’s no inference made about specificity. For is_a and part_of relationships, the term nearest the root (the top) is more general than its child (which is more specific). You can’t infer this from a regulates relationship.
31
has_part
Relationships are upside down compared to is_a and part_of
Necessarily has part
All nuclei have chromosomes. Not all chromosomes are part of nuclei.
There are two versions of GO for download: a full and a public version. Only the full version has has_part relationships at the moment.
32
is_a complete
For all terms in the ontology, you have to be able to reach the root through a complete path of is_a relationships: we call this being is_a complete
Important for reasoning over the ontology, and ontology development
Other rules are applied to GO to make it ontologically correct
Everything needs to be is_a something
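A check for is_a completeness could look roughly like the sketch below. It assumes a toy ontology in which "orphan term" is a made-up term with no is_a parent; the function names are hypothetical and this is not the GO editors' actual tooling.

```python
def reaches_root(term, is_a_parents, root):
    """True if some path made only of is_a links leads from term to root."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current == root:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(is_a_parents.get(current, []))
    return False


def terms_missing_is_a_path(terms, is_a_parents, root):
    """List the terms that break is_a completeness."""
    return sorted(t for t in terms if not reaches_root(t, is_a_parents, root))


# Toy data: "orphan term" is invented and has no is_a parent at all.
IS_A_PARENTS = {
    "mitotic cell cycle": ["cell cycle"],
    "cell cycle": ["biological_process"],
    "orphan term": [],
}
TERMS = ["biological_process", "cell cycle", "mitotic cell cycle", "orphan term"]

print(terms_missing_is_a_path(TERMS, IS_A_PARENTS, "biological_process"))
```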
33
True path rule
Child terms inherit the meaning of all their parent terms. If this goes wrong, you get true path violations.
34
How is GO maintained?
GO editors and annotators work with experts to remodel specific areas of the ontology:
Signaling
Kidney development
Transcription
Pathogenesis
Cell cycle
Deal with requests from the community: database curators, researchers, software developers
Some simple requests can be dealt with automatically
GO Consortium meetings for large changes
Mailing lists, conference calls, content workshops
GO is a very rapidly developing ontology. The editorial office is here on site at EBI, and there are several of us who edit the ontologies full-time, plus others at the model organism databases who also edit the ontologies. We edit and develop the ontologies in response to requests from the community, which in this case means people from the genome and protein databases who are creating GO annotations, research scientists who are concerned that we represent their fields correctly, and people developing tools using GO. These requests are now managed via a tracker system on sourceforge.net, which allows us to log them and lets others comment. Changes to the ontologies are discussed over various mailing lists, and major changes to the ontologies, as well as changes to format etc., are discussed at three-times-yearly consortium meetings.
35
Requesting changes to the ontology
Public SourceForge (SF) tracker for term-related issues
Web-based tracking system hosted at SourceForge
Public tracker item for each new request or question
36
Why modify the GO?
GO reflects current knowledge of biology
Information from new organisms can make existing terms and arrangements incorrect
Not everything perfect from the outset
Improving definitions
Adding in synonyms and extra relationships
Most of what the editorial office does is in response to annotator needs: firefighting
We don’t develop terms prospectively unless needed for annotation
37
Ensuring Stability in a Dynamic Ontology
Terms become obsolete when they are removed or redefined
GO IDs are never deleted
For each term, a comment is added to explain why the term is now obsolete
Alternative GO terms are suggested to replace an obsoleted term
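One way to picture how an analysis script might cope with obsolete terms is sketched below. The identifiers (`EX:1` and so on), the dictionary layout and the function name `resolve` are all hypothetical; the keys `is_obsolete`, `comment` and `replaced_by` are modelled on tag names used in OBO-format files, but this is not a parser for those files.

```python
# Hypothetical term table. Obsolete entries stay in the table (IDs are never
# deleted); they carry a comment and, where available, a suggested replacement.
TERMS = {
    "EX:1": {"name": "old term", "is_obsolete": True,
             "comment": "Obsoleted because it was redefined.",
             "replaced_by": "EX:2"},
    "EX:2": {"name": "current term"},
    "EX:3": {"name": "vague term", "is_obsolete": True,
             "comment": "Obsoleted because the meaning was ambiguous."},
}


def resolve(term_id, terms):
    """Follow suggested replacements until a non-obsolete term is found.

    Returns None when an obsolete term has no suggested replacement.
    """
    seen = set()
    while terms[term_id].get("is_obsolete", False):
        if term_id in seen:
            raise ValueError("replacement chain loops back on itself")
        seen.add(term_id)
        replacement = terms[term_id].get("replaced_by")
        if replacement is None:
            return None
        term_id = replacement
    return term_id


print(resolve("EX:1", TERMS))
print(resolve("EX:3", TERMS))
```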
38
Searching for GO terms
… there are more browsers available on the GO Tools page:
The latest OBO Gene Ontology file can be downloaded from:
39
Exercise: Browsing the Gene Ontology using QuickGO
Exercise 1
Try searching for ‘apoptosis’ and repeat the search using at least one other term name, to give you an idea of the type of terms in GO… mitochondrion, heart development, isomerase activity, protein kinase activity.
Autocomplete in search box.
40
PART II: GO Annotation
41
http://www.geneontology.org
(Figure labels: E. coli hub, Reactome)
The GO Consortium has grown to include many databases, including several of the world's major repositories for plant, animal and microbial genomes.
42
A GO annotation is… A statement that a gene product:
1. has a particular molecular function
Or is involved in a particular biological process
Or is located within a certain cellular component
2. as determined by a particular method
3. as described in a particular reference
Minimal information required for a GO annotation: every annotation requires a GO ID, a reference, and an evidence code (which gives you an idea of what methods were used to find out that information)
Accession | Name | GO ID | GO term name | Reference | Evidence Code
P00505 | GOT2 | GO: | Aspartate transaminase activity | PMID: | IDA
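The "minimal information" rule can be sketched as a simple completeness check. The record type `Annotation`, the helper `missing_fields` and the placeholder values `GO:0000000` and `PMID:0000000` are assumptions for illustration only; the real identifiers for this example are not shown on the slide.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    """Hypothetical record mirroring the columns of the slide's example table."""
    accession: str
    symbol: str
    go_id: str
    go_term: str
    reference: str
    evidence: str


def missing_fields(annotation):
    """Name the required parts (GO ID, reference, evidence code) that are empty."""
    required = {
        "go_id": annotation.go_id,
        "reference": annotation.reference,
        "evidence": annotation.evidence,
    }
    return [name for name, value in required.items() if not value.strip()]


complete = Annotation("P00505", "GOT2", "GO:0000000",  # placeholder GO ID
                      "aspartate transaminase activity",
                      "PMID:0000000",                  # placeholder reference
                      "IDA")
incomplete = complete._replace(reference="", evidence="")

print(missing_fields(complete))
print(missing_fields(incomplete))
```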
43
Evidence codes
IDA: enzyme assay
IPI: e.g. Y2H
BLASTs, orthology comparison, HMMs
subcategories of ISS
review papers
44
GO evidence code decision tree
45
Gene Ontology Annotation (GOA)
The GOA database at the EBI is:
The largest open-source contributor of annotations to GO
Member of the GO Consortium since 2001
Provides annotation for 321,998 species (February 2011 release)
GOA’s priority is to annotate the human proteome
GOA is responsible for human, chicken and bovine annotations in the GO Consortium
46
GOA makes annotations using two methods
Electronic
Quick way of producing large numbers of annotations
Annotations are less detailed
Manual
Time-consuming process producing lower numbers of annotations
Annotations are very detailed and accurate
47
Electronic annotation by GOA
1. Mapping of external concepts to GO terms
InterPro2GO (protein domains)
SPKW2GO (UniProt/Swiss-Prot keywords)
HAMAP2GO (Microbial protein annotation)
EC2GO (Enzyme Commission numbers)
SPSL2GO (Swiss-Prot subcellular locations)
2. Automatic transfer of annotations to orthologs (Ensembl Compara)
Macaque, Chimpanzee, Guinea Pig, Rat, Mouse, Cow, Dog, Chicken
Annotations are high-quality and have an explanation of the method.
48
Mappings of concepts from UniProtKB files
A manually produced translation table is run over UniProt entries
Currently 13,608 external identifiers have been mapped to GO, producing 51,155,464 annotations
Examples:
Aspartate transaminase activity ; GO:
lipid transport ; GO:0006869
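Running a translation table over database entries can be sketched as a dictionary lookup. The table name `KEYWORD2GO`, the function name and the accessions `X00001` and `X00002` are invented for the example; the lipid transport pairing is the one shown on the slide (with the identifier written out as seven digits), and tagging the output with the electronic evidence code IEA is an assumption of this sketch.

```python
# Hypothetical one-entry translation table: external keyword -> GO term(s).
KEYWORD2GO = {
    "Lipid transport": [("GO:0006869", "lipid transport")],
}


def electronic_annotations(entries, table, evidence="IEA"):
    """entries maps an accession to its keywords; returns annotation tuples."""
    annotations = []
    for accession, keywords in entries.items():
        for keyword in keywords:
            # Keywords without a mapping simply produce no annotation.
            for go_id, go_name in table.get(keyword, []):
                annotations.append((accession, go_id, go_name, evidence))
    return annotations


# Made-up entries; "Unmapped keyword" has no row in the table.
ENTRIES = {
    "X00001": ["Lipid transport", "Unmapped keyword"],
    "X00002": ["Unmapped keyword"],
}

for row in electronic_annotations(ENTRIES, KEYWORD2GO):
    print(row)
```

The design point is that one curated mapping row is reused for every entry carrying that keyword, which is how a few thousand mapped identifiers can yield millions of annotations.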
49
Automatic transfer of annotations to orthologs
Ensembl COMPARA
Homologies between different species calculated
GO terms projected from MANUAL annotation only (IDA, IEP, IGI, IMP, IPI)
One-to-one orthologies used
Currently provides 479,961 GO annotations for 60,515 proteins from 49 species (February 2011 release)
(Figure labels: Human, Mouse, Rat, Zebrafish, Xenopus, Macaque, Chimpanzee, Guinea Pig, Tetraodon, Cow, Dog, Chicken, Fugu)
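The two filters on this slide (only the listed evidence codes, only one-to-one orthologies) can be sketched as follows. The protein names, the `GO:placeholder` identifiers and the function names are invented, and labelling the projected annotations IEA is an assumption of this sketch, not something stated on the slide.

```python
from collections import Counter

# Evidence codes named on the slide as the source for projection.
PROJECTABLE_CODES = {"IDA", "IEP", "IGI", "IMP", "IPI"}


def one_to_one(pairs):
    """Keep only (source, target) pairs where each side occurs exactly once."""
    sources = Counter(source for source, _ in pairs)
    targets = Counter(target for _, target in pairs)
    return {s: t for s, t in pairs if sources[s] == 1 and targets[t] == 1}


def project(annotations, pairs, projected_code="IEA"):
    """Copy qualifying annotations from a protein to its one-to-one ortholog."""
    mapping = one_to_one(pairs)
    return [
        (mapping[protein], go_id, projected_code)
        for protein, go_id, code in annotations
        if code in PROJECTABLE_CODES and protein in mapping
    ]


# Made-up orthology pairs: human_B has two mouse orthologs, so it is skipped.
PAIRS = [("human_A", "mouse_A"), ("human_B", "mouse_B1"), ("human_B", "mouse_B2")]
ANNOTATIONS = [
    ("human_A", "GO:placeholder1", "IDA"),  # projected
    ("human_A", "GO:placeholder2", "ISS"),  # wrong evidence code, not projected
    ("human_B", "GO:placeholder3", "IMP"),  # not one-to-one, not projected
]

print(project(ANNOTATIONS, PAIRS))
```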
50
Manual annotation by GOA
High-quality, specific annotations using:
Peer-reviewed papers
A range of evidence codes to categorize the types of evidence found in a paper
51
Finding annotations in a paper
…for B. napus PERK1 protein (Q9ARH1)
“In this study, we report the isolation and molecular characterization of the B. napus PERK1 cDNA, that is predicted to encode a novel receptor-like kinase. We have shown that like other plant RLKs, the kinase domain of PERK1 has serine/threonine kinase activity. In addition, the location of a PERK1-GFP fusion protein to the plasma membrane supports the prediction that PERK1 is an integral membrane protein… these kinases have been implicated in early stages of wound response…”
PubMed ID:
Highlighted phrases: serine/threonine kinase activity; integral membrane protein; wound response
A piece of text from which GO terms can be pulled out. Generally use the abstract as a summary, and go to the methods section of a paper to pull out the associations.
Function: protein serine/threonine kinase activity GO:
Component: integral to plasma membrane GO:
Process: response to wounding GO:
52
Additional information
Qualifiers
Modify the interpretation of an annotation
NOT (protein is not associated with the GO term)
colocalizes_with (protein associates with complex but is not a bona fide member)
contributes_to (describes action of a complex of proteins)
'With' column
Can include further information on the method being referenced, e.g. the protein accession of an interacting protein
53
The NOT qualifier
NOT is used to make an explicit note that the gene product is not associated with the GO term
Also used to document conflicting claims in the literature
NOT can be used with ALL three gene ontologies
E.g. if the authors show that a protein is NOT found in the nucleolus.
54
“In these cells, SIPP1 was mainly present in the nucleus, where it displayed a non-uniform, speckled distribution and appeared to be excluded from the nucleoli.”
Highlighted phrase: excluded from the nucleoli
55
The colocalizes_with qualifier
Gene products that are transiently or peripherally associated with an organelle or complex
ONLY used with GO component ontology
56
“Immunoblot analysis with anti-PSI polyclonal antibodies of U1 snRNP particles affinity purified from Drosophila embryonic nuclear extracts showed that PSI is physically associated with U1 snRNP (Figure 1A, top panel). Association of U1 snRNP with GST-PSI was detected by ethidium bromide staining of the selected snRNAs and was confirmed by blot hybridization with an antisense U1 snRNA riboprobe (Figure 1C, lane 4).”
Highlighted phrases: PSI is physically associated with U1 snRNP; Association of U1 snRNP with GST-PSI
57
The reason for the association becomes clear as the authors show that the protein interacts with one of the U1 snRNP complex members.
58
The contributes_to qualifier
Where an individual gene product that is part of a complex can be annotated to terms that describe the action (function or process) of the whole complex
contributes_to is not needed to annotate a catalytic subunit. Furthermore, contributes_to may be used for any non-catalytic subunit, whether the subunit is essential for the activity of the complex or not
Annotations using contributes_to often use the IC evidence code, but others may also be used
ONLY used with GO function ontology
59
“…we next examined whether a complex of four proteins can be formed… As shown in Figure 4, FLAG-tagged PIG-C was precipitated efficiently with anti-FLAG beads in four combinations with other proteins (Figure 4A, lanes 1–4)… These results strongly suggest that all four proteins form a complex.”
60
“…To test whether the protein complex consisting of PIG-A, PIG-H, PIG-C and hGPI1 has GlcNAc transferase activity in vitro… incubation of the radiolabeled donor of GlcNAc, UDP-[6-3H]GlcNAc, with lysates of JY5 cells transfected with GST-tagged PIG-A resulted in synthesis of GlcNAc-PI and its subsequent deacetylation to glucosaminyl phosphatidylinositol (GlcN-PI)”
Highlighted phrases: whether the protein complex has GlcNAc transferase activity; resulted in synthesis of GlcNAc-PI and its subsequent deacetylation to glucosaminyl phosphatidylinositol (GlcN-PI)
61
Unknown vs. Unannotated
When there is no existing data to support an annotation, the gene is annotated to the ROOT (top level) term
NOT the same as having no annotation at all
No annotation means that no one has looked yet
62
WITH column
The WITH column provides supporting evidence for ISS, IPI, IGI and IC evidence codes
ISS: the accession of the aligned protein/ortholog
IPI: the accession of the interacting protein
IGI: the accession of the interacting gene
IC: the GO ID for the inferred_from term
63
How to access GO annotation data
64
Where can you find annotations?
UniProtKB
Ensembl
Entrez Gene
65
Gene Association Files
17-column files containing all information for each annotation
GO Consortium website
GOA website
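Reading one line of a 17-column, tab-separated gene association file can be sketched as below. The column names follow the GAF 2.0 layout as best understood here and should be checked against the format documentation; the sample line is constructed for the example, with placeholder GO ID, reference and date, and an assumed taxon for P00505.

```python
# Column names for a 17-column gene association (GAF 2.0 style) line.
GAF_COLUMNS = [
    "db", "db_object_id", "db_object_symbol", "qualifier", "go_id",
    "db_reference", "evidence_code", "with_from", "aspect",
    "db_object_name", "db_object_synonym", "db_object_type", "taxon",
    "date", "assigned_by", "annotation_extension", "gene_product_form_id",
]


def parse_gaf_line(line):
    """Return a dict for an annotation line, or None for a '!' comment line."""
    if line.startswith("!"):
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GAF_COLUMNS):
        raise ValueError(f"expected {len(GAF_COLUMNS)} columns, got {len(fields)}")
    return dict(zip(GAF_COLUMNS, fields))


# Constructed sample line; empty strings are columns left blank.
SAMPLE = "\t".join([
    "UniProtKB", "P00505", "GOT2", "", "GO:0000000",  # placeholder GO ID
    "PMID:0000000", "IDA", "", "F", "", "", "protein",
    "taxon:9606",  # assumed taxon for this accession
    "20110201",    # placeholder date
    "UniProtKB", "", "",
]) + "\n"

record = parse_gaf_line(SAMPLE)
print(record["db_object_symbol"], record["evidence_code"], record["aspect"])
```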
66
GO browsers
67
QuickGO browser
Search GO terms or proteins
Find sets of GO annotations
68
Exercise: Searching for GO annotations in QuickGO
Exercise 2
Exercise 3
69
Exercise: Using QuickGO to create a tailored set of annotations
Exercise 4: Filtering
Exercise 5: Statistics
70
How scientists use the GO, and the tools they use for analysis
71
Using GO annotations
If you wanted to find out the role of a gene product manually, you’d have to read an awful lot of papers
But by using GO annotations, this work has already been done for you: someone has already sat down and associated a particular gene with a particular process
Saves loads of time… especially given a list of thousands of gene products
GO: : apoptosis
72
How scientists use the GO
Access gene product functional information
Analyse high-throughput genomic or proteomic datasets
Validation of experimental techniques
Get a broad overview of a proteome
Obtain functional information for novel gene products
Some examples…
73
MicroArray data analysis
Treat samples, Collect mRNA, Label, Hybridize, Scan, Normalize, Select differentially regulated genes, Understand the biological phenomena involved
(Figure labels: attacked, control, time; Puparial adhesion, Molting cycle, Hemocyanin; Defense response, Immune response, Response to stimulus, Toll regulated genes, JAK-STAT regulated genes; Amino acid catabolism, Lipid metabolism, Peptidase activity, Protein catabolism)
Bregje Wertheim at the Centre for Evolutionary Genomics, Department of Biology, UCL and Eugene Schuster Group, EBI.
74
Validation of experimental techniques
Rat liver plasma membrane isolation (Cao et al., Journal of Proteome Research 2006)
75
Analysis of high-throughput proteomic datasets
Characterisation of proteins interacting with ribosomal protein S19 (Orrù et al., Molecular and Cellular Proteomics 2007)
76
Obtain functional information for novel gene products
MPYVSQSQHIDRVRGAIEGRLPAPGNSSRLVSSWQRSYEQYRLDPGSVIGPRVLTS SELR DVQGKEEAFLRASGQCLARLHDMIRMADYCVMLTDAHGVTIDYRIDRDRRGD FKHAGLYI GSCWSEREEGTCGIASVLTDLAPITVHKTDHFRAAFTTLTCSASPIFAPTG ELIGVLDAS AVQSPDNRDSQRLVFQLVRQSAALIEDGYFLNQTAQHWMIFGHASRN FVEAQPEVLIAFD ECGNIAASNRKAQECIAGLNGPRHVDEIFDTSAVHLHDVARTDTI MPLRLRATGAVLYAR IRAPLKRVSRSACAVSPSHSGQGTHDAHNDTNLDAISRFLHS RDSRIARNAEVALRIAGK HLPILILGETGVGKEVFAQALHASGARRAKPFVAVNCGAIP DSLIESELFGYAPGAFTGA RSRGARGKIAQAHGGTLFLDEIGDMPLNLQTRLLRVLA EGEVLPLGGDAPVRVDIDVICA THRDLARMVEEGTFREDLYYRLSGATLHMPPLRER ADILDVVHAVFDEEAQSAGHVLTLD GRLAERLARFSWPGNIRQLRNVLRYACAVCDS TRVELRHVSPDVAALLAPDEAALRPALA LENDERARIVDALTRHHWRPNAAAEALGM
InterProScan
77
Annotating novel sequences
Can use BLAST queries to find similar sequences with GO annotation which can be transferred to the new sequence
Two tools currently available:
AmiGO BLAST (from GO Consortium) searches the GO Consortium database
BLAST2GO (from Babelomics) searches the NCBI database
78
AmiGO BLAST
Exportin-T from Pongo abelii (Sumatran orangutan)
79
Numerous Third Party Tools
Many tools exist that use GO to find common biological functions from a list of genes
Freely available for use
80
GO tools: enrichment analysis
Most of these tools work in a similar way:
input a gene list and a subset of ‘interesting’ genes
tool shows which GO categories have most interesting genes associated with them, i.e. which categories are ‘enriched’ for interesting genes
tool provides a statistical measure to determine whether enrichment is significant
Try exercise 7 at home
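The slide does not say which statistic the tools use. One common choice is a one-sided hypergeometric test, sketched here as an assumption rather than as the method of any particular tool. With N genes in the full list, K of them annotated to a GO category, n interesting genes and k interesting genes in the category, the p-value is the chance of seeing k or more by random draw. The function name and the numbers are illustrative.

```python
from math import comb


def enrichment_p_value(total_genes, genes_in_category, interesting_genes, overlap):
    """P(X >= overlap) for X ~ Hypergeometric(total, in_category, interesting)."""
    upper = min(genes_in_category, interesting_genes)
    favourable = sum(
        comb(genes_in_category, i)
        * comb(total_genes - genes_in_category, interesting_genes - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(total_genes, interesting_genes)


# Illustrative numbers: 10 genes in total, 5 in the category, 5 interesting,
# and all 5 interesting genes fall in the category.
p = enrichment_p_value(10, 5, 5, 5)
print(p)
```

Real tools also correct for testing many GO categories at once, which this sketch leaves out.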
81
GO slims
Many GO analysis tools use GO slims to give a broad overview of the dataset
GO slims are cut-down versions of the GO and contain a subset of the terms in the whole GO
GO slims usually contain less-specialised GO terms
82
Slimming the GO using the ‘true path rule’
Many gene products are associated with a large number of descriptive, leaf GO nodes:
83
Slimming the GO using the ‘true path rule’
…however annotations can be mapped up to a smaller set of parent GO terms:
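Mapping annotations up to a slim can be sketched as walking from each annotated term towards the root and keeping the slim terms met on the way. The parent links below use term names from the Cellular Component slide in a simplified form, the protein names are made up, and `map_to_slim` is a hypothetical helper rather than the QuickGO or AmiGO implementation.

```python
# Simplified child -> parents links (is_a and part_of merged for illustration).
PARENTS = {
    "mitochondrial matrix": ["mitochondrion"],
    "mitochondrial membrane": ["mitochondrion"],
    "large ribosomal subunit": ["ribosome"],
    "small ribosomal subunit": ["ribosome"],
    "mitochondrion": ["cellular_component"],
    "ribosome": ["cellular_component"],
}
SLIM = {"mitochondrion", "ribosome"}


def map_to_slim(term, parents, slim):
    """Return the slim terms that are the term itself or one of its ancestors."""
    hits, seen, stack = set(), set(), [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in slim:
            hits.add(current)
        stack.extend(parents.get(current, []))
    return hits


# Made-up proteins annotated to specific leaf terms.
ANNOTATIONS = {
    "protein_1": ["mitochondrial matrix", "mitochondrial membrane"],
    "protein_2": ["large ribosomal subunit"],
}

slimmed = {
    protein: sorted(set().union(*(map_to_slim(t, PARENTS, SLIM) for t in terms)))
    for protein, terms in ANNOTATIONS.items()
}
print(slimmed)
```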
84
GO slims
Custom slims are available for download;
Or you can make your own using:
QuickGO
AmiGO's GO slimmer
85
Slimming with QuickGO
Search GO terms or proteins
Find sets of GO annotations
Map-up annotations with GO slims
86
Exercise: Map-up annotation using a GO slim
Exercise 6
87
Just some things to be aware of….
The GO is continually changing:
New terms created
Existing terms obsoleted
Re-structured
New annotations being created
ALWAYS use a current version of ontology and annotations
If publishing your analyses, please report the versions/dates you use (ontology, annotation):
Differences in representation of GO terms may be due to biological phenomena, but may also be due to annotation bias or experimental assays
Often better to remove the ‘NOT’ annotations before doing any large-scale analysis, as they can skew the results
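Removing NOT annotations before a large-scale analysis can be sketched as a simple filter. The record layout is hypothetical; the SIPP1 example follows the earlier NOT-qualifier slide, and the pipe-separated qualifier string reflects how qualifiers are combined in gene association files.

```python
def drop_not_annotations(annotations):
    """Keep only annotations whose qualifier list does not include NOT."""
    return [
        annotation for annotation in annotations
        if "NOT" not in annotation["qualifier"].split("|")
    ]


# Hypothetical records; qualifiers may be combined with "|".
ANNOTATIONS = [
    {"protein": "SIPP1", "term": "nucleus", "qualifier": ""},
    {"protein": "SIPP1", "term": "nucleolus", "qualifier": "NOT"},
    {"protein": "protein_x", "term": "some activity", "qualifier": "NOT|contributes_to"},
    {"protein": "protein_y", "term": "some complex", "qualifier": "colocalizes_with"},
]

kept = drop_not_annotations(ANNOTATIONS)
print([(a["protein"], a["term"]) for a in kept])
```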
89
Thank you