The Gene Ontology Project

Presentation on theme: "The Gene Ontology Project"— Presentation transcript:

1 The Gene Ontology Project
An Introduction. 19/09/18

2 There is a lot of biological research output.

3 Search on mesoderm development…

4 You get 6752 results! How will you ever find what you want? Another example…

5 How will you spot the patterns?
Microarray data shows changed expression of thousands of genes. How will you spot the patterns?
[Figure: microarray expression over time, attacked v. control.] Bregje Wertheim at the Centre for Evolutionary Genomics, Department of Biology, UCL and Eugene Schuster Group, EBI.

6 Scientists work hard.

7 There are lots of papers to read.

8 more every week.

9 and more… http://www.teamtechnology.co.uk/f-scientist.jpg

10 more and more and more!

11 Help! Help! more and more and more!

12 Ontology is a way to capture knowledge in a written and computable form.
Computable means that a computer can find the patterns so that we don’t have to.

13 Demo and practical work
eBay search (keyword ‘lead’) v. PubMed search (keyword ‘flower’)

14 The Gene Ontology

15 This is our browser.

16 Search on mesoderm development.

17 Here is mesoderm development.

18 Definition of mesoderm development. Gene products involved in

19 There are many gene products involved in mesoderm development.
But fewer gene products than papers. You can read papers describing what is known about them.

21 [Figure: microarray expression over time, attacked v. control.] Bregje Wertheim at the Centre for Evolutionary Genomics, Department of Biology, UCL and Eugene Schuster Group, EBI.

22 See which processes are upregulated or downregulated.
[Figure: expression over time, attacked v. control, with gene clusters labelled: defense response, immune response, response to stimulus, Toll regulated genes, JAK-STAT regulated genes, puparial adhesion, molting cycle, hemocyanin, amino acid catabolism, lipid metabolism, peptidase activity, protein catabolism.] Bregje Wertheim at the Centre for Evolutionary Genomics, Department of Biology, UCL and Eugene Schuster Group, EBI.

23 Practical work: Search AmiGO
Did you find your favourite gene product or process?

24 How does the Gene Ontology work?

26 The Gene Ontology is like a dictionary
term: transcription initiation
id: GO:
definition: Processes involved in the assembly of the RNA polymerase complex at the promoter region of a DNA template resulting in the subsequent synthesis of RNA from that promoter.
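The dictionary analogy on this slide can be sketched as a small record type. This is only an illustration: the class name `GOTerm`, the helper `format_entry`, and the identifier `GO:0000000` are assumptions made for the sketch (the real identifier did not survive in the transcript, so a placeholder is used).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GOTerm:
    """One dictionary-style ontology entry (hypothetical helper class)."""
    name: str        # the human-readable term
    go_id: str       # the unique identifier
    definition: str  # the free-text definition


# Placeholder only: the identifier on the slide was lost in transcription.
PLACEHOLDER_ID = "GO:0000000"

transcription_initiation = GOTerm(
    name="transcription initiation",
    go_id=PLACEHOLDER_ID,
    definition=(
        "Processes involved in the assembly of the RNA polymerase complex "
        "at the promoter region of a DNA template resulting in the "
        "subsequent synthesis of RNA from that promoter."
    ),
)


def format_entry(term: GOTerm) -> str:
    """Render a term the way the slide lays it out."""
    return f"term: {term.name}\nid: {term.go_id}\ndefinition: {term.definition}"


print(format_entry(transcription_initiation))
```

Keeping the identifier separate from the name is the point of the design: the name can be reworded later without breaking anything that refers to the term by its id.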

27 The whole system.
[Figure: ontology graph with is_a and part_of relationships. Clark et al., 2005]
The Gene Ontology Consortium develops ontologies and annotates gene products to those ontologies. The ontologies are databases containing sets of biological processes, molecular functions, and cellular components and the relationships between them. Annotators within the consortium use these ontologies to categorise gene products. During my talk today I’m going to explain the uses of GO and give a more detailed explanation of the ontologies and of the system of annotation. Then Harold Drabkin is going to talk in more detail about annotation and about how you can submit annotations of your own gene products of interest.

28 An example… Mitochondrial P450 (CC24 PR01238; MITP450CC24)
This gene product, mitochondrial P450, has already been annotated to all three gene ontologies.

29 Where is it?
GO cellular component term: mitochondrial inner membrane ; GO:
The mitochondrial P450 gene products are localised on the mitochondrial inner membrane, and the GO cellular component term for this is mitochondrial inner membrane.

30 What does it do?
substrate + O2 = CO2 + H2O + product
GO molecular function term: monooxygenase activity ; GO:
The function of the gene product is described by the GO molecular function term monooxygenase activity.

31 Which process is this?
GO biological process term: electron transport ; GO:0006118
The process in which the gene product is involved is described by the GO biological process term electron transport. In this way you can see that many aspects of a single gene product can be recorded simply by annotating it to the three ontologies. mitochondrion/krebpic.html

32 The whole system. Clark et al., 2005 is_a part_of

33 The Gene Ontology is for all species
and that means we have to *bridge* some language barriers.

34 Same name, same thing?
Bridge of Sighs, Cambridge. Ponte dei Sospiri, Venice.

35 In biology…
Taction? Tactition? Tactile sense?
In developing the ontologies we are solving a number of problems for biologists. Currently in biology there are many ambiguities in language. Groups of researchers may use the same words to mean different things, or they may use several different words to refer to the same thing. This causes problems for scientists trying to access research carried out by groups outside their immediate field. It also makes it very difficult to process biological information using a computer. For example, three groups of biologists studying different model organisms may all be studying the perception of touch. Scientists in different groups might talk about this single process as ‘tactition’, ‘tactile sense’ or ‘taction’. This differing use of language makes it harder for them to find and read each other’s papers. It also makes it harder for them to use a computer to find and interpret biological data on this subject, since the computer has no way to know that these words mean the same thing.

36 perception of touch ; GO:0050975
Taction, tactition, tactile sense = perception of touch ; GO:0050975
The GO provides a solution to this problem: we take a biological concept such as the perception of touch and make it a single GO term in the ontology. We add all the relevant synonyms, and give a unique numerical identifier to the concept.
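A minimal sketch of how synonyms collapse onto one identifier, using only the term, synonyms and id shown on this slide. The dictionary layout and the function names `build_index` and `resolve` are assumptions for illustration, not how the GO databases are actually stored.

```python
# Term name, synonyms and identifier exactly as given on the slide.
TERMS = {
    "GO:0050975": {
        "name": "perception of touch",
        "synonyms": ["taction", "tactition", "tactile sense"],
    },
}


def build_index(terms):
    """Map every name and synonym (lower-cased) to its single GO id."""
    index = {}
    for go_id, entry in terms.items():
        for label in [entry["name"], *entry["synonyms"]]:
            index[label.lower()] = go_id
    return index


def resolve(label, index):
    """Return the GO id for a label, or None if the label is unknown."""
    return index.get(label.strip().lower())


INDEX = build_index(TERMS)
for word in ("Taction", "tactition", "Tactile sense", "perception of touch"):
    print(word, "->", resolve(word, INDEX))
```

Whichever word a research group prefers, a search through the index lands on the same identifier, which is what lets a computer treat the three vocabularies as one concept.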

37 Bud initiation?
The GO also provides a solution to the opposite problem, in which several groups of scientists use the same words to refer to different things. For example, the phrase ‘bud initiation’ could refer to the initiation of a tooth bud, a yeast reproductive bud, or a bud on a tree. However, these three types of bud are initiated in quite different ways, and scientists would like to be able to distinguish between them.

38 = tooth bud initiation
= reproductive bud initiation
= branch bud initiation
To solve this problem the GO differentiates between differing concepts by adding a ‘sensu’ ending. In this example, ‘bud initiation sensu Metazoa’ means the kind of bud initiation that gives rise to a tooth in mammals, ‘bud initiation sensu Saccharomyces’ means the initiation of a reproductive bud in yeast, and ‘bud initiation sensu Viridiplantae’ means the initiation of a tree bud. This means that gene products involved in bud initiation can be categorised along with only those other gene products involved in the same kind of bud initiation.

39 Demo: Writing an ontology
The car ontology

40 Demo: The gene ontology

41 Categorization of gene products using GO is called annotation.
So how does that happen?

42 Choose your favourite gene.
P05147

43 Find a paper about it.
P05147 PMID:

44 Find the GO term describing its function, process or location of action.
PMID: GO:

45 What evidence do they show?
P05147 PMID:2976880 GO:0047519 IDA

46 Write these down…
P05147 GO:0047519 IDA PMID:2976880

47 Send to the GO Consortium
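The four items written down on slide 46 (gene product, GO id, evidence code, reference) can be held in one small record. The sketch below is illustrative only: the `Annotation` type and `parse_annotation` function are hypothetical names, and the whitespace-separated line is simply the order shown on the slide, not the consortium's actual submission file format.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    """One gene product annotated to one GO term (hypothetical record type)."""
    gene_product: str  # database accession of the gene product
    go_id: str         # the GO term it is annotated to
    evidence: str      # evidence code, e.g. IDA
    reference: str     # the paper supporting the annotation


def parse_annotation(line: str) -> Annotation:
    """Parse 'ACCESSION GO:id EVIDENCE PMID:id' as written on the slide."""
    gene_product, go_id, evidence, reference = line.split()
    if not go_id.startswith("GO:"):
        raise ValueError(f"not a GO identifier: {go_id!r}")
    if not reference.startswith("PMID:"):
        raise ValueError(f"not a PubMed reference: {reference!r}")
    return Annotation(gene_product, go_id, evidence, reference)


record = parse_annotation("P05147 GO:0047519 IDA PMID:2976880")
print(record)
```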

48 Finding annotations in a paper
…for B. napus PERK1 protein (Q9ARH1)
In this study, we report the isolation and molecular characterization of the B. napus PERK1 cDNA, that is predicted to encode a novel receptor-like kinase. We have shown that like other plant RLKs, the kinase domain of PERK1 has serine/threonine kinase activity. In addition, the location of a PERK1-GFP fusion protein to the plasma membrane supports the prediction that PERK1 is an integral membrane protein…these kinases have been implicated in early stages of wound response…
PubMed ID:
…piece of text…from which we can pull out GO terms:
Function: protein serine/threonine kinase activity ; GO:
Component: integral to plasma membrane ; GO:
Process: response to wounding ; GO:

49 Annotation details

52 Where to get annotations?
Non-redundant species database: contains all GO annotations for a given species, plus other information.
Multispecies database (GOA): contains all GO annotations.

53 Evidence codes

54 IDA - inferred from direct assay
  Enzyme assays
  In vitro reconstitution (e.g. transcription)
  Immunofluorescence (for cellular component)
  Cell fractionation (for cellular component)
  Physical interaction/binding
IEP - inferred from expression pattern
  Transcript levels (e.g. Northerns, microarray data)
  Protein levels (e.g. Western blots)
IGC - inferred from genomic context
  Operon structure
  Syntenic regions
  Pathway analysis
  Genome-scale analysis of processes

55 IGI - inferred from genetic interaction
  "Traditional" genetic interactions such as suppressors, synthetic lethals, etc.
  Functional complementation
  Rescue experiments
  Inference about one gene drawn from the phenotype of a mutation in a different gene
IMP - inferred from mutant phenotype
  Any gene mutation/knockout
  Overexpression/ectopic expression of wild-type or mutant genes
  Anti-sense experiments
  RNAi experiments
  Specific protein inhibitors
  Polymorphism or allelic variation
IPI - inferred from physical interaction
  2-hybrid interactions
  Co-purification
  Co-immunoprecipitation
  Ion/protein binding experiments

56 ISS - inferred from sequence or structural similarity
  Sequence similarity (homologue of/most closely related to)
  Recognized domains
  Structural similarity
  Southern blotting
RCA - inferred from reviewed computational analysis
  Large-scale protein-protein interaction experiments
  Microarray experiments
  Integration of large-scale datasets of several types
  Text-based computation
IEA - inferred from electronic annotation
NAS - non-traceable author statement
ND - no biological data available
TAS - traceable author statement
NR - not recorded
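The evidence codes listed on slides 54 to 56 lend themselves to a simple lookup table. The sketch below copies the codes and their expansions from the slides; the function names are hypothetical, and treating every code other than IEA as a manual annotation is an assumption made for this example.

```python
# Codes and expansions as listed on the slides above.
EVIDENCE_CODES = {
    "IDA": "inferred from direct assay",
    "IEP": "inferred from expression pattern",
    "IGC": "inferred from genomic context",
    "IGI": "inferred from genetic interaction",
    "IMP": "inferred from mutant phenotype",
    "IPI": "inferred from physical interaction",
    "ISS": "inferred from sequence or structural similarity",
    "RCA": "inferred from reviewed computational analysis",
    "IEA": "inferred from electronic annotation",
    "NAS": "non-traceable author statement",
    "ND": "no biological data available",
    "TAS": "traceable author statement",
    "NR": "not recorded",
}


def describe(code: str) -> str:
    """Expand an evidence code; raises KeyError for an unknown code."""
    return EVIDENCE_CODES[code.upper()]


def is_electronic(code: str) -> bool:
    """Assumption for this sketch: only IEA marks an electronic annotation."""
    return code.upper() == "IEA"


for code in ("IDA", "IEA"):
    kind = "electronic" if is_electronic(code) else "manual"
    print(f"{code}: {describe(code)} ({kind})")
```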

57 Should we trust electronic annotations?
PMID:

60 ec2go mapping
!version: $Revision: 1.67 $
!date: $Date: 2008/01/21 11:29:01 $
!Mapping of GO function_ontology "enzymes" to Enzyme Commission Numbers.
!original mapping by Michael Ashburner, Cambridge.
!This version parsed from function.ontology on 2008/01/15 14:01:16
!by Daniel Barrell, EBI, Hinxton
!
EC:1 > GO:oxidoreductase activity ; GO:
EC:1.1 > GO:oxidoreductase activity, acting on CH-OH group of donors ; GO:
EC:1.1.1 > GO:oxidoreductase activity, acting on the CH-OH group of donors, NAD or NADP as acceptor ; GO:
EC: > GO:alcohol dehydrogenase activity ; GO:
EC: > GO:L-xylulose reductase activity ; GO:
EC: > GO:3-oxoacyl-[acyl-carrier-protein] reductase activity ; GO:
EC: > GO:acylglycerone-phosphate reductase activity ; GO:
EC: > GO:3-dehydrosphinganine reductase activity ; GO:
EC: > GO:L-threonine 3-dehydrogenase activity ; GO:
EC: > GO:4-oxoproline reductase activity ; GO:
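Mapping files in this `EC:… > GO:name ; GO:id` layout can be read with a few lines of code. The following is a rough sketch under stated assumptions: comment lines start with `!`, each mapping line carries a seven-digit GO id, and the ids in `SAMPLE` are made-up placeholders because the real numbers were lost from the transcript. The function name `parse_ec2go` is hypothetical.

```python
import re

# One mapping line: "EC:<number> > GO:<term name> ; GO:<7-digit id>"
MAPPING_LINE = re.compile(
    r"^EC:(?P<ec>[\d.\-]+) > GO:(?P<name>.+) ; (?P<go_id>GO:\d{7})$"
)


def parse_ec2go(lines):
    """Return {EC number: [(GO term name, GO id), ...]}, skipping '!' comments."""
    mapping = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("!"):
            continue  # header/comment line
        match = MAPPING_LINE.match(line)
        if match is None:
            continue  # line does not fit the assumed layout
        mapping.setdefault(match["ec"], []).append((match["name"], match["go_id"]))
    return mapping


# The GO ids below are placeholders, not the real identifiers.
SAMPLE = [
    '!Mapping of GO function_ontology "enzymes" to Enzyme Commission Numbers.',
    "!",
    "EC:1 > GO:oxidoreductase activity ; GO:0000001",
    "EC:1.1 > GO:oxidoreductase activity, acting on CH-OH group of donors ; GO:0000002",
]

for ec, targets in parse_ec2go(SAMPLE).items():
    print(ec, targets)
```

An electronic annotation pipeline can then assign the mapped GO term to any gene product that already carries the EC number, which is why such annotations are produced without a curator reading a paper.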

61 interpro2go mapping
InterPro is a database of protein families, domains and functional sites in which identifiable features found in known proteins can be applied to unknown protein sequences.
!date: 2008/01/15 13:01:24
!Mapping of InterPro entries to GO
!Nicola Mulder, Hinxton
!
InterPro:IPR Retinoid X receptor > GO:DNA binding ; GO:
InterPro:IPR Retinoid X receptor > GO:steroid binding ; GO:
InterPro:IPR Retinoid X receptor > GO:regulation of transcription, DNA-dependent ; GO:
InterPro:IPR Retinoid X receptor > GO:nucleus ; GO:
InterPro:IPR Helix-turn-helix, AraC type > GO:transcription factor activity ; GO:
InterPro:IPR Helix-turn-helix, AraC type > GO:intracellular ; GO:
InterPro:IPR Metallothionein, vertebrate > GO:metal ion binding ; GO:
InterPro:IPR Peptidase M7, snapalysin > GO:extracellular region ; GO:
InterPro:IPR PAS > GO:signal transducer activity ; GO:
InterPro:IPR Fimbrial biogenesis outer membrane usher protein > GO:transporter activity ; GO:
InterPro:IPR P2Y4 purinoceptor > GO:purinergic nucleotide receptor activity, G-protein coupled ; GO:
InterPro:IPR Anaphylatoxin/fibulin > GO:extracellular region ; GO:
InterPro:IPR Hok/gef cell toxic protein > GO:membrane ; GO:
InterPro:IPR Carboxyl transferase > GO:ligase activity ; GO:
InterPro:IPR Phosphofructokinase > GO:6-phosphofructokinase activity ; GO:
InterPro:IPR Melatonin receptor > GO:integral to membrane ; GO:
InterPro:IPR Guanine-specific ribonuclease N1 and T1 > GO:endoribonuclease activity ; GO:
InterPro:IPR Chloroperoxidase > GO:peroxidase activity ; GO:

62 Manual annotation appears in AmiGO.
Manual and electronic annotation appears in QuickGO.

63 Many species groups annotate.
We see the research of one function across all species. Clark et al., 2005

64 Exercise: Search for your favourite gene and see if the annotation is electronic or manual.

65 Submit new GO terms:

67 GO slims

68 Clark et al., 2005 is_a part_of

69 Clark et al., 2005 is_a part_of

70 Whole genome analysis (J. D. Munkvold et al., 2004)
The GO can also be used for very specific applications in the lab. For example, in microarray analysis you can use the GO data to show which processes are modified by the treatment being studied in a given microarray experiment. You can also use the GO to give an overview of the range of gene products in a whole genome, as represented by the functions or processes those genes are involved in.
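One way to picture this kind of overview is to roll each gene's specific annotations up to a short list of broad terms (a slim) and count genes per broad term. The sketch below is a toy: the parent links, the slim set and the gene names `geneA` to `geneC` are all invented for illustration and are not taken from the real ontology or from the study cited on the slide.

```python
from collections import Counter

# Toy child -> parents graph; each edge stands for an is_a or part_of link.
PARENTS = {
    "immune response": ["response to stimulus"],
    "defense response": ["response to stimulus"],
    "lipid metabolism": ["metabolism"],
    "amino acid catabolism": ["metabolism"],
    "response to stimulus": [],
    "metabolism": [],
}

# The broad terms we want to report on (a hypothetical slim).
SLIM = {"response to stimulus", "metabolism"}

# Hypothetical gene -> annotated terms.
ANNOTATIONS = {
    "geneA": ["immune response"],
    "geneB": ["defense response", "lipid metabolism"],
    "geneC": ["amino acid catabolism"],
}


def ancestors(term, parents):
    """Return the term itself plus everything reachable via parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return seen


def slim_counts(annotations, parents, slim):
    """Count how many genes fall under each slim term (each gene counted once per term)."""
    counts = Counter()
    for terms in annotations.values():
        hit = set()
        for term in terms:
            hit |= ancestors(term, parents) & slim
        counts.update(hit)
    return counts


for slim_term, n_genes in sorted(slim_counts(ANNOTATIONS, PARENTS, SLIM).items()):
    print(slim_term, n_genes)
```

Collecting the slim hits per gene in a set before counting means a gene annotated to two children of the same broad term is still counted only once for that term.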

71 …analysis of high-throughput data according to GO
[Figure: expression over time, attacked v. control, with gene clusters labelled: puparial adhesion, molting cycle, hemocyanin, amino acid catabolism, lipid metabolism, peptidase activity, protein catabolism, immune response, Toll regulated genes.] Bregje Wertheim at the Centre for Evolutionary Genomics, Department of Biology, UCL and Eugene Schuster Group, EBI.

72 Making Slims: OBO-Edit

73 Reapplying slimmed ontology to annotations: AmiGO http://amigo

74 Converting IDs: PICR http://www.ebi.ac.uk/Tools/picr/

75 GOOSE http://www.berkeleybop.org/goose

76 2006 Consortium Meeting, St. Croix, U.S. Virgin Islands, March 30 - April 3, 2006
Finally, this is the current list of groups in the consortium. The editorial office, where the ontologies are developed, is in Cambridge in the UK, and the rest of these groups contribute annotations. We are keen to include more groups in the annotation process so that more species will be manually annotated to the GO. Harold Drabkin is now going to talk about the process of annotation and about how you can contribute manual annotations of your own gene products.

77 E. coli hub Reactome

