Published by Polly Stokes. Modified over 9 years ago.

Copyright OpenHelix. No use or reproduction without express written consent.

Controlled Vocabularies (and why you should care)
Gene Ontology™, MeSH®, EC and more…
Materials prepared by: Mary Mangan, Ph.D.
www.openhelix.com
Version 3. Updated: Q1 2011

Controlled Vocabularies Agenda
- Introduction
- GO, MeSH, and EC numbers
- Making term assignments
- Appearing in databases
- Common questions
- Conclusion
- Exercises

What are Controlled Vocabularies?
- Controlled
- Constrained
- Restricted
- Structured
- Ontology
The Free On-line Dictionary of Computing (www.foldoc.org) describes ONTOLOGY as: “The hierarchical structuring of knowledge about things by subcategorising them according to their essential (or at least relevant and/or cognitive) qualities.”

General Structure: a Hierarchy
Parent and child terms, from broadest to most specific:
- Cell (broad, general term)
- Intracellular (less broad term)
- Cytoplasm (increasingly specific term)
- Mitochondrion (pretty specific term)
- Mitochondrial matrix (most specific term)
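
The parent and child relationship on this slide can be sketched in a few lines of Python. This is only an illustration: the `PARENT` table and the `lineage` function are names made up here, and the links are copied from the slide's example rather than from any real vocabulary file.

```python
# Illustrative child -> parent links, copied from the slide's example hierarchy.
PARENT = {
    "mitochondrial matrix": "mitochondrion",
    "mitochondrion": "cytoplasm",
    "cytoplasm": "intracellular",
    "intracellular": "cell",
}


def lineage(term):
    """Return the term followed by each successively broader parent."""
    path = [term]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


if __name__ == "__main__":
    print(" -> ".join(lineage("mitochondrial matrix")))
```

Walking from a child to its parents is what lets a tool report a specific term together with every broader category it belongs to.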

Controlled Vocabularies are:
- An attempt to ensure we use the same words to describe the same thing (consistent descriptions)
- Usually developed by groups of people with expertise in a field (e.g. the GO Consortium)
- Easier for computers to handle
- Easy to share between different databases
- Becoming the standard way to describe genes, proteins, structures, activities, etc.
- Appearing in databases near you…

A Sample of Gene Ontology™
www.geneontology.org
[Screenshots: search term "mitochondrial matrix"; search result as a list; search result as a tree. Callouts: GO ID, # Genes]

MeSH®, Medical Subject Headings
http://www.nlm.nih.gov/mesh/MBrowser.html
[Screenshot: MeSH browser search for "mitochondrion"]

MeSH®, Medical Subject Headings
[Screenshots: upper part of results page; lower part of results page. Callout: MeSH ID]

EC Numbers
- The Enzyme Commission has used EC numbers since the 1950s to classify enzymes, based on the reactions they catalyze.
- Many people have seen EC numbers in papers, or in GenBank or Swiss-Prot records.
http://www.ebi.ac.uk/swissprot/access.html

EC Numbers (continued)
Details can be found at: http://www.chem.qmw.ac.uk/iubmb/enzyme/
[Screenshot callouts: broad term; specific term; specific enzymes]
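
The broad-to-specific structure of an EC number can be shown with a short Python sketch. The function names `ec_levels` and `ec_class` are invented for this illustration. The class names are the Enzyme Commission's top-level classes, and EC 1.1.1.1 (alcohol dehydrogenase) is used as the sample input.

```python
# Top-level EC classes, keyed by the first digit of the EC number.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
    7: "Translocases",
}


def ec_levels(ec_number):
    """Return each prefix of a dotted EC number, broadest first."""
    parts = ec_number.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


def ec_class(ec_number):
    """Return the name of the top-level class the EC number belongs to."""
    return EC_CLASSES[int(ec_number.split(".")[0])]


if __name__ == "__main__":
    # EC 1.1.1.1 is alcohol dehydrogenase.
    print(ec_levels("1.1.1.1"))
    print(ec_class("1.1.1.1"))
```

Each additional number narrows the classification, so truncating an EC number gives the broader group that an enzyme belongs to.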

Controlled Vocabularies
- Standardized terms for describing biological properties
- Different groups maintain different lists, to characterize the functions that they are interested in
- Next: how terms are assigned, and where they are used

How Terms are Assigned to Genes
- Some terms are COMPUTATIONALLY assigned, based on a certain protein domain or on homology to another known gene.
- Some are assigned by professional CURATORS: actual live scientists who read papers and attach a term to a given gene record.
- Some may come from authors of a paper, or from volunteer curators or submitters.
- Assignment of terms to a record is referred to as ANNOTATION.

Computational, Curator, or Volunteer Assignment
Computational assignment
- Entrez Gene entry of a Riken gene, with computationally assigned GO terms
- http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene
- GO definition of the ISS evidence code

Computational, Curator, or Volunteer Assignment
Curator assignment
- Entrez Gene entry with GO evidence code TAS: evidence from a paper, applied by a human curator
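
Because each annotation carries an evidence code, records can be filtered by how the term was assigned. The sketch below is a hedged illustration in Python: the `Annotation` tuple, the gene names, and `filter_by_evidence` are hypothetical, and only the two evidence codes named on these slides (ISS and TAS) are included, with their expansions as given in the GO documentation.

```python
from collections import namedtuple

# Hypothetical record type: one gene, one vocabulary term, one evidence code.
Annotation = namedtuple("Annotation", ["gene", "term", "evidence"])

# The two evidence codes mentioned on the slides, with their GO expansions.
EVIDENCE = {
    "ISS": "Inferred from Sequence or Structural Similarity",
    "TAS": "Traceable Author Statement",
}

# Made-up annotations, for illustration only.
ANNOTATIONS = [
    Annotation("geneA", "mitochondrial matrix", "ISS"),
    Annotation("geneB", "mitochondrial matrix", "TAS"),
    Annotation("geneC", "mitochondrion", "TAS"),
]


def filter_by_evidence(annotations, codes):
    """Keep only the annotations whose evidence code is in `codes`."""
    return [a for a in annotations if a.evidence in codes]


if __name__ == "__main__":
    for a in filter_by_evidence(ANNOTATIONS, {"TAS"}):
        print(a.gene, a.term, EVIDENCE[a.evidence])
```

Filtering this way lets a reader decide, for example, to look only at terms a curator attached from a paper.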

Computational, Curator, or Volunteer Assignment
Volunteer assignment
- With any source of information, the quality can vary.
- http://www.ncbi.nlm.nih.gov/BankIt/
- Anyone submitting a sequence to GenBank could add terms to the record, and those terms might or might not be used accurately.

Appearing in Databases Near You
www.informatics.jax.org

Appearing in Databases Near You
www.flybase.org

Great Question: What Forms Do These Come In?
- Within database entries, both public and private. (Example shown: Entrez Gene sample)
- Plain text, or “flat” files. Best for computer folks to use in their work. Downloadable. May be XML, ASCII, other forms. (Example shown: MeSH XML file)
- Web browsers. Software permits you to browse around the information (AmiGO, The Jackson Laboratory, others)… (Example shown: GO Browser)
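
To show why flat files suit "computer folks", here is a minimal Python sketch that reads a stanza-style text file into dictionaries. The layout (a `[Term]` header followed by `key: value` lines) is modeled on the OBO flat-file style and is an assumption, not something shown on the slide. The `EX:` identifiers and the `parse_terms` function are made up for the example.

```python
# Made-up sample in a stanza-style flat-file layout (assumed, OBO-like).
SAMPLE = """\
[Term]
id: EX:0000001
name: mitochondrion

[Term]
id: EX:0000002
name: mitochondrial matrix
is_a: EX:0000001 ! mitochondrion
"""


def parse_terms(text):
    """Parse [Term] stanzas of 'key: value' lines into a list of dicts."""
    terms = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            # Start a new record for [Term]; ignore any other stanza type.
            current = {} if line == "[Term]" else None
            if current is not None:
                terms.append(current)
        elif current is not None and ": " in line:
            key, value = line.split(": ", 1)
            # Drop a trailing "! comment", keeping only the value itself.
            value = value.split(" ! ")[0]
            current.setdefault(key, []).append(value)
    return terms


if __name__ == "__main__":
    for term in parse_terms(SAMPLE):
        print(term["id"][0], term["name"][0], term.get("is_a", []))
```

Values are stored as lists because a key such as `is_a` may appear more than once in a stanza.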

Great Question: Can the Same Term be at More Than One Place in a Hierarchy?
Yes
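
A term that sits in more than one place simply has more than one parent. The Python sketch below illustrates this with a hypothetical `PARENTS` table: the term names and links are invented for the example and are not copied from GO or MeSH.

```python
# Hypothetical child -> parents links; the first term has two parents.
PARENTS = {
    "mitochondrial membrane": ["mitochondrion", "organelle membrane"],
    "mitochondrion": ["organelle"],
    "organelle membrane": ["organelle"],
    "organelle": [],
}


def paths_to_root(term):
    """Return every path from the term up to a term with no parents."""
    parents = PARENTS.get(term, [])
    if not parents:
        return [[term]]
    return [
        [term] + path
        for parent in parents
        for path in paths_to_root(parent)
    ]


if __name__ == "__main__":
    for path in paths_to_root("mitochondrial membrane"):
        print(" -> ".join(path))
```

Each returned path is one of the places where the term appears when the vocabulary is displayed as a tree.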

Great Question: Can There be Species-Specific Terms in a Broader Hierarchy?
Yes

Why Should You Care?
As a researcher:
- You can do better searches, and get more accurate results back.
- You can vary the resolution of what you are looking for (okay, maybe that last search was too specific, let me back up one level…).
- You can cast a wide net in a search.
- Some search tools REQUIRE you to use a controlled vocabulary term in a search.
- You may be forced someday to use specific, correct terms in writing papers or for submissions to databases…
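
Varying the resolution of a search can be sketched as "match this term or anything below it". The Python example below is illustrative only: the `CHILDREN` links come from the hierarchy example earlier in the deck, while the gene names, the `ANNOTATIONS` table, and the `search` function are hypothetical.

```python
# Parent -> children links, following the deck's example hierarchy.
CHILDREN = {
    "cell": ["intracellular"],
    "intracellular": ["cytoplasm"],
    "cytoplasm": ["mitochondrion"],
    "mitochondrion": ["mitochondrial matrix"],
}

# Hypothetical gene records, each annotated with one term.
ANNOTATIONS = {
    "geneA": "mitochondrial matrix",
    "geneB": "mitochondrion",
    "geneC": "cytoplasm",
}


def descendants(term):
    """Return every term found below `term` in the hierarchy."""
    found = set()
    stack = [term]
    while stack:
        for child in CHILDREN.get(stack.pop(), []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def search(term):
    """Return genes annotated with `term` or with any more specific term."""
    wanted = descendants(term) | {term}
    return sorted(g for g, t in ANNOTATIONS.items() if t in wanted)


if __name__ == "__main__":
    print(search("mitochondrial matrix"))  # narrow search
    print(search("cytoplasm"))             # back up two levels: wider net
```

Backing up one level in the hierarchy widens the set of matching records without changing the query logic.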

Why Should You Care?
As a software developer:
- You don’t need to duplicate the efforts of other groups; many term lists have already been created.
- Different databases that share the same lists can interoperate.
- Tools have already been created for using these controlled vocabularies.
- User groups are often available to help with them.

Web Sites of Some Controlled Vocabularies
- Gene Ontology™ Consortium: http://www.geneontology.org
- MeSH, from National Library of Medicine: http://www.nlm.nih.gov/mesh/meshhome.html
- Open Biological Ontologies: http://obo.sourceforge.net/
- Systematized Nomenclature of Medicine, SNOMED®: http://www.snomed.org/
- EC Numbers, Enzyme Commission: http://www.chem.qmw.ac.uk/iubmb/enzyme/
Other specialist groups are creating their own; this is not a complete list.

Controlled Vocabularies are: (reprise)
- An attempt to ensure we use the same words to describe the same thing (consistent descriptions)
- Usually developed by groups of people with expertise in a field (e.g. the GO Consortium)
- Easier for computers to handle
- Easy to share between different databases
- Becoming the standard way to describe genes, proteins, structures, activities, etc.
- Appearing in databases near you…