Copyright OpenHelix. No use or reproduction without express written consent.

Gene Ontology™, MeSH®, EC and more… Controlled Vocabularies (and why you should care)
Materials prepared by: Mary Mangan, Ph.D. Version 3. Updated: Q1 2011

Controlled Vocabularies Agenda
- Introduction
- GO, MeSH, and EC numbers
- Making term assignments
- Appearing in databases
- Common questions
- Conclusion
- Exercises

What are Controlled Vocabularies?
Controlled, constrained, restricted, structured, ontology.
The Free On-line Dictionary of Computing describes ONTOLOGY as: “The hierarchical structuring of knowledge about things by subcategorising them according to their essential (or at least relevant and/or cognitive) qualities.”

General Structure: a Hierarchy (parent and child terms)
Most specific term: Mitochondrial matrix
Pretty specific term: Mitochondrion
Increasingly specific term: Cytoplasm
Less broad term: Intracellular
Broad, general term: Cell
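
The parent and child relationship on this slide can be sketched in a few lines of Python. This is only an illustration built from the five example terms above; the `PARENT` table and the `ancestors` helper are hypothetical names, not part of any controlled-vocabulary tool.

```python
# Child -> parent links, taken from the example terms on the slide.
PARENT = {
    "Mitochondrial matrix": "Mitochondrion",
    "Mitochondrion": "Cytoplasm",
    "Cytoplasm": "Intracellular",
    "Intracellular": "Cell",
}


def ancestors(term):
    """Return the parent terms of `term`, nearest first, up to the root."""
    chain = []
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain


if __name__ == "__main__":
    # Walk from the most specific term up to the broadest one.
    print(ancestors("Mitochondrial matrix"))
```

Walking up the chain one step at a time is what "backing up one level" means when a search on a very specific term returns too little.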

Controlled Vocabularies are:
- Trying to ensure we use the same words to describe the same thing; consistent descriptions
- Usually developed by groups of people with expertise in a field (e.g. the GO Consortium)
- Easier for computers to handle
- Easy to share between different databases
- Becoming the standard way to describe genes, proteins, structures, activities, etc.
- Appearing in databases near you…

A Sample of Gene Ontology™
[Screenshots: search term "mitochondrial matrix"; search result as a list; search result as a tree. Callouts: GO ID, # Genes]

MeSH®, Medical Subject Headings
[Screenshot: MeSH browser search for "mitochondrion"]

MeSH®, Medical Subject Headings
[Screenshots: upper part of results page; lower part of results page. Callout: MeSH ID]

EC Numbers
The Enzyme Commission has long used EC numbers (since the 1950s!) to classify enzymes based on the reactions they catalyze. Many people have seen EC numbers in papers, or in GenBank or Swiss-Prot records.

EC Numbers (continued)
[Screenshot callouts: broad term; specific term; specific enzymes]
Details can be found at:
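
An EC number is itself a small hierarchy: four dot-separated fields that run from a broad enzyme class down to a specific enzyme. The sketch below splits a number into those levels. The helper names `ec_levels` and `ec_class_name` are made up for this example, and the class table lists only the top-level class names.

```python
# Top-level enzyme classes (the first field of an EC number).
# Class 7 was added to the scheme in 2018, after these slides were written.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
    7: "Translocases",
}


def ec_levels(ec_number):
    """Split an EC number into its prefixes, from broad to specific."""
    text = ec_number.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    fields = text.split(".")
    if len(fields) != 4:
        raise ValueError("expected four dot-separated fields: %r" % ec_number)
    # Partial classifications write unassigned trailing fields as "-".
    known = []
    for field in fields:
        if field == "-":
            break
        known.append(field)
    return [".".join(known[: i + 1]) for i in range(len(known))]


def ec_class_name(ec_number):
    """Return the name of the top-level class for an EC number."""
    return EC_CLASSES[int(ec_levels(ec_number)[0])]


if __name__ == "__main__":
    # EC 1.1.1.1 is alcohol dehydrogenase, an oxidoreductase.
    print(ec_levels("EC 1.1.1.1"), ec_class_name("EC 1.1.1.1"))
```

Each prefix in the returned list corresponds to one step from the broad term toward the specific enzyme.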

Controlled Vocabularies
- Standardized terms for describing biological properties
- Different groups have different types of lists, to characterize the functions that they are interested in
- Next: how are they assigned, and where are they used

How Terms are Assigned to Genes
- Some terms are COMPUTATIONALLY assigned, based on a certain protein domain or on homology to another known gene.
- Some are assigned by professional CURATORS: actual live scientists who read papers and attach a term to a given gene record.
- Some may come from the authors of a paper, or from volunteer curators or submitters.
Assignment of terms to a record is referred to as ANNOTATION.

Computational, Curator, or Volunteer Assignment: Computational assignment
[Screenshot: Entrez Gene entry of a Riken gene, with computationally assigned GO terms (entrez/query.fcgi?db=gene). Callout: GO definition of the ISS evidence code]

Computational, Curator, or Volunteer Assignment: Curator assignment
[Screenshot: Entrez Gene entry with GO evidence code TAS: evidence from a paper, applied by a human curator]

Computational, Curator, or Volunteer Assignment: Volunteer assignment
With any source of information, the quality can vary. Any person submitting a sequence to GenBank (BankIt/) could add terms to the record, and those terms might or might not be used accurately.
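
Because each annotation carries an evidence code, annotations can be sorted by how they were assigned. The sketch below is a minimal illustration: it covers only the two codes named on these slides (ISS and TAS) and labels them the way the slides do, and the gene names and records are invented for the example.

```python
# How the slides above describe each code; many other GO evidence codes exist.
EVIDENCE_SOURCE = {
    "ISS": "computational",  # Inferred from Sequence or Structural Similarity
    "TAS": "curator",        # Traceable Author Statement
}

# Hypothetical annotation records: (gene, term name, evidence code).
ANNOTATIONS = [
    ("geneA", "mitochondrial matrix", "ISS"),
    ("geneA", "mitochondrion", "TAS"),
    ("geneB", "mitochondrial matrix", "TAS"),
    ("geneC", "cytoplasm", "XYZ"),  # made-up code, to show the fallback
]


def group_by_source(annotations):
    """Group (gene, term) pairs by the kind of assignment behind them."""
    groups = {}
    for gene, term, code in annotations:
        source = EVIDENCE_SOURCE.get(code, "unknown")
        groups.setdefault(source, []).append((gene, term))
    return groups


if __name__ == "__main__":
    for source, pairs in group_by_source(ANNOTATIONS).items():
        print(source, pairs)
```

A researcher who wants only annotations backed by a paper could keep the "curator" group and set the rest aside.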

Appearing in Databases Near You

Great Question: What Forms Do These Come In?
- Within database entries, both public and private. [Screenshot: Entrez Gene sample]
- Plain text, or “flat” files. Best for computer folks to use in their work. Downloadable. May be XML, ASCII, other forms. [Screenshot: MeSH XML file]
- Web browsers. Software permits you to browse around the information (AmiGO, The Jackson Laboratory, others)… [Screenshot: GO Browser]
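
To make the "flat file" form concrete, here is a minimal sketch of reading one in Python. It assumes an OBO-style layout, in which each term is a `[Term]` stanza of `tag: value` lines; the sample text is abbreviated for illustration (real files carry many more tags per term), and `parse_terms` is a hypothetical helper, not a library function.

```python
# Abbreviated, illustrative sample in an OBO-style "tag: value" layout.
SAMPLE = """\
[Term]
id: GO:0005739
name: mitochondrion
namespace: cellular_component

[Term]
id: GO:0005759
name: mitochondrial matrix
namespace: cellular_component
"""


def parse_terms(text):
    """Return one dict per [Term] stanza, mapping each tag to its values."""
    terms = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line == "[Term]":
            current = {}
            terms.append(current)
        elif line.startswith("["):
            current = None  # some other stanza type; skip its lines
        elif current is not None and ": " in line:
            tag, value = line.split(": ", 1)
            # A tag may repeat within a stanza, so values are kept in a list.
            current.setdefault(tag, []).append(value)
    return terms


if __name__ == "__main__":
    for term in parse_terms(SAMPLE):
        print(term["id"][0], term["name"][0])
```

The same information is what a web browser such as AmiGO presents interactively; the flat file is simply the form that is easiest for a program to read.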

Great Question: Can the Same Term be at More Than One Place in a Hierarchy?
Yes.
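
One way to picture a term sitting at more than one place is to let each term have several parents instead of one. The sketch below uses a small made-up set of terms (illustrative only, not copied from any real vocabulary) and lists every route from a term up to the root; `PARENTS` and `paths_to_root` are hypothetical names.

```python
# Illustrative term -> parents table; a term may list more than one parent.
PARENTS = {
    "mitochondrial membrane": ["mitochondrion", "membrane"],
    "mitochondrion": ["cell"],
    "membrane": ["cell"],
    "cell": [],
}


def paths_to_root(term):
    """Return every path from `term` up to a root term, as lists of names."""
    parents = PARENTS.get(term, [])
    if not parents:
        return [[term]]
    paths = []
    for parent in parents:
        for path in paths_to_root(parent):
            paths.append([term] + path)
    return paths


if __name__ == "__main__":
    for path in paths_to_root("mitochondrial membrane"):
        print(" -> ".join(path))
```

In this toy example the same term is reached by two different routes, so it would show up in two places when the vocabulary is drawn as a tree.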

Great Question: Can There be Species-Specific Terms in a Broader Hierarchy?
Yes.

Why Should You Care? As a researcher:
- You can do better searches, and get more accurate results back.
- You can vary the resolution of what you are looking for (okay, maybe that last search was too specific, let me back up one level…). Cast a wide net in a search.
- Some search tools REQUIRE you to use a controlled vocabulary term in a search.
- You may be forced someday to use specific, correct terms in writing papers or for submissions to databases…

Why Should You Care? As a software developer:
- You don’t need to duplicate the efforts of other groups; many term lists have already been created.
- Different databases that share the same lists can interoperate.
- Tools have already been created for using these controlled vocabularies.
- User groups are often available to help with them.

Web Sites of Some Controlled Vocabularies
- Gene Ontology™ Consortium
- MeSH, from National Library of Medicine
- Open Biological Ontologies
- Systematized Nomenclature of Medicine, SNOMED®
- EC Numbers, Enzyme Commission
Other specialist groups are creating their own… not a complete list…

Controlled Vocabularies are: (reprise)
- Trying to ensure we use the same words to describe the same thing; consistent descriptions
- Usually developed by groups of people with expertise in a field (GO Consortium)
- Easier for computers to handle
- Easy to share between different databases
- Becoming the standard way to describe genes, proteins, structures, activities, etc.
- Appearing in databases near you…
