The Gene Ontology: a real-life ontology, progress and future. Jane Lomax EMBL-EBI.

What is the Gene Ontology?
- Controlled vocabulary (GO)
  - Terms and relationships
  - Bottom-up approach
- Annotation of proteins to terms
  - Gene association files
- Software/database development
- Freely available

The vocabulary
GO is divided into three sub-vocabularies:
- biological process: a broad series of events, at the level of either the cell or the organism, e.g. circulation, glycolysis
- molecular function: direct activities, e.g. catalysis, binding
- cellular component: site of action, e.g. nucleus, ribosome

The vocabulary
- Hierarchical
- Directed acyclic graph (DAG)
  - Terms have one or more parents
- is-a and part-of relations
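The graph structure described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not GO's own software: the `PARENTS` table is a hypothetical toy graph, the term names are used only as labels, and the function name `ancestors` is an assumption made for this example.

```python
from collections import deque

# Hypothetical toy graph: each term maps to a list of (relation, parent) pairs.
# A term may have more than one parent, which makes this a DAG, not a tree.
PARENTS = {
    "nucleolus": [("part_of", "nucleus")],
    "nucleus": [("is_a", "organelle"), ("part_of", "cell")],
    "organelle": [("is_a", "cellular component")],
    "cell": [("is_a", "cellular component")],
    "cellular component": [],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following is-a / part-of edges upward."""
    seen = set()
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for _relation, parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen
```

Because "nucleus" has two parents, `ancestors("nucleolus")` reaches "cellular component" by two different paths but reports it once. That single-report behaviour is the practical difference between walking a DAG and walking a tree.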

How is GO maintained?
- Several full-time editors
- Requests from the community
  - Database curators, researchers, software developers
  - SourceForge tracker
- GO Consortium meetings for large changes
- Mailing lists

OBO - Open Biological Ontologies
- GO is a member vocabulary of OBO
- A repository for biological structured vocabularies
  - Freely available without license
  - Common syntax
  - Orthogonal to existing ontologies

Future developments
- File format
  - Current GO flat file format
    - Partly redundant
    - Difficult to parse
  - New format
    - Extensible, e.g. new relationship types can be specified
    - Minimal redundancy, but human readable
    - Easier to parse
- Moving to a database being the primary form of GO
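To make "extensible" and "easier to parse" concrete, here is a small sketch of a parser for a stanza-based tag-value layout. The slide does not show the new format, so the layout below (a `[Type]` header followed by `tag: value` lines, with `!` starting a comment) is an assumption modelled on the style later used by OBO flat files. The identifiers in `SAMPLE` are invented placeholders, not real GO IDs.

```python
def parse_stanzas(text):
    """Parse '[Type]' headers followed by 'tag: value' lines into dicts."""
    stanzas = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines (assumed to start with '!').
        if not line or line.startswith("!"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = {"type": line[1:-1], "tags": {}}
            stanzas.append(current)
        elif current is not None and ":" in line:
            # Split on the first colon only, so IDs such as EX:0000001 survive.
            tag, value = line.split(":", 1)
            current["tags"].setdefault(tag.strip(), []).append(value.strip())
    return stanzas


# Hypothetical sample data; the EX: identifiers are placeholders.
SAMPLE = """\
[Term]
id: EX:0000001
name: example parent

[Term]
id: EX:0000002
name: example child
is_a: EX:0000001
relationship: part_of EX:0000001
"""
```

In this layout a new relationship type is just another value on a `relationship:` line, so the parser needs no change when one is added. That is the sense in which such a format is extensible.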

Formalizing GO
- Informality is a common criticism of GO
  - Developed by biologists, for biologists
- Now beginning work on ‘decomposing’ GO using Prolog
  - Terms are broken down into their constituent parts, e.g. regulation of heart development
  - New terms could be created from orthogonal ontologies, e.g. anatomical ones
- Work on translating GO into description logic (DL) and reasoning across the ontologies
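The idea of breaking a term name into constituent parts can be illustrated with plain string handling. This sketch is not the Prolog work mentioned on the slide. The `PROCESS_WORDS` set, the "regulation of" prefix rule and the output keys are all assumptions chosen for illustration.

```python
# Hypothetical list of words treated as process names in this sketch.
PROCESS_WORDS = {"development", "morphogenesis", "contraction"}


def decompose(name):
    """Split a term name into an optional modifier, an entity and a process."""
    parts = {}
    prefix = "regulation of "
    if name.startswith(prefix):
        parts["modifier"] = "regulation"
        name = name[len(prefix):]
    words = name.split()
    if len(words) >= 2 and words[-1] in PROCESS_WORDS:
        # The last word names the process; the rest names the entity,
        # which could come from an orthogonal (e.g. anatomical) vocabulary.
        parts["process"] = words[-1]
        parts["entity"] = " ".join(words[:-1])
    else:
        parts["process"] = name
    return parts
```

For the slide's example, `decompose("regulation of heart development")` yields the modifier "regulation", the entity "heart" and the process "development". A real decomposition would need a grammar and curated vocabularies rather than a fixed word list.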

GO into UMLS
- GO is now released as part of the NLM’s Unified Medical Language System (UMLS) Metathesaurus
- The Metathesaurus links biomedical vocabularies including MeSH and SNOMED
- The process of including GO in UMLS highlighted problems in both systems

GO synonyms
- Text strings associated with GO terms
- Often do not have identical meaning to the term
- This reduces their utility in e.g. semantic matching
- Developed relationships between terms and synonyms
  - Soon to be fully implemented in GO
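A short sketch shows why typed term-synonym relationships help semantic matching. The slide does not list the relationship types, so the scope labels used here ("exact", "broad", "narrow", "related") are an assumption, and the records in `SYNONYMS` are invented placeholders rather than real GO data.

```python
# Hypothetical synonym records: (term name, synonym text, scope).
SYNONYMS = [
    ("example process", "example process alias", "exact"),
    ("example process", "wider process", "broad"),
    ("example process", "specific subprocess", "narrow"),
    ("example process", "loosely associated phrase", "related"),
]


def lookup(text, synonyms=SYNONYMS, scopes=("exact",)):
    """Return term names whose synonyms match `text` within the allowed scopes."""
    wanted = text.strip().lower()
    return sorted({
        term
        for term, synonym, scope in synonyms
        if scope in scopes and synonym.lower() == wanted
    })
```

With the default scope, `lookup("wider process")` returns an empty list because that string is only a broad synonym. Passing `scopes=("exact", "broad")` returns the term. A matcher can then choose how much imprecision to accept instead of treating every synonym as equivalent to its term.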

- FlyBase & Berkeley Drosophila Genome Project
- Saccharomyces Genome Database
- PomBase (Sanger Institute)
- Rat Genome Database
- Genome Knowledge Base (CSHL)
- The Institute for Genomic Research
- Compugen, Inc
- The Arabidopsis Information Resource
- WormBase
- DictyBase
- Mouse Genome Informatics
- Swiss-Prot/TrEMBL/InterPro
- Pathogen Sequencing Unit (Sanger Institute)

The Gene Ontology Consortium is supported by an R01 grant from the National Human Genome Research Institute (NHGRI) [grant HG02273]. SGD is supported by a P41, National Resources, grant from the NHGRI [grant HG01315]; MGD by a P41 from the NHGRI [grant HG00330]; GXD by the National Institute of Child Health and Human Development [grant HD33745]; FlyBase by a P41 from the NHGRI [grant HG00739] and by the Medical Research Council, London. TAIR is supported by the National Science Foundation [grant DBI ]. WormBase is supported by a P41, National Resources, grant from the NHGRI [grant HG02223]; RGD is supported by an R01 grant from the NHLBI [grant HL64541]; DictyBase is supported by an R01 grant from the NIGMS [grant GM064426].