Increased Expressivity of Gene Ontology Annotations Huntley RP, Harris MA, Alam-Faruque Y, Carbon SJ, Dietze H, Dimmer E, Foulger R, Hill DP, Khodiyar.

Increased Expressivity of Gene Ontology Annotations
Huntley RP, Harris MA, Alam-Faruque Y, Carbon SJ, Dietze H, Dimmer E, Foulger R, Hill DP, Khodiyar V, Lock A, Lomax J, Lovering RC, Mungall CJ, Mutowo-Muellenet P, Sawford T, Van Auken K, Wood V

The Gene Ontology
A vocabulary of 37,500* distinct, connected descriptions that can be applied to gene products
That's a lot…
– How big is the space of possible descriptions?
*April 2013

Current descriptions miss details
Author:
– LMTK1 (Aatk) can negatively control axonal outgrowth in cortical neurons by regulating Rab11A activity in a Cdk5-dependent manner
GO:
– Aatk: GO: negative regulation of axon extension
GO terms will always be a subset of the total set of possible descriptions
– We shouldn't attempt to make a term for everything

T63 Toxic effect of contact with venomous animals and plants
Term from ICD-10, a hierarchical medical billing code system used to annotate patient records

T63 Toxic effect of contact with venomous animals and plants
– T Toxic effect of contact with Portuguese Man-o-war, accidental (unintentional)
– T Toxic effect of contact with Portuguese Man-o-war, intentional self-harm
– T Toxic effect of contact with Portuguese Man-o-war, assault
    T63.613A Toxic effect of contact with Portuguese Man-o-war, assault, initial encounter
    T63.613D Toxic effect of contact with Portuguese Man-o-war, assault, subsequent encounter
    T63.613S Toxic effect of contact with Portuguese Man-o-war, assault, sequela
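
The growth in this code list is multiplicative: each new axis of detail multiplies the number of pre-composed codes. The following is a minimal Python sketch of that arithmetic, using only the qualifiers named on the slide. The list and variable names are illustrative assumptions, not an ICD-10 data source.

```python
from itertools import product

# Qualifiers copied from the slide text; the lists are illustrative only.
agents = ["Portuguese Man-o-war"]
intents = ["accidental (unintentional)", "intentional self-harm", "assault"]
encounters = ["initial encounter", "subsequent encounter", "sequela"]

# Pre-composition: one code is needed for every combination of every axis.
precomposed = [
    f"Toxic effect of contact with {agent}, {intent}, {encounter}"
    for agent, intent, encounter in product(agents, intents, encounters)
]

# Post-composition: only the building blocks need to exist as terms.
building_blocks = len(agents) + len(intents) + len(encounters)

print(len(precomposed))   # number of combinations: 1 * 3 * 3
print(building_blocks)    # number of simple terms: 1 + 3 + 3
```

Adding a second venomous animal would double the pre-composed list but add only one building block.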

Post-composition
Curators need to be able to compose their complex descriptions from simpler descriptions (terms) at the time of annotation
GO annotation extensions
– Introduced with Gene Association Format (GAF) v2
– Also supported in GPAD
– Has underlying OWL description-logic model

Classic annotation model
Gene Association Format (GAF) v1
– Simple pairwise model
– Each gene product is associated with an (ordered) set of descriptions
    Where each description == a GO term

GO annotation extensions
Gene Association Format (GAF) v1
– Simple pairwise model
– Each gene product is associated with an (ordered) set of descriptions
    Where each description == a GO term
Gene Association Format (GAF) v2 (and GPAD)
– Each gene product is (still) associated with an (ordered) set of descriptions
– Each description is a GO term plus zero or more relationships to other entities
    Entities from GO, other ontologies, databases
    Description is an OWL anonymous class expression (aka description)
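
The difference between the two models can be sketched as a data structure. This is a minimal Python illustration, not the GAF specification: the class names, field names and placeholder identifiers are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Extension:
    """One relationship from the GO term to another entity."""
    relation: str  # e.g. "has_input"
    filler: str    # ID of a GO term, a term from another ontology, or a database entity


@dataclass
class Annotation:
    """A gene product plus one description (GO term and optional extensions)."""
    gene_product: str
    go_term: str
    evidence: str
    reference: str
    extensions: List[Extension] = field(default_factory=list)

    def is_classic(self) -> bool:
        # A GAF v1-style annotation is simply one with zero extensions.
        return not self.extensions


# Placeholder IDs: the real identifiers are not given in the transcript.
classic = Annotation("sty1", "GO:PLACEHOLDER_A", "IMP", "PMID:PLACEHOLDER")
extended = Annotation(
    "sty1", "GO:PLACEHOLDER_A", "IMP", "PMID:PLACEHOLDER",
    [Extension("happens_during", "GO:PLACEHOLDER_B"),
     Extension("has_input", "PomBase:PLACEHOLDER")],
)
print(classic.is_classic(), extended.is_classic())
```

Modelling the classic case as "zero extensions" mirrors the slide's point that the v2 description is a GO term plus zero or more relationships.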

Classic GO annotations are unconnected
DB | Object | Term | Ev | Ref
PomBase | sty1 SPAC24B11.06c | GO: | IMP | PMID:
PomBase | sty1 SPAC24B11.06c | GO: | IMP | PMID:
PomBase | pap1 SPAC c | GO: | IMP | PMID:
..
Diagram labels:
– sty1
– protein localization to nucleus [GO: ]
– cellular response to oxidative stress [GO: ]
– cellular response to oxidative stress [GO: ]
– pap1
– positive regulation of transcription from pol II promoter in response to oxidative stress [GO: ]

Now with annotation extensions
DB | Object | Term | Ev | Ref | Extension
PomBase | sty1 SPAC24B11.06c | GO: protein localization to nucleus | IMP | PMID: | happens_during(GO: ), has_input(SPAC c)
..
PomBase | pap1 SPAC c | GO: | IMP | PMID: | has_regulation_target(…)
Diagram labels:
– sty1
– protein localization to nucleus [GO: ]
– cellular response to oxidative stress [GO: ]
– cellular response to oxidative stress [GO: ]
– happens during
– pap1
– has input
– positive regulation of transcription from pol II promoter in response to oxidative stress [GO: ]
– has regulation target
– <anonymous description>
– <anonymous description>
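
An extension value of the form relation(filler) can be read mechanically. The following is a minimal Python sketch of such a parser. It assumes that a comma joins relationships within one description and that a pipe separates independent descriptions, which is our reading of the GAF 2 convention and should be checked against the format specification. The function name and the placeholder identifiers are hypothetical.

```python
import re
from typing import List, Tuple

# One relation(filler) expression; the filler is anything up to the closing ")".
_EXPRESSION = re.compile(r"\s*(\w+)\(([^()]+)\)\s*")


def parse_extensions(column: str) -> List[List[Tuple[str, str]]]:
    """Split an extension string into groups of (relation, filler) pairs.

    Assumption: "," joins relationships in one description,
    "|" separates independent descriptions.
    """
    groups = []
    for group in column.split("|"):
        if not group.strip():
            continue  # tolerate an empty column or a trailing pipe
        pairs = []
        for part in group.split(","):
            match = _EXPRESSION.fullmatch(part)
            if match is None:
                raise ValueError(f"not a relation(filler) expression: {part!r}")
            pairs.append((match.group(1), match.group(2).strip()))
        groups.append(pairs)
    return groups


# Placeholder IDs stand in for the identifiers missing from the transcript.
parsed = parse_extensions(
    "happens_during(GO:PLACEHOLDER), has_input(PomBase:PLACEHOLDER)"
)
print(parsed)
```

The parser rejects malformed input rather than guessing, so a truncated value such as an unclosed parenthesis raises a ValueError.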

PomBase web interface – sty1

pap1

Where do I get them?
Download
– MGI (22,000)
– GOA Human (4,200)
– PomBase (1,588)
Search and Browsing
– Cross-species
    AmiGO 2: http://amigo2.berkeleybop.org [poster#57]
    QuickGO (later this year)
– MOD interfaces
    PomBase

Query tool support: AmiGO 2
Annotation extensions make use of other ontologies
– CHEBI
– CL – cell types
– Uberon – metazoan anatomy
– MA – mouse anatomy
– EMAP – mouse anatomy
– …

CL, Uberon –

Curation tool support
Supported in
– Protein2GO (GOA, WormBase) [poster#97]
– CANTO (PomBase) [poster#110]
– MGI curation tool

Analysis tool support
Currently: enrichment tools do not yet support annotation extensions
– Annotation extensions can be folded into an analysis ontology
Future: analysis tools can use extended annotations to their benefit
– E.g. account for other modes of regulation in their model
– Tool developers: contact us!
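
One possible reading of "folding" is that each extended description becomes a named class, so that a tool which only understands gene-to-class pairs can still count it. The following is a minimal Python sketch under that assumption. It is not the GO Consortium's actual procedure, and the function name, label format and placeholder identifiers are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Extension = Tuple[str, str]                       # (relation, filler)
Annotation = Tuple[str, str, Tuple[Extension, ...]]  # (gene, term, extensions)


def fold(annotations: Iterable[Annotation]) -> Dict[str, Set[str]]:
    """Map each class label (plain or folded) to the genes annotated to it."""
    genes_by_class: Dict[str, Set[str]] = {}
    for gene, term, extensions in annotations:
        # The plain term always applies: the extended description is more specific.
        genes_by_class.setdefault(term, set()).add(gene)
        if extensions:
            # Sort so the same set of extensions always yields the same label.
            suffix = " and ".join(
                f"{relation} some {filler}" for relation, filler in sorted(extensions)
            )
            genes_by_class.setdefault(f"{term} and {suffix}", set()).add(gene)
    return genes_by_class


# Placeholder identifiers; the transcript does not give the real ones.
folded = fold([
    ("sty1", "GO:PLACEHOLDER_A", (("has_input", "pap1"),)),
    ("pap1", "GO:PLACEHOLDER_A", ()),
])
for label, genes in folded.items():
    print(label, sorted(genes))
```

An enrichment tool could then treat the folded labels as extra classes alongside the ordinary GO terms.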

Challenge: pre vs post composition
Curator question: do I…
– Request a pre-composed term via TermGenie[*]?
– Post-compose using annotation extensions?
[*] See Heiko's TermGenie talk tomorrow & poster #33

Challenge: pre vs post composition
Curator question: do I…
– Request a pre-composed term via TermGenie?
– Post-compose using annotation extensions?
From a computational perspective:
– It doesn't matter, we're using OWL
– 40% of GO terms have OWL equivalence axioms
Diagram labels:
– protein localization [GO: ]
– nucleus [GO: ]
– end_location
– protein localization to nucleus [GO: ]
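
The interchangeability of the two styles can be illustrated with a lookup table that stands in for OWL equivalence axioms. This is a minimal Python sketch, not a reasoner: a real implementation would use an OWL reasoner over the ontology. The table holds only the single example from the slide, and the function and variable names are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Tuple

Differentia = FrozenSet[Tuple[str, str]]  # set of (relation, filler) pairs

# Stand-in for equivalence axioms: (parent term, differentia) -> pre-composed term.
EQUIVALENCES: Dict[Tuple[str, Differentia], str] = {
    ("protein localization", frozenset({("end_location", "nucleus")})):
        "protein localization to nucleus",
}


def precomposed_equivalent(
    term: str, extensions: Iterable[Tuple[str, str]]
) -> Optional[str]:
    """Return the pre-composed term equivalent to a post-composed description, if any."""
    return EQUIVALENCES.get((term, frozenset(extensions)))


print(precomposed_equivalent("protein localization", [("end_location", "nucleus")]))
print(precomposed_equivalent("protein localization", [("end_location", "vacuole")]))
```

When no equivalent term exists the function returns None, and the post-composed description simply stays as it is.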

Curation Challenges
Manual Curation
– Fewer terms, but more degrees of freedom
– Curator consistency
    OWL constraints can help
Automated annotation
– Phylogenetic propagation
– Text processing and NLP

Similar approaches and future directions
Post-composition has been used extensively for phenotype annotation
– ZFIN [poster#95]
– Phenoscape [next talk]
Future:
– A more expressive model that bridges GO with pathway representations

Conclusions
Description space is huge
– Context is important
– Not appropriate to make a term for everything
– OWL allows us to mix and match pre and post composition
Number of extension annotations is growing
Annotation extensions represent an untapped opportunity for tool developers

Acknowledgments
GO Consortium, model organism and UniProtKB curators
GO Directors
PomBase developers:
– Mark McDowell, Kim Rutherford
Funding
– GO Consortium NIH 5P41HG
– UniProtKB GOA NHGRI U41HG
– British Heart Foundation grant SP/07/007/23671
– Kidney Research UK RP26/2008
– PomBase - Wellcome Trust WT090548MA
– MGD NHGRI HG000330