JSON exchange format

Current GO annotation download options
Tab-separated
  – GAF
  – GPAD/GPI (not available yet)
XML
  – Pseudo RDF/XML (circa 2001)
Relational
  – MySQL dump (circa 2001)

The problem with tabular files
Mapping everything to simple TSVs hinders the evolution of the GO
Benefits of TSVs:
  – Ideal for a simplified view of some subset of our data/knowledge
Problems with TSVs:
  – Ad-hoc syntax (e.g. multivalued columns) is not good for robust software engineering
  – Bad fit for complex graph-like or nested representations (modular annotation, aka “lego”)
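
To make the "ad-hoc syntax" point concrete, here is a minimal sketch of what a consumer has to write for a single multivalued column. It assumes the GAF annotation-extension convention (column 16), where a pipe separates alternatives and a comma separates `relation(ID)` expressions that hold together. The function name and the sample identifiers are hypothetical.

```python
import re

# One relation(ID) expression, e.g. occurs_in(CL:0000002)
EXPR = re.compile(r"^(\w+)\(([^()]+)\)$")

def parse_extension(cell):
    """Parse a GAF-style annotation-extension cell into nested lists.

    Outer list: alternatives (pipe-separated).
    Inner list: (relation, id) pairs that hold together (comma-separated).
    """
    groups = []
    for alternative in filter(None, cell.split("|")):
        pairs = []
        for expr in alternative.split(","):
            match = EXPR.match(expr.strip())
            if match is None:
                raise ValueError(f"malformed extension: {expr!r}")
            pairs.append((match.group(1), match.group(2)))
        groups.append(pairs)
    return groups

# Made-up identifiers, for illustration only.
cell = "has_input(CHEBI:0000001),occurs_in(CL:0000002)|occurs_in(UBERON:0000003)"
print(parse_extension(cell))
```

Every multivalued column needs its own small parser of this kind, which is the cost a nested format avoids.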

JSON as an alternative to TSVs
What is JSON?
  – Data exchange format for simple data structures
  – Allows nesting – crucial for modular annotations
  – De facto standard for web applications
  – Easy to use programmatically
JSON-LD
  – A context maps JSON keys to IRIs, which makes a JSON file also an RDF file
  – The JSON is ‘semantic’

Example (LEGO)
{
  type: "GO:nnnn",                       ## MF
  part_of: { type: "GO:nnnn" },          ## BP
  occurs_in: { type: "GO:nnnn" },        ## CC
  enabled_by: { type: "UniProtKB:nnn" }, ## mandatory
  describedBy: {
    reference: "PMID:123456",
    evidence: { type: "ECO: ", with: "XXXX" }
  }
}

Basic MF annotations
{
  type: "GO:nnnn",                       ## MF
  part_of: { type: "GO:nnnn" },          ## BP
  occurs_in: { type: "GO:nnnn" },        ## CC
  enabled_by: { type: "UniProtKB:nnn" }, ## mandatory
  describedBy: {
    reference: "PMID:123456",
    evidence: { type: "ECO: ", with: "XXXX" }
  }
}

Basic BP annotations
{
  type: "GO: ",                          ## MF
  part_of: { type: "GO:nnnn" },          ## BP
  occurs_in: { type: "GO:nnnn" },        ## CC
  enabled_by: { type: "UniProtKB:nnn" }, ## mandatory
  describedBy: {
    reference: "PMID:123456",
    evidence: { type: "ECO: ", with: "XXXX" }
  }
}

Nesting
{
  type: "GO:nnnn",                       ## MF
  has_input: "CHEBI:nnnn",               ## (same as c16)
  part_of: { type: "GO:nnnn" },          ## BP
  occurs_in: {                           ## CC
    type: "GO:nnnn",
    part_of: {
      type: "CL:nnnnn",
      part_of: { type: "UBERON:nnnn" }
    }
  },
  enabled_by: { type: "UniProtKB:nnn" },
  describedBy: {
    reference: "PMID:123456",
    evidence: { type: "ECO: ", with: "XXXX" }
  }
}
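
The nested location above can be handled with ordinary dictionary access. The following is a sketch only: it keeps the slide's placeholder identifiers ("nnnn" is not a real ID), and `location_chain` is a hypothetical helper name, not part of any GO tool.

```python
import json

# Placeholder identifiers are kept from the slide; they are not real IDs.
annotation = {
    "type": "GO:nnnn",                   # MF
    "has_input": "CHEBI:nnnn",
    "part_of": {"type": "GO:nnnn"},      # BP
    "occurs_in": {                       # CC, nested in a cell type and an anatomical structure
        "type": "GO:nnnn",
        "part_of": {
            "type": "CL:nnnnn",
            "part_of": {"type": "UBERON:nnnn"},
        },
    },
    "enabled_by": {"type": "UniProtKB:nnn"},
}

def location_chain(node):
    """Follow nested part_of links from a location, innermost type first."""
    chain = []
    while node is not None:
        chain.append(node["type"])
        node = node.get("part_of")
    return chain

print(json.dumps(annotation, indent=2))
print(location_chain(annotation["occurs_in"]))
```

No format-specific parsing is needed here: the nesting that a TSV would have to encode in a column syntax is carried by the data structure itself.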

Connecting
{
  id: "GOC:ann12345",
  type: "GO:nnnn",                       ## MF
  has_input: "CHEBI:nnnn",               ## (same as c16)
  part_of: { type: "GO:nnnn" },          ## BP
  occurs_in: {                           ## CC
    type: "GO:nnnn",
    part_of: {
      type: "CL:nnnnn",
      part_of: { type: "UBERON:nnnn" }
    }
  },
  enabled_by: { type: "UniProtKB:nnn" },
  describedBy: {
    reference: "PMID:123456",
    evidence: { type: "ECO: ", with: "XXXX" }
  },
  directly_activates: "GOC:ann9876"
}
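
Because units refer to each other by `id`, a consumer can rebuild the activity network by resolving those references. A minimal sketch, assuming a list of units in memory: only the two GOC ids come from the slide, and `build_edges` is a hypothetical helper.

```python
# Hypothetical pair of annotation units; only the two GOC ids come from the slide.
units = [
    {"id": "GOC:ann12345", "type": "GO:nnnn", "directly_activates": "GOC:ann9876"},
    {"id": "GOC:ann9876", "type": "GO:nnnn"},
]

def build_edges(units, relation="directly_activates"):
    """Return (source_id, relation, target_id) edges, rejecting dangling targets."""
    known = {unit["id"] for unit in units}
    edges = []
    for unit in units:
        target = unit.get(relation)
        if target is None:
            continue
        if target not in known:
            raise KeyError(f"{unit['id']} points at unknown unit {target}")
        edges.append((unit["id"], relation, target))
    return edges

print(build_edges(units))
```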

JSON-LD Schema: mapping to RDF
{
  id: "GOC:ann12345",
  type: "GO:nnnn",                       ## MF
  has_input: "CHEBI:nnnn",               ## (same as c16)
  part_of: { type: "GO:nnnn" },          ## BP
  occurs_in: {                           ## CC
    type: "GO:nnnn",
    part_of: {
      type: "CL:nnnnn",
      part_of: { type: "UBERON:nnnn" }
    }
  },
  enabled_by: { type: "UniProtKB:nnn" },
  describedBy: {
    reference: "PMID:123456",
    evidence: { type: "ECO: ", with: "XXXX" }
  },
  directly_activates: "GOC:ann9876"
}
{
  "GO": "...",
  "type": "rdf:type",
  "has_input": "RO: ",
  "part_of": "BFO: ",
  "occurs_in": "BFO: "
}

JSON-LD Schema: mapping to RDF
{
  id: "GOC:ann12345",
  type: "GO:nnnn",                       ## MF
  has_input: "CHEBI:nnnn",               ## (same as c16)
  part_of: { type: "GO:nnnn" },          ## BP
  occurs_in: {                           ## CC
    type: "GO:nnnn",
    part_of: {
      type: "CL:nnnnn",
      part_of: { type: "UBERON:nnnn" }
    }
  },
  enabled_by: { type: "UniProtKB:nnn" },
  describedBy: {
    reference: "PMID:123456",
    evidence: { type: "ECO: ", with: "XXXX" }
  },
  directly_activates: "GOC:ann9876"
}
{
  "GO": "...",
  "type": "rdf:type",
  "has_input": "RO: ",
  "part_of": "BFO: ",
  "occurs_in": "BFO: "
}
iron ion transmembrane transporter activity enabled by fip1
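
The idea behind the context can be illustrated in plain Python: each JSON key is looked up in a mapping to get a predicate, and nested objects become further nodes. This is a rough sketch, not a conformant JSON-LD processor. The RO/BFO identifiers are missing from the transcript, so the `example.org` IRIs below are stand-ins, and `to_triples` is a hypothetical name.

```python
import itertools

# Hypothetical context: the slide's RO/BFO identifiers are missing from the
# transcript, so example.org IRIs stand in for them here.
CONTEXT = {
    "type": "rdf:type",
    "has_input": "http://example.org/rel/has_input",
    "part_of": "http://example.org/rel/part_of",
    "occurs_in": "http://example.org/rel/occurs_in",
}

def to_triples(node, counter=None):
    """Flatten a nested annotation into (subject, predicate, object) triples.

    Nodes without an "id" get a blank-node label; keys absent from CONTEXT
    are skipped, much as JSON-LD drops terms that have no mapping.
    """
    counter = counter if counter is not None else itertools.count()
    subject = node.get("id") or f"_:b{next(counter)}"
    triples = []
    for key, value in node.items():
        if key == "id" or key not in CONTEXT:
            continue
        if isinstance(value, dict):
            child_subject, child_triples = to_triples(value, counter)
            triples.append((subject, CONTEXT[key], child_subject))
            triples.extend(child_triples)
        else:
            triples.append((subject, CONTEXT[key], value))
    return subject, triples

unit = {"id": "GOC:ann12345", "type": "GO:nnnn", "part_of": {"type": "GO:nnnn"}}
for triple in to_triples(unit)[1]:
    print(triple)
```

The JSON stays readable for web developers, while the same file yields triples for RDF tooling.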

Next steps
Develop spec further
  – SVN/trunk/experimental/lego/docs/lego-json.md
Write converters
Document and publish
That should be it for the next ten years

Ontology infrastructure update

GO and RHEA

Goals
Map to and align with RHEA
Leverage mappings
  – Generate CHEBI logical definitions
  – Derive mappings to other resources (e.g. EC)
  – Automate development of parts of MF ontology
  – Automate F->P links for reactions
TermGenie for catalytic activities

RHEA
Represents biochemical reactions
  – Up to 4 IDs per reaction, one per direction: L->R, L<-R, L<->R, L ? R
Each side of the reaction is mapped to CHEBI
No hierarchy
  – Doesn’t do ‘generic’ reactions
Mapped to EC
No names/labels for reactions
Available as BioPAX

Generate mappings to RHEA
Term                        mapped   total   %
GO: catalytic activity                       %
GO: transporter activity                     %
GO: antioxidant activity    13       24      54%

Mappings should be 1:1
Previously many RHEA mappings were ‘rough matches’
  – 1 to many
  – many to 1
  – many to many
  – We need precise mappings to exploit reasoning
Becky cleaned these up
  – 68 remaining
  – Make exception for NAD(P)H/+
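
Finding the mappings that are not 1:1 is a simple counting exercise. The sketch below assumes the mappings are available as (GO id, RHEA id) pairs. The function name and the sample ids are hypothetical, and it does not implement the NAD(P)H/+ exception.

```python
from collections import defaultdict

def classify_mappings(pairs):
    """Label each (go_id, rhea_id) pair as 1:1, 1:many, many:1 or many:many.

    The label reads "GO terms : RHEA reactions" for the group the pair is in.
    """
    go_to_rhea = defaultdict(set)
    rhea_to_go = defaultdict(set)
    for go_id, rhea_id in pairs:
        go_to_rhea[go_id].add(rhea_id)
        rhea_to_go[rhea_id].add(go_id)

    labels = {}
    for go_id, rhea_id in pairs:
        left = "1" if len(rhea_to_go[rhea_id]) == 1 else "many"
        right = "1" if len(go_to_rhea[go_id]) == 1 else "many"
        labels[(go_id, rhea_id)] = f"{left}:{right}"
    return labels

# Made-up identifiers, for illustration only.
pairs = [("GO:a", "RHEA:1"), ("GO:b", "RHEA:2"), ("GO:b", "RHEA:3"),
         ("GO:c", "RHEA:4"), ("GO:d", "RHEA:4")]
for pair, label in classify_mappings(pairs).items():
    print(pair, label)
```

Anything not labelled 1:1 is a candidate for manual clean-up.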

Translation to OWL
Translate RHEA BioPAX to our OWL model
Challenge: bidirectionality
  – We can’t use has_input (RO: ) and has_output (RO: )
  – We use direction-neutral relations
Challenge: stoichiometry
  – Difficult to represent explicitly
  – Represent each side of the reaction compositionally

Proposed pipeline
New MF requests go via RHEA
Auto-generate GO term from RHEA
  – To be resolved: generating the name
Place term automatically in hierarchy using reasoning
  – Grouping classes in GO still need logical definitions

The road to modular annotations (aka LEGO): Are we there yet? Bar Harbor 2013

How do we get to here?

Brief overview: What is modular annotation (aka lego)?
We create instances of GO molecular function classes to represent the particular activities in a particular process
These are connected to other instances (triples) using the same relations used in the GO
  – Regulation, part_of, occurs_in, has_input, ...
Thus the model is fully ontological, and takes advantage of the ontology

Questions/Obstacles
How would users view the annotations?
How would annotators create the annotations?
How would we validate the annotations?
Which formats would tools use to exchange the annotations?
How can we seed a large enough set of annotations to be useful?
How does all this help biologists interpret their experimental data?

How would users view the annotations?
New browsing paradigm required
  – Not gene sets any more
Current progress:
  – bin/amigo2/amigo/search/complex_annotation
  – Tabular and network views

Functions are enabled / executed by a molecular entity, in a location, as part of a cellular process, in some larger context

Users can select an ‘annotation unit’ to see its connectivity to other units in the process
bin/amigo2/amigo/complex_annotation/cytokinin-signaling

Locations can be specified at multiple levels of granularity
Query by location

Search and browsing: next steps
Current goals of alpha version in AmiGO 2
  – GOC internal use only
      Loading subset of files in ‘experimental’ folder in SVN
      Track progress on ‘LEGO’ project
      Prototype eventual display for users
      Modular Annotation GO working group
Wider release?
  – Will require
      Larger number of high quality modular annotations
      Ability to satisfy some analysis use case

How would annotators create the annotations?
Prototype phase ( ):
  – Heiko’s LEGO Protégé 4 plugin
      Progress: FEATURE FREEZE – not for general use
Next stage (proposed 2014-):
  – Visual web-based editor
      Could be a plugin to Protein2GO
      Or a standalone tool with an RDF triplestore backend
  – Current progress
      Evaluating JavaScript frameworks – e.g. jsPlumb
      Web Cytoscape is Flash, so we won’t use it

Modular annotation editor: Open questions
How should this be integrated with phylogeny-based editing?
Will a relational database be an appropriate backend or should we jump straight to an RDF triplestore?

How do we validate the annotations?
High degree of expressivity -> scope for errors and inconsistency
Q: How do we spot errors?
A: We use OWL reasoners
  – Lego is already layered on RDF/OWL, no additional translation required
  – Progress:
      Ontology editors have enriched the ontology with many constraints
      E.g. (part_of some nucleus) DisjointWith (part_of some cytoplasm)
      This has already proven useful in the prototyping phase
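
The kind of error the disjointness constraint catches can be shown with a toy check. This is only a stand-in for what an OWL reasoner does: it looks at directly asserted `part_of` locations and infers nothing. The function name, the instance names and the data layout are assumptions.

```python
# Toy stand-in for a reasoner check; a real pipeline would hand the RDF/OWL
# model to an OWL reasoner instead. Class names here are plain labels.
DISJOINT_LOCATIONS = [{"nucleus", "cytoplasm"}]

def find_location_clashes(part_of_assertions):
    """Return instances asserted to be part_of two mutually disjoint locations.

    part_of_assertions: iterable of (instance, location) pairs.
    """
    locations = {}
    for instance, location in part_of_assertions:
        locations.setdefault(instance, set()).add(location)

    clashes = []
    for instance, found in sorted(locations.items()):
        for disjoint in DISJOINT_LOCATIONS:
            if disjoint <= found:
                clashes.append((instance, tuple(sorted(disjoint))))
    return clashes

# Hypothetical instances: unit1 is placed in both locations, unit2 is fine.
assertions = [("unit1", "nucleus"), ("unit1", "cytoplasm"), ("unit2", "nucleus")]
print(find_location_clashes(assertions))
```

A reasoner would additionally use the ontology's hierarchy and axioms, so it would also flag clashes that only appear after inference.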

Which formats would tools use to exchange the annotations?
LEGO is just RDF/OWL
Express directly as RDF triples
  – See SVN/trunk/experimental/lego/owl/
Different syntaxes available (all use the same model):
  – RDF/XML
  – RDF-Turtle
What about simpler formats?

How can we seed a large enough set of annotations to get the ball rolling?
Strategies:
  – Translate pathway databases
  – Use the GO ontology and existing annotations
  – Text mining?

Generating complex annotations from BioPAX
All pathway databases export to BioPAX
  – BioPAX is RDF/XML, but uses a different schema than Lego
We can implement transforms
  – Every bp:BiochemicalReaction -> GO MF instance (aka annotation unit)
  – Use participants to infer more specific GO MF
  – Progress: prototype implementation
      Working on complexes

FAIL!

Generating complex annotations from GAFs
Start with a BP in a species
  – Exclude grouping classes
Make use of axioms in ontology to decompose process into its necessary parts
  – E.g. every ‘iron assimilation by reduction and transport’ has_part some ‘iron ion transmembrane transport’
Take gene products for that BP
  – Rank most likely MF
  – Connect up to BP instances
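
The steps above can be sketched as follows. The slides do not say how functions are ranked, so the heuristic here (prefer a function whose F->P link points at a necessary part of the process) is an assumption, as are the function name, the argument layout and the "hypothetical other activity" label.

```python
def seed_units(process, has_part, function_part_of, bp_genes, gene_functions):
    """Propose one annotation unit per gene product annotated to a process.

    has_part:         {process: [its necessary part processes]}
    function_part_of: {function: the process it is part of (F->P link)}
    bp_genes:         {process: [gene products annotated to it]}
    gene_functions:   {gene product: [functions it is annotated to]}
    """
    necessary = set(has_part.get(process, []))
    units = []
    for gene in bp_genes.get(process, []):
        functions = gene_functions.get(gene, [])
        # Assumed ranking: functions linked to a necessary part come first,
        # ties are broken alphabetically.
        ranked = sorted(
            functions,
            key=lambda mf: (function_part_of.get(mf) not in necessary, mf),
        )
        if not ranked:
            continue
        best = ranked[0]
        part = function_part_of.get(best)
        units.append({
            "type": best,
            "enabled_by": gene,
            "part_of": part if part in necessary else process,
        })
    return units

bp = "iron assimilation by reduction and transport"
units = seed_units(
    bp,
    has_part={bp: ["iron ion transmembrane transport"]},
    function_part_of={
        "iron ion transmembrane transporter activity": "iron ion transmembrane transport",
    },
    bp_genes={bp: ["fip1"]},
    gene_functions={"fip1": ["hypothetical other activity",
                             "iron ion transmembrane transporter activity"]},
)
print(units)
```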

Has_part

Gene products involved in process

Most likely functions to be part of process

Use annotation extensions
Include: translation of PPI database (BioGRID) to GAF

How does all this help biologists interpret their experimental data?

Next steps
MAGO-WG
Content meetings

Common Annotation Tool and LEGO

How do we get there?

Basic idea
Format should allow exchange of basic annotations
  – GAF 1
  – GAF 2 / GPAD
Extensible to complex annotations
  – Aka ‘LEGO’
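
A basic annotation could be carried in the same shape as a complex one. The sketch below converts one GAF 2 line into the structure used in the earlier slides, as one possible reading of them and not a published converter. It assumes the standard GAF 2 column order (DB, object ID, symbol, qualifier, GO ID, reference, evidence code, with/from, aspect). `MF_PLACEHOLDER` and `gaf_row_to_unit` are hypothetical names, and the GAF evidence code is kept as-is because no ECO mapping is given.

```python
import json

MF_PLACEHOLDER = "GO:nnnn"  # stands in for the MF id the BP/CC slides leave unspecified

def gaf_row_to_unit(line):
    """Turn one tab-separated GAF 2 line into a basic annotation dict."""
    cols = line.rstrip("\n").split("\t")
    db, object_id, go_id = cols[0], cols[1], cols[4]
    reference, evidence, with_from, aspect = cols[5], cols[6], cols[7], cols[8]

    unit = {"type": MF_PLACEHOLDER, "enabled_by": {"type": f"{db}:{object_id}"}}
    if aspect == "F":
        unit["type"] = go_id
    elif aspect == "P":
        unit["part_of"] = {"type": go_id}
    elif aspect == "C":
        unit["occurs_in"] = {"type": go_id}
    else:
        raise ValueError(f"unknown aspect {aspect!r}")

    evidence_block = {"type": evidence}  # GAF evidence code, not yet an ECO id
    if with_from:
        evidence_block["with"] = with_from
    unit["describedBy"] = {"reference": reference, "evidence": evidence_block}
    return unit

# Placeholder row built from the slide's example values.
row = "\t".join(["UniProtKB", "nnn", "sym", "", "GO:nnnn",
                 "PMID:123456", "IDA", "", "P"])
print(json.dumps(gaf_row_to_unit(row), indent=2))
```

Complex annotations then extend the same object with nesting and `id` references, so one format covers both cases.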