BioGateway: an RDF store for supporting Systems Biology

Slides:



Advertisements
Similar presentations
Dr. Leo Obrst MITRE Information Semantics Information Discovery & Understanding Command & Control Center February 6, 2014February 6, 2014February 6, 2014.
Advertisements

Clearing your Cookies Google Chrome A short guide to help you navigate our website faster Brought to you by:
RDF and RDB 1 Some slides adapted from a presentation by Ivan Herman at the Semantic Technology & Business Conference, 2012.
Building and Analyzing Social Networks Web Data and Semantics in Social Network Applications Dr. Bhavani Thuraisingham February 15, 2013.
CSCI 572 Project Presentation Mohsen Taheriyan Semantic Search on FOAF profiles.
CMSC838 Project Presentation An Ontology-based Approach for Managing Software Components by Vladimir Kolovski.
TAMBIS Transparent Access to Multiple Biological Information Sources.
1 Introduction to OBIEE: Learning to Access, Navigate, and Find Data in the SWIFT Data Warehouse Lesson 8: Printing and Exporting an OBIEE Analysis This.
The Cell Cycle Ontology Erick Antezana Dept. of Plant Systems Biology Flanders Institute for Biotechnology (VIB) / Ghent University Ghent - BELGIUM
Connecting Diverse Web Search Facilities Udi Manber, Peter Bigot Department of Computer Science University of Arizona Aida Gikouria - M471 University of.
MIS2502: Data Analytics MySQL and SQL Workbench David Schuff
Triple Stores.
Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.
RDF Triple Stores Nipun Bhatia Department of Computer Science. Stanford University.
Rajashree Deka Tetherless World Constellation Rensselaer Polytechnic Institute.
Grant Number: IIS Institution of PI: Arizona State University PIs: Zoé Lacroix Title: Collaborative Research: Semantic Map of Biological Data.
Master Informatique 1 Semantic Technologies Part 11Direct Mapping Werner Nutt.
IPAW'08 – Salt Lake City, Utah, June 2008 Exploiting provenance to make sense of automated decisions in scientific workflows Paolo Missier, Suzanne Embury,
Metadata. Generally speaking, metadata are data and information that describe and model data and information For example, a database schema is the metadata.
Copyright OpenHelix. No use or reproduction without express written consent1.
Quality views: capturing and exploiting the user perspective on data quality Paolo Missier, Suzanne Embury, Mark Greenwood School of Computer Science University.
PHS / Department of General Practice Royal College of Surgeons in Ireland Coláiste Ríoga na Máinleá in Éirinn Knowledge representation in TRANSFoRm AMIA.
BBN Technologies Copyright 2009 Slide 1 The S*QL Plugin for Cytoscape Visual Analytics on the Web of Linked Data Rusty (Robert J.) Bobrow Jeff Berliner,
Of 33 lecture 1: introduction. of 33 the semantic web vision today’s web (1) web content – for human consumption (no structural information) people search.
MyGrid/Taverna Provenance Daniele Turi University of Manchester OMII f2f Meeting, London, 19-20/4/06.
Image Classification for Automatic Annotation
Natural Language Interfaces to Ontologies Danica Damljanović
A Semantic Web Approach for the Third Provenance Challenge Tetherless World Rensselaer Polytechnic Institute James Michaelis, Li Ding,
Conceptualization Relational Model Incomplete Relations Indirect Concept Reflection Entity-Relationship Model Incomplete Relations Two Ways of Concept.
CMPE58H Project Progress Presentation QAPoint H.Tuğçe Özkaptan Gözde Kaymaz Serkan Kırbaş
Steven Perry Dave Vieglais. W a s a b i Web Applications for the Semantic Architecture of Biodiversity Informatics Overview WASABI is a framework for.
XP New Perspectives on Microsoft Office Access 2003, Second Edition- Tutorial 8 1 Microsoft Office Access 2003 Tutorial 8 – Integrating Access with the.
Advanced Gene Selection Algorithms Designed for Microarray Datasets Limitation of current feature selection methods: –Ignores gene/gene interaction: single.
1 Copyright © 2014 Tata Consultancy Services Limited Assessment Knowledge Center – Item Creation Training Document.
Chapter 04 Semantic Web Application Architecture 23 November 2015 A Team 오혜성, 조형헌, 권윤, 신동준, 이인용.
Linked Open Data for European Earth Observation Products Carlo Matteo Scalzo CTO, Epistematica epistematica.
GoRelations: an Intuitive Query System for DBPedia Lushan Han and Tim Finin 15 November 2011
RDF based on Integration of Pathway Database and Gene Ontology SNU OOPSLA LAB DongHyuk Im.
Describing and Annotating Experimental Data: Hands On.
EBI is an Outstation of the European Molecular Biology Laboratory. Semantic Interoperability Framework Sarala M. Wimalaratne (RICORDO project)
Dmitry Mouromtsev, Aleksei Romanov, Dmitry Volchek and Fedor Kozlov Laboratory ITMO University, St. Petersburg, Russia “Metadata Extraction from.
Lesson 4 Basic Text Formatting. Objectives ● I ● In this tutorial we will: ● Introduce Wiki Syntax ● Learn how to Bold and Italicise text, and add Headings.
Semantic Web Application Patterns: Pipelines, Versioning and Validation David Booth, Ph.D. (Consultant) W3C Linked Enterprise Data Patterns Workshop 7-Dec-2011.
Jens Ziegler, Markus Graube, Johannes Pfeffer, Leon Urbas
Components.
Introduction to OBIEE:
Online BIOS QTL atlases
Triple Stores.
RDF and RDB 1 Some slides adapted from a presentation by Ivan Herman at the Semantic Technology & Business Conference, 2012.
Linked Data browsers.
System Structures Identification
Tutorial for using Case It for bioinformatics analyses
Analyzing and Securing Social Networks
Ontology Partition for Browsing
Wikitology Wikipedia as an Ontology
Triple Stores.
OBO Foundry Principles
CCO: concept & current status
Microsoft Office Access 2003
Microsoft Office Access 2003
Analysis models and design models
UmbrellaDB v0.5 Project Report #3
MIS2502: Data Analytics MySQL and SQL Workbench
Explore Evolution: Instrument for Analysis
Causal Models Lecture 12.
Information Networks: State of the Art
CC La Web de Datos Primavera 2018 Lecture 8: SPARQL [1.1]
Triple Stores.
Information - the lifeblood of the business
Executive Reports, Instructions and Documentation
Presentation transcript:

BioGateway: an RDF store for supporting Systems Biology Erick Antezana Dept. of Plant Systems Biology VIB/University of Ghent erick.antezana@psb.ugent.be

Contents Systems Biology Data integration and exploitation BioGateway Concluding remarks Next steps

The four steps of Systems Biology Define all of the components of the system, build model, simulate and predict Systematically perturb and monitor components of the system Reconcile the experimentally observed responses with those predicted by the model Design and perform new perturbation experiments to distinguish between multiple or competing model hypotheses Kitano, Science, 2002

Systems Biology Cycle Mathematical model Data analysis Information extraction New information to model Model Refinement Systems Biology Cycle Dynamical simulations and hypothesis formulation Experimental design Experimentation, Data generation

Semantic Knowledge Base Semantic Systems Biology Cycle Information extraction, Knowledge formalization Consistency checking Querying Automated reasoning Semantic Systems Biology Cycle Experimentation, Data generation Hypothesis formulation Experimental design

BioGateway Uses Virtuoso Open Server Open Source software that can host a triple store Can build this from RDF files Has a DB backend Supports SPARQL* language which allows querying RDF data (graphs) Its syntax is similar to that of SQL. http://www.openlinksw.com/virtuoso/ *http://www.w3.org/TR/rdf-sparql-query/

BioGateway Some motivating questions Cancer: what candidate genes are involved in cell cycle control, S-phase to G2 transition, DNA damage response and skin cancer? Gastrin: what genes correlate with cancer and the use of anti-acids, and are involved in the gastrin response, and are associated with cell cycle control? Inflammation: give me genes that are mentioned in the context of high carbohydrate intake and play a role in (process #1 to be named) and are within x steps from a GO ontology term related to inflammation

BioGateway The homepage of SSB, including BioGateway as a first step towards this idea.

Use the buttons for prefixes and other constructs Type a query here. Click Run!

Select a query in the drop-down box Click on Run to execute the query The query editor Click on Run to execute the query

A library of queries The drop-down box contains (so far) 31 queries: 11 protein-centric biological queries: The role of proteins in diseases Their interactions Their functions Their locations 20 ontological queries: Browsing abilities in RDF like getting the neighborhood, the path to the root, the children,... Meta-information about the ontologies, graphs, relations Queries to show the possibilities of SPARQL on BioGateway, like counting, filtering, combining graphs,...

Parameterizing the queries made easy.

The parameters are indicated in red. All the queries are explained in a tutorial For every query the name, the parameters and the function are indicated at the top. The parameters are indicated in red.

The results appear in a separate window

The neighborhood of the human protein 1443F in the RDF-graph The resulting triples (arrows) are represented as a small grammatical sentence: subject, predicate, object. Outgoing arrows Incoming arrows

Labeled arrows to extra information Limit Execute The SPARQL-endpoint The prefixes The query without the prefixes The URI's in blue. The results: 9 proteins Labeled arrows to extra information

998 RDF-files can be downloaded from the Resources page The graph names can be used to query or combine individual graphs for quicker answers or more specific information

The RDF export specifications The RDF is automatically generated with onto-perl, our own ontology API. Many choices for the RDF specifications were made during the testing of the queries. The resources are available either as part of an integrated graph or as individual graphs. BioMetarel, a relation ontology, provides labels for the URIs of the relations. OWL-RDF was avoided because it is too verbose. We preferred RDF optimized for querying.

Metarel Metarel is a generic ontological hierarchy for relation types, consistent with OBOF and RDF. It includes meta-information like transitivity, reflexivity and composition. BioMetarel includes all the biological relation types that are used in BioGateway. We are still testing the exploitation of composition, like A located in B and B part of C, gives A located in C.

Transitive closure graphs A transitive closure was constructed for the subsumption relation (is a) and the partonomy relation (part of)‏ If A is a B, and B is a C, then A is a C is also added to the graph. Many interesting queries can be done in a performant way with it, like 'What are the proteins that are located in the cell nucleus or any subpart thereof?' The graphs without transitive closure are available for querying as well.

Conclusions / Results BioGateway: RDF store for Biosciences Data integration pipeline: BioGateway Queries and knowledge sources and system design go hand-in-hand (user interaction) Existing integration obstacles due to: diversity of data formats lack of formalization approaches Calls for ‘foundry’ type initiative for RDF

Next steps More data sources (e.g. Nutrigenomics, pathways etc.) RDF rules User interface development Reasoning… …

Acknowledgements http://ww.semantic-systems-biology.org Martin Kuiper (NTNU, NO) Vladimir Mironov (NTNU, NO) Mikel Egaña (U Manchester, UK) Robert Stevens (U Manchester, UK) Ward Blonde (U Ghent, BE) Bernard De Baets (U Ghent, BE) Alan Ruttenberg (Science Commons, US) Alistair Rutherford (www.netthreads.co.uk) Users http://ww.semantic-systems-biology.org