Spreadsheets to OWL with Populous 8/12/2011 Mikel Egaña Aranguren 3205 School of Computer Science Universidad Politécnica de Madrid (UPM) 28660 Boadilla.

Slides:



Advertisements
Similar presentations
Ontology-Based Computing Kenneth Baclawski Northeastern University and Jarg.
Advertisements

Workshop on Knowledge Technologies within the 6th Framework MayA. Gómez-Pérez Certifying Knowledge and Knowledge Technologies ( based on Sig3 activities.
The use of Ontology in Organising and Managing Protein Family Resources Katy Wolstencroft, University Of Manchester.
© Geodise Project, University of Southampton, Semantic Web based Content Enrichment and Knowledge Reuse in e-Science.
RightField The Semantic Annotation of Experimental Data using Spreadsheets, The Semantic Annotation of Experimental Data using Spreadsheets, Katy Wolstencroft,
GOAT: The Gene Ontology Annotation Tool Dr. Mike Bada Department of Computer Science University of Manchester
Who am I Gianluca Correndo PhD student (end of PhD) Work in the group of medical informatics (Paolo Terenziani) PhD thesis on contextualization techniques.
 Goals Unambiguous description of how the investigation was performed Consistent annotation, powerful queries and data integration  Details NOT model.
Chapter 8: Web Ontology Language (OWL) Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.
1 SWE Introduction to Software Engineering Lecture 5.
Annotating Documents for the Semantic Web Using Data-Extraction Ontologies Dissertation Proposal Yihong Ding.
A Structure Editor For PAL Constraints Anton An July 18, 2001.
The Cell Cycle Ontology Erick Antezana Dept. of Plant Systems Biology Flanders Institute for Biotechnology (VIB) / Ghent University Ghent - BELGIUM
Information Extraction from Documents for Automating Softwre Testing by Patricia Lutsky Presented by Ramiro Lopez.
1 Cui Tao PhD Dissertation Defense Ontology Generation, Information Harvesting and Semantic Annotation For Machine-Generated Web Pages.
Advanced Data Mining and Integration Research for Europe ADMIRE – Framework 7 ICT ADMIRE Overview European Commission 7 th.
RightField Rich Annotation of Experimental Biology through Stealth Using Spreadsheets Katy Wolstencroft, Stuart Owen, Matthew Horridge, Olga Krebs, Wolfgang.
GO Ontology Editing Workshop: Using Protege and OWL Hinxton Jan 2012.
Editing Description Logic Ontologies with the Protege OWL Plugin.
(C) 2013 Logrus International Practical Visualization of ITS 2.0 Categories for Real World Localization Process Part of the Multilingual Web-LT Program.
DartGrid Browser-based mapping tool of SQL to RDF Point Template Zhejiang University & OpenLink Software.
CASE Tools And Their Effect On Software Quality Peter Geddis – pxg07u.
Ontologies: Making Computers Smarter to Deal with Data Kei Cheung, PhD Yale Center for Medical Informatics CBB752, February 9, 2015, Yale University.
The Health Ontology Mapper (HOM) Method Clinical & Translational Science Ontology Workshop (NCBO/CTSA) April 24, 2012 Rob Wynden - Chief Scientist, Ketty.
Deciding Semantic Matching of Stateless Services Duncan Hull †, Evgeny Zolin †, Andrey Bovykin ‡, Ian Horrocks †, Ulrike Sattler † and Robert Stevens †
Taverna in e-Lico  e-Lico is an EU Project ( ) to create a virtual laboratory for data mining and data-intensive sciences  Main partners: –University.
Taverna and my Grid Basic overview and Introduction Tom Oinn
Using Vocabulary Services in Validation of Water Data May 2010 Simon Cox, JRC Jonathan Yu & David Ratcliffe, CSIRO.
8/11/2011 Web Ontology Language (OWL) Máster Universitario en Inteligencia Artificial Mikel Egaña Aranguren 3205 Facultad de Informática Universidad Politécnica.
RightField: Semantic Enrichment of Systems Biology Data using Spreadsheets Katy Wolstencroft myGrid, SysMO-DB University of Manchester.
Open Biomedical Ontologies. Open Biomedical Ontologies (OBO) An umbrella project for grouping different ontologies in biological/medical field –a repository.
Integrated Development Environment for Policies Anjali B Shah Department of Computer Science and Electrical Engineering University of Maryland Baltimore.
Taverna and my Grid Open Workflow for Life Sciences Tom Oinn
Košice, 10 February Experience Management based on Text Notes The EMBET System Michal Laclavik.
XML and Digital Libraries M. Zubair Department of Computer Science Old Dominion University.
Teranode Tools and Platform for Pathway Analysis Michael Kellen, Solution Manager June 16, 2006.
A School of Information Science, Federal University of Minas Gerais, Brazil b Medical University of Graz, Austria, c University Medical Center Freiburg,
Department of computer science and engineering Two Layer Mapping from Database to RDF Martin Švihla Research Group Webing Department.
Knowledge Representation of Statistic Domain For CBR Application Supervisor : Dr. Aslina Saad Dr. Mashitoh Hashim PM Dr. Nor Hasbiah Ubaidullah.
DAVID R. SMITH DR. MARY DOLAN DR. JUDITH BLAKE Integrating the Cell Cycle Ontology with the Mouse Genome Database.
Quality views: capturing and exploiting the user perspective on data quality Paolo Missier, Suzanne Embury, Mark Greenwood School of Computer Science University.
1 Overview of XSL. 2 Outline We will use Roger Costello’s tutorial The purpose of this presentation is  To give a quick overview of XSL  To describe.
WP2: ONTOLOGY ENRICHMENT METHODOLOGIES Carole Goble (IMG) Robert Stevens (BHIG) Mikel Egaña Aranguren (BHIG) Manchester University Computer Science: IMG:
To Boldly GO… Amelia Ireland GO Curator EBI, Hinxton, UK.
Text Mining & NLP based Algorithm to populate ontology with A-Box individuals and object properties Alexandre Kouznetsov and Christopher J. O. Baker, University.
© University of Manchester Simplifying OWL Learning lessons from Anaesthesia Nick Drummond BioHealth Informatics Group.
User Profiling using Semantic Web Group members: Ashwin Somaiah Asha Stephen Charlie Sudharshan Reddy.
Mining the Biomedical Research Literature Ken Baclawski.
Semantic Web BY: Josh Rachner and Julio Pena. What is the Semantic Web? The semantic web is a part of the world wide web that allows data to be better.
E-LICO An e-Laboratory for Interdisciplinary Collaborative research in data mining and data intensive sciences October 12 th, 2010 Delivering data mining.
Proposed Research Problem Solving Environment for T. cruzi Intuitive querying of multiple sets of heterogeneous databases Formulate scientific workflows.
Using DAML+OIL Ontologies for Service Discovery in myGrid Chris Wroe, Robert Stevens, Carole Goble, Angus Roberts, Mark Greenwood
Ontology domain & modeling extensions. Modeling enhancements: overview Enhancements: – Increased expressivity in ontology – Increased expressivity in.
Approach to building ontologies A high-level view Chris Wroe.
© University of Manchester Creative Commons Attribution-NonCommercial 3.0 unported 3.0 license Quality Assurance, Ontology Engineering, and Semantic Interoperability.
Ontology Engineering Ron Rudnicki Lab #1 - August 26, 2013.
Chapter 04 Semantic Web Application Architecture 23 November 2015 A Team 오혜성, 조형헌, 권윤, 신동준, 이인용.
All Hands Meeting 2006 Fluxion: The ComparaGRID Data Integration Architecture Matthew Pocock, Tony Burdett, Rob Davey, Andrew Gibson, Trevor Paterson.
Describing and Annotating Experimental Data: Hands On.
Rendering XML Documents ©NIITeXtensible Markup Language/Lesson 5/Slide 1 of 46 Objectives In this session, you will learn to: * Define rendering * Identify.
An ontology-supported approach to predict automatically the proteases involved in the generation of peptides Mercedes Arguello Casteleiro 1, Julie Klein.
Mechanisms for Requirements Driven Component Selection and Design Automation 최경석.
Scan, Import, and Automatically file documents to Box Introduction
Product Training Program
Part of the Multilingual Web-LT Program
OBI – Standard Semantic
The Gene Ontology: an evolution
Collaborative RO1 with NCBO
Lab 08 Introduction to Spreadsheets MS Excel
Presentation transcript:

Spreadsheets to OWL with Populous 8/12/2011 Mikel Egaña Aranguren 3205 School of Computer Science Universidad Politécnica de Madrid (UPM) Boadilla del Monte Spain Ontology Engineering Group (OEG) Simon Jupp Functional Genomic Group European Bioinfomatics Institute Wellcome Trust Genome Campus Hinxton UK Robert Stevens Biohealth Informatics Group School of Computer Science University of Manchester Manchester UK

Motivation Engaging life scientists in data annotation and ontology population Protégé and OWL are scary.. Need for simple “form filling” style of knowledge gathering and describing data - so we use spreadsheets. Q1 How do we get people to annotate data in spreadhseets accoridng to ontologies? Q2 How do we transform those spreadsheets into sets of axioms? Populous

Writing ontologies in OWL is hard Especially if one doesn’t know OWL; Hard to do complex patterns of axioms; Hard to be consistent and conform to a style; Hard to re-factor an ontology’s content Doing all this in bulk is tedious and error prone Populous

Separation of concerns Populous All Eukaryotic Cells are either nucleated or anucleate, some cells are multinucleate Knowledge ‘Eukaryotic Cells’ has_nucleation some ‘Nucleation’ ‘Nucleation’ subClassOf {mononucleate, binucleate, polynucleate, anucleate} ‘Eukaryotic Cells’ has_nucleation some ‘Nucleation’ ‘Nucleation’ subClassOf {mononucleate, binucleate, polynucleate, anucleate} Ontologically ‘Eukaryotic Cells’ has_nucleation some ‘Nucleation’ ‘Nucleation’ subClassOf {mononucleate, binucleate, polynucleate, anucleate} ‘Eukaryotic Cells’ has_nucleation some ‘Nucleation’ ‘Nucleation’ subClassOf {mononucleate, binucleate, polynucleate, anucleate} Differentia ‘Eukaryotic Cells’ ‘Nucleation’ Mononuclear phagocytemononucleate Flight Muscle cellmultinucleate Red Blood cellanucleate ‘Eukaryotic Cells’ ‘Nucleation’ Mononuclear phagocytemononucleate Flight Muscle cellmultinucleate Red Blood cellanucleate Real Examples

Ontology patterns Axioms often added in regular ways There are often patterns of axioms for a particular way of representation There are also design patterns – standard well recognised solutions Analogous to software patterns Doing the same thing in the same way… it’s a good thing Populous ‘Protein’ has_molecular_function some ‘Molecular Function’ is_capable_of some ‘Biological Process’ located_in some ‘Cellualr component’ ‘Protein’ has_molecular_function some ‘Molecular Function’ is_capable_of some ‘Biological Process’ located_in some ‘Cellualr component’ Repetative pattern

Some requirements Want consistent axiom generation Want to write axioms according to patterns Separate knowledge gathering from axiom generation Engage domain experts not experts in OWL and/or ontologies Validate content to go into the ontology Do all of this in a familiar environment i.e. spreadsheets Populous

Using spreadsheets Spreadsheets are often used simply to organise data Basic tabulation Saying the same kinds of things repeatedly A very familiar environment Want to capitalise on this… Populous

RightField Populous Semantic Annotation by Stealth

Excel validations Populous

Populous workflow Populous

Populous workflow Populous Load from file or directly from BioPortal

Populous workflow Populous Ontology browser

Populous workflow Populous 1. Select column 2. Select Class in Ontology 3. Select allowed values

Populous workflow Populous Tab completion Syntax Highlighting Multi-value cells Label rendering

Demo 1 Demo of Populous in action. Populous

OWL generation Populous Class: CL: Annotation: rdfs:label ‘Kidney Cell’ EquivalentTo: CL: and OBO_REL:part_of some MAO_ Manchester OWL syntax

Introduction to OPPL Ontology Pre Processor Language (oppl.sf.net) Scripting language to automate the manipulation of OWL ontologies Apply pre-defined very complex OWL modelling automatically Based in Manchester OWL Syntax Populous

OPPL script anatomy OPPL for Populous OPPL script Variable declaration,... SELECT Query,... WHERE Constraint,... BEGIN ADD/REMOVE Axiom,... END;

OPPL script anatomy OPPL for Populous OPPL scriptPopulous OPPL script Variable declaration,... SELECT Query,... WHERE Constraint,... BEGIN ADD/REMOVE Axiom,... END; Variable declaration,... SELECT Query,... WHERE Constraint,... BEGIN ADD/REMOVE Axiom,... END; Variable declaration,... SELECT Query,... WHERE Constraint,... BEGIN ADD/REMOVE Axiom,... END;

OPPL script anatomy OPPL for Populous OPPL scriptPopulous OPPL script Variable declaration,... SELECT Query,... WHERE Constraint,... BEGIN ADD/REMOVE Axiom,... END; Variable declaration,... SELECT Query,... WHERE Constraint,... BEGIN ADD/REMOVE Axiom,... END; Variable declaration,... SELECT Query,... WHERE Constraint,... BEGIN ADD/REMOVE Axiom,... END; ?cell:CLASS, ?parent:CLASS BEGIN ADD ?cell subClassOf ?parent END;

OPPL script anatomy OPPL for Populous ?cell:CLASS, ?parent:CLASS BEGIN ADD ?cell subClassOf ?parent END; Variable Variable type CLASS CONSTANT OBJECTPROPERTY DATAPROPERTY ANNOTATIONPROPERTY INDIVIDUAL OWL expression: Manchester OWL syntax + variables

OPPL for Populous OPPL script anatomy ?cell:CLASS, ?anatomyPart:CLASS, ?partOfRestriction:CLASS = part_of some ?anatomyPart, ?anatomyIntersection:CLASS = createIntersection(?partOfRestriction.VALUES) BEGIN ADD ?cell equivalentTo CL_ and ?anatomyIntersection END; createIntersection createUnion (?var.VALUES) OWL expression =

Creating OPPL script OPPL builder OPPL for Populous

Creating OPPL script OPPL text editor OPPL for Populous

Creating OPPL script OPPL macros OPPL for Populous

Creating OPPL script OPPL patterns OPPL for Populous

More information OPPL publications: OPPL documentation: OPPL patterns: OPPL Manual: OPPL sample scripts:

Populous Demo 2 Demo 2 -converting spreadsheets to OWL using OPPL

Populous Populous and RightField Ontological annotation by stealth Real biological data + high quality meta-data Development of a Kidney and Urinary Pathway Knowledge Base

Populous Demo 3 Demo 3 - Experiment template for data annotation

Populous Acknowledgements RightField Matthew Horridge, Katy Wolstencroft, Stuart Owen, Carole Goble Populous Simon Jupp, Robert Stevens funded by e-LICO EU-FP7 Collaborative Project ( ) Theme ICT- 4.4: Intelligent Content and Semantics and NIH funded NCBO driving biological project program Mikel Egaña Aranguren is funded by the Marie Curie Cofund programme (FP7) OPPL 2 is maintained by Luigi Iannone