A SPARQL extension for generating RDF from heterogeneous formats

A SPARQL extension for generating RDF from heterogeneous formats
Easing access to Semantic Web principles and formalisms for companies, Web services, and constrained devices
Maxime Lefrançois, Antoine Zimmermann, Noorani Bakerally
MINES Saint-Etienne, CNRS, Laboratoire Hubert Curien UMR 5516

Mines Saint-Etienne involved in: 6 countries, 34 partners, 16M€, 160 person-years, coordinated by ENGIE
« Design and develop a global ecosystem of services and smart things collectively capable of ensuring the stability and the energy efficiency in the future energy grid »
Mines Saint-Etienne involved in: T2.2 SEAS Knowledge Model
18/11/2019 M. Lefrançois et al. - A SPARQL extension for generating RDF from heterogeneous formats

« Fostering Uses and Usages of Open Sensor Data in Smart Cities »

A data lake with heterogeneous formats
XML, CSV, JSON, …, EXI, CBOR, …
Image: https://headleaks.com/

Key step: generate some RDF
Sources: XML, CSV, JSON, …, EXI, CBOR, …
Target: RDF Data Model
Image: https://headleaks.com/

Requirements for RDF generation (target: RDF Data Model)
- Transform multiple sources …
- … having heterogeneous formats
- Be extensible to new data formats
- Be easy to use by Semantic Web experts
- Integrate in a typical semantic web engineering workflow
- Be flexible and easily maintainable
- Be fast
Image: https://headleaks.com/

Requirements for RDF generation (target: RDF Data Model)
- Transform multiple sources …
- … having heterogeneous formats
- Be extensible to new data formats
- Be easy to use by Semantic Web experts
- Integrate in a typical semantic web engineering workflow
- Be flexible and easily maintainable
- Be fast
- Transform binary formats as well as textual formats
- Contextualize the transformation with an RDF Dataset
Image: https://headleaks.com/

Existing approaches
RDFizers (see https://www.w3.org/wiki/ConverterToRdf )
- A lot of tools are specific to one or a few formats (44 referenced formats)
- Some frameworks support several/many formats
- Ad hoc methods, little or no control over the structure of the output => may require an additional transformation

Existing approaches
Approaches based on mapping/transformation languages

GRDDL (W3C REC 2007)
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:grddl='http://www.w3.org/2003/g/data-view#'
      grddl:transformation="http://example.com/getAuthor.xsl">
  <head>
    <title>Are You Experienced?</title>
    [...]
</html>
<album xmlns:grddl='http://www.w3.org/2003/g/data-view#'
       grddl:transformation="http://example.org/getAlbum.xsl">
  <artist mbid="">The Jimi Hendrix Experience</artist>
  <name>Are You Experienced?</name>
  ...
</album>

XSPARQL (W3C member submission 2009)

R2RML (W3C REC 2012)
<#TriplesMap2>
    rr:logicalTable <#DeptTableView>;
    rr:subjectMap [
        rr:template "http://data.example.com/department/{DEPTNO}";
        rr:class ex:Department;
    ];
    rr:predicateObjectMap [
        rr:predicate ex:name;
        rr:objectMap [ rr:column "DNAME" ];
    ];
    rr:predicateObjectMap [
        rr:predicate ex:location;
        rr:objectMap [ rr:column "LOC" ];
    ];
    rr:predicateObjectMap [
        rr:predicate ex:staff;
        rr:objectMap [ rr:column "STAFF" ];
    ].
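A minimal Python sketch of what the R2RML mapping above asks a processor to do for one row of the department view: expand the subject template, then emit one triple per mapped column. The sample row and the names expand_template and map_department are hypothetical, and the IRI-safe percent-encoding that R2RML applies to template values is left out for brevity.

```python
import re


def expand_template(template, row):
    """Replace each {COLUMN} placeholder with the value of that column in the row."""
    return re.sub(r"\{(\w+)\}", lambda m: str(row[m.group(1)]), template)


def map_department(row):
    """Emit the triples described by <#TriplesMap2> for a single row (simplified)."""
    subject = expand_template("http://data.example.com/department/{DEPTNO}", row)
    return [
        (subject, "rdf:type", "ex:Department"),
        (subject, "ex:name", row["DNAME"]),
        (subject, "ex:location", row["LOC"]),
        (subject, "ex:staff", row["STAFF"]),
    ]


# Hypothetical row, for illustration only.
sample_row = {"DEPTNO": 10, "DNAME": "ACCOUNTING", "LOC": "NEW YORK", "STAFF": 3}
for triple in map_department(sample_row):
    print(triple)
```

Every subject/predicate/object shape has to be expressed through such term maps, which is the sense in which this family of mapping languages is called subject-centric later in the talk.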

CSVW (W3C REC 2015)
{
  "@context": ["http://www.w3.org/ns/csvw", {"@language": "en"}],
  "url": "tree-ops.csv",
  "dc:title": "Tree Operations",
  "dcat:keyword": ["tree", "street", "maintenance"],
  "dc:publisher": {
    "schema:name": "Example Municipality",
    "schema:url": {"@id": "http://example.org"}
  },
  "dc:license": {"@id": "http://opendefinition.org/licenses/cc-by/"},
  "dc:modified": {"@value": "2010-12-31", "@type": "xsd:date"},
  "tableSchema": {
    "columns": [{
      "name": "GID",
      "titles": ["GID", "Generic Identifier"],
      "dc:description": "An identifier for the operation on a tree.",
      "datatype": "string",
      "required": true
    }, {
      "name": "on_street",
      "titles": "On Street",
      "dc:description": "The street that the tree is on.",
      "datatype": "string"
    }, {
      "name": "species",
      "titles": "Species",
      "dc:description": "The species of the tree."
    }, {
      "name": "trim_cycle",
      "titles": "Trim Cycle",
      "dc:description": "The operation performed on the tree."
    }, {
      "name": "inventory_date",
      "titles": "Inventory Date",
      "dc:description": "The date of the operation that was performed.",
      "datatype": {"base": "date", "format": "M/d/yyyy"}
    }],
    "primaryKey": "GID",
    "aboutUrl": "#gid-{GID}"
  }
}
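To make the role of this metadata concrete, here is a small Python sketch, under simplifying assumptions, of how two of its entries could drive a conversion: "aboutUrl" gives the subject of each row, and the "M/d/yyyy" date format is normalised to an xsd:date lexical form. The reduced schema dictionary, the sample row and the function names are hypothetical; a real CSVW processor also resolves URLs against a base and supports many more formats.

```python
from datetime import datetime

# Reduced copy of the table schema above, keeping only what this sketch uses.
SCHEMA = {
    "columns": [
        {"name": "GID", "datatype": "string"},
        {"name": "inventory_date", "datatype": {"base": "date", "format": "M/d/yyyy"}},
    ],
    "aboutUrl": "#gid-{GID}",
}


def convert_cell(value, datatype):
    """Return the cell value, converting M/d/yyyy dates to ISO 8601 (xsd:date)."""
    if isinstance(datatype, dict) and datatype.get("base") == "date":
        if datatype.get("format") != "M/d/yyyy":
            raise ValueError("this sketch only handles the M/d/yyyy format")
        return datetime.strptime(value, "%m/%d/%Y").date().isoformat()
    return value


def describe_row(row, table_url="tree-ops.csv"):
    """Build the row subject from aboutUrl and convert each described cell."""
    subject = table_url + SCHEMA["aboutUrl"].replace("{GID}", row["GID"])
    cells = {
        col["name"]: convert_cell(row[col["name"]], col.get("datatype", "string"))
        for col in SCHEMA["columns"]
    }
    return subject, cells


# Hypothetical row, for illustration only.
print(describe_row({"GID": "1", "inventory_date": "10/18/2010"}))
```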

RML (Dimou et al., 2013)

RML: some issues for RDF generation
- Does not cover the data formats used by low-resource devices
- Subject-centric
- Not easily extensible
- One logical source per mapping
- No RDF context, filter, aggregate, etc.

Research questions
How to design a mapping language that …
- … can be easily extended to any source format?
- … is expressive enough to cover all of our use cases?
- … is still rather simple to use?

RDF generation process

RDF generation process
Selection patterns: XPath, JSONPath, CSS selectors, regex, etc.

RDF generation process
Selection patterns: XPath, JSONPath, CSS selectors, regex, etc.
Graph pattern definition (diagram): a node of rdf:type ex:Director with foaf:name, ex:salary and ex:fee, where "?" marks the values to fill in

RDF generation process
Selection patterns: XPath, JSONPath, CSS selectors, regex, etc.
Graph pattern definition (diagram): a node of rdf:type ex:Director with foaf:name, ex:salary and ex:fee, where "?" marks the values to fill in
+ Select ontologies
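The two ingredients of this process, selection patterns and a graph pattern, can be illustrated with a short Python sketch. It is not the SPARQL-Generate implementation: the JSON document, the select function (a hand-written stand-in for a JSONPath expression such as $.directors[*]) and the blank-node naming are all assumptions made for the example.

```python
import json

# Hypothetical source document.
SOURCE = """
{"directors": [
  {"name": "Alice", "salary": 100000, "fee": 2000},
  {"name": "Bob", "salary": 90000, "fee": 1500}
]}
"""

# Graph pattern: terms starting with "?" are variables to be filled in.
PATTERN = [
    ("?d", "rdf:type", "ex:Director"),
    ("?d", "foaf:name", "?name"),
    ("?d", "ex:salary", "?salary"),
    ("?d", "ex:fee", "?fee"),
]


def select(document):
    """Selection step: yield one variable binding per selected item."""
    for i, item in enumerate(json.loads(document)["directors"]):
        yield {
            "?d": f"_:director{i}",
            "?name": item["name"],
            "?salary": item["salary"],
            "?fee": item["fee"],
        }


def instantiate(pattern, binding):
    """Replace every variable of the graph pattern by its bound value."""
    return [tuple(binding.get(term, term) for term in triple) for triple in pattern]


def generate(document):
    """Run selection, then instantiate the graph pattern for each binding."""
    triples = []
    for binding in select(document):
        triples.extend(instantiate(PATTERN, binding))
    return triples


for triple in generate(SOURCE):
    print(triple)
```

Swapping the source format only changes select; the graph pattern, and therefore the vocabulary of the output, stays the same.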

https://w3id.org/sparql-generate
Open-source implementation on top of Jena, with documentation and tutorial; available via Maven

https://w3id.org/sparql-generate
Usable as JAR

https://w3id.org/sparql-generate
Usable as Web API (similar to SPARQL Protocol)

https://w3id.org/sparql-generate
Web form with syntax checking (extends YASGUI)

https://w3id.org/sparql-generate
Set of implemented custom functions
- XML (XPath)
- JSON (JSONPath, select the list of an object's keys, …)
- CSV, TSV
- HTML5 (CSS3 selectors)
- CBOR
- Plain text (regular expressions)
- Date conversion
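The idea behind such a set of per-format functions is that each format contributes its own selection function while the rest of the pipeline stays unchanged. The Python sketch below illustrates that extensibility with a small registry; the names ITERATORS, iterator and select_values are hypothetical and are not the identifiers used by SPARQL-Generate, and ElementTree only supports a subset of XPath.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET

# Registry of selection functions, keyed by a format-specific name.
ITERATORS = {}


def iterator(name):
    """Decorator that registers a selection function under the given name."""
    def register(fn):
        ITERATORS[name] = fn
        return fn
    return register


@iterator("xpath")
def iter_xpath(text, expr):
    # Text content of every element matched by the (limited) XPath expression.
    return [el.text for el in ET.fromstring(text).findall(expr)]


@iterator("csv")
def iter_csv(text, column):
    # Values of one named column, using the first line as the header.
    return [row[column] for row in csv.DictReader(io.StringIO(text))]


@iterator("regex")
def iter_regex(text, pattern):
    # Every non-overlapping match of the regular expression.
    return re.findall(pattern, text)


def select_values(kind, text, argument):
    """Dispatch to the selection function registered for this kind of source."""
    return ITERATORS[kind](text, argument)


print(select_values("xpath", "<a><b>1</b><b>2</b></a>", "./b"))
print(select_values("csv", "id,name\n1,x\n2,y\n", "name"))
print(select_values("regex", "a1b22", r"\d+"))
```

Supporting a new format then amounts to registering one more function, without touching the code that consumes the selected values.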

https://w3id.org/sparql-generate
Unit tests based on competitor approaches

SPARQL-Generate vs RML: performance comparison of the reference implementations

Conclusions & next steps
SPARQL-Generate …
- … is expressive, flexible, extensible
- … integrates well in a SemWeb workflow
- … is formalised, implemented, evaluated
Next we want to add …
- … custom functions for more data formats
- … syntactic sugar: use expressions directly in the GENERATE clause
- … support for data streams (on its way)

A SPARQL extension for generating RDF from heterogeneous formats
More information about SPARQL-Generate: https://w3id.org/sparql-generate/
Web form and demonstrator, open source implementation, mailing list, …
Maxime Lefrançois, Antoine Zimmermann, Noorani Bakerally
MINES Saint-Etienne, CNRS, Laboratoire Hubert Curien UMR 5516

Who writes the transformation?

A SPARQL 1.1 extension
- PREFIX declarations
- GENERATE template
- FROM and FROM NAMED clauses
- ITERATOR … AS … and SOURCE … AS … (any number and order)
- WHERE { … }
- Solution modifiers (group by, order by, limit, offset, ... like in SPARQL 1.1)
Properties: expressive / flexible; extensible; usually already mastered by ontologists; implementable on top of existing engines?
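One plausible way to read how these clauses combine is as successive transformations of a list of variable bindings: SOURCE adds one binding to each solution, ITERATOR multiplies solutions by the items it yields, and the GENERATE template is instantiated once per resulting solution. The Python sketch below illustrates that reading only; the function names, the toy document and the omission of the WHERE clause and solution modifiers are assumptions, not the formal semantics of the language.

```python
def apply_source(solutions, var, load):
    """SOURCE … AS ?var: bind one value (e.g. a document's content) in every solution."""
    return [{**s, var: load(s)} for s in solutions]


def apply_iterator(solutions, var, iterate):
    """ITERATOR … AS ?var: produce one new solution per item yielded for each solution."""
    return [{**s, var: item} for s in solutions for item in iterate(s)]


def apply_template(solutions, template):
    """GENERATE template: instantiate every triple pattern for every solution."""
    return [
        tuple(s.get(term, term) for term in triple)
        for s in solutions
        for triple in template
    ]


# Start from a single empty solution, as a query with no prior bindings would.
solutions = [{}]
# Hypothetical document content standing in for a fetched source.
solutions = apply_source(solutions, "?doc", lambda s: "alice;bob")
solutions = apply_iterator(solutions, "?name", lambda s: s["?doc"].split(";"))
triples = apply_template(solutions, [("ex:team", "ex:member", "?name")])
print(triples)
```

Because each step maps solutions to solutions, the clauses can be chained in any number and order, which matches the flexibility claimed on the slide.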

Formal syntax and semantics
- Queries an RDF dataset and an RDF Documentset (named RDF literals)
- Generates an RDF Graph

Implementable on top of SPARQL 1.1 engines
Theorem + naive algorithm
