Copyright 2009 Digital Enterprise Research Institute. All rights reserved. Digital Enterprise Research Institute, www.deri.ie. Dynamic querying.

Presentation transcript:

Copyright 2009 Digital Enterprise Research Institute. All rights reserved. Digital Enterprise Research Institute.
Dynamic querying of mass-storage RDF data with rule-based entailment regimes
Giovambattista Ianni, Thomas Krennwallner, Alessandra Martello, and Axel Polleres
Università della Calabria

What this paper is about
… we define two small but useful language extensions of SPARQL for dynamic inferencing and querying
… we show their feasibility in a deductive-DB-based implementation, the GiaBATA system
… we demonstrate that we can achieve decent query response times with well-known optimization techniques

Motivation: Why dynamic inferencing and querying?
Is the same data always to be queried with the same ontologies?
If my query engine supports inference, do we always want to use the same entailment regime?
Likely answer: No. The Semantic Web should be a solution to, as Ora Lassila formulated it, those problems and situations that we are yet to define.
Unfortunately, current RDF stores and SPARQL engines offer little support here. Can we fix it?

Example: Query the same data with different ontologies?
Query: Find Bob's nicknames.
Data: :bob foaf:name "Robert"; foaf:nick "Bob"; foaf:icqChatId "Bob123"; foaf:holdsAccount. sioc:name "Bob the Builder"
SPARQL: SELECT ?nick FROM WHERE {:bob foaf:nick ?nick}
Result (?nick): Bob

Example: Query the same data with different ontologies?
Query: Find Bob's nicknames.
Data: :bob foaf:name "Robert"; foaf:nick "Bob"; foaf:icqChatId "Bob123"; foaf:holdsAccount sioc:name "Bob the Builder"
SPARQL: SELECT ?nick FROM WHERE {:bob foaf:nick ?nick}
Result (?nick): Bob
Hmm, foaf.rdf says: foaf:icqChatId rdfs:subPropertyOf foaf:nick

Example: Query the same data with different ontologies?
Query: Find Bob's nicknames.
Data: :bob foaf:name "Robert"; foaf:nick "Bob"; foaf:icqChatId "Bob123"; foaf:holdsAccount sioc:name "Bob the Builder"
SPARQL: SELECT ?nick FROM WHERE {:bob foaf:nick ?nick}
Result (?nick): Bob
Hmm, foaf.rdf says: foaf:icqChatId rdfs:subPropertyOf foaf:nick
Still, most SPARQL engines return only one result. What is missing: inference rules.

Example: Query the same data with different ontologies?
Query: Find Bob's nicknames.
Data: :bob foaf:name "Robert"; foaf:nick "Bob"; foaf:icqChatId "Bob123"; foaf:holdsAccount sioc:name "Bob the Builder"
SPARQL: SELECT ?nick FROM WHERE {:bob foaf:nick ?nick}
Result (?nick): Bob, Bob123
Hmm, foaf.rdf says: foaf:icqChatId rdfs:subPropertyOf foaf:nick
Fortunately, several engines (e.g. ARQ, Sesame) support RDFS inference:
{?s ?q ?o} <= {?s ?p ?o. ?p rdfs:subPropertyOf ?q}
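
The effect of the subproperty rule on this slide can be illustrated with a minimal Python sketch. It assumes triples are plain tuples held in memory; the names DATA, ONTOLOGY, apply_subproperty_rule and nicknames are hypothetical and are not part of ARQ, Sesame, or GiaBATA.

```python
# Hypothetical in-memory triples mirroring the example data on the slide.
DATA = {
    (":bob", "foaf:name", "Robert"),
    (":bob", "foaf:nick", "Bob"),
    (":bob", "foaf:icqChatId", "Bob123"),
}
# The statement taken from foaf.rdf.
ONTOLOGY = {("foaf:icqChatId", "rdfs:subPropertyOf", "foaf:nick")}


def apply_subproperty_rule(triples):
    """One pass of {?s ?q ?o} <= {?s ?p ?o. ?p rdfs:subPropertyOf ?q}."""
    subprops = {(s, o) for s, p, o in triples if p == "rdfs:subPropertyOf"}
    inferred = {
        (s, q, o)
        for s, p, o in triples
        for p2, q in subprops
        if p == p2
    }
    return triples | inferred


def nicknames(triples, who=":bob"):
    """Answer the pattern {:bob foaf:nick ?nick} over a set of triples."""
    return sorted(o for s, p, o in triples if s == who and p == "foaf:nick")


plain = nicknames(DATA | ONTOLOGY)
with_rdfs = nicknames(apply_subproperty_rule(DATA | ONTOLOGY))
```

Without the rule only the asserted nickname matches; with it, the ICQ chat id is also returned as a nickname, as in the result table of the slide.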

Example: Query the same data with different entailment rules?
Query: Find Bob's nicknames.
Data: :bob foaf:name "Robert"; foaf:nick "Bob"; foaf:icqChatId "Bob123"; foaf:holdsAccount sioc:name "Bob the Builder"
SPARQL: SELECT ?nick FROM WHERE {:bob foaf:nick ?nick}
Result (?nick): Bob, Bob123
Say we want to add a mapping to SIOC, using OWL2: foaf:nick :propertyChainAxiom (foaf:holdsAccount sioc:name).
Nothing happens: the axiom is not covered by the RDFS rules.

Example: Query the same data with different entailment rules?
Query: Find Bob's nicknames.
Data: :bob foaf:name "Robert"; foaf:nick "Bob"; foaf:icqChatId "Bob123"; foaf:holdsAccount sioc:name "Bob the Builder"
SPARQL: SELECT ?nick FROM WHERE {:bob foaf:nick ?nick}
Result (?nick): Bob, Bob123, Bob the Builder
Say we want to add a mapping to SIOC, using OWL2: foaf:nick :propertyChainAxiom (foaf:holdsAccount sioc:name).
A different set of inference rules (OWL2RL) can cover that:
{?s ?p ?o} <= {?s ?q [?r ?o]. ?p :propertyChainAxiom (?q ?r).}
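
A small sketch of the two-step property chain rule, again over plain Python tuples. The account node ":acct" is a hypothetical stand-in, since the account IRI is not preserved in this transcript, and the CHAINS dictionary is an illustrative encoding of the axiom rather than how any OWL2RL engine stores it.

```python
# ":acct" is a hypothetical node standing in for Bob's account.
DATA = {
    (":bob", "foaf:nick", "Bob"),
    (":bob", "foaf:holdsAccount", ":acct"),
    (":acct", "sioc:name", "Bob the Builder"),
}
# Illustrative encoding of: foaf:nick :propertyChainAxiom (foaf:holdsAccount sioc:name).
CHAINS = {"foaf:nick": ("foaf:holdsAccount", "sioc:name")}


def apply_chain_rule(triples, chains):
    """One pass of {?s ?p ?o} <= {?s ?q [?r ?o]. ?p :propertyChainAxiom (?q ?r).}"""
    inferred = set()
    for p, (q, r) in chains.items():
        for s, p1, mid in triples:
            if p1 != q:
                continue
            # Follow the second link of the chain from the intermediate node.
            for s2, p2, o in triples:
                if s2 == mid and p2 == r:
                    inferred.add((s, p, o))
    return triples | inferred


def nicknames(triples, who=":bob"):
    return sorted(o for s, p, o in triples if s == who and p == "foaf:nick")


before = nicknames(DATA)
after = nicknames(apply_chain_rule(DATA, CHAINS))
```

Only after the chain rule is applied does the SIOC account name show up as a nickname.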

Problems
Current implementations are not tailored for this kind of dynamic querying:
Most assume reasoning is done once and for all (at loading).
Inference rules can't be changed on the spot.
Most assume a fixed dataset.
On top of that, inference in SPARQL is often limited to the default graph.
… Let's look at the last problem first…

Example: Querying Named Graphs
A dataset in SPARQL, (D, N), consists of:
D … the default graph (merge of the FROM clauses, or given implicitly in the store)
N … a set of named graphs, each identified by an IRI
Here: D = merge of myVersionOfFoaf.rdf and Bob.rdf, N = {}
SPARQL: SELECT ?nick FROM WHERE {:bob foaf:nick ?nick}
Result (?nick): Bob, Bob123, Bob the Builder
The ontology is merged into the default graph.

Example: Querying Named Graphs
A dataset in SPARQL, (D, N), consists of:
D … the default graph (merge of the FROM clauses, or given implicitly in the store)
N … a set of named graphs, each identified by an IRI
Here: D = myVersionOfFoaf.rdf, N = { Bob.rdf }
SPARQL: SELECT ?nick ?g FROM NAMED FROM WHERE { GRAPH ?g {:bob foaf:nick ?nick }}
Result (?nick, ?g): (Bob, Bob.rdf)
This loses all inferences: there is no means to merge an ontology into a named graph!

Can we fix it? Yes we can!
Two extensions to SPARQL handle these issues:
Extended datasets / USING ONTOLOGY
USING RULESET

Extended Dataset / USING ONTOLOGY
Enables merging arbitrary graphs into named graphs:
SELECT ?nick ?g FROM NAMED ( ) WHERE {GRAPH ?g { :bob foaf:nick ?nick }}
SELECT ?nick ?g FROM NAMED ( ) … WHERE {GRAPH ?g { :bob foaf:nick ?nick }}
USING ONTOLOGY is simply a shortcut for merging into any graph in the dataset:
SELECT ?nick ?g USING ONTOLOGY FROM NAMED WHERE {GRAPH ?g { :bob foaf:nick ?nick }}
Definition: A graph collection G is a set of RDF graphs. An extended RDF dataset D is a pair (G_0, {<u_1, G_1>, ..., <u_n, G_n>}).
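
The definition can be read as follows: each named entry carries a collection of graphs that is merged before the GRAPH pattern is matched. The sketch below illustrates that reading in Python. It is a simplification made for this example: merge is plain set union (no blank node handling), only the subproperty rule is applied, and the function names are hypothetical rather than taken from GiaBATA.

```python
BOB = {(":bob", "foaf:nick", "Bob"), (":bob", "foaf:icqChatId", "Bob123")}
FOAF = {("foaf:icqChatId", "rdfs:subPropertyOf", "foaf:nick")}


def merge(graphs):
    """RDF merge, simplified to set union (assumes no blank nodes)."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def subproperty_closure(triples):
    """Apply {?s ?q ?o} <= {?s ?p ?o. ?p rdfs:subPropertyOf ?q} to a fixpoint."""
    closure = set(triples)
    while True:
        subprops = [(s, o) for s, p, o in closure if p == "rdfs:subPropertyOf"]
        new = {(s, q, o) for s, p, o in closure for p2, q in subprops if p == p2}
        if new <= closure:
            return closure
        closure |= new


def query_named(named):
    """Evaluate GRAPH ?g {:bob foaf:nick ?nick} over {name: graph collection}."""
    rows = []
    for name, collection in named.items():
        graph = subproperty_closure(merge(collection))
        rows += [(o, name) for s, p, o in graph if (s, p) == (":bob", "foaf:nick")]
    return sorted(rows)


# Standard dataset: the named graph holds Bob.rdf only, the ontology is elsewhere.
standard = query_named({"Bob.rdf": [BOB]})
# Extended dataset: the ontology is merged into the named graph.
extended = query_named({"Bob.rdf": [BOB, FOAF]})
```

With the standard dataset the ontology never reaches the named graph, so the inferred nickname is lost; with the extended dataset it is found and still attributed to Bob.rdf.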

USING RULESET
Additionally, we allow the ruleset used for inference to be named explicitly: USING RULESET
SPARQL: SELECT ?nick ?g USING RULESET USING ONTOLOGY FROM NAMED WHERE {GRAPH ?g { :bob foaf:nick ?nick }}
Result (?nick, ?g): (Bob, Bob.rdf), (Bob123, Bob.rdf)
SPARQL: SELECT ?nick ?g USING RULESET USING ONTOLOGY FROM NAMED WHERE {GRAPH ?g { :bob foaf:nick ?nick }}
Result (?nick, ?g): (Bob, Bob.rdf), (Bob123, Bob.rdf), (Bob the Builder, Bob.rdf)
Fixed!

Different Entailment Regimes: USING RULESET
Which rules do we allow?
In principle: safe N3-style rules, i.e. no blank nodes in rule heads, e.g. USING RULESET rdfs:
?P rdfs:subPropertyOf ?R <= ?P rdfs:subPropertyOf ?Q. ?Q rdfs:subPropertyOf ?R.
?S ?Q ?O <= ?P rdfs:subPropertyOf ?Q. ?S ?P ?O.
?C rdfs:subClassOf ?E <= ?C rdfs:subClassOf ?D. ?D rdfs:subClassOf ?E.
?S rdf:type ?D <= ?C rdfs:subClassOf ?D. ?S rdf:type ?C.
?S rdf:type ?C <= ?P rdfs:domain ?C. ?S ?P ?O.
?O rdf:type ?C <= ?P rdfs:range ?C. ?S ?P ?O.
The paper defines extended BGP matching w.r.t. such rulesets.
Internally, our system allows a more expressive language (dlvhex).
Other accepted syntaxes are being worked on (RIF).
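
As an illustration of what such a ruleset computes, the six rules listed on this slide can be applied to a fixpoint by naive forward chaining. This is a sketch over Python tuples with made-up example facts; it is not the evaluation strategy of GiaBATA, which translates rules to dlvhex.

```python
SUBPROP, SUBCLASS = "rdfs:subPropertyOf", "rdfs:subClassOf"
DOMAIN, RANGE, TYPE = "rdfs:domain", "rdfs:range", "rdf:type"


def pairs(triples, predicate):
    """All (subject, object) pairs connected by the given predicate."""
    return [(s, o) for s, p, o in triples if p == predicate]


def rdfs_step(t):
    """Apply each of the six rules once and return the derived triples."""
    new = set()
    subprops, subclasses = pairs(t, SUBPROP), pairs(t, SUBCLASS)
    for p, q in subprops:
        new |= {(p, SUBPROP, r) for q2, r in subprops if q2 == q}
        new |= {(s, q, o) for s, p2, o in t if p2 == p}
    for c, d in subclasses:
        new |= {(c, SUBCLASS, e) for d2, e in subclasses if d2 == d}
        new |= {(s, TYPE, d) for s, c2 in pairs(t, TYPE) if c2 == c}
    for p, c in pairs(t, DOMAIN):
        new |= {(s, TYPE, c) for s, p2, _ in t if p2 == p}
    for p, c in pairs(t, RANGE):
        new |= {(o, TYPE, c) for _, p2, o in t if p2 == p}
    return new


def rdfs_closure(triples):
    """Naive forward chaining: repeat until no new triple is derived."""
    closure = set(triples)
    while True:
        new = rdfs_step(closure) - closure
        if not new:
            return closure
        closure |= new


# Made-up facts, used only to exercise every rule.
FACTS = {
    (":FullProfessor", SUBCLASS, ":Professor"),
    (":Professor", SUBCLASS, ":Faculty"),
    (":ann", TYPE, ":FullProfessor"),
    (":teaches", DOMAIN, ":Faculty"),
    (":teaches", RANGE, ":Course"),
    (":lectures", SUBPROP, ":teaches"),
    (":cy", ":lectures", ":ai201"),
}
CLOSURE = rdfs_closure(FACTS)
```

Note that the derived types for :cy and :ai201 need two rounds: the subproperty rule first derives the :teaches triple, and only then do the domain and range rules fire.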

How to implement this?
GiaBATA system:
SPARQL -> dlvhex (logic program)
Ruleset -> dlvhex (logic program)
Deductive database techniques:
Datalog engine (dlvhex)
PostgreSQL database underneath (dlv-db)
RDF storable in different schemas in the RDB
Magic sets, storage

SPARQL -> dlvhex (logic program)
Based on [Polleres, WWW2007]
Non-recursive Datalog with negation and built-ins.

Ruleset -> dlvhex (logic program)
Straightforward: rules are simply translated in a way compatible with the SPARQL translation:
{?s ?q ?o} <= {?s ?p ?o. ?p rdfs:subPropertyOf ?q}

SPARQL+Rules -> SQL
Done by dlv-DB, cf. [Terracina et al., TPLP 8(2), 2008]
All non-recursive parts are pushed to the database.
All recursive parts are handled by semi-naïve evaluation (more efficient than WITH RECURSIVE views in SQL); where necessary, intermediate results are temporarily materialized into the DB.
Some optimisations are necessary to make this reasonably performant:
FILTER expression evaluation is pushed to SQL (the 3-valued semantics of SPARQL filters is handled natively in SQL).
No miracles… but magic: magic set optimisations for focused forward-chaining evaluation.
Join reordering is not yet implemented, but we did some manual reordering to optimize the query plan in the experiments.
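
Semi-naïve evaluation, mentioned above for the recursive parts, can be sketched on the simplest recursive rule, the transitive closure of a binary relation. The Python below only shows the general idea (each round joins just the newly derived facts, the delta, against the base relation); it is an in-memory illustration, not the dlv-DB implementation, which works through SQL.

```python
def transitive_closure_semi_naive(edges):
    """Closure for: T(a, c) <= E(a, c).  T(a, d) <= T(a, b), E(b, d)."""
    total = set(edges)   # every fact derived so far
    delta = set(edges)   # facts that were new in the previous round
    while delta:
        # Join only the delta with the base relation, not all of `total`.
        derived = {(a, d) for a, b in delta for c, d in edges if b == c}
        delta = derived - total
        total |= delta
    return total


# Hypothetical subclass chain A < B < C < D.
EDGES = {("A", "B"), ("B", "C"), ("C", "D")}
CLOSED = transitive_closure_semi_naive(EDGES)
```

A naive evaluator would re-join the whole of `total` in every round and re-derive the same facts; restricting the join to `delta` avoids that repeated work while reaching the same fixpoint.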

Experiments: What do we mean by "reasonably performant"?
Let's compare with ARQ, AllegroGraph, and Sesame.

Experiments: LUBM test setup
Datasets: LUBM1, LUBM5, LUBM10, LUBM30
Machine: Intel P4 3GHz, 1.5GB RAM, under Linux
Compared systems:
AllegroGraph 3.2 (native persistence mechanism)
ARQ 2.6 (connected to PostgreSQL 8.3)
GiaBATA (connected to PostgreSQL 8.3)
Sesame 2.3 (native store persistence support)
These experiments are enough for comparing performance trends, so we didn't consider larger instances of LUBM at this stage.

Time Measures
Loading time: elapsed time needed for storing the dataset in the system, including the time spent processing the ontology and source files (parsing, storing, indexing, possibly reasoning).
Query time: elapsed time for querying (opening the dataset, executing the query, getting the results, closing the dataset).
Evaluation time: overall loading time + query time.
In our setting, the inferred information depends on the entailment regime and on the dataset at hand, per query; dynamic querying of RDFS moves inference from pre-materialization at loading to the query step.
We set a 120-minute query timeout limit for all test runs.
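
The three measures can be made concrete with a small timing harness. This is only a sketch: load and query are hypothetical stand-ins for the store-specific steps, and the slides do not say how the original measurements were actually scripted.

```python
import time


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def load(triples):
    """Hypothetical stand-in for parsing, storing and indexing a dataset."""
    return set(triples)


def query(store):
    """Hypothetical stand-in for opening the dataset and running one query."""
    return sorted(o for s, p, o in store if p == "foaf:nick")


store, loading_time = timed(load, [(":bob", "foaf:nick", "Bob")])
rows, query_time = timed(query, store)
evaluation_time = loading_time + query_time  # overall loading + query time
```

Reporting loading and query time separately matters here because dynamic inference shifts work from the first measure to the second.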

Query Examples
Query 1 (without inference): {?X rdf:type GraduateStudent ; takesCourse }
Large input and high selectivity, no inference.
Query 4 (with inference): {?X rdf:type Professor; worksFor ; name ?Y1; Address ?Y2; telephone ?Y3 }
Small input and high selectivity, reasoning over subclasses: class Professor has a wide hierarchy, and the query asks about multiple properties of a single class.

Experiments: What do we mean by "reasonably performant"?
Let's compare query time (performance) with ARQ, AllegroGraph, and Sesame.

Q4: query time
Taking loading time into account gives a different picture.
The trend looks promising…

System Comparison (1)
RDBMS support: all systems allow persistent storage on an RDBMS. AllegroGraph is itself a database, but it offers the ability to back up its operations to a relational database.
Inference/querying strategy: both reasoning and query evaluation are usually performed in main memory, with the exception of our system. All systems except AllegroGraph and ours adopt a persistent materialization approach, computing the whole closure to infer data.
So AllegroGraph has some dynamic inferencing support, but no means to change the dataset on the fly after loading.

System Comparison (2)
RDFS/OWL support: all cover RDFS (actually, disregarding axiomatic triples) and partial or non-standard OWL fragments.
Ruleset format: custom (none yet supports e.g. RIF).
External reasoner integration (beyond rules): Sesame via the Sesame Sail interface; ARQ interfaces with external DL reasoners (Pellet, Racer, FaCT); AllegroGraph supports integration with RacerPro; GiaBATA via external atoms of dlvhex.

What we did:
… defined two extensions of SPARQL for dynamic inferencing and querying
… showed their feasibility in a deductive-DB-based implementation, the GiaBATA system
… demonstrated that we can achieve reasonable query response times with well-known optimization techniques

Conclusion/Outlook
There is no one-size-fits-all solution to querying semantic data: dynamic querying/inference is often needed.
This may be viewed as similar in spirit to hypothetical Datalog: add/delete clauses hypothetically… we didn't yet consider the delete part.
More experiments are needed on multiple/sequential queries in production, i.e. comparing one-time loading vs. reloading/rematerialising per query.
We need more/better optimisations if we really want to do dynamic inference on a larger scale, e.g. there is no caching yet, and data graph structure/statistics are not taken into account.
