 Copyright 2007 Digital Enterprise Research Institute. All rights reserved. Digital Enterprise Research Institute www.deri.ie Scalable Authoritative OWL.

Slides:



Advertisements
Similar presentations
Copyright 2008 Digital Enterprise Research Institute. All rights reserved. Digital Enterprise Research Institute 1 From OntoSelect to OntoSelect-SWSE.
Advertisements

Copyright 2009 Digital Enterprise Research Institute. All rights reserved. Digital Enterprise Research Institute Uniasdsd Dynamic querying.
CH-4 Ontologies, Querying and Data Integration. Introduction to RDF(S) RDF stands for Resource Description Framework. RDF is a standard for describing.
 Copyright 2009 Digital Enterprise Research Institute. All rights reserved. Digital Enterprise Research Institute Linked Broken Data? Dr Axel.
Tutorial at WWW 2011, Distributed reasoning: because size matters Andreas Harth, Aidan Hogan, Spyros Kotoulas,
 Copyright 2009 Digital Enterprise Research Institute. All rights reserved. Digital Enterprise Research Institute Using Datalog for Rule-Based.
Chapter 3 Querying RDF stores with SPARQL. TL;DR We will want to query large RDF datasets, e.g. LOD SPARQL is the SQL of RDF SPARQL is a language to query.
VIVO and Linked Open Data December 13, 2010 Dean B. Krafft Chief Technology Strategist and Director of IT Cornell University Library.
Quick RDF Introduction Scott Streit Terminology – RDF Triple (Also the triple form used in SPARQL) RDF Triple  (Resource, Property,
Analyzing Minerva1 AUTORI: Antonello Ercoli Alessandro Pezzullo CORSO: Seminari di Ingegneria del SW DOCENTE: Prof. Giuseppe De Giacomo.
 Copyright 2008 Digital Enterprise Research Institute. All rights reserved. Digital Enterprise Research Institute Context Dependent Reasoning.
CSCI 572 Project Presentation Mohsen Taheriyan Semantic Search on FOAF profiles.
Tutorial at ISWC 2011, Distributed reasoning: because size matters Andreas Harth, Aidan Hogan, Spyros Kotoulas,
 Copyright 2005 Digital Enterprise Research Institute. All rights reserved. 1 The Architecture of a Large-Scale Web Search and Query Engine.
A year on the Semantic W3C (or: what is happening these days?) Semantic Web Meetup, Seattle, Ivan Herman, W3C.
Research Problems in Semantic Web Search Varish Mulwad ____________________________ 1.
Chapter 7: Resource Description Framework (RDF) Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley,
ANHAI DOAN ALON HALEVY ZACHARY IVES Chapter 12: Ontologies and Knowledge Representation PRINCIPLES OF DATA INTEGRATION.
Triple Stores.
CSE 428 Semantic Web Topics Introduction Jeff Heflin Lehigh University.
RDF Semantics by Patrick Hayes W3C Recommendation Presented by Jie Bao RPI Sept 4, 2008 Part 1 of RDF/OWL Semantics Tutorial.
© 2006 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice Publishing data on the Web (with.
Managing Large RDF Graphs (Infinite Graph) Vaibhav Khadilkar Department of Computer Science, The University of Texas at Dallas FEARLESS engineering.
 Copyright 2009 Digital Enterprise Research Institute. All rights reserved Digital Enterprise Research Institute Semantic Search for CMS IKS.
Chapter 6 Understanding Each Other CSE 431 – Intelligent Agents.
-By Mohamed Ershad Junaid UTD ID :
Semantic Publishing Update Second TUC meeting Munich 22/23 April 2013 Barry Bishop, Ontotext.
OWL vs. Linked Data: Experiences and Directions Axel Polleres Siemens AG Österreich 1 27/05/2013.
 Copyright 2008 Digital Enterprise Research Institute. All rights reserved. Digital Enterprise Research Institute Using/Extending RIF and.
 Copyright 2009 Digital Enterprise Research Institute. All rights reserved. Digital Enterprise Research Institute Can Semantics catch up with.
Digital Enterprise Research Institute HADA – An Access Controlled Application for Publishing and Discovering Linked Government Data Owen Sacco.
Data Intensive Query Processing for Large RDF Graphs Using Cloud Computing Tools Mohammad Farhan Husain, Latifur Khan, Murat Kantarcioglu and Bhavani Thuraisingham.
 Copyright 2009 Digital Enterprise Research Institute. All rights reserved. Digital Enterprise Research Institute Linked Broken Data? Dr Axel.
Relational Databases to RDF (a.k.a RDB2RDF) Juan F. Sequeda Dept of Computer Science University of Texas at Austin.
Storage and Retrieval of Large RDF Graph Using Hadoop and MapReduce Mohammad Farhan Husain, Pankil Doshi, Latifur Khan, Bhavani Thuraisingham University.
@ Presented by eBiquity group, UMBC CIKM’04, Nov 12, 2004 SwoogleSwoogle SwoogleSwoogle search and metadata for the semantic web Partial research support.
Coastal Atlas Interoperability - Ontologies (Advanced topics that we did not get to in detail) Luis Bermudez Stephanie Watson Marine Metadata Interoperability.
Export experiments in Corese. October 10th Export experiments in Corese Olivier Corby October 10th, 2005 Interoperability Working Days October 10th-11th,
SPARQL Query Graph Model (How to improve query evaluation?) Ralf Heese and Olaf Hartig Humboldt-Universität zu Berlin.
Oracle Database 11g Semantics Overview Xavier Lopez, Ph.D., Dir. Of Product Mgt., Spatial & Semantic Technologies Souripriya Das, Ph.D., Consultant Member.
 Copyright 2009 Digital Enterprise Research Institute. All rights reserved. Digital Enterprise Research Institute Reasoning and Querying for.
Mapping Guide Mapping Ontologies and Data Sets in RDF/RDFS/OWL2 Michel Böhms.
CWM Closed World Machine. CWM Overview CWM is a popular Semantic Web program that can do the following tasks – Parse and pretty-print several RDF formats:
Chapter 7: Resource Description Framework (RDF) Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley,
1 Artificial Intelligence Applications Institute Centre for Intelligent Systems and their Applications Stuart Aitken Artificial Intelligence Applications.
Scalable Distributed Reasoning Using MapReduce Jacopo Urbani, Spyros Kotoulas, Eyal Oren, and Frank van Harmelen Department of Computer Science, Vrije.
Introduction to the Semantic Web and Linked Data Module 1 - Unit 2 The Semantic Web and Linked Data Concepts 1-1 Library of Congress BIBFRAME Pilot Training.
Dr. Lowell Vizenor Ontology and Semantic Technology Practice Lead Alion Science and Technology Semantic Technology: A Basic Introduction.
MyGrid/Taverna Provenance Daniele Turi University of Manchester OMII f2f Meeting, London, 19-20/4/06.
Triple Stores. What is a triple store? A specialized database for RDF triples Can ingest RDF in a variety of formats Supports a query language – SPARQL.
OWL Full Semantics -- RDF-Compatible Model-Theoretic Semantics by Peter F. Patel-Schneider, Patrick Hayes and Ian Horrocks W3C Recommendation, 2004
CC L A W EB DE D ATOS P RIMAVERA 2015 Lecture 10: Conclusion Aidan Hogan
 Copyright 2009 Digital Enterprise Research Institute. All rights reserved. Digital Enterprise Research Institute Linked Broken Data? Dr Axel.
Introduction to the Semantic Web Jeff Heflin Lehigh University.
@ eBiquity Lab, CSEE, UMBC Swoogle Tutorial (Part I: Swoogle R & D) A brief introduction to Swoogle An overview of Swoogle research A summary of Swoogle.
Aidan Hogan, Antoine Zimmermann, Jürgen Umbrich, Axel Polleres, Stefan Decker Presented by Joseph Park SCALABLE AND DISTRIBUTED METHODS FOR ENTITY MATCHING,
Ontology Technology applied to Catalogues Paul Kopp.
Chapter Describing Individuals OWL Individuals ▫Ontological Primitive Layer  Mostly described with RDF ▫Instances of user-defined ontological.
OWL (Ontology Web Language and Applications) Maw-Sheng Horng Department of Mathematics and Information Education National Taipei University of Education.
Rya Query Inference.
The Semantic Web Part 6. RDF Vocabularies: RDF Schema
Knowledge Representation Part II Description Logic & Introduction to Protégé Jan Pettersen Nytun.
Formal ontologies vs. triple based KR gap or convergence?
Triple Stores.
Service-Oriented Computing: Semantics, Processes, Agents
Triple Stores.
Linking Guide Michel Böhms.
Data Provenance.
Scalable and Efficient Reasoning for Enforcing Role-Based Access Control
Triple Stores.
Presentation transcript:

 Copyright 2007 Digital Enterprise Research Institute. All rights reserved. Digital Enterprise Research Institute Scalable Authoritative OWL Reasoner Aidan Hogan, Andreas Harth, Axel Polleres Digital Enterprise Research Institute National University of Ireland, Galway =“free” (Irish) 1

Digital Enterprise Research Institute SAOR - Reasoning for SWSE We want the challenge data plus OWL inferred data in the search results! Our approach: SAOR – Scalable Authoritative OWL Reasoning 2

Digital Enterprise Research Institute Idea Apply a subset of OWL reasoning to the billion triple challenge dataset Forward-chaining rule based approach, e.g.[ter Horst, 2005] Reduced output statements for the SWSE use case…  Must be scalable, must be reasonable … incomplete w.r.t. OWL BY DESIGN!  SCALABLE: Tailored ruleset – file-scan processing – avoid joins  AUTHORITATIVE: Avoid Non-Authoritative inference (“hijacking”, “non-standard vocabulary use”) 3

Digital Enterprise Research Institute Scalable Reasoning Scan 1: Scan all data (1.1b statements), separate T-Box statements, load T-Box statements (8.5m) into memory, perform authoritative analysis. Scan 2: Scan all data and join all statements with in-memory T-Box.  Only works for inference rules with 0-1 A-Box patterns  No T-Box expansion by inference  Needs “tailored” ruleset 4

Digital Enterprise Research Institute Rules Applied: Tailored version of [ter Horst, 2005] 5

Digital Enterprise Research Institute Good “excuses” to avoid G2 rules The obvious:  G2 rules would need joins, i.e. to trigger restart of file-scan The interesting one:  Take for instance IFP rule:  Maybe not such a good idea on real Web data  More experiments including G2, G3 rules in [Hogan, Harth, Polleres, ASWC2008] 6

Digital Enterprise Research Institute Authoritative Reasoning Document D authoritative for concept C iff:  C not identified by URI – OR  De-referenced URI of C coincides with or redirects to D  FOAF spec authoritative for foaf:Person ✓  MY spec not authoritative for foaf:Person ✘ Only allow extension in authoritative documents  my:Person rdfs:subClassOf foaf:Person. (MY spec) ✓ BUT: Reduce obscure memberships  foaf:Person rdfs:subClassOf my:Person. (MY spec) ✘ Similarly for other T-Box statements. In-memory T-Box stores authoritative values for rule execution Ontology Hijacking 7

Digital Enterprise Research Institute Rules Applied The 17 rules applied including statements considered to be T-Box, elements which must be authoritatively spoken for (including for bnode OWL abstract syntax), and output count 8

Digital Enterprise Research Institute Authoritative Resoning covers rdfs: owl: vocabulary misuse rdfs:subClassOf rdfs:subPropertyOf rdfs:Resource. rdfs:subClassOf rdfs:subPropertyOf rdfs:subPropertyOf. rdf:type rdfs:subPropertyOf rdfs:subClassOf. rdfs:subClassOf rdf:type owl:SymmetricProperty. Naïve rules application would infer O(n 3 ) triples By use of authoritative reasoning SAOR/SWSE doesn’t stumble over these :rdfs :owl Hijacking 9

Digital Enterprise Research Institute Performance Graph showing SAOR’s rate of input/output statements per minute for reasoning on 1.1b statements: reduced input rate correlates with increased output rate and vice-versa 10

Digital Enterprise Research Institute Results SCAN 1: 6.47 hrs  In-mem T-Box creation, authoritative analysis: SCAN 2: 9.82 hrs  Scan reasoning – join A-Box with in-mem authoritative T-Box: 1.925b new statements inferred in hrs On our agenda:  More valuable insights on our experiences from Web data  G2 and G3 rules?  Detailed comparison to OWL RL b + 1.9b inferred = 3 billion triples in SWSE

Digital Enterprise Research Institute Search result example: 12

Digital Enterprise Research Institute Le Fin… Enjoy the data… GUI: SPARQL interface: 13 Contact us for feedback!

Digital Enterprise Research Institute Ontology Hijacking Many popular concepts re-defined in non- authoritative documents Obscure concepts defined as super-concepts of popular ones  => should assert that all instances of such popular concepts (e.g., foaf:Person) are instances of obscure ones also (e.g., my:Person) “Ontology hijacking”  Potentially unsafe – foaf:mbox rdf:type owl:SymmetricProperty.  Explosion of output statements 14

Digital Enterprise Research Institute Not Run: A-box Join Rules No rules run requiring A-Box joins: e.g., rule 16 :  ?P a :InverseFunctionalProperty. ?x ?p ?o. ?y ?p ?o. => ?x :sameAs ?y. A-Box join expensive to compute!! Requires on-disk indexes: current work with initial results 15

Digital Enterprise Research Institute Not Run: Equality Also requires A-Box joins Cannot run naively Many instances of incorrect use of, for example, InverseFunctionalProperty Google 08445a31a78661b5c746feff39a9db6e4e2cc5cf Timbl foaf:homepage W3C foaf:homepage foaf:homepage a InverseFunctionalProperty. => Timbl sameAs W3C 16

Digital Enterprise Research Institute Scalable No standard query processing, no databases, no dynamic index structures. Based on two file scans of (unsorted) data Reduced output statements  Focus on A-Box reasoning  No T-Box inferencing/updating  No quasi-axiomatic statements output – ?s a rdfs:Resource. ?s a owl:Thing. ?s owl:sameAs s.  Authoritative analysis 17