WIMS 2011, Sogndal, Norway
Comparison of Ontology Reasoning Systems Using Custom Rules
Hui Shi, Kurt Maly, Steven Zeil, and Mohammad Zubair
Contact:


Outline
Introduction
  – What are we evaluating?
  – What is the approach we are taking?
Background
  – Existing benchmarks
  – Ontology systems supporting custom rules
Experimental design
  – Data and custom rules
  – Metrics and evaluation procedure
Results
  – Setup time
  – Query processing
  – Transitive rule
  – Caching
Conclusions

Introduction - Problem
Problem
  – Scalability issues in the context of a question/answer system (called ScienceWeb) whose knowledge base holds science information harvested from the web
  – ScienceWeb is being built using ontologies, reasoning systems, and custom rules for the reasoning system
Approach
  – Use existing benchmarks, extended with custom inference rules
  – Generate more realistic data
  – Evaluate in the ScienceWeb environment

Background
Existing semantic applications: question/answer systems
  – AquaLog, QuestIO, QUICK (natural language input)
Semantic Web
  – Resource Description Framework (RDF)
  – RDF schemas
  – Web Ontology Language (OWL) for specific knowledge domains
  – SPARQL query language for RDF
  – Semantic Web Rule Language (SWRL)
Reasoning systems
  – Jena, with its proprietary Jena rules
  – Pellet and KAON2, both supporting SWRL
  – Oracle 11g
  – OWLIM

Background
Existing performance studies of OWL-based reasoning systems cover only native rule sets
Varying complexity of Abox and Tbox
  – Tbox: the axioms defining the classes and relations in an ontology
  – Abox: assertions about the individuals in the domain
Existing benchmarks to generate ontologies
  – Lehigh University Benchmark (LUBM)
  – University Ontology Benchmark (UOBM), an extension of LUBM

Background – Ontologies with custom rule support
Jena: in-memory and persistent store, SPARQL, forward and backward chaining
Pellet: open source, description logic, SWRL
KAON2: free, SWRL, F-logic, SPARQL
Oracle 11g: native inference using the database, forward chaining, OWL
OWLIM: OWL, rules and axiomatic triples

General comparison among ontology reasoning systems

Experimental Design - Ontology
Baseline: LUBM
ScienceWeb: uses its own data generator (UnivGenerator) for ontology instance data
  – Classes are more detailed
  – Data are more realistic (e.g., faculty with advisors at different universities, co-authors at different universities)

Class tree of research community ontology.

Size range of datasets (in triples)

Experimental Design – Rule Set

Rule set 1: Co-author
  authorOf(?x, ?p) ∧ authorOf(?y, ?p) → coAuthor(?x, ?y)
Rule set 2: Validated co-author
  authorOf(?x, ?p) ∧ authorOf(?y, ?p) ∧ notEqual(?x, ?y) → coAuthor(?x, ?y)
Rule set 3: Research ancestor (transitive)
  advisorOf(?x, ?y) → researchAncestor(?x, ?y)
  researchAncestor(?x, ?y) ∧ researchAncestor(?y, ?z) → researchAncestor(?x, ?z)
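To make the semantics of rule sets 2 and 3 concrete, here is a minimal sketch of naive forward chaining over plain Python tuples. It is not how any of the evaluated systems implement inference; the function name `apply_rules` and the triple representation are assumptions made for illustration only.

```python
from itertools import product


def apply_rules(triples):
    """Naive forward chaining to a fixpoint.

    `triples` is an iterable of (subject, predicate, object) tuples.
    This is an illustrative sketch, not the algorithm used by Jena,
    Pellet, KAON2, Oracle, or OWLIM.
    """
    facts = set(triples)
    while True:
        new = set()

        # Rule set 2: authorOf(x,p) ∧ authorOf(y,p) ∧ notEqual(x,y) → coAuthor(x,y)
        authors = [(s, o) for s, p, o in facts if p == "authorOf"]
        for (x, p1), (y, p2) in product(authors, repeat=2):
            if p1 == p2 and x != y:
                new.add((x, "coAuthor", y))

        # Rule set 3 (base): advisorOf(x,y) → researchAncestor(x,y)
        for s, p, o in facts:
            if p == "advisorOf":
                new.add((s, "researchAncestor", o))

        # Rule set 3 (transitive step)
        anc = [(s, o) for s, p, o in facts if p == "researchAncestor"]
        for (x, y1), (y2, z) in product(anc, repeat=2):
            if y1 == y2:
                new.add((x, "researchAncestor", z))

        if new <= facts:  # nothing new was derived: fixpoint reached
            return facts
        facts |= new
```

Dropping the `x != y` test gives rule set 1, which also derives `coAuthor(x, x)` for every author; that is the difference the "validated" variant removes.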

Rule set 4: Distinguished advisor (recursive)
  advisorOf(?x, ?y) ∧ advisorOf(?x, ?z) ∧ notEqual(?y, ?z) ∧ worksFor(?x, ?u) → distinguishAdvisor(?x, ?u)
  advisorOf(?x, ?y) ∧ distinguishAdvisor(?y, ?u) ∧ worksFor(?x, ?d) → distinguishAdvisor(?x, ?d)
Rule set 5: Combination of the four rule sets above.
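The recursion in rule set 4 can be sketched as a base case followed by propagation up the advisor relation. The function below is a hypothetical illustration; it assumes, for simplicity, that each person works for at most one department, so `works_for` is a dict rather than a set of pairs.

```python
def distinguished_advisors(advisor_of, works_for):
    """Evaluate rule set 4 over in-memory data (illustrative sketch).

    advisor_of: set of (advisor, advisee) pairs
    works_for:  dict mapping person -> department (assumption: one each)
    Returns a set of (person, department) pairs, i.e. distinguishAdvisor facts.
    """
    advisees = {}
    for advisor, advisee in advisor_of:
        advisees.setdefault(advisor, set()).add(advisee)

    # Base rule: x advises at least two distinct people and works for u.
    result = {
        (x, works_for[x])
        for x, ys in advisees.items()
        if len(ys) >= 2 and x in works_for
    }

    # Recursive rule: x advises a distinguished advisor and works for d.
    changed = True
    while changed:
        changed = False
        distinguished = {x for x, _ in result}
        for x, y in advisor_of:
            if y in distinguished and x in works_for:
                fact = (x, works_for[x])
                if fact not in result:
                    result.add(fact)
                    changed = True
    return result
```

The loop repeats until a pass adds nothing, which is what makes the rule recursive: each newly distinguished advisor can in turn make their own advisor distinguished.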

Jena
[rule1: (?x uni:authorOf ?p) (?y uni:authorOf ?p) notEqual(?x,?y) -> (?x uni:coAuthor ?y)]
[rule2: (?x uni:advisorOf ?y) -> (?x uni:researchAncestor ?y)]
[rule3: (?x uni:researchAncestor ?y) (?y uni:researchAncestor ?z) -> (?x uni:researchAncestor ?z)]
[rule4: (?x uni:advisorOf ?y) (?x uni:advisorOf ?z) notEqual(?y,?z) (?x uni:worksFor ?u) -> (?x uni:distinguishAdvisor ?u)]
[rule5: (?x uni:advisorOf ?y) (?y uni:distinguishAdvisor ?u) (?x uni:worksFor ?d) -> (?x uni:distinguishAdvisor ?d)]

In SWRL these rules are less compact. Rule 1: ……

Query in SPARQL notation:
Query 1: Co-author
PREFIX uni:
SELECT ?x ?y
WHERE { ?x uni:coAuthor ?y . ?x uni:hasName "FullProfessor0_d0_u0" }

Query 2: Research ancestor
PREFIX uni:
SELECT ?x ?y
WHERE { ?x uni:researchAncestor ?y . ?x uni:hasName "FullProfessor0_d0_u0" }

Query 3: Distinguished advisor
PREFIX uni:
SELECT ?x ?y
WHERE { ?x uni:distinguishAdvisor ?y . ?y uni:hasTitle "department0u0" }

Experimental Design - Metrics
Setup time
  – Covers loading and preprocessing before any query can be made
Query processing time
  – Starts with parsing and executing the query and ends when all results have been saved in the result set
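The two metrics can be separated with a small timing harness along the following lines. This is only a sketch of the measurement idea; `setup` and `run_query` are hypothetical callables standing in for whatever loading and querying API a given reasoner exposes.

```python
import time


def measure(setup, run_query):
    """Time the setup stage and the query stage separately.

    setup:     callable returning a ready-to-query store (load + preprocess)
    run_query: callable taking the store and returning an iterable of results
    Both are placeholders for a real reasoner's API.
    """
    t0 = time.perf_counter()
    store = setup()
    t1 = time.perf_counter()

    # Materialise the full result set so the clock stops only once
    # every result has been saved, matching the metric's definition.
    results = list(run_query(store))
    t2 = time.perf_counter()

    return {"setup_s": t1 - t0, "query_s": t2 - t1, "rows": len(results)}
```

Draining the iterator inside the timed region matters for systems that return results lazily; otherwise part of the query cost would go unmeasured.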

Experimental Design - Procedure
Scale with the size of the instance data
Scale with respect to the complexity of reasoning
  – Transitive chain
Caching effect
Realism of model (ScienceWeb vs. LUBM)

Results – Setup time

Setup time of rule set 1 for LUBM dataset

Setup time of rule set 4 for LUBM dataset

Setup time of rule set 1 for ScienceWeb dataset

Setup time of rule set 4 for ScienceWeb dataset

Results – Setup time
Some systems have no data points because of the size of the Abox (Jena, Pellet, and KAON2 load the data into memory for inferencing)
For small datasets (< 2 million triples), KAON2 is best
Oracle and OWLIM scale to 4 million triples with no problem; Oracle scales best
Great variation across rule sets
  – OWLIM does poorly compared to Oracle on rule set 4 for ScienceWeb (more triples in the Abox than LUBM)
  – Oracle does poorly on rule set 2, as it needs to set up a “filter” to implement “notEqual”

Results – Query Processing

Query processing time of query 1 for ScienceWeb dataset

Results – Query Processing
OWLIM is best for all but the largest (4 million) triple set
Oracle is best for the largest set
Queries return in seconds
Setup time can take hours

Results – Caching Effect
Caching ratio:
  – first query processing time / average over the next ten identical queries
OWLIM: little effect
In other systems, the effect becomes weaker as the size of the dataset grows
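The caching ratio defined above is a simple quotient, sketched here for clarity. The function name and the list-of-timings input are assumptions; a ratio near 1 means repeated queries are no faster than the first, while a large ratio indicates a strong caching effect.

```python
from statistics import mean


def caching_ratio(times):
    """Caching ratio for one query run repeatedly.

    times: processing times in seconds, first run first, followed by
           the repeated identical runs (ten in the study's setup).
    """
    first, rest = times[0], times[1:]
    if not rest:
        raise ValueError("need at least one repeated run")
    return first / mean(rest)
```

For example, a first run of 2.0 s followed by ten runs of 0.5 s each gives a ratio of 4.0.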

Caching ratios between processing time of single query and average processing time on ScienceWeb ontology for query 1

Results – Transitive Rule
We created a group of separate instance files containing different numbers of individuals that are related via the transitive rule in rule set 3
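A chain of this kind is easy to generate, and its size shows why long chains stress a reasoner: a chain of n individuals has n - 1 asserted advisorOf triples but entails n(n - 1)/2 researchAncestor triples under rule set 3. The sketch below uses hypothetical helper names and a made-up naming scheme for individuals; it is not the generator used in the study.

```python
def advisor_chain(n, prefix="person"):
    """Return the asserted triples for a chain of n individuals:
    person0 advisorOf person1 advisorOf ... person(n-1)."""
    return [
        (f"{prefix}{i}", "advisorOf", f"{prefix}{i + 1}")
        for i in range(n - 1)
    ]


def entailed_ancestor_count(n):
    """Number of researchAncestor triples entailed by a chain of n
    individuals: one for every ordered pair (i, j) with i < j."""
    return n * (n - 1) // 2
```

A chain of length 100 therefore asserts 99 triples but entails 4950, and the entailed set grows quadratically with chain length.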

Setup time for transitive rule

Query processing time after inference over transitive rule

Results – Transitive Rule
Pellet provides results before the time-out only when the length of the transitive chain is 100
Jena’s performance degrades badly when the length is more than 200
Only KAON2, OWLIM, and Oracle 11g could complete inference and querying on long transitive chains

Conclusions
When more realistic models (ScienceWeb) than those provided by LUBM are used, serious issues arise as the size approaches millions of triples
OWLIM and Oracle offer the best scalability for the kinds of datasets anticipated for ScienceWeb
  – They front-load the inferencing costs heavily by pre-computing the entailed relationships at setup time
  – This has negative implications for evolving systems
Real-time queries over large triple spaces will have to be limited in their scope
How can we specify what can be asked within a real-time system? We do not know yet.