IDB, SNU Dong-Hyuk Im 2008.07.11 Efficient Computing Deltas between RDF Models using RDFS Entailment Rules (working title)

Presentation transcript:

IDB, SNU Dong-Hyuk Im Efficient Computing Deltas between RDF Models using RDFS Entailment Rules (working title)

2 Contents
- Introduction
- Previous Works
- Our Approach
- Experimental Results

3 Introduction (1/2)
- Ontology evolution
  - Ontologies change (the real world is dynamic)
  - Changes occur in the domain of interest
[Figure: Domain, Model, and Ontology, connected by the labels "Modeling by", "Described by", and "Describe models"]

4 Introduction (2/2)
- Change detection in RDF
  - RDF is used in a variety of areas (knowledge domains)
  - Data on the web is updated frequently
  - Generally, the changed part is relatively small
- Goal: "GNU Diff"
  - Find the differences between two versions and inform the user about the changes
[Figure: "What is change?" The real world (knowledge domain) is conceptualized; changes include "Add knowledge", "Add relationship", "Add …"]

5 Motivating Example (Ontology Evolution)
[Figure: model K is transformed to K'. Both graphs contain the nodes Person, Student, TA, Jim, and Literal; the legend lists the edge types subClassOf, property, and type]

6 Change Detection: Δe (e: explicit)
K:
  Person type class
  Student type class
  TA type class
  Student subClassOf Person
  TA subClassOf Person
  Address type property
  Address domain Student
  Address range Literal
  Jim type Student
K':
  Person type class
  Student type class
  TA type class
  Student subClassOf Person
  TA subClassOf Student
  Address type property
  Address domain Person
  Address range Literal
  Jim type Person
Δe(K → K') = { Add(t) | t ∈ K' - K } ∪ { Del(t) | t ∈ K - K' }
Δe = { Del(TA subClassOf Person), Del(Address domain Student), Del(Jim type Student),
       Add(TA subClassOf Student), Add(Address domain Person), Add(Jim type Person) }
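The explicit delta is a pair of set differences, so it can be reproduced in a few lines. The following is a minimal sketch, assuming triples are modeled as Python tuples; the names `COMMON`, `K_PRIME`, and `explicit_delta` are illustrative and do not come from the slides.

```python
# Triples are (subject, predicate, object) tuples copied from the slide.
COMMON = {
    ("Person", "type", "class"),
    ("Student", "type", "class"),
    ("TA", "type", "class"),
    ("Student", "subClassOf", "Person"),
    ("Address", "type", "property"),
    ("Address", "range", "Literal"),
}
K = COMMON | {
    ("TA", "subClassOf", "Person"),
    ("Address", "domain", "Student"),
    ("Jim", "type", "Student"),
}
K_PRIME = COMMON | {
    ("TA", "subClassOf", "Student"),
    ("Address", "domain", "Person"),
    ("Jim", "type", "Person"),
}


def explicit_delta(source, target):
    """Explicit delta: compare only the explicitly stated triples."""
    adds = {("Add", t) for t in target - source}
    dels = {("Del", t) for t in source - target}
    return adds | dels


delta_e = explicit_delta(K, K_PRIME)
for op, triple in sorted(delta_e):
    print(op, *triple)
```

No inference is involved here, which is why this delta is cheap to compute but reports every rewritten triple, including ones that are still implied by the other version.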

7 Change Detection: Δc (c: closure)
K:
  Person type class
  Student type class
  TA type class
  Student subClassOf Person
  TA subClassOf Person
  Address type property
  Address domain Student
  Address range Literal
  Jim type Student
K':
  Person type class
  Student type class
  TA type class
  Student subClassOf Person
  TA subClassOf Student
  Address type property
  Address domain Person
  Address range Literal
  Jim type Person
Inferred triples shown on the slide: TA subClassOf Person, Address domain Student, Address domain TA, Jim type Person
Δc(K → K') = { Add(t) | t ∈ C(K') - C(K) } ∪ { Del(t) | t ∈ C(K) - C(K') }
Δc = { Del(Jim type Student), Add(TA subClassOf Student), Add(Address domain Person), Add(Address domain TA) }
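To see where the four operations in Δc come from, the closure C(·) can be materialized by forward chaining and the two closures compared. This is only a sketch under stated assumptions: `closure` applies subClassOf transitivity (rule 11), type propagation (rule 9), and, as an assumption read off the inferred triples on the slide rather than a standard RDFS rule, propagation of `domain` to subclasses. All function and variable names are illustrative.

```python
COMMON = {
    ("Person", "type", "class"), ("Student", "type", "class"),
    ("TA", "type", "class"), ("Student", "subClassOf", "Person"),
    ("Address", "type", "property"), ("Address", "range", "Literal"),
}
K = COMMON | {("TA", "subClassOf", "Person"),
              ("Address", "domain", "Student"),
              ("Jim", "type", "Student")}
K_PRIME = COMMON | {("TA", "subClassOf", "Student"),
                    ("Address", "domain", "Person"),
                    ("Jim", "type", "Person")}


def closure(triples):
    """Forward-chain to a fixpoint under the three rules named above."""
    closed = set(triples)
    while True:
        sub = {(s, o) for s, p, o in closed if p == "subClassOf"}
        new = set()
        for c, d in sub:  # c subClassOf d
            for s, p, o in closed:
                if p == "subClassOf" and s == d:
                    new.add((c, "subClassOf", o))  # rule 11: transitivity
                elif p == "type" and o == c:
                    new.add((s, "type", d))        # rule 9: instance of c is instance of d
                elif p == "domain" and o == d:
                    new.add((s, "domain", c))      # assumption: domain reaches subclasses
        if new <= closed:
            return closed
        closed |= new


def closure_delta(source, target):
    """Closure delta: compare the two closures instead of the explicit triples."""
    c_source, c_target = closure(source), closure(target)
    adds = {("Add", t) for t in c_target - c_source}
    dels = {("Del", t) for t in c_source - c_target}
    return adds | dels


delta_c = closure_delta(K, K_PRIME)
for op, triple in sorted(delta_c):
    print(op, *triple)
```

Both closures must be fully computed before a single comparison is made, which is the inference overhead the later slides set out to avoid.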

8 Change Detection: Δd (d: dense)
K:
  Person type class
  Student type class
  TA type class
  Student subClassOf Person
  TA subClassOf Person
  Address type property
  Address domain Student
  Address range Literal
  Jim type Student
K':
  Person type class
  Student type class
  TA type class
  Student subClassOf Person
  TA subClassOf Student
  Address type property
  Address domain Person
  Address range Literal
  Jim type Person
Inferred triples shown on the slide: TA subClassOf Person, Address domain Student, Address domain TA, Jim type Person
Δd(K → K') = { Add(t) | t ∈ K' - C(K) } ∪ { Del(t) | t ∈ K - C(K') }
Δd = { Del(Jim type Student), Add(TA subClassOf Student), Add(Address domain Person) }
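The dense delta compares explicit triples on one side against the closure on the other. The sketch below takes the closures as given instead of computing them: `C_K` and `C_K_PRIME` are the explicit triples plus the inferred triples listed on the slide, and the split of those inferred triples between K and K' is an assumption. The names are illustrative.

```python
COMMON = {
    ("Person", "type", "class"), ("Student", "type", "class"),
    ("TA", "type", "class"), ("Student", "subClassOf", "Person"),
    ("Address", "type", "property"), ("Address", "range", "Literal"),
}
K = COMMON | {("TA", "subClassOf", "Person"),
              ("Address", "domain", "Student"),
              ("Jim", "type", "Student")}
K_PRIME = COMMON | {("TA", "subClassOf", "Student"),
                    ("Address", "domain", "Person"),
                    ("Jim", "type", "Person")}

# Assumed closures: explicit triples plus the inferred triples on the slide.
C_K = K | {("Jim", "type", "Person")}
C_K_PRIME = K_PRIME | {("TA", "subClassOf", "Person"),
                       ("Address", "domain", "Student"),
                       ("Address", "domain", "TA")}


def dense_delta(source, target, c_source, c_target):
    """Dense delta: skip any change that the other version already entails."""
    adds = {("Add", t) for t in target - c_source}
    dels = {("Del", t) for t in source - c_target}
    return adds | dels


delta_d = dense_delta(K, K_PRIME, C_K, C_K_PRIME)
delta_e_size = len(K_PRIME - K) + len(K - K_PRIME)
print(len(delta_d), "operations in the dense delta,", delta_e_size, "in the explicit delta")
```

On this example the dense delta drops three of the six explicit operations, because `TA subClassOf Person`, `Address domain Student`, and `Jim type Person` are each entailed by the other version.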

9 Problem Definition
- Semantic diff:
  - Materialize the complete entailment (transitive closure)
  - Perform a structural diff
  - Highlight the differences between the two versions
- Closure computation (class hierarchy only): performing inference adds overhead

Data                          Size    Triples    Inferred triples  Inference time
UniProt Taxonomy (2008/2/28)  182 MB  2,637,046  7,111,            (S)
Gene Ontology (2008/01)       32 MB   409,671    376,807           11 (S)

10 Related Works
- On the Foundations of Computing Deltas between RDF models, ISWC 2007
  - Various RDF comparison functions in conjunction with the semantics of the underlying change operations
- SemVersion: A Versioning System for RDF and Ontologies, ESWC 2005
  - Proposes two diff algorithms: structure-based and semantic-aware
- Time-Space Trade-offs in Scaling up RDF Schema Reasoning, WISE workshop 2005
  - RDF reasoning that computes only a small part of the implied statements
- Inferencing and Truth Maintenance in RDF Schema, PSSS 2003
  - Gives a detailed algorithm for truth maintenance for RDF(S)

11 Previous Works vs Our Approach
[Figure: two pipelines, labeled "Previous works" and "Our Approach", built from the stages RDF Documents, Parsing and partitioning, inference, Structural Diff, and Diff result. The diff result is shown as a patch file with "Insert: ~~~~" and "Delete: ~~~~~" entries]

12 Our Approach: Delta_Closure
[Figure: model K (nodes A, B, C) is transformed to K' (nodes A, B, C, D)]
K:
  B subClassOf A
  C subClassOf A
K':
  B subClassOf C
  C subClassOf A
  D subClassOf A

13 Our Approach: Delta_Closure
Triples shown: B subClassOf A, C subClassOf A, B subClassOf C, C subClassOf A, D subClassOf A
Annotations on the triples: "No inference!!" and "May be an inferred triple: apply entailment rules"
- Previous works: if t ∉ K, check whether t ∈ C(K)
- Our approach: if t ∉ K, check whether t ∈ C(K) only for triples that satisfy our conditions

14 Algorithm (Delta & Closure)
Input:  S_source = set of triples in the source model
        S_target = set of triples in the target model
        L_key = list of keys (keys: all subject resources)
Output: Diff = set of change operations, computed using entailment rules
DO {
    For every key in L_key
        Select all triples in S_source whose subject is key
        Select all triples in S_target whose subject is key
        For every possible triple pair (x, y), x ∈ S_source, y ∈ S_target
            x' = ApplyRule(x)
            if (x' == y) no change
            else add x to Diff as a deletion
            y' = ApplyRule(y)
            if (y' == x) no change
            else add y to Diff as an insertion
} While (L_key is not empty)
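One way to read the algorithm is sketched below: triples are partitioned by subject, and a triple found on only one side is reported unless it may be inferred from the other model. This is an interpretation of the pseudocode, not the authors' implementation. The `may_be_inferred` predicate stands in for ApplyRule and here covers only subClassOf transitivity (rule 11); every name in the block is hypothetical.

```python
from collections import defaultdict


def group_by_subject(triples):
    """Partition triples by subject, which plays the role of the key."""
    groups = defaultdict(set)
    for triple in triples:
        groups[triple[0]].add(triple)
    return groups


def reaches(model, start, goal):
    """True if goal is reachable from start over subClassOf edges."""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        for s, p, o in model:
            if s == node and p == "subClassOf":
                if o == goal:
                    return True
                stack.append(o)
    return False


def may_be_inferred(triple, model):
    """Stand-in for ApplyRule: only subClassOf transitivity is checked."""
    s, p, o = triple
    return p == "subClassOf" and reaches(model, s, o)


def delta_closure(source, target, inferred=may_be_inferred):
    src, tgt = group_by_subject(source), group_by_subject(target)
    diff = set()
    for key in sorted(src.keys() | tgt.keys()):
        in_src, in_tgt = src.get(key, set()), tgt.get(key, set())
        for x in in_src - in_tgt:          # present only in the source
            if not inferred(x, target):
                diff.add(("Del", x))
        for y in in_tgt - in_src:          # present only in the target
            if not inferred(y, source):
                diff.add(("Add", y))
    return diff


# Example from the Delta_Closure slide.
K = {("B", "subClassOf", "A"), ("C", "subClassOf", "A")}
K_PRIME = {("B", "subClassOf", "C"), ("C", "subClassOf", "A"),
           ("D", "subClassOf", "A")}
diff = delta_closure(K, K_PRIME)
for op, triple in sorted(diff):
    print(op, *triple)
```

In this run `B subClassOf A` is not reported as a deletion, because it is still reachable in K' through C, and no closure of either model is ever materialized.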

15 Inference Engine
- Forward chaining
  - Frequently used for load-time inference (materialization)
  - Increased load time and storage space
  - Fast query response
- Backward chaining
  - Performs run-time inference
  - Short load time
  - Slow response time

16 RDF Inference Rule
- RDFS entailment rules (subsumption & type), from RDF Semantics
  - Rule 7:  (A subPropertyOf B), (U A Y)  ⟹  (U B Y)
  - Rule 9:  (U subClassOf X), (V type U)  ⟹  (V type X)
  - Rule 11: (U subClassOf V), (V subClassOf X)  ⟹  (U subClassOf X)
  - Rule 5:  (U subPropertyOf V), (V subPropertyOf X)  ⟹  (U subPropertyOf X)

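17 Applying Rules (Rule 11)
[Figure: two class graphs, one over the nodes A, B, C, D, E and one over the nodes A, B, C, E]
Triples shown: A subClassOf B, A subClassOf C, B subClassOf D, B subClassOf E, A subClassOf E; A subClassOf B, E subClassOf C, A subClassOf C
Check whether the triple may be inferred: A subClassOf E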

18 Applying Rules (Rule 9)
[Figure: two graphs over the classes A, B, C and the instance a]
Triples shown: A subClassOf B, A subClassOf C, a type A; A subClassOf B, A subClassOf C, a type C
Rule: (U subClassOf X), (V type U)  ⟹  (V type X)
Check whether the triples may be inferred: a type A, a type C

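19 Applying Rules (Rule 7)
[Figure: two graphs over the nodes A, B, C with the properties draw and create]
Triples shown, in slide order: A draw B, A draw C, A create B, A draw C, A draw B, A create B
Rule: (A subPropertyOf B), (U A Y)  ⟹  (U B Y)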
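The Rule 9 and Rule 7 checks share one shape: start from an explicit triple and walk up the class or property hierarchy until the target is found. The sketch below is a goal-directed check written under assumptions; in particular, `draw subPropertyOf create` is a hypothetical reading of the Rule 7 figure, and the function names are illustrative.

```python
def reachable(model, predicate, start):
    """Nodes reachable from start along `predicate` edges, start included."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for s, p, o in model:
            if s == node and p == predicate and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def entailed(triple, model):
    """True if triple is explicit in model or follows by rule 9 or rule 7.

    Rules 11 and 5 are used implicitly through the transitive walk.
    subClassOf and subPropertyOf goals themselves are not handled here.
    """
    s, p, o = triple
    if p == "type":
        # Rule 9: some explicit class of s is a (transitive) subclass of o.
        return any(o in reachable(model, "subClassOf", c)
                   for s2, p2, c in model if s2 == s and p2 == "type")
    # Rule 7: some explicit (s, q, o) has q as a (transitive) subproperty of p.
    return any(p in reachable(model, "subPropertyOf", q)
               for s2, q, o2 in model if s2 == s and o2 == o)


# Rule 9 example, following the triples on the Rule 9 slide.
classes = {("A", "subClassOf", "B"), ("A", "subClassOf", "C")}
print(entailed(("a", "type", "C"), classes | {("a", "type", "A")}))
print(entailed(("a", "type", "A"), classes | {("a", "type", "C")}))

# Rule 7 example; the subPropertyOf triple is an assumption.
props = {("draw", "subPropertyOf", "create")}
print(entailed(("A", "create", "B"), props | {("A", "draw", "B")}))
print(entailed(("A", "draw", "B"), props | {("A", "create", "B")}))
```

The direction matters in both cases: a type or property assertion propagates up the hierarchy, so `a type C` follows from `a type A`, but not the other way around.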

20 Experimental Setup (1/2)
- Implemented in Java
- Based on the main-memory representation of RDF graphs
- Data sets
  - Synthetic data set (RDF generator)
  - Gene Ontology termDB (RDF): is-a relationships only
  - Uniprot taxonomy (RDF): is-a relationships only

21 Experimental Setup (2/2)
Gene Ontology: versions G1 to G8
- Table rows: # of triples, Inference, Date (mm-yy), Size (MB)
- Dates: G1 Nov-07, G2 Dec-07, G3 Jan-08, G4 Feb-08, G5 Mar-08, G6 Apr-08, G7 May-08, G8 Jun-08
Uniprot Taxonomy: versions U1 to U5
- Table rows: # of triples, Inference, Date (mm-yy), Size (MB)
- Dates: Mar-08, Apr-08, Jun-08, Jul-08

22 Experimental Result (1/2)
- Delta size: dense and delta&closure are smaller than explicit and closure
  - The number of inferred triples is very small (is-a relationship)
- Performance: explicit and delta&closure are faster than dense and closure

23 Experimental Result (2/2)
- Delta size: dense and delta&closure are smaller than explicit and closure
  - The number of inferred triples is very small (is-a relationship)
  - Closure is much bigger than explicit
- Performance: explicit and delta&closure are faster than dense and closure

24 Conclusion
- Semantic-aware diff
  - Uses inference rules (RDF Schema)
  - Δ Explicit, Δ Closure, Δ Dense&closure, Δ Dense
- Our approach: Delta_closure
  - Considers both efficiency and correctness
  - Generates a smaller delta than Δ Explicit and runs faster than Δ Dense