Published by Jean Joye. Modified over 9 years ago.
1
Optimized Backward Chaining Reasoning System for a Semantic Web
Hui Shi, Kurt Maly, and Steven Zeil
Contact: maly@cs.odu.edu
WIMS 2014, June 2-4, Thessaloniki, Greece
2
Outline
Problem
– Semantic web subject to changes
– How to scale a reasoner to big data?
Background
– Knowledge base using ontologies
– Inference strategies
– Benchmarks
– Query optimization
Integrated optimized backward chaining
– Selection function
– Switching resolution methods
– Avoidance of non-termination – OLDT
– owl:sameAs optimization
Evaluation
Conclusions
3
Problem
Efficiency of reasoning in the face of large scale and frequent changes within a question/answer system over a semantic web
Issue
– Forward chaining scales well for fixed knowledge bases
– Backward chaining can handle changes in the knowledge base but does not scale
4
Background
Existing semantic applications: question/answer systems
– Libra, Cimple, Arnetminer
Semantic Web
– Resource Description Framework (RDF)
– Web Ontology Language (OWL) for specific knowledge domains
– SPARQL query language for RDF
– SWRL rule language
Reasoning systems
– Jena, with its proprietary Jena rules
– Pellet and KAON2
– Oracle 11g
– OWLIM
5
Background
Knowledge base (KB)
– Ontologies
– Representation formalism: Description Logic (DL)
Inference methods for First Order Logic
– Materialization and forward chaining
  pre-computes inferred truths and starts with the known data
  suitable for frequent computation of answers over data that are relatively static
  examples: OWLIM and Oracle
– Query-rewriting and backward chaining
  expands the queries and starts with goals
  suitable for efficient computation of answers when data are dynamic and queries are infrequent
  example: Virtuoso
6
Background
Benchmarks evaluate and compare the performance of different reasoning systems
– The Lehigh University Benchmark (LUBM)
– The University Ontology Benchmark (UOBM)
7
Background
Query optimization – issues
– Optimization of queries (conjunctions of individual clauses) over databases is well understood
– Interposing a reasoner introduces uncertainty about the size of the solution space associated with resolving individual clauses
– Queries must therefore be optimized in the presence of such uncertainty
Dynamic Optimization with an Interposed Reasoner
– A greedy ordering of the proofs of the individual clauses according to the estimated sizes of the proof results
– Deferring joins of results from individual clauses where such joins are likely to result in excessive combinatorial growth of the intermediate solution
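The greedy ordering idea can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the `Clause` type, its `estimated_size` field, and the sample estimates are hypothetical, and the estimates are treated as fixed rather than refined while the query runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clause:
    pattern: str          # triple pattern, e.g. "?x takesCourse ?y"
    estimated_size: int   # hypothetical estimate of the number of solutions


def greedy_order(clauses):
    """Order clauses so that the smallest estimated result is proven first."""
    return sorted(clauses, key=lambda c: c.estimated_size)


# Hypothetical query with made-up size estimates.
query = [
    Clause("?x takesCourse ?y", 80_000),
    Clause("?y type Course", 1_500),
    Clause("?x memberOf Dept0", 400),
]

for clause in greedy_order(query):
    print(clause.estimated_size, clause.pattern)
```

Proving the most selective clause first keeps the intermediate solution small, which is the motivation the slide gives for ordering by estimated size.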
8
Hybrid reasoner
Motivation example
– Assume a fully materialized KB
– Harvester adds a new fact: student0 enrolledIn course0
– Query “Who is enrolled in course0?”: ok
– Assume the fact Prof0 teaches course0 is in the KB
– Query “Who is being taught by Prof0?”: not ok as a simple lookup; needs reasoning with a rule such as:
  isTaughtBy(?Student, ?Faculty) :- enrolledIn(?Student, ?Course), teaches(?Faculty, ?Course)
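A toy version of this example shows why backward chaining copes with the newly harvested fact. It is a minimal sketch, assuming the rule derives isTaughtBy from enrolledIn and teaches; the tuple layout, the `is_taught_by` helper, and the extra `student1` fact are hypothetical.

```python
# Toy knowledge base of (predicate, subject, object) facts from the slide.
facts = {
    ("enrolledIn", "student0", "course0"),
    ("teaches", "Prof0", "course0"),
}


def is_taught_by(faculty):
    """Backward-chain the goal isTaughtBy(?Student, faculty).

    The goal is not stored as a fact, so it is expanded into the rule body:
    enrolledIn(?Student, ?Course), teaches(faculty, ?Course).
    """
    courses = {o for (p, s, o) in facts if p == "teaches" and s == faculty}
    return sorted(s for (p, s, o) in facts if p == "enrolledIn" and o in courses)


print(is_taught_by("Prof0"))   # ['student0']

# A newly harvested fact is visible to the next query without re-materialization.
facts.add(("enrolledIn", "student1", "course0"))
print(is_taught_by("Prof0"))   # ['student0', 'student1']
```

A purely materialized KB would need to re-derive isTaughtBy facts after each harvest; here the derivation happens at query time.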
9
Optimized Backward Chaining
Problem
– Generate a query response for a given query pattern based on a specific rule set (RDFS, Horst, custom)
Four Optimizations
– Ordered Selection Function
– Switching between Binding Propagation and Free Variable Resolution
– Avoid Repetition and Non-Termination (OLDT)
– owl:sameAs Optimization
10
Dynamic Selection of Propagation Mode
Suppose that:
– we have a rule body containing clauses (?x p1 ?y) and (?y p2 ?z)
– we have already proven that the first clause can be satisfied using value pairs {(x1, y1), (x2, y2), …, (xn, yn)}.
11
Dynamic Selection of Propagation Mode
Binding propagation mode
– the bindings from the earlier solutions are substituted into the upcoming clause to yield multiple instances of that clause as goals for subsequent proof
– (y1 p2 ?z), (y2 p2 ?z), …, (yn p2 ?z)
Free variable resolution mode
– a single proof is attempted of the upcoming clause in its original form, with no restriction upon the free variables in that clause
– (?y p2 ?z)
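The two modes can be contrasted on a toy triple store. This is a hedged sketch, not the system described in the talk: the `TRIPLES` data and the `prove` helper are hypothetical, and "proving" a goal is reduced to a pattern match over stored triples, with no rules involved.

```python
# Hypothetical toy triple store; the predicate names p1 and p2 follow the slide.
TRIPLES = [
    ("a", "p1", "b"), ("c", "p1", "d"),
    ("b", "p2", "e"), ("d", "p2", "f"), ("g", "p2", "h"),
]


def prove(s, p, o):
    """Return the triples matching a pattern; None stands for a free variable."""
    return [t for t in TRIPLES
            if (s is None or t[0] == s) and t[1] == p and (o is None or t[2] == o)]


def binding_propagation(first):
    """Prove one specialised goal (y_i p2 ?z) per binding of ?y."""
    results = []
    for x, _, y in first:
        for _, _, z in prove(y, "p2", None):
            results.append((x, y, z))
    return results


def free_variable_resolution(first):
    """Prove (?y p2 ?z) once with ?y free, then join the two results on ?y."""
    second = prove(None, "p2", None)
    return [(x, y, z) for x, _, y in first for y2, _, z in second if y == y2]


first = prove(None, "p1", None)            # solutions of (?x p1 ?y)
print(binding_propagation(first))          # [('a', 'b', 'e'), ('c', 'd', 'f')]
print(free_variable_resolution(first))     # same answers, obtained with one proof
```

Both modes return the same answers; they differ in how many proofs are attempted and how large the join is, which is what makes the choice between them a cost question.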
12
Dynamic Selection of Propagation Mode: Example
Suppose we have an earlier body clause 1: “?y type Course” and a subsequent body clause 2: “?x takesCourse ?y”.
– 1.749 seconds to prove body clause 1
– average of 0.235 seconds to prove body clause 2 for a given value of ?y from the proof of body clause 1
– 86,361 students satisfying variable ?x
– 0.235 * 86,361 = 20,295 seconds with binding propagation
– 2.612 seconds to resolve the second clause in free variable resolution
13
Dynamic Selection of Propagation Mode
– Dynamically switch between modes based upon the size of the partial solutions obtained
  Let n denote the number of solutions that satisfy an already proven clause
  Let t denote the threshold used to dynamically select between modes
  If n ≤ t, then the binding propagation mode will be selected
  If n > t, then the free variable resolution mode will be selected
  The larger the threshold is, the more likely binding propagation mode will be selected.
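The selection rule itself is a single comparison. The sketch below only restates it in code; the function name, the mode labels, and the threshold value of 10 in the usage lines are hypothetical.

```python
def select_mode(n: int, t: int) -> str:
    """Pick the propagation mode from the solution count n and the threshold t."""
    return "binding_propagation" if n <= t else "free_variable_resolution"


# Hypothetical threshold of 10: few bindings are propagated, many are not.
print(select_mode(3, 10))        # binding_propagation
print(select_mode(86_361, 10))   # free_variable_resolution
```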
14
Calculation of Threshold t
– Let $\mathit{join1}$ denote the time spent on the join operations in binding propagation mode
– Let $\mathit{join2}$ denote the time spent on the join operations in free variable resolution mode
– Let $\mathit{proof1}_i$ denote the time of proving the first clause with $i$ free variables and $\mathit{proof2}_j$ the average time of proving the new specialized form with $j$ free variables ($i \in [1,3]$, $j \in [0,2]$)
– Let $\mathit{proof3}_k$ denote the time of proving the second clause with $k$ free variables ($k \in [1,3]$)
Compare the time spent in binding propagation mode and in free variable resolution mode to determine t.
Binding propagation is favored when
  $\mathit{proof1}_i + \mathit{proof2}_j \cdot n + \mathit{join1} < \mathit{proof1}_i + \mathit{proof3}_k + \mathit{join2}$
  $t = \lfloor \mathit{proof3}_k / \mathit{proof2}_j \rfloor$
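The step from the inequality to the threshold can be spelled out. The derivation below is a sketch: dropping the join terms rests on the assumption, not stated on the slide, that the two join costs are of similar size. The numeric line reuses the timings from the earlier takesCourse example, taking 2.612 s as $\mathit{proof3}_k$ and 0.235 s as $\mathit{proof2}_j$.

```latex
\begin{align*}
\mathit{proof1}_i + n \cdot \mathit{proof2}_j + \mathit{join1}
  &< \mathit{proof1}_i + \mathit{proof3}_k + \mathit{join2} \\
n \cdot \mathit{proof2}_j
  &< \mathit{proof3}_k + (\mathit{join2} - \mathit{join1}) \\
n &< \frac{\mathit{proof3}_k + \mathit{join2} - \mathit{join1}}{\mathit{proof2}_j}
\end{align*}
% Assumption (not on the slide): join1 and join2 are of similar size,
% so their difference is neglected.
\[
t = \left\lfloor \frac{\mathit{proof3}_k}{\mathit{proof2}_j} \right\rfloor
\]
% Illustration with the example timings:
\[
\left\lfloor \frac{2.612}{0.235} \right\rfloor = \lfloor 11.1\ldots \rfloor = 11
\]
```

With those timings, binding propagation would be preferred only while the first clause yields at most 11 solutions.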
15
Calculation of Threshold t
To estimate $\mathit{proof3}_k$ and $\mathit{proof2}_j$
– we record the time spent on proving goals with different numbers of free variables
– after we have recorded a sufficient number of proof times, we compute the average time spent on goals with k free variables and j free variables respectively
Start with a historical default value
Update the threshold several times when answering a particular query
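One way to picture this estimation scheme is a small recorder that averages proof times per number of free variables and falls back to a default until enough samples exist. This is a hypothetical sketch: the class name, `default_t=10`, and `min_samples=3` are assumptions, since the slides do not say what the default is or how many samples count as sufficient.

```python
from collections import defaultdict
from math import floor


class ThresholdEstimator:
    """Running estimate of t = floor(proof3_k / proof2_j) from recorded proof times."""

    def __init__(self, default_t=10, min_samples=3):
        # default_t and min_samples are assumed values, not taken from the slides.
        self.default_t = default_t
        self.min_samples = min_samples
        self.times = defaultdict(list)   # number of free variables -> proof times (s)

    def record(self, free_vars, seconds):
        """Store the time spent proving one goal with the given number of free variables."""
        self.times[free_vars].append(seconds)

    def _average(self, free_vars):
        samples = self.times.get(free_vars, [])
        if len(samples) < self.min_samples:
            return None                  # not enough history yet
        return sum(samples) / len(samples)

    def threshold(self, k, j):
        """k: free variables in the unbound clause; j: in its specialised form."""
        proof3, proof2 = self._average(k), self._average(j)
        if proof3 is None or proof2 is None or proof2 == 0:
            return self.default_t        # fall back to the historical default
        return floor(proof3 / proof2)


est = ThresholdEstimator()
print(est.threshold(k=2, j=1))           # 10, the default, before any timings exist
for s in (3.0, 4.0, 5.0):
    est.record(2, s)                     # goals with two free variables
for s in (0.25, 0.5, 0.75):
    est.record(1, s)                     # goals with one free variable
print(est.threshold(k=2, j=1))           # 8, i.e. floor(4.0 / 0.5)
```

Calling `threshold` again after further `record` calls corresponds to updating the threshold several times while a query is being answered.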
16
Evaluation
Query | Time (ms), Dynamic selection | Time (ms), Binding propagation only | Time (ms), Free variable resolution only
Query1 | 343 296
Query2 | 1,060 | 1,341 | 21,278
Query3 | 152015
Query4 | 85896142,572
Query5 | 151622,323
Query6 | 1,170 | 592,944 | 19,968
Query7 | 1,341 | 551,822 | 20,217
Query8 | 1,684 | 513,773 | 40,061
Query9 | 1,591 | 524,787 | 20,841
Query10 | 982 | 509,078 | 19,734
Query11 | 9310919,141
Query12 | 10915638,313
Query13 | 01021,528
Query14 | 156140
17
Overall Performance
Query | LUBM(1): Time (ms), Opt. Backwd | LUBM(1): Time (ms), OWLIM-SE | LUBM(40): Time (ms), Opt. Backwd | LUBM(40): Time (ms), OWLIM-SE
Loading time | 2,900 | 9,600 | 95,000 | 350,000
Query1 | 260271,40026
Query2 | 4903.49,1005,100
Query3 | 561.0362.5
Query4 | 4708.45,90014
Query5 | 33591541
Query6 | 18024043,0005,300
Query7 | 1904.451,00054
Query8 | 54046057,0003,000
Query9 | 2506387,0004,400
Query10 | 1400.1051,0000.60
Query11 | 1904.92005.4
Query12 | 2201.03,60011
Query13 | 280.203317
Query14 | 24231,2002,500
LUBM(1) = 100,839; LUBM(40) = 5,307,754
18
Conclusions
– We have developed optimizations for a backward chaining algorithm
– The new optimized algorithm outperformed one of the best forward-chaining reasoners in scenarios where the knowledge base is subject to frequent change