Datalog DL: Datalog Rules Parameterized by Description Logics
Jing Mei, Harold Boley, Jie Li, Virendrakumar C. Bhavsar, Zuoquan Lin
Canadian Semantic Web Working Symposium, June 6, 2006, Laval University, Quebec City, Canada
2 Contents
  Semantic Web Architectures
  Context of Datalog DL
    Description Logic (DL) Family
    Hybrid Knowledge Bases
    Strategies for Reasoning Services
    Integration Frameworks
    Comparison
  Proposal of Datalog DL
    Syntax
    Semantics
    Reasoning
    Examples
  Selected References
3 Semantic Web Architectures
  Homogeneous approach
  Hybrid approach
5 The DL Family (Bottom-Up)
  ALC: C and D are classes, R is a property
  S = ALC extended with transitive properties (R+)
  SI: Inverse properties
  SHI: Property hierarchies
  SHIF: Functional restrictions
  SHIN: Cardinality (number) restrictions
  SHIQ: Qualified number restrictions
  Support for datatype predicates (e.g. string, integer) adds the concrete domain, written (D)
  Nominals (O) allow classes to be constructed from singleton sets, with the so-called one-of operator
  OWL Lite = SHIF(D)
  OWL DL = SHOIN(D) [10]
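The naming scheme above can be read letter by letter. The following is a minimal sketch, assuming only the letters listed on this slide; the names `FEATURES` and `describe` are hypothetical and chosen for illustration.

```python
# Letters as listed on the slide. Other letters (e.g. the "R" of ALCNR)
# are deliberately not covered by this sketch.
FEATURES = {
    "S": "ALC with transitive properties",
    "H": "property hierarchies",
    "I": "inverse properties",
    "F": "functional restrictions",
    "N": "cardinality (number) restrictions",
    "Q": "qualified number restrictions",
    "O": "nominals (one-of operator)",
}


def describe(name):
    """Expand a DL name such as 'SHOIN(D)' into the features its letters stand for."""
    has_datatypes = name.endswith("(D)")
    letters = name[:-3] if has_datatypes else name
    features = []
    if letters.startswith("ALC"):
        features.append("ALC")
        letters = letters[3:]
    features += [FEATURES[letter] for letter in letters]
    if has_datatypes:
        features.append("datatype predicates (concrete domain D)")
    return features


for dl_name in ("ALC", "SHIQ", "SHOIN(D)"):
    print(dl_name, "->", describe(dl_name))
```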
6 Hybrid Knowledge Base
  Hybrid KB: a pair K consisting of
    A DL KB
    A Datalog program with DL-queries to the DL KB
  Hybrid rules
    h(X) :- b_1(Y_1), …, b_m(Y_m) & q_1(Z_1), …, q_n(Z_n)
    h(X) and b_i(Y_i) are Datalog atoms (1≤i≤m); X and Y_i are sequences of constants|variables
    q_j(Z_j) are DL-queries (1≤j≤n); Z_j is a sequence of constants|variables
  Safeness condition
    Weak safeness condition
      Variables appearing in the head of a rule must also appear in the body, but not necessarily in the DL body
      That is, a variable occurring in X must occur in one of the Y_i|Z_j's
    Strong safeness condition
      Each variable appearing in the DL component also appears in the Datalog component, in addition to weak safeness
      That is, a variable occurring in X|Z_j must occur in one of the Y_i's
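The two safeness conditions can be checked mechanically. Below is a minimal sketch, assuming atoms are encoded as tuples `(predicate, term, ...)` and that variables start with an uppercase letter; this encoding and the function names are illustrative assumptions. The sample rule is the recursive lessThan rule from the decidability slide.

```python
def variables(atoms):
    """Collect variable names (terms starting with an uppercase letter)."""
    return {term for atom in atoms for term in atom[1:] if term[:1].isupper()}


def is_weakly_safe(head, datalog_body, dl_body):
    """Every head variable occurs somewhere in the body (Datalog or DL part)."""
    return variables([head]) <= variables(datalog_body) | variables(dl_body)


def is_strongly_safe(head, datalog_body, dl_body):
    """Every head variable and every DL-body variable occurs in the Datalog body."""
    return variables([head]) | variables(dl_body) <= variables(datalog_body)


# lessThan(X, Y) :- lessThan(Z, Y) & succ(X, Z).
rule = (("lessThan", "X", "Y"), [("lessThan", "Z", "Y")], [("succ", "X", "Z")])
print(is_weakly_safe(*rule), is_strongly_safe(*rule))  # True False
```

The rule is weakly safe but not strongly safe, because X occurs in the head and the DL body but not in the Datalog body.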
7 Strategies for Reasoning Services
  Beyond the classical DL tableaux calculus
  Based on reduction
    Reducing a DL KB to (disjunctive, function-free, negation-free) Datalog rules
    Rule engines then support DL reasoning
  Based on components
    SLD-resolution for rules
      Backward chaining, top-down
      Collecting DL-queries, which are finally evaluated for DL satisfiability
    Entailment for DL
      Forward chaining, bottom-up
      Building DL tableaux, whose inferred assertions are fed into rules
    Fixpoint iteration for both DL and rules
      Modular reasoning method with separation of reasoning for components
      Running DL reasoners and rule engines at the same time
      Exchanging information until a fixpoint is reached
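The fixpoint-iteration strategy can be illustrated with two toy components that exchange facts until nothing new is derived. This is only a sketch under stated assumptions: `dl_step` and `rule_step` are hypothetical stand-ins for a DL reasoner and a rule engine, and the TBox, the rule, and the facts are invented for illustration.

```python
# Toy TBox of atomic subsumptions C ⊑ D (an assumption for this sketch).
SUBSUMPTIONS = {"Student": "Person", "Attendee": "Member"}


def dl_step(facts):
    """Stand-in DL reasoner: propagate class membership along subsumptions."""
    return {(SUBSUMPTIONS[f[0]], f[1]) for f in facts
            if len(f) == 2 and f[0] in SUBSUMPTIONS}


def rule_step(facts):
    """Stand-in rule engine for: Attendee(X) :- Person(X), enrolled(X, C)."""
    people = {f[1] for f in facts if len(f) == 2 and f[0] == "Person"}
    return {("Attendee", f[1]) for f in facts
            if len(f) == 3 and f[0] == "enrolled" and f[1] in people}


def exchange_until_fixpoint(facts, components):
    """Run every component on the shared facts until no new fact appears."""
    known = set(facts)
    while True:
        derived = set().union(*(step(known) for step in components))
        if derived <= known:
            return known
        known |= derived


start = {("Student", "paul"), ("enrolled", "paul", "ai")}
result = exchange_until_fixpoint(start, [dl_step, rule_step])
print(sorted(result))
```

Information flows in both directions here: the rule needs Person(paul) from the DL side, and the DL side needs Attendee(paul) from the rule to conclude Member(paul).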
8 Integration Frameworks
  Homogeneous approaches
    DLP [1]: Description Logic Programs
    SWRL [2]: Semantic Web Rule Language
    KAON2 [3]: OWL extended with DL-safe rules
  Hybrid approaches
    AL-log [4]: ALC DL + Datalog
    CARIN [5]: ALCNR DL + Datalog, where N means cardinality (number) restrictions and R means role conjunctions [10]
    dl-programs [6]: SHIF(D) | SHOIN(D) DL + Answer Set Programming
    r-hybrid KBs [7]: A decidable DL + Datalog
9 Comparison
  [Table: homogeneous approaches (DLP, SWRL, KAON2) and hybrid approaches (AL-log, CARIN, dl-programs, r-hybrid KBs, Datalog DL) compared by reasoning strategy, information flow between Datalog and DL (uni-directional or bi-directional), and safeness condition (strong or weak); reasoning strategy entries for the homogeneous approaches: Reduction or –]
  Reasoning strategy of the hybrid approaches
    AL-log: SLD-resolution
    CARIN: Entailment
    dl-programs: Fixpoint iteration
    r-hybrid KBs: –
    Datalog DL: SLD-resolution
  Notes on the homogeneous approaches:
    1. DLP: Expressivity restrictions
    2. SWRL: Undecidable
    3. KAON2: DL-safe rules
  Notes on the hybrid approaches:
    1. AL-log: Only concept constraints
    2. CARIN: Recursive CARIN-ALCNR undecidable
    3. dl-programs: Nonmonotonic semantics
    4. r-hybrid KBs: Nonmonotonic semantics
11 A Hybrid Approach: Datalog DL
  Datalog DL: Combining (a sublanguage of) SHIQ DL and Datalog rules
    The rule component: (disjunctive, function-free, negation-free) Datalog with terms consisting of variables and constants
    The DL component: any specific decidable DL language ranging from ALC to SHIQ
    Safeness: weak safeness condition
    Requirement: independent properties
  Reasoning strategy
    SLD-resolution for rules: extending a rule engine (OO jDREW) to incorporate hybrid rules
    Tableaux algorithm for DL queries: using an external DL reasoner (Racer) to check ALC to SHIQ satisfiability
12 Syntax
  An alphabet of predicates A = A_T ∪ A_P with A_T ∩ A_P = ∅
  A Datalog L KB K is a pair consisting of
    An L-based DL KB with predicates in A_T, where L ranges from ALC to SHIQ
    A Datalog program with DL-queries to the DL KB, s.t. each hybrid rule r is
      [r] h(X) :- b_1(Y_1), …, b_m(Y_m) & q_1(Z_1), …, q_n(Z_n)
      where
        X, Y_1, ..., Y_m are n-ary sequences of terms (constants|variables)
        Z_1, ..., Z_n are unary/binary sequences of terms
        h(X), b_i(Y_i) (1≤i≤m) are Datalog atoms with predicates in A_P
        Each q_j(Z_j) (1≤j≤n) is a DL-query with predicate in A_T
  Notes:
    1. "DL body" means: "q_1(Z_1), …, q_n(Z_n)"
    2. "Datalog body" means: "b_1(Y_1), …, b_m(Y_m)"
    3. "Datalog rule" means: hybrid rule after deletion of "& DL body"
13 Decidability Issues
  As pointed out for CARIN
    Recursive Datalog rules + cyclic TBox with only the DL constructor ∃P.C
    Reducing the halting problem of a Turing machine (known to be undecidable) to the entailment problem of a KB containing
      DL ABox: integer(1)
      DL TBox: integer ⊑ ∃succ.integer
      rule-primitive: lessThan(x, y) :- & succ(x, y).
      rule-recursive: lessThan(x, y) :- lessThan(z, y) & succ(x, z).
    Remark: The strong safeness condition would demand that "x" occur in "lessThan(z, y)" in the above KB example
  Re-obtaining decidability
    AL-log: Disallowing DL property queries like "succ(x, y)"
    CARIN: A (maximal) decidable sublanguage, namely CARIN-MARC
    DLP: Disallowing the existential DL constructor ∃P.C on the right-hand side of "⊑" subsumptions
    Datalog DL: By means of constrained SLD-resolution, provided by hybrid rules with independent properties
14 Features of Datalog DL
  Pure-DL variables
    A pure-DL variable in a rule r is a variable that occurs only in the Z_j's
    Pure-DL variables lead to the violation of the strong safeness condition in cases where the weak safeness condition is obeyed
    According to classical SLD-resolution with rules, non-pure-DL variables will be bound to ground values, still leaving pure-DL variables free
  Folding
    Classical DL algorithms: Reducing DL queries to KB unsatisfiability, e.g. by transforming the query into a negated assertion, but the negation of properties is not supported by most DLs
    DL-query C(x) is reduced to checking whether C is non-empty, where x is a pure-DL variable
    DL-query P(u, x) ∧ C(x) becomes the folding result ∃P.C(u), where x is a pure-DL variable
    DL-query P(x, u) ∧ C(x) becomes the folding result ∃P⁻.C(u), where x is a pure-DL variable and P⁻ is the inverse of P
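The three folding cases can be sketched in a few lines. This is an illustrative sketch, not the OO jDREW implementation: atoms are assumed to be tuples `(predicate, term, ...)`, variables are assumed to start with an uppercase letter, and a pure-DL variable is read here as one that occurs in the DL body but neither in the head nor in the Datalog body. Using ⊤ when no class atom constrains the variable, and ⊓ for several class atoms, are additional assumptions beyond the slide.

```python
def is_var(term):
    return term[:1].isupper()


def pure_dl_variables(head, datalog_body, dl_body):
    """Variables that occur in the DL body only (assumed reading of the slide)."""
    def vars_of(atoms):
        return {t for atom in atoms for t in atom[1:] if is_var(t)}
    return vars_of(dl_body) - vars_of([head]) - vars_of(datalog_body)


def fold(dl_atoms, x):
    """Fold pure-DL variable x away; returns (untouched atoms, folded query)."""
    classes = [a[0] for a in dl_atoms if len(a) == 2 and a[1] == x]
    props = [a for a in dl_atoms if len(a) == 3 and x in a[1:]]
    rest = [a for a in dl_atoms if x not in a[1:]]
    if len(classes) > 1:
        filler = "(" + " ⊓ ".join(classes) + ")"
    else:
        filler = classes[0] if classes else "⊤"
    if not props:
        return rest, f"nonempty({filler})"       # C(x): is C non-empty?
    if len(props) > 1 or props[0][1] == props[0][2]:
        raise ValueError(f"{x} is shared by property atoms; folding does not apply")
    prop, subject, obj = props[0]
    if obj == x:
        return rest, f"∃{prop}.{filler}({subject})"   # P(u, x) ∧ C(x)
    return rest, f"∃{prop}⁻.{filler}({obj})"          # P(x, u) ∧ C(x)


# Rule h(U) :- b(U) & P(U, X), C(X); SLD-resolution is assumed to bind U to "a".
print(pure_dl_variables(("h", "U"), [("b", "U")], [("P", "U", "X"), ("C", "X")]))
print(fold([("P", "a", "X"), ("C", "X")], "X")[1])
print(fold([("P", "X", "a"), ("C", "X")], "X")[1])
```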
15 Features of Datalog DL (cont'd)
  Independent properties
    Folding cannot be applied to query parts that contain cycles (e.g. P(x, y) ∧ Q(y, z) ∧ R(z, x)), or where more than one property arc enters a node that corresponds to a variable (e.g. P(u, x) ∧ Q(y, x))
    Tree-shaped DL queries: Adding rules to DLs in an unrestricted manner causes the loss of any form of tree model property
    A property P is independent in a rule r if no P occurrence shares any pure-DL variables with other property occurrences (including other P occurrences)
    Correspondence: For hybrid rules with independent properties, the folding results are equivalent to the original DL-queries
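The independence requirement amounts to a simple count: no pure-DL variable may occur in more than one property atom. A minimal sketch follows, assuming atoms are tuples `(predicate, term, ...)` and that the set of pure-DL variables is supplied by the caller; the function name is hypothetical.

```python
from collections import Counter


def has_independent_properties(dl_body, pure_vars):
    """True if every pure-DL variable occurs in at most one property atom."""
    occurrences = Counter()
    for atom in dl_body:
        if len(atom) == 3:  # binary atom, i.e. a property occurrence
            for var in set(atom[1:]) & set(pure_vars):
                occurrences[var] += 1
    return all(count <= 1 for count in occurrences.values())


# P(u, x) ∧ C(x): x occurs in a single property atom, so folding applies.
print(has_independent_properties([("P", "u", "x"), ("C", "x")], {"x"}))
# Cycle P(x, y) ∧ Q(y, z) ∧ R(z, x): every variable is shared.
print(has_independent_properties(
    [("P", "x", "y"), ("Q", "y", "z"), ("R", "z", "x")], {"x", "y", "z"}))
# Two property arcs entering x: P(u, x) ∧ Q(y, x).
print(has_independent_properties([("P", "u", "x"), ("Q", "y", "x")], {"x", "y"}))
```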
16 Two Other Transformations
  Making weakly safe rules strongly safe
    Referring to DL-safe rules in KAON2 [3]
    A special predicate symbol O ∈ A_P
    For each pure-DL variable w in a rule r, add an atom O(w) to the Datalog body of r
    For each constant c occurring in K, add a fact O(c) to the Datalog program
  Rolling-up to eliminate DL property queries
    Referring to a conjunctive query language for DL ABox [8]
    Similar to folding in our setting
    Exploiting the DL tree model feature for queries containing cycles
    Simulating the one-of operator by substituting each individual a with a representative concept P_a of the individual a
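The first transformation can be sketched as follows. The encoding is an assumption made for illustration: atoms are tuples `(predicate, term, ...)`, variables start with an uppercase letter, and `dl_individuals` stands for the constants of the DL KB. The sketch guards every variable that is missing from the Datalog body, which covers the pure-DL variables and is what strong safeness requires.

```python
def is_var(term):
    return term[:1].isupper()


def make_strongly_safe(rules, facts, dl_individuals):
    """Add O(w) guards to rule bodies and O(c) facts for all constants."""
    new_rules = []
    constants = set(dl_individuals)
    for head, datalog_body, dl_body in rules:
        bound = {t for a in datalog_body for t in a[1:] if is_var(t)}
        unbound = {t for a in [head, *dl_body] for t in a[1:] if is_var(t)} - bound
        guards = [("O", w) for w in sorted(unbound)]
        new_rules.append((head, list(datalog_body) + guards, list(dl_body)))
        for atom in [head, *datalog_body, *dl_body]:
            constants |= {t for t in atom[1:] if not is_var(t)}
    for atom in facts:
        constants |= set(atom[1:])
    new_facts = list(facts) + [("O", c) for c in sorted(constants)]
    return new_rules, new_facts


# lessThan(X, Y) :- lessThan(Z, Y) & succ(X, Z), with DL ABox integer(1).
rules = [(("lessThan", "X", "Y"), [("lessThan", "Z", "Y")], [("succ", "X", "Z")])]
new_rules, new_facts = make_strongly_safe(rules, [], {"1"})
print(new_rules)
print(new_facts)
```

After the transformation X is bound by O(X) in the Datalog body, so it can only range over named constants.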
17 Semantics
  A first-order interpretation I = (Δ, ·^I) of Datalog L
    Δ: the non-empty domain of I
    ·^I: the interpretation function of I
  A model of the Datalog L KB K
    The interpretation I is a model of the DL KB
    The interpretation I satisfies each hybrid rule r in the Datalog program, i.e.
      [r] h(X) :- b_1(Y_1), …, b_m(Y_m) & q_1(Z_1), …, q_n(Z_n)
      s.t. if T_r(Y_i) ∈ b_i^I and T_r(Z_j) ∈ q_j^I (1≤i≤m, 1≤j≤n) for every atom in the body of r, then T_r(X) ∈ h^I for the head of r, where T_r is a term assignment w.r.t. I for constants and variables in r
  Notes:
    1. The interpretation of constants follows the standard names assumption and the unique name assumption
    2. Without negation-as-failure, this first-order semantics gives a platform for the DL-and-Datalog combination, both of which are first-order fragments
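Over a finite domain, rule satisfaction can be checked by enumerating all term assignments. The sketch below is an illustration under assumptions: atoms are tuples `(predicate, term, ...)`, variables start with an uppercase letter, constants denote themselves (in the spirit of the standard names assumption), and `ext` maps each predicate to its extension. Real DL interpretations need not be finite, so this only illustrates the definition.

```python
from itertools import product


def satisfies(head, body, domain, ext):
    """Check that every assignment making the whole body true makes the head true.

    `body` lists the Datalog atoms and the DL-queries together.
    """
    variables = sorted({t for atom in [head, *body] for t in atom[1:]
                        if t[:1].isupper()})
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))

        def ground(atom):
            # Constants are mapped to themselves, variables to their values.
            return tuple(assignment.get(t, t) for t in atom[1:])

        body_holds = all(ground(atom) in ext[atom[0]] for atom in body)
        if body_holds and ground(head) not in ext[head[0]]:
            return False
    return True


domain = ["1", "2", "3"]
ext = {"succ": {("1", "2"), ("2", "3")},
       "lessThan": {("1", "2"), ("2", "3")}}
primitive = (("lessThan", "X", "Y"), [("succ", "X", "Y")])
recursive = (("lessThan", "X", "Y"), [("lessThan", "Z", "Y"), ("succ", "X", "Z")])

print(satisfies(*primitive, domain, ext))   # True
print(satisfies(*recursive, domain, ext))   # False: lessThan(1, 3) is missing
ext["lessThan"].add(("1", "3"))
print(satisfies(*recursive, domain, ext))   # True
```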
18 Reasoning
  A kind of constrained SLD-resolution
    Grounding variables in hybrid rules, with pure-DL variables still left free
    Folding (independent) properties to eliminate pure-DL variables
  DL satisfiability
    DL queries without variables
    Building a disjunctive DL class for the collection of DL queries from hybrid rules sharing the same head
19 Example of AL-log
  Referring to AL-log [4], a query to mayDoThesis(paul, mary)
  The final ground queries after constrained SLD-resolution without folding:
    expert(mary, lp), exam(paul, ai), subject(ai, lp) & St(paul), Tp(lp), AC(ai),
20 Example of CARIN
  Referring to CARIN [5], a query to price(a, usa, high)
  The final ground queries after constrained SLD-resolution plus folding:
    made-by(a, b), monopoly(b, a, usa) &
21 Use Case of RuleML FOAF
  Referring to RuleML FOAF [9], a query to possiblyKnows(Laura, Ben)
  The final ground queries after constrained SLD-resolution plus folding:
    &
22 Selected References
  [1] Benjamin N. Grosof, Ian Horrocks, Raphael Volz, and Stefan Decker. Description Logic Programs: Combining Logic Programs with Description Logic. In WWW 2003, pages 48–57.
  [2] Ian Horrocks, Peter F. Patel-Schneider, Harold Boley, Said Tabet, Benjamin Grosof, and Mike Dean. Semantic Web Rule Language (SWRL). W3C Member Submission, May.
  [3] Boris Motik, Ulrike Sattler, and Rudi Studer. Query Answering for OWL-DL with Rules. Journal of Web Semantics, 3(1):41–60.
  [4] Francesco M. Donini, Maurizio Lenzerini, Daniele Nardi, and Andrea Schaerf. AL-log: Integrating Datalog and Description Logics. Journal of Intelligent Information Systems (JIIS), 10(3):227–252.
  [5] Alon Y. Levy and Marie-Christine Rousset. CARIN: A Representation Language Combining Horn Rules and Description Logics. In ECAI-96, pages 323–327.
  [6] Thomas Eiter, Thomas Lukasiewicz, Roman Schindlauer, and Hans Tompits. Combining Answer Set Programming with Description Logics for the Semantic Web. In KR 2004, pages 141–151.
  [7] Riccardo Rosati. On the Decidability and Complexity of Integrating Ontologies and Rules. Journal of Web Semantics, 3(1):61–73.
  [8] Ian Horrocks and Sergio Tessaris. Querying the Semantic Web: A Formal Approach. In Workshop on Principles and Practice of Semantic Web Reasoning, pages 177–191.
  [9] Jie Li, Harold Boley, Virendrakumar C. Bhavsar, and Jing Mei. Expert Finding for eCollaboration Using FOAF with RuleML Rules. In The Montreal Conference on eTechnologies, May.
  [10] Franz Baader, Diego Calvanese, Deborah McGuinness, Daniele Nardi, and Peter F. Patel-Schneider. The Description Logic Handbook: Theory, Implementation and Applications. Cambridge University Press.