Proper Refinement of Datalog Clauses using Primary Keys


Proper Refinement of Datalog Clauses using Primary Keys Siegfried Nijssen and Joost N. Kok BNAIC-2003, Nijmegen

Introduction
Inductive Logic Programming algorithm:
  C: set of Datalog clauses, initially empty
  D: database of facts (knowledge base)
  repeat
    1. make the clauses in C more specific (downward refinement)
    2. evaluate C against D
In this presentation, I present some general ideas that can be applied to any ILP algorithm of this form. I first discuss how "traditional" algorithms perform the second step, the evaluation, and then how many classical algorithms perform refinement. After that, I present my ideas for these steps.
October 24, 2003, Nijmegen BNAIC-2003

Database of Facts
[Figure: graphs g1 (nodes n1 to n5) and g2 (nodes n6, n7) with edges labeled a, b and c]
{e(g1,n1,n2,a), e(g1,n2,n1,a), e(g1,n2,n3,a), e(g1,n3,n1,b), e(g1,n3,n4,b), e(g1,n3,n5,c), e(g2,n6,n7,b)}

Clause
[Figure: the clause drawn as a graph with edges labeled a and b]
k(G) ← e(G,N1,N2,a), e(G,N2,N3,a), e(G,N1,N4,a), e(G,N4,N5,b)

Evaluation of a clause
θ-subsumption: C θ-subsumes D iff there is a substitution θ such that θ(C) ⊆ D.
Database D: {e(g1,n1,n2,a), e(g1,n2,n1,a), e(g1,n2,n3,a), e(g1,n3,n1,b), e(g1,n3,n4,b), e(g1,n3,n5,c), e(g2,n6,n7,b)}
Clause C: k(G) ← e(G,N1,N2,a), e(G,N2,N3,a), e(G,N1,N4,a), e(G,N4,N5,b)
θ = {G/g1, N1/n2, N2/n1, N3/n2, N4/n3, N5/n1}
The first step in determining the frequency of a query is to choose a subsumption operator. The best-known subsumption operator is θ-subsumption.

Evaluation of a clause
[Figure: graphs g1 and g2 with edges labeled a and b]

Evaluation of a clause
[Figure: graphs g1 and g2 with edges labeled a and b; annotation: Equivalent]

Evaluation of a clause
Equivalent:
k(G) ← e(G,N1,N2,a)
k(G) ← e(G,N1,N2,a), … [atom with N3]

Evaluation of a clause
OI-subsumption: C OI-subsumes D iff there is a substitution θ such that θ(C) ⊆ D, while:
  θ is injective
  θ does not map to constants in C
[Figure: clause graphs over N1, N2 and N3 with edges labeled a]

Clause Refinement - modes
User-defined refinement using modes [Progol, Aleph, Warmr, Tilde, Farmer]
T = {k(G), e(G,N,N,L)}
M = {e(+,-,-,#), e(+,+,-,#)}
  + old variable
  - new variable
  # constant
Example: starting from k(G) ← e(G,N1,N2,a), the slide shows the candidate extensions e(G,N2,N1,a) and e(G,N1,N3,a).
Using M, only edge-labeled trees can be constructed!
[Figure: small graphs with edges labeled a and b]
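One way to read the mode declarations is as a generator of candidate atoms. The sketch below is an illustration under assumptions, not the operator used by any of the cited systems: the string encoding of modes, the label set `CONSTANTS`, the function name `candidate_atoms`, and the naming of new variables as type letter plus a running number are all hypothetical.

```python
from collections import Counter
from itertools import product

# Assumed encoding of T and M from the slide.
TYPES = {"e": ("G", "N", "N", "L")}
MODES = [("e", "+--#"), ("e", "++-#")]
CONSTANTS = {"L": ["a", "b"]}  # hypothetical set of edge labels

def candidate_atoms(variables):
    """Yield every atom that the modes allow to be appended to a clause.

    `variables` maps each variable already in the clause to its type.
    Assumes existing variables of type X are named X1..Xk, or just X if unique.
    """
    for pred, markers in MODES:
        counts = Counter(variables.values())
        choices = []
        for typ, marker in zip(TYPES[pred], markers):
            if marker == "+":    # an old variable of the right type
                choices.append([v for v, t in variables.items() if t == typ])
            elif marker == "#":  # a constant of the right type
                choices.append(CONSTANTS[typ])
            else:                # '-': a new variable
                counts[typ] += 1
                choices.append([f"{typ}{counts[typ]}"])
        for args in product(*choices):
            yield (pred, *args)

# Refinements of k(G), then of k(G) <- e(G,N1,N2,a).
print(list(candidate_atoms({"G": "G"})))
print(list(candidate_atoms({"G": "G", "N1": "N", "N2": "N"})))
```

Under this reading the third argument of e is always a new variable, so an atom such as e(G,N2,N1,a), which closes a cycle, is never generated. That matches the remark that only trees can be constructed with M.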

Clause Refinement - modes
k(G) ← e(G,N1,N2,a)
k(G) ← e(G,N1,N2,a), e(G,N1,N3,a)
k(G) ← e(G,N1,N2,a), e(G,N1,N3,a), e(G,N2,N4,a), e(G,N3,N5,b)
Complete & proper refinement is possible with OI-subsumption, not with θ-subsumption.
[Figure: the clauses drawn as trees with edges labeled a and b]

Refinement using Primary Keys
Assume we know: between a pair of nodes there is at most one edge, with one label.
How to incorporate this knowledge in the refinement operator?
M = {e(+,-,-,#), e(+,+,-,#), e(+,+,+,#)}
These modes allow:
k(G) ← e(G,N1,N2,a), e(G,N1,N2,b)
k(G) ← e(G,N1,N2,a), e(G,N1,N2,L1)
Primary key: {1,2,3} (first 3 arguments of e)
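A refinement operator could use the primary key as a purely syntactic filter: discard any clause in which two atoms of the same predicate agree on all key arguments. The sketch below assumes atoms are tuples and uses the hypothetical names `PRIMARY_KEYS` and `violates_key`; it illustrates the idea and is not the authors' operator.

```python
from itertools import combinations

# Key positions are 1-based, as on the slide. With atoms stored as
# (predicate, arg1, arg2, ...), position i is also tuple index i.
PRIMARY_KEYS = {"e": (1, 2, 3)}

def violates_key(body):
    """True if two atoms of one predicate agree on every primary-key argument."""
    for first, second in combinations(body, 2):
        if first[0] != second[0] or first[0] not in PRIMARY_KEYS:
            continue
        if all(first[i] == second[i] for i in PRIMARY_KEYS[first[0]]):
            return True
    return False

# The two clauses the slide says the modes allow:
print(violates_key([("e", "G", "N1", "N2", "a"), ("e", "G", "N1", "N2", "b")]))
print(violates_key([("e", "G", "N1", "N2", "a"), ("e", "G", "N1", "N2", "L1")]))
# A clause whose atoms differ in a key argument:
print(violates_key([("e", "G", "N1", "N2", "a"), ("e", "G", "N1", "N3", "a")]))
```

Both clauses from the slide are rejected by this filter, since their atoms share G, N1 and N2, while the third clause passes.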

Expressiveness OI vs θ-subsumption
Assume that, for the previous database with labels on the edges, you would like to express the following pattern:
[Figure: a path of three edges labeled a, ?, b]
k(G) ← e(G,N1,N2,a), e(G,N2,N3,L), e(G,N3,N4,b)
For proper and complete refinement: OI is required
Under OI: L ≠ a, L ≠ b

In an ideal situation...
We have complete & proper refinement
We are not required to use OI for all types (weak Object Identity)

Proper refinement using Primary Keys
In many cases, this ideal situation exists for refinement using primary keys!
k(G) ← e(G,N1,N2,a), e(G,N2,N3,L), e(G,N3,N4,b)
Each atom must differ from every other atom in at least one argument of the primary key.
If this clause is equivalent to a smaller clause, there must be a substitution that maps a variable to another variable or constant in the clause.
No such substitution exists for L: the primary key differs.

Proper refinement using Primary Keys
T = {k(G), e(G,N,N,L), t(L,C)}
M = {e(+,-,-,-), e(+,+,-,-), t(+,#)}
K(e) = {1,2,3}
K(t) = {1,2}
OI = {G,N}
k(G) ← e(G,N1,N2,L1), t(L1,a), e(G,N1,N3,L2), t(L2,a)
Now consider this example, in which we have added a predicate t. There is one mode for this predicate, and all its arguments are part of the same primary key. Consider the clause above, which is part of the search space. Can it be equivalent to a shorter clause? This can only be the case if L1 is unified with another variable or with a constant. However, if we unify L1 with L2, two e atoms have the same last argument, which is not possible for the given modes, as each mode for e requires the last argument to be new. This clause can therefore not be equivalent to a smaller clause in the search space.

Proper refinement using Primary Keys
Given predicates, types, modes, primary keys and a partition of types into OI and non-OI, we prove that refinement is proper and complete if for every mode there is a primary key which does not include any non-OI output.
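The sufficient condition stated here can be checked mechanically for a given declaration. The sketch below uses the example with predicates e and t from the preceding slide and rests on several assumptions: only arguments marked `-` count as outputs, the key written K(p) on that slide is taken to belong to e, and the names `mode_ok` and `condition_holds` are hypothetical.

```python
# Declarations from the example with predicates e and t (assumed encoding).
TYPES = {"e": ("G", "N", "N", "L"), "t": ("L", "C")}
MODES = [("e", "+---"), ("e", "++--"), ("t", "+#")]
KEYS = {"e": [(1, 2, 3)], "t": [(1, 2)]}  # 1-based argument positions
OI_TYPES = {"G", "N"}

def mode_ok(pred, markers, types, keys, oi_types):
    """True if some primary key of pred contains no non-OI output argument."""
    for key in keys.get(pred, []):
        has_bad_output = any(
            markers[i - 1] == "-" and types[pred][i - 1] not in oi_types
            for i in key
        )
        if not has_bad_output:
            return True
    return False

def condition_holds(modes, types, keys, oi_types):
    """Check the condition for every mode declaration."""
    return all(mode_ok(pred, markers, types, keys, oi_types)
               for pred, markers in modes)

print(condition_holds(MODES, TYPES, KEYS, OI_TYPES))

# Hypothetical variation: a key for e that also covers the label argument.
wider = {"e": [(1, 2, 3, 4)], "t": [(1, 2)]}
print(condition_holds(MODES, TYPES, wider, OI_TYPES))
```

In the variation, the label argument of e is an output of the non-OI type L, so the check fails for both e modes.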

Conclusions
Higher performance for ILP algorithms:
  primary keys restrict the search space efficiently
  refinement is proper
Higher flexibility:
  weak OI is more flexible than full OI

Clause refinement - other representation better?
Background knowledge:
a(G,N1,N2) ← e(G,N1,N2,a)
b(G,N1,N2) ← e(G,N1,N2,b)
e(G,N1,N2) ← e(G,N1,N2,L)
Clauses:
k(G) ← a(G,N1,N2), e(G,N2,N3), b(G,N3,N4)
k(G) ← a(G,N1,N2), e(G,N1,N2)