CS589 Principles of DB Systems Fall 2008 Lecture 4e: Logic (Model-theoretic view of a DB) Lois Delcambre lmd@cs.pdx.edu 503 725-2405.

Slides:



Advertisements
Similar presentations
Artificial Intelligence
Advertisements

1 Logic Logic in general is a subfield of philosophy and its development is credited to ancient Greeks. Symbolic or mathematical logic is used in AI. In.
Inference and Reasoning. Basic Idea Given a set of statements, does a new statement logically follow from this. For example If an animal has wings and.
Outline Recap Knowledge Representation I Textbook: Chapters 6, 7, 9 and 10.
Computability and Complexity 9-1 Computability and Complexity Andrei Bulatov Logic Reminder (Cnt’d)
Syllabus Every Week: 2 Hourly Exams +Final - as noted on Syllabus
From Chapter 4 Formal Specification using Z David Lightfoot
Proof by Deduction. Deductions and Formal Proofs A deduction is a sequence of logic statements, each of which is known or assumed to be true A formal.
DEDUCTIVE DATABASE.
A Brief Summary for Exam 1 Subject Topics Propositional Logic (sections 1.1, 1.2) –Propositions Statement, Truth value, Proposition, Propositional symbol,
Introduction to Proofs
Propositional Logic Dr. Rogelio Dávila Pérez Profesor-Investigador División de Posgrado Universidad Autónoma Guadalajara
1 Introduction to Abstract Mathematics Chapter 2: The Logic of Quantified Statements. Predicate Calculus Instructor: Hayk Melikya 2.3.
CS6133 Software Specification and Verification
For Wednesday Read chapter 9, sections 1-3 Homework: –Chapter 7, exercises 8 and 9.
Chapter 2 Symbolic Logic. Section 2-1 Truth, Equivalence and Implication.
Semantics of Predicate Calculus For the propositional calculus, an interpretation was simply an assignment of truth values to the proposition letters of.
Propositional Logic Rather than jumping right into FOL, we begin with propositional logic A logic involves: §Language (with a syntax) §Semantics §Proof.
Foundations of Discrete Mathematics Chapter 1 By Dr. Dalia M. Gil, Ph.D.
Chapter 2 1. Chapter Summary Sets (This Slide) The Language of Sets - Sec 2.1 – Lecture 8 Set Operations and Set Identities - Sec 2.2 – Lecture 9 Functions.
March 3, 2016Introduction to Artificial Intelligence Lecture 12: Knowledge Representation & Reasoning I 1 Back to “Serious” Topics… Knowledge Representation.
CS.462 Artificial Intelligence SOMCHAI THANGSATHITYANGKUL Lecture 04 : Logic.
DISCRETE COMPUTATIONAL STRUCTURES CSE 2353 Fall 2010 Most slides modified from Discrete Mathematical Structures: Theory and Applications by D.S. Malik.
CS589 Principles of DB Systems Fall 2008 Lecture 4c: Query Language Equivalence Lois Delcambre
CS589 Principles of DB Systems Fall 2008 Lecture 4b: Domain Independence and Safety Lois Delcambre
Chapter 7. Propositional and Predicate Logic
2. The Logic of Compound Statements Summary
CS589 Principles of DB Systems Spring 2014 Unit 2: Recursive Query Processing Lecture 2-1 – Naïve algorithm for recursive queries Lois Delcambre (slides.
Introduction to Logic for Artificial Intelligence Lecture 2
CSE15 Discrete Mathematics 01/23/17
Knowledge Representation and Reasoning
Knowledge and reasoning – second part
Goal for this lecture Demonstrate how we can prove that one query language is more expressive than (i.e., “contained in” as described in the book) another.
The Propositional Calculus
ARTIFICIAL INTELLIGENCE
Logics for Data and Knowledge Representation
Induction and recursion
CS201: Data Structures and Discrete Mathematics I
CSE 311 Foundations of Computing I
Logics for Data and Knowledge Representation
Lesson 5 Relations, mappings, countable and uncountable sets
Induction and recursion
INTRODUCTION TO THE THEORY OF COMPUTATION
Functions Computers take inputs and produce outputs, just like functions in math! Mathematical functions can be expressed in two ways: We can represent.
CSE 311: Foundations of Computing
A Brief Summary for Exam 1
First Order Logic Rosen Lecture 3: Sept 11, 12.
Lesson 5 Relations, mappings, countable and uncountable sets
Knowledge and reasoning – second part
Back to “Serious” Topics…
MA/CSSE 474 More Math Review Theory of Computation
Computer Security: Art and Science, 2nd Edition
Negations of quantifiers
From now on: Combinatorial Circuits:
Logics for Data and Knowledge Representation
Chapter 7. Propositional and Predicate Logic
CSNB234 ARTIFICIAL INTELLIGENCE
Computer Science cpsc322, Lecture 20
Chapter 1: Propositional and First-Order Logic
This Lecture Substitution model
Foundations of Discrete Mathematics
Propositional Logic CMSC 471 Chapter , 7.7 and Chuck Dyer
CS201: Data Structures and Discrete Mathematics I
Logic Logic is a discipline that studies the principles and methods used to construct valid arguments. An argument is a related sequence of statements.
ece 720 intelligent web: ontology and beyond
Logical and Rule-Based Reasoning Part I
Representations & Reasoning Systems (RRS) (2.2)
CS589 Principles of DB Systems Fall 2008 Lecture 4a: Introduction to Datalog Lois Delcambre
CS589 Principles of DB Systems Fall 2008 Lecture 4a: Introduction to Datalog Lois Delcambre
CS589 Principles of DB Systems Fall 2008 Lecture 4b: Domain Independence and Safety Lois Delcambre
Presentation transcript:

CS589 Principles of DB Systems Fall 2008 Lecture 4e: Logic (Model-theoretic view of a DB) Lois Delcambre lmd@cs.pdx.edu 503 725-2405

In-class exercise/discussion For each of the following query languages, how can we be sure that the data that ends up in the query answer actually came from the database (or the constants in the query)? In other words, how do we know that some random values don’t appear in the answer? Relational algebra Tuple relational calculus Domain relational calculus Datalog (any version of Datalog)

Definition of tuple calculus syntax (Ramakrishnan & Gehrke) REPEATED Rel is a relation name, R and S are tuple variables, a is an attribute of R, b is an attribute of S, op is one of {, , , , , } An atomic formula is one of the following: R  Rel R.a op S.b R.a op constant (or constant op R.a) A formula is: any atomic formula F, (F), F1^F2, F1vF2, F1F2 (if F, F1, and F2 are formula) R(F(R)) where F is formula with R a free variable R(F(R)) where F is a formula with R a free variable A tuple calculus query is an expression of the form: { T | F(T) } where T is the only free variable in F Note: all of the variables are tuple variables.

Datalog syntax REPEATED Atomic formulas: R(y1, y2, …, yk) – a predicate formula x = y (Note: this is syntatic sugar for equal(x,y).) R(v1, v2, …, vk) – a ground atomic formula Literal: an atomic formula (positive literal) or the negation of an atomic formula (negative literal) Clause (Datalog rule): L :- L1, L2, …, Ln. where L is a predicate formula and Li, i = 1, …, n, is a literal. (Some versions of Datalog require all literals to be positive literals.) The comma means “and”.

Goals for today Review propositional logic (including truth assignment) Review 1st order predicate calculus (including theory vs. interpretation) Introduce the model-theoretic description of a relational database and consider how tuple calculus, domain calculus, and datalog use that view. Mention the proof-theoretic description of a relational database.

Propositional Logic Two constants: true and false Infinite set of variables (called atomic formulae in our textbook) each variable can take the value true or false Well-formed propositional formulas: An atom is either a variable or a constant An atom is a propositional formula If F is formula, then F is a propositional formula If F and G are propositional formulas, then F  G and F  G (and F → G and F  G) are propositional formulas

Propositional Formulas Which ones are well-formed? (wffs?) false true  false p  p p  q  p  p p  p false  true p   q  p  p   q  p 

Do we know whether formulas are true or false? The constant “true” is true The constant “false” is false Tautologies are (always) true Contradictions are (always) false On the previous slide, which ones are true? Which ones are false? What about everything else? It has variables. We need a truth assignment (to indicate which propositional variables are true and which are false).

Truth assignments A truth assignment A is a function that maps a set of propositional variables to {true, false} The formula “true” has the truth value true under any truth assignment. That is A(true)=true, for any A. Similarly for false; the value is always false. The formula “  ” is true under A if and only if A() is true and A() is true The formula “  ” is true under A if and only if A() is true or A() is true The formula “” is true under A iff A() is false

For each truth assignment, we can find out whether a formula is true or false Consider the well-formed formula: q  p For a truth assignment A where A(q) is true and A(p) is false. Then this formula is false. What if we use a truth assignment A where A1(q) is false and A1(p) is true. Then what is the truth value of this formula?

For any propositional wff, we can tell if it’s a tautology using a truth table  q) r true false Note: a truth table exhaustively considers all truth assignments. 1. Enter all possible combinations of truth values for p and q.

For any propositional wff, we can tell if it’s a tautology using a truth table  q) r true false 2. Compute the truth values for p  q.

For any propositional wff, we can tell if it’s a tautology using a truth table  q) r true false 3. Enter all possible combinations of what we have so far – with all possible combinations of truth values for r. This doubles the size of the table

For any propositional wff, we can tell if it’s a tautology using a truth table  q) r true false compare 4. Compute the truth value for the final . This is obviously not a tautology.

Definitions A wff is satisfiable if there is at least one truth assignment that makes it true. A wff is unsatisfiable if it is not satisfiable. A wff is valid if every truth assignment makes it true. A valid propositional wff is a tautology. The negation of a valid wff is unsatisfiable. The negation of a tautology is unsatisfiable.

Logical Implication A formula  logically implies formula  if for every truth assignment A, if A() is true, then A() is true. We write    If    and    then we say  and  are logically equivalent, written    .

Discussion questions We know that a Datalog program is a set of well-formed formulas. And we know that a tuple or domain calculus query uses a well-formed formula to describe the set membership criteria. Would we be interested in knowing whether a query is satisfiable? Would we be interested in knowing whether a query is a tautology? a contradiction? Would we be interested in knowing whether one query logically implies another one?

First-Order theory An infinite set of variables A (possibly empty) set of constant symbols plus {true, false} A (possibly empty) set of n-ary predicates, for n  0 A (possibly empty) set of n-ary functions, for n  1. Sometimes equality is included, sometimes not. A set of axioms.

Well-formed, first-order formulas Variables, individual constants, and f(t1, …, tn) are terms. (Nothing else is a term.) An atomic formula is true or false or R(t1, …, tn) where ti, i=1,n, are terms. Every atomic formula is a wff. If  and  are wffs, then  is a wff,  is a wff,    is a wff, and if x is a variable, then (x) and (x) are wffs. (Nothing else is a wff.)

Example of a First-order theory No constant symbols No functions One predicate, p Two proper axioms: (x)p(x,x) (x)(y)(z)p(x,y)p(y,z)  p(x,z)

Interpretation for a 1st Order Theory An interpretation I = (U, C, P, F) where U is a nonempty set of elements (the domain) C is a function that maps the constants of the theory into elements of U P maps each n-ary predicate (from the theory) into an n-place relation over U F maps each n-ary function (from the theory) into an function from Un  U Variables range over the set U An interpretation is a model (of the theory) iff all of the axioms are true.

Example interpretations Axioms: (x)p(x,x) (x)(y)p(x,y)p(y,z)  p(x,z) Interpretation 1: U = non-negative integers, p is the predicate not-equal. Interpretation 2: U = non-negative integers, p is less-than-or-equal-to. Interpretation 3: U = integers, p is less-than. Which of these interpretations are models? Define another model for this theory.

Another first-order theory One constant symbol {zero} One function, + One predicate, equality Proper axioms: (x)(y)(z) x+(y+z) = (x+y)+z (x) x+zero = x (x)(y) x+y=zero (x)(y) x=y  y=x (x) x=x (x)(y)(z) x=y  (x=z  y=z) (x)(y)(z) (x=y)  ((x+z=y+z) AND (z+y=z+x))

Which ones are models of the theory? Interpretations for theory (on previous slide) (Exercise to do at home) Interpretation 1: U = integers zero is mapped to 0 + is regular addition = is regular equality Interpretation 2: U = non-negative integers Interpretation 3: U = integers modulo 5 zero is mapped to 0 + is modulo 5 addition = is regular equality Interpretation 4: U = integers zero is mapped to 1 + is regular addition Which ones are models of the theory?

Logic Axioms and Rules of Inference 1 2 and 3 well-formed formulas of a 1st order theory Axioms: 1  (2  1) (1  (2  3))  ((1  2)  (2  3)) (2  1)  ((2  1)  2)) ((x)1(x))  1 (t), if t is a term in the theory and x is free in 1(x). ((x)(1  2))  ((1  (x)2)) if x is not free in 1 Rules of Inference: Modus ponens: 2 follows from 1 and 1  2 Generalization: (x)1 follows from 1

Terminology The axioms shown on the previous slide are called the logical axioms. If you add extra axioms to your theory, they are called proper axioms or non-logical axioms. If you don’t have any proper axioms, then the theory is called 1st order predicate calculus. If you show that’s something is logically valid, then you are using a model-theoretic argument. If you prove a theorem, then you are using a proof-theoretic argument. A proof uses axioms and rules of inference, only.

What about relational databases? An interpretation I = (U, C, P, F) where U is a nonempty set of elements (the domain) C is a function that maps the constants of the theory into elements of U P maps each n-ary predicate (from the theory) into an n-place relation over U F maps each n-ary function (from the theory) into an function from Un  U Variables range over the set U An interpretation is a model (of the theory) iff all of the axioms are true. Union of all domains. No constant symbols needed in the theory. Function-free

What about relational databases? An interpretation I = (U, C, P, F) where U is a nonempty set of elements (the domain) C is a function that maps the constants of the theory into elements of U P maps each n-ary predicate (from the theory) into an n-place relation over U F maps each n-ary function (from the theory) into an function from Un  U Variables range over the set U An interpretation is a model (of the theory) iff all of the axioms are true. In order to have an interpretation, we need to describe (we need to write down) all of the tuples where a given predicate is true. Think about a relational database. What does it store for us? It (exactly) stores the tuples that belong in a particular table. A relational database is (exactly) an interpretation of a very simple 1st order theory with no functions and no constant symbols.

Exercise Look at the first paragraph on page 108 of your textbook. (I have it available for you as a handout.) You should be able to understand every word of this paragraph.

Proof-Theoretic View of a Relational DB (by Raymond Reiter) Set out to describe the entire DB content in the theory. (This then requires no interpretation. Everything would be done using proofs.) He introduced constant symbols for every constant in the domains of the database. He introduced one axiom for every tuple in the relational database. (Each tuple is a “fact.”) Then he figured out the additional axioms needed to make it work out. Unique name assumption (Identical things are equal, everything else is not equal.) Domain closure axioms (Every value must be equal to one of the constant symbols in the theory.) Completion axioms (Every atomic predicate must be equal to one of the axioms that represent the tuples in the DB. This is essentially the closed world assumption. (If it is not in the DB, then assume it is false.))

Proof-theoretic view of a relational DB The entire DB is described in a 1st order theory (and all results are proofs). A query (which is a wff) is a theorem to be proved (rather than a wff to be evaluated in an interpretation). Since the DB is in the theory (with “facts” as axioms) we can add any 1st order statements, as axioms, to our database. For example we could add: Age(John, 21) v Age(John 22) without requiring that the tuple for John hold either of these ages.

Reference “Towards a Logical Reconstruction of Relational Database Theory” by Raymond Reiter, a chapter in On Conceptual Modeling, edited by Michael Brodie, John Mylopoulos, and Joachim Schmidt, Springer-Verlag, 1984, pp. 191-233. Reiter was one of many database researchers exploring the relationship between logic and databases. This work (in the 1980s) led to the definition and exploration of Datalog.

Results – 1st order theory Every instance of a tautology is a theorem that uses only axioms 1, 2, and 3, and MP. Completeness: Every theorem of a 1st order predicate calculus is logically valid. If you can prove a theorem…then it’s logically valid (true in all models). Any logically valid wff is a theorem. In any predicate calculus, theorems = valid wffs.