Episode 5: First Order Logic
Giorgi Japaridze, Logic

• Propositional logic versus first-order (predicate) logic
• The universe of discourse
• Constants, variables, terms and valuations
• Predicates as generalized propositions
• Boolean operations as operations on predicates
• Substitution of variables
• Quantifiers
• The language of first-order logic
• Interpretation, truth and validity
• The undecidability of the validity problem
• A Gentzen-style deductive system
• Soundness and Gödel’s completeness
• Peano arithmetic and Gödel’s incompleteness
Propositional logic versus predicate logic 5.1
The language of propositional logic is very poor and does not allow us to talk about many things that we would like to be able to talk about. That is because propositional logic fails to “look inside propositions” and see any further structure in them. For example, propositional logic would not see any connection between “Bob likes Jane” and “There is someone who likes Jane”, even though the first statement logically implies the second.
This limitation of expressive power is overcome in predicate logic, which is also called first-order logic. It is based not just on propositions, but on predicates (that is, relations). Propositions are simple special cases of predicates. Hence, propositional logic is just a simple fragment of the more expressive predicate logic. In a sense, the expressive power of first-order logic is universal: it allows us to talk about virtually anything.
Note: In this episode, first-order logic will be presented in a way which may seem quite different from the treatments that you have probably seen elsewhere. Yet, our approach is equivalent to the more traditional ones.
The universe of discourse 5.2
Relations are always considered in the context of some set. For example, when we mention <, we may say that we mean it as a binary relation on the set N of natural numbers. This formally means that < is a subset of N×N. Such a context-setting set (in this example N) is said to be the universe of discourse.
When applying first-order logic, we always have some universe of discourse in mind. For example, if first-order logic is used for building a formal arithmetic, the universe of discourse would be N. And if logic is used for a biological classification system, the universe of discourse would contain (the names of) all plants and animals.
In our treatment, we assume that the universe of discourse is always N. Little, if any, generality is lost by doing so. After all, plants, people, chemical elements, rational numbers (all objects that have or can have names) can be encoded as natural numbers.
Constants, variables, terms and valuations 5.3
We identify the elements of our universe of discourse with their decimal representations, and call the elements of {0,1,2,...,17,...} constants. The letters a, b, c, d will typically be used as metavariables for constants.
Next, we fix another countably infinite set of expressions and call its elements variables. The letters x, y, z will typically be used as metavariables for variables.
A term means either a variable or a constant. The letter t will typically be used as a metavariable for terms.
A valuation is any function that assigns a constant to each variable. The letter e will typically be used as a metavariable for valuations. We extend the domain of each valuation e to all terms by stipulating that, for any constant c, e(c)=c.
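The definitions of terms and valuations can be mirrored in a few lines of Python. This is only a sketch under stated assumptions: constants are modelled as `int`, variables as `str`, and the helper names `make_valuation` and `value` are invented for the illustration, not part of the episode.

```python
from typing import Callable, Dict, Union

Term = Union[int, str]            # assumption: constants are ints, variables are strings
Valuation = Callable[[str], int]  # a valuation assigns a constant to each variable

def make_valuation(assignments: Dict[str, int], default: int = 0) -> Valuation:
    """Build a total valuation: variables not listed get `default`."""
    def e(variable: str) -> int:
        return assignments.get(variable, default)
    return e

def value(e: Valuation, t: Term) -> int:
    """Extend e from variables to all terms by stipulating e(c) = c."""
    return t if isinstance(t, int) else e(t)

# The valuation that assigns 5 to x and 0 to every other variable.
e = make_valuation({"x": 5})
print(value(e, "x"), value(e, "y"), value(e, 17))
```

Making the valuation total through a default value is a design choice of the sketch; the text only requires that every variable receive some constant.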
Predicates revisited 5.4
From now on, by a predicate we will always mean a function p that assigns a value e[p] ∈ {⊤, ⊥} (“true” or “false”) to each valuation e. Note that we write e[p] instead of p(e). When e[p]=⊤, we say that predicate p is true at e. And when e[p]=⊥, we say that p is false at e.
For example, the predicate “x is even”, or Even(x), is defined by
    e[Even(x)] = ⊤ if e(x) is even; ⊥ otherwise.
And the predicate “x is greater than y”, or x>y, is defined by
    e[x>y] = ⊤ if e(x)>e(y); ⊥ otherwise.
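Read this way, a predicate is simply a Boolean-valued function of valuations. The following minimal sketch assumes a valuation is a `dict` listing only the variables in play (a real valuation is total), and the constructor names `even` and `greater` are hypothetical.

```python
from typing import Callable, Dict

Valuation = Dict[str, int]               # assumption: only the relevant variables are listed
Predicate = Callable[[Valuation], bool]  # e[p] is computed as p(e)

def even(x: str) -> Predicate:
    """The predicate Even(x): true at e iff e(x) is even."""
    return lambda e: e[x] % 2 == 0

def greater(x: str, y: str) -> Predicate:
    """The predicate x>y: true at e iff e(x) > e(y)."""
    return lambda e: e[x] > e[y]

e = {"x": 6, "y": 9}
print(even("x")(e), greater("x", "y")(e))
```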
Constant predicates; propositions as special cases of predicates 5.5
We say that a predicate p is constant if its value does not depend on the valuation. That is, p is constant iff, for any two valuations e and e’, we have e[p]=e’[p].
Examples. Are the following predicates constant?
    x>y      no
    x>x      yes
    x>0      no
    x≥0      yes
    2+2=5    yes
The last example above illustrates that propositions are nothing but constant predicates. In general, propositional logic is nothing but first-order logic restricted to constant predicates.
We say that a predicate p depends on a variable x iff there are two valuations e and e’ such that: (a) e and e’ agree on all variables except x, and (b) e[p]≠e’[p]. Constant predicates (propositions) thus do not depend on any variables.
Boolean operations as operations on predicates 5.6
In Episode 4, Boolean operations were defined as operations on propositions, i.e. functions of the type {propositions}ⁿ → {propositions} (n=0, n=1 or n=2). They easily extend to operations on predicates, i.e. functions of the type {predicates}ⁿ → {predicates}, by the following definition. For every valuation e and all predicates p and q:
• e[¬p] = ¬(e[p]), i.e., ¬p is true at e iff p is false at e;
• e[p∧q] = (e[p])∧(e[q]), i.e., p∧q is true at e iff so are both p and q;
• e[p∨q] = (e[p])∨(e[q]), i.e., p∨q is true at e iff so is either p or q or both;
• e[p→q] = (e[p])→(e[q]), i.e., p→q is true at e iff either p is false at e, or q is true at e, or both.
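The pointwise character of this definition can be sketched in Python: each operation takes predicates and returns a new predicate that evaluates its arguments at the same valuation. Modelling predicates as functions on `dict` valuations, and the names `neg`, `conj`, `disj`, `impl`, are assumptions of the illustration.

```python
from typing import Callable, Dict

Valuation = Dict[str, int]
Predicate = Callable[[Valuation], bool]

def neg(p: Predicate) -> Predicate:
    return lambda e: not p(e)             # e[¬p] = ¬(e[p])

def conj(p: Predicate, q: Predicate) -> Predicate:
    return lambda e: p(e) and q(e)        # e[p∧q] = (e[p])∧(e[q])

def disj(p: Predicate, q: Predicate) -> Predicate:
    return lambda e: p(e) or q(e)         # e[p∨q] = (e[p])∨(e[q])

def impl(p: Predicate, q: Predicate) -> Predicate:
    return lambda e: (not p(e)) or q(e)   # e[p→q] = (e[p])→(e[q])

def even_x(e: Valuation) -> bool:
    return e["x"] % 2 == 0

def x_gt_y(e: Valuation) -> bool:
    return e["x"] > e["y"]

e = {"x": 4, "y": 7}
print(conj(even_x, x_gt_y)(e), impl(x_gt_y, even_x)(e))
```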
Substitution of variables 5.7
We often fix a tuple x₁,...,xₙ of pairwise distinct variables for a given predicate p, and write p (when first mentioning it) as p(x₁,...,xₙ).
Note: by doing so, we do not necessarily mean that p depends on all of the variables x₁,...,xₙ, or that p does not depend on any other variables.
When p(x₁,...,xₙ) is as above and t₁,...,tₙ are any terms, p(t₁,...,tₙ) is written to mean the predicate such that, for any valuation e, we have e[p(t₁,...,tₙ)]=e’[p(x₁,...,xₙ)], where e’ is the valuation satisfying the following two conditions:
• e’(x₁)=e(t₁), ..., e’(xₙ)=e(tₙ);
• e’ agrees with e on all other variables.
Example. Let both p(x,y) and q(x) mean “x is a multiple of y”. Then:
    p(15,3) = “15 is a multiple of 3” = ⊤
    p(x,3)  = “x is a multiple of 3”
    p(y,y)  = “y is a multiple of y” = ⊤
    p(y,z)  = “y is a multiple of z”
    q(7)    = “7 is a multiple of y”
    q(z)    = “z is a multiple of y”
    q(y)    = “y is a multiple of y”
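The definition of p(t₁,...,tₙ) can be sketched as a function that builds the valuation e’ and then evaluates p there. Everything named here (`substitute`, `multiple`, the `dict` representation of valuations) is an assumption made for the illustration. Note that all the terms are evaluated in the original e before any variable is overwritten, which is what makes the substitution simultaneous.

```python
from typing import Callable, Dict, Sequence, Union

Term = Union[int, str]
Valuation = Dict[str, int]
Predicate = Callable[[Valuation], bool]

def substitute(p: Predicate, variables: Sequence[str], terms: Sequence[Term]) -> Predicate:
    """Return p(t1,...,tn) for a predicate p with attached tuple x1,...,xn."""
    def result(e: Valuation) -> bool:
        e_prime = dict(e)                 # e' agrees with e on all other variables
        for x, t in zip(variables, terms):
            e_prime[x] = t if isinstance(t, int) else e[t]   # e'(xi) = e(ti), read from e
        return p(e_prime)
    return result

def multiple(e: Valuation) -> bool:
    """'x is a multiple of y' (0 is taken to be a multiple of 0)."""
    return e["x"] == 0 if e["y"] == 0 else e["x"] % e["y"] == 0

e = {"x": 1, "y": 12, "z": 4}
p_15_3 = substitute(multiple, ("x", "y"), (15, 3))    # p(15,3)
p_y_z = substitute(multiple, ("x", "y"), ("y", "z"))  # p(y,z): "y is a multiple of z"
q_7 = substitute(multiple, ("x",), (7,))              # q(7): "7 is a multiple of y"
print(p_15_3(e), p_y_z(e), q_7(e))
```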
Quantifiers 5.8
Quantifiers in classical logic are functions of the type {predicates}×{variables} → {predicates}. There are two quantifiers:
• universal quantifier ∀, with ∀xp read as “for all x, p”;
• existential quantifier ∃, with ∃xp read as “there is x such that p”.
They can be defined as “big conjunction” and “big disjunction”:
    ∀xp(x) = p(0)∧p(1)∧p(2)∧p(3)∧...
    ∃xp(x) = p(0)∨p(1)∨p(2)∨p(3)∨...
More formally, for any variable x, predicate p(x) and valuation e, we have:
• e[∀xp(x)] = ⊤ iff, for every constant c, e[p(c)]=⊤;
• e[∃xp(x)] = ⊤ iff there is a constant c such that e[p(c)]=⊤.
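The “big conjunction” and “big disjunction” reading suggests a direct computation, with one important caveat: the universe N is infinite, so a program can only inspect finitely many constants. The sketch below therefore cuts the universe off at a hypothetical `BOUND`; it is an approximation for experimenting, not an implementation of the quantifiers. The names `forall`, `exists` and `less` are assumptions of the sketch.

```python
from typing import Callable, Dict

Valuation = Dict[str, int]
Predicate = Callable[[Valuation], bool]

BOUND = 50  # assumption: finite cutoff standing in for the infinite universe N

def forall(x: str, p: Predicate, bound: int = BOUND) -> Predicate:
    """Bounded big conjunction p(0)∧p(1)∧...∧p(bound-1)."""
    return lambda e: all(p({**e, x: c}) for c in range(bound))

def exists(x: str, p: Predicate, bound: int = BOUND) -> Predicate:
    """Bounded big disjunction p(0)∨p(1)∨...∨p(bound-1)."""
    return lambda e: any(p({**e, x: c}) for c in range(bound))

def less(a: str, b: str) -> Predicate:
    return lambda e: e[a] < e[b]

e = {"x": 5, "y": 0, "z": 0}
print(exists("z", less("z", "x"))(e))   # is there a z with z < 5?
print(exists("z", less("z", "y"))(e))   # is there a z with z < 0?
```

The cutoff can give wrong answers. For instance, `forall("x", exists("y", less("x", "y")))` evaluates to false here, because the largest constant below the bound has no larger neighbour inside it, although ∀x∃y(x<y) is true over N.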
Examples 5.9
Let e be the valuation which assigns 5 to x and assigns 0 to all other variables. Which of the following predicates are true at e and which are false?
    y<x              true
    z<y              false
    ∃z(z<x)          true
    ∃z(z<y)          false
    ¬∃x(x<x)         true
    ∃z(z=y ∧ 0<z)    false
    ∀x∃y(x<y)        true
    ∃y∀x(x<y)        false
    ∃y∀x(x≥y)        true
    ∃x∀y(x≥y)        false
    2+3=4            false
    2+3=x            true
The language of classical first-order logic 5.10
In addition to the components that the language of propositional logic has, the language of first-order logic contains constants, variables, quantifiers and predicate letters, for which we use p, q, r, s as metavariables. With each predicate letter is associated a natural number called its arity. When the arity of p is n, we say that p is n-ary.
An atom of this language is p(t₁,...,tₙ), where p is an n-ary letter and t₁,...,tₙ are any terms. When the arity of p is 0, we write p instead of p( ). The atoms of propositional logic remain atoms of first-order logic, as we understand them as 0-ary letters. This includes ⊤ and ⊥, which are now treated as 0-ary logical predicate letters and hence logical atoms.
Formulas are defined inductively by:
• Atoms are formulas;
• If F is a formula, so is ¬(F);
• If E and F are formulas, so are (E)∧(F), (E)∨(F), (E)→(F);
• If F is a formula and x is a variable, ∀x(F) and ∃x(F) are formulas.
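The inductive definition of formulas translates naturally into a small syntax tree. The sketch below is one possible encoding, not a prescribed one: the class names `Atom`, `Not`, `Bin`, `Quant` and the printer `show` are invented for the illustration, and `show` writes out every pair of parentheses exactly as the four clauses above prescribe.

```python
from dataclasses import dataclass
from typing import Tuple, Union

Term = Union[int, str]   # assumption: constants are ints, variables are strings

@dataclass(frozen=True)
class Atom:
    letter: str
    args: Tuple[Term, ...] = ()   # empty for a 0-ary letter

@dataclass(frozen=True)
class Not:
    body: "Formula"

@dataclass(frozen=True)
class Bin:
    op: str                        # one of "∧", "∨", "→"
    left: "Formula"
    right: "Formula"

@dataclass(frozen=True)
class Quant:
    quantifier: str                # "∀" or "∃"
    var: str
    body: "Formula"

Formula = Union[Atom, Not, Bin, Quant]

def show(f: Formula) -> str:
    """Print a formula with the parentheses required by the inductive definition."""
    if isinstance(f, Atom):
        return f.letter if not f.args else f"{f.letter}({','.join(map(str, f.args))})"
    if isinstance(f, Not):
        return f"¬({show(f.body)})"
    if isinstance(f, Bin):
        return f"({show(f.left)}){f.op}({show(f.right)})"
    return f"{f.quantifier}{f.var}({show(f.body)})"

someone_likes_4 = Quant("∃", "x", Atom("likes", ("x", 4)))
print(show(someone_likes_4))
```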
Free and bound terms; normal formulas 5.11
An occurrence of a term t in a formula F is said to be bound iff it is in the scope of ∀t or ∃t. Otherwise the occurrence is free.
For example, in formula ∀y(p(x,y) ∨ ∃xp(x,y)), the first occurrence of x is free while the other occurrences of x, as well as all occurrences of y, are bound.
A formula is said to be normal iff no variable has both free and bound occurrences in it. From now on, we will implicitly assume that all formulas that we deal with are normal. That is, from now on, we agree that the word “formula” means “normal formula”.
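Whether a formula is normal can be checked mechanically by collecting the variables that have free occurrences and those that have bound ones. The sketch below uses a hypothetical nested-tuple encoding of formulas (`("atom", letter, args)`, `("not", F)`, `("and" | "or" | "imp", E, F)`, `("forall" | "exists", x, F)`). The example formula is read as ∀y(p(x,y) ∨ ∃xp(x,y)); the choice of connective and quantifiers is an assumption that does not affect which occurrences are free or bound.

```python
def free_and_bound(f, scope=frozenset()):
    """Return (free, bound): the variables with a free / a bound occurrence in f."""
    tag = f[0]
    if tag == "atom":
        variables = {t for t in f[2] if isinstance(t, str)}
        return variables - scope, variables & scope
    if tag == "not":
        return free_and_bound(f[1], scope)
    if tag in ("and", "or", "imp"):
        free_l, bound_l = free_and_bound(f[1], scope)
        free_r, bound_r = free_and_bound(f[2], scope)
        return free_l | free_r, bound_l | bound_r
    if tag in ("forall", "exists"):
        free, bound = free_and_bound(f[2], scope | {f[1]})
        return free, bound | {f[1]}   # the variable next to the quantifier counts as bound
    raise ValueError(f"unknown formula tag: {tag!r}")

def is_normal(f) -> bool:
    """A formula is normal iff no variable occurs both free and bound in it."""
    free, bound = free_and_bound(f)
    return not (free & bound)

# x occurs free in the first atom and bound under the inner quantifier.
example = ("forall", "y", ("or", ("atom", "p", ("x", "y")),
                                 ("exists", "x", ("atom", "p", ("x", "y")))))
# Renaming the inner bound variable to z makes the formula normal.
renamed = ("forall", "y", ("or", ("atom", "p", ("x", "y")),
                                 ("exists", "z", ("atom", "p", ("z", "y")))))
print(is_normal(example), is_normal(renamed))
```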
Interpretations 5.12
An interpretation for first-order logic is a function * that assigns some predicate p*(x₁,...,xₙ) (with the fixed attached tuple x₁,...,xₙ of pairwise distinct variables) to each n-ary nonlogical predicate letter p.
Such an interpretation * is said to be admissible for a formula F (or F-admissible) if, for any n-ary predicate letter p of F, the predicate p*(x₁,...,xₙ) assigned to p does not depend on any variables that are not among x₁,...,xₙ but occur in F. In the sequel, we always implicitly assume that the interpretations we consider are admissible for the formulas that we are talking about.
Note: In the literature, interpretations are more commonly called models or structures.
An interpretation * extends to a function *: {formulas} → {predicates} by stipulating that:
• (p(t₁,...,tₙ))* = p*(t₁,...,tₙ);
• ⊤* = ⊤; ⊥* = ⊥;
• (¬F)* = ¬(F*);
• (E∧F)* = E*∧F*;
• (E∨F)* = E*∨F*;
• (E→F)* = E*→F*;
• (∀xF)* = ∀x(F*);
• (∃xF)* = ∃x(F*).
Usually we prefer to write F*(t₁,...,tₙ) instead of (F(t₁,...,tₙ))*.
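The clauses that extend * to all formulas amount to a recursive evaluator. The following is a rough sketch under several assumptions: formulas are hypothetical nested tuples (`("atom", letter, args)`, `("not", F)`, `("and" | "or" | "imp", E, F)`, `("forall" | "exists", x, F)`), an interpretation is a `dict` mapping each letter to its attached variable tuple and a Python predicate, and quantifiers range only over constants below a cutoff `BOUND`, so the results are approximations of truth over N. The sample interpretation makes the 3-ary letter p mean x=y+z.

```python
BOUND = 50  # assumption: finite cutoff standing in for the infinite universe N

def holds(f, interp, e, bound=BOUND):
    """Approximate e[F*] for formula f under interpretation interp and valuation e."""
    tag = f[0]
    if tag == "atom":
        variables, predicate = interp[f[1]]
        e_prime = dict(e)
        for x, t in zip(variables, f[2]):
            e_prime[x] = t if isinstance(t, int) else e[t]   # e'(xi) = e(ti)
        return predicate(e_prime)
    if tag == "not":
        return not holds(f[1], interp, e, bound)
    if tag == "and":
        return holds(f[1], interp, e, bound) and holds(f[2], interp, e, bound)
    if tag == "or":
        return holds(f[1], interp, e, bound) or holds(f[2], interp, e, bound)
    if tag == "imp":
        return (not holds(f[1], interp, e, bound)) or holds(f[2], interp, e, bound)
    if tag in ("forall", "exists"):
        results = (holds(f[2], interp, {**e, f[1]: c}, bound) for c in range(bound))
        return all(results) if tag == "forall" else any(results)
    raise ValueError(f"unknown formula tag: {tag!r}")

# p*(x,y,z) is true at e iff e(x) = e(y) + e(z).
interp = {"p": (("x", "y", "z"), lambda e: e["x"] == e["y"] + e["z"])}

p_x_3_5 = ("atom", "p", ("x", 3, 5))                       # p(x,3,5)
exists_z = ("exists", "z", ("atom", "p", ("x", "y", "z")))  # ∃z p(x,y,z)
print(holds(p_x_3_5, interp, {"x": 8}))
print(holds(exists_z, interp, {"x": 5, "y": 2}))
```

The valuation passed in must list every free variable of the formula, since this sketch does not supply defaults.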
Examples 5.13
Let p be a 3-ary predicate letter, and * be an interpretation that assigns to it the predicate p*(x,y,z) which is true at a given valuation e iff e(x)=e(y)+e(z). What are the meanings of the following formulas (into what predicates do they turn) under this interpretation?
    p(x,y,z)                                                  x=y+z
    p(z,4,y)                                                  z=4+y
    p(x,3,5)                                                  x=3+5, i.e., x=8
    p(x,x,x)                                                  x=x+x, i.e., x=0
    ∀x∀y∀z(p(x,y,z) → p(x,z,y))                               ⊤
    ∃zp(x,y,z)                                                x≥y
    ∃zp(z,x,z)                                                x=0
    ∃z₁∃z₂∃z₃(p(z₁,y,y) ∧ p(z₂,z₁,z₁) ∧ p(z₃,z₂,z₂) ∧ p(x,z₃,z₃))    x=16y
Validity 5.14
A formula F of first-order logic is said to be valid iff, for every interpretation * and every valuation e, we have e[F*]=⊤.
Are the following formulas valid?
    p(x)                               No
    p(x) → p(x)                        Yes
    ∃x(p(x) ∧ ¬p(x))                   No
    ¬∀xp(x) → ∃x¬p(x)                  Yes
    ∀x∃yq(x,y) → ∃y∀xq(x,y)            No
    ∃x∀y(p(x) → p(y))                  Yes
    ∀x∀y(p(x) → p(y))                  No
    ∃x∀yq(x,y) → ∀y∃xq(x,y)            Yes
Theorem 5.1. The problem of telling whether a given formula of first-order logic is valid is recursively enumerable but not decidable.
A Gentzen-style deductive system 5.15
As in system G2 from Episode 4, we understand sequents as finite sets of (now first-order) formulas. Furthermore, as in Episode 4, we only consider formulas without ⊤, ⊥, →, and without ¬ applied to nonatomic formulas: E→F should be understood as ¬E∨F, ¬∀xF as ∃x¬F, and ¬∃xF as ∀x¬F.
Below are the rules of system G3. In those rules, G is any set of formulas, E and F are any formulas, x is any variable, H(x) is any formula, t is any term with no bound occurrence in H(x) or G, H(t) is the result of replacing in H(x) all free occurrences of x by t, y is any variable which does not occur in H(x) or G, and H(y) is the result of replacing in H(x) all free occurrences of x by y.
Remember also that we require all formulas to be normal (Slide 5.11). For safety, here we also require that sequents, seen as formulas (i.e. disjunctions of their elements), be normal.
    Rule              Premise(s)            Conclusion
    Axiom (A)         [no premises]         G, E, ¬E
    ∨-Introduction    G, E, F               G, E∨F
    ∧-Introduction    G, E  and  G, F       G, E∧F
    ∃-Introduction    G, ∃xH(x), H(t)       G, ∃xH(x)
    ∀-Introduction    G, H(y)               G, ∀xH(x)
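Examples 5.16
In each proof below, the sequent to be proved comes first, every sequent follows from the next one by the indicated rule, and the last sequent is an axiom.

A G3-proof of ∃x∀yq(x,y) → ∀y∃xq(x,y), that is, of ∀x∃y¬q(x,y) ∨ ∀y∃xq(x,y):
    1. ∀x∃y¬q(x,y) ∨ ∀y∃xq(x,y)                        [from 2 by ∨-Introduction]
    2. ∀x∃y¬q(x,y), ∀y∃xq(x,y)                         [from 3 by ∀-Introduction]
    3. ∃y¬q(z₁,y), ∀y∃xq(x,y)                          [from 4 by ∀-Introduction]
    4. ∃y¬q(z₁,y), ∃xq(x,z₂)                           [from 5 by ∃-Introduction]
    5. ∃y¬q(z₁,y), ∃xq(x,z₂), q(z₁,z₂)                 [from 6 by ∃-Introduction]
    6. ∃y¬q(z₁,y), ∃xq(x,z₂), q(z₁,z₂), ¬q(z₁,z₂)      [A]

A G3-proof of ∃x∀y(p(x) → p(y)), that is, of ∃x∀y(¬p(x) ∨ p(y)):
    1. ∃x∀y(¬p(x)∨p(y))                                          [from 2 by ∃-Introduction]
    2. ∃x∀y(¬p(x)∨p(y)), ∀y(¬p(0)∨p(y))                          [from 3 by ∀-Introduction]
    3. ∃x∀y(¬p(x)∨p(y)), ¬p(0)∨p(z)                              [from 4 by ∃-Introduction]
    4. ∃x∀y(¬p(x)∨p(y)), ¬p(0)∨p(z), ∀y(¬p(z)∨p(y))              [from 5 by ∀-Introduction]
    5. ∃x∀y(¬p(x)∨p(y)), ¬p(0)∨p(z), ¬p(z)∨p(u)                  [from 6 by ∨-Introduction, twice]
    6. ∃x∀y(¬p(x)∨p(y)), ¬p(0), p(z), ¬p(z), p(u)                [A]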
The soundness and completeness of G3 5.17
Theorem 5.2. For any formula F of first-order logic, we have:
• Soundness: If F is provable in G3, then F is valid.
• Completeness: If F is valid, then F is provable in G3.
The soundness part of this theorem is relatively easy to prove: just as for G2, it can be done by verifying that all rules preserve validity. The completeness part is harder. It was first proven in 1930 by Kurt Gödel. For that reason, and because completeness is the more important part, Theorem 5.2 (or the same theorem for any other equivalent deductive system) is called Gödel’s completeness theorem.
Peano arithmetic 5.18
Language: =, +, ×, ’, 0 (a’ means a+1).
Underlying logic: an extension of G3 which understands = as the equality predicate.
Axioms:
    1. ∀x∀y(x’=y’ → x=y)
    2. ∀x(x’≠0)
    3. ∀x(x+0=x)
    4. ∀x∀y[x+y’=(x+y)’]
    5. ∀x(x×0=0)
    6. ∀x∀y[x×y’=(x×y)+x]
    7. [Q(0) ∧ ∀x(Q(x) → Q(x’))] → ∀xQ(x)
Axiom 7 is a scheme: there is one such axiom for every formula Q. If Q contains additional variables z₁,...,zₙ, then the whole thing should be prefixed with ∀z₁...∀zₙ. This axiom is called the induction scheme.
Gödel’s Incompleteness Theorem: These axioms are not sufficient to prove every true arithmetical sentence. Nor would any bigger recursively enumerable set of true axioms be sufficient.
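Axioms 3 to 6 can be read as recursive definitions of + and × in terms of the successor operation ’. The sketch below takes that reading literally; it is an illustration on Python integers, with the function names `succ`, `add` and `mul` chosen for the example, and it says nothing about provability in Peano arithmetic itself.

```python
def succ(a: int) -> int:
    """The successor operation: a' means a+1."""
    return a + 1

def add(x: int, y: int) -> int:
    """Addition computed from axioms 3 and 4 only."""
    if y == 0:
        return x                      # axiom 3: x+0 = x
    return succ(add(x, y - 1))        # axiom 4: x+y' = (x+y)'

def mul(x: int, y: int) -> int:
    """Multiplication computed from axioms 5 and 6 only."""
    if y == 0:
        return 0                      # axiom 5: x×0 = 0
    return add(mul(x, y - 1), x)      # axiom 6: x×y' = (x×y)+x

# Compare with Python's built-in arithmetic on a small initial segment of N.
agrees = all(add(x, y) == x + y and mul(x, y) == x * y
             for x in range(12) for y in range(12))
print(add(3, 4), mul(3, 4), agrees)
```

The recursion unwinds the second argument one successor at a time, so the sketch is only practical for small numbers.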