An Algebra for Program Designs Tony Hoare Moscow, July 2011
With ideas from Ian Wehrman, John Wickerson, Stephan van Staden, Peter O’Hearn, Bernhard Moeller, Georg Struth, Rasmus Petersen, …and others
Summary operational rules denotational models algebraic laws deduction rules
Part 1 Algebra and Hoare logic Some familiar algebraic laws their application to program designs derivation of Hoare logic from them
Part 1 Algebra and Hoare logic algebraic laws deduction rules
Subject matter: designs variables (p, q, r) stand for programs, designs, specifications,… they all describe what happens inside/around a computer that is executing a program. The program itself is the most precise. The specification is the most abstract. Designs come in between.
Binary relation: p ⊑ q Everything described by p is also described by q, e.g., – spec p implies spec q – prog p satisfies spec q – prog p more determinate than prog q stepwise development is – spec ⊒ design ⊒ program stepwise analysis is the reverse – program ⊑ design ⊑ spec
p ⊑ q
  p (left of ⊑)         q (right of ⊑)
  below                 above
  lesser                greater
  stronger              weaker
  lower bound           upper bound
  more precise          more abstract
  more deterministic    more non-deterministic
  included in           containing (sets)
  antecedent =>         consequent (pred)
⊑ is a partial order ⊑ transitive p ⊑ r if p ⊑ q and q ⊑ r needed for stepwise development/analysis ⊑ antisymmetric and reflexive p = r iff p ⊑ r and r ⊑ p needed for abstraction
Binary operator: p ; q sequential composition of p and q an execution of p;q consists of – all events x from an execution of p – and all events y from an execution of q subject to an ordering constraint….
Three ordering constraints strong sequence: x must precede y weak sequence: y must not precede x no constraint all our algebraic laws will apply to all three alternatives
Hoare triple: {p} q {r} defined as p;q ⊑ r – starting in the final state of an execution of p, q ends in the final state of some execution of r – p and r may be arbitrary designs. example: {..x+1 ≤ n} x:= x + 1 {..x ≤ n} where..b (finally b) describes all executions that end in a state satisfying a single-state predicate b.
monotonicity Law ( ; is monotonic with respect to ⊑): – p;q ⊑ p’;q’ if p ⊑ p’ and q ⊑ q’ – like addition of numbers. Monotonicity justifies modular evolution – p’ and q’ are developed independently. Theorem (rule of consequence): – p’ ⊑ p & {p} q {r} & r ⊑ r’ implies {p’} q {r’}. The law is also provable from the theorem.
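The rule of consequence can be derived in three steps. The following is a sketch of one possible derivation, using only the definition {p} q {r} = p;q ⊑ r, monotonicity of ;, and reflexivity and transitivity of ⊑.

```latex
\begin{align*}
p';q &\sqsubseteq p;q && \text{monotonicity of } ;\text{, from } p' \sqsubseteq p \text{ and } q \sqsubseteq q \\
     &\sqsubseteq r   && \text{assumption } \{p\}\,q\,\{r\} \\
     &\sqsubseteq r'  && \text{assumption } r \sqsubseteq r'
\end{align*}
% By transitivity, p';q \sqsubseteq r', which is \{p'\}\,q\,\{r'\} by definition.
```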
associativity Law (; is associative) : – (p;q);q’ = p;(q;q’) Theorem (sequential composition): – {p} q {s} & {s} q’ {r} implies {p} q;q’ {r} half the law provable from theorem
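A sketch of how the sequential composition rule might be derived from associativity, assuming monotonicity of ; and transitivity of ⊑ from the earlier slides:

```latex
\begin{align*}
p;(q;q') &= (p;q);q'        && \text{associativity of } ; \\
         &\sqsubseteq s;q'  && \text{monotonicity, from } \{p\}\,q\,\{s\} \text{, i.e. } p;q \sqsubseteq s \\
         &\sqsubseteq r     && \text{from } \{s\}\,q'\,\{r\} \text{, i.e. } s;q' \sqsubseteq r
\end{align*}
% Hence p;(q;q') \sqsubseteq r, which is \{p\}\,q;q'\,\{r\}.
```

Only the inclusion p;(q;q’) ⊑ (p;q);q’ is needed here, which is consistent with the remark that the theorem gives back half of the law.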
Conditional correctness disregards unending executions. ..b is re-interpreted as including them all: – ‘if the execution terminates, it will end in a state satisfying b’. The definition of the triple stays the same. All laws apply to conditional correctness logic as well as to total correctness logic.
Unit (skip): a program that does nothing Law (skip is the unit of ;): – p; skip = p = skip; p Theorem (nullity) – {p} skip {p} a quarter of the law is provable from theorem
concurrent composition: p | q execution of (p|q) consists of – all events x of an execution of p, – and all events y of an execution of q same laws apply to both: – interleaving: x precedes or follows y – true concurrency: x neither precedes nor follows y. Laws: | is associative, commutative and monotonic
Separation Logic Law (locality): – (s|p) ; q ⊑ s | (p;q) (left locality) – p ; (q|s) ⊑ (p;q) | s (right locality) – a weak version of associativity – a weak version of distribution Theorem (frame rule): – {p} q {r} implies {p|s} q {r|s} – in Hoare logic, & replaces |, with side-condition that q does not make s false Left locality provable from the theorem!
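One way the frame rule might be derived from left locality is sketched below. It assumes, as stated on the concurrent composition slide, that | is commutative and monotonic.

```latex
\begin{align*}
(p \mid s);q &= (s \mid p);q            && \text{commutativity of } \mid \\
             &\sqsubseteq s \mid (p;q)  && \text{left locality} \\
             &\sqsubseteq s \mid r      && \text{monotonicity of } \mid \text{, from } \{p\}\,q\,\{r\} \\
             &= r \mid s                && \text{commutativity of } \mid
\end{align*}
% Hence (p|s);q \sqsubseteq r|s, which is \{p \mid s\}\,q\,\{r \mid s\}.
```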
Concurrency law Law (; exchanges with |) – (p|q) ; (p’|q’) ⊑ (p;p’) | (q;q’) – a weak kind of mutual distribution Theorem (| compositional) – {p} q {r} & {p’} q’ {r’} implies {p|p’} q|q’ {r|r’} the law is provable from the theorem
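The compositional rule for | follows from the exchange law in two steps. A sketch, assuming monotonicity of | and transitivity of ⊑:

```latex
\begin{align*}
(p \mid p');(q \mid q') &\sqsubseteq (p;q) \mid (p';q') && \text{exchange law} \\
                        &\sqsubseteq r \mid r'          && \text{monotonicity of } \mid \text{, from } p;q \sqsubseteq r \text{ and } p';q' \sqsubseteq r'
\end{align*}
% Hence \{p \mid p'\}\; q \mid q' \;\{r \mid r'\}.
```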
[Figure: the exchange law as a diagram of the four threads p, q, p’, q’, showing p|q ; p’|q’ ⊑ p;p’ | q;q’]
Regular language model p, q, r,… are sets of strings (languages). p ⊑ q is inclusion of languages p;q is (lifted) concatenation of strings p|q is (lifted) interleaving of strings
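The regular language model is easy to experiment with on finite languages. The following Python sketch is an illustration only: the function names `seq`, `par`, `refines` and `hoare` are hypothetical, and languages are represented as finite Python sets of strings.

```python
def seq(p, q):
    """p;q: lifted concatenation, every string of p followed by every string of q."""
    return {u + v for u in p for v in q}

def shuffle(u, v):
    """All interleavings of the two strings u and v."""
    if not u or not v:
        return {u + v}
    return ({u[0] + w for w in shuffle(u[1:], v)} |
            {v[0] + w for w in shuffle(u, v[1:])})

def par(p, q):
    """p|q: lifted interleaving of two languages."""
    return {w for u in p for v in q for w in shuffle(u, v)}

def refines(p, q):
    """p ⊑ q: inclusion of languages."""
    return p <= q

def hoare(p, q, r):
    """{p} q {r}, defined as p;q ⊑ r."""
    return refines(seq(p, q), r)

p, q = {"a"}, {"b", "bc"}
print(sorted(seq(p, q)))               # ['ab', 'abc']
print(sorted(par({"ab"}, {"c"})))      # ['abc', 'acb', 'cab']
print(hoare(p, q, {"ab", "abc", "x"})) # True
```

Finite sets keep the sketch runnable; the model on the slide allows arbitrary (infinite) languages.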
Left locality Theorem: (s|p) ; q ⊑ s |(p;q) in lhs: s interleaves with just p, and all of q comes at the end. in rhs: s interleaves with all of p;q so lhs is a special case of rhs right locality is similar
Exchange Theorem: (p|q) ; (p’|q’) ⊑ (p;p’) | (q;q’) – in lhs: all of p and q comes before all of p’ and q’. – in rhs: p may interleave with q’ and p’ with q – the lhs is a special case of the rhs.
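The locality and exchange theorems can be checked by brute force on small finite languages. This is a sketch under the same assumptions as the model above (finite sets of strings; all function names are hypothetical), and a finite check is of course not a proof.

```python
from itertools import product

def seq(p, q):
    """p;q: lifted concatenation of two languages."""
    return {u + v for u in p for v in q}

def shuffle(u, v):
    """All interleavings of the strings u and v."""
    if not u or not v:
        return {u + v}
    return ({u[0] + w for w in shuffle(u[1:], v)} |
            {v[0] + w for w in shuffle(u, v[1:])})

def par(p, q):
    """p|q: lifted interleaving of two languages."""
    return {w for u in p for v in q for w in shuffle(u, v)}

def left_locality(s, p, q):
    return seq(par(s, p), q) <= par(s, seq(p, q))

def right_locality(p, q, s):
    return seq(p, par(q, s)) <= par(seq(p, q), s)

def exchange(p, q, p2, q2):
    return seq(par(p, q), par(p2, q2)) <= par(seq(p, p2), seq(q, q2))

# Arbitrary sample languages, chosen only for illustration.
samples = [{"a"}, {"b", "cd"}, {"", "e"}, {"fg", "h"}]
locality_ok = all(left_locality(s, p, q) and right_locality(p, q, s)
                  for s, p, q in product(samples, repeat=3))
exchange_ok = all(exchange(*args) for args in product(samples, repeat=4))
print(locality_ok, exchange_ok)  # True True

# The inclusion is strict in general: the rhs allows extra interleavings.
lhs = seq(par({"a"}, {"b"}), par({"c"}, {"d"}))
rhs = par(seq({"a"}, {"c"}), seq({"b"}, {"d"}))
print(sorted(rhs - lhs))         # ['acbd', 'bdac']
```

The last two lines show the point made on the slide: in the rhs, p may interleave with q’ and p’ with q, which the lhs forbids.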
Conclusion regular expressions satisfy all our laws for ⊑, ;, and | and other operators introduced later
Part 2. more operators and laws Complete lattices Iteration, recursion, fixed points Subroutines, abstraction Basic commands
Subject matter variables (p, q, r) stand for programs, designs, specifications,… they are all descriptions of what happens inside and around a computer that is executing a program. the differences between programs and specs are often defined from their syntax.
Specification syntax includes disjunction (or) to express abstraction, or to keep options open – ‘it may be painted green or blue’ conjunction (and) to combine requirements – it must be cheaper than x and faster than y negation (not) for safety and security – it must not explode implication to define contracts – if the user observes the protocol, so will the system
Program syntax excludes disjunction – non-deterministic programs difficult to test conjunction – inefficient to find a computation satisfying both negation – Incomputable implication – there is no point in executing it
programs include sequential composition (;) concurrent composition (|) iteration recursion interfaces transactions assignments, inputs, outputs, jumps,… So let’s include these in our specification/designs
Bottom ⊥ A specification that has no implementation like the false predicate A program that has no execution e.g., because of some syntactic error Define ⊥ as the least solution of _ ⊑ q – r ⊑ q implies ⊥ ⊑ r Law (⊥ is the zero of ;): – ⊥ ; p = ⊥ = p ; ⊥ Theorem: – {p} ⊥ {q}
Top ⊤ a program with a run-time error – for which the programmer is responsible – e.g., subscript error, division by zero, divergence,… defined as the least solution of q ⊑ _ Law: ⊤ is a zero of ; – ⊤ ; p = ⊤ = p ; ⊤ if p ≠ ⊥ Theorem: none
Non-determinism (or): p ⊔ q describes all executions that either satisfy p or satisfy q. The choice is not (yet) determined. It may be determined later – in development of the design – or in writing the program – or by the compiler – or even at run time
lub (join): ⊔ Define p⊔q as least solution of p ⊑ _ & q ⊑ _ Theorem – p ⊑ r & q ⊑ r iff p⊔q ⊑ r Theorem – ⊔ is associative, commutative, monotonic, idempotent and increasing – it has unit ⊥ and zero ⊤
glb (meet): ⊓ Define p⊓q as greatest solution of _ ⊑ p & _ ⊑ q
Distribution Law ( ; distributive through ⊔ ) – p ; (q⊔q’) = p;q ⊔ p;q’ – (q⊔q’) ; p = q;p ⊔ q’;p Theorem (non-determinism) – {p} q {r} & {p} q’ {r} implies {p} q⊔q’ {r} – i.e., to prove something of q⊔q’ prove the same thing of both q and q’ quarter of law provable from theorem
Conditional: p if b else p’ Define p ⊰b⊱ p’ as b.. ⊓ p ⊔ not(b).. ⊓ p’ – where b.. describes all executions that begin in a state satisfying b. Theorem. p ⊰b⊱ p’ is associative, idempotent, distributive, and – p ⊰b⊱ q = q ⊰not(b)⊱ p (symm) – (p ⊰b⊱ p’ ) ⊰c⊱ (q ⊰b⊱ q’) = (p ⊰c⊱ q) ⊰b⊱ (p’ ⊰c⊱ q’) (exchange)
Transaction Defined as (p ⊓..b) ⊔ (q ⊓..c) – where..b describes all executions that end satisfying single-state predicate b. Implementation: – execute p first – test the condition b afterwards – terminate if b is true – backtrack on failure of b – and try an alternative q with condition c.
Transaction (realistic) Let r describe the non-failing executions of a transaction t. – r is known when execution of t is complete. – any successful execution of t is committed – a single failed execution of t is undone, – and q is done instead.
Define: (t if r else q) = t               if t ⊑ r
                        = (t ⊓ r) ⊔ q     otherwise
Least upper bound Let S be an arbitrary set of designs Define ⊔S as least solution of ∀s∊S. s ⊑ _ – (∀s∊S. s ⊑ r) ⇒ ⊔S ⊑ r (all r) everything is an upper bound of { }, so ⊔{ } = ⊥ – a case where ⊔S ∉ S
similarly ⊓ S is greatest lower bound of S ⊓ { } = ⊤
Iteration (Kleene *) q* is least solution of – (ɛ ⊔ (q; _) ) ⊑ _ q* = def ⊔ {s| ɛ ⊔ q; s ⊑ s} – ɛ ⊔ q; q* ⊑ q* – ɛ ⊔ q; q’ ⊑ q’ implies q* ⊑ q’ – q* = ⊔ {qⁿ | n ∊ Nat}(continuity) Theorem (invariance): – {p}q*{p} if{p}q{p}
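In the language model, q* can be approximated by iterating s ↦ ɛ ⊔ q;s upwards from the empty language. The sketch below is an assumption-laden illustration: it truncates all languages to strings of length at most `max_len` (a hypothetical parameter) so that the iteration terminates, and it represents ɛ by the language containing only the empty string.

```python
def seq(p, q, max_len):
    """p;q, keeping only strings of length <= max_len."""
    return {u + v for u in p for v in q if len(u + v) <= max_len}

def star(q, max_len):
    """Least s with ɛ ⊔ q;s ⊑ s, restricted to strings of length <= max_len."""
    s = set()
    while True:
        nxt = {""} | seq(q, s, max_len)  # one unfolding of ɛ ⊔ q;_
        if nxt == s:                     # fixed point reached
            return s
        s = nxt

result = star({"ab", "c"}, 3)
print(sorted(result, key=lambda w: (len(w), w)))
# ['', 'c', 'ab', 'cc', 'abc', 'cab', 'ccc']
```

Each round of the loop adds the strings of one more power qⁿ, which mirrors the continuity property q* = ⊔ {qⁿ | n ∊ Nat}.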
Infinite replication !p is the greatest solution of _ ⊑ p|_ – as in the pi calculus all executions of !p are infinite – or possibly empty
Recursion Let F(_) be a monotonic function between programs. Theorem: all functions defined by monotonic operators are monotonic. μF is strongest solution of F(_) ⊑ _ νF is weakest solution of _ ⊑ F(_) Theorem (Knaster-Tarski): These solutions exist.
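On a finite lattice of sets, the two fixed points can be computed by simple iteration. The sketch below is illustrative only: it assumes F is monotonic and works on finite sets ordered by inclusion, and the graph, `reach` and `can_step` are hypothetical examples that do not come from the slides.

```python
def lfp(f):
    """Least fixed point of a monotonic f, iterating up from the empty set."""
    s = frozenset()
    while f(s) != s:
        s = f(s)
    return s

def gfp(f, universe):
    """Greatest fixed point of a monotonic f, iterating down from the universe."""
    s = frozenset(universe)
    while f(s) != s:
        s = f(s)
    return s

# Hypothetical example: a small directed graph on the nodes 0..4.
edges = {(0, 1), (1, 2), (3, 4), (4, 3)}
nodes = range(5)

def reach(x):
    """F(x) = {0} together with the successors of x."""
    return frozenset({0}) | frozenset(b for a, b in edges if a in x)

def can_step(x):
    """F(x) = the nodes that have a successor in x."""
    return frozenset(a for a, b in edges if b in x)

print(sorted(lfp(reach)))            # [0, 1, 2]
print(sorted(gfp(can_step, nodes)))  # [3, 4]
```

Here the least fixed point collects the nodes reachable from 0, and the greatest fixed point collects the nodes from which an unending path starts. The general Knaster-Tarski theorem does not need finiteness; the iteration is only a convenient way to compute in the finite case.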
Interfaces Let q be the body of a subroutine Let s be its specification Let (q.. s) assert that q meets s Programmer error (⊤) if incorrect Caller of subroutine may assume that s describes all its executions Implementation may execute q
Subroutine with interface: q.. s Define (q..s) as glb of the set q ⊑ _ & _ ⊑ s
Theorem: (q.. s) = q    if q ⊑ s
                 = ⊤    otherwise
Basic statements/assertions
– skip
– bottom ⊥
– top ⊤
– assignment: x := e(x)
– assertion: assert b
– assumption: assume b
– finally: ..b
– initially: b..
more
– assign thru pointer: [a] := e
– output: c!e
– input: c?x
– points to: a |-> e
  – a |-> _ =def exists v. a |-> v
– throw
– catch
Laws (examples)
– assume b =def b.. ⊓ skip
– assert b =def b.. ⊓ skip ⊔ not(b)..
– x := e(x) ; x := f(x) = x := f(e(x))
  – in languages without interleaving
more
– p |-> _ ; [p] := e ⊑ p |-> e
  – in separation logic
– c!e | c?x = x := e
  – in CSP but not in CCS or Pi
– throw x ; (catch x; p) = p
Part 3 Unifying Semantic Theories Six familiar semantic definition styles. Their derivation from the algebra and vice versa.
operational rules algebraic laws deduction rules
Hoare Triple a method for program verification {p} q {r} ≝ p;q ⊑ r – one way of achieving r is by first doing p and then doing q Theorem: – {p} q {s} & {s} q’ {r} implies {p} q;q’ {r} – proved by associativity
Plotkin reduction a method for program execution <p, q> -> r =def p ; q ⊒ r – if p describes state before execution of q then r describes a possible final state, e.g., – <…, …> -> ..(x = 37) Theorem: <p, q> -> s & <s, q’> -> r implies <p, q;q’> -> r
Milner transition method of execution of concurrent processes p –q-> r ≝ p ⊒ q;r – one of the ways of executing p is by first executing q and then executing r. – e.g., (x := x+3) –(x:=x+1)-> (x:=x+2) Theorem: – p –q-> s & s –q’-> r => p –(q;q’)-> r (big-step rule for ; )
test generation method of test case generation p [q] r =def p ⊑ q;r – if r describes erroneous states resulting from execution of q, then p describes some initial states in which a test-run of q will certainly reveal the error. Theorem: p [q] s & s [q’] r implies p [q;q’] r
Summary
  {p} q {r}     =def  p;q ⊑ r    – Hoare triple
  <p, q> -> r   =def  p;q ⊒ r    – Plotkin reduction
  p –q-> r      =def  p ⊒ q;r    – Milner transition
  p [q] r       =def  p ⊑ q;r    – test generation
Sequential composition Law: ; is associative Theorem: sequence rule is valid for all four triples. the Law is provable from the conjunction of all of them
Skip Law: p ; skip = p = skip ; p Theorems: {p} skip {p}, p [skip] p, p –skip-> p, <p, skip> -> p Law follows from conjunction of all four theorems
Left distribution ; through ⊔ Law: p;(q ⊔ q’) = p;q ⊔ p;q’ Theorems: – {p} (q⊔q’) {r} if {p} q {r} and {p} q’ {r} – <p, q⊔q’> -> r if <p, q> -> r or <p, q’> -> r – p [q⊔q’] r if p [q] r or p [q’] r – p –(q⊔q’)-> r if p –q-> r and p –q’-> r (not used in CCS) The law is provable from either ‘and’ rule together with either ‘or’ rule.
locality and frame – left locality: (s|p) ; q ⊑ s | (p;q) – Hoare frame: {p} q {r} ⇒ {s|p} q {s|r} – right locality: p ; (q|s) ⊑ (p;q) | s – Milner frame: p –q-> r ⇒ (p|s) –q-> (r|s) Full locality requires both frame rules
Separation logic Exchange law: – (p | p’) ; (q | q’) ⊑ (p ; q) | (p’ ; q’) Theorems – {p} q {r} & {p’} q’ {r’} ⇒ {p|p’} q|q’ {r|r’} – p –q-> r & p’ –q’-> r’ => p|p’ –q|q’-> r|r’ the law is provable from either theorem For the other two triples, the rules are equivalent to the converse exchange law.
usual restrictions on triples
– in {p} q {r}, p and r are of form ..b, ..c
– in p [q] r, p and r are of form b.., c..
– in <p, q> -> r, p and r are of form ..b, ..c
– in p –q-> r, p and r are programs
– in p –q-> r (small step), q is atomic
(in all cases, q is a program)
all laws are valid without these restrictions
Weakest precondition (-;) Specification statement (;-) (q -; r) = def the weakest solution of ( _ ;q ⊆ r) – the same as Dijkstra’s wp(q, r) – for backward development of programs (p ;- r) = def the weakest solution of ( p ; _ ⊆ r) – Back/Morgan’s specification statement – same as p⇝r in RGSep – for stepwise refinement of designs
Weakest precondition (-;) Law (-; adjoint to ;) – p ⊑ q -; r iff p;q ⊑ r (Galois) Theorem – (q -; r) ; q ⊑ r – p ⊑ q -; (p ; q) Law provable from the theorems – cf. (r div q) × q ≤ r – r ≤ (r × q) div q
Theorems – q’ ⊑ q & r ⊑ r’ => q -; r ⊑ q’ -; r’ – (q;q’) -; r ⊑ q -; (q’ -; r) – q -; r ⊑ (q;s) -; (r;s)
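The Galois connection can be tried out in a bounded version of the language model. In the sketch below, `UNIVERSE`, `ALPHABET`, `MAX_LEN` and `wp` are hypothetical names: preconditions are restricted to strings over a two-letter alphabet of length at most 3, so that the weakest solution can be computed by enumeration.

```python
from itertools import product

ALPHABET = "ab"   # assumed alphabet, for illustration
MAX_LEN = 3       # assumed bound on the length of candidate strings
UNIVERSE = {"".join(w) for n in range(MAX_LEN + 1)
            for w in product(ALPHABET, repeat=n)}

def seq(p, q):
    """p;q: lifted concatenation of two languages."""
    return {u + v for u in p for v in q}

def wp(q, r):
    """q -; r: the largest p within UNIVERSE such that p;q ⊑ r."""
    return {w for w in UNIVERSE if all(w + v in r for v in q)}

q = {"b"}
r = {"ab", "bb", "a"}
print(sorted(wp(q, r)))  # ['a', 'b']

# Galois connection, checked on a few candidate preconditions:
#   p ⊑ q -; r   iff   p;q ⊑ r
for p in [set(), {"a"}, {"a", "b"}, {"ab"}, {"", "a"}]:
    assert (p <= wp(q, r)) == (seq(p, q) <= r)

# The two theorems on the slide, for this example:
assert seq(wp(q, r), q) <= r
assert {"a", "ab"} <= wp(q, seq({"a", "ab"}, q))
```

The analogy with integer division on the slide has the same shape: `wp` plays the role of div and `seq` the role of multiplication.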
Law of consequence
Frame laws
Part 4 Denotational Models A model is a mathematical structure that satisfies the axioms of an algebra, and realistically describes a useful application, for example, program execution.
Models denotational models algebraic laws
Some Standard Models:
– Boolean algebra: ( {0,1}, ≤, ∧, ∨, not(_) )
– predicate algebra (Frege, Heyting): (ℙS, ├, ∧, ∨, not(_), =>, ∃, ∀)
– regular expressions (Kleene): (ℙA*, ⊆, ∪, ;, ɛ, { }, | )
– binary relations (Tarski): (ℙ(S × S), ⊆, ∪, ∩, ;, Id, not(_), converse(_))
algebra of designs is a superset of these
Model: (EV, EX, PR) EV is an underlying set of events (x, y,..) that can occur in any execution of any program EX are executions (e, f,…), modelled as sets of events PR are designs (p, q, r,…), modelled as sets of executions.
Set concepts ⊑ is ⊆ (set inclusion) ⊔ is ∪ (set union) ⊓ is ∩ (intersection of sets) ⊥ is { } (the empty set) ⊤ is EV (the universal set)
With (|) p | q = {e ∪ f | e ∊ p & f ∊ q & e ∩ f = { } } – each execution of p|q is the disjoint union of an execution of p and an execution of q – p|q contains all such disjoint unions | generalises many binary operators
Introducing time TIM is a set of times for events – partially ordered by ≤ Let when : EV -> TIM – map each event to its time of occurrence.
Definition of < x < y =def not(when(y) ≤ when(x)) – x < y & y < x means that x and y occur ‘in true concurrency’. e < f =def ∀x ∊ e, y ∊ f. x < y – no event of f occurs before an event of e – hence e < f implies e ∩ f = { } If ≤ is a total order, – there is no concurrency, – executions are time-ordered strings
Sequential composition (then) p ; q = {e ∪ f | e ∊ p & f ∊ q & e < f} special case: if ≤ is a total order, – e < f means that e ∪ f is concatenation (e⋅f) of strings – ; is the composition of regular expressions
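The event-based definitions of | and ; translate almost directly into code. The sketch below is a simplification under stated assumptions: times are integers (a total order, whereas the slides allow a partial order), executions are frozensets of event names, and the events and the `when` map are invented for illustration.

```python
# Hypothetical events and their times of occurrence (the map `when`).
when = {"x1": 1, "x2": 2, "y1": 3, "y2": 1}

def before(e, f):
    """e < f: for all x in e and y in f, not(when(y) <= when(x))."""
    return all(not (when[y] <= when[x]) for x in e for y in f)

def par(p, q):
    """p|q: disjoint unions of an execution of p and an execution of q."""
    return {e | f for e in p for f in q if not (e & f)}

def seq(p, q):
    """p;q: unions e ∪ f with e in p, f in q and e < f."""
    return {e | f for e in p for f in q if before(e, f)}

p = {frozenset({"x1", "x2"})}
q = {frozenset({"y1"}), frozenset({"y2"})}

print([sorted(e) for e in seq(p, q)])  # [['x1', 'x2', 'y1']]
print(len(par(p, q)))                  # 2
print(seq(p, q) <= par(p, q))          # True
```

The execution containing y2 is excluded from p;q because y2 does not occur after x1 and x2, while p|q accepts it. This is the sense in which ; describes fewer executions than |.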
Theorems These definitions of ; and | satisfy the locality and exchange laws. (s|p) ; q ⊑ s |(p;q) (p|q) ; (p’|q’) ⊑ (p;p’) | (q;q’) – Proof: the lhs describes fewer interleavings than the rhs. regular expressions satisfy all our laws for ⊑, ⊔, ;, and |
Disjoint concurrency (||) p||q =def (p ; q) ⊓ (q ; p) – all events of p concurrent with all of q. – no interaction is possible between them. Theorems: (p||q) ; r ⊒ p || (q ; r) (p||q) ; (p’||q’) ⊒ (p;p’) || (q;q’) – Proof: the rhs has more disjointness constraints than the lhs. – the wrong way round! So make the programmer responsible for disjointness, using interfaces!
Interfaces Let q be the body of a subroutine Let s be its specification Let (q.. s) assert that q is correct Caller may assume s Implementer may execute q
Solution
p*q =def (p|q => p||q)
    = p|q    if p|q ⊑ p||q
    = ⊤      otherwise
– programmer is responsible for absence of interaction between p and q. Theorem: ; and * satisfy locality and exchange. – Proof: in cases where lhs ≠ rhs, rhs = ⊤
Problem ; is almost useless in the presence of arbitrary interleaving (interference). It is hard to prove disjointness of p||q We need a more complex model – which constrains the places at which a program may make changes.
Separation PL is the set of places at which an event can occur each place is ‘owned’ by one thread, – no other thread can act there. Let where:EV -> PL map each event to its place of occurrence. where(e) = def {where(x) | x ∊ e }
Separation principle events at different places are concurrent events at the same place are totally ordered in time ∀x,y ∊ EV. where(x) = where(y) iff x≤y or y≤x
Picture [Figure: events plotted against a time axis and a space axis]
Theorem p || q = {e ∪ f | e ∊ p & f ∊ q & where(e) ∩ where(f) = { } } proved from separation principle
Convexity Principle Each execution contains every event that occurs between any of its events. ∀e ∊ EX, y ∊ EV. ∀x, z ∊ e. when(x) ≤ when(y) ≤ when(z) => y ∊ e – no event from elsewhere can interfere between any two events of an execution
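The convexity condition can be checked mechanically for a given execution. The following sketch assumes integer times (a total order, which is a simplification of the partially ordered TIM) and uses a hypothetical helper `is_convex`; the three events are invented for illustration.

```python
def is_convex(e, events, when):
    """True if every event timed between two events of e is itself in e."""
    return all(y in e
               for x in e for z in e for y in events
               if when[x] <= when[y] <= when[z])

# Hypothetical events with their times of occurrence.
when = {"a": 1, "b": 2, "c": 3}
events = set(when)

print(is_convex({"a", "b"}, events, when))  # True
print(is_convex({"a", "c"}, events, when))  # False: b occurs between a and c
```

The second execution fails because the event b from elsewhere falls between two of its events, which is exactly the interference the principle rules out.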
A convex execution of p;q [Figure: regions p and q plotted against time and space axes]
A non-convex ‘execution’ of p;q [Figure: regions p and q plotted against time and space axes]
Conclusion: in Praise of Algebra Reusable Modular Incremental Unifying Discriminative Computational Comprehensible Abstract Beautiful!
Algebra likes pairs Algebra chooses as primitives – operators with two operands: +, – predicates with two places: =, – laws with two operators: & v, + – algebras with two components: rings
Tuples Tuples are defined in terms of pairs. – Hoare triples – Plotkin triples – Jones quintuples – seventeentuples …
Semantic Links deductions transitions denotations algebra
Increments algebra
Filling the gaps algebra