50.530: Software Engineering Sun Jun SUTD
Week 13: Rely-Guarantee Reasoning
Background: Race Conditions
Because of races, it is in general not safe to have multiple threads modifying a memory location at the same time.
[Figure: Thread1 to Thread4 each execute count++ on the shared variable count.]
Background: Locking
[Figure: Thread1 to Thread4 each wrap the update of count in lock(e) … unlock(e).]
The scheduler has to maintain many locks and becomes the performance bottleneck.
Disadvantage of Locking
The ratio of scheduling overhead to useful work can be quite high when the lock is frequently contended, due to context switches and scheduling delays.
A thread holding the lock may be delayed (due to a page fault, scheduling delay, etc.), and every thread waiting for that lock is then delayed with it.
Locking is simply a heavyweight mechanism for simple operations like count++.
Can we get rid of locks?
Hardware Support for Concurrency Processors designed for multiprocessor operation provide special instructions for managing concurrent access to shared variables, for example: – compare-and-swap – load-linked/store-conditional OSs and JVMs use these instructions to implement locks and concurrent data structures
Compare and Swap
CAS has three operands:
– a memory location V,
– the expected old value A,
– and the new value B.
CAS updates V to the new value B, but only if the value in V matches the expected old value A; otherwise, it does nothing. In either case, it returns the value that was in V before the operation.
public class SimulatedCAS {
    private int value;

    synchronized int get() { return value; }

    // Sets value to newValue only if it equals expectedValue;
    // returns the value seen before the (possible) update.
    synchronized int compareAndSwap(int expectedValue, int newValue) {
        int oldValue = value;
        if (oldValue == expectedValue)
            value = newValue;
        return oldValue;
    }
}
A Non-blocking Counter
public class CasCounter {
    private final SimulatedCAS value = new SimulatedCAS();

    public int getValue() { return value.get(); }

    public int increment() {
        int v;
        do {
            v = value.get();
            // retry if another thread changed value between get() and the CAS
        } while (v != value.compareAndSwap(v, v + 1));
        return v + 1;
    }
}
Is it more efficient than a lock-based counter? Any potential problem?
CAS is used to implement AtomicXxx in Java.
CAS in Java
CAS is supported by the atomic variable classes (12 in java.util.concurrent.atomic), such as AtomicInteger, AtomicBoolean, and AtomicReference, which are used to implement most of the classes in java.util.concurrent.
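The counter from the previous slide can be sketched on top of the real AtomicInteger class instead of SimulatedCAS. The class name AtomicCasCounter and the demo helper are hypothetical names introduced here for illustration; compareAndSet returns a boolean rather than the old value, so the retry condition changes shape accordingly.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class name; a sketch of a CAS-based counter using AtomicInteger.
class AtomicCasCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    int getValue() {
        return value.get();
    }

    // Retry loop: read the current value, then try to install value + 1.
    // compareAndSet fails (returns false) if another thread updated the value in between.
    int increment() {
        int v;
        do {
            v = value.get();
        } while (!value.compareAndSet(v, v + 1));
        return v + 1;
    }

    // Illustrative helper: runs `threads` threads that each increment `perThread` times
    // and returns the final count.
    static int demo(int threads, int perThread) {
        AtomicCasCounter counter = new AtomicCasCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.getValue();
    }
}
```

No increment is lost, because a failed CAS simply retries with a fresh read. In practice AtomicInteger.incrementAndGet() does the same job in one call.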
Non-blocking Algorithms
An algorithm is called non-blocking if failure or suspension of any thread cannot cause failure or suspension of another thread.
Non-blocking algorithms are immune to deadlock (though, in unlikely scenarios, they may exhibit livelock or starvation).
Non-blocking algorithms are known for stacks (Treiber’s), queues, hash tables, etc.
Non-blocking Stack
Non-blocking algorithms are considerably more complicated than their lock-based equivalents.
The key is figuring out how to limit the scope of atomic changes to a single variable.
Sample program: ConcurrentStack.java
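As an illustration of limiting the atomic change to a single variable, here is a minimal sketch of Treiber's stack built on AtomicReference. It is not necessarily the same as the course's ConcurrentStack.java; the class name TreiberStack and the choice to return null from pop() on an empty stack are assumptions made for this example.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical class name; a sketch of Treiber's non-blocking stack.
class TreiberStack<E> {
    private static final class Node<E> {
        final E item;
        Node<E> next;

        Node(E item) {
            this.item = item;
        }
    }

    // The only shared mutable variable: every update is a single CAS on top.
    private final AtomicReference<Node<E>> top = new AtomicReference<>();

    void push(E item) {
        Node<E> newHead = new Node<>(item);
        Node<E> oldHead;
        do {
            oldHead = top.get();
            newHead.next = oldHead;
        } while (!top.compareAndSet(oldHead, newHead)); // retry if top moved
    }

    // Returns null when the stack is empty (an assumption of this sketch).
    E pop() {
        Node<E> oldHead;
        Node<E> newHead;
        do {
            oldHead = top.get();
            if (oldHead == null) {
                return null;
            }
            newHead = oldHead.next;
        } while (!top.compareAndSet(oldHead, newHead)); // retry if top moved
        return oldHead.item;
    }
}
```

Both operations prepare their change privately and publish it with one CAS on top, so a suspended thread never leaves the stack in a half-updated state.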
Can we verify concurrent programs with shared variables?
VERIFYING PROPERTIES OF PARALLEL PROGRAMS: AN AXIOMATIC APPROACH Owicki and Gries, Communications of the ACM, 1976
An Unsound Hoare Rule
Is the following rule sound?

  { pre1 } P1 { post1 }    { pre2 } P2 { post2 }
  ----------------------------------------------
  { pre1 && pre2 } P1 || P2 { post1 && post2 }

Yes, if P1 and P2 have no shared variables. What if there are shared variables?
An Example
The following Hoare triples hold:
  { y = 1 } x := 0 { y = 1 }
  { true } y := 2 { true }
Is the following valid (assuming assignment is atomic)?
  { y = 1 ⋀ true } x := 0 || y := 2 { y = 1 ⋀ true }
No: P2 (y := 2) interferes with the post-condition of P1!
Another Example
The following Hoare triples hold:
  { z = 0 } x := z; y := x { y = 0 }
  { true } x := 2 { true }
The following does not:
  { z = 0 } (x := z; y := x) || x := 2 { y = 0 }
P2 (x := 2) interferes with the proof of P1: if x := 2 runs between x := z and y := x, then y = 2 at the end.
Interference
Interference is somehow the problem. What does "interfere" really mean?
– Does bal := bal + 1 interfere with { bal > 1000 }?
Definition 1: if a program never modifies my variables, it is not interfering. This is too restrictive.
Definition 2: C does not interfere with the assertion P if { P } C { P } holds, i.e., P is not invalidated by the execution of C.
Alternative intuition: if a thread has reached a state where P holds, it is not a problem if another thread executes C.
– Example: { bal > 1000 } bal := bal + 1 { bal > 1000 }.
Proof Outline
Program:
  { T }
  if (bal > 1000) { credit := 1; } else { credit := 0; }
  { credit = 1 => bal > 1000 }
Proof outline:
  { T }
  if (bal > 1000) {
    { bal > 1000 }
    credit := 1;
    { credit = 1 => bal > 1000 }
  } else {
    { true }
    credit := 0;
    { credit = 1 => bal > 1000 }
  }
  { credit = 1 => bal > 1000 }
A proof outline puts an assertion before and after every atomic statement in the program; each statement, together with the assertions immediately before and after it, forms a Hoare triple.
Interference Freedom Given the proof outline O1 of {pre1} P1 {post1} and O2 of {pre2} P2 {post2}, O2 is not interfering with O1 if for every Hoare triple in O2, say {p}c{q}, and for every assertion A in O1, the following is true: { p ⋀ A } c { A } i.e., A remains true for any statement in P2. O1 and O2 are interference-free if they do not interfere with each other.
Owicki-Gries Proof Rule
Step 1: Attach an assertion to every control point, and show that every Hoare triple holds.
Step 2: Prove interference freedom: show that every assertion used in a local proof is not invalidated by the execution of the other thread.

  { pre1 } P1 { post1 }    { pre2 } P2 { post2 }
  ----------------------------------------------
  { pre1 && pre2 } P1 || P2 { post1 && post2 }

provided that the proofs of { pre1 } P1 { post1 } and { pre2 } P2 { post2 } are interference-free.
Owicki-Gries Proof Method
Prove { x = 0 } x := x + 1 || x := x + 2 { x = 3 }. Assume that each statement is atomic for now.
Proof: Let P1 = (x = 0 ⋁ x = 2), Q1 = (x = 1 ⋁ x = 3), P2 = (x = 0 ⋁ x = 1), Q2 = (x = 2 ⋁ x = 3).
Step 0: we show (x = 0 => P1 ⋀ P2) and (Q1 ⋀ Q2 => x = 3).
Step 1: we show that the following are true.
  { P1 } x := x + 1 { Q1 }
  { P2 } x := x + 2 { Q2 }
Owicki-Gries Proof Method
Prove { x = 0 } x := x + 1 || x := x + 2 { x = 3 }.
Proof: Let P1 = (x = 0 ⋁ x = 2), Q1 = (x = 1 ⋁ x = 3), P2 = (x = 0 ⋁ x = 1), Q2 = (x = 2 ⋁ x = 3).
Step 2: we prove interference freedom.
  { P1 ⋀ P2 } x := x + 1 { P2 }
  { P1 ⋀ Q2 } x := x + 1 { Q2 }
  { P1 ⋀ P2 } x := x + 2 { P1 }
  { Q1 ⋀ P2 } x := x + 2 { Q1 }
End of Proof
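The six triples of Steps 1 and 2 can be sanity-checked by brute force over a small range of values of x. This is only a sketch and not a proof: the class name OwickiGriesCheck, the helper triple, and the range [-10, 10] are assumptions made for illustration. The second method shows that the naive assertions x = 0, x = 1, x = 2 (which are fine for each thread in isolation) fail the interference-freedom check.

```java
import java.util.function.IntPredicate;
import java.util.function.IntUnaryOperator;

// Hypothetical helper; exhaustively checks Hoare triples over a finite range of x.
class OwickiGriesCheck {
    static final int LO = -10;
    static final int HI = 10;

    // Checks {pre} x := f(x) {post} for every x in [LO, HI].
    static boolean triple(IntPredicate pre, IntUnaryOperator f, IntPredicate post) {
        for (int x = LO; x <= HI; x++) {
            if (pre.test(x) && !post.test(f.applyAsInt(x))) {
                return false;
            }
        }
        return true;
    }

    // The outlines from the slides: local correctness plus interference freedom.
    static boolean checkExample() {
        IntPredicate p1 = x -> x == 0 || x == 2;
        IntPredicate q1 = x -> x == 1 || x == 3;
        IntPredicate p2 = x -> x == 0 || x == 1;
        IntPredicate q2 = x -> x == 2 || x == 3;
        IntUnaryOperator inc1 = x -> x + 1; // thread 1: x := x + 1
        IntUnaryOperator inc2 = x -> x + 2; // thread 2: x := x + 2

        boolean local = triple(p1, inc1, q1) && triple(p2, inc2, q2);
        boolean interferenceFree =
                triple(p1.and(p2), inc1, p2)
                && triple(p1.and(q2), inc1, q2)
                && triple(p1.and(p2), inc2, p1)
                && triple(q1.and(p2), inc2, q1);
        return local && interferenceFree;
    }

    // Naive outlines {x=0} x:=x+1 {x=1} and {x=0} x:=x+2 {x=2}: each is locally valid,
    // but thread 1's statement invalidates thread 2's precondition x = 0.
    static boolean naiveOutlineIsInterferenceFree() {
        IntPredicate isZero = x -> x == 0;
        IntUnaryOperator inc1 = x -> x + 1;
        return triple(isZero.and(isZero), inc1, isZero);
    }
}
```

The disjunctive assertions P1, Q1, P2, Q2 are exactly what makes each assertion survive the other thread's step.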
Owicki-Gries Proof Method Let P1 be bal := bal + dep. Let P2 be: if (bal > 1000) { credit := 1; } else { credit := 0; } Prove the following Hoare triple: { bal = B ⋀ dep > 0 } P1 || P2 { bal = B + dep ⋀ dep > 0 ⋀ (credit = 1 => bal > 1000)}
Owicki-Gries Proof Method P1: bal := bal + dep; Proof outline of P1: { bal = B ⋀ dep > 0 } bal := bal + dep { bal = B + dep ⋀ dep > 0 } P2: if (bal > 1000) { credit := 1; } else { credit := 0; } Proof outline of P2: { T } if (bal > 1000) { { bal > 1000 } credit := 1; { credit = 1 => bal > 1000 } } else { { true } credit := 0; { credit = 1 => bal > 1000 } } { credit = 1 => bal > 1000 }
Owicki-Gries Proof Method
Prove that the proof of P1 does not interfere with the proof of P2:
  { bal = B ⋀ dep > 0 ⋀ bal > 1000 } bal := bal + dep { bal > 1000 }
  { bal = B ⋀ dep > 0 ⋀ true } bal := bal + dep { true }
  { bal = B ⋀ dep > 0 ⋀ (credit = 1 => bal > 1000) } bal := bal + dep { credit = 1 => bal > 1000 }
Prove that the proof of P2 does not interfere with the proof of P1:
  { bal > 1000 ⋀ bal = B ⋀ dep > 0 } credit := 1; { bal = B ⋀ dep > 0 }
  { bal > 1000 ⋀ bal = B + dep ⋀ dep > 0 } credit := 1; { bal = B + dep ⋀ dep > 0 }
  { bal = B ⋀ dep > 0 } credit := 0; { bal = B ⋀ dep > 0 }
  { bal = B + dep ⋀ dep > 0 } credit := 0; { bal = B + dep ⋀ dep > 0 }
Exercise 1 Prove: {x=0} x=x+1 || x=x+1 {x=2} Assume for now that each statement is atomic.
Auxiliary Variables
Let P be a program and A be a set of variables in P. A is a set of auxiliary variables of P if:
– variables in A occur only in assignments, not in guards or in tests of loops or conditionals;
– if x ∈ A occurs in an assignment (x1,...,xn) := (E1,...,En), then x occurs in Ei only when xi ∈ A, i.e., variables in A cannot influence variables outside A.
Auxiliary Variable Rule

  { pre } P { post }
  -------------------
  { pre } P’ { post }

if there is a set A of auxiliary variables of P such that
– P’ is obtained by removing the variables in A (and the relevant assignments) from P, and
– post does not mention any variable in A.
Exercise 2 Prove: {x=0} x=x+1 || x=x+1 {x=2} To prove the above, we introduce two auxiliary variables, done1 and done2. We prove the following instead. {x=0 ⋀ done1=0 ⋀ done2=0} (x, done1 := x+1, 1) || (x, done2 := x+1, 1) {x=2} Assume that each statement is atomic for now.
Owicki-Gries Proof Method
Prove: {x=0} (x, done1 := x+1, 1) || (x, done2 := x+1, 1) {x=2}
Proof outline Δ1:
  { done1 = 0 ⋀ (done2 = 0 => x = 0) ⋀ (done2 = 1 => x = 1) }
  (x,done1) := (x+1,1)
  { done1 = 1 ⋀ (done2 = 0 => x = 1) ⋀ (done2 = 1 => x = 2) }
Proof outline Δ2:
  { done2 = 0 ⋀ (done1 = 0 => x = 0) ⋀ (done1 = 1 => x = 1) }
  (x,done2) := (x+1,1)
  { done2 = 1 ⋀ (done1 = 0 => x = 1) ⋀ (done1 = 1 => x = 2) }
Exercise 3 Prove: {x=0} (x, done1:=x+1, 1) || (x, done2:=x+1, 1) {x=2} … Prove that the proof outlines Δ1 and Δ2 are interference-free.
Recap: Owicki-Gries Method
It requires reasoning about the whole program.
– First, find “local” proof outlines.
– Second, check interference freedom, which is not local!
Can we come up with a way of proving concurrent programs locally?
Rely-Guarantee Reasoning
If you program that part … and provide me this interface … I will do this part … and provide an interface so that you can … as long as you don’t do that, then my part would work like this...
Another Look
Sequential proof: as long as the environment is guaranteed not to modify my variables, the proof holds.
Owicki-Gries proof: as long as the environment is guaranteed not to invalidate my local proof outline, my local proof holds.
Rely-Guarantee proof: for my local proof, I specify exactly what assumptions I make about the environment and what guarantee I provide, so that as long as the environment satisfies my assumptions, my local proof stands.
Another Look
Consider again: { x = 0 } x := x + 1 || x := x + 2 { x = 3 }
In the example, the transition of P2, x := x + 2 (which is the environment of P1), is constrained by the predicate
  (x = 0 ⋀ x’ = 2) ⋁ (x = 1 ⋀ x’ = 3)
where x and x’ refer to the value of x in the program states before and after a transition. This fact suffices to prove that the assertions used in P1's local proof are not invalidated by interference from P2.
Rely-Guarantee Specification
A rely/guarantee specification is a quadruple ( pre, rely, guar, post ):
– pre is the precondition, a single-state predicate that describes what is assumed about the initial state;
– post is the post-condition, a two-state predicate relating the initial state to the final state immediately after the program terminates;
– rely is the rely-condition, which models all the atomic actions of the environment, describing the interference the program can tolerate from its environment;
– guar is the guarantee condition, which models the atomic actions of the program, and hence describes the interference that it imposes on the other threads of the system.
Stability
In a rely/guarantee specification (pre, rely, guar, post) we require that pre is stable under the rely condition, that is, it is resistant to interference from the environment.
A predicate P is stable under R iff { P } R { P }.
For example: the predicate P = bal > 1000 ⋀ dep > 0 is stable under the action bal := bal + dep.
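The stability condition { P } R { P } can be explored by exhaustively checking a small grid of states. The sketch below is a sanity check rather than a proof; the class name StabilityCheck, the interface StatePred, and the chosen ranges for bal and dep are assumptions made for illustration.

```java
// Hypothetical helper; checks stability of a predicate under the action bal := bal + dep.
class StabilityCheck {
    // A single-state predicate over the variables bal and dep.
    interface StatePred {
        boolean test(int bal, int dep);
    }

    // P is stable under bal := bal + dep iff P(bal, dep) implies P(bal + dep, dep).
    // Only states with bal in [990, 1010] and dep in [-5, 5] are examined.
    static boolean stableUnderDeposit(StatePred p) {
        for (int bal = 990; bal <= 1010; bal++) {
            for (int dep = -5; dep <= 5; dep++) {
                if (p.test(bal, dep) && !p.test(bal + dep, dep)) {
                    return false; // found a state where the action invalidates P
                }
            }
        }
        return true;
    }
}
```

With P = bal > 1000 ⋀ dep > 0 the check passes, while bal > 1000 on its own fails it: for example bal = 1001 and dep = -5 leads to bal = 996. The conjunct dep > 0 is what makes the predicate stable.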
Rely-Guarantee We write P |= ( pre, rely, guar, post ) to denote program P satisfies post while guaranteeing guar, given pre and rely are satisfied. Compare {pre} P {post} and P |= (pre, rely, guar, post).
Rely-Guarantee Proof Key ideas: the action of the interleaved transitions of the other threads (e.g. the states σi+1 and σj+1) is constrained by the rely condition; the post-condition relates the initial and final state, under the assumption that all other threads respect the rely constraints.
Rely-Guarantee Proof Rules

  P1 |= ( pre1, rely1, guar1, post1 )
  P2 |= ( pre2, rely2, guar2, post2 )
  guar1 => rely2
  guar2 => rely1
  ---------------------------------------------------------------
  P1 || P2 |= ( pre1 ⋀ pre2, rely1 ⋀ rely2, guar1 ⋁ guar2, q )

where q is:
  (post1; (rely1 ⋀ rely2)*; post2) ⋁ (post2; (rely1 ⋀ rely2)*; post1)
Are guar1 and guar2 relevant to the conclusion?
Rely-Guarantee Proof Rules
Proving a parallel program reduces to:
– a sequential proof of the post-condition and guarantee condition of each individual thread, assuming that its rely condition is true;
– a pairwise proof that every other thread's guarantee condition implies this thread's rely condition.

  P1 |= ( pre1, rely1, guar1, post1 )
  P2 |= ( pre2, rely2, guar2, post2 )
  guar1 => rely2
  guar2 => rely1
  ---------------------------------------------------------------
  P1 || P2 |= ( pre1 ⋀ pre2, rely1 ⋀ rely2, guar1 ⋁ guar2, q )
Rely-Guarantee Proof Rules

  { pre } P { post }
  ------------------------------------------------
  P |= ( pre, preserve(pre), post ⋁ ID, post )

where P is an atomic step; preserve(p) means that for any action C from the environment, { p } C { p }; and ID is the identity relation.
Rely-Guarantee Proof Rules
For sequential composition:
– The precondition of the second operand must follow from the post-condition of the first.
– The total action is given by the composition of the actions of its components, accounting for environment interference in between.

  P1 |= ( pre1, rely, guar, post1 )
  P2 |= ( pre2, rely, guar, post2 )
  post1 => pre2
  ------------------------------------------------------
  P1; P2 |= ( pre1, rely, guar, (post1; rely*; post2) )
Rely-Guarantee Proof Rules
It is always safe to strengthen the precondition or rely-condition, or to weaken the post-condition or guarantee-condition.

  P |= ( pre, rely, guar, post )
  pre’ => pre    rely’ => rely    guar => guar’    post => post’
  ---------------------------------------------------------------
  P |= ( pre’, rely’, guar’, post’ )
Example
Prove: { x = 0 } x := x + 1 || x := x + 2 { x = 3 }
Proof: We prove the following about the first thread.
  x := x + 1 ⊨ ( pre1, rely1, guar1, post1 ), where
    pre1  = x=0 ⋁ x=2
    rely1 = (x=0 ⋀ x’=2) ⋁ (x=1 ⋀ x’=3) ⋁ (x’=x)
    guar1 = (x=0 ⋀ x’=1) ⋁ (x=2 ⋀ x’=3) ⋁ (x’=x)
    post1 = (x=0 ⋀ x’=1) ⋁ (x=2 ⋀ x’=3)
We show that x=0 ⋁ x=2 is stable under (x=0 ⋀ x’=2) ⋁ (x=1 ⋀ x’=3) ⋁ (x’=x).
(to be continued)
Example
Prove: { x = 0 } x := x + 1 || x := x + 2 { x = 3 }
Proof: We then prove the following about the second thread.
  x := x + 2 ⊨ ( pre2, rely2, guar2, post2 ), where
    pre2  = x=0 ⋁ x=1
    rely2 = (x=0 ⋀ x’=1) ⋁ (x=2 ⋀ x’=3) ⋁ (x’=x)
    guar2 = (x=0 ⋀ x’=2) ⋁ (x=1 ⋀ x’=3) ⋁ (x’=x)
    post2 = (x=0 ⋀ x’=2) ⋁ (x=1 ⋀ x’=3)
We show that x=0 ⋁ x=1 is stable under (x=0 ⋀ x’=1) ⋁ (x=2 ⋀ x’=3) ⋁ (x’=x).
(to be continued)
Example Prove: { x = 0 } x := x + 1 || x := x + 2 { x = 3 } Proof We then prove the following (guar1 => rely2): (x=0 ⋀ x’=1) ⋁ (x=2 ⋀ x’=3) ⋁ (x’=x) => (x=0 ⋀ x’=1) ⋁ (x=2 ⋀ x’=3) ⋁ (x’=x) We then prove the following (guar2 => rely1): (x=0 ⋀ x’=2) ⋁ (x=1 ⋀ x’=3) ⋁ (x’=x) => (x=0 ⋀ x’=2) ⋁ (x=1 ⋀ x’=3) ⋁ (x’=x) (to be continued)
Example Prove: { x = 0 } x := x + 1 || x := x + 2 { x = 3 } Proof Apply the rely-guarantee proof rule. x := x + 1 || x := x + 2 |= ( (x=0 ⋁ x=2) ⋀ (x=0 ⋁ x=1), ((x=0 ⋀ x’=2) ⋁ (x=1 ⋀ x’=3) ⋁ x’=x) ⋀ ((x=0 ⋀ x’=1) ⋁ (x=2 ⋀ x’=3) ⋁ x’=x), (x=0 ⋀ x’=1) ⋁ (x=2 ⋀ x’=3) ⋁ (x’=x) ⋁ (x=0 ⋀ x’=2) ⋁ (x=1 ⋀ x’=3) ⋁ (x’=x), q) (to be continued)
Example
Prove: { x = 0 } x := x + 1 || x := x + 2 { x = 3 }
Proof: Apply the rely-guarantee proof rule. Notice that pre1 ⋀ pre2 is equivalent to x=0 and rely1 ⋀ rely2 is equivalent to x’=x:
  x := x + 1 || x := x + 2 |= ( x=0, x’=x, guar, q ), where q is
    ( ((x=0 ⋀ x’=1) ⋁ (x=2 ⋀ x’=3)); (x’=x)*; ((x=0 ⋀ x’=2) ⋁ (x=1 ⋀ x’=3)) )
    ⋁ ( ((x=0 ⋀ x’=2) ⋁ (x=1 ⋀ x’=3)); (x’=x)*; ((x=0 ⋀ x’=1) ⋁ (x=2 ⋀ x’=3)) )
(to be continued)
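Example
Prove: { x = 0 } x := x + 1 || x := x + 2 { x = 3 }
Proof: Simplify the result of the rely-guarantee proof rule. Since rely1 ⋀ rely2 is equivalent to x’=x, (rely1 ⋀ rely2)* is the identity relation and q reduces to x = 0 ⋀ x’ = 3. Weakening the guarantee to true gives:
  x := x + 1 || x := x + 2 |= ( x=0, x’=x, true, x = 0 ⋀ x’ = 3 )
Hence x = 3 on termination. End of Proof.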
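The simplification of q can be replayed mechanically by treating each two-state predicate as a relation over a small finite domain and computing the compositions directly. This is a sketch under stated assumptions: the class RelyGuaranteeExample, its helper methods, and the restriction of x to 0..5 are all introduced here for illustration only.

```java
// Hypothetical helper; two-state predicates as boolean tables over x in 0..5.
class RelyGuaranteeExample {
    interface Rel {
        boolean holds(int x, int xNext);
    }

    static final int N = 6; // assumption: only the values 0..5 of x are considered

    static boolean[][] table(Rel r) {
        boolean[][] t = new boolean[N][N];
        for (int x = 0; x < N; x++)
            for (int y = 0; y < N; y++)
                t[x][y] = r.holds(x, y);
        return t;
    }

    static boolean[][] or(boolean[][] a, boolean[][] b) {
        return table((x, y) -> a[x][y] || b[x][y]);
    }

    static boolean[][] and(boolean[][] a, boolean[][] b) {
        return table((x, y) -> a[x][y] && b[x][y]);
    }

    // Relational composition a; b: some intermediate state y links x to z.
    static boolean[][] compose(boolean[][] a, boolean[][] b) {
        return table((x, z) -> {
            for (int y = 0; y < N; y++)
                if (a[x][y] && b[y][z]) return true;
            return false;
        });
    }

    static boolean implies(boolean[][] a, boolean[][] b) {
        for (int x = 0; x < N; x++)
            for (int y = 0; y < N; y++)
                if (a[x][y] && !b[x][y]) return false;
        return true;
    }

    static boolean same(boolean[][] a, boolean[][] b) {
        return implies(a, b) && implies(b, a);
    }

    // Reflexive-transitive closure r*, computed as a fixpoint.
    static boolean[][] star(boolean[][] r) {
        boolean[][] s = or(r, ID);
        while (true) {
            boolean[][] next = or(s, compose(s, s));
            if (same(next, s)) return s;
            s = next;
        }
    }

    static final boolean[][] ID = table((x, y) -> x == y);
    static final boolean[][] POST1 = table((x, y) -> (x == 0 && y == 1) || (x == 2 && y == 3));
    static final boolean[][] POST2 = table((x, y) -> (x == 0 && y == 2) || (x == 1 && y == 3));
    static final boolean[][] GUAR1 = or(POST1, ID);
    static final boolean[][] GUAR2 = or(POST2, ID);
    static final boolean[][] RELY1 = or(POST2, ID); // what thread 1 tolerates
    static final boolean[][] RELY2 = or(POST1, ID); // what thread 2 tolerates

    // q = (post1; (rely1 ⋀ rely2)*; post2) ⋁ (post2; (rely1 ⋀ rely2)*; post1)
    static boolean[][] composedPost() {
        boolean[][] relyStar = star(and(RELY1, RELY2));
        return or(compose(compose(POST1, relyStar), POST2),
                  compose(compose(POST2, relyStar), POST1));
    }
}
```

On this domain rely1 ⋀ rely2 comes out as the identity relation, and composedPost() contains only the pair x = 0, x’ = 3, matching the simplified post-condition.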
Exercise 4
Let P1 be the following program:
  bal := bal + dep
Let P2 be the following program:
  if bal > 1000 then credit := 1 else credit := 0
Prove the following:
  P1 || P2 ⊨ ( dep > 0, dep’ = dep ⋀ credit’ = credit ⋀ bal’ = bal, true, bal’ = bal + dep ⋀ dep > 0 ⋀ (credit = 1 => bal > 1000) )
Example
Given an array v[0..n-1] and a function P, write a function findp() that finds the smallest r such that P(v[r]) = true holds.
public int findp(int[] v) {
    int N = v.length;
    for (int i = 0; i < N; i++) {
        if (P(v[i])) {
            return i;   // smallest index whose element satisfies P
        }
    }
    return N;           // no element satisfies P
}
public boolean P(int n) { … }
Example
We prove the following using the methods we discussed previously, where r is the value returned by findp(v) and n = v.length:
  { ∀ i. P(v[i]) is defined }
  findp(v)
  { (r = n ⋀ ∀ i. ¬P(v[i])) ⋁ (0 ≤ r < n ⋀ P(v[r]) ⋀ ∀ i < r. ¬P(v[i])) }
Example
Idea: partition the array and let multiple processes search concurrently, one process per partition. Simple way: one process for the even indices and one for the odd indices.
Naive concurrency: each process searches its partition, and the final result is the minimum of the results of the even and odd processes.
Problem: this can perform worse than the sequential version (why?)
Example
What is the RG specification for each thread in this program?
  pre:  ∀ i. P(v[i]) is defined
  rely: v’ = v ⋀ top’ ≤ top
  guar: top’ = top ⋁ (top’ < top ⋀ P(v[top’]))
  post: ∀ i ∈ partition. i < top’ => ¬P(v[i])
int top = v.length;
Thread 1:
  for (int i = 0; i < top; i = i + 2) {
      if (P(v[i])) { top = min(i, top); return; }
  }
Thread 2:
  for (int i = 1; i < top; i = i + 2) {
      if (P(v[i])) { top = min(i, top); return; }
  }
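A runnable version of the two-thread search might look as follows. This is a sketch, not the lecture's program: the class name ConcurrentFindp is hypothetical, the predicate is passed in as an IntPredicate, and top is held in an AtomicInteger so that the update top := min(i, top) is a single atomic step, which is what the guarantee condition assumes.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntPredicate;

// Hypothetical class name; even/odd concurrent search sharing the bound top.
class ConcurrentFindp {
    // Returns the smallest index r with p(v[r]), or v.length if there is none.
    static int findp(int[] v, IntPredicate p) {
        AtomicInteger top = new AtomicInteger(v.length);
        Thread even = new Thread(() -> search(v, p, top, 0));
        Thread odd = new Thread(() -> search(v, p, top, 1));
        even.start();
        odd.start();
        try {
            even.join();
            odd.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return top.get();
    }

    // Scans indices start, start + 2, ... while they are below the shared bound.
    // Rely: the other thread only ever lowers top. Guarantee: so does this one.
    private static void search(int[] v, IntPredicate p, AtomicInteger top, int start) {
        for (int i = start; i < top.get(); i += 2) {
            if (p.test(v[i])) {
                top.accumulateAndGet(i, Math::min); // atomic top := min(i, top)
                return;
            }
        }
    }
}
```

Because top only decreases, a thread that stops early has still examined every index of its partition below the final value of top, which is the post-condition of each thread.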
Summary
Proving the correctness of multi-threaded programs is extremely difficult.
Researchers are proposing proof rules and building proof systems based on those rules.
How do we learn what a programmer has in mind, so as to generate guarantee-conditions and rely-conditions and prove those programs automatically?