Mutual Exclusion. Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit. © 2003 Herlihy and Shavit
Mutual Exclusion: We will clarify our understanding of mutual exclusion. We will also show you how to reason about various properties in an asynchronous concurrent setting.
This lecture covers a number of classical mutual exclusion algorithms that work by reading and writing shared memory. Although these algorithms are not used in practice, we study them because they provide an ideal introduction to the kinds of correctness issues that arise in every area of synchronization. Simple as they are, these algorithms display subtle properties that students should understand before they are ready to approach the design of truly practical techniques.
Mutual Exclusion: In his 1965 paper E. W. Dijkstra wrote: "Given in this paper is a solution to a problem which, to the knowledge of the author, has been an open question since at least 1962, irrespective of the solvability. [...] Although the setting of the problem might seem somewhat academic at first, the author trusts that anyone familiar with the logical problems that arise in computer coupling will appreciate the significance of the fact that this problem indeed can be solved."
Dijkstra was the first to provide a solution to the mutual exclusion problem.
Mutual Exclusion: Formal problem definitions; solutions for 2 threads; solutions for n threads; fair solutions; inherent costs.
First, we are going to talk about what mutual exclusion means. This might seem simple, almost trivial, but there are tricky aspects that will cause problems if we don’t understand them. Then we discuss 2-thread solutions. We like these because they are simpler than general solutions, so we can focus on important issues more clearly. Also, the code fits on a slide. Then we will be ready to move on to n-thread solutions. In the end, we will talk about the inherent costs of doing mutual exclusion in this way, and the implications of those costs.
Warning: You will never use these protocols. Get over it. You are advised to understand them: the same issues show up everywhere, except hidden and more complex.
These particular protocols will never be used in practice. We don’t care, because we are interested in understanding them, not using them. Only once we understand how and why they work, and their inherent limitations, will we be ready to move on to more practical protocols.
Why is Concurrent Programming so Hard? Try preparing a seven-course banquet: by yourself, with one friend, with twenty-seven friends, … Before we can talk about programs, we need a language for describing time and concurrency.
Time: “Absolute, true and mathematical time, of itself and from its own nature, flows equably without relation to anything external.” (I. Newton, 1689) “Time is, like, Nature’s way of making sure that everything doesn’t happen all at once.” (Anonymous, circa 1968)
For the next set of slides, please read the discussion about time in the textbook. Do not read this slide out; let the students read it while you talk. Instead, explain that in our systems there is no time in the sense of a global clock in the sky that all threads can read and use to relate to each other. Rather, we think of the world in terms of the ordering among events, and we use time only as a tool for explaining this ordering. In other words, there is a notion of time, but it is local and not global. In fact, we already know this from general relativity.
Events: An event a0 of thread A is instantaneous. There are no simultaneous events (break ties). [Figure: event a0 on a time axis]
Threads: A thread A is (formally) a sequence a0, a1, ... of events (the “trace” model). Notation: a0 → a1 indicates order. [Figure: events a0, a1, a2, … on a time axis]
Example Thread Events: assign to shared variable; assign to local variable; invoke method; return from method; lots of other things …
Threads are State Machines: Events are transitions. [Figure: state machine with transitions a0, a1, a2, a3]
States: Thread state consists of the program counter and local variables. System state consists of object fields (shared variables) and the union of thread states.
Concurrency [Figure: Thread A and Thread B, each with its own events on a time axis]
Interleavings: Events of two or more threads are interleaved, and are not necessarily independent (why?).
Intervals: An interval A0 = (a0,a1) is the time between events a0 and a1.
Intervals may Overlap [Figure: interval B0 = (b0,b1) overlapping interval A0 = (a0,a1) on a time axis]
Intervals may be Disjoint [Figure: intervals A0 = (a0,a1) and B0 = (b0,b1) with no overlap on a time axis]
Precedence: Interval A0 precedes interval B0. [Figure: A0 = (a0,a1) ending before B0 = (b0,b1) begins]
Precedence: Notation: A0 → B0. Formally, the end event of A0 is before the start event of B0. Also called “happens before” or “precedes”.
The “arrow” (happens-before) notation was introduced by Leslie Lamport, a distributed computing pioneer.
Precedence Ordering: Remark: A0 → B0 is just like saying 1066 AD → 1492 AD, or Middle Ages → Renaissance. Oh wait, what about this week vs this month?
This week is concurrent with this month, which leads into the next slide.
Precedence Ordering: Never true that A → A. If A → B then not true that B → A. If A → B & B → C then A → C. Funny thing: A → B & B → A might both be false!
In this slide we deal with concurrent events.
Partial Orders (review): Irreflexive: never true that A → A. Antisymmetric: if A → B then not true that B → A. Transitive: if A → B & B → C then A → C.
Notice that we use a definition of antisymmetric that is based on our earlier definition of irreflexive. Also notice that it could be that neither A → B nor B → A holds.
Total Orders (review): Also irreflexive, antisymmetric, and transitive, except that for every distinct A, B, either A → B or B → A.
We strengthen antisymmetry, which allowed that neither A → B nor B → A holds, by requiring that one of the two always hold.
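The precedence relation on intervals can be sketched in a few lines of Java. This is only an illustration: the Interval class and its integer timestamps are assumptions made for the example (the slides deliberately avoid a global clock and use time only to explain ordering).

```java
// Hypothetical helper: an interval is a pair of event times, start before end.
final class Interval {
    final long start;
    final long end;

    Interval(long start, long end) {
        if (start >= end) {
            throw new IllegalArgumentException("start event must come before end event");
        }
        this.start = start;
        this.end = end;
    }

    // A -> B holds when A's end event comes before B's start event.
    boolean precedes(Interval other) {
        return this.end < other.start;
    }

    // Neither interval precedes the other.
    boolean concurrentWith(Interval other) {
        return !this.precedes(other) && !other.precedes(this);
    }
}

class PrecedenceDemo {
    public static void main(String[] args) {
        Interval a = new Interval(0, 10);
        Interval b = new Interval(5, 15);   // overlaps a
        Interval c = new Interval(20, 30);  // starts after a ends
        System.out.println("a -> c: " + a.precedes(c));
        System.out.println("a -> b: " + a.precedes(b));
        System.out.println("b -> a: " + b.precedes(a));
    }
}
```

Here a and c are ordered, while a and b are unordered in both directions, which is why precedence on intervals is only a partial order.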
Repeated Events: while (mumble) { a0; a1; } a0^k denotes the k-th occurrence of event a0. A0^k denotes the k-th occurrence of interval A0 = (a0,a1).
Implementing a Counter
    public class Counter {
      private long value;
      public long getAndIncrement() {
        long temp = value;
        value = temp + 1;
        return temp;
      }
    }
Make these steps indivisible using locks.
Locks (Mutual Exclusion)
    public interface Lock {
      public void lock();    // acquire lock
      public void unlock();  // release lock
    }
This is the formal definition of a mutual exclusion lock object in Java. Threads using the lock() and unlock() methods must follow a specific format: first lock(), then unlock().
Using Locks
    public class Counter {
      private long value;
      private Lock lock;
      public long getAndIncrement() {
        long temp;
        lock.lock();             // acquire lock
        try {
          temp = value;          // critical section
          value = temp + 1;
        } finally {
          lock.unlock();         // release lock (no matter what)
        }
        return temp;
      }
    }
In Java, these methods should be used in the following structured way:
    mutex.lock();
    try {
      ... // body
    } finally {
      mutex.unlock();
    }
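A runnable version of the locked counter might look like the sketch below. It assumes java.util.concurrent.locks.ReentrantLock as a stand-in for the lock implementations developed later in the lecture; the class name LockedCounter and the run helper are illustrative only.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class LockedCounter {
    private long value = 0;
    // Assumption: a library lock stands in for the lecture's own Lock implementations.
    private final Lock lock = new ReentrantLock();

    long getAndIncrement() {
        lock.lock();                // acquire lock
        try {
            long temp = value;      // critical section
            value = temp + 1;
            return temp;
        } finally {
            lock.unlock();          // release lock (no matter what)
        }
    }

    // Starts the given number of threads, each incrementing perThread times,
    // waits for them, and returns the number of increments observed.
    static long run(int threads, int perThread) {
        LockedCounter counter = new LockedCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) {
                    counter.getAndIncrement();
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        // getAndIncrement returns the value before this final increment.
        return counter.getAndIncrement();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10_000));
    }
}
```

Because every read-modify-write of value happens between lock() and unlock(), no increment is lost, so run(4, 10_000) reports 40000.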
Mutual Exclusion: Let CS_i^k be thread i’s k-th critical section execution, and CS_j^m be thread j’s m-th critical section execution. Then either CS_i^k → CS_j^m or CS_j^m → CS_i^k.
Deadlock-Free: If some thread calls lock() and never returns, then other threads must complete lock() and unlock() calls infinitely often. The system as a whole makes progress, even if individuals starve.
At least one other thread is completing infinitely often, since there are only a finite number of threads. This is a “Stalinistic” approach.
Starvation-Free: If some thread calls lock(), it will eventually return. Individual threads make progress.
This is a more democratic approach: we want every individual to be happy, not just the system as a whole. Note that we must assume that no thread stays in the critical section forever.
Two-Thread vs n-Thread Solutions: 2-thread solutions first (they illustrate the most basic ideas and fit on one slide), then n-thread solutions.
Two-Thread Conventions
    class … implements Lock {
      …
      // thread-local index, 0 or 1
      public void lock() {
        int i = ThreadID.get();
        int j = 1 - i;
        …
      }
    }
Henceforth: i is the current thread, j is the other thread.
LockX will be the lock method we use. We will not mention the definitions of i and j again.
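The slides use ThreadID.get() without showing it. One way such a helper could be written is sketched below. The implementation is an assumption for illustration, not the authors' code: each thread is given the next unused index the first time it calls get(), so the indices are 0 and 1 only if exactly two threads ever call it.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a thread-local index helper.
final class ThreadID {
    // Next index to hand out; shared by all threads.
    private static final AtomicInteger nextId = new AtomicInteger(0);

    // Each thread lazily receives its own index on first use.
    private static final ThreadLocal<Integer> id =
            ThreadLocal.withInitial(nextId::getAndIncrement);

    private ThreadID() {}

    static int get() {
        return id.get();
    }
}

class ThreadIDDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread other = new Thread(() ->
                System.out.println("other thread index: " + ThreadID.get()));
        System.out.println("main thread index: " + ThreadID.get());
        other.start();
        other.join();
    }
}
```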
LockOne
    class LockOne implements Lock {
      private boolean[] flag = new boolean[2];  // each thread has a flag
      public void lock() {
        flag[i] = true;       // set my flag
        while (flag[j]) {}    // wait for other flag to become false
      }
      …
    }
LockOne Satisfies Mutual Exclusion: Assume CS_A^j overlaps CS_B^k. Consider each thread's last (j-th and k-th) read and write in the lock() method before entering. Derive a contradiction.
By way of contradiction, we assume that both threads are in their critical sections during overlapping intervals.
From the Code: writeA(flag[A]=true) → readA(flag[B]==false) → CSA, and writeB(flag[B]=true) → readB(flag[A]==false) → CSB.
    public void lock() {
      flag[i] = true;
      while (flag[j]) {}
    }
From the Assumption: readA(flag[B]==false) → writeB(flag[B]=true), and readB(flag[A]==false) → writeA(flag[A]=true).
Since A is in the CS, it did not see B’s flag, and vice versa.
Combining
Assumptions: readA(flag[B]==false) → writeB(flag[B]=true); readB(flag[A]==false) → writeA(flag[A]=true).
From the code: writeA(flag[A]=true) → readA(flag[B]==false); writeB(flag[B]=true) → readB(flag[A]==false).
Cycle! Impossible in a partial order: writeA(flag[A]=true) → readA(flag[B]==false) → writeB(flag[B]=true) → readB(flag[A]==false) → writeA(flag[A]=true).
This is the same proof we did for Alice and Bob in the earlier lecture, but this time done formally. Notice that we derive the contradiction from the cycle here, which is slightly different from the way this is done in the book.
Deadlock Freedom: LockOne fails deadlock-freedom. Concurrent execution can deadlock; sequential executions are OK.
    flag[i] = true;         flag[j] = true;
    while (flag[j]) {}      while (flag[i]) {}
The LockOne algorithm is subject to deadlock: if each thread sets its flag to true and waits for the other, they will wait forever, which is the classic definition of a deadlock. Recall that when we looked at the story of Alice and Bob and their pets, this was exactly the first, broken protocol we considered.
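The bad interleaving can be replayed deterministically in a single thread, without actually hanging anything. The sketch below is an illustration only (the class and method names are made up for the example): it performs the two writes and then evaluates each thread's loop condition once.

```java
class LockOneReplay {
    // Replays "both threads set their flag, then both test the other's flag".
    static boolean bothSpinWhenInterleaved() {
        boolean[] flag = new boolean[2];
        flag[0] = true;                  // thread 0: flag[i] = true
        flag[1] = true;                  // thread 1: flag[i] = true
        boolean thread0Spins = flag[1];  // thread 0 tests while (flag[j])
        boolean thread1Spins = flag[0];  // thread 1 tests while (flag[j])
        return thread0Spins && thread1Spins;
    }

    // Replays thread 0 running lock() while thread 1 is absent.
    static boolean spinsWhenAlone() {
        boolean[] flag = new boolean[2];
        flag[0] = true;                  // thread 0: flag[i] = true
        return flag[1];                  // thread 0 tests while (flag[j])
    }

    public static void main(String[] args) {
        System.out.println("interleaved, both stuck: " + bothSpinWhenInterleaved());
        System.out.println("alone, stuck: " + spinsWhenAlone());
    }
}
```

In the interleaved replay both conditions are true, and since a spinning thread never clears its own flag, neither condition can ever become false. Alone, the condition is false and the thread enters immediately.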
LockTwo
    public class LockTwo implements Lock {
      private int victim;
      public void lock() {
        victim = i;               // let other go first
        while (victim == i) {}    // wait for permission
      }
      public void unlock() {}     // nothing to do
    }
LockTwo Claims: Satisfies mutual exclusion: if thread i is in the CS then victim == j, and victim cannot be both 0 and 1. Not deadlock free: sequential execution deadlocks; concurrent execution does not.
    public void lock() {
      victim = i;
      while (victim == i) {}
    }
Sequential execution deadlocks because a thread arriving alone will not get in. Concurrent execution is OK since one thread always allows the other to have priority.
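The two LockTwo claims can be replayed in a single thread in the same spirit. This is a sketch for illustration, with made-up class and method names: it performs the victim writes in a fixed order and evaluates each loop condition once.

```java
class LockTwoReplay {
    // Replays thread 0 running lock() alone.
    static boolean spinsWhenAlone() {
        int i = 0;
        int victim = i;        // thread 0: victim = i
        return victim == i;    // thread 0 tests while (victim == i)
    }

    // Replays thread 0 writing victim first and thread 1 writing it last.
    // Returns {thread 0 spins, thread 1 spins}.
    static boolean[] spinStatusWhenBothArrive() {
        int victim;
        victim = 0;            // thread 0: victim = i
        victim = 1;            // thread 1: victim = i (last writer)
        return new boolean[] { victim == 0, victim == 1 };
    }

    public static void main(String[] args) {
        System.out.println("alone, stuck: " + spinsWhenAlone());
        boolean[] status = spinStatusWhenBothArrive();
        System.out.println("thread 0 stuck: " + status[0] + ", thread 1 stuck: " + status[1]);
    }
}
```

A solo thread is its own victim and waits forever. When both arrive, the first writer is released by the second, and the second waits until the first calls lock() again.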
Peterson’s Algorithm
    public void lock() {
      flag[i] = true;                      // announce I'm interested
      victim = i;                          // defer to other
      while (flag[j] && victim == i) {};   // wait while other interested & I'm the victim
    }
    public void unlock() {
      flag[i] = false;                     // no longer interested
    }
We now combine the two algorithms, one that does not deadlock when the threads are concurrent, and the other that does not deadlock when they are sequential, to derive an algorithm that never deadlocks.
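A runnable sketch of Peterson's algorithm in Java follows. Several details are assumptions of this example rather than part of the slides: the thread index is passed explicitly instead of coming from ThreadID.get(), the flags live in an AtomicIntegerArray and victim is volatile (plain fields would not give the sequentially consistent reads and writes the proof relies on), and the loop calls Thread.yield() while waiting. The shared counter is volatile too, so the only thing the lock has to provide is that the read and the write of one increment are not interleaved with another thread's.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

class PetersonLock {
    private final AtomicIntegerArray flag = new AtomicIntegerArray(2); // 1 = interested
    private volatile int victim;

    void lock(int i) {
        int j = 1 - i;
        flag.set(i, 1);                       // announce I'm interested
        victim = i;                           // defer to other
        while (flag.get(j) == 1 && victim == i) {
            Thread.yield();                   // wait while other interested & I'm the victim
        }
    }

    void unlock(int i) {
        flag.set(i, 0);                       // no longer interested
    }
}

class PetersonDemo {
    private volatile int counter = 0;

    // Two threads each perform perThread increments under the lock.
    static int run(int perThread) {
        PetersonDemo demo = new PetersonDemo();
        PetersonLock lock = new PetersonLock();
        Thread[] threads = new Thread[2];
        for (int id = 0; id < 2; id++) {
            final int me = id;
            threads[id] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) {
                    lock.lock(me);
                    try {
                        demo.counter = demo.counter + 1;  // not atomic without the lock
                    } finally {
                        lock.unlock(me);
                    }
                }
            });
            threads[id].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return demo.counter;
    }

    public static void main(String[] args) {
        System.out.println(run(100_000));
    }
}
```

With mutual exclusion in place, no increment is lost, so run(100_000) returns 200000.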
Mutual Exclusion, from the Code: (1) writeB(flag[B]=true) → writeB(victim=B)
Also from the Code: (2) writeA(victim=A) → readA(flag[B]) → readA(victim)
Assumption: (3) writeB(victim=B) → writeA(victim=A). W.L.O.G. assume A is the last thread to write victim.
Combining Observations: (1) writeB(flag[B]=true) → writeB(victim=B); (3) writeB(victim=B) → writeA(victim=A); (2) writeA(victim=A) → readA(flag[B]) → readA(victim).
writeB(victim=B) and writeA(victim=A) each appear twice, so remove the duplicates: writeB(flag[B]=true) → writeB(victim=B) → writeA(victim=A) → readA(flag[B]) → readA(victim).
A read flag[B] == true and victim == A, so it could not have entered the CS (QED).
Deadlock Free: A thread is blocked only at the while loop, while (flag[j] && victim == i) {}, only if the other’s flag is true, and only if it is the victim. Solo: the other’s flag is false. Both: one or the other is not the victim. See proof details in the book.
Starvation Free: Thread i is blocked only if j repeatedly re-enters so that flag[j] == true and victim == i. But when j re-enters, it sets victim to j, so i gets in. See proof details in the book.
The Filter Algorithm for n Threads: There are n-1 “waiting rooms” called levels. At each level, at least one thread enters the level, and at least one is blocked if many try. Only one thread makes it through. [Figure: levels between the non-critical section (ncs) and the critical section (cs)]
We now consider two mutual exclusion protocols that work for n threads, where n is greater than 2. The first solution, the Filter lock, is a direct generalization of the Peterson lock to multiple threads. The second solution, the Bakery lock, is perhaps the simplest and best known n-thread solution. The Filter lock creates n-1 “waiting rooms”, called levels, that a thread must traverse before acquiring the lock.
Filter
    class Filter implements Lock {
      int[] level;   // level[i] for thread i
      int[] victim;  // victim[L] for level L
      public Filter(int n) {
        level = new int[n];
        victim = new int[n];
        for (int i = 0; i < n; i++) {
          level[i] = 0;
        }
      }
      …
    }
[Figure: the level and victim arrays, with thread 2 at level 4]
There are n-1 levels that threads pass through, the last of which is the critical section. At most n threads pass concurrently into level 0, n-1 into level 1 (a thread in level 1 is already in level 0), n-2 into level 2, and so on, so that at most one thread enters level n-1.
Filter
    class Filter implements Lock {
      …
      public void lock() {
        for (int L = 1; L < n; L++) {   // one level at a time
          level[i] = L;                 // announce intention to enter level L
          victim[L] = i;                // give priority to anyone but me
          // wait as long as someone else is at the same or a higher level,
          // and I'm the designated victim
          while (((∃ k != i) level[k] >= L) && victim[L] == i) {};  // busy wait
        }                               // thread enters level L when it completes the loop
      }
      public void unlock() {
        level[i] = 0;
      }
    }
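The existential condition in the loop is pseudocode. A runnable sketch of the Filter lock, with that condition spelled out as a scan over the level array, could look like this. As assumptions of the example: the thread index is passed explicitly, AtomicIntegerArray supplies sequentially consistent reads and writes, the wait loop calls Thread.yield(), and the shared counter is volatile so that only the atomicity of each increment depends on the lock.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

class FilterLock {
    private final int n;
    private final AtomicIntegerArray level;   // level[i] for thread i
    private final AtomicIntegerArray victim;  // victim[L] for level L

    FilterLock(int n) {
        this.n = n;
        this.level = new AtomicIntegerArray(n);   // all zeros initially
        this.victim = new AtomicIntegerArray(n);
    }

    void lock(int i) {
        for (int L = 1; L < n; L++) {             // one level at a time
            level.set(i, L);                      // announce intention to enter level L
            victim.set(L, i);                     // give priority to anyone but me
            while (someoneElseAtOrAbove(i, L) && victim.get(L) == i) {
                Thread.yield();                   // busy wait
            }
        }
    }

    // The "exists k != i with level[k] >= L" test from the slide.
    private boolean someoneElseAtOrAbove(int i, int L) {
        for (int k = 0; k < n; k++) {
            if (k != i && level.get(k) >= L) {
                return true;
            }
        }
        return false;
    }

    void unlock(int i) {
        level.set(i, 0);
    }
}

class FilterDemo {
    private volatile int counter = 0;

    // n threads each perform perThread increments under the lock.
    static int run(int n, int perThread) {
        FilterDemo demo = new FilterDemo();
        FilterLock lock = new FilterLock(n);
        Thread[] threads = new Thread[n];
        for (int id = 0; id < n; id++) {
            final int me = id;
            threads[id] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) {
                    lock.lock(me);
                    try {
                        demo.counter = demo.counter + 1;  // not atomic without the lock
                    } finally {
                        lock.unlock(me);
                    }
                }
            });
            threads[id].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return demo.counter;
    }

    public static void main(String[] args) {
        System.out.println(run(4, 5_000));
    }
}
```

With n = 2 this reduces to Peterson's algorithm: there is a single level, and level[i] == 1 plays the role of flag[i].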
Claim: Start at level L=0. At most n-L threads enter level L. Mutual exclusion at level L=n-1. [Figure: levels L=0 (ncs), L=1, …, L=n-2, L=n-1 (cs)]
Induction Hypothesis: No more than n-(L-1) threads at level L-1. Induction step, by contradiction: assume all threads at level L-1 enter level L; A is the last to write victim[L]; B is any other thread at level L. [Figure: assume level L-1 has n-(L-1) threads; prove level L has n-L]
Read the details of the proof in the book. We are going to show that, assuming no more than n-L+1 threads at level L-1, we can prevent one more thread from getting in. Show the detail in the next slide.
Proof Structure: [Figure: n-L+1 = 4 threads assumed to enter level L-1; by way of contradiction all enter level L; A is the last to write victim[L], B is any other thread]
Show that A must have seen B at level L, and since victim[L] == A, it could not have entered.
Just Like Peterson, from the Code: (1) writeB(level[B]=L) → writeB(victim[L]=B)
From the Code: (2) writeA(victim[L]=A) → readA(level[B]) → readA(victim[L])
By Assumption: (3) writeB(victim[L]=B) → writeA(victim[L]=A). By assumption, A is the last thread to write victim[L].
Combining Observations: (1) writeB(level[B]=L) → writeB(victim[L]=B); (3) writeB(victim[L]=B) → writeA(victim[L]=A); (2) writeA(victim[L]=A) → readA(level[B]) → readA(victim[L]).
Chained together: writeB(level[B]=L) → writeB(victim[L]=B) → writeA(victim[L]=A) → readA(level[B]) → readA(victim[L]).
A read level[B] ≥ L and victim[L] = A, so it could not have entered level L!
No Starvation: The Filter lock satisfies these properties: it is just like Peterson’s algorithm at any level, so no one starves. But what about fairness? Threads can be overtaken by others.
Bounded Waiting: We want stronger fairness guarantees: a thread should not be “overtaken” too much. If A starts before B, then A enters before B? But what does “start” mean? We need to adjust definitions.
If “start” means the first operation and it is a write, then two threads writing cannot tell who wrote first. If the first operation is a read, they cannot tell who read first.
Bounded Waiting: Divide the lock() method into 2 parts. Doorway interval: written D_A, always finishes in a finite number of steps. Waiting interval: written W_A, may take an unbounded number of steps.
Stand in the class doorway when explaining this part! (Lots of giggles, and they will remember what you explain.) It would be great if we could order threads by the order in which they performed the first step of the lock() method.
First-Come-First-Served © 2003 Herlihy and Shavit First-Come-First-Served For threads A and B: If DAk DB j A’s k-th doorway precedes B’s j-th doorway Then CSAk CSBj A’s k-th critical section precedes B’s j-th critical section B cannot overtake A Art of Multiprocessor Programming
Art of Multiprocessor Programming © 2003 Herlihy and Shavit Fairness Again Filter Lock satisfies properties: No one starves But very weak fairness Can be overtaken arbitrary # of times That’s pretty lame… How could it be that there is no lockout and yet there is no r such that the protocol is r-bounded? The answer is that for any given execution i there is some bound r_i, but not an r for all executions. Art of Multiprocessor Programming
Bakery Algorithm
Provides First-Come-First-Served
How? Take a “number”, then wait until lower numbers have been served
Lexicographic order: (a,i) > (b,j) if a > b, or a = b and i > j

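The lexicographic order above can be written as a small predicate. This is only an illustrative sketch: the class name LexOrder and the method greater are hypothetical and do not appear in the slides.

```java
// Hypothetical helper, not from the slides: compares (label, id) pairs.
final class LexOrder {
    /** Returns true when (a, i) > (b, j) in lexicographic order. */
    static boolean greater(int a, int i, int b, int j) {
        return a > b || (a == b && i > j);
    }

    public static void main(String[] args) {
        System.out.println(greater(5, 0, 4, 3)); // larger label: true
        System.out.println(greater(4, 3, 4, 1)); // equal labels, larger id: true
        System.out.println(greater(4, 1, 4, 3)); // equal labels, smaller id: false
    }
}
```

Because thread ids are distinct, two different threads never compare as equal, which is what lets the id act as a tie-breaker.
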
Bakery Algorithm
class Bakery implements Lock {
  boolean[] flag;
  int[] label;
  public Bakery (int n) {
    flag = new boolean[n];
    label = new int[n];
    for (int i = 0; i < n; i++) {
      flag[i] = false; label[i] = 0;
    }
  }
  …
[Figure: the flag array (f/t entries) and the label array, indexed up to n-1, beside the CS]
Notes: Two arrays, one of flags just as in all other algorithms, and one of labels.

Bakery Algorithm
class Bakery implements Lock {
  …
  public void lock() {
    // Doorway
    flag[i] = true;                                // I’m interested
    label[i] = max(label[0], …, label[n-1]) + 1;   // take increasing label (read labels in some arbitrary order)
    // Waiting: someone is interested … whose (label,i) in lexicographic order is lower
    while (∃k flag[k] && (label[i],i) > (label[k],k));
  }
}

Bakery Algorithm
class Bakery implements Lock {
  …
  public void unlock() {
    flag[i] = false;   // no longer interested
  }
}
Labels are always increasing

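For readers who want to run the algorithm, here is a minimal Java sketch of the lock() and unlock() methods shown on the slides. Several things are assumptions rather than part of the slides: thread ids 0..n-1 are passed in explicitly, AtomicIntegerArray stands in for the atomic registers, the ∃k test becomes the helper someoneAhead, and the demo method is purely illustrative.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

// Sketch only; ids are passed explicitly and AtomicIntegerArray models the registers.
final class BakeryLock {
    private final int n;
    private final AtomicIntegerArray flag;  // 1 = interested, 0 = not interested
    private final AtomicIntegerArray label; // int labels, so they can overflow

    BakeryLock(int n) {
        this.n = n;
        flag = new AtomicIntegerArray(n);   // all entries start at 0
        label = new AtomicIntegerArray(n);
    }

    void lock(int i) {
        // Doorway: announce interest, then take a label above every label read.
        flag.set(i, 1);
        int max = 0;
        for (int k = 0; k < n; k++) {
            max = Math.max(max, label.get(k));
        }
        label.set(i, max + 1);
        // Waiting: spin while an interested thread holds a lower (label, id) pair.
        while (someoneAhead(i)) {
            Thread.yield();
        }
    }

    void unlock(int i) {
        flag.set(i, 0); // no longer interested; the label is left in place
    }

    // Models: exists k with flag[k] && (label[i], i) > (label[k], k).
    private boolean someoneAhead(int i) {
        int mine = label.get(i);
        for (int k = 0; k < n; k++) {
            if (k == i) continue;
            if (flag.get(k) == 1) {
                int theirs = label.get(k);
                if (theirs < mine || (theirs == mine && k < i)) return true;
            }
        }
        return false;
    }

    // Illustrative demo: each thread increments a plain counter under the lock.
    static int demo(int threads, int rounds) {
        BakeryLock lock = new BakeryLock(threads);
        int[] counter = {0};
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int r = 0; r < rounds; r++) {
                    lock.lock(id);
                    try {
                        counter[0]++;
                    } finally {
                        lock.unlock(id);
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(demo(4, 1000)); // no update is lost if mutual exclusion holds
    }
}
```

The sketch keeps the slide's structure: the first two writes form the doorway and the spin loop is the waiting interval. Plain Java arrays would not be enough here, because ordinary array reads and writes give no visibility or ordering guarantees between threads.
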
No Deadlock
There is always one thread with the earliest label
Ties are impossible (why?)

First-Come-First-Served
class Bakery implements Lock {
  public void lock() {
    flag[i] = true;
    label[i] = max(label[0], …, label[n-1]) + 1;
    while (∃k flag[k] && (label[i],i) > (label[k],k));
  }
}
If D_A → D_B, then A’s label is smaller
And: write_A(label[A]) → read_B(label[A]) → write_B(label[B]) → read_B(flag[A])
So B sees a smaller label for A and is locked out while flag[A] is true
Notes: If D_A → D_B then A’s label is earlier. By the code, B reads A’s label, then writes its own, then checks A’s flag and compares A’s label to its own. Moreover, A writes its label before B can read it, so if A has an earlier (lower) label, B will see it and not go in.

Mutual Exclusion
Suppose A and B are in the CS together
Suppose A has the earlier label
When B entered, it must have seen flag[A] is false, or label[A] > label[B]
class Bakery implements Lock {
  public void lock() {
    flag[i] = true;
    label[i] = max(label[0], …, label[n-1]) + 1;
    while (∃k flag[k] && (label[i],i) > (label[k],k));
  }
}

Mutual Exclusion
Labels are strictly increasing, so B must have seen flag[A] == false
Labeling_B → read_B(flag[A]) → write_A(flag[A]) → Labeling_A
Which contradicts the assumption that A has an earlier label

Bakery Y2^32K Bug
Mutex breaks if label[i] overflows
class Bakery implements Lock {
  …
  public void lock() {
    flag[i] = true;
    label[i] = max(label[0], …, label[n-1]) + 1;
    while (∃k flag[k] && (label[i],i) > (label[k],k));
  }
}

Does Overflow Actually Matter?
Yes: Y2K; 19 January 2038 (Unix time_t rollover); 16-bit counters
No: 64-bit counters
Maybe: 32-bit counters

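A short sketch of why overflow breaks the ordering when labels are 32-bit Java ints. The class LabelOverflow and the method nextLabel are hypothetical names used only for illustration.

```java
// Hypothetical illustration: the doorway rule applied at the top of the int range.
final class LabelOverflow {
    // Same rule as the doorway: one more than the largest label seen.
    static int nextLabel(int maxSeen) {
        return maxSeen + 1; // Java int arithmetic wraps silently
    }

    public static void main(String[] args) {
        int old = Integer.MAX_VALUE;
        int fresh = nextLabel(old);
        System.out.println(fresh == Integer.MIN_VALUE); // true: the label wrapped
        System.out.println(fresh < old);                // true: the newcomer looks earliest
    }
}
```

After the wrap, the most recent arrival compares as lower than every waiting thread, so the "labels are strictly increasing" step of the mutual exclusion argument no longer holds.
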
Timestamps
The label variable is really a timestamp
Need the ability to:
Read others’ timestamps
Compare them
Generate a later timestamp
Can we do this without overflow?

The Good News
One can construct a wait-free (no mutual exclusion) concurrent timestamping system that never overflows
The Bad News
This part is hard

Instead …
We construct a sequential timestamping system
Same basic idea, but simpler
As if we use a mutex to read & write atomically
No good for building locks, but useful anyway

Precedence Graphs
[Figure: precedence graph, marked 1, 2, 3]
Timestamps form a directed graph
An edge from x to y means x is the later timestamp
We say x dominates y

Unbounded Counter Precedence Graph
[Figure: tokens 1, 2, 3 on the counter graph; one takes 0, one takes 1, one takes 2]
Timestamping = move tokens on graph
Atomically: read others’ tokens, move mine
Ignore tie-breaking for now

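The token-moving rule for the unbounded counter graph can be sketched as follows. This is a hedged illustration, not the authors' code: the class name UnboundedTimestamps is hypothetical, threads are numbered from 0, and a synchronized method stands in for "as if we use a mutex to read and write atomically".

```java
// Sketch of a sequential, unbounded timestamping system (names are hypothetical).
final class UnboundedTimestamps {
    private final long[] token; // token[t] = node currently held by thread t

    UnboundedTimestamps(int threads) {
        token = new long[threads]; // every token starts on node 0
    }

    // synchronized models "atomically read others' tokens, move mine".
    synchronized long take(int me) {
        long max = 0;
        for (long t : token) {
            max = Math.max(max, t);
        }
        token[me] = max + 1; // move one node past every token seen
        return token[me];
    }

    public static void main(String[] args) {
        UnboundedTimestamps ts = new UnboundedTimestamps(3);
        System.out.println(ts.take(0)); // 1
        System.out.println(ts.take(1)); // 2
        System.out.println(ts.take(0)); // 3
    }
}
```

Each call yields a timestamp that dominates all earlier ones, but the counter grows without bound, which is the problem the bounded graphs on the following slides address.
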
Two-Thread Bounded Precedence Graph T2
[Figure: animation of tokens 1 and 2 taking turns moving on the graph]
and so on …

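Since the figure did not survive extraction, the sketch below assumes that T2 is a directed cycle on the three nodes 0, 1, 2 in which x dominates y exactly when x = (y + 1) mod 3. The class name, the initial token placement, and the numbering of threads as 0 and 1 are likewise assumptions made for illustration.

```java
// Sketch of two-thread bounded sequential timestamps on an assumed 3-cycle T2.
final class TwoThreadTimestamps {
    private final int[] token = {0, 1}; // thread 1 starts as the later one

    // Assumed edge rule: x dominates y when x is one step ahead of y on the cycle.
    static boolean dominates(int x, int y) {
        return x == (y + 1) % 3;
    }

    // synchronized models the atomic "read the other's token, move mine".
    synchronized int take(int me) {
        token[me] = (token[1 - me] + 1) % 3;
        return token[me];
    }

    synchronized int current(int thread) {
        return token[thread];
    }

    public static void main(String[] args) {
        TwoThreadTimestamps ts = new TwoThreadTimestamps();
        for (int step = 0; step < 6; step++) {
            int me = step % 2;
            int mine = ts.take(me);
            System.out.println("thread " + me + " takes " + mine
                + ", dominates other: " + dominates(mine, ts.current(1 - me)));
        }
    }
}
```

Only three label values are ever used, yet whichever thread moved last always dominates the other, so nothing overflows.
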
Three-Thread Bounded Precedence Graph?
[Figure: tokens 1, 2 and 3 on the two-thread graph]
What to do if one thread gets stuck?

Graph Composition
T^3 = T^2 * T^2
Replace each vertex with a copy of the graph
[Figure: two copies of the graph with tokens 1 and 2]

Three-Thread Bounded Precedence Graph T3
[Figure: animation of tokens 1 and 2 moving on T3]
20 < 02 < 12
and so on…

In General
T^k = T^2 * T^(k-1)
k threads need at most 3^k nodes
Label size = log2(3^k) = k·log2(3) ≤ 2k bits

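Comparing labels in the composed graph can be sketched as below, under stated assumptions: a label is a string of ternary digits with the most significant digit first, and each digit position follows the assumed T2 rule that x dominates y when x = (y + 1) mod 3. The class and method names are hypothetical.

```java
// Sketch: dominance between labels of a composed graph, most significant digit first.
final class BoundedLabels {
    /** x and y must be equal-length strings over the digits 0, 1, 2. */
    static boolean dominates(String x, String y) {
        for (int d = 0; d < x.length(); d++) {
            int a = x.charAt(d) - '0';
            int b = y.charAt(d) - '0';
            if (a != b) {
                return a == (b + 1) % 3; // first differing digit decides, by the cycle rule
            }
        }
        return false; // identical labels: neither dominates
    }

    /** Bits used when every ternary digit is stored in two bits. */
    static int bits(int digits) {
        return 2 * digits;
    }

    public static void main(String[] args) {
        System.out.println(dominates("02", "20")); // true, matching 20 < 02
        System.out.println(dominates("12", "02")); // true, matching 02 < 12
        System.out.println(bits(2));               // 4
    }
}
```

Under these assumptions the first two results agree with the ordering 20 < 02 < 12 shown for T3. The sketch covers comparison only; choosing the next label in T^k is more involved and is not attempted here.
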
Deep Philosophical Question
The Bakery Algorithm is succinct, elegant, and fair.
Q: So why isn’t it practical?
A: Well, you have to read N distinct variables

Shared Memory
Shared read/write memory locations are called registers (for historical reasons)
They come in different flavors:
Multi-Reader-Single-Writer (flag[])
Multi-Reader-Multi-Writer (victim[])
Not that interesting: SRMW and SRSW

Theorem
At least N MRSW (multi-reader/single-writer) registers are needed to solve deadlock-free mutual exclusion.
N registers like flag[]…
Notes: This proof is not easy for undergraduates, but it is worth doing so that they see the weakness of read-write registers: a write obliterates all traces of what other threads wrote.

Proving Algorithmic Impossibility
To show that no algorithm exists: assume, by way of contradiction, that one does, then show a bad execution that violates its properties
In our case, assume an algorithm for deadlock-free mutual exclusion using fewer than N registers
[Figure: thread C, a write, and the CS]
Notes: Explain the notion of a proof that NO ALGORITHM EXISTS by showing a bad execution.

Proof: Need N MRSW Registers
Each thread must write to some register
[Figure: threads A, B, C; writes; CS]
… otherwise the others can’t tell whether A is in the critical section
Notes: The bad execution has one thread go in; then at least one of the others must be able to get in, since the algorithm must provide deadlock-free mutual exclusion.

Upper Bound
The Bakery algorithm uses 2N MRSW registers
So the bound is (pretty) tight
But what if we use MRMW registers, like victim[]?

Bad News Theorem
At least N MRMW (multi-reader/multi-writer) registers are needed to solve deadlock-free mutual exclusion.
(So multiple writers don’t help)

Theorem (For 2 Threads)
Theorem: Deadlock-free mutual exclusion for 2 threads requires at least 2 multi-reader multi-writer registers
Proof: assume one register suffices and derive a contradiction
Notes: Show the proof for two threads first so students get its general structure. Without this they will have a hard time following the details of the three-thread case.

Two Thread Execution
[Figure: threads A and B, register R, Write(R), CS]
Threads run, reading and writing R
Deadlock-free, so at least one gets in

Covering State for One Register Always Exists
In any protocol, B has to write to the register before entering the CS, so stop it just before, poised at Write(R)
Notes: In any protocol, we run B without ever letting it write the register; we stop B when its next step is a write.

Proof: Assume Cover of 1
B is poised at Write(R)
A runs, possibly writes to the register, and enters the CS
B runs, first obliterating any trace of A, then also enters the critical section

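The bad execution can be replayed deterministically against a toy protocol. Everything in this sketch is hypothetical: the one-register protocol (write your id if R looks free, enter if R holds your id) is invented only to give the schedule something to break, and the interleaving is simulated in a single thread rather than run concurrently.

```java
// Replays the covering schedule against a made-up one-register protocol.
final class CoveringDemo {
    static final int FREE = 0, A = 1, B = 2;

    /** Returns how many threads end up inside the critical section. */
    static int replay() {
        int r = FREE; // the single shared register R
        int inCs = 0;

        // B runs until its next step is a write: it has read R == FREE and now covers R.
        boolean bPoisedToWrite = (r == FREE);

        // A runs solo: sees R free, writes its id, rereads it, enters the CS.
        if (r == FREE) r = A;
        if (r == A) inCs++;

        // B resumes with its pending write, obliterating any trace of A.
        if (bPoisedToWrite) r = B;
        if (r == B) inCs++; // to B, R looks exactly as if A had never run

        return inCs;
    }

    public static void main(String[] args) {
        System.out.println(replay()); // 2: both threads are in the CS together
    }
}
```

The toy protocol is not the point; the argument on the slide says the same schedule defeats any protocol that uses a single register, because B's covering write erases whatever A recorded.
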
Theorem
Deadlock-free mutual exclusion for 3 threads requires at least 3 multi-reader multi-writer registers

Proof: Assume Cover of 2
Only 2 registers: B is poised at Write(R_B), C is poised at Write(R_C)
Notes: Explain to the students that we will show how to derive a cover of the two registers later…

Run A Solo
B is poised at Write(R_B), C is poised at Write(R_C)
A writes to one or both registers and enters the CS

Obliterate Traces of A
B performs Write(R_B), C performs Write(R_C)
Other threads obliterate evidence that A entered the CS

Mutual Exclusion Fails
The CS looks empty, so another thread gets in

Proof Strategy
Proved: a contradiction starting from a covering state for 2 registers
Claim: a covering state for 2 registers is reachable from any state where the CS is empty

Covering State for Two
If we run B through the CS 3 times, B must return twice to cover some register, say R_B
Notes: This is based on the pigeonhole principle, and it’s not a specific register, just some register. Once the register is determined, we can call it R_B.

Covering State for Two
Start with B covering register R_B for the 1st time
Run A until it is about to write to uncovered R_A
Are we done?

Covering State for Two
NO! A could have written to R_B
So the CS no longer looks empty to thread C
Notes: The problem is the third thread, C, which can see what is written in these registers and not get in. This is why both registers must be in a state consistent with no one being in the critical section.

Covering State for Two
Run B, obliterating traces of A in R_B
Run B again until it is about to write to R_B
Now we are done

Inductively We Can Show
[Figure: threads A, B, C poised at Write(R_A), Write(R_B), Write(R_C)]
There is a covering state where k threads not in the CS cover k distinct registers
The proof follows when k = N-1

Summary of Lecture
In the 1960s several incorrect solutions to starvation-free mutual exclusion using RW-registers were published…
Today we know how to solve FIFO N-thread mutual exclusion using 2N RW-registers

Summary of Lecture
N RW-registers are inefficient, because writes “cover” older writes
Need stronger hardware operations that do not have the “covering problem”
In the next lectures: understand what these operations are…
Notes: Say that writes “covering” older writes is a fundamental problem with RW-registers that shows up everywhere…

This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.
You are free:
to Share: to copy, distribute and transmit the work
to Remix: to adapt the work
Under the following conditions:
Attribution. You must attribute the work to “The Art of Multiprocessor Programming” (but not in any way that suggests that the authors endorse you or your use of the work).
Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same, similar or a compatible license.
For any reuse or distribution, you must make clear to others the license terms of this work. The best way to do this is with a link to http://creativecommons.org/licenses/by-sa/3.0/.
Any of the above conditions can be waived if you get permission from the copyright holder.
Nothing in this license impairs or restricts the author's moral rights.