Multiprocess Synchronization Algorithms (20225241) The Mutual Exclusion problem Lecturer: Danny Hendler
The mutual exclusion problem (Dijkstra, 1965) We need to devise a protocol that guarantees mutually exclusive access by processes to a shared resource (such as a file, printer, etc.)
The problem model (reads/writes)
    Shared-memory multiprocessor: multiple processes
    Processes can apply atomic reads and writes to shared registers
    Completely asynchronous
Mutex: formal definition
    loop forever
        Remainder code
        Entry code
        Critical section (CS)
        Exit code
    end loop
Mutex Requirements
    Mutual exclusion: No two processes are in their CS at the same time.
    Deadlock-freedom: If a process is trying to enter its critical section, then some process eventually enters its critical section.
    Starvation-freedom (optional): If a process is trying to enter its critical section, then this process eventually enters its critical section.
An additional assumption Processes do not stop while performing the entry, CS, or exit code.
Incorrect algorithm 1
initially: turn=0
Program for process 0
    await turn=0
    CS of process 0
    turn:=1
Program for process 1
    await turn=1
    CS of process 1
    turn:=0
Does algorithm 1 satisfy mutex? Yes
Does it satisfy deadlock-freedom? No (if process 0 stays in its remainder code, process 1 waits forever for turn=1)
Incorrect algorithm 2
initially: lock=0
Program for both processes
    await lock=0
    lock:=1
    CS
    lock:=0
Does algorithm 2 satisfy mutex? No (both processes can read lock=0 before either writes lock:=1)
Does it satisfy deadlock-freedom? Yes
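The failure of algorithm 2 can be found mechanically. The following is a minimal sketch, not part of the original slides, that explores every interleaving of two processes by breadth-first search. The encoding of program counters (0 = at the await, 1 = about to write lock:=1, 2 = in the CS) and the names step and find_violation are assumptions made for illustration.

```python
from collections import deque

def step(state, p):
    """Return the state after process p takes one atomic step.

    A state is (pc of process 0, pc of process 1, lock).
    """
    pcs, lock = [state[0], state[1]], state[2]
    if pcs[p] == 0:        # await lock=0: one atomic read of lock
        if lock == 0:
            pcs[p] = 1
    elif pcs[p] == 1:      # lock:=1, then the process is in its CS
        lock = 1
        pcs[p] = 2
    else:                  # in the CS; the next step is the exit code lock:=0
        lock = 0
        pcs[p] = 0
    return (pcs[0], pcs[1], lock)

def find_violation():
    """Search all interleavings for a state with both processes in the CS."""
    start = (0, 0, 0)
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state[0] == 2 and state[1] == 2:
            trace = []
            while state is not None:   # walk back to the initial state
                trace.append(state)
                state = parent[state]
            return trace[::-1]
        for p in (0, 1):
            nxt = step(state, p)
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None

trace = find_violation()
print(trace)
```

The returned trace is a shortest run that breaks mutual exclusion: both processes pass the await before either of them sets the lock.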
Incorrect algorithm 3
initially: flag[0]=false, flag[1]=false
Program for process 0
    flag[0]:=true
    await flag[1]=false
    CS of process 0
    flag[0]:=false
Program for process 1
    flag[1]:=true
    await flag[0]=false
    CS of process 1
    flag[1]:=false
Does algorithm 3 satisfy mutex? Yes
Does it satisfy deadlock-freedom? No (if both processes raise their flags before either reaches its await, both wait forever)
Peterson’s 2-process algorithm (Peterson, 1981)
initially: b[0]=false, b[1]=false, turn=0 or 1
Program for process 0
    b[0]:=true
    turn:=0
    await (b[1]=false or turn=1)
    CS
    b[0]:=false
Program for process 1
    b[1]:=true
    turn:=1
    await (b[0]=false or turn=0)
    CS
    b[1]:=false
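As a sanity check before the proof, the state space of Peterson's algorithm is small enough to explore exhaustively. The sketch below is an illustration, not a proof and not part of the original slides. It assumes that the await is evaluated as two separate atomic reads (first b[1-i], then turn), and the program-counter encoding (0 to 4, with 4 meaning "in the CS") is hypothetical.

```python
from collections import deque

def step(state, i):
    """Return the state after process i takes one atomic step.

    A state is ((pc0, pc1), (b0, b1), turn).
    """
    pc, b, turn = list(state[0]), list(state[1]), state[2]
    if pc[i] == 0:                   # b[i]:=true
        b[i] = True
        pc[i] = 1
    elif pc[i] == 1:                 # turn:=i
        turn = i
        pc[i] = 2
    elif pc[i] == 2:                 # await, first read: is b[1-i] false?
        pc[i] = 4 if not b[1 - i] else 3
    elif pc[i] == 3:                 # await, second read: is turn = 1-i?
        pc[i] = 4 if turn == 1 - i else 2
    else:                            # pc 4: in the CS; exit code b[i]:=false
        b[i] = False
        pc[i] = 0
    return (tuple(pc), tuple(b), turn)

def reachable_states():
    """All states reachable from either initial value of turn."""
    starts = [((0, 0), (False, False), t) for t in (0, 1)]
    seen = set(starts)
    queue = deque(starts)
    while queue:
        state = queue.popleft()
        for i in (0, 1):
            nxt = step(state, i)
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

states = reachable_states()
violations = [s for s in states if s[0] == (4, 4)]
print(len(states), violations)
```

No reachable state has both processes at pc 4, which agrees with the mutual exclusion claim; starvation-freedom still needs a separate argument.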
Schematic for Peterson’s 2-process algorithm
[Flowchart: indicate participation (b[i]:=true); barrier (turn:=i); is there contention (b[1-i]=true)? If no, enter the critical section. If yes, first to cross the barrier (turn=1-i)? If yes, enter the critical section; if no, keep waiting. Exit code (b[i]:=false).]
Synchronization Algorithms and Concurrent Programming, Gadi Taubenfeld © 2006
Let’s prove that Peterson’s 2-process algorithm satisfies both mutual-exclusion and starvation-freedom.
Kessels’ single-writer algorithm (Kessels, 1982)
A single-writer register is a register that can be written by a single process only.
initially: b[0]=false, b[1]=false, turn[0], turn[1]=0 or 1
Program for process 0
    b[0]:=true
    local[0]:=turn[1]
    turn[0]:=local[0]
    await (b[1]=false or local[0]<>turn[1])
    CS
    b[0]:=false
Program for process 1
    b[1]:=true
    local[1]:=1-turn[0]
    turn[1]:=local[1]
    await (b[0]=false or local[1]=turn[0])
    CS
    b[1]:=false
Mutual exclusion for n processes: Tournament trees
[Figure: a binary tournament tree with levels 0, 1, and 2; the processes enter below level 0.]
A tree node is identified by [level, node#].
Tournament tree based on Peterson’s 2-process alg.
Variables
    Per node: b[level, 2node], b[level, 2node+1], turn[level, node]
    Per process (local): level, node, id
Program for process i
    node:=i
    for level:=0 to log n - 1 do
        id:=node mod 2
        node:=⌊node/2⌋
        b[level, 2node+id]:=true
        turn[level, node]:=id
        await (b[level, 2node+1-id]=false or turn[level, node]=1-id)
    od
    CS
    for level:=log n - 1 downto 0 do
        node:=⌊i/2^level⌋
        b[level, node]:=false
    od
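The index arithmetic of the tournament tree is easy to get wrong, so here is a small sketch of it. It assumes n is a power of two and processes are numbered 0 to n-1; the function names tournament_path and exit_flags are made up for this illustration.

```python
def tournament_path(i, n):
    """Return (level, node, id) for every 2-process lock that process i plays.

    Assumes n is a power of two and 0 <= i < n.
    """
    levels = n.bit_length() - 1      # log2(n) for a power of two
    path = []
    node = i
    for level in range(levels):
        ident = node % 2             # side taken in the 2-process lock
        node = node // 2             # index of the lock at this level
        path.append((level, node, ident))
    return path

def entry_flags(i, n):
    """Flags b[level, 2node+id] raised by the entry code."""
    return [(level, 2 * node + ident) for level, node, ident in tournament_path(i, n)]

def exit_flags(i, n):
    """Flags b[level, floor(i / 2^level)] cleared by the exit code, root first."""
    levels = n.bit_length() - 1
    return [(level, i // 2 ** level) for level in range(levels - 1, -1, -1)]

print(tournament_path(5, 8))   # [(0, 2, 1), (1, 1, 0), (2, 0, 1)]
print(exit_flags(5, 8))        # [(2, 1), (1, 2), (0, 5)]
```

The exit code clears exactly the flags that the entry code raised, only in the opposite order, from the root down to the leaf.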
The tournament tree using Peterson’s 2-process algorithm satisfies both mutual-exclusion and starvation-freedom.
Contention-free step complexity
The worst-case number of steps for a process to enter the CS when it runs by itself.
What’s the contention-free step complexity of Peterson’s tournament tree? Θ(log n)
Can we do better?
Lamport’s fast mutual exclusion algorithm
Variables
    fast-lock, slow-lock initially 0
    want[i] initially false
Program for process i
    1: want[i]:=true
       fast-lock:=i
       if slow-lock<>0 then
           want[i]:=false
           await slow-lock=0
           goto 1
       slow-lock:=i
       if fast-lock<>i then
           want[i]:=false
           for j:=1 to n do await want[j]=false od
           if slow-lock<>i then
               await slow-lock=0
               goto 1
       CS
       slow-lock:=0
       want[i]:=false
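To compare contention-free costs, one can count shared-memory accesses in a solo run of each entry code. The sketch below is an illustration under stated assumptions: CountingMemory is a hypothetical wrapper that counts reads and writes, process ids for the fast algorithm are nonzero (0 means the lock is free), and the tournament tree uses ids 0 to n-1 with n a power of two.

```python
class CountingMemory:
    """Shared registers stored in a dict, with a counter of accesses."""
    def __init__(self):
        self.cells = {}
        self.accesses = 0

    def read(self, name, default):
        self.accesses += 1
        return self.cells.get(name, default)

    def write(self, name, value):
        self.accesses += 1
        self.cells[name] = value

def fast_entry_solo(mem, i):
    """Entry code of the fast algorithm when process i runs alone."""
    mem.write(("want", i), True)
    mem.write("fast-lock", i)
    if mem.read("slow-lock", 0) != 0:
        raise RuntimeError("contention detected: not a solo run")
    mem.write("slow-lock", i)
    if mem.read("fast-lock", 0) != i:
        raise RuntimeError("contention detected: not a solo run")

def tournament_entry_solo(mem, i, n):
    """Entry code of the Peterson tournament tree when process i runs alone."""
    node = i
    for level in range(n.bit_length() - 1):
        ident = node % 2
        node //= 2
        mem.write(("b", level, 2 * node + ident), True)
        mem.write(("turn", level, node), ident)
        if mem.read(("b", level, 2 * node + 1 - ident), False):
            raise RuntimeError("contention detected: not a solo run")

def count_accesses(entry, *args):
    mem = CountingMemory()
    entry(mem, *args)
    return mem.accesses

fast_cost = count_accesses(fast_entry_solo, 3)
tree_cost_8 = count_accesses(tournament_entry_solo, 5, 8)
tree_cost_1024 = count_accesses(tournament_entry_solo, 5, 1024)
print(fast_cost, tree_cost_8, tree_cost_1024)
```

The solo entry cost of the fast algorithm does not depend on n, while the tournament tree pays three accesses per level.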
Schematic for Lamport’s fast mutual exclusion
[Flowchart: indicate contention (want[i]:=true, fast-lock:=i); is there contention (slow-lock<>0)? If yes, wait until the CS is released (want[i]:=false, await slow-lock=0) and start over. If no, barrier (slow-lock:=i); is there contention (fast-lock<>i)? If no, enter the CS. If yes, wait until no other process can cross the barrier; not last to cross the barrier (slow-lock<>i)? If yes, wait until the CS is released and start over; if no, enter the CS. Then exit.]
Lamport’s fast mutual exclusion algorithm satisfies both mutual-exclusion and deadlock-freedom.
First in First Out (FIFO)
Mutual exclusion, deadlock-freedom, starvation-freedom
[Figure: the entry code is split into a doorway followed by a waiting section; the loop is remainder, entry code (doorway, then waiting), critical section, exit code.]
FIFO: If process p is waiting and process q has not yet started the doorway, then q will not enter the CS before p.
Lamport’s bakery algorithm
Variables
    number[i] initially 0
    choosing[i] initially false
Program for process i
    choosing[i]:=true
    number[i]:=max(number[1], …, number[n])+1
    choosing[i]:=false
    for j:=1 to n (<> i) do
        await choosing[j]=false
        await number[j]=0 or (number[j],j) > (number[i],i)
    od
    CS
    number[i]:=0
Lamport’s Bakery Algorithm
[Figure: timeline of processes 1 to n moving through remainder, doorway, waiting, CS, and exit, annotated with the ticket numbers they hold.]
Implementation 1
code of process i, i ∈ {1, ..., n}
    number[i] := 1 + max {number[j] | (1 ≤ j ≤ n)}
    for j := 1 to n (<> i) {
        await (number[j] = 0) ∨ (number[j] > number[i])
    }
    critical section
    number[i] := 0
Shared data: number[1..n], integers
Does this implementation work? Answer: No, it can deadlock!
Implementation 1: deadlock
[Figure: timeline through remainder, doorway, waiting, CS, and exit in which two processes take the same number 2 and then wait for each other forever.]
Implementation 2
code of process i, i ∈ {1, ..., n}
    number[i] := 1 + max {number[j] | (1 ≤ j ≤ n)}
    for j := 1 to n (<> i) {
        await (number[j] = 0) ∨ (number[j],j) ≥ (number[i],i)   // lexicographical order
    }
    critical section
    number[i] := 0
Shared data: number[1..n], integers
Does this implementation work? Answer: It does not satisfy mutual exclusion!
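The difference between the waiting conditions of Implementation 1 and Implementation 2 shows up as soon as two processes hold the same ticket. A minimal sketch, assuming the lexicographic test is "greater than or equal"; the helper names and the sample ticket values are made up for illustration.

```python
def may_pass_plain(number, i, j):
    """Implementation 1: i passes j if j holds no ticket or a strictly larger one."""
    return number[j] == 0 or number[j] > number[i]

def may_pass_lexicographic(number, i, j):
    """Implementation 2: ties on the ticket are broken by the process id."""
    return number[j] == 0 or (number[j], j) >= (number[i], i)

# Hypothetical tie: processes 1 and 2 both drew ticket 2.
number = {1: 2, 2: 2}

plain = (may_pass_plain(number, 1, 2), may_pass_plain(number, 2, 1))
lexi = (may_pass_lexicographic(number, 1, 2), may_pass_lexicographic(number, 2, 1))
print(plain)  # (False, False): each process waits for the other
print(lexi)   # (True, False): process 1 goes first
```

Python compares tuples lexicographically, so (number[j], j) >= (number[i], i) matches the order used on the slide. Breaking ties removes the deadlock, but it is not enough for mutual exclusion without the choosing bits.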
Implementation 2: no mutual exclusion
[Figure: timeline of processes 1 to n through remainder, doorway, waiting, CS, and exit for a run in which mutual exclusion is violated.]
The Bakery Algorithm
code of process i, i ∈ {1, ..., n}
    1: choosing[i] := true
    2: number[i] := 1 + max {number[j] | (1 ≤ j ≤ n)}
    3: choosing[i] := false
    4: for j := 1 to n do
    5:     await choosing[j] = false
    6:     await (number[j] = 0) ∨ (number[j],j) ≥ (number[i],i)
    7: od
    8: critical section
    9: number[i] := 0
Doorway: statements 1 to 3. Waiting: statements 4 to 7.
Shared data: choosing[1..n], bits, initially false; number[1..n], integers
Computing the maximum
code of process i, i ∈ {1, ..., n}
The correctness of the Bakery algorithm depends on an implicit assumption on the implementation of computing the maximum (statement 2). Below we give a correct implementation.
For each process, three additional local registers are used. They are named local1, local2, local3, and their initial values are immaterial.
    local1 := 0
    for local2 := 1 to n {
        local3 := number[local2]
        if local1 < local3 then {local1 := local3}
    }
    number[i] := 1 + local1
Question: Computing the maximum
code of process i, i ∈ {1, ..., n}
Is the following implementation also correct? That is, does the Bakery algorithm solve the mutual exclusion problem when the following implementation is used? Justify your answer.
For each process, two additional local registers are used. They are named local1 and local2, and their initial values are immaterial.
    local1 := i
    for local2 := 1 to n {
        if number[local1] < number[local2] then {local1 := local2}
    }
    number[i] := 1 + number[local1]
The 2nd maximum alg. doesn’t work
[Figure: timeline of processes 1 to n through remainder, doorway, waiting, CS, and exit for a run in which the second maximum computation fails.]
Properties of the Bakery algorithm
Satisfies mutual exclusion and first-come-first-served.
The size of number[i] is unbounded. In practice this is not a problem: 16-bit registers give ticket numbers that can grow up to 2^16 - 1, a value that in practice will never be reached.
There is no need to assume that operations on the same memory location occur in some definite order; the algorithm works correctly even when reads that are concurrent with writes may return an arbitrary value.
The Black-White Bakery Algorithm
Bounding the space of the Bakery Algorithm
    Bakery: FIFO, unbounded space
    The Black-White Bakery Algorithm: FIFO, bounded space + one bit [Taubenfeld 2004]
The Black-White Bakery Algorithm
[Figure: timeline of processes 1 to n through remainder, doorway, waiting, CS, and exit, showing the shared color bit and the ticket numbers the processes hold.]
The Black-White Bakery Algorithm: Data Structures
    choosing[1..n]: bits
    mycolor[1..n]: bits
    number[1..n]: values in {0, 1, ..., n}
    color: one bit, in {black, white}
The Black-White Bakery Algorithm
code of process i, i ∈ {1, ..., n}
    choosing[i] := true
    mycolor[i] := color
    number[i] := 1 + max{number[j] | (1 ≤ j ≤ n) ∧ (mycolor[j] = mycolor[i])}
    choosing[i] := false
    for j := 1 to n do
        await choosing[j] = false
        if mycolor[j] = mycolor[i]
        then await (number[j] = 0) ∨ ((number[j],j) ≥ (number[i],i)) ∨ (mycolor[j] ≠ mycolor[i])
        else await (number[j] = 0) ∨ (mycolor[i] ≠ color) ∨ (mycolor[j] = mycolor[i])
        fi
    od
    critical section
    if mycolor[i] = black then color := white else color := black fi
    number[i] := 0
A space lower bound for deadlock-free mutex
How many registers must an n-process deadlock-free mutual exclusion algorithm use if it can only use single-writer registers? At least n, since every process must write to some register before it enters its CS.
We now prove that the same result holds for multi-reader-multi-writer registers, regardless of their size.
Some definitions required for the proof
    Configuration
    A quiescent configuration
    Indistinguishable configurations
    A P-quiescent configuration
    A covered register
    An execution
Example of indistinguishability
Execution x is indistinguishable from execution y to process p.
execution x
    p reads 5 from r1
    q writes 6 to r1
    p writes 7 to r1
    q writes 8 to r1
    p reads 8 from r1
execution y
    p reads 5 from r1
    p writes 7 to r1
    q writes 6 to r1
    q reads 6 from r1
    q writes 8 to r1
    p reads 8 from r1
The values of the shared registers must also be the same: r1 holds 8 at the end of both executions.
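The example can be checked mechanically. The sketch below encodes an execution as a list of (process, operation, register, value) tuples and treats two executions as indistinguishable to a process when that process performs the same steps with the same values and the shared registers end up equal. This encoding and the function names are assumptions made for illustration, a simplification of the formal definition.

```python
def view(execution, process):
    """The steps of one process, in order, with the values it read or wrote."""
    return [(op, reg, val) for proc, op, reg, val in execution if proc == process]

def final_registers(execution, initial):
    """Register contents after replaying all writes of the execution."""
    regs = dict(initial)
    for proc, op, reg, val in execution:
        if op == "write":
            regs[reg] = val
    return regs

def indistinguishable(x, y, process, initial):
    """Same local view for the process and same final register values."""
    same_view = view(x, process) == view(y, process)
    same_memory = final_registers(x, initial) == final_registers(y, initial)
    return same_view and same_memory

x = [("p", "read", "r1", 5), ("q", "write", "r1", 6), ("p", "write", "r1", 7),
     ("q", "write", "r1", 8), ("p", "read", "r1", 8)]
y = [("p", "read", "r1", 5), ("p", "write", "r1", 7), ("q", "write", "r1", 6),
     ("q", "read", "r1", 6), ("q", "write", "r1", 8), ("p", "read", "r1", 8)]

for_p = indistinguishable(x, y, "p", {"r1": 5})
for_q = indistinguishable(x, y, "q", {"r1": 5})
print(for_p, for_q)  # True False
```

Process q can tell the two executions apart because it performs an extra read in y, while p sees exactly the same values in both.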
Illustration for Lemma 1
[Diagram: configurations C (pi-quiescent, W covered by P), D (quiescent), C1 (pi in CS), D1, Q (quiescent), Q1 (pi in CS), R (pj in CS), and Z (both pi and pj in CS!), connected by arrows labeled "by pi", "by pj", and "by P"; C ~ D for pi and Q ~ Q1 for pj.]
Based on the proof in “Distributed Computing”, by Hagit Attiya & Jennifer Welch
Illustration for the simple part of Lemma 2
[Diagram: configurations C1 and C2 ({pk,…,pn-1}-quiescent, p0…pk-1 cover W), D1 (quiescent), D'1 (x is covered, P-{pk} in remainder), and C'2 ({pk+1,…,pn-1}-quiescent, W ∪ {x} covered); pk runs alone until it covers x; p0…pk write to W and exit; D'1 ~ D1 for {p0…pk-1}.]
Based on the proof in “Distributed Computing”, by Hagit Attiya & Jennifer Welch
Illustration for the general part of Lemma 2
[Diagram: a chain of configurations C1, C2, …, Ci, …, Cj, each {pk,…,pn-1}-quiescent with p0…pk-1 covering W1, W2, …, Wi; quiescent configurations D0, D1, …, Di with D'i ~ Di; and C'j, which is {pk+1,…,pn-1}-quiescent with W ∪ {x} covered.]
Based on the proof in “Distributed Computing”, by Hagit Attiya & Jennifer Welch
A matching upper bound: the one-bit algorithm
initially: b[i]=false
Program for process i
    repeat
        b[i]:=true; j:=1
        while (b[i]=true) and (j<i) do
            if b[j]=true then
                b[i]:=false
                await b[j]=false
            fi
            j:=j+1
        od
    until b[i]=true
    for j:=i+1 to n do
        await b[j]=false
    od
    Critical Section
    b[i]:=false
Read-Modify-Write (RMW) operations
Read-modify-write(w, f)
    do atomically
        prev:=w
        w:=f(prev)
        return prev
Fetch-and-add(w, Δ)
    do atomically
        prev:=w
        w:=prev+Δ
        return prev
Test-and-set(w)
    do atomically
        prev:=w
        w:=1
        return prev
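The three operations can be imitated in software to experiment with them. In the sketch below a private threading.Lock stands in for the hardware atomicity, which is an assumption of this illustration; the Register class and its method names are hypothetical, and real machines provide these operations as single instructions.

```python
import threading

class Register:
    """A shared register; a private lock simulates 'do atomically'."""
    def __init__(self, value):
        self.value = value
        self._atomic = threading.Lock()

    def read_modify_write(self, f):
        with self._atomic:
            prev = self.value
            self.value = f(prev)
            return prev

    def fetch_and_add(self, delta):
        # Fetch-and-add is RMW with f(prev) = prev + delta.
        return self.read_modify_write(lambda prev: prev + delta)

    def test_and_set(self):
        # Test-and-set is RMW with the constant function f(prev) = 1.
        return self.read_modify_write(lambda prev: 1)

w = Register(10)
returned_by_add = w.fetch_and_add(5)

bit = Register(0)
first_tas = bit.test_and_set()
second_tas = bit.test_and_set()
print(returned_by_add, w.value, first_tas, second_tas)  # 10 15 0 1
```

Both special cases return the previous value, which is what lets a caller learn whether it was the one that changed the register.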
Mutual exclusion using test-and-set
initially: v:=0
Program for process i
    await test&set(v) = 0
    Critical Section
    v:=0
Mutual exclusion? Yes
Deadlock-freedom? Yes
Starvation-freedom? No
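A runnable sketch of the test-and-set lock, under the assumption that an internal threading.Lock simulates the atomic instruction. The class TestAndSetBit, the thread count, and the iteration count are arbitrary choices for illustration. The critical section is a deliberately non-atomic increment, so a lost update would show up in the final count.

```python
import threading
import time

class TestAndSetBit:
    """One shared bit; an internal lock stands in for hardware atomicity."""
    def __init__(self):
        self.value = 0
        self._atomic = threading.Lock()

    def test_and_set(self):
        with self._atomic:
            prev = self.value
            self.value = 1
            return prev

v = TestAndSetBit()
counter = 0

def process(iterations):
    global counter
    for _ in range(iterations):
        while v.test_and_set() != 0:   # await test&set(v) = 0
            time.sleep(0)              # yield so the lock holder can run
        tmp = counter                  # critical section: read, then write
        counter = tmp + 1
        v.value = 0                    # exit code: v:=0

threads = [threading.Thread(target=process, args=(200,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)
```

Nothing in the loop orders the waiting processes, which is why the lock is deadlock-free but not starvation-free: an unlucky process can lose the race for v every time.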
Mutual exclusion using general RMW
initially: v:=<0,0>
Program for process i
    position:=RMW(v, <v.first, v.last+1>)
    repeat
        queue:=v
    until queue.first = position.last
    Critical Section
    RMW(v, <v.first+1, v.last>)
How many bits does this algorithm require? An unbounded number, but this can be improved to 2 log₂ n.
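The queue-based algorithm behaves like a ticket dispenser: last is the next ticket to hand out and first is the ticket being served. The following sketch assumes a lock-simulated RMW on a register holding the pair <first, last>; PairRegister, the thread names, and the order list are hypothetical and exist only to observe the service order.

```python
import threading
import time

class PairRegister:
    """Register holding <first, last>; a lock simulates the atomic RMW."""
    def __init__(self):
        self.value = (0, 0)
        self._atomic = threading.Lock()

    def rmw(self, f):
        with self._atomic:
            prev = self.value
            self.value = f(prev)
            return prev

v = PairRegister()
order = []   # (ticket, name) pairs, appended inside the critical section

def process(name):
    # position := RMW(v, <v.first, v.last+1>): take the next ticket
    position = v.rmw(lambda p: (p[0], p[1] + 1))
    # repeat queue := v until queue.first = position.last
    while v.value[0] != position[1]:
        time.sleep(0)
    order.append((position[1], name))          # critical section
    # RMW(v, <v.first+1, v.last>): serve the next ticket
    v.rmw(lambda p: (p[0] + 1, p[1]))

threads = [threading.Thread(target=process, args=(f"p{k}",)) for k in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
tickets = [ticket for ticket, _ in order]
print(tickets, v.value)
```

Processes enter the critical section in ticket order, so this lock is FIFO. The counters here grow without bound; keeping them modulo n is what brings the space down to 2 log₂ n bits.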
Lower bound on the number of bits required for mutual exclusion