CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS
Set 18: Wait-Free Simulations Beyond Registers
Spring 2014, Prof. Jennifer Welch

Data Types Beyond Registers
- Registers support the operations read and write.
- We've seen wait-free simulations of one kind of register out of another kind:
  - different numbers of values, readers, writers
- What about (wait-free) simulating a significantly different kind of data type out of registers?
- More generally, what about (wait-free) simulating an object of type X out of objects of type Y?

Key Insight
- The ability of objects of type Y to be used to simulate an object of type X is related to the ability of those data types to solve consensus!
- We are focusing on systems that are
  - asynchronous
  - shared memory
  - wait-free

FIFO Queue Example
Sequential specification of a FIFO queue:
- operation with invocation enq(x) and response ack
- operation with invocation deq and response return(x)
- a sequence of operations is allowable iff each deq returns the oldest enqueued value that has not yet been dequeued (returns ⊥ if the queue is empty)

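The allowability condition can be sketched as a small checker. This is only an illustration: the name `is_allowable`, the tuple encoding of operations, and the use of `None` for the empty-queue response are assumptions of this sketch, not part of the slides.

```python
from collections import deque

BOTTOM = None  # assumption: None stands in for the "queue is empty" response


def is_allowable(ops):
    """Check a sequence of completed operations against the FIFO queue spec.

    Each element of ops is ("enq", x) or ("deq", returned_value).
    """
    pending = deque()  # enqueued values not yet dequeued, oldest first
    for kind, value in ops:
        if kind == "enq":
            pending.append(value)
        elif kind == "deq":
            # The spec requires the oldest pending value, or BOTTOM if none.
            expected = pending.popleft() if pending else BOTTOM
            if value != expected:
                return False
        else:
            raise ValueError(f"unknown operation {kind!r}")
    return True
```

For example, `[("enq", 1), ("enq", 2), ("deq", 1)]` is allowable, while `[("enq", 1), ("enq", 2), ("deq", 2)]` is not, because the deq skips the oldest value.
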
Consensus Algorithm for n = 2 Using FIFO Queue
Shared objects: one shared FIFO queue Q, two shared registers Prefer[0] and Prefer[1]
Initially Q = [0] and Prefer[i] = ⊥
Code for pi (i = 0 or 1):
    Prefer[i] := pi's input          // write my input into my register
    val := deq(Q)                    // use shared queue to arbitrate between the 2 procs
    if val = 0 then decide on pi's input
    else temp := Prefer[1 - i]
         decide temp
- The first proc to dequeue the initial 0 wins; the decision value is its input.
- The loser obtains the decision value from the other proc's register.

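The 2-proc algorithm can be sketched in Python with threads. The class `AtomicQueue` and the function `run_two_proc_consensus` are hypothetical names for this sketch. The lock only models the assumption that queue operations are atomic; it does not make the queue itself wait-free.

```python
import threading
from collections import deque


class AtomicQueue:
    """FIFO queue with atomic enq/deq, modeled with a lock (illustration only)."""

    def __init__(self, items=()):
        self._items = deque(items)
        self._lock = threading.Lock()

    def enq(self, x):
        with self._lock:
            self._items.append(x)

    def deq(self):
        with self._lock:
            # None plays the role of the empty-queue response.
            return self._items.popleft() if self._items else None


def run_two_proc_consensus(inputs):
    """Run the 2-proc algorithm once; returns [decision of p0, decision of p1]."""
    q = AtomicQueue([0])       # initially Q = [0]
    prefer = [None, None]      # the two shared registers
    decisions = [None, None]

    def proc(i):
        prefer[i] = inputs[i]  # write my input into my register
        val = q.deq()          # arbitrate through the shared queue
        if val == 0:
            decisions[i] = inputs[i]      # winner decides its own input
        else:
            decisions[i] = prefer[1 - i]  # loser reads the winner's register

    threads = [threading.Thread(target=proc, args=(i,)) for i in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return decisions
```

The loser always finds the winner's input in the register, because the winner writes its register before it dequeues the 0.
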
Implications of Consensus Algorithm Using FIFO Queue
- Suppose we want to wait-free simulate a FIFO queue using read/write registers. Is this possible?
- No! If it were possible, we could solve consensus using only registers:
  - simulate a FIFO queue using registers
  - use the simulated queue and the previous algorithm to solve consensus
- This would contradict the impossibility of wait-free consensus using only read/write registers.

Extend Algorithm to More Procs?
- Can we use FIFO queues to solve consensus with more than 2 procs?
- The ability to atomically dequeue a value was key to the 2-proc alg:
  - one proc. learns it is the winner
  - the other learns it is the loser, therefore the id of the winner is obvious
- Not clear how to handle 3 procs.
- Suppose we have a different data type:

Compare & Swap Specification
compare&swap(X : shared memory address, old : value, new : value)
    previous := X          // previous is a local var.
    if previous = old then X := new
    return previous
- The whole operation occurs atomically.

Consensus Algorithm Using Compare-and-Swap
Shared object: one shared C&S object First
Initially First = ⊥
    val := compare&swap(First, ⊥, my input)   // if First = ⊥ then replace with my input
    if val = ⊥ then decide on my input
    else decide val
- The compare&swap simultaneously indicates whether you are the winner and the value to be decided by all the losers.

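A minimal Python sketch of this algorithm for any number of procs follows. `CompareAndSwap` and `run_cas_consensus` are hypothetical names. The lock models the atomicity of compare&swap, `None` stands in for ⊥, and the sketch assumes no input is `None`.

```python
import threading


class CompareAndSwap:
    """Shared cell with an atomic compare&swap, modeled with a lock."""

    def __init__(self, initial=None):
        self._value = initial
        self._lock = threading.Lock()

    def compare_and_swap(self, old, new):
        with self._lock:
            previous = self._value
            if previous == old:
                self._value = new
            return previous  # always the value seen before the operation


def run_cas_consensus(inputs):
    """Each proc i proposes inputs[i]; returns the list of decisions."""
    first = CompareAndSwap(None)  # initially First = bottom (None here)
    decisions = [None] * len(inputs)

    def proc(i):
        val = first.compare_and_swap(None, inputs[i])
        # val is None only for the single proc whose swap succeeded.
        decisions[i] = inputs[i] if val is None else val

    threads = [threading.Thread(target=proc, args=(i,))
               for i in range(len(inputs))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return decisions
```

Nothing in `proc` depends on the number of procs, which is why the same code works for every n.
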
Impossibility of 3-Proc Consensus with FIFO Queue
Theorem (15.3): Wait-free consensus is impossible using FIFO queues and registers if n > 2.
Proof: Same structure as for registers. The key difference arises in the situation where C is bivalent, p0(C) is 0-valent, and p1(C) is 1-valent.

Impossibility of 3-Proc Consensus with FIFO Queues
- p0 and p1 must be accessing the same FIFO queue.
- Case 1: Both steps are deq's.
  - From bivalent C, p0 deq's then p1 deq's gives a 0-valent configuration; p1 deq's then p0 deq's gives a 1-valent configuration.
  - The two configurations look the same to p2: contradiction!

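Impossibility Proof
- Case 2: p0 deq's and p1 enq's.
- Case 2.1: The queue is not empty in C.
  - From bivalent C, p0 deq's then p1 enq's gives a 0-valent configuration; p1 enq's then p0 deq's gives a 1-valent configuration.
  - Because the queue is not empty, the deq and the enq commute, so both orders lead to the same configuration: contradiction!
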
Impossibility Proof
- Case 2: p0 deq's and p1 enq's.
- Case 2.2: The queue is empty in C.
  - From bivalent C, p0 deq's: the queue is still empty, and the configuration is 0-valent.
  - From C, p1 enq's and then p0 deq's: the queue is empty again, and the configuration is 1-valent.
  - The two configurations look the same to p2: contradiction!

Impossibility Proof
- Case 3: Both p0 and p1 enq (on same queue).
  - From bivalent C, p0 enq's A then p1 enq's B gives a 0-valent configuration; p1 enq's B then p0 enq's A gives a 1-valent configuration.
  - After p0 enq's A, p1 enq's B: p0 takes steps until deq'ing A, then p1 takes steps until deq'ing B.
  - After p1 enq's B, p0 enq's A: p0 takes steps until deq'ing B, then p1 takes steps until deq'ing A.
  - The resulting configurations look the same to p2: contradiction!
  - Why do these schedules of p0 and p1 exist?

Impossibility Proof
- Case 3 cont'd: Suppose the p0 schedule (p0 takes steps until deq'ing A) does not exist.
  - After p0 enq's A, p1 enq's B (0-valent): p0 takes steps until deciding but never deq's A; decides 0.
  - After p1 enq's B, p0 enq's A (1-valent): p0 takes the same number of steps as in the first execution; never deq's B; also decides 0.
  - Deciding 0 from a 1-valent configuration is a contradiction!

Impossibility Proof
- Case 3 cont'd: Prove existence of the p1 schedule (p1 takes steps until its deq) similarly.
- Thus there is no wait-free algorithm for consensus with 3 procs using FIFO queues and registers.

Implications
- Suppose we want to wait-free simulate a compare&swap object using FIFO queues (and registers). Is this possible?
- Not if n > 2! If it were possible, we could solve consensus using FIFO queues (and registers):
  - simulate a compare&swap object using FIFO queues (and registers)
  - use the simulated compare&swap object and the c&s algorithm to solve consensus

Generalize these Arguments
- Previous results concerning FIFO queues and compare&swap suggest a criterion for determining if wait-free simulations exist:
  - based on the ability of the data types to solve consensus for a certain number of procs.

Consensus Number
- Data type X has consensus number n if n is the largest number of procs. for which consensus can be solved using only objects of type X and read/write registers.

data type             consensus number
read/write register   1
FIFO queue            2
compare&swap          ∞

Using Consensus Numbers
Theorem (15.5): If data type X has consensus number m and data type Y has consensus number n with n > m, then there is no wait-free simulation of an object of type Y using objects of type X and read/write registers in a system with more than m procs.

Using Consensus Numbers
Proof: Suppose in contradiction there is a wait-free simulation S of Y using X and registers in a system with k procs, where m < k ≤ n.
Construct a consensus algorithm for k > m procs using objects of type X (and registers):
- Use S to simulate some objects of type Y using objects of type X (and registers).
- Use the (simulated) type Y objects (and registers) in the k-proc consensus algorithm that exists since CN(Y) = n.
This contradicts CN(X) = m < k.

Corollaries
- There is no wait-free simulation of any object with consensus number > 1 using just read/write registers.
- There is no wait-free simulation of any object with consensus number > 2 using just FIFO queues and read/write registers.

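Theorem 15.5 and its corollaries amount to a simple numeric test, sketched here. The function name `ruled_out` and the dictionary `CONSENSUS_NUMBER` are assumptions of this sketch; the three consensus numbers are the ones given in this set. A result of `False` only means the theorem does not apply, not that a simulation is known to exist.

```python
import math

# Consensus numbers stated in this set of slides.
CONSENSUS_NUMBER = {
    "read/write register": 1,
    "FIFO queue": 2,
    "compare&swap": math.inf,
}


def ruled_out(source, target, num_procs):
    """True if Theorem 15.5 forbids a wait-free simulation of `target`
    from objects of type `source` (plus registers) with num_procs procs."""
    m = CONSENSUS_NUMBER[source]
    n = CONSENSUS_NUMBER[target]
    return n > m and num_procs > m
```

For instance, `ruled_out("FIFO queue", "compare&swap", 3)` is `True`, matching the second corollary, while the same call with 2 procs returns `False` because the theorem requires more than m = 2 procs.
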
Universality
- Let's now consider positive results relating to consensus number.
- A data type is universal if objects of that type (together with read/write registers) can wait-free simulate any data type.
- Theorem: If data type X has consensus number n, then it is universal in a system with at most n procs.

Proving Universality Result
- Describe an algorithm that simulates any data type
  - uses compare&swap (instead of any object with consensus number n)
  - simulation is only non-blocking, weaker than wait-free
- Modify to use any object with consensus number n
- Modify to be wait-free
- Modify to bound shared memory used

Non-Blocking
- Non-blocking vs. wait-free is analogous to no-deadlock vs. no-lockout for mutual exclusion.
- Non-blocking simulation: at any point in an execution, if at least one operation is pending (response is not yet ready to be done), then there is a finite sequence of steps by a single proc that completes one of the pending operations.
- Does not ensure that every pending operation is eventually completed.

Universal Construction
- Keep the history of operations that have been applied to the simulated object as a shared linked list.
- To apply an operation on the simulated object, the invoking proc. must insert an appropriate "node" into the linked list:
  - it is convenient to put the newest node at the head of the list
- A compare&swap object is used to keep track of the head of the list.

Details on Linked List
Each linked list node has
- operation invocation
- new state of the simulated object
- operation response
- pointer to previous node (previous op)
[Figure: Head points to the newest node; each node holds invocation, state, response, and a before pointer to the previous node; the last node is the anchor, which holds the initial state.]

Simulation
Initially Head points to the anchor node, which represents the initial state of the simulated object.
When inv is invoked:
    allocate a new linked list node in shared memory, pointed to by local var point
    point.inv := inv
    repeat
        h := Head                    // h is a local var
        point.state, point.response := apply(inv, h.state)   // depends on simulated data type
        point.before := h
    until compare&swap(Head, h, point) = h   // if Head still points to same node h points to, then make Head point to new node
    do the output indicated by point.response

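The non-blocking simulation can be sketched in Python as follows. The names `CasCell`, `Universal`, `counter_apply`, `queue_apply`, and `run_concurrently` are hypothetical. The lock inside `CasCell` only models the atomicity of compare&swap, and states are kept immutable so that nodes can share them safely.

```python
import threading


class Node:
    def __init__(self, inv=None, state=None, response=None, before=None):
        self.inv, self.state = inv, state
        self.response, self.before = response, before


class CasCell:
    """Shared cell with atomic read and compare&swap, modeled with a lock."""

    def __init__(self, initial):
        self._value = initial
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value

    def compare_and_swap(self, old, new):
        with self._lock:
            previous = self._value
            if previous is old:  # compare node identity, like a pointer
                self._value = new
            return previous


class Universal:
    """Simulates the data type described by apply_fn(inv, state)."""

    def __init__(self, initial_state, apply_fn):
        self._apply = apply_fn
        self._head = CasCell(Node(state=initial_state))  # anchor node

    def invoke(self, inv):
        point = Node(inv=inv)
        while True:
            h = self._head.read()
            point.state, point.response = self._apply(inv, h.state)
            point.before = h
            if self._head.compare_and_swap(h, point) is h:
                return point.response  # Head had not moved: node inserted
            # Head moved on: retry against the new head.


def counter_apply(inv, state):
    """Fetch-and-increment counter: returns (new state, response)."""
    return state + 1, state


def queue_apply(inv, state):
    """FIFO queue over tuples; inv is ("enq", x) or ("deq",)."""
    if inv[0] == "enq":
        return state + (inv[1],), "ack"
    if state:
        return state[1:], state[0]
    return state, None  # None stands in for the empty-queue response


def run_concurrently(obj, inv, num_procs, ops_per_proc):
    """Have num_procs threads each invoke inv ops_per_proc times."""
    responses = []

    def proc():
        for _ in range(ops_per_proc):
            responses.append(obj.invoke(inv))

    threads = [threading.Thread(target=proc) for _ in range(num_procs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return responses
```

With `Universal(0, counter_apply)`, concurrent invocations each receive a distinct counter value, since every successful compare&swap appends exactly one node to the history.
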
Simulation Figure
[Figure: pi's new node point (invocation, state, response, before) with before = h, next to the list nodes reachable from Head.]
- If compare&swap indicates that Head has moved on, then try again to insert the new node, at the new location.

Strengthenings of Algorithm
To replace the compare&swap object with any object with consensus number n (the number of procs):
- define a consensus object (data type version of the consensus problem)
- get around the difficulty that a consensus object can only be used once by adding a consensus object to each linked list node that points to the next node in the list

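One way to picture the per-node consensus object is the simplified sketch below. It is still only non-blocking, and it finds the end of the list by walking forward from a node the caller already knows, which is an assumption of this sketch rather than the construction in the slides. `ConsensusObject`, `Node.after`, `apply_op`, and `counter_apply` are hypothetical names, and the lock only models the atomic one-shot behavior of a consensus object.

```python
import threading


class ConsensusObject:
    """One-shot consensus: the first proposal wins; every caller gets it."""

    def __init__(self):
        self._lock = threading.Lock()
        self._decided = False
        self._value = None

    def decide(self, proposal):
        with self._lock:
            if not self._decided:
                self._decided = True
                self._value = proposal
            return self._value


class Node:
    def __init__(self, inv=None, state=None):
        self.inv = inv
        self.state = state
        self.response = None
        self.after = ConsensusObject()  # decides which node follows this one


def apply_op(start, inv, apply_fn):
    """Append an operation to the list, walking forward from `start`.

    Returns (response, new_node); pass new_node as `start` next time.
    """
    point = Node(inv)
    cur = start
    while True:
        point.state, point.response = apply_fn(inv, cur.state)
        winner = cur.after.decide(point)
        if winner is point:
            return point.response, point  # our node follows cur
        cur = winner  # someone else's node follows cur: move past it


def counter_apply(inv, state):
    """Fetch-and-increment counter: returns (new state, response)."""
    return state + 1, state
```

Each node's consensus object is decided once, fixing that node's successor forever. A caller that loses at one node moves on to the winner's node and proposes there, so it never needs to reuse an object it has already tried.
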
Strengthenings of Algorithm
- To get a wait-free implementation, use the idea of helping: procs help each other to finish pending operations (not just their own).
- To reduce the size of the linked list (so it doesn't grow without bound), need to keep track of which list nodes can be recycled.

Effect of Randomization
- Suppose we relax the liveness condition for linearizable shared memory:
  - operations must terminate with high probability
- Now a randomized consensus algorithm can be used to simulate any data type out of any other data type, including read/write registers.
- I.e., the hierarchy collapses.