Published by Irene Jefferson. Modified over 9 years ago.
Distributed Algorithms (22903)
Lecturer: Danny Hendler
The wait-free hierarchy and the universality of consensus
This presentation is based on the book “Distributed Computing” by Hagit Attiya & Jennifer Welch.
Formally: the consensus object
- Supports a single operation: decide.
- Each process p_i calls decide with some input v_i from some domain. decide returns a value from the same domain.
- The following requirements must be met:
  - Agreement: In any execution E, all decide operations must return the same value.
  - Validity: The value returned by each operation must equal one of the inputs.
Wait-free consensus can be solved easily by compare&swap

Compare&swap(b, old, new)
  atomically
    v ← read from b
    if (v = old) {
      b ← new
      return success
    } else
      return failure

Architectures listed on the slide: MIPS, PowerPC, DEC Alpha, Motorola 680x0, IBM 370, Sun SPARC, 80X86

How?
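One possible answer to the "How?" above, as a minimal Python sketch rather than the lecture's own solution. Python has no hardware compare&swap, so the `CASRegister` class simulates one with a lock; that class, the `_UNSET` sentinel, and the `run_processes` harness are assumptions made for illustration. Each process tries to swap its input into an initially empty register and then returns whatever the register holds. Only the first compare&swap can succeed, which gives agreement and validity.

```python
import threading

_UNSET = object()  # hypothetical sentinel for the register's initial "empty" value


class CASRegister:
    """Simulated compare&swap register; the lock stands in for hardware atomicity."""

    def __init__(self, initial):
        self._value = initial
        self._lock = threading.Lock()

    def compare_and_swap(self, old, new):
        with self._lock:
            if self._value is old:
                self._value = new
                return True   # success
            return False      # failure

    def read(self):
        with self._lock:
            return self._value


class CASConsensus:
    """Consensus object for any number of processes, built on one CAS register."""

    def __init__(self):
        self._first = CASRegister(_UNSET)

    def decide(self, v):
        # Only the first compare&swap succeeds; later ones fail and change nothing.
        self._first.compare_and_swap(_UNSET, v)
        return self._first.read()


def run_processes(inputs):
    """Illustrative harness: one thread per input, returns the list of decisions."""
    obj = CASConsensus()
    results = [None] * len(inputs)

    def worker(i):
        results[i] = obj.decide(inputs[i])

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(len(inputs))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Every call to decide finishes in two steps regardless of what other processes do, which is why the construction is wait-free for any number of processes.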
Would this consensus algorithm from reads/writes work?

Initially decision = null

Decide(v) ; code for p_i, i = 0,1
  if (decision = null)
    decision := v
    return v
  else
    return decision
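The slide leaves the question open. The Python sketch below, a deterministic simulation written for illustration, replays one interleaving in which the algorithm violates agreement. The generator `decide_steps` follows the pseudocode, with each `yield` marking a point where the scheduler may switch processes; the `run` helper and the schedule encoding are assumptions of this sketch.

```python
def decide_steps(shared, v):
    """Decide(v) from the slide, split into atomic steps at each yield."""
    seen = shared["decision"]        # read the shared register
    yield
    if seen is None:
        shared["decision"] = v       # write the shared register
        yield
        return v
    return seen


def run(schedule, inputs):
    """Advance process i by one step for each i in schedule; return the decisions."""
    shared = {"decision": None}
    procs = [decide_steps(shared, v) for v in inputs]
    results = [None] * len(inputs)
    for i in schedule:
        try:
            next(procs[i])
        except StopIteration as stop:
            results[i] = stop.value
    return results


# Both processes read null before either writes, so each returns its own input.
bad = run([0, 1, 0, 1, 0, 1], ["a", "b"])

# When p0 runs to completion first, p1 adopts p0's value.
good = run([0, 0, 0, 1, 1], ["a", "b"])
```

Here `bad` is `["a", "b"]` (agreement fails) and `good` is `["a", "a"]`. The gap between reading `decision` and writing it is what the adversarial schedule exploits.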
A proof that wait-free consensus for 2 or more processes cannot be solved by registers.
A FIFO queue
Supports 2 operations:
- q.enqueue(x) – returns ack
- q.dequeue – returns the first item in the queue, or empty if the queue is empty.
FIFO queue + registers can implement 2-process consensus

Initially Q = ⟨0⟩ (a single item, 0) and Prefer[i] = null, i = 0,1

Decide(v) ; code for p_i, i = 0,1
  Prefer[i] := v
  qval := Q.dequeue()
  if (qval = 0) then return v
  else return Prefer[1-i]

Since registers alone cannot solve wait-free 2-process consensus, it follows that there is no wait-free implementation of a FIFO queue shared by 2 or more processes from registers.
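The queue-based algorithm can be sketched in Python as follows. This is an illustration under stated assumptions: `AtomicQueue` simulates an atomic FIFO queue with a lock, `None` plays the role of both null and the empty response, and `run_two` is a hypothetical harness. The process that dequeues the initial 0 wins and returns its own input; the other reads the winner's `prefer` entry, which was written before the winner dequeued.

```python
import threading
from collections import deque


class AtomicQueue:
    """Simulated atomic FIFO queue; the lock stands in for the object's atomicity."""

    def __init__(self, items=()):
        self._items = deque(items)
        self._lock = threading.Lock()

    def enqueue(self, x):
        with self._lock:
            self._items.append(x)
        return "ack"

    def dequeue(self):
        with self._lock:
            # None represents the "empty" response.
            return self._items.popleft() if self._items else None


class QueueConsensus2:
    """2-process consensus from one FIFO queue plus two registers."""

    def __init__(self):
        self.q = AtomicQueue([0])     # initially Q holds the single item 0
        self.prefer = [None, None]    # Prefer[0], Prefer[1]

    def decide(self, i, v):
        self.prefer[i] = v
        qval = self.q.dequeue()
        if qval == 0:
            return v                  # winner: dequeued the initial item
        return self.prefer[1 - i]     # loser: adopt the winner's preference


def run_two(v0, v1):
    """Illustrative harness: run p0 and p1 concurrently, return both decisions."""
    obj = QueueConsensus2()
    results = [None, None]

    def worker(i, v):
        results[i] = obj.decide(i, v)

    threads = [threading.Thread(target=worker, args=(i, v)) for i, v in ((0, v0), (1, v1))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```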
A proof that wait-free consensus for 3 or more processes cannot be solved by FIFO queue (+ registers)
The wait-free hierarchy

We say that object type X solves wait-free n-process consensus if there exists a wait-free consensus algorithm for n processes using only shared objects of type X and registers.

The consensus number of object type X is n, denoted CN(X)=n, if n is the largest integer for which X solves wait-free n-process consensus. It is defined to be infinity if X solves consensus for every n.

Lemma: If CN(X)=m and CN(Y)=n>m, then there is no wait-free implementation of Y from instances of X and registers in a system with more than m processes.
The wait-free hierarchy (cont’d)

Consensus number   Object types
∞                  compare-and-swap
…                  …
2                  FIFO queue, stack, test-and-set
1                  registers
The universality of consensus

An object is universal if, together with registers, it can implement any other object in a wait-free manner.

We will show that any object X with consensus number n is universal in a system with n or fewer processes.

The lock-freedom progress property is weaker than wait-freedom. An algorithm is lock-free if it guarantees that some operation terminates after some finite total number of steps performed by processes.
Universal constructions

Given the sequential specification of any object, implement a linearizable wait-free concurrent version of it:
- A lock-free construction using CAS
- A lock-free construction using consensus
- A wait-free construction using consensus
- A bounded-memory wait-free construction using consensus
A lock-free universal algorithm using CAS

Each operation is represented by a shared record of type opr.

typedef opr structure {
  inv        ; the operation invocation, including its parameters
  new-state  ; the new state of the object, after applying the operation
  response   ; the response of the operation
}

[Figure: a chain of opr records (inv, new-state, response) with the Head pointer]
A lock-free universal algorithm using CAS (cont’d)

Initially Head points to the anchor record. Head.new-state is initialized with the implemented object’s initial state.

When inv occurs
  point := new opr, point.inv := inv
  repeat
    h := Head
    point.new-state, point.response := apply(inv, h.new-state)
  until compare&swap(Head, h, point) = success
  return point.response

[Figure: a chain of opr records ending at the anchor record (inv, new-state=init, response), with the Head pointer]
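A minimal Python sketch of the CAS-based construction, offered as an illustration rather than the lecture's code. `AtomicRef` simulates compare&swap with a lock and compares by object identity; `counter_apply` (a fetch-and-increment counter) and `run_counter` are hypothetical names introduced only to exercise it. The `apply` function is assumed to be a pure function from (invocation, state) to (new state, response).

```python
import threading


class AtomicRef:
    """Simulated CAS cell; the lock stands in for hardware atomicity."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value

    def compare_and_swap(self, old, new):
        with self._lock:
            if self._value is old:   # identity comparison on records
                self._value = new
                return True
            return False


class Opr:
    def __init__(self, inv=None, new_state=None, response=None):
        self.inv = inv
        self.new_state = new_state
        self.response = response


class LockFreeUniversal:
    def __init__(self, initial_state, apply):
        self._apply = apply                                    # sequential specification
        self._head = AtomicRef(Opr(new_state=initial_state))   # anchor record

    def invoke(self, inv):
        point = Opr(inv=inv)
        while True:
            h = self._head.read()
            point.new_state, point.response = self._apply(inv, h.new_state)
            if self._head.compare_and_swap(h, point):
                return point.response
            # CAS failed: another operation was installed first, so retry.


def counter_apply(inv, state):
    """Illustrative sequential spec: fetch-and-increment counter."""
    if inv == "inc":
        return state + 1, state
    raise ValueError(inv)


def run_counter(num_threads, ops_per_thread):
    """Illustrative harness: returns all responses, sorted."""
    obj = LockFreeUniversal(0, counter_apply)
    responses = [[] for _ in range(num_threads)]

    def worker(i):
        for _ in range(ops_per_thread):
            responses[i].append(obj.invoke("inc"))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(r for per_thread in responses for r in per_thread)
```

A failed compare&swap means some other operation succeeded, so the system as a whole makes progress (lock-freedom), but a single unlucky process may retry forever, so the construction is not wait-free.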
A lock-free universal algorithm using consensus

Each operation is represented by a shared record of type opr.

typedef opr structure {
  seq        ; the operation’s sequence number (register)
  inv        ; the operation invocation, including its parameters (register)
  new-state  ; the new state of the object, after applying the operation (register)
  response   ; the response of the operation, including its return value (register)
  after      ; a pointer to the next record (consensus object)
}

[Figure: the Head array pointing into a chain of opr records (seq, inv, new-state, response, after); the anchor record has seq=1, inv=null, new-state=init, response=null]
A lock-free universal algorithm using consensus (cont’d)

Initially all Head entries point to the anchor record.

When inv occurs ; code for p_i
  point := new opr, point.inv := inv
  for j = 0 to n-1 ; find a record with the maximum sequence number
    if Head[j].seq > Head[i].seq then Head[i] := Head[j]
  repeat
    win := decide(Head[i].after, point) ; try to thread your operation
    win.seq := Head[i].seq + 1
    win.new-state, win.response := apply(win.inv, Head[i].new-state)
    Head[i] := win ; point to the following record
  until win = point
  return point.response

[Figure: the Head array pointing into a chain of opr records (seq, inv, new-state, response, after); the anchor record has seq=1, inv=null, new-state=init, response=null]
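The consensus-based lock-free construction can be sketched in Python as below. This is a sketch under stated assumptions: the `Consensus` class simulates a consensus object with a lock (any real consensus object with a sufficient consensus number would play this role), `apply` is assumed to be a pure function, and `counter_apply` and `run_counter` are hypothetical helpers.

```python
import threading


class Consensus:
    """Simulated consensus object: the first proposed value is decided."""

    def __init__(self):
        self._lock = threading.Lock()
        self._decided = False
        self._value = None

    def decide(self, v):
        with self._lock:
            if not self._decided:
                self._value, self._decided = v, True
            return self._value


class Opr:
    def __init__(self, inv=None):
        self.seq = 0
        self.inv = inv
        self.new_state = None
        self.response = None
        self.after = Consensus()   # decides which record follows this one


class LockFreeUniversal:
    def __init__(self, n, initial_state, apply):
        anchor = Opr()
        anchor.seq, anchor.new_state = 1, initial_state
        self.n, self.apply = n, apply
        self.head = [anchor] * n   # Head[0..n-1]

    def invoke(self, i, inv):
        head, point = self.head, Opr(inv)
        for j in range(self.n):    # find a record with the maximum sequence number
            if head[j].seq > head[i].seq:
                head[i] = head[j]
        while True:
            win = head[i].after.decide(point)   # try to thread our operation
            win.seq = head[i].seq + 1
            win.new_state, win.response = self.apply(win.inv, head[i].new_state)
            head[i] = win                       # advance to the following record
            if win is point:
                return point.response


def counter_apply(inv, state):
    """Illustrative sequential spec: fetch-and-increment counter."""
    return state + 1, state


def run_counter(n, ops_per_process):
    """Illustrative harness: process i runs in its own thread; returns sorted responses."""
    obj = LockFreeUniversal(n, 0, counter_apply)
    responses = [[] for _ in range(n)]

    def worker(i):
        for _ in range(ops_per_process):
            responses[i].append(obj.invoke(i, "inc"))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(r for per_process in responses for r in per_process)
```

Several processes may write the same record's seq, new_state, and response fields, but they all write identical values because apply is deterministic, so the duplicate writes are harmless.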
A wait-free universal algorithm using consensus

Each operation is represented by a shared record of type opr.

typedef opr structure {
  seq        ; the operation’s sequence number (register)
  inv        ; the operation invocation, including its parameters (register)
  new-state  ; the new state of the object, after applying the operation (register)
  response   ; the response of the operation, including its return value (register)
  after      ; a pointer to the next record (consensus object)
}

We add a helping mechanism:

[Figure: the Announce array; each entry points to an opr record (seq, inv, new-state, response, after)]

When performing the operation with sequence number j, try to help process (j mod n).
A wait-free universal algorithm using consensus (cont’d)

Initially all Head and Announce entries point to the anchor record.

When inv occurs ; code for p_i
  Announce[i] := new opr, Announce[i].inv := inv, Announce[i].seq := 0
  for j = 0 to n-1 ; find a record with the maximum sequence number
    if Head[j].seq > Head[i].seq then Head[i] := Head[j]
  while Announce[i].seq = 0 do
    priority := (Head[i].seq + 1) mod n ; ID of process with priority
    if Announce[priority].seq = 0 ; if help is needed
      then point := Announce[priority] ; help the other process
      else point := Announce[i] ; perform own operation
    win := decide(Head[i].after, point)
    win.new-state, win.response := apply(win.inv, Head[i].new-state)
    win.seq := Head[i].seq + 1
    Head[i] := win
  return Announce[i].response
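A Python sketch of the helping mechanism, again as an illustration under assumptions rather than a definitive implementation. The `Consensus` class simulates a consensus object with a lock, `apply` is assumed pure, and `counter_apply` and `run_counter` are hypothetical helpers. One small deviation from the pseudocode: `Announce[priority]` is read once into a local variable before its seq is tested.

```python
import threading


class Consensus:
    """Simulated consensus object: the first proposed value is decided."""

    def __init__(self):
        self._lock = threading.Lock()
        self._decided = False
        self._value = None

    def decide(self, v):
        with self._lock:
            if not self._decided:
                self._value, self._decided = v, True
            return self._value


class Opr:
    def __init__(self, inv=None):
        self.seq = 0               # 0 means "not threaded yet"
        self.inv = inv
        self.new_state = None
        self.response = None
        self.after = Consensus()


class WaitFreeUniversal:
    def __init__(self, n, initial_state, apply):
        anchor = Opr()
        anchor.seq, anchor.new_state = 1, initial_state
        self.n, self.apply = n, apply
        self.head = [anchor] * n
        self.announce = [anchor] * n

    def invoke(self, i, inv):
        head, announce = self.head, self.announce
        announce[i] = Opr(inv)                     # announce our operation (seq = 0)
        for j in range(self.n):                    # find the maximum sequence number
            if head[j].seq > head[i].seq:
                head[i] = head[j]
        while announce[i].seq == 0:
            priority = (head[i].seq + 1) % self.n  # ID of process with priority
            candidate = announce[priority]
            # Help the priority process if its operation is pending, else thread our own.
            point = candidate if candidate.seq == 0 else announce[i]
            win = head[i].after.decide(point)
            win.new_state, win.response = self.apply(win.inv, head[i].new_state)
            win.seq = head[i].seq + 1              # written last: marks win as threaded
            head[i] = win
        return announce[i].response


def counter_apply(inv, state):
    """Illustrative sequential spec: fetch-and-increment counter."""
    return state + 1, state


def run_counter(n, ops_per_process):
    """Illustrative harness: process i runs in its own thread; returns sorted responses."""
    obj = WaitFreeUniversal(n, 0, counter_apply)
    responses = [[] for _ in range(n)]

    def worker(i):
        for _ in range(ops_per_process):
            responses[i].append(obj.invoke(i, "inc"))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(r for per_process in responses for r in per_process)
```

Writing seq after new_state and response matters here: the owner leaves its loop as soon as it sees a nonzero seq, so the response must already be in place by then.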
A proof that the universal algorithm using consensus is wait-free
A bounded-memory wait-free universal algorithm using consensus

What is the number of records needed by the algorithm? Unbounded!

The following algorithm uses a bounded number of records:
- Each process allocates records from its private pool.
- A record is recycled once we’re sure it will not be referenced anymore.

We don’t need this mechanism if we use a language with a GC (such as Java).
A bounded-memory wait-free universal algorithm using consensus (cont’d)

When can we recycle record #k?
- No process trying to thread record (k+n+1) or higher will write record k.
- After all the processes that thread records k…k+n terminate, record k can be freed.
- When process p finishes threading record m, it releases records m-1…m-n.
- After record k is released by the operations threading records k+1…k+n, it can be recycled.
A bounded-memory wait-free universal algorithm using consensus: data structures

Each operation is represented by a shared record of type opr.

typedef opr structure {
  seq             ; the operation’s sequence number (register)
  inv             ; the operation invocation, including its parameters (register)
  new-state       ; the new state of the object, after applying the operation (register)
  response        ; the response of the operation, including its return value (register)
  after           ; a pointer to the next record (consensus object)
  before          ; a pointer to the previous record
  released[1..n]  ; initially true
}

[Figure: the Head array pointing into a chain of opr records (seq, inv, new-state, response, before, after, released) ending at the anchor record]
A bounded-memory wait-free universal algorithm using consensus (cont’d)

Initially all Head and Announce entries point to the anchor record.

When inv occurs ; code for p_i
  point := a free record from private pool, point.inv := inv, point.seq := 0
  for r := 1 to n do point.released[r] := false
  Announce[i] := point
  for j = 0 to n-1 ; find a record with the maximum sequence number
    if Head[j].seq > Head[i].seq then Head[i] := Head[j]
  while Announce[i].seq = 0 do
    priority := (Head[i].seq + 1) mod n ; ID of process with priority
    if Announce[priority].seq = 0 ; if help is needed
      then point := Announce[priority] ; help the other process
      else point := Announce[i] ; perform own operation
    win := decide(Head[i].after, point)
    win.new-state, win.response := apply(win.inv, Head[i].new-state)
    win.before := Head[i]
    win.seq := Head[i].seq + 1
    Head[i] := win
  temp := Announce[i].before
  for r := 1 to n do
    if temp <> anchor then
      before-temp := temp.before, temp.released[r] := true, temp := before-temp
  return Announce[i].response
How many records are required by the algorithm?
- Each incomplete operation may waste n distinct records.
- There may be up to n incomplete operations.
- At any point in time, there are up to n² non-recyclable records.
- All non-recyclable records may belong to the same process!
- Each pool should have O(n²) records; O(n³) total records are needed.
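The counting argument above can be written out as a short worked bound. This only restates the slide's figures; the labels are descriptive and not part of the original notation.

```latex
\begin{align*}
\text{non-recyclable records at any time}
  &\le \underbrace{n}_{\text{incomplete operations}} \cdot
       \underbrace{n}_{\text{records held by each}} = n^{2}, \\
\text{records per pool}
  &= O(n^{2}) \quad \text{(worst case: all of them come from one process's pool)}, \\
\text{total records}
  &= n \cdot O(n^{2}) = O(n^{3}).
\end{align*}
```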