1 Principles of Reliable Distributed Systems Recitation 8 ◊S-based Consensus Spring 2009 Alex Shraer.

2 Reminder: ◊P and ◊S Failure Detectors
◊P - Eventually Perfect:
–Strong Completeness: From some point on, every faulty process is suspected by every correct process
–Eventual Strong Accuracy: From some point on, no correct process is suspected
◊S - Eventually Strong:
–Strong Completeness
–Eventual Weak Accuracy: There exists some correct process that is not suspected by any correct process from some point on
Processes do not know who this process is
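The two ◊S properties can be illustrated on a finite suspicion trace. This is only a sketch: "from some point on" is a statement about infinite runs, so a finite trace can at best suggest that it holds. The trace format (a list of steps, each mapping a correct process to the set it currently suspects) and the function names are assumptions made for this example.

```python
def trusted_from(trace, correct, start):
    """Correct processes that no correct process suspects at any step >= start."""
    trusted = set(correct)
    for step in trace[start:]:
        for p in correct:
            trusted -= step[p]
    return trusted

def complete_from(trace, correct, faulty, start):
    """True if every correct process suspects every faulty process at all steps >= start."""
    return all(faulty <= step[p] for step in trace[start:] for p in correct)

# Hypothetical run: processes 1..4, process 4 crashes, 1..3 are correct.
correct, faulty = {1, 2, 3}, {4}
trace = [
    {1: {2}, 2: set(), 3: {1}},    # early, arbitrary suspicions
    {1: {4}, 2: {4, 3}, 3: {4}},   # process 2 keeps wrongly suspecting 3
    {1: {4}, 2: {4, 3}, 3: {4}},
]

suffix_trusted = trusted_from(trace, correct, 1)          # processes 1 and 2
suffix_complete = complete_from(trace, correct, faulty, 1)
```

On the suffix starting at step 1 the trace looks like ◊S (completeness holds and some correct process is never suspected) but not like ◊P, because correct process 3 is still suspected by process 2.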

3 Our Model
n processes 1,…,n
Reliable links between correct processes
Asynchronous
–Messages can be delayed arbitrarily
Non-assumption
–Processes take steps at asynchronous times
No clocks
◊S failure detector
t<n/2 crash failures
–Optimal for ◊S (Chandra, Toueg JACM 96)
–A process that crashes at any point in a run is faulty in that run

4 ◊S-based Consensus: MR Algorithm [ Mostefaoui, Raynal 99 ]
Asynchronous rounds:
–Each process locally progresses through rounds r = 1, 2, 3, …
–Different processes can progress at different times
Rotating coordinator
–Process ((r-1) mod n)+1 is the coordinator of round r
Each round consists of two phases
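The rotation rule, in the form the algorithm's pseudocode uses (coord ← (r-1 mod n)+1 with processes numbered 1,…,n), can be sketched as a one-line function. The function name is chosen for this example only.

```python
def coordinator(r: int, n: int) -> int:
    """Coordinator of round r (r >= 1) among processes numbered 1..n."""
    return (r - 1) % n + 1

# With n = 4, rounds 1..6 are coordinated by processes 1, 2, 3, 4, 1, 2.
first_rounds = [coordinator(r, 4) for r in range(1, 7)]
```

The "-1 ... +1" shift matters because processes are numbered from 1: a plain r mod n would yield 0 in round n, which is not a process id.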

5 ◊S-based Consensus [ Mostefaoui, Raynal 99 ]
val ← input; est ← ⊥
|| for r = 1, 2, … do
     coord ← ((r-1) mod n)+1
     Phase 1:
     if I am coord, then send (r,val) to all
     wait for ( (r, v) from coord OR suspect coord (by ◊S))
     if receive v from coord then est ← v else est ← ⊥
     Phase 2:
     send (r, est) to all
     wait for (r,e) from n-t processes
     if any non-⊥ value e received then val ← e
     if all received e’s have same non-⊥ value v then
       send (“decide”, v) to all
       return(v)
|| Upon receive (“decide”, v), forward to all; return(v)
Note: the only values sent are the coord’s val and ⊥
Note: return is like decide and halt
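A single round of the algorithm can be simulated in a few lines. This is a simplified sketch, not the authors' implementation: it assumes no process crashes during the round, and it replaces the failure detector and asynchronous delivery with two hypothetical inputs, `suspects_coord` (processes that suspect the coordinator before its message arrives) and `heard` (for each process, the n-t senders whose phase-2 messages it receives first).

```python
BOTTOM = None  # stands in for ⊥

def run_round(r, n, t, val, suspects_coord, heard):
    """Simulate round r for all processes; return (new_val, decided)."""
    coord = (r - 1) % n + 1

    # Phase 1: est is the coordinator's val, or ⊥ if the coordinator was suspected.
    est = {p: BOTTOM if p in suspects_coord else val[coord] for p in val}

    # Phase 2: each process looks at est values from n-t processes.
    new_val, decided = dict(val), {}
    for p in val:
        assert len(heard[p]) == n - t
        received = [est[q] for q in heard[p]]
        non_bottom = [e for e in received if e is not BOTTOM]
        if non_bottom:
            # Any non-⊥ value is the coordinator's val, so adopt it.
            new_val[p] = non_bottom[0]
        if len(non_bottom) == len(received):
            # All received values are the same non-⊥ value: decide.
            decided[p] = non_bottom[0]
    return new_val, decided

# Example with n = 4, t = 1: process 4 suspects coordinator 1 in round 1.
val = {1: "a", 2: "b", 3: "c", 4: "d"}
heard = {1: [1, 2, 3], 2: [1, 2, 3], 3: [2, 3, 4], 4: [2, 3, 4]}
new_val, decided = run_round(1, 4, 1, val, {4}, heard)
```

In this run processes 1 and 2 decide "a", while 3 and 4 see a ⊥ and only adopt "a" as their val for the next round.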

6 MR Principles: Phase 1
The purpose of the 1st phase:
–Ensure that for every p_i, est_i ∈ {val_coord, ⊥}
Progress: why does the 1st phase terminate?
–If the coordinator is correct, its message eventually reaches every correct process (links are reliable)
–If the coordinator crashes, then by the Strong Completeness property of ◊S every correct process eventually either receives a message the coordinator sent before crashing, or suspects the crashed coordinator
Note: Because of asynchrony, and since the failure detector is unreliable, some of the processes may have est = ⊥ while others have est = val_coord

7 MR Principles: Phase 2
A process p_i finishes the 2nd phase when it has received (r, est) from n-t processes, i.e., from a majority (since t<n/2)
Why is the majority important?
–Every two majority sets intersect
–If one process got n-t values of v:
  if all received e’s have same non-⊥ value v then send (“decide”, v) to all; return(v)
–then any other process that finishes the phase got at least one value of v:
  if any non-⊥ value e received then val ← e
–Thus, if process p_i decides v during r, and if process p_j progresses to r+1, then p_j does it with val = v
The purpose of the 2nd phase:
–Ensure that the Agreement property is never violated
Progress: why does the 2nd phase terminate?
–Since there are at least n-t correct processes
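The intersection argument can be checked by brute force for small n. The sketch below enumerates all sets of n-t processes and tests whether every pair shares a member; the function name is chosen for this example. The underlying arithmetic is that two sets of size n-t out of n must overlap whenever 2(n-t) > n, which is the same condition as t < n/2.

```python
from itertools import combinations

def quorums_intersect(n: int, t: int) -> bool:
    """True if every two sets of n-t processes (out of 1..n) share a member."""
    procs = range(1, n + 1)
    quorums = [set(q) for q in combinations(procs, n - t)]
    return all(a & b for a in quorums for b in quorums)

ok_4_1 = quorums_intersect(4, 1)   # t < n/2: quorums of size 3 always overlap
ok_4_2 = quorums_intersect(4, 2)   # t = n/2: {1, 2} and {3, 4} are disjoint
ok_5_2 = quorums_intersect(5, 2)   # t < n/2: quorums of size 3 out of 5 overlap
```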

8 Second Phase (cont’d)
Notation:
–v = val_coord
–rec_i: the set of est values received by p_i at the end of phase 2
–rec_i = {⊥} or {v} or {v, ⊥}
Consider three cases:
1. rec_i = {v} ⇒ (rec_j = {v}) or (rec_j = {⊥, v}) ⇒ decide v
–p_i knows that all correct processes also know v, and would either decide v or set val = v
2. rec_i = {⊥} ⇒ (rec_j = {⊥}) or (rec_j = {⊥, v}) ⇒ skip to the next round
–p_i knows that all other processes include ⊥ in their rec_j. No other process can decide
3. rec_i = {v, ⊥} ⇒ (rec_j = {v}) or (rec_j = {⊥}) or (rec_j = {⊥, v}) ⇒ update val to v (why?)
–Some process might have decided v, so EVERY such p_i proceeds to the next round with val = v
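The three cases map directly onto a small decision function. This is a sketch with names chosen for the example; it assumes, as the algorithm guarantees, that the received set contains at most one non-⊥ value.

```python
BOTTOM = None  # stands in for ⊥

def phase2_action(rec, val):
    """Given the set rec of received est values and the current val,
    return (new_val, decision), where decision is None if nothing is decided."""
    non_bottom = rec - {BOTTOM}
    if not non_bottom:
        return val, None      # case 2: only ⊥ received, go to the next round
    (v,) = non_bottom         # raises ValueError if two distinct values appear
    if BOTTOM in rec:
        return v, None        # case 3: adopt v, someone may have decided it
    return v, v               # case 1: only v received, decide v

case1 = phase2_action({"v"}, "x")
case2 = phase2_action({BOTTOM}, "x")
case3 = phase2_action({"v", BOTTOM}, "x")
```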

9
val ← input; est ← ⊥
|| for r = 1, 2, … do
     coord ← ((r-1) mod n)+1
     Phase 1:
     if I am coord, then send (r,val) to all
     wait for ( (r, v) from coord OR suspect coord (by ◊S))
     if receive v from coord then est ← v else est ← ⊥
     Phase 2:
     send (r, est) to all
     wait for (r,e) from n-t processes
     if any non-⊥ value e received then val ← e
     if all received e’s have same non-⊥ value v then
       send (“decide”, v) to all
       return(v)
|| Upon receive (“decide”, v), forward to all; return(v)
Do all processes decide at the same time?
Why do we need “send (“decide”, v) to all”?

10 [Figure: example run with n=4, t=1, spanning Round 1 and Round 2. Labels: “Suspect p1”, “return(v)”.]
They are stuck waiting for n – t = 3 messages

11 Disseminating the decision
Q: OK, so we need the “send (“decide”, v) to all”. But why “forward to all”?
–|| Upon receive (“decide”, v), forward to all; return(v)
A: To prevent a process from blocking forever. A process that decides uses reliable broadcast to disseminate its decision value.

12 [Figure: example run with n=4, t=1, spanning Round 1 and Round 2. Labels: “Suspect p1”, “return(v)”, “Stuck again…”.]
The “decide” message reaches only one process since the sender crashes. We need the receiver to forward to all, i.e., reliable broadcast
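The effect of forwarding can be sketched with a tiny delivery simulation. It is a simplification under stated assumptions: the crashed decider managed to reach exactly one process (`first_receiver`), links between the remaining processes are reliable, and the function and parameter names are hypothetical.

```python
from collections import deque

def disseminate(n, first_receiver, value, crashed, forward=True):
    """Deliver a ("decide", value) message starting at first_receiver.
    Returns a dict mapping each process that decides to the decided value."""
    decided = {}
    queue = deque([first_receiver])
    while queue:
        p = queue.popleft()
        if p in crashed or p in decided:
            continue              # crashed processes and repeat deliveries are ignored
        decided[p] = value        # return(v) at process p
        if forward:
            # "forward to all": relay the decision before halting
            queue.extend(q for q in range(1, n + 1) if q != p)
    return decided

# n = 4: p1 decides, reaches only p2, then crashes.
with_forward = disseminate(4, 2, "v", crashed={1})
without_forward = disseminate(4, 2, "v", crashed={1}, forward=False)
```

With forwarding, processes 2, 3 and 4 all decide; without it only process 2 decides and halts, so 3 and 4 can no longer collect n-t messages.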
