1 Principles of Reliable Distributed Systems
Recitation 7: Byzantine Consensus without Authentication; ◊S-based Consensus
Spring 2008, Alex Shraer
2 Part I Berman & Garay Algorithm for Unauthenticated Byzantine Consensus ([AW] section 5.2.5)
3 Reminder
Synchronous, Byzantine fault-tolerant, t-resilient consensus algorithms exist:
– Strong unanimity with authentication: iff t < n/2 (Lecture 6)
– Weak unanimity with authentication: iff t < n (Recitation 6)
– Without authentication: iff t < n/3
  – EIG algorithm in Lecture 6 (not covered in class); [AW] section 5.2.4
4 Model: Unauthenticated Byzantine
Round-based, synchronous
Static set P = {p_1, …, p_n} of processes
t-out-of-n Byzantine (arbitrary) failures
No signatures (no authentication)
– But a receiver of a message can determine who sent it: secure point-to-point channels
Messages between correct processes are not lost
This model was defined in Lamport, Shostak, Pease: The Byzantine Generals Problem, 1982
5 Berman & Garay (this tutorial)
– 2(t+1) rounds: twice the optimal, not early-deciding
– t < n/4 (n/4-resilient): not optimal
– O(1) message size: optimal
EIG Algorithm (Lecture 6, but we didn't go over this in class)
– t+1 rounds: optimal, not early-deciding
– t < n/3: optimal
– Exponential messages: size on the order of n^(t+2)
6 Berman & Garay: Algorithm Structure
Every process has a preference
– Initialized to its input
– After t+1 phases becomes a decision
t+1 phases, 2 rounds each
– Process p_k is the king of phase k
7 The Algorithm's Phases
Odd round (round 2k – 1, 1 ≤ k ≤ t+1)
– Processes exchange preference values
– Compute the majority value (⊥ if none)
– Denote by mult the number of votes for the majority value
Even round (round 2k, 1 ≤ k ≤ t+1)
– King (process p_k) broadcasts its majority value
– Receive king's majority (⊥ if none)
– Update preference:
    if (mult > n/2 + t) then preference ← majority
    else preference ← king's majority
After t+1 phases, decide on preference
Note: the king is ignored if the majority value has more than n/2 + t votes
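The two rounds of a phase can be sketched as a small simulation. This is a minimal sketch, not the reference implementation: the function name `phase_king`, the 0-based process indices, and the adversary (each faulty process sends an independent random bit to every receiver) are assumptions made for illustration, and t is taken to be the number of faulty processes.

```python
import random
from collections import Counter

BOTTOM = None  # stands in for the "no majority" value ⊥

def phase_king(inputs, faulty, seed=0):
    """Simulate t+1 phases of two rounds; return the decisions of correct processes.

    inputs: initial values; index i holds the input of process p_(i+1).
    faulty: indices of Byzantine processes (hypothetical adversary: each one
            sends an independent random bit to every receiver).
    """
    rng = random.Random(seed)
    n, t = len(inputs), len(faulty)
    if n <= 4 * t:
        raise ValueError("the algorithm assumes t < n/4")
    pref = list(inputs)
    correct = [i for i in range(n) if i not in faulty]

    def sent(sender, value):
        # What `sender` delivers to one particular receiver in this round.
        return rng.choice([0, 1]) if sender in faulty else value

    for k in range(t + 1):  # phase k+1; its king is the process with index k
        # Odd round: exchange preferences, compute majority and its vote count.
        maj, mult = {}, {}
        for i in correct:
            votes = Counter(sent(j, pref[j]) for j in range(n))
            value, count = votes.most_common(1)[0]
            maj[i] = value if count > n / 2 else BOTTOM
            mult[i] = count
        # Even round: the king broadcasts its majority; others keep or adopt.
        for i in correct:
            king_value = sent(k, maj.get(k))
            pref[i] = maj[i] if mult[i] > n / 2 + t else king_value
    return {i: pref[i] for i in correct}

# Mixed inputs, n = 9, t = 2, and the first two kings are faulty.
print(phase_king([0, 1, 0, 1, 1, 0, 1, 0, 1], faulty={0, 1}))
```

In this sketch a correct king with no majority broadcasts ⊥, so a unanimous decision of ⊥ is possible when inputs are mixed; a practical variant would substitute a default value.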
8 Correctness
Termination: immediate, since every process decides after t+1 phases
Lemma 1: If all correct processes prefer v at the beginning of phase k, then they all prefer v at the end of phase k.
Proof:
– Suppose that all correct processes prefer v at the beginning of phase k
– Each correct process receives at least n – t copies of v (including its own) in the first round of phase k
– Note: n > 4t ⇒ n/2 > 2t ⇒ n > n/2 + 2t ⇒ n – t > n/2 + t
– So each correct process computes majority v with mult > n/2 + t and keeps its own majority, ignoring the king
– Thus, all correct processes prefer v at the end of phase k
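The inequality used in the proof of Lemma 1 can be checked numerically. The sketch below uses a hypothetical helper name and an arbitrarily chosen range of n and t; it is a sanity check, not a proof.

```python
def lemma1_threshold_holds(n, t):
    """True if n - t unanimous votes exceed the 'ignore the king' threshold n/2 + t."""
    return n - t > n / 2 + t

# Every (n, t) with n > 4t in a small, arbitrarily chosen range passes.
assert all(lemma1_threshold_holds(n, t)
           for t in range(0, 50)
           for n in range(4 * t + 1, 4 * t + 200))

# With n = 4t the threshold can fail, e.g. n = 4, t = 1: 3 > 3 is false.
assert not lemma1_threshold_holds(4, 1)
```

The second assertion shows where the argument needs n > 4t: at n = 4t the n – t unanimous votes only reach the threshold and do not exceed it.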
9 Validity
Validity (Strong Unanimity)
– If the input of all the correct processes is v, then no correct process decides a value other than v
Proof (by induction on the phase number, using Lemma 1):
– Suppose all correct processes start with the same input v
– They continue to prefer v throughout the phases
  – by Lemma 1; and
  – since the preference at the end of one phase is the preference at the beginning of the next
– Finally, they decide on v at the end of phase t + 1
10 Agreement
Observation: There are at most t faulty processes and t+1 phases, each with a different king. Therefore, there is at least one phase whose king is correct
Lemma 2: Let k be a phase whose king p_k is correct. Then all the correct processes finish this phase with the same preference
Proof: consider phase k
Case 1: All correct processes use the king's majority for their preference. Since the king is correct, it sends everyone the same value, and we're done
Case 2: Some correct process p_i uses its own majority value, v, for its preference
– Thus, in the first round, p_i receives more than n/2 + t votes for v
– At most t of these votes come from faulty processes, so more than n/2 correct processes sent v
– Every correct process receives more than n/2 votes for v and sets its majority to v in the first round. Including the king!
– Thus, in the second round, all correct processes set their preference to v, whether they adopt the king's majority or their own
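Both counting steps in the agreement argument can be checked exhaustively for a small instance. This is a sketch under assumptions: the helper names are hypothetical, and n = 9, t = 2 is an arbitrary example satisfying n > 4t.

```python
from itertools import combinations

def has_correct_king(t, faulty):
    """Kings of phases 1..t+1 are p_1..p_(t+1); is at least one of them not faulty?"""
    return any(k not in faulty for k in range(1, t + 2))

def correct_votes_lower_bound(votes_for_v, t):
    """At most t of the received votes can come from faulty processes."""
    return votes_for_v - t

n, t = 9, 2

# Observation: whichever (at most t) processes are faulty, some king is correct.
assert all(has_correct_king(t, set(f))
           for size in range(t + 1)
           for f in combinations(range(1, n + 1), size))

# Case 2: more than n/2 + t votes for v implies more than n/2 correct senders of v.
for votes in range(n + 1):
    if votes > n / 2 + t:
        assert correct_votes_lower_bound(votes, t) > n / 2
```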
11 Agreement – Cont.
Lemma 2: Let k be a phase whose king p_k is correct. Then all the correct processes finish this phase with the same preference
Thus, at the end of phase k all correct processes have the same preference
– If k = t + 1, they all decide the same value
– Otherwise, they all have the same preference at the start of phase k+1, and Lemma 1 applies to each remaining phase
12 Optimal Synchronous Byzantine Agreement in all Regards?
Garay & Moses algorithm (STOC 1993)
– n/3-resilient
– t+1 rounds
– Polynomial messages
13 Part II ◊S-based Consensus [Mostefaoui, Raynal 99]
14 Reminder: ◊P and ◊S Failure Detectors
◊P - Eventually Perfect:
– Strong Completeness: From some point on, every faulty process is suspected by every correct process
– Eventual Strong Accuracy: From some point on, no correct process is suspected
◊S - Eventually Strong:
– Strong Completeness
– Eventual Weak Accuracy: There exists some correct process that, from some point on, is not suspected by any correct process
  – Processes do not know who this process is
15 Our Model
n processes 1,…,n
Reliable links between correct processes
Asynchronous (a non-assumption)
– Messages can be delayed arbitrarily
– Processes take steps at asynchronous times
– No clocks
◊S failure detector
t < n/2 crash failures
– Optimal for ◊S (Chandra, Toueg JACM 96)
– A process that crashes at any point in a run is faulty in that run
16 ◊S-based Consensus: MR Algorithm [Mostefaoui, Raynal 99]
Asynchronous rounds:
– Each process locally progresses through rounds r = 1, 2, 3, …
– Different processes can progress at different times
Rotating coordinator
– Process ((r – 1) mod n) + 1 is the coordinator of round r
Each round consists of two phases
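A minimal sketch of the rotation, assuming processes are numbered 1..n and using the formula coord ← ((r – 1) mod n) + 1 from the algorithm's pseudocode. The function name is illustrative only.

```python
def coordinator(r, n):
    """Coordinator of round r (r = 1, 2, ...) among processes numbered 1..n."""
    return (r - 1) % n + 1

# Rounds 1..9 with n = 4 processes.
print([coordinator(r, 4) for r in range(1, 10)])  # [1, 2, 3, 4, 1, 2, 3, 4, 1]
```

The shift by one matters with 1-based numbering: plain r mod n would yield 0 in round r = n, which is not a process identifier.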
17 ◊S-based Consensus [Mostefaoui, Raynal 99]
val ← input; est ← ⊥
|| for r = 1, 2, … do
     coord ← ((r – 1) mod n) + 1
     Phase 1:
       if I am coord, then send (r, val) to all
       wait for ((r, v) from coord OR suspect coord (by ◊S))
       if receive v from coord then est ← v else est ← ⊥
     Phase 2:
       send (r, est) to all
       wait for (r, e) from n – t processes
       if any non-⊥ value e received then val ← e
       if all received e's have same non-⊥ value v then
         send ("decide", v) to all
         return(v)
|| Upon receive ("decide", v), forward to all; return(v)
Note: the only values sent are the coord's val and ⊥
Note: return is like decide and halt
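The pseudocode can be exercised with a small single-threaded simulation. This is a sketch under stated assumptions, not the authors' implementation: processes are Python generators driven by a random scheduler, crashed processes are crashed from the start, and the failure-detector oracle `suspects` makes arbitrary mistakes until step `stable_at` and is accurate afterwards. The names `mr_process`, `run`, `stable_at`, `max_steps`, and the message tags "coord" and "est" are all hypothetical.

```python
import random

BOTTOM = None  # stands in for ⊥; inputs are assumed never to be None

def mr_process(i, n, t, val, inbox, send, suspects):
    """Process i as a generator: each yield means 'still waiting';
    the generator's return value is the decision."""
    def got(kind, r):
        return [m[2] for m in inbox if m[:2] == (kind, r)]

    def decided():
        return [m[1] for m in inbox if m[0] == "decide"]

    r = 0
    while True:
        r += 1
        coord = (r - 1) % n + 1
        # Phase 1: adopt the coordinator's val, or ⊥ if the coordinator is suspected.
        if i == coord:
            send(("coord", r, val))
        while not (got("coord", r) or decided() or suspects(i, coord)):
            yield
        if not decided():
            est = got("coord", r)[0] if got("coord", r) else BOTTOM
            # Phase 2: exchange estimates and wait for n - t of them.
            send(("est", r, est))
            while not (len(got("est", r)) >= n - t or decided()):
                yield
        if decided():
            send(("decide", decided()[0]))   # forward to all, then halt
            return decided()[0]
        seen = [e for e in got("est", r)[: n - t] if e is not BOTTOM]
        if seen:
            val = seen[0]                    # every non-⊥ estimate equals coord's val
        if len(seen) == n - t:
            send(("decide", val))
            return val

def run(inputs, crashed=frozenset(), seed=0, stable_at=50, max_steps=100_000):
    rng = random.Random(seed)
    n = len(inputs)
    t = (n - 1) // 2                         # largest t with t < n/2
    inboxes = {i: [] for i in range(1, n + 1)}
    transit = []                             # messages sent but not yet delivered
    clock = 0

    def send(msg):
        transit.extend((dst, msg) for dst in inboxes)

    def suspects(i, j):
        if clock < stable_at:
            return rng.random() < 0.3        # arbitrary mistakes before stabilization
        return j in crashed

    procs = {i: mr_process(i, n, t, inputs[i - 1], inboxes[i], send, suspects)
             for i in inboxes if i not in crashed}
    decisions = {}
    while procs and clock < max_steps:
        clock += 1
        if transit:                          # deliver one randomly chosen message
            dst, msg = transit.pop(rng.randrange(len(transit)))
            inboxes[dst].append(msg)
        i = rng.choice(sorted(procs))        # let one randomly chosen process step
        try:
            next(procs[i])
        except StopIteration as done:
            decisions[i] = done.value
            del procs[i]
    return decisions

print(run([3, 5, 7, 9], crashed={2}))
```

The random scheduler only models message delay and interleaving. Crashes in the middle of a broadcast, which motivate the forwarding rule, are not modeled here.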
18 MR Principles: Phase 1
The purpose of the 1st phase:
– Ensure that for every p_i, est_i ∈ {val_coord, ⊥}
Progress: why does the 1st phase terminate?
– If the coordinator is correct, its message eventually reaches every correct process (reliable links)
– If the coordinator crashes, then by the Strong Completeness property of ◊S every correct process eventually suspects it
– Either way, every correct process eventually receives a message from the coordinator or suspects it
Note: Because of asynchrony, and since the failure detector is unreliable, some of the processes may have est = ⊥ while others have est = val_coord
19 MR Principles: Phase 2
A process p_i finishes the 2nd phase when it has received (r, est) from n – t processes, which is a majority since t < n/2
Why is the majority important?
– Every two majority sets intersect
– If one process got n – t values of v:
    if all received e's have same non-⊥ value v then send ("decide", v) to all; return(v)
– then every other process that completes the phase got at least one value of v:
    if any non-⊥ value e received then val ← e
– Thus, if process p_i decides v during round r, and process p_j progresses to round r+1, then p_j does so with val = v
The purpose of the 2nd phase:
– Ensure that the Agreement property is never violated
Progress: why does the 2nd phase terminate?
– Since there are at least n – t correct processes
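The intersection property behind the second phase can be verified by brute force for small n. A minimal sketch with a hypothetical helper name:

```python
from itertools import combinations

def quorums_intersect(n, t):
    """True if every two sets of n - t processes share at least one member."""
    quorums = [set(q) for q in combinations(range(n), n - t)]
    return all(a & b for a in quorums for b in quorums)

assert quorums_intersect(5, 2)       # t < n/2: any two 3-sets out of 5 meet
assert not quorums_intersect(4, 2)   # t = n/2: {0, 1} and {2, 3} are disjoint
```

The failing case shows why the bound t < n/2 is needed: with t = n/2, one process could collect only v and another only ⊥ from disjoint sets of senders.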
20 Second Phase (cont'd)
Notation:
– v = val_coord
– rec_i: the set of est values received by p_i by the end of phase 2
– rec_i = {⊥} or {v} or {v, ⊥}
Consider three cases (p_j is any other process that completes phase 2 of the same round):
1. rec_i = {v} ⇒ rec_j = {v} or rec_j = {⊥, v}; p_i decides v
2. rec_i = {⊥} ⇒ rec_j = {⊥} or rec_j = {⊥, v}; p_i skips to the next round
3. rec_i = {v, ⊥} ⇒ rec_j = {v} or rec_j = {⊥} or rec_j = {⊥, v}; p_i updates val to v (why?)
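The three cases can be written as a small decision table. This is an illustrative sketch: the function names and the returned strings are hypothetical, and `can_coexist` encodes only the constraint that follows from majority intersection, namely that rec_i = {v} and rec_j = {⊥} cannot occur in the same round.

```python
BOTTOM = None  # stands in for ⊥

def phase2_action(rec, v):
    """Map the set rec_i of received estimates to the step p_i takes."""
    if rec == {v}:
        return "decide v"
    if rec == {BOTTOM}:
        return "next round, val unchanged"
    if rec == {v, BOTTOM}:
        return "set val to v, next round"
    raise ValueError("rec must be {v}, {⊥} or {v, ⊥}")

def can_coexist(rec_i, rec_j, v):
    """Majority intersection rules out one process seeing only v and another only ⊥."""
    forbidden = {frozenset({v}), frozenset({BOTTOM})}
    return {frozenset(rec_i), frozenset(rec_j)} != forbidden

assert phase2_action({7}, 7) == "decide v"
assert phase2_action({7, BOTTOM}, 7) == "set val to v, next round"
assert not can_coexist({7}, {BOTTOM}, 7)
assert can_coexist({7}, {7, BOTTOM}, 7)
```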
21 The algorithm again (same pseudocode as on slide 17), with two questions:
– Do all processes decide at the same time?
– Why do we need send ("decide", v) to all?
22 [Figure: example run with n = 4, t = 1, showing Round 1 and Round 2; labels: v on the messages, return(v), Suspect p1]
They are stuck waiting for n – t = 3 messages
23 Disseminating the decision
Q: OK, so we need the send ("decide", v) to all. But why "forward to all"?
– || Upon receive ("decide", v), forward to all; return(v)
A: To prevent a process from blocking forever. A process that decides uses reliable broadcast to disseminate its decision value.
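The effect of forwarding can be illustrated with a toy model in which the deciding process crashes after its "decide" message has reached only some processes. This is a sketch under assumptions: the function `disseminate` and its parameters are hypothetical, and every receiver is taken to be correct with reliable links to the others.

```python
from collections import deque

def disseminate(n, sender, reached, forward):
    """Processes other than the crashed sender that receive ("decide", v).

    sender:  the process that decided and crashed mid-broadcast.
    reached: the processes its "decide" message actually got to.
    forward: whether each receiver relays the message to all before halting.
    """
    delivered, queue = set(), deque(reached)
    while queue:
        p = queue.popleft()
        if p in delivered or p == sender:
            continue
        delivered.add(p)
        if forward:
            queue.extend(q for q in range(1, n + 1) if q != p)
    return delivered

# n = 4: p1 decides, crashes, and its message reaches only p2.
assert disseminate(4, sender=1, reached=[2], forward=False) == {2}
assert disseminate(4, sender=1, reached=[2], forward=True) == {2, 3, 4}
```

Without forwarding, p3 and p4 never learn the decision and cannot collect n – t estimates once p2 has halted; with forwarding, one correct receiver is enough to reach everyone.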
24 [Figure: example run with n = 4, t = 1, showing Round 1 and Round 2; labels: v on the messages, return(v), Suspect p1]
The "decide" message reaches only one process, since the sender crashes: stuck again…
We need the receiver to forward to all, i.e., reliable broadcast