1 Principles of Reliable Distributed Systems Recitation 7 Byz. Consensus without Authentication ◊S-based Consensus Spring 2008 Alex Shraer.

Presentation transcript:

1 Principles of Reliable Distributed Systems Recitation 7 Byz. Consensus without Authentication ◊S-based Consensus Spring 2008 Alex Shraer

2 Part I: Berman & Garay Algorithm for Unauthenticated Byzantine Consensus ([AW] Section 5.2.5)

3 Reminder: synchronous, Byzantine fault-tolerant, t-resilient consensus algorithms
– Strong unanimity, with authentication: iff t < n/2 (Lecture 6)
– Weak unanimity, with authentication: iff t < n (Recitation 6)
– Without authentication: iff t < n/3 (EIG algorithm in Lecture 6, which we didn't go over; [AW] Section 5.2.4)

4 Model: Unauthenticated Byzantine
Round-based synchronous
Static set P = {p_1, …, p_n} of processes
t-out-of-n Byzantine (arbitrary) failures
No signatures (no authentication)
– But: the receiver of a message can determine who sent it
Secure point-to-point channels
Messages between correct processes are not lost
This model was defined in Lamport, Shostak, Pease: The Byzantine Generals Problem (1982)

5 Berman & Garay (this tutorial):
– 2(t+1) rounds (twice the optimal, not early-deciding)
– n/4-resilient (not optimal)
– O(1) message size (optimal)
EIG Algorithm (Lecture 6, but we didn't go over this in class):
– t+1 rounds (optimal, not early-deciding)
– t < n/3 (optimal)
– Exponential messages, size O(n^(t+2))

6 Berman & Garay: Algorithm Structure
Every process has a preference
– Initialized to its input
– After t+1 phases, the preference becomes the decision
t+1 phases, 2 rounds each
– Process p_k is the king of phase k

7 The Algorithm's Phases
Phase k (1 ≤ k ≤ t+1):
Odd round (round 2k − 1):
– Processes exchange preference values
– Compute the majority value (⊥ if none)
– Denote by mult the number of votes for the majority
Even round (round 2k):
– The king (process p_k) broadcasts its majority value
– Receive the king's majority (⊥ if none)
– Update preference: if (mult > n/2 + t) then preference ← majority, else preference ← king's majority
After t+1 phases, decide on preference
Note: the king is ignored if the majority has more than n/2 + t votes
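To make the phase structure concrete, here is a minimal Python sketch of the loop as seen by one process. The synchronous-round primitive exchange(round, value) is an assumed helper, not part of the course material: it broadcasts my value and returns the list of n values received that round, with position i−1 holding the value from p_i (Byzantine senders may put anything there).

```python
BOTTOM = None  # stands for the "no value" symbol ⊥

def majority_and_mult(values):
    """Most frequent non-⊥ value and its vote count; (⊥, 0) if none."""
    counts = {}
    for v in values:
        if v is not BOTTOM:
            counts[v] = counts.get(v, 0) + 1
    if not counts:
        return BOTTOM, 0
    maj = max(counts, key=counts.get)
    return maj, counts[maj]

def berman_garay(my_id, my_input, n, t, exchange):
    preference = my_input
    for k in range(1, t + 2):                       # phases 1 .. t+1
        # Round 2k-1: everyone exchanges preferences.
        votes = exchange(2 * k - 1, preference)
        majority, mult = majority_and_mult(votes)
        # Round 2k: only the king p_k broadcasts its majority value
        # (non-kings send ⊥ here only to keep the helper uniform).
        round2 = exchange(2 * k, majority if my_id == k else BOTTOM)
        king_majority = round2[k - 1]               # value received from p_k
        if mult > n / 2 + t:
            preference = majority                   # king is ignored
        else:
            preference = king_majority
    return preference                               # decide after t+1 phases
```

With n > 4t and exchange behaving as a lossless synchronous round among correct processes, this mirrors the two rounds of each phase; everything named here (exchange, BOTTOM) is illustrative only.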

8 Correctness
Termination: immediate (the algorithm always runs exactly t+1 phases)
Lemma 1: If all correct processes prefer v at the beginning of phase k, then they prefer v at the end of phase k.
Proof: Suppose that all correct processes prefer v at the beginning of phase k
– Each correct process receives at least n − t copies of v (including its own) in the first round of phase k
– Note: n > 4t ⟹ n/2 > 2t ⟹ n > n/2 + 2t ⟹ n − t > n/2 + t
– So mult ≥ n − t > n/2 + t, and each correct process keeps its own majority value v, ignoring the king
– Thus, all correct processes prefer v at the end of phase k
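The counting step can be sanity-checked numerically (purely illustrative):

```python
# Whenever n > 4t, the n - t votes for v exceed the n/2 + t threshold,
# so the majority rule fires and v is kept regardless of the king.
for t in range(0, 20):
    for n in range(4 * t + 1, 4 * t + 50):
        assert n - t > n / 2 + t
```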

9 Validity
Validity (strong unanimity): if the input of all the correct processes is v, then no correct process decides a value other than v
Proof (by induction, using Lemma 1):
– Suppose all correct processes start with the same input v
– They continue to prefer v throughout the phases: by Lemma 1, and since the preference at the end of one phase is the preference at the beginning of the next
– Finally, they decide on v at the end of phase t + 1

10 Agreement
Observation: there are at most t faulty processes and t+1 phases, so there is at least one phase whose king is correct
Lemma 2: Let k be a phase whose king p_k is correct. Then all the correct processes finish this phase with the same preference.
Proof: consider phase k
– Case 1: all correct processes adopt the king's majority as their preference. Since the king is correct, it sends everyone the same value, and we're done
– Case 2: some correct process p_i uses its own majority value v as its preference
  – Then, in the first round, p_i received more than n/2 + t votes for v, so more than n/2 correct processes sent v
  – Hence every correct process receives more than n/2 votes for v in the first round and sets its majority to v, including the king
  – Thus, in the second round, all correct processes set their preference to v, whether they adopt the king's majority or their own

11 Agreement – Cont.
Lemma 2: Let k be a phase whose king p_k is correct. Then all the correct processes finish this phase with the same preference.
Thus, at the end of phase k all correct processes have the same preference
– If k = t + 1, they all decide the same value
– Otherwise, they all have the same preference at the start of phase k+1; by Lemma 1 they keep it through the remaining phases, and so decide the same value

12 Optimal Synchronous Byzantine Agreement in All Regards?
Garay & Moses algorithm (STOC 1993):
– n/3-resilient
– t+1 rounds
– Polynomial messages

13 Part II ◊S-based Consensus [Mostefaoui, Raynal 99]

14 Reminder: ◊P and ◊S Failure Detectors
◊P – Eventually Perfect:
– Strong Completeness: from some point on, every faulty process is suspected by every correct process
– Eventual Strong Accuracy: from some point on, no correct process is suspected
◊S – Eventually Strong:
– Strong Completeness
– Eventual Weak Accuracy: there exists some correct process that is not suspected by any correct process from some point on; processes do not know which process this is
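◊S cannot be implemented in a fully asynchronous system; practical implementations assume some eventual timeliness. The sketch below is a hypothetical heartbeat/timeout detector (all class and method names are illustrative, not from the course) showing how adaptive timeouts can eventually stop suspecting a timely correct process while crashed processes end up suspected.

```python
import time
import threading

class EventuallyStrongFD:
    """Hypothetical heartbeat/timeout failure detector sketch.

    If links to some correct process are eventually timely, doubling the
    timeout on every false suspicion means that process is eventually trusted
    forever (eventual weak accuracy); crashed processes stop sending
    heartbeats, so they are eventually suspected (strong completeness).
    """

    def __init__(self, processes, initial_timeout=1.0):
        self._lock = threading.Lock()
        self._timeout = {p: initial_timeout for p in processes}
        self._last_heartbeat = {p: time.time() for p in processes}

    def on_heartbeat(self, p):
        """Call whenever a heartbeat message from process p is delivered."""
        with self._lock:
            if self._late(p):
                # We suspected p wrongly: be more patient next time.
                self._timeout[p] *= 2
            self._last_heartbeat[p] = time.time()

    def _late(self, p):
        return time.time() - self._last_heartbeat[p] > self._timeout[p]

    def suspects(self):
        """Current set of suspected processes (may be temporarily wrong)."""
        with self._lock:
            return {p for p in self._timeout if self._late(p)}
```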

15 Our Model
n processes 1, …, n
Reliable links between correct processes
Asynchronous:
– Messages can be delayed arbitrarily
– No assumptions on when processes take steps; no clocks
◊S failure detector
t < n/2 crash failures
– Optimal for ◊S (Chandra, Toueg, JACM 96)
– A process that crashes at any point in a run is faulty in that run

16 ◊S-based Consensus: MR Algorithm [Mostefaoui, Raynal 99]
Asynchronous rounds:
– Each process locally progresses through rounds r = 1, 2, 3, …
– Different processes can progress at different times
Rotating coordinator:
– Process ((r−1) mod n) + 1 is the coordinator of round r
Each round consists of two phases

17 ◊S-based Consensus [Mostefaoui, Raynal 99]
val ← input; est ← ⊥
for r = 1, 2, … do
  coord ← ((r−1) mod n) + 1
  Phase 1:
    if I am coord, then send (r, val) to all
    wait for ((r, v) from coord OR suspect coord (by ◊S))
    if received v from coord then est ← v else est ← ⊥
  Phase 2:
    send (r, est) to all
    wait for (r, e) from n−t processes
    if any non-⊥ value e was received then val ← e
    if all received e's have the same non-⊥ value v then send ("decide", v) to all; return(v)
|| Upon receive ("decide", v): forward to all; return(v)
Note: the only values sent are the coord's val and ⊥
Note: return is like decide and halt

18 MR Principles: Phase 1
The purpose of the 1st phase:
– Ensure that for every p_i, est_i ∈ {val_coord, ⊥}
Progress: why does the 1st phase terminate?
– Every correct process eventually either receives the coordinator's message (links between correct processes are reliable), or, if the coordinator crashed, suspects it by the Strong Completeness property of ◊S
Note: because of asynchrony, and since the failure detector is unreliable, some of the processes may have est = ⊥ while others have est = val_coord

19 MR Principles: Phase 2
A process p_i finishes the 2nd phase when it has received (r, est) from a majority of processes
Why is the majority important? (see the sanity check below)
– Every two majority sets intersect
– If one process got n−t values of v: "if all received e's have the same non-⊥ value v then send ("decide", v) to all; return(v)"
– then every other process that completes the phase got at least one value of v: "if any non-⊥ value e was received then val ← e"
– Thus, if process p_i decides v during round r, and process p_j progresses to round r+1, then p_j does so with val = v
The purpose of the 2nd phase:
– Ensure that the Agreement property is never violated
Progress: why does the 2nd phase terminate?
– Since there are at least n−t correct processes, all of which send (r, est)
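The majority-intersection argument can be checked mechanically; this small brute-force check is purely illustrative and not part of the algorithm:

```python
from itertools import combinations

def any_two_quorums_intersect(n, t):
    """With t < n/2, any two sets of n - t processes share a member."""
    quorums = list(combinations(range(n), n - t))
    return all(set(a) & set(b) for a in quorums for b in quorums)

# Example: n = 5, t = 2. If one process received v from all 3 processes it
# heard from, any other process hearing from 3 processes got at least one v.
assert any_two_quorums_intersect(5, 2)
```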

20 Second Phase (cont'd)
Notation:
– v = val_coord
– rec_i – the set of est values received by p_i at the end of phase 2
– rec_i = {⊥} or {v} or {v, ⊥}
Consider three cases (see the sketch below):
1. rec_i = {v} ⟹ (rec_j = {v}) or (rec_j = {⊥, v}) ⟹ decide v
2. rec_i = {⊥} ⟹ (rec_j = {⊥}) or (rec_j = {⊥, v}) ⟹ skip to the next round
3. rec_i = {v, ⊥} ⟹ (rec_j = {v}) or (rec_j = {⊥}) or (rec_j = {⊥, v}) ⟹ update val to v (why? because some p_j may have rec_j = {v} and decide v, so p_i must carry v into the next round)
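The three cases correspond to a simple end-of-phase-2 rule. The sketch below is illustrative only; it relies on the note on slide 17 that the only values circulating in a round are the coordinator's val and ⊥.

```python
BOTTOM = None  # ⊥

def end_of_phase_two(received):
    """received: the n - t est values collected in phase 2 of some round."""
    non_bottom = {e for e in received if e is not BOTTOM}
    if not non_bottom:
        return ("next_round", BOTTOM)   # case 2: rec = {⊥}
    (v,) = non_bottom                   # only the coordinator's val can occur
    if BOTTOM not in received:
        return ("decide", v)            # case 1: rec = {v}; also send "decide" to all
    return ("adopt_val", v)             # case 3: rec = {v, ⊥}; set val <- v
```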

21 ◊S-based Consensus [Mostefaoui, Raynal 99] – the same pseudocode as on slide 17, shown again to ask:
– Do all processes decide at the same time?
– Why do we need "send ("decide", v) to all"?

22 Example: n = 4, t = 1 (figure: a two-round message diagram; in Round 1 one process suspects p1 and adopts ⊥ while the others adopt v; in Round 2 one process does return(v), and the remaining processes are stuck waiting for n − t = 3 messages)

23 Disseminating the Decision
Q: OK, so we need the "send ("decide", v) to all". But why "forward to all"?
– || Upon receive ("decide", v): forward to all; return(v)
A: to prevent a process from blocking forever. A process that decides uses reliable broadcast to disseminate its decision value.
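A sketch of that dissemination rule as an event handler; send_to_all and deliver are assumed helpers and the class name is illustrative:

```python
class DecisionDisseminator:
    """Implements: upon receive ("decide", v): forward to all; return(v)."""

    def __init__(self, send_to_all, deliver):
        self.send_to_all = send_to_all   # assumed best-effort broadcast helper
        self.deliver = deliver           # hands the decision to the application
        self.decided = False

    def on_decide(self, v):
        if self.decided:
            return
        self.decided = True
        # Forward before halting, so a single crash cannot swallow the decision.
        # Together with the decider's "send ("decide", v) to all", this acts as
        # a simple reliable broadcast of the decision value.
        self.send_to_all(("decide", v))
        self.deliver(v)                  # return(v): decide and halt
```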

24 Example: n = 4, t = 1 (figure: the same two-round diagram, but now the process that decides sends "decide", v and then crashes; the "decide" message reaches only one process. We need the receiver to forward it to all, i.e., reliable broadcast of the decision; otherwise the rest are stuck again)