Distributed systems Consensus Prof R. Guerraoui Distributed Programming Laboratory.

Slides:



Advertisements
Similar presentations
Distributed systems Total Order Broadcast Prof R. Guerraoui Distributed Programming Laboratory.
Advertisements

Impossibility of Distributed Consensus with One Faulty Process
Prof R. Guerraoui Distributed Programming Laboratory
The weakest failure detector question in distributed computing Petr Kouznetsov Distributed Programming Lab EPFL.
A General Characterization of Indulgence R. Guerraoui EPFL joint work with N. Lynch (MIT)
1 Distributed systems Causal Broadcast Prof R. Guerraoui Distributed Programming Laboratory.
Teaser - Introduction to Distributed Computing
6.852: Distributed Algorithms Spring, 2008 Class 7.
Distributed Systems Overview Ali Ghodsi
Outline. Theorem For the two processor network, Bit C(Leader) = Bit C(MaxF) = 2[log 2 ((M + 2)/3.5)] and Bit C t (Leader) = Bit C t (MaxF) = 2[log 2 ((M.
1 © R. Guerraoui Implementing the Consensus Object with Timing Assumptions R. Guerraoui Distributed Programming Laboratory.
1 © P. Kouznetsov On the weakest failure detector for non-blocking atomic commit Rachid Guerraoui Petr Kouznetsov Distributed Programming Laboratory Swiss.
Byzantine Generals Problem: Solution using signed messages.
Failure Detectors. Can we do anything in asynchronous systems? Reliable broadcast –Process j sends a message m to all processes in the system –Requirement:
1 Principles of Reliable Distributed Systems Lecture 6: Synchronous Uniform Consensus Spring 2005 Dr. Idit Keidar.
Failure Detectors & Consensus. Agenda Unreliable Failure Detectors (CHANDRA TOUEG) Reducibility ◊S≥◊W, ◊W≥◊S Solving Consensus using ◊S (MOSTEFAOUI RAYNAL)
1 Principles of Reliable Distributed Systems Lecture 3: Synchronous Uniform Consensus Spring 2006 Dr. Idit Keidar.
Distributed systems Module 2 -Distributed algorithms Teaching unit 1 – Basic techniques Ernesto Damiani University of Bozen Lesson 3 – Distributed Systems.
Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 5: Synchronous (Uniform)
 Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 7: Failure Detectors.
Asynchronous Consensus (Some Slides borrowed from ppt on Web.(by Ken Birman) )
Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 6: Synchronous Byzantine.
1 Fault-Tolerant Consensus. 2 Failures in Distributed Systems Link failure: A link fails and remains inactive; the network may get partitioned Crash:
Aran Bergman Eddie Bortnikov, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Recitation.
Distributed Systems Non-Blocking Atomic Commit
Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 5: Synchronous Uniform.
1 Principles of Reliable Distributed Systems Lecture 5: Failure Models, Fault-Tolerant Broadcasts and State-Machine Replication Spring 2005 Dr. Idit Keidar.
1 Principles of Reliable Distributed Systems Recitation 8 ◊S-based Consensus Spring 2009 Alex Shraer.
Distributed systems Module 2 -Distributed algorithms Teaching unit 1 – Basic techniques Ernesto Damiani University of Bozen Lesson 4 – Consensus and reliable.
Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 6: Synchronous Byzantine.
 Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 12: Impossibility.
Distributed Systems Tutorial 4 – Solving Consensus using Chandra-Toueg’s unreliable failure detector: A general Quorum-Based Approach.
On the Cost of Fault-Tolerant Consensus When There are no Faults Idit Keidar & Sergio Rajsbaum Appears in SIGACT News; MIT Tech. Report.
Systems of Distributed systems Module 2 - Distributed algorithms Teaching unit 2 – Properties of distributed algorithms Ernesto Damiani University of Bozen.
Distributed Systems Terminating Reliable Broadcast Prof R. Guerraoui Distributed Programming Laboratory.
 Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 7: Failure Detectors.
Efficient Algorithms to Implement Failure Detectors and Solve Consensus in Distributed Systems Mikel Larrea Departamento de Arquitectura y Tecnología de.
1 Principles of Reliable Distributed Systems Recitation 7 Byz. Consensus without Authentication ◊S-based Consensus Spring 2008 Alex Shraer.
Composition Model and its code. bound:=bound+1.
 Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 8: Failure Detectors.
Paxos Made Simple Jinghe Zhang. Introduction Lock is the easiest way to manage concurrency Mutex and semaphore. Read and write locks. In distributed system:
Distributed Consensus Reaching agreement is a fundamental problem in distributed computing. Some examples are Leader election / Mutual Exclusion Commit.
Failure detection and consensus Ludovic Henrio CNRS - projet OASIS Distributed Algorithms.
Distributed Algorithms – 2g1513 Lecture 9 – by Ali Ghodsi Fault-Tolerance in Distributed Systems.
Consensus and Its Impossibility in Asynchronous Systems.
CS4231 Parallel and Distributed Algorithms AY 2006/2007 Semester 2 Lecture 8 Instructor: Haifeng YU.
1 © R. Guerraoui Regular register algorithms R. Guerraoui Distributed Programming Laboratory lpdwww.epfl.ch.
CS294, Yelick Consensus revisited, p1 CS Consensus Revisited
CS 425/ECE 428/CSE424 Distributed Systems (Fall 2009) Lecture 9 Consensus I Section Klara Nahrstedt.
Sliding window protocol The sender continues the send action without receiving the acknowledgements of at most w messages (w > 0), w is called the window.
Chap 15. Agreement. Problem Processes need to agree on a single bit No link failures A process can fail by crashing (no malicious behavior) Messages take.
SysRép / 2.5A. SchiperEté The consensus problem.
1 © R. Guerraoui Distributed algorithms Prof R. Guerraoui Assistant Marko Vukolic Exam: Written, Feb 5th Reference: Book - Springer.
Impossibility of Distributed Consensus with One Faulty Process By, Michael J.Fischer Nancy A. Lynch Michael S.Paterson.
Chapter 21 Asynchronous Network Computing with Process Failures By Sindhu Karthikeyan.
Failure Detectors n motivation n failure detector properties n failure detector classes u detector reduction u equivalence between classes n consensus.
Alternating Bit Protocol S R ABP is a link layer protocol. Works on FIFO channels only. Guarantees reliable message delivery with a 1-bit sequence number.
CS4231 Parallel and Distributed Algorithms AY 2006/2007 Semester 2 Lecture 9 Instructor: Haifeng YU.
1 Fault-Tolerant Consensus. 2 Communication Model Complete graph Synchronous, network.
Unreliable Failure Detectors for Reliable Distributed Systems Tushar Deepak Chandra Sam Toueg Presentation for EECS454 Lawrence Leinweber.
Distributed systems Total Order Broadcast
Alternating Bit Protocol
Distributed Systems, Consensus and Replicated State Machines
Presented By: Md Amjad Hossain
Distributed systems Reliable Broadcast
Distributed Algorithms for Failure Detection in Crash Environments
Failure Detectors motivation failure detector properties
Distributed systems Consensus
CSE 486/586 Distributed Systems Reliable Multicast --- 2
Distributed Systems Terminating Reliable Broadcast
Presentation transcript:

Distributed systems Consensus Prof R. Guerraoui Distributed Programming Laboratory

2 Consensus B A C

3 In the consensus problem, the processes propose values and have to agree on one among these values Solving consensus is key to solving many problems in distributed computing (e.g., total order broadcast, atomic commit, terminating reliable broadcast)

4 Consensus C1. Validity: Any value decided is a value proposed C2. Agreement: No two correct processes decide differently C3. Termination: Every correct process eventually decides C4. Integrity: No process decides twice

5 Consensus p1 p2 p3 propose(0) decide(1) propose(1) propose(0)decide(0) crash decide(0)

6 Uniform consensus C1. Validity: Any value decided is a value proposed C2’. Uniform Agreement: No two processes decide differently C3. Termination: Every correct process eventually decides C4. Integrity: No process decides twice

7 p1 p2 p3 propose(0) decide(0) propose(1) propose(0)decide(0) crash decide(0) Uniform consensus

8 Consensus Events Request: Indication: Properties: C1, C2, C3, C4

9 Modules of a process propose decide deliver suspect broadcast

10 Consensus algorithm I A P-based (fail-stop) consensus algorithm The processes exchange and update proposals in rounds and decide on the value of the non-suspected process with the smallest id [Gue95]

11 Consensus algorithm II A P-based (i.e., fail-stop) uniform consensus algorithm The processes exchange and update proposal in rounds, and after n rounds decide on the current proposal value [Lyn96]

12 Consensus algorithm III A <>P-based uniform consensus algorithm assuming a correct majority The processes alternate in the role of a coordinator until one of them succeeds in imposing a decision [DLS88,CT91]

13 Consensus algorithm I The processes go through rounds incrementally (1 to n): in each round, the process with the id corresponding to that round is the leader of the round The leader of a round decides its current proposal and broadcasts it to all A process that is not leader in a round waits (a) to deliver the proposal of the leader in that round to adopt it, or (b) to suspect the leader

14 Consensus algorithm I Implements: Consensus (cons). Uses: BestEffortBroadcast (beb). PerfectFailureDetector (P). upon event do suspected := empty; round := 1; currentProposal := nil; broadcast := delivered[] := false;

15 Consensus algorithm I upon event do suspected := suspected U {pi}; upon event do if currentProposal = nil then currentProposal := v;

16 Consensus algorithm I upon event do currentProposal := value; delivered[round] := true; upon eventdelivered[round] = true or p round  suspected do round := round + 1;

17 Consensus algorithm I upon event p round =self and broadcast=false and currentProposal  nil do trigger ; broadcast := true;

18 p1 p2 p3 propose(0)decide(0) propose(1) propose(0) Consensus algorithm I decide(0)

19 p1 p2 p3 propose(0) decide(0) propose(1) propose(0) decide(1) crash Consensus algorithm I

20 Correctness argument Let pi be the correct process with the smallest id in a run R. Assume pi decides v. If i = n, then pn is the only correct process. Otherwise, in round i, all correct processes receive v and will not decide anything different from v.

21 Consensus algorithm II Algorithm II implements uniform consensus The processes go through rounds incrementally (1 to n): in each round I, process pI sends its currentProposal to all. A process adopts any currentProposal it receives. Processes decide on their currentProposal values at the end of round n.

22 Consensus algorithm II Implements: Uniform Consensus (ucons). Uses: BestEffortBroadcast (beb). PerfectFailureDetector (P). upon event do suspected := empty; round := 1; currentProposal := nil; broadcast := delivered[] := false; decided := false

23 Consensus algorithm II upon event do suspected := suspected U {pi}; upon event do if currentProposal = nil then currentProposal := v;

24 Consensus algorithm II upon event do currentProposal := value; delivered[round] := true;

25 Consensus algorithm II upon eventdelivered[round] = true or p round  suspected do if round=n and decided=false then trigger decided=true else round := round + 1

26 Consensus algorithm II upon event p round = self and broadcast=false and currentProposal  nil do trigger ; broadcast := true;

27 Consensus algorithm II p1 p2 p3 propose(0) propose(1) propose(0) decide(0)

28 Correctness argument (A) Lemma: If a process pJ completes round I without receiving any message from pI and J > I, then pI crashes by the end of round J. Proof: Suppose pJ completes round I without receiving message from pI, J > I, and pI completes round J. Since pJ suspects pI in round I, pI has crashed before pJ completes round I. In round J either pI suspects pJ (not possible because pI crashes before pJ) or pI receives round J message from pJ (also not possible because pI crashes before pJ completes round I < J).

29 Correctness argument (B) Uniform agreement: Consider the process with the lowest id which decides, say pI. Thus, pI completes round n. By our previous lemma, in round I, every pJ with J > I receives the currentProposal of pI and adopts it. Thus, every process which sends a message after round I or decides, has the same currentProposal at the end of round I.

30 Consensus algorithm III A uniform consensus algorithm assuming: a correct majority a <>P failure detector Basic idea: the processes alternate in the role of a phase coordinator until one of them succeeds in imposing a decision

31 Consensus algorithm III <>P ensures: Strong completeness: eventually every process that crashes is permanently suspected by all correct processes Eventual strong accuracy: eventually no correct process is suspected by any process

32 "<>" makes a difference Eventual strong accuracy: strong accuracy holds only after finite time. Correct processes may be falsely suspected a finite number of times. This breaks consensus algorithms I and II Counterexamples for algorithm I and II (see next slide)

33 Agreement violated with <>P in algorithm I p1 p2 p3 propose(0) propose(1) decide(1) decide(0) suspect p1

34 Agreement violated with <>P in algorithm II p1 p2 p3 propose(0) propose(1) decide(1) decide(0) suspect p1 suspect p3suspect p2

35 Consensus algorithm III The algorithm is also round-based: processes move incrementally from one round to the other Process pi is leader in every round k such that k mod N = I In such a round, pi tries to decide (next 2 slides)

36 Consensus algorithm III pi succeeds if it is not suspected (processes that suspect pi inform pi and move to the next round; pi does so as well) If pi succeeds, pi uses a reliable broadcast to send the decision to all (the reliability of the broadcast is important here to preclude the case where pi crashes, some other processes delivers the message and stop and the rest keeps going without the majority)

37 Consensus algorithm III To decide, pi executes steps pi selects among a majority the latest adopted value (latest with respect to the round in which the value is adopted – see step 2) 2. pi imposes that value at a majority: any process in that majority adopts that value – pi fails if it is suspected 3. pi decides and broadcasts the decision to all

38 Consensus algorithm III P1 P2 P3 propose(0) propose(1) propose(0) p1’s round p2’s round p3’s round p1’s round etc

39 Consensus algorithm III P1 P2 P3 propose(0) decide(0) propose(1) propose(0) decide(0) [0] step 1 step 2 round 1 step 3 [1] [0]

40 P1 P2 P3 propose(0) propose(1) propose(0) crash Consensus algorithm III nack

41 Consensus algorithm III P1 P2 P3 propose(0) propose(1) p1’s round p2’s round p3’s round p1’s round etc 0 nack 0 1

42 Correctness argument A Validity and integrity are trivial Consider termination: if a correct process decides, it uses reliable broadcast to send the decision to all: every correct process decides Assume by contradiction that some process is correct and no correct process decides. We argue that this is impossible.

43 Correctness argument A’ By the correct majority assumption and the completeness property of the failure detector, no correct process remains blocked forever in some phase. By the accuracy property of the failure detector, some correct process reaches a phase where it is leader and it is not suspected and reaches a decision in that phase: a contradiction

44 Correctness argument B Consider now agreement Let k be the first round in which some process pi decides some value v, i.e., pi is the leader of round k and pi decides v in k This means that, in round k, a majority of processes have adopted v By the algorithm, no value else than v will be proposed (and hence decided) by any process in a round higher than k

45 Correctness argument B p1 pn n/2 round k round k+1 = leader of that round v decide(v) vwwwvwww impose v new acks gather

46 Agreement is never violated Look at "totally unreliable" failure detector (provides no guarantees) may always suspect everybody may never suspect anybody Agreement is never violated Can use the same correctness argument as before Termination not ensured (everybody may be suspected infinitely often)

47 Summary (Uniform) Consensus problem is an important problem to maintain consistency Three algorithms: I: consensus using P II: uniform consensus using P III: uniform consensus using <>P and a correct majority