Aran Bergman, Eddie Bortnikov, Principles of Reliable Distributed Systems, Technion EE, Spring 2006
Principles of Reliable Distributed Systems Recitation

Principles of Reliable Distributed Systems
Recitation 5: Byzantine Synchronous Consensus
Spring 2009, Alex Shraer

Byzantine Synchronous Consensus
(Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring)
[Illustration: two speech bubbles, "Let's vote out Marina" and "Let's vote out Guy", above characters labeled Marina and Guy]

Model
- Round-based synchronous model. In each round, a process can:
  - send messages to any set of processes;
  - receive the messages sent to it in this round;
  - do local processing (possibly decide, halt).
- Static set P = {p_1, …, p_n} of processes
- t-out-of-n Byzantine (arbitrary) failures
- Authentication
- Messages between correct processes cannot be lost

Validity and Byzantine Failures
- Validity: the decision is the input of one process
- Why is that a problem when Byzantine failures can occur?
  - What is the input of a Byzantine process? Why would we be OK with deciding on this input?
  - A Byzantine leader can lie about its input
- Strong Unanimity: if the input of all correct processes is v, then no correct process decides a value other than v

Weak Unanimity
- Weak Unanimity: if the input of all the correct processes is v and no process fails, then no correct process decides a value other than v
- We will next see an algorithm for t < n with authentication and Weak Unanimity

Algorithm (for any t < n)
- The proposed algorithm is not symmetric (not all processes use the same rules)
  - One process, p_1, is defined as the leader
  - The leader's input is v_1
- There is a "default" value, known a priori: v_default ∈ {possible decision values}

Algorithm
- Process p_1: SendBuffer = {v_1}; all other processes: SendBuffer = { }
- In every round 1 ≤ k ≤ t+1 do:
  - For every message m in SendBuffer: send m, signed by p_i, to all the processes that did not sign m
  - Clear SendBuffer
  - Receive round k messages
  - For every received message m, if m has k different valid signatures beginning with p_1's (in the proof, we will call such messages legitimate):
    - Valid = Valid ∪ {v}, where v is the value received in m
    - SendBuffer = SendBuffer ∪ {m}
- If Valid contains exactly one value, decide it; else decide v_default
- Proof of Termination: trivial
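The round structure above can be tried out in a small simulation. The following is a minimal sketch, not the course's implementation, and it rests on several assumptions: signatures are modeled as an unforgeable tuple of signer ids, faulty non-leader processes simply stay silent, a Byzantine leader is modeled only by what it sends in round 1, and a correct leader is assumed to record its own v_1 by sending it to itself. The names `run`, `is_legitimate`, `round1` and `rounds` are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

Message = Tuple[str, Tuple[int, ...]]  # (value, chain of signer ids)
LEADER = 1
V_DEFAULT = "v_default"


def is_legitimate(msg: Message, k: int) -> bool:
    """Round-k check: k distinct signatures, the first one being the leader's."""
    _, signers = msg
    return len(signers) == k and len(set(signers)) == k and signers[0] == LEADER


def run(t: int, correct: Set[int], round1: Dict[int, str],
        rounds: Optional[int] = None) -> Dict[int, str]:
    """Simulate the correct processes only (assumption: other faulty ones are silent).

    round1 maps a recipient id to the value the leader signs and sends to it
    in round 1; a Byzantine leader may send different values, or none at all.
    """
    rounds = t + 1 if rounds is None else rounds
    valid = {p: set() for p in correct}
    inbox = {p: [] for p in correct}
    for p, v in round1.items():
        if p in correct:
            inbox[p].append((v, (LEADER,)))
    for k in range(1, rounds + 1):
        # Receive round-k messages and keep only the legitimate ones.
        send_buffer = {p: [] for p in correct}
        for p in correct:
            for msg in inbox[p]:
                if is_legitimate(msg, k):
                    valid[p].add(msg[0])
                    send_buffer[p].append(msg)
        # Relay with one more signature to every process that did not sign yet.
        inbox = {p: [] for p in correct}
        for p in correct:
            for value, signers in send_buffer[p]:
                if p in signers:
                    continue  # p already signed this chain
                for q in correct:
                    if q != p and q not in signers:
                        inbox[q].append((value, signers + (p,)))
    return {p: next(iter(valid[p])) if len(valid[p]) == 1 else V_DEFAULT
            for p in sorted(correct)}


if __name__ == "__main__":
    # Correct leader, n = 4, t = 1: everybody decides the leader's value.
    print(run(1, {1, 2, 3, 4}, {1: "a", 2: "a", 3: "a", 4: "a"}))
    # Byzantine leader sending "a" to p2 and p4 but "b" to p3.
    print(run(1, {2, 3, 4}, {2: "a", 3: "b", 4: "a"}))
    # Byzantine leader sending only to p2, with t rounds instead of t+1.
    print(run(1, {2, 3, 4}, {2: "a"}, rounds=1))
```

In this sketch the equivocating leader makes every correct process end with two values in Valid, so all of them fall back to the default, while cutting the run to t rounds lets p2 decide a value that nobody else has seen.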

Proof of Weak Unanimity
- Weak Unanimity: if the input of all the correct processes is v and no process fails, then no correct process decides a value other than v
- If p_1 (the leader) is correct:
  - All correct processes get v_1 in round 1 and insert it into Valid
  - No other value is inserted into Valid:
    - only messages beginning with p_1's signature are considered
    - processes cannot forge the leader's signature
  - All correct processes decide on v_1
- If p_1 is not correct:
  - Weak Unanimity requires nothing

Why don't we get Strong Unanimity from this algorithm?
- Strong Unanimity: if the input of all correct processes is v, then no correct process decides a value other than v
- If p_1 is Byzantine, it can send the same value to all processes, but this value can be different from the input of the correct processes

Why do we need t+1 rounds?
- Suppose that a correct process receives a value at the end of the last round and no other correct process has this value…
  - Can this happen if there are t rounds?
  - Can this happen if there are t+1 rounds?

Agreement
- Lemma: for every two correct processes p_i and p_j, if v ∈ Valid_i at the end of round t+1, then v ∈ Valid_j at the end of round t+1
  - i.e., the Valid sets of correct processes are the same
- Then agreement follows:
  - if the sets are empty or contain more than one value, every correct process decides v_default
  - otherwise, all correct processes decide on the single value in Valid

Proving the Lemma
- Lemma: for every two correct processes p_i and p_j, if v ∈ Valid_i at the end of round t+1, then v ∈ Valid_j at the end of round t+1
  - i.e., the Valid sets of correct processes are the same
- Consider a correct process p_i
- Suppose that v ∈ Valid_i at the end of round t+1
  - When was v added to Valid_i? Denote this round by k
  - There are two cases: k ≤ t and k = t+1
  - We need to prove that by the end of round t+1, v ∈ Valid_j for every correct process p_j
  - Note: v arrived in a legitimate message when p_i received it in round k

Proof of Lemma
- Case 1: k ≤ t
  - Since links from p_i to other correct processes cannot lose messages, all correct processes receive v by round k+1 and add it to Valid if it's not already there
- Case 2: k = t+1
  - The first t processes that signed the message must be faulty
    - Otherwise, p_i would have received v in an earlier round from a correct process
  - The last process p that signed the message is correct
    - v arrived in a legitimate message in round t+1, thus all t+1 signatures on it are different. But there are only t faulty processes
  - p received v in round t
  - From Case 1 we know that all correct processes receive v by round t+1 and add it to Valid if it's not already there

Q1 from HW2, part (b)
- Prove that in the absence of failures, a broadcast algorithm that guarantees FIFO + Total Order also guarantees Causal Order
- First, the broadcast must be RELIABLE. Otherwise, the statement is not true.
- Counterexample:
[Figure: timelines of p_1, p_2, p_3 with messages m_1 and m_2; events bcast(m_1), bcast(m_2), deliver(m_1), deliver(m_2)]
  - FIFO is trivially preserved here, since each process bcasts only one message
  - TOTAL order is trivially preserved, since only one process delivers 2 messages
  - Causal order is not preserved!
- Is this execution reliable?
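The three ordering properties can be checked mechanically on a small execution. The sketch below is only an illustration: the figure is not recoverable from the transcript, so the execution in `example` is an assumed one that matches the slide's description (each process bcasts at most one message, only p2 delivers two messages, and p3 delivers m2 without m1). All function names are hypothetical, and every process is assumed to be correct.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

Event = Tuple[str, str]              # ("bcast" | "deliver", message id)
Execution = Dict[str, List[Event]]   # process name -> local event sequence
Pairs = Set[Tuple[str, str]]         # (m1, m2): m1 must be delivered before m2


def delivered(log: List[Event]) -> List[str]:
    return [m for kind, m in log if kind == "deliver"]


def respects(execution: Execution, before: Pairs) -> bool:
    """No process delivers m2 unless it has previously delivered m1."""
    for log in execution.values():
        d = delivered(log)
        for m1, m2 in before:
            if m2 in d and (m1 not in d or d.index(m1) > d.index(m2)):
                return False
    return True


def fifo_pairs(execution: Execution) -> Pairs:
    """Pairs of messages bcast by the same process, in bcast order."""
    pairs: Pairs = set()
    for log in execution.values():
        sent = [m for kind, m in log if kind == "bcast"]
        pairs.update(combinations(sent, 2))
    return pairs


def causal_pairs(execution: Execution) -> Pairs:
    """Pairs (m, m') with bcast(m) -> bcast(m'), closed under transitivity."""
    pairs: Pairs = set()
    for log in execution.values():
        seen: List[str] = []
        for kind, m in log:
            if kind == "bcast":
                pairs.update((prev, m) for prev in seen)
            seen.append(m)  # earlier bcasts and deliveries precede later bcasts
    changed = True
    while changed:
        changed = False
        for a, b in list(pairs):
            for c, d in list(pairs):
                if b == c and a != d and (a, d) not in pairs:
                    pairs.add((a, d))
                    changed = True
    return pairs


def total_ok(execution: Execution) -> bool:
    """Any two processes deliver their common messages in the same order."""
    for a, b in combinations(execution.values(), 2):
        da, db = delivered(a), delivered(b)
        if [m for m in da if m in db] != [m for m in db if m in da]:
            return False
    return True


def agreement_ok(execution: Execution) -> bool:
    """All processes (assumed correct) deliver the same set of messages."""
    sets = [set(delivered(log)) for log in execution.values()]
    return all(s == sets[0] for s in sets)


# Assumed execution, consistent with the slide's description.
example: Execution = {
    "p1": [("bcast", "m1")],
    "p2": [("deliver", "m1"), ("bcast", "m2"), ("deliver", "m2")],
    "p3": [("deliver", "m2")],
}

if __name__ == "__main__":
    print("FIFO:", respects(example, fifo_pairs(example)))
    print("Total:", total_ok(example))
    print("Causal:", respects(example, causal_pairs(example)))
    print("Agreement:", agreement_ok(example))
```

Under these assumptions FIFO and Total Order hold vacuously while Causal Order fails at p3, and the Agreement check fails as well, which is the sense in which the execution is not reliable.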

Q1 from HW2, part (c)
- Does this claim hold when there are failures in the system?
- The claim doesn't hold:
[Figure: timelines of p_1, p_2, p_3 with messages m_1 and m_2; events bcast(m_1), bcast(m_2), deliver(m_2)]
  - p_3 never delivers m_1. This is allowed by the Validity of reliable broadcast, since p_1 is faulty.
  - It is also allowed by the Agreement of reliable broadcast, because p_1 and p_2 are faulty and therefore p_3 can deliver different messages.
- A full answer needs to explain:
  - why the 3 properties of Reliable Broadcast are preserved
  - why FIFO and TOTAL order are preserved
  - why Causal order is violated

Q1 from HW2, part (a)
- Suppose that m → m'. This means that bcast(m) → bcast(m'), i.e., there exists a chain of bcast-deliver events from bcast(m) to bcast(m').
[Figure: event chain with labels bcast(m), bcast(m*), deliver(m*), bcast(m''), deliver(m''), bcast(m')]
- Induction on the number of deliver events in the chain from m* to m'.
  - If there are 0 deliver events, the claim holds by FIFO.
  - Assume the claim for k, and let's show it for k+1. Observe the last sub-chain of the form:
[Figure: timelines of p_1 and p_2 with labels bcast(m_1), deliver(m_1), bcast(m')]
    - Since all processes are correct, by Validity of the Reliable Broadcast, p_2 delivers m'. From Integrity, this is after deliver(m_1).
    - From TOTAL, all processes deliver m_1 and m' in the same order as p_2.
    - m* is delivered before m_1 at all processes (induction assumption) ⇒ m* is delivered before m'.
- If m ≠ m*, from FIFO m is delivered before m*.

Q3 from HW2
1. Initially:
2.     TS[j] ← 0 for all 1 ≤ j ≤ n    /* array of integers */
3.     pending ← empty                /* set of messages */
4. abcast(msg):
5.     TS[i] ← TS[i] …
6.     bcast( msg, ⟨TS[i], i⟩ )
7. upon recv( msg, ⟨ts, j⟩ ):
8.     add( pending, ( msg, ⟨ts, j⟩ ) )
9.     TS[j] ← ts
10.    TS[i] ← max( TS[i], ts )
11. forever do
12.    let ( msg, ⟨ts, j⟩ ) be the entry in pending with the smallest ⟨ts, j⟩
13.    if ⟨ts, j⟩ ≤ ⟨TS[k], k⟩ for all 1 ≤ k ≤ n then
14.        remove( pending, ( msg, ⟨ts, j⟩ ) )
15.        adeliver( msg )
The algorithm above differs from the LTS algorithm taught in class in lines 5 and 10.

Q3 from HW2, part (a)
[Figure: process timelines with messages m1 and m2, showing delivery according to the new algorithm]

Q3 from HW2, part (c)
- The original LTS would deliver m1 at time 7
- When would it deliver m2?