reaching agreement in the presence of faults

Slides:



Advertisements
Similar presentations
Fault Tolerance. Basic System Concept Basic Definitions Failure: deviation of a system from behaviour described in its specification. Error: part of.
Advertisements

+ The Byzantine Generals Problem Leslie Lamport, Robert Shostak and Marshall Pease Presenter: Jose Calvo-Villagran
Byzantine Generals. Outline r Byzantine generals problem.
Agreement: Byzantine Generals UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau Paper: “The.
Teaser - Introduction to Distributed Computing
The Byzantine Generals Problem Boon Thau Loo CS294-4.
The Byzantine Generals Problem Leslie Lamport, Robert Shostak, Marshall Pease Distributed Algorithms A1 Presented by: Anna Bendersky.
Achieving Byzantine Agreement and Broadcast against Rational Adversaries Adam Groce Aishwarya Thiruvengadam Ateeq Sharfuddin CMSC 858F: Algorithmic Game.
Prepared by Ilya Kolchinsky.  n generals, communicating through messengers  some of the generals (up to m) might be traitors  all loyal generals should.
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Byzantine Fault Tolerance Steve Ko Computer Sciences and Engineering University at Buffalo.
DISTRIBUTED SYSTEMS II FAULT-TOLERANT AGREEMENT Prof Philippas Tsigas Distributed Computing and Systems Research Group.
Byzantine Generals Problem: Solution using signed messages.
The Byzantine Generals Problem (M. Pease, R. Shostak, and L. Lamport) January 2011 Presentation by Avishay Tal.
Copyright 2006 Koren & Krishna ECE655/ByzGen.1 UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Fault Tolerant Computing ECE 655.
EEC 693/793 Special Topics in Electrical Engineering Secure and Dependable Computing Lecture 15 Wenbing Zhao Department of Electrical and Computer Engineering.
1 Fault-Tolerant Consensus. 2 Failures in Distributed Systems Link failure: A link fails and remains inactive; the network may get partitioned Crash:
Randomized and Quantum Protocols in Distributed Computation Michael Ben-Or The Hebrew University Michael Rabin’s Birthday Celebration.
Last Class: Weak Consistency
EEC 693/793 Special Topics in Electrical Engineering Secure and Dependable Computing Lecture 15 Wenbing Zhao Department of Electrical and Computer Engineering.
The Byzantine Generals Strike Again Danny Dolev. Introduction We’ll build on the LSP presentation. Prove a necessary and sufficient condition on the network.
Distributed Algorithms: Agreement Protocols. Problems of Agreement l A set of processes need to agree on a value (decision), after one or more processes.
The Byzantine Generals Problem Leslie Lamport Robert Shostak Marshall Pease.
Distributed Consensus Reaching agreement is a fundamental problem in distributed computing. Some examples are Leader election / Mutual Exclusion Commit.
Lecture #12 Distributed Algorithms (I) CS492 Special Topics in Computer Science: Distributed Algorithms and Systems.
Ch11 Distributed Agreement. Outline Distributed Agreement Adversaries Byzantine Agreement Impossibility of Consensus Randomized Distributed Agreement.
DISTRIBUTED SYSTEMS II FAULT-TOLERANT AGREEMENT Prof Philippas Tsigas Distributed Computing and Systems Research Group.
1 Chapter 12 Consensus ( Fault Tolerance). 2 Reliable Systems Distributed processing creates faster systems by exploiting parallelism but also improve.
1 Resilience by Distributed Consensus : Byzantine Generals Problem Adapted from various sources by: T. K. Prasad, Professor Kno.e.sis : Ohio Center of.
The Byzantine General Problem Leslie Lamport, Robert Shostak, Marshall Pease.SRI International presented by Muyuan Wang.
CSE 60641: Operating Systems Implementing Fault-Tolerant Services Using the State Machine Approach: a tutorial Fred B. Schneider, ACM Computing Surveys.
Hwajung Lee. Reaching agreement is a fundamental problem in distributed computing. Some examples are Leader election / Mutual Exclusion Commit or Abort.
CSE 486/586 CSE 486/586 Distributed Systems Byzantine Fault Tolerance Steve Ko Computer Sciences and Engineering University at Buffalo.
Chap 15. Agreement. Problem Processes need to agree on a single bit No link failures A process can fail by crashing (no malicious behavior) Messages take.
V1.7Fault Tolerance1. V1.7Fault Tolerance2 A characteristic of Distributed Systems is that they are tolerant of partial failures within the distributed.
UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department
Reaching Agreement in the Presence of Faults M. Pease, R. Shotak and L. Lamport Sanjana Patel Dec 3, 2003.
1 Fault-Tolerant Consensus. 2 Communication Model Complete graph Synchronous, network.
CSE 486/586, Spring 2013 CSE 486/586 Distributed Systems Byzantine Fault Tolerance Steve Ko Computer Sciences and Engineering University at Buffalo.
Behavior of Byzantine Algorithm Chun Zhang. Index Introduction Experimental Setup Behavior Observation Result Analysis Conclusion Future Work.
PROCESS RESILIENCE By Ravalika Pola. outline: Process Resilience  Design Issues  Failure Masking and Replication  Agreement in Faulty Systems  Failure.
Distributed Agreement. Agreement Problems High-level goal: Processes in a distributed system reach agreement on a value Numerous problems can be cast.
1 AGREEMENT PROTOCOLS. 2 Introduction Processes/Sites in distributed systems often compete as well as cooperate to achieve a common goal. Mutual Trust/agreement.
CSE 486/586, Spring 2014 CSE 486/586 Distributed Systems Byzantine Fault Tolerance Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 486/586 Distributed Systems Byzantine Fault Tolerance
Synchronizing Processes
COMP28112 – Lecture 15 Byzantine fault tolerance: dealing with arbitrary failures The Byzantine Generals’ problem (Byzantine Agreement) 26-Jan-18 COMP28112.
When Is Agreement Possible
8.2. Process resilience Shreyas Karandikar.
DC7: More Coordination Chapter 11 and 14.2
COMP28112 – Lecture 14 Byzantine fault tolerance: dealing with arbitrary failures The Byzantine Generals’ problem (Byzantine Agreement) 13-Oct-18 COMP28112.
Byzantine Fault Tolerance
Outline Distributed Mutual Exclusion Distributed Deadlock Detection
CSE 486/586 Distributed Systems Byzantine Fault Tolerance
COMP28112 – Lecture 13 Byzantine fault tolerance: dealing with arbitrary failures The Byzantine Generals’ problem (Byzantine Agreement) 19-Nov-18 COMP28112.
Alternating Bit Protocol
Distributed Consensus
Agreement Protocols CS60002: Distributed Systems
Distributed Systems, Consensus and Replicated State Machines
Distributed Consensus
Jacob Gardner & Chuan Guo
Distributed Systems CS
Byzantine Generals Problem
Byzantine Faults definition and problem statement impossibility
Consensus in Synchronous Systems: Byzantine Generals Problem
The Byzantine Generals Problem
COMP28112 – Lecture 13 Byzantine fault tolerance: dealing with arbitrary failures The Byzantine Generals’ problem (Byzantine Agreement) 22-Feb-19 COMP28112.
EEC 688/788 Secure and Dependable Computing
COMP28112 – Lecture 13 Byzantine fault tolerance: dealing with arbitrary failures The Byzantine Generals’ problem (Byzantine Agreement) 24-Apr-19 COMP28112.
Byzantine Generals Problem
CSE 486/586 Distributed Systems Byzantine Fault Tolerance
Presentation transcript:

reaching agreement in the presence of faults Written By: Marshall Pease, Robert Shostak, Leslie Lamport Lecturer in charge: Oded Shmueli Presentation By: Shahar Yair

The Two Generals Problem Two generals need to agree on whether to attack the enemy city or retreat. If they come to an agreement then the war is either won or there will be peace. But if one attacks and the other retreats, then the war is lost and everyone is sad 

TCP/IP ? There is no perfect solution to the problem! If you respond, I’ll attack! If you respond, I’ll attack! We will attack! There is no perfect solution to the problem!

The Byzantine Agreement Problem

The Byzantine Agreement Problem “We imagine that several divisions of the Byzantine army are camped outside an enemy city, each division commanded by its own general. The generals can communicate with one another only by messengers. After observing the enemy, they must decide upon a common plan of action. However, some of the generals may be traitors, trying to prevent the loyal generals from reaching agreement.” Retreat! Attack! Attack! Attack! Retreat!

The Basic Byzantine Problem Assumptions: The Network works perfectly. There are N=4 generals. Each general has a unique name (color). One or more of the generals may be a traitor (Red). We’ll say there are M=1 traitors for now. All of the non-traitor generals must come to an agreement or we lose the war. Assume a previously appointed Super-General. Binary Values (A/R).

The Problem With N=4, M=1 Attack! Attack! Attack! Attack!

The Problem With N=4, M=1 [A,A,?] [Yellow, Green, Red] Rettack!! Attack! Attack! $#@! $#@! [A,A,?] [A,A,?] Attack! Attack! Attack! Attack!

The Problem With N=4, M=1 We win the war! Attack! [A,A,?] [A,A,?] Rettack!! Attack!

What If the Super-General is a Traitor? Rettack!! Attack! Attack! Retreat!

What If the Super-General is a Traitor? [A,A,R] [Yellow, Green, Blue] Retreat! Attack! Attack! Retreat! Retreat! [A,A,R] [A,A,R] Attack! Attack! Attack! Attack!

What If the Super-General is a Traitor? Rettack!! We win the war again! [A,A,R] [A,A,R] [A,A,R] Attack! Attack! Attack!

The General Byzantine Problem Assumptions: The Network works perfectly. There are N processors (shown as nodes). Each processor has a unique ID, (IDi). There are M=1 faulty processors. All of the non- faulty processors must come to an agreement. The system is asynchronous. ID1 ID2 ID3 ID4

The Byzantine Agreement ID1 ID2 V1 V1 V1 ID3 ID4

The Byzantine Agreement ID1 ID2 V2 V2 V2 ID3 ID4

The Byzantine Agreement ID1 ID2 V3 V3 ID3 ID4 V3

The Byzantine Agreement ID1 ID2 V4 ID3 ID4 V5

The Byzantine Agreement [V1,V2,V3,V4] [V1,V2,V3,NIL] ID1 ID2 [V1,V2,V3,V5] [$#@!] ID3 ID4

The Byzantine Agreement [V1,V2,V3,NIL] [V1,V2,V3,NIL] ID1 ID2 [V1,V2,V3,V5] [$#@!] ID3 ID4

The Byzantine Agreement [V1,V2,V3,V4] [V1,V2,V3,NIL] ID1 ID2 [V1,V2,V3,V5] [$#@!] ID3 ID4

The Byzantine Agreement [V1,V2,V3,V4] [V1,V2,V3,NIL] ID1 ID2 [V1,V2,V3,NIL] [$#@!] ID3 ID4

The Byzantine Agreement [V1,V2,V3,V4] [V1,V2,V3,NIL] ID1 ID2 [V1,V2,V3,V5] [$#@!] ID3 ID4

The Byzantine Agreement [V1,V2,V3,NIL] [V1,V2,V3,NIL] ID1 ID2 [V1,V2,V3,NIL] [$#@!] ID3 ID4

Remarks For M faulty processors, M+1 rounds are required. The first round is for exchanging the self values, and the others are for “processor X told me…” (We’ll see an example) That means, that the algorithm requires O((N-1)M+1) exchanges – Extremely inefficient! The problem is only solvable if N ≥ 3M+1. ID1 ID2 ID3 ID4

The Procedure for N≥3M+1 The problem is solvable only if N ≥ 3M+1 Definitions: P={p1,p2,p3,…pn} : Set of processors V={v1,v2,v3,…,vn} : Set of values We will define a scenario σ as a mapping from an alphabet Σ=P={p1,p2,p3,…pn}, to V. Example: For a scenario σ and string w=p1p2…pr σ(w) = the value of pr that pr tells pr-1 … which p2 tells p1. * Notice that for an arbitrary processor p, a nonfaulty processor q and w over P: σ(pqw) = σ(qw)

The Procedure for N≥3M+1 The message a processor p receives in a scenario σ are given by the restriction σp of σ to strings beginning with p. For some subset Q of P so that size(Q) ≥ (N+M)/2: * If σp (pwq) = v for each string w over Q  p records the value of q as v. * Otherwise, the algorithm for m-1, n-1 is recursively applied with P replaced by P-{q} and σp(pwq) by σp(pw). If at least (N+M)/2 of the n-1 elements in the vector obtained by the recursive call agree p records the value of q as v, otherwise it records NIL.

If you recall… σ(P1P3P4)=V5 Is the value of P4 that P4 told P3 that P3 told P1 [V1,V2,V3,NIL] P1 P2 [V1,V2,V3,NIL] [V1,V2,V3,V5] [$#@!] P3 P4 *Notice: σ(P1P3P4)= σ(P3P4)= V5

What about N<3M+1? We can prove that it is impossible to solve the problem for N < 3M+1. Intuition for why it is true: Attack! Attack! [A,R] ?? Retreat! Attack!

Using Authenticators What if we restrict the values the faulty processor can send? What is an authenticator? An authenticator is a redundant augment to a data item that can be created, ideally, only by the originator of the data. A processor p constructs an authenticator for a data item d by calculating Ap[d], where Ap is some mapping known only top. At the same time, it must be easy for q to check, for a given p, v, and d, that v = Ap[d ].

Using Authenticators Now, the faulty processor has to tell the truth about the values it has received. * But it can still lie about its own value. Implication: No restriction on M.

Authenticators - Example V1 V1 P3 P3

Authenticators - Example V3 P3 V3 P2

Authenticators - Example V21 P3 V23 P2

Authenticators - Example [V1,V21,V3] P3 P2 [V1,V23,V3] [V1,V2?,V3]

Authenticators - Example [V1,NIL,V3] Notice that P2 has to send the same value it sent before! P3 P2 [V1,V23,V3] [V1,V21,V3]

Authenticators - Example [V1,V21,V3] P3 P2 But we don’t care… [V1,V23,V3] [V1,NIL,V3]

Authenticators - Example [V1,V21,V3] Again, P2 has to send the same value it sent before to P3! P3 P2 [V1,NIL,V3] [V1,V23,V3]

Authenticators - Example [V1,NIL,V3] P3 P2 [V1,NIL,V3] [V1,NIL,V3]

Back to the Byzantine Problem We saw that the algorithm is VERY inefficient ((N-1)M+1). So how can we improve its efficiency? A sub committee! Run the algorithm

Expansions of the Byzantine Problem Possible properties to play with: - Synchronous / Asynchronous (no time limit). - Kind of fault: Fail-stop / Byzantine. - Computationally bounded adversary/ private channels/ full information model. - Static / dynamic adversary (picks who to corrupt during the protocol). - Message passing / shared registers/ broadcasts. - Completely connected / sparse network. - Leader election / Global coin toss. - Resiliency, Time and Bit complexity. - Renormalized / Quantum (Hint: QVSS =  Quantum Verifiable Secret Sharing protocol).

Questions?