Principles of Reliable Distributed Systems, Lecture 6: Synchronous Byzantine Consensus. Idit Keidar, Technion EE, Spring 2008.

Principles of Reliable Distributed Systems
Lecture 6: Synchronous Byzantine Consensus
Spring 2008
Prof. Idit Keidar

Today’s Material
- Attiya and Welch, Distributed Computing, Ch. 5
- Nancy Lynch, Distributed Algorithms, Ch. 6

Debt from Last Week
“Two Generals” Problem

Weak Coordinated Attack
- Agreement: If both generals decide, they decide the same value
- Termination: Every general eventually decides
- Weak (Conditional) Validity: If both inputs are “not ready”, the decision is “no attack”; if both inputs are “ready” and no messages are lost, the decision is “attack”

Lemma 1’
There is no protocol that solves the Weak Coordinated Attack problem and, in runs where both inputs are “ready”, sends no messages before deciding.
Proof:
- Similar to Lemma 1 from last week.
- Homework question.

Impossibility of Weak Coordinated Attack
- By induction, with Lemma 1’ as the base case
- On board

… and now for our feature presentation …
Synchronous Byzantine Consensus

The Byzantine Generals Problem
- First formulation of the consensus problem [Pease, Shostak, Lamport 80]
[Figure labels: “Let’s attack” and “Let’s not attack”]

Byzantine Faults
Faulty processes can behave arbitrarily, i.e., they don’t have to follow the protocol. E.g., they
- can suffer benign failures (crash, timing);
- can send bogus values in messages;
- can send messages at the wrong time;
- can send different messages to different processes; etc.
Captures software bugs and hacker intrusions.

Byzantine Nodes can Lead Correct Nodes to Conflicting Decisions
- Correct nodes cannot know whom to believe
[Figure labels: “Let’s vote off Marina” and “Let’s vote off Guy”]

Byzantine-Fault-Tolerant (BFT) Consensus
Only non-uniform consensus makes sense. Why?
Recall, we defined consensus as follows:
- Agreement: correct processes’ decisions are the same
- Termination: eventually all correct processes decide
- Validity: the decision is the input of one process
Problem?

Validity: Take II
Strong unanimity: If the input of all the correct processes is v, then no correct process decides a value other than v.
How resilient can an algorithm satisfying this property be?
- Homework: prove this!

Today’s Problem: Consensus with Strong Unanimity
Each process has an input and should decide on an output.
- Agreement: correct processes’ decisions are the same
- Validity (Strong Unanimity): If the input of all the correct processes is v, then no correct process decides a value other than v
- Termination: eventually all correct processes decide
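The three properties can be checked mechanically against the outcome of a single finished run. The following is a minimal sketch, assuming a run is summarized by hypothetical dictionaries of inputs and decisions plus the set of correct process ids (none of these names come from the lecture). Termination is only approximated, since a finite run can show at most that every correct process has decided so far.

```python
from typing import Dict, Hashable, Set


def check_consensus(inputs: Dict[str, Hashable],
                    decisions: Dict[str, Hashable],
                    correct: Set[str]) -> Dict[str, bool]:
    """Check one finished run; only correct processes are constrained."""
    decided = [decisions[p] for p in correct if p in decisions]
    # Termination (approximation): every correct process has decided.
    termination = all(p in decisions for p in correct)
    # Agreement: correct processes never decide on different values.
    agreement = len(set(decided)) <= 1
    # Strong unanimity: binds only when all correct inputs are equal.
    correct_inputs = {inputs[p] for p in correct}
    if len(correct_inputs) == 1:
        (v,) = correct_inputs
        validity = all(d == v for d in decided)
    else:
        validity = True
    return {"agreement": agreement, "validity": validity,
            "termination": termination}
```

Note that the decisions and inputs of faulty processes are ignored entirely, which is why only the non-uniform variant of the problem is meaningful here.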

Byzantine Models
1. Authenticated
   - Uses digital signatures
   - Assumes PKI (Public Key Infrastructure)
2. Unauthenticated
   - No digital signatures
   - Secure point-to-point communication
   - Over the Internet: implemented with symmetric keys

Authenticated (Byzantine) Model
- Authentication: The receiver of a message can ascertain its origin
  - An intruder cannot masquerade as someone else
- Integrity: The receiver of a message can verify that it has not been modified in transit
  - An intruder cannot substitute a false message for a legitimate one
- Nonrepudiation: A sender cannot falsely deny later that he sent a message

Implementing Authentication
Uses a cryptographic Public Key Infrastructure (PKI).
Each process has a well-known public key and a matching private key
- ⟨M⟩_p is message M signed with p’s private key
- Only p can generate ⟨M⟩_p
- Every process can verify p’s signature on ⟨M⟩_p using p’s public key
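To make the sign/verify asymmetry concrete, here is a toy sketch using textbook RSA with tiny primes. It is an illustration only and not secure: the key size, the constants P, Q, E and the helper names are assumptions made for this example, and a real system would use a vetted cryptographic library.

```python
import hashlib

# Toy textbook-RSA key pair with tiny primes: illustration only, NOT secure.
P, Q, E = 61, 53, 17                 # E is the public exponent
N = P * Q                            # public modulus
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def digest(message: str) -> int:
    """Hash the message and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message.encode()).digest(), "big") % N


def sign(message: str) -> int:
    """Only the holder of the private exponent D can compute this value."""
    return pow(digest(message), D, N)


def verify(message: str, signature: int) -> bool:
    """Anyone can check a signature using only the public values E and N."""
    return pow(signature, E, N) == digest(message)
```

A signature produced by `sign("attack")` passes `verify("attack", ...)`, while any altered signature value is rejected. With such a small modulus two different messages can collide on the same digest, which is one reason this sketch must not be used outside a classroom setting.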

Exploiting Authentication
- All messages are signed by their source
- Every receiver can verify the message
- Signed messages can be forwarded as proof
  - “I can prove that Idit said that I don’t have to submit this homework assignment”
  - ⟨Yossy does not have to submit homework assignment 2⟩_Idit
- Liars can be exposed

Today’s Model 1
- Round-based synchronous
- Static set P = {p_1, …, p_n} of processes
- t-out-of-n Byzantine (arbitrary) failures, t < n/2
- Authentication

Exponential Information Gathering (EIG) Algorithms
Forward all received messages in each round, for t+1 rounds:
- In round 1: send your value to all
- In later rounds: for every received message m (without my_id), forward m + my_id to all
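The forwarding rule can be sketched in a few lines. This is a minimal illustration, assuming a message is represented as a hypothetical pair of a value and the chain of process ids that have relayed it so far; the function names are not from the lecture.

```python
from typing import List, Tuple

# A message is (value, chain of ids that have sent or relayed it so far).
Message = Tuple[object, Tuple[str, ...]]


def first_round(my_id: str, my_value: object) -> Message:
    """Round 1: send my own value, tagged with my id."""
    return (my_value, (my_id,))


def relay(my_id: str, received: List[Message]) -> List[Message]:
    """Later rounds: forward every message that does not yet carry my id."""
    out = []
    for value, chain in received:
        if my_id not in chain:
            out.append((value, chain + (my_id,)))
    return out
```

For example, `relay("p2", [(1, ("p1",)), (0, ("p2",))])` forwards only the first message, now tagged with the chain `("p1", "p2")`.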

EIG with Signatures for t < n/2
round 1: send ⟨v_i⟩_pi to all
in every round 2 ≤ k ≤ t+1:
    for every received message m:
        if (m has k-1 different valid signatures and not mine)
        then send ⟨m⟩_pi to all
Valid_i = { ⟨v_j⟩_pj | all messages with t+1 different valid signatures starting with p_j’s have the same value v_j }
decide on the most common value in Valid_i; in case of a tie, choose the default value
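The algorithm can be simulated on small instances. The sketch below rests on several assumptions that are not in the slide: signatures are modeled as unforgeable chains of process ids rather than real cryptography, "send to all" includes the sender itself, a value enters Valid only if at least one chain with t+1 signatures starting at its origin was received, and a Byzantine process misbehaves only by sending different round-1 values to different recipients (it relays honestly afterwards). The names `run_signed_eig` and `byz_round1` are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, List, Tuple

Msg = Tuple[Hashable, Tuple[str, ...]]  # (value, signer chain); chain[0] is the origin


def most_common_or_default(values: List[Hashable], default: Hashable) -> Hashable:
    """Most common value; the default on a tie or when there are no values."""
    ranked = Counter(values).most_common()
    if not ranked or (len(ranked) > 1 and ranked[0][1] == ranked[1][1]):
        return default
    return ranked[0][0]


def run_signed_eig(inputs: Dict[str, Hashable],
                   byz_round1: Dict[str, Dict[str, Hashable]],
                   t: int, default: Hashable = 0) -> Dict[str, Hashable]:
    """inputs: correct id -> input; byz_round1: Byzantine id -> {recipient: value}."""
    ids = sorted(set(inputs) | set(byz_round1))
    inbox: Dict[str, List[Msg]] = {p: [] for p in ids}
    # Round 1: everyone sends its signed value to all (including itself).
    for p in ids:
        for q in ids:
            value = byz_round1[p][q] if p in byz_round1 else inputs[p]
            inbox[q].append((value, (p,)))
    # Rounds 2..t+1: countersign and forward chains that lack my signature.
    for k in range(2, t + 2):
        outbox: Dict[str, List[Msg]] = {p: [] for p in ids}
        for p in ids:
            for value, chain in inbox[p]:
                if len(chain) == k - 1 and p not in chain:
                    for q in ids:
                        outbox[q].append((value, chain + (p,)))
        inbox = outbox
    decisions = {}
    for p in inputs:  # only correct processes are required to decide
        values_by_origin: Dict[str, set] = {}
        for value, chain in inbox[p]:  # chains now carry t+1 signatures
            values_by_origin.setdefault(chain[0], set()).add(value)
        # An origin seen with two different signed values is dropped from Valid.
        valid = [next(iter(vals)) for vals in values_by_origin.values()
                 if len(vals) == 1]
        decisions[p] = most_common_or_default(valid, default)
    return decisions
```

With n = 3 and t = 1, if correct processes p1 and p2 both start with 1 and Byzantine p3 tells p1 the value 0 and p2 the value 1, the forwarded chains reveal both values signed by p3, so p3 is excluded and both correct processes decide 1.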

Signatures Expose Liars
[Figure: Dan signs two conflicting messages, ⟨Let’s vote off Marina⟩_Dan and ⟨Let’s vote off Guy⟩_Dan, and sends them to Guy and Marina. Once the signed messages are forwarded, the conflict is visible, and Dan’s value is removed from Valid.]

Validity
Need to prove Strong Unanimity: If the input of all correct processes is v, then no correct process decides a value other than v.
- Claim: At every correct p_i, for all correct p_j, Valid_i includes ⟨v_j⟩_pj
- Validity follows

Agreement
- Claim: For two correct processes p_i and p_j, Valid_i and Valid_j include the same values
- Agreement follows

Termination
- Every correct process decides after t+1 rounds

Can We Improve the Resilience?

Validity: Take III
Weak unanimity: If the input of all the correct processes is v and no process fails, then no correct process decides a value other than v.
- Does this prevent a trivial solution?
- Resilience? See recitation

Summary of Known Results
Synchronous, Byzantine fault-tolerant, t-resilient consensus algorithms exist:
- Strong unanimity with authentication: iff t < n/2 (as we just saw)
- Weak unanimity with authentication: iff t < n (recitation)
- Without authentication: iff t < n/3 (up next)
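The three bounds translate directly into the largest number of faults tolerated for a given n. A small sketch of that arithmetic follows; the function name and the model labels are made up for this example.

```python
def max_faults(n: int, model: str) -> int:
    """Largest t that satisfies the resilience bound for the given model."""
    if model == "strong_auth":   # strong unanimity, authenticated: t < n/2
        return (n - 1) // 2
    if model == "weak_auth":     # weak unanimity, authenticated: t < n
        return n - 1
    if model == "unauth":        # no authentication: t < n/3
        return (n - 1) // 3
    raise ValueError(f"unknown model: {model}")
```

For instance, with n = 4 processes the unauthenticated model tolerates one Byzantine process, while with n = 3 it tolerates none.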

Model 2: Unauthenticated Byzantine
- Round-based synchronous
- Static set P = {p_1, …, p_n} of processes
- t-out-of-n Byzantine (arbitrary) failures, t < n/3
- No signatures (no authentication)
  - But secure point-to-point channels
  - Model of [Pease, Shostak, Lamport 80]

EIG: Reminder
round 1: send ⟨v_i, p_i⟩ to all
in every round 2 ≤ k ≤ t+1:
    for every received message m:
        if (m has k-1 different ids and not mine)
        then send ⟨m, p_i⟩ to all
- Forward all received messages in every round
- t+1 rounds
- Exponential messages

Information Gathering Tree at Each Process
[Figure: information gathering tree. Level 1 (round 1) holds v_1, v_2, …, v_n. Level 2 (round 2) holds relayed values such as v_1 p_2, v_1 p_3, v_2 p_1, v_2 p_3, v_n p_1, v_n p_2. The leaves (round f+1) hold values relayed along longer chains such as v_1 p_2 p_3 … p_f+1.]
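The size of the tree can be computed level by level. The sketch below assumes that a node at level k corresponds to a chain of k distinct process ids (the origin followed by k-1 relays), and that the tree has t+1 levels, i.e. that the figure's f equals t; the function name is hypothetical.

```python
import math
from typing import List


def nodes_per_level(n: int, t: int) -> List[int]:
    """Number of tree nodes at levels 1..t+1: chains of k distinct ids out of n."""
    # math.perm(n, k) = n * (n-1) * ... * (n-k+1); requires Python 3.8+.
    return [math.perm(n, k) for k in range(1, t + 2)]
```

For n = 7 and t = 2 this gives 7, 42 and 210 nodes on the three levels, which shows how quickly the tree grows with t.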

EIG Decision W/Out Signatures
Resolve the tree from the leaves upward
- Decide on the root’s value, or on the default value if it is nil
- For each internal node: take the strict majority of the child values, or nil if none exists
Each node has at least n-t children (t+1 levels)
- At least n-2t ≥ 3t+1-2t = t+1 correct ones
- Correct children are a majority
- If the node does not lie, all correct children are the same
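The bottom-up resolution can be sketched as a short recursion. This is an illustration under assumptions of my own: the tree is given as a hypothetical dictionary from chains of ids (of length t+1) to leaf values, a missing leaf counts as nil, and nil is represented by Python's None.

```python
from collections import Counter
from typing import Dict, Hashable, List, Optional, Sequence, Tuple

Chain = Tuple[str, ...]


def strict_majority(values: List[Optional[Hashable]]) -> Optional[Hashable]:
    """Value held by more than half of the entries, else None (nil)."""
    if not values:
        return None
    value, count = Counter(values).most_common(1)[0]
    return value if 2 * count > len(values) else None


def resolve(chain: Chain, leaves: Dict[Chain, Hashable],
            ids: Sequence[str], depth: int) -> Optional[Hashable]:
    """Resolved value of the node identified by `chain`."""
    if len(chain) == depth:
        return leaves.get(chain)  # a missing leaf is treated as nil
    children = [resolve(chain + (p,), leaves, ids, depth)
                for p in ids if p not in chain]
    return strict_majority(children)


def decide(leaves: Dict[Chain, Hashable], ids: Sequence[str],
           t: int, default: Hashable = 0) -> Hashable:
    """Resolve the root from the level-1 nodes; fall back to the default on nil."""
    root = strict_majority([resolve((p,), leaves, ids, t + 1) for p in ids])
    return default if root is None else root
```

With n = 4 and t = 1, if the three correct processes start with 1 and the Byzantine process reports 0 about each of them, every level-1 node of a correct process still resolves to 1 (two correct children out of three), so the root resolves to 1.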

Validity
At a correct p_i, for a correct p_j
- in the resolved information gathering tree
- the level 1 node j holds the correct v_j
Strong Unanimity: If the input of all correct processes is v, then all correct processes decide v.
[Figure: level 1 of the tree (round 1) with nodes v_1, v_2, …, v_n]

Agreement
- Common node: a node whose resolved value is the same at all correct processes
- Lemma: in every sub-tree, if there is a common node on every path from a leaf to the root, then the root is common

Agreement Follows
- The depth of the tree is t+1
- So there is a correct process on the path to the root from every leaf in the tree
- Every node corresponding to a correct process is common
  - Proven where we showed Validity
- From the lemma, the root is common

EIG Algorithm: Summary
- Optimal worst-case number of rounds: t+1
  - Not early-deciding
- Optimal resilience: t < n/3
- Exponential messages
  - Send the entire tree in one big message
  - Size on the order of n^(t+2)