Fault-Tolerant Consensus

Communication Model: the network is a complete graph and is synchronous.

Broadcast: a processor sends a message to all processors in one round. At the end of the round, everybody has received the message (for example, the value a).

Two or more processes can broadcast in the same round. If one process broadcasts a and another broadcasts b, then at the end of the round everybody has received both a and b.
Crash Failures: when a faulty processor crashes while broadcasting a, some of its messages are lost and are never received, so only some of the processors receive a.

After the failure, the process disappears from the network: it takes no part in any later round.
Consensus
Start: Everybody has an initial value.
Finish: Everybody must decide the same value.

Validity condition: If everybody starts with the same value, they must decide that value.
A simple algorithm
Each processor:
1. Broadcast value to all processors
2. Decide on the minimum
(only one round is needed)

Example: five processors start with the values 0, 1, 2, 3, 4. After the broadcast, every processor holds 0, 1, 2, 3, 4, and every processor decides on the minimum, 0.

This algorithm satisfies the validity condition: if everybody starts with the same initial value, that value is the minimum, so everybody decides on it.
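The one-round algorithm can be sketched in a few lines of Python. This is a minimal illustration, assuming a failure-free synchronous network in which every broadcast reaches every processor; the function name `simple_consensus` is hypothetical.

```python
def simple_consensus(initial_values):
    """One-round minimum consensus in a failure-free synchronous network."""
    # Round 1: every processor broadcasts, so every processor receives all values.
    received = [set(initial_values) for _ in initial_values]
    # Each processor decides on the minimum value it received.
    return [min(values) for values in received]


print(simple_consensus([0, 1, 2, 3, 4]))
print(simple_consensus([3, 3, 3]))
```

The second call illustrates validity: when all inputs are equal, that value is the only candidate for the minimum.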
Consensus with Crash Failures

With crash failures the simple algorithm (each processor broadcasts its value to all processors and decides on the minimum) doesn’t work.

Example: the processors start with the values 0, 1, 2, 3, 4. The processor with value 0 fails and doesn’t broadcast its value to all processors. After the broadcast, two of the remaining processors hold 0, 1, 2, 3, 4 and two hold 1, 2, 3, 4. The first two decide on 0 and the other two decide on 1. No consensus!
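The failure case can be reproduced with a small sketch. It assumes a single processor crashes during its broadcast and that its value reaches only a chosen subset of receivers; the names `one_round_with_crash`, `crashed` and `delivered_to` are hypothetical.

```python
def one_round_with_crash(initial_values, crashed, delivered_to):
    """Run the one-round minimum algorithm with a single crash.

    crashed      -- index of the processor that crashes while broadcasting
    delivered_to -- indices that still receive the crashed processor's value
    Returns {processor index: decision} for the surviving processors.
    """
    n = len(initial_values)
    decisions = {}
    for receiver in range(n):
        if receiver == crashed:
            continue  # the crashed processor never decides
        received = {
            initial_values[sender]
            for sender in range(n)
            if sender != crashed or receiver in delivered_to
        }
        decisions[receiver] = min(received)
    return decisions


# Processor 0 (value 0) crashes; only processors 2 and 4 receive its value.
decisions = one_round_with_crash([0, 1, 2, 3, 4], crashed=0, delivered_to={2, 4})
print(decisions)
```

Processors 2 and 4 see the value 0 and the other survivors do not, so the survivors split between two decisions.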
If an algorithm solves consensus for f failed processes, we say it is an f-resilient consensus algorithm.

Example: the input and output of a 3-resilient consensus algorithm. Up to 3 processes may fail, and all remaining processes finish with the same decision (1 in the figure).

An f-resilient algorithm
Round 1: Broadcast my value
Round 2 to round f+1: Broadcast any new received values
End of round f+1: Decide on the minimum value received
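The round structure above can be simulated as follows. This is a sketch under stated assumptions: the `crashes` argument is a hypothetical encoding of a crash schedule (round number to crashing processor to the receivers it still reaches), and `flood_min` is an illustrative name.

```python
def flood_min(initial_values, f, crashes=None):
    """Simulate the (f+1)-round flooding algorithm under crash failures.

    crashes maps a round number to {crashing processor: set of receivers
    that still get its message in that round} (hypothetical encoding).
    Returns {processor index: decision} for the processors still alive.
    """
    crashes = crashes or {}
    n = len(initial_values)
    known = [{v} for v in initial_values]  # every value a processor has seen
    fresh = [{v} for v in initial_values]  # values it will broadcast next
    alive = set(range(n))
    for rnd in range(1, f + 2):  # rounds 1 .. f+1
        crashing = crashes.get(rnd, {})
        inbox = [set() for _ in range(n)]
        for sender in alive:
            # A crashing sender reaches only the listed receivers.
            for receiver in crashing.get(sender, alive):
                inbox[receiver] |= fresh[sender]
        alive -= set(crashing)
        for p in alive:
            fresh[p] = inbox[p] - known[p]  # only newly learned values
            known[p] |= inbox[p]
    return {p: min(known[p]) for p in sorted(alive)}


# f = 1: processor 0 crashes in round 1 and reaches only processor 2.
crash_schedule = {1: {0: {2}}}
print(flood_min([0, 1, 2, 3, 4], f=1, crashes=crash_schedule))
```

In round 2, processor 2 forwards the newly learned value 0, so every survivor ends up with the same set of values and the same minimum.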
Example: f = 1 failure, f+1 = 2 rounds needed
Start: the processors hold the values 0, 1, 2, 3, 4.
Round 1: Broadcast all values to everybody. The processor with value 0 fails, and 0 reaches only some of the processors: two hold 0, 1, 2, 3, 4 and two hold 1, 2, 3, 4.
Round 2: Broadcast all new values to everybody. Every remaining processor now holds 0, 1, 2, 3, 4.
Finish: Decide on the minimum value. Every remaining processor decides on 0.
Example: f = 2 failures, f+1 = 3 rounds needed (an example execution with 2 failures)
Start: the processors hold the values 0, 1, 2, 3, 4.
Round 1: Broadcast all values to everybody. Failure 1: the processor with value 0 fails, and 0 reaches only some of the processors, which hold 0, 1, 2, 3, 4; the others hold 1, 2, 3, 4.
Round 2: Broadcast new values to everybody. Failure 2: a second processor fails during this round.
Round 3: Broadcast new values to everybody. Every remaining processor now holds 0, 1, 2, 3, 4.
Finish: Decide on the minimum value. Every remaining processor decides on 0.
Example: f = 2 failures, f+1 = 3 rounds needed (another example execution with 2 failures)
Start: the processors hold the values 0, 1, 2, 3, 4.
Round 1: Broadcast all values to everybody. Failure 1: the processor with value 0 fails, and 0 reaches only some of the processors.
Round 2: Broadcast new values to everybody. No process fails in this round, so every remaining processor holds 0, 1, 2, 3, 4. Remark: at the end of this round all processes know about all the other values.
Round 3: Broadcast new values to everybody. Failure 2 occurs (no new values are learned in this round).
Finish: Decide on the minimum value. Every remaining processor decides on 0.
If there are f failures and f+1 rounds, then there is a round with no failed process.
Example: 5 failures, 6 rounds. At least one of the rounds 1 to 6 has no failure.

In the algorithm, at the end of the round with no failure, every (non-faulty) process knows about all the values of all other participating processes. This knowledge doesn’t change until the end of the algorithm.

Therefore, at the end of the round with no failure, everybody would decide the same value. However, we don’t know the exact position of this round, so we have to let the algorithm execute for f+1 rounds.
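The counting argument (at most f failures spread over f+1 rounds) can be checked directly. A minimal sketch, assuming the rounds in which crashes occur are given as a list; `failure_free_rounds` is a hypothetical helper name.

```python
def failure_free_rounds(crash_rounds, f):
    """Return the rounds in 1..f+1 in which no process crashes.

    crash_rounds lists the round of each crash (at most f entries), so by
    the pigeonhole principle the result is never empty.
    """
    assert len(crash_rounds) <= f, "at most f failures are allowed"
    crashed = set(crash_rounds)
    return [r for r in range(1, f + 2) if r not in crashed]


# 5 failures over 6 rounds: one crash in each of rounds 1, 2, 4, 5, 6.
print(failure_free_rounds([1, 2, 4, 5, 6], f=5))
```

Even when the failures are spread as widely as possible, one round stays clean, which is the round where all surviving processes learn all values.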
Validity of the algorithm: when all processes start with the same input value, the consensus is that value. This holds since the value decided by each process is some input value.
A Lower Bound

Theorem: Any f-resilient consensus algorithm requires at least f+1 rounds.

Proof sketch: Assume for contradiction that f or fewer rounds are enough.
Worst case scenario: there is a process that fails in each round.
Round 1: before the process fails, it sends its value a to only one process.
Round 2: before that process fails, it sends value a to only one process.
...
Round f: at the end of round f only one process knows about value a.
This process may decide a, and all other processes may decide another value (b).
Therefore f rounds are not enough. At least f+1 rounds are needed.
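The worst case scenario can be replayed against the flooding algorithm to see the gap between f and f+1 rounds. This is only an illustration for that one algorithm, not a proof of the general theorem; `run_flooding` and the crash-schedule encoding are hypothetical.

```python
def run_flooding(initial_values, rounds, crashes):
    """Flood new values for a fixed number of rounds, then decide on the minimum.

    crashes maps a round number to {crashing processor: set of receivers
    that still get its message in that round} (hypothetical encoding).
    """
    n = len(initial_values)
    known = [{v} for v in initial_values]
    fresh = [{v} for v in initial_values]
    alive = set(range(n))
    for rnd in range(1, rounds + 1):
        crashing = crashes.get(rnd, {})
        inbox = [set() for _ in range(n)]
        for sender in alive:
            for receiver in crashing.get(sender, alive):
                inbox[receiver] |= fresh[sender]
        alive -= set(crashing)
        for p in alive:
            fresh[p] = inbox[p] - known[p]
            known[p] |= inbox[p]
    return {p: min(known[p]) for p in sorted(alive)}


# f = 2. Value a = 0 is passed along a chain of crashing processes:
# processor 0 tells only processor 1, which then tells only processor 2.
initial = [0, 1, 1, 1]
chain = {1: {0: {1}}, 2: {1: {2}}}
print(run_flooding(initial, 2, chain))  # only f rounds
print(run_flooding(initial, 3, chain))  # f+1 rounds
```

After f = 2 rounds only processor 2 knows the value 0, so it disagrees with processor 3; one extra round lets it forward the value.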
Byzantine Failures

A faulty processor may send different values to different processes (for example a, b, a, c), so different processes receive different values.

A Byzantine process can behave like a crash-failed process: some messages may be lost.

After a failure, the process continues functioning in the network and may fail again in a later round.
Consensus with Byzantine Failures

f-resilient consensus algorithm: solves consensus for f failed processes.

Example: the input and output of a 1-resilient consensus algorithm. The non-faulty processes finish with the same decision (3 in the figure).

Validity condition: if all non-faulty processes start with the same value, then all non-faulty processes decide that value.

Lower bound on number of rounds
Theorem: Any f-resilient consensus algorithm with Byzantine failures requires at least f+1 rounds.
Proof: follows from the crash failure lower bound.
A Consensus Algorithm: The King algorithm

The King algorithm solves consensus with n processes and f failures, where f < n/4.

There are f+1 phases. Each phase has two broadcast rounds. In each phase there is a different king.

Example: 12 processes, 2 faults, 3 kings (King 1, King 2, King 3). Remark: since there are more kings than faults, there is a king that is not faulty.

Each processor p_i has a preferred value v_i. In the beginning, the preferred value is set to the initial value.

Phase k
Round 1, processor p_i: Broadcast preferred value v_i. Let a be the majority of received values (including v_i); in case of tie pick an arbitrary value. Set v_i = a.
Round 2, king p_k: Broadcast new preferred value v_k.
Round 2, process p_i: If v_i had a majority of at most n/2 + f votes, then set v_i = v_k.

End of Phase f+1: Each process decides on its preferred value.
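The two rounds of each phase can be simulated as follows. This is a sketch under stated assumptions, not a definitive implementation: it assumes f < n/4, a strong-majority threshold of more than n/2 + f votes, that processor k-1 acts as king of phase k, and a hypothetical `byzantine_value` callback that models what a faulty sender tells each receiver.

```python
from collections import Counter


def king_algorithm(initial, faulty, byzantine_value):
    """Simulate the King algorithm; returns {processor: decision} for good ones.

    byzantine_value(sender, receiver, phase, rnd) returns what a faulty
    sender tells a given receiver (hypothetical adversary interface).
    """
    n, f = len(initial), len(faulty)
    assert 4 * f < n, "this sketch assumes f < n/4"
    pref = list(initial)
    good = [i for i in range(n) if i not in faulty]
    for phase in range(1, f + 2):  # phases 1 .. f+1
        king = phase - 1  # assumption: processor k-1 is the king of phase k
        # Round 1: everybody broadcasts; each processor takes the majority value.
        majority = {}
        for i in good:
            received = [
                byzantine_value(j, i, phase, 1) if j in faulty else pref[j]
                for j in range(n)
            ]
            majority[i] = Counter(received).most_common(1)[0]  # (value, votes)
        for i in good:
            pref[i] = majority[i][0]
        # Round 2: the king broadcasts; without a strong majority, adopt its value.
        king_pref = pref[king]
        for i in good:
            king_value = (
                byzantine_value(king, i, phase, 2) if king in faulty else king_pref
            )
            if majority[i][1] <= n / 2 + f:
                pref[i] = king_value
    return {i: pref[i] for i in good}


def split_adversary(sender, receiver, phase, rnd):
    """A faulty processor that tells odd receivers 1 and even receivers 0."""
    return receiver % 2


# 6 processes, 1 fault: processor 0 is Byzantine and is also the first king.
decisions = king_algorithm([0, 0, 0, 1, 1, 1], {0}, split_adversary)
print(decisions)
```

The first king is faulty here, so the first phase may leave the processors split; the second phase has a non-faulty king and brings them to agreement.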
Example: 6 processes, 1 fault. The non-faulty processes start with the values 0 and 1, and there are two kings (king 1 and king 2).
Phase 1, Round 1: Everybody broadcasts. Received values in the figure include 2,1,1,0,0,0 and 2,1,1,1,0,0. Everybody chooses the majority. No majority vote was a strong majority, so in round 2 everybody will choose the king’s value.
Phase 1, Round 2: The king (king 1) broadcasts. Everybody chooses the king’s value.
Phase 2, Round 1: Everybody broadcasts. Received values in the figure include 2,1,1,0,0,0 and 2,1,1,1,0,0. Everybody chooses the majority. No majority vote was a strong majority, so in round 2 everybody will choose the king’s value.
Phase 2, Round 2: The king (king 2) broadcasts. Everybody chooses the king’s value. This is the final decision.
Theorem: In the phase where the king is non-faulty, every non-faulty processor decides the same value.

Proof: Consider the phase φ in which the king is non-faulty. At the end of round 1, we examine two cases:
Case 1: some node has chosen its preferred value with strong majority (more than n/2 + f votes).
Case 2: no node has chosen its preferred value with strong majority.

Case 1: suppose node i has chosen its preferred value a with strong majority (more than n/2 + f votes). At the end of round 1, every other node must have preferred value a (including the king). Explanation: at most f of the votes came from faulty nodes, so more than n/2 non-faulty nodes must have broadcast a at the start of round 1, and a is the majority value received by every non-faulty node.
At the end of round 2: if a node keeps its own value, then it decides a. If a node gets the value of the king, then it decides a, since the king has decided a. Therefore: every non-faulty node decides a.

Case 2: no node has chosen its preferred value with strong majority (more than n/2 + f votes). Every non-faulty node will adopt the value of the king, thus all decide on the same value.
END of PROOF

Let a be the value decided at the end of phase φ. After φ, value a will always be preferred with strong majority, since the number of non-faulty processors is n - f > n/2 + f (since f < n/4). Thus, from φ until the end of phase f+1, every non-faulty processor decides a.
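The inequality behind this last step can be written out. The derivation below assumes, as in the King algorithm, that a strong majority means more than n/2 + f votes and that f < n/4.

```latex
\begin{align*}
n - f > \frac{n}{2} + f
  &\iff \frac{n}{2} > 2f \\
  &\iff f < \frac{n}{4}.
\end{align*}
```

For example, with n = 12 and f = 2 there are n - f = 10 non-faulty processors, and 10 > 12/2 + 2 = 8, so once all non-faulty processors prefer the same value they keep it in every later phase.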
An Impossibility Result

Theorem: There is no f-resilient algorithm for n processes, where f ≥ n/3.
Proof: First we prove the 3 process case, and then the general case.

The 3 processes case
Lemma: There is no 1-resilient algorithm for 3 processes.
Proof: Assume for contradiction that there is a 1-resilient algorithm for 3 processes. Each process A, B, C runs a local algorithm that takes an initial value, for example A(0), B(1), C(0), and produces a decision value.

Assume six copies of the local algorithms are connected in a ring: A(1), B(1), C(1), A(0), B(0), C(0). The processes think they are in a triangle.

Scenario 1: A(1) and B(1) behave as in a triangle in which C is faulty (C(1) on one side, C(0) on the other). By the validity condition, both decide 1.
Scenario 2: B(0) and C(0) behave as in a triangle in which A is faulty (A(0) on one side, A(1) on the other). By the validity condition, both decide 0.
Scenario 3: A(1) and C(0) behave as in a triangle in which B is faulty (B(1) on one side, B(0) on the other). From scenarios 1 and 2, A(1) decides 1 and C(0) decides 0. Impossible, since the algorithm is 1-resilient and the two non-faulty processes must decide the same value.

Therefore: there is no algorithm that solves consensus for 3 processes in which 1 is a Byzantine process.
The n processes case
Assume for contradiction that there is an f-resilient algorithm A for n processes, where f ≥ n/3. We will use algorithm A to solve consensus for 3 processes and 1 failure (contradiction).

Algorithm A starts with n processes holding initial values (0s and 1s), tolerates f failures, and finishes with all non-faulty processes deciding the same value.

Each of the 3 processes simulates algorithm A on n/3 of the n processes. When one of the 3 processes fails, the n/3 processes it simulates fail too. Algorithm A tolerates n/3 failures, so at the finish of algorithm A all the simulated non-faulty processes decide the same value k.

Final decision: the two non-faulty processes decide k. We reached consensus with 1 failure. Impossible!

Therefore: there is no f-resilient algorithm for n processes, where f ≥ n/3.
Randomized Byzantine Agreement

There is a trustworthy processor which at every round throws a random coin and informs every other processor.
Coin = heads (probability 1/2)
Coin = tails (probability 1/2)

Each processor p_i has a preferred value v_i. In the beginning, the preferred value is set to the initial value. Assume that the initial value is binary.

The algorithm tolerates f < n/8 Byzantine processors. There are three threshold values: L = 5n/8 + 1, H = 3n/4 + 1, G = 7n/8.

In each round, processor p_i executes:
Broadcast v_i;
Receive values from all processors;
maj_i = majority value;
tally_i = occurrences of maj_i;
If coin = heads then threshold = L else threshold = H;
If tally_i ≥ threshold then v_i = maj_i else v_i = 0;
If tally_i ≥ G then decision is reached
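The per-round steps can be simulated as follows. This is a sketch under loudly stated assumptions: the threshold values L = 5n/8 + 1, H = 3n/4 + 1, G = 7n/8 and the bound f < n/8 are the ones commonly used for this algorithm and are assumed here; n is taken to be a multiple of 8 so the thresholds are integers; `byzantine_bit` is a hypothetical callback for what a faulty sender tells each receiver; and the shared coin is drawn from a seeded generator.

```python
import random


def randomized_agreement(initial, faulty, byzantine_bit, rng, max_rounds=1000):
    """Simulate the coin-based agreement; returns (decisions, rounds used)."""
    n, f = len(initial), len(faulty)
    assert n % 8 == 0 and 8 * f < n, "sketch assumes 8 divides n and f < n/8"
    L, H, G = 5 * n // 8 + 1, 3 * n // 4 + 1, 7 * n // 8  # assumed thresholds
    pref = list(initial)
    good = [i for i in range(n) if i not in faulty]
    decided = {}
    for rnd in range(1, max_rounds + 1):
        heads = rng.random() < 0.5  # the trusted processor's shared coin
        threshold = L if heads else H
        new_pref = list(pref)
        for i in good:
            votes = [
                byzantine_bit(j, i, rnd) if j in faulty else pref[j]
                for j in range(n)
            ]
            ones = sum(votes)
            maj = 1 if ones > n - ones else 0  # ties resolved to 0
            tally = ones if maj == 1 else n - ones
            new_pref[i] = maj if tally >= threshold else 0
            if tally >= G:
                decided.setdefault(i, maj)  # a decision is permanent
        pref = new_pref
        if len(decided) == len(good):
            return decided, rnd
    raise RuntimeError("no agreement within max_rounds")


def split_adversary(sender, receiver, rnd):
    """A faulty processor that tells odd receivers 1 and even receivers 0."""
    return receiver % 2


# 16 processors, processor 0 is Byzantine; the good ones hold 7 zeros and 8 ones.
initial = [0] + [0] * 7 + [1] * 8
decisions, rounds = randomized_agreement(initial, {0}, split_adversary, random.Random(0))
print(decisions, rounds)
```

In this run no tally reaches L in the first round, so every good processor falls back to 0 whatever the coin shows, and all of them decide in the second round.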
Analysis: Examine the following cases in a round.
Termination: there is a processor p_i with tally_i ≥ G.
Other cases:
Case 1: two processors p_i and p_k have different majority values, maj_i ≠ maj_k.
Case 2: all processors have the same majority value.

Termination: there is a processor p_i with tally_i ≥ G. Since faulty processors are at most f, processor p_i received at least G - f votes for maj_i from good processors. Therefore, every processor p_k will have maj_k = maj_i with tally_k ≥ G - f, which meets both thresholds L and H. Consequently, at the end of the round all the good processors will have the same preferred value: v_k = maj_i.

Observation: if at the beginning of a round all the good processors have the same preferred value, then the algorithm terminates in that round. This holds since for every processor the termination condition will be true in that round: its tally is at least n - f > G.

Therefore, if the termination condition is true for one processor at a round, then the termination condition will be true for all processors at the next round.

Case 1: two processors p_i and p_k have different majority values, maj_i ≠ maj_k. It has to be that tally_i < L and tally_k < L, and therefore both tallies are below the threshold. The same argument applies to every other processor, whose majority value differs from maj_i or from maj_k. Thus, every processor chooses 0, and the algorithm terminates in the next round.
To see this, suppose (for the sake of contradiction) that tally_i ≥ L. Then at least L - f > n/2 good processors have voted maj_i. Consequently, maj_k = maj_i. Contradiction!

Case 2: all processors have the same majority value. Then for any two processors p_i and p_k it holds that |tally_i - tally_k| ≤ f, since otherwise the number of faulty processors would exceed f.

Let p_i be the processor with the maximum tally_i.

Sub-case 1: tally_i < H. Then for any processor p_k it holds that tally_k < H. If coin = tails (this occurs with probability 1/2), then threshold = H and therefore tally_k < threshold. Thus, every processor chooses 0, and the algorithm terminates in the next round (this occurs with probability 1/2).

Sub-case 2: tally_i ≥ H. Then for any processor p_k it holds that tally_k ≥ H - f ≥ L. If coin = heads (this occurs with probability 1/2), then threshold = L and therefore tally_k ≥ threshold. Thus, every processor chooses the common majority value, and the algorithm terminates in the next round (this occurs with probability 1/2).
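The case analysis suggests a bound on the expected running time. The derivation below is a sketch, assuming that in every round the good processors reach a common preferred value with probability at least p = 1/2, independently of earlier rounds (the coin is thrown fresh each round); T is a hypothetical name for the number of rounds until that happens.

```latex
\begin{align*}
\mathbb{E}[T] \;\le\; \sum_{k \ge 1} k \,(1-p)^{k-1}\, p \;=\; \frac{1}{p} \;=\; 2
\qquad \text{for } p = \tfrac{1}{2}.
\end{align*}
```

Once the good processors share a preferred value, the observation about rounds that begin in agreement gives termination in the following round, so under these assumptions the expected number of rounds is a small constant that does not depend on n.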