Randomized Byzantine Agreements (Sam Toueg 1984)
2 Motivation
It is already known that no deterministic agreement algorithm exists for asynchronous communication.
Therefore use a randomized protocol that terminates with probability 1.
Show that no Byzantine Agreement algorithm can overcome more than ⌊(n-1)/3⌋ faulty processes in asynchronous systems.
Show a minimal algorithm.
3 Model
n processes, at most t faulty (the rest are proper).
Reliable point-to-point communication.
Digital signatures authenticate all messages.
4 Model, cont’d
Use the idea from Shamir’s “How to Share a Secret” (1979): divide a secret among n participants so that t+1 (< n) pieces are necessary and sufficient to reconstruct it.
Use a non-faulty dealer that generates a sequence of random bits, each bit a shared secret.
5 Compute_secret function
compute_secret(s_k):
  Broadcast own piece of s_k to all processes.
  Wait to receive t+1 pieces, then compute s_k.
If proper processes are not to rely on faulty processes to compute the secret, the n-t proper processes must hold at least t+1 pieces among themselves: n-t ≥ t+1, that is, n ≥ 2t+1.
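The threshold sharing behind compute_secret can be sketched as follows. This is a minimal illustration of Shamir's (t+1)-of-n scheme over a prime field, not the construction from the paper: the field size `PRIME`, the function name `share_secret`, and the `(x, y)` share layout are assumptions, and signatures on the pieces are not modeled.

```python
import random

# Field size is an assumption; any prime larger than n and the secret works.
PRIME = 2_147_483_647  # 2**31 - 1

def share_secret(secret, n, t, rng=random):
    """Dealer side: split `secret` into n pieces; any t+1 reconstruct it."""
    # Random polynomial of degree t whose constant term is the secret.
    coeffs = [secret % PRIME] + [rng.randrange(PRIME) for _ in range(t)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation mod PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares

def compute_secret(shares):
    """Process side: Lagrange-interpolate at x = 0 from t+1 received pieces."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

With n = 7 and t = 2, any three of the seven pieces returned by `share_secret(1, 7, 2)` give back the bit 1 when passed to `compute_secret`.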
6 An asynchronous broadcast algorithm
Limits the power of faulty processes, for n ≥ 3t+1.
echo_broadcast(process G, message m):
  G sends [initial, G, m] to all processes.
  Every recipient replies with [echo, G, m] to all processes and ignores any subsequent [initial, G, m'].
  Upon receiving [echo, G, m] from more than (n+t)/2 distinct processes, a process accepts m from G.
7 Proof for echo_broadcast
Claim 1: All messages accepted from G by proper processes are identical.
  If not, then proper processes G_1 and G_2 accept m_1 and m_2 respectively, with m_1 ≠ m_2. So more than (n+t)/2 processes sent [echo, G, m_1] and more than (n+t)/2 sent [echo, G, m_2]. The two sets together count more than n+t senders among only n processes, so more than t processes sent both echoes and are thus faulty. Yet at most t are faulty. Contradiction.
Claim 2: If G is proper, then all proper processes accept m from G.
  If G is proper, it sends [initial, G, m] to all processes. At least n-t processes are proper and send [echo, G, m] to all processes. A proper process accepts m after receiving more than (n+t)/2 echoes. Since n ≥ 3t+1, we get n-t > (n+t)/2, and thus all proper processes accept m.
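The two counting arguments in this proof can be checked numerically. The sketch below assumes the acceptance rule is "strictly more than (n+t)/2 echoes"; the function names are hypothetical and only quorum sizes are modeled, not message passing.

```python
from itertools import combinations

def echo_threshold(n, t):
    """Smallest echo count that is strictly more than (n + t) / 2."""
    return (n + t) // 2 + 1

def min_quorum_overlap(n, t):
    """Fewest processes two echo quorums must share (pigeonhole bound)."""
    return max(0, 2 * echo_threshold(n, t) - n)

def brute_force_overlap(n, t):
    """Same quantity by enumerating every pair of quorums (small n only)."""
    q = echo_threshold(n, t)
    quorums = [set(c) for c in combinations(range(n), q)]
    return min(len(a & b) for a in quorums for b in quorums)

def proper_echoes_suffice(n, t):
    """Liveness: the n - t proper echoes alone reach the threshold."""
    return n - t >= echo_threshold(n, t)
```

For n = 4 and t = 1 the threshold is 3 echoes, any two quorums share at least 2 > t processes, and the 3 proper processes reach the threshold by themselves. With n = 3 and t = 1, `proper_echoes_suffice` is false, which is where the requirement n ≥ 3t+1 enters.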
8 Async binary Byzantine Agreement, n ≥ 3t+1
G_i: M := M_i
for k = 1 to k = R do
  (* Phase 1 *)
  broadcast M;
  wait to receive M-messages from n-t distinct processes;
  proof := set of received messages;
  count(1) := number of received messages with M = 1;
  if count(1) ≥ n-2t then M := 1 else M := 0;
  (* Phase 2 *)
  echo_broadcast [M, proof];
  wait to accept [M, proof]-messages, with a correct proof, from n-t distinct processes;
  count(1) := number of accepted messages with M = 1;
  compute_secret(s_k);
  if (s_k = 0 and count(1) ≥ 1) or (s_k = 1 and count(1) ≥ 2t+1)
    then M := 1 else M := 0;
od
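The local decision rules of one iteration can be written as small pure functions. This is a sketch under stated assumptions: the Phase 1 test is taken as count(1) ≥ n-2t, the form used in the proof of part 2; a "proof" is reduced to the list of Phase 1 values a sender claims to have received; signatures and networking are not modeled; and all function names are hypothetical.

```python
def phase1_value(count1, n, t):
    """Phase 1: adopt 1 iff at least n - 2t of the received values are 1."""
    return 1 if count1 >= n - 2 * t else 0

def proof_is_correct(claimed, proof_values, n, t):
    """A proof backs `claimed` if it holds n - t values and Phase 1 yields it."""
    if len(proof_values) != n - t:
        return False
    return phase1_value(sum(proof_values), n, t) == claimed

def phase2_value(s_k, count1, t):
    """Phase 2: the shared coin s_k selects the threshold, 1 or 2t + 1."""
    threshold = 1 if s_k == 0 else 2 * t + 1
    return 1 if count1 >= threshold else 0
```

For n = 4 and t = 1, a process that accepted two messages with M = 1 sets M := 1 when s_k = 0 (threshold 1) and M := 0 when s_k = 1 (threshold 3).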
10 Proofs for binary async Byzantine Agreement
1. Terminates? Yes: at least n-t processes are proper, so every non-faulty process eventually obtains n-t messages and exits both wait phases. As for rounds, R is constant.
2. If the system is initially proper (all non-faulty processes have the same value m), then every proper process terminates the algorithm with M = m.
11 Proof of part 2
Phase 1: count(1) ≥ n-2t iff m = 1.
In the beginning, at most t processes (the faulty ones) broadcast a value different from m. Therefore, among the n-t distinct messages received by a proper process G, at least n-2t have M = m and at most t have the other value.
Therefore if m = 1 then count(1) ≥ n-2t, and if m = 0 then count(1) ≤ t. Note that t < n-2t, because n ≥ 3t+1.
Thus every proper process sets M := m at the end of Phase 1 of iteration k.
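The Phase 1 argument can be checked exhaustively for small parameters. The sketch below is a hypothetical helper: it assumes the Phase 1 rule "M := 1 iff count(1) ≥ n-2t" and enumerates every way up to t faulty senders can place values among the n-t messages a proper process receives, with all proper senders holding m.

```python
def phase1_always_keeps(m, n, t):
    """True if no adversary choice makes a proper process leave value m."""
    received = n - t
    for faulty_msgs in range(t + 1):                # received from faulty senders
        for faulty_ones in range(faulty_msgs + 1):  # how many of those carry 1
            # Proper senders all hold m; faulty senders add `faulty_ones` ones.
            count1 = (received - faulty_msgs) * m + faulty_ones
            decided = 1 if count1 >= n - 2 * t else 0
            if decided != m:
                return False
    return True
```

The check passes for every n ≥ 3t+1 tried, and fails for m = 0 with n = 3, t = 1: there t = n-2t, so a single faulty 1 is enough to flip the decision.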
12 Proof of part 2, cont’d
Phase 2: There are no correct proofs for any value m' different from m.
If m = 0, then at least n-t processes have M = 0 at the start. A proof of m' = 1 cannot exist: it needs n-2t signed messages with value 1, but only the at most t faulty processes sign such messages, and n-2t > t.
If m = 1, then a correct proof for m' = 0 consists of n-t messages from distinct processes with at least t+1 of them having value 0. This is also impossible, since at most t processes sign value 0.
Since there are no correct proofs for values different from m, every proper process accepts only messages with M = m in Phase 2 of the iteration, so count(1) = n-t ≥ 2t+1 if m = 1 and count(1) = 0 if m = 0.
Therefore, at the end of Phase 2, every proper process sets M := m, independently of the value of the bit s_k.
13 Proof, part 3
3. If the system is not initially proper, then with probability at least 1-(1/2)^R, every proper process terminates the algorithm with the same value M.
Show that for k ≥ 1, if state(k) = disagreement then, with probability ≥ 1/2, state(k+1) = agreement.
15 Proof part 3, cont’d
Let G be a proper process that accepts n-t [M, proof] messages, from G_1 ... G_{n-t}. There are two possible cases for count(1) at G.
Case count(1) ≥ t+1: w.l.o.g. G accepts a message with M = 1 from each of G_1 ... G_{t+1}. Every other proper G' accepts messages from all but at most t processes, so G' must accept a message from at least one of G_1 ... G_{t+1}. By echo_broadcast, that message is identical to the one G accepted, so it has M = 1. Therefore count(1) ≥ 1 for all proper G' at the end of this phase.
16 Proof part 3, cont’d
Case count(1) < t+1: any other proper G' accepts messages with M = 1 from at most t processes in G_1 ... G_{n-t} and from at most t processes in G_{n-t+1} ... G_n. Therefore G' has count(1) < 2t+1 at the end of this phase.
Let a_k denote the probability that count(1) ≥ t+1.
If s_k = 0 and count(1) ≥ t+1, then all proper processes have count(1) ≥ 1 and set M := 1 at the end of iteration k. This happens with probability (1/2) a_k.
If s_k = 1 and count(1) < t+1, then all proper processes have count(1) < 2t+1 and set M := 0 at the end of iteration k. This happens with probability (1/2)(1 - a_k).
So state(k+1) = agreement with probability at least (1/2) a_k + (1/2)(1 - a_k) = 1/2.
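The arithmetic of part 3 can be worked through with exact fractions. The sketch below assumes, as the products above do, that s_k is a fair coin independent of count(1); the function names are hypothetical. It shows that the per-round bound is 1/2 whatever a_k is, and how many rounds R push the failure bound (1/2)^R below a target.

```python
from fractions import Fraction

def agreement_probability(a_k):
    """Lower bound on Pr[state(k+1) = agreement], a_k = Pr[count(1) >= t+1]."""
    a_k = Fraction(a_k)
    half = Fraction(1, 2)  # probability of each value of the coin s_k
    return half * a_k + half * (1 - a_k)

def failure_bound(rounds):
    """Upper bound on the probability of still disagreeing after R rounds."""
    return Fraction(1, 2) ** rounds

def rounds_needed(epsilon):
    """Smallest R with (1/2)**R <= epsilon."""
    r = 0
    while failure_bound(r) > Fraction(epsilon):
        r += 1
    return r
```

For example, `rounds_needed(Fraction(1, 1000))` gives R = 10, since (1/2)^10 = 1/1024 ≤ 1/1000 while (1/2)^9 = 1/512 is still too large.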
17 Upper bound on number of faulty processes
Theorem: There is no Byzantine Agreement algorithm for asynchronous systems where n ≤ 3t.
Proof sketch: By contradiction, suppose P is such an algorithm with n = 3t processes. Divide the processes into three groups A, B, C of size t each.
18 Upper bound proof, cont’d
Scenarios:
1. All processes in A and B have initial value 0; C has value 1. Start running P, but the processes in C die immediately. Since all proper processes start with 0, P must eventually agree on 0 using only A and B, at some time t_1.
2. B and C have initial value 1; A has 0. A dies immediately, and B and C agree on 1 at some time t_2.
19 Upper bound proof, cont’d
3. A has 0, C has 1, and B is faulty. B pretends towards A that its initial value is 0 and towards C that its initial value is 1. If messages between A and C are delayed longer than max(t_1, t_2), then the processes in A see the same scenario as (1) and therefore by time t_1 all in A agree on 0. Similarly, all in C see scenario (2) and agree on 1. The proper processes in A and C disagree, so P is not a Byzantine Agreement algorithm.