Synchronous Atomic Broadcast  A. Mok 2016

CS 386C Synchronous Atomic Broadcast  A. Mok 2016

Synchronous Atomic Broadcast for Redundant Broadcast Channels
[Figure: System Architecture. Processor A and Processor B each run an application process (SEND, RECEIVE) on top of an OS kernel (send, receive, DELIVER).]

System Assumptions
n processors with distinct, totally ordered names and bounded broadcast rates
Channels suffer omission failures (upper bound C on delay, but no atomicity)
Out-adaptors suffer performance failures (normal bound O on adaptor delay)
In-adaptors suffer omission failures (upper bound I, can be relaxed)
Processors suffer crash failures (upper bound P on SEND and RECEIVE)

Synchronous Atomic Broadcast for Redundant Broadcast Channels
Atomic broadcast satisfies the following properties:
Atomicity: If any correct processor delivers an update at time U, then that update was initiated by some processor and is delivered by all correct processors at time U.
Order: All updates delivered by correct processors are delivered in the same order by each correct processor.
Synchronous atomic broadcast also satisfies:
Termination: Every update whose broadcast is initiated by a correct processor at time T is delivered by all correct processors at time T+Δ.

System Assumptions (continued)
Processor clocks are correct and synchronized within ε
At most f ≤ n-2 failures during a broadcast
DELIVER can be scheduled (by the real-time executive)
We shall use the end-to-end delay parameter δ = P+O+C+I+P later.

Need for forwarding to achieve atomicity when f > 1
What should a processor do when a message (T,s,σ) that it has not seen before arrives?
Prompt forwarding rule: p forwards a message received on any channel c to all other channels as soon as the message arrives.
In the absence of failures, prompt forwarding requires 1+f+(n-1)f = nf+1 messages for each broadcast: f+1 from the sender, plus f forwarded copies from each of the other n-1 processors.

Need for forwarding to achieve atomicity when f > 1 (continued)

Lazy forwarding
The goal is to avoid forwarding messages when it is not necessary, e.g., when no failure occurs.

Simple lazy forwarding rule
A sending processor s enqueues any message m to be broadcast on the out-adaptors to channels 1,2, ..., f+1 in this order. Let p be an arbitrary processor different from s that receives m and let c be the highest channel number on which p receives a copy of m. If c ≥ f, then p does not need to forward m, else p forwards m on channels c+1, ..., f.

Simple lazy forwarding rule example
Fault budget f = 2
Example 1:
t1: s sends m on C1 and dies
t2: m reaches r; m dies before reaching q
t3: r forwards m on C2
t4: m reaches q from r
Example 2:
t1: s sends m on C1 and C2 and dies
t2: m reaches r on C1
t3: m reaches q on C2
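The forwarding decision in the simple lazy forwarding rule can be sketched as a small function. This is only an illustration of the rule as stated on the slide; the function name simple_lazy_forward and its parameters are assumptions made for this sketch, not part of the original material.

```python
def simple_lazy_forward(c: int, f: int) -> list[int]:
    """Channels on which a receiver forwards m under the simple rule.

    c is the highest channel number on which the receiver got a copy of m;
    f is the fault budget (the sender enqueues m on channels 1..f+1).
    """
    if c >= f:
        return []                      # no forwarding needed
    return list(range(c + 1, f + 1))   # forward on c+1, ..., f


# Example 1 (f = 2): r saw m only on C1, so it forwards on C2.
print(simple_lazy_forward(c=1, f=2))
# Example 2 (f = 2): q saw m on C2, so it forwards nothing.
print(simple_lazy_forward(c=2, f=2))
```

With f = 2, a receiver whose highest channel is C1 forwards on C2 only, which matches step t3 of Example 1.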

Theorem If s initiates broadcast of message m and some correct processor r (which does not fail in this broadcast) receives m and forwards m by following the simple lazy forwarding rule, then every correct q will also receive m.

Proof
If s has not failed, then m must have been sent on all f+1 channels. At least one of them must reach q. So assume s has failed.
Case c ≥ f
Subcase c = f+1: A copy of m must have been sent on each of the f+1 channels. At least one of them must reach q.

Proof (continued)
Subcase c = f: A copy of m must have been sent on each of f channels. If s has failed (count 1 failure), then not all f channels can fail, so at least one copy reaches q.
Case c < f: r forwards m so that m is sent on all f channels. If s has failed (count 1 failure), then not all f channels can fail, so at least one copy reaches q.

Key advantage of lazy forwarding
In the absence of failures, only f+1 messages are sent. This implies scalability, because the count is independent of n. f+1 is also the minimum number of messages required.
Theorem: f+1 is the minimum number of messages that must be sent in each broadcast to achieve atomicity and termination.
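To make the scalability claim concrete, the failure-free message counts of the two schemes can be compared directly. The sketch below only evaluates the two formulas given in the slides (nf+1 for prompt forwarding, f+1 for lazy forwarding); the function names are illustrative assumptions.

```python
def prompt_messages(n: int, f: int) -> int:
    # Sender: f+1 copies; each of the other n-1 processors forwards on f channels.
    return (f + 1) + (n - 1) * f


def lazy_messages(f: int) -> int:
    # Failure-free lazy forwarding: only the sender's f+1 copies.
    return f + 1


# The prompt count grows linearly with n; the lazy count does not depend on n.
for n in (4, 16, 64):
    print(n, prompt_messages(n, f=2), lazy_messages(f=2))
```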

Lower bound on messages sent is f+1
For example, consider the case f = 2 where only 2 messages are sent. In this scenario, at least two of {p, q, r}, say q and r, must not have sent any message; that is, they must receive the update from some other processor. However, q but not r will receive the message if r loses both in-adaptors.

A simple case: single-fault tolerance
No forwarding is needed.
To ensure order, order delivery by timestamp and processor name.
Each processor that receives m timestamped T must wait until T+δ+ε before delivering m to a process. This ensures that all messages timestamped T have arrived before any of them is delivered.
δ (end-to-end delay) = P+O+C+I+P
ε = maximum deviation between processor clocks
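As a small worked example of the waiting rule, the sketch below computes δ from the component bounds and the local time at which a message stamped T may be delivered. The numeric bounds are made up for illustration, and the function names are assumptions of this sketch.

```python
def end_to_end_delay(P: float, O: float, C: float, I: float) -> float:
    # delta = P + O + C + I + P (processor bound counted at both ends)
    return P + O + C + I + P


def delivery_time(T: float, delta: float, eps: float) -> float:
    # A message stamped T is delivered at local time T + delta + eps.
    return T + delta + eps


delta = end_to_end_delay(P=1, O=1, C=2, I=1)   # illustrative bounds only
print(delta, delivery_time(T=100, delta=delta, eps=2))
```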

The single-fault tolerant protocol
task Start;
  const Δ = δ + ε;
  var T: Time; σ: Update; s: Processor;
  cycle
    SEND(σ);
    T ← clock;
    for c = 1 to 2 do send(T,myid,σ) on c;
    H ← H ∪ (T,myid,σ);
    schedule Deliver(T) at T+Δ;
  endcycle

The single-fault tolerant protocol (cntd)
task Receive;
  const Δ = δ + ε;
  var U,T: Time; σ: Update; s: Processor;
  cycle
    receive(T,s,σ) from c;
    U ← clock;
    if U ≥ T + Δ then "late message" iterate fi;
    if T ∈ dom(H) & s ∈ dom(H(T)) then "deja vu" iterate fi;
    H ← H ∪ (T,s,σ);
    schedule Deliver(T) at T+Δ;
  endcycle

The single-fault tolerant protocol (cntd)
task Deliver(T: Time);
  var p: Processor; val: Processor→Update;
  val ← H(T);
  while dom(val) ≠ { } do
    p ← min(dom(val));
    RECEIVE(val(p));
    val ← val \ p;
  od
  H ← H \ T;
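The three tasks above can be exercised with a small Python sketch. It is not the original implementation: the class name Processor, the explicit now argument that stands in for the local clock, and the channels dictionary that stands in for the two broadcast channels are all assumptions made so the logic can run without a real-time executive.

```python
from collections import defaultdict


class Processor:
    def __init__(self, name: str, delta: float, eps: float):
        self.name = name
        self.big_delta = delta + eps     # Δ = δ + ε
        self.H = defaultdict(dict)       # H[T][s] = update
        self.scheduled = set()           # timestamps with a pending Deliver(T)
        self.delivered = []              # updates handed to the application

    def start(self, now, update, channels):
        """Start task: stamp the update, send it on both channels, record it."""
        for c in (1, 2):
            channels[c].append((now, self.name, update))
        self.H[now][self.name] = update
        self.scheduled.add(now)          # Deliver(T) is due at T + Δ

    def receive(self, now, msg):
        """Receive task: drop late or duplicate messages, otherwise record."""
        T, s, update = msg
        if now >= T + self.big_delta:
            return "late message"
        if s in self.H.get(T, {}):
            return "deja vu"
        self.H[T][s] = update
        self.scheduled.add(T)
        return "accepted"

    def deliver(self, T):
        """Deliver task: hand over updates stamped T in sender-name order."""
        val = self.H.pop(T, {})
        for p in sorted(val):
            self.delivered.append(val[p])
        self.scheduled.discard(T)


channels = {1: [], 2: []}
a, b = Processor("a", delta=4, eps=2), Processor("b", delta=4, eps=2)
a.start(now=0, update="x", channels=channels)
for c in (1, 2):
    for msg in channels[c]:
        print(b.receive(now=3, msg=msg))
b.deliver(0)
print(b.delivered)
```

In this run the copy on channel 1 is accepted, the copy on channel 2 is recognised as a duplicate, and the update is delivered once.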

The general case: tolerance to up to f faults
Ideas:
Discard late messages by using a hop count h.
A message time-stamped T with hop count h received at local time U is timely if U < T+h(δ+ε).
If a correct processor forwards a timely message, then the forwarded message will be timely for all correct processors.
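The timeliness test is a single comparison, sketched below. The function name is_timely is an assumption, and the values δ = 4, ε = 2 are used purely for illustration.

```python
def is_timely(U: float, T: float, h: int, delta: float, eps: float) -> bool:
    # A hop-h message stamped T is timely if it arrives before T + h(δ+ε) locally.
    return U < T + h * (delta + eps)


# With δ = 4 and ε = 2, a hop-1 message must arrive before local time T+6,
# and a hop-2 message before local time T+12.
print(is_timely(U=5, T=0, h=1, delta=4, eps=2))
print(is_timely(U=6, T=0, h=1, delta=4, eps=2))
print(is_timely(U=11, T=0, h=2, delta=4, eps=2))
```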

Ideas (continued)
If nothing goes wrong, the broadcast is accomplished in one hop.
If (T,s,σ,h), h>1, is accepted by processor p, then there must have been some faults since the broadcast started. Accordingly, p does not need to forward on all f+1 channels.
When forwarding is needed, we shall show that there must have been at least h faults, so p only needs to ensure that there are at least f+1-h copies of the message that can reach the remaining correct processors.

Lazy forwarding rule
To initiate a broadcast, a sender s enqueues messages (T,s,σ,1) on channels 1, 2, ..., f+1 in that order.
Let (T,s,σ,h), h ≤ k, where k = ⌊f/2⌋, be a message accepted by a processor p ≠ s, and let c be the highest channel on which p receives a copy of the message by local time T+h(δ+ε).
If c < f+1-h at T+h(δ+ε) on p's clock, then p forwards (T,s,σ,h+1) on channels c+1, ..., f+1-h.

Lazy forwarding rule example
Example with fault budget f = 5
t1: s0 sends m on C1 and dies.
t2: m reaches s1. m dies before reaching s2.
t3: s1 forwards m on C2 and dies before forwarding on C3.
t4: m from s1 reaches s2 on C2 and then dies.
t5: s2 forwards m on C3 and dies.
t6: m from s2 reaches s3.
s3 is the first correct processor to receive a copy of m. This copy will reach all other correct processors since no more faults can occur after 5 faults.
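The general rule differs from the simple one only in the hop-dependent bound f+1-h and the cut-off h ≤ ⌊f/2⌋. A minimal sketch follows; the function name lazy_forward and its return shape (next hop count plus channel list, or None) are assumptions chosen for illustration.

```python
def lazy_forward(c: int, h: int, f: int):
    """Decide what a processor forwards after accepting a hop-h message.

    c: highest channel on which a copy arrived by local time T + h(δ+ε)
    Returns (h+1, channels) if forwarding is required, otherwise None.
    """
    k = f // 2
    if h > k or c >= f + 1 - h:
        return None
    return h + 1, list(range(c + 1, f + 2 - h))   # channels c+1, ..., f+1-h


# f = 5: s1 got the hop-1 copy on C1, s2 got the hop-2 copy on C2,
# and s3 got a hop-3 copy (h exceeds k = 2, so it does not forward).
print(lazy_forward(c=1, h=1, f=5))
print(lazy_forward(c=2, h=2, f=5))
print(lazy_forward(c=3, h=3, f=5))
```

For f = 5 this tells s1 to send hop-2 copies on C2 through C5 and s2 to send hop-3 copies on C3 and C4, which is what the two processors begin to do in the example before they crash.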

Termination time Δ ≤ w + δ + ε
where w = worst-case delay to the first correct processor. In general,
w = k(δ+ε) if f = 2k
w = (k+1)(δ+ε) if f = 2k+1
Worst-case-delay-to-first-correct-processor scenario (e.g., f=5): [figure not reproduced]

Termination time example
w = k(δ+ε) if f = 2k
w = (k+1)(δ+ε) if f = 2k+1
w is the worst-case time to reach the first processor that does not crash during the execution of the broadcast protocol.
Example with fault budget f = 5: k = 2, w = (2+1)(δ+ε) = 3(δ+ε)
Example with fault budget f = 6: k = 3, w = 3(δ+ε)

Termination time (continued)
Δ can actually be set to (k+1)(δ+ε), where k = ⌊f/2⌋, regardless of whether f is even or odd.
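The two formulas can be evaluated side by side. The sketch below simply encodes w and Δ as stated in the slides; the function names and the sample values δ = 4, ε = 2 are assumptions for illustration.

```python
def worst_case_delay(f: int, delta: float, eps: float) -> float:
    # w = k(δ+ε) if f = 2k, and (k+1)(δ+ε) if f = 2k+1
    k = f // 2
    hops = k if f % 2 == 0 else k + 1
    return hops * (delta + eps)


def termination_time(f: int, delta: float, eps: float) -> float:
    # Δ = (⌊f/2⌋ + 1)(δ+ε), the value used by the f-fault tolerant protocol
    return (f // 2 + 1) * (delta + eps)


for f in (5, 6):
    print(f, worst_case_delay(f, delta=4, eps=2), termination_time(f, delta=4, eps=2))
```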

Example (f=6, need 7 channels)

Definitions
Suppose the original sender s is processor p_0.
Let c_h be the highest-numbered channel on which a correct processor p_h receives a hop-h message (T,s,σ,h) with h ≤ k, where k = ⌊f/2⌋.
Recursively define p_i, c_i for i = h-1, ..., 1 (backward) as follows:
p_i is the processor that sends the message (T,s,σ,i+1) to p_{i+1}, and c_i is the highest-numbered channel on which p_i received the hop-i message (T,s,σ,i).
p_1 is the processor that received a hop-1 message from the sender s = p_0, and it sends the message (T,s,σ,2) to p_2. The highest-numbered channel on which p_1 has received message (T,s,σ,1) is c_1.
Notice that c_i > c_j for all i > j.

Definitions (continued)
Let A_i be the set of components:
{ (1) p_{i-1}, (2) p_{i-1}'s out-adaptor to c_i + 1, (3) channel c_i + 1, (4) p_i's in-adaptor to c_i + 1 }
Recursively let C_1 = A_1, C_{i+1} = C_i ∪ A_{i+1}.
Notice that C_i = A_1 ∪ A_2 ∪ ... ∪ A_i and the sets C_i and A_{i+1} are disjoint.
[Figure: the chain p_0, p_1, ..., p_{i-1}, p_i with the sets A_1, A_2, ..., A_i; the four components of A_i are highlighted. Channel c_i is the highest channel on which p_i receives the hop-i message from p_{i-1}.]

Theorem
Suppose by local time T_h = T+h(δ+ε), processor p_h accepts a message (T,s,σ,h). If c_h ≤ f+1-h, then there are at least h faults in the set C_h by time T_h. If c_h > f+1-h, then there are at least h-1 faults in C_h.
Here C_h = A_1 ∪ A_2 ∪ ... ∪ A_h. A_h includes the sender of the hop-h message, and p_h receives the hop-h message. If p_{h-1} decides to forward the hop-h message to p_h, then there is an additional fault in A_h.
[Figure: the chain p_0, p_1, ..., p_{h-1}, p_h carrying the hop-1, hop-2, ..., hop-h messages, with the sets A_1, A_2, ..., A_h. Channel c_h is the highest channel on which p_h receives the hop-h message from p_{h-1}.]

Induction Proof
Base step h = 1: If c_1 ≤ f+1-1 = f, then p_1 did not receive on channel f+1. Hence one of the components in C_1 must have failed. Else, nothing need have failed, and the bound of h-1 = 0 faults holds trivially.

Induction Proof (continued)
Induction step: assume the claim holds for h = i, and let h = i+1.
Suppose by time T_{i+1} = T+(i+1)(δ+ε), p_{i+1} received (T,s,σ,i+1) on channels no higher than c_{i+1}. By definition, p_i must have forwarded (T,s,σ,i+1) to p_{i+1} on channels c_i + 1, ..., f+1-i. The need for p_i to forward the message means c_i < f+1-i, and therefore C_i has at least i faults.
If c_{i+1} ≤ f+1-h = f+1-(i+1) = f-i, then p_{i+1} did not receive on channel f+1-i. This means either p_i has crashed, or there is a fault in the out-adaptor of p_i to channel c_{i+1} + 1, or in channel c_{i+1} + 1, or in the in-adaptor of p_{i+1} to channel c_{i+1} + 1. Thus there is a fault in A_{i+1}. Since p_i forwards to p_{i+1} on channels > c_i, we have c_{i+1} > c_i and the sets C_i and A_{i+1} are disjoint. Since C_{i+1} = C_i ∪ A_{i+1}, there are at least i+1 faults in C_{i+1}.
If c_{i+1} > f+1-h, we only need to show h-1 = i+1-1 = i faults in C_{i+1}, but this is already assured by the induction hypothesis, which gives at least i faults in C_i and hence in C_{i+1}.

The Unanimity Property
If a correct processor p (one that does not crash during the broadcast) accepts a broadcast by time T+Δ on p's clock, then each correct processor q accepts the broadcast by time T+Δ on q's clock.

Unanimity Proof
Suppose by local time T_h = T+h(δ+ε), a correct processor p accepts a message (T,s,σ,h). Let c be the highest channel on which p receives a copy of the message.
Case c < f+1-h: Then there are at least h faults in the set C_h by time T_h. Notice that the processors in C_h in the theorem have all received the broadcast message, i.e., the h faults do not involve any processor (or its in-adaptor) that has not received a copy by time T_h. Also, all the channels in C_h that may be faulty, {c_i + 1, i ≤ h}, have messages forwarded on them by the corresponding p_i. Since p is correct, it sends messages on all of the channels c+1 to f+1-h. Thus there is a message (not corrupted by any of the h failures in C_h) on channels 1, ..., c, c+1, ..., f+1-h for processors not in C_h to receive. Since at most f-h faults can affect any processor not already in C_h, one of the f+1-h channels must reach any of these processors.

Unanimity Proof (continued)
Case c = f+1-h: A message has already been sent on channels 1, ..., f+1-h. There are at most f-h faults that can affect any correct processor not in C_h, so one of the f+1-h copies must reach it.
Case c > f+1-h: Then there are at least h-1 faults in C_h, i.e., there are at most f-(h-1) faults that can affect processors not in C_h. However, at least f+1-h+1 = f+2-h messages have been sent. One of them must reach a correct processor.
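The counting argument in the three cases can be checked mechanically. The sketch below is not part of the original proof: it takes the fault bounds from the theorem, assumes (as the slides state) that every channel from 1 up to the highest one used carries a copy, and confirms that the number of copies always exceeds the remaining fault budget. All function names are illustrative.

```python
def min_faults_in_chain(c: int, h: int, f: int) -> int:
    # Theorem: at least h faults in C_h if c <= f+1-h, otherwise at least h-1.
    return h if c <= f + 1 - h else h - 1


def copies_available(c: int, h: int, f: int) -> int:
    # After any required forwarding, copies sit on channels 1..max(c, f+1-h).
    return max(c, f + 1 - h)


def unanimity_margin(c: int, h: int, f: int) -> int:
    # Copies available minus faults that can still hit processors outside C_h.
    remaining_faults = f - min_faults_in_chain(c, h, f)
    return copies_available(c, h, f) - remaining_faults


# Exhaustive check over small fault budgets: the margin is always positive.
for f in range(2, 8):
    for h in range(1, f // 2 + 1):
        for c in range(1, f + 2):
            assert unanimity_margin(c, h, f) >= 1
print("at least one copy survives in every case checked")
```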

The f-fault tolerant protocol
task Start;
  const Δ = ⌊f/2⌋(δ+ε) + (δ+ε);
  var T: Time; σ: Update; s: Processor;
  cycle
    SEND(σ);
    T ← clock;
    for c = 1 to f+1 do send(T,myid,σ,1) on c;
    H ← H ∪ (T,myid,σ);
    schedule Deliver(T) at T+Δ;
  endcycle

The f-fault tolerant protocol (cntd)
task Receive;
  const Δ = ⌊f/2⌋(δ+ε) + (δ+ε);
  var U,T: Time; σ: Update; s: Processor; h: Integer;
  cycle
    receive(T,s,σ,h) from c;
    U ← clock;
    if U ≥ T + Δ then "late message" iterate fi;
    if U ≥ T + h(δ+ε) then "too late to forward" iterate fi;
    if T ∈ dom(H) & s ∈ dom(H(T)) then "deja vu"
      C(T)(s) ← max{c, C(T)(s)};
    else
      H ← H ∪ (T,s,σ);
      if h ≤ ⌊f/2⌋ & c < f+1-h then
        C ← C ∪ (T,s,c);
        schedule Forward(T,s,h) at T+h(δ+ε);
      fi
      schedule Deliver(T) at T+Δ;
    fi
  endcycle

The f-fault tolerant protocol (continued)
task Forward(T: Time; s: Processor; h: Integer);
  if C(T)(s) < f+1-h then
    for i = C(T)(s)+1 to f+1-h do send(T,s,H(T)(s),h+1) on i
  fi;
"H(T)(p) = the update σ broadcast by p at time T."
"C(T)(s) = highest channel on which a message (T,s,*,*) was received."
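The Receive and Forward tasks can be tried out with a Python sketch along the following lines. It is a simplified model, not the original code: the class name FaultTolerantReceiver, the explicit now argument in place of the local clock, the forward_at and deliver_at dictionaries in place of the scheduler, and the default of 0 for a channel record that was never initialised are all assumptions of this sketch.

```python
from collections import defaultdict


class FaultTolerantReceiver:
    def __init__(self, f: int, delta: float, eps: float):
        self.f = f
        self.step = delta + eps                      # δ + ε
        self.big_delta = (f // 2 + 1) * self.step    # Δ = ⌊f/2⌋(δ+ε) + (δ+ε)
        self.H = defaultdict(dict)                   # H[T][s] = update
        self.C = defaultdict(dict)                   # C[T][s] = highest channel seen
        self.forward_at = {}                         # (T, s, h) -> time Forward is due
        self.deliver_at = {}                         # T -> time Deliver is due

    def receive(self, now, msg, c):
        T, s, update, h = msg
        if now >= T + self.big_delta:
            return "late message"
        if now >= T + h * self.step:
            return "too late to forward"
        if s in self.H.get(T, {}):
            # Duplicate copy: only remember the highest channel it arrived on.
            self.C[T][s] = max(c, self.C[T].get(s, 0))
            return "deja vu"
        self.H[T][s] = update
        if h <= self.f // 2 and c < self.f + 1 - h:
            self.C[T][s] = c
            self.forward_at[(T, s, h)] = T + h * self.step
        self.deliver_at[T] = T + self.big_delta
        return "accepted"

    def forward(self, T, s, h):
        """Messages to send, as (channel, message) pairs, when Forward fires."""
        top = self.f + 1 - h
        highest = self.C[T].get(s, 0)
        out = (T, s, self.H[T][s], h + 1)
        return [(i, out) for i in range(highest + 1, top + 1)]


r = FaultTolerantReceiver(f=5, delta=4, eps=2)
print(r.receive(now=3, msg=(0, "s0", "m", 1), c=1))
print([ch for ch, _ in r.forward(0, "s0", 1)])
print(r.receive(now=4, msg=(0, "s0", "m", 1), c=3))
print([ch for ch, _ in r.forward(0, "s0", 1)])
```

A second copy arriving on a higher channel before the forwarding deadline shrinks the set of channels the processor still has to cover, which is the point of waiting until T+h(δ+ε) before forwarding.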

Unanimity Compromise
Unanimity may be violated if correct processors do not wait long enough for messages to arrive on all channels before forwarding. For example, if processors wait until T+hδ+ε for a hop-h message time-stamped T, instead of waiting until T+h(δ+ε), then unanimity may be compromised.
Consider an example where f = 2, ε = 2, δ = 4.
Let Clock1(0) = 0, Clock2(0) = 1, Clock3(0) = 0, Clock4(0) = 2.
[Figure: timing diagram for this example; not recoverable from the transcript.]
In this example, where processors wait until T+hδ+ε instead of T+h(δ+ε), unanimity can be violated by a processor performance failure.
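The gap between the two waiting rules is easy to tabulate. The sketch below only evaluates the two expressions from the slide with δ = 4 and ε = 2; it does not reproduce the lost timing diagram, and the function names are assumptions.

```python
def short_wait(T: float, h: int, delta: float, eps: float) -> float:
    # The insufficient rule: wait until T + hδ + ε
    return T + h * delta + eps


def full_wait(T: float, h: int, delta: float, eps: float) -> float:
    # The rule used by the protocol: wait until T + h(δ+ε)
    return T + h * (delta + eps)


# The shortfall is (h-1)ε, so the two rules agree only for hop 1.
for h in (1, 2, 3):
    print(h, short_wait(0, h, 4, 2), full_wait(0, h, 4, 2))
```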

Unanimity Compromise (continued)
Here δ = 10 and ε = 2. At time 0, S sends a message to E and L; the clocks read CS(0) = 0, CE(0) = 0, and CL(0) = 0.
The actual transmission takes 11 time units instead of δ. At time 11, when the message arrives, the clocks read CE(11) = 13 and CL(11) = 11.
The message should arrive at E and L by their local time 12 in the worst case. E will reject the message, but L will accept it.
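The disagreement between E and L follows directly from the timeliness test applied to each local clock reading. The sketch below replays the numbers from this slide for a hop-1 message stamped T = 0; the function and variable names are illustrative assumptions.

```python
def accepts(local_clock: float, T: float, h: int, delta: float, eps: float) -> bool:
    # Same timeliness test as in the protocol: accept only before T + h(δ+ε).
    return local_clock < T + h * (delta + eps)


delta, eps, T, h = 10, 2, 0, 1
clock_on_arrival = {"E": 13, "L": 11}   # local clock readings at real time 11

decisions = {name: accepts(U, T, h, delta, eps) for name, U in clock_on_arrival.items()}
print(decisions)
print("unanimous:", len(set(decisions.values())) == 1)
```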
