Published by Maria Caldwell. Modified over 9 years ago.
1 A Modular Approach to Fault-Tolerant Broadcasts and Related Problems
Authors: Vassos Hadzilacos and Sam Toueg
Distributed Systems: 526 U1580
Professor: Ching-Chi Hsu
ftp://ftp.db.toronto.edu/pub/vassos/fault.tolerant.broadcasts.dvi.Z
2 Overview
An earlier version appears as “Fault-Tolerant Broadcasts and Related Problems”, chapter 5 of “Distributed Systems”, edited by Sape Mullender, Addison-Wesley Publishing Co., 1993.
Introduction and Preliminaries
Broadcast Specifications
Broadcast Algorithms
Consensus
Terminating Reliable Broadcast
Multicast Specifications
3 Introduction
The communication primitives available are too weak; for example, there is no reliable broadcast primitive.
Fault-tolerant broadcasts are communication primitives that facilitate the development of fault-tolerant applications.
Another paradigm: Consensus.
The literature is not coherent.
Primary goal of this paper: develop the material on fault-tolerant broadcasts and consensus in a coherent way.
4 Preliminaries
Focus on message-passing models only.
The chief characteristics of a message-passing model: the type of communication network, the model of process and communication failures, and the synchrony of the system.
Types of communication networks: point-to-point and broadcast channel.
Many of the results in this paper are independent of the type of communication network. When a specific type is needed, only point-to-point networks are considered.
Communication primitives of point-to-point networks: send and receive.
5 Preliminaries
Outgoing message buffer and incoming message buffer.
Every process executes an infinite sequence of steps.
Failure types
Process failures: crash failure, send-omission failure, receive-omission failure, arbitrary (Byzantine or malicious) failure.
Link failures: omission failure.
6 Preliminaries
Synchronous and Asynchronous Networks
A point-to-point network is synchronous if:
There is a known upper bound on the time required to execute a step.
Local clocks have a known bounded rate of drift with respect to real time.
There is a known upper bound on message delay (which consists of the time to send, transport, and receive a message).
Asynchronous: no timing assumptions.
Clock and Performance Failures in Synchronous Networks
Clock failure of a process: the clock drift rate exceeds the bound.
Performance failure of a process: the completion time of a step exceeds the bound.
7 Preliminaries
Performance failure of a link: the link takes longer than the bound to transport some message.
Classification of Failures and Terminology
Omission failures: crash, send-omission, and receive-omission failures of processes, and omission failures of links.
Timing failures: omission, clock, and performance failures.
Benign failures: synonymous with omission failures in asynchronous networks and with timing failures in synchronous networks.
Causal Precedence
Properties of clocks
Clock Monotonicity: the clock never decreases or skips values, and for any time c, the clock eventually reaches c.
8 Preliminaries
Logical Clocks: for any processes p and q, and any steps e and f that occur at p and q respectively, if e causally precedes f, then $C_p(e) < C_q(f)$.
Synchronized Clocks: the clock values at any real time t differ by at most a known constant.
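The Logical Clocks condition can be illustrated with a Lamport-style counter. This is only a sketch under assumptions: the class and method names are hypothetical, and the clock shown here jumps forward on receipt, so it illustrates the logical-clock condition alone and not the Clock Monotonicity property.

```python
class LogicalClock:
    """Lamport-style logical clock for one process (names are illustrative)."""

    def __init__(self):
        self.time = 0

    def local_step(self):
        # Every step taken by the process advances its clock.
        self.time += 1
        return self.time

    def send(self):
        # Sending is a step; the returned value is attached to the message.
        return self.local_step()

    def receive(self, msg_time):
        # Move past the sender's timestamp so the receive is ordered after the send.
        self.time = max(self.time, msg_time) + 1
        return self.time


p, q = LogicalClock(), LogicalClock()
e = p.send()        # step e at process p
f = q.receive(e)    # step f at process q, causally after e
```

Because the receive step always lands strictly after the timestamp carried by the message, any step that causally follows e gets a larger clock value than e.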
9 Broadcast Specifications
Assume benign failures.
Reliable Broadcast: two primitives, broadcast and deliver.
Assume each message carries its sender’s id and a sequence number.
Specification of Reliable Broadcast
Validity: If a correct process broadcasts a message m, then it eventually delivers m.
Agreement: If a correct process delivers a message m, then all correct processes eventually deliver m.
Integrity: For any message m, every correct process delivers m at most once, and only if m was previously broadcast by sender(m).
If the sender of a message m is faulty, the specification allows two possible outcomes: either m is delivered by all correct processes or by none.
10 Broadcast Specifications
FIFO Broadcast
FIFO Order: If a process broadcasts a message m before it broadcasts a message m’, then no correct process delivers m’ unless it has previously delivered m.
Causal Broadcast
Causal Order: If the broadcast of a message m causally precedes the broadcast of a message m’, then no correct process delivers m’ unless it has previously delivered m.
11 Broadcast Specifications
Faulty specifications (from the literature):
“If the broadcast of m causally precedes the broadcast of m’, then every correct process that delivers both messages must deliver m before m’.”
“Messages that are causally related are delivered in the causal order.”
Local Order: If a process broadcasts a message m and a process delivers m before broadcasting m’, then no correct process delivers m’ unless it has previously delivered m.
Theorem: Causal Order is equivalent to the conjunction of FIFO Order and Local Order.
12 Broadcast Specifications
Atomic Broadcast
Total Order: If correct processes p and q both deliver messages m and m’, then p delivers m before m’ if and only if q delivers m before m’.
FIFO Atomic Broadcast
Causal Atomic Broadcast
Timed Broadcasts
Elapsed time can be interpreted in two different ways: real time or local time.
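The Total Order property can be checked mechanically on a finished run. The sketch below is illustrative only: the function name and the dictionary representation of a run are assumptions, and the caller is expected to pass the delivery sequences of correct processes.

```python
def satisfies_total_order(deliveries):
    """Check Total Order on a finished run.

    deliveries maps a process name to the list of messages it delivered,
    in delivery order (each message at most once, as Integrity requires).
    """
    procs = list(deliveries)
    for i, p in enumerate(procs):
        for q in procs[i + 1:]:
            # Only messages delivered by both processes are constrained.
            common = set(deliveries[p]) & set(deliveries[q])
            order_at_p = [m for m in deliveries[p] if m in common]
            order_at_q = [m for m in deliveries[q] if m in common]
            if order_at_p != order_at_q:
                return False
    return True
```

Note that a process may skip a message that another process delivers without violating Total Order; ruling that out is the job of Agreement.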
13 Broadcast Specifications
Real-Time Δ-Timeliness: There is a known constant Δ such that if a message m is broadcast at real time t, then no correct process delivers m after real time t+Δ.
Assume each message m contains a timestamp ts(m) denoting the local time at which m was broadcast according to the sender’s clock.
Local-Time Δ-Timeliness: There is a known constant Δ such that no correct process p delivers a message m after local time ts(m)+Δ on p’s clock.
14 Broadcast Specifications
Uniform properties place restrictions on the messages delivered by faulty processes.
Uniform Agreement: If a process (whether correct or faulty) delivers a message m, then all correct processes eventually deliver m.
Uniform Integrity: For any message m, every process (whether correct or faulty) delivers m at most once, and only if m was previously broadcast by sender(m).
Uniform Real-Time Δ-Timeliness: There is a known constant Δ such that if a message m is broadcast at real time t, then no process (whether correct or faulty) delivers m after real time t+Δ.
15 Broadcast Specifications
Uniform Local-Time Δ-Timeliness: There is a known constant Δ such that no process p (whether correct or faulty) delivers a message m after local time ts(m)+Δ on p’s clock.
Uniform FIFO Order, Uniform Local Order, Uniform Causal Order, Uniform Total Order.
Broadcast Specifications for Arbitrary Failures
16 Relationship Among Broadcast Primitives
[Figure: diagram relating Reliable Broadcast, FIFO Broadcast, Causal Broadcast, Atomic Broadcast, FIFO Atomic Broadcast, and Causal Atomic Broadcast; the arrows are labeled with the property each step adds: Total Order, FIFO Order, or Causal Order.]
17 Inconsistency and Contamination
The traditional specifications of most broadcasts, including Uniform broadcasts, allow faulty processes to become inconsistent and subsequently to contaminate correct processes.
Example: Atomic Broadcast.
It is possible to prevent the inconsistency of faulty processes, or at least the contamination of correct ones.
18 Amplification of Failures
Broadcast primitives are usually built on top of lower-level communication primitives.
A broadcast algorithm is likely to amplify the severity of failures that occur at the low level.
Even if processes are subject only to crash failures, we cannot assume that the message deliveries a process makes before crashing are always correct.
Example: a coordinator-based atomic broadcast algorithm. Even if a faulty process behaves correctly until it crashes, it may still deliver messages out of order before it crashes.
Crash failures by themselves do not guarantee reasonable behavior at the broadcast/delivery level.
19 Broadcast Algorithms I -- Methodology
Start with any given Reliable Broadcast algorithm, and show how to achieve each of the three order properties by a corresponding algorithmic transformation.
Three transformations: one adds FIFO Order, one adds Causal Order, and one adds Total Order.
None of the transformations requires assumptions about the type or synchrony of the underlying network, and all of them work for any type and number of benign failures.
All transformations preserve Uniform Agreement and, under certain assumptions, both versions of Δ-Timeliness.
20 Broadcast Algorithms II -- Transformations
Achieving Total Order
Achieving FIFO Order
Achieving Causal Order
All transformations preserve Uniform Agreement and, under some conditions, both versions of Δ-Timeliness.
All transformations work for any type and number of benign failures, regardless of the type or synchrony of the network.
All broadcasts considered here satisfy Uniform Integrity.
21 Achieving Total Order
A transformation that converts a Reliable, FIFO, or Causal Broadcast satisfying Local-Time Δ-Timeliness into its Atomic counterpart.
This transformation preserves Validity, Agreement, Integrity, FIFO Order, and Causal Order (and their uniform counterparts).
22 Preserving Total Order Algorithm
To execute broadcast(BA, m):
  broadcast(B, m)
deliver(BA, m) occurs as follows:
  upon deliver(B, m) do
    schedule deliver(BA, m) at time ts(m)+Δ
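The scheduling step above can be sketched as a small priority queue keyed by the release time ts(m)+Δ, where Δ is the known timeliness constant. Everything named here is an assumption made for illustration: the class, the value of `DELTA`, the integer clock, and the rule that ties on release time are broken by sender label. The sketch also assumes that the underlying broadcast B hands over every message before its release time on the local clock, which is what Local-Time Δ-Timeliness provides.

```python
import heapq

DELTA = 5  # assumed value of the known constant Δ, in local clock ticks


class TimedOrderer:
    """Holds messages delivered by B and releases them at local time ts(m) + DELTA."""

    def __init__(self):
        self.pending = []     # min-heap of (release_time, sender, message)
        self.delivered = []   # messages delivered by BA, in order

    def on_deliver_B(self, ts, sender, message):
        # upon deliver(B, m): schedule deliver(BA, m) at time ts(m) + DELTA
        heapq.heappush(self.pending, (ts + DELTA, sender, message))

    def advance_clock(self, now):
        # Release every message whose scheduled time has been reached,
        # earliest first; ties are broken by sender label (an assumption).
        while self.pending and self.pending[0][0] <= now:
            _, _, message = heapq.heappop(self.pending)
            self.delivered.append(message)
```

Two processes that receive the same messages from B in different orders end up with the same BA delivery order, because the order depends only on the timestamps and sender labels carried by the messages.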
23 Achieving FIFO Order
An algorithm that transforms any Reliable Broadcast algorithm into a FIFO Broadcast algorithm that satisfies Uniform FIFO Order.
Preserves (Uniform) Total Order.
Assume a sequence number is attached to every message.
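One common way to add FIFO Order on top of Reliable Broadcast is to buffer messages until all earlier messages from the same sender have been delivered. The sketch below shows that idea and is not necessarily the exact algorithm from the slides; the class name, the callback name, and the choice of numbering each sender's messages from 0 are assumptions.

```python
from collections import defaultdict


class FifoLayer:
    """Per-process FIFO delivery layer on top of a reliable broadcast."""

    def __init__(self):
        self.next_seq = defaultdict(int)   # next expected sequence number per sender
        self.buffer = defaultdict(dict)    # sender -> {sequence number: message}
        self.delivered = []                # (sender, message) pairs in FIFO delivery order

    def on_deliver_R(self, sender, seq, message):
        # Called when the underlying reliable broadcast delivers a message.
        self.buffer[sender][seq] = message
        # Deliver the longest run of consecutive messages from this sender.
        while self.next_seq[sender] in self.buffer[sender]:
            m = self.buffer[sender].pop(self.next_seq[sender])
            self.delivered.append((sender, m))
            self.next_seq[sender] += 1
```

A message that arrives ahead of its predecessors simply waits in the buffer, so no process delivers m’ before an earlier m from the same sender.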
24 Achieving Causal Order
Two algorithms transform FIFO Broadcast into Causal Broadcast: one is blocking and the other is not.
Both require that the given FIFO Broadcast algorithm satisfy Uniform FIFO Order.
Non-Blocking Transformation: preserves Total Order, but not Uniform Total Order.
If the given FIFO Broadcast satisfies Uniform Agreement, the transformation preserves both versions of Δ-Timeliness.
25 Achieving Causal Order
26 Achieving Causal Order
Blocking Transformation
Advantage: uses shorter messages.
Uses vector timestamps.
Preserves (Uniform) Total Order.
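To make the role of vector timestamps concrete, here is a sketch of the standard vector-timestamp delivery rule. It may differ in detail from the blocking transformation on the slides, and all names are hypothetical. Assumptions: processes are numbered 0..n-1, a process delivers its own message at the moment it broadcasts it, and each packet is the tuple (sender, vector timestamp, message).

```python
class CausalLayer:
    """Causal delivery for one process using vector timestamps."""

    def __init__(self, pid, n):
        self.pid = pid
        self.count = [0] * n    # count[k] = number of messages from k delivered here
        self.blocked = []       # packets waiting for causal predecessors
        self.delivered = []

    def broadcast(self, message):
        # Timestamp = everything delivered so far, plus this new message.
        vt = list(self.count)
        vt[self.pid] += 1
        self._deliver(self.pid, message)   # local delivery (an assumption)
        return (self.pid, vt, message)

    def _deliver(self, sender, message):
        self.count[sender] += 1
        self.delivered.append(message)

    def _ready(self, sender, vt):
        # Next message from its sender, and all its causal predecessors delivered.
        return vt[sender] == self.count[sender] + 1 and all(
            vt[k] <= self.count[k] for k in range(len(vt)) if k != sender
        )

    def on_receive(self, packet):
        if packet[0] == self.pid:
            return                          # own messages were delivered at broadcast
        self.blocked.append(packet)
        progress = True
        while progress:                     # retry until no blocked packet is ready
            progress = False
            for pkt in list(self.blocked):
                sender, vt, message = pkt
                if self._ready(sender, vt):
                    self.blocked.remove(pkt)
                    self._deliver(sender, message)
                    progress = True
```

A packet that arrives before one of its causal predecessors stays blocked until that predecessor has been delivered, which is why the transformation is called blocking.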
27 Point-to-Point Networks
Model of Point-to-Point Networks
The primitives send and receive satisfy:
Validity: If p sends m to q, and p, q, and the link from p to q are all correct, then q eventually receives m.
Uniform Integrity: For any message m, q receives m at most once from p, and only if p previously sent m to q.
All Reliable Broadcast algorithms given here rely on two assumptions: Benign Failures and No Partitioning.
28 Reliable Broadcast Algorithm
Every process p executes the following:
To execute broadcast(R, m):
  send(m) to p
upon receive(m) do
  if p has not previously executed deliver(R, m) then
    send(m) to all neighbors
    deliver(R, m)
The algorithm satisfies Validity, Agreement, and Uniform Integrity.
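The flooding behavior of this algorithm can be simulated on a small network. The sketch is a simplification under stated assumptions: the function name and graph representation are hypothetical, a single message is broadcast, links are correct, and a crashed process never takes a step at all.

```python
from collections import deque


def reliable_broadcast(neighbors, origin, crashed=frozenset()):
    """Simulate flooding of one message; return the set of processes that deliver it.

    neighbors maps each process to the list of its neighbors.
    crashed is the set of processes that never take a step (an assumption).
    """
    delivered = set()
    inbox = deque([origin])            # broadcast(R, m): the origin sends m to itself
    while inbox:
        p = inbox.popleft()            # p receives m
        if p in crashed or p in delivered:
            continue                   # crashed, or deliver(R, m) already executed
        inbox.extend(neighbors[p])     # first relay m to all neighbors ...
        delivered.add(p)               # ... then deliver(R, m)
    return delivered
```

In a ring a-b-c-d with b crashed, the message still reaches c through d, which illustrates why the No Partitioning assumption is enough for Agreement in this setting.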
29 Reliable Broadcast
Additional property of the send and receive primitives:
Uniform FIFO Order: If p sends m to q before it sends m’ to q, then q does not receive m’ unless it has previously received m.
Theorem: If the send and receive primitives satisfy Uniform FIFO Order, then the Reliable Broadcast algorithm satisfies Uniform Causal Order.
Additional property of the send and receive primitives:
Strong Validity: If a process p (whether correct or not) completes the sending of a message m to a correct process q, and the link from p to q is correct, then q eventually receives m.
30 Reliable Broadcast
Theorem: Consider a network such that (1) processes do not commit send-omission failures, and (2) every process p (whether correct or faulty) is connected to every correct process via a path consisting entirely of correct processes and links (with the possible exception of p itself). Then the Reliable Broadcast algorithm satisfies Uniform Agreement.
Model of Synchronous Point-to-Point Networks
31 Consensus
Two primitives: propose and decide.
The Consensus problem requires that if each correct process proposes a value, then the following hold:
Termination: Every correct process eventually decides exactly one value.
Agreement: If a correct process decides v, then all correct processes eventually decide v.
Integrity: If a correct process decides v, then v was previously proposed by some process.
Agreement and Integrity can be strengthened to Uniform Agreement and Uniform Integrity.
32 Consensus
Relating Consensus and Atomic Broadcast
Transforming Atomic Broadcast into Consensus
To execute propose(v):
  broadcast(A, v)
upon deliver(A, u) do
  if p has not previously executed deliver(A, -) then decide(u)
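The transformation above amounts to deciding the first value that Atomic Broadcast delivers. A minimal sketch, with hypothetical names, in which the Atomic Broadcast primitive is replaced by a stand-in (a shared list that every process reads in the same order):

```python
class ConsensusViaAtomicBroadcast:
    """One process's side of Consensus built on an Atomic Broadcast primitive."""

    def __init__(self, atomic_broadcast):
        # atomic_broadcast: callable that broadcasts a value (assumed primitive)
        self.atomic_broadcast = atomic_broadcast
        self.decided = False
        self.decision = None

    def propose(self, v):
        self.atomic_broadcast(v)

    def on_deliver_A(self, u):
        # Decide only on the first delivery; ignore all later ones.
        if not self.decided:
            self.decided = True
            self.decision = u


log = []                                   # stand-in for Atomic Broadcast
p = ConsensusViaAtomicBroadcast(log.append)
q = ConsensusViaAtomicBroadcast(log.append)
p.propose("x")
q.propose("y")
for u in log:                              # every process sees the same order
    p.on_deliver_A(u)
    q.on_deliver_A(u)
```

Total Order ensures that all correct processes see the same first delivery, so they decide the same value, and that value was proposed by some process.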
33 Consensus Transforming Reliable Broadcast and Consensus to Atomic Broadcast
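The slide gives only the title of this transformation. One well-known way to realize it is to run Consensus repeatedly on batches of messages that have been delivered by Reliable Broadcast but not yet by Atomic Broadcast, and to deliver each agreed batch in a fixed order. The sketch below simulates that idea in lockstep for all processes at once; it is an illustration under assumptions, not the slide's algorithm. The function name, the `consensus` callable, and the use of sorting as the deterministic order are all hypothetical.

```python
def atomic_deliver_rounds(r_delivered, consensus):
    """Simulate batch-by-batch atomic delivery.

    r_delivered maps each process to the messages it has delivered via
    Reliable Broadcast (in any order). consensus is a callable that takes
    {process: proposal} and returns one of the proposals (assumed primitive).
    Returns {process: list of messages in atomic delivery order}.
    """
    a_delivered = {p: [] for p in r_delivered}
    while True:
        # Each process with pending messages proposes all of them.
        proposals = {}
        for p, msgs in r_delivered.items():
            pending = frozenset(msgs) - set(a_delivered[p])
            if pending:
                proposals[p] = pending
        if not proposals:
            return a_delivered
        batch = consensus(proposals)
        for p in a_delivered:
            # Same batch, same deterministic order at every process.
            a_delivered[p].extend(sorted(batch))
```

In the simulation, a process may atomically deliver a message from an agreed batch before Reliable Broadcast has delivered it locally; with Agreement of Reliable Broadcast, that message would eventually arrive there as well.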
34 Terminating Reliable Broadcast
With Reliable Broadcast, processes have no knowledge of impending broadcasts.
Terminating Reliable Broadcast (TRB) allows the delivery of a special message.
With TRB for sender s, s can broadcast any message, and the following hold:
Termination: Every correct process eventually delivers exactly one message.
Validity: If s is correct and broadcasts a message m, then it eventually delivers m.
Agreement: If a correct process delivers a message m, then all correct processes eventually deliver m.
Integrity: If a correct process delivers a message m, then sender(m)=s. If m is not the special message, then m was previously broadcast by s.
35 Terminating Reliable Broadcast
In some synchronous point-to-point networks, Consensus is equivalent to TRB.
In asynchronous systems, the two problems are not equivalent.