1
Reliable group communication
Hai Le Advanced Operating System
2
Such services guarantee that messages are delivered to all members in a process group.
3
8.4.1 - Basic Reliable-Multicasting Schemes
What is reliable multicasting? It means that a message that is sent to a process group should be delivered to each member of that group. Processes may fail during communication, however. To cover such situations, a distinction should be made between reliable communication in the presence of faulty processes and reliable communication when processes are assumed to operate correctly.
4
8.4.1 - Basic Reliable-Multicasting Schemes
We assume that the underlying communication system offers only unreliable multicasting, meaning that a multicast message may be lost part way and delivered to some, but not all, of the intended receivers.
7
8.4.1 - Basic Reliable-Multicasting Schemes
Detect missing message
8
8.4.1 - Basic Reliable-Multicasting Schemes
Return a negative acknowledgment
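The two steps above (detect a missing message, return a negative acknowledgment) can be sketched as follows. This is a minimal illustration, not the scheme from the slides verbatim: the class names `Sender` and `Receiver`, the use of consecutive sequence numbers, and the in-memory history buffer are all assumptions made for the example.

```python
class Sender:
    """Keeps every sent message in a history buffer so it can be retransmitted."""

    def __init__(self):
        self.next_seq = 0
        self.history = {}  # sequence number -> payload

    def multicast(self, payload):
        seq = self.next_seq
        self.history[seq] = payload
        self.next_seq += 1
        return seq, payload

    def retransmit(self, seq):
        return seq, self.history[seq]


class Receiver:
    """Delivers messages in sequence order and reports gaps as NACKs."""

    def __init__(self):
        self.expected = 0    # next sequence number to deliver
        self.pending = {}    # messages that arrived ahead of a gap
        self.delivered = []

    def receive(self, seq, payload):
        """Accept a message; return the sequence numbers to NACK (may be empty)."""
        if seq < self.expected:
            return []  # duplicate of an already delivered message
        self.pending[seq] = payload
        while self.expected in self.pending:
            self.delivered.append(self.pending.pop(self.expected))
            self.expected += 1
        if not self.pending:
            return []
        # Something is still held back, so at least one earlier message is missing.
        return [s for s in range(self.expected, max(self.pending))
                if s not in self.pending]
```

If the second of three messages is lost, the receiver returns `[1]` when the third arrives, and the sender answers with `retransmit(1)`.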
9
8.4.2 - Scalability in Reliable Multicasting
The main problem with the reliable multicast scheme just described is that it cannot support large numbers of receivers. With N receivers, the sender must be prepared to accept at least N acknowledgments and may be swamped with such feedback messages (a feedback implosion). Returning only negative acknowledgments reduces the feedback, but the sender will then, in theory, be forced to keep a message in its history buffer forever.
10
8.4.2.a - Nonhierarchical Feedback Control
The key issue for scalability is reducing the number of feedback messages returned to the sender. The Scalable Reliable Multicasting (SRM) protocol, developed by Floyd et al. (1997), addresses this and works as follows. In SRM:
1. Receivers never acknowledge the successful delivery of a message.
2. A receiver reports only when it is missing a message.
3. A receiver that is missing a message multicasts its feedback to the rest of the group.
4. Seeing that feedback allows another group member to suppress its own feedback.
=> Ideally, only a single request for retransmission reaches the sender S.
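The effect of steps 3 and 4 can be illustrated with a small simulation. It is only a sketch under stated assumptions: every receiver that misses the message waits a uniformly random time before multicasting its request, and the function name `simulate_suppression` and the parameters `max_wait` and `propagation_delay` are hypothetical, not part of SRM itself.

```python
import random


def simulate_suppression(num_missing, max_wait=1.0, propagation_delay=0.0, seed=None):
    """Count how many retransmission requests are actually multicast.

    Each of the num_missing receivers schedules its request at a random
    time in [0, max_wait]. The earliest request is always sent. Any other
    receiver sends too if its timer fires before the earliest request
    has had time to reach it.
    """
    rng = random.Random(seed)
    timers = sorted(rng.uniform(0, max_wait) for _ in range(num_missing))
    first = timers[0]
    also_sent = sum(1 for t in timers[1:] if t < first + propagation_delay)
    return 1 + also_sent
```

With `propagation_delay=0.0` exactly one request is sent. As the delay grows relative to `max_wait`, more receivers fire before they hear the first request, which is why the scheduling of feedback has to be reasonably accurate.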
13
8.4.2.a - Nonhierarchical Feedback Control
Feedback suppression has been shown to scale reasonably well, but it has a number of serious problems.
14
8.4.2.a - Nonhierarchical Feedback Control
Ensuring that only one request for retransmission is returned to the sender requires reasonably accurate scheduling of feedback messages at each receiver => not easy to achieve.
Multicasting feedback also interrupts those processes to which the message has been successfully delivered => other receivers are forced to receive and process messages that are useless to them.
15
8.4.2.b - Hierarchical Feedback Control
Achieving scalability for very large groups of receivers requires that hierarchical approaches are adopted.
16
8.4.2.b - Hierarchical Feedback Control
Coordinator at root
17
8.4.2.b - Hierarchical Feedback Control
Coordinator at root. Has its own history buffer.
18
8.4.2.b - Hierarchical Feedback Control
Within each subgroup, any reliable multicasting scheme that works for small groups can be used.
19
8.4.2.b - Hierarchical Feedback Control
If a member misses a message m -> it asks the coordinator to retransmit m.
20
8.4.2.b - Hierarchical Feedback Control
If the coordinator has received acknowledgments for message m from all members -> it can remove m from its history buffer.
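The coordinator behaviour described on the last few slides can be sketched as follows. This is a minimal illustration under assumptions: the class name `Coordinator`, its methods, and the idea that a coordinator which itself lacks m asks its parent coordinator are choices made for the example, not a definitive implementation.

```python
class Coordinator:
    """Local coordinator of one subgroup, with its own history buffer."""

    def __init__(self, members, parent=None):
        self.members = set(members)  # subgroup members (may include child coordinators)
        self.parent = parent         # coordinator of the parent subgroup, if any
        self.history = {}            # message id -> payload
        self.acks = {}               # message id -> members that acknowledged it

    def store(self, msg_id, payload):
        self.history[msg_id] = payload
        self.acks[msg_id] = set()

    def on_ack(self, msg_id, member):
        if msg_id not in self.history:
            return
        self.acks[msg_id].add(member)
        # Once every member has acknowledged m, it can leave the buffer.
        if self.acks[msg_id] >= self.members:
            del self.history[msg_id]
            del self.acks[msg_id]

    def retransmit(self, msg_id):
        """Return the payload for a member that missed msg_id, or None."""
        if msg_id in self.history:
            return self.history[msg_id]
        if self.parent is None:
            return None
        # Assumption: a coordinator that missed the message asks its parent.
        payload = self.parent.retransmit(msg_id)
        if payload is not None:
            self.store(msg_id, payload)
        return payload
```

Because each coordinator only collects acknowledgments from its own subgroup, no single process has to handle feedback from the whole group.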
21
8.4.2.b - Hierarchical Feedback Control
The main problem is the construction of the tree. In many cases, the tree needs to be constructed dynamically. Getting nodes to act as local coordinators in the way just described is not easy to do. Scalable reliable multicasting is a difficult problem -> no single best solution exists.
22
8.4.3 - Atomic Multicast
To achieve reliable multicasting in a distributed system in the presence of process failures -> a message must be delivered to either all processes or to none at all. This is known as the atomic multicast problem.
23
8.4.3 - Atomic Multicast
Example: a replicated database on top of a distributed system, with replicas at Receiver 1 to Receiver 4.
Update M1 is delivered to all receivers.
Crash: Receiver 1 crashes after M1. Updates M2 and M3 are delivered to Receivers 2, 3 and 4 only.
Restore: Receiver 1 comes back holding only M1. It has missed several updates.
Force reconciliation: Receiver 1 is brought up to date, so that all four receivers hold M1, M2, M3.
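The forced reconciliation in the replicated-database example can be sketched in a few lines. This assumes that each replica keeps an ordered log of the updates it has applied and that the restored replica's log is a prefix of an up-to-date one. The function name `reconcile` is hypothetical.

```python
def reconcile(stale_log, up_to_date_log):
    """Return the updates a restored replica still has to apply.

    Assumes the stale log is a prefix of the up-to-date log, which holds
    when all replicas apply updates in the same order.
    """
    if up_to_date_log[:len(stale_log)] != stale_log:
        raise ValueError("logs diverge; simple catch-up is not possible")
    return up_to_date_log[len(stale_log):]


# Receiver 1 crashed after M1 and missed M2 and M3.
missed = reconcile(["M1"], ["M1", "M2", "M3"])
```

The prefix check only works because updates are applied in the same order everywhere, which is one reason ordering matters for replicated data.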
36
8.4.3.a - Virtual Synchrony
The whole idea of atomic multicasting is that a multicast message m is uniquely associated with a list of processes to which it should be delivered. Assume that while the multicast is taking place, a process joins or leaves the group. This view change is announced by multicasting a message vc that reports the joining or leaving of the process. We need to guarantee that m is either delivered to all processes before each one of them is delivered message vc, or m is not delivered at all.
37
8.4.3.a - Virtual Synchrony
If m is not delivered, how can we speak of a reliable multicast protocol? Birman and Joseph (1987) developed a reliable multicast method to handle this situation, called virtual synchrony.
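One way to picture the guarantee is to tag every message with the view in which it was sent. The sketch below is an illustration only. It assumes views are numbered with increasing integers, and the class `GroupMember` and its methods are hypothetical names. It does not show how the group makes sure that all surviving members hold the same messages before the view changes.

```python
class GroupMember:
    """Delivers a message only in the view in which it was sent."""

    def __init__(self, view_id=0):
        self.view_id = view_id
        self.delivered = []
        self.future = []  # messages sent in a view not yet installed here

    def on_message(self, sent_in_view, payload):
        if sent_in_view < self.view_id:
            # The view change vc was already delivered: m must not be delivered.
            return "discarded"
        if sent_in_view > self.view_id:
            # The sender is ahead of us: hold m until we install that view.
            self.future.append((sent_in_view, payload))
            return "buffered"
        self.delivered.append(payload)
        return "delivered"

    def on_view_change(self, new_view_id):
        self.view_id = new_view_id
        ready = [p for v, p in self.future if v == new_view_id]
        self.future = [(v, p) for v, p in self.future if v != new_view_id]
        self.delivered.extend(ready)
```

A message sent in view 0 that arrives after the member has installed view 1 is discarded, so no member delivers it after vc.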
43
8.4.3.a - Virtual Synchrony
A process may only rejoin the group after its state has been brought up to date.
45
8.4.3.b - Message Ordering
Besides reliability, the ordering of multicasts is also very important. Four orderings are distinguished:
Unordered multicasts
FIFO-ordered multicasts
Causally-ordered multicasts
Totally-ordered multicasts
46
8.4.3.b - Message Ordering
Unordered multicasts: no guarantees are given concerning the order in which received messages are delivered.
FIFO-ordered multicasts: incoming messages from the same process are delivered in the same order as they were sent.
Causally-ordered multicasts: messages are delivered so that potential causality between different messages is preserved.
Totally-ordered multicasts: regardless of whether message delivery is unordered, FIFO ordered, or causally ordered, it is additionally required that when messages are delivered, they are delivered in the same order to all group members.
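Causally ordered delivery is commonly implemented with vector timestamps. The slides do not say how it is done, so the sketch below is an assumption: each message carries the sender's vector clock, and a receiver holds a message back until everything that causally precedes it has been delivered. The names `can_deliver` and `CausalReceiver` are hypothetical.

```python
def can_deliver(local, ts, sender):
    """True if the message is the next one from sender and depends on nothing missing."""
    next_from_sender = ts[sender] == local[sender] + 1
    nothing_missing = all(ts[k] <= local[k] for k in range(len(ts)) if k != sender)
    return next_from_sender and nothing_missing


class CausalReceiver:
    def __init__(self, num_processes):
        self.clock = [0] * num_processes  # messages delivered so far, per sender
        self.holdback = []                # received but not yet deliverable
        self.delivered = []

    def receive(self, sender, ts, payload):
        self.holdback.append((sender, ts, payload))
        progress = True
        while progress:  # delivering one message may unblock others
            progress = False
            for item in list(self.holdback):
                s, t, p = item
                if can_deliver(self.clock, t, s):
                    self.holdback.remove(item)
                    self.clock[s] = t[s]
                    self.delivered.append(p)
                    progress = True
```

For example, if process 0 sends m1 with timestamp [1, 0, 0] and process 1 replies with m2 carrying [1, 1, 0], a third process that receives m2 first holds it back until m1 has been delivered. Messages from a single sender are also delivered in sending order, so FIFO ordering comes for free.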
47
8.4.3.b - Totally-ordered multicasts
Virtually synchronous reliable multicasting offering totally ordered delivery of messages is called atomic multicasting.
48
8.4.4 - Implementing Virtual Synchrony
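This is just one of the possible implementations: the one used in Isis, a fault-tolerant distributed system. It makes use of available reliable point-to-point communication. Although each transmission is guaranteed to succeed, there is no guarantee that all group members receive m, because the sender may crash before it has sent m to every member. A message is stable once it is known that all members have received it. => Only stable messages are allowed to be delivered.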
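Tracking which messages are stable can be sketched as simple bookkeeping. The class `StabilityTracker` below is hypothetical and assumes that a process somehow learns which members have received each message (for example through acknowledgments). How Isis gathers that knowledge is not shown here.

```python
class StabilityTracker:
    """Records, per message, which members of the current view have received it."""

    def __init__(self, view):
        self.view = frozenset(view)
        self.received_by = {}  # message id -> set of members known to hold it

    def record_receipt(self, msg_id, member):
        self.received_by.setdefault(msg_id, set()).add(member)

    def is_stable(self, msg_id):
        # Stable means every member of the view is known to have received it.
        return self.received_by.get(msg_id, set()) >= self.view

    def unstable(self):
        return sorted(m for m in self.received_by if not self.is_stable(m))
```

A message that only some members are known to hold stays in `unstable()` and is the kind of message that has to be dealt with when the view changes.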
50
8.4.4 - Implementing Virtual Synchrony
P4 notices that P7 has crashed and sends a view change message.
51
8.4.4 - Implementing Virtual Synchrony
P6 sends out all its unstable messages, followed by a flush message => the flush messages are used to determine when it is safe to install the new view.
52
8.4.4 - Implementing Virtual Synchrony
P6 installs the new view once it has received a flush message from every other process.
53
8.4.4 - Implementing Virtual Synchrony
The major flaw in this protocol is that it cannot deal with process failures while a new view change is being announced
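The view change steps on the preceding slides can be walked through with a single-threaded simulation. This is a sketch under stated assumptions, not the Isis implementation: message passing is modelled as direct method calls, no process fails while the view change is in progress (the flaw noted above), and held messages are delivered at the moment the new view is installed. The class `Process` and its methods are hypothetical.

```python
class Process:
    def __init__(self, name, view):
        self.name = name
        self.view = frozenset(view)
        self.unstable = {}        # message id -> payload, held but not yet delivered
        self.delivered = []
        self.pending_view = None  # view announced but not yet installed
        self.flushes = set()      # processes whose flush message has arrived

    def on_message(self, msg_id, payload):
        # Duplicates (already held) are ignored.
        self.unstable.setdefault(msg_id, payload)

    def on_view_change(self, new_view, network):
        self.pending_view = frozenset(new_view)
        others = self.pending_view - {self.name}
        # First forward every unstable message, then send the flush.
        for msg_id, payload in sorted(self.unstable.items()):
            for peer in others:
                network[peer].on_message(msg_id, payload)
        for peer in others:
            network[peer].on_flush(self.name)
        self._try_install()

    def on_flush(self, sender):
        self.flushes.add(sender)
        self._try_install()

    def _try_install(self):
        if self.pending_view is None:
            return
        if self.flushes >= self.pending_view - {self.name}:
            # Everyone else has flushed, so all held messages are now stable.
            for msg_id in sorted(self.unstable):
                self.delivered.append(self.unstable[msg_id])
            self.unstable.clear()
            self.view = self.pending_view
            self.pending_view = None
            self.flushes = set()
```

In a run where P7 crashes after sending m only to P6, calling `on_view_change` on P4, P5 and P6 in turn leaves all three survivors with m delivered and the new view installed.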
54
8.4.5 - Current work
RMTP: A Reliable Multicast Transport Protocol
Lossless transport protocol.
Achieves reliability by using a packet-based selective repeat retransmission scheme.
Scalable.
55
Future work
Improve the Isis system's handling of failed processes.
Add a database on top of the network.
Provide the missed messages to a failed process, so that it is up to date and ready to rejoin the network.
56
Reference
Tanenbaum, Andrew S., and Maarten van Steen. Distributed Systems: Principles and Paradigms. Maarten Van Steen, 2016.
Lee, I. (2017). Software System. [online] Available at: [Accessed 27 Sep. 2017].
J. and Sanjay, P. (1996). RMTP: A Reliable Multicast Transport Protocol. [online] Semantic scholar. Available at: a18ee0ff9.pdf [Accessed 27 Sep. 2017].