Reliable group communication Hai Le Advanced Operating System
Reliable multicast services guarantee that messages are delivered to all members of a process group.
8.4.1 - Basic Reliable-Multicasting Schemes What is reliable multicasting? It means that a message sent to a process group should be delivered to each member of that group. Processes may crash, join, or leave while a message is in transit. To cover such situations, a distinction should be made between reliable communication in the presence of faulty processes and reliable communication when processes are assumed to operate correctly.
8.4.1 - Basic Reliable-Multicasting Schemes The underlying communication system is assumed to offer only unreliable multicasting, which means that a multicast message may be lost part way and delivered to some, but not all, of the intended receivers.
8.4.1 - Basic Reliable-Multicasting Schemes Detect missing message
8.4.1 - Basic Reliable-Multicasting Schemes Return a negative acknowledgment
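The basic scheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `Sender` and `Receiver`, the per-message sequence numbers, and the in-memory "network" are hypothetical and not taken from the slides. The sender keeps every message in a history buffer, and a receiver that sees a gap in the sequence numbers returns negative acknowledgments for the missing ones.

```python
from dataclasses import dataclass, field


@dataclass
class Sender:
    next_seq: int = 0
    history: dict = field(default_factory=dict)  # seq -> payload (history buffer)

    def multicast(self, payload):
        """Assign a sequence number and keep the message for retransmission."""
        seq = self.next_seq
        self.history[seq] = payload
        self.next_seq += 1
        return seq, payload

    def retransmit(self, seq):
        """Answer a negative acknowledgment from the history buffer."""
        return seq, self.history[seq]


@dataclass
class Receiver:
    expected: int = 0                             # next sequence number to deliver
    delivered: list = field(default_factory=list)
    pending: dict = field(default_factory=dict)   # out-of-order messages

    def receive(self, seq, payload):
        """Deliver in order; return the sequence numbers to NACK."""
        if seq < self.expected:
            return []                             # duplicate, already delivered
        self.pending[seq] = payload
        while self.expected in self.pending:
            self.delivered.append(self.pending.pop(self.expected))
            self.expected += 1
        # Anything between the last delivered message and seq is missing.
        return [s for s in range(self.expected, seq) if s not in self.pending]


def demo():
    sender, receiver = Sender(), Receiver()
    msgs = [sender.multicast(p) for p in ("a", "b", "c")]
    receiver.receive(*msgs[0])
    # msgs[1] is lost by the unreliable network.
    nacks = receiver.receive(*msgs[2])
    for seq in nacks:
        receiver.receive(*sender.retransmit(seq))
    return nacks, receiver.delivered
```

In this sketch the history buffer is never pruned, which is the problem the next section turns to.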
8.4.2 - Scalability in Reliable Multicasting The main problem with the reliable multicast scheme just described is that it cannot support large numbers of receivers. With N receivers, the sender may be swamped with feedback messages (a feedback implosion). Returning only negative acknowledgments reduces the feedback, but the sender will then, in theory, be forced to keep a message in its history buffer forever.
8.4.2.a - Nonhierarchical Feedback Control To address the key issue of scalability, that is, reducing the number of feedback messages returned to the sender, the Scalable Reliable Multicasting (SRM) protocol was developed by Floyd et al. (1997). In SRM:
1. Receivers never acknowledge the successful delivery of a message.
2. A receiver reports only when it is missing a message.
3. A receiver that detects a missing message multicasts its feedback to the rest of the group.
4. A group member that sees another member's request for the same message suppresses its own feedback.
=> Ideally, only a single request for retransmission reaches the sender S.
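Feedback suppression can be illustrated with a small simulation. The sketch below assumes that each receiver missing a message schedules its feedback after a random delay, and that a multicast request takes a fixed `latency` to reach the other receivers. The function name `schedule_feedback` and its parameters are hypothetical. A receiver sends its request only if its timer fires before it has heard someone else's.

```python
import random


def schedule_feedback(receivers, max_delay, latency, rng):
    """Return (timers, senders) for receivers that all miss the same message.

    timers  : receiver -> randomly chosen delay before sending feedback
    senders : receivers whose timer fires before the earliest request
              reaches them, so they do not suppress their own feedback
    """
    timers = {r: rng.uniform(0, max_delay) for r in receivers}
    first = min(timers.values())
    # The earliest request is heard by everyone at time first + latency.
    senders = sorted(r for r, t in timers.items() if t <= first + latency)
    return timers, senders


def demo():
    rng = random.Random(0)
    # With negligible latency only the earliest receiver sends feedback.
    _, ideal = schedule_feedback(range(100), 1.0, 0.0, rng)
    # If the latency is as large as the delay window, nobody is suppressed.
    _, worst = schedule_feedback(range(100), 1.0, 1.0, rng)
    return len(ideal), len(worst)
```

The two extremes show why accurate scheduling matters: the larger the latency is relative to the spread of the timers, the less feedback is suppressed.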
8.4.2.a - Nonhierarchical Feedback Control Feedback suppression has been shown to scale reasonably well, but it has a number of serious problems.
8.4.2.a - Nonhierarchical Feedback Control First, ensuring that only one request for retransmission is returned to the sender requires reasonably accurate scheduling of feedback messages at each receiver, which is not easy to achieve. Second, multicasting feedback also interrupts those processes to which the message has been successfully delivered, so other receivers are forced to receive and process messages that are useless to them.
8.4.2.b - Hierarchical Feedback Control Achieving scalability for very large groups of receivers requires that hierarchical approaches are adopted.
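8.4.2.b - Hierarchical Feedback Control The group is partitioned into subgroups organized as a tree, with the sender's subgroup at the root. Each subgroup has a local coordinator with its own history buffer.
8.4.2.b - Hierarchical Feedback Control Within each subgroup, any reliable multicasting scheme that works for small groups can be used.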
8.4.2.b - Hierarchical Feedback Control If a member misses a message m -> it asks the coordinator to retransmit m.
8.4.2.b - Hierarchical Feedback Control If the coordinator has received acknowledgments for message m from all members of its subgroup, and from its child coordinators, it can remove m from its history buffer.
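The two coordinator rules can be sketched as follows. This is a minimal illustration, not a full protocol: the class name `Coordinator`, its methods, and the process names are hypothetical, and message passing is replaced by direct method calls. A coordinator that has itself missed a message asks its parent, and it drops a message from its history buffer only when every member and every child coordinator has acknowledged it.

```python
class Coordinator:
    def __init__(self, name, members, children=()):
        self.name = name
        self.members = set(members)      # receivers in this subgroup
        self.children = list(children)   # coordinators of child subgroups
        self.parent = None
        for child in self.children:
            child.parent = self
        self.history = {}                # seq -> payload (history buffer)
        self.acks = {}                   # seq -> names that acknowledged

    def receive(self, seq, payload):
        self.history[seq] = payload
        self.acks.setdefault(seq, set())

    def retransmit(self, seq):
        """Serve a retransmission request, asking the parent if needed."""
        if seq not in self.history:
            if self.parent is None:
                raise KeyError(seq)
            self.receive(seq, self.parent.retransmit(seq))
        return self.history[seq]

    def ack(self, seq, who):
        """Record an acknowledgment; prune and report upward when complete."""
        self.acks.setdefault(seq, set()).add(who)
        needed = self.members | {c.name for c in self.children}
        if needed <= self.acks[seq]:
            self.history.pop(seq, None)
            if self.parent is not None:
                self.parent.ack(seq, self.name)


def demo():
    leaf = Coordinator("C1", members={"p3", "p4"})
    root = Coordinator("C0", members={"p1", "p2"}, children=[leaf])
    root.receive(0, "m")                 # C1 itself missed message 0
    got = leaf.retransmit(0)             # p3 asks C1, which asks C0
    for p in ("p1", "p2"):
        root.ack(0, p)
    buffered_before = 0 in root.history  # C1 has not acknowledged yet
    for p in ("p3", "p4"):
        leaf.ack(0, p)
    return got, buffered_before, 0 in root.history, 0 in leaf.history
```

The root keeps m until the child coordinator reports that its whole subgroup has it, so the sender never handles feedback from more than its own subgroup and its direct children.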
8.4.2.b - Hierarchical Feedback Control The main problem is the construction of the tree, which in many cases needs to be constructed dynamically. Adapting existing nodes so that they can act as local coordinators in the way just described is not easy. It is a difficult problem for which no single best solution exists.
8.4.3 - Atomic Multicast To achieve reliable multicasting in the presence of process failures, a distributed system must guarantee that a message is delivered to either all processes or to none at all. This is known as the atomic multicast problem.
8.4.3 - Atomic Multicast [Figure sequence: a replicated database on top of a distributed system with Receivers 1 to 4. Update M1 is delivered, then a replica crashes. Updates M2 and M3 are delivered to the remaining replicas. Receiver 1, which holds only M1, is restored, has missed several updates, and is forced to reconcile until it also holds M1, M2, and M3.]
8.4.3.a - Virtual Synchrony The whole idea of atomic multicasting is that a multicast message m is uniquely associated with a list of processes to which it should be delivered (the group view). Assume that while the multicast is taking place, a process joins or leaves the group. This change is announced by a view change message vc. We need to guarantee that m is either delivered to all processes before each one of them is delivered message vc, or m is not delivered at all.
8.4.3.a - Virtual Synchrony If m is not delivered, how can we speak of a reliable multicast protocol? Delivery of m is allowed to fail only when the view change is the result of the sender of m crashing. Birman and Joseph (1987) developed a reliable multicast model that handles these situations, called virtual synchrony.
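The guarantee can be stated as a check over delivery logs. The sketch below is an assumption-laden illustration: the function name `is_virtually_synchronous` and the log format (one list of delivered items per surviving process, containing the marker "vc" exactly once) are hypothetical. It verifies that all surviving processes delivered the same set of messages before the view change.

```python
def is_virtually_synchronous(logs, view_change="vc"):
    """Check the all-or-nothing rule for one view change.

    logs maps each surviving process to the ordered list of items it
    delivered. Every log is assumed to contain view_change exactly once.
    """
    before = []
    for events in logs.values():
        cut = events.index(view_change)
        before.append(set(events[:cut]))   # messages delivered in the old view
    # Every process must have delivered the same messages before vc.
    return all(s == before[0] for s in before)


ok = is_virtually_synchronous({
    "p1": ["m1", "m2", "vc"],
    "p2": ["m2", "m1", "vc", "m3"],        # order within the view is not checked
})
bad = is_virtually_synchronous({
    "p1": ["m1", "vc"],
    "p2": ["vc", "m1"],                    # m1 crossed the view change at p2
})
```

Note that the check says nothing about the order of m1 and m2 within a view. Ordering is a separate property.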
8.4.3.a - Virtual Synchrony A crashed process can only rejoin the group after its state has been brought up to date.
8.4.3.b - Message Ordering Besides reliability, the ordering of multicasts is also very important. Four orderings are distinguished: unordered multicasts, FIFO-ordered multicasts, causally ordered multicasts, and totally ordered multicasts.
8.4.3.b - Message Ordering
Unordered multicasts: no guarantees are given concerning the order in which messages are delivered.
FIFO-ordered multicasts: incoming messages from the same process are delivered in the same order as they were sent.
Causally ordered multicasts: messages are delivered so that potential causality between different messages is preserved.
Totally ordered multicasts: regardless of whether message delivery is unordered, FIFO, or causally ordered, it is additionally required that messages are delivered in the same order to all group members.
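As an illustration of one of these orderings, the sketch below shows causally ordered delivery using vector timestamps. Vector timestamps are an assumption here, since the slides do not say how causal ordering is implemented, and the class name `CausalReceiver` is hypothetical. A message from sender j is held back until it is the next message expected from j and every message it may depend on has already been delivered.

```python
class CausalReceiver:
    def __init__(self, n):
        self.clock = [0] * n      # messages delivered so far, per sender
        self.held = []            # received but not yet deliverable
        self.delivered = []

    def _deliverable(self, sender, ts):
        # Next message from this sender, and nothing it depends on is missing.
        return ts[sender] == self.clock[sender] + 1 and all(
            ts[k] <= self.clock[k] for k in range(len(ts)) if k != sender
        )

    def receive(self, sender, ts, payload):
        self.held.append((sender, ts, payload))
        progress = True
        while progress:           # one delivery may unblock held messages
            progress = False
            for item in list(self.held):
                s, t, p = item
                if self._deliverable(s, t):
                    self.held.remove(item)
                    self.clock[s] += 1
                    self.delivered.append(p)
                    progress = True


def demo():
    r = CausalReceiver(3)
    # Process 1 sent "reply" after delivering "question" from process 0,
    # but the reply arrives first.
    r.receive(1, [1, 1, 0], "reply")
    held_first = list(r.delivered)
    r.receive(0, [1, 0, 0], "question")
    return held_first, r.delivered
```

The reply is held until the question it depends on has been delivered. A FIFO-only receiver would have delivered it immediately, because the two messages come from different senders.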
8.4.3.b - Totally-ordered multicasts Virtually synchronous reliable multicasting offering totally ordered delivery of messages is called atomic multicasting.
8.4.4 - Implementing Virtual Synchrony This is just one of the possible implementations, the one used in Isis, a fault-tolerant distributed system. It makes use of available reliable point-to-point communication. Although each transmission is guaranteed to succeed, there are no guarantees that all group members receive m, because the sender may crash before it has sent m to every member. A message is stable once it is known to have been received by all members. => Only stable messages are allowed to be delivered.
8.4.4 - Implementing Virtual Synchrony P4 notices that P7 has crashed and sends a view change message.
8.4.4 - Implementing Virtual Synchrony P6 sends out all its unstable messages, followed by a flush message. Flush messages let processes determine when it is safe to install the new view.
8.4.4 - Implementing Virtual Synchrony P6 installs the new view once it has received a flush message from every other process.
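The flush step can be sketched as follows. This is a simplified illustration and not the Isis implementation: the class name `Process`, its methods, and the use of direct method calls in place of reliable point-to-point channels are all assumptions. On a view change, a process forwards its unstable messages to the other members of the new view, sends them a flush message, and installs the new view once it has a flush message from every other member.

```python
class Process:
    def __init__(self, name, view):
        self.name = name
        self.view = frozenset(view)
        self.new_view = None      # view being installed, if any
        self.received = {}        # msg id -> payload
        self.unstable = {}        # received, not known to be at all members
        self.flushed = set()      # peers whose flush message has arrived

    def on_message(self, msg_id, payload):
        self.received.setdefault(msg_id, payload)   # duplicates are discarded

    def on_view_change(self, new_view, peers):
        self.new_view = frozenset(new_view)
        others = self.new_view - {self.name}
        # Forward every unstable message so that all survivors have it.
        for msg_id, payload in self.unstable.items():
            for name in others:
                peers[name].on_message(msg_id, payload)
        self.unstable.clear()
        for name in others:
            peers[name].on_flush(self.name)
        self._try_install()

    def on_flush(self, sender):
        self.flushed.add(sender)
        self._try_install()

    def _try_install(self):
        if self.new_view is not None and self.new_view - {self.name} <= self.flushed:
            self.view = self.new_view
            self.new_view = None
            self.flushed.clear()


def demo():
    old_view = {"P4", "P5", "P6", "P7"}
    peers = {n: Process(n, old_view) for n in ("P4", "P5", "P6")}
    # P7 crashed after its message m reached only P6.
    peers["P6"].received["m"] = "update"
    peers["P6"].unstable["m"] = "update"
    for n in ("P4", "P5", "P6"):
        peers[n].on_view_change({"P4", "P5", "P6"}, peers)
    return peers
```

Because unstable messages are always sent before the flush message, a process that holds flush messages from all other members also holds every message they had received in the old view. The sketch does not handle a second failure during the view change.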
8.4.4 - Implementing Virtual Synchrony The major flaw in this protocol is that it cannot deal with process failures while a new view change is being announced
8.4.5 - Current work RMTP: A Reliable Multicast Transport Protocol. A lossless transport protocol. Achieves reliability by using a packet-based selective repeat retransmission scheme. Scalable.
8.4.6 - Future work Improve the Isis system by handling failed processes. Add a database on top of the network. Provide previous messages to a failed process, so that it is up to date and ready to rejoin the network.
References
Tanenbaum, Andrew S., and Maarten van Steen. Distributed Systems: Principles and Paradigms. Maarten Van Steen, 2016.
Lee, I. (2017). Software System. [online] Available at: http://www.cis.upenn.edu/~lee/07cis505/Lec/lec-ch8b-mcast-v5.pdf [Accessed 27 Sep. 2017].
J. and Sanjay, P. (1996). RMTP: A Reliable Multicast Transport Protocol. [online] Semantic scholar. Available at: https://pdfs.semanticscholar.org/2b55/4ae834827462bc629fde348e239a18ee0ff9.pdf [Accessed 27 Sep. 2017].