Computer Science 425 Distributed Systems CS 425 / ECE 428 Fall 2013

Slides:



Advertisements
Similar presentations
1 Process groups and message ordering If processes belong to groups, certain algorithms can be used that depend on group properties membership create (
Advertisements

Chapter 12 Message Ordering. Causal Ordering A single message should not be overtaken by a sequence of messages Stronger than FIFO Example of FIFO but.
Logical Clocks (2).
CS425/CSE424/ECE428 – Distributed Systems – Fall 2011 Material derived from slides by I. Gupta, M. Harandi, J. Hou, S. Mitra, K. Nahrstedt, N. Vaidya.
CS 542: Topics in Distributed Systems Diganta Goswami.
CS425 /CSE424/ECE428 – Distributed Systems – Fall 2011 Material derived from slides by I. Gupta, M. Harandi, J. Hou, S. Mitra, K. Nahrstedt, N. Vaidya.
DISTRIBUTED SYSTEMS II FAULT-TOLERANT BROADCAST Prof Philippas Tsigas Distributed Computing and Systems Research Group.
CS4231 Parallel and Distributed Algorithms AY 2006/2007 Semester 2 Lecture 6 Instructor: Haifeng YU.
CSE 486/586, Spring 2014 CSE 486/586 Distributed Systems Reliable Multicast Steve Ko Computer Sciences and Engineering University at Buffalo.
CS542 Topics in Distributed Systems Diganta Goswami.
LEADER ELECTION CS Election Algorithms Many distributed algorithms need one process to act as coordinator – Doesn’t matter which process does the.
CS 582 / CMPE 481 Distributed Systems
Causality & Global States. P1 P2 P Physical Time 4 6 Include(obj1 ) obj1.method() P2 has obj1 Causality violation occurs when order.
Copyright © George Coulouris, Jean Dollimore, Tim Kindberg This material is made available for private study and for direct.
CPSC 668Set 15: Broadcast1 CPSC 668 Distributed Algorithms and Systems Fall 2006 Prof. Jennifer Welch.
Distributed Systems Fall 2009 Replication Fall 20095DV0203 Outline Group communication Fault-tolerant services –Passive and active replication Highly.
Distributed systems Module 2 -Distributed algorithms Teaching unit 1 – Basic techniques Ernesto Damiani University of Bozen Lesson 4 – Consensus and reliable.
Aran Bergman, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Recitation 5: Reliable.
CS 425 / ECE 428 Distributed Systems Fall 2014 Indranil Gupta (Indy) Lecture 19: Paxos All slides © IG.
Lecture 12 Synchronization. EECE 411: Design of Distributed Software Applications Summary so far … A distributed system is: a collection of independent.
Logical Clocks (2). Topics r Logical clocks r Totally-Ordered Multicasting r Vector timestamps.
Computer Science 425 Distributed Systems (Fall 2009) Lecture 5 Multicast Communication Reading: Section 12.4 Klara Nahrstedt.
1 A Modular Approach to Fault-Tolerant Broadcasts and Related Problems Author: Vassos Hadzilacos and Sam Toueg Distributed Systems: 526 U1580 Professor:
Fault-Tolerant Broadcast Terminology: broadcast(m) a process broadcasts a message to the others deliver(m) a process delivers a message to itself 1.
CS425 /CSE424/ECE428 – Distributed Systems – Fall 2011 Material derived from slides by I. Gupta, M. Harandi, J. Hou, S. Mitra, K. Nahrstedt, N. Vaidya.
Multicast Communication. Unicast vs. Multicast Multicast Whereas the majority of network services and network applications use unicast for IPC, multicasting.
Logical Clocks. Topics Logical clocks Totally-Ordered Multicasting Vector timestamps.
Lecture 11-1 Computer Science 425 Distributed Systems CS 425 / CSE 424 / ECE 428 Fall 2010 Indranil Gupta (Indy) September 28, 2010 Lecture 11 Leader Election.
Lecture 10 – Mutual Exclusion Distributed Systems.
CS 425/ECE 428/CSE424 Distributed Systems (Fall 2009) Lecture 9 Consensus I Section Klara Nahrstedt.
Totally Ordered Broadcast in the face of Network Partitions [Keidar and Dolev,2000] INF5360 Student Presentation 4/3-08 Miran Damjanovic
EEC 688/788 Secure and Dependable Computing Lecture 10 Wenbing Zhao Department of Electrical and Computer Engineering Cleveland State University
Building Dependable Distributed Systems, Copyright Wenbing Zhao
Lecture 10: Coordination and Agreement (Chap 12) Haibin Zhu, PhD. Assistant Professor Department of Computer Science Nipissing University © 2002.
Reliable Communication in the Presence of Failures Kenneth P. Birman and Thomas A. Joseph Presented by Gloria Chang.
Fault-Tolerant Broadcast Terminology: broadcast(m) a process broadcasts a message to the others deliver(m) a process delivers a message to itself 1.
Lecture 12-1 Computer Science 425 Distributed Systems CS 425 / CSE 424 / ECE 428 Fall 2012 Indranil Gupta (Indy) October 4, 2012 Lecture 12 Mutual Exclusion.
Lecture 7- 1 CS 425/ECE 428/CSE424 Distributed Systems (Fall 2009) Lecture 7 Distributed Mutual Exclusion Section 12.2 Klara Nahrstedt.
Lecture 4-1 Computer Science 425 Distributed Systems (Fall2009) Lecture 4 Chandy-Lamport Snapshot Algorithm and Multicast Communication Reading: Section.
From Coulouris, Dollimore, Kindberg and Blair Distributed Systems: Concepts and Design Edition 5, © Addison-Wesley 2012 Slides for Chapter 15: Coordination.
CSE 486/586 CSE 486/586 Distributed Systems Leader Election Steve Ko Computer Sciences and Engineering University at Buffalo.
CS 425 / ECE 428 Distributed Systems Fall 2015 Indranil Gupta (Indy) Lecture 9: Multicast Sep 22, 2015 All slides © IG.
Distributed Systems Lecture 9 Leader election 1. Previous lecture Middleware RPC and RMI – Marshalling 2.
Group Communication A group is a collection of users sharing some common interest.Group-based activities are steadily increasing. There are many types.
Distributed Systems Lecture 7 Multicast 1. Previous lecture Global states – Cuts – Collecting state – Algorithms 2.
CS 425 / ECE 428 Distributed Systems Fall 2015 Indranil Gupta (Indy) Oct 1, 2015 Lecture 12: Mutual Exclusion All slides © IG.
Slides for Chapter 11: Coordination and Agreement From Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edition 3, © Addison-Wesley.
EEC 688/788 Secure and Dependable Computing Lecture 10 Wenbing Zhao Department of Electrical and Computer Engineering Cleveland State University
Reliable multicast Tolerates process crashes. The additional requirements are: Only correct processes will receive multicasts from all correct processes.
Distributed systems II Fault-Tolerant Broadcast
Distributed Systems Course Coordination and Agreement
CSE 486/586 Distributed Systems Reliable Multicast --- 1
Coordination and Agreement
CSE 486/586 Distributed Systems Leader Election
CS 525 Advanced Distributed Systems Spring 2013
Lecture 17: Leader Election
CS 525 Advanced Distributed Systems Spring 2018
CS 425 / ECE 428 Distributed Systems Fall 2017 Indranil Gupta (Indy)
CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS
CS 425 / ECE 428 Distributed Systems Fall 2017 Indranil Gupta (Indy)
Mutual Exclusion CS p0 CS p1 p2 CS CS p3.
EEC 688/788 Secure and Dependable Computing
CSE 486/586 Distributed Systems Leader Election
EEC 688/788 Secure and Dependable Computing
EEC 688/788 Secure and Dependable Computing
Distributed Systems Course Coordination and Agreement
CSE 486/586 Distributed Systems Reliable Multicast --- 2
CSE 486/586 Distributed Systems Reliable Multicast --- 1
CSE 486/586 Distributed Systems Leader Election
Presentation transcript:

Computer Science 425 Distributed Systems CS 425 / ECE 428 Fall 2013 Indranil Gupta (Indy) September 17, 2013 Lecture 7 Multicast Reading: Sections 15.4  2013, I. Gupta, K. Nahrtstedt, S. Mitra, N. Vaidya, M. T. Harandi, J. Hou

Communication Modes in Distributed System Unicast (best effort or reliable) Messages are sent from exactly one process to one process. Best effort: if a message is delivered it would be intact; no reliability guarantees. Reliable: guarantees delivery of messages. Broadcast Messages are sent from exactly one process to all processes on the network. Broadcast protocols are not practical. Multicast Messages broadcast within a group of processes. A multicast message is sent from any one process to the group of processes on the network. Reliable multicast can be implemented “above” (i.e., “using”) a reliable unicast. This lecture!

Other Examples of Multicast Use Akamai’s Configuration Management System (called ACMS) uses a core group of 3-5 servers. These servers continuously multicast to each other the latest updates. They use reliable multicast. After an update is reliably multicast within this group, it is then sent out to all the (1000s of) servers Akamai has all over the world. Air Traffic Control System: orders by one ATC need to be ordered (and reliable) multicast out to other ATC’s. Newsgroup servers multicast to each other in a reliable and ordered manner. Facebook servers multicast your updates to each other

What’re we designing in this class Application (at process p) MULTICAST PROTOCOL send multicast Incoming messages deliver (upcall) One process p

Basic Multicast (B-multicast) Let’s assume the all processes know the group membership A straightforward way to implement B-multicast is to use a reliable one-to-one send (unicast) operation: B-multicast(group g, message m): for each process p in g, send (p,m). receive(m): B-deliver(m) at p. A “correct” process= a “non-faulty” process A basic multicast primitive guarantees a correct process will eventually deliver the message, as long as the sender (multicasting process) does not crash. Can we provide reliability even when the sender crashes (after it has sent the multicast)?

Reliable Multicast Integrity: A correct (i.e., non-faulty) process p delivers a message m at most once. Validity: If a correct process multicasts (sends) message m, then it will eventually deliver m itself. Guarantees liveness to the sender. Agreement: If some one correct process delivers message m, then all other correct processes in group(m) will eventually deliver m. Property of “all or nothing.” Validity and agreement together ensure overall liveness: if some correct process multicasts a message m, then, all correct processes deliver m too.

Reliable R-Multicast Algorithm “USES” B-multicast “USES” reliable unicast

Reliable Multicast Algorithm (R-multicast) Integrity Agreement Integrity, Validity if some correct process B-multicasts a message m, then, all correct processes R-deliver m too. If no correct process B-multicasts m, then no correct processes R-deliver m.

What about Multicast Ordering? FIFO ordering: If a correct process issues multicast(g,m) and then multicast(g,m’), then every correct process that delivers m’ will have already delivered m. Causal ordering: If multicast(g,m)  multicast(g,m’) then any correct process that delivers m’ will have already delivered m. Total ordering: If a correct process delivers message m before m’ (independent of the senders), then any other correct process that delivers m’ will have already delivered m.

Total, FIFO and Causal Ordering Totally ordered messages T1 and T2. FIFO-related messages F1 and F2. Causally related messages C1 and C3 Causal ordering implies FIFO ordering (why?) Total ordering does not imply causal ordering. Causal ordering does not imply total ordering. Hybrid mode: causal-total ordering, FIFO-total ordering.

Display From Newsgroup os.interesting Item From Subject 23 A.Hanlon Mach 24 G.Joseph Microkernels 25 Re: Microkernels 26 T.L’Heureux RPC performance 27 M.Walker Re: Mach end What is the most appropriate ordering for this application? (a) FIFO (b) causal (c) total What is the most appropriate ordering for Facebook posts?

Providing Ordering Guarantees (FIFO) Look at messages from each process in the order they were sent: Each process keeps a sequence number for each other process (vector) When a message is received, as expected (next sequence), accept higher than expected, buffer in a queue lower than expected, reject If Message# is

Implementing FIFO Ordering Spg: the number of messages p has sent to g. Rqg: the sequence number of the latest group-g message that p has delivered from q (maintained for all q at p) For p to FO-multicast m to g p increments Spg by 1. p “piggy-backs” the value Spg onto the message. p B-multicasts m to g. At process p, Upon receipt of m from q with sequence number S: p checks whether S= Rqg+1. If so, p FO-delivers m and increments Rqg If S > Rqg+1, p places the message in the hold-back queue until the intervening messages have been delivered and S= Rqg+1. If S < Rqg+1, reject m

Hold-back Queue for Arrived Multicast Messages

Example: FIFO Multicast (do NOT confuse with vector timestamps) “Accept” = Deliver Physical Time Reject: 1 < 1 + 1 Accept 1 = 0 + 1 Accept: 2 = 1 + 1 1 0 0 2 0 0 2 1 0 2 1 0 P1 0 0 0 2 2 1 1 1 1 P2 0 0 0 2 1 0 1 0 0 2 0 0 Reject: 1 < 1 + 1 Accept 1 = 0 + 1 1 P3 0 0 0 0 0 0 1 0 0 2 1 0 2 0 0 Buffer 2 > 0 + 1 2 0 0 Accept Buffer 2 = 1 + 1 Accept: 1 = 0 + 1 0 0 0 Sequence Vector

Total Ordering Using a Sequencer Sequencer = Leader process

ISIS: Total ordering without sequencer P 2 1 Message 3 2 2 P 2 Proposed Seq 4 1 3 Agreed Seq 1 2 P 1 3 P 3

ISIS algorithm for total ordering The multicast sender multicasts the message to everyone. Recipients add the received message to a special queue called the priority queue, tag the message undeliverable, and reply to the sender with a proposed priority (i.e., proposed sequence number). Further, this proposed priority is 1 more than the latest sequence number heard so far at the recipient, suffixed with the recipient's process ID. The priority queue is always sorted by priority.  The sender collects all responses from the recipients, calculates their maximum, and re-multicasts original message with this as the final priority for the message.  On receipt of this information, recipients mark the message as deliverable, reorder the priority queue, and deliver the set of lowest priority messages that are marked as deliverable.

Proof of Total Order For a message m1, consider the first process p that delivers m1 At p, when message m1 is at head of priority queue Suppose m2 is another message that has not yet been delivered (i.e., is on the same queue or has not been seen yet by p) finalpriority(m2) >= proposedpriority(m2) > finalpriority(m1) Suppose there is some other process p’ that delivers m2 before it delivers m1. Then at p’, finalpriority(m1) >= proposedpriority(m1) > finalpriority(m2) a contradiction! Due to “max” operation at sender and since proposed priorities by process p only increase Since queue ordered by increasing priority Due to “max” operation at sender Since queue ordered by increasing priority

Causal Ordering using vector timestamps The number of group-g messages from process j that have been seen at process i so far

Example: Causal Ordering Multicast Reject: Accept 1,0,0 1,1,0 P1 1,1,0 0,0,0 (1,1,0) (1,0,0) (1,1,0) P2 0,0,0 1,1,0 1,0,0 (1,1,0) (1,0,0) P3 0,0,0 1,1,0 Accept: 1,0,0 Accept Buffered message 1,1,0 Accept Buffer, missing P1(1) Physical Time

Summary Multicast is operation of sending one message to multiple processes in a given group Reliable multicast algorithm built using unicast Ordering – FIFO, total, causal Thursday RPCs: Section 4.3, parts of Chapter 5 Important for MP2 Homework 1 due this Thursday Hand in to me at start of lecture (not during or after lecture)