CS425/CSE424/ECE428 – Distributed Systems – Fall 2011. Material derived from slides by I. Gupta, M. Harandi, J. Hou, S. Mitra, K. Nahrstedt, N. Vaidya.

Groups for MPs
- Two people
- Must select by next week
- Find a partner:
  - In class
  - On the newsgroup
  - Email the course staff

Consider the following vector timestamps:
- T1: [1,3,2]
- T2: [2,4,2]
How do they compare?
- A: T1 > T2
- B: T1 < T2
- C: T1 = T2
- D: None of the above
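Vector timestamps are compared component-wise: T1 <= T2 exactly when every entry of T1 is less than or equal to the corresponding entry of T2. A minimal Python sketch of that comparison (the helper names are illustrative, not from the lecture):

```python
def vt_leq(a, b):
    """True if timestamp a is component-wise <= timestamp b."""
    return all(x <= y for x, y in zip(a, b))

def vt_compare(a, b):
    """Return '<', '>', '=', or 'concurrent' for two equal-length vector timestamps."""
    if a == b:
        return "="
    if vt_leq(a, b):
        return "<"
    if vt_leq(b, a):
        return ">"
    return "concurrent"

print(vt_compare([1, 3, 2], [2, 4, 2]))  # '<': every entry of T1 <= T2 and they differ
```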

Which of these cuts is consistent? (The cut diagram from the slide is not reproduced here.)
- A: 1
- B: 2
- C: both
- D: neither

- Unicast
  - One-to-one: a message from process p to process q.
  - Best effort: the message may or may not be delivered, but if delivered it arrives intact.
  - Reliable: the message is guaranteed to be delivered.
- Broadcast
  - One-to-all: a message from process p to all processes.
  - Impractical for large networks.
- Multicast
  - One-to-many: a "local" broadcast within a group g of processes.

- Define multicast properties
  - Reliability
  - Ordering
- Examine algorithms for reliable and/or ordered multicast
- Readings: 12.4 (4th ed), 15.4 (5th ed); optional: 4.5 (4th ed), 4.4 (5th ed)

- Akamai's Configuration Management System (ACMS) uses a core group of 3-5 servers. These servers continuously multicast the latest updates to each other using reliable multicast. After an update has been reliably multicast within this group, it is sent out to all of the (thousands of) servers Akamai operates around the world.
- Air traffic control: orders issued by one ATC must be multicast, reliably and in order, to the other ATCs.
- Newsgroup servers multicast postings to each other in a reliable and ordered manner.

[Diagram: at each process p, the application invokes "send multicast" on the multicast protocol layer; incoming messages from the network are handled by the protocol layer, which invokes "deliver multicast" up to the application.]

- A straightforward way to implement B-multicast is to use a reliable one-to-one send (unicast) operation:
  - B-multicast(g, m): for each process p in g, send(p, m).
  - receive(m): B-deliver(m) at p.
- Guarantees?
  - All processes in g eventually receive every multicast message...
  - ...as long as send is reliable
  - ...and no process crashes
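As an illustration only, a minimal Python sketch of B-multicast over an assumed reliable unicast send (the send callback and the in-memory inboxes are hypothetical stand-ins for a real transport):

```python
def b_multicast(group, m, send):
    """B-multicast(g, m): reliably unicast m to every process p in g (sender included)."""
    for p in group:
        send(p, m)

def on_receive(p, m, b_deliver):
    """receive(m) at p: B-deliver(m) at p."""
    b_deliver(p, m)

# Toy wiring: three "processes" with in-memory inboxes standing in for the network.
inboxes = {"p1": [], "p2": [], "p3": []}
b_multicast(inboxes.keys(), "hello", send=lambda p, m: inboxes[p].append(m))
print(inboxes)  # every process has received "hello" exactly once
```

This also shows why B-multicast alone is not reliable in the sense defined next: if the sender crashes partway through the loop, some group members receive m while others never do.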

- Integrity: a correct (i.e., non-faulty) process p delivers a message m at most once.
- Agreement: if a correct process delivers message m, then all other correct processes in group(m) will eventually deliver m.
  - The "all or nothing" property.
- Validity: if a correct process multicasts (sends) message m, then it will eventually deliver m itself.
  - Guarantees liveness to the sender.
- Validity and agreement together ensure overall liveness: if some correct process multicasts a message m, then all correct processes deliver m too.

On initialization:
    Received := {}

For process p to R-multicast message m to group g:
    B-multicast(g, m)                        (p ∈ g is included as a destination)

On B-deliver(m) at process q, with g = group(m):
    if m ∉ Received:
        Received := Received ∪ {m}
        if q ≠ p: B-multicast(g, m)
        R-deliver(m)

Layering: R-multicast "USES" B-multicast, which in turn uses reliable unicast.

The same algorithm, annotated with the property each part provides:
- Integrity: the Received set ensures each message is R-delivered at most once.
- Validity: the sender is included in g, so it B-delivers (and hence R-delivers) its own message.
- Agreement: every process re-multicasts m before R-delivering it, so if any correct process delivers m, all correct processes eventually do.
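A minimal Python sketch of the R-multicast algorithm above, layered on an assumed B-multicast transport (class and callback names are illustrative; the message envelope carrying the original sender's id is assumed):

```python
class RMulticast:
    """Sketch of R-multicast layered on B-multicast; transport wiring is assumed."""

    def __init__(self, pid, group, b_multicast, r_deliver):
        self.pid = pid                  # this process's id (a member of group)
        self.group = group
        self.received = set()           # Received := {}
        self.b_multicast = b_multicast  # callback: b_multicast(group, origin, m)
        self.r_deliver = r_deliver      # application-level delivery callback

    def r_multicast(self, m):
        # The sender is in the group, so it will B-deliver its own message too.
        self.b_multicast(self.group, self.pid, m)

    def on_b_deliver(self, origin, m):
        """Called when the B-multicast layer delivers m, tagged with its original sender."""
        if m not in self.received:
            self.received.add(m)
            if origin != self.pid:      # q ≠ p: forward before delivering
                self.b_multicast(self.group, origin, m)
            self.r_deliver(m)
```

The key design point is that every process re-multicasts a message the first time it B-delivers it, before R-delivering it; that is what buys Agreement, at the cost of O(N^2) messages per multicast.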

- FIFO ordering: if a correct process issues multicast(g, m) and then multicast(g, m'), then every correct process that delivers m' will have already delivered m.
- Causal ordering: if multicast(g, m) → multicast(g, m'), then any correct process that delivers m' will have already delivered m.
  - Typically, → is defined in terms of multicast communication only.
- Total ordering: if a correct process delivers message m before m' (independent of the senders), then any other correct process that delivers m' will have already delivered m.

[Figure, not reproduced: totally ordered messages T1 and T2; FIFO-related messages F1 and F2; causally related messages C1 and C3.]
- Causal ordering implies FIFO ordering.
- Total ordering does not imply causal ordering.
- Causal ordering does not imply total ordering.
- Hybrid modes: causal-total ordering, FIFO-total ordering.

Bulletin board: os.interesting

  Item  From         Subject
  23    A.Hanlon     Mach
  24    G.Joseph     Microkernels
  25    A.Hanlon     Re: Microkernels
  26    T.L'Heureux  RPC performance
  27    M.Walker     Re: Mach

What is the most appropriate ordering for this application? (a) FIFO (b) causal (c) total

- Look at messages from each process in the order they were sent:
  - Each process keeps a sequence number for each other process.
  - When a message is received, if its sequence number is:
    ▪ as expected (the next in sequence), accept it
    ▪ higher than expected, buffer it in a queue
    ▪ lower than expected, reject it

- S_p^g: the number of messages p has sent to g.
- R_q^g: the sequence number of the latest group-g message p has delivered from q.
- For p to FO-multicast m to g:
  - p increments S_p^g by 1.
  - p piggybacks the value S_p^g onto the message.
  - p B-multicasts m to g.
- At process p, upon receipt of m from q with sequence number S:
  - p checks whether S = R_q^g + 1. If so, p FO-delivers m and increments R_q^g.
  - If S > R_q^g + 1, p places the message in the hold-back queue until the intervening messages have been delivered and S = R_q^g + 1.
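A Python sketch of this FIFO-ordered multicast, with a per-sender hold-back queue (transport wiring is assumed; names are illustrative):

```python
from collections import defaultdict

class FifoMulticast:
    """Sketch of FIFO-ordered multicast layered on an assumed B-multicast transport."""

    def __init__(self, pid, group, b_multicast, fo_deliver):
        self.pid = pid
        self.group = group
        self.b_multicast = b_multicast      # callback: b_multicast(group, msg)
        self.fo_deliver = fo_deliver        # application-level delivery callback
        self.S = 0                          # S_p^g: messages this process has sent to g
        self.R = defaultdict(int)           # R_q^g: latest seq delivered from each sender q
        self.holdback = defaultdict(dict)   # sender -> {seq: message}

    def fo_multicast(self, m):
        self.S += 1
        self.b_multicast(self.group, (self.pid, self.S, m))   # piggyback S_p^g

    def on_b_deliver(self, sender, seq, m):
        if seq == self.R[sender] + 1:
            self.fo_deliver(m)
            self.R[sender] = seq
            # deliver any buffered messages from this sender that are now in order
            while self.R[sender] + 1 in self.holdback[sender]:
                nxt = self.holdback[sender].pop(self.R[sender] + 1)
                self.R[sender] += 1
                self.fo_deliver(nxt)
        elif seq > self.R[sender] + 1:
            self.holdback[sender][seq] = m   # buffer until the gap is filled
        # seq <= R[sender]: duplicate, reject/ignore
```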

[Example timeline for FIFO ordering across processes P1-P3 (physical time runs left to right): each process keeps a sequence vector (do NOT confuse with vector timestamps); an arriving message is accepted, i.e., delivered, when its sequence number equals the expected value, buffered when it is higher, and rejected when it is lower.]

[Total ordering using a sequencer (the sequencer is the leader process): the sequencer assigns a global sequence number to each multicast message, and all processes deliver messages in sequence-number order.]
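The slide shows the sequencer-based approach only as a figure; as an illustration, here is a minimal Python sketch of one common formulation, in which every data message also reaches the sequencer, the sequencer multicasts an order message with the next global sequence number, and processes deliver in that global order. All names and the message-tagging scheme are assumptions for this sketch:

```python
class SequencerTotalOrder:
    """Sketch of sequencer-based totally ordered multicast; transport wiring is assumed."""

    def __init__(self, pid, group, sequencer_id, b_multicast, to_deliver):
        self.pid = pid
        self.group = group
        self.sequencer_id = sequencer_id
        self.b_multicast = b_multicast   # callback: b_multicast(group, msg)
        self.to_deliver = to_deliver     # application-level delivery callback
        self.next_global = 1             # used only by the sequencer
        self.expected = 1                # next global sequence number to deliver
        self.pending = {}                # msg_id -> message, waiting for an order
        self.orders = {}                 # global seq -> msg_id

    def to_multicast(self, msg_id, m):
        self.b_multicast(self.group, ("DATA", msg_id, m))

    def on_b_deliver(self, kind, a, b):
        if kind == "DATA":
            msg_id, m = a, b
            self.pending[msg_id] = m
            if self.pid == self.sequencer_id:
                # the sequencer assigns the next global sequence number
                self.b_multicast(self.group, ("ORDER", self.next_global, msg_id))
                self.next_global += 1
        else:  # kind == "ORDER"
            seq, msg_id = a, b
            self.orders[seq] = msg_id
        self._try_deliver()

    def _try_deliver(self):
        # deliver messages strictly in global sequence order
        while self.expected in self.orders and self.orders[self.expected] in self.pending:
            msg_id = self.orders[self.expected]
            self.to_deliver(self.pending.pop(msg_id))
            self.expected += 1
```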

[Diagram of the ISIS approach: a sender multicasts a message to processes P1-P4; each process replies with a proposed sequence number; the sender picks the maximum (here 3) as the agreed sequence number and re-multicasts it.]

- The sender multicasts the message to everyone.
- Each recipient replies with a proposed priority (sequence number):
  - larger than all observed agreed priorities
  - larger than any priority it has previously proposed
- Each recipient stores the message in a priority queue:
  - ordered by priority (proposed or agreed)
  - with the message marked as undeliverable
- The sender chooses the agreed priority and re-multicasts the message with it:
  - the maximum of all proposed priorities
- Upon receiving the agreed (final) priority:
  - mark the message as deliverable
  - deliver any deliverable messages at the front of the priority queue
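A Python sketch of this ISIS-style protocol. Ties between equal priorities, which the full algorithm breaks with the proposer's process id as a suffix, are ignored here; the transport callbacks and message routing are assumptions:

```python
class IsisTotalOrder:
    """Sketch of ISIS-style totally ordered multicast; transport wiring is assumed."""

    def __init__(self, pid, group, b_multicast, send_to, to_deliver):
        self.pid = pid
        self.group = group
        self.b_multicast = b_multicast   # callback: b_multicast(group, msg)
        self.send_to = send_to           # callback: send_to(dest_pid, msg), unicast reply
        self.to_deliver = to_deliver     # application-level delivery callback
        self.max_proposed = 0            # largest priority this process has proposed
        self.max_agreed = 0              # largest agreed priority seen so far
        self.queue = {}                  # msg_id -> [priority, deliverable?, message]
        self.proposals = {}              # sender side: msg_id -> list of proposals

    # --- sender side ---
    def to_multicast(self, msg_id, m):
        self.proposals[msg_id] = []
        self.b_multicast(self.group, ("MSG", msg_id, m, self.pid))

    def on_proposal(self, msg_id, proposed):
        self.proposals[msg_id].append(proposed)
        if len(self.proposals[msg_id]) == len(self.group):
            agreed = max(self.proposals[msg_id])   # maximum of all proposed priorities
            self.b_multicast(self.group, ("AGREED", msg_id, agreed))

    # --- receiver side ---
    def on_msg(self, msg_id, m, sender):
        proposed = max(self.max_proposed, self.max_agreed) + 1
        self.max_proposed = proposed
        self.queue[msg_id] = [proposed, False, m]  # undeliverable for now
        self.send_to(sender, ("PROPOSE", msg_id, proposed))

    def on_agreed(self, msg_id, agreed):
        self.max_agreed = max(self.max_agreed, agreed)
        self.queue[msg_id][0] = agreed
        self.queue[msg_id][1] = True               # mark deliverable
        self._deliver_front()

    def _deliver_front(self):
        # deliver deliverable messages while they sit at the front of the priority queue
        while self.queue:
            msg_id = min(self.queue, key=lambda k: self.queue[k][0])
            priority, deliverable, m = self.queue[msg_id]
            if not deliverable:
                break
            del self.queue[msg_id]
            self.to_deliver(m)
```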

[Example run of the ISIS algorithm with processes P1-P3 and messages A, B, C: processes propose priorities (A:1, B:1, A:2, C:2, C:3, B:3, ...), ties are broken by appending the proposer's process id (e.g., B:3.1, C:3.3), and a message is marked deliverable (✔) once its agreed priority is final and it reaches the front of the queue.]

- For a message m1, consider the first process p that delivers m1.
- At p, when m1 is at the head of the priority queue and has been marked deliverable, let m2 be another message that has not yet been delivered (i.e., it is on the same queue or has not yet been seen by p). Then
  finalpriority(m2) >= proposedpriority(m2)   (due to the "max" operation at the sender)
                     > finalpriority(m1)       (since the queue is ordered by increasing priority)
- Suppose some other process p' delivers m2 before it delivers m1. Then, by the same argument at p',
  finalpriority(m1) >= proposedpriority(m1) > finalpriority(m2)
  which is a contradiction!

Causal ordering uses per-group vector timestamps: entry j of process i's vector is the number of group-g messages from process j that have been seen at (delivered to) process i so far.
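A minimal Python sketch of causally ordered multicast with these per-group vector timestamps, following the standard textbook formulation (the algorithm figure on the original slide is not reproduced here, and the transport wiring and process indexing are assumptions):

```python
class CausalMulticast:
    """Sketch of causally ordered multicast using vector timestamps; transport is assumed."""

    def __init__(self, index, n, b_multicast, co_deliver):
        self.i = index                  # this process's index in the group (0..n-1)
        self.V = [0] * n                # V[j]: group messages from process j seen so far
        self.b_multicast = b_multicast  # callback: b_multicast(group, msg)
        self.co_deliver = co_deliver    # application-level delivery callback
        self.holdback = []              # list of (sender_index, timestamp, message)

    def co_multicast(self, group, m):
        self.V[self.i] += 1
        self.b_multicast(group, (self.i, list(self.V), m))   # piggyback the timestamp
        self.co_deliver(m)              # the sender delivers its own message right away

    def on_b_deliver(self, sender, ts, m):
        if sender != self.i:
            self.holdback.append((sender, ts, m))
            self._try_deliver()

    def _can_deliver(self, sender, ts):
        # deliver once all of the sender's earlier messages have been delivered and
        # everything this message causally depends on from other processes has too
        return ts[sender] == self.V[sender] + 1 and all(
            ts[k] <= self.V[k] for k in range(len(ts)) if k != sender)

    def _try_deliver(self):
        progress = True
        while progress:
            progress = False
            for entry in list(self.holdback):
                sender, ts, m = entry
                if self._can_deliver(sender, ts):
                    self.holdback.remove(entry)
                    self.co_deliver(m)
                    self.V[sender] += 1
                    progress = True
```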

[Example timeline for causal ordering across P1-P3 (physical time runs left to right): each multicast carries the sender's vector timestamp, e.g., (1,0,0) and (1,1,0); a message is accepted (delivered) only when every message it causally depends on has been delivered. Here a message stamped (1,1,0) is buffered because the receiver is missing P1's first message, and is delivered once that message arrives.]

- Multicast is the operation of sending one message to multiple processes in a given group.
- A reliable multicast algorithm can be built using unicast.
- Ordering guarantees: FIFO, total, causal.