Aran Bergman & Eddie Bortnikov & Alex Shraer, Principles of Reliable Distributed Systems, Spring 2008 1 Principles of Reliable Distributed Systems Recitation.


Aran Bergman & Eddie Bortnikov & Alex Shraer, Principles of Reliable Distributed Systems, Spring Principles of Reliable Distributed Systems Recitation 2: Broadcast Services Spring 2009 Alex Shraer

Broadcast Service for Replication
Primitives: broadcast(m), deliver(m).
–For simplicity, assume m is unique.
[Figure: at each process, the Application invokes broadcast on, and receives deliver from, a Broadcast Algorithm layer, which uses send and receive over the Network.]

Reliable Broadcast Specifications
Validity: if a correct process broadcasts m, then all correct processes eventually deliver m.
Agreement: if a correct process delivers m, then all correct processes eventually deliver m.
–Uniform Agreement: if any process delivers m, then all correct processes eventually deliver m.
Integrity: m is delivered by a correct process at most once, and only if it was previously broadcast.

Reliable Broadcast - Quiz
What happens if a process fails during the broadcast of a message?
Does the delivery of a message by a faulty process require correct processes to deliver that message as well?

FIFO Broadcast
Why is FIFO important?
FIFO Order: If a process broadcasts a message m before it broadcasts a message m’, then no correct process delivers m’ unless it has previously delivered m.
FIFO Broadcast: Reliable Broadcast + FIFO Order
Alternative definition of FIFO Order?
–“All messages broadcast by the same process are delivered to all processes in the order they are sent.”
Quiz: Are these definitions equivalent?

Example
[Figure not preserved in the transcript.]
Also, this alternative definition forces faulty processes to deliver messages, which is impossible to guarantee.
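One common way to enforce the FIFO Order property defined above is to tag every message with a per-sender sequence number and hold back anything that arrives early. The following is a minimal Python sketch under that assumption. The class name `FifoDelivery`, its methods, and the sequence-number tagging are illustrative choices and do not come from the slides.

```python
from collections import defaultdict


class FifoDelivery:
    """Hold-back buffer: delivers each sender's messages in the order they were broadcast."""

    def __init__(self):
        self.next_seq = defaultdict(int)  # next sequence number expected from each sender
        self.held = defaultdict(dict)     # sender -> {seq: message} received but not yet delivered
        self.delivered = []               # delivery log, in delivery order

    def receive(self, sender, seq, message):
        # Assumption: the sender numbers its broadcasts 0, 1, 2, ...
        if seq < self.next_seq[sender]:
            return  # already delivered: ignore the duplicate (Integrity)
        self.held[sender][seq] = message
        # Deliver as many consecutive messages from this sender as possible.
        while self.next_seq[sender] in self.held[sender]:
            m = self.held[sender].pop(self.next_seq[sender])
            self.delivered.append((sender, m))
            self.next_seq[sender] += 1


p = FifoDelivery()
p.receive("q", 1, "m'")  # m' arrives first and is held back
p.receive("q", 0, "m")   # m arrives: both are now delivered, m first
print(p.delivered)
```

Note that the sketch only constrains what a process delivers; it cannot make a crashed process deliver anything, which is why the slide's first definition is phrased as "no correct process delivers m’ unless it has previously delivered m".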

Causal Broadcast
Why is causality important?
Event e causally precedes event f (e→f) iff:
–a process executes both e and f, in that order, or
–e is the broadcast of some message m and f is the delivery of m, or
–there is an event h such that e→h and h→f.
Causal Order: If the broadcast of a message m causally precedes the broadcast of a message m’, then no correct process delivers m’ unless it has previously delivered m.
Causal Broadcast: Reliable Broadcast + Causal Order

Atomic Broadcast and Uniformity
Why would we want more than Causal Broadcast?
Atomic Broadcast: Reliable Broadcast + Total Order
Total Order: If correct processes p and q both deliver messages m and m’, then p delivers m before m’ if and only if q delivers m before m’.

Broadcast Primitives
[Figure not preserved in the transcript.]

Uniformity
Agreement, Integrity and Order place no restrictions on the behavior of faulty processes.
The uniform variants limit the behavior of faulty processes as well.
Example 1: Agreement allows a faulty process to deliver a message that is never delivered by correct processes.
–Uniform Agreement: If a process (whether correct or faulty) delivers a message m, then all correct processes eventually deliver m.
Example 2: Integrity allows a faulty process to deliver a message more than once, and to deliver messages ‘out of thin air’.
–Uniform Integrity: For any message m, every process (whether correct or faulty) delivers m at most once, and only if some process broadcast m.
Likewise, we can strengthen the Order properties:
–Uniform FIFO Order: If a process broadcasts a message m before it broadcasts a message m’, then no process (whether correct or faulty) delivers m’ unless it has previously delivered m.
–Uniform Causal Order: If the broadcast of a message m causally precedes the broadcast of a message m’, then no process (whether correct or faulty) delivers m’ unless it has previously delivered m.
–Uniform Total Order: If any processes p and q (whether correct or faulty) both deliver messages m and m’, then p delivers m before m’ iff q delivers m before m’.

Crash Failures
Suppose processes are only subject to crash failures.
–They operate correctly up to the time they crash (by definition).
Can we assume that the message deliveries that a process makes before crashing are always ‘correct’ (consistent with those of correct processes)?
–No.

Crash Failures (cont’d)
Coordinator-based algorithm:
–When a process intends to broadcast a message m, it first sends m to a coordinator.
–The coordinator delivers messages in the order in which it receives them, and periodically informs the other processes of this message delivery order.
–Other processes deliver messages according to this order.
–If the coordinator crashes, another process takes over as coordinator.

Crash Failures (cont’d)
The algorithm satisfies the (non-uniform) Atomic Broadcast specification.
Suppose a coordinator delivers m before m’ and then crashes.
A new coordinator could decide that m’ comes before m.
All correct processes follow the new coordinator.
Thus, relative to the correct processes, the old coordinator delivered messages out of order before it crashed. Total Order still holds, because it constrains only correct processes, but Uniform Total Order is violated.
–Inconsistency can occur even when there are only crash failures.
–Protocols should explicitly prevent inconsistency even when there are only crash failures.
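The coordinator scenario above can be checked mechanically by comparing delivery logs. The Python sketch below is only an illustration: the helper `agree_on_order`, the process names `c1`, `c2`, `p3`, and the hand-written logs are assumptions made for this example and are not part of the slides. It evaluates the ordering condition once over the correct processes only (Total Order) and once over all processes (Uniform Total Order).

```python
from itertools import combinations


def agree_on_order(logs, processes):
    """True iff every two of the given processes deliver their common messages
    in the same relative order. Assumes each message appears at most once per log."""
    for p, q in combinations(sorted(processes), 2):
        pos_q = {m: i for i, m in enumerate(logs[q])}
        # Positions in q's log of the common messages, listed in p's delivery order.
        q_positions = [pos_q[m] for m in logs[p] if m in pos_q]
        if q_positions != sorted(q_positions):
            return False
    return True


# Hypothetical run: c1 is the old coordinator, which crashes after delivering.
logs = {
    "c1": ["m", "m'"],  # old coordinator: delivered m before m', then crashed
    "c2": ["m'", "m"],  # new coordinator: ordered m' before m
    "p3": ["m'", "m"],  # follows the new coordinator
}
correct = {"c2", "p3"}

total_order_holds = agree_on_order(logs, correct)
uniform_total_order_holds = agree_on_order(logs, logs.keys())
print(total_order_holds, uniform_total_order_holds)
```

Restricting the check to `correct` hides the disagreement with the crashed coordinator, which is exactly the gap between the non-uniform and uniform properties.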

LTS Broadcast Algorithm - code for process p_i
Logical Clock Assignment:
  TS[j] ← 0, ∀ j=0,…,n
  pending ← empty
broadcast (m)
  TS[i] ← TS[i] + 1
  send (m, ⟨TS[i], i⟩) to all
upon receive (m, ⟨t, j⟩)
  TS[j] ← t
  add (m, ⟨t, j⟩) to pending
  TS[i] ← max (TS[i], t) + 1
Delivery Rule
  let (m, ⟨t, j⟩) be the entry in pending with the smallest ⟨t, j⟩
  if ⟨t, j⟩ ≤ ⟨TS[k], k⟩ ∀ k=0,…,n then
    deliver (m)
    remove (m, ⟨t, j⟩) from pending
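The LTS pseudocode above can be exercised with a small Python simulation. This is a sketch under stated assumptions rather than the course's reference implementation: processes are numbered 0 to n-1, channels are assumed FIFO, and a broadcaster records its own message in `pending` directly instead of receiving it over the network (otherwise the `TS[j] ← t` step could move its own clock backwards). The class `LtsProcess` and the function `demo` are names invented for this example.

```python
class LtsProcess:
    def __init__(self, i, n):
        self.i, self.n = i, n
        self.ts = [0] * n    # ts[j]: last timestamp seen from p_j; ts[i] is the local clock
        self.pending = []    # entries (t, j, m), compared lexicographically by (t, j)
        self.delivered = []

    def broadcast(self, m):
        self.ts[self.i] += 1
        t = self.ts[self.i]
        self.pending.append((t, self.i, m))  # assumption: own message is recorded locally
        return (m, t, self.i)                # packet to hand to every other process

    def receive(self, packet):
        m, t, j = packet
        self.ts[j] = t
        self.pending.append((t, j, m))
        self.ts[self.i] = max(self.ts[self.i], t) + 1
        self._try_deliver()

    def _try_deliver(self):
        # Delivery rule: the smallest pending <t, j> is delivered once
        # <t, j> <= <ts[k], k> holds for every process k.
        while self.pending:
            t, j, m = min(self.pending)
            if not all((t, j) <= (self.ts[k], k) for k in range(self.n)):
                break
            self.pending.remove((t, j, m))
            self.delivered.append(m)


def demo():
    p = [LtsProcess(i, 3) for i in range(3)]
    a = p[0].broadcast("a")
    b = p[1].broadcast("b")
    p[1].receive(a); p[2].receive(a)
    p[0].receive(b); p[2].receive(b)
    c = p[2].broadcast("c")
    p[0].receive(c); p[1].receive(c)
    d = p[0].broadcast("d")
    e = p[1].broadcast("e")
    p[1].receive(d); p[2].receive(d)
    p[0].receive(e); p[2].receive(e)
    return [proc.delivered for proc in p]


for log in demo():
    print(log)
```

In this run every delivery log is a prefix of the same sequence a, b, c, d, e, so the processes agree on the order. The later messages stay pending because a message is delivered only after sufficiently large timestamps have been seen from every process, which means progress depends on all processes continuing to send.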

Example Exam Question
Does the network deliver all messages in an order that preserves the causality relation (i.e., happens-before)?
Mark the value of the logical clock each time it changes.
Mark the LTS value attached to each message.
State when (at which t) each process delivers messages m1 and m3. If no delivery takes place in the depicted run, indicate this in the table.
[Figure: space-time diagram of processes p1, p2, p3 with messages m1 and m3, annotated with the values ⟨1,2⟩, 2, ⟨3,1⟩, ⟨3,3⟩, 4, 5, ⟨6,2⟩. Caption: Delivery according to LTS]

Vector Clocks
At process p_i, on broadcast(m):
–VC[i] := VC[i] + 1
–use reliable broadcast to send m with VC to all
–deliver m locally
Upon receive m:
–place in message buffer
Deliver m from p_j from buffer if:
–VC[j] = m.VC[j] – 1
–forall k≠j : VC[k] ≥ m.VC[k]
Upon deliver:
–VC[j] := VC[j] + 1
VC[j] is the number of messages of p_j that causally precede p_i’s subsequent messages.
[Callout on slide: FIFO]
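The vector-clock delivery rule above can be tried out with a short Python sketch. It makes a few assumptions that the slide leaves open: processes are numbered 0 to n-1, the network is simulated by passing packets by hand, and a broadcaster counts its own message only once (in `broadcast`) and never receives its own packet. The names `CausalProcess` and `demo` are invented for this example.

```python
class CausalProcess:
    def __init__(self, i, n):
        self.i, self.n = i, n
        self.vc = [0] * n
        self.buffer = []      # packets (m, j, m_vc) received but not yet delivered
        self.delivered = []

    def broadcast(self, m):
        self.vc[self.i] += 1
        self.delivered.append(m)          # deliver m locally
        return (m, self.i, list(self.vc)) # packet carries a copy of the vector clock

    def receive(self, packet):
        self.buffer.append(packet)
        self._drain()

    def _deliverable(self, j, m_vc):
        # m is the next message from p_j, and everything it depends on was delivered.
        return (self.vc[j] == m_vc[j] - 1 and
                all(self.vc[k] >= m_vc[k] for k in range(self.n) if k != j))

    def _drain(self):
        progress = True
        while progress:  # one delivery may unblock other buffered messages
            progress = False
            for packet in list(self.buffer):
                m, j, m_vc = packet
                if self._deliverable(j, m_vc):
                    self.buffer.remove(packet)
                    self.delivered.append(m)
                    self.vc[j] += 1
                    progress = True


def demo():
    p0, p1, p2 = (CausalProcess(i, 3) for i in range(3))
    m1 = p0.broadcast("m1")
    p1.receive(m1)
    m2 = p1.broadcast("m2")  # m1 causally precedes m2
    p2.receive(m2)           # arrives first at p2 and is buffered
    held_back = list(p2.delivered)
    p2.receive(m1)           # m1 is delivered, which then unblocks m2
    return held_back, p2.delivered


print(demo())
```

The first condition in `_deliverable` orders messages from the same sender, while the second makes a process wait for messages from other senders that the incoming message causally depends on.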

Example Exam Question – Cont.
Mark the vector clock that would be attached to each message if vector clocks were used.
State when (at which t) each process delivers messages m1 and m3. If no delivery takes place in the depicted run, indicate this in the table.
[Figure: space-time diagram with messages m1 and m3, annotated with the vectors [0,0,0] [0,1,0] [1,1,0] [0,1,1] [1,1,1] [1,1,0] [1,2,1] [0,1,0] [0,1,1] [1,1,0] [0,1,1] [1,2,1]. Caption: Delivery according to VC]