1 © R. Guerraoui Regular register algorithms R. Guerraoui Distributed Programming Laboratory lpdwww.epfl.ch.

Slides:



Advertisements
Similar presentations
1 © R. Guerraoui The Limitations of Registers R. Guerraoui Distributed Programming Laboratory.
Advertisements

CS 542: Topics in Distributed Systems Diganta Goswami.
CS3771 Today: deadlock detection and election algorithms  Previous class Event ordering in distributed systems Various approaches for Mutual Exclusion.
The weakest failure detector question in distributed computing Petr Kouznetsov Distributed Programming Lab EPFL.
A General Characterization of Indulgence R. Guerraoui EPFL joint work with N. Lynch (MIT)
1 © R. Guerraoui The Power of Registers Prof R. Guerraoui Distributed Programming Laboratory.
(c) Oded Shmueli Distributed Recovery, Lecture 7 (BHG, Chap.7)
1 CSIS 7102 Spring 2004 Lecture 8: Recovery (overview) Dr. King-Ip Lin.
1 © R. Guerraoui Implementing the Consensus Object with Timing Assumptions R. Guerraoui Distributed Programming Laboratory.
1 © P. Kouznetsov On the weakest failure detector for non-blocking atomic commit Rachid Guerraoui Petr Kouznetsov Distributed Programming Laboratory Swiss.
Failure Detectors. Can we do anything in asynchronous systems? Reliable broadcast –Process j sends a message m to all processes in the system –Requirement:
1 © R. Guerraoui Object implementations out of faulty base objects Prof R. Guerraoui Distributed Programming Laboratory.
Failure Detectors & Consensus. Agenda Unreliable Failure Detectors (CHANDRA TOUEG) Reducibility ◊S≥◊W, ◊W≥◊S Solving Consensus using ◊S (MOSTEFAOUI RAYNAL)
1 © R. Guerraoui - Shared Memory - R. Guerraoui Distributed Programming Laboratory lpdwww.epfl.ch.
Distributed systems Module 2 -Distributed algorithms Teaching unit 1 – Basic techniques Ernesto Damiani University of Bozen Lesson 3 – Distributed Systems.
 Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 7: Failure Detectors.
Asynchronous Consensus (Some Slides borrowed from ppt on Web.(by Ken Birman) )
CPSC 668Set 9: Fault Tolerant Consensus1 CPSC 668 Distributed Algorithms and Systems Fall 2006 Prof. Jennifer Welch.
CPSC 668Set 9: Fault Tolerant Consensus1 CPSC 668 Distributed Algorithms and Systems Spring 2008 Prof. Jennifer Welch.
CPSC 668Set 16: Distributed Shared Memory1 CPSC 668 Distributed Algorithms and Systems Fall 2006 Prof. Jennifer Welch.
CPSC 668Set 17: Fault-Tolerant Register Simulations1 CPSC 668 Distributed Algorithms and Systems Fall 2009 Prof. Jennifer Welch.
Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 5: Synchronous Uniform.
1 Principles of Reliable Distributed Systems Recitation 8 ◊S-based Consensus Spring 2009 Alex Shraer.
Distributed systems Module 2 -Distributed algorithms Teaching unit 1 – Basic techniques Ernesto Damiani University of Bozen Lesson 4 – Consensus and reliable.
CS 425 / ECE 428 Distributed Systems Fall 2014 Indranil Gupta (Indy) Lecture 19: Paxos All slides © IG.
Distributed Systems Tutorial 4 – Solving Consensus using Chandra-Toueg’s unreliable failure detector: A general Quorum-Based Approach.
Distributed Systems Terminating Reliable Broadcast Prof R. Guerraoui Distributed Programming Laboratory.
 Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 7: Failure Detectors.
Composition Model and its code. bound:=bound+1.
Midterm Exam 06: Exercise 2 Atomic Shared Memory.
Failure detection and consensus Ludovic Henrio CNRS - projet OASIS Distributed Algorithms.
Distributed Algorithms – 2g1513 Lecture 9 – by Ali Ghodsi Fault-Tolerance in Distributed Systems.
Byzantine fault-tolerance COMP 413 Fall Overview Models –Synchronous vs. asynchronous systems –Byzantine failure model Secure storage with self-certifying.
1 Chapter 9 Synchronization Algorithms and Concurrent Programming Gadi Taubenfeld © 2014 Synchronization Algorithms and Concurrent Programming Synchronization.
DISTRIBUTED ALGORITHMS AND SYSTEMS Spring 2014 Prof. Jennifer Welch Set 11: Asynchronous Consensus 1.
CS294, Yelick Consensus revisited, p1 CS Consensus Revisited
Distributed systems Consensus Prof R. Guerraoui Distributed Programming Laboratory.
Chap 15. Agreement. Problem Processes need to agree on a single bit No link failures A process can fail by crashing (no malicious behavior) Messages take.
SysRép / 2.5A. SchiperEté The consensus problem.
1 © R. Guerraoui Distributed algorithms Prof R. Guerraoui Assistant Marko Vukolic Exam: Written, Feb 5th Reference: Book - Springer.
CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS Fall 2011 Prof. Jennifer Welch CSCE 668 Set 16: Distributed Shared Memory 1.
DISTRIBUTED ALGORITHMS Spring 2014 Prof. Jennifer Welch Set 9: Fault Tolerant Consensus 1.
“Towards Self Stabilizing Wait Free Shared Memory Objects” By:  Hopeman  Tsigas  Paptriantafilou Presented By: Sumit Sukhramani Kent State University.
“Distributed Algorithms” by Nancy A. Lynch SHARED MEMORY vs NETWORKS Presented By: Sumit Sukhramani Kent State University.
Distributed Systems Lecture 9 Leader election 1. Previous lecture Middleware RPC and RMI – Marshalling 2.
1 © R. Guerraoui Set-Agreement (Generalizing Consensus) R. Guerraoui.
Fundamentals of Fault-Tolerant Distributed Computing In Asynchronous Environments Paper by Felix C. Gartner Graeme Coakley COEN 317 November 23, 2003.
EEC 688/788 Secure and Dependable Computing Lecture 10 Wenbing Zhao Department of Electrical and Computer Engineering Cleveland State University
The consensus problem in distributed systems
CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS
Atomic register algorithms
CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS
Computing with anonymous processes
Distributed systems Total Order Broadcast
CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS
CS 425 / ECE 428 Distributed Systems Fall 2017 Indranil Gupta (Indy)
EEC 688/788 Secure and Dependable Computing
Distributed systems Reliable Broadcast
CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS
Distributed Algorithms (22903)
EEC 688/788 Secure and Dependable Computing
EEC 688/788 Secure and Dependable Computing
Constructing Reliable Registers From Unreliable Byzantine Components
Distributed Algorithms (22903)
- Atomic register specification -
R. Guerraoui Distributed Programming Laboratory lpdwww.epfl.ch
Computing with anonymous processes
Distributed systems Consensus
Broadcasting with failures
R. Guerraoui Distributed Programming Laboratory lpdwww.epfl.ch
Presentation transcript:

1 © R. Guerraoui Regular register algorithms R. Guerraoui Distributed Programming Laboratory lpdwww.epfl.ch

2 Overview of this lecture (1) Overview of a register algorithm (2) A bogus algorithm (3) A simplistic algorithm (4) A simple fail-stop algorithm (5) A tight asynchronous lower bound (6) A fail-silent algorithm

3 A distributed system P1 P2 P3

4 Shared memory model Registers P1 P3

5 Message passing model A B C

6 Implementing a register  From message passing to shared memory  Implementing the register comes down to implementing Read() and Write() operations at every process

7 Implementing a register  Before returning a Read() value, the process must communicate with other processes  Before performing a Write(), i.e., returning the corresponding ok, the process must communicate with other processes

8 Overview of this lecture (1) Overview of a register algorithm (2) A bogus algorithm (3) A simplistic algorithm (4) A simple fail-stop algorithm (5) A tight asynchronous lower bound (6) A fail-silent algorithm

9 A bogus algorithm  We assume that channels are reliable (perfect point to point links)  Every process pi holds a copy of the register value vi

10 A bogus algorithm  Read() at pi Return vi  Write(v) at pi vi := v Return ok  The resulting register is live but not safe: Even in a sequential and failure-free execution, a Read() by pj might not return the last written value, say by pi

11 P2 P1 W(5) W(6) R 1 () -> 0R 2 () -> 0R 3 () -> 0 No safety

12 Overview of this lecture (1) Overview of a register algorithm (2) A bogus algorithm (3) A simplistic algorithm (4) A simple fail-stop algorithm (5) A tight asynchronous lower bound (6) A fail-silent algorithm

13 A simplistic algorithm  We still assume that channels are reliable but now we also assume that no process fails  Basic idea: one process, say p1, holds the value of the register

14 A simplistic algorithm  Read() at pi send [R] to p1 when receive [v] Return v  Write(v) at pi send [W,v] to p1 when receive [ok] Return ok  At p1: T1: when receive [R] from pi send [v1] to pi T2: when receive [W,v] from pi v1 := v send [ok] to pi

15 Correctness (liveness)  By the assumption that (a) no process fails, (b) channels are reliable no wait statement blocks forever, and hence every invocation eventually terminates

16 Correctness (safety) (a) If there is no concurrent or failed operation, a Read() returns the last value written (b) A Read() must return some value written NB. If a Read() returns a value written by a given Write(), and another Read() that starts later returns a value written by a different Write(), then the second Write() cannot start after the first Write() terminates

17 Correctness (safety – 1) (a) If there is no concurrent or failed operation, a Read() returns the last value written Assume a Write(x) terminates and no other Write() is invoked. The value of the register is hence x at p1. Any subsequent Read() invocation by some process pj returns the value of p1, i.e., x, which is the last written value

18 Correctness (safety – 2) (b) A Read() returns the previous value written or the value concurrently written Let x be the value returned by a Read(): by the properties of the channels, x is the value of the register at p1. This value does necessarily come from a concurrent or from the last Write().

19 What if?  Processes might crash?  If p1 crashes, then the register is not live (wait- free)  If p1 is always up, then the register is regular and wait-free

20 Overview of this lecture (1) Overview of a register algorithm (2) A bogus algorithm (3) A simplistic algorithm (4) A simple fail-stop algorithm (5) A tight asynchronous lower bound (6) A fail-silent algorithm

21 A fail-stop algorithm We assume a fail-stop model; more precisely: any number of processes can fail by crashing (no recovery) channels are reliable failure detection is perfect (we have a perfect failure detector)

22 We implement a regular register every process pi has a local copy of the register value vi every process reads locally the writer writes globally, i.e., at all (non- crashed) processes A fail-stop algorithm

23 A fail-stop algorithm Write(v) at pi send [W,v] to all for every pj, wait until either: receive [ack] or suspect [pj] Return ok At pi: when receive [W,v] from pj vi := v send [ack] to pj Read() at pi Return vi

24 Correctness (liveness) A Read() is local and eventually returns A Write() eventually returns, by the (a) the strong completeness property of the failure detector, and (b) the reliability of the channels

25 Correctness (safety – 1) (a) In the absence of concurrent or failed operation, a Read() returns the last value written Assume a Write(x) terminates and no other Write() is invoked. By the accuracy property of the failure detector, the value of the register at all processes that did not crash is x. Any subsequent Read() invocation by some process pj returns the value of pj, i.e., x, which is the last written value

26 Correctness (safety – 2) (b) A Read() returns the value concurrently written or the last value written Let x be the value returned by a Read(): by the properties of the channels, x is the value of the register at some process. This value does necessarily come from the last or a concurrent Write().

27 What if? Failure detection is not perfect Can we devise an algorithm that implements a regular register and tolerates an arbitrary number of crash failures?

28 Overview of this lecture (1) Overview of a register algorithm (2) A bogus algorithm (3) A simplistic algorithm (4) A simple fail-stop algorithm (5) A tight asynchronous lower bound (6) A fail-silent algorithm

29 Lower bound  Proposition: any wait-free asynchronous implementation of a regular register requires a majority of correct processes  Proof (sketch): assume a Write(v) is performed and n/2 processes crash, then a Read() is performed and the other n/2 processes are up: the Read() cannot see the value v  The impossibility holds even with a 1-1 register (one writer and one reader)

30 The majority algorithm [ABD95]  We assume that p1 is the writer and any process can be reader  We assume that a majority of the processes is correct (the rest can fail by crashing – no recovery)  We assume that channels are reliable  Every process pi maintains a local copy of the register vi, as well as a sequence number sni and a read timestamp rsi  Process p1 maintains in addition a timestamp ts1

31 Algorithm - Write()  Write(v) at p1 ts1++ send [W,ts1,v] to all when receive [W,ts1,ack] from majority Return ok  At pi when receive [W,ts1, v] from p1 If ts1 > sni then l vi := v l sni := ts1 l send [W,ts1,ack] to p1

32 Algorithm - Read()  Read() at pi rsi++ send [R,rsi] to all when receive [R, rsi,snj,vj] from majority v := vj with the largest snj Return v  At pi when receive [R,rsj] from pj send [R,rsj,sni,vi] to pj

33 What if? Any process that receives a write message (with a timestamp and a value) updates its value and sequence number, i.e., without checking if it actually has an older sequence number

34 P1 W(5) W(6) P2 sn1 = 1; v1 = 5 Old writes P3 sn2 = 1; v2 = 5 sn3 = 2; v3 = 6 sn3 = 1; v3 = 5 R() -> 5 sn1 = 2; v1 = 6

35 Correctness 1 Liveness: Any Read() or Write() eventually returns by the assumption of a majority of correct processes (if a process has a newer timestamp and does not send [W,ts1,ack], then the older Write() has already returned) Safety 2: By the properties of the channels, any value read is the last value written or the value concurrently written

36 Correctness 2 (safety – 1) (a) In the absence of concurrent or failed operation, a Read() returns the last value written Assume a Write(x) terminates and no other Write() is invoked. A majority of the processes have x in their local value, and this is associated with the highest timestamp in the system. Any subsequent Read() invocation by some process pj returns x, which is the last written value

37 What if? Multiple processes can write concurrently?

38 P1 W(5) W(6) P2 ts1 = 1 Concurrent writes P3 R() -> 6 ts1 = 2 W(1) ts3 = 1

39 Overview of this lecture (1) Overview of a register algorithm (2) A bogus algorithm (3) A simplistic algorithm (4) A simple fail-stop algorithm (5) A tight asynchronous lower bound (6) A fail-silent algorithm