Fault-Tolerant SemiFast Implementations of Atomic Read/Write Registers
Nicolas Nicolaou, University of Connecticut
Joint work with:
C. Georgiou, University of Cyprus
A. A. Shvartsman, University of Connecticut
11/28/2019
What is an Atomic R/W Register?
[Figure: a register accessed by Write(7), Write(0), and Read operations]
The goal of this work is to investigate efficient implementations of atomic read/write registers. A read/write register supports only two operations: a write stores a value in the register, and a read returns the value written. The register is atomic if all operations performed on it can be ordered sequentially. Achieving atomic consistency on a single register is relatively easy, because only one operation can access the register at each time unit. However, a single copy of the register is a single point of failure, which leaves the system very vulnerable to failures. To increase availability and fault tolerance we consider a distributed read/write register, in which multiple copies of the register are replicated among a set of processes. Some of these replicas may fail, in our case by crashing. The challenge is to maintain atomicity even though, in the distributed environment, the register may be accessed concurrently by more than one process. Essentially, we need to order the operations on the distributed register so that they appear to happen in a sequential order.
Prior Results
- Attiya et al. 1995: Single Writer Multiple Reader (SWMR) model where < 1/2 of the processes may crash
  - Pairs <value, tag> are used for ordering operations
  - Writer: increases the tag and sends <value, tag> to a majority
  - Reader:
    - Phase 1: obtains the maximum tag from a majority
    - Phase 2: propagates the tag to a majority and then returns the value associated with that tag
- Lynch, Shvartsman 1997 and Englert, Shvartsman 2000 extend the above result to MWMR
  - Quorums instead of majorities
  - 2-round protocols for read/write operations
Here we present some prior results in this area. Attiya et al. presented the first implementation of a read/write register in the SWMR model. Their model assumes that a majority of the processes holding the replicas do not crash. They use <value, tag> pairs (which we use as well) to order the write operations. The protocols for the write and read operations are simple. The writer sends the value it wants to write, together with a new tag, to a majority of the processes. For a read operation, the reader obtains the <value, tag> pairs held by a majority of the processes, detects the highest tag among them and, before returning the associated value, propagates the <value, maxTag> pair to a majority. A generalization of this first work suggested the use of quorums instead of majorities. Note that both approaches use 2 communication rounds (2 phases) for a read operation.
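The two-phase read described above can be illustrated with a small single-process simulation. This is only a sketch under simplifying assumptions, not the authors' or Attiya et al.'s code: message passing is replaced by direct method calls, and the names `Replica`, `Writer`, `majority`, and `read` are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Replica:
    tag: int = 0            # tag 0 marks the initial (unwritten) state
    value: object = None
    crashed: bool = False

    def store(self, tag, value):
        # Adopt the pair only if it is newer than the one held locally.
        if tag > self.tag:
            self.tag, self.value = tag, value


def majority(replicas):
    """Pick some majority of live replicas (assumes fewer than half crashed)."""
    alive = [r for r in replicas if not r.crashed]
    need = len(replicas) // 2 + 1
    if len(alive) < need:
        raise RuntimeError("a majority of replicas must be alive")
    return random.sample(alive, need)


class Writer:
    """The single writer: it owns the tag counter."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.tag = 0

    def write(self, value):
        self.tag += 1                          # new, larger tag
        for r in majority(self.replicas):      # one round to a majority
            r.store(self.tag, value)


def read(replicas):
    # Phase 1: collect <tag, value> pairs from a majority, keep the max tag.
    replies = [(r.tag, r.value) for r in majority(replicas)]
    tag, value = max(replies, key=lambda pair: pair[0])
    # Phase 2: propagate that pair to a majority before returning.
    for r in majority(replicas):
        r.store(tag, value)
    return value
```

Because any two majorities intersect, the read in this sketch always sees the latest completed write, even when the write and the read contact different replica sets.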
Fast Implementations
- Dutta, Guerraoui, Levy, Chakraborty 2004
  - SWMR model
  - Single communication round for all write and read operations
  - Requires R < (S/t) - 2
    - R: # readers, S: # servers, t: max # server failures
  - Not applicable to MWMR
A more recent result of Dutta et al., presented at PODC 2004, showed that fast implementations, in which both write and read operations perform a single communication round, are possible in the SWMR model. The main disadvantage of this work is the strict constraint it places on the number of readers in the system: the bound on the number of readers is inversely proportional to the number of failures. As a consequence, the number of readers must be strictly less than the number of servers (replicas) whenever at least one failure may occur. The authors of that work posed the question of whether semifast implementations, with fast reads or fast writes, can relax the bound on the number of readers. This is the question we try to answer in this work, and we provide some analysis of our results.
Question: Can one introduce SemiFast implementations (with fast reads or fast writes) to relax the bound on the number of readers?
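To get a feel for how restrictive the bound R < (S/t) - 2 is, the following sketch computes the largest number of readers it admits for a given number of servers and tolerated failures. The function name `max_fast_readers` is hypothetical; the arithmetic uses exact fractions so that the strict inequality is not blurred by floating-point rounding.

```python
from fractions import Fraction
from math import ceil


def max_fast_readers(servers, max_failures):
    """Largest integer R with R < servers/max_failures - 2.

    Returns 0 when the bound admits no reader at all.
    """
    if max_failures < 1:
        raise ValueError("the bound is stated for at least one failure")
    bound = Fraction(servers, max_failures) - 2
    # The largest integer strictly below `bound` is ceil(bound) - 1.
    return max(0, ceil(bound) - 1)


for s, t in [(10, 1), (10, 2), (10, 3)]:
    print(f"S={s}, t={t}: at most {max_fast_readers(s, t)} readers")
```

With 10 servers, tolerating one failure allows at most 7 readers, two failures allow at most 2, and three failures allow only 1, which shows how quickly the reader bound shrinks as t grows.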
Our Contributions
- Formally define semifast implementations
- Develop a semifast implementation
  - Based on the fast implementation of Dutta et al. 04
  - Introduce the notion of virtual nodes
- Bounds
  - On the number of virtual nodes: V < (S/t) - 2
  - Show that no semifast implementations are possible for MWMR
  - Allow n communication rounds for the reads
- Simulation results
  - A small percentage of read operations require a second communication round.
Semifast Implementations
Def. An implementation I is semifast if it satisfies the following properties (informally):
- All writes are fast
- All complete read operations perform one or two communication rounds
- If a read operation ρ1 performs two communication rounds, then all read operations that precede or succeed ρ1 and return the same* value as ρ1 are fast
- There exists some execution of I which contains only fast read and write operations
* Assuming all written values are unique
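The first three properties are statements about a single execution, so they can be checked mechanically on a recorded trace. The sketch below is an illustration only: the `Op` record and the function `violates_semifast` are hypothetical, operations are modeled as time intervals with a round count, and the fourth property (existence of an all-fast execution) is not checkable on one trace and is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    kind: str      # "read" or "write"
    start: float   # invocation time
    end: float     # response time
    value: int     # value written or returned (written values are unique)
    rounds: int    # communication rounds the operation used


def precedes(a, b):
    # a precedes b when a's response comes before b's invocation.
    return a.end < b.start


def violates_semifast(ops):
    """Return a description of the first violation found, or None."""
    reads = [o for o in ops if o.kind == "read"]
    if any(o.rounds != 1 for o in ops if o.kind == "write"):
        return "a write used more than one round"
    if any(o.rounds not in (1, 2) for o in reads):
        return "a read used neither one nor two rounds"
    for slow in (r for r in reads if r.rounds == 2):
        for other in reads:
            ordered = precedes(slow, other) or precedes(other, slow)
            if ordered and other.value == slow.value and other.rounds != 1:
                return "two ordered reads of the same value both used two rounds"
    return None
```

Note that two concurrent reads of the same value may both be slow without violating the definition; only reads that precede or succeed the slow read are constrained.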
Simulation Results
- NS2 Simulator
- Only 10% of read operations need to perform a 2nd communication round
[Figures: Stochastic Environment; Fixed Interval Environment]
Conclusions
- Definition of semifast implementations:
  - Only one complete read operation has to perform 2 comm. rounds for every write operation
- #Virtual Nodes < (S/t) - 2
- No semifast implementation possible for the MWMR model
References
[P. Dutta, R. Guerraoui, R. R. Levy, and A. Chakraborty. How fast can a distributed atomic read be? In Proceedings of the 23rd Annual ACM Symposium on Principles of Distributed Computing (PODC 2004), pages 236-245, ACM Press, 2004.]
[S. Dolev, S. Gilbert, N. A. Lynch, A. A. Shvartsman, and J. L. Welch. GeoQuorums: Implementing atomic memory in mobile ad-hoc networks. Technical Report LCS-TR-900, MIT, 2003.]
[N. Lynch and A. Shvartsman. RAMBO: A reconfigurable atomic memory service for dynamic networks. In Proceedings of the 16th International Symposium on Distributed Computing, pages 173-190, 2002.]
[H. Attiya, A. Bar-Noy, and D. Dolev. Sharing memory robustly in message-passing systems. Journal of the ACM, January 1995.]
[B. Englert and A. A. Shvartsman. Graceful quorum reconfiguration in a robust emulation of shared memory. In International Conference on Distributed Computing Systems, pages 454-463, 2000.]
[N. A. Lynch and A. A. Shvartsman. Robust emulation of shared memory using dynamic quorum-acknowledged broadcasts. In Symposium on Fault-Tolerant Computing, pages 272-281, 1997.]
Questions?
Atomicity [Lynch96]
[Figure: timelines of valid and invalid executions, each showing read() operations that return 0 or 8 around a write(8) ... ack() operation]
Definition:
- Read and write operations take an arbitrary time to complete
- Each operation appears to occur at some point between its invocation and its response (serialization point)
- The serialization points produce a sequential trace
Definitions
- Each process invokes 1 operation at a time.
- Each operation consists of:
  - Invocation step
  - Matching response step
- Incomplete operation: no matching response for the invocation.
- Complete operation op1 precedes op2 => the response for op1 precedes the invocation for op2.
- If op is a read we write “rd”
- If op is a write we write “wr”
Definitions (Cont.)
- Algorithm implements a register => satisfies termination and atomicity properties
- Termination: every operation by a correct process completes.
- Atomicity (SWMR, wr_k: kth write, writing value val_k):
  - If rd returns x then there is wr_k s.t. val_k = x
  - If wr_k precedes rd and rd returns val_j, then j ≥ k
  - If rd returns val_k then wr_k precedes or is concurrent to rd
  - If rd1 returns val_k and a succeeding rd2 returns val_j, then j ≥ k
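The four SWMR atomicity conditions can be turned into a direct check over a recorded history. The sketch below is illustrative and rests on two assumptions that are not in the slides: operations are modeled as closed time intervals, and a read of the initial value is encoded as index 0. The names `Interval`, `Read`, and `is_atomic_swmr` are hypothetical.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    start: float   # invocation time
    end: float     # response time


class Read(NamedTuple):
    span: Interval
    k: int         # index of the write whose value was returned; 0 = initial value


def precedes(a, b):
    # a precedes b when a's response comes before b's invocation.
    return a.end < b.start


def is_atomic_swmr(writes, reads):
    """writes[k-1] is the interval of the k-th write of the single writer."""
    for rd in reads:
        # Condition 1: the returned value was actually written (or is initial).
        if not 0 <= rd.k <= len(writes):
            return False
        # Condition 2: a read cannot return a value older than a preceding write.
        for k, wr in enumerate(writes, start=1):
            if precedes(wr, rd.span) and rd.k < k:
                return False
        # Condition 3: the write returned must precede or overlap the read.
        if rd.k >= 1 and precedes(rd.span, writes[rd.k - 1]):
            return False
        # Condition 4: a succeeding read cannot return an older value.
        for later in reads:
            if precedes(rd.span, later.span) and later.k < rd.k:
                return False
    return True
```

For example, with a single write spanning times 1 to 5, a read returning the new value during 2 to 3 followed by a read returning the initial value during 4 to 4.5 fails the fourth condition, even though both reads overlap the write.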
Atomic vs Shared Register
- Sequential Register:
  - Accessible from a single process
  - Write(v): stores the value v and returns OK
  - Read(): returns the last value stored
- Atomic Register:
  - A distributed data structure
  - Accessed by multiple processes concurrently
  - Behaves as a sequential register (recall atomicity)
Atomic vs Shared Register (Graphical)
[Figure: Sequential Register (Register=0, Write(8), WriteAck(), Register=8, Read(0), Read(8)) and Atomic Register (Write(8), WriteAck(), Read1(), ReadAck1(8), Read2(), ReadAck2(0))]
When is a SemiFast Implementation Impossible?
- When V ≥ (S/t) - 2:
  - If V ≥ (S/t) - 1, then there is no fast implementation even in the case of a skip-free write operation (violates non-triv. Property 3).
  - If V = (S/t) - 2, then there is an execution where 2 complete read operations need to perform 2 comm. rounds (violates Property 1).
- When V = (S/t) - 2:
  - There exists an execution where 2 read operations return the same value and both perform 2 comm. rounds (violates Prop. 2).
No Semifast for MWMR model
Proof Sketch:
- Split multiple-round operations into:
  - Read phases
  - Write phases
- Show that as soon as an operation performs a write phase, it cannot change its return value.
- Show a construction where W=2, R=2 and t=1 and atomicity is violated.
Challenge
- How fast can a general implementation of an atomic register be?
- Dynamic environment (mobility)
- Hybrid implementations in which some read and write operations perform multiple round trips
- Communication overhead in such impl.?
- Quorum-based algorithms: how fast can they be?