“Towards Self-Stabilizing Wait-Free Shared Memory Objects” By: Hoepman, Tsigas, Papatriantafilou. Presented by: Sumit Sukhramani, Kent State University.

1. Two types of faulty systems:  Processors fail  The memory is faulty. Processors fail: the distributed system must remain functional even if a certain fraction of the processors malfunction. In the construction of shared memory objects such as atomic objects, wait-free constructions guarantee that any operation executed by a single processor completes even if all other processors crash. Memory is faulty: the system is required to overcome arbitrary changes to its state within a bounded amount of time; a system that does so is called self-stabilizing. A shared memory object is a data structure stored in shared memory which may be accessed concurrently by several processors in the system through the invocation of operations on the object.

As I proceed I’ll be covering the following topics:  Defining self-stabilizing wait-free shared memory objects. Such an object ensures that all operations after a transient error will eventually behave according to their specification.  Showing that within this framework one cannot construct a self-stabilizing single-reader single-writer regular bit from single-reader single-writer safe bits.  Showing how self-stabilizing dual-reader single-writer safe bits can be used to construct single-reader single-writer regular bits. 1. Defining self-stabilizing wait-free objects: Shared registers are shared objects reminiscent of ordinary variables, but which can be read or written by different processors concurrently. Registers are constructed from memory cells (sub-registers) together with a set of read and write procedures. There are different levels of consistency for shared memory objects: a) Safe: a read returns the most recently written value, unless the read is concurrent with a write, in which case the returned value may be arbitrary. b) Regular: a read returns the value written by a concurrent or immediately preceding write. c) Atomic: all operations appear to take effect instantaneously.
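The three consistency levels above can be contrasted with a small toy model. The sketch below (illustrative only, not from the paper) computes which values a single read may legally return, given the value of the last non-overlapping write and the values of writes overlapping the read:

```python
# Toy model of the safe / regular / atomic consistency levels for registers.
# possible_returns() gives the set of values a single read may legally return.
# The function name and operation model are illustrative assumptions.

def possible_returns(level, preceding_value, overlapping_values, domain):
    """Values a read may return under 'safe', 'regular', or 'atomic' semantics."""
    if level == "safe":
        # Overlap with any write makes the result arbitrary over the domain.
        if overlapping_values:
            return set(domain)
        return {preceding_value}
    if level in ("regular", "atomic"):
        # Any feasible write: the immediately preceding one or an overlapping one.
        # (Atomicity constrains a single isolated read no further than regularity;
        # its extra strength only shows up across sequences of operations.)
        return {preceding_value} | set(overlapping_values)
    raise ValueError(level)

# A binary register: last write wrote 0, one overlapping write also writes 0.
print(possible_returns("safe", 0, [0], domain=(0, 1)))     # {0, 1}
print(possible_returns("regular", 0, [0], domain=(0, 1)))  # {0}
```

This makes concrete the gap the regular-bit construction below must close: a safe bit may return either value during an overlapping write, even when all feasible writes agree.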

Linearizability: Although operations of concurrent processes may overlap, each operation appears to take effect instantaneously at some point between its invocation and response. The desired behavior of the object is described by its sequential specification S. This specifies the states of the object and, for each operation, its effect on the state of the object and its response: (s, r = O(p), s′) ∈ S if invoking O with parameters p in state s changes the state of the object to s′ and returns r as its response. Definition 1: An object is linearizable w.r.t. its sequential specification S if all possible runs over the object are linearizable w.r.t. S. This means that an object is linearizable w.r.t. specification S if all actions appear to take effect instantaneously and act according to S. A run over an object is a tuple of actions A with a partial ordering → such that for A, B ∈ A, A → B iff the response of A occurred before the invocation of B. Runs may have infinite length and capture the real-time ordering between actions invoked by the processors. Definition 2: A shared wait-free object is self-stabilizing if an arbitrary fair execution (in which all operations on all processors are executed infinitely often), started in an arbitrary state, is linearizable except for a finite prefix. Because this definition gives no bound on the length of the finite prefix (the stabilization time) and requires the processors to repair the transient fault, we use a stronger definition. Definition 3: A shared wait-free object is self-stabilizing if an arbitrary execution started in an arbitrary state is linearizable except for a bounded finite prefix.
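Definition 1 can be illustrated with a brute-force check on tiny histories of a single read/write register, whose sequential specification is "a read returns the most recently written value". The operation encoding and function names below are illustrative assumptions, not from the paper:

```python
# Brute-force linearizability check for tiny register histories.
# Each operation is (invocation_time, response_time, kind, value).
# We search for a total order that extends the real-time partial order ->
# and is legal w.r.t. the register's sequential specification.

from itertools import permutations

def precedes(a, b):
    return a[1] < b[0]          # a's response occurred before b's invocation

def respects_realtime(order):
    # No later operation in the order may precede an earlier one in real time.
    return all(not precedes(order[j], order[i])
               for i in range(len(order)) for j in range(i + 1, len(order)))

def sequentially_legal(order, initial=0):
    state = initial
    for (_, _, kind, value) in order:
        if kind == "write":
            state = value
        elif value != state:    # a read must return the current state
            return False
    return True

def linearizable(history):
    return any(respects_realtime(order) and sequentially_legal(order)
               for order in permutations(history))

# write(1) overlaps a read returning 1: linearizable.
h1 = [(0, 3, "write", 1), (1, 2, "read", 1)]
# the read completes before the write begins, yet returns 1: not linearizable.
h2 = [(2, 3, "write", 1), (0, 1, "read", 1)]
print(linearizable(h1), linearizable(h2))  # True False
```

Definitions 2 and 3 then weaken this check to require only that some suffix of the run (after a finite, respectively bounded, prefix) pass it.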

Definition 4: A run is linearizable w.r.t. sequential specification S after k processor actions if there exists a corresponding sequential execution whose total order extends the run's partial order →, such that for all i, if count(A_i) > k then (s_i, A_i, s_{i+1}) ∈ S. Here k is the delay of the implementation. This definition guarantees that all actions following arbitrarily behaving actions will agree on how those actions actually behaved. For example, all reads that occur immediately after a transient error should return the same value. Definition 5: An implementation of a shared object specified by sequential specification S is called k-stabilizing wait-free if all its runs starting in an arbitrary initial state are linearizable w.r.t. S after k processor actions, and all actions complete within a bounded number of steps. In this definition the delay k of the implementation is taken to be independent of the type of operations performed by a processor. Definition 6: A register is k-stabilizing wait-free if, for all its runs starting in an arbitrary initial state, only the first k actions performed by a processor after the error behave arbitrarily, and all actions finish within a bounded number of steps. Such a register is safe if every other read action R not concurrent with a write action returns the value written by the write W with W → R. Such a register is regular if all other read actions return the value written by a feasible write.

0-Stabilizing 1W1R safe bits are not strong enough. Conventionally it is taken that an adversary can choose a schedule, thereby forcing the protocol to behave incorrectly. The adversary is in control of:  scheduling the process steps in a run;  choosing the configuration of the system after a transient error. The writer (reader) does not have read access to the sub-registers it can write; to know their contents and to settle into correct stabilized behavior, it has to rely on local information or on information passed through the shared registers that the reader can write. The adversary can set the system in an inconsistent state and thereby mislead the processes into incorrect behavior. Theorem: There exists no deterministic implementation of a wait-free k-stabilizing 1W1R binary regular register using 1W1R binary 0-stabilizing safe sub-registers. (Proof by contradiction.)

Constructing a 0-Stabilizing 1W1R Regular Bit. We construct a 0-stabilizing 1W1R regular bit from a 0-stabilizing 1W2R safe bit, similar to Lamport's construction of a regular bit. For a binary register, the only difference between a regular and a safe register is that if all feasible writes of a read operation write the same value a, and at least one such write overlaps with the read, then a safe bit may return either a or ¬a, whereas a regular bit must return a. If the processor maintains a local copy of the last written value and only writes a new value if it differs from the previous value stored in the register, the safe bit behaves like a regular register. This simple approach cannot be applied here, however: due to a transient error, the local copy of the last written value may be out of sync with the actual value present in the register. Instead, the writer uses its read access to the 1W2R safe bit to consult the register's actual contents. Theorem: The above protocol implements a wait-free 0-stabilizing 1W1R regular binary register using one 0-stabilizing 1W2R safe binary register.
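The construction can be sketched in simulation. The key point, per the text above, is that the writer reads the shared safe bit itself rather than a corruptible local copy, and writes only when the value changes. The `SafeBit` class below is an illustrative model of safe-bit semantics, not the paper's pseudocode:

```python
# Sketch of the regular-bit construction: one 1W2R safe bit S, where the
# writer is the second reader. A read of S overlapping a write of S may
# return an arbitrary value; the simulation models this with randomness.

import random

class SafeBit:
    """Simulated 1W2R safe bit: a read concurrent with a write is arbitrary."""
    def __init__(self):
        self.value = random.randint(0, 1)   # arbitrary post-transient-error state
        self.writing = False

    def read(self):
        if self.writing:                    # concurrent with a write: arbitrary
            return random.randint(0, 1)
        return self.value

    def write(self, v):
        self.writing = True
        self.value = v
        self.writing = False

class RegularBit:
    """0-stabilizing 1W1R regular bit built from one 1W2R safe bit."""
    def __init__(self):
        self.S = SafeBit()

    def write(self, v):
        if self.S.read() != v:              # writer consults S, not a local copy
            self.S.write(v)                 # write only when the value changes

    def read(self):
        return self.S.read()

r = RegularBit()
r.write(1)
print(r.read())  # 1 (no write is in progress, so the safe bit is not arbitrary)
```

Because the writer skips the physical write when S already holds the new value, a read that overlaps only value-preserving writes never actually overlaps a write of S, so it returns the common value, as regularity demands.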

Proof: We have to show that all reads return the value written by a feasible write.  Suppose a read RR returns the value a; since the register is binary, this can be wrong only if all feasible writes write ¬a.  Then the write WR with WR → RR writes ¬a, and all writes WR′ concurrent with RR write ¬a; hence we have WR → WR′.  WR and WR′ read S without interference from a concurrent write.  Either WR reads ¬a from S and hence does not write to S, or WR reads a from S and hence writes ¬a to S.  Continuing this way we see that none of the writes WR′ concurrent with RR writes S.  Hence RR reads S without interference, and as WR → RR left ¬a in S, RR reads ¬a from S and returns ¬a.  This is a contradiction.

Constructing a 0-Stabilizing 1W1R l-Ary Regular Register. Here we see the construction of a 0-stabilizing 1W1R l-ary regular register from l 0-stabilizing 1W1R regular bits. This protocol is also similar to Lamport's regular multi-valued register. The difference here is that we take into account that a transient error may cause all bits to be set to 0; in that case a default value of l−1 is returned.
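A simulation sketch in the style of Lamport's multi-valued construction (the encoding details below are an assumption based on that construction): value v is represented by bit v being the lowest set bit; the writer sets bit v and then clears the bits below it; the reader scans upward for the first set bit, returning the default l−1 if a transient error has cleared every bit:

```python
# Sketch of the l-ary regular register from l regular bits (Lamport-style
# unary encoding), with the self-stabilizing default described in the text.

class LAryRegularRegister:
    def __init__(self, l):
        self.l = l
        self.bits = [0] * l                 # l 0-stabilizing 1W1R regular bits

    def write(self, v):
        self.bits[v] = 1                    # set bit v first...
        for i in range(v - 1, -1, -1):      # ...then clear bits v-1 down to 0
            self.bits[i] = 0

    def read(self):
        for i in range(self.l):             # scan upward for the first set bit
            if self.bits[i] == 1:
                return i
        return self.l - 1                   # all bits 0 (transient error): default

reg = LAryRegularRegister(4)
print(reg.read())   # 3: the all-zero state yields the default l-1
reg.write(2)
print(reg.read())   # 2
```

Setting bit v before clearing the lower bits ensures a concurrent reader always finds some set bit written by a feasible write, which is what the lemma below formalizes.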

Lemma: If WR_i(1) →| RR_i, then RR_i returns 1 or there exists a j > i and a write WR′(j) such that WR′_j(1) →| RR_{i+1}. Moreover, if WR_i(1) →| RR_i then WR(i) → WR′(j). Theorem: Protocol 2 implements a 0-stabilizing 1W1R l-ary regular register using l 0-stabilizing 1W1R regular binary registers. Proof: By Definition 6 we have to show that all reads return the value written by a feasible write. If for a read RR, RR_i returns 0 for all i, then RR returns l−1. Applying the above lemma inductively, starting at i = l−1, we conclude that for no i is there a WR_i(1) such that WR_i(1) →| RR_i. Constructing a 1-Stabilizing nWnR l-Ary Atomic Register: Here we construct a 1-stabilizing nWnR l-ary atomic register from 1W1R ∞-ary 0-stabilizing regular registers. In this protocol, V is the domain of values written and read by the multi-writer register. The construction uses n² 0-stabilizing regular registers R_ij, written by processor i and read by processor j. These registers store a label consisting of an unbounded tag, a processor id with values in the domain {1, …, n}, and a value in V. Labels are lexicographically ordered by ≤.
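The label machinery above can be sketched with plain tuples, whose built-in lexicographic comparison matches the ordering described. How labels are chosen here (take the maximum label seen and bump its tag with one's own id) is a common pattern for such tag-based constructions and is an illustrative assumption; the function names are hypothetical:

```python
# Sketch of the (tag, processor_id, value) labels used by the n-writer
# construction. Python tuples compare lexicographically, matching the
# ordering <= on labels described in the text.

def new_label(seen_labels, my_id, value):
    """Pick a label strictly larger than every label this writer has read."""
    max_tag, _, _ = max(seen_labels)        # lexicographic max over labels
    return (max_tag + 1, my_id, value)

def read_value(seen_labels):
    """A reader returns the value carried by the maximal label it sees."""
    return max(seen_labels)[2]

labels = [(0, 1, "a"), (2, 3, "b"), (2, 2, "c")]
print(read_value(labels))                   # 'b': (2, 3, 'b') is the maximal label
lbl = new_label(labels, my_id=1, value="d")
print(lbl)                                  # (3, 1, 'd'), larger than all of the above
```

The unbounded tag gives a total order on writes, and the processor id breaks ties between writers that chose the same tag concurrently.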