1 © R. Guerraoui Distributed algorithms Prof R. Guerraoui Assistant Marko Vukolic Exam: Written, Feb 5th Reference: Book - Springer.

Slides:



Advertisements
Similar presentations
Fault Tolerance. Basic System Concept Basic Definitions Failure: deviation of a system from behaviour described in its specification. Error: part of.
Advertisements

Global States.
Impossibility of Distributed Consensus with One Faulty Process
DISTRIBUTED SYSTEMS II FAULT-TOLERANT BROADCAST Prof Philippas Tsigas Distributed Computing and Systems Research Group.
A General Characterization of Indulgence R. Guerraoui EPFL joint work with N. Lynch (MIT)
Failure Detection The ping-ack failure detector in a synchronous system satisfies – A: completeness – B: accuracy – C: neither – D: both.
6.852: Distributed Algorithms Spring, 2008 Class 7.
Distributed Systems Overview Ali Ghodsi
P. Kouznetsov, 2006 Abstracting out Byzantine Behavior Peter Druschel Andreas Haeberlen Petr Kouznetsov Max Planck Institute for Software Systems.
An evaluation of ring-based algorithms for the Eventually Perfect failure detector class Joachim Wieland Mikel Larrea Alberto Lafuente The University of.
1 © P. Kouznetsov On the weakest failure detector for non-blocking atomic commit Rachid Guerraoui Petr Kouznetsov Distributed Programming Laboratory Swiss.
Byzantine Generals Problem: Solution using signed messages.
Failure Detectors. Can we do anything in asynchronous systems? Reliable broadcast –Process j sends a message m to all processes in the system –Requirement:
CS 582 / CMPE 481 Distributed Systems Fault Tolerance.
Distributed systems Module 2 -Distributed algorithms Teaching unit 1 – Basic techniques Ernesto Damiani University of Bozen Lesson 3 – Distributed Systems.
 Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 7: Failure Detectors.
Asynchronous Consensus (Some Slides borrowed from ppt on Web.(by Ken Birman) )
Teaching material based on Distributed Systems: Concepts and Design, Edition 3, Addison-Wesley Copyright © George Coulouris, Jean Dollimore, Tim.
Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 5: Synchronous Uniform.
1 Principles of Reliable Distributed Systems Lecture 5: Failure Models, Fault-Tolerant Broadcasts and State-Machine Replication Spring 2005 Dr. Idit Keidar.
Distributed systems Module 2 -Distributed algorithms Teaching unit 1 – Basic techniques Ernesto Damiani University of Bozen Lesson 4 – Consensus and reliable.
 Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 6: Impossibility.
Aran Bergman, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Recitation 5: Reliable.
Cloud Computing Concepts
Distributed Systems Tutorial 4 – Solving Consensus using Chandra-Toueg’s unreliable failure detector: A general Quorum-Based Approach.
Distributed Systems Terminating Reliable Broadcast Prof R. Guerraoui Distributed Programming Laboratory.
 Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 7: Failure Detectors.
Efficient Algorithms to Implement Failure Detectors and Solve Consensus in Distributed Systems Mikel Larrea Departamento de Arquitectura y Tecnología de.
Composition Model and its code. bound:=bound+1.
State Machines CS 614 Thursday, Feb 21, 2002 Bill McCloskey.
1 A Modular Approach to Fault-Tolerant Broadcasts and Related Problems Author: Vassos Hadzilacos and Sam Toueg Distributed Systems: 526 U1580 Professor:
Distributed Consensus Reaching agreement is a fundamental problem in distributed computing. Some examples are Leader election / Mutual Exclusion Commit.
Failure detection and consensus Ludovic Henrio CNRS - projet OASIS Distributed Algorithms.
Lecture 8-1 Computer Science 425 Distributed Systems CS 425 / CSE 424 / ECE 428 Fall 2010 Indranil Gupta (Indy) September 16, 2010 Lecture 8 The Consensus.
Distributed Algorithms – 2g1513 Lecture 9 – by Ali Ghodsi Fault-Tolerance in Distributed Systems.
Consensus and Its Impossibility in Asynchronous Systems.
1 © R. Guerraoui Regular register algorithms R. Guerraoui Distributed Programming Laboratory lpdwww.epfl.ch.
Agenda Fail Stop Processors –Problem Definition –Implementation with reliable stable storage –Implementation without reliable stable storage Failure Detection.
Prepared By: Md Rezaul Huda Reza
Copyright © George Coulouris, Jean Dollimore, Tim Kindberg This material is made available for private study and for direct.
CS 425/ECE 428/CSE424 Distributed Systems (Fall 2009) Lecture 9 Consensus I Section Klara Nahrstedt.
Distributed systems Consensus Prof R. Guerraoui Distributed Programming Laboratory.
Sliding window protocol The sender continues the send action without receiving the acknowledgements of at most w messages (w > 0), w is called the window.
Several sets of slides by Prof. Jennifer Welch will be used in this course. The slides are mostly identical to her slides, with some minor changes. Set.
SysRép / 2.5A. SchiperEté The consensus problem.
Feb 15, 2001CSCI {4,6}900: Ubiquitous Computing1 Announcements.
Alternating Bit Protocol S R ABP is a link layer protocol. Works on FIFO channels only. Guarantees reliable message delivery with a 1-bit sequence number.
Distributed Agreement. Agreement Problems High-level goal: Processes in a distributed system reach agreement on a value Numerous problems can be cast.
Fundamentals of Fault-Tolerant Distributed Computing In Asynchronous Environments Paper by Felix C. Gartner Graeme Coakley COEN 317 November 23, 2003.
Unreliable Failure Detectors for Reliable Distributed Systems Tushar Deepak Chandra Sam Toueg Presentation for EECS454 Lawrence Leinweber.
1 AGREEMENT PROTOCOLS. 2 Introduction Processes/Sites in distributed systems often compete as well as cooperate to achieve a common goal. Mutual Trust/agreement.
Coordination and Agreement
CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS
Distributed systems Total Order Broadcast
Alternating Bit Protocol
Agreement Protocols CS60002: Distributed Systems
Parallel and Distributed Algorithms
Outline Announcements Fault Tolerance.
CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS
EEC 688/788 Secure and Dependable Computing
Distributed systems Reliable Broadcast
EEC 688/788 Secure and Dependable Computing
CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS
Chapter 5 (through section 5.4)
EEC 688/788 Secure and Dependable Computing
EEC 688/788 Secure and Dependable Computing
EEC 688/788 Secure and Dependable Computing
EEC 688/788 Secure and Dependable Computing
Distributed systems Consensus
Distributed Systems Terminating Reliable Broadcast
Presentation transcript:

1 © R. Guerraoui Distributed algorithms Prof R. Guerraoui Assistant Marko Vukolic Exam: Written, Feb 5th Reference: Book - Springer Verlag - - Introduction to Reliable Distributed Programming -

2 Algorithms (history) M. Al-Khawarizmi ~9th century: inventor of the zero, the decimal system, Arithmetic and Algebra Calculated the circumference and volume of planets (including the earth): the first significant program

3 In short We study algorithms for distributed systems: a new way of thinking about algorithms Whereas a centralized algorithm is the soul of a computer, a distributed algorithm is the soul of a society of computers

4 Distributed algorithms (history) E. Dijkstra (concurrent os)~60’s L. Lamport: ‘‘a distributed system is one that stops your application because a machine you have never heard from crashed’’ ~70’s J. Gray (transactions) ~70’s N. Lynch (consensus) ~80’s Birman, Schneider, Toueg – Cornell – (this course) ~90’s

5 Important This course is complementary to the optional course (selected topics in distributed computing) We study here message passing based algorithms whereas the other course focuses on shared memory based algorithms

6 Overview (1) Why? Motivation (2) Where? Between the network and the application (3) How? (3.1) Specifications, (3.2) assumptions, and (3.3) algorithms

7 A distributed system A B C

8 Clients-server Client B Client A Server

9 Multiple servers (genuine distribution) Server A Server B Server C

10 Applications Military and traffic control Finances: e-transactions, e-banking, stock-exchange Reservation systems

11 The optimistic view Concurrency => speed (load-balancing) Partial failures => high-availability

12 The pessimistic view  Concurrency (interleaving) => incorrectness  Partial failures => incorrectness

13 Overview (1) Why? Motivation (2) Where? Between the network and the application (3) How? (3.1) Specifications, (3.2) assumptions, and (3.3) algorithms

14 Distributed systems

15 Distributed systems The application needs underlying services for distributed interaction The network is not enough Reliability guarantees (e.g., TCP) are only offered for communication among pairs of processes, i.e., one- to-one communication (client-server)

16 Reliable broadcast Causal order broadcast Shared memory Consensus Total order broadcast Atomic commit Leader election Terminating reliable broadcast Content of this course

17 Reliable distributed services Example 1: reliable broadcast Ensure that a message sent to a group of processes is received (delivered) by all or none Example 2: atomic commit Ensure that the processes reach a common decision on whether to commit or abort a transaction

18 Underlying services (1): processes (abstracting computers) (2): channels (abstracting networks) (3): failure detectors (abstracting time)

19 Processes  The distributed system is made of a finite set of processes: each process models a sequential program  Processes are denoted by p1,..pN or p, q, r  Processes have unique identities and know each other  Every pair of processes is connected by a link through which the processes exchange messages

20 Processes A process executes a step at every tick of its local clock: a step consists of A local computation (local event) and message exchanges with other processes (global event) NB. One message is delivered from/sent to a process per step

21 Processes The program of a process is made of a finite set of modules (or components) organized as a software stack Modules within the same process interact by exchanging events upon event do // something trigger

22 Modules of a process request(deliver) indication request(deliver) indication request(deliver) indication

23 Overview (1) Why? Motivation (2) Where? Between the network and the application (3) How? (3.1) Specifications, (3.2) assumptions, and (3.3) algorithms

24 Approach Specifications: What is the service? i.e., the problem ~ liveness + safety Assumptions: What is the model, i.e., the power of the adversary? Algorithms: How do we implement the service? Where are the bugs (proof)? What cost?

25 Overview (1) Why? Motivation (2) Where? Between the network and the application (3) How? (3.1) Specifications, (3.2) assumptions, and (3.3) algorithms

26 Liveness and safety Safety is a property which states that nothing bad should happen Liveness is a property which states that something good should happen Any specification can be expressed in terms of liveness and safety properties (Lamport and Schneider)

27 Liveness and safety Example: Tell the truth Having to say something is liveness Not lying is safety

28 Specifications Example 1: reliable broadcast Ensure that a message sent to a group of processes is received by all or none Example 2: atomic commit Ensure that the processes reach a common decision on whether to commit or abort a transaction

29 Overview (1) Why? Motivation (2) Where? Between the network and the application (3) How? (3.1) Specifications, (3.2) assumptions, and (3.3) algorithms

30 Overview (1) Why? Motivation (2) Where? Between the network and the application (3) How? (3.1) Specifications, (3.2) assumptions, and (3.3) algorithms Assumptions on processes and channels Failure detection

31 Processes  A process either executes the algorithm assigned to it (steps) or fails  Two kinds of failures are mainly considered: Omissions: the process omits to send messages it is supposed to send (distracted) Arbitrary: the process sends messages it is not supposed to send (malicious or Byzantine)

32 Processes Crash-stop: a more specific case of omissions A process that omits a message to a process, omits all subsequent messages to all processes (permanent distraction): it crashes

33 Processes By default, we shall assume a crash-stop model throughout this course; that is, unless specified otherwise: processes fail only by crashing (no recovery) A correct process is a process that does not fail (that does not crash)

34 Processes communicate by message passing through communication channels Messages are uniquely identified and the message identifier includes the sender’s identifier Processes/Channels

35 Fair-loss links FL1. Fair-loss: If a message is sent infinitely often by pi to pj, and neither pi or pj crashes, then m is delivered infinitely often by pj FL2. Finite duplication: If a message is sent a finite number of times by pi to pj, it is delivered a finite number of times by pj FL3. No creation: No message is delivered unless it was sent

36 Stubborn links SL1. Stubborn delivery: if a process pi sends a message m to a correct process pj, and pi does not crash, then pj delivers m an infinite number of times SL2. No creation: No message is delivered unless it was sent

37 Algorithm (sl) Implements: StubbornLinks (sp2p). Uses: FairLossLinks (flp2p). upon event do while (true) do trigger ; upon event do trigger ;

38 Reliable (Perfect) links Properties PL1. Validity: If pi and pj are correct, then every message sent by pi to pj is eventually delivered by pj PL2. No duplication: No message is delivered (to a process) more than once PL3. No creation: No message is delivered unless it was sent

39 Algorithm (pl) Implements: PerfectLinks (pp2p). Uses: StubbornLinks (sp2p). upon event do delivered := empty; upon event do trigger ; upon event do if m  delivered then trigger ; add m to delivered;

40 Reliable links We shall assume reliable links (also called perfect) throughout this course (unless specified otherwise) Roughly speaking, reliable links ensure that messages exchanged between correct processes are not lost

41 Overview (1) Why? Motivation (2) Where? Between the network and the application (3) How? (3.1) Specifications, (3.2) assumptions, and (3.3) algorithms Processes and links Failure Detection

42 Failure Detection A failure detector is a distributed oracle that provides processes with suspicions about crashed processes It is implemented using (i.e., it encapsulates) timing assumptions According to the timing assumptions, the suspicions can be accurate or not

43 Failure Detection A failure detector module is defined by events and properties Events Indication: Properties: Completeness Accuracy

44 Failure Detection Perfect: Strong Completeness: Eventually, every process that crashes is permanently suspected by every correct process Strong Accuracy: No process is suspected before it crashes Eventually Perfect: Strong Completeness Eventual Strong Accuracy: Eventually, no correct process is ever suspected

45 Failure Detection Implementation: (1) Processes periodically exchange heartbeat messages (2) A process sets a timeout based on worst case round trip of a message exchange (3) A process suspects another process if it timeouts that process (4) A process that delivers a message from a suspected process revises its suspicion and increases its time-out

46 Timing assumptions Synchronous: Processing: the time it takes for a process to execute a step is bounded and known Delays: there is a known upper bound limit on the time it takes for a message to be received Clocks: the drift between a local clock and the global real time clock is bounded and known Eventually Synchronous: the timing assumptions hold eventually Asynchronous: no assumption

47 Overview (1) Why? Motivation (2) Where? Between the network and the application (3) How? (3.1) Specifications, (3.2) assumptions, and (3.3) algorithms

48 Algorithms modules of a process request(deliver) indication request(deliver) indication request(deliver) indication

49 Algorithms p1 p2 p3 m1 m2 m3

50 Algorithms p1 p2 p3 m1 m2 crash

51 The rest; for every abstraction (A) We assume a crash-stop system with a perfect failure detector (fail-stop) We give algorithms (B) We try to make a weaker assumption We revisit the algorithms