Alex Shraer, Principles of Reliable Distributed Systems, Technion EE, Spring 2008
Principles of Reliable Distributed Systems, Recitation 1: Introduction

Presentation transcript:

Principles of Reliable Distributed Systems
Recitation 1: Introduction
Spring 2009
Alex Shraer

Topics for today
  –Specifications
  –Liveness and Safety
  –Coordinated Attack

Safety and Liveness
The properties are verifiable in an execution.
Safety = a property always holds
  –the program will never produce a wrong result
  –it holds in all prefixes of a given execution
  –even in the empty prefix (doing nothing doesn't violate safety)
Liveness = a property eventually holds
  –the program will eventually produce a result

Safety and Liveness - Cont.
Safety: those properties whose violation always has a finite witness (finite refutation).
  –In other words: if every counterexample to a property p has a finite prefix in which p does not hold, then p is a safety property.
  –A safety property cannot be “fixed” after it is violated.
Liveness: those properties whose violation never has a finite witness (all counterexample traces are infinite).
  –No matter what happens along a finite trace, something good could still happen later: you can always extend a finite trace to satisfy a liveness property.

Examples
If the driver pushes the brakes of the car, it will eventually stop
  –if this is not true, how can we refute it?
  –a counterexample has the following form: the driver pushes the brakes at some point, but the car never stops afterwards (an infinite execution)
  –this is a liveness property
The program terminates within 31 computational steps
  –a finite execution may violate this; this is a safety property!
The program eventually terminates
  –only an infinite execution can possibly refute this claim; liveness!
Each process will enter its critical section infinitely often
  –this means: at any point of the run, each process will eventually enter its critical section at some future point (infinitely often = always eventually)
  –liveness!
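The contrast between the two termination examples can be made concrete with a small monitor. The following is only a minimal sketch under assumed conventions: an execution prefix is a list of event names, every event other than "halt" counts as one computational step, and the function names are hypothetical. It shows that the bounded property has a finite bad prefix, while any finite prefix can still be extended to satisfy "eventually terminates".

```python
from typing import Iterable, List, Optional

STEP_BOUND = 31  # the bound used in the slide's example


def first_bad_prefix_length(events: Iterable[str], bound: int = STEP_BOUND) -> Optional[int]:
    """Monitor for the safety property "terminates within `bound` steps".

    Returns the length of the shortest violating prefix, or None when the
    observed finite prefix contains no violation.
    """
    steps = 0
    for event in events:
        if event == "halt":
            return None  # terminated in time; nothing later can break the property
        steps += 1
        if steps > bound:
            return steps  # finite witness: more than `bound` steps without halting
    return None  # prefix is too short to refute the property


def extend_to_terminate(prefix: List[str]) -> List[str]:
    """Any finite prefix can be extended so that "eventually terminates" holds,
    which is why no finite prefix refutes this liveness property."""
    if "halt" in prefix:
        return list(prefix)
    return list(prefix) + ["halt"]
```

For instance, `first_bad_prefix_length(["step"] * 40)` reports a violation at prefix length 32, whereas no call to a function of this kind could ever report a violation of "eventually terminates" from a finite prefix.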

The meaning of liveness [Lamport 2000]
The question of whether a real system satisfies a liveness property is meaningless; it can be answered only by observing the system for an infinite length of time, and real systems don’t run forever.
Liveness is always an approximation to the property we really care about. We want a program to terminate within 100 years, but proving that it does would require addition of distracting timing assumptions. So, we prove the weaker condition that the program eventually terminates. This doesn’t prove that the program will terminate within our lifetimes, but it does demonstrate the absence of infinite loops.

A non-safety and non-liveness property
The machine provides beer infinitely often after initially providing sprite three times in a row.
This property consists of two parts:
  1. it requires beer to be provided infinitely often
     –this is a liveness property
  2. the first three drinks it provides should all be sprite
     –example of a bad prefix: one of the first three drinks is beer
     –this is a safety property
This property is a conjunction of a safety property and a liveness property.
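One possible way to write the decomposition down formally is sketched here. The notation is an assumption, not taken from the slides: an execution is modeled as an infinite sequence of drinks d_1 d_2 d_3 ..., and S, L, P name the sets of executions satisfying the safety part, the liveness part, and the whole property.

```latex
\begin{align*}
S &= \{\, d_1 d_2 d_3 \ldots \;:\; d_1 = d_2 = d_3 = \text{sprite} \,\} && \text{(safety)}\\
L &= \{\, d_1 d_2 d_3 \ldots \;:\; \forall i \;\exists j \ge i \;.\; d_j = \text{beer} \,\} && \text{(liveness)}\\
P &= S \cap L
\end{align*}
```

Under this reading, P is not a safety property because the execution that serves sprite forever violates P, yet every finite prefix of it can still be extended into an execution in P. P is not a liveness property either, because a prefix that starts with beer cannot be extended into an execution in P.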

Example - execution of an elevator system
states:
  floor    doors     buttons pushed
  1st      closed    none
  1st      closed    3rd floor
  3rd      closed    3rd floor
  3rd      open      none
events:
  call from 3rd floor, moving up 2 floors, open the doors, make beep sound
What is the trace of this execution?
  call from 3rd floor, moving up 2 floors, open the doors, make beep sound, ... (infinite execution)

Predicates
The doors are always open
  –False! Counterexample: the prefix of the execution consisting of its first state
If someone summons the elevator to some floor, the doors eventually open
  –True over our execution
After the doors open, the next action of the elevator is to make a beep sound
  –True over our execution
The elevator may break after the 1st year of use
  –not a property! It cannot be evaluated over an execution
Suppose that we add time to our model. If someone summons the elevator to some floor:
  –The elevator will eventually stop
  –The elevator reaches that floor no later than 1 minute later
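The first three predicates can be evaluated mechanically over the observed part of the elevator execution. The sketch below is illustrative only: the `State` type, the constants and the function names are hypothetical, and the order of the four states is an assumption inferred from the listed events (call, move up, open doors, beep). Only the finite observed prefix is checked, so a True result for the liveness-style predicate means "witnessed so far", not a proof over the infinite execution.

```python
from typing import List, NamedTuple


class State(NamedTuple):
    floor: int
    doors: str           # "open" or "closed"
    button_pushed: str   # "none" or the requested floor


# Assumed state order, inferred from the event sequence below.
STATES: List[State] = [
    State(1, "closed", "none"),
    State(1, "closed", "3rd floor"),
    State(3, "closed", "3rd floor"),
    State(3, "open", "none"),
]
EVENTS: List[str] = [
    "call from 3rd floor",
    "moving up 2 floors",
    "open the doors",
    "make beep sound",
]


def doors_always_open(states: List[State]) -> bool:
    """Safety-style predicate: refuted by any state with closed doors."""
    return all(s.doors == "open" for s in states)


def beep_follows_open(events: List[str]) -> bool:
    """Every 'open the doors' that has a successor is followed by a beep."""
    for i, event in enumerate(events[:-1]):
        if event == "open the doors" and events[i + 1] != "make beep sound":
            return False
    return True


def summons_answered(events: List[str]) -> bool:
    """Every call in the observed prefix is followed later by 'open the doors'."""
    for i, event in enumerate(events):
        if event.startswith("call from") and "open the doors" not in events[i + 1:]:
            return False
    return True
```

With these definitions, `doors_always_open(STATES)` is False already for the one-state prefix `STATES[:1]`, matching the counterexample named on the slide.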

Safety/Liveness/Both/None?
  –The doors open at most three times.
  –The number of locks that the building manager installs is at least 30.
  –The doors open exactly three times.
  –The legal adviser will be able to be appointed as a judge.
  –A driver who has accumulated three traffic offenses will not drive before completing a defensive driving course.
  –Every function call returns.
  –The police will clear the blocked intersection, but it will take them at least half an hour.
  –The police will clear the blocked intersection within half an hour at most.

Eran Bergman & Eddie Bortnikov, Principles of Reliable Distributed Systems, Technion EE, Spring
Coordinated Attack
[Figure: two generals, A and B, with the message “Let’s attack”]

The Model: Synchronous with Message Loss
Message loss can be detected
  –Bounded delay, timeouts
Message loss is unbounded
  –In some runs, all the messages are lost

Properties of Coordinated Attack
Agreement: If both generals decide, they decide the same.
Termination: Every general eventually decides.
Validity:
  –If both inputs are “not ready” then no general decides “attack”
  –If both inputs are “ready” and no messages are lost then no general decides “no-attack”

What happens if? (cont’d)
Weak Termination: If there are no message losses, then all processes eventually decide.
We want an algorithm that solves the problem where Agreement, Weak Termination and Validity are required.
Each general performs the following:
  –Send(inp)
  –Upon Deliver(m): Decide(this.inp & m.inp)
     (or any deterministic rule that matches validity)
  –halt
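Because there are only two inputs and two channels, the one-message algorithm above can be checked exhaustively. The following is a rough sketch, assuming inputs are booleans (True for “ready”), a decision of True means “attack”, and None means the general never decides; `run_once` and `check_all_runs` are hypothetical names introduced for illustration.

```python
from itertools import product
from typing import Optional, Tuple

Decision = Optional[bool]  # None means "never decides"


def run_once(inp_a: bool, inp_b: bool,
             lost_ab: bool, lost_ba: bool) -> Tuple[Decision, Decision]:
    """Each general sends its input once and decides the AND of both inputs
    upon delivery; a general whose incoming message is lost never decides."""
    dec_a = None if lost_ba else (inp_a and inp_b)
    dec_b = None if lost_ab else (inp_b and inp_a)
    return dec_a, dec_b


def check_all_runs() -> bool:
    """Check Agreement, Validity and Weak Termination over all 16 runs."""
    for inp_a, inp_b, lost_ab, lost_ba in product([False, True], repeat=4):
        dec_a, dec_b = run_once(inp_a, inp_b, lost_ab, lost_ba)
        decided = [d for d in (dec_a, dec_b) if d is not None]
        no_loss = not lost_ab and not lost_ba
        if len(decided) == 2 and dec_a != dec_b:
            return False  # Agreement violated
        if not inp_a and not inp_b and True in decided:
            return False  # Validity, first clause, violated
        if inp_a and inp_b and no_loss and False in decided:
            return False  # Validity, second clause, violated
        if no_loss and len(decided) != 2:
            return False  # Weak Termination violated
    return True
```

A run such as `run_once(True, True, True, False)` returns `(True, None)`: general A attacks while B never decides. This is allowed here because Agreement only constrains generals that decide, and Weak Termination only applies to runs without loss.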

What happens if? (cont’d)
Where’s the difference? Why couldn’t we use the proof from class when only Weak Termination is required?
The proof shown in the lecture relies on the fact that all processes terminate (= decide) when we build the runs. Otherwise, the proof doesn’t work.

What happens if? (cont’d)
Unanimous Termination: If any process decides, then all processes eventually decide.
We want an algorithm that solves the problem where Agreement, Weak Termination, Unanimous Termination and Validity are required.
  –Homework

Stronger Models
Bounded loss rate
  –At most 10 messages are lost on each channel (from general A to general B and vice versa).
Is it enough?
Each general performs the following:
  –Repeat 11 times: Send(my_vote)
  –Upon Deliver(other_vote): Decide(my_vote & other_vote)
     (or any deterministic rule that matches validity)
  –halt

Correctness of this Algorithm
Agreement: If both generals decide, each decides on the “&” of the two votes, so they decide the same value.
Validity
  –If both votes are 0 (not attack), since 0&0 = 0, no general decides “attack”
  –If both votes are 1 (attack), since 1&1 = 1, no general decides “not attack”
Termination
  –Each vote is sent 11 times. Since at most 10 can be lost, at least one message is delivered (to each general). Therefore, they both decide and halt.
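The termination argument can be exercised against a worst-case adversary. This is only a sketch under stated assumptions: votes are booleans (True for attack), the adversary is modeled as a set of indices of the sends it drops on each channel, and the names `deliver`, `run_bounded` and `check_worst_cases` are hypothetical.

```python
from itertools import combinations, product
from typing import List, Optional, Set, Tuple

MAX_LOST = 10         # loss bound per channel, from the slide
SENDS = MAX_LOST + 1  # each vote is sent 11 times


def deliver(vote: bool, lost: Set[int]) -> List[bool]:
    """Return the copies of `vote` that survive when sends in `lost` are dropped."""
    if len(lost) > MAX_LOST:
        raise ValueError("adversary exceeds the loss bound of the model")
    return [vote for i in range(SENDS) if i not in lost]


def decide(my_vote: bool, delivered: List[bool]) -> Optional[bool]:
    """Decide on the AND of both votes upon the first delivery, else never decide."""
    if not delivered:
        return None
    return my_vote and delivered[0]


def run_bounded(vote_a: bool, vote_b: bool,
                lost_ab: Set[int], lost_ba: Set[int]) -> Tuple[Optional[bool], Optional[bool]]:
    dec_a = decide(vote_a, deliver(vote_b, lost_ba))
    dec_b = decide(vote_b, deliver(vote_a, lost_ab))
    return dec_a, dec_b


def check_worst_cases() -> bool:
    """Try every way of dropping exactly 10 of the 11 sends on each channel."""
    patterns = [set(c) for c in combinations(range(SENDS), MAX_LOST)] + [set()]
    for vote_a, vote_b in product([False, True], repeat=2):
        for lost_ab, lost_ba in product(patterns, repeat=2):
            dec_a, dec_b = run_bounded(vote_a, vote_b, lost_ab, lost_ba)
            if dec_a is None or dec_b is None:
                return False  # Termination violated
            if dec_a != dec_b or dec_a != (vote_a and vote_b):
                return False  # Agreement or Validity violated
    return True
```

The `ValueError` in `deliver` marks the boundary of the model: with 11 or more losses on a channel the argument no longer applies, which is the unbounded-loss setting described earlier.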

To Summarize
The exact model assumptions and the exact problem specification are critical
  –Minor changes in either lead to different results.