Eran Bergman & Eddie Bortnikov, Principles of Reliable Distributed Systems, Technion EE, Spring 2006 1 Principles of Reliable Distributed Systems Recitation.

Slides:



Advertisements
Similar presentations
Impossibility of Distributed Consensus with One Faulty Process
Advertisements

Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services Authored by: Seth Gilbert and Nancy Lynch Presented by:
6.852: Distributed Algorithms Spring, 2008 Class 7.
Distributed Computing 8. Impossibility of consensus Shmuel Zaks ©
Computer Science 425 Distributed Systems CS 425 / ECE 428 Consensus
Consensus Hao Li.
Distributed Computing 8. Impossibility of consensus Shmuel Zaks ©
Byzantine Generals Problem: Solution using signed messages.
1 Principles of Reliable Distributed Systems Lecture 6: Synchronous Uniform Consensus Spring 2005 Dr. Idit Keidar.
1 Principles of Reliable Distributed Systems Lecture 3: Synchronous Uniform Consensus Spring 2006 Dr. Idit Keidar.
Eddie Bortnikov & Aran Bergman, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Recitation.
Distributed systems Module 2 -Distributed algorithms Teaching unit 1 – Basic techniques Ernesto Damiani University of Bozen Lesson 3 – Distributed Systems.
Sergio Rajsbaum 2006 Lecture 3 Introduction to Principles of Distributed Computing Sergio Rajsbaum Math Institute UNAM, Mexico.
Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 5: Synchronous (Uniform)
 Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 7: Failure Detectors.
Asynchronous Consensus (Some Slides borrowed from ppt on Web.(by Ken Birman) )
Alex Shraer, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Recitation 1: Introduction.
CPSC 668Set 9: Fault Tolerant Consensus1 CPSC 668 Distributed Algorithms and Systems Spring 2008 Prof. Jennifer Welch.
CPSC 668Set 16: Distributed Shared Memory1 CPSC 668 Distributed Algorithms and Systems Fall 2006 Prof. Jennifer Welch.
Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 6: Synchronous Byzantine.
Eddie Bortnikov & Aran Bergman, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Recitation.
Aran Bergman Eddie Bortnikov, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Recitation.
Eran Bergman & Eddie Bortnikov, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Recitation.
Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 5: Synchronous Uniform.
1 Principles of Reliable Distributed Systems Lecture 5: Failure Models, Fault-Tolerant Broadcasts and State-Machine Replication Spring 2005 Dr. Idit Keidar.
Impossibility of Distributed Consensus with One Faulty Process Michael J. Fischer Nancy A. Lynch Michael S. Paterson Presented by: Oren D. Rubin.
1 Principles of Reliable Distributed Systems Recitation 8 ◊S-based Consensus Spring 2009 Alex Shraer.
Alex Shraer, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Recitation 1: Introduction.
Distributed systems Module 2 -Distributed algorithms Teaching unit 1 – Basic techniques Ernesto Damiani University of Bozen Lesson 4 – Consensus and reliable.
 Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 6: Impossibility.
Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 6: Synchronous Byzantine.
Aran Bergman & Eddie Bortnikov & Alex Shraer, Principles of Reliable Distributed Systems, Spring Principles of Reliable Distributed Systems Recitation.
Aran Bergman, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Recitation 5: Reliable.
 Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 12: Impossibility.
 Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring 2007 Principles of Reliable Distributed Systems Lecture 1: Introduction.
Distributed Algorithms: Agreement Protocols. Problems of Agreement l A set of processes need to agree on a value (decision), after one or more processes.
Distributed Systems Tutorial 4 – Solving Consensus using Chandra-Toueg’s unreliable failure detector: A general Quorum-Based Approach.
On the Cost of Fault-Tolerant Consensus When There are no Faults Idit Keidar & Sergio Rajsbaum Appears in SIGACT News; MIT Tech. Report.
Systems of Distributed systems Module 2 - Distributed algorithms Teaching unit 2 – Properties of distributed algorithms Ernesto Damiani University of Bozen.
 Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 7: Failure Detectors.
1 Principles of Reliable Distributed Systems Recitation 7 Byz. Consensus without Authentication ◊S-based Consensus Spring 2008 Alex Shraer.
Composition Model and its code. bound:=bound+1.
Eddie Bortnikov & Aran Bergman, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Recitation.
 Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 8: Failure Detectors.
1 A Modular Approach to Fault-Tolerant Broadcasts and Related Problems Author: Vassos Hadzilacos and Sam Toueg Distributed Systems: 526 U1580 Professor:
Distributed Consensus Reaching agreement is a fundamental problem in distributed computing. Some examples are Leader election / Mutual Exclusion Commit.
Distributed Consensus Reaching agreement is a fundamental problem in distributed computing. Some examples are Leader election / Mutual Exclusion Commit.
CIS 540 Principles of Embedded Computation Spring Instructor: Rajeev Alur
Chapter 14 Asynchronous Network Model by Mikhail Nesterenko “Distributed Algorithms” by Nancy A. Lynch.
Consensus and Its Impossibility in Asynchronous Systems.
CIS 540 Principles of Embedded Computation Spring Instructor: Rajeev Alur
Consensus with Partial Synchrony Kevin Schaffer Chapter 25 from “Distributed Algorithms” by Nancy A. Lynch.
CS4231 Parallel and Distributed Algorithms AY 2006/2007 Semester 2 Lecture 8 Instructor: Haifeng YU.
CS294, Yelick Consensus revisited, p1 CS Consensus Revisited
Distributed systems Consensus Prof R. Guerraoui Distributed Programming Laboratory.
Sliding window protocol The sender continues the send action without receiving the acknowledgements of at most w messages (w > 0), w is called the window.
Hwajung Lee. Reaching agreement is a fundamental problem in distributed computing. Some examples are Leader election / Mutual Exclusion Commit or Abort.
Chap 15. Agreement. Problem Processes need to agree on a single bit No link failures A process can fail by crashing (no malicious behavior) Messages take.
SysRép / 2.5A. SchiperEté The consensus problem.
Impossibility of Distributed Consensus with One Faulty Process By, Michael J.Fischer Nancy A. Lynch Michael S.Paterson.
Chapter 21 Asynchronous Network Computing with Process Failures By Sindhu Karthikeyan.
CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS Fall 2011 Prof. Jennifer Welch CSCE 668 Set 16: Distributed Shared Memory 1.
Alternating Bit Protocol S R ABP is a link layer protocol. Works on FIFO channels only. Guarantees reliable message delivery with a 1-bit sequence number.
Fault tolerance and related issues in distributed computing Shmuel Zaks GSSI - Feb
DISTRIBUTED ALGORITHMS Spring 2014 Prof. Jennifer Welch Set 9: Fault Tolerant Consensus 1.
CS4231 Parallel and Distributed Algorithms AY 2006/2007 Semester 2 Lecture 9 Instructor: Haifeng YU.
Unreliable Failure Detectors for Reliable Distributed Systems Tushar Deepak Chandra Sam Toueg Presentation for EECS454 Lawrence Leinweber.
Coordination and Agreement
Alternating Bit Protocol
CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS
Presentation transcript:

Eran Bergman & Eddie Bortnikov, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Recitation 1: Introduction Spring 2007 Alex Shraer

Eran Bergman & Eddie Bortnikov, Principles of Reliable Distributed Systems, Technion EE, Spring Last on Models –Synchronous and Asynchronous –Failure models (a little…) Specifications –Liveness and Safety The Coordinated Attack Problem Note: The proofs on the board are included in the course’s material –Yes, you should know them for the exam

Eran Bergman & Eddie Bortnikov, Principles of Reliable Distributed Systems, Technion EE, Spring Safety and Liveness The properties are verifiable on an execution’s trace Safety = a property always happens –Closed under all prefixes Liveness = a property eventually happens

Eran Bergman & Eddie Bortnikov, Principles of Reliable Distributed Systems, Technion EE, Spring Safety and Liveness A safety property cannot be “fixed” after it is violated. You can always extend a trace to satisfy a liveness property. finite refutation: if every counter-example trace for a property p has a finite prefix in which p does not hold, then p is safety. On the other hand, you can always extend a trace to satisfy a liveness property.

Eran Bergman & Eddie Bortnikov, Principles of Reliable Distributed Systems, Technion EE, Spring Safety/Liveness/Both/None? Consider a partial elevator spec: The elevator will not stop in between floors. The elevator may break after the 1 st year of use If someone summons the elevator to some floor: –The elevator will eventually stop. –The elevator reaches that floor no later than 1 minute later

Eran Bergman & Eddie Bortnikov, Principles of Reliable Distributed Systems, Technion EE, Spring Safety/Liveness/Both/None? ראש השנה חל בדיוק פעם אחת בשנה. היועץ המשפטי יוכל להתמנות לתפקיד שופט. נהג שצבר שלוש עבירות תנועה לא ינהג לפני שיעבור קורס נהיגה מונעת. אף תהליך לא יכול להשתמש ב CPU במשך זמן אינסופי. כל קריאה לפונקציה חוזרת. המשטרה תפנה את הצומת החסום, אבל זה ייקח לה לפחות חצי שעה. המשטרה תפנה את הצומת החסום תוך חצי שעה לכל היותר. מספר המנעולים שאחראי הבניין מתקין במהלך השנה הוא לפחות 30.

Eran Bergman & Eddie Bortnikov, Principles of Reliable Distributed Systems, Technion EE, Spring Coordinated Attack Let’s attack A B

Eran Bergman & Eddie Bortnikov, Principles of Reliable Distributed Systems, Technion EE, Spring The Model: Synchronous with Message Loss Message loss can be detected –Bounded delay, timeouts Message loss is unbounded –In some runs, all the messages are lost

Eran Bergman & Eddie Bortnikov, Principles of Reliable Distributed Systems, Technion EE, Spring Coordinated Attack Definition (Reminder) Requirements: –both generals must decide the same: either to attack or not to attack –if both are not ready to attack they must not attack –if both are ready to attack and no messages are lost then they must attack Still cannot be achieved!

Eran Bergman & Eddie Bortnikov, Principles of Reliable Distributed Systems, Technion EE, Spring Properties of Coordinated Attack Agreement: If both generals decide, they decide the same. Termination: Every general eventually decides. Validity: –If both inputs are “not ready” then no general decides “attack” –if both inputs are “ready” and every message sent is delivered then no general decides “no-attack”.

Eran Bergman & Eddie Bortnikov, Principles of Reliable Distributed Systems, Technion EE, Spring What happens if? (cont’d) Weak Termination: If there are no message losses, then all processes eventually decide. We want an algorithm that solves the problem where Agreement, Weak Termination and Validity are required.

Eran Bergman & Eddie Bortnikov, Principles of Reliable Distributed Systems, Technion EE, Spring What happens if? (cont’d) Unanimous Termination: If any process decides, then all processes eventually decide. We want an algorithm that solves the problem where Agreement, Weak Termination, Unanimous Termination and Validity are required. Homework

Eran Bergman & Eddie Bortnikov, Principles of Reliable Distributed Systems, Technion EE, Spring Where’s the difference? Why couldn’t we use the proof from class when only Weak Termination was used?

Eran Bergman & Eddie Bortnikov, Principles of Reliable Distributed Systems, Technion EE, Spring Stronger Models Bounded loss rate – take 1 –At most, 10 messages are lost on each channel (from general A to general B and vice versa). Is it enough?

Eran Bergman & Eddie Bortnikov, Principles of Reliable Distributed Systems, Technion EE, Spring Interfaces (Reminder) There are two generals A, B. Each has an input inp A, inp B  {“ready”, “not ready”} Possible actions for Q  {A, B}: –Decide Q (v), v  {“attack”, “no attack”} (Output) –Send Q (m), m  {“yes”, “no”} (Output) –Deliver Q (m) (Input)

Eran Bergman & Eddie Bortnikov, Principles of Reliable Distributed Systems, Technion EE, Spring Suggested Algorithm Each general performs the following: –Repeat 11 times: Send(inp) –Upon Deliver(m) Decide(this.inp & m.inp) Or any deterministic rule that matches validity –halt.

Eran Bergman & Eddie Bortnikov, Principles of Reliable Distributed Systems, Technion EE, Spring To Summarize The exact model assumptions and the exact problem specification are critical –Minor changes in either lead to different results.