ITEC452 Distributed Computing Lecture 2 – Part 2 Models in Distributed Systems Hwajung Lee.

3.5 Relationship among Models

Weak vs. Strong Models
One object (or operation) of a strong model = more than one object (or operation) of a weaker model. Often, weaker models are synonymous with fewer restrictions. One can add layers (additional restrictions) to create a stronger model from a weaker one.
Examples:
- The HLL (high-level language) model is stronger than the assembly language model.
- Asynchronous is weaker than synchronous.
- Bounded delay is stronger than unbounded delay (channel).

Model transformation
Stronger models
- simplify reasoning, but
- need extra work to implement.
Weaker models
- are easier to implement, and
- have a closer relationship with the real world.
“Can model X be implemented using model Y?” is an interesting question in computer science.
Sample problems:
- Non-FIFO to FIFO channel
- Message passing to shared memory
- Non-atomic broadcast to atomic broadcast

Non-FIFO to FIFO channel (1)
[Figure: sender P and receiver Q with a buffer; messages shown on the channel in the order m1, m4, m3, m2.]

Non-FIFO to FIFO channel (2)
{Sender process P}
  var i : integer {initially 0}
  repeat
    send m[i], i to Q;
    i := i+1
  forever
{Receiver process Q}
  var k : integer {initially 0}
  buffer : buffer[0..∞] of msg {initially ∀k: buffer[k] = empty}
  repeat
    {STORE}   receive m[i], i from P;
              store m[i] into buffer[i];
    {DELIVER} while buffer[k] ≠ empty do
              begin
                deliver content of buffer[k];
                buffer[k] := empty; k := k+1;
              end
  forever
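The receiver's STORE and DELIVER steps can be sketched in Python. This is a minimal illustration, not the lecture's code: the class name `Resequencer`, the dictionary standing in for the unbounded buffer, and the sample arrival order are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


class Resequencer:
    """Receiver Q: stores out-of-order messages and delivers them in order."""

    def __init__(self) -> None:
        # Hypothetical stand-in for buffer[0..inf]: a missing key means "empty".
        self.buffer: Dict[int, str] = {}
        self.k = 0  # next sequence number to deliver
        self.delivered: List[str] = []

    def receive(self, msg: str, seq: int) -> None:
        # STORE: place the message in the slot named by its sequence number.
        self.buffer[seq] = msg
        # DELIVER: drain every consecutive non-empty slot starting at k.
        while self.k in self.buffer:
            self.delivered.append(self.buffer.pop(self.k))
            self.k += 1


def demo() -> List[str]:
    q = Resequencer()
    # Sender P tagged m0..m3 with 0..3; the non-FIFO channel reordered them.
    arrivals: List[Tuple[str, int]] = [("m0", 0), ("m3", 3), ("m2", 2), ("m1", 1)]
    for msg, seq in arrivals:
        q.receive(msg, seq)
    return q.delivered
```

Calling `demo()` feeds the messages in scrambled order and returns them in sending order. Note that both `seq` and the number of occupied slots can grow without bound.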

Observations
The solution needs (a) unbounded sequence numbers and (b) an unbounded number of buffer slots. Both are bad.
Now solve the same problem on a model where
(a) the propagation delay has a known upper bound of T,
(b) the messages are sent at a rate of r per unit time, and
(c) the messages are received at a rate faster than r.
The buffer requirement drops to r·T. Synchrony pays.
Question: How can the problem be solved using bounded buffer space if the propagation delay is arbitrarily large?

Message-passing to Shared memory
{Read X by process i}: read x[i]
{Write X := v by process i}:
- x[i] := v;
- atomically broadcast v to every other process j (j ≠ i);
- after receiving the broadcast, process j (j ≠ i) sets x[j] to v.
Understand the significance of atomic operations. It is not trivial, but it is very important in distributed systems.
This is incomplete. There are more pitfalls here.
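A single-machine sketch of this emulation follows. It assumes a hypothetical class `ReplicatedVariable` in which one lock stands in for the atomic broadcast primitive; it does not implement atomic broadcast over a real network, and it inherits the pitfalls the slide warns about.

```python
import threading
from typing import List, Optional


class ReplicatedVariable:
    """Shared variable X emulated by one local copy x[i] per process."""

    def __init__(self, n: int) -> None:
        self.x: List[Optional[int]] = [None] * n
        # Assumption: this lock is a stand-in for atomic broadcast.
        self._lock = threading.Lock()

    def read(self, i: int) -> Optional[int]:
        # Read X by process i: a purely local read of x[i].
        return self.x[i]

    def write(self, i: int, v: int) -> None:
        # Write X := v by process i: update the local copy, then every
        # other copy, while no other writer can interleave.
        with self._lock:
            self.x[i] = v
            for j in range(len(self.x)):
                if j != i:
                    self.x[j] = v
```

After `r = ReplicatedVariable(3); r.write(1, 42)`, every `r.read(i)` returns the written value. The lock only serializes writers; a reader running concurrently with a write could still observe a stale copy, which is one reason the slide calls the scheme incomplete.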

Non-atomic to atomic broadcast
Atomic broadcast = either everybody or nobody receives the message.
{process i is the sender}
  for j = 1 to N-1 (j ≠ i)
    send message m to neighbor[j]
(Easy!)
Now include crash failure as a part of our model. What if the sender crashes in the middle? Implement atomic broadcast in the presence of crash failures.
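The problem with the simple loop can be seen in a small simulation. The function names and the `crash_after` parameter are hypothetical; the sketch only shows how a crash in the middle of the loop violates the "everybody or nobody" requirement, and does not attempt a solution.

```python
from typing import Dict, List, Optional


def broadcast(sender: int, n: int, m: str,
              crash_after: Optional[int] = None) -> Dict[int, List[str]]:
    """Send m from `sender` to every other process, one send at a time.

    crash_after is a hypothetical knob: the sender crashes after that many sends.
    """
    inbox: Dict[int, List[str]] = {j: [] for j in range(n) if j != sender}
    sent = 0
    for j in inbox:
        if crash_after is not None and sent >= crash_after:
            break  # sender crashed: the remaining processes never get m
        inbox[j].append(m)
        sent += 1
    return inbox


def is_atomic(inbox: Dict[int, List[str]]) -> bool:
    # Atomic outcome: either everybody received the message or nobody did.
    got = [len(msgs) > 0 for msgs in inbox.values()]
    return all(got) or not any(got)
```

With four processes, `broadcast(0, 4, "m")` yields an atomic outcome, while `broadcast(0, 4, "m", crash_after=2)` leaves one process without the message.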

Mobile-agent based communication
Processes communicate via messengers instead of (or in addition to) messages. A messenger carries both program and data. Example query: What is the lowest price of an iPod in Radford?

3.6 Classification Based on Special Properties

Other classifications of models
Reactive vs. transformational systems
- A reactive system never sleeps (like a server).
- A transformational (or non-reactive) system reaches a fixed point after which no further change occurs in the system. (Examples?)
Named vs. anonymous systems
- In named systems, the process id is a part of the algorithm.
- In anonymous systems, it is not. All processes are equal.
  (-) Symmetry breaking is often a challenge.
  (+) It is easy to replace one process with another with no side effect. Saves log N bits.

Knowledge based communication
Alice and Bob enter into an agreement: whenever one falls sick, (s)he will call the other person. Since making the agreement, neither has called the other, so both conclude that they are in good health. Assume that the clocks are synchronized, communication links are perfect, and a telephone call requires zero time to go through. What kind of interprocess communication model is this?

History
The paper “Cheating Husbands and Other Stories: A Case Study of Knowledge, Action, and Communication” by Yoram Moses, Danny Dolev, and Joseph Halpern illustrates how actions are taken and decisions are made without explicit communication using common knowledge. (Adaptation of Gamow and Stern, “Forty unfaithful wives,” Puzzle Math, 1958)

Relevance
- Bidding in card games like bridge is an example of knowledge-based communication.
- Communicating through silence is energy-efficient.

The Queen’s Proclamation
The Queen read out the following in a meeting at the town square:
- There are one or more unfaithful husbands in our community.
- None of you knows whether your own husband is faithful, but each of you knows which of the other husbands are unfaithful.
- Do not discuss this with anyone, but should you discover that your own husband is unfaithful, you should shoot him at midnight of the day you find out about it.

What happened after this
Thirty-nine silent nights went by, and on the fortieth night, gunshots were heard.
- What was going on for 39 nights?
- How many unfaithful husbands were there?
- Why did it take so long?

A simple case
[Figure: four couples W1-H1, W2-H2, W3-H3, W4-H4.]
- W2 does not know of any other unfaithful husband.
- W2 knows that there is at least one (common knowledge).
- W2 concludes that it must be H2, and kills him on the first night.

Theorem
If there are N unfaithful husbands, then they will all be killed at midnight of the Nth day.

A more complicated case
[Figure: four couples W1-H1, W2-H2, W3-H3, W4-H4.]
- W2 knows H3 is unfaithful. She expects H3 to be killed on the first night.
- W3 has similar thoughts.
So no one was killed on the first night.

The case continues
[Figure: four couples W1-H1, W2-H2, W3-H3, W4-H4.]
- Then W2 thinks: why wasn’t H3 killed on the first night? Does it mean that W3 knows of another unfaithful husband whom I don’t know? It must be my husband H2 then!
- W3 has similar thoughts.
- So both H2 and H3 are killed on the SECOND night.
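The reasoning in the two cases generalizes, and a short simulation can check it against the theorem. This is a sketch under stated assumptions: the function `simulate` and its list encoding are hypothetical, and each wife is modelled by the single rule "if I see k unfaithful husbands and the first k nights are silent, I shoot on night k + 1".

```python
from typing import List, Tuple


def simulate(unfaithful: List[bool]) -> Tuple[int, List[int]]:
    """Return (night of the shots, indices of the husbands shot)."""
    if not any(unfaithful):
        raise ValueError("the proclamation promises at least one unfaithful husband")
    total = sum(unfaithful)
    # sees[i]: number of unfaithful husbands wife i observes, i.e. all but
    # her own. No wife knows `total`; it is only used to set up the views.
    sees = [total - int(u) for u in unfaithful]
    night = 0
    while True:
        night += 1
        # A wife who sees k unfaithful husbands expects shots by night k.
        # If nights 1..k stay silent, she concludes that her own husband
        # is unfaithful and shoots on night k + 1.
        shooters = [i for i, k in enumerate(sees) if k == night - 1]
        if shooters:
            return night, shooters
```

With one unfaithful husband the shot comes on night 1, with two on night 2, and with forty on night 40, matching the story of the thirty-nine silent nights.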

Common Knowledge
“F is common knowledge” means:
- Everyone knows F.
- Everyone knows that everyone knows F.
- Everyone knows that everyone knows that everyone knows that … F (ad infinitum).

Model and complexity
Many measures:
- Space complexity
- Time complexity
- Message complexity
- Bit complexity
- Round complexity
What do these mean? Consider broadcasting in an n-cube (n = 3).
[Figure: a 3-cube with the source node marked.]

Broadcasting using messages
Each process j has a variable x[j], initially undefined.
{Process 0}
  sends m to neighbors
{Process i > 0}
  repeat
    receive m {m contains the value};
    if m is received for the first time
      then x[i] := m.value;
           send x[i] to each neighbor j > i
      else discard m
    end if
  forever
What is the (1) message complexity and (2) space complexity per process?
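To explore the message-complexity question, the algorithm can be run on an n-cube and the sends counted. The sketch below makes several assumptions not in the slides: nodes are numbered 0 to 2^n - 1 with neighbors differing in one bit, a single FIFO queue models all channels, process 0 already holds the value, and the value is never `None`.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple


def hypercube_neighbors(i: int, n: int) -> List[int]:
    # In an n-cube, two nodes are neighbors when their ids differ in exactly one bit.
    return [i ^ (1 << b) for b in range(n)]


def broadcast(n: int, value: int) -> Tuple[Dict[int, Optional[int]], int]:
    """Run the broadcast from process 0; return (x, number of messages sent)."""
    x: Dict[int, Optional[int]] = {i: None for i in range(2 ** n)}
    x[0] = value  # assumption: the source already holds the value
    # Process 0 sends m to all of its neighbors.
    channel = deque((j, value) for j in hypercube_neighbors(0, n))
    messages = len(channel)
    while channel:
        i, m = channel.popleft()
        if x[i] is None:  # m is received for the first time
            x[i] = m
            for j in hypercube_neighbors(i, n):
                if j > i:
                    channel.append((j, x[i]))
                    messages += 1
        # otherwise: discard m
    return x, messages
```

For `n = 3` the run sends one message per edge of the cube, and every process ends with the broadcast value. Each process stores only its own `x[i]`.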

Broadcasting using shared memory
Each process j has a variable x[j], initially undefined.
{Process 0}
  x[0] := v
{Process i > 0}
  repeat
    if ∃ a neighbor j < i : x[i] ≠ x[j]
      then x[i] := x[j] {this is a step}
      else skip
    end if
  forever
What is the time complexity (i.e., how many steps are needed)? It can be arbitrarily large! WHY?
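One possible explanation for the unbounded step count, offered as a sketch rather than as the lecture's official answer: a process with two smaller-id neighbors can copy from them alternately while one of them is slow. The function name, the `flips` parameter, and the adversarial schedule below are all hypothetical.

```python
from typing import List, Optional


def copy_from(x: List[Optional[int]], i: int, j: int) -> None:
    # The guard of the algorithm: a step is enabled only when x[i] != x[j].
    assert x[i] != x[j]
    x[i] = x[j]


def steps_taken_by_process_3(flips: int, v: int = 1) -> int:
    """Count the steps process 3 takes while process 2 stays idle.

    Nodes 0..3 form one face of the 3-cube; process 3 has the smaller-id
    neighbors 1 and 2. `flips` is chosen by a hypothetical adversarial scheduler.
    """
    x: List[Optional[int]] = [v, v, None, None]  # process 1 already copied v from 0
    steps = 0
    for _ in range(flips):
        copy_from(x, 3, 1)  # x[3] becomes v
        steps += 1
        copy_from(x, 3, 2)  # x[3] becomes undefined again
        steps += 1
    copy_from(x, 2, 0)      # process 2 finally takes its step
    copy_from(x, 3, 2)      # x[3] now agrees with both smaller neighbors
    steps += 1
    assert x == [v, v, v, v]
    return steps
```

Under this schedule process 3 takes `2 * flips + 1` steps, and `flips` can be made as large as the scheduler likes.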

Broadcasting using shared memory
Each process j has a variable x[j], initially undefined.
Now, use “large atomicity”, where in one step, a process j reads the states of ALL its neighbors of smaller id, and updates x[j] only when these are equal, and different from x[j].
What is the time complexity? How many steps are needed? The time complexity is now O(n²).

Time complexity in rounds
Each process j has a variable x[j], initially undefined.
Rounds are truly defined for synchronous systems. An asynchronous round consists of a number of steps where every process (including the slowest one) takes at least one step.
How many rounds will you need to complete the broadcast using the large atomicity model?
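The round count can be explored with a small simulation of the large-atomicity rule on an n-cube. The sketch assumes a synchronous execution in which every process reads the same start-of-round snapshot; the function name and node numbering are hypothetical. In an asynchronous round, where steps happen one after another, the broadcast may finish in fewer rounds.

```python
from typing import List, Optional


def rounds_to_broadcast(n: int, v: int = 1) -> int:
    """Count synchronous rounds until every x[j] equals v in an n-cube."""
    size = 2 ** n
    x: List[Optional[int]] = [None] * size
    x[0] = v
    rounds = 0
    while any(val != v for val in x):
        # Assumption: all processes read the same start-of-round state.
        snapshot = list(x)
        for j in range(1, size):
            smaller = [snapshot[j ^ (1 << b)] for b in range(n) if j ^ (1 << b) < j]
            # Large atomicity: read ALL smaller-id neighbors in one step and
            # update only when they agree with each other and differ from x[j].
            if len(set(smaller)) == 1 and smaller[0] != snapshot[j]:
                x[j] = smaller[0]
        rounds += 1
    return rounds
```

Under this assumption the value advances one "layer" of the cube per round, so `rounds_to_broadcast(3)` returns 3.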