Lecture 3: State, Detection

Anish Arora, CSE 763

The Stability Detection Problem
A stable property of a distributed system is one that persists: once a stable property is true, it remains true thereafter.
Examples:
  "the computation has terminated"
  "the system is deadlocked"
  "all tokens in a token ring have disappeared"
Solution:
  Determine the global state of the system.
  Test the global state to see if the stable property holds.

Termination Detection
Processes 0..N-1 are arbitrarily connected by channels.
Each process is either idle or active.
An active process can become idle spontaneously.
An idle process can become active only upon receiving a message.
The Problem: Detect that all processes are idle and all channels are empty.

Program and Proof (hand-in-hand)
Design Step 0: How to count messages in channels.
process j
   {send msg}    → c.j := c.j + 1
[] {receive msg} → c.j := c.j - 1
Proof: Invariant I1 ≡ (Sum j :: c.j) = # of messages in channels
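The counting step can be checked with a small simulation. The following is a minimal sketch, not part of the lecture: the function name `simulate_counters`, the random scheduler, and the list `in_flight` (standing in for the contents of all channels) are assumptions made for illustration.

```python
import random

def simulate_counters(n=4, steps=200, seed=0):
    """Run random send/receive steps and check invariant I1 after each one."""
    rng = random.Random(seed)
    c = [0] * n        # c[j]: messages sent by j minus messages received by j
    in_flight = []     # assumption: destinations of all messages now in channels
    for _ in range(steps):
        if in_flight and rng.random() < 0.5:
            # {receive msg} -> c.j := c.j - 1, executed by the receiver j
            j = in_flight.pop(rng.randrange(len(in_flight)))
            c[j] -= 1
        else:
            # {send msg} -> c.j := c.j + 1, executed by the sender j
            j = rng.randrange(n)
            in_flight.append(rng.randrange(n))
            c[j] += 1
        # I1: (Sum j :: c.j) = # of messages in channels
        assert sum(c) == len(in_flight)
    return c, in_flight
```

Each send adds one to the sum and one message to the channels, and each receive removes one from both, which is why the assertion holds after every step.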

Refining the program
Step 1: How to detect that all processes are idle.
Consider a logical ring 0 -> N-1 -> … -> 0 and pass a token.
Let t denote the location of the token.
process j
   {send msg}    → c.j := c.j + 1
[] {receive msg} → c.j := c.j - 1
[] {propagate token}  j ≠ 0 ∧ t = j ∧ idle.j → t := t - 1 ; q := q + c.j
[] {retransmit token} j = 0 ∧ t = j ∧ idle.j ∧ ¬(q + c.0 = 0) → t := N - 1 ; q := 0

Refining the proof
Proof: We begin with an idealized Invariant ≡ I1 ∧ Q, where
  Q ≡ (∀j : t < j ∧ j < N : idle.j) ∧ (q = (Sum j : t < j ∧ j < N : c.j))
However, Q is not preserved by one of the actions (the receive action for j, t < j ∧ j < N).
But when Q is violated, R becomes true, where
  R ≡ q + (Sum j : 0 ≤ j ∧ j ≤ t : c.j) > 0
So we weaken the invariant: Invariant ≡ I1 ∧ (Q ∨ R)
However, R is not preserved by one of the actions (the receive action for j, 0 ≤ j ∧ j ≤ t).

Refining the program again
Step 2: How to abort a detection when unsure that the token traversal was uninterrupted.
process j
   {send msg}    → c.j := c.j + 1
[] {receive msg} → c.j := c.j - 1 ; blacken j
[] {propagate token}  j ≠ 0 ∧ t = j ∧ idle.j → t := t - 1 ; q := q + c.j ; whiten j
[] {retransmit token} j = 0 ∧ t = j ∧ idle.j ∧ ¬(q + c.0 = 0 ∧ 0 is white) → t := N - 1 ; q := 0 ; whiten j

Iterated refinement
Proof: Invariant ≡ I1 ∧ (Q ∨ R ∨ S), where
  S ≡ (∃j : 0 ≤ j ∧ j ≤ t : j is black)
However, S is not preserved by one of the actions (the propagate action at a black node).
So we introduce a color for the token and get the final program.
program of process j
   {send msg}    → c.j := c.j + 1
[] {receive msg} → c.j := c.j - 1 ; blacken j
[] {propagate token}  j ≠ 0 ∧ t = j ∧ idle.j → t := t - 1 ; q := q + c.j ; if black j then blacken token ; whiten j
[] {retransmit token} j = 0 ∧ t = j ∧ idle.j ∧ ¬(q + c.0 = 0 ∧ token is white ∧ 0 is white) → t := N - 1 ; q := 0 ; whiten token ; whiten j
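The final program can be exercised with a small simulation. This is a sketch under stated assumptions, not the lecture's own code: the function `run`, the random scheduler, the `budget` that bounds the number of sends so that the underlying computation eventually terminates, and the initial state (all processes active and white, a black token at process 0) are all choices made for illustration. The token is modeled as the shared variables t, q and a token color, as in the program above.

```python
import random

WHITE, BLACK = "white", "black"

def run(n=5, budget=30, seed=1, max_steps=100_000):
    """Simulate the final program; return the step at which process 0 detects."""
    rng = random.Random(seed)
    idle = [False] * n          # assumption: every process starts active
    c = [0] * n                 # c[j]: sent by j minus received by j
    color = [WHITE] * n
    channels = []               # destination ids of in-flight messages
    t, q, token = 0, 0, BLACK   # assumption: black token forces a first full round

    for step in range(max_steps):
        # Collect every enabled action; a random scheduler picks one.
        moves = [("recv", k) for k in range(len(channels))]
        for j in range(n):
            if not idle[j]:
                moves.append(("go_idle", j))
                if budget > 0:
                    moves.append(("send", j))
        if idle[t]:
            moves.append(("token", t))
        kind, x = rng.choice(moves)

        if kind == "send":        # c.j := c.j + 1
            channels.append(rng.choice([d for d in range(n) if d != x]))
            c[x] += 1
            budget -= 1
        elif kind == "recv":      # c.j := c.j - 1 ; blacken j
            dest = channels.pop(x)
            c[dest] -= 1
            color[dest] = BLACK
            idle[dest] = False    # a received message activates the receiver
        elif kind == "go_idle":
            idle[x] = True
        elif x != 0:              # propagate token
            q += c[x]
            if color[x] == BLACK:
                token = BLACK
            color[x] = WHITE
            t -= 1
        elif q + c[0] == 0 and token == WHITE and color[0] == WHITE:
            # Detection: check the claim against the real global state.
            assert all(idle) and not channels
            return step
        else:                     # retransmit token
            t, q, token = n - 1, 0, WHITE
            color[0] = WHITE
    raise RuntimeError("no detection within max_steps")
```

Calling `run(seed=s)` for several seeds exercises different interleavings. The assertion at the detection point compares the detector's conclusion with the actual global state, which the processes themselves cannot observe.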

Termination Detection
Predicate: Termination ≡ (∀j :: idle.j) ∧ # of msgs sent - # of msgs received = 0
Invariant ≡ (Sum j :: c.j) = # of msgs sent - # of msgs received ∧ (Q ∨ R ∨ S ∨ T)
  Q ≡ (∀j : t < j ∧ j < N : idle.j) ∧ (q = (Sum j : t < j ∧ j < N : c.j))
  R ≡ q + (Sum j : 0 ≤ j ∧ j ≤ t : c.j) > 0
  S ≡ (∃j : 0 ≤ j ∧ j ≤ t : j is black)
  T ≡ token is black
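The four disjuncts can be written as executable predicates over a global state. This is an illustrative sketch: the `State` record and its field names are assumptions, and the two example states are hand-built to show a receive at an already visited process falsifying Q while R holds.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class State:
    t: int              # token location
    q: int              # counter carried by the token
    token_black: bool
    idle: List[bool]
    c: List[int]        # c[j]: sent by j minus received by j
    black: List[bool]

def Q(s: State) -> bool:
    visited = range(s.t + 1, len(s.c))          # t < j < N
    return all(s.idle[j] for j in visited) and s.q == sum(s.c[j] for j in visited)

def R(s: State) -> bool:
    return s.q + sum(s.c[j] for j in range(s.t + 1)) > 0   # 0 <= j <= t

def S(s: State) -> bool:
    return any(s.black[j] for j in range(s.t + 1))         # 0 <= j <= t

def T(s: State) -> bool:
    return s.token_black

def disjunction(s: State) -> bool:
    return Q(s) or R(s) or S(s) or T(s)

# N = 3, token at process 1. Process 2 was visited with c.2 = 1, and
# process 0 has sent one message to process 2 that is still in flight.
before = State(t=1, q=1, token_black=False,
               idle=[False, True, True], c=[1, 0, 1], black=[False] * 3)
# Process 2 receives that message: c.2 drops to 0, 2 is active and black.
after = State(t=1, q=1, token_black=False,
              idle=[False, True, False], c=[1, 0, 0], black=[False, False, True])
```

In `before`, Q holds. In `after`, Q fails because process 2 is active, but R holds since q + c.0 + c.1 = 2 > 0, so the disjunction is preserved.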

Proof of correctness
(1) Invariant ∧ t = 0 ∧ 0 is white ∧ idle.0 ∧ q + c.0 = 0 ∧ token is white ⇒ Termination
(2) Invariant ∧ Termination leads-to t = 0 ∧ 0 is white ∧ idle.0 ∧ q + c.0 = 0 ∧ token is white

Termination Detection
Proof of (1):
  0 is white ∧ t = 0 ⇒ ¬S
  q + c.0 = 0 ∧ t = 0 ⇒ ¬R
  token is white ⇒ ¬T
Hence the antecedent implies Invariant ∧ Q ∧ q + c.0 = 0.
With t = 0, Q states that every process j > 0 is idle and that q = (Sum j : 0 < j ∧ j < N : c.j). Together with idle.0 and q + c.0 = 0, all processes are idle and (Sum j :: c.j) = 0, so no messages are in the channels; i.e., the antecedent implies Termination.
Proof of (2):
If termination has occurred, only the propagation and retransmission actions can execute.
After the first complete traversal of the ring by the token, all processes are white and the token is white.
At the end of the next traversal, when t = 0, the algorithm detects the termination of the underlying computation.