Hwajung Lee. -- How many messages are in transit on the internet? --What is the global state of a distributed system of N processes? How do we compute.

Slides:



Advertisements
Similar presentations
Distributed Snapshots: Determining Global States of Distributed Systems - K. Mani Chandy and Leslie Lamport.
Advertisements

Global States.
Distributed Snapshots: Determining Global States of Distributed Systems Joshua Eberhardt Research Paper: Kanianthra Mani Chandy and Leslie Lamport.
Global States and Checkpoints
Distributed Computing 5. Snapshot Shmuel Zaks ©
Scalable Algorithms for Global Snapshots in Distributed Systems
SES Algorithm SES: Schiper-Eggli-Sandoz Algorithm. No need for broadcast messages. Each process maintains a vector V_P of size N - 1, N the number of processes.
Uncoordinated Checkpointing The Global State Recording Algorithm.
Uncoordinated Checkpointing The Global State Recording Algorithm Cristian Solano.
Time and Global States Part 3 ECEN5053 Software Engineering of Distributed Systems University of Colorado, Boulder.
CS542 Topics in Distributed Systems Diganta Goswami.
Distributed Computing 5. Snapshot Shmuel Zaks ©
OSU CIS Lazy Snapshots Nigamanth Sridhar and Paul A.G. Sivilotti Computer and Information Science The Ohio State University
Distributed Snapshot (continued)
S NAPSHOT A LGORITHM. W HAT IS A S NAPSHOT - INTUITION Given a system of processors and communication channels between them, we want each processor to.
CS514: Intermediate Course in Operating Systems Professor Ken Birman Vivek Vishnumurthy: TA.
Causality & Global States. P1 P2 P Physical Time 4 6 Include(obj1 ) obj1.method() P2 has obj1 Causality violation occurs when order.
Ordering and Consistent Cuts Presented By Biswanath Panda.
CMPT 431 Dr. Alexandra Fedorova Lecture VIII: Time And Global Clocks.
Distributed Systems Fall 2009 Logical time, global states, and debugging.
Ordering and Consistent Cuts Presented by Chi H. Ho.
Cloud Computing Concepts
Computer Science Lecture 10, page 1 CS677: Distributed OS Last Class: Clock Synchronization Physical clocks Clock synchronization algorithms –Cristian’s.
Chapter 5.
CIS 720 Distributed algorithms. “Paint on the forehead” problem Each of you can see other’s forehead but not your own. I announce “some of you have paint.
Distributed Computing 5. Snapshot Shmuel Zaks ©
UBI529 Distributed Algorithms
Chapter 9 Global Snapshot. Global state  A set of local states that are concurrent with each other Concurrent states: no two states have a happened before.
Lecture 6-1 Computer Science 425 Distributed Systems CS 425 / ECE 428 Fall 2013 Indranil Gupta (Indy) September 12, 2013 Lecture 6 Global Snapshots Reading:
Distributed Snapshot. Think about these -- How many messages are in transit on the internet? --What is the global state of a distributed system of N processes?
Distributed Systems Fall 2010 Logical time, global states, and debugging.
CSE 486/586, Spring 2013 CSE 486/586 Distributed Systems Global States Steve Ko Computer Sciences and Engineering University at Buffalo.
Distributed Snapshot. One-dollar bank Let a $1 coin circulate in a network of a million banks. How can someone count the total $ in circulation? If not.
Hwajung Lee. -- How many messages are in transit on the internet? --What is the global state of a distributed system of N processes? How do we compute.
Networks and Distributed Snapshot Sukumar Ghosh Department of Computer Science University of Iowa.
Global State Collection
D ISTRIBUTED S YSTEM UNIT-2 Theoretical Foundation for Distributed Systems Prepared By: G.S.Mishra.
Fault tolerance and related issues in distributed computing Shmuel Zaks GSSI - Feb
CSE 486/586 CSE 486/586 Distributed Systems Global States Steve Ko Computer Sciences and Engineering University at Buffalo.
Hwajung Lee. Some applications - computing network topology - termination detection - deadlock detection Chandy Lamport algorithm does a partial job.
1 Chapter 11 Global Properties (Distributed Termination)
Efficient Algorithms for Distributed Snapshots and Global Virtual Time Approximation Author: Friedermann Mattern Presented By: Shruthi Koundinya.
Token-passing Algorithms Suzuki-Kasami algorithm The Main idea Completely connected network of processes There is one token in the network. The holder.
Distributed Systems Lecture 6 Global states and snapshots 1.
Global state and snapshot
Consistent cut A cut is a set of events.
Global state and snapshot
Lecture 3: State, Detection
CSE 486/586 Distributed Systems Global States
Theoretical Foundations
Distributed Snapshots & Termination detection
ITEC452 Distributed Computing Lecture 9 Global State Collection
Distributed Snapshot.
Distributed Snapshot.
湖南大学-信息科学与工程学院-计算机与科学系
Time And Global Clocks CMPT 431.
Global State Collection
Distributed Snapshot Distributed Systems.
Uncoordinated Checkpointing
Slides for Chapter 11: Time and Global State
ITEC452 Distributed Computing Lecture 8 Distributed Snapshot
Distributed Snapshot.
Chap 5 Distributed Coordination
CSE 486/586 Distributed Systems Global States
Jenhui Chen Office number:
Distributed algorithms
CIS825 Lecture 5 1.
Consistent cut If this is not true, then the cut is inconsistent
Chandy-Lamport Example
Distributed Snapshot.
Presentation transcript:

Hwajung Lee

-- How many messages are in transit on the internet? --What is the global state of a distributed system of N processes? How do we compute these?

Let a $1 coin circulate in a network of a million banks. How can someone count the total $ in circulation? If not counted “properly,” then one may think the total $ in circulation to be one million.

Major uses in - deadlock detection - termination detection - rollback recovery - global predicate computation

(a  consistent cut C)  (b happened before a)  b  C If this is not true, then the cut is inconsistent a b c d g m e f k i h j Cut 1Cut 2 A cut is a set of events. (Not consistent) (Consistent) P1 P2 P3

 The set of states immediately following a consistent cut forms a consistent snapshot of a distributed system.  A snapshot that is of practical interest is the most recent one. Let C1 and C2 be two consistent cuts and C1  C2. Then C2 is more recent than C1.

 The set of states immediately following a consistent cut forms a consistent snapshot of a distributed system.  A snapshot that is of practical interest is the most recent one. Let C1 and C2 be two consistent cuts and C1  C2. Then C2 is more recent than C1.  Analyze why certain cuts in the one-dollar bank are inconsistent.

How to record a consistent snapshot? Note that 1. The recording must be non-invasive 2. Recording must be done on-the-fly. You cannot stop the system.

Works on a (1)strongly connected graph (2) each channel is FIFO. A process called the initiator initiates the distributed snapshot algorithm by sending out a marker ( )

Initially every process is white. When a process receives a marker, it turns red if it has not already done so. Every action by a process, and every message sent by a process gets the color of that process.

Step 1. In one atomic action, the initiator (a) turns red (b) records its own state (c) sends a marker along all outgoing channels Step 2. Every other process, upon receiving a marker for the first time (and before doing anything else) (a) turns red (b) records its own state (c) sends markers along all outgoing channels The algorithm terminates when (1) every process turns red, and (2) every process has received a marker through each incoming channel.

Lemma 1. No red message is received by a white action.

Theorem. The Chandy-Lamport algorithm records a consistent global state. The global state recorded by Chandy- Lamport algorithm is equivalent to the ideal snapshot state SSS. Hint. A pair of actions (a, b) can be scheduled in any order, if there is no causal order between them, so (a, b) is equivalent to (b, a) SSS Easy conceptualization of the snapshot state All whiteAll red

Let an observer see the following actions: w[i] w[k] r[k] w[j] r[i] w[l] r[j] r[l] …  w[i] w[k] w[j] r[k] r[i] w[l] r[j] r[l] …[Lemma 1]  w[i] w[k] w[j] r[k] w[l] r[i] r[j] r[l] …[Lemma 1]  w[i] w[k] w[j] w[l] r[k] r[i] r[j] r[l] …[done!] Recorded state Call it “break,” when a red action precedes a white action in a given run. The idea view of the schedule has zero breaks. But in general, The number of breaks in the recorded sequence of actions can be a positive number. A swap reduces the number of breaks by 1.