Distributed Snapshots & Termination detection


1 Distributed Snapshots & Termination detection
Presented by Subashini Balachandran

2 What is a snapshot?
A snapshot of a distributed system is a global state in which the local states of all processes and of all communication channels are recorded simultaneously. Without a common clock, such a causally consistent state is difficult to obtain in a distributed system.

3 Where is a snapshot used?
Detecting deadlock in a distributed system.
Computing monotonic functions of the global state, such as lower bounds on the simulation time.
Checkpointing and recovery of distributed databases.
Monitoring and debugging of distributed systems.

4 Consistent and Inconsistent cuts
A cut is consistent if no message arrow starts in the future and ends in the past (e.g., cut AB). Otherwise it is inconsistent (e.g., cut CD).
[Figure: diagram with labels 1 to 4 showing cut AB and cut CD.]
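The definition above can be checked mechanically. The following is a minimal sketch, assuming each process numbers its local events from 0 and a cut is given as the number of events per process that lie in the past. The names `Message` and `is_consistent` are illustrative and do not come from the presentation.

```python
from typing import List, NamedTuple


class Message(NamedTuple):
    sender: int      # index of the sending process
    send_time: int   # local event index of the send
    receiver: int    # index of the receiving process
    recv_time: int   # local event index of the receive


def is_consistent(cut: List[int], messages: List[Message]) -> bool:
    """cut[p] is the number of local events of process p that lie in the past."""
    for m in messages:
        sent_in_future = m.send_time >= cut[m.sender]
        received_in_past = m.recv_time < cut[m.receiver]
        if sent_in_future and received_in_past:
            # A message arrow starts in the future and ends in the past.
            return False
    return True


# Two processes, two messages (assumed example data).
msgs = [Message(0, 1, 1, 3), Message(1, 0, 0, 2)]
good_cut = [2, 4]   # both sends lie in the past
bad_cut = [1, 4]    # first send is in the future, its receive in the past
```

With this data, `is_consistent(good_cut, msgs)` holds and `is_consistent(bad_cut, msgs)` does not.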

5 Consistent cut Algorithm
A consistent cut for non-FIFO systems is obtained by piggybacking a one-bit status onto basic messages:
Every process is initially white and turns red while taking a local snapshot.
Every message sent by a white (red) process is colored white (red).
Every process takes a local snapshot at its convenience, but before a red message is possibly received.

6 Example 1
For every process, the snapshot covers its events up to the point where its white phase ends.
[Figure: timelines of P1, P2, P3, P4.]

7 Example 2
The snapshot is taken before looking at or processing the red message.
[Figure: timelines of P1, P2, P3, P4.]

8 Consistent cut Algorithm (cont.)
The cut defined by the white events is consistent: no red message (sent after the cut) is received by a white process (before the cut).
For this to hold, a white process must be able to take a local snapshot at the moment it receives a red basic message.

9 Catching the messages in transit
Messages in transit are precisely the white messages that are received by a red process. So whenever a red process gets a white message, it can send a copy of it to the snapshot initiator.
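The coloring rules and the in-transit rule can be sketched together in a few lines. This is a minimal single-threaded sketch, not the presenter's implementation. The `Process` class, its integer `state`, and the `forwarded` list (standing in for copies sent to the initiator) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

WHITE, RED = "white", "red"


@dataclass
class Process:
    name: str
    state: int = 0                          # hypothetical application state
    color: str = WHITE
    local_snapshot: Optional[int] = None
    forwarded: List[int] = field(default_factory=list)  # copies for the initiator

    def take_snapshot(self) -> None:
        # A process records its state once and turns red while doing so.
        if self.color == WHITE:
            self.local_snapshot = self.state
            self.color = RED

    def send(self, payload: int) -> Tuple[str, int]:
        # Piggyback the sender's current color onto the basic message.
        return (self.color, payload)

    def receive(self, message: Tuple[str, int]) -> None:
        color, payload = message
        if color == RED:
            # Snapshot before processing a red message (no-op if already red).
            self.take_snapshot()
        elif self.color == RED:
            # White message received by a red process: it was in transit.
            self.forwarded.append(payload)
        self.state += payload


p1, p2 = Process("P1"), Process("P2")
white_msg = p1.send(5)     # sent before P1's snapshot
p1.take_snapshot()
red_msg = p1.send(7)       # sent after P1's snapshot
p2.receive(red_msg)        # non-FIFO: the red message overtakes the white one
p2.receive(white_msg)
```

In this run P2 records its snapshot (state 0) on seeing the red message, and the late white message with payload 5 ends up in `p2.forwarded` as an in-transit message.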

10 The Snapshot principle
After the snapshot initiator has received the local snapshots of all processes and the last copy of all in-transit messages, it knows that the snapshot is complete.
[Figure: P1, P2, P3, P4; labels: End, Local snapshot, Copy of messages in transit.]

11 Termination Detection
A process is considered active if it is white and passive otherwise. Only white messages are considered. The white computation has then terminated if:
no process is white, and
no white messages are in transit.
Problem: there is no direct way to determine when the last white message has been received.

12 Deficiency Counting TD
Each process keeps a counter as part of its process state:
count = (# of basic messages the process has sent) - (# of basic messages it has received from any other process)
Summing the counters recorded with the local snapshots gives the total number of messages in transit. Thus the end of the snapshot can be determined.
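As a rough illustration of how the initiator might use these counters, here is a minimal sketch. The `Initiator` class and its method names are hypothetical. It assumes every local snapshot carries the process's counter and that each in-transit message is copied to the initiator exactly once.

```python
from typing import List


class Initiator:
    def __init__(self, n: int) -> None:
        self.n = n                      # number of processes
        self.counters: List[int] = []   # sent - received, one per local snapshot
        self.copies = 0                 # copies of in-transit messages received

    def on_local_snapshot(self, counter: int) -> None:
        self.counters.append(counter)

    def on_in_transit_copy(self) -> None:
        self.copies += 1

    def expected_in_transit(self) -> int:
        # Every message received before the cut cancels one send,
        # so the sum is the number of messages still in transit.
        return sum(self.counters)

    def complete(self) -> bool:
        return (len(self.counters) == self.n
                and self.copies == self.expected_in_transit())


# Assumed example: P1 sent 2, received 0; P2 sent 1, received 1; P3 sent 0, received 1.
init = Initiator(3)
for counter in (2, 0, -1):
    init.on_local_snapshot(counter)
before = init.complete()     # one message is still outstanding
init.on_in_transit_copy()
after = init.complete()
```

Three messages were sent and two received before the cut, so the counters sum to 1 and the snapshot is complete only after one in-transit copy has arrived.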

13 Vector Counter Principle TD
Every process Pi counts the number of white messages it has sent to Pj (i ≠ j) in the j-th component of a local vector Vi of length n (n = number of processes).
When a white message is received, the process decrements its own component: Vi[i] = Vi[i] - 1.
A control vector C circulates on the ring, accumulates the local vectors, and resets them to zero: C = C + Vi; Vi = 0.

14 Vector Counter Principle TD
At the end of the first round, C[i] indicates the number of white messages that are in transit to Pi for the cut.
No new white messages are generated after that.
A second round is necessary if C[i] > 0 for some i.
In the second round the control vector waits at each Pi until all (white) in-transit messages have been received: Vi[i] + C[i] <= 0.
All in-transit messages are thereby collected, which guarantees termination after two control rounds.
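The vector bookkeeping can be sketched as follows. This is a simplified, single-threaded sketch with hypothetical names (`Process`, `visit`, `all_received`, `first_round`). It models only the counters and omits the coloring step, in which the first-round visit would also turn each process red.

```python
from typing import List


class Process:
    def __init__(self, i: int, n: int) -> None:
        self.i = i
        self.v = [0] * n          # local vector Vi

    def send_white(self, j: int) -> None:
        self.v[j] += 1            # one more white message sent to Pj

    def receive_white(self) -> None:
        self.v[self.i] -= 1       # Vi[i] = Vi[i] - 1

    def visit(self, c: List[int]) -> None:
        # C = C + Vi; Vi = 0
        for k in range(len(c)):
            c[k] += self.v[k]
        self.v = [0] * len(c)

    def all_received(self, c: List[int]) -> bool:
        # Second-round condition at Pi: Vi[i] + C[i] <= 0
        return self.v[self.i] + c[self.i] <= 0


def first_round(processes: List[Process]) -> List[int]:
    c = [0] * len(processes)
    for p in processes:           # the control vector travels around the ring
        p.visit(c)
    return c


# Assumed scenario with three processes (indices 0, 1, 2).
ps = [Process(i, 3) for i in range(3)]
ps[0].send_white(1); ps[1].receive_white()   # delivered before the round
ps[1].send_white(2); ps[2].receive_white()   # delivered before the round
ps[0].send_white(2)                          # still in transit
c = first_round(ps)
waiting = not ps[2].all_received(c)
ps[2].receive_white()                        # the late white message arrives
done = ps[2].all_received(c)
```

After the first round `c` is `[0, 0, 1]`, so one white message is still in transit to the third process. The control vector would wait there until that message arrives.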

15 Example
[Figure: processes P1, P2, P3, P4 and the accumulated control vector C; numeric labels in the figure: 1, 2, -1, 1, 1, 1.]

16 Conclusion
Presented a new algorithm for computing snapshots.
The basic idea is to use two colors, indicating the process states, to identify the past and the future.
Termination detection uses the vector counter method.

17 Thank You :-)

