Published by Nathanial Wellings. Modified over 10 years ago.
1
Distributed Snapshots: Non-blocking checkpoint coordination protocol
Next: Uncoordinated Chkpnt
2
Uncoordinated
Processes take checkpoints independently.
Domino Effect! (Rolling back one process can force cascading rollbacks in the others.)
Next: Coordinated Blocking Chkpnt
3
Coordinated Blocking
Processes are coordinated to form a consistent global state, and …
[Figure: initiator and processes p1, p2, p3; labels "Ready!", "Go!", and "okay, channels flushed"]
Next: Coordinated Blocking Chkpnt (cont’)
4
Coordinated Blocking (cont’)
Advantages:
  Always consistent
  No Domino Effect
  Less storage overhead
Disadvantage:
  Large latency to checkpoint!
Next: Coordinated Non-blocking Chkpnt
5
Coordinated Non-blocking
Processes are coordinated, but …
Do we really need to block …?
[Figure: K. Mani Chandy, Leslie Lamport]
Next: Global-state Recording Algorithm
6
Global-state Recording Alg.
Step 1: process states
Step 2: channel states
Step 3: end of the algorithm
“Distributed snapshots: determining global states of distributed systems”, K. Mani Chandy and Leslie Lamport
Next: Model of Distributed System
7
Model of Distributed System
Processes
Channels: directed, FIFO, error-free
[Figure: processes p, q, r connected by channels c1, c2, c3, c4]
Next: Step 1, process states
8
Step 1: process states
Initiator:
  Save its local state
  Send marker tokens on all outgoing edges
All other processes:
  On receiving the first marker on any incoming edge, save state and propagate markers on all outgoing edges
  Resume execution. Further markers are consumed and otherwise ignored.
Next: Example
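Sketch (not from the slides): the Step 1 rules above can be tried out in a small Python simulation. It models markers only, with no application messages, and delivers them channel by channel in a fixed order. The function name `record_process_states` and the four-channel topology are assumptions made for illustration; the topology in the slide's figure is not recoverable from the transcript.

```python
from collections import deque

MARKER = "MARKER"

def record_process_states(channels, initiator):
    """Return the order in which processes save their state.

    channels: list of (src, dst) pairs, each a directed FIFO channel.
    """
    queues = {ch: deque() for ch in channels}
    saved = []  # processes that have checkpointed, in order

    def checkpoint(proc):
        saved.append(proc)  # stands in for "save local state"
        for src, dst in channels:
            if src == proc:
                queues[(src, dst)].append(MARKER)  # marker on every outgoing edge

    checkpoint(initiator)
    delivered = True
    while delivered:
        delivered = False
        for ch in channels:
            if queues[ch]:
                queues[ch].popleft()
                delivered = True
                receiver = ch[1]
                if receiver not in saved:
                    checkpoint(receiver)  # first marker: save and propagate
                # any later marker is simply dropped

    return saved

# Hypothetical topology, chosen only for this example.
topology = [("p", "q"), ("q", "p"), ("q", "r"), ("r", "p")]
order = record_process_states(topology, "p")
print(order)
```

Every process reachable from the initiator checkpoints exactly once, regardless of how many markers it receives afterwards.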
9
Example
[Figure: processes p, q, r and channels c1, c2, c3, c4; an initiator; timelines of p, q, r showing markers and checkpoints (x)]
Next: Proof
10
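Proof
Let us assume that a message m exists that makes our cut inconsistent, that is, m is sent after the sender's checkpoint but received before the receiver's checkpoint.
[Figure: timelines of p and q with checkpoints (x) and message m]
Next: Proof (cont’)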
11
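Proof (cont’)
(1) x1 is the 1st marker for process q
(2) x1 is not the 1st marker for process q
Contradicts the assumption: channels are FIFO, so in either case q has saved its state before m arrives.
[Figure: timelines of p and q with checkpoints (x), markers x1 and x2, and message m; "[Incomplete page]"]
Next: Step 2, channel states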
12
Step 2: channel states
In-flight messages:
  Sent along the channel before the sender’s checkpoint
  Received along the channel after the receiver’s checkpoint
[Figure: timelines of p and q]
Next: Example
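Sketch (not from the slides): the two conditions above can be written as a filter over the messages of one channel. The tuple layout, the function name `in_flight`, and the sample times are assumptions for illustration; send times are read on the sender's local clock and receive times on the receiver's local clock, so the two checkpoints are never compared with each other.

```python
def in_flight(messages, sender_checkpoint, receiver_checkpoint):
    """Select the messages that belong to the recorded channel state.

    messages: list of (label, send_time, receive_time) for one channel.
    """
    return [
        label
        for label, sent, received in messages
        if sent < sender_checkpoint and received > receiver_checkpoint
    ]

# Hypothetical trace of one channel.
msgs = [("m1", 1, 2), ("m2", 3, 6), ("m3", 5, 7)]
result = in_flight(msgs, sender_checkpoint=4, receiver_checkpoint=5)
print(result)
```

Here m1 is already reflected in the receiver's saved state, and m3 was sent after the sender's checkpoint, so only m2 is part of the channel state.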
13
Example
(1) p is receiving messages
(2) p has just saved its state
[Figure: p, q, r, s, t, u with messages 1 to 8 in panel (1) and messages 4 to 8 in panel (2); checkpoints marked x]
Next: Example (cont’)
14
Example (cont’)
p’s checkpoint is triggered by a marker from q
[Figure: p, q, r, s, t, u with messages 1 to 8; checkpoints marked x]
Next: Algorithm (revised)
15
Algorithm (revised)
Initiator:
  Save its local state
  Send marker tokens on all outgoing edges
All other processes:
  On receiving the first marker on any incoming edge, save state and propagate markers on all outgoing edges
  Resume execution, but also record the messages arriving on each incoming channel until a marker arrives through that channel
Guarantees a consistent global state!
Next: Step 3, end of the algorithm
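Sketch (not from the slides): a small Python simulation of the revised algorithm, including channel recording. It assumes a toy application in which processes hold a balance and send amounts to each other, so that consistency can be checked by adding up the snapshot. The class `SnapshotSim`, its methods, and the two-process scenario in `run_example` are all hypothetical names chosen for this illustration.

```python
from collections import deque

MARKER = "MARKER"

class SnapshotSim:
    def __init__(self, balances, channels):
        self.balance = dict(balances)                    # live process states
        self.queues = {ch: deque() for ch in channels}   # (src, dst) -> FIFO
        self.saved = {}       # process -> checkpointed state
        self.recording = {}   # channel -> messages logged so far
        self.in_flight = {}   # channel -> final recorded channel state

    def send(self, src, dst, amount):
        self.balance[src] -= amount
        self.queues[(src, dst)].append(amount)

    def checkpoint(self, proc, via=None):
        """Save proc's state; via is the channel whose marker triggered it."""
        self.saved[proc] = self.balance[proc]
        for ch in self.queues:
            src, dst = ch
            if dst == proc:
                if ch == via:
                    self.in_flight[ch] = []   # marker came first: nothing in flight
                else:
                    self.recording[ch] = []   # start logging this channel
            if src == proc:
                self.queues[ch].append(MARKER)

    def deliver(self, ch):
        msg = self.queues[ch].popleft()
        receiver = ch[1]
        if msg == MARKER:
            if receiver not in self.saved:
                self.checkpoint(receiver, via=ch)
            else:
                # Marker closes the channel: the log is its recorded state.
                self.in_flight[ch] = self.recording.pop(ch)
        else:
            self.balance[receiver] += msg
            if ch in self.recording:
                self.recording[ch].append(msg)

def run_example():
    sim = SnapshotSim({"p": 100, "q": 50}, [("p", "q"), ("q", "p")])
    sim.send("q", "p", 10)     # in flight when p checkpoints
    sim.checkpoint("p")        # p is the initiator
    sim.send("p", "q", 5)      # sent after p's checkpoint, behind the marker
    sim.deliver(("q", "p"))    # p logs the 10 as in-flight
    sim.deliver(("p", "q"))    # marker: q checkpoints
    sim.deliver(("p", "q"))    # the 5 arrives after q's checkpoint
    sim.deliver(("q", "p"))    # q's marker closes channel q -> p
    return sim
```

In this scenario the saved states plus the recorded in-flight messages add up to the initial total of 150, even though no process ever stopped sending or receiving.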
16
Step 3: end of the algorithm
Did every process save its state and in-flight messages?
Direct channel to the initiator?
Spanning tree?
General solution?
[Figure: processes p, q, r and the initiator]
Next: References
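Sketch (not from the slides): the slide leaves open how the initiator learns that the snapshot is finished. The check below only covers the condition itself: a process is done once it has saved its state and has received a marker on every incoming channel. How that fact travels to the initiator (direct channel, spanning tree, or something else) is not modeled. The function names and the sample data are assumptions for illustration.

```python
def locally_done(proc, channels, saved, closed):
    """True once proc has checkpointed and every incoming channel is closed.

    saved:  set of processes that have saved their state
    closed: set of (src, dst) channels on which the marker has arrived
    """
    incoming = [ch for ch in channels if ch[1] == proc]
    return proc in saved and all(ch in closed for ch in incoming)

def snapshot_complete(processes, channels, saved, closed):
    """True once every process reports local completion."""
    return all(locally_done(p, channels, saved, closed) for p in processes)

# Hypothetical two-process system.
procs = ["p", "q"]
chans = [("p", "q"), ("q", "p")]
partial = snapshot_complete(procs, chans, {"p", "q"}, {("p", "q")})
finished = snapshot_complete(procs, chans, {"p", "q"}, {("p", "q"), ("q", "p")})
print(partial, finished)
```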
17
References
K. Mani Chandy and Leslie Lamport, “Distributed snapshots: determining global states of distributed systems”, ACM Transactions on Computer Systems 3(1), 1985.