Published by Charleen Richardson. Modified over 8 years ago.
1
Hierarchical Coordinated Checkpointing Protocol
Himadri Sekhar Paul, Arobinda Gupta, R. Badrinath
Dept. of Computer Sc. & Engg., Indian Institute of Technology, Kharagpur, INDIA 721302
@cse.iitkgp.ernet.in
2
Motivation
Long-running applications execute on distributed systems.
  – Metacomputer running over a WAN.
Such systems are prone to failure, so fault tolerance is important.
  – Checkpoint and recovery technique.
3
Motivation
The coordinated checkpointing protocol is a popular scheme.
The coordinated checkpointing protocol is bottlenecked by the slowest link in the network.
The Hierarchical Coordinated Checkpointing Protocol caters for heterogeneous link speeds, as in a WAN.
4
System Model
Nodes are fail-safe.
The network is immune to partitioning.
Links are unreliable.
All computing nodes are reachable from the others.
The network is hierarchically connected:
  – Clusters of computing nodes are realized by high-speed networks.
  – Clusters are interconnected by lower-speed networks.
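The two-level system model can be sketched as a small data structure. This is only an illustration: the names `Node`, `build_two_level_network`, `is_extra_cluster`, and `link_speed_mbps` are hypothetical, and the 10 Mbps and 1 Mbps speeds are borrowed from the simulation setup slide, not from the system model itself.

```python
from dataclasses import dataclass

# Assumed link speeds, taken from the simulation setup slide of this deck.
INTRA_CLUSTER_MBPS = 10.0
INTER_CLUSTER_MBPS = 1.0


@dataclass(frozen=True)
class Node:
    """A computing node tagged with the cluster it belongs to (hypothetical type)."""
    node_id: int
    cluster_id: int


def build_two_level_network(num_clusters, nodes_per_cluster):
    """Return all nodes of a two-level network, numbered cluster by cluster."""
    return [
        Node(node_id=c * nodes_per_cluster + i, cluster_id=c)
        for c in range(num_clusters)
        for i in range(nodes_per_cluster)
    ]


def is_extra_cluster(src, dst):
    """A message is extra-cluster when its endpoints lie in different clusters."""
    return src.cluster_id != dst.cluster_id


def link_speed_mbps(src, dst):
    """Speed of the slowest link on the path between two nodes, in Mbps."""
    return INTER_CLUSTER_MBPS if is_extra_cluster(src, dst) else INTRA_CLUSTER_MBPS
```

With two clusters of three nodes each, nodes 0 and 1 share a fast intra-cluster link, while nodes 0 and 3 communicate over the slower inter-cluster link.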
5
System Model
[Figure: clusters of computation nodes. Labels: Cluster, Computation Nodes.]
6
Flat Coordinated Checkpointing Protocol (2-phase commit)
[Figure: message timeline between Coordinator and Follower. Messages: Ckpt Rqst, Ack Ckpt Rqst, Ckpt Estb, Ack Ckpt Estb. Legend: Message, Checkpoint. Annotation: Process blocked.]
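The two-phase exchange named on this slide can be sketched in a few lines. This is a single-process simulation under stated assumptions, not the authors' implementation: the classes and functions are hypothetical, failures and timeouts are ignored, and the follower is assumed to stay blocked from receiving Ckpt Rqst until it receives Ckpt Estb.

```python
class Follower:
    """A follower process in the flat protocol (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.blocked = False
        self.tentative = None   # checkpoint taken in phase 1
        self.permanent = None   # checkpoint established in phase 2

    def on_ckpt_rqst(self, state):
        # Phase 1: take a tentative checkpoint and block (assumed blocking window).
        self.tentative = state
        self.blocked = True
        return "Ack Ckpt Rqst"

    def on_ckpt_estb(self):
        # Phase 2: make the tentative checkpoint permanent and unblock.
        self.permanent = self.tentative
        self.tentative = None
        self.blocked = False
        return "Ack Ckpt Estb"


def run_flat_checkpoint(followers, states):
    """Run one checkpoint round; return the (sender, receiver, message) trace."""
    trace = []
    # Phase 1: the coordinator requests a checkpoint from every follower.
    for f in followers:
        trace.append(("Coordinator", f.name, "Ckpt Rqst"))
        trace.append((f.name, "Coordinator", f.on_ckpt_rqst(states[f.name])))
    # Phase 2: once all acks are in, the coordinator establishes the checkpoint.
    for f in followers:
        trace.append(("Coordinator", f.name, "Ckpt Estb"))
        trace.append((f.name, "Coordinator", f.on_ckpt_estb()))
    return trace
```

Because phase 2 starts only after every follower has acknowledged phase 1, a single follower behind a slow link delays the whole round.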
7
Dept. of Computer Sc. & Engg. IIT Kharagpur Hierarchical Coordinated Checkpointing Protocol 7 7 AckCkpt_commit Message Checkpoint Initiator Follower Leader Ckpt_rqst AckCkpt_rqst Ckpt_estb Ckpt_commit Blocked Blocking at Extra-cluster msg AckCkpt_rqst AckCkpt_estb Ckpt_rqst AckCkpt_rqst Ckpt_estb AckCkpt_estbAckCkpt_commit
8
Simulation Result
Simulation setup:
  – Two-level network, with an intra-cluster link speed of 10 Mbps and an inter-cluster link speed of 1 Mbps.
  – The communication pattern of the application is random.
  – The fraction of extra-cluster application messages is varied.
(Flat = Flat Coordinated Checkpointing Protocol)
(Hier = Hierarchical Coordinated Checkpointing Protocol)
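A random communication pattern with a tunable extra-cluster fraction could be generated along the following lines. This is a guess at a reasonable workload generator, not the simulator used for these results: the function names, the uniform choice of source and destination, and the seeding scheme are all assumptions.

```python
import random


def random_message(rng, num_clusters, nodes_per_cluster, extra_fraction):
    """Pick a (src, dst) pair of (cluster, index) tuples.

    The destination lies in a different cluster with probability
    extra_fraction. Requires at least 2 clusters and 2 nodes per cluster.
    """
    src_cluster = rng.randrange(num_clusters)
    src_index = rng.randrange(nodes_per_cluster)
    if rng.random() < extra_fraction:
        # Extra-cluster message: choose any other cluster uniformly.
        dst_cluster = rng.choice(
            [c for c in range(num_clusters) if c != src_cluster])
        dst_index = rng.randrange(nodes_per_cluster)
    else:
        # Intra-cluster message: choose any other node in the same cluster.
        dst_cluster = src_cluster
        dst_index = rng.choice(
            [i for i in range(nodes_per_cluster) if i != src_index])
    return (src_cluster, src_index), (dst_cluster, dst_index)


def generate_workload(seed, count, num_clusters, nodes_per_cluster,
                      extra_fraction):
    """Return a reproducible list of `count` random application messages."""
    rng = random.Random(seed)
    return [
        random_message(rng, num_clusters, nodes_per_cluster, extra_fraction)
        for _ in range(count)
    ]
```

Sweeping `extra_fraction` from 0 to 1 moves the workload from purely intra-cluster traffic to purely extra-cluster traffic, which is the parameter the setup describes as varied.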
9
Simulation Result
[Figure]
10
Conclusion & Future Work
In a two-level hierarchical network, the hierarchical checkpointing protocol incurs less latency than the flat checkpointing protocol, even for very high communication intensity.
The protocol can be extended to a generic hierarchical network.