Elections in a Distributed Computing System
Hector Garcia-Molina
Presenter: Srinath Rao
Introduction to Elections
Strategies to deal with a node failure
–Have software that can operate continuously even as failures occur
–Halt temporarily and reorganize the system
Need for a coordinator, and hence for an election
Election protocols can also be used to start up a system or to add/remove nodes
Issues?
Issues
Constituent nodes may fail after the election
What does it mean to be a coordinator?
How to cope with failures during the election itself?
Is it always possible to select a unique coordinator?
Might wish to have more than one coordinator
Outline
Assumptions
Elections with no communication failures
–The Bully Election Algorithm
Elections with communication failures
–The Invitation Election Algorithm
Related Work
Conclusions
Assumptions
1. All nodes cooperate
2. The election algorithm makes use of “bug-free” software facilities
3. The communication subsystem will not spontaneously generate messages
4. Nodes have “safe” storage cells
5. A node halts processing when it fails
6. No transmission errors
Assumptions (contd.)
7. Messages are processed in the order they are received
8. The communication subsystem does not fail
9. A node never pauses
State Vector of a Node
A collection of safe storage cells
Principal components of the vector S(i)
–Status of node i: S(i).s, one of Down, Election, Reorganization, Normal
–Coordinator according to node i: S(i).c
–Definition of the task being performed: S(i).d
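The state vector can be pictured as a small record per node. The sketch below is only an illustration: the class names `Status` and `StateVector`, the default values, and the task name "replicate" are assumptions, not part of the paper.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    """The four node statuses listed on the slide."""
    DOWN = "Down"
    ELECTION = "Election"
    REORGANIZATION = "Reorganization"
    NORMAL = "Normal"


@dataclass
class StateVector:
    """Hypothetical stand-in for the safe storage cells S(i) of one node."""
    s: Status = Status.DOWN      # S(i).s: status of node i
    c: Optional[int] = None      # S(i).c: coordinator according to node i
    d: Optional[str] = None      # S(i).d: definition of the task being performed


# Example: node 3 believes node 5 is the coordinator and runs a task
# called "replicate" (an invented task name).
S3 = StateVector(s=Status.NORMAL, c=5, d="replicate")
print(S3.s.value, S3.c, S3.d)
```

A freshly created `StateVector()` starts in the Down status here; that default is a modelling choice for the sketch.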
Desired Characteristics
Assertion 1: For any two nodes i and j
–S(i).c = S(j).c if both i and j are in one of the states “Normal” or “Reorganization”
–S(i).d = S(j).d if both i and j are in the “Normal” state
Assertion 1 states what it means to be a coordinator
Desired Characteristics (contd.)
Assertion 2: If no failures occur, the election will eventually transform a system in any state to a state where:
–There exists a node i with S(i).s = “Normal” and S(i).c = i
–All other active nodes j have S(j).s = “Normal” and S(j).c = i
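Assertions 1 and 2 can be read as predicates over a snapshot of all state vectors. The following is a minimal sketch under that reading; the function names and the tuple layout `(status, coordinator, task)` are assumptions made for illustration.

```python
from itertools import combinations

SETTLED = {"Normal", "Reorganization"}


def holds_assertion_1(states):
    """states maps node id -> (status, coordinator, task)."""
    for i, j in combinations(states, 2):
        si, ci, di = states[i]
        sj, cj, dj = states[j]
        # Nodes in Normal or Reorganization must agree on the coordinator.
        if si in SETTLED and sj in SETTLED and ci != cj:
            return False
        # Nodes in Normal must also agree on the task definition.
        if si == "Normal" and sj == "Normal" and di != dj:
            return False
    return True


def is_elected_state(states):
    """Target state of Assertion 2: one coordinator, all active nodes Normal."""
    active = {n: v for n, v in states.items() if v[0] != "Down"}
    coordinators = [n for n, (s, c, _) in active.items()
                    if s == "Normal" and c == n]
    if len(coordinators) != 1:
        return False
    i = coordinators[0]
    return all(s == "Normal" and c == i for s, c, _ in active.values())


good = {1: ("Normal", 3, "t"), 2: ("Normal", 3, "t"),
        3: ("Normal", 3, "t"), 4: ("Down", None, None)}
split = {1: ("Normal", 1, "t"), 2: ("Reorganization", 2, None)}
print(holds_assertion_1(good), is_elected_state(good))
print(holds_assertion_1(split), is_elected_state(split))
```

In the `split` snapshot two nodes each believe in a different coordinator, which is exactly the situation Assertion 1 rules out.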
The Bully Election Algorithm
Each node has a unique id number
The algorithm uses the id numbers as priorities
Two-step algorithm
1. Node i tries to contact all nodes with higher priority. If no reply is received, it assumes the role of coordinator
2. Node i informs all the lower-priority nodes j
–Send a “halt” message, forcing the state of j to “Election”
–Send an “I am elected” message; node j sets S(j).c = i and S(j).s = “Reorganization”
–Distribute the new algorithms to the nodes; all statuses change to “Normal”
Bully (contd.)
A recovering node k attempts to become the coordinator using the same algorithm
–Halts all lower-priority nodes that may be in the process of becoming coordinators
–Step 1 ensures no conflict with higher-priority nodes
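The two steps of the Bully algorithm can be sketched as a single-process simulation. This is a simplified illustration, not the paper's protocol code: message timeouts are replaced by a membership test on the set `up`, no node fails during the election, and the function name, the dictionary layout, and the task name "task-v1" are assumptions.

```python
def bully_elect(initiator, up, state, task="task-v1"):
    """Run an election started by `initiator`; return the new coordinator.

    up    -- set of ids of nodes that are currently running
    state -- dict: node id -> {"s": status, "c": coordinator, "d": task}
    """
    # Step 1: contact all higher-priority nodes. A node in `up` stands in
    # for "a reply was received before the timeout".
    higher = sorted(j for j in up if j > initiator)
    if higher:
        # Someone with higher priority answered: the initiator backs off
        # and that node runs its own election.
        return bully_elect(higher[0], up, state, task)

    # Step 2: no reply, so the initiator takes over the lower-priority nodes.
    group = [j for j in up if j < initiator] + [initiator]
    for j in group:                      # "halt": force status Election
        state[j] = {"s": "Election", "c": None, "d": None}
    for j in group:                      # "I am elected"
        state[j].update(s="Reorganization", c=initiator)
    for j in group:                      # distribute the task, go Normal
        state[j].update(s="Normal", d=task)
    return initiator


state = {n: {"s": "Down", "c": None, "d": None} for n in range(1, 6)}
up = {1, 2, 4}                           # nodes 3 and 5 have failed
winner = bully_elect(1, up, state)
print(winner, state[1], state[5])
```

Whichever running node starts the election, the highest-numbered running node ends up as coordinator, which is where the name "Bully" comes from.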
Discussion
Failures
–Partitioning of nodes
–A node can only send or only receive messages
–Node i and node j can each talk to node k but not to each other
–A node may pause and then resume
Observation: Consensus is impossible in the event of a communication subsystem failure or a node pausing
Inference: Redefine the meaning of an election
Discussion (contd.)
Notion of a group of nodes and a group id
Node i stores its group id in its state vector: S(i).g
Nodes are free to change groups
Messages are identified by group id
The coordinator is unique within a group
New Desired Characteristics
Assertion 3: For any two nodes i and j
–S(i).c = S(j).c if i and j are in the same group and both are in one of the states “Normal” or “Reorganization”
–S(i).d = S(j).d if i and j are in the same group and both are in the “Normal” state
Desired Characteristics (contd.)
No requirements for nodes with only one-way communication
Assertion 4: If no failures occur, the election will eventually transform a set of nodes R that have two-way communication, starting in any state, to a state where:
–There exists a node i with S(i).s = “Normal” and S(i).c = i
–All other active nodes j in R have S(j).s = “Normal”, S(j).c = i, and S(j).g = S(i).g
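The effect of adding the group id can be seen by checking Assertion 3 on a partitioned system. The sketch below is an illustration only; the function name, the dictionary keys, and the group labels "gA" and "gB" are assumptions.

```python
from itertools import combinations

SETTLED = {"Normal", "Reorganization"}


def holds_assertion_3(states):
    """states maps node id -> {"s": status, "c": coord, "d": task, "g": group}."""
    for a, b in combinations(states, 2):
        i, j = states[a], states[b]
        if i["g"] != j["g"]:
            continue  # Assertion 3 says nothing about nodes in different groups
        if i["s"] in SETTLED and j["s"] in SETTLED and i["c"] != j["c"]:
            return False
        if i["s"] == "Normal" and j["s"] == "Normal" and i["d"] != j["d"]:
            return False
    return True


# Two partitions, each with its own group and its own coordinator.
partitioned = {
    1: {"s": "Normal", "c": 2, "d": "t", "g": "gA"},
    2: {"s": "Normal", "c": 2, "d": "t", "g": "gA"},
    3: {"s": "Normal", "c": 4, "d": "t", "g": "gB"},
    4: {"s": "Normal", "c": 4, "d": "t", "g": "gB"},
}
# Same group id but two coordinators: not allowed.
conflicting = {
    1: {"s": "Normal", "c": 1, "d": "t", "g": "gA"},
    2: {"s": "Normal", "c": 2, "d": "t", "g": "gA"},
}
print(holds_assertion_3(partitioned), holds_assertion_3(conflicting))
```

The `partitioned` snapshot has two coordinators in total, so it would fail the system-wide Assertion 1, yet it satisfies Assertion 3 because each coordinator is unique within its own group.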
The Invitation Election Algorithm
A node “invites” other nodes to join it in forming a new group
A node may accept or decline an invitation
Make a receiving node form a new group with itself as the coordinator and the only member
Objective: to merge groups
Coordinators periodically send “invite” messages
Can be used instead of the Bully algorithm
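The merging idea can be sketched as follows. This is a heavily simplified illustration rather than the algorithm from the paper: every reachable coordinator accepts the invitation, members of an accepting group are assumed to be reachable too, and the names `Node`, `merge_groups`, `can_reach`, and the `(node id, counter)` form of the group id are all assumptions.

```python
from itertools import count

_fresh = count(1)  # hypothetical counter used to make group ids unique


class Node:
    def __init__(self, node_id):
        self.id = node_id
        # Start as a singleton group: own coordinator, only member.
        self.g = (node_id, next(_fresh))   # S(i).g
        self.c = node_id                   # S(i).c
        self.members = {node_id}           # kept only by coordinators


def merge_groups(nodes, inviter_id, can_reach):
    """The coordinator `inviter_id` invites every coordinator it can reach."""
    inviter = nodes[inviter_id]
    assert inviter.c == inviter_id, "only coordinators send invitations"
    new_g = (inviter_id, next(_fresh))
    joined = set(inviter.members)
    for other in nodes.values():
        is_other_coordinator = other.c == other.id and other.id != inviter_id
        if is_other_coordinator and can_reach(inviter_id, other.id):
            joined |= other.members        # the accepting group comes along
    for n in joined:
        nodes[n].g, nodes[n].c, nodes[n].members = new_g, inviter_id, set()
    inviter.members = joined
    return new_g


nodes = {n: Node(n) for n in (1, 2, 3, 4)}
same_side = lambda a, b: (a <= 2) == (b <= 2)     # partition {1,2} | {3,4}
merge_groups(nodes, 2, same_side)
merge_groups(nodes, 4, same_side)
during_partition = {n: nodes[n].c for n in nodes}
merge_groups(nodes, 4, lambda a, b: True)         # the partition has healed
after_heal = {n: nodes[n].c for n in nodes}
print(during_partition, after_heal)
```

During the partition each side has its own group and coordinator; once communication is restored, a further round of invitations merges them into one group.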
Related Work
Scott D. Stoller (2000)
–Modifies the Bully algorithm to work with crash failures
–Points out a flaw and proposes a new specification
Gurdip Singh (1996)
–Proposes an algorithm for leader election in the presence of link failures
Conclusions
The meaning of an election depends on the possible types of failures
The paper studied elections in two representative failure environments
It postulated assertions that define the concept of an election
It presented an election algorithm for each environment