Leader in a clique in O(n log(n)) messages. By Pierre A. Humblet, with the help of Miki Asa.


Rules of the game
Initialization: each node has edges {1..N-1} plus an edge (-1) to itself; master(id) = id, state = active.
Nodes attempt to capture other nodes (to increase the size of their domain).
A node that has been captured stops capturing, and its state = stopped.
A node can be captured by many other nodes but has only one master → the last one that captured it.
A node that has captured all the nodes → leader.

How it's done
Two messages are used.
TEST(size,id): an attempt to capture another node.
    A TEST from a node I didn't capture → forward it to my master.
    A TEST from a node I captured → answer it with WINNER(id).
WINNER(id): the result of the fight → id won the fight.
    The winner is the node with the bigger domain (if the sizes are equal, the one with the bigger id).
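The fight rule above amounts to a lexicographic comparison of (size, id) pairs, which Python tuples do natively. A minimal sketch; the function name `fight_winner` is hypothetical and not taken from the slides:

```python
def fight_winner(challenger, defender):
    """Return the id that wins a fight.

    Each argument is a (size, id) pair. The challenger wins only if its
    pair is strictly greater: bigger domain first, bigger id on a tie.
    """
    return challenger[1] if challenger > defender else defender[1]
```

For example, a challenger with pair (2, 3) beats a defender with pair (1, 9) because its domain is bigger, while with equal sizes the bigger id decides.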

Two-node example: A tests B
Before: A: state = active, master = A. B: state = active, master = B.

Case "B wins":
A → B : TEST(size(A),id(A))
B → B : TEST(size(A),id(A))
B → B : WINNER(B)
After: A: state = active, master = A (waiting, its TEST is never answered). B: state = active, master = B.

Case "A wins":
A → B : TEST(size(A),id(A))
B → B : TEST(size(A),id(A))
B → B : WINNER(A)
B → A : WINNER(A)
After: A: state = active, master = A. B: state = stop, master = A.

Three-node example: C tests B, which was already captured by A
Before: C: state = active, master = C. B: state = stop, master = A. A: state = active, master = A.

Case "A wins":
C → B : TEST(size(C),id(C))
B → A : TEST(size(C),id(C))
A → B : WINNER(A)
After: C: state = active, master = C (waiting). B: state = stop, master = A. A: state = active, master = A.

Case "C wins":
C → B : TEST(size(C),id(C))
B → A : TEST(size(C),id(C))
A → B : WINNER(C)
B → C : WINNER(C)
After: C: state = active, master = C. B: state = stop, master = C. A: state = stop, master = A.

Variables
STATE: active: won all its fights so far; stopped: lost a fight → won't be a leader.
SIZE: number of nodes I captured (those that sent me WINNER(ID)), including me (the size of my domain).
MASTER: the id of my master.
EDGE(id): the edge leading to node id.

Dealing with messages
PENDING_SET: messages waiting to be forwarded to the master.
PENDING: number of messages in PENDING_SET.

Initialization
Each node with identity ID that wakes up acts as if it had received WINNER(ID) on an edge > 0.

Receiving WINNER(id) on edge e
If (id = ID)                              // I won the fight
{
    if (e > 0 & STATE = active)           // a fight that I initiated
    {
        SIZE++
        if SIZE = N                       // I captured all the nodes
            stop                          // announce elected
        else
            send TEST(SIZE,ID) on edge SIZE   // continue
    }
    else                                  // I didn't initiate the fight
        do nothing
}

Receiving WINNER(id) (cont.)
Else                                      // id != ID
{
    if (MASTER != id)                     // my master lost
    {
        MASTER = id                       // change the master
        send WINNER(id) on EDGE(id)       // inform the new master
    }
    else                                  // my master won
        do nothing
    PENDING--                             // continue as a slave
    if PENDING > 0
        send the next TEST from PENDING_SET on EDGE(MASTER)
}

Receiving TEST(size,id) on edge e
If e < SIZE                               // message from my domain
{
    if (size,id) > (SIZE,ID)              // I lost this fight
    {
        STATE = stopped                   // out of the game
        send WINNER(id) on edge e         // inform the new master
    }
    else                                  // I won
        send WINNER(ID) on edge e         // inform my slave that I won
}
Else                                      // not from my domain
{
    EDGE(id) = e                          // update my database
    PENDING++                             // the TEST must go to the master
    if PENDING = 1
        send TEST(size,id) on EDGE(MASTER)
    else
        put TEST(size,id) in PENDING_SET
}
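To make the message flow concrete, here is a small Python sketch of the two handlers with a toy driver. It is one literal reading of the slide pseudocode, not a verified implementation. Several things are assumptions that do not come from the slides: the names `Node`, `elect` and `post`, the edge layout (edge e of node u leads to node (u + e) mod N), the single global FIFO queue, and the choice to wake every node at the start. One more assumption is marked in the code: PENDING is also decremented when a node that is its own master receives its own WINNER(ID) on the self-edge, so that its pending queue can drain.

```python
from collections import deque


class Node:
    def __init__(self, ident, n):
        self.ID, self.N = ident, n
        self.STATE = "active"
        self.SIZE = 0                    # becomes 1 when the node wakes up
        self.MASTER = ident
        self.EDGE = {ident: -1}          # edge -1 leads to the node itself
        self.PENDING = 0
        self.PENDING_SET = deque()
        self.elected = False

    def on_winner(self, wid, e):
        """Handle WINNER(wid) on edge e; return [(edge, message), ...]."""
        out = []
        if wid == self.ID and e > 0:               # a fight I initiated
            if self.STATE == "active":
                self.SIZE += 1
                if self.SIZE == self.N:
                    self.elected = True            # captured every node
                else:
                    out.append((self.SIZE, ("TEST", self.SIZE, self.ID)))
            return out
        if wid != self.ID and self.MASTER != wid:  # my master lost
            self.MASTER = wid
            out.append((self.EDGE[wid], ("WINNER", wid)))
        # ASSUMPTION: this part also runs for WINNER(ID) on the self-edge.
        self.PENDING -= 1
        if self.PENDING > 0:
            size, cid = self.PENDING_SET.popleft()
            out.append((self.EDGE[self.MASTER], ("TEST", size, cid)))
        return out

    def on_test(self, size, cid, e):
        """Handle TEST(size, cid) on edge e; return [(edge, message), ...]."""
        out = []
        if e < self.SIZE:                          # from my domain
            if (size, cid) > (self.SIZE, self.ID):
                self.STATE = "stopped"             # I lost this fight
                out.append((e, ("WINNER", cid)))
            else:
                out.append((e, ("WINNER", self.ID)))
        else:                                      # not from my domain
            self.EDGE[cid] = e
            self.PENDING += 1
            if self.PENDING == 1:
                out.append((self.EDGE[self.MASTER], ("TEST", size, cid)))
            else:
                self.PENDING_SET.append((size, cid))
        return out


def elect(ids):
    """Wake all nodes, run until no message is left.

    Returns (ids of elected nodes, messages sent on real edges).
    """
    n = len(ids)
    nodes = [Node(i, n) for i in ids]
    queue, sent = deque(), 0

    def post(u, sends):
        nonlocal sent
        for edge, msg in sends:
            if edge == -1:                         # self-edge, not counted
                queue.append((u, -1, msg))
            else:                                  # arrives on edge n - edge
                sent += 1
                queue.append(((u + edge) % n, n - edge, msg))

    for u, node in enumerate(nodes):               # wake-up: fake WINNER(ID)
        post(u, node.on_winner(node.ID, 1))
    while queue:
        u, e, msg = queue.popleft()
        if msg[0] == "TEST":
            post(u, nodes[u].on_test(msg[1], msg[2], e))
        else:
            post(u, nodes[u].on_winner(msg[1], e))
    return [nd.ID for nd in nodes if nd.elected], sent
```

Traced by hand, `elect([1, 2])` elects node 2 after 3 messages and `elect([1, 2, 3])` elects node 3 after 6 messages. Larger runs and other delivery orders have not been checked, so treat the driver as an aid for following the pseudocode rather than as evidence that this reading is complete.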

[Figure sequence: an animated example run on six nodes. Each frame lists every node's state, size and master, and marks nodes whose TEST is still unanswered as "Waiting". Over the run, one node's size grows step by step until it reaches 6 (= N).]

Termination
Each node can't send more than N-1 TESTs.
Each TEST causes at most 3 more messages.
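Combining the two statements above gives a crude but finite cap on the total number of messages, which is all that termination needs. As a worked bound (the factor 4 counts the TEST itself plus the at most 3 messages it causes):

```latex
\#\text{messages} \;\le\; \underbrace{N\,(N-1)}_{\text{TESTs sent}} \times (1 + 3) \;=\; 4N(N-1)
```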

Correctness
Lemma 1: At the end of the algorithm there will be a node with size = N.
Proof:
Size only grows with time.
For each TEST sent, either size++ happens or the node is "Waiting".
Therefore it is enough to prove that there is no deadlock.

Liveness
Suppose there is a deadlock.
Take the node with the biggest (size,ID) → it is active.
Eventually it will get an answer to the last TEST it initiated: WINNER(ID).
Once it gets that answer → size++.
Size = N → leader; otherwise it sends a TEST to the next node.
Either way the node makes progress, which contradicts the deadlock.

Complexity
My domain: the nodes from which I received WINNER(ID) while I was active, plus myself.
Lemma 1: SIZE = size of my domain.
    If (id = ID)                          // I won the fight
    {
        if (e > 0 & STATE = active)       // a fight that I initiated
        {
            SIZE++
Got WINNER(ID) from another node & state = active → all conditions are fulfilled → size++.
Size++ → all conditions were fulfilled → got WINNER(ID) from another node & state = active.

Lemma 2
If SIZE(A) at time Ta = SIZE(B) at time Tb, then the domains at Ta and Tb are disjoint.
Proof: Assume domain(A) and domain(B) are not disjoint at Ta or Tb
→ they are not disjoint at max(Ta,Tb)
→ there is a Tm < max(Ta,Tb) such that at Tm a node from domain(A) joins domain(B)
→ after Tm, domain(A) doesn't grow
→ at Tm, Size(A) >= SIZE → Tm > Ta
After Tm, domain(B) > domain(A) >= SIZE → Tm > Tb
Contradiction.

Lemma 3
Rank the nodes in decreasing order of their SIZE at termination: S_1, S_2, S_3, ..., S_{n-1}, S_n.
Then S_k <= N/k.
Proof: each of the nodes ranked 1 to k-1 once had a domain of size S_k → together with the domain of node k these are k domains of size S_k, all disjoint (Lemma 2) → k * S_k <= N.
Conclusion: there is only one node with SIZE = N → one leader.

Number of TESTs sent
    If (id = ID)                          // I won the fight
    {
        if (e > 0 & STATE = active)       // a fight that I initiated
        {
            SIZE++
            if SIZE = N                   // I captured all the nodes
                stop                      // announce elected
            else
                send TEST(SIZE,ID) on edge SIZE   // continue
        }
A TEST is sent only right after a SIZE++, so a node sends at most SIZE TESTs.
Each TEST causes at most 3 more messages.

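By Lemma 3, the node ranked k sends at most N/k TESTs, and each TEST accounts for at most 4 messages (itself plus 3 more). The total number of messages is therefore no more than
4*(N/1 + N/2 + N/3 + ... + N/K) = O(N log(K)),
where K = the number of nodes awake.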
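The sum above is 4N times the K-th harmonic number, which is at most 1 + ln K; that is where the O(N log(K)) comes from. A short Python sketch to evaluate both sides numerically; the function names `message_bound` and `log_form` are hypothetical helpers, not part of the slides:

```python
import math


def message_bound(n, k):
    """The slide's bound: 4 * (N/1 + N/2 + ... + N/K)."""
    return 4 * sum(n / rank for rank in range(1, k + 1))


def log_form(n, k):
    """Closed-form upper estimate 4 * N * (1 + ln K) of the same sum."""
    return 4 * n * (1 + math.log(k))
```

Evaluating `message_bound(n, k)` against `log_form(n, k)` for a few values of n and k shows how closely the harmonic sum tracks the logarithmic estimate.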

Improvements
The bound on the number of messages can be tightened by noticing that the first TEST received by a node generates at most one more message.
The algorithm can stop as soon as SIZE = N/2 (then announce elected).