Minimum Spanning Tree

Minimum Spanning Tree Given a weighted graph G = (V, E), generate a spanning tree T = (V, E’) such that the sum of the weights of all the edges is minimum. A few applications: minimum-cost vehicle routing; a cable TV company laying cables in a new neighborhood; and, on the Euclidean plane, approximate solutions to the traveling salesman problem. (The traveling salesman problem asks for the shortest route that visits a collection of cities and returns to the starting point; it is a well-known NP-hard problem.) We are interested in distributed algorithms only.

Example

Sequential algorithms for MST
Review (1) Prim’s algorithm and (2) Kruskal’s algorithm (both greedy algorithms).
Theorem. If the weight of every edge is distinct, then the MST is unique.
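
As a quick refresher, here is a minimal sketch of Kruskal’s algorithm in Python (sequential and centralized; the union-find helpers are our own illustration):

# Kruskal's algorithm: scan edges in increasing weight order and add an
# edge whenever it connects two different components (i.e., it is "outgoing").
def kruskal(n, edges):
    # edges: list of (weight, u, v) with vertices numbered 0..n-1
    parent = list(range(n))              # union-find forest

    def find(x):                         # root of x's component, with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):        # greedy: cheapest edges first
        ru, rv = find(u), find(v)
        if ru != rv:                     # connects two fragments: take it
            parent[ru] = rv
            mst.append((u, v, w))
    return mst

# Example: a 4-cycle with one chord; the chord and the heaviest cycle edge are rejected.
print(kruskal(4, [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 3, 0), (5, 0, 2)]))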

Gallagher-Humblet-Spira (GHS) Algorithm GHS is a distributed version of Prim’s algorithm. It takes a bottom-up approach: the MST is recursively constructed from fragments, which are joined by an edge of least cost. [Figure: two fragments and candidate connecting edges of weights 3, 5, and 7]

Challenges Challenge 1. How will the nodes in a given fragment identify the edge to be used to connect with a different fragment? A root node in each fragment acts as the coordinator.

Challenges Challenge 2. How will a node in T1 determine if a given edge connects to a node of a different tree T2, or of the same tree T1? Why will node 0 choose the edge e with weight 8, and not the edge with weight 4? Answer: nodes in a fragment acquire the same name before augmentation.

Two main steps
Each fragment has a level. Initially each node is a fragment at level 0.
(MERGE) Two fragments at the same level L combine to form a fragment of level L+1.
(ABSORB) A fragment at level L is absorbed by another fragment at level L’ (L < L’). The new fragment keeps level L’.
(Each fragment at level L has at least 2^L nodes.)
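
A centralized sketch of the level bookkeeping in Python (our own illustration of the two rules, not the distributed protocol itself):

# Fragment levels under MERGE and ABSORB, tracked centrally for illustration.
class Fragment:
    def __init__(self, node):
        self.nodes = {node}
        self.level = 0

def merge(f1, f2):
    # MERGE: two fragments at the same level L form one fragment at level L+1.
    assert f1.level == f2.level
    f1.nodes |= f2.nodes
    f1.level += 1
    return f1

def absorb(small, big):
    # ABSORB: a lower-level fragment joins a higher-level one; the level is unchanged.
    assert small.level < big.level
    big.nodes |= small.nodes
    return big

a, b, c = Fragment(1), Fragment(2), Fragment(3)
ab = merge(a, b)                    # level 1, 2 nodes
abc = absorb(c, ab)                 # still level 1, 3 nodes
print(abc.level, len(abc.nodes))    # invariant: len(nodes) >= 2**level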

Least weight outgoing edge To test if an edge is outgoing, each node sends a test message through a candidate edge. The receiving node may send accept or reject. The root broadcasts initiate in its own fragment, collects the reports from the other nodes about eligible edges using a convergecast, and determines the least weight outgoing edge. (Broadcast and convergecast are two handy tools.)

Accept or reject? Let i send test to j.
Case 1. If name(i) = name(j), then j sends reject.
Case 2. If name(i) ≠ name(j) AND level(i) ≤ level(j), then j sends accept.
Case 3. If name(i) ≠ name(j) AND level(i) > level(j), then j waits until level(j) = level(i) and then sends accept/reject. WHY? (See the note below.) (Also note that levels can only increase.)
Q: Can fragments wait forever and lead to a deadlock? [Figure: a node in fragment (Name = X, L = 4) sends test to a node in fragment (Name = Y, L = 3), which must defer its reply]
Note. It may be the case that the responding node belonged to a different fragment when it received the test message, but that fragment is also trying to merge with the sending fragment.
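
The three cases as a small decision function in Python (our own illustration; a real node queues the deferred reply instead of returning "defer"):

# How node j responds to a test message from node i (the GHS accept/reject rule).
def respond_to_test(name_i, level_i, name_j, level_j):
    if name_i == name_j:
        return "reject"        # same fragment: the edge is internal
    if level_i <= level_j:
        return "accept"        # different fragment, and j's level is high enough
    return "defer"             # j delays its reply until level_j catches up

print(respond_to_test("X", 4, "Y", 3))   # -> defer
print(respond_to_test("X", 2, "Y", 3))   # -> accept
print(respond_to_test("X", 3, "X", 3))   # -> reject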

The major steps
repeat
    Test edges as outgoing or not
    Determine the least weight outgoing edge; it becomes a tree edge
    Send join (or respond to a join)
    Update level & name & identify the new coordinator/root
until there are no outgoing edges

Types of messages
(initiate) The root initiates the lwoe search.
(report) Nodes respond to the root with information about outgoing edges.
(test) Nodes test if an edge is outgoing.
(accept) The recipient of the test message certifies the edge as outgoing.
(reject) The recipient of the test message certifies the edge as not outgoing.
(join) The nodes bordering the lwoe send join to the fragment at the other end.
(change_root) Nodes broadcast the id of the new root in the fragment.

Classification of edges
Basic (initially all edges are basic)
Branch (all tree edges)
Rejected (not a tree edge)
Branch and rejected are stable attributes: once an edge is tagged as rejected, it remains so forever. The same holds for tree edges.

Wrapping it up: Merge The edge through which the join message is exchanged changes its status to branch, and it becomes a tree edge. The new root broadcasts an (initiate, L+1, name) message to the nodes in its own fragment. [Example of merge]

Wrapping it up: Absorb T’ sends a join message to T, and receives an initiate message in return. This indicates that the fragment at level L has been absorbed by the other fragment at level L’. They then collectively search for the lwoe. The edge through which the join message was sent changes its status to branch. [Example of absorb]

Example [Figure: a weighted graph with edge weights between 1 and 9]

Example [Figure: the same graph; three pairs of fragments combine via merge]

Example [Figure: the next step; one merge and one absorb]

Example [Figure: a final absorb]

Message complexity Each edge may be rejected at most once, and a rejection requires two messages (test + reject), so rejections account for at most 2|E| messages. At each of the (at most) log N levels, a node RECEIVES at most (1) one initiate message and (2) one accept message, and SENDS at most (3) one report message, (4) one test message not leading to a rejection, and (5) one changeroot or join message. So the total number of messages is at most 2|E| + 5N log N.
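
Restating that count as a formula (nothing new beyond the argument above; N is the number of nodes):

\[
\#\text{messages} \;\le\; \underbrace{2|E|}_{\text{test + reject, at most once per edge}} \;+\; \underbrace{5N \log N}_{\text{5 messages per node per level, at most } \log N \text{ levels}} \;=\; O(|E| + N \log N).
\]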

Coordination Algorithms: Leader Election

Leader Election Let G = (V,E) define the network topology. Each process i has a variable L(i) that defines the leader. The goal is to reach a configuration where ∀ i,j ∈ V : i,j are non-faulty :: (1) L(i) ∈ V, (2) L(i) = L(j), and (3) L(i) is non-faulty. The problem often reduces to a maxima (or minima) finding problem, if we ignore the failure detection part.

Leader Election Difference between mutual exclusion and leader election: the similarity is in the phrase “at most one process.” But (1) failure is not an issue in mutual exclusion, whereas a new leader is elected only after the current leader fails; and (2) no fairness is necessary: it is not required that every aspiring process eventually becomes a leader.

Bully algorithm (assumes that the topology is completely connected)
1. Send an election message (I want to be the leader) to all processes with larger ids.
2. Give up your bid if a process with a larger id sends a reply message (meaning: no, you cannot be the leader). In that case, wait for the leader message (I am the leader).
3. If no reply is received, then elect yourself the leader and broadcast a leader message.
4. If you receive a reply to the election message, but later do not receive a leader message from a process with a larger id (i.e. the leader-elect has crashed), then re-initiate the election by sending election messages.
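
A minimal sketch of the election outcome in Python (a centralized abstraction, our own illustration; the real algorithm is driven by timeouts and the recovery rule of step 4):

# Bully election, abstracted: a process with no living larger process wins;
# otherwise the largest living process that was contacted takes over the election.
def bully_election(ids, crashed, initiator):
    live = [i for i in ids if i not in crashed]
    assert initiator in live
    larger = [i for i in live if i > initiator]   # recipients of election messages
    if not larger:
        return initiator                          # no reply arrives: elect yourself
    return bully_election(ids, crashed, max(larger))  # a larger process replies, then runs its own election

print(bully_election([0, 1, 2, 3, 4], crashed={4}, initiator=0))   # -> 3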

Bully algorithm: the worst case Suppose the leader has crashed and node 0 detects it. Node 0 sends N-1 election messages, node 1 sends N-2 election messages, and so on, down to node N-2, which sends 1 election message. [Figure: node 0 sends election messages to nodes 1, 2, ..., N-1] Finally node N-2 is elected leader, but it crashes before it can send the leader message, so node 0 starts all over again. One attempt costs O(N^2) messages, and the election can be restarted O(N) times, so the worst-case message complexity is O(N^3). (This is bad.)

Maxima finding on a unidirectional ring Chang-Roberts algorithm (asynchronous). Initially all initiator processes are red. Each initiator process i sends out token <i>.

{For each initiator i}
do
    token <j> received ⋀ j < i → skip {do nothing}
    token <j> received ⋀ j > i → send token <j>; color := black
    token <j> received ⋀ j = i → L(i) := i {i becomes the leader}
od

{Non-initiators remain black, and act as routers}
do token <j> received → send token <j> od

Message complexity = O(n^2). Why? What are the best and the worst cases? (The ids may not be nicely ordered around the ring.)
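
A minimal synchronous simulation of Chang-Roberts in Python (our own illustration; it assumes distinct ids, makes every process an initiator, and counts every hop a token travels):

# Chang-Roberts on a unidirectional ring: a token <j> keeps moving only while
# j exceeds the id of the process it reaches; the maximum id's token survives
# and eventually returns to its owner, who becomes the leader.
def chang_roberts(ring):
    n = len(ring)
    tokens = list(ring)                 # tokens[k] = token currently at position k
    msgs, leader = 0, None
    while leader is None:
        nxt = [None] * n
        for k, tok in enumerate(tokens):
            if tok is None:
                continue
            dst = (k + 1) % n           # forward one hop around the ring
            msgs += 1
            if tok == ring[dst]:
                leader = tok            # the token came back to its owner
            elif tok > ring[dst]:
                nxt[dst] = tok          # larger token knocks this process out
            # else: j < id of the receiver, so the token is swallowed
        tokens = nxt
    return leader, msgs

print(chang_roberts([3, 1, 4, 7, 5, 9, 2, 6]))   # -> (9, number of hops)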

Bidirectional ring Franklin’s algorithm (round based). In each round, every contending process sends out probes (same as tokens) in both directions to its nearest contending neighbors. Probes from higher numbered processes knock the lower numbered processes out of the competition: a process stays in the race only if its id is larger than both probes it receives. Out of any two neighboring contenders, at least one must quit, so at least 1/2 of the current contenders quit in each round. Message complexity = O(n log n). Why?
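
A minimal synchronous simulation of Franklin’s rounds in Python (our own illustration; it counts only the logical probes between contenders, ignoring the relaying through eliminated nodes):

# Franklin's algorithm: a contender survives a round only if its id exceeds
# the ids of its nearest contenders on both sides; at least half quit per round.
def franklin(ring):
    contenders, msgs, rounds = list(ring), 0, 0
    while len(contenders) > 1:
        m = len(contenders)
        msgs += 2 * m                        # each contender probes both neighbors
        survivors = []
        for k, pid in enumerate(contenders):
            left, right = contenders[k - 1], contenders[(k + 1) % m]
            if pid > left and pid > right:   # a local maximum stays in the race
                survivors.append(pid)
        contenders = survivors
        rounds += 1
    return contenders[0], rounds, msgs

print(franklin([3, 1, 4, 7, 5, 9, 2, 6]))    # -> (9, 2, 22)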

Sample execution

Peterson’s algorithm
initially ∀i : color(i) = red, alias(i) = i

{program for each round, for each red process}
send alias; receive alias(N);
if alias = alias(N) → I am the leader
   alias ≠ alias(N) → send alias(N); receive alias(NN);
      if alias(N) > max(alias, alias(NN)) → alias := alias(N)
         alias(N) < max(alias, alias(NN)) → color := black
      fi
fi

{N(i) and NN(i) denote the neighbor and the neighbor’s neighbor of i}

Peterson’s algorithm Round-based. Finds maxima on a unidirectional ring using O(n log n) messages. Uses an id and an alias for each process.
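
A minimal synchronous simulation of Peterson’s rounds in Python (our own adaptation; a real process detects victory when its own id returns as an alias, which this simulation shortcuts):

# Peterson's algorithm on a unidirectional ring: in each round a red process
# adopts alias(N) and stays red only if alias(N) is a local maximum among
# three consecutive aliases; the maximum id survives as the last alias.
def peterson(ring):
    aliases = list(ring)                    # aliases of the current red processes
    msgs = 0
    while len(aliases) > 1:
        m = len(aliases)
        msgs += 2 * m                       # each red process sends alias, then alias(N)
        survivors = []
        for k in range(m):
            a, aN, aNN = aliases[k], aliases[(k + 1) % m], aliases[(k + 2) % m]
            if aN > max(a, aNN):            # alias(N) is a local max: adopt it, stay red
                survivors.append(aN)
        aliases = survivors
    return aliases[0], msgs

print(peterson([3, 1, 4, 7, 5, 9, 2, 6]))   # -> (9, 22)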

Synchronizers Synchronous algorithms (round-based, where processes execute actions in lock-step synchrony) are easier to deal with than asynchronous algorithms. In each round (or clock tick), a process (1) receives messages from its neighbors, (2) performs local computation, and (3) sends messages to zero or more neighbors. A synchronizer is a protocol that enables synchronous algorithms to run on an asynchronous system. [Diagram: a synchronous algorithm runs on top of a synchronizer, which runs on top of the asynchronous system]

Synchronizers “Every message sent in clock tick k must be received by the neighbors in clock tick k.” This is not automatic; some extra effort is needed. Consider a basic Asynchronous Bounded Delay (ABD) synchronizer, where channel delays have an upper bound d. Each process starts the simulation of a new clock tick 2d time units after the previous one: since the starting signals themselves may be skewed by up to d, waiting 2d guarantees that all messages of the current tick have arrived. [Figure: processes start tick 0 at slightly different times; ticks 0, 1, 2, 3 follow at intervals of 2d]

The α-synchronizer What if the propagation delay is arbitrarily large, but finite? The α-synchronizer can handle this. Simulation of each clock tick:
1. Send and receive the messages for the current tick.
2. Send an ack for each incoming message, and receive an ack for each outgoing message.
3. Send a safe message to each neighbor after sending and receiving all ack messages.
A process starts the next tick once it has received safe from every neighbor (then steps 1-2-3 repeat). [Figure: each message m is answered by an ack]
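
A per-process sketch of one simulated tick in Python (our own illustration; send and recv stand for an assumed asynchronous message layer):

# One simulated tick of the alpha-synchronizer for a single process.
# send(dst, msg) delivers asynchronously; recv() blocks and returns (src, msg).
def simulate_tick(me, neighbors, outbox, send, recv):
    for dst, payload in outbox:                 # step 1: send this tick's messages
        send(dst, ("m", payload))
    pending_acks = len(outbox)                  # acks we still expect
    safe_from = set()                           # neighbors known to be safe
    safe_sent = False
    while not (safe_sent and safe_from == set(neighbors)):
        if pending_acks == 0 and not safe_sent:
            for nbr in neighbors:               # step 3: all our messages have landed
                send(nbr, ("safe", None))
            safe_sent = True
            continue
        src, (kind, data) = recv()
        if kind == "m":
            send(src, ("ack", None))            # step 2: acknowledge every message
        elif kind == "ack":
            pending_acks -= 1
        elif kind == "safe":
            safe_from.add(src)
    # every neighbor is safe: all tick-k messages are delivered; tick k+1 may start

# Tiny driver: process "A" with one neighbor "B" and scripted incoming traffic.
incoming = [("B", ("m", "x")), ("B", ("ack", None)), ("B", ("safe", None))]
sent = []
simulate_tick("A", ["B"], outbox=[("B", "hello")],
              send=lambda dst, msg: sent.append((dst, msg)),
              recv=lambda: incoming.pop(0))
print(sent)   # A's message, its ack for B's message, and its safe message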

Complexity of the α-synchronizer
Message complexity M(α): defined as the number of messages passed around the entire network for the simulation of each clock tick. M(α) = O(|E|).
Time complexity T(α): defined as the number of asynchronous rounds needed for the simulation of each clock tick. T(α) = 3 (each process exchanges m, ack, safe).

Complexity of the α-synchronizer
M_A = M_S + T_S · M(α)
T_A = T_S · T(α)
Here M_A and T_A are the MESSAGE and TIME complexities of the algorithm implemented on top of the asynchronous platform, M_S is the message complexity of the original synchronous algorithm, and T_S is the time complexity of the original synchronous algorithm in rounds.

The β-synchronizer Form a spanning tree with any node as the root. The root initiates the simulation of each tick by sending message m(j) for each clock tick j. Each process responds with ack(j), and then with a safe(j) message along the tree edges (representing the fact that the entire subtree under it is safe). When the root receives safe(j) from every child, it initiates the simulation of clock tick (j+1) using a next message. To compute the message complexity M(β), note that in each simulated tick there are m messages of the original algorithm, (N-1) acks, (N-1) safe messages, and (N-1) next messages along the tree edges. So the message overhead is O(N). Time complexity T(β) = the depth of the tree. For a balanced tree, this is O(log N).
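
Side by side, restating the per-tick costs of the two synchronizers from the slides above:

\[
M(\alpha) = O(|E|), \qquad T(\alpha) = O(1)
\]
\[
M(\beta) = O(N), \qquad T(\beta) = O(\text{depth of the spanning tree})
\]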

The γ-synchronizer Uses the best features of both the α and β synchronizers. (What are these?)* The network is viewed as a tree of clusters, and each cluster has a cluster-head. Within each cluster, the β-synchronizer is used, but for inter-cluster synchronization, the α-synchronizer is used. For the best complexity results, the cluster sizes must be carefully chosen: the number and the size of the clusters is a crucial issue in reducing the message and time complexities, and cluster formation adds a preprocessing overhead.
* The α-synchronizer has lower time complexity; the β-synchronizer has lower message complexity.

Example of application: shortest path
Consider synchronous Bellman-Ford: O(n·|E|) messages, O(n) rounds.
Asynchronous Bellman-Ford: exponentially many corrections are possible, due to message delays. The message complexity can be exponential in n in the worst case, and the time complexity is also exponential in n, counting message pileups.
Using (e.g.) synchronizer α:
M_A = M_S + T_S · M(α) = O(n·|E|) + O(diam) · O(|E|) = O(n·|E|)
T_A = T_S · T(α) = 3 · diam rounds = O(n) rounds
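
For reference, a minimal round-based sketch of synchronous Bellman-Ford in Python (the algorithm that would run on top of the synchronizer; our own illustration):

# Synchronous Bellman-Ford: in every round, each node broadcasts its current
# distance estimate to all neighbors and relaxes over the estimates it receives.
def bellman_ford_sync(n, edges, source):
    # edges: list of (u, v, w), treated as undirected links
    adj = {u: [] for u in range(n)}
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    INF = float("inf")
    dist = [INF] * n
    dist[source] = 0
    for _ in range(n - 1):                      # n-1 synchronous rounds suffice
        sent = [(v, dist[u] + w)                # this round's messages
                for u in range(n) if dist[u] < INF
                for v, w in adj[u]]
        for v, d in sent:                       # receive and relax
            dist[v] = min(dist[v], d)
    return dist

print(bellman_ford_sync(4, [(0, 1, 1), (1, 2, 2), (2, 3, 1), (0, 3, 10)], 0))
# -> [0, 1, 3, 4]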