Connected Components & All-Pairs Shortest Paths
Presented by Wooyoung Kim, 3/4/09
CSc 8530 Parallel Algorithms, Spring 2009, Dr. Sushil K. Prasad
Outline
Adjacency matrix and connectivity matrix
Parallel algorithm for computing the connectivity matrix
Parallel algorithm for computing connected components
Sequential algorithms for all-pairs shortest paths
Parallel algorithm for all-pairs shortest paths
Analysis
Related recent research
References
9.3. Connected Components
Connected Components
Let G = (V, E) be a graph with V = {v_0, v_1, …, v_{n-1}}. G can be represented by an n × n adjacency matrix A defined by a_jk = 1 if (v_j, v_k) ∈ E, and a_jk = 0 otherwise.
A connected component of an undirected graph G is a connected subgraph of G of maximal size.
Given such a graph G, we develop an algorithm for computing its connected components on a hypercube interconnection-network parallel computer.
Adjacency matrix – examples
Example 1: an undirected graph on vertices v_0, …, v_4 and its symmetric 5 × 5 adjacency matrix.
Example 2: a directed graph on vertices v_0, …, v_5 and its (in general asymmetric) 6 × 6 adjacency matrix.
Applications of Connected Components
Identifying clusters: represent each item by a vertex and add an edge between each pair of items that are "similar." The connected components of this graph correspond to different classes of items.
Component labeling is commonly used in image processing to join neighboring pixels into connected regions, which are the shapes in the image.
Testing whether a graph is connected is an essential preprocessing step for many graph algorithms.
Computing the Connectivity Matrix
A key step in the algorithm for finding the connected components is computing the so-called connectivity matrix.
Definition: the connectivity matrix of a (directed or undirected) graph G with n vertices is the n × n matrix C defined by c_jk = 1 if j = k or there is a path from v_j to v_k in G, and c_jk = 0 otherwise, for 0 ≤ j, k ≤ n−1.
C is also known as the reflexive and transitive closure of G.
Given the adjacency matrix A of G, it is required to compute C.
Computing the Connectivity Matrix – cont.
Approach: Boolean matrix multiplication.
1. The matrices to be multiplied, and the product matrix, are all binary.
2. The logical "and" operation replaces regular multiplication.
3. The logical "or" operation replaces regular addition.
If X, Y and Z are n × n Boolean matrices, where Z is the Boolean product of X and Y, then
z_ij = (x_i1 and y_1j) or (x_i2 and y_2j) or … or (x_in and y_nj)
(in the regular product: z_ij = Σ_k x_ik · y_kj).
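The Boolean product above can be sketched as a short sequential routine (an illustrative sketch; the function name and the list-of-lists representation are my own, not from the slides):

```python
def bool_mat_mult(X, Y):
    """Boolean product Z = X x Y: 'and' replaces multiplication,
    'or' replaces addition; all entries are 0/1."""
    n = len(X)
    return [[int(any(X[i][k] and Y[k][j] for k in range(n)))
             for j in range(n)]
            for i in range(n)]
```

For example, multiplying an adjacency matrix by itself marks exactly the vertex pairs joined by a path of two edges.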
Computing the Connectivity Matrix – cont.
1st step: obtain an n × n matrix B from A by setting b_jk = 1 if j = k, and b_jk = a_jk otherwise, for 0 ≤ j, k ≤ n−1; i.e., B is equal to A with 1's added along the diagonal.
B represents all the paths in G of length less than 2: each vertex reaches itself and its direct neighbors.
Computing the Connectivity Matrix – cont.
Then B² = B × B (the Boolean product of B with itself) represents paths of length 2 or less: b²_ik = 1 exactly when there is some v_j with b_ij = 1 and b_jk = 1, i.e., a path of length at most 2 from v_i to v_k through v_j.
Generally, B^n represents paths of length n or less.
Observe: if there is a path from v_i to v_j, it cannot have length more than n−1, since G has only n vertices. Hence the connectivity matrix is C = B^{n−1}.
B^{n−1} is computed through successive squaring.
Computing the Connectivity Matrix – cont.
C is obtained after ⌈log(n−1)⌉ Boolean matrix multiplications.
When n−1 is not a power of 2, C is obtained from B^m, where m = 2^⌈log(n−1)⌉ (the smallest power of 2 larger than n−1); this is correct since B^m = B^{n−1} for m ≥ n−1.
Implementation: we use the algorithm HYPERCUBE MATRIX MULTIPLICATION, adapted to perform Boolean matrix multiplication.
Input: the adjacency matrix A of G. Output: the connectivity matrix C.
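The successive-squaring scheme can be checked with a small sequential sketch (a hedged illustration only; the slides' actual algorithm runs these squarings on the hypercube):

```python
from math import ceil, log2

def bool_square(B):
    """One Boolean squaring: paths of length <= 2k from paths of length <= k."""
    n = len(B)
    return [[int(any(B[j][l] and B[l][k] for l in range(n)))
             for k in range(n)] for j in range(n)]

def connectivity_matrix(A):
    """Reflexive-transitive closure: B = A with 1's on the diagonal,
    squared ceil(log(n-1)) times."""
    n = len(A)
    B = [[1 if j == k else A[j][k] for k in range(n)] for j in range(n)]
    for _ in range(ceil(log2(n - 1)) if n > 2 else 1):
        B = bool_square(B)
    return B
```

On a 4-vertex path graph two squarings suffice, since paths of length up to 4 ≥ n−1 are then covered.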
Computing the Connectivity Matrix – cont.
The hypercube used has N = n³ processors P_0, P_1, …, P_{N−1}, arranged in an n × n × n array; P_r occupies position (i, j, k), where r = i·n² + j·n + k, for 0 ≤ i, j, k ≤ n−1.
Processor P_r has 3 registers: A(i, j, k), B(i, j, k), C(i, j, k).
Initially, the processors in positions (0, j, k), 0 ≤ j, k ≤ n−1, contain the adjacency matrix: A(0, j, k) = a_jk.
At the end, the same processors contain the connectivity matrix: C(0, j, k) = c_jk for 0 ≤ j, k ≤ n−1.
Algorithm HYPERCUBE CONNECTIVITY (A, C)
Step 1: for j = 0 to n−1 do in parallel
            A(0, j, j) ← 1
        end for
Step 2: for j = 0 to n−1 do in parallel
          for k = 0 to n−1 do in parallel
            B(0, j, k) ← A(0, j, k)
          end for
        end for
Step 3: for i = 1 to ⌈log(n−1)⌉ do
          (3.1) HYPERCUBE MATRIX MULTIPLICATION (A, B, C)
          (3.2) for j = 0 to n−1 do in parallel
                  for k = 0 to n−1 do in parallel
                    (i)  A(0, j, k) ← C(0, j, k)
                    (ii) B(0, j, k) ← C(0, j, k)
                  end for
                end for
        end for
Analysis of the HYPERCUBE CONNECTIVITY algorithm
Steps 1 and 2, and the copying in Step (3.2), take constant time.
HYPERCUBE MATRIX MULTIPLICATION takes O(log n) time, and Step (3.1) is iterated ⌈log(n−1)⌉ times.
Total running time: t(n) = O(log² n).
Since p(n) = n³, the cost is c(n) = O(n³ log² n).
Algorithm for Connected Components
Construct an n × n matrix D from the connectivity matrix C: d_jk = v_k if c_jk = 1, and d_jk = 0 otherwise, for 0 ≤ j, k ≤ n−1; i.e., row j of D contains the names of the vertices to which v_j is connected by a path.
The connected components of G are found by assigning each vertex to a component as follows: v_j is assigned to component l if l is the smallest index for which d_jl ≠ 0.
Implementation of the Connected Components algorithm
Implemented on a hypercube using the HYPERCUBE CONNECTIVITY algorithm.
It runs on a hypercube with N = n³ processors, each with three registers A, B, C, arranged in an n × n × n array as required by the HYPERCUBE CONNECTIVITY algorithm.
Initially: A(0, j, k) = a_jk for 0 ≤ j, k ≤ n−1.
At the end: C(0, j, 0) contains the component number for vertex v_j.
Algorithm HYPERCUBE CONNECTED COMPONENTS (A, C)
Step 1: HYPERCUBE CONNECTIVITY (A, C)
Step 2: for j = 0 to n−1 do in parallel      {creates matrix D}
          for k = 0 to n−1 do in parallel
            if C(0, j, k) = 1 then C(0, j, k) ← v_k end if
          end for
        end for
Step 3: for j = 1 to n−1 do in parallel
          (3.1) the n processors in row j find the smallest l for which C(0, j, l) ≠ 0
          (3.2) C(0, j, 0) ← l
        end for
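Run sequentially, the labeling in Steps 2–3 amounts to taking, for every row of the connectivity matrix, the smallest column index holding a 1 (a hedged sketch; the function name is mine, and vertex names v_k are represented simply by their indices):

```python
def component_labels(C):
    """For each vertex j, the component label is the smallest k with C[j][k] = 1."""
    n = len(C)
    return [min(k for k in range(n) if C[j][k]) for j in range(n)]
```

Vertices sharing a label belong to the same connected component.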
Analysis of the HYPERCUBE CONNECTED COMPONENTS algorithm
Step 1 requires O(log² n) time; Steps 2 and (3.2) take constant time.
Step (3.1): the n processors in row j form a log n-dimensional hypercube; this step is a reduction operation (Step 3 of HYPERCUBE MATRIX MULTIPLICATION with "+" replaced by "min"), taking O(log n) time.
Overall running time: t(n) = O(log² n); p(n) = n³; c(n) = O(n³ log² n).
Example: computing connected components on a hypercube
Graph G has eight vertices v_0, …, v_7; its 8 × 8 adjacency matrix A is the input.
Example – cont.
First squaring: A² = A × A (Boolean product).
Example 2 – computing the connectivity matrix
Second squaring: A⁴ = A² × A². Since A⁴ = A², the successive squaring stops here.
Example 2 – cont.
The resulting 8 × 8 connectivity matrix, and the matrix of connected components derived from it (entry (j, k) holds v_k wherever the connectivity matrix holds a 1).
Example 2 – cont.
From the matrix of connected components:
Component 1: {v_0, v_5, v_7}
Component 2: {v_1, v_2, v_4}
Component 3: {v_3, v_6}
9.5. All-Pairs Shortest Paths
Graph Terminology
G = (V, E); W = weight matrix:
w_ij = weight/length of edge (v_i, v_j)
w_ij = ∞ if v_i and v_j are not connected by an edge
w_ii = 0
W may contain positive, zero, and negative values, but for this problem G must not contain a negative-sum cycle.
Weighted Graph and Weight Matrix
An undirected weighted graph on vertices v_0, …, v_4 (edge weights 1, 2, 5, 7, 6, 9, −4, 3) and its symmetric 5 × 5 weight matrix.
Directed Weighted Graph and Weight Matrix
A directed weighted graph on vertices v_0, …, v_5 (edge weights 5, −2, 9, 4, 3, 1, 2, 7, 6) and its 6 × 6 weight matrix.
All-Pairs Shortest Paths Problem
For every pair of vertices v_i and v_j in V, it is required to find the length of the shortest path from v_i to v_j along edges in E.
Specifically, a matrix D is to be constructed such that d_ij is the length of the shortest path from v_i to v_j in G, for all i and j.
The length of a path (or cycle) is the sum of the lengths (weights) of the edges forming it.
Sample Shortest Path
In the directed graph above, the shortest path from v_0 to v_4 is along edges (v_0, v_1), (v_1, v_2), (v_2, v_4) and has length 6.
Disallowing Negative-length Cycles
APSP does not allow the input to contain negative-length cycles. This is necessary because, if such a cycle existed within a path from v_i to v_j, one could traverse the cycle indefinitely, producing paths of ever shorter length from v_i to v_j; every path containing the cycle would have length −∞.
Sequential Algorithms for APSP
The Floyd–Warshall algorithm is Θ(V³): appropriate for dense graphs, |E| = O(|V|²).
Johnson's algorithm is appropriate for sparse graphs, |E| = O(|V|):
O(V² log V + V·E) if using a Fibonacci heap
O(V·E log V) if using a binary min-heap
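For reference, the Floyd–Warshall dynamic program mentioned above can be sketched in a few lines (a standard textbook formulation, not taken from the slides):

```python
INF = float('inf')

def floyd_warshall(W):
    """Theta(V^3) APSP: after round k, d[i][j] is the shortest i->j length
    using only intermediate vertices from {0, ..., k}."""
    n = len(W)
    d = [row[:] for row in W]           # don't mutate the input matrix
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d
```

Entries of ∞ mark absent edges, exactly as in the weight-matrix convention above.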
Properties of Interest
Let d_ij^(k) denote the length of the shortest path from v_i to v_j that goes through at most k−1 intermediate vertices (at most k edges).
d_ij^(1) = w_ij (the edge length from v_i to v_j); if i ≠ j and there is no edge from v_i to v_j, then d_ij^(1) = ∞. Also, d_ii^(1) = 0.
Given that there are no negative-weight cycles in G, there is no advantage in visiting any vertex more than once on a shortest path from v_i to v_j. Since there are only n vertices in G, d_ij = d_ij^(n−1).
Guaranteeing Shortest Paths
If the shortest path from v_i to v_j contains v_r and v_s (where v_r precedes v_s), then the subpath from v_r to v_s must itself be minimal (or it wouldn't appear in the shortest path).
Thus, to obtain the shortest path from v_i to v_j, we can compute all combinations of optimal sub-paths (whose concatenation is a path from v_i to v_j), and then select the shortest one.
Iteratively Building Shortest Paths
A shortest path into v_j with at most k edges ends with some edge (v_l, v_j) of weight w_lj, preceded by a shortest path from v_i to v_l with at most k−1 edges.
Recurrence Definition
For k > 1, d_ij^(k) = min_l { d_il^(k/2) + d_lj^(k/2) }: a shortest path of at most k edges splits at some intermediate vertex v_l into two halves of at most k/2 edges each.
This guarantees O(log k) doubling steps to calculate d_ij^(k).
Similarity
The recurrence has the same structure as matrix multiplication, with "min" playing the role of the sum and "+" playing the role of the product.
Computing D
Let D^k = the matrix with entries d_ij^(k) for 0 ≤ i, j ≤ n−1.
Given D^1, compute D^2, D^4, …, D^m, where D = D^m.
To calculate D^k from D^{k/2}, use a special form of matrix multiplication:
'×' → '+'
'+' → 'min'
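The squaring scheme can be checked with a small sequential "min-plus" sketch (illustrative only; the slides implement the same products on the hypercube, and the function names are mine):

```python
INF = float('inf')

def min_plus_square(D):
    """D^(2k) from D^(k): '+' replaces 'x' and 'min' replaces '+'."""
    n = len(D)
    return [[min(D[i][l] + D[l][j] for l in range(n))
             for j in range(n)] for i in range(n)]

def apsp_by_squaring(W):
    """Square D^1 = W until paths of up to n-1 edges are covered."""
    n = len(W)
    D, k = [row[:] for row in W], 1
    while k < n - 1:
        D, k = min_plus_square(D), 2 * k
    return D
```

The result agrees with any other APSP method on graphs without negative cycles.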
"Modified" Matrix Multiplication
In HYPERCUBE MATRIX MULTIPLICATION, Steps 2 and 3 become:
Step 2: for r = 0 to N−1 do in parallel
          C_r ← A_r + B_r
        end for
Step 3: for m = 2q to 3q−1 do
          for all r such that bit m of r is 0 do in parallel
            C_r ← min(C_r, C_{r(m)})
          end for
        end for
"Modified" Example
A small min-plus product traced on the eight processors P_000, …, P_111 of a 3-dimensional hypercube (the HYPERCUBE MATRIX MULTIPLICATION example of Section 9.2): the slides show the register contents initially, after steps (1.1), (1.2) and (1.3) of the data distribution, after the modified Step 2 (the pairwise additions C_r = A_r + B_r), and after the modified Step 3 (the min-reduction).
Hypercube Setup
Begin with a hypercube of n³ processors, each with registers A, B, and C, arranged in an n × n × n array (cube).
Set A(0, j, k) = w_jk for 0 ≤ j, k ≤ n−1, i.e., the processors in positions (0, j, k) contain D^1 = W.
When done, C(0, j, k) contains the APSP matrix D = D^m.
Setup Example
For the six-vertex directed weighted graph above, D^1 = W is stored as A(0, j, k) = w_jk.
APSP Parallel Algorithm
Algorithm HYPERCUBE SHORTEST PATH (A, C)
Step 1: for j = 0 to n−1 do in parallel
          for k = 0 to n−1 do in parallel
            B(0, j, k) ← A(0, j, k)
          end for
        end for
Step 2: for i = 1 to ⌈log(n−1)⌉ do
          (2.1) HYPERCUBE MATRIX MULTIPLICATION (A, B, C)   {with '+' for '×' and 'min' for '+'}
          (2.2) for j = 0 to n−1 do in parallel
                  for k = 0 to n−1 do in parallel
                    (i)  A(0, j, k) ← C(0, j, k)
                    (ii) B(0, j, k) ← C(0, j, k)
                  end for
                end for
        end for
An Example
For the six-vertex graph, the successive min-plus squarings produce D^2, D^4 and D^8 from D^1 = W; D^8 = D holds all the all-pairs shortest path lengths.
Analysis
Steps 1 and (2.2) require constant time.
There are ⌈log(n−1)⌉ iterations of Step (2.1), each requiring O(log n) time.
The overall running time is t(n) = O(log² n); p(n) = n³.
Cost is c(n) = p(n) · t(n) = O(n³ log² n).
Efficiency is O(n³) / O(n³ log² n) = O(1/log² n) relative to the Θ(n³) sequential algorithm.
Related Paper
Edwin Romeijn and Robert Smith, "Parallel Algorithms for Solving Aggregated Shortest Path Problems", Computers and Operations Research, Special Issue on Aggregation, Volume 26, Issue 10–11, pp. 941–953, 1999.
Problem of the paper
Computing in parallel all pairs of shortest paths in a general large-scale directed network of N nodes, using a hierarchical network decomposition algorithm. For an important subclass of problems this yields an O(log N) savings in computation time over the traditional parallel implementation of Dijkstra's algorithm.
The doubling algorithm
Let d_ij^(k) be the length of the shortest path from i to j using at most k arcs, and let d_ij = d_ij^(N−1) be the distance from i to j. Starting from the arc lengths, each doubling step computes d_ij^(2k) = min_l { d_il^(k) + d_lj^(k) }. Using N² processors (one per pair (i, j)), each step takes O(N) time and O(log N) steps suffice.
Hence the time is O(N log N) with N² processors.
Another parallel algorithm
An obvious way of parallelizing: with P ≤ N processors, run Dijkstra's algorithm in parallel, letting each processor compute all shortest paths from at most N/P origins to all possible destinations.
O((N/P) · N log N) for sparse graphs; O((N/P) · N²) for dense graphs.
Another parallel algorithm – cont.
DIJKSTRA(G, w, s)
  INITIALIZE-SINGLE-SOURCE(G, s)
  S ← Ø; Q ← V[G]
  while Q ≠ Ø do
    u ← EXTRACT-MIN(Q)
    S ← S ∪ {u}
    for each vertex v ∈ Adj[u] do
      RELAX(u, v, w)
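The pseudocode above renders compactly in Python with a binary heap (a sketch assuming nonnegative edge weights; the adjacency-dict representation and function name are mine):

```python
import heapq

def dijkstra(adj, s):
    """Single-source shortest paths; adj maps u -> list of (v, weight)."""
    dist = {s: 0}
    pq = [(0, s)]                        # min-heap keyed on tentative distance
    while pq:
        d, u = heapq.heappop(pq)         # EXTRACT-MIN
        if d > dist.get(u, float('inf')):
            continue                     # stale heap entry; u already settled
        for v, w in adj.get(u, []):      # RELAX each outgoing edge
            if d + w < dist.get(v, float('inf')):
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return dist
```

Instead of a DECREASE-KEY operation, this version pushes duplicate heap entries and skips the stale ones, which keeps the code short at the same O((V + E) log V) bound.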
Aggregation
We showed:
Sequentially, APSP takes O(N² log N) time using Dijkstra's algorithm (on sparse graphs).
O((N/P) · N log N) for sparse graphs using parallel Dijkstra with P processors.
With N² processors it takes O(N log N) time with the doubling algorithm.
Goal: reduce the number of processors necessary to solve the problem, while keeping the time complexity of the algorithm equal to O(N log N).
Aggregation: simple model
Consider a Manhattan network, where every non-boundary node has exactly 4 neighbors; that is, G is a mesh.
Form a partition of the nodes of the network into M classes called macronodes. Aggregate in such a way that the "macronetwork" of M macronodes is a mesh, and every macronode itself is a mesh.
Aggregation: simple model – cont.
Example: N = 64, M = 4 macronodes, each containing 64/4 = 16 nodes.
Aggregation: simple model – cont.
A macroarc is present between two macronodes if and only if there is an arc connecting two nodes in their classes. Arc lengths in the macronetwork: the shortest of the lengths of all arcs connecting the two macronodes.
Hierarchical decomposition algorithm: approximately solve the shortest-path problem for G by finding all shortest paths in the macronetwork, and also all shortest paths within each macronode, and then combine these to get paths connecting all pairs of nodes. Note that these paths are not necessarily shortest paths in G, even if all sub-problems are solved optimally.
Aggregation: simple model – cont.
Theorem 1: Consider the Manhattan network with N nodes. Then, using the decomposition algorithm, it is optimal with respect to computational effort to use O(√N) processors.
Using Dijkstra's algorithm: O((N/M)² log(N/M)) for each macronode of N/M nodes, and O(M² log M) for the macronetwork of M macronodes. If we use M + 1 processors (one per macronode, plus one for the macronetwork), we want the M that minimizes the maximum of the two time complexities while keeping the time complexity O(N log N); then M should be equal to √N.
Aggregation: simple model – cont.
In the simple model, the general results hold if we assume that each macronode has the same structure as the original network, and the macronetwork has the same structure as the original network.
If there are "one-way streets" in the network, then a problem can occur: if link (i, j) ends up inside a macronode, it is possible that there exists a path from j to i in the network, while the decomposition algorithm returns an infinite path length. The reason is that, for a given pair of nodes within a macronode, there may not exist a path between those nodes that is completely contained in the macronode.
Aggregation: simple model – cont.
Theorem 2: Computing approximate shortest-path lengths using the decomposition algorithm yields a savings of at least O(log N) in time complexity over a parallel implementation of Dijkstra's algorithm.
Note that, using O(√N) processors, it takes O(N log N) time to compute each of the sub-networks with the parallel Dijkstra's algorithm.
Aggregation: simple model – cont.
Decomposition algorithm: after computing shortest-path information for the macronetwork and all macronodes (tables of shortest-path lengths for each macronode and a table for the macronetwork), we need to modify the entries in the macronetwork table as follows: for every intermediate macronode on a shortest path in the macronetwork, add the shortest-path length from its entry micronode to its exit micronode.
Since the number of macronodes on a shortest macro-path is O(N^{1/4}) on average, this can be performed in O(N · N^{1/4}) = O(N^{5/4}) time sequentially. Using O(N^{1/2}) processors, this yields O(N^{3/4}) time complexity (less than O(N log N)).
Aggregation: simple model – cont.
For every pair (i, j) in the original network, the time to find an approximate shortest-path length is then reduced to 2 additions and 3 table lookups: (i to the exit node of the macronode containing i) + (shortest-path length from that exit node to the entry node of the macronode containing j) + (entry node of the macronode containing j to j).
Sequentially, this takes O(N²) over all pairs; with O(N^{1/2}) processors, it takes O(N^{3/2}). That is, the algorithm saves O(log N) time.
Aggregation: simple model – cont.
Example: the approximate path from i to j passes through the exit node of i's macronode and the entry node of j's macronode (N = 64, M = 4 macronodes of 16 nodes each).
Aggregation: simple model – cont.
Theorem 3: Computing approximate shortest-path lengths using the decomposition algorithm yields a savings of at most O(√N) in time complexity over a parallel implementation of Dijkstra's algorithm.
In the best case the table-modification step can be avoided, which yields O(N log N).
Aggregation: multi-level aggregation
Decompose the N-node mesh repeatedly, to L levels. Then each macronode of level L is itself a mesh. So we have 1 macronetwork of level 1 having M nodes, M macronetworks of level 2 having M nodes, …, M^{L−2} macronetworks of level L−1 having M nodes, and M^{L−1} level-L macronetworks having N/M^{L−1} nodes each.
Assume we have 1 + M + … + M^{L−1} = (M^L − 1)/(M − 1) = P processors, each exactly solving one shortest-path problem.
Aggregation: multi-level aggregation – cont.
Example: N = 64, M = 4, L = 2; size of each macronode of level 2 = 64/4/4 = 4.
Aggregation: multi-level aggregation – cont.
Theorem 4: Consider the Manhattan network with N nodes. Then, using the decomposition algorithm with L levels, it is optimal with respect to computational effort to use O(N^{1−1/L}) processors.
Inductively using the same reasoning as in the case L = 2 above, we obtain that it is optimal to choose M such that N/M^{L−1} = M, i.e., M = N^{1/L}. Therefore P = (N − 1)/(N^{1/L} − 1) = O(N^{1−1/L}).
Aggregation: multi-level aggregation – cont.
Theorem 5: Using the decomposition algorithm with L levels saves at least O(log N) time and at most O(N^{1−1/L}) time over a parallel implementation of Dijkstra's algorithm.
The proof is similar to those of Theorems 2 and 3.