Published by Isaias Coombs. Modified over 9 years ago.
1
DISTRIBUTED GENERATION OF PAIRWISE COMBINATIONS
PARALLEL GRAPH PARTITIONING ON A HYPERCUBE
F. Ercal, P. Sadayappan, and J. Ramanujan
University of Missouri-Rolla and The Ohio State University
2
PROBLEM DEFINITION
- Given a graph G(V,E), with |V|=N and |E|=e
- Obtain K partitions of G subject to the following constraints:
  - Balanced: each partition has equal size
  - Minimum cut: the number of edges crossing between partitions is minimized
- Arises in: task allocation, VLSI layout, file placement, etc.
- Intractable: no polynomial-time algorithm is known
- Heuristics are needed
- Kernighan-Lin mincut heuristic (1970)
  - Time complexity: $O(N^2 \log N)$
- Extension by Fiduccia and Mattheyses (1982)
  - Uses buckets and single-vertex moves. Linear-time algorithm: $O(e)$
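The two constraints can be made concrete with a small sketch. The function names `cut_size` and `is_balanced` and the six-vertex example graph are illustrative assumptions, not taken from the slides; "balanced" is read here as partition sizes differing by at most one vertex.

```python
from collections import Counter


def cut_size(edges, part):
    """Count edges whose two endpoints lie in different partitions."""
    return sum(1 for u, v in edges if part[u] != part[v])


def is_balanced(part, k):
    """True if the k partition sizes differ by at most one vertex."""
    sizes = Counter(part.values())
    counts = [sizes.get(p, 0) for p in range(k)]
    return max(counts) - min(counts) <= 1


# Hypothetical example: two triangles {0,1,2} and {3,4,5} joined by edge (2,3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
good = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}   # one triangle per partition
bad = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}    # triangles split across partitions

print(cut_size(edges, good), cut_size(edges, bad))
```

Both assignments are balanced, but only the first keeps the cut small, which is the quantity the heuristics on the following slides try to reduce.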
3
MINCUT ALGORITHM
[Figure: vertices v1 to v8 split between partitions P1 and P2, each vertex labeled with its gain. Initial CUT=5; second panel, after the first move, CUT=3.]
- If v2 moves: GAIN=2 and TOT_GAIN=2
- If v5 moves: GAIN=1 and TOT_GAIN=3
4
MINCUT ALGORITHM (Contd..)
[Figure: the same graph of v1 to v8 with updated gains; CUT=2.]
- If v1 moves: GAIN=0 and TOT_GAIN=3
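The gain bookkeeping in the example can be sketched as follows. This is a simplified illustration, not the authors' implementation: it recomputes gains naively instead of using Fiduccia-Mattheyses buckets, alternates the source side to preserve balance, and keeps the best even-length prefix of moves. All function names and the example graph are assumptions made for the sketch.

```python
def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def gain(v, adj, side):
    """Cut reduction if v switches sides: external edges minus internal edges."""
    external = sum(1 for u in adj[v] if side[u] != side[v])
    return external - (len(adj[v]) - external)


def mincut_pass(adj, side):
    """One pass: move every vertex once (best gain first, alternating sides),
    track the running total gain, then undo moves past the best prefix."""
    side = dict(side)                      # work on a copy
    locked, moves = set(), []
    total = best_total = best_len = 0
    source = 0                             # side the next moved vertex must come from
    while True:
        candidates = [v for v in adj if v not in locked and side[v] == source]
        if not candidates:
            break
        v = max(candidates, key=lambda x: gain(x, adj, side))
        total += gain(v, adj, side)
        side[v] = 1 - side[v]
        locked.add(v)
        moves.append(v)
        # Only even-length prefixes leave both sides at their original sizes.
        if len(moves) % 2 == 0 and total > best_total:
            best_total, best_len = total, len(moves)
        source = 1 - source
    for v in moves[best_len:]:             # roll back the unprofitable tail
        side[v] = 1 - side[v]
    return side, best_total


# Hypothetical example: two triangles joined by edge (2,3), badly partitioned.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj = build_adjacency(edges)
start = {0: 0, 1: 0, 3: 0, 2: 1, 4: 1, 5: 1}   # cut of 5
final, total_gain = mincut_pass(adj, start)
```

As in the slides, individual moves may have zero or negative gain; the pass keeps going and only commits the prefix with the largest total gain.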
5
RECURSIVE BISECTION
6
TIME COMPLEXITY
Sequential time complexity for recursive bisection:
  $N + 2(N/2) + 4(N/4) + \dots + 2^p (N/2^p) \Rightarrow O(N \log K)$
Parallel time complexity for recursive bisection:
  $N + N/2 + N/4 + \dots + N/2^p \Rightarrow O(N)$
COMMENT: Speedup is very limited. To increase speedup, the bisection algorithm itself must be parallelized.
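A quick numeric check of the two sums shows why the speedup is limited. The function names and the sample values N=1024, p=4 are assumptions chosen for illustration; unit cost per vertex is assumed at every level.

```python
def sequential_work(n, p):
    """N + 2*(N/2) + ... + 2^p*(N/2^p): each level touches all N vertices."""
    return sum((2 ** i) * (n / 2 ** i) for i in range(p + 1))


def parallel_time(n, p):
    """N + N/2 + ... + N/2^p: subgraphs at one level are bisected concurrently."""
    return sum(n / 2 ** i for i in range(p + 1))


n, p = 1024, 4
speedup = sequential_work(n, p) / parallel_time(n, p)
print(sequential_work(n, p), parallel_time(n, p), round(speedup, 2))
```

The first level always costs N because a single processor bisects the whole graph, so the parallel sum stays close to 2N and the speedup grows only like (p+1)/2 regardless of how many processors are available.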
7
PAIRWISE MINCUT
[Figure: eight partitions P1 to P8.]
PAIRS TO BE CONSIDERED FOR MINCUT:
(1,2) (1,3) (1,4) (1,5) (1,6) (1,7) (1,8)
(2,3) (2,4) ... (2,8)
...
(7,8)
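The pair list can be enumerated directly with the standard library. The helper name `partition_pairs` is an assumption for illustration.

```python
from itertools import combinations


def partition_pairs(k):
    """All unordered pairs of partition indices 1..k, in the slide's order."""
    return list(combinations(range(1, k + 1), 2))


pairs = partition_pairs(8)
print(len(pairs), pairs[0], pairs[-1])
```

For K=8 this gives K(K-1)/2 = 28 pairs, from (1,2) through (7,8).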
8
TIME COMPLEXITY
Sequential Time Complexity for Pairwise Mincut: [formula not recovered]
Parallel Time Complexity for Recursive Bisection: [formula not recovered]
CONCLUSIONS
Sequential Recursive Bisection (RB) has much lower time complexity than Pairwise Mincut (PM), but the superior parallelizability of PM renders its parallel time complexity comparable to that of parallel RB (100% processor utilization).
9
1) RECURSIVE BISECTION
   Perform repeated bisection, each time doubling the number of partitions, until K partitions are obtained.
   Time complexity: $N + 2(N/2) + 4(N/4) + \dots + 2^p (N/2^p) \Rightarrow O(N \log K)$
2) PAIRWISE MINCUT
   Initially obtain K partitions, then try to reduce the cut size between each pair of partitions. K(K-1)/2 pairs (each of size 2N/K) must be considered.
   Time complexity: [formula not recovered]
3) Any combination of RECURSIVE BISECTION + PAIRWISE MINCUT
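The pairwise mincut cost can be estimated from the counts given above. The following is a back-of-the-envelope derivation, not the formula from the original slide. It assumes that one mincut pass over a pair costs time linear in the pair's size, 2N/K, and that K/2 processors each handle one disjoint pair per round, so that K-1 rounds cover all pairs.

```latex
% Sequential: every pair is processed one after another.
T_{\text{seq}} \;=\; \frac{K(K-1)}{2} \cdot O\!\left(\frac{2N}{K}\right) \;=\; O\big(N(K-1)\big) \;=\; O(NK)

% Parallel (assumed): K/2 disjoint pairs per round, K-1 rounds.
T_{\text{par}} \;=\; (K-1) \cdot O\!\left(\frac{2N}{K}\right) \;=\; O(N)
```

Under these assumptions the sequential cost exceeds the $O(N \log K)$ of recursive bisection, while the parallel cost matches the $O(N)$ of parallel recursive bisection, in line with the stated conclusion.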
10
DISTRIBUTED GENERATION OF PAIRWISE COMBINATIONS ON A HYPERCUBE
Problem: Given 2P disjoint items, P*(2P-1) distinct pairs can be formed. How would you efficiently generate these pairs on the processors of a hypercube?
This is similar to the problem of distributed scheduling of a round-robin tournament between 2C players using C courts, where the paths between courts form a hypercube topology. The goals are:
- maximum utilization of courts (processor utilization)
- minimum walking between courts (minimum communication overhead)
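The tournament analogy can be illustrated with the classic circle method for round-robin scheduling. This is a swapped-in textbook technique, not the distributed algorithm of the slides: it shows that 2P items can be paired in 2P-1 rounds with all P "courts" busy in every round, but it does not model hypercube communication at all. The function name `round_robin` is an assumption.

```python
def round_robin(num_items):
    """Circle method: return num_items-1 rounds, each with num_items/2 disjoint pairs."""
    if num_items % 2:
        raise ValueError("num_items must be even")
    items = list(range(num_items))
    half = num_items // 2
    rounds = []
    for _ in range(num_items - 1):
        # Pair the i-th item from the front with the i-th item from the back.
        rounds.append([(items[i], items[num_items - 1 - i]) for i in range(half)])
        # Keep items[0] fixed and rotate the remaining items by one position.
        items = [items[0], items[-1]] + items[1:-1]
    return rounds


schedule = round_robin(8)          # 2P = 8 items, P = 4 courts
all_pairs = {frozenset(pair) for rnd in schedule for pair in rnd}
print(len(schedule), len(all_pairs))
```

With 2P = 8 this yields 7 rounds and P*(2P-1) = 28 distinct pairs, so every court is used in every round. The harder part addressed by the slides is doing this while keeping item movement between processors local to hypercube links.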
11
[Figure: Distributed PC algorithm on a 2-d hypercube (4 processors P00, P01, P10, P11), showing the placement of item blocks A00 to A11 and B00 to B11 on the processors at d=0, d=1, and d=2.]
12
[Figure: item lists A1 to AK and B1 to BK, first split at K/2 and then at K/4, K/2, and 3K/4, with steps labeled RING-FRAGMENTATION, CYCLIC-TOUR, and RING-FRAGMENTATION.]
13
Ring communication in different phases of the Distributed PC algorithm
[Figure: 16-node hypercube with nodes labeled 0000 to 1111.]
(a) d=0: 1 ring of size 16
(b) d=1: 2 rings of size 8
14
Ring communication in different phases of the Distributed PC algorithm (Contd..)
[Figure: 16-node hypercube with nodes labeled 0000 to 1111.]
(c) d=2: 4 rings of size 4
(d) d=3: 8 rings of size 2
15
Ring communication in different phases of the Distributed PC algorithm (Contd..)
[Figure: 16-node hypercube with nodes labeled 0000 to 1111.]
(e) d=4: 16 rings of size 1
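One way to obtain rings of these sizes on a hypercube is to fix the top d address bits and order the remaining bits by a reflected Gray code, so that consecutive ring members are always hypercube neighbours. This is an assumption for illustration only; the exact ring layout used in the figures cannot be recovered from the transcript. The function names are hypothetical.

```python
def gray(i):
    """i-th codeword of the binary reflected Gray code."""
    return i ^ (i >> 1)


def rings(dim, d):
    """Split a dim-dimensional hypercube into 2**d rings of size 2**(dim - d).

    Assumed construction: the top d bits select the ring (a subcube), and the
    low bits follow the Gray code so ring neighbours differ in exactly one bit.
    """
    low = dim - d
    return [[(prefix << low) | gray(i) for i in range(2 ** low)]
            for prefix in range(2 ** d)]


def hamming(a, b):
    """Number of differing address bits, i.e. hypercube distance."""
    return bin(a ^ b).count("1")


for d in range(5):
    phase = rings(4, d)
    print(f"d={d}: {len(phase)} rings of size {len(phase[0])}")
```

For a 4-dimensional cube this reproduces the ring counts and sizes of panels (a) through (e), and each ring of size two or more uses only direct hypercube links, including the wrap-around step.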