1 Partitioning

2 Decomposition of a complex system into smaller subsystems
 - Done hierarchically
 - Partitioning done until each subsystem has manageable size
 - Each subsystem can be designed independently
Interconnections between partitions minimized
 - Less hassle interfacing the subsystems
 - Communication between subsystems usually costly
[©Sherwani] [Bazargan]

3 Example: Partitioning of a Circuit (figure) — input size: 48; the circuit is split into three blocks of sizes 15, 16, and 17, with Cut 1 = 4 and Cut 2 = 4. [©Sherwani] [Bazargan]

4 Hierarchical Partitioning
Levels of partitioning:
 - System-level partitioning: each sub-system can be designed as a single printed circuit board (PCB)
 - Board-level partitioning: the circuit assigned to a PCB is partitioned into sub-circuits, each fabricated as a VLSI chip
 - Chip-level partitioning: the circuit assigned to a chip is divided into manageable sub-circuits
NOTE: physically not necessary
[©Sherwani] [Bazargan]

5 Delay at Different Levels of Partitions (figure) — relative delays: x within a chip, 10x between chips on the same PCB, 20x between PCBs (PCB = printed circuit board). [©Sherwani] [Bazargan]

6 Partitioning: Formal Definition
Input:
 - Graph or hypergraph
 - Usually with vertex weights (sizes)
 - Usually with weighted edges
Constraints:
 - Number of partitions (K-way partitioning)
 - Maximum capacity of each partition OR maximum allowable difference between partitions
Objective:
 - Assign nodes to partitions subject to the constraints such that the cutsize is minimized
Tractability:
 - The problem is NP-complete
[Bazargan]

7 Some Terminology
Partitioning: dividing bigger circuits into a small number of partitions (top down).
Clustering: grouping small cells into bigger clusters (bottom up).
Covering / Technology Mapping: clustering such that each partition (cluster) has some special structure (e.g., can be implemented by a cell in a cell library).
k-way Partitioning: dividing into k partitions.
Bipartitioning: 2-way partitioning.
Bisectioning: bipartitioning such that the two partitions have the same size.
[Pan]

8 Circuit Representation
Netlist:
 - Gates: A, B, C, D
 - Nets: {A,B,C}, {B,D}, {C,D}
Hypergraph:
 - Vertices: A, B, C, D
 - Hyperedges: {A,B,C}, {B,D}, {C,D}
 - Vertex label: gate size/area
 - Hyperedge label: importance of net (weight)
[Pan]

9 Circuit Partitioning Formulation
Bi-partitioning formulation: minimize the interconnections between partitions X and X', i.e., the cut cost c(X, X').
 - Minimum cut: min c(X, X')
 - Minimum bisection: min c(X, X') subject to |X| = |X'|
 - Minimum ratio-cut: min c(X, X') / (|X| · |X'|)
[Pan]
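
To make the hypergraph model and these cut measures concrete, here is a minimal Python sketch (not from the slides); the netlist is the one from the Circuit Representation slide, and the helper names cut_size and ratio_cut are my own, assuming a bipartition labelled 0/1.

```python
# Illustrative sketch: nets as tuples of cells, partition as a cell -> block map.
nets = [("A", "B", "C"), ("B", "D"), ("C", "D")]   # nets {A,B,C}, {B,D}, {C,D}

def cut_size(nets, part_of):
    """Number of nets spanning more than one block (hypergraph cutsize)."""
    return sum(1 for net in nets
               if len({part_of[cell] for cell in net}) > 1)

def ratio_cut(nets, part_of):
    """c(X, X') / (|X| * |X'|) for a bipartition labelled 0 / 1."""
    cells = {cell for net in nets for cell in net}
    size_x = sum(1 for cell in cells if part_of[cell] == 0)
    size_y = len(cells) - size_x
    return cut_size(nets, part_of) / (size_x * size_y)

part_of = {"A": 0, "B": 0, "C": 1, "D": 1}     # X = {A, B}, X' = {C, D}
print(cut_size(nets, part_of))    # 2: nets {A,B,C} and {B,D} are cut
print(ratio_cut(nets, part_of))   # 2 / (2 * 2) = 0.5
```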

10 A Bi-Partitioning Example (figure) — for the example graph, min-cut size = 13, min-bisection size = 300, min-ratio-cut size = 19. Ratio-cut helps to identify natural clusters. [Pan]

11 Circuit Partitioning Formulation (Cont'd)
General multi-way partitioning formulation: partition a network N into N1, N2, ..., Nk such that
 - Each partition has an area constraint: Σ_{v ∈ Ni} a(v) ≤ Ai
 - Each partition has an I/O constraint: c(Ni, N − Ni) ≤ Ii
 - The total interconnection Σ_i c(Ni, N − Ni) is minimized
[Pan]

12 Partitioning Algorithms
 - Iterative partitioning algorithms
 - Spectral based partitioning algorithms
 - Net partitioning vs. module partitioning
 - Multi-way partitioning
 - Multi-level partitioning
 - Further study in partitioning techniques (timing-driven, ...)
[Pan]

13 Kernighan-Lin (KL) Algorithm
 - Works on non-weighted graphs
 - An iterative improvement technique
 - A two-way (bisection) partitioning algorithm
 - The partitions must be balanced (of equal size)
 - Iterate as long as the cutsize improves:
   - Find the pair of vertices that results in the largest decrease in cutsize if exchanged
   - Exchange the two vertices (potential move)
   - "Lock" the vertices
   - If no improvement is possible and some vertices are still unlocked, exchange the vertices that result in the smallest increase in cutsize
B. W. Kernighan and S. Lin, Bell System Technical Journal, 1970. [Bazargan]

14 Kernighan-Lin (KL) Algorithm
Initialize:
 - Bipartition G into V1 and V2 such that |V1| = |V2| (± 1); n = |V|
Repeat:
 - for i = 1 to n/2:
   - Find a pair of unlocked vertices v_ai ∈ V1 and v_bi ∈ V2 whose exchange makes the largest decrease or smallest increase in cut cost
   - Mark v_ai and v_bi as locked
   - Store the gain g_i
 - Find the k that maximizes Gain_k = Σ_{i=1..k} g_i
 - If Gain_k > 0, then move v_a1, ..., v_ak from V1 to V2 and v_b1, ..., v_bk from V2 to V1
Until Gain_k ≤ 0
[Bazargan]
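
As a concrete illustration of this pseudocode, here is a compact Python sketch of a single KL pass on an unweighted graph given as adjacency sets. It recomputes the D values from scratch at every step instead of using the incremental update derived on the later slides, so it favours readability over efficiency; all function and variable names are my own.

```python
def cut_cost(graph, A, B):
    """Number of edges crossing the bipartition (graph: {v: set of neighbours})."""
    return sum(1 for u in A for v in graph[u] if v in B)

def kl_pass(graph, A, B):
    """One Kernighan-Lin pass; returns a (possibly improved) bipartition."""
    A, B = set(A), set(B)
    curA, curB = set(A), set(B)          # tentative partition as swaps are made
    locked, swaps, gains = set(), [], []

    def D(v, own, other):                # D(v) = external cost - internal cost
        ext = sum(1 for u in graph[v] if u in other)
        internal = sum(1 for u in graph[v] if u in own)
        return ext - internal

    for _ in range(min(len(A), len(B))):
        best = None                      # (gain, a, b) of the best unlocked swap
        for a in curA - locked:
            for b in curB - locked:
                c_ab = 1 if b in graph[a] else 0
                gain = D(a, curA, curB) + D(b, curB, curA) - 2 * c_ab
                if best is None or gain > best[0]:
                    best = (gain, a, b)
        if best is None:
            break
        gain, a, b = best                # tentatively swap and lock the pair
        curA.remove(a); curA.add(b)
        curB.remove(b); curB.add(a)
        locked |= {a, b}
        swaps.append((a, b)); gains.append(gain)

    # keep the prefix of swaps whose cumulative gain is largest (and positive)
    best_k, best_total, total = 0, 0, 0
    for k, g in enumerate(gains, 1):
        total += g
        if total > best_total:
            best_k, best_total = k, total
    for a, b in swaps[:best_k]:
        A.remove(a); A.add(b)
        B.remove(b); B.add(a)
    return A, B

# 6-cycle 1-2-3-4-5-6-1, starting from the worst bisection (cut = 6)
G = {1: {2, 6}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 6}, 6: {5, 1}}
A, B = kl_pass(G, {1, 3, 5}, {2, 4, 6})
print(cut_cost(G, A, B))   # cut drops to 2 after one pass
```

Repeating kl_pass until the cut no longer improves gives the full KL loop from the pseudocode above.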

15 Kernighan-Lin (KL) Example (figure and table) — a graph on vertices a–h; the table lists, for each step, the vertex pair exchanged, its gain, and the resulting cut-cost. [©Sarrafzadeh] [Bazargan]

16 Kernighan-Lin (KL): Analysis
Time complexity?
 - Inner (for) loop:
   - Iterates n/2 times
   - Iteration 1 examines (n/2) × (n/2) pairs; iteration i examines (n/2 − i + 1)² pairs
 - Number of passes? Usually independent of n
 - O(n³) per pass
Drawbacks?
 - Local optimum
 - Balanced partitions only (fix: add "dummy" nodes)
 - No weights for the vertices (fix: replace a vertex of weight w with w vertices of size 1)
 - High time complexity
 - Hyper-edges? Weighted edges?
[Bazargan]

17 Gain Calculation (figure) — two blocks G_A = {a1, ..., an} and G_B = {b1, ..., bn}. For a vertex a ∈ A, the internal cost I_a is the total weight of edges from a to vertices in A, the external cost E_a is the total weight of edges from a to vertices in B, and D_a = E_a − I_a (similarly for b ∈ B). [©Kang] [Bazargan]

18 Gain Calculation (cont.)
Lemma: Consider any a_i ∈ A and b_j ∈ B. If a_i and b_j are interchanged, the gain is
  g = D_{a_i} + D_{b_j} − 2·c_{a_i b_j}
Proof: Let z be the total cost of the edges between A and B that touch neither a_i nor b_j.
 - Total cost before the interchange: T = z + E_{a_i} + E_{b_j} − c_{a_i b_j}
 - Total cost after the interchange: T' = z + I_{a_i} + I_{b_j} + c_{a_i b_j}
 - Therefore the gain is T − T' = D_{a_i} + D_{b_j} − 2·c_{a_i b_j}
[©Kang] [Bazargan]

19 Gain Calculation (cont.)
Lemma: Let D_x', D_y' be the new D values for the elements of A − {a_i} and B − {b_j}. Then after interchanging a_i and b_j,
  D_x' = D_x + 2·c_{x a_i} − 2·c_{x b_j}   for x ∈ A − {a_i}
  D_y' = D_y + 2·c_{y b_j} − 2·c_{y a_i}   for y ∈ B − {b_j}
Proof (recall D_x = E_x − I_x):
 - The edge x–a_i changes from internal (counted in I_x) to external in D_x'
 - The edge x–b_j changes from external to internal
 - Symmetrically, the edge y–b_j changes from internal to external in D_y', and the edge y–a_i changes from external to internal
More clarification in the next two slides.
[©Kang] [Bazargan]

20 Clarification of the Lemma (figure) — a vertex x ∈ A with an edge to a_i (internal) and an edge to b_j (external); after the interchange the roles of these two edges flip. [Bazargan]

21 Clarification of the Lemma (cont.)
Decompose I_x and E_x to separate out the edges to a_i and b_j:
 - Before the move: I_x = I_x* + c_{x a_i} and E_x = E_x* + c_{x b_j}, where I_x* and E_x* exclude the edges to a_i and b_j
 - After the move: I_x' = I_x* + c_{x b_j} and E_x' = E_x* + c_{x a_i}
 - Hence D_x' = E_x' − I_x' = D_x + 2·c_{x a_i} − 2·c_{x b_j}
[Bazargan]

22 Example: KL
Step 1 - Initialization (initial partition shown in the figure)
 A = {2, 3, 4}, B = {1, 5, 6}
 A' = A = {2, 3, 4}, B' = B = {1, 5, 6}
Step 2 - Compute D values
 D1 = E1 − I1 = 1 − 0 = +1
 D2 = E2 − I2 = 1 − 2 = −1
 D3 = E3 − I3 = 0 − 1 = −1
 D4 = E4 − I4 = 2 − 1 = +1
 D5 = E5 − I5 = 1 − 1 = 0
 D6 = E6 − I6 = 1 − 1 = 0
[©Kang] [Bazargan]

23 Example: KL (cont.)
Step 3 - Compute gains
 g21 = D2 + D1 − 2C21 = (−1) + (+1) − 2(1) = −2
 g25 = D2 + D5 − 2C25 = (−1) + (0) − 2(0) = −1
 g26 = D2 + D6 − 2C26 = (−1) + (0) − 2(0) = −1
 g31 = D3 + D1 − 2C31 = (−1) + (+1) − 2(0) = 0
 g35 = D3 + D5 − 2C35 = (−1) + (0) − 2(0) = −1
 g36 = D3 + D6 − 2C36 = (−1) + (0) − 2(0) = −1
 g41 = D4 + D1 − 2C41 = (+1) + (+1) − 2(0) = +2
 g45 = D4 + D5 − 2C45 = (+1) + (0) − 2(1) = −1
 g46 = D4 + D6 − 2C46 = (+1) + (0) − 2(1) = −1
The largest g value is g41 = +2, so interchange 4 and 1: (a1, b1) = (4, 1)
 A' = A' − {4} = {2, 3}, B' = B' − {1} = {5, 6}; both are non-empty
[©Kang] [Bazargan]

24 Example: KL (cont.)
Step 4 - Update the D values of the nodes connected to the exchanged vertices (4, 1)
 D2' = D2 + 2C24 − 2C21 = (−1) + 2(+1) − 2(+1) = −1
 D5' = D5 + 2C51 − 2C54 = (0) + 2(0) − 2(+1) = −2
 D6' = D6 + 2C61 − 2C64 = (0) + 2(0) − 2(+1) = −2
Assign Di = Di' and repeat Step 3:
 g25 = D2 + D5 − 2C25 = (−1) + (−2) − 2(0) = −3
 g26 = D2 + D6 − 2C26 = (−1) + (−2) − 2(0) = −3
 g35 = D3 + D5 − 2C35 = (−1) + (−2) − 2(0) = −3
 g36 = D3 + D6 − 2C36 = (−1) + (−2) − 2(0) = −3
All values are equal; arbitrarily choose g36 = −3, so (a2, b2) = (3, 6)
 A' = A' − {3} = {2}, B' = B' − {6} = {5}
New D values:
 D2' = D2 + 2C23 − 2C26 = (−1) + 2(+1) − 2(0) = +1
 D5' = D5 + 2C56 − 2C53 = (−2) + 2(+1) − 2(0) = 0
New gain with D2 ← D2', D5 ← D5':
 g25 = D2 + D5 − 2C25 = (+1) + (0) − 2(0) = +1, so (a3, b3) = (2, 5)
[©Kang] [Bazargan]

25 Example: KL (cont.)
Step 5 - Determine the number of moves to take
 g1 = +2
 g1 + g2 = (+2) + (−3) = −1
 g1 + g2 + g3 = (+2) + (−3) + (+1) = 0
The value of k that maximizes the cumulative gain G is k = 1
 X = {a1} = {4}, Y = {b1} = {1}
Move X to B and Y to A, giving A = {1, 2, 3}, B = {4, 5, 6}
Repeat the whole process: no further improvement is found, so the final solution is A = {1, 2, 3}, B = {4, 5, 6}
[Bazargan]

26 Fiduccia-Mattheyses (FM) Algorithm
 - Modified version of KL
 - A single vertex is moved across the cut in a single move (allows unbalanced partitions)
 - Vertices are weighted
 - Concept of cutsize extended to hypergraphs
 - A special (bucket) data structure improves the time complexity to O(p) per pass, linear in the total number of pins (main feature)
 - Can be extended to multi-way partitioning
C. M. Fiduccia and R. M. Mattheyses, 19th DAC, 1982. [Bazargan]

27 The FM Algorithm: Data Structure (figure) — one gain-bucket array per partition, indexed from −pmax to +pmax; each entry points to the list of free (unlocked) vertices currently having that gain, and a separate vertex array (1..n) points back into those lists. [©Sherwani] [Bazargan]

28 The FM Algorithm: Data Structure
pmax:
 - Maximum possible gain: pmax = dmax · wmax, where dmax is the maximum degree of a vertex (number of edges incident to it) and wmax is the maximum edge weight
−pmax .. +pmax array:
 - Index i is a pointer to the list of unlocked vertices with gain i
Limit on the size of a partition:
 - A maximum is defined for the sum of vertex weights in a partition (alternatively, a maximum ratio of partition sizes might be defined)
[Bazargan]
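
The bucket structure can be sketched roughly as below (illustrative Python, with plain lists standing in for the doubly linked lists of free vertices; the class and method names are my own). A real FM implementation keeps one such structure per partition and a per-vertex pointer into its bucket so removals are O(1).

```python
class GainBuckets:
    """Gain buckets indexed from -pmax to +pmax, one instance per partition."""

    def __init__(self, pmax):
        self.pmax = pmax
        self.buckets = {g: [] for g in range(-pmax, pmax + 1)}
        self.gain_of = {}          # vertex -> its current gain
        self.max_gain = -pmax      # highest bucket that might be non-empty

    def insert(self, v, gain):
        self.buckets[gain].append(v)
        self.gain_of[v] = gain
        if gain > self.max_gain:
            self.max_gain = gain

    def remove(self, v):
        g = self.gain_of.pop(v)
        self.buckets[g].remove(v)  # O(1) with a real doubly linked list

    def update(self, v, new_gain):
        """Move v to the bucket of its new gain after a neighbour was moved."""
        self.remove(v)
        self.insert(v, new_gain)

    def pop_best(self):
        """Remove and return an unlocked vertex of maximum gain (None if empty)."""
        while self.max_gain >= -self.pmax and not self.buckets[self.max_gain]:
            self.max_gain -= 1
        if self.max_gain < -self.pmax:
            return None
        v = self.buckets[self.max_gain].pop()
        del self.gain_of[v]
        return v
```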

29 The FM Algorithm
Initialize:
 - Start with a balanced partition A, B of G (can be done by sorting vertex weights in decreasing order and placing them in A and B alternately)
Iterations:
 - Similar to KL
 - A vertex cannot move if the move would violate the balance condition
 - Choosing the node to move: pick the maximum gain among the partitions
 - Moves are tentative (similar to KL)
 - When no moves are possible or no unlocked vertices remain, the pass ends
 - When no move can be made in a pass, the algorithm terminates
[Bazargan]

30 Why Hyperedges?
 - For multi-terminal nets, KL has to decompose them into many 2-terminal nets, which is not efficient
 - Consider the example in the figure: if A = {1, 2, 3} and B = {4, 5, 6}, the graph model shows a cutsize of 4, but in the real circuit only 3 wires are cut
 - Reducing the number of nets cut is more realistic than reducing the number of edges cut
[©Kang] [Bazargan]

31 Hyperedge to Edge Conversion
A hyperedge can be converted to a "clique". What weight w should the clique edges get?
 - w = 2/(n−1) has been used, also w = 2/n
 - w = 4/(n² − (n mod 2)); for n = 3, w = 4/(9−1) = 0.5
Is it always necessary to convert hyperedges to edges? (Figure: a case where the "real" cut is 1 but the clique-model "net" cut is 2w.)
[©Keutzer] [Bazargan]
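
A small illustrative helper for this clique model (names are my own), applying one of the weights quoted above to every pair of pins of a hyperedge:

```python
from itertools import combinations

def hyperedge_to_clique(pins, rule="2/(n-1)"):
    """Expand a hyperedge on n pins into weighted 2-terminal edges."""
    n = len(pins)
    if n < 2:
        return []
    if rule == "2/(n-1)":
        w = 2.0 / (n - 1)
    elif rule == "2/n":
        w = 2.0 / n
    else:                                   # rule "4/(n^2 - n mod 2)"
        w = 4.0 / (n * n - (n % 2))
    return [(u, v, w) for u, v in combinations(pins, 2)]

print(hyperedge_to_clique(["A", "B", "C"]))
# three edges of weight 1.0; with rule "4/(n^2 - n mod 2)" each weight is 0.5
```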

32 FM Gain Calculation: Direct Hyperedge Calculation
FM is able to calculate gain directly using hyperedges (it is not necessary to convert hyperedges to edges).
Definitions:
 - Given a partition (A|B), the terminal distribution of a net n is the ordered pair of integers (A(n), B(n)) giving the number of cells net n has in blocks A and B respectively (how fast can this be computed?)
 - A net is critical if there exists a cell on it whose move would change the net's cut state (whether it is cut or not)
 - Equivalently, a net is critical if A(n) = 0 or 1, or B(n) = 0 or 1
[©Keutzer] [Bazargan]

33 FM Gain Calculation: Direct Hyperedge Calculation (cont.)
The gain of a cell depends only on its critical nets:
 - If a net is not critical, its cut state cannot be affected by the move
 - A net which is not critical either before or after a move cannot influence the gains of its cells
Let F be the "from" partition of cell i and T the "to" partition:
 g(i) = FS(i) − TE(i), where
 - FS(i) = number of nets which have cell i as their only cell on the F side
 - TE(i) = number of nets connected to i that have an empty T side
[©Keutzer] [Bazargan]
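
A sketch of this FS/TE gain computation in Python (names are mine; a real implementation would cache the terminal distributions (A(n), B(n)) instead of recounting them per net):

```python
def fm_gain(cell, nets_of_cell, cells_of_net, part_of):
    """g(cell) = FS(cell) - TE(cell) for a bipartition labelled 0 / 1."""
    F = part_of[cell]          # "from" block of the cell
    T = 1 - F                  # "to" block
    fs = te = 0
    for net in nets_of_cell[cell]:
        in_F = sum(1 for c in cells_of_net[net] if part_of[c] == F)
        in_T = len(cells_of_net[net]) - in_F      # terminal distribution
        if in_F == 1:          # the cell is the net's only cell on the F side
            fs += 1
        if in_T == 0:          # the net has an empty T side
            te += 1
    return fs - te

# Netlist from the Circuit Representation slide: nets {A,B,C}, {B,D}, {C,D}
cells_of_net = {0: ["A", "B", "C"], 1: ["B", "D"], 2: ["C", "D"]}
nets_of_cell = {"A": [0], "B": [0, 1], "C": [0, 2], "D": [1, 2]}
part_of = {"A": 0, "B": 0, "C": 1, "D": 1}
print(fm_gain("B", nets_of_cell, cells_of_net, part_of))   # 1: moving B uncuts net {B,D}
```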

34 Hyperedge Gain Calculation Example (figure) — cells a through n connected by hyperedges h1–h4; the question is what happens to the cut if node "a" moves to the other partition. [Bazargan]

35 Subgraph Replication to Reduce Cutsize
 - Vertices are replicated to improve the cutsize
 - Good results if only a limited number of components is replicated
C. Kring and A. R. Newton, ICCAD, 1991. [©Sherwani] [Bazargan]

36 FM Partitioning
 - Each object is assigned a gain
 - Objects are put into a sorted gain list
 - The object with the highest gain from the larger of the two sides is selected and moved
 - The moved object is "locked"
 - Gains of "touched" objects are recomputed
 - Gain lists are resorted
Object gain: the change in cut crossings that will occur if an object is moved from its current partition into the other partition. Moves are made based on object gain.
[Pan]

FM Partitioning (slides 37–50): figure-only animation of the gain-list moves described above on an example circuit. [Pan]

51 Time Complexity of FM
For each pass:
 - Constant time to find the best vertex to move
 - After each move, the time to update the gain buckets is proportional to the degree of the moved vertex
 - Total time is O(p), where p is the total number of pins
The number of passes is usually small.
[Pan]

52 Extension by Krishnamurthy
"An Improved Min-Cut Algorithm for Partitioning VLSI Networks", IEEE Trans. Computers, C-33(5), 1984. [Pan]

53 Tie-Breaking Strategy
 - For each vertex, instead of a single gain bucket, a gain vector is used
 - The gain vector is a sequence of potential gain values corresponding to numbers of possible moves into the future; the r-th entry looks r moves ahead
 - Time complexity is O(pr), where r is the maximum number of look-ahead moves stored in the gain vector
 - If ties still occur, some researchers observe that LIFO order improves solution quality
[Pan]

54 Ratio Cut Objective by Wei and Cheng
"Towards Efficient Hierarchical Designs by Ratio Cut Partitioning", ICCAD, 1989. [Pan]

55 Ratio Cut Objective
 - It is not desirable to impose a pre-defined ratio on the partition sizes
 - Wei and Cheng proposed the ratio cut objective: try to locate natural clusters in the circuit while forcing the partitions to be of similar sizes at the same time
 - Ratio cut: R_XY = C_XY / (|X| × |Y|)
 - A heuristic based on FM was proposed
[Pan]

56 Sanchis Algorithm
"Multiple-way Network Partitioning", IEEE Trans. Computers, 38(1):62-81, 1989. [Pan]

57 Multi-Way Partitioning
 - Dividing into more than 2 partitions
 - The algorithm extends the ideas of FM and of Krishnamurthy
[Pan]

Simulated annealing See text, section

59 Paper by Johnson, Aragon, McGeoch and Schevon on Bisectioning using SA
"Optimization by Simulated Annealing: An Experimental Evaluation; Part I, Graph Partitioning", Operations Research, 37:865-892, 1989. [Pan]

60 The Work of Johnson, et al.
An extensive empirical study of simulated annealing versus iterative improvement approaches.
Conclusion: SA is a competitive approach, getting better solutions than KL for random graphs.
Remarks:
 - Netlists are not random graphs, but sparse graphs with local structure
 - SA is too slow, so KL/FM variants are still the most popular
 - Multiple runs of KL/FM variants with random initial solutions may be preferable to SA
[Pan]

61 Buffon's Needle
Given:
 - A set of parallel lines at distance 1
 - A needle of length 1
Drop the needle and find the probability that it intersects a line.
 - One can show that this probability is 2/π
 - Generate multiple trials to estimate this probability
 - Use it to calculate the value of π
 - Google this to find Java applets
This uses probabilistic methods to solve a deterministic problem: a well-established idea.
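
A quick Monte Carlo sketch of the experiment (my own code, not from the slides): sample the needle's centre offset and angle, count hits, and invert P(hit) = 2/π.

```python
import math
import random

def estimate_pi(trials=1_000_000):
    """Buffon's needle: parallel lines spaced 1 apart, needle of length 1."""
    hits = 0
    for _ in range(trials):
        y = random.uniform(0.0, 0.5)           # centre's distance to the nearest line
        theta = random.uniform(0.0, math.pi)   # needle angle against the lines
        if y <= 0.5 * math.sin(theta):         # the needle reaches the line
            hits += 1
    return 2.0 * trials / hits                 # invert P(hit) = 2 / pi

print(estimate_pi())   # something close to 3.14159
```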

62 Another Probabilistic Experiment: Random Partitions
For any partitioning problem, suppose solutions are picked uniformly at random from the state space A of all solutions, and let G ⊆ A be the set of good solutions.
 - If |G|/|A| = r, then Pr(at least 1 good solution in 5/r trials) = 1 − (1 − r)^(5/r)
 - If |G|/|A| = 0.001, then Pr(at least 1 good solution in 5000 trials) = 1 − (0.999)^5000 ≈ 0.993
[Pan]
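
The numbers above can be checked directly with a two-line computation:

```python
r = 0.001                      # fraction of good solutions, |G| / |A|
trials = int(5 / r)            # 5000 uniform random samples
print(1 - (1 - r) ** trials)   # ~0.9933: at least one good solution is very likely
```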

63 Adding Randomness to KL/FM
 - In practice, the number of good states is extremely small, so r is extremely tiny
 - Picking states purely at random (without running KL/FM) would therefore need an extremely long time
 - Running KL/FM variants several times with random initial solutions is a good idea
(Figure: cut value over the space of partitions, showing good initial states that lead to good final states.)
[Pan]

64 Some Other Approaches
 - KL/FM-SA hybrid: use a KL/FM variant to find a good initial solution for SA, then improve that solution by SA at low temperature
 - Tabu search
 - Genetic algorithms
 - Spectral methods (finding eigenvectors)
 - Network flows
 - Quadratic programming
[Pan]

65 Clustering
 - Bottom-up process
 - Merge heavily connected components into clusters
 - Each cluster becomes a new "node"
 - "Hide" internal connections (i.e., edges connecting nodes within a cluster)
 - "Merge" two edges incident to an external vertex that connect it to two nodes in the same cluster
 - Can be a preprocessing step before partitioning, with each cluster treated as a single node
[Bazargan]
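
A minimal sketch of one bottom-up clustering (coarsening) step using greedy heavy-edge matching on an ordinary weighted graph; the names are mine, and real multilevel partitioners such as hMETIS do this on hypergraphs (see the HEC/MHEC schemes below).

```python
def coarsen_once(adj):
    """adj: {v: {u: weight}} (symmetric). Returns (coarse_adj, cluster_of)."""
    cluster_of, next_id = {}, 0
    for v in adj:                                # visit vertices in some order
        if v in cluster_of:
            continue
        unmatched = [u for u in adj[v] if u not in cluster_of and u != v]
        if unmatched:                            # merge v with its heaviest free neighbour
            u = max(unmatched, key=lambda u: adj[v][u])
            cluster_of[v] = cluster_of[u] = next_id
        else:                                    # no free neighbour: v stays alone
            cluster_of[v] = next_id
        next_id += 1

    coarse = {c: {} for c in range(next_id)}
    for v, nbrs in adj.items():
        for u, w in nbrs.items():
            cv, cu = cluster_of[v], cluster_of[u]
            if cv != cu:                         # hide intra-cluster edges,
                coarse[cv][cu] = coarse[cv].get(cu, 0) + w   # merge parallel ones
    return coarse, cluster_of

adj = {1: {2: 4, 3: 1}, 2: {1: 4, 4: 1}, 3: {1: 1, 4: 3}, 4: {2: 1, 3: 3}}
print(coarsen_once(adj))   # clusters {1,2} and {3,4}; one coarse edge of weight 2
```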

66 Multilevel Hypergraph Partitioning: Applications in VLSI Domain
G. Karypis, R. Aggarwal, V. Kumar and S. Shekhar, DAC 1997. [Pan]

67 Multi-Level Partitioning [Pan]

68 Coarsening Phase
 - Edge Coarsening
 - Hyperedge Coarsening (HEC)
 - Modified Hyperedge Coarsening (MHEC)
[Pan]

69 Uncoarsening and Refinement Phase
1. FM: based on FM with two simplifications:
   - Limit the number of passes to 2
   - Early-Exit FM (FM-EE): stop each pass if k vertex moves do not improve the cut
2. HER (Hyperedge Refinement): move a group of vertices between partitions so that an entire hyperedge is removed from the cut
[Pan]

70 hMETIS Algorithm
Software implementation available for free download from the Web.
hMETIS-EE 20:
 - 20 random initial partitions
 - 10 runs using HEC for coarsening and 10 runs using MHEC for coarsening
 - FM-EE for refinement
hMETIS-FM 20:
 - 20 random initial partitions
 - 10 runs using HEC for coarsening and 10 runs using MHEC for coarsening
 - FM for refinement
[Pan]

71 Experimental Results
Compared with five previous algorithms.
hMETIS-EE 20 is:
 - 4.1% to 21.4% better
 - On average 0.5% better than the best of the 5 algorithms
 - Roughly 1 to 15 times faster
hMETIS-FM 20 is:
 - On average 1.1% better than hMETIS-EE 20
 - Improves the best-known bisections for 9 out of 23 test circuits
 - Twice as slow as hMETIS-EE 20
[Pan]