VLSI Layout Algorithms CSE 6404 Dr. Md. Saidur Rahman
Circuit Partitioning
Problem Definition
Given a graph G = (V, E), where each vertex v in V has a size s(v) and each edge e in E has a weight w(e), the problem is to divide the set V into k subsets V_1, V_2, ..., V_k such that an objective function is optimized, subject to certain constraints.
Two-way partitioning problem
–Each node has unit size.
–Each edge has unit weight.
–Find two partitions V_1 and V_2 such that V_1 and V_2 have equal size and the external wiring is minimized (that is, the cut-set is as small as possible).
Kernighan-Lin (KL) Algorithm
Initialize
–Bipartition G into V_1 and V_2 such that |V_1| = |V_2| ± 1
–n = |V|
Repeat
–for i = 1 to n/2
  Find a pair of unlocked vertices v_ai ∈ V_1 and v_bi ∈ V_2 whose exchange makes the largest decrease or smallest increase in cut-cost
  Mark v_ai and v_bi as locked
  Store the gain g_i
–Find k such that Gain_k = Σ_{i=1..k} g_i is maximized
–If Gain_k > 0 then move v_a1, ..., v_ak from V_1 to V_2 and v_b1, ..., v_bk from V_2 to V_1
Until Gain_k ≤ 0
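The KL passes can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the graph is a dict of adjacency sets with unit edge weights, `side` maps each vertex to block 0 or 1, and the names `cut_cost`, `d_value`, `kl_pass` and `kernighan_lin` are hypothetical. The gain of exchanging a and b is taken as D(a) + D(b) - 2c(a,b), where D is the external minus internal edge count.

```python
from itertools import product

def cut_cost(adj, side):
    """Count edges whose endpoints lie in different blocks (unit weights)."""
    return sum(1 for u in adj for v in adj[u] if u < v and side[u] != side[v])

def d_value(adj, side, v):
    """External minus internal edge count of v under the given assignment."""
    external = sum(1 for u in adj[v] if side[u] != side[v])
    return external - (len(adj[v]) - external)

def kl_pass(adj, side):
    """One KL pass: tentative swaps, then keep the best prefix of swaps."""
    work = dict(side)            # tentative assignment used during the pass
    unlocked = set(adj)
    swaps, gains = [], []
    for _ in range(len(adj) // 2):
        in_a = sorted(v for v in unlocked if work[v] == 0)
        in_b = sorted(v for v in unlocked if work[v] == 1)
        best = None
        for a, b in product(in_a, in_b):
            # Gain of exchanging a and b; the shared edge (if any) stays cut.
            g = d_value(adj, work, a) + d_value(adj, work, b) - 2 * (b in adj[a])
            if best is None or g > best[0]:
                best = (g, a, b)
        if best is None:
            break
        g, a, b = best
        work[a], work[b] = 1, 0  # tentative exchange
        unlocked -= {a, b}       # lock both vertices
        swaps.append((a, b))
        gains.append(g)
    # Find k maximizing the partial sum Gain_k = g_1 + ... + g_k.
    best_k, best_gain, running = 0, 0, 0
    for k, g in enumerate(gains, start=1):
        running += g
        if running > best_gain:
            best_k, best_gain = k, running
    new_side = dict(side)
    for a, b in swaps[:best_k]:
        new_side[a], new_side[b] = 1, 0
    return new_side, best_gain

def kernighan_lin(adj, side):
    """Repeat passes until a pass yields no positive gain."""
    while True:
        improved, gain = kl_pass(adj, side)
        if gain <= 0:
            return side
        side = improved

# Two triangles joined by edge 2-3, starting from a poor bipartition.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
start = {0: 0, 1: 0, 3: 0, 2: 1, 4: 1, 5: 1}
result = kernighan_lin(adj, start)
print(cut_cost(adj, start), cut_cost(adj, result))  # 5 1
```

The pair search here recomputes D values from scratch, which is simple but slower than the incremental updates used in practice.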
Variation of Kernighan-Lin (KL) Algorithm: Unequal Sized Blocks
–Total number of vertices: 2n
–Number of vertices in partition A: n_1
–Number of vertices in partition B: n_2
–Without loss of generality, assume n_1 < n_2
Unequal Sized Blocks
–Divide V into blocks A and B; A contains n_1 vertices and B contains n_2 vertices.
–Add n_2 - n_1 dummy vertices to block A. Dummy vertices have no connections to the original graph.
–Apply the K-L algorithm.
–Remove the dummy vertices.
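The padding and cleanup steps can be sketched as below. The helper names `pad_with_dummies` and `strip_dummies` and the `("dummy", i)` vertex labels are hypothetical; the only requirement from the procedure is that the dummy vertices have no edges.

```python
def pad_with_dummies(block_a, block_b):
    """Pad the smaller block with isolated dummy vertices until sizes match."""
    a, b = list(block_a), list(block_b)
    small, large = (a, b) if len(a) <= len(b) else (b, a)
    # Dummy vertices get fresh labels and no edges in the graph.
    dummies = [("dummy", i) for i in range(len(large) - len(small))]
    small.extend(dummies)
    return a, b, set(dummies)

def strip_dummies(block, dummies):
    """Remove the dummy vertices after partitioning has finished."""
    return [v for v in block if v not in dummies]

a, b, dummies = pad_with_dummies([1, 2], [3, 4, 5, 6])
print(len(a), len(b))             # 4 4
print(strip_dummies(a, dummies))  # [1, 2]
```

Because a dummy has no edges, exchanging it with a real vertex effectively moves that single vertex, so the real block sizes stay at n_1 and n_2 after the dummies are removed.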
Variation of Kernighan-Lin (KL) Algorithm: Unequal Sized Elements
–Without loss of generality, assume that the smallest element has unit size.
–Replace each element of size s with a complete graph of s vertices in which every edge has infinite weight.
–Apply the K-L algorithm.
[Figure: blocks A and B containing elements of size 3 and 2]
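A minimal sketch of the element expansion follows. The function name `expand_sized_elements` is hypothetical, and attaching each original edge to copy 0 of its endpoints is an assumption made for illustration; the slides do not say how original edges are distributed over the copies.

```python
import math
from itertools import combinations

def expand_sized_elements(sizes, edges):
    """Replace each element of size s by a clique of s unit-size vertices.

    sizes: {element: integer size}, edges: {(u, v): weight}.
    Returns weighted edges over vertices named (element, copy_index).
    """
    expanded = {}
    for elem, s in sizes.items():
        copies = [(elem, i) for i in range(s)]
        # Infinite-weight clique edges keep all copies in the same block.
        for p, q in combinations(copies, 2):
            expanded[(p, q)] = math.inf
    for (u, v), w in edges.items():
        # Assumption: original connections attach to copy 0 of each element.
        expanded[((u, 0), (v, 0))] = w
    return expanded

graph = expand_sized_elements({"x": 3, "y": 2}, {("x", "y"): 1})
print(len(graph))  # 5 edges: 3 + 1 clique edges plus the original edge
```

Cutting any infinite-weight edge would make the cut cost infinite, so a weighted K-L run never separates the copies of one element.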
K-way partitioning of kn vertices
–Begin with a random partition into k sets of n vertices each.
–Apply the two-way partitioning procedure to each pair of partitions.
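Net cuts and Edge cuts
Reducing the number of net cuts is more realistic than reducing the number of edge cuts: a net that crosses the cut needs one external connection, no matter how many pairs of its cells are separated.
[Figure: example circuit with cells j, k, m, p and q illustrating net cuts and edge cuts]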
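The difference between the two measures can be illustrated with a small sketch. It assumes a clique model for the edge cut (every pair of cells on a net counts as an edge); the function names and the example net are hypothetical.

```python
from itertools import combinations

def net_cut(nets, side):
    """Number of nets with cells in both blocks."""
    return sum(1 for cells in nets.values()
               if len({side[c] for c in cells}) > 1)

def edge_cut(nets, side):
    """Cut edges when each net is modeled as a clique over its cells."""
    return sum(1 for cells in nets.values()
               for u, v in combinations(cells, 2)
               if side[u] != side[v])

# One four-cell net split evenly between blocks A and B.
nets = {"n1": ["j", "k", "m", "p"]}
side = {"j": "A", "k": "A", "m": "B", "p": "B"}
print(net_cut(nets, side), edge_cut(nets, side))  # 1 4
```

The single net needs one crossing wire, yet the clique model charges four cut edges for it.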
Fiduccia-Mattheyses Algorithm
"A Linear-Time Heuristic for Improving Network Partitions", 19th DAC.
Given a circuit consisting of C cells connected by a set of N nets (where each net connects at least two cells), the problem is to partition the circuit into two blocks A and B such that the number of nets that have cells in both blocks is minimized and the balance factor r is satisfied.
Ideas of FM Algorithm
Similar to KL:
–Works in passes.
–Locks vertices after they are moved.
–Only the moves up to the maximum partial sum of gains are actually kept.
Difference from KL:
–Does not exchange pairs of vertices; moves only one vertex at a time.
–Uses the gain bucket data structure.
Algorithm FM_TWPP
Begin
Step 1. Compute gains of all cells.
Step 2. i = 1; select base cell and call it c_i; If no base cell Then Exit;
Step 3. Lock cell c_i; update gains of the cells on the affected critical nets;
Step 4. If a free cell exists Then i = i + 1 and select next base cell c_i; Go to Step 3;
Step 5. Select best sequence of moves c_1, c_2, c_3, ..., c_k such that the sum of gains G is maximum; If G ≤ 0 Then Exit;
Step 6. Make all k moves permanent; free all cells; Go to Step 1
End
Gain calculation
Assume that cell i is moved from block F(i) to block T(i). The gain g(i) from this move is
g(i) = FS(i) - TE(i)
where
FS(i) = the number of nets connected to cell i and not connected to any other cell in F(i)
TE(i) = the number of nets that are connected to cell i and not crossing the cut.
[Figure: example circuit partitioned into blocks A and B]
Cell i   F   T   FS(i)   TE(i)   g(i)
1        A   B   0       1       -1
2        A   B   2       1       +1
3        A   B   0       1       -1
4        B   A   1       1       0
5        B   A   1       1       0
6        B   A   1       0       +1
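The gain formula g(i) = FS(i) - TE(i) can be sketched as below. The function name `fm_gain` and the small netlist are hypothetical; the netlist is not the circuit from the slide's figure, which did not survive extraction.

```python
def fm_gain(cell, nets, side):
    """Return g(i) = FS(i) - TE(i) for moving `cell` to the other block."""
    fs = te = 0
    for cells in nets.values():
        if cell not in cells:
            continue
        in_from = sum(1 for c in cells if side[c] == side[cell])
        in_to = len(cells) - in_from
        if in_from == 1:   # cell is the net's only cell in F(i)
            fs += 1
        if in_to == 0:     # net is entirely in F(i): it does not cross the cut
            te += 1
    return fs - te

# Hypothetical netlist: cells 1 and 2 in block A, cells 4 and 5 in block B.
nets = {"a": [1, 4], "b": [1, 2], "c": [1, 5]}
side = {1: "A", 2: "A", 4: "B", 5: "B"}
print(fm_gain(1, nets, side))  # 1: FS = 2 (nets a, c), TE = 1 (net b)
print(fm_gain(2, nets, side))  # -1: FS = 0, TE = 1 (net b)
```

Moving cell 1 uncuts nets a and c but cuts net b, so the net cut drops by one, matching the gain of +1.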
Critical Net Concepts Gain Bucket Data Structure
Distribution of a net n
A(n): the number of cells of net n in A
B(n): the number of cells of net n in B
Critical Net
A net is critical if it has a cell which, if moved, will change the net's cut state. With A(n) and B(n) the number of cells of net n in A and in B, a net is critical if and only if (i) A(n) is either 0 or 1, or (ii) B(n) is either 0 or 1.
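The criticality test follows directly from the net distribution. A minimal sketch, with hypothetical function names and blocks labeled "A" and "B":

```python
def distribution(cells, side):
    """Return (A(n), B(n)) for a net given as a list of cells."""
    in_a = sum(1 for c in cells if side[c] == "A")
    return in_a, len(cells) - in_a

def is_critical(cells, side):
    """A net is critical iff A(n) or B(n) is 0 or 1."""
    in_a, in_b = distribution(cells, side)
    return in_a in (0, 1) or in_b in (0, 1)

side = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
print(is_critical([1, 2, 3], side))     # True:  B(n) = 0, any move cuts the net
print(is_critical([1, 4, 5], side))     # True:  A(n) = 1, moving cell 1 uncuts it
print(is_critical([1, 2, 4, 5], side))  # False: A(n) = B(n) = 2
```

Only cells on critical nets can have their gains changed by a move, which is why the FM update step looks at critical nets alone.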
Features of FM Algorithm
Modification of the KL algorithm:
–Can handle non-uniform vertex weights (areas).
–Allows unbalanced partitions.
–Extended to handle hypergraphs.
–Uses a clever way to select vertices to move, so it runs much faster.
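Gain Bucket Data Structure
[Figure: bucket array indexed by gain from -pmax to +pmax with a Max Gain pointer; each bucket holds a list of cell numbers; a separate cell array is labeled 1 to 2n]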
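A gain bucket structure can be sketched as follows. This is an illustration under assumptions: the class name `GainBuckets` and its methods are hypothetical, each bucket is a Python dict used as an ordered set in place of a doubly linked list, and the balance check that FM applies when picking a base cell is left out.

```python
class GainBuckets:
    """Bucket array indexed by gain in [-pmax, +pmax] with a max-gain pointer."""

    def __init__(self, pmax):
        self.pmax = pmax
        # One bucket per gain value; dict keys act as an ordered set of cells.
        self.buckets = [dict() for _ in range(2 * pmax + 1)]
        self.gain = {}            # cell -> current gain
        self.max_gain = -pmax     # upper bound on the highest occupied bucket

    def insert(self, cell, g):
        self.buckets[g + self.pmax][cell] = None
        self.gain[cell] = g
        self.max_gain = max(self.max_gain, g)

    def remove(self, cell):
        g = self.gain.pop(cell)
        del self.buckets[g + self.pmax][cell]

    def update(self, cell, delta):
        """Move a cell to the bucket for its new gain."""
        g = self.gain[cell] + delta
        self.remove(cell)
        self.insert(cell, g)

    def pop_max(self):
        """Remove and return (cell, gain) with the highest gain, or None."""
        while self.max_gain > -self.pmax and not self.buckets[self.max_gain + self.pmax]:
            self.max_gain -= 1    # slide the pointer down past empty buckets
        bucket = self.buckets[self.max_gain + self.pmax]
        if not bucket:
            return None
        cell = next(iter(bucket))
        self.remove(cell)
        return cell, self.max_gain

b = GainBuckets(pmax=2)
b.insert("x", 1)
b.insert("y", -1)
b.insert("z", 1)
b.update("y", 3)        # y moves from gain -1 to gain +2
print(b.pop_max())      # ('y', 2)
print(b.pop_max())      # ('x', 1)
```

Since every gain lies between -pmax and +pmax, where pmax is the largest number of nets on any single cell, insertions, removals and updates each touch a single bucket.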
Performance Driven Partitioning
Partitioning algorithms that deal with high-performance circuits are called performance-driven partitioning algorithms. If a critical path is cut many times by the partition, the delay on the path may be too large to meet the goals of the high-performance circuit.
[Figure: components A, B, C and D placed on boards PCB1 and PCB2, with delays labeled x, 10x and 20x]
Problem Definition
Let G = (V, E) be a weighted directed graph. Each vertex v_i in V represents a component (gate) in the circuit and each edge represents a connection between two gates. Each vertex v_i has a weight GD(v_i), specifying the gate delay associated with the gate corresponding to v_i. Each edge (v_i, v_j) has a delay associated with it, which depends on the partitions to which v_i and v_j belong. The edge delay ED_ij = (d_1, d_2, d_3) specifies the delay between v_i and v_j:
–d_1: if the edge is cut at the chip level.
–d_2: if the edge is cut at the board level.
–d_3: if the edge is cut at the system level.
Obtain an optimal partition that minimizes the delay on the critical path.
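Evaluating a candidate partition under this delay model can be sketched as a longest-path computation. Several details here are assumptions made for illustration: each vertex location is a (system, board, chip) tuple, an edge that is not cut contributes zero delay, the graph is acyclic, and the function names are hypothetical. The sketch uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

def edge_delay(loc_u, loc_v, ed):
    """Pick d_1, d_2 or d_3 from ED = (d_1, d_2, d_3) by the level of the cut."""
    d1, d2, d3 = ed
    if loc_u[0] != loc_v[0]:
        return d3                  # different systems
    if loc_u[1] != loc_v[1]:
        return d2                  # same system, different boards
    if loc_u[2] != loc_v[2]:
        return d1                  # same board, different chips
    return 0                       # assumption: uncut edges add no delay

def critical_path_delay(gate_delay, edges, loc):
    """Longest path delay; edges is {(u, v): (d_1, d_2, d_3)} on a DAG."""
    preds = {v: set() for v in gate_delay}
    for u, v in edges:
        preds[v].add(u)
    arrival = {}
    # static_order() yields each vertex after all of its predecessors.
    for v in TopologicalSorter(preds).static_order():
        incoming = [arrival[u] + edge_delay(loc[u], loc[v], edges[(u, v)])
                    for u in preds[v]]
        arrival[v] = gate_delay[v] + max(incoming, default=0)
    return max(arrival.values())

gates = {"a": 1, "b": 1, "c": 1}
edges = {("a", "b"): (2, 5, 9), ("b", "c"): (2, 5, 9)}
same_chip = {g: (0, 0, 0) for g in gates}
split = {"a": (0, 0, 0), "b": (0, 0, 1), "c": (0, 1, 0)}
print(critical_path_delay(gates, edges, same_chip))  # 3
print(critical_path_delay(gates, edges, split))      # 10
```

In the second placement the path a, b, c crosses a chip boundary and then a board boundary, so its delay is 1 + 2 + 1 + 5 + 1 = 10.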
Clustering a circuit
–Objective is to minimize delay.
–Given: capacity constraints on the cluster.
–M: largest weight that can be accommodated in a cluster (M = 4 in the example).
Clustering a circuit (example, M = 4)
The label on each vertex represents the maximum delay of a signal when the signal reaches that vertex.
Other Partitioning Algorithms
–Simulated Annealing: modeled on crystallization
–Simulated Evolution: modeled on biological evolution
References and Copyright
Some of the slides are taken from the following references (with modifications where necessary).
–[©Sarrafzadeh] © Majid Sarrafzadeh, 2001; Department of Computer Science, UCLA
–[©Sherwani] © Naveed A. Sherwani, 1992 (companion slides to [She99])
–[©Keutzer] © Kurt Keutzer, Dept. of EECS, UC Berkeley
–[©Gupta] © Rajesh Gupta, UC Irvine
–[©Kang] © Steve Kang, UIUC
–Kia Bazargan, University of
–Prof. David Z. Pan