Mixed-Size Placement with Fixed Macrocells using Grid-Warping. Zhong Xiu*, Rob Rutenbar. *Advanced Micro Devices Inc.; Department of Electrical and Computer Engineering, Carnegie Mellon University
Slide 2 Placement by Grid-Warping. In [Zhong et al, DAC04], we showed the first grid-warping placer. In [Zhong et al, DAC05], we showed our timing-driven placer. Fundamentally new idea for placement improvement: imagine we place the gates on the surface of a flexible elastic sheet, then stretch the sheet to improve the placement. [Figure: quadratic initial placement; warp placement surface; improved warped result; recurse & descend to continue]
Slide 3 Grid Warping: Attractive Features. Novel paradigm for placement: optimize the grid, not the gates. Think of “gravity”: we reshape the curvature of space to move the mass. Flexibly nonlinear: free to warp any way we like; not driven primarily by linear solves. Low-dimensional optimization problem: we only need to control the sheet; we don’t move gates individually. Early prototypes (WARP1, WARP2) perform well: competitive on wirelength with other published placers, and as fast as, or faster than, many other analytical placers.
Slide 4 Organization of this Talk. What’s missing? Handling mixed-size placement: wirelength/timing optimization is necessary but not sufficient; we must also be able to handle mixed-size placement with fixed macrocells. First, we review the basic mechanics of grid-warping. Second, we show how to extend grid-warping for mixed-size placement with fixed macrocells. Finally, we show our results and future directions.
Slide 5 Review: Mechanics of Grid-Warping. It’s conceptually useful to think of warping as distorting a regular mesh placed on the elastic placement surface, but this is not actually how we implement warping. [Figure: quadratic initial placement; warp placement surface; improved warped result; recurse & descend to continue]
Slide 6 We Formulate Warping in an “Inverse” Way. We warp to “acquire” a new set of gates in each unit grid area, then “pull” the gates back to the undistorted grid to move them. [Figure panels: initial placement mass and grids; warp grids and acquire gates; restore grids and pull gates back]
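A minimal 1-D sketch of the inverse formulation, under assumptions not stated in the slides: one vertical cut, a unit-width sheet, and a piecewise-linear pull-back. The names `pull_back`, `cut`, and `width` are hypothetical. Moving the cut "acquires" a chosen set of gates on each side; restoring the cut to the midline then drags every gate along with the sheet.

```python
def pull_back(x, cut, width=1.0):
    """Map coordinate x on the warped sheet back to the regular grid.

    The warped cut at `cut` is restored to the midline `width / 2`, and
    each side is rescaled linearly (an assumed, illustrative map).
    Requires 0 < cut < width.
    """
    mid = width / 2.0
    if x <= cut:
        return x * mid / cut                      # left region maps onto [0, mid]
    return mid + (x - cut) * mid / (width - cut)  # right region maps onto [mid, width]


# Gates clustered near the left edge of the sheet.
gates = [0.05, 0.10, 0.15, 0.20, 0.30, 0.90]
cut = 0.175  # warped cut position: three gates fall on each side
moved = [pull_back(x, cut) for x in gates]
```

In this toy case the dense cluster is split so that three gates land on each side of the midline, and no gate is moved individually; only the cut position is a free variable.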
Slide 7 And We Do Not Use a Regular Warping Grid. Instead, we use a grid defined by a set of slicing cuts. It turns out this allows a greater range of motion for the gates. Yes, this is a lot like quadrisection or partitioning, but more general: the cuts need not be axis-parallel. Because gates are fully placed in each region, we get real wirelength. [Figure panels: 2x2 warping grid; 4x4 warping grid]
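One way to see what a non-axis-parallel slicing cut means in practice is to classify gates by which side of an oblique line they fall on. This is only an illustrative sketch; the function name `side_of_cut` and the two-endpoint representation of a cut are assumptions, not the representation used in WARP.

```python
def side_of_cut(p0, p1, gate):
    """Return +1 if gate lies left of the directed cut p0 -> p1,
    -1 if it lies right of it, and 0 if it lies exactly on the cut."""
    cross = ((p1[0] - p0[0]) * (gate[1] - p0[1])
             - (p1[1] - p0[1]) * (gate[0] - p0[0]))
    return (cross > 0) - (cross < 0)


# A slanted cut from the bottom edge to the top edge of a 10 x 10 region.
p0, p1 = (4, 0), (6, 10)
sides = [side_of_cut(p0, p1, g) for g in [(2, 5), (8, 5), (5, 5)]]
```

Tilting the cut changes which gates each region acquires, which is the extra freedom an axis-parallel bisection does not have.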
Slide 8 Complete Grid Warping Flow. The complete flow has several steps; we review them briefly here. [Flow diagram: Quadratic Placement; Pre-Warping; Nonlinear Grid-Warping Loop; Improvement; Decompose/Recurse; Legalize (Domino)]
Slide 9 Complete Grid Warping Flow. Quadratic place onto elastic sheet. Note: pure quadratic wirelength, no reweighting steps.
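A small sketch of pure quadratic placement in one dimension, assuming 2-pin connections with fixed pads and a plain Gauss-Seidel solver (the slides do not say which solver is used). The function name `quadratic_place_1d` and the node-numbering convention are hypothetical. Minimizing the sum of w*(x_i - x_j)^2 puts every movable gate at the weighted average of its neighbors, which is what each sweep enforces.

```python
def quadratic_place_1d(n_movable, edges, fixed, iters=500):
    """Minimize sum of w * (x_i - x_j)**2 by Gauss-Seidel sweeps.

    edges: list of (i, j, w); node ids >= n_movable are fixed pads
    fixed: dict mapping fixed node id -> coordinate
    Assumes every movable gate has at least one connection.
    """
    x = [0.0] * n_movable
    nbrs = [[] for _ in range(n_movable)]
    for i, j, w in edges:
        if i < n_movable:
            nbrs[i].append((j, w))
        if j < n_movable:
            nbrs[j].append((i, w))

    def pos(k):
        return x[k] if k < n_movable else fixed[k]

    for _ in range(iters):
        for i in range(n_movable):
            total_w = sum(w for _, w in nbrs[i])
            # Optimal position given the neighbors: their weighted mean.
            x[i] = sum(w * pos(j) for j, w in nbrs[i]) / total_w
    return x


# Three gates (0, 1, 2) chained between pads at 0.0 (id 3) and 4.0 (id 4).
chain = [(3, 0, 1.0), (0, 1, 1.0), (1, 2, 1.0), (2, 4, 1.0)]
x = quadratic_place_1d(3, chain, {3: 0.0, 4: 4.0})
```

For this chain the gates settle at evenly spaced positions between the pads. On real netlists the same objective tends to cluster gates, which is why the later spreading steps are needed.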
Slide 10 Complete Grid Warping Flow. Pre-warping: a geometric pre-conditioning step that spreads gates out quickly and uniformly to improve final wirelength.
Slide 11 Complete Grid Warping Flow. Nonlinear optimizer iteratively perturbs the warping grid on the sheet.
Slide 12 Complete Grid Warping Flow. Nonlinear optimizer iteratively perturbs the warping grid on the sheet; each new warping is quickly “stretched” back to a full placement. We use this placement to evaluate the cost function, which tracks “rectilinear wirelength + capacity”.
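The wirelength part of such a cost function can be sketched with half-perimeter wirelength (HPWL), a common estimate of rectilinear wirelength. Whether WARP uses HPWL or another rectilinear estimate is not stated in the slides, so treat this as an assumption; the name `hpwl` and the data layout are hypothetical.

```python
def hpwl(nets, pos):
    """Sum over nets of the half-perimeter of each net's bounding box.

    nets: list of lists of gate names
    pos:  dict mapping gate name -> (x, y)
    """
    total = 0.0
    for net in nets:
        xs = [pos[g][0] for g in net]
        ys = [pos[g][1] for g in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


pos = {"a": (0, 0), "b": (3, 4), "c": (1, 1)}
nets = [["a", "b"], ["a", "b", "c"]]
wirelength = hpwl(nets, pos)
```

Because every candidate warping is stretched back to a full placement first, a measure like this is evaluated on actual gate coordinates rather than on a partition-level estimate.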
Slide 13 Complete Grid Warping Flow. Nonlinear optimizer delivers a final warped placement. Standard improvement step runs hMetis to optimize the location of gates placed near partition cuts.
Slide 14 Complete Grid Warping Flow. Recurse: in this case, 4 new placements inside 4 regions. Continue until only a few gates remain per region.
Slide 15 Complete Grid Warping Flow. Warping flow delivers a final, but still slightly illegal, placement. Use Domino (T.U. Munich) to legalize to the final detailed placement.
Slide 16 Problem: Warped Placement with Macrocells. Assumptions: we focus on the fixed-macro case. The core problem: warping is intrinsically weak at separating large macros and small gates. All instances are modeled as points, and elastic “stretching” keeps nearby points close.
Slide 17 Handling Fixed Macrocells. 4 new geometric solutions: (1) inside warping, a geometric “hash” function that greedily relocates gates that warp on top of macrocells; (2) a new net model (QP); (3) during partition improvement, closer attention to size imbalances; (4) a new backend (FastPlace). [Flow diagram boxes: Improvement; Pre-Warping; Decompose/Recurse; Nonlinear Grid-Warping Loop; Quadratic Placement; Legalize (FastPlace); Re-Warping]
Slide 18 (1) Geometric Hashing During Warping. Problem: nonlinear warping drops gates on top of fixed macros. Solution: “hash” them off, inside the warping loop. Inside warping, inside each evaluation of the global cost function, check whether each cell overlaps a macro. If so, we push it to the nearest boundary that has enough space for the cell. We chop the chip up into small grids and store “nearest boundary” info in a hash table. Note: no attempt to manage density or wirelength in this solution, just legality. [Figure: fixed macros, each labeled M]
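A rough sketch of the lookup-table idea, under simplifying assumptions: rectangular macros, square bins, a bin counted as blocked when its centre lies inside a macro, and no check that the target boundary has free space for the cell (the slides do require that check). The names `build_push_table` and `hash_off` are hypothetical.

```python
def build_push_table(macros, bin_size, nx, ny):
    """Precompute, for each bin whose centre lies inside a macro, the
    nearest point on that macro's boundary.

    macros: list of (xlo, ylo, xhi, yhi) rectangles
    Returns a dict keyed by bin index (ix, iy).
    """
    table = {}
    for ix in range(nx):
        for iy in range(ny):
            cx, cy = (ix + 0.5) * bin_size, (iy + 0.5) * bin_size
            for xlo, ylo, xhi, yhi in macros:
                if xlo < cx < xhi and ylo < cy < yhi:
                    # Candidate exits through the left, right, bottom, top edges.
                    exits = [
                        (cx - xlo, (xlo, cy)),
                        (xhi - cx, (xhi, cy)),
                        (cy - ylo, (cx, ylo)),
                        (yhi - cy, (cx, yhi)),
                    ]
                    table[(ix, iy)] = min(exits)[1]
                    break
    return table


def hash_off(x, y, table, bin_size):
    """Return the stored boundary point if (x, y) falls in a blocked bin,
    otherwise leave the gate where it is."""
    key = (int(x // bin_size), int(y // bin_size))
    return table.get(key, (x, y))


table = build_push_table([(2.0, 2.0, 6.0, 6.0)], bin_size=1.0, nx=8, ny=8)
pushed = hash_off(2.7, 4.2, table, 1.0)   # gate dropped on the macro
kept = hash_off(0.5, 0.5, table, 1.0)     # gate in free space
```

The table is built once per set of fixed macros, so each overlap check inside the cost-function evaluation reduces to a dictionary lookup.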
Slide 19 (2) New Net Models. 1st QP and QPs in re-warping: for 2-pin nets, use the clique model and set the weight to 1. For nets with 3 or more pins, use the star model and introduce a new variable; each net has a weight of #Pins/(#Pins - 1) (FastPlace, ISPD’04). 2nd QP and beyond: at the 2nd layer of QP and beyond, use Jens Vygen’s net-split technique. Hybrid models make the QP faster while conserving quality. If a column (row) contains only one cell, the star model is used and no additional variable is introduced. If a column (row) contains two or more cells, a new cell is introduced and is connected to all the internal cells and all the propagated pins.
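The hybrid clique/star rule for the first QP can be sketched as an edge-list builder. The weights follow the rule stated on the slide; the function name `build_net_edges`, the tuple used to name star nodes, and the choice to skip single-pin nets are assumptions made for illustration.

```python
def build_net_edges(nets):
    """Expand nets into weighted 2-pin edges for a quadratic placer.

    2-pin nets become one clique edge of weight 1. Nets with k >= 3 pins
    get one extra star node (a new variable), with each pin connected to
    it by an edge of weight k / (k - 1).
    Returns (edges, number_of_star_nodes); edges are (u, v, weight).
    """
    edges = []
    star_count = 0
    for net in nets:
        k = len(net)
        if k == 2:
            edges.append((net[0], net[1], 1.0))
        elif k >= 3:
            star = ("star", star_count)  # hypothetical id for the new variable
            star_count += 1
            weight = k / (k - 1)
            edges.extend((pin, star, weight) for pin in net)
    return edges, star_count


edges, n_stars = build_net_edges([["a", "b"], ["a", "c", "d"]])
```

A k-pin star adds k edges and one variable, whereas a k-pin clique would add k(k-1)/2 edges, which is where the speed-up on large nets comes from.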
Slide 20 (3) Better Consideration of Capacity. Get rid of “Pre-Warping”. Cost function: use a single-sided cost function, so under-filled regions receive no capacity penalty. hMetis: the total area of all cells in each region does not exceed the capacity of that region. Re-Warping: all of the above-mentioned modifications are applied here. [Flow diagram boxes: Improvement; Pre-Warping; Decompose/Recurse; Nonlinear Grid-Warping Loop; Quadratic Placement; Legalize (FastPlace); Re-Warping]
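A single-sided capacity term can be sketched as a penalty on overflow only. The linear form below is an assumption (the slides do not give the exact expression), and the name `capacity_penalty` is hypothetical.

```python
def capacity_penalty(region_area, region_capacity):
    """Penalize only over-filled regions; under-filled regions add zero.

    region_area:     total cell area assigned to each region
    region_capacity: available placement area of each region
    """
    return sum(max(0.0, area - cap)
               for area, cap in zip(region_area, region_capacity))


# Region 0 overflows by 2, region 1 is under-filled, region 2 is exactly full.
penalty = capacity_penalty([12.0, 5.0, 8.0], [10.0, 10.0, 8.0])
```

With fixed macros consuming different amounts of each region, a two-sided term would also penalize regions for being sparse, which can pull gates toward space that is actually blocked.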
Slide 21 (4) New Backend. Use ideas from FastPlace [ICCAD’05]: swap-based detailed placement, a greedy algorithm. Find the “optimal region” for a cell; if the cell is not in this region, try to swap it with a cell or space in that optimal region. Detailed backend flow: first-pass legalization; FengShui 5.1 from SUNY Binghamton; local repair for small macros/overlaps; local greedy swaps; wirelength minimization; Chu’s FastPlace legalizer ideas. Our new overall flow: run WARP as global placement; sort all the cells and roughly legalize them; run the new detailed backend.
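The “optimal region” of a cell is commonly defined by medians: for every net on the cell, take the bounding box of the net's other pins, then take the two middle values of all box edges in x and in y. The sketch below assumes that definition; the slides do not spell it out, and the name `optimal_region` and the data layout are hypothetical.

```python
def optimal_region(cell, nets, pos):
    """Return ((xlo, xhi), (ylo, yhi)) for the cell's optimal region,
    or None if the cell shares no net with another pin."""
    xs, ys = [], []
    for net in nets:
        if cell not in net:
            continue
        others = [pos[p] for p in net if p != cell]
        if not others:
            continue
        # Bounding box of the net with the cell itself removed.
        xs += [min(p[0] for p in others), max(p[0] for p in others)]
        ys += [min(p[1] for p in others), max(p[1] for p in others)]
    if not xs:
        return None
    xs.sort()
    ys.sort()
    m = len(xs) // 2  # lists have even length; take the two middle values
    return (xs[m - 1], xs[m]), (ys[m - 1], ys[m])


pos = {"c": (9, 9), "a": (0, 0), "b": (4, 2), "d": (6, 8)}
nets = [["c", "a", "b"], ["c", "d"]]
region = optimal_region("c", nets, pos)
```

In this example cell `c` sits at (9, 9), outside its region, so a swap-based pass would look for a cell or empty space inside the region to exchange it with.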
Slide 22 Macrocell Results – ISPD’02. Wirelength: 3-4% better than Feng Shui; competitive with BonnPlace. Run time: 2X Feng Shui; roughly competitive with BonnPlace. Exact comparison is difficult: BonnPlace is running on a 1.45GHz IBM 4-processor server and is explicitly parallelized software; we’re on a 2.0GHz Xeon. [Table columns: Feng Shui 5.1 (Wire, CPU); BonnPlace (Wire, CPU); Warp 3 (Wire, CPU)]
Slide 23 ISPD’02 Layouts. [Figure panels: ibm01; ibm04]
Slide 24 More Macrocell Results – ISPD’05. Versus APlace: ~7% more wirelength. Total hours for all 6 designs for Warp on 2.8GHz Xeon. Aside, ISPD’05 contest results: APlace 1.00, mFAR 1.06, Dragon 1.08, mPL 1.09, FastPlace 1.16, Capo 1.17, NTUP 1.21, Fengshui 1.50, Kraftwerk+Domino 1.84. [Table header: Design, WARP 3, APlace; sub-columns: Global, Legalization, Backend, Wirelen; rows: adaptec, adaptec, bigblue, bigblue, bigblue, bigblue, Ratio]
Slide 25 ISPD’05 Layouts. [Figure panels: adaptec2; adaptec4]
Slide 26 ISPD’05 Layouts. [Figure panels: bigblue2; bigblue3]
Slide 27 Conclusion and Future Work. Placement with macrocells (WARP3) is competitive. New techniques such as geometric hashing and an improved hybrid net model can produce very good quality mixed-size placements reasonably quickly. Future work: improve both quality and runtime; better handle macrocells and routing congestion; “hybrid” layout strategies (warping, but in a flatter, analytical style).