ePlace: Electrostatics based Placement CK Cheng UC San Diego 2/28/2013 9/22/2018 UC San Diego
Outline
  Introduction
  Flow of Placement
  Statement of Problem
  Wire Length and Density Approximation: Electrostatic Analogy
  Nonlinear Optimization
    Nesterov’s method
    Preconditioning
  Macro Legalization
    Annealing-directed block shifting
  Experiments and Results
  Conclusion and Future Work
Introduction [Figure: design flow with the stages Synthesis, Analysis, ECO, Placement, Routing]
Introduction
  Placement: an NP-complete problem.
  Analytic placement: placement whose objective function has a first derivative at every point of the placement domain.
  Heuristic approach: mathematical derivation with empirical tuning.
ePlace Flow
MFNPL Algorithm
  One-stage approach
    Simultaneous placement of cells & macros
    Based on our existing placement prototype [Lu13]
  Nonlinear optimization
    Solver: Nesterov’s method (cf. conjugate gradient)
    Steplength: dynamic prediction of the Lipschitz constant
    Convergence: backtracking to get closer to the local optimum
    Mixed size: preconditioning on the Hessian matrix to resolve the gap between macros and cells
  Macro legalization
    Annealing: direction of local macro shifting
    Process flow: nested loop of parameter adjustment
Cell Placement
Statement of Problem
  Objective: min HPWL
  Constraint: no density violation in any grid b ∈ B
  Density penalty: potential energy, D = (1/2) Σ_i q_i ψ_i
  Objective function: objective + density penalty, f = W + λD
  Symbols
    B: grid set
    q_i: charge (cell area)
    ψ_i: electric potential
    λ: penalty factor
Wire Length Approximation [Kahng05, Chan06, Chen08]
  High-order density and wirelength function
    Exponential function to extract boundary pins
    High accuracy & complexity
  Line search to locate local optimum
    Repeated objective evaluation: low efficiency
  Multi-level netlist & grid coarsening
    Suboptimal clustering: quality degradation
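The "exponential function to extract boundary pins" can be illustrated with a log-sum-exp smoothing of HPWL. This is a minimal sketch: the slide does not give the exact formula, so the log-sum-exp form, the function names and the smoothing parameter `gamma` are assumptions made for illustration.

```python
import math

def hpwl(pins):
    """Exact half-perimeter wirelength of one net; pins is a list of (x, y)."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def _log_sum_exp(values, gamma):
    """Smooth maximum gamma * log(sum(exp(v / gamma))), shifted for stability."""
    m = max(values)
    return m + gamma * math.log(sum(math.exp((v - m) / gamma) for v in values))

def lse_wirelength(pins, gamma):
    """Differentiable over-estimate of HPWL (assumed log-sum-exp form)."""
    total = 0.0
    for axis in (0, 1):
        coords = [p[axis] for p in pins]
        smooth_max = _log_sum_exp(coords, gamma)
        smooth_min = -_log_sum_exp([-c for c in coords], gamma)
        total += smooth_max - smooth_min
    return total

pins = [(0.0, 0.0), (3.0, 4.0), (1.0, 2.0)]
exact = hpwl(pins)
smooth = lse_wirelength(pins, gamma=0.01)
```

The exponentials weight the boundary pins most heavily, so the smooth value never falls below the exact HPWL and exceeds it by at most 4·gamma·ln(number of pins). A smaller `gamma` is more accurate but makes the function sharper, which is the accuracy versus complexity trade-off the slide points at.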
Density Approximation: Electrostatic Analogy

  Placement Instance     Electrostatic System
  Cell Instances         Electric Particles
  Cell Density           Charge Density
  Density Penalty        Potential Energy
  Density Gradient       Electric Field
Modified System [Figure panels: original distribution; distribution without the direct current (DC) component, which gives a globally even distribution; converged distribution]
[Figure: ADAPTEC1 on a 128x128 grid. Panels: cell placement, charge density, electric field distribution, potential distribution and potential energy]
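Poisson’s Equation
  ∇·∇ψ(x,y) = −ρ(x,y), where ρ(x,y) is the charge density and ψ(x,y) the electric potential
  Neumann boundary condition: n̂·∇ψ(x,y) = 0 on the boundary of the placement region, where n̂ is the outer normal vector
  Total charge density: zero, ∬ ρ(x,y) dx dy = 0, which gives an even density distribution
  Total potential: zero, ∬ ψ(x,y) dx dy = 0, which gives a unique solution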
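The zero-total-charge condition is needed for the Neumann problem to be solvable. A short derivation, assuming the sign convention ∇·∇ψ = −ρ and writing R for the placement region (a symbol introduced here for illustration): integrate Poisson's equation over R and apply the divergence theorem.

```latex
\iint_R \rho(x,y)\,dx\,dy
  \;=\; -\iint_R \nabla\cdot\nabla\psi(x,y)\,dx\,dy
  \;=\; -\oint_{\partial R} \hat{n}\cdot\nabla\psi\,ds
  \;=\; 0
```

The boundary integral vanishes because of the Neumann condition, so a solution exists only when the net charge is zero. The solution is then determined up to an additive constant, and requiring zero total potential fixes that constant.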
2D cosine series generation
  Density expression: 2D DCT expansion
  Potential function
  Field function
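As a sketch of how the three expressions fit together: expand the density in cosines, divide each coefficient by the squared frequency to get the potential, and differentiate to get the field. The code below assumes cell-centered samples on an n x n grid, frequencies w_u = πu/n and the sign convention ∇·∇ψ = −ρ. It uses direct summation for readability, whereas the slides use a fast transform, and all function names are hypothetical.

```python
import math

def cosine_coefficients(rho):
    """Cosine-series coefficients a[u][v] of an n x n density map rho[x][y]."""
    n = len(rho)
    w = [math.pi * k / n for k in range(n)]  # assumed frequencies w_u = pi*u/n
    a = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                cx = math.cos(w[u] * (x + 0.5))
                for y in range(n):
                    s += rho[x][y] * cx * math.cos(w[v] * (y + 0.5))
            scale = (1 if u == 0 else 2) * (1 if v == 0 else 2)
            a[u][v] = scale * s / (n * n)
    a[0][0] = 0.0  # drop the DC term so that the total charge is zero
    return a, w

def potential(a, w, px, py):
    """Potential at continuous coordinates (px, py) in [0, n] x [0, n]."""
    n = len(a)
    total = 0.0
    for u in range(n):
        for v in range(n):
            if u == 0 and v == 0:
                continue  # DC term removed; also avoids dividing by zero
            total += (a[u][v] / (w[u] ** 2 + w[v] ** 2)
                      * math.cos(w[u] * px) * math.cos(w[v] * py))
    return total

def field_x(a, w, px, py):
    """x-component of the field, i.e. minus the x-derivative of the potential."""
    n = len(a)
    total = 0.0
    for u in range(1, n):  # u = 0 contributes nothing because sin(0) = 0
        for v in range(n):
            total += (a[u][v] * w[u] / (w[u] ** 2 + w[v] ** 2)
                      * math.sin(w[u] * px) * math.cos(w[v] * py))
    return total

# Toy density: the left half of an 8 x 8 grid is full, the right half empty.
n = 8
rho = [[1.0 if x < n // 2 else 0.0 for _ in range(n)] for x in range(n)]
a, w = cosine_coefficients(rho)
center_field = field_x(a, w, n / 2, n / 2)   # pushes from the full half to the empty half
left_wall_field = field_x(a, w, 0.0, 3.0)    # normal field on the boundary
```

Because every basis function is a cosine, the normal field on the region boundary is zero by construction, which is the Neumann condition.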
Boundary Condition Verification [Figure: ADAPTEC1 on a 128x128 grid at iteration 100 and iteration 150]
  The boundary force is zero!
Runtime Density Gradient [Figure: ADAPTEC1 on a 128x128 grid at iteration 3 and iteration 33]
  Global Information Aware
Complexity Analysis
  m: total movable nodes (standard cells & filler cells)
    Density computation: O(m)
  n: size of each dimension of the 2-D grid (uniform decomposition)
    2D fast Fourier transform: O(n^2 log n)
  Total complexity: O(m log m)
    m = Θ(n^2), since # grids ≈ # cells
    O(m + n^2 log n) = O(m log m)
Nesterov’s Method: Problems of CG & Line Search
  Runtime bottleneck
    CPU breakdown from ADAPTEC1 on a 512x512 grid
    Line search: 50.38% of PL, 63.22% of GP
  Zero gradient point?
    May actually lie beyond the search interval …
  Previous solution: step size prediction
    Empirical method [Chen08]: results oriented
  Our systematic solution: Nesterov’s method
    Runtime Lipschitz constant prediction
Nesterov’s Method
  Function and gradient vector: f(x) ∈ C^{1,1}(E)
  Definition of the Lipschitz constant
  Convex function inequality
  Step length limit? Prediction
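The formulas on this slide did not survive extraction. For reference, the textbook statements behind these headings are sketched below, using L for the Lipschitz constant of the gradient and α for the step length (notation assumed here).

```latex
% Lipschitz-continuous gradient: f \in C^{1,1}(E)
\|\nabla f(u) - \nabla f(v)\| \le L\,\|u - v\| \qquad \forall\, u, v \in E

% Resulting quadratic upper bound
f(v) \le f(u) + \langle \nabla f(u),\, v - u \rangle + \tfrac{L}{2}\,\|v - u\|^2

% Gradient step of length \alpha
f\bigl(u - \alpha \nabla f(u)\bigr) \le f(u) - \alpha\Bigl(1 - \tfrac{\alpha L}{2}\Bigr)\|\nabla f(u)\|^2
```

The last bound guarantees descent for 0 < α < 2/L and is tightest at α = 1/L. Since L is not known in advance for the placement objective, it has to be predicted at run time.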
Nesterov’s Method
  Objective function: f = W + λD
  Gradient vector
  Iterative approach
    Step length search
    New solution & parameters
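A minimal sketch of the iterative approach, using the standard form of Nesterov's accelerated gradient method with a fixed step length 1/L. The toy quadratic objective stands in for f = W + λD, and the function and variable names are assumptions of this sketch.

```python
import math

def nesterov(grad, v0, alpha, iterations):
    """Nesterov's accelerated gradient method with a fixed step length alpha."""
    u = list(v0)   # major solution
    v = list(v0)   # reference solution where the gradient is evaluated
    a = 1.0        # optimization parameter
    for _ in range(iterations):
        g = grad(v)
        u_next = [vi - alpha * gi for vi, gi in zip(v, g)]
        a_next = (1.0 + math.sqrt(4.0 * a * a + 1.0)) / 2.0
        # Extrapolate along the direction of the last move.
        v = [un + (a - 1.0) * (un - uo) / a_next for un, uo in zip(u_next, u)]
        u, a = u_next, a_next
    return u

# Toy convex objective f(x) = 0.5 * sum(h_i * x_i^2) with Lipschitz constant max(h).
h = [1.0, 10.0, 100.0]

def f(x):
    return 0.5 * sum(hi * xi * xi for hi, xi in zip(h, x))

def grad_f(x):
    return [hi * xi for hi, xi in zip(h, x)]

x0 = [1.0, 1.0, 1.0]
solution = nesterov(grad_f, x0, alpha=1.0 / max(h), iterations=200)
```

Each iteration costs one gradient evaluation and no line search, which is what removes the runtime bottleneck described for conjugate gradient.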
Dynamic Steplength Prediction
  Lipschitz constant prediction, then step length prediction
    Known solution from iterations k & k-1
    Known gradient from iterations k & k-1
    Zero computation overhead
  However, α_k may not be accurate enough…
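One way to read this slide: approximate the Lipschitz constant by the ratio of the gradient change to the solution change over the last two iterations, and use its inverse as the step length. The sketch below encodes that reading; the exact formula is not on the slide, and the function name is hypothetical.

```python
import math

def _norm(vec):
    return math.sqrt(sum(c * c for c in vec))

def predict_steplength(v_prev, g_prev, v_curr, g_curr):
    """Step length 1/L, with L estimated from two consecutive iterates.

    Assumed formula: L ~ |g_curr - g_prev| / |v_curr - v_prev|.
    Only values that are already known are used, so no extra gradient
    evaluation is required.
    """
    dv = _norm([c - p for c, p in zip(v_curr, v_prev)])
    dg = _norm([c - p for c, p in zip(g_curr, g_prev)])
    if dg == 0.0:
        raise ValueError("gradients are identical; the estimate is undefined")
    return dv / dg

# For f(x) = 2 * x^2 the gradient is 4x, so the exact Lipschitz constant is 4.
step = predict_steplength([1.0], [4.0], [0.5], [2.0])
```

For a quadratic in one variable the estimate is exact. In general it only reflects curvature along the last move, which is why the prediction may not be accurate enough on its own.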
Correction: Steplength Backtracking
  1. Initial prediction
  2. Temporary solution
  3. Reference prediction
  4. Final prediction & solution
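The four steps can be sketched as one Nesterov iteration with backtracking. The acceptance rule (keep the prediction once the reference prediction is at least `eps` times as large), the value `eps = 0.95`, the cap on backtracks and all names are assumptions of this sketch, not details confirmed by the slides.

```python
import math

def _norm(vec):
    return math.sqrt(sum(c * c for c in vec))

def _secant_steplength(v_a, g_a, v_b, g_b):
    """Inverse Lipschitz estimate from two points and their gradients."""
    dg = _norm([p - q for p, q in zip(g_a, g_b)])
    dv = _norm([p - q for p, q in zip(v_a, v_b)])
    return dv / dg if dg > 0.0 else math.inf

def backtracked_step(grad, u, v, v_prev, a, eps=0.95, max_backtracks=10):
    """One Nesterov iteration; assumes grad(v) differs from grad(v_prev)."""
    g = grad(v)
    # 1. Initial prediction (a real placer would reuse the stored gradient).
    alpha = _secant_steplength(v, g, v_prev, grad(v_prev))
    a_next = (1.0 + math.sqrt(4.0 * a * a + 1.0)) / 2.0
    backtracks = 0
    while True:
        # 2. Temporary solution computed with the current prediction.
        u_next = [vi - alpha * gi for vi, gi in zip(v, g)]
        v_next = [un + (a - 1.0) * (un - uo) / a_next
                  for un, uo in zip(u_next, u)]
        # 3. Reference prediction measured between v and the temporary solution.
        alpha_ref = _secant_steplength(v_next, grad(v_next), v, g)
        # 4. Accept, or shrink the prediction to the reference and retry.
        if alpha_ref >= eps * alpha or backtracks >= max_backtracks:
            return u_next, v_next, a_next, alpha, backtracks
        alpha = alpha_ref
        backtracks += 1

def grad_f(p):
    """Gradient of the toy objective f(x, y) = 0.5 * (x^2 + 100 * y^2)."""
    return [p[0], 100.0 * p[1]]

# The previous move was along the flat x direction, so the first prediction
# (alpha = 1) is far too long for the steep y direction.
u_new, v_new, a_new, alpha, backtracks = backtracked_step(
    grad_f, u=[0.0, 1.0], v=[0.0, 1.0], v_prev=[1.0, 1.0], a=1.0)
```

In this toy case one backtrack corrects the step length from 1 to 0.01. When the initial prediction already passes the check, the loop exits after the first test and no extra iteration is spent.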
Steplength Backtracking
  Correction on mis-prediction of steplength
    Get closer to the local optimum
  Improve convergence of nonlinear optimization
    Before: density starts oscillating at ~15%
    After: density down to ~2.2%
    Average of 16 MMS cases
    Stopping threshold set to 1.0%
  Limited overhead on efficiency
    Zero overhead if the 1st backtrack is passed
    Average # backtracks: 1.037
    Average of 16 MMS cases, < 4% CPU overhead on GP
Preconditioning
Preconditioning
  Local quadratic approximation
    Nonlinear problem: Hessian matrix
  Approximation by diagonal matrix
    Complexity reduced from O(n^2) to O(n)
    Feasible on modern IC design
  Hessian inversion
    Reduce condition number
    Clustering eigenvalues & eigenvectors together
  Ideal case: one step to converge
    Single eigenvalue & eigenvector
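A minimal sketch of the ideal case: when the Hessian is exactly diagonal, dividing the gradient by the diagonal turns one unit-length step into the exact minimizer, however different the entries are. The toy quadratic, the `floor` guard against non-positive entries and the function name are assumptions of this sketch.

```python
def precondition(gradient, hessian_diagonal, floor=1e-12):
    """Scale each gradient entry by the inverse of the matching diagonal entry.

    The floor keeps the scaling positive; it is an assumed guard for entries
    that are zero or negative, not something taken from the slides.
    """
    return [g / max(h, floor) for g, h in zip(gradient, hessian_diagonal)]

# Toy objective f(x) = 0.5 * sum(h_i * x_i^2): one "cell" and one "macro"
# variable whose curvatures differ by four orders of magnitude.
h = [1.0, 1.0e4]
x = [3.0, -2.0]
gradient = [hi * xi for hi, xi in zip(h, x)]

step = precondition(gradient, h)
x_new = [xi - si for xi, si in zip(x, step)]
```

Without preconditioning, a step length that is safe for the second variable (about 1e-4) would barely move the first one. After scaling, both variables see the same effective curvature, which corresponds to a single clustered eigenvalue.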
Preconditioning
  A generalized solution to mixed-size design
    Equalize cells & macros from the placement perspective
    Resolve physical difference
    Balance gradient magnitude
  Limitation
    Non-convex cost function: Hessian not positive definite
    Negative eigenvalue: motion mis-direction
Macro Legalization
  Difference from most prior works
    Post global placement
      Sufficient logic and physical information
      Good starting point: small design space
    Direct annealing on macro shifting
      cf. annealing on macro sequence / representation
  General framework
    Combined cost function
      Wirelength, cell density, macro overlap
    Incremental cost update
    Dynamic parameter adjustment
      Annealing temperature, motion diameter, penalty factor
Macro Legalization Flow
  Outer loop: adjusting the penalty factor (λ) in the cost function
  Inner loop: adjusting the annealing parameters (temperature T, diameter R)
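The nested loop can be sketched as follows. This is a toy model under stated assumptions: displacement from the starting position stands in for wirelength, cell density is omitted, the cost is recomputed from scratch rather than updated incrementally, and every name and schedule constant (doubling λ, the cooling rates, the iteration counts) is hypothetical.

```python
import math
import random

def overlap_area(a, b):
    """Overlap area of two rectangles given as (x, y, width, height)."""
    dx = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    dy = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    return dx * dy if dx > 0 and dy > 0 else 0.0

def total_overlap(macros):
    return sum(overlap_area(macros[i], macros[j])
               for i in range(len(macros)) for j in range(i + 1, len(macros)))

def legalize(macros, width, height, outer_iters=8, inner_iters=200, seed=0):
    """Annealing on direct macro shifting; macros must fit inside the region."""
    rng = random.Random(seed)
    origin = [(m[0], m[1]) for m in macros]
    macros = [tuple(m) for m in macros]

    def cost(ms, lam):
        displacement = sum(abs(m[0] - o[0]) + abs(m[1] - o[1])
                           for m, o in zip(ms, origin))
        return displacement + lam * total_overlap(ms)

    lam = 1.0
    for _ in range(outer_iters):                 # outer loop: penalty factor
        if total_overlap(macros) == 0.0:
            break
        temperature = 1.0
        radius = min(width, height) / 4.0
        current = cost(macros, lam)
        for _ in range(inner_iters):             # inner loop: T and R
            i = rng.randrange(len(macros))
            x, y, w, h = macros[i]
            nx = min(max(x + rng.uniform(-radius, radius), 0.0), width - w)
            ny = min(max(y + rng.uniform(-radius, radius), 0.0), height - h)
            trial = macros[:i] + [(nx, ny, w, h)] + macros[i + 1:]
            new = cost(trial, lam)
            accept = new <= current or \
                rng.random() < math.exp((current - new) / temperature)
            if accept:
                macros, current = trial, new
            temperature *= 0.98                  # cool down
            radius *= 0.99                       # shrink the motion range
        lam *= 2.0                               # penalize overlap more strongly
    return macros

start = [(1.0, 1.0, 4.0, 4.0), (2.0, 2.0, 4.0, 4.0), (6.0, 6.0, 3.0, 3.0)]
result = legalize(start, width=12.0, height=12.0)
```

Because the macros start from a global placement result, the moves stay local: the radius only shrinks, and the overlap penalty grows from one outer iteration to the next.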
Layout Comparison: adaptec1 [Animation (gif); stage labels: initial placement, cell & macro co-placement, macro legalization, cell-only placement, cell detail placement]
Experiment Setup
  MMS benchmark suite [Yan09]
    GSRC bookshelf format, based on ISPD 2005 & 2006 benchmarks
    Originally fixed macros are freed; terminals are inserted to preserve the ASIC structure
    Up to 2.5M cells & 3.7K movable macros, representative of modern design complexity
    No rotation or flipping of macros (for now)
  Four state-of-the-art placers
    Covering two cutting-edge categories: constructive (floorplan-guided) and one-stage (simultaneous macro & cell placement)

  Category        State-of-the-art mixed-size placers
  constructive    FLOP [Yan09], Capo10.5 [Roy06]
  one-stage       mPL6 [Chan06], NTUplace3-unified [Hsu12]
Quality & CPU Comparison * from published results
Conclusion
  Nonlinear optimization
    Nesterov’s method, Lipschitz prediction
    Steplength backtracking, Hessian preconditioning
  Annealing-based macro legalization
    Direct cell shifting
    Nested framework on parameter adjustment
  Experiments on MMS benchmark suite
    Improved quality and efficiency
Future Work
  Extensibility towards parallel platforms
    GPU architecture and distributed systems
    Parallel FFT for density gradient [Moreland03]
    Parallel wirelength gradient generation [Cong09]
  Extensibility towards other design objectives
    Routability-driven placement [Kim11]
    Timing-driven placement [Chan09]
    3D-IC analytic placement [Cong09b]
References
  [Chan06] T. F. Chan, J. Cong, J. R. Shinnerl, K. Sze, and M. Xie. mPL6: Enhanced Multilevel Mixed-Size Placement. In ISPD, pages 212–214, 2006.
  [Chan09] T. F. Chan, J. Cong, and E. Radke. A Rigorous Framework for Convergent Net Weighting Schemes in Timing-Driven Placement. In ICCAD, pages 288–294, 2009.
  [Chen08] T.-C. Chen et al. NTUPlace3: An Analytical Placer for Large-Scale Mixed-Size Designs with Preplaced Blocks and Density Constraint. IEEE TCAD, 27(7):1228–1240, 2008.
  [Chen08b] T.-C. Chen, P.-H. Yuh, Y.-W. Chang, F.-J. Huang, and D. Liu. MP-trees: A Packing-Based Macro Placement Algorithm for Modern Mixed-Size Designs. IEEE TCAD, 27(9):1621–1634, 2008.
  [Chen08c] H.-C. Chen, Y.-L. Chung, Y.-W. Chang, and Y.-C. Chang. Constraint Graph-Based Macro Placement for Modern Mixed-Size Circuit Designs. In ICCAD, pages 218–223, 2008.
  [Cong09] J. Cong and Y. Zou. Parallel Multi-level Analytical Global Placement on Graphics Processing Unit. In ICCAD, pages 681–688, 2009.
  [Cong09b] J. Cong and G. Luo. A Multilevel Analytical Placement for 3D ICs. In ASPDAC, pages 361–366, 2009.
  [Hsu11] M.-K. Hsu, Y.-W. Chang, and V. Balabanov. TSV-Aware Analytical Placement for 3D IC Designs. In DAC, pages 664–669, 2011.
  [Kahng05] A. B. Kahng, S. Reda, and Q. Wang. Architecture and Details of a High Quality, Large-Scale Analytical Placer. In ICCAD, pages 890–897, 2005.
  [Lu13] J. Lu et al. FFTPL: An Analytic Placement Algorithm Using Fast Fourier Transform for Density Equalization. In ASICON, 2013, to appear.
  [Moreland03] K. Moreland and E. Angel. The FFT on a GPU. In Graphics Hardware, 2003.
References (continued)
  [Naylor01] W. C. Naylor, R. Donelly, and L. Sha. Non-Linear Optimization System and Method for Wire Length and Delay Optimization for an Automatic Electric Circuit Placer. US Patent 6301693, 2001.
  [Nesterov83] Y. E. Nesterov. A Method of Solving a Convex Programming Problem with Convergence Rate O(1/k^2). In Soviet Math, 27(2), 1983.
  [Ooura] General-Purpose FFT Package. http://www.kurims.kyoto-u.ac.jp/~ooura/fft.html.
  [Pan05] M. Pan, N. Viswanathan, and C. Chu. An Efficient and Effective Detailed Placement Algorithm. In ICCAD, pages 48–55, 2005.
  [Roy06] J. A. Roy, S. N. Adya, D. A. Papa, and I. L. Markov. Min-cut Floorplacement. IEEE TCAD, 25(7):1313–1326, 2006.
  [Shewchuk94] J. Shewchuk. An Introduction to the Conjugate Gradient Method without the Agonizing Pain. Technical Report CMU-CS-TR-94-125, Carnegie Mellon University, 1994.
  [Yan09] J. Z. Yan, N. Viswanathan, and C. Chu. Handling Complexities in Modern Large-Scale Mixed-Size Placement. In DAC, pages 436–441, 2009.
  [Yao05] B. Yao, H. Chen, C.-K. Cheng, N.-C. Chou, L.-T. Liu, and P. Suaris. Unified Quadratic Programming Approach for Mixed Mode Placement. In ISPD, pages 193–199, 2005.
Q & A
Thank you