SimPL: An Effective Placement Algorithm
Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov, Dept. of EECS, University of Michigan (ICCAD 2010)

Presentation transcript:

SimPL: An Effective Placement Algorithm
Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov
Dept. of EECS, University of Michigan
ICCAD 2010, Myung-Chul Kim, University of Michigan

Global Placement: Motivation
■ Interconnect lagging in performance while transistors continue scaling
  − Circuit delay, power dissipation and area dominated by interconnect
  − Routing quality highly controlled by placement
■ Circuit size and complexity rapidly increasing
  − Scalable placement algorithm is critical
  − Simplicity, integration with other optimizations
[Figure labels: Unloaded, Coupling, IR drop, RC delay]

Placement Formulation
■ Objective: Minimize estimated wirelength (half-perimeter wirelength, HPWL)
■ Subject to constraints:
  − Legality: Row-based placement with no overlaps
  − Routability: Limiting local interconnect congestion for successful routing
  − Timing: Meeting performance target of a design
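As a rough illustration of the objective on this slide, the sketch below computes half-perimeter wirelength for a toy netlist. It assumes every pin sits at its cell's centre; the names `hpwl`, `nets` and `positions` are hypothetical and not taken from the SimPL code.

```python
from typing import Dict, List, Tuple


def hpwl(nets: List[List[str]], pos: Dict[str, Tuple[float, float]]) -> float:
    """Sum, over all nets, the half-perimeter of the net's bounding box."""
    total = 0.0
    for net in nets:
        xs = [pos[cell][0] for cell in net]
        ys = [pos[cell][1] for cell in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


# Toy instance: three cells, one 3-pin net and one 2-pin net.
positions = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (1.0, 1.0)}
nets = [["a", "b", "c"], ["a", "c"]]
total = hpwl(nets, positions)  # (3 + 4) + (1 + 1)
```

The legality, routability and timing constraints are not modelled here; the function only evaluates the wirelength estimate.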

Prior Work
■ Ideal Placer
  − Fast runtime without sacrificing solution quality
  − Simplicity, integration with other optimizations
[Figure: speed vs. solution quality. Quadratic and force-directed placers: mFAR, Kraftwerk2, FastPlace3. Non-convex optimization placers: mPL6, APlace2, NTUPlace3. A separate marker shows the "Ideal placer".]

Key features of SimPL
■ Flat quadratic placement
■ Primal-dual optimization
  − Closing the gap between upper and lower bounds
[Figure: wirelength vs. iteration, starting from Initial WL Opt. Curves: Upper-Bound Solution by Look-ahead Legalization, Lower-Bound Solution by Linear System Solver. Markers: Final Solution, Final Legal Solution.]

Common Analytical Placement Flow
[Flowchart: Placement Instance → Initial WL Optimization → Global Placement → Converge? (no: repeat Global Placement; yes: continue) → Legalization and Detailed Placement]

SimPL Flow
[Flowchart: Placement Instance → Initial WL Optimization → Global Placement → Legalization and Detailed Placement]
■ Initial WL Optimization
  − B2B Graph Building → Linear System Solver → WL Converge? (no: repeat; yes: continue)
■ Global Placement
  − Look-ahead Legalization (Upper-Bound) → Pseudonet Insertion → B2B Graph Building → Linear System Solver (Lower-Bound) → Converge? (no: repeat; yes: continue)
■ Legalization and Detailed Placement
  − We delegate final legalization and detailed placement to FastPlace-DP [M. Pan, et al., “An Efficient and Effective Detailed Placement Algorithm”, ICCAD 2005]
■ B2B net model: [P. Spindler, et al., “Kraftwerk2 - A Fast Force-Directed Quadratic Placement Approach Using an Accurate Net Model,” TCAD 2008]
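The B2B (Bound2Bound) graph-building step can be pictured with the one-dimensional sketch below. It connects the two boundary pins of a net to each other and every inner pin to both boundary pins. The weight convention used here, 1 / ((P - 1) * distance) for a P-pin net, is an assumption chosen so that the sum of w * d^2 over the edges equals the net's span at the current placement; the slides do not give the formula, and the function name `b2b_edges` and the `eps` guard are hypothetical.

```python
def b2b_edges(xs, eps=1e-9):
    """Return (i, j, weight) edges of a Bound2Bound-style model for one net in 1D.

    xs holds the current x coordinate of each pin of the net.
    """
    p = len(xs)
    if p < 2:
        return []
    lo = min(range(p), key=lambda i: xs[i])
    hi = max(range(p), key=lambda i: xs[i])
    if lo == hi:  # all pins coincide: pick any second pin as the other bound
        hi = 1 if lo == 0 else 0
    pairs = [(lo, hi)]
    for k in range(p):
        if k != lo and k != hi:
            pairs.append((k, lo))
            pairs.append((k, hi))
    # eps avoids division by zero when two pins share a coordinate
    return [(i, j, 1.0 / ((p - 1) * max(abs(xs[i] - xs[j]), eps)))
            for i, j in pairs]


xs = [0.0, 1.0, 4.0, 10.0]
edges = b2b_edges(xs)
# With these weights the quadratic cost reproduces the net span (10.0 here).
quadratic_cost = sum(w * (xs[i] - xs[j]) ** 2 for i, j, w in edges)
```

Because the weights depend on the current pin positions, the graph has to be rebuilt before each linear solve, which is consistent with "B2B Graph Building" appearing inside both loops of the flow.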

SimPL: Look-ahead Legalization
■ Purpose: Produce an almost-legal placement (Upper-Bound) while preserving the relative cell ordering given by the linear system solver (Lower-Bound)
■ Identify target region
  − Find an overflowed bin b
  − Create a minimal bin cluster B around b that is wide enough
■ Perform geometric top-down partitioning
  − Find the cell-area median (C_c) and the whitespace median (C_B)
  − Assign cells (C_c) to corresponding partitions (C_B)
■ Non-linear scaling
  − Form stripe regions
  − Move cells across stripe regions in order, based on whitespace
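One partitioning step from this slide can be sketched in one dimension as follows. The cell-area median splits the x-sorted cells into two groups of roughly equal area, and the whitespace median is the cut position that splits the free area of the bin cluster in half. The data layout (tuples of `(x, area)` and `(x_left, x_right, free_area)`) and both function names are assumptions for illustration only, and the bin list is assumed to be non-empty.

```python
def split_at_area_median(cells):
    """cells: list of (x, area). Returns (left, right) with about half the area each."""
    order = sorted(cells, key=lambda c: c[0])
    half = sum(area for _, area in order) / 2.0
    acc = 0.0
    for k, (_, area) in enumerate(order):
        if acc + area > half:
            return order[:k], order[k:]
        acc += area
    return order, []


def whitespace_median(bins):
    """bins: non-empty list of (x_left, x_right, free_area), ordered left to right.

    Returns the x position with half of the free area on each side,
    interpolating linearly inside the bin that contains the median.
    """
    half = sum(free for _, _, free in bins) / 2.0
    acc = 0.0
    for x_left, x_right, free in bins:
        if free > 0 and acc + free >= half:
            return x_left + (x_right - x_left) * (half - acc) / free
        acc += free
    return bins[-1][1]


# Four overlapping cells bunched near x = 12; the middle bin is fully blocked.
cells = [(12.0, 2.0), (11.0, 2.0), (14.0, 2.0), (13.0, 2.0)]
bins = [(0.0, 10.0, 4.0), (10.0, 20.0, 0.0), (20.0, 30.0, 12.0)]
left, right = split_at_area_median(cells)
cut = whitespace_median(bins)
# Cells in `left` go to the sub-region left of `cut`, cells in `right` to the other side.
```

Applying the same step recursively to each sub-region, alternating between x and y, would give a top-down partitioning; that recursion is omitted here.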

SimPL: Look-ahead Legalization (1)
Performing geometric top-down partitioning
[Figure labels: Overfilled bin, Bin cluster (B), Cell-area median (C_c), Whitespace median (C_B), B_0, B_1]

SimPL: Look-ahead Legalization (2)
[Figure labels: Cell-area median (C_c), Whitespace median (C_B), B_0]

SimPL: Look-ahead Legalization (2)
[Figure labels: C_B, Obstacle borders, Uniform cutlines, Cell Ordering, Per-stripe Linear Scaling]
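The order-preserving spreading suggested by the "Cell Ordering" and "Per-stripe Linear Scaling" labels can be approximated in one dimension as below. Each cell, taken in x order, is moved to the position where the cumulative free area of the stripes matches the cell's cumulative share of total cell area, with linear interpolation inside each stripe. This is a simplified reading of the slide rather than the authors' exact procedure, and `spread_in_order` and its argument layout are hypothetical.

```python
def spread_in_order(cells, stripes):
    """cells: list of (x, area) with positive areas.
    stripes: list of (x_left, x_right, free_area), ordered left to right.

    Returns new x coordinates in the input order of `cells`; the relative
    left-to-right ordering of the cells is preserved.
    """
    total_area = sum(area for _, area in cells)
    total_free = sum(free for _, _, free in stripes)
    order = sorted(range(len(cells)), key=lambda i: cells[i][0])
    new_x = [0.0] * len(cells)
    acc_area = 0.0
    for i in order:
        area = cells[i][1]
        # Free-area coordinate of the cell centre, scaled to the total whitespace.
        target = (acc_area + area / 2.0) / total_area * total_free
        acc_area += area
        acc_free = 0.0
        for x_left, x_right, free in stripes:
            if free > 0 and acc_free + free >= target:
                # Linear scaling inside this stripe.
                new_x[i] = x_left + (x_right - x_left) * (target - acc_free) / free
                break
            acc_free += free
    return new_x


# Four unit-area cells bunched near x = 5; the middle stripe is an obstacle.
cells = [(5.0, 1.0), (5.1, 1.0), (5.2, 1.0), (5.3, 1.0)]
stripes = [(0.0, 10.0, 2.0), (10.0, 20.0, 0.0), (20.0, 30.0, 2.0)]
new_x = spread_in_order(cells, stripes)
```

In this toy run no cell lands inside the blocked stripe, and the cells keep their original left-to-right order.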

SimPL: Look-ahead Legalization (3)
■ Example (adaptec1)
  − Look-ahead legalization stops when target regions become small enough

SimPL: Using legal locations as anchors
■ Purpose: Gradually perturb the linear system to generate lower-bound solutions with less overlap
■ Anchors and Pseudonets
  − Look-ahead locations used as fixed, zero-area anchors
  − Anchors and original cells connected with 2-pin pseudonets
  − Pseudonet weights grow linearly with iterations
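The effect of anchors and pseudonets can be seen in a small one-dimensional quadratic placement. The sketch below minimises the sum of w * (x_i - x_j)^2 over nets plus alpha * (x_i - anchor_i)^2 over pseudonets by solving the resulting linear system. The constant 0.5 in the weight schedule, the fixed anchor positions and the tiny dense solver are all assumptions for illustration; in the flow described in the talk the anchors would come from look-ahead legalization at each iteration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def place_with_anchors(n, nets, fixed, anchors, alpha):
    """nets: (i, j, w) between movable cells; fixed: (i, pad_x, w) to fixed pads.
    anchors[i] is the anchor position of movable cell i; alpha is the pseudonet weight."""
    A = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for i, j, w in nets:
        A[i][i] += w
        A[j][j] += w
        A[i][j] -= w
        A[j][i] -= w
    for i, pad_x, w in fixed:
        A[i][i] += w
        b[i] += w * pad_x
    for i, anchor_x in enumerate(anchors):  # 2-pin pseudonet: cell i <-> its anchor
        A[i][i] += alpha
        b[i] += alpha * anchor_x
    return solve(A, b)


# Two cells that collapse onto x = 5 without anchors; anchors sit at 3 and 7.
anchors = [3.0, 7.0]
history = []
for iteration in range(5):
    alpha = 0.5 * iteration  # weight grows linearly with the iteration count
    xs = place_with_anchors(2, [(0, 1, 1.0)], [(0, 5.0, 1.0), (1, 5.0, 1.0)], anchors, alpha)
    history.append((alpha, xs))
```

With alpha = 0 both cells sit at x = 5; as alpha grows the solution is pulled apart towards the anchors, which is the tug-of-war between low wirelength and legalized positions.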

Next illustration: Tug-of-war between low-wirelength and legalized placements

SimPL Iterations on Adaptec1 (1)
[Figure panels: Iteration=0 (Init WL Opt.), Iteration=1 (Upper Bound), Iteration=2 (Lower Bound), Iteration=3 (Upper Bound)]

SimPL Iterations on Adaptec1 (2)
[Figure panels: Iteration=10 (Lower Bound), Iteration=11 (Upper Bound), Iteration=20 (Lower Bound), Iteration=21 (Upper Bound)]

SimPL Iterations on Adaptec1 (3)
[Figure panels: Iteration=30 (Lower Bound), Iteration=31 (Upper Bound), Iteration=40 (Lower Bound), Iteration=41 (Upper Bound)]

Convergence of SimPL
■ Legal solution is formed between two bounds

Empirical Results: ISPD05 Benchmarks
■ Experimental setup
  − Single-threaded runs on a 3.2GHz Intel Core i7 Quad CPU Q660 Linux workstation
  − HPWL is computed by GSRC Bookshelf Evaluator
  − < 5000 lines of code in C++, including CG-based solver for sparse linear systems with Jacobi preconditioner
[Slide annotation: Improvements after ICCAD submission]

Empirical Results: Scalability Study
■ Take an existing design (ISPD 2005) and split each movable cell into two cells of smaller size
  − Each connection to the original cell is inherited by one of two split cells, which are connected by a 2-pin net
[Slide annotation: Not in ICCAD paper]
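The benchmark-doubling transformation described on this slide can be sketched as below. Two details are assumptions that the slide does not specify: each cell is split into equal halves, and the original connections are handed to the two halves alternately. The function name `split_cells` and the `_a` / `_b` suffixes are hypothetical.

```python
def split_cells(cells, nets):
    """cells: {name: area}; nets: list of lists of cell names.

    Returns (new_cells, new_nets) in which every cell is replaced by two halves
    joined by a 2-pin net, and each original pin goes to exactly one half.
    """
    new_cells = {}
    for name, area in cells.items():
        new_cells[name + "_a"] = area / 2.0  # equal split: an assumption
        new_cells[name + "_b"] = area / 2.0
    pins_seen = {name: 0 for name in cells}
    new_nets = []
    for net in nets:
        new_net = []
        for name in net:
            # Alternate between the halves (assumed assignment rule).
            suffix = "_a" if pins_seen[name] % 2 == 0 else "_b"
            pins_seen[name] += 1
            new_net.append(name + suffix)
        new_nets.append(new_net)
    for name in cells:
        new_nets.append([name + "_a", name + "_b"])  # 2-pin net joining the halves
    return new_cells, new_nets


cells = {"u": 4.0, "v": 2.0}
nets = [["u", "v"], ["u", "v"]]
new_cells, new_nets = split_cells(cells, nets)
```

The transformed instance has twice as many movable cells and one extra net per original cell, while total cell area is unchanged.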

Parallelism in Conjugate Gradient Solver
■ Runtime bottleneck in SimPL: Conjugate gradient linear system solver
■ Coarse-grain row partitioning
  − Implemented using OpenMP 3.0 compiler intrinsic
■ SSE2 (Streaming SIMD Extensions) instructions
  − Process 4 data elements with a single instruction
  − Marginal runtime improvement in SpMxV (sparse matrix-vector multiplication)
■ Reducing memory bandwidth demand of SpMxV
  − CSR (Compressed Sparse Row) format
Y. Saad, “Iterative Methods for Sparse Linear Systems,” SIAM
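To make the solver ingredients concrete, here is a compact, single-threaded sketch of a Jacobi-preconditioned conjugate gradient solver whose only matrix operation is a CSR matrix-vector product. It is written in plain Python for readability, not speed, and is not the authors' C++ implementation; the function names, tolerance and iteration limit are assumptions. The matrix is assumed symmetric positive definite with a stored positive diagonal.

```python
def csr_matvec(indptr, indices, data, x):
    """y = A x for A in CSR form. Each row is independent of the others,
    which is what makes coarse-grain row partitioning across threads possible."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        s = 0.0
        for k in range(indptr[row], indptr[row + 1]):
            s += data[k] * x[indices[k]]
        y[row] = s
    return y


def jacobi_pcg(indptr, indices, data, b, tol=1e-10, max_iter=1000):
    """Conjugate gradient with a Jacobi (diagonal) preconditioner."""
    n = len(b)
    diag = [0.0] * n
    for row in range(n):
        for k in range(indptr[row], indptr[row + 1]):
            if indices[k] == row:
                diag[row] = data[k]
    x = [0.0] * n
    r = list(b)                               # residual b - A x for x = 0
    z = [r[i] / diag[i] for i in range(n)]    # preconditioned residual
    p = list(z)
    rz = sum(r[i] * z[i] for i in range(n))
    for _ in range(max_iter):
        if sum(v * v for v in r) ** 0.5 < tol:
            break
        Ap = csr_matvec(indptr, indices, data, p)
        alpha = rz / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [r[i] / diag[i] for i in range(n)]
        rz_new = sum(r[i] * z[i] for i in range(n))
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x


# A = [[4, -1, 0], [-1, 4, -1], [0, -1, 4]] stored in CSR form.
indptr = [0, 2, 5, 7]
indices = [0, 1, 0, 1, 2, 1, 2]
data = [4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0]
b = [2.0, 4.0, 10.0]
x = jacobi_pcg(indptr, indices, data, b)  # close to [1, 2, 3]
```

CSR stores only the nonzero values plus one column index per nonzero and one offset per row, which is why it reduces the memory traffic of the matrix-vector product compared with a dense layout.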

On-going Research
■ Integration with physical synthesis
  − Look-ahead placement offers opportunity for early estimation of circuit parameters
    – Timing look-ahead
    – Congestion look-ahead
    – Power-density look-ahead
  − Improving the speed and quality of physical synthesis
■ Parallel look-ahead legalization
  − Run independently in separate sub-regions

Conclusions
■ New flat quadratic placement algorithm: SimPL
  − Novel primal-dual approach
  − Amenable to integration with physical synthesis
■ Self-contained, compact implementation
  − Fastest among available academic placers
  − Highly competitive solution quality

Questions and Answers
Thank you! Time for Questions