Physical Synthesis Comes of Age
Chuck Alpert, IBM Corp.
Chris Chu, Iowa State University
Paul Villarrubia, IBM Corp.


2 Physical Synthesis Family Tree
[Figure: family tree with Synthesis and Layout as the parents of Physical Synthesis]
Roles of layout as a parent:
- Clean up the mess created by physical synthesis (implement the netlist generated by physical synthesis)
- Provide guidance to physical synthesis so that it will do things right
Is layout mature enough to serve the role?
Is there still room for layout to grow?

3 New Requirements of Placement
1. Super fast
   - 4 to 8 million objects now
   - Provide quick feedback to physical synthesis to refine the netlist
2. Stable in handling incremental placement
   - Physical synthesis constantly makes changes to the netlist
3. Flexible objective function
   - Timing, power, routability
4. Handle mixed-size modules
   - Hierarchical design and use of IP blocks are common

4 Placement As a Baby
Simulated annealing based placement
- Popularized by TimberWolf [DAC-86]
Greedy algorithm: "You only have 1 chance. If you get stuck, I will terminate you!"
Simulated annealing: "OK to make mistakes. Keep trying!"
Evaluation/feedback is important.
Strength:
- Good quality for small designs
- Easy to consider different objective functions
- Handles incremental changes well
Weakness:
- Very slow (crawling)
- Non-trivial to handle modules of different sizes
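The contrast between greedy search and annealing can be made concrete with a toy example. The following is a minimal sketch, not TimberWolf itself: it assumes a one-dimensional row of unit-size cells, a swap-only move set, and a geometric cooling schedule. The names `hpwl_1d` and `anneal` and all parameter values are hypothetical choices for illustration.

```python
import math
import random

def hpwl_1d(order, nets):
    """Sum of net spans when each cell sits at the slot given by its index in `order`."""
    pos = {cell: slot for slot, cell in enumerate(order)}
    return sum(max(pos[c] for c in net) - min(pos[c] for c in net) for net in nets)

def anneal(cells, nets, t_start=5.0, t_end=0.01, cooling=0.95, moves_per_t=200, seed=0):
    rng = random.Random(seed)
    order = list(cells)
    cost = hpwl_1d(order, nets)
    best, best_cost = order[:], cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_t):
            i, j = rng.sample(range(len(order)), 2)
            order[i], order[j] = order[j], order[i]      # propose a swap
            new_cost = hpwl_1d(order, nets)
            delta = new_cost - cost
            # Improvements are always accepted; uphill moves are accepted
            # with probability exp(-delta / t), which shrinks as t cools.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost = new_cost
                if cost < best_cost:
                    best, best_cost = order[:], cost
            else:
                order[i], order[j] = order[j], order[i]  # reject: undo the swap
        t *= cooling
    return best, best_cost

cells = list("abcdefgh")
nets = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"),
        ("e", "f"), ("f", "g"), ("g", "h")]
start = cells[:]
random.Random(1).shuffle(start)
best, best_cost = anneal(start, nets)
print(hpwl_1d(start, nets), "->", best_cost)
```

A greedy variant would keep only the `delta <= 0` branch. Because the cost function is an ordinary Python callable, swapping in a timing or congestion term is easy, which matches the "easy to consider different objective functions" strength; re-evaluating the cost on every move also hints at why the approach is slow on large designs.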

5 Placement As a Kid
Min-cut placement (or partitioning-based placement)
- An old idea [Breuer, DAC-77]
- Capo [DAC-00] leverages the breakthrough in partitioning using multilevel techniques (e.g., hMetis [DAC-97], MLFM [DAC-97])
- Dragon [ICCAD-00] combines hierarchical partitioning with annealing
Strength:
- Efficient and scalable
- Very good wirelength, but can we do better?
Weakness:
- More difficult to handle other objectives
- Not stable in handling incremental changes
- Not good at white space management
[Figure labels: Circuit, Placement Region]

6 White Space in Min-Cut Placement
[Figure: two placements of adaptec2. Capo (min-cut): HPWL = 9955. APlace (analytical): HPWL = 8715. Courtesy: IBM]
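HPWL (half-perimeter wirelength) is the metric quoted for both placers. As a small illustration, it can be computed as follows; the placement and nets are made-up data, and `hpwl` and `total_hpwl` are hypothetical helper names.

```python
def hpwl(pins):
    """Half-perimeter of the bounding box of a net's pin locations."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def total_hpwl(placement, nets):
    """Sum of per-net HPWL; `placement` maps a cell name to its (x, y) location."""
    return sum(hpwl([placement[cell] for cell in net]) for net in nets)

placement = {"a": (0, 0), "b": (3, 4), "c": (1, 2)}
nets = [("a", "b"), ("a", "b", "c")]
print(total_hpwl(placement, nets))  # 7 + 7 = 14
```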

7 Placement Maturing
Analytical placement
- Used by 4 of the top 5 placers in the ISPD-05 Placement Contest and the top 5 placers in the ISPD-06 Placement Contest
Strength:
- Fastest and scalable
- Best wirelength
- Robust framework to incorporate different objectives and constraints
- Stable in handling incremental changes
- Good at white space management
Why would analytical placement work so well?
- Can see the big picture
Why was it not popular in the past?
- Hard to spread modules evenly in the placement region

8 Attempt Still Relying on Partitioning
Gordian: Global Optimization and Rectangle Dissection [TCAD-91]
- Artificial center-of-mass constraints disturb the globally optimal solution too drastically
[Figure label: Centers of mass]

9 Another Partitioning-based Spreading
Quadratic optimization with quadrisection [Vygen, DAC-97]
[Figure. Courtesy: IBM]

10 Spreading by Density-based Force
Kraftwerk [DAC-98]
- Quadratic wirelength minimization: [equation not preserved in transcript]
- Spread cells by additional forces: [equation not preserved in transcript]
- Density-based force pushes cells away from dense regions toward sparse regions
Great idea:
- Spread cells smoothly
- Very good wirelength
But not too fast:
- Constant force, hard to control convergence
- Density-based force expensive to compute
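For readers unfamiliar with quadratic placement: the usual formulation (assumed here, since the slide's equations did not survive) minimizes (1/2) x^T Q x + d^T x, which leads to the linear system Q x + d = 0, and force-directed spreading adds a force vector e so that Q x + d + e = 0. The sketch below solves only the plain wirelength part in one dimension with Gauss-Seidel sweeps. The function name, the two-pin net format, and the example netlist are illustrative assumptions, and every movable cell is assumed to have at least one net.

```python
def quadratic_place_1d(movable, fixed, nets, sweeps=500):
    """Minimise the sum of w * (x_i - x_j)^2 over two-pin nets (a, b, w).

    movable: list of cell names; fixed: dict mapping pad name -> x coordinate.
    """
    x = dict(fixed)
    x.update({c: 0.0 for c in movable})
    neighbours = {c: [] for c in movable}
    for a, b, w in nets:
        if a in neighbours:
            neighbours[a].append((b, w))
        if b in neighbours:
            neighbours[b].append((a, w))
    for _ in range(sweeps):
        for c in movable:
            total_w = sum(w for _, w in neighbours[c])
            # Zero derivative w.r.t. x[c] puts c at the weighted mean of its neighbours.
            x[c] = sum(w * x[n] for n, w in neighbours[c]) / total_w
    return {c: x[c] for c in movable}

pos = quadratic_place_1d(
    movable=["a", "b", "c"],
    fixed={"left_pad": 0.0, "right_pad": 4.0},
    nets=[("left_pad", "a", 1.0), ("a", "b", 1.0),
          ("b", "c", 1.0), ("c", "right_pad", 1.0)],
)
print(pos)  # a, b, c settle at 1, 2, 3
```

With nothing but wirelength in the objective, cells that share the same connectivity land on top of each other, which is why a spreading force is needed at all.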

11 Dramatic Speedup
FastPlace [ISPD-04]
  repeat
    Solve quadratic program to minimize wirelength
    Spread the cells
  until cell distribution is roughly even
  Reduce wirelength by iterative heuristic
Hybrid Net Model (for solving the quadratic program)
- Speeds up solving of QP
Cell Shifting (for spreading the cells)
- Simple technique to compute spreading force
- Fast convergence due to the use of pseudo-nets [Hu et al., ISPD-02] instead of constant force
Iterative Local Refinement (for the wirelength-reduction heuristic)
- More efficient than using QP to refine the solution
- Minimizes wirelength based on a linear objective
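The repeat/until loop and the pseudo-net idea can be sketched in one dimension. This is a simplified stand-in and not FastPlace's actual cell shifting: the spreading step here just keeps the left-to-right order and aims each cell at an evenly spaced slot, the pseudo-net weight doubles every round, and the final refinement step is omitted. All names (`solve_qp`, `max_bin_occupancy`, `place`) and the netlist are hypothetical, and every cell is assumed to have at least one net.

```python
def solve_qp(cells, fixed, nets, pseudo, sweeps=200):
    """Gauss-Seidel solve of the 1D quadratic wirelength objective.

    cells: dict name -> starting x; nets: list of (a, b, weight);
    pseudo: dict name -> (target_x, weight), i.e. a pseudo-net to a fixed point.
    """
    x = dict(fixed)
    x.update(cells)
    for _ in range(sweeps):
        for c in cells:
            num = den = 0.0
            for a, b, w in nets:
                if c == a or c == b:
                    other = b if c == a else a
                    num += w * x[other]
                    den += w
            if c in pseudo:
                target, w = pseudo[c]
                num += w * target
                den += w
            x[c] = num / den
    return {c: x[c] for c in cells}

def max_bin_occupancy(pos, width):
    counts = {}
    for v in pos.values():
        b = int(v // width)
        counts[b] = counts.get(b, 0) + 1
    return max(counts.values())

def place(cell_names, fixed, nets, region=8.0, capacity=1, max_iters=50):
    width = region / len(cell_names)
    pos = solve_qp({c: region / 2 for c in cell_names}, fixed, nets, pseudo={})
    weight = 1.0
    for it in range(max_iters):
        if max_bin_occupancy(pos, width) <= capacity:   # "roughly even"
            return pos, it
        # Stand-in spreading: preserve order, target the centre of slot k.
        ranked = sorted(cell_names, key=lambda c: (pos[c], c))
        pseudo = {c: ((k + 0.5) * width, weight) for k, c in enumerate(ranked)}
        pos = solve_qp(pos, fixed, nets, pseudo)
        weight *= 2.0                                   # pull harder each round
    return pos, max_iters

names = list("abcdefgh")
fixed = {"L": 0.0, "R": 8.0}
nets = [(c, "L", 1.0) for c in names] + [(c, "R", 1.0) for c in names]
nets += [(names[i], names[i + 1], 1.0) for i in range(len(names) - 1)]
pos, iters = place(names, fixed, nets)
print(iters, sorted(round(v, 2) for v in pos.values()))
```

Because the pseudo-net enters the system as one more quadratic term, each round is still a plain QP solve, which is the property the slide credits for fast convergence compared with a constant force.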

12 Linearization of Quadratic Wirelength
New Kraftwerk [ICCAD-06]
BoundingBox net model for multi-pin nets:
- Needs to know the outermost pins of a net
- Accurately models HPWL
- Faster and uses less memory than the clique model
Two fundamental components of spreading force:
- Hold force: constant force
- Move force: enforced by a pseudo-net to a fixed point
[Figure: BoundingBox net model vs. clique net model]
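The claim that a bounding-box model "accurately models HPWL" can be checked numerically. The sketch below connects every pin to the two outermost pins and picks weights inversely proportional to the current connection length. The constant 2 / ((P - 1) * length), paired with a 1/2 factor in the cost, is an assumed convention rather than something stated on the slide, and the code expects at least two pins with distinct extreme coordinates.

```python
def b2b_connections(xs, eps=1e-9):
    """Two-pin connections (i, j, weight) for one net, given pin x-coordinates."""
    p = len(xs)
    lo = min(range(p), key=lambda i: xs[i])   # leftmost pin
    hi = max(range(p), key=lambda i: xs[i])   # rightmost pin
    pairs = [(lo, hi)]
    pairs += [(i, b) for i in range(p) if i not in (lo, hi) for b in (lo, hi)]
    return [(i, j, 2.0 / ((p - 1) * max(abs(xs[i] - xs[j]), eps)))
            for i, j in pairs]

def quadratic_cost(xs, conns):
    return sum(0.5 * w * (xs[i] - xs[j]) ** 2 for i, j, w in conns)

xs = [1.0, 7.0, 3.0, 4.5]
conns = b2b_connections(xs)
print(len(conns), quadratic_cost(xs, conns), max(xs) - min(xs))
```

For a P-pin net this yields 2P - 3 connections, against P(P - 1)/2 for a clique, which is where the speed and memory advantage comes from on high-degree nets. The weights depend on the current positions, so they have to be recomputed before each solve.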

13 Relaxation Rather than Linearization
RQL [DAC-07]
- Applies Force Vector Modulation to the FastPlace framework
- Currently fastest and best wirelength
Force vector modulation:
- Rank modules based on the spreading force magnitude
- Nullify the spreading force for the top 5-10% of modules
[Figure: spreading force magnitude vs. module index]
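The rank-and-nullify step is simple enough to sketch directly. This is only an illustration of the rule as stated on the slide; the function name, the dictionary-based force representation, and the sample data are assumptions.

```python
def modulate_forces(forces, nullify_fraction=0.05):
    """Zero the spreading force of the modules with the largest force magnitude.

    forces: dict mapping module name -> (fx, fy).
    """
    ranked = sorted(forces,
                    key=lambda m: forces[m][0] ** 2 + forces[m][1] ** 2,
                    reverse=True)
    n_null = round(len(ranked) * nullify_fraction)
    nullified = set(ranked[:n_null])
    return {m: (0.0, 0.0) if m in nullified else f for m, f in forces.items()}

forces = {f"m{i}": (float(i), 0.0) for i in range(20)}
out = modulate_forces(forces, nullify_fraction=0.10)
print([m for m, f in out.items() if f == (0.0, 0.0) and forces[m] != (0.0, 0.0)])
```

Here the two modules with the largest forces (`m19` and `m18`) are released, so a few outliers no longer drag the quadratic solution away from a good wirelength.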

14 An Alternative Analytical Approach
APlace [ISPD-04], mPL5 [ISPD-05], NTUPlace3 [ICCAD-06]
- Log-sum-exponential function to approximate HPWL [Naylor et al., US Patent 2001]
- Density constraint is directly formulated into the objective function
- Very competitive wirelength and runtime

                     APlace                   NTUP3                    mPL6                     RQL
Wirelength model     Log-sum-exponential      Log-sum-exponential      Log-sum-exponential      Quadratic
Spreading force      Density potential based  Density potential based  Density potential based  Fixed-point based
                     (bell-shaped)            (bell-shaped)            (Poisson smoothed)
Objective function   Non-linear & non-convex  Non-linear & non-convex  Non-linear & non-convex  Quadratic
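The log-sum-exponential idea replaces the non-differentiable max and min in HPWL with smooth approximations: max(x) is approximated by gamma * log(sum(exp(x_i / gamma))), and min(x) by the negated expression on -x. The sketch below shows the one-dimensional case; `gamma` is the smoothing parameter, and the function name and test data are illustrative.

```python
import math

def lse_wirelength_1d(xs, gamma):
    """Smooth approximation of max(xs) - min(xs)."""
    hi, lo = max(xs), min(xs)
    # Shifting by the true max/min before exponentiating avoids overflow
    # and does not change the value of the expression.
    smooth_max = hi + gamma * math.log(sum(math.exp((x - hi) / gamma) for x in xs))
    smooth_min = lo - gamma * math.log(sum(math.exp((lo - x) / gamma) for x in xs))
    return smooth_max - smooth_min

xs = [1.0, 7.0, 3.0, 4.5]
for gamma in (2.0, 0.5, 0.1):
    print(gamma, lse_wirelength_1d(xs, gamma))
```

The approximation never falls below the exact span (6.0 here) and tightens as `gamma` shrinks, at the price of a less smooth function for the non-linear solver.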

15 Placement: Getting Old or Still Young?
- Better approach than quadratic / analytical approach?
- Massive parallelism to speed up placement
- Better clustering technique
- Macro placement / floorplanning
- True timing-driven placement

16 Sufficient Parental Guidance?
All physical synthesis gets from placement is distance info
Physical synthesis has a distorted world view!
- Wirelength estimation is inaccurate (especially for nets with high pin count)
- Congestion estimation is inaccurate
- Area estimation is inaccurate
  - Without buffering and gate sizing
- Timing estimation is very inaccurate
[Figure: three panels, each with sources S0-S3 and sinks T0-T3: Routing of a Bus, A Simple Solution, Probabilistic Estimation]

17 Routing-Driven Physical Synthesis
Need a more integrated approach
- Past: Placement-Driven Physical Synthesis
- Future: Routing-Driven Physical Synthesis
Main obstacle: runtime
Two possibilities:
1. Construct Steiner trees to guide synthesis and placement
2. Perform global routing to guide synthesis and placement

18 Fast Steiner Tree Construction
FLUTE (Fast LookUp Table Estimation) [ICCAD 04, ISPD 05]
- An extremely fast and accurate rectilinear Steiner tree algorithm
- Very suitable for VLSI applications: optimal up to degree 9, very accurate up to degree 100
[Figure: comparison of RMST, RSTT, SPAN, BGA, BI1S and FLUTE over all 1.57 million nets in 18 IBM circuits [ISPD 98]]
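FLUTE depends on precomputed lookup tables that are not reproduced in these slides, so no attempt is made to sketch it here. What can be shown is the RMST baseline from the comparison (a rectilinear minimum spanning tree, built below with Prim's algorithm) next to HPWL; the optimal rectilinear Steiner tree length always lies between the two. Function names and pin coordinates are illustrative.

```python
def manhattan(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def rmst_length(pins):
    """Length of a rectilinear minimum spanning tree via Prim's algorithm."""
    # dist[i] is the shortest Manhattan distance from pin i to the tree so far.
    dist = {i: manhattan(pins[i], pins[0]) for i in range(1, len(pins))}
    total = 0
    while dist:
        nxt = min(dist, key=dist.get)
        total += dist.pop(nxt)
        for i in dist:
            d = manhattan(pins[i], pins[nxt])
            if d < dist[i]:
                dist[i] = d
    return total

def hpwl(pins):
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

pins = [(0, 0), (2, 0), (1, 2)]
print(hpwl(pins), rmst_length(pins))  # 4 and 5
```

For this three-pin net a Steiner point at (1, 0) gives a tree of length 4, equal to HPWL, while the spanning tree needs 5. Gaps of this kind between a spanning tree and a true Steiner tree are what a Steiner tree algorithm closes.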

19 Is Steiner Tree Sufficient?
- Steiner trees do not consider detours due to routing congestion or buffering congestion
- Can we predict the impact of congestion on routing?
- There is no way for generic estimators to accurately estimate the congestion of arbitrary global routers!
[Table: #cong and #match for Labyrinth (70%), Labyrinth (50%) and Chi Dispersion on the ibm benchmark circuits; values not preserved in transcript]
[Figure: "match" is the overlap between congestion by router 1 and congestion by router 2]

20 Traditional Global Routing
Simultaneous approach (e.g., ILP)
- Very slow
Sequential approach
- Net-by-net routing, rip-up and reroute
- Maze routing for a net: Lee's, Dijkstra's, A*-search algorithms
- Reasonably fast
- Reasonably good quality
Is it good enough to handle the demand of physical synthesis?
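As a reminder of what maze routing for a single net involves, here is a minimal sketch of Lee's algorithm: a breadth-first wave expansion on a grid followed by a backtrace. The grid encoding (0 for free, 1 for blocked) and the function name are assumptions for illustration; a production router would work on a capacity-annotated routing graph instead.

```python
from collections import deque

def lee_route(grid, src, dst):
    """Shortest path between two cells on a 0/1 grid, or None if unreachable."""
    rows, cols = len(grid), len(grid[0])
    prev = {src: None}                 # doubles as the visited set
    queue = deque([src])
    while queue:
        cell = queue.popleft()
        if cell == dst:                # backtrace from target to source
            path = []
            while cell is not None:
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in prev):
                prev[(nr, nc)] = cell
                queue.append((nr, nc))
    return None

grid = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
path = lee_route(grid, (0, 0), (2, 0))
print(path)
```

The wave may visit every grid cell for each net, which is the runtime cost behind the question of whether this is fast enough for physical synthesis.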

21 Progress in Global Routing
Pattern routing [Kastner et al., ICCAD-00]
- L-shaped, Z-shaped routes
- Faster
Better cost functions for maze routing [Hadsell & Madden, DAC-03; Pan & Chu, ICCAD-06]
- Reduce overflow significantly
Congestion-driven Steiner tree construction [Pan & Chu, ICCAD-06]
- Much faster because of much less reliance on maze routing
Negotiated congestion by PathFinder [FPGA-95]
- Used by BoxRouter [ICCAD-07], FGR [ICCAD-07], Archer [ICCAD-07]
- Excellent routing ability
- Very slow because it takes a long time to build congestion history
Wanted: techniques that are both fast and high quality
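Pattern routing is fast because it evaluates only a handful of candidate shapes instead of searching the grid. A minimal sketch for the L-shaped case follows; it simplifies by tracking usage per grid cell rather than per routing edge, and the function names and congestion cost (the sum of current usage along the route) are illustrative assumptions.

```python
def l_routes(src, dst):
    """The two L-shaped routes between two grid cells, each as a list of cells."""
    (r0, c0), (r1, c1) = src, dst

    def span(a, b):
        step = 1 if b >= a else -1
        return list(range(a, b + step, step))

    horiz_first = [(r0, c) for c in span(c0, c1)] + [(r, c1) for r in span(r0, r1)[1:]]
    vert_first = [(r, c0) for r in span(r0, r1)] + [(r1, c) for c in span(c0, c1)[1:]]
    return horiz_first, vert_first

def pattern_route(usage, src, dst):
    """Pick the less congested L-shape and record its usage."""
    best = min(l_routes(src, dst),
               key=lambda route: sum(usage.get(cell, 0) for cell in route))
    for cell in best:
        usage[cell] = usage.get(cell, 0) + 1
    return best

usage = {}
first = pattern_route(usage, (0, 0), (2, 3))
second = pattern_route(usage, (0, 0), (2, 3))
print(first)
print(second)
```

The second net avoids the cells taken by the first and bends the other way. Only two candidates are scored per net, compared with a wave expansion over the whole grid in maze routing.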

22 What Should We Do Next?
Integration of global routing into placement
- An initial attempt: IPR [DAC-07]
  - Integration of FastPlace, FastDP, FLUTE and FastRoute
  - Significantly improves routability and wirelength in good runtime
Incorporate buffering and gate sizing into integrated placement and routing
- Much more accurate timing information
- Should also help congestion and placement density control
Integration with logic synthesis
In other words, we need:
- Better basic algorithms: placement, Steiner tree, global routing, buffering, gate sizing, etc.
- Clever ways of integration
It is an (EDA) family problem. Let's work together!

Thank You