Task 1091.001: Highly Scalable Placement by Multilevel Optimization

Presentation transcript:

Task 1091.001: Highly Scalable Placement by Multilevel Optimization
Task Leaders: Jason Cong (UCLA CS) and Tony Chan (UCLA Math)
Students with Graduation Dates:
- Michalis Romesis (UCLA CS, March graduated)
- Kenton Sze (UCLA Math, July graduated)
- Min Xie (UCLA CS, September graduated)
- Guojie Luo (UCLA CS, September 2010)
Research Staff: Joe Shinnerl, UCLA CS

UCLA VLSICAD LAB
Industrial Liaisons
- Patrick McGuinness, Freescale Semiconductor, Inc.
- Natesan Venkateswaran, IBM Corporation
- Amit Chowdhary, Intel Corporation

Task Description and Anticipated Result
- Highly scalable multilevel, multiheuristic placement algorithms that address the critical placement needs of nanometer designs:
  - scalability
  - multi-constraint optimization: timing, routability, power, manufacturability, etc.
  - support of mixed-size placement and incremental design
- Quantitative study of the optimality and scalability of placement algorithms
  - Construction of synthetic benchmarks with known optima to identify the deficiencies of existing methods
- Our goal is to achieve a one-process-generation benefit through innovation in physical-design technologies, especially placement.

Task Deliverables
- Report on new placement benchmarks with known optimal or near-optimal solutions for all major objectives and constraints. Scalability and optimization studies on existing placement techniques (Completed 3-Nov-2003)
- Experiments and reports on the applicability of integrated AMG-based weighted aggregation and weighted interpolation. Improvement measured on both PEKO examples and industrial examples from SRC member companies (Completed 1-Jun-2004)
- Experiments and reports on multiheuristic, multilevel relaxation and the scalable incorporation of complex constraints into the enhanced multilevel framework. Improvement measured on both PEKO and industrial examples (Completed 1-Jun-2005)
- A highly scalable placement tool that (i) supports multi-constraint optimization, mixed-size placement, and incremental design and (ii) produces best-of-class results for both PEKO and industrial examples from SRC member companies (Completed 1-Jun-2006)
- Final report summarizing research accomplishments and future direction (Planned 31-Oct-2006)

Accomplishments in the Past Year
1. Improvements in mPL for routing density control [Best quality, ISPD 2006 contest]
2. Thermal-Driven Placement
3. Heterogeneous Placement

A Brief History of mPL
[Figure: relative wirelength by year, covering uniform and non-uniform cell sizes]
- mPL 1.0 [ICCAD00]: ESC clustering, Goto relaxation
- mPL 1.1: FC clustering, partitioning added to legalization
- mPL 2.0: RDFL relaxation, primal-dual netlist pruning
- mPL 3.0 [ICCAD03]: QRS relaxation, AMG interpolation, multiple V cycles
- mPL 4.0: improved DP, backtracking V cycle
- mPL 5.0: multilevel force-directed, mixed-size capability
- mPL 6.0: enhanced routability handling

mPL: Generalized Force-Directed Placement
- Use of accurate objective functions [Bertsekas, 82, Naylor et al, 01]
- Optimization-based bin-density constraint formulation
- Iterative Uzawa solver
- Multilevel for better runtime and wirelength
[Equation residue: one term of the formulation is labeled "a generalized force"]
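The Uzawa solver named on this slide alternates a primal minimisation with a Lagrange-multiplier update. The sketch below shows that pattern on a toy problem only: a quadratic pull toward anchor positions stands in for wirelength, and a single linear equality constraint stands in for the bin-density constraints. The function name `uzawa`, its parameters, and the toy objective are assumptions for illustration and are not taken from mPL.

```python
def uzawa(anchor, coeff, rhs, step=0.25, iters=200):
    """Toy Uzawa iteration (hypothetical helper, not mPL code).

    Minimises 0.5 * sum((x_i - anchor_i)^2) subject to sum(coeff_i * x_i) = rhs.
    """
    lam = 0.0
    x = list(anchor)
    for _ in range(iters):
        # Primal step: minimise the Lagrangian over x for the current multiplier.
        # For this quadratic objective the minimiser is available in closed form.
        x = [a - lam * c for a, c in zip(anchor, coeff)]
        # Dual step: move the multiplier along the constraint violation.
        violation = sum(c * xi for c, xi in zip(coeff, x)) - rhs
        lam += step * violation
    return x, lam


if __name__ == "__main__":
    x, lam = uzawa(anchor=[0.0, 4.0], coeff=[1.0, 1.0], rhs=2.0)
    print(x, lam)
```

For this toy problem the multiplier error shrinks by a factor of `1 - step * sum(c*c)` per iteration, so the step size must be small enough to keep that factor below one in magnitude.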

Accomplishments in the Past Year
1. Improvements in mPL for routing density control [Best quality, ISPD 2006 contest]
2. Thermal-Driven Placement
3. Heterogeneous Placement

Core Engine for Density Control
- Overall scheme
  - One V cycle with comparable quality
  - Minimum perturbation in the last stages of GFD
  - Significant speedup without losing solution quality
- Routing density handling
  - Residual density in each bin
  - Even distribution of dummy density into bins
  - Cell area inflation for better convergence
[Figure: V cycle from the initial finest problem through coarsening to the coarsest problem, then interpolation back to the final placement; labels: GFD with density control, minimum perturbation]
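The density bullets above presuppose some per-bin density measure. As a rough illustration only, the sketch below accumulates the fraction of each bin covered by rectangular cells on a uniform grid and reports the overflow above a target density. The cell representation, the grid, and the names `bin_density` and `overflow` are assumptions; the slides do not describe how mPL computes residual or dummy density.

```python
def bin_density(cells, width, height, nx, ny):
    """Fraction of each bin's area covered by cells (hypothetical helper).

    cells: iterable of (x, y, w, h) with (x, y) the lower-left corner.
    Returns a ny-by-nx list of lists, indexed [row][col].
    """
    bw, bh = width / nx, height / ny
    density = [[0.0] * nx for _ in range(ny)]
    for x, y, w, h in cells:
        for row in range(ny):
            # Vertical overlap between the cell and this row of bins.
            oy = min(y + h, (row + 1) * bh) - max(y, row * bh)
            if oy <= 0:
                continue
            for col in range(nx):
                # Horizontal overlap between the cell and this column of bins.
                ox = min(x + w, (col + 1) * bw) - max(x, col * bw)
                if ox > 0:
                    density[row][col] += ox * oy / (bw * bh)
    return density


def overflow(density, target):
    """Per-bin excess of density over the target value (zero where under target)."""
    return [[max(0.0, d - target) for d in row] for row in density]


if __name__ == "__main__":
    cells = [(0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 2.0, 2.0)]
    d = bin_density(cells, width=4.0, height=4.0, nx=2, ny=2)
    print(d, overflow(d, target=1.0))
```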

Macro Spreading
- Need area density below target value [Nam, ISPD06]
- Target distance between neighboring macros
  - [symbol lost in extraction]: target density
- Spreading represented as objective
  - $dx_i$ and $dy_i$: perturbation
  - $fx_{ij}$ and $fy_{ij}$: piecewise-linear functions
[Figure: two macros with areas $A_1$, $A_2$ and widths $w_1$, $w_2$ in a region labeled $W$, $H$, $w$; plot of $f_{ij}$ against $x$ with $H_{ij}$ marked]

Experiment Results on ISPD06
- mPL6 produces the best solution quality under the ISPD06 routability-driven metric

Demonstration of mPL

Accomplishments in the Past Year
1. Improvements in mPL core engine for mixed-size global placement
2. Thermal-Driven Placement
3. Heterogeneous Placement

Motivation
- High power density due to technology scaling
- Problems caused by high temperature
  - Hot spots become more harmful
    - Higher temperature → higher leakage power → more heat
  - Previously negligible effects become first-order effects
    - Difficult estimation for power, timing, etc.

Thermal Model
- One-layer mesh to model the substrate
  - $\sum_j (T_i - T_j)\, C_{xy} + (T_i - T_{sink})\, C_z = P_i$
    - $C_{xy}$, $C_z$ are the thermal conductances for the substrate and the heat sink
  - Solved by fast DCT
    - Solve $T$ from $CT = P$, given $C$ and $P$
    - Diagonalize $C = \Gamma^{T} \Lambda \Gamma$
      - $\Gamma$ is the discrete cosine matrix
      - $\Lambda$ is a diagonal matrix
    - $T = \Gamma^{-1} \Lambda^{-1} \Gamma P$
[Figure: mesh node $T_i$ with four neighbors $T_{j,1}$ to $T_{j,4}$ coupled through $C_{xy}$, heat sink $T_{sink}$ coupled through $C_z$, and power input $P$]
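The DCT-based solve on this slide can be sketched in a few lines of NumPy. The sketch assumes a rectangular mesh whose boundary cells simply have fewer lateral neighbours (insulated edges), which is the case in which the discrete cosine transform (type II) diagonalizes the mesh operator; the slides do not state the boundary treatment. It works with the temperature rise above $T_{sink}$, and it uses dense DCT matrices for clarity where a fast transform such as `scipy.fft.dctn` would be used in practice. The names `dct_matrix`, `solve_temperature`, and `balance_residual` are hypothetical.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II matrix; its rows are the cosine basis vectors."""
    k = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    g = np.sqrt(2.0 / n) * np.cos(np.pi * k * (2 * j + 1) / (2 * n))
    g[0, :] = np.sqrt(1.0 / n)
    return g


def solve_temperature(power, c_xy, c_z, t_sink):
    """Solve the per-cell heat balance for T on an ny-by-nx mesh (c_z > 0)."""
    ny, nx = power.shape
    gy, gx = dct_matrix(ny), dct_matrix(nx)
    # Eigenvalues of the 1-D mesh Laplacian with insulated ends.
    lam_y = 2.0 - 2.0 * np.cos(np.pi * np.arange(ny) / ny)
    lam_x = 2.0 - 2.0 * np.cos(np.pi * np.arange(nx) / nx)
    eig = c_xy * (lam_y[:, None] + lam_x[None, :]) + c_z
    p_hat = gy @ power @ gx.T          # forward transform of P
    rise = gy.T @ (p_hat / eig) @ gx   # scale by 1/eigenvalue, transform back
    return t_sink + rise


def balance_residual(temp, power, c_xy, c_z, t_sink):
    """Largest violation of sum_j (T_i - T_j) C_xy + (T_i - T_sink) C_z = P_i."""
    padded = np.pad(temp, 1, mode="edge")  # a missing neighbour contributes zero flow
    lateral = (4.0 * temp - padded[:-2, 1:-1] - padded[2:, 1:-1]
               - padded[1:-1, :-2] - padded[1:-1, 2:])
    return float(np.max(np.abs(c_xy * lateral + c_z * (temp - t_sink) - power)))


if __name__ == "__main__":
    p = np.zeros((4, 6))
    p[1, 2] = 1.0  # a single hot cell
    t = solve_temperature(p, c_xy=2.0, c_z=0.5, t_sink=25.0)
    print(t.max(), balance_residual(t, p, 2.0, 0.5, 25.0))
```

Checking the result against the heat-balance equation directly, as `balance_residual` does, is a cheap way to confirm that the transform and eigenvalues match the assumed boundary condition.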

Formulation & Solution
- [Formulation equation not recovered]
  - Implement [symbol lost in extraction]$_i(x)$ and $t_i(x)$ with filler cells and "filler power" without area
  - $T_{des}$ is given by the user
- Solved by the Uzawa algorithm
- Applied as an additional thermal-aware GFD pass following a WL-driven V cycle

Experiment Results on IBM-FastPlace
- Quality improvement
  - $T_{even}$ is the ideal temperature with the same total power
  - Max. on-chip temperature:
    - $T_{init}$ after Step 1
    - $T_{final} = T_{des}$ after Step
- More than 90% quality improvement within 5% WL increase

Accomplishments in the Past Year
1. Improvements in mPL for routing density control [1st quality, ISPD 2006 contest]
2. Thermal-Driven Placement
3. Heterogeneous Placement

Motivation
- Need for placement on array-type chips with pre-fabricated resources
  - FPGA
  - Structured ASIC
- Need for heterogeneous capability
  - Memory, DSP, etc.
  - Block on sites of the same type

Related Work
- Academia
  - VPR [Betz & Rose 97], PATH [Kong 02], SPCD [Chen & Cong 04,05], PPFF [Maidee et al, 03], CAPRI [Gopalakrishnan et al, 06]
  - Most comparisons to out-dated tools
  - No heterogeneous capability
- Industry
  - Quartus II [Altera Corp.], ISE [Xilinx Inc.]
  - Proprietary chips only
  - Techniques not publicly documented

Heterogeneous Placement by mPL-H
- First analytical placer for heterogeneous placement
- Framework based on mPL6 [Chan et al, 05]
- Multiple layered placement
  - One logical layer for each resource
  - Forbidden regions blocked by obstacles
  - Uniform wirelength computation
- Filler cells on each layer
[Figure: layers labeled DSP, M-RAM, LAB]

Demonstration of mPL-H

Experiment Setting
[Figure: experimental flow; labels: Verilog netlist, Quartus_map, clustered .vqm netlist, Quartus_fitter, mPL-H, chip type (Stratix), description .xml, .qsf placement, Quartus_router]

Wirelength Comparison
- WL still important for architecture evaluation
- mPL-H is 3% better in HPWL and 2% better in routed WL than Quartus II v5.0
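For readers unfamiliar with the HPWL metric quoted above, half-perimeter wirelength sums, over all nets, the width plus the height of the bounding box of each net's pins. A minimal sketch, assuming each pin sits at its cell's coordinates; the data layout and the function name `hpwl` are illustrative and are not mPL-H's or Quartus II's interface.

```python
def hpwl(nets, positions):
    """Half-perimeter wirelength of a placement (hypothetical helper).

    nets: iterable of nets, each a list of cell names.
    positions: dict mapping cell name to an (x, y) pair.
    """
    total = 0.0
    for pins in nets:
        xs = [positions[c][0] for c in pins]
        ys = [positions[c][1] for c in pins]
        # Bounding-box width plus height for this net.
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


if __name__ == "__main__":
    positions = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (1.0, 1.0)}
    nets = [["a", "b"], ["a", "b", "c"], ["c"]]
    print(hpwl(nets, positions))
```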

Runtime Comparison
- mPL-H can be 2X faster than Quartus II v5.0 when the circuit becomes sufficiently large

Overall Accomplishments Over the Funding Period
- 34% reduction in WL over 3 years
- One technology generation advancement

Technology Transfer in 2006
- Discussions at conferences and workshops
  - ASPDAC 2006, Yokohama, Japan
  - ISPD 2006, San Jose, USA
  - DAC 2006, San Francisco, USA
- Benchmark releases (PEKO-MS)
- mPL release:

Software Download Record
- PEKO/PEKU [2002 – now]
  - More than 360 downloads
    - SRC member companies: Cadence, IBM, Intel, Mentor Graphics, etc.
    - Non-SRC member companies: Synopsys, Magma, Monterey Design, etc.
    - Universities: CMU, Michigan, MIT, UC Berkeley, UCSD, etc.
- mPL [2001 – now]
  - More than 480 downloads
    - SRC member companies: Cadence, Intel, Mentor Graphics, etc.
    - Non-SRC member companies: Synopsys, Magma, Intrinsity, Oasys, etc.
    - Universities: CMU, Michigan, Stanford, UCSD, Nat'l Taiwan U., etc.

Publications in 2006
- Conference papers
  - ASPDAC 2006: J. Cong, M. Xie, "A Robust Detailed Placement for Mixed-size IC Designs."
  - ISPD 2006: T. F. Chan, J. Cong, J. Shinnerl, K. Sze and M. Xie, "mPL6: Enhanced Multilevel Mixed-size Placement."
- Theses
  - Kenton Sze, "Multilevel Optimization for VLSI Circuit Placement."
  - Min Xie, "Constraint-Driven Large Scale Circuit Placement Algorithms."

Room for Further Improvement?
- "Swirls" are difficult to correct with localized refinement
[Figure: placements labeled mPL4 and mPL5]