Benchmarking for Large-Scale Placement and Beyond S. N. Adya, M. C. Yildiz, I. L. Markov, P. G. Villarrubia, P. N. Parakh, P. H. Madden.


Outline
- Motivation
  - Why does the industry need benchmarking?
- Available benchmarks and placement tools
- Performance results
- Unresolved issues
  - Benchmarking for routability
  - Benchmarking for timing-driven placement
- Public placement utilities
- Lessons learned + beyond placement

A True Story About Benchmarking
- An undergraduate student implements an optimal B&B block packer,
- finds min areas possible for apte & xerox,
- compares to published results,
- finds an ISPD 2001 paper that reports:
  - Floorplan areas smaller than optimal
  - In two cases, areas smaller than Σ block areas
- More true stories in our ISPD 2003 paper
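
The anomaly in this story can be caught by a very cheap sanity check: no legal floorplan can have an area below the sum of its block areas, or below a proven optimum. The following is a minimal Python sketch of such a check. The `Block` type, the function names and the tolerance are assumptions made for illustration and are not taken from the talk or from any Bookshelf utility.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """A rectangular block (hypothetical record type for this sketch)."""
    name: str
    width: float
    height: float


def total_block_area(blocks):
    """Sum of block areas: a trivial lower bound on any floorplan area."""
    return sum(b.width * b.height for b in blocks)


def is_plausible_floorplan_area(reported_area, blocks, known_optimum=None, tol=1e-9):
    """Return False if a reported area is below a known lower bound.

    The bound is the sum of block areas, tightened by a proven optimum
    (e.g. from a branch-and-bound packer) when one is supplied.
    """
    lower_bound = total_block_area(blocks)
    if known_optimum is not None:
        lower_bound = max(lower_bound, known_optimum)
    return reported_area >= lower_bound - tol
```

For example, with blocks of 2x3 and 4x1 the bound is 10 area units, so a reported floorplan area of 9.5 would be flagged as impossible.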

Industrial Benchmarking
- Growing size & complexity of VLSI chips
- Design objectives
  - Wirelength / congestion / timing / power / yield
- Design constraints
  - Fixed die / routability / FP constraints / fixed IPs / cell orientations / pin access / signal integrity / …
- Can the same algo excel in all contexts?
- Layout sophistication motivates open benchmarking for placement

Whitespace Handling
- Modern ASICs are laid out in fixed-die context
  - Layout area, routing tracks, power lines, etc are fixed before placement
  - Area minimization is irrelevant (area is fixed)
- New phenomenon: whitespace
  - Row utilization % = density % = 100% - whitespace %
- How does one distribute whitespace?
  - Pack all cells to the left [Feng Shui, mPL]
    - All whitespace is on the right
    - Typical for variable-die placers
  - Distribute uniformly [Capo, Kraftwerk]
  - Allocate whitespace to congested regions [Dragon]
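
To make the utilization formula and the first two whitespace policies concrete, here is a small single-row sketch in Python. It is only an illustration of the definitions on this slide, not the algorithm used by Feng Shui, mPL, Capo or Kraftwerk; the function names and the equal-gap rule for the uniform case are assumptions.

```python
def row_utilization(cell_widths, row_length):
    """Fraction of the row occupied by cells (density = 1 - whitespace)."""
    return sum(cell_widths) / row_length


def pack_left(cell_widths, row_start=0.0):
    """Left-packed x coordinates: all whitespace ends up on the right."""
    xs, x = [], row_start
    for w in cell_widths:
        xs.append(x)
        x += w
    return xs


def distribute_uniformly(cell_widths, row_length, row_start=0.0):
    """Spread whitespace as equal gaps before, between and after cells."""
    free = row_length - sum(cell_widths)
    if free < 0:
        raise ValueError("cells do not fit in the row")
    gap = free / (len(cell_widths) + 1)
    xs, x = [], row_start + gap
    for w in cell_widths:
        xs.append(x)
        x += w + gap
    return xs
```

For cells of widths 2, 3 and 1 in a row of length 12, utilization is 50% and whitespace is 50%; left packing puts the cells at x = 0, 2, 5, while the uniform policy puts them at x = 1.5, 5.0, 9.5.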

Design Types
- ASICs
  - Lots of fixed I/Os, few macros, millions of standard cells
  - Placement densities: 40-80% (IBM)
  - Flat and hierarchical designs
- SoCs
  - Many more macro blocks, cores
  - Datapaths + control logic
  - Can have very low placement densities: < 20%
- Micro-Processor (μP) Random Logic Macros (RLM)
  - Hierarchical partitions are placement instances (5-30K)
  - High placement densities: 80%-98% (low whitespace)
  - Many fixed I/Os, relatively few standard cells
  - Recall “Partitioning w Terminals” DAC`99, ISPD `99, ASPDAC`00

ASICs
- Many fixed ports: perimeter- and area-array
- A handful (1-20) of large, fixed macros
- 100's to 1000's of fixed smaller cells
- Some designs are hierarchical
  - E.g., have floorplan constraints
  - Functional / logic hierarchy  physical hierarchy
- Many designs are flat
  - Up to 2M placeable objects (for flat designs)

Cores
- Computational and DSP cores are commonly included in SoCs
- Mix of standard-cell and semi-custom style
  - Datapaths, structured components
  - Some control logic

μP RLMs
- Manual floorplanning, hierarchies
- Standard-cell Place & Route instances are small (5K to 30K placeable objects)
  - Std. cells sometimes occupy a single row
  - Almost no whitespace
- Large ratio of fixed ports to movable cells (relative to ASIC parts)
  - Most cells are movable, but not always
- Recall “Partitioning w Terminals” DAC`99, ASPDAC`00

IBM PowerPC 601 chip

Intel Centrino chip

Requirements for Placers (1)
- Must handle 4-10M cells, 1000s macros
  - 64 bits + near-linear asymptotic complexity
  - Scalable/compact design database (OpenAccess)
- Accept fixed ports/pads/pins + fixed cells
- Place macros, esp. with var. aspect ratios
  - Non-trivial heights and widths (e.g., height = 2 rows)
- Honor targets and limits for net length
- Respect floorplan constraints
- Handle a wide range of placement densities (from <25% to 100% occupied), ICCAD `02

Requirements for Placers (2)
- Add / delete filler cells and Nwell contacts
- Ignore clock connections
- ECO placement
  - Fix overlaps after logic restructuring
  - Place a small number of unplaced blocks
- Datapath planning services
  - E.g., for cores
- Provide placement dialog services to enable cooperation across tools
  - E.g., between placement and synthesis

Why Worry About Benchmarking?
- Variety of conflicting objectives
- Multitude of layout features / constraints
- No single algorithm finds best placements for all design problems (yet?)
- Need independent evaluation
- Need a set of common placement BM’s with features of interest (e.g., IBM-Floorplacement)
- Need to know / understand how algorithms behave over the entire design space

Available Placement BM’s
- MCNC
  - Small and outdated (routing channels between rows, etc)
- IBM-Place / IBM-Dragon (ste 1 & 2) - UCLA (ICCAD `00)
  - Derived from ISPD98-IBM partitioning suite. Macros removed.
- IBM Floor-placement – Michigan (ISPD ‘02)
  - Derived from same IBM circuits. Nothing removed.
- PEKO – UCLA (DAC ‘95, ASPDAC ‘03, ISPD ‘03)
  - Artificial netlists with known optimal wirelength; up to 2M cells
  - No global wires
- Standardized grids – Michigan
  - Created to model data-paths during placement
  - Easy to visualize, optimal placements are obvious
- Vertical benchmarks - CMU
  - Multiple representations (PicoJava, Piperench, CMUDSP)
  - Have some timing info, but not enough to evaluate timing

Academic Placers We Used
- Kraftwerk Nov 2002 (no major changes since DAC98)
  - Eisenmann and Johannes (TU Munich)
  - Force-directed (analytical) placer
- Capo 8.5 / 8.6 (Apr / Nov 2002)
  - Adya, Caldwell, Kahng and Markov (UCLA and Michigan)
  - Recursive min-cut bisection (built-in partitioner MLPart)
- Dragon 2.20 / 2.23 (Sept / Feb 2003)
  - Choi, Sarrafzadeh, Yang and Wang (Northwestern and UCLA)
  - Min-cut multi-way partitioning (hMetis) & simulated annealing
- FengShui 1.2 / 1.6 / 2.0 (Fall 2000 / Feb 2003)
  - Madden and Yildiz (SUNY Binghamton)
  - Recursive min-cut multi-way partitioning (hMetis + built-in)
- mPL 1.2 / 1.2b (Nov 2002 / Feb 2003)
  - Chan, Cong, Shinnerl and Sze (UCLA)
  - Multi-level enumeration-based placer

Features Supported by Placers

Performance on Available BM’s
- Our objectives and goals
  - Perform first-ever comprehensive evaluation
  - Seek trends and anomalies
  - Evaluate robustness of different placers
  - One does not expect a clear winner
- Minor obstacles and potential pitfalls
  - Not all placers are open-source / public
  - Not all placers support the Bookshelf format
    - Most do
  - Must be careful with converters (!)

PEKO BMs (ASPDAC 03)

Cadence-Capo BMs (DAC 2000)
- I – failure to read input; a – abort
- oc – out-of-core cells; / - in variable-die mode
- Feng Shui – similar to Dragon, better on test1

Results : Grids Unique optimal solution

Relative Performance
- Feng Shui 1.6 / 2.0 improves upon FS 1.2

Placers Do Well on Benchmarks Published By the Same Group
- Observe that
  - Capo does well on Cadence-Capo
  - Dragon does well on IBM-Place (IBM-Dragon)
  - Not in the table: FengShui does well on MCNC
  - mPL does well on PEKO
- This is hardly a coincidence
- Motivation for more / better benchmarks

Benchmarking for Routability of Placements
- Placer tuning also explains routability results
  - Dragon performs well on the IBM-Dragon suite
  - Capo performs well on the Cadence-Capo suite
  - Routability on one set does not guarantee much
- Need accurate / common routability metrics
  - … and shared implementations (binaries, source code)
- Related benchmarking issues
  - No good public benchmarks for routing!
  - Routability may conflict with timing / power optimizations

Simple Congestion Metrics
- Horizontal vs. Vertical wirelength
  - HPWL = WL_H + WL_V
  - Two placements with same HPWL may have very different WL_H and WL_V
  - Think of preferred-direction routing & odd #layers
- Probabilistic congestion maps
  - Bhatia et al – DAC `02
  - Lou et al - ISPD `00, TCAD `01
  - Carothers & Kusnadi – ISPD `99
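
The horizontal/vertical split of half-perimeter wirelength is easy to compute from pin coordinates. The sketch below is an illustrative Python version, assuming each net is given as a list of (x, y) pin positions; the function names are hypothetical and this is not the code of the Bookshelf wirelength calculator.

```python
def net_bounding_box_lengths(pins):
    """Width and height of the bounding box of one net's pins."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return max(xs) - min(xs), max(ys) - min(ys)


def hpwl_components(nets):
    """Return (WL_H, WL_V, HPWL) summed over all nets.

    Nets with fewer than two pins contribute nothing.
    """
    wl_h = 0.0
    wl_v = 0.0
    for pins in nets:
        if len(pins) < 2:
            continue
        dx, dy = net_bounding_box_lengths(pins)
        wl_h += dx
        wl_v += dy
    return wl_h, wl_v, wl_h + wl_v
```

A two-pin net from (0, 0) to (10, 0) and another from (0, 0) to (5, 5) both have HPWL = 10, but the first uses only horizontal wirelength while the second splits it evenly, which is the situation the slide warns about.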

Horizontal vs. Vertical WL

Probabilistic Congestion Maps

Metric: Run a Router
- Global or Global + detail?
  - Local effects (design rules, cell libraries) may affect results too much
  - “noise” in global placement (for 2M cells)?
- Open-source or Industrial?
- Tunable? Easy to integrate?
- Saves global routing information?
- Publicly available routers
  - Labyrinth from UCLA
  - Force-directed router from UCB

Placement Utilities
- Accept input in the GSRC Bookshelf format
- Format converters
  - LEF/DEF → Bookshelf
  - Bookshelf → Kraftwerk
  - BLIF(SIS) → Bookshelf
- Evaluators, checkers, postprocessors and plotters
- Contributions in these categories are esp. welcome

Placement Utilities (cont’d)
- Wirelength Calculator (HPWL)
  - Independent evaluation of placement results
- Placement Plotter
  - Saves gnuplot scripts (→ .eps, .gif, …)
  - Multiple views (cells only, cells+nets, rows, …)
  - Used earlier in this presentation
- Probabilistic Congestion Maps (Lou et al.)
  - Gnuplot scripts
  - Matlab scripts
    - better graphics, including 3-d fly-by views
  - .xpm files (→ .gif, .jpg, .eps, …)

Placement Utilities (cont’d)
- Legality checker
- Simple legalizer
- Layout Generator
  - Given a netlist, creates a row structure
  - Tunable %whitespace, aspect ratio, etc
- All available in binaries/PERL at
  - Most source codes are shipped w Capo
- Your contributions are welcome
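
As an illustration of what a legality checker verifies, here is a minimal Python sketch for single-row-height standard cells. It is not the Bookshelf checker itself: the `Cell` and `Row` records, the integer coordinates and the message strings are assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """A placement row (hypothetical record): bottom y and x extent."""
    y: int
    x_min: int
    x_max: int


@dataclass(frozen=True)
class Cell:
    """A single-row-height cell (hypothetical record), lower-left at (x, y)."""
    name: str
    x: int
    y: int
    width: int


def find_violations(cells, rows):
    """List legality violations: off-row, out-of-row and overlapping cells.

    Overlaps are checked between x-sorted neighbours in each row, which
    detects whether any overlap exists without enumerating every pair.
    """
    rows_by_y = {r.y: r for r in rows}
    per_row = defaultdict(list)
    violations = []
    for c in cells:
        row = rows_by_y.get(c.y)
        if row is None:
            violations.append(f"{c.name}: not aligned to a row")
            continue
        if c.x < row.x_min or c.x + c.width > row.x_max:
            violations.append(f"{c.name}: outside row extent")
        per_row[c.y].append(c)
    for row_cells in per_row.values():
        row_cells.sort(key=lambda c: c.x)
        for left, right in zip(row_cells, row_cells[1:]):
            if left.x + left.width > right.x:
                violations.append(f"{left.name}/{right.name}: overlap")
    return violations
```

An empty result means the placement is legal under these simplified rules; a real checker would also need to handle macros, multi-row cells, site alignment and orientation.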

Challenges for Evaluating Timing-Driven Optimizations
- QOR not defined clearly
  - Max path-length? Worst set-up slack?
  - With false paths or without?...
- Evaluation methods are not replicable (often shady)
  - Questionable delay models, technology params
  - Net topology generators (MST, single-trunk Steiner trees)
  - Inconsistent results: path delays < Σ gate delays
- Public benchmarks?...
  - Anecdote: TD-place benchmarks in Verilog (ISPD `01)
  - Companies guard netlists, technology parameters
  - Cell libraries; area constraints

Metrics for Timing + Reporting
- STA non-trivial: use PrimeTime or PKS
- Distinguish between optimization and evaluation
  - Evaluate setup-slack using commercial tools
  - Optimize individual nets and/or paths
    - E.g., net-length versus allocated budgets
- Report all relevant data
  - How was the total wirelength affected?
  - Were per-net and per-path optimizations successful?
  - Did that improve worst slack or did something else?
- Huge slack improvements reported in some 1990s papers, but wire delays were much smaller than gate delays
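
To show what "report all relevant data" might look like in practice, the following Python sketch summarizes worst setup slack, total negative slack (TNS) and total wirelength for one run. It assumes per-endpoint slacks have already been produced by an STA tool; the function names and the report layout are hypothetical, and nothing here reproduces the PrimeTime or PKS interfaces.

```python
def setup_slack(required_time, arrival_time):
    """Setup slack of one endpoint: negative means the path is too slow."""
    return required_time - arrival_time


def worst_slack(slacks):
    """Worst (most negative) slack over all endpoints."""
    return min(slacks)


def total_negative_slack(slacks):
    """TNS: sum of the slacks of all violating endpoints."""
    return sum(s for s in slacks if s < 0)


def timing_report(label, slacks, total_wirelength):
    """Collect the figures that should be reported together for one run."""
    return {
        "label": label,
        "worst_slack": worst_slack(slacks),
        "tns": total_negative_slack(slacks),
        "wirelength": total_wirelength,
    }
```

Producing the same report before and after an optimization makes it visible whether a slack gain came with a wirelength penalty, which is one of the questions raised on this slide.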

Impact of Physical Synthesis
- Local circuit tweaks improve worst slack
- How do global placement changes affect slack, when followed by sizing, buffering…?

[Table: # Inst and Slack (TNS) at the Initial, Sized and Buffered stages. Surviving entries: (-10223) -5.08 (-9955) D (-5497); (-8086) -5.26 (-5287) D (-2370); (-4049) (-3910) D (-3684); (-508) -2.17 (-512) D (-21); (-7126) -5.16 (-1568) D (-1266)]

Correlated Non-timing Metrics?
- If you cannot solve a hard problem, reduce it to a simpler problem
- Validate your reduction!
  - E.g., show that slack correlates with ??...
- Delay budgeting and net-length limits
  - Before placement, for the whole chip
  - Or in the context of incremental re-placement
- Do some placement algorithms lead to smaller circuit delays? (w/o timing info!)
  - Recall: quadratic net lengths versus linear
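
One simple way to validate such a reduction is to measure how well a candidate non-timing metric tracks slack across a set of placements. The sketch below computes a plain Pearson correlation coefficient in Python; the choice of Pearson correlation and the idea of using, say, a net-length figure as the proxy are assumptions for illustration, since the slide deliberately leaves the metric open.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)
```

Here `xs` would hold the proxy metric for each placement and `ys` the worst slack measured by a trusted STA flow on the same placements. A coefficient near -1 or +1 suggests the proxy may be a usable predictor, while a value near 0 suggests the reduction does not hold.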

Benchmarking Needs for Timing Opt.
- A common, reusable STA methodology
  - PrimeTime or PKS
  - High-quality, open-source infrastructure (funding?)
- Metrics validated against phys. synthesis
  - The simpler the better, but must be good predictors
- Benchmarks with sufficient info
  - Flat gate-level netlists
  - Library information (< 250nm)
  - Realistic timing & area constraints

Beyond Placement (Lessons)
- Evaluation methods for BMs must be explicit
  - Prevent user errors (no TD-place BMs in Verilog)
  - Try to use open-source evaluators to verify results
- Visualization is important (sanity checks)
- Regression-testing after bugfixes is important
- Need more open-source tools
  - Complete descriptions of algos lower barriers to entry
- Need benchmarks with more information
  - Use artificial benchmarks with care
  - Huge gaps in benchmarking for routers

Beyond Placement (cont’d)
- Need common evaluators of delay / power
  - To avoid inconsistent results
- Relevant initiatives from Si2
  - OLA (Open Library Architecture)
  - OpenAccess
  - For more info, see
- Still: no reliable public STA tool
  - Sought: OA-based utilities for timing/layout

Acknowledgements
- Funding: GSRC (MARCO, SIA, DARPA)
- Funding: IBM (2x)
- Equipment grants: Intel (2x) and IBM
- Thanks for help and comments
  - Frank Johannes (TU Munich)
  - Jason Cong, Joe Shinnerl, Min Xie (UCLA)
  - Andrew Kahng (UCSD)
  - Xiaojian Yang (Synplicity)