Benchmarking for [Physical] Synthesis Igor Markov and Prabhakar Kudva The Univ. of Michigan / IBM.

Presentation transcript:


In This Talk …
- Benchmarking vs. benchmarks
- Benchmarking exposes new research questions
- Why industry should care about benchmarking
- What is (and is not) being done to improve benchmarking infrastructure
- Not in this talk, but in a focus group:
  - Incentives for verifying published work
  - How to accelerate a culture change

Benchmarking
- Design benchmarks
  - Data model / representation; instances
  - Objectives (QOR metrics) and constraints
- Algorithms, methodologies; implementations
- Solvers: ditto
- Empirical and theoretical analyses, e.g.,
  - Hard vs. easy benchmarks (regardless of size)
  - Correlation between different objectives
  - Upper / lower bounds for QOR, statistical behavior, etc.
- Dualism between benchmarks and solvers
- For more details, see
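One of the empirical analyses listed above, correlation between different objectives, can be sketched in a few lines. This is a minimal illustration only: it assumes two QOR metrics have already been measured on the same set of benchmarks, and the function name `pearson` and the sample numbers are hypothetical.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long metric lists.

    Assumes both lists have nonzero variance (no guard for constant input).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)


# Hypothetical per-benchmark measurements of two objectives.
wirelength = [1.0, 2.0, 3.0, 4.0]
congestion = [2.1, 3.9, 6.2, 8.0]
print(round(pearson(wirelength, congestion), 3))
```

A coefficient near 1 would suggest that optimizing one objective tends to help the other on these instances; a value near 0 would suggest the two need to be evaluated separately.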

Industrial Benchmarking
- Growing size & complexity of VLSI chips
- Design objectives
  - Area / power / yield / etc.
- Design constraints
  - Timing / FP + fixed-die partitions / fixed IPs / routability / pin access / signal integrity…
- Can the same algorithm excel in all contexts?
- Sophistication of layout and logic motivates open benchmarking for Synthesis and P&R

Design Types
- ASICs
  - Lots of fixed I/Os, few macros, millions of standard cells
  - Design densities: 40-80% (IBM)
  - Flat and hierarchical designs
- SoCs
  - Many more macro blocks, cores
  - Datapaths + control logic
  - Can have very low design densities: < 20%
- Micro-Processor (µP) Random Logic Macros (RLM)
  - Hierarchical partitions are LS+P&R instances (5-30K)
  - High placement densities: 80%-98% (low whitespace)
  - Many fixed I/Os, relatively few standard cells
  - Note: “Partitioning w Terminals” DAC '99, ISPD '99, ASPDAC '00

Why Invest in Benchmarking
- Academia
  - Benchmarks can identify / capture new research problems
  - Empirical validation of novel research
  - Open-source tools/BMs can be analyzed and tweaked
- Industry
  - Evaluation and transfer of academic research
  - Support for executive decisions (which tools are relatively weak & must be improved)
  - Open-source tools/BMs can be analyzed and tweaked
- When is an EDA problem (not) solved?
  - Are there good solver implementations?
  - Can they “solve” existing benchmarks?

Participation / Leadership Necessary
- Activity 1: Benchmarking platform / flows
- Activity 2: Establishing common evaluators
  - Static timing analysis
  - Congestion / yield prediction
  - Power estimation
- Activity 3: Standard-cell libraries
- Activity 4: Large designs w bells & whistles
- Activity 5: Automation of benchmarking

Activity 1: Benchmarking Platform
- Benchmarking “platform”: a reasonable subset of
  - data model
  - specific data representations (e.g., file formats)
  - access mechanisms (e.g., APIs)
  - reference implementation (e.g., a design database)
  - design examples in compatible formats
- Base platforms available (next slide)
- More participation necessary
  - regular discussions
  - additional tasks / features outlined

Common Methodology Platform
- Synthesis (SIS, MVSIS, …)
- Placement (Capo, Dragon, Feng Shui, mPL, …)
- Common model (OpenAccess?)
- BLIF → Bookshelf format
- Blue flow exists; common-model hooks: to be done

Placement Utilities
- Accept input in the GSRC Bookshelf format
- Format converters
  - LEF/DEF ↔ Bookshelf
  - Bookshelf → Kraftwerk (DAC98 BP, E&J)
  - BLIF (SIS) → Bookshelf
- Evaluators, checkers, postprocessors and plotters
- Contributions in these categories are welcome
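To make "input in the GSRC Bookshelf format" concrete, here is a hedged sketch of reading a placement (`.pl`) file. It assumes the commonly seen line layout `name x y : orientation`, an optional trailing `/FIXED` tag, `#` comments, and a `UCLA pl 1.0` header line; the function name `parse_pl` is hypothetical and this is not code from the utilities themselves.

```python
from typing import Dict, Tuple


def parse_pl(text: str) -> Dict[str, Tuple[float, float]]:
    """Map each node name to its (x, y) location from Bookshelf-style .pl text.

    Assumed layout (not verified against every Bookshelf release):
        name  x  y  : orientation  [/FIXED]
    Orientation and the /FIXED tag are ignored in this sketch.
    """
    placement: Dict[str, Tuple[float, float]] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("UCLA"):
            continue  # skip blank lines and the format header
        fields = line.split()
        if len(fields) < 3:
            continue  # not a placement record
        placement[fields[0]] = (float(fields[1]), float(fields[2]))
    return placement


sample = """UCLA pl 1.0
# hypothetical two-node placement

a 0 0 : N
b 30 40 : N /FIXED
"""
print(parse_pl(sample))
```

A real converter or evaluator would also need the companion node and net files; only the placement file is shown here to keep the sketch short.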

Placement Utilities (cont’d)
- Wirelength Calculator (HPWL)
  - Independent evaluation of placement results
- Placement Plotter
  - Saves gnuplot scripts (→ .eps, .gif, …)
  - Multiple views (cells only, cells+nets, rows, …)
- Probabilistic Congestion Maps (Lou et al.)
  - Gnuplot scripts
  - Matlab scripts
    - better graphics, including 3-d fly-by views
  - .xpm files (→ .gif, .jpg, .eps, …)
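The half-perimeter wirelength (HPWL) metric named above is simple enough to sketch. The version below is an illustration, not the Bookshelf calculator: it assumes every pin sits at its cell's placed location (pin offsets are ignored), and the names `net_hpwl` and `total_hpwl` are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Point = Tuple[float, float]


def net_hpwl(pins: Iterable[Point]) -> float:
    """Half the perimeter of the bounding box around one net's pins."""
    xs, ys = zip(*pins)
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_hpwl(placement: Dict[str, Point], nets: List[List[str]]) -> float:
    """Sum HPWL over all nets; single-pin nets contribute nothing."""
    return sum(
        net_hpwl(placement[cell] for cell in net)
        for net in nets
        if len(net) > 1
    )


# Hypothetical three-cell placement with three nets.
placement = {"a": (0, 0), "b": (3, 4), "c": (1, 1)}
nets = [["a", "b"], ["a", "b", "c"], ["c"]]
print(total_hpwl(placement, nets))  # → 14
```

Because the evaluator only needs a placement and a netlist, it can be run independently of the placer that produced the result, which is the point of the "independent evaluation" bullet.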

Placement Utilities (cont’d)
- Legality checker
- Simple legalizer
- Layout Generator
  - Given a netlist, creates a row structure
  - Tunable %whitespace, aspect ratio, etc.
- All available in binaries/PERL at
- Most source code is shipped w Capo
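As a rough sketch of what a layout generator and a legality checker do, the code below sizes a row structure from total cell area, a whitespace fraction and an aspect ratio, then checks that cells sit on rows, inside the core, without overlaps. All names (`RowStructure`, `make_rows`, `is_legal`) and the simplifications (uniform rows, single-row-height cells, aspect ratio taken as height/width) are assumptions for illustration, not the behavior of the actual utilities.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class RowStructure:
    num_rows: int
    row_width: float
    row_height: float


def make_rows(cell_area: float, whitespace: float,
              aspect_ratio: float, row_height: float) -> RowStructure:
    """Build uniform rows so that cells fill (1 - whitespace) of the core."""
    if not 0.0 <= whitespace < 1.0:
        raise ValueError("whitespace must be in [0, 1)")
    core_area = cell_area / (1.0 - whitespace)
    core_height = math.sqrt(core_area * aspect_ratio)  # aspect = height / width
    num_rows = max(1, round(core_height / row_height))
    row_width = core_area / (num_rows * row_height)
    return RowStructure(num_rows, row_width, row_height)


def is_legal(cells: List[Tuple[float, float, float]], rows: RowStructure) -> bool:
    """cells are (x, y, width) tuples; each cell is exactly one row high."""
    spans_by_row: Dict[float, List[Tuple[float, float]]] = {}
    for x, y, width in cells:
        row, remainder = divmod(y, rows.row_height)
        if remainder != 0 or not 0 <= row < rows.num_rows:
            return False  # not aligned to an existing row
        if x < 0 or x + width > rows.row_width:
            return False  # sticks out of the core
        spans_by_row.setdefault(row, []).append((x, x + width))
    for spans in spans_by_row.values():
        spans.sort()
        for (_, right), (left, _) in zip(spans, spans[1:]):
            if left < right:
                return False  # neighboring cells overlap
    return True


rows = make_rows(cell_area=800.0, whitespace=0.2, aspect_ratio=1.0, row_height=10.0)
print(rows.num_rows, round(rows.row_width, 2))
print(is_legal([(0, 0, 5), (5, 0, 5), (0, 10, 4)], rows))
```

Rounding the row count means the achieved aspect ratio only approximates the requested one, while the whitespace target is met exactly because the row width is derived from the core area.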

Activity 2: Creating Evaluators
- Contribute measures/analysis tools for:
  - Timing Analysis
  - Congestion/Yield
  - Power
  - Area
  - Noise…

Challenges for Evaluating Timing-Driven Optimizations
- QOR not defined clearly
  - Max path-length? Worst set-up slack?
  - With false paths or without? ...
- Evaluation methods are not replicable (often shady)
  - Questionable delay models, technology params
  - Net topology generators (MST, single-trunk Steiner trees)
  - Inconsistent results: path delays < Σ gate delays
- Public benchmarks? ...
  - Anecdote: TD-place benchmarks in Verilog (ISPD '01)
  - Companies guard netlists, technology parameters
  - Cell libraries; area constraints
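The "inconsistent results" item suggests an easy sanity check on reported timing data: a path's delay should never be smaller than the sum of the gate delays along it, since wire delay can only add to that sum. The sketch below assumes path and gate delays are already available as plain numbers; the function name `inconsistent_paths` and the data layout are hypothetical.

```python
from typing import Dict, List, Tuple

# (path name, gates on the path in order, reported path delay)
Path = Tuple[str, List[str], float]


def inconsistent_paths(paths: List[Path], gate_delay: Dict[str, float],
                       tol: float = 1e-9) -> List[Tuple[str, float, float]]:
    """Return (name, reported, gate_sum) for paths reported faster than their gates."""
    suspicious = []
    for name, gates, reported in paths:
        gate_sum = sum(gate_delay[g] for g in gates)
        if reported < gate_sum - tol:
            suspicious.append((name, reported, gate_sum))
    return suspicious


# Hypothetical delays in nanoseconds.
gate_delay = {"g1": 0.25, "g2": 0.5, "g3": 0.25}
paths = [
    ("p_ok", ["g1", "g2"], 1.0),         # 1.0 >= 0.75, plausible
    ("p_bad", ["g1", "g2", "g3"], 0.5),  # 0.5 < 1.0, inconsistent
]
print(inconsistent_paths(paths, gate_delay))
```

A check like this does not validate the delay model itself; it only flags results that cannot be right under any nonnegative wire-delay model.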

Metrics for Timing + Reporting
- STA non-trivial: use PrimeTime or PKS
- Distinguish between optimization and evaluation
  - Evaluate setup-slack using commercial tools
  - Optimize individual nets and/or paths
    - E.g., net-length versus allocated budgets
- Report all relevant data
  - How was the total wirelength affected?
  - Were per-net and per-path optimizations successful?
  - Did that improve worst slack or did something else?
- Huge slack improvements reported in some 1990s papers, but wire delays were much smaller than gate delays
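The reporting advice above can be illustrated with a small before/after summary. The sketch assumes arrival and required times per timing endpoint come from an external STA run (the slide recommends commercial tools for that step), and it takes setup slack as required time minus arrival time. `TimingRun`, `worst_setup_slack` and `compare` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class TimingRun:
    total_wirelength: float
    arrival: Dict[str, float]   # endpoint -> latest arrival time
    required: Dict[str, float]  # endpoint -> required time


def worst_setup_slack(run: TimingRun) -> float:
    """Smallest (required - arrival) over all endpoints."""
    return min(run.required[e] - run.arrival[e] for e in run.arrival)


def compare(before: TimingRun, after: TimingRun) -> Dict[str, float]:
    """Report wirelength change next to the slack change, not slack alone."""
    delta = after.total_wirelength - before.total_wirelength
    return {
        "wirelength_change_pct": 100.0 * delta / before.total_wirelength,
        "worst_slack_before": worst_setup_slack(before),
        "worst_slack_after": worst_setup_slack(after),
    }


# Hypothetical numbers: timing-driven placement trades wirelength for slack.
required = {"ff1": 10.0, "ff2": 10.0}
before = TimingRun(1000.0, {"ff1": 9.0, "ff2": 11.0}, required)
after = TimingRun(1050.0, {"ff1": 9.5, "ff2": 10.0}, required)
print(compare(before, after))
```

Reporting both columns makes it visible when a slack gain was bought with a wirelength increase, which a slack-only table would hide.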

Benchmarking Needs for Timing Opt.
- A common, reusable STA methodology
  - High-quality, open-source infrastructure
  - False paths; realistic gate/delay models
- Metrics validated against phys. synthesis
  - The simpler the better, but must be good predictors
- Buffer insertion profoundly impacts layout
  - The use of linear wirelength in timing-driven layout assumes buffer insertion (min-cut vs quadratic)
  - Apparently, synthesis is affected too

Vertical Benchmarks
- “Tool flow”
  - Two or more EDA tools, chained sequentially (potentially, part of a complete design cycle)
  - Sample contexts: physical synthesis, place & route, retiming followed by sequential verification
- Vertical benchmarks
  - Multiple, redundant snapshots of a tool flow: sufficient info for detailed analysis of tool performance
- Herman is maintaining a respective slot in the VLSI CAD Bookshelf
  - See
  - Include flat gate-level netlists
  - Library information (< 250nm)
  - Realistic timing & fixed-die constraints

Infrastructure Needs
- Need common evaluators of delay / power
  - To avoid inconsistent / outdated results
- Relevant initiatives from Si2
  - OLA (Open Library Architecture)
  - OpenAccess
  - For more info, see
- Still: no reliable public STA tool
- Sought: OA-based utilities for timing/layout

Activity 3: Standard-cell Libraries
- Libraries carry technology information
  - Impact of wirelength delays increases in recent technology generations
  - Cell characteristics must be compatible
- Some benchmarks in the Bookshelf use 0.25 µm and 0.35 µm libraries
  - Geometry info is there, + timing (in some cases)
- Cadence test library?
- Artisan libraries?
- Use commercial tools to create libraries
  - Prolific, Cadabra, …

Activity 4: Need New Benchmarks To Confirm / Defeat Tool Tuning
- Data on tuning from the ISPD03 paper “Benchmarking for Placement”, Adya et al.
- Observe that
  - Capo does well on Cadence-Capo, grid-like circuits
  - Dragon does well on IBM-Place (IBM-Dragon)
  - FengShui does well on MCNC benchmarks
  - mPL does well on PEKO
- This is hardly a coincidence
- Motivation for more / better benchmarks
- P.S. Most differences above have been explained; all placers above have been improved

Activity 4: Large Benchmark Creation
- has large designs
  - May be a good starting point; use vendor tools to create BLIF files (+ post results)
  - Note: there may be different ways to convert
- A group of design houses (IBM, Intel, LSI, HP) is planning a release of new large gate-level benchmarks for layout
  - Probably no logic information

Activity 5: Benchmarking Automation
- Rigorous benchmarking is laborious. Risk of errors is high
- How do we keep things simple / accessible?
- Encapsulate software management in an ASP
  - Web uploads for binaries and source in tar.gz w Makefiles
  - Web uploads for benchmarks
  - GUI interface for NxM simulations; tables created automatically
  - GUI interface for composing tool-flows; flows can be saved/reused
  - Distributed back-end includes job scheduling
  - Notification of job completion
  - All files created are available on the Web (permissions & policies)
  - Anyone can re-run / study your experiment or interface with it
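The "NxM simulations; tables created automatically" idea can be sketched without any web front end: run every tool on every benchmark, record one metric per run, and format the table. This is an illustration under stated assumptions, not how Bookshelf.exe is implemented. Tools are modeled as plain Python callables returning a single number, and `run_matrix`, `format_table` and the tool names are hypothetical.

```python
from typing import Callable, Dict, List, Optional

Tool = Callable[[str], float]  # takes a benchmark name, returns one QOR metric
Table = Dict[str, Dict[str, Optional[float]]]


def run_matrix(tools: Dict[str, Tool], benchmarks: List[str]) -> Table:
    """Run N tools on M benchmarks; a crash is recorded as None, not hidden."""
    table: Table = {}
    for bench in benchmarks:
        row: Dict[str, Optional[float]] = {}
        for name, tool in tools.items():
            try:
                row[name] = tool(bench)
            except Exception:
                row[name] = None
        table[bench] = row
    return table


def format_table(table: Table, tool_names: List[str]) -> str:
    """Render the result matrix as tab-separated text."""
    lines = ["\t".join(["benchmark"] + tool_names)]
    for bench, row in table.items():
        cells = ["fail" if row[t] is None else f"{row[t]:.1f}" for t in tool_names]
        lines.append("\t".join([bench] + cells))
    return "\n".join(lines)


def crashing_tool(bench: str) -> float:
    """Stand-in for a tool that fails on every input."""
    raise RuntimeError(f"cannot read {bench}")


# Hypothetical stand-ins for real placers and benchmark names.
tools = {"placerA": lambda bench: float(len(bench)), "placerB": crashing_tool}
table = run_matrix(tools, ["bench1", "bench22"])
print(format_table(table, ["placerA", "placerB"]))
```

Generating the table from the recorded runs, rather than copying numbers by hand, removes one of the error sources the slide warns about; a production setup would replace the callables with scheduled jobs and keep the logs of each run.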

Follow-on Action Plan
- Looking for volunteers to β-test Bookshelf.exe
  - Particularly, in the context of synthesis & verification
  - Contact: Igor
- Create a joint benchmarking group from industry and academia
  - Contact: Prabhakar
  - Regular discussions
  - Development based on common infrastructure