Hierarchical Physical Design Methodology for Multi-Million Gate Chips Session 11 Wei-Jin Dai.

Presentation transcript:

Hierarchical Physical Design Methodology for Multi-Million Gate Chips Session 11 Wei-Jin Dai

2 Overview
Introduction
Challenges of hierarchical design
Hierarchical methodology
  –Full chip physical prototyping
Performance data
Summary

3 Introduction
As chip size and complexity grow, a hierarchical design approach is necessary
During the last 12 months, there has been a big increase in the number of chips designed with a hierarchical approach
The advantage of the hierarchical approach is divide-and-conquer

4 The Challenges
How to get full-chip (10+ million gates) physical reality early on to identify potential problems?
How to have a convergent process that reaches design closure from beginning to end?
How to achieve die utilization similar to the “flat” approach?
How to achieve clock speed and skews similar to the “flat” approach?
How to automatically generate optimal pin assignments for each module?
How to automatically come up with realistic timing budgets for each module?
How to achieve top-level timing/signal integrity closure?

5 Creating the Physical Prototype
Full-chip flat prototype delivers the complete physical, timing, clock and power data
  –Eliminates the guessing of the traditional block-based approaches
Drives the partitioning into manageable blocks
Flat Full-Chip Delivers an Accurate Physical Prototype

6 Prototyping Starts Early in the Flow
Most accurate view possible at all design stages
Physical timing budgeting drives synthesis
[Figure: prototyping across the design flow. Labels: Estimation; Refinement; Optimization; Design Completion; RTL/Black box; 75% netlist/Black box; Complete netlist; Initial timing budgets; Refined timing budgets]

7 Hierarchical Design Flow
[Flow diagram. Labels as extracted, grouped by stage:]
Inputs: LEF/GDSII; RTL/Black Box; Process Data; Chip Level Timing Constraints
Flat Full Chip Physical Prototype: Quick synthesis; Floor planning; Placement; CTS; Trial route
Prototype results: Die size; Timing; Clock skew; Power; SI
Decision: Physically Feasible? (a NO branch is shown)
Physical Partitioning
Partition Data: Pin assignment; Timing budget; Clock spec; Power grid; DEF Placement
Block Implementation: Place, CTS, Optimize
Top Level Implementation: CTS, Optimization, Power
Output: Optimized Top Level Netlist

8 Hierarchical Partitioning
Pin assignment
Timing budgeting
Clock tree generation
Power grid planning
[Figure labels: Partitioning; Independent block-level implementation; SoC assembly]

9 Accurate Pin Assignment
Full-chip prototype results in optimal pin placement
  –Results in narrower channels and reduced die size
  –Reduces the routing congestion
  –Improves the chip timing
[Figure labels: Accurate Physical Prototype; Flat Full-Chip; Top Level Partition View]

10 Timing Budgeting
Each block requires:
  Clock definition
  set_input_delay
  set_output_delay
  set_drive
  set_load
  Path exceptions (false, multicycle paths)
[Figure labels: Block 1; Block 2; Block 3; L; L; L]
Accurate timing budgets result in predictable timing convergence
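The per-block constraint list above can be illustrated with a minimal Python sketch that turns a budget into SDC-style constraint lines. The slides do not show how budgets are derived or written out, so the `PortBudget` structure, the port names and every numeric value here are hypothetical; only the command names come from the slide.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PortBudget:
    """Hypothetical budget for one block port (delay in ns, load in pF)."""
    name: str
    direction: str         # "in" or "out"
    delay_ns: float        # time budgeted to logic outside the block
    load_pf: float = 0.0   # external load; only used for outputs


def block_constraints(clock_port: str, period_ns: float,
                      ports: List[PortBudget]) -> List[str]:
    """Return SDC-style constraint lines for one block."""
    lines = [
        f"create_clock -name {clock_port} -period {period_ns:g} "
        f"[get_ports {clock_port}]"
    ]
    for p in ports:
        if p.direction == "in":
            lines.append(
                f"set_input_delay {p.delay_ns:g} -clock {clock_port} "
                f"[get_ports {p.name}]"
            )
        elif p.direction == "out":
            lines.append(
                f"set_output_delay {p.delay_ns:g} -clock {clock_port} "
                f"[get_ports {p.name}]"
            )
            lines.append(f"set_load {p.load_pf:g} [get_ports {p.name}]")
        else:
            raise ValueError(f"unknown direction: {p.direction!r}")
    return lines


# Hypothetical block with one input and one output on a 10 ns clock.
example = block_constraints(
    "clk", 10.0,
    [PortBudget("din", "in", 4.0), PortBudget("dout", "out", 3.5, 0.05)],
)
```

Joining `example` with newlines gives a constraint file that a block-level synthesis or place-and-route run could read; `set_drive` and path exceptions are omitted to keep the sketch short.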

11 Hierarchical Clock Tree Synthesis
Accurate physical timing data enables the creation of an optimal clock tree
  –Block-level followed by top-level clock tree
Final clock tree routing generates near zero skew
  –Balanced tree at the top level
Worst block skew + Zero top level skew = 150ps total clock skew
[Figure: balanced clock tree; block skews 150ps, 120ps, 50ps, 50ps, 100ps, 130ps]
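The skew arithmetic on this slide can be written out as a short Python sketch. It assumes, as the slide's formula does, that total skew is the worst block-level skew plus the top-level skew; the function name is illustrative and the block skews are the six values shown in the figure.

```python
def total_clock_skew(block_skews_ps, top_level_skew_ps=0):
    """Worst block-level skew plus top-level skew, in picoseconds."""
    return max(block_skews_ps) + top_level_skew_ps


# Block skews from the slide's figure, in ps.
block_skews_ps = [150, 120, 50, 50, 100, 130]

# With a balanced (zero-skew) top-level tree the worst block dominates.
total_ps = total_clock_skew(block_skews_ps)
```

With a zero-skew top-level tree, `total_ps` is 150, the figure quoted on the slide. Any non-zero top-level skew adds directly to that total.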

12 Full Chip Power Analysis

13 Hierarchical Power Grid Design
P/G are planned at full chip level
P/G network gets automatically pushed down during partitioning
[Figure labels: Full chip; Block]

14 Performance Data
Design Description          Netlist to SDF Time
1.8M cells; 200 macros      6 hours
900K cells                  3 hours
2.3M cells; 700 macros      14 hours
2M cells; 100+ macros       5 hours
2.8M cells                  10 hours
1.7M cells; 70 macros       5 hours
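To compare these runs on a common scale, the netlist-to-SDF times can be converted to cells processed per hour. This Python sketch uses only the cell counts and hours listed above; treating cells per hour as a throughput measure is an assumption made here and not something the slides state, and it ignores macro count.

```python
# (description, cell count, netlist-to-SDF hours) taken from the slide.
RUNS = [
    ("1.8M cells; 200 macros", 1_800_000, 6),
    ("900K cells", 900_000, 3),
    ("2.3M cells; 700 macros", 2_300_000, 14),
    ("2M cells; 100+ macros", 2_000_000, 5),
    ("2.8M cells", 2_800_000, 10),
    ("1.7M cells; 70 macros", 1_700_000, 5),
]


def cells_per_hour(cells, hours):
    """Average number of cells processed per hour of run time."""
    return cells / hours


THROUGHPUT = {desc: cells_per_hour(cells, hours) for desc, cells, hours in RUNS}
```

On these figures, throughput ranges from roughly 164K cells per hour for the 700-macro design to 400K cells per hour for the 2M-cell design.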

15 High Performance Environment
Step                First Encounter   Traditional    Speedup
Design Import       4 min             4 hr           60x
Detail Place        3 hr 20 min       2 hr 50 min    1x
Detail Route*       8 min             7 hr 30 min    56x
RC Extract          6 min             5 hr 45 min    57x
Delay Calculation   7 min             3 hr 50 min    33x
Timing Analysis     20 min            2 hr 15 min    7x
IPO                 1 hr 50 min       9 hr           5x
Design Iteration    5 hr 25 min       35 hr 40 min   6x
(*) SPC Trial Route
Design: 580K cells, 0.25um process, 5LM, 100MHz
Data collected on a 500MHz processor workstation
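The speedup factors on this slide can be cross-checked from the run times. The Python sketch below assumes that each speedup and time pair belongs to the step names in the order they are listed, and that the first time in each pair is the First Encounter run and the second the traditional flow; the slide text implies both but does not state either.

```python
def to_minutes(hours=0, mins=0):
    """Convert an hours/minutes pair to minutes."""
    return hours * 60 + mins


# (First Encounter, Traditional) run times in minutes. The pairing of
# times to step names is an assumption based on the order on the slide.
RUN_TIMES = {
    "Design Import": (to_minutes(mins=4), to_minutes(hours=4)),
    "Detail Place": (to_minutes(3, 20), to_minutes(2, 50)),
    "Detail Route*": (to_minutes(mins=8), to_minutes(7, 30)),
    "RC Extract": (to_minutes(mins=6), to_minutes(5, 45)),
    "Delay Calculation": (to_minutes(mins=7), to_minutes(3, 50)),
    "Timing Analysis": (to_minutes(mins=20), to_minutes(2, 15)),
    "IPO": (to_minutes(1, 50), to_minutes(hours=9)),
    "Design Iteration": (to_minutes(5, 25), to_minutes(35, 40)),
}


def speedup(first_encounter_min, traditional_min):
    """Ratio of traditional run time to First Encounter run time."""
    return traditional_min / first_encounter_min


SPEEDUPS = {step: speedup(fe, trad) for step, (fe, trad) in RUN_TIMES.items()}
```

Under that pairing, Design Import works out to exactly 60, matching the slide's 60x, and the other ratios land close to the quoted factors, which appear to be rounded or truncated. Detail Place comes out at 0.85, meaning the prototype flow is slightly slower for that step.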

16 High Accuracy of the Prototype
The prototype closely correlates with post-route layout
  –Comparison to ‘tape-out’ back-end flow
  –More than 90% of the interconnect and IO path delays within 2%
Design:
  5LM
  0.25um
  580K cells
  620K nets
  572 I/Os
  4 blocks
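A correlation figure such as "more than 90% of delays within 2%" can be computed along the following lines. This is only a sketch of the metric: the slides do not say how the comparison was made, so measuring the error relative to the post-route delay is an assumption, and the delay values below are made up for illustration.

```python
def fraction_within(prototype_delays, post_route_delays, tolerance=0.02):
    """Fraction of paths whose prototype delay is within `tolerance`
    (relative to the post-route delay) of the post-route delay."""
    if len(prototype_delays) != len(post_route_delays):
        raise ValueError("delay lists must have the same length")
    if not post_route_delays:
        raise ValueError("at least one path is required")
    hits = sum(
        1
        for proto, routed in zip(prototype_delays, post_route_delays)
        if abs(proto - routed) <= tolerance * abs(routed)
    )
    return hits / len(post_route_delays)


# Hypothetical path delays in ns; not data from the presentation.
prototype = [1.00, 2.03, 0.505, 3.30]
post_route = [1.01, 2.00, 0.500, 3.00]
correlation = fraction_within(prototype, post_route)
```

Three of the four made-up paths fall within 2%, so `correlation` is 0.75. The slide reports a value above 0.90 for the real design.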

17 Summary
SoC Hierarchical Methodology
Build a full-chip physical prototype early on
  –Start at RTL
  –Identify problems early
Achieve design closure before partitioning
  –Close full-chip timing
  –Optimize die size
  –Meet power requirements
  –Resolve signal integrity issues
Maintain the design closure throughout the design process