Incorporating Driver Sizing Into Buffer Insertion Via a Delay Penalty Technique Chuck Alpert, IBM Chris Chu, Iowa State Milos Hrkic, UIC Jiang Hu, IBM.

Slides:



Advertisements
Similar presentations
Porosity Aware Buffered Steiner Tree Construction C. Alpert G. Gandham S. Quay IBM Corp M. Hrkic Univ Illinois Chicago J. Hu Texas A&M Univ.
Advertisements

Gregory Shklover, Ben Emanuel Intel Corporation MATAM, Haifa 31015, Israel Simultaneous Clock and Data Gate Sizing Algorithm with Common Global Objective.
Advanced Interconnect Optimizations. Buffers Improve Slack RAT = 300 Delay = 350 Slack = -50 RAT = 700 Delay = 600 Slack = 100 RAT = 300 Delay = 250 Slack.
Ispd-2007 Repeater Insertion for Concurrent Setup and Hold Time Violations with Power-Delay Trade-Off Salim Chowdhury John Lillis Sun Microsystems University.
Buffer and FF Insertion Slides from Charles J. Alpert IBM Corp.
ELEN 468 Lecture 261 ELEN 468 Advanced Logic Design Lecture 26 Interconnect Timing Optimization.
© KLMH Lienig 1 Impact of Local Interconnects and a Tree Growing Algorithm for Post-Grid Clock Distribution Jiayi Xiao.
1 Interconnect Layout Optimization by Simultaneous Steiner Tree Construction and Buffer Insertion Presented By Cesare Ferri Takumi Okamoto, Jason Kong.
An Optimal Algorithm of Adjustable Delay Buffer Insertion for Solving Clock Skew Variation Problem Juyeon Kim, Deokjin Joo, Taehan Kim DAC’13.
The Cost of Fixing Hold Time Violations in Sub-threshold Circuits Yanqing Zhang, Benton Calhoun University of Virginia Motivation and Background Power.
Minimum-Buffered Routing of Non- Critical Nets for Slew Rate and Reliability Control Supported by Cadence Design Systems, Inc. and the MARCO Gigascale.
Dynamic Programming Technique. D.P.2 The term Dynamic Programming comes from Control Theory, not computer science. Programming refers to the use of tables.
38 th Design Automation Conference, Las Vegas, June 19, 2001 Creating and Exploiting Flexibility in Steiner Trees Elaheh Bozorgzadeh, Ryan Kastner, Majid.
Interconnect Optimizations. A scaling primer Ideal process scaling: –Device geometries shrink by  = 0.7x) Device delay shrinks by  –Wire geometries.
EE4271 VLSI Design Interconnect Optimizations Buffer Insertion.
UCLA TRIO Package Jason Cong, Lei He Cheng-Kok Koh, and David Z. Pan Cheng-Kok Koh, and David Z. Pan UCLA Computer Science Dept Los Angeles, CA
Statistical timing and synthesis Chandu paper. Canonical form Compute max(A,B) = C in canonical form (assuming  X i independent)
Interconnect Optimizations
On-Line Adjustable Buffering for Runtime Power Reduction Andrew B. Kahng Ψ Sherief Reda † Puneet Sharma Ψ Ψ University of California, San Diego † Brown.
EE4271 VLSI Design Advanced Interconnect Optimizations Buffer Insertion.
1 Integrating Logic Retiming and Register Placement Tzu-Chieh Tien, Hsiao-Pin Su, Yu-Wen Tsay Yih-Chih Chou, and Youn-Long Lin Department of Computer Science.
ELEN 468 Lecture 271 ELEN 468 Advanced Logic Design Lecture 27 Interconnect Timing Optimization II.
EE141 © Digital Integrated Circuits 2nd Combinational Circuits 1 Logical Effort - sizing for speed.
Pei-Ci Wu Martin D. F. Wong On Timing Closure: Buffer Insertion for Hold-Violation Removal DAC’14.
Modern VLSI Design 2e: Chapter 4 Copyright  1998 Prentice Hall PTR Topics n Crosstalk. n Power optimization.
UC San Diego Computer Engineering VLSI CAD Laboratory UC San Diego Computer Engineering VLSI CAD Laboratory UC San Diego Computer Engineering VLSI CAD.
Interconnect Synthesis. Buffering Related Interconnect Synthesis Consider –Layer assignment –Wire sizing –Buffer polarity –Driver sizing –Generalized.
1 Theory I Algorithm Design and Analysis (11 - Edit distance and approximate string matching) Prof. Dr. Th. Ottmann.
Advanced Interconnect Optimizations. Timing Driven Buffering Problem Formulation Given –A Steiner tree –RAT at each sink –A buffer type –RC parameters.
03/30/031 ECE 551: Digital System Design & Synthesis Lecture Set 9 9.1: Constraints and Timing 9.2: Optimization (In separate file)
VLSI Physical Design: From Graph Partitioning to Timing Closure Paper Presentation © KLMH Lienig 1 EECS 527 Paper Presentation Accurate Estimation of Global.
Modern VLSI Design 4e: Chapter 4 Copyright  2008 Wayne Wolf Topics n Interconnect design. n Crosstalk. n Power optimization.
Discrete Gate Sizing CENG 5270 – Tutorial 9 WILLIAM CHOW.
1 Coupling Aware Timing Optimization and Antenna Avoidance in Layer Assignment Di Wu, Jiang Hu and Rabi Mahapatra Texas A&M University.
EE 5900 Advanced Algorithms for Robust VLSI CAD, Spring 2009 Static Timing Analysis and Gate Sizing.
A Polynomial Time Approximation Scheme For Timing Constrained Minimum Cost Layer Assignment Shiyan Hu*, Zhuo Li**, Charles J. Alpert** *Dept of Electrical.
CSCE350 Algorithms and Data Structure Lecture 17 Jianjun Hu Department of Computer Science and Engineering University of South Carolina
Dynamic Programming. Well known algorithm design techniques:. –Divide-and-conquer algorithms Another strategy for designing algorithms is dynamic programming.
Wen-Hao Liu 1, Yih-Lang Li 1, and Kai-Yuan Chao 2 1 Department of Computer Science, National Chiao-Tung University, Hsin-Chu, Taiwan 2 Intel Architecture.
Prof. Joongho Choi CMOS CLOCK-RELATED CIRCUIT DESIGN Integrated Circuits Spring 2001 Dept. of ECE University of Seoul.
A NEW ECO TECHNOLOGY FOR FUNCTIONAL CHANGES AND REMOVING TIMING VIOLATIONS Jui-Hung Hung, Yao-Kai Yeh,Yung-Sheng Tseng and Tsai-Ming Hsieh Dept. of Information.
VLSI Physical Design: From Graph Partitioning to Timing Closure Chapter 5: Global Routing © KLMH Lienig 1 EECS 527 Paper Presentation Techniques for Fast.
Modern VLSI Design 3e: Chapter 4 Copyright  1998, 2002 Prentice Hall PTR Topics n Combinational network delay. n Logic optimization.
ECO Timing Optimization Using Spare Cells Yen-Pin Chen, Jia-Wei Fang, and Yao-Wen Chang ICCAD2007, Pages ICCAD2007, Pages
Modern VLSI Design 3e: Chapter 4 Copyright  1998, 2002 Prentice Hall PTR Topics n Interconnect design. n Crosstalk. n Power optimization.
A Faster Approximation Scheme for Timing Driven Minimum Cost Layer Assignment Shiyan Hu*, Zhuo Li**, and Charles J. Alpert** *Dept of ECE, Michigan Technological.
ELEN 468 Lecture 271 ELEN 468 Advanced Logic Design Lecture 27 Gate and Interconnect Optimization.
Fast Algorithms for Slew Constrained Minimum Cost Buffering S. Hu*, C. Alpert**, J. Hu*, S. Karandikar**, Z. Li*, W. Shi* and C. Sze** *Dept of ECE, Texas.
Topics Combinational network delay.
FPGA-Based System Design: Chapter 3 Copyright  2004 Prentice Hall PTR Circuit design for FPGAs n Static CMOS gate vs. LUT n LE output drivers n Interconnect.
Physical Synthesis Buffer Insertion, Gate Sizing, Wire Sizing,
Routing Tree Construction with Buffer Insertion under Obstacle Constraints Ying Rao, Tianxiang Yang Fall 2002.
CEC 220 Digital Circuit Design Timing Diagrams, MUXs, and Buffers Mon, Oct 5 CEC 220 Digital Circuit Design Slide 1 of 20.
CEC 220 Digital Circuit Design Timing Diagrams, MUXs, and Buffers Friday, February 14 CEC 220 Digital Circuit Design Slide 1 of 18.
CEC 220 Digital Circuit Design Timing Diagrams, MUXs, and Buffers
Logic synthesis flow Technology independent mapping –Two level or multilevel optimization to optimize a coarse metric related to area/delay Technology.
Modern VLSI Design 4e: Chapter 4 Copyright  2008 Wayne Wolf Topics n Combinational network delay. n Logic optimization.
An Efficient Surface-Based Low-Power Buffer Insertion Algorithm
Static Timing Analysis
A Fully Polynomial Time Approximation Scheme for Timing Driven Minimum Cost Buffer Insertion Shiyan Hu*, Zhuo Li**, Charles Alpert** *Dept of Electrical.
An O(bn 2 ) Time Algorithm for Optimal Buffer Insertion with b Buffer Types Authors: Zhuo Li and Weiping Shi Presenter: Sunil Khatri Department of Electrical.
Chapter 11 System Performance Enhancement. Basic Operation of a Computer l Program is loaded into memory l Instruction is fetched from memory l Operands.
An O(nm) Time Algorithm for Optimal Buffer Insertion of m Sink Nets Zhuo Li and Weiping Shi {zhuoli, Texas A&M University College Station,
Maze Routing with Buffer Insertion and Wire sizing Minghorng Lai, D.F. Wong DAC 2000.
Unified Adaptivity Optimization of Clock and Logic Signals Shiyan Hu and Jiang Hu Dept of Electrical and Computer Engineering Texas A&M University.
Two-phase Latch based design
Buffer Insertion with Adaptive Blockage Avoidance
2 University of California, Los Angeles
Buffered tree construction for timing optimization, slew rate, and reliability control Abstract: With the rapid scaling of IC technology, buffer insertion.
Buffered Steiner Trees for Difficult Instances
Presentation transcript:

Incorporating Driver Sizing Into Buffer Insertion Via a Delay Penalty Technique Chuck Alpert, IBM Chris Chu, Iowa State Milos Hrkic, UIC Jiang Hu, IBM Stephen Quay, IBM Gopal Gandham, IBM Chandramouli Kashyap, IBM

2 Which One is Not Like the Others?

3 Buffer insertion Driver sizing Wire sizing Steiner tree BIWS

4 Why Simultaneous Optimization? Electrically-challenged net Driver sizing alone Buffer insertion alone Simultaneous optimization

5 Integrating Driver Sizing Three choices

6 Driver Sizing Affects Multiple Nets slow stage

7 Upstream Capacitance Effects Slack (ns) # Nets Optimized

8 The Driver Sizing Penalty C1C1C1C1 C2C2C2C2 C1C1C1C1 Decoupling buffers Penalty is delay through fastest decoupling buffer/inverter chain

9 Delay Penalty Algorithm Continuous buffer library not realizable Assume set of discrete buffers B1,..., Bn such that C B1 < C B2 <... < C Bn monotone function delay(Bi, C) Apply dynamic programming

10 Example B1B1 Given optimal chains B2B2 B3B3 B4B4 C B5 Find optimal chain for B5

11 Dynamic Programming Recurrence To drive capacitance C Bi, combine optimal chain driving C Bj with buffer Bj D(C B1 )=0 D(C Bi )=min 0<j<i {D(C Bi ) + Delay(Bj, C Bi )}

12 Advantages O(n 2 ) complexity Compute once as a lookup table Can handle inverters and slew Virtually no CPU cost Applicable for many approaches

13 Experiments Five unoptimized circuits (73 – 303K cells) Three approaches VG (no driver sizing) Max (driver sizing with no delay penalty) DP (driver sizing with delay penalty) Run on thousands of nets

14 Total Buffers Inserted

15 Number of Upsized drivers

16 Total Area Percentage Increase

17 Worst Slack

18 And So... Simple to combine buffer insertion with driver sizing Virtually no CPU impact Extends to many buffer insertion approaches No timing graph queries Works well