Timing Margin Recovery With Flexible Flip-Flop Timing Model

Slides:

Advertisements

Similar presentations

OCV-Aware Top-Level Clock Tree Optimization

Advertisements

-1- VLSI CAD Laboratory, UC San Diego Post-Routing BEOL Layout Optimization for Improved Time- Dependent Dielectric Breakdown (TDDB) Reliability Tuck-Boon.

Courtesy RK Brayton (UCB) and A Kuehlmann (Cadence) 1 Logic Synthesis Sequential Synthesis.

A Robust, Fast Pulsed Flip- Flop Design By: Arunprasad Venkatraman Rajesh Garg Sunil Khatri Department of Electrical and Computer Engineering, Texas A.

Chapter 6 –Selected Design Topics Part 2 – Propagation Delay and Timing Logic and Computer Design Fundamentals.

Minimum Implant Area-Aware Gate Sizing and Placement

1 Lecture 28 Timing Analysis. 2 Overview °Circuits do not respond instantaneously to input changes °Predictable delay in transferring inputs to outputs.

UC San Diego / VLSI CAD Laboratory NOLO: A No-Loop, Predictive Useful Skew Methodology for Improved Timing in IC Implementation Tuck-Boon Chan, Andrew.

The Cost of Fixing Hold Time Violations in Sub-threshold Circuits Yanqing Zhang, Benton Calhoun University of Virginia Motivation and Background Power.

Yuanlin Lu Intel Corporation, Folsom, CA Vishwani D. Agrawal

ECEN 248: INTRODUCTION TO DIGITAL SYSTEMS DESIGN Dr. Shi Dept. of Electrical and Computer Engineering.

Timing Analysis Timing Analysis Instructor: Dr. Vishwani D. Agrawal ELEC 7770 Advanced VLSI Design Team Project.

Background: Scan-Based Delay Fault Testing Sequentially apply initialization, launch test vector pairs that differ by 1-bit shift A vector pair induces.

Power-Aware Placement

UC San Diego Computer Engineering VLSI CAD Laboratory UC San Diego Computer Engineering VLSI CAD Laboratory UC San Diego Computer Engineering VLSI CAD.

Statistical Crosstalk Aggressor Alignment Aware Interconnect Delay Calculation Supported by NSF & MARCO GSRC Andrew B. Kahng, Bao Liu, Xu Xu UC San Diego.

Embedded Systems Hardware:

Impact of Guardband Reduction on Design Process Outcomes Kwangok Jeong Andrew B. Kahng Kambiz Samadi

Architectural-Level Prediction of Interconnect Wirelength and Fanout Kwangok Jeong, Andrew B. Kahng and Kambiz Samadi UCSD VLSI CAD Laboratory

Continuous Retiming EECS 290A Sequential Logic Synthesis and Verification.

Chung-Kuan Cheng†, Andrew B. Kahng†‡,

On-Line Adjustable Buffering for Runtime Power Reduction Andrew B. Kahng Ψ Sherief Reda † Puneet Sharma Ψ Ψ University of California, San Diego † Brown.

1 UCSD VLSI CAD Laboratory ISQED-2009 Revisiting the Linear Programming Framework for Leakage Power vs. Performance Optimization Kwangok Jeong, Andrew.

Toward Performance-Driven Reduction of the Cost of RET-Based Lithography Control Dennis Sylvester Jie Yang (Univ. of Michigan,

A Proposal for Routing-Based Timing-Driven Scan Chain Ordering Puneet Gupta 1 Andrew B. Kahng 1 Stefanus Mantik 2

Statistical Gate Delay Calculation with Crosstalk Alignment Consideration Andrew B. Kahng, Bao Liu, Xu Xu UC San Diego

ECE Synthesis & Verification 1 ECE 667 ECE 667 Synthesis and Verification of Digital Systems Retiming.

Methodology from Chaos in IC Implementation Kwangok Jeong * and Andrew B. Kahng *,** * ECE Dept., UC San Diego ** CSE Dept., UC San Diego.

UC San Diego Computer Engineering VLSI CAD Laboratory UC San Diego Computer Engineering VLSI CAD Laboratory UC San Diego Computer Engineering VLSI CAD.

Timing Analysis and Optimization Implications of Bimodal CD Distribution in Double Patterning Lithography Kwangok Jeong and Andrew B. Kahng VLSI CAD LABORATORY.

UC San Diego Computer Engineering VLSI CAD Laboratory UC San Diego Computer Engineering VLSI CAD Laboratory UC San Diego Computer Engineering VLSI CAD.

DDRO: A Novel Performance Monitoring Methodology Based on Design-Dependent Ring Oscillators Tuck-Boon Chan †, Puneet Gupta §, Andrew B. Kahng †‡ and Liangzhen.

Optimization of Linear Problems: Linear Programming (LP) © 2011 Daniel Kirschen and University of Washington 1.

Enhanced Metamodeling Techniques for High-Dimensional IC Design Estimation Problems Andrew B. Kahng, Bill Lin and Siddhartha Nath VLSI CAD LABORATORY,

UC San Diego / VLSI CAD Laboratory Reliability-Constrained Die Stacking Order in 3DICs Under Manufacturing Variability Tuck-Boon Chan, Andrew B. Kahng,

Andrew B. Kahng‡†, Mulong Luo†, Siddhartha Nath†

Dose Map and Placement Co-Optimization for Timing Yield Enhancement and Leakage Power Reduction Kwangok Jeong, Andrew B. Kahng, Chul-Hong Park, Hailong.

DELAY INSERTION METHOD IN CLOCK SKEW SCHEDULING BARIS TASKIN and IVAN S. KOURTEV ISPD 2005 High Performance Integrated Circuit Design Lab. Department of.

Accuracy-Configurable Adder for Approximate Arithmetic Designs

-1- UC San Diego / VLSI CAD Laboratory A Global-Local Optimization Framework for Simultaneous Multi-Mode Multi-Corner Clock Skew Variation Reduction Kwangsoo.

A New Methodology for Reduced Cost of Resilience Andrew B. Kahng, Seokhyeong Kang and Jiajia Li UC San Diego VLSI CAD Laboratory.

UC San Diego / VLSI CAD Laboratory Toward Quantifying the IC Design Value of Interconnect Technology Improvement Tuck-Boon Chan, Andrew B. Kahng, Jiajia.

Horizontal Benchmark Extension for Improved Assessment of Physical CAD Research Andrew B. Kahng, Hyein Lee and Jiajia Li UC San Diego VLSI CAD Laboratory.

ASIC Design Flow – An Overview Ing. Pullini Antonio

UC San Diego / VLSI CAD Laboratory Incremental Multiple-Scan Chain Ordering for ECO Flip-Flop Insertion Andrew B. Kahng, Ilgweon Kang and Siddhartha Nath.

ECE Advanced Digital Systems Design Lecture 12 – Timing Analysis Capt Michael Tanner Room 2F46A HQ U.S. Air Force Academy I n t e g r i.

-1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI.

Kwangsoo Han, Andrew B. Kahng, Hyein Lee and Lutong Wang

A Power Grid Analysis and Verification Tool Based on a Statistical Prediction Engine M.K. Tsiampas, D. Bountas, P. Merakos, N.E. Evmorfopoulos, S. Bantas.

Kwangsoo Han‡, Andrew B. Kahng‡† and Hyein Lee‡

EEE2243 Digital System Design Chapter 7: Advanced Design Considerations by Muhazam Mustapha, extracted from Intel Training Slides, April 2012.

Skewed Flip-Flop Transformation for Minimizing Leakage in Sequential Circuits Jun Seomun, Jaehyun Kim, Youngsoo Shin Dept. of Electrical Engineering, KAIST,

Outline Introduction: BTI Aging and AVS Signoff Problem

D FLIP FLOP DESIGN AND CHARACTERIZATION -BY LAKSHMI SRAVANTHI KOUTHA.

UC San Diego / VLSI CAD Laboratory Learning-Based Approximation of Interconnect Delay and Slew Modeling in Signoff Timing Tools Andrew B. Kahng, Seokhyeong.

Mixed Cell-Height Implementation for Improved Design Quality in Advanced Nodes Sorin Dobre +, Andrew B. Kahng * and Jiajia Li * * UC San Diego VLSI CAD.

1 COMP541 Sequential Logic Timing Montek Singh Sep 30, 2015.

Outline Motivation and Contributions Related Works ILP Formulation

Synchronous Sequential Circuits by Dr. Amin Danial Asham.

-1- UC San Diego / VLSI CAD Laboratory Optimization of Overdrive Signoff Tuck-Boon Chan, Andrew B. Kahng, Jiajia Li and Siddhartha Nath Tuck-Boon Chan,

-1- Delay Uncertainty and Signal Criticality Driven Routing Channel Optimization for Advanced DRAM Products Samyoung Bang #, Kwangsoo Han ‡, Andrew B.

Effective Linear Programming-Based Placement Techniques Sherief Reda UC San Diego Amit Chowdhary Intel Corporation.

-1- UC San Diego / VLSI CAD Laboratory Optimal Reliability-Constrained Overdrive Frequency Selection in Multicore Systems Andrew B. Kahng and Siddhartha.

Yuxi Liu The Chinese University of Hong Kong Circuit Timing Problem Driven Optimization.

Kun Young Chung*, Andrew B. Kahng+ and Jiajia Li+

Time-borrowing platform in the Xilinx UltraScale+ family of FPGAs and MPSoCs Ilya Ganusov, Benjamin Devlin.

Improved Performance of 3DIC Implementations Through Inherent Awareness of Mix-and-Match Die Stacking Kwangsoo Han, Andrew B. Kahng and Jiajia Li University.

Revisiting and Bounding the Benefit From 3D Integration

Timing Analysis 11/21/2018.

Fast Min-Register Retiming Through Binary Max-Flow

Presentation transcript:

Timing Margin Recovery With Flexible Flip-Flop Timing Model Andrew B. Kahng and Hyein Lee UC San Diego VLSI CAD Laboratory

Outline Preliminary Motivation Related Work Sequential LP-based Optimization Experimental Results Conclusions and Future Work

Preliminary: Static Timing Analysis Timing corners Min corner: the corner where gate/wire delay is minimum Max corner: the corner where gate/wire delay is maximum Timing modes Scenarios where different functions are performed in a design Test mode, function mode, etc. Types of analyses Graph-based analysis: Considers only the worst/best case Path-based analysis: Considers input vectors for more accurate analysis

Preliminary: Flip-Flop (FF) Timing Model FF timing components Setup time: the minimum amount of time input data should be steady before clock Hold time: the minimum amount of time input data should be steady after clock Clock-to-q (c2q) delay: the delay of output from clock Conventional timing model: Setup/hold time and c2q delay are fixed values

Motivation: Flexible FF Timing Setup/hold time/c2q delay is NOT a single value Various setup-hold-c2q sets can be used for timing analysis ⇒ Flexible FF timing model ⇒ Reduce pessimism ⇒ Tradeoff among setup/hold/c2q delay setup hold c2q1 c2qn ... setup-hold-c2q flexible model

Motivation: Why Flexible FF Timing? If data paths are independent of each other in PBA, Using fixed FF timing model can loose performance optimization opportunity Flexible timing model could reduce pessimism setup: 10ps c2q: 20ps setup: 20ps c2q: 10ps FF1 470ps 460ps 480ps 470ps 480ps 460ps Total: 500ps FF3 FF2 20ps 10ps  500ps!  520ps?

Related Work Setup-hold time interdependency Characterization [7] [8] [10] Timing analysis [1] [2] [3] [4] [5] [6] However, no consideration of c2q delay Setup-hold-c2q interdependency [1] Propose a timing analysis method by exploiting flexible flip-flop timing model However, iterative search can result in suboptimal solutions Our work Based on setup-hold-c2q interdependency Linear programming-based global optimization Mode/corner-specific timing analysis by exploiting flexible FF timing model

Outline Preliminary Motivation Related Work Sequential LP-based Optimization Experimental Results Conclusions and Future Work

Problem Formulation Objective: Find the best combination of setup/hold time and c2q for each FF to minimize setup/hold violations Solution space: 3d surface  not easy to obtain an accurate analytical model with three variables To reduce the dimension, we divide setup-hold-c2q optimization into: setup-c2q optimization + hold-c2q optimization setup c2q C2q-setup-hold surface setup hold c2q hold c2q

Problem Formulation: Sequential LP Solve two sub-problems sequentially The solution from one problem ⇒ used as input to another problem The sequence of solving problems can change depending on which problem is more critical Setup-c2q optimization Maximize setup slack Subject to c2q + dmax + Tsu + Ssu ≤ P c2q = f(Tsu) L ≤ Tsu ≤ U Hold-c2q optimization Maximize setup and hold slack Subject to c2q + dmax + Tsu + Ssu ≤ P dmin + Sh > Th c2q = f(Th) L ≤ Th ≤ U Where dmax/dmin : max/min data path delay, Tsu: setup time, Th: hold time, Ssu: setup slack, Sh: hold slack

Timing Signoff Across Corners/Modes Setup/hold time does not have to be a single value across corners/modes! Timing signoff across corners Max delay corner  hold violation , setup violation  ⇒ Select optimal setup time first Min delay corner  setup violation , hold violation  ⇒ Select optimal hold time first Timing signoff across modes Scan test mode  hold violation  ⇒ Select optimal hold time first The flexible flip-flop timing model is helpful especially for multi-corner multi-mode timing analysis. This is because, we don’t need to use a single value for each of setup time, hold time and c2q delay. For example, considering multiple corners, we have less hold time violations and more setup time violations at max delay corner. So, we can first optimize setup time. Similarly, at min delay corner, we have less setup time violations and more hold time violations. In this case, we can optimize hold time first. For multi-mode analysis, the same method can be applied. For instance, scan test mode can have more hold time violations because scan paths have less number of stages.

New Timing Signoff: Max Corner Setup-c2q optimization is performed first setup-c2q optimization for max delay paths Annotate setup/c2q to each FF hold-c2q optimization for hold violated paths Annotate hold/c2q to each FF

Characterization Setup-hold-c2q curves are characterized with exhaustive SPICE simulation Setup-hold-c2q triplets are obtained at every 5ps of timing points Use a pulse as the input data to characterize setup/hold time interdependency setup time: data rise to clock rise, hold time: clock rise to data fall, c2q: clock rise to q rise clock data input q output data slew clock slew (for rising edge triggered FF and rise input) c2q setup time hold time

New Timing Signoff: Overall Flow Netlist (and SPEF, if routed) Extract path timing information LP formulation with flexible flip-flop timing model Solve Sequential LP (STA_FTmax , STA_FTmin) Solution Annotate new timing model for each flip-flop Timing signoff with annotated timing

Outline Preliminary Motivation Related Work Sequential LP-based Optimization Experimental Results Conclusions and Future Work

Experimental Setup Testcases Commercial flow Open source designs with 65nm foundry technology Commercial flow Logic synthesis: Synopsys Design/DFT Compiler H-2013.03-SP3 P&R: Cadence Encounter Digital Implementation System XL 10.1 Timing signoff: Synopsys PrimeTime H-2013.06-SP2 LP solver: CPLEX 12.5.1 Our method is compared with Conventional: Conventional fixed FF timing model [4]: Flexible setup/hold time with fixed c2q delays cTool: a commercial tool with setup-hold pessimism reduction functionality

Experiment Results Results of corner-specific timing analysis at max/min corner In mode-specific analysis, the summation of setup/hold slacks is improved Our method fixes negative setup/hold time violations “for free” Worst setup slack Worst hold slack [ns]

Conclusion We exploit a flexible flip-flop timing model: three-dimensional tradeoff among setup time, hold time and clock-to-q delay We apply sequential LP-based approaches for multi-corner/mode timing signoff Worst slack improves by 48ps on average and by up to 130ps with 65nm technology (inverter delay = ~50ps) Future work Demonstration with advanced technologies More accurate timing model of setup-hold-c2q tradeoff Circuit optimization by exploiting FF timing model flexibility

Thank you!