CSE241 L4 System.1Kahng & Cichy, UCSD ©2003 CSE241A VLSI Digital Circuits Winter 2003 Lecture 04: Static Timing Analysis.

Slides:



Advertisements
Similar presentations
OCV-Aware Top-Level Clock Tree Optimization
Advertisements

1 COMP541 Flip-Flop Timing Montek Singh Oct 6, 2014.
Courtesy RK Brayton (UCB) and A Kuehlmann (Cadence) 1 Logic Synthesis Sequential Synthesis.
Timing Analysis - Delay Analysis Models
Logic Synthesis – 3 Optimization Ahmed Hemani Sources: Synopsys Documentation.
Modern VLSI Design 4e: Chapter 5 Copyright  2008 Wayne Wolf Topics n Performance analysis of sequential machines.
Introduction to CMOS VLSI Design Sequential Circuits
Penn ESE370 Fall DeHon 1 ESE370: Circuit-Level Modeling, Design, and Optimization for Digital Systems Day 24: November 4, 2011 Synchronous Circuits.
ECE 667 Synthesis & Verificatioin - FPGA Mapping 1 ECE 667 Synthesis and Verification of Digital Systems Technology Mapping for FPGAs D.Chen, J.Cong, DAOMap.
Ch.7 Layout Design Standard Cell Design TAIST ICTES Program VLSI Design Methodology Hiroaki Kunieda Tokyo Institute of Technology.
Clock Skewing EECS 290A Sequential Logic Synthesis and Verification.
Sequential Timing Optimization. Long path timing constraints Data must not reach destination FF too late s i + d(i,j) + T setup  s j + P s i s j d(i,j)
1 Lecture 28 Timing Analysis. 2 Overview °Circuits do not respond instantaneously to input changes °Predictable delay in transferring inputs to outputs.
Logic Gate Delay Modeling -III
Introduction to CMOS VLSI Design Lecture 19: Design for Skew David Harris Harvey Mudd College Spring 2004.
CSE241 Formal Verification.1Cichy, UCSD ©2003 CSE241A VLSI Digital Circuits Winter 2003 Recitation 6: Formal Verification.
Introduction to CMOS VLSI Design Clock Skew-tolerant circuits.
EE141 © Digital Integrated Circuits 2nd Timing Issues 1 Digital Integrated Circuits A Design Perspective Timing Issues Jan M. Rabaey Anantha Chandrakasan.
Synchronous Digital Design Methodology and Guidelines
CSE477 L19 Timing Issues; Datapaths.1Irwin&Vijay, PSU, 2002 CSE477 VLSI Digital Circuits Fall 2002 Lecture 19: Timing Issues; Introduction to Datapath.
Clock Design Adopted from David Harris of Harvey Mudd College.
RTL Hardware Design by P. Chu Chapter 161 Clock and Synchronization.
ECEN 248: INTRODUCTION TO DIGITAL SYSTEMS DESIGN Dr. Shi Dept. of Electrical and Computer Engineering.
Spring 2002EECS150 - Lec03-Timing Page 1 EECS150 - Digital Design Lecture 3 - Timing January 29, 2002 John Wawrzynek.
ENGIN112 L28: Timing Analysis November 7, 2003 ENGIN 112 Intro to Electrical and Computer Engineering Lecture 28 Timing Analysis.
Modern VLSI Design 2e: Chapter4 Copyright  1998 Prentice Hall PTR.
Sequential Logic 1  Combinational logic:  Compute a function all at one time  Fast/expensive  e.g. combinational multiplier  Sequential logic:  Compute.
Pipelining and Retiming 1 Pipelining  Adding registers along a path  split combinational logic into multiple cycles  increase clock rate  increase.
Penn ESE Fall DeHon 1 ESE (ESE534): Computer Organization Day 19: March 26, 2007 Retime 1: Transformations.
S. Reda EN160 SP’07 Design and Implementation of VLSI Systems (EN0160) Lecture 25: Sequential Circuit Design (3/3) Prof. Sherief Reda Division of Engineering,
Logic Simulation 2 Outline –Timing Models –Simulation Algorithms Goal –Understand timing models –Understand simulation algorithms Reading –Gate-Level Simulation.
CS294-6 Reconfigurable Computing Day 16 October 15, 1998 Retiming.
Processing Rate Optimization by Sequential System Floorplanning Jia Wang 1, Ping-Chih Wu 2, and Hai Zhou 1 1 Electrical Engineering & Computer Science.
Chapter #6: Sequential Logic Design 6.2 Timing Methodologies
ELEN 468 Lecture 221 ELEN 468 Advanced Logic Design Lecture 22 Timing Verification.
FPGA Technology Mapping. 2 Technology mapping:  Implements the optimized nodes of the Boolean network to the target device library.  For FPGA, library.
CS 151 Digital Systems Design Lecture 28 Timing Analysis.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 8: February 13, 2008 Retiming.
STATIC TIMING ANALYSIS
1 CSE370, Lecture 16 Lecture 19 u Logistics n HW5 is due today (full credit today, 20% off Monday 10:29am, Solutions up Monday 10:30am) n HW6 is due Wednesday.
ECE 260B – CSE 241A Timing Analysis & Correction 1http://vlsicad.ucsd.edu ECE260B – CSE241A Winter 2005 Timing Analysis and Correction Website:
03/30/031 ECE 551: Digital System Design & Synthesis Lecture Set 9 9.1: Constraints and Timing 9.2: Optimization (In separate file)
DELAY INSERTION METHOD IN CLOCK SKEW SCHEDULING BARIS TASKIN and IVAN S. KOURTEV ISPD 2005 High Performance Integrated Circuit Design Lab. Department of.
Modern VLSI Design 4e: Chapter 4 Copyright  2008 Wayne Wolf Topics n Interconnect design. n Crosstalk. n Power optimization.
Kazi ECE 6811 ECE 681 VLSI Design Automation Khurram Kazi* Lecture 10 Thanks to Automation press THE button outcomes the Chip !!! Reality or Myth (*Mostly.
Modern VLSI Design 3e: Chapter 4 Copyright  1998, 2002 Prentice Hall PTR Topics n Combinational network delay. n Logic optimization.
EEE2243 Digital System Design Chapter 7: Advanced Design Considerations by Muhazam Mustapha, extracted from Intel Training Slides, April 2012.
1 CSE370, Lecture 17 Lecture 17 u Logistics n Lab 7 this week n HW6 is due Friday n Office Hours íMine: Friday 10:00-11:00 as usual íSara: Thursday 2:30-3:20.
Modern VLSI Design 3e: Chapter 4 Copyright  1998, 2002 Prentice Hall PTR Topics n Interconnect design. n Crosstalk. n Power optimization.
CALTECH CS137 Winter DeHon CS137: Electronic Design Automation Day 7: February 3, 2002 Retiming.
ECE 551 Fall /6/2001 ECE Digital System Design & Synthesis Lecture 2 - Pragmatic Design Issues Overview  Classification of Issues oThree-State.
Timing Analysis Section Delay Time Def: Time required for output signal Y to change due to change in input signal X Up to now, we have assumed.
Physical Synthesis Buffer Insertion, Gate Sizing, Wire Sizing,
Sp09 CMPEN 411 L18 S.1 CMPEN 411 VLSI Digital Circuits Spring 2009 Lecture 16: Static Sequential Circuits [Adapted from Rabaey’s Digital Integrated Circuits,
Modern VLSI Design 4e: Chapter 4 Copyright  2008 Wayne Wolf Topics n Combinational network delay. n Logic optimization.
1 COMP541 Sequential Logic Timing Montek Singh Sep 30, 2015.
Static Timing Analysis
Clocking System Design
EE 466/586 VLSI Design Partha Pande School of EECS Washington State University
Penn ESE370 Fall DeHon 1 ESE370: Circuit-Level Modeling, Design, and Optimization for Digital Systems Day 24: November 5, 2012 Synchronous Circuits.
04/21/20031 ECE 551: Digital System Design & Synthesis Lecture Set : Functional & Timing Verification 10.2: Faults & Testing.
Fast Synthesis of Clock Gating from Existing Logic Aaron P. Hurst Univ. of California, Berkeley Portions In Collaboration with… Artur Quiring and Andreas.
Retiming EECS 290A Sequential Logic Synthesis and Verification.
Timing issues.
Introduction to Static Timing Analysis:
Timing Analysis 11/21/2018.
Topics Performance analysis..
Chapter 10 Timing Issues Rev /11/2003 Rev /28/2003
332:578 Deep Submicron VLSI Design Lecture 14 Design for Clock Skew
Timing Analysis and Optimization of Sequential Circuits
Presentation transcript:

CSE241 L4 System.1Kahng & Cichy, UCSD ©2003 CSE241A VLSI Digital Circuits Winter 2003 Lecture 04: Static Timing Analysis

CSE241 L4 System.2Kahng & Cichy, UCSD ©2003  All s and announcements are archived on the class web page  Remember: tomorrow (Friday, January 17) noon- 1:20pm (EBUI 3327) prepares you for the synthesis lecture on Tuesday (see syllabus on web for expected content)  HW questions 4-6 due today  Lab #1 report due Monday; turn-in instructions and script announced last night  This class: introduction to static timing analysis  Goals:  How does STA work?  What are use models for STA?  (What kinds of things are done with STA results?) Logistics

CSE241 L4 System.3Kahng & Cichy, UCSD ©2003  Determine fastest permissible clock speed (e.g. 100MHz) by determining delay (including set-up and hold time) of longest path from register to register (e.g. 10ns)  Largely eliminates need for gate-level simulation to verify delay of the circuit Goal of Static Timing Verification clk Combinational logic clk Combinational logic clk Combinational logic Courtesy K. Keutzer et al. UCB

CSE241 L4 System.4Kahng & Cichy, UCSD ©2003 Cycle Time - Critical Path Delay  Cycle time (T) cannot be smaller than longest path delay (T max )  Longest (critical) path delay is a function of:  Total gate, wire delays l logic levels clock Q1 Q2 T clock1 T clock2 critical path, ~5 logic levels T clock1 data cycle time Courtesy K. Keutzer et al. UCB

CSE241 L4 System.5Kahng & Cichy, UCSD ©2003 Cycle Time - Setup Time  For FFs to correctly capture data, must be stable for:  Setup time (T setup ) before clock arrives clock Q1 Q2 T clock1 T clock2 critical path, ~5 logic levels T clock1 data setup time Courtesy K. Keutzer et al. UCB

CSE241 L4 System.6Kahng & Cichy, UCSD ©2003 Cycle Time – Clock Skew clock Q1 Q2 T clock1 T clock2 T clock1 T clock2 Q2 data clock skew Q2 6  If clock network has unbalanced delay – clock skew  Cycle time is also a function of clock skew (T skew ) critical path, ~5 logic levels Courtesy K. Keutzer et al. UCB

CSE241 L4 System.7Kahng & Cichy, UCSD ©2003 Cycle Time – Flip-Flop Delay (Clock to Q)  Cycle time is also a function of propagation delay of FF (T clk-to-Q )  T clk-to-Q : time from arrival of clock signal till change at FF output) clock Q1 Q2 T clock1 T clock2 T clock1 T clock2 Q2 clock-to-Q data Q2 critical path, ~5 logic levels Courtesy K. Keutzer et al. UCB

CSE241 L4 System.8Kahng & Cichy, UCSD ©2003 Min Path Delay - Hold Time  For FFs to correctly latch data, data must be stable during:  Hold time (T hold ) after clock arrives  Determined by delay of shortest path in circuit (T min ) and clock skew (T skew ) clock Q1 Q2 T clock1 T clock2 short path, ~3 logic levels T clock1 data hold time Courtesy K. Keutzer et al. UCB

CSE241 L4 System.9Kahng & Cichy, UCSD ©2003 Setup, Hold, Cycle Times set-up time – D stable before clock cycle time Example of a single phase clock hold time – D stable after clock When signal may change Courtesy K. Keutzer et al. UCB

CSE241 L4 System.10Kahng & Cichy, UCSD ©2003 Approach – Reduce to Combinational clk Combinational logic clk Combinational logic clk Combinational logic original circuit extracted block Combinational logic Courtesy K. Keutzer et al. UCB

CSE241 L4 System.11Kahng & Cichy, UCSD ©2003 Combinational Block  Arrival time in green A C B f X Y Z W  Interconnect delay in red  Gate delay in blue  What’s the right mathematical object to use to represent this physical object? Courtesy K. Keutzer et al. UCB

CSE241 L4 System.12Kahng & Cichy, UCSD ©2003 Problem Formulation - 1 C B f X Y W A A C B f X Y Z W Z  Use a labeled directed graph  G =  Vertices represent gates, primary inputs and primary outputs  Edges represent wires  Labels represent delays  HW #7: In this graph model, both nodes and edges have delays. How would you modify this model so that only edges have delay, yet the same timing analysis remains possible? Courtesy K. Keutzer et al. UCB

CSE241 L4 System.13Kahng & Cichy, UCSD ©2003 Problem Formulation – Actual Arrival Time  Actual arrival time A(v) for a node v is latest time that signal can arrive at node v X Y A(Z) Z A(X) A(Y) Courtesy K. Keutzer et al. UCB

CSE241 L4 System.14Kahng & Cichy, UCSD ©2003 Problem Formulation - 2 C B f X Y W A A C B f X Y Z W Z  Use a labeled directed graph  G =  Enumerate all paths - choose the longest? Courtesy K. Keutzer et al. UCB

CSE241 L4 System.15Kahng & Cichy, UCSD ©2003 Problem Formulation - 3 C B f X Y W A Z Compute longest path in a graph G = // delay is set of labels, Origin is the super-source of the DAG Forward-prop(W){ for each vertex v in W for each edge from v Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay( )) if all incoming edges of w have been traversed, add w to W } Longest path(G){ Forward_prop(Origin) } Origin (Kirkpatrick 1966, IBM JRD) Courtesy K. Keutzer et al. UCB

CSE241 L4 System.16Kahng & Cichy, UCSD ©2003 Algorithm Execution C B f X Y W A Z Origin 0 Compute longest path in a graph G = // delay is set of labels, Origin is the super-source of the DAG Forward-prop(W){ for each vertex v in W for each edge from v Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay( )) if all incoming edges of w have been traversed, add w to W } Longest path(G){ Forward_prop(Origin) } Courtesy K. Keutzer et al. UCB

CSE241 L4 System.17Kahng & Cichy, UCSD ©2003 Algorithm Execution C B f X Y W A Z Origin 0 0 Compute longest path in a graph G = // delay is set of labels, Origin is the super-source of the DAG Forward-prop(W){ for each vertex v in W for each edge from v Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay( )) if all incoming edges of w have been traversed, add w to W } Longest path(G){ Forward_prop(Origin) } Courtesy K. Keutzer et al. UCB

CSE241 L4 System.18Kahng & Cichy, UCSD ©2003 Algorithm Execution C B f X Y W A Z Origin Compute longest path in a graph G = // delay is set of labels, Origin is the super-source of the DAG Forward-prop(W){ for each vertex v in W for each edge from v Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay( )) if all incoming edges of w have been traversed, add w to W } Longest path(G){ Forward_prop(Origin) } Courtesy K. Keutzer et al. UCB

CSE241 L4 System.19Kahng & Cichy, UCSD ©2003 Algorithm Execution C B f X Y W A Z Origin Compute longest path in a graph G = // delay is set of labels, Origin is the super-source of the DAG Forward-prop(W){ for each vertex v in W for each edge from v Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay( )) if all incoming edges of w have been traversed, add w to W } Longest path(G){ Forward_prop(Origin) } Courtesy K. Keutzer et al. UCB

CSE241 L4 System.20Kahng & Cichy, UCSD ©2003 Algorithm Execution C B f X Y W A Z Origin 1.1 Compute longest path in a graph G = // delay is set of labels, Origin is the super-source of the DAG Forward-prop(W){ for each vertex v in W for each edge from v Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay( )) if all incoming edges of w have been traversed, add w to W } Longest path(G){ Forward_prop(Origin) } Courtesy K. Keutzer et al. UCB

CSE241 L4 System.21Kahng & Cichy, UCSD ©2003 Algorithm Execution C B f X Y W A Z Origin Compute longest path in a graph G = // delay is set of labels, Origin is the super-source of the DAG Forward-prop(W){ for each vertex v in W for each edge from v Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay( )) if all incoming edges of w have been traversed, add w to W } Longest path(G){ Forward_prop(Origin) } Courtesy K. Keutzer et al. UCB

CSE241 L4 System.22Kahng & Cichy, UCSD ©2003 Algorithm Execution C B f X Y W A Z Origin Compute longest path in a graph G = // delay is set of labels, Origin is the super-source of the DAG Forward-prop(W){ for each vertex v in W for each edge from v Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay( )) if all incoming edges of w have been traversed, add w to W } Longest path(G){ Forward_prop(Origin) } Courtesy K. Keutzer et al. UCB

CSE241 L4 System.23Kahng & Cichy, UCSD ©2003 Algorithm Execution C B f X Y W A Z Origin Compute longest path in a graph G = // delay is set of labels, Origin is the super-source of the DAG Forward-prop(W){ for each vertex v in W for each edge from v Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay( )) if all incoming edges of w have been traversed, add w to W } Longest path(G){ Forward_prop(Origin) } Courtesy K. Keutzer et al. UCB

CSE241 L4 System.24Kahng & Cichy, UCSD ©2003 Algorithm Execution C B f X Y W A Z Origin Compute longest path in a graph G = // delay is set of labels, Origin is the super-source of the DAG Forward-prop(W){ for each vertex v in W for each edge from v Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay( )) if all incoming edges of w have been traversed, add w to W } Longest path(G){ Forward_prop(Origin) } Courtesy K. Keutzer et al. UCB

CSE241 L4 System.25Kahng & Cichy, UCSD ©2003 Algorithm Execution C B f X Y W A Z Origin Compute longest path in a graph G = // delay is set of labels, Origin is the super-source of the DAG Forward-prop(W){ for each vertex v in W for each edge from v Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay( )) if all incoming edges of w have been traversed, add w to W } Longest path(G){ Forward_prop(Origin) } Courtesy K. Keutzer et al. UCB

CSE241 L4 System.26Kahng & Cichy, UCSD ©2003 Algorithm Execution C B f X Y W A Z Origin Compute longest path in a graph G = // delay is set of labels, Origin is the super-source of the DAG Forward-prop(W){ for each vertex v in W for each edge from v Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay( )) if all incoming edges of w have been traversed, add w to W } Longest path(G){ Forward_prop(Origin) } Courtesy K. Keutzer et al. UCB

CSE241 L4 System.27Kahng & Cichy, UCSD ©2003 Critical Path (sub-graph) C B f X Y W A Z Origin Compute longest path in a graph G = // delay is set of labels, Origin is the super-source of the DAG Forward-prop(W){ for each vertex v in W for each edge from v Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay( )) if all incoming edges of w have been traversed, add w to W } Longest path(G){ Forward_prop(Origin) } Courtesy K. Keutzer et al. UCB

CSE241 L4 System.28Kahng & Cichy, UCSD ©2003 Timing for Optimization: Extra Requirements  Longest-path algorithm computes arrival times at each node  If we have constraints, need to propagate slack to each node l A measure of how much timing margin exists at each node l Can optimize a particular branch  Can trade slack for power, area, robustness clock Courtesy K. Keutzer et al. UCB

CSE241 L4 System.29Kahng & Cichy, UCSD ©2003 Required Arrival Time  Required arrival time R(v) is the time before which a signal must arrive to avoid a timing violation  Then recursively X Y R(Z) Z R(X) R(Y) Courtesy K. Keutzer et al. UCB

CSE241 L4 System.30Kahng & Cichy, UCSD ©2003 Required Time Propagation: Example C B f X Y W A Z  Assume required time at output R(f) = 5.80  Propagate required times backwards 0 O Origin Courtesy K. Keutzer et al. UCB

CSE241 L4 System.31Kahng & Cichy, UCSD ©2003 Timing Slack  From arrival and required time can compute slack. For each node v:  Slack reflects criticality of a node  Positive slack l Node is not on critical path. Timing constraints met.  Zero slack l Node is on critical path. Timing constraints are barely met.  Negative slack l There is a timing violation  Slack distribution is key for timing optimization! Courtesy K. Keutzer et al. UCB

CSE241 L4 System.32Kahng & Cichy, UCSD ©2003 Timing Slack Computation: Example  Compute slack at each node C B f X Y W A Z Origin Courtesy K. Keutzer et al. UCB

CSE241 L4 System.33Kahng & Cichy, UCSD ©2003 clk Combinational logic clk Combinational logic clk Combinational logic  Computing longest path delay Full path enumeration - potentially exponential Longest path algorithm on DAG (Kirkpatrick 1966, IBM JRD) (O(v+e) or O(g + p))  Currently applied to even the largest (>10M gate) circuits  Two challenges  Asynchronous subcircuits  False paths Approach of Static Timing Verification Courtesy K. Keutzer et al. UCB

CSE241 L4 System.34Kahng & Cichy, UCSD ©2003 Enhancements of STA  Incremental timing analysis  Nanometer-scale process effects – variation (  probabilistic timing analysis)  Interference – crosstalk  Multiple inputs switching  Conservatism of delay propagation  HW #8: Suppose you change the size of one (combinational) gate in your design, thus invalidating the previous timing analysis. How much work must be done to regain a correct timing analysis? Courtesy K. Keutzer et al. UCB

CSE241 L4 System.35Kahng & Cichy, UCSD ©2003 Timing Correction  Driven by STA l “Incremental performance analysis backplane”  Fix electrical violations l Resize cells l Buffer nets l Copy (clone) cells  Fix timing problems l Local transforms (bag of tricks) l Path-based transforms DAC-2002, Physical Chip Implementation

CSE241 L4 System.36Kahng & Cichy, UCSD ©2003 Local Synthesis Transforms  Resize cells  Buffer or clone to reduce load on critical nets  Decompose large cells  Swap connections on commutative pins or among equivalent nets  Move critical signals forward  Pad early paths  Area recovery DAC-2002, Physical Chip Implementation

CSE241 L4 System.37Kahng & Cichy, UCSD ©2003 Transform Example Delay = 4 ….. Double Inverter Removal ….. Delay = 2 DAC-2002, Physical Chip Implementation

CSE241 L4 System.38Kahng & Cichy, UCSD ©2003 Resizing b a d e f ? b a A b a C DAC-2002, Physical Chip Implementation

CSE241 L4 System.39Kahng & Cichy, UCSD ©2003 Cloning b a d e f g h 0.2 ? b a d e f g h A B DAC-2002, Physical Chip Implementation

CSE241 L4 System.40Kahng & Cichy, UCSD ©2003 Buffering b a d e f g h 0.2 ? b a d e f g h B B DAC-2002, Physical Chip Implementation

CSE241 L4 System.41Kahng & Cichy, UCSD ©2003 Redesign Fan-in Tree a c d b e Arr(b)=3 Arr(c)=1 Arr(d)=0 Arr(a)=4 Arr(e)= c d e Arr(e)=5 1 1 b 1 a DAC-2002, Physical Chip Implementation

CSE241 L4 System.42Kahng & Cichy, UCSD ©2003 Redesign Fan-out Tree Longest Path = Longest Path = 4 Slowdown of buffer due to load DAC-2002, Physical Chip Implementation

CSE241 L4 System.43Kahng & Cichy, UCSD ©2003 Decomposition DAC-2002, Physical Chip Implementation

CSE241 L4 System.44Kahng & Cichy, UCSD ©2003 Swap Commutative Pins 2 c a b a c b Simple sorting on arrival times and delay works DAC-2002, Physical Chip Implementation