EE141 Chapter 10 Timing Issues Rev. 1.0 05/11/2003, Rev. 1.1 05/28/2003, Rev. 1.2 06/05/2003
Synchronous Pipelined Datapath
[Figure: In passes through registers R (D to Q, clocked by CLK, delay t_pd,reg) alternating with Logic Blocks #1 to #3 to reach Out.]
Self-Timed Logic (Asynchronous Datapath) Check Textbook (Sec. 10.4) for details!
Latch Parameters (Positive Latch)
tc-q: clock-to-Q delay
td-q: data-to-Q delay
tsu: data setup time
thold: data hold time
PW: pulse width
[Figure: timing diagram of Clk (period T, pulse width PW), D, and Q, marking tsu, thold, tc-q, and td-q.]
Delays can be different for rising and falling data transitions.
Register Parameters (Positive Edge-Triggered Register)
tc-q: clock-to-Q delay
tsu: data setup time
thold: data hold time
[Figure: timing diagram of Clk (period T), D, and Q, marking tsu, thold, and tc-q.]
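Sources of Clock Uncertainties
[Figure: numbered sources of clock uncertainty along the clock path; the labels that survive are (2) device variations and (5) temperature.]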
Clock Nonidealities
Clock skew (tSK): spatial variation in temporally equivalent clock edges; deterministic + random components.
Clock jitter (tJitter): temporal variation in consecutive edges of the clock signal; modulation + random noise. It is characterized as cycle-to-cycle (short-term) jitter or long-term jitter.
Variation of the pulse width: important for level-sensitive (latch) clocking (not discussed).
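As a rough illustration of the jitter definition above, the sketch below takes jitter as the deviation of each measured clock period from the nominal period. That is one common convention and is an assumption here, as are the function name `period_jitter` and the edge timestamps, which are made up for the example.

```python
def period_jitter(edge_times_ps, nominal_period_ps):
    """Deviation of each measured clock period from the nominal period, in ps."""
    # Consecutive rising edges give the measured periods.
    periods = [later - earlier
               for earlier, later in zip(edge_times_ps, edge_times_ps[1:])]
    return [period - nominal_period_ps for period in periods]


# Hypothetical rising-edge timestamps (ps) for a nominal 1000 ps clock.
edges = [0, 1000, 2010, 2995, 4000]
deviations = period_jitter(edges, 1000)
peak_jitter = max(abs(d) for d in deviations)
print(deviations, peak_jitter)
```

Skew, by contrast, would compare the same edge at different locations on the die rather than successive edges at one location.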
Clock Skew and Jitter
[Figure: two Clk waveforms marking the skew tSK and the jitter tJS.]
Both skew and jitter affect the effective cycle time.
Clock Skew (Distribution)
[Figure: histogram of the number of registers versus Clk delay. The earliest occurrence of the Clk edge is nominal - δ/2 and the latest is nominal + δ/2, where δ is the maximum Clk skew; the insertion delay is also marked.]
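The quantities on this slide can be sketched numerically. The example below assumes a list of clock arrival times at the registers (hypothetical values) and assumes that the nominal arrival sits midway between the earliest and latest edges, as the figure labels suggest; `skew_summary` is an illustrative name only.

```python
def skew_summary(arrival_times_ps):
    """Summarize a clock-arrival distribution (all times in ps)."""
    earliest = min(arrival_times_ps)
    latest = max(arrival_times_ps)
    max_skew = latest - earliest
    # Assumption: the nominal arrival is the midpoint of the spread,
    # so earliest = nominal - skew/2 and latest = nominal + skew/2.
    nominal = (earliest + latest) / 2
    return {"earliest": earliest, "latest": latest,
            "max_skew": max_skew, "nominal": nominal}


# Hypothetical clock arrival times at five registers.
arrivals = [480, 495, 500, 505, 520]
summary = skew_summary(arrivals)
print(summary)
```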
Positive and Negative Skew
Positive Skew: the launching edge arrives before the receiving edge.
Negative Skew: the receiving edge arrives before the launching edge.
Datapath Structure with Feedback
Positive Skew: the launching edge arrives before the receiving edge.
Timing Constraints
Minimum cycle time (fastest clock rate), with clock skew $\delta$:
$T + \delta \ge t_{c-q} + t_{logic} + t_{su}$, i.e. $T \ge t_{c-q} + (t_{logic} - \delta) + t_{su}$   Eq. (10.3)
Positive skew ($\delta > 0$) has the potential to improve performance.
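A minimal numeric sketch of Eq. (10.3), reading it as T >= tc-q + tlogic + tsu - skew. The function name and the delay values (in ps) are hypothetical and chosen only to show the direction of the effect.

```python
def min_cycle_time(t_cq, t_logic, t_su, skew=0.0):
    """Smallest clock period allowed by Eq. (10.3); all arguments in ps."""
    return t_cq + t_logic + t_su - skew


# Hypothetical delays: 100 ps clock-to-Q, 800 ps logic, 50 ps setup.
no_skew = min_cycle_time(100, 800, 50)
positive_skew = min_cycle_time(100, 800, 50, skew=50)
negative_skew = min_cycle_time(100, 800, 50, skew=-50)
print(no_skew, positive_skew, negative_skew)
```

With these numbers, a positive skew shortens the minimum period and a negative skew lengthens it by the same amount.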
Timing Constraints
Hold time constraint, with clock skew $\delta$: $t_{c-q,cd} + t_{logic,cd} > t_{hold} + \delta$
Race between data and clock: the skew should be kept small so that this margin is preserved.
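The hold constraint can be checked the same way. The sketch below computes the margin (tc-q,cd + tlogic,cd) - (thold + skew), which must stay positive; the function name and the contamination-delay values (in ps) are hypothetical.

```python
def hold_margin(t_cq_cd, t_logic_cd, t_hold, skew=0.0):
    """Hold margin in ps; the constraint is met when the result is > 0."""
    return (t_cq_cd + t_logic_cd) - (t_hold + skew)


# Hypothetical values: 60 ps and 40 ps contamination delays, 30 ps hold time.
margins = {skew: hold_margin(60, 40, 30, skew) for skew in (0, 50, 80)}
for skew, margin in margins.items():
    status = "ok" if margin > 0 else "hold violation"
    print(f"skew = {skew} ps: margin = {margin} ps ({status})")
```

Unlike the cycle-time bound, this check does not depend on T, so a hold violation cannot be repaired by slowing the clock.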
Impact of Clock Jitter
[Figure: CLK waveform with period T and jitter t_jitter marked on its edges; In feeds REGS and Combinational Logic, annotated with tc-q, tc-q,cd, tlogic, tlogic,cd, tsu, thold, and tjitter.]
$T - 2 t_{jitter} \ge t_{c-q} + t_{logic} + t_{su}$   Eq. (10.5)
Combined Impact of Skew and Jitter
With clock skew $\delta$:
$T + \delta - 2 t_{jitter} \ge t_{c-q} + t_{logic} + t_{su}$   Eq. (10.6)
$t_{hold} + \delta < t_{c-q,cd} + t_{logic,cd} - 2 t_{jitter}$   Eq. (10.7)
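One way to read Eqs. (10.6) and (10.7) together is as a window of acceptable skew for a given clock period: Eq. (10.6) bounds the skew from below and Eq. (10.7) from above. The sketch below rearranges both for the skew term; the function name and all delay values (in ps) are hypothetical.

```python
def skew_window(T, t_jitter, t_cq, t_logic, t_su, t_cq_cd, t_logic_cd, t_hold):
    """Return (lower, upper) such that lower <= skew < upper meets both constraints."""
    # Eq. (10.6) rearranged: skew >= t_cq + t_logic + t_su + 2*t_jitter - T
    lower = t_cq + t_logic + t_su + 2 * t_jitter - T
    # Eq. (10.7) rearranged: skew < t_cq_cd + t_logic_cd - 2*t_jitter - t_hold
    upper = t_cq_cd + t_logic_cd - 2 * t_jitter - t_hold
    return lower, upper


# Hypothetical delays (ps) evaluated without and with 10 ps of jitter.
delays = dict(t_cq=100, t_logic=800, t_su=50, t_cq_cd=60, t_logic_cd=40, t_hold=30)
ideal = skew_window(T=1000, t_jitter=0, **delays)
jittery = skew_window(T=1000, t_jitter=10, **delays)
print(ideal, jittery)
```

Under these assumptions the window narrows by four times the jitter, since each bound moves inward by 2 tjitter.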
Latch-based Design
L1, L2: negative latches
(3): latch CLB_A (Combinational Block A)
(4): latch CLB_B (Combinational Block B)
Edge-triggered Pipeline Design
Slack-borrowing (Sec. 10.3.4, not covered)
Latch-Based Design for State Machine
L1 latch is transparent when φ = 0
L2 latch is transparent when φ = 1
[Figure: clock φ driving latches L1 and L2, with a Logic block after each latch.]
Clocking Schemes in VLSI Circuits
Clock Distribution: H-tree
[Figure: H-tree clock-distribution network for 16 leaf nodes.]
The clock is distributed in a tree-like fashion.
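The appeal of the H-tree is that every leaf sees the same wire length from the root. The sketch below builds an idealized H-tree recursively and checks this for 16 leaves. The geometry, the use of Manhattan wire length as a stand-in for RC delay, and the name `h_tree_leaves` are assumptions made for illustration.

```python
def h_tree_leaves(cx, cy, half, levels, path=0.0):
    """Return (x, y, wire_length_from_root) for each leaf of an ideal H-tree."""
    if levels == 0:
        return [(cx, cy, path)]
    leaves = []
    for dx in (-half, half):
        for dy in (-half, half):
            # Each branch runs |dx| along the bar of the H, then |dy| along a leg.
            leaves += h_tree_leaves(cx + dx, cy + dy, half / 2, levels - 1,
                                    path + abs(dx) + abs(dy))
    return leaves


# Two levels of H give 4 * 4 = 16 leaf nodes.
leaves = h_tree_leaves(0.0, 0.0, 4.0, levels=2)
lengths = {length for _, _, length in leaves}
print(len(leaves), lengths)
```

Every leaf ends up at the same distance from the root, which is why an ideal H-tree has zero skew; real layouts deviate because of loading and routing obstacles.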
More realistic H-tree
Grid Structure
No RC matching
Allows low-skew clock distribution and physical design flexibility, at the cost of larger power dissipation
Grids are typically used in the final stage of a clock network to distribute the clock to the clocking element loads
Example: DEC Alpha 21164
21164 Clocking
tcycle = 3.3 ns, trise = 0.35 ns, tskew = 150 ps
2-phase single-wire clock, distributed globally
2 distributed driver channels: reduced RC delay/skew, improved thermal distribution
3.75 nF clock load
58 cm final driver width
Local inverters for latching
Conditional clocks in caches to reduce power
More complex race checking
Device variation
[Figure: clock waveform; location of the clock driver on the die, showing the pre-driver and final drivers.]
Clock Skew in Alpha Processor
EV6 (Alpha 21264) Clocking
600 MHz, 0.35 micron CMOS
tcycle = 1.67 ns, trise = 0.35 ns, tskew = 50 ps
2-phase, with multiple conditional buffered clocks
2.8 nF clock load
40 cm final driver width
Local clocks can be gated “off” to save power
Reduced load/skew
Reduced thermal issues
Multiple clocks complicate race checking
[Figure: global clock waveform.]
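To put the quoted skew figures in perspective, the sketch below converts the cycle times from the 21164 and EV6 slides into clock frequencies and expresses each skew as a fraction of the cycle. Only the tcycle and tskew values come from the slides; the function and dictionary names are illustrative.

```python
# Cycle time and skew as quoted on the 21164 and EV6 slides.
chips = {
    "Alpha 21164": {"t_cycle_ns": 3.3, "t_skew_ps": 150},
    "EV6 (Alpha 21264)": {"t_cycle_ns": 1.67, "t_skew_ps": 50},
}


def clock_figures(t_cycle_ns, t_skew_ps):
    """Return (frequency in MHz, skew as a fraction of the cycle time)."""
    freq_mhz = 1e3 / t_cycle_ns
    skew_fraction = (t_skew_ps / 1e3) / t_cycle_ns
    return freq_mhz, skew_fraction


results = {name: clock_figures(**params) for name, params in chips.items()}
for name, (freq_mhz, skew_fraction) in results.items():
    print(f"{name}: {freq_mhz:.0f} MHz, skew = {skew_fraction:.1%} of the cycle")
```

This works out to roughly 303 MHz with skew near 4.5% of the cycle for the 21164, and roughly 599 MHz with skew near 3.0% for the EV6, so the skew budget shrank faster than the cycle time did.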
21264 Clocking
EV6 Clock Results
[Figure: two die maps. GCLK skew (at Vdd/2 crossings), scale 5 to 50 ps. GCLK rise times (20% to 80%, extrapolated to 0% to 100%), scale 300 to 345 ps.]
EV7 Clock Hierarchy
Active Skew Management and Multiple Clock Domains
PLL regenerates the on-chip clocks
DLLs compensate static and low-frequency variation
Divides design and verification effort
DLL/PLL design and verification is added work
Summary
Clocking schemes are very important in synchronous circuit design: they largely determine speed and power consumption.
Clock jitter and skew should be considered early in the design phase.
Register-based designs are more common than latch-based designs in modern VLSI.
Phase-locked loop (PLL) and delay-locked loop (DLL) circuits are used to reduce clock jitter and skew.
Good clock-distribution CAD tools are useful for analyzing clock performance in modern chips.