EE141 © Digital Integrated Circuits 2nd Timing Issues 1 Digital Integrated Circuits A Design Perspective Timing Issues Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolić January 2003
EE141 © Digital Integrated Circuits 2nd Timing Issues 2 Synchronous Timing
EE141 © Digital Integrated Circuits 2nd Timing Issues 3 Timing Definitions
EE141 © Digital Integrated Circuits 2nd Timing Issues 4 Latch Parameters [Figure: latch symbol (D, Clk, Q) and timing diagram marking the clock period $T$, the pulse width $PW_m$, $t_{su}$, $t_{hold}$, $t_{c-q}$ and $t_{d-q}$] Delays can be different for rising and falling data transitions. The register's maximum propagation delay is characterized by $t_{c-q}$ (clock to output) and $t_{d-q}$ (data to output). Data D must be stable while the latch is transparent in order to be properly registered (no unintended changes). An intended change must arrive at least $t_{su}$ before the latch closes.
EE141 © Digital Integrated Circuits 2nd Timing Issues 5 Register Parameters [Figure: register symbol (D, Clk, Q) and timing diagram marking $T$, $t_{su}$, $t_{hold}$ and $t_{c-q}$] Delays can be different for rising and falling data transitions. Data must be stable before the rising edge of the clock and held sufficiently long after it to be processed by the register.
EE141 © Digital Integrated Circuits 2nd Timing Issues 6 Clock Uncertainties Sources of clock uncertainty
EE141 © Digital Integrated Circuits 2nd Timing Issues 7 Clock Nonidealities Clock skew (constant delay): spatial variation in temporally equivalent clock edges; deterministic + random; $t_{SK}$. Clock jitter (random variations): temporal variations in consecutive edges of the clock signal; modulation + random noise; cycle-to-cycle (short-term) $t_{JS}$ and long-term $t_{JL}$. Variation of the pulse width: important for level-sensitive clocking.
EE141 © Digital Integrated Circuits 2nd Timing Issues 8 Clock Skew and Jitter [Figure: Clk waveforms annotated with $t_{SK}$ and $t_{JS}$] Both skew and jitter affect the effective cycle time. Only skew affects the race margin.
EE141 © Digital Integrated Circuits 2nd Timing Issues 9 Clock Skew [Figure: histogram of the number of registers versus Clk delay, annotated with the insertion delay and the maximum Clk skew $\delta$] Earliest occurrence of the Clk edge: nominal $- \delta/2$. Latest occurrence of the Clk edge: nominal $+ \delta/2$.
EE141 © Digital Integrated Circuits 2nd Timing Issues 10 Positive and Negative Skew
EE141 © Digital Integrated Circuits 2nd Timing Issues 11 Positive Skew Launching register clock edge arrives before the receiving register clock edge
EE141 © Digital Integrated Circuits 2nd Timing Issues 12 Negative Skew Receiving register clock edge arrives before the launching register clock edge
EE141 © Digital Integrated Circuits 2nd Timing Issues 13 Timing Constraints Minimum clock cycle time: $T - \delta = t_{c-q} + t_{su} + t_{logic}$, where $\delta$ is the amount by which the receiving edge arrives early. Worst case is when the receiving edge arrives early (negative skew); thus a negative clock skew reduces the clock frequency. In the hold-time constraint, the subscript cd stands for contamination (minimum) delay, in both the register propagation time and the combinational logic delay.
EE141 © Digital Integrated Circuits 2nd Timing Issues 14 Timing Constraints Hold time constraint: $t_{(c-q,cd)} + t_{(logic,cd)} > t_{hold} + \delta$. Worst case is when the receiving edge arrives late (positive skew). A race between data and clock is more likely for a positive clock skew.
EE141 © Digital Integrated Circuits 2nd Timing Issues 15 Impact of Jitter Since jitter is a random delay it increases the minimum clock period and increases likelihood for race between clock and data
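EE141 © Digital Integrated Circuits 2nd Timing Issues 16 Longest Logic Path in Edge-Triggered Systems [Figure: Clk waveform over one period $T$ showing the latest point of launching, followed by $T_{Clk-Q}$, $T_{LM}$ and $T_{SU}$, and the earliest arrival of the next cycle, pulled in by $T_{JI} + \delta$] $T_{LM}$ is the maximum logic delay.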
EE141 © Digital Integrated Circuits 2nd Timing Issues 17 Clock Constraints in Edge-Triggered Systems If the launching edge is late and the receiving edge is early, the data will not be too late if: $T_{c-q} + T_{LM} + T_{SU} < T - T_{JI,1} - T_{JI,2} - \delta$. The minimum cycle time is determined by the maximum delays through the logic: $T_{c-q} + T_{LM} + T_{SU} + \delta + 2T_{JI} < T$. Skew can be either positive or negative.
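As a rough numerical illustration of the cycle-time constraint on this slide, the Python sketch below evaluates the lower bound on $T$. The function name and the delay values (in picoseconds) are hypothetical and chosen only for illustration; `skew` stands for $\delta$ and `jitter` for $T_{JI}$ on each of the two edges.

```python
def min_cycle_time(t_cq, t_logic_max, t_su, skew=0, jitter=0):
    """Lower bound on the clock period T for the longest path.

    Implements T > T_c-q + T_LM + T_SU + delta + 2*T_JI, where skew is
    the amount by which the receiving edge is early and jitter is the
    worst-case jitter on each of the two clock edges.
    """
    return t_cq + t_logic_max + t_su + skew + 2 * jitter


# Hypothetical delays in picoseconds (not taken from the slides).
ideal = min_cycle_time(t_cq=100, t_logic_max=700, t_su=50)
real = min_cycle_time(t_cq=100, t_logic_max=700, t_su=50, skew=50, jitter=25)
print(ideal, real)  # 850 950
```

With these made-up numbers the register and logic need 850 ps, and skew plus jitter cost another 100 ps of cycle time.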
EE141 © Digital Integrated Circuits 2nd Timing Issues 18 Shortest Path [Figure: Clk waveforms showing the earliest point of launching, followed by $T_{Clk-Q}$ and $T_{Lm}$, and the hold window $T_H$ after the nominal clock edge, before the end of which data must not arrive] The shortest path affects feedback connections, which typically have a negative clock skew. $T_{Lm}$ is the minimum logic delay.
EE141 © Digital Integrated Circuits 2nd Timing Issues 19 Clock Constraints in Edge-Triggered Systems Minimum logic delay. If the launching edge is early and the receiving edge is late: $T_{c-q} + T_{Lm} - T_{JI,1} > T_H + T_{JI,2} + \delta$, that is, $T_{c-q} + T_{Lm} > T_H + 2T_{JI} + \delta$. For clock skew only we had: $t_{(c-q,cd)} + t_{(logic,cd)} > t_{hold} + \delta$.
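The hold (race) condition on this slide can be checked the same way. The sketch below is only an illustration: the function name and the picosecond values are hypothetical, `skew` stands for $\delta$ (receiving edge late) and `jitter` for $T_{JI}$ per edge. A positive margin means the constraint holds.

```python
def hold_margin(t_cq_min, t_logic_min, t_hold, skew=0, jitter=0):
    """Slack of the shortest-path constraint.

    Implements T_c-q + T_Lm > T_H + 2*T_JI + delta as a margin:
    (T_c-q + T_Lm) - (T_H + delta + 2*T_JI). Negative means a race.
    """
    return t_cq_min + t_logic_min - (t_hold + skew + 2 * jitter)


# Hypothetical contamination delays in picoseconds.
print(hold_margin(t_cq_min=60, t_logic_min=40, t_hold=30))  # 70
print(hold_margin(t_cq_min=60, t_logic_min=40, t_hold=30,
                  skew=50, jitter=25))  # -30
```

Unlike the cycle-time constraint, this margin does not depend on $T$, so a violation cannot be fixed by slowing the clock.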
EE141 © Digital Integrated Circuits 2nd Timing Issues 20 How to counter Clock Skew?
EE141 © Digital Integrated Circuits 2nd Timing Issues 21 Flip-Flop – Based Timing [Figure: flip-flop followed by logic, with one clock cycle divided into skew, flip-flop delay ($T_{Clk-Q}$), logic delay and $T_{SU}$; representation after M. Horowitz, VLSI Circuits] Logic propagation must finish before the next clock's rising edge.
EE141 © Digital Integrated Circuits 2nd Timing Issues 22 Flip-Flops and Dynamic Logic [Figure: two timing diagrams, each showing $T_{Clk-Q}$, logic delay and $T_{SU}$, with the clock phases labeled precharge, evaluate, precharge] Flip-flops are used only with static logic. In dynamic logic gates, logic propagation must finish before the clock's falling edge. The dual relation holds for the PUN controlled by inverted clocks.
EE141 © Digital Integrated Circuits 2nd Timing Issues 23 Latch timing [Figure: latch (D, Clk, Q) with waveforms marking $t_{D-Q}$ and $t_{Clk-Q}$] When data arrives at a transparent latch, the latch is a 'soft' barrier. When data arrives at a closed latch, the data has to be 're-launched'.
EE141 © Digital Integrated Circuits 2nd Timing Issues 24 Single-Phase Clock with Latches [Figure: latch followed by logic, with a Clk waveform of period P and pulse width PW annotated with $T_{skl}$ and $T_{skt}$]
EE141 © Digital Integrated Circuits 2nd Timing Issues 25 Latch-Based Design [Figure: L1 latch, logic, L2 latch] The L1 latch is transparent when the clock is 0. The L2 latch is transparent when the clock is 1.
EE141 © Digital Integrated Circuits 2nd Timing Issues 26 Slack-borrowing [Figure: latches L1 (CLK1), L2 (CLK2) and L1 (CLK1) separated by combinational logic blocks CLB_A and CLB_B with delays $t_{pd,A}$ and $t_{pd,B}$; timing diagram over one period $T_{CLK}$ showing nodes a to e becoming valid in turn after $t_{pd,A}$, $t_{DQ}$, $t_{pd,B}$ and $t_{DQ}$] Slack is passed to the next stage, shortening the clock period requirement.
EE141 © Digital Integrated Circuits 2nd Timing Issues 27 Clock Distribution Clock is distributed in a tree-like fashion H-tree balances the clock skew
EE141 © Digital Integrated Circuits 2nd Timing Issues 28 More realistic H-tree [Restle98]
EE141 © Digital Integrated Circuits 2nd Timing Issues 29 The Grid Clock Distribution Does not require rc-matching Large power dissipation Easier to satisfy metal density requirement in fabrication Good thermal distribution
EE141 © Digital Integrated Circuits 2nd Timing Issues 30 Example: DEC Alpha Clock Frequency: 300 MHz Million Transistors Total Clock Load: 3.75 nF Power in Clock Distribution network : 20 W (out of 50) Uses Two Level Clock Distribution: Single 6-stage driver at center of chip Secondary buffers drive left and right side clock grid in Metal3 and Metal4 Total driver size: 58 cm!
EE141 © Digital Integrated Circuits 2nd Timing Issues Clocking 2-phase single-wire clock, distributed globally. 2 distributed driver channels: reduced RC delay/skew, improved thermal distribution. 3.75 nF clock load; 58 cm final driver width. Local inverters for latching. Conditional clocks in caches to reduce power; more complex race checking. Device variation affects symmetry. Clock waveform: $t_{rise}$ = 0.35 ns, $t_{skew}$ = 150 ps, $t_{cycle}$ = 3.3 ns. [Figure: location of the clock driver on the die, showing the pre-driver and final drivers]
EE141 © Digital Integrated Circuits 2nd Timing Issues 33 Clock Skew in Alpha Processor Clock skew
EE141 © Digital Integrated Circuits 2nd Timing Issues 34 EV6 (Alpha 21264) Clocking: 600 MHz, 0.35 micron CMOS. 2-phase, with multiple conditional buffered clocks. 2.8 nF clock load; 40 cm final driver width. Local clocks can be gated “off” to save power. Reduced load/skew; reduced thermal issues. Multiple clocks complicate race checking. Global clock waveform: $t_{rise}$ = 0.35 ns, $t_{skew}$ = 50 ps, $t_{cycle}$ = 1.67 ns.
EE141 © Digital Integrated Circuits 2nd Timing Issues Clocking
EE141 © Digital Integrated Circuits 2nd Timing Issues 36 EV6 Clock Results [Figure: two plots, one of GCLK skew in ps (at Vdd/2 crossings) and one of GCLK rise times in ps (20% to 80%, extrapolated to 0% to 100%)]
EE141 © Digital Integrated Circuits 2nd Timing Issues 37 EV7 Clock Hierarchy: Active Skew Management and Multiple Clock Domains. + Widely dispersed drivers. + DLLs compensate static and low-frequency variation. + Divides design and verification effort. - DLL design and verification is added work. + Tailored clocks.
EE141 © Digital Integrated Circuits 2nd Timing Issues 38 Self-timed and Asynchronous Design Functions of the clock in synchronous design: 1) acts as completion signal; 2) ensures the correct ordering of events. Truly asynchronous design: 1) completion is ensured by careful timing analysis; 2) ordering of events is implicit in logic. Self-timed design: 1) completion is ensured by a completion signal; 2) ordering is imposed by a handshaking protocol.
EE141 © Digital Integrated Circuits 2nd Timing Issues 39 Synchronous Pipelined Datapath Make sure that the clock period $T$ is larger than the maximum delay: $T > \max(t_{pd1}, t_{pd2}, t_{pd3}) + t_{pd,reg}$. Problems: clock skew and jitter; strong clock currents, which induce noise due to package inductance; power dissipation; uneven stage delays are not exploited, although they could be used to support faster processing.
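To make the last point concrete, the following sketch computes the bound on $T$ for a three-stage pipeline and the time each stage sits idle because the period is set by the slowest stage. The function names and the stage delays (in picoseconds) are hypothetical.

```python
def min_clock_period(stage_delays, t_pd_reg):
    """Bound from T > max(t_pd1, t_pd2, ...) + t_pd,reg."""
    return max(stage_delays) + t_pd_reg


def idle_time_per_stage(stage_delays, t_pd_reg):
    """Time each stage waits for the clock when T is at its bound."""
    t_min = min_clock_period(stage_delays, t_pd_reg)
    return [t_min - (d + t_pd_reg) for d in stage_delays]


stages = [400, 900, 600]  # hypothetical t_pd1, t_pd2, t_pd3 in ps
print(min_clock_period(stages, 100))     # 1000
print(idle_time_per_stage(stages, 100))  # [500, 0, 300]
```

The idle time in the fast stages is what a self-timed pipeline tries to recover.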
EE141 © Digital Integrated Circuits 2nd Timing Issues 40 Self-Timed Pipelined Datapath Necessary for self-timed logic is a completion signal
EE141 © Digital Integrated Circuits 2nd Timing Issues 41 Completion Signal Generation A completion signal can be generated by replica delay or by dual-rail coding. Using a delay element (e.g. in memories): [Figure: logic network from In to Out in parallel with a delay module, a replica of the critical path, that turns Start into Done]
EE141 © Digital Integrated Circuits 2nd Timing Issues 42 Completion Signal Generation Completion signal generation by dual-rail coding requires redundancy in the data representation: two bits, B0 and B1, represent a single bit value B. [Table: value of B for each combination of B0 and B1]
EE141 © Digital Integrated Circuits 2nd Timing Issues 43 Completion Signal in DCVSL [Figure: DCVSL gate with pull-down network (PDN) driven by In, precharged from $V_{DD}$ under control of Start, with outputs B0 and B1 combined to produce Done] Generation of a completion signal in DCVSL.
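A minimal sketch of dual-rail completion detection follows. It assumes the common convention that both rails low means "not yet valid" and that exactly one rail high means the bit has resolved; which rail encodes 0 and which encodes 1 does not matter for detecting completion. The function name is hypothetical.

```python
def dual_rail_done(rails):
    """Return True when every (B0, B1) pair has resolved.

    Assumption: (0, 0) is the reset / in-transition state and a pair is
    valid when exactly one rail is high. (1, 1) is treated as not valid.
    """
    return all(b0 != b1 for b0, b1 in rails)


print(dual_rail_done([(0, 0), (0, 0)]))  # False: still precharged
print(dual_rail_done([(0, 1), (0, 0)]))  # False: one bit still pending
print(dual_rail_done([(0, 1), (1, 0)]))  # True: all bits valid
```

In hardware this corresponds to combining the two rails of each bit and then combining the per-bit results over the whole word.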
EE141 © Digital Integrated Circuits 2nd Timing Issues 44 Self-Timed Adder Done signal generated after all carry signals are stable
EE141 © Digital Integrated Circuits 2nd Timing Issues 45 Completion Signal Using Current Sensing Current sensor outputs a low value when no current flows through the logic and a high value when logic is switching
EE141 © Digital Integrated Circuits 2nd Timing Issues 46 Hand-Shaking Protocol Two-Phase Handshake. The sender cannot change its data once it sends the request signal, which finishes its active cycle. The receiver reads the data and produces the acknowledge signal; this starts a new cycle, and the sender can process new data. Req and Ack events can be either high-to-low or low-to-high transitions.
EE141 © Digital Integrated Circuits 2nd Timing Issues 47 Event Logic – The Muller-C Element (a) Schematic: C-element with inputs A and B and output F. (b) Truth table ($A$, $B$ -> $F_{n+1}$): 0, 0 -> 0; 0, 1 -> $F_n$; 1, 0 -> $F_n$; 1, 1 -> 1. [Figure: implementations of the Muller-C element]
EE141 © Digital Integrated Circuits 2nd Timing Issues 48 2-phase Handshake Protocol [Figure: sender logic and receiver logic connected by Data, with handshake logic built around a Muller-C element and the signals Data Ready, Req, Ack and Data Accepted] Initially Req, Ack and Data Ready are 0. When Data Ready = 1, Req goes high and Data is transmitted. Once this is finished, Ack goes high and control is passed to the sender.
EE141 © Digital Integrated Circuits 2nd Timing Issues 49 Example: Self-timed FIFO Data is transferred on both positive and negative transitions of En. Done is a delayed En signal. Examine the operation of the FIFO by plotting signals.
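The behavior of a Muller-C element can be sketched as a small state-holding model: the output follows the inputs when they agree and keeps its previous value when they differ. The class name and the initial state are assumptions made for this illustration.

```python
class MullerC:
    """Behavioral model of a two-input Muller-C element."""

    def __init__(self, initial=0):
        self.f = initial  # stored output F_n

    def update(self, a, b):
        """Apply inputs A and B and return F_{n+1}."""
        if a == b:
            self.f = a  # both inputs agree: output follows them
        return self.f   # inputs differ: output holds F_n


c = MullerC()
print(c.update(1, 0))  # 0 (holds)
print(c.update(1, 1))  # 1 (both high)
print(c.update(0, 1))  # 1 (holds)
print(c.update(0, 0))  # 0 (both low)
```

This hold behavior is what lets the element wait for two events, for example a request and an acknowledge, before changing its output.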
EE141 © Digital Integrated Circuits 2nd Timing Issues 50 2-Phase Protocol
EE141 © Digital Integrated Circuits 2nd Timing Issues 51 Example From [Horowitz]
EE141 © Digital Integrated Circuits 2nd Timing Issues 52 Example
EE141 © Digital Integrated Circuits 2nd Timing Issues 53 Example
EE141 © Digital Integrated Circuits 2nd Timing Issues 54 Example
EE141 © Digital Integrated Circuits 2nd Timing Issues 55 4-Phase Handshake Protocol Slower, but unambiguous. Also known as RTZ (return-to-zero). Used to initialize Muller C-elements in a fixed state.
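The four phases of a generic return-to-zero handshake can be listed as a trace of (Req, Ack) values. This is a sketch of the protocol ordering only, not of the Muller-C implementation on the following slides; the function name is hypothetical.

```python
def four_phase_cycle():
    """Return the (Req, Ack) states of one 4-phase (RTZ) transfer."""
    req, ack = 0, 0
    trace = [(req, ack)]      # idle
    req = 1                   # phase 1: sender asserts Req, data valid
    trace.append((req, ack))
    ack = 1                   # phase 2: receiver accepts, asserts Ack
    trace.append((req, ack))
    req = 0                   # phase 3: sender returns Req to zero
    trace.append((req, ack))
    ack = 0                   # phase 4: receiver returns Ack to zero
    trace.append((req, ack))
    return trace


print(four_phase_cycle())
# [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
```

Every transfer starts and ends with both signals at 0, which is why the protocol is unambiguous but needs four transitions per data item instead of two.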
EE141 © Digital Integrated Circuits 2nd Timing Issues 56 4-Phase Handshake Protocol Implementation using Muller-C elements Initially Ack=0, Req=0, S=0, Data ready=0
EE141 © Digital Integrated Circuits 2nd Timing Issues 57 4-Phase Handshake Protocol Implementation using Muller-C elements Once Data ready=1 ->Req=1->S=1 and Data is transmitted
EE141 © Digital Integrated Circuits 2nd Timing Issues 58 4-Phase Handshake Protocol Implementation using Muller-C elements At the end of transmission, Data ready=0 -> Req=0 and S waits for Ack; once the receiver sets Ack=1 -> S=0 and the system waits for a new Data ready.
EE141 © Digital Integrated Circuits 2nd Timing Issues 59 Self-Resetting Logic Post-charge logic: a logic block is precharged as soon as the successor block finishes its operation and no longer needs the old data. A=1 -> int=0 -> out=1 -> precharge. At this stage A must be low to avoid conflict.
EE141 © Digital Integrated Circuits 2nd Timing Issues 60 Clock-Delayed Domino No global clock Clock from one stage drives the next one Transmission gate always switched on High speed operation
EE141 © Digital Integrated Circuits 2nd Timing Issues 61 Asynchronous-Synchronous Interface
EE141 © Digital Integrated Circuits 2nd Timing Issues 62 Synchronizers and Arbiters Arbiter: circuit to decide which of 2 events occurred first. Synchronizer: arbiter with the clock as one of the inputs. Problem: the circuit HAS to make a decision in limited time; which decision it makes is not important. Caveat: it is impossible to ensure correct operation, but we can decrease the error probability at the expense of delay.
EE141 © Digital Integrated Circuits 2nd Timing Issues 63 A Simple Synchronizer Data sampled on rising edge of the clock Latch will eventually resolve the signal value, but... this might take infinite time!
EE141 © Digital Integrated Circuits 2nd Timing Issues 64 Synchronizer: Output Trajectories Single-pole model for a flip-flop: initial increase, then exponential with time constant $\tau$. [Figure: transient response of the flip-flop around the metastable point $V_{MS}$; $V_{out}$ versus time in ps]
EE141 © Digital Integrated Circuits 2nd Timing Issues 65 Synchronizer: Output Trajectories An error occurs if the signal is still undefined after the waiting period $T$. A signal is undefined if it falls into the interval between $V_{IL}$ and $V_{IH}$, which happens when the initial value $v(0)$ lies in a narrow interval around $V_{MS}$. Increasing the waiting time $T$ decreases the probability of error exponentially. [Figure: $V_{out}$ versus time in ps]
EE141 © Digital Integrated Circuits 2nd Timing Issues 66 Mean Time to Failure The number of synchronization errors per second is set by the probability that a changing signal is undefined at the beginning of the sampling time, for a signal with transition period $T_{signal}$. After a synchronization time $T$, the number of synchronization errors drops: this probability falls exponentially with the time $T$ used to decide the signal value.
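The width of that interval follows from the single-pole model. The derivation below is a sketch under the assumption that, near the metastable point, the output diverges from $V_{MS}$ as a pure exponential with time constant $\tau$; the expressions are reconstructed from that assumption and are not copied from the slide.

```latex
\begin{align*}
v(t) &= V_{MS} + \bigl(v(0) - V_{MS}\bigr)\, e^{t/\tau} \\
V_{IL} \le v(T) \le V_{IH}
  \;&\Longleftrightarrow\;
  V_{MS} - (V_{MS} - V_{IL})\, e^{-T/\tau}
  \;\le\; v(0) \;\le\;
  V_{MS} + (V_{IH} - V_{MS})\, e^{-T/\tau} \\
\text{interval length} &= (V_{IH} - V_{IL})\, e^{-T/\tau}
\end{align*}
```

The interval of dangerous initial values shrinks by a factor $e^{-T/\tau}$, which is the source of the exponential improvement with waiting time.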
EE141 © Digital Integrated Circuits 2nd Timing Issues 67 Example Parameters: synchronization time $T$ = 10 nsec (equal to the clock period); average period of input signal transitions $T_{signal}$ = 50 nsec; rise time $t_r$ = 1 nsec; time constant $\tau$ = 310 psec; undefined region $V_{IH} - V_{IL}$ = 1 V ($V_{DD}$ = 5 V). Mean time to fail for this clock frequency: $MTF(T) = 1/N_{sync}(T)$, where $N_{sync}(T)$ is the number of errors per second. With synchronizer: $MTF(T)$ = 8.3 years. Without synchronizer: $MTF(0)$ = 2.5 µsec.
EE141 © Digital Integrated Circuits 2nd Timing Issues 68 Influence of Noise One would hope that noise may throw the system out of the undefined range; however, low-amplitude noise does not influence synchronization behavior: sometimes it helps, sometimes it hurts. [Figure: initial distribution $p(v)$ over 0, $V_{IL}$ and $V_{IH}$; uniform distribution around $V_M$]
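The example can be reproduced numerically. The sketch below assumes the usual single-pole expression for the error rate, $N_{sync}(T) = \frac{V_{IH} - V_{IL}}{V_{swing}} \cdot \frac{t_r}{T_{signal}} \cdot \frac{e^{-T/\tau}}{T_{clk}}$, and takes the signal swing equal to $V_{DD}$; both are assumptions, and the function and parameter names are hypothetical.

```python
import math


def n_sync(t_wait, t_clk, t_signal, t_r, tau, v_undefined, v_swing):
    """Synchronization errors per second after waiting t_wait seconds."""
    # Probability that a sample lands in the undefined region.
    p_init = (v_undefined / v_swing) * (t_r / t_signal)
    # One sample per clock period; metastability decays with tau.
    return p_init * math.exp(-t_wait / tau) / t_clk


params = dict(t_clk=10e-9, t_signal=50e-9, t_r=1e-9, tau=310e-12,
              v_undefined=1.0, v_swing=5.0)

mtf_without = 1.0 / n_sync(0.0, **params)    # no waiting time
mtf_with = 1.0 / n_sync(10e-9, **params)     # wait one clock period
seconds_per_year = 365.25 * 24 * 3600

print(f"MTF(0) = {mtf_without:.2e} s")
print(f"MTF(T) = {mtf_with:.2e} s = {mtf_with / seconds_per_year:.1f} years")
```

Under these assumptions the result is about 2.5 microseconds without waiting and roughly eight years with a 10 nsec wait, in line with the figures on the slide.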
EE141 © Digital Integrated Circuits 2nd Timing Issues 69 Typical Synchronizers Using delay line 2 phase clocking circuit
EE141 © Digital Integrated Circuits 2nd Timing Issues 70 Cascaded Synchronizers Increase MTF The increased MTF is obtained at the expense of larger latency. For instance, cascading synchronizers exponentially reduces the probability of synchronization failure.
EE141 © Digital Integrated Circuits 2nd Timing Issues 71 Arbiters [Figure: (a) schematic symbol of an arbiter with inputs Req1, Req2 and outputs Ack1, Ack2; (c) timing diagram of Req1, Req2, internal nodes A and B, and Ack1 versus t, marking the metastable interval and the $V_T$ gap] An arbiter decides which of two events occurred first, so a synchronizer is a special case of an arbiter, one that decides whether the signal came before the clock or not.
EE141 © Digital Integrated Circuits 2nd Timing Issues 72 PLL-Based Synchronization Lower frequency reference clock is increased to a desired frequency range by PLL
EE141 © Digital Integrated Circuits 2nd Timing Issues 73 PLL Block Diagram [Figure: the reference clock and the divided local clock feed the phase detector, whose Up and Down outputs drive the charge pump; the loop filter produces $v_{cont}$ for the VCO, whose output is the system clock and is fed back through a divide-by-N block] Based on the Up and Down signals, the charge pump either increases or decreases the control voltage $v_{cont}$ of the voltage-controlled oscillator (VCO). The VCO delivers a precise system clock, and the frequency divider divides it down to match the reference clock. The phase detector issues the Up and Down signals based on the phase difference.
EE141 © Digital Integrated Circuits 2nd Timing Issues 74 Phase Detector Output before filtering Transfer characteristic
EE141 © Digital Integrated Circuits 2nd Timing Issues 75 Phase-Frequency Detector (a) Schematic: two D flip-flops clocked by A and B, producing UP and DN, with a common reset Rst. (b) State transition diagram: three states, (UP = 0, DN = 1), (UP = 0, DN = 0) and (UP = 1, DN = 0), with transitions on edges of A and B. (c) Timing waveforms of A, B, UP and DN.
EE141 © Digital Integrated Circuits 2nd Timing Issues 76 Phase-Frequency Detector Response to frequency: B has a lower frequency than A.
EE141 © Digital Integrated Circuits 2nd Timing Issues 77 PFD Phase Transfer Characteristic Notice that a phase shift by a multiple of the clock period cannot be detected.
EE141 © Digital Integrated Circuits 2nd Timing Issues 78 Charge Pump The Up signal increases the output voltage and the Down signal decreases it.
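The three-state behavior can be sketched as a small state machine. The model below assumes the conventional arrangement in which a rising edge of A moves the detector toward UP and a rising edge of B moves it toward DN, with the reset making the (UP = 1, DN = 1) state vanish instantly; function names and the edge timing are hypothetical.

```python
def pfd_step(state, edge):
    """Next state: -1 means DN = 1, 0 means idle, +1 means UP = 1."""
    if edge == "A":
        return min(state + 1, 1)
    if edge == "B":
        return max(state - 1, -1)
    raise ValueError(f"unknown edge {edge!r}")


def run_pfd(period_a, period_b, t_end):
    """Return (time with UP high, time with DN high) up to t_end."""
    edges = [(k * period_a, "A") for k in range(1, t_end // period_a + 1)]
    edges += [(k * period_b, "B") for k in range(1, t_end // period_b + 1)]
    edges.sort()  # process rising edges in time order
    state, t_prev, up_time, dn_time = 0, 0, 0, 0
    for t, edge in edges:
        if state == 1:
            up_time += t - t_prev
        elif state == -1:
            dn_time += t - t_prev
        state = pfd_step(state, edge)
        t_prev = t
    return up_time, dn_time


# A faster than B (integer periods in arbitrary time units).
print(run_pfd(period_a=2, period_b=3, t_end=30))
```

When B has the lower frequency, only UP is ever asserted, which is what allows the detector to correct frequency errors and not just phase errors.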
EE141 © Digital Integrated Circuits 2nd Timing Issues 79 PLL Simulation local
EE141 © Digital Integrated Circuits 2nd Timing Issues 80 Example of PLL-generated clock
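EE141 © Digital Integrated Circuits 2nd Timing Issues 81 Clock Generation using DLLs Delay-Locked Loop (DLL), based on a voltage-controlled delay line (VCDL), with no frequency conversion: [Figure: $f_{REF}$ feeds the VCDL, which produces $f_O$; the phase detector, charge pump (U, D) and filter close the loop]. Phase-Locked Loop (VCO-based): [Figure: $f_{REF}$ and $f_O$ divided by N feed the PD; the CP (U, D), filter and VCO produce $f_O$].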
EE141 © Digital Integrated Circuits 2nd Timing Issues 82 Delay Locked Loop
EE141 © Digital Integrated Circuits 2nd Timing Issues 83 DLL-Based Clock Distribution