Chapter 7: Sequential Circuits
Boonchuay Supmonchai (B.Supmonchai), Integrated Design Application Research (IDAR) Laboratory
August 20, 2004; revised July 4, 2005
2102-545 Digital ICs
Goals of This Chapter
Implementation techniques for registers: latches and flipflops
Schmitt triggers
Oscillators and pulse generators
Static versus dynamic realization
Clocking strategies
2102-545 Digital ICs, Sequential Logic
Sequential Logic
[Figure: combinational logic with Inputs and Outputs; its Next State output feeds the D input of a state register clocked by CLOCK, and the register's Q output returns to the logic as Current State.]
Storage mechanisms: positive feedback (static) and charge-based (dynamic).
Static vs Dynamic Storage
Static storage
  preserves state as long as the power is on
  has positive feedback (regeneration), with an internal connection between the output and the input
  is useful when updates are infrequent (clock gating)
Dynamic storage
  stores state on parasitic capacitors
  holds state only for short periods of time (milliseconds)
  requires periodic refresh
  is usually simpler, so higher speed and lower power
Note: clock gating (conditional clocking) turns the clock off for unused modules to save power.
Latches versus Flipflops
Latches (with clock)
  level-sensitive circuits that pass the input to Q when the clock is high (or low): transparent mode
  the input sampled on the falling (or rising) edge of the clock is held stable when the clock is low (or high): hold mode
Flipflops (edge-triggered)
  edge-sensitive circuits that sample the inputs on a clock transition
  positive edge-triggered: 0 → 1
  negative edge-triggered: 1 → 0
  built using latches (e.g., master-slave flipflops)
Note: the latch description above is for a positive latch; a negative latch uses the flipped definitions.
(Author's note: should a slide with Figure 7.3 be inserted after this one?)
Review: The Regenerative Property
[Figure: two cascaded inverters (Vi1 → Vo1 = Vi2 → Vo2) and their overlaid voltage-transfer characteristics, with Vi1 = Vo2 and Vi2 = Vo1, intersecting at points A, B, and C.]
Bistability principle: a circuit having two stable states that represent 0 and 1.
Consider just two inverters: plot the VTC of the first inverter and of the second (the latter rotated to accentuate that Vi2 = Vo1). The resulting circuit has just three possible operation points (A, B, and C).
A small deviation from the bias point C (e.g., from noise) is amplified and regenerated around the circuit loop until either point A or B is reached.
If the gain in the transient region is larger than 1, only A and B are stable operation points; at these points the loop gain is much smaller than unity. C is a metastable operation point.
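The regeneration argument can be illustrated numerically. The sketch below is not from the lecture: it assumes an idealized tanh-shaped inverter characteristic, a 2.5 V supply, and a midpoint gain of 5 (all hypothetical values), and sends a voltage around the two-inverter loop repeatedly.

```python
import math

VDD = 2.5    # assumed supply voltage in volts (hypothetical)
GAIN = 5.0   # assumed inverter gain magnitude at the midpoint, > 1 (hypothetical)

def inverter(v_in):
    """Idealized inverter VTC: smooth, symmetric, switching at VDD/2."""
    return VDD / 2 * (1 - math.tanh(2 * GAIN * (v_in - VDD / 2) / VDD))

def settle(v_start, trips=20):
    """Send a voltage around the two-inverter loop `trips` times."""
    v = v_start
    for _ in range(trips):
        v = inverter(inverter(v))
    return v

if __name__ == "__main__":
    for v0 in (1.24, 1.25, 1.26):
        print(f"start {v0:.2f} V -> settles at {settle(v0):.4f} V")
```

Under these assumptions, a start exactly at the midpoint (point C) stays there, while a 10 mV deviation in either direction ends near one of the rails (points A and B).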
Review: Bistable Circuits
The cross-coupling of two inverters results in a bistable circuit (a circuit with two stable states).
[Figure: two cross-coupled inverters with inputs Vi1 and Vi2.]
To change the stored value, A (or B) must be made temporarily unstable by increasing the loop gain to a value larger than 1.
  This is done by applying a trigger pulse at Vi1 or Vi2.
  The width of the trigger pulse need be only a little larger than the total propagation delay around the loop (twice the delay of an inverter).
Cutting the feedback loop is the most popular approach in today's latches.
Review: SR Latch
[Figure: SR latch schematic and symbol with inputs S, R and outputs Q, !Q.]
S R | Q  !Q | Action
0 0 | Q  !Q | memory
1 0 | 1  0  | set
0 1 | 0  1  | reset
1 1 |       | disallowed
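A small behavioral model can make the SR actions concrete. The slide text does not say which gates the latch uses, so the sketch below assumes a cross-coupled NOR pair with active-high S and R; the function names are illustrative only.

```python
def nor(a, b):
    """Two-input NOR on 0/1 values."""
    return int(not (a or b))

def sr_latch(s, r, q, q_bar):
    """Iterate an assumed cross-coupled NOR pair until the outputs settle."""
    for _ in range(4):
        q_next = nor(r, q_bar)
        q_bar_next = nor(s, q)
        if (q_next, q_bar_next) == (q, q_bar):
            break
        q, q_bar = q_next, q_bar_next
    return q, q_bar

if __name__ == "__main__":
    state = (0, 1)
    for s, r, action in [(1, 0, "set"), (0, 0, "memory"), (0, 1, "reset")]:
        state = sr_latch(s, r, *state)
        print(action, state)
```

With S = R = 1 this model drives both outputs low, so Q and !Q are no longer complementary, which is why that input combination is disallowed.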
Review: Clocked D Latch
[Figure: D latch with inputs D and clock and outputs Q and !Q; waveforms of clock, D, and Q showing transparent mode and hold mode.]
In this course, "latch" always means a clocked latch.
Latches versus Flipflops II
Latch: stores data when the clock is low (high).
Flipflop: stores data when the clock rises (falls).
[Figure: latch and flipflop symbols (D, Clk, Q) with Clk, D, and Q waveforms for each.]
Positive and Negative Latches
Positive latch (D = In, G = Clk, Q = Out): Out follows In while Clk is high and is stable while Clk is low.
Negative latch (D = In, G = Clk, Q = Out): Out follows In while Clk is low and is stable while Clk is high.
[Figure: In, Out, and Clk waveforms for each latch, alternating between "Out stable" and "Out follows In".]
Latch-Based Design
N latch is transparent when φ = 0.
P latch is transparent when φ = 1.
[Figure: an N latch and a P latch, both clocked by φ, with combinational logic between them.]
Timing Metrics
[Figure: register with D (In), Q (Out), and clock; waveforms of clock, In (data stable around the clock edge), and Out (output stable after tc-q), annotated with tsu, thold, and tc-q.]
tsetup (tsu): time that the data inputs (D) must be valid before the clock transition (the 0 to 1 transition for a positive edge-triggered device)
thold: time that the data inputs must remain valid after the clock edge
tc-q: worst-case propagation delay with reference to the clock edge (time to copy D to Q)
Timing Definitions
Setup time, tsetup: the time that the data inputs (D) must be valid before the clock transition
  0 to 1 transition for a positive edge-triggered device
  1 to 0 transition for a negative edge-triggered device
Hold time, thold: the time that the data inputs must remain valid after the clock edge
Propagation delay, tc-q: the worst-case propagation delay with reference to the clock edge (time to copy D to Q)
System Timing Constraints
[Figure: combinational logic with Inputs and Outputs; Next State feeds the D input of the state register, whose Q output returns as Current State; the register is clocked by CLOCK with period T.]
tcd: contamination delay = minimum delay of the combinational logic or register
Hold constraint: $t_{cdreg} + t_{cdlogic} \ge t_{hold}$
Clock period constraint: $T \ge t_{c-q} + t_{plogic} + t_{su}$
Modern machines are characterized by a very low logic depth, so the register propagation delay and set-up time account for a significant portion of the clock period.
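The two system constraints (clock period at least tc-q + tplogic + tsu, and tcdreg + tcdlogic at least thold) can be checked mechanically. The following sketch is illustrative only: the function name and the numbers in the usage lines are hypothetical, and all times are assumed to be in the same unit (picoseconds here).

```python
def timing_slack(T, t_cq, t_plogic, t_su, t_cdreg, t_cdlogic, t_hold):
    """Return (setup_slack, hold_slack); both must be >= 0 for correct operation."""
    setup_slack = T - (t_cq + t_plogic + t_su)      # clock period constraint
    hold_slack = (t_cdreg + t_cdlogic) - t_hold     # contamination-delay constraint
    return setup_slack, hold_slack

if __name__ == "__main__":
    # Hypothetical values in picoseconds.
    setup, hold = timing_slack(T=1000, t_cq=160, t_plogic=600, t_su=210,
                               t_cdreg=100, t_cdlogic=50, t_hold=0)
    print("setup slack:", setup, "ps; hold slack:", hold, "ps")
```

A negative setup slack means the clock period must grow; a negative hold slack cannot be fixed by slowing the clock and needs more delay between the registers.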
Notes on System Timing Constraints
It is important to minimize the values of the timing parameters associated with the register.
In modern high-performance systems, the register propagation delay and set-up time account for a significant portion of the clock period.
The DEC Alpha EV6 has a maximum logic depth of 12 gates, and the register overhead accounts for about 15% of the clock period.
Hold time becomes an issue when there is little logic between registers or when the clocks at different registers are somewhat out of phase due to clock skew.
Building A (Static) Latch
For a latch, use the clock as a decoupling signal that distinguishes between the transparent and opaque states.
[Figure: latch schematics with inputs D and CLK; can be implemented as NMOS-only.]
Two approaches:
  cutting the feedback loop (mux-based latch)
  overpowering the feedback loop (as in static RAM)
MUX Based Latches
Change the stored value by cutting the feedback loop.
Negative latch (transparent when the clock is low): Q = clk & Q | !clk & D
Positive latch (transparent when the clock is high): Q = !clk & Q | clk & D
[Figure: for each latch, a 2:1 multiplexer selected by clk, with D on one input and Q fed back to the other.]
Nonratioed latch: sizing of the devices affects only performance and is not critical to functionality.
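The two multiplexer equations translate directly into code. This is a minimal behavioral sketch with illustrative function names, operating on 0/1 values; it ignores all delays.

```python
def negative_latch(clk, d, q):
    """Q = clk & Q | !clk & D  (transparent when the clock is low)."""
    return int((clk and q) or (not clk and d))

def positive_latch(clk, d, q):
    """Q = !clk & Q | clk & D  (transparent when the clock is high)."""
    return int((not clk and q) or (clk and d))

if __name__ == "__main__":
    q = 0
    for clk, d in [(1, 1), (0, 0), (1, 0)]:
        q = positive_latch(clk, d, q)
        print(f"clk={clk} d={d} -> q={q}")
```

In the usage loop, the positive latch copies D = 1 while clk is high, holds it while clk is low even though D changes, and follows D again on the next high phase.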
TG MUX Based Latch Implementation
[Figure: transmission-gate multiplexer latch with D, Q, clk, and !clk; one transmission gate samples the input (transparent mode), the other closes the feedback loop (hold mode).]
Positive latch: the latch is transparent (D is copied to Q) when clk is high (the bottom transmission gate is on).
The clk load is two transistors (and two for !clk), for a clock load of 4.
Both clk and !clk must be generated (nonoverlapping clocks).
PT MUX Based Latch Implementation
[Figure: pass-transistor multiplexer latch with D, Q, !Q, clk, and !clk; one pass transistor samples the input (transparent mode), the other closes the feedback loop (hold mode).]
Positive latch: the latch is transparent (D is copied to Q) when clk is high.
Replacing the transmission gates with pass transistors reduces the clock load: the clk load is one transistor (and one for !clk), for a clock load of 2.
The threshold drop at the output of the pass transistors reduces noise margins and switching performance and causes static power dissipation.
Both clk and !clk must still be generated.
Latch Race Problem
[Figure: combinational logic in a loop with a latch-based state register clocked by clk; signal B, updated to B', feeds back through the register.]
Which value of B is stored?
Two-sided clock constraint:
$T \ge t_{c-q} + t_{plogic} + t_{su}$
$T_{high} < t_{c-q} + t_{cdlogic}$
So, a solution to the latch race problem is to design with edge-triggered (master-slave) devices.
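The two-sided constraint can be read as a lower bound on the period and an upper bound on the transparent (high) time. The sketch below assumes the direction of the second inequality is Thigh < tc-q + tcdlogic (the relation symbols were lost in the slide text); the function name and numbers are hypothetical.

```python
def latch_race_check(T, T_high, t_cq, t_plogic, t_su, t_cdlogic):
    """Return (period_ok, race_free) for a latch-based register loop."""
    period_ok = T >= t_cq + t_plogic + t_su
    # The latch must close before new data can race around the loop.
    race_free = T_high < t_cq + t_cdlogic
    return period_ok, race_free

if __name__ == "__main__":
    # Hypothetical values in picoseconds, with a 50% duty cycle clock.
    print(latch_race_check(T=1000, T_high=500, t_cq=160,
                           t_plogic=600, t_su=210, t_cdlogic=100))
```

With these example numbers the period is long enough, but the 500 ps transparent phase exceeds the 260 ps minimum loop delay, so the model reports a race.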
Master Slave Based ET Flipflop
[Figure: D flipflop built from a master latch (output QM) followed by a slave latch (output Q), both clocked by clk; waveforms of clk, D, QM, and Q.]
clk = 0: master transparent, slave hold
clk = 1: master hold, slave transparent
On the low phase of the clock, the master is transparent and the D input is passed to the master stage output QM. The slave is in hold mode, keeping its previous value using feedback.
On the rising edge of the clock, the master stops sampling and goes into hold mode, and the slave starts sampling, copying QM to its output.
The value of Q is therefore the value of D right before the rising edge of the clock, achieving a positive edge-triggered effect.
A negative edge-triggered flipflop can be built by switching the order of master and slave (master positive, slave negative).
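The master-slave behavior can be sketched as two level-sensitive stages sharing one clock. This is a delay-free behavioral model with illustrative names, not a circuit-level description.

```python
class MasterSlaveFF:
    """Positive edge-triggered flipflop: negative master latch, positive slave latch."""

    def __init__(self):
        self.qm = 0  # master output QM
        self.q = 0   # slave output Q

    def apply(self, clk, d):
        if clk == 0:
            self.qm = d        # master transparent, slave holds
        else:
            self.q = self.qm   # master holds, slave transparent
        return self.q

def run(ff, samples):
    """Apply a list of (clk, d) samples and collect Q after each one."""
    return [ff.apply(clk, d) for clk, d in samples]

if __name__ == "__main__":
    print(run(MasterSlaveFF(), [(0, 1), (1, 1), (1, 0), (0, 0), (1, 0)]))
```

In the usage line, Q takes the value D had during the preceding low phase and ignores the change of D while clk stays high.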
MS ET Implementation
[Figure: transmission-gate master-slave flipflop with inverters I1 to I6 and transmission gates T1 to T4; D enters the master, QM is the master output, and Q is the slave output.]
clk = 0: master transparent, slave hold
clk = 1: master hold, slave transparent
20 transistors*, 8 clock loads (4 on clk and 4 on !clk)
* Ignoring the clk buffer: !clk is generated locally, and the buffer inverter overhead can be amortized over multiple register bits.
MS ET Timing Properties
Assume propagation delays are tpd_inv and tpd_tx, that the contamination delay is 0, and that the inverter delay to derive !clk is 0.
Set-up time: time before the rising edge of clk that D must be valid. $t_{su} = 3\,t_{pd\_inv} + t_{pd\_tx}$
Propagation delay: time for QM to reach Q. $t_{c-q} = t_{pd\_inv} + t_{pd\_tx}$
Hold time: time D must be stable after the rising edge of clk. $t_{hold} = 0$
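The three expressions can be evaluated for any pair of gate delays. The helper below is a sketch with a hypothetical name, and the delay values in the usage line are made up for illustration.

```python
def ms_et_timing(t_pd_inv, t_pd_tx):
    """Timing of the transmission-gate master-slave flipflop under the stated assumptions."""
    return {
        "t_su": 3 * t_pd_inv + t_pd_tx,   # D through I1, T1, I3, I2
        "t_cq": t_pd_inv + t_pd_tx,       # QM through T3 and I6
        "t_hold": 0,                      # T1 turns off at the clock edge
    }

if __name__ == "__main__":
    # Hypothetical delays in picoseconds.
    print(ms_et_timing(t_pd_inv=40, t_pd_tx=30))
```

The set-up time dominates because the input must travel through most of the master before the clock edge, while the output only has to cross one transmission gate and one inverter.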
Notes on MS ET Timing Properties
Set-up time: how long before the rising edge does D have to be stable so that QM samples the value reliably? D has to propagate through I1, T1, I3, and I2 before the rising edge to ensure that the node voltages on both terminals of T2 are the same value.
Propagation delay: since the delay of I2 is included in the set-up time, the output of I4 is valid before the rising edge of clk, so the delay is simply the delay through T3 and I6.
Hold time: since T1 turns off when the clock goes high, any changes in D after clk goes high are not seen, so the hold time is 0.
Set-up Time Simulation
[Figure: simulated waveforms of clk, D, QM, I2 out, and Q; Volts versus Time (ns).]
Method: progressively skew the input with respect to the clock edge until the circuit fails.
With tsetup = 0.21 ns the circuit works correctly.
Set-up Time Simulation II
[Figure: simulated waveforms of clk, D, QM, I2 out, and Q; Volts versus Time (ns).]
With tsetup = 0.20 ns the circuit fails: the clock is enabled before the nodes on both sides of the transmission gate T2 settle to the same value.
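The measurement procedure (move the data edge closer to the clock edge until capture fails) is a simple sweep. In the sketch below the circuit simulation is replaced by a stand-in function, `toy_flipflop`, whose 205 ps threshold is a made-up value chosen only so that 210 ps passes and 200 ps fails as on the slides; all names are hypothetical.

```python
def find_setup_time(captures_correctly, start_ps, step_ps):
    """Return the smallest tested D-to-clock lead time that still works, or None.

    `captures_correctly(lead_ps)` stands in for one circuit simulation run.
    `step_ps` must be positive.
    """
    last_good = None
    lead = start_ps
    while lead >= 0:
        if not captures_correctly(lead):
            break
        last_good = lead
        lead -= step_ps
    return last_good

def toy_flipflop(lead_ps):
    """Stand-in for a simulator: hypothetical pass/fail threshold of 205 ps."""
    return lead_ps >= 205

if __name__ == "__main__":
    print(find_setup_time(toy_flipflop, start_ps=300, step_ps=10), "ps")
```

The resolution of the result is the step size, so a real sweep would use a finer step near the failure point.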
Propagation Delay Simulation
[Figure: simulated waveforms of Clk, D, and Q; Volts versus Time (ns), with tc-q (LH) and tc-q (HL) marked.]
tc-q (LH) = 160 psec
tc-q (HL) = 180 psec
Propagation delay is measured from the 50% point of the clk edge to the 50% point of the Q output.
Hold time is found by skewing the D input edge relative to the clock signal until the circuit fails.
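The 50%-to-50% measurement can be applied to sampled waveforms with linear interpolation between samples. The sketch below uses hypothetical function names and a made-up five-point waveform; it assumes the first matching crossing of each signal is the one of interest.

```python
def crossing_time(times, volts, level, rising=True):
    """First time the waveform crosses `level`, linearly interpolated; None if never."""
    for i in range(1, len(times)):
        v0, v1 = volts[i - 1], volts[i]
        crossed = (v0 < level <= v1) if rising else (v0 > level >= v1)
        if crossed:
            frac = (level - v0) / (v1 - v0)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    return None

def clk_to_q(times, clk, q, vdd, q_rising=True):
    """Delay from the 50% point of the rising clk edge to the 50% point of Q."""
    t_clk = crossing_time(times, clk, vdd / 2, rising=True)
    t_q = crossing_time(times, q, vdd / 2, rising=q_rising)
    return t_q - t_clk

if __name__ == "__main__":
    t = [0, 100, 200, 300, 400]            # picoseconds (made-up samples)
    clk = [0.0, 2.5, 2.5, 2.5, 2.5]
    q = [0.0, 0.0, 0.0, 2.5, 2.5]
    print(clk_to_q(t, clk, q, vdd=2.5), "ps")
```

Passing `q_rising=False` measures the high-to-low case, which the slide reports separately because the two delays generally differ.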
Reduced Load MS ET FF
Clock load per register is important since it directly impacts the power dissipation of the clock network.
The clock load can be reduced (at the cost of robustness) by making the circuit ratioed.
[Figure: master (T1 with cross-coupled inverters I1 and I2, output QM) and slave (T2 with I3 and I4, output Q), clocked by clk and !clk.]
12 transistors, 4 clock loads (2 on clk and 2 on !clk), but now a ratioed design and thus less robust.
To switch the state of the master, T1 must be sized to overpower I2. T1 and T2 should nevertheless use minimum (or near-minimum) transistors to keep the clock load small, which reduces power dissipation in the flipflops and in the clock distribution network. Thus the transistors in I2 should probably be weakened (made longer).
Reverse conduction: the second stage can affect the state of the first latch. When the slave is on, a combination of T2 and I4 can influence the data stored in I1 and I2. To avoid reverse conduction, I4 must be weaker than I1.
Non-Ideal Clocks
clk and !clk are never perfect inversions of one another.
  !clk must be generated, and both signals must be routed.
  Variations can exist in the wires used to route the two clock signals, and load capacitances may vary.
Non-ideal clocks create skew, resulting in clock overlap.
[Figure: ideal clk and !clk waveforms compared with non-ideal waveforms showing a 1-1 overlap and a 0-0 overlap.]
Example of Clock Skew Problems
[Figure: pass-transistor master-slave flipflop with pass transistors P1 to P4, inverters I1 to I4, internal nodes A, B, and X, input D, outputs Q and !Q, clocked by clk and !clk.]
Race condition: there is a direct path from D to Q during the short time when both clk and !clk are high (1-1 overlap). When the clock goes high, the slave should go into hold mode, but because of the overlap the output can change on the rising edge (note that this is a negative edge-triggered device). The value of Q then depends on whether the input D arrives at node X before or after the falling edge of !clk.
Undefined state: both B and D are driving node A when clk and !clk are both high.
Dynamic storage: occurs when clk and !clk are both low (0-0 overlap).
Pseudostatic Two-Phase ET FF
[Figure: the same flipflop (P1 to P4, I1 to I4, nodes A, B, X, input D, outputs Q and !Q) clocked by two nonoverlapping phases clk1 and clk2 separated by tnon_overlap.]
One phase: master transparent, slave hold. Other phase: master hold, slave transparent. During the nonoverlap time: dynamic storage.
Keep the clock nonoverlap time large enough that no overlap occurs even in the presence of clock skew.
During the nonoverlap time, the flipflop is in the high-impedance state: the feedback loop is open (the loop gain is zero) and the input is disconnected. Leakage will destroy the state if this condition holds for too long, hence the name pseudostatic: the register uses a combination of static and dynamic storage depending on the state of the clock.
Do not stop the clocks when both are low!
Two Phase Clock Generator
[Figure: circuit generating the nonoverlapping phases clk1 and clk2 from clk, with internal nodes A and B, and the corresponding waveforms of clk, A, B, clk1, and clk2.]
Power PC Flipflop
[Figure: PowerPC flipflop schematic with D, Q, clk, and !clk, annotated with example node values.]
16 transistors, 8 clock loads
During one clock phase the master is transparent and the slave holds; during the other, the master holds and the slave is transparent.
Overpowering The Feedback Loop
Clocked SR latch
[Figure: clocked SR latch with transistors M1 to M8, inputs S, R, and clk, outputs Q and !Q.]
Cross-coupled NANDs
This is no longer used in datapaths, but it is a basic building block for memory cells.
Ratioed CMOS Clocked SR Latch
[Figure: clocked SR latch with transistors M1 to M8, inputs S, R, and clk, outputs Q and !Q, annotated with on/off states and node values during switching.]
8 transistors, 2 clock loads (sized); a level-sensitive SR latch.
No static power consumption, but it is a ratioed device: sizing is critical to ensure proper functionality.
For the case shown, M7 and M8 must succeed in bringing Q low (overcoming M4), below the threshold of M1; likewise M5 and M6 must overcome M2.
Therefore the sizes of transistors M5, M6, M7, and M8 must be increased.
Sizing Issues
Assuming Q = 0, determine the minimum sizes of M5, M6, M7, and M8 that make the device switchable. Want VM at VDD/2.
(W/L)2 and 4 = 1.5 µm / 0.25 µm
(W/L)1 and 3 = 0.5 µm / 0.25 µm
[Figure: !Q (Volts) versus (W/L)5 and 6; the output voltage depends on the pull-down transistor width.]
Result: (W/L)5 and 6 > 3 for the pair, so the individual device ratio for M5 or M6 must be larger than approximately 6 (the two devices are in series).
Analysis gives 2.26 (instead of 3) because it does not take into account channel length modulation and DIBL (drain-induced barrier lowering).
Transient Response
[Figure: S and !Q (Volts) versus Time (ns) for pull-down widths W = 0.5 µm, 0.6 µm, 0.7 µm, 0.8 µm, 0.9 µm, and 1 µm.]
tp!Q = 120 psec
tpQ = 230 psec
6 Transistor CMOS SR Latch
[Figure: SR latch with transistors M1 to M6, inputs S, R, and clk, outputs Q and !Q.]
6 transistors, 2 clock loads
Problems with noise margins and static power consumption due to the threshold drop across the pass transistors.
Once again, sizing is important, especially for M5 and M6.
This structure will appear again in the discussion of SRAMs.
Review: Storage Mechanisms
Static (positive feedback): useful when updates are infrequent.
Dynamic (charge-based): simpler, faster, and lower power.
[Figure: static and dynamic latch schematics with D, CLK, and Q.]
Dynamic ET Flipflop
[Figure: master (T1, I1, storage node C1, output QM) and slave (T2, I2, storage node C2, output Q), clocked by clk and !clk.]
During one clock phase the master is transparent and the slave holds; during the other, the master holds and the slave is transparent.
8 transistors, 4 clock loads, so very efficient.
C1 consists of the gate capacitance of I1, the junction capacitance of T1, and the overlap gate capacitance of T1.
$t_{su} = t_{pd\_tx}$: the delay of the transmission gate (the time it takes C1 to sample the D input)
$t_{hold} = 0$: T1 is turned off on the clock edge, so further input changes are ignored
$t_{c-q} = 2\,t_{pd\_inv} + t_{pd\_tx}$: two inverter delays plus the delay of T2
Remember: the dynamic nodes (C1 and C2) hold their state only for a limited time, so the flipflop has to be refreshed periodically to prevent state loss due to charge leakage.
Dynamic ET FF Race Conditions
[Figure: the dynamic flipflop (T1, I1, C1, QM, T2, I2, C2, Q) with clk and !clk waveforms showing 0-0 and 1-1 overlap.]
Clock overlap leads to race conditions.
1-1 overlap race condition: $t_{overlap1-1} < t_{hold}$. Fixed by enforcing a hold time: data must be stable during the high-high overlap period.
0-0 overlap race condition: $t_{overlap0-0} < t_{T1} + t_{I1} + t_{T2}$. Fixed by making sure there is enough delay between D and C2 that new data sampled by the master does not propagate to the slave (this can be ensured by enforcing an appropriate set-up time).
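Both overlap conditions are simple inequalities on delays, so they can be checked together. The function name and the example delays below are hypothetical; all times are assumed to be in picoseconds.

```python
def dynamic_ff_race_free(t_overlap_00, t_overlap_11, t_t1, t_i1, t_t2, t_hold):
    """Return (ok_00, ok_11) for the two clock-overlap race conditions."""
    ok_00 = t_overlap_00 < t_t1 + t_i1 + t_t2   # new data must not reach C2 in time
    ok_11 = t_overlap_11 < t_hold               # D held stable through the overlap
    return ok_00, ok_11

if __name__ == "__main__":
    # Hypothetical values in picoseconds.
    print(dynamic_ff_race_free(t_overlap_00=80, t_overlap_11=60,
                               t_t1=30, t_i1=40, t_t2=30, t_hold=50))
```

In this example the 0-0 overlap is short enough, but the 1-1 overlap exceeds the enforced hold time, so the input would need to be held stable for longer.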
Dynamic Two-Phase ET FF
[Figure: the dynamic flipflop (T1, I1, C1, QM, T2, I2, C2, Q) clocked by clk1, !clk1, clk2, and !clk2, with clk1 and clk2 separated by tnon_overlap.]
During one phase the master is transparent and the slave holds; during the other, the master holds and the slave is transparent.
Keep the clock nonoverlap time large enough that no overlap occurs even in the presence of clock skew.
But now there are 4 clock signals to route!
Pseudostatic Dynamic Latch
Robustness considerations limit the use of dynamic flipflops:
  coupling between signal nets and internal storage nodes can inject significant noise and destroy the flipflop state
  leakage currents cause the state to leak away with time
  internal dynamic nodes do not track fluctuations in VDD, which reduces noise margins
A simple fix is to make the circuit pseudostatic by adding a weak feedback inverter to each latch.
[Figure: latch with input D, transmission gate T1 clocked by clk and !clk, and a weak feedback inverter.]
This comes at a slight cost in delay (it adds to the capacitive load) and power consumption, but it improves noise immunity significantly.
C2MOS (Clocked CMOS) ET Flipflop
[Figure: master (M1 to M4, storage node C1, output QM) and slave (M5 to M8, storage node C2, output Q), clocked by clk and !clk, annotated with on/off states.]
During one clock phase the master is transparent and the slave holds; during the other, the master holds and the slave is transparent.
8 transistors, 4 clock loads
This is a positive edge-triggered master-slave flipflop, like the earlier dynamic ET flipflop (again only 8 transistors and 4 clock loads), with one important difference: a C2MOS flipflop with clk and !clk clocking is insensitive to clock overlap as long as the rise and fall times of the clock edges are sufficiently small.
C2MOS FF 0-0 Overlap Case
[Figure: the C2MOS flipflop (M1 to M8, C1, C2, QM, Q, D) during a 0-0 overlap, with two clk/!clk waveform cases (left and right).]
Does any new data sampled during the overlap window propagate to Q (race)?
New data is sampled on QM, but it cannot propagate to Q since M7 is off (the slave is in hold). Any new data sampled on the falling clock edge is not seen at Q.
Notes on C2MOS FF 0-0 Overlap Case
Does any new data sampled during the overlap window propagate to Q (race)?
New data is sampled on QM, but it cannot propagate to Q since M7 is off (the slave is in hold). Any new data sampled on the falling clock edge is not seen at Q.
For the clocking on the left: at the end of the overlap period !clk = 1 and both M7 and M8 turn off, putting the slave in hold mode.
For the clocking on the right: at the end of the overlap period clk = 1 and both M3 and M4 turn off, putting the master in hold mode (this affects the set-up time as well).
The result: the flipflop is slower (longer tc-q).
C2MOS FF 1-1 Overlap Case
[Figure: the C2MOS flipflop (M1 to M8, C1, C2, QM, Q, D) during a 1-1 overlap, with two clk/!clk waveform cases.]
Does any new data sampled during the overlap window (right after the clock goes high) propagate to Q (race)?
New data is sampled on QM, but it cannot propagate to Q since M8 is off (the slave is in hold).
1-1 overlap constraint: $t_{overlap1-1} < t_{hold}$
Notes on C2MOS FF 1-1 Overlap Case
New data is sampled on QM, but it cannot propagate to Q since M8 is off (the slave is in hold).
This case is a bit more problematic than the 0-0 overlap. A hold time must be enforced on D, so that a change in D that reaches QM is not copied to Q when the overlap time is over and !clk goes to zero, turning on M8 (first clocking condition).
Imposing a hold time on D, so that D is stable during the clock overlap, overcomes this problem as well.
However, a race can still occur if the rise and fall times of the clock are sufficiently slow. The circuit works correctly as long as the clock rise and fall times are smaller than approximately five times the propagation delay of the flipflop.
C2MOS Transient Response
[Figure: simulated QM and Q for clock rise/fall times of 0.1 ns and 3 ns: QM(3), Q(3), Q(0.1), clk(0.1 ns), clk(3 ns); Volts versus Time (nsec).]
For slow clocks, the potential for a race condition exists.
True Single Phase Clocked (TSPC) Latches
[Figure: negative and positive TSPC latch schematics, each with In, clk, and Q.]
Negative latch: hold when clk = 1, transparent when clk = 0.
Positive latch: transparent when clk = 1, hold when clk = 0.
Uses only a single clock, so there is no clock overlap (skew) to worry about; the clock load is also reduced.
Transparent mode is equivalent to two cascaded inverters (the latch is non-inverting).
Embedding Logic in TSPC Latch
[Figure: TSPC latch with the input stage replaced by a pull-up network (PUN) and pull-down network (PDN), and an example latch computing A AND B with inputs A, B, and clk and output Q.]
Logic can be embedded into the latch (or flipflop).
This reduces the delay overhead associated with the latch.
Notes on Embedding Logic in TSPC Latch
The set-up time increases, but overall performance improves: the increase in set-up time is typically smaller than the delay of an AND gate.
For example, using minimum-size devices, the set-up time of the AND latch is 140 psec. The conventional approach of an AND gate followed by a latch has an effective set-up time of 600 psec.
The technique was used extensively in the design of the EV4 DEC Alpha microprocessor and many other high-performance processors.
TSPC ET FF
[Figure: TSPC master and slave latches with input D, internal node QM, output Q, and a single clk, annotated with on/off states.]
During one clock phase the master is transparent and the slave holds; during the other, the master holds and the slave is transparent.
12 transistors, 4 clock loads
The clock load of 4 transistors is similar to the transmission-gate or C2MOS designs, but there is only one clock to drive and route (at 12 transistors, compared with 8 in the previous two designs).
Virtually all constraints are removed: no clocks to overlap, no race.
Notes on TSPC ET FF
Warning: like C2MOS, TSPC flipflops malfunction when the slope of the clock is not sufficiently steep.
Slow clocks cause both the NMOS and PMOS clocked transistors to be on simultaneously, resulting in undefined state values and race conditions.
Clock slopes must therefore be carefully engineered. If necessary, local buffers must be introduced to ensure the quality of the clock signal.
Simplified TSPC ET FF
[Figure: three clocked stages I1, I2, I3 built from transistors M1 to M9, with input D, internal nodes X and Y, output Q, and a single clk, annotated with on/off states.]
clk = 0: I1 samples (transparent), I2 is precharged, I3 holds.
clk = 1: I1 holds, I2 evaluates, I3 samples (transparent).
9 transistors* and 4 clock loads (*11 if Q rather than !Q is needed), with only one clock to drive and route, compared with 8 transistors in the dynamic and C2MOS flipflops.
Positive edge-triggered (ask the class why): when clk = 0, the input inverter samples D onto X, the second (dynamic) inverter is in precharge mode so Y is 1, and the third inverter is in hold mode so Q is stable. On the rising edge of the clock, the middle inverter evaluates, and since the third inverter samples when clk = 1, the output Q goes to its new state.
Notes on TSPC ET FF: On the positive edge of the clock, node X transitions low if D is high. The input must therefore be kept stable until the value that node X held before the rising clock edge has propagated to Y. This sets the hold time of the register, which is less than one inverter delay because it takes one inverter delay for the input to affect node X. The propagation delay is essentially three inverter delays, since the value on node X must propagate to output Q. The set-up time is the time for node X to become valid: one inverter delay.
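The timing figures in these notes can be turned into a rough clock-period estimate. The sketch below expresses set-up time, the hold-time bound and propagation delay as multiples of one inverter delay and applies the standard constraint T >= t_pFF + t_logic + t_setup. The function names, the 50 ps inverter delay and the 300 ps logic delay are assumptions chosen for illustration; they do not come from the slides.

```python
def tspc_timing(t_inv):
    """First-order timing of the simplified TSPC register (seconds).

    t_inv is the delay of one inverter stage (an assumed input).
    """
    return {
        "t_setup": 1 * t_inv,       # time for node X to become valid
        "t_hold_bound": 1 * t_inv,  # hold time is less than one inverter delay
        "t_pFF": 3 * t_inv,         # X must propagate through to Q
    }


def min_clock_period(t_inv, t_logic):
    """Smallest period satisfying T >= t_pFF + t_logic + t_setup."""
    timing = tspc_timing(t_inv)
    return timing["t_pFF"] + t_logic + timing["t_setup"]


if __name__ == "__main__":
    # Hypothetical numbers: 50 ps inverter delay, 300 ps of logic.
    period = min_clock_period(t_inv=50e-12, t_logic=300e-12)
    print(f"minimum clock period: {period * 1e12:.0f} ps")
```

Because every term scales with the inverter delay, a faster inverter tightens all three register parameters at once.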
Sizing Issues in Simplified TSPC ET FF (figure: simulated waveforms, volts versus time in ns, of clk, !Q and Q for the original and the modified sizing). Transistor sizing: original widths M4, M5 = 0.5 µm and M7, M8 = 2 µm; modified widths M4, M5 = 1 µm and M7, M8 = 1 µm. Sizing is critical: with improper sizing, glitches may occur because of a race condition when the clock transitions from low to high. On that transition, nodes Y and !Q start to discharge simultaneously (the case for D low). Once Y is sufficiently low, the trend on !Q reverses. This produces the glitch seen with the original sizing (red curves) and also reduces the contamination delay. The fix is to resize (green curves) so that the relative strengths of the pull-down paths of the second and third inverters let Y discharge faster than !Q.
Split-Output TSPC Latches (figure: positive latch and negative latch, each with clk, In, Q and internal node A). The positive latch is transparent when clk = 1 and holds when clk = 0; the negative latch holds when clk = 1 and is transparent when clk = 0. Splitting the output halves the clock load (to two for a flipflop composed of a positive-negative latch pair). The downside is that not all node voltages in the latch experience a full logic swing, because of the threshold drop: A = VDD - VTn when In = 0 (for example, in the positive latch with clk = 1) and A = |VTp| when In = 1. The reduced swing also limits the amount of VDD scaling possible with this latch.
Split-Output TSPC ET FF (figure: clk, D, QM, A, Q). 8 transistors (10 if Q is needed), 2 clock loads. Which edge triggers this flipflop? The clock load is now only 2 transistors. The downside is that not all node voltages experience a full logic swing because of the threshold drop (for example, A = VDD - VTn in the positive latch when D = 0 and clk = 1), which also limits the amount of VDD scaling possible.
Pulse-Triggered Flipflops: Another approach to designing an edge-triggered flipflop is pulse triggering. (Figure: a master-slave flipflop with latches L1 and L2, compared with a pulse-triggered flipflop with a single latch L; signals Data, D, Clk, Q.)
Pulsed FF (AMD-K6) (figure: transistors M1 to M6 and P1 to P3, internal node X, signals clk, !clkd, D, Q). Pulse registers: a short pulse (glitch clock) is generated locally from the rising (or falling) edge of the system clock and is used as the clock input to the flipflop. !clkd is a delayed, inverted version of clk. When the clock is low, M3 and M6 are off and P1 is on, precharging node X; the output node Q is decoupled from X and is in hold mode. On the rising edge of clk, M3 and M6 turn on while M1 and M4 stay on for a short period. During this period the flipflop is transparent and samples the input data D. Once !clkd goes low, node X is decoupled from the input and is either held or starts to precharge to VDD through PMOS device P2. On the falling edge of the clock, node X is held at VDD and the output is held stable by the cross-coupled inverters. Note that the one-shot (pulse generator) is integrated into the register.
Notes on Pulsed FF: Race conditions are avoided by keeping the transparent period very short (only the duration of the pulse). Pulsed registers reduce the clock load but substantially increase verification complexity. The transparency period determines the hold time, and the window must be wide enough for the input data to propagate to Q. The set-up time can be negative (if the transparency window is longer than the delay from input to output). This is attractive because data can arrive at the register even after the clock goes high, meaning that time can be borrowed from the previous cycle.
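The relationship between the transparency window and the set-up and hold times can be sketched with a first-order model. It assumes that data arriving a time a after the clock edge is captured only if it reaches Q before the window closes (a + t_dq <= t_window), which gives a set-up time of t_dq - t_window and a hold time equal to the window. The names `t_window` and `t_dq` and the example values are hypothetical, and the model ignores pulse-shape and device effects.

```python
def pulsed_ff_timing(t_window, t_dq):
    """First-order set-up and hold times of a pulsed register (seconds).

    t_window: width of the transparency window (assumed name)
    t_dq:     delay from input D to output Q (assumed name)
    """
    if t_window <= 0 or t_dq <= 0:
        raise ValueError("t_window and t_dq must be positive")
    return {
        "t_hold": t_window,          # data must stay stable while transparent
        "t_setup": t_dq - t_window,  # negative when the window exceeds t_dq
    }


if __name__ == "__main__":
    # Hypothetical values: 150 ps window, 100 ps input-to-output delay.
    timing = pulsed_ff_timing(t_window=150e-12, t_dq=100e-12)
    print(f"set-up: {timing['t_setup'] * 1e12:.0f} ps, "
          f"hold: {timing['t_hold'] * 1e12:.0f} ps")
```

A wider window buys more negative set-up time but lengthens the hold requirement by the same amount, which is one reason verification becomes harder.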
Sense Amp FF (StrongArm SA100) (figure: sense-amplifier-based flipflop, transistors M1 to M10, internal nodes X and Y, signals clk, D, !S, !R, Q, !Q). A sense amplifier is a circuit that accepts small-swing input signals and amplifies them to full rail-to-rail signals.
Notes on Sense Amp FF: The key is transistor M4 (in the middle of the sense amplifier); it delays signals passing from one of its terminals to the other, so the change on the far side is slower. When D = 1, Y changes after X because of the delay through M4. By the time M6 reacts to the change at its terminal, it has already been turned off by the voltage at the M4 terminal (a 0). Thus, M6 holds a 1. M4 also provides a DC leakage path to ground for either node X or Y in case the inputs change value after the positive edge of clk arrives. The advantages are reduced clock load and suitability as a receiver for reduced-swing differential buses. Where does the differential signal enter?
Flipflop Comparison Chart
Name | Type | #clk ld | #tr | tset-up | thold | tpFF
Mux | Static | 8 (clk-!clk) | 20 | 3tpinv+tptx | | tpinv+tptx
PowerPC | | | 16 | | |
2-phase | Ps-Static | 8 (clk1-clk2) | | | |
T-gate | Dynamic | 4 (clk-!clk) | 8 | tptx | to1-1 | 2tpinv+tptx
C2MOS | | | | | |
TSPC | | 4 (clk) | 11 | tpinv | | 3tpinv
S-O TSPC | | 2 (clk) | 10 | | |
AMD K6 | | 5 (clk) | 19 | | |
SA 100 | SenseAmp | 3 (clk) | | | |
Choosing a Clocking Strategy
Choosing the right clocking scheme affects the functionality, speed, and power of a circuit.
Two-phase designs
+ robust and conceptually simple
- need to generate and route two clock signals
- have to be designed to accommodate possible skew between the two clock signals
Single-phase designs
+ only need to generate and route one clock signal
+ supported by most automated design methodologies
+ no need to worry about skew between two clocks
- must have guaranteed slopes on the clock edges
Non-Bistable Sequential Circuits: Previously, we defined a circuit with two stable states as a bistable circuit. Other regenerative circuits are non-bistable. Monostable: only one stable state (pulse generators, one-shot circuits). Astable: no stable states (oscillators, on-chip clock generators). Schmitt trigger: a special regenerative circuit exhibiting hysteresis in its VTC.
Schmitt Trigger: 2 important properties: (1) hysteresis; (2) fast transition time at the output.
Noise Suppression using Schmitt Trigger (figure: VIN versus t with thresholds VM+ and VM- marked around t0; VOUT versus t switching at t0 + tp). Example: switch debouncer.
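The noise-suppression effect can be illustrated with a small behavioral model. The sketch below compares a non-inverting Schmitt trigger, which switches high only above VM+ and low only below VM-, with a plain single-threshold comparator on the same noisy rising input. The threshold values, the sample sequence and all function names are assumptions made for illustration; the model ignores the propagation delay tp.

```python
def schmitt_trigger(samples, vm_plus, vm_minus, v_high=2.5, v_low=0.0):
    """Behavioral non-inverting Schmitt trigger; output starts low."""
    output = []
    is_high = False
    for v in samples:
        if not is_high and v > vm_plus:
            is_high = True       # rising input must exceed VM+
        elif is_high and v < vm_minus:
            is_high = False      # falling input must drop below VM-
        output.append(v_high if is_high else v_low)
    return output


def comparator(samples, threshold, v_high=2.5, v_low=0.0):
    """Single-threshold comparator with no hysteresis, for comparison."""
    return [v_high if v > threshold else v_low for v in samples]


def count_transitions(output):
    """Number of times the output changes level."""
    return sum(1 for a, b in zip(output, output[1:]) if a != b)


if __name__ == "__main__":
    # Hypothetical noisy rising edge, e.g. a bouncing switch contact.
    noisy = [0.0, 0.5, 1.0, 1.4, 1.1, 1.6, 1.2, 1.7, 2.0, 2.4]
    print(count_transitions(schmitt_trigger(noisy, vm_plus=1.5, vm_minus=1.0)))
    print(count_transitions(comparator(noisy, threshold=1.25)))
```

With these sample values the noise never falls below VM- once the output has gone high, so the Schmitt model switches once while the comparator toggles on every excursion across its threshold.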
CMOS Schmitt Trigger (figure: VIN, internal node X, VOUT, VDD, transistors M1 to M4). The circuit moves the switching threshold of the first inverter. Low-to-high transition: r_eff = k_M1 / (k_M2 + k_M4). High-to-low transition: r_eff = (k_M1 + k_M3) / k_M2. Adapting the ratio between PMOS and NMOS depending on the direction of the transition results in a shift in the switching threshold.
Schmitt Trigger Simulated VTC (figure: two panels, Vout (V) versus Vin (V), both axes 0.0 to 2.5 V). First panel: voltage transfer characteristic with hysteresis, for M1 = 1 µm/0.25 µm, M2 = 3 µm/0.25 µm, M3 = 0.5 µm/0.25 µm, M4 = 1.5 µm/0.25 µm. Second panel: effect of varying the ratio of the PMOS device M4, with M4 = k x 0.5 µm/0.25 µm for k = 1, 2, 3, 4.
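The shift in switching threshold can be estimated by hand before simulating. The sketch below is a rough first-order calculation, not a reproduction of the simulated VTC: it uses the long-channel inverter threshold formula VM = (VTn + r (VDD - |VTp|)) / (1 + r) with r = sqrt(k_p / k_n), and treats the active feedback device simply as extra width in parallel with the first inverter (M1 and M3 as NMOS, M2 and M4 as PMOS). The threshold voltages and process transconductances are assumed values, and velocity saturation is ignored, so the numbers will not match a real simulation.

```python
from math import sqrt

VDD = 2.5           # supply voltage (V)
VTN = 0.43          # assumed NMOS threshold (V)
VTP = -0.40         # assumed PMOS threshold (V)
KN_PRIME = 115e-6   # assumed NMOS process transconductance (A/V^2)
KP_PRIME = 30e-6    # assumed PMOS process transconductance magnitude (A/V^2)


def gain(k_prime, width_um, length_um=0.25):
    """Device gain factor k = k' * W / L."""
    return k_prime * width_um / length_um


def switching_threshold(k_n, k_p):
    """Long-channel inverter switching threshold (first-order model)."""
    r = sqrt(k_p / k_n)
    return (VTN + r * (VDD + VTP)) / (1 + r)


def schmitt_thresholds(w_m1, w_m2, w_m3, w_m4):
    """Return (VM+, VM-) for device widths in micrometres."""
    k_m1, k_m3 = gain(KN_PRIME, w_m1), gain(KN_PRIME, w_m3)  # NMOS devices
    k_m2, k_m4 = gain(KP_PRIME, w_m2), gain(KP_PRIME, w_m4)  # PMOS devices
    vm_plus = switching_threshold(k_m1, k_m2 + k_m4)   # low-to-high: M4 helps M2
    vm_minus = switching_threshold(k_m1 + k_m3, k_m2)  # high-to-low: M3 helps M1
    return vm_plus, vm_minus


if __name__ == "__main__":
    vm_plus, vm_minus = schmitt_thresholds(1.0, 3.0, 0.5, 1.5)
    print(f"VM+ ~ {vm_plus:.2f} V, VM- ~ {vm_minus:.2f} V")
```

Widening M4 in this model strengthens the pull-up seen during a low-to-high input transition and so raises VM+, which is the trend the k sweep is meant to show.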
CMOS Schmitt Trigger (2) (figure: VIN, VOUT, internal node X, transistors M1, M3, M4, M5, M6). How does the gate operate? Sketch the VTC and find expressions for VM- and VM+.
Review: Ring Oscillator: Period T = 2 x tp x N, where tp is the propagation delay of one stage and N is the number of stages. Different clock duty cycles and phases can be derived using simple logic operations.
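As a quick numeric check of T = 2 x tp x N, the sketch below computes the period and frequency of a single-ended inverter ring. The function names and the example values (5 stages, 50 ps per stage) are assumptions for illustration; the check for an odd stage count applies to a simple inverter ring only.

```python
def ring_oscillator_period(t_p, n_stages):
    """Period T = 2 * t_p * N of a single-ended inverter ring (seconds)."""
    if n_stages < 3 or n_stages % 2 == 0:
        # A simple inverter ring needs an odd number of inversions to oscillate.
        raise ValueError("use an odd number of stages, at least 3")
    return 2 * t_p * n_stages


def ring_oscillator_frequency(t_p, n_stages):
    """Oscillation frequency in hertz."""
    return 1.0 / ring_oscillator_period(t_p, n_stages)


if __name__ == "__main__":
    # Hypothetical ring: 5 stages, 50 ps propagation delay per stage.
    period = ring_oscillator_period(t_p=50e-12, n_stages=5)
    print(f"T = {period * 1e12:.0f} ps, "
          f"f = {ring_oscillator_frequency(50e-12, 5) / 1e9:.1f} GHz")
```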
Voltage-Controlled Oscillator (VCO): The oscillation frequency of a VCO is a function (typically nonlinear) of a control voltage. The delay of a current-starved inverter depends on the limited current available to discharge the load capacitance of the gate.
Current-Starved Inverter Simulation (figure: tpHL in ns versus Vctrl in V). The device is in the subthreshold region when Vctrl is smaller than VT, resulting in large variations of tp because the drive current depends exponentially on the drive voltage. In this region the delay is sensitive to noise and variation in Vctrl.
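The sensitivity argument can be made concrete with a toy delay model. The sketch below assumes tpHL is roughly C_L x (VDD / 2) / I, with the starving current exponential in Vctrl below VT and quadratic above it. Every parameter value (threshold, gain factor, leakage scale, slope factor, load capacitance) is a hypothetical placeholder, so only the trend is meaningful, not the absolute delays.

```python
from math import exp


def starved_current(v_ctrl, v_t=0.43, k=100e-6, i0=1e-7, n=1.5, v_thermal=0.026):
    """Toy model of the current limit set by Vctrl (all parameters assumed)."""
    if v_ctrl < v_t:
        # Subthreshold: current depends exponentially on the control voltage.
        return i0 * exp((v_ctrl - v_t) / (n * v_thermal))
    # Above threshold: square-law current, continuous with i0 at v_ctrl = v_t.
    return i0 + 0.5 * k * (v_ctrl - v_t) ** 2


def t_phl(v_ctrl, c_load=10e-15, v_dd=2.5):
    """First-order high-to-low delay: time to discharge C_L by VDD / 2."""
    return c_load * (v_dd / 2) / starved_current(v_ctrl)


def delay_ratio(v_ctrl, dv=0.05):
    """Factor by which the delay shrinks when Vctrl rises by dv."""
    return t_phl(v_ctrl) / t_phl(v_ctrl + dv)


if __name__ == "__main__":
    for v in (0.3, 1.0):
        print(f"Vctrl = {v:.1f} V: a 50 mV step changes tpHL by x{delay_ratio(v):.2f}")
```

In this model a 50 mV disturbance on Vctrl changes the delay far more in the subthreshold region than above threshold, which is the reason for keeping the control voltage out of that region.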
Differential Delay Element and VCO (figure: a differential delay cell with control voltage Vctrl, and a two-stage VCO with nodes v1 to v4; "-" marks inverting inputs/outputs and "+" marks non-inverting inputs/outputs). An oscillator with an even number of stages can be implemented. A differential VCO has better immunity to common-mode noise (e.g., supply noise) but consumes more power.
2-Stage VCO Simulation (figure: simulated waveforms V1 to V4 versus time in ns). The in-phase and quadrature-phase signals are produced simultaneously.