COMBINATIONAL LOGIC - 2
Sizing Logic Path for Speed
Sizing Logic Paths for Speed
• Frequently, the input capacitance of a logic path is constrained.
• The logic also has to drive some load capacitance.
• Example: the ALU load in an Intel microprocessor is 0.5 pF.
• How do we size the ALU datapath to achieve maximum speed?
• We have already solved this for the inverter chain. Can we generalize it to any type of logic?
Buffer Example
To find N: $f_i = C_{i+1}/C_i \approx 4$
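Buffer Example Generalisation
[Figure: chain of N inverters, stages 1, 2, …, N, from In to Out, driving the load $C_L$]
Rewrite the delay in a new form (in units of $t_{inv}$): $d = \sum_{i=1}^{N} (p_i + g_i f_i)$
For the inverter:
• $p_i$ = internal delay = 1
• $g_i$ = gate to internal capacitance ratio = $1/\gamma$ = 1 for $\gamma = 1$
How do we generalize this to any logic path?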
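Delay in Logic Gates
• Gate delay: $d = p + g f$, where $p$ is the intrinsic delay and $g$ is the logical effort
• $f$ = effective fanout ($C_{out}/C_{in}$)
• Logical effort is a function of topology, independent of sizing
• Effective fanout (electrical effort) is a function of load/gate size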
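The gate delay model on this slide can be sketched in a few lines. This is a minimal illustration assuming the form d = p + g·f in units of t_inv; the function name `gate_delay` and the fanout of 4 are hypothetical choices, while g = 4/3 and p = 2 are the 2-input NAND values quoted in this section.

```python
def gate_delay(g, p, f):
    """Normalized gate delay d = p + g*f, in units of t_inv."""
    return p + g * f


# Inverter (g = 1, p = 1) and 2-input NAND (g = 4/3, p = 2),
# both driving an assumed effective fanout of 4.
inv_delay = gate_delay(g=1.0, p=1.0, f=4.0)
nand2_delay = gate_delay(g=4.0 / 3.0, p=2.0, f=4.0)

print(f"inverter: {inv_delay:.2f} t_inv, NAND2: {nand2_delay:.2f} t_inv")
```

For the same fanout, the NAND is slower both through its larger slope (g) and its larger intercept (p).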
Electrical Effort
Estimates of intrinsic delay factors of various logic types, assuming simple layout styles and a fixed PMOS/NMOS ratio.
Logical Effort
• The inverter has the smallest logical effort and intrinsic delay of all static CMOS gates.
• The logical effort of a gate is the ratio of its input capacitance to that of an inverter sized to deliver the same current.
• Logical effort increases with gate complexity.
Logical Effort
[Figure: inverter (g = 1), 2-input NAND (g = 4/3), 2-input NOR (g = 5/3)]
Logical effort is the ratio of the input capacitance of a gate to the input capacitance of an inverter with the same output current.
Logical Effort of Gates
Logical Effort of Gates
[Figure: normalized delay (d) versus fan-out (h) for an inverter and a NAND, with intercepts pINV and pNAND; the values of g, p, and d for each gate are left blank to be filled in]
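Logical Effort of Gates
[Figure: normalized delay (d) versus fan-out (labeled h on the axis; this is the effective fanout f used in this section), with intercepts pINV and pNAND]
• 2-input NAND: g = 4/3, p = 2, d = (4/3)f + 2
• Inverter: g = 1, p = 1, d = gf + 1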
Logical Effort of Gates
Add Branching Effort
Branching effort: $b = \dfrac{C_{\text{on-path}} + C_{\text{off-path}}}{C_{\text{on-path}}}$
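Multistage Networks
• Stage effort: $h_i = g_i f_i$
• Path electrical effort: $F = C_{out}/C_{in}$ (equal to $f_1 f_2 \cdots f_N$ when there is no branching)
• Path logical effort: $G = g_1 g_2 \cdots g_N$
• Branching effort: $B = b_1 b_2 \cdots b_N$
• Path effort: $H = GFB$ (for an inverter chain, $H = F$)
• Path delay: $D = \sum d_i = \sum p_i + \sum h_i$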
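As a rough illustration of these path quantities, the sketch below computes the path effort and path delay from per-stage lists. The function names `path_effort` and `path_delay` and the three-inverter example are assumptions for illustration only.

```python
from math import prod


def path_effort(g, b, F):
    """Path effort H = G * B * F from per-stage logical and branching efforts."""
    G = prod(g)  # path logical effort
    B = prod(b)  # path branching effort
    return G * B * F


def path_delay(g, f, p):
    """Path delay D = sum(p_i) + sum(g_i * f_i), in units of t_inv."""
    return sum(p) + sum(gi * fi for gi, fi in zip(g, f))


# Hypothetical example: three inverters, no branching, each with fanout 4.
H = path_effort(g=[1, 1, 1], b=[1, 1, 1], F=64)
D = path_delay(g=[1, 1, 1], f=[4, 4, 4], p=[1, 1, 1])
print(H, D)
```

For an inverter chain G = B = 1, so the path effort reduces to the electrical effort F.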
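Optimum Effort per Stage
When each stage bears the same effort: $h^N = H$, so $h = H^{1/N}$
Stage efforts: $g_1 f_1 = g_2 f_2 = \cdots = g_N f_N$
Effective fanout of each stage: $f_i = h/g_i$
Minimum path delay: $\hat{D} = \sum (g_i f_i + p_i) = N H^{1/N} + P$, where $P = \sum p_i$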
Optimal Number of Stages
• For a given load and a given input capacitance of the first gate, find the optimal number of stages and the optimal sizing.
• Substitute the ‘best stage effort’ $h = H^{1/\hat{N}}$.
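One way to explore the optimal stage count numerically is a direct search. The sketch below assumes every stage contributes the same parasitic delay p_inv (as in an inverter chain), so that D(N) = N·H^(1/N) + N·p_inv; the function name `best_num_stages` and the search limit are hypothetical.

```python
def best_num_stages(H, p_inv=1.0, max_stages=20):
    """Return the stage count N that minimizes D(N) = N * H**(1/N) + N * p_inv."""

    def delay(n):
        return n * H ** (1.0 / n) + n * p_inv

    return min(range(1, max_stages + 1), key=delay)


n_opt = best_num_stages(64.0)
stage_effort = 64.0 ** (1.0 / n_opt)
print(n_opt, round(stage_effort, 2))
```

For a path effort of 64 the search settles on three stages, giving a stage effort close to 4.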
Example: Optimize Path
Find the effective fanout F, then G, H, and h, and the gate sizes a and b.
Example: Optimize Path
• Effective fanout: F = 5
• G = 25/9
• H = 125/9 = 13.9
• h = 1.93
• a = 1.93
• b = ha/g2 = 2.23
• c = hb/g3 = 5g4/h = 2.59
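The numbers in this example can be reproduced with a short script. The logical efforts g1 = 1, g2 = g3 = 5/3, g4 = 1 are inferred from G = 25/9 and the expressions for b and c, and the first gate's input capacitance is normalized to 1; both are assumptions, since the circuit figure is not reproduced here. Evaluated exactly, b comes out near 2.236, which the slide lists as 2.23.

```python
# Logical efforts of the four stages (inferred from G = 25/9; assumed values).
g = [1.0, 5.0 / 3.0, 5.0 / 3.0, 1.0]
F = 5.0  # path electrical effort C_out / C_in, with C_in normalized to 1

G = 1.0
for gi in g:
    G *= gi  # path logical effort

H = G * F                # path effort (no branching)
h = H ** (1.0 / len(g))  # optimum stage effort

# Work forward from the input: each stage drives h / g_i times its own size.
a = h * 1.0 / g[0]
b = h * a / g[1]
c = h * b / g[2]

print(f"H = {H:.2f}, h = {h:.2f}, a = {a:.2f}, b = {b:.2f}, c = {c:.2f}")
```

The last stage provides a consistency check: c computed forward matches 5·g4/h computed back from the load.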
Example – 8-input AND
Method of Logical Effort
• Compute the path effort: $H = GBF$
• Find the best number of stages: $N \approx \log_4 H$
• Compute the stage effort: $h = H^{1/N}$
• Sketch the path with this number of stages
• Working from either end, find the sizes: $C_{in} = C_{out} \cdot g/h$
Reference: Sutherland, Sproull, Harris, “Logical Effort,” Morgan-Kaufmann, 1999.
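The sizing step of the method can be sketched as a backward pass from the load. This is a minimal sketch that assumes no branching (B = 1) and a stage count fixed by the length of the list of logical efforts; the function name `size_path` and the example capacitances are hypothetical.

```python
def size_path(g, c_load, c_in):
    """Return each stage's input capacitance, working back from the load."""
    G = 1.0
    for gi in g:
        G *= gi
    H = G * (c_load / c_in)  # path effort, no branching assumed
    h = H ** (1.0 / len(g))  # stage effort

    sizes = []
    c_out = c_load
    for gi in reversed(g):
        c_stage = c_out * gi / h  # C_in = C_out * g / h
        sizes.append(c_stage)
        c_out = c_stage
    return list(reversed(sizes))


# Hypothetical example: three inverters driving 64 unit loads from a unit input.
sizes = size_path([1.0, 1.0, 1.0], c_load=64.0, c_in=1.0)
print([round(s, 3) for s in sizes])
```

The first entry of the result should land on the given input capacitance, which is a quick sanity check on the computed stage effort.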
Summary
Notation relative to Sutherland, Sproull, and Harris: where this section uses F and f, the book uses H and h; where this section uses H and h, the book uses F and f.
Ratioed Logic
Overview
Static CMOS Circuit
At every point in time (except during the switching transients), each gate output is connected to either $V_{DD}$ or $V_{SS}$ via a low-resistance path. The outputs of the gates assume at all times the value of the Boolean function implemented by the circuit (ignoring, once again, the transient effects during switching periods). This is in contrast to the dynamic circuit class, which relies on temporary storage of signal values on the capacitance of high-impedance circuit nodes.
Static CMOS
Alternatives to Static CMOS
• Static CMOS is robust and reliable
• But it is:
  » Large (2N transistors)
  » Slow (large capacitance)
• Hence a quest for alternative logic styles that are smaller, faster, or lower power.
Ratioed Logic
Active Loads
Load Lines of Ratioed Gates
Pseudo-NMOS
Pseudo-NMOS VTC
Pseudo-NMOS Performance
Pseudo-NMOS NAND Gate
[Figure: pseudo-NMOS NAND gate, with the VDD and GND rails labeled]
Improved Loads (1)
Improved Loads (2)
Differential Cascode Voltage Switch Logic (DCVSL)
[Figure: DCVSL gate. Loads M1 and M2 connect to $V_{DD}$; pull-down networks PDN1 and PDN2, driven by inputs A, B and their complements, connect to $V_{SS}$; the outputs are Out and its complement.]
DCVSL Example
DCVSL AND/NAND Gate
DCVSL NAND/AND Transient Response
[Figure: voltage [V] versus time [ns] for the inputs A, B and the gate outputs]
Pass Transistor Logic
Pass-Transistor Logic
Complementary Pass Transistor Logic
Complementary Pass Transistor Logic (Example)
4 Input NAND in CPL
NMOS-Only Logic
[Figure: voltage [V] versus time [ns] for the nodes In, x, and Out]
NMOS-only Switch
[Figure: NMOS pass transistor $M_n$ with A = 2.5 V driving node B and the load $C_L$]
$V_B$ does not pull up to 2.5 V, but to 2.5 V - $V_{TN}$
• Threshold voltage loss causes static power consumption
• The NMOS has a higher threshold than the PMOS (body effect)
Cascading Rules for PTL
Solution 1: NMOS-Only Logic: Level Restoring Transistor
• Advantage: Full Swing
• Disadvantage: More Complex, Larger Capacitance
• Careful sizing of Mr is required
Level Restoring Transistor Size
$V_X = V_{DD} \, R_n/(R_r + R_n) < V_M$
• Advantage: Full Swing
• Restorer adds capacitance, takes away pull-down current at X
• Ratio problem
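The ratio condition on this slide can be checked numerically. The sketch below treats the pass-transistor pull-down and the restorer as simple resistors R_n and R_r forming a divider at node X; the function names and the resistance values are hypothetical, chosen only to show one passing and one failing case.

```python
def restorer_node_voltage(v_dd, r_n, r_r):
    """Voltage at node X from the resistive divider: V_DD * R_n / (R_r + R_n)."""
    return v_dd * r_n / (r_r + r_n)


def pull_down_wins(v_dd, r_n, r_r, v_m):
    """True if node X can be pulled below the inverter switching threshold V_M."""
    return restorer_node_voltage(v_dd, r_n, r_r) < v_m


# Assumed values: V_DD = 2.5 V, V_M = 1.25 V, pull-down resistance 1 kOhm.
weak_restorer_ok = pull_down_wins(2.5, r_n=1e3, r_r=3e3, v_m=1.25)
strong_restorer_ok = pull_down_wins(2.5, r_n=1e3, r_r=0.5e3, v_m=1.25)
print(weak_restorer_ok, strong_restorer_ok)
```

A restorer that is too strong (small R_r) keeps X above V_M, so the following inverter never switches; this is the ratio problem the slide names.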
Level Restoring Transistor
[Figure: voltage versus t (nsec), with and without the level restorer: (a) output node; (b) intermediate node X]
Level Restoring Transistor Size
$V_M$ (inv) = $V_{DD}/2$
• Upper limit on restorer size
• Pass-transistor pull-down can have several transistors in stack
Solution 2: Zero Threshold (VT = 0) Pass Transistor
Solution 2: Single Transistor Pass Gate with VT = 0
WATCH OUT FOR LEAKAGE CURRENTS
Solution 3: Transmission Gate
[Figure: transmission gate passing A = 2.5 V to node B with load $C_L$; control signals C = 2.5 V and $\bar{C}$ = 0 V]
Resistance of Transmission Gate
Pass-Transistor Based Multiplexer
[Figure: multiplexer with inputs In1 and In2, select signals S and $\bar{S}$, and the VDD and GND rails]
Transmission Gate XOR
[Figure: transmission gate XOR with inputs A and B, output F, and transistors M1, M2, and M3/M4]
Transmission Gate Full Adder
• $P = A \oplus B$
• $S = C_i \oplus P$
• $C_o = (P + A) \cdot (\bar{P} + C_i)$, which selects A when P = 0 and $C_i$ when P = 1
Similar delays for sum and carry
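The full-adder equations can be checked exhaustively against binary addition. The sketch below reads the carry as the multiplexer form Co = (P + A)·(P̄ + Ci), with the complement on the second P; that reading is an assumption about the slide's notation, and the function name `tg_full_adder` is hypothetical.

```python
from itertools import product


def tg_full_adder(a, b, ci):
    """Sum and carry from the propagate signal P = A xor B."""
    p = a ^ b                      # propagate
    s = ci ^ p                     # sum: S = Ci xor P
    co = (p | a) & ((1 - p) | ci)  # carry: (P + A) . (not P + Ci)
    return s, co


# Compare against ordinary addition for all eight input combinations.
mismatches = [
    (a, b, ci)
    for a, b, ci in product((0, 1), repeat=3)
    if tg_full_adder(a, b, ci) != ((a + b + ci) % 2, (a + b + ci) // 2)
]
print("mismatches:", mismatches)
```

Both outputs depend on P in the same way, which is consistent with the slide's remark that sum and carry have similar delays.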
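Delay in Transmission Gate Networks
[Figure: (a) chain of n transmission gates driven by In, with control inputs at 2.5 V, node voltages $V_1, \ldots, V_n$, and a capacitance C at each node; (b) equivalent RC network with resistances $R_{eq}$ and capacitances C; (c) chain of $R_{eq}$ and C elements grouped in sections of m]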
Elmore Delay
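As a rough illustration, the Elmore time constant of a chain of n identical transmission gates can be computed by summing, for each node, its capacitance times the resistance between it and the input. The sketch below assumes each gate is modeled as a resistance R_eq with a capacitance C at its output, and uses the common 0.69 factor to convert the time constant into a propagation delay; the function names and the component values (8 kΩ, 3.6 fF, 16 gates) are illustrative assumptions.

```python
def elmore_chain_time_constant(n, r_eq, c):
    """Elmore time constant of n identical RC sections: sum over k of k * R_eq * C."""
    return sum(k * r_eq * c for k in range(1, n + 1))


def chain_propagation_delay(n, r_eq, c):
    """Approximate propagation delay, 0.69 times the Elmore time constant."""
    return 0.69 * elmore_chain_time_constant(n, r_eq, c)


# Assumed values: 16 transmission gates, R_eq = 8 kOhm, C = 3.6 fF.
t_p = chain_propagation_delay(16, r_eq=8e3, c=3.6e-15)
print(f"t_p = {t_p * 1e9:.2f} ns")
```

The sum equals C·R_eq·n(n+1)/2, so the delay grows quadratically with the number of switches in the chain.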
Delay Optimization
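One common way to tame the quadratic growth is to break the chain with a buffer after every m switches. The sketch below is an assumption-laden model, not taken from the slide: each section of m switches contributes 0.69·C·R_eq·m(m+1)/2, each buffer adds a fixed delay t_buf, and n/m is treated as a real number. The function names and the values (8 kΩ, 3.6 fF, 0.16 ns) are hypothetical.

```python
def buffered_chain_delay(n, m, r_eq, c, t_buf):
    """Delay of n switches split into sections of m, with a buffer between sections."""
    sections = n / m  # treated as a real number for simplicity
    rc_delay = 0.69 * sections * r_eq * c * m * (m + 1) / 2
    return rc_delay + (sections - 1) * t_buf


def best_section_length(n, r_eq, c, t_buf):
    """Search the integer section lengths 1..n for the smallest total delay."""
    return min(
        range(1, n + 1),
        key=lambda m: buffered_chain_delay(n, m, r_eq, c, t_buf),
    )


# Assumed values: 16 switches, R_eq = 8 kOhm, C = 3.6 fF, buffer delay 0.16 ns.
m_best = best_section_length(16, r_eq=8e3, c=3.6e-15, t_buf=0.16e-9)
print(m_best)
```

Under this model the delay becomes linear in n, and the best m trades the RC delay of longer sections against the cost of more buffers.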