CMOS Inverter: Dynamic V DD RnRn V out = 0 V in = V DD CLCL t pHL = f(R n, C L ) Transient, or dynamic, response determines the maximum speed at which a device can be operated.
Sources of Resistance MOS structure resistance - R on Source and drain resistance Contact (via) resistance Wiring resistance Top view Drain n+Source n+ W L Poly Gate
MOS Structure Resistance The simplest model assumes the transistor is a switch with an infinite “off” resistance and a finite “on” resistance R on However R on is nonlinear, so use instead the average value of the resistances, R eq, at the end-points of the transition (V DD and V DD /2) R eq = ½ (R on (t 1 ) + R on (t 2 )) R eq = ¾ V DD /I DSAT (1 – 5/6 V DD ) S D R on V GS V T
Equivalent MOS Structure Resistance For V DD >>V T +V DSAT /2, R eq is independent of V DD (see plot). Only a minor improvement in R eq occurs when V DD is increased (due to channel length modulation) Once the supply voltage approaches V T, R eq increases dramatically V DD (V) NMOS(k ) PMOS (k ) R eq (for W/L = 1), for larger devices divide R eq by W/L V DD (V) R eq (Ohm) x10 5 (for V GS = V DD, V DS = V DD V DD /2) The on resistance is inversely proportional to W/L. Doubling W halves R eq
Source and Drain Resistance More pronounced with scaling since junctions are shallower With silicidation R is reduced to the range 1 to 4 / RSRS RDRD S G D R S,D = (L S,D /W)R where L S,D is the length of the source or drain diffusion R is the sheet resistance of the source or drain diffusion (20 to 100 / )
Contact Resistance Transitions between routing layers (contacts through via’s) add extra resistance to a wire l keep signals wires on a single layer whenever possible l avoid excess contacts l reduce contact resistance by making vias larger (beware of current crowding that puts a practical limit on the size of vias) or by using multiple minimum-size vias to make the contact Typical contact resistances, R C, (minimum-size) l 5 to 20 for metal or poly to n+, p+ diffusion and metal to poly l 1 to 5 for metal to metal contacts More pronounced with scaling since contact openings are smaller
Inverter Propagation Delay Propagation delay is proportional to the time-constant of the network formed by the pull-down resistor and the load capacitance t pHL = ln(2) R eqn C L = 0.69 R eqn C L t pLH = ln(2) R eqp C L = 0.69 R eqp C L t p = (t pHL + t pLH )/2 = 0.69 C L (R eqn + R eqp )/2 To equalize rise and fall times make the on-resistance of the NMOS and PMOS approximately equal. V DD RnRn V out = 0 V in = V DD CLCL t pHL = f(R n, C L )
Inverter Transient Response V in V out (V) t (sec) x V DD =2.5V 0.25 m W/L n = 1.5 W/L p = 4.5 R eqn = 13 k ( 1.5) R eqp = 31 k ( 4.5)
Inverter Transient Response V in V out (V) t (sec) x V DD =2.5V 0.25 m W/L n = 1.5 W/L p = 4.5 R eqn = 13 k ( 1.5) R eqp = 31 k ( 4.5) t pHL = 36 psec t pLH = 29 psec so t p = 32.5 psec tftf trtr t pHL t pLH From simulation: t pHL = 39.9 psec and t pLH = 31.7 psec
Inverter Propagation Delay, Revisited To see how a designer can optimize the delay of a gate have to expand the R eq in the delay equation V DD (V) t p(normalized) t pHL = 0.69 R eqn C L = 0.69 (3/4 (C L V DD )/I DSATn ) 0.52 C L / (W/L n k’ n V DSATn )
Design for Performance Reduce C L l internal diffusion capacitance of the gate itself -keep the drain diffusion as small as possible l interconnect capacitance l fanout Increase W/L ratio of the transistor l the most powerful and effective performance optimization tool in the hands of the designer l watch out for self-loading! – when the intrinsic capacitance dominates the extrinsic load Increase V DD l can trade-off energy for performance l increasing V DD above a certain level yields only very minimal improvements l reliability concerns enforce a firm upper bound on V DD
NMOS/PMOS Ratio If speed is the only concern, reduce the width of the PMOS device! l widening the PMOS degrades the t pHL due to larger parasitic capacitance = (W/L p )/(W/L n ) r = R eqp /R eqn (resistance ratio of identically-sized PMOS and NMOS) opt = r when wiring capacitance is negligible So far have sized the PMOS and NMOS so that the R eq ’s match (ratio of 3 to 3.5) l symmetrical VTC l equal high-to-low and low-to-high propagation delays
PMOS/NMOS Ratio Effects = (W/L p )/(W/L n ) t p (sec) x of 2.4 (= 31 k /13 k ) gives symmetrical response of 1.6 to 1.9 gives optimal performance t pLH tptp t pHL
Device Sizing for Performance Divide capacitive load, C L, into l C int : intrinsic - diffusion and Miller effect l C ext : extrinsic - wiring and fanout t p = 0.69 R eq C int (1 + C ext /C int ) = t p0 (1 + C ext /C int ) l where t p0 = 0.69 R eq C int is the intrinsic (unloaded) delay of the gate Widening both PMOS and NMOS by a factor S reduces R eq by an identical factor (R eq = R ref /S), but raises the intrinsic capacitance by the same factor (C int = SC iref ) t p = 0.69 R ref C iref (1 + C ext /(SC iref )) = t p0 (1 + C ext /(SC iref )) l t p0 is independent of the sizing of the gate; with no load the drive of the gate is totally offset by the increased capacitance l any S sufficiently larger than (C ext /C int ) yields the best performance gains with least area impact
Sizing Impacts on Delay S t p (sec) x The majority of the improvement is already obtained for S = 5. Sizing factors larger than 10 barely yield any extra gain (and cost significantly more area). for a fixed load self-loading effect (intrinsic capacitance dominates)
Impact of Fanout on Delay Extrinsic capacitance, C ext, is a function of the fanout of the gate - the larger the fanout, the larger the external load. First determine the input loading effect of the inverter. Both C g and C int are proportional to the gate sizing, so C int = C g is independent of gate sizing and t p = t p0 (1 + C ext / C g ) = t p0 (1 + f/ ) i.e., the delay of an inverter is a function of the ratio between its external load capacitance and its input gate capacitance: the effective fan-out f f = C ext /C g
Inverter Chain If C L is given l How should the inverters be sized? l How many stages are needed to minimize the delay? InOut CLCL Real goal is to minimize the delay through an inverter chain the delay of the j-th inverter stage is t p,j = t p0 (1 + C g,j+1 /( C g,j )) = t p0 (1 + f j / ) and t p = t p1 + t p t pN so t p = t p,j = t p0 (1 + C g,j+1 /( C g,j )) C g,1 12N
Sizing the Inverters in the Chain The optimum size of each inverter is the geometric mean of its neighbors – meaning that if each inverter is sized up by the same factor f wrt the preceding gate, it will have the same effective fan-out and the same delay f = C L /C g,1 = F where F represents the overall effective fan-out of the circuit (F = C L /C g,1 ) and the minimum delay through the inverter chain is t p = N t p0 (1 + ( F ) / ) The relationship between t p and F is linear for one inverter, square root for two, etc. NN N
Example of Inverter Chain Sizing C L /C g,1 has to be evenly distributed over N = 3 inverters C L /C g,1 = 8/1 f = InOut C L = 8 C g,1 C g,1 1
Example of Inverter Chain Sizing C L /C g,1 has to be evenly distributed over N = 3 inverters C L /C g,1 = 8/1 f = InOut C L = 8 C g,1 C g,1 1 f = 2f 2 = 4 3 8 = 2
Determining N: Optimal Number of Inverters What is the optimal value for N given F (=f N ) ? l if the number of stages is too large, the intrinsic delay of the stages becomes dominate l if the number of stages is too small, the effective fan-out of each stage becomes dominate NN The optimum N is found by differentiating the minimum delay expression divided by the number of stages and setting the result to 0, giving + F - ( F lnF)/N = 0 For = 0 (ignoring self-loading) N = ln (F) and the effective-fan out becomes f = e = For = 1 (the typical case) the optimum effective fan-out (tapering factor) turns out to be close to 3.6
Optimum Effective Fan-Out Choosing f larger than optimum has little effect on delay and reduces the number of stages (and area). l Common practice to use f = 4 (for = 1) l But too many stages has a substantial negative impact on delay F opt f normalized delay
Example of Inverter (Buffer) Staging C L = 64 C g,1 C g,1 = 1 1 C L = 64 C g,1 C g,1 = C L = 64 C g,1 C g,1 = C L = 64 C g,1 C g,1 = N f t p
Impact of Buffer Staging for Large C L Impressive speed-ups with optimized cascaded inverter chain for very large capacitive loads. F ( = 1) UnbufferedTwo Stage Chain Opt. Inverter Chain , ,00010,
Input Signal Rise/Fall Time In reality, the input signal changes gradually (and both PMOS and NMOS conduct for a brief time). This affects the current available for charging/discharging C L and impacts propagation delay. t s (sec) t p (sec) x for a minimum-size inverter with a fan-out of a single gate t p increases linearly with increasing input slope, t s, once t s > t p t s is due to the limited driving capability of the preceding gate
Design Challenge A gate is never designed in isolation: its performance is affected by both the fan-out and the driving strength of the gate(s) feeding its inputs. t i p = t i step + t i-1 step ( 0.25) Keep signal rise times smaller than or equal to the gate propagation delays. l good for performance l good for power consumption Keeping rise and fall times of the signals small and of approximately equal values is one of the major challenges in high-performance designs - slope engineering.
Delay with Long Interconnects When gates are farther apart, wire capacitance and resistance can no longer be ignored. t p = 0.69R dr C int + (0.69R dr +0.38R w )C w (R dr +R w )C fan where R dr = (R eqn + R eqp )/2 = 0.69R dr (C int +C fan ) (R dr c w +r w C fan )L r w c w L 2 c int V in c fan (r w, c w, L) V out Wire delay rapidly becomes the dominate factor (due to the quadratic term) in the delay budget for longer wires.