9/23-30/04ELEC / ELEC / (Fall 2004) Advanced Topics in Electrical Engineering Designing VLSI for Low-Power and Self-Test Power Consumption in a CMOS Circuit Vishwani D. Agrawal James J. Danaher Professor Department of Electrical and Computer Engineering Auburn University
Motivation
  Low-power applications
    – Remote systems (e.g., satellite)
    – Portable systems (e.g., mobile phone)
  Methods of low-power design
    – Reduced supply voltage
    – Adiabatic switching
    – Clock suppression
    – Logic design for reduced activity
    – Reduce hazards (40% in arithmetic logic)
    – Software techniques
  Reference: Chandrakasan and Brodersen
Low-Power Design
  Design practices that reduce power consumption by at least one order of magnitude; in practice, a 50% reduction is often acceptable.
  General topics
    – High-level and software techniques
    – Gate and circuit-level methods
    – Power estimation techniques
    – Test power
VLSI Chip Power Density
  [Figure: power density (W/cm²) versus year for Intel processors, including the Pentium®, with reference levels for a hot plate, a nuclear reactor, a rocket nozzle, and the Sun's surface. Source: Intel]
Specific Topics on Low-Power
  Power dissipation in CMOS circuits
  Low-power CMOS technologies
  Dynamic reduction techniques
  Leakage power
  Power estimation
Components of Power
  Dynamic
    – Signal transitions: logic activity, glitches
    – Short-circuit
  Static
    – Leakage
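Power of a Transition
  [Figure: CMOS gate between V_DD and ground, with pull-up and pull-down resistances R, load capacitance C_L, input V_i, output V_o, and short-circuit current i_sc]
  Power = $C_L V_{DD}^2/2 + P_{sc}$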
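Short Circuit Current, $i_{sc}(t)$
  [Figure: input $V_i(t)$, output $V_o(t)$ (volts) and short-circuit current $i_{sc}(t)$ (amperes) versus time (ns). The voltage levels $V_{Tn}$, $V_{DD} - V_{Tp}$ and $V_{DD}$ are marked; $i_{sc}(t)$ flows between $t_B$ and $t_E$ with peak $I_{scmaxr}$; the current axis is marked at 45 μA.]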
Peak Short Circuit Current
  Increases with the size (or gain, β) of transistors
  Decreases with load capacitance, $C_L$
  Largest when $C_L = 0$
  Reference: M. A. Ortega and J. Figueras, “Short Circuit Power Modeling in Submicron CMOS,” PATMOS’96, Aug. 1996, pp
Short-Circuit Energy per Transition
  $E_{scr} = \int_{t_B}^{t_E} V_{DD}\, i_{sc}(t)\, dt = (t_E - t_B)\, I_{scmaxr} V_{DD}/2$
  $E_{scr} = t_r (V_{DD} + V_{Tp} - V_{Tn})\, I_{scmaxr}/2$
  $E_{scf} = t_f (V_{DD} + V_{Tp} - V_{Tn})\, I_{scmaxf}/2$
  $E_{scf} = 0$ when $V_{DD} = |V_{Tp}| + V_{Tn}$
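The triangular approximation above can be evaluated numerically. The following is a minimal sketch, not the authors' code: the function name and the supply, threshold, and rise-time values are assumptions chosen for illustration; only the 45 μA peak is taken from the current plot.

```python
def short_circuit_energy(t_edge, vdd, vtn, vtp, i_peak):
    """Triangle approximation: E = t_edge * (VDD + VTp - VTn) * I_peak / 2.

    vtp is the (negative) PMOS threshold, matching the V_DD + V_Tp - V_Tn term.
    Returns 0 when the supply is too low for both transistors to conduct together.
    """
    overlap = vdd + vtp - vtn
    if overlap <= 0:
        return 0.0
    return t_edge * overlap * i_peak / 2.0


# Assumed example: 1 ns input rise time, 3.3 V supply, thresholds of +/-0.7 V.
e_rise = short_circuit_energy(1e-9, vdd=3.3, vtn=0.7, vtp=-0.7, i_peak=45e-6)

# With VDD below |VTp| + VTn the short-circuit energy vanishes.
e_low_vdd = short_circuit_energy(1e-9, vdd=1.0, vtn=0.7, vtp=-0.7, i_peak=45e-6)
```

With these assumed values the energy per rising transition is about 43 fJ, and it drops to zero once the supply is at or below the sum of the threshold magnitudes.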
Short-Circuit Energy
  Increases with rise and fall times of input
  Decreases for larger output load capacitance
  Decreases and eventually becomes zero when $V_{DD}$ is scaled down but the threshold voltages are not scaled down
Short-Circuit Power Calculation
  Assume equal rise and fall times
  Model input-output capacitive coupling (Miller capacitance)
  Use a Spice model for transistors
    – T. Sakurai and A. Newton, “Alpha-power Law MOSFET model and Its Application to a CMOS Inverter,” IEEE J. Solid State Circuits, vol. 25, April 1990, pp
$P_{sc}$ vs. C
  [Figure: $P_{sc}/P_{total}$ (0% to 45%) versus load capacitance C (fF) in 0.7μ CMOS, for input rise times between 0.5 ns and 3 ns]
Technology Scaling
  Scale down by factors of 2 and 4, i.e., model 0.7, 0.35 and 0.17 micron technologies
  Constant electric field assumed
  Capacitance scaled down by the technology scale-down factor
Technology Scaling Results
  [Figure: $P_{sc}/P_{total}$ (0% to 70%) versus input rise time $t_r$ (ns), with curves for L = 0.7μ, C = 40 fF; L = 0.35μ, C = 20 fF; and L = 0.17μ, C = 10 fF]
Effects of Scaling Down
  1-16% short-circuit power at 0.7 micron
  4-37% at 0.35 micron
  12-60% at 0.17 micron
  Reference: S. R. Vemuru and N. Steinberg, “Short Circuit Power Dissipation Estimation for CMOS Logic Gates,” IEEE Trans. on Circuits and Systems I, vol. 41, Nov. 1994, pp
Summary: Short-Circuit Power
  Short-circuit power is consumed by each transition (increases with input transition time).
  Reduction requires that the gate output transition should not be faster than the input transition (faster gates can consume more short-circuit power).
  Scaling down the supply voltage with respect to the threshold voltages reduces short-circuit power.
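Leakage Power
  [Figure: MOS transistor cross-section (n+ regions) connected to $V_{DD}$ through R and to ground, showing the leakage currents $I_G$, $I_D$, $I_{sub}$, $I_{PT}$ and $I_{GIDL}$]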
Leakage Current Components
  Subthreshold conduction, $I_{sub}$
  Reverse bias pn junction conduction, $I_D$
  Gate induced drain leakage, $I_{GIDL}$, due to tunneling at the gate-drain overlap
  Drain source punchthrough, $I_{PT}$, due to short channel and high drain-source voltage
  Gate tunneling, $I_G$, through thin oxide
Subthreshold Current
  $I_{sub} = \mu_0 C_{ox} (W/L)\, V_t^2 \exp\{(V_{GS} - V_{TH})/nV_t\}$
  $\mu_0$: carrier surface mobility
  $C_{ox}$: gate oxide capacitance per unit area
  L: channel length
  W: gate width
  $V_t = kT/q$: thermal voltage
  n: a technology parameter
$I_{DS}$ for Short Channel Device
  $I_{sub} = \mu_0 C_{ox} (W/L)\, V_t^2 \exp\{(V_{GS} - V_{TH} + \eta V_{DS})/nV_t\}$
  $V_{DS}$: drain to source voltage
  $\eta$: a proportionality factor
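The exponential dependence on the threshold voltage is easier to see with numbers. This is a small sketch of the subthreshold expression with the short-channel term; every device parameter value below (n, the mobility-capacitance product, W/L, η, the voltages) is an assumed placeholder rather than data from the slides.

```python
import math

BOLTZMANN_OVER_Q = 8.617333262e-5  # k/q in volts per kelvin


def subthreshold_current(vgs, vth, vds=0.0, eta=0.0, n=1.5,
                         mu0_cox=1e-4, w_over_l=2.0, temp_k=300.0):
    """I_sub = mu0*Cox*(W/L)*Vt^2*exp((VGS - VTH + eta*VDS)/(n*Vt)).

    eta = 0 gives the long-channel form; eta > 0 adds the V_DS term
    used for short-channel devices.
    """
    vt = BOLTZMANN_OVER_Q * temp_k  # thermal voltage kT/q
    exponent = (vgs - vth + eta * vds) / (n * vt)
    return mu0_cox * w_over_l * vt ** 2 * math.exp(exponent)


# Off-state leakage (VGS = 0) for two assumed thresholds.
i_high_vth = subthreshold_current(0.0, vth=0.4)
i_low_vth = subthreshold_current(0.0, vth=0.3)

# Same device with an assumed short-channel factor and 1 V across the channel.
i_short_channel = subthreshold_current(0.0, vth=0.4, vds=1.0, eta=0.05)
```

Lowering the threshold by 0.1 V in this sketch multiplies the off-state current by exp(0.1 / nV_t), which is more than an order of magnitude at room temperature for n = 1.5.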
Increased Subthreshold Leakage
  [Figure: log $I_{sub}$ versus gate voltage, with the gate-voltage axis marked at 0, $V_{TH}'$ and $V_{TH}$, a curve for the scaled device, and the current level $I_c$]
Summary: Leakage Power
  Leakage power as a fraction of the total power increases as clock frequency drops. Turning the supply off in unused parts can save power.
  For a gate it is a small fraction of the total power; it can be significant for very large circuits.
  Scaling down features requires lowering the threshold voltage, which increases leakage power; leakage roughly doubles with each shrink.
  Multiple-threshold devices are used to reduce leakage power.
Dynamic Power
  Each transition of a gate consumes $CV^2/2$.
  Methods of power saving:
    – Minimize load capacitances: transistor sizing, library-based gate selection
    – Reduce transitions: logic design, glitch reduction
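A short calculation shows why both levers (capacitance and transition count) matter, and why supply voltage matters even more. This is an illustrative sketch: the 40 fF load matches the capacitance used in the scaling slides, while the 5 V supply and the transition rate are assumed values.

```python
def transition_energy(c_load, vdd):
    """Energy consumed by one output transition: C * V^2 / 2 (joules)."""
    return 0.5 * c_load * vdd ** 2


def dynamic_power(c_load, vdd, transitions_per_second):
    """Average dynamic power of one gate output (watts)."""
    return transition_energy(c_load, vdd) * transitions_per_second


# Assumed operating point: 40 fF load, 5 V supply, 1e8 transitions per second.
p_nominal = dynamic_power(40e-15, 5.0, 1e8)

# Halving the number of transitions (e.g., by removing glitches) halves power;
# halving the supply voltage cuts the energy per transition to one quarter.
p_fewer_transitions = dynamic_power(40e-15, 5.0, 0.5e8)
energy_ratio = transition_energy(40e-15, 2.5) / transition_energy(40e-15, 5.0)
```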
Glitch Power Reduction
  Design a digital circuit for minimum transient energy consumption by eliminating hazards.
Theorem 1
  For correct operation with minimum energy consumption, a Boolean gate must produce no more than one event per transition.
Theorem 2
  Given that events occur at the input of a gate (inertial delay = $d$) at times $t_1 < \ldots < t_n$, the number of events at the gate output cannot exceed
  $\min\left(n,\ 1 + \left\lfloor \frac{t_n - t_1}{d} \right\rfloor\right)$
  [Figure: input events at $t_1, t_2, t_3, \ldots, t_n$ on a time axis, spanning the interval $t_n - t_1$]
Minimum Transient Design
  Minimum transient energy condition for a Boolean gate: $|t_i - t_j| < d$,
  where $t_i$ and $t_j$ are arrival times of input events and $d$ is the inertial delay of the gate.
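The bound of Theorem 2 and the minimum transient condition can be checked for a given set of input arrival times. This is a minimal sketch with assumed function names; it reads the bound with a floor on (t_n − t_1)/d, which is a safe reading because the number of output events is an integer.

```python
import math


def max_output_events(arrival_times, d):
    """Upper bound on output events: min(n, 1 + floor((t_n - t_1) / d))."""
    ts = sorted(arrival_times)
    if not ts:
        return 0
    spread = ts[-1] - ts[0]
    return min(len(ts), 1 + math.floor(spread / d))


def meets_min_transient_condition(arrival_times, d):
    """True when every pair of input events is closer together than d."""
    ts = sorted(arrival_times)
    return (not ts) or (ts[-1] - ts[0] < d)


# Four input events spread over 5 time units at a gate with inertial delay 2:
# the output can show at most 3 events, so glitches are possible.
bound_wide = max_output_events([0, 1, 2, 5], d=2)

# Three input events within 0.9 time units at a gate with inertial delay 1:
# at most one output event, i.e. no glitch.
bound_narrow = max_output_events([0, 0.5, 0.9], d=1)
```

When the condition |t_i − t_j| < d holds for all input pairs, the floor term is zero and the bound collapses to one event, which is the requirement of Theorem 1.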
Balanced Delay Method
  All input events arrive simultaneously
  Overall circuit delay not increased
  Delay buffers may have to be inserted
Hazard Filter Method
  Gate delay is made greater than maximum input path delay difference
  No delay buffers needed (least transient energy)
  Overall circuit delay may increase
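The trade-off between the two methods shows up already on a single two-input gate. The sketch below is illustrative only: the helper names and the delay values are assumptions, and the hazard filter uses the non-strict form (gate delay at least equal to the arrival difference).

```python
def balanced_delay_buffers(d1, d2):
    """Buffer delays to add on each input path so both events arrive together."""
    latest = max(d1, d2)
    return latest - d1, latest - d2


def hazard_filter_gate_delay(d1, d2, d):
    """Gate delay raised, if needed, to cover the input arrival difference."""
    return max(d, abs(d1 - d2))


# Assumed example: input paths of delay 1 and 3 feeding a gate of delay 1.
d1, d2, d = 1, 3, 1

# Balanced delay: one buffer of delay 2 on the fast path, gate delay unchanged.
b1, b2 = balanced_delay_buffers(d1, d2)
output_time_balanced = max(d1 + b1, d2 + b2) + d

# Hazard filter: no buffers, but the gate itself is slowed down to delay 2.
d_filtered = hazard_filter_gate_delay(d1, d2, d)
output_time_filtered = max(d1, d2) + d_filtered
```

In this example the balanced design keeps the output at time 4 at the cost of one buffer, while the hazard filter needs no buffer but delays the output to time 5.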
Linear Program
  Variables: gate and buffer delays
  Objective: minimize number of buffers
  Subject to: overall circuit delay
  Subject to: minimum transient condition for multi-input gates
  AMPL, MINOS 5.5 (Fourer, Gay and Kernighan)
Variables: Full Adder add1b
Objective Function
  Ideal: minimize the number of non-zero delay buffers
  Actual: sum of buffer delays
Specify Critical Path Delay
  Sum of delays on critical path ≤ maxdel
Multi-Input Gate Condition
  [Figure: gate of delay d whose two inputs arrive through path delays d1 and d2]
  d1 − d2 ≤ d
  d2 − d1 ≤ d
AMPL Solution: maxdel =
AMPL Solution: maxdel =
AMPL Solution: maxdel ≥
Power Estimates for add1b
  [Table: columns are maxdel; number of buffers; peak and average power* with respect to two references (model delay and unit delay). The numeric entries are not recoverable from this extraction.]
  * Hsiao et al., ICCAD-97
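Power Calculation in Spice
  [Figure: circuit supplied from VDD through a switch that opens at t = 0, with a large capacitor C across the circuit supply; plot of energy E(t) versus t]
  $E(t) = \frac{1}{2} C V_{DD}^2 - \frac{1}{2} C V^2 \approx C V_{DD} (V_{DD} - V)$, where V is the capacitor voltage
  Ref.: M. Shoji, CMOS Digital Circuit Technology, Prentice Hall, 1988, p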
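The measurement idea can be sketched numerically, assuming the expression on this slide is the energy drawn from the large capacitor once the switch opens: E = ½CV_DD² − ½CV², which is close to CV_DD(V_DD − V) while the capacitor voltage V has dropped only slightly. The capacitor size and voltages below are assumed values for illustration.

```python
def energy_drawn_exact(c, vdd, v):
    """Energy supplied by the capacitor as its voltage falls from vdd to v."""
    return 0.5 * c * (vdd ** 2 - v ** 2)


def energy_drawn_approx(c, vdd, v):
    """First-order approximation, valid while v stays close to vdd."""
    return c * vdd * (vdd - v)


# Assumed setup: 1 uF capacitor, 5 V supply, 10 mV droop after the simulation.
c_big, vdd, v_end = 1e-6, 5.0, 4.99

e_exact = energy_drawn_exact(c_big, vdd, v_end)
e_approx = energy_drawn_approx(c_big, vdd, v_end)
relative_error = abs(e_approx - e_exact) / e_exact
```

Choosing C large keeps the droop small, so the circuit still sees nearly the full supply and the linear approximation stays within about 0.1% in this example.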
Power Dissipation of ALU4
  [Figure: energy (nanojoules) versus time (microseconds) for the original ALU (delay ~ 3.5 ns) and the minimum energy ALU (delay ~ 10 ns)]
  1 micron CMOS, 57 gates, 14 PI, 8 PO
  100 random vectors simulated in Spice
F0 Output of ALU4
  [Figure: signal amplitude (0 to 5 volts) versus time (nanoseconds) for the original ALU, delay = 7 units (~3.5 ns), and the minimum energy ALU, delay = 21 units (~10 ns)]
References
  E. Jacobs and M. Berkelaar, “Using Gate Sizing to Reduce Glitch Power,” Proc. ProRISC/IEEE Workshop on Circuits, Systems and Signal Processing, Nov. 1996, pp ; also Int. Workshop on Logic Synthesis, May
  V. D. Agrawal, “Low-Power Design by Hazard Filtering,” Proc. 10th Int. Conf. VLSI Design, Jan. 1997, pp
  V. D. Agrawal, M. L. Bushnell, G. Parthasarathy, and R. Ramadoss, “Digital Circuit Design for Minimum Transient Energy and a Linear Programming Method,” Proc. 12th Int. Conf. VLSI Design, Jan. 1999, pp
  Last two papers are available at website
A Limitation
  Constraints are written by path enumeration.
  Since the number of paths in a circuit can be exponential in circuit size, the formulation is infeasible for large circuits.
  Example: c880 has 6.96M constraints.
Timing Window
  Define two timing window variables per gate output:
    – $t_i$: earliest time of signal transition at gate i.
    – $T_i$: latest time of signal transition at gate i.
  [Figure: gate i with input windows $(t_1, T_1), \ldots, (t_n, T_n)$ and output window $(t_i, T_i)$]
  Ref: T. Raja, Master’s Thesis, Rutgers Univ., 2002
9/23-30/04ELEC / Linear Program Gate variables d 4... d 12 Buffer Variables d d 29 Corresponding window variables t 4... t 29 and T 4... T 29.
Multiple-Input Gate Constraints
  For Gate 7 (inputs from gates 5 and 6):
    $T_7 > T_5 + d_7$;  $t_7 < t_5 + d_7$;  $d_7 > T_7 - t_7$;
    $T_7 > T_6 + d_7$;  $t_7 < t_6 + d_7$;
Single-Input Gate Constraints
  Buffer 19:
    $T_{16} + d_{19} = T_{19}$;  $t_{16} + d_{19} = t_{19}$;
Overall Delay Constraints
  $T_{11}$ < maxdelay
  $T_{12}$ < maxdelay
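The window constraints can be exercised on a toy netlist by propagating the tightest windows for a given delay assignment and checking the gate and delay conditions. This is a sketch under stated assumptions, not the LP itself: the netlist, the gate names, and the helper functions are hypothetical, primary-input events are placed at time 0, and the overall delay bound is checked with ≤.

```python
def timing_windows(fanins, delay, order):
    """Earliest (t) and latest (T) transition time at every node.

    fanins maps each node to its driver list ([] for a primary input);
    order is a topological ordering of the nodes.
    """
    t, T = {}, {}
    for g in order:
        ins = fanins[g]
        if not ins:
            t[g] = T[g] = 0.0  # assumed: primary inputs switch at time 0
        else:
            t[g] = min(t[i] for i in ins) + delay[g]
            T[g] = max(T[i] for i in ins) + delay[g]
    return t, T


def glitch_free(fanins, delay, order, outputs, maxdelay):
    """Check d_g > T_g - t_g at multi-input gates and T_out <= maxdelay."""
    t, T = timing_windows(fanins, delay, order)
    gates_ok = all(delay[g] > T[g] - t[g]
                   for g in order if len(fanins[g]) > 1)
    return gates_ok and all(T[o] <= maxdelay for o in outputs)


# Hypothetical circuit: g1 inverts input a; g2 combines g1 with input b.
fanins = {"a": [], "b": [], "g1": ["a"], "g2": ["g1", "b"]}
order = ["a", "b", "g1", "g2"]

unit_delay_ok = glitch_free(fanins, {"g1": 1, "g2": 1}, order, ["g2"], 5)
slow_gate_ok = glitch_free(fanins, {"g1": 1, "g2": 2}, order, ["g2"], 5)

# Alternative fix: a unit-delay buffer on input b balances the two paths.
fanins_buf = {"a": [], "b": [], "g1": ["a"], "buf": ["b"], "g2": ["g1", "buf"]}
order_buf = ["a", "b", "g1", "buf", "g2"]
buffered_ok = glitch_free(fanins_buf, {"g1": 1, "buf": 1, "g2": 1},
                          order_buf, ["g2"], 5)
```

With unit delays the window at g2 is as wide as the gate delay, so the check fails; either slowing g2 or inserting the buffer satisfies it, mirroring the hazard-filter and balanced-delay options that the LP chooses between.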
Advantage of Timing Window
  Path constraints (exponential in n): $2 \times 2 \times \ldots \times 2 = 2^n$ paths between an I/O pair
  A single variable specifies I/O delay.
  Total variables, O(n).
  LP constraint set is linear in the size of the circuit.
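The gap between the two formulations can be made concrete on a synthetic circuit. The sketch below builds a hypothetical chain of n stages, each offering two parallel routes, counts its input-to-output paths, and counts window constraints in the style of the gate-7 and buffer-19 slides (two per gate input, plus one extra for each multi-input gate). The circuit and the counting convention are assumptions for illustration.

```python
def diamond_chain(n):
    """Hypothetical circuit: n stages, each with two parallel gates and a merge gate."""
    fanins, order = {"s0": []}, ["s0"]
    for k in range(1, n + 1):
        prev = f"s{k - 1}"
        for name in (f"u{k}", f"v{k}"):
            fanins[name] = [prev]
            order.append(name)
        fanins[f"s{k}"] = [f"u{k}", f"v{k}"]
        order.append(f"s{k}")
    return fanins, order


def count_paths(fanins, order, output):
    """Number of distinct input-to-output paths, by dynamic programming."""
    paths = {}
    for g in order:
        ins = fanins[g]
        paths[g] = 1 if not ins else sum(paths[i] for i in ins)
    return paths[output]


def window_constraint_count(fanins):
    """Two window constraints per gate input, plus d > T - t per multi-input gate."""
    total = 0
    for ins in fanins.values():
        total += 2 * len(ins)
        if len(ins) > 1:
            total += 1
    return total


fanins10, order10 = diamond_chain(10)
paths10 = count_paths(fanins10, order10, "s10")
constraints10 = window_constraint_count(fanins10)

fanins20, order20 = diamond_chain(20)
paths20 = count_paths(fanins20, order20, "s20")
constraints20 = window_constraint_count(fanins20)
```

Doubling the chain from 10 to 20 stages squares the path count (1024 to 1,048,576) while the window constraint count only doubles (90 to 180).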
Comparison of Constraints
  [Figure: number of constraints versus number of gates in circuit]
Results: 1-Bit Adder
Estimation of Power
  Circuit is simulated by an event-driven simulator for both optimized and unoptimized gate delays.
  All transitions at a gate are counted as Events[gate].
  Power consumed ∝ Events[gate] × number of fanouts.
  Ref: “Effects of delay model on peak power estimation of VLSI circuits,” Hsiao, et al. (ICCAD’97).
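The power measure described here reduces to a weighted event count. A minimal sketch follows, with hypothetical gate names, event counts, and fanout counts; in practice the event counts would come from the event-driven simulator.

```python
def estimated_power(events, fanouts):
    """Relative power: sum over gates of Events[gate] * number of fanouts."""
    return sum(events[g] * fanouts[g] for g in events)


# Hypothetical counts for three gates, before and after delay optimization.
fanouts = {"g1": 2, "g2": 1, "g3": 1}
events_unit_delay = {"g1": 3, "g2": 1, "g3": 2}   # includes glitch transitions
events_optimized = {"g1": 1, "g2": 1, "g3": 1}    # one event per gate

power_unit_delay = estimated_power(events_unit_delay, fanouts)
power_optimized = estimated_power(events_optimized, fanouts)
normalized_power = power_optimized / power_unit_delay
```

The result is a unitless figure, so it is meaningful only as a ratio between two versions of the same circuit under the same vectors.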
Original 1-Bit Adder
  [Figure: adder schematic with color codes for number of transitions]
Optimized 1-Bit Adder
  [Figure: adder schematic with color codes for number of transitions]
Results: 1-Bit Adder
  Simulated over all possible vector transitions
  Average power = optimized/unit delay = 244/308 = 0.79
  Peak power = optimized/unit delay = 6/10 = 0.60
  Power savings: Peak = 40%, Average = 21%
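The savings follow directly from the event-count ratios quoted for the adder. A small check of the arithmetic, with an assumed helper name:

```python
def saving_percent(optimized, reference):
    """Power saving of the optimized design relative to the reference, in percent."""
    return 100.0 * (1.0 - optimized / reference)


# Counts from the 1-bit adder results: optimized versus unit-delay design.
average_ratio = 244 / 308
average_saving = saving_percent(244, 308)
peak_saving = saving_percent(6, 10)
```

The average ratio is about 0.79, so the average saving rounds to 21%, and the peak ratio of 0.60 gives a 40% saving.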
Results: 4-Bit ALU
  [Table: maxdelay versus buffers inserted; the numeric entries are not recoverable from this extraction.]
  Power savings: Peak = 33%, Average = 21%
Benchmark Circuits
  [Table: circuits C432, C880, C6288 and c7552; columns are maxdel (gates), number of buffers, and normalized average and peak power. The numeric entries are not recoverable from this extraction.]
Physical Design
  [Figure: gates annotated with their l/w sizes]
  Gate delay modeled as a linear function of gate size, total load capacitance, and fanout gate sizes (Berkelaar and Jacobs, 1996).
  Lay out the circuit with some nominal gate sizes.
  Enter extracted routing delays in the LP as constants and solve for gate delays.
  Change gate sizes as determined from a linear system of equations.
  Iterate if routing delays change.
References
  R. Fourer, D. M. Gay and B. W. Kernighan, AMPL: A Modeling Language for Mathematical Programming, South San Francisco: The Scientific Press,
  M. Berkelaar and E. Jacobs, “Using Gate Sizing to Reduce Glitch Power,” Proc. ProRISC Workshop, Mierlo, The Netherlands, Nov. 1996, pp
  V. D. Agrawal, “Low Power Design by Hazard Filtering,” Proc. 10th Int’l Conf. VLSI Design, Jan. 1997, pp
  V. D. Agrawal, M. L. Bushnell, G. Parthasarathy and R. Ramadoss, “Digital Circuit Design for Minimum Transient Energy and Linear Programming Method,” Proc. 12th Int’l Conf. VLSI Design, Jan. 1999, pp
  M. Hsiao, E. M. Rudnick and J. H. Patel, “Effects of Delay Model in Peak Power Estimation of VLSI Circuits,” Proc. ICCAD, Nov. 1997, pp
  T. Raja, A Reduced Constraint Set Linear Program for Low Power Design of Digital Circuits, Master’s Thesis, Rutgers Univ., New Jersey, 2002.
Conclusion
  Glitch-free design through LP: the constraint set is linear in the size of the circuit.
  LP solution:
    – Eliminates glitches at all gate outputs,
    – Holds I/O delay within specification, and
    – Combines path-balancing and hazard-filtering to minimize the number of delay buffers.
  The linear constraint-set LP produces results identical to those of the LP requiring the exponential constraint set.
  Results show peak power savings up to 68% and average power savings up to 64%.