Published by Valentijn Boer. Modified over 6 years ago.
1
Introduction to CMOS VLSI Design Chapter 5 Power
This work is protected by Canadian copyright laws and is provided solely for the use of instructors in teaching their courses and assessing student learning. Dissemination or sale of any part of this work (including on the Internet) will destroy the integrity of the work and is not permitted. The copyright holder grants permission to instructors who have adopted the textbook accompanying this work to post this material online only if the use of the website is restricted by access codes to students in the instructor's class that is using the textbook and provided the reproduced material bears this copyright notice.
Slides from David Harris, adapted by Duncan Elliott.
Textbook: CMOS VLSI Design: A Circuits and Systems Perspective, 4th Edition, N.H.E. Weste & D. Harris
Chapter 5
2
Outline
  Power and Energy
  Dynamic Power
  Static Power
3
Why Power Matters
  Packaging costs
  Power supply rail design
  Chip and system cooling costs
  Noise immunity and system reliability
  Battery life (in portable systems)
  Environmental concerns
    Office equipment accounted for 5% of total US commercial energy usage in 1993 and increasing
4
Battery Chemistry and Energy/Mass Density
  Lead
  Nickel
  Lithium (ion, polymer)
  What's next?
5
Power and Energy
Power is drawn from a voltage source attached to the VDD pin(s) of a chip.
  Instantaneous power: $P(t) = I(t)\,V(t)$
  Energy: $E = \int_0^T P(t)\,dt$
  Average power: $P_{avg} = \frac{E}{T} = \frac{1}{T}\int_0^T P(t)\,dt$
6
Power versus Energy
Power is the height of the curve
  Lower power design could simply be slower
Energy is the area under the curve
  The two approaches require the same energy
  Changing the operating frequency does not change the energy consumption!
[Figure: power (watts) versus time for Approach 1 and Approach 2]
7
Power Dissipation Sources
Ptotal = Pdynamic + Pstatic
Dynamic power: Pdynamic = Pswitching + Pshortcircuit
  Switching load capacitances
  Short-circuit (crowbar) current
Static power: Pstatic = (Isub + Igate + Ijunct + Icontention) VDD
  Subthreshold leakage
  Gate leakage
  Junction leakage
  Contention current
8
Power in Circuit Elements
9
Charging a Capacitor
When the gate output rises:
  Energy stored in the capacitor is $E_C = \frac{1}{2} C_L V_{DD}^2$
  But energy drawn from the supply is $E_{VDD} = \int_0^\infty I(t)\,V_{DD}\,dt = C_L V_{DD}^2$
  Half the energy from VDD is dissipated in the pMOS transistor as heat; the other half is stored in the capacitor
When the gate output falls:
  Energy in the capacitor is dumped to GND
  Dissipated as heat in the nMOS transistor
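The half-and-half energy split can be checked numerically. The sketch below is only an illustration: it assumes the pMOS behaves like a fixed resistor `r_on` (a simplification the slide does not make), and the function name and parameters are hypothetical.

```python
import math

def charging_energies(c_load, vdd, r_on, steps=100_000, n_tau=20.0):
    """Integrate one RC charging event with the midpoint rule.

    Returns (energy drawn from the supply, energy dissipated in the
    resistor, energy stored on the capacitor), all in joules.
    Assumption: the pull-up device is a linear resistor r_on.
    """
    tau = r_on * c_load
    dt = n_tau * tau / steps
    e_supply = 0.0
    e_resistor = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        i = (vdd / r_on) * math.exp(-t / tau)   # charging current
        e_supply += vdd * i * dt                # energy leaving VDD
        e_resistor += i * i * r_on * dt         # heat in the pull-up
    v_final = vdd * (1.0 - math.exp(-n_tau))
    e_cap = 0.5 * c_load * v_final ** 2
    return e_supply, e_resistor, e_cap

# Same load, two very different pull-up resistances.
e_s, e_r, e_c = charging_energies(150e-15, 1.0, 1e3)
e_s2, e_r2, e_c2 = charging_energies(150e-15, 1.0, 10e3)
print(e_s, e_r, e_c)
print(e_s2, e_r2, e_c2)
```

Under this model the dissipated energy comes out the same for both resistances: a weaker device only stretches the event in time.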
10
Switching Waveforms
Example: VDD = 1.0 V, CL = 150 fF, f = 1 GHz
11
Switching Power
12
Dynamic Power
[Figure: CMOS inverter with input Vin, output Vout, and load capacitance CL]
Energy/transition = $C_L V_{DD}^2 P_{0\to1}$
$P_{dyn}$ = Energy/transition × f = $C_L V_{DD}^2 P_{0\to1} f = C_L V_{DD}^2 f_{0\to1}$, where $f_{0\to1} = P_{0\to1} f$
Data dependent: a function of switching activity!
Half of the energy is dissipated in the PMOS device; the remainder is stored on the load capacitor. This energy dissipation is independent of the size (and hence the resistance) of the PMOS device. During the high-to-low transition, the capacitor is discharged and the stored energy is dissipated in the NMOS transistor.
$P_{0\to1}$ is the transition probability: the probability that the output makes a transition from 0 to 1 (the energy-consuming transition).
Ceff is the effective capacitance, representing the average capacitance switched every clock cycle.
Dynamic power accounts for 60 to 80% of the total power consumption of today's parts. Advances in technology result in ever higher values for $f_{0\to1}$ (as tp decreases). At the same time, the total capacitance on the chip (CL) increases as more and more gates are placed on a single die.
Example: with CL = 6 fF and a 2.5 V supply, Edyn = 37.5 fJ. If clocked at the maximum rate (T = 1/f = tpLH + tpHL = 2tp), then Pdyn = Edyn/(2tp) = 580 µW.
13
Activity Factor
Suppose the system clock frequency = f
Let fsw = αf, where α = activity factor
  If the signal is a clock, α = 1
  If the signal switches once per cycle, α = 1/2
Dynamic power: $P_{switching} = \alpha C V_{DD}^2 f$
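As a quick illustration of the activity-factor formula, the sketch below evaluates the switching power for the values on the Switching Waveforms slide (VDD = 1.0 V, CL = 150 fF, f = 1 GHz). The function name is hypothetical.

```python
def switching_power(alpha, c_farads, vdd, f_hz):
    """Switching power in watts: alpha * C * VDD^2 * f."""
    return alpha * c_farads * vdd ** 2 * f_hz

# Node toggling like a clock (alpha = 1) versus a node that
# switches once per cycle (alpha = 1/2).
p_clock = switching_power(1.0, 150e-15, 1.0, 1e9)
p_data = switching_power(0.5, 150e-15, 1.0, 1e9)
print(f"clock-like node: {p_clock * 1e6:.1f} uW")
print(f"once-per-cycle node: {p_data * 1e6:.1f} uW")
```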
14
Low V-Swing Signaling
  Driver power
  Receiver power
  Latching receivers
15
Short Circuit (Crowbar) Current
When transistors switch, both nMOS and pMOS networks may be momentarily ON at once
Leads to a blip of "short circuit" current
< 10% of dynamic power if rise/fall times are comparable for input and output
We will generally ignore this component; it is included in circuit simulation
16
Short Circuit Power
[Figure: CMOS inverter with input Vin, output Vout, load CL, and short-circuit current Isc]
Finite slope of the input signal causes a direct current path between VDD and GND for a short period of time during switching, when both the NMOS and PMOS transistors are conducting.
Accounts for 20 to 40% of power of today's technology
17
Short Circuit Currents
$E_{sc} = t_{sc} V_{DD} I_{peak} P_{0\to1}$
$P_{sc} = t_{sc} V_{DD} I_{peak} f_{0\to1}$
tsc: the time during which both devices are conducting, set by the duration and slope of the input signal
Ipeak: determined by the saturation current of the P and N transistors, which depends on their sizes, process technology, temperature, etc.
  A strong function of the ratio between input and output slopes
  A function of CL
Direct-path currents: the peak and duration of Isc both increase as the input slope decreases (that is, as the input edge gets slower)
18
Impact of CL on Psc in next Stage
[Figure: two inverters driven by the same input edge, one with a large CL and one with a small CL]
Large capacitive load (Isc ≈ 0): output fall time significantly larger than input rise time. The input moves through the transient region before the output starts to change. As the source-drain voltage of the PMOS is approximately 0 during that period, the device shuts off without ever delivering any current, so Isc is close to zero.
Small capacitive load (Isc ≈ Imax): output fall time substantially smaller than the input rise time. The drain-source voltage of the PMOS equals VDD for most of the transition period, giving maximum Isc.
19
Dynamic Power Example
1 billion transistor chip
  50M logic transistors
    Average width: 12 λ
    Activity factor = 0.1
  950M memory transistors
    Average width: 4 λ
    Activity factor = 0.02
  1.0 V 50 nm process
  C = 1 fF/µm (gate) fF/µm (diffusion)
Estimate dynamic power at 1 GHz. Neglect wire capacitance and short-circuit current.
20
Solution
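One way to work the estimate is sketched below. Two values are assumptions rather than data from the slides: λ = 25 nm (half the 50 nm feature size) and a diffusion capacitance of 0.8 fF/µm. They are chosen because together they reproduce the 6.1 W dynamic power quoted on the later static power solution slide.

```python
LAMBDA_UM = 0.025            # assumed: lambda = 25 nm, expressed in um
C_GATE = 1.0e-15             # F/um, gate capacitance from the example
C_DIFF = 0.8e-15             # F/um, ASSUMED diffusion capacitance
VDD = 1.0                    # volts
FREQ = 1.0e9                 # hertz

def total_capacitance(n_transistors, width_lambda):
    """Total gate + diffusion capacitance of a group of transistors (F)."""
    width_um = n_transistors * width_lambda * LAMBDA_UM
    return width_um * (C_GATE + C_DIFF)

c_logic = total_capacitance(50e6, 12)    # logic transistors
c_mem = total_capacitance(950e6, 4)      # memory transistors

# Weight each capacitance by its activity factor.
p_dynamic = (0.1 * c_logic + 0.02 * c_mem) * VDD ** 2 * FREQ
print(f"C_logic = {c_logic * 1e9:.0f} nF, C_mem = {c_mem * 1e9:.0f} nF")
print(f"P_dynamic = {p_dynamic:.2f} W")
```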
21
Dynamic Power Reduction
Try to minimize:
  Activity factor
  Capacitance
  Supply voltage
  Voltage swing
  Frequency
22
Activity Factor Estimation
Let $P_i = \mathrm{Prob}(\text{node } i = 1)$
  $\bar{P}_i = 1 - P_i$
$\alpha_i = \bar{P}_i \cdot P_i$
Completely random data has P = 0.5 and α = 0.25
Data is often not completely random
  e.g. upper bits of 64-bit words representing bank account balances are usually 0
Data propagating through ANDs and ORs has lower activity factor
  Depends on design, but typically α ≈ 0.1
23
Gates and Probability
24
Example
A 4-input AND is built out of two levels of gates
Estimate the activity factor at each node if the inputs have P = 0.5 and are uncorrelated with each other and in time
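A possible worked version of this estimate is sketched below. The gate-level structure is an assumption, since the schematic did not survive in the transcript: two NAND2 gates (nodes `n1`, `n2`) feeding a NOR2 that produces the output `y`. All names are hypothetical.

```python
def activity(p_one):
    """Activity factor of a node: P(node = 0) * P(node = 1)."""
    return (1.0 - p_one) * p_one

def p_nand2(pa, pb):
    """P(output = 1) of a 2-input NAND with independent inputs."""
    return 1.0 - pa * pb

def p_nor2(pa, pb):
    """P(output = 1) of a 2-input NOR with independent inputs."""
    return (1.0 - pa) * (1.0 - pb)

p_in = 0.5
p_n1 = p_nand2(p_in, p_in)    # first-level NAND (A, B)
p_n2 = p_nand2(p_in, p_in)    # first-level NAND (C, D)
p_y = p_nor2(p_n1, p_n2)      # second-level NOR = A.B.C.D

for name, p in [("inputs", p_in), ("n1", p_n1), ("n2", p_n2), ("y", p_y)]:
    print(f"{name}: P = {p}, alpha = {activity(p)}")
```

The first-level nodes draw on disjoint inputs, so the independence assumption holds at the NOR gate in this particular structure.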
25
Transition Probabilities
Switching activity is a strong function of the input signal statistics PA and PB are the probabilities that inputs A and B are one A B A Understanding the signal statistics and their impact on switching events can be used to significantly impact the power dissipation. Observe how the graph degrades into the simple inverter case when one of the input probabilities is set to 0 B CL PA 1 1 PB P01 = P0 x P1 = (1-(1-PA)(1-PB)) (1-PA)(1-PB) Chapter 5
26
Transition Probabilities
$P_{0\to1} = P_{out=0} \times P_{out=1}$
  NOR: $(1 - (1 - P_A)(1 - P_B)) \times (1 - P_A)(1 - P_B)$
  OR: $(1 - P_A)(1 - P_B) \times (1 - (1 - P_A)(1 - P_B))$
  NAND: $P_A P_B \times (1 - P_A P_B)$
  AND: $(1 - P_A P_B) \times P_A P_B$
  XOR: $(1 - (P_A + P_B - 2 P_A P_B)) \times (P_A + P_B - 2 P_A P_B)$
[Figure: input A (P = 0.5) drives node X; X and input B (P = 0.5) drive an AND gate with output Z]
  For X: $P_{0\to1} = P_0 \times P_1 = (1 - P_A) P_A = 0.5 \times 0.5 = 0.25$
  For Z: $P_{0\to1} = P_0 \times P_1 = (1 - P_X P_B) P_X P_B = (1 - 0.5 \times 0.5) \times (0.5 \times 0.5) = 3/16$
Ignoring signal statistics can result in substantial errors in energy/power estimation. The equations follow from the truth tables of the gates.
Computation of the probabilities is straightforward: signal and transition probabilities are evaluated in an ordered fashion, progressing from input to output node.
The approach has two major limitations:
  1. It does not deal with circuits with feedback
  2. It assumes that the signal probabilities at the inputs of each gate are independent
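The table can be turned into a small lookup, sketched below with hypothetical names. Each entry gives P(out = 1) for independent inputs, and the transition probability follows as P(out = 0) × P(out = 1). The final lines repeat the X and Z example, treating X as an inverted copy of A (an assumption consistent with the formula used for X).

```python
# P(out = 1) for 2-input gates with independent inputs.
GATE_P_ONE = {
    "AND": lambda pa, pb: pa * pb,
    "NAND": lambda pa, pb: 1.0 - pa * pb,
    "OR": lambda pa, pb: 1.0 - (1.0 - pa) * (1.0 - pb),
    "NOR": lambda pa, pb: (1.0 - pa) * (1.0 - pb),
    "XOR": lambda pa, pb: pa + pb - 2.0 * pa * pb,
}

def p01(p_one):
    """Probability of a 0 -> 1 transition at a node with P(node = 1) = p_one."""
    return (1.0 - p_one) * p_one

def gate_p01(gate, pa, pb):
    """Transition probability at the output of the named 2-input gate."""
    return p01(GATE_P_ONE[gate](pa, pb))

# Propagate from inputs to output, as in the X / Z example.
p_a, p_b = 0.5, 0.5
p_x = 1.0 - p_a                          # assumed: X is the inverse of A
p_z = GATE_P_ONE["AND"](p_x, p_b)
print("X:", p01(p_x), "Z:", p01(p_z))
```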
27
Inter-signal Correlations
Determining switching activity is complicated by the fact that signals exhibit correlation in space and time
  Even if the primary inputs are uncorrelated, the signals become correlated ("colored") as they propagate through the logic network
  A common cause is reconvergent fan-out
[Figure: X = A OR B, Z = X AND B, with PA = PB = 0.5]
At X: (1-0.5)(1-0.5) x (1-(1-0.5)(1-0.5)) = 3/16
Treating X and B as independent at Z gives P(Z=1) = 3/4 x 1/2 = 3/8, so P0->1 = (1 - 3/8) x 3/8 = 15/64 ≈ 0.234
But Z = (A or B) and B = AB or B = B, so P0->1 should be (and is) 1/2 x 1/2 = 1/4
With reconvergent fan-out, conditional probabilities are needed: P(Z=1) = P(B=1) x P(X=1 | B=1)
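For small circuits, the effect of reconvergent fan-out can be checked by enumerating every input combination instead of multiplying probabilities gate by gate. The sketch below does this for Z = (A OR B) AND B and compares the result with the estimate that treats X and B as independent. Helper names are hypothetical.

```python
from itertools import product

def exact_p_one(fn, n_inputs, p_in=0.5):
    """Exact P(fn = 1) by enumerating all input vectors.

    Assumes the primary inputs are independent, each 1 with probability p_in.
    """
    total = 0.0
    for bits in product((0, 1), repeat=n_inputs):
        weight = 1.0
        for b in bits:
            weight *= p_in if b else (1.0 - p_in)
        if fn(*bits):
            total += weight
    return total

def p01(p_one):
    return (1.0 - p_one) * p_one

def x_node(a, b):
    return a | b

def z_node(a, b):
    return x_node(a, b) & b      # B reconverges with X here

p_x = exact_p_one(x_node, 2)
p_z_exact = exact_p_one(z_node, 2)
p_z_naive = p_x * 0.5            # pretends X and B are independent

print("exact P0->1 at Z:", p01(p_z_exact))
print("naive P0->1 at Z:", p01(p_z_naive))
```

Enumeration is exponential in the number of inputs, so it is only practical as a sanity check on small cones of logic.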
28
Logic Restructuring
Logic restructuring: changing the topology of a logic network to reduce transitions
AND: P0->1 = P0 x P1 = (1 - PAPB) x PAPB, e.g. (1 - 0.25) x 0.25 = 3/16
[Figure: F = A·B·C·D with all inputs at P = 0.5, built two ways]
  Chain: W = A·B (3/16), X = W·C (7/64), F = X·D (15/256)
  Tree: Y = A·B (3/16), Z = C·D (3/16), F = Y·Z (15/256)
Chain implementation has a lower overall switching activity than the tree implementation for random inputs
Ignores glitching effects
Which implementation is lower energy? Which is lower delay? So which is better overall? Compare with the 8-input AND gate designed for speed (slide speed.19, Design Technique 3) when deciding which configuration consumes less power and has the best performance.
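The chain and tree totals can be compared with exact fractions, as in the sketch below. It assumes independent inputs with P = 0.5 and, like the slide, ignores glitching. Function names are hypothetical.

```python
from fractions import Fraction

def and2(pa, pb):
    """P(output = 1) of a 2-input AND with independent inputs."""
    return pa * pb

def p01(p_one):
    return (1 - p_one) * p_one

p = Fraction(1, 2)

# Chain: W = A.B, X = W.C, F = X.D
w = and2(p, p)
x = and2(w, p)
f_chain = and2(x, p)
chain = [p01(w), p01(x), p01(f_chain)]

# Tree: Y = A.B, Z = C.D, F = Y.Z
y = and2(p, p)
z = and2(p, p)
f_tree = and2(y, z)
tree = [p01(y), p01(z), p01(f_tree)]

print("chain:", [str(v) for v in chain], "total", sum(chain))
print("tree: ", [str(v) for v in tree], "total", sum(tree))
```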
29
Input Ordering
[Figure: two orderings of F = A·B·C built from 2-input AND gates, with PA = 0.5, PB = 0.2, PC = 0.1]
  X = A·B, F = X·C: activity at X is (1 - 0.5 x 0.2) x (0.5 x 0.2) = 0.09
  X = B·C, F = X·A: activity at X is (1 - 0.2 x 0.1) x (0.2 x 0.1) = 0.0196
Activity at the output node, F, is equal in both cases
Beneficial to postpone the introduction of signals with a high transition rate (signals with signal probability close to 0.5)
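The ordering choice can be explored by trying each pair of inputs at the first gate and comparing the activity at the internal node. The sketch below assumes independent inputs and a two-gate AND structure. It also includes the A·C pairing, which the slide does not show.

```python
from itertools import permutations

def activity(p_one):
    return (1.0 - p_one) * p_one

probs = {"A": 0.5, "B": 0.2, "C": 0.1}

# Key: (first-gate input, first-gate input, input introduced last).
# Value: activity at the internal node X.
results = {}
for early1, early2, late in permutations(probs, 3):
    if early1 < early2:                      # count each pair once
        results[(early1, early2, late)] = activity(probs[early1] * probs[early2])

best = min(results, key=results.get)
for order, act in sorted(results.items(), key=lambda kv: kv[1]):
    print(order, round(act, 4))
print("lowest internal activity:", best)
```

The output node sees the same probability PA·PB·PC in every ordering, so only the internal node distinguishes them.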
30
Glitching
Gates have a nonzero propagation delay, resulting in spurious transitions or glitches (dynamic hazards)
  Glitch: a node exhibits multiple transitions in a single cycle before settling to the correct logic value
[Figure: two-gate circuit with inputs A, B, C, internal node X, and output Z; waveforms for ABC changing from 101 to 000 under a unit-delay model, with the glitch on Z shaded]
Assumes a unit-delay model (gates have nonzero delay)
The shaded area is a glitch (also called a critical race or dynamic hazard)
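A unit-delay simulation makes the glitch visible. The circuit in the sketch below is an assumption, since the schematic is not recoverable from the transcript: X = NOR(A, B) and Z = NOR(X, C). Every gate output is updated from the values its inputs had one time step earlier.

```python
def nor(a, b):
    return int(not (a or b))

def simulate(abc_before, abc_after, steps=4):
    """Return the (X, Z) trace after the inputs switch at t = 0.

    ASSUMED circuit: X = NOR(A, B), Z = NOR(X, C), one unit delay per gate.
    """
    a, b, c = abc_before
    x = nor(a, b)                 # settle with the old inputs
    z = nor(x, c)
    a, b, c = abc_after           # inputs change at t = 0
    trace = [(x, z)]
    for _ in range(steps):
        # Right-hand side uses the previous step's X: Z lags by one delay.
        x, z = nor(a, b), nor(x, c)
        trace.append((x, z))
    return trace

trace = simulate((1, 0, 1), (0, 0, 0))
z_values = [z for _, z in trace]
z_transitions = sum(1 for p, q in zip(z_values, z_values[1:]) if p != q)
print("trace of (X, Z):", trace)
print("transitions on Z:", z_transitions)
```

In this model Z starts and ends at 0 but pulses high for one delay, because C falls before the updated X arrives. That pulse charges and discharges the output capacitance for no logical purpose.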
31
Balanced Delay Paths
Glitching is due to a mismatch in the path lengths in the logic network; if all input signals of a gate change simultaneously, no glitching occurs
[Figure: two arrangements of gates F1, F2, F3 annotated with signal arrival times]
Making the path lengths to the inputs of a gate approximately the same is usually sufficient to eliminate glitches (delay balancing)
So equalize the lengths of timing paths through logic
32
Clock Gating
The best way to reduce the activity is to turn off the clock to registers in unused blocks
  Saves clock activity (α = 1)
  Eliminates all switching activity in the block
  Requires determining if block will be used
33
Capacitance
Gate capacitance
  Fewer stages of logic
  Small gate sizes
Wire capacitance
  Good floorplanning to keep communicating blocks close to each other
  Drive long wires with inverters or buffers rather than complex gates
34
Voltage / Frequency
Run each block at the lowest possible voltage and frequency that meets performance requirements
Voltage Domains
  Provide separate supplies to different blocks
  Level converters required when crossing from low to high VDD domains
Dynamic Voltage Scaling
  Adjust VDD and f according to workload
35
Multiple VDD Domains
IO / core
  2.5V or 3.3V IO to interface with chips using existing standards
  1.2V to 0.8V core for small geometry transistors
  Need 2 thicknesses of gate oxides
Still lower power, low speed circuits in the core
  How to get signals between VDD domains?
  Multiple core supplies not as popular as multiple VT masks
    High VT for low power (low leakage, low crowbar)
    Low VT for high speed (high current)
36
Multiple VDD
How many VDD? Two is becoming common
  Many chips already have two supplies (one for core and one for I/O)
When combining multiple supplies, level converters are required whenever a module at the lower supply drives a gate at the higher supply (step-up)
  If a gate supplied with VDDL drives a gate at VDDH, the PMOS never turns off
  The cross-coupled PMOS transistors do the level conversion
  The NMOS transistors operate on a reduced supply
Level converters are not needed for a step-down change in voltage
Overhead of level converters can be mitigated by doing conversions at register boundaries and embedding the level conversion inside the flip-flop
[Figure: level converter with cross-coupled PMOS pair on VDDH, input Vin from the VDDL domain, output Vout]
The delay of the level converter is quite sensitive to transistor sizing and supply voltage fluctuations. For a low VDDL, the delay can become very long. Since the NMOS transistors operate with a reduced drive (VDDL - VT), they have to be made larger to be able to overpower the feedback.
37
Dual-Supplies
Minimum energy consumption is achieved if all logic paths are critical (have the same delay)
Clustered voltage-scaling
  Each path starts with VDDH and switches to VDDL (gray logic gates) when delay slack is available
  Level conversion is done in the flip-flops at the end of the paths
A number of studies have shown that for typical delay path distributions, adding more than two supplies yields only marginal additional savings.
When using clustered voltage scaling, the dual-supply approach is more effective when large capacitances are concentrated towards the end of the logic block (such as in buffer chains).
38
Static Power
Static power is consumed even when chip is quiescent.
  Leakage draws power from nominally OFF devices
  Ratioed circuits burn power in fight between ON transistors
39
Leakage Power
[Figure: inverter between VDD and GND with output Vout, showing Ileakage paths: drain junction leakage, sub-threshold current, gate leakage]
Drain junction leakage: leakage current per unit drain area typically ranges between 10 and 100 pA/µm^2 at room temperature (25 degrees C) for 0.25 micron CMOS (1 million gates, each with drain area of 0.5 µm^2 = milliW).
However, values increase exponentially with increasing junction temperature: JS doubles for every 9 deg C! At 85 degrees C (a commonly imposed upper bound for junction temperatures) the leakage currents increase by a factor of 60 over room temperature. As temperature is a strong function of the dissipated heat and its removal mechanisms, limit the dissipated power and use chip packages that support efficient heat removal.
Sub-threshold current is the greater concern and the dominant factor. The closer the threshold voltage is to zero, the larger the leakage current at VGS = 0 V. To offset this effect, threshold voltages are not scaled as aggressively as supply voltages (narrowing noise margins). Unfortunately, scaling the supply voltage without scaling the threshold hurts performance, especially as VDD approaches 2 VT.
All increase exponentially with temperature!
40
Leakage and VT
Continued scaling of supply voltage and the subsequent scaling of threshold voltage will make subthreshold conduction a dominant component of power dissipation.
Each 255 mV increase in VT gives 3 orders of magnitude reduction in leakage (but adversely affects performance)
The choice of VT represents a trade-off between performance and static power dissipation. Process technologies with sharper turn-off characteristics (like SOI, with slope factors closer to the ideal 60 mV/decade) will become more attractive.
With sizable static power dissipation, it is essential that non-active modules are powered down (put in standby) by disconnecting the unit from the supply rails or by lowering the supply voltage.
There are other leakage factors that we are ignoring here, including:
  Drain-Induced Barrier Lowering (I3)
  Gate-Induced Drain Leakage (I4)
  Punchthrough (I5)
  Narrow Width Effect (I6)
  Gate Oxide Tunneling (I7)
  Hot Carrier (I8)
41
Leakage Current Increase
[Figure: Ileakage (nA/µm) versus Temp (C), from De, 1999]
Note the y axis is logarithmic: leakage doubles for every 10 degree increase in temperature
Will leakage power still be 6 orders of magnitude smaller (as it is today) than dynamic power for future technologies, even at room temperature?
Example: the following configurations have the same power performance: 3 V VDD with 0.7 V VT, and 0.45 V VDD with 0.1 V VT. The dynamic power consumption of the latter is, however, 45 times smaller, since dynamic power scales with VDD^2 and (3/0.45)^2 ≈ 44.
42
Static Power Example
Revisit power estimation for 1 billion transistor chip
Estimate static power consumption
  Subthreshold leakage
    Normal Vt: nA/µm
    High Vt: 10 nA/µm
    High Vt used in all memories and in 95% of logic gates
  Gate leakage 5 nA/µm
  Junction leakage negligible
43
Solution
Static power is 14% of the 6.1 W dynamic power
What if 90% idle?
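A sketch of how the 14% figure can be reproduced follows. Several inputs are assumptions, not slide data: λ = 25 nm, a normal-Vt subthreshold leakage of 100 nA/µm (the value is missing on the example slide), and the simplification that at any moment half of the transistor width is OFF and leaking subthreshold current while the other half is ON and leaking gate current.

```python
LAMBDA_UM = 0.025              # assumed: lambda = 25 nm, in um
N_LOGIC, W_LOGIC = 50e6, 12    # logic transistors, width in lambda
N_MEM, W_MEM = 950e6, 4        # memory transistors, width in lambda
I_SUB_NORMAL = 100e-9          # A/um, ASSUMED normal-Vt leakage
I_SUB_HIGH = 10e-9             # A/um, high-Vt leakage from the example
I_GATE = 5e-9                  # A/um, gate leakage from the example
VDD = 1.0                      # volts
P_DYNAMIC = 6.1                # watts, quoted on the solution slide

# Total transistor width in each threshold class (um).
# 5% of logic is normal Vt; the rest of the logic and all memory are high Vt.
w_normal = N_LOGIC * W_LOGIC * LAMBDA_UM * 0.05
w_high = (N_LOGIC * W_LOGIC * 0.95 + N_MEM * W_MEM) * LAMBDA_UM

# Assumption: half the width is OFF (subthreshold), half is ON (gate leakage).
i_sub = (w_normal * I_SUB_NORMAL + w_high * I_SUB_HIGH) / 2
i_gate = (w_normal + w_high) * I_GATE / 2

p_static = (i_sub + i_gate) * VDD
print(f"I_sub = {i_sub * 1e3:.0f} mA, I_gate = {i_gate * 1e3:.0f} mA")
print(f"P_static = {p_static * 1e3:.0f} mW "
      f"({100 * p_static / P_DYNAMIC:.0f}% of dynamic power)")
```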
44
Subthreshold Leakage
For Vds > 50 mV: $I_{sub} = I_{off}\,10^{\frac{V_{gs} + \eta (V_{ds} - V_{DD}) - k_\gamma V_{sb}}{S}}$
Ioff = leakage at Vgs = 0, Vds = VDD
Typical values in 65 nm
  Ioff = 100 nA/µm @ Vt = 0.3 V
  Ioff = 10 nA/µm @ Vt = 0.4 V
  Ioff = 1 nA/µm @ Vt = 0.5 V
  η = 0.1
  kγ = 0.1
  S = 100 mV/decade
45
Stack Effect
Series OFF transistors have less leakage
  Vx > 0, so N2 has negative Vgs
  Leakage through 2-stack reduces ~10x
  Leakage through 3-stack reduces further
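The roughly 10x reduction can be estimated with a simple leakage model. The sketch below assumes the form Isub = Ioff · 10^((Vgs + η(Vds − VDD) − kγ·Vsb)/S) with η = 0.1, kγ = 0.1, S = 100 mV/decade and VDD = 1.0 V. The model form, the naming of N1 as the bottom transistor, and the closed-form Vx (obtained by equating the two currents) are assumptions made for this illustration.

```python
def i_sub(i_off, vgs, vds, vsb, vdd, eta=0.1, k_gamma=0.1, s=0.1):
    """Subthreshold current under the assumed exponential model (Vds > 50 mV)."""
    return i_off * 10 ** ((vgs + eta * (vds - vdd) - k_gamma * vsb) / s)

def two_stack(vdd=1.0, eta=0.1, k_gamma=0.1, s=0.1, i_off=1.0):
    """Intermediate voltage and leakage of two series OFF nMOS transistors.

    N1 is the bottom device (source at GND), N2 the top device.
    Vx is the node between them, from setting I(N1) = I(N2).
    """
    vx = eta * vdd / (1.0 + 2.0 * eta + k_gamma)
    i_bottom = i_sub(i_off, 0.0, vx, 0.0, vdd, eta, k_gamma, s)
    i_top = i_sub(i_off, -vx, vdd - vx, vx, vdd, eta, k_gamma, s)
    return vx, i_bottom, i_top

vx, i_bottom, i_top = two_stack()
print(f"Vx = {vx * 1e3:.0f} mV")
print(f"stack leakage = {i_bottom:.3f} x Ioff (about {1 / i_bottom:.1f}x lower)")
```

The top transistor sees a negative Vgs, a reduced Vds, and a positive Vsb all at once, which is why a small Vx produces a large reduction.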
46
Leakage Control
Leakage and delay trade off
  Aim for low leakage in sleep and low delay in active mode
To reduce leakage:
  Increase Vt: multiple Vt
    Use low Vt only in critical circuits
  Increase Vs: stack effect
    Input vector control in sleep
  Decrease Vb
    Reverse body bias in sleep
    Or forward body bias in active mode
47
Gate Leakage
Extremely strong function of tox and Vgs
  Negligible for older processes
  Approaches subthreshold leakage at 65 nm and below in some processes
An order of magnitude less for pMOS than nMOS
Control leakage in the process using tox > 10.5 Å
  High-k gate dielectrics help
  Some processes provide multiple tox
    e.g. thicker oxide for 3.3 V I/O transistors
Control leakage in circuits by limiting VDD
48
NAND3 Leakage Example
100 nm process
  Ign = 6.3 nA
  Igp = 0
  Ioffn = 5.63 nA
  Ioffp = 9.3 nA
Data from [Lee03]
49
Junction Leakage
From reverse-biased p-n junctions
  Between diffusion and substrate or well
Ordinary diode leakage is negligible
Band-to-band tunneling (BTBT) can be significant
  Especially in high-Vt transistors where other leakage is small
  Worst at Vdb = VDD
Gate-induced drain leakage (GIDL) exacerbates
  Worst for Vgd = -VDD (or more negative)
50
Power Gating
Turn OFF power to blocks when they are idle to save leakage
  Use virtual VDD (VDDV)
  Gate outputs to prevent invalid logic levels to next block
Voltage drop across sleep transistor degrades performance during normal operation
  Size the transistor wide enough to minimize impact
Switching wide sleep transistor costs dynamic power
  Only justified when circuit sleeps long enough