Published by Abel Rogers. Modified over 9 years ago.
1
Sp09 CMPEN 411 L14 S.1
CMPEN 411 VLSI Digital Circuits, Spring 2009
Lecture 14: Designing for Low Power
[Adapted from Rabaey's Digital Integrated Circuits, Second Edition, ©2003 J. Rabaey, A. Chandrakasan, B. Nikolic]
2
Sp09 CMPEN 411 L14 S.2 Reminders
Next lecture
- Dynamic logic
  - Reading assignment: Rabaey, et al., 6.3
3
Sp09 CMPEN 411 L14 S.3 Review: CMOS Power Equations
$P = C_L V_{DD}^2 f + t_{sc} V_{DD} I_{peak} f + V_{DD} I_{leak}$
- Dynamic power: $C_L V_{DD}^2 f$
- Short-circuit power: $t_{sc} V_{DD} I_{peak} f$
- Leakage power: $V_{DD} I_{leak}$
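As a rough illustration of how the three terms compare, the power equation can be evaluated directly. This is a minimal sketch: the function name `cmos_power` and every numeric value in the example are assumptions chosen for illustration, not figures from the lecture.

```python
def cmos_power(c_load, v_dd, freq, t_sc, i_peak, i_leak):
    """Return (dynamic, short_circuit, leakage) power in watts.

    Implements P = C_L*V_DD^2*f + t_sc*V_DD*I_peak*f + V_DD*I_leak,
    one term per returned value.
    """
    dynamic = c_load * v_dd ** 2 * freq
    short_circuit = t_sc * v_dd * i_peak * freq
    leakage = v_dd * i_leak
    return dynamic, short_circuit, leakage


if __name__ == "__main__":
    # Hypothetical operating point (assumed values, illustration only).
    parts = cmos_power(
        c_load=10e-15,   # 10 fF load
        v_dd=2.5,        # supply voltage in volts
        freq=100e6,      # switching frequency in Hz
        t_sc=50e-12,     # short-circuit conduction time in seconds
        i_peak=100e-6,   # peak short-circuit current in amperes
        i_leak=1e-9,     # leakage current in amperes
    )
    for name, watts in zip(("dynamic", "short-circuit", "leakage"), parts):
        print(f"{name:>13}: {watts:.3e} W")
```

Keeping the terms separate makes it easy to see which one a given technique in the rest of the lecture is attacking.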
4
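Sp09 CMPEN 411 L14 S.4 Power and Energy Design Space
Active (dynamic) energy
- Constant throughput/latency, design time: logic design, reduced V_dd, transistor sizing, multi-V_dd
- Constant throughput/latency, non-active modules: clock gating
- Variable throughput/latency, run time: DFS, DVS (dynamic frequency, voltage scaling)
Leakage (standby) energy
- Constant throughput/latency, design time: multi-V_T, stack effect, pin ordering
- Constant throughput/latency, non-active modules: sleep transistors, multi-V_dd, variable V_T, input control
- Variable throughput/latency, run time: variable V_T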
5
Sp09 CMPEN 411 L14 S.5 Transistor Sizing for Minimum Energy
Device sizing COMBINED with supply voltage reduction is a very effective way to reduce the energy consumption of a logic network
Device sizing affects dynamic energy consumption
- the gain is largest for networks with large overall effective fan-outs ($F = C_L / C_{g,1}$)
6
Sp09 CMPEN 411 L14 S.6 Dynamic Power as a Function of Device Size
Device sizing affects dynamic energy consumption
- the gain is largest for networks with large overall effective fan-outs ($F = C_L / C_{g,1}$)
The optimal gate sizing factor (f) for dynamic energy is smaller than the one for performance, especially for large F
- e.g., for F = 20, f_opt(energy) = 3.53 while f_opt(performance) = 4.47
If energy is a concern, avoid oversizing beyond the optimum
[Figure: normalized energy versus sizing factor f (1 to 7) for F = 1, 2, 5, 10, 20. From Nikolic, UCB]
7
Sp09 CMPEN 411 L14 S.7 Dynamic Power Consumption is Data Dependent
2-input NOR gate
A B | Out
0 0 | 1
0 1 | 0
1 0 | 0
1 1 | 0
With input signal probabilities P_{A=1} = 1/2 and P_{B=1} = 1/2
Static transition probability: $P_{0 \to 1} = P_{out=0} \times P_{out=1} = P_0 \times (1 - P_0)$
NOR static transition probability = 3/4 x 1/4 = 3/16
Switching activity, P_0→1, has two components
- A static component: a function of the logic topology
- A dynamic component: a function of the timing behavior (glitching)
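The 3/16 figure for the NOR gate can be checked by enumerating the truth table. The sketch below assumes independent inputs, as the slide does; the helper names `static_transition_probability` and `nor2` are illustrative, not from the lecture.

```python
from fractions import Fraction
from itertools import product


def static_transition_probability(gate, input_probs):
    """P(0->1) = P(out=0) * P(out=1), assuming independent inputs.

    gate: function of 0/1 inputs returning a truthy/falsy output.
    input_probs: probability that each input is 1.
    """
    p_one = Fraction(0)
    for bits in product((0, 1), repeat=len(input_probs)):
        # Probability of this particular input combination.
        weight = Fraction(1)
        for bit, p in zip(bits, input_probs):
            weight *= p if bit else 1 - p
        if gate(*bits):
            p_one += weight
    return (1 - p_one) * p_one


def nor2(a, b):
    return int(not (a or b))


half = Fraction(1, 2)
print(static_transition_probability(nor2, [half, half]))  # 3/16
```

Using `Fraction` keeps the result exact, so it can be compared directly with the fractions on the slide.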
8
Sp09 CMPEN 411 L14 S.8 NOR Gate Transition Probabilities
[Figure: static CMOS NOR gate with inputs A and B driving load C_L; plot of P_0→1 versus P_A and P_B, each ranging from 0 to 1]
$P_{0 \to 1} = P_0 \times P_1 = (1 - (1 - P_A)(1 - P_B)) \, (1 - P_A)(1 - P_B)$
Switching activity is a strong function of the input signal statistics
- P_A and P_B are the probabilities that inputs A and B are one
10
Sp09 CMPEN 411 L14 S.10 Transition Probabilities for Some Basic Gates
$P_{0 \to 1} = P_{out=0} \times P_{out=1}$
Gate | P_0→1
NOR  | (1 - (1 - P_A)(1 - P_B)) x (1 - P_A)(1 - P_B)
OR   | (1 - P_A)(1 - P_B) x (1 - (1 - P_A)(1 - P_B))
NAND | P_A P_B x (1 - P_A P_B)
AND  | (1 - P_A P_B) x P_A P_B
XOR  | (1 - (P_A + P_B - 2 P_A P_B)) x (P_A + P_B - 2 P_A P_B)
[Figure: input A (probability 0.5) drives a single-input gate with output X; X and input B (probability 0.5) drive an AND gate with output Z]
For X: P_0→1 = P_0 x P_1 = (1 - P_A) P_A = 0.5 x 0.5 = 0.25
For Z: P_0→1 = P_0 x P_1 = (1 - P_X P_B) P_X P_B = (1 - (0.5 x 0.5)) x (0.5 x 0.5) = 3/16
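The gate formulas above all have the form (1 - P_1) x P_1, where P_1 is the probability that the output is one, so they can be tabulated compactly. This is a small sketch under the slide's independence assumption; the names `P_OUT_ONE` and `transition_probability` are made up for illustration.

```python
from fractions import Fraction

# Probability that the output is 1, per gate, for independent inputs
# with signal probabilities pa and pb.
P_OUT_ONE = {
    "NOR": lambda pa, pb: (1 - pa) * (1 - pb),
    "OR": lambda pa, pb: 1 - (1 - pa) * (1 - pb),
    "NAND": lambda pa, pb: 1 - pa * pb,
    "AND": lambda pa, pb: pa * pb,
    "XOR": lambda pa, pb: pa + pb - 2 * pa * pb,
}


def transition_probability(gate, pa, pb):
    """P(0->1) = P(out=0) * P(out=1) for the named two-input gate."""
    p1 = P_OUT_ONE[gate](pa, pb)
    return (1 - p1) * p1


half = Fraction(1, 2)
# Worked example from the slide: P_X = 0.5, P_B = 0.5, Z = X AND B.
print(transition_probability("AND", half, half))  # 3/16
```

A gate and its complement (NOR/OR, NAND/AND) give the same transition probability, because swapping P_0 and P_1 leaves the product unchanged.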
11
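Sp09 CMPEN 411 L14 S.11 Another Example
[Figure: X = A OR B; Z = X AND B; inputs A and B each have signal probability 0.5]
For X: (1-0.5)(1-0.5) x (1-(1-0.5)(1-0.5)) = 3/16
For Z: (1 - 3/16 x 0.5) x (3/16 x 0.5) = 0.085
Note: the Z estimate plugs in 3/16, the transition probability of X, where the signal probability P_X = 3/4 belongs. With P_X = 3/4, and X and B treated as independent, the estimate is (1 - 3/8) x 3/8 = 15/64 ≈ 0.234.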
12
Sp09 CMPEN 411 L14 S.12 Inter-signal Correlations
[Figure: X = A OR B; Z = X AND B; input B fans out to both gates and reconverges at Z; inputs A and B each have signal probability 0.5]
Treating the inputs of each gate as independent:
For X: (1-0.5)(1-0.5) x (1-(1-0.5)(1-0.5)) = 3/16
For Z: (1 - 3/16 x 0.5) x (3/16 x 0.5) = 0.085
Determining switching activity is complicated by the fact that signals exhibit correlation in space and time
- reconvergent fan-out
Have to use conditional probabilities
P(Z=1) = P(B=1) x P(X=1 | B=1)
- notice that Z = (A or B) and B = AB or B = B, so P_0→1 should be (and is) 1/2 x 1/2 = 1/4
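One way to see the effect of reconvergent fan-out is to enumerate the primary inputs, which handles the correlation between X and B exactly, and compare against a gate-by-gate estimate that treats X and B as independent. The sketch below assumes X = A OR B and Z = X AND B, as stated on the slide; the helper names are illustrative. The independent estimate here uses the signal probability P_X = 3/4, whereas the slide's 0.085 plugs in 3/16.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)


def exact_p_one(func, input_probs):
    """Exact P(output = 1), enumerating independent primary inputs."""
    total = Fraction(0)
    for bits in product((0, 1), repeat=len(input_probs)):
        weight = Fraction(1)
        for bit, p in zip(bits, input_probs):
            weight *= p if bit else 1 - p
        if func(*bits):
            total += weight
    return total


def z_of(a, b):
    x = a or b        # X = A OR B
    return x and b    # Z = X AND B, so B reconverges here


# Exact: correlation between X and B is captured by enumeration.
p_z_exact = exact_p_one(z_of, [HALF, HALF])
exact = (1 - p_z_exact) * p_z_exact

# Naive: propagate gate by gate, pretending X and B are independent.
p_x = exact_p_one(lambda a, b: a or b, [HALF, HALF])
p_z_naive = p_x * HALF
naive = (1 - p_z_naive) * p_z_naive

print("exact P(0->1):", exact)
print("naive P(0->1):", naive)
```

Enumeration is exponential in the number of primary inputs, so it only serves as a check on small circuits.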
13
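Sp09 CMPEN 411 L14 S.13 Logic Restructuring
Logic restructuring: changing the topology of a logic network to reduce transitions
AND: P_0→1 = P_0 x P_1 = (1 - P_A P_B) x P_A P_B
[Figure: F = ABCD built from 2-input AND gates, all inputs at signal probability 0.5]
- Chain (W = AB, X = WC, F = XD): P_0→1 = (1-0.25)*0.25 = 3/16 at W, 7/64 at X, 15/256 at F; total 91/256
- Tree (Y = AB, Z = CD, F = YZ): P_0→1 = 3/16 at Y, 3/16 at Z, 15/256 at F; total 111/256
Chain implementation has a lower overall switching activity than the tree implementation for random inputs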
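The chain and tree totals can be reproduced by propagating signal probabilities through 2-input AND gates and summing the per-node transition probabilities. This is a sketch assuming independent inputs and ignoring glitching; `and_gate`, `chain_activity`, and `tree_activity` are hypothetical helper names, and the tree helper assumes a power-of-two input count.

```python
from fractions import Fraction


def and_gate(p_a, p_b):
    """Return (output signal probability, P(0->1)) for a 2-input AND."""
    p_out = p_a * p_b
    return p_out, (1 - p_out) * p_out


def chain_activity(input_probs):
    """Total P(0->1) over all nodes of ((A.B).C).D ..."""
    p, total = input_probs[0], Fraction(0)
    for p_in in input_probs[1:]:
        p, activity = and_gate(p, p_in)
        total += activity
    return total


def tree_activity(input_probs):
    """Total P(0->1) over all nodes of a balanced tree (A.B).(C.D) ..."""
    level, total = list(input_probs), Fraction(0)
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            p, activity = and_gate(level[i], level[i + 1])
            next_level.append(p)
            total += activity
        level = next_level
    return total


inputs = [Fraction(1, 2)] * 4
print("chain:", chain_activity(inputs))  # 91/256
print("tree: ", tree_activity(inputs))   # 111/256
```

The comparison covers the static component only; the chain has unbalanced path delays, so its glitching behavior may differ from the tree's.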
15
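Sp09 CMPEN 411 L14 S.15 Input Ordering
Which is better wrt transition probabilities?
[Figure: two orderings of a 3-input function built from two 2-input gates, with signal probabilities A = 0.5, B = 0.2, C = 0.1]
- A and B combined first (X), then C: P_0→1 at X = (1 - 0.5 x 0.2) x (0.5 x 0.2) = 0.09
- B and C combined first (X), then A: P_0→1 at X = (1 - 0.2 x 0.1) x (0.2 x 0.1) = 0.0196
Beneficial to postpone the introduction of signals with a high transition rate (signals with signal probability close to 0.5)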
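The two orderings on the slide can be compared, together with the third possible pairing, by evaluating the AND transition probability at the intermediate node X for each choice of first input pair. This is a sketch assuming AND gates (consistent with the slide's arithmetic) and independent inputs; the names are illustrative.

```python
from fractions import Fraction
from itertools import combinations


def and_transition(p_a, p_b):
    """P(0->1) at the output of a 2-input AND with independent inputs."""
    p_out = p_a * p_b
    return (1 - p_out) * p_out


# Signal probabilities from the slide.
probs = {"A": Fraction(1, 2), "B": Fraction(1, 5), "C": Fraction(1, 10)}

# Activity at intermediate node X for each choice of the first input pair.
activity = {
    pair: and_transition(probs[pair[0]], probs[pair[1]])
    for pair in combinations(sorted(probs), 2)
}

best = min(activity, key=activity.get)
for pair, value in activity.items():
    print(pair, float(value))
print("lowest activity at X when first combining:", best)
```

The output node F has the same probability in every ordering, so only the intermediate node distinguishes the alternatives.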
16
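Sp09 CMPEN 411 L14 S.16 Glitching in Static CMOS Networks
[Figure: two-gate network in which A and B drive a gate with output X, and X and C drive a gate with output Z; unit-delay waveforms of A, B, C, X, Z as inputs ABC switch from 101 to 000]
Gates have a nonzero propagation delay, resulting in spurious transitions or glitches (dynamic hazards)
- glitch: a node exhibits multiple transitions in a single cycle before settling to the correct logic value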
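A unit-delay simulation shows how the glitch arises. The gate types are not recoverable from this transcript, so the sketch below assumes both gates are NORs (X = NOR(A, B), Z = NOR(X, C)); that choice and the function names are assumptions for illustration.

```python
def nor(a, b):
    return int(not (a or b))


def simulate_unit_delay(before, after, steps=4):
    """Return [(X, Z), ...] per time step, starting from the settled state.

    Each gate output at step t+1 is computed from the values its inputs
    held at step t, which models one unit of delay per gate.
    """
    a, b, c = before
    x = nor(a, b)
    z = nor(x, c)
    trace = [(x, z)]
    a, b, c = after  # all primary inputs switch at t = 0
    for _ in range(steps):
        # Tuple assignment: Z is evaluated with the OLD value of X.
        x, z = nor(a, b), nor(x, c)
        trace.append((x, z))
    return trace


trace = simulate_unit_delay(before=(1, 0, 1), after=(0, 0, 0))
z_values = [z for _, z in trace]
transitions = sum(1 for p, q in zip(z_values, z_values[1:]) if p != q)
print("Z over time:", z_values)
print("transitions on Z:", transitions)
```

Under these assumptions Z starts and ends at the same value, so every transition it makes in between is wasted switching energy.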
18
Sp09 CMPEN 411 L14 S.18 Glitching in an RCA
[Figure: ripple-carry adder (RCA) with carry-in Cin and sum outputs S0 through S15; output waveforms shown for S0, S1, S2, S3, S4, S5, S10, S15]
19
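Sp09 CMPEN 411 L14 S.19 Balanced Delay Paths to Reduce Glitching
Glitching is due to a mismatch in the path lengths in the logic network; if all input signals of a gate change simultaneously, no glitching occurs
So equalize the lengths of timing paths through logic
[Figure: two arrangements of gates F1, F2, F3 annotated with signal arrival times. Chain: primary inputs arrive at 0, intermediate signals at 1 and 2. Tree: primary inputs arrive at 0, both intermediate signals at 1]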
21
Sp09 CMPEN 411 L14 S.21 Dynamic Power as a Function of V_DD
Decreasing V_DD decreases dynamic energy consumption (quadratically)
But it increases gate delay (decreases performance)
[Figure: normalized propagation delay t_p versus V_DD (V)]
Determine the critical path(s) at design time and use high V_DD for the transistors on those paths for speed. Use a lower V_DD on the other gates, especially those that drive large capacitances (as this yields the largest energy benefits).
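The quadratic dependence makes the benefit of a second supply easy to estimate. The sketch below assumes the per-transition energy is C_L x V_DD^2 and uses made-up gate capacitances and supply values (2.5 V and 1.8 V); none of these numbers come from the lecture, and delay is not modeled.

```python
def dynamic_energy(c_load, v_dd):
    """Energy drawn from the supply per 0->1 transition: C_L * V_DD^2."""
    return c_load * v_dd ** 2


def dual_supply_energy(gates, v_high, v_low):
    """Sum energy over gates given as (c_load, is_critical) pairs.

    Critical gates stay on v_high; all others are moved to v_low.
    """
    return sum(
        dynamic_energy(c_load, v_high if is_critical else v_low)
        for c_load, is_critical in gates
    )


# Hypothetical gates: (load capacitance in farads, on critical path?).
GATES = [(10e-15, True), (40e-15, False), (5e-15, False)]
V_HIGH, V_LOW = 2.5, 1.8  # assumed supply voltages

single = dual_supply_energy(GATES, V_HIGH, V_HIGH)
dual = dual_supply_energy(GATES, V_HIGH, V_LOW)
print(f"energy saved by the second supply: {100 * (1 - dual / single):.1f}%")
```

Because the saving per gate scales with its load capacitance, moving the 40 fF gate to the low supply dominates the result, which matches the slide's advice about gates that drive large capacitances.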
22
Sp09 CMPEN 411 L14 S.22 Multiple V_DD Considerations
How many V_DD? Two is becoming common
- Many chips already have two supplies (one for core and one for I/O)
When combining multiple supplies, level converters are required whenever a module at the lower supply drives a gate at the higher supply (step-up)
- If a gate supplied with V_DDL drives a gate at V_DDH, the PMOS never turns off
  - The cross-coupled PMOS transistors do the level conversion
  - The NMOS transistors operate on a reduced supply
- Level converters are not needed for a step-down change in voltage
- Overhead of level converters can be mitigated by doing conversions at register boundaries and embedding the level conversion inside the flip-flop (see Figure 11.47)
[Figure: level converter with supplies V_DDH and V_DDL, input V_in, output V_out]
23
Sp09 CMPEN 411 L14 S.23 Dual-Supply Inside a Logic Block
Minimum energy consumption is achieved if all logic paths are critical (have the same delay)
Clustered voltage-scaling
- Each path starts with V_DDH and switches to V_DDL (gray logic gates) when delay slack is available
- Level conversion is done in the flip-flops at the end of the paths
26
Sp09 CMPEN 411 L14 S.26 Stack Effect
Subthreshold leakage is a function of the circuit topology and the value of the inputs
$V_T = V_{T0} + \gamma \left( \sqrt{|-2\phi_F + V_{SB}|} - \sqrt{|-2\phi_F|} \right)$
where V_T0 is the threshold voltage at V_SB = 0; V_SB is the source-bulk (substrate) voltage; γ is the body-effect coefficient
[Figure: two-input gate with inputs A and B, output Out, and internal node V_X between the stacked transistors]
Leakage is least when A = B = 0
Leakage reduction due to stacked transistors is called the stack effect
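To get a feel for the size of the body effect in a stack, the threshold equation can be evaluated for a small source-bulk voltage. This is a sketch only: the default parameter values (V_T0 = 0.43 V, γ = 0.4 V^0.5, φ_F = -0.3 V) are illustrative assumptions for an NMOS device and are not given on the slide.

```python
import math


def threshold_voltage(v_sb, v_t0=0.43, gamma=0.4, phi_f=-0.3):
    """V_T = V_T0 + gamma * (sqrt(|-2*phi_F + V_SB|) - sqrt(|-2*phi_F|)).

    Voltages are in volts; gamma is in V^0.5. The defaults are assumed,
    illustrative NMOS values.
    """
    two_phi = -2.0 * phi_f
    return v_t0 + gamma * (
        math.sqrt(abs(two_phi + v_sb)) - math.sqrt(abs(two_phi))
    )


# The source of the top transistor in a stack sits at V_X above ground,
# so its V_SB equals V_X (taken here as 0.1 V for illustration).
for v_sb in (0.0, 0.1):
    print(f"V_SB = {v_sb:.1f} V -> V_T = {threshold_voltage(v_sb):.4f} V")
```

Even a modest rise in V_T matters because subthreshold current depends exponentially on it.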
27
Sp09 CMPEN 411 L14 S.27 Short Channel Factors and Stack Effect
In short-channel devices, the subthreshold leakage current depends on V_GS, V_BS and V_DS. The V_T of a short-channel device decreases with increasing V_DS due to DIBL (drain-induced barrier lowering).
- Typical values for DIBL are a 20 to 150 mV change in V_T per volt of change in V_DS, so the stack effect is even more significant for short-channel devices.
- V_X reduces the drain-source voltage of the top nfet, increasing its V_T and lowering its leakage even more
For our 0.25 micron technology, V_X settles to ~100 mV in steady state, so the top nfet has V_BS = -100 mV and V_DS = V_DD - 100 mV. Its leakage is 20 times smaller than that of a device with V_BS = 0 mV and V_DS = V_DD.
28
Sp09 CMPEN 411 L14 S.28 Leakage as a Function of Design Time V_T
Reducing the V_T increases the subthreshold leakage current (exponentially)
- A 90 mV reduction in V_T increases leakage by an order of magnitude
But reducing V_T decreases gate delay (increases performance)
Determine the critical path(s) at design time and use low V_T devices on the transistors on those paths for speed. Use a high V_T on the other logic for leakage control.
- A careful assignment of V_T's can reduce the leakage by as much as 80%
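The "order of magnitude per 90 mV" rule translates into a simple exponential scaling. The sketch below encodes only that rule of thumb; the function name and the example V_T shifts are illustrative assumptions.

```python
def leakage_multiplier(delta_vt_mv, mv_per_decade=90.0):
    """Factor by which subthreshold leakage grows when V_T drops by
    delta_vt_mv millivolts, assuming one decade per mv_per_decade."""
    return 10.0 ** (delta_vt_mv / mv_per_decade)


# Example V_T reductions in millivolts (illustrative).
for delta in (30, 90, 180):
    print(f"V_T lowered by {delta:3d} mV -> leakage x{leakage_multiplier(delta):.1f}")
```

This is consistent with the IBM Cu11 figures later in the lecture, where moving from a ~300 mV to a ~210 mV threshold (a 90 mV drop) is quoted as a ~10x increase in leakage power.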
29
Sp09 CMPEN 411 L14 S.29 Dual-Thresholds Inside a Logic Block
Minimum energy consumption is achieved if all logic paths are critical (have the same delay)
Use lower threshold on timing-critical paths
- Assignment can be done on a per-gate or per-transistor basis; no clustering of the logic is needed
- No level converters are needed
30
Sp09 CMPEN 411 L14 S.30 IBM Cu11/Cu08 Blue Logic Library
- ASIC Cu11 (130 nm) library: dual-V_T library
  - 2690 total cells in standard cell library
  - Nominal V_T level (~300 mV)
  - Low V_T level (~210 mV)
  - Low-V_T version has same physical footprint
  - ~15% improvement in gate delay
  - ~10x increase in leakage power
- ASIC Cu08 (90 nm) library: multi-V_T library
  - 2118 total cells in standard cell library
  - Intermediate-V_T (AVT) and low-V_T (LVT) version of each cell
  - Two more V_T levels being planned (very low V_T and high V_T)
31
Sp09 CMPEN 411 L14 S.31 An example to summarize all design-time techniques
[Figure: example logic network with the critical path marked]
32
Sp09 CMPEN 411 L14 S.32 Design Time Low Power Techniques
[Figure legend: Lower Vdd, Higher Vdd, Level Converter]
33
Sp09 CMPEN 411 L14 S.33 Design Time Low Power Techniques
[Figure legend: Higher Vth, Lower Vth]
34
Sp09 CMPEN 411 L14 S.34 Design Time Low Power Techniques
[Figure legend: Stack Forcing]
35
Sp09 CMPEN 411 L14 S.35 Low Power Techniques: Interaction w/ each other
[Figure legend: Higher Vth, Lower Vth]
Apply high Vth and size up to recover speed
36
Sp09 CMPEN 411 L14 S.36 Next Lecture and Reminders
Next lecture
- Dynamic logic
  - Reading assignment: Rabaey, et al., 6.3