Penn ESE534 Spring DeHon 1 ESE534 Computer Organization Day 8: February 10, 2010 Energy, Power, Reliability
Penn ESE534 Spring DeHon 2 Today Energy Tradeoffs? Voltage limits and leakage? Variations Transients Thermodynamics meets Information Theory (brief, if we get to it)
Penn ESE534 Spring DeHon 3 At Issue Many now argue power will be the ultimate scaling limit –(not lithography, costs, …) Proliferation of portable and handheld devices –…battery size and life biggest issues Cooling, energy costs may dominate cost of electronics –Even server room applications
Preclass 1 1GHz case –Voltage? –Energy per Operation? –Power required for 2 processors? 2GHz case –Voltage? –Energy per Operation? –Power required for 1 processor? Penn ESE534 Spring DeHon 4
5 Energy and Delay gd =Q/I=(CV)/I I d,sat =( C OX /2)(W/L)(V gs -V TH ) 2
Penn ESE534 Spring DeHon 6 Tradeoff E V 2 gd 1/V We can trade speed for energy E×( gd ) 2 constant Martin et al. Power-Aware Computing, Kluwer gd =(CV)/I I d,sat (V gs -V TH ) 2
Penn ESE534 Spring DeHon 7 Parallelism We have Area-Time tradeoffs Compensate slowdown with additional parallelism …trade Area for Energy Architectural Option
Penn ESE534 Spring DeHon 8 Question How far can this go? What limits us?
DeHon-Workshop FPT Challenge: Power
DeHon-Workshop FPT Origin of Power Challenge Limited capacity to remove heat –~100W/cm 2 force air –1-10W/cm 2 ambient Transistors per chip grow at Moore’s Law rate = (1/F) 2 Energy/transistor must decrease at this rate to keep constant power density E/tr CV 2 –…but V scaling more slowly than F
Penn ESE534 Spring DeHon 11 Energy per Operation C total = # transistors × C tr C tr scales (down) as F # transistors scales as F -2 …ok if V scales as F…
DeHon-Workshop FPT ITRS Vdd Scaling: V Scaling more slowly than F
DeHon-Workshop FPT CV 2 scaling from ITRS: More slowly than (1/F) 2
DeHon-Workshop FPT Origin of Power Challenge Transistors per chip grow at Moore’s Law rate = (1/F) 2 Energy/tr must decrease at this rate to keep constant E/tr CV 2 f
DeHon-Workshop FPT Historical Power Scaling [Horowitz et al. / IEDM 2005]
Impact Can already place more transistors on a chip than we can afford to turn on. Power potential challenge/limiter for all future chips. Penn ESE534 Spring DeHon 16
DeHon-Workshop FPT Source: Carter/Intel 17 Power Limits Integration Impact
Penn ESE534 Spring DeHon 18 How far can we reduce voltage?
Penn ESE534 Spring DeHon 19 Limits Ability to turn off the transistor Noise Parameter Variations
MOSFET Conduction Penn ESE534 Spring DeHon 20 From:
Transistor Conduction Penn ESE534 Spring DeHon 21 Three regions –Subthreshold (V gs <V TH ) –Linear (V gs >V TH ) and (V ds < (V gs -V TH )) –Saturation (V gs >V TH ) and (V ds > (V gs -V TH ))
Saturation Region Penn ESE534 Spring DeHon 22 I d,sat =( C OX /2)(W/L)(V gs -V TH ) 2 Saturation Region (V gs >V TH ) (V ds > (V gs -V TH ))
Linear Region Penn ESE534 Spring DeHon 23 I d,lin =( C OX )(W/L)(V gs -V TH )V ds -(V ds ) 2 /2 (V gs >V TH ) (V ds < (V gs -V TH ))
DeHon-Workshop FPT Subthreshold Region (V gs <V TH ) [Frank, IBM J. R&D v46n2/3p235]
Operating a Transistor Concerned about I on and I off I on drive current for charging –Determines speed: T gd = CV/I I off leakage current –Determines leakage power E leak = V×I leak ×T cycle Penn ESE534 Spring DeHon 25
DeHon-Workshop FPT Leakage To avoid leakage want I off very small Switch V from 0 to V dd V gs in off state is 0 V gs <V TH
DeHon-Workshop FPT Leakage S 90mV for single gate S 70mV for double gate 4 orders of magnitude I VT /I off V TH >280mV Leakage limits V TH in turn limits V dd
How maximize I on /I off ? Maximize I on /I off – for given V dd ? E sw CV 2 Get to pick V TH, V dd Penn ESE534 Spring DeHon 28 I d,lin =( C OX )(W/L)(V gs -V TH )V ds -(V ds ) 2 /2 I d,sat =( C OX /2)(W/L)(V gs -V TH ) 2
Preclass 2 E = E sw + E leak E leak = V×I leak ×T cycle E sw CV 2 I chip-leak = N devices ×I tr-leak Penn ESE534 Spring DeHon 29
Preclass 2 E leak (V) ? T cycle (V)? Penn ESE534 Spring DeHon 30
In Class Assign calculations Collect results Penn ESE534 Spring DeHon 31
Values VT(v)Esw(V)Eleak(V)E(V) E E E E E E E E E E E E E E E E E E E E-074E E E E E E E E E E E E E E E E E E-118.1E E-10 Penn ESE534 Spring DeHon 32
Graph for In Class Penn ESE534 Spring DeHon 33
Impact Subthreshold slope prevents us from scaling voltage down arbitrarily. Induces a minimum operating energy. Penn ESE534 Spring DeHon 34
DeHon-Workshop FPT Challenge: Variation (This section was a little rushed)
Penn ESE535 Spring DeHon 36 Statistical Dopant Count and Placement [Bernstein et al, IBM JRD 2006]
Penn ESE535 Spring DeHon 37 V th 65nm [Bernstein et al, IBM JRD 2006]
DeHon-Workshop FPT Variation Fewer dopants, atoms increasing Variation How do we deal with variation? 38 % variation in V TH (From ITRS prediction)
Penn ESE534 Spring DeHon 39 Impact of Variation? Higher V TH ? –Not drive as strongly –I d,sat (V gs -V TH ) 2 Lower V TH ? –Not turn off as well leaks more
Penn ESE535 Spring DeHon 40 Variation Margin for expected variation Must assume V TH can be any value in range Probability Distribution V TH I on,min =I on ( Vth,max ) I d,sat (V gs -V TH ) 2
Penn ESE535 Spring DeHon 41 Margining Must raise V dd to accommodate worst- case value increase energy Probability Distribution V TH I on,min =I on ( Vth,max ) I d,sat (V gs -V TH ) 2
DeHon-Workshop FPT Variation Increasing variation forces higher voltages –On top of our leakage limits 42
DeHon-Workshop FPT Margins growing due to increasing variation Margined value may be worse than older technology? Variations Probability Distribution Delay New Old
DeHon-Workshop FPT End of Energy Scaling? [Bol et al., IEEE TR VLSI Sys 17(10):1508—1519] Black nominal Grey with variation
Chips Growing Larger chips sample further out on distribution curve Penn ESE534 Spring DeHon 45 From:
Lecture Ended Here (Didn’t really cover material in transient and thermodynamics sections) Penn ESE534 Spring DeHon 46
Challenge Transients Penn ESE534 Spring DeHon 47
Transient Sources Effects –Thermal noise –Timing –Ionizing particles particle 10 5 to 10 6 electrons Calculated gates with electrons last time –Even if CMOS restores, takes time Penn ESE534 Spring DeHon 48
Voltage and Error Rate Penn ESE534 Spring DeHon 49 [Austin et al.--IEEE Computer, March 2004]
Errors versus Frequency Penn ESE534 Spring DeHon 50 Conventional Design Max TP Resilient Design Max TP V CC & Temperature F CLK Guardband [Bowman, ISSCC 2008]
DeHon-Workshop FPT Scaling and Error Rates Source: Carter/Intel 51 Increasing Error Rates Technology (nm) SEU/bit Norm to 130nm cache arrays logic 2X bit/latch count increase per generation
DeHon-Workshop FPT Power and Reliability Intersection is the challenge Push V dd in opposite directions Both reach inflection points –From doesn’t matter –To major concern
Penn ESE534 Spring DeHon 53 Thermodynamics
Penn ESE534 Spring DeHon 54 Lower Bound? Reducing entropy costs energy Single bit gate output –Set from previous value to 0 or 1 –Reduce state space by factor of 2 –Entropy: S= k×ln(before/after)=k×ln2 –Energy=T S=kT×ln(2) Naively: setting a bit costs at least kT×ln(2)
Penn ESE534 Spring DeHon 55 Numbers (ITRS 2005) kT×ln(2) = 2.87× J (at R.T. K=300) W/L=3 W=21nm=0.021 m C 8× F F Table 41d E op =CV 2 =2.5 × F
Penn ESE534 Spring DeHon 56 Recycling… Thermodynamics only says we have to dissipate energy if we discard information Can we compute without discarding information? Will that help us?
Penn ESE534 Spring DeHon 57 Three Reversible Primitives
Penn ESE534 Spring DeHon 58 Universal Primitives These primitives –Are universal –Are all reversible If keep all the intermediates they produce –Discard no information –Can run computation in reverse
Penn ESE534 Spring DeHon 59 Thermodynamics In theory, at least, thermodynamics does not demand that we dissipate any energy (power) in order to compute.
Admin Assignment grades, feedback on blackboard for HW1 and HW2 Class Wed. No class next Monday (2/22) Penn ESE534 Spring DeHon 60
Penn ESE534 Spring DeHon 61 Big Ideas Can trade time for energy –…area for energy Noise and leakage limit voltage scaling Power major limiter going forward –Can put more transistors on a chip than can switch Continued scaling demands –Deal with noisier components High variation and high transient upsets Thermodynamically admissible to compute without dissipating energy