Basics of Energy & Power Dissipation Lecture notes S. Yalamanchili, S. Mukhopadhyay, A. Chowdhary
Outline: Basic Concepts; Dynamic power; Static power; Time, Energy, Power Tradeoffs; Activity model for power estimation; Combinational and sequential logic
Background Reading:
http://en.wikipedia.org/wiki/CPU_power_dissipation
http://en.wikipedia.org/wiki/CMOS#Power:_switching_and_leakage
http://www.xbitlabs.com/articles/cpu/display/core-i5-2500t-2390t-i3-2100t-pentium-g620t.html
http://www.cpu-world.com/info/charts.html
Where Does the Power Go in CMOS? Dynamic power consumption: caused by switching transitions (the cost of switching state). Static power consumption: caused by leakage currents in the absence of any switching activity. Power consumption per transistor changes with each technology generation, but it is no longer reducing at the same rate. What happens to power density? [Figures: CMOS inverter (Vin, Vout, Vdd, PMOS, NMOS, Ground); AMD Trinity APU]
n-channel MOSFET. [Figure: n-channel MOSFET cross-section and symbol, labeled GATE, SOURCE, DRAIN, BODY, channel length L, and oxide thickness tox] Vgs < Vth: transistor off (Vth is the threshold voltage). Vgs > Vth: transistor on. Impact of threshold voltage: a higher Vth gives slower switching speed and lower leakage; a lower Vth gives faster switching speed and higher leakage. The actual physics is more complex, but this will do for now!
Charge as a State Variable. For computation we should be able to identify whether each of the variables (a, b, c, x, y) is in a ‘1’ or a ‘0’ state. We could have used any physical quantity to do that: voltage, current, electron spin, orientation of magnetic field, and so on. All nodes have some capacitance associated with them. We choose the voltage on the capacitor at each node (a, b, c, ...) to distinguish between a ‘0’ and a ‘1’. Logic 1: cap is charged. Logic 0: cap is discharged. [Figure: logic network with nodes a, b, c, x, y]
Abstracting Energy Behavior. How can we abstract energy consumption for a digital device? Consider the energy cost of charge transfer: the transistors are modeled as an on/off resistance and the load is modeled as an output capacitance. [Figure: CMOS inverter (Vin, Vout, Vdd, PMOS, NMOS, Ground) next to its resistance/capacitance model]
Switch from one state to another. To perform computation, we need to switch from one state to another. Logic 1 (cap is charged): connect the cap to Vdd through an ON PMOS. Logic 0 (cap is discharged): connect the cap to GND through an ON NMOS. The logic dictates whether a node capacitor will be charged or discharged.
Dynamic Power. Dynamic power is used in charging and discharging the capacitances in the CMOS circuit. CL = load capacitance. [Figure: voltage vs. time for the input to a CMOS inverter (amplitude VDD, period T), marking the intervals where the output capacitor is charging and discharging; supply current iDD into the load CL]
Switching Delay: charging to logic 1. [Figure: Vout rising toward Vdd]
Switching Delay: discharging to logic 0. [Figure: Vout falling from Vdd]
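The charging and discharging curves can be sketched numerically. This is a minimal sketch that assumes the first-order RC abstraction used in these notes (a single on-resistance driving the output capacitance); the component values and the function names are hypothetical and chosen only for illustration.

```python
import math

def v_charging(t, vdd, r, c):
    """Output voltage at time t while charging from 0 toward vdd (first-order RC)."""
    return vdd * (1.0 - math.exp(-t / (r * c)))

def v_discharging(t, vdd, r, c):
    """Output voltage at time t while discharging from vdd toward ground."""
    return vdd * math.exp(-t / (r * c))

def time_to_fraction(fraction, r, c):
    """Time for the charging output to reach `fraction` of vdd (0 < fraction < 1)."""
    return -r * c * math.log(1.0 - fraction)

# Hypothetical values: 1 V supply, 10 kOhm on-resistance, 1 fF load.
VDD, R, C = 1.0, 10e3, 1e-15
tau = R * C  # the RC time constant
print(f"tau = {tau:.3e} s")
print(f"charging, t = tau:    {v_charging(tau, VDD, R, C):.3f} V")   # about 0.632 V
print(f"discharging, t = tau: {v_discharging(tau, VDD, R, C):.3f} V")  # about 0.368 V
print(f"time to reach Vdd/2:  {time_to_fraction(0.5, R, C):.3e} s")
```

Under this model the delay scales with the product RC, which is why both a larger load capacitance and a weaker (higher-resistance) transistor slow the gate down.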
Switching Energy. CL is the load capacitance. Energy dissipated per transition: $E = \frac{1}{2} C_L V_{dd}^2$. Note: an equal amount is dissipated when CL is discharged. Power $= C_L V_{dd}^2 f$ for a node switching at a rate f (f may be ½ the clock rate).
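The relationship between per-transition energy and power can be checked with a few lines of arithmetic. This sketch assumes the standard result for charging a capacitor through a resistance (half of the energy drawn from the supply is dissipated while charging, the other half while discharging); the function names and numeric values are illustrative assumptions.

```python
def transition_energy(c_load, vdd):
    """Energy dissipated during one transition (charge or discharge): 1/2 * C * V^2."""
    return 0.5 * c_load * vdd ** 2

def cycle_energy(c_load, vdd):
    """Energy dissipated over one full charge/discharge cycle: C * V^2."""
    return 2.0 * transition_energy(c_load, vdd)

def node_dynamic_power(c_load, vdd, f):
    """Average power of a node completing f charge/discharge cycles per second."""
    return cycle_energy(c_load, vdd) * f

# Hypothetical node: 1 fF load, 1 V supply, switching at 1 GHz.
C_L, VDD, F = 1e-15, 1.0, 1e9
print(f"energy per transition: {transition_energy(C_L, VDD):.2e} J")
print(f"energy per cycle:      {cycle_energy(C_L, VDD):.2e} J")
print(f"dynamic power:         {node_dynamic_power(C_L, VDD, F):.2e} W")
```

Because energy depends on the square of the supply voltage, halving the voltage in this model cuts the energy per transition to one quarter.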
Power vs. Energy. Power is the rate of expenditure of energy: one joule/sec = one watt. Energy is the area under the power-versus-time curve. Both profiles use the same amount of energy, but at different rates (powers). [Figure: two plots of power (watts) vs. time, one with power levels P0, P1, P2 and one constant at P0, both enclosing the same area]
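The "same area under the curve" point can be made concrete with two piecewise-constant power profiles. This is a sketch only: the profiles, their values, and the helper names are hypothetical.

```python
def energy_joules(profile):
    """Energy of a piecewise-constant profile given as (watts, seconds) segments."""
    return sum(power * duration for power, duration in profile)

def peak_power(profile):
    """Highest power level reached by the profile."""
    return max(power for power, _ in profile)

def average_power(profile):
    """Total energy divided by total time."""
    total_time = sum(duration for _, duration in profile)
    return energy_joules(profile) / total_time

# Hypothetical profiles: one finishes quickly at high power, one runs long at low power.
bursty = [(4.0, 1.0), (2.0, 1.0), (1.0, 2.0)]
flat = [(1.0, 8.0)]

for name, profile in (("bursty", bursty), ("flat", flat)):
    print(name, energy_joules(profile), "J,",
          peak_power(profile), "W peak,",
          average_power(profile), "W average")
```

Both profiles expend 8 J, but the first peaks at 4 W while the second never exceeds 1 W, which matters for a thermal limit even though a battery would drain by the same amount.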
Dynamic Power vs. Dynamic Energy. Dynamic power: consider the rate at which switching (energy dissipation) takes place. Activity factor = fraction of total capacitance that switches each cycle. [Figure: CMOS inverter drawing supply current iDD into load CL; input waveform of amplitude VDD and period T, marking the output capacitor charging and discharging]
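One way to make the activity factor concrete is to estimate it from a cycle-by-cycle trace of node values. The sketch below is an assumption-laden illustration: it counts only 0-to-1 transitions (the ones that draw C·V² from the supply), so that dynamic power can be written as alpha · C_total · Vdd² · f_clk. The trace, the capacitances, and the function names are hypothetical.

```python
def activity_factor(traces, caps):
    """Average fraction of the total capacitance that is charged per clock cycle.

    traces: dict mapping node name -> list of 0/1 values, one per cycle
    caps:   dict mapping node name -> capacitance in farads
    Only 0 -> 1 transitions are counted (a convention assumed for this sketch).
    """
    total_cap = sum(caps[node] for node in traces)
    n_cycles = len(next(iter(traces.values()))) - 1
    charged = 0.0
    for node, values in traces.items():
        rising = sum(1 for prev, cur in zip(values, values[1:])
                     if prev == 0 and cur == 1)
        charged += rising * caps[node]
    return charged / (total_cap * n_cycles)

def block_dynamic_power(alpha, c_total, vdd, f_clk):
    """Dynamic power of a block: alpha * C_total * Vdd^2 * f_clk."""
    return alpha * c_total * vdd ** 2 * f_clk

# Hypothetical three-node trace over four clock cycles, 1 fF per node.
traces = {"a": [0, 1, 0, 1, 0], "b": [0, 0, 0, 0, 1], "c": [1, 1, 1, 1, 1]}
caps = {"a": 1e-15, "b": 1e-15, "c": 1e-15}

alpha = activity_factor(traces, caps)
print(f"activity factor: {alpha:.3f}")
print(f"dynamic power:   {block_dynamic_power(alpha, sum(caps.values()), 1.0, 1e9):.2e} W")
```

A node that toggles every cycle, like `a`, is charged only every other cycle, which matches the earlier remark that a node's switching rate may be half the clock rate.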
Higher Level Blocks. [Figure: two CMOS gate schematics with signals A, B, C and supply Vdd]
Implications. What if I halved the clock rate to reduce dynamic energy? What if I turned off the clock to a block of logic? What if I lower the voltage? How can I reduce the capacitance? [Figure: combinational logic block with clk, cond, and input signals]
Static Power. Technology scaling has caused transistors to become smaller and smaller. As a result, static power has become a substantial portion of the total power. Leakage components: gate leakage, junction leakage, and sub-threshold leakage. [Figure: transistor cross-section (GATE, SOURCE, DRAIN) showing the leakage paths]
Energy-Delay Interaction. Delay decreases with supply voltage, but energy/power increases. [Figure: two plots against VDD, one of energy and delay, one of the Energy-Delay Product (EDP)]
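The trade-off can be explored with a toy model. The notes give only the qualitative trend, so the formulas here are assumptions made for illustration: switching energy proportional to Vdd², and gate delay proportional to Vdd / (Vdd - Vth)^a (an alpha-power-law style delay model) with hypothetical values Vth = 0.3 V and a = 2.

```python
def switching_energy(vdd, c_load=1.0):
    """Relative switching energy, assumed proportional to C * Vdd^2."""
    return c_load * vdd ** 2

def gate_delay(vdd, vth=0.3, k=1.0, a=2.0):
    """Relative gate delay under the assumed model k * Vdd / (Vdd - Vth)^a."""
    if vdd <= vth:
        raise ValueError("supply voltage must exceed the threshold voltage")
    return k * vdd / (vdd - vth) ** a

def energy_delay_product(vdd, vth=0.3):
    """Energy-Delay Product (EDP) in arbitrary units."""
    return switching_energy(vdd) * gate_delay(vdd, vth)

def best_vdd(vth=0.3, v_min=0.4, v_max=1.5, step=0.01):
    """Grid-search the supply voltage that minimizes EDP."""
    n_steps = int(round((v_max - v_min) / step))
    candidates = [v_min + i * step for i in range(n_steps + 1)]
    return min(candidates, key=lambda v: energy_delay_product(v, vth))

for vdd in (0.5, 0.7, 0.9, 1.1, 1.3):
    print(f"Vdd={vdd:.1f}  energy={switching_energy(vdd):.3f}  "
          f"delay={gate_delay(vdd):.3f}  EDP={energy_delay_product(vdd):.3f}")
print(f"EDP-minimizing Vdd on this grid: {best_vdd():.2f} V")
```

With these assumed parameters the EDP has a minimum at an intermediate voltage: below it the delay penalty dominates, above it the quadratic energy cost dominates.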
Static Energy-Delay Interaction. Static energy increases exponentially with a decrease in threshold voltage. Delay increases with threshold voltage. [Figure: leakage and delay plotted against Vth; transistor cross-section labeled GATE, SOURCE, DRAIN, L, tox]
Optimizing Power vs. Energy. Maximize battery life: minimize energy. Thermal envelopes: minimize peak power. Example: should we reduce the clock frequency by 2 (reducing dynamic power), or reduce both voltage and frequency by 2 (reducing both static and dynamic power)? Let the dynamic power and the static power each be P, and let the execution time be T. Option 1: (P/2 + P) · 2T = 3PT. Option 2: (P/8 + P/2) · 2T = 1.25PT. But the work takes twice as long.
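The two options can be reproduced with a small calculation. The scaling rules in this sketch are the ones implied by the arithmetic above and are otherwise assumptions: dynamic power scales as V²·f, static power scales linearly with V, and run time scales as 1/f for a fixed amount of work. The function name is hypothetical.

```python
def total_energy(p_dynamic, p_static, time, v_scale=1.0, f_scale=1.0):
    """Energy for a fixed workload after scaling supply voltage and clock frequency.

    Assumed scaling: dynamic power ~ V^2 * f, static power ~ V, run time ~ 1 / f.
    """
    dynamic = p_dynamic * v_scale ** 2 * f_scale
    static = p_static * v_scale
    return (dynamic + static) * (time / f_scale)

P, T = 1.0, 1.0  # normalized: dynamic power = static power = P, run time = T

baseline = total_energy(P, P, T)
option_1 = total_energy(P, P, T, f_scale=0.5)               # halve frequency only
option_2 = total_energy(P, P, T, v_scale=0.5, f_scale=0.5)  # halve voltage and frequency

print(f"baseline: {baseline} PT")
print(f"option 1: {option_1} PT")
print(f"option 2: {option_2} PT")
```

Under these assumptions the baseline costs 2PT, so halving only the frequency actually increases total energy (static power is paid for twice as long), while scaling voltage and frequency together reduces it.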
What About Wires? Lumped RC model: resistance per unit length and capacitance per unit length. We will not directly address delay or energy expended in the interconnect in this class. Simple architecture model: lump the energy/power with the source component.
ALU Energy Consumption. Can we count the number of transitions in each 1-bit ALU for an operation? Can we estimate static power? Computing per-operation energy. [Figure: ALU built from 1-bit ALU slices, with Operation and Result signals]
Closer Look: A 4-bit Ripple Adder. [Figure: 4-bit ripple adder with carry-in Cin and sum outputs S3, S2, S1, S0] Critical path $= D_{XOR} + 4 \cdot (D_{AND} + D_{OR})$ for a 4-bit ripple adder (9 gate levels). For an N-bit ripple adder, critical path delay $\approx 2(N-1) + 3 = 2N + 1$ gate delays. Activity (and therefore power) is a function of the input data values!
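Both claims, the critical-path length and the data dependence of activity, can be illustrated with a small gate-level model. This is a sketch under stated assumptions: each full adder is built from two XOR gates, two AND gates, and one OR gate, every gate has unit delay by default, and glitches during carry propagation are ignored (only final node values are compared). The function names are hypothetical.

```python
def ripple_critical_path(n_bits, d_xor=1, d_and=1, d_or=1):
    """Critical path of an n-bit ripple adder: D_XOR + n_bits * (D_AND + D_OR)."""
    return d_xor + n_bits * (d_and + d_or)

def adder_nodes(a, b, n_bits, cin=0):
    """Settled values of the gate outputs (p, g, t, sum, carry) for every bit position."""
    nodes = []
    carry = cin
    for i in range(n_bits):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        p = ai ^ bi        # first XOR
        g = ai & bi        # generate: AND of the inputs
        t = p & carry      # AND of propagate and incoming carry
        s = p ^ carry      # sum bit: second XOR
        carry = g | t      # carry out: OR
        nodes.extend([p, g, t, s, carry])
    return nodes

def toggles(prev_inputs, next_inputs, n_bits=4):
    """Number of gate outputs whose settled value changes between two additions."""
    before = adder_nodes(*prev_inputs, n_bits)
    after = adder_nodes(*next_inputs, n_bits)
    return sum(x != y for x, y in zip(before, after))

print("4-bit critical path:", ripple_critical_path(4), "gate delays")
print("32-bit critical path:", ripple_critical_path(32), "gate delays")
print("0+0 -> 1+0 toggles:", toggles((0, 0), (1, 0)))
print("0+0 -> 15+1 toggles:", toggles((0, 0), (15, 1)))
```

The second input pair ripples a carry through every bit position and flips many more nodes than the first, so the same adder dissipates a different amount of switching energy depending on its operands.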
Modeling Component Energy. Per-use energies can be estimated from gate-level designs and analyses, circuit-level designs and analyses, or implementation and measurement. There are various open-source tools for analysis, as well as commercial tools from Mentor, Cadence, Synopsys, etc. Estimation flow: the hardware design and technology parameters feed a circuit-level estimation tool, which produces estimation results (area, energy, timing, etc.).
Example: A Simple Energy Model. We can use a simple model of per-access energy for the architecture components at 16 nm.

Common Components                   Access            Energy (10^-12 joules)
Inst. Cache + TLB                   Read              19.22
Inst. Cache + TLB                   Write             21.6
Data Cache + TLB                    Read              25.28
Data Cache + TLB                    Write             27.26
Inst. Decoder                       Logic Switching   16.78
Inst. Registers                     Read              2.74
Inst. Registers                     Write             4.38
FP. Registers                       Read              1.26
FP. Registers                       Write             1.98
Other Buffers                       Read              9.74
Other Buffers                       Write             11.18
ALU + Result Bus (interconnect)     Logic Switching   123.2
FPU + Result Bus (interconnect)     Logic Switching   241.02

Each unit can be accessed multiple times depending on instruction type. An x86 instruction consumes roughly 600 pJ to 4 nJ of dynamic energy.
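A per-access model like this can be turned into a per-instruction estimate by summing the energy of every unit an instruction touches. In the sketch below the energy values come from the table, but the dictionary keys, the function name, and in particular the access counts for the two example instructions are hypothetical; they cover only a subset of the units a real instruction exercises.

```python
# Per-access dynamic energy in picojoules, taken from the table above.
ACCESS_ENERGY_PJ = {
    ("icache", "read"): 19.22,
    ("icache", "write"): 21.6,
    ("dcache", "read"): 25.28,
    ("dcache", "write"): 27.26,
    ("decoder", "switch"): 16.78,
    ("inst_regs", "read"): 2.74,
    ("inst_regs", "write"): 4.38,
    ("fp_regs", "read"): 1.26,
    ("fp_regs", "write"): 1.98,
    ("buffers", "read"): 9.74,
    ("buffers", "write"): 11.18,
    ("alu", "switch"): 123.2,
    ("fpu", "switch"): 241.02,
}

def instruction_energy_pj(accesses):
    """Sum per-access energies, given a dict of (unit, access type) -> access count."""
    return sum(ACCESS_ENERGY_PJ[key] * count for key, count in accesses.items())

# Hypothetical access mixes (assumptions for illustration, not measured data).
ADD_ACCESSES = {
    ("icache", "read"): 1, ("decoder", "switch"): 1,
    ("inst_regs", "read"): 2, ("alu", "switch"): 1, ("inst_regs", "write"): 1,
}
LOAD_ACCESSES = {
    ("icache", "read"): 1, ("decoder", "switch"): 1,
    ("inst_regs", "read"): 1, ("alu", "switch"): 1,
    ("dcache", "read"): 1, ("inst_regs", "write"): 1,
}

print(f"register add: {instruction_energy_pj(ADD_ACCESSES):.2f} pJ")
print(f"load:         {instruction_energy_pj(LOAD_ACCESSES):.2f} pJ")
```

Keeping the access counts separate from the energy table means the same table can be reused for any instruction type, or scaled by an instruction trace to estimate whole-program dynamic energy.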
Summary. There are two major classes of energy/power dissipation: static and dynamic. Managing energy is different from managing power, and the two lead to different solutions. Technology plays a major role in determining relative costs. The energy of components is often estimated using approximate models of switching activity.
Study Guide
Explain the difference between energy dissipation and power dissipation.
Distinguish between static power dissipation and dynamic power dissipation.
What is the impact of threshold voltage on the delay and energy dissipation?
As you increase the supply voltage, what is the behavior of the delay of logic elements? Why?
As you increase the supply voltage, what is the behavior of static and dynamic energy and static and dynamic power of logic elements?
Study Guide (cont.)
Do you expect the 0-1 and 1-0 transitions at the output of a gate to dissipate the same amount of energy?
For a mobile device, would you optimize power or energy? Why? What are the consequences of trying to optimize one or the other?
Why does the energy dissipation of a 32-bit integer adder depend on the input values?
If I double the processor clock frequency and run the same program, will it take less or more energy?
Glossary: Dynamic Energy, Dynamic Power, Load capacitance, Static Energy, Static Power, Time constant, Threshold voltage, Switching energy
Clocked CMOS Logic. Eliminates the PMOS array, which would take 3X the chip area of an NMOS array. Clocking eliminates "race" conditions. [Figure: NAND gate (inputs A, B; output Out; load CL) drawn as standard CMOS and as clocked mixed-MOS with a Clock input] J. A. Copeland, R. H. Krambeck, "Functionally Static Type Semiconductor Shift Register with Half Dynamic-Half Static Stages," patent 3,993,916, Nov. 23, 1976.