
Keeping Hot Chips Cool Ruchir Puri, Leon Stok, Subhrajit Bhattacharya IBM T.J. Watson Research Center Yorktown Heights, NY Circuits R-US.





2 Keeping Hot Chips Cool
Ruchir Puri, Leon Stok, Subhrajit Bhattacharya
IBM T.J. Watson Research Center, Yorktown Heights, NY
Circuits R-US

3 So, What's Going On?
At the 65nm node, static power is equal to active power.
- Clock distribution accounts for half of active power.

4 Why Can't We Keep Scaling Vt?
[Figure: chart with axis ticks 1, 10, 100, 1000; remaining content not recovered]

5 Low Power Opportunities
Most power reduction techniques exploit the positive slack shown in the histogram.
[Figure: Power4 timing histogram, axis ticks 5%, 10%, 15%, 20%; annotation "Exploiting positive slacks"]
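As a rough illustration of the idea on this slide, the sketch below computes per-path slack (required time minus arrival time) and bins the positive slack as a percentage of the cycle time, in the spirit of the histogram. The path data, the function names, and the 5% bin width are hypothetical; none of them come from the Power4 analysis.

```python
from collections import Counter

def slack(required_ps, arrival_ps):
    """Slack of a timing path: positive means the path is faster than needed."""
    return required_ps - arrival_ps

def positive_slack_histogram(paths, cycle_ps, bin_pct=5):
    """Count paths per positive-slack bin, with slack as a % of cycle time.

    `paths` is a list of (required_ps, arrival_ps) tuples (hypothetical data).
    Bin key k covers slack in [k, k + bin_pct) percent of the cycle.
    """
    hist = Counter()
    for required_ps, arrival_ps in paths:
        s = slack(required_ps, arrival_ps)
        if s <= 0:
            continue  # critical or failing paths leave no room to trade delay for power
        pct = 100.0 * s / cycle_ps
        hist[int(pct // bin_pct) * bin_pct] += 1
    return dict(hist)

# Hypothetical example: 1000 ps cycle, four paths (one of them critical).
example_paths = [(1000, 1000), (1000, 930), (1000, 880), (1000, 820)]
histogram = positive_slack_histogram(example_paths, cycle_ps=1000)
```

Paths that land in the higher bins are the natural candidates for slower, lower-power implementations such as a lower supply voltage or a higher threshold voltage.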

6 Low Power Levers
Structural Techniques
- Voltage islands
- Multi-threshold devices
- Multi-oxide devices
- Minimize capacitance by custom design
- Power-efficient circuits
- Parallelism in micro-architecture
Dynamic Techniques
- Clock gating
- Power gating
- Variable frequency
- Variable voltage supply
- Variable device threshold

7 Outline
Clock Power: Clock & Latch Optimization
Active Power: Voltage Islands
Leakage Power: Power Gating


9 Minimizing Active Power: Coarse-Grained Voltage Islands
Trade off power for delay by running functional blocks at different voltages.
Can use a mix of low and high Vt to balance performance and leakage.
Switch off inactive blocks to reduce leakage power.
- E.g.: a telecom ASIC with 1.0/1.2 V islands saved 16% active power and 50% standby power.
[Figure label: High VT]
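To make the power-for-delay trade concrete, here is a minimal sketch using the standard first-order switching-power model P = a * C * Vdd^2 * f. The function names and the 50% capacitance split are illustrative assumptions, and level-shifter overhead is ignored.

```python
def dynamic_power(activity, cap_farads, vdd, freq_hz):
    """Switching power from the first-order model P = a * C * Vdd^2 * f."""
    return activity * cap_farads * vdd ** 2 * freq_hz

def island_active_power_saving(fraction_low, v_low, v_high):
    """Fractional active-power saving when `fraction_low` of the switched
    capacitance moves from v_high to v_low at the same frequency and activity.
    Level-shifter overhead is ignored (a simplifying assumption)."""
    return fraction_low * (1.0 - (v_low / v_high) ** 2)

# Hypothetical split: half of the switched capacitance on the 1.0 V island.
saving = island_active_power_saving(0.5, v_low=1.0, v_high=1.2)
```

Under this model, moving half of the switched capacitance to the 1.0 V rail saves roughly 15% of active power. The sketch only shows the quadratic dependence on supply voltage; it does not attempt to reproduce the telecom ASIC result.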

10 Fine-Grained Voltage Islands
[Figure: PowerPC 405 core; labels "Secondary power drop", Vddl = 1.2 V, Vddh = 1.5 V]
No timing degradation and no area increase for the core!

11 Outline
Clock Power: Clock & Latch Optimization
Active Power: Voltage Islands
Leakage Power: Power Gating

12 Minimizing Clock Power: Local Clock Buffer and Latch Clustering
Clocks consume a large amount of power in high-performance designs.
- A large portion of that power goes to the last stage of the clock tree.
Minimize the capacitive loading on local clock buffers by clustering latches around them.
- Tradeoff between latch placement flexibility and clock power savings.
- The reduction in clock skew between the capturing and launching latches compensates for the loss in latch placement flexibility.
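The clustering tradeoff can be sketched with a toy placement model. In the sketch below, each latch is pulled toward its local clock buffer (LCB) by a bounded Manhattan displacement, and the clock wire load is approximated as proportional to star-wiring length. The function names, the `max_move` bound (a stand-in for placement flexibility), and the coordinates are all hypothetical; a real flow would also account for latch pin capacitance and data-path timing.

```python
def manhattan(a, b):
    """Manhattan distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def clock_wire_cap(latches, lcb, cap_per_um=1.0):
    """Proxy for the wire load on an LCB: star wiring, capacitance ~ length."""
    return cap_per_um * sum(manhattan(p, lcb) for p in latches)

def pull_toward(latch, lcb, max_move):
    """Move a latch toward the LCB by at most max_move (Manhattan), x first."""
    x, y = latch
    step_x = max(-max_move, min(max_move, lcb[0] - x))
    remaining = max_move - abs(step_x)
    step_y = max(-remaining, min(remaining, lcb[1] - y))
    return (x + step_x, y + step_y)

def cluster_latches(latches, lcb, max_move):
    """Cluster all latches around one LCB under a displacement budget."""
    return [pull_toward(p, lcb, max_move) for p in latches]

# Hypothetical placement: one LCB at the origin and three latches.
lcb = (0, 0)
latches = [(10, 0), (0, 20), (-5, -5)]
clustered = cluster_latches(latches, lcb, max_move=8)
cap_before = clock_wire_cap(latches, lcb)
cap_after = clock_wire_cap(clustered, lcb)
```

A larger `max_move` lowers the wire load further but takes more freedom away from latch placement, which is the tradeoff the slide describes.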

13 Clock Power Savings
Reduces total capacitance on the local clock buffer by 25%.
Direct savings in clock power in the random control logic.

14 Outline
Clock Power: Clock & Latch Optimization
Active Power: Voltage Islands
Leakage Power: Power Gating

15 Minimizing Leakage Power: Power Supply Gating
Leakage power is now more than switching power.
- Limits the performance of microprocessors.
Power gating is one of the most effective ways of minimizing leakage power.
- Cut off power to inactive units/components.
Dynamic/workload-based power gating
- Reduces both gate and sub-threshold leakage.
- 20x to 2000x reduction in leakage with little or no cycle-time penalty.

16 Power Gating Concept: Performance on Demand
[Figure: two chip diagrams, each with L2, P1, P2, P3, P4, and Dedicated Units]
- Dedicated units off: more power available to scalar units, higher SPEC performance.
- Dedicated units on: dedicated units available for higher application performance.

17 Normal Operation Mode
[Figure labels: CORE, I_ACTIVE, VDDL, VGND, GNDL; footer I_DS vs. V_DS curves for V_GS = V_DD and V_GS = 0 V, with V_DS,LINEAR and I_DS,MAX marked]
To reduce performance degradation, the voltage drop across the SLEEP transistor should be minimized to reduce active leakage current.
Requires sizing up the footer device.

18 Sleep Mode
[Figure labels: CORE, VDDL, VGND, GNDL; footer I_DS vs. V_DS curves for V_GS = V_DD and V_GS = 0 V, with I_DS,MAX marked]
During sleep mode, all internal capacitive nodes and the VGND node charge up to near V_DD.
Requires sizing down the footer device to reduce standby leakage.

19 Wake-Up Mode
[Figure labels: CORE, I_TURN_ON, VDDL, VGND, GNDL, Rs; footer I_DS vs. V_DS curves for V_GS = V_DD and V_GS = 0 V, with I_DS,MAX marked]
When the SLEEP transistor is turned on, the maximum instantaneous current can flow.
Requires sizing up the footer device.
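The three modes pull the footer size in opposite directions, which a first-order model makes visible. The sketch below treats the footer in active mode as a linear resistor with R_on inversely proportional to width, and standby leakage as proportional to width. The function names and every numeric value (on-resistance, off-current, active current, target drop) are hypothetical placeholders, not data from the slides.

```python
def footer_drop_v(i_active_a, width_um, r_on_ohm_um):
    """Virtual-ground rise in active mode, with R_on = r_on_ohm_um / width_um."""
    return i_active_a * r_on_ohm_um / width_um

def min_footer_width_um(i_active_a, max_drop_v, r_on_ohm_um):
    """Smallest footer width that keeps the active-mode drop within max_drop_v."""
    return i_active_a * r_on_ohm_um / max_drop_v

def standby_leakage_a(width_um, i_off_a_per_um):
    """Sleep-mode leakage through the footer, assumed proportional to width."""
    return width_um * i_off_a_per_um

# Hypothetical numbers: 100 mA active current, 1 kOhm*um on-resistance,
# a 10 mV virtual-ground target, and 1 nA/um off-current.
width = min_footer_width_um(i_active_a=0.1, max_drop_v=0.01, r_on_ohm_um=1000.0)
drop = footer_drop_v(0.1, width, 1000.0)
leak = standby_leakage_a(width, i_off_a_per_um=1e-9)
```

Halving the allowed drop doubles both the required width and the standby leakage in this model, so the footer is sized just large enough to meet the active-mode and wake-up targets.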

20 Sleep / Wake / Run State Control
Enter sleep state: enable fence, deassert wake/run
Exit sleep state: assert wake, assert run, disable fence
[Figure: timing diagram; state labels run, sleep, run (idle); phase labels charge, off, discharge, discharge cycle (wake), charge cycles]
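The control sequence on this slide can be read as a small state machine. The sketch below is one possible interpretation, not the authors' controller: it assumes `wake` enables a weaker footer that discharges the virtual ground first, `run` enables the full footer, and `fence` isolates the gated block's outputs. The class name, the signal polarities, and the fixed `wake_cycles` count are all assumptions.

```python
from enum import Enum

class State(Enum):
    RUN = "run"
    SLEEP = "sleep"
    WAKE = "wake"

class PowerGateController:
    """Hypothetical sleep/wake/run sequencer for a footer-gated block."""

    def __init__(self, wake_cycles=2):
        self.state = State.RUN
        self.fence = False  # True isolates the block's outputs (assumed)
        self.wake = True    # weak footer enable (assumed meaning)
        self.run = True     # full footer enable (assumed meaning)
        self.wake_cycles = wake_cycles
        self._remaining = 0

    def enter_sleep(self):
        """Enter sleep: enable fence, then deassert wake and run."""
        if self.state is not State.RUN:
            raise RuntimeError("can only enter sleep from RUN")
        self.fence = True
        self.wake = False
        self.run = False
        self.state = State.SLEEP

    def exit_sleep(self):
        """Start wake-up: assert wake and let the virtual ground discharge."""
        if self.state is not State.SLEEP:
            raise RuntimeError("can only exit sleep from SLEEP")
        self.wake = True
        self._remaining = self.wake_cycles
        self.state = State.WAKE

    def tick(self):
        """Advance one clock cycle; finish wake-up after the discharge cycles."""
        if self.state is State.WAKE:
            self._remaining -= 1
            if self._remaining <= 0:
                self.run = True     # assert run
                self.fence = False  # disable fence
                self.state = State.RUN
```

Keeping the fence enabled until `run` is asserted means downstream logic never sees outputs from a partially powered block, which is the reason for ordering the exit sequence this way in the sketch.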

21 Footer Selection and Sizing
< 1% frequency loss, 10x-20x leakage reduction
[Figure: leakage reduction data labels 100x, 50x, 25x, 20x, 15.5x, 33x]

22 Power vs Performance Tradeoff
~8% performance degradation due to the sleep transistor with 1% area overhead
Target specification: 250 MHz at 0.9 V to 500 MHz at 1.4 V
1% footer size is used for a 2-stage pipelined 40-bit ALU
130nm hardware

23 Sleep Transistor Sizing and Performance
[Figure, 130nm hardware: annotations "More Than 8% Performance Degradation" and "Less Than 2% Performance Degradation"]

24 Leakage Power Reduction
~2000x: leakage suppression using power gating structure with 1% area overhead
~8.4x: leakage suppression using VDD scaling
130nm hardware

25 Physical Design: External Footer Switch

26 Physical Design: Internal Footer Switch
Internal fine-grained power gating is more efficient in addressing:
- Electromigration and current delivery.

27 Ground Redistribution
[Figure labels: Contact, M1, V1, M2, V2, M3, Footer Cell, Logic Device, Global ground, Virtual ground]
The 'real' chip-level ground distribution is M4 and above. It is unchanged by power gating.
This part of the redistribution is electrically similar to an unmodified distribution.

28 Physical Design: Footer Insertion
[Figure: layout without footers vs. with footers; footer rows marked]

29 Power Gating in High-Performance
Gated and non-gated logic have identical width
5% total area overhead for power gating
20x leakage reduction
<1% performance degradation
[Figure labels: Non-gated Logic, Gated Logic]

30 Power Gating: Footer Area Overhead
[Figure: data labels 10.4%, 5.7%; 10 mV virtual ground]

31 Conclusions
Power is the limiting factor in traditional CMOS scaling and must be dealt with aggressively.
- Controlling leakage is crucial for future scaling.
- Power gating and voltage islands are effective techniques to minimize leakage and active power.
- Special consideration must be given to clock distribution in high-performance designs to minimize clock power.
To keep hot chips cool, a holistic power minimization approach across the whole design stack is required, which must include:
- Device-level techniques
- Circuit-level techniques
- System-level power management

