Published by 용하 연. Modified over 6 years ago.
1
Microarchitectural Techniques for Power Gating of Execution Units
Authors: Zhigang Hu, Alper Buyuktosunoglu, Viji Srinivasan, Victor Zyuban, Hans Jacobson, Pradip Bose
IBM T.J. Watson Research Center
In International Symposium on Low Power Electronics and Design (ISLPED), 2004, pp. 32-37.
Presenter: Sai Raghunath T
2
Sources of Power dissipation
Sub-threshold leakage
Gate leakage current
Circuit-level approaches for leakage power reduction:
  Body bias control
  Dual-threshold domino circuits
  Input vector control
  Power gating
3
Architectural level leakage power reduction in caches and buffers
Tristating the drivers of SRAM bitlines
Determining sleep-mode activation policies for the integer functional units using dual-Vt domino logic circuits
Using the compiler to detect long idle periods for different functional units and enable power gating
4
Work done in the paper:
Exploiting workload phases and characteristics to dynamically power gate selected units within a pipeline OFF and ON, using a time-based technique and a branch-prediction-guided technique
Specifications of the out-of-order issue superscalar processor, Turandot
5
Fundamentals of Power gating:
Power gating is achieved by adding a suitably sized header or footer device to a circuit. The ‘Sleep’ signal is applied when the logic detects a sufficiently long idle period, and the macro is turned OFF.
6
Sequence
T1-T0 = T(idle detect)
T2-T1 = T(idle delay)
T3-T2 = T(breakeven)
T4-T2 = T(full discharge)
T5 = detection of next busy interval
T6-T5 = T(busy delay)
T7-T6 = T(wakeup)
Sequence
1. T0 -> T1 = Leakage energy
2. T1 -> T2 = Overhead energy + Leakage energy (overhead energy is the energy required to generate the ‘Sleep’ signal)
   Savings in leakage energy increase as the supply voltage decreases
3. T5 -> T6 = Overhead energy
4. T6 -> T7 = Leakage energy
7
T(breakeven) is the point at which the aggregate leakage energy savings, E(avg saved), equal the energy overhead of switching the header/footer device ON and OFF.
Typically, the value of N(breakeven) is 10.
DIBL = Drain-Induced Barrier Lowering factor (typically 0.1)
W_H = (total area of header device) / (total area of clock-gated macro)
α = switching factor
m = 0.1
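The breakeven definition above can be illustrated with a minimal sketch. It assumes the leakage energy saved per gated cycle is constant, which is a simplification, and the function names and energy values are hypothetical, in arbitrary units. It is not the closed-form expression from the paper.

```python
import math


def breakeven_cycles(overhead_energy: float, saved_per_cycle: float) -> int:
    """Smallest number of gated cycles whose cumulative leakage savings
    cover the energy spent switching the header/footer device OFF and ON.

    Assumption: savings per gated cycle are constant.
    """
    if saved_per_cycle <= 0:
        raise ValueError("per-cycle leakage savings must be positive")
    return math.ceil(overhead_energy / saved_per_cycle)


def net_savings(gated_cycles: int, overhead_energy: float, saved_per_cycle: float) -> float:
    """Net energy saved by one sleep episode; negative when it ends before breakeven."""
    return gated_cycles * saved_per_cycle - overhead_energy


# Hypothetical values in arbitrary energy units.
n_breakeven = breakeven_cycles(overhead_energy=10.0, saved_per_cycle=1.0)
print(n_breakeven)                 # number of cycles needed to break even
print(net_savings(4, 10.0, 1.0))   # short sleep episode: net loss
print(net_savings(25, 10.0, 1.0))  # long sleep episode: net gain
```

A sleep episode shorter than the breakeven count costs more energy than it saves, which is why the gating policy has to predict long idle periods.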
8
Power gating of execution units
Quantifying the power gating potential of an out-of-order superscalar processor model using different applications from the SPEC2K suite.
Assumptions:
T(idle delay) = T(busy delay) = 0 → perfect predictor
T(idle) > T(overhead), where T(overhead) = T(wakeup) + T(breakeven)
9
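The following equations estimate the fraction of cycles the units can be power gated:
Opp cycles = sum of (T(idle) - T(overhead)) over all idle periods with T(idle) > T(overhead)
Power gating potential = Opp cycles / total cycles
Ex: Sequence of activity bits of some unit with idle periods of 5, 4, and 6 cycles in 33 cycles, T(overhead) = 3
Opp cycles = (5-3) + (4-3) + (6-3) = 6
Power gating potential = 6/33 = 18.18% ≈ 18%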
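The worked example above can be reproduced with a short sketch. The activity-bit trace is hypothetical: the original sequence is not in the transcript, so this one is built only to have idle periods of 5, 4, and 6 cycles within 33 cycles. The function names are illustrative as well.

```python
from itertools import groupby


def idle_runs(activity):
    """Lengths of maximal runs of idle cycles (0 = idle, 1 = busy)."""
    return [len(list(group)) for bit, group in groupby(activity) if bit == 0]


def power_gating_potential(activity, t_overhead):
    """Return (opportunity cycles, fraction of all cycles).

    Only idle periods longer than t_overhead contribute, and each
    contributes its length minus t_overhead.
    """
    opp = sum(run - t_overhead for run in idle_runs(activity) if run > t_overhead)
    return opp, opp / len(activity)


# Hypothetical 33-cycle trace with idle periods of 5, 4, and 6 cycles.
trace = [1] * 4 + [0] * 5 + [1] * 6 + [0] * 4 + [1] * 5 + [0] * 6 + [1] * 3

opp, fraction = power_gating_potential(trace, t_overhead=3)
print(opp, f"{fraction:.2%}")  # 6 18.18%
```

Raising `t_overhead` shrinks the potential quickly, because short idle periods stop contributing at all.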
10
Power gating potential averaged across SPEC2K FP applications for various values of T(overhead)
11
Power gating potential averaged across SPEC2K integer applications for various values of T(overhead)
12
Time-Based Power Gating:
Assumptions:
T(idle delay) is folded into T(breakeven): T(breakeven) = T(breakeven) + T(idle delay)
T(busy delay) is folded into T(wakeup): T(wakeup) = T(wakeup) + T(busy delay)
One issue queue per execution unit
Logic used: observe the state of an execution unit and turn it OFF when a long streak of idle cycles is seen
13
FSM: State Machine of an execution unit when power gating is engaged
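The state diagram itself is not in the transcript, so the following is only a rough sketch of a time-based gating state machine, not the paper's FSM. Apart from ‘Uncompensated’, which the slides name, the state names are assumptions, as are the request-trace format and the rule that a request arriving during sleep stalls until wakeup completes.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"                # powered, counting idle cycles
    UNCOMPENSATED = "uncompensated"  # gated, overhead not yet recouped
    COMPENSATED = "compensated"      # gated, past breakeven
    WAKEUP = "wakeup"                # powering back up


def simulate(requests, t_idle_detect, t_breakeven, t_wakeup):
    """Run the gating FSM over a request trace (assumes t_wakeup >= 1).

    requests[i] is True when program slot i needs the unit. A request that
    arrives while the unit is gated stalls until wakeup completes.
    Returns (total_cycles, sleep_cycles, compensated_cycles).
    """
    state = State.ACTIVE
    idle = asleep = waking = 0
    cycles = sleep_cycles = compensated = 0
    i = 0
    while i < len(requests):
        want = requests[i]
        cycles += 1
        if state is State.ACTIVE:
            i += 1  # the slot is consumed this cycle
            idle = 0 if want else idle + 1
            if idle >= t_idle_detect:
                state, asleep = State.UNCOMPENSATED, 0
        elif state is State.WAKEUP:
            waking += 1
            if waking >= t_wakeup:
                state, idle = State.ACTIVE, 0
        elif want:
            # Gated and needed: this cycle is the first wakeup cycle.
            state, waking = State.WAKEUP, 1
            if waking >= t_wakeup:
                state, idle = State.ACTIVE, 0
        else:
            # Gated and still idle: keep sleeping.
            i += 1
            asleep += 1
            sleep_cycles += 1
            if asleep > t_breakeven:
                compensated += 1
            if asleep >= t_breakeven:
                state = State.COMPENSATED
    return cycles, sleep_cycles, compensated


# Hypothetical trace: 2 busy slots, 20 idle slots, 2 busy slots.
trace = [True] * 2 + [False] * 20 + [True] * 2
print(simulate(trace, t_idle_detect=4, t_breakeven=9, t_wakeup=3))  # (27, 16, 7)
```

In this run the 24-slot trace takes 27 cycles: the three extra cycles are the wakeup stall, and only 7 of the 16 sleep cycles fall past breakeven.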
14
% of cycles in sleep mode for FPU with different T(idle detect) and T(breakeven). T(wakeup) = 3 cycles
15
Avg IPC of SPECFP2K suite with different T(idle detect) and T(wakeup) values. T(breakeven) = 9 cycles. IPC is normalized to the base case where power gating is disabled.
Long idle periods coupled with smaller values of T(breakeven) and T(wakeup) help achieve large leakage reductions and limit the overall performance loss.
T(idle detect) = 6-12 cycles gives the best balance between performance and power savings.
16
% of cycles in sleep mode for FXU with different T(idle detect) and T(breakeven). T(wakeup) = 3 cycles
17
Avg IPC of SPECINT2K suite with different T(idle detect) and T(wakeup) values. T(breakeven) = 9 cycles. IPC is normalized to the base case where power gating is disabled.
18
Branch prediction guided Power gating:
Observations from the previous graphs show that the FXU typically has short idle periods, so it is difficult to power gate integer execution units efficiently.
Branch mispredictions are highly disruptive events in speculative out-of-order processors, which makes them a good opportunity for power gating.
On a branch misprediction, the pipeline is flushed and instructions are fetched from the correct path. During this process, the execution units are idle.
19
New branch prediction guided power gating technique:
As soon as a branch misprediction is detected, all idle FXUs are transferred to the ‘Uncompensated’ state → reduction in T(idle detect) → higher % of cycles in ‘sleep’ mode → smaller performance loss and better leakage reduction
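The effect of skipping idle detection after a misprediction can be sketched with a simple idle-period model. The `(length, after_mispredict)` pair format, the function name, and the sample idle periods are all hypothetical. The model counts gated cycles only and ignores breakeven and wakeup costs.

```python
def gated_cycles(idle_periods, t_idle_detect, use_branch_hint):
    """Count cycles spent gated over a list of idle periods.

    idle_periods: (length, after_mispredict) pairs, where after_mispredict
    marks an idle period that starts at a detected branch misprediction.
    With the hint, such periods are gated immediately instead of waiting
    t_idle_detect cycles.
    """
    total = 0
    for length, after_mispredict in idle_periods:
        detect = 0 if (use_branch_hint and after_mispredict) else t_idle_detect
        total += max(0, length - detect)
    return total


# Hypothetical FXU idle periods (cycles, starts at a misprediction?).
periods = [(12, True), (5, False), (15, True), (8, False)]

time_based = gated_cycles(periods, t_idle_detect=6, use_branch_hint=False)
branch_guided = gated_cycles(periods, t_idle_detect=6, use_branch_hint=True)
print(time_based, branch_guided)  # 17 29
```

Only the idle periods that follow a misprediction gain cycles, so the benefit depends on how much of a unit's idle time is caused by mispredictions.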
20
% of cycles in sleep mode versus performance degradation for the two techniques. T(breakeven) = 9 cycles; T(wakeup) = 3 cycles
21
Conclusions and critique:
The time-based technique is efficient for FP execution units, which have relatively long idle times.
The branch-prediction-guided technique is efficient for integer execution units.
No mention of the advantages or disadvantages of power gating relative to other circuit-level approaches for leakage power reduction.
How efficient is power gating if the assumptions mentioned above are relaxed?
What is the power consumption of the macro generating the ‘Sleep’ signal? What is the ratio of its power consumption to the power savings?
22
How is this paper relevant to the class?
State-of-the-art microprocessors face high leakage power due to technology scaling. Leakage power is high in the execution units, which are the most important blocks in the microprocessor. This paper gives good insight into techniques for reducing leakage power. Power gating techniques for reducing power dissipation in CMP and SMT architectures can also be explored.
23
Project: Consider a small integer ALU, compare various circuit-level approaches with power gating, and suggest the better technique(s). The suggested idea may be an optimal mix of two or more circuit-level approaches.
24
Thank you. Q&A?