Performance Optimization for Low-Leakage Caches based on Sleep-Line Access Density (ODES-4, 26 March 2006). Reiko Komiya †, Koji Inoue ‡ and Kazuaki Murakami.

1 Performance Optimization for Low-Leakage Caches based on Sleep-Line Access Density
Reiko Komiya †, Koji Inoue ‡ and Kazuaki Murakami ‡
† Fukuoka University, Japan
‡ Kyushu University, Japan
ODES-4, 26 March 2006

2 Outline
Introduction
– Leakage energy of cache memory
– Conventional low-leakage cache: Cache decay
Problem of cache decay approach
Solution: Always-Active approach
Evaluation
Conclusions

3 Introduction
Energy consumption = Dynamic energy (consumed by charging & discharging) + Static energy (consumed by leakage current)
Leakage energy increases with the progress of process technology
[Figure: the breakdown of energy consumption in a processor family, dynamic power vs. static power *1]
[Figure: power analysis of ARM920T; cache energy is 44% *2]
Cache leakage reduction is very important!!
*1 Fred Pollack (Intel Fellow), "New Microarchitecture Challenges in the Coming Generations of CMOS Process Technologies," Micro32
*2 Simon Segars, "Low Power Design Techniques for Microprocessors," ISSCC2001

4 Conventional Low-Leakage Cache
A conventional cache doesn’t support any leakage reduction technique
Conventional low-leakage cache: Cache decay
– Active mode (high-leakage, to preserve the data)
– Sleep mode (destroys the data to reduce leakage)
The mode of each line changes according to this state transition diagram:
[State transition diagram, with an initial state marked: active mode (high-leakage) → sleep mode (low-leakage) when no-access time ≧ decay interval; sleep mode → active mode on access (miss)]
Sleep-miss (degrades processor performance)
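The decay rule in the diagram can be illustrated for a single cache line. This is only a minimal sketch, not the authors' simulator: the function name `count_sleep_misses`, the cycle-stamped trace, and the simplification that every access targets the same data held by this line are assumptions made for illustration.

```python
def count_sleep_misses(access_cycles, decay_interval):
    """Count sleep-misses for one cache line under cache decay.

    access_cycles: sorted cycle numbers at which the line is accessed
                   (hypothetical trace; all accesses want this line's data).
    decay_interval: idle cycles after which the line is put to sleep.
    """
    sleep_misses = 0
    last_access = None  # None means the line has not been loaded yet
    for cycle in access_cycles:
        if last_access is not None and cycle - last_access >= decay_interval:
            # The line slept and lost its data, so this access misses.
            sleep_misses += 1
        # Any access (hit or miss) leaves the line in active mode.
        last_access = cycle
    return sleep_misses


# Hypothetical trace: two long idle gaps (490 and 1490 cycles).
trace = [0, 10, 500, 510, 2000]
short_interval_misses = count_sleep_misses(trace, decay_interval=100)
long_interval_misses = count_sleep_misses(trace, decay_interval=1000)
```

With this trace, a short decay interval turns both idle gaps into sleep-misses, while a longer interval leaves only the longest gap as a sleep-miss. The first access is a cold miss and is not counted.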

5 Performance Impact of Sleep-misses
Many sleep-misses cause large performance degradation!

6 Our Goal
High-performance, low-leakage cache!
Problem of conventional low-leakage cache
– Performance degradation caused by sleep-misses
Our approach
– To improve performance, reduce sleep-misses
– Prohibit some cache lines from going to sleep mode

7 Analysis of Sleep-misses
Sleep-Miss Density (SMD): shows the amount of sleep-misses in each line
SMD_i = (the number of sleep-misses at cache line i) / (the average number of sleep-misses for all cache lines)
Example: the number of sleep-misses at each cache line (lines 0 to 8, row by row)
 6   5   1
 2   4   1
60   1  10
The total number of sleep-misses: 90
The number of lines: 9
⇒ The average number of sleep-misses: 10
SMD_6 = 6, SMD_7 = 0.1, SMD_8 = 1
Cache lines which often cause sleep-misses have high SMD!
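The example on this slide can be reproduced in a few lines. This is a sketch under two assumptions: the nine counts are read row by row as lines 0 to 8, and the function name `sleep_miss_density` is made up for illustration.

```python
def sleep_miss_density(sleep_misses):
    """Return SMD_i for every line: its sleep-miss count divided by
    the average sleep-miss count over all lines."""
    average = sum(sleep_misses) / len(sleep_misses)
    if average == 0:
        # No sleep-misses anywhere: define every density as zero.
        return [0.0] * len(sleep_misses)
    return [count / average for count in sleep_misses]


# Sleep-miss counts from the slide, assumed to be lines 0..8 row by row.
counts = [6, 5, 1, 2, 4, 1, 60, 1, 10]
smd = sleep_miss_density(counts)
```

Here `smd[6]`, `smd[7]` and `smd[8]` correspond to the SMD 6, SMD 7 and SMD 8 values quoted on the slide.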

8 Characteristics of Sleep-misses
[Figure: the breakdown of cache lines in terms of SMD, and the breakdown of sleep-misses in terms of SMD. Legend: 4 ≦ SMD; 2 ≦ SMD < 4; 1 ≦ SMD < 2; SMD < 1]
A small number of high-SMD lines often produce sleep-misses
3.1% of lines cause 94.4% of sleep-misses

9 Always-Active Approach
Support “Always-Active mode (AA mode)”
AA mode prohibits the corresponding line from going to sleep mode
Cache lines which frequently cause sleep-misses should operate in AA mode
Such lines are called “Always-Active lines (AA lines)”

10 How to Decide AA Lines
A line which frequently causes sleep-misses ⇒ AA line
The number of sleep-misses at each cache line:
 6   5   1
 2   4   1
60   1  10
SMD at each cache line:
0.6  0.5  0.1
0.2  0.4  0.1
6    0.1  1
[State transition diagram, with an initial state marked: active mode → sleep mode when no-access time ≧ decay interval; sleep mode → always-active mode on access if SMD > Threshold; sleep mode → active mode on access if SMD ≦ Threshold]
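The three-mode transition rule can be written as two small functions. This is a sketch rather than the authors' design: the names `Mode`, `on_decay_timeout` and `on_access` are hypothetical, and the slide does not say whether an AA line ever returns to the normal modes, so the sketch simply keeps it always active.

```python
from enum import Enum


class Mode(Enum):
    ACTIVE = "active"
    SLEEP = "sleep"
    ALWAYS_ACTIVE = "always-active"


def on_decay_timeout(mode):
    """The line's no-access time has reached the decay interval."""
    # Only ordinary active lines go to sleep; AA lines are prohibited from it.
    return Mode.SLEEP if mode is Mode.ACTIVE else mode


def on_access(mode, smd, threshold):
    """The line is accessed; smd is its current sleep-miss density."""
    if mode is Mode.SLEEP:
        # Waking up after a sleep-miss: lines whose SMD exceeds the
        # threshold are promoted to always-active mode.
        return Mode.ALWAYS_ACTIVE if smd > threshold else Mode.ACTIVE
    # Active and always-active lines keep their mode on an access.
    return mode
```

With a threshold of 1 and the SMD values above, only the line with SMD 6 would be promoted on its next sleep-miss; the line with SMD exactly 1 would return to plain active mode because the comparison is strict.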

11 How to Measure SMD Dynamically
SMD_i > Threshold
⇔ (the number of sleep-misses at the cache line i) / (the average number of sleep-misses for all cache lines) > Threshold
⇔ ① > ② × ③
where ① = the number of sleep-misses at the cache line i, ② = the average number of sleep-misses for all cache lines, ③ = Threshold
Example: the number of cache lines = 1024 (= 2^10), Threshold = 2 (= 2^1)
– ② = the total number of sleep-misses, shifted right by 10 bits
– ② × ③ = ② shifted left by 1 bit
– ① > ② × ③ ? yes ⇒ AA mode; no ⇒ active mode
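The divider-free comparison can be sketched with integer shifts. The function name and parameter names below are hypothetical; the defaults follow the slide's example of 1024 lines and a threshold of 2, and both are assumed to be powers of two.

```python
def should_become_always_active(line_sleep_misses, total_sleep_misses,
                                log2_num_lines=10, log2_threshold=1):
    """Evaluate SMD_i > Threshold using only shifts and one comparison.

    line_sleep_misses:  (1) sleep-miss count of cache line i
    total_sleep_misses: sleep-miss count summed over all lines
    log2_num_lines:     log2 of the number of cache lines (10 for 1024)
    log2_threshold:     log2 of the threshold (1 for a threshold of 2)
    """
    # (2) average = total / number of lines, as a right shift.
    average = total_sleep_misses >> log2_num_lines
    # (2) x (3) = average * threshold, as a left shift.
    scaled = average << log2_threshold
    # (1) > (2) x (3) ?  yes => AA mode, no => active mode.
    return line_sleep_misses > scaled
```

One consequence of the integer shift is truncation: while the total is below the number of lines, the average is zero and any line with at least one sleep-miss passes the test. Whether the actual hardware handles this start-up phase differently is not stated on the slide.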

12 Hardware Implementation
[Figure: cache array with lines 0, 1, 2, ... 1023. Labels: tag, data, decay flag, 2-bit local counter, sleep-miss counter, always-active flag, voltage control (gated Vdd or 0V), global counter (¼ decay interval), total sleep-miss counter, shifter, comparators (>?, =?)]
If a line is in sleep mode,
– Cache decay ⇒ tag is in sleep mode
– AA approach ⇒ tag is in active mode
The line is in sleep mode && tag match ⇒ a sleep-miss occurs!

13 Experimental Setup
Evaluation model
– Cache decay: conventional low-leakage cache
– AA1: Cache decay with AA approach (threshold value = 1)
Cache configuration
– L1 data cache
  Cache size: 32 KB
  Associativity: 2-way
  Hit latency: 1 clock cycle
  Miss penalty: 32 clock cycles
Evaluation items
– Performance improvement
– Energy reduction

14 Results
[Figure: normalized execution time and normalized energy for cache decay and AA1. Annotations on the plot: "Higher performance and lower energy consumption" and "Improve the performance by increasing energy consumption"]

15 Conclusions
We have proposed a high-performance, low-leakage cache: the AA approach
– Detects lines which frequently cause sleep-misses at run time
– Improves performance by operating those lines in AA mode
Evaluation results
– Higher performance and lower energy consumption
– The best case (f183.equake): performance degradation reduced from 19% to 4.2%; energy consumption reduced by 20%
Future work
– Compare the AA approach with an adaptive decay technique (Kaxiras ISCA’00)

16 Thank you! ありがとう! (in Japanese)

18 Impact of Threshold
[Figure: normalized execution time and normalized energy for cache decay, AA4, AA2 and AA1]
A smaller threshold gives higher performance, because the number of AA lines increases!

19 Breakdown of Energy Consumption
[Figure: breakdown of energy (J) for AA1 and cache decay]
With AA1:
・ Leakage energy increases
・ The accompanying dynamic energy decreases, because the number of sleep-misses decreases
The energy reduction is a trade-off between DE_memory and LE_L1

20 Performance Impact of Decay Interval
Cache decay: performance improves as the decay interval is extended
AA approach: performance improves fully even with a short decay interval

21 Energy Impact of Decay Interval
Cache decay: leakage energy increases as the decay interval is extended
AA approach: leakage reduction is larger than that of cache decay using a long decay interval

22 Energy Model (1/3)
E_total = LE_L1 + DE_L1 + DE_memory
LE_L1 = Σ (i = 1 to CC) {LE_bit × N_active(i)}
CC: program execution time (clock cycles)
LE_bit: average leakage energy consumed by a 1-bit SRAM cell in one clock cycle
N_active(i): the number of SRAM bits in the active state at clock cycle i
LE_L1: leakage energy of the L1 cache
DE_L1: dynamic energy of the L1 cache
DE_memory: energy consumed by main-memory accesses

              Conventional low-leakage cache   Always-active (AA) approach
CC            long ☹                           short ☺
N_active(i)   few ☺                            many ☹
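The leakage term and the trade-off in the comparison table can be made concrete with a short calculation. This is a sketch only: the cycle counts and active-bit counts are invented for illustration, the function name `leakage_energy_pj` is hypothetical, and the 0.13 pJ value for LE bit is the one listed in this deck's parameter table.

```python
LE_BIT_PJ = 0.13  # leakage energy per active SRAM bit per clock cycle (pJ)


def leakage_energy_pj(n_active_per_cycle, le_bit_pj=LE_BIT_PJ):
    """LE_L1: sum over all clock cycles of LE_bit x N_active(i).

    n_active_per_cycle: one entry per clock cycle, giving the number of
    SRAM bits in the active state during that cycle.
    """
    return sum(le_bit_pj * n_active for n_active in n_active_per_cycle)


# Invented numbers: cache decay keeps fewer bits active but runs longer
# (more sleep-misses); the AA approach keeps more bits active but finishes
# sooner.
decay_trace = [1000] * 1200   # 1200 cycles, 1000 active bits each
aa_trace = [1100] * 1000      # 1000 cycles, 1100 active bits each

le_decay = leakage_energy_pj(decay_trace)
le_aa = leakage_energy_pj(aa_trace)
```

With these made-up traces the shorter execution time outweighs the extra active bits. With other traces the comparison could go the other way, which is why the table marks one advantage and one disadvantage for each scheme.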

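23 Energy Model (2/3)
DE_L1 = DE_AA + DE_lowleak + DE_conv
DE_AA: dynamic energy overhead caused by applying the always-active (AA) approach
DE_lowleak: dynamic energy overhead caused by applying the conventional low-leakage cache
DE_conv: access energy of the conventional cache

              Conventional low-leakage cache   Always-active (AA) approach
DE_AA         -                                overhead ☹
DE_lowleak    overhead ☹                       overhead ☹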

24 Energy Model (3/3)

Parameter    Average energy per access   Basis
LE_bit       0.13 pJ                     based on reference [1]
DE_org       1.90 nJ                     measured using CACTI 3.0
DE_conv      0.1 pJ + 0.5 pJ             based on reference [2]
DE_AA        4.20 pJ                     estimated from the table size and DE_org
DE_memory    38.0 nJ                     estimated as DE_org × 20

[1] K. Flautner, N. S. Kim, S. Martin, D. Blaauw, and T. Mudge, “Drowsy Caches: Simple Techniques for Reducing Leakage Power,” Proc. of the 29th Int. Symp. on Computer Architecture, pp. 148-157, May 2002.
[2] S. Kaxiras, Z. Hu, and M. Martonosi, “Cache Decay: Exploiting Generational Behavior to Reduce Cache Leakage Power,” Proc. of the 28th Int. Symp. on Computer Architecture, pp. 240-251, June 2001.
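The main-memory entry in the table is a simple multiple of the cache access energy, and the memory term of the model then scales with the number of misses. The sketch below checks that arithmetic. It assumes that each L1 miss costs exactly one main-memory access, which the slides do not state, and the function name and the miss counts are hypothetical.

```python
DE_ORG_NJ = 1.90                 # energy per cache access (nJ), from the table
DE_MEMORY_NJ = DE_ORG_NJ * 20    # table's estimate: DE_org x 20


def memory_energy_nj(num_misses, de_memory_nj=DE_MEMORY_NJ):
    """DE_memory term, assuming one main-memory access per L1 miss."""
    return num_misses * de_memory_nj


# Invented miss counts: removing sleep-misses shrinks the memory term.
with_sleep_misses = memory_energy_nj(1000)
fewer_sleep_misses = memory_energy_nj(600)
saved_nj = with_sleep_misses - fewer_sleep_misses
```

Under that assumption every avoided sleep-miss saves about 38 nJ of main-memory energy, which is large compared with the per-access overheads listed in the table. This is the effect the AA approach trades against higher leakage.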

