Published by Theresa West. Modified over 9 years ago.
1
Improving Energy Efficiency of Configurable Caches via Temperature-Aware Configuration Selection
Hamid Noori †, Maziar Goudarzi ‡, Koji Inoue ‡, and Kazuaki Murakami ‡
Speaker: Tohru Ishihara ‡
† Institute of Systems & Information Technologies/KYUSHU, Japan
‡ Kyushu University, Japan
2
ISVLSI2008@Montpellier, France Kyushu University 2/26
Outline
Background
Motivation
Problem Definition
Proposed Approach
  Architecture
  Reconfiguration Flow
Experimental Results
Conclusions
4
Background (1/2)
The dynamic energy per cache access
The leakage power of a cache memory
5
Background (2/2)
7
Motivational Example (1/3)
8
Motivational Example (2/3)
Total dynamic energy for executing a program
Total static energy for executing a program
9
Motivational Example (3/3)
Minimum-energy cache size
11
Problem Definition (1/3)
Objective function: total memory energy
  Cache dynamic energy
  Cache static energy
  Off-chip memory access energy
  Energy consumption during processor stall
Figure: CPU, I-$, D-$, main memory
12
Problem Definition (2/3)
energy_memory(C, Temp, Tech) = energy_dynamic(C, Tech) + energy_static(C, Temp, Tech)   (1)
energy_dynamic(C, Tech) = cache_accesses(C) * energy_cache_access(C, Tech) + cache_misses(C) * energy_miss(C, Tech)   (2)
energy_miss(C, Tech) = energy_off_chip_stall + energy_cache_block_refill(C, Tech)   (3)
energy_static(C, Temp, Tech) = executed_clock_cycles(C) * clock_period * leakage_power(C, Temp, Tech)   (4)
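Equations (1) to (4) can be sketched as a small Python model. This is only an illustration under stated assumptions: the `CacheProfile` container, the function signatures, and every number in the example are hypothetical, while the term names follow the slide. The 5 ns clock period corresponds to the 200 MHz base clock given in the experiment setup.

```python
from dataclasses import dataclass


@dataclass
class CacheProfile:
    """Numbers for one configuration C at one temperature (hypothetical container)."""
    cache_accesses: int               # cache_accesses(C)
    cache_misses: int                 # cache_misses(C)
    executed_clock_cycles: int        # executed_clock_cycles(C)
    energy_cache_access: float        # joules per access, energy_cache_access(C, Tech)
    energy_cache_block_refill: float  # joules per refill, energy_cache_block_refill(C, Tech)
    leakage_power: float              # watts, leakage_power(C, Temp, Tech)


def energy_miss(p, energy_off_chip_stall):
    # Eq. (3): off-chip access and stall energy plus block refill energy
    return energy_off_chip_stall + p.energy_cache_block_refill


def energy_dynamic(p, energy_off_chip_stall):
    # Eq. (2): access energy plus miss energy
    return (p.cache_accesses * p.energy_cache_access
            + p.cache_misses * energy_miss(p, energy_off_chip_stall))


def energy_static(p, clock_period):
    # Eq. (4): leakage power integrated over the execution time
    return p.executed_clock_cycles * clock_period * p.leakage_power


def energy_memory(p, energy_off_chip_stall, clock_period):
    # Eq. (1): total memory energy
    return energy_dynamic(p, energy_off_chip_stall) + energy_static(p, clock_period)


# Placeholder inputs, not measured data.
example = CacheProfile(
    cache_accesses=1_000_000,
    cache_misses=10_000,
    executed_clock_cycles=2_000_000,
    energy_cache_access=0.1e-9,
    energy_cache_block_refill=1e-9,
    leakage_power=0.01,
)
total_joules = energy_memory(example, energy_off_chip_stall=20e-9, clock_period=5e-9)
```

In this model temperature enters only through `leakage_power`, so two profiles of the same configuration at different temperatures differ only in the static term.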
13
Problem Definition (3/3)
“For a given application, processor architecture, technology, and valid configurations of the configurable cache, find a valid cache configuration that results in minimum energy consumption in a specific temperature over the entire execution of the given application.”
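Read as a search problem, this statement amounts to picking the minimum-energy entry from the set of valid configurations at one temperature. The sketch below is a minimal illustration, not the authors' procedure: `select_configuration`, `toy_energy`, and the toy numbers (including the assumption that leakage doubles every 25°C) are all hypothetical.

```python
def select_configuration(valid_configs, energy_of, temperature_c):
    """Return the valid configuration with the lowest estimated energy at one temperature."""
    return min(valid_configs, key=lambda cfg: energy_of(cfg, temperature_c))


# Hypothetical dynamic energy in mJ, keyed by (size_kb, line_bytes, ways).
TOY_DYNAMIC_MJ = {(16, 32, 2): 1.0, (8, 32, 2): 1.4, (4, 32, 1): 2.0}


def toy_energy(cfg, temperature_c):
    """Toy energy model: fixed dynamic part plus size-proportional leakage.

    Assumption for illustration only: leakage doubles every 25 degrees Celsius.
    """
    size_kb = cfg[0]
    leakage_mj = 0.02 * size_kb * 2 ** (temperature_c / 25)
    return TOY_DYNAMIC_MJ[cfg] + leakage_mj


configs = list(TOY_DYNAMIC_MJ)
cold_choice = select_configuration(configs, toy_energy, 0)
warm_choice = select_configuration(configs, toy_energy, 50)
hot_choice = select_configuration(configs, toy_energy, 100)
```

With these toy numbers the selected cache shrinks as the temperature rises, because the leakage term grows faster for larger caches while the dynamic term stays fixed.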
15
Architecture
TACC
  BCC (proposed by Zhang et al. [1])
    Cache size (way shutdown)
    Number of ways (way concatenation)
    Line size
  Thermal sensor
  Accessible port for reading the thermal sensor
[1] C. Zhang, F. Vahid, and W. Najjar, “A Highly Configurable Cache Architecture for Embedded Systems,” ACM Trans. on Embedded Computing Systems, vol. 4, no. 2, May 2005.
16
Reconfiguration Flow
18
Experimental Setup (1/2)
MiBench
SimpleScalar
  Cache hit: one clock cycle
  Cache miss: 100 clock cycles
  Clock frequency of the base processor: 200 MHz
CACTI 4.2
  Target technology: 70 nm (Vdd = 0.9 V)
BCC (16KB)
  16KB (4-, 2-, 1-way)
  8KB (2- and 1-way)
  4KB (1-way)
  The line size for each configuration can be 8, 16, or 32 bytes.
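The configuration space on this slide is small enough to enumerate directly. The sketch below lists every size, way, and line-size combination and converts the miss penalty to time at the stated clock. The names `SIZE_TO_WAYS`, `LINE_SIZES_BYTES`, and `valid_configurations` are hypothetical, and the tuple order (size in KB, line size in bytes, ways) is a choice made for this example.

```python
# Size (KB) -> allowed associativities, as listed on the slide.
SIZE_TO_WAYS = {16: (4, 2, 1), 8: (2, 1), 4: (1,)}
LINE_SIZES_BYTES = (8, 16, 32)


def valid_configurations():
    """Return all (size_kb, line_bytes, ways) combinations the BCC supports."""
    return [
        (size_kb, line_bytes, ways)
        for size_kb, ways_options in SIZE_TO_WAYS.items()
        for ways in ways_options
        for line_bytes in LINE_SIZES_BYTES
    ]


CLOCK_HZ = 200e6   # base processor clock from the slide
MISS_CYCLES = 100  # cache miss penalty from the slide

configs = valid_configurations()
miss_latency_ns = MISS_CYCLES * 1e9 / CLOCK_HZ
```

Six size and way pairs times three line sizes give 18 configurations, and a 100-cycle miss at 200 MHz costs 500 ns.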
19
Experimental Setup (2/2)
Base Configurable Cache (BCC)
  It has the same architecture as proposed by Zhang et al. [1].
  It supports a limited set of configurations.
  It is configured for each application for the corner case (i.e., leakage at 100°C).
Temperature-Aware Configurable Cache (TACC)
  TACC is configured for each execution of an application, considering the chip temperature at that time.
[1] C. Zhang, F. Vahid, and W. Najjar, “A Highly Configurable Cache Architecture for Embedded Systems,” ACM Trans. on Embedded Computing Systems, vol. 4, no. 2, May 2005.
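The difference between the two policies can be sketched as two lookups into a per-application table of best configurations. This is a hedged illustration only: the table values are placeholders, the function names are hypothetical, and snapping the sensor reading to the nearest tabulated temperature is an assumption, since the slides do not say how intermediate readings are handled.

```python
CORNER_CASE_C = 100  # corner case named on the slide: leakage at 100°C

# Hypothetical table for one application:
# temperature (°C) -> (size_kb, line_bytes, ways). Placeholder values only.
EXAMPLE_TABLE = {
    0: (16, 32, 4),
    20: (16, 32, 4),
    40: (16, 32, 2),
    60: (8, 32, 2),
    80: (8, 32, 1),
    100: (4, 32, 1),
}


def nearest_tabulated(temperature_c, table):
    """Assumed rule: use the tabulated temperature closest to the sensor reading."""
    return min(table, key=lambda t: abs(t - temperature_c))


def bcc_config(table, chip_temperature_c):
    """BCC: fixed per application at the corner case; the sensor reading is ignored."""
    return table[CORNER_CASE_C]


def tacc_config(table, chip_temperature_c):
    """TACC: chosen at the start of each execution from the current chip temperature."""
    return table[nearest_tabulated(chip_temperature_c, table)]


reading_c = 23  # hypothetical thermal sensor reading
bcc_choice = bcc_config(EXAMPLE_TABLE, reading_c)
tacc_choice = tacc_config(EXAMPLE_TABLE, reading_c)
```

On a cool chip the two policies diverge: the BCC keeps the small corner-case cache, while the TACC can afford a larger one because leakage is low at that temperature.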
20
Energy & Performance Evaluation
Energy Saving = … × 100
Performance Enhancement = … × 100
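The fractions in both formulas did not survive on this slide. One common way to define such metrics, shown here purely as an assumption and not as the authors' definition, is the improvement relative to the BCC baseline: energy for the saving, and executed clock cycles for the performance enhancement. The function name and the example numbers are hypothetical.

```python
def relative_improvement_percent(baseline, improved):
    """Assumed metric: (baseline - improved) / baseline * 100."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline * 100


# Placeholder values, not results from the presentation.
energy_bcc_mj, energy_tacc_mj = 10.0, 7.5
cycles_bcc, cycles_tacc = 200, 150

energy_saving = relative_improvement_percent(energy_bcc_mj, energy_tacc_mj)
performance_enhancement = relative_improvement_percent(cycles_bcc, cycles_tacc)
```

Under this assumed definition, a positive value means the TACC used less energy or fewer cycles than the BCC, and a negative value means it used more.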
21
Data and Instruction Cache
Each entry: cache size, line size (bytes), number of ways

D$
Benchmarks: qsort, djpeg, lame, dijkstra, patricia, sha, adpcm, crc, fft
0°C: 16K, 32, 2 | 16K, 32, 4 | 16K, 32, 2 | 8K, 32, 2 | 16K, 32, 4
20°C: 8K, 32, 2 | 16K, 32, 2 | 16K, 32, 4 | 16K, 32, 2 | 8K, 32, 1 | 8K, 32, 2 | 16K, 32, 4
40°C: 8K, 32, 2 | 16K, 32, 2 | 16K, 32, 4 | 8K, 32, 2 | 16K, 32, 2 | 4K, 32, 1 | 8K, 32, 2 | 16K, 32, 4
60°C: 8K, 32, 2 | 16K, 32, 2 | 8K, 32, 2 | 4K, 32, 1 | 4K, 16, 1 | 8K, 32, 2
80°C: 8K, 32, 2 | 16K, 32, 2 | 8K, 32, 2 | 4K, 32, 1 | 4K, 16, 1 | 4K, 32, 1 | 8K, 32, 2
100°C: 4K, 32, 1 | 8K, 32, 2 | 4K, 32, 1 | 8K, 32, 2

I$
Benchmarks: basicmath, qsort, djpeg, lame, dijkstra, blowfish, rijndael, gsm, fft
0°C: 16K, 8, 4 | 16K, 32, 1 | 16K, 32, 2 | 16K, 32, 1 | 16K, 16, 2 | 16K, 32, 1 | 16K, 16, 4 | 8K, 32, 1
20°C: 16K, 16, 4 | 16K, 32, 1 | 16K, 32, 2 | 16K, 32, 1 | 16K, 16, 2 | 16K, 32, 1 | 16K, 32, 2 | 8K, 32, 1
40°C: 16K, 16, 4 | 8K, 32, 2 | 16K, 32, 2 | 16K, 32, 1 | 16K, 32, 2 | 8K, 32, 1
60°C: 16K, 16, 4 | 8K, 32, 2 | 16K, 32, 2 | 16K, 32, 1 | 8K, 32, 2 | 8K, 32, 1
80°C: 16K, 32, 4 | 8K, 32, 2 | 16K, 32, 1 | 4K, 32, 1 | 8K, 32, 1
100°C: 16K, 32, 4 | 8K, 32, 2 | 16K, 32, 2 | 4K, 32, 1 | 8K, 32, 1
22
Energy Saving
23
Performance Enhancement
25
Conclusions
1. Temperature-aware cache configuration is important for finer technologies: up to 61% (17% on average) of energy consumption can be saved in 70 nm technology for the instruction cache.
2. The data cache is more easily affected by temperature than the instruction cache. Using a configurable data cache, up to 77% (36% on average) of energy can be saved in 70 nm technology.
3. The TACC improves performance by up to 28% (5% on average) for the instruction cache and by up to 17% (8.1% on average) for the data cache.
26
Thank you for your attention
Please send any questions to noori@c.csce.kyushu-u.ac.jp
27
Backup slides
30
                            ARM7TDMI   ARM966E-S
130nm  Power consumption    7.98 mW    62.5 mW
       Frequency            133 MHz    250 MHz
90nm   Power consumption    7.08 mW    51.7 mW
       Frequency            236 MHz    470 MHz