ISLPED’99 International Symposium on Low Power Electronics and Design


1 ISLPED’99 International Symposium on Low Power Electronics and Design
Way-Predicting Set-Associative Cache for High Performance and Low Energy Consumption
Koji Inoue, Tohru Ishihara, and Kazuaki Murakami
Department of Computer Science and Communication Engineering, Kyushu University

2 Conventional 4-Way Set-Associative Cache
Organization: four ways (Way 0 to Way 3), each with a tag subarray and a cache-line subarray.
Step 1. Address decode (decode circuit)
Step 2. Read out a tag and a line from each way (activate the word line, precharge/discharge the bit lines, activate the sense amps)
Step 3. Tag comparison
Step 4 (hit). Provide the required data (activate the I/O pins)
Step 4 (miss). Cache replacement
Total energy for an access: Ecache = Edecode + Ememory + Eio
  Edecode: energy for decode
  Ememory: energy for SRAM access
  Eio: energy for I/O pin drive

3 Phased 4-Way Set-Associative Cache for Low Energy Consumption
Energy consumption is improved by sacrificing performance.
Cycle 1:
  Step 1. Address decode
  Step 2. Read out only the tags
  Step 3. Tag comparison
On a miss:
  Step 4. Cache replacement
On a hit (Cycle 2):
  Step 4. Read out only the desired line
  Step 5. Provide the required data

4 Way-Predicting Set-Associative Cache - Concept -
How can we achieve high performance and low energy consumption at the same time?
Fast access by reading out both the tag and the line simultaneously
  Conventional: Good!
  Phased: Bad!
Low energy by avoiding unnecessary line read accesses
  Conventional: Bad!
  Phased: Good!
Idea: predict which way has the data desired by the processor before the cache access is started.

5 4Way-Predicting Set-Associative Cache - Operation -
Way prediction (cache-line-based MRU algorithm)
Step 0. Way prediction
Cycle 1:
  Step 1. Address decode
  Step 2. Read out the predicted tag and line
  Step 3. Tag comparison
On a prediction hit:
  Step 4. End
On a miss in the predicted way (Cycle 2):
  Step 4. Read out the remaining tags and lines
  Step 5. Tag comparison
  Step 6 (prediction miss). End
  Step 6 (cache miss). Cache replacement
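The operation above can be sketched as a small tag-only simulator. This is a minimal illustration, not the authors' implementation: the class name `WayPredictingCache`, the per-set MRU bookkeeping, and the returned tuple are assumptions made for the example. The geometry (16 KB, 32-byte lines, 4 ways, LRU replacement) follows the evaluation slide, and the miss penalty for fetching the line from memory is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class WayPredictingCache:
    """Toy tag-only model of a way-predicting set-associative cache (hypothetical)."""
    num_sets: int = 128   # 16 KB / (32-byte lines x 4 ways)
    num_ways: int = 4
    line_size: int = 32
    tags: list = field(init=False)  # tags[set][way]; None marks an invalid entry
    mru: list = field(init=False)   # predicted (most recently used) way per set
    lru: list = field(init=False)   # per-set way order, least recently used first

    def __post_init__(self):
        self.tags = [[None] * self.num_ways for _ in range(self.num_sets)]
        self.mru = [0] * self.num_sets
        self.lru = [list(range(self.num_ways)) for _ in range(self.num_sets)]

    def _touch(self, index, way):
        # Record `way` as most recently used: it becomes the next prediction.
        self.lru[index].remove(way)
        self.lru[index].append(way)
        self.mru[index] = way

    def access(self, addr):
        """Return (outcome, lookup cycles, number of ways read) for one access."""
        block = addr // self.line_size
        index = block % self.num_sets
        tag = block // self.num_sets

        predicted = self.mru[index]  # Step 0: way prediction

        # Cycle 1 (Steps 1-3): read and compare only the predicted way.
        if self.tags[index][predicted] == tag:
            self._touch(index, predicted)
            return "prediction hit", 1, 1

        # Cycle 2 (Steps 4-5): read and compare the remaining ways.
        for way in range(self.num_ways):
            if way != predicted and self.tags[index][way] == tag:
                self._touch(index, way)
                return "prediction miss", 2, self.num_ways

        # Step 6: cache miss, replace the least recently used way.
        victim = self.lru[index][0]
        self.tags[index][victim] = tag
        self._touch(index, victim)
        return "cache miss", 2, self.num_ways


cache = WayPredictingCache()
for address in (0, 0, 4096, 0, 0):
    print(address, cache.access(address))
```

Addresses 0 and 4096 map to the same set with different tags, so the short trace exercises all three outcomes: cache miss, prediction hit, and prediction miss.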

6 4Way-Predicting Set-Associative Cache - Organization -
MRU Algorithm

7 Evaluation Environment
Cache models
  Conventional 4-way set-associative cache (4SACache)
  Phased 4-way set-associative cache (P4SACache)
  Way-predicting 4-way set-associative cache (WP4SACache)
  Cache size: 16 KB, cache-line size: 32 bytes, replacement algorithm: LRU
Evaluation items
  Performance (Tcache): average number of clock cycles for an access
  Energy (Ecache): average energy consumption for an access
Ecache ~ Ememory = Ntag x Etag + Ndata x Edata
  Etag: energy consumed for accessing a tag subarray
  Edata: energy consumed for accessing a line subarray
  Ntag: average number of tag subarrays accessed for an access
  Ndata: average number of line subarrays accessed for an access

8 Static Analysis - Energy and Performance Expression -
4SACache
  E4SACache = 4 Etag + 4 Edata
  T4SACache = 1
P4SACache
  EP4SACache = 4 Etag + Edata x CHR
  TP4SACache = 1 + 1 x CHR
WP4SACache
  EWP4SACache = (Etag + Edata) + (3 Etag + 3 Edata) x (1 - PHR)
  TWP4SACache = 1 + 1 x (1 - PHR)
CHR: cache hit rate, PHR: prediction hit rate
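The three pairs of expressions can be evaluated directly. The sketch below is only a transcription of the formulas on this slide into Python. The function names are made up for the example, energies are expressed in units of Edata using the ratio Etag = 0.078 Edata quoted in the deck, and the hit rates in the usage lines are illustrative placeholders rather than measured values.

```python
def conventional(e_tag, e_data):
    """4SACache: all four tags and lines are read in one cycle."""
    energy = 4 * e_tag + 4 * e_data
    cycles = 1.0
    return energy, cycles


def phased(e_tag, e_data, cache_hit_rate):
    """P4SACache: four tags first, then one line only on a cache hit."""
    energy = 4 * e_tag + e_data * cache_hit_rate
    cycles = 1.0 + 1.0 * cache_hit_rate
    return energy, cycles


def way_predicting(e_tag, e_data, prediction_hit_rate):
    """WP4SACache: one way first, the other three only on a misprediction."""
    miss_rate = 1.0 - prediction_hit_rate
    energy = (e_tag + e_data) + (3 * e_tag + 3 * e_data) * miss_rate
    cycles = 1.0 + 1.0 * miss_rate
    return energy, cycles


E_DATA = 1.0    # energy unit: one line-subarray access
E_TAG = 0.078   # Etag = 0.078 Edata, as stated in the deck

# Hypothetical hit rates, chosen only to exercise the functions.
CHR = 0.98
PHR = 0.90

base_energy, base_cycles = conventional(E_TAG, E_DATA)
for name, (energy, cycles) in {
    "P4SACache": phased(E_TAG, E_DATA, CHR),
    "WP4SACache": way_predicting(E_TAG, E_DATA, PHR),
}.items():
    print(f"{name}: Ecache = {energy / base_energy:.3f}, "
          f"Tcache = {cycles / base_cycles:.3f} (4SACache = 1.0)")
```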

9 Static Analysis - Best and Worst Case -
Models compared: 4SACache (Conventional), P4SACache (Phased), WP4SACache (Ours)
Charts: energy consumption (with Etag = 0.078 Edata) and performance
Compared with Conventional (4SACache):
  Best case (PHR = 100%): 75% energy improvement without any performance degradation
  Worst case (PHR = 0%): 100% performance overhead without any energy improvement
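Both figures follow from substituting the two extreme prediction hit rates into the way-predicting energy and cycle expressions from the static analysis. The worked substitution below uses subscripted symbols (for example E_{WP4SA} for EWP4SACache) purely as typesetting shorthand.

```latex
% Best case: PHR = 1, so the remaining three ways are never read.
\[
\frac{E_{WP4SA}}{E_{4SA}}\bigg|_{PHR=1}
  = \frac{E_{tag} + E_{data}}{4E_{tag} + 4E_{data}}
  = \frac{1}{4},
\qquad
T_{WP4SA}\big|_{PHR=1} = 1 + 1 \times (1 - 1) = 1 .
\]

% Worst case: PHR = 0, so every access reads all four ways over two cycles.
\[
E_{WP4SA}\big|_{PHR=0}
  = (E_{tag} + E_{data}) + (3E_{tag} + 3E_{data})
  = 4E_{tag} + 4E_{data}
  = E_{4SA},
\qquad
T_{WP4SA}\big|_{PHR=0} = 1 + 1 \times (1 - 0) = 2 .
\]
```

The best-case ratio of 1/4 is the 75% energy improvement, and it does not depend on the ratio of Etag to Edata. The worst-case cycle count of 2 against 1 is the 100% performance overhead.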

10 Experimental Analysis - Prediction Hit Rate -

11 Experimental Analysis - Result of Instruction Cache -
[Charts: normalized Tcache and normalized Ecache for P4SACache and WP4SACache (our approach), with 4SACache = 1.0]

12 Experimental Analysis - Result of Data Cache -
[Charts: normalized Tcache and normalized Ecache for P4SACache and WP4SACache (our approach), with 4SACache = 1.0]

13 Experimental Analysis - Energy and Performance -
Average of all benchmarks
[Bar charts, one panel for the I-Cache and one for the D-Cache: Ecache and Tcache as normalized results (%) for Conventional (4SACache), Phased (P4SACache), and Way-Predicting (WP4SACache). Value labels on the bars: 199.4%, 195.8%, 113.0%, 104.1%, 30.3%, 29.4%, 28.1%, 35.2%]

14 Cache Power Consumption
[Chart: cache size trend]
Effect of on-chip caches on total chip power consumption
  DEC CPU*: 25%
  StrongARM SA-110 CPU*: 43%
  Bipolar ECL CPU**: 50%
* Kamble et al., “Analytical Energy Dissipation Models for Low Power Caches”, ISLPED’97
** Jouppi et al., “A 300-MHz 115-W 32-b Bipolar ECL Microprocessor”, IEEE Journal of Solid-State Circuits’93

15 Energy Consumption Model
Components of the power dissipation: bit line, word line, sense amp, output driver, address input, comparator, latch
  32KB direct-mapped I-Cache: Ememory = 95.6%
  32KB 4-way D-Cache: Ememory = 97.7%
Ghose et al., “Energy Efficient Cache Organizations for Superscalar Processors”, Power-Driven Microarchitecture Workshop in conjunction with ISCA’98
Average energy consumption for an access
Ecache ~ Ememory = Ntag x Etag + Ndata x Edata
  Etag: energy consumed for accessing a tag subarray
  Edata: energy consumed for accessing a line subarray
  Ntag: average number of tag subarrays accessed for an access
  Ndata: average number of line subarrays accessed for an access

16 Experimental Analysis - Environment -
Benchmarks
  SPECint95: 099.go, 124.m88ksim, 126.gcc, 129.compress, 130.li, 132.ijpeg, 134.perl, 147.vortex
  SPECfp95: 101.tomcatv, 102.swim, 103.su2cor, 104.hydro2d

