Dynamic Associative Caches: Reducing Dynamic Energy of First Level Caches
Karthikeyan Dayalan, Meltem Ozsoy and Dmitry Ponomarev
Department of Computer Science, State University of New York at Binghamton
Presented at the 32nd IEEE International Conference on Computer Design (ICCD), October 19-22, 2014
Direct-Mapped Cache. Direct indexing: only one cache way is checked. Can have high miss rates. Example address split: Tag 01010111101100000 | Index 000000011 | Byte Offset 101001. [Figure: a single tag/data entry is read and its tag is compared (==) to produce HIT / MISS.]
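The address split on this slide can be reproduced with a few shifts and masks. The sketch below is illustrative only: the field widths (17 tag bits, 9 index bits, 6 offset bits) are read off the example bit strings on the slide, and the function name `split_address` is hypothetical.

```python
# Field widths inferred from the slide's example bit strings (an assumption).
OFFSET_BITS = 6   # 6 offset bits -> 64-byte line
INDEX_BITS = 9    # 9 index bits  -> 512 direct-mapped sets


def split_address(addr):
    """Split an address into (tag, index, byte offset)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


# The example address from the slide: tag, index and offset concatenated.
addr = int("01010111101100000" "000000011" "101001", 2)
tag, index, offset = split_address(addr)
print(format(tag, "017b"), index, offset)
```

With 512 sets of 64 bytes each, this split corresponds to a 32 KB direct-mapped array, so exactly one tag comparison decides hit or miss.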
Set-Associative Cache. Indexing into a set: all ways in the set are checked, and at most one has the data. Energy-inefficient. Example address split: Tag 0101011110110000001 | Index 000011 | Byte Offset 101001. [Figure: every tag/data way of the selected set is read and compared (==) in parallel to produce HIT / MISS; the ways that do not hold the data are labeled ENERGY WASTAGE.]
Dynamic-Associative Cache (DAC). Key idea: dynamically change the cache access mode between Direct-Mapped and Set-Associative. In Direct-Mapped mode, part of the index directly determines the way to access. Shadow tags keep track of performance in each mode so that the cache can switch as appropriate. Cache contents need to be invalidated when switching from Set-Associative to Direct-Mapped.
DAC Operation. Set-Associative Mode: exactly the same as a traditional cache. Direct-Mapped Mode: use the least significant bits of the tag to select the way to be accessed. Imagine stacking the ways of a Set-Associative cache on top of each other to form a Direct-Mapped cache.
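A minimal sketch of the way selection described above, assuming a 4-way cache whose two low-order tag bits pick the way in Direct-Mapped mode. The field widths (2 way-select bits, 6 index bits, 6 offset bits) follow the example addresses in the slides' figures; the function name `ways_to_probe` and the mode strings are hypothetical.

```python
# Field widths taken from the figures' example addresses (an assumption).
OFFSET_BITS = 6
INDEX_BITS = 6
WAY_BITS = 2      # 2 bits -> 4 ways


def ways_to_probe(addr, mode):
    """Return (set index, list of ways whose tag/data arrays are read)."""
    set_index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    if mode == "DM":
        # Direct-Mapped mode: the low-order tag bits name the single way.
        ways = [tag & ((1 << WAY_BITS) - 1)]
    else:
        # Set-Associative mode: every way of the set is read, as usual.
        ways = list(range(1 << WAY_BITS))
    return set_index, ways


addr = int("01010111101100000" "11" "000010" "101001", 2)
print(ways_to_probe(addr, "DM"))
print(ways_to_probe(addr, "SA"))
```

The set index is the same in both modes; only the number of ways that are activated changes, which is where the dynamic energy saving comes from.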
Direct-Mapped Access in DAC. Example address split: Tag 01010111101100000 00 | Index 000010 | Byte Offset 101001. [Figure: the WAY SELECTION LOGIC uses the two low-order tag bits (00) to enable a single tag/data way; only that way is read and compared (==) to produce HIT / MISS, and the ways left idle are labeled ENERGY SAVINGS.]
Direct-Mapped Access in DAC. Example address split: Tag 01010111101100000 11 | Index 000010 | Byte Offset 101001. [Figure: the WAY SELECTION LOGIC uses the two low-order tag bits (11) to enable a single tag/data way; only that way is read and compared (==) to produce HIT / MISS, and the ways left idle are labeled ENERGY SAVINGS.]
When to Switch Modes: Shadow Tags. Shadow tags track hypothetical cache performance in the other mode. To reduce complexity, only a few sets are shadowed. [Figure: the index and the WAY SELECTION LOGIC address a small array of shadow tags, which are compared (==) against the incoming tag.]
DAC Access in Direct-Mapped Mode. Example address split: Tag 01010111101100000 11 | Index 000001 | Byte Offset 101001. [Figure: the access probes the CACHE and the SHADOW TAGS side by side, each through its own WAY SELECTION LOGIC and comparators (==); the combination of HIT / MISS outcomes on the two sides triggers COUNTER++ or COUNTER--.] Once the counter reaches its threshold value, the mode transition happens.
DAC Access in Set-Associative Mode. Example address split: Tag 01010111101100000 11 | Index 000001 | Byte Offset 101001. [Figure: the access probes the CACHE and the SHADOW TAGS side by side, each through its own WAY SELECTION LOGIC and comparators (==); the combination of HIT / MISS outcomes on the two sides triggers COUNTER++ or COUNTER--.]
DAC Design Variations. DAC Budget: only the Direct-Mapped to Set-Associative mode transition is supported; the cache is reset to Direct-Mapped mode on context switches. DAC Deluxe: both transitions are supported.
DAC Mode Transition. Start in Direct-Mapped mode. Keep track of the difference between the number of misses in both modes (using shadow tags to track the other mode). Periodically compare the counter value against a threshold. If exceeded, trigger a mode transition. If not, reset the counter to zero.
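The steps above can be sketched as a small controller. This is an illustration under stated assumptions, not the authors' hardware: the class name `ModeController`, the `threshold` and `period` values, the `deluxe` flag, and the choice to clear the counter after a transition are all hypothetical. `record` is meant to be called only for accesses to shadowed sets.

```python
class ModeController:
    """Tracks the miss difference between the active and the shadowed mode."""

    def __init__(self, threshold, period, deluxe=True):
        self.mode = "DM"            # start in Direct-Mapped mode
        self.switch = 0             # misses(active mode) - misses(other mode)
        self.seen = 0               # shadowed accesses in the current period
        self.threshold = threshold  # hypothetical tuning parameter
        self.period = period        # hypothetical tuning parameter
        self.deluxe = deluxe        # True: both transitions; False: DM->SA only

    def record(self, cache_hit, shadow_hit):
        if shadow_hit and not cache_hit:
            self.switch += 1        # the other mode would have hit
        elif cache_hit and not shadow_hit:
            self.switch -= 1        # the other mode would have missed
        self.seen += 1
        if self.seen < self.period:
            return
        # End of period: compare against the threshold, then start over.
        self.seen = 0
        if self.switch > self.threshold:
            if self.mode == "DM":
                self.mode = "SA"
            elif self.deluxe:
                self.mode = "DM"    # SA->DM also requires line invalidations
        self.switch = 0

    def context_switch(self):
        # The slides reset the cache to Direct-Mapped mode on context switches.
        self.mode, self.switch, self.seen = "DM", 0, 0


ctrl = ModeController(threshold=2, period=4)
for _ in range(4):
    ctrl.record(cache_hit=False, shadow_hit=True)
print(ctrl.mode)
```

Setting `deluxe=False` models the DAC Budget variant, which only leaves Set-Associative mode through a context switch.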
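DAC Mode Transition. [Flowchart: operation starts in Direct-Mapped mode with SWITCH = 0, which is also the state entered on a context switch. Each cache access is classified as HIT or MISS and paired with a Shadow Hit or Shadow Miss, leading to SWITCH++ or SWITCH--. The test SWITCH > Threshold has two exits: Yes leads to Set-Associative mode, No keeps the current mode. At each reset period, SWITCH = 0.]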
Line Invalidations on SA->DM Transition. Example address split: Tag 01010111101100000 00 | Index 000010 | Byte Offset 101001. The data was placed while the cache was in Set-Associative mode; now a mode transition from Set-Associative to Direct-Mapped mode is triggered. [Figure: the WAY SELECTION LOGIC probes only the way chosen by the low-order tag bits, the access reports a MISS although the line sits in another way of the set, and refilling it would create Duplicate Entries.]
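One way to picture the invalidation step is sketched below. The slides only state that cache contents must be invalidated on this transition; the selective policy shown here (drop only the lines that sit in a way other than the one their low-order tag bits select) is an assumption made for illustration, as are the function name `invalidate_misplaced` and the list-of-lists cache model. Flushing the whole cache would be the simpler alternative, and dirty lines would need a write-back in either case.

```python
def invalidate_misplaced(sets, way_bits=2):
    """Invalidate lines that Direct-Mapped lookups could never find.

    `sets` is a list of sets; each set is a list with one stored tag per way,
    or None for an invalid way. Returns the number of lines dropped.
    """
    mask = (1 << way_bits) - 1
    dropped = 0
    for ways in sets:
        for way, tag in enumerate(ways):
            # In Direct-Mapped mode a line is only reachable in the way
            # named by the low-order bits of its tag.
            if tag is not None and (tag & mask) != way:
                ways[way] = None
                dropped += 1
    return dropped


# One 4-way set: way 1 holds a tag whose low bits are 11, so it is misplaced.
cache = [[0b100, 0b111, None, 0b011]]
print(invalidate_misplaced(cache), cache)
```

After this pass, a Direct-Mapped refill of the dropped line lands in its designated way, so no duplicate copy of the same address can remain in the set.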
Simulation Methodology
Parameter: Configuration
Machine Width: 8-wide fetch, issue and commit
Window Size: 128-entry ROB, 48-entry LSQ and Issue Queue
Physical Registers: 128 Integer + 128 FP Physical Registers
L1 I-Cache: 32 KB, 4-way set-associative, 64 byte line, 1 cycle hit time
L1 D-Cache:
L2 Unified Cache: 512 KB, 8-way set-associative, 128 byte line, 10 cycle hit time
Memory latency: 300 cycles
CACTI Parameters
Configuration            Energy/access (nJ)   Leakage/bank (mW)
Direct-Mapped (32 KB)    0.16618              28.4582
4-way Set-Associative    0.39985              64.5997
DAC (in DM mode)         0.15457              19.7505
Ndwl/Ntwl: 4/2, 8/2
Ndbl/Ntbl: 2/2
Impact on IPC. The performance loss is less than 2%, and DAC Budget stays in Set-Associative mode for a long time.
Impact on Cache Misses. DAC covers about half of the MPKI gap between the Direct-Mapped and Set-Associative caches.
DAC Impact on Energy Consumption. DAC saves 80% of the dynamic energy compared to Set-Associative caches.
Percentage of Accesses in Direct-Mapped Mode. DAC Deluxe spends more time in Set-Associative mode, as expected.
Conclusions. It is possible to dynamically change cache associativity and obtain the performance advantages of Set-Associative caches with the energy consumption of Direct-Mapped caches. DAC saves 80% of the dynamic energy in the L1 cache with a performance loss of less than 2%. DAC can be implemented using simple control logic and a few extra tags to control the switching between the operating modes.
Thank you! Questions?
Backup Slide: Handling Synonyms. If the OS ensures that the least significant bits do not change during address translation, synonyms won't occur. Another way is to check all the tags of the set. This will consume more power, but the increase is very small, as most of the power is consumed by the data arrays.