1
A Highly Configurable Cache Architecture for Embedded Systems
Chuanjun Zhang*, Frank Vahid**, and Walid Najjar*
*University of California, Riverside
**The Center for Embedded Computer Systems at UC Irvine
ISCA 2003
Reviewer: Liang Zhu
2
Outline
Why a Configurable Cache?
Configurable Associativity by Way Concatenation
Configurable Way Concatenation and Way Shutdown
Conclusions
3
Computing Total Memory-Related Energy
Considers CPU stall energy and off-chip memory energy
Excludes CPU active energy
Thus, represents all memory-related energy
  energy_mem = energy_dynamic + energy_static
  energy_dynamic = cache_hits * energy_hit + cache_misses * energy_miss
  energy_static = cycles * energy_static_per_cycle
Underlined: measured quantities
  SimpleScalar (cache_hits, cache_misses, cycles)
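The energy model on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tooling: the `SimCounts` class, the `memory_energy` function, and every numeric value are hypothetical, chosen only so the arithmetic is easy to follow. Units are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class SimCounts:
    """Counts a simulator such as SimpleScalar would report (hypothetical container)."""
    cache_hits: int
    cache_misses: int
    cycles: int


def memory_energy(counts, energy_hit, energy_miss, energy_static_per_cycle):
    """Total memory-related energy, following the slide's three equations."""
    energy_dynamic = counts.cache_hits * energy_hit + counts.cache_misses * energy_miss
    energy_static = counts.cycles * energy_static_per_cycle
    return energy_dynamic + energy_static


# Made-up numbers, for illustration only.
counts = SimCounts(cache_hits=900, cache_misses=100, cycles=2000)
total = memory_energy(counts, energy_hit=1.0, energy_miss=20.0,
                      energy_static_per_cycle=0.5)
print(total)  # 900*1.0 + 100*20.0 + 2000*0.5 = 3900.0
```

Only the three counts come from simulation; the per-event energies would have to come from circuit layout or data sheets, which the sketch treats as plain inputs.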
4
Why Choose Cache: Impacts Performance and Power
Performance impacts are well known
Power
  ARM920T: Caches consume 50% of total processor system power (Segars 01)
  M*CORE: Unified cache consumes 50% of total processor system power (Lee/Moyer/Arends 99)
5
Cache Associativity
Reduces miss rate, thus improving performance
Impact on power and energy?
6
Associativity is Costly
Associativity improves hit rate, but at the cost of more power per access
Are the power savings from reduced misses outweighed by the increased power per hit?
[Figure: Energy per access for 8 Kbyte cache]
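The question on this slide can be made concrete with a small comparison. The sketch below uses the deck's dynamic energy formula (hits times energy per hit, plus misses times energy per miss). The function names and all numbers are hypothetical, and it assumes the miss energy is the same for both caches.

```python
def dynamic_energy(accesses, misses, energy_hit, energy_miss):
    """Dynamic energy = hits * energy_hit + misses * energy_miss."""
    hits = accesses - misses
    return hits * energy_hit + misses * energy_miss


def four_way_saves_energy(accesses, misses_dm, misses_4w,
                          e_hit_dm, e_hit_4w, e_miss):
    """True if the four-way cache uses less dynamic energy than direct mapped."""
    e_dm = dynamic_energy(accesses, misses_dm, e_hit_dm, e_miss)
    e_4w = dynamic_energy(accesses, misses_4w, e_hit_4w, e_miss)
    return e_4w < e_dm


# Program A: direct mapped already hits well, so the costlier hits dominate.
print(four_way_saves_energy(1000, 20, 15, 1.0, 2.0, 50.0))   # False
# Program B: direct mapped thrashes, so the avoided misses pay for themselves.
print(four_way_saves_energy(1000, 100, 20, 1.0, 2.0, 50.0))  # True
```

With these made-up values the answer flips between the two programs, which is the dilemma the following slides describe.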
7
Associativity and Energy
Best performing cache is not always lowest energy
[Figure annotation: Significantly poorer energy]
8
Associativity Dilemma
Direct mapped cache
  Poor hit rate on some examples
  But low power per access
Four-way set-associative cache
  Good hit rate on nearly all examples
  But high power per access
9
So What’s the Best Cache?
Looking at popular embedded processors, there’s obviously no standard cache
Dilemma
  Direct mapped: good performance and energy for most programs
  Four-way: good performance for all programs, but at cost of higher power per access for all programs
Whether to design for the average case or the worst case?
10
Solution to the Dilemma
Configurable cache can be configured as four-way, two-way, or one-way
Four-way set-associative base cache
Ways can be concatenated to form two-way
Can be further concatenated to direct-mapped
[Figure: Way 1, Way 2, Way 3, Way 4 (four-way); Way 1, Way 2 (two-way); Way 1 (direct mapped)]
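One way to picture way concatenation is through the address split. Total capacity stays fixed, so halving the number of ways doubles the number of sets and moves one bit from the tag to the index. The sketch below is a logical view only, not the circuit from the paper. The 8 Kbyte size is taken from the energy-per-access slide, while the 32-byte line size and the `split_address` helper are assumptions made for illustration.

```python
CACHE_BYTES = 8 * 1024  # 8 Kbyte cache, as on the energy-per-access slide
LINE_BYTES = 32         # assumption: the slides do not state the line size


def split_address(addr, ways):
    """Return (tag, index, offset) for a cache configured with `ways` ways."""
    if ways not in (1, 2, 4):
        raise ValueError("ways must be 1, 2, or 4")
    sets = CACHE_BYTES // (LINE_BYTES * ways)   # fewer ways means more sets
    offset_bits = LINE_BYTES.bit_length() - 1   # log2 of a power of two
    index_bits = sets.bit_length() - 1
    offset = addr & (LINE_BYTES - 1)
    index = (addr >> offset_bits) & (sets - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


for ways in (4, 2, 1):
    print(ways, split_address(0x1234, ways))
# 4 (2, 17, 20)
# 2 (1, 17, 20)
# 1 (0, 145, 20)
```

Under these assumptions the index grows from 6 to 7 to 8 bits as the cache goes from four-way to two-way to direct mapped, and the tag shrinks by the same amount.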
11
Original Cache Layout
12
Configurable Cache Design: Way Concatenation
13
Analyzing the Results
Simulated the circuit in Cadence’s Spectre
Note energy savings with reduction of ways
Concerns over access time addressed with transistor sizing: match delay, +1% area
14
Way Concatenate Experiments
Experiment: Motorola PowerStone benchmark g3fax
Considering dynamic power only: L1 access energy, CPU stall energy, memory access energy
15
Previous Method: Way Shutdown
Albonesi proposed a cache where ways could be shut down
  To save dynamic power
Motorola M*CORE has same way-shutdown feature
  Unified cache: even allows setting each way as I, D, both, or off
[Figure: Way 1, Way 2, Way 3, Way 4]
Reduces dynamic power by accessing fewer ways
But decreases total size, so may increase miss rate
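The difference between the two methods shows up in usable capacity. The sketch below contrasts them for a four-way base cache. Both helper functions are hypothetical and model only associativity and capacity, not energy or miss rate.

```python
def way_shutdown(total_bytes, base_ways, active_ways):
    """Shutting ways off lowers associativity and capacity together."""
    if not 1 <= active_ways <= base_ways:
        raise ValueError("active_ways must be between 1 and base_ways")
    way_bytes = total_bytes // base_ways
    return {"associativity": active_ways,
            "capacity_bytes": way_bytes * active_ways}


def way_concatenation(total_bytes, base_ways, target_ways):
    """Concatenating ways lowers associativity but keeps the full capacity."""
    if target_ways < 1 or base_ways % target_ways != 0:
        raise ValueError("target_ways must evenly divide base_ways")
    return {"associativity": target_ways, "capacity_bytes": total_bytes}


print(way_shutdown(8192, 4, 2))
# {'associativity': 2, 'capacity_bytes': 4096}
print(way_concatenation(8192, 4, 2))
# {'associativity': 2, 'capacity_bytes': 8192}
```

Both configurations access two ways per lookup, but only the shutdown variant gives up half of the storage, which is why it may increase the miss rate.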
16
Experimental results
17
Way Shutdown Can be Good for Static Power
Static power (leakage) increasingly important in nanoscale technologies
We combine way shutdown with way concatenation
Use sleep transistor method of Powell (ISLPED 2000)
  When off, prevents leakage
  But 5% area overhead and 8% performance overhead
18
Way Concatenate Plus Way Shutdown
[Figure: 100% = 4-way conventional cache]
19
Conclusions
The authors have introduced a novel configurable cache design method called way concatenation
For dynamic power, way concatenation shows average energy savings of 40% compared to a conventional four-way set-associative cache
Way concatenation is shown to be superior to previously proposed way shutdown methods
A configurable cache with both way concatenation and way shutdown can save a lot of energy
20
Questions?