Moshovos © 1 ReCast: Boosting L2 Tag Line Buffer Coverage "for Free"
Won-Ho Park, Toronto
Andreas Moshovos, Toronto
Babak Falsafi, CMU
www.eecg.toronto.edu/aenao
Moshovos © 2 Power-Aware High-Level Caches
- AENAO target: high-performance and power-aware memory hierarchies
- Much work targets the L1; our focus is L2 power
- Much opportunity at the L2 and higher-level caches
- L2 power will increase:
  - In absolute terms: L2 size and associativity will grow (application footprints grow while L1 size is latency-limited)
  - In relative terms: as the L1 and the core are optimized, the L2's share of total power grows
Moshovos © 3 ReCast: Caching a Few Tag Sets
- Revisit the "line buffer" concept for the L2
- Increase coverage via S-Shift: 50%, up from 32% with conventional indexing
- L2 tag power savings: 38% for a writeback L1D / 85% for a writethrough L1D
[Figure: a conventional L1I/L1D/L2 hierarchy vs. the same hierarchy with ReCast and an S-Shift index function f() in front of the L2 tags]
Moshovos © 4 Roadmap
- ReCast concept and organization
- S-Shift indexing / trade-offs
- Experimental results
Moshovos © 5 ReCast Concept
[Figure: the address from the L1 is split into tag, set, and offset fields; the set field indexes ReCast, a small structure holding a few complete L2 tag sets (tag0 ... tag7). (1) On a ReCast hit, the cached tags resolve the L2 hit/miss directly; (2) on a ReCast miss, the full L2 tag arrays are consulted.]
(A software sketch of this lookup follows below.)
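To make the flow concrete, here is a minimal Python sketch of a ReCast-style tag-set cache in front of the L2 tags. The geometry follows the methodology slide (1 MB, 8-way, 64-byte blocks; 64 ReCast entries), but the flat dictionary with FIFO eviction is a simplification of the banked, set-associative ReCast organization in the talk, and all class and field names are made up for illustration.

```python
# Minimal sketch of a ReCast-style tag-set cache (sizes from the methodology
# slide; the flat, FIFO-evicted dictionary is an illustrative simplification).
class ReCastSketch:
    def __init__(self, l2_sets=2048, l2_assoc=8, block_bytes=64, recast_entries=64):
        self.l2_sets = l2_sets
        self.offset_bits = block_bytes.bit_length() - 1    # log2(64) = 6
        self.index_bits = l2_sets.bit_length() - 1          # log2(2048) = 11
        # Full L2 tag array: one tag list per set (None marks an invalid way).
        self.l2_tags = [[None] * l2_assoc for _ in range(l2_sets)]
        # ReCast: a small cache of recently referenced tag *sets*, keyed by set index.
        self.recast = {}                                     # set index -> copy of that set's tags
        self.recast_entries = recast_entries

    def _split(self, addr):
        set_idx = (addr >> self.offset_bits) & (self.l2_sets - 1)
        tag = addr >> (self.offset_bits + self.index_bits)
        return tag, set_idx

    def lookup(self, addr):
        tag, set_idx = self._split(addr)
        if set_idx in self.recast:
            # ReCast hit: the cached entry alone decides the L2 hit/miss,
            # so the large L2 tag arrays are not accessed.
            return "recast_hit", tag in self.recast[set_idx]
        # ReCast miss: the L2 tags must be probed anyway (the ReCast probe was
        # pure overhead); install the set so nearby accesses can be filtered.
        l2_hit = tag in self.l2_tags[set_idx]
        if len(self.recast) >= self.recast_entries:
            self.recast.pop(next(iter(self.recast)))         # evict the oldest installed set
        self.recast[set_idx] = list(self.l2_tags[set_idx])
        return "recast_miss", l2_hit
```

A real ReCast must also keep its cached sets consistent with L2 fills and evictions; the sketch ignores that.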
Moshovos © 6 ReCast Power Tradeoffs
- ReCast hit:
  - The cached entry determines whether the L2 hits or misses
  - No need to access the L2 tags: power is reduced
  - Latency can be reduced
- ReCast miss:
  - The L2 tags must still be accessed
  - Power is increased by the ReCast overhead
  - Latency is increased
- A net win for typical applications (a back-of-the-envelope model follows below)
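A back-of-the-envelope way to see why this is a net win: with filter rate f, the expected tag-lookup energy is roughly E_recast + (1 - f) * E_l2tags, versus E_l2tags without ReCast. The energy constants below are invented purely for illustration; only the 32% and 50% filter rates come from the talk.

```python
# Back-of-the-envelope tag-lookup energy model (illustrative numbers only).
E_RECAST = 1.0     # assumed energy of one ReCast probe (arbitrary units)
E_L2TAGS = 10.0    # assumed energy of probing the full L2 tag arrays

def expected_tag_energy(filter_rate):
    # ReCast is always probed; the L2 tags are probed only on a ReCast miss.
    return E_RECAST + (1.0 - filter_rate) * E_L2TAGS

baseline = E_L2TAGS                                  # no ReCast: always probe the L2 tags
for f in (0.32, 0.50):                               # conventional vs. 1-Shift filter rates
    e = expected_tag_energy(f)
    print(f"filter rate {f:.0%}: {e:.1f} vs {baseline:.1f} "
          f"({(1 - e / baseline):.0%} tag energy saved)")
```

The higher the filter rate, the more often the ReCast probe replaces a full L2 tag probe, which is why raising coverage from 32% to 50% matters.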
Moshovos © 7 ReCast Organization
- Distributed over the tag arrays
[Figure: the address fans out to the L2 tag subarrays, each with its own small ReCast bank in front of it]
Moshovos © 8 Increasing L2 Set Locality
- Goal:
  - Make consecutive L1 blocks map onto the same L2 set
  - Exploit spatial locality
- A larger L2 block: won't work
- Change the L2 indexing function instead: S-Shift (a sketch of the index function follows below)
[Figure: the block address is re-split so the set index is taken S bits higher up; the S skipped bits join the new tag, yielding a new tag and new set]
- Affects the L2 hit rate; a net win for most applications
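A sketch of the indexing change under my reading of the address-field figure (the exact bit layout in the paper may differ): conventional indexing takes the set index from the bits just above the block offset, while an S-Shift takes it S bits higher, so the skipped bits move into the tag and the 2^S consecutive blocks that differ only in those bits map to the same L2 set.

```python
OFFSET_BITS = 6     # 64-byte L2 blocks (methodology slide)
INDEX_BITS = 11     # 1 MB / 64 B blocks / 8 ways = 2048 sets

def conventional_index(addr):
    # Set index taken from the bits immediately above the block offset.
    return (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)

def s_shift_index(addr, s):
    # Skip S bits before extracting the set index: blocks differing only in
    # those S bits now share an L2 set.
    return (addr >> (OFFSET_BITS + s)) & ((1 << INDEX_BITS) - 1)

def s_shift_tag(addr, s):
    # The skipped bits must still be stored somewhere, so they join the new tag.
    high = addr >> (OFFSET_BITS + s + INDEX_BITS)
    low = (addr >> OFFSET_BITS) & ((1 << s) - 1)
    return (high << s) | low
```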
Moshovos © 9 How S-Shift Increases Locality
- Stream of sequential references, e.g., a[i++] (worked example below)
[Figure: consecutive blocks mapped onto a 2-way L2: under conventional indexing they spread across sets 0, 1, ..., n; with 1-Shift, consecutive pairs share a set and fill its ways]
- Not the same as increasing the L2 block size
- May increase or decrease set pressure and the L2 miss rate
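For a sequential sweep such as a[i++], the locality effect is visible by printing the set index of consecutive block addresses (same assumed field layout as in the indexing sketch above; the base address is arbitrary):

```python
OFFSET_BITS, INDEX_BITS = 6, 11       # 64-byte blocks, 2048 L2 sets (assumed layout)

def set_index(addr, s=0):
    return (addr >> (OFFSET_BITS + s)) & ((1 << INDEX_BITS) - 1)

# Eight consecutive 64-byte blocks, as touched by a sequential a[i++] sweep.
blocks = [0x40000000 + i * 64 for i in range(8)]
print("conventional:", [set_index(a) for a in blocks])       # [0, 1, 2, 3, 4, 5, 6, 7]
print("1-Shift:     ", [set_index(a, s=1) for a in blocks])  # [0, 0, 1, 1, 2, 2, 3, 3]
```

With 1-Shift, consecutive blocks fill the ways of one set instead of spreading across sets, so a single ReCast entry (one cached tag set) covers more of the stream.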
Moshovos © 10 Experimental Results
- Filter rates: how often the referenced set is found in ReCast
- L2 miss rate
- L2 power savings
- More in the paper: performance with various latency models (fixed or variable latency)
Moshovos © 11 Methodology
- SPEC CPU 2000 (a subset)
- Up to 30 billion committed instructions
- 8-way out-of-order core, up to 128 in-flight instructions
- L1: 32 KB, 32-byte blocks, 2-way set-associative
- L2: 1 MB, 64-byte blocks, 8-way set-associative
- L3: 4 MB, 128-byte blocks, 8-way set-associative
- ReCast organization shown: 8 banks, each 4 sets, 2-way set-associative
Moshovos © 12 ReCast Filter Rate
- 1-Shift increases the filter rate from 32% to 50%
- 2-Shift increases the filter rate further
[Chart: filter rates; higher is better]
Moshovos © 13 L2 Miss Rate
- Mostly unchanged, but varies for some programs
- Application analysis in the paper
[Chart: L2 miss rates; lower is better]
Moshovos © 14 L2 Power Savings: Writeback L1D
- L2 tag power reduced by 38%
- Overall L2 power reduced by 16%
[Chart: power savings; higher is better]
Moshovos © 15 L2 Power Savings: Writethrough L1D
- L2 tag power reduced by 85%
Moshovos © 16 ReCast
- Revisited the concept of "line buffers" for the L2
- L2 power is increasingly important, in both absolute and relative terms
- ReCast: an L2 tag-set cache
- S-Shift: improves L2 set locality "for free"
- Results:
  - 1-Shift filter rate: 50%
  - L2 tag power savings: 38%
Moshovos © 17 ReCast: L2 Power and Latency

ReCast   L2     Power       Latency
Hit      Hit    reduced     can be reduced
Hit      Miss   reduced     can be reduced
Miss     Miss   increased   increased
Miss     Hit    increased   increased

("ReCast hit" = the referenced set is in ReCast; "L2 hit" = the data is in the L2.)
- Reduces power on both L2 misses and hits
- Needs set locality