Download presentation
Presentation is loading. Please wait.
Published byBrenda Amie Harrington Modified over 9 years ago
1
Non-Uniform Cache Architecture Prof. Hsien-Hsin S
Non-Uniform Cache Architecture Prof. Hsien-Hsin S. Lee School of Electrical and Computer Engineering Georgia Tech Guest lecture for ECE4100/6100 for Prof. Yalamanchili
2
Non-Uniform Cache Architecture
ASPLOS 2002 proposed by UT-Austin Facts Large shared on-die L2 Wire-delay dominating on-die cache 3 cycles 1MB 180nm, 1999 11 cycles 4MB 90nm, 2004 24 cycles 16MB 50nm, 2010
3
Multi-banked L2 cache Bank=128KB 11 cycles 2MB @ 130nm
Bank Access time = 3 cycles Interconnect delay = 8 cycles
4
Multi-banked L2 cache Bank=64KB 47 cycles 16MB @ 50nm
Bank Access time = 3 cycles Interconnect delay = 44 cycles
5
Static NUCA-1 Use private per-bank channel
Tag Array Data Bus Address Bus Bank Sub-bank Predecoder Sense amplifier Wordline driver and decoder Use private per-bank channel Each bank has its distinct access latency Statically decide data location for its given address Average access latency =34.2 cycles Wire overhead = 20.9% an issue
6
Static NUCA-2 Tag Array Bank Switch Data bus Predecoder Wordline driver and decoder Use a 2D switched network to alleviate wire area overhead Average access latency =24.2 cycles Wire overhead = 5.9%
7
Dynamic NUCA Data can dynamically migrate
Move frequently used cache lines closer to CPU
8
Dynamic NUCA bank 8 bank sets one set way 0 way 1 way 2 way 3
Simple Mapping All 4 ways of each bank set needs to be searched Farther bank sets longer access
9
Dynamic NUCA bank 8 bank sets one set way 0 way 1 way 2 way 3
Fair Mapping Average access time across all bank sets are equal
10
Dynamic NUCA bank 8 bank sets way 0 way 1 way 2 way 3 Shared Mapping
Sharing the closet banks for farther banks
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.