Optical Overlay NUCA: A High Speed Substrate for Shared L2 Caches

Presentation on theme: "Optical Overlay NUCA: A High Speed Substrate for Shared L2 Caches"— Presentation transcript:

1 Optical Overlay NUCA: A High Speed Substrate for Shared L2 Caches
OONUCA. Eldhose Peter*, Anuj Arora**, Akriti Bagaria* and Dr. Smruti R Sarangi* (*IIT Delhi, **CISCO Bangalore)

2 Motivation; Overlay NUCA; Architecture; Results

3 Understand the problem - Cache
Cache performance ∝ 1/(hit latency). Cache performance ∝ hit rate. UCA: sets are static => not adaptable based on access pattern. NUCA: improved cache utilization. [Figure: UCA and NUCA layouts of L2 banks above the lower-level memory.] Optical overlay NUCA: A high speed substrate for shared L2 caches 11/22/2018
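The two proportionalities on this slide are commonly combined in the standard average memory access time (AMAT) relation: hit latency plus miss rate times miss penalty. The sketch below is a minimal illustration of that relation; the function name and all sample numbers are assumptions chosen for illustration, not measurements from the presentation.

```python
def average_access_time(hit_latency, hit_rate, miss_penalty):
    """Average memory access time in cycles (standard AMAT relation)."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Every access pays the hit latency; misses additionally pay the penalty.
    return hit_latency + (1.0 - hit_rate) * miss_penalty


# Illustrative values only: same hit rate, different hit latency.
slow = average_access_time(hit_latency=50, hit_rate=0.9, miss_penalty=250)
fast = average_access_time(hit_latency=2, hit_rate=0.9, miss_penalty=250)
```

Lowering the hit latency or raising the hit rate both reduce the result, which is the point the slide makes.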

4 Understand the problem – Optical Communication
Electrical: 50-60 cycles. Optical: 1-2 cycles.

5 Optical Communication
Basic components. Reservation-assisted Single Write Multi Read (R-SWMR). [Figure: source S and destinations D1, D2, D3.] Methods to Leverage Optical Networks for Multicore Processors 11/22/2018

6 Prior Approaches
No prior work on caches using an optical NOC. Electrical NOC: SNUCA, DNUCA, RNUCA. [Figure: L1 caches above the lower-level memory, annotated "Migration near to the core" and "Search"; address fields Tag, Set Index, Block Size, HomeBank with example bits 1000 100011 10100.]
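The address fields named on this slide (Tag, Set Index, HomeBank, and the block-size bits) can be illustrated with a small decoder. This is a sketch under stated assumptions: the field order [tag | set index | home bank | block offset] is assumed, the widths follow the configuration slide (64 B blocks, 32 banks), and the 9 set-index bits assume that 256 KB is the size of one 8-way bank. `DecodedAddress` and `decode` are hypothetical names.

```python
from typing import NamedTuple

BLOCK_OFFSET_BITS = 6  # 64 B blocks
HOME_BANK_BITS = 5     # 32 banks
SET_INDEX_BITS = 9     # assumed: 256 KB per bank / (64 B * 8 ways) = 512 sets


class DecodedAddress(NamedTuple):
    tag: int
    set_index: int
    home_bank: int
    block_offset: int


def decode(addr: int) -> DecodedAddress:
    """Split an address using the assumed field layout, low bits first."""
    block_offset = addr & ((1 << BLOCK_OFFSET_BITS) - 1)
    addr >>= BLOCK_OFFSET_BITS
    home_bank = addr & ((1 << HOME_BANK_BITS) - 1)
    addr >>= HOME_BANK_BITS
    set_index = addr & ((1 << SET_INDEX_BITS) - 1)
    tag = addr >> SET_INDEX_BITS
    return DecodedAddress(tag, set_index, home_bank, block_offset)


# Example address built from tag 0b1000, set 3, home bank 0b10100, offset 0b100011.
example = decode((((0b1000 << 9 | 3) << 5 | 0b10100) << 6) | 0b100011)
```

In a static scheme the home bank follows directly from the address bits, which is why a block that lives elsewhere has to be found by a search.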

7 Equidistant nodes
Banks are approximately equidistant in terms of delay. Dynamic creation of sets: improves the utilization of banks; improves hit rate. [Figure: two banks, each X cycles from S, labelled "I am near to S" and "I am also near to S".]

8 Phases of operation
Phase 1: SNUCA + Profiling. Phase 2: Reconfiguration. Phase 3: OONUCA.

9 Optical Overlay
Profiling information: cache bank accesses, bank contention, cache lines used. Experimentally determined that a ring topology of 8 banks is the best.

10 Creation of overlay
[Figure: banks 1-32 arranged into overlays; labels: High, Low, Hybrid, Infreq.]

11 Operations in Overlay NUCA – Search
Search schemes: Two-Side Incremental (TSI) and Broadcast. [Figure: overlay with the home bank marked; banks 2, 23, 15, 20, 27, 18, 31, 5.]

12 Operations in Overlay NUCA - Eviction
[Figure: eviction from L2, showing main memory, the home bank, and banks 18, 20, 21, 24, 26, 29, 31, 32.]

13 TSI - Protocol
[Flowchart: message handling at an L2 cache bank after an L1 cache miss.] Request: if space is available in the message queue (MQ), search the bank; otherwise send a NACK to the sender (exponential back-off). Hit: reply; a non-home bank also sends Kill to the home bank and Kill to the opposite branch. Miss: the home bank creates an entry in the RCB; a bank with a child sends the request message to its children, and a bank with no child sends a Notify message to the home bank. Notify: add the notify to the RCB entry; if notify = 2, send the request to main memory. Kill: remove the RCB entry and the MQ entry.
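The difference between the TSI and broadcast searches can be illustrated by counting how many banks each one probes. This is a simplified sketch, not the presenters' implementation: it assumes the overlay is a ring, that the two TSI branches advance in lockstep, and that a Kill stops the other branch only after the step in which the hit occurs. It ignores message queues, NACKs, and timing. The function names and the ring order used in the example are hypothetical.

```python
def tsi_search(ring, home, holder):
    """Return (found, banks_probed) for a two-sided incremental search.

    ring: bank ids in overlay order; home: home bank id;
    holder: id of the bank holding the block, or None if no bank has it.
    """
    n = len(ring)
    start = ring.index(home)
    probed = 1  # the home bank is searched first
    if holder == home:
        return True, probed
    # The remaining banks form two branches walking away from the home bank.
    others = [ring[(start + i) % n] for i in range(1, n)]
    half = (len(others) + 1) // 2
    branch_a = others[:half]
    branch_b = others[half:][::-1]
    for step in range(half):
        hit = False
        for branch in (branch_a, branch_b):
            if step < len(branch):
                probed += 1
                hit = hit or branch[step] == holder
        if hit:
            return True, probed  # Kill messages stop further probing
    return False, probed  # both branches notify: request goes to main memory


def broadcast_search(ring, home, holder):
    """Home bank first; on a miss, every other bank is probed at once."""
    if holder == home:
        return True, 1
    return holder in ring, len(ring)


ring = [2, 23, 15, 20, 27, 18, 31, 5]  # assumed ring order
near_hit = tsi_search(ring, home=2, holder=23)
full_miss = tsi_search(ring, home=2, holder=None)
```

Under these assumptions TSI probes fewer banks whenever the block sits close to the home bank, and the same number as broadcast when the block is absent.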

14 Home Bank Controller
[Block diagram: Response Collection Buffer (fields: Message ID, Block Addr, MRBV), Message Queue, Victim Buffer, Overlay Info store, Search Logic, Eviction Logic, NACK controller, Kill controller, and the Cache Bank; paths for main memory read and write, fill, bank read/write, migrate block, evicted block, forward request to other banks, Notify, Kill, NACK message, and response to the sender or core.]

15 Message Structure
Control message: 1 flit, 3 cycles (NACK, Kill, Notify). Data message: 5 flits, 7 cycles (Request, Response).
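Both figures on this slide fit a simple model in which a message takes its flit count plus two cycles. That two-cycle constant is an inference from the two data points (1 flit in 3 cycles, 5 flits in 7 cycles), not something the slide states. A minimal sketch with hypothetical names:

```python
# Flit counts per message type, as listed on the slide.
FLITS_BY_TYPE = {
    "NACK": 1,
    "Kill": 1,
    "Notify": 1,
    "Request": 5,
    "Response": 5,
}

# Assumed fixed overhead, inferred from 1 flit -> 3 cycles and 5 flits -> 7 cycles.
FIXED_OVERHEAD_CYCLES = 2


def transfer_cycles(message_type):
    """Cycles to transfer one message under the assumed linear model."""
    return FLITS_BY_TYPE[message_type] + FIXED_OVERHEAD_CYCLES
```

For example, `transfer_cycles("Kill")` gives 3 and `transfer_cycles("Response")` gives 7, matching the slide.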

16 Clustered architecture – 16 stations
32 cores and 32 cache banks. Distributed directory. Off-chip laser.

17 Story till now
Optical Overlay; Operations; Home Bank Controller; Message Format; Architecture.

18 Configuration
System: Cores – 32; Technology – 18nm; Frequency – 3.4 GHz; Laser – Off chip.
L1 cache: Block size – 64 B; Write mode – Write back; Size – 32 KB; MSHR – 32.
L2 cache: Banks – 32; Associativity – 8; Size – 256 KB.
Main memory: Latency – 250 cycles; Memory controllers – 4.
Auxiliary structures: RCB – 128; MQ – 16; VB – 20.

19 Results: Hits in Home Bank
More non-home bank hits => high performance in Optical Overlay NUCA. [Figure: hits in home bank; annotation: more non-home bank hits.]

20 Results: Normalized Average Hit Latency
[Figure: normalized average hit latency; annotated ranges 0.4-0.8 and 0.2-0.55.]

21 Results: L2 Hit Rate
Comparable to DNUCA, much better than SNUCA. [Figure: L2 hit rate; annotation: more home bank hits.]

22 Results: Normalized IPC
[Figure: normalized IPC; annotations: 167%, 161%, 2-3%, and 50, 24, 18%; high non-home bank hits; fewer L2 requests.]

23 What we achieved
Flexible and efficient access mechanism for large shared L2 caches. The TSI protocol makes fewer accesses than the naive broadcast protocol. The performance difference between TSI and broadcast is minimal (2-3%). Mean speedup of 50% over SNUCA. The proposal is agnostic to the network topology.

24 Thank you

