1
“NAHALAL: Cache Organization for Chip Multiprocessors” New LSU Policy. By: Ido Shayevitz and Yoav Shargil. Supervisor: Zvika Guz
2
NAHALAL ARCHITECTURE. The NAHALAL architecture defines the layout of the memory banks of the L2 cache. Each processor has a private backyard bank, and all processors share a small shared (public) bank. The architecture is based on the hot shared line phenomenon.
3
LSU Improvement
Placement Policy
Replacement Policy from Private Bank: LRU
Replacement Policy from Public Bank: LRU in NAHALAL, replaced here by LSU
The LSU policy selects the Least Shared Used line to evict from the public bank.
4
LSU Implementation
A shift-register with N cells is kept for each line.
Each cell in the shift-register holds a CPU number.
On eviction by CPUi: for each shift-register, XOR each cell with the ID of CPUi. The shift-register for which the XOR produces 0 is the chosen one. If none produces 0, fall back to regular LRU.
To reduce memory overhead, define N = 4. Therefore 2^14 * 4 * 3 = 0.1875 MB, an 18.75% memory overhead.
A simple, fast algorithm in hardware.
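The selection rule above can be sketched in software as follows. This is a minimal illustration, not the hardware design from the slides: the `Line` class, the `choose_victim` function and the timestamp-based LRU fallback are hypothetical names introduced here. It also assumes that a line qualifies only when its shift-register is full and every cell matches the evicting CPU, and that the first qualifying line is taken.

```python
from collections import deque
from dataclasses import dataclass, field

N_CELLS = 4  # shift-register length per line (N = 4 in the slides)


@dataclass
class Line:
    tag: int
    last_used: int = 0  # timestamp, used only by the LRU fallback
    # Shift-register of the last N_CELLS CPU IDs that touched the line.
    history: deque = field(default_factory=lambda: deque(maxlen=N_CELLS))

    def touch(self, cpu_id: int, now: int) -> None:
        self.history.append(cpu_id)  # the oldest entry is shifted out
        self.last_used = now


def choose_victim(lines, cpu_id):
    """Pick a line to evict from the public bank on behalf of cpu_id."""
    for line in lines:
        # Assumption: "XOR produces 0" means every cell equals cpu_id,
        # i.e. the line was recently used only by the evicting CPU.
        full = len(line.history) == N_CELLS
        if full and all((cell ^ cpu_id) == 0 for cell in line.history):
            return line
    # No unshared line found: regular LRU.
    return min(lines, key=lambda line: line.last_used)


# Small demonstration with three lines in the public bank.
bank = [Line(tag=0), Line(tag=1), Line(tag=2)]
now = 0
for tag, cpus in [(0, [0, 1, 2, 3]), (1, [2, 2, 2, 2]), (2, [0, 0, 1, 1])]:
    for cpu in cpus:
        now += 1
        bank[tag].touch(cpu, now)

victim_lsu = choose_victim(bank, cpu_id=2)  # line 1 was used only by CPU 2
victim_lru = choose_victim(bank, cpu_id=5)  # no match, so LRU picks line 0
```

In this sketch a line touched by several CPUs is never selected by the LSU test, which is the behaviour the slides aim for: shared lines stay in the public bank as long as an unshared candidate exists.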
5
Simulation Structure in Simics. Using a Python script we defined:
6
Writing Benchmarks. Benchmarks are written in the simulated target console:
7
Writing Benchmarks. Threads are created with the pthread library. Each thread is bound to a CPU using the sched library. The parallel code is written in the benchmark; OS code and pthread library code also execute in parallel. Each benchmark is run twice: first without LSU, then with LSU.
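The benchmarks in the slides use pthread and the sched library, presumably from C. As a rough illustration of the same idea in Python, the sketch below starts several threads and pins each one to a CPU with `os.sched_setaffinity`, which is available on Linux only. The `worker` function, the thread count and the workload are hypothetical and stand in for the real benchmark code.

```python
import os
import threading

N_THREADS = 4  # hypothetical thread count, not taken from the slides


def worker(index, cpu, results):
    # Pin the calling thread to a single CPU. On Linux, pid 0 refers to
    # the calling thread. Skipped on platforms without this call.
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, {cpu})
    # Placeholder workload standing in for the benchmark's parallel code.
    results[index] = sum(range(100_000))


# CPUs this process may run on; fall back to CPU 0 where unsupported.
if hasattr(os, "sched_getaffinity"):
    cpus = sorted(os.sched_getaffinity(0))
else:
    cpus = [0]

results = [None] * N_THREADS
threads = [
    threading.Thread(target=worker, args=(i, cpus[i % len(cpus)], results))
    for i in range(N_THREADS)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Binding each thread to a fixed CPU keeps a thread's accesses associated with one CPU ID, which is what a per-line record of CPU numbers relies on.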
8
Collecting Statistics
Cache statistics: l2c
-----------------
Total number of transactions: 610349
Total memory stall time: 31402835
Total memory hit stall time: 28251635
Device data reads (DMA): 0
Device data writes (DMA): 0
Uncacheable data reads: 17
Uncacheable data writes: 30738
Uncacheable instruction fetches: 0
Data read transactions: 403488
Total read stall time: 17488735
Total read hit stall time: 14383135
Data read remote hits: 0
Data read misses: 10352
Data read hit ratio: 97.43%
Instruction fetch transactions: 0
Instruction fetch misses: 0
Data write transactions: 176106
Total write stall time: 4687600
Total write hit stall time: 4687600
Data write remote hits: 0
Data write misses: 0
Data write hit ratio: 100.00%
Copy back transactions: 0
Number of replacments in the middle (NAHALAL): 557
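The derived metrics can be computed from a dump like the one above. The sketch below is an illustration only: `parse_stats` is a hypothetical helper, and it assumes that "average stall time per transaction" means total memory stall time divided by total number of transactions, which the slides do not state explicitly. Only the counters needed for the calculation are copied from the dump.

```python
STATS = """\
Cache statistics: l2c
Total number of transactions: 610349
Total memory stall time: 31402835
Data read transactions: 403488
Data read misses: 10352
Number of replacments in the middle (NAHALAL): 557
"""


def parse_stats(text):
    """Turn 'name: value' lines into a dict, skipping non-numeric lines."""
    stats = {}
    for raw in text.splitlines():
        key, sep, value = raw.rpartition(":")
        if not sep:
            continue
        try:
            stats[key.strip()] = float(value.strip().rstrip("%"))
        except ValueError:
            continue  # e.g. the 'Cache statistics: l2c' header line
    return stats


stats = parse_stats(STATS)
total = stats["Total number of transactions"]

# Assumed definition: stall cycles averaged over all transactions.
avg_stall = stats["Total memory stall time"] / total

# Share of transactions that caused a replacement in the middle bank.
repl_pct = 100 * stats["Number of replacments in the middle (NAHALAL)"] / total

# Read hit ratio recomputed from the raw counters.
read_hit_pct = 100 * (1 - stats["Data read misses"] / stats["Data read transactions"])

print(f"avg stall/transaction: {avg_stall:.2f}")
print(f"replacements in the middle: {repl_pct:.2f}%")
print(f"read hit ratio: {read_hit_pct:.2f}%")
```

With these counters the replacement share comes out at roughly 0.09% of transactions, and the recomputed read hit ratio agrees with the 97.43% reported in the dump.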
9
Results
1. Improvement of 54% in average stall time per transaction.
2. Improvement of 61% in average stall time per transaction.
3. Without LSU, 8.375% of the transactions cause a replacement in the middle; with LSU, only 0.09%. Improvement of ∆ = 8.28 percentage points.
4. Without LSU, 8.75% of the transactions cause a replacement in the middle; with LSU, only 0.02%. Improvement of ∆ = 8.73 percentage points.
[Charts 1 to 4]
10
Conclusions
The LSU policy significantly improves average stall time per transaction. Therefore: the LSU policy implemented in the NAHALAL architecture significantly reduces the number of cycles for a benchmark.
The LSU policy significantly reduces the number of replacements in the middle. Therefore: the LSU policy implemented in the NAHALAL architecture is better at keeping the hot shared lines in the public bank.
In our implementation, LRU is activated if LSU does not find a line. Therefore: the LSU policy as we implemented it is always preferable to LRU.