Published by Melvyn Jenkins. Modified over 5 years ago.
1
Addressing Service Interruptions in Memory with Thread-to-Rank Assignment
Manjunath Shevgoor and Rajeev Balasubramonian, University of Utah
Niladrish Chatterjee, NVIDIA
Jung-Sik Kim, Samsung Electronics
ISPASS 2016, 4/18/2016
2
DRAM Refresh: Quick Recap
A DRAM cell leaks through its access transistor
Leakage increases with temperature
Each DRAM cell must be refreshed every 64 ms
1/8K of the DRAM rank is refreshed every 7.8 µs
Figure: DRAM cell with bit line, word line, and leakage path; the cell leaks more with temperature
3
Refresh Timing Parameters
tREFI: 7.8 µs or 3.9 µs
tRFC: 640 ns (32 Gb)
Figure: timeline of repeated tRFC windows spaced tREFI apart, with tRefresh and tRecovery labeled
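As a rough worked example of the numbers on the recap and timing slides, the sketch below derives the refresh interval from the 64 ms retention window and estimates what fraction of time a rank is busy refreshing. The function names are illustrative and not from the talk; the 8192 refresh commands per window correspond to the "1/8K" on the recap slide, and the 32 ms window used for the second case is an assumption about how the shorter interval arises.

```python
# Back-of-the-envelope refresh arithmetic (illustrative names, not from the talk).

def refresh_interval_us(retention_ms: float = 64.0, commands: int = 8192) -> float:
    """tREFI: spacing between refresh commands so the whole rank is
    covered once per retention window."""
    return retention_ms * 1000.0 / commands


def refresh_busy_fraction(trfc_ns: float, trefi_us: float) -> float:
    """Fraction of time a rank is unavailable because it is refreshing."""
    return trfc_ns / (trefi_us * 1000.0)


trefi = refresh_interval_us()          # 64 ms / 8192 = 7.8125 us
trefi_hot = refresh_interval_us(32.0)  # assumed halved window: 3.90625 us

for label, interval in (("64 ms window", trefi), ("32 ms window", trefi_hot)):
    busy = refresh_busy_fraction(640, interval)  # 640 ns is the 32 Gb tRFC
    print(f"{label}: tREFI = {interval:.2f} us, rank busy {busy:.1%} of the time")
```

With a 640 ns tRFC, the rank is unavailable for roughly 8% of the time at the longer interval and about twice that at the shorter one, which is why the interruption becomes hard to ignore at high densities.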
4
tRFC Projections
5
Refresh determines memory peak power
Refresh Power in DRAM
Command    Current (mA)
Act        67
Read       125
Write
Refresh    245
6
Stagger refresh to reduce peak power
Figure: 8-core CMP with two memory controllers (MC), one per channel (Channel 1, Channel 2), and Ranks 1 to 4
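To make the peak-power argument concrete, here is a minimal sketch that compares simultaneous and staggered refresh by counting how many ranks refresh at the same instant. It assumes refreshes are offset evenly across one tREFI, which the slides do not specify; the function names and the integer nanosecond units are illustrative.

```python
# Simultaneous vs. staggered refresh: how many ranks refresh at once?
# Even spacing of the offsets is an assumption made for this sketch.

def refresh_windows(num_ranks, trefi_ns, trfc_ns, staggered, horizon_ns):
    """Return (start, end, rank) refresh windows that begin before horizon_ns."""
    windows = []
    for rank in range(num_ranks):
        offset = (rank * trefi_ns) // num_ranks if staggered else 0
        start = offset
        while start < horizon_ns:
            windows.append((start, start + trfc_ns, rank))
            start += trefi_ns
    return windows


def peak_overlap(windows):
    """Maximum number of ranks refreshing at the same instant."""
    events = []
    for start, end, _rank in windows:
        events.append((start, 1))
        events.append((end, -1))
    # At equal timestamps the -1 event sorts first, so back-to-back
    # windows are not counted as overlapping.
    events.sort()
    current = peak = 0
    for _time, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


simultaneous = refresh_windows(4, 7800, 640, staggered=False, horizon_ns=78000)
staggered = refresh_windows(4, 7800, 640, staggered=True, horizon_ns=78000)
print("simultaneous peak:", peak_overlap(simultaneous))
print("staggered peak:", peak_overlap(staggered))
```

In this model, staggering cuts the number of concurrently refreshing ranks from four to one, at the cost of some rank being unavailable four times as often.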
7
Effect of Staggered Refresh
8
Talk Outline
DRAM refresh background
Goal: the low peak power of staggered refresh with the performance of simultaneous refresh
Analyzing stalls from refresh
Solution: thread-to-rank assignment
Results
9
Each Staggered Refresh Stalls Many Cores
Figure: 8-core CMP with two MCs (Channel 1, Channel 2) and Ranks 1 to 4; queued requests are labeled by thread (T1, T2, T3, T7, T8) and rank (R1, R2, R3), and threads waiting on a refreshing rank are marked Stalled
Rank 1 refreshing => 3 threads stalled
Rank 3 refreshing => 3 threads stalled
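The effect can be illustrated with a small sketch that lists which threads are exposed when a given rank refreshes. The placements below are hypothetical and are not the data from the slide, and the model is deliberately simplified: it assumes a thread stalls whenever any of its pages live in the refreshing rank.

```python
# Which threads are exposed to a refresh of a given rank?
# Simplifying assumption: a thread stalls if it has any data in that rank.

def stalled_threads(thread_ranks, refreshing_rank):
    """Return the sorted names of threads with data in the refreshing rank."""
    return sorted(t for t, ranks in thread_ranks.items() if refreshing_rank in ranks)


# Hypothetical placement: pages of each thread spread over several ranks.
spread = {"T1": {1, 2}, "T2": {1, 2, 3}, "T3": {1, 3}, "T4": {4}}

# Hypothetical placement: each thread confined to a single rank.
confined = {"T1": {1}, "T2": {2}, "T3": {3}, "T4": {4}}

for rank in (1, 3):
    print(f"rank {rank} refreshing, spread:   {stalled_threads(spread, rank)}")
    print(f"rank {rank} refreshing, confined: {stalled_threads(confined, rank)}")
```

When pages are spread, a single refresh touches several threads; when each thread is confined to one rank, a refresh touches only the threads that own that rank.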
10
Limit the Spread: Address Mapping
11
% Refreshes Affecting a Thread
Highest Performance Loss
12
37% increase in Execution Time
Highest performance loss: 37% increase in execution time
13
Rank Assigned Page Mapping
Figure: Threads 1 to 8 on an 8-core CMP, two MCs (Channel 1, Channel 2), and Ranks 1 to 4
Strict mapping of threads to ranks, e.g., as used for cache partitioning by Lin et al., HPCA 2008
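A strict mapping could be sketched as a page-frame allocator that only ever hands a thread frames from its assigned rank. This is an illustration under stated assumptions, not the authors' implementation: the class name, the modulo thread-to-rank rule, and the layout in which consecutive frame numbers belong to the same rank are all hypothetical.

```python
# Strict thread-to-rank page allocation (hypothetical sketch).

class StrictRankAllocator:
    def __init__(self, num_ranks, frames_per_rank):
        self.num_ranks = num_ranks
        self.frames_per_rank = frames_per_rank
        # Assumed layout: frame // frames_per_rank identifies the rank.
        self.free = [
            list(range(r * frames_per_rank, (r + 1) * frames_per_rank))
            for r in range(num_ranks)
        ]

    def rank_of_thread(self, thread_id):
        # Assumed policy: threads are spread over ranks round-robin.
        return thread_id % self.num_ranks

    def rank_of_frame(self, frame):
        return frame // self.frames_per_rank

    def alloc(self, thread_id):
        """Return a free frame from the thread's own rank, or fail."""
        rank = self.rank_of_thread(thread_id)
        if not self.free[rank]:
            raise MemoryError(f"rank {rank} has no free frames")
        return self.free[rank].pop()


allocator = StrictRankAllocator(num_ranks=4, frames_per_rank=2)
frame = allocator.alloc(thread_id=5)  # thread 5 is assigned rank 1
print("thread 5 got frame", frame, "in rank", allocator.rank_of_frame(frame))
```

The strictness shows up in the failure case: once a thread's rank is full, allocation fails even if other ranks still have free frames.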
14
Limit the Spread: Page Mapping
Figure: Threads 1 to 8 on an 8-core CMP, two MCs (Channel 1, Channel 2), and Ranks 1 to 4
Relaxed mapping of threads to ranks
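A relaxed, best-effort mapping might look like the sketch below: the allocator prefers the thread's assigned rank but spills to another rank when that one is full. The fallback rule (pick the rank with the most free frames) and the function name are assumptions for illustration; the slides only state that the mapping is relaxed.

```python
# Relaxed (best-effort) rank assignment: prefer the assigned rank, spill if full.
# The fallback rule is an assumption, not taken from the talk.

def alloc_relaxed(free_frames, preferred_rank):
    """free_frames maps rank -> list of free frame numbers.

    Returns (rank, frame) and removes the frame from the free list.
    """
    if free_frames[preferred_rank]:
        return preferred_rank, free_frames[preferred_rank].pop()
    # Preferred rank is full: fall back to the rank with the most free frames.
    fallback = max(free_frames, key=lambda r: len(free_frames[r]))
    if not free_frames[fallback]:
        raise MemoryError("no free frames in any rank")
    return fallback, free_frames[fallback].pop()


free_frames = {0: [10], 1: [20, 21, 22], 2: [30]}
print(alloc_relaxed(free_frames, preferred_rank=0))  # served from rank 0
print(alloc_relaxed(free_frames, preferred_rank=0))  # rank 0 is empty, spills
```

Spilled pages reintroduce some exposure to other ranks' refreshes, so the fraction of spilled pages is the quantity a relaxed policy trades against memory capacity pressure.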
15
Modified Clock Algorithm
Figure: Baseline uses one list of pages in memory with a single hand; Modified uses lists of pages in Ranks 1 to 4 with multiple hands
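One way to read the modified scheme is as a separate second-chance (clock) list per rank, so that a victim page can be chosen from a specific rank. The sketch below follows that reading; the class names and the per-rank interface are hypothetical, and the slide does not give the details of how the hands advance.

```python
# Per-rank clock (second-chance) replacement. A sketch of one possible
# reading of the slide, not the authors' implementation.

class Clock:
    """Second-chance replacement over a fixed, non-empty set of frames."""

    def __init__(self, frames):
        self.frames = list(frames)
        self.referenced = {f: False for f in self.frames}
        self.hand = 0

    def touch(self, frame):
        self.referenced[frame] = True

    def evict(self):
        # Sweep the hand, clearing reference bits, until an
        # unreferenced frame is found.
        while True:
            frame = self.frames[self.hand]
            self.hand = (self.hand + 1) % len(self.frames)
            if self.referenced[frame]:
                self.referenced[frame] = False
            else:
                return frame


class PerRankClock:
    """One clock list and one hand per rank."""

    def __init__(self, frames_by_rank):
        self.clocks = {rank: Clock(frames) for rank, frames in frames_by_rank.items()}

    def touch(self, rank, frame):
        self.clocks[rank].touch(frame)

    def evict(self, rank):
        """Pick a victim frame from the given rank only."""
        return self.clocks[rank].evict()


pages = PerRankClock({0: [0, 1], 1: [2, 3]})
pages.touch(0, 0)  # frame 0 was recently used
print("victim in rank 0:", pages.evict(0))
print("victim in rank 1:", pages.evict(1))
```

Keeping a hand per rank lets the page allocator free a frame in the rank a thread is assigned to, rather than evicting whichever page a single global hand reaches first.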
16
Methodology
Simics + USIMM
8 RISC cores, UltraSPARC III ISA
3.2 GHz, 4-wide OoO, 64-entry ROB
32 KB I&D L1 caches, 4 cycles
4/8 MB shared L2 cache, 10 cycles
DRAM Specifications
2 channels, 2 ranks per channel, 16 banks per rank
800 MHz DDR4 DRAM
SPEC 2006, NPB, CloudSuite, and PARSEC
17
18% better than Staggered Refresh
Thread-to-rank assignment: 18% better than staggered refresh
18
Relaxing Rank Assignment
19
Comparisons to Prior Work
20
Conclusions
Exposes an important artifact in memory stalls
Service interruptions require a re-evaluation of data placement
RA (rank assignment) is a simple solution for an emerging problem
RA can also be leveraged to reduce the impact of NVM write drain
RA is a software solution that only requires best-effort page mapping
Outperforms hardware-only schemes
21
Thank You