Published by Willis Howard. Modified over 9 years ago.
1
Embedded System Lab. Kilmo Choi (최길모) rlfah926@naver.com
A Software Memory Partition Approach for Eliminating Bank-level Interference in Multicore Systems
Lei Liu, Zehan Cui, Mingjie Xing, Yungang Bao, Mingyu Chen, Chengyong Wu
2
Contents
  Background and Motivation
  Bank-Level Partition Mechanism (BPM)
  Results
  Conclusion
  Reference
3
Background and Motivation
  Memory bank
  The same set of memory access speed
  Multicore platform
4
Background and Motivation
Bank-Level Parallelism (BLP) and bank sharing
  Multiple banks can serve memory requests concurrently and independently
  The memory system usually employs a bank-interleaved address mapping scheme
Memory interference on multicore platforms
  Causes performance degradation (throughput slowdown and unfairness)
  e.g., the row buffer hit rate decreases from over 60% with 1 core to 35% with 16 cores
[Figure: several cores sharing one bank through the memory controller (MC), producing row buffer conflicts]
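The effect of bank sharing on the row buffer can be illustrated with a small simulation. This is only a sketch under assumed parameters: the row size, bank count, mapping function, and the helper names (`bank_and_row`, `row_buffer_hit_rate`, `stream`, `interleave`) are hypothetical and are not taken from the slides or from any real memory controller.

```python
# Hypothetical DRAM geometry, chosen only for illustration.
ROW_BYTES = 8192   # bytes covered by one row buffer
NUM_BANKS = 8


def bank_and_row(addr):
    """Bank-interleaved mapping: consecutive row-sized chunks rotate across banks."""
    chunk = addr // ROW_BYTES
    return chunk % NUM_BANKS, chunk // NUM_BANKS


def row_buffer_hit_rate(addresses):
    """Replay accesses under an open-page policy and return the fraction of hits."""
    open_row = {}  # bank -> row currently held in that bank's row buffer
    hits = 0
    for addr in addresses:
        bank, row = bank_and_row(addr)
        if open_row.get(bank) == row:
            hits += 1
        open_row[bank] = row  # a miss (or conflict) opens the new row
    return hits / len(addresses)


def stream(base, count, stride=64):
    """Sequential cache-line-sized accesses starting at base."""
    return [base + i * stride for i in range(count)]


def interleave(*streams):
    """Round-robin merge, standing in for requests arriving from several cores."""
    return [addr for group in zip(*streams) for addr in group]


# One thread alone: it walks through a single row of bank 0.
solo = stream(0, 128)
# A second thread whose data maps to the same bank (bank 0) but a different row.
other = stream(ROW_BYTES * NUM_BANKS, 128)

solo_rate = row_buffer_hit_rate(solo)
shared_rate = row_buffer_hit_rate(interleave(solo, other))
print(solo_rate, shared_rate)
```

In this toy setup the solo stream misses only on its first access, while the interleaved streams evict each other's row on every access, so the shared hit rate drops to zero. Real workloads degrade less sharply, but the mechanism is the same.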
5
Background and Motivation
  Numerous new memory scheduling algorithms have been proposed to address the interference problem
  However, these algorithms usually employ complex scheduling logic and require hardware modifications to memory controllers
  Bank-level conflicts can be fully eliminated by exclusively mapping a thread's data to specific banks
  How much does the number of available banks influence a thread's performance?
6
Bank-Level Partition Mechanism (BPM)
Overview of BPM
  The OS memory management system uses a page-coloring mechanism to partition banks into several groups and maps each thread (process) to a specific bank group
  Address mapping policy
Advantages
  Fewer row buffer conflicts (↓) and more row buffer hits (↑)
  BPM is an entirely software approach
  Flexible
  It is easier for the OS than for hardware to monitor a thread's behavior
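The page-coloring idea can be sketched as a toy allocator that keeps one free list per bank color and hands each thread frames only from its own bank group. The bank bit positions, the `ColoredAllocator` class, and the thread names are assumptions made for illustration; the slides do not give the actual bit positions or the kernel changes.

```python
from collections import defaultdict

PAGE_SHIFT = 12            # 4 KiB pages
BANK_BITS = (13, 14, 15)   # hypothetical physical-address bits that select the bank


def bank_color(frame_number):
    """Combine the bank-selecting bits of a page frame's address into a color."""
    addr = frame_number << PAGE_SHIFT
    color = 0
    for i, bit in enumerate(BANK_BITS):
        color |= ((addr >> bit) & 1) << i
    return color


class ColoredAllocator:
    """Toy page allocator that keeps one free list per bank color."""

    def __init__(self, num_frames):
        self.free = defaultdict(list)
        for frame in range(num_frames):
            self.free[bank_color(frame)].append(frame)
        self.groups = {}  # thread id -> set of colors (banks) it may use

    def assign(self, thread_id, colors):
        """Bind a thread to a bank group."""
        self.groups[thread_id] = set(colors)

    def alloc_page(self, thread_id):
        """Return a free frame from the thread's bank group, or None if exhausted."""
        # Pick the fullest free list so pages spread over all banks in the group.
        color = max(sorted(self.groups[thread_id]),
                    key=lambda c: len(self.free[c]))
        if not self.free[color]:
            return None
        return self.free[color].pop(0)


alloc = ColoredAllocator(num_frames=64)
alloc.assign("A", colors=[0, 1, 2, 3])
alloc.assign("B", colors=[4, 5, 6, 7])

frames_a = [alloc.alloc_page("A") for _ in range(8)]
frames_b = [alloc.alloc_page("B") for _ in range(8)]
banks_a = {bank_color(f) for f in frames_a}
banks_b = {bank_color(f) for f in frames_b}
print(sorted(banks_a), sorted(banks_b))
```

Because the two threads draw from disjoint color sets, their pages never land in the same bank, which is the property that removes bank-level interference. Spreading a thread's pages over all banks in its group keeps some bank-level parallelism within the group.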
7
Bank-Level Partition Mechanism (BPM)
  Discover bank bits by a software method
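The slide does not describe the discovery procedure. One plausible software approach is latency probing: alternately access two addresses and treat a slow pair as a row buffer conflict (same bank, different row). The sketch below runs against a simulated latency model, not real hardware; the hidden bit layout, the latency values, and the function names (`probe_latency`, `discover_bits`) are all hypothetical. It also assumes each bank bit is a plain address bit, which does not hold for XOR-based mappings.

```python
# Hidden layout of the simulated memory controller (hypothetical values).
_TRUE_BANK_BITS = (13, 14, 15)
_TRUE_ROW_BITS = tuple(range(16, 24))


def _field(addr, bits):
    """Extract the given address bits as a tuple."""
    return tuple((addr >> b) & 1 for b in bits)


def probe_latency(a, b):
    """Simulated cost, in arbitrary units, of alternately accessing a and b."""
    same_bank = _field(a, _TRUE_BANK_BITS) == _field(b, _TRUE_BANK_BITS)
    same_row = _field(a, _TRUE_ROW_BITS) == _field(b, _TRUE_ROW_BITS)
    if same_bank and not same_row:
        return 100  # row buffer conflict: precharge and activate on every access
    return 40       # row buffer hit, or the two accesses go to different banks


def discover_bits(address_bits=24, low_bit=6, threshold=70):
    """Infer row bits and bank bits purely from probe latencies."""
    base = 0
    candidates = range(low_bit, address_bits)
    # Step 1: flipping a row bit alone keeps the bank but changes the row (slow).
    row_bits = [b for b in candidates
                if probe_latency(base, base ^ (1 << b)) > threshold]
    # Step 2: flip a known row bit together with each remaining bit. If the
    # pair becomes fast, the extra bit moved the access to another bank.
    ref = 1 << row_bits[0]
    bank_bits = [b for b in candidates
                 if b not in row_bits
                 and probe_latency(base, base ^ ref ^ (1 << b)) < threshold]
    return row_bits, bank_bits


row_bits, bank_bits = discover_bits()
print(row_bits, bank_bits)
```

On a real machine the probe would have to bypass the caches and operate on known physical addresses, and the measurements would be noisy, so many repetitions and a carefully chosen threshold would be needed.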
8
Results
Environments
  4 cores, 2.8 GHz Intel Core i7-860 processor, 8 GB DDR3 main memory
  CentOS Linux 5.4 with kernel 2.6.32.15
  SPEC CPU2006
9
Results
  Overall system performance
10
Results
  Page-Policy and Power
11
Results
  BPM vs. Cache-Partition-Only
  The correlation between BPM improvements and per-core bandwidth
12
Reference
  J. Lin, Q. Lu, X. Ding, Z. Zhang, X. Zhang, and P. Sadayappan. Gaining Insights into Multicore Cache Partitioning: Bridging the Gap between Simulation and Real Systems. In HPCA-14, 2008.
  D. Kaseridis, J. Stuecheli, and L. K. John. Minimalist Open-page: A DRAM Page-mode Scheduling Policy for the Many-core Era. In MICRO-44, 2011.