1 DIEF: An Accurate Interference Feedback Mechanism for Chip Multiprocessor Memory Systems Magnus Jahre †, Marius Grannaes † ‡ and Lasse Natvig † † Norwegian.


1 DIEF: An Accurate Interference Feedback Mechanism for Chip Multiprocessor Memory Systems
Magnus Jahre †, Marius Grannaes † ‡ and Lasse Natvig †
† Norwegian University of Science and Technology
‡ Energy Micro

2 Chip Multiprocessor Resources
Hardware-controlled, shared resources
–Interconnect bandwidth
–Shared cache capacity
–Memory bus bandwidth
–Memory capacity is allocated by the operating system
Interference can occur in all shared units
Current CMP implementations do not take interference into account

3 Why Control Resource Allocation?
Provide predictable performance
Support OS scheduler assumptions
Cloud: Fulfill Service Level Agreement

4 Resource Allocation Tasks
[Figure: resource allocation tasks, with one part marked "Focus of this work"]

5 Resource Allocation Baselines
Baseline = Interference-free configuration
Quantify performance impact from interference
Private Mode and Shared Mode

6 Multi-Programmed Baseline
All processes in a workload run concurrently
Static and equal partitioning of all shared resources

7 Single Program Baseline
The process runs alone on one core
All other cores are idle
Exclusive access to all shared resources

8 Baseline Weaknesses
Multi-Programmed Baseline
–Only accounts for interference in partitioned resources
–Static and equal division of DRAM bandwidth does not give equal latency
–Complex relationship between resource allocation and performance
Single Program Baseline
–Does not exist in shared mode
Dynamic Interference Estimation Framework (DIEF)

9 Outline
Introduction
Dynamic Interference Estimation Framework
–Shared Cache
–Memory Bus
–On-chip interconnect
Results
Summary

10 Interference Estimation
Full-System Interference Estimation: aggregate interference from different units
Common unit of measure: Average Latency (Clock Cycles)
DIEF: general, component-based framework

11 Interference Definition
[Figure: relates Interference, Shared Mode Latency, Private Mode Latency Estimate, Private Mode Latency Measurement and Estimate Error]
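The labels on this slide suggest two differences, sketched below. This is a minimal sketch, not the authors' definition: it assumes that interference is the shared mode latency minus the private mode latency estimate, and that the estimate error is the estimate minus the measured private mode latency. The function names, the per-unit dictionary and all sample latencies are hypothetical.

```python
def interference(shared_latency, private_estimate):
    """Assumed definition: extra average latency (clock cycles) seen in shared mode."""
    return shared_latency - private_estimate


def estimate_error(private_estimate, private_measurement):
    """Assumed definition: how far the estimate is from the measured private mode latency."""
    return private_estimate - private_measurement


def total_interference(per_unit_cycles):
    """Aggregate per-unit interference, all expressed in the same unit (clock cycles)."""
    return sum(per_unit_cycles.values())


# Hypothetical sample values, in clock cycles.
shared = 250
estimate = 180
measured = 175

cycles_of_interference = interference(shared, estimate)
error = estimate_error(estimate, measured)
aggregate = total_interference({"cache": 40, "bus": 25, "interconnect": 5})
```

Expressing every unit's contribution in clock cycles of average latency is what lets the per-unit numbers be added directly.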

12 Shared Cache Interference
[Figure: CPU 0 and CPU 1 issue cache accesses to the shared cache; each CPU has an Auxiliary Tag Directory]

13 Shared Cache Interference
[Figure: the same setup after further accesses]
Eviction may not be interference

14 Shared Cache Interference
[Figure: the same setup with one lookup labelled Hit and the other labelled Miss]
Interference cost = miss penalty
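The idea on these three slides can be sketched for a single cache set. This is an illustrative model under stated assumptions, not the hardware described in the talk: both the shared set and each per-CPU auxiliary tag directory use LRU replacement with the same associativity, processes do not share data, and the miss penalty of 100 cycles is made up. The class names `LRUSet` and `InterferenceTracker` are hypothetical.

```python
from collections import OrderedDict


class LRUSet:
    """One cache set (or tag directory set) with LRU replacement."""

    def __init__(self, ways):
        self.ways = ways
        self.tags = OrderedDict()

    def access(self, tag):
        """Return True on a hit; insert the tag (evicting the LRU entry) on a miss."""
        hit = tag in self.tags
        if hit:
            self.tags.move_to_end(tag)
        else:
            if len(self.tags) >= self.ways:
                self.tags.popitem(last=False)
            self.tags[tag] = True
        return hit


class InterferenceTracker:
    """Shared set plus one auxiliary tag directory per CPU."""

    def __init__(self, ways, cpus, miss_penalty):
        self.shared = LRUSet(ways)
        self.atd = [LRUSet(ways) for _ in range(cpus)]
        self.miss_penalty = miss_penalty
        self.interference_cycles = [0] * cpus

    def access(self, cpu, tag):
        # Blocks are keyed by (cpu, tag) because the processes share no data.
        shared_hit = self.shared.access((cpu, tag))
        private_hit = self.atd[cpu].access(tag)
        # Only a shared-mode miss that would have hit in private mode counts.
        if private_hit and not shared_hit:
            self.interference_cycles[cpu] += self.miss_penalty
        return shared_hit, private_hit


tracker = InterferenceTracker(ways=2, cpus=2, miss_penalty=100)
tracker.access(0, "A")
tracker.access(0, "B")
tracker.access(1, "M")  # evicts CPU 0's block A; no interference is charged yet
tracker.access(0, "A")  # misses in the shared set, hits in CPU 0's directory
```

The eviction caused by CPU 1 is not charged on its own; the cost is charged only when CPU 0 returns to block A and misses, which is why an eviction may not be interference.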

15 Bus Interference Requirements
Out-of-order memory bus scheduling
Shared-mode-only cache misses and cache hits
Shared cache writebacks
Computing private latency based on shared mode queue contents is difficult
Emulate private scheduling in the shared mode

16 [Figure: Shared Bus Queue with arrival order, execution order and head pointer; Memory Latency Estimation Buffer; Latency Lookup Table; per-bank Open Page Emulation Registers]
Bank/Page Mapping: A → (0,15), B → (0,19), C → (0,15), D → (1,32)
Estimated Queue Latency = 120 + 40 + 200 (requests B, C, D)
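One way to read this slide is as an emulation of private mode scheduling using per-bank open-page registers and a latency lookup table. The sketch below rests on several assumptions that the slide does not confirm: that 40, 120 and 200 cycles correspond to a page hit, a closed bank and a page conflict; that the scheduler serves the oldest page hit first and otherwise the oldest request; that bank 0 starts with page 15 open (left by request A) and bank 1 starts closed; and that latencies simply add up, ignoring bank parallelism. The function names are hypothetical.

```python
# Assumed mapping of the slide's three numbers to access types.
LATENCY = {"hit": 40, "closed": 120, "conflict": 200}


def classify(bank, page, open_pages):
    """Classify an access against the emulated open-page registers."""
    if bank not in open_pages:
        return "closed"
    return "hit" if open_pages[bank] == page else "conflict"


def estimate_queue_latency(queue, open_pages):
    """queue holds (name, bank, page) tuples in arrival order."""
    pending = list(queue)
    open_pages = dict(open_pages)  # do not modify the caller's registers
    order = []
    while pending:
        # Oldest request that hits an open page, else the oldest request.
        pick = next(
            (r for r in pending if classify(r[1], r[2], open_pages) == "hit"),
            pending[0],
        )
        pending.remove(pick)
        name, bank, page = pick
        order.append((name, LATENCY[classify(bank, page, open_pages)]))
        open_pages[bank] = page  # the served page is now open in that bank
    return order, sum(latency for _, latency in order)


# Bank/page mapping taken from the slide; the initial register state is assumed.
queue = [("B", 0, 19), ("C", 0, 15), ("D", 1, 32)]
order, total = estimate_queue_latency(queue, {0: 15})
```

Under these assumptions the emulated execution order is C, B, D, and the three terms are the same 40, 200 and 120 that appear in the slide's sum, although the slide itself does not confirm this reading.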

17 Interconnect Interference
[Figure: requests from CPU 0 and CPU 1 queued towards L2 Bank 0 and L2 Bank 1, with per-CPU Interference Counters]
CPU 1 delays CPU 0

18 Outline
Introduction
Dynamic Interference Estimation Framework
–Shared Cache
–Memory Bus
–On-chip interconnect
Results
Summary

19 Relative Estimation Errors

20 RMS Error Breakdown
Remaining units contribute less than 2 clock cycles
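The slides do not state how the relative and RMS errors were computed, so the following uses the standard textbook definitions as an assumption. The function names and the sample latencies are hypothetical and are not results from the talk.

```python
import math


def relative_error(estimate, measurement):
    """Signed error of an estimate as a fraction of the measured value."""
    return (estimate - measurement) / measurement


def rms_error(estimates, measurements):
    """Root-mean-square of the absolute errors, in the unit of the inputs."""
    if not estimates or len(estimates) != len(measurements):
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(e - m) ** 2 for e, m in zip(estimates, measurements)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical private mode latencies in clock cycles.
estimated = [101, 107]
measured = [100, 100]
rms = rms_error(estimated, measured)
rel = relative_error(110, 100)
```

Because the RMS error stays in clock cycles, it can be broken down per unit and compared against a threshold such as the 2 clock cycles mentioned on the slide.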

21 Auxiliary Tag Directory Accuracy

22 Outline
Introduction
Dynamic Interference Estimation Framework
–Shared Cache
–Memory Bus
–On-chip interconnect
Results
Summary

23 Summary
Memory system interference causes unpredictable performance
DIEF provides
–Accurate private mode latency estimates
–Accurate shared mode latency measurements
Future opportunities
–Guiding dynamic optimizations
–Guiding OS scheduling decisions
–Debugging and optimization

24 Thank you!
Visit our website: http://research.idi.ntnu.no/multicore/
Questions?

25 Experiment Methodology
M5 simulator
–Extended with crossbar and ring on-chip interconnect models
–DDR2 memory bus model
Randomly generated workloads of SPEC2000 benchmarks
–40 4-core workloads
–20 8-core workloads
–10 16-core workloads

