1 Mining Performance Data from Sampled Event Traces
Bret Olszewski, IBM Corporation, Austin, TX
Ricardo Portillo, Diana Villa, Patricia J. Teller, The University of Texas at El Paso, Department of Computer Science

2 Outline
• Motivation
• Data Collection Environment
  • Workload & Platform
  • Monitored Events
• Data Analysis & Results
• Conclusions and Future Work

3 Motivation
• Capturing Event Traces
  • System Simulation: Overhead penalty is too high
  • Real-time Metrics: Capture every event during actual execution
• Problem: Growing size of full event traces is becoming unmanageable
• Goal: Use sampled event traces to analyze execution behavior

4 Data Collection Environment
• Workload: TPC-C benchmark
  • Commercial
  • OLTP
• Platform: IBM eServer pSeries 690 architecture (p690)
  • 8- and 32-processor configurations

5 Platform: 8-processor p690 configuration
[Figure: two MCMs (MCM 0 and MCM 1) with processor positions marked P and X, L2 caches, and L3 caches]

6 Platform: 32-processor p690 configuration
[Figure: four MCMs (MCM 0 through MCM 3), each with processors (P) and L2 and L3 caches]

7 Monitored Events
• L2-cache data-load misses
  • L2.5
  • L2.75
  • L3
  • L3.5
  • MEM

8 Where is L2 Miss Resolved? L2
[Figure, built up over slides 8 to 12: 8-processor p690 diagram with MCM 0 and MCM 1, processor positions marked P and X, L2 caches, and L3 caches]

9 Where is L2 Miss Resolved? L2.5 Event

10 Where is L2 Miss Resolved? L2.5 Event, L2.75 Event

11 Where is L2 Miss Resolved? L2.5 Event, L2.75 Event, L3 Event

12 Where is L2 Miss Resolved? L2.5 Event, L2.75 Event, L3 Event, L3.5 Event

13 Data Collection
• Performance Monitoring Unit (PMU)
  • Special-purpose registers
  • Programming interface
  • Kernel extension
• eprof
  • PMU configuration
  • Event-based sampling

14 Sampled Event Trace
• 10-minute observation interval
  • Record periodic occurrences of an event
  • 100 events/sec/CPU
• Event record

  PID     TID     Timestamp    Effective Instruction Address  Effective Data Address
  372872  184469  0.328104637  000000000000A8C4               0000000000218880

• Average number of samples collected/event
  • 238,448 for 8-processor data
  • 212,396 for 32-processor data
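As an illustration of the record layout above, here is a minimal parsing sketch. It assumes the five fields are whitespace-separated, that PID and TID are decimal, that both addresses are hexadecimal, and that the timestamp is in seconds; none of this is stated on the slide. The names `Sample` and `parse_record` are hypothetical and are not part of eprof.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    pid: int
    tid: int
    timestamp: float  # assumed to be seconds
    inst_addr: int    # effective instruction address
    data_addr: int    # effective data address


def parse_record(line: str) -> Sample:
    """Parse one record, assuming the field order shown on the slide."""
    pid, tid, ts, eia, eda = line.split()
    # Assumption: PID/TID are decimal, the two addresses are hexadecimal.
    return Sample(int(pid), int(tid), float(ts), int(eia, 16), int(eda, 16))


sample = parse_record(
    "372872 184469 0.328104637 000000000000A8C4 0000000000218880"
)
```

Keeping the addresses as integers makes the later region and cache-line arithmetic straightforward.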

15 Analysis
• Memory Hotspots
• Individual Address Region
• Process Migration

16 Memory Hotspots
• L3 and Memory are most active memory levels
• Counted total number of L3 hits
• Counted number of L3 hits per address region
• Counted number of unique cache lines referenced per region
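The per-region counts listed on this slide could be computed along the following lines. This is a sketch under stated assumptions, not the authors' tooling: the region boundaries in `REGIONS` are invented for illustration, the 128-byte cache line size is an assumption that should be adjusted to the platform, and `hotspot_summary` is a hypothetical helper name.

```python
from collections import Counter, defaultdict

CACHE_LINE_BYTES = 128  # assumption; set to the platform's actual line size

# Hypothetical address regions: name -> (start, end), end exclusive.
REGIONS = {
    "BUFFER_POOL": (0x200000, 0x400000),
    "OTHER": (0x400000, 0x800000),
}


def region_of(addr, regions=REGIONS):
    """Return the name of the region containing addr, or 'UNMAPPED'."""
    for name, (start, end) in regions.items():
        if start <= addr < end:
            return name
    return "UNMAPPED"


def hotspot_summary(data_addrs, regions=REGIONS, line_bytes=CACHE_LINE_BYTES):
    """Map each region to (number of sampled hits, unique cache lines)."""
    hits = Counter()
    lines = defaultdict(set)
    for addr in data_addrs:
        region = region_of(addr, regions)
        hits[region] += 1
        lines[region].add(addr // line_bytes)  # cache line index
    return {r: (hits[r], len(lines[r])) for r in hits}


# Effective data addresses from a sampled L3-hit trace (made-up values).
l3_addrs = [0x218880, 0x2188A0, 0x218900, 0x500000]
summary = hotspot_summary(l3_addrs)
```

A region with many hits but few unique cache lines points to a small set of heavily referenced lines.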

17 Memory Hotspots

18 Individual Address Region
We can look at an address region in more detail
• Looked at Buffer Pool region
• Counted number of references per memory level
• Counted number of unique cache lines referenced per memory level
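A sketch of the per-memory-level breakdown for a single region follows. It assumes one sampled trace per monitored event, represented here as a dictionary from memory level name to a list of effective data addresses. The region bounds, the 128-byte line size, the sample addresses, and the function name `level_distribution` are all assumptions made for illustration.

```python
def level_distribution(traces, start, end, line_bytes=128):
    """For one region [start, end), map each memory level to
    (number of sampled references, number of unique cache lines)."""
    result = {}
    for level, addrs in traces.items():
        in_region = [a for a in addrs if start <= a < end]
        unique_lines = {a // line_bytes for a in in_region}
        result[level] = (len(in_region), len(unique_lines))
    return result


# Made-up traces: memory level -> sampled effective data addresses.
traces = {
    "L3": [0x218880, 0x218880, 0x218890, 0x900000],
    "MEM": [0x218880, 0x300000],
}
dist = level_distribution(traces, start=0x200000, end=0x400000)
```

Comparing the two numbers for each level shows whether its references are spread across many lines or concentrated on a few.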

19 Individual Address Region
[Figure: "Distribution of Data Load Hits: BUFFER_POOL". Series: DataLoadHits and UniqueCacheLines. Horizontal axis: Event Name (L2, L2.5 MOD, L2.75 MOD, L3, L3.5, MEM). Vertical axis: 0 to 120,000.]

20 Process Migration
Process migration from one chip to another can degrade performance when all or part of the process' working set must follow, via L2-cache misses
• Looked at 885 threads
• Counted number of migrations per thread
• Counted number of L2.5 hits per thread
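One way to estimate migrations per thread from sampled data is sketched below. It assumes each sample can be tagged with the CPU on which it was collected, which the event record shown earlier does not include, and it takes the CPU-to-chip mapping as a caller-supplied function. The two-CPUs-per-chip mapping in the example and the name `migrations_per_thread` are hypothetical.

```python
from collections import defaultdict


def migrations_per_thread(samples, chip_of):
    """samples: iterable of (tid, timestamp, cpu) tuples.
    chip_of: function mapping a CPU id to a chip id (assumed known).
    Returns tid -> number of chip changes between consecutive samples."""
    by_thread = defaultdict(list)
    for tid, ts, cpu in samples:
        by_thread[tid].append((ts, chip_of(cpu)))
    counts = {}
    for tid, seen in by_thread.items():
        seen.sort()  # order each thread's samples by timestamp
        chips = [chip for _, chip in seen]
        counts[tid] = sum(1 for a, b in zip(chips, chips[1:]) if a != b)
    return counts


# Made-up samples; the lambda assumes two CPUs per chip.
samples = [
    (184469, 0.10, 0), (184469, 0.20, 1), (184469, 0.30, 2),
    (184470, 0.15, 4), (184470, 0.25, 4),
]
counts = migrations_per_thread(samples, chip_of=lambda cpu: cpu // 2)
```

Because only sampled events are visible, a thread that moves away and back between two samples is not counted, so the result is a lower bound on the true number of migrations.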

21 Process Migration

22 Conclusions
• Only a few addresses in Buffer Pool region are causing most of its L3 hits
• For Buffer Pool, heavily referenced shared data is constantly resolved outside an MCM
• Process migration is not a source of performance degradation

23 Future Work
• Quantify representativeness of sampled event traces
• Suggest more ways to improve p690 application performance
• Study sampled event traces for other workloads
• In-depth study of process characterization

24 Thank You!

