Published by Annice Fox. Modified over 9 years ago.
1
Mining Performance Data from Sampled Event Traces
Bret Olszewski, IBM Corporation – Austin, TX
Ricardo Portillo, Diana Villa, Patricia J. Teller, The University of Texas at El Paso, Department of Computer Science
2
Outline
- Motivation
- Data Collection Environment
  - Workload & Platform
  - Monitored Events
- Data Analysis & Results
- Conclusions and Future Work
3
Motivation
- Capturing event traces
  - System simulation: overhead penalty is too high
  - Real-time metrics: capture every event during actual execution
- Problem: the growing size of full event traces is becoming unmanageable
- Goal: use sampled event traces to analyze execution behavior
4
Data Collection Environment
- Workload
  - TPC-C benchmark
  - Commercial OLTP
- Platform
  - IBM eServer pSeries 690 architecture (p690)
  - 8- and 32-processor configurations
5
Platform: 8-processor p690 configuration
[Figure: block diagram of MCM 0 and MCM 1, with processors (P), L2 caches, and L3 caches]
6
Platform: 32-processor p690 configuration
[Figure: block diagram of MCM 0 through MCM 3, with processors (P), L2 caches, and L3 caches]
7
Monitored Events
- L2-cache data-load misses, classified by where they are resolved:
  - L2.5
  - L2.75
  - L3
  - L3.5
  - MEM
8
Where is L2 Miss Resolved?
[Figure: 8-processor p690 diagram (MCM 0, MCM 1); labeled: L2]
9
Where is L2 Miss Resolved?
[Figure: 8-processor p690 diagram (MCM 0, MCM 1); labeled: L2.5 Event]
10
Where is L2 Miss Resolved?
[Figure: 8-processor p690 diagram (MCM 0, MCM 1); labeled: L2.5 Event, L2.75 Event]
11
Where is L2 Miss Resolved?
[Figure: 8-processor p690 diagram (MCM 0, MCM 1); labeled: L2.5 Event, L2.75 Event, L3 Event]
12
Where is L2 Miss Resolved?
[Figure: 8-processor p690 diagram (MCM 0, MCM 1); labeled: L2.5 Event, L2.75 Event, L3 Event, L3.5 Event]
13
Data Collection
- Performance Monitoring Unit (PMU)
- Special-purpose registers
- Programming interface
- Kernel extension
- eprof
- PMU configuration
- Event-based sampling
14
Sampled Event Trace
- 10-minute observation interval
- Record periodic occurrences of an event
- 100 events/sec/CPU
- Event record:

  PID     TID     Timestamp    Effective Instruction Address  Effective Data Address
  372872  184469  0.328104637  000000000000A8C4               0000000000218880

- Average number of samples collected per event
  - 238,448 for 8-processor data
  - 212,396 for 32-processor data
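As an illustration of the record layout above, here is a minimal sketch of parsing one event record. It assumes the fields are whitespace-separated, that PID and TID are decimal, and that both addresses are hexadecimal. The `EventRecord` and `parse_record` names are hypothetical and do not come from the eprof tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventRecord:
    pid: int
    tid: int
    timestamp: float   # assumed to be seconds; the time base is not stated
    instr_addr: int    # effective instruction address
    data_addr: int     # effective data address


def parse_record(line: str) -> EventRecord:
    """Parse one record: PID, TID, timestamp, then two hexadecimal addresses."""
    pid, tid, ts, iaddr, daddr = line.split()
    return EventRecord(int(pid), int(tid), float(ts), int(iaddr, 16), int(daddr, 16))


record = parse_record("372872 184469 0.328104637 000000000000A8C4 0000000000218880")
print(record.pid, record.tid, hex(record.instr_addr), hex(record.data_addr))
```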
15
Analysis
- Memory Hotspots
- Individual Address Region
- Process Migration
16
Memory Hotspots
- L3 and Memory are the most active memory levels
- Counted total number of L3 hits
- Counted number of L3 hits per address region
- Counted number of unique cache lines referenced per region
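The per-region counts listed above could be computed along the following lines. This is only a sketch under stated assumptions: the region boundaries in `REGIONS` are made up for illustration, the 128-byte cache line size is an assumption, and the input is taken to be the effective data addresses from one sampled trace (for example, the L3 event).

```python
from collections import defaultdict

CACHE_LINE_BYTES = 128  # assumption: line size used to group addresses

# Hypothetical address regions: name -> (start, end), end exclusive.
REGIONS = {
    "BUFFER_POOL": (0x200000, 0x300000),
    "LOG": (0x300000, 0x340000),
}


def region_of(addr):
    """Return the name of the region containing addr, or "OTHER"."""
    for name, (start, end) in REGIONS.items():
        if start <= addr < end:
            return name
    return "OTHER"


def hotspot_counts(data_addrs):
    """Map region -> (number of hits, number of unique cache lines)."""
    hits = defaultdict(int)
    lines = defaultdict(set)
    for addr in data_addrs:
        region = region_of(addr)
        hits[region] += 1
        lines[region].add(addr // CACHE_LINE_BYTES)
    return {region: (hits[region], len(lines[region])) for region in hits}


sample_addrs = [0x218880, 0x218888, 0x218900, 0x310000, 0x400000]
counts = hotspot_counts(sample_addrs)
print(counts)
```

A region with many hits but few unique cache lines is a candidate hotspot, since the same lines are being sampled repeatedly.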
17
Memory Hotspots
18
Individual Address Region
- We can look at an address region in more detail
- Looked at Buffer Pool region
- Counted number of references per memory level
- Counted number of unique cache lines referenced per memory level
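One way to sketch the per-level counts for a single region is shown below. It assumes there is one sampled trace per monitored event, represented here as a dictionary from event name to a list of effective data addresses. The region bounds, the 128-byte line size, and the `level_distribution` name are illustrative assumptions.

```python
CACHE_LINE_BYTES = 128  # assumption: line size used to group addresses


def level_distribution(traces, start, end):
    """For each memory level, count references and unique cache lines
    whose data address falls in [start, end)."""
    result = {}
    for level, addrs in traces.items():
        in_region = [a for a in addrs if start <= a < end]
        unique_lines = {a // CACHE_LINE_BYTES for a in in_region}
        result[level] = (len(in_region), len(unique_lines))
    return result


# Tiny made-up traces, keyed by the event that resolved the L2 miss.
traces = {
    "L3": [0x218880, 0x218880, 0x218900, 0x500000],
    "MEM": [0x220000, 0x220080],
}
dist = level_distribution(traces, 0x200000, 0x300000)
print(dist)
```

Comparing the two numbers per level shows whether references at that level are spread over many lines or concentrated on a few.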
19
Individual Address Region
[Figure: bar chart "Distribution of Data Load Hits: BUFFER_POOL"; x-axis: Event Name (L2, L2.5 MOD, L2.75 MOD, L3, L3.5, MEM); y-axis: 0 to 120,000; series: DataLoadHits, UniqueCacheLines]
20
Process Migration
- Process migration from one chip to another can degrade performance when all or part of the process's working set must follow, via L2-cache misses
- Looked at 885 threads
- Counted number of migrations per thread
- Counted number of L2.5 hits per thread
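A rough sketch of counting migrations per thread from sampled records follows. The event record shown earlier does not include a CPU field, so this assumes the CPU that produced each sample is known (for instance, from per-CPU trace files). The `cpu_to_chip` mapping and the sample tuples are hypothetical.

```python
def migrations_per_thread(samples, cpu_to_chip):
    """Count chip-to-chip moves per thread.

    samples: iterable of (timestamp, tid, cpu) tuples.
    cpu_to_chip: assumed mapping from CPU id to chip id.
    """
    last_chip = {}
    migrations = {}
    for _ts, tid, cpu in sorted(samples):  # process in timestamp order
        chip = cpu_to_chip[cpu]
        previous = last_chip.get(tid)
        moved = previous is not None and previous != chip
        migrations[tid] = migrations.get(tid, 0) + (1 if moved else 0)
        last_chip[tid] = chip
    return migrations


cpu_to_chip = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2}  # hypothetical topology
samples = [(0.1, 7, 0), (0.2, 7, 1), (0.3, 7, 4), (0.4, 7, 0), (0.15, 9, 2)]
moves = migrations_per_thread(samples, cpu_to_chip)
print(moves)
```

Because the trace is sampled, this only sees a thread when one of its events is recorded, so it can undercount migrations that happen between samples.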
21
Process Migration
22
Conclusions
- Only a few addresses in the Buffer Pool region are causing most of its L3 hits
- For Buffer Pool, heavily referenced shared data is constantly resolved outside an MCM
- Process migration is not a source of performance degradation
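The first conclusion, that a few addresses account for most hits, can be checked with a simple concentration measure. The sketch below computes the fraction of sampled hits that fall on the k most frequently hit cache lines. The 128-byte line size, the `top_line_share` name, and the input addresses are illustrative assumptions, not values from the study.

```python
from collections import Counter

CACHE_LINE_BYTES = 128  # assumption: line size used to group addresses


def top_line_share(data_addrs, k):
    """Fraction of hits that land on the k most frequently hit cache lines."""
    counts = Counter(a // CACHE_LINE_BYTES for a in data_addrs)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(n for _line, n in counts.most_common(k))
    return top / total


# Made-up sample: one line hit eight times, two others hit once each.
addrs = [0x218880] * 8 + [0x218900, 0x219000]
share = top_line_share(addrs, 1)
print(share)
```

A share close to 1.0 for a small k would indicate that hits are concentrated on a few lines.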
23
Future Work
- Quantify representativeness of sampled event traces
- Suggest more ways to improve p690 application performance
- Study sampled event traces for other workloads
- In-depth study of process characterization
24
Thank You!