Determining the Accuracy of Event Counts - Methodology
- Design and implement microbenchmarks for event count prediction
  - Determine the algorithm implemented in the processor, e.g., the branch prediction or cache prefetch algorithm
- Verify predicted event counts
  - Collect data; scripts permit a large number of experiments to be performed
  - Compute means and standard deviations
  - Compare predicted and actual event counts
- Repeat the process if necessary
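The collect-and-compare step above can be sketched as a small analysis script. This is a minimal illustration, not the authors' actual scripts; the counter values are made-up numbers chosen to show a small constant excess over a predicted count.

```python
import statistics

def summarize(counts):
    """Mean and (population) standard deviation over repeated experiments."""
    return statistics.mean(counts), statistics.pstdev(counts)

def compare(predicted, counts):
    """Compare a predicted event count against repeated measurements."""
    mean, stdev = summarize(counts)
    return {"predicted": predicted, "mean": mean,
            "stdev": stdev, "error": mean - predicted}

# Illustrative numbers only: five runs of a benchmark predicted to issue
# exactly 1,000,000 loads, each run measuring a small constant excess.
result = compare(1_000_000, [1_000_086, 1_000_086, 1_000_085,
                             1_000_086, 1_000_087])
```

A small, stable error like this (consistent mean offset, standard deviation under one) is exactly the overhead-bias case discussed below.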
Microbenchmarks

- Contain simple code segments, small when possible
- Comprised of regular patterns that permit mathematical modeling of the associated event count
- Scalable w.r.t. granularity, i.e., the number of generated events
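For a regular pattern, the expected event count can be written in closed form. A minimal sketch for a cold, strided array traversal; the 64-byte cache line size and the no-prefetch assumption are illustrative, not taken from the slides.

```python
import math

def predicted_l1_misses(n_accesses, stride_bytes, line_bytes=64):
    """Cold, sequential strided traversal of a large array: one miss per
    cache line touched. Assumes no hardware prefetching and an array
    larger than the cache."""
    lines_touched = math.ceil(n_accesses * stride_bytes / line_bytes)
    # A single access can generate at most one miss.
    return min(n_accesses, lines_touched)
```

Scaling `n_accesses` scales the predicted count linearly, which is the scalability w.r.t. granularity the slide calls for.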
Four Classes of Errors

Here, “error” means the difference between predicted and actual event counts. An error does not necessarily mean the count is wrong, but it does indicate that, without further knowledge, the count may not be very useful, or its use may be diminished.

- Overhead bias: a fixed bias associated with PAPI instrumentation
- Multiplicative: the actual count is a multiple of the predicted count, e.g., twice the prediction
- Random: among x experiments, y of them have significantly different event counts
- Unknown: further split into counts that are not predictable but verifiable, and counts that are not predictable and whose veracity cannot be determined, e.g., because randomness in the processor's algorithm does not allow prediction
When Event Counts Can Be Used to Tune Performance
- Overhead bias error: if the error is well defined, i.e., has a small standard deviation among experiments, adjust the counts accordingly, or adjust the granularity so that the bias is not significant
- Multiplicative: adjust counts accordingly
- Random: perform multiple experiments and verify that the standard deviation is small
- Unknown, not predictable but verifiable: not useful for fine performance tuning, but useful for coarse tuning

Note that event counts can be used in these four cases, but knowledge of what is being counted, and how it is being counted, is needed in order to make use of them.
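The first three adjustments above are mechanical once the error class is known. A hedged sketch; the bias and factor values echo the Itanium and Power3 numbers reported later in the talk, and the 1% tolerance for random error is an assumed threshold, not one the slides specify.

```python
import statistics

def correct_overhead_bias(raw_count, bias):
    """Overhead bias: subtract a fixed, well-defined instrumentation bias."""
    return raw_count - bias

def correct_multiplicative(raw_count, factor):
    """Multiplicative error: divide out a known factor, e.g. Power3's 2x."""
    return raw_count // factor

def usable_for_fine_tuning(counts, rel_tol=0.01):
    """Random error: counts are usable only if the spread across repeated
    experiments is small relative to the mean (tolerance is an assumption)."""
    mean = statistics.mean(counts)
    return statistics.pstdev(counts) <= rel_tol * mean
```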
When Event Counts Cannot Be Used to Tune Performance
- Unknown: vendor assistance is needed to understand what is being counted or which algorithm is implemented in the processor
- Combinations of error classes must be segregated
Overhead Bias Error

            Itanium   Power3   R12K   Pentium
  Loads        86       28      46      N/A
  Stores      129       31      Mult. Error

Standard deviation was less than one for the counts cited.
Multiplicative Error

Floating-point operations, compared across all platforms: all but the Power3 have an error of zero. On the Power3 the actual count is double, i.e., 2x, where x is the predicted count.
Random Error: Itanium L1 Data Cache Misses

                                  Mean      Std. Dev.
  90% of data – 1M accesses       1,290         170
  10% of data – 1M accesses     782,891     566,370
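One way to quantify whether random error swamps the measurement is the coefficient of variation (standard deviation divided by mean). The numbers below are taken from the table above.

```python
def coefficient_of_variation(mean, stdev):
    """Relative spread of repeated event counts: small values mean the
    mean count is meaningful; large values mean it is not."""
    return stdev / mean

cv_90 = coefficient_of_variation(1_290, 170)        # 90%-of-data case
cv_10 = coefficient_of_variation(782_891, 566_370)  # 10%-of-data case
```

The 90% case has a spread of roughly 13% of the mean, while in the 10% case the spread is most of the mean itself, so only the former supports even rough tuning.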
Unknown – Not Predictable But Verifiable
L1 D-cache misses on three platforms. The error is shown w.r.t. the number of accesses generated, expressed as a percentage of cache size; this “normalizes” the data across platforms whose caches differ in size (16K, 32K, and 64K). The benchmark is the array benchmark: sequential access of array elements.

If one looked at these results without knowing about stream buffers, zero misses as the number of accesses increases would not make sense; one might conclude the counters are wrong. Without knowledge of the algorithm used for streaming, the event counts cannot be predicted; but using timing (cycle counts), we can verify that an approximate number of cache misses were, in fact, generated.

A microbenchmark that randomly accesses the array, combined with a simulator that simulates the generated access pattern, results in increased predictability.
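The simulator mentioned above can be quite small. A minimal direct-mapped cache model, as an illustration only: the line size and line count are assumed values, and real hardware (with stream buffers, associativity, etc.) will differ.

```python
def simulate_misses(addresses, line_bytes=64, n_lines=256):
    """Count the misses a simple direct-mapped cache would take on a
    given byte-address access pattern. No prefetching is modeled."""
    resident = [None] * n_lines          # cache line tag per slot
    misses = 0
    for addr in addresses:
        line = addr // line_bytes        # which memory line is touched
        slot = line % n_lines            # where a direct-mapped cache puts it
        if resident[slot] != line:       # miss: fill the slot
            resident[slot] = line
            misses += 1
    return misses

# Sequential pass over an array that exactly fills the cache: all cold misses.
cold = simulate_misses(range(0, 64 * 256, 64))
# Two identical passes: the second pass hits every time.
warm = simulate_misses(list(range(0, 64 * 256, 64)) * 2)
```

Feeding the simulator the same (e.g., randomly generated) address trace as the microbenchmark yields a predicted miss count to compare against the hardware counter.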
Unknown – Not Predictable and Not Verifiable
- Branch prediction: the algorithms used for prediction are very complex
- Without proprietary information, predictions cannot be made
Future Work

- Expand the set of events and platforms studied
- Compare the accuracy of sampling with that of aggregate counts
- Determine the usefulness of event counts generated by both sampling and aggregate counting for specific DoD applications