Published by Jessie Stewart. Modified over 9 years ago.
1
Workload Design: Selecting Representative Program-Input Pairs
Lieven Eeckhout, Hans Vandierendonck, Koen De Bosschere
Ghent University, Belgium
PACT 2002, September 23, 2002
2
Introduction
Microprocessor design: simulation of a workload = set of programs + inputs
  – constrained in size due to time limitations
  – taken from suites, e.g., SPEC, TPC, MediaBench
Workload design:
  – which programs?
  – which inputs?
  – representative: large variation in behavior
  – benchmark-input pairs should be "different"
3
Main idea
Workload design space is a p-dimensional (p-D) space
  – with p = number of relevant program characteristics
  – p is too large for understandable visualization
  – the p characteristics are correlated
Idea: reduce the p-D space to a q-D space
  – with q small (typically 2 to 4)
  – without losing important information
  – no correlation between the q dimensions
  – achieved by multivariate data analysis techniques: PCA and cluster analysis
4
Goal
Measuring the impact of input data sets on program behavior
  – "far away" or weak clustering: different behavior
  – "close" or strong clustering: similar behavior
Applications:
  – selecting representative program-input pairs
      e.g., one program-input pair per cluster
      e.g., take the program-input pair with the smallest dynamic instruction count
  – gaining insight into the influence of input data sets
  – profile-guided optimization
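The selection rule on this slide (one program-input pair per cluster, preferring the one with the smallest dynamic instruction count) can be sketched as follows. This is a minimal illustration, not the authors' tooling: the function name `pick_representatives`, the pair labels, and the cluster ids are assumptions made for the example. The instruction counts are the vortex and compress figures quoted on the "Small versus large inputs" slide.

```python
def pick_representatives(cluster_of, insn_count):
    """Return {cluster id: pair with the smallest dynamic instruction count}."""
    best = {}
    for pair, cid in cluster_of.items():
        if cid not in best or insn_count[pair] < insn_count[best[cid]]:
            best[cid] = pair
    return best

# Hypothetical cluster assignment (ids are illustrative only).
cluster_of = {
    "vortex-train": 0,
    "vortex-ref": 0,
    "compress-ref": 1,
    "compress-reduced": 1,
}
# Dynamic instruction counts as quoted in the talk.
insn_count = {
    "vortex-train": 3.2e9,
    "vortex-ref": 92.5e9,
    "compress-ref": 60e9,
    "compress-reduced": 2e9,
}

representatives = pick_representatives(cluster_of, insn_count)
print(representatives)  # {0: 'vortex-train', 1: 'compress-reduced'}
```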
5
Overview
Introduction
Workload characterization
Data analysis
  – Principal components analysis (PCA)
  – Cluster analysis
Evaluation
Discussion
Conclusion
6
Workload characterization (1)
Instruction mix
  – int, logic, shift&byte, load/store, control
Branch prediction accuracy
  – bimodal (8K*2 bits), gshare (8K*2 bits) and hybrid (meta: 8K*2 bits) branch predictor
Data and instruction cache miss rates
  – five caches with varying size and associativity
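As an illustration of how one of these characteristics could be measured, here is a minimal sketch of a bimodal predictor with 8K two-bit saturating counters. The trace format, the initial counter value, and the indexing by `pc >> 2` (4-byte instructions) are assumptions made for this example; the slides do not specify them.

```python
def bimodal_accuracy(trace, entries=8192):
    """trace: iterable of (pc, taken) pairs; returns the fraction predicted correctly."""
    counters = [2] * entries           # 2-bit counters, start "weakly taken" (assumption)
    correct = total = 0
    for pc, taken in trace:
        idx = (pc >> 2) % entries      # assumes 4-byte instructions
        predicted_taken = counters[idx] >= 2
        correct += predicted_taken == taken
        total += 1
        if taken:                      # saturating increment / decrement
            counters[idx] = min(3, counters[idx] + 1)
        else:
            counters[idx] = max(0, counters[idx] - 1)
    return correct / total if total else 0.0

# Hypothetical trace: one loop branch, taken 9 times and then not taken, repeated 10 times.
loop_branch = [(0x400, True)] * 9 + [(0x400, False)]
accuracy = bimodal_accuracy(loop_branch * 10)
print(accuracy)  # 0.9
```

Only the loop exit is mispredicted in each iteration of the pattern, which is why the accuracy settles at 90% for this trace.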
7
Workload characterization (2)
Number of instructions between two taken branches
Instruction-level parallelism (ILP)
  – IPC of an infinite-resource machine with only read-after-write dependencies
In total: p = 20 variables
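The ILP metric can be sketched as a dataflow-limit calculation: each instruction issues one cycle after its last producer, and IPC is the instruction count divided by the length of the longest dependence chain. The trace format, unit latencies, and the restriction to register dependencies are assumptions made for this sketch, not details from the talk.

```python
def ideal_ipc(trace):
    """trace: list of (dest_reg, [src_regs]); returns IPC with unlimited resources.

    Only read-after-write dependencies constrain the schedule; every
    instruction has unit latency (assumption).
    """
    ready = {}        # register -> cycle in which its latest value is produced
    last_cycle = 0
    for dest, srcs in trace:
        cycle = 1 + max((ready.get(s, 0) for s in srcs), default=0)
        if dest is not None:
            ready[dest] = cycle   # overwriting models renaming: no WAR/WAW stalls
        last_cycle = max(last_cycle, cycle)
    return len(trace) / last_cycle if trace else 0.0

# Hypothetical 4-instruction trace: two independent producers, then a chain of two.
trace = [("r1", []), ("r2", []), ("r3", ["r1", "r2"]), ("r4", ["r3"])]
ipc = ideal_ipc(trace)
print(ipc)  # 4 instructions over 3 cycles
```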
8
Overview
Introduction
Workload characterization
Data analysis
  – Principal components analysis (PCA)
  – Cluster analysis
Evaluation
Discussion
Conclusion
9
PCA
Many program characteristics (variables) are correlated
PCA computes new variables
  – p principal components PC_i
  – linear combinations of the original characteristics
  – uncorrelated
  – contain the same total variance over all benchmarks
  – Var[PC_1] > Var[PC_2] > Var[PC_3] > …
  – most have near-zero variance (constant)
  – reduce the dimension of the workload space to q = 2 to 4
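The properties listed on this slide can be checked on a small example. The sketch below is a standard-library-only PCA, not the STATISTICA procedure used in the talk. It assumes the characteristics are standardized first, and it uses a cyclic Jacobi eigen-decomposition chosen here for brevity. The data matrix and all function names are hypothetical.

```python
import math

def standardize(rows):
    """Centre each column and scale it to unit variance (columns must not be constant)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
            for j in range(p)]
    return [[(r[j] - means[j]) / stds[j] for j in range(p)] for r in rows]

def covariance(centered):
    """Sample covariance matrix of already-centred data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def jacobi_eigen(sym, sweeps=50):
    """Eigen-decomposition of a symmetric matrix by cyclic Jacobi rotations.
    Returns (eigenvalues, eigenvectors), sorted by decreasing eigenvalue."""
    n = len(sym)
    a = [row[:] for row in sym]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        if sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j) < 1e-20:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                theta = 0.5 * math.atan2(2 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for m in (a, v):              # column update: M <- M J
                    for k in range(n):
                        mp, mq = m[k][p], m[k][q]
                        m[k][p], m[k][q] = c * mp - s * mq, s * mp + c * mq
                for k in range(n):            # row update: A <- J^T A
                    ap, aq = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * ap - s * aq, s * ap + c * aq
    order = sorted(range(n), key=lambda i: a[i][i], reverse=True)
    return [a[i][i] for i in order], [[v[k][i] for k in range(n)] for i in order]

def pca(rows, q):
    """Return (variances of all PCs, scores on the first q PCs, variance fraction kept)."""
    z = standardize(rows)
    variances, axes = jacobi_eigen(covariance(z))
    scores = [[sum(x * w for x, w in zip(zr, axis)) for axis in axes[:q]] for zr in z]
    return variances, scores, sum(variances[:q]) / sum(variances)

# Hypothetical data: 5 program-input pairs x 3 characteristics.
# The second column is exactly twice the first, so one PC should have ~zero variance.
data = [
    [0.20, 0.40, 0.95],
    [0.25, 0.50, 0.90],
    [0.30, 0.60, 0.97],
    [0.35, 0.70, 0.85],
    [0.40, 0.80, 0.93],
]
variances, scores, explained = pca(data, q=2)
print(variances, explained)
```

With standardized inputs the variances of the components sum to p (here 3), which is the "same total variance" property on the slide. The perfectly correlated pair of columns shows up as a component with near-zero variance that can be dropped.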
10
PCA: Interpretation
Interpretation
  – principal components (PCs) lie along the main axes of the ellipse
  – Var(PC_1) > Var(PC_2) > ...
  – PC_2 is less important for explaining the variation over program-input pairs
Reduce the number of PCs
  – throw out PCs with negligible variance
[Figure: axes Variable 1 and Variable 2, with PC_1 and PC_2 overlaid]
11
Cluster analysis
Hierarchical clustering
Based on the distance between program-input pairs
Can be represented by a dendrogram
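A minimal sketch of hierarchical (agglomerative) clustering on points in the reduced PC space follows. The slides do not say which linkage rule or distance measure was used; complete linkage with Euclidean distance is an assumption made here, and the points and labels are hypothetical.

```python
import math

def complete_linkage(points, labels, max_distance=math.inf):
    """Merge clusters bottom-up until the next merge would exceed max_distance.

    Returns (clusters, merges); each merge is (linkage distance, merged labels),
    which is the information a dendrogram displays.
    """
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: distance between the two farthest members.
                d = max(math.dist(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > max_distance:
            break
        merged = clusters.pop(j)      # j > i, so index i stays valid
        clusters[i].extend(merged)
        merges.append((d, sorted(labels[k] for k in clusters[i])))
    return [sorted(labels[k] for k in c) for c in clusters], merges

# Hypothetical program-input pairs placed in a 2-D PC space.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (5.0, 1.5)]
labels = ["A-in1", "A-in2", "B-in1", "B-in2"]

clusters, merges = complete_linkage(points, labels, max_distance=2.0)
print(clusters)  # [['A-in1', 'A-in2'], ['B-in1', 'B-in2']]
```

Cutting the hierarchy at a small linkage distance groups inputs that behave alike. Running the same function without `max_distance` yields the full merge history, from the closest pair up to a single cluster.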
12
Overview
Introduction
Workload characterization
Data analysis
  – Principal components analysis (PCA)
  – Cluster analysis
Evaluation
Discussion
Conclusion
13
Methodology
Benchmarks
  – SPECint95
      Inputs from SPEC: train and ref
      Inputs from the web (ijpeg)
      Reduced inputs (compress)
  – TPC-D on postgres v6.3
  – Compiled with -O4 on Alpha
  – 79 program-input pairs
ATOM
  – Instrumentation
  – Measuring characteristics
STATISTICA
  – Statistical analysis
14
GCC: principal components
2 PCs: 96.9% of total variance
15
GCC
[Figure: plot of the GCC inputs; labels recovered from the slide are listed below]
Inputs: emit-rtl, insn-emit, protoize, varasm, explow, recog, reload1, expr, cp-decl, insn-recog, print-tree, dbxout, toplev
Annotations: high branch prediction accuracy; high I-cache miss rates; high D-cache miss rates; many control & shift insn; many LD/STs and ILP; 7 inputs
16
Workload space: 4 PCs account for 93.1% of total variance
ijpeg, compress and go are isolated
  – Go: low branch prediction accuracy
  – Compress: high data cache miss rate
  – Ijpeg: high LD/STs rate, low ctrl ops rate
17
Workload space
Strong clustering
18
Small versus large inputs
Vortex:
  – Train: 3.2B insn
  – Ref: 92.5B insn
  – Similar behavior: linkage distance ~ 1.4
Not for m88ksim
  – Linkage distance ~ 4
Reference input for compress can be reduced without significantly impacting behavior: 2B vs. 60B instructions
19
Impact of input on behavior
For TPC-D queries:
  – Weak clustering
  – Large impact
  – I-cache behavior
In general: variation between programs is larger than the variation between input sets for the same program
  – However: there are exceptions where the input has a large impact on behavior, e.g., TPC-D and perl
20
Overview
Introduction
Workload characterization
Data analysis
  – Principal components analysis (PCA)
  – Cluster analysis
Evaluation
Discussion
Conclusion
21
Conclusion
Workload design
  – representative
  – not long running
Principal components analysis (PCA) and cluster analysis help in detecting input data sets that result in similar or different program behavior
Applications:
  – workload design: representativeness while taking simulation time into account
  – impact of input data sets on program behavior
  – profile-guided optimizations