Reducing Training Time in a One-shot Machine Learning-based Compiler
John Thomson, Michael O'Boyle, Grigori Fursin, Björn Franke
Presented by: Muhsin Zahid UGUR
Dept. of Computer & Information Sciences, University of Delaware
Outline
- A brief introduction to the paper
- The cluster-based approach
- Results
A brief introduction
- Iterative compilation: search over many optimization settings for each program
- Performance: finds settings that significantly outperform the compiler's default
- Training cost: the search requires thousands of compile-and-execute evaluations, which is expensive
The cluster-based approach
The cluster-based approach (the steps in detail)
- Clustering
  - Programs are clustered using the Gustafson-Kessel algorithm (see the sketch below)
  - Intra-cluster distances are minimized
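The following is a minimal numpy sketch of Gustafson-Kessel fuzzy clustering over program feature vectors, included only to make the clustering step concrete; the interface, the fuzziness exponent m, and the covariance regularization are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def gustafson_kessel(X, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Gustafson-Kessel fuzzy clustering of program feature vectors.

    X : (N, d) matrix, one feature vector per program
    c : number of clusters
    m : fuzziness exponent (m > 1)
    Returns (U, centers): U is the (c, N) fuzzy membership matrix.
    """
    rng = np.random.default_rng(seed)
    N, d = X.shape
    U = rng.random((c, N))
    U /= U.sum(axis=0, keepdims=True)        # columns sum to 1

    for _ in range(max_iter):
        Um = U ** m
        centers = (Um @ X) / Um.sum(axis=1, keepdims=True)

        d2 = np.empty((c, N))
        for i in range(c):
            diff = X - centers[i]                              # (N, d)
            # fuzzy covariance of cluster i
            F = (Um[i, :, None] * diff).T @ diff / Um[i].sum()
            F += 1e-8 * np.eye(d)                              # keep F invertible
            # volume-normalized Mahalanobis norm (the GK distance)
            A = np.linalg.det(F) ** (1.0 / d) * np.linalg.inv(F)
            d2[i] = np.einsum('nd,de,ne->n', diff, A, diff)

        d2 = np.fmax(d2, 1e-12)
        w = d2 ** (-1.0 / (m - 1))
        U_new = w / w.sum(axis=0, keepdims=True)               # membership update
        if np.linalg.norm(U_new - U) < tol:
            U = U_new
            break
        U = U_new
    return U, centers
```

A program can then be assigned to the cluster where its membership is highest, and the program closest to each cluster centre can serve as that cluster's representative.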
The cluster-based approach (the steps in detail) - cont.
- Training
  - Find the best optimization setting for each representative program
  - Build a model from the results
- One-shot compilation
  - Use a nearest-neighbor model (see the sketch below)
- Deployment
  - Extracted features of the new program are fed to the nearest-neighbor classifier
  - The benchmark is compiled with the predicted setting and executed
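A small sketch of the one-shot deployment step, assuming a Euclidean 1-nearest-neighbor lookup over program feature vectors; the feature values and flag strings below are made up for illustration.

```python
import numpy as np

# training data: feature vector and best flag setting found for each
# representative program (hypothetical values for illustration)
train_feats = np.array([[0.2, 1.0, 3.0],
                        [0.9, 0.1, 5.0],
                        [0.5, 2.2, 1.0]])
best_flags = ["-O3 -funroll-loops",
              "-O2 -fno-strict-aliasing",
              "-O3 -fomit-frame-pointer"]

def predict_flags(features):
    """One-shot prediction: reuse the flag setting of the nearest
    training program in feature space (1-NN, Euclidean distance)."""
    dists = np.linalg.norm(train_feats - features, axis=1)
    return best_flags[int(np.argmin(dists))]

new_prog = np.array([0.4, 2.0, 1.2])   # features extracted from the unseen program
print(predict_flags(new_prog))         # the program is then compiled once with these flags
```

Because the prediction needs only a single compilation of the new program, the cost of the iterative search is paid once, at training time.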
Cluster approach
- 6 typical programs represent the clusters
- For each, 4000 random flag settings are selected and evaluated (sketched below)
- The best-performing setting is recorded
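A minimal sketch of the per-program random search, assuming on/off GCC flags and a simple wall-clock timing harness; the flag pool, the -O2 baseline, and the timing method are illustrative assumptions, only the 4000-setting budget comes from the slides.

```python
import random
import subprocess
import time

# hypothetical pool of on/off flags; a real search would use the full flag set
FLAG_POOL = ["-funroll-loops", "-fomit-frame-pointer", "-ftree-vectorize",
             "-fno-strict-aliasing", "-finline-functions", "-fgcse-after-reload"]

def random_setting():
    """Draw one random setting: each flag independently on or off."""
    return [f for f in FLAG_POOL if random.random() < 0.5]

def evaluate(source, flags):
    """Compile the benchmark with the given flags and return its runtime."""
    subprocess.run(["gcc", "-O2", *flags, source, "-o", "bench"], check=True)
    start = time.perf_counter()
    subprocess.run(["./bench"], check=True)
    return time.perf_counter() - start

def best_of_random(source, trials=4000):
    """Evaluate `trials` random settings and record the best-performing one."""
    best_flags, best_time = [], float("inf")
    for _ in range(trials):
        flags = random_setting()
        t = evaluate(source, flags)
        if t < best_time:
            best_flags, best_time = flags, t
    return best_flags, best_time
```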
Standard random training selection (the baseline)
- Programs to train on are chosen by random selection
- 6 benchmarks
- Selections are repeated and averaged to obtain a robust mean performance
Generating the upper bound
- Apply 4000 different optimization settings to each benchmark
- The best result gives a reasonable upper-bound limit
Results
Conclusion
- Reduces the amount of training required
- Better characterizes the program space
- 1.14x speedup on EEMBCv2
Questions?
Thank you.