System Software for Parallel Computing
Two System Software Components
- Both are areas where innovation is hard
- Replacement for traditional optimizing compilers
- Replacement for the conventional large monolithic OS
Quick View of Optimizing Compiler
Autotuners vs Traditional Compilers
- Quality of generated code depends on:
  - Which optimizations to perform
  - Choosing parameters for the optimizations
  - Selecting from among alternative implementations
- Resulting optimization space
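To give a feel for how quickly the optimization space grows, the three choices above can be enumerated as a cross product. This is only a sketch: the optimization names, parameter values, and implementation labels are hypothetical placeholders, not taken from any particular compiler or autotuner.

```python
from itertools import product

# Hypothetical search space; every name and value here is illustrative.
optimizations = ["unroll", "tile", "vectorize"]   # each can be switched on or off
unroll_factors = [1, 2, 4, 8]                     # parameter of one optimization
tile_sizes = [16, 32, 64]                         # parameter of another
implementations = ["row_major", "col_major"]      # alternative implementations


def optimization_space():
    """Yield every configuration in the hypothetical search space."""
    switches = product([False, True], repeat=len(optimizations))
    for flags, unroll, tile, impl in product(
        switches, unroll_factors, tile_sizes, implementations
    ):
        yield {
            "flags": dict(zip(optimizations, flags)),
            "unroll": unroll,
            "tile": tile,
            "impl": impl,
        }


space = list(optimization_space())
print(len(space))  # 2**3 * 4 * 3 * 2 = 192
```

Even this toy space has 192 points, and each added switch doubles it, which is why a fixed set of compiler heuristics rarely covers the whole space.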
Difficulty of Enhancing Modern Compilers
- Constraints of modern compilers:
  - On the order of a million lines of code
  - New optimizations are difficult to add
  - Large investment
  - Functional correctness is more important than output code quality
- Hence peak performance may still require handcrafting of the program
Promise of Search-Based Autotuners
- Search-based techniques are used in several areas of code generation
- Generates many variants of a given kernel
- Benchmarks each variant by running it on the target platform
  - Measures time to complete on the target platform (tries many or all optimization switches)
- Often finds non-intuitive loop unrolling or register blocking factors that lead to better performance
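The generate, benchmark, and select steps above can be sketched in a few lines. This is a minimal illustration, assuming a simple summation kernel and a hand-picked list of unroll factors; the function names `make_variant`, `benchmark`, and `autotune` are hypothetical and do not come from any real autotuner.

```python
import time


def make_variant(unroll):
    """Generate one variant of a summation kernel, unrolled `unroll` times."""
    body = " + ".join(f"data[i + {k}]" for k in range(unroll))
    src = (
        "def kernel(data):\n"
        "    total = 0\n"
        f"    n = len(data) - len(data) % {unroll}\n"
        f"    for i in range(0, n, {unroll}):\n"
        f"        total += {body}\n"
        "    for i in range(n, len(data)):\n"  # remainder loop
        "        total += data[i]\n"
        "    return total\n"
    )
    namespace = {}
    exec(src, namespace)  # compile the generated source into a function
    return namespace["kernel"]


def benchmark(fn, data, repeats=3):
    """Return the best wall-clock time of `fn(data)` over a few runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(data)
        best = min(best, time.perf_counter() - start)
    return best


def autotune(factors, data):
    """Benchmark every variant on this machine and return the fastest factor."""
    timings = {u: benchmark(make_variant(u), data) for u in factors}
    return min(timings, key=timings.get), timings


data = list(range(100_000))
best, timings = autotune([1, 2, 4, 8], data)
print("fastest unroll factor on this machine:", best)
```

The winning factor depends on the machine and interpreter, which is the point: the search measures the target platform instead of predicting it.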
Recent Autotuners
- Earlier autotuners concentrated on non-intuitive loop unrolling
- Recent autotuners are applicable to general-purpose parallel programs
- Auto-tuning cycle
- Autotuners as libraries
- Autotuners as stand-alone applications
- Integrating autotuners as part of the operating system
- Compiler extensions for auto-tuning
Note: Taken from the more recent paper "Auto-Tuning Support for Manycore Applications - Perspectives for Operating Systems and Compilers"
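One way to picture the auto-tuning cycle with the tuner packaged as a library is a feedback loop: the application runs with the current parameter value, reports the measured time, and the tuner picks the next value. The sketch below is an assumption-laden illustration, not the design from the cited paper: the `Tuner` class, its `report` method, and the synthetic cost model standing in for a real measurement are all hypothetical.

```python
class Tuner:
    """Hypothetical library-style autotuner for one integer parameter."""

    def __init__(self, low, high, start):
        self.low, self.high = low, high
        self.value = start                  # value to use in the next cycle
        self.best_value = start
        self.best_time = float("inf")
        self.step = 1

    def report(self, elapsed):
        """Feed back one measurement and choose the value for the next cycle."""
        if elapsed < self.best_time:
            self.best_value, self.best_time = self.value, elapsed
        else:
            # Worse result: search in the other direction from the best value.
            self.step = -self.step
        candidate = self.best_value + self.step
        self.value = min(self.high, max(self.low, candidate))


def simulated_run(threads):
    """Synthetic cost model (an assumption): parallel speedup plus overhead."""
    return 100.0 / threads + 2.0 * threads


tuner = Tuner(low=1, high=16, start=1)
for _ in range(20):                         # each iteration is one tuning cycle
    elapsed = simulated_run(tuner.value)    # a real program would time itself
    tuner.report(elapsed)

print("best thread count found:", tuner.best_value)
```

A stand-alone or OS-integrated tuner would run the same cycle, only with the measurement and the decision living outside the application process.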
References
- Michael Wolfe. High-Performance Compilers for Parallel Computing.
- Randy Allen. Optimizing Compilers for Modern Architectures: A Dependence-based Approach.
- C. A. Schaefer, V. Pankratius and W. F. Tichy. Atune-IL: An instrumentation language for auto-tuning parallel applications. Technical Report, University of Karlsruhe, 2009.