1
A Practical Method For Quickly Evaluating Program Optimizations
Grigori Fursin, Albert Cohen, Michael O’Boyle and Olivier Temam
ALCHEMY Group, INRIA Futurs and LRI, Paris-Sud University, France
Institute for Computing Systems Architecture, University of Edinburgh, UK
Presented by Shaofeng Liu
2
Outline
- Short background
- What problem do we want to solve?
- How do we solve the problem?
- The main challenges
- Result
- Conclusion
3
Background
- Iterative optimization has great potential for a large range of optimization techniques.
- The program is run repeatedly, and a new optimization technique is tested at each execution.
- What is the problem with this evaluation approach?
4
The problem of iterative evaluation
- Problem:
  - The optimization option space can be huge.
  - A naïve iterative search can be very time-consuming.
  - e.g. the mgrid SpecFP2000 benchmark has an original execution time of 290s; with 32 optimization options, the total evaluation time is 290 x 32 = 9280s.
  - [Note: An optimization option is not necessarily a single optimization technique; it can be a combined set of techniques.]
- Little prior work provides a practical approach for applying iterative optimization effectively.
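The cost estimate on this slide can be reproduced with a short calculation. This is only a sketch of the arithmetic; the function name `naive_search_time` is made up for illustration, and it assumes every option is evaluated with one full run of unchanged length.

```python
def naive_search_time(run_seconds: int, num_options: int) -> int:
    """Total time if each optimization option needs one full program run.

    Assumption: every run takes as long as the original program.
    """
    return run_seconds * num_options


# mgrid example from the slide: 290s per run, 32 options.
print(naive_search_time(290, 32))  # 9280
```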
5
Can we do better?
- The idea is:
  - Can we evaluate multiple optimization options in a single run of the program?
- To do this:
  - This paper studies the programs themselves. Rather than assuming nothing about the programs, as the last two iterative optimization papers did, this work takes advantage of an interesting property of many programs.
- The interesting property is:
  - The programs (scientific applications) tend to have some performance stability.
- Some papers have shown that many programs exhibit phases, i.e. program trace intervals of several million instructions where performance is similar.
6
What does phase mean?
- A phase is a stable run of consecutive periodic executions of the same piece of code (e.g. a time-consuming function call or a big loop).
  - A phase corresponds to only one piece of code;
  - But one piece of code may have multiple phases.
- For example, if we monitor the subroutines “resid” and “psinv” in mgrid, we can see that their behavior is quite stable and predictable.
7
The stability of execution time of subroutine resid
- We can see that the execution time of resid is quite stable, with a period of 7.
- Can we take advantage of this stability?
8
More examples
9
The Main Idea
- Find some time-consuming functions and big loops to optimize (I think this is done by the EKOPath compiler and users).
- Insert the code versions optimized with different optimization options into the original code, i.e. multi-version code.
- Detect the phases of the program.
- Apply different optimization options within one phase and measure the execution time. Since these executions are expected to have the same execution time without optimization, any change in execution time is the effect of the optimization techniques.
10
Add monitor code
- Compiler instrumentation
  - Two monitoring routines, timer_start and timer_stop, are added before and after each monitored code section:
    - timer_start: select one code version from the multi-version code, and record the starting time;
    - timer_stop: record the completion time and compute the IPC; detect phases and regularity.
  - The overhead of the instrumentation is very small (less than 1%).
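The role of the two monitoring routines can be illustrated with a small sketch. It is not the paper's instrumentation: the class `SectionMonitor`, its fields, and the two toy code versions are hypothetical, and wall-clock time from `time.perf_counter` stands in for the IPC that the real routines compute. Phase detection is left out here.

```python
import time
from typing import Callable, List, Tuple


class SectionMonitor:
    """Hypothetical monitor for one multi-version code section."""

    def __init__(self, versions: List[Callable[[], None]]) -> None:
        self.versions = versions          # index 0 is the original code
        self.selected = 0                 # version to run next
        self.log: List[Tuple[int, float]] = []
        self._start = 0.0

    def timer_start(self) -> Callable[[], None]:
        # Pick one code version and record the starting time.
        self._start = time.perf_counter()
        return self.versions[self.selected]

    def timer_stop(self) -> float:
        # Record the completion time; the real routine would also
        # compute IPC and update the phase detection state.
        elapsed = time.perf_counter() - self._start
        self.log.append((self.selected, elapsed))
        return elapsed


def original() -> None:
    sum(i * i for i in range(1000))


def variant() -> None:
    # Stand-in for the same section compiled with another option.
    sum([i * i for i in range(1000)])


monitor = SectionMonitor([original, variant])
for choice in (0, 1, 0):
    monitor.selected = choice
    section = monitor.timer_start()
    section()
    monitor.timer_stop()
```

After the loop, `monitor.log` holds one `(version, elapsed)` pair per execution instance, which is the raw material that the stability detection works on.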
11
Detect stability
- Each phase is assigned a unique identifier, so phases are evaluated independently.
  - One optimization can benefit the code in one phase but harm the code in others;
  - So we only consider a single phase.
  - We define stability as 3 consecutive periodic execution instances of a code section with the same IPC.
  - It is easy to design an algorithm to find the distance between consecutive periodic executions.
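One possible way to find the distance between periodic executions is sketched below. It is an illustration, not the paper's algorithm: the function name `find_period`, the search bound `max_period`, and the relative tolerance `tol` used to decide that two IPC values are "the same" are all assumptions. The `repeats=3` default mirrors the 3 consecutive periodic instances in the stability definition.

```python
from typing import Optional, Sequence


def find_period(ipcs: Sequence[float],
                max_period: int = 16,
                tol: float = 0.02,
                repeats: int = 3) -> Optional[int]:
    """Return the smallest distance d such that the most recent
    `repeats * d` IPC samples repeat with period d, or None."""
    n = len(ipcs)
    for d in range(1, max_period + 1):
        if repeats * d > n:
            break  # not enough history to confirm this distance
        window = ipcs[n - repeats * d:]
        # Every sample must match the one d instances earlier.
        if all(abs(window[i] - window[i - d]) <= tol * abs(window[i - d])
               for i in range(d, len(window))):
            return d
    return None


# Synthetic trace with a repeating pattern of 7 instances,
# similar in shape to the resid example.
pattern = [1.0, 1.2, 0.9, 1.5, 1.1, 0.8, 1.3]
print(find_period(pattern * 3))  # 7
```

Requiring the whole recent window to repeat, rather than only the latest instance, avoids reporting a short false period when two values inside a longer pattern happen to coincide.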
12
Evaluating Optimization Options
- To evaluate a single optimization option, we need four consecutive periodic executions of the code:
  - The first two executions run the code with the optimization option, to double-check the optimization's performance;
  - The next two executions run the original code, to verify that the prediction is correct: if the execution time stays at the baseline performance, the prediction is correct; otherwise we start over and detect the regularity again.
  - [Note: the miss rate is fairly low.]
- If we have N optimization options, we can evaluate all of them in roughly 4*N consecutive periodic executions of the code.
- The rest of the program is executed using the best code, so we now have a self-tuned program!
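The four-execution protocol can be sketched as a loop. This is a simplified illustration under stated assumptions: `evaluate_options`, `run_version`, and `tol` are hypothetical names, the two optimized timings are averaged (the slides only say they are used to double-check), and a failed baseline check simply stops the search instead of restarting phase detection.

```python
from typing import Callable, Dict, Tuple


def evaluate_options(run_version: Callable[[int], float],
                     num_options: int,
                     baseline_time: float,
                     tol: float = 0.02) -> Tuple[int, Dict[int, float], int]:
    """Evaluate options 1..num_options inside one stable phase.

    run_version(v) executes one periodic instance with code version v
    (0 is the original code) and returns its execution time.
    Returns (best version, measured times, instances used).
    """
    results: Dict[int, float] = {}
    instances = 0
    for opt in range(1, num_options + 1):
        t1 = run_version(opt)   # two runs with the optimization option
        t2 = run_version(opt)
        b1 = run_version(0)     # two runs of the original code
        b2 = run_version(0)
        instances += 4
        if any(abs(b - baseline_time) > tol * baseline_time for b in (b1, b2)):
            # Prediction missed: the phase changed. The real scheme
            # would go back to detecting regularity; the sketch stops.
            break
        results[opt] = (t1 + t2) / 2  # assumption: average the two runs
    best = min(results, key=results.get, default=0)
    if best and results[best] >= baseline_time:
        best = 0  # no option beats the original code
    return best, results, instances


# Deterministic stand-in timings: version -> seconds per instance.
times = {0: 10.0, 1: 9.0, 2: 7.5, 3: 11.0}
best, results, used = evaluate_options(lambda v: times[v], 3, times[0])
print(best, used)  # 2 12
```

With three options the loop consumes 12 execution instances, matching the roughly 4*N estimate, and the remaining instances of the phase would run version 2.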
13
Result
- The evaluation process is greatly accelerated; we can evaluate more optimization options in a single run.
- The self-tuned program has
14
Data Structure
- The paper uses a Phase Detection and Prediction Table (PDPT) to record the running of the program. It looks like:
[Figure: PDPT layout, not included in the transcript]
15
Conclusion and Future work
- Conclusion
  - The time required to search the huge program transformation space is the main issue preventing iterative optimization from being widely used;
  - This paper uses a new approach to speed up the search by a factor of 32 to 962 over a set of benchmarks;
  - The method has another benefit: self-tuned programs across different architectures.
- Future work
  - Analysis of large, complex transformation spaces;
  - Improved phase detection and prediction scheme.
16
Questions?