University of California San Diego Locality Phase Prediction Xipeng Shen, Yutao Zhong, Chen Ding Computer Science Department, University of Rochester Class Discussion prepared by Bumyong Choi
Memory Adaptation: Programs exhibit dynamic locality. Several studies have been done, but they require manual analysis to find program phases. Locality-based phase prediction can solve the problem.
Previous Analysis: Interval-based: it is unclear what the best interval length is. Code-based: the program structure may not reveal its locality pattern (inlined functions, intertwined function calls).
The New Technique: Locality analysis: no fixed-size windows. Phase marking: all instructions in the program binary.
Locality Phase: A period of program execution that has stable or slowly changing data locality. For optimization purposes, we are interested in phases that are executed repeatedly with similar locality. Phase prediction: knowing a phase and its locality whenever the execution enters the phase.
Examples of Recurring Locality Phases: The aging of an airplane model; structural, mechanical, and molecular simulations; other scientific and commercial simulations. Great demand for computing resources. These programs exhibit dynamic but stable phases, so they are good candidates for adaptation if we can predict locality phases.
Program Phase. Source: Phase Tracking and Prediction (Sherwood et al.)
Downside
Motivation for the Use of Locality Analysis: Recent studies found that reuse-distance histograms change in predictable patterns in many programs. Reuse distance reveals patterns in program locality.
Reuse Distance: The number of distinct data elements accessed between two consecutive uses of the same element.
Reuse Distance Example: a b c a a c b, rd=2
Reuse Distance Example: a b c a a c b, rd=0
Reuse Distance Example: a b c a a c b, rd=1
Reuse Distance Example: a b c a a c b, rd=2
Reuse Distance Example: a b c a a c b, rd=0
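The definition of reuse distance can be checked on the example trace with a few lines of Python. This is a minimal, quadratic-time sketch for illustration only; the function name `reuse_distances` is hypothetical, and first accesses are reported as `None` because they have no previous use.

```python
def reuse_distances(trace):
    """Return the reuse distance of each access, or None for a first access."""
    last_pos = {}      # element -> index of its most recent access
    distances = []
    for i, elem in enumerate(trace):
        if elem in last_pos:
            # Count the distinct elements strictly between the two uses.
            between = set(trace[last_pos[elem] + 1:i])
            distances.append(len(between))
        else:
            distances.append(None)
        last_pos[elem] = i
    return distances


print(reuse_distances("abcaacb"))
# → [None, None, None, 2, 0, 1, 2]
```

The four reuses in the trace give distances 2, 0, 1, and 2, in that order.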
The reuse-distance trace of Tomcatv
What the Example Confirms: Major shifts in program locality are marked by radical changes in the reuse-distance trace. Locality phases have different lengths. The phase size changes greatly with program inputs. A phase is a unit of repeating behavior rather than a unit of uniform behavior.
New Locality Prediction Method: 1. Analyzes the data locality in profiling runs: (a) variable-distance sampling, (b) wavelet filtering, (c) optimal phase partitioning. 2. Analyzes the instruction trace and identifies the phase boundaries in the code. 3. Uses grammar compression to identify phase hierarchies and then inserts program markers through binary rewriting.
Off-line Analysis: Variable-distance sampling; wavelet filtering; optimal phase partitioning.
Variable-Distance Sampling: 1. Samples a small number of representative data elements. 2. Records only long-distance reuses. 3. Uses dynamic feedback to find suitable thresholds.
Wavelet Filtering: Used as a filter to expose abrupt changes in the reuse pattern; removes temporal redundancy. A common technique in signal and image processing. Shows the change of frequency over time. Further reading on wavelets: I. Daubechies. Ten Lectures on Wavelets. Capital City Press, Montpelier, Vermont, 1992.
Wavelet Filtering: The wavelet filtering removes reuses of the same data within a phase.
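To illustrate how a wavelet can expose abrupt changes, here is a minimal sketch that uses the simplest wavelet (Haar, level 1, undecimated) on the reuse distances of a single data sample. The choice of Haar, the function names, the threshold, and the sample values are all assumptions made for illustration; the slides do not say which wavelet family or threshold the authors use.

```python
import math


def haar_details(signal):
    """Level-1 undecimated Haar detail coefficients: scaled adjacent differences."""
    return [(signal[i] - signal[i - 1]) / math.sqrt(2)
            for i in range(1, len(signal))]


def abrupt_changes(signal, threshold):
    """Indices of accesses whose detail coefficient exceeds the threshold in magnitude."""
    details = haar_details(signal)
    # details[k] compares signal[k] with signal[k + 1], so report index k + 1.
    return [k + 1 for k, d in enumerate(details) if abs(d) > threshold]


# Hypothetical reuse distances of one data sample over time.
sub_trace = [10, 11, 10, 500, 502, 499, 12, 11]
print(abrupt_changes(sub_trace, threshold=100))
# → [3, 6]
```

Only the two accesses where the reuse distance jumps survive the filter; the small fluctuations inside each stable stretch are dropped, which is the sense in which filtering removes reuses within a phase.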
Optimal Phase Partitioning: Removes the spatial redundancy. Conditions for a good phase partition: a phase should include accesses to as many data samples as possible, and a phase should not include multiple accesses to the same data sample.
Optimal Phase Partitioning: The filtered trace is converted into a directed acyclic graph. Each edge has a weight. More details are in the paper.
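One way to read the graph formulation is as a shortest-path problem: every position in the filtered trace is a node, an edge from position i to position j treats the accesses in between as one phase, and the cheapest path from start to end gives the partition. The sketch below assumes an edge weight of `alpha * r + 1`, where `r` counts repeated accesses to the same data sample inside the candidate phase and the constant 1 charges for each phase. The weight form, the value of `alpha`, and the function name are assumptions for illustration; the slides defer the actual definition to the paper.

```python
def partition_phases(trace, alpha=0.5):
    """Split a filtered trace into phases via a shortest path on a DAG."""
    n = len(trace)
    best = [0.0] + [float("inf")] * n   # best[j]: minimum cost to cover trace[:j]
    prev = [0] * (n + 1)                # prev[j]: start of the last phase ending at j
    for i in range(n):
        seen = set()
        recurrences = 0
        for j in range(i + 1, n + 1):
            elem = trace[j - 1]
            if elem in seen:
                recurrences += 1        # same data sample accessed again in this phase
            seen.add(elem)
            cost = best[i] + alpha * recurrences + 1   # assumed edge weight
            if cost < best[j]:
                best[j] = cost
                prev[j] = i
    # Walk back from the end to recover the phase boundaries.
    phases = []
    j = n
    while j > 0:
        phases.append(trace[prev[j]:j])
        j = prev[j]
    phases.reverse()
    return phases


print(partition_phases("abcabcabc", alpha=0.5))
# → ['abc', 'abc', 'abc']
```

The two conditions pull in opposite directions: the per-phase cost favors long phases that cover many samples, while the recurrence penalty favors cutting before a sample repeats. Here the balance yields three phases, each touching every sample exactly once.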
New Prediction Method: 1. Analyzes the data locality in profiling runs: (a) variable-distance sampling, (b) wavelet filtering, (c) optimal phase partitioning. 2. Analyzes the instruction trace and identifies the phase boundaries in the code. 3. Uses grammar compression to identify phase hierarchies and then inserts program markers through binary rewriting.
Phase Marker Selection: This step finds the basic blocks in the code that uniquely mark detected phases. It examines all instruction blocks, since the high-level program structure may be lost due to compiler optimizations.
Phase Marker Selection: Phase detection finds the number of phases but cannot locate the precise time of phase transitions. Its precision is hundreds of memory accesses, whereas a basic block contains only a few memory references. What about gradual transitions?
Phase Marker Selection: Solution? Use the frequency of the phases instead of the time of their transitions. Marker block: a basic block that is always executed at the beginning of a phase, selected based on the frequency found. If a blank region (removed blocks) is larger than a threshold, it is considered a phase execution.
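The frequency-based idea can be sketched as follows, under one reading of the slide: blocks that execute more often than the phases do cannot mark a phase uniquely, so they are removed; each long blank region left behind is treated as one phase execution; and the surviving block just before that region is taken as the marker, since it runs at the beginning of the phase. The function name, the block labels, the threshold value, and the choice of the preceding block as the marker are assumptions for illustration.

```python
from collections import Counter


def find_marker_blocks(block_trace, phase_frequency, gap_threshold):
    """Pick candidate marker blocks from a basic-block trace."""
    counts = Counter(block_trace)
    # Positions of blocks that run no more often than a phase does.
    kept = [i for i, b in enumerate(block_trace) if counts[b] <= phase_frequency]
    markers = []
    # Pair each kept position with the next one; the trace end closes the last region.
    for cur, nxt in zip(kept, kept[1:] + [len(block_trace)]):
        blank = nxt - cur - 1           # number of removed blocks in between
        if blank > gap_threshold and block_trace[cur] not in markers:
            markers.append(block_trace[cur])
    return markers


# Hypothetical trace: two phases, each executed twice, with hot inner blocks x, y, z.
trace = (["A"] + ["x", "y"] * 3 + ["B"] + ["x", "z"] * 3) * 2
print(find_marker_blocks(trace, phase_frequency=2, gap_threshold=4))
# → ['A', 'B']
```

Blocks `x`, `y`, and `z` execute far more than twice and are filtered out, leaving `A` and `B` as the blocks that open each phase execution.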
New Prediction Method: 1. Analyzes the data locality in profiling runs: (a) variable-distance sampling, (b) wavelet filtering, (c) optimal phase partitioning. 2. Analyzes the instruction trace and identifies the phase boundaries in the code. 3. Uses grammar compression to identify phase hierarchies and then inserts program markers through binary rewriting.
Hierarchical Construction: SEQUITUR compresses a string of symbols into a context-free grammar. By constructing the phase hierarchy, we find phases of the largest granularity.
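To show how grammar compression turns a flat sequence of phases into a hierarchy, the sketch below repeatedly replaces the most frequent adjacent pair of symbols with a new rule. This is a simplified, Re-Pair-style offline scheme swapped in for brevity; it is not SEQUITUR, which builds its grammar incrementally under digram-uniqueness and rule-utility constraints. The rule names `R1`, `R2`, ... and the phase labels are hypothetical, and overlapping runs such as `a a a` are counted naively.

```python
from collections import Counter


def compress(sequence):
    """Greedy pair replacement; returns (start sequence, rules)."""
    seq = list(sequence)
    rules = {}                              # rule name -> (left symbol, right symbol)
    while len(seq) > 1:
        pair, count = Counter(zip(seq, seq[1:])).most_common(1)[0]
        if count < 2:
            break                           # no pair repeats, nothing left to factor
        name = f"R{len(rules) + 1}"
        rules[name] = pair
        out, i = [], 0
        while i < len(seq):                 # replace left to right, without overlap
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(name)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand(symbol, rules):
    """Expand a symbol back into the leaf phases it covers."""
    if symbol not in rules:
        return [symbol]
    left, right = rules[symbol]
    return expand(left, rules) + expand(right, rules)


phases = ["p1", "p2", "p1", "p2", "p3"] * 2
start, rules = compress(phases)
print(start)                  # → ['R3', 'R3']
print(expand("R3", rules))    # → ['p1', 'p2', 'p1', 'p2', 'p3']
```

The top-level rule `R3` covers the longest repeating unit, which corresponds to a phase of the largest granularity, while `R1` and `R2` are the smaller composite phases nested inside it.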
Phase Marker Insertion: ATOM, a binary rewriting tool, inserts the markers. The basic phases (the leaves of the phase hierarchy) have unique markers in the program, so their prediction is trivial. Predictions for larger phases are based on the phase hierarchy: a finite automaton recognizes the current phase in the hierarchy.
Evaluation: 1. Measure the granularity and accuracy of phase prediction. 2. Cache resizing. 3. Memory remapping. 4. Test the result against manual phase marking.
Phase Prediction
Phase Prediction
Adaptive Cache Resizing
Memory Remapping: Assumes the support of the Impulse controller. Key requirement: identify when remapping is profitable.
Manual vs Phase
Conclusions: A general method for predicting hierarchical memory phases in programs with input-dependent but consistent phase behavior. It predicts the length and locality with near-perfect accuracy. It reduces cache size by 40% without increasing the number of cache misses. It improves program performance by 35% when used for memory remapping.
Conclusion (cont.): Locality phase detection should benefit modern adaptation techniques for increasing performance, reducing energy, and other improvements.
Questions?