CISC 879 - Machine Learning for Solving Systems Problems

Using Machine Learning to Guide Architecture Simulation
Greg Hamerly (Baylor University), Erez Perelman, Jeremy Lau, Brad Calder (UCSD), Timothy Sherwood (UCSB)
Journal of Machine Learning Research 7 (2006)
http://cseweb.ucsd.edu/~calder/papers/JMLR-06-SimPoint.pdf

Presented by: John Tully, Dept. of Computer & Information Sciences, University of Delaware
Simulation is Critical!
- Allows engineers to understand the cycle-level behavior of a processor before fabrication.
- Lets us play with design options cheaply: how are performance, complexity, area, and power affected when I make modification X and remove feature Y?
But... Simulation is SLOW
- Modelling at the cycle level is very slow. SimpleScalar in cycle-accurate mode: a few hundred million cycles per hour.
- Modelling at the gate level is very, very, very slow. ETI's cutting-edge emulation technology: 5,000 cycles/second (24 hours of emulation = ~1 second of Cyclops-64 instructions).
Demands Are Increasing
- Size of benchmarks: individual applications can be quite large.
- Number of programs: industry-standard benchmarks are large suites, and many focus on variety (e.g. SPEC: 26 programs that stress ALUs, FPUs, memory, caches, etc.).
- Iterations required: experimenting with just one feature (e.g. cache size) can take hundreds of thousands of benchmark runs.
'Current' Remedies
- Simulate programs for N instructions (whatever your timeframe allows), then just stop.
- Similarly, fast-forward through the initialization portion, then simulate N instructions.
- Simulate N instructions from only the "important" (most computationally intensive) portions of a program.
None of these work well, and at their worst they are embarrassing: error rates of almost 4,000%!
SimPoint to the Rescue
1. As a program executes, its behavior changes. The changes aren't random: they are structured as sequences of recurring behavior (termed phases).
2. If this repetitive, structured behavior can be identified, we only need to sample each unique behavior of a program (not the whole thing) to get a picture of its execution profile.
3. How can we identify repetitive, structured behavior? Use machine learning!
Now only a small set of samples is needed: collect a point from each phase (the simulation points) and weight them. This accurately depicts execution of the entire program.
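The weighting step above amounts to a weighted average: the estimated whole-program CPI is the sum over phases of (phase weight × sampled CPI). A minimal sketch, where the phase weights and CPI values are made-up illustrative numbers, not measurements from the paper:

```python
# Sketch: extrapolating whole-program CPI from one sample per phase.
# The weights and CPIs below are hypothetical illustrative numbers.
phases = [
    {"weight": 0.50, "cpi": 1.2},  # e.g. a compute-heavy loop phase
    {"weight": 0.30, "cpi": 2.1},  # e.g. a memory-bound phase
    {"weight": 0.20, "cpi": 0.9},  # e.g. an initialization phase
]

# Each phase's sampled CPI counts in proportion to how much of the
# program's execution that phase covers.
estimated_cpi = sum(p["weight"] * p["cpi"] for p in phases)
print(round(estimated_cpi, 3))  # 1.41
```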
Defining Phase Behavior
Seems pretty easy at first... let's just collect hardware-based statistics and classify phases accordingly:
- CPI (performance)
- Cache miss rates
- Branch statistics (frequency, prediction rate)
- FPU instructions per cycle
But what's the problem here?
Defining Phase Behavior
Problem: if we use hardware-based statistics, we're tying phases to the architectural configuration! Every time we tweak the architecture, we must re-define the phases.
Underlying methodology: identify phases without relying on architectural metrics. Then we can find one set of samples that can be used across the entire design space.
But what can we use that's independent of hardware-based statistics, yet still relates to fundamental changes in what the hardware is doing?
Defining Phase Behavior
Basic Block Vector (BBV): a structure designed to capture how a program's behavior changes over time.
- A distribution of how many times each basic block is executed over an interval (can be stored as a 1-D array).
- Each entry is weighted by the number of instructions in the basic block, so all instructions carry equal weight.
Subsets of the information in BBVs can also be extracted:
- Register usage vectors
- Loop / branch execution frequencies
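The BBV construction described above can be sketched in a few lines. `build_bbv`, the toy block IDs, and the block sizes are all hypothetical; a real profiler builds these vectors from instrumented execution over intervals of millions of instructions:

```python
from collections import Counter

def build_bbv(trace, block_sizes, num_blocks):
    """Build a basic block vector for one interval (a sketch).

    trace: sequence of basic-block IDs executed during the interval.
    block_sizes: instructions per basic block, indexed by block ID.
    Each entry is (execution count * block size), then normalized,
    so that every instruction carries equal weight.
    """
    counts = Counter(trace)
    bbv = [counts.get(b, 0) * block_sizes[b] for b in range(num_blocks)]
    total = sum(bbv)
    return [x / total for x in bbv] if total else bbv

# Toy interval: block 0 has 3 instructions, block 1 has 2, block 2 has 5.
sizes = [3, 2, 5]
interval = [0, 1, 0, 2, 1, 1]
print(build_bbv(interval, sizes, 3))
```

Normalizing by the total instruction count makes BBVs from intervals of slightly different lengths directly comparable.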
Defining Phase Behavior
Now we can use BBVs to find patterns in a program. But can we show they're useful?
A detailed study by Lau et al. found a very strong correlation between:
1. The difference between an interval's BBV and the whole program's BBV (code changes), and
2. The CPI of the interval (performance).
Graphic on the next slide...
Things are looking really good now: we can create a set of phases (and therefore points to simulate) by ONLY looking at executed code.
Defining Phase Behavior
[Figure: per-interval BBV difference from the whole-program BBV tracks per-interval CPI.]
Extracting Phases
Next step: how do we actually turn BBVs into phases?
- Create a function to compare two BBVs: how similar are they?
- Use machine-learning data clustering algorithms to group similar BBVs. Each cluster (set of similar points) = a phase!
SimPoint is the implementation of this:
- Profiles programs (divides them into intervals and creates a BBV for each).
- Uses the k-means clustering algorithm.
- Input includes the granularity of clustering, which dictates the size and abundance of phases.
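The clustering step can be sketched with a bare-bones k-means over BBVs. This is an illustration under simplifying assumptions, not SimPoint's actual implementation (the tool also reduces BBV dimensionality with random projection and scores several values of k before choosing one):

```python
import numpy as np

def kmeans(bbvs, k, iters=50, seed=0):
    """Minimal k-means over BBVs using Euclidean distance (a sketch)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(bbvs, dtype=float)
    # Initialize centers from k distinct intervals.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each interval to its nearest cluster center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center as the mean BBV of its cluster.
        new = np.array([X[labels == c].mean(axis=0) if (labels == c).any()
                        else centers[c] for c in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

# Two obviously different groups of toy BBVs -> two phases.
bbvs = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.0],
        [0.0, 0.1, 0.9], [0.1, 0.0, 0.9]]
labels, centers = kmeans(bbvs, k=2)
print(labels)  # first two intervals share a phase, last two share another
```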
Choosing Simulation Pts
Final step: choose the simulation points.
- From each phase, SimPoint chooses one representative interval that will be simulated (in full detail) to represent the whole phase.
- All points in a phase are (theoretically) similar in their performance statistics, so we can extrapolate.
- Machine learning is also used to pick the representative point of a cluster (the interval to use from a phase).
- Points are weighted based on interval size (and phase size, of course).
- This only needs to be done once per program + input combination. Remember why?
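One common way to realize the selection above is to pick, in each cluster, the interval whose BBV lies closest to the cluster centroid, and weight it by the phase's share of all intervals. `pick_simpoints` is a hypothetical helper written for this sketch, not the SimPoint tool's interface:

```python
import numpy as np

def pick_simpoints(bbvs, labels, centers):
    """For each phase (cluster), pick the interval whose BBV is closest
    to the cluster centroid; weight it by the phase's share of all
    intervals. A sketch, not the SimPoint tool's actual API."""
    X = np.asarray(bbvs, dtype=float)
    simpoints = {}
    for c, center in enumerate(centers):
        members = np.where(np.asarray(labels) == c)[0]
        best = members[np.linalg.norm(X[members] - center, axis=1).argmin()]
        simpoints[int(best)] = len(members) / len(X)
    return simpoints  # {interval index: weight}

# Toy data: intervals 0-2 form one phase, interval 3 forms another.
bbvs = [[0.9, 0.1, 0.0], [0.78, 0.22, 0.0], [0.7, 0.3, 0.0], [0.0, 0.1, 0.9]]
labels = [0, 0, 0, 1]
centers = np.array([[0.85, 0.15, 0.0], [0.05, 0.05, 0.9]])
print(pick_simpoints(bbvs, labels, centers))  # {0: 0.75, 3: 0.25}
```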
Choosing Simulation Pts
The user can tweak the interval length, number of clusters, etc.: a tradeoff between the number of points simulated and simulation time.
Experimental Framework
- Test programs: SPEC benchmarks (26 applications, about half integer and half FP, designed to stress all aspects of a processor).
- Simulation: SimpleScalar, Alpha architecture.
- Metric: simulation accuracy, measured as CPI prediction error.
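The accuracy metric can be written out directly: the relative error of the SimPoint CPI estimate against the CPI from a full simulation. The CPI numbers below are hypothetical, chosen only to illustrate the formula:

```python
def cpi_error(true_cpi, predicted_cpi):
    """CPI prediction error as a percentage of the true
    (full-simulation) CPI."""
    return abs(predicted_cpi - true_cpi) / true_cpi * 100.0

# Hypothetical numbers: full simulation measures CPI 1.50, and the
# SimPoint-based estimate comes out at 1.44.
print(round(cpi_error(1.50, 1.44), 1))  # 4.0
```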
Million Dollar Question...
How does phase classification do?
- SPEC 2000, 100-million-instruction intervals, no more than 10 simulation points.
- gzip, gcc: only 4 and 8 phases found, respectively.
Million Dollar Question...
[Figure: phase classification results.]
Million Dollar Question...
How accurate is this thing? A lot better than the "current" methods...
Million Dollar Question...
[Figure: accuracy results.]
Million Dollar Question...
How much time are we saving?
- In the previous result, we simulate only 400-800 million instructions for the SimPoint results.
- According to the SPEC benchmark data sheet, the 'reference' input configurations are 50 billion and 80 billion instructions, respectively.
- So the baseline simulation needed to execute ~100x more instructions for this configuration, which took several months!
- Imagine if we needed to run a few thousand combinations of cache size, memory latency, etc....
- Intel and Microsoft use it; it must be pretty good.
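The ~100x figure follows directly from the numbers on this slide, taking the upper end of each range:

```python
# Back-of-the-envelope ratio: simulate 400-800 million instructions
# (SimPoint) instead of 50-80 billion (full SPEC reference run).
simpoint_instructions = 800e6   # upper end of the SimPoint sample
full_run_instructions = 80e9    # upper end of the SPEC reference run
print(full_run_instructions / simpoint_instructions)  # 100.0
```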
Putting It All Together
- First application of machine-learning techniques to program phase analysis.
- Main takeaway: applications (even complex ones) exhibit only a few unique behaviors; they're simply interleaved with each other over time.
- Using machine learning, we can find these behaviors with methods that are independent of architectural metrics.
- By doing so, we only need to simulate a few carefully chosen intervals, which greatly reduces simulation time.
Related / Future Work
- Other clustering algorithms on the same data (multinomial clustering, regression trees): k-means appears to do the best.
- "Un-tie" simulation points from the binary. How could we do this?
  - Map behavior back to the source level after detecting it.
  - Then we can use the same simulation points across different compilations / inputs of a program.
  - Accuracy is just about as good as with fixed intervals (Lau et al.).