Phase Capture and Prediction with Applications


Presentation on theme: "Phase Capture and Prediction with Applications"— Presentation transcript:

1 Phase Capture and Prediction with Applications
Martin Hock Brian Pellin Karthik Jayaraman Vivek Shrivastava University of Wisconsin-Madison

2 Phases Definition: A period of execution that exhibits the same characteristics

3 Motivation Programs go through different phases of their execution
Phases are often repeated at different times in execution
During each phase hardware is exercised differently

4 Sample Phase Behavior : gcc

5 Outline Phase Tracking Phase Prediction Applications
Phase Based Branch Prediction Phase Based Cache Configuration Summary / Conclusions

6 Phase Tracking Goal: Identify program phases with different behavior
Based on “Phase Tracking and Prediction” [Sherwood, Sair, Calder]
Use reconfigurable hardware to take advantage of phase information
  Reconfigurable caches
  Instruction window size
  Dynamic branch predictor

7 Detecting Phases Track groups of 10 million instructions
Collect information about instructions and store it
Build a phase footprint
After each 10 million instructions, compare the footprint with past footprints
If the footprint is close enough, it is considered a repetition of the phase

8 Accumulator Branch PC Hash # of inst. since branch +

9 Accumulator Branch PC 2 Hash # of inst. since branch 20 + Branch occurs, must increment entry 2 by 20.

10 Accumulator Branch PC 20 3 Hash # of inst. since branch 80 + New branch, increment entry 3 by 80.

11 Accumulator Branch PC 20 80 Hash # of inst. since branch + After a phase completes we need somewhere to store data about previous phases.

12 Past Footprint Table Accumulator Branch PC 20 80 Hash # of inst. since branch + *At 100 instructions

13 Past Footprint Past Footprint Table Accumulator Branch PC 20 80 Hash # of inst. since branch + Accumulator Data is stored in Past Footprint table

14 Past Footprint Table Past Footprint Accumulator 90 Branch PC 20 5 80 Hash # of inst. since branch 5 + *At 200 instructions Take the Manhattan distance between accumulator and Past Footprints = 190

15 Past Footprint Table Past Footprint Accumulator 90 Branch PC 20 80 5 Hash # of inst. since branch 5 + *At 200 instructions

16 Past Footprint Past Footprint Table Accumulator 90 Branch PC 21 20 79 80 5 Hash # of inst. since branch 5 + *At 300 instructions Manhattan distance between this phase and first phase is 2. This phase is close enough to the first phase to be considered the same as phase one.

17 Past Footprint Past Footprint Table Accumulator 430 Branch PC 21 20 9 10 80 Hash # of inst. since branch 70 + *At 30 million instructions Manhattan distance between this phase and first phase is 2. This phase is close enough to the first phase to be considered the same as phase one.
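
The accumulator walkthrough on slides 8 to 17 can be sketched in software as follows. This is only an illustration, not the hardware design from the talk: the bucket count, the modulo "hash", the threshold value, and all names (`PhaseTracker`, `record_branch`, `end_interval`) are assumptions, and the interval lengths are shrunk to a few branches as on the slides.

```python
from typing import List

NUM_BUCKETS = 8  # assumed accumulator size; the slides do not state one


def manhattan(a: List[int], b: List[int]) -> int:
    """Sum of absolute per-entry differences between two footprints."""
    return sum(abs(x - y) for x, y in zip(a, b))


class PhaseTracker:
    """Hypothetical software model of the accumulator and past footprint table."""

    def __init__(self, threshold: int, num_buckets: int = NUM_BUCKETS) -> None:
        self.threshold = threshold          # "close enough" distance (assumed value)
        self.num_buckets = num_buckets
        self.accumulator = [0] * num_buckets
        self.past_footprints: List[List[int]] = []

    def record_branch(self, branch_pc: int, insts_since_branch: int) -> None:
        # Stand-in hash: the real hash function is not given on the slides.
        bucket = branch_pc % self.num_buckets
        self.accumulator[bucket] += insts_since_branch

    def end_interval(self) -> int:
        """Close the current interval and return the phase id it belongs to."""
        footprint = self.accumulator
        self.accumulator = [0] * self.num_buckets
        best_id, best_dist = -1, None
        for phase_id, past in enumerate(self.past_footprints):
            dist = manhattan(footprint, past)
            if best_dist is None or dist < best_dist:
                best_id, best_dist = phase_id, dist
        if best_dist is not None and best_dist <= self.threshold:
            return best_id                  # repetition of a known phase
        # New phase: store its footprint (stored footprints are never updated here).
        self.past_footprints.append(footprint)
        return len(self.past_footprints) - 1
```

With a threshold of 10, an interval whose footprint is (20, 80) in entries 2 and 3 starts phase 0, and a later interval with (21, 79) in the same entries is at Manhattan distance 2 and is classified as phase 0 again.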

18 Outline Phase Tracking Phase Prediction Applications
Phase Based Branch Prediction Phase Based Cache Configuration Summary / Conclusions

19 Phase prediction When we detect a phase, it’s over
In order to adjust hardware, we need to know what phase we are in
Three strategies
  Last seen
  Markov with RLE
  Perceptron

20 Last seen Predict next phase = last phase
Because last seen is so simple, another predictor would have to beat it significantly to justify the added cost

21 RLE Markov Adapted from Sherwood
Assumes that if we see phase X exactly Y times in a row, followed by phase Z, then if we see phase X exactly Y times again, it will again be followed by Z
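
A minimal sketch of the run-length rule described on this slide, written as a lookup table keyed by (phase, run length). The class and method names are hypothetical, and falling back to the last seen phase when the table has no entry is an assumption rather than something stated in the talk.

```python
from typing import Dict, Optional, Tuple


class RLEMarkovPredictor:
    """Hypothetical run-length-encoded Markov predictor for phase ids."""

    def __init__(self) -> None:
        # (phase X, run length Y) -> phase Z that followed that run last time
        self.table: Dict[Tuple[int, int], int] = {}
        self.current: Optional[int] = None
        self.run_length = 0

    def observe(self, phase: int) -> None:
        """Record the phase id of the interval that just finished."""
        if self.current is None:
            self.current, self.run_length = phase, 1
        elif phase == self.current:
            self.run_length += 1
        else:
            # The run of `current` ended after `run_length` intervals.
            self.table[(self.current, self.run_length)] = phase
            self.current, self.run_length = phase, 1

    def predict(self) -> Optional[int]:
        """Predict the next phase; None until something has been observed."""
        if self.current is None:
            return None
        # Assumed fallback: with no matching entry, behave like "last seen".
        return self.table.get((self.current, self.run_length), self.current)
```

For example, after observing phases 0, 0, 0, 1, 0, 0 the sketch predicts 0 (no entry for a run of two), and after one more 0 it predicts 1, because a run of exactly three 0s was followed by 1 before.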

22 Perceptron Individual perceptrons work in binary (±1)
Given history h1, h2, …, hn (±1) and weights w0, w1, w2, …, wn (integers), compute S = w0 + w1h1 + w2h2 + … + wnhn
If S ≥ 0, predict “yes”, else predict “no”
To train: if hi = current, increment wi, else decrement wi (for w0, add current)
But there are many phases, not just 2
Combine perceptrons for multivalue prediction

23 Multivalue perceptron
We have perceptrons P1, P2, …, Pn
Perceptron Pi tries to predict phase i
Train Pi only if in phase i
History hi = 1 if it agrees with the current phase, -1 if it disagrees
Have the perceptrons vote for who is correct – most positive one wins
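
One possible reading of the multivalue scheme, sketched under several assumptions: the history is the list of recent phase ids, the input to perceptron Pi is +1 where a history entry equals phase i and -1 otherwise, only the perceptron of the phase that actually occurred is trained (with target +1), and ties are broken toward the lowest phase id. The class name, the history layout, and the initial all-zero history are illustrative choices, not details from the talk.

```python
from typing import List


class MultiPhasePerceptron:
    """Hypothetical vote among one perceptron per phase."""

    def __init__(self, num_phases: int, history_len: int) -> None:
        # weights[i][0] is the bias w0 of Pi; weights[i][1:] pair with the history.
        self.weights = [[0] * (history_len + 1) for _ in range(num_phases)]
        # Recent phase ids, most recent first (assumed to start in phase 0).
        self.history: List[int] = [0] * history_len

    def _inputs(self, phase: int) -> List[int]:
        # +1 where the history entry agrees with this perceptron's phase, else -1.
        return [1 if h == phase else -1 for h in self.history]

    def _score(self, phase: int) -> int:
        w = self.weights[phase]
        return w[0] + sum(wj * hj for wj, hj in zip(w[1:], self._inputs(phase)))

    def predict(self) -> int:
        scores = [self._score(p) for p in range(len(self.weights))]
        # Most positive perceptron wins; ties go to the lowest phase id.
        return max(range(len(scores)), key=lambda p: scores[p])

    def update(self, actual: int) -> None:
        """Train only the perceptron of the phase that actually occurred."""
        w = self.weights[actual]
        for j, hj in enumerate(self._inputs(actual), start=1):
            w[j] += hj  # target is +1: increment on agreement, decrement otherwise
        w[0] += 1       # bias moves toward the target
        self.history = [actual] + self.history[:-1]
```

Because each perceptron is trained only when its own phase occurs, a phase that has rarely been seen keeps small weights, which is consistent with the slow adaptation noted later in the talk.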

24 Phase prediction results
GCC: Last phase: 96% accurate RLE Markov: 94% accurate Perceptron: much lower

25 Phase prediction comments
Sherwood had lower accuracy for last phase (70%), perhaps due to oscillation
Training cost of multiple perceptrons means that the predictor does not always adapt quickly
Not worth improving due to the accuracy of last phase

26 Outline Phase Tracking Phase Prediction Applications
Phase Based Branch Prediction Phase Based Cache Configuration Summary / Conclusions

27 Phase Based Dynamic Branch Predictor
Previous research shows the usefulness of adapting branch predictors at run time
  “Dynamic history-length fitting: a third level of adaptivity for branch prediction” [Juan, Sanjeevan, Navarro]
  “Combining Branch Predictors” [McFarling]
A single branch predictor may not perform well within and across different executions
  “A Study of Branch Prediction Strategies” [Smith]
Program behavior is almost uniform within a phase -> choose the best predictor for each phase

28 Methodology Select a small group of relevant predictors
At the beginning of each new phase, sample all the predictors and choose the best Save the best for each phase and use it if a phase reoccurs

32 Methodology Select a small group of relevant predictors
At the beginning of each new phase, sample all the predictors and choose the best Save the best for each phase and use it if a phase reoccurs Phase 1

33 Methodology Select a small group of relevant predictors
At the beginning of each new phase, sample all the predictors and choose the best Save the best for each phase and use it if a phase reoccurs Phase 1 Phase 2

34 Dynamic Adaptations Possible dynamic adaptations
Multiple Branch Predictors
  2Level, Bimodal
  Sample each for one profiling period
  Select on basis of [miss rate, number of mis-speculated instructions, …]
Varying History Lengths
  History lengths [0,12]
  Some workloads give better performance with smaller history

35 Multiple Branch Predictors
Set of predictors
  2level [1:1024:8] (Baseline predictor)
  Bimodal [1024]
  2level [8:512:8]
  2level [1:512:8]
Profile period
  10 million instructions

36 Multiple Branch Predictors
Simulator Used
  Simplescalar v3.0d
Set of benchmarks
  gcc, vpr, mcf, ammp, art
Selection Criterion
  Least Miss Rate
  If miss rates of two predictors are within 1%, select the less expensive (simpler) one
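
The sampling methodology and the selection criterion can be sketched together as below. This is an illustration only: the names are hypothetical, candidates are assumed to be listed from simplest to most expensive, "within 1%" is interpreted as one absolute percentage point of miss rate, and the miss rates in the usage note are made-up inputs rather than measured results.

```python
from typing import Callable, Dict, List, Tuple

Sample = Tuple[str, float]  # (predictor name, miss rate as a fraction)


def choose_predictor(samples: List[Sample], tolerance: float = 0.01) -> str:
    """Pick the simplest predictor whose miss rate is within `tolerance` of the best.

    `samples` must be ordered from least to most expensive (an assumption of
    this sketch), so the first candidate inside the tolerance band wins.
    """
    best_rate = min(rate for _, rate in samples)
    for name, rate in samples:
        if rate - best_rate <= tolerance:
            return name
    raise ValueError("samples must not be empty")


class PhasePredictorSelector:
    """Remembers the chosen predictor per phase so a phase is profiled only once."""

    def __init__(self) -> None:
        self.best_for_phase: Dict[int, str] = {}

    def select(self, phase_id: int, profile: Callable[[], List[Sample]]) -> str:
        if phase_id not in self.best_for_phase:
            # New phase: sample every candidate, then save the winner.
            self.best_for_phase[phase_id] = choose_predictor(profile())
        return self.best_for_phase[phase_id]
```

With hypothetical inputs, `choose_predictor([("bimodal", 0.060), ("2level", 0.055)])` returns `"bimodal"` because the two rates differ by less than one point, while raising the bimodal miss rate to 0.090 makes the function return `"2level"`.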

37 Multiple Branch Predictor : Results IPC (gcc)

38 Multiple Branch Predictors: Results Branch Predictor Misses (gcc)

39 Multiple Branch Predictor : Results IPC (vpr)

40 Multiple Branch Predictors: Results Branch Predictor Misses (vpr)

41 Multiple Branch Predictors: Results Branch Predictor Misses (mcf)

42 Multiple Branch Predictors IPC Comparison

43 Multiple Branch Predictors Branch Prediction Misses Comparison

44 Varying History Length
G-share predictor with varying history lengths
Set of history lengths sampled
  [0,3,6,8,12]
Selection Criterion
  Least Miss Rate
  If miss rates of two predictors are within 1%, select the less expensive (simpler) one
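
For readers unfamiliar with g-share, the sketch below shows how the history length enters the predictor: the table index is the branch PC XORed with the last few branch outcomes. The table size of 2^12 two-bit counters, the initial counter value, and the class name are assumptions for illustration and are not the Simplescalar configuration used in the talk.

```python
class GShare:
    """Hypothetical g-share predictor with a configurable history length."""

    def __init__(self, history_bits: int, table_bits: int = 12) -> None:
        self.history_mask = (1 << history_bits) - 1   # 0 when history_bits == 0
        self.index_mask = (1 << table_bits) - 1
        self.counters = [1] * (1 << table_bits)       # 2-bit counters, weakly not-taken
        self.history = 0                              # most recent outcome in bit 0

    def _index(self, pc: int) -> int:
        return (pc ^ self.history) & self.index_mask

    def predict(self, pc: int) -> bool:
        """Predict taken when the counter is in one of the two upper states."""
        return self.counters[self._index(pc)] >= 2

    def update(self, pc: int, taken: bool) -> None:
        idx = self._index(pc)
        if taken:
            self.counters[idx] = min(3, self.counters[idx] + 1)
        else:
            self.counters[idx] = max(0, self.counters[idx] - 1)
        # Shift in the outcome; with zero history bits the register stays 0.
        self.history = ((self.history << 1) | int(taken)) & self.history_mask
```

With a history length of 0 the index depends only on the PC, so the sketch behaves like a bimodal predictor. With 3 bits of history it can learn a strictly alternating branch, which a history-free counter cannot.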

45 Varying History Length
Set of benchmarks
  gcc, mcf
Simulator Used
  Simplescalar v3.0d
Profile Period
  10 million instructions

46 Varying History Length: Results IPC (gcc)

47 Varying History Length: Results Branch Predictor Misses (gcc)

48 Varying History Length: Results Instruction Cache Misses (IL1) (gcc)

49 Outline Phase Tracking Phase Prediction Applications
Phase Based Branch Prediction Phase Based Cache Configuration Summary / Conclusions

50 Cache optimization Smaller caches use less power
Some phases of execution will use less memory or execute a smaller region of code and therefore need less cache
We can use a smaller cache for these phases without affecting performance

51 Methodology Try 4 possibilities of data and instruction cache simultaneously
Data cache and instruction cache misses should be independent
Select the best combination
[Figure: data and instruction cache choices for Phase 1 and Phase 2]

52 Cache optimization results
GCC IPC
  Fixed 32K cache (16K + 16K): 1.807
  Fixed 128K cache (64K + 64K): 1.896
  Optimizer: 1.855
    Average: 49K total

53 Cache comparison

54 Outline Phase Tracking Phase Prediction Applications
Phase Based Branch Prediction Phase Based Cache Configuration Summary / Conclusions

55 Summary Significant reduction in branch mispredictions (29.88% %) using phase based branch predictors
Simple predictors beat more complex predictors in many phases
Marginal gains in IPC using multiple branch predictors (2.24% %)
Marginal gains in IL1 misses using phase based multiple branch predictors

56 Summary (cont...) Phase based dynamic history length fitting does not give good gains

57 Conclusions [1] Phase based optimizations provide scope for improvements using reconfigurable hardware
Using a phase specific branch predictor provides good improvements in mispredictions
A good strategy for saving power, as fewer mispredictions may reduce the number of mis-speculated instructions

58 Conclusion [2] However, varying history length does not result in substantial savings
More benchmarks need to be considered to understand the effect of history length adaptations

59 Questions??

