Published by Bryanna Norland. Modified over 9 years ago.
1 Wish Branches: Combining Conditional Branching and Predication for Adaptive Predicated Execution
Hyesoon Kim, Onur Mutlu, Jared Stark*, Yale N. Patt
Electrical and Computer Engineering, The University of Texas at Austin
*Oregon Microarchitecture Lab, Intel Corporation
2 Talk Outline
- Problem
- Wish Branches
- Experimental Methodology
- Results
- Conclusion
3 Predicated Execution
Convert control-flow dependency to data dependency.

Source code:
    if (cond) { b = 0; } else { b = 1; }

Normal branch code (blocks A, B, C, D; C is the taken path, B the not-taken path):
    A: p1 = (cond)
       branch p1, TARGET
    B: mov b, 1
       jmp JOIN
    C: TARGET: mov b, 0
    D: JOIN: add x, b, 1

Predicated code:
    A: p1 = (cond)
    B: (!p1) mov b, 1
    C: (p1) mov b, 0
    D: add x, b, 1

Pro: eliminates hard-to-predict branches.
Cons: (1) blocks B and C are fetched all the time; (2) predicated instructions must wait until p1 is resolved.
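The if-conversion on this slide can be mimicked in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the guarded writes merely model predicated `mov` instructions, not real hardware behavior.

```python
def branch_version(cond: bool) -> int:
    """Normal branch code: only one of blocks B and C is executed."""
    if cond:
        b = 0          # block C (taken path): mov b, 0
    else:
        b = 1          # block B (not-taken path): mov b, 1
    return b + 1       # block D: add x, b, 1


def predicated_version(cond: bool) -> int:
    """Predicated code: both moves are visited; the predicate decides which one writes b."""
    p1 = cond
    b = None
    # (!p1) mov b, 1  followed by  (p1) mov b, 0
    for predicate, value in ((not p1, 1), (p1, 0)):
        if predicate:  # models a guarded write, not a fetch redirect
            b = value
    return b + 1       # block D must wait for b, which depends on p1


# Both versions compute the same x for either outcome of cond.
for cond in (False, True):
    assert branch_version(cond) == predicated_version(cond)
```

The loop in `predicated_version` always visits both moves, which mirrors con (1); the final add depends on a value that is only known once `p1` is, which mirrors con (2).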
4 The Overhead of Predicated Execution
Predicated code:
    A: p1 = (cond)
    B: (!p1) mov b, 1
    C: (p1) mov b, 0
    D: add x, b, 1

Ideal case, with predicate values already known:
    p1 = (cond)
    (0) mov b, 1
    (1) mov b, 0

If all overhead were ideally eliminated, predicated execution would improve average execution time by 16%.
[Chart: execution time relative to non-predicated code; data labels: -2%, 13%, 16%]
5 The Problem
- Due to the predication overhead, predicated execution sometimes reduces performance.
- Branch misprediction characteristics depend on run-time behavior: input set, control-flow path, and phase behavior.
- The compiler cannot accurately estimate the run-time behavior of branches.
7 Wish Branches
- A new type of control-flow instruction.
- Three types: wish jump, wish join, and wish loop.
- The compiler generates code (with wish branches) that can be executed either as predicated code or as non-predicated code (normal branch code).
- At run time, the hardware decides whether to execute predicated code or normal branch code, based on the confidence of the branch prediction.
  - Easy to predict: normal branch code
  - Hard to predict: predicated code
8 Wish Jump/Join
Normal branch code:
    A: p1 = (cond)
       branch p1, TARGET
    B: mov b, 1
       jmp JOIN
    C: TARGET: mov b, 0
    D: JOIN:

Predicated code:
    A: p1 = (cond)
    B: (!p1) mov b, 1
    C: (p1) mov b, 0

Wish jump/join code:
    A: p1 = (cond)
       wish.jump p1 TARGET       (wish jump)
    B: (!p1) mov b, 1
       wish.join !p1 JOIN        (wish join)
    C: TARGET: (p1) mov b, 0
    D: JOIN:

High confidence (predicates shown as (1)):
    B: (1) mov b, 1
       wish.join (1) JOIN
    C: TARGET: (1) mov b, 0

Low confidence:
    B: (!p1) mov b, 1
       wish.join !p1 JOIN
    C: TARGET: (p1) mov b, 0

[Diagram labels: Taken (T), Not-Taken (N), nop]
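The two execution modes of the wish jump/join code can be sketched as a small Python model. It is a simplification under stated assumptions: the function name and return shape are hypothetical, a high-confidence misprediction is modeled as a flush followed by a refetch of the correct path, and in low-confidence mode both wish branches are assumed to fall through so that B and C are both fetched.

```python
def run_wish_jump_join(cond: bool, high_confidence: bool, predicted_taken: bool):
    """Return (blocks on the final fetched path, value of b, whether a flush occurred)."""
    p1 = cond
    if high_confidence:
        # Wish branches behave like normal branches: follow the prediction.
        fetched = ["A", "C", "D"] if predicted_taken else ["A", "B", "D"]
        flushed = predicted_taken != p1          # wrong path fetched
        if flushed:                              # flush, then refetch the correct path
            fetched = ["A", "C", "D"] if p1 else ["A", "B", "D"]
    else:
        # Assumption: both wish branches fall through, so B and C are both fetched.
        fetched = ["A", "B", "C", "D"]
        flushed = False

    b = None
    if "B" in fetched and not p1:                # (!p1) mov b, 1
        b = 1
    if "C" in fetched and p1:                    # (p1) mov b, 0
        b = 0
    return fetched, b, flushed


# Low confidence: no flush regardless of the outcome, at the cost of fetching B and C.
print(run_wish_jump_join(cond=True, high_confidence=False, predicted_taken=False))
# High confidence with a wrong prediction: correct result, but only after a flush.
print(run_wish_jump_join(cond=True, high_confidence=True, predicted_taken=False))
```

In this model the value of `b` is the same in every mode; only the fetched blocks and the flush behavior differ, which is the trade-off the hardware chooses between at run time.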
9 Wish Loop
Source code:
    do { a++; i++; } while (i<N);

Normal backward branch code:
    X: LOOP: add a, a, 1
             add i, i, 1
             p1 = (i<N)
             branch p1, LOOP
    Y: EXIT:

Wish loop code:
    H: mov p1, 1
    X: LOOP: (p1) add a, a, 1
             (p1) add i, i, 1
             (p1) p1 = (cond)
             wish.loop p1, LOOP
    Y: EXIT:

High confidence: predicates shown as (1).
Low confidence: the loop body executes as predicated code.
[Diagram labels: blocks H, X, Y; edges T (taken), N (not-taken)]
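A rough Python model of the wish loop in low-confidence mode is given here. The parameter `fetched_iterations` is a hypothetical stand-in for however many iterations the front end fetches before predicting loop exit; it is not a quantity named on the slide.

```python
def normal_loop(a: int, i: int, n: int):
    """do { a++; i++; } while (i < N);"""
    while True:
        a += 1
        i += 1
        if not (i < n):
            return a, i


def wish_loop(a: int, i: int, n: int, fetched_iterations: int):
    """Low-confidence mode: every fetched iteration is guarded by p1."""
    p1 = True                       # H: mov p1, 1
    for _ in range(fetched_iterations):
        if p1:                      # (p1) add a, a, 1 ; (p1) add i, i, 1
            a += 1
            i += 1
            p1 = i < n              # (p1) p1 = (cond)
        # once p1 is false, the remaining fetched iterations act as nops
    return a, i, p1                 # p1 still true means the loop exited too early


print(normal_loop(0, 0, 3))
print(wish_loop(0, 0, 3, fetched_iterations=5))   # extra iterations change nothing
print(wish_loop(0, 0, 3, fetched_iterations=2))   # too few iterations: p1 is still true
```

Fetching more iterations than needed leaves the architectural state identical to the normal loop, whereas fetching too few leaves `p1` true, which in real hardware would require recovery.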
10 Mispredicted Case 1: Early-Exit
Correct execution:           X1 (T) -> X2 (T) -> X3 (N) -> Y
Early-exit (low confidence): X1 (T) -> X2 (N) -> Y, flush pipeline, then X3 (N) -> Y
Compared to normal branch code: predicate data dependency and one extra instruction (-)
11 Mispredicted Case 2: Late-Exit
Correct execution:          X1 (T) -> X2 (T) -> X3 (N) -> Y
Late-exit (low confidence): X1 (T) -> X2 (T) -> X3 (T) -> X4 (T) -> X5 (N) -> Y, with the extra iterations (X4, X5) executing as nops
Compared to normal branch code:
- Pro: reduces the flush penalty (+++)
- Con: predicate data dependency and one extra instruction (-)
12 Mispredicted Case 3: No-Exit
Correct execution:        X1 (T) -> X2 (T) -> X3 (N) -> Y
No-exit (low confidence): X1 (T) -> X2 (T) -> X3 (T) -> X4 (T) -> X5 (T) -> X6 (T) -> ..., flush pipeline, then Y
Compared to normal branch code: predicate data dependency and one extra instruction (-)
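The three mispredicted wish-loop cases can be summarized in a small classifier. This is a sketch under assumptions: `actual` and `predicted` are hypothetical iteration counts, and the no-exit case is modeled by passing `None`, meaning the front end has not predicted an exit by the time the misprediction is detected.

```python
from typing import Optional, Tuple


def classify_exit(actual: int, predicted: Optional[int]) -> Tuple[str, bool]:
    """Return (case name, whether a pipeline flush is needed) for a low-confidence wish loop."""
    if predicted is None:
        return "no-exit", True        # still fetching loop iterations: flush, then fetch Y
    if predicted < actual:
        return "early-exit", True     # left the loop too soon: flush, then fetch the missing iterations
    if predicted > actual:
        return "late-exit", False     # extra iterations turn into nops: no flush
    return "correct", False


print(classify_exit(actual=3, predicted=2))     # early-exit
print(classify_exit(actual=3, predicted=5))     # late-exit
print(classify_exit(actual=3, predicted=None))  # no-exit
```

Only the late-exit case avoids the flush, which matches the slides marking it as the one case with a benefit over normal branch code.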
13 Advantages/Disadvantages of Wish Branches
Advantages compared to predicated execution:
- Reduce the overhead of predication.
- Increase the benefits of predicated code by allowing the compiler to generate more aggressively predicated code.
- Provide a mechanism to exploit predication to reduce the branch misprediction penalty for backward branches (wish loops).
- Make predicated code less dependent on machine configuration (e.g., the branch predictor).
14 Advantages/Disadvantages of Wish Branches
Disadvantages compared to predicated execution:
- Extra branch instructions use machine resources.
- Extra branch instructions increase the contention for branch predictor table entries.
- Wish branches may constrain the compiler's scope for code optimizations.
15 Wish Branch Support
ISA support:
- Predicated execution and wish branch instructions
Compiler support:
- Wish branch generation algorithms
- The compiler needs to decide which branches are predicated, which are converted to wish branches, and which stay as normal branches
Hardware support:
- Confidence estimator
- Front end and branch misprediction detection/recovery module
17 Experimental Infrastructure
- IA-64 provides full support for predication.
- IA-64 traces are converted to micro-ops to simulate an out-of-order superscalar processor model.
Tool flow: Source Code -> IA-64 Compiler (ORC) -> IA-64 Binary -> Trace generation module -> IA-64 Trace -> Micro-op Translator -> µops -> Micro-op Simulator
18 Simulation Methodology
Nine SPEC 2000 integer benchmarks.
Baseline processor configuration:
- Front end
  - Large and accurate branch predictor (64KB hybrid branch predictor: gshare + local)
  - Minimum 30-cycle branch misprediction penalty
  - 64KB, 2-cycle latency I-cache
- Execution core
  - 8-wide out-of-order processor
  - 512-entry instruction window
- Confidence estimator
  - 1KB tagged, 16-bit history JRS confidence estimator (Jacobsen et al., MICRO-29)
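To give a feel for the confidence estimator listed above, here is a simplified resetting-counter sketch in the spirit of the JRS scheme. It is not the evaluated configuration: the table is untagged, and the class name, table size, counter width, threshold, and indexing function are all assumptions chosen for illustration.

```python
class ResettingCounterEstimator:
    """Untagged table of saturating counters that reset on a branch misprediction."""

    def __init__(self, entries=1024, history_bits=16, counter_max=15, threshold=15):
        self.entries = entries                    # assumed table size
        self.history_mask = (1 << history_bits) - 1
        self.counter_max = counter_max            # assumed 4-bit counters
        self.threshold = threshold                # assumed confidence threshold
        self.table = [0] * entries
        self.history = 0                          # global branch history register

    def _index(self, pc: int) -> int:
        # Assumed indexing: branch address XOR global history, folded into the table.
        return (pc ^ self.history) % self.entries

    def is_high_confidence(self, pc: int) -> bool:
        return self.table[self._index(pc)] >= self.threshold

    def update(self, pc: int, prediction_correct: bool, taken: bool) -> None:
        idx = self._index(pc)
        if prediction_correct:
            self.table[idx] = min(self.table[idx] + 1, self.counter_max)
        else:
            self.table[idx] = 0                   # reset on misprediction
        self.history = ((self.history << 1) | int(taken)) & self.history_mask


# Usage: a wish branch would run as a normal branch only when confidence is high.
est = ResettingCounterEstimator(history_bits=0, threshold=3)
for _ in range(3):
    est.update(pc=0x40, prediction_correct=True, taken=True)
mode = "normal branch code" if est.is_high_confidence(0x40) else "predicated code"
print(mode)
```

A counter has to see a run of correct predictions before its branch is treated as easy to predict, and a single misprediction sends the branch back to predicated execution.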
20 Performance Improvement
Compared configurations:
- SELECTIVE-PREDICATION: branches are selectively predicated using compile-time cost-benefit analysis.
- AGGRESSIVE-PREDICATION: all branches that are suitable for if-conversion are predicated.
Chart callouts (in slide order):
- 16% over conditional branch prediction, 11% over selective-predication, 7% over aggressive-predication (all w/o mcf)
- 14% over conditional branch prediction, 13% over selective-predication, 16% over aggressive-predication
- 12% over conditional branch prediction, 11% over selective-predication, 13% over aggressive-predication
[Chart: results relative to non-predicated code; data labels: 24%, 8%, 14%, -4%, 2.02]
22 Conclusion
- New control-flow instructions: wish branches (jump/join/loop).
- Wish branches improve performance by dividing the work of predication between the compiler and the microarchitecture.
  - Compiler: analyzes the control-flow graph and generates code
  - Microarchitecture: makes the run-time decision to use predication
- Wish branches provide significant performance benefits:
  - 16% compared to conditional branch prediction
  - 13% compared to selectively predicated code
- Wish branches can make predicated execution more viable and effective in high-performance processors by enabling adaptive and aggressive predicated execution.