Predication ECE 721 Prof. Rotenberg.


1 Predication ECE 721 Prof. Rotenberg

2 If-Conversion
- Technique for removing a branch
  - Always fetch the control-dependent instructions, but conditionally execute them
- If-conversion is implemented via “predication”, also known as “guarding”
- Two styles of predication support in the ISA
  - General predication
  - Conditional moves
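
As a rough illustration of if-conversion, the sketch below contrasts a branchy function with an if-converted one. The function names and the max-style example are hypothetical and not taken from the slides, and Python stands in for the machine: the arithmetic select only models "always execute, conditionally commit".

```python
def max_branchy(a, b):
    """Original form: a branch decides whether the assignment executes."""
    m = a
    if a < b:          # branch; the assignment below is control-dependent
        m = b
    return m


def max_if_converted(a, b):
    """If-converted form: the control-dependent work always runs,
    and a predicate decides whether its result is kept."""
    pred = a < b       # predicate replaces the branch condition
    candidate = b      # control-dependent instruction, always executed
    m = a
    # Arithmetic select: keeps `candidate` when pred is true, else the old m.
    m = pred * candidate + (1 - pred) * m
    return m
```

Both functions return the same value for any pair of integers; only the second has no data-dependent control flow in its body.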

3 General Predication
- Predicate register file
  - Example: pred0 – pred7 (8 predicate registers)
  - Each predicate register is just 1 bit
  - A predicate is either false (0) or true (1)
- Some or all instruction opcodes may be predicated, depending on the ISA
- A predicated opcode has an additional source register (implicit or explicit), which is a predicate register
- Example
    ADD rd, rs, rt, pred
    if (pred) rd = rs + rt else NOP
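
A minimal sketch of these semantics follows. The `Machine` class, the 32-entry general-purpose register file, and the `ceq` compare helper are assumptions made for illustration; only the eight 1-bit predicate registers and the behaviour of the predicated ADD come from the slide.

```python
NUM_PRED_REGS = 8    # pred0 .. pred7, as in the slide's example
NUM_GPRS = 32        # assumed size; the slide does not specify it


class Machine:
    """Toy architectural state: general-purpose and predicate registers."""

    def __init__(self):
        self.regs = [0] * NUM_GPRS
        self.preds = [False] * NUM_PRED_REGS   # each predicate is 1 bit

    def ceq(self, pd, rs, rt):
        """Hypothetical compare: set predicate pd to (regs[rs] == regs[rt])."""
        self.preds[pd] = self.regs[rs] == self.regs[rt]

    def add(self, rd, rs, rt, pred):
        """ADD rd, rs, rt, pred: if (pred) rd = rs + rt else NOP."""
        if self.preds[pred]:
            self.regs[rd] = self.regs[rs] + self.regs[rt]
        # Predicate false: architecturally a NOP, rd keeps its old value.
```

For example, after `m.ceq(5, 1, 2)` the call `m.add(3, 3, 4, 5)` updates register 3 only when registers 1 and 2 hold equal values.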

4 General Predication (cont.)
- Advantages of general predication
  - Ability to directly predicate any instruction is more efficient than conditional moves, in terms of dynamic instruction count
  - Thus permits more aggressive application of predication (predicate larger if/else regions)
- Disadvantages of general predication
  - ISA design: Specifying a predicate register in many or all instructions takes instruction encoding space
  - Microarchitecture design: More complex microarchitecture by virtue of almost every instruction having an extra source register

5 General Predication (cont.)
- ISAs that have general predication
  - VLIW ISAs
    - TI DSPs
    - Intel IA64 (general predicate register file)
    - Heavy reliance on compiler-based scheduling requires getting rid of as many branches as possible (enlarge basic blocks for larger static scheduling scope)
  - Some “RISC” ISAs
    - ARM (condition codes serve as predicate registers)

6 Conditional Moves
- Most major commercial ISAs had to retrofit predication support (e.g., x86, Alpha, MIPS)
  - Not feasible to extend existing instruction formats to reference predicate registers
  - Instead, add a single instruction opcode called a “conditional move”, which is a predicated move
- Example
    CMOV rd, rs, pred
    if (pred) rd = rs else NOP

ISA         | CMOV specification | Comment
------------|--------------------|--------------------------------------------
x86         | CMOVcc rd, rs      | “pred” is a test of condition codes. The test is specified in the opcode.
Alpha, MIPS | CMOVx rd, rs, rt   | “pred” is a test of a general-purpose register (rt). The test is specified in the opcode.
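
The two CMOV styles can be sketched as follows. This is a simplified model, not real instruction encodings: the function names, the `flags` dictionary with a `"ZF"` entry, and passing the opcode's test as a Python callable are all assumptions for illustration.

```python
def cmov_flags_style(regs, flags, rd, rs, test):
    """x86-like CMOVcc rd, rs: "pred" is a test of the condition codes.
    `test` stands in for the condition selected by the opcode."""
    if test(flags):
        regs[rd] = regs[rs]
    # Otherwise a NOP: rd keeps its old value.


def cmov_gpr_style(regs, rd, rs, rt, test):
    """Alpha/MIPS-like CMOVx rd, rs, rt: "pred" is a test of the
    general-purpose register rt, again selected by the opcode."""
    if test(regs[rt]):
        regs[rd] = regs[rs]


# Hypothetical opcode tests used in the usage note below.
def zero_flag_set(flags):
    return flags["ZF"]


def is_zero(value):
    return value == 0
```

With `regs = {1: 7, 2: 9, 3: 0}`, the call `cmov_gpr_style(regs, 1, 2, 3, is_zero)` copies register 2 into register 1 because register 3 holds zero; the flags style needs an earlier compare instruction to have set the condition codes.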

7 Examples
Register assignment: r1: x, r2: y, r3: sum; r4, r5, r6: temps

Source code:
    if (x == y) sum++;
  General predication:
    CEQ pred5, r1, r2
    ADDI r3, r3, #1, pred5
  Conditional moves:
    SUB r6, r1, r2
    ADDI r4, r3, #1
    CMOVZ r3, r4, r6

Source code:
    if (x == y) sum++;
    else sum--;
  General predication:
    CEQ pred5, r1, r2
    ADDI r3, r3, #1, pred5
    SUBI r3, r3, #1, !pred5
  Conditional moves:
    SUB r6, r1, r2
    ADDI r4, r3, #1
    SUBI r5, r3, #1
    CMOVZ r3, r4, r6
    CMOVNZ r3, r5, r6
  OR
    SUB r6, r1, r2
    ADDI r4, r3, #1
    SUBI r3, r3, #1
    CMOVZ r3, r4, r6
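
One way to check that the if/else sequences agree with the source code is to transcribe each one line by line. In this sketch, Python variables stand in for registers, the function names are hypothetical, and each function also returns the number of instructions in its sequence. Python integers do not wrap around, which does not affect the zero test on `r6`.

```python
def source_code(x, y, total):
    """Reference behaviour of: if (x == y) sum++; else sum--;"""
    if x == y:
        total += 1
    else:
        total -= 1
    return total


def general_predication(x, y, total):
    r1, r2, r3 = x, y, total
    pred5 = (r1 == r2)        # CEQ  pred5, r1, r2
    if pred5:                 # ADDI r3, r3, #1, pred5
        r3 = r3 + 1
    if not pred5:             # SUBI r3, r3, #1, !pred5
        r3 = r3 - 1
    return r3, 3              # (result, dynamic instruction count)


def cmov_two_temps(x, y, total):
    r1, r2, r3 = x, y, total
    r6 = r1 - r2              # SUB    r6, r1, r2
    r4 = r3 + 1               # ADDI   r4, r3, #1
    r5 = r3 - 1               # SUBI   r5, r3, #1
    if r6 == 0:               # CMOVZ  r3, r4, r6
        r3 = r4
    if r6 != 0:               # CMOVNZ r3, r5, r6
        r3 = r5
    return r3, 5


def cmov_one_temp(x, y, total):
    r1, r2, r3 = x, y, total
    r6 = r1 - r2              # SUB   r6, r1, r2
    r4 = r3 + 1               # ADDI  r4, r3, #1 (uses sum before the SUBI)
    r3 = r3 - 1               # SUBI  r3, r3, #1 (unconditional "else" result)
    if r6 == 0:               # CMOVZ r3, r4, r6 (overwrite when x == y)
        r3 = r4
    return r3, 4
```

All three transcriptions produce the same `sum` as the source code, using 3, 5, and 4 instructions respectively, which illustrates why general predication tends to have the lower dynamic instruction count.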

8 Architecture Abstraction vs. Microarchitecture Implementation
- Architecture abstraction (ISA)
  - Predicated instruction either executes or is converted into a NOP
- Microarchitecture implementation
  - In-order pipeline, or OoO pipeline without register renaming:
    - Always execute instruction
    - If predicate is true, instruction writes its value into the destination register
    - If predicate is false, instruction does not write into the destination register (old value is preserved)
  - OoO pipeline with register renaming:
    - Because logical destination is mapped to a new, “blank” physical destination register, the instruction must always perform a write to it
    - If predicate is true, instruction writes new value of logical dest. into its physical destination register
    - If predicate is false, instruction writes old value of logical dest. into its physical destination register. This means the logical destination register is also a logical source register
- ISA says:
    CMOV rd, rs, pred        // (pred ? rd = rs : NOP)
- Microarchitecture says:
    CMOV rd, rs, rd, pred    // rd = (pred ? rs : rd)
    p99 = (p22 ? p7 : p50)   … after renaming
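
The renamed form of CMOV can be sketched with a toy map table and free list. The data structures and function names here are assumptions for illustration and do not describe a specific processor; the physical register numbers in the usage note are the ones from the slide.

```python
def rename_cmov(map_table, free_list, rd, rs, pred):
    """Rename 'CMOV rd, rs, pred' into a three-source select.

    Returns (dest, src_pred, src_new, src_old) as physical register numbers.
    """
    src_pred = map_table[pred]   # physical register holding the predicate
    src_new = map_table[rs]      # value written if the predicate is true
    src_old = map_table[rd]      # OLD mapping of rd: written if predicate is false
    dest = free_list.pop(0)      # new, "blank" physical destination register
    map_table[rd] = dest         # later readers of rd now see dest
    return dest, src_pred, src_new, src_old


def execute_cmov(prf, dest, src_pred, src_new, src_old):
    """The destination is always written: dest = (pred ? new : old)."""
    prf[dest] = prf[src_new] if prf[src_pred] else prf[src_old]
```

With `map_table = {"rd": 50, "rs": 7, "pred": 22}` and `free_list = [99]`, renaming yields `(99, 22, 7, 50)`, matching the slide's `p99 = (p22 ? p7 : p50)`.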

9 Limitations of Predication
- Predication is not profitable when branch’s control-dependent region is large and complex
  - Many instructions
  - Nested control-flow, e.g., loops, function calls, etc.
  - Dynamic instruction count explodes, yielding lower performance than mispredicting the branch 10% of the time
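
The trade-off can be made concrete with a deliberately crude cost model. Everything in this sketch is an assumption rather than something from the slides: the function names, the one-instruction-per-cycle default, the 20-cycle misprediction penalty used in the usage note, and the choice to ignore data-dependence delays entirely.

```python
def cost_with_branch(then_len, else_len, p_then, mispredict_rate,
                     penalty_cycles, ipc=1.0):
    """Expected cycles when the branch is predicted.

    Only the selected path executes; a misprediction adds a fixed penalty.
    """
    avg_path_len = p_then * then_len + (1 - p_then) * else_len
    return avg_path_len / ipc + mispredict_rate * penalty_cycles


def cost_with_predication(then_len, else_len, ipc=1.0):
    """Cycles when the region is if-converted.

    Both paths always execute, but there is no misprediction penalty.
    """
    return (then_len + else_len) / ipc
```

Under these assumptions, with a 10% misprediction rate and a 20-cycle penalty, a one-instruction if/else favours predication (`cost_with_predication(1, 1)` is below `cost_with_branch(1, 1, 0.5, 0.1, 20)`), while a 50-instruction if/else favours keeping the branch.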

10 Exploiting Control Independence
See alternate slides

11 Control-flow strategy: Pros and Cons
(CD = control-dependent; CIDD = control-independent, data-dependent)

prediction
  Pros:
  - Most streamlined when correct: No excess instructions. CIDD instructions are not delayed by branch predicates.
  Cons:
  - Big misprediction penalty

predication
  Pros:
  - Eliminates mispredictions of predicated branch.
  Cons:
  - Three performance overheads:
    - Excess CD instructions, from non-selected path.
    - CIDD instructions are delayed by branch predicates.
    - CIDD instructions are delayed by CMOVs, if ISA uses these or if employing dynamic hammock predication.
  - Not applicable/profitable for many branches.

control independence
  Pros:
  - Exploits branch prediction: (1) Maximally streamlined when prediction is correct. (2) Reduced misprediction penalty when prediction is incorrect. Only CD and CIDD instructions are delayed.
  Cons:
  - Complex implementation: Selective repair of control-flow within structures (CD instructions) and data-flow (CIDD instructions).

