Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.2. A 4-stage pipeline.
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.4. Pipeline stall caused by a cache miss in F2.
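The effect of a stall such as the cache miss in F2 can be sketched numerically. The following is a minimal model, not the textbook's method: it assumes a 4-stage pipeline (F, D, E, W) holding one instruction per stage, and the names `schedule` and `total_cycles` as well as the 4-cycle fetch time used for the miss are hypothetical choices made for illustration.

```python
STAGES = ("F", "D", "E", "W")  # the four stages of the pipeline


def schedule(durations):
    """Return enter[i][s]: the clock cycle in which instruction i enters stage s.

    durations[i][s] is the number of cycles instruction i needs in stage s
    (an assumed input; 1 everywhere models an ideal pipeline).
    """
    n_stages = len(STAGES)
    enter = []
    for i, dur in enumerate(durations):
        row = []
        for s in range(n_stages):
            # Earliest cycle at which this instruction has finished stage s-1.
            ready = 1 if s == 0 else row[s - 1] + dur[s - 1]
            if i > 0:
                prev = enter[i - 1]
                # Stage s becomes free when the previous instruction moves on.
                if s + 1 < n_stages:
                    free = prev[s + 1]
                else:
                    free = prev[s] + durations[i - 1][s]
                ready = max(ready, free)
            row.append(ready)
        enter.append(row)
    return enter


def total_cycles(durations):
    """Clock cycle in which the last instruction completes its final stage."""
    enter = schedule(durations)
    return enter[-1][-1] + durations[-1][-1] - 1


ideal = [[1, 1, 1, 1] for _ in range(4)]
# Hypothetical miss: the fetch of the second instruction takes 4 cycles.
miss = [[1, 1, 1, 1], [4, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]]

print(total_cycles(ideal))  # 4 instructions, 4 stages, no stalls
print(total_cycles(miss))   # same program with the slow fetch
```

In this model the three extra fetch cycles delay every later instruction by three cycles, because the D, E, and W stages sit idle while the fetch completes.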
Figure 8.6. Pipeline stalled by data dependency between D2 and W1.
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.7. Operand forwarding in a pipelined processor.
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.9. Branch timing.
Figure Use of an instruction queue in the hardware organization of Figure 8.2b. Labels: F: Fetch instruction; D: Dispatch/Decode unit; E: Execute instruction; W: Write results; Instruction fetch unit; Instruction queue.
Figure 8.11. Branch timing in the presence of an instruction queue. Branch target address is computed in the D stage. (Timing diagram: stages F, D, E, W per clock cycle for instructions I1 to I4, I5 (Branch), I6, and Ik, with the queue length shown for each cycle.)
(a) Original program loop

LOOP    Shift_left  R1
        Decrement   R2
        Branch=0    LOOP
NEXT    Add         R1,R3

(b) Reordered instructions

LOOP    Decrement   R2
        Branch=0    LOOP
        Shift_left  R1
NEXT    Add         R1,R3

Figure Reordering of instructions for a delayed branch.
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure Execution timing showing the delay slot being filled during the last two passes through the loop in Figure 8.12.
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure Timing when a branch decision has been incorrectly predicted as not taken.
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure State-machine representation of branch prediction algorithms.
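A state-machine branch predictor can be sketched in a few lines. The figure itself is not reproduced here, so the code below uses the widely known 2-bit saturating-counter scheme as a stand-in; this is an assumption and may not match the transitions drawn in the figure. The class name `TwoBitPredictor` and the helper `count_mispredictions` are hypothetical.

```python
class TwoBitPredictor:
    """2-bit saturating counter (assumed scheme, not taken from the figure).

    States 0 and 1 predict "not taken"; states 2 and 3 predict "taken".
    """

    def __init__(self, state=0):
        self.state = state

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def count_mispredictions(outcomes, predictor):
    """Run the predictor over a sequence of actual branch outcomes."""
    wrong = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            wrong += 1
        predictor.update(taken)
    return wrong


# A loop-closing branch: taken 9 times, then not taken, executed twice.
outcomes = ([True] * 9 + [False]) * 2
mispredictions = count_mispredictions(outcomes, TwoBitPredictor())
print(mispredictions)
```

With more than two states, a single loop exit does not flip the prediction, so the second execution of the loop mispredicts only on its final pass.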
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure Equivalent operations using complex and simple addressing modes.
(a) A program fragment

Add        R1,R2
Compare    R3,R4
Branch=0   ...

(b) Instructions reordered

Compare    R3,R4
Add        R1,R2
Branch=0   ...

Figure Instruction reordering.
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure Datapath modified for pipelined execution, with interstage buffers at the input and output of the ALU.
Figure 8.20. An example of instruction execution flow in the processor of Figure 8.19, assuming no hazards are encountered. (Timing diagram: stages F, D, E, W per clock cycle for instructions I1 (Fadd), I2 (Add), I3 (Fsub), and I4 (Sub); the execute stage of I1 is shown as E1A, E1B, E1C.)
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure Instruction completion in program order.
(a) Desired program loop

           LDX    R3, 0, R6         Load number of items in the list.
           OR     R0, R4            Clear R4 to be used as offset in the list.
           OR     R0, R7            Clear R7 to be used as accumulator.
LOOPSTART  LDX    R3, R4, R5        Load list item into R5.
           ADD    R5, R7, R7        Add number to accumulator.
           ADD    R4, 8, R4         Point to the next entry.
           SUBcc  R6, 1, R6         Decrement R6 and set condition flags.
           BG     xcc, LOOPSTART    Loop if more items in the list.
NEXT       ...

(b) Instructions reorganized to use the delay slot

           LDX    R3, 0, R6
           OR     R0, R4
           OR     R0, R7
LOOPSTART  LDX    R3, R4, R5
           ADD    R4, 8, R4
           SUBcc  R6, 1, R6
           BG,pt  xcc, LOOPSTART    Predicted taken, Annul bit = 0
           ADD    R5, R7, R7
NEXT       ...

Figure 8.22. An addition loop showing the use of the branch delay slot and branch prediction.
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure Main building blocks of the UltraSPARC II processor.
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure Example of instruction grouping.
(a) Instructions with common destination

ADD     R3,R5,R6    G E C N1 N2 N3 W
LDSW    R4,R7,R6    G E C N1 N2 N3 W

(b) Delay caused by MOVR instruction

MOVRZ   R1,R6,R7    G E C N1 N2 N3 W
OR      R7,R8,R9    G E C N1 N2 N3 W

Figure 8.26. Dispatch delays due to hazards.
Figure Integer execution unit. Labels: Integer register file; Annex; IEU0; IEU1; ALU; Interstage buffers.
I1 (Icc)     G E C
I2 (BRcc)    G E C
I3           G E C
I4           G E C
I5           G E
I6           G E
I7           G E
I8           G E
I9           G
I10          G
I11          G
I12          G
Abort

Figure Worst-case timing for an incorrectly predicted branch.
Figure Load and store unit. Labels: Integer register file/annex; pipeline stages G, E, C, N1; D-Cache (data, tags); dTLB; Compare; Load/store queue; Miss; To E-Cache.
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure Execution flow.
Please see “portrait orientation” PowerPoint file for Chapter 8 Table 8.1. Examples of SPARC instructions.