Published by Janice Scott. Modified over 9 years ago.
1
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.1. Basic idea of instruction pipelining.
2
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.2. A 4-stage pipeline.
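The figure itself is not reproduced here, so the following is only a minimal sketch of what a stall-free 4-stage pipeline schedule looks like. It assumes the stage names F, D, E, W used in later captions; the function names `ideal_schedule` and `total_cycles` are hypothetical.

```python
# Stage names assumed from the later captions: Fetch, Decode, Execute, Write.
STAGES = ["F", "D", "E", "W"]


def ideal_schedule(num_instructions):
    """Return {cycle: [(instruction, stage), ...]} for a pipeline with no stalls."""
    schedule = {}
    for i in range(num_instructions):
        for s, stage in enumerate(STAGES):
            cycle = i + s + 1  # instruction i (0-based) is in stage s during this cycle
            schedule.setdefault(cycle, []).append((f"I{i + 1}", stage))
    return schedule


def total_cycles(num_instructions):
    """Number of clock cycles needed to complete all instructions without stalls."""
    schedule = ideal_schedule(num_instructions)
    return max(schedule) if schedule else 0


if __name__ == "__main__":
    for cycle, work in sorted(ideal_schedule(4).items()):
        print(cycle, " ".join(f"{name}:{stage}" for name, stage in work))
```

Under these assumptions, n instructions finish in n + 3 cycles, and from cycle 4 onward one instruction completes per cycle.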
4
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.4. Pipeline stall caused by a cache miss in F2.
6
Figure 8.6. Pipeline stalled by data dependency between D2 and W1.
7
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.7. Operand forwarding in a pipelined processor.
9
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.9. Branch timing.
10
Figure 8.10. Use of an instruction queue in the hardware organization of Figure 8.2b.
Blocks shown: Instruction fetch unit, Instruction queue, Dispatch/Decode unit.
Stage labels: F: Fetch instruction; D: Dispatch/Decode; E: Execute instruction; W: Write results.
11
Figure 8.11. Branch timing in the presence of an instruction queue. Branch target address is computed in the D stage.
[Timing diagram: instructions I1 to I6 and Ik over clock cycles 1 to 10. I1 spends three cycles in the E stage; I5 is the branch. Queue length by clock cycle: 1, 1, 1, 1, 2, 3, 2, 1, 1, 1.]
12
(a) Original program loop
    LOOP    Shift_left   R1
            Decrement    R2
            Branch=0     LOOP
    NEXT    Add          R1,R3
(b) Reordered instructions
    LOOP    Decrement    R2
            Branch=0     LOOP
            Shift_left   R1
    NEXT    Add          R1,R3
Figure 8.12. Reordering of instructions for a delayed branch.
13
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.13. Execution timing showing the delay slot being filled during the last two passes through the loop in Figure 8.12.
14
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.14. Timing when a branch decision has been incorrectly predicted as not taken.
15
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.15. State-machine representation of branch prediction algorithms.
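Since the state diagram is not reproduced here, the following is a sketch of one common four-state scheme, a 2-bit saturating counter. It is an assumption that this resembles the figure; the transitions in the original state machine may differ. The class and function names are hypothetical.

```python
class TwoBitPredictor:
    """2-bit saturating counter (an assumed scheme, not necessarily the figure's).

    States: 0 = strongly not taken, 1 = weakly not taken,
            2 = weakly taken,       3 = strongly taken.
    """

    def __init__(self, state=0):
        self.state = state

    def predict(self):
        # Predict "taken" in the two upper states.
        return self.state >= 2

    def update(self, taken):
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def count_mispredictions(outcomes, state=0):
    """Run the predictor over a sequence of branch outcomes (True = taken)."""
    predictor = TwoBitPredictor(state)
    misses = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            misses += 1
        predictor.update(taken)
    return misses


if __name__ == "__main__":
    # A loop branch taken four times and then not taken, executed twice.
    loop_branch = ([True] * 4 + [False]) * 2
    print(count_mispredictions(loop_branch, state=3))
```

With this scheme, a loop-closing branch that starts in the strongly-taken state is mispredicted only once per loop exit, because a single not-taken outcome does not flip the prediction.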
16
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.16. Equivalent operations using complex and simple addressing modes.
17
(a) A program fragment
    Add        R1,R2
    Compare    R3,R4
    Branch=0   ...
(b) Instructions reordered
    Compare    R3,R4
    Add        R1,R2
    Branch=0   ...
Figure 8.17. Instruction reordering.
18
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.18. Datapath modified for pipelined execution, with interstage buffers at the input and output of the ALU.
20
Figure 8.20. An example of instruction execution flow in the processor of Figure 8.19, assuming no hazards are encountered.
[Timing diagram over clock cycles 1 to 7; stage sequence per instruction:]
    I1 (Fadd)   F1  D1  E1A  E1B  E1C  W1
    I2 (Add)    F2  D2  E2   W2
    I3 (Fsub)   F3  D3  E3   E3   E3   W3
    I4 (Sub)    F4  D4  E4   W4
21
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.21. Instruction completion in program order.
22
(a) Desired program loop
               LDX     R3, 0, R6         Load number of items in the list.
               OR      R0, R4            to be used as offset in the list
               OR      R0, R7            Clear R7 to be used as accumulator.
    LOOPSTART  LDX     R3, R4, R5        Load list item into R5.
               ADD     R5, R7, R7        Add number to accumulator.
               ADD     R4, 8, R4         Point to the next entry.
               SUBcc   R6, 1, R6         Decrement R6 and set condition flags.
               BG      xcc, LOOPSTART    Loop if more items in the list.
    NEXT       ...
(b) Instructions reorganized to use the delay slot
               LDX     R3, 0, R6
               OR      R0, R4
               OR      R0, R7
    LOOPSTART  LDX     R3, R4, R5
               ADD     R4, 8, R4
               SUBcc   R6, 1, R6
               BG,pt   xcc, LOOPSTART    Predicted taken, Annul bit = 0
               ADD     R5, R7, R7
    NEXT       ...
Figure 8.22. An addition loop showing the use of the branch delay slot and branch prediction.
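To illustrate why the reorganized loop is equivalent, here is a simplified Python model of both versions. It is a sketch under stated assumptions: the list is modeled as a Python list of 8-byte entries rather than SPARC memory, the list is non-empty, and the function names are hypothetical. Variables are named after the registers in the figure.

```python
def sum_original(items):
    """Model of the desired loop: accumulate, advance, decrement, branch."""
    r6 = len(items)  # number of items in the list
    r4 = 0           # byte offset into the list
    r7 = 0           # accumulator
    while True:
        r5 = items[r4 // 8]  # load list item (entries assumed 8 bytes apart)
        r7 += r5             # add number to accumulator
        r4 += 8              # point to the next entry
        r6 -= 1              # decrement and set condition flags
        if not r6 > 0:       # BG: loop while more items remain
            break
    return r7


def sum_delay_slot(items):
    """Model of the reorganized loop: the accumulate step sits in the delay slot."""
    r6 = len(items)
    r4 = 0
    r7 = 0
    while True:
        r5 = items[r4 // 8]
        r4 += 8
        r6 -= 1
        branch_taken = r6 > 0  # branch decision is made before the delay slot
        r7 += r5               # delay slot: executed whether or not the branch
                               # is taken, since the annul bit is 0
        if not branch_taken:
            break
    return r7


if __name__ == "__main__":
    data = [3, 5, 7]
    print(sum_original(data), sum_delay_slot(data))
```

Moving the accumulate step after the branch is safe in this model because it depends only on R5 and R7, neither of which is changed by the pointer update, the decrement, or the branch.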
23
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.23. Main building blocks of the UltraSPARC II processor.
25
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.25. Example of instruction grouping.
26
(a) Instructions with common destination
    ADD     R3,R5,R6    G  E  C  N1  N2  N3  W
    LDSW    R4,R7,R6    G  E  C  N1  N2  N3  W
(b) Delay caused by MOVR instruction
    MOVRZ   R1,R6,R7    G  E  C  N1  N2  N3  W
    OR      R7,R8,R9    G  E  C  N1  N2  N3  W
Figure 8.26. Dispatch delays due to hazards.
27
Figure 8.27. Integer execution unit.
Blocks shown: Integer register file, Annex, IEU0, IEU1, ALU, Interstage buffers.
28
Figure 8.28. Worst-case timing for an incorrectly predicted branch.
[Timing diagram; stages reached before the abort:]
    I1 (Icc)     G  E  C
    I2 (BRcc)    G  E  C
    I3, I4       G  E  C
    I5 to I8     G  E
    I9 to I12    G
    Abort
29
Figure 8.29. Load and store unit.
Blocks shown: Integer register file/annex, dTLB, D-Cache (data, tags), Compare, Load/store queue, Miss, To E-Cache. Stage labels: G, E, C, N1.
30
Please see “portrait orientation” PowerPoint file for Chapter 8 Figure 8.30. Execution flow.
31
Please see “portrait orientation” PowerPoint file for Chapter 8 Table 8.1. Examples of SPARC instructions.