High-level view Out-of-order pipeline

1 High-level view Out-of-order pipeline
- Two decoupled pipelines: fetch/dispatch and issue/execute
- Pipelines decoupled by buffers
  - Many names: reservation stations, issue queues/buffers, scheduling queues, “instruction window”
Instruction fetch/dispatch pipeline (in-order)
  fetch
  decode: data dependence checking, register renaming
  DISPATCH (insert into window)
WINDOW
Instruction issue/execute pipeline (out-of-order issue)
  ISSUE (take from window)
  execute
  complete (writeback)
ECE 463/521, Profs. Gehringer, Rotenberg, & Conte, Dept. of ECE, NC State University
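
The decoupling described on this slide can be illustrated with a toy model. This is only a sketch under simplifying assumptions that are not in the slides: one dispatch and one issue per cycle, and a hypothetical `ready_at` map giving the first cycle in which each instruction's operands are available. The names `run_window`, `ready_at`, and `window_size` are made up for illustration.

```python
from collections import deque


def run_window(program, ready_at, window_size=2):
    """Return (cycle, name) pairs in the order instructions issue.

    program:  instruction names in program order.
    ready_at: hypothetical map from name to the first cycle its operands are ready.
    """
    fetch_queue = deque(program)  # front end: in-order fetch/dispatch
    window = []                   # buffer that decouples the two pipelines
    issued = []
    cycle = 0
    while fetch_queue or window:
        # ISSUE: take the oldest *ready* instruction from the window.
        # It need not be the oldest instruction, so issue is out of order.
        for name in list(window):
            if ready_at[name] <= cycle:
                window.remove(name)
                issued.append((cycle, name))
                break  # assumption: at most one issue per cycle
        # DISPATCH: insert the next instruction, in program order, if there is room.
        if fetch_queue and len(window) < window_size:
            window.append(fetch_queue.popleft())
        cycle += 1
    return issued


if __name__ == "__main__":
    # "A" waits for its operands, so the younger "B" issues first.
    print(run_window(["A", "B", "C"], {"A": 3, "B": 0, "C": 0}))
```

Dispatch stays in program order while issue depends only on operand readiness, which is the property the window provides.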

2 Early dynamically scheduled machines
CDC 6600
- Centralized control: “CDC scoreboard”
- Many replicated functional units
- All values pass through the register file
- Stall on WAR/WAW hazards
IBM 360/91 (Tomasulo’s algorithm)
- Distributed control: “reservation stations”
- Several fully pipelined functional units (equivalent to replicating functional units)
- Values broadcast to waiting instructions and register file in parallel (via the Common Data Bus)
- Introduced register renaming: handles WAR/WAW hazards without stalling

3 CDC 6600 (MIPS version)
Four stages after fetch
1. Dispatch
   - Check for structural and WAW hazards.
     - Structural: stall in dispatch stage if FU busy.
     - WAW: stall in dispatch stage if an outstanding instruction in the scoreboard writes the same destination register.
   - Enter instruction into scoreboard and determine data dependences.
   - Route instruction to a free FU, where it waits until data operands are available.
2. Issue
   - Wait for operands to become ready. Scoreboard signals when operands are ready.
   - Instruction reads registers from the register file, then issues to FU for execution.
3. Execute
4. Write result
   - Check for WAR hazard. Stall if an outstanding prior instruction in the scoreboard reads the same register being written, and the read has not yet taken place.
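
The stall conditions for the dispatch and write-result stages can be written as two predicates. This is a minimal sketch, not the actual CDC 6600 control logic; the `Pending` record and the function names are hypothetical, introduced only to make the checks concrete.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Pending:
    """Hypothetical record of an outstanding instruction in the scoreboard."""
    fu: str                 # functional unit holding the instruction
    dest: Optional[str]     # destination register, if any
    unread_srcs: Set[str]   # source registers it has not read yet


def can_dispatch(fu_busy: Dict[str, bool], fu: str, dest: Optional[str],
                 pending: List[Pending]) -> bool:
    """Dispatch-stage check: structural and WAW hazards."""
    if fu_busy.get(fu, False):
        return False  # structural hazard: the required FU is busy
    if dest is not None and any(p.dest == dest for p in pending):
        return False  # WAW hazard: an outstanding instruction writes dest
    return True


def can_write_result(writer: Pending, earlier: List[Pending]) -> bool:
    """Write-result check: WAR hazard against prior instructions."""
    # Stall while any earlier instruction still has to read the register
    # that the writer is about to overwrite.
    return not any(writer.dest in p.unread_srcs for p in earlier)


if __name__ == "__main__":
    div = Pending("DIV", "F10", {"F0", "F6"})  # DIV.D F10, F0, F6 (not yet issued)
    add = Pending("ADD", "F6", set())          # ADD.D F6, F8, F2 (finished executing)
    print(can_write_result(add, [div]))        # False: DIV.D has not read F6 yet
```

RAW hazards do not appear here because they are resolved in the issue stage, where the instruction simply waits for its operands.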

4 Warning – H&P naming
H&P uses different names than what we will use
  We            They
  Dispatch      Issue
  Issue         Read operands
  Execute       Execute
  Write result  Write result

5 CDC 6600 (MIPS version)
[Figure: block diagram showing the Registers, the functional units FP MULT (1), FP MULT (2), FP DIV, FP ADD, and the Integer Unit (integer MIPS pipeline), and the Scoreboard with control/status connections.]

6 Scoreboard
Three data structures
1. Instruction status
   - Which stage the instruction is in
2. Functional unit status
   - Busy: FU is busy executing an instruction
   - Op: what instruction is the FU busy with
   - Fi: destination register
   - Fj, Fk: source registers
   - Qj, Qk: functional units producing src regs
   - Rj, Rk: flags indicating src regs are ready
3. Register result status
   - Which FU is going to write each register
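
As a rough sketch, the three structures map onto a few small records. The class names, the `dispatch` method, and the choice to store a load's base register in `fk` are assumptions made for illustration; hazard checks are assumed to have passed before `dispatch` is called.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class FUStatus:
    busy: bool = False
    op: Optional[str] = None   # instruction the FU is busy with
    fi: Optional[str] = None   # destination register
    fj: Optional[str] = None   # source registers
    fk: Optional[str] = None
    qj: Optional[str] = None   # FUs producing the source registers
    qk: Optional[str] = None
    rj: bool = False           # flags: source registers are ready
    rk: bool = False


@dataclass
class Scoreboard:
    instr_stage: Dict[str, str] = field(default_factory=dict)     # instruction status
    fu_status: Dict[str, FUStatus] = field(default_factory=dict)  # functional unit status
    reg_result: Dict[str, str] = field(default_factory=dict)      # register result status

    def dispatch(self, label, fu, op, fi, fj, fk):
        """Record a newly dispatched instruction (bookkeeping only)."""
        # Look up producers *before* claiming the destination register.
        qj = self.reg_result.get(fj)
        qk = self.reg_result.get(fk)
        self.fu_status[fu] = FUStatus(True, op, fi, fj, fk, qj, qk,
                                      rj=qj is None, rk=qk is None)
        self.reg_result[fi] = fu
        self.instr_stage[label] = "DISPATCH"


if __name__ == "__main__":
    sb = Scoreboard()
    sb.dispatch("L.D F2, 45(R3)", "Integer", "L.D", "F2", None, "R3")
    sb.dispatch("MUL.D F0, F2, F4", "MULT1", "MUL.D", "F0", "F2", "F4")
    print(sb.fu_status["MULT1"])  # qj == "Integer", rj is False, rk is True
    print(sb.reg_result)
```

A source register with no entry in `reg_result` has no outstanding producer, so its ready flag is set immediately.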

7 Running example
Example used for CDC and Tomasulo
  L.D   F6,  34(R2)
  L.D   F2,  45(R3)
  MUL.D F0,  F2, F4
  SUB.D F8,  F6, F2
  DIV.D F10, F0, F6
  ADD.D F6,  F8, F2
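
The register dependences in the six-instruction sequence used throughout the example slides can be enumerated mechanically. The sketch below is an illustration, not part of the course material; the tuple encoding of instructions and the function name `find_dependences` are assumptions. It compares every pair of instructions and does not filter out pairs separated by an intervening write to the same register, which does not matter for this particular sequence.

```python
# (opcode, destination register, source registers), in program order.
PROGRAM = [
    ("L.D",   "F6",  ["R2"]),
    ("L.D",   "F2",  ["R3"]),
    ("MUL.D", "F0",  ["F2", "F4"]),
    ("SUB.D", "F8",  ["F6", "F2"]),
    ("DIV.D", "F10", ["F0", "F6"]),
    ("ADD.D", "F6",  ["F8", "F2"]),
]


def find_dependences(program):
    """Return (kind, earlier_index, later_index, register) tuples."""
    deps = []
    for j, (_, dest_j, srcs_j) in enumerate(program):
        for i in range(j):
            _, dest_i, srcs_i = program[i]
            if dest_i in srcs_j:
                deps.append(("RAW", i, j, dest_i))  # later reads what earlier writes
            if dest_j in srcs_i:
                deps.append(("WAR", i, j, dest_j))  # later writes what earlier reads
            if dest_j == dest_i:
                deps.append(("WAW", i, j, dest_j))  # both write the same register
    return deps


if __name__ == "__main__":
    for kind, i, j, reg in find_dependences(PROGRAM):
        print(f"{kind}: {PROGRAM[i][0]} (#{i}) -> {PROGRAM[j][0]} (#{j}) on {reg}")
```

Not every dependence listed this way turns into a stall; whether it does depends on when the instructions reach each stage.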

8 CDC 6600 example
What happens when the first instruction is dispatched?
Instruction status (columns: DISPATCH, ISSUE, EXECUTE, WRITE RESULT)
  L.D   F6,  34(R2)
  L.D   F2,  45(R3)
  MUL.D F0,  F2, F4
  SUB.D F8,  F6, F2
  DIV.D F10, F0, F6
  ADD.D F6,  F8, F2
Functional-unit status (columns: FU, busy, op, Fi, Fj, Fk, Qj, Qk, Rj, Rk)
  Integer: no
  MULT1: (blank)
  MULT2: (blank)
  ADD: (blank)
  DIV: (blank)
Register-result status (F0, F2, F4, F6, F8, F10, F12): all blank

9 CDC 6600 example (1)
Instruction status (columns: DISPATCH, ISSUE, EXECUTE, WRITE RESULT)
  L.D   F6,  34(R2)    x
  L.D   F2,  45(R3)
  MUL.D F0,  F2, F4
  SUB.D F8,  F6, F2
  DIV.D F10, F0, F6
  ADD.D F6,  F8, F2
Functional-unit status (columns: FU, busy, op, Fi, Fj, Fk, Qj, Qk, Rj, Rk)
  Integer: yes
  MULT1: no
  MULT2: (blank)
  ADD: (blank)
  DIV: (blank)
Register-result status (F0, F2, F4, F6, F8, F10, F12): all blank

10 CDC 6600 example (2)
On the next cycle, the instruction is issued. What else happens?
Instruction status (columns: DISPATCH, ISSUE, EXECUTE, WRITE RESULT)
  L.D   F6,  34(R2)    x
  L.D   F2,  45(R3)
  MUL.D F0,  F2, F4
  SUB.D F8,  F6, F2
  DIV.D F10, F0, F6
  ADD.D F6,  F8, F2
Functional-unit status (columns: FU, busy, op, Fi, Fj, Fk, Qj, Qk, Rj, Rk)
  Integer: yes, L.D, F6, R2
  MULT1: no
  MULT2: (blank)
  ADD: (blank)
  DIV: (blank)
Register-result status: F6 Integer (other registers blank)

11 CDC 6600 example (3)
Instruction status (columns: DISPATCH, ISSUE, EXECUTE, WRITE RESULT)
  L.D   F6,  34(R2)    x
  L.D   F2,  45(R3)
  MUL.D F0,  F2, F4
  SUB.D F8,  F6, F2
  DIV.D F10, F0, F6
  ADD.D F6,  F8, F2
Functional-unit status (columns: FU, busy, op, Fi, Fj, Fk, Qj, Qk, Rj, Rk)
  Integer: yes, L.D, F2, R3
  MULT1: no
  MULT2: (blank)
  ADD: (blank)
  DIV: (blank)
Register-result status: F2 Integer (other registers blank)

12 CDC 6600 example (4)
Instruction status (columns: DISPATCH, ISSUE, EXECUTE, WRITE RESULT)
  L.D   F6,  34(R2)    x
  L.D   F2,  45(R3)
  MUL.D F0,  F2, F4
  SUB.D F8,  F6, F2
  DIV.D F10, F0, F6
  ADD.D F6,  F8, F2
Functional-unit status (columns: FU, busy, op, Fi, Fj, Fk, Qj, Qk, Rj, Rk)
  Integer: yes, L.D, F2, R3
  MULT1: MUL.D, F0, F4, no
  MULT2: (blank)
  ADD: (blank)
  DIV: (blank)
Register-result status: F0 MULT1, F2 Integer (other registers blank)

13 CDC 6600 example (5)
Instruction status (columns: DISPATCH, ISSUE, EXECUTE, WRITE RESULT)
  L.D   F6,  34(R2)    x
  L.D   F2,  45(R3)
  MUL.D F0,  F2, F4
  SUB.D F8,  F6, F2
  DIV.D F10, F0, F6
  ADD.D F6,  F8, F2
Functional-unit status (columns: FU, busy, op, Fi, Fj, Fk, Qj, Qk, Rj, Rk)
  Integer: yes, L.D, F2, R3
  MULT1: MUL.D, F0, F4, no
  MULT2: (blank)
  ADD: SUB.D, F8, F6
  DIV: (blank)
Register-result status: F0 MULT1, F2 Integer, F8 ADD (other registers blank)

14 CDC 6600 example (6)
Instruction status (columns: DISPATCH, ISSUE, EXECUTE, WRITE RESULT)
  L.D   F6,  34(R2)    x
  L.D   F2,  45(R3)
  MUL.D F0,  F2, F4
  SUB.D F8,  F6, F2
  DIV.D F10, F0, F6
  ADD.D F6,  F8, F2
Functional-unit status (columns: FU, busy, op, Fi, Fj, Fk, Qj, Qk, Rj, Rk)
  Integer: yes, L.D, F2, R3
  MULT1: MUL.D, F0, F4, no
  MULT2: (blank)
  ADD: SUB.D, F8, F6
  DIV: DIV.D, F10
Register-result status: F0 MULT1, F2 Integer, F8 ADD, F10 DIV (other registers blank)

15 CDC 6600 example (7)
Instruction status (columns: DISPATCH, ISSUE, EXECUTE, WRITE RESULT)
  L.D   F6,  34(R2)    x
  L.D   F2,  45(R3)
  MUL.D F0,  F2, F4
  SUB.D F8,  F6, F2
  DIV.D F10, F0, F6
  ADD.D F6,  F8, F2
Functional-unit status (columns: FU, busy, op, Fi, Fj, Fk, Qj, Qk, Rj, Rk)
  Integer: yes, L.D, F2, R3
  MULT1: MUL.D, F0, F4
  MULT2: no
  ADD: SUB.D, F8, F6
  DIV: DIV.D, F10
Register-result status: F0 MULT1, F2 Integer, F8 ADD, F10 DIV (other registers blank)

16 CDC 6600 example (8)
Instruction status (columns: DISPATCH, ISSUE, EXECUTE, WRITE RESULT)
  L.D   F6,  34(R2)    x
  L.D   F2,  45(R3)
  MUL.D F0,  F2, F4
  SUB.D F8,  F6, F2
  DIV.D F10, F0, F6
  ADD.D F6,  F8, F2
Functional-unit status (columns: FU, busy, op, Fi, Fj, Fk, Qj, Qk, Rj, Rk)
  Integer: yes, L.D, F2, R3
  MULT1: MUL.D, F0, F4
  MULT2: no
  ADD: SUB.D, F8, F6
  DIV: DIV.D, F10
Register-result status: F0 MULT1, F2 Integer, F8 ADD, F10 DIV (other registers blank)

17 CDC 6600 example (9)
Instruction status (columns: DISPATCH, ISSUE, EXECUTE, WRITE RESULT)
  L.D   F6,  34(R2)    x
  L.D   F2,  45(R3)
  MUL.D F0,  F2, F4
  SUB.D F8,  F6, F2
  DIV.D F10, F0, F6
  ADD.D F6,  F8, F2
Functional-unit status (columns: FU, busy, op, Fi, Fj, Fk, Qj, Qk, Rj, Rk)
  Integer: yes, L.D, F2, R3
  MULT1: MUL.D, F0, F4
  MULT2: no
  ADD: (blank)
  DIV: DIV.D, F10, F6
Register-result status: F0 MULT1, F2 Integer, F10 DIV (other registers blank)

18 CDC 6600 example (10)
Instruction status (columns: DISPATCH, ISSUE, EXECUTE, WRITE RESULT)
  L.D   F6,  34(R2)    x
  L.D   F2,  45(R3)
  MUL.D F0,  F2, F4
  SUB.D F8,  F6, F2
  DIV.D F10, F0, F6
  ADD.D F6,  F8, F2
Functional-unit status (columns: FU, busy, op, Fi, Fj, Fk, Qj, Qk, Rj, Rk)
  Integer: yes, L.D, F2, R3
  MULT1: MUL.D, F0, F4
  MULT2: no
  ADD: ADD.D, F6, F8
  DIV: DIV.D, F10
Register-result status: F0 MULT1, F2 Integer, F6 ADD, F10 DIV (other registers blank)

19 CDC 6600 example (11)
MUL.D about to write result…
Instruction status (columns: DISPATCH, ISSUE, EXECUTE, WRITE RESULT)
  L.D   F6,  34(R2)    x
  L.D   F2,  45(R3)
  MUL.D F0,  F2, F4    finishing..
  SUB.D F8,  F6, F2
  DIV.D F10, F0, F6    RAW (F0)
  ADD.D F6,  F8, F2    WAR (F6)
Functional-unit status (columns: FU, busy, op, Fi, Fj, Fk, Qj, Qk, Rj, Rk)
  Integer: no
  MULT1: yes, MUL.D, F0, F2, F4
  MULT2: (blank)
  ADD: ADD.D, F6, F8
  DIV: DIV.D, F10
Register-result status: F0 MULT1, F6 ADD, F10 DIV (other registers blank)

20 CDC 6600 timing diagram
[Timing diagram: stages ID, IS, EX, WR over time for L.D F6, 34(R2); L.D F2, 45(R3); MUL.D F0, F2, F4; SUB.D F8, F6, F2; DIV.D F10, F0, F6; ADD.D F6, F8, F2. Shaded boxes indicate stalls, numbered 1 to 7.]
=> Execution latencies: L.D (2 – agen + access), MUL.D (10), DIV.D (40), SUB.D/ADD.D (2)
=> Notice there are always 2 cycles between EX of data-dependent instructions (e.g., L.D and MUL.D): the producer does WR, and the consumer does its last IS cycle, in which registers are read from the register file. This is an artifact of the CDC 6600: all values must first pass through the register file (no bypasses).
Stalls by cause (numbers refer to the diagram)
RAW
  2. L.D-MUL.D (F2)
  3. L.D-SUB.D (F2)
  4. MUL.D-DIV.D (F0)
  5. SUB.D-ADD.D (F8)
Structural
  1. L.D-L.D (Integer unit)
  6. SUB.D-ADD.D (ADD unit)
WAR
  7. DIV.D-ADD.D (F6)
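
The two-cycle gap can be turned into a small calculation. This is a sketch under assumptions that go beyond the slide: WR takes one cycle, the consumer's register-read (last IS) cycle immediately follows it, and no other hazard delays the consumer. The starting cycle in the example is hypothetical and is not taken from the diagram.

```python
# Execution latencies in cycles, as listed on the slide.
LATENCY = {"L.D": 2, "MUL.D": 10, "DIV.D": 40, "SUB.D": 2, "ADD.D": 2}


def dependent_ex_window(producer_ex_start, producer_op, consumer_op):
    """Return (first, last) EX cycle of a consumer that waits only on the producer."""
    producer_ex_end = producer_ex_start + LATENCY[producer_op] - 1
    write_result = producer_ex_end + 1    # producer WR: value enters the register file
    register_read = write_result + 1      # consumer's last IS cycle: reads the register file
    consumer_ex_start = register_read + 1
    consumer_ex_end = consumer_ex_start + LATENCY[consumer_op] - 1
    return consumer_ex_start, consumer_ex_end


if __name__ == "__main__":
    # Hypothetical: L.D begins EX in cycle 3, so it occupies EX in cycles 3-4.
    print(dependent_ex_window(3, "L.D", "MUL.D"))  # (7, 16)
```

Cycles 5 (WR) and 6 (register read) are the two cycles between the producer's EX and the consumer's EX; a bypass network would remove them.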

21 Remaining bottlenecks
CDC 6600 does a good job of dynamic scheduling around RAW hazards
Remaining performance limitations
- Amount of instruction-level parallelism (ILP) in the program
  - Maybe not enough data-independent operations
  - Increase size of window to look farther ahead. Looking farther ahead requires branch prediction.
- Number of scoreboard entries (window size)
  - Dictates how far processor can look ahead
- Number and type of functional units, register ports, etc.
  - Structural hazards
- Anti- and output dependences
  - Dynamic scheduling exposes more WAW+WAR hazards because early (OOO) writes are possible
  - WAR made worse in CDC due to late reads (read operands when finally issuing)
  - WAW handled like a structural hazard in dispatch

