Data Hazards

RAW Hazard
    ADD.D F3, F1, F2
    SUB.D F5, F6, F3
No solution; this is a normal property of programs.

WAW Hazard
    DIV.D F3, F1, F2
    SUB.D F3, F6, F5    (this instruction will complete first)
DIV.D writes the wrong value later, hence stalls may be needed for proper operation.

WAR Hazard
    DIV.D F3, F1, F2
    SUB.D F5, F6, F3
    OR ADD.D F3, F6, F7
SUB.D reads the wrong value of a register, hence stalls may be needed in some architectures, but not in the FP pipeline on the next page.
    DIV.D F3, F1, F2
    SUB.D F5, F6, F3
    ADD.D F6, F6, F7
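The three hazard types above can be told apart mechanically by comparing destination and source registers of two instructions in program order. The following is a minimal sketch; the function name `hazards` and the tuple encoding `(dest, src1, src2)` are assumptions made for illustration, not part of the slides.

```python
def hazards(first, second):
    """Classify register hazards between two instructions in program order.

    Each instruction is a hypothetical (dest, src1, src2) tuple.
    """
    d1, *s1 = first
    d2, *s2 = second
    found = set()
    if d1 in s2:      # second reads what first writes
        found.add("RAW")
    if d1 == d2:      # both write the same register
        found.add("WAW")
    if d2 in s1:      # second overwrites what first still has to read
        found.add("WAR")
    return found


# Register triples taken from the slide examples
print(hazards(("F3", "F1", "F2"), ("F5", "F6", "F3")))  # ADD.D then SUB.D
print(hazards(("F3", "F1", "F2"), ("F3", "F6", "F5")))  # DIV.D then SUB.D
print(hazards(("F5", "F6", "F3"), ("F6", "F6", "F7")))  # SUB.D then ADD.D
```

Whether a detected WAW or WAR pair actually needs a stall depends on the pipeline: it only matters when the later instruction can finish (or write) before the earlier one.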
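Typical MIPS Floating Point Pipeline
[Figure: IF and ID feed parallel execution paths (integer EX; FP add stages A1, A2, A3, A4; FP multiply stages M1, M2, ..., M7; Divide), followed by Mem and WB.]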
Other Possible Implementations
[Figure: four variants of the same pipeline diagram (IF, ID; EX, A1 to A4, M1 to M7, Divide). Three show separate EX, Mem and WB stages; the last shows EX and WB with a combined EX + MEM stage.]
TOO MANY ID STAGE STALLS WAW, STRUCTURAL SOLUTION?
Earliest Possible Implementation: THE SCOREBOARD
    CDC 6600 (No. 1 supercomputer)
    No full pipelining
    No forwarding
    FPADD = 2 CC, MUL = 10, DIV = 60
    Total scoreboard hardware = 1 FU (simple)
10 Function Units (source: Wiki)
    floating point multiply (2 copies)
    floating point divide
    floating point add
    "long" integer add
    memory (2 copies; performed memory load/store)
    shift
    boolean logic
    branch
[Figure: scoreboard pipeline stages]
    ISSUE/ID1: check for WAW, FU
    Read Operands: check for RAW; read values from the register file when free
    EX: EX(EA) + MEM, Floating Point Add, Multiply, DIVIDE (each path has its own Read Operands stage)
    WB: check for WAR; write to the register file
[Figure: scoreboard organisation. The I-cache feeds an instruction queue whose entries hold OP, S1 and S2 fields; each functional unit (ADDER, MULT1, MULT2, DIV) has Op, Fi, Fj and Fk fields and a register write path.]
    Fj, Fk: source register numbers (5-bit)
    Fi: destination register number, 5-bit (32 registers)
    Rj, Rk: flags
    Qj, Qk: FU number, 4 or 5-bit
Scoreboard Operation

function issue(op, dst, src1, src2)
    wait until (!Busy[FU] AND !Result[dst]);   // FU can be any functional unit that can execute operation op
    Busy[FU] ← Yes;
    Op[FU] ← op;
    Fi[FU] ← dst;
    Fj[FU] ← src1;
    Fk[FU] ← src2;
    Qj[FU] ← Result[src1];
    Qk[FU] ← Result[src2];
    Rj[FU] ← not Qj;   // 1 if Qj = 0
    Rk[FU] ← not Qk;   // 1 if Qk = 0
    Result[dst] ← FU;
Operation Contd…

function read_operands(FU)
    wait until (Rj[FU] AND Rk[FU]);
    Rj[FU] ← No;
    Rk[FU] ← No;
    // As soon as both Rj and Rk are 1 (Yes), go to the next (EXE) stage,
    // but leave Rj = Rk = 0 as the default for the next instruction.

function execute(FU)
    // Execute whatever FU must do
Operation Contd…

function write_back(FU)
    wait until (∀f {(Fj[f]≠Fi[FU] OR Rj[f]=No) AND (Fk[f]≠Fi[FU] OR Rk[f]=No)})
    for each f do
        if Qj[f]=FU then Rj[f] ← Yes;
        if Qk[f]=FU then Rk[f] ← Yes;
    Result[Fi[FU]] ← 0;
    Busy[FU] ← No;
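The issue, read_operands and write_back steps can be sketched as table updates in Python. This is an illustrative sketch, not the CDC 6600 logic: the class name `Scoreboard`, the `can_*` predicates (which replace the "wait until" conditions) and the use of `None` for "no pending FU" are assumptions. Clearing Qj/Qk at write-back is a small addition so that a stale FU number is not matched later.

```python
class Scoreboard:
    """Bookkeeping tables only; timing and execution are left to the caller."""

    def __init__(self, units):
        self.units = list(units)
        self.busy = {u: False for u in self.units}
        self.op, self.fi, self.fj, self.fk = {}, {}, {}, {}
        self.qj, self.qk, self.rj, self.rk = {}, {}, {}, {}
        self.result = {}  # register -> FU with a pending write to it

    def can_issue(self, fu, dst):
        # Structural hazard (FU busy) or WAW hazard (pending write to dst) blocks issue
        return not self.busy[fu] and self.result.get(dst) is None

    def issue(self, fu, op, dst, src1, src2):
        assert self.can_issue(fu, dst)
        self.busy[fu] = True
        self.op[fu], self.fi[fu], self.fj[fu], self.fk[fu] = op, dst, src1, src2
        self.qj[fu] = self.result.get(src1)
        self.qk[fu] = self.result.get(src2)
        self.rj[fu] = self.qj[fu] is None
        self.rk[fu] = self.qk[fu] is None
        self.result[dst] = fu

    def can_read(self, fu):
        # RAW hazards are resolved here: both operands must be available
        return self.busy[fu] and self.rj[fu] and self.rk[fu]

    def read_operands(self, fu):
        assert self.can_read(fu)
        self.rj[fu] = self.rk[fu] = False

    def can_write(self, fu):
        # WAR check: no other busy unit still has to read the old value of Fi[fu]
        return all(
            (self.fj[f] != self.fi[fu] or not self.rj[f])
            and (self.fk[f] != self.fi[fu] or not self.rk[f])
            for f in self.units
            if self.busy[f] and f != fu
        )

    def write_back(self, fu):
        assert self.can_write(fu)
        for f in self.units:
            if not self.busy[f]:
                continue
            if self.qj[f] == fu:
                self.qj[f], self.rj[f] = None, True
            if self.qk[f] == fu:
                self.qk[f], self.rk[f] = None, True
        self.result[self.fi[fu]] = None
        self.busy[fu] = False
```

A caller would step each unit through `issue`, `read_operands` and `write_back` only when the matching `can_*` predicate holds, which is where the stalls in the worked example come from.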
Scoreboard (The Shift In-charge) Functions
    Instructions are issued in order but executed and committed out of order (OOOE + OOOC).
    Reduces many ID-stage stalls by out-of-order execution of independent instructions (instruction-level independence, hence the possibility of parallel execution, also called Instruction Level Parallelism (ILP)).
    Keeps a record of each instruction and the stage it is currently in.
    No forwarding: Read Operands happens after Write Result (not in the same clock cycle, as a result can only be read from the register file after it has been written).
    Keeps a big record for each function unit:
        BUSY status
        Op code assigned
        Destination register (Fi)
        Source registers (Fj, Fk)
        Function units that will produce the results to be used by this function unit (Qj, Qk)
        Operand available status (Rj, Rk): indicates when Fj and Fk are ready and not yet read. Set to NO when they are read and the FU goes into the execution stage.
    Register Result Status: for every register, indicates which function unit has a pending result for this register.
Example: MEM unit busy (forget about the MIPS FP pipeline)
    L.D R5, 0(R3)    IS  RO  EXE+MEM  WR
    L.D R7, 0(R4)    IS
Example: MEM unit busy
    L.D F5, 0(R3)    IS  RO  EXE+MEM  WR
    ADD F6, F5, F2   IS  RO  E_ADD  WR
Example
    DIV.D F6, F5, F2    IS  RO  DIV  WR
    S.D F6, 0(R3)       IS  RO  RO  RO  RO  RO  RO  EX  E
    ADD.D F4, F6, F5    IS  RO  RO  RO  RO  RO  EX
    MUL.D F7, F6, F9    IS  RO  RO  RO  EX
Slides Prepared by Jahangir Ikram, Oct 2006
Instruction Status (Stage Completed)
Instruction | Issue | Read Operands | Execution Complete | Write Result | Comments
L.D F6, 34(R2) | 1 | 2 | 3 | 4 |
L.D F2, 45(R3) | 5 | 6 | 7 | 8 |
MUL.D F0, F2, F
SUB.D F8, F6, F
DIV.D F10, F0, F
ADD.D F6, F8, F

Name of FU | Busy | Op Code | Fj | Fk | Fi (Dest) | Qj | Qk | Rj, Rk
Int | Yes | Load | R2 | - | F6 | 0 | 0 | Yes, Yes
Mult
Add
Div

Result Status
Register | F0 | F1 | F2 | F3 | F4 | F6 | F8 | F10 | F12
FU | | | | | | Integer | | |
Instruction Status (Stage Completed)
Instruction | Issue | Read Operands | Execution Complete | Write Result | Comments
L.D F6, 34(R2) | 1 | 2 | 3 | 4 |
L.D F2, 45(R3) | 5 | 6 | 7 | 8 | Cannot issue, FU busy
MUL.D F0, F2, F
SUB.D F8, F6, F
DIV.D F10, F0, F
ADD.D F6, F8, F

Name of FU | Busy | Op Code | Fj | Fk | Fi (Dest) | Qj | Qk | Rj, Rk
Int | Yes | Load | R2 | - | F6 | 0 | 0 | Yes, Yes
Mult
Add
Div

Result Status
Register | F0 | F1 | F2 | F3 | F4 | F6 | F8 | F10 | F12
FU | | | | | | Integer | | |
Instruction Status (Stage Completed)
Instruction | Issue | Read Operands | Execution Complete | Write Result | Comments
L.D F6, 34(R2) | | | | | Address Calculation + Memory Access in
L.D F2, 45(R3) | 5 | 6 | 7 | 8 |
MUL.D F0, F2, F
SUB.D F8, F6, F
DIV.D F10, F0, F
ADD.D F6, F8, F

Name of FU and ID | Busy | Op Code | Fj | Fk | Fi (Dest) | Qj | Qk | Rj, Rk
Int (1) | Yes | Load | R2 | - | F6 | 0 | 0 | No
Mult (2)
Add (3)
Div (4)

Result Status
Register | F0 | F1 | F2 | F3 | F4 | F6 | F8 | F10 | F12
FU | | | | | | Integer | | |
Instruction Status (Stage Completed)
Instruction | Issue | Read Operands | Execution Complete | Write Result | Comments
L.D F6, 34(R2) | 1 | 2 | 3 | 4 | FU free
L.D F2, 45(R3) | 5 | 6 | 7 | 8 |
MUL.D F0, F2, F
SUB.D F8, F6, F
DIV.D F10, F0, F
ADD.D F6, F8, F

Name of FU and ID | Busy | Op Code | Fj | Fk | Fi (Dest) | Qj | Qk | Rj, Rk
Int (1) | No
Mult (2)
Add (3)
Div (4)

Result Status
Register | F0 | F1 | F2 | F3 | F4 | F6 | F8 | F10 | F12
FU | | | | | | | | |
Instruction Status (Stage Completed)
Instruction | Issue | Read Operands | Execution Complete | Write Result | Comments
L.D F6, 34(R2) | 1 | 2 | 3 | 4 |
L.D F2, 45(R3) | 5 | 6 | 7 | 8 | Can be issued now
MUL.D F0, F2, F
SUB.D F8, F6, F
DIV.D F10, F0, F
ADD.D F6, F8, F

Name of FU and ID | Busy | Op Code | Fj | Fk | Fi (Dest) | Qj | Qk | Rj, Rk
Int (1) | Yes | Load | R3 | - | F2 | 0 | 0 | Yes
Mult (2)
Add (3)
Div (4)

Result Status
Register | F0 | F1 | F2 | F3 | F4 | F6 | F8 | F10 | F12
FU | | | Int(1) | | | | | |
Instruction Status (Stage Completed)
Instruction | Issue | Read Operands | Execution Complete | Write Result | Comments
L.D F6, 34(R2) | 1 | 2 | 3 | 4 |
L.D F2, 45(R3) | 5 | 6 | 7 | 8 | Read operand values in Fj and Fk
MUL.D F0, F2, F4 | | | | | Issued, but RAW: F2 busy
SUB.D F8, F6, F
DIV.D F10, F0, F
ADD.D F6, F8, F

Name of FU and ID | Busy | Op Code | Fj | Fk | Fi (Dest) | Qj | Qk | Rj, Rk
Int (1) | Yes | Load | R3 | - | F2 | 0 | 0 | Yes
Mult (2) | Yes | MUL | F2 | F4 | F0 | Int(1) | 0 | No, Yes
Add (3)
Div (4)

Result Status
Register | F0 | F1 | F2 | F3 | F4 | F6 | F8 | F10 | F12
FU | MUL(2) | | Int(1) | | | | | |
Instruction Status (Stage Completed)
Instruction | Issue | Read Operands | Execution Complete | Write Result | Comments
L.D F6, 34(R2) | 1 | 2 | 3 | 4 |
L.D F2, 45(R3) | 5 | 6 | 7 | 8 |
MUL.D F0, F2, F4 | 6 | 7-8 | | | Stalled in RO stage
SUB.D F8, F6, F2 | | | | | Issued, but RAW: F2 busy
DIV.D F10, F0, F
ADD.D F6, F8, F

Name of FU and ID | Busy | Op Code | Fj | Fk | Fi (Dest) | Qj | Qk | Rj, Rk
Int (1) | Yes | Load | R3 | - | F2 | 0 | 0 | No
Mult (2) | Yes | MUL | F2 | F4 | F0 | Int(1) | 0 | No, Yes
Add (3) | Yes | SUB | F6 | F2 | F8 | 0 | Int(1) | Yes, No
Div (4)

Result Status
Register | F0 | F1 | F2 | F3 | F4 | F6 | F8 | F10 | F12
FU | MUL(2) | | Int(1) | | | | ADD(3) | |
Instruction Status (Stage Completed)
Instruction | Issue | Read Operands | Execution Complete | Write Result | Comments
L.D F6, 34(R2) | 1 | 2 | 3 | 4 |
L.D F2, 45(R3) | 5 | 6 | 7 | 8 |
MUL.D F0, F2, F4 | 6 | 7-8 | | | Stalled in RO stage
SUB.D F8, F6, F2 | 7 | 8 | | | Stalled in RO stage
DIV.D F10, F0, F6 | | | | | Issued but RAW on F
ADD.D F6, F8, F

Name of FU and ID | Busy | Op Code | Fj | Fk | Fi (Dest) | Qj | Qk | Rj, Rk
Int (1) | No
Mult (2) | Yes | MUL | F2 | F4 | F0 | 0 | 0 | No → Yes, Yes
Add (3) | Yes | SUB | F6 | F2 | F8 | 0 | 0 | Yes, No → Yes
Div (4) | Yes | DIV | F0 | F6 | F10 | Mul(2) | 0 | No, Yes

Result Status
Register | F0 | F1 | F2 | F3 | F4 | F6 | F8 | F10 | F12
FU | MUL(2) | | | | | | ADD(3) | DIV(4) |
Instruction Status (Stage Completed)
Instruction | Issue | Read Operands | Execution Complete | Write Result | Comments
L.D F6, 34(R2) | 1 | 2 | 3 | 4 |
L.D F2, 45(R3) | 5 | 6 | 7 | 8 |
MUL.D F0, F2, F4 | 6 | 7-8 | | | Both read operands simultaneously
SUB.D F8, F6, F2 | 7 | 8, 9 | | |
DIV.D F10, F0, F6 | 8 | | | | Stalled at RO stage
ADD.D F6, F8, F2 | | | | | Cannot issue, adder busy

Name of FU and ID | Busy | Op Code | Fj | Fk | Fi (Dest) | Qj | Qk | Rj, Rk
Int (1) | No
Mult (2) | Yes | MUL | F2 | F4 | F0 | 0 | 0 | Yes, Yes
Add (3) | Yes | SUB | F6 | F2 | F8 | 0 | 0 | Yes, Yes
Div (4) | Yes | DIV | F0 | F6 | F10 | Mul | 0 | No, Yes

Result Status
Register | F0 | F1 | F2 | F3 | F4 | F6 | F8 | F10 | F12
FU | MUL(2) | | | | | | ADD(3) | DIV(4) |
Instruction Status (Stage Completed)
Instruction | Issue | Read Operands | Execution Complete | Write Result | Comments
L.D F6, 34(R2) | 1 | 2 | 3 | 4 |
L.D F2, 45(R3) | 5 | 6 | 7 | 8 |
MUL.D F0, F2, F4 | 6 | 7-8 | | | 10 CC in EXE
SUB.D F8, F6, F2 | 7 | 8 | | | 2 CC in EXE
DIV.D F10, F0, F6 | | | | | Stalled at RO stage
ADD.D F6, F8, F2 | | | | | Cannot issue, adder busy

Name of FU and ID | Busy | Op Code | Fj | Fk | Fi (Dest) | Qj | Qk | Rj, Rk
Int (1) | No
Mult (2) | Yes | MUL | F2 | F4 | F0 | 0 | 0 | No
Add (3) | Yes | SUB | F6 | F2 | F8 | 0 | 0 | No
Div (4) | Yes | DIV | F0 | F6 | F10 | Mul | 0 | No, Yes

Result Status
Register | F0 | F1 | F2 | F3 | F4 | F6 | F8 | F10 | F12
FU | MUL(2) | | | | | | ADD(3) | DIV(4) |
Instruction Status (Stage Completed)
Instruction | Issue | Read Operands | Execution Complete | Write Result | Comments
L.D F6, 34(R2) | 1 | 2 | 3 | 4 |
L.D F2, 45(R3) | 5 | 6 | 7 | 8 |
MUL.D F0, F2, F4 | 6 | 7-8, 9 | 10, 11 | 20 | 10 CC in EXE
SUB.D F8, F6, F2 | 7 | 8 | | | 2 CC in EXE
DIV.D F10, F0, F6 | | | | | Stalled at RO stage
ADD.D F6, F8, F2 | | | | | Cannot issue, FU busy

Name of FU and ID | Busy | Op Code | Fj | Fk | Fi (Dest) | Qj | Qk | Rj, Rk
Int (1) | No
Mult (2) | Yes | MUL | F2 | F4 | F0
Add (3) | Yes | SUB | F6 | F2 | F8
Div (4) | Yes | DIV | F0 | F6 | F10 | Mul | 0 | No, Yes

Result Status
Register | F0 | F1 | F2 | F3 | F4 | F6 | F8 | F10 | F12
FU | MUL(2) | | | | | | ADD(3) | DIV(4) |
Instruction Status (Stage Completed)
Instruction | Issue | Read Operands | Execution Complete | Write Result | Comments
L.D F6, 34(R2) | 1 | 2 | 3 | 4 |
L.D F2, 45(R3) | 5 | 6 | 7 | 8 |
MUL.D F0, F2, F4 | 6 | 7-8, 9 | 10, 11 | | 10 CC in EXE
SUB.D F8, F6, F2 | 7 | 8 | | |
DIV.D F10, F0, F6 | | | | | Stalled at RO stage
ADD.D F6, F8, F2 | | | | | Cannot issue, FU busy

Name of FU and ID | Busy | Op Code | Fj | Fk | Fi (Dest) | Qj | Qk | Rj, Rk
Int (1)
Mult (2) | Yes | MUL | F2 | F4 | F0
Add (3) | No
Div (4) | Yes | DIV | F0 | F6 | F10 | Mul | 0 | No, Yes

Result Status
Register | F0 | F1 | F2 | F3 | F4 | F6 | F8 | F10 | F12
FU | MUL(2) | | | | | | 0 | DIV(4) |
Instruction Status (Stage Completed)
Instruction | Issue | Read Operands | Execution Complete | Write Result | Comments
L.D F6, 34(R2) | 1 | 2 | 3 | 4 |
L.D F2, 45(R3) | 5 | 6 | 7 | 8 |
MUL.D F0, F2, F4 | 6 | 7-8, 9 | 10, 11, 12, 13 | 20 | 10 CC in EXE
SUB.D F8, F6, F2 | 7 | 8 | | |
DIV.D F10, F0, F6 | | | | | Stalled at RO stage
ADD.D F6, F8, F2 | | | | | FU free, so issue

Name of FU and ID | Busy | Op Code | Fj | Fk | Fi (Dest) | Qj | Qk | Rj, Rk
Int (1) | No
Mult (2) | Yes | MUL | F2 | F4 | F0
Add (3) | Yes | ADD | F8 | F2 | F6 | 0 | 0 | Yes
Div (4) | Yes | DIV | F0 | F6 | F10 | Mul | 0 | No, Yes

Result Status
Register | F0 | F1 | F2 | F3 | F4 | F6 | F8 | F10 | F12
FU | MUL(2) | | | | | Add(3) | | DIV(4) |
Instruction Status (Stage Completed)
Instruction | Issue | Read Operands | Execution Complete | Write Result | Comments
L.D F6, 34(R2) | 1 | 2 | 3 | 4 |
L.D F2, 45(R3) | 5 | 6 | 7 | 8 |
MUL.D F0, F2, F4 | 6 | 7-8, 9 | 10, 11, 12, 13 | | 10 CC in EXE
SUB.D F8, F6, F2 | 7 | 8 | | |
DIV.D F10, F0, F6 | | | | | Stalled at RO stage
ADD.D F6, F8, F2 | | | | | Reads operands

Name of FU and ID | Busy | Op Code | Fj | Fk | Fi (Dest) | Qj | Qk | Rj, Rk
Int (1) | No
Mult (2) | Yes | MUL | F2 | F4 | F0
Add (3) | Yes | ADD | F8 | F2 | F6 | 0 | 0 | Yes
Div (4) | Yes | DIV | F0 | F6 | F10 | Mul | 0 | No, Yes

Result Status
Register | F0 | F1 | F2 | F3 | F4 | F6 | F8 | F10 | F12
FU | MUL(2) | | | | | Add(3) | | DIV(4) |
Instruction Status (Stage Completed)
Instruction | Issue | Read Operands | Execution Complete | Write Result | Comments
L.D F6, 34(R2) | 1 | 2 | 3 | 4 |
L.D F2, 45(R3) | 5 | 6 | 7 | 8 |
MUL.D F0, F2, F4 | 6 | 7-8 | | | 10 CC in EXE
SUB.D F8, F6, F2 | 7 | 8 | | |
DIV.D F10, F0, F6 | | | | | Stalled at RO stage
ADD.D F6, F8, F

Name of FU and ID | Busy | Op Code | Fj | Fk | Fi (Dest) | Qj | Qk | Rj, Rk
Int (1) | No
Mult (2) | Yes | MUL | F2 | F4 | F0
Add (3) | Yes | ADD | F8 | F2 | F6 | 0 | 0 | No
Div (4) | Yes | DIV | F0 | F6 | F10 | Mul | 0 | No, Yes

Result Status
Register | F0 | F1 | F2 | F3 | F4 | F6 | F8 | F10 | F12
FU | MUL(2) | | | | | Add(3) | | DIV(4) |
Instruction Status (Stage Completed)
Instruction | Issue | Read Operands | Execution Complete | Write Result | Comments
L.D F6, 34(R2) | 1 | 2 | 3 | 4 |
L.D F2, 45(R3) | 5 | 6 | 7 | 8 |
MUL.D F0, F2, F4 | 6 | 7-8 | | | 10 CC in EXE
SUB.D F8, F6, F2 | 7 | 8 | | |
DIV.D F10, F0, F6 | | | | | Stalled at RO stage
ADD.D F6, F8, F2 | | | | | 2 CC in EXE

Name of FU and ID | Busy | Op Code | Fj | Fk | Fi (Dest) | Qj | Qk | Rj, Rk
Int (1) | No
Mult (2) | Yes | MUL | F2 | F4 | F0
Add (3) | Yes | ADD | F8 | F2 | F6
Div (4) | Yes | DIV | F0 | F6 | F10 | Mul | 0 | No, Yes

Result Status
Register | F0 | F1 | F2 | F3 | F4 | F6 | F8 | F10 | F12
FU | MUL(2) | | | | | Add(3) | | DIV(4) |
Instruction Status (Stage Completed)
Instruction | Issue | Read Operands | Execution Complete | Write Result | Comments
L.D F6, 34(R2) | 1 | 2 | 3 | 4 |
L.D F2, 45(R3) | 5 | 6 | 7 | 8 |
MUL.D F0, F2, F4 | 6 | 7-8 | | | 10 CC in EXE
SUB.D F8, F6, F2 | 7 | 8 | | |
DIV.D F10, F0, F6 | | | | | Stalled at RO stage
ADD.D F6, F8, F2 | | | | 22 | Waits in WB because of WAR (DIV.D has not yet read F6)

Name of FU and ID | Busy | Op Code | Fj | Fk | Fi (Dest) | Qj | Qk | Rj, Rk
Int (1) | No
Mult (2) | Yes | MUL | F2 | F4 | F0
Add (3) | Yes | ADD | F8 | F2 | F6
Div (4) | Yes | DIV | F0 | F6 | F10 | Mul | 0 | No, Yes

Result Status
Register | F0 | F1 | F2 | F3 | F4 | F6 | F8 | F10 | F12
FU | MUL(2) | | | | | Add(3) | | DIV(4) |
Instruction Status (Stage Completed)
Instruction | Issue | Read Operands | Execution Complete | Write Result | Comments
L.D F6, 34(R2) | 1 | 2 | 3 | 4 |
L.D F2, 45(R3) | 5 | 6 | 7 | 8 |
MUL.D F0, F2, F4 | 6 | 7-8 | | | 10 CC in EXE
SUB.D F8, F6, F2 | 7 | 8 | | |
DIV.D F10, F0, F6 | | | | | Stalled at RO stage
ADD.D F6, F8, F2 | | | ,18, | 22 | Waits in WB because of WAR (DIV.D has not yet read F6)

Name of FU and ID | Busy | Op Code | Fj | Fk | Fi (Dest) | Qj | Qk | Rj, Rk
Int (1) | No
Mult (2) | Yes | MUL | F2 | F4 | F0
Add (3) | Yes | ADD | F8 | F2 | F6
Div (4) | Yes | DIV | F0 | F6 | F10 | Mul | 0 | No, Yes

Result Status
Register | F0 | F1 | F2 | F3 | F4 | F6 | F8 | F10 | F12
FU | MUL(2) | | | | | Add(3) | | DIV(4) |
Instruction Status (Stage Completed)
Instruction | Issue | Read Operands | Execution Complete | Write Result | Comments
L.D F6, 34(R2) | 1 | 2 | 3 | 4 |
L.D F2, 45(R3) | 5 | 6 | 7 | 8 |
MUL.D F0, F2, F4 | 6 | 7-8 | | | 10 CC in EXE
SUB.D F8, F6, F2 | 7 | 8 | | |
DIV.D F10, F0, F6 | | | | | Stalled at RO stage
ADD.D F6, F8, F2 | | | | 22 | Waits in WB because of WAR (DIV.D has not yet read F6)

Name of FU and ID | Busy | Op Code | Fj | Fk | Fi (Dest) | Qj | Qk | Rj, Rk
Int (1) | No
Mult (2) | Yes | MUL | F2 | F4 | F0
Add (3) | Yes | ADD | F8 | F2 | F6
Div (4) | Yes | DIV | F0 | F6 | F10 | Mul | 0 | No, Yes

Result Status
Register | F0 | F1 | F2 | F3 | F4 | F6 | F8 | F10 | F12
FU | MUL(2) | | | | | Add(3) | | DIV(4) |
Instruction Status (Stage Completed)
Instruction | Issue | Read Operands | Execution Complete | Write Result | Comments
L.D F6, 34(R2) | 1 | 2 | 3 | 4 |
L.D F2, 45(R3) | 5 | 6 | 7 | 8 |
MUL.D F0, F2, F4 | 6 | 7-8 | | | Finally MUL finished
SUB.D F8, F6, F2 | 7 | 8 | | |
DIV.D F10, F0, F6 | 8 | 9-20 | | | Stalled at RO stage
ADD.D F6, F8, F2 | | | | 22 | Waits in WB because of WAR (DIV.D has not yet read F6)

Name of FU and ID | Busy | Op Code | Fj | Fk | Fi (Dest) | Qj | Qk | Rj, Rk
Int (1) | No
Mult (2) | No
Add (3) | Yes | ADD | F8 | F2 | F6
Div (4) | Yes | DIV | F0 | F6 | F10 | 0 | 0 | Yes

Result Status
Register | F0 | F1 | F2 | F3 | F4 | F6 | F8 | F10 | F12
FU | 0 | | | | | Add(3) | | DIV(4) |
Instruction Status (Stage Completed)
Instruction | Issue | Read Operands | Execution Complete | Write Result | Comments
L.D F6, 34(R2) | 1 | 2 | 3 | 4 |
L.D F2, 45(R3) | 5 | 6 | 7 | 8 |
MUL.D F0, F2, F4 | 6 | 7-8 | | |
SUB.D F8, F6, F2 | 7 | 8 | | |
DIV.D F10, F0, F6 | 8 | 9-20 | | | Reads operands
ADD.D F6, F8, F2 | | | | 22 | Waits in WB because of WAR (DIV.D is only now reading F6)

Name of FU and ID | Busy | Op Code | Fj | Fk | Fi (Dest) | Qj | Qk | Rj, Rk
Int (1) | No
Mult (2) | No
Add (3) | Yes | ADD | F8 | F2 | F6
Div (4) | Yes | DIV | F0 | F6 | F10 | 0 | 0 | Yes

Result Status
Register | F0 | F1 | F2 | F3 | F4 | F6 | F8 | F10 | F12
FU | | | | | | Add(3) | | DIV(4) |
Instruction Status (Stage Completed)
Instruction | Issue | Read Operands | Execution Complete | Write Result | Comments
L.D F6, 34(R2) | 1 | 2 | 3 | 4 |
L.D F2, 45(R3) | 5 | 6 | 7 | 8 |
MUL.D F0, F2, F4 | 6 | 7-8 | | |
SUB.D F8, F6, F2 | 7 | 8 | | |
DIV.D F10, F0, F6 | 8 | 9-20, 21 | 22, 61 | 62 | 40 CC in EXE
ADD.D F6, F8, F | | | | 22 |

Name of FU and ID | Busy | Op Code | Fj | Fk | Fi (Dest) | Qj | Qk | Rj, Rk
Int (1) | No
Mult (2) | No
Add (3) | No
Div (4) | Yes | DIV | F0 | F6 | F10 | 0 | 0 | No

Result Status
Register | F0 | F1 | F2 | F3 | F4 | F6 | F8 | F10 | F12
FU | | | | | | | | DIV(4) |
Instruction Status (Stage Completed)
Instruction | Issue | Read Operands | Execution Complete | Write Result | Comments
L.D F6, 34(R2) | 1 | 2 | 3 | 4 |
L.D F2, 45(R3) | 5 | 6 | 7 | 8 |
MUL.D F0, F2, F4 | 6 | 7-8 | | |
SUB.D F8, F6, F2 | 7 | 8 | | |
DIV.D F10, F0, F6 | 8 | 9-20, 21 | 22, 61 | 62 | 40 CC in EXE
ADD.D F6, F8, F | | | | 22 |

Name of FU and ID | Busy | Op Code | Fj | Fk | Fi (Dest) | Qj | Qk | Rj, Rk
Int (1) | No | X | X | X | X | 0 | 0 |
Mult (2) | No | X | X | X | X | 0 | 0 |
Add (3) | No | X | X | X | X | 0 | 0 | No
Div (4) | Yes | DIV | F0 | F6 | F10 | 0 | 0 | No

Result Status
Register | F0 | F1 | F2 | F3 | F4 | F6 | F8 | F10 | F12
FU | | | | | | | | DIV(4) |
Instruction Status (Stage Completed)
Instruction | Issue | Read Operands | Execution Complete | Write Result | Comments
L.D F6, 34(R2) | 1 | 2 | 3 | 4 |
L.D F2, 45(R3) | 5 | 6 | 7 | 8 |
MUL.D F0, F2, F4 | 6 | 7-8 | | |
SUB.D F8, F6, F2 | 7 | 8 | | |
DIV.D F10, F0, F6 | 8 | 9-20, 21 | 22, 61 | 62 |
ADD.D F6, F8, F2 | | | | 22 | Finally writes

Name of FU and ID | Busy | Op Code | Fj | Fk | Fi (Dest) | Qj | Qk | Rj, Rk
Int (1) | No | X | X | X | X | 0 | 0 |
Mult (2) | No | X | X | X | X | 0 | 0 |
Add (3) | No | X | X | X | X | 0 | 0 | No
Div (4) | No | X | X | X | X | 0 | 0 | No

Result Status
Register | F0 | F1 | F2 | F3 | F4 | F6 | F8 | F10 | F12
FU | | | | | | | | |
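The worked example can be replayed with a small cycle-level model. This is a sketch under stated assumptions rather than a model of the real hardware: one unit of each type, execution latencies of 10 (MUL.D), 2 (ADD.D/SUB.D) and 40 (DIV.D) clock cycles as given in the slide comments, a load latency of 1 inferred from the 1-2-3-4 load timeline, and the rule that an event in cycle c can only enable a dependent event in cycle c+1 (no forwarding). The names `Instr`, `simulate` and `example_program` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumed execution latencies in clock cycles (see lead-in)
LATENCY = {"int": 1, "mult": 10, "add": 2, "div": 40}


@dataclass
class Instr:
    text: str
    fu: str                      # functional unit type the instruction needs
    dest: str
    srcs: Tuple[str, ...]
    issue: Optional[int] = None
    read: Optional[int] = None
    done: Optional[int] = None   # execution complete
    write: Optional[int] = None


def before(event, cycle):
    """True if the event happened in a strictly earlier cycle."""
    return event is not None and event < cycle


def simulate(program):
    cycle, next_issue = 0, 0
    while any(ins.write is None for ins in program):
        cycle += 1
        for idx, ins in enumerate(program[:next_issue]):
            earlier = program[:idx]
            if ins.read is None:
                # RAW: every earlier producer of a source must have written
                raw_clear = all(before(e.write, cycle)
                                for e in earlier if e.dest in ins.srcs)
                if before(ins.issue, cycle) and raw_clear:
                    ins.read = cycle
                    ins.done = cycle + LATENCY[ins.fu]
            elif ins.write is None:
                # WAR: every earlier reader of our destination must have read
                war_clear = all(before(e.read, cycle)
                                for e in earlier if ins.dest in e.srcs)
                if before(ins.done, cycle) and war_clear:
                    ins.write = cycle
        if next_issue < len(program):
            ins = program[next_issue]
            active = [e for e in program[:next_issue]
                      if not before(e.write, cycle)]
            # Structural hazard (same FU) or WAW (same destination) blocks issue
            if all(e.fu != ins.fu and e.dest != ins.dest for e in active):
                ins.issue = cycle
                next_issue += 1
    return program


def example_program():
    return [
        Instr("L.D F6, 34(R2)", "int", "F6", ("R2",)),
        Instr("L.D F2, 45(R3)", "int", "F2", ("R3",)),
        Instr("MUL.D F0, F2, F4", "mult", "F0", ("F2", "F4")),
        Instr("SUB.D F8, F6, F2", "add", "F8", ("F6", "F2")),
        Instr("DIV.D F10, F0, F6", "div", "F10", ("F0", "F6")),
        Instr("ADD.D F6, F8, F2", "add", "F6", ("F8", "F2")),
    ]


for ins in simulate(example_program()):
    print(f"{ins.text:20} {ins.issue:3} {ins.read:3} {ins.done:3} {ins.write:3}")
```

Under these assumptions the model gives ADD.D an issue cycle of 13 and a write cycle of 22, and DIV.D a write cycle of 62, in line with the cycle numbers that survive on the slides.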
Scoreboard Limitations
    Long WAW delays. Ideally, waiting for values (resolving RAW hazards) should be separated from issue.
    The basic scoreboard stalls on both WAR and WAW hazards, which could be resolved by register renaming (discussed in chapters 3 and 4).
    No way to deal with memory-based RAW (or WAR or WAW) hazards: e.g. any memory load must wait for all outstanding stores to complete, in case they are to the same address (unless the compiler can detect this?).
    No memory-based operations can occur in parallel or out of order, which again limits instruction-level parallelism.
    Deferred READ: another problem, not mentioned in the book but often associated with scoreboards.
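To illustrate why register renaming removes the WAR and WAW stalls mentioned above, here is a minimal sketch. It is not the mechanism of any particular machine: the function `rename`, the physical register names `P0`, `P1`, ... and the third instruction in the usage example are all hypothetical.

```python
def rename(program):
    """Give every destination a fresh physical register.

    program: list of (op, dest, src1, src2) tuples in program order.
    Sources are mapped to the most recent physical name of that register.
    """
    mapping = {}   # architectural register -> current physical register
    renamed = []
    for count, (op, dest, src1, src2) in enumerate(program):
        s1 = mapping.get(src1, src1)
        s2 = mapping.get(src2, src2)
        phys = f"P{count}"          # fresh name, never reused in this sketch
        mapping[dest] = phys
        renamed.append((op, phys, s1, s2))
    return renamed


program = [
    ("DIV.D", "F10", "F0", "F6"),
    ("ADD.D", "F6", "F8", "F2"),   # WAR on F6 against DIV.D
    ("MUL.D", "F4", "F6", "F2"),   # hypothetical later reader of the new F6
]
for ins in rename(program):
    print(ins)
```

After renaming, ADD.D writes P1 while DIV.D still reads the old F6, so ADD.D no longer has to wait in write-back; only the true (RAW) dependence from ADD.D to the later reader remains.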