Data Dependence Types and Associated Pipeline Hazards
Chapter 4 — The Processor — 1
Sections 4.7
Type #1: Data Flow Dependence (True Dependence)
- Instr K reads an operand written by Instr M.
- Instr K is data dependent (aka true dependent) on Instr M.
  - Data must flow from instruction M to instruction K.
- If two instructions are data dependent, they cannot execute simultaneously or be completely overlapped.
- A data dependence in an instruction sequence reflects a data dependence in the source code.
- If a data dependence causes a hazard in the pipeline, it is a RAW (Read After Write) hazard.

M: add $s0, $t0, $t1
K: sub $t2, $s0, $t3
RAW Data Hazards
add $s0, $t0, $t1
sub $t2, $s0, $t3
Type #2: Name Dependence (Anti-dependence)
- A name dependence occurs when two instructions (e.g. N and M) use the same register or memory location (called a name), but no data flows between the instructions through that name.
- Instr M writes an operand that the earlier Instr N reads.
  - Called an “anti-dependence” by compiler writers (we say that an anti-dependence exists between instruction N and instruction M).
- This results from the reuse of the name “r1”.
- If an anti-dependence causes a hazard in the pipeline, it is a Write After Read (WAR) hazard.

N: sub r4,r1,r3
M: add r1,r2,r3
K: mul r6,r1,r7
CPE432 Chapter 4B.5 Dr. W. Abu-Sufah, UJ
Anti-dependence does NOT cause a WAR Data Hazard in this pipeline
[Figure: pipeline diagram over clock cycles CC0 to CC8; each instruction passes through IM, Reg, ALU, DM, Reg]
sub $s0,$t0,$t1
add $t0,$t2,$t3
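The timing argument behind this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the original slides: a classic 5-stage pipeline (IF, ID, EX, MEM, WB), one instruction fetched per cycle, registers read in stage 2 and written in stage 5, no forwarding, and cycles counted from 1 at the first fetch. The function names are hypothetical.

```python
# Assumed stage positions in a classic 5-stage pipeline: IF, ID, EX, MEM, WB.
READ_STAGE = 2   # registers are read in the decode (Reg) stage
WRITE_STAGE = 5  # registers are written in the write-back (Reg) stage


def stage_cycle(fetch_cycle: int, stage: int) -> int:
    """Clock cycle in which an instruction fetched at fetch_cycle occupies a stage."""
    return fetch_cycle + stage - 1


def war_hazard(reader_fetch: int, writer_fetch: int) -> bool:
    """Earlier instruction reads, later one writes the same register.

    A hazard would need the later write to land no later than the earlier read.
    """
    return stage_cycle(writer_fetch, WRITE_STAGE) <= stage_cycle(reader_fetch, READ_STAGE)


def raw_hazard(writer_fetch: int, reader_fetch: int) -> bool:
    """Earlier instruction writes, later one reads the same register.

    Without forwarding, a hazard exists if the read comes before the write.
    """
    return stage_cycle(reader_fetch, READ_STAGE) < stage_cycle(writer_fetch, WRITE_STAGE)


# sub $s0,$t0,$t1 (fetched in cycle 1) reads $t0 in cycle 2;
# add $t0,$t2,$t3 (fetched in cycle 2) writes $t0 in cycle 6.
print(war_hazard(reader_fetch=1, writer_fetch=2))  # False

# add $s0,$t0,$t1 (cycle 1) writes $s0 in cycle 5;
# sub $t2,$s0,$t3 (cycle 2) would read $s0 in cycle 3.
print(raw_hazard(writer_fetch=1, reader_fetch=2))  # True
```

Because every instruction reads early (stage 2) and writes late (stage 5) in program order, a later write can never overtake an earlier read in this pipeline, while a later read can easily come before an earlier write.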
Type #3: Name Dependence (Output dependence)
- Instr M writes an operand that Instr N also writes.
  - Called an “output dependence” by compiler writers.
- This also results from the reuse of the name “r1”.
- If an output dependence causes a hazard in the pipeline, it is a Write After Write (WAW) hazard.

N: sub r1,r4,r3
M: add r1,r2,r3
K: mul r6,r1,r7
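The three dependence types can be found mechanically by tracking, for each register, its most recent writer and the instructions that have read it since. The Python sketch below is illustrative only; the `Instr` tuple and `find_dependences` are hypothetical names, and memory locations are ignored (registers only). The hazard names RAW, WAR, and WAW are used as labels for true, anti-, and output dependences.

```python
from typing import Dict, List, NamedTuple, Tuple


class Instr(NamedTuple):
    label: str              # e.g. "N", "M", "K"
    dest: str               # register written
    srcs: Tuple[str, ...]   # registers read


def find_dependences(program: List[Instr]) -> List[Tuple[str, str, str, str]]:
    """Return (earlier, later, register, kind) for each register dependence."""
    last_writer: Dict[str, str] = {}    # register -> label of most recent writer
    readers: Dict[str, List[str]] = {}  # register -> readers since that write
    deps = []
    for ins in program:
        sources = list(dict.fromkeys(ins.srcs))  # drop repeated source registers
        # True dependence: this instruction reads a value an earlier one wrote.
        for reg in sources:
            if reg in last_writer:
                deps.append((last_writer[reg], ins.label, reg, "RAW"))
        # Anti-dependence: this instruction overwrites a name read earlier.
        for reader in readers.get(ins.dest, []):
            deps.append((reader, ins.label, ins.dest, "WAR"))
        # Output dependence: this instruction overwrites a name written earlier.
        if ins.dest in last_writer:
            deps.append((last_writer[ins.dest], ins.label, ins.dest, "WAW"))
        # Update the bookkeeping for later instructions.
        for reg in sources:
            readers.setdefault(reg, []).append(ins.label)
        last_writer[ins.dest] = ins.label
        readers[ins.dest] = []
    return deps


anti_example = [
    Instr("N", "r4", ("r1", "r3")),  # sub r4,r1,r3
    Instr("M", "r1", ("r2", "r3")),  # add r1,r2,r3
    Instr("K", "r6", ("r1", "r7")),  # mul r6,r1,r7
]
output_example = [
    Instr("N", "r1", ("r4", "r3")),  # sub r1,r4,r3
    Instr("M", "r1", ("r2", "r3")),  # add r1,r2,r3
    Instr("K", "r6", ("r1", "r7")),  # mul r6,r1,r7
]

print(find_dependences(anti_example))
# [('N', 'M', 'r1', 'WAR'), ('M', 'K', 'r1', 'RAW')]
print(find_dependences(output_example))
# [('N', 'M', 'r1', 'WAW'), ('M', 'K', 'r1', 'RAW')]
```

In both sequences K is truly dependent on M only, because M is the last writer of r1 before K reads it.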
Code Scheduling to Avoid Stalls
- Reorder code to avoid use of a load result in the next instruction.
- C code for A = B + E; C = B + F;

Code Order #1:
lw  $t1, 0($t0)
lw  $t2, 4($t0)
(stall)
add $t3, $t1, $t2
sw  $t3, 12($t0)
lw  $t4, 8($t0)
(stall)
add $t5, $t1, $t4
sw  $t5, 16($t0)

13 cycles
2 Stalls: 13 Cycles
Code Scheduling to Avoid Stalls

Code Order #2:
lw  $t1, 0($t0)
lw  $t2, 4($t0)
lw  $t4, 8($t0)
add $t3, $t1, $t2
sw  $t3, 12($t0)
add $t5, $t1, $t4
sw  $t5, 16($t0)

11 cycles
Reordering of Code; No Hazards: 11 Cycles
lw  $t2, 4($t0)
add $t3, $t1, $t2
lw  $t4, 8($t0)
add $t5, $t1, $t4
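The 13-cycle and 11-cycle totals can be reproduced with a small Python sketch. It assumes a 5-stage pipeline with forwarding, so the only stall is one cycle when an instruction uses the result of the load immediately before it, and it takes total cycles as instruction count plus 4 fill cycles plus stalls. The helper names are hypothetical, and a `sw` right after a `lw` of the same register is conservatively counted as a stall (this case does not occur in these sequences).

```python
import re

PIPELINE_STAGES = 5  # assumed classic 5-stage pipeline


def parse(line):
    """Split an instruction into (opcode, registers in operand order)."""
    op, rest = line.split(None, 1)
    return op, re.findall(r"\$\w+", rest)


def load_use_stalls(program):
    """Count one stall for each instruction that uses the preceding load's result."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        prev_op, prev_regs = parse(prev)
        cur_op, cur_regs = parse(cur)
        if prev_op != "lw":
            continue
        loaded = prev_regs[0]
        # sw reads all of its registers; other instructions write the first one.
        sources = cur_regs if cur_op == "sw" else cur_regs[1:]
        if loaded in sources:
            stalls += 1
    return stalls


def total_cycles(program):
    """Instructions + pipeline fill (stages - 1) + load-use stalls."""
    return len(program) + (PIPELINE_STAGES - 1) + load_use_stalls(program)


ORDER_1 = [
    "lw $t1, 0($t0)", "lw $t2, 4($t0)", "add $t3, $t1, $t2", "sw $t3, 12($t0)",
    "lw $t4, 8($t0)", "add $t5, $t1, $t4", "sw $t5, 16($t0)",
]
ORDER_2 = [
    "lw $t1, 0($t0)", "lw $t2, 4($t0)", "lw $t4, 8($t0)", "add $t3, $t1, $t2",
    "sw $t3, 12($t0)", "add $t5, $t1, $t4", "sw $t5, 16($t0)",
]

print(load_use_stalls(ORDER_1), total_cycles(ORDER_1))  # 2 13
print(load_use_stalls(ORDER_2), total_cycles(ORDER_2))  # 0 11
```

Moving `lw $t4` up separates each load from the `add` that uses it, which removes both stalls without changing any dependence between the instructions.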