Published by Georgina Smith. Modified over 9 years ago.
1
EE524/CptS561 Jose G. Delgado-Frias
Processor
Basic steps to process an instruction
– IF: Instruction Fetch
– ID/OF: Instruction Decode / Operand Fetch
– EX: Execute
– MEM: Memory Access
– WB: Write Back
2
Datapath
Stages: Instruction Fetch, Inst. Dec. / Op. Fetch, Execute, Memory Access, Write Back
[Figure: datapath with PC, +4 adder, Inst. Mem., IR, Reg, imm, A, B, ALU (zero), Data Mem., NPC, and multiplexers (mux)]
Instruction Fetch
– IR ← Mem[PC]
– NPC ← PC + 4
Inst. Dec. / Op. Fetch
– A ← Reg[IR_{6..10}]
– B ← Reg[IR_{11..15}]
– Imm ← ((IR_{16})^{16} ## IR_{16..31})
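The operand-fetch transfers can be sketched in a few lines of Python. This is only an illustration under stated assumptions: bits are numbered from 0 (most significant) to 31, the immediate is taken to be the low 16 bits of IR, and the helper names `bits`, `sign_extend16` and `decode` are hypothetical, not part of the slides.

```python
def bits(ir: int, first: int, last: int) -> int:
    """Return IR_{first..last}, numbering bits from 0 (most significant) to 31."""
    width = last - first + 1
    return (ir >> (31 - last)) & ((1 << width) - 1)


def sign_extend16(value: int) -> int:
    """Replicate the sign bit of a 16-bit field; the result is a signed Python int."""
    return value - (1 << 16) if value & 0x8000 else value


def decode(ir: int, regs: list) -> dict:
    """Operand fetch: compute A, B and Imm from the instruction register."""
    return {
        "A": regs[bits(ir, 6, 10)],
        "B": regs[bits(ir, 11, 15)],
        "Imm": sign_extend16(bits(ir, 16, 31)),
    }


regs = list(range(32))                # hypothetical register file: Reg[i] = i
ir = (2 << 21) | (3 << 16) | 0xFFFC   # register fields 2 and 3, immediate -4
print(decode(ir, regs))
```

The ## operator on the slide is concatenation; treating the result as a signed integer has the same effect as prepending sixteen copies of the sign bit.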
3
Datapath (Arith/Logic Inst.)
[Figure: datapath (PC, +4, Inst. Mem., IR, Reg, imm, A, B, ALU, zero, Data Mem.)]
– IF: IR ← Mem[PC]; NPC ← PC + 4
– ID/OF: A ← Reg[IR_{6..10}]; B ← Reg[IR_{11..15}]; Imm ← ((IR_{16})^{16} ## IR_{16..31})
– EX: ALUoutput ← A op B, or ALUoutput ← A op Imm
– WB: Reg[IR_{16..20}] ← ALUoutput
4
Datapath (Load Inst.)
[Figure: datapath (PC, +4, Inst. Mem., IR, Reg, imm, A, B, ALU, zero, Data Mem.)]
– IF: IR ← Mem[PC]; NPC ← PC + 4
– ID/OF: A ← Reg[IR_{6..10}]; B ← Reg[IR_{11..15}]; Imm ← ((IR_{16})^{16} ## IR_{16..31})
– EX: ALUoutput ← A op Imm
– WB: Reg[IR_{11..15}] ← LMD (the load memory data read from Data Mem.)
5
Datapath (Store Inst.)
[Figure: datapath (PC, +4, Inst. Mem., IR, Reg, imm, A, B, ALU, zero, Data Mem.)]
– IF: IR ← Mem[PC]; NPC ← PC + 4
– ID/OF: A ← Reg[IR_{6..10}]; B ← Reg[IR_{11..15}]; Imm ← ((IR_{16})^{16} ## IR_{16..31})
– EX: ALUoutput ← A op Imm
– MEM: Mem[ALUoutput] ← B
6
Datapath (Branch Inst.)
[Figure: datapath (PC, +4, Inst. Mem., IR, Reg, imm, A, B, ALU, zero, Data Mem.)]
– IF: IR ← Mem[PC]; NPC ← PC + 4
– ID/OF: A ← Reg[IR_{6..10}]; B ← Reg[IR_{11..15}]; Imm ← ((IR_{16})^{16} ## IR_{16..31})
– EX: ALUoutput ← (PC+4) op Imm
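The per-class behaviour of the Execute, Memory Access and Write Back steps on the datapath slides can be summarised in a small sketch. The instruction-class names, the function names and the use of a dictionary for Data Mem. are assumptions made for illustration; "op" defaults to addition here, which is only one possible ALU operation.

```python
import operator


def execute(kind, A, B, Imm, NPC, op=operator.add):
    """EX step: choose the ALU inputs according to the instruction class."""
    if kind == "alu_rr":                      # ALUoutput <- A op B
        return op(A, B)
    if kind in ("alu_imm", "load", "store"):  # ALUoutput <- A op Imm
        return op(A, Imm)
    if kind == "branch":                      # ALUoutput <- (PC+4) op Imm
        return op(NPC, Imm)
    raise ValueError(f"unknown instruction class: {kind}")


def memory_and_writeback(kind, alu_output, B, dest, regs, mem):
    """MEM and WB steps; regs and mem are modified in place."""
    if kind == "load":
        lmd = mem[alu_output]      # LMD holds the word read from Data Mem.
        regs[dest] = lmd           # Reg[dest] <- LMD
    elif kind == "store":
        mem[alu_output] = B        # Mem[ALUoutput] <- B
    elif kind in ("alu_rr", "alu_imm"):
        regs[dest] = alu_output    # Reg[dest] <- ALUoutput


regs = [0] * 32
mem = {8: 99}                                  # hypothetical memory contents
address = execute("load", A=4, B=0, Imm=4, NPC=0)
memory_and_writeback("load", address, B=0, dest=5, regs=regs, mem=mem)
print(regs[5])
```

Branches fall through `memory_and_writeback` without touching state; the sketch only computes the target address and leaves the condition test (the "zero" signal in the figure) out.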
7
Instructions of a program
[Figure: instructions executed one after another along the axis "Time (clock cycles)"]
1: IF ID EX MEM WB
2: IF ID EX WB
3: IF ID
8
Instructions of a program
[Figure: pipelined execution over CLOCK CYCLE 1 to 8. Each instruction goes through IF, ID, EX, MEM, WB, and successive instructions start one clock cycle apart, so their stages overlap.]
9
Pipelining Lessons
– Pipelining doesn’t help the latency of a single task; it helps the throughput of the entire workload
– Pipeline rate is limited by the slowest pipeline stage
– Multiple tasks operate simultaneously
– Potential speedup = number of pipe stages
– Unbalanced lengths of pipe stages reduce speedup
– Time to “fill” the pipeline and time to “drain” it reduce speedup
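A short calculation makes these lessons concrete. The model below is a simplification chosen for illustration, not something stated on the slide: the unpipelined machine is assumed to spend the sum of all stage times on each instruction, the pipelined clock equals the slowest stage, and no hazards occur. The function name `pipeline_speedup` is hypothetical.

```python
def pipeline_speedup(stage_times, n_instructions):
    """Speedup of an ideal pipeline over unpipelined execution."""
    k = len(stage_times)
    unpipelined = n_instructions * sum(stage_times)
    clock = max(stage_times)                      # rate limited by slowest stage
    pipelined = (k + n_instructions - 1) * clock  # k - 1 extra cycles to fill/drain
    return unpipelined / pipelined


balanced = [1, 1, 1, 1, 1]
unbalanced = [1, 1, 2, 1, 1]

print(pipeline_speedup(balanced, 1))       # a single task gains nothing
print(pipeline_speedup(balanced, 1000))    # approaches the number of stages
print(pipeline_speedup(unbalanced, 1000))  # the slow stage caps the gain
```

With one instruction the speedup is exactly 1; with many instructions and balanced stages it approaches 5, and the unbalanced case stays below 3 because every stage runs at the pace of the 2-unit stage.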
10
Datapath w/ pipeline
[Figure: pipelined datapath (PC, +4, Inst. Mem., Reg, ALU, zero, Data Mem.) with pipeline registers driven by Clock]
11
Datapath w/ pipeline
[Figure: pipelined datapath (PC, +4, Inst. Mem., Reg, ALU, zero, Data Mem.)]
12
Pipeline
Rows: INSTRUCTIONS 1 to 9. Columns: CLOCK CYCLE 1 to 9.
Inst.  1      2      3      4      5      6      7      8      9
1      IF     ID/OF  EX     MEM    WB
2             IF     ID/OF  EX     MEM    WB
3                    IF     ID/OF  EX     MEM    WB
4                           IF     ID/OF  EX     MEM    WB
5                                  IF     ID/OF  EX     MEM    WB
6                                         IF     ID/OF  EX     MEM
7                                                IF     ID/OF  EX
8                                                       IF     ID/OF
9                                                              IF
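A diagram like this one can be generated programmatically. The sketch below assumes an ideal pipeline in which instruction i enters IF in clock cycle i and never stalls; `pipeline_table` and `render` are hypothetical helper names.

```python
STAGES = ["IF", "ID/OF", "EX", "MEM", "WB"]


def pipeline_table(n_instructions, stages=STAGES, cycles=None):
    """Return one row per instruction; row[c] is the stage occupied in cycle c+1."""
    if cycles is None:
        cycles = n_instructions + len(stages) - 1
    table = []
    for i in range(n_instructions):
        row = [""] * cycles
        for s, name in enumerate(stages):
            if i + s < cycles:          # drop stages past the displayed window
                row[i + s] = name
        table.append(row)
    return table


def render(table, width=7):
    """Format the table as space-aligned text."""
    header = f"{'Inst.':<{width}}" + "".join(
        f"{c:<{width}}" for c in range(1, len(table[0]) + 1))
    lines = [header.rstrip()]
    for i, row in enumerate(table, start=1):
        line = f"{i:<{width}}" + "".join(f"{cell:<{width}}" for cell in row)
        lines.append(line.rstrip())
    return "\n".join(lines)


print(render(pipeline_table(9, cycles=9)))
```

Leaving `cycles` unset shows the full run, which takes n + k - 1 cycles for n instructions and k stages.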
13
Pipeline Hazards
Structural Hazards
– two or more instructions use the same hardware at the same time
Data Hazards
– data dependencies
– a result from inst. j is needed by inst. k
Control Hazards
– a branch changes the flow; what happens to the following instruction(s)?
14
Resources
[Figure: hardware used by overlapping instructions in successive clock cycles: Mem (IM), Reg, ALU, Mem (DM), Reg]
15
Data Hazards
[Figure: overlapping instructions using Mem (IM), Reg, ALU, Mem (DM), Reg in successive clock cycles]
R1 ← R2+R3
R5 ← R1+R3
R8 ← R1-R6
16
Data Forwarding
[Figure: overlapping instructions using Mem (IM), Reg, ALU, Mem (DM), Reg in successive clock cycles]
R1 ← R2+R3
R5 ← R1+R3
R8 ← R1-R6
17
Datapath w/ pipeline
[Figure: pipelined datapath (PC, +4, Inst. Mem., Reg, ALU, zero, Data Mem.) with a Forwarding unit]
18
Example
[Figure: pipelined datapath with Forwarding unit, shown as the instructions below enter the pipeline one by one]
ADD R1,R2,R3
SUB R4,R3,R1
XOR R7,R8,R1
19
Example
[Figure: pipelined datapath with Forwarding unit]
ADD R1..
SUB R4,R3,R1
XOR R7,R8,R1
20
Example
[Figure: pipelined datapath with Forwarding unit]
ADD R1..
SUB R8,R3,R1
XOR R7,R8,R1
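The decision a forwarding unit makes for each ALU source operand can be sketched as follows. The pipeline register names EX/MEM and MEM/WB follow common textbook usage and do not appear on the slides; the function name `forward_source` is hypothetical, and the sketch assumes every instruction in flight writes a register.

```python
def forward_source(src, ex_mem_dest, mem_wb_dest):
    """Pick where an ALU operand comes from.

    ex_mem_dest: destination of the instruction one ahead (result just computed).
    mem_wb_dest: destination of the instruction two ahead.
    The newer result wins when both match.
    """
    if src == ex_mem_dest:
        return "EX/MEM"
    if src == mem_wb_dest:
        return "MEM/WB"
    return "REG"          # no hazard: use the value read from the register file


# XOR R7,R8,R1 is in EX; SUB R8,R3,R1 is one stage ahead, ADD R1.. two ahead.
for operand in ("R8", "R1"):
    print(operand, forward_source(operand, ex_mem_dest="R8", mem_wb_dest="R1"))
```

In the last example above, XOR takes R8 from the SUB result and R1 from the ADD result, so both operands bypass the register file.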
21
Data Hazard Classification
RAW (Read After Write)
– j writes R1; k reads R1 (into RY)
– w/ forwarding, only a load presents a problem
WAW (Write After Write)
– j writes R1; k writes R1
WAR (Write After Read)
– j reads R1; k writes R1
RAR (Read After Read)
– j reads R1; k reads R1
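The four classes can be told apart by comparing the destination and source registers of two instructions j and k, with j earlier in program order. In this sketch each instruction is a hypothetical `(dest, sources)` pair, and `classify` is an illustrative name.

```python
def classify(j, k):
    """Return the set of dependence classes between earlier j and later k."""
    j_dest, j_srcs = j
    k_dest, k_srcs = k
    found = set()
    if j_dest is not None and j_dest in k_srcs:
        found.add("RAW")      # k reads what j writes
    if j_dest is not None and j_dest == k_dest:
        found.add("WAW")      # both write the same register
    if k_dest is not None and k_dest in j_srcs:
        found.add("WAR")      # k overwrites what j reads
    if set(j_srcs) & set(k_srcs):
        found.add("RAR")      # both read the same register (not a hazard)
    return found


add = ("R1", ("R2", "R3"))    # ADD R1,R2,R3
sub = ("R4", ("R3", "R1"))    # SUB R4,R3,R1
print(sorted(classify(add, sub)))
```

A `dest` of `None` stands for an instruction that writes no register, such as a store.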
22
Data Forwarding (load)
[Figure: overlapping instructions using Mem (IM), Reg, ALU, Mem (DM), Reg in successive clock cycles]
R1 ← LD[Mem]
R5 ← R1+R3
R8 ← R1-R6
23
Data hazard (load)
LW R1,0(R1)
SUB R4,R1,R5
AND R6,R1,R7
OR R8,R1,R9
[Figure: pipeline diagram (IF ID EX MEM WB) for the four instructions; a stall is inserted after LW because “R1” is not available until the load's MEM stage]
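Detecting this load-use case amounts to comparing a load's destination with the sources of the instruction that immediately follows it. The sketch assumes a five-stage pipeline with forwarding, where loaded data becomes available after MEM; the tuple layout `(opcode, dest, sources)` and the function name are hypothetical.

```python
def count_load_use_stalls(program):
    """Count stall cycles caused by an instruction using a load result right away."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        opcode, dest, _ = prev
        if opcode == "LW" and dest in cur[2]:
            stalls += 1       # the dependent instruction waits one cycle
    return stalls


program = [
    ("LW",  "R1", ("R1",)),         # LW  R1,0(R1)
    ("SUB", "R4", ("R1", "R5")),    # SUB R4,R1,R5
    ("AND", "R6", ("R1", "R7")),    # AND R6,R1,R7
    ("OR",  "R8", ("R1", "R9")),    # OR  R8,R1,R9
]
print(count_load_use_stalls(program))
```

Only SUB triggers a stall: by the time AND and OR need R1, the loaded value can be forwarded or read from the register file.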
24
Branch
BR R1, LABEL_A
ADD R2,R3,R7
AND R5,R7,R11
::::
LABEL_A: LD R4,R2,005
25
Branch
[Figure: overlapping instructions using Mem (IM), Reg, ALU, Mem (DM), Reg in successive clock cycles]
BR R1, LABEL_A
ADD R2,R3,R7
AND R5,R7,R11
LD R4,R2,005
26
Datapath w/ pipeline
[Figure: pipelined datapath (PC, +4, Inst. Mem., Reg, ALU, zero, Data Mem.) with a Forwarding unit]
27
What to do w/ branch
Reduce the number of cycles needed to decide on a branch
Delayed branch (software solutions)
– NO-OP
– move instructions: from before, from target, or from fall through
28
Branch
[Figure: overlapping instructions using Mem (IM), Reg, ALU, Mem (DM), Reg in successive clock cycles]
BR R1, LABEL_A
ADD R2,R3,R7
LD R4,R2,005
29
NO-OP
[Figure: Branch followed by a NO-OP]
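The NO-OP strategy is the simplest software fix and can be sketched as a pass over the instruction list. The sketch assumes a single delay slot per branch and recognises branches by opcode; `BRANCH_OPCODES` and `pad_delay_slots` are hypothetical names, and a real compiler would first try to fill the slot with a useful instruction.

```python
BRANCH_OPCODES = ("BR", "BEQZ")   # assumed set of branch mnemonics


def pad_delay_slots(program):
    """Insert a NO-OP after every branch so the delay slot does no harm."""
    padded = []
    for inst in program:
        padded.append(inst)
        if inst.split()[0] in BRANCH_OPCODES:
            padded.append("NO-OP")
    return padded


program = ["BR R1, LABEL_A", "ADD R2,R3,R7", "AND R5,R7,R11"]
for inst in pad_delay_slots(program):
    print(inst)
```

The instruction in the slot always executes, so a NO-OP keeps the program correct whether or not the branch is taken, at the cost of one wasted cycle per branch.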
30
From Before
[Figure: Branch]
31
From Target
[Figure: Branch]
32
From Fall Through
[Figure: Branch]
33
Multicycle Operations
[Figure: IF and ID feed one of four execution units (EX inst. unit, FP multiply, FP adder, FP divider), followed by MEM and WB]
34
FP operations
– FP Add: 4 cycles
– FP Multiply: 7 cycles
– FP Divide: 25 cycles
35
Out of order completion
Execution starts in order
Example
Clock cycle  1   2   3   4   5   6   7   8   9   10  11
MULTD        IF  ID  m1  m2  m3  m4  m5  m6  m7  M   W
ADDD             IF  ID  a1  a2  a3  a4  M   W
LD                   IF  ID  X   M   W
SD                       IF  ID  X   M   W
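The completion cycles in this example follow from the execute latencies. The sketch assumes one instruction issues per cycle with no stalls, that LD and SD spend one cycle in execute, and that every instruction passes through M and W after executing; `EX_CYCLES` and `writeback_cycle` are hypothetical names.

```python
EX_CYCLES = {"MULTD": 7, "ADDD": 4, "LD": 1, "SD": 1}   # execute latency in cycles


def writeback_cycle(opcode, issue_cycle):
    """Cycle in which the instruction reaches W.

    IF happens in issue_cycle, ID in the next cycle, then the execute
    cycles, then M and W.
    """
    return issue_cycle + 1 + EX_CYCLES[opcode] + 2


program = ["MULTD", "ADDD", "LD", "SD"]
wb = {op: writeback_cycle(op, i) for i, op in enumerate(program, start=1)}
completion_order = sorted(wb, key=wb.get)
print(wb)
print(completion_order)
```

Although the instructions start in program order, LD and SD finish before ADDD, and MULTD, which started first, finishes last.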
36
MIPS R4000 (Superpipelining)
[Figure: instruction memory (IF, IS), Reg (RF), ALU (EX), data memory (DF, DS, TC), Reg (WB)]
– IF: Instruction fetch, first half
– IS: Instruction fetch, second half
– RF: Inst. decode & register fetch
– EX: Execution
– DF: Data fetch, first half
– DS: Data fetch, second half
– TC: Tag check
– WB: Write back
37
Load
[Figure: four overlapping instructions (instruction memory, Reg, ALU, data memory, Reg) over clock cycles CC1 to CC7]
LW R1
Instruction 1
Instruction 2
ADD R2,R1
38
Branch
[Figure: overlapping instructions (instruction memory, Reg, ALU, data memory, Reg), starting with BEQZ]
39
Branch (taken)
Branch inst:    IF IS RF EX DF DS TC WB
Delay slot:     IF IS RF EX DF DS TC WB
stall:          S  S  S  S  S  S  S  S
Branch target:  IF IS RF EX DF DS TC WB
40
Branch (not taken)
Branch inst:    IF IS RF EX DF DS TC WB
Delay slot:     IF IS RF EX DF DS TC WB
Branch inst+2:  IF IS RF EX DF DS TC WB
Branch inst+3:  IF IS RF EX DF DS TC WB
Branch inst+4:  IF IS RF EX DF DS TC WB