Processor Design: Pipeline


Presentation on theme: "Processor Design: Pipeline"— Presentation transcript:

1 Processor Design: Pipeline
[Adapted from Computer Organization and Design, Patterson & Hennessy, © 2005, UCB]

2 Processor design review
Single clock cycle instructions (CPI = 1) Clock cycle is longer Designing for the worst case Multiple clock cycles per instruction Divided each instruction into components Tried for balance among the functions Better performance ? Some instructions take a little longer. Some instructions take fewer cycles than others. On the average we have improved performance 11/10/2018

3 Design process Determine datapath requirements
Pick an instruction (sometimes one instruction can represent an entire class, e.g. R-type). Determine the datapath required for execution of the instruction. Determine the controls required for the instruction. Find the datapath required for all the instructions. Find the shared path requirements. One approach: develop an input-output matrix, find destinations that have more than one input, and insert multiplexers where necessary. Determine control requirements. CPI = 1: controls are determined by the opcode. CPI > 1: controls are determined by the opcode and system state (finite state machine), with a hardwired (PLA) or software (microprogrammed) implementation.

4 Interrupts, exceptions
Interrupt vs exception Interrupt: External – I/O device request Exception: Internal – OS calls, arithmetic overflow Interrupts are external hardware events Raise an interrupt (hardware) Wait to complete the current instruction Determine the source of the interrupt Save the return address Transfer to relevant Interrupt Service Routine Save the registers that may change Execute the program Can this be interrupted? Restore the registers Return to execution of the program 11/10/2018

5 Exceptions Exceptions are software driven MIPS exception handling
Overflow in an arithmetic instruction Memory access yields an undefined instruction MIPS exception handling Registers Stores address of the problem instruction in EPC – Exception PC Store the cause of the exception in the Cause Register Cause low order bit = 0 (undefined instruction) Cause low order bit = 1 (arithmetic overflow) Additional control signals – IntCause, EPCWrite and CauseWrite Transfer control to specified location in OS OS terminates program or continues processing 11/10/2018

6 Pipeline Review Vector arithmetic – focus on subtraction
Pipeline performance issues Time to execute a single subtraction may be higher as compared to a non-pipeline solution Overall performance improvement If vector length is increased by 1, a single additional clock cycle is needed What if the vector elements are only 8 bits and the design was for 32 bits? If the vector length was 15 in each case then what would be the difference in the time to execute the subtraction in the 2 cases? Pipeline design issues Hardware based Balanced stages In the vector pipeline case all the stages are involved in the computation What if all the stages are not required for each computation? Example: in the case of instruction execution, we know that different instructions require different number of cycles – what does this imply? 11/10/2018

7 Pipelines occur in many every day activities
Assembly Lines Fast food prep What goes wrong in these lines? What can we learn? What should we avoid? 11/10/2018

8 Pipelining is Natural! Laundry Example
Ann, Brian, Cathy, Dave (A, B, C, D) each have one load of clothes to wash, dry, and fold. Washer takes 30 minutes. Dryer takes 30 minutes. “Folder” takes 30 minutes. “Stasher” takes 30 minutes to put clothes into drawers. Copyright 1997 UCB

9 Laundry – 4 loads – each stage takes 0.5 hours
[Figure: task order (loads A to D) versus time, starting at 6 PM, for sequential and for pipelined laundry.] Sequential laundry takes 8 hours. Pipelined operations take 3.5 hours. Speedup = 8/3.5 = 2.3. Pipelining does not reduce time per task, but increases throughput. Involves resource sharing. Balanced tasks are required for a good pipeline. Time to fill the pipeline and time to drain the pipeline limit performance gains – what is the impact of this? © 1998 Morgan Kaufmann Publishers

10 More loads of laundry
[Figure: task order versus time.] For eight loads: sequential laundry 16 hours, pipeline 5.5 hours, speedup = 16/5.5 = 2.9. For twelve loads: sequential laundry 24 hours, pipeline 7.5 hours, speedup = 24/7.5 = 3.2. In the limit, speedup approaches the number of stages (4 in this case). What are the constraints? Balanced pipeline!
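The laundry numbers can be checked with a short calculation. This is only a sketch: the function names are made up for illustration, and it assumes four perfectly balanced 0.5-hour stages with no stalls, as in the laundry example.

```python
def sequential_time(n_tasks, n_stages, stage_time):
    """Each task runs through every stage before the next task starts."""
    return n_tasks * n_stages * stage_time


def pipeline_time(n_tasks, n_stages, stage_time):
    """Fill the pipeline once (n_stages), then one task finishes per stage time."""
    return (n_stages + n_tasks - 1) * stage_time


if __name__ == "__main__":
    for loads in (4, 8, 12, 1000):
        seq = sequential_time(loads, 4, 0.5)
        pipe = pipeline_time(loads, 4, 0.5)
        print(f"{loads} loads: sequential {seq} h, pipelined {pipe} h, "
              f"speedup {seq / pipe:.2f}")
```

As the number of loads grows, the fill time matters less and the speedup moves toward the number of stages.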

11 Review: Time to Execute Instructions
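Each instruction goes through up to 5 steps: IF = 200 ps, Register Read = 100 ps, ALU Ops = 200 ps, Data Access = 200 ps, Register Write = 100 ps.
Instruction type | Instruction fetch | Register read | ALU Ops | Data Access | Register write | Total time
lw | 200 ps | 100 ps | 200 ps | 200 ps | 100 ps | 800 ps
sw | 200 ps | 100 ps | 200 ps | 200 ps | | 700 ps
R format | 200 ps | 100 ps | 200 ps | | 100 ps | 600 ps
beq | 200 ps | 100 ps | 200 ps | | | 500 ps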

12 Pipelined Processor
Start the next instruction while still working on the current one. This improves throughput or bandwidth: the total amount of work done in a given time (average instructions per second or per clock). Instruction latency is not reduced (time from the start of an instruction to its completion). The pipeline clock cycle (pipeline stage time) is limited by the slowest stage. For some instructions, some stages are wasted cycles.
[Figure: lw, sw, and R-type instructions each pass through IFetch, Dec, Exec, Mem, WB, overlapped across cycles 1 to 8.]

13 Pipelining Improve performance by increasing instruction throughput
Time for each lw – single clock cycle, pipeline? Time for “program” – single clock cycle, pipeline? Speed up = time for single datapath / time for pipeline Speed up is a measure of relative performance How can we get max speed up? What is the max speed up you can expect? 11/10/2018

14 Ideal Pipeline - Balanced
[Figure: two overlapped instructions, each passing through IF, ID, EX, MEM, WB.] What is the max speed up? Max speed up <= number of stages.

15 Speed up in our case
Speedup (3 instructions) = 2.4/1.4 = 1.7. For 1000 instructions: Seq: 1000 × 0.8 = 800 ns; Pipeline: (1000 + 4) × 0.2 = 200.8 ns; Speedup = 800.0/200.8 = 3.98. Speedup depends on the longest stage! Don’t forget: all instructions here are the same. What is the impact of changing the instruction mix? What about multicycle?
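The speedup figures can be reproduced from the stage times given earlier (200, 100, 200, 200, 100 ps). The sketch below is illustrative only: the names are hypothetical, and it assumes every instruction is an lw (800 ps in the single-cycle design), a pipeline clock set by the slowest stage (200 ps), and no stalls. A count of 1000 instructions is assumed for the second case because it matches the 200.8 ns pipeline time.

```python
# Stage latencies in picoseconds, taken from the "Time to Execute Instructions" slide.
STAGE_PS = {"IF": 200, "REG_READ": 100, "ALU": 200, "MEM": 200, "REG_WRITE": 100}


def single_cycle_time_ps(n_instr):
    """Single-cycle design: the clock must fit the slowest instruction (lw, all five steps)."""
    return n_instr * sum(STAGE_PS.values())


def pipelined_time_ps(n_instr):
    """Pipelined design: the clock is the slowest stage; (stages - 1) extra cycles to fill."""
    cycle = max(STAGE_PS.values())
    return (len(STAGE_PS) + n_instr - 1) * cycle


def speedup(n_instr):
    return single_cycle_time_ps(n_instr) / pipelined_time_ps(n_instr)


if __name__ == "__main__":
    for n in (3, 1000):
        print(n, single_cycle_time_ps(n), pipelined_time_ps(n), round(speedup(n), 2))
```

The limit here is 800/200 = 4 rather than 5 because the stages are not balanced.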

16 Instruction Sets for Pipelining
Pipelining is easier if all instructions are the same length (32 bits). The fewer the instruction formats, the better. Symmetry across instruction formats is preferred: since the location of the register addresses is fixed, these can be retrieved in cycle 2. Restricting memory access to load and store instructions reduces the number of pipeline stages. If other operations involved memory access, additional pipeline stages would be required: address computation and memory access would be needed before execute.

17 Pipelining improves instruction throughput, not instruction latency
Single Cycle, Multiple Cycle, vs. Pipeline. [Figure: clock timing of the three implementations. Single cycle implementation: Load and Store each take one long cycle, with part of the Store cycle wasted. Multiple cycle implementation: Load (Ifetch, Reg, Exec, Mem, Wr), Store (Ifetch, Reg, Exec, Mem), and the start of an R-type, over cycles 1 to 10. Pipeline implementation: Load, Store, and R-type each pass through Ifetch, Reg, Exec, Mem, Wr, overlapped.] Pipelining improves instruction throughput, not instruction latency.

18 REVIEW - Why Pipeline? SPEC 2000 instruction mix
25% lw, 10% sw, 11% branch, 2% jump, 52% ALU. Cycles per instruction: lw (5), sw (4), ALU (4), branch/jump (3). CPI = .25*5 + .1*4 + .13*3 + .52*4 = 4.12. Single clock cycle time = 4.5 ns. Multiple clock cycle time = 1.0 ns per cycle. Pipeline clock cycle time = ? Suppose we execute 100 instructions. Single cycle machine: 4.5 ns/cycle x 1 CPI x 100 inst = 450 ns. Multiple cycles per instruction machine: 1.0 ns/cycle x 4.12 CPI x 100 inst = 412 ns. Ideal pipelined machine: 1.0 ns/cycle x (4 cycle drain + 1 cycle per inst x 100 inst) = 104 ns, so CPI ~ 1.
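The comparison of the three machines can be written out as a small calculation. This is a sketch under stated assumptions: branches and jumps are grouped as 13% of instructions at 3 cycles each (which reproduces the CPI of 4.12), the pipelined machine is ideal (no hazards), and all names are illustrative.

```python
# Instruction mix: name -> (fraction of instructions, cycles on the multicycle machine).
MIX = {
    "lw": (0.25, 5),
    "sw": (0.10, 4),
    "alu": (0.52, 4),
    "branch_jump": (0.13, 3),  # 11% branch + 2% jump, grouped (assumption)
}


def average_cpi(mix):
    """Weighted average of cycles per instruction over the mix."""
    return sum(fraction * cycles for fraction, cycles in mix.values())


def single_cycle_ns(n_instr, cycle_ns=4.5):
    return n_instr * 1 * cycle_ns


def multicycle_ns(n_instr, cycle_ns=1.0):
    return n_instr * average_cpi(MIX) * cycle_ns


def ideal_pipeline_ns(n_instr, cycle_ns=1.0, stages=5):
    # (stages - 1) extra cycles, then one instruction completes per cycle.
    return (stages - 1 + n_instr) * cycle_ns


if __name__ == "__main__":
    print("CPI:", round(average_cpi(MIX), 2))
    print("single cycle:", single_cycle_ns(100), "ns")
    print("multicycle:", round(multicycle_ns(100), 1), "ns")
    print("ideal pipeline:", ideal_pipeline_ns(100), "ns")
```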

19 Pipelining the MIPS ISA
What makes it easy all instructions are the same length (32 bits) easier to fetch in 1st stage and decode in 2nd stage few instruction formats (three) with symmetry across formats can begin reading register file in 2nd stage memory operations can occur only in loads and stores can use the execute stage to calculate memory addresses each MIPS instruction writes at most one result and does so near the end of the pipeline What makes it hard structural hazards: what if we had only one memory? control hazards: what about branches? data hazards: what if an instruction’s input operands depend on the output of a previous instruction? 11/10/2018

20 REVIEW - Can pipelining get us into trouble?
Yes: Pipeline Hazards. Structural hazards: attempt to use the same resource two different ways at the same time, e.g. a single memory is a structural hazard: reading data from memory in the load's cycle 4 while reading an instruction for Instr 3 in its cycle 1. [Figure: Load followed by Instr 1 to Instr 4, each passing through Mem, Reg, ALU, Mem, Reg, in instruction order versus time in clock cycles.] Detection is easy in this case! Convention used: right half highlighted means read, left half write. DAP Fa97, © U.CB

21 REVIEW - Can pipelining get us into trouble?
Yes: Pipeline Hazards. Structural hazards: attempt to use the same resource two different ways at the same time, e.g. a single memory is a structural hazard. Data hazards: attempt to use an item before it is ready; an instruction depends on the result of a prior instruction still in the pipeline. Control hazards: attempt to make a decision before the condition is evaluated (branch instructions). Can always resolve hazards by waiting (stall): pipeline control must detect the hazard and take action (or delay action) to resolve it. This reduces the attraction of a pipeline: seek a better solution. DAP Fa97, © U.CB

22 Pipelining – So Far
Pipeline throughput is higher than single and multiple cycle. Time to execute an instruction may be more. What makes it easy: all instructions are the same length; just a few instruction formats; memory operands appear only in loads and stores. What makes it hard? Structural hazards: suppose we had only one memory. Control hazards: need to worry about branch instructions. Data hazards: an instruction depends on a previous instruction. We’ll build a simple pipeline and look at these issues. We’ll talk about modern processors and what really makes it hard: exception handling, trying to improve performance with out-of-order execution, etc.

23 MIPS Pipeline Datapath Modifications
What do we need to add/modify in the MIPS datapath? State registers between each pipeline stage to isolate them. [Figure: MIPS datapath (PC, instruction memory, adders, register file, sign extend 16 to 32, shift left 2, ALU, data memory) with state registers IFetch/Dec, Dec/Exec, Exec/Mem, Mem/WB separating the stages IF: IFetch, ID: Dec, EX: Execute, MEM: MemAccess, WB: WriteBack, driven by the system clock.]

24 Pipelined Datapath Buffer size?
What if we try to execute an R-type instruction (e.g. add)? Note that we are using trial and error to get to the best design. This is a usual part of design. What factors need checking? Different instructions, interaction between instructions. Problems? The writeback register address. [Figure: pipelined datapath with stages IF: IFetch, ID: Dec, EX: Execute, MEM: MemAccess, WB: WriteBack and state registers IFetch/Dec, Dec/Exec, Exec/Mem, Mem/WB, driven by the system clock.]

25 Graphical Representation of Pipeline
Shading is used to emphasize the type of operation. In the figure, shading of the right half of IF emphasizes a memory read, and shading of the left half of WB indicates writing data into the register file. IF = Instruction Fetch. Instruction memory is being read. ID = Instruction Decode. Register file is being read. EX = Execute. ALU is busy. MEM = Memory is not being accessed. WB = Write back. Register file is being written into. The diagram can help with answering questions like: How many cycles does it take to execute this code? What is the ALU doing during cycle 4? Is there a hazard, why does it occur, and how can it be fixed?

26 Why Pipeline? For Performance! Resources are available.
Once the pipeline is full, one instruction is completed every cycle, so CPI = 1. [Figure: Inst 0 to Inst 4, each passing through IM, Reg, ALU, DM, Reg, in instruction order versus time in clock cycles; the initial cycles are marked as the time to fill the pipeline.]

27 A Single Memory Would Be a Structural Hazard
[Figure: lw followed by Inst 1 to Inst 4, each passing through Mem, Reg, ALU, Mem, Reg, versus time in clock cycles; lw is reading data from memory in the same cycle in which a later instruction is being read from memory.] Fix with separate instruction and data memories (I$ and D$).

28 How About Register File Access?
[Figure: add $1, followed by Inst 1, Inst 2, and add $2,$1, each passing through IM, Reg, ALU, DM, Reg, in instruction order versus time in clock cycles.]

29 How About Register File Access?
Fix the register file access hazard by doing reads in the second half of the cycle and writes in the first half. [Figure: add $1, followed by Inst 1, Inst 2, and add $2,$1, versus time in clock cycles; labels mark the clock edge that controls loading of pipeline state registers and the clock edge that controls register writing.]

30 Data Hazard on r1
add r1, r2, r3
sub r4, r1, r3
and r6, r1, r7
or r8, r1, r9
xor r10, r1, r11
DAP Fa97, © U.CB
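A small script can flag which of these instructions read r1 too early. This is a sketch with loudly stated assumptions: a 5-stage pipeline with no forwarding, a register file written in the first half of a cycle and read in the second half (so a consumer three instructions behind the producer is safe), and three-operand instructions only. The parser and function names are hypothetical, and the check does not account for an intervening instruction overwriting the register.

```python
PROGRAM = [
    "add r1, r2, r3",
    "sub r4, r1, r3",
    "and r6, r1, r7",
    "or r8, r1, r9",
    "xor r10, r1, r11",
]


def parse(instr):
    """Split 'op rd, rs, rt' into (op, destination, [sources])."""
    op, rest = instr.split(None, 1)
    regs = [r.strip() for r in rest.split(",")]
    return op, regs[0], regs[1:]


def raw_hazards(program, safe_distance=3):
    """Return (producer index, consumer index, register) for every
    read-after-write pair closer than safe_distance instructions."""
    hazards = []
    for i, instr in enumerate(program):
        _, dest, _ = parse(instr)
        for j in range(i + 1, min(i + safe_distance, len(program))):
            _, _, sources = parse(program[j])
            if dest in sources:
                hazards.append((i, j, dest))
    return hazards


if __name__ == "__main__":
    for producer, consumer, reg in raw_hazards(PROGRAM):
        print(f"{PROGRAM[consumer]!r} reads {reg} before {PROGRAM[producer]!r} writes it")
```

Under these assumptions only sub and and are flagged; or reads r1 in the same cycle in which add writes it, and xor reads it a cycle later.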

31 Example of Data Hazard – Software solution (compiler)
Cause of data hazard: dependence of one instruction on an earlier one. Analysis of such dependences is critical for compilers. Example: add $s0, $t0, $t1 followed by sub $t2, $s0, $t3. add is completed after cycle 5; sub must be delayed so that writeback is completed before the execute stage. Approach based on stalls. [Figure: add passes through IF, ID, EX, MEM, WB; sub is held back by stall cycles before passing through the same stages.]

32 Register Usage Can Cause Data Hazards
Dependencies backward in time cause hazards. [Figure: add $1, followed by sub $4,$1,$5; and $6,$1,$7; or $8,$1,$9; xor $4,$1,$5, each passing through IM, Reg, ALU, DM, Reg, in instruction order.] Read before write data hazard.

34 Loads Can Cause Data Hazards
Dependencies backward in time cause hazards. [Figure: lw $1,4($2) followed by sub $4,$1,$5; and $6,$1,$7; or $8,$1,$9; xor $4,$1,$5, each passing through IM, Reg, ALU, DM, Reg, in instruction order.] Load-use data hazard.

35 One Way to “Fix” a Data Hazard
Can fix a data hazard by waiting (stall), but this impacts CPI. [Figure: add $1, followed by two stall cycles, then sub $4,$1,$5 and and $6,$1,$7, in instruction order.]

36 Another Way to “Fix” a Data Hazard
Fix data hazards by forwarding results as soon as they are available to where they are needed. [Figure: add $1, followed by sub $4,$1,$5; and $6,$1,$7; or $8,$1,$9; xor $4,$1,$5, each passing through IM, Reg, ALU, DM, Reg, in instruction order.]

38 Forwarding with Load-use Data Hazards
[Figure: lw $1,4($2) followed by sub $4,$1,$5; and $6,$1,$7; or $8,$1,$9; xor $4,$1,$5, each passing through IM, Reg, ALU, DM, Reg, in instruction order.]

39 Forwarding with Load-use Data Hazards
[Figure: lw $1,4($2) followed by sub $4,$1,$5; and $6,$1,$7; or $8,$1,$9; xor $4,$1,$5, each passing through IM, Reg, ALU, DM, Reg, in instruction order.] Note that lw is just another example of register usage (beyond ALU ops). Need to stall even with forwarding when the data hazard involves a load: one stall cycle is still needed even with forwarding.
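The load-use case can be detected with a simple check in the decode stage. The slide does not state the condition; the sketch below uses the standard one from the Patterson & Hennessy textbook these slides are adapted from (stall when the instruction in EX is a load whose destination matches a source of the instruction in ID). The function name and argument names are hypothetical, and registers are plain integers.

```python
def must_stall(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """Load-use hazard check.

    id_ex_mem_read: True if the instruction now in EX is a load.
    id_ex_rt:       destination register of that load.
    if_id_rs/rt:    source registers of the instruction now in ID.
    """
    return bool(id_ex_mem_read and id_ex_rt in (if_id_rs, if_id_rt))


if __name__ == "__main__":
    # lw $1,4($2) is in EX; sub $4,$1,$5 is in ID and reads $1 and $5.
    print(must_stall(True, 1, 1, 5))
    # Same consumer behind an ALU instruction: forwarding alone is enough.
    print(must_stall(False, 1, 1, 5))
```

When the check fires, the pipeline inserts one bubble, after which the loaded value can be forwarded to the ALU.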

40 Branch Instructions Cause Control Hazards
Dependencies backward in time cause hazards. [Figure: beq followed by lw, Inst 3, and Inst 4, each passing through IM, Reg, ALU, DM, Reg, in instruction order.]

41 Control Hazard Solutions
Redefine branch behavior (takes place after the next instruction): “delayed branch”. Impact: 0 clock cycles per branch instruction if an instruction can be found to put in the “slot” (~50% of time). As more instructions are launched per clock cycle, this becomes less useful. [Figure: Add, Beq, Misc, and Load in instruction order versus time in clock cycles.] DAP Fa97, © U.CB

42 One Way to “Fix” a Control Hazard
Fix the branch hazard by waiting (stall), but this affects CPI. [Figure: beq followed by three stall cycles, then lw and Inst 3, in instruction order.] Another “solution” is to put in enough extra hardware so that we can test registers, calculate the branch address, and update the PC during the second stage of the pipeline. That would reduce the number of stalls to only one. A third approach is to use prediction to handle branches, e.g., always predict that branches will be untaken. When right, the pipeline proceeds at full speed. When wrong, have to stall (and make sure nothing that shouldn’t have completed, i.e. changed machine state, does so). Will talk about these options in more detail in the lecture after next.

43 Corrected Datapath to Save RegWrite Addr
Need to preserve the destination register address in the pipeline state registers. [Figure: pipelined MIPS datapath with pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB.]

45 Pipeline control
We have 5 stages. What needs to be controlled in each stage? Instruction fetch and PC increment; instruction decode / register fetch; execution; memory stage; write back.

46 MIPS Pipeline Control Path Modifications
All control signals can be determined during Decode and held in the state registers between pipeline stages. [Figure: pipelined MIPS datapath with pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB and a Control unit.]

47 More on Hazards
Structural hazards: the same functional unit has to perform two distinct tasks. If a single memory is used, then in the same time slot it may be accessed for both data and an instruction. Data hazards: based on “interference” between instructions; data required by an instruction has not yet been computed. Two approaches: stall the instruction, or feed forward. Sometimes feed forward is not enough: stall + feedforward.

48 Data Hazards for Branches
If a comparison register is the destination of the preceding ALU instruction or of the 2nd preceding load instruction, 1 stall cycle is needed. [Figure: lw $1, addr; add $4, $5, $6; beq $1, $4, target. lw and add pass through IF, ID, EX, MEM, WB; beq is stalled for one cycle in ID before continuing through ID, EX, MEM, WB.]

49 Data Hazards for Branches
If a comparison register is the destination of the immediately preceding load instruction, 2 stall cycles are needed. [Figure: lw $1, addr; beq $1, $0, target. lw passes through IF, ID, EX, MEM, WB; beq is stalled for two cycles in ID before continuing through ID, EX, MEM, WB.]

50 More on hazards – control hazard
Branch result decides which is the next instruction to execute Two approaches Always stall Predict 11/10/2018

51 Example of Data Hazard (continued)
add $s0, $t0, $t1
sub $t2, $s0, $t3
Feedforward solution.
lw $s0, 20($t1)
sub $t2, $s0, $t3
Does the above feedforward work? Sometimes feedforward is not enough.

52 Data Forwarding (aka Bypassing)
Take the result from the earliest point that it exists in any of the pipeline state registers and forward it to the functional units (e.g., the ALU) that need it that cycle For ALU functional unit: the inputs can come from any pipeline register rather than just from ID/EX by adding multiplexors to the inputs of the ALU connecting the Rd write data in EX/MEM or MEM/WB to either (or both) of the EX’s stage Rs and Rt ALU mux inputs adding the proper control hardware to control the new muxes Other functional units may need similar forwarding logic (e.g., the DM) With forwarding can achieve a CPI of 1 even in the presence of data dependencies 11/10/2018

53 Control Hazard Solutions
Stall: wait until decision is clear Reduce the need for stalls by adding hardware for following in 2nd stage (ID) Compare registers. Calculate branch address. Update PC. One stall is still required. Impact: 2 clock cycles per branch instruction. => slow 11/10/2018

54 Control Hazard Solutions
Predict: guess one direction, then back up if wrong. Predict not taken. Impact: 1 clock cycle per branch instruction if right, 2 if wrong (right ~50% of time). More dynamic scheme: history of 1 branch (~90%).
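The effect of prediction accuracy on the average branch cost is a simple weighted sum. The sketch below uses the costs quoted above (1 cycle when right, 2 when wrong); applying the same costs to the 90% scheme is an assumption, and the function name is illustrative.

```python
def average_branch_cycles(p_correct, cycles_if_right=1, cycles_if_wrong=2):
    """Expected clock cycles per branch for a given prediction accuracy."""
    return p_correct * cycles_if_right + (1 - p_correct) * cycles_if_wrong


if __name__ == "__main__":
    for accuracy in (0.5, 0.9):
        print(f"accuracy {accuracy:.0%}: "
              f"{average_branch_cycles(accuracy):.2f} cycles per branch")
```

Compared with always stalling (2 cycles per branch), even a 50% predictor lowers the average cost, and a 90% predictor brings it close to 1.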

55 Detecting Dependencies
Problem with starting the next instruction before the first is finished: dependencies that “go backward in time” are data hazards. 4 potential hazards – examine each separately. [Figure: pipeline diagram of an instruction sequence in program execution order versus time in clock cycles (CC 1 to CC 9).] EX/MEM.RegRd = ID/EX.RegRs = $2. MEM/WB.RegRd = ID/EX.RegRs = $2. Use ‘temp’ results. No need to wait for Writeback to be completed. Recall: the register is written before being read.

56 Data Hazard Detection Rules
1a. EX/MEM.RegRd = ID/EX.RegRs 1b. EX/MEM.RegRd = ID/EX.RegRt 2a. MEM/WB.RegRd = ID/EX.RegRs 2b. MEM/WB.RegRd = ID/EX.RegRt What if the instruction does not require WB? Potential for unnecessary feed forward Examine the RegWrite signal What if the destination register is $0? Detection rules have to be modified to capture these situations. 11/10/2018

57 Data Forwarding Control Conditions
EX/MEM hazard:
if (EX/MEM.RegWrite and (EX/MEM.RegisterRd != 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRs)) ForwardA = 10
if (EX/MEM.RegWrite and (EX/MEM.RegisterRd != 0) and (EX/MEM.RegisterRd = ID/EX.RegisterRt)) ForwardB = 10
Forwards the result from the previous instr. to either input of the ALU.
MEM/WB hazard:
if (MEM/WB.RegWrite and (MEM/WB.RegisterRd != 0) and (MEM/WB.RegisterRd = ID/EX.RegisterRs)) ForwardA = 01
if (MEM/WB.RegWrite and (MEM/WB.RegisterRd != 0) and (MEM/WB.RegisterRd = ID/EX.RegisterRt)) ForwardB = 01
Forwards the result from the second previous instr. to either input of the ALU.
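These conditions translate almost directly into code. The following is a minimal sketch, not the slide's hardware: the function and argument names are hypothetical, the value "00" for "no forwarding" is an assumption, and the MEM/WB case is evaluated first so that an EX/MEM match (the more recent result) overrides it, a priority rule the conditions above do not spell out.

```python
def forwarding_unit(ex_mem_reg_write, ex_mem_rd,
                    mem_wb_reg_write, mem_wb_rd,
                    id_ex_rs, id_ex_rt):
    """Return (ForwardA, ForwardB) mux selects as bit strings.

    "00": use the register-file value (assumed default),
    "10": forward from EX/MEM, "01": forward from MEM/WB.
    """
    forward_a = forward_b = "00"

    # MEM/WB hazard: result of the second previous instruction.
    if mem_wb_reg_write and mem_wb_rd != 0:
        if mem_wb_rd == id_ex_rs:
            forward_a = "01"
        if mem_wb_rd == id_ex_rt:
            forward_b = "01"

    # EX/MEM hazard: result of the previous instruction; checked last so it wins.
    if ex_mem_reg_write and ex_mem_rd != 0:
        if ex_mem_rd == id_ex_rs:
            forward_a = "10"
        if ex_mem_rd == id_ex_rt:
            forward_b = "10"

    return forward_a, forward_b


if __name__ == "__main__":
    # Previous instruction writes $2; current instruction reads $2 (Rs) and $5 (Rt).
    print(forwarding_unit(True, 2, False, 0, 2, 5))
    # Destination $0 is never forwarded.
    print(forwarding_unit(True, 0, False, 0, 0, 5))
```

The RegWrite and Rd != 0 guards correspond to the two questions raised under the detection rules: instructions that do not write back, and writes to $0.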

58 Impact of Forwarding
FIGURE 4.53 The dependences between the pipeline registers move forward in time, so it is possible to supply the inputs to the ALU needed by the AND instruction and OR instruction by forwarding the results found in the pipeline registers. The values in the pipeline registers show that the desired value is available before it is written into the register file. We assume that the register file forwards values that are read and written during the same clock cycle, so the add does not stall, but the values come from the register file instead of a pipeline register. Register file “forwarding” (that is, the read gets the value of the write in that clock cycle) is why clock cycle 5 shows register $2 having the value 10 at the beginning and −20 at the end of the clock cycle. As in the rest of this section, we handle all forwarding except for the value to be stored by a store instruction. Copyright © 2009 Elsevier, Inc. All rights reserved.

59 Forwarding Unit
FIGURE 4.54 On the top are the ALU and pipeline registers before adding forwarding. On the bottom, the multiplexors have been expanded to add the forwarding paths, and we show the forwarding unit. The new hardware is shown in color. This figure is a stylized drawing, however, leaving out details from the full datapath such as the sign extension hardware. Note that the ID/EX.RegisterRt field is shown twice, once to connect to the mux and once to the forwarding unit, but it is a single signal. As in the earlier discussion, this ignores forwarding of a store value to a store instruction. Also note that this mechanism works for slt instructions as well. Copyright © 2009 Elsevier, Inc. All rights reserved.

60 Forwarding Multiplexors Control
FIGURE 4.55 The control values for the forwarding multiplexors in Figure The signed immediate that is another input to the ALU is described in the Elaboration at the end of this section. Copyright © 2009 Elsevier, Inc. All rights reserved. 11/10/2018

61 Resolving Hazards with Forwarding
FIGURE 4.56 The datapath modified to resolve hazards via forwarding. Compared with the datapath in Figure 4.51, the additions are the multiplexors to the inputs to the ALU. This figure is a more stylized drawing, however, leaving out details from the full datapath, such as the branch hardware and the sign extension hardware. Copyright © 2009 Elsevier, Inc. All rights reserved. 11/10/2018

