1
Major CPU Design Steps
1. Analyze instruction set operations using independent RTN: ISA => RTN => datapath requirements. This provides the required datapath components and how they are connected to meet ISA requirements.
2. Select the required datapath components and connections, and establish the clock methodology (e.g. clock edge-triggered).
3. Assemble a datapath meeting the requirements.
4. Identify and define the function of all control points or signals needed by the datapath. Analyze the implementation of each instruction to determine the setting of the control points that affect its operations and register transfers.
5. Design and assemble the control logic: hard-wired (finite-state machine implementation) or microprogrammed.
Datapath Control (Chapter 5.5)
2
Single Cycle MIPS Datapath: CPI = 1, Long Clock Cycle
T = I x CPI x C
[Figure: single-cycle MIPS datapath (includes ORI, which is not in the book version; jump not included). Shown: PC with a PC+4 adder and a branch-target adder selected by PCSrc, Instruction<31:0>, a register file of 32 32-bit registers (Rw, Ra, Rb, busA, busB, busW, Clk), Extender, ALU with Zero output, Data Memory (Data In, WrEn, Adr), and Main Control with a 2-bit ALUop and the function field. Control signals: RegDst, RegWr, ExtOp, ALUSrc, MemWr, MemtoReg, Branch. Instruction fields: Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>.]
3
Drawbacks of Single-Cycle Processor
- Long cycle time: all instructions must take as much time as the slowest one. The cycle time needed for a load is longer than needed for all other instructions.
- Real memory is not as well-behaved as idealized memory: a data access cannot always complete in one (short) cycle.
- It is impossible to implement complex, variable-length instructions and complex addressing modes (e.g. indirect memory addressing) in a single cycle.
- High and duplicated hardware resource requirements: no hardware functional unit (e.g. an ALU) can be used more than once in a single cycle.
- Cannot pipeline (overlap) the processing of one instruction with the previous instructions (instruction pipelining, Chapter 6).
4
Abstract View of Single Cycle CPU
[Figure: abstract view of the single-cycle CPU. Datapath blocks: PC, Next PC, Instruction Fetch, Register Fetch, Ext, ALU, Mem Access, Data Mem, Result Store (Reg. Wrt). Main Control takes op and fun from the instruction and drives ALUctr, RegDst, ALUSrc, ExtOp, MemRd, MemWr, RegWr, Branch and Jump; the datapath returns Equal. The blocks are annotated with delays of 1 ns or 2 ns each.]
One CPU clock cycle duration: C = 8 ns
One instruction per cycle: CPI = 1
Assuming the following datapath/control hardware component delays:
- Memory units: 2 ns
- ALU and adders: 2 ns
- Register file: 1 ns
- Control unit: < 1 ns
5
Single Cycle Instruction Timing
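[Figure: timing diagram showing which units (PC, Inst Memory, Reg File, mux, ALU, cmp, Data Mem, setup) are used within the cycle by each instruction class: Arithmetic & Logical, Load, Store, Branch. The longest row is the critical path, which determines the CPU clock cycle, C.]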
6
Clock Cycle Time & Critical Path
One CPU clock cycle duration: C = 8 ns here.
Critical path: the slowest path (i.e. the longest delay) between any two storage devices.
Clock cycle time is a function of the critical path, and must be greater than:
Clock-to-Q + Longest Delay Path through the Combinational Logic + Setup + Clock Skew
Assuming the following datapath/control hardware component delays:
- Memory units: 2 ns
- ALU and adders: 2 ns
- Register file: 1 ns
- Control unit: < 1 ns
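As a rough check of the 8 ns figure, the sketch below sums the component delays along each instruction class's path and takes the longest one. The per-class path breakdown and the names `PATHS` and `min_cycle_time_ns` are illustrative assumptions; multiplexor, control, clock-to-Q, setup and skew delays default to zero here.

```python
# Component delays in ns, as assumed on the slide.
MEMORY_NS = 2
ALU_NS = 2
REGFILE_NS = 1

# Units each instruction class passes through in one cycle.
# This breakdown is an illustrative assumption, not copied from the slide.
PATHS = {
    "arith_logic": [MEMORY_NS, REGFILE_NS, ALU_NS, REGFILE_NS],
    "load": [MEMORY_NS, REGFILE_NS, ALU_NS, MEMORY_NS, REGFILE_NS],
    "store": [MEMORY_NS, REGFILE_NS, ALU_NS, MEMORY_NS],
    "branch": [MEMORY_NS, REGFILE_NS, ALU_NS],
}


def min_cycle_time_ns(paths, clk_to_q=0.0, setup=0.0, skew=0.0):
    """Clock-to-Q + longest combinational path + setup + clock skew."""
    longest = max(sum(units) for units in paths.values())
    return clk_to_q + longest + setup + skew


for name, units in PATHS.items():
    print(name, sum(units), "ns")
print("minimum cycle time:", min_cycle_time_ns(PATHS), "ns")
```

Under these assumptions the load path is the longest, so it sets the clock cycle for every instruction.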
7
Reducing Cycle Time: Multi-Cycle Design
Cut the combinational dependency graph by inserting registers/latches. The same work is done in two or more shorter cycles, rather than one long cycle.
[Figure: one long cycle (e.g. CPI = 1): storage element, acyclic combinational logic, storage element. Two shorter cycles (e.g. CPI = 2): storage element, acyclic combinational logic (A) in cycle 1, storage element, acyclic combinational logic (B) in cycle 2, storage element.]
Place registers to:
- Get a balanced clock cycle length.
- Save any results needed for the remaining cycles.
8
Basic MIPS Instruction Processing Steps
Common steps for all instructions:
- Instruction Fetch: obtain the instruction from program storage (instruction memory): Instruction ← Mem[PC]
- Next Instruction: update the program counter to the address of the next instruction: PC ← PC + 4
Remaining steps:
- Instruction Decode: determine the instruction type and obtain operands from registers (done by the control unit).
- Execute: compute the result value or status.
- Result Store: store the result in a register/memory if needed (usually called Write Back).
9
Partitioning The Single Cycle Datapath
Add registers between steps to break the datapath into cycles:
1. Instruction Fetch Cycle (IF)
2. Instruction Decode Cycle (ID)
3. Execution Cycle (EX)
4. Data Memory Access Cycle (MEM)
5. Write Back Cycle (WB)
[Figure: the single-cycle datapath (PC, Next PC, Instruction Fetch, Reg. File Operand Fetch, Exec, Mem Access, Data Mem, Result Store) cut into these five cycles, with block delays of 1 ns or 2 ns and control signals ExtOp, ALUSrc, ALUctr, MemRd, MemWr, RegDst, RegWr, Branch, Jump.]
Place registers to:
- Get a balanced clock cycle length.
- Save any results needed for the remaining cycles.
10
Example Multi-cycle Datapath
[Figure: multi-cycle datapath with registers IR, A, B, R and M inserted between PC/Next PC, Instruction Fetch, Reg. File, Ext, ALU, Mem Access/Data Mem and the register file write. Control signals: Branch, Jump, MemToReg, MemRd, MemWr, RegDst, RegWr, ExtOp, ALUSrc, ALUctr; Equal goes to the control unit.]
Cycle delays:
1. Instruction Fetch (IF): 2 ns
2. Instruction Decode (ID): 1 ns
3. Execution (EX): 2 ns
4. Memory (MEM): 2 ns
5. Write Back (WB): 1 ns
Registers added (all clock-edge triggered; register write enable control lines not shown):
- IR: instruction register.
- A, B: two registers to hold the operands read from the register file.
- R (or ALUOut): holds the output of the main ALU.
- M (or memory data register, MDR): holds data read from data memory.
CPU clock cycle time: worst cycle delay = C = 2 ns (ignoring MUX and CLK-Q delays).
Thus clock rate: f = 1 / 2 ns = 500 MHz
Assuming the following datapath/control hardware component delays:
- Memory units: 2 ns
- ALU and adders: 2 ns
- Register file: 1 ns
- Control unit: < 1 ns
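The clock-rate arithmetic can be sketched as follows: the cycle must fit the slowest stage, so the clock period is the maximum stage delay. The dictionary and function names are hypothetical; MUX and CLK-Q delays are ignored, as on the slide.

```python
# Stage delays in ns from the slide.
STAGE_DELAY_NS = {"IF": 2, "ID": 1, "EX": 2, "MEM": 2, "WB": 1}


def clock_from_stages(stage_delay_ns):
    """Return (cycle time in ns, clock rate in MHz) for a multi-cycle design."""
    cycle_ns = max(stage_delay_ns.values())  # worst stage sets the clock
    freq_mhz = 1000 / cycle_ns               # 1 / ns = 1000 MHz
    return cycle_ns, freq_mhz


cycle_ns, freq_mhz = clock_from_stages(STAGE_DELAY_NS)
print(cycle_ns, "ns,", freq_mhz, "MHz")

# A 5-cycle load occupies 5 full cycles even though ID and WB need only 1 ns each.
print("load latency:", 5 * cycle_ns, "ns vs", sum(STAGE_DELAY_NS.values()), "ns of actual work")
```

The second print illustrates why unbalanced stages waste time: every stage is charged the full 2 ns cycle.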
11
Operations (Dependent RTN) for Each Cycle
Logic Immediate:
- IF: IR ← Mem[PC]
- ID: A ← R[rs]; B ← R[rt]
- EX: R ← A OR ZeroExt(imm16)
- WB: R[rt] ← R; PC ← PC + 4
R-Type:
- IF: IR ← Mem[PC]
- ID: A ← R[rs]; B ← R[rt]
- EX: R ← A funct B
- WB: R[rd] ← R; PC ← PC + 4
Load:
- IF: IR ← Mem[PC]
- ID: A ← R[rs]; B ← R[rt]
- EX: R ← A + SignExt(imm16)
- MEM: M ← Mem[R]
- WB: R[rt] ← M; PC ← PC + 4
Store:
- IF: IR ← Mem[PC]
- ID: A ← R[rs]; B ← R[rt]
- EX: R ← A + SignExt(imm16)
- MEM: Mem[R] ← B; PC ← PC + 4
Branch:
- IF: IR ← Mem[PC]
- ID: A ← R[rs]; B ← R[rt]
- EX: Zero ← A - B; if Zero = 1: PC ← PC + 4 + (SignExt(imm16) x 4), else (i.e. Zero = 0): PC ← PC + 4
The Instruction Fetch (IF) and Instruction Decode (ID) cycles are common to all instructions.
12
MIPS Multi-Cycle Datapath: Five Cycles of Load
Load: IF ID EX MEM WB
1. Instruction Fetch (IF): fetch the instruction from instruction memory.
2. Instruction Decode (ID): operand register fetch and instruction decode.
3. Execute (EX): calculate the effective memory address.
4. Memory (MEM): read the data from the data memory.
5. Write Back (WB): write the loaded data to the register file. Update PC.
13
Multi-cycle Datapath Instruction CPI
- R-Type/Immediate: require four cycles, CPI = 4 (IF, ID, EX, WB)
- Loads: require five cycles, CPI = 5 (IF, ID, EX, MEM, WB)
- Stores: require four cycles, CPI = 4 (IF, ID, EX, MEM)
- Branches/Jumps: require three cycles, CPI = 3 (IF, ID, EX)
Average or effective program CPI: 3 ≤ CPI ≤ 5, depending on the program profile (instruction mix).
14
Single Cycle Vs. Multi-Cycle CPU
[Figure: clock timing of the two implementations. Single-cycle implementation: one 8 ns cycle per instruction (125 MHz); instructions shorter than a load waste part of the cycle. Multiple-cycle implementation: 2 ns cycles (500 MHz), showing Load (IF ID EX MEM WB), Store and R-type over cycles 1 to 10.]
Single-cycle CPU: CPI = 1, C = 8 ns, f = 125 MHz
One million instructions take I x CPI x C = 10^6 x 1 x 8x10^-9 = 8 msec
Multi-cycle CPU: CPI = 3 to 5, C = 2 ns, f = 500 MHz
One million instructions take from 10^6 x 3 x 2x10^-9 = 6 msec to 10^6 x 5 x 2x10^-9 = 10 msec, depending on the instruction mix used.
Assuming the following datapath/control hardware component delays:
- Memory units: 2 ns
- ALU and adders: 2 ns
- Register file: 1 ns
- Control unit: < 1 ns
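The comparison follows directly from T = I x CPI x C. The sketch below reproduces the slide's numbers and also derives the break-even average CPI; the function names are hypothetical.

```python
def exec_time_ms(instr_count, cpi, cycle_ns):
    """T = I x CPI x C, with C in ns and the result in milliseconds."""
    return instr_count * cpi * cycle_ns / 1e6


I = 1_000_000
single_cycle = exec_time_ms(I, cpi=1, cycle_ns=8)
multi_best = exec_time_ms(I, cpi=3, cycle_ns=2)
multi_worst = exec_time_ms(I, cpi=5, cycle_ns=2)
print(single_cycle, multi_best, multi_worst)

# The multi-cycle CPU wins only while CPI_avg x 2 ns < 1 x 8 ns.
break_even_cpi = 8 / 2
print("multi-cycle is faster when average CPI <", break_even_cpi)
```

With these delays, the multi-cycle design is faster only when the instruction mix keeps the average CPI below 4.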
15
Finite State Machine (FSM) Control Model
Control unit design:
- The state specifies the control points (outputs) for register transfer.
- Control points (outputs) are assumed to depend only on the current state and not on the inputs (i.e. a Moore finite state machine).
- Transfers (register/memory writes) and the state transition occur upon exiting the state, on the falling edge of the clock.
- The state transition depends on the inputs.
[Figure: a control state register (e.g. flip-flops) holds the current state (last state, state X, next state). Next-state logic is driven by the inputs (opcode, conditions) and the current state; output logic produces the outputs (register transfer control points) sent to the datapath.]
16
Control Specification For Multi-cycle CPU Finite State Machine (FSM) - State Transition Diagram
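[Figure: state transition diagram, arranged in rows "instruction fetch", "decode / operand fetch", Execute, Memory, Write-back.]
- "Instruction fetch" (start state): IR ← MEM[PC]
- "Decode / operand fetch": A ← R[rs]; B ← R[rt]
- R-type: Execute: R ← A fun B; Write-back: R[rd] ← R, PC ← PC + 4; then to instruction fetch
- ORi: Execute: R ← A or ZX; Write-back: R[rt] ← R, PC ← PC + 4; then to instruction fetch
- LW: Execute: R ← A + SX; Memory: M ← MEM[R]; Write-back: R[rt] ← M, PC ← PC + 4; then to instruction fetch
- SW: Execute: R ← A + SX; Memory: MEM[R] ← B, PC ← PC + 4; then to instruction fetch
- BEQ & Zero: PC ← PC + 4 + SX || 00; then to instruction fetch
- BEQ & ~Zero: PC ← PC + 4; then to instruction fetch
13 states: 4 state flip-flops needed.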
17
Traditional FSM Controller
[Figure: traditional FSM controller. Inputs: the 6-bit opcode (op) and the Equal condition from the datapath, plus the 4-bit current state held in the state register (4 flip-flops), 11 inputs in total. Next-state logic and output logic implement the state transition table. Outputs: the next state and the control points sent to the datapath.]
18
Traditional FSM Controller
datapath + state diagram => control
- Translate RTN statements into control points.
- Assign states.
- Implement the controller.
More on FSM controller implementation in Appendix C.
19
Mapping RTNs To Control Points Examples & State Assignments
[Figure: the state transition diagram with a 4-bit code assigned to each of the 13 states and example control points per state.]
- 0000 "instruction fetch": IR ← MEM[PC] (control points: imem_rd, IRen)
- 0001 "decode / operand fetch": A ← R[rs]; B ← R[rt] (control points: Aen, Ben)
- R-type: 0100 Execute: R ← A fun B (control points: ALUfun, Sen); 0101 Write-back: R[rd] ← R, PC ← PC + 4 (control points: RegDst, RegWr, PCen)
- ORi: 0110 Execute: R ← A or ZX; 0111 Write-back: R[rt] ← R, PC ← PC + 4
- LW: 1000 Execute: R ← A + SX; 1001 Memory: M ← MEM[R]; 1010 Write-back: R[rt] ← M
- SW: 1011 Execute: R ← A + SX; 1100 Memory: MEM[R] ← B
- BEQ & Zero: 0010: PC ← PC + 4 + SX || 00
- BEQ & ~Zero: 0011
Every final state returns to instruction fetch, state 0000.
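The state assignment above can be read as a next-state function. The following is a minimal sketch, assuming the state codes listed on this slide and assuming that 0010 is the taken-branch state (BEQ & Zero) and 0011 the not-taken one; the names `DISPATCH`, `SEQUENCE`, `next_state` and `cycles` are hypothetical. Only the decode state looks at the inputs; every other state has a single successor.

```python
FETCH, DECODE = "0000", "0001"

# Next state out of decode, chosen by instruction class.
DISPATCH = {"R-type": "0100", "ORi": "0110", "LW": "1000", "SW": "1011"}

# States with exactly one successor.
SEQUENCE = {
    "0000": "0001",
    "0010": "0000", "0011": "0000",                  # BEQ taken / not taken
    "0100": "0101", "0101": "0000",                  # R-type
    "0110": "0111", "0111": "0000",                  # ORi
    "1000": "1001", "1001": "1010", "1010": "0000",  # LW
    "1011": "1100", "1100": "0000",                  # SW
}


def next_state(state, op=None, zero=False):
    """Next-state logic: inputs matter only in the decode state."""
    if state == DECODE:
        if op == "BEQ":
            return "0010" if zero else "0011"
        return DISPATCH[op]
    return SEQUENCE[state]


def cycles(op, zero=False):
    """Count the states visited from fetch until control returns to fetch."""
    state, count = FETCH, 0
    while True:
        count += 1
        state = next_state(state, op, zero)
        if state == FETCH:
            return count


for op in ("R-type", "ORi", "LW", "SW", "BEQ"):
    print(op, cycles(op))
```

Counting the states visited per instruction gives the per-class CPI of the multi-cycle design.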
20
Detailed Control Specification - State Transition Table
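Columns: Current State; Op field; Z; Next State; IR en; PC sel; Ops (A, B); Exec (Ex, Sr, ALU, S); Mem (R, W, M); Write-Back (M-R, Wr, Dst)
IF:
- 0000, op ??????, Z ?
ID:
- 0001, op BEQ
- 0001, op BEQ
- 0001, op R-type, Z x
- 0001, op orI, Z x
- 0001, op LW, Z x
- 0001, op SW, Z x
BEQ (can be combined in one state):
- 0010, op xxxxxx, Z x
- 0011, op xxxxxx, Z x
R:
- 0100, op xxxxxx, Z x: ALU = fun, S = 1
- 0101, op xxxxxx, Z x
ORI:
- 0110, op xxxxxx, Z x: ALU = or, S = 1
- 0111, op xxxxxx, Z x
LW:
- 1000, op xxxxxx, Z x: ALU = add, S = 1
- 1001, op xxxxxx, Z x
- 1010, op xxxxxx, Z x
SW:
- 1011, op xxxxxx, Z x: ALU = add, S = 1
- 1100, op xxxxxx, Z x
More on FSM controller implementation in Appendix C.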
21
Alternative Multiple Cycle Datapath (In Textbook)
Minimizes hardware: 1 memory, 1 ALU
[Figure: textbook multi-cycle datapath. Shown: PC, a single ideal memory (Address, Din, Dout), Instruction Reg (Rs, Rt, Rd, Imm), Mem Data Reg, Reg File (Ra, Rb, Rw, busA, busB, busW), registers A and B, Extend (16 to 32 bits) and << 2, one 32-bit ALU with Zero output, ALU Out, ALU Control, and multiplexors. Control signals: PCWr, PCWrCond, PCSrc, IorD, MemRd, MemWr, IRWr, RegDst, RegWr, MemtoReg, ALUSrcA, ALUSrcB, ALUOp.]
22
Alternative Multiple Cycle Datapath (In Textbook)
- Shared instruction/data memory unit.
- A single ALU shared among instructions.
- Shared units require additional or widened multiplexors.
- Temporary registers hold data between the clock cycles of the instruction. Additional registers: Instruction Register (IR), Memory Data Register (MDR), A, B, ALUOut.
(Figure 5.27 page 322)
23
Alternative Multiple Cycle Datapath With Control Lines (Fig 5.28 In Textbook)
[Figure 5.28, page 323: the multi-cycle datapath with its control lines (ORI not supported, jump supported). Labels: PC, PC + 4, Branch Target, rs, rt, rd, imm16.]
24
The Effect of The 1-bit Control Signals
RegDst
- Deasserted (=0): the register destination number for the write register comes from the rt field (instruction bits 20:16).
- Asserted (=1): the register destination number for the write register comes from the rd field (instruction bits 15:11).
RegWrite
- Deasserted (=0): none.
- Asserted (=1): the register on the write register input is written with the value on the Write data input.
ALUSrcA
- Deasserted (=0): the first ALU operand is the PC.
- Asserted (=1): the first ALU operand is register A (i.e. R[rs]).
MemRead
- Deasserted (=0): none.
- Asserted (=1): the contents of memory specified by the address input are put on the memory data output.
MemWrite
- Deasserted (=0): none.
- Asserted (=1): the memory contents specified by the address input are replaced by the value on the Write data input.
MemtoReg
- Deasserted (=0): the value fed to the register write data input comes from the ALUOut register.
- Asserted (=1): the value fed to the register write data input comes from the memory data register (MDR).
IorD
- Deasserted (=0): the PC is used to supply the address to the memory unit.
- Asserted (=1): the ALUOut register is used to supply the address to the memory unit.
IRWrite
- Deasserted (=0): none.
- Asserted (=1): the output of the memory is written into the Instruction Register (IR).
PCWrite
- Deasserted (=0): none.
- Asserted (=1): the PC is written; the source is controlled by PCSource.
PCWriteCond
- Deasserted (=0): none.
- Asserted (=1): the PC is written if the Zero output of the ALU is also active.
(Figure 5.29 page 324)
25
The Effect of The 2-bit Control Signals
ALUOp
- 00: the ALU performs an add operation.
- 01: the ALU performs a subtract operation.
- 10: the funct field of the instruction determines the ALU operation (R-Type).
ALUSrcB
- 00: the second input of the ALU comes from register B.
- 01: the second input of the ALU is the constant 4.
- 10: the second input of the ALU is the sign-extended 16-bit immediate (imm16) field of the instruction in IR.
- 11: the second input of the ALU is the sign-extended 16-bit immediate field of IR shifted left 2 bits.
PCSource
- 00: the output of the ALU (PC + 4) is sent to the PC for writing.
- 01: the content of ALUOut (the branch target address) is sent to the PC for writing.
- 10: the jump target address (IR[25:0] shifted left 2 bits and concatenated with PC+4[31:28]) is sent to the PC for writing.
(Figure 5.29 page 324)
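The two multiplexor selections described above can be sketched in a few lines. This is an illustrative model only: the function names are hypothetical, and registers are treated as plain Python integers rather than 32-bit hardware values.

```python
def sign_extend16(imm16):
    """Interpret the low 16 bits as a two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def alu_src_b(sel, b, ir):
    """Second ALU operand, selected by the 2-bit ALUSrcB signal."""
    imm = sign_extend16(ir & 0xFFFF)
    return {0b00: b, 0b01: 4, 0b10: imm, 0b11: imm << 2}[sel]


def pc_source(sel, alu_result, alu_out, pc_plus_4, ir):
    """Value written to the PC, selected by the 2-bit PCSource signal."""
    jump = (pc_plus_4 & 0xF0000000) | ((ir & 0x03FFFFFF) << 2)
    return {0b00: alu_result, 0b01: alu_out, 0b10: jump}[sel]


# Branch offset of -1 word becomes -4 bytes once shifted left by 2.
print(alu_src_b(0b11, b=0, ir=0x0000FFFF))
# Jump: upper 4 bits of PC + 4 concatenated with IR[25:0] << 2.
print(hex(pc_source(0b10, 0, 0, pc_plus_4=0x40000004, ir=0x08000010)))
```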
26
Operations (Dependent RTN) for Each Cycle
Common to all instructions:
- IF: IR ← Mem[PC]; PC ← PC + 4
- ID: A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x 4)
R-Type:
- EX: ALUout ← A funct B
- WB: R[rd] ← ALUout
Load:
- EX: ALUout ← A + SignExt(imm16)
- MEM: MDR ← Mem[ALUout]
- WB: R[rt] ← MDR
Store:
- EX: ALUout ← A + SignExt(imm16)
- MEM: Mem[ALUout] ← B
Branch:
- EX: Zero ← A - B; if Zero: PC ← ALUout
Jump:
- EX: PC ← Jump Address
The Instruction Fetch (IF) and Instruction Decode (ID) cycles are common to all instructions.
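To make the register transfers concrete, here is a small trace of a load through its five cycles. It is a sketch under simplifying assumptions: the instruction fields are passed in directly instead of being decoded from IR, memory is a dictionary keyed by byte address, and the function name `execute_load` is hypothetical.

```python
def sign_extend16(imm16):
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def execute_load(regs, mem, pc, rs, rt, imm16):
    """Trace lw rt, imm16(rs); returns (new PC, number of cycles)."""
    # IF: IR <- Mem[PC]; PC <- PC + 4 (fields are given, so only PC changes)
    pc = pc + 4
    # ID: A <- R[rs]; B <- R[rt]; ALUout <- PC + (SignExt(imm16) x 4)
    a, b = regs[rs], regs[rt]
    alu_out = pc + (sign_extend16(imm16) << 2)  # branch target, unused by a load
    # EX: ALUout <- A + SignExt(imm16)
    alu_out = a + sign_extend16(imm16)
    # MEM: MDR <- Mem[ALUout]
    mdr = mem[alu_out]
    # WB: R[rt] <- MDR
    regs[rt] = mdr
    return pc, 5


regs = [0] * 32
regs[8] = 100                 # base address in register 8
mem = {96: 7}                 # word stored at byte address 96
print(execute_load(regs, mem, pc=0, rs=8, rt=9, imm16=0xFFFC), regs[9])
```

Note that the branch target computed in ID is simply overwritten in EX when the instruction turns out not to be a branch.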
27
High-Level View of Finite State Machine Control
[Figure 5.31, page 332: high-level view of the FSM control, with boxes detailed in Figures 5.32, 5.33, 5.34, 5.35 and 5.36.]
- The first steps are independent of the instruction class.
- Then follows a series of sequences that depend on the instruction opcode.
- Then control returns to fetch a new instruction.
Each box in the figure represents one or several states.
28
Instruction Fetch (IF) and Decode (ID) FSM States
IF: IR ← Mem[PC]; PC ← PC + 4
ID: A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x 4)
From ID, control continues to the states of Figures 5.33, 5.34, 5.35 and 5.36.
(Figure 5.32 page 333)
29
Instruction Fetch (IF) Cycle (State 0)
IR ← Mem[PC]
PC ← PC + 4
Control settings: MemRead = 1, ALUSrcA = 0, IorD = 0, IRWrite = 1, ALUSrcB = 01, ALUOp = 00 (add), PCWrite = 1, PCSource = 00
[Figure 5.28, page 323, annotated with these control values (ORI not supported, jump supported).]
30
Instruction Decode (ID) Cycle (State 1)
A ← R[rs]
B ← R[rt]
ALUout ← PC + (SignExt(imm16) x 4) (calculate branch target)
Control settings: ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00 (add)
[Figure 5.28, page 323, annotated with these control values (ORI not supported, jump supported).]
31
Load/Store Instructions FSM States
(From Instruction Decode)
- EX: ALUout ← A + SignExt(imm16)
- MEM (load): MDR ← Mem[ALUout]
- MEM (store): Mem[ALUout] ← B
- WB (load): R[rt] ← MDR
To Instruction Fetch (Figure 5.32)
(Figure 5.33 page 334)
32
Load/Store Execution (EX) Cycle (State 2)
Effective address calculation:
ALUout ← A + SignExt(imm16)
Control settings: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00 (add)
[Figure 5.28, page 323, annotated with these control values (ORI not supported, jump supported).]
33
Load Memory (MEM) Cycle (State 3)
MDR ← Mem[ALUout]
Control settings: MemRead = 1, IorD = 1
[Figure 5.28, page 323, annotated with these control values (ORI not supported, jump supported).]
34
Load Write Back (WB) Cycle (State 4)
R[rt] ← MDR
Control settings: RegWrite = 1, MemtoReg = 1, RegDst = 0
[Figure 5.28, page 323, annotated with these control values (ORI not supported, jump supported).]
35
Store Memory (MEM) Cycle (State 5)
Mem[ALUout] ← B
Control settings: MemWrite = 1, IorD = 1
[Figure 5.28, page 323, annotated with these control values (ORI not supported, jump supported).]
36
R-Type Instructions FSM States
(From Instruction Decode)
- EX: ALUout ← A funct B
- WB: R[rd] ← ALUout
To State 0 (Instruction Fetch) (Figure 5.32)
(Figure 5.34 page 335)
37
R-Type Execution (EX) Cycle (State 6)
ALUout ← A funct B
Control settings: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10 (R-Type)
[Figure 5.28, page 323, annotated with these control values (ORI not supported, jump supported).]
38
R-Type Write Back (WB) Cycle (State 7)
R[rd] ← ALUout
Control settings: RegWrite = 1, MemtoReg = 0, RegDst = 1
[Figure 5.28, page 323, annotated with these control values (ORI not supported, jump supported).]
39
Branch Instruction Single EX State, Jump Instruction Single EX State
Branch (from Instruction Decode):
- EX: Zero ← A - B; if Zero: PC ← ALUout
- To State 0 (Instruction Fetch) (Figure 5.32)
Jump (from Instruction Decode):
- EX: PC ← Jump Address
- To State 0 (Instruction Fetch) (Figure 5.32)
(Figures 5.35, 5.36 page 337)
40
Branch Execution (EX) Cycle (State 8)
Zero ← A - B
If Zero: PC ← ALUout
Control settings: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01 (subtract), PCWriteCond = 1, PCSource = 01
[Figure 5.28, page 323, annotated with these control values (ORI not supported, jump supported).]
41
Jump Execution (EX) Cycle (State 9)
PC ← Jump Address
Control settings: PCWrite = 1, PCSource = 10
[Figure 5.28, page 323, annotated with these control values (ORI not supported, jump supported).]
42
FSM State Transition Diagram (From Book)
- IF: IR ← Mem[PC]; PC ← PC + 4
- ID: A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x 4)
- EX (load/store): ALUout ← A + SignExt(imm16)
- MEM (load): MDR ← Mem[ALUout]
- WB (load): R[rt] ← MDR
- MEM (store): Mem[ALUout] ← B
- EX (R-type): ALUout ← A func B
- WB (R-type): R[rd] ← ALUout
- EX (branch): Zero ← A - B; if Zero: PC ← ALUout
- EX (jump): PC ← Jump Address
Total: 10 states.
(Figure 5.38 page 339)
More on FSM controller implementation in Appendix C.
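Because the controller is a Moore machine, its output logic amounts to a lookup from state number to control settings. The sketch below collects the settings for states 0 to 9 using the state numbers from the slide titles; where a value is not stated explicitly it is inferred from the control-signal effect tables, so treat the table as an assumption. The names `STATE_OUTPUTS` and `control_word` are hypothetical.

```python
# Control settings per state (inferred where the slides do not state a value).
STATE_OUTPUTS = {
    0: dict(MemRead=1, ALUSrcA=0, IorD=0, IRWrite=1, ALUSrcB=0b01,
            ALUOp=0b00, PCWrite=1, PCSource=0b00),             # IF
    1: dict(ALUSrcA=0, ALUSrcB=0b11, ALUOp=0b00),              # ID
    2: dict(ALUSrcA=1, ALUSrcB=0b10, ALUOp=0b00),              # load/store EX
    3: dict(MemRead=1, IorD=1),                                # load MEM
    4: dict(RegWrite=1, MemtoReg=1, RegDst=0),                 # load WB
    5: dict(MemWrite=1, IorD=1),                               # store MEM
    6: dict(ALUSrcA=1, ALUSrcB=0b00, ALUOp=0b10),              # R-type EX
    7: dict(RegWrite=1, MemtoReg=0, RegDst=1),                 # R-type WB
    8: dict(ALUSrcA=1, ALUSrcB=0b00, ALUOp=0b01,
            PCWriteCond=1, PCSource=0b01),                     # branch EX
    9: dict(PCWrite=1, PCSource=0b10),                         # jump EX
}

# Signals that change machine state must default to 0 when not listed.
ENABLES = ("MemRead", "MemWrite", "IRWrite", "RegWrite", "PCWrite", "PCWriteCond")


def control_word(state):
    """Output logic: control settings depend only on the current state."""
    word = {name: 0 for name in ENABLES}
    word.update(STATE_OUTPUTS[state])
    return word


print(control_word(3))
```

Mux selects that are absent from a state's entry are don't-cares in that state; only the enable signals need a defined default.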
43
MIPS Multi-cycle Datapath Performance Evaluation
What is the average CPI?
- The state diagram gives the CPI for each instruction type.
- The workload (program) below gives the frequency of each type.
Type: CPIi for type, Frequency, CPIi x freqIi
- Arith/Logic: 4, 40%, 1.6
- Load: 5, 30%, 1.5
- Store: 4, 10%, 0.4
- Branch: 3, 20%, 0.6
Average CPI: 4.1
Better than CPI = 5, which would result if all instructions took the same number of clock cycles (5).
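The average CPI is a frequency-weighted sum. In the sketch below, the per-type cycle counts (4, 5, 4, 3) come from the state diagram, and the frequencies (40%, 30%, 10%, 20%) are inferred so that each product matches the surviving CPIi x freqIi column (1.6, 1.5, 0.4, 0.6); treat them as an assumption. The names are hypothetical.

```python
# (cycles per instruction, frequency in percent); frequencies are inferred.
WORKLOAD = {
    "arith_logic": (4, 40),
    "load": (5, 30),
    "store": (4, 10),
    "branch": (3, 20),
}


def average_cpi(workload):
    """Sum of CPI_i x frequency_i over all instruction types."""
    assert sum(freq for _, freq in workload.values()) == 100
    return sum(cpi * freq for cpi, freq in workload.values()) / 100


for name, (cpi, freq) in WORKLOAD.items():
    print(name, cpi * freq / 100)
print("average CPI:", average_cpi(WORKLOAD))
```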
44
Adding Support for swap to Multi Cycle Datapath (For More Practice Exercise 5.42)
You are to add support for a new instruction, swap, that exchanges the values of two registers, to the MIPS multicycle datapath of Figure 5.28 on page 323:
swap $rs, $rt
Swap uses the R-Type format with the value of field rs = the value of field rd.
- Add any necessary datapaths and control signals to the multicycle datapath. Find a solution that minimizes the number of clock cycles required for the new instruction without modifying the register file (i.e. no additional register write ports). Justify the need for the modifications, if any.
- Show the necessary modifications to the multicycle control finite state machine of Figure 5.38 on page 339 when adding the swap instruction. For each new state added, provide the dependent RTN and active control signal values.
45
Adding swap Instruction Support to Multi Cycle Datapath
swap $rs, $rt
R[rt] ← R[rs]
R[rs] ← R[rt]
We assume here that rs = rd in the instruction encoding.
Instruction fields: op [31-26], rs [25-21], rt [20-16], rd [15-11]
[Figure: the multicycle datapath with the outputs of A (R[rs]) and B (R[rt]) added as inputs to the register write data multiplexor.]
The outputs of A and B should be connected to the multiplexor controlled by MemtoReg. Each of the two fields rs and rd contains the name of one of the registers being swapped; the other register is specified by rt. The MemtoReg control signal becomes two bits.
(For More Practice Exercise 5.42)
46
Adding swap Instruction Support to Multi Cycle Datapath
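[Figure: the FSM state transition diagram extended with two new states for swap.]
- IF: IR ← Mem[PC]; PC ← PC + 4
- ID: A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x 4)
- Existing states (unchanged): EX: ALUout ← A + SignExt(imm16); EX: ALUout ← A func B; EX: Zero ← A - B, if Zero: PC ← ALUout; MEM; WB: R[rd] ← ALUout; WB
- New state WB1: R[rd] ← B
- New state WB2: R[rt] ← A
Swap takes 4 cycles.
(For More Practice Exercise 5.42)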
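The reason two write-back states suffice is that A and B latch both old values during ID, so the first write cannot destroy the second operand. A minimal sketch of the register transfers, assuming rd = rs as in the encoding above (the function name `swap` is hypothetical):

```python
def swap(regs, rs, rt):
    """Register-transfer view of swap $rs, $rt; returns the cycle count."""
    rd = rs                      # encoding assumption: rd holds the same number as rs
    cycles = 1                   # IF: IR <- Mem[PC]; PC <- PC + 4
    a, b = regs[rs], regs[rt]    # ID: A <- R[rs]; B <- R[rt]
    cycles += 1
    regs[rd] = b                 # WB1: R[rd] <- B
    cycles += 1
    regs[rt] = a                 # WB2: R[rt] <- A (A still holds the old R[rs])
    cycles += 1
    return cycles


regs = [0] * 32
regs[4], regs[5] = 11, 22
print(swap(regs, rs=4, rt=5), regs[4], regs[5])
```

One register write per cycle is enough, so the register file needs no extra write port.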
47
Adding Support for add3 to Multi Cycle Datapath (For More Practice Exercise 5.45)
You are to add support for a new instruction, add3, that adds the values of three registers, to the MIPS multicycle datapath of Figure 5.28. For example:
add3 $s0, $s1, $s2, $s3
Register $s0 gets the sum of $s1, $s2 and $s3.
The instruction encoding uses a modified R-format, with an additional register specifier rx replacing the five low bits of the "funct" field:
- OP: 6 bits [31-26] (add3)
- rs: 5 bits [25-21] ($s1)
- rt: 5 bits [20-16] ($s2)
- rd: 5 bits [15-11] ($s0)
- Not used: [10-5]
- rx: 5 bits [4-0] ($s3)
- Add the necessary datapath components, connections, and control signals to the multicycle datapath without modifying the register bank or adding additional ALUs. Find a solution that minimizes the number of clock cycles required for the new instruction. Justify the need for the modifications, if any.
- Show the necessary modifications to the multicycle control finite state machine of Figure 5.38 on page 339 when adding the add3 instruction. For each new state added, provide the dependent RTN and active control signal values.
48
Exercise 5.45: add3 instruction support to Multi Cycle Datapath
add3 $rd, $rs, $rt, $rx
R[rd] ← R[rs] + R[rt] + R[rx]
rx is a new register specifier in field [4-0] of the instruction. No additional register read ports or ALUs are allowed.
Modified R-format: op [31-26], rs [25-21], rt [20-16], rd [15-11], rx [4-0]
[Figure: the multicycle datapath with the modifications below marked: rx field, WriteB control line, widened multiplexors.]
1. ALUout is added as an extra input to the first ALU operand MUX, so the previous ALU result can be used as an input for the second addition.
2. A multiplexor is added to select between rt and the new field rx (bits 4-0 of the instruction), which contains the register number of the third operand, as the input to Read Register 2. This multiplexor is controlled by a new one-bit control signal called ReadSrc.
3. A WriteB control line is added to enable writing R[rx] to B.
49
Exercise 5.45: add3 instruction support to Multi Cycle Datapath
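[Figure: the FSM state transition diagram extended with two new states for add3.]
- IF: IR ← Mem[PC]; PC ← PC + 4
- ID: A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x 4) (WriteB asserted)
- Existing states (unchanged): EX: ALUout ← A + SignExt(imm16); EX: ALUout ← A func B; EX: Zero ← A - B, if Zero: PC ← ALUout; MEM; WB
- New state EX1: ALUout ← A + B; B ← R[rx] (WriteB asserted)
- New state EX2: ALUout ← ALUout + B
- WB: R[rd] ← ALUout
Add3 takes 5 cycles.
(For More Practice Exercise 5.45)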
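The key point in EX1 is that both transfers happen on the same clock edge: the adder sees the old B while B is being reloaded with R[rx]. The sketch below models that with a simultaneous assignment; it is an illustration only, with a hypothetical function name and results wrapped to 32 bits.

```python
MASK32 = 0xFFFFFFFF


def add3(regs, rd, rs, rt, rx):
    """Register-transfer view of add3 $rd, $rs, $rt, $rx; returns the cycle count."""
    cycles = 1                                      # IF
    a, b = regs[rs], regs[rt]                       # ID: A <- R[rs]; B <- R[rt]
    cycles += 1
    # EX1: ALUout <- A + B and B <- R[rx] occur together, so A + B uses the old B.
    alu_out, b = (a + b) & MASK32, regs[rx]
    cycles += 1
    alu_out = (alu_out + b) & MASK32                # EX2: ALUout <- ALUout + B
    cycles += 1
    regs[rd] = alu_out                              # WB: R[rd] <- ALUout
    cycles += 1
    return cycles


regs = [0] * 32
regs[17], regs[18], regs[19] = 1, 2, 3              # $s1, $s2, $s3
print(add3(regs, rd=16, rs=17, rt=18, rx=19), regs[16])
```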