Published by Gillian Atkins. Modified over 9 years ago.
CPE232 Basic MIPS Architecture
Computer Organization: Multi-cycle Approach
Dr. Iyad Jafar
Adapted from Dr. Gheith Abandah's slides: http://www.abandah.com/gheith/Courses/CPE335_S08/index.html
Multicycle Datapath Approach
Let an instruction take more than one clock cycle to complete.
- Break up instructions into steps, where
  - each step takes one cycle, while trying to balance the amount of work done in each step
  - each cycle is restricted to use only one major functional unit, unless units are used in parallel
- Not every instruction takes the same number of clock cycles
In addition to faster clock rates, a multicycle design allows a functional unit to be used more than once per instruction, as long as it is used on different clock cycles. As a result:
- Only one memory is needed, but only one memory access is possible per cycle
- Only one ALU/adder is needed, but only one ALU operation is possible per cycle
Multicycle Datapath Approach, cont'd
At the end of a cycle:
- Values needed in a later cycle by the current instruction are stored in internal registers (A, B, IR, MDR, and ALUOut). These registers are invisible to the programmer.
- All of these registers except IR hold data only between a pair of adjacent clock cycles, so they do not need a write control signal.
- Data used by subsequent instructions are stored in programmer-visible state (i.e., the register file, the PC, or memory).
Internal registers:
- IR: Instruction Register
- MDR: Memory Data Register
- A, B: register file read data registers
- ALUOut: ALU output register
[Figure: multicycle datapath with PC, Memory, IR, MDR, Register File, A, B, ALU, and ALUOut]
Multicycle Datapath Approach, cont'd
- As in the single-cycle design, shared functional units need multiplexers at their inputs.
- There is only one ALU/adder. It is used to update the PC, perform ALU operations, compare operands for beq, compute memory addresses, and compute branch addresses.
Multicycle Datapath Approach: Control Signals
The Multicycle Datapath with Control Signals
[Figure: multicycle datapath with control signals. Components: PC, Memory, IR, MDR, Register File, A, B, Sign Extend, two Shift left 2 units, ALU, ALU control, ALUOut, and Control. Control signals: PCWrite, PCWriteCond, IorD, MemRead, MemWrite, IRWrite, MemtoReg, RegDst, RegWrite, ALUSrcA, ALUSrcB, ALUOp, and PCSource. Instruction fields used: Instr[31-26], Instr[25-0], Instr[15-0], and Instr[5-0]; PC[31-28] supplies the upper bits of the jump address.]
Multicycle Machine: 1-bit Control Signals
Signal | Effect when deasserted | Effect when asserted
RegDst | The destination register number comes from the rt field | The destination register number comes from the rd field
RegWrite | None | Write is enabled to the selected destination register
ALUSrcA | The first ALU operand is the PC | The first ALU operand is register A
MemRead | None | The content of the memory address is placed on Memory data out
MemWrite | None | The memory location specified by the address is replaced by the value on the Write data input
MemtoReg | The value fed to the register file is from ALUOut | The value fed to the register file is from memory
IorD | The PC is used as the address to the memory unit | ALUOut is used to supply the address to the memory unit
IRWrite | None | The output of memory is written into IR
PCWrite | None | The PC is written; the source is controlled by PCSource
PCWriteCond | None | The PC is written if the Zero output from the ALU is also active
Multicycle Machine: 2-bit Control Signals
Signal | Value | Effect
ALUOp | 00 | The ALU performs an add operation
ALUOp | 01 | The ALU performs a subtract operation
ALUOp | 10 | The funct field of the instruction determines the ALU operation
ALUSrcB | 00 | The second input to the ALU comes from register B
ALUSrcB | 01 | The second input to the ALU is 4 (to increment the PC)
ALUSrcB | 10 | The second input to the ALU is the sign-extended lower 16 bits of the IR (the offset)
ALUSrcB | 11 | The second input to the ALU is the sign-extended lower 16 bits of the IR, shifted left by two bits
PCSource | 00 | The output of the ALU (PC + 4) is sent to the PC for writing
PCSource | 01 | The content of ALUOut (the branch address) is sent to the PC for writing
PCSource | 10 | The jump address is sent to the PC for writing
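The ALUSrcA and ALUSrcB encodings in the two tables can be read as a pair of multiplexers in front of the ALU. The following is a minimal Python sketch of that selection, not part of the original slides; the function and parameter names (`alu_operands`, `reg_a`, `imm16`, and so on) are hypothetical, and values are modeled as unsigned 32-bit words.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(value):
    """Sign-extend a 16-bit field to an unsigned 32-bit two's-complement word."""
    value &= 0xFFFF
    return (value | 0xFFFF0000) if value & 0x8000 else value

def alu_operands(alu_src_a, alu_src_b, pc, reg_a, reg_b, imm16):
    """Model the two ALU input multiplexers (names are illustrative assumptions)."""
    # ALUSrcA: 0 selects the PC, 1 selects register A.
    first = reg_a if alu_src_a else pc
    ext = sign_extend16(imm16)
    # ALUSrcB: 00 register B, 01 constant 4, 10 offset, 11 offset shifted left by 2.
    second = {
        0b00: reg_b,
        0b01: 4,
        0b10: ext,
        0b11: (ext << 2) & MASK32,
    }[alu_src_b]
    return first & MASK32, second & MASK32
```

For instance, `alu_operands(0, 0b01, pc, a, b, imm)` selects the PC and the constant 4, which is the combination used to increment the PC.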
Breaking Instruction Execution into Clock Cycles
Cycle | 1 | 2 | 3 | 4 | 5
Step | IFetch | Dec | Exec | Mem | WB
1. IFetch: instruction fetch and PC update (same for all instructions)
- Operations
  1.1 Instruction fetch: IR <= Memory[PC]
  1.2 Update PC: PC <= PC + 4
- Control signal values
  - IorD = 0, MemRead = 1, IRWrite = 1
  - ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCWrite = 1
  - PCSource = 00
Breaking Instruction Execution into Clock Cycles
2. Decode: instruction decode and register fetch (same for all instructions)
The instruction is not yet known, so only harmless operations are performed.
- Operations
  2.1 Read the two source registers rs and rt into registers A and B, respectively:
      A <= Reg[IR[25:21]]
      B <= Reg[IR[20:16]]
  2.2 Compute the branch address:
      ALUOut <= PC + (sign-extend(IR[15:0]) << 2)
- Control signal values
  - ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00
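The register transfers of the decode step can be illustrated with a few lines of Python. This is a sketch under stated assumptions rather than the slides' own material: `decode_step` and its parameters are hypothetical names, `regs` is assumed to be a list of 32 integers, and `pc` is assumed to hold the value already incremented by 4 in the fetch step.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def decode_step(ir, pc, regs):
    """Return (A, B, ALUOut) as latched at the end of the decode cycle."""
    rs = (ir >> 21) & 0x1F          # IR[25:21]
    rt = (ir >> 16) & 0x1F          # IR[20:16]
    a = regs[rs]                    # A <= Reg[IR[25:21]]
    b = regs[rt]                    # B <= Reg[IR[20:16]]
    # ALUOut <= PC + (sign-extend(IR[15:0]) << 2), wrapped to 32 bits
    alu_out = (pc + (sign_extend16(ir) << 2)) & 0xFFFFFFFF
    return a, b, alu_out

# Example: beq $8, $9 with a word offset of 3, encoded as 0x11090003.
regs = [0] * 32
regs[8], regs[9] = 5, 7
a, b, target = decode_step(0x11090003, 0x00400004, regs)
```

The branch address is computed speculatively for every instruction; it is only used in the next cycle if the instruction turns out to be a branch.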
Breaking Instruction Execution into Clock Cycles
3. Execution, memory address computation, or branch completion
The operation in this cycle depends on the instruction type.
- Operations
  - If memory reference, compute the address:
      ALUOut <= A + sign-extend(IR[15:0])
      ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
  - If arithmetic-logic instruction, perform the operation:
      ALUOut <= A op B
      ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10
Breaking Instruction Execution into Clock Cycles
3. Execution, memory address computation, or branch completion (continued)
The operation in this cycle depends on the instruction type.
- Operations
  - If branch instruction:
      if (A == B) PC <= ALUOut
      ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCWriteCond = 1, PCSource = 01
  - If jump instruction:
      PC <= {PC[31:28], IR[25:0], 2'b00}
      PCSource = 10, PCWrite = 1
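The branch and jump completions amount to choosing the next PC value. Here is a short Python sketch of both transfers; the function names are hypothetical, and the Zero output of the ALU is modeled by a 32-bit subtraction of A and B, matching ALUOp = 01.

```python
MASK32 = 0xFFFFFFFF

def branch_completion(pc, alu_out, reg_a, reg_b):
    """if (A == B) PC <= ALUOut; otherwise the PC keeps its value (PC + 4)."""
    zero = ((reg_a - reg_b) & MASK32) == 0   # ALU subtracts; Zero flags equality
    return alu_out if zero else pc

def jump_completion(pc, ir):
    """PC <= {PC[31:28], IR[25:0], 2'b00}."""
    upper = pc & 0xF0000000                  # PC[31:28]
    lower = (ir & 0x03FFFFFF) << 2           # IR[25:0] followed by two zero bits
    return upper | lower

# Example: j with a 26-bit target field of 0x0100000, encoded as 0x08100000.
new_pc = jump_completion(0x00400004, 0x08100000)
```

Because the low 28 bits come from the instruction and the top 4 bits from the PC, a jump can only reach addresses within the current 256 MB region.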
Breaking Instruction Execution into Clock Cycles
4. Memory access or R-type completion
The operation in this cycle depends on the instruction type.
- Operations
  - If load instruction, read the value from memory into MDR:
      MDR <= Memory[ALUOut]
      MemRead = 1, IorD = 1
  - If store instruction, store rt into memory:
      Memory[ALUOut] <= B
      MemWrite = 1, IorD = 1
  - If arithmetic-logical instruction, write the ALU result into rd:
      Reg[IR[15:11]] <= ALUOut
      MemtoReg = 0, RegDst = 1, RegWrite = 1
Breaking Instruction Execution into Clock Cycles
5. Memory read completion
Needed for the load instruction only.
- Operations
  5.1 Store the loaded value in MDR into rt:
      Reg[IR[20:16]] <= MDR
      RegWrite = 1, MemtoReg = 1, RegDst = 0
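To tie the five steps together, the sketch below walks a single lw through all of them in Python. It is an illustration under simplifying assumptions, not the slides' own model: `run_lw` is a hypothetical name, memory is a dictionary keyed by byte address holding whole words, and the speculative branch address computed in step 2 is omitted because lw never uses it.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def run_lw(pc, regs, memory):
    """Execute one lw at address pc, cycle by cycle; return the updated PC."""
    # Cycle 1 (IFetch): IR <= Memory[PC]; PC <= PC + 4
    ir = memory[pc]
    pc = (pc + 4) & 0xFFFFFFFF
    # Cycle 2 (Decode): A <= Reg[IR[25:21]]
    a = regs[(ir >> 21) & 0x1F]
    # Cycle 3 (Exec): ALUOut <= A + sign-extend(IR[15:0])
    alu_out = (a + sign_extend16(ir)) & 0xFFFFFFFF
    # Cycle 4 (Mem): MDR <= Memory[ALUOut]
    mdr = memory[alu_out]
    # Cycle 5 (WB): Reg[IR[20:16]] <= MDR
    regs[(ir >> 16) & 0x1F] = mdr
    return pc

# Example: lw $9, 8($8), encoded as 0x8D090008, with $8 = 0x1000.
regs = [0] * 32
regs[8] = 0x1000
memory = {0x00400000: 0x8D090008, 0x00001008: 42}
next_pc = run_lw(0x00400000, regs, memory)
```

Each local variable (`ir`, `a`, `alu_out`, `mdr`) plays the role of one internal register that carries a value from one cycle to the next.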
Breaking Instruction Execution into Clock Cycles
In this implementation, not all instructions take 5 cycles.
Instruction Class | Clock Cycles Required
Load | 5
Store | 4
Branch | 3
Arithmetic-logical | 4
Jump | 3
Multicycle Performance
Compute the average CPI of the multicycle implementation for a SPECINT2000 program with the following instruction mix: 25% loads, 10% stores, 11% branches, 2% jumps, 52% ALU. Assume the CPI for each instruction class given in the previous table.
$\mathrm{CPI} = \sum_i \mathrm{CPI}_i \times \mathrm{IC}_i / \mathrm{IC} = 0.25 \times 5 + 0.10 \times 4 + 0.11 \times 3 + 0.02 \times 3 + 0.52 \times 4 = 4.12$
Compare this with CPI = 1 for the single-cycle implementation.
- Assume $CC_M = \frac{1}{5} CC_S$, where $CC_M$ and $CC_S$ are the multicycle and single-cycle clock cycle times
- Then $\frac{\mathrm{Performance}_M}{\mathrm{Performance}_S} = \frac{\mathrm{IC} \times 1 \times CC_S}{\mathrm{IC} \times 4.12 \times \frac{1}{5} CC_S} = \frac{5}{4.12} \approx 1.21$
- Multicycle is also cost-effective in terms of hardware.
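The arithmetic on this slide can be checked with a short Python sketch. The dictionary keys and variable names are illustrative; the instruction mix, the per-class cycle counts, and the 1/5 clock-cycle assumption are taken from the slides.

```python
# Instruction mix (fraction of executed instructions) and cycles per class.
mix = {"load": 0.25, "store": 0.10, "branch": 0.11, "jump": 0.02, "alu": 0.52}
cycles = {"load": 5, "store": 4, "branch": 3, "jump": 3, "alu": 4}

# Average CPI is the mix-weighted sum of the per-class cycle counts.
cpi_multi = sum(mix[c] * cycles[c] for c in mix)

# Execution time = IC x CPI x clock cycle time; IC cancels in the ratio.
cc_single = 1.0                 # single-cycle clock period, arbitrary units
cc_multi = cc_single / 5        # assumption from the slide: CC_M = CC_S / 5
speedup = (1 * cc_single) / (cpi_multi * cc_multi)

print(f"CPI = {cpi_multi:.2f}, speedup = {speedup:.2f}")
```

Changing the `mix` values shows how sensitive the result is to the share of loads, the only class that needs all five cycles.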
Multicycle Control Unit
Multicycle datapath control signals are not determined solely by the bits in the instruction.
- E.g., the opcode bits tell what operation the ALU should perform, but not which instruction cycle comes next
- Since the instruction is broken into multiple cycles, the control must know what was done in the previous cycle(s) to determine the current action
A finite state machine (FSM) must be used for control:
- a set of states (the current state is stored in the State Register)
- a next-state function (determined by the current state and the input)
- an output function (determined by the current state and the input)
[Figure: combinational control logic takes the instruction opcode and the State Register output as inputs, and drives the datapath control points and the next state]
The States of the Control Unit
- 10 states are required in the FSM control
- The sequence of states is determined by the five steps of execution and the instruction
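One way to picture the ten states is as a next-state function driven by the opcode. The Python sketch below is an assumption-laden illustration: the state names are hypothetical labels derived from the five execution steps (the slide's state diagram is not reproduced in this transcript), and only the lw, sw, R-type, beq, and j opcodes are handled.

```python
from enum import Enum, auto

class State(Enum):
    FETCH = auto()
    DECODE = auto()
    MEM_ADDR = auto()        # step 3 for lw/sw
    MEM_READ = auto()        # step 4 for lw
    MEM_WRITEBACK = auto()   # step 5 for lw
    MEM_WRITE = auto()       # step 4 for sw
    EXECUTE = auto()         # step 3 for R-type
    RTYPE_DONE = auto()      # step 4 for R-type
    BRANCH_DONE = auto()     # step 3 for beq
    JUMP_DONE = auto()       # step 3 for j

# Standard MIPS opcodes (IR[31:26]).
LW, SW, RTYPE, BEQ, J = 0x23, 0x2B, 0x00, 0x04, 0x02

def next_state(state, opcode):
    """Next-state function: depends on the current state and the opcode."""
    if state is State.FETCH:
        return State.DECODE
    if state is State.DECODE:
        return {LW: State.MEM_ADDR, SW: State.MEM_ADDR, RTYPE: State.EXECUTE,
                BEQ: State.BRANCH_DONE, J: State.JUMP_DONE}[opcode]
    if state is State.MEM_ADDR:
        return State.MEM_READ if opcode == LW else State.MEM_WRITE
    if state is State.MEM_READ:
        return State.MEM_WRITEBACK
    if state is State.EXECUTE:
        return State.RTYPE_DONE
    return State.FETCH       # every completion state returns to fetch

def cycles_for(opcode):
    """Count the states visited before control returns to FETCH."""
    state, count = State.FETCH, 0
    while True:
        count += 1
        state = next_state(state, opcode)
        if state is State.FETCH:
            return count
```

Counting states with `cycles_for` gives 5 for lw, 4 for sw and R-type, and 3 for beq and j, in line with the cycle counts listed for each instruction class.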
The Control Unit
1. Logic gates
- Inputs: present state + opcode, 10 bits
- Outputs: control signals + next state, 20 bits
- Truth table size: 2^10 rows x 20 columns
2. ROM
- Can be used to implement the truth table above (2^10 x 20 bits = 20 Kbit)
- Each location stores the control signal values and the next state
- Each location is addressed by the opcode and the present state value
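The bit counts above follow from the number of states and control signals. The Python sketch below reproduces them; splitting the 20 output bits into 16 control bits and 4 next-state bits is an inference from the signal tables earlier in the deck rather than something the slide states, and the variable names are illustrative.

```python
NUM_STATES = 10
OPCODE_BITS = 6   # MIPS opcode field, Instr[31-26]

one_bit_signals = ["RegDst", "RegWrite", "ALUSrcA", "MemRead", "MemWrite",
                   "MemtoReg", "IorD", "IRWrite", "PCWrite", "PCWriteCond"]
two_bit_signals = ["ALUOp", "ALUSrcB", "PCSource"]

# Bits needed to encode 10 states: the smallest n with 2**n >= 10.
state_bits = (NUM_STATES - 1).bit_length()

address_bits = state_bits + OPCODE_BITS                       # ROM address width
control_bits = len(one_bit_signals) + 2 * len(two_bit_signals)
word_bits = control_bits + state_bits                         # ROM word width

rom_bits = (2 ** address_bits) * word_bits
print(address_bits, word_bits, rom_bits // 1024, "Kbit")
```

Most of the 1024 ROM words are redundant, since only a few opcode and state combinations ever occur, which is part of why a plain ROM is considered wasteful.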
Micro-programmed Control Unit
- A ROM implementation is vulnerable to bugs and is expensive, especially for a complex CPU. Its size increases as the number and complexity of instructions (states) increase.
- Use microprogramming instead:
  - The next state value may not be sequential
  - Generate the next state outside the storage element
  - Each state is a microinstruction, and the signals are specified symbolically
  - Use labels for sequencing
Sequencer
Microprogram
- The microassembler converts the microcode into actual signal values
- The sequencing field is used along with the opcode to determine the next state
Multicycle Advantages & Disadvantages
- Uses the clock cycle efficiently: the clock cycle is timed to accommodate the slowest instruction step
- Multicycle implementations allow functional units to be used more than once per instruction, as long as they are used on different clock cycles
but
- Requires additional internal state registers, more muxes, and more complicated (FSM) control
[Timing diagram: lw occupies cycles 1-5 (IFetch, Dec, Exec, Mem, WB), sw occupies cycles 6-9 (IFetch, Dec, Exec, Mem), and an R-type instruction starts with IFetch in cycle 10]
Single Cycle vs. Multiple Cycle Timing
[Timing diagram. Single cycle implementation: lw takes cycle 1 and sw takes cycle 2, with the unused part of the sw cycle marked as waste. Multiple cycle implementation: lw occupies cycles 1-5 (IFetch, Dec, Exec, Mem, WB), sw occupies cycles 6-9 (IFetch, Dec, Exec, Mem), and an R-type instruction starts with IFetch in cycle 10.]
The multicycle clock cycle is longer than 1/5 of the single-cycle clock cycle because of state register overhead.