5.5 A Multicycle Implementation

A single memory unit is used for both instructions and data. There is a single ALU, rather than an ALU and two adders. One or more registers are added after every major functional unit.

Replacing the ALU and two adders of the single-cycle datapath with a single ALU means that the single ALU must accommodate all the inputs that used to go to the three different units.

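Control signals:
  The programmer-visible state units (PC, memory, register file) and the IR each need a write control signal.
  The memory also needs a read signal.
  ALU control: same as in the single-cycle design.
  Multiplexors: one control line for a two-input multiplexor, two control lines for a four-input multiplexor.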

Three possible sources for the PC:
  PC + 4
  ALUOut: the branch target address for beq
  The jump address for jump (j)
PC write control signals:
  PCWrite: used for PC + 4 and jump
  PCWriteCond: used for beq
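The PC selection and write logic described above can be sketched as a small function. This is only an illustrative model, not the deck's hardware description: the function name `next_pc` and its parameter names are assumptions, and `zero` stands for the ALU's equality result used by beq. The PCSource encodings (00, 01, 10) are the ones listed in the execution steps of this section.

```python
def next_pc(pc, alu_result, alu_out, jump_addr,
            pc_source, pc_write, pc_write_cond, zero):
    """Return the PC value after the clock edge (illustrative model)."""
    # PCSource selects among the three candidate values.
    sources = {
        0b00: alu_result,  # PC + 4, straight from the ALU
        0b01: alu_out,     # branch target held in ALUOut
        0b10: jump_addr,   # jump address
    }
    # PCWrite is unconditional; PCWriteCond writes only when the
    # ALU comparison reports equality (zero is True).
    write_enable = pc_write or (pc_write_cond and zero)
    return sources[pc_source] if write_enable else pc
```

With PCWriteCond asserted and `zero` false, the function returns the old PC, which models a branch that is not taken.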


Breaking the Instruction Execution into Clock Cycles
1. Instruction fetch step
  IR <= Memory[PC];
  PC <= PC + 4;
For IR <= Memory[PC]: MemRead, IRWrite, IorD = 0
For PC <= PC + 4: ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00 (for add), PCSource = 00, PCWrite
The increment of the PC and the instruction memory access can occur in parallel. How?
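One way to see why the two transfers do not conflict: the memory read and the ALU addition use different functional units, and both read the old PC value, with the new values latched together at the clock edge. The sketch below models that behaviour; the `fetch` function and the dictionary representation of state and memory are assumptions made for illustration only.

```python
def fetch(state, memory):
    """Model the instruction fetch step; returns the state after the clock edge."""
    old_pc = state["PC"]
    # Both right-hand sides are evaluated from the OLD PC value.
    ir = memory[old_pc]                    # MemRead, IorD = 0, IRWrite
    pc_plus_4 = (old_pc + 4) & 0xFFFFFFFF  # ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00
    # The two registers are updated together at the end of the cycle.
    return {**state, "IR": ir, "PC": pc_plus_4}
```

Because a new dictionary is returned, the input state is left unchanged, mirroring the fact that register contents only change at the clock edge.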

2. Instruction decode and register fetch step
Perform actions that are either applicable to all instructions or are not harmful:
  A <= Reg[IR[25:21]];
  B <= Reg[IR[20:16]];
  ALUOut <= PC + (sign-extend(IR[15:0]) << 2);
Since A and B are overwritten on every cycle, reading the register file needs no further control signals.
Computing ALUOut requires ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00 (for add). The branch target address will be stored in ALUOut.
The register file access and the computation of the branch target occur in parallel.
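The field extraction and branch-target arithmetic of this step can be sketched in a few lines. The helper names `sign_extend16` and `decode`, and the use of a Python list for the register file, are assumptions for illustration. The `pc` argument is the value already incremented by the fetch step.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def decode(ir, pc, regs):
    """Return (A, B, ALUOut) as produced by the decode step."""
    rs = (ir >> 21) & 0x1F   # IR[25:21]
    rt = (ir >> 16) & 0x1F   # IR[20:16]
    a = regs[rs]
    b = regs[rt]
    # Branch target: PC + (sign-extended offset shifted left by 2), kept to 32 bits.
    alu_out = (pc + (sign_extend16(ir) << 2)) & 0xFFFFFFFF
    return a, b, alu_out
```

The branch target is computed for every instruction, even those that are not branches. This is harmless because ALUOut is simply overwritten in the next step when it is not needed.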

3. Execution, memory address computation, or branch completion
Memory reference:
  ALUOut <= A + sign-extend(IR[15:0]);
  ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00
Arithmetic-logical instruction:
  ALUOut <= A op B;
  ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10
Branch:
  if (A == B) PC <= ALUOut;
  ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01 (for subtraction), PCSource = 01, PCWriteCond
Jump:
  PC <= { PC[31:28], (IR[25:0], 2'b00) };
  PCSource = 10, PCWrite
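The concatenation used for the jump address can be written out with masks and shifts. The function name `jump_target` is an assumption for illustration. As in the step above, `pc` is the already-incremented PC.

```python
def jump_target(pc, ir):
    """Form { PC[31:28], IR[25:0], 2'b00 } as a 32-bit address."""
    upper = pc & 0xF0000000            # PC[31:28] kept in place
    lower = (ir & 0x03FFFFFF) << 2     # IR[25:0] followed by two zero bits
    return upper | lower
```

For example, with `pc = 0x40000004` and `ir = 0x08000010` the 26-bit field is `0x10`, so the result is `0x40000040`.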

4. Memory access or R-type instruction completion step
Memory reference:
  Load: MDR <= Memory[ALUOut]; MemRead, IorD = 1
  Store: Memory[ALUOut] <= B; MemWrite, IorD = 1
Arithmetic-logical instruction (R-type):
  Reg[IR[15:11]] <= ALUOut; RegDst = 1, RegWrite, MemtoReg = 0
5. Memory read completion step
Load:
  Reg[IR[20:16]] <= MDR; MemtoReg = 1, RegWrite, RegDst = 0
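The steps above imply a different number of clock cycles for each instruction class, since each class stops at the step where it completes. The table below is a rough sketch of that mapping. The step labels and the names `STEPS` and `CYCLES` are hypothetical, chosen only for illustration.

```python
# Sequence of steps each instruction class passes through (labels are illustrative).
STEPS = {
    "load":   ["fetch", "decode", "address", "mem_read", "write_back"],
    "store":  ["fetch", "decode", "address", "mem_write"],
    "r_type": ["fetch", "decode", "execute", "r_complete"],
    "branch": ["fetch", "decode", "branch_complete"],
    "jump":   ["fetch", "decode", "jump_complete"],
}

# One clock cycle per step, so the cycle count is the length of the sequence.
CYCLES = {cls: len(steps) for cls, steps in STEPS.items()}
```

Every class shares the first two steps, which is why the decode step can only perform actions that are safe for all instructions.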

Defining the Control
Two different techniques to design the control:
  Finite state machine
  Microprogramming

Example: CPI in a Multicycle CPU
Using the SPECINT2000 instruction mix, which is 25% loads, 10% stores, 11% branches, 2% jumps, and 52% ALU instructions, what is the CPI, assuming that each state in the multicycle CPU requires 1 clock cycle?
Answer: The number of clock cycles for each instruction class is the following:
  Loads: 5
  Stores: 4
  ALU instructions: 4
  Branches: 3
  Jumps: 3

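Example (continued)
The CPI is given by the following:
\[
\mathrm{CPI} = \frac{\text{CPU clock cycles}}{\text{Instruction count}}
= \frac{\sum_i \text{Instruction count}_i \times \mathrm{CPI}_i}{\text{Instruction count}}
= \sum_i \frac{\text{Instruction count}_i}{\text{Instruction count}} \times \mathrm{CPI}_i
\]
The ratio Instruction count_i / Instruction count is simply the instruction frequency for the instruction class i. We can therefore substitute to obtain:
CPI = 0.25 × 5 + 0.10 × 4 + 0.52 × 4 + 0.11 × 3 + 0.02 × 3 = 4.12
This CPI is better than the worst-case CPI of 5.0 when all instructions take the same number of clock cycles.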
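The weighted sum can be checked with a short calculation. This is a minimal sketch using the instruction mix and per-class cycle counts stated in the example. The names `MIX`, `CYCLES_PER_CLASS`, and `cpi` are assumptions for illustration.

```python
# Instruction frequencies from the SPECINT2000 mix given in the example.
MIX = {"load": 0.25, "store": 0.10, "alu": 0.52, "branch": 0.11, "jump": 0.02}

# Clock cycles per instruction class in the multicycle design.
CYCLES_PER_CLASS = {"load": 5, "store": 4, "alu": 4, "branch": 3, "jump": 3}

def cpi(mix, cycles):
    """Average cycles per instruction: sum of frequency times cycles per class."""
    return sum(mix[cls] * cycles[cls] for cls in mix)

print(round(cpi(MIX, CYCLES_PER_CLASS), 2))
```

Changing the mix or the per-class cycle counts shows how sensitive the average is to the most frequent classes. Here ALU instructions and loads contribute 2.08 and 1.25 of the total, respectively.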
