Lecture 9. MIPS Processor Design – Single-Cycle Processor Design Prof. Taeweon Suh Computer Science Education Korea University 2010 R&E Computer System.


Lecture 9. MIPS Processor Design – Single-Cycle Processor Design Prof. Taeweon Suh Computer Science Education Korea University 2010 R&E Computer System Education & Research

Single-Cycle MIPS Processor
Again, the microarchitecture (the CPU implementation) is divided into 2 interacting parts:
- Datapath
- Control

Single-Cycle Processor Design
Let's start with a memory access instruction: lw
- Example: lw $2, 80($0)
STEP 1: Instruction Fetch

Single-Cycle Processor Design
Example: lw $2, 80($0)
STEP 2: Decoding
- Read source operands from the register file

Single-Cycle Processor Design
Example: lw $2, 80($0)
STEP 2: Decoding
- Sign-extend the immediate

module signext(input  [15:0] a,
               output [31:0] y);
  // Replicate the sign bit a[15] into the upper 16 bits
  assign y = {{16{a[15]}}, a};
endmodule

Single-Cycle Processor Design
Example: lw $2, 80($0)
STEP 3: Execution
- Compute the memory address

Single-Cycle Processor Design
Example: lw $2, 80($0)
STEP 4: Execution
- Read data from memory and write it back to the register file

Single-Cycle Processor Design
We are done with lw.
The CPU starts fetching the next instruction from PC+4.

module adder(input  [31:0] a, b,
             output [31:0] y);
  assign y = a + b;
endmodule

adder pcadd1(pc, 32'b100, pcplus4);  // pcplus4 = pc + 4

Single-Cycle Processor Design
Let's consider another memory access instruction: sw
- The sw instruction needs to write data to the data memory
Example: sw $2, 84($0)

Single-Cycle Processor Design
Let's consider the arithmetic and logical instructions: add, sub, and, or
- Write ALUResult to the register file
- Note that R-type instructions write to the rd field of the instruction (instead of rt)
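The rd-versus-rt choice can be sketched as a 2:1 multiplexer on the register file's write address. This is a minimal illustration, not code from the slides: the module name writereg_mux and its port names are assumptions, and the select input is called regdst to match the RegDst control signal in the main decoder table.

```verilog
// Hypothetical helper (names are illustrative, not from the slides):
// selects which instruction field supplies the register-file write address.
module writereg_mux(input  [31:0] instr,     // fetched instruction
                    input         regdst,    // 1: R-type (rd), 0: lw (rt)
                    output [4:0]  writereg); // register-file write address
  assign writereg = regdst ? instr[15:11]    // rd field
                           : instr[20:16];   // rt field
endmodule
```

With regdst = 0 the lw path is unchanged, so adding the multiplexer does not disturb the datapath built so far.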

Single-Cycle Processor Design
Let's consider a branch instruction: beq
- Determine whether the register values are equal
- Calculate the branch target address (BTA) from the sign-extended immediate and PC+4
Example: beq $4, $0, around
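The two beq steps can be sketched as follows. This is an illustration under stated assumptions: the module name branch_logic and the signal names signimm, pcbranch, pcsrc and pcnext are hypothetical, and zero is assumed to be the ALU flag that is set when the subtraction of the two registers yields 0. In MIPS, the immediate counts words, so it is shifted left by 2 before being added to PC+4.

```verilog
// Hypothetical sketch (names are illustrative, not from the slides).
module branch_logic(input  [31:0] pcplus4,  // PC + 4
                    input  [31:0] signimm,  // sign-extended immediate
                    input         branch,   // asserted by control for beq
                    input         zero,     // ALU result == 0 (registers equal)
                    output [31:0] pcnext);  // next PC value
  wire [31:0] pcbranch;
  wire        pcsrc;

  // BTA = PC + 4 + (sign-extended immediate << 2)
  assign pcbranch = pcplus4 + {signimm[29:0], 2'b00};

  // Take the branch only for beq with equal operands
  assign pcsrc  = branch & zero;
  assign pcnext = pcsrc ? pcbranch : pcplus4;
endmodule
```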

Single-Cycle Datapath Example
We are done with the implementation of the basic instructions.
Let's see how the or instruction works in this implementation.

Single-Cycle Processor - Control
As mentioned, the CPU is designed with a datapath and control.
Now, let's delve into the design of the control part.

Control Unit
The opcode and funct fields come from the fetched instruction.

ALU Implementation and Control

F2:0  Function
000   A & B
001   A | B
010   A + B
011   not used
100   A & ~B
101   A | ~B
110   A - B
111   SLT

N = 32 in a 32-bit processor.
slt: set less than
Example: slt $t0, $t1, $t2   // $t0 = 1 if $t1 < $t2

Control Unit: ALU Control

ALUOp1:0  Meaning
00        Add
01        Subtract
10        Look at Funct
11        Not Used

ALUOp1:0  Funct         ALUControl2:0
00        X             010 (Add)
X1        X             110 (Subtract)
1X        100000 (add)  010 (Add)
1X        100010 (sub)  110 (Subtract)
1X        100100 (and)  000 (And)
1X        100101 (or)   001 (Or)
1X        101010 (slt)  111 (SLT)

The encoding is entirely up to the hardware designers, but they should make sure it is reasonable:
- Memory access instructions (lw, sw) need the ALU to calculate the memory target address (addition)
- Branch instructions (beq, bne) need the ALU for the equality check (subtraction)

Control Unit: Main Decoder

Instruction  Op5:0   RegWrite  RegDst  AluSrc  Branch  MemWrite  MemtoReg  ALUOp1:0
R-type       000000  1         1       0       0       0         0         10
lw           100011  1         0       1       0       0         1         00
sw           101011  0         X       1       0       1         X         00
beq          000100  0         X       0       1       0         X         01

ALUOp1:0  Meaning
00        Add
01        Subtract
10        Look at Funct field
11        Not Used

How about Other Instructions?
Now we are done with the design of the control part.
Let's examine whether the design is able to execute other instructions:
- addi
Example: addi $t0, $t1, -14

Control Unit: Main Decoder

Instruction  Op5:0   RegWrite  RegDst  AluSrc  Branch  MemWrite  MemtoReg  ALUOp1:0
R-type       000000  1         1       0       0       0         0         10
lw           100011  1         0       1       0       0         1         00
sw           101011  0         X       1       0       1         X         00
beq          000100  0         X       0       1       0         X         01
addi         001000  1         0       1       0       0         0         00

How about Other Instructions?
OK. So far, so good. How about jump instructions?
- j

How about Other Instructions?
We need to add some hardware to support the j instruction:
- Logic to compute the target address
- A mux and its control signal
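The added hardware can be sketched as follows. This is an illustration only: the module name jump_logic and the signal names pcjump, pcbranchsel and pcnext are assumptions, not taken from the slides. It relies on the standard MIPS definition of the jump target: the upper 4 bits of PC+4, followed by the 26-bit address field of the instruction, followed by two zero bits.

```verilog
// Hypothetical sketch (names are illustrative, not from the slides).
module jump_logic(input  [31:0] pcplus4,     // PC + 4
                  input  [31:0] instr,       // fetched instruction
                  input  [31:0] pcbranchsel, // next PC chosen by the branch mux
                  input         jump,        // asserted by control for j
                  output [31:0] pcnext);     // next PC value
  wire [31:0] pcjump;

  // Jump target = {PC+4[31:28], addr[25:0], 00}
  assign pcjump = {pcplus4[31:28], instr[25:0], 2'b00};

  // The extra mux: jump overrides the sequential/branch next PC
  assign pcnext = jump ? pcjump : pcbranchsel;
endmodule
```

Placing the jump mux after the branch selection keeps the existing PC logic untouched; only one new control output is needed.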

Control Unit: Main Decoder
There is one more output in the main decoder to support the jump instruction: Jump

Instruction  Op5:0   RegWrite  RegDst  AluSrc  Branch  MemWrite  MemtoReg  ALUOp1:0  Jump
R-type       000000  1         1       0       0       0         0         10        0
lw           100011  1         0       1       0       0         1         00        0
sw           101011  0         X       1       0       1         X         00        0
beq          000100  0         X       0       1       0         X         01        0
addi         001000  1         0       1       0       0         0         00        0
j            000010  0         X       X       X       0         X         XX        1

Verilog Code - Main Decoder and ALU Control

module maindec(input  [5:0] op,
               output       memtoreg, memwrite,
               output       branch, alusrc,
               output       regdst, regwrite,
               output       jump,
               output [1:0] aluop);
  reg [8:0] controls;

  assign {regwrite, regdst, alusrc, branch, memwrite,
          memtoreg, jump, aluop} = controls;

  // The 9-bit control words were lost in this transcript. The values
  // below are reconstructed from the standard settings for this design,
  // with don't-care (X) bits set to 0.
  always @(*)
    case(op)
      6'b000000: controls <= 9'b110000010; // R-type
      6'b100011: controls <= 9'b101001000; // lw
      6'b101011: controls <= 9'b001010000; // sw
      6'b000100: controls <= 9'b000100001; // beq
      6'b001000: controls <= 9'b101000000; // addi
      6'b000010: controls <= 9'b000000100; // j
      default:   controls <= 9'bxxxxxxxxx; // ???
    endcase
endmodule

module aludec(input      [5:0] funct,
              input      [1:0] aluop,
              output reg [2:0] alucontrol);
  always @(*)
    case(aluop)
      2'b00: alucontrol <= 3'b010; // add (lw, sw, addi)
      2'b01: alucontrol <= 3'b110; // sub (beq)
      default: case(funct)         // R-type: look at funct
          6'b100000: alucontrol <= 3'b010; // ADD
          6'b100010: alucontrol <= 3'b110; // SUB
          6'b100100: alucontrol <= 3'b000; // AND
          6'b100101: alucontrol <= 3'b001; // OR
          6'b101010: alucontrol <= 3'b111; // SLT
          default:   alucontrol <= 3'bxxx; // ???
        endcase
    endcase
endmodule

Verilog Code: ALU

module alu(input      [31:0] a, b,
           input      [2:0]  alucont,
           output reg [31:0] result,
           output            zero);
  wire [31:0] b2, sum, slt;

  assign b2  = alucont[2] ? ~b : b;      // invert b for A - B and SLT
  assign sum = a + b2 + alucont[2];      // a + ~b + 1 = a - b when alucont[2] = 1
  assign slt = sum[31];                  // sign of a - b, zero-extended to 32 bits

  always @(*)
    case(alucont[1:0])
      2'b00: result <= a & b2;
      2'b01: result <= a | b2;
      2'b10: result <= sum;
      2'b11: result <= slt;
    endcase

  assign zero = (result == 32'b0);
endmodule

F2:0  Function
000   A & B
001   A | B
010   A + B
011   not used
100   A & ~B
101   A | ~B
110   A - B
111   SLT

Single-Cycle Processor Performance
How fast is the single-cycle processor?
The clock cycle time (and therefore the frequency) is limited by the critical path.
- The critical path is the path that takes the longest time
- What do you think the critical path is? It is the path that the lw instruction goes through

Single-Cycle Processor Performance
Single-cycle critical path:
T_c = t_pcq_PC + t_mem + max(t_RFread, t_sext) + t_mux + t_ALU + t_mem + t_mux + t_RFsetup

In most implementations, the limiting paths are the memory (instruction and data), the ALU, and the register file. Thus,
T_c = t_pcq_PC + 2 t_mem + t_RFread + 2 t_mux + t_ALU + t_RFsetup

Element               Parameter
Register clock-to-Q   t_pcq_PC
Multiplexer           t_mux
ALU                   t_ALU
Memory read           t_mem
Register file read    t_RFread
Register file setup   t_RFsetup

Single-Cycle Processor Performance Example

Element               Parameter   Delay (ps)
Register clock-to-Q   t_pcq_PC    30
Multiplexer           t_mux       25
ALU                   t_ALU       200
Memory read           t_mem       250
Register file read    t_RFread    150
Register file setup   t_RFsetup   20

T_c = t_pcq_PC + 2 t_mem + t_RFread + 2 t_mux + t_ALU + t_RFsetup
    = [30 + 2(250) + 150 + 2(25) + 200 + 20] ps
    = 950 ps

f_c = 1/T_c = 1/950 ps ≈ 1.05 GHz

Assuming that the CPU executes 100 billion instructions to run your program, what is the execution time of the program on a single-cycle MIPS processor?

Execution Time = (#instructions)(cycles/instruction)(seconds/cycle)
               = (100 × 10^9)(1)(950 × 10^-12 s)
               = 95 seconds