Published by Bruno Hutchinson. Modified over 8 years ago.
1
Lecture 9. MIPS Processor Design – Instruction Fetch
Prof. Taeweon Suh
Computer Science Education, Korea University
2010 R&E Computer System Education & Research
2
Introduction
- Microarchitecture: how to implement an architecture in hardware
- Multiple implementations can exist for a single architecture
  - Single-cycle: each instruction executes in a single cycle
  - Multicycle: each instruction is broken up into a series of shorter steps (we don’t cover this in this class)
  - Pipeline: each instruction is broken up into a series of steps, and multiple instructions execute simultaneously
3
Processor Performance
- Program execution time:
  Execution Time = (#instructions) × (cycles/instruction) × (seconds/cycle)
- The challenge in designing a microarchitecture is to satisfy constraints of:
  - Cost
  - Power
  - Performance
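As a quick worked example of the execution-time equation above, assume a hypothetical program of 10^9 instructions, a CPI of 1 (as in a single-cycle design), and a 10 ns clock period. These numbers are illustrative only and do not come from the lecture.

```latex
\[
\text{Execution Time}
  = 10^{9}\ \text{instructions}
  \times 1\ \frac{\text{cycle}}{\text{instruction}}
  \times 10 \times 10^{-9}\ \frac{\text{seconds}}{\text{cycle}}
  = 10\ \text{seconds}
\]
```

Halving any one of the three factors halves the execution time, which is why the microarchitecture (cycles/instruction and seconds/cycle) matters as much as the instruction count.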
4
Overview
- In chapter 4, we are going to implement (design) a MIPS CPU
- The implemented CPU should be able to execute the machine code we have discussed so far
- To make the design easier to understand, we simplify the processor system structure
[Figure: a real PC system (CPU, North Bridge, South Bridge, and DDR main memory, connected by the FSB (Front-Side Bus) and DMI (Direct Media I/F)) next to the simplified system (a MIPS CPU connected to a single memory holding instructions and data through an address bus and a data bus)]
5
Our MIPS Model
- Our MIPS CPU model has separate connections to instruction memory and data memory
- This structure is actually more realistic, as we will see in chapter 5
[Figure: MIPS CPU connected to an instruction memory and a data memory, each through its own address bus and data bus]
6
MIPS CPU Processor
- Our MIPS implementation is simplified by implementing only:
  - Memory-reference instructions: lw, sw
  - Arithmetic-logical instructions: add, sub, and, or, slt
  - Control flow instructions: beq, j
- Generic implementation steps
  - Fetch: use the program counter (PC) to supply the instruction address and fetch the instruction from memory (and update the PC)
  - Decode: decode the instruction (and read registers)
  - Execute: execute the instruction
[Figure: MIPS CPU with its Fetch (PC = PC + 4), Decode, and Execute steps; the instruction memory and data memory are each attached through an address bus and a data bus]
7
Instruction Execution in CPU
- Fetch
  - Fetch the instruction by accessing memory with the PC
- Decoding
  - Extract the opcode: determine what operation should be done
  - Extract the operands: register numbers or an immediate from the fetched instruction
  - Read registers from the register file
- Execution
  - Use the ALU to calculate (depending on instruction class):
    - Arithmetic result
    - Memory address for load/store
    - Branch target address
  - Access data memory for load/store
- Next fetch
  - PC ← target address or PC + 4
[Figure: MIPS CPU with its Fetch (PC = PC + 4), Decode, and Execute steps; the instruction memory and data memory are each attached through an address bus and a data bus]
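The "extract opcode" and "extract operands" steps amount to slicing fixed bit fields out of the 32-bit instruction. The following is a minimal sketch of that idea; the module name `instr_fields` and its port names are hypothetical and not part of the lecture's code, while the bit positions follow the standard MIPS R-type and I-type formats.

```verilog
// Hypothetical helper (not from the lecture): slices a 32-bit MIPS
// instruction into its standard fields. Purely combinational.
module instr_fields(input  [31:0] instr,
                    output [5:0]  opcode,  // operation to perform
                    output [4:0]  rs,      // first source register number
                    output [4:0]  rt,      // second source (or I-type destination)
                    output [4:0]  rd,      // R-type destination register number
                    output [5:0]  funct,   // R-type function code
                    output [15:0] imm);    // I-type 16-bit immediate
  assign opcode = instr[31:26];
  assign rs     = instr[25:21];
  assign rt     = instr[20:16];
  assign rd     = instr[15:11];
  assign funct  = instr[5:0];
  assign imm    = instr[15:0];
endmodule
```

For instance, the word 0x20020005 splits into opcode 001000 (addi), rs = 0, rt = 2, and imm = 0x0005, i.e. addi $2, $0, 5.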
8
Revisiting Logic Design Basics
- Combinational logic
  - Output is determined directly by the input
- Sequential logic
  - Output is determined not only by the input, but also by internal state
  - Sequential logic needs state elements to store information
  - Flip-flops and latches are used to store the state information
  - However, avoid using latches in digital design
9
Combinational Logic Examples
- AND gate: Y = A & B
- Multiplexer (mux): Y = S ? I1 : I0
- Adder: Y = A + B
- Arithmetic Logic Unit (ALU): Y = F(A, B)
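The multiplexer and ALU equations above map almost directly onto Verilog. The sketch below is one possible rendering; the module names `mux2` and `alu_sketch` and the 2-bit encoding of `f` are assumptions made for illustration, not the lecture's definitions.

```verilog
// 2-to-1 multiplexer for 32-bit data: y = s ? i1 : i0
module mux2(input  [31:0] i0, i1,
            input         s,
            output [31:0] y);
  assign y = s ? i1 : i0;
endmodule

// Small ALU: y = F(a, b). The 2-bit encoding of f below is an
// assumption made for this sketch, not the lecture's encoding.
module alu_sketch(input      [31:0] a, b,
                  input      [1:0]  f,
                  output reg [31:0] y);
  always @(*) begin
    case (f)
      2'b00:   y = a & b;   // AND
      2'b01:   y = a | b;   // OR
      2'b10:   y = a + b;   // add
      default: y = a - b;   // subtract (f == 2'b11)
    endcase
  end
endmodule
```

Both modules are combinational: the `case` statement covers every value of `f`, so no latch is inferred.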
10
State Element (Register)
- Register (flip-flop): stores data in a circuit
  - The clock signal determines when the stored value is updated
  - Edge-triggered
    - Rising-edge triggered: updates when the clock changes from 0 to 1
    - Falling-edge triggered: updates when the clock changes from 1 to 0
  - The data input (D) determines the value (0 or 1) that appears on the output (Q) after the update
[Figure: flip-flop (register) symbol with inputs D and Clk and output Q]
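A rising-edge-triggered register like the one described above can be sketched behaviorally in a few lines. The module name `flop` and the 32-bit width are choices made for this example.

```verilog
// Rising-edge-triggered 32-bit register: q takes the value of d
// only at the 0-to-1 transition of clk and holds it otherwise.
module flop(input             clk,
            input      [31:0] d,
            output reg [31:0] q);
  always @(posedge clk)
    q <= d;
endmodule
```

Replacing `posedge` with `negedge` would give the falling-edge-triggered version.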
11
State Element (Register)
- Register with write control
  - Updates on a clock edge only when the write control input is 1
[Figure: flip-flop symbol with inputs D, Clk, and Write, and output Q]
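Adding write control to a register is a small change to the behavioral description. This is a minimal sketch; the module name `flopen` and the port name `write` are assumptions for this example.

```verilog
// 32-bit register with write control: on a rising clock edge the
// stored value changes only if write is 1; otherwise q is held.
module flopen(input             clk,
              input             write,
              input      [31:0] d,
              output reg [31:0] q);
  always @(posedge clk)
    if (write) q <= d;
endmodule
```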
12
Clocking Methodology
- Virtually all digital systems are synchronous to a clock
- Combinational logic sits between state elements (registers)
- Combinational logic transforms data during a clock cycle, that is, between clock edges
  - Its input comes from state elements
  - Its output goes to the next state elements
- The longest combinational delay determines the clock period (and therefore the frequency)
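The last point can be written as the usual setup-time constraint for edge-triggered designs. The symbols below are not used in the lecture and are introduced here as assumptions: t_cq is the clock-to-Q delay of the source register, t_comb,max is the longest combinational path delay, and t_setup is the setup time of the destination register. Clock skew is ignored.

```latex
\[
T_{\text{clk}} \;\ge\; t_{cq} + t_{\text{comb,max}} + t_{\text{setup}},
\qquad
f_{\text{max}} = \frac{1}{T_{\text{clk}}}
\]
```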
13
Building a Datapath
- A processor is composed of a datapath and control
  - Datapath: elements that process data and addresses in the CPU
    - Registers, ALUs, muxes, memories, …
  - Control: logic that controls operations
    - When to write to a register
    - What kind of operation the ALU should perform (addition, subtraction, exclusive OR, and so on)
- We will build a MIPS datapath incrementally and provide Verilog code
  - We adopt both structural and behavioral modeling
  - Behavioral modeling describes what a module does
    - For example, the lowest-level modules (such as the ALU and register file) will be designed with behavioral modeling
  - Structural modeling describes a module in terms of simpler modules via instantiations
    - For example, the top module (such as MIPS_CPU) will be designed with structural modeling
14
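Overview of CPU Design
[Figure: the MIPS CPU connected to the instruction memory and the data memory, each through its own address bus and data bus, annotated with the Verilog source files]
- mips_tb.v: testbench; supplies clock and reset
- mips_cpu_mem.v: the MIPS CPU together with its memories
- mips_cpu.v: the MIPS CPU (fetch and PC, decoding, register file, ALU, memory access)
- imem.v: instruction memory; holds the binary (machine code); signals: Address, Instruction
- dmem.v: data memory; holds the data in your program, the stack, and the heap; signals: Address, DataOut, DataIn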
15
MIPS CPU Instruction Fetch
[Figure: the PC, a 32-bit register (flip-flops) with reset and clock inputs, drives the instruction memory address; the memory outputs the 32-bit instruction; an adder increments the PC by 4 for the next instruction]
- What is the PC on reset?
  - MIPS initializes the PC to 0xBFC0_0000
  - For the sake of simplicity, let’s initialize the PC to 0x0000_0000 in our design
- How about x86 and ARM?
  - The x86 reset vector is 0xFFFF_FFF0; the BIOS ROM is located there
  - The ARM reset vector is 0x0000_0000
16
Instruction Fetch Verilog Model
[Figure: PC register with reset and clock inputs, feeding an adder that adds 4]

pc module:

    `include "delay.v"

    // Program counter: 32-bit register with asynchronous, active-high reset
    module pc (input             clk, reset,
               output reg [31:0] pc,
               input      [31:0] pcnext);
      always @(posedge clk, posedge reset)
      begin
        if (reset) pc <= #`mydelay 32'h00000000;  // reset value of the PC
        else       pc <= #`mydelay pcnext;
      end
    endmodule

adder module:

    `include "delay.v"

    // 32-bit combinational adder
    module adder(input  [31:0] a, b,
                 output [31:0] y);
      assign #`mydelay y = a + b;
    endmodule

mips_cpu module:

    `include "delay.v"

    module mips_cpu(input         clk, reset,
                    output [31:0] pc,
                    input  [31:0] instr);
      wire [31:0] pcnext;

      // instantiate pc and adder modules
      pc    pcreg  (clk, reset, pc, pcnext);
      adder pcadd4 (pc, 32'b100, pcnext);   // pcnext = pc + 4
    endmodule
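The modules above include delay.v, which the slides do not show. A plausible minimal version is sketched below; both the contents and the value 1 are assumptions, chosen only so that the `mydelay` macro used above is defined.

```verilog
// delay.v (assumed contents; the lecture does not show this file).
// Defines the delay, in simulation time units, that the models attach
// to their assignments. The guard avoids redefining the macro when
// several source files include this file.
`ifndef mydelay
`define mydelay 1
`endif
```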
17
Memory
- As studied in Computer Logic Design, memory is classified into RAM (Random Access Memory) and ROM (Read-Only Memory)
  - RAM is classified into DRAM (Dynamic RAM) and SRAM (Static RAM)
  - DDR is a type of DRAM
    - DDR is short for DDR (Double Data Rate) SDRAM (Synchronous DRAM)
    - DDR is used as main memory in modern computers
- We use a simple Verilog memory model to store your program, since our focus is on how the CPU works
18
Simple MIPS Test Code
[Figure: example MIPS assembly code, with an arrow labeled "assemble"]
19
Instruction Memory Verilog Model

    module imem(input  [6:0]  a,
                output [31:0] rd);
      reg [31:0] RAM[127:0];   // 128 words, 32 bits each

      initial
      begin
        $readmemh("memfile.dat", RAM);   // load the compiled binary
      end

      assign #1 rd = RAM[a];   // word aligned
    endmodule

[Figure: instruction memory of 128 words, each word 32 bits wide, with a 7-bit address input a[6:0] and a 32-bit output rd[31:0]; rd outputs the word stored at address a]

memfile.dat (compiled binary file):

    20020005 2003000c 2067fff7 00e22025 00642824 00a42820
    10a7000a 0064202a 10800001 20050000 00e2202a 00853820
    00e23822 ac670044 8c020050 08000011 20020001 ac020054

Depending on your needs, you can increase or decrease the memory size. Examples:
- For a 1KB word-addressable memory: reg [31:0] RAM[255:0]
- For a 16KB byte-addressable memory: reg [7:0] RAM[16*1024-1:0]
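One way to make the memory size adjustable without editing the array bounds by hand is to derive them from a parameter. The following is a sketch only; the module name `imem_param` and the parameter `ADDR_BITS` are hypothetical and not part of the lecture's code.

```verilog
// Hypothetical parameterized variant of imem (not the lecture's module).
// ADDR_BITS = 7 gives 128 words (512 bytes); ADDR_BITS = 8 gives
// 256 words (1 KB), matching the 1KB word-addressable example.
module imem_param #(parameter ADDR_BITS = 7)
                  (input  [ADDR_BITS-1:0] a,
                   output [31:0]          rd);
  reg [31:0] RAM[(1<<ADDR_BITS)-1:0];

  initial begin
    $readmemh("memfile.dat", RAM);
  end

  assign #1 rd = RAM[a];  // a is a word address
endmodule
```

An instance for the 1 KB case could be declared as `imem_param #(8) imem256 (pc[9:2], instr);`, where the instance name is again only illustrative.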
20
MIPS CPU with imem and Testbench

mips_cpu_mem module:

    module mips_cpu_mem(input clk, reset);
      wire [31:0] pc, instr;

      // instantiate processor and memories
      mips_cpu imips_cpu  (clk, reset, pc, instr);
      // pc is a byte address; pc[8:2] is the 7-bit word address of imem
      imem     imips_imem (pc[8:2], instr);
    endmodule

mips_tb module:

    module mips_tb();
      reg clk;
      reg reset;

      // instantiate device to be tested
      mips_cpu_mem imips_cpu_mem(clk, reset);

      // initialize test: hold reset high for 32 time units
      initial
      begin
        reset <= 1; # 32; reset <= 0;
      end

      // generate clock to sequence tests (period of 20 time units)
      initial
      begin
        clk <= 0;
        forever #10 clk <= ~clk;
      end
    endmodule
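The testbench above runs forever and prints nothing, so it is meant to be inspected in a waveform viewer. As a hedged alternative, the self-contained sketch below exercises only the PC-plus-4 path and prints the PC on every cycle; the module names `fetch_sketch` and `fetch_sketch_tb`, the 200-time-unit run length, and the omission of `mydelay` are all assumptions made for brevity.

```verilog
// Self-contained sketch: a PC register plus a +4 adder, exercised by a
// small testbench. Names are hypothetical; delays are omitted.
module fetch_sketch(input clk, reset, output reg [31:0] pc);
  wire [31:0] pcnext = pc + 32'd4;

  always @(posedge clk, posedge reset)
    if (reset) pc <= 32'h00000000;
    else       pc <= pcnext;
endmodule

module fetch_sketch_tb();
  reg         clk, reset;
  wire [31:0] pc;

  fetch_sketch dut(clk, reset, pc);

  // same reset and clock pattern as mips_tb
  initial begin
    reset <= 1; #32; reset <= 0;
  end

  initial begin
    clk <= 0;
    forever #10 clk <= ~clk;
  end

  // print the PC shortly after every rising clock edge
  always @(posedge clk) #1 $display("t=%0t pc=%h", $time, pc);

  // stop the simulation after 200 time units
  initial #200 $finish;
endmodule
```

With this reset and clock pattern, reset is released at time 32, so the first increment of the PC takes place at the rising clock edge at time 50.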
21
Simulation and Synthesis
- Instruction fetch simulation
- Synthesis
  - Try synthesizing the pc and adder modules with Quartus II