Download presentation
Presentation is loading. Please wait.
Published bySharlene Lamb Modified over 9 years ago
1
Chapter 4 The Processor CprE 381 Computer Organization and Assembly Level Programming, Fall 2013 Zhao Zhang Iowa State University Revised from original slides provided by MKP
2
Week 8 Overview CPU design overview Datapath and Control Control Unit ALU Control Unit Chapter 1 — Computer Abstractions and Technology — 2
3
Announcements Mini-project B starts in week 9 Mini-projects B and C will be revised The grading scale will be discussed by Friday (week 8) Chapter 1 — Computer Abstractions and Technology — 3
4
Chapter 4 — The Processor — 4 Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified, single-cycle version A more realistic, pipelined version §4.1 Introduction
5
Nine-Instruction MIPS We will first use a MIPS subset of nine instructions, then extend the subset It’s enough to illustrate the most aspects of CPU design, particularly datapath and control design Memory reference: LW and SW Arithmetic/logic: ADD, SUB, AND, OR, SLT Branch: BEQ, BNE Chapter 1 — Computer Abstractions and Technology — 5
6
Chapter 4 — The Processor — 6 Instruction Execution PC instruction memory, Fetch instruction Register numbers register file, Read registers Then, depending on instruction class Execute: Use ALU to calculate Arithmetic result Memory address for load/store Branch target address Memory access: Access data memory for load/store Register writeback: Write data back to registers PC update (for all): PC target address or PC + 4
7
Chapter 4 — The Processor — 7 CPU Overview Next Sequential PC = PC + 4 Branch Target = (PC+4)+offset A Sketchy view An instruction may change 1.PC (all instructions) 2.Some register (arithmetic/logic, load) 3.Some memory word/halfword/byte (store)
8
Chapter 4 — The Processor — 8 Multiplexers Can’t just join wires together Use multiplexers What would happen if you just join signals in VHDL?
9
Chapter 4 — The Processor — 9 Control Control signals: mux select, read/write enable, ALU opcode, etc.
10
Chapter 4 — The Processor — 10 Logic Design Basics §4.2 Logic Design Conventions Combinational element Operate on data Output is a function of input State (sequential) elements Store information Output is a function of internal state and input
11
Chapter 4 — The Processor — 11 Combinational Elements AND-gate Y = A & B A B Y I0 I1 Y MuxMux S Multiplexer Y = S ? I1 : I0 A B Y + A B Y ALU F Adder Y = A + B Arithmetic/Logic Unit Y = F(A, B)
12
Chapter 4 — The Processor — 12 Sequential Elements Register: stores data in a circuit Uses a clock signal to determine when to update the stored value Edge-triggered: update when Clk changes from 0 to 1 Data output Q is stable for a clock cycle D Clk Q D Q
13
Chapter 4 — The Processor — 13 Sequential Elements Register with write control Only updates on clock edge when write control input is 1 VHDL: rising_edge(Clk) AND Write Used when stored value is required later D Clk Q Write D Q Clk
14
Chapter 4 — The Processor — 14 Clocking Methodology Combinational logic transforms data during clock cycles Input from state elements Output must stabilize within one cycle Longest delay determines clock period Output to state element at the next rising edge
15
Clocking Methodology Processor is a big state machine Works like a Moore machine in non-I/O phase Output is a function of the state States include PC, all registers and memory contents Chapter 1 — Computer Abstractions and Technology — 15
16
Chapter 4 — The Processor — 16 Building a Datapath Datapath elements Elements that process data and addresses in the CPU Registers, ALUs, mux’s, memories, … We will build a MIPS datapath incrementally Refining the overview design §4.3 Building a Datapath
17
Chapter 4 — The Processor — 17 Instruction Fetch 32-bit register Increment by 4 for next instruction Datapath elements: PC register, instruction memory, 32-bit adder
18
Chapter 4 — The Processor — 18 R-Format Instructions Read two register operands Perform arithmetic/logical operation Write register result Datapath elements: Register file, ALU
19
Chapter 4 — The Processor — 19 Load/Store Instructions Read register operands Calculate address using 16-bit offset Use ALU, but sign-extend offset Load: Read memory and update register Store: Write register value to memory Datapath elements: Data memory, sign extender
20
Chapter 4 — The Processor — 20 Branch Instructions Read register operands Compare operands Use ALU, subtract and check Zero output Calculate target address Sign-extend displacement Shift left 2 places (word displacement) Add to PC + 4 Already calculated by instruction fetch
21
Chapter 4 — The Processor — 21 Branch Instructions Just re-routes wires Sign-bit wire replicated New: Shifter, 2 nd 32-bit Adder
22
Chapter 4 — The Processor — 22 Composing the Elements First-cut data path does an instruction in one clock cycle Each datapath element can only do one function at a time Hence, we need separate instruction and data memories Use multiplexers where alternate data sources are used for different instructions
23
Chapter 4 — The Processor — 23 R-Type/Load/Store Datapath
24
Chapter 4 — The Processor — 24 Full Datapath
25
Chapter 4 — The Processor — 25 Performance Issues Longest delay determines clock period Critical path: load instruction Instruction memory register file ALU data memory register file Not every instruction requires the same time
26
Chapter 4 — The Processor — 26 Performance Issues Some instructions may take substantially longer time, e.g. multiply/division Not feasible to vary clock cycle for different instructions Must use the worst-case delay as the clock cycle Violates design principle making the common case fast We will improve performance by pipelining
27
Chapter 4 — The Processor — 27 ALU Control ALU used for Load/Store: F = add Branch: F = subtract R-type: F depends on funct field §4.4 A Simple Implementation Scheme ALU controlFunction 0000AND 0001OR 0010add 0110subtract 0111set-on-less-than 1100NOR
28
Chapter 4 — The Processor — 28 ALU Control Assume 2-bit ALUOp derived from opcode Combinational logic derives ALU control opcodeALUOpOperationfunctALU functionALU control lw00load wordXXXXXXadd0010 sw00store wordXXXXXXadd0010 beq01branch equalXXXXXXsubtract0110 R-type10add100000add0010 subtract100010subtract0110 AND100100AND0000 OR100101OR0001 set-on-less-than101010set-on-less-than0111
29
VHDL Notes How to program the ALU control? -- Behavior style process (alu_op, funct) begin case alu_op is when ‘00’ => alu_code <= ‘0010’; when ’01’ => … end case; end process; Chapter 1 — Computer Abstractions and Technology — 29
30
Chapter 4 — The Processor — 30 The Main Control Unit Control signals derived from instruction 0rsrtrdshamtfunct 31:265:025:2120:1615:1110:6 35 or 43rsrtaddress 31:2625:2120:1615:0 4rsrtaddress 31:2625:2120:1615:0 R-type Load/ Store Branch opcodealways read read, except for load write for R-type and load sign-extend and add
31
Chapter 4 — The Processor — 31 Datapath With Control
32
Summary of Control Signals RegDst: Write to register rt or rd? ALUSrc: Immediate to ALU? MemtoReg: Write memory or ALU output? RegWrite: Write to regfile at all? MemRead: Read from Data Memory? MemWrite: Write to the Data Memory? Branch: Is it a branch intruction? ALUOp[1:0]: ALU control field Chapter 1 — Computer Abstractions and Technology — 32
33
Chapter 4 — The Processor — 33 R-Type Instruction
34
R-Type: Control Signals RegDst ALUSrc MemtoReg RegWrite MemRead MemWrite Branch ALUOp[1:0] 1 (write to rd) 0 (No immediate) 0 (wrote not from memory) 1 (does write regfile) 0 (no memory read) 0 (no memory write) 0 (does write regfile) 10 (R-type ALU op) Chapter 1 — Computer Abstractions and Technology — 34
35
Chapter 4 — The Processor — 35 Load Instruction
36
Load: Control Signals RegDst ALUSrc MemtoReg RegWrite MemRead MemWrite Branch ALUOp[1:0] 0 1 1 1 1 0 0 00 Chapter 1 — Computer Abstractions and Technology — 36
37
Store: Control Signals RegDst ALUSrc MemtoReg RegWrite MemRead MemWrite Branch ALUOp[1:0] X 1 X 0 0 1 0 00 Chapter 1 — Computer Abstractions and Technology — 37
38
Chapter 4 — The Processor — 38 Branch-on-Equal Instruction
39
BEQ: Control Signals RegDst ALUSrc MemtoReg RegWrite MemRead MemWrite Branch ALUOp[1:0] X 0 X 0 0 0 1 01 Chapter 1 — Computer Abstractions and Technology — 39
40
Control Signal Setting What’re the control signal values for each instruction or instruction type? InstReg- Dst ALU- Src Mem- toReg Reg- Write Mem- Read Mem- Write Branc h ALUO p1 ALUO p0 R-100100010 lw011110000 swX1X001000 beqX0X000101 Chapter 1 — Computer Abstractions and Technology — 40 Note: “R-” means R-format
41
VHDL Notes How to program the control? entity control is port (op_code : in m32_6bits; reg_dst : out m32_1bit; alu_src : out m32_1bit; mem_to_reg : out m32_1bit; reg_write : out m32_1bit; mem_read : out m32_1bit; mem_write : out m32_1bit; branch : out m32_1bit; alu_op : out m32_2bits); end control; Chapter 1 — Computer Abstractions and Technology — 41
42
VHDL Notes architecture rom of control is subtype code_t is m32_vector(8 downto 0); type rom_t is array (0 to 63) of code_t; -- The ROM content for control signals signal rom : rom_t := ( 00 => "100100010", -- R-type 35 => "011110000", -- LW … -- More for other instructions others=>"000000000"); begin (reg_dst, alu_src, mem_to_reg, reg_write, mem_read, mem_write, branch, alu_op(1), alu_op(0)) <= rom(to_integer(unsigned(op_code))); end rom; Chapter 1 — Computer Abstractions and Technology — 42
43
Chapter 4 — The Processor — 43 Implementing Jumps Jump uses word address Update PC with concatenation of Top 4 bits of old PC 26-bit jump address 00 Need an extra control signal decoded from opcode 2address 31:2625:0 Jump
44
Chapter 4 — The Processor — 44 Datapath With Jumps Added
45
Grading Scale Tentative grading scale A: 90, A-: 87 B+: 84, B: 80, B-: 75 C+: 70, C: 65, C-: 60 D: 50 There will be a bonus in lab projects Chapter 1 — Computer Abstractions and Technology — 45
46
Mini-Project B, Tentative Implement single-cycle processor (SCP). There will be three parts 1. Part 1, SCPv1: Implement the nine- instruction ISA plus the J instruction 2. Part 2, SCPv2a: Support all the instructions needed to run bubble sorting 3. Part 3, SCPv2b: Detailed modeling of data elements Chapter 1 — Computer Abstractions and Technology — 46
47
Mini-Project B Bonus part, SCPv3: Support all integer instructions on the green sheet, due in the last lab Some support files will be provided High-level modeling of Register File, ALU, Adder, to be used in Parts 1 and 2 Partial sample VHDL code will be provided Chapter 1 — Computer Abstractions and Technology — 47
48
Mini-Project B The CPU composition must be strongly structural Parts 1 and 2 may use behavior/dataflow modeling for data elements Part 3 must use detailed modeling for data elements – Reuse your VHDL code in the labs Chapter 1 — Computer Abstractions and Technology — 48
49
Extend Single-Cycle MIPS Consider the following instructions addi: add immediate sll: Shift left logic by a constant bne: branch if not equal jal: Jump and link jr: Jump register Chapter 1 — Computer Abstractions and Technology — 49
50
Chapter 4 — The Processor — 50 SCPv0: R-Format, LW/SW, BEQ
51
Chapter 4 — The Processor — 51 SCPv1: R-Format, LW/SW, BEQ, J
52
SCPv1: Control Signals What’re the control signal values for each instruction or instruction type? InstReg- Dst ALU- Src Mem- toReg Reg- Write Mem Read Mem Write Bran ch ALU Op1 ALU Op0 Jum p R-1001000100 lw0111100000 swX1X0010000 beqX0X0001010 jXXX0000XX1 Chapter 1 — Computer Abstractions and Technology — 52 Note: “R-” means R-format
53
Extend the Single-Cycle Processor For each instruction, do we need 1. Any new or revised datapath element(s)? 2. Any new control signal(s)? Then revise, if necessary, 1. Datapath: Add new elements or revise existing ones, add new connections 2. Control Unit: Add/extend control signals, extend the truth table 3. ALU Control: Extend the truth table Chapter 1 — Computer Abstractions and Technology — 53
54
SCPv0 + ADDI addi rs, rt, immediate R[rt] = R[rs]+SignExtImm Read register operands (only one is used) Sign extend the immediate (in parallel) Perform arithmetic/logical operation Write register result Chapter 1 — Computer Abstractions and Technology — 54 001000rsrtimmediate 31:2625:2120:1615:0
55
SCPv0 + ADDI Chapter 1 — Computer Abstractions and Technology — 55 What changes to this baseline?
56
Chapter 4 — The Processor — 56 SCPv0 + ADDI Do we need new or revised datapath elements?
57
SCPv0 + ADDI Do we need new or revised datapath elements? Do we need new control signal(s)? InstReg- Dst ALU- Src Mem- toReg Reg- Write Mem- Read Mem- Write Branc h ALUO p1 ALUO p0 R-100100010 lw011110000 swX1X001000 beqX0X000101 addi Chapter 1 — Computer Abstractions and Technology — 57
58
SCPv0 + ADDI Like LW I-format instruction Write to register[rt] Use add operation InstReg- Dst ALU- Src Mem- toReg Reg- Write Mem- Read Mem- Write Branc h ALUO p1 ALUO p0 R-100100010 lw011110000 swX1X001000 beqX0X000101 addi010100000 Chapter 1 — Computer Abstractions and Technology — 58 Like R-format arithmetic Write ALU result to register file
59
SCPv0 + SLL sll rd, rs, shamt R[rd] = R[rt]<<shamt Read register operands (only one is used) Perform shift operation Write register result Note: sllv rd, rt, rs for shift left logic variable Chapter 1 — Computer Abstractions and Technology — 59 000000rsrtrdshamt000000 31:265:025:2120:1615:1110:6
60
SCPv0 + SLL Chapter 1 — Computer Abstractions and Technology — 60 What changes to the datapath elements?
61
SCPv0 + SLL Chapter 1 — Computer Abstractions and Technology — 61 ALU needs to do the shift operation ALU 1 st input needs another source: shamt extended to 32-bit
62
SCPv0 + SLL Add another source to the 1 st input of ALU Shamt: Instruction[10-6] Add a Mux and ALUSrc1 control line 0: R[rs] 1: Shamt (sign-extended) Rename ALUSrc to ALUSrc2 Extend ALU control Add an ALU control code for SLL Chapter 1 — Computer Abstractions and Technology — 62
63
Chapter 4 — The Processor — 63 SCPv0 + SLL Extend ALU control: Choose a code of your choice (kkkk shown in the table) opcodeALUOpOperationfunctALU functionALU control lw00load wordXXXXXXadd0010 sw00store wordXXXXXXadd0010 beq01branch equalXXXXXXsubtract0110 R-type10add100000add0010 subtract100010subtract0110 AND100100AND0000 OR100101OR0001 set-on-less-than101010set-on-less-than0111 shift-left-logic000000shift-left-logickkkk
64
SCPv0 + SLL InstReg- Dst ALU- Src Mem- toReg Reg- Write Mem- Read Mem- Write Branc h ALUO p R-10010001 0 lw01111000 swX1X00100 beqX0X00010 1 sll Chapter 1 — Computer Abstractions and Technology — 64 InstReg- Dst ALU Src1 ALU- Src2 Mem- toReg Reg- Write Mem - Read Mem - Write Bran ch ALU Op R-100010001 0 lw001111000 swX01X00100 beqX00X00010 1 sll110010001 0
65
SCPv0 + BNE bne rs, rt, label PC = (R[Rs]!=R[rt]) ? PC+4+(SignExtImm<<2) : PC+4 Read register operands Compare operands Use ALU, subtract and check Zero output Calculate target address Sign-extend displacement Shift left 2 places (word displacement) Add to PC + 4 Already calculated by instruction fetch Chapter 1 — Computer Abstractions and Technology — 65 000101rsrtoffset 31:2625:2120:1615:0
66
Chapter 4 — The Processor — 66 SCPv0 + BNE Make what changes to the datapath?
67
SCPv0 + BNE Extend Branch to two bits 10: Branch-Equal 11: Branch-Not-Equal Replace the AND gate with the following logic Can use a different truth table Chapter 1 — Computer Abstractions and Technology — 67 BranchZeroBranch taken? 1 011 1 01 otherwise0
68
SCPv0 + BNE InstReg- Dst ALU- Src Mem- toReg Reg- Write Mem- Read Mem- Write Branc h ALUO p R-10010001 0 lw01111000 swX1X00100 beqX0X00010 1 bne Chapter 1 — Computer Abstractions and Technology — 68 InstReg- Dst ALU- Src Mem- toReg Reg- Write Mem- Read Mem- Write Branc h ALUO p R-1001000 1 0 lw0111100 swX1X0010 beqX0X0001 00 1 bneX0X0001 0 1
69
SCPv1 + JAL jal target PC = JumpAddr R[31] = PC+4 Jump uses word address Update PC with JumpAddr: concatenation of top 4 bits of old PC, 26-bit jump address, and 00 (called pseudo-direct) Save PC+4 to $ra Chapter 1 — Computer Abstractions and Technology — 69 000011address 31:2625:0
70
Chapter 4 — The Processor — 70 SCPv1 + JAL Make what changes to the datapath?
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.