COMS 361 Computer Organization Title: MIPS Datapath Date: 11/18/2004 Lecture Number: 21
Announcements Exam 2 Tuesday, 11/23/2004
Review Multiplication Real numbers IEEE standard 754 (floating point) Single precision Double precision Decimal to floating point conversion Limitations
Outline MIPS implementation Subset of the MIPS instruction set Data flow Control
Performance Perspective Performance of a machine is determined by Instruction count Clock cycle time Clock cycles per instruction Processor design (datapath and control) determines CPI Inst. Count Cycle Time
Processor Design 101 MIPS implementation Subset of the MIPS instruction set Data flow 5 Steps to design a processor 1) Analyze instruction set => datapath requirements 2) Select set of datapath components and establish clocking methodology 3) Assemble datapath meeting the requirements 4) Determine location and setting of control points 5) Assemble the control logic
Processor Design 101 Step 1 Analyze instruction set => datapath requirements Meaning of each instruction is given by the register transfers Datapath must include storage element for ISA registers Possibly more Datapath must support each register transfer
Instruction formats: MIPS All MIPS instructions are 32 bits long Instruction formats R-type I-type J-type op rs rt rd shamt funct 6 11 16 21 26 31 6 bits 5 bits op rs rt immediate 16 21 26 31 6 bits 16 bits 5 bits op target address 26 31 6 bits 26 bits
Instruction formats: MIPS Different fields op: operation of the instruction rs, rt, rd: the source and destination register specifiers shamt: shift amount funct: selects the variant of the operation in the “op” field address / immediate: address offset or immediate value target address: target address of the jump instruction
MIPS-lite Subset ADD and SUB OR Immediate addU rd, rs, rt subU rd, rs, rt OR Immediate ori rt, rs, imm16 op rs rt rd shamt funct 6 11 16 21 26 31 6 bits 5 bits op rs rt immediate 16 21 26 31 6 bits 16 bits 5 bits
MIPS-lite Subset LOAD and STORE Word BRANCH lw rt, rs, imm16 sw rt, rs, imm16 BRANCH beq rs, rt, imm16 op rs rt immediate 16 21 26 31 6 bits 16 bits 5 bits op rs rt immediate 16 21 26 31 6 bits 16 bits 5 bits
Logical Register Transfers RTL gives the meaning of the instructions All start by fetching the instruction op | rs | rt | rd | shamt | funct = MEM[ PC ] op | rs | rt | Imm16 = MEM[ PC ] inst Register Transfers ADDU R[rd] <– R[rs] + R[rt]; PC <– PC + 4 SUBU R[rd] <– R[rs] – R[rt]; PC <– PC + 4 ORi R[rt] <– R[rs] + zero_ext(Imm16); PC <– PC + 4 LOAD R[rt] <– MEM[ R[rs] + sign_ext(Imm16)]; PC <– PC + 4 STORE MEM[ R[rs] + sign_ext(Imm16) ] <– R[rt]; PC <– PC + 4 BEQ if ( R[rs] == R[rt] ) then PC <– PC + sign_ext(Imm16)] || 00 else PC <– PC + 4
Instruction Set Requirements Memory Instruction & data Registers (32 x 32) read RS, read RT Write RT or RD PC Extender Zero and sign Add and Sub register or extended immediate Add 4 or extended immediate to PC
The Processor Design steps completed Next step: 1) Analyze instruction set => datapath requirements Next step: 2) Select set of datapath components and establish clocking methodology
Components of the Datapath Combinational Elements Elements that operate on data values Sequential Elements Elements that contain memory also called state Restore a units state by restoring previous state values Storage elements Clocking methodology An edge triggered methodology
Clocking Typical execution: Read contents of some state elements Send values through some combinational logic Write results to one or more state elements Clock cycle time determined by how long it takes for signals to reach state element 2 C l o c k y e S t a m n 1 b i g 2
Basic Building Blocks Combinational Logic Elements Adder MUX ADDER MUX Carry Out Sum Carry In A B 32 ADDER Y 32 A B Select MUX
Basic Building Blocks Combinational Logic Elements ALU ALU Operation A 32 ALU Result
Basic Building Blocks Storage Element: Register Similar to the D Flip Flop except N-bit input and output Write Enable input Write Enable negated (0): Data Out will not change asserted (1): Data Out will become Data In N Data Out Data In Write Enable Clock
Basic Building Blocks Storage Element: Register File MIPS 32 general purpose 32-bit registers stored in a structure called a register file any register can be read or written to specify the register number in the file How will the register file be used?
Basic Building Blocks R-type instructions Read two registers Register numbers specified in the instruction Two inputs specifying register numbers to read Two outputs corresponding to the two read registers M u x Register N-2 Read data 1 Read Register Number 1 Register 0 Register 1 ... Register N-1 Read data 2 Read Register Number 2
Basic Building Blocks R-type instructions Register file Input side n - 1 d e c r R g i s – C D u m b W a
Basic Building Blocks Register File R-type instructions Perform an ALU operation Store result Third input specifying the register number to write Fourth input for the data to store Control signal enables write into a register Clock input (CLK) A factor ONLY during write operation Behaves as a combinational logic block during read R e a d r g i s t n u m b 1 2 f l W Clk
Basic Building Blocks Memory (idealized) Select memory word One input bus: Data In One output bus: Data Out Select memory word Address selects the word to put on Data Out Write Enable = activated Address selects the memory word to be written Word is on Data In bus Clock input (CLK) a factor ONLY during write operation 32 Data Out Data In Write Enable Clock Address
The Processor All MIPS instructions use the ALU after reading the registers Why? Memory-reference? Effective address must be computed Arithmetic? Operation execution Control flow? Comparison
The Processor Instruction completion differs among classes after using the ALU Memory-reference Access memory write data for a store read data for a load Arithmetic Write the ALU data into a register Control flow May need to modify the address in the PC
The Processor Design steps completed Next step: 1) Analyze instruction set => datapath requirements 2) Select set of datapath components and establish clocking methodology Next step: 3) Assemble datapath meeting the requirements
Implementation Details Abstract / Simplified / High-Level View
Fetch Instruction Functional Unit Fetch the Instruction: mem[PC] Update the program counter: Sequential Code: PC <- PC + 4 Branch and Jump: PC <- “some other address” PC Adder Sum 4 Instruction address memory 32
Fetch Instruction Functional Unit Fetch the Instruction: mem[PC] Update the program counter: Sequential Code: PC <- PC + 4 Branch and Jump: PC <- “some other address” PC Adder Sum 4 Instruction address memory 32
Arithmetic Instructions addU rd, rs, rt R[rd] <- R[rs] op R[rt] Ra, Rb, and Rw come from instruction’s rs, rt, and rd fields ALUctr and RegWr Control logic after decoding the instruction
Arithmetic Instructions op rs rt rd shamt funct 6 11 16 21 26 31 6 bits 5 bits ALUctr Rs Read register number 1 5 Read data 1 A ALU Read register number 2 32 Rt Result 5 32 Write register Rd Read data 2 5 B 32 Write data RegWr Clk 32
Register-Register Timing Clk PC ALUctr RegWr busA, B Result Rs, Rt, Rd, Op, Func Clk-to-Q Instruction Memory Access Time Old Value New Value Delay through Control Logic Register File Access Time ALU Delay Register Write Occurs Here Read register number 1 RegWr Clk number 2 Write register data Read data 1 data 2 5 32 Rs Rt Rd ALUctr A B ALU Result
Logical Operations with Immediate ori rt, rs, imm16 R[rt] <- R[rs] op ZeroExt[imm16] ] Destination register can be either rt or rd Select the correct instruction field for destination register address 11 op rs rt immediate 16 21 26 31 6 bits 16 bits 5 bits rd? immediate 16 15 31 16 bits 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Logical Operations with Immediate R[rt] <- R[rs] op ZeroExt[imm16] ] Incorporate zero extension unit Add zero extended 16-bit immediate operand B input of ALU Read Data 2 Zero extended 16-bit immediate operand
Logical Operations with Immediate Read register number 1 RegWr Clk number 2 Write register data Read data 1 data 2 5 32 Rs Rt Rd ALUctr A B ALU Result RegDst MUX ALUsrc 0 Ext