Processor (I).

The Processor: Datapath & Control
- We're ready to look at an implementation of MIPS
- Simplified ISA containing only:
  - Memory-reference instructions: lw, sw
  - Arithmetic-logical instructions: add, sub, and, or, slt
  - Control flow instructions: beq, j
- Generic implementation:
  - Use the program counter (PC) to supply the instruction address
  - Get the instruction from memory
  - Read registers
  - Use the instruction to decide exactly what to do
- All instructions except j use the ALU after reading the registers
  - Memory-reference? Arithmetic? Control flow?

More Implementation Details
- Abstract / simplified view
- Five steps to execute an instruction: Fetch, Decode, Execute, Memory, Writeback
[Figure: abstract datapath with PC, adder (PC + 4), instruction memory (address, instruction), registers (reg#, data), ALU, and data]

State Elements
- State elements are needed to contain the state (value)
- Clocks are used in synchronous logic
  - When should an element that contains state be updated?
[Figure: clock waveform showing the clock period (cycle time), rising edge, and falling edge]
- Unclocked state element: set-reset latch
[Figure: set-reset latch with inputs R and S and output Q]

  S  R  Output
  0  0  No change
  1  0  Q = 1
  0  1  Q = 0
  1  1  Invalid
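The truth table above can be sketched as a small next-state function. This is only an illustrative model: the function name `sr_latch` and the choice to raise an error on the invalid input are assumptions, not part of the slides.

```python
def sr_latch(s: int, r: int, q: int) -> int:
    """Return the next Q of a set-reset latch, given inputs S, R and the current Q."""
    if s == 1 and r == 1:
        # The slide marks S = R = 1 as invalid; this sketch rejects it.
        raise ValueError("S = R = 1 is an invalid input combination")
    if s == 1:
        return 1   # set: Q = 1
    if r == 1:
        return 0   # reset: Q = 0
    return q       # S = R = 0: no change


q = 0
q = sr_latch(1, 0, q)   # set
q = sr_latch(0, 0, q)   # hold
print(q)                # prints 1
```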

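D-latch vs. D flip-flop
- D-latch: output follows D while the clock input C is asserted
- D flip-flop: output changes only on the clock edge
[Figure: D-latch and D flip-flop, each with inputs D and C and outputs Q and its complement]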
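To make the difference concrete, here is a minimal sketch that drives both elements with the same waveforms. It assumes a rising-edge-triggered flip-flop and an active-high latch; the slide does not say which edge is used, and the function name `simulate` is hypothetical.

```python
def simulate(clock, d):
    """Trace a D-latch and a D flip-flop over sampled clock and D waveforms."""
    latch_q, ff_q = 0, 0
    prev_c = 0
    latch_trace, ff_trace = [], []
    for c, d_in in zip(clock, d):
        if c == 1:
            latch_q = d_in            # latch is transparent while C is high
        if prev_c == 0 and c == 1:
            ff_q = d_in               # flip-flop samples D only on the rising edge (assumed)
        prev_c = c
        latch_trace.append(latch_q)
        ff_trace.append(ff_q)
    return latch_trace, ff_trace


clock = [0, 1, 1, 0, 1, 1]
d     = [1, 1, 0, 0, 0, 1]
latch_trace, ff_trace = simulate(clock, d)
print(latch_trace)  # [0, 1, 0, 0, 0, 1]
print(ff_trace)     # [0, 1, 1, 1, 0, 0]
```

The latch output changes in the middle of a high clock phase when D changes, while the flip-flop holds the value it captured at the edge.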

Our Implementation
- An edge-triggered methodology
- Typical execution:
  - Read contents of some state elements
  - Send values through some combinational logic
  - Write results to one or more state elements
[Figure: State element 1, combinational logic, State element 2, all within one clock cycle]

The Steps of Designing a Processor
1. Instruction Set Architecture used for high-level specification or Register-Transfer Level (RTL) model
   - Includes major organizational decisions
   - Examples: number and type of functional units, number of register file ports
2. Datapath-RTL refined to specify functional unit behavior and interfaces
   - Datapath components
   - Datapath interconnect
   - Associated datapath "control points"
3. Control structure defined and Control-RTL behavioral representation created
4. RTL datapath and control design are refined to track physical design and functional validation
   - Changes made for timing and bug fixes
   - Amount of work varies with capabilities of CAD tools and degree of optimization for cost and performance

Example RTL
- add rd, rs, rt (fields: op, rs, rt, rd, shamt, funct)
  - Instruction <- Mem[PC]; Fetch instruction from memory
  - R[rd] <- R[rs] + R[rt]; ADD operation
  - PC <- PC + 4; Calculate next address
- lw rt, imm16(rs) (fields: op, rs, rt, imm16)
  - Addr <- R[rs] + SignExt(imm16); Compute memory address
  - R[rt] <- Mem[Addr]; Load data into register
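The register transfers above map almost line for line onto code. The sketch below is an illustration only: the `state` dictionary layout, the word-per-address memory, and the function names are assumptions, and the PC update in `exec_lw` is assumed to match the one shown for add.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm16: int) -> int:
    """Interpret the low 16 bits as a two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def exec_add(state, rd, rs, rt):
    regs = state["R"]
    regs[rd] = (regs[rs] + regs[rt]) & MASK32       # R[rd] <- R[rs] + R[rt]
    state["PC"] = (state["PC"] + 4) & MASK32        # PC <- PC + 4


def exec_lw(state, rt, rs, imm16):
    regs, mem = state["R"], state["Mem"]
    addr = (regs[rs] + sign_extend16(imm16)) & MASK32   # Addr <- R[rs] + SignExt(imm16)
    regs[rt] = mem[addr]                                # R[rt] <- Mem[Addr]
    state["PC"] = (state["PC"] + 4) & MASK32            # assumed: same PC update as add


state = {"PC": 0, "R": [0] * 32, "Mem": {96: 0xDEADBEEF}}
state["R"][2] = 100
exec_lw(state, rt=1, rs=2, imm16=0xFFFC)    # offset -4, so Addr = 96
state["R"][17], state["R"][18] = 5, 7
exec_add(state, rd=8, rs=17, rt=18)
print(hex(state["R"][1]), state["R"][8], state["PC"])  # 0xdeadbeef 12 8
```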

Combinational Logic Elements
- Combinational logic does not use a clock
- Adder: 32-bit inputs A and B plus CarryIn; outputs Sum and Carry
- MUX: 32-bit inputs A and B plus Select; output Y
- ALU: 32-bit inputs A and B plus OP; output Result

Logic Abstraction
- Make sure you understand the abstractions!
- The simpler version is easier to understand than the actual implementation
[Figure: a 32-bit mux (inputs A and B, Select, output C) drawn as one symbol and as an array of 1-bit muxes for bits 31 ... 0]

Register File
- Built using D flip-flops
- Data ports
  - Two read ports connected to two 32-bit buses
  - One write port connected to one 32-bit bus
- Select register by: Read register 1, Read register 2, Write register
[Figure: register file with inputs Read register 1, Read register 2, Write register, Write data, and Write; outputs Read data 1 and Read data 2]

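Register File: Read Ports
- Two read ports
- Two source operands are read from the two ports
[Figure: two muxes, selected by Read register number 1 and Read register number 2, each choosing among Register 0 ... Register n - 1]
- Do you understand? What is the "Mux" above?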

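Register File: Write Port
- We still use the real clock to determine when to write
[Figure: write port built from a decoder on the write register number, gated by Write, driving the clock input C of Register 0 ... Register n - 1; write data feeds every D input]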
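A behavioral sketch of the register file described in the last three slides may help: two combinational read ports and one write port that only takes effect on the clock edge when Write is asserted. The class and method names are hypothetical, and details such as MIPS register 0 always reading as zero are deliberately left out.

```python
class RegisterFile:
    """Behavioral model: two read ports, one clocked write port."""

    def __init__(self, n: int = 32):
        self.regs = [0] * n

    def read(self, read_register_1: int, read_register_2: int):
        # Each read port acts like a mux that selects one register.
        return self.regs[read_register_1], self.regs[read_register_2]

    def clock_edge(self, write: int, write_register: int, write_data: int):
        # The decoder output is gated by Write; only the selected register
        # captures Write data on the clock edge.
        if write:
            self.regs[write_register] = write_data & 0xFFFFFFFF


rf = RegisterFile()
rf.clock_edge(write=1, write_register=17, write_data=5)
rf.clock_edge(write=0, write_register=18, write_data=7)   # ignored: Write = 0
print(rf.read(17, 18))  # (5, 0)
```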

Storage Element: Memory
- Two ports and buses for memory
  - One input bus (Data In) connected to one input port (Write data)
  - One output bus (Data Out) connected to one output port (Read data)
- Memory word is selected by Address
  - If MemRead = 1, Address selects the word to put on Data Out
  - If MemWrite = 1, Address selects the memory word to be written via the Data In bus
[Figure: memory block with inputs Address, Data In (Write data, 32 bits), MemRead, and MemWrite; output Data Out (Read data, 32 bits)]
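The MemRead / MemWrite behavior can be modeled in a few lines. This is a sketch under assumptions: the class name `DataMemory`, the dictionary-backed storage, and the choice to return 0 for words never written are all illustrative.

```python
class DataMemory:
    """Word-granular memory controlled by MemRead and MemWrite."""

    def __init__(self):
        self.words = {}   # address -> 32-bit word

    def access(self, address: int, data_in: int = 0,
               mem_read: int = 0, mem_write: int = 0):
        if mem_write:
            # Address selects the word to be written via Data In.
            self.words[address] = data_in & 0xFFFFFFFF
        if mem_read:
            # Address selects the word to put on Data Out.
            return self.words.get(address, 0)   # assumed: unwritten words read as 0
        return None                             # Data Out is not driven


mem = DataMemory()
mem.access(0x100, data_in=42, mem_write=1)
print(mem.access(0x100, mem_read=1))  # 42
```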

Instruction Fetch Unit
- Common RTL operations
  - Fetch the instruction: Instruction <- Mem[PC]
  - Update the program counter:
    - Sequential code: PC <- PC + 4
    - Branch and jump: PC <- "something else"
[Figure: PC feeding the instruction memory]

R-type ALU Operations
- Instruction fields: op, rs, rt, rd, shamt, funct
- Registers[Write register] <- ALU operation(Read data 1, Read data 2)
[Figure: register file (Read register 1, Read register 2, Write register, Write data, RegWrite; Read data 1, Read data 2) feeding the ALU (4-bit ALU operation; outputs Zero and ALU result)]

Load & Store
- Instruction fields: op, rs, rt, imm16
- Addresses are calculated by the ALU (ALU operation is add)
- Load
  - ALU result <- ALU operation(Read data 1, Sign extend(Instruction[0:15]))
  - Registers[Write register] <- MEM[ALU result]
- Store
  - MEM[ALU result] <- Read data 2
[Figure: register file, sign-extend unit (16 to 32 bits), ALU (4-bit ALU operation = add; Zero, ALU result), and data memory (MemWrite, MemRead)]

Branch
- Instruction fields: op, rs, rt, imm16
- Condition is calculated by the ALU (ALU operation is subtract)
  - Zero <- ALU operation(Read data 1, Read data 2)
  - Branch target <- Add(PC+4, Shift left 2(Sign extend(Instruction[0:15])))
  - Branch control <- Zero
- Depending on the condition, select the appropriate target address
- Even in the fall-through case, the adder still calculates the target address for the taken case
[Figure: branch datapath with 4-bit ALU operation (= subtract)]
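The branch transfers above can be checked with a short calculation. The following is a sketch, with hypothetical function names, of how the next PC could be selected for beq; it computes the target unconditionally and then lets Zero choose, mirroring the slide.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm16: int) -> int:
    """Interpret the low 16 bits as a two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def beq_next_pc(pc: int, imm16: int, read_data_1: int, read_data_2: int) -> int:
    # Zero <- ALU operation(Read data 1, Read data 2), with ALU operation = subtract
    zero = ((read_data_1 - read_data_2) & MASK32) == 0
    pc_plus_4 = (pc + 4) & MASK32
    # Branch target <- Add(PC+4, Shift left 2(Sign extend(imm16)))
    target = (pc_plus_4 + (sign_extend16(imm16) << 2)) & MASK32
    # Branch control <- Zero: the mux picks the target or the fall-through address
    return target if zero else pc_plus_4


print(hex(beq_next_pc(0x1000, 3, 5, 5)))       # 0x1010 (taken, forward)
print(hex(beq_next_pc(0x1000, 0xFFFF, 7, 7)))  # 0x1000 (taken, backward by one word)
print(hex(beq_next_pc(0x1000, 3, 5, 6)))       # 0x1004 (not taken)
```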

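Simple Implementation
- Include the functional units we need for each instruction
[Figure: functional units: instruction memory (Instruction address, Instruction), program counter, adder (Add, Sum), data memory unit (Address, Write data, Read data, MemWrite, MemRead), sign-extension unit (16 to 32 bits), registers (5-bit Read register 1, Read register 2, Write register; Write data, RegWrite; Read data 1, Read data 2), and ALU (4-bit ALU operation; Zero, ALU result)]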

Building the Datapath
- Use MUXs to stitch them together
[Figure: combined datapath with PC, instruction memory, registers, sign extend (16 to 32 bits), shift left 2, ALU (4-bit ALU operation; Zero, ALU result), data memory, and muxes]

Control
- Selecting the operations to perform (ALU, read/write, etc.)
- Controlling the flow of data (multiplexor inputs)
- Information comes from the 32 bits of the instruction
- Example: add $8, $17, $18

  op      rs     rt     rd     shamt  funct
  000000  10001  10010  01000  00000  100000

- ALU's operation is based on instruction type and function code
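The field boundaries in the example can be verified by slicing the 32-bit word with shifts and masks. The helper name `decode_r_type` is hypothetical; the bit positions follow the R-type layout (op 31:26, rs 25:21, rt 20:16, rd 15:11, shamt 10:6, funct 5:0).

```python
def decode_r_type(word: int) -> dict:
    """Split a 32-bit R-type instruction word into its six fields."""
    return {
        "op":    (word >> 26) & 0x3F,
        "rs":    (word >> 21) & 0x1F,
        "rt":    (word >> 16) & 0x1F,
        "rd":    (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
    }


# add $8, $17, $18, written field by field as on the slide
word = int("000000" "10001" "10010" "01000" "00000" "100000", 2)
print(hex(word))            # 0x2324020
print(decode_r_type(word))
# {'op': 0, 'rs': 17, 'rt': 18, 'rd': 8, 'shamt': 0, 'funct': 32}
```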

ALU Control
- What should the ALU do with this instruction?
- Example: lw $1, 100($2)

  op   rs   rt   16-bit offset
  35   2    1    100

- ALU control input:

  0000  AND
  0001  OR
  0010  add
  0110  subtract
  0111  set-on-less-than
  1100  NOR

- Why is the code for subtract 0110 and not 0011?
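As an illustration of the six control codes listed above, here is a minimal behavioral ALU. It is a sketch, not the hardware design: the function names are hypothetical, operands are treated as 32-bit values, and set-on-less-than is assumed to compare them as signed two's-complement numbers.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x: int) -> int:
    """Reinterpret a 32-bit pattern as a signed two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def alu(control: int, a: int, b: int):
    """Return (result, zero) for a 4-bit ALU control input."""
    if control == 0b0000:
        result = a & b
    elif control == 0b0001:
        result = a | b
    elif control == 0b0010:
        result = a + b
    elif control == 0b0110:
        result = a - b
    elif control == 0b0111:
        result = 1 if to_signed(a) < to_signed(b) else 0   # signed compare (assumed)
    elif control == 0b1100:
        result = ~(a | b)
    else:
        raise ValueError(f"unsupported ALU control input: {control:04b}")
    result &= MASK32
    return result, result == 0


print(alu(0b0010, 100, 8))         # (108, False)  address calculation for lw
print(alu(0b0110, 5, 5))           # (0, True)     equality test for beq
print(alu(0b0111, 0xFFFFFFFF, 1))  # (1, False)    -1 < 1
```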

ALU Control (cont'd)
- Must describe hardware to compute the 4-bit ALU control input
- The instruction type (opcode) decides the ALUOp:
  - 00 (lw, sw instructions)
  - 01 (beq instruction)
  - 10 (all arithmetic-logical instructions)
- The function code is meaningful only for the arithmetic type (10)
- Describe it using a truth table (can turn into gates)
[Table: truth table with input bits and output bits]

Combinational Logic for ALU Control
- ALUOp: 00 (lw, sw instructions), 01 (beq instruction), 10 (arithmetic-logical)
- ALU control input: 0000 AND, 0001 OR, 0010 add, 0110 subtract, 0111 slt, 1100 NOR
[Figure: ALU control block with inputs ALUOp (2 bits) and function code F(5-0), producing the Operation bits]
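The two-level decoding can be sketched as a lookup. The ALUOp values and the 4-bit outputs come from the slides; the funct codes in the table below are the standard MIPS encodings for add, sub, and, or, and slt, which the transcript itself does not list, and the names `FUNCT_TO_ALU` and `alu_control` are hypothetical.

```python
# Standard MIPS funct codes (not shown on the slide) mapped to ALU control inputs.
FUNCT_TO_ALU = {
    0b100000: 0b0010,  # add -> add
    0b100010: 0b0110,  # sub -> subtract
    0b100100: 0b0000,  # and -> AND
    0b100101: 0b0001,  # or  -> OR
    0b101010: 0b0111,  # slt -> set-on-less-than
}


def alu_control(alu_op: int, funct: int) -> int:
    """Compute the 4-bit ALU control input from ALUOp and the function code."""
    if alu_op == 0b00:       # lw, sw: add to form the address
        return 0b0010
    if alu_op == 0b01:       # beq: subtract to compare
        return 0b0110
    if alu_op == 0b10:       # arithmetic-logical: the function code decides
        return FUNCT_TO_ALU[funct]
    raise ValueError(f"unsupported ALUOp: {alu_op:02b}")


print(format(alu_control(0b10, 0b100000), "04b"))  # 0010
print(format(alu_control(0b00, 0b000000), "04b"))  # 0010 (funct ignored)
```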

Main Control
- Instruction encoding determines the main control
  - Arithmetic-logical instructions: add, sub, and, or, slt
  - Memory instructions & branch: lw, sw, beq
  - Jump instructions: j

  R-type  op (31:26)  rs (25:21)  rt (20:16)  rd (15:11)  shamt (10:6)  funct (5:0)
  I-type  op (31:26)  rs (25:21)  rt (20:16)  16-bit address (15:0)
  J-type  op (31:26)  26-bit address (25:0)

- lw: rt <- Mem[rs + Imm16] (among lw, sw, and beq, the only case that writes a register)

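[Figure: datapath with control: PC, instruction memory (Instruction[31-0] fields), registers, sign extend (16 to 32 bits), shift left 2, ALU with Zero, data memory, muxes, and a control unit producing signals that include Branch and ALUOp]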

Control
[Figure: control logic with inputs Op0 ... Op5, decoded into R-type, lw, sw, and beq; outputs RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, and ALUOp]
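Since the control figure did not survive in the transcript, the following sketch reconstructs one plausible signal table for the four instruction classes named there. The signal values follow the usual textbook single-cycle MIPS design and should be read as an assumption rather than a copy of the slide; `None` marks a don't-care, and the opcodes are the standard MIPS values (0 for R-type, 35 for lw, 43 for sw, 4 for beq).

```python
# opcode -> control signals (assumed values; None means "don't care")
CONTROL = {
    0:  dict(RegDst=1, ALUSrc=0, MemtoReg=0, RegWrite=1,
             MemRead=0, MemWrite=0, Branch=0, ALUOp=0b10),        # R-type
    35: dict(RegDst=0, ALUSrc=1, MemtoReg=1, RegWrite=1,
             MemRead=1, MemWrite=0, Branch=0, ALUOp=0b00),        # lw
    43: dict(RegDst=None, ALUSrc=1, MemtoReg=None, RegWrite=0,
             MemRead=0, MemWrite=1, Branch=0, ALUOp=0b00),        # sw
    4:  dict(RegDst=None, ALUSrc=0, MemtoReg=None, RegWrite=0,
             MemRead=0, MemWrite=0, Branch=1, ALUOp=0b01),        # beq
}


def main_control(instruction: int) -> dict:
    """Look up the control signals from instruction bits 31:26."""
    opcode = (instruction >> 26) & 0x3F
    return CONTROL[opcode]


lw_word = (35 << 26) | (2 << 21) | (1 << 16) | 100   # lw $1, 100($2)
print(hex(lw_word))                      # 0x8c410064
print(main_control(lw_word)["MemRead"])  # 1
```

Only lw and the R-type instructions assert RegWrite in this table, which matches the observation that lw is the one memory or branch instruction that writes a register.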

Implementing Jump
- Unconditional jump to target address
- J-type format: op (31:26), 26-bit address (25:0)
- Target address is calculated by using pseudo-direct addressing
  - PCnext = (PC+4)[31-28] concatenated with (Instruction[25-0] << 2)
- Additional logic to calculate the target address (Figure 5.24 in textbook)
  - Reuse adder for PC+4
  - Extra shifter: to calculate the << 2 operation
  - Extra mux: to select the target address
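The pseudo-direct address calculation is a pair of masks and a shift. The sketch below uses a hypothetical function name and joins the two parts with a bitwise OR, which is equivalent to concatenation here because the two bit ranges do not overlap.

```python
def jump_target(pc: int, instruction: int) -> int:
    """PCnext for j: upper 4 bits of PC+4 joined with the 26-bit address shifted left by 2."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    upper = pc_plus_4 & 0xF0000000               # (PC+4)[31-28]
    lower = (instruction & 0x03FFFFFF) << 2      # Instruction[25-0] << 2 (28 bits)
    return upper | lower


# j with a 26-bit address field of 0x10, fetched from PC = 0x40000000
print(hex(jump_target(0x40000000, 0x08000010)))  # 0x40000040
```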

Our Simple Control Structure
- All of the logic is combinational
- We wait for everything to settle down, and the right thing is done
  - The ALU might not produce the "right answer" right away
  - We use write signals along with the clock to determine when to write
- Cycle time is determined by the length of the longest path
[Figure: State element 1, combinational logic, State element 2, all within one clock cycle]
- We are ignoring some details like setup and hold times

Single Cycle Implementation (example)
- Calculate the cycle time under the following conditions
  - Operation time of major units (assume no delay for the others): memory (200 ps), ALU and adders (100 ps), register file access (50 ps)
- Compare with a machine with variable clock cycles
  - Assume the following instruction mix: load (25%), store (10%), ALU instr (45%), branch (15%), jump (5%)
- lw: 200 (instruction fetch) + 50 (register read) + 100 (ALU) + 200 (memory read) + 50 (register write) = 600 ps
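The comparison can be worked through numerically. Only the lw breakdown appears on the slide; the per-class paths for store, ALU, branch, and jump below are assumptions based on which units each class would use in this datapath, and all names are illustrative.

```python
MEM, ALU, REG = 200, 100, 50   # unit delays in ps, from the slide

# Critical path per instruction class (lw is from the slide; the rest are assumed).
latency_ps = {
    "load":   MEM + REG + ALU + MEM + REG,  # fetch, reg read, ALU, mem read, reg write
    "store":  MEM + REG + ALU + MEM,        # fetch, reg read, ALU, mem write
    "alu":    MEM + REG + ALU + REG,        # fetch, reg read, ALU, reg write
    "branch": MEM + REG + ALU,              # fetch, reg read, ALU compare
    "jump":   MEM,                          # fetch only
}
mix_percent = {"load": 25, "store": 10, "alu": 45, "branch": 15, "jump": 5}

# A single-cycle clock must fit the slowest instruction.
single_cycle_ps = max(latency_ps.values())

# A variable-length clock averages over the instruction mix.
variable_avg_ps = sum(mix_percent[k] * latency_ps[k] for k in latency_ps) / 100

print(single_cycle_ps)                    # 600
print(variable_avg_ps)                    # 447.5
print(single_cycle_ps / variable_avg_ps)  # about 1.34
```

Under these assumptions the single-cycle clock is set by lw at 600 ps, and a variable clock would be roughly 1.34 times faster on this mix.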

Where we are headed
- Single cycle problems:
  - What if we had a more complicated instruction like floating point?
  - Wasteful of area
- One solution:
  - Use a "smaller" cycle time
  - Have different instructions take different numbers of cycles
  - A "multicycle" datapath
[Figure: multicycle datapath with PC, memory (Address, Data), instruction register, registers (Register #), A, B, ALU, and ALUOut]