Chapter 5 Instructor: Mozafar Bag-Mohammadi Spring 2010 Ilam University.

Slides:



Advertisements
Similar presentations
Microprocessor Design Multi-cycle Datapath Nia S. Bradley Vijay.
Advertisements

1 Chapter Five The Processor: Datapath and Control.
Prof. John Nestor ECE Department Lafayette College Easton, Pennsylvania ECE Computer Organization Lecture 14 - Multi-Cycle.
Chapter 5 The Processor: Datapath and Control Basic MIPS Architecture Homework 2 due October 28 th. Project Designs due October 28 th. Project Reports.
EECE476 Lecture 9: Multi-cycle CPU Datapath Chapter 5: Section 5.5 The University of British ColumbiaEECE 476© 2005 Guy Lemieux.
Levels in Processor Design
©UCB CS 161Computer Architecture Chapter 5 Lecture 11 Instructor: L.N. Bhuyan Adapted from notes by Dave Patterson (http.cs.berkeley.edu/~patterson)
EECE476 Lectures 10: Multi-cycle CPU Control Chapter 5: Section 5.5 The University of British ColumbiaEECE 476© 2005 Guy Lemieux.
Preparation for Midterm Binary Data Storage (integer, char, float pt) and Operations, Logic, Flip Flops, Switch Debouncing, Timing, Synchronous / Asynchronous.
ENEE350 Ankur Srivastava University of Maryland, College Park Based on Slides from Mary Jane Irwin ( )
Advanced Computer Architecture 5MD00 / 5Z033 MIPS Design data path and control Henk Corporaal TUEindhoven 2007.
331 W10.1Spring :332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s.
CSE431 L05 Basic MIPS Architecture.1Irwin, PSU, 2005 CSE 431 Computer Architecture Fall 2005 Lecture 05: Basic MIPS Architecture Review Mary Jane Irwin.
Dr. Iyad F. Jafar Basic MIPS Architecture: Multi-Cycle Datapath and Control.
CMPE 421 Advanced Computer Architecture Supplementary material for Pipelining PART1.
COSC 3430 L08 Basic MIPS Architecture.1 COSC 3430 Computer Architecture Lecture 08 Processors Single cycle Datapath PH 3: Sections
Datapath and Control: MultiCycle Implementation. Performance of Single Cycle Machines °Assume following operation times: Memory units : 200 ps ALU and.
CPE232 Basic MIPS Architecture1 Computer Organization Multi-cycle Approach Dr. Iyad Jafar Adapted from Dr. Gheith Abandah slides
1 CS/COE0447 Computer Organization & Assembly Language Multi-Cycle Execution.
C HAPTER 5 T HE PROCESSOR : D ATAPATH AND C ONTROL M ULTICYCLE D ESIGN.
1 Processor: Datapath and Control Single cycle processor –Datapath and Control Multicycle processor –Datapath and Control Microprogramming –Vertical and.
Multicycle Implementation
IT 251 Computer Organization and Architecture Multi Cycle CPU Datapath Chia-Chi Teng.
ECE-C355 Computer Structures Winter 2008 The MIPS Datapath Slides have been adapted from Prof. Mary Jane Irwin ( )
Datapath and Control AddressInstruction Memory Write Data Reg Addr Register File ALU Data Memory Address Write Data Read Data PC Read Data Read Data.
Fall 2015, Sep ELEC / Lecture 5 1 ELEC / Computer Architecture and Design Fall 2015 Datapath and Control (Chapter.
ECE/CS 552: Data Path and Control Instructor:Mikko H Lipasti Fall 2010 University of Wisconsin-Madison Lecture notes based on set created by Mark Hill.
Datapath and Control AddressInstruction Memory Write Data Reg Addr Register File ALU Data Memory Address Write Data Read Data PC Read Data Read Data.
1 CS/COE0447 Computer Organization & Assembly Language Chapter 5 Part 2.
Adapted from Computer Organization and Design, Patterson & Hennessy, UCB ECE232: Hardware Organization and Design Part 10: Control Design
Chapter 4 From: Dr. Iyad F. Jafar Basic MIPS Architecture: Multi-Cycle Datapath and Control.
1 The final datapath. 2 Control  The control unit is responsible for setting all the control signals so that each instruction is executed properly. —The.
Computer Architecture Lecture 6.  Our implementation of the MIPS is simplified memory-reference instructions: lw, sw arithmetic-logical instructions:
Design a MIPS Processor (II)
ECE/CS 552: Single Cycle Control Path
Multi-Cycle Datapath and Control
Chapter 5: A Multi-Cycle CPU.
IT 251 Computer Organization and Architecture
/ Computer Architecture and Design
ECE/CS 552: Multicycle Data Path
Systems Architecture I
Multi-Cycle CPU.
Extensions to the Multicycle CPU
Single Cycle Processor
D.4 Finite State Diagram for the Multi-cycle processor
Multi-Cycle CPU.
Basic MIPS Architecture
CS/COE0447 Computer Organization & Assembly Language
Multiple Cycle Implementation of MIPS-Lite CPU
ECE/CS 552: Single Cycle Datapath
Chapter Five The Processor: Datapath and Control
Multicycle Approach Break up the instructions into steps
The Multicycle Implementation
Processor: Datapath and Control (part 2)
Chapter Five The Processor: Datapath and Control
The Multicycle Implementation
Systems Architecture I
Vishwani D. Agrawal James J. Danaher Professor
COSC 2021: Computer Organization Instructor: Dr. Amir Asif
COSC 2021: Computer Organization Instructor: Dr. Amir Asif
Instructor: Mozafar Bag-Mohammadi Spring 2006 University of Ilam
Processor: Multi-Cycle Datapath & Control
Multi-Cycle Datapath Lecture notes from MKP, H. H. Lee and S. Yalamanchili.
Chapter Four The Processor: Datapath and Control
Control Unit for Multiple Cycle Implementation
5.5 A Multicycle Implementation
Systems Architecture I
Control Unit for Multiple Cycle Implementation
FloorPlan for Multicycle MIPS
CS161 – Design and Architecture of Computer Systems
Presentation transcript:

Chapter 5 Instructor: Mozafar Bag-Mohammadi Spring 2010 Ilam University

Processor Implementation Sequential logic design review (brief) Clock methodology (FSD) Datapath – 1 CPI Single instruction, 2’s complement, unsigned Control Multiple cycle implementation (information only) Microprogramming Exceptions

Review: Sequential Logic Logic is combinational if output is solely function of inputs E.g. ALU of previous lecture Logic is sequential or “has state” if output function of: Past and current inputs Past inputs remembered in “state”

Review: Sequential Logic Clock high, Q = D, ~Q = ~D after prop. delay Clock low Q, ~Q remain unchanged Level-sensitive latch

Review: Sequential Logic E.g. Master/Slave D flip-flop While clock high, Q M follows D, but Q S holds At falling edge Q M propagates to Q S

Review: Sequential Logic Can build: Why can this fail for a latch? D FF +1

Clocking Methodology Motivation Design data and control without considering clock Use Fully Synchronous Design (FSD) Just a convention to simplify design process Restricts design freedom Eliminates complexity, can guarantee timing correctness Not really feasible in real designs Even in this course you will violate FSD

Our Methodology Only flip-flops All on the same edge (e.g. falling) All with same clock No need to draw clock signals All logic finishes in one cycle

Our Methodology, cont’d No clock gating! Book has bad examples Correct design:

Datapath – 1 CPI Assumption: get whole instruction done in one long cycle Instructions: add, sub, and, or slt, lw, sw, & beq To do For each instruction type Putting it all together

Fetch Instructions Fetch instruction, then increment PC Same for all types Assumes PC updated every cycle No branches or jumps After this instruction fetch next one

ALU Instructions and $1, $2, $3 # $1 <= $2 & $3 E.g. MIPS R-format Opcodersrtrdshamtfunction

Load/Store Instructions lw $1, immed($2) # $1 <= M[SE(immed)+$2] E.g. MIPS I-format: Opcodertrtimmed 65516

Branch Instructions beq $1, $2, addr # if ($1==$2) PC = PC + addr<<2 Actually newPC = PC + 4 target = newPC + addr << 2 # in MIPS offset from newPC if (($1 - $2) == 0) PC = target else PC = newPC

Branch Instructions

All Together

Control Overview Single-cycle implementation Datapath: combinational logic, I-mem, regs, D-mem, PC Last three written at end of cycle Need control – just combinational logic! Inputs: Instruction (I-mem out) Zero (for beq) Outputs: Control lines for muxes ALUop Write-enables

Control Overview Fast control Divide up work on “need to know” basis Logic with fewer inputs is faster E.g. Global control need not know which ALUop

ALU Control Assume ALU uses 000and 001or 010add 110sub 111slt (set less than) othersdon’t care

ALU Control ALU-ctrl = f(opcode,function) To simplify ALU-ctrl ALUop = f(opcode) 2 bits6 bits InstructionOperationOpcodeFunction add sub and or slt lwadd100011xxxxxx swadd101011xxxxxx beqsub

ALU Control ALU-ctrl = f(ALUop, function) 3 bits2 bits6 bits Requires only five gates plus inverters 10add, sub, and, … 00lw, sw 01beq

Control Signals Needed (5.19)

Global Control R-format: opcodersrtrdshamtfunction I-format: opcodersrtaddress/immediate J-format: opcodeaddress 626

Global Control Route instruction[25:21] as read reg1 spec Route instruction[20:16] are read reg2 spec Route instruction[20:16] (load) and instruction[15:11] (others) to Write reg mux Call instruction[31:26] op[5:0]

Global Control Global control outputs ALU-ctrl- see above ALU src- R-format, beq vs. ld/st MemRead- lw MemWrite- sw MemtoReg- lw RegDst- lw dst in bits 20:16, not 15:11 RegWrite- all but beq and sw PCSrc- beq taken

Global Control Global control outputs Replace PCsrc with Branch beq PCSrc = Branch * Zero What are the inputs needed to determine above global control signals? Just Op[5:0]

Global Control (Fig. 5.20) RegDst = ~Op[0] ALUSrc = Op[0] RegWrite = ~Op[3] * ~Op[2] InstructionOpcodeRegDstALUSrc rrr lw sw101011x1 beq000100x0 ???othersxx

Global Control More complex with entire MIPS ISA Need more systematic structure Want to share gates between control signals Common solution: PLA MIPS opcode space designed to minimize PLA inputs, minterms, and outputs See MIPS Opcode map (Fig A.19)

Control Signals; Add Jumps

Control Signals w/Jumps (5.29)

What’s wrong with single cycle? Critical path probably lw: I-mem, reg-read, alu, d-mem, reg-write Other instructions faster E.g. rrr: skip d-mem Instruction variation much worse for full ISA and real implementation: FP divide Cache misses (what the heck is this? – chapter 7) Instructions Cycles Program Instruction Time Cycle (code size) XX (CPI) (cycle time)

Single Cycle Implementation Solution Variable clock? Too hard to control, design Fixed short clock Variable cycles per instruction

Multi-cycle Implementation Clock cycle = max(i-mem,reg-read+reg-write, ALU, d-mem) Reuse combination logic on different cycles One memory One ALU without other adders But Control is more complex Need new registers to save values (e.g. IR) Used again on later cycles Logic that computes signals is reused

High-level Multi-cycle Data- path Note: Instruction register, memory data register One memory with address bus One ALU with ALUOut register

Comment on busses Share wires to reduce #signals Distributed multiplexor Multiple sources driving one bus Ensure only one is active!

Multi-cycle Ctrl Signals (Fig 5.32)

Multi-cycle Datapath

Multi-cycle Steps StepDescriptionSample Actions IFFetch IR=MEM[PC] PC=PC+4 IDDecode A=RF(IR[25:21]) B=RF(IR[20:16]) ALUout=PC+SE(IR[15:0] << 2) EXExecute ALUout = A + SE(IR[15:0]) # lw/sw ALUout = A op B # rrr if (A==B) PC = ALUout # beq MemMemory MEM[ALUout] = B # sw MDR = MEM[ALUout] #lw RF(IR[15:11]) = ALUout # rrr WBWriteback Reg(IR[20:16]) = MDR # lw

Multi-cycle Control Function of Op[5:0] and current step Defined as Finite State Machine (FSM) or Micro-program or microcode (later) FSM – App. B State is combination of step and which path outputs Next State Next State Fn Output Fn Current state Inputs

Finite State Machine (FSM) For each state, define: Control signals for datapath for this cycle Control signals to determine next state All instructions start in same IF state Instructions terminate by making IF next After proper PC update

Multi-cycle Example (and) PCWrite PCSource = 10 ALUSrcA = 1 ALUSrcB = 00 ALUOp = 01 PCWriteCond PCSource = 01 ALUSrcA = 1 ALUSrcB = 00 ALUOp = 10 MemRead IorD = 1 MemWrite IorD = 1 RegDst = 1 RegWrite MemtoReg = 0 RegDst = 0 RegWrite MemtoReg = 1 Start MemRead ALUSrcA=0 IorD = 0 IRWrite ALUSrcB = 01 ALUOp = 00 PCWrite PCSrc = 00 ALUSrcA = 0 ALUSrcB = 11 ALUOp = 00 ALUSrcA = 1 ALUSrcB = 10 ALUOp = 00 LW | SW RRR BEQ J LW SW IFID EX MEM WB

Multi-cycle Example (and)

Nuts and Bolts--More on FSMs You will be using FSM control for your processor implementation You will be producing the state machine and control outputs from binary (ISA/datapath controls) There are multiple methods for specifying a state machine Moore machine (output is function of state only) Mealy machine (output is function of state/input) There are different methods of assigning states

FSMs--State Assignment State assignment is converting logical states to binary representation Is state assignment interesting/important? Judicious choice of state representation can make next state fcn/output fcn have fewer gates Optimal solution is hard, but having intuition is helpful (CAD tools can also help in practice)

State Assignment--Example 10 states in multicycle control FSM Each state can have 1 of 16 (2^4) encodings with “dense” state representation Any choice of encoding is fine functionally as long as all states are unique Appendix C-26 example: RegWrite signal

State Assignment, RegWrite Signal PCWrite PCSource = 10 ALUSrcA = 1 ALUSrcB = 00 ALUOp = 01 PCWriteCond PCSource = 01 ALUSrcA = 1 ALUSrcB = 00 ALUOp = 10 MemRead IorD = 1 MemWrite IorD = 1 RegDst = 1 RegWrite MemtoReg = 0 RegDst = 0 RegWrite MemtoReg = 1 Start MemRead ALUSrcA=0 IorD = 0 IRWrite ALUSrcB = 01 ALUOp = 00 PCWrite PCSrc = 00 ALUSrcA = 0 ALUSrcB = 11 ALUOp = 00 ALUSrcA = 1 ALUSrcB = 10 ALUOp = 00 LW | SW RRR BEQ J LW SW IFID EX MEM WB State 0 State 1 State 2 State 6 State 8State 9 State 3 State 5 State 7 (0111b) State 4 (0100b) Original: 2 inverters, 2 and3s, 1 or2 State 8 (1000b) State 4 State 7 State 9 (1001b) New: No gates--just bit 3!

Multi-cycle Example (lw) PCWrite PCSource = 10 ALUSrcA = 1 ALUSrcB = 00 ALUOp = 01 PCWriteCond PCSource = 01 ALUSrcA = 1 ALUSrcB = 00 ALUOp = 10 MemRead IorD = 1 MemWrite IorD = 1 RegDst = 1 RegWrite MemtoReg = 0 RegDst = 0 RegWrite MemtoReg = 1 Start MemRead ALUSrcA=0 IorD = 0 IRWrite ALUSrcB = 01 ALUOp = 00 PCWrite PCSrc = 00 ALUSrcA = 0 ALUSrcB = 11 ALUOp = 00 ALUSrcA = 1 ALUSrcB = 10 ALUOp = 00 LW | SW RRR BEQ J LW SW IFID EX MEM WB

Multi-cycle Example (lw)

Multi-cycle Example (sw) PCWrite PCSource = 10 ALUSrcA = 1 ALUSrcB = 00 ALUOp = 01 PCWriteCond PCSource = 01 ALUSrcA = 1 ALUSrcB = 00 ALUOp = 10 MemRead IorD = 1 MemWrite IorD = 1 RegDst = 1 RegWrite MemtoReg = 0 RegDst = 0 RegWrite MemtoReg = 1 Start MemRead ALUSrcA=0 IorD = 0 IRWrite ALUSrcB = 01 ALUOp = 00 PCWrite PCSrc = 00 ALUSrcA = 0 ALUSrcB = 11 ALUOp = 00 ALUSrcA = 1 ALUSrcB = 10 ALUOp = 00 LW | SW RRR BEQ J LW SW IFID EX MEM WB

Multi-cycle Example (sw)

Multi-cycle Example (beq T) PCWrite PCSource = 10 ALUSrcA = 1 ALUSrcB = 00 ALUOp = 01 PCWriteCond PCSource = 01 ALUSrcA = 1 ALUSrcB = 00 ALUOp = 10 MemRead IorD = 1 MemWrite IorD = 1 RegDst = 1 RegWrite MemtoReg = 0 RegDst = 0 RegWrite MemtoReg = 1 Start MemRead ALUSrcA=0 IorD = 0 IRWrite ALUSrcB = 01 ALUOp = 00 PCWrite PCSrc = 00 ALUSrcA = 0 ALUSrcB = 11 ALUOp = 00 ALUSrcA = 1 ALUSrcB = 10 ALUOp = 00 LW | SW RRR BEQ J LW SW IFID EX MEM WB

Multi-cycle Example (beq T)

Multi-cycle Example (beq NT)

Multi-cycle Example (j) PCWrite PCSource = 10 ALUSrcA = 1 ALUSrcB = 00 ALUOp = 01 PCWriteCond PCSource = 01 ALUSrcA = 1 ALUSrcB = 00 ALUOp = 10 MemRead IorD = 1 MemWrite IorD = 1 RegDst = 1 RegWrite MemtoReg = 0 RegDst = 0 RegWrite MemtoReg = 1 Start MemRead ALUSrcA=0 IorD = 0 IRWrite ALUSrcB = 01 ALUOp = 00 PCWrite PCSrc = 00 ALUSrcA = 0 ALUSrcB = 11 ALUOp = 00 ALUSrcA = 1 ALUSrcB = 10 ALUOp = 00 LW | SW RRR BEQ J LW SW IFID EX MEM WB

Multi-cycle Example (j)

Summary Processor implementation Datapath Control Single cycle implementation Next: microprogramming