COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Topics: challenges with the single-cycle MIPS implementation; multicycle MIPS (state elements, adding registers between stages, how to control it); performance.

Review: Processor Performance. Program execution time: Execution Time = (# instructions) × (cycles/instruction) × (seconds/cycle) = IC × CPI × Tc. Definitions: IC = instruction count; CPI = cycles/instruction; Tc = seconds/cycle = clock period; IPC = instructions/cycle = 1/CPI. The challenge is to satisfy constraints of cost, power, and performance.
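The execution-time relation above can be sketched as a small helper. This is a minimal illustration only; the function names and the sample numbers (one million instructions, CPI of 2, 1 ns clock) are hypothetical and not from the slides.

```python
def execution_time(instruction_count: float, cpi: float, clock_period_s: float) -> float:
    """Execution time in seconds: IC x CPI x Tc."""
    return instruction_count * cpi * clock_period_s


def ipc(cpi: float) -> float:
    """Instructions per cycle is the reciprocal of CPI."""
    return 1.0 / cpi


# Hypothetical workload, chosen only to exercise the formula.
t = execution_time(1e6, 2.0, 1e-9)
print(f"{t * 1e3:.1f} ms")   # execution time in milliseconds
print(ipc(2.0))              # instructions per cycle
```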

Single-Cycle Performance (textbook version). Tc is limited by the critical path, and lw is typically the longest instruction.

Single-Cycle Performance (textbook version). Single-cycle critical path: Tc = tpcq_PC + tmem + max(tRFread, tsext + tmux) + tALU + tmem + tmux + tRFsetup. In most implementations, the limiting paths are memory, ALU, and register file, so the max term reduces to tRFread: Tc = tpcq_PC + 2tmem + tRFread + tALU + tmux + tRFsetup.

Single-Cycle Performance Example Tc = tpcq_PC + 2tmem + tRFread + tALU + tmux + tRFsetup = [30 + 2(250) + 150 + 200 + 25 + 20] ps = 925 ps What’s the max clock frequency?
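One way to check the arithmetic above, and to answer the clock-frequency question, is the short sketch below. The delay values are the ones on the slide; the dictionary keys and function names are just illustrative choices.

```python
# Element delays in picoseconds, taken from the example above.
DELAYS_PS = {
    "tpcq_PC": 30,
    "tmem": 250,
    "tRFread": 150,
    "tALU": 200,
    "tmux": 25,
    "tRFsetup": 20,
}


def single_cycle_tc_ps(d: dict) -> int:
    """Simplified single-cycle critical path (the lw path)."""
    return (d["tpcq_PC"] + 2 * d["tmem"] + d["tRFread"]
            + d["tALU"] + d["tmux"] + d["tRFsetup"])


def max_frequency_ghz(tc_ps: float) -> float:
    """A period of 1000 ps corresponds to 1 GHz."""
    return 1000.0 / tc_ps


tc = single_cycle_tc_ps(DELAYS_PS)
print(tc, "ps")                                   # 925 ps
print(round(max_frequency_ghz(tc), 3), "GHz")     # about 1.081 GHz
```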

Single-Cycle Performance Example. For a program with 100 billion instructions executing on a single-cycle MIPS processor, Execution Time = # instructions × CPI × Tc = (100 × 10^9)(1)(925 × 10^-12 s) = 92.5 seconds.

Multicycle MIPS. Key idea: Break instruction execution into multiple clock cycles.

Multicycle MIPS Processor. Single-cycle microarchitecture: + simple; - cycle time limited by longest instruction (lw); - two adders/ALUs and two memories. Multicycle microarchitecture: + higher clock speed; + simpler instructions run faster; + reuse expensive hardware on multiple cycles; - sequencing overhead. Same design steps: datapath & control.

Multicycle State Elements. Replace the Instruction and Data memories with a single unified memory. This is more realistic (buy one big RAM!). It was not possible in the single-cycle implementation, where both instruction and data accesses were needed within the same clock cycle. Now: use the same memory twice if needed, since instruction fetch and data access are in distinct clock cycles.

Multicycle Datapath: lw instr fetch. First consider executing lw. STEP 1: Fetch the instruction. Introduce an Instruction Register to buffer this instruction: a “non-architectural register”, not accessible to the programmer.

Multicycle Datapath: lw register read. Read register $rs. Insert another non-architectural register, A, which buffers the value of $rs read from the register file.

Multicycle Datapath: lw immediate. The immediate field is sign-extended. For consistency, we could insert another non-architectural register to buffer SignImm; it is skipped in this version because SignImm is a simple combinational function of Instr, which is already being held in the Instruction Register.
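The point that SignImm is a pure combinational function of Instr can be sketched in a few lines. This is an illustrative software model, not the hardware; the function names are hypothetical, and the sample word is the standard MIPS encoding of lw $t0, -4($sp).

```python
def sign_extend_16(imm16: int) -> int:
    """Interpret the low 16 bits as a two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def sign_imm(instr: int) -> int:
    """SignImm depends only on Instr[15:0], so no extra register is needed
    as long as the Instruction Register keeps holding Instr."""
    return sign_extend_16(instr & 0xFFFF)


# lw $t0, -4($sp): opcode 0x23, rs = 29, rt = 8, immediate = 0xFFFC
print(sign_imm(0x8FA8FFFC))   # -4
```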

Multicycle Datapath: lw address. The ALU computes the memory address. Insert another register to buffer ALUOut.

Multicycle Datapath: lw memory read. The same memory is now read for the data access. Insert a multiplexer in front of the memory’s address input to choose either PC or ALUOut as the address (i.e., either instruction fetch or data access), controlled by a new control signal, IorD.

Multicycle Datapath: lw write register. Data from memory is written into the register file.

Multicycle Datapath: increment PC. PC is incremented by re-using the ALU to compute PC + 4. In single-cycle, we had to introduce a dedicated +4 adder; in multi-cycle, the same ALU is used twice, in distinct cycles! We now use the main ALU when it is not busy (instead of a dedicated adder).

Multicycle Datapath: sw. Compared to lw: address computation is identical to lw; the data in $rt is written to memory; MemWrite will be 1 during the appropriate clock cycle; $rt is buffered using non-architectural register B.

Multicycle Datapath: R-type Instrs. Read from $rs and $rt: multiplexers in front of the ALU choose $rs and $rt as operands. Write ALUResult to the register file: write to $rd (instead of $rt), using multiplexers in front of the register file's write address/data inputs.

Multicycle Datapath: beq. 2 tasks: determine whether the values in rs and rt are equal; calculate the branch target address: BTA = (sign-extended immediate << 2) + (PC+4). ALU reused!
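The branch target formula above can be checked with a small software model. This is only a sketch: the function name and the sample PC value are hypothetical, and the result is masked to 32 bits to mimic a 32-bit datapath.

```python
def branch_target(pc: int, imm16: int) -> int:
    """BTA = (sign-extended immediate << 2) + (PC + 4), wrapped to 32 bits."""
    imm16 &= 0xFFFF
    offset = imm16 - 0x10000 if imm16 & 0x8000 else imm16  # sign-extend
    return (pc + 4 + (offset << 2)) & 0xFFFFFFFF


# Hypothetical branch at PC = 0x00400010.
print(hex(branch_target(0x00400010, 0x0003)))   # forward by 3 instructions
print(hex(branch_target(0x00400010, 0xFFFD)))   # backward by 3 instructions
```

The immediate counts instructions relative to PC + 4, which is why it is shifted left by 2 before the add.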

Complete Multicycle Processor Caveat: Same differences in functionality w.r.t. our lab version as single-cycle MIPS

Control Unit

Main Controller FSM: Fetch

Main Controller FSM: Fetch. Fetch the instruction. Also increment PC (because the ALU is not in use). Note: signals are only shown when needed, and enables only when asserted.

Main Controller FSM: Decode. No signals are needed for decode. Register values are also fetched, though perhaps they will not be used.

Main Controller FSM: Address Calculation. Now change states depending on the instruction.

Main Controller FSM: Address Calculation. For lw or sw, we need to compute the address.

Main Controller FSM: lw. For lw, we now need to read from memory, then write to the register.

Main Controller FSM: sw. sw just writes to memory, so it is one step shorter.

Main Controller FSM: R-Type. The R-type instructions have two steps: compute the result in the ALU, and write it to a register.

Main Controller FSM: beq. beq needs to use the ALU twice, so it would consume two cycles: one to compute the address, another to decide on equality. We can take advantage of decode, when the ALU is not used, to compute the BTA (no harm if the BTA is not used).

Complete Multicycle Controller FSM

Main Controller FSM: addi. Similar to R-type: add, then write back.

Main Controller FSM: addi

Extended Functionality: j

Control FSM: j

Control FSM: j
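The controller described in the FSM slides above can be sketched as a next-state function. This is a rough software model under stated assumptions: the state names (MemAdr, MemRead, and so on) are illustrative labels, not necessarily those on the slide diagrams, and the control signal outputs of each state are omitted.

```python
# State entered after Decode, per instruction class (names are assumptions).
AFTER_DECODE = {
    "lw": "MemAdr", "sw": "MemAdr", "rtype": "Execute",
    "beq": "Branch", "addi": "AddiExecute", "j": "Jump",
}

# States with exactly one successor that is not Fetch.
FOLLOW = {
    "MemRead": "MemWriteback",
    "Execute": "AluWriteback",
    "AddiExecute": "AddiWriteback",
}


def next_state(state: str, op: str) -> str:
    if state == "Fetch":
        return "Decode"
    if state == "Decode":
        return AFTER_DECODE[op]
    if state == "MemAdr":                 # lw and sw share address calculation
        return "MemRead" if op == "lw" else "MemWrite"
    return FOLLOW.get(state, "Fetch")     # every other state ends the instruction


def states_for(op: str) -> list:
    """States visited by one instruction; one state per clock cycle."""
    path = ["Fetch"]
    while (nxt := next_state(path[-1], op)) != "Fetch":
        path.append(nxt)
    return path


for op in AFTER_DECODE:
    print(op, len(states_for(op)), states_for(op))
```

Counting the states on each path gives the number of clock cycles that instruction occupies.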

Multicycle Performance. Instructions take different numbers of cycles: 3 cycles: beq, j; 4 cycles: R-type, sw, addi; 5 cycles: lw. CPI is a weighted average. SPECINT2000 benchmark: 25% loads, 10% stores, 11% branches, 2% jumps, 52% R-type. Average CPI = (0.11 + 0.02)(3) + (0.52 + 0.10)(4) + (0.25)(5) = 4.12.
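The weighted average can be reproduced with a few lines. The mix and cycle counts are the ones quoted above; the dictionary layout and function name are just one possible way to write it.

```python
# Instruction mix (fractions) and cycles per instruction class, from the slide.
MIX = {"lw": 0.25, "sw": 0.10, "beq": 0.11, "j": 0.02, "rtype": 0.52}
CYCLES = {"lw": 5, "sw": 4, "beq": 3, "j": 3, "rtype": 4}


def average_cpi(mix: dict, cycles: dict) -> float:
    """CPI as the mix-weighted average of per-class cycle counts."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("instruction mix must sum to 1")
    return sum(frac * cycles[op] for op, frac in mix.items())


print(round(average_cpi(MIX, CYCLES), 2))   # 4.12
```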

Multicycle Performance. Multicycle critical path: Tc = tpcq + tmux + max(tALU + tmux, tmem) + tsetup.

Multicycle Performance Example Tc = tpcq_PC + tmux + max(tALU + tmux, tmem) + tsetup = tpcq_PC + tmux + tmem + tsetup = [30 + 25 + 250 + 20] ps = 325 ps

Multicycle Performance Example. For a program with 100 billion instructions executing on a multicycle MIPS processor: CPI = 4.12, Tc = 325 ps. Execution Time = (# instructions) × CPI × Tc = (100 × 10^9)(4.12)(325 × 10^-12 s) = 133.9 seconds. This is slower than the single-cycle processor (92.5 seconds). Why? Not all steps are the same length, and each step pays sequencing overhead (tpcq + tsetup = 50 ps).
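The comparison above can be reproduced numerically. This sketch uses the delays and CPI values quoted in the examples; the function and variable names are illustrative only.

```python
def multicycle_tc_ps(tpcq: int, tmux: int, talu: int, tmem: int, tsetup: int) -> int:
    """Multicycle critical path: the slower of the ALU path and the memory path."""
    return tpcq + tmux + max(talu + tmux, tmem) + tsetup


INSTRUCTIONS = 100e9

tc_multi_ps = multicycle_tc_ps(tpcq=30, tmux=25, talu=200, tmem=250, tsetup=20)
t_multi = INSTRUCTIONS * 4.12 * tc_multi_ps * 1e-12   # seconds
t_single = INSTRUCTIONS * 1 * 925e-12                 # seconds

print(tc_multi_ps, "ps")
print(round(t_multi, 1), "s multicycle vs", round(t_single, 1), "s single-cycle")
print("slowdown:", round(t_multi / t_single, 2))
```

With these numbers the clock period shrinks by less than a factor of 3 while CPI grows by more than a factor of 4, which is why the multicycle design comes out slower here.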

Review: Single-Cycle MIPS Processor

Review: Multicycle MIPS Processor

Next Time. Next topic: we’ll look at pipelined MIPS, improving throughput (and adding complexity!) by trying to use all of the hardware every cycle.