Single-Cycle vs. Multi-Cycle Timing: LOAD K1 (K2), ADD K1 K2, ORI 0x1F

[Figure: timeline comparing single-cycle and multi-cycle execution of the sequence LOAD K1 (K2); ADD K1 K2; ORI 0x1F.]

http://www.marthastewart.com/337010/chocolate-cupcakes

http://www.marthastewart.com/318727/swiss-meringue-buttercream-for-cupcakes

[Figure: timeline for one batch of cupcakes. Steps in order: make dough, bake dough, blue frosting, white frosting, black frosting.]

[Figure: timeline with one batch going through dough, bake, blue, white, black.]

[Figure: timeline with two batches, each going through dough, bake, blue, white, black.]

[Figure: timeline with three batches, each going through dough, bake, blue, white, black.]
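The cupcake timelines can be turned into a quick count. The sketch below is only an illustration: it assumes every step takes one time unit and, for the overlapped case, that a new batch starts one step behind the previous one. The function names are made up for this example.

```python
STEPS = ["dough", "bake", "blue", "white", "black"]

def sequential_time(batches, steps=len(STEPS)):
    """Time units when each batch finishes before the next one starts."""
    return batches * steps

def overlapped_time(batches, steps=len(STEPS)):
    """Time units when a new batch starts one step behind the previous one."""
    if batches == 0:
        return 0
    # The first batch needs every step; each later batch adds one time unit.
    return steps + (batches - 1)

for n in (1, 2, 3):
    print(n, sequential_time(n), overlapped_time(n))
```

With three batches, the back-to-back schedule needs 15 time units and the overlapped one needs 7, under the assumptions above.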

CYCLE  ADD, SUB, NAND
1      [IR] = Mem[ [PC] ]; [PC] = [PC] + 1
2      [R1] = RF[ [IR7..6] ]; [R2] = RF[ [IR5..4] ]
3      [ALUout] = [R1] op [R2]; update Z and N
4      RF[ [IR7..6] ] = [ALUout]

CYCLE      ADD, SUB, NAND
1 FETCH    [IR] = Mem[ [PC] ]; [PC] = [PC] + 1
2 DECODE   Decode
3 RF       [R1] = RF[ [IR7..6] ]; [R2] = RF[ [IR5..4] ]
4 EXEC     [ALUout] = [R1] op [R2]; update Z and N
5 WB       RF[ [IR7..6] ] = [ALUout]
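The five steps can be walked through in software. This is a minimal sketch, not the course's reference model. It assumes an 8-bit datapath, a four-entry register file, N taken from bit 7 of the result, and Z set when the result is zero. The slides do not give the opcode encoding, so the operation is passed in by name, and `run_instruction` and `alu` are hypothetical helpers.

```python
MASK = 0xFF  # assumption: 8-bit datapath

def alu(op, a, b):
    """Compute [R1] op [R2] for the three ALU instructions, wrapped to 8 bits."""
    if op == "ADD":
        result = a + b
    elif op == "SUB":
        result = a - b
    elif op == "NAND":
        result = ~(a & b)
    else:
        raise ValueError(f"unsupported op: {op}")
    return result & MASK

def run_instruction(op, mem, state):
    """Apply the five steps to one ADD/SUB/NAND instruction."""
    # 1 FETCH: [IR] = Mem[[PC]]; [PC] = [PC] + 1
    ir = mem[state["PC"]]
    state["PC"] = (state["PC"] + 1) & MASK
    # 2 DECODE: only the register fields are extracted here (opcode bits unknown)
    ra = (ir >> 6) & 0b11  # IR7..6
    rb = (ir >> 4) & 0b11  # IR5..4
    # 3 RF: [R1] = RF[[IR7..6]]; [R2] = RF[[IR5..4]]
    r1 = state["RF"][ra]
    r2 = state["RF"][rb]
    # 4 EXEC: [ALUout] = [R1] op [R2]; update Z and N (flag meaning assumed)
    alu_out = alu(op, r1, r2)
    state["Z"] = int(alu_out == 0)
    state["N"] = alu_out >> 7
    # 5 WB: RF[[IR7..6]] = [ALUout]
    state["RF"][ra] = alu_out
    return state

state = {"PC": 0, "RF": [0, 5, 7, 0], "Z": 0, "N": 0}
mem = {0: 0b01100000}  # IR7..6 = 1, IR5..4 = 2; low bits left as zero
run_instruction("ADD", mem, state)
print(state)
```

In the example, register 1 ends up holding 5 + 7 = 12 and the PC advances to 1.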

TIME  C1     C2     C3     C4     C5     C6     C7     C8     C9     C10
      fetch  decode rf     exec   wb
             fetch  decode rf     exec   wb
                    fetch  decode rf     exec   wb
                           fetch  decode rf     exec   wb
                                  fetch  decode rf     exec   wb
                                         fetch  decode rf     exec   wb
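A chart like this can be generated mechanically. The sketch below assumes an ideal pipeline with no stalls, where instruction i enters fetch one cycle after instruction i - 1; `pipeline_chart` is a name invented for this example.

```python
STAGES = ["fetch", "decode", "rf", "exec", "wb"]

def pipeline_chart(n_instructions, n_cycles):
    """Return one row per instruction; row[c] is the stage occupied in cycle c + 1."""
    chart = []
    for i in range(n_instructions):
        row = [""] * n_cycles
        for s, stage in enumerate(STAGES):
            cycle = i + s  # instruction i starts one cycle after instruction i - 1
            if cycle < n_cycles:
                row[cycle] = stage
        chart.append(row)
    return chart

chart = pipeline_chart(6, 10)
print("".join(f"C{c + 1:<6}" for c in range(10)))
for row in chart:
    print("".join(f"{cell:<7}" for cell in row))
```

Six instructions finish in 5 + (6 - 1) = 10 cycles, which is why the chart stops at C10.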

[Figure: the clock (CLK) and the five stages FETCH, Decode, RF, EXEC, WB.]

[Figure: the same stages driven by CLK, with Control blocks attached.]

[Pipeline chart, C1 to C9, stages fetch, decode, rf, exec, wb.]
Program order:
0x100  ADD K0 K0
0x104  ADD K1 K1
0x108  ADD K2 K2
0x10C  ADD K3 K3
0x120  ADD K0 K0

Stage 1: FETCH: IR = Mem[PC], PC = PC + 1
[Datapath figure, with ADD K0 K0 in flight. Blocks: PC, Memory (ADDR, Data_in, Data_out), IR, MDR, RF (reg1, reg2, regw, data1, data2, dataw), R1, R2, ALU, ALUout, flags N and Z, SE for Imm4, ZE for Imm5 and Imm3. Control signals: PCwrite, AddrSel, MemRead, MemWrite, IRload, MDRload, R1Sel, RegIn, RFWrite, ALU1, ALU2, ALUop, FlagWrite. Most buses are labeled 8 bits wide.]

Stage 2: Decode
[Datapath figure, with ADD K0 K0 and ADD K1 K1 in flight. The instruction register is split into IR1 and IR2, with load signals IR1ld and IR2ld.]

Stage 3: RF
[Datapath figure, with ADD K0 K0, ADD K1 K1 and ADD K2 K2 in flight. Same datapath with IR1 and IR2.]

Stage 3: RF, got to remember what to do
[Datapath figure, with ADD K0 K0, ADD K1 K1 and ADD K2 K2 in flight. A third instruction register, IR3, is added, with load signal IR3R1R2ld.]

Stage 4: EXEC (oops)
[Datapath figure, with ADD K0 K0, ADD K1 K1, ADD K2 K2 and ADD K3 K3 in flight. Datapath with IR1, IR2 and IR3.]

Stage 4: EXEC
[Datapath figure, with ADD K0 K0, ADD K1 K1, ADD K2 K2 and ADD K3 K3 in flight. A PCSel control signal is added.]

STAGE 5: WB
[Datapath figure, with ADD K0 K0, ADD K1 K1, ADD K2 K2, ADD K3 K3 and a second ADD K0 K0 in flight. A fourth instruction register, IR4, is added, with load signal IR4ld and a field labeled IR4.6-7. Annotation on the figure: "No connection here".]

DATA HAZARD
I1: ADD K2 K1
I2: ADD K0 K0
I3: ADD K3 K2
TIME: C1 to C7
I1, C1 to C5: Fetch I1 | D I1 | Read K1, K2 | K1 + K2 | Write K2
I2, C2 to C6: Fetch I2 | decode | rf | exec | wb
I3, C3 to C7: Fetch I3 | D I3 | Read K3, K2 | K3 + K2 | Write K3
In C5, I3 is reading K2 while I1 is writing it. The read values are latched.

Try to think of the simplest solution first
I1: ADD K2 K1
I2: ADD K0 K0
I3: ADD K3 K2
TIME: C1 to C7
I1, C1 to C5: Fetch I1 | D I1 | Read K1, K2 | K1 + K2 | Write K2
I2, C2 to C6: Fetch I2 | decode | rf | exec | wb
I3, from C3: Fetch I3 | D I3 | Read K3, K2 | Read K3, K2 (bubble) | K3 + K2 | Write K3

I1: ADD K2 K1
I2: ADD K0 K0
I3: ADD K3 K2
TIME: C1 to C7
I1, C1 to C5: Fetch I1 | D I1 | Read K1, K2 | K1 + K2 | Write K2
I2, C2 to C6: Fetch I2 | decode | rf | exec | wb
I3, C3 to C7: Fetch I3 | D I3 | Read K3, K2 | K3 + K2 | Write K3
Value is available. In C5, I3 is reading while I1 is writing, and the read values are latched.
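The overlap between a write and a later read can be checked mechanically. This is a rough sketch under stated assumptions: instruction i is fetched in cycle i + 1 with no stalls, the rf stage is two cycles after fetch and wb four cycles after fetch, and a read in the same cycle as the write still sees the old value. Each instruction is a hypothetical `(dest, src)` pair in which `dest` is both read and written.

```python
RF_OFFSET = 2  # rf is two cycles after fetch
WB_OFFSET = 4  # wb is four cycles after fetch

def find_data_hazards(program):
    """Return (producer index, consumer index, register) for each read-after-write overlap."""
    hazards = []
    for i, (dest_i, _) in enumerate(program):
        wb_cycle = (i + 1) + WB_OFFSET
        for j in range(i + 1, len(program)):
            rf_cycle = (j + 1) + RF_OFFSET
            if rf_cycle > wb_cycle:
                break  # later instructions read after the write has completed
            if dest_i in program[j]:
                hazards.append((i, j, dest_i))
    return hazards

# I1: ADD K2 K1, I2: ADD K0 K0, I3: ADD K3 K2
program = [("K2", "K1"), ("K0", "K0"), ("K3", "K2")]
print(find_data_hazards(program))
```

For the three-instruction example, only the pair I1 and I3 is flagged, on register K2.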

[Datapath figure: RF, ALU and Memory sections of the pipelined datapath, with IR1 to IR4 (load signals IR1ld, IR2ld, IR3R1R2ld, IR4ld), PCSel and the field IR4.6-7.]

I1: ADD K2 K1
I2: ADD K3 K2
TIME: C1 to C7
I1, C1 to C5: Fetch I1 | D I1 | Read K1, K2 | K1 + K2 | Write K2
I2, C2 to C6: Fetch I2 | D I2 | Read K3, K2 | K3 + K2 | Write K3
I2 reads K2 in C4, one cycle before I1 writes it in C5.

[Datapath figure: the same RF, ALU and Memory sections, with multiplexer inputs now labeled with three-bit codes (000, 001, 010, 011, 111).]

SHIFT: we are using the wrong immediate
[Datapath figure: same datapath, with Imm4 (SE), Imm5 (ZE) and Imm3 (ZE) as immediate sources.]

SHIFT: we are using the wrong immediate
[Datapath figure: registers R1B and R2B are added, and an input is labeled "From IR3".]

STORE: STRUCTURAL HAZARD
TIME  C1     C2     C3     C4     C5     C6     C7     C8     C9     C10
      fetch  decode rf     STORE  wb
             fetch  decode rf     exec   wb
                    fetch  decode rf     exec   wb
                           fetch  decode rf     exec   wb
                                  fetch  decode rf     exec   wb

TIME  C1     C2     C3     C4     C5     C6     C7     C8     C9     C10
      fetch  decode rf     STORE  wb
             fetch  decode rf     exec   wb
                    fetch  decode rf     exec   wb
                           BUBBLE
                                  fetch  decode rf     exec   wb
                                         fetch  decode rf     exec   wb
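The bubble placement can be derived from a simple rule. The sketch below assumes a single memory port shared by instruction fetch and data access, and that LOAD and STORE use the memory in their exec stage, three cycles after their own fetch. `fetch_cycles` is a hypothetical helper and only models this one conflict.

```python
EXEC_OFFSET = 3  # exec is three cycles after fetch

def fetch_cycles(ops):
    """Return the cycle in which each instruction is fetched, inserting bubbles
    whenever the single memory port is busy with a LOAD or STORE."""
    busy = set()   # cycles in which the memory is used for data
    cycles = []
    cycle = 1
    for op in ops:
        while cycle in busy:
            cycle += 1  # bubble: the fetch waits one cycle
        cycles.append(cycle)
        if op in ("LOAD", "STORE"):
            busy.add(cycle + EXEC_OFFSET)
        cycle += 1
    return cycles

print(fetch_cycles(["STORE", "ADD", "ADD", "ADD", "ADD"]))
```

For a STORE followed by four ADDs, the fourth instruction's fetch slips from C4 to C5, matching the BUBBLE in the chart.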

[Datapath figure: the instruction memory (IM, with IMRead) is now separate from the data Memory (ADDR, Data_in, Data_out, MemRead, MemWrite), which feeds MDR (MDRload).]

LOAD
[Datapath figure: datapath with separate IM and data Memory, and MDR on the Memory output.]

Do we need the extra MDR?
[Datapath figure: datapath with separate IM and data Memory, and MDR on the Memory output.]

LOAD revisited
[Datapath figure: the ALUout and MDR registers no longer appear, and a WBin register is shown instead.]

BRANCHES: Calculate the target
Just route the PC to the ALU? The ALU is ours in the fourth cycle.
[Datapath figure: same datapath as "LOAD revisited".]

BRANCHES: Calculate the target: we have to use the right PC
[Datapath figure: registers PC1, PC2 and PC3 are added, with load signals S1Ld, S2Ld and S3Ld.]

How about ORI? Can it write to K1?
[Datapath figure: datapath with PC1, PC2, PC3 and WBin.]

How about ORI? Can it write to K1?
[Datapath figure: an RwSel control signal is added.]

TIME    C1     C2     C3     C4     C5     C6     C7     C8     C9     C10
BRANCH  fetch  decode rf     exec   wb
               fetch  decode rf     exec   wb
                      fetch  decode rf     exec   wb
                             fetch  decode rf     exec   wb
                                    fetch  decode rf     exec   wb
                                           fetch  decode rf     exec   wb
Fetch what?

Simplest solution first
TIME    C1     C2     C3     C4     C5     C6     C7     C8     C9     C10
BRANCH  fetch  decode rf     exec   wb
               bubble bubble bubble
                                    fetch  decode rf     exec   wb
                                           fetch  decode rf     exec   wb
                                                  fetch  decode rf     exec
                                                         fetch  decode rf
                                                                fetch  decode

Branch resolved
[Pipeline chart, C1 to C10: the BRANCH row (fetch, decode, rf, exec, wb) followed by one more instruction row (fetch, decode, rf, exec, wb).]

Speculate what might be the next instruction
[Pipeline chart, C1 to C10: the branch row (decode, rf, exec, wb); an instruction marked "squashed"; a row with fetch, decode, rf, bubble, exec, wb; and a marker "Redirected fetch".]
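The cost of the two policies can be compared with a back-of-the-envelope count. This sketch assumes a branch is resolved in its exec stage, so three fetch slots follow it before the outcome is known, as in the three-bubble chart. It also assumes that speculation means fetching the fall-through path, that those three slots are wasted only when the branch is taken, and that no other stalls occur. `total_cycles` is a made-up helper.

```python
PIPELINE_DEPTH = 5
BRANCH_PENALTY = 3  # fetch slots between a branch's fetch and its resolution in exec

def total_cycles(n_instructions, n_branches, n_taken, speculate):
    """Cycles to finish n_instructions useful instructions.

    speculate=False: stall after every branch.
    speculate=True: fetch the fall-through path and squash it if the branch is taken.
    """
    if n_instructions == 0:
        return 0
    base = PIPELINE_DEPTH + (n_instructions - 1)
    wasted = n_taken if speculate else n_branches
    return base + BRANCH_PENALTY * wasted

print(total_cycles(6, 1, 1, speculate=False))
print(total_cycles(10, 2, 1, speculate=False), total_cycles(10, 2, 1, speculate=True))
```

The first call corresponds to the stalled chart: one branch plus five following instructions, with the last write-back landing in cycle 13.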

CYCLE        Instruction class: ADD, SUB, NAND | SHIFT | ORI | LOAD | STORE | BPZ | BZ | BNZ
1 FETCH      [IR] = Mem[ [PC] ]; [PC] = [PC] + 1
2 DECODE
3 RF         [R1] = RF[ [IR7..6] ]; [R2] = RF[ [IR5..4] ]
             ORI: [R1] = RF[1]
4 EXECUTE    ADD, SUB, NAND: [WBin] = [R1] op [R2]
             SHIFT: [WBin] = [R1] shift Imm3
             ORI: [WBin] = [R1] OR Imm5
             LOAD: [WBin] = Mem[ [R2] ]
             STORE: Mem[ [R2] ] = [R1]
             BPZ: if (N') PC = PC + SE(Imm4)
             BZ: if (Z) PC = PC + SE(Imm4)
             BNZ: if (Z') PC = PC + SE(Imm4)
5 WRITEBACK  RF[ [IR7..6] ] = [WBin]
             ORI: RF[1] = [WBin]
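The branch rows of the table can be sketched as follows. The assumptions are an 8-bit PC that wraps around, and that the PC in PC + SE(Imm4) is the already-incremented value from FETCH; the slides do not say which of the pipelined PC copies the hardware uses. `sign_extend`, `branch_taken` and `next_pc` are illustrative names.

```python
MASK = 0xFF  # assumption: 8-bit PC

def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value ^ sign_bit) - sign_bit

def branch_taken(op, n, z):
    """Branch conditions from the table: BPZ on N', BZ on Z, BNZ on Z'."""
    if op == "BPZ":
        return not n
    if op == "BZ":
        return bool(z)
    if op == "BNZ":
        return not z
    raise ValueError(f"not a branch: {op}")

def next_pc(op, pc, imm4, n, z):
    """Return the PC after EXECUTE; pc is the value already incremented in FETCH."""
    if branch_taken(op, n, z):
        return (pc + sign_extend(imm4, 4)) & MASK
    return pc

print(hex(next_pc("BZ", 0x10, 0b1110, n=0, z=1)))
```

Here the immediate 0b1110 sign-extends to -2, so a taken BZ at PC 0x10 continues at 0x0E.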

SEQUENTIAL EXECUTION SEMANTICS Time

[Pipeline chart, C1 to C10, six instructions each passing through fetch, decode, rf, exec, wb, annotated with the state they touch: PC, FLAGS, REGISTERS, MEMORY.]

[Pipeline chart, C1 to C10, six instructions each passing through fetch, decode, rf, exec, wb, with two points on the time axis marked A and B.]