1 CS/COE0447 Computer Organization & Assembly Language Multi-Cycle Execution.

Presentation transcript:

1 CS/COE0447 Computer Organization & Assembly Language Multi-Cycle Execution

2 A Multi-cycle Datapath A single memory unit for both instructions and data Single ALU rather than ALU & two adders Registers added after every major functional unit to hold the output until it is used in a subsequent clock cycle

3 Multi-Cycle Control What we need to cover Adding registers after every functional unit –Need to modify the “instruction execution” slides to reflect this Breaking instruction execution down into cycles –What can be done during the same cycle? What requires a cycle? –Need to modify the “instruction execution” slides again –Timing Control signal values –What they are per cycle, per instruction –Finite state machine which determines signals based on instruction type + which cycle it is Putting it all together

4 Execution: single-cycle (reminder)
add (example: add $t2,$t1,$t0)
–Fetch instruction and add 4 to PC
–Read two source registers: $t1 and $t0
–Add two values: $t1 + $t0
–Store result to the destination register: $t1 + $t0 -> $t2

5 A Multi-cycle Datapath For add: Instruction is stored in the instruction register (IR) Values read from rs and rt are stored in A and B Result of ALU is stored in ALUOut

6 Execution: single-cycle (reminder)
lw (load word) (example: lw $t0,-12($t1))
–Fetch instruction and add 4 to PC
–Read the base register: $t1
–Sign-extend the immediate offset: fff4 -> fffffff4
–Add two values to get address: X = fffffff4 + $t1
–Access data memory with the computed address: M[X]
–Store the memory data to the destination register: $t0
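The sign-extension and address arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration: the helper names `sign_extend16` and `effective_address` and the base value chosen for `$t1` are assumptions, not part of the slides.

```python
def sign_extend16(imm16: int) -> int:
    """Sign-extend a 16-bit field to 32 bits (returned as an unsigned 32-bit value)."""
    imm16 &= 0xFFFF
    return (imm16 | 0xFFFF0000) if (imm16 & 0x8000) else imm16


def effective_address(base: int, imm16: int) -> int:
    """Compute base + sign-extended offset, wrapping to 32 bits like the ALU."""
    return (base + sign_extend16(imm16)) & 0xFFFFFFFF


t1 = 0x10010020  # hypothetical contents of $t1
print(hex(sign_extend16(0xFFF4)))          # 0xfffffff4
print(hex(effective_address(t1, 0xFFF4)))  # 0x10010014, i.e. $t1 - 12
```

Masking the sum to 32 bits is what makes adding fffffff4 behave like subtracting 12.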

7 A Multi-cycle Datapath For lw: lw $t0, -12($t1) Instruction is stored in the IR Contents of rs stored in A $t1 Output of ALU (address of memory location to be read) stored in ALUOut Value read from memory is stored in the memory data register (MDR)

8 Execution: single-cycle (reminder)
sw (example: sw $t0,-4($t1))
–Fetch instruction and add 4 to PC
–Read the base register: $t1
–Read the source register: $t0
–Sign-extend the immediate offset: fffc -> fffffffc
–Add two values to get address: X = fffffffc + $t1
–Store the contents of the source register to the computed address: $t0 -> Memory[X]

9 A Multi-cycle Datapath For sw: sw $t0, -12($t1) Instruction is stored in the IR Contents of rs stored in A $t1 Output of ALU (address of memory location to be written) stored in ALUOut

10 Execution: single-cycle (reminder)
beq (example: beq $t0,$t1,L; assume that L is +4 instructions away)
–Fetch instruction and add 4 to PC
–Read two source registers: $t0, $t1
–Sign-extend the immediate, and shift it left by 2: 0x0003 -> 0x0000000c
–Perform the test, and update the PC if it is true: if $t0 == $t1, then PC = PC + 0x0000000c (PC here is the already-incremented PC + 4)
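As a rough check of the branch arithmetic above, the following Python sketch computes a branch target from the incremented PC and the 16-bit immediate. The function name `branch_target` and the example address 0x00400000 are assumptions made for illustration.

```python
def branch_target(pc_plus_4: int, imm16: int) -> int:
    """Branch target = (PC + 4) + (sign-extended immediate << 2), wrapped to 32 bits."""
    imm16 &= 0xFFFF
    offset = imm16 - 0x10000 if (imm16 & 0x8000) else imm16  # signed word offset
    return (pc_plus_4 + (offset << 2)) & 0xFFFFFFFF


branch_addr = 0x00400000  # hypothetical address of the beq itself
target = branch_target(branch_addr + 4, 0x0003)
print(hex(target))                 # 0x400010
print((target - branch_addr) // 4) # 4 instructions past the beq
```

With an immediate of 3, the target lands 4 instructions after the branch, because the offset is applied to PC + 4 rather than to the address of the branch.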

11 A Multi-cycle Datapath For beq beq $t0,$t1,label Instruction stored in IR Registers rs and rt are stored in A and B Result of ALU (rs – rt) is stored in ALUOut

12 Execution: single-cycle (reminder)
j
–Fetch instruction and add 4 to PC
–Take the 26-bit immediate field
–Shift left by 2 (to make a 28-bit immediate)
–Take the upper 4 bits of the current PC and attach them to the left of the immediate
–Assign the value to PC
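The jump-address construction can be sketched as bit operations in Python. The name `jump_target` and the example instruction word are assumptions chosen for illustration; the word encodes a jump (opcode 2) to the hypothetical address 0x00400020.

```python
def jump_target(pc_plus_4: int, instr: int) -> int:
    """Concatenate the upper 4 bits of PC + 4 with the 26-bit field shifted left by 2."""
    upper4 = pc_plus_4 & 0xF0000000
    field26 = instr & 0x03FFFFFF
    return upper4 | (field26 << 2)


instr = (2 << 26) | (0x00400020 >> 2)  # hypothetical "j 0x00400020"
print(hex(instr))                           # 0x8100008
print(hex(jump_target(0x00400004, instr)))  # 0x400020
```

No addition is involved, which is why the jump does not need the ALU.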

13 A Multi-cycle Datapath For j: the jump itself needs no register values, no data memory access, and no ALU operation (the target is formed by concatenation)

14 Multi-Cycle Control What we need to cover Adding registers after every functional unit –Need to modify the “instruction execution” slides to reflect this Breaking instruction execution down into cycles (<- current topic) –What can be done during the same cycle? What requires a cycle? –Need to modify the “instruction execution” slides again –Timing Control signal values –What they are per cycle, per instruction –Finite state machine which determines signals based on instruction type + which cycle it is Putting it all together

15 Multicycle Approach
Break up the instructions into steps
–each step takes one clock cycle
–balance the amount of work to be done in each step/cycle so that they are about equal
–restrict each cycle to use each major functional unit at most once, so that such units do not have to be replicated
–functional units can be shared between different cycles within one instruction

16 Operations These take time: memory (read/write); register file (read/write); ALU operations. The other connections and logical elements have no latency (for our purposes)

17 Five Execution Steps Each takes one cycle In one cycle, there can be at most one memory access, at most one register access, and at most one ALU operation But, you can have a memory access, an ALU op, and/or a register access, as long as there is no contention for resources Changes to registers are made at the end of the clock cycle –PC, ALUOut, A, B, etc. save information for the next clock cycle
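The point that register changes take effect at the end of the clock cycle can be illustrated with a small Python sketch. All right-hand sides are evaluated from the current state first, and only then committed together. The names `clock_edge`, `state`, and `memory`, and the example contents, are assumptions for illustration.

```python
def clock_edge(state: dict, next_values: dict) -> dict:
    """Commit all pending register writes at once, as at the end of a clock cycle."""
    new_state = dict(state)
    new_state.update(next_values)
    return new_state


memory = {0x00400000: 0x8D280004}      # hypothetical instruction word at PC
state = {"PC": 0x00400000, "IR": 0}

# Instruction fetch: both assignments read the OLD value of PC.
next_values = {
    "IR": memory[state["PC"]],  # IR <= Memory[PC]
    "PC": state["PC"] + 4,      # PC <= PC + 4
}
state = clock_edge(state, next_values)
print(hex(state["IR"]), hex(state["PC"]))  # 0x8d280004 0x400004
```

Because the writes are committed together, the order in which the two assignments are listed does not matter, which mirrors the `<=` notation used on the following slides.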

18 Step 1: Instruction Fetch Access memory w/ PC to fetch instruction and store it in Instruction Register (IR) Increment PC by 4 –We can do this because the ALU is not being used for something else this cycle

19 Step 2: Decode and Reg. Read Read registers rs and rt –We read both of them regardless of necessity Compute the branch address in case the instruction is a branch –We can do this because the ALU is not busy –ALUOut will keep the target address

20 Step 3: Various Actions One of four actions is taken, based on instruction type (later – cycles per type of instruction; easier to understand)
Memory reference: –ALUOut <= A + sign-extend(IR[15:0]);
R-type: –ALUOut <= A op B;
Branch: –if (A==B) PC <= ALUOut;
Jump: –PC <= {PC[31:28],IR[25:0],2'b00};

21 Step 4: Memory Access… If the instruction is memory reference –MDR <= Memory[ALUOut];// if it is a load –Memory[ALUOut] <= B;// if it is a store Store is complete! If the instruction is R-type –Reg[IR[15:11]] <= ALUOut; Now the instruction is complete!

22 Step 5: Register Write Back Only the lw instruction reaches this step –Reg[IR[20:16]] <= MDR;
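The five steps can be put together in a small behavioral sketch in Python. It is not a model of the hardware: it runs the register transfers of one instruction in order and returns how many cycles that instruction type uses. The function and variable names (`run_instruction`, `cpu`, `mem`) and the memory contents are assumptions for illustration, and only lw, sw, add, sub, beq, and j are handled.

```python
def sext16(x: int) -> int:
    """Sign-extend the low 16 bits of x to a signed Python int."""
    x &= 0xFFFF
    return x - 0x10000 if (x & 0x8000) else x


def run_instruction(cpu: dict, mem: dict) -> int:
    """Execute one instruction step by step; return the number of cycles used."""
    # Step 1: instruction fetch
    cpu["IR"] = mem[cpu["PC"]]
    cpu["PC"] = (cpu["PC"] + 4) & 0xFFFFFFFF
    ir = cpu["IR"]
    op, funct = ir >> 26, ir & 0x3F
    rs, rt, rd = (ir >> 21) & 31, (ir >> 16) & 31, (ir >> 11) & 31
    imm = sext16(ir)
    # Step 2: decode, register read, speculative branch target
    a, b = cpu["reg"][rs], cpu["reg"][rt]
    alu_out = (cpu["PC"] + (imm << 2)) & 0xFFFFFFFF
    # Step 3 onward depends on the instruction type
    if op == 2:                                   # j
        cpu["PC"] = (cpu["PC"] & 0xF0000000) | ((ir & 0x03FFFFFF) << 2)
        return 3
    if op == 4:                                   # beq
        if a == b:
            cpu["PC"] = alu_out
        return 3
    if op == 0:                                   # R-type (add, sub only)
        if funct == 0x20:
            alu_out = (a + b) & 0xFFFFFFFF
        elif funct == 0x22:
            alu_out = (a - b) & 0xFFFFFFFF
        else:
            raise ValueError("funct not handled in this sketch")
        cpu["reg"][rd] = alu_out                  # Step 4: completion
        return 4
    if op in (35, 43):                            # lw, sw
        alu_out = (a + imm) & 0xFFFFFFFF          # Step 3: address
        if op == 43:
            mem[alu_out] = b                      # Step 4: store
            return 4
        mdr = mem[alu_out]                        # Step 4: load
        cpu["reg"][rt] = mdr                      # Step 5: write-back
        return 5
    raise ValueError("opcode not handled in this sketch")


cpu = {"PC": 0x00400000, "IR": 0, "reg": [0] * 32}
cpu["reg"][9] = 0x10010000                        # hypothetical $t1
mem = {0x00400000: 0x8D280004,                    # lw $t0,4($t1)
       0x10010004: 0x12345678}                    # hypothetical data word
cycles = run_instruction(cpu, mem)
print(cycles, hex(cpu["reg"][8]), hex(cpu["PC"]))  # 5 0x12345678 0x400004
```

Only lw reaches the fifth step; the other instruction types return after three or four steps.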

23 Multicycle Execution Step (1): Instruction Fetch IR = Memory[PC]; PC = PC + 4; [Figure labels: 4, PC + 4]

24 Multicycle Execution Step (2): Instruction Decode & Register Fetch A = Reg[IR[25-21]]; (A = Reg[rs]) B = Reg[IR[20-16]]; (B = Reg[rt]) ALUOut = PC + (sign-extend(IR[15-0]) << 2); [Figure labels: Branch Target Address, Reg[rs], Reg[rt], PC + 4]

25 Multicycle Execution Step (3): Memory Reference Instructions ALUOut = A + sign-extend(IR[15-0]); [Figure labels: Mem. Address, Reg[rs], Reg[rt], PC + 4]

26 Multicycle Execution Step (4): Memory Access - Write (sw) Memory[ALUOut] = B; [Figure labels: PC + 4, Reg[rs], Reg[rt]]

27 Multicycle Execution Step (4): Memory Access - Read (lw) MDR = Memory[ALUOut]; [Figure labels: Mem. Data, PC + 4, Reg[rs], Reg[rt], Mem. Address]

28 Multicycle Execution Step (5): Memory Read Completion (lw) Reg[IR[20-16]] = MDR; [Figure labels: PC + 4, Reg[rs], Reg[rt], Mem. Data, Mem. Address]

29 Multicycle Execution Step (3): ALU Instruction (R-Type) ALUOut = A op B; [Figure labels: R-Type Result, Reg[rs], Reg[rt], PC + 4]

30 Multicycle Execution Step (4): ALU Instruction (R-Type) Reg[IR[15:11]] = ALUOut; [Figure labels: R-Type Result, Reg[rs], Reg[rt], PC + 4]

31 Multicycle Execution Step (3): Branch Instructions if (A == B) PC = ALUOut; [Figure labels: Branch Target Address, Reg[rs], Reg[rt]]

32 Multicycle Execution Step (3): Jump Instruction PC = PC[31-28] concat (IR[25-0] << 2); [Figure labels: Jump Address, Reg[rs], Reg[rt], Branch Target Address]

33 For Reference The next 5 slides give the steps, one slide per instruction

34 Multi-Cycle Execution: R-type (example: sub $t0,$t1,$t2)
Instruction fetch
–IR <= Memory[PC];
–PC <= PC + 4;
Decode instruction/register read
–A <= Reg[IR[25:21]]; (rs)
–B <= Reg[IR[20:16]]; (rt)
–ALUOut <= PC + (sign-extend(IR[15:0])<<2);
Execution
–ALUOut <= A op B; (op = add, sub, and, or, …)
Completion
–Reg[IR[15:11]] <= ALUOut; ($t0 <= ALU result)

35 Multi-cycle Execution: lw (example: lw $t0,-12($t1))
Instruction fetch
–IR <= Memory[PC];
–PC <= PC + 4;
Instruction decode/register read
–A <= Reg[IR[25:21]]; (rs)
–B <= Reg[IR[20:16]];
–ALUOut <= PC + (sign-extend(IR[15:0])<<2);
Execution
–ALUOut <= A + sign-extend(IR[15:0]); ($t1 + (-12), sign-extended)
Memory access
–MDR <= Memory[ALUOut]; (M[$t1 + (-12)])
Write-back
–Load: Reg[IR[20:16]] <= MDR; ($t0 <= M[$t1 + (-12)])

36 Multi-cycle Execution: sw (example: sw $t0,-12($t1))
Instruction fetch
–IR <= Memory[PC];
–PC <= PC + 4;
Decode/register read
–A <= Reg[IR[25:21]]; (rs)
–B <= Reg[IR[20:16]]; (rt)
–ALUOut <= PC + (sign-extend(IR[15:0])<<2);
Execution
–ALUOut <= A + sign-extend(IR[15:0]); ($t1 + (-12), sign-extended)
Memory access
–Memory[ALUOut] <= B; (M[$t1 + (-12)] <= $t0)

37 Multi-cycle execution: beq (example: beq $t0,$t1,label)
Instruction fetch
–IR <= Memory[PC];
–PC <= PC + 4;
Decode/register read
–A <= Reg[IR[25:21]]; (rs)
–B <= Reg[IR[20:16]]; (rt)
–ALUOut <= PC + (sign-extend(IR[15:0])<<2);
Execution
–if (A == B) then PC <= ALUOut; (if $t0 == $t1, perform branch)

38 Multi-cycle execution: j (example: j label)
Instruction fetch
–IR <= Memory[PC];
–PC <= PC + 4;
Decode/register read
–A <= Reg[IR[25:21]];
–B <= Reg[IR[20:16]];
–ALUOut <= PC + (sign-extend(IR[15:0])<<2);
Execution
–PC <= {PC[31:28],IR[25:0],2'b00};

39 Multi-Cycle Control What we need to cover Adding registers after every functional unit –Need to modify the “instruction execution” slides to reflect this Breaking instruction execution down into cycles –What can be done during the same cycle? What requires a cycle? –Need to modify the “instruction execution” slides again –Timing Control signal values (<- current topic) –What they are per cycle, per instruction –Finite state machine which determines signals based on instruction type + which cycle it is Putting it all together

40 Datapath w/ Control Signals

41 Final Version w/ Control

42 Example from beginning to end: lw $t0,4($t1) Machine code fields: opcode = IR[31:26]; rs = IR[25:21] ($t1); rt = IR[20:16] ($t0); immediate = IR[15:0]
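The field layout above can be checked by assembling the example instruction by hand. The sketch below packs the four I-type fields into a 32-bit word and unpacks them again; it uses the standard MIPS values (opcode 0x23 for lw, register numbers 8 for $t0 and 9 for $t1), while the helper names `encode_i_type` and `decode_fields` are assumptions.

```python
def encode_i_type(opcode: int, rs: int, rt: int, imm: int) -> int:
    """Pack opcode, rs, rt, and a 16-bit immediate into an I-type instruction word."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def decode_fields(ir: int) -> dict:
    """Extract the I-type fields from the instruction register."""
    return {
        "opcode": (ir >> 26) & 0x3F,   # IR[31:26]
        "rs": (ir >> 21) & 0x1F,       # IR[25:21]
        "rt": (ir >> 16) & 0x1F,       # IR[20:16]
        "immediate": ir & 0xFFFF,      # IR[15:0]
    }


word = encode_i_type(0x23, 9, 8, 4)    # lw $t0,4($t1)
print(f"{word:#010x}")                 # 0x8d280004
print(decode_fields(word))
```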

43 Multi-cycle Execution: lw (example: lw $t0,-12($t1))
Instruction fetch
–IR <= Memory[PC];
–PC <= PC + 4;
Instruction decode/register read
–A <= Reg[IR[25:21]]; (rs)
–B <= Reg[IR[20:16]];
–ALUOut <= PC + (sign-extend(IR[15:0])<<2);
Execution
–ALUOut <= A + sign-extend(IR[15:0]); ($t1 + (-12), sign-extended)
Memory access
–MDR <= Memory[ALUOut]; (M[$t1 + (-12)])
Write-back
–Load: Reg[IR[20:16]] <= MDR; ($t0 <= M[$t1 + (-12)])

44 Example: Load (1)

45 Example: Load (2) [Figure labels: rs, rt]

46 Example: Load (3)

47 Example: Load (4)

48 Example: Load (5)

49 Example: Jump (1)

50 Example: Jump (2)

51 Example: Jump (3)

52 A FSM State Diagram (annotation on the diagram: this one is wrong; RegDst = 0; MemToReg = 1)

53 Multicycle Control Step (1): Fetch IR = Memory[PC]; PC = PC + 4; [Figure: datapath with control signal values for this step]

54 Multicycle Control Step (2): Instruction Decode & Register Fetch A = Reg[IR[25-21]]; (A = Reg[rs]) B = Reg[IR[20-16]]; (B = Reg[rt]) ALUOut = PC + (sign-extend(IR[15-0]) << 2); [Figure: datapath with control signal values for this step]

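55 Multicycle Control Step (3): Memory Reference Instructions ALUOut = A + sign-extend(IR[15-0]); [Figure: datapath with control signal values for this step]

56 Multicycle Control Step (3): ALU Instruction (R-Type) ALUOut = A op B; [Figure: datapath with control signal values for this step; the ALU operation is shown as ???]

57 Multicycle Control Step (3): Branch Instructions if (A == B) PC = ALUOut; [Figure: datapath with control signal values for this step; one signal is annotated "1 if Zero=1"]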

58 Multicycle Execution Step (3): Jump Instruction PC = PC[31-28] concat (IR[25-0] << 2); [Figure: datapath with control signal values for this step]

59 Multicycle Control Step (4): Memory Access - Read (lw) MDR = Memory[ALUOut]; [Figure: datapath with control signal values for this step]

60 Multicycle Execution Steps (4) Memory Access - Write (sw) Memory[ALUOut] = B; [Figure: datapath with control signal values for this step]

61 Multicycle Control Step (4): ALU Instruction (R-Type) Reg[IR[15:11]] = ALUOut; (Reg[Rd] = ALUOut) [Figure: multicycle datapath with control signals IorD, MemRead, MemWrite, IRWrite, RegDst, MemtoReg, RegWrite, ALUSrcA, ALUSrcB, PCSource, PCWr*]

62 Multicycle Execution Steps (5) Memory Read Completion (lw) Reg[IR[20-16]] = MDR; [Figure: multicycle datapath with control signals IorD, MemRead, MemWrite, IRWrite, RegDst, MemtoReg, RegWrite, ALUSrcA, ALUSrcB, PCSource, PCWr*]

63 Multi-Cycle Control What we need to cover Adding registers after every functional unit –Need to modify the “instruction execution” slides to reflect this Breaking instruction execution down into cycles –What can be done during the same cycle? What requires a cycle? –Need to modify the “instruction execution” slides again –Timing: Registers/memory updated at the beginning of the next clock cycle Control signal values –What they are per cycle, per instruction –Finite state machine which determines signals based on instruction type + which cycle it is (<- current topic) Putting it all together

64 For reference

65 A FSM State Diagram (annotation on the diagram: this one is wrong; RegDst = 0; MemToReg = 1)

66 State Diagram, Big Picture

67 Handling Memory Instructions

68 R-type Instruction

69 Branch and Jump

70 FSM Implementation
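The state diagrams themselves did not survive in this transcript, so the following Python sketch reconstructs only the sequencing implied by the per-instruction step lists earlier in the deck. The state names, the `next_state` and `trace` helpers, and the instruction-kind strings are assumptions for illustration; the write-back signal values are the ones given in the correction note on the FSM slide.

```python
# Which state follows decode depends on the instruction type.
NEXT_AFTER_DECODE = {
    "lw": "MEM_ADDR", "sw": "MEM_ADDR",
    "rtype": "EXECUTE", "beq": "BRANCH", "j": "JUMP",
}

# Signals for the lw write-back state, per the slide's correction note.
WRITE_BACK_SIGNALS = {"RegWrite": 1, "RegDst": 0, "MemtoReg": 1}


def next_state(state: str, kind: str) -> str:
    """Combinational next-state logic: current state + instruction type -> next state."""
    if state == "FETCH":
        return "DECODE"
    if state == "DECODE":
        return NEXT_AFTER_DECODE[kind]
    if state == "MEM_ADDR":
        return "MEM_READ" if kind == "lw" else "MEM_WRITE"
    if state == "MEM_READ":
        return "WRITE_BACK"
    if state == "EXECUTE":
        return "RTYPE_COMPLETE"
    return "FETCH"  # every final state returns to fetch


def trace(kind: str) -> list:
    """List the states one instruction passes through, one per clock cycle."""
    states = ["FETCH"]
    while True:
        nxt = next_state(states[-1], kind)
        if nxt == "FETCH":
            return states
        states.append(nxt)


for kind in ("rtype", "lw", "sw", "beq", "j"):
    print(kind, len(trace(kind)), trace(kind))
```

In hardware, `next_state` corresponds to the combinational logic and the current state is held in a state register, which is the split the summary slides describe.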

71 To Summarize… From several building blocks, we constructed a datapath for a subset of the MIPS instruction set First, we analyzed instructions for functional requirements Second, we connected building blocks in a way to accommodate instructions Third, we refined the datapath and added controls

72 To Summarize… We looked at how an instruction is executed on the datapath in a pictorial way We looked at control signals connected to functional blocks in our datapath We analyzed how execution steps of an instruction change the control signals

73 To Summarize… We compared a single-cycle implementation and a multi-cycle implementation of our datapath We analyzed multi-cycle execution of instructions We refined multi-cycle datapath We designed multi-cycle control

74 To Summarize… We looked at the multi-cycle control scheme in detail Multi-cycle control can be implemented using an FSM An FSM is composed of some combinational logic and a memory element

75 Summary
Techniques described in this chapter to design datapaths and control are at the core of all modern computer architecture
Multicycle datapaths offer two great advantages over single-cycle
–functional units can be reused within a single instruction if they are accessed in different cycles, reducing the need to replicate expensive logic
–instructions with shorter execution paths can complete more quickly by consuming fewer cycles
Modern computers, in fact, take the multicycle paradigm to a higher level to achieve greater instruction throughput:
–pipelining (later class), where multiple instructions execute simultaneously by having cycles of different instructions overlap in the datapath
–the MIPS architecture was designed to be pipelined
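The claim that shorter instructions finish in fewer cycles can be made concrete with an average-CPI calculation. The cycle counts below come from the step lists in this deck; the instruction mix is a purely illustrative assumption, not data from the slides.

```python
# Cycles per instruction type, from the multi-cycle step lists.
CYCLES = {"rtype": 4, "lw": 5, "sw": 4, "beq": 3, "j": 3}

# Hypothetical instruction mix (fractions must sum to 1); an assumption for illustration.
MIX = {"rtype": 0.52, "lw": 0.25, "sw": 0.10, "beq": 0.11, "j": 0.02}


def average_cpi(cycles: dict, mix: dict) -> float:
    """Weighted average of cycles per instruction over the given mix."""
    return sum(cycles[kind] * fraction for kind, fraction in mix.items())


cpi = average_cpi(CYCLES, MIX)
print(round(cpi, 2))  # 4.12
```

Under this assumed mix the average is below the 5 cycles that the longest instruction (lw) needs, which is the advantage the summary refers to.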