1 CS/COE0447 Computer Organization & Assembly Language Chapter 5 Part 3.

Presentation transcript:

1 CS/COE0447 Computer Organization & Assembly Language Chapter 5 Part 3

2 A Multi-cycle Datapath A single memory unit for both instructions and data Single ALU rather than ALU & two adders Registers added after every major functional unit to hold the output until it is used in a subsequent clock cycle

3 Multi-Cycle Control: What we need to cover
Adding registers after every functional unit
–Need to modify the “instruction execution” slides to reflect this
Breaking instruction execution down into cycles
–What can be done during the same cycle? What requires a cycle?
–Need to modify the “instruction execution” slides again
–Timing: Registers/memory updated at the beginning of the next clock cycle
Control signal values
–What they are per cycle, per instruction
–Finite state machine which determines signals based on instruction type + which cycle it is
Putting it all together

4 Execution: single-cycle (reminder)
add (example: add $t2,$t1,$t0)
–Fetch instruction and add 4 to PC
–Read two source registers ($t1 and $t0)
–Add two values ($t1 + $t0)
–Store result to the destination register ($t1 + $t0 → $t2)

5 A Multi-cycle Datapath For add: Instruction is stored in the instruction register (IR) Values read from rs and rt are stored in A and B Result of ALU is stored in ALUOut

6 Multi-Cycle Execution: R-type (example: sub $t0,$t1,$t2)
Instruction fetch
–IR <= Memory[PC];
–PC <= PC + 4;
Decode instruction/register read
–A <= Reg[IR[25:21]]; (rs)
–B <= Reg[IR[20:16]]; (rt)
–ALUOut <= PC + (sign-extend(IR[15:0])<<2); (explained later)
Execution
–ALUOut <= A op B; (op = add, sub, and, or, …)
Completion
–Reg[IR[15:11]] <= ALUOut; ($t0 <= ALU result)
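
The four R-type steps can be traced in software. The following is a minimal Python sketch, not part of the original slides: the function name `execute_r_type`, the dictionary-based memory, and the register values are assumptions made for illustration, and only a few funct codes are modeled.

```python
MASK32 = 0xFFFFFFFF

# funct field -> ALU operation (a small subset, chosen for illustration)
ALU_OPS = {
    0x20: lambda a, b: (a + b) & MASK32,  # add
    0x22: lambda a, b: (a - b) & MASK32,  # sub
    0x24: lambda a, b: a & b,             # and
    0x25: lambda a, b: a | b,             # or
}


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def execute_r_type(memory, regs, pc):
    """Run one R-type instruction; return (new PC, cycles used)."""
    # Cycle 1: instruction fetch
    ir = memory[pc]
    pc = (pc + 4) & MASK32
    # Cycle 2: decode / register read
    a = regs[(ir >> 21) & 0x1F]                          # rs
    b = regs[(ir >> 16) & 0x1F]                          # rt
    alu_out = (pc + (sign_extend16(ir) << 2)) & MASK32   # branch target, unused by R-type
    # Cycle 3: execution
    alu_out = ALU_OPS[ir & 0x3F](a, b)
    # Cycle 4: completion
    regs[(ir >> 11) & 0x1F] = alu_out                    # rd
    return pc, 4


regs = [0] * 32
regs[9], regs[10] = 10, 3        # $t1 = 10, $t2 = 3 (made-up values)
memory = {0: 0x012A4022}         # encoding of sub $t0,$t1,$t2
pc, cycles = execute_r_type(memory, regs, 0)
print(regs[8], pc, cycles)
```

Each local variable stands in for one of the datapath registers (IR, A, B, ALUOut), so the overwrite of `alu_out` in cycle 3 mirrors how ALUOut is reused.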

7 Execution: single-cycle (reminder)
lw (load word) (example: lw $t0,-12($t1))
–Fetch instruction and add 4 to PC
–Read the base register ($t1)
–Sign-extend the immediate offset (0xfff4 → 0xfffffff4)
–Add two values to get address (X = 0xfffffff4 + $t1)
–Access data memory with the computed address (M[X])
–Store the memory data to the destination register ($t0)
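
As a worked example of the sign-extension and address steps for lw, here is a short Python sketch. The helper name `sign_extend16` and the base-register value are assumptions for illustration only.

```python
def sign_extend16(imm16):
    """Return the 32-bit two's-complement pattern for a 16-bit immediate."""
    imm16 &= 0xFFFF
    return (imm16 | 0xFFFF0000) if imm16 & 0x8000 else imm16


base = 0x10010020                       # hypothetical contents of $t1
offset = sign_extend16(0xFFF4)          # immediate field of lw $t0,-12($t1)
address = (base + offset) & 0xFFFFFFFF  # 32-bit wrap-around addition
print(hex(offset), hex(address))
```

Adding the sign-extended pattern and discarding the carry out of bit 31 has the same effect as subtracting 12 from the base.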

8 A Multi-cycle Datapath For lw (example: lw $t0, -12($t1))
Instruction is stored in the IR
Contents of rs ($t1) stored in A
Output of ALU (address of memory location to be read) stored in ALUOut
Value read from memory is stored in the memory data register (MDR)

9 Multi-cycle Execution: lw (example: lw $t0,-12($t1))
Instruction fetch
–IR <= Memory[PC];
–PC <= PC + 4;
Instruction decode/register read
–A <= Reg[IR[25:21]]; (rs)
–B <= Reg[IR[20:16]];
–ALUOut <= PC + (sign-extend(IR[15:0])<<2);
Execution
–ALUOut <= A + sign-extend(IR[15:0]); ($t1 + -12, sign extended)
Memory Access
–MDR <= Memory[ALUOut]; (M[$t1 + -12])
Write-back
–Load: Reg[IR[20:16]] <= MDR; ($t0 <= M[$t1 + -12])
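
The five lw steps can be sketched the same way in Python. This is an illustrative model, not the course's code: `encode_i`, `execute_lw`, the dictionary memory keyed by byte address, and the data values are all hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def encode_i(opcode, rs, rt, imm):
    """Pack an I-format instruction word (hypothetical helper)."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def execute_lw(memory, regs, pc):
    """Run one lw instruction; return (new PC, cycles used)."""
    # Cycle 1: instruction fetch
    ir = memory[pc]
    pc = (pc + 4) & MASK32
    # Cycle 2: decode / register read
    a = regs[(ir >> 21) & 0x1F]                          # rs (base register)
    b = regs[(ir >> 16) & 0x1F]                          # rt, read but unused by lw
    alu_out = (pc + (sign_extend16(ir) << 2)) & MASK32   # branch target, unused
    # Cycle 3: execution (address calculation)
    alu_out = (a + sign_extend16(ir)) & MASK32
    # Cycle 4: memory access
    mdr = memory[alu_out]
    # Cycle 5: write-back
    regs[(ir >> 16) & 0x1F] = mdr
    return pc, 5


regs = [0] * 32
regs[9] = 0x1000                                  # $t1 (made-up base address)
memory = {0: encode_i(0x23, 9, 8, -12),           # lw $t0,-12($t1)
          0x1000 - 12: 0xDEADBEEF}                # made-up data word
pc, cycles = execute_lw(memory, regs, 0)
print(hex(regs[8]), cycles)
```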

10 Execution: single-cycle (reminder)
sw (store word) (example: sw $t0,-4($t1))
–Fetch instruction and add 4 to PC
–Read the base register ($t1)
–Read the source register ($t0)
–Sign-extend the immediate offset (0xfffc → 0xfffffffc)
–Add two values to get address (X = 0xfffffffc + $t1)
–Store the contents of the source register to the computed address ($t0 → Memory[X])

11 A Multi-cycle Datapath For sw (example: sw $t0, -12($t1))
Instruction is stored in the IR
Contents of rs ($t1) stored in A
Output of ALU (address of memory location to be written) stored in ALUOut

12 Multi-cycle Execution: sw (example: sw $t0,-12($t1))
Instruction fetch
–IR <= Memory[PC];
–PC <= PC + 4;
Decode/register read
–A <= Reg[IR[25:21]]; (rs)
–B <= Reg[IR[20:16]]; (rt)
–ALUOut <= PC + (sign-extend(IR[15:0])<<2);
Execution
–ALUOut <= A + sign-extend(IR[15:0]); ($t1 + -12, sign extended)
Memory Access
–Memory[ALUOut] <= B; (M[$t1 + -12] <= $t0)
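
For comparison with a load, a store finishes one cycle earlier because it has no write-back step. The Python sketch below illustrates this under the same kind of assumptions as before: `execute_sw`, the dictionary memory, and the register contents are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def execute_sw(memory, regs, pc):
    """Run one sw instruction; return (new PC, cycles used)."""
    ir = memory[pc]                                      # cycle 1: fetch
    pc = (pc + 4) & MASK32
    a = regs[(ir >> 21) & 0x1F]                          # cycle 2: rs (base)
    b = regs[(ir >> 16) & 0x1F]                          #          rt (data to store)
    alu_out = (a + sign_extend16(ir)) & MASK32           # cycle 3: address
    memory[alu_out] = b                                  # cycle 4: memory write
    return pc, 4


regs = [0] * 32
regs[9], regs[8] = 0x1000, 0x1234                        # $t1, $t0 (made-up values)
sw_word = (0x2B << 26) | (9 << 21) | (8 << 16) | (-12 & 0xFFFF)  # sw $t0,-12($t1)
memory = {0: sw_word}
pc, cycles = execute_sw(memory, regs, 0)
print(hex(memory[0x1000 - 12]), cycles)
```

The speculative branch-target computation of cycle 2 is omitted here for brevity, since a store never uses it.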

13 Execution: single-cycle (reminder)
beq (example: beq $t0,$t1,L; assume that L is +3 instructions away)
–Fetch instruction and add 4 to PC
–Read two source registers ($t0, $t1)
–Sign-extend the immediate, and shift it left by 2 (0x0003 → 0x0000000c)
–Perform the test, and update the PC if it is true (if $t0 == $t1, then PC = PC + 0x0000000c)
[we will follow what Mars does, so this is not Immediate == 0x0002; PC = PC x ]

14 A Multi-cycle Datapath For beq beq $t0,$t1,label Instruction stored in IR Registers rs and rt are stored in A and B Result of ALU (rs – rt) is stored in ALUOut

15 Multi-cycle execution: beq (example: beq $t0,$t1,label)
Instruction fetch
–IR <= Memory[PC];
–PC <= PC + 4;
Decode/register read
–A <= Reg[IR[25:21]]; (rs)
–B <= Reg[IR[20:16]]; (rt)
–ALUOut <= PC + (sign-extend(IR[15:0])<<2); (PC + number of bytes away the label is: negative for backward branches, positive for forward branches)
Execution
–if (A == B) then PC <= ALUOut; (if $t0 == $t1, perform the branch)
Note: the ALU is used to evaluate A == B; we’ll see later that this does not clash with the use of the ALU above.
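
A small Python sketch of the three beq steps follows. It implements the register transfers exactly as written in the RTL, so the target is computed from the PC value that was already incremented during fetch. The names `execute_beq` and `encode_beq` and the addresses used are assumptions for illustration.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def encode_beq(rs, rt, imm):
    """Pack a beq instruction word (opcode 4); hypothetical helper."""
    return (4 << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def execute_beq(memory, regs, pc):
    """Run one beq instruction; return (new PC, cycles used)."""
    # Cycle 1: instruction fetch
    ir = memory[pc]
    pc = (pc + 4) & MASK32
    # Cycle 2: decode / register read; the target is computed before
    # the control logic knows the instruction is a branch
    a = regs[(ir >> 21) & 0x1F]
    b = regs[(ir >> 16) & 0x1F]
    alu_out = (pc + (sign_extend16(ir) << 2)) & MASK32
    # Cycle 3: compare and conditionally update the PC
    if a == b:
        pc = alu_out
    return pc, 3


regs = [0] * 32
regs[8] = regs[9] = 5                                        # $t0 == $t1, so the branch is taken
taken_pc, _ = execute_beq({0x100: encode_beq(8, 9, 3)}, regs, 0x100)
back_pc, _ = execute_beq({0x100: encode_beq(8, 9, -2)}, regs, 0x100)
regs[9] = 6                                                  # now $t0 != $t1
fall_pc, _ = execute_beq({0x100: encode_beq(8, 9, 3)}, regs, 0x100)
print(hex(taken_pc), hex(back_pc), hex(fall_pc))
```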

16 Execution: single-cycle (reminder)
j
–Fetch instruction and add 4 to PC
–Take the 26-bit immediate field
–Shift left by 2 (to make 28-bit immediate)
–Get 4 bits from the current PC and attach to the left of the immediate
–Assign the value to PC
BUT, as we’ll see soon, only the instruction fetch takes time (at our level of detail)

17 A Multi-cycle Datapath For j No accesses to registers or memory; no need for ALU

18 Multi-cycle execution: j (example: j label)
Instruction fetch
–IR <= Memory[PC];
–PC <= PC + 4;
Decode/register read
–A <= Reg[IR[25:21]];
–B <= Reg[IR[20:16]];
–ALUOut <= PC + (sign-extend(IR[15:0])<<2);
Execution
–PC <= {PC[31:28],IR[25:0],"00"};
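
The concatenation in the jump's execution step can be written with masks and shifts. This Python sketch assumes `pc` is the already-incremented PC; the function name `jump_target` and the example addresses are illustrative.

```python
def jump_target(pc, ir):
    """Build {PC[31:28], IR[25:0], "00"} as a 32-bit address."""
    upper4 = pc & 0xF0000000            # PC[31:28], kept in place
    index28 = (ir & 0x03FFFFFF) << 2    # IR[25:0] followed by two zero bits
    return upper4 | index28


j_word = (2 << 26) | 0x0100000          # j to word index 0x0100000 (opcode 2)
print(hex(jump_target(0x00400004, j_word)))
```

Because only the low 28 bits come from the instruction, a jump cannot leave the 256 MB region selected by the upper four PC bits.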

19 Multi-Cycle Control: What we need to cover
Adding registers after every functional unit
–Need to modify the “instruction execution” slides to reflect this
Breaking instruction execution down into cycles
–What can be done during the same cycle? What requires a cycle?
–Need to modify the “instruction execution” slides again
–Timing: Registers/memory updated at the beginning of the next clock cycle
Control signal values
–What they are per cycle, per instruction
–Finite state machine which determines signals based on instruction type + which cycle it is
Putting it all together

20 Operations
These take time: memory (read/write); register file (read/write); ALU operations
The other connections and logical elements have no latency (for our purposes)

21 Fig 5.28 (given exam 3 and Final) Memory, register file, ALU take time; the rest of it doesn’t (for our purposes)


23 Five Execution Steps
–Instruction fetch
–Instruction decode and register read
–Execution, memory address calculation, or branch completion
–Memory access or R-type instruction completion
–Write-back
Instruction execution takes 3 to 5 cycles!
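
The cycle counts implied by the per-instruction walkthroughs (branch and jump finish in step 3, store and R-type in step 4, load in step 5) can be tabulated. The Python sketch below does so; the instruction mix used for the average is a made-up example, not data from the slides.

```python
# Cycles per instruction class in the multi-cycle design
CYCLES = {"lw": 5, "sw": 4, "r-type": 4, "beq": 3, "j": 3}


def total_cycles(program):
    """Sum the cycles for a sequence of instruction classes."""
    return sum(CYCLES[op] for op in program)


def average_cpi(mix):
    """mix maps instruction class -> fraction of executed instructions."""
    return sum(CYCLES[op] * fraction for op, fraction in mix.items())


# Hypothetical instruction mix, for illustration only
mix = {"lw": 0.25, "sw": 0.10, "r-type": 0.50, "beq": 0.10, "j": 0.05}
print(total_cycles(["lw", "r-type", "sw", "beq"]), round(average_cpi(mix), 2))
```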

24 A Multi-cycle Datapath for reference

25 Step 1: Instruction Fetch
Access memory w/ PC to fetch instruction and store it in Instruction Register (IR)
Increment PC by 4
–We can do this because ALU is not busy and we can use it
–PC update is done at the next rising clock edge
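
The timing rule (all registers update together at the next clock edge) means every right-hand side in a cycle sees the old register values. The Python sketch below models that behavior; `clock_edge`, the state dictionary, and the memory contents are hypothetical names and values used for illustration.

```python
def clock_edge(state, transfers):
    """Apply all register transfers of one cycle at once.

    Every right-hand side is evaluated on the old state before any
    register changes, like the <= transfers in the RTL.
    """
    updates = {name: compute(state) for name, compute in transfers.items()}
    new_state = dict(state)
    new_state.update(updates)
    return new_state


memory = {0x00400000: 0x012A4022}        # one made-up instruction word
state = {"PC": 0x00400000, "IR": 0}

# Step 1 (fetch): both transfers read the PC value from before the edge
fetch = {
    "PC": lambda s: s["PC"] + 4,
    "IR": lambda s: memory[s["PC"]],
}
after = clock_edge(state, fetch)
print(hex(after["IR"]), hex(after["PC"]))
```

Listing the PC update first does not change the instruction that is fetched, which is the point of updating on the clock edge.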

26 A Multi-cycle Datapath for reference

27 Step 2: Decode and Reg. Read
Read registers rs and rt
–We read both of them regardless of necessity
Compute the branch address in case the instruction is a branch
–We can do this as ALU is not busy
–ALUOut will keep the target address
We still don’t set any control signals based on the instruction type
–Instruction is being decoded now in the control logic!

28 A Multi-cycle Datapath for reference

29 Step 3: Various Actions
ALU performs one of three functions based on instruction type
Memory reference
–ALUOut <= A + sign-extend(IR[15:0]);
R-type
–ALUOut <= A op B;
Branch
–if (A==B) PC <= ALUOut;
Jump
–PC <= {PC[31:28],IR[25:0],2'b00}; // Verilog notation

30 A Multi-cycle Datapath for reference

31 Step 4: Memory Access…
If the instruction is memory reference
–MDR <= Memory[ALUOut]; // if it is a load
–Memory[ALUOut] <= B; // if it is a store
Store is complete!
If the instruction is R-type
–Reg[IR[15:11]] <= ALUOut;
Now the instruction is complete!

32 A Multi-cycle Datapath for reference

33 Step 5: Register Write Back Only memory load instruction reaches this step –Reg[IR[20:16]] <= MDR;

34 A (Refined) Datapath fig 5.26

35 Datapath w/ Control Signals Fig 5.27

36 Final Version w/ Control Fig 5.28

37 Finite State Machine (FSM)

38 Traffic Light Control Example
Two states
–NSlite: 1: green light on North-South road; 0: red light on North-South road
–EWlite: similar
Two inputs: NScar (a car is sensed on the NS road, going either way); EWcar (similar)
Current state goes for 30 seconds, then
–Switch to the other state if there is a car waiting
–Current state goes for another 30 seconds if not
So, use a 1/30 Hz clock, or 0.033 Hz

39 Traffic Light Control, cont’d

40 Traffic Light Control, cont’d

41 Traffic Light Control, cont’d
Let’s assign “0” to NSlite and “1” to EWlite
NextState = CurrentState’ · EWcar + CurrentState · NScar’
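
The next-state equation can be checked against the behavior described for the traffic light. Below is a minimal Python sketch using the 0/1 state assignment from the slide; the function name `next_state` and the sensor sequence are assumptions for illustration.

```python
NS_LITE, EW_LITE = 0, 1   # state encoding: 0 = NS green, 1 = EW green


def next_state(current, ns_car, ew_car):
    """NextState = CurrentState' * EWcar + CurrentState * NScar'."""
    return int((not current and ew_car) or (current and not ns_car))


# One entry per 30-second clock tick: (NScar, EWcar) sensor readings (made up)
sensors = [(0, 0), (0, 1), (0, 1), (1, 0), (1, 1)]
state = NS_LITE
history = [state]
for ns_car, ew_car in sensors:
    state = next_state(state, ns_car, ew_car)
    history.append(state)
print(history)
```

The light only changes when a car is waiting on the road that currently has red, which is what the two product terms express.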

42 Finite State Machine (FSM) FSM –Memory element to keep current state –Next state function –Output function
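
Those three parts map directly onto the multi-cycle control. The Python sketch below is one possible arrangement, not the exact state diagram from the figures: the state names are invented labels, the output function simply returns the register transfers listed in the execution slides rather than real control signal values, and `trace` is a hypothetical helper.

```python
def next_state(state, op):
    """Next-state function: depends on the current state and instruction class."""
    if state == "fetch":
        return "decode"
    if state == "decode":
        return {"lw": "mem_addr", "sw": "mem_addr", "r-type": "execute",
                "beq": "branch", "j": "jump"}[op]
    if state == "mem_addr":
        return "mem_read" if op == "lw" else "mem_write"
    if state == "mem_read":
        return "write_back"
    if state == "execute":
        return "r_complete"
    return "fetch"   # every completion state starts the next instruction


# Output function: what the datapath does in each state
OUTPUT = {
    "fetch": "IR <= Memory[PC]; PC <= PC + 4",
    "decode": "A <= Reg[rs]; B <= Reg[rt]; ALUOut <= PC + (sign-extend(imm) << 2)",
    "mem_addr": "ALUOut <= A + sign-extend(imm)",
    "mem_read": "MDR <= Memory[ALUOut]",
    "write_back": "Reg[rt] <= MDR",
    "mem_write": "Memory[ALUOut] <= B",
    "execute": "ALUOut <= A op B",
    "r_complete": "Reg[rd] <= ALUOut",
    "branch": "if (A == B) PC <= ALUOut",
    "jump": "PC <= {PC[31:28], IR[25:0], 00}",
}


def trace(op):
    """Return the states visited while executing one instruction."""
    state = "fetch"            # the memory element holding the current state
    visited = [state]
    while True:
        state = next_state(state, op)
        if state == "fetch":
            return visited
        visited.append(state)


for op in ("lw", "sw", "r-type", "beq", "j"):
    print(op, len(trace(op)), trace(op))
```

The number of states visited equals the number of clock cycles the instruction takes.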

43 Fig 5.28 For reference

44 State Diagram, Big Picture

45 Handling Memory Instructions

46 R-type Instruction

47 Branch and Jump

48 A FSM State Diagram

49 FSM Implementation

50 Figure 5.28 for reference

51 Summary: R-type
Instruction fetch
–IR <= Memory[PC];
–PC <= PC + 4;
Decode/register read
–A <= Reg[IR[25:21]];
–B <= Reg[IR[20:16]];
–ALUOut <= PC + (sign-extend(IR[15:0])<<2);
Execution
–ALUOut <= A op B;
Completion
–Reg[IR[15:11]] <= ALUOut; // done

52 Summary: Memory
Instruction fetch
–IR <= Memory[PC];
–PC <= PC + 4;
Decode/register read
–A <= Reg[IR[25:21]];
–B <= Reg[IR[20:16]];
–ALUOut <= PC + (sign-extend(IR[15:0])<<2);
Execution
–ALUOut <= A + sign-extend(IR[15:0]);
Memory Access
–Load: MDR <= Memory[ALUOut];
–Store: Memory[ALUOut] <= B; // done
Write-back
–Load: Reg[IR[20:16]] <= MDR;

53 Summary: Branch
Instruction fetch
–IR <= Memory[PC];
–PC <= PC + 4;
Decode/register read
–A <= Reg[IR[25:21]];
–B <= Reg[IR[20:16]];
–ALUOut <= PC + (sign-extend(IR[15:0])<<2);
Execution
–if (A == B) then PC <= ALUOut; // done

54 Summary: Jump
Instruction fetch
–IR <= Memory[PC];
–PC <= PC + 4;
Decode/register read
–A <= Reg[IR[25:21]];
–B <= Reg[IR[20:16]];
–ALUOut <= PC + (sign-extend(IR[15:0])<<2);
Execution
–PC <= {PC[31:28],IR[25:0],"00"}; // concatenation

55 Example: Load (1)

56 Example: Load (2)

57 Example: Load (3)

58 Example: Load (4)

59 Example: Load (5)

60 Example: Jump (1)

61 Example: Jump (2)

62 Example: Jump (3)

63 To Summarize… From several building blocks, we constructed a datapath for a subset of the MIPS instruction set. First, we analyzed instructions for functional requirements. Second, we connected building blocks in a way that accommodates the instructions. Third, we refined the datapath and added controls.

64 To Summarize… We looked at how an instruction is executed on the datapath in a pictorial way We looked at control signals connected to functional blocks in our datapath We analyzed how execution steps of an instruction change the control signals

65 To Summarize… We compared a single-cycle implementation and a multi-cycle implementation of our datapath. We analyzed multi-cycle execution of instructions. We refined the multi-cycle datapath. We designed the multi-cycle control.

66 To Summarize… We looked at the multi-cycle control scheme in detail. Multi-cycle control can be implemented using an FSM. An FSM is composed of combinational logic and a memory element.