CS1104: Computer Organisation School of Computing National University of Singapore.

Presentation transcript:


CS1104-P2-6 Processor: Datapath and Control
PII Lecture 6: Processor: Datapath and Control
- Datapath:
  - Single-bus Organization
  - Multiple-bus Organization
- MIPS: Multicycle Datapath and Control
  - Stages of Instructions
  - Datapath Walkthroughs
- Processor and Logic Design

PII Lecture 6: Processor: Datapath and Control
- Reading:
  - Chapter 9 of textbook, which is Chapter 7 in “Computer Organization” by Hamacher, Vranesic and Zaky.
  - Optional reading: Chapter 5 in “Computer Organization & Design” by Patterson and Hennessy.

Datapath

Recap: Organisation
[Figure: block diagram of a computer. Labels: Processor (Control, Datapath), Memory, Devices (Input, Output), Cache, Registers, Bus.]

Fundamental Concepts
- Processor (CPU): the active part of the computer, which does all the work (data manipulation and decision-making).
- Datapath: the portion of the processor that contains the hardware necessary to perform all operations required by the computer (the brawn).
- Control: the portion of the processor (also in hardware) that tells the datapath what needs to be done (the brain).

Fundamental Concepts (2)
- Instruction execution cycle: fetch, decode, execute.
  - Fetch: fetch the next instruction (using PC) from memory into IR.
  - Decode: decode the instruction.
  - Execute: execute the instruction.
[Figure: execution cycle. Instruction Fetch → Instruction Decode → Operand Fetch → Execute → Result Store → Next Instruction.]

Fundamental Concepts (3)
- Fetch: fetch the next instruction into IR (Instruction Register).
  - Assume each word is 4 bytes, each instruction is stored in one word, and the memory is byte addressable.
  - PC (Program Counter) contains the address of the next instruction.
    IR ← [[PC]]
    PC ← [PC] + 4
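The fetch step above can be sketched in a few lines of Python. This is only an illustration: the dictionary-based memory, the function name `fetch` and the instruction words are assumptions made for the example, not part of the lecture.

```python
WORD_SIZE = 4  # bytes per word, as assumed on the slide


def fetch(memory, pc):
    """Return (ir, next_pc), modelling IR <- [[PC]] and PC <- [PC] + 4."""
    ir = memory[pc]            # contents of the memory location addressed by PC
    next_pc = pc + WORD_SIZE   # byte-addressable memory, one instruction per word
    return ir, next_pc


# Hypothetical memory image: word-aligned address -> made-up instruction word.
memory = {0x100: 0x012A4020, 0x104: 0x8D280028}
ir, pc = fetch(memory, 0x100)
print(hex(ir), hex(pc))
```

Calling `fetch` again with the returned `pc` would pick up the word at the next sequential address.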

Single-bus Organization
[Figure: single-bus datapath. Labels: internal processor bus; PC; MAR (address line) and MDR (data line) to the memory bus; Y; MUX with Constant 4 and Select; ALU with inputs A and B, ALU control lines (Add, Sub, XOR, ...) and Carry-in; Z; IR; registers R0 to R(n–1); TEMP; instruction decoder and control logic producing the control signals.]

Instruction Execution
- An instruction can be executed by performing one or more of the following operations in some specified sequence:
  - Transfer a word of data from one register to another or to the ALU (Arithmetic Logic Unit).
  - Perform an arithmetic or a logic operation and store the result in a register.
  - Fetch the contents of a given memory location and load them into a register.
  - Store a word of data from a register into a given memory location.

Register Transfer
- Register to register transfer:
  - For each register Ri, there are two control signals:
    - Ri_in, used to load the data on the bus into the register.
    - Ri_out, used to place the register’s contents on the bus.
- Example: to transfer the contents of R1 to R4:
  - Set R1_out to 1. This places the contents of R1 on the bus.
  - Set R4_in to 1. This loads the data from the processor bus into R4.

Register Transfer (2)
[Figure: input and output gating of registers on the single bus. Labels: internal processor bus; Ri with Ri_in and Ri_out; Y with Y_in; MUX with Constant 4 and Select; ALU inputs A and B; Z with Z_in and Z_out.]

Arithmetic/Logic Operation
- ALU: performs arithmetic and logic operations on its A and B inputs.
- To perform R3 ← [R1] + [R2]:
  1. R1_out, Y_in
  2. R2_out, SelectY, Add, Z_in
  3. Z_out, R3_in
[Figure: register gating diagram. Labels: internal processor bus, Ri, Y, MUX (Constant 4, Select), ALU (A, B), Z.]
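The three control steps for R3 ← [R1] + [R2] can be traced with a small simulation. It is a minimal sketch under stated assumptions: the class `SingleBus`, its `step` method and the 32-bit wrap-around are invented for this example and only model the Y register, the MUX, the ALU's Add operation and the Z register.

```python
class SingleBus:
    """Toy model of the single-bus datapath (illustrative, not the lecture's design)."""

    def __init__(self, regs):
        self.regs = dict(regs)  # general-purpose registers
        self.y = 0              # ALU input latch
        self.z = 0              # ALU output latch
        self.bus = 0

    def step(self, out, ins, add=False, select="Y"):
        """One clock cycle: one source drives the bus, then destinations latch."""
        self.bus = self.z if out == "Z" else self.regs[out]
        alu_result = None
        if add:
            a = self.y if select == "Y" else 4      # MUX: SelectY or Select4
            alu_result = (a + self.bus) & 0xFFFFFFFF
        for dest in ins:
            if dest == "Y":
                self.y = self.bus
            elif dest == "Z":
                if alu_result is None:
                    raise ValueError("Z_in needs an ALU operation in the same step")
                self.z = alu_result
            else:
                self.regs[dest] = self.bus


cpu = SingleBus({"R1": 5, "R2": 7, "R3": 0})
cpu.step(out="R1", ins=["Y"])                          # 1. R1_out, Y_in
cpu.step(out="R2", ins=["Z"], add=True, select="Y")    # 2. R2_out, SelectY, Add, Z_in
cpu.step(out="Z", ins=["R3"])                          # 3. Z_out, R3_in
print(cpu.regs["R3"])
```

Because only one value can be on the bus per cycle, the addition needs three steps even though the ALU itself works in one.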

Arithmetic/Logic Operation (2)
- If there are n operations, do we need n ALU control lines?
- We could encode the operation instead, which requires only log2(n) control lines (rounded up) for n operations. However, this increases complexity and hardware, because an additional decoder is needed to recover the individual signals.
[Figure: ALU with inputs A and B, ALU control lines (Add, Sub, XOR, ...) and Carry-in.]
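The trade-off between one line per operation and an encoded operation number can be made concrete. The function names and the list of five operations below are assumptions chosen for illustration.

```python
def lines_needed(n_ops):
    """Encoded control lines needed for n_ops operations: ceil(log2(n_ops))."""
    return (n_ops - 1).bit_length()


def decode(code, n_ops):
    """The extra decoder: turn an encoded operation number into one-hot lines."""
    return [1 if i == code else 0 for i in range(n_ops)]


ops = ["Add", "Sub", "XOR", "AND", "OR"]   # hypothetical operation set
print(len(ops), "one-hot lines versus", lines_needed(len(ops)), "encoded lines")
print(dict(zip(ops, decode(ops.index("XOR"), len(ops)))))
```

With five operations, three encoded lines replace five dedicated ones, at the cost of the decoding step.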

Reading a Word from Memory
- Move (R1), R2    /* R2 ← [[R1]] */
  1. MAR ← [R1]
  2. Start a Read operation on the memory bus
  3. Wait for the MFC response from the memory
  4. Load MDR from the memory bus
  5. R2 ← [MDR]
- MDR has four control signals: MDR_in, MDR_out, MDR_inE and MDR_outE.
[Figure: MDR connections. MDR_inE and MDR_outE gate the memory-bus data lines; MDR_in and MDR_out gate the internal processor bus.]

Reading a Word from Memory (2)
- Move (R1), R2    /* R2 ← [[R1]] */
- Sequence of control steps:
  1. R1_out, MAR_in, Read
  2. MDR_inE, WMFC
  3. MDR_out, R2_in
- WMFC: wait for the arrival of the MFC (Memory-Function-Completed) signal.
- MFC: to accommodate variability in response time, the processor waits until it receives an indication that the Read/Write operation has been completed. The addressed device sets MFC to 1 to indicate this.

Storing a Word in Memory
- Move R2, (R1)    /* [R1] ← [R2] */
- Sequence of control steps:
  1. R1_out, MAR_in
  2. R2_out, MDR_in, Write
  3. MDR_outE, WMFC
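The read and store sequences can be followed step by step in code. The sketch below is an assumption-laden model: memory is a Python dictionary that responds immediately, so WMFC never actually has to wait, and the function names are made up for the example.

```python
def load_indirect(regs, memory, addr_reg="R1", dst_reg="R2"):
    """Move (R1), R2: read the word whose address is in R1 into R2."""
    mar = regs[addr_reg]      # 1. R1_out, MAR_in, Read
    mdr = memory[mar]         # 2. MDR_inE, WMFC (the toy memory answers at once)
    regs[dst_reg] = mdr       # 3. MDR_out, R2_in


def store_indirect(regs, memory, src_reg="R2", addr_reg="R1"):
    """Move R2, (R1): write R2 into the word whose address is in R1."""
    mar = regs[addr_reg]      # 1. R1_out, MAR_in
    mdr = regs[src_reg]       # 2. R2_out, MDR_in, Write
    memory[mar] = mdr         # 3. MDR_outE, WMFC


regs = {"R1": 0x200, "R2": 0}
memory = {0x200: 99}
load_indirect(regs, memory)    # R2 now holds the word at address 0x200
regs["R2"] += 1
store_indirect(regs, memory)   # the incremented value is written back
print(regs["R2"], memory[0x200])
```

In both directions MAR carries the address and MDR carries the data; only the direction of the MDR gating differs.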

Executing a Complete Instruction
- Add (R3), R1    /* R1 ← [R1] + [[R3]] */
- Adds the contents of the memory location pointed to by R3 to register R1.
- Sequence of control steps:
  1. PC_out, MAR_in, Read, Select4, Add, Z_in
  2. Z_out, PC_in, Y_in, WMFC
  3. MDR_out, IR_in
  4. R3_out, MAR_in, Read
  5. R1_out, Y_in, WMFC
  6. MDR_out, SelectY, Add, Z_in
  7. Z_out, R1_in, End
- Steps 1 to 3: instruction fetch.
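The seven control steps can be mirrored one for one in a short simulation. This is a sketch, not the lecture's implementation: the state dictionary, the function name and the use of a string as a stand-in for the instruction word are all assumptions, and memory is again assumed to answer by the time WMFC is reached.

```python
def execute_add_indirect(s, memory):
    """Add (R3), R1 on the single-bus datapath; s holds the processor registers."""
    # 1. PC_out, MAR_in, Read, Select4, Add, Z_in
    s["MAR"] = s["PC"]
    s["Z"] = s["PC"] + 4
    # 2. Z_out, PC_in, Y_in, WMFC (memory delivers the instruction word to MDR)
    s["PC"] = s["Z"]
    s["Y"] = s["Z"]
    s["MDR"] = memory[s["MAR"]]
    # 3. MDR_out, IR_in
    s["IR"] = s["MDR"]
    # 4. R3_out, MAR_in, Read
    s["MAR"] = s["R3"]
    # 5. R1_out, Y_in, WMFC (memory delivers the operand to MDR)
    s["Y"] = s["R1"]
    s["MDR"] = memory[s["MAR"]]
    # 6. MDR_out, SelectY, Add, Z_in
    s["Z"] = s["Y"] + s["MDR"]
    # 7. Z_out, R1_in, End
    s["R1"] = s["Z"]
    return s


# Hypothetical contents: the instruction sits at address 0, the operand at 200.
memory = {0: "Add (R3), R1", 200: 30}
state = {"PC": 0, "R1": 12, "R3": 200}
execute_add_indirect(state, memory)
print(state["R1"], state["PC"], state["IR"])
```

Step 1 shows why the bus is the bottleneck: the PC value is used both as the memory address and as the ALU operand in the same cycle, so the increment costs no extra step.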

Multiple-Bus Organization
- Single-bus structure: control sequences are long because only one data item can be transferred over the bus in a clock cycle.
- The figure on the next slide shows a three-bus structure.
- All registers are combined into a single block called the register file, which has three ports: two outputs, allowing two registers to be accessed simultaneously and have their contents put on buses A and B, and one input, allowing data on bus C to be loaded into a third register.
- Buses A and B transfer the source operands to the A and B inputs of the ALU, and the result is transferred to its destination over bus C.

Multiple-Bus Organization (2)
[Figure: three-bus datapath. Labels: Bus A, Bus B, Bus C; PC with Incrementer; register file; MUX with Constant 4; ALU with inputs A and B and output R; instruction decoder; IR; MDR (memory bus data lines); MAR (address line).]

Multiple-Bus Organization (3)
- For the ALU, R=A (or R=B) means that its A (or B) input is passed unmodified to bus C.
- Add R4, R5, R6    /* R6 ← [R4] + [R5] */
- Adds the contents of R4 and R5 and places the sum in R6.
- Sequence of control steps:
  1. PC_out, R=B, MAR_in, Read, IncPC
  2. WMFC
  3. MDR_outB, R=B, IR_in
  4. R4_outA, R5_outB, SelectA, Add, R6_in, End
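For comparison with the single-bus case, the four steps of the three-bus sequence can be written out the same way. The state dictionary and function name are assumptions made for this sketch, and memory is assumed to have responded by the WMFC step.

```python
def execute_add_three_bus(s, memory):
    """Add R4, R5, R6 on the three-bus datapath; s holds the processor registers."""
    # 1. PC_out, R=B, MAR_in, Read, IncPC (the incrementer updates PC, not the ALU)
    s["MAR"] = s["PC"]
    s["PC"] = s["PC"] + 4
    # 2. WMFC (memory delivers the instruction word to MDR)
    s["MDR"] = memory[s["MAR"]]
    # 3. MDR_outB, R=B, IR_in
    s["IR"] = s["MDR"]
    # 4. R4_outA, R5_outB, SelectA, Add, R6_in, End
    s["R6"] = s["R4"] + s["R5"]
    return s


memory = {0: "Add R4, R5, R6"}   # string stand-in for the instruction word
state = {"PC": 0, "R4": 10, "R5": 32, "R6": 0}
execute_add_three_bus(state, memory)
print(state["R6"], state["PC"])
```

The whole execute phase fits in step 4, because both operands and the result travel on separate buses in the same cycle.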

Control
- Hardwired control or microprogrammed control.
- Hardwired control:
[Figure: hardwired control unit. Labels: Clock (CLK), control step counter, IR, decoder/encoder, external inputs, condition codes, control signals.]

Control (2)
- Microprogrammed control:
  - Control signals are generated by a program.
  - A control word (CW) is a microinstruction whose individual bits represent the various control signals.
  - Vertical organization: highly encoded schemes that use compact codes to specify only a small number of control functions in each microinstruction.
  - Horizontal organization: a minimally encoded scheme in which many resources can be controlled with a single microinstruction.
  - Popular in Complex Instruction Set Computer (CISC) architectures, because complex instruction sets require complex controllers, which are more easily implemented as microprograms.

Control (3)
- Example of a horizontal organization scheme:
  1. PC_out, MAR_in, Read, Select4, Add, Z_in
  2. Z_out, PC_in, Y_in, WMFC
  3. MDR_out, IR_in
  4. R3_out, MAR_in, Read
  5. R1_out, Y_in, WMFC
  6. MDR_out, SelectY, Add, Z_in
  7. Z_out, R1_in, End
[Table: one microinstruction per step, with one bit per control signal. Signals: PC_in, PC_out, End, MAR_in, Read, IR_in, Y_in, Select, MDR_out, Z_out, Z_in, R1_out, R1_in, Add, R3_out, WMFC. The bit values did not survive extraction.]
- Select = 0: SelectY. Select = 1: Select4.
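A horizontal control word for the seven steps can be built as a string of bits, one per control signal. The sketch below is illustrative only: the column order of `SIGNALS` is an arbitrary choice, and the helper `control_word` is invented for this example. The single Select bit follows the convention stated above (1 for Select4, 0 for SelectY).

```python
# One bit per control signal; the column order here is an assumption.
SIGNALS = ["PC_in", "PC_out", "End", "MAR_in", "Read", "IR_in", "Y_in", "Select",
           "MDR_out", "Z_out", "Z_in", "R1_out", "R1_in", "Add", "R3_out", "WMFC"]


def control_word(*active):
    """Return a bit string with a 1 for every asserted signal."""
    unknown = set(active) - set(SIGNALS)
    if unknown:
        raise ValueError(f"unknown signals: {sorted(unknown)}")
    return "".join("1" if name in active else "0" for name in SIGNALS)


MICROPROGRAM = [
    control_word("PC_out", "MAR_in", "Read", "Select", "Add", "Z_in"),  # Select4
    control_word("Z_out", "PC_in", "Y_in", "WMFC"),
    control_word("MDR_out", "IR_in"),
    control_word("R3_out", "MAR_in", "Read"),
    control_word("R1_out", "Y_in", "WMFC"),
    control_word("MDR_out", "Add", "Z_in"),                             # SelectY
    control_word("Z_out", "R1_in", "End"),
]

for step, word in enumerate(MICROPROGRAM, start=1):
    print(step, word)
```

Each word is as wide as the number of signals, which is what makes the scheme horizontal: no decoding is needed, but most bits in any given word are 0.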

MIPS: Multicycle Datapath and Control
(Adapted from D. Patterson’s CS61C, Copyright 2000 UCB)

Stages of a Datapath
- Problem: a single, atomic block which “executes an instruction” (performs all necessary operations, beginning with fetching the instruction) would be too bulky and inefficient.
- Solution: break up the process of “executing an instruction” into stages, and then connect the stages to create the whole datapath.
  - Smaller stages are easier to design.
  - It is easy to optimize (change) one stage without touching the others.

Stages of a Datapath (2)
- There is a wide variety of MIPS instructions, so what general steps do they have in common?
- Stages:
  1. Instruction Fetch
  2. Instruction Decode
  3. ALU
  4. Memory Access
  5. Register Write

Stages of a Datapath (3)
- Stage 1: Instruction Fetch.
  - No matter what the instruction is, the 32-bit instruction word must first be fetched from memory (the cache-memory hierarchy).
  - The PC is also incremented here: PC = PC + 4, so that it points to the next instruction (+4 because memory is byte addressed).

Stages of a Datapath (4)
- Stage 2: Instruction Decode.
  - Upon fetching the instruction, we next gather data from the fields (decode all necessary instruction data).
  - First, read the opcode to determine the instruction type and field lengths.
  - Second, read in data from all necessary registers.
    - For add, read two registers.
    - For addi, read one register.
    - For jal, no read is necessary.

Stages of a Datapath (5)
- Stage 3: ALU (Arithmetic-Logic Unit).
  - The real work of most instructions is done here: arithmetic (+, -, *, /), shifting, logic (&, |), comparisons (slt).
  - What about loads and stores?
    - lw $t0, 40($t1)
    - The address we are accessing in memory = the value in $t1 plus the value 40.
    - We do this addition at this stage.
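The address calculation for `lw $t0, 40($t1)` is a plain addition of the base register and the offset. The sketch below adds one detail the slide does not mention: in MIPS the 16-bit offset is sign-extended before the addition, so negative offsets work as well. The function names and the sample register value are assumptions.

```python
def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a signed two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def effective_address(base, imm16):
    """Stage 3 work for lw/sw: base register value plus sign-extended offset."""
    return (base + sign_extend16(imm16)) & 0xFFFFFFFF


t1 = 0x1000                                   # hypothetical value of $t1
print(hex(effective_address(t1, 40)))         # address used by lw $t0, 40($t1)
print(hex(effective_address(t1, 0xFFFC)))     # offset field encoding -4
```

The same ALU adder that serves `add` is reused here, which is why loads and stores pass through stage 3 before touching memory.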

Stages of a Datapath (6)
- Stage 4: Memory Access.
  - Only the load and store instructions do anything during this stage; all other instructions remain idle.
  - Since loads and stores have this unique step, we need the extra stage to account for them.
  - As a result of the cache system, this stage is expected to be just as fast (on average) as the others.

Stages of a Datapath (7)
- Stage 5: Register Write.
  - Most instructions write the result of some computation into a register.
  - Examples: arithmetic, logical, shifts, loads, slt.
  - What about stores, branches, jumps?
    - They do not write anything into a register at the end.
    - These remain idle during this fifth stage.

Datapath: Generic Steps
[Figure: generic five-stage datapath. Stages: 1. Instruction Fetch, 2. Decode/Register Read, 3. Execute, 4. Memory, 5. Reg. Write. Labels: PC, +4, instruction memory, rs, rt, rd, imm, registers, ALU, data memory.]

Datapath Walkthroughs: add
- add $r3,$r1,$r2    # r3 = r1+r2
  - Stage 1: Fetch this instruction, increment PC.
  - Stage 2: Decode to find that it is an add instruction, then read registers $r1 and $r2.
  - Stage 3: Add the two values retrieved in stage 2.
  - Stage 4: Idle (nothing to write to memory).
  - Stage 5: Write the result of stage 3 into register $r3.

Datapath Walkthroughs: add (2)
[Figure: generic datapath annotated for add r3, r1, r2. The register file supplies reg[1] and reg[2], and the ALU produces reg[1]+reg[2].]

Datapath Walkthroughs: slti
- slti $r3,$r1,17
  - Stage 1: Fetch this instruction, increment PC.
  - Stage 2: Decode to find it is an slti, then read register $r1.
  - Stage 3: Compare the value retrieved in stage 2 with the integer 17.
  - Stage 4: Go idle.
  - Stage 5: Write the result of stage 3 into register $r3.

Datapath Walkthroughs: slti (2)
[Figure: generic datapath annotated for slti r3, r1, 17. Register fields 3, 1, x; the register file supplies reg[1] and the immediate goes to the ALU.]

Datapath Walkthroughs: sw
- sw $r3, 20($r1)
  - Stage 1: Fetch this instruction, increment PC.
  - Stage 2: Decode to find it is an sw, then read registers $r1 and $r3.
  - Stage 3: Add 20 to the value in register $r1 (retrieved in stage 2).
  - Stage 4: Write the value in register $r3 (retrieved in stage 2) to the memory address computed in stage 3.
  - Stage 5: Go idle (nothing to write into a register).

Datapath Walkthroughs: sw (2)
[Figure: generic datapath annotated for sw r3, 20(r1). Register fields 3, 1, x; the register file supplies reg[1] and reg[3]; data memory performs MEM[r1+20] <- r3.]

Why Five Stages?
- Could we have a different number of stages?
  - Yes, and other architectures do.
- So why does MIPS have five stages, if instructions tend to go idle for at least one stage?
  - There is one instruction that uses all five stages: the load.

Datapath Walkthroughs: lw
- lw $r3, 40($r1)
  - Stage 1: Fetch this instruction, increment PC.
  - Stage 2: Decode to find it is a lw, then read register $r1.
  - Stage 3: Add 40 to the value in register $r1 (retrieved in stage 2).
  - Stage 4: Read the value from the memory address computed in stage 3.
  - Stage 5: Write the value found in stage 4 into register $r3.
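The four walkthroughs (add, slti, sw, lw) can be checked against a small stage-by-stage model. It is a sketch under several assumptions: instructions arrive already fetched and decoded as Python dictionaries with hypothetical field names (`op`, `rs`, `rt`, `rd`, `imm`), registers and memory are dictionaries, and the function returns the list of stages in which the instruction did real work.

```python
def run_instruction(instr, regs, mem):
    """Run one instruction through the five stages; return the active stages."""
    op = instr["op"]
    active = [1, 2]                 # Stage 1 (fetch) and Stage 2 (decode) always run
    base = regs[instr["rs"]]        # Stage 2: read the first source register

    # Stage 3: ALU
    if op == "add":
        alu = base + regs[instr["rt"]]
    elif op == "slti":
        alu = 1 if base < instr["imm"] else 0
    elif op in ("lw", "sw"):
        alu = base + instr["imm"]   # effective address
    else:
        raise ValueError(f"unsupported instruction: {op}")
    active.append(3)

    # Stage 4: memory access (loads and stores only)
    loaded = None
    if op == "lw":
        loaded = mem[alu]
        active.append(4)
    elif op == "sw":
        mem[alu] = regs[instr["rt"]]
        active.append(4)

    # Stage 5: register write (everything except stores)
    if op == "add":
        regs[instr["rd"]] = alu
        active.append(5)
    elif op == "slti":
        regs[instr["rt"]] = alu
        active.append(5)
    elif op == "lw":
        regs[instr["rt"]] = loaded
        active.append(5)
    return active


regs = {1: 100, 2: 5, 3: 0}
mem = {140: 77}
print(run_instruction({"op": "add", "rd": 3, "rs": 1, "rt": 2}, regs, mem))
print(run_instruction({"op": "lw", "rt": 3, "rs": 1, "imm": 40}, regs, mem))
```

Only the load reports all five stages, which matches the argument for why the datapath has five stages at all.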

Datapath Walkthroughs: lw (2)
[Figure: generic datapath annotated for lw r3, 40(r1). Register fields 3, 1, x; the register file supplies reg[1]; data memory performs r3 <- MEM[r1+40].]

What Hardware Is Needed?
- PC: a register which keeps track of the address of the next instruction.
- General Purpose Registers
  - Used in stages 2 (read) and 5 (write).
  - We are currently working with 32 of these.
- Memory
  - Used in stages 1 (fetch) and 4 (R/W).
  - The cache system makes these two stages as fast as the others, on average.

Datapath: Summary
- Construct the datapath based on the register transfers required to perform instructions.
- The control part causes the right transfers to happen.
[Figure: generic datapath with a Controller driven by the opcode and funct fields. Labels: PC, +4, instruction memory, rs, rt, rd, imm, registers, ALU, data memory.]

Where is Logic Design Used?
- Combinational circuits: for the ALU and other parts of the datapath.
- Sequential circuits: different control signals are needed in different clock cycles and for different instructions, for the ALU, registers and other parts of the datapath.
[Figure: ALU with its ALU control block.]

Where is Logic Design Used? (2)
- High-level view of finite state machine control.
- Sequential logic design can be used to assert the correct control signals at the correct times.
[Figure: state diagram. Start → Instruction fetch/decode and register fetch → one of: Memory access instructions, R-type instructions, Branch instruction, Jump instruction.]
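The high-level state diagram can be expressed as a tiny finite state machine. This is a deliberately coarse sketch: each instruction class is collapsed into a single state, and the state names, the opcode table and the `trace` helper are assumptions made for the example rather than the controller described in the lecture.

```python
# Which branch of the state diagram each (hypothetical) opcode takes after decode.
INSTRUCTION_CLASS = {
    "lw": "memory_access", "sw": "memory_access",
    "add": "r_type", "sub": "r_type",
    "beq": "branch",
    "j": "jump",
}


def next_state(state, opcode):
    """Transition function: decode picks a branch, every branch returns to fetch."""
    if state == "fetch_decode":
        return INSTRUCTION_CLASS[opcode]
    return "fetch_decode"


def trace(opcodes):
    """List the states visited while executing a sequence of opcodes."""
    state = "fetch_decode"
    visited = []
    for op in opcodes:
        visited.append(state)           # fetch/decode and register fetch
        state = next_state(state, op)
        visited.append(state)           # class-specific state
        state = next_state(state, op)   # back to fetch for the next instruction
    return visited


print(trace(["lw", "add", "j"]))
```

A real controller would attach a set of asserted control signals to each state; here only the sequencing is modelled.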

Summary
- The datapath is the hardware that performs the operations necessary to execute programs.
- Control instructs the datapath on what to do next.
- The datapath needs:
  - access to storage (general purpose registers and memory)
  - computational ability (ALU)
  - helper hardware (local registers and PC)

Summary (2)
- Five stages of the datapath (executing an instruction):
  1. Instruction Fetch (Increment PC)
  2. Instruction Decode (Read Registers)
  3. ALU (Computation)
  4. Memory Access
  5. Write to Registers
- ALL instructions must go through ALL five stages.
- The datapath is designed in hardware.

End of file