CSE331 W09.1Irwin Fall 2007 PSU CSE 331 Computer Organization and Design Fall 2007 Week 9 Section 1: Mary Jane Irwin (www.cse.psu.edu/~mji)www.cse.psu.edu/~mji.

Slides:



Advertisements
Similar presentations
331 W08.1Spring :332:331 Computer Architecture and Assembly Language Spring 2006 Week 8: Datapath Design [Adapted from Dave Patterson’s UCB CS152.
Advertisements

The Processor: Datapath & Control
1  1998 Morgan Kaufmann Publishers Chapter Five The Processor: Datapath and Control.
Chapter 5 The Processor: Datapath and Control Basic MIPS Architecture Homework 2 due October 28 th. Project Designs due October 28 th. Project Reports.
331 W9.1Spring :332:331 Computer Architecture and Assembly Language Spring 2006 Week 9 Building a Single-Cycle Datapath [Adapted from Dave Patterson’s.
Levels in Processor Design
331 Lec 14.1Fall 2002 Review: Abstract Implementation View  Split memory (Harvard) model - single cycle operation  Simplified to contain only the instructions:
ENEE350 Ankur Srivastava University of Maryland, College Park Based on Slides from Mary Jane Irwin ( )
Computer Structure - Datapath and Control Goal: Design a Datapath  We will design the datapath of a processor that includes a subset of the MIPS instruction.
The Processor 2 Andreas Klappenecker CPSC321 Computer Architecture.
331 W08.1Spring :332:331 Computer Architecture and Assembly Language Fall 2003 Week 8 [Adapted from Dave Patterson’s UCB CS152 slides and Mary Jane.
Chapter Five The Processor: Datapath and Control.
331 W10.1Spring :332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s.
Processor I CPSC 321 Andreas Klappenecker. Midterm 1 Thursday, October 7, during the regular class time Covers all material up to that point History MIPS.
The Processor Andreas Klappenecker CPSC321 Computer Architecture.
CSE431 L05 Basic MIPS Architecture.1Irwin, PSU, 2005 CSE 431 Computer Architecture Fall 2005 Lecture 05: Basic MIPS Architecture Review Mary Jane Irwin.
The Processor: Datapath & Control. Implementing Instructions Simplified instruction set memory-reference instructions: lw, sw arithmetic-logical instructions:
Chapter 4 Sections 4.1 – 4.4 Appendix D.1 and D.2 Dr. Iyad F. Jafar Basic MIPS Architecture: Single-Cycle Datapath and Control.
CSCI-365 Computer Organization Lecture Note: Some slides and/or pictures in the following are adapted from: Computer Organization and Design, Patterson.
CSE331 W10&11.1Irwin Fall 2007 PSU CSE 331 Computer Organization and Design Fall 2007 Week 10 & 11 Section 1: Mary Jane Irwin (
COSC 3430 L08 Basic MIPS Architecture.1 COSC 3430 Computer Architecture Lecture 08 Processors Single cycle Datapath PH 3: Sections
The CPU Processor (CPU): the active part of the computer, which does all the work (data manipulation and decision-making) Datapath: portion of the processor.
Computer Architecture Chapter 5 Fall 2005 Department of Computer Science Kent State University.
Lec 15Systems Architecture1 Systems Architecture Lecture 15: A Simple Implementation of MIPS Jeremy R. Johnson Anatole D. Ruslanov William M. Mongan Some.
CSE431 Chapter 4A.1Irwin, PSU, 2008 CSE 431 Computer Architecture Fall 2008 Chapter 4A: The Processor, Part A Mary Jane Irwin ( )
Gary MarsdenSlide 1University of Cape Town Chapter 5 - The Processor  Machine Performance factors –Instruction Count, Clock cycle time, Clock cycles per.
Computer Architecture and Design – ECEN 350 Part 6 [Some slides adapted from A. Sprintson, M. Irwin, D. Paterson and others]
1 Processor: Datapath and Control Single cycle processor –Datapath and Control Multicycle processor –Datapath and Control Microprogramming –Vertical and.
1 Processor: Datapath and Control Single cycle processor –Datapath and Control Multicycle processor –Datapath and Control Microprogramming –Vertical and.
D ATA P ATH OF A PROCESSOR (MIPS) Module 1.1 : Elements of computer system UNIT 1.
1  2004 Morgan Kaufmann Publishers Chapter Five.
CSE331 W10.1Irwin&Li Fall 2006 PSU CSE 331 Computer Organization and Design Fall 2006 Week 10 Section 1: Mary Jane Irwin (
ECE-C355 Computer Structures Winter 2008 The MIPS Datapath Slides have been adapted from Prof. Mary Jane Irwin ( )
Chapter 4 From: Dr. Iyad F. Jafar Basic MIPS Architecture: Single-Cycle Datapath and Control.
Design a MIPS Processor
Datapath and Control AddressInstruction Memory Write Data Reg Addr Register File ALU Data Memory Address Write Data Read Data PC Read Data Read Data.
COM181 Computer Hardware Lecture 6: The MIPs CPU.
MIPS Processor.
Chapter 4 From: Dr. Iyad F. Jafar Basic MIPS Architecture: Multi-Cycle Datapath and Control.
Computer Architecture Lecture 6.  Our implementation of the MIPS is simplified memory-reference instructions: lw, sw arithmetic-logical instructions:
CS Computer Architecture Week 10: Single Cycle Implementation
CS161 – Design and Architecture of Computer Systems
CS 230: Computer Organization and Assembly Language
Single-Cycle Datapath and Control
Processor Design & Implementation
Basic MIPS Architecture
Processor (I).
CS/COE0447 Computer Organization & Assembly Language
MIPS processor continued
Designing MIPS Processor (Single-Cycle) Presentation G
CSCI206 - Computer Organization & Programming
Single-Cycle CPU DataPath.
CS/COE0447 Computer Organization & Assembly Language
CSCI206 - Computer Organization & Programming
Rocky K. C. Chang 6 November 2017
Composing the Elements
Composing the Elements
The Processor Lecture 3.3: Single-cycle Implementation
The Processor Lecture 3.2: Building a Datapath with Control
The Processor Lecture 3.1: Introduction & Logic Design Conventions
COMS 361 Computer Organization
COSC 2021: Computer Organization Instructor: Dr. Amir Asif
Lecture 14: Single Cycle MIPS Processor
MIPS processor continued
CS/COE0447 Computer Organization & Assembly Language
The Processor: Datapath & Control.
COMS 361 Computer Organization
Processor: Datapath and Control
CS/COE0447 Computer Organization & Assembly Language
Presentation transcript:

CSE331 W09.1Irwin Fall 2007 PSU CSE 331 Computer Organization and Design Fall 2007 Week 9 Section 1: Mary Jane Irwin ( Section 2: Krishna Narayanan Course material on ANGEL: cms.psu.edu [ adapted from D. Patterson slides ]

CSE331 W09.2Irwin Fall 2007 PSU Head’s Up  Last week’s material l Number representation, basic arithmetic ops, MIPS ALU design  This week’s material l Designing a MIPS single cycle datapath -Reading assignment – PH , B.7  Next week’s material l More on single cycle datapath design -Reading assignment – PH: 5.4, B.8, C.1-C.2  Reminders l HW 6 is due Monday, Oct 29 th (by 11:55pm) l Quiz 5 closes Thursday, Oct 25 th (by 11:55pm) l Exam #1 take home solution due Thursday, Nov 1 st (by 11:55pm) l Exam #2 is Thursday, Nov 8, 6:30 to 7:45pm -People with conflicts should have sent by now l Friday, Nov 16 is the late-drop deadline

CSE331 W09.3Irwin Fall 2007 PSU Datapath design tended to just work … Control paths are where the system complexity lives. Bugs spawned from control path design errors reside in the microcode flow, the finite-state machines, and all the special exceptions that inevitably spring up in a machine design like thistles in a flower garden. The Pentium Chronicles, Colwell, pg. 64

CSE331 W09.4Irwin Fall 2007 PSU Review: Design Principles  Simplicity favors regularity l fixed size instructions – 32-bits l only three instruction formats  Good design demands good compromises l three instruction formats  Smaller is faster l limited instruction set l limited number of registers in register file l limited number of addressing modes  Make the common case fast l arithmetic operands from the register file (load-store machine) l allow instructions to contain immediate operands

CSE331 W09.5Irwin Fall 2007 PSU  We're ready to look at an implementation of the MIPS  Simplified to contain only: memory-reference instructions: lw, sw arithmetic-logical instructions: add, addu, sub, subu, and, or, xor, nor, slt, sltu arithmetic-logical immediate instructions: addi, addiu, andi, ori, xori, slti, sltiu control flow instructions: beq, j  Generic implementation: l use the program counter (PC) to supply the instruction address and fetch the instruction from memory (and update the PC) l decode the instruction (and read registers) l execute the instruction The Processor: Datapath & Control Fetch PC = PC+4 DecodeExec

CSE331 W09.6Irwin Fall 2007 PSU Abstract Implementation View  Two types of functional units: l elements that operate on data values (combinational) l elements that contain state (sequential)  Single cycle operation  Split memory (Harvard) model - one memory for instructions and one for data AddressInstruction Memory Write Data Reg Addr Register File ALU Data Memory Address Write Data Read Data PC Read Data Read Data

CSE331 W09.7Irwin Fall 2007 PSU Clocking Methodologies  Clocking methodology defines when signals can be read and when they can be written falling (negative) edge rising (positive) edge clock cycle clock rate = 1/(clock cycle) e.g., 10 nsec clock cycle = 100 MHz clock rate 1 nsec clock cycle = 1 GHz clock rate  State element design choices l level sensitive latch l master-slave and edge-triggered flipflops

CSE331 W09.8Irwin Fall 2007 PSU State Elements  Set-reset latch R S Q !Q RSQ(t+1)!Q(t+1) Q(t)!Q(t) 1100 clock D Q !Q clock D Q  Level sensitive D latch l latch is transparent when clock is high (copies input to output)

CSE331 W09.9Irwin Fall 2007 PSU Two-Sided Clock Constraint  Race problem with latch based design … D clock Q !Q D-latch0 D clock Q !Q D-latch1 clock  Consider the case when D-latch0 holds a 0 and D- latch1 holds a 1 and you want to transfer the contents of D-latch0 to D-latch1 and vica versa l must have the clock high long enough for the transfer to take place l must not leave the clock high so long that the transferred data is copied back into the original latch  Two-sided clock constraint

CSE331 W09.10Irwin Fall 2007 PSU State Elements, con’t  Solution is to use flipflops that change state (Q) only on clock edge (master-slave) l master (first D-latch) copies the input when the clock is high (the slave (second D-latch) is locked in its memory state and the output does not change) l slave copies the master when the clock goes low (the master is now locked in its memory state so changes at the input are not loaded into the master D-latch) D clock Q !Q D-latch D clock Q !Q D-latch Q !Q D clock D Q

CSE331 W09.11Irwin Fall 2007 PSU One-Slided Clock Constraint  Master-slave (edge-triggered) flipflops removes one of the clock constraints D clock Q !Q MS-ff0 D clock Q !Q MS-ff1 clock  Consider the case when MS-ff0 holds a 0 and MS-ff1 holds a 1 and you want to transfer the contents of MS-ff0 to MS-ff1 and vica versa l must have the clock cycle time long enough to accommodate the worst case delay path  One-sided clock constraint

CSE331 W09.12Irwin Fall 2007 PSU Latches vs Flipflops  Output is equal to the stored value inside the element  Change of state (value) is based on the clock l Latches: output changes whenever the inputs change and the clock is asserted (level sensitive methodology) -Two-sided timing constraint l Flip-flop: output changes only on a clock edge (edge- triggered methodology) -One-sided timing constraint A clocking methodology defines when signals can be read and written – wouldn’t want to read a signal at the same time it was being written

CSE331 W09.13Irwin Fall 2007 PSU Our Implementation  An edge-triggered methodology, typical execution l read contents of some state elements (combinational activity, so no clock control signal needed) l send values through some combinational logic l write results to one or more state elements on clock edge State element 1 State element 2 Combinational logic clock one clock cycle  Assumes state elements are written on every clock cycle; if not, need explicit write control signal l write occurs only when both the write control is asserted and the clock edge occurs

CSE331 W09.14Irwin Fall 2007 PSU Fetching Instructions  Fetching instructions involves l reading the instruction from the Instruction Memory l updating the PC value to be the address of the next (sequential) instruction Read Address Instruction Memory Add PC 4 l PC is updated every clock cycle, so it does not need an explicit write control signal just a clock signal l Reading from the Instruction Memory is a combinational activity, so it doesn’t need an explicit read control signal Fetch PC = PC+4 DecodeExec clock

CSE331 W09.15Irwin Fall 2007 PSU Decoding Instructions  Decoding instructions involves l sending the fetched instruction’s opcode and function field bits to the control unit and Instruction Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 Control Unit l reading two values from the Register File -Register File addresses are contained in the instruction Fetch PC = PC+4 DecodeExec

CSE331 W09.16Irwin Fall 2007 PSU  Note that both RegFile read ports are active for all instructions during the Decode cycle using the rs and rt instruction field addresses l Since haven’t decoded the instruction yet, don’t know what the instruction is ! l Just in case the instruction uses values from the RegFile do “work ahead” by reading the two source operands Which instructions do make use of the RegFile values?  Also, all instructions (except j ) use the ALU after reading the registers Why? memory-reference? arithmetic? control flow? Reading Registers “Just in Case”

CSE331 W09.17Irwin Fall 2007 PSU Executing R Format Operations  R format operations ( add, sub, slt, and, or ) l perform operation (op and funct) on values in rs and rt l store the result back into the Register File (into location rd) Instruction Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 ALU overflow zero ALU controlRegWrite R-type: oprsrtrdfunctshamt 10 Note that Register File is not written every cycle (e.g. sw ), so we need an explicit write control signal for the Register File Fetch PC = PC+4 DecodeExec

CSE331 W09.18Irwin Fall 2007 PSU Consider slt Instruction  R format operations ( add, sub, slt, and, or ) l perform operation (op and funct) on values in rs and rt l store the result back into the Register File (into location rd) Instruction Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 ALU overflow zero ALU controlRegWrite R-type: oprsrtrdfunctshamt 10 Note that Register File is not written every cycle (e.g. sw ), so we need an explicit write control signal for the Register File Fetch PC = PC+4 DecodeExec

CSE331 W09.19Irwin Fall 2007 PSU  Remember the R format instruction slt slt $t0, $s0, $s1 # if $s0 < $s1 # then $t0 = 1 # else $t0 = 0 2 Consider the slt Instruction l Where does the 1 (or 0) come from to store into $t0 in the Register File at the end of the execute cycle? Instruction Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 ALU overflow zero ALU controlRegWrite

CSE331 W09.20Irwin Fall 2007 PSU End of First Lecture of the Week

CSE331 W09.21Irwin Fall 2007 PSU

CSE331 W09.22Irwin Fall 2007 PSU Executing Load and Store Operations  Load and store operations have to l compute a memory address by adding the base register (in rs) to the 16-bit signed offset field in the instruction -base register was read from the Register File during decode -offset value in the low order 16 bits of the instruction must be sign extended to create a 32-bit signed value l store value, read from the Register File during decode, must be written to the Data Memory l load value, read from the Data Memory, must be stored in the Register File I-Type: oprsrt address offset

CSE331 W09.23Irwin Fall 2007 PSU Executing Load and Store Operations, con’t Instruction Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 ALU overflow zero ALU controlRegWrite Data Memory Address Write Data Read Data Sign Extend MemWrite MemRead

CSE331 W09.24Irwin Fall 2007 PSU Executing Load and Store Operations, con’t Instruction Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 ALU overflow zero ALU controlRegWrite Data Memory Address Write Data Read Data Sign Extend MemWrite MemRead 1632

CSE331 W09.25Irwin Fall 2007 PSU Executing Branch Operations  Branch operations have to compare the operands read from the Register File during decode (rs and rt values) for equality ( zero ALU output) l compute the branch target address by adding the updated PC to the sign extended16-bit signed offset field in the instruction -“base register” is the updated PC -offset value in the low order 16 bits of the instruction must be sign extended to create a 32-bit signed value and then shifted left 2 bits to turn it into a word address I-Type: oprsrt address offset

CSE331 W09.26Irwin Fall 2007 PSU Executing Branch Operations, con’t Instruction Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 ALU zero ALU control Sign Extend 1632 Shift left 2 Add 4 PC Branch target address (to branch control logic)

CSE331 W09.27Irwin Fall 2007 PSU Executing Branch Operations, con’t Instruction Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 ALU zero ALU control Sign Extend 1632 Shift left 2 Add 4 PC Branch target address (to branch control logic)

CSE331 W09.28Irwin Fall 2007 PSU Executing Jump Operations  Jump operations have to l replace the lower 28 bits of the PC with the lower 26 bits of the fetched instruction shifted left by 2 bits Read Address Instruction Memory Add PC 4 Shift left 2 Jump address J-Type: op jump target address

CSE331 W09.29Irwin Fall 2007 PSU Creating a Single Datapath from the Parts  Assemble the datapath elements, add control lines as needed, and design the control path  Fetch, decode and execute each instructions in one clock cycle – single cycle design l no datapath resource can be used more than once per instruction, so some must be duplicated (e.g., why we have a separate Instruction Memory and Data Memory) l to share datapath elements between two different instruction classes will need multiplexors at the input of the shared elements with control lines to do the selection  Cycle time is determined by length of the longest path

CSE331 W09.30Irwin Fall 2007 PSU Fetch, R, and Memory Access Portions Read Address Instruction Memory Add PC 4 Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 ALU ovf zero ALU controlRegWrite Data Memory Address Write Data Read Data MemWrite MemRead Sign Extend 1632

CSE331 W09.31Irwin Fall 2007 PSU Multiplexor Insertion MemtoReg Read Address Instruction Memory Add PC 4 Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 ALU ovf zero ALU controlRegWrite Data Memory Address Write Data Read Data MemWrite MemRead Sign Extend 1632 ALUSrc

CSE331 W09.32Irwin Fall 2007 PSU Clock Distribution MemtoReg Read Address Instruction Memory Add PC 4 Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 ALU ovf zero ALU control RegWrite Data Memory Address Write Data Read Data MemWrite MemRead Sign Extend 1632 ALUSrc System Clock clock cycle

CSE331 W09.33Irwin Fall 2007 PSU Adding the Branch Portion Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 ALU ovf zero ALU controlRegWrite Data Memory Address Write Data Read Data MemWrite MemRead Sign Extend 1632 MemtoReg ALUSrc Read Address Instruction Memory Add PC 4

CSE331 W09.34Irwin Fall 2007 PSU Adding the Branch Portion Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 ALU ovf zero ALU controlRegWrite Data Memory Address Write Data Read Data MemWrite MemRead Sign Extend 1632 MemtoReg ALUSrc Read Address Instruction Memory Add PC 4 Shift left 2 Add PCSrc

CSE331 W09.35Irwin Fall 2007 PSU  We wait for everything to settle down l ALU might not produce “right answer” right away l Memory and RegFile reads are combinational (as are ALU, adders, muxes, shifter, signextender) l Use write signals along with the clock edge to determine when to write to the sequential elements (to the PC, to the Register File and to the Data Memory)  The clock cycle time is determined by the logic delay through the longest path Our Simple Control Structure We are ignoring some details like register setup and hold times