What You Will Learn In Next Few Sets of Lectures

Presentation transcript:

What You Will Learn in the Next Few Sets of Lectures
• Basic CPU Architecture
• Single Cycle Data Path Design
• Single Cycle Controller Design
• Multiple Cycle Data Path Design
• Multiple Cycle Controller Design
Savio Chau

Five Classic Components of a Computer
• Processor (CPU): Control and Datapath
• Memory
• Input
• Output
Today’s Topic: Designing a Single Cycle Datapath

The Processor
The processor executes the program instructions. It has 2 major components:
• Datapath: hardware to execute each machine instruction. It consists of a cascade of combinational and state elements (e.g., Arithmetic Logic Unit (ALU), shifters, registers, multipliers).
• Control: generates the signals telling the datapath what to do at each clock cycle, so that an instruction executes either in a single cycle or as a series of small steps over multiple cycles.

A Simplified Processor Model
Simplified Execution Cycle: Instruction Fetch, Instruction Decode, Operand Fetch, Execute, Result Store, Next Instruction
[Figure labels: Memory, I/O, Data, Address, Control, Program Counter, Instruction Register, Control, Register File, ALU, Data Path]

Execution Cycle

Steps to Design a Processor
1. Analyze the instruction set
   • Define the instruction set to be implemented
   • Specify the requirements for the data path
   • Specify the physical implementation
2. Select a set of datapath components and establish the clock methodology
3. Assemble a data path meeting the requirements
4. Analyze the implementation of each instruction to determine the setting of the control points that effect the register transfers
5. Assemble the control logic
The steps fall into two groups: Datapath Design and Control Logic Design.
MIPS makes it easier:
• Instructions are all the same size
• Source registers are always in the same place
• Immediates have the same size and location
• Operations are always on registers/immediates

Step 1: Analyze the Instruction Set
a) Defining the Instruction Set Architecture
Define the Functions of Each Instruction:
• Data Movement: load, store
• Arithmetic and Logic: add, sub, ori, and, or, slt
• Program Control: beq, jump
For Each Instruction, Specify:
• Instruction Mnemonics (Assembly Language)
• Instruction Format and Op Codes (Machine Language)

Example: Subset of MIPS ISA to be Implemented

Step 1: Analyze the Instruction Set
b) Specify Requirements for the Data Path
• Where and how to fetch the instruction: where are the instructions stored?
• Instruction format or encoding: how is it decoded?
• Location of operands: where to find the operands? How many explicit operands?
• Data type and size
• Type of operations
• Location of results: where to store the results?
• Successor instruction: how to determine the next instruction? (Next-address logic for jumps and conditional branches. In fetch-decode-execute, the next address is implicit!)

Step 1: Analyze the Instruction Set
c) Specify the Physical Implementation
Write Register Transfer Language (RTL) for the ISA:
• Specify what state elements (registers, memories, flip-flops) are needed to implement the instructions
• Describe how signals are transferred among state elements
There are many types of RTLs. Examples: VHDL and Verilog.
An informal RTL is used in this class.
Syntax: variable <- expression
• variable is either a register or a signal or signal group. (Note: use the following convention in this class. A variable is a register if it is all caps or in the form array[address]. Otherwise it is a signal or signal group.)
• expression is a function of input signals and the outputs of other state elements.

RTL Conventions for This Class
Register names: either all upper case, underlined, or in array format. Examples:
  REG        # all upper case
  Reg        # not all upper case but underlined
  Reg[10]    # 10th register in a register file
Signal names or signal group names: neither all upper case nor underlined. Examples: Output, output
Register transfers:
  A <- B         # register to register
  REG <- input   # signal to register
Each register write statement is assumed to take one clock unless it is grouped by { }. A register read doesn’t take any clock. Examples:
  A <- B         # reg to reg
  C <- A
  Takes 2 clocks. Write transfers are sequential.

  { A <- B       # reg to reg
    C <- A }
  Takes 1 clock. Write transfers are in parallel.

  a <- B         # reg to signal
  c <- A
  Takes 0 clock. Read transfer is immediate.
[Figure labels: REG, input, output, clock]
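The difference between sequential and grouped ({ }) register writes can be sketched in Python. This is only an illustration of the convention above, not part of the original slides: the dictionary of registers and the function names `sequential` and `parallel` are assumptions made for the example.

```python
def sequential(regs):
    """A <- B, then C <- A: two clocks, one register write per clock."""
    regs = dict(regs)
    regs["A"] = regs["B"]   # clock 1
    regs["C"] = regs["A"]   # clock 2: reads the value written in clock 1
    return regs


def parallel(regs):
    """{ A <- B  C <- A }: one clock, both writes read the old values."""
    old = dict(regs)        # register outputs before the clock edge
    new = dict(regs)
    new["A"] = old["B"]
    new["C"] = old["A"]     # still the old A, not B
    return new


start = {"A": 1, "B": 2, "C": 3}
print(sequential(start))    # {'A': 2, 'B': 2, 'C': 2}
print(parallel(start))      # {'A': 2, 'B': 2, 'C': 1}
```

In the grouped form C receives the old contents of A, which is why the two forms are not interchangeable.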

Register Transfer in RTL
• RTL:
  A <- A + B
  B <- (A + B) xor C
  C <- B
• Can also be written as:
  AOut <- A + B
  XOut <- AOut xor C
  B <- XOut

RTL: Bit Level Description
• Use angle brackets to denote the bits in a register or signal group, e.g., A<31:0> means bit 31 to bit 0 of register A
  F <- E<26:23>
  E <- E + SignExtend(F)
• Another way of expressing F <- E<26:23>:
  F<3:0> <- E<26:23>
• Alternatively:
  F<3> <- E<26>
  F<2> <- E<25>
  F<1> <- E<24>
  F<0> <- E<23>
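As a rough software analogue of the bit-level notation, the sketch below extracts E<26:23> and sign-extends it. The helper names `bits` and `sign_extend` and the 32-bit register width are assumptions chosen for illustration.

```python
MASK32 = 0xFFFFFFFF


def bits(value, hi, lo):
    """Return value<hi:lo> as an unsigned integer."""
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)


def sign_extend(value, width, to_width=32):
    """Replicate bit (width - 1) of value into the upper bits of a to_width-bit word."""
    sign = 1 << (width - 1)
    value &= (1 << width) - 1
    signed = (value ^ sign) - sign          # signed Python integer
    return signed & ((1 << to_width) - 1)


E = 0x07800000                              # bits 26..23 are all ones
F = bits(E, 26, 23)                         # F <- E<26:23>
E = (E + sign_extend(F, 4)) & MASK32        # E <- E + SignExtend(F)
print(hex(F), hex(E))                       # 0xf 0x77fffff
```

Here F is 1111 in binary, which sign-extends to -1, so E decreases by one.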

RTL: Memory Description
• Memory is described as an array
• General purpose registers are described as an array
e.g.,
  Mem[100]   Contents of address 100 in memory
  R[6]       Contents of Register 6
  R[rs]      Contents of the register whose register number is specified by the signal rs

RTL: Conditionals
• Conditionals can also be used in RTL, e.g.,
  RTL: if (Select = 0) then Output <- Input_0
       else if (Select = 1) then Output <- Input_1
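The conditional above describes a 2-to-1 multiplexer. A minimal Python sketch, with the function name `mux` chosen here for illustration:

```python
def mux(select, input_0, input_1):
    """2-to-1 multiplexer: Output <- Input_0 if Select = 0, Input_1 if Select = 1."""
    if select == 0:
        return input_0
    elif select == 1:
        return input_1
    raise ValueError("Select must be 0 or 1")


print(mux(0, 0xAAAA, 0x5555) == 0xAAAA)   # True
print(mux(1, 0xAAAA, 0x5555) == 0x5555)   # True
```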

Register Transfer Language and Clocking
Register transfer in RTL: R2 <- f(R1)
What Really Happens Physically:
[Timing diagram: Clk, R1, and R2, with Setup, Hold, and Don’t Care regions marked]
Setup (Hold): short time before (after) clocking during which the inputs can’t change, or they might corrupt the output.
Two possible clocking methodologies: positively triggered or negatively triggered. This class uses negatively triggered clocking.

Instructions and RTL for the MIPS Subset
Add. RTL:
  instr <- mem[PC]           Instruction Fetch
  rs <- instr<25:21>         Define Signals (Fields) of Instr
  rt <- instr<20:16>
  rd <- instr<15:11>
  R[rd] <- R[rs] + R[rt]     Add Register Contents
  PC <- PC + 4               Update Program Counter
Subtract. RTL:
  instr <- mem[PC]           Instruction Fetch
  rs <- instr<25:21>         Define Signals (Fields) of Instr
  rt <- instr<20:16>
  rd <- instr<15:11>
  R[rd] <- R[rs] - R[rt]     Subtract Register Contents
  PC <- PC + 4               Update Program Counter
The signal definitions take 0 clock.
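The add and subtract RTL can be traced in a small Python model. This is a sketch under stated assumptions: the instruction memory is a dictionary keyed by byte address, the register file is a list of 32 integers, the function name `step_rtype` is hypothetical, and add versus sub is told apart by the standard MIPS funct field (0x20 for add, 0x22 for sub), which this slide does not show.

```python
MASK32 = 0xFFFFFFFF


def step_rtype(pc, regs, imem):
    """Execute one add or sub instruction and return the next PC."""
    instr = imem[pc]                 # instr <- mem[PC]
    rs = (instr >> 21) & 0x1F        # rs <- instr<25:21>
    rt = (instr >> 16) & 0x1F        # rt <- instr<20:16>
    rd = (instr >> 11) & 0x1F        # rd <- instr<15:11>
    funct = instr & 0x3F             # standard MIPS funct field (assumption)
    if funct == 0x20:                # add
        result = regs[rs] + regs[rt]
    elif funct == 0x22:              # sub
        result = regs[rs] - regs[rt]
    else:
        raise ValueError("only add and sub are modeled")
    regs[rd] = result & MASK32       # R[rd] <- R[rs] op R[rt]
    return pc + 4                    # PC <- PC + 4


regs = [0] * 32
regs[1], regs[2] = 5, 7
imem = {0: 0x00221820,               # add $3, $1, $2
        4: 0x00222022}               # sub $4, $1, $2
pc = step_rtype(0, regs, imem)
pc = step_rtype(pc, regs, imem)
print(regs[3], hex(regs[4]), pc)     # 12 0xfffffffe 8
```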

Instructions and RTL for the MIPS Subset (continued)
Load. RTL:
  instr <- mem[PC]                         Instruction Fetch
  rs <- instr<25:21>                       Define Signals (Fields) of Instr
  rt <- instr<20:16>
  imm16 <- instr<15:0>
  addr <- R[rs] + sign_extend(imm16)       Calculate Memory Address
  R[rt] <- Mem[addr]                       Load Data into Register
  PC <- PC + 4                             Update Program Counter
The signal definitions take 0 clock.
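A sketch of the load RTL in Python, assuming a word-granular data memory modeled as a dictionary keyed by byte address. The helper names `sign_ext16` and `exec_lw` are illustrative only.

```python
MASK32 = 0xFFFFFFFF


def sign_ext16(imm16):
    """Interpret a 16-bit field as a signed integer."""
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def exec_lw(instr, regs, mem):
    """R[rt] <- Mem[R[rs] + sign_extend(imm16)]"""
    rs = (instr >> 21) & 0x1F                          # rs <- instr<25:21>
    rt = (instr >> 16) & 0x1F                          # rt <- instr<20:16>
    imm16 = instr & 0xFFFF                             # imm16 <- instr<15:0>
    addr = (regs[rs] + sign_ext16(imm16)) & MASK32     # calculate memory address
    regs[rt] = mem[addr]                               # load data into register


regs = [0] * 32
regs[9] = 104
mem = {100: 0xDEADBEEF}
exec_lw(0x8D28FFFC, regs, mem)       # lw $8, -4($9)
print(hex(regs[8]))                  # 0xdeadbeef
```

The negative offset shows why the immediate is sign-extended: 104 + (-4) addresses word 100.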

Instructions and RTL for the MIPS Subset (continued)
Store. RTL:
  instr <- mem[PC]                         Instruction Fetch
  rs <- instr<25:21>                       Define Signals (Fields) of Instr
  rt <- instr<20:16>
  imm16 <- instr<15:0>
  addr <- R[rs] + sign_ext(imm16)          Calculate Memory Address
  Mem[addr] <- R[rt]                       Store Register Data Into Memory
  PC <- PC + 4                             Update Program Counter
OR Immediate. RTL:
  instr <- mem[PC]                         Instruction Fetch
  rs <- instr<25:21>                       Define Signals (Fields) of Instr
  rt <- instr<20:16>
  imm16 <- instr<15:0>
  R[rt] <- R[rs] or zero_ext(imm16)        Logical OR
  PC <- PC + 4                             Update Program Counter
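The store and OR-immediate RTL differ in how imm16 is extended: sign extension for the address, zero extension for the logical OR. The following Python sketch shows both; the function names and the dictionary-based data memory are assumptions for illustration.

```python
MASK32 = 0xFFFFFFFF


def sign_ext16(imm16):
    """Interpret a 16-bit field as a signed integer."""
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def fields(instr):
    """Return (rs, rt, imm16) from an I-format instruction word."""
    return (instr >> 21) & 0x1F, (instr >> 16) & 0x1F, instr & 0xFFFF


def exec_sw(instr, regs, mem):
    """Mem[R[rs] + sign_ext(imm16)] <- R[rt]"""
    rs, rt, imm16 = fields(instr)
    addr = (regs[rs] + sign_ext16(imm16)) & MASK32
    mem[addr] = regs[rt]


def exec_ori(instr, regs):
    """R[rt] <- R[rs] or zero_ext(imm16)"""
    rs, rt, imm16 = fields(instr)
    regs[rt] = regs[rs] | imm16      # upper 16 bits of the immediate are zero


regs = [0] * 32
regs[9] = 1
mem = {}
exec_ori(0x35288000, regs)           # ori $8, $9, 0x8000
exec_sw(0xAD280008, regs, mem)       # sw $8, 8($9)
print(hex(regs[8]), mem)             # 0x8001 {9: 32769}
```

With sign extension the ori result would have been 0xFFFF8001; zero extension leaves the upper half of R[rs] untouched.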

Instructions and RTL for the MIPS Subset (continued)
Branch on Equal. RTL:
  instr <- mem[PC]                         Instruction Fetch
  rs <- instr<25:21>                       Define Signals (Fields) of Instr
  rt <- instr<20:16>
  imm16 <- instr<15:0>
  branch_cond <- R[rs] - R[rt]             Calculate Branch Condition
  if (branch_cond eq 0)                    Calculate Next Instruction Address
    then PC <- PC + 4 + (sign_ext(imm16) * 4)
    else PC <- PC + 4
Jump. RTL:
  instr <- mem[PC]                                  Instruction Fetch
  PC_incr <- PC + 4                                 Increment Program Counter
  PC<31:2> <- PC_incr<31:28> concat target<25:0>    Calculate Next Instr. Addr.
Note: PC<1:0> is “00” for a word address, so it is not necessary to implement PC<1:0>
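The next-address calculations for beq and jump can be checked with a short Python sketch. The function names are hypothetical, and target is assumed to be the low 26 bits of the instruction word (the standard MIPS J-format field), which the RTL above uses without defining.

```python
MASK32 = 0xFFFFFFFF


def sign_ext16(imm16):
    """Interpret a 16-bit field as a signed integer."""
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def next_pc_beq(pc, instr, regs):
    """Next PC for beq rs, rt, imm16."""
    rs = (instr >> 21) & 0x1F
    rt = (instr >> 16) & 0x1F
    imm16 = instr & 0xFFFF
    branch_cond = (regs[rs] - regs[rt]) & MASK32
    if branch_cond == 0:
        return (pc + 4 + sign_ext16(imm16) * 4) & MASK32
    return (pc + 4) & MASK32


def next_pc_jump(pc, instr):
    """PC<31:2> <- PC_incr<31:28> concat target<25:0>, with PC<1:0> = 00."""
    target = instr & 0x03FFFFFF          # target<25:0> (assumed field)
    pc_incr = (pc + 4) & MASK32
    return (pc_incr & 0xF0000000) | (target << 2)


regs = [0] * 32
regs[1] = regs[2] = 42
print(hex(next_pc_beq(0x100, 0x1022FFFD, regs)))    # beq $1, $2, -3 taken: 0xf8
regs[2] = 7
print(hex(next_pc_beq(0x100, 0x1022FFFD, regs)))    # not taken: 0x104
print(hex(next_pc_jump(0x10000000, 0x08000040)))    # j 0x40: 0x10000100
```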

Step 2: Select Basic Processor Elements Possible Elements to be Used in Data Path

Data Path Element Example: ALU
[Figure: 32-bit ALU built from 1-bit slices ALU0, ALU1, ..., ALU31. Signals shown: a0 to a31, b0 to b31, result0 to result31, Cin, Cout, Less, Binvert, op[1:0], set, overflow, zero. Each slice contains an adder (a, b, cin, cout, sum) and a result multiplexer selected by op[1:0]; the last slice adds overflow detection.]
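A bit-slice ALU of this kind can be modeled in Python, one loop iteration per slice. This is a sketch, not the slide's circuit: the op encoding (0 = and, 1 = or, 2 = add, 3 = set on less than) follows the common textbook convention and is an assumption here, as is computing set as the top sum bit XOR overflow so that the comparison stays correct when the subtraction overflows.

```python
def alu(a, b, binvert, op):
    """Ripple-carry model of a 32-slice ALU.

    op: 0 = and, 1 = or, 2 = add, 3 = set on less than (assumed encoding).
    binvert = 1 inverts b and feeds 1 into the first carry-in, giving a - b.
    Returns (result, zero, overflow).
    """
    carry = binvert                  # Cin of slice 0
    result = 0
    overflow = 0
    set_bit = 0
    for i in range(32):
        ai = (a >> i) & 1
        bi = ((b >> i) & 1) ^ binvert
        total = ai ^ bi ^ carry      # adder sum of this slice
        cout = (ai & bi) | (ai & carry) | (bi & carry)
        if i == 31:
            overflow = carry ^ cout          # overflow detection in the last slice
            set_bit = total ^ overflow       # sign of the true difference
        less = 0                     # Less input; slice 0 is patched in below
        bit = (ai & bi, ai | bi, total, less)[op]
        result |= bit << i
        carry = cout
    if op == 3:
        result = set_bit             # set feeds the Less input of slice 0
    return result, result == 0, overflow


print(alu(5, 7, 0, 2))               # add: (12, False, 0)
print(alu(7, 7, 1, 2))               # sub: (0, True, 0)
print(alu(5, 7, 1, 3))               # slt: (1, False, 0)
```

The zero output of a subtraction is what a beq comparison can use as its branch condition.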

Data Path Element Example: Register File Clock Signal

Implementation of Register File clock

Data Path Element Example: An Idealized Memory

Step 3: Assemble the Datapath
Put Together a Datapath for R-Type Instruction
General format: Op rd, rs, rt (e.g., add rd, rs, rt)
  instr <- mem[PC]           Instruction Fetch
  rs <- instr<25:21>         Define Signals (Fields) of Instr
  rt <- instr<20:16>
  rd <- instr<15:11>
  R[rd] <- R[rs] + R[rt]     Add Register Contents
  PC <- PC + 4               Update Program Counter
[Figure labels: PC, Next Address Logic (PC+4), Instruction Memory, Register File (Rd addr1 = rs, Rd addr2 = rt, Wr addr = rd, Wr data), ALU]

Step 3: Assemble the Datapath
Details of Instruction Fetch Unit
The Common RTL Operations:
• Fetch the instruction and define the signal fields of the instruction:
  instr <- mem[PC]; rs <- instr<25:21>; rt <- instr<20:16>; rd <- instr<15:11>; imm16 <- instr<15:0>
• Update the Program Counter:
  Sequential code: PC <- PC + 4
  Branch and jump: PC <- “something else”
[Figure: Next Address Logic and PC (clocked by Clk) address the Instruction Memory, which holds Instruction #1 to Instruction #6 at addresses 00, 04, 08, 12, 16, 20. The 32-bit Instr<31:0> is split into rs <25:21>, rt <20:16>, rd <15:11>, and imm16 <15:0>, which go to the data path.]

Operations of R-Type Instruction Datapath
• R[rd] <- R[rs] op R[rt]   Example: add rd, rs, rt
  instr <- mem[PC]           Instruction Fetch
  rs <- instr<25:21>         Define Signals (Fields) of Instr
  rt <- instr<20:16>
  rd <- instr<15:11>
  R[rd] <- R[rs] + R[rt]     Add Register Contents
  PC <- PC + 4               Update Program Counter
• ALUctr and RegWr: control signals from the control logic
[Figure labels: Instruction Memory, PC, clock, rs, rt, rd]

Details of R-Type Instruction Timing
[Timing diagram: after the clock edge, each signal changes from Old Value to New Value following, in order, Clk-to-Q, Instruction Memory Access Time, Delay Through Control Logic (control signals), Register File Access Time, and ALU Delay]

Step 3: Assemble the Datapath (continued)
Put Together a Datapath for Load Instruction
lw rt, immed16(rs)
  Instr <- mem[PC]                         Instruction Fetch
  rs <- Instr<25:21>                       Define Signals (Fields) of Instr
  rt <- Instr<20:16>
  imm16 <- Instr<15:0>
  Addr <- R[rs] + SignExtend(imm16)        Calculate Memory Address
  R[rt] <- Mem[Addr]                       Load Data into Register
  PC <- PC + 4                             Update Program Counter
[Figure labels: PC, Next Address Logic (PC+4), Instruction Memory, Register File (Rd addr1 = rs, Wr addr = rt, Wr data), imm16, ext, ALU, Data Memory (addr, data in, data out)]

Operations of the Datapath for Load Instruction
• R[rt] <- Mem[R[rs] + SignExt(imm16)]   Example: lw rt, imm16(rs)
[Figure labels: Instruction Memory, PC, clock, rs, rt, data]

Timing of a Load Instruction
[Timing diagram: after the clock edge, each signal changes from Old Value to New Value following, in order, Clk-to-Q, Instruction Memory Access Time, Delay Through Control Logic, Register File Access Time, Delay through Extender & Mux, ALU Delay, and Data Memory Access & MUX Time. Signals labeled: RegWr, busA, busB, Address, busW]

Step 3: Assemble the Datapath (continued)
Put Together a Datapath for Store Instruction
sw rt, immed16(rs)
  Instr <- mem[PC]                         Instruction Fetch
  rs <- Instr<25:21>                       Define Signals (Fields) of Instr
  rt <- Instr<20:16>
  imm16 <- Instr<15:0>
  Addr <- R[rs] + SignExt(imm16)           Calculate Memory Address
  Mem[Addr] <- R[rt]                       Store Register Data Into Memory
  PC <- PC + 4                             Update Program Counter
[Figure labels: PC, Next Address Logic (PC+4), Instruction Memory, Register File (Rd addr1 = rs, Rd addr2 = rt), imm16, ext, ALU, Data Memory (addr, data in, data out)]

Operations of the Datapath for Store Instruction
[Figure labels: Instruction Memory, PC, clock, rs, rt, mem=rt]

Step 3: Assemble the Datapath (continued)
Put Together a Datapath for I-Type Instruction
General format: Op rt, rs, immed16 (e.g., ori rt, rs, immed16)
  Instr <- mem[PC]                         Instruction Fetch
  rs <- Instr<25:21>                       Define Signals (Fields) of Instr
  rt <- Instr<20:16>
  imm16 <- Instr<15:0>
  R[rt] <- R[rs] or ZeroExt(imm16)         Logical OR
  PC <- PC + 4                             Update Program Counter
[Figure labels: PC, Next Address Logic (PC+4), Instruction Memory, Register File (Rd addr1 = rs, Wr addr = rt, Wr data), imm16, ext, ALU]

Operations of the I-Type Instruction Datapath
• R[rt] <- R[rs] op ZeroExt(imm16); op = +, -, and, or, etc.   Example: ori rt, rs, imm16
[Figure labels: Instruction Memory, PC, clock, rs, rt]

Step 3: Assemble the Datapath (continued)
Put Together a Datapath for Branch Instruction
beq rs, rt, immed16
  Instr <- mem[PC]                         Instruction Fetch
  rs <- Instr<25:21>                       Define Signals (Fields) of Instr
  rt <- Instr<20:16>
  imm16 <- Instr<15:0>
  branch_cond <- R[rs] - R[rt]             Calculate Branch Condition
  if (branch_cond eq 0)                    Calculate Next Instruction Address
    then PC <- PC + 4 + (SignExt(imm16) * 4)
    else PC <- PC + 4
[Figure labels: PC, Next Address Logic (PC+4+imm16*4, branch_cond), Instruction Memory, Register File (Rd addr1 = rs, Rd addr2 = rt), imm16, ext, ALU]

Step 3: Assemble the Datapath (continued)
Combining Datapaths for Different Instructions
Example: Combining Data Paths for add and lw
• Data Path for lw: PC, Next Address Logic (PC+4), Instruction Memory, Register File (Rd addr1, Rd addr2, Wr addr, Wr data), rs, rt, imm16, ext, R[rs], ALU, Data Memory
• Data Path for add: PC, Next Address Logic (PC+4), Instruction Memory, Register File (Rd addr1, Rd addr2, Wr addr, Wr data), rs, rt, rd, R[rs], R[rt], ALU
• Combined Data Path: all of the elements above, plus a mux in front of Wr data so that Wr Data = ALU output or Mem[addr]
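The mux added when combining the add and lw data paths selects what is written back to the register file. A minimal sketch follows; the select name `mem_to_reg` borrows the MemtoReg label that appears on the complete datapath slide, and the dictionary-based data memory is an assumption for illustration.

```python
def wr_data(mem_to_reg, alu_output, data_memory):
    """Mux in front of the register file's Wr data port.

    mem_to_reg = 0: write the ALU output (add).
    mem_to_reg = 1: write Mem[addr], where addr is the ALU output (lw).
    """
    if mem_to_reg:
        return data_memory[alu_output]
    return alu_output


data_memory = {104: 0xCAFE}
add_result = wr_data(0, 5 + 7, data_memory)     # add: ALU output is the sum
lw_result = wr_data(1, 100 + 4, data_memory)    # lw: ALU output is the address
print(add_result, hex(lw_result))               # 12 0xcafe
```

The same ALU serves both instructions; only the select line differs.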

Operations of the Datapath for Branch Instruction
[Figure labels: Instruction Memory, clock, PC+4, PC+4+imm16*4, rs, rt]

Binary Arithmetic for the Next Address
In theory, the PC is a 32-bit byte address into the instruction memory:
• Sequential operation: PC<31:0> = PC<31:0> + 4
• Branch operation: PC<31:0> = PC<31:0> + 4 + SignExt(Imm16) * 4
The magic number “4” always comes up because:
• The 32-bit PC is a byte address
• And all our instructions are 4 bytes (32 bits) long
In other words:
• The 2 LSBs of the 32-bit PC are always zeros
• There is no reason to have hardware to keep the 2 LSBs
In practice, we can simplify the hardware by using a 30-bit PC<31:2>:
• Sequential operation: PC<31:2> = PC<31:2> + 1
• Branch operation: PC<31:2> = PC<31:2> + 1 + SignExt(imm16)
• In either case, Instruction Memory Address = PC<31:2> concat “00”
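The claim that the 30-bit PC gives the same instruction addresses as the 32-bit PC can be checked numerically. The Python sketch below compares the two formulations; all function names are assumptions made for the example.

```python
def sign_ext16(imm16):
    """Interpret a 16-bit field as a signed integer."""
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def next_pc32(pc, imm16, taken):
    """32-bit byte-address PC: PC + 4, plus SignExt(imm16) * 4 if the branch is taken."""
    offset = sign_ext16(imm16) * 4 if taken else 0
    return (pc + 4 + offset) & 0xFFFFFFFF


def next_pc30(pc30, imm16, taken):
    """30-bit word-address PC<31:2>: PC + 1, plus SignExt(imm16) if the branch is taken."""
    offset = sign_ext16(imm16) if taken else 0
    return (pc30 + 1 + offset) & 0x3FFFFFFF


def imem_address(pc30):
    """Instruction Memory Address = PC<31:2> concat "00"."""
    return pc30 << 2


for pc in (0x0, 0x400000, 0xFFFFFFFC):
    for imm16 in (0x0000, 0x0003, 0xFFFD):
        for taken in (False, True):
            assert imem_address(next_pc30(pc >> 2, imm16, taken)) == next_pc32(pc, imm16, taken)
print("30-bit and 32-bit next-address arithmetic agree")
```

The check includes wrap-around at both ends of the address space, where both versions wrap in the same way.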

Next Address Logic Including Branch Instructions
[Figure labels: clock, “If no branch” path]
1 MUX delay after branch decision is made

Next Address Logic: Cheaper Solution 1 MUX + 1 Adder delay after branch decision is made

A Complete Instruction Fetch Unit
Question: What is the data path for the Jump instruction?
Answer: None. The Jump instruction is handled by the Instruction Fetch Unit alone. We just need to add a MUX.

Putting It All Together: A Single Cycle Datapath
We Have Everything Except Control Signals (underlined in the figure)
[Figure: complete single cycle datapath. Elements: PC, PC Ext, Adder, MUX, Inst Memory, 32 32-bit Registers (Rw, Ra, Rb, busA, busB, busW, Clk), Extender, ALU (Equal output), Data Memory (Data In, WrEn, Adr). Instruction<31:0> fields: Rs <25:21>, Rt <20:16>, Rd <15:11>, Imm16 <15:0>. Control signals: nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr, MemtoReg]

Load Instruction in the Complete Data Path
[Figure: the complete single cycle datapath from the previous slide, annotated for a load: rs, rt, PC+4, and “data for rt”]