CS 152 Computer Architecture and Engineering Lecture 8 Single-Cycle (Con’t) Designing a Multicycle Processor February 23, 2004 John Kubiatowicz (www.cs.berkeley.edu/~kubitron)



CS152 / Kubiatowicz Lec8.2 2/23/04©UCB Spring 2004 Recap: A Single Cycle Datapath
°Rs, Rt, Rd and Imm16 are hardwired into the datapath from the Fetch Unit
°We have everything except the control signals (underlined in the figure)
[Figure: single-cycle datapath. The Instruction Fetch Unit (Clk, nPC_sel, Equal) supplies Rs, Rt, Rd and Imm16; the register file (5-bit Rw, Ra, Rb; 32-bit busA, busB, busW) feeds the Extender, muxes, ALU and Data Memory. Control signals: nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr, MemtoReg.]

CS152 / Kubiatowicz Lec8.3 2/23/04©UCB Spring 2004 Recap: Flexible Instruction Fetch
°Branch (nPC_sel = “Br”): if (Equal == 1) then PC = PC + 4 + SignExt[imm16]*4 ; else PC = PC + 4
°Other (nPC_sel = “+4”): PC = PC + 4
°What is the encoding of nPC_sel?
  Direct MUX select?
  Branch / not branch
°Let’s choose the second option
[Figure: next-PC logic. PC (Clk) addresses Inst Memory; an Adder computes PC + 4; imm16 with 00 appended forms the branch offset; nPC_sel and Equal produce nPC_MUX_sel, which selects the next PC.]
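
The next-PC rule on this slide can be sketched in a few lines of Python. This is only an illustration, assuming the branch target is PC + 4 + SignExt[imm16]*4 (as in the BEQ register transfer later in the lecture) and a 32-bit PC; the function names `sign_extend16` and `next_pc` are made up for the example.

```python
def sign_extend16(imm16: int) -> int:
    """Interpret a 16-bit field as a signed two's-complement integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def next_pc(pc: int, imm16: int, branch: bool, equal: bool) -> int:
    """Next-PC logic; `branch` plays the role of nPC_sel == "Br"."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    if branch and equal:
        # SignExt[imm16]*4 is the word offset shifted left by two bits.
        return (pc_plus_4 + sign_extend16(imm16) * 4) & 0xFFFFFFFF
    return pc_plus_4


if __name__ == "__main__":
    print(hex(next_pc(0x1000, 3, branch=True, equal=True)))   # taken branch
    print(hex(next_pc(0x1000, 3, branch=True, equal=False)))  # not taken
```

Encoding nPC_sel as "branch / not branch" keeps the Equal test inside the fetch unit, which is why the sketch takes both `branch` and `equal` as inputs.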

CS152 / Kubiatowicz Lec8.4 2/23/04©UCB Spring 2004 Recap: The Single Cycle Datapath during Add
°R[rd] <- R[rs] + R[rt]
°Instruction format: op | rs | rt | rd | shamt | funct
°Control: nPC_sel = +4, RegDst = 1, RegWr = 1, ExtOp = x, ALUSrc = 0, ALUctr = Add, MemWr = 0, MemtoReg = 0

CS152 / Kubiatowicz Lec8.5 2/23/04©UCB Spring 2004 Recap: The Single Cycle Datapath during Or Immediate
°R[rt] <- R[rs] or ZeroExt[Imm16]
°Instruction format: op | rs | rt | immediate
°Control: nPC_sel = +4, RegDst = 0, RegWr = 1, ExtOp = 0, ALUSrc = 1, ALUctr = Or, MemWr = 0, MemtoReg = 0

CS152 / Kubiatowicz Lec8.6 2/23/04©UCB Spring 2004 Recap: The Single Cycle Datapath during Load
°R[rt] <- Data Memory {R[rs] + SignExt[imm16]}
°Instruction format: op | rs | rt | immediate
°Control: nPC_sel = +4, RegDst = 0, RegWr = 1, ExtOp = 1, ALUSrc = 1, ALUctr = Add, MemWr = 0, MemtoReg = 1

CS152 / Kubiatowicz Lec8.7 2/23/04©UCB Spring 2004 Recap: The Single Cycle Datapath during Store
°Data Memory {R[rs] + SignExt[imm16]} <- R[rt]
°Instruction format: op | rs | rt | immediate
°Control: nPC_sel = +4, RegDst = x, RegWr = 0, ExtOp = 1, ALUSrc = 1, ALUctr = Add, MemWr = 1, MemtoReg = x

CS152 / Kubiatowicz Lec8.8 2/23/04©UCB Spring 2004 Recap: The Single Cycle Datapath during Branch
°if (R[rs] - R[rt] == 0) then Zero <- 1 ; else Zero <- 0
°Instruction format: op | rs | rt | immediate
°Control: nPC_sel = “Br”, RegDst = x, RegWr = 0, ExtOp = x, ALUSrc = 0, ALUctr = Sub, MemWr = 0, MemtoReg = x

CS152 / Kubiatowicz Lec8.9 2/23/04©UCB Spring 2004 Recap: A Summary of Control Signals
inst    Register Transfer
ADD     R[rd] <– R[rs] + R[rt]; PC <– PC + 4
        ALUsrc = RegB, ALUctr = “add”, RegDst = rd, RegWr, nPC_sel = “+4”
SUB     R[rd] <– R[rs] – R[rt]; PC <– PC + 4
        ALUsrc = RegB, ALUctr = “sub”, RegDst = rd, RegWr, nPC_sel = “+4”
ORi     R[rt] <– R[rs] or zero_ext(Imm16); PC <– PC + 4
        ALUsrc = Im, Extop = “Z”, ALUctr = “or”, RegDst = rt, RegWr, nPC_sel = “+4”
LOAD    R[rt] <– MEM[ R[rs] + sign_ext(Imm16) ]; PC <– PC + 4
        ALUsrc = Im, Extop = “Sn”, ALUctr = “add”, MemtoReg, RegDst = rt, RegWr, nPC_sel = “+4”
STORE   MEM[ R[rs] + sign_ext(Imm16) ] <– R[rt]; PC <– PC + 4
        ALUsrc = Im, Extop = “Sn”, ALUctr = “add”, MemWr, nPC_sel = “+4”
BEQ     if ( R[rs] == R[rt] ) then PC <– PC + 4 + ( sign_ext(Imm16) || 00 ) else PC <– PC + 4
        nPC_sel = “Br”, ALUctr = “sub”

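CS152 / Kubiatowicz Lec8.10 2/23/04©UCB Spring 2004 Step 5: Assemble Control logic
[Figure: Inst Memory (Adr) delivers the Instruction; its Op and Fun fields feed a Decoder that drives nPC_sel, RegWr, RegDst, ExtOp, ALUSrc, ALUctr, MemWr and MemtoReg into the DATA PATH, which also receives Rs, Rt, Rd and Imm16 and returns Equal.]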

CS152 / Kubiatowicz Lec8.11 2/23/04©UCB Spring 2004 A Summary of the Control Signals
            add    sub        ori    lw     sw     beq
RegDst      1      1          0      0      x      x
ALUSrc      0      0          1      1      1      0
MemtoReg    0      0          0      1      x      x
RegWrite    1      1          1      1      0      0
MemWrite    0      0          0      0      1      0
nPCsel      0      0          0      0      0      1
ExtOp       x      x          0      1      1      x
ALUctr      Add    Subtract   Or     Add    Add    Subtract
(x: We Don’t Care :-) ; func and op encodings: see Appendix A)
°Instruction formats:
  R-type (add, sub): op | rs | rt | rd | shamt | funct
  I-type (ori, lw, sw, beq): op | rs | rt | immediate
  J-type (jump): op | target address

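CS152 / Kubiatowicz Lec8.12 2/23/04©UCB Spring 2004 The Concept of Local Decoding
            R-type     ori    lw     sw     beq
RegDst      1          0      0      x      x
ALUSrc      0          1      1      1      0
MemtoReg    0          0      1      x      x
RegWrite    1          1      1      0      0
MemWrite    0          0      0      1      0
Branch      0          0      0      0      1
ExtOp       x          0      1      1      x
ALUop       “R-type”   Or     Add    Add    Subtract
[Figure: Main Control (input op, 6 bits) produces ALUop (N bits); ALU Control (Local) combines ALUop with func (6 bits) to produce ALUctr (3 bits) for the ALU.]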

CS152 / Kubiatowicz Lec8.13 2/23/04©UCB Spring 2004 The Encoding of ALUop
°ALUop has to be 2 bits wide to represent:
  (1) “R-type” instructions
  “I-type” instructions that require the ALU to perform: (2) Or, (3) Add, and (4) Subtract
                    R-type     ori    lw     sw     beq
ALUop (Symbolic)    “R-type”   Or     Add    Add    Subtract
[Figure: Main Control (op, 6 bits) -> ALUop (N bits) -> ALU Control (Local) with func (6 bits) -> ALUctr (3 bits).]

CS152 / Kubiatowicz Lec8.14 2/23/04©UCB Spring 2004 The Decoding of the “func” Field
°R-type format: op | rs | rt | rd | shamt | funct
°funct selects the instruction operation: add, subtract, and, or, set-on-less-than
°ALUctr selects the ALU operation (P. 286 text): And, Or, Add, Subtract, Set-on-less-than
                    R-type     ori    lw     sw     beq
ALUop (Symbolic)    “R-type”   Or     Add    Add    Subtract
[Figure: Main Control (op, 6 bits) -> ALUop (N bits) -> ALU Control (Local) with func (6 bits) -> ALUctr (3 bits) -> ALU.]

CS152 / Kubiatowicz Lec8.15 2/23/04©UCB Spring 2004 The Truth Table for ALUctr
                    R-type     ori    lw     sw     beq
ALUop (Symbolic)    “R-type”   Or     Add    Add    Subtract
[Table: ALUctr as a function of ALUop and funct; the funct rows cover add, subtract, and, or, set-on-less-than.]
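
The local decoding idea can be sketched as a small Python function: the main control passes a symbolic ALUop, and only for “R-type” does the local ALU control look at funct. This is a hedged illustration, not the slide's truth table: the funct values are the standard MIPS encodings (not recoverable from this transcript), and the names `FUNCT_TO_OPERATION` and `alu_control` are invented for the example.

```python
# Standard MIPS funct encodings for the five R-type operations named on the slide.
FUNCT_TO_OPERATION = {
    0b100000: "Add",
    0b100010: "Subtract",
    0b100100: "And",
    0b100101: "Or",
    0b101010: "Set-on-less-than",
}

# Symbolic ALUop values that map straight to an ALU operation (I-type cases).
DIRECT_ALUOP = {"Or": "Or", "Add": "Add", "Subtract": "Subtract"}


def alu_control(aluop: str, funct: int) -> str:
    """Return the symbolic ALU operation for a symbolic ALUop and a funct field."""
    if aluop == "R-type":
        # Only R-type instructions need the funct field decoded locally.
        return FUNCT_TO_OPERATION[funct]
    return DIRECT_ALUOP[aluop]


if __name__ == "__main__":
    print(alu_control("R-type", 0b100010))  # Subtract
    print(alu_control("Add", 0b000000))     # Add (lw/sw ignore funct)
```

Keeping funct out of the main control means the main decoder only sees the 6-bit op field, which is the point of splitting the decode in two.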

CS152 / Kubiatowicz Lec8.16 2/23/04©UCB Spring 2004 Step 5: Logic for each control signal
`define Rtype 6'b000000
`define BEQ   6'b000100
`define ORi   6'b001101
`define Load  6'b100011
`define Store 6'b101011
… etc …
nPC_sel  <= (OP == `BEQ) ? `Br : `plus4;
// BEQ compares two registers, so it also selects regB (ALUSrc = 0 on the branch slide)
ALUsrc   <= ((OP == `Rtype) || (OP == `BEQ)) ? `regB : `immed;
ALUctr   <= (OP == `Rtype) ? funct :
            (OP == `ORi)   ? `ORfunction :
            (OP == `BEQ)   ? `SUBfunction : `ADDfunction;
ExtOp    <= (OP == `ORi) ? `ZEROextend : `SIGNextend;
MemWr    <= (OP == `Store) ? 1 : 0;
MemtoReg <= (OP == `Load) ? 1 : 0;
RegWr    <= ((OP == `Store) || (OP == `BEQ)) ? 0 : 1;
RegDst   <= ((OP == `Load) || (OP == `ORi)) ? 0 : 1;
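
For readers who want to experiment without a Verilog simulator, the same per-signal logic can be mirrored in Python. This is a sketch under assumptions: the opcode constants come from the `define lines on the slide, ALUSrc is taken as register B for BEQ (matching ALUSrc = 0 on the branch datapath slide), and the function name `main_control` and the symbolic string values are invented for illustration. Don't-care entries are resolved to a concrete value here.

```python
RTYPE, BEQ, ORI, LOAD, STORE = 0b000000, 0b000100, 0b001101, 0b100011, 0b101011


def main_control(op: int) -> dict:
    """Return one value per control signal for a 6-bit opcode."""
    if op == RTYPE:
        alu_ctr = "funct"      # taken from the instruction's funct field
    elif op == ORI:
        alu_ctr = "or"
    elif op == BEQ:
        alu_ctr = "sub"
    else:
        alu_ctr = "add"        # load and store compute an address
    return {
        "nPC_sel": "Br" if op == BEQ else "+4",
        "ALUSrc": 0 if op in (RTYPE, BEQ) else 1,   # 0 = regB, 1 = immediate
        "ALUctr": alu_ctr,
        "ExtOp": "zero" if op == ORI else "sign",
        "MemWr": 1 if op == STORE else 0,
        "MemtoReg": 1 if op == LOAD else 0,
        "RegWr": 0 if op in (STORE, BEQ) else 1,
        "RegDst": 0 if op in (LOAD, ORI) else 1,    # 0 = rt, 1 = rd
    }


if __name__ == "__main__":
    for name, op in [("R-type", RTYPE), ("ori", ORI), ("lw", LOAD),
                     ("sw", STORE), ("beq", BEQ)]:
        print(name, main_control(op))
```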

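CS152 / Kubiatowicz Lec8.17 2/23/04©UCB Spring 2004 The “Truth Table” for the Main Control
[Table: columns R-type, ori, lw, sw, beq (indexed by op); rows RegDst, ALUSrc, MemtoReg, RegWrite, MemWrite, nPC_sel, Jump, ExtOp, ALUop (Symbolic). The symbolic ALUop row reads “R-type”, Or, Add, Add, Subtract.]
[Figure: Main Control (op, 6 bits) outputs RegDst, ALUSrc, ... and ALUop (3 bits); ALU Control (Local) combines ALUop with func (6 bits) to produce ALUctr (3 bits).]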

CS152 / Kubiatowicz Lec8.18 2/23/04©UCB Spring 2004 The “Truth Table” for RegWrite
op          00 0000   00 1101   10 0011   10 1011   00 0100   00 0010
            R-type    ori       lw        sw        beq       jump
RegWrite    1         1         1         0         0         0
°RegWrite = R-type + ori + lw
  = !op<5> & !op<4> & !op<3> & !op<2> & !op<1> & !op<0> (R-type)
  + !op<5> & !op<4> & op<3> & op<2> & !op<1> & op<0> (ori)
  + op<5> & !op<4> & !op<3> & !op<2> & op<1> & op<0> (lw)
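
The sum-of-products form of RegWrite can be checked mechanically. The sketch below assumes op<5> is the most significant opcode bit and uses the opcodes given on the control-logic slide (R-type 000000, ori 001101, lw 100011); the function name `reg_write` is made up for the example.

```python
def reg_write(op: int) -> int:
    """RegWrite as three AND terms over the opcode bits, OR-ed together."""
    b = [(op >> i) & 1 for i in range(6)]   # b[i] is op<i>
    n = [1 - bit for bit in b]              # n[i] is !op<i>
    r_type = n[5] & n[4] & n[3] & n[2] & n[1] & n[0]
    ori = n[5] & n[4] & b[3] & b[2] & n[1] & b[0]
    lw = b[5] & n[4] & n[3] & n[2] & b[1] & b[0]
    return r_type | ori | lw


if __name__ == "__main__":
    for name, op in [("R-type", 0b000000), ("ori", 0b001101), ("lw", 0b100011),
                     ("sw", 0b101011), ("beq", 0b000100)]:
        print(f"{name:6s} RegWrite = {reg_write(op)}")
```

Each product term corresponds to one AND row of a PLA and the final OR to one output column, which is how the equation maps onto the PLA implementation.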

CS152 / Kubiatowicz Lec8.19 2/23/04©UCB Spring 2004 PLA Implementation of the Main Control
[Figure: PLA with outputs RegWrite, ALUSrc, MemtoReg, MemWrite, Branch, Jump, RegDst, ExtOp and ALUop.]

CS152 / Kubiatowicz Lec8.20 2/23/04©UCB Spring 2004 Administrative Issues
°Read Chapter 5
°This lecture and the next one are slightly different from the book
°Design Document for lab 3 due in section this Thursday!
  Describe your division of labor
  Your testing methodology (how will you test each step of the way?)
  Top-level block diagrams
°Midterm on Wednesday 3/10, 5:30pm to 8:30pm, location TBA
  No class on that day
  Pencil, calculator, one 8.5” x 11” sheet (both sides) of handwritten notes
°Meet at LaVal’s pizza after the midterm

CS152 / Kubiatowicz Lec8.21 2/23/04©UCB Spring 2004 The Big Picture: Where are We Now?
°The Five Classic Components of a Computer: Control, Datapath, Memory, Input, Output (Control and Datapath together form the Processor)
°Today’s Topic: Designing the Datapath for the Multiple Clock Cycle Datapath

CS152 / Kubiatowicz Lec8.22 2/23/04©UCB Spring 2004 Abstract View of our single cycle processor
°Looks like an FSM with the PC as state
[Figure: PC -> Instruction Fetch -> Register Fetch -> Ext / ALU -> Mem Access (Data Mem) -> Reg. Wrt (Result Store), with Next PC feeding back to the PC. Main Control (op) and ALU control (fun) drive nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemRd, MemWr; the datapath returns Equal.]

CS152 / Kubiatowicz Lec8.23 2/23/04©UCB Spring 2004 What’s wrong with our CPI=1 processor?
°Long Cycle Time
°All instructions take as much time as the slowest
°Real memory is not as nice as our idealized memory
  cannot always get the job done in one (short) cycle
[Figure: critical path per instruction class.
  Arithmetic & Logical: PC -> Inst Memory -> Reg File -> mux -> ALU -> mux -> setup
  Load: PC -> Inst Memory -> Reg File -> mux -> ALU -> Data Mem -> mux -> setup (Critical Path)
  Store: PC -> Inst Memory -> Reg File -> mux -> ALU -> Data Mem
  Branch: PC -> Inst Memory -> Reg File -> cmp -> mux]
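
To see why every instruction pays for the slowest one, the per-class paths can be summed in a short Python sketch. Everything numeric here is an assumption: the component delays are invented round numbers in arbitrary time units, and the path lists are one reading of the slide's figure. Only the conclusion (cycle time = longest path, which is the load) reflects the lecture.

```python
# Hypothetical component delays in arbitrary time units (not from the lecture).
DELAY = {"PC": 1, "Inst Memory": 10, "Reg File": 5, "mux": 1,
         "ALU": 8, "Data Mem": 10, "cmp": 4, "setup": 1}

# Components along each instruction class's path, as read from the figure.
PATHS = {
    "Arithmetic & Logical": ["PC", "Inst Memory", "Reg File", "mux", "ALU", "mux", "setup"],
    "Load": ["PC", "Inst Memory", "Reg File", "mux", "ALU", "Data Mem", "mux", "setup"],
    "Store": ["PC", "Inst Memory", "Reg File", "mux", "ALU", "Data Mem"],
    "Branch": ["PC", "Inst Memory", "Reg File", "cmp", "mux"],
}


def path_delay(kind: str) -> int:
    """Total delay along one instruction class's path."""
    return sum(DELAY[component] for component in PATHS[kind])


def single_cycle_time() -> int:
    """A CPI=1 machine must clock at the slowest instruction's path delay."""
    return max(path_delay(kind) for kind in PATHS)


if __name__ == "__main__":
    for kind in PATHS:
        print(f"{kind:22s} {path_delay(kind)}")
    print("cycle time:", single_cycle_time())
```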

CS152 / Kubiatowicz Lec8.24 2/23/04©UCB Spring 2004 Memory Access Time
°Physics => fast memories are small (large memories are slow)
  question: register file vs. memory
°=> Use a hierarchy of memories
[Figure, left: Storage Array with address decoder, selected word line, storage cells, bit lines and sense amps. Right: Processor and Cache (1 time-period) on the proc. bus, L2 Cache (2-3 time-periods) on the mem. bus, then memory (more time-periods).]

CS152 / Kubiatowicz Lec8.25 2/23/04©UCB Spring 2004 Reducing Cycle Time
°Cut the combinational dependency graph and insert a register / latch
°Do the same work in two fast cycles, rather than one slow one
°May be able to short-circuit the path and remove some components for some instructions!
[Figure: storage element -> Acyclic Combinational Logic -> storage element becomes storage element -> Acyclic Combinational Logic (A) -> storage element -> Acyclic Combinational Logic (B) -> storage element.]
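
A tiny numeric sketch makes the trade-off concrete. The delays and the register overhead below are hypothetical numbers chosen for illustration, and `cycle_time` and `latency` are invented helper names. The point is only that the clock period is set by the slowest piece of combinational logic between storage elements.

```python
def cycle_time(stage_delays, register_overhead):
    """Clock period: the slowest stage plus storage-element overhead."""
    return max(stage_delays) + register_overhead


def latency(stage_delays, register_overhead):
    """Time for one operation to pass through every stage."""
    return len(stage_delays) * cycle_time(stage_delays, register_overhead)


if __name__ == "__main__":
    # One block of logic taking 10 units, versus the same logic cut into
    # pieces A (6 units) and B (4 units); register overhead is 1 unit.
    print(cycle_time([10], 1))    # one slow cycle
    print(cycle_time([6, 4], 1))  # two fast cycles
    print(latency([6, 4], 1))     # total time when both pieces are needed
```

With these numbers the full two-cycle path is slower than the single long cycle, but an instruction that can skip piece B finishes in one short cycle, which is the short-circuit benefit named in the last bullet.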

CS152 / Kubiatowicz Lec8.26 2/23/04©UCB Spring 2004 Worst Case Timing (Load)
[Timing diagram: after the Clk edge each signal changes from Old Value to New Value in turn.
  PC: Clk-to-Q
  Rs, Rt, Rd, Op, Func: Instruction Memory Access Time
  ALUctr, RegWr, ExtOp, ALUSrc, MemtoReg: Delay through Control Logic
  busA: Register File Access Time
  busB: Delay through Extender & Mux
  Address: ALU Delay
  busW: Data Memory Access Time
  Register Write Occurs at the end of the cycle]

CS152 / Kubiatowicz Lec8.27 2/23/04©UCB Spring 2004 Basic Limits on Cycle Time
°Next address logic
  PC <= branch ? PC + offset : PC + 4
°Instruction Fetch
  InstructionReg <= Mem[PC]
°Register Access
  A <= R[rs]
°ALU operation
  R <= A + B
[Figure: PC -> Instruction Fetch -> Operand Fetch -> Exec -> Mem Access (Data Mem) -> Reg. File (Result Store), with Next PC feeding the PC. Control drives nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemRd, MemWr.]

CS152 / Kubiatowicz Lec8.28 2/23/04©UCB Spring 2004 Partitioning the CPI=1 Datapath
°Add registers between the smallest steps
°Place enables on all registers
[Figure: the same PC -> Instruction Fetch -> Operand Fetch -> Exec -> Mem Access (Data Mem) -> Reg. File (Result Store) chain with Next PC, and control signals nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, Equal.]

CS152 / Kubiatowicz Lec8.29 2/23/04©UCB Spring 2004 Example Multicycle Datapath
°Critical Path ?
[Figure: PC -> Instruction Fetch -> IR -> Operand Fetch (Reg File, Ext) -> registers A, B, E -> ALU -> S -> Mem Access (Data Mem) -> M -> Reg. File (Result Store), with Next PC feeding the PC. Control signals: nPC_sel, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, MemToReg, RegDst, RegWr; Equal is returned to control.]

CS152 / Kubiatowicz Lec8.30 2/23/04©UCB Spring 2004 Recall: Step-by-step Processor Design
Step 1: ISA => Logical Register Transfers
Step 2: Components of the Datapath
Step 3: RTL + Components => Datapath
Step 4: Datapath + Logical RTs => Physical RTs
Step 5: Physical RTs => Control

CS152 / Kubiatowicz Lec8.31 2/23/04©UCB Spring 2004 Step 4: R-type (add, sub, ...)
°Logical Register Transfer
  ADDU: R[rd] <– R[rs] + R[rt]; PC <– PC + 4
°Physical Register Transfers
  IR <– MEM[pc]
  ADDU: A <– R[rs]; B <– R[rt]
        S <– A + B
        R[rd] <– S; PC <– PC + 4
[Figure: datapath laid out against Time: Next PC / PC, Inst. Mem, IR, Reg File, A, B, E, Exec, S, Mem Access (Data Mem), M, Reg. File.]

CS152 / Kubiatowicz Lec8.32 2/23/04©UCB Spring 2004 Step 4: Logical immed
°Logical Register Transfer
  ORI: R[rt] <– R[rs] OR ZExt(Im16); PC <– PC + 4
°Physical Register Transfers
  IR <– MEM[pc]
  ORI: A <– R[rs]; B <– R[rt]
       S <– A or ZExt(Im16)
       R[rt] <– S; PC <– PC + 4
[Figure: datapath laid out against Time: Next PC / PC, Inst. Mem, IR, Reg File, A, B, E, Exec, S, Mem Access (Data Mem), M, Reg. File.]

CS152 / Kubiatowicz Lec8.33 2/23/04©UCB Spring 2004 Step 4: Load
°Logical Register Transfer
  LW: R[rt] <– MEM[R[rs] + SExt(Im16)]; PC <– PC + 4
°Physical Register Transfers
  IR <– MEM[pc]
  LW: A <– R[rs]; B <– R[rt]
      S <– A + SExt(Im16)
      M <– MEM[S]
      R[rt] <– M; PC <– PC + 4
[Figure: datapath laid out against Time: Next PC / PC, Inst. Mem, IR, Reg File, A, B, E, Exec, S, Mem Access (Data Mem), M, Reg. File.]

CS152 / Kubiatowicz Lec8.34 2/23/04©UCB Spring 2004 Step 4: Store
°Logical Register Transfer
  SW: MEM[R[rs] + SExt(Im16)] <– R[rt]; PC <– PC + 4
°Physical Register Transfers
  IR <– MEM[pc]
  SW: A <– R[rs]; B <– R[rt]
      S <– A + SExt(Im16)
      MEM[S] <– B; PC <– PC + 4
[Figure: datapath laid out against Time: Next PC / PC, Inst. Mem, IR, Reg File, A, B, E, Exec, S, Mem Access (Data Mem), M, Reg. File.]

CS152 / Kubiatowicz Lec8.35 2/23/04©UCB Spring 2004 Step 4: Branch
°Logical Register Transfer
  BEQ: if R[rs] == R[rt] then PC <= PC + 4 + SExt(Im16) || 00 else PC <= PC + 4
°Physical Register Transfers
  IR <– MEM[pc]
  BEQ: E <– (R[rs] = R[rt])
       if !E then PC <– PC + 4 else PC <– PC + 4 + SExt(Im16) || 00
[Figure: datapath laid out against Time: Next PC / PC, Inst. Mem, IR, Reg File, A, B, E, Exec, S, Mem Access (Data Mem), M, Reg. File.]

CS152 / Kubiatowicz Lec8.36 2/23/04©UCB Spring 2004 Alternative datapath (book): Multiple Cycle Datapath
°Minimizes Hardware: 1 memory, 1 adder
[Figure: a single Ideal Memory (RAdr, WrAdr, Din, Dout; MemWr) serves instructions and data, selected by IorD. It feeds the Instruction Reg (IRWr) and the Reg File (Ra, Rb, Rw, busA, busB, busW; RegWr, RegDst). One ALU (ALUOp via ALU Control, operand muxes ALUSelA and ALUSelB, Extend with ExtOp, << 2) produces Zero and ALU Out, and writes PC (PCWr, PCWrCond, PCSrc) and Target (BrWr); MemtoReg selects the register write data.]

CS152 / Kubiatowicz Lec8.37 2/23/04©UCB Spring 2004 Our Control Model
°State specifies the control points for a Register Transfer
°Transfer occurs upon exiting the state (same falling edge)
[Figure: inputs (conditions) -> Next State Logic -> Control State -> Output Logic -> outputs (control points). State X: Register Transfer, Control Points; the next state Depends on Input.]

CS152 / Kubiatowicz Lec8.38 2/23/04©UCB Spring 2004 Step 4 => Control Specification for multicycle proc
“instruction fetch”: IR <= MEM[PC]
“decode / operand fetch”: A <= R[rs]; B <= R[rt]
R-type: Execute: S <= A fun B | Write-back: R[rd] <= S; PC <= PC + 4
ORi:    Execute: S <= A or ZX | Write-back: R[rt] <= S; PC <= PC + 4
LW:     Execute: S <= A + SX | Memory: M <= MEM[S] | Write-back: R[rt] <= M; PC <= PC + 4
SW:     Execute: S <= A + SX | Memory: MEM[S] <= B; PC <= PC + 4
BEQ:    Execute: PC <= Next(PC,Equal)
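
The control specification can be written down as data: each instruction type is a list of register-transfer steps, one per clock cycle. The Python sketch below is an illustration only; the dictionary name `STEPS` and the function `cycles` are invented, while the register transfers are copied from the state diagram on this slide.

```python
FETCH = "IR <= MEM[PC]"
DECODE = "A <= R[rs]; B <= R[rt]"

# One entry per state visited, in order, for each instruction type.
STEPS = {
    "R-type": [FETCH, DECODE, "S <= A fun B", "R[rd] <= S; PC <= PC + 4"],
    "ORi":    [FETCH, DECODE, "S <= A or ZX", "R[rt] <= S; PC <= PC + 4"],
    "LW":     [FETCH, DECODE, "S <= A + SX", "M <= MEM[S]",
               "R[rt] <= M; PC <= PC + 4"],
    "SW":     [FETCH, DECODE, "S <= A + SX", "MEM[S] <= B; PC <= PC + 4"],
    "BEQ":    [FETCH, DECODE, "PC <= Next(PC,Equal)"],
}


def cycles(instruction: str) -> int:
    """Number of clock cycles the multicycle machine spends on one instruction."""
    return len(STEPS[instruction])


if __name__ == "__main__":
    for name in STEPS:
        print(f"{name:7s} {cycles(name)} cycles")
```

Counting the states per instruction gives the per-type CPI values used later in the performance evaluation (4 for arithmetic/logic and store, 5 for load, 3 for branch).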

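CS152 / Kubiatowicz Lec8.39 2/23/04©UCB Spring 2004 Traditional FSM Controller
[Figure: a State register feeds a Truth Table together with op and cond (Equal from the datapath). Truth Table columns: state, op, cond -> next state, control points. The next state is loaded back into State; the control points drive the datapath.]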

CS152 / Kubiatowicz Lec8.40 2/23/04©UCB Spring 2004 Step 5: (datapath + state diagram => control)
°Translate RTs into control points
°Assign states
°Then go build the controller

CS152 / Kubiatowicz Lec8.41 2/23/04©UCB Spring 2004 Mapping RTs to Control Points
“instruction fetch”: IR <= MEM[PC] (control points: imem_rd, IRen)
“decode”: A <= R[rs]; B <= R[rt] (control points: Aen, Ben, Een)
R-type: Execute: S <= A fun B (ALUfun, Sen) | Write-back: R[rd] <= S; PC <= PC + 4 (RegDst, RegWr, PCen)
ORi:    Execute: S <= A or ZX | Write-back: R[rt] <= S; PC <= PC + 4
LW:     Execute: S <= A + SX | Memory: M <= MEM[S] | Write-back: R[rt] <= M; PC <= PC + 4
SW:     Execute: S <= A + SX | Memory: MEM[S] <= B; PC <= PC + 4
BEQ:    Execute: PC <= Next(PC,Equal)

CS152 / Kubiatowicz Lec8.42 2/23/04©UCB Spring 2004 Assigning States
[Figure: the state diagram with a state number assigned to each state.]
“instruction fetch”: IR <= MEM[PC]
“decode”: A <= R[rs]; B <= R[rt]
R-type: Execute: S <= A fun B | Write-back: R[rd] <= S; PC <= PC + 4
ORi:    Execute: S <= A or ZX | Write-back: R[rt] <= S; PC <= PC + 4
LW:     Execute: S <= A + SX | Memory: M <= MEM[S] | Write-back: R[rt] <= M; PC <= PC + 4
SW:     Execute: S <= A + SX | Memory: MEM[S] <= B; PC <= PC + 4
BEQ:    Execute: PC <= Next(PC)

CS152 / Kubiatowicz Lec8.43 2/23/04©UCB Spring 2004 (Mostly) Detailed Control Specification (missing => 0)
[Table: one row per (State, Op field, Eq) combination.
  Columns: State | Op field | Eq | Next | IR en | PC sel | Ops: A B E | Exec: Ex Sr ALU S | Mem: R W M | Write-Back: M-R Wr Dst
  Row groups: fetch (state 0000, op and Eq don’t care), decode dispatch on op (BEQ, R-type, ORI, LW, SW), then the states for R:, ORi:, LW:, SW: and BEQ:
  ALU column entries: fun (R), or (ORi), add (LW), add (SW)
  Note: -all same in Moore machine]

CS152 / Kubiatowicz Lec8.44 2/23/04©UCB Spring 2004 Performance Evaluation
°What is the average CPI?
  state diagram gives CPI for each instruction type
  workload gives frequency of each type
Type          CPI_i for type   Frequency   CPI_i x freq_i
Arith/Logic   4                40%         1.6
Load          5                30%         1.5
Store         4                10%         0.4
branch        3                20%         0.6
Average CPI:                               4.1
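
The average CPI is a frequency-weighted sum, which the short Python sketch below reproduces from the numbers on this slide. The names `WORKLOAD` and `average_cpi` are invented for the example; frequencies are kept as integer percentages so the arithmetic stays exact until the final division.

```python
# (CPI for the type, frequency in percent), taken from the slide's table.
WORKLOAD = {
    "Arith/Logic": (4, 40),
    "Load": (5, 30),
    "Store": (4, 10),
    "branch": (3, 20),
}


def average_cpi(workload) -> float:
    """Sum of CPI_i x freq_i over all instruction types."""
    total_percent = sum(freq for _, freq in workload.values())
    assert total_percent == 100, "frequencies must cover the whole workload"
    return sum(cpi * freq for cpi, freq in workload.values()) / 100


if __name__ == "__main__":
    print(average_cpi(WORKLOAD))
```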

CS152 / Kubiatowicz Lec8.45 2/23/04©UCB Spring 2004 Controller Design
°The state diagrams that define the controller for an instruction set processor are highly structured
°Use this structure to construct a simple “microsequencer”
°Control reduces to programming this very simple device => microprogramming
[Figure: a sequencer (micro-PC) with sequencer control selects a microinstruction, which supplies the datapath control.]

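CS152 / Kubiatowicz Lec8.46 2/23/04©UCB Spring 2004 Example: Jump-Counter
[Figure: op-code -> Map ROM -> Counter, with counter controls zero, inc and load. With current value i:
  zero: counter <= 0000
  inc: counter <= i+1
  load: counter <= Map ROM output
  None of above: Do nothing (for wait states)]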

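CS152 / Kubiatowicz Lec8.47 2/23/04©UCB Spring 2004 Using a Jump Counter
“instruction fetch”: IR <= MEM[PC] (inc)
“decode”: A <= R[rs]; B <= R[rt] (load)
R-type: Execute: S <= A fun B | Write-back: R[rd] <= S; PC <= PC + 4
ORi:    Execute: S <= A or ZX | Write-back: R[rt] <= S; PC <= PC + 4
LW:     Execute: S <= A + SX | Memory: M <= MEM[S] | Write-back: R[rt] <= M; PC <= PC + 4
SW:     Execute: S <= A + SX | Memory: MEM[S] <= B; PC <= PC + 4
BEQ:    Execute: PC <= Next(PC)
Within an instruction, each state advances with inc; the last state of every instruction uses zero to return to fetch.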
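
A jump counter is small enough to simulate directly. The sketch below is a hedged illustration: the micro-address layout in `SEQUENCE` and the dispatch addresses in `MAP_ROM` are hypothetical (the slide's actual state numbers are not recoverable from this transcript), and `next_upc` and `trace` are invented names. Only the three counter operations (zero, inc, load) come from the lecture.

```python
# Hypothetical micro-address layout: sequencing control stored at each address.
SEQUENCE = {
    0: "inc",    # instruction fetch
    1: "load",   # decode: dispatch through the Map ROM
    2: "zero",   # BEQ
    3: "inc", 4: "zero",              # R-type: execute, write-back
    5: "inc", 6: "zero",              # ORi: execute, write-back
    7: "inc", 8: "inc", 9: "zero",    # LW: execute, memory, write-back
    10: "inc", 11: "zero",            # SW: execute, memory
}

# Hypothetical Map ROM: op-code -> first micro-address of that instruction.
MAP_ROM = {"BEQ": 2, "R-type": 3, "ORi": 5, "LW": 7, "SW": 10}


def next_upc(upc: int, control: str, opcode: str) -> int:
    """Apply one counter operation to the micro-PC."""
    if control == "zero":
        return 0
    if control == "inc":
        return upc + 1
    if control == "load":
        return MAP_ROM[opcode]
    return upc  # none of the above: hold (wait state)


def trace(opcode: str) -> list:
    """Micro-addresses visited while executing one instruction."""
    upc, visited = 0, []
    while True:
        visited.append(upc)
        upc = next_upc(upc, SEQUENCE[upc], opcode)
        if upc == 0:
            return visited


if __name__ == "__main__":
    for op in MAP_ROM:
        print(op, trace(op))
```

The length of each trace equals the number of cycles that instruction takes, so the counter needs no explicit next-state field for the common sequential case.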

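CS152 / Kubiatowicz Lec8.48 2/23/04©UCB Spring 2004 Our Microsequencer
[Figure: op-code -> Map ROM -> Micro-PC with sequencing controls Z, I, L (zero, inc, load); the selected microinstruction supplies the datapath control, and taken is fed back from the datapath.]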

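CS152 / Kubiatowicz Lec8.49 2/23/04©UCB Spring 2004 Microprogram Control Specification
[Table: one row per micro-address.
  Columns: µPC | Taken | Next | IR en | PC sel | Ops: A B | Exec: Ex Sr ALU S | Mem: R W M | Write-Back: M-R Wr Dst
  Next column by row group:
    fetch (µPC 0000): inc
    decode: load
    BEQ: zero (both for taken and not taken)
    R: inc (ALU = fun), zero
    ORi: inc (ALU = or), zero
    LW: inc (ALU = add), inc, zero
    SW: inc (ALU = add), zero]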

CS152 / Kubiatowicz Lec8.50 2/23/04©UCB Spring 2004 Overview of Control
°Control may be designed using one of several initial representations. The choice of sequence control, and how logic is represented, can then be determined independently; the control can then be implemented with one of several methods using a structured logic technique.
Initial Representation     Finite State Diagram           Microprogram
Sequencing Control         Explicit Next State Function   Microprogram counter + Dispatch ROMs
Logic Representation       Logic Equations                Truth Tables
Implementation Technique   PLA                            ROM
                           “hardwired control”            “microprogrammed control”

CS152 / Kubiatowicz Lec8.51 2/23/04©UCB Spring 2004 Summary
°Disadvantages of the Single Cycle Processor
  Long cycle time
  Cycle time is too long for all instructions except the Load
°Multiple Cycle Processor:
  Divide the instructions into smaller steps
  Execute each step (instead of the entire instruction) in one cycle
°Partition datapath into equal size chunks to minimize cycle time
  ~10 levels of logic between latches
°Follow same 5-step method for designing “real” processor

CS152 / Kubiatowicz Lec8.52 2/23/04©UCB Spring 2004 Summary (cont’d)
°Control is specified by a finite state diagram
°Specialized state diagrams are easily captured by a microsequencer
  simple increment & “branch” fields
  datapath control fields
°Control design reduces to Microprogramming
°Control is more complicated with:
  complex instruction sets
  restricted datapaths (see the book)
°Simple instruction set and powerful datapath => simple control
  could try to reduce hardware (see the book)
  rather go for speed => many instructions at once!