Published by Makena Nettle. Modified over 9 years ago.
1
EEM 486: Computer Architecture
Lecture 4: Designing a Multicycle Processor
2
Lec 4.2 The Big Picture: Designing a Multiple Clock Cycle Datapath
[Figure labels: Processor (Control, Datapath), Memory, Input, Output]
3
Lec 4.3 Single-Cycle Processor
In our single-cycle processor, each instruction is realized by exactly one control command, or microinstruction.
[Figure labels: Control Logic / Store (PLA, ROM); OPcode; Instruction; Decode; Datapath; Conditions; Control Points; microinstruction]
4
Lec 4.4 Abstract View of Single-Cycle Processor
[Figure labels, datapath: PC, Next PC, Instruction Fetch, Register Fetch, Ext, ALU, Mem Access, Data Mem, Reg. Wrt]
[Figure labels, control: Main Control (op), ALU control (fun); signals nPC_sel, RegDst, ALUSrc, ExtOp, ALUctr, MemRd, MemWr, RegWr; condition Equal]
5
Lec 4.5 What's Wrong with a CPI=1 Processor?
Long cycle time:
- All instructions take as much time as the slowest
- Real memory is not as nice as our idealized memory
- Cannot always get the job done in one (short) cycle
[Figure: critical paths for Arithmetic & Logical, Load, Store, and Branch instructions through PC, Inst Memory, Reg File, mux, ALU or cmp, Data Mem, and setup]
6
Lec 4.6 Memory Access Time
Physics => fast memories are small (large memories are slow)
=> Use a hierarchy of memories
[Figure: storage array with address decoder, selected word line, bit line, storage cell, sense amps]
[Figure: Processor, proc. bus, Cache (1 time-period), L2 Cache (2-3 time-periods), mem. bus, memory (20 - 50 time-periods)]
7
Lec 4.7 Multicycle Approach
Break up the instructions into steps:
- Let each step take one "smaller" clock cycle
  - Balance the amount of work to be done
  - Restrict each cycle to use only one major functional unit
  - Major functional units: Memory, Register File, and ALU
- Let different instructions take different numbers of cycles
Use a functional unit more than once within the execution of one instruction (less hardware):
- A single memory unit for both instructions and data
- A single ALU, rather than an ALU and two adders
At the end of a cycle, store values for use in later cycles:
- Introduce additional "internal" registers
8
Lec 4.8 Partitioning the CPI=1 Datapath
Add registers between smallest steps
Steps: Instruction fetch; Decode and Operand fetch; Execution; Memory access; Write back
[Figure labels: PC, Next PC, Instruction Fetch, Operand Fetch, Exec, Mem Access, Data Mem, Reg. File; signals nPC_sel, RegDst, ALUSrc, ExtOp, ALUctr, MemRd, MemWr, RegWr, Equal]
9
Lec 4.9 Recall: Step-by-step Processor Design
Step 1: ISA => Logical Register Transfers
Step 2: Components of the Datapath
Step 3: RTL + Components => Datapath
Step 4: Datapath + Logical RTs => Physical RTs
Step 5: Physical RTs => Control
10
Lec 4.10 Step 4: R-type (add, sub, ...) inst
Logical Register Transfers:
  ADDU  R[rd] <– R[rs] + R[rt]; PC <– PC + 4
Step 1. Instruction Fetch: IR ← MEM[PC], PC ← PC + 4
Step 2. Instruction Decode and Register Fetch: A ← R[rs], B ← R[rt]
Step 3. Execution: ALUOut ← A op B
Step 4. Write-back: R[rd] ← ALUOut
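The four R-type steps can be traced with a short Python sketch. This is an illustration only, not part of the lecture: the `state` dictionary, the function name `run_rtype`, and the modelling of IR as an already-decoded `(rs, rt, rd)` tuple are all assumptions made for the example.

```python
MASK32 = 0xFFFFFFFF  # keep values within 32 bits

def run_rtype(state, op):
    """Run one R-type instruction as four register-transfer steps.

    `state` is a hypothetical dict holding PC, MEM, R and the internal
    registers; `op` is the ALU function (e.g. addition). Returns the
    number of cycles used.
    """
    # Step 1: IR <- MEM[PC], PC <- PC + 4
    state["IR"] = state["MEM"][state["PC"]]
    state["PC"] = (state["PC"] + 4) & MASK32

    # Step 2: A <- R[rs], B <- R[rt]
    rs, rt, rd = state["IR"]  # IR modelled as a decoded (rs, rt, rd) tuple
    state["A"] = state["R"][rs]
    state["B"] = state["R"][rt]

    # Step 3: ALUOut <- A op B
    state["ALUOut"] = op(state["A"], state["B"]) & MASK32

    # Step 4: R[rd] <- ALUOut
    state["R"][rd] = state["ALUOut"]
    return 4
```

Each assignment group corresponds to one clock cycle, and the values held in IR, A, B and ALUOut are what carry information from one cycle to the next.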
11
Lec 4.11 R-type - Fetch
[Figure: datapath with PC, Memory (Address, MemData, Write Data), Instruction register, and ALU with constant 4]
Control: MemRead=1, IRWrite=1, ALUctr=Add, nPCWrite=1
12
Lec 4.12 R-type - Decode/Register Fetch
[Figure: datapath with PC, Memory, Instruction register (fields [25-21], [20-16], [15-11]), Registers (Rs, Rt, Rw), A, B, and ALU]
Control: MemRead=0, IRWrite=0, RegWrite=0, ALUctr=x, nPCWrite=0
13
Lec 4.13 R-type - Execution
[Figure: datapath with PC, Memory, Instruction register (fields [25-21], [20-16], [15-11]), Registers (Rs, Rt, Rw), A, B, ALU input muxes, ALU, and ALUOut]
Control: MemRead=0, IRWrite=0, RegWrite=0, ALUSrcA=1, ALUSrcB=0, ALUctr=Func, nPCWrite=0
14
Lec 4.14 R-type - Write Back
[Figure: datapath with PC, Memory, Instruction register (fields [25-21], [20-16], [15-11]), Registers (Rs, Rt, Rw), A, B, ALU input muxes, ALU, and ALUOut]
Control: MemRead=0, IRWrite=0, RegWrite=1, ALUSrcA=x, ALUSrcB=x, ALUctr=x, nPCWrite=0
15
Lec 4.15 Step 4: Logical immed inst
Logical Register Transfers:
  ORI  R[rt] <– R[rs] OR ZExt(Im16); PC <– PC + 4
Step 1. Instruction Fetch: IR ← MEM[PC], PC ← PC + 4
Step 2. Instruction Decode and Register Fetch: A ← R[rs]
Step 3. Execution: ALUOut ← A OR ZExt(Im16)
Step 4. Write-back: R[rt] ← ALUOut
16
Lec 4.16 Logical immediate - Execution
[Figure: datapath as for R-type execution, plus a Zero extend unit (16 to 32 bits) on Instruction [15-0] feeding ALU input mux position 2]
Control: MemRead=0, IRWrite=0, RegWrite=0, ALUSrcA=1, ALUSrcB=2, ALUctr=Or, nPCWrite=0
17
Lec 4.17 Logical immediate – Write Back
18
Lec 4.18 Step 4: Load inst
Logical Register Transfers:
  LW  R[rt] <– MEM[R[rs] + SExt(Im16)]; PC <– PC + 4
Step 1. Instruction Fetch: IR ← MEM[PC], PC ← PC + 4
Step 2. Instruction Decode and Register Fetch: A ← R[rs]
Step 3. Memory address computation: ALUOut ← A + SExt(Im16)
Step 4. Memory access: MDR ← Memory[ALUOut]
Step 5. Load completion: R[rt] ← MDR
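To make the address computation concrete, here is a minimal Python sketch of steps 2 to 5 of the load. The helper names `sign_extend16` and `load_word`, and the use of a list for the register file and a dict for memory, are assumptions for illustration; instruction fetch is omitted.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(imm16):
    """Interpret a 16-bit field as a signed two's-complement value (SExt)."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def load_word(regs, memory, rs, rt, imm16):
    """Model steps 2-5 of LW; returns the effective address that was used."""
    a = regs[rs]                                   # Step 2: A <- R[rs]
    alu_out = (a + sign_extend16(imm16)) & MASK32  # Step 3: ALUOut <- A + SExt(Im16)
    mdr = memory[alu_out]                          # Step 4: MDR <- Memory[ALUOut]
    regs[rt] = mdr                                 # Step 5: R[rt] <- MDR
    return alu_out
```

A negative offset such as `0xFFFC` (that is, -4) shows why the load needs sign extension rather than the zero extension used by ORI.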
19
Lec 4.19 Load: Address Calculation
[Figure: datapath with Zero/Sign extend unit (16 to 32 bits) on Instruction [15-0] feeding ALU input mux position 2, and a RegDst mux selecting between Instruction [20-16] and Inst [15-11]]
Control: MemRead=0, IRWrite=0, RegWrite=0, RegDst=x, ALUSrcA=1, ALUSrcB=2, ALUctr=Add, ExtOp=Sign, nPCWrite=0
20
Lec 4.20 Load: Memory Read
[Figure: datapath with Extender, MDR on the memory output, and an IorD mux on the memory address input]
Control: MemRead=1, IorD=1, IRWrite=0, RegWrite=0, RegDst=x, ALUSrcA=x, ALUSrcB=x, ALUctr=x, ExtOp=x, nPCWrite=0
21
Lec 4.21 Load: Write Back
22
Lec 4.22 Step 4: Store inst
Logical Register Transfers:
  SW  MEM[R[rs] + SExt(Im16)] <– R[rt]; PC <– PC + 4
Step 1. Instruction Fetch: IR ← MEM[PC], PC ← PC + 4
Step 2. Instruction Decode and Register Fetch: A ← R[rs], B ← R[rt]
Step 3. Memory address computation: ALUOut ← A + SExt(Im16)
Step 4. Memory access: Memory[ALUOut] ← B
23
Lec 4.23 Store: Address Calculation
24
Lec 4.24 Store: Memory Write
25
Lec 4.25 Step 4: Branch inst
Logical Register Transfers:
  BEQ  if R[rs] == R[rt]
       then PC <= PC + 4 + SExt(Im16) || 00
       else PC <= PC + 4
Step 1. Instruction Fetch: IR ← MEM[PC], PC ← PC + 4
Step 2. Instruction Decode and Register Fetch: A ← R[rs], B ← R[rt], ALUOut ← PC + SExt(Im16) || 00
Step 3. Branch completion: If A = B, PC ← ALUOut
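The branch target arithmetic can be checked with a small Python sketch. The function names are hypothetical; `SExt(Im16) || 00` is modelled as sign extension followed by a left shift of two bits, and the PC used in step 2 is the already-incremented value from step 1.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(imm16):
    """Interpret a 16-bit field as a signed two's-complement value (SExt)."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def beq_next_pc(pc, a, b, imm16):
    """Return the PC after a BEQ located at address `pc`."""
    pc = (pc + 4) & MASK32                               # Step 1: PC <- PC + 4
    alu_out = (pc + (sign_extend16(imm16) << 2)) & MASK32  # Step 2: ALUOut <- PC + SExt(Im16) || 00
    return alu_out if a == b else pc                     # Step 3: if A = B, PC <- ALUOut
```

Because the target is computed speculatively during decode, the branch needs only three cycles: the third cycle just compares A and B and conditionally writes the PC.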
26
Lec 4.26 Branch – Address Calculation
27
Lec 4.27 Branch:Execution
28
Lec 4.28 Multicycle Processor
29
Lec 4.29 Summary of Instruction Steps
30
Lec 4.30 Simple Questions
How many cycles will it take to execute this code?
    lw  $t2, 0($t3)
    lw  $t3, 4($t3)
    beq $t2, $t3, Label   #assume not
    add $t5, $t2, $t3
    sw  $t5, 8($t3)
Label: ...
What is going on during the 8th cycle of execution?
In what cycle does the actual addition of $t2 and $t3 take place?
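One way to work through these questions is to lay the instructions out cycle by cycle. The Python sketch below assumes the per-instruction step counts from the earlier slides (load 5, store 4, R-type 4, branch 3) and that the branch is not taken; the step names and the `schedule` helper are made up for the example.

```python
# Steps per instruction class, following the step lists on the earlier slides.
STEPS = {
    "lw":  ["fetch", "decode", "address", "memory read", "write-back"],
    "sw":  ["fetch", "decode", "address", "memory write"],
    "add": ["fetch", "decode", "execute", "write-back"],
    "beq": ["fetch", "decode", "branch completion"],
}

def schedule(program):
    """Return a list whose entry i describes what happens in clock cycle i + 1."""
    timeline = []
    for index, mnemonic in enumerate(program):
        for step in STEPS[mnemonic]:
            timeline.append((index, mnemonic, step))
    return timeline

PROGRAM = ["lw", "lw", "beq", "add", "sw"]  # branch assumed not taken

timeline = schedule(PROGRAM)
total_cycles = len(timeline)                                  # 5 + 5 + 3 + 4 + 4
eighth_cycle = timeline[8 - 1]                                # third step of the second lw
add_execute_cycle = timeline.index((3, "add", "execute")) + 1
```

Under these assumptions the code takes 21 cycles, the 8th cycle is the address computation of the second `lw`, and the addition of `$t2` and `$t3` happens in cycle 16.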
31
Lec 4.31 Finite State Machine (FSM) Controller
- State specifies control points for Register Transfer
- Transfer occurs upon exiting state (same falling edge)
[Figure labels: Control State, Next State Logic, Output Logic, inputs (conditions), outputs (control points)]
32
Lec 4.32 FSM for Control
[Figure: control logic block]
Inputs: instruction register opcode field (Op5-Op0); state register (S3-S0)
Outputs: PCWrite, PCWriteCond, IorD, MemRead, MemWrite, IRWrite, MemtoReg, PCSource, ALUOp, ALUSrcB, ALUSrcA, RegWrite, RegDst; next state (NS3-NS0)
33
Lec 4.33 Step 4: Control Specification
instruction fetch:       IR <= MEM[PC]; PC <= PC + 4
decode / operand fetch:  A <= R[rs], B <= R[rt]; S <= PC + SX || 00
R-type:  execute S <= A fun B;          write-back R[rd] <= S
ORi:     execute S <= A or ZX;          write-back R[rt] <= S
LW:      execute S <= A + SX;           memory M <= MEM[S];   write-back R[rt] <= M
SW:      execute S <= A + SX;           memory MEM[S] <= B
BEQ:     execute PC <= Next(PC, Equal)
34
Lec 4.34 Step 5 (datapath + state diagram control)
- Translate RTs into control points
- Assign states
- Then go build the controller
35
Lec 4.35 Mapping RTs to Control Points
Instruction fetch:                   PCSource=0, PCWrite, IorD=0, MemRead, IRWrite, ALUSrcA=0, ALUSrcB=01, ALUOp=000
Instruction decode / register fetch: ALUSrcA=0, ALUSrcB=11, ALUOp=000, ExtOp=1
R-type execute:                      ALUSrcA=1, ALUSrcB=00, ALUOp=100
R-type write-back:                   RegDst=1, RegWrite, MemtoReg=0
ORi execute:                         ALUSrcA=1, ALUSrcB=10, ALUOp=010, ExtOp=0
ORi write-back:                      RegDst=0, RegWrite, MemtoReg=0
LW / SW address computation:         ALUSrcA=1, ALUSrcB=10, ALUOp=000, ExtOp=1
LW memory read:                      IorD=1, MemRead
LW write-back:                       RegDst=0, RegWrite, MemtoReg=1
SW memory write:                     IorD=1, MemWrite
Branch:                              ALUSrcA=1, ALUSrcB=00, ALUOp=001, PCSource=1, PCWriteCond
36
Lec 4.36 Assigning States
State   Register transfer
0000    IR <= MEM[PC]; PC <= PC + 4
0001    A <= R[rs], B <= R[rt]; S <= PC + SX || 00
0100    R-type: S <= A fun B
0101    R-type: R[rd] <= S
0110    ORi: S <= A or ZX
0111    ORi: R[rt] <= S
1000    LW / SW: S <= A + SX
1001    LW: M <= MEM[S]
1010    LW: R[rt] <= M
1011    SW: MEM[S] <= B
0011    BEQ: PC <= Next(PC, Equal)
37
Lec 4.37 Control Logic - Datapath Control Outputs
Output        Current State
PCSource      state3
PCWrite       state0
PCWriteCond   state3
IorD          state9 + state11
MemRead       state0 + state9
MemWrite      state11
IRWrite       state0
RegDst        state4
MemtoReg      state10
RegWrite      state5 + state7 + state10
ALUSrcA       state3 + state4 + state6 + state8
ALUSrcB1      state1 + state6 + state8
ALUSrcB0      state0 + state1
ExtOp         state1 + state8
ALUOp2        state4
ALUOp1        state6
ALUOp0        state3
[Figure: state diagram from slide Lec 4.35 with states numbered: 0 instruction fetch, 1 decode / register fetch, 4 and 5 R-type, 6 and 7 ORi, 8 LW / SW address, 9 and 10 LW, 11 SW, 3 Branch]
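The output equations above are sums (ORs) of state terms, so each control output is asserted exactly when the current state is in its set. A minimal Python sketch of that output logic follows; the dictionary name `OUTPUT_STATES` and the function `asserted_outputs` are illustrative choices, and the table contents are copied from the slide.

```python
# For each control output bit, the set of states in which it is 1.
OUTPUT_STATES = {
    "PCSource":    {3},
    "PCWrite":     {0},
    "PCWriteCond": {3},
    "IorD":        {9, 11},
    "MemRead":     {0, 9},
    "MemWrite":    {11},
    "IRWrite":     {0},
    "RegDst":      {4},
    "MemtoReg":    {10},
    "RegWrite":    {5, 7, 10},
    "ALUSrcA":     {3, 4, 6, 8},
    "ALUSrcB1":    {1, 6, 8},
    "ALUSrcB0":    {0, 1},
    "ExtOp":       {1, 8},
    "ALUOp2":      {4},
    "ALUOp1":      {6},
    "ALUOp0":      {3},
}

def asserted_outputs(state):
    """Return the set of control output bits that are 1 in the given state."""
    return {name for name, states in OUTPUT_STATES.items() if state in states}
```

For example, `asserted_outputs(0)` yields the fetch signals PCWrite, MemRead, IRWrite and ALUSrcB0 (ALUSrcB = 01), which agrees with the instruction fetch state of the diagram.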
38
Lec 4.38 Control Logic - Next State Function
Output        Current State                                    Opcode
NextState0    state3 + state5 + state7 + state10 + state11
NextState1    state0
NextState3    state1                                           BEQ
NextState4    state1                                           R-type
NextState5    state4
NextState6    state1                                           ORi
NextState7    state6
NextState8    state1                                           LW + SW
NextState9    state8                                           LW
NextState10   state9
NextState11   state8                                           SW
[Figure: state diagram from slide Lec 4.35 with states numbered 0, 1, 4, 6, 8, 3, 5, 7, 9, 10, 11]
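The next-state function can be sketched in Python and used to trace the state sequence of each instruction class. This is an illustration under one stated assumption: following the state diagram, the store's memory-write state 11 is entered from state 8 (the shared LW / SW address computation). The names `next_state` and `trace` are hypothetical.

```python
# Next state chosen in the decode state (state 1), by opcode class.
DECODE_DISPATCH = {"BEQ": 3, "R-type": 4, "ORi": 6, "LW": 8, "SW": 8}

# Next state chosen after the address computation (state 8), by opcode class.
MEMORY_DISPATCH = {"LW": 9, "SW": 11}

def next_state(state, opcode):
    """Return the next controller state for the current state and opcode class."""
    if state == 0:
        return 1                        # fetch -> decode
    if state == 1:
        return DECODE_DISPATCH[opcode]
    if state == 4:
        return 5                        # R-type execute -> write-back
    if state == 6:
        return 7                        # ORi execute -> write-back
    if state == 8:
        return MEMORY_DISPATCH[opcode]
    if state == 9:
        return 10                       # LW memory read -> write-back
    return 0                            # states 3, 5, 7, 10, 11 return to fetch

def trace(opcode):
    """List the states visited by one instruction, starting at fetch."""
    states = [0]
    state = next_state(0, opcode)
    while state != 0:
        states.append(state)
        state = next_state(state, opcode)
    return states
```

The length of each trace is the instruction's cycle count: `trace("LW")` visits five states, `trace("BEQ")` only three.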
39
Lec 4.39 PLA Implementation
40
Lec 4.40 Performance Evaluation
What is the average CPI?
- State diagram gives CPI for each instruction type
- Workload gives frequency of each type
Type          CPI_i for type   Frequency   CPI_i x freq_i
Arith/Logic   4                40%         1.6
Load          5                30%         1.5
Store         4                10%         0.4
Branch        3                20%         0.6
Average CPI: 4.1
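The average CPI is the frequency-weighted sum of the per-type CPIs. A short Python check of the table, with the dictionary layout and function name chosen for illustration:

```python
# (CPI, frequency in percent) per instruction type, copied from the table.
WORKLOAD = {
    "Arith/Logic": (4, 40),
    "Load":        (5, 30),
    "Store":       (4, 10),
    "Branch":      (3, 20),
}

def average_cpi(workload):
    """Weighted average CPI; frequencies are percentages that must sum to 100."""
    total_percent = sum(percent for _, percent in workload.values())
    if total_percent != 100:
        raise ValueError("frequencies must sum to 100%")
    return sum(cpi * percent for cpi, percent in workload.values()) / 100
```

With the workload above, `average_cpi(WORKLOAD)` evaluates to (160 + 150 + 40 + 60) / 100 = 4.1.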
41
Lec 4.41 Another Implementation Style
42
Lec 4.42 Address Select Logic
43
Lec 4.43 Address Select Logic
Current State   Address Control Action       AddrCtl
0               Use incremented state        3
1               Use dispatch ROM 1           1
2               Use dispatch ROM 2           2
3               Use incremented state        3
4               Replace state number by 0    0
5               Replace state number by 0    0
6               Use incremented state        3
7               Replace state number by 0    0
8               Replace state number by 0    0
9               Replace state number by 0    0
[Figure: finite state diagram, Start at state 0]
State 0, Instruction fetch:                   MemRead, ALUSrcA=0, IorD=0, IRWrite, ALUSrcB=01, ALUOp=00, PCWrite, PCSource=00
State 1, Instruction decode / register fetch: ALUSrcA=0, ALUSrcB=11, ALUOp=00
State 2, Memory address computation:          ALUSrcA=1, ALUSrcB=10, ALUOp=00    (Op = 'LW') or (Op = 'SW')
State 3, Memory access:                       MemRead, IorD=1                    (Op = 'LW')
State 4, Write-back step:                     RegDst=0, RegWrite, MemtoReg=1
State 5, Memory access:                       MemWrite, IorD=1                   (Op = 'SW')
State 6, Execution:                           ALUSrcA=1, ALUSrcB=00, ALUOp=10    (Op = R-type)
State 7, R-type completion:                   RegDst=1, RegWrite, MemtoReg=0
State 8, Branch completion:                   ALUSrcA=1, ALUSrcB=00, ALUOp=01, PCWriteCond, PCSource=01    (Op = 'BEQ')
State 9, Jump completion:                     PCWrite, PCSource=10               (Op = 'J')
44
Lec 4.44 Dispatch ROMs
Dispatch ROM 1
Op       Opcode name   Value
000000   R-format      0110
000010   jmp           1001
000100   beq           1000
100011   lw            0010
101011   sw            0010
Dispatch ROM 2
Op       Opcode name   Value
100011   lw            0011
101011   sw            0101
[Figure: finite state diagram with states 0-9, repeated from slide Lec 4.43]
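The address select logic and the two dispatch ROMs together form a small sequencer. The Python sketch below models it using the AddrCtl values and ROM contents from the tables on slides Lec 4.43 and Lec 4.44; the constant and function names are invented for the example.

```python
# AddrCtl value for each current state (from the address select table).
ADDR_CTL = {0: 3, 1: 1, 2: 2, 3: 3, 4: 0, 5: 0, 6: 3, 7: 0, 8: 0, 9: 0}

# Dispatch ROM contents: 6-bit opcode -> next state.
DISPATCH_ROM_1 = {
    0b000000: 0b0110,  # R-format
    0b000010: 0b1001,  # jmp
    0b000100: 0b1000,  # beq
    0b100011: 0b0010,  # lw
    0b101011: 0b0010,  # sw
}
DISPATCH_ROM_2 = {
    0b100011: 0b0011,  # lw
    0b101011: 0b0101,  # sw
}

def next_state(state, opcode):
    """Select the next state according to the AddrCtl field of the current state."""
    addr_ctl = ADDR_CTL[state]
    if addr_ctl == 0:
        return 0                       # replace state number by 0
    if addr_ctl == 1:
        return DISPATCH_ROM_1[opcode]  # use dispatch ROM 1
    if addr_ctl == 2:
        return DISPATCH_ROM_2[opcode]  # use dispatch ROM 2
    return state + 1                   # use incremented state

def trace(opcode):
    """List the states visited by one instruction, starting at state 0."""
    states = [0]
    state = next_state(0, opcode)
    while state != 0:
        states.append(state)
        state = next_state(state, opcode)
    return states
```

Tracing `lw` (opcode 100011) gives states 0, 1, 2, 3, 4; only states 1 and 2 consult the opcode, which is why two small ROMs are enough.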
45
Lec 4.45 Microprogramming
Microprogramming: designing the control as a program that implements the machine instructions in terms of microinstructions
46
Lec 4.46 Microinstruction ???
[Figure: Main Memory holds the user program plus data (ADD, SUB, AND, ..., DATA); this can change! The CPU contains the execution unit and the control memory. One of these (a machine instruction such as AND) is mapped into one of these (a microsequence in control memory), e.g., Fetch, Calc Operand Addr, Fetch Operand(s), Calculate, Save Answer(s).]
47
Lec 4.47 Microprogramming a Multicycle Processor
1) Choose datapath and sequencer architecture
2) Assign states and sequence of each (multicycle) instruction (i.e., define the controller FSM)
3) Choose microinstruction format (minimum bits to describe all allowable functions of sequencer and datapath)
4) Map instructions into microinstruction sequences
48
Lec 4.48 Designing a Microinstruction Set
1) Start with list of control signals
2) Group signals together that make sense: called "fields"
3) Place fields in some logical order (e.g., ALU operation & ALU operands first and microinstruction sequencing last)
4) Create a symbolic legend for the microinstruction format, showing name of field values and how they set the control signals
5) To minimize the width, encode operations that will never be used at the same time
49
Lec 4.49 Multicycle Processor
50
Lec 4.50 Microinstruction fields
51
Lec 4.51 Sequencer
Dispatch ROM 1
Op       Opcode name   Value
000000   R-format      Rformat1
000010   jmp           JUMP1
000100   beq           BEQ1
100011   lw            Mem1
101011   sw            Mem1
Dispatch ROM 2
Op       Opcode name   Value
100011   lw            LW2
101011   sw            SW2
52
Lec 4.52 Microinstructions
(a dot marks an unused field)
Fetch and Decode:
Label      ALU control   SRC1   SRC2      Register control   Memory    PCWrite control   Sequencing
Fetch      Add           PC     4         .                  Read PC   ALU               Seq
           Add           PC     Extshft   Read               .         .                 Dispatch 1
R-type instructions:
Label      ALU control   SRC1   SRC2      Register control   Memory    PCWrite control   Sequencing
Rformat1   Func code     A      B         .                  .         .                 Seq
           .             .      .         Write ALU          .         .                 Fetch
53
Lec 4.53 Microinstructions
(a dot marks an unused field)
Memory-reference:
Label   ALU control   SRC1   SRC2     Register control   Memory      PCWrite control   Sequencing
Mem1    Add           A      Extend   .                  .           .                 Dispatch 2
LW2     .             .      .        .                  Read ALU    .                 Seq
        .             .      .        Write MDR          .           .                 Fetch
SW2     .             .      .        .                  Write ALU   .                 Fetch
Branch:
Label   ALU control   SRC1   SRC2     Register control   Memory      PCWrite control   Sequencing
BEQ1    Subt          A      B        .                  .           ALUOut-cond       Fetch
54
Lec 4.54 Microprogram
55
Lec 4.55 Summary
Disadvantages of the Single Cycle Processor:
- Long cycle time
- Cycle time is too long for all instructions except the Load
Multiple Cycle Processor:
- Divide the instructions into smaller steps
- Execute each step (instead of the entire instruction) in one cycle
Partition datapath into equal size chunks to minimize cycle time
Follow same 5-step method for designing "real" processor