1
EECC550 - Shaaban #1 Lec # 5 Spring 2001 13-28-2001
CPU Design Steps
1. Analyze instruction set operations using independent RTN => datapath requirements.
2. Select a set of datapath components and establish the clock methodology.
3. Assemble a datapath meeting the requirements.
4. Analyze the implementation of each instruction to determine the setting of control points that effects the register transfer.
5. Assemble the control logic.
2
CPU Design & Implementation Process
Bottom-up Design:
– Assemble components in target technology to establish critical timing.
Top-down Design:
– Specify component behavior from high-level requirements.
Iterative refinement:
– Establish a partial solution, expand and improve.
[Diagram: Instruction Set Architecture => processor (datapath, control) => Reg. File, Mux, ALU, Reg, Mem, Decoder, Sequencer => Cells, Gates]
3
Single Cycle MIPS Datapath: CPI = 1, Long Clock Cycle
4
Drawback of Single Cycle Processor
Long cycle time. All instructions must take as much time as the slowest:
– Cycle time for load is longer than needed for all other instructions.
Real memory is not as well-behaved as idealized memory:
– Cannot always complete data access in one (short) cycle.
5
Abstract View of Single Cycle CPU
[Diagram: PC, Next PC, Instruction Fetch, Register Fetch, Ext, ALU, Mem Access (Data Mem), Reg. Wrt (Result Store). Main Control (inputs op, fun) and ALU control. Signals: nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, Equal.]
6
Single Cycle Instruction Timing
[Figure: timing bars for each instruction class]
Arithmetic & Logical: PC, Inst Memory, Reg File, mux, ALU, mux
Load: PC, Inst Memory, Reg File, mux, ALU, Data Mem, mux, setup (Critical Path)
Store: PC, Inst Memory, Reg File, mux, ALU, Data Mem
Branch: PC, Inst Memory, Reg File, cmp, mux
7
Reducing Cycle Time: Multi-Cycle Design
Cut the combinational dependency graph by inserting registers / latches.
The same work is done in two or more fast cycles, rather than one slow cycle.
[Diagram: storage element -> Acyclic Combinational Logic -> storage element  =>  storage element -> Acyclic Combinational Logic (A) -> storage element -> Acyclic Combinational Logic (B) -> storage element]
8
Clock Cycle Time & Critical Path
Critical path: the slowest path between any two storage devices.
Cycle time is a function of the critical path. It must be greater than:
– Clock-to-Q + longest path through the combinational logic + setup
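The cycle-time bound above can be checked numerically. This is a minimal sketch, assuming hypothetical delay values in nanoseconds; none of the numbers come from the slides, and `min_cycle_time` is an illustrative name.

```python
def min_cycle_time(clk_to_q, path_delays, setup):
    """Bound on cycle time: Clock-to-Q + longest combinational path + setup."""
    return clk_to_q + max(path_delays) + setup

# Hypothetical delays (ns) of the combinational paths between storage elements.
paths = [3.0, 5.5, 4.2]
bound = min_cycle_time(clk_to_q=0.5, path_delays=paths, setup=0.25)
print(bound)  # the clock period must exceed this value
```

Splitting the 5.5 ns path with an extra register would lower the bound, which is the motivation for the multi-cycle design.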
9
Instruction Processing Cycles
Stages: Instruction Fetch, Instruction Decode, Execute, Result Store, Next Instruction
– Obtain instruction from program storage.
– Determine instruction type.
– Obtain operands from registers.
– Compute result value or status.
– Store result in register/memory if needed (usually called Write Back).
– Update program counter to address of next instruction.
[Diagram annotation: common steps for all instructions]
10
Partitioning The Single Cycle Datapath
Add registers between smallest steps.
[Diagram: PC, Next PC, Instruction Fetch, Operand Fetch (Reg. File), Exec, Mem Access (Data Mem), Result Store. Signals: nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemRd, MemWr.]
11
Example Multi-cycle Datapath
[Diagram: PC, Next PC, Instruction Fetch, Operand Fetch (Reg. File), Ext, ALU, Mem Access (Data Mem), Result Store (Reg File), with registers IR, A, B, R, M inserted between steps. Signals: nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, MemToReg, Equal.]
Registers added:
– IR: Instruction register.
– A, B: Two registers to hold operands read from the register file.
– R (or ALUOut): Holds the output of the ALU.
– M (or Memory data register, MDR): Holds data read from data memory.
12
Operations In Each Cycle
Cycles: Instruction Fetch (IF), Instruction Decode (ID), Execution (EX), Memory (MEM), Write Back (WB)

R-Type
  IF:  IR ← Mem[PC]
  ID:  A ← R[rs], B ← R[rt]
  EX:  R ← A + B
  WB:  R[rd] ← R, PC ← PC + 4

Logic Immediate
  IF:  IR ← Mem[PC]
  ID:  A ← R[rs]
  EX:  R ← A OR ZeroExt[imm16]
  WB:  R[rt] ← R, PC ← PC + 4

Load
  IF:  IR ← Mem[PC]
  ID:  A ← R[rs]
  EX:  R ← A + SignExt(imm16)
  MEM: M ← Mem[R]
  WB:  R[rt] ← M, PC ← PC + 4

Store
  IF:  IR ← Mem[PC]
  ID:  A ← R[rs], B ← R[rt]
  EX:  R ← A + SignExt(imm16)
  MEM: Mem[R] ← B, PC ← PC + 4

Branch
  IF:  IR ← Mem[PC]
  ID:  A ← R[rs], B ← R[rt]
  EX:  If Equal = 1, PC ← PC + 4 + (SignExt(imm16) x 4), else PC ← PC + 4
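The per-cycle register transfers for a load can be traced in software. This is a minimal sketch, assuming a toy memory modeled as a Python dict and a register file modeled as a dict; `run_load` and `sign_extend16` are illustrative names, not part of the lecture. The destination register is taken to be rt, the MIPS load destination field.

```python
def sign_extend16(imm16):
    """Interpret a 16-bit field as a signed two's-complement value."""
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def run_load(regs, mem, pc, rs, rt, imm16):
    """Step one load through IF, ID, EX, MEM, WB using temporaries IR, A, R, M."""
    ir = mem[pc]                    # IF:  IR <- Mem[PC] (held only, not decoded here)
    a = regs[rs]                    # ID:  A <- R[rs]
    r = a + sign_extend16(imm16)    # EX:  R <- A + SignExt(imm16)
    m = mem[r]                      # MEM: M <- Mem[R]
    regs[rt] = m                    # WB:  R[rt] <- M
    return pc + 4                   #      PC <- PC + 4

# Toy state (assumed values): register 1 holds a base address, memory[96] holds data.
regs = {1: 100, 2: 0}
mem = {0: "lw $2, -4($1)", 96: 42}
next_pc = run_load(regs, mem, pc=0, rs=1, rt=2, imm16=0xFFFC)
print(regs[2], next_pc)
```

Each assignment corresponds to one clock cycle, so the load occupies five cycles in this partitioning.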
13
MIPS Multi-Cycle Datapath: Five Cycles of Load
Load: Cycle 1 = IF, Cycle 2 = ID, Cycle 3 = EX, Cycle 4 = MEM, Cycle 5 = WB
1- Instruction Fetch (IF): Fetch the instruction from the Instruction Memory.
2- Instruction Decode (ID): Register fetch and instruction decode.
3- Execute (EX): Calculate the effective memory address.
4- Memory (MEM): Read the data from the Data Memory.
5- Write Back (WB): Write the data back to the register file. Update PC.
14
Single Cycle Vs. Multi-Cycle CPU
[Timing diagram]
Single Cycle Implementation (8 ns clock): Cycle 1 = Load, Cycle 2 = Store (part of the cycle is wasted).
Multiple Cycle Implementation (2 ns clock): Cycles 1-5 = Load (IF, ID, EX, MEM, WB), Cycles 6-9 = Store (IF, ID, EX, MEM), Cycle 10 = IF of the following R-type instruction.
15
Finite State Machine (FSM) Control Model
State specifies control points for register transfer.
Transfer occurs upon exiting the state (same falling edge).
[Diagram: State X: register transfer, control points; next state depends on input. The Control State register feeds the Next State Logic and the Output Logic; inputs (conditions) enter, outputs (control points) leave.]
16
Control Specification For Multi-cycle CPU
Finite State Machine (FSM)
"instruction fetch": IR ← MEM[PC]
"decode / operand fetch": A ← R[rs], B ← R[rt]
R-type:
  Execute:    R ← A fun B
  Write-back: R[rd] ← R, PC ← PC + 4
ORi:
  Execute:    R ← A or ZX
  Write-back: R[rt] ← R, PC ← PC + 4
LW:
  Execute:    R ← A + SX
  Memory:     M ← MEM[R]
  Write-back: R[rt] ← M, PC ← PC + 4
SW:
  Execute:    R ← A + SX
  Memory:     MEM[R] ← B, PC ← PC + 4
BEQ & ~Equal: PC ← PC + 4
BEQ & Equal:  PC ← PC + SX || 00
Every path returns to instruction fetch.
17
Traditional FSM Controller
[Diagram: a State register feeds a Truth or Transition Table with columns state, op, cond, next state, control points. Inputs: op and Equal from the datapath. Outputs: next state back to the State register, control points to the datapath. Signal widths labeled 6, 4, 11.]
18
Traditional FSM Controller
datapath + state diagram => control
– Translate RTN statements into control points.
– Assign states.
– Implement the controller.
19
Mapping RTNs To Control Points: Examples & State Assignments
State  RTN                              Step / condition           Example control points
0000   IR ← MEM[PC]                     "instruction fetch"        imem_rd, IRen
0001   A ← R[rs], B ← R[rt]             "decode / operand fetch"   Aen, Ben
0100   R ← A fun B                      R-type, Execute            ALUfun, Sen
0101   R[rd] ← R, PC ← PC + 4           R-type, Write-back         RegDst, RegWr, PCen
0110   R ← A or ZX                      ORi, Execute
0111   R[rt] ← R, PC ← PC + 4           ORi, Write-back
1000   R ← A + SX                       LW, Execute
1001   M ← MEM[S]                       LW, Memory
1010   R[rt] ← M, PC ← PC + 4           LW, Write-back
1011   R ← A + SX                       SW, Execute
1100   MEM[S] ← B, PC ← PC + 4          SW, Memory
0011   PC ← PC + 4                      BEQ & ~Equal
0010   PC ← PC + SX || 00               BEQ & Equal
The last state of each instruction returns to instruction fetch state 0000.
20
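Detailed Control Specification
Column groups: IR (en), PC (en, sel), Ops (A, B), Exec (Ex, Sr, ALU, S), Mem (R, W, M), Write-Back (M-R, Wr, Dst)

State  Op field  Eq  Next  Control outputs
0000   ??????    ?   0001  IR en=1
0001   BEQ       0   0011  A=1, B=1
0001   BEQ       1   0010  A=1, B=1
0001   R-type    x   0100  A=1, B=1
0001   orI       x   0110  A=1, B=1
0001   LW        x   1000  A=1, B=1
0001   SW        x   1011  A=1, B=1
0010   xxxxxx    x   0000  PC en=1, sel=1                            (BEQ)
0011   xxxxxx    x   0000  PC en=1, sel=0                            (BEQ)
0100   xxxxxx    x   0101  Ex=0, Sr=1, ALU=fun, S=1                  (R)
0101   xxxxxx    x   0000  PC en=1, sel=0; M-R=0, Wr=1, Dst=1        (R)
0110   xxxxxx    x   0111  Ex=0, Sr=0, ALU=or, S=1                   (ORI)
0111   xxxxxx    x   0000  PC en=1, sel=0; M-R=0, Wr=1, Dst=0        (ORI)
1000   xxxxxx    x   1001  Ex=1, Sr=0, ALU=add, S=1                  (LW)
1001   xxxxxx    x   1010  Mem R=1, W=0, M=0                         (LW)
1010   xxxxxx    x   0000  PC en=1, sel=0; M-R=1, Wr=1, Dst=0        (LW)
1011   xxxxxx    x   1100  Ex=1, Sr=0, ALU=add, S=1                  (SW)
1100   xxxxxx    x   0000  PC en=1, sel=0; Mem R=0, W=1              (SW)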
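The State, Op field, Eq, and Next columns of the control specification amount to a next-state function. This is a minimal sketch of that function, using the state codes from the slides; the names `NEXT_FIXED`, `DECODE_DISPATCH`, `next_state`, and `cycles` are illustrative assumptions, and the control outputs are omitted.

```python
# States whose successor does not depend on the opcode or on Equal.
NEXT_FIXED = {
    "0000": "0001",                                   # fetch -> decode
    "0010": "0000", "0011": "0000",                   # BEQ taken / not taken
    "0100": "0101", "0101": "0000",                   # R-type
    "0110": "0111", "0111": "0000",                   # ORi
    "1000": "1001", "1001": "1010", "1010": "0000",   # LW
    "1011": "1100", "1100": "0000",                   # SW
}

# Dispatch out of the decode state 0001 for opcodes that ignore Equal.
DECODE_DISPATCH = {"R-type": "0100", "orI": "0110", "LW": "1000", "SW": "1011"}

def next_state(state, op, equal):
    """Return the next state code for the given state, opcode class, and Equal flag."""
    if state == "0001":
        if op == "BEQ":
            return "0010" if equal else "0011"
        return DECODE_DISPATCH[op]
    return NEXT_FIXED[state]

def cycles(op, equal=0):
    """Count the states visited from fetch until control returns to fetch."""
    state, count = "0000", 0
    while True:
        count += 1
        state = next_state(state, op, equal)
        if state == "0000":
            return count

for op in ("R-type", "orI", "LW", "SW", "BEQ"):
    print(op, cycles(op))
```

Counting states along each path gives the number of clock cycles that instruction class takes on this controller.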
21
Alternative Multiple Cycle Datapath (In Textbook)
Minimizes hardware: 1 memory, 1 adder
22
Alternative Multiple Cycle Datapath (In Textbook)
– Shared instruction/data memory unit.
– A single ALU shared among instructions.
– Shared units require additional or widened multiplexors.
– Temporary registers hold data between clock cycles of the instruction.
– Additional registers: Instruction Register (IR), Memory Data Register (MDR), A, B, ALUOut.
23
Operations In Each Cycle
Cycles: Instruction Fetch, Instruction Decode, Execution, Memory, Write Back

All instructions
  Instruction Fetch:  IR ← Mem[PC], PC ← PC + 4
  Instruction Decode: A ← R[rs], B ← R[rt], ALUout ← PC + (SignExt(imm16) x 4)

R-Type
  Execution:  ALUout ← A + B
  Write Back: R[rd] ← ALUout

Logic Immediate
  Execution:  ALUout ← A OR ZeroExt[imm16]
  Write Back: R[rt] ← ALUout

Load
  Execution:  ALUout ← A + SignExt(imm16)
  Memory:     M ← Mem[ALUout]
  Write Back: R[rt] ← M

Store
  Execution:  ALUout ← A + SignExt(imm16)
  Memory:     Mem[ALUout] ← B

Branch
  Execution:  If Equal = 1, PC ← ALUout
24
High-Level View of Finite State Machine Control
First steps are independent of the instruction class.
Then a series of sequences that depend on the instruction opcode.
Then the control returns to fetch a new instruction.
Each box in the diagram represents one or several states.
25
Instruction Fetch and Decode FSM States
26
Load/Store Instructions FSM States
27
R-Type Instructions FSM States
28
Jump Instruction Single State
Branch Instruction Single State
30
Finite State Machine (FSM) Specification
State  RTN                                              Step
0000   IR ← MEM[PC], PC ← PC + 4                        "instruction fetch"
0001   A ← R[rs], B ← R[rt], ALUout ← PC + SX           "decode"
0100   ALUout ← A fun B                                 R-type, Execute
0101   R[rd] ← ALUout                                   R-type, Write-back
0110   ALUout ← A op ZX                                 ORi, Execute
0111   R[rt] ← ALUout                                   ORi, Write-back
1000   ALUout ← A + SX                                  LW, Execute
1001   M ← MEM[ALUout]                                  LW, Memory
1010   R[rt] ← M                                        LW, Write-back
1011   ALUout ← A + SX                                  SW, Execute
1100   MEM[ALUout] ← B                                  SW, Memory
0010   If A = B then PC ← ALUout                        BEQ, Execute
The last state of each instruction returns to instruction fetch.
31
MIPS Multi-cycle Datapath Performance Evaluation
What is the average CPI?
– State diagram gives CPI for each instruction type.
– Workload below gives frequency of each type.

Type          CPI_i for type   Frequency   CPI_i x freq_i
Arith/Logic   4                40%         1.6
Load          5                30%         1.5
Store         4                10%         0.4
Branch        3                20%         0.6
Average CPI: 4.1

Better than CPI = 5 if all instructions took the same number of clock cycles (5).
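The average CPI is the frequency-weighted sum of the per-type CPIs. This is a minimal sketch using the workload figures from the slide; the dictionary layout and the name `average_cpi` are illustrative choices. Frequencies are kept as integer percentages to avoid floating-point rounding in the sum.

```python
# Instruction type -> (CPI for that type, frequency in percent), from the slide.
workload = {
    "Arith/Logic": (4, 40),
    "Load": (5, 30),
    "Store": (4, 10),
    "Branch": (3, 20),
}

def average_cpi(workload):
    """Weighted average: sum of CPI_i x freq_i over all instruction types."""
    return sum(cpi * pct for cpi, pct in workload.values()) / 100

avg = average_cpi(workload)
print(avg)  # 4.1
```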