1 SEQ CPU Implementation. 2 Outline SEQ Implementation Suggested Reading 4.3.1, 4.3.4.

Presentation transcript:

1 SEQ CPU Implementation

2 Outline
–SEQ Implementation
–Suggested Reading: 4.3.1, 4.3.4

3 What will we discuss today?
The implementation of a sequential CPU (SEQ)
–Every instruction finishes in one clock cycle
–Instructions execute sequentially
–No two instructions execute in parallel or overlap

4 Computation Steps
All instructions follow the same general pattern; they differ in what gets computed at each step.
Example: OPl rA, rB
Stage       Computation             General step
Fetch       icode:ifun ← M1[PC]     Read instruction byte
            rA:rB ← M1[PC+1]        Read register byte
                                    [Read constant word]
            valP ← PC+2             Compute next PC
Decode      valA ← R[rA]            Read operand A
            valB ← R[rB]            Read operand B
Execute     valE ← valB OP valA     Perform ALU operation
            Set CC                  Set condition code register
Memory                              [Memory read/write]
Write back  R[rB] ← valE            Write back ALU result
                                    [Write back memory result]
PC update   PC ← valP               Update PC
Values produced per stage
–Fetch: icode, ifun, rA, rB, valC, valP
–Decode: valA, srcA, valB, srcB
–Execute: valE, Cond code
–Memory: valM
–Write back: dstE, dstM
–PC update: PC
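The six steps for OPl can be traced in software. The following is a minimal Python sketch, not the hardware itself: `step_opl`, the dictionary-based register file and byte memory, and the reduced condition codes (ZF and SF only, OF omitted) are illustrative assumptions.

```python
# Hypothetical software model of one SEQ cycle for "OPl rA, rB".
OPS = {
    0: lambda b, a: b + a,  # addl
    1: lambda b, a: b - a,  # subl
    2: lambda b, a: b & a,  # andl
    3: lambda b, a: b ^ a,  # xorl
}

def step_opl(pc, regs, mem):
    """Return (new_pc, new_regs, cc) for the OPl instruction at mem[pc]."""
    # Fetch: split the instruction byte and the register byte.
    icode, ifun = mem[pc] >> 4, mem[pc] & 0xF
    rA, rB = mem[pc + 1] >> 4, mem[pc + 1] & 0xF
    valP = pc + 2
    # Decode: read both operands.
    valA, valB = regs[rA], regs[rB]
    # Execute: ALU operation with 32-bit wraparound, then set CC.
    valE = OPS[ifun](valB, valA) & 0xFFFFFFFF
    cc = {"ZF": valE == 0, "SF": bool(valE >> 31)}  # OF omitted in this sketch
    # Memory: no operation for OPl.
    # Write back: the ALU result goes to rB (inputs are left untouched).
    new_regs = dict(regs)
    new_regs[rB] = valE
    # PC update.
    return valP, new_regs, cc
```

For example, the two bytes 0x60 0x23 encode addl %edx,%ebx (register IDs 2 and 3), so calling `step_opl(0, {2: 0x200, 3: 0x100}, {0: 0x60, 1: 0x23})` leaves 0x300 in register 3 and advances the PC to 2.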

5 Computation Steps
All instructions follow the same general pattern; they differ in what gets computed at each step.
Example: call Dest
Stage       Computation             General step
Fetch       icode:ifun ← M1[PC]     Read instruction byte
                                    [Read register byte]
            valC ← M4[PC+1]         Read constant word
            valP ← PC+5             Compute next PC
Decode                              [Read operand A]
            valB ← R[%esp]          Read operand B
Execute     valE ← valB + –4        Perform ALU operation
                                    [Set condition code reg.]
Memory      M4[valE] ← valP         [Memory read/write]
Write back  R[%esp] ← valE          [Write back ALU result]
                                    Write back memory result
PC update   PC ← valC               Update PC
Values produced per stage
–Fetch: icode, ifun, rA, rB, valC, valP
–Decode: valA, srcA, valB, srcB
–Execute: valE, Cond code
–Memory: valM
–Write back: dstE, dstM
–PC update: PC
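The call steps can be traced the same way. This is a sketch under stated assumptions: `step_call` and the dictionary-based byte memory are hypothetical, and words are stored little-endian as in Y86.

```python
RESP = 4  # register ID for %esp

def step_call(pc, regs, mem):
    """Return (new_pc, new_regs, new_mem) for the call instruction at mem[pc]."""
    # Fetch: instruction byte, then a 4-byte constant (no register byte).
    icode, ifun = mem[pc] >> 4, mem[pc] & 0xF
    valC = int.from_bytes(bytes(mem[pc + 1 + i] for i in range(4)), "little")
    valP = pc + 5
    # Decode: only the stack pointer is read.
    valB = regs[RESP]
    # Execute: decrement the stack pointer by 4.
    valE = (valB - 4) & 0xFFFFFFFF
    # Memory: write the return address (valP) at the new top of stack.
    new_mem = dict(mem)
    for i, byte in enumerate(valP.to_bytes(4, "little")):
        new_mem[valE + i] = byte
    # Write back: update %esp.
    new_regs = dict(regs)
    new_regs[RESP] = valE
    # PC update: jump to the call target.
    return valC, new_regs, new_mem
```

With the bytes 0x80 0x40 0x00 0x00 0x00 at address 0 and %esp = 0x100, the sketch returns a new PC of 0x40, sets %esp to 0xFC, and stores the return address 5 at 0xFC.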

6 [Figure: SEQ hardware structure. Stages: Fetch, Decode, Execute, Memory, Write back, PC. Blocks: PC, instruction memory, PC increment, register file (ports A, B, E, M), CC, ALU, data memory. Signals: icode:ifun, rA:rB, valC, valP, srcA, srcB, dstE, dstM, valA, valB, aluA, aluB, Cnd, valE, addr, data, valM, newPC.]

7 Computed Values
Fetch
–icode: Instruction code
–ifun: Instruction function
–rA: Instr. register A
–rB: Instr. register B
–valC: Instruction constant
–valP: Incremented PC
Decode and Write back
–valA: Register value A
–valB: Register value B
Execute
–valE: ALU result
–Bch: Branch flag
Memory
–valM: Value from memory

8 SEQ Semantics
Achieve the same effect as a sequential execution of the assignments shown in the tables of Figures 4.18 to 4.21
–All of the state updates nevertheless occur simultaneously, as the clock rises to start the next cycle.
–A problem: popl %esp produces two register writes in the same cycle (the incremented stack pointer and the value read from memory), so the register file control logic must handle this case.
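One way to picture the simultaneous update and the popl %esp case is a register file whose two write ports are applied together at the clock edge. The sketch below is an illustration only: `clock_register_file` is a hypothetical helper, and letting port M win when both ports name the same register is an assumption of this sketch, not something stated on the slide.

```python
RNONE = 0xF  # register ID meaning "no access"

def clock_register_file(regs, dstE, valE, dstM, valM):
    """Apply both write ports at a rising clock edge and return the new state.

    The old state is not modified, so every value computed during the cycle
    is based on the state as it was before the edge.
    """
    new_regs = dict(regs)
    if dstE != RNONE:
        new_regs[dstE] = valE
    if dstM != RNONE:
        new_regs[dstM] = valM  # applied last: port M wins a conflict (assumption)
    return new_regs
```

For popl %esp with %esp = 0x100 and the word 0xABCD on top of the stack, both ports target register 4 (valE = 0x104, valM = 0xABCD); under the stated assumption the register ends up holding 0xABCD.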

9 SEQ Operation
State
–Program counter register (PC)
–Condition code register (CC)
–Register file
–Memories (both access the same memory space)
  Data: for reading/writing program data
  Instruction: for reading instructions
All updated as the clock rises

10 SEQ Operation
Combinational Logic
–ALU
–Control logic
–Memory reads: instruction memory, register file, data memory

11 SEQ Operation #2
Clock: Cycle 1, Cycle 2, Cycle 3, Cycle 4
Cycle 1: 0x000: irmovl $0x100,%ebx  # %ebx <-- 0x100
Cycle 2: 0x006: irmovl $0x200,%edx  # %edx <-- 0x200
Cycle 3: 0x00c: addl %edx,%ebx      # %ebx <-- 0x300 CC <-- 000
Cycle 4: 0x00e: je dest             # Not taken
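The four-cycle trace can be replayed with a small model in which each cycle computes a complete next state from the current one. This is a sketch with several assumptions: instructions are stored pre-decoded as tuples, only the zero flag is tracked, and the je target 0x100 is a placeholder because the slide does not give the value of dest.

```python
LENGTH = {"irmovl": 6, "addl": 2, "je": 5}  # instruction sizes in bytes

def cycle(state, program):
    """Compute the next (pc, regs, zf) without modifying the current state."""
    pc, regs, zf = state
    op, x, y = program[pc]
    valP = pc + LENGTH[op]
    new_regs = dict(regs)
    if op == "irmovl":      # x = constant, y = destination register
        new_regs[y] = x
    elif op == "addl":      # x = source register, y = destination register
        new_regs[y] = (regs[y] + regs[x]) & 0xFFFFFFFF
        zf = new_regs[y] == 0
    elif op == "je":        # x = target address, taken only when ZF is set
        valP = x if zf else valP
    return valP, new_regs, zf

program = {
    0x000: ("irmovl", 0x100, "ebx"),
    0x006: ("irmovl", 0x200, "edx"),
    0x00C: ("addl", "edx", "ebx"),
    0x00E: ("je", 0x100, None),   # 0x100 is a placeholder target
}

state = (0x000, {}, False)
trace = []
for _ in range(4):
    state = cycle(state, program)
    trace.append(state)
```

After cycle 3 the model holds %ebx = 0x300 with ZF clear, so the je in cycle 4 falls through to 0x013.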

12 SEQ Operation #2
–State set according to second irmovl instruction
–Combinational logic starting to react to state changes

13 SEQ Operation #3

14 SEQ Operation #3
–State set according to second irmovl instruction
–Combinational logic generates results for addl instruction (0x300)

15 SEQ Operation #4

16 SEQ Operation #4
–State set according to addl instruction
–Combinational logic starting to react to state changes

17 SEQ Operation #5

18 SEQ Operation #5
–State set according to addl instruction
–Combinational logic generates results for je instruction

19 SEQ Hardware Structure
Stages
–Fetch: Read instruction from memory
–Decode: Read program registers
–Execute: Compute value or address
–Memory: Read or write data
–Write Back: Write program registers
–PC: Update program counter

20 SEQ Hardware Structure
Instruction Flow
–Read instruction at address specified by PC
–Process through stages
–Update program counter

21 [Figure: SEQ hardware structure. Stages: Fetch, Decode, Execute, Memory, Write back, PC. Blocks: PC, instruction memory, PC increment, register file (ports A, B, E, M), CC, ALU, data memory. Signals: icode:ifun, rA:rB, valC, valP, srcA, srcB, dstE, dstM, valA, valB, aluA, aluB, Cnd, valE, addr, data, valM, newPC.]

22 Difference between semantics and implementation
ISA
–Each stage may update some state, and these updates occur sequentially
SEQ
–All state updates occur simultaneously, at the rising edge of the clock

23 SEQ Hardware
–Blue boxes: predesigned hardware blocks (e.g., memories, ALU)
–Gray boxes: control logic (described in HCL)
–White ovals: labels for signals
–Thick lines: 32-bit word values
–Thin lines: 4-8 bit values
–Dotted lines: 1-bit values

24 Some Macros
Name      Value  Meaning
INOP      0      Code for nop instruction
IHALT     1      Code for halt instruction
IRRMOVL   2      Code for rrmovl instruction
IIRMOVL   3      Code for irmovl instruction
IRMMOVL   4      Code for rmmovl instruction
IMRMOVL   5      Code for mrmovl instruction
IOPL      6      Code for integer op instructions
IJXX      7      Code for jump instructions
…         …      …
IPOPL     B      Code for popl instruction

25 Some Macros
Name      Value  Meaning
RESP      4      Register ID for %esp
RNONE     F      Indicates no register file access
ALUADD    0      Function for addition operation

26 Fetch Logic
[Figure: fetch logic. Blocks: PC, instruction memory, Split (byte 0), Align (bytes 1-5), PC increment. Control: Instr valid, Need regids, Need valC. Signals: imem_error, icode, ifun, rA, rB, valC, valP.]

27 Fetch Logic
Predefined Blocks
–PC: Register containing PC
–Instruction memory: Read 6 bytes (PC to PC+5)
–Split: Divide instruction byte into icode and ifun
–Align: Get fields for rA, rB, and valC

28 Fetch Logic
Control Logic
–Instr. valid: Is this instruction valid?
–Need regids: Does this instruction have a register byte?
–Need valC: Does this instruction have a constant word?

29 Fetch Control Logic in HCL
    # Determine instruction code
    int icode = [
        imem_error: INOP;
        1: imem_icode;
    ];

    # Determine instruction function
    int ifun = [
        imem_error: FNONE;
        1: imem_ifun;
    ];
[Figure: PC, instruction memory, Split (byte 0), imem_error, icode, ifun.]
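The two HCL case expressions select a safe default when the instruction fetch fails. A Python rendering is sketched below; `fetch_icode_ifun` is a hypothetical name, the high-nibble/low-nibble split follows the Split block, and FNONE = 0 is an assumption because the transcript does not give its value.

```python
INOP = 0x0   # code for nop (from the macro table)
FNONE = 0x0  # assumed value; not listed in the transcript

def fetch_icode_ifun(byte0, imem_error):
    """Split byte 0 into (icode, ifun), falling back to a nop on a fetch error."""
    imem_icode, imem_ifun = byte0 >> 4, byte0 & 0xF
    icode = INOP if imem_error else imem_icode
    ifun = FNONE if imem_error else imem_ifun
    return icode, ifun
```

For instance, `fetch_icode_ifun(0x62, False)` yields (6, 2), while the same byte with `imem_error` set yields (0, 0).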

30 Fetch Control Logic in HCL
    bool need_regids =
        icode in { IRRMOVL, IOPL, IPUSHL, IPOPL,
                   IIRMOVL, IRMMOVL, IMRMOVL };

    bool instr_valid =
        icode in { INOP, IHALT, IRRMOVL, IIRMOVL, IRMMOVL, IMRMOVL,
                   IOPL, IJXX, ICALL, IRET, IPUSHL, IPOPL };
Instruction encodings
Instruction          Encoding
halt                 0 0
nop                  1 0
cmovXX rA, rB        2 fn rA rB
irmovl V, rB         3 0 F rB V
rmmovl rA, D(rB)     4 0 rA rB D
mrmovl D(rB), rA     5 0 rA rB D
OPl rA, rB           6 fn rA rB
jXX Dest             7 fn Dest
call Dest            8 0 Dest
ret                  9 0
pushl rA             A 0 rA F
popl rA              B 0 rA F
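These signals determine the instruction length and therefore valP. The Python sketch below mirrors the two HCL expressions and uses the instruction codes from the macro table; `need_valC` and the `increment_pc` formula are not shown in the transcript and are included as assumptions derived from the encoding table (1 instruction byte, an optional register byte, an optional 4-byte constant).

```python
(INOP, IHALT, IRRMOVL, IIRMOVL, IRMMOVL, IMRMOVL,
 IOPL, IJXX, ICALL, IRET, IPUSHL, IPOPL) = range(12)

def need_regids(icode):
    return icode in {IRRMOVL, IOPL, IPUSHL, IPOPL, IIRMOVL, IRMMOVL, IMRMOVL}

def instr_valid(icode):
    return icode in {INOP, IHALT, IRRMOVL, IIRMOVL, IRMMOVL, IMRMOVL,
                     IOPL, IJXX, ICALL, IRET, IPUSHL, IPOPL}

def need_valC(icode):
    # Assumption: instructions whose encoding carries V, D, or Dest.
    return icode in {IIRMOVL, IRMMOVL, IMRMOVL, IJXX, ICALL}

def increment_pc(pc, icode):
    """valP = PC + 1 + (1 if register byte) + (4 if constant word)."""
    return pc + 1 + int(need_regids(icode)) + 4 * int(need_valC(icode))
```

Under these assumptions irmovl advances the PC by 6, OPl by 2, call by 5, and ret by 1, which matches the valP values in the computation-step tables.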

31 Decode & Write-Back Logic
Register File
–Read ports A, B
–Write ports E, M
–Addresses are register IDs or 15 (0xF) (no access)
Control Logic
–srcA, srcB: read port addresses
–dstE, dstM: write port addresses
Signals
–Cnd: Indicates whether or not to perform a conditional move; computed in the Execute stage
[Figure: register file with read ports A, B and write ports E, M. Inputs: icode, rA, rB, Cnd, valE, valM. Outputs: valA, valB.]

32 A Source
Instruction          Decode              Description
OPl rA, rB           valA ← R[rA]        Read operand A
cmovXX rA, rB        valA ← R[rA]        Read operand A
rmmovl rA, D(rB)     valA ← R[rA]        Read operand A
popl rA              valA ← R[%esp]      Read stack pointer
jXX Dest                                 No operand
call Dest                                No operand
ret                  valA ← R[%esp]      Read stack pointer

    int srcA = [
        icode in { IRRMOVL, IRMMOVL, IOPL, IPUSHL } : rA;
        icode in { IPOPL, IRET } : RESP;
        1 : RNONE;  # Don't need register
    ];
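The srcA case expression is a priority list: the first matching condition wins. A minimal Python equivalent is sketched below, with `src_a` as a hypothetical function name and instruction codes taken from the macro and encoding tables.

```python
IRRMOVL, IRMMOVL, IOPL, IJXX, ICALL, IRET, IPUSHL, IPOPL = 0x2, 0x4, 0x6, 0x7, 0x8, 0x9, 0xA, 0xB
RESP, RNONE = 0x4, 0xF

def src_a(icode, rA):
    """Select the register ID presented to read port A."""
    if icode in {IRRMOVL, IRMMOVL, IOPL, IPUSHL}:
        return rA
    if icode in {IPOPL, IRET}:
        return RESP
    return RNONE  # instruction does not read port A
```

So popl and ret read the stack pointer regardless of the rA field, while jXX and call select RNONE.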

33 E Destination
Instruction          Write-back          Description
OPl rA, rB           R[rB] ← valE        Write back result
cmovXX rA, rB        R[rB] ← valE        Conditionally write back result
rmmovl rA, D(rB)                         None
popl rA              R[%esp] ← valE      Update stack pointer
jXX Dest                                 None
call Dest            R[%esp] ← valE      Update stack pointer
ret                  R[%esp] ← valE      Update stack pointer

    int dstE = [
        icode in { IRRMOVL } && Cnd : rB;
        icode in { IIRMOVL, IOPL } : rB;
        icode in { IPUSHL, IPOPL, ICALL, IRET } : RESP;
        1 : RNONE;  # Don't write any register
    ];
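The role of Cnd in dstE is easiest to see when the expression is evaluated for a conditional move with the condition both true and false. The following Python sketch is an illustration; `dst_e` is a hypothetical name and the codes come from the macro and encoding tables.

```python
IRRMOVL, IIRMOVL, IRMMOVL, IOPL, ICALL, IRET, IPUSHL, IPOPL = 0x2, 0x3, 0x4, 0x6, 0x8, 0x9, 0xA, 0xB
RESP, RNONE = 0x4, 0xF

def dst_e(icode, rB, cnd):
    """Select the register ID written through port E."""
    if icode == IRRMOVL and cnd:
        return rB          # conditional move whose condition holds
    if icode in {IIRMOVL, IOPL}:
        return rB
    if icode in {IPUSHL, IPOPL, ICALL, IRET}:
        return RESP        # stack pointer update
    return RNONE           # no write, including a cmovXX whose condition fails
```

A cmovXX with Cnd = 0 therefore falls through to RNONE, which turns the instruction into a no-op at the register file.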

34 Execute Logic
[Figure: execute logic. Blocks: ALU, CC, cond. Control: ALU A, ALU B, ALU fun., Set CC. Inputs: icode, ifun, valC, valA, valB. Outputs: valE, Cnd.]

35 Execute Logic (Units)
ALU
–Implements 4 required functions
–Generates condition code values
CC
–Register with 3 condition code bits
cond
–Computes conditional jump/move flag

36 Execute Logic (Control Logic)
–Set CC: Should condition code register be loaded?
–ALU A: Input A to ALU
–ALU B: Input B to ALU
–ALU fun: What function should ALU compute?

37 ALU A Input
Instruction          Execute                 Description
OPl rA, rB           valE ← valB OP valA     Perform ALU operation
cmovXX rA, rB        valE ← 0 + valA         Pass valA through ALU
rmmovl rA, D(rB)     valE ← valB + valC      Compute effective address
popl rA              valE ← valB + 4         Increment stack pointer
jXX Dest                                     No operation
call Dest            valE ← valB + –4        Decrement stack pointer
ret                  valE ← valB + 4         Increment stack pointer

38 ALU A Input
    int aluA = [
        icode in { IRRMOVL, IOPL } : valA;
        icode in { IIRMOVL, IRMMOVL, IMRMOVL } : valC;
        icode in { ICALL, IPUSHL } : -4;
        icode in { IRET, IPOPL } : 4;
        # Other instructions don't need ALU
    ];
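The aluA selection can be checked against the Execute column of the table with a short Python sketch. `alu_a` is a hypothetical name, and returning `None` for instructions that do not use the ALU is a modelling choice of this sketch (the HCL simply has no matching case).

```python
IRRMOVL, IIRMOVL, IRMMOVL, IMRMOVL, IOPL, IJXX, ICALL, IRET, IPUSHL, IPOPL = (
    0x2, 0x3, 0x4, 0x5, 0x6, 0x7, 0x8, 0x9, 0xA, 0xB)

def alu_a(icode, valA, valC):
    """Select ALU input A; None means the instruction does not use the ALU."""
    if icode in {IRRMOVL, IOPL}:
        return valA
    if icode in {IIRMOVL, IRMMOVL, IMRMOVL}:
        return valC
    if icode in {ICALL, IPUSHL}:
        return -4   # stack grows downward
    if icode in {IRET, IPOPL}:
        return 4
    return None
```

For a call, the ALU then computes valB + (-4), which is the decremented stack pointer shown in the table.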

39 ALU Operation
Instruction          Execute                 Description
OPl rA, rB           valE ← valB OP valA     Perform ALU operation
cmovXX rA, rB        valE ← 0 + valA         Pass valA through ALU
rmmovl rA, D(rB)     valE ← valB + valC      Compute effective address
popl rA              valE ← valB + 4         Increment stack pointer
jXX Dest                                     No operation
call Dest            valE ← valB + –4        Decrement stack pointer
ret                  valE ← valB + 4         Increment stack pointer

40 ALU Operation
    int alufun = [
        icode == IOPL : ifun;
        1 : ALUADD;
    ];

41 Condition Set
    bool set_cc = icode in { IOPL };
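Taken together, alufun and set_cc mean that only OPl instructions choose the ALU function and update the condition codes; every other instruction adds and leaves CC alone. The Python sketch below illustrates this. The names `alu` and `execute` are hypothetical, the constants other than ALUADD are assumed names for the remaining three functions, and only ZF and SF are modelled (OF is omitted).

```python
IOPL, ICALL = 0x6, 0x8
ALUADD, ALUSUB, ALUAND, ALUXOR = 0, 1, 2, 3  # only ALUADD appears in the macro table
MASK = 0xFFFFFFFF

def alu(alufun, aluA, aluB):
    """Compute aluB OP aluA with 32-bit wraparound."""
    if alufun == ALUADD:
        result = aluB + aluA
    elif alufun == ALUSUB:
        result = aluB - aluA
    elif alufun == ALUAND:
        result = aluB & aluA
    else:
        result = aluB ^ aluA
    return result & MASK

def execute(icode, ifun, aluA, aluB, cc):
    """Return (valE, cc); cc is replaced only when set_cc is true."""
    alufun = ifun if icode == IOPL else ALUADD
    valE = alu(alufun, aluA, aluB)
    set_cc = icode in {IOPL}
    if set_cc:
        cc = {"ZF": valE == 0, "SF": valE >> 31 == 1}  # OF omitted in this sketch
    return valE, cc
```

For example, a subl of equal operands sets ZF, while a call that decrements the stack pointer returns the CC value it was given.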

42 Memory Logic
Memory
–Reads or writes memory word
Control Logic
–stat: What is the instruction status?
–Mem. read: Should a word be read?
–Mem. write: Should a word be written?
–Mem. addr.: Select address
–Mem. data: Select data
[Figure: data memory with control blocks Mem. read, Mem. write, Mem. addr, Mem. data, and Stat. Inputs: icode, valE, valA, valP, instr_valid, imem_error. Outputs: valM, dmem_error, stat.]

43 Instruction Status
Control Logic
–stat: What is the instruction status?

    ## Determine instruction status
    int Stat = [
        imem_error || dmem_error : SADR;
        !instr_valid : SINS;
        icode == IHALT : SHLT;
        1 : SAOK;
    ];
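The order of the cases in Stat matters: an address error takes precedence over an invalid instruction, which takes precedence over halt. A Python sketch of that priority follows; `stat` is a hypothetical function name, and the status codes are represented as strings because the transcript does not list their numeric values.

```python
IHALT = 0x1  # code for halt, per the macro table

def stat(imem_error, dmem_error, instr_valid, icode):
    """Return the instruction status, checking conditions in priority order."""
    if imem_error or dmem_error:
        return "SADR"   # bad instruction or data address
    if not instr_valid:
        return "SINS"   # illegal instruction code
    if icode == IHALT:
        return "SHLT"   # halt encountered
    return "SAOK"       # normal operation
```

A halt fetched from a bad address therefore reports SADR, not SHLT.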

44 Memory Address
Instruction          Memory                Description
OPl rA, rB                                 No operation
rmmovl rA, D(rB)     M4[valE] ← valA       Write value to memory
popl rA              valM ← M4[valA]       Read from stack
jXX Dest                                   No operation
call Dest            M4[valE] ← valP       Write return value on stack
ret                  valM ← M4[valA]       Read return address

45 Memory Address
    int mem_addr = [
        icode in { IRMMOVL, IPUSHL, ICALL, IMRMOVL } : valE;
        icode in { IPOPL, IRET } : valA;
        # Other instructions don't need address
    ];

46 Memory Read
Instruction          Memory                Description
OPl rA, rB                                 No operation
rmmovl rA, D(rB)     M4[valE] ← valA       Write value to memory
popl rA              valM ← M4[valA]       Read from stack
jXX Dest                                   No operation
call Dest            M4[valE] ← valP       Write return value on stack
ret                  valM ← M4[valA]       Read return address

47 Memory Read
    bool mem_read = icode in { IMRMOVL, IPOPL, IRET };
    bool mem_write = icode in { IRMMOVL, IPUSHL, ICALL };
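The memory-stage signals (mem_addr, mem_read, mem_write) can be combined into one small model. This Python sketch makes several assumptions: `memory_stage` is a hypothetical name, memory is a word-addressed dictionary, and the data selection (valP for call, valA otherwise) is inferred from the table rows for rmmovl and call because the transcript gives no HCL for Mem. data.

```python
IRMMOVL, IMRMOVL, IOPL, ICALL, IRET, IPUSHL, IPOPL = 0x4, 0x5, 0x6, 0x8, 0x9, 0xA, 0xB

def memory_stage(icode, valE, valA, valP, mem):
    """Return (valM, new_mem); valM is None when nothing is read."""
    mem_read = icode in {IMRMOVL, IPOPL, IRET}
    mem_write = icode in {IRMMOVL, IPUSHL, ICALL}
    if icode in {IRMMOVL, IPUSHL, ICALL, IMRMOVL}:
        mem_addr = valE
    elif icode in {IPOPL, IRET}:
        mem_addr = valA     # old stack pointer
    else:
        mem_addr = None     # instruction does not touch memory
    mem_data = valP if icode == ICALL else valA  # assumption, see lead-in
    valM = mem[mem_addr] if mem_read else None
    new_mem = dict(mem)
    if mem_write:
        new_mem[mem_addr] = mem_data
    return valM, new_mem
```

A call followed by a ret round-trips the return address: the call stores valP at valE, and the ret reads it back from valA.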

48 PC Update Logic
New PC
–Select next value of PC
[Figure: New PC block with inputs icode, Cnd, valC, valM, valP and output PC.]

49 PC Update
Instruction          PC update                   Description
OPl rA, rB           PC ← valP                   Update PC
rmmovl rA, D(rB)     PC ← valP                   Update PC
popl rA              PC ← valP                   Update PC
jXX Dest             PC ← Cnd ? valC : valP      Update PC
call Dest            PC ← valC                   Set PC to destination
ret                  PC ← valM                   Set PC to return address

50 PC Update
    int new_pc = [
        icode == ICALL : valC;
        icode == IJXX && Cnd : valC;
        icode == IRET : valM;
        1 : valP;
    ];
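The new_pc expression picks among three candidate addresses. A direct Python rendering is sketched below, with `new_pc` as a plain function and instruction codes taken from the macro and encoding tables.

```python
IOPL, IJXX, ICALL, IRET = 0x6, 0x7, 0x8, 0x9

def new_pc(icode, cnd, valC, valM, valP):
    """Select the next PC: call target, taken branch, return address, or fall-through."""
    if icode == ICALL:
        return valC
    if icode == IJXX and cnd:
        return valC
    if icode == IRET:
        return valM
    return valP
```

For the je in the earlier four-cycle example, Cnd is 0, so the result is valP and execution falls through to the next instruction.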

51 [Figure: complete SEQ datapath. Stages: Fetch, Decode, Execute, Memory, Write back, PC. Blocks: PC, instruction memory, PC increment, register file (ports A, B, E, M), ALU, CC, Mem control, data memory, New PC. Signals: icode, ifun, rA, rB, valC, valP, srcA, srcB, dstE, dstM, valA, valB, ALU A, ALU B, ALU fun, Cnd, valE, Addr, Data, read, write, data out, valM, newPC.]

52 SEQ Summary
Implementation
–Express every instruction as a series of simple steps
–Follow the same general flow for each instruction type
–Assemble registers, memories, and predesigned combinational blocks
–Connect with control logic

53 SEQ Summary
Limitations
–Too slow to be practical
–In one cycle, signals must propagate through the instruction memory, register file, ALU, and data memory
–The clock would therefore have to run very slowly
–Each hardware unit is active for only a fraction of the clock cycle
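The limitation is simple arithmetic: the clock period must cover the sum of the unit delays on the path, so each unit is busy for only its own share of the cycle. The Python sketch below illustrates this; the function names are hypothetical and the delay figures are made-up placeholders, not measurements from the slides.

```python
def seq_cycle_time(delays_ps):
    """Minimum clock period when one instruction passes through every unit per cycle."""
    return sum(delays_ps.values())

def utilization(delays_ps):
    """Fraction of the clock cycle during which each unit is doing useful work."""
    total = seq_cycle_time(delays_ps)
    return {unit: delay / total for unit, delay in delays_ps.items()}

# Hypothetical delays in picoseconds, chosen only to illustrate the calculation.
example_delays = {
    "instruction memory": 100,
    "register file": 50,
    "ALU": 100,
    "data memory": 100,
}
```

With these placeholder numbers the cycle must be at least 350 ps, and the ALU is active for 100/350 of it.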

54 Next
–General Principles of Pipelining
–Naïve PIPE Implementation
–Suggested Reading: 4.4, 4.5