1 Seoul National University Sequential Implementation.

Presentation transcript:

1 Seoul National University Sequential Implementation

2 Y86 Instruction Set #1

Instruction encodings (each cell is a hex digit or a field; the constant word is 4 bytes):

Instruction         Byte 0   Register byte   Constant word
halt                0 0
nop                 1 0
cmovXX rA, rB       2 fn     rA rB
irmovl V, rB        3 0      8 rB            V
rmmovl rA, D(rB)    4 0      rA rB           D
mrmovl D(rB), rA    5 0      rA rB           D
OPl rA, rB          6 fn     rA rB
jXX Dest            7 fn                     Dest
call Dest           8 0                      Dest
ret                 9 0
pushl rA            A 0      rA 8
popl rA             B 0      rA 8

3 Y86 Instruction Set #2

Function codes for cmovXX rA, rB (encoding: 2 fn | rA rB):

rrmovl   2 0
cmovle   2 1
cmovl    2 2
cmove    2 3
cmovne   2 4
cmovge   2 5
cmovg    2 6

4 Y86 Instruction Set #3

Function codes for OPl rA, rB (encoding: 6 fn | rA rB):

addl   6 0
subl   6 1
andl   6 2
xorl   6 3

5 Y86 Instruction Set #4

Function codes for jXX Dest (encoding: 7 fn | Dest):

jmp   7 0
jle   7 1
jl    7 2
je    7 3
jne   7 4
jge   7 5
jg    7 6

6 Building Blocks

Combinational Logic
- Compute Boolean functions of inputs
- Continuously respond to input changes
- Operate on data and implement control

Storage Elements
- Store bits
- Addressable memories
- Non-addressable registers
- Loaded only as clock rises

[Figure: ALU (inputs A, B, fun), multiplexer (MUX), and equality comparator (=) as combinational blocks; register file with read ports A, B (srcA/valA, srcB/valB), write port W (dstW/valW), and a clock input; a clocked register.]

7 Hardware Control Language

- Very simple hardware description language
- Can only express limited aspects of hardware operation
  - Parts we want to explore and modify

Data Types
- bool: Boolean
  - a, b, c, …
- int: words
  - A, B, C, …
  - Does not specify word size (bytes, 32-bit words, …)

Statements
- bool a = bool-expr ;
- int A = int-expr ;

8 HCL Operations

- Classify by type of value returned

Boolean Expressions
- Logic operations
  - a && b, a || b, !a
- Word comparisons
  - A == B, A != B, A < B, A <= B, A >= B, A > B
- Set membership
  - A in { B, C, D }
  - Same as A == B || A == C || A == D

Word Expressions
- Case expressions
  - [ a : A; b : B; c : C ]
  - Evaluate test expressions a, b, c, … in sequence
  - Return word expression A, B, C, … for first successful test
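A rough Python analogue of set membership and case expressions may make the evaluation order concrete. The helper names `hcl_in` and `case_expr` are hypothetical (they are not part of HCL), and returning `None` when no test succeeds is an assumption made only for this sketch.

```python
def hcl_in(a, *members):
    """Set membership: A in { B, C, D } means A == B || A == C || A == D."""
    return any(a == m for m in members)


def case_expr(*arms):
    """Case expression [ a : A; b : B; ... ]: the first successful test wins."""
    for test, value in arms:
        if test:
            return value
    return None  # assumption: no test succeeded, treat result as "don't care"


# Minimum of three words, written as a case expression with a default arm.
A, B, C = 3, 7, 5
min3 = case_expr(
    (A <= B and A <= C, A),
    (B <= A and B <= C, B),
    (True, C),  # plays the role of the "1 : C" default arm
)
```

Unlike hardware, Python evaluates every arm's value before the selection is made; the sketch only mirrors which value gets selected.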

9 SEQ Hardware Structure

"State"
- Register file
- Memory
  - Instruction: for reading instructions
  - Data: for reading/writing program data
- Program counter register (PC)
- Condition code register (CC)

Instruction Flow
- Read instruction at address specified by PC
- Process through stages
- Update the "state", including the program counter

[Figure: SEQ datapath. Fetch (PC, instruction memory, PC increment) produces icode, ifun, rA, rB, valC, valP; Decode (register file read ports A, B) produces valA, valB; Execute (ALU, CC) produces Cnd, valE; Memory (data memory, Addr/Data) produces valM; Write back (register file write ports E, M) uses valE, valM; PC update produces newPC.]

10 SEQ Stages

Fetch
- Read instruction from instruction memory

Decode
- Read from registers

Execute
- Compute value or address

Memory
- Read from or write to memory

Write Back
- Write to registers

PC
- Update program counter

[Figure: SEQ datapath with the Fetch, Decode, Execute, Memory, and Write back stages labeled.]

11 Instruction Decoding

Instruction Format
- Instruction byte: icode:ifun
- Optional register byte: rA:rB
- Optional constant word: valC

Example layout: 5 0 | rA rB | D corresponds to icode ifun | rA rB | valC, where the register byte and the constant word are optional.
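The field split can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the example covers just the 6-byte format with both optional fields present, and little-endian storage of the constant word is assumed (as on IA32).

```python
def split_byte(b):
    """Split one byte into its high and low 4-bit fields."""
    return (b >> 4) & 0xF, b & 0xF


def decode_fields(code):
    """Decode a 6-byte instruction that has a register byte and a constant."""
    icode, ifun = split_byte(code[0])            # instruction byte
    rA, rB = split_byte(code[1])                 # register byte
    valC = int.from_bytes(code[2:6], "little")   # constant word (assumed LE)
    return icode, ifun, rA, rB, valC


# Bytes 50 15 08 00 00 00: icode 5 (mrmovl), rA = 1, rB = 5, D = 8.
fields = decode_fields(bytes.fromhex("501508000000"))
```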

12 Executing Arith./Logical Operation

OPl rA, rB    encoding: 6 fn | rA rB

Fetch
- Read 2 bytes

Decode
- Read operand registers

Execute
- Perform operation
- Set condition codes

Memory
- Do nothing

Write back
- Update register

PC Update
- Increment PC by 2

13 Stage Computation: Arith/Log. Ops

- Formulate instruction execution as sequence of simple steps
- Use the same general form for all instructions

Stage        OPl rA, rB
Fetch        icode:ifun ← M1[PC]
             rA:rB ← M1[PC+1]
             valP ← PC+2
Decode       valA ← R[rA]
             valB ← R[rB]
Execute      valE ← valB OP valA
             Set CC
Memory
Write back   R[rB] ← valE
PC update    PC ← valP
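The stage computation for OPl can be traced with a small Python sketch. It is a simplification under stated assumptions: registers are a list indexed by register ID, memory is a bytes object, words are 32 bits, only the ZF and SF condition codes are modeled (OF is left out for brevity), and `step_opl` is a hypothetical name.

```python
MASK = 0xFFFFFFFF

# ALU functions selected by ifun (addl 0, subl 1, andl 2, xorl 3).
ALU_OPS = {
    0: lambda b, a: b + a,
    1: lambda b, a: b - a,
    2: lambda b, a: b & a,
    3: lambda b, a: b ^ a,
}


def step_opl(pc, R, M):
    """Run one OPl instruction through the stages; returns (new PC, CC)."""
    # Fetch
    icode, ifun = M[pc] >> 4, M[pc] & 0xF
    rA, rB = M[pc + 1] >> 4, M[pc + 1] & 0xF
    valP = pc + 2
    assert icode == 0x6, "this sketch only handles OPl"
    # Decode
    valA, valB = R[rA], R[rB]
    # Execute: valE <- valB OP valA, then set CC (ZF and SF only here)
    valE = ALU_OPS[ifun](valB, valA) & MASK
    cc = {"ZF": valE == 0, "SF": (valE >> 31) == 1}
    # Memory: nothing to do
    # Write back
    R[rB] = valE
    # PC update
    return valP, cc


R = [0] * 8
R[0], R[3] = 3, 10
M = bytes.fromhex("6103")        # subl with rA = 0, rB = 3
new_pc, cc = step_opl(0, R, M)   # R[3] becomes 10 - 3
```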

14 Executing rmmovl

rmmovl rA, D(rB)    encoding: 4 0 | rA rB | D

Fetch
- Read 6 bytes

Decode
- Read operand registers

Execute
- Compute effective address

Memory
- Write to memory

Write back
- Do nothing

PC Update
- Increment PC by 6

15 Stage Computation: rmmovl

Stage        rmmovl rA, D(rB)
Fetch        icode:ifun ← M1[PC]
             rA:rB ← M1[PC+1]
             valC ← M4[PC+2]
             valP ← PC+6
Decode       valA ← R[rA]
             valB ← R[rB]
Execute      valE ← valB + valC
Memory       M4[valE] ← valA
Write back
PC update    PC ← valP

16 Stage Computation: mrmovl

Stage        mrmovl D(rB), rA
Fetch        icode:ifun ← M1[PC]
             rA:rB ← M1[PC+1]
             valC ← M4[PC+2]
             valP ← PC+6
Decode       valB ← R[rB]
Execute      valE ← valB + valC
Memory       valM ← M4[valE]
Write back   R[rA] ← valM
PC update    PC ← valP
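The two memory moves differ only in the Memory and Write back stages, which a short Python sketch can show side by side. Assumptions for illustration: memory is a bytearray, 4-byte words are stored little-endian, the Fetch stage has already produced rA, rB, and valC, and the helper names are hypothetical.

```python
def m4_read(M, addr):
    """M4[addr]: read a 4-byte word (little-endian is assumed)."""
    return int.from_bytes(M[addr:addr + 4], "little")


def m4_write(M, addr, val):
    """M4[addr] <- val: write a 4-byte word."""
    M[addr:addr + 4] = (val & 0xFFFFFFFF).to_bytes(4, "little")


def rmmovl(R, M, rA, rB, valC):
    valA, valB = R[rA], R[rB]      # Decode
    valE = valB + valC             # Execute: effective address
    m4_write(M, valE, valA)        # Memory: write; no Write back


def mrmovl(R, M, rA, rB, valC):
    valB = R[rB]                   # Decode
    valE = valB + valC             # Execute: effective address
    valM = m4_read(M, valE)        # Memory: read
    R[rA] = valM                   # Write back


R = [0] * 8
R[0], R[3] = 0xDEADBEEF, 16
M = bytearray(64)
rmmovl(R, M, 0, 3, 8)   # store R[0] at address 16 + 8
mrmovl(R, M, 1, 3, 8)   # load it back into R[1]
```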

17 Stage Computation: rrmovl

Stage        rrmovl rA, rB
Fetch        icode:ifun ← M1[PC]
             rA:rB ← M1[PC+1]
             valP ← PC+2
Decode       valA ← R[rA]
Execute      valE ← 0 + valA
Memory
Write back   R[rB] ← valE
PC update    PC ← valP

18 Stage Computation: irmovl

Stage        irmovl V, rB
Fetch        icode:ifun ← M1[PC]
             rA:rB ← M1[PC+1]
             valC ← M4[PC+2]
             valP ← PC+6
Decode
Execute      valE ← 0 + valC
Memory
Write back   R[rB] ← valE
PC update    PC ← valP

19 Stage Computation: pushl

Stage        pushl rA
Fetch        icode:ifun ← M1[PC]
             rA:rB ← M1[PC+1]
             valP ← PC+2
Decode       valA ← R[rA]
             valB ← R[%esp]
Execute      valE ← valB + (-4)
Memory       M4[valE] ← valA
Write back   R[%esp] ← valE
PC update    PC ← valP

20 Stage Computation: popl

Stage        popl rA
Fetch        icode:ifun ← M1[PC]
             rA:rB ← M1[PC+1]
             valP ← PC+2
Decode       valA ← R[%esp]
             valB ← R[%esp]
Execute      valE ← valB + 4
Memory       valM ← M4[valA]
Write back   R[%esp] ← valE
             R[rA] ← valM
PC update    PC ← valP
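A Python sketch of the pushl and popl stage computations shows why popl reads the stack pointer twice: valA supplies the old address for the memory read while valB feeds the ALU. Assumptions for illustration: memory is a dict mapping an address to a whole word, register ID 4 stands for %esp, and the function names are hypothetical.

```python
RESP = 4  # assumed register ID of %esp


def pushl(R, M, rA):
    valA, valB = R[rA], R[RESP]    # Decode
    valE = valB + (-4)             # Execute: decrement stack pointer
    M[valE] = valA                 # Memory: write at the new top of stack
    R[RESP] = valE                 # Write back


def popl(R, M, rA):
    valA = R[RESP]                 # Decode: old stack pointer, used as address
    valB = R[RESP]                 # Decode: same value, used as ALU input
    valE = valB + 4                # Execute: increment stack pointer
    valM = M[valA]                 # Memory: read from the old top of stack
    R[RESP] = valE                 # Write back: new stack pointer
    R[rA] = valM                   # Write back: popped value


R = [0] * 8
R[RESP], R[0] = 0x100, 42
M = {}
pushl(R, M, 0)
sp_after_push = R[RESP]
popl(R, M, 1)
```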

21 Stage Computation: Jumps

Stage        jXX Dest
Fetch        icode:ifun ← M1[PC]
             valC ← M4[PC+1]
             valP ← PC+5
Decode
Execute      Cnd ← Cond(CC, ifun)
Memory
Write back
PC update    PC ← Cnd ? valC : valP
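The slides do not spell out Cond(CC, ifun). The Python sketch below assumes the usual x86-style definitions in terms of the zero, sign, and overflow flags, with ifun values taken from the jXX function codes (jmp 0 through jg 6); the function names are hypothetical.

```python
def cond(zf, sf, of, ifun):
    """Cnd <- Cond(CC, ifun), with assumed x86-style flag combinations."""
    table = {
        0: True,                    # jmp: always
        1: (sf != of) or zf,        # jle
        2: sf != of,                # jl
        3: zf,                      # je
        4: not zf,                  # jne
        5: sf == of,                # jge
        6: (sf == of) and not zf,   # jg
    }
    return table[ifun]


def jxx_new_pc(zf, sf, of, ifun, valC, valP):
    """PC <- Cnd ? valC : valP"""
    return valC if cond(zf, sf, of, ifun) else valP


taken = jxx_new_pc(True, False, False, 3, 0x40, 0x18)       # je, ZF set
not_taken = jxx_new_pc(False, False, False, 3, 0x40, 0x18)  # je, ZF clear
```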

22 Stage Computation: call

Stage        call Dest
Fetch        icode:ifun ← M1[PC]
             valC ← M4[PC+1]
             valP ← PC+5
Decode       valB ← R[%esp]
Execute      valE ← valB + (-4)
Memory       M4[valE] ← valP
Write back   R[%esp] ← valE
PC update    PC ← valC

23 Stage Computation: ret

Stage        ret
Fetch        icode:ifun ← M1[PC]
             valP ← PC+1
Decode       valA ← R[%esp]
             valB ← R[%esp]
Execute      valE ← valB + 4
Memory       valM ← M4[valA]
Write back   R[%esp] ← valE
PC update    PC ← valM

24 Stage Computation (More Structured)

- All instructions follow the same general pattern
- Differ in what gets computed during each step
- Steps in [brackets] are not used by this instruction

Stage        Signals       OPl rA, rB              General step
Fetch        icode,ifun    icode:ifun ← M1[PC]     Read instruction byte
             rA,rB         rA:rB ← M1[PC+1]        Read register byte
             valC                                  [Read constant word]
             valP          valP ← PC+2             Compute next PC
Decode       valA, srcA    valA ← R[rA]            Read operand A
             valB, srcB    valB ← R[rB]            Read operand B
Execute      valE          valE ← valB OP valA     Perform ALU operation
             Cond code     Set CC                  Set condition code register
Memory       valM                                  [Memory read/write]
Write back   dstE          R[rB] ← valE            Write back ALU result
             dstM                                  [Write back memory result]
PC update    PC            PC ← valP               Update PC

25 Stage Computation (More Structured)

- All instructions follow the same general pattern
- Differ in what gets computed during each step
- Steps in [brackets] are not used by this instruction

Stage        Signals       call Dest               General step
Fetch        icode,ifun    icode:ifun ← M1[PC]     Read instruction byte
             rA,rB                                 [Read register byte]
             valC          valC ← M4[PC+1]         Read constant word
             valP          valP ← PC+5             Compute next PC
Decode       valA, srcA                            [Read operand A]
             valB, srcB    valB ← R[%esp]          Read operand B
Execute      valE          valE ← valB + (-4)      Perform ALU operation
             Cond code                             [Set condition code reg.]
Memory       valM          M4[valE] ← valP         Memory read/write
Write back   dstE          R[%esp] ← valE          Write back ALU result
             dstM                                  [Write back memory result]
PC update    PC            PC ← valC               Update PC

26 Computed Values

Fetch
- icode: Instruction code
- ifun: Instruction function
- rA: Instr. register A
- rB: Instr. register B
- valC: Instruction constant
- valP: Incremented PC

Decode
- srcA: Register ID A
- srcB: Register ID B
- dstE: Destination register E
- dstM: Destination register M
- valA: Register value A
- valB: Register value B

Execute
- valE: ALU result
- Cnd: Branch/move flag

Memory
- valM: Value from memory

27 SEQ Hardware

Key
- Blue boxes: predesigned hardware blocks
  - E.g., memories, ALU
- Gray boxes: control logic
  - Described in HCL
- White ovals: labels for signals
- Thick lines: 32-bit word values
- Thin lines: 4-8 bit values
- Dotted lines: 1-bit values

28 Fetch Logic #1

Predefined Blocks
- PC: register containing PC
- Instruction memory: read 6 bytes (PC to PC+5)
  - Signal invalid address (imem_error)
- Split: divide instruction byte (byte 0) into icode and ifun
- Align: get fields for rA, rB, and valC (from bytes 1-5)

[Figure: fetch logic. PC feeds the instruction memory; Split takes byte 0 and Align takes bytes 1-5; control blocks Instr valid, Need regids, Need valC; PC increment produces valP.]

29 Fetch Logic #2

Control Logic
- Instr. Valid: is this instruction valid?
- icode, ifun: generate no-op if invalid address
- Need regids: does this instruction have a register byte?
- Need valC: does this instruction have a constant word?

[Figure: fetch logic, same blocks as in Fetch Logic #1.]

30 Fetch Control Logic in HCL

    # Determine instruction code
    int icode = [
        imem_error : INOP;
        1          : imem_icode;
    ];

    # Determine instruction function
    int ifun = [
        imem_error : FNONE;
        1          : imem_ifun;
    ];

[Figure: PC, instruction memory, and Split block producing icode and ifun from byte 0, with imem_error as a control input.]

31 Fetch Control Logic in HCL

    # Does the instruction have a register byte?
    bool need_regids =
        icode in { IRRMOVL, IOPL, IPUSHL, IPOPL,
                   IIRMOVL, IRMMOVL, IMRMOVL };

    # Is the instruction code one of the defined instructions?
    bool instr_valid =
        icode in { IHALT, INOP, IRRMOVL, IIRMOVL, IRMMOVL, IMRMOVL,
                   IOPL, IJXX, ICALL, IRET, IPUSHL, IPOPL };

[Figure: the instruction encoding table from Y86 Instruction Set #1, shown for reference.]
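These fetch signals determine the instruction length and therefore valP. The Python sketch below is an illustration under stated assumptions: icode values are the first hex digit of each encoding, the `need_valC` set is inferred from which formats carry a constant word (its HCL is not shown on the slides), and the formula for valP is checked only against the lengths used in the stage computations (2, 6, 5, and 1 bytes).

```python
(IHALT, INOP, IRRMOVL, IIRMOVL, IRMMOVL, IMRMOVL,
 IOPL, IJXX, ICALL, IRET, IPUSHL, IPOPL) = range(12)


def need_regids(icode):
    return icode in {IRRMOVL, IOPL, IPUSHL, IPOPL,
                     IIRMOVL, IRMMOVL, IMRMOVL}


def need_valC(icode):
    # Assumption: inferred from the formats that include V, D, or Dest.
    return icode in {IIRMOVL, IRMMOVL, IMRMOVL, IJXX, ICALL}


def instr_valid(icode):
    return icode in range(IHALT, IPOPL + 1)


def val_p(pc, icode):
    """Incremented PC: 1 instruction byte, plus optional fields."""
    return pc + 1 + int(need_regids(icode)) + 4 * int(need_valC(icode))


lengths = {name: val_p(0, code) for name, code in
           [("OPl", IOPL), ("rmmovl", IRMMOVL), ("jXX", IJXX),
            ("ret", IRET), ("pushl", IPUSHL)]}
```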

32 Decode Logic

Register File
- Read ports A, B
- Write ports E, M
- Addresses are register IDs or 15 (0xF) (no access)

Control Logic
- srcA, srcB: read port addresses
- dstE, dstM: write port addresses

Signals
- Cnd: indicates whether or not to perform a conditional move
  - Computed in Execute stage

[Figure: register file with read ports A, B (addresses srcA, srcB; outputs valA, valB) and write ports E, M (addresses dstE, dstM; inputs valE, valM); control blocks take icode, rA, rB, and Cnd.]

33 A Source

    int srcA = [
        icode in { IRRMOVL, IRMMOVL, IOPL, IPUSHL } : rA;
        icode in { IPOPL, IRET }                    : RESP;
        1 : RNONE;  # Don't need register
    ];

Decode-stage use of operand A, by instruction:

Instruction          Decode
OPl rA, rB           valA ← R[rA]
rrmovl rA, rB        valA ← R[rA]
rmmovl rA, D(rB)     valA ← R[rA]
pushl rA             valA ← R[rA]
popl rA              valA ← R[%esp]
ret                  valA ← R[%esp]
irmovl V, rB         (no operand A)
mrmovl D(rB), rA     (no operand A)
jXX Dest             (no operand A)
call Dest            (no operand A)

34 E Destination

    int dstE = [
        icode in { IRRMOVL } && Cnd              : rB;
        icode in { IIRMOVL, IOPL }               : rB;
        icode in { IPUSHL, IPOPL, ICALL, IRET }  : RESP;
        1 : RNONE;  # Don't write any register
    ];

Write-back of valE (port E), by instruction:

Instruction          Write-back
OPl rA, rB           R[rB] ← valE
rrmovl rA, rB        R[rB] ← valE
irmovl V, rB         R[rB] ← valE
pushl rA             R[%esp] ← valE
popl rA              R[%esp] ← valE
call Dest            R[%esp] ← valE
ret                  R[%esp] ← valE
rmmovl rA, D(rB)     (no write on port E)
mrmovl D(rB), rA     (no write on port E)
jXX Dest             (no write on port E)
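The srcA and dstE case expressions translate almost line for line into Python, which makes it easy to try individual instructions. In this sketch the icode constants are the first hex digit of each encoding, RNONE is 0xF as stated for the register file, and RESP = 4 is an assumed register ID for %esp.

```python
IRRMOVL, IIRMOVL, IRMMOVL, IOPL = 0x2, 0x3, 0x4, 0x6
ICALL, IRET, IPUSHL, IPOPL = 0x8, 0x9, 0xA, 0xB
RESP, RNONE = 0x4, 0xF  # RESP value is an assumption


def src_a(icode, rA):
    if icode in {IRRMOVL, IRMMOVL, IOPL, IPUSHL}:
        return rA
    if icode in {IPOPL, IRET}:
        return RESP
    return RNONE  # don't need register


def dst_e(icode, rB, cnd):
    if icode == IRRMOVL and cnd:
        return rB
    if icode in {IIRMOVL, IOPL}:
        return rB
    if icode in {IPUSHL, IPOPL, ICALL, IRET}:
        return RESP
    return RNONE  # don't write any register


# A conditional move whose condition fails writes no register.
skipped = dst_e(IRRMOVL, 3, cnd=False)
performed = dst_e(IRRMOVL, 3, cnd=True)
```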

35 Execute Logic

Units
- ALU
  - Implements 4 required functions
  - Generates condition code values
- CC
  - Register with 3 condition code bits
- cond
  - Computes conditional jump/move flag

Control Logic
- Set CC: should condition code register be loaded?
- ALU A: input A to ALU
- ALU B: input B to ALU
- ALU fun: what function should ALU compute?

[Figure: ALU with inputs selected by ALU A, ALU B, and ALU fun from icode, ifun, valC, valA, valB; outputs valE and new condition codes to CC (gated by Set CC); cond block produces Cnd from CC and ifun.]

36 ALU A Input

    int aluA = [
        icode in { IRRMOVL, IOPL }             : valA;
        icode in { IIRMOVL, IRMMOVL, IMRMOVL } : valC;
        icode in { ICALL, IPUSHL }             : -4;
        icode in { IRET, IPOPL }               : 4;
        # Other instructions don't need ALU
    ];

Execute-stage computation, by instruction:

Instruction          Execute
OPl rA, rB           valE ← valB OP valA
rrmovl rA, rB        valE ← 0 + valA
irmovl V, rB         valE ← 0 + valC
rmmovl rA, D(rB)     valE ← valB + valC
mrmovl D(rB), rA     valE ← valB + valC
pushl rA             valE ← valB + (-4)
call Dest            valE ← valB + (-4)
popl rA              valE ← valB + 4
ret                  valE ← valB + 4
jXX Dest             (ALU not used)

37 ALU Operation

    int alufun = [
        icode == IOPL : ifun;
        1             : ALUADD;
    ];

Only OPl selects the ALU function from ifun (valE ← valB OP valA). Every other instruction that uses the ALU performs an addition: valB + valC (rmmovl, mrmovl), valB + (-4) (pushl, call), valB + 4 (popl, ret), 0 + valA (rrmovl), 0 + valC (irmovl).
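Putting the ALU input and function selection together, a Python sketch can reproduce the Execute-stage results for a few instructions. Assumptions for illustration: `alu_b` is inferred from the stage computations (its HCL is not shown on the slides), ALUADD equals the addl function code 0, words are 32 bits, and all function names are hypothetical.

```python
IRRMOVL, IIRMOVL, IRMMOVL, IMRMOVL, IOPL = 0x2, 0x3, 0x4, 0x5, 0x6
ICALL, IRET, IPUSHL, IPOPL = 0x8, 0x9, 0xA, 0xB
ALUADD = 0
MASK = 0xFFFFFFFF


def alu_a(icode, valA, valC):
    if icode in {IRRMOVL, IOPL}:
        return valA
    if icode in {IIRMOVL, IRMMOVL, IMRMOVL}:
        return valC
    if icode in {ICALL, IPUSHL}:
        return -4
    if icode in {IRET, IPOPL}:
        return 4
    return None  # other instructions don't need the ALU


def alu_b(icode, valB):
    # Assumption: register moves add to 0, everything else uses valB.
    return 0 if icode in {IRRMOVL, IIRMOVL} else valB


def alu_fun(icode, ifun):
    return ifun if icode == IOPL else ALUADD


def alu(fun, a, b):
    """Compute b OP a for fun = 0 (add), 1 (sub), 2 (and), 3 (xor)."""
    results = {0: b + a, 1: b - a, 2: b & a, 3: b ^ a}
    return results[fun] & MASK


def execute(icode, ifun, valA, valB, valC):
    return alu(alu_fun(icode, ifun), alu_a(icode, valA, valC),
               alu_b(icode, valB))


push_valE = execute(IPUSHL, 0, valA=99, valB=0x100, valC=0)  # valB + (-4)
sub_valE = execute(IOPL, 1, valA=3, valB=10, valC=0)         # valB - valA
irmov_valE = execute(IIRMOVL, 0, valA=0, valB=1234, valC=5)  # 0 + valC
```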

38 Memory Logic

Memory
- Reads or writes memory word

Control Logic
- stat: what is instruction status?
- Mem. read: should word be read?
- Mem. write: should word be written?
- Mem. addr.: select address
- Mem. data.: select data

[Figure: data memory with read/write controls (Mem. read, Mem. write), address input (Mem. addr, chosen from valE and valA), data in (Mem. data, chosen from valA and valP), and data out (valM); Stat block takes icode, instr_valid, imem_error, and dmem_error.]

39 Instruction Status

Control Logic
- stat: what is instruction status?

    ## Determine instruction status
    int Stat = [
        imem_error || dmem_error : SADR;
        !instr_valid             : SINS;
        icode == IHALT           : SHLT;
        1                        : SAOK;
    ];

[Figure: memory logic, with the Stat block taking icode, instr_valid, imem_error, and dmem_error.]
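Because a case expression returns the first successful test, the order of the arms sets the priority among the status conditions. A small Python sketch shows this; status codes are represented as strings purely for illustration, and IHALT = 0 follows the halt encoding.

```python
IHALT = 0x0


def stat(icode, imem_error, dmem_error, instr_valid):
    """Instruction status; earlier tests take priority over later ones."""
    if imem_error or dmem_error:
        return "SADR"   # address error
    if not instr_valid:
        return "SINS"   # invalid instruction
    if icode == IHALT:
        return "SHLT"   # halt
    return "SAOK"       # normal operation


normal = stat(0x6, False, False, True)
halted = stat(IHALT, False, False, True)
# A memory error outranks an invalid instruction code.
both = stat(0xC, True, False, False)
```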

40 Memory Address

    int mem_addr = [
        icode in { IRMMOVL, IPUSHL, ICALL, IMRMOVL } : valE;
        icode in { IPOPL, IRET }                     : valA;
        # Other instructions don't need address
    ];

Memory-stage computation, by instruction:

Instruction          Memory
rmmovl rA, D(rB)     M4[valE] ← valA
pushl rA             M4[valE] ← valA
call Dest            M4[valE] ← valP
mrmovl D(rB), rA     valM ← M4[valE]
popl rA              valM ← M4[valA]
ret                  valM ← M4[valA]
OPl rA, rB           (no memory access)
rrmovl rA, rB        (no memory access)
irmovl V, rB         (no memory access)
jXX Dest             (no memory access)

41 Memory Read

    bool mem_read = icode in { IMRMOVL, IPOPL, IRET };

These are the instructions whose Memory stage reads a word: mrmovl (valM ← M4[valE]), popl and ret (valM ← M4[valA]). rmmovl, pushl, and call write memory instead; the remaining instructions do not access it.

42 PC Update Logic

New PC
- Select next value of PC

[Figure: New PC block taking Cnd, icode, valC, valP, and valM and producing the next PC.]

43 PC Update

    int new_pc = [
        icode == ICALL       : valC;
        icode == IJXX && Cnd : valC;
        icode == IRET        : valM;
        1                    : valP;
    ];

PC update, by instruction:

Instruction          PC update
OPl rA, rB           PC ← valP
rrmovl rA, rB        PC ← valP
irmovl V, rB         PC ← valP
rmmovl rA, D(rB)     PC ← valP
mrmovl D(rB), rA     PC ← valP
pushl rA             PC ← valP
popl rA              PC ← valP
jXX Dest             PC ← Cnd ? valC : valP
call Dest            PC ← valC
ret                  PC ← valM
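The new_pc selection is easy to exercise in Python. The sketch below mirrors the HCL arms in order; the icode constants are the first hex digit of each encoding and the example addresses are made up for illustration.

```python
IOPL, IJXX, ICALL, IRET = 0x6, 0x7, 0x8, 0x9


def new_pc(icode, cnd, valC, valP, valM):
    if icode == ICALL:
        return valC            # call: jump to destination
    if icode == IJXX and cnd:
        return valC            # taken jump
    if icode == IRET:
        return valM            # return address read from the stack
    return valP                # default: next sequential instruction


# Hypothetical values: destination 0x40, fall-through 0x18, return 0x2C.
call_pc = new_pc(ICALL, False, 0x40, 0x18, 0)
taken_pc = new_pc(IJXX, True, 0x40, 0x18, 0)
untaken_pc = new_pc(IJXX, False, 0x40, 0x18, 0)
ret_pc = new_pc(IRET, False, 0, 0x18, 0x2C)
```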

44 SEQ Operation

"State"
- Register file
- Memory
- Program counter register (PC)
- Condition code register (CC)
- All updated as clock rises

Combinational Logic
- ALU
- Control logic
- Memory reads
  - Instruction memory
  - Register file
  - Data memory

45 SEQ Operation #2

- Combinational logic starting to react to state changes

46 SEQ Operation #3

- Combinational logic generates results for addl instruction

47 SEQ Operation #4

- State set according to addl instruction
- Combinational logic starting to react to state changes

48 SEQ Operation #5

- Combinational logic generates results for je instruction

49 SEQ Summary

Implementation
- Express every instruction as a series of simple steps
- Follow the same general flow for each instruction type
- Assemble registers, memories, predesigned combinational blocks
- Connect with control logic

Limitations
- Too slow to be practical
- In one cycle, must propagate through instruction memory, register file, ALU, and data memory
- Would need to run clock very slowly
- Hardware units only active for fraction of clock cycle