Sequential CPU Implementation. Suggested Reading: Chap 4.3.

Presentation transcript:

Sequential CPU Implementation

– 2 – Processor Suggested Reading - Chap 4.3

– 3 – Processor Y86 Instruction Set (P259)
Instruction encodings (instruction byte first, then optional register byte and 4-byte constant):
  nop                00
  halt               10
  rrmovl rA, rB      20 rA rB
  irmovl V, rB       30 8 rB V
  rmmovl rA, D(rB)   40 rA rB D
  mrmovl D(rB), rA   50 rA rB D
  OPl rA, rB         6 fn rA rB
  jXX Dest           7 fn Dest
  call Dest          80 Dest
  ret                90
  pushl rA           A0 rA 8
  popl rA            B0 rA 8
Function codes:
  OPl: addl 60, subl 61, andl 62, xorl 63
  jXX: jmp 70, jle 71, jl 72, je 73, jne 74, jge 75, jg 76

– 4 – Processor Building Blocks (P278, P279, P280)
Combinational Logic
  Compute Boolean functions of inputs
  Continuously respond to input changes
  Operate on data and implement control
Storage Elements
  Store bits
  Addressable memories
  Non-addressable registers
  Loaded only as clock rises
[Figure: register file with read ports A and B (srcA/valA, srcB/valB) and write port W (dstW/valW), driven by Clock; ALU with inputs A, B and function select fun; MUX with inputs 0 and 1; equality comparator (=); clocked register]

– 5 – Processor Hardware Control Language
Very simple hardware description language
  Can only express limited aspects of hardware operation
  Parts we want to explore and modify
Data Types
  bool: Boolean (a, b, c, …)
  int: words (A, B, C, …)
  Does not specify word size (bytes, 32-bit words, …)
Statements
  bool a = bool-expr ;
  int A = int-expr ;

– 6 – Processor HCL Operations
Classify by type of value returned
Boolean Expressions
  Logic Operations: a && b, a || b, !a
  Word Comparisons: A == B, A != B, A < B, A <= B, A >= B, A > B
  Set Membership: A in { B, C, D }
    Same as A == B || A == C || A == D
Word Expressions
  Case expressions: [ a : A; b : B; c : C ]
    Evaluate test expressions a, b, c, … in sequence
    Return word expression A, B, C, … for first successful test
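The two HCL constructs that the later control-logic slides rely on, set membership and case expressions, can be mimicked in software. The following is only a rough Python sketch: `member` and `case` are hypothetical helper names, and real HCL describes combinational hardware rather than sequential code.

```python
def member(a, candidates):
    """HCL set membership: A in { B, C, D } is A == B || A == C || A == D."""
    return any(a == c for c in candidates)


def case(*arms):
    """HCL case expression: return the value paired with the first true test."""
    for test, value in arms:
        if test:
            return value
    raise ValueError("no test succeeded")


# Hypothetical example values, chosen only for illustration.
A, B, C, D = 5, 1, 5, 9
in_set = member(A, [B, C, D])
picked = case((A < B, 10), (A == C, 20), (True, 30))
print(in_set, picked)
```

The final arm with test `True` plays the role of the `1 : ...` default arm used in the HCL definitions on later slides.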

– 7 – Processor Instruction Execution Stages (P281)
Fetch: Read instruction from instruction memory
Decode: Read program registers
Execute: Compute value or address
Memory: Read or write data
Write Back: Write program registers
PC: Update program counter

– 8 – Processor Instruction Decoding
Instruction Format
  Instruction byte: icode:ifun
  Optional register byte: rA:rB
  Optional constant word: valC
Example: the bytes 50 rA rB D split into icode = 5, ifun = 0, the register byte rA:rB, and the constant word valC = D
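The instruction format above can be illustrated with a small decoder. This is a sketch under stated assumptions, not the hardware's Align block: the function name `fetch` and the two sets are hypothetical, the sets are read off the encoding table (which instructions carry a register byte or a 4-byte constant), constants are taken to be little-endian, and 8 stands for "no register".

```python
import struct

# icodes that carry a register byte / a 4-byte constant, per the encoding table.
NEED_REGIDS = {0x2, 0x3, 0x4, 0x5, 0x6, 0xA, 0xB}
NEED_VALC = {0x3, 0x4, 0x5, 0x7, 0x8}
RNONE = 8  # register ID meaning "no register"


def fetch(mem: bytes, pc: int):
    """Split the instruction at pc into icode, ifun, rA, rB, valC and valP."""
    icode, ifun = mem[pc] >> 4, mem[pc] & 0xF
    pos = pc + 1
    rA = rB = RNONE
    if icode in NEED_REGIDS:
        rA, rB = mem[pos] >> 4, mem[pos] & 0xF
        pos += 1
    valC = None
    if icode in NEED_VALC:
        (valC,) = struct.unpack_from("<i", mem, pos)  # little-endian word
        pos += 4
    return icode, ifun, rA, rB, valC, pos  # pos is the incremented PC (valP)


# 50 rA rB D with rA = 0, rB = 5, D = 8
decoded = fetch(bytes([0x50, 0x05, 0x08, 0x00, 0x00, 0x00]), 0)
print(decoded)
```

The returned position doubles as valP, which matches the "Compute next PC" step in the stage tables: 2, 5, or 6 bytes past the PC depending on which optional fields are present.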

– 9 – Processor Executing Arith./Logical Operation (Figure 4.16, P283)
Encoding: OPl rA, rB = 6 fn rA rB
Fetch: Read 2 bytes
Decode: Read operand registers
Execute: Perform operation; set condition codes
Memory: Do nothing
Write back: Update register
PC Update: Increment PC by 2

– 10 – Processor Stage Computation: Arith/Log. Ops (P283, Figure 4.16)
Formulate instruction execution as sequence of simple steps
Use same general form for all instructions
OPl rA, rB
  Fetch        icode:ifun ← M1[PC]     Read instruction byte
               rA:rB ← M1[PC+1]        Read register byte
               valP ← PC+2             Compute next PC
  Decode       valA ← R[rA]            Read operand A
               valB ← R[rB]            Read operand B
  Execute      valE ← valB OP valA     Perform ALU operation
               Set CC                  Set condition code register
  Memory       (nothing)
  Write back   R[rB] ← valE            Write back result
  PC update    PC ← valP               Update PC
M1[PC] denotes reading one byte of data from memory starting at address PC.
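The stage table for OPl translates almost line for line into software. The sketch below is illustrative only: `step_opl`, `ALU_OPS`, and the list-based register file are assumptions made for the example, values are truncated to 32 bits, and only the ZF and SF condition codes are modelled (overflow is left out for brevity).

```python
MASK32 = 0xFFFFFFFF

# Function codes from the encoding table: addl 0, subl 1, andl 2, xorl 3.
ALU_OPS = {
    0: lambda b, a: b + a,
    1: lambda b, a: b - a,
    2: lambda b, a: b & a,
    3: lambda b, a: b ^ a,
}


def step_opl(mem, regs, pc):
    """Run one OPl rA, rB instruction through the six stages."""
    # Fetch
    icode, ifun = mem[pc] >> 4, mem[pc] & 0xF
    assert icode == 0x6, "this sketch only handles OPl"
    rA, rB = mem[pc + 1] >> 4, mem[pc + 1] & 0xF
    valP = pc + 2
    # Decode
    valA, valB = regs[rA], regs[rB]
    # Execute: valE <- valB OP valA, then set condition codes
    valE = ALU_OPS[ifun](valB, valA) & MASK32
    cc = {"ZF": valE == 0, "SF": (valE >> 31) == 1}
    # Memory: nothing to do for OPl
    # Write back
    regs[rB] = valE
    # PC update
    return valP, cc


regs = [0] * 8
regs[2], regs[3] = 3, 10
new_pc_value, cc = step_opl(bytes([0x61, 0x23]), regs, 0)  # subl with rA = 2, rB = 3
print(new_pc_value, regs[3], cc)
```

Note the operand order: the result is `valB OP valA`, so the subtraction computes R[rB] minus R[rA].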

– 11 – Processor Executing rrmovl
Encoding: rrmovl rA, rB = 20 rA rB
Fetch: Read 2 bytes
Decode: Read operand register rA
Execute: Do nothing
Memory: Do nothing
Write back: Update register
PC Update: Increment PC by 2

– 12 – Processor Stage Computation: rrmovl (P283, Figure 4.16)
Formulate instruction execution as sequence of simple steps
Use same general form for all instructions
rrmovl rA, rB
  Fetch        icode:ifun ← M1[PC]     Read instruction byte
               rA:rB ← M1[PC+1]        Read register byte
               valP ← PC+2             Compute next PC
  Decode       valA ← R[rA]            Read operand A
  Execute      valE ← 0 + valA         Perform ALU operation
  Memory       (nothing)
  Write back   R[rB] ← valE            Write back result
  PC update    PC ← valP               Update PC

– 13 – Processor Executing irmovl
Encoding: irmovl V, rB = 30 8 rB V
Fetch: Read 6 bytes
Decode: Do nothing
Execute: Do nothing
Memory: Do nothing
Write back: Update register
PC Update: Increment PC by 6

– 14 – Processor Stage Computation: irmovl (P283, Figure 4.16)
Formulate instruction execution as sequence of simple steps
Use same general form for all instructions
irmovl V, rB
  Fetch        icode:ifun ← M1[PC]     Read instruction byte
               rA:rB ← M1[PC+1]        Read register byte
               valC ← M4[PC+2]         Read constant value
               valP ← PC+6             Compute next PC
  Decode       (nothing)
  Execute      valE ← 0 + valC         Perform ALU operation
  Memory       (nothing)
  Write back   R[rB] ← valE            Write back result
  PC update    PC ← valP               Update PC

– 15 – Processor Executing rmmovl (Figure 4.17, P283)
Encoding: rmmovl rA, D(rB) = 40 rA rB D
Fetch: Read 6 bytes
Decode: Read operand registers
Execute: Compute effective address
Memory: Write to memory
Write back: Do nothing
PC Update: Increment PC by 6

– 16 – Processor Stage Computation: rmmovl (P283, Figure 4.17)
Use ALU for address computation
rmmovl rA, D(rB)
  Fetch        icode:ifun ← M1[PC]     Read instruction byte
               rA:rB ← M1[PC+1]        Read register byte
               valC ← M4[PC+2]         Read displacement D
               valP ← PC+6             Compute next PC
  Decode       valA ← R[rA]            Read operand A
               valB ← R[rB]            Read operand B
  Execute      valE ← valB + valC      Compute effective address (sum of the displacement and the base register value)
  Memory       M4[valE] ← valA         Write value to memory
  Write back   (nothing)
  PC update    PC ← valP               Update PC

– 17 – Processor Executing mrmovl
Encoding: mrmovl D(rB), rA = 50 rA rB D
Fetch: Read 6 bytes
Decode: Read operand register rB
Execute: Compute effective address
Memory: Read from memory
Write back: Update register rA
PC Update: Increment PC by 6

– 18 – Processor Stage Computation: mrmovl (P283, Figure 4.17)
Use ALU for address computation
mrmovl D(rB), rA
  Fetch        icode:ifun ← M1[PC]     Read instruction byte
               rA:rB ← M1[PC+1]        Read register byte
               valC ← M4[PC+2]         Read displacement D
               valP ← PC+6             Compute next PC
  Decode       valB ← R[rB]            Read operand B
  Execute      valE ← valB + valC      Compute effective address
  Memory       valM ← M4[valE]         Read data from memory
  Write back   R[rA] ← valM            Update register rA
  PC update    PC ← valP               Update PC

– 19 – Processor Executing pushl (Figure 4.18, P284)
Encoding: pushl rA = A0 rA 8
Fetch: Read 2 bytes
Decode: Read stack pointer and rA
Execute: Decrement stack pointer by 4
Memory: Store valA at the address of the new stack pointer
Write back: Update stack pointer
PC Update: Increment PC by 2

– 20 – Processor Stage Computation: pushl (P284, Figure 4.18)
Use ALU to decrement stack pointer
pushl rA
  Fetch        icode:ifun ← M1[PC]     Read instruction byte
               rA:rB ← M1[PC+1]        Read register byte
               valP ← PC+2             Compute next PC
  Decode       valA ← R[rA]            Read valA
               valB ← R[%esp]          Read stack pointer
  Execute      valE ← valB + (-4)      Decrement stack pointer
  Memory       M4[valE] ← valA         Store to stack
  Write back   R[%esp] ← valE          Update stack pointer
  PC update    PC ← valP               Update PC
Note: until the write back, the element that has been written actually lies outside the stack.

– 21 – Processor Executing popl
Encoding: popl rA = B0 rA 8
Fetch: Read 2 bytes
Decode: Read stack pointer
Execute: Increment stack pointer by 4
Memory: Read from old stack pointer
Write back: Update stack pointer; write result to register
PC Update: Increment PC by 2

– 22 – Processor Stage Computation: popl (P284, Figure 4.18)
Use ALU to increment stack pointer
Must update two registers
  Popped value
  New stack pointer
popl rA
  Fetch        icode:ifun ← M1[PC]     Read instruction byte
               rA:rB ← M1[PC+1]        Read register byte
               valP ← PC+2             Compute next PC
  Decode       valA ← R[%esp]          Read stack pointer
               valB ← R[%esp]          Read stack pointer
  Execute      valE ← valB + 4         Increment stack pointer
  Memory       valM ← M4[valA]         Read from stack
  Write back   R[%esp] ← valE          Update stack pointer
               R[rA] ← valM            Write back result
  PC update    PC ← valP               Update PC
The write-back stage writes two registers, and the two writes are ordered. They must be performed in the order shown above, because rA may itself be %esp. See Practice Problem 4.5 on P270 of the book.
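Why the order of the two write-back steps matters can be seen in a small simulation. This is a hedged sketch: `popl`, the name-keyed register dictionary, and the address-keyed memory dictionary are all assumptions made for illustration.

```python
def popl(regs, mem, rA):
    """Follow the popl stage table; regs maps register names to values."""
    # Decode: both operands are the stack pointer
    valA = regs["%esp"]
    valB = regs["%esp"]
    # Execute: increment stack pointer
    valE = valB + 4
    # Memory: read from the OLD stack pointer
    valM = mem[valA]
    # Write back, in the order given on the slide
    regs["%esp"] = valE
    regs[rA] = valM  # when rA is %esp, this later write is the one that remains


# Ordinary case: popl %eax
regs_a = {"%esp": 100, "%eax": 0}
popl(regs_a, {100: 42}, "%eax")

# Special case: popl %esp
regs_b = {"%esp": 100}
popl(regs_b, {100: 42}, "%esp")
print(regs_a, regs_b)
```

In the special case the register ends up holding the popped value rather than the incremented stack pointer, which is the behaviour the ordering is meant to guarantee.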

– 23 – Processor Executing Jumps (Figure 4.19, P284)
Encoding: jXX Dest = 7 fn Dest
Fetch: Read 5 bytes; increment PC by 5
Decode: Do nothing
Execute: Determine whether to take branch based on jump condition and condition codes
Memory: Do nothing
Write back: Do nothing
PC Update: Set PC to Dest if branch taken, or to incremented PC if not
[Figure: the instruction after the jump is the fall-through (not taken) path; the instruction at "target:" is the taken path]

– 24 – Processor Stage Computation: Jumps (P284, Figure 4.19)
Compute both addresses
Choose based on setting of condition codes and branch condition
jXX Dest
  Fetch        icode:ifun ← M1[PC]     Read instruction byte
               valC ← M4[PC+1]         Read destination address
               valP ← PC+5             Fall through address
  Decode       (nothing)
  Execute      Bch ← Cond(CC, ifun)    Take branch?
  Memory       (nothing)
  Write back   (nothing)
  PC update    PC ← Bch ? valC : valP  Update PC
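The slides treat Cond(CC, ifun) as a black box. As a rough illustration, the sketch below assumes the three condition code bits are the zero, sign, and overflow flags (ZF, SF, OF) and uses the jump function codes from the instruction set table; the function names are hypothetical.

```python
def cond(zf: bool, sf: bool, of: bool, ifun: int) -> bool:
    """Branch flag Bch for jXX; ifun: jmp 0, jle 1, jl 2, je 3, jne 4, jge 5, jg 6."""
    less = sf != of  # signed "less than" after a subtraction
    table = {
        0: True,                   # jmp: always taken
        1: less or zf,             # jle
        2: less,                   # jl
        3: zf,                     # je
        4: not zf,                 # jne
        5: not less,               # jge
        6: not less and not zf,    # jg
    }
    return table[ifun]


def jump_pc(zf, sf, of, ifun, valC, valP):
    """PC update for jXX: PC <- Bch ? valC : valP."""
    return valC if cond(zf, sf, of, ifun) else valP


# je with ZF set jumps to valC; je with ZF clear falls through to valP.
taken = jump_pc(True, False, False, 3, 0x40, 0x15)
not_taken = jump_pc(False, False, False, 3, 0x40, 0x15)
print(hex(taken), hex(not_taken))
```

Both candidate addresses are available before the choice is made, mirroring the "compute both addresses" remark on the slide.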

– 25 – Processor Executing call
Encoding: call Dest = 80 Dest
Fetch: Read 5 bytes; increment PC by 5
Decode: Read stack pointer
Execute: Decrement stack pointer by 4
Memory: Write incremented PC to new value of stack pointer
Write back: Update stack pointer
PC Update: Set PC to Dest
[Figure: the instruction after the call is at "return:"; the called code starts at "target:"]

– 26 – Processor Stage Computation: call (P284, Figure 4.19)
Use ALU to decrement stack pointer
Store incremented PC
call Dest
  Fetch        icode:ifun ← M1[PC]     Read instruction byte
               valC ← M4[PC+1]         Read destination address
               valP ← PC+5             Compute return point
  Decode       valB ← R[%esp]          Read stack pointer
  Execute      valE ← valB + (-4)      Decrement stack pointer
  Memory       M4[valE] ← valP         Write return value on stack
  Write back   R[%esp] ← valE          Update stack pointer
  PC update    PC ← valC               Set PC to destination

– 27 – Processor Executing ret
Encoding: ret = 90
Fetch: Read 1 byte
Decode: Read stack pointer
Execute: Increment stack pointer by 4
Memory: Read return address from old stack pointer
Write back: Update stack pointer
PC Update: Set PC to return address
[Figure: execution resumes at "return:", the instruction following the original call]

– 28 – Processor Stage Computation: ret (P284, Figure 4.19)
Use ALU to increment stack pointer
Read return address from memory
ret
  Fetch        icode:ifun ← M1[PC]     Read instruction byte
  Decode       valA ← R[%esp]          Read operand stack pointer
               valB ← R[%esp]          Read operand stack pointer
  Execute      valE ← valB + 4         Increment stack pointer
  Memory       valM ← M4[valA]         Read return address
  Write back   R[%esp] ← valE          Update stack pointer
  PC update    PC ← valM               Set PC to return address

– 29 – Processor Computation Steps (Figure 4.16, P285)
All instructions follow same general pattern
Differ in what gets computed on each step
  Stage        Values        OPl rA, rB              General step
  Fetch        icode, ifun   icode:ifun ← M1[PC]     Read instruction byte
               rA, rB        rA:rB ← M1[PC+1]        Read register byte
               valC                                  [Read constant word]
               valP          valP ← PC+2             Compute next PC
  Decode       valA, srcA    valA ← R[rA]            Read operand A
               valB, srcB    valB ← R[rB]            Read operand B
  Execute      valE          valE ← valB OP valA     Perform ALU operation
               Cond code     Set CC                  Set condition code register
  Memory       valM                                  [Memory read/write]
  Write back   dstE          R[rB] ← valE            Write back ALU result
               dstM                                  [Write back memory result]
  PC update    PC            PC ← valP               Update PC
Steps in brackets are not performed by this instruction.

– 30 – Processor Computation Steps (Figure 4.19, P284)
All instructions follow same general pattern
Differ in what gets computed on each step
  Stage        Values        call Dest               General step
  Fetch        icode, ifun   icode:ifun ← M1[PC]     Read instruction byte
               rA, rB                                [Read register byte]
               valC          valC ← M4[PC+1]         Read constant word
               valP          valP ← PC+5             Compute next PC
  Decode       valA, srcA                            [Read operand A]
               valB, srcB    valB ← R[%esp]          Read operand B
  Execute      valE          valE ← valB + (-4)      Perform ALU operation
               Cond code                             [Set condition code reg.]
  Memory       valM          M4[valE] ← valP         Memory read/write
  Write back   dstE          R[%esp] ← valE          Write back ALU result
               dstM                                  [Write back memory result]
  PC update    PC            PC ← valC               Update PC
Steps in brackets are not performed by this instruction.

– 31 – Processor Computed Values
Fetch
  icode   Instruction code
  ifun    Instruction function
  rA      Instr. Register A
  rB      Instr. Register B
  valC    Instruction constant
  valP    Incremented PC
Decode & Write back
  valA    Register value A
  valB    Register value B
Execute
  valE    ALU result
  Bch     Branch flag
Memory
  valM    Value from memory

– 32 – Processor SEQ Operation (4.3.3)
State
  Program counter register (PC)
  Condition code register (CC)
  Register file
  Memories
    Access same memory space
    Data: for reading/writing program data
    Instruction: for reading instructions
  All updated as clock rises
Combinational Logic
  ALU
  Control logic
  Memory reads
    Instruction memory
    Register file
    Data memory

– 33 – Processor SEQ Operation #2 (point 1) (Figure 4.23, P297)
State set according to second irmovl instruction
Combinational logic starting to react to state changes

– 34 – Processor SEQ Operation #3 (point 2)
State set according to second irmovl instruction
Combinational logic generates results for addl instruction

– 35 – Processor SEQ Operation #4 (point 3)
State set according to addl instruction
Combinational logic starting to react to state changes

– 36 – Processor SEQ Operation #5 (point 4)
State set according to addl instruction
Combinational logic generates results for je instruction

– 37 – Processor SEQ Semantics
Achieves the same effect as sequential execution of the assignments shown in the tables of Figures 4.16 to 4.19, even though all of the state updates occur simultaneously, as the clock rises to start the next cycle.
A problem: popl %esp needs to write two registers in sequence, so the register file control logic must handle this case.
Principle: the processor never needs to read back the state updated by an instruction in order to complete the processing of that instruction (P295)
  If an instruction did need this, the update would have to happen within the instruction cycle
  E.g. pushl semantics
  E.g. no instruction needs to both set and then read the condition codes
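The principle can be illustrated with a two-phase model of one clock cycle: every value is computed from the state as it stood at the start of the cycle, and all updates are committed together at the end. This is only a sketch under assumptions (the function name, the name-keyed register dictionary, and the address-keyed memory are hypothetical), shown for pushl %esp.

```python
def clock_cycle_pushl(regs, mem, rA):
    """One SEQ cycle for pushl rA, split into compute and commit phases."""
    # Combinational phase: read only the state from the start of the cycle.
    valA = regs[rA]
    valB = regs["%esp"]
    valE = valB - 4
    pending_regs = {"%esp": valE}
    pending_mem = {valE: valA}
    # Rising clock edge: commit every update together.
    regs.update(pending_regs)
    mem.update(pending_mem)


regs = {"%esp": 100, "%eax": 7}
mem = {}
clock_cycle_pushl(regs, mem, "%esp")
print(regs["%esp"], mem[96])
```

Because valA was read before any state changed, pushl %esp stores the old stack pointer; the instruction never has to read back a value it has just written.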

– 38 – Processor SEQ CPU Implementation

– 39 – Processor What will we discuss today?
The implementation of a sequential CPU: SEQ
  Every instruction finishes in one cycle
  Instructions execute sequentially
  No two instructions execute in parallel or overlap
A revised version of SEQ: SEQ+
  Modify the PC Update stage of SEQ to show the difference between ISA and implementation

– 40 – Processor Some Macros (Figure 4.24, P299)
  Name      Value   Meaning
  INOP      0       Code for nop instruction
  IHALT     1       Code for halt instruction
  IRRMOVL   2       Code for rrmovl instruction
  IIRMOVL   3       Code for irmovl instruction
  IRMMOVL   4       Code for rmmovl instruction
  IMRMOVL   5       Code for mrmovl instruction
  IOPL      6       Code for integer op instructions
  IJXX      7       Code for jump instructions
  …         …       …
  IPOPL     B       Code for popl instruction
  RESP      6       Register ID for %esp
  RNONE     8       Indicates no register file access
  ALUADD    0       Function for addition operation

– 41 – Processor SEQ Hardware Structure (Figure 4.20, P292)
Stages
  Fetch: Read instruction from memory
  Decode: Read program registers
  Execute: Compute value or address
  Memory: Read or write data
  Write Back: Write program registers
  PC: Update program counter
Instruction Flow
  Read instruction at address specified by PC
  Process through stages
  Update program counter
[Figure 4.20: SEQ datapath. Fetch: PC, instruction memory, PC increment (icode, ifun, rA, rB, valC, valP). Decode and write back: register file with read ports A, B and write ports E, M (srcA, srcB, dstE, dstM, valA, valB). Execute: ALU and CC (aluA, aluB, valE, Bch). Memory: data memory (Addr, Data, valM). PC update: newPC.]

– 42 – Processor Difference between semantics and implementation
ISA: Every stage may update some state; these updates occur sequentially
SEQ: All state update operations occur simultaneously at the rising clock edge (except memory and CC)

– 43 – Processor SEQ Hardware (Figure 4.21, P293)
Key
  Blue boxes: predesigned hardware blocks (e.g., memories, ALU)
  Gray boxes: control logic, described in HCL
  White ovals: labels for signals
  Thick lines: 32-bit word values
  Thin lines: 4-8 bit values
  Dotted lines: 1-bit values

– 44 – Processor Fetch Logic (Figure 4.25, P299)
Predefined Blocks
  PC: Register containing PC
  Instruction memory: Read 6 bytes (PC to PC+5)
  Split: Divide instruction byte into icode and ifun
  Align: Get fields for rA, rB, and valC

– 45 – Processor Fetch Logic
Control Logic
  Instr. Valid: Is this instruction valid?
  Need regids: Does this instruction have a register byte?
  Need valC: Does this instruction have a constant word?

– 46 – Processor Fetch Control Logic (P300)
  bool need_regids =
      icode in { IRRMOVL, IOPL, IPUSHL, IPOPL,
                 IIRMOVL, IRMMOVL, IMRMOVL };
  bool instr_valid =
      icode in { INOP, IHALT, IRRMOVL, IIRMOVL, IRMMOVL, IMRMOVL,
                 IOPL, IJXX, ICALL, IRET, IPUSHL, IPOPL };

– 47 – Processor Decode & Write-Back Logic (Figure 4.26, P300)
Register File
  Read ports A, B
  Write ports E, M
  Addresses are register IDs or 8 (no access)
Control Logic
  srcA, srcB: read port addresses
  dstE, dstM: write port addresses

– 48 – Processor A Source (P301)
  OPl rA, rB         Decode   valA ← R[rA]      Read operand A
  rmmovl rA, D(rB)   Decode   valA ← R[rA]      Read operand A
  popl rA            Decode   valA ← R[%esp]    Read stack pointer
  jXX Dest           Decode                     No operand
  call Dest          Decode                     No operand
  ret                Decode   valA ← R[%esp]    Read stack pointer
  int srcA = [
      icode in { IRRMOVL, IRMMOVL, IOPL, IPUSHL } : rA;
      icode in { IPOPL, IRET } : RESP;
      1 : RNONE; # Don't need register
  ];

– 49 – Processor E Destination (P301)
  OPl rA, rB         Write-back   R[rB] ← valE      Write back result
  rmmovl rA, D(rB)   Write-back                     None
  popl rA            Write-back   R[%esp] ← valE    Update stack pointer
  jXX Dest           Write-back                     None
  call Dest          Write-back   R[%esp] ← valE    Update stack pointer
  ret                Write-back   R[%esp] ← valE    Update stack pointer
  int dstE = [
      icode in { IRRMOVL, IIRMOVL, IOPL } : rB;
      icode in { IPUSHL, IPOPL, ICALL, IRET } : RESP;
      1 : RNONE; # Don't need register
  ];

– 50 – Processor Execute Logic (Figure 4.27, P302)
Units
  ALU
    Implements 4 required functions
    Generates condition code values
  CC
    Register with 3 condition code bits
  bcond
    Computes branch flag
Control Logic
  Set CC: Should condition code register be loaded?
  ALU A: Input A to ALU
  ALU B: Input B to ALU
  ALU fun: What function should ALU compute?

– 51 – Processor ALU A Input (P302)
  OPl rA, rB         Execute   valE ← valB OP valA    Perform ALU operation
  rmmovl rA, D(rB)   Execute   valE ← valB + valC     Compute effective address
  popl rA            Execute   valE ← valB + 4        Increment stack pointer
  jXX Dest           Execute                          No operation
  call Dest          Execute   valE ← valB + (-4)     Decrement stack pointer
  ret                Execute   valE ← valB + 4        Increment stack pointer
  int aluA = [
      icode in { IRRMOVL, IOPL } : valA;
      icode in { IIRMOVL, IRMMOVL, IMRMOVL } : valC;
      icode in { ICALL, IPUSHL } : -4;
      icode in { IRET, IPOPL } : 4;
      # Other instructions don't need ALU
  ];

– 52 – Processor ALU Operation (P303)
  OPl rA, rB         Execute   valE ← valB OP valA    Perform ALU operation
  rmmovl rA, D(rB)   Execute   valE ← valB + valC     Compute effective address
  popl rA            Execute   valE ← valB + 4        Increment stack pointer
  jXX Dest           Execute                          No operation
  call Dest          Execute   valE ← valB + (-4)     Decrement stack pointer
  ret                Execute   valE ← valB + 4        Increment stack pointer
  int alufun = [
      icode == IOPL : ifun;
      1 : ALUADD;
  ];

– 53 – Processor Condition Set (P303)
  bool set_cc = icode in { IOPL };
We will not discuss the details of bcond, though it is also a control unit

– 54 – Processor Memory Logic (Figure 4.28, P303)
Memory
  Reads or writes memory word
Control Logic
  Mem. read: should word be read?
  Mem. write: should word be written?
  Mem. addr.: Select address
  Mem. data.: Select data

– 55 – Processor Memory Address (P304)
  OPl rA, rB         Memory                       No operation
  rmmovl rA, D(rB)   Memory   M4[valE] ← valA     Write value to memory
  popl rA            Memory   valM ← M4[valA]     Read from stack
  jXX Dest           Memory                       No operation
  call Dest          Memory   M4[valE] ← valP     Write return value on stack
  ret                Memory   valM ← M4[valA]     Read return address
  int mem_addr = [
      icode in { IRMMOVL, IPUSHL, ICALL, IMRMOVL } : valE;
      icode in { IPOPL, IRET } : valA;
      # Other instructions don't need address
  ];

– 56 – Processor Memory Read (P304)
  OPl rA, rB         Memory                       No operation
  rmmovl rA, D(rB)   Memory   M4[valE] ← valA     Write value to memory
  popl rA            Memory   valM ← M4[valA]     Read from stack
  jXX Dest           Memory                       No operation
  call Dest          Memory   M4[valE] ← valP     Write return value on stack
  ret                Memory   valM ← M4[valA]     Read return address
  bool mem_read = icode in { IMRMOVL, IPOPL, IRET };
  bool mem_write = icode in { IRMMOVL, IPUSHL, ICALL };
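The memory-stage control signals (address selection, read, write) can be combined into one small software model. The sketch below is illustrative only: `memory_stage` and the address-keyed dictionary are assumptions, the icode values come from the instruction set table, and the choice of data to write (valP for call, valA otherwise) follows the stage tables, since the slides give no HCL for the data signal.

```python
IRMMOVL, IMRMOVL, ICALL, IRET, IPUSHL, IPOPL = 0x4, 0x5, 0x8, 0x9, 0xA, 0xB


def memory_stage(icode, valE, valA, valP, mem):
    """Return valM for reads, None otherwise; mem maps addresses to words."""
    # mem_addr: which value addresses memory
    if icode in (IRMMOVL, IPUSHL, ICALL, IMRMOVL):
        addr = valE
    elif icode in (IPOPL, IRET):
        addr = valA
    else:
        return None  # other instructions do not touch memory
    # mem_read
    if icode in (IMRMOVL, IPOPL, IRET):
        return mem[addr]
    # mem_write: call stores the return point, the others store valA
    mem[addr] = valP if icode == ICALL else valA
    return None


mem = {}
memory_stage(ICALL, valE=96, valA=0, valP=0x25, mem=mem)    # push return point
ret_addr = memory_stage(IRET, valE=100, valA=96, valP=0, mem=mem)
print(mem, hex(ret_addr))
```

Reads for popl and ret use valA (the old stack pointer) while writes for pushl and call use valE (the new one), which is the asymmetry the mem_addr definition encodes.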

– 57 – Processor PC Update Logic (Figure 4.29, P304)
New PC
  Select next value of PC

– 58 – Processor PC Update (P304)
  OPl rA, rB         PC update   PC ← valP                 Update PC
  rmmovl rA, D(rB)   PC update   PC ← valP                 Update PC
  popl rA            PC update   PC ← valP                 Update PC
  jXX Dest           PC update   PC ← Bch ? valC : valP    Update PC
  call Dest          PC update   PC ← valC                 Set PC to destination
  ret                PC update   PC ← valM                 Set PC to return address
  int new_pc = [
      icode == ICALL : valC;
      icode == IJXX && Bch : valC;
      icode == IRET : valM;
      1 : valP;
  ];
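The new_pc case expression reads naturally as a chain of tests evaluated in order. A minimal Python sketch, assuming the icode values from the instruction set table and a hypothetical function name:

```python
IJXX, ICALL, IRET = 0x7, 0x8, 0x9


def new_pc(icode, Bch, valC, valM, valP):
    """Select the next PC; the first matching test wins, as in an HCL case expression."""
    if icode == ICALL:
        return valC            # call: jump to destination
    if icode == IJXX and Bch:
        return valC            # taken branch: jump to destination
    if icode == IRET:
        return valM            # ret: return address read from the stack
    return valP                # default: next sequential instruction


# Hypothetical values for illustration.
print(new_pc(IJXX, True, 0x40, 0, 0x15), new_pc(IJXX, False, 0x40, 0, 0x15))
```

Only three instruction types ever override the default, which is why the last arm of the HCL definition is the catch-all `1 : valP`.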

– 59 – Processor SEQ Hardware (Review) (Figure 4.21, P293)
Stages occur in sequence
One operation in process at a time

– 60 – Processor SEQ+ Hardware
Still sequential implementation
Reorder PC stage to put at beginning
PC Stage
  Task is to select PC for current instruction
  Based on results computed by previous instruction
Processor State
  PC is no longer stored in register
  But, can determine PC based on other stored information

– 61 – Processor PC Computation (P308)
  int pc = [
      pIcode == ICALL : pValC;
      pIcode == IJXX && pBch : pValC;
      pIcode == IRET : pValM;
      1 : pValP;
  ];

– 62 – Processor SEQ Summary
Implementation
  Express every instruction as series of simple steps
  Follow same general flow for each instruction type
  Assemble registers, memories, predesigned combinational blocks
  Connect with control logic
Limitations
  Too slow to be practical
  In one cycle, must propagate through instruction memory, register file, ALU, and data memory
  Would need to run clock very slowly
  Hardware units only active for fraction of clock cycle