1 Sequential CPU Implementation. 2 Outline Logic design Organizing Processing into Stages SEQ timing Suggested Reading 4.2,4.3.1 ~ 4.3.3.

Slides:



Advertisements
Similar presentations
University of Amsterdam Computer Systems – the processor architecture Arnoud Visser 1 Computer Systems The processor architecture.
Advertisements

Real-World Pipelines: Car Washes Idea  Divide process into independent stages  Move objects through stages in sequence  At any instant, multiple objects.
1 Seoul National University Logic Design. 2 Overview of Logic Design Seoul National University Fundamental Hardware Requirements  Computation  Storage.
David O’Hallaron Carnegie Mellon University Processor Architecture Logic Design Processor Architecture Logic Design
Randal E. Bryant Carnegie Mellon University CS:APP2e CS:APP Chapter 4 Computer Architecture SequentialImplementation CS:APP Chapter 4 Computer Architecture.
Chapter Five The Processor: Datapath and Control.
CS:APP CS:APP Chapter 4 Computer Architecture Control Logic and Hardware Control Language CS:APP Chapter 4 Computer Architecture Control Logic and Hardware.
Randal E. Bryant CS:APP Chapter 4 Computer Architecture SequentialImplementation CS:APP Chapter 4 Computer Architecture SequentialImplementation Slides.
Computer Organization Chapter 4
Datapath Design II Topics Control flow instructions Hardware for sequential machine (SEQ) Systems I.
Y86 Processor State Program Registers
Processor Architecture: The Y86 Instruction Set Architecture
1 Seoul National University Pipelined Implementation : Part I.
1 Seoul National University Logic Design. 2 Overview of Logic Design Seoul National University Fundamental Hardware Requirements  Computation  Storage.
Chapter 4: Processor Architecture How does the hardware execute the instructions? We’ll see by studying an example system  Based on simple instruction.
Instructor: Erol Sahin
Randal E. Bryant Carnegie Mellon University CS:APP CS:APP Chapter 4 Computer Architecture SequentialImplementation CS:APP Chapter 4 Computer Architecture.
Randal E. Bryant adapted by Jason Fritts CS:APP2e CS:APP Chapter 4 Computer Architecture SequentialImplementation CS:APP Chapter 4 Computer Architecture.
Datapath Design I Topics Sequential instruction execution cycle Instruction mapping to hardware Instruction decoding Systems I.
Based on slides by Patrice Belleville CPSC 121: Models of Computation Unit 10: A Working Computer.
Architecture (I) Processor Architecture. – 2 – Processor Goal Understand basic computer organization Instruction set architecture Deeply explore the CPU.
Computer Architecture Carnegie Mellon University
CSC 2405 Computer Systems II Advanced Topics. Instruction Set Architecture.
Computer Architecture I: Outline and Instruction Set Architecture
CS:APP3e CS:APP Chapter 4 Computer Architecture Logic Design CS:APP Chapter 4 Computer Architecture Logic Design CENG331 - Computer Organization Murat.
1 Processor Architecture. Coverage Our Approach –Work through designs for particular instruction set Y86---a simplified version of the Intel IA32 (a.k.a.
Chapter 4: Processor Architecture
1 SEQ CPU Implementation. 2 Outline SEQ Implementation Suggested Reading 4.3.1,
Sequential Hardware “God created the integers, all else is the work of man” Leopold Kronecker (He believed in the reduction of all mathematics to arguments.
Rabi Mahapatra CS:APP3e Slides are from Authors: Bryant and O Hallaron.
Real-World Pipelines Idea –Divide process into independent stages –Move objects through stages in sequence –At any given times, multiple objects being.
Sequential CPU Implementation Implementation. – 2 – Processor Suggested Reading - Chap 4.3.
1 Seoul National University Sequential Implementation.
CPSC 121: Models of Computation
Real-World Pipelines Idea Divide process into independent stages
CPSC 121: Models of Computation
Lecture 13 Y86-64: SEQ – sequential implementation
Lecture 14 Y86-64: PIPE – pipelined implementation
Module 10: A Working Computer
Lecture 12 Logic Design Review & HCL & Bomb Lab
Seoul National University
Course Outline Background Sequential Implementation Pipelining
Sequential Implementation
Ch. 2 Two’s Complement Boolean vs. Logical Floating Point
Samira Khan University of Virginia Feb 14, 2017
Computer Architecture adapted by Jason Fritts then by David Ferry
Y86 Processor State Program Registers
Pipelined Implementation : Part I
Seoul National University
Instruction Decoding Optional icode ifun valC Instruction Format
Pipelined Implementation : Part II
Systems I Pipelining III
Computer Architecture adapted by Jason Fritts
Systems I Pipelining II
Pipelined Implementation : Part I
Sequential CPU Implementation
Pipeline Architecture I Slides from: Bryant & O’ Hallaron
Pipelined Implementation : Part I
Introduction to Computer System
Recap: Performance Comparison
Computer Architecture
Systems I Pipelining II
Chapter 4 Processor Architecture
Systems I Pipelining II
Disassembly תרגול 7 ניתוח קוד.
Real-World Pipelines: Car Washes
Sequential CPU Implementation
Introduction to the Architecture of Computers
CS-447– Computer Architecture M,W 10-11:20am Lecture 5 Instruction Set Architecture Sep 12th, 2007 Majd F. Sakr
Sequential Design תרגול 10.
Presentation transcript:

1 Sequential CPU Implementation

2 Outline Logic design Organizing Processing into Stages SEQ timing Suggested Reading 4.2,4.3.1 ~ 4.3.3

OF ZF SF OF ZF SF OF ZF SF OF ZF SF Arithmetic Logic Unit Combinational logic –Continuously responding to inputs Control signal selects function computed –Corresponding to 4 arithmetic/logical operations in Y86 Also computes values for condition codes We will use it as a basic component for our CPU ALUALU Y X X + Y 0 ALUALU Y X X - Y 1 ALUALU Y X X & Y 2 ALUALU Y X X ^ Y 3 A B A B A B A B 3

Storage Registers –Hold single words or bits –Loaded as clock rises –Not only program registers IO Clock 4

Storing 1 Bit 5 Bistable Element Q+ Q– q !q q = 0 or 1 V in V1V1 V2V2

Storing 1 Bit (cont.) 6 Bistable Element Q+ Q– q !q q = 0 or 1 V in V1V1 V2V2 V2V2 V1V1 V in = V 2 Stable 0 Stable 1 Metastable

Physical Analogy Stable 0 Stable 1 Metastable. Stable left. Stable right. Metastable 7

Storing and Accessing 1 Bit Q+Q+ Q– R S R-S Latch Resetting Setting Storing 0 0 !q q q Bistable Element Q+ Q– q !q q = 0 or 1 8

1-Bit Latch Latching 1 d!d d dd 0 Storing d!d q !q q D Latch Q+Q+ Q– R S D C Data Clock

Transparent 1-Bit Latch –When in latching mode, combinational propogation from D to Q+ and Q– –Value latched depends on value of D as C falls Latching 1 d!d d dd C D Q+ Time Changing D 10

Edge-Triggered Latch –Only in latching mode for brief period Rising clock edge –Value latched depends on data as clock rises –Output remains stable at all other times Q+Q+ Q– R S D C Data Clock T Trigger C D Q+ Time T 11

Registers –Stores word of data Different from program registers seen in assembly code –Collection of edge-triggered latches –Loads input on rising edge of clock IO Clock D C Q+ D C D C D C D C D C D C D C i7i7 i6i6 i5i5 i4i4 i3i3 i2i2 i1i1 i0i0 o7o7 o6o6 o5o5 o4o4 o3o3 o2o2 o1o1 o0o0 Clock Structure 12

Register Operation Stores data bits For most of time acts as barrier between input and output As clock rises, loads input State = x Rising clock Output = xInput = y x State = y Output = y y 13

State Machine Example Comb. Logic ALUALU 0 Out MUX 0 1 Clock In Load 0 14

State Machine Example 15 Comb. Logic ALUALU 0 Out MUX 0 1 Clock In Load x0x0 1 x0x0 ? ? x0x0 Clock Load In Out

State Machine Example 16 Comb. Logic ALUALU 0 Out MUX 0 1 Clock In Load x0x0 0 x 0 +x 0 x0x0 X0X0 x0x0 Clock Load In Out x0x0

State Machine Example 17 Comb. Logic ALUALU 0 Out MUX 0 1 Clock In Load x1x1 0 x 0 +x 1 x0x0 X0X0 x0x0 Clock Load In Out x0x0 x1x1

State Machine Example 18 Comb. Logic ALUALU 0 Out MUX 0 1 Clock In Load 0 x0x0 Clock Load In Out x0x0 x1x1 x2x2 x3x3 x4x4 x5x5 x 0 +x 1 x 0 +x 1 +x 2 –Accumulator circuit –Load or accumulate on each cycle x3x3 x+x 4 x 3 +x 4 +x 5

Building Blocks Combinational Logic –Compute Boolean functions of inputs –Continuously respond to input changes –Operate on data and implement control Storage Elements –Store bits –Addressable memories –Non-addressable registers –Loaded only as clock rises Register file Register file A B W dstW srcA valA srcB valB valW Clock ALUALU fun A B MUX 0 1 = Clock 19

Summary Storage –Registers Hold single words Loaded as clock rises –Random-access memories Hold multiple words Possible multiple read or write ports Read word when address input changes Write word as clock rises 20

21 Instruction Decoding Instruction Format –Instruction byteicode:ifun –Optional register byterA:rB –Optional constant wordvalC 50 rArB D icode ifun rA rB valC Optional

Y86 Instruction Set #1 Byte pushl rA A0 rA F jXX Dest 7 fn Dest popl rA B0 rA F call Dest 80 Dest cmovXX rA, rB 2 fnrArB irmovl V, rB 30F rB V rmmovl rA, D ( rB ) 40 rArB D mrmovl D ( rB ), rA 50 rArB D OPl rA, rB 6 fnrArB ret 90 nop 10 halt 00 rrmovl 20 cmovle 21 cmovl 22 cmove 23 cmovne 24 cmovge 25 cmovg 26 22

Y86 Instruction Set #2 Byte pushl rA A0 rA F jXX Dest 7 fn Dest popl rA B0 rA F call Dest 80 Dest cmovXX rA, rB 2 fnrArB irmovl V, rB 30F rB V rmmovl rA, D ( rB ) 40 rArB D mrmovl D ( rB ), rA 50 rArB D OPl rA, rB 6 fnrArB ret 90 nop 10 halt 00 addl 60 subl 61 andl 62 xorl 63 23

Y86 Instruction Set #3 Byte pushl rA A0 rA F jXX Dest 7 fn Dest popl rA B0 rA F call Dest 80 Dest rrmovl rA, rB 2 fnrArB irmovl V, rB 30F rB V rmmovl rA, D ( rB ) 40 rArB D mrmovl D ( rB ), rA 50 rArB D OPl rA, rB 6 fnrArB ret 90 nop 10 halt 00 jmp 70 jle 71 jl 72 je 73 jne 74 jge 75 jg 76 24

25 Instruction Execution Stages Fetch –Read instruction from instruction memory Decode –Read program registers Execute –Compute value or address Memory –Read or write data Write Back –Write program registers PC –Update program counter

26 Executing Arith./Logical Operation Fetch –Read 2 bytes Decode –Read operand registers Execute –Perform operation –Set condition codes Memory –Do nothing Write back –Update register PC Update –Increment PC by 2 OPl rA, rB 6 fnrArB

Stage Computation: Arith/Log. Ops OPl rA, rB icode:ifun  M 1 [PC] rA:rB  M 1 [PC+1] valP  PC+2 Fetch Read instruction byte Read register byte Compute next PC valA  R[rA] valB  R[rB] Decode Read operand A Read operand B valE  valB OP valA Set CC Execute Perform ALU operation Set condition code register Memory R[rB]  valE Write back Write back result PC  valP PC update Update PC 27

28 SEQ Hardware Structure Instruction Flow –Read instruction at address specified by PC –Process through stages –Update program counter

29 Instruction Instruction memory PC increment CC ALU Data memory Fetch Decode Execute Memory Write back icode ifun rA rB Register M valP srcA, srcB dstB valA, valB aluA,aluB valE PC valE, newPC PC A B Register File E

30 Stage Computation Formulate instruction execution as sequence of simple steps Use same general form for all instructions

31 Executing rrmovl Fetch –Read 2 bytes Decode –Read operand register rA Execute –Do nothing Memory –Do nothing Write back –Update register PC Update –Increment PC by 2 rrmovl rA, rB 2 0rArB

32 Stage Computation: rrmovl rrmovl rA, rB icode:ifun  M 1 [PC] rA:rB  M 1 [PC+1] valP  PC+2 Fetch Read instruction byte Read register byte Compute next PC valA  R[rA] Decode Read operand A valE  0 + valA Execute Perform ALU operation Memory R[rB]  valE Write back Write back result PC  valP PC update Update PC

33 Executing irmovl Fetch –Read 6 bytes Decode –Do nothing Execute –Do nothing Memory –Do nothing Write back –Update register PC Update –Increment PC by 6 irmovl V, rB 30 FrB V

34 Stage Computation: irmovl irmovl rA, rB icode:ifun  M 1 [PC] rA:rB  M 1 [PC+1] valC  M 4 [PC+2] valP  PC+6 Fetch Read instruction byte Read register byte Read constant value Compute next PC Decode valE  0 + valC Execute Perform ALU operation Memory R[rB]  valE Write back Write back result PC  valP PC update Update PC

35 Executing rmmovl Fetch –Read 6 bytes Decode –Read operand registers Execute –Compute effective address Memory –Write to memory Write back –Do nothing PC Update –Increment PC by 6 rmmovl rA, D ( rB) 40 rArB D

Stage Computation: rmmovl –Use ALU for address computation rmmovl rA, D(rB) icode:ifun  M 1 [PC] rA:rB  M 1 [PC+1] valC  M 4 [PC+2] valP  PC+6 Fetch Read instruction byte Read register byte Read displacement D Compute next PC valA  R[rA] valB  R[rB] Decode Read operand A Read operand B valE  valB + valC Execute Compute effective address M 4 [valE]  valA Memory Write value to memory Write back PC  valP PC update Update PC 36

37 Executing mrmovl Fetch –Read 6 bytes Decode –Read operand registers rB Execute –Compute effective address Memory –Read from memory Write back –Update register rA PC Update –Increment PC by 6 mrmovl D ( rB),rA 50 rArB D

38 Stage Computation: mrmovl Use ALU for address computation mrmovl D(rB), rA icode:ifun  M 1 [PC] rA:rB  M 1 [PC+1] valC  M 4 [PC+2] valP  PC+6 Fetch Read instruction byte Read register byte Read displacement D Compute next PC valB  R[rB] Decode Read operand B valE  valB + valC Execute Compute effective address valM  M 4 [valE] Memory Read data from memory R[rA]  valM Write back Update register rA PC  valP PC update Update PC

39 Executing pushl Fetch –Read 2 bytes Decode –Read stack pointer and rA Execute –Decrement stack pointer by 4 Memory –Store valA at the address of new stack pointer Write back –Update stack pointer PC Update –Increment PC by 2 pushl rA a0 rA F

40 Stage Computation: pushl Use ALU to Decrement stack pointer pushl rA icode:ifun  M 1 [PC] rA:rB  M 1 [PC+1] valP  PC+2 Fetch Read instruction byte Read register byte Compute next PC valA  R[rA] valB  R [ %esp ] Decode Read valA Read stack pointer valE  valB + (-4) Execute Decrement stack pointer M 4 [valE]  valA Memory Store to stack R[ %esp ]  valE Write back Update stack pointer PC  valP PC update Update PC

41 Executing popl Fetch –Read 2 bytes Decode –Read stack pointer Execute –Increment stack pointer by 4 Memory –Read from old stack pointer Write back –Update stack pointer –Write result to register PC Update –Increment PC by 2 popl rA b0 rA F

Stage Computation: popl popl rA icode:ifun  M 1 [PC] rA:rB  M 1 [PC+1] valP  PC+2 Fetch Read instruction byte Read register byte Compute next PC valA  R[ %esp ] valB  R [ %esp ] Decode Read stack pointer valE  valB + 4 Execute Increment stack pointer valM  M 4 [valA] Memory Read from stack R[ %esp ]  valE R[rA]  valM Write back Update stack pointer Write back result PC  valP PC update Update PC 42

43 Stage Computation: popl Use ALU to increment stack pointer Must update two registers –Popped value –New stack pointer

44 Executing Jumps jXX Dest 7 fn Dest XX fall thru: XX target: Not taken Taken

45 Executing Jumps Fetch –Read 5 bytes –Increment PC by 5 Decode –Do nothing Execute –Determine whether to take branch based on jump condition and condition codes Memory –Do nothing Write back –Do nothing PC Update –Set PC to Dest if branch taken or to incremented PC if not branch

Stage Computation: Jumps jXX Dest icode:ifun  M 1 [PC] valC  M 4 [PC+1] valP  PC+5 Fetch Read instruction byte Read destination address Fall through address Decode Cnd  Cond(CC,ifun) Execute Take branch? Memory Write back PC  Cnd ? valC : valP PC update Update PC 46

47 Stage Computation: Jumps Compute both addresses Choose based on setting of condition codes and branch condition

48 Executing call call Dest 80 Dest XX return: XX target:

49 Executing call Fetch –Read 5 bytes –Increment PC by 5 Decode –Read stack pointer Execute –Decrement stack pointer by 4 Memory –Write incremented PC to new value of stack pointer Write back –Update stack pointer PC Update –Set PC to Dest

Stage Computation: call call Dest icode:ifun  M 1 [PC] valC  M 4 [PC+1] valP  PC+5 Fetch Read instruction byte Read destination address Compute return point valB  R[ %esp ] Decode Read stack pointer valE  valB + –4 Execute Decrement stack pointer M 4 [valE]  valP Memory Write return value on stack R[ %esp ]  valE Write back Update stack pointer PC  valC PC update Set PC to destination 50

51 Stage Computation: call Use ALU to decrement stack pointer Store incremented PC

52 Executing ret ret 90 XX return:

53 Executing ret Fetch –Read 1 byte Decode –Read stack pointer Execute –Increment stack pointer by 4 Memory –Read return address from old stack pointer Write back –Update stack pointer PC Update –Set PC to return address

Stage Computation: ret ret icode:ifun  M 1 [PC] Fetch Read instruction byte valA  R[ %esp ] valB  R[ %esp ] Decode Read operand stack pointer valE  valB + 4 Execute Increment stack pointer valM  M 4 [valA] Memory Read return address R[ %esp ]  valE Write back Update stack pointer PC  valM PC update Set PC to return address 54

55 Stage Computation: ret Use ALU to increment stack pointer Read return address from memory

Next SEQ Implementation Suggested Reading 4.3.1,