Building Blocks for a CPU

UCSD CSE, Larry Carter, Winter 2002 (2/1/02)

Designing a Processor The Five Classic Components of a Computer
Processor = Datapath + Control
[Figure: block diagram showing the Processor (Control and Datapath), Memory, Input, and Output.]
Before we go any further, let's step back for a second and take a look at the big picture. All computers consist of five components: (1) input and (2) output devices, (3) the memory system, and the (4) control and (5) datapath of the processor. Today's lecture covers the datapath design. In Friday's lecture, we will show you how to design the processor's control unit.

Middle third of course We’ll implement core MIPS processor three ways:
- Single-cycle implementation
- Multi-cycle implementation (reduces hardware)
- Pipelined implementation (improves throughput)
But first, we'll review the building blocks.

Two types of logic components
Combinational logic
- Acyclic: there are no loops in the circuit
- Output depends only on the current input values (after enough time has elapsed for the circuit to stabilize)
- Example: inputs a, b, c; output a|~(b|c)
State elements
- The output can depend on previous history
- Example: inputs a, b; output x. If a=1, then x=0; if a=0 & b=1, then x=1; if a=b=0, x = stored value
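A minimal Python sketch of the two examples on this slide may help make the distinction concrete. It assumes the first circuit computes a|~(b|c) on single bits and that the second follows the slide's three rules exactly; the names `comb` and `StateElement`, and the initial stored value of 0, are made up for illustration.

```python
def comb(a: int, b: int, c: int) -> int:
    """Combinational: the output depends only on the current inputs."""
    return a | (1 - (b | c))  # a | ~(b | c) on single bits


class StateElement:
    """State element: the output can depend on previous history."""

    def __init__(self, x: int = 0) -> None:
        self.x = x  # stored value (initial value 0 is an assumption)

    def step(self, a: int, b: int) -> int:
        if a == 1:
            self.x = 0      # a=1 forces x to 0
        elif b == 1:
            self.x = 1      # a=0 and b=1 forces x to 1
        # a=b=0: x keeps the stored value
        return self.x
```

Calling `comb` twice with the same inputs always gives the same answer, while `step(0, 0)` returns whatever the earlier calls left behind.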

Some combinational logic blocks
- Simple gates: and, or, not, nor, nand, xor
- Multiplexor: control (c) chooses which input to pass through to the output. The output is a if c=0 and b if c=1. Lines may be multi-bit busses.
- Decoder: k-bit input selects which of 2^k outputs is set to "1". With a 2-bit input a, the four outputs are: 1 if a=00, 1 if a=01, 1 if a=10, 1 if a=11.
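The multiplexor and decoder behaviour described above can be sketched in a few lines of Python. The function names `mux` and `decoder` are illustrative only; values are plain integers standing in for (possibly multi-bit) busses.

```python
def mux(a: int, b: int, c: int) -> int:
    """2-to-1 multiplexor: passes a when c == 0, b when c == 1."""
    return b if c else a


def decoder(a: int, k: int) -> list:
    """k-bit input a selects which one of the 2**k outputs is 1."""
    if not 0 <= a < 2 ** k:
        raise ValueError("input does not fit in k bits")
    return [1 if i == a else 0 for i in range(2 ** k)]
```

For example, `decoder(0b10, 2)` sets only the third of its four outputs.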

More combinational logic blocks
- Adder: inputs a and b, output a+b. Here, lines represent multi-bit busses.
- Arithmetic Logic Unit: inputs a and b, output a op b. Control (c) chooses which operation will be used; op can be +, -, shift, xor, etc.
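As a rough illustration of an ALU whose control input picks the operation, here is a small Python sketch. The 32-bit width, the operation names ("add", "sub", "sll", "xor"), and the choice of a left shift are all assumptions for the example, not something the slide specifies.

```python
MASK32 = 0xFFFFFFFF  # assume 32-bit busses


def alu(a: int, b: int, op: str) -> int:
    """Return a op b, truncated to 32 bits; op plays the role of control c."""
    a &= MASK32
    b &= MASK32
    if op == "add":
        result = a + b
    elif op == "sub":
        result = a - b
    elif op == "sll":
        result = a << (b & 31)  # shift amount limited to 0..31 (assumption)
    elif op == "xor":
        result = a ^ b
    else:
        raise ValueError(f"unknown operation: {op}")
    return result & MASK32  # wrap around like fixed-width hardware
```

The final mask is what makes `alu(0, 1, "sub")` come out as 0xFFFFFFFF instead of a negative Python integer.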

3 ways to make combinational circuit
Given a truth table of some function from N input bits to M output bits, you can implement it using:
- Random logic: build the function up from simple gates. May have long paths from input to output.
- PLA (Programmable Logic Array): implements the function as a sum of products. A PLA is 2 logic levels deep (3 if you count inverters).
- ROM (Read-Only Memory): use a memory holding 2^N M-bit values. Each memory cell holds the output for one input combination.
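The ROM option is the easiest of the three to sketch in software: tabulate the function once, then look results up by input. The helper names `build_rom` and `rom_read`, and the choice of a 3-input bit-count function as the example, are assumptions for illustration.

```python
def build_rom(f, n: int) -> list:
    """Tabulate f over all 2**n input combinations, one cell per input."""
    return [f(x) for x in range(2 ** n)]


def rom_read(rom: list, x: int) -> int:
    """Reading the ROM is just an indexed lookup; no logic is evaluated."""
    return rom[x]


# Example function: N = 3 input bits, M = 2 output bits (number of 1 bits).
popcount_rom = build_rom(lambda x: bin(x).count("1"), 3)
```

Note that the table has 2^N entries regardless of how simple the function is, which is the size cost of this approach.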

PLA's
Each vertical wire is an "and" of selected inputs (or their negations). Each output is an "or" of selected vertical wires.
[Figure: PLA with inputs in1, in2, in3 and outputs out1 to out4. Example product terms: in1 & in3 and ~in2 & ~in3. Example output: (in1&in3) | (~in2&~in3).]
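The and-plane and or-plane structure can be mimicked in Python as follows. This is only a sketch: the product terms are the two shown in the figure, but the output names (`and13`, `either`, `nor23`) are hypothetical, because the transcript does not say which of out1 to out4 carries which expression.

```python
def product_term(term: dict, inputs: dict) -> int:
    """AND plane: 1 only if every selected input has the required value."""
    return int(all(inputs[name] == want for name, want in term.items()))


def pla(terms: list, outputs: dict, inputs: dict) -> dict:
    """Evaluate every vertical wire, then OR the selected wires per output."""
    wires = [product_term(t, inputs) for t in terms]
    return {out: int(any(wires[i] for i in selected))
            for out, selected in outputs.items()}


# Vertical wires: in1 & in3, and ~in2 & ~in3 (0 means the negated input).
TERMS = [{"in1": 1, "in3": 1}, {"in2": 0, "in3": 0}]
# Hypothetical outputs: each lists the vertical wires it ORs together.
OUTPUTS = {"and13": [0], "either": [0, 1], "nor23": [1]}
```

For instance, `pla(TERMS, OUTPUTS, {"in1": 0, "in2": 0, "in3": 0})` activates only the second wire, so `either` and `nor23` are 1 and `and13` is 0.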

Example: 3-bit adder
[Table: the eight input combinations 000, 001, 010, 011, 100, 101, 110, 111, with Output and Carry columns.] This space intentionally left black.
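Since the table values did not survive in the transcript, the sketch below regenerates a truth table under one assumption: that "3-bit adder" means adding three 1-bit inputs, with Output as the sum bit and Carry as the carry bit. The function name `full_adder_table` is made up.

```python
def full_adder_table() -> list:
    """Rows of (input bits, sum bit, carry bit) for three 1-bit inputs."""
    rows = []
    for x in range(8):
        a, b, c = (x >> 2) & 1, (x >> 1) & 1, x & 1
        total = a + b + c                      # 0, 1, 2, or 3
        rows.append((f"{x:03b}", total & 1, total >> 1))
    return rows


for bits, sum_bit, carry in full_adder_table():
    print(bits, sum_bit, carry)
```

Each row with a 1 in an output column would correspond to one candidate product term if the function were placed in a PLA without minimization.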

Which is best depends on concerns
Speed:
- Random logic might be slow (a signal can go through many levels)
- PLA can be the fastest (only 3 gates deep)
Size:
- ROM is usually the largest (it always needs 2^N cells)
- PLA is often similar to random logic, but not always. Consider parity (mod-2 sum) of N inputs:
  - Random logic needs N-1 XOR gates (or 3N-3 NAND's)
  - PLA needs 2^(N-1) product terms (one for each "1" output)
Ease of implementation:
- ROM (esp. PROM = programmable ROM) is easy to change
- PLA's are convenient too
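The parity comparison can be checked by brute force. The sketch below counts the input combinations whose parity is 1 (one PLA product term each, as the slide says) and compares that with the XOR-gate and ROM-cell counts; the function names are illustrative.

```python
from itertools import product


def parity_minterms(n: int) -> int:
    """Count the n-bit inputs whose mod-2 sum is 1 (one product term each)."""
    return sum(1 for bits in product((0, 1), repeat=n) if sum(bits) % 2 == 1)


def xor_gates(n: int) -> int:
    """Random logic: a chain or tree of 2-input XOR gates."""
    return n - 1


def rom_cells(n: int) -> int:
    """ROM: one cell per input combination."""
    return 2 ** n


for n in (4, 8, 16):
    print(n, xor_gates(n), parity_minterms(n), rom_cells(n))
```

The PLA count doubles with every extra input while the XOR count grows by one, which is why parity is the "not always" case.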

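State Elements
- D latch: when the latch is "open", output = data
- D flip-flop: output only changes at the clock edge
[Figure: a D latch drawn with data, clk, and "and" gates producing output; a D flip-flop drawn as two D latches in series between data and output, sharing clk.]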
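A behavioural Python sketch of the two elements follows. It assumes the latch is "open" when its enable is 1, and that the flip-flop is built from two latches with opposite enable polarity so that the output changes on the rising clock edge; the polarity and the class names are assumptions, not taken from the figure.

```python
class DLatch:
    def __init__(self) -> None:
        self.q = 0  # initial output 0 is an assumption

    def update(self, d: int, enable: int) -> int:
        if enable:          # latch is "open": output follows data
            self.q = d
        return self.q       # latch is closed: output holds its value


class DFlipFlop:
    """Two latches in series; output changes only when clk goes 0 -> 1."""

    def __init__(self) -> None:
        self.master = DLatch()
        self.slave = DLatch()

    def update(self, d: int, clk: int) -> int:
        m = self.master.update(d, 1 - clk)  # master open while clk == 0
        return self.slave.update(m, clk)    # slave open while clk == 1
```

With clk held at 1, changes on d never reach the output, because the master latch is closed for the whole high phase.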

Storage Element: Register
Register
- Like a D flip-flop, except:
  - N-bit input and output (there are really N flip-flops)
  - Write Enable input
- Write Enable:
  - 0: data in the register will not change
  - 1: Data Out becomes Data In (on the clock edge)
[Figure: N flip-flops with N-bit Data In, N-bit Data Out, Clk, and Write Enable.]
As far as storage elements are concerned, we will need an N-bit register that is similar to the D flip-flop I showed you in class. The significant difference is that the register has a Write Enable input. The content of the register will NOT be updated if Write Enable is 0. The content is updated at the clock tick ONLY if Write Enable is 1.
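The Write Enable behaviour can be sketched as a small Python class. Each call to the (hypothetical) method `clock_edge` stands for one clock tick; the default width of 32 bits and the initial value of 0 are assumptions.

```python
class Register:
    def __init__(self, n_bits: int = 32) -> None:
        self.mask = (1 << n_bits) - 1  # keep only N bits
        self.data_out = 0              # initial contents are an assumption

    def clock_edge(self, data_in: int, write_enable: int) -> int:
        """One clock tick: load Data In only when Write Enable is 1."""
        if write_enable:
            self.data_out = data_in & self.mask
        return self.data_out
```

A tick with `write_enable=0` leaves `data_out` untouched no matter what is on `data_in`.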

Register File for MIPS
We need 32 reg's and 3 ports:
- Two 32-bit output buses (A & B)
- One 32-bit input bus (W)
Register selection:
- RA selects the register to put on busA
- RB selects the register to put on busB
- RW selects the register to be written via busW when Write Enable is 1
What happens if RW = RA and Write Enable = 1?
[Figure: block of 32 32-bit registers with 5-bit RW, RA, RB inputs, Write Enable, Clk, 32-bit busW in, and 32-bit busA and busB out.]
We will also need a register file that consists of 32 32-bit registers with two output busses (busA and busB) and one input bus. The register specifiers Ra and Rb select the registers to put on busA and busB respectively. When Write Enable is 1, the register specifier Rw selects the register to be written via busW. In our simplified version of the register file, the write operation occurs at the clock tick. Keep in mind that the clock input is a factor ONLY during the write operation. During a read, the register file behaves as a combinational logic block. That is, if you put a valid value on Ra, then busA will become valid after the register file's access time. Similarly, if you put a valid value on Rb, busB will become valid after the register file's access time. In both cases (Ra and Rb), the clock input is not a factor.
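A behavioural sketch of this register file in Python: reads are plain lookups with no clock involved, and writes happen only in the method that models a clock tick. The class and method names are hypothetical, and the sketch ignores access time and does not give register 0 any special treatment, since the slide does not discuss either in detail.

```python
class RegisterFile:
    def __init__(self) -> None:
        self.regs = [0] * 32  # 32 registers of 32 bits, initially 0 (assumption)

    def read(self, ra: int, rb: int) -> tuple:
        """Combinational read: returns (busA, busB); the clock is not a factor."""
        return self.regs[ra], self.regs[rb]

    def clock_edge(self, rw: int, bus_w: int, write_enable: int) -> None:
        """One clock tick: write busW into register RW if Write Enable is 1."""
        if write_enable:
            self.regs[rw] = bus_w & 0xFFFFFFFF
```

In this sketch, when RW = RA and Write Enable = 1, a read before the tick sees the old value and a read after the tick sees the new one; real hardware timing around the edge is not modelled.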

Implementing read ports
[Figure: Register 0, Register 1, ..., Register 31 all feed two multiplexors. Read address 1 controls the first mux, which produces read data 1; read address 2 controls the second mux, which produces read data 2.]

Implementing the write port
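[Figure: a decoder driven by the write address has one output per register. Each decoder output is combined with clk by an "and" gate to clock Register 0, Register 1, ..., Register 31; write data is wired to every register.]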
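The decoder-plus-and-gate structure of the write port might be sketched like this. It is only an approximation: the figure as transcribed shows clk and the decoder outputs going into the "and" gates, and folding Write Enable into the same gate is an assumption, as are the function and parameter names.

```python
def write_port(regs: list, write_addr: int, write_data: int,
               write_enable: int, clk_edge: int) -> list:
    """Return the register contents after one (possible) clock edge."""
    # Decoder: exactly one select line is 1.
    select = [1 if i == write_addr else 0 for i in range(len(regs))]
    # "and" gates: a register loads only if selected, enabled, and clocked.
    load = [s & write_enable & clk_edge for s in select]
    # Write data is wired to every register; only the loading one takes it.
    return [write_data if ld else old for ld, old in zip(load, regs)]
```

Because the decoder raises a single select line, at most one register changes per clock edge.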

Storage Element: Memory
Memory
- One input bus: Data In
- One output bus: Data Out
Memory word is selected by the Address:
- If Write Enable = 0, the memory location selected by Address is put on the Data Out bus
- If Write Enable = 1, the memory location selected by Address is overwritten by Data In
Clock input (CLK):
- The CLK input is used ONLY during the write operation
- For a read, memory acts as combinational logic: Address valid -> Data Out valid after "access time"
[Figure: memory block with Write Enable, Address, 32-bit Data In, 32-bit Data Out, and Clk.]
The last storage element you will need for the datapath is the idealized memory to store your data and instructions. This idealized memory block has just one input bus (Data In) and one output bus (Data Out). When Write Enable is 0, the address selects the memory word to put on the Data Out bus. When Write Enable is 1, the address selects the memory word to be written via the Data In bus at the next clock tick. Once again, the clock input is a factor ONLY during the write operation. During a read, the memory behaves as a combinational logic block. That is, if you put a valid value on the address lines, the output bus Data Out will become valid after the access time of the memory.
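The idealized memory can be modelled in the same style as the other storage elements. In this sketch the class name `IdealMemory`, the dictionary-based storage, and the rule that unwritten words read as 0 are all assumptions; access time is not modelled.

```python
class IdealMemory:
    def __init__(self) -> None:
        self.words = {}  # address -> 32-bit word

    def data_out(self, address: int) -> int:
        """Combinational read; unwritten words read as 0 (assumption)."""
        return self.words.get(address, 0)

    def clock_edge(self, address: int, data_in: int, write_enable: int) -> None:
        """One clock tick: overwrite the addressed word if Write Enable is 1."""
        if write_enable:
            self.words[address] = data_in & 0xFFFFFFFF
```

As with the register file, only `clock_edge` depends on the clock; `data_out` can be called at any time.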

Clocking Methodology
- All storage elements are clocked by the same clock edge
- Combinational logic between storage elements must settle to correct output values in the time indicated by the dark bar
[Figure: clock waveform (Clk) with Setup and Hold windows around each clock edge and "Don't Care" regions in between.]
Remember, we will be using a clocking methodology where all storage elements are clocked by the same clock edge. Consequently, our cycle time will be the sum of: (a) the clock-to-Q time of the input registers, (b) the longest delay path through the combinational logic block, (c) the setup time of the output register, and (d) the clock skew. In order to avoid a hold time violation, you have to make sure this inequality is fulfilled.
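The cycle-time sum in the notes can be written down directly. The hold-time inequality itself is not in the transcript, so the check below uses a commonly taught form (clock-to-Q plus shortest logic delay must be at least hold time plus skew); treat that form, the function names, and the nanosecond figures in the example as assumptions.

```python
def min_cycle_time(clk_to_q: float, longest_delay: float,
                   setup: float, skew: float) -> float:
    """Cycle time = clock-to-Q + longest logic delay + setup time + clock skew."""
    return clk_to_q + longest_delay + setup + skew


def hold_ok(clk_to_q: float, shortest_delay: float,
            hold: float, skew: float) -> bool:
    """Assumed hold check: new data must not arrive before the hold window ends."""
    return clk_to_q + shortest_delay >= hold + skew


# Hypothetical numbers, in nanoseconds.
print(min_cycle_time(clk_to_q=0.5, longest_delay=6.0, setup=0.25, skew=0.25))
print(hold_ok(clk_to_q=0.5, shortest_delay=0.25, hold=0.5, skew=0.125))
```

Note that the cycle time depends on the longest path through the logic, while the hold check depends on the shortest one.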

Computer building block of the day CORE STORAGE
Mercury delay lines (Univac I's storage) were 100x cheaper than vacuum tubes. They were replaced by CRT memory (similar, using light instead of sound). But memory was still expensive and unreliable.
"Cores" (little donuts) of certain materials are interesting:
- If you pass enough current through, it magnetizes "0" or "1"
- If you pass less current through a magnetized core, it sends a pulse down a second wire but doesn't change
This led to the invention of "core storage": 2-D arrays of cores.
- "Read" by sending half-critical current through a row and sensing the column
- "Write" the selected core with half-critical current through its row and column
Cores used were on the order of .2 inch in diameter.

Core storage used on Whirlwind computer developed at MIT in early 50’s
- .14 inch in diameter
- bit words of storage
Cores improved steadily over the next 20 years:
- .03" core with .019" hole
- 4 wires passed through each (X, Y, inhibit, sense)
- Speeds around 1 microsecond
And the inevitable patent problems:
- MIT got 2 cents per core
- IBM made billion cores/year
- In 1964, IBM paid a one-time fee of $13M, the biggest patent payment to date
