COSC121: Computer Systems: Review Jeremy Bolton, PhD Assistant Teaching Professor Constructed using materials: - Patt and Patel Introduction to Computing Systems (2nd) - Patterson and Hennessy Computer Organization and Design (4th) **A special thanks to Rich Squier
Notes A good understanding of Digital Logic and Digital Design is a prerequisite. Please refresh as needed. Review the von Neumann Design, Basic Data Path, and LC-3 Implementation Read Patt and Patel CHs 4-5 (PP.4-PP.5) Complete HW#1 + HW#2
Outline Von Neumann Model LC-3 Main components and bottleneck Instruction processing and data path LC-3 ISA Operations and codes Addressing Modes Major components of implementation Example program
This week … our journey takes us … COSC 121: Computer Systems Application (Browser) Operating System (Win, Linux) Compiler COSC 255: Operating Systems Software Assembler Drivers Instruction Set Architecture Hardware Processor Memory I/O system Datapath & Control Digital Design COSC 120: Computer Hardware Circuit Design transistors
PP.4 Reviewing: The Von Neumann Model and LC3 Implementation
The Stored Program Computer 1945: John von Neumann wrote a report on the stored program concept The basic structure proposed in the draft became known as the “von Neumann machine” (or model). a memory, containing instructions and data a processing unit, for performing arithmetic and logical operations a control unit, for interpreting instructions Plus input and output All components connected by a shared bus. The von Neumann bottleneck
Von Neumann Model
The LC-3 as a von Neumann machine. Lets review each component …
Memory 2k x m array of stored bits Address Contents Basic Operations: Address Space Addressability 2k x m array of stored bits Address unique (k-bit) identifier of location Contents m-bit value stored in location Basic Operations: LOAD read a value from a memory location STORE write a value to a memory location 0000 0001 0010 0011 0100 0101 0110 1101 1110 1111 • 00101101 10100010 Address Space vs Addressibility.
Interface to Memory How does processing unit get data to/from memory? MAR: Memory Address Register MDR: Memory Data Register To LOAD a location (A): Write the address (A) into the MAR. Send a “read” signal to the memory. Read the data from MDR. To STORE a value (X) to a location (A): Write the data (X) to the MDR. Send a “write” signal to the memory.
Processing Unit Functional Units Registers Word Size ALU = Arithmetic and Logic Unit could have many functional units. some of them special-purpose (multiply, square root, … more later!) LC-3 performs ADD, AND, NOT Registers Small, temporary storage Operands and results of functional units LC-3 has eight registers (R0, …, R7), each 16 bits wide Word Size number of bits normally processed by ALU in one instruction also width of registers LC-3 is 16 bits
Input and Output Devices for getting data into and out of computer memory Each device has its own interface, usually a set of registers like the memory’s MAR and MDR LC-3 supports keyboard (input) and monitor (output) keyboard: data register (KBDR) and status register (KBSR) monitor: data register (DDR) and status register (DSR) Some devices provide both input and output disk, network Program that controls access to a device is usually called a driver.
Control Unit Orchestrates execution of the program Instruction Register (IR) contains the current instruction. Program Counter (PC) contains the address of the next instruction to be executed. Control unit: reads an instruction from memory the instruction’s address is in the PC interprets the instruction, generating signals that tell the other components what to do an instruction may take many machine cycles to complete
Instruction Processing Fetch instruction from memory Decode instruction Evaluate address Fetch operands from memory Execute operation Store result
Instruction The instruction is the fundamental unit of work. Specifies two things: opcode: operation to be performed operands: data/locations to be used for operation An instruction is encoded as a sequence of bits. (Just like data!) Often, but not always, instructions have a fixed length, such as 16 or 32 bits. Control unit interprets instruction: generates sequence of control signals to carry out operation. A computer’s instructions and their formats is known as its Instruction Set Architecture (ISA).
Example: LC-3 ADD Instruction LC-3 has 16-bit instructions. Each instruction has a four-bit opcode, bits [15:12]. LC-3 has eight registers (R0-R7) for temporary storage. Sources and destination of ADD are registers. “Add the contents of R2 to the contents of R6, and store the result in R6.”
Example: LC-3 LDR Instruction Load instruction -- reads data from memory Base + offset mode: add offset to base register -- result is memory address load from memory address into destination register “Add the value 6 to the contents of R3 to form a memory address. Load the contents of that memory location to R2.”
Instruction Processing: FETCH Load next instruction (at address stored in PC) from memory into Instruction Register (IR). Copy contents of PC into MAR. Send “read” signal to memory. Copy contents of MDR into IR. Then increment PC, so that it points to the next instruction in sequence. PC becomes PC+1. D EA OP EX S
Instruction Processing: DECODE F First identify the opcode. In LC-3, this is always the first four bits of instruction. A 4-to-16 decoder asserts a control line corresponding to the desired opcode. Depending on opcode, identify other operands from the remaining bits. Example: for LDR, last six bits is offset for ADD, last three bits is source operand #2 D EA OP EX S
Instruction Processing: EVALUATE ADDRESS F For instructions that require memory access, compute address used for access. Examples: add offset to base register (as in LDR) add offset to PC add offset to zero D EA OP EX S
Instruction Processing: FETCH OPERANDS Obtain source operands needed to perform operation. Examples: load data from memory (LDR) read data from register file (ADD) D EA OP EX S
Instruction Processing: EXECUTE F Perform the operation, using the source operands. Examples: send operands to ALU and assert ADD signal do nothing (e.g., for loads and stores) D EA OP EX S
Instruction Processing: STORE RESULT F Write results to destination. (register or memory) Examples: result of ADD is placed in destination register result of memory load is placed in destination register for store instruction, data is stored to memory write address to MAR, data to MDR assert WRITE signal to memory D EA OP EX S
Changing the Sequence of Instructions In the FETCH phase, we increment the Program Counter by 1. What if we don’t want to always execute the instruction that follows this one? examples: loop, if-then, function call Need special instructions that change the contents of the PC. These are called control instructions. jumps are unconditional -- they always change the PC branches are conditional -- they change the PC only if some condition is true (e.g., the result of an ADD is zero)
Example: LC-3 JMP Instruction Set the PC to the value contained in a register. This becomes the address of the next instruction to fetch. “Load the contents of R3 into the PC.”
Instruction Processing Summary Instructions look just like data -- it’s all interpretation. Three basic kinds of instructions: computational instructions (ADD, AND, …) data movement instructions (LD, ST, …) control instructions (JMP, BRnz, …) Six basic phases of instruction processing: F D EA OP EX S not all phases are needed by every instruction phases may take variable number of machine cycles
Control Unit State Diagram The control unit is a state machine. Here is part of a simplified state diagram for the LC-3: A more complete state diagram is in Appendix C. It will be more understandable after Chapter 5.
The off-switch?: Stopping the Clock Control unit will repeat instruction processing sequence as long as clock is running. If not processing instructions from your application, then it is processing instructions from the Operating System (OS). The OS is a special program that manages processor and other resources. To stop the computer: AND the clock generator signal with ZERO When control unit stops seeing the CLOCK signal, it stops processing. HALT instruction Does LC3 have an instruction to re-start the computer?
Chapter 5 The LC-3
Instruction Set Architecture ISA = All of the programmer-visible components and operations of the computer memory organization address space -- how may locations can be addressed? addressibility -- how many bits per location? register set how many? what size? how are they used? instruction set opcodes data types addressing modes ISA provides all information needed for someone that wants to write a program in machine language (or translate from a high-level language to machine language).
LC-3 Overview: Memory and Registers address space: 216 locations (16-bit addresses) addressability: 16 bits Registers temporary storage, accessed in a single machine cycle accessing memory generally takes longer than a single cycle eight general-purpose registers: R0 - R7 each 16 bits wide how many bits to uniquely identify a register? other registers not directly addressable, but used by (and affected by) instructions PC (program counter), condition codes
LC-3 Overview: Instruction Set Opcodes 15 opcodes Operate instructions: ADD, AND, NOT Data movement instructions: LD, LDI, LDR, LEA, ST, STR, STI Control instructions: BR, JSR/JSRR, JMP, RTI, TRAP some opcodes set/clear condition codes, based on result: N = negative, Z = zero, P = positive (> 0) Data Types 16-bit 2’s complement integer Addressing Modes How is the location of an operand specified? non-memory addresses: immediate, register memory addresses: PC-relative, indirect, base+offset
Operate Instructions Only three operations: ADD, AND, NOT Source and destination operands are registers These instructions do not reference memory. ADD and AND can use “immediate” mode, where one operand is hard-wired into the instruction. In the following we will show a dataflow diagram with each instruction. illustrates when and where data moves to accomplish the desired operation
NOT (Register) Note: Src and Dst could be the same register.
this zero means “register mode” ADD/AND (Register) this zero means “register mode”
this one means “immediate mode” ADD/AND (Immediate) this one means “immediate mode” Note: Immediate field is sign-extended. .
Using Operate Instructions Things to ponder … LC3 is quite limited … with only ADD, AND, NOT… How do we subtract? How do we OR? How do we copy from one register to another? How do we initialize a register to zero? Subtract: R3 = R1 - R2 Take 2’s complement of R2, then add to R1. (1) R2 = NOT(R2) (2) R2 = R2 + 1 (3) R3 = R1 + R2 OR: R3 = R1 OR R2 Use DeMorgan’s Law -- invert R1 and R2, AND, then invert result. (1) R1 = NOT(R1) (2) R2 = NOT(R2) (3) R3 = R1 AND R2 (4) R3 = NOT(R3) Register-to-register copy: R3 = R2 R3 = R2 + 0 (Add-immediate) Initialize to zero: R1 = 0 R1 = R1 AND 0 (And-immediate)
Data Movement Instructions Load -- read data from memory to register LD: PC-relative mode LDR: base+offset mode LDI: indirect mode Store -- write data from register to memory ST: PC-relative mode STR: base+offset mode STI: indirect mode Load effective address -- compute address, save in register LEA: immediate mode does not access memory
PC-Relative Addressing Mode Want to specify address directly in the instruction But an address is 16 bits, and so is an instruction! After subtracting 4 bits for opcode and 3 bits for register, we have 9 bits available for address. Solution: Use the 9 bits as a signed offset from the current PC. 9 bits: Can form any address X, such that: Remember that PC is incremented as part of the FETCH phase; This is done before the EVALUATE ADDRESS stage.
LD (PC-Relative)
ST (PC-Relative)
Indirect Addressing Mode With PC-relative mode, can only address data within 256 words of the instruction. What about the rest of memory? Solution #1: Read address from memory location, then load/store to that address. First address is generated from PC and IR (just like PC-relative addressing), then content of that address is used as target for load/store.
LDI (Indirect)
STI (Indirect)
Base + Offset Addressing Mode With PC-relative mode, can only address data within 256 words of the instruction. What about the rest of memory? Solution #2: Use a register to generate a full 16-bit address. 4 bits for opcode, 3 for src/dest register, 3 bits for base register -- remaining 6 bits are used as a signed offset. Offset is sign-extended before adding to base register.
LDR (Base+Offset)
STR (Base+Offset)
Load Effective Address Computes address like PC-relative (PC plus signed offset) and stores the result into a register. Note: The address is stored in the register, not the contents of the memory location.
LEA (Immediate)
Example: Different addressing methods Instruction Comments x30F6 1 1 1 0 0 0 1 1 1 1 1 1 1 1 0 1 R1 PC – 3 = x30F4 x30F7 0 0 0 1 0 1 0 0 0 1 1 0 1 1 1 0 R2 R1 + 14 = x3102 x30F8 0 0 1 1 0 1 0 1 1 1 1 1 1 0 1 1 M[PC - 5] R2 M[x30F4] x3102 x30F9 0 1 0 1 0 1 0 0 1 0 1 0 0 0 0 0 R2 0 x30FA 0 0 0 1 0 1 0 0 1 0 1 0 0 1 0 1 R2 R2 + 5 = 5 x30FB 0 1 1 1 0 1 0 0 0 1 0 0 1 1 1 0 M[R1+14] R2 M[x3102] 5 x30FC 1 0 1 0 0 1 1 1 1 1 1 1 0 1 1 1 R3 M[M[x30F4]] R3 M[x3102] R3 5 Load Immediate Store PC - relative Store - register Load indirect opcode
Control Instructions Used to alter the sequence of instructions (by changing the Program Counter) Conditional Branch branch is taken if a specified condition is true signed offset is added to PC to yield new PC else, the branch is not taken PC is not changed, points to the next sequential instruction Unconditional Branch (or Jump) always changes the PC TRAP changes PC to the address of an OS “service routine” routine will return control to the next instruction (after TRAP)
Condition Codes LC-3 has three condition code registers: N -- negative Z -- zero P -- positive (greater than zero) Set by any instruction that writes a value to a register (ADD, AND, NOT, LD, LDR, LDI, LEA) Exactly one will be set at all times Based on the last instruction that altered a register
Branch Instruction Branch specifies one or more condition codes. If the set bit is specified, the branch is taken. PC-relative addressing: target address is made by adding signed offset (IR[8:0]) to current PC. Note: PC has already been incremented by FETCH stage. Note: Target must be within 256 words of BR instruction. If the branch is not taken, the next sequential instruction is executed.
BR (PC-Relative) What happens if bits [11:9] are all zero? All one? If all zero, no CC is tested, so branch is never taken. (See Appendix B.) If all one, then all are tested. Since at least one of the CC bits is set to one after each operate/load instruction, then branch is always taken. (Assumes some instruction has set CC before branch instruction, otherwise undefined.) What happens if bits [11:9] are all zero? All one?
Using Branch Instructions Compute sum of 12 integers. Numbers start at location x3100. Program starts at location x3000. R1 x3100 R3 0 R2 12 R2=0? R4 M[R1] R3 R3+R4 R1 R1+1 R2 R2-1 NO YES
Sample Program Address Instruction Comments x3000 1 1 1 0 0 0 1 0 1 1 1 1 1 1 1 1 R1 x3100 (PC+0xFF) x3001 0 1 0 1 0 1 1 0 1 1 1 0 0 0 0 0 R3 0 x3002 0 1 0 1 0 1 0 0 1 0 1 0 0 0 0 0 R2 0 x3003 0 0 0 1 0 1 0 0 1 0 1 0 1 1 0 0 R2 12 x3004 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 1 If Z, goto x300A (PC+5) x3005 0 1 1 0 1 0 0 0 0 1 0 0 0 0 0 0 Load next value to R4 x3006 0 0 0 1 0 1 1 0 1 1 0 0 0 0 0 1 Add to R3 x3007 0 0 0 1 0 0 1 0 0 1 1 0 0 0 0 1 Increment R1 (pointer) X3008 0 0 0 1 0 1 0 0 1 0 1 1 1 1 1 1 Decrement R2 (counter) x3009 0 0 0 0 1 1 1 1 1 1 1 1 1 0 1 0 Goto x3004 (PC-6)
JMP (Register) Jump is an unconditional branch -- always taken. Target address is the contents of a register. Allows any target address.
TRAP Calls a service routine, identified by 8-bit “trap vector.” When routine is done, PC is set to the instruction following TRAP. (We’ll talk about how this works later.) vector routine x23 input a character from the keyboard x21 output a character to the monitor x25 halt the program
LC-3 Data Path Revisited Summary: LC-3 Implementation LC-3 Data Path Revisited Filled arrow = info to be processed. Unfilled arrow = control signal.
Summary: Data Path Components Global bus special set of wires that carry a 16-bit signal to many components inputs to the bus are “tri-state devices,” that only place a signal on the bus when they are enabled only one (16-bit) signal should be enabled at any time control unit decides which signal “drives” the bus any number of components can read the bus register only captures bus data if it is write-enabled by the control unit Memory Control and data registers for memory and I/O devices memory: MAR, MDR (also control signal for read/write)
Summary: Data Path Components ALU Accepts inputs from register file and from sign-extended bits from IR (immediate field). Output goes to bus. used by condition code logic, register file, memory Register File Two read addresses (SR1, SR2), one write address (DR) Input from bus result of ALU operation or memory read Two 16-bit outputs used by ALU, PC, memory address data for store instructions passes through ALU
Summary: Data Path Components PC and PCMUX Three inputs to PC, controlled by PCMUX PC+1 – FETCH stage Address adder – BR, JMP bus – TRAP (discussed later) MAR and MARMUX Two inputs to MAR, controlled by MARMUX Address adder – LD/ST, LDR/STR Zero-extended IR[7:0] -- TRAP (discussed later)
Summary: Data Path Components Condition Code Logic Looks at value on bus and generates N, Z, P signals Registers set only when control unit enables them (LD.CC) only certain instructions set the codes (ADD, AND, NOT, LD, LDI, LDR, LEA) Control Unit – Finite State Machine On each machine cycle, changes control signals for next phase of instruction processing who drives the bus? (GatePC, GateALU, …) which registers are write enabled? (LD.IR, LD.REG, …) which operation should ALU perform? (ALUK) … Logic includes decoder for opcode, etc.
Jeremy Bolton, PhD Assistant Teaching Professor Appendix Jeremy Bolton, PhD Assistant Teaching Professor Constructed using materials: - Patt and Patel Introduction to Computing Systems (2nd) - Patterson and Hennessy Computer Organization and Design (4th) **A special thanks to Rich Squier
Prgramming Example Count the occurrences of a character in a file Program begins at location x3000 Read character from keyboard Load each character from a “file” File is a sequence of memory locations Starting address of file is stored in the memory location immediately after the program If file character equals input character, increment counter End of file is indicated by a special ASCII value: EOT (x04) At the end, print the number of characters and halt (assume there will be less than 10 occurrences of the character) A special character used to indicate the end of a sequence is often called a sentinel. Useful when you don’t know ahead of time how many times to execute a loop.
Flow Chart
Program (1 of 2) Address Instruction Comments x3000 0 1 0 1 0 1 0 0 1 0 1 0 0 0 0 0 R2 0 (counter) x3001 0 0 1 0 0 1 1 0 0 0 0 1 0 0 0 0 R3 M[x3102] (ptr) x3002 1 1 1 1 0 0 0 0 0 0 1 0 0 0 1 1 Input to R0 (TRAP x23) x3003 0 1 1 0 0 0 1 0 1 1 0 0 0 0 0 0 R1 M[R3] x3004 0 0 0 1 1 0 0 0 0 1 1 1 1 1 0 0 R4 R1 – 4 (EOT) x3005 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 If Z, goto x300E x3006 1 0 0 1 0 0 1 0 0 1 1 1 1 1 1 1 R1 NOT R1 x3007 0 0 0 1 0 0 1 0 0 1 1 0 0 0 0 1 R1 R1 + 1 X3008 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0 0 R1 R1 + R0 x3009 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 1 If N or P, goto x300B
Starting Address of File Program (2 of 2) Address Instruction Comments x300A 0 0 0 1 0 1 0 0 1 0 1 0 0 0 0 1 R2 R2 + 1 x300B 0 0 0 1 0 1 1 0 1 1 1 0 0 0 0 1 R3 R3 + 1 x300C 0 1 1 0 0 0 1 0 1 1 0 0 0 0 0 0 R1 M[R3] x300D 0 0 0 0 1 1 1 1 1 1 1 1 0 1 1 0 Goto x3004 x300E 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 R0 M[x3013] x300F 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 R0 R0 + R2 x3010 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 1 Print R0 (TRAP x21) x3011 1 1 1 1 0 0 0 0 0 0 1 0 0 1 0 1 HALT (TRAP x25) X3012 Starting Address of File x3013 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 ASCII x30 (‘0’)