INSTRUCTION SET ARCHITECTURES Chapter 5
Introduction
- High-level languages hide the instruction set
- So why do we need to study it?
- Efficient programming
- Understanding the machine architecture
Instruction Formats
Architectures can be differentiated by the following:
- Number of bits per instruction (16, 32, 64)
- Operand storage in the CPU (data on a stack or in registers)
- Number of explicit operands per instruction (0-3)
- Operand location (register-to-register, register-to-memory, memory-to-memory)
- Operations (types and access restrictions)
- Type and size of operands
Design decisions for instruction sets
Defining the instruction set format is very important. Factors to consider:
- Amount of space required by a program
- Complexity of the instruction set
- Length of instructions
- Number of instructions
Continuing…
How to choose the design?
- Short instructions are typically better (but they reduce the possibilities)
- Instructions of fixed length (easy to decode, but waste space)
- Byte-addressable memory
- Fixed length does not imply a fixed number of operands
- Addressing modes
- Little or big endian?
- Registers (how many, and how to use them)
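Little versus big endian
- Endianness is the order in which a computer stores the bytes of a multiple-byte data element
- Little endian = store the least significant byte at the lowest address
- Big endian = guess… (the most significant byte at the lowest address)
- UNIX machines are usually big endian, PCs usually little endian
- Intel is always little endian, Motorola always big endian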
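OK, but what does it mean?
If you have a 4-byte integer made of Byte 3 (most significant), Byte 2, Byte 1, Byte 0 (least significant):

Base address + | Little endian | Big endian
0              | Byte 0        | Byte 3
1              | Byte 1        | Byte 2
2              | Byte 2        | Byte 1
3              | Byte 3        | Byte 0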
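The byte layout above can be checked with a minimal Python sketch. The value `0x0A0B0C0D` and the helper `layout` are illustrative assumptions, not part of the slides; `int.to_bytes` is the standard library call that produces each ordering.

```python
# Assumed example value: Byte 3 = 0x0A (most significant) ... Byte 0 = 0x0D.
value = 0x0A0B0C0D

# Serialize the same integer in both byte orders.
little = value.to_bytes(4, byteorder="little")
big = value.to_bytes(4, byteorder="big")


def layout(data):
    """Return (offset from base address, byte in hex) pairs."""
    return [(offset, f"{byte:02X}") for offset, byte in enumerate(data)]


print("little endian:", layout(little))
print("big endian:   ", layout(big))
```

Offset 0 holds `0D` in the little endian layout and `0A` in the big endian layout, matching the first row of the table.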
Pros | Cons

Characteristic | Little endian | Big endian
Negative values | Must know the size and skip bytes to find the sign | Just look at the first byte
String operations | --- | A bit faster
Bitmaps | --- | Use the same idea (most significant bits at the lower addresses)
32-bit to 16-bit address conversion | --- | Requires addition
High-precision arithmetic | Faster and easier | ---
Allows words to be written on non-word address boundaries | Yes, does not waste space and makes programming easy | No, wastes space
Networks are big endian | Has to convert | "Speaks" the same language
Internal storage in CPU
- Stack
  - Uses a stack to execute instructions
  - Operands are found on top of the stack
  - Doesn't allow random access (of course, it is a stack)
- Accumulator
  - Minimizes complexity
  - Short instructions
- General Purpose Register (GPR)
  - Most widely accepted
  - Fast
  - Easy to work with
Continuing…
GPR architectures can be divided into three types:
- Memory-memory
  - Digital Equipment's VAX
- Register-memory
  - Intel, Motorola
- Load-store
  - Data must be loaded into registers before operations are performed
  - SPARC, MIPS, Alpha, PowerPC
Number of operands and instruction length
- Fixed length
  - Wastes space but is fast
- Variable length
  - More complex to decode, but saves storage space
- Instructions need to be aligned with words
- Common formats
  - OPCODE only
  - OPCODE + 1 address (usually a memory address)
  - OPCODE + 2 addresses (usually registers, or one register and one memory address)
  - OPCODE + 3 addresses (usually registers, or a combination of registers and memory)
Continuing…
- Machine instructions that have no operands must use a stack (operations work on the top elements) to perform an operation that requires one or two operands
- Instructions: READ, STO X, STO Y, LOAD X, LOAD Y, ADD, STO Z
- [Figure: stack contents (POS) and memory cells X, Y, Z as the program runs; the values 2, 3 and 5 appear in the trace]
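The zero-operand idea can be sketched as a tiny stack interpreter. This is an illustration under assumptions: the function `run_stack_program`, the starting values X = 2 and Y = 3, and the exact meaning given to LOAD, STO and ADD are hypothetical, and READ is left out because its behavior is not defined in the slide.

```python
def run_stack_program(program, memory):
    """Run zero-address style instructions against an explicit stack.

    LOAD name pushes memory[name]; STO name pops the top into memory[name];
    ADD pops two values and pushes their sum (no operands named at all).
    """
    stack = []
    for instruction in program:
        parts = instruction.split()
        op = parts[0]
        if op == "LOAD":
            stack.append(memory[parts[1]])
        elif op == "STO":
            memory[parts[1]] = stack.pop()
        elif op == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return memory


# Assumed initial memory contents.
memory = run_stack_program(
    ["LOAD X", "LOAD Y", "ADD", "STO Z"],
    {"X": 2, "Y": 3, "Z": 0},
)
print(memory)
```

Note that ADD carries no address: both of its operands are implied by the stack position, which is why the stack is mandatory for this format.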
Continuing…
The same algorithm as before would look like this, depending on the number of addresses. Note that in the examples I presume that Z is initially 0, and that every version starts with READ X and READ Y.

Three addresses | Two addresses | One address
ADD Z, X, Y     | ADD Z, X      | LOAD X
                | ADD Z, Y      | ADD Z
                |               | LOAD Y
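To compare the three formats, here is a minimal interpreter sketch for a hypothetical ISA. The function `run` and the tuple encoding are assumptions. The one-address program uses a conventional accumulator sequence (LOAD, ADD, STORE), which is an assumption and differs from the mnemonics listed on the slide.

```python
def run(program, memory):
    """Interpret a tiny hypothetical ISA; return (memory, instruction count)."""
    acc = 0  # accumulator, used only by the one-address forms
    for op, *args in program:
        if op == "ADD" and len(args) == 3:    # ADD dst, a, b : dst <- a + b
            dst, a, b = args
            memory[dst] = memory[a] + memory[b]
        elif op == "ADD" and len(args) == 2:  # ADD dst, a : dst <- dst + a
            dst, a = args
            memory[dst] += memory[a]
        elif op == "ADD" and len(args) == 1:  # ADD a : acc <- acc + a
            acc += memory[args[0]]
        elif op == "LOAD":                    # LOAD a : acc <- a
            acc = memory[args[0]]
        elif op == "STORE":                   # STORE a : a <- acc
            memory[args[0]] = acc
        else:
            raise ValueError(f"unknown instruction: {op} {args}")
    return memory, len(program)


programs = {
    "three": [("ADD", "Z", "X", "Y")],
    "two": [("ADD", "Z", "X"), ("ADD", "Z", "Y")],
    "one": [("LOAD", "X"), ("ADD", "Y"), ("STORE", "Z")],
}

# Each run gets a fresh memory with assumed values X = 2, Y = 3, Z = 0.
results = {name: run(prog, {"X": 2, "Y": 3, "Z": 0}) for name, prog in programs.items()}
for name, (mem, count) in results.items():
    print(name, "Z =", mem["Z"], "instructions =", count)
```

Fewer addresses per instruction means shorter instructions but more of them for the same computation.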
Expanding opcodes
If you have a machine with "short opcodes", you can use part of the "memory addressing bits" to expand the opcodes (some opcodes do not need addresses). Here is an example: on a machine with 16-bit instructions and 16 registers (so each register address takes 4 bits), you can use:
- 4 bits for the opcode and 12 bits for addresses, or
- 8 bits for the opcode and 8 bits for addresses, or
- 12 bits for the opcode and 4 bits for the address, or
- all 16 bits for the opcode
Continuing…

Codes | Opcode bits | Address bits | First encoding | Last encoding
15 3-address codes | 4 | 12 | 0000 R1 R2 R3 | 1110 R1 R2 R3
14 2-address codes | 8 | 8 | 1111 0000 R1 R2 | 1111 1101 R1 R2
31 1-address codes | 12 | 4 | 1111 1110 0000 R1 | 1111 1111 1110 R1
16 0-address codes | 16 | 0 | 1111 1111 1111 0000 | 1111 1111 1111 1111
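The encoding ranges above imply a decoding rule: keep reading 4-bit groups while they are escape patterns. A minimal sketch follows; the function name `decode` and the returned tuple shape are assumptions for illustration.

```python
def decode(word):
    """Classify a 16-bit word as (kind, opcode, register fields)."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("instruction must fit in 16 bits")
    n3 = (word >> 12) & 0xF  # leftmost 4-bit group
    n2 = (word >> 8) & 0xF
    n1 = (word >> 4) & 0xF
    n0 = word & 0xF          # rightmost 4-bit group
    if n3 != 0xF:                   # 0000 .. 1110: 4-bit opcode
        return ("3-address", n3, (n2, n1, n0))
    if n2 <= 0xD:                   # 1111 0000 .. 1111 1101: 8-bit opcode
        return ("2-address", word >> 8, (n1, n0))
    if n2 == 0xE or n1 != 0xF:      # 1111 1110 0000 .. 1111 1111 1110
        return ("1-address", word >> 4, (n0,))
    return ("0-address", word, ())  # 1111 1111 1111 xxxx


# Count the distinct opcodes of each kind over all 65536 possible words.
opcodes = {}
for word in range(0x10000):
    kind, opcode, _ = decode(word)
    opcodes.setdefault(kind, set()).add(opcode)
for kind, codes in opcodes.items():
    print(kind, len(codes))
```

Enumerating every word is a quick way to confirm the counts of 15, 14, 31 and 16 codes.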
Instruction types
- Data movement
- Arithmetic
- Boolean
- Bit manipulation
- Input/Output
- Transfer of control
- Special purpose
Data Movement
- The most frequently used instructions
- Data is moved from memory into registers, from registers to registers, and from registers to memory
- There may be separate instructions: some that move data between memory and registers, others only between registers
- RISC limits which instructions can access memory
- Many machines have different instructions for bytes and words
- MOVE, LOAD, STORE, PUSH, POP, EXCHANGE and variations
Arithmetic operations
- Instructions that deal with numbers (integer and floating point)
- Many instruction sets provide different instructions for different data sizes (to improve performance)
- Operands are often implied
- Use the flag register (to report conditions such as overflow and underflow, and to hold the carry)
- ADD, SUBTRACT, MULTIPLY, DIVIDE, INCREMENT, DECREMENT, NEGATE
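How an ADD instruction might update the flag register can be sketched for an 8-bit word. The function `add8` and the flag names are assumptions for illustration; real architectures differ in which flags exist and how they are named.

```python
def add8(a, b):
    """Add two 8-bit values; return (8-bit result, flags dict)."""
    a &= 0xFF
    b &= 0xFF
    raw = a + b
    result = raw & 0xFF
    flags = {
        # Carry: the unsigned sum did not fit in 8 bits.
        "carry": raw > 0xFF,
        # Overflow: both operands have a sign that differs from the result's,
        # so the signed (two's complement) sum is out of range.
        "overflow": ((a ^ result) & (b ^ result) & 0x80) != 0,
        "zero": result == 0,
        "negative": (result & 0x80) != 0,
    }
    return result, flags


print(add8(0x7F, 0x01))  # largest positive signed value plus one
print(add8(0xFF, 0x01))  # largest unsigned value plus one
```

The two calls show that carry and overflow are independent: the first sets overflow without carry, the second sets carry without overflow.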
Boolean logic instructions
- Perform Boolean operations
- Allow bits to be set, cleared and complemented
- Commonly used to control I/O devices
- Affect the flag register
- AND, NOT, OR, XOR, TEST, COMPARE
Bit manipulation instructions
- Used for setting and resetting a bit or a group of bits within a given data word
- Include arithmetic and logical SHIFT and ROTATE instructions (to the left and to the right)
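The difference between logical shifts, arithmetic shifts and rotates can be sketched on an 8-bit word. The function names and the 8-bit width are assumptions for illustration.

```python
def shl(x, n):
    """Logical shift left: zeros enter on the right, high bits are lost."""
    return (x << n) & 0xFF


def shr_logical(x, n):
    """Logical shift right: zeros enter on the left."""
    return (x & 0xFF) >> n


def shr_arithmetic(x, n):
    """Arithmetic shift right: the sign bit is copied in on the left."""
    x &= 0xFF
    signed = x - 0x100 if x & 0x80 else x  # reinterpret as two's complement
    return (signed >> n) & 0xFF


def rol(x, n):
    """Rotate left: bits leaving on the left re-enter on the right."""
    n %= 8
    x &= 0xFF
    return ((x << n) | (x >> (8 - n))) & 0xFF


def ror(x, n):
    """Rotate right: bits leaving on the right re-enter on the left."""
    n %= 8
    x &= 0xFF
    return ((x >> n) | (x << (8 - n))) & 0xFF


word = 0b1001_0000
print(f"{shr_logical(word, 2):08b}")
print(f"{shr_arithmetic(word, 2):08b}")
print(f"{rol(0b1000_0001, 1):08b}")
print(f"{ror(0b1000_0001, 1):08b}")
```

Shifts discard the bits that fall off the end, while rotates keep every bit and only change its position.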
Input/Output instructions
- Input: transports data from a device or port to a register or memory
- Output: transports data from a register or memory to a device or port
- May be different for characters and numbers
Instructions for transfer of control
- Alter the normal sequence of program execution
- Branches, skips, procedure calls, returns, program termination
- Jump, conditional jump
Special purpose instructions
- String processing, high-level language support, protection, flag control, word/byte conversions, cache management, register access, address calculation, no-ops
- Any other instructions that don't fit in the previously mentioned categories
Instruction set orthogonality
- Don't create instructions that duplicate other instructions; that would be a waste
- Makes writing a language compiler easier
- But orthogonal instruction sets tend to have quite long instruction words, which means programs will be bigger and use more memory
Addressing
- Data types
- Addressing modes
Data types
- Numeric data (integers and floating point numbers)
  - Signed / unsigned
  - Various lengths
  - There might be different instructions for different lengths
- Nonnumeric data (strings, Booleans, pointers)
  - String: copy, move, search, modify
  - Boolean: AND, OR, XOR, NOT
  - Pointers are addresses in memory
Addressing modes
Allow us to specify where an instruction's operands are located:
- Immediate addressing
- Direct addressing
- Register addressing
- Indirect addressing
- Indexed addressing
- Stack addressing (the operand is assumed to be on the stack)
Immediate addressing
- The data to be operated on is part of the instruction
- Very fast, because the value is fetched together with the instruction
- Not very flexible, because the value is fixed at compile time
Direct addressing
- The value to be referenced is obtained by specifying its memory address directly in the instruction
- Fast: the value is not part of the instruction, but it takes a single memory access to load it from that address into a register
- Much more flexible than immediate addressing
Register addressing
- A register, instead of memory, is used to specify the operand
- Very similar to direct addressing, but the address field names a register rather than a memory location, so it is faster
Indirect addressing
- Very powerful
- Very flexible
- The bits in the address field specify the memory address of a pointer; the actual address of the operand is found by going to the address referenced by the pointer
- There is also register indirect addressing, which is quite similar but uses a register to point to the data
Indexed addressing
- There is an index register, and the address of the data we want is the sum of the value in that register and the address in the instruction
- There is also based addressing, which is very similar but reversed: the register holds the base address and the value in the instruction is the offset
- Very useful for accessing arrays and strings
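The addressing modes described on the last few slides can be compared side by side with a minimal sketch. The function `fetch_operand`, the mode names as strings, the register names `R1` and `IX`, and the memory contents are all hypothetical; memory is modeled as a dictionary from address to value.

```python
def fetch_operand(mode, field, memory, registers, index_register="IX"):
    """Resolve an instruction's address field to an operand value."""
    if mode == "immediate":          # the field is the operand itself
        return field
    if mode == "direct":             # the field is the operand's address
        return memory[field]
    if mode == "register":           # the field names a register
        return registers[field]
    if mode == "indirect":           # the field is the address of a pointer
        return memory[memory[field]]
    if mode == "register_indirect":  # a register holds the operand's address
        return memory[registers[field]]
    if mode == "indexed":            # address = field + index register
        return memory[field + registers[index_register]]
    raise ValueError(f"unknown addressing mode: {mode}")


# Assumed machine state for the demonstration.
memory = {0x10: 0x20, 0x20: 99, 0x21: 7}
registers = {"R1": 0x20, "IX": 1}

for mode, field in [
    ("immediate", 0x10),
    ("direct", 0x10),
    ("register", "R1"),
    ("indirect", 0x10),
    ("register_indirect", "R1"),
    ("indexed", 0x20),
]:
    print(mode, fetch_operand(mode, field, memory, registers))
```

The same field value `0x10` yields a different operand under immediate, direct and indirect addressing, and each extra level of indirection costs one more memory lookup.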