Computer architecture Lecture 4: Processor instruction list Piotr Bilski.

Execution of a program
The processor executes machine instructions (after decoding them)
The programmer writes a program in a symbolic low-level or high-level language
During compilation, the symbolic language is translated into machine language instructions

Elements of a machine instruction
Operation code
Argument references (operation input data)
Result reference (if needed)
Reference to the next instruction
Example format of a 16-bit instruction: bits 0–3 hold the operation code, bits 4–15 hold the argument references

Arguments and results are stored in:
Memory (main, cache, virtual)
Processor registers (accumulator, general-purpose registers)
Input/output devices (hard drive, printer)

Instruction types
Data processing (logical and arithmetic operations)
Data storage (instructions related to memory access)
Data transmission (input/output operations)
Control (result testing, non-sequential code execution – jumps, branches)

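Relation between symbolic and machine instructions
Symbolic instruction: x = x + c;
Machine instructions:
LOAD 1001
ADD 1002
STORE 1001
Memory: address 1001 holds x, address 1002 holds c; the addition is performed in the ALU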
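The LOAD / ADD / STORE sequence can be traced with a minimal sketch of a one-address accumulator machine. The function name `run`, the dictionary-based memory, and the sample values of x and c are assumptions made for illustration, not part of the lecture.

```python
# Hypothetical accumulator machine: memory is a dict, the accumulator a local variable.
def run(program, memory):
    """Execute (mnemonic, address) pairs and return the modified memory."""
    acc = 0  # accumulator register
    for op, addr in program:
        if op == "LOAD":
            acc = memory[addr]        # memory -> accumulator
        elif op == "ADD":
            acc += memory[addr]       # ALU adds a memory operand
        elif op == "STORE":
            memory[addr] = acc        # accumulator -> memory
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory

memory = {1001: 5, 1002: 3}  # assumed values: x = 5 at 1001, c = 3 at 1002
program = [("LOAD", 1001), ("ADD", 1002), ("STORE", 1001)]
run(program, memory)
print(memory[1001])  # 8
```

The single symbolic statement becomes three machine instructions because the accumulator is the only place where the addition can happen.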

Number of addresses in the instruction
Example: Y = (A - B) / (C + D*E)

Three addresses:
Instruction   Action
SUB Y,A,B     Y ← A - B
MPY T,D,E     T ← D * E
ADD T,T,C     T ← T + C
DIV Y,Y,T     Y ← Y / T

Two addresses:
Instruction   Action
MOVE Y,A      Y ← A
SUB Y,B       Y ← Y - B
MOVE T,D      T ← D
MPY T,E       T ← T * E
ADD T,C       T ← T + C
DIV Y,T       Y ← Y / T

One address:
Instruction   Action
LOAD D        AC ← D
MPY E         AC ← AC * E
ADD C         AC ← AC + C
STOR Y        Y ← AC
LOAD A        AC ← A
SUB B         AC ← AC - B
DIV Y         AC ← AC / Y
STOR Y        Y ← AC
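As a rough check that the address formats compute the same value, the sketch below interprets a three-address and a one-address program for Y = (A - B) / (C + D*E). The interpreter functions and the sample operand values are assumptions. The one-address program uses the usual textbook form, which saves the denominator in Y before computing the numerator.

```python
import operator

OPS = {"ADD": operator.add, "SUB": operator.sub,
       "MPY": operator.mul, "DIV": operator.truediv}

def run_three_address(program, mem):
    # Each instruction is (op, dest, src1, src2): dest <- src1 op src2
    for op, dest, s1, s2 in program:
        mem[dest] = OPS[op](mem[s1], mem[s2])
    return mem

def run_one_address(program, mem):
    # Each instruction is (op, operand); the accumulator AC is implicit
    ac = 0
    for op, x in program:
        if op == "LOAD":
            ac = mem[x]
        elif op == "STOR":
            mem[x] = ac
        else:
            ac = OPS[op](ac, mem[x])
    return mem

values = {"A": 20, "B": 4, "C": 2, "D": 3, "E": 2}  # assumed sample data

three = [("SUB", "Y", "A", "B"), ("MPY", "T", "D", "E"),
         ("ADD", "T", "T", "C"), ("DIV", "Y", "Y", "T")]
one = [("LOAD", "D"), ("MPY", "E"), ("ADD", "C"), ("STOR", "Y"),
       ("LOAD", "A"), ("SUB", "B"), ("DIV", "Y"), ("STOR", "Y")]

y3 = run_three_address(three, dict(values))["Y"]
y1 = run_one_address(one, dict(values))["Y"]
print(y3, y1)  # 2.0 2.0
```

Fewer addresses per instruction mean shorter instructions but longer programs: four instructions here against eight.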

Number of addresses in the instruction (cont.)
Example: a = b + c
Three addresses: ADD a,b,c
Two addresses: MOVE a,b; ADD a,c
One address: LOAD b; ADD c; STOR a

Instruction list design problems
How many (and which) operations should the processor execute?
What data types (arguments, results)?
What instruction format (length, number of addresses)?
How many (and which) registers?
Which addressing modes?

Operands
Addresses (unsigned integers)
Numbers (numerical data) – fixed point, floating point, decimal
Characters (ASCII / IRA, EBCDIC codes, etc.)
Logical data (single bits)

Computer as the data storage
Multiple-byte data can be written to memory in little-endian, big-endian, or bi-endian order
The storage models differ in the sequence of the bytes stored in memory; for example, the hexadecimal number 76859432 can be written in two ways:
Address          263  264  265  266
Big endian        76   85   94   32
Little endian     32   94   85   76
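The two byte orders for the number 76859432 (hexadecimal) can be reproduced with Python's built-in `int.to_bytes`. This is only an illustration of the ordering; the variable names are assumptions.

```python
value = 0x76859432

# Most significant byte first (stored at the lowest address)
big = value.to_bytes(4, byteorder="big")
# Least significant byte first (stored at the lowest address)
little = value.to_bytes(4, byteorder="little")

print(big.hex(" "))     # 76 85 94 32
print(little.hex(" "))  # 32 94 85 76

# Reading the bytes back with the wrong order yields a different number
misread = int.from_bytes(little, byteorder="big")
print(hex(misread))     # 0x32948576
```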

Little and big endian
Big endian:
– Easy to sort character sequences (strings)
– Allows printing ASCII characters without any conversions
– Integers and characters are stored in the same order
– Used in: Sun SPARC, RISC processors, Motorola 680x0
Little endian:
– Easy to convert a longer number to a shorter one
– Arithmetic operations are easier to execute
– Used in: Intel 80x86, Pentium, Alpha
Bi-endian:
– Understands both standards
– Used in: PowerPC

Examples of little and big endian in file types
Big endian: Adobe Photoshop, IMG (GEM Raster), JPEG, MacPaint, SGI (Silicon Graphics), Sun Raster
Little endian: BMP (Windows, OS/2 Bitmaps), GIF, PCX (PC Paintbrush), TGA (Targa), Microsoft RTF (Rich Text Format)
Bi-endian: Microsoft RIFF (.WAV & .AVI), TIFF, XWD (X Window Dump)

Pentium data types
Data are organized in multiples of a byte (byte – 1 B, word – 2 B, double word – 4 B, etc.)
Floating point formats are compliant with the IEEE 754 standard
Data do not have to be stored at evenly aligned addresses
Unsigned integers (8, 16, 32, 64 bits), used for addresses
Signed integers (8, 16, 32, 64 bits), two's complement representation
Floating point numbers (single, double, and extended double precision)

Pentium data types (cont.)
Generic (any content, 16, 32 or 64 bits long)
Unpacked decimal number in binary representation (one digit per byte)
Packed decimal number in binary representation (two digits per byte)
Pointer (32-bit address)
Bit field
Byte chain

PowerPC data types
Data are 8, 16, 32 or 64 bits long
Alignment of data addresses to an even byte is not required (though sometimes used)
PowerPC is bi-endian
Stored: unsigned and signed numbers (byte (8 b), half-word (16 b), word (32 b), double word (64 b)), floating point numbers (IEEE 754), byte chains (up to 128 B)

Operation classification
Data transfer (STORE, LOAD, SET, PUSH, POP)
Arithmetic (ADD, SUB, NEG, INC, MULT)
Logical (AND, OR, NOT, TEST, SHIFT, ROTATE)
Control passing (JUMP, HALT, EXEC)
Input/output (READ, WRITE)
Conversion (TRANS, CONV)

Data transfer
Aim: to move data from one location to another
Requires: determining the memory location (virtual address?), checking the cache memory, issuing the read/write operation
Example instructions: LOAD, STORE (in short, long, half-word versions, etc.)

Logical operations
Operands are treated as bit chains
The most popular operations: AND, OR, XOR, NOT
Bit chains can be used as masks:
A1 = 10100101 AND A2 = 11110000 gives 10100000
A1 = 10100101 XOR A2 = 11111111 gives 01011010
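The two mask examples can be checked directly with Python's bitwise operators. The variable names below are illustrative only.

```python
a1 = 0b10100101

# AND with 11110000 keeps the upper four bits and clears the lower four
kept_high = a1 & 0b11110000
# XOR with all ones inverts every bit (same effect as NOT on 8 bits)
inverted = a1 ^ 0b11111111

print(f"{kept_high:08b}")  # 10100000
print(f"{inverted:08b}")   # 01011010
```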

Logical operations (cont.)
Logical shifting
Arithmetic shifting
[Figure: shift diagrams, with 0 shifted into the vacated bit positions]
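The difference between the two right shifts is easiest to see on a fixed-width word. The sketch below assumes an 8-bit word and two's complement interpretation; the function names and the sample value are hypothetical.

```python
def logical_shift_right(x, n, bits=8):
    """Shift right, filling the vacated high bits with zeros."""
    mask = (1 << bits) - 1
    return (x & mask) >> n

def arithmetic_shift_right(x, n, bits=8):
    """Shift right, filling the vacated high bits with copies of the sign bit."""
    mask = (1 << bits) - 1
    x &= mask
    shifted = x >> n
    if x >> (bits - 1):                 # sign bit set: value is negative
        shifted |= mask & ~(mask >> n)  # set the n vacated high bits to 1
    return shifted

x = 0b10100110  # -90 as a signed 8-bit value
print(f"{logical_shift_right(x, 2):08b}")     # 00101001
print(f"{arithmetic_shift_right(x, 2):08b}")  # 11101001
```

The arithmetic variant preserves the sign, so it behaves like a division by a power of two (rounding toward minus infinity) for signed numbers.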

Changing execution order
Related to the order in which instructions are executed
Includes jumps, procedure calls, and executing an operation in a loop
Control passing can be conditional or unconditional

Conditional branches
First method: a multiple-bit condition code stores the outcome of an operation that serves as the jump condition, for example the sign of the result, overflow, or a zero result
Second method: the jump condition is embedded in the jump instruction itself
Jumps can go in both directions (forward and backward)

Branch example
Address   Instruction
351
352
353       SUB X, Y
354       BRZ 373
...       ...
372       BR 353
373       ...
...       ...
395       Rest of the code
396
BRZ – jump if the result is zero
BR – jump unconditionally
The condition code set by the SUB operation determines whether the BRZ jump is taken
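The control flow of this example can be traced with a small sketch that models only addresses 353 to 373. It assumes that SUB X, Y stores the difference back in X and sets a zero flag, and that the instructions between 355 and 371 do nothing relevant; both are simplifications not stated on the slide.

```python
def run(x, y):
    """Trace the loop 353..372 until BRZ jumps to 373; returns final x and the visited addresses."""
    pc, zero, trace = 353, False, []
    while pc != 373:
        trace.append(pc)
        if pc == 353:        # SUB X, Y: subtract and set the zero flag (assumed semantics)
            x -= y
            zero = (x == 0)
            pc += 1
        elif pc == 354:      # BRZ 373: jump only if the last result was zero
            pc = 373 if zero else pc + 1
        elif pc == 372:      # BR 353: unconditional backward jump
            pc = 353
        else:                # 355..371: loop body, modeled as no-ops
            pc += 1
    return x, trace

result, trace = run(6, 2)    # sample values; x must be a positive multiple of y to terminate
passes = trace.count(353)
print(result, passes)        # 0 3
```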

Procedures
They are isolated modules in the source code
Using them increases the flexibility of the code
Require two instructions: call and return
The same procedure can be called many times from different locations
Procedures can be nested

Procedure and return location
A procedure can be called from multiple locations in the program
Nesting of calls is possible
Calling a procedure requires storing the return address:
– In a register
– At the beginning of the called procedure
– On the stack (the best option; it allows nested and recursive procedures)

Procedure call

Stack
It is an isolated memory space for storing data, organized as a LIFO structure
Many processors have a register that works as the stack pointer (for example, Motorola 68000)
Main stack operations: PUSH, POP

Example of the stack implementation
[Figure: memory stack with the stack pointer and the end of the stack marked, illustrating PUSH F and POP F]

Working with the stack
Operation: a+b-(c/d)
The same operation in reverse Polish notation: ab+cd/-
Values pushed on the stack in successive steps: a, b, a+b, c, d, c/d, a+b-c/d
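Evaluating the reverse Polish form ab+cd/- with an explicit stack can be sketched as follows. The function `eval_rpn`, the single-character token format, and the sample variable values are assumptions for illustration.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def eval_rpn(tokens, env):
    """Evaluate a reverse Polish expression; operands are looked up in env."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            right = stack.pop()          # the top of the stack is the right operand
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(env[tok])       # PUSH the operand value
    return stack.pop()

env = {"a": 7, "b": 5, "c": 8, "d": 2}   # assumed sample values
value = eval_rpn("ab+cd/-", env)
print(value)  # 8.0
```

No parentheses and no operand addresses are needed: every operator takes its arguments from the top of the stack and pushes its result back.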

Stack frame
A set of procedure parameters, including the return address
Allows nested procedure calls by storing input and output parameters on the stack

Stack frame illustration
[Figure: stack contents while procedure A runs (parameters x1, x2, return point, previous frame pointer) and after A calls B (B's frame with y1, y2, return point and previous frame pointer added on top); FP marks the current frame, SP the top of the stack]

Stack frame in the Pentium processor
Used by the ENTER and CALL commands
The ENTER command supports compilers in implementing nested procedures
The LEAVE command restores the previous stack state
The frame pointer is stored in the EBP register, the stack pointer in the ESP register
Example of the procedure entry sequence (the equivalent of ENTER):
PUSH EBP
MOV EBP, ESP
SUB ESP, space_in_memory

MMX instructions
Introduced in 1996 in the Pentium processors
The first version contained 57 SIMD instructions
Used to execute operations on integers
Purpose – multimedia applications (computer games, graphics and sound processing)
MMX uses four new data types: packed byte, packed word, packed double word, packed quadruple word

MMX instruction examples
Arithmetic: PADD, PMUL, PMADD
Logical: PAND, PANDN, POR, PXOR
Comparison: PCMPEQ, PCMPGT
Conversion: PUNPCKH, PUNPCKL
All instructions have suffixes determining which type of data is used in the operation: B, W, D, Q

Additional MMX registers
Eight 64-bit registers, MM0 to MM7
For backward compatibility, the MMX registers are visible to older software as the floating point registers
[Figure: layout of a 64-bit register, bits 63 to 0, divided into eight bytes or four words]

Example MMX operation

MMX arithmetic
Saturation instead of overflow
With overflow:
  1111 0000 0000 0000
+ 0011 0000 0000 0000
= 1 0010 0000 0000 0000 (the result does not fit in 16 bits)
With saturation:
  1111 0000 0000 0000
+ 0011 0000 0000 0000
= 1111 1111 1111 1111 (the result is clamped to the largest 16-bit value)
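The difference between wraparound and saturating addition on 16-bit unsigned values can be sketched in a few lines. The function names are hypothetical and the code models a single lane only, not a whole packed MMX register.

```python
def add_wraparound(a, b, bits=16):
    """Ordinary addition: the carry out of the top bit is discarded."""
    return (a + b) & ((1 << bits) - 1)

def add_saturating_unsigned(a, b, bits=16):
    """Saturating addition: results above the maximum are clamped to it."""
    return min(a + b, (1 << bits) - 1)

a = 0b1111_0000_0000_0000
b = 0b0011_0000_0000_0000
print(f"{add_wraparound(a, b):016b}")           # 0010000000000000
print(f"{add_saturating_unsigned(a, b):016b}")  # 1111111111111111
```

For pixel or audio samples the clamped result is usually the visually or audibly sensible one, since a wrapped value would turn a very bright pixel into a dark one.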

Why should we use MMX?
Operation                                       Speedup*
Echo effect                                     5.9
Matrix transposition                            2
Arithmetic and logical operations on vectors    6
Fractal drawing (2D)                            1.5
Bilinear texture mapping (3D)                   7
Median filter                                   3.8
Haar transform 2x2                              2.2
Calculating L1 norm                             3.3
3D transformation                               3.1
* compared to C code using the traditional architecture

SSE instructions
Introduced in 1999 (Pentium III)
70 new instructions for floating point operations
Eight additional 128-bit registers, addressed directly: XMM0 – XMM7 (plus the control register MXCSR)
Every register stores four 32-bit floating point numbers

SSE (cont.)
New data type: a 4-element vector of single-precision floating point numbers
Operations can be packed (PS – on all elements of the vector) or scalar (SS – only on the first element)
Example:
xmm0 = [X1 X2 X3 X4]
xmm1 = [Y1 Y2 Y3 Y4]
ADDPS(xmm0, xmm1) = [X1+Y1 X2+Y2 X3+Y3 X4+Y4]
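The packed and scalar variants can be modeled on plain Python lists. This is a sketch of the semantics only: the functions `addps` and `addss` are hypothetical stand-ins for the instructions, and Python floats are double precision rather than the 32-bit values held in XMM registers.

```python
def addps(x, y):
    """Packed add: every element of x is added to the matching element of y."""
    return [a + b for a, b in zip(x, y)]

def addss(x, y):
    """Scalar add: only the first elements are added; the rest of x is kept."""
    return [x[0] + y[0]] + list(x[1:])

xmm0 = [1.0, 2.0, 3.0, 4.0]      # assumed sample contents
xmm1 = [10.0, 20.0, 30.0, 40.0]

packed = addps(xmm0, xmm1)
scalar = addss(xmm0, xmm1)
print(packed)  # [11.0, 22.0, 33.0, 44.0]
print(scalar)  # [11.0, 2.0, 3.0, 4.0]
```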

3DNow! instructions
Introduced in 1998 by AMD
A set of 21 new SIMD instructions for floating point calculations
Used in multimedia applications (high resolution graphics, computer games, CAD/CAM)
Extensions exist: Enhanced 3DNow!, 3DNow! Professional

SSE2 instructions
Introduced in 2001 with the Intel Pentium 4; also implemented in Athlon 64, Sempron 754, Transmeta Efficeon
A set of 144 additional instructions, supported by 16 128-bit registers (XMM0 – XMM15)
Operates on 64-bit floating point numbers (x87 coprocessors work with 80-bit numbers) and on 128-bit integer data

Next sets of instructions
SSE3 (Prescott New Instructions) – 13 new instructions, including complex number arithmetic (since 2004; Pentium 4 Prescott, Athlon 64 E)
SSSE3 (Supplemental Streaming SIMD Extensions 3) – 16 new instructions operating on integers (since 2005; Xeon, Intel Core 2, AMD Phenom)
SSE4 – 54 new instructions in two groups (47 and 7), including integer instructions that modify the EFLAGS register (new!); implemented in Intel Core 2, Celeron Conroe, Penryn

Next sets of instructions (cont.)
SSE5 – planned by AMD for 2009 as a competitor to Intel's SSE4; eventually replaced by three groups: XOP, FMA4, CVT16 (AVX compatible), implemented in Bulldozer processors in 2011. Instructions can have as many as 4 arguments!
AVX (Advanced Vector Extensions) – implemented by Intel in 2011: 16 new 256-bit registers (YMM0 – YMM15) plus 19 instructions working exclusively on these registers

Assembler
Low-level programming language
Uses both instructions and symbolic references to data
Every processor has its own assembler

Example of the assembly program
Program: L = I + J + K

Machine language:
101   0010 0010 0000 0001
102   0001 0010 0000 0010
103   0001 0010 0000 0011
104   0011 0010 0000 0100
201   0000 0000 0000 0010
202   0000 0000 0000 0011
203   0000 0000 0000 0100
204   0000 0000 0000 0000

Symbolic:
101   LDA 201
102   ADD 202
103   ADD 203
104   STA 204
201   DAT 2
202   DAT 3
203   DAT 4
204   DAT 0

Assembler:
FORMUL   LDA I
         ADD J
         ADD K
         STA L
I        DATA 2
J        DATA 3
K        DATA 4
L        DATA 0
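The binary listing can be decoded and executed with a short sketch. It assumes the 16-bit format from the earlier slide (4-bit operation code followed by a 12-bit address), takes the opcode values from the listing itself (0010 = LDA, 0001 = ADD, 0011 = STA), and treats the addresses 101 to 204 as hexadecimal, since the address field 0010 0000 0001 reads as 201 in hexadecimal. The `decode` helper is hypothetical.

```python
OPCODES = {0b0010: "LDA", 0b0001: "ADD", 0b0011: "STA"}

def decode(word):
    """Split a 16-bit word into (mnemonic, 12-bit address)."""
    return OPCODES[word >> 12], word & 0x0FFF

memory = {
    0x101: 0b0010_0010_0000_0001,  # LDA 201
    0x102: 0b0001_0010_0000_0010,  # ADD 202
    0x103: 0b0001_0010_0000_0011,  # ADD 203
    0x104: 0b0011_0010_0000_0100,  # STA 204
    0x201: 2, 0x202: 3, 0x203: 4, 0x204: 0,  # I, J, K, L
}

acc = 0
for pc in range(0x101, 0x105):      # fetch the four instructions in order
    op, addr = decode(memory[pc])
    if op == "LDA":
        acc = memory[addr]          # load accumulator
    elif op == "ADD":
        acc += memory[addr]         # add memory operand
    elif op == "STA":
        memory[addr] = acc          # store accumulator

print(memory[0x204])  # 9
```

The assembler version expresses exactly the same program; the symbols I, J, K and L simply stand for the addresses 201 to 204.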
