MMX Multi Media eXtensions


1 MMX Multi Media eXtensions
Introduced with the Pentium processor with MMX technology and the Pentium II processor family
11/9/2018 TUC-N dr. Emil CEBUC

2 Outline
- Overview
- MMX programming environment
- Data types
- SIMD execution model
- New arithmetic
- MMX Instructions
- Cooperation with FPU
- Further Enhancements

3 Overview
- Eight new 64-bit data registers, called MMX registers
- Three new packed data types:
  - 64-bit packed byte integers (signed and unsigned)
  - 64-bit packed word integers (signed and unsigned)
  - 64-bit packed doubleword integers (signed and unsigned)
- Instructions that support the new data types and handle MMX state management
- Extensions to the CPUID instruction
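The CPUID extension listed above reports MMX support as a feature flag: bit 23 of EDX after executing CPUID with EAX = 1. A minimal sketch of decoding that flag, assuming the caller has already executed CPUID and passes in the EDX value; the helper name `has_mmx` is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 23 of EDX, as returned by CPUID with EAX = 1, is the MMX feature flag. */
#define CPUID_EDX_MMX_BIT 23u

/* Hypothetical helper: decodes the MMX flag from an EDX value that the
   caller obtained by executing CPUID with EAX = 1. */
static bool has_mmx(uint32_t cpuid1_edx)
{
    return ((cpuid1_edx >> CPUID_EDX_MMX_BIT) & 1u) != 0;
}
```

Keeping the decoding separate from the CPUID call itself leaves the sketch portable; how EDX is obtained (inline assembly or a compiler builtin) is left to the toolchain.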

4 MMX programming environment

5 MMX Registers

6 Data Types
- 64-bit packed byte integers: eight packed bytes
- 64-bit packed word integers: four packed words
- 64-bit packed doubleword integers: two packed doublewords
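One way to picture the three packed types is as different views of the same 64 bits. The union below is only an illustrative model; the type name `mmx_operand` and its field names are hypothetical, and which array index holds the least significant element depends on the byte order of the machine.

```c
#include <stdint.h>

/* Hypothetical model of one 64-bit MMX operand, viewed as each packed type. */
typedef union {
    uint64_t quadword;        /* the whole 64-bit value */
    uint8_t  bytes[8];        /* eight packed bytes */
    uint16_t words[4];        /* four packed words */
    uint32_t doublewords[2];  /* two packed doublewords */
} mmx_operand;
```

All four members occupy the same 8 bytes, so writing through one view and reading through another shows how the elements overlap.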

7 SIMD Execution Model
MMX instructions move 64-bit packed data types (packed bytes, packed words, or packed doublewords) and the quadword data type between MMX registers and memory, or between MMX registers, in 64-bit blocks.
However, when performing arithmetic or logical operations on the packed data types, MMX instructions operate in parallel on the individual bytes, words, or doublewords contained in MMX registers.
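The difference between moving 64 bits as a block and operating on the elements independently can be modeled in portable C. This is a sketch of the semantics of a packed word add with wraparound, not real MMX code; the function name `packed_add_words` is hypothetical.

```c
#include <stdint.h>

/* Models a packed-word add: each 64-bit operand is treated as four
   independent 16-bit lanes, and a carry out of one lane never reaches
   the next lane. Hypothetical helper, not an intrinsic. */
static uint64_t packed_add_words(uint64_t a, uint64_t b)
{
    uint64_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint16_t x = (uint16_t)(a >> (16 * lane));
        uint16_t y = (uint16_t)(b >> (16 * lane));
        uint16_t sum = (uint16_t)(x + y);   /* wraps within the lane */
        result |= (uint64_t)sum << (16 * lane);
    }
    return result;
}
```

An ordinary 64-bit addition of 0xFFFF and 1 gives 0x10000; the packed model gives 0 because the carry out of the lowest word is discarded instead of reaching the next word. The hardware does all four lanes in one instruction, whereas the loop here only reproduces the result.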

8 SIMD Execution Model

9 New arithmetic: Wraparound
With wraparound arithmetic, a true out-of-range result is truncated: the carry or overflow bit is ignored and only the least significant bits of the result are returned to the destination.
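A minimal sketch of wraparound on a single byte element, assuming the element is treated as unsigned; the function name `add_wrap_u8` is hypothetical.

```c
#include <stdint.h>

/* Wraparound add on one byte element: the sum is computed at full width
   and then truncated, so only the low 8 bits survive. */
static uint8_t add_wrap_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}
```

For example, 200 + 100 = 300 does not fit in a byte; the low 8 bits are 300 - 256 = 44, which is what the function returns.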

10 New arithmetic: Signed saturation
With signed saturation arithmetic, out-of-range results are limited to the representable range of signed integers for the integer size being operated on.
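A minimal sketch of signed saturation on one word element; the function name `add_sat_s16` is hypothetical. The sum is formed in a wider type so the true result is known before it is clamped.

```c
#include <stdint.h>

/* Signed saturating add on one 16-bit element: out-of-range sums are
   clamped to the limits of the signed 16-bit range. */
static int16_t add_sat_s16(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + (int32_t)b;   /* exact in 32 bits */
    if (sum > INT16_MAX) return INT16_MAX;   /* positive overflow: 7FFFH */
    if (sum < INT16_MIN) return INT16_MIN;   /* negative overflow: 8000H */
    return (int16_t)sum;
}
```

For example, 32000 + 1000 returns 32767 and -32000 + -1000 returns -32768, while in-range sums are returned unchanged.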

11 New arithmetic: Unsigned saturation
With unsigned saturation arithmetic, out-of-range results are limited to the representable range of unsigned integers for the integer size. So, positive overflow when operating on unsigned byte integers results in FFH being returned and negative overflow results in 00H being returned.
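A minimal sketch of unsigned saturation on one byte element, covering both directions of overflow; the function names `add_sat_u8` and `sub_sat_u8` are hypothetical.

```c
#include <stdint.h>

/* Unsigned saturating add on one byte: positive overflow returns FFH. */
static uint8_t add_sat_u8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + (unsigned)b;
    return (uint8_t)(sum > 0xFFu ? 0xFFu : sum);
}

/* Unsigned saturating subtract on one byte: negative overflow returns 00H. */
static uint8_t sub_sat_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a > b ? a - b : 0);
}
```

With these, 200 + 100 yields 0xFF rather than the wrapped value 44, and 10 - 20 yields 0 rather than 246.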

12 New arithmetic: Saturation ranges
Saturation arithmetic provides an answer for many overflow situations. For example, in color calculations, saturation causes a color to remain pure black or pure white without allowing inversion.

13 MMX Instructions
The MMX instruction set consists of 47 instructions, grouped into the following categories:
- Data transfer
- Arithmetic
- Comparison
- Conversion
- Unpacking
- Logical
- Shift
- Empty MMX state instruction (EMMS)

14 MMX Instruction set summary

15 MMX Instruction set summary

16 MMX Instruction set summary

17 PMADDWD
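PMADDWD multiplies the four signed words of the destination by the corresponding signed words of the source, then adds each adjacent pair of 32-bit products to produce two doublewords. The following is a portable sketch of that behavior, not the instruction itself; the function name `pmaddwd_model` and the array-based operands are assumptions made for illustration.

```c
#include <stdint.h>

/* Models PMADDWD on 64-bit operands: four signed 16-bit multiplies
   followed by two pairwise adds, giving two 32-bit results. */
static void pmaddwd_model(const int16_t a[4], const int16_t b[4], int32_t out[2])
{
    for (int i = 0; i < 2; i++) {
        int64_t low_product  = (int64_t)a[2 * i]     * b[2 * i];
        int64_t high_product = (int64_t)a[2 * i + 1] * b[2 * i + 1];
        /* The sum fits in 32 bits except when all four inputs of a pair
           are -32768; that one case wraps to 80000000H, which this cast
           reproduces on the usual two's complement compilers. */
        out[i] = (int32_t)(uint32_t)(low_product + high_product);
    }
}
```

With a = {1, 2, 3, 4} and b = {5, 6, 7, 8}, the results are 1*5 + 2*6 = 17 and 3*7 + 4*8 = 53, which is why the instruction is a natural building block for dot products.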

18 Cooperation with FPU
Applications can contain both x87 FPU floating-point and MMX instructions. However, because the MMX registers are aliased to the x87 FPU register stack, care must be taken when making transitions between x87 FPU instructions and MMX instructions.
When an MMX instruction (other than the EMMS instruction) is executed, the processor changes the x87 FPU state as follows:
- The TOS (top of stack) value of the x87 FPU status word is set to 0
- The entire x87 FPU tag word is set to the valid state (00B in all tag fields)
- When an MMX instruction writes to an MMX register, it writes ones (11B) to the exponent part of the corresponding floating-point register (bits 64 through 79)
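The state changes listed above can be captured in a small software model. This is only an illustrative sketch: the struct `x87_state_model` and the functions `mmx_write` and `emms` are hypothetical names, and the model tracks just the fields the slide mentions. The EMMS behavior (marking every tag as empty, 11B) is taken from the architecture definition rather than from the slide text.

```c
#include <stdint.h>

/* Hypothetical model of the x87 state that MMX instructions touch. */
typedef struct {
    uint8_t  top_of_stack;      /* TOS field of the status word, 0..7 */
    uint16_t tag_word;          /* two bits per register: 00B valid, 11B empty */
    uint64_t significand[8];    /* bits 0..63 of each register, aliased by MM0..MM7 */
    uint16_t sign_exponent[8];  /* bits 64..79 of each register */
} x87_state_model;

/* Effect of an MMX instruction that writes `value` to register `reg`. */
static void mmx_write(x87_state_model *s, int reg, uint64_t value)
{
    s->significand[reg] = value;
    s->sign_exponent[reg] = 0xFFFF;  /* bits 64 through 79 are set to ones */
    s->top_of_stack = 0;             /* TOS is set to 0 */
    s->tag_word = 0x0000;            /* every tag field becomes 00B (valid) */
}

/* Effect of EMMS: every tag field becomes 11B (empty), so following
   x87 code starts from an empty register stack. */
static void emms(x87_state_model *s)
{
    s->tag_word = 0xFFFF;
}
```

The model shows why EMMS is needed before returning to x87 code: after any MMX instruction all eight registers are tagged valid, so an x87 load would otherwise see a full stack.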

19 Further Enhancements
Streaming SIMD Extensions (SSE) were introduced into the IA-32 architecture in the Pentium III processor family. They added:
- Eight 128-bit data registers (called XMM registers) in non-64-bit modes; sixteen XMM registers are available in 64-bit mode
- The 32-bit MXCSR register, which provides control and status bits for operations performed on XMM registers

20 SSE
- The 128-bit packed single-precision floating-point data type (four IEEE single-precision floating-point values packed into a double quadword)
- Instructions that perform SIMD operations on single-precision floating-point values and that extend the SIMD operations that can be performed on integers:
  - 128-bit packed and scalar single-precision floating-point instructions that operate on data located in XMM registers
  - 64-bit SIMD integer instructions that support additional operations on packed integer operands located in MMX registers
  - Instructions that save and restore the state of the MXCSR register

21 SSE2
Introduced with the Pentium 4 and Intel Xeon processors. Adds support for packed double-precision floating-point values and for 128-bit packed integers.
Five data types:
- 128-bit packed double-precision floating-point (two IEEE Standard 754 double-precision floating-point values packed into a double quadword)
- 128-bit packed byte integers
- 128-bit packed word integers
- 128-bit packed doubleword integers
- 128-bit packed quadword integers

22 SSE2
- Flexibility is provided by instructions that operate on single (scalar) double-precision floating-point values located in the low quadword of an XMM register
- Greater throughput when performing SIMD operations on packed integers; this capability is particularly useful for applications such as RSA authentication and RC5 encryption
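The difference between a scalar and a packed double-precision operation can be sketched in portable C. The type `xmm_pd` and both function names are hypothetical; the scalar version models the behavior of an instruction such as ADDSD, which changes only the low quadword and leaves the high quadword of the destination untouched.

```c
/* Hypothetical model of an XMM register holding two doubles;
   lane[0] is the low quadword. */
typedef struct { double lane[2]; } xmm_pd;

/* Scalar add: only the low quadword is computed, the high one is kept. */
static xmm_pd add_scalar_double(xmm_pd dst, xmm_pd src)
{
    dst.lane[0] += src.lane[0];
    return dst;
}

/* Packed add: both quadwords are computed independently. */
static xmm_pd add_packed_double(xmm_pd dst, xmm_pd src)
{
    dst.lane[0] += src.lane[0];
    dst.lane[1] += src.lane[1];
    return dst;
}
```

Adding {2.0, 20.0} to {1.5, 10.0} gives {3.5, 10.0} with the scalar form and {3.5, 30.0} with the packed form.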

23 SSE2 Data types

24 SSE2 Instructions
- Packed and scalar double-precision floating-point instructions
- 64-bit and 128-bit SIMD integer instructions
- 128-bit extensions of SIMD integer instructions introduced with the MMX technology and the SSE extensions
- Cacheability-control and instruction-ordering instructions

25 SSE Scalar Instructions

26 SSE3 and SSSE3
- The Pentium 4 processor supporting Hyper-Threading Technology introduced Streaming SIMD Extensions 3 (SSE3)
- The Intel Xeon processor 5100 series and the Intel Core 2 processor families introduced Supplemental Streaming SIMD Extensions 3 (SSSE3)

27 Asymmetric Processing

28 Horizontal Processing

29 SSE3 Instructions
x87 FPU instruction
- One instruction that improves x87 FPU floating-point to integer conversion
SIMD integer instruction
- One instruction that provides a specialized 128-bit unaligned data load
SIMD floating-point instructions
- Three instructions that enhance LOAD/MOVE/DUPLICATE performance
- Two instructions that provide packed addition/subtraction
- Four instructions that provide horizontal addition/subtraction
Thread synchronization instructions
- Two instructions that improve synchronization between multi-threaded agents
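The packed addition/subtraction and horizontal addition groups can be illustrated with a portable model of their single-precision forms (ADDSUBPS and HADDPS). The type `xmm_ps` and the function names are hypothetical, and the code reproduces only the arithmetic pattern, not the instructions themselves.

```c
/* Hypothetical model of an XMM register holding four floats;
   lane[0] is the lowest element. */
typedef struct { float lane[4]; } xmm_ps;

/* Horizontal add: adjacent elements within each operand are summed.
   The low half of the result comes from a, the high half from b. */
static xmm_ps hadd_ps_model(xmm_ps a, xmm_ps b)
{
    xmm_ps r;
    r.lane[0] = a.lane[0] + a.lane[1];
    r.lane[1] = a.lane[2] + a.lane[3];
    r.lane[2] = b.lane[0] + b.lane[1];
    r.lane[3] = b.lane[2] + b.lane[3];
    return r;
}

/* Asymmetric add/subtract: even-numbered lanes subtract,
   odd-numbered lanes add. */
static xmm_ps addsub_ps_model(xmm_ps a, xmm_ps b)
{
    xmm_ps r;
    r.lane[0] = a.lane[0] - b.lane[0];
    r.lane[1] = a.lane[1] + b.lane[1];
    r.lane[2] = a.lane[2] - b.lane[2];
    r.lane[3] = a.lane[3] + b.lane[3];
    return r;
}
```

With a = {1, 2, 3, 4} and b = {10, 20, 30, 40}, the horizontal add gives {3, 7, 30, 70} and the add/subtract gives {-9, 22, -27, 44}. The add/subtract pattern matches the real and imaginary parts of a complex multiplication, which is a common use of this instruction group.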

30 SSSE3 Instructions
- Twelve instructions that perform horizontal addition or subtraction operations
- Six instructions that evaluate absolute values
- Two instructions that perform multiply and add operations and speed up the evaluation of dot products
- Two instructions that accelerate packed-integer multiply operations and produce integer values with scaling
- Two instructions that perform a byte-wise, in-place shuffle according to the second (shuffle control) operand
- Six instructions that negate packed integers in the destination operand if the corresponding element in the source operand is less than zero
- Two instructions that align data from the composite of two operands
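The byte-wise shuffle can be modeled in portable C. The sketch below follows the 128-bit form of PSHUFB: each control byte either selects one of the 16 original destination bytes by its low four bits or, when its top bit is set, writes zero. The function name `pshufb_model` and the array-based operands are assumptions made for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Models a 128-bit byte shuffle performed in place on dst. */
static void pshufb_model(uint8_t dst[16], const uint8_t control[16])
{
    uint8_t original[16];
    memcpy(original, dst, sizeof original);  /* select from the old contents */
    for (int i = 0; i < 16; i++) {
        if (control[i] & 0x80) {
            dst[i] = 0;                            /* top bit set: zero the byte */
        } else {
            dst[i] = original[control[i] & 0x0F];  /* low 4 bits: source index */
        }
    }
}
```

A control operand of 15, 14, ..., 0 reverses the byte order of the destination, which shows how one instruction can replace a sequence of unpack and shift operations.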

