Chapter 4 Data-Level Parallelism in Vector, SIMD, and GPU Architectures Topic 13 SIMD Multimedia Extensions Prof. Zhang Gang gzhang@tju.edu.cn School.


1 Chapter 4 Data-Level Parallelism in Vector, SIMD, and GPU Architectures
Topic 13 SIMD Multimedia Extensions
Prof. Zhang Gang
School of Computer Sci. & Tech., Tianjin University, Tianjin, P. R. China

2 Characteristics of media applications
Many media applications operate on narrower data types than 32-bit processors were optimized for.
Many graphics systems use 8 bits to represent each of the three primary colors, plus 8 bits for transparency.
Depending on the application, audio samples are usually represented with 8 or 16 bits.
SIMD Multimedia Extensions started with these simple observations.

3 Figure 4.8 summarizes typical multimedia SIMD instructions
Partitioned adders
By partitioning the carry chains within a wide adder, a processor can operate on several narrow operands at once.
A processor using a 256-bit adder could perform simultaneous operations on short vectors of thirty-two 8-bit operands, sixteen 16-bit operands, eight 32-bit operands, or four 64-bit operands.
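The effect of partitioning the carry chain can be sketched in software. The example below is a minimal model, not a description of any real hardware: it treats a Python integer as a 256-bit register and adds two registers lane by lane, so that a carry out of one lane never reaches its neighbor. The names REG_BITS, pack, unpack, and partitioned_add are assumptions introduced for illustration.

```python
REG_BITS = 256  # adder width used on the slide


def pack(lanes, width):
    """Pack unsigned lane values into one REG_BITS-wide integer (lane 0 in the low bits)."""
    assert len(lanes) * width == REG_BITS
    word = 0
    for i, value in enumerate(lanes):
        word |= (value & ((1 << width) - 1)) << (i * width)
    return word


def unpack(word, width):
    """Split a REG_BITS-wide integer back into a list of lane values."""
    mask = (1 << width) - 1
    return [(word >> (i * width)) & mask for i in range(REG_BITS // width)]


def partitioned_add(a, b, width):
    """Add two packed registers lane by lane; each lane wraps modulo 2**width."""
    n = REG_BITS // width
    # 'high' has a 1 in the top bit of every lane, 'low' has all the other bits.
    high = sum(1 << (i * width + width - 1) for i in range(n))
    low = ((1 << REG_BITS) - 1) ^ high
    # Adding only the low bits cannot carry past the top bit of a lane,
    # which is the software equivalent of cutting the carry chain there.
    partial = (a & low) + (b & low)
    # The top bit of each lane is restored with XOR, which produces no carry.
    return partial ^ ((a ^ b) & high)
```

With thirty-two 8-bit lanes, adding 1 to a lane holding 255 wraps that lane to 0 and leaves the next lane untouched, whereas an ordinary 256-bit addition would let the carry spill into it. The same function works for 16-, 32-, and 64-bit lanes by changing width.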

4 What are the differences between vector and SIMD instructions?
Like vector instructions, a SIMD instruction specifies the same operation on vectors of data.
Unlike vector machines with large register files, SIMD instructions tend to specify fewer operands and hence use much smaller register files.
For comparison, VMIPS has 8 vector registers, each of which can hold as many as sixty-four 64-bit elements.
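The difference in register state can be made concrete with a little arithmetic. The sketch below uses the VMIPS figures from the slide. The SIMD figure of sixteen 256-bit registers is an assumption that does not come from the slide; it corresponds to the AVX registers available in 64-bit mode. The function name register_file_bytes is hypothetical.

```python
def register_file_bytes(num_regs, elems_per_reg, bits_per_elem):
    """Total register file capacity in bytes."""
    return num_regs * elems_per_reg * bits_per_elem // 8


# VMIPS, as on the slide: 8 vector registers of sixty-four 64-bit elements.
vmips_bytes = register_file_bytes(8, 64, 64)

# Assumption, not from the slide: sixteen 256-bit registers,
# each holding four 64-bit elements.
simd_bytes = register_file_bytes(16, 4, 64)

print(vmips_bytes, simd_bytes, vmips_bytes // simd_bytes)
```

Under these assumptions the vector register file holds 4096 bytes and the SIMD register file 512 bytes, a factor of eight.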

5 SIMD extensions have three major omissions
First, SIMD extensions fix the number of data operands in the opcode. This has led to the addition of hundreds of instructions in the MMX, SSE, and AVX extensions of the x86 architecture.
Second, they do not offer the more sophisticated addressing modes of vector architectures: there are no strided accesses and no gather-scatter accesses.
Third, they do not offer mask registers, so they do not support conditional execution of elements.
These omissions make it harder for the compiler to generate SIMD code and increase the difficulty of programming in SIMD assembly language.
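The access patterns named above can be written out as plain scalar loops to show what each one means. This is only an illustration of the patterns, not SIMD code, and every function name in it is hypothetical. Only the first loop touches consecutive elements unconditionally, which is the case the SIMD extensions described here handle directly.

```python
def unit_stride_add(a, b):
    """Consecutive elements, no condition: maps directly onto SIMD lanes."""
    return [x + y for x, y in zip(a, b)]


def strided_load(mem, base, stride, n):
    """Strided access: every 'stride'-th element starting at 'base'."""
    return [mem[base + i * stride] for i in range(n)]


def gather(mem, index):
    """Gather: load the elements selected by an index vector."""
    return [mem[i] for i in index]


def scatter(mem, index, values):
    """Scatter: store values to the positions selected by an index vector."""
    for i, v in zip(index, values):
        mem[i] = v


def conditional_sub(x, y):
    """Conditional execution: x[i] - y[i] only where x[i] != 0.

    A vector architecture would use a mask register to skip the other elements.
    """
    return [xi - yi if xi != 0 else xi for xi, yi in zip(x, y)]
```

For instance, strided_load(list(range(12)), 1, 3, 4) reads positions 1, 4, 7, and 10, which is the pattern produced by walking down one column of a row-major matrix with three columns.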

6 Explanation of abbreviations
MMX: Multimedia Extensions (1996)
    Eight 8-bit integer ops or four 16-bit integer ops
SSE: Streaming SIMD Extensions (1999)
    Eight 16-bit integer ops
    Four 32-bit integer/fp ops or two 64-bit integer/fp ops
    Followed by SSE2 (2001), SSE3 (2004), and SSE4 (2007)
AVX: Advanced Vector Extensions (2010)
    Four 64-bit integer/fp ops
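The operation counts listed above follow from dividing the register width by the element width. The sketch below assumes register widths of 64 bits for MMX, 128 bits for SSE, and 256 bits for AVX. The slide does not state these widths, but they are consistent with its counts. The names REGISTER_BITS and lanes are introduced for illustration.

```python
# Assumed register widths in bits (consistent with the counts on the slide).
REGISTER_BITS = {"MMX": 64, "SSE": 128, "AVX": 256}


def lanes(extension, elem_bits):
    """Number of elements of 'elem_bits' bits processed by one instruction."""
    reg_bits = REGISTER_BITS[extension]
    if reg_bits % elem_bits != 0:
        raise ValueError("element width must divide the register width")
    return reg_bits // elem_bits


for ext, width in [("MMX", 8), ("MMX", 16), ("SSE", 16),
                   ("SSE", 32), ("SSE", 64), ("AVX", 64)]:
    print(f"{ext}: {lanes(ext, width)} x {width}-bit")
```

Because the lane count is tied to the opcode rather than held in a vector-length register, each new register width brings a new set of instructions.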

7 Why are Multimedia SIMD Extensions so popular?
Despite these weaknesses, Multimedia SIMD Extensions are popular for several reasons:
They cost little to add to the standard arithmetic unit and are easy to implement.
They require little extra state compared to vector architectures.
They do not need a lot of memory bandwidth.
They do not have to deal with the virtual memory problems that arise when a page fault occurs in the middle of a vector.

8 Exercises
Why can a processor using a 256-bit adder perform simultaneous operations on short vectors of eight 32-bit operands?
What is the meaning of MMX?
What is the meaning of SSE?
What is the meaning of AVX?
What are the major omissions of Multimedia SIMD extensions?
Why are Multimedia SIMD Extensions so popular?

