Published by Charlotte Porter. Modified over 9 years ago.
1
Intel® IPP. Fighting for the performance
Novosibirsk, 2008
Boris Sabanin
2
Why Primitives?
“It would be wasteful and ignorant not to provide developers with a common foundation for building their [systems].” A. P. Ershov, "Fourth-Generation Software" ("Математическое обеспечение 4-го поколения")
Intel® Integrated Performance Primitives
To optimize deeply
To make it cross-platform
To make it orthogonal in functionality
To test thoroughly
To develop independently
To give customers the building blocks
3
Being Primitive
ANSI C. Portable
Low overhead. High performance with small data
Low structure. No conversion
Basic common operation. For many ISVs
Atomic. Does one thing. Building blocks, flexible
Self-contained. Minimal or zero OS dependency
Predictable. Expected behavior and results
Well defined. No “result is not defined”
Well documented. And self-documenting
Intuitive. Understand once
No magic. No side effects, explicit behavior
Example: ippsAddC_8u_I
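The name on this slide follows the IPP naming pattern: `ipps` for the signal-processing domain, `AddC` for "add a constant", `8u` for 8-bit unsigned data, and `I` for in-place operation. The sketch below is a hypothetical stand-in written in plain C; it does not call IPP, and the function name, status codes, and the clamping of results to 255 are assumptions made for illustration. It tries to show the properties listed above: one operation, no hidden state, and a defined result for every input.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for illustration only; NOT the Intel IPP code. */
typedef unsigned char u8;

/* Status codes invented for this sketch. */
enum { SKETCH_OK = 0, SKETCH_NULL_PTR = -1, SKETCH_BAD_SIZE = -2 };

/* Add the constant `val` to every element of `src_dst`, in place.
   Sums above 255 are clamped to 255 (an assumption of this sketch),
   so every input has a defined result. */
int sketch_add_c_8u_inplace(u8 val, u8 *src_dst, int len)
{
    if (src_dst == NULL) return SKETCH_NULL_PTR;
    if (len <= 0) return SKETCH_BAD_SIZE;
    for (int i = 0; i < len; ++i) {
        unsigned sum = (unsigned)src_dst[i] + val;
        src_dst[i] = (u8)(sum > 255u ? 255u : sum);
    }
    return SKETCH_OK;
}

/* Usage example: brighten four samples by 100. */
void sketch_demo(void)
{
    u8 samples[4] = { 0, 100, 200, 255 };
    int status = sketch_add_c_8u_inplace(100, samples, 4);
    assert(status == SKETCH_OK);
    assert(samples[0] == 100 && samples[1] == 200);
    assert(samples[2] == 255 && samples[3] == 255); /* clamped */
}
```

The function touches only the buffer it is given and reports problems through its return value, which is one way to read "no magic, no side effects".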
4
High Temperature IPP
[Diagram labels: SW (applications); components; IPP; OS; HW (CPU & chipset)]
5
IPP & Media. What is Inside?
Signal & Image Processing
String Processing
Computer Vision
Speech Recognition primitives
JPEG & JPEG2000 primitives
Speech, Audio and Video Coding
Lossless Data Compression
Small Matrix operations, Vector Math
Cryptography
Realistic Rendering
Data Integrity
Automatically generated DSP transforms
6
IPP, What Else? For Free?
50+ IPP samples provided as source code
Video codecs: MPEG2, MPEG4, H264, VC1
Audio codecs: MP3, AAC, AC3
JPEG and JPEG2000 codecs
Speech codecs: G722, G723, G726, G728
Computer Vision: Face Detection
Ray Tracing demo
Interfaces: Java, C#, VB, F90, C++
Yes. Download free source-code samples: http://www.intel.com/support/performancetools/libraries/ipp/
7
More Optimization Needed
[Chart: performance over time; labels MHz, Arch, MMX, SSE, Core, 3GHz]
Optimization is needed. A lot of work
8
Achieving Performance
Algorithms
SIMD
Threading
HW accelerators
Hybrid Solution
9
Algorithm. Right DFT Decomposition
Manually optimized code vs. automatically generated code. The best of 200 decomposition cases is benchmarked
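The slide does not say how the 200 decomposition cases were enumerated. As a rough illustration of why there are so many candidates to benchmark, the sketch below counts the ordered factorizations of a transform size n: each one corresponds to one way of splitting a size-n DFT into successive smaller stages in the Cooley-Tukey style (for n = 8: 8, 2·4, 4·2, 2·2·2). The function name is hypothetical, and this counting model is an assumption, not the search space Intel used.

```c
#include <assert.h>

/* Count the ordered factorizations of n into factors greater than 1.
   Each factorization stands for one staged decomposition of a size-n DFT.
   Illustrative model only; real planners also vary codelets, strides, etc. */
unsigned long count_decompositions(unsigned n)
{
    assert(n >= 1);
    if (n == 1) return 1;                /* nothing left to split */
    unsigned long total = 0;
    for (unsigned d = 2; d <= n; ++d) {
        if (n % d == 0)                  /* first stage uses radix d */
            total += count_decompositions(n / d);
    }
    return total;
}
```

For a power of two, n = 2^k, this count is 2^(k-1), so even moderate sizes give hundreds of candidates, which is consistent with selecting the winner by measurement rather than by hand.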
10
Threading. Function level and above
Primitive level. The 1D FFT is optimized and threaded. Performance on Core™2 Duo: 22 GFlops
Over primitives. Even the single-threaded version of IPP-based GZIP is faster; see the chart of performance in CPU clocks per byte. The threaded version is much faster thanks to the two threading modes implemented: multi-file and in-file parallelization
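In-file parallelization presumably means splitting one input into ranges that separate threads can process. The sketch below shows only that partitioning step and the clocks-per-byte metric used on the chart; the type and function names are hypothetical, and it is not the IPP GZIP code. A real compressor also has to handle dependencies between neighboring blocks, which this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>

/* Byte range assigned to one worker thread (hypothetical type). */
typedef struct { size_t offset; size_t length; } chunk_t;

/* Split `total` bytes into `nchunks` contiguous ranges whose sizes
   differ by at most one byte, and return range number `index`. */
chunk_t chunk_for_worker(size_t total, size_t nchunks, size_t index)
{
    assert(nchunks > 0 && index < nchunks);
    size_t base  = total / nchunks;
    size_t extra = total % nchunks;   /* first `extra` chunks get one more byte */
    chunk_t c;
    c.offset = index * base + (index < extra ? index : extra);
    c.length = base + (index < extra ? 1 : 0);
    return c;
}

/* Metric from the chart: CPU clocks spent per processed byte (lower is better). */
double clocks_per_byte(unsigned long long clocks, size_t bytes)
{
    assert(bytes > 0);
    return (double)clocks / (double)bytes;
}
```

Multi-file parallelization needs no such splitting: each thread simply takes a whole file, which is why the two modes can be offered side by side.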
11
FFTW Compares FFT Performance
3.60 GHz Intel Xeon Pentium 4 (Prescott), unknown L2 size, 64-bit mode. Linux 2.4.21, Intel C/C++ Compiler 9.0, Intel Fortran Compiler 9.0, Intel Math Kernel Library Version 8.0.1, Intel Integrated Performance Primitives v5.0. Has SSE (4-way single-precision SIMD), SSE2 (2-way double-precision SIMD), SSE3
Source: FFTW web site, http://www.fftw.org/speed
[Chart label: IPP]
12
The Open Source Powered by IPP
Data Compression: GZIP, ZLIB, BZIP2, LZO
Image Coding (JPEG): IJG
Cryptography: OpenSSL
Computer Vision: OpenCV
13
OpenCV Calls IPP and Wins
The Stanford Racing Team won the Grand Challenge. OpenCV & IPP are used in the “Stanley” computer vision software.
DARPA “Urban Challenge”: in November, a 60-mile multi-robot face-off in a simulated city. Powered by Intel Core2 Quad running IPP and OpenCV
14
HW Acceleration. Low Power Computing
In media, lower CPU utilization is desirable (unlike in HPC) because it reduces power consumption and lets other applications run.
Comparison of IPP video decoders running on the CPU and on HW accelerators with PowerVR technology. Menlow with Linux
15
Hybrid Solution. MC+HT+HWA
16
AMD Performance Library
IPP API compatible
Much less functionality
Much lower performance
17
Quality vs. Performance
MSU Graphics Lab reports that the IPP H.264 encoder is in the top 3
18
IPP Economics
16 functional domains
10K functions
350 MB of source code
Windows, Linux, Mac OS X
IA32, Intel®64, IA64, XScale
All development in Russia
3 releases a year + out-of-cycle releases
IPP $199, IPP samples $0
19
IPP Customers
Microsoft
Adobe
Philips Medical
MathWorks
Ulead
Thomson
Yahoo
OKI
Apple
Symantec
Pixar
Envivio
SGI
Oracle
SAP
Google
Russian?