Christopher Mitchell CDA 6938, Spring 2009

The Discrete Cosine Transform
- In the same family as the Fourier Transform.
  - Converts data to the frequency domain.
- Represents data as a summation of cosine waves of varying frequency.
- Because it is a discrete transform, it suits problems formatted for computer analysis.
- Captures only the real (even) components of the function.
  - Discrete Sine Transform (DST) captures odd (imaginary) components → not as useful.
  - Discrete Fourier Transform (DFT) captures both odd and even components → computationally intense.

Significance / Where is this used?
- Image Processing
  - Compression, e.g. JPEG
  - Scientific Analysis, e.g. Radio Telescope Data
- Audio Processing
  - Compression, e.g. MPEG Layer 3, a.k.a. MP3
- Scientific Computing / High Performance Computing (HPC)
  - Partial Differential Equation Solvers

Significance, Cont.
- Image Processing Example
  - Exhibits energy compaction.
  - Drop small-amplitude coefficients.
[Figure: Original Image; DCT Transformed Image]

Implementation Platform
- NVIDIA CUDA Version 2.0

Implementation Platform, Cont.
- What happened to the Cell/BE?
  - Too many technical challenges for the deadline.
- The algorithm is embarrassingly parallel.
  - Conducive to launching hundreds of threads → GPU.
- The algorithm requires too much data per pass relative to the Cell/BE local store size.
  - Would have to be creative with DMA, with no guarantee of bottleneck mitigation.

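Algorithm Walk Through
- Mathematical Basis
  - 1D Version: [equation not preserved in the transcript]
  - Where: [definition of α(u) not preserved in the transcript]
  - 2D Version: [equation not preserved in the transcript]
  - Where α(u) and α(v) are defined as shown in the 1D case.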
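The equations on this slide are not preserved in the transcript. For orientation, the standard orthonormal DCT-II, as given in the referenced report "The Discrete Cosine Transform (DCT): Theory and Application", is reproduced here. That the slides used exactly this form is an assumption; the symbols f, C, N, x, y, u and v follow the report, not the slides.

```latex
\[
C(u) = \alpha(u) \sum_{x=0}^{N-1} f(x)
       \cos\!\left[\frac{\pi (2x+1) u}{2N}\right],
\qquad u = 0, 1, \dots, N-1
\]
\[
\alpha(u) =
\begin{cases}
  \sqrt{1/N}, & u = 0 \\
  \sqrt{2/N}, & u \neq 0
\end{cases}
\]
\[
C(u,v) = \alpha(u)\,\alpha(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y)
         \cos\!\left[\frac{\pi (2x+1) u}{2N}\right]
         \cos\!\left[\frac{\pi (2y+1) v}{2N}\right],
\qquad u, v = 0, 1, \dots, N-1
\]
```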

Algorithm Walk Through
- CPU Version – 1D DCT

Algorithm Walk Through
- CPU Version – 2D DCT

Algorithm Walk Through
- Problem
  - 1D DCT is O(n²).
  - 2D DCT is O(n⁴) for an n × n matrix when the double sum is evaluated directly.
  - Additionally, the algorithm calls the cosine and square root functions.
    - Long-latency ALU operations.

Algorithm Walk Through
- CUDA Version – 1D DCT

Algorithm Walk Through
- CUDA Version – 2D DCT

Algorithm Walk Through
- Solution
  - 1D DCT is now O(n) per thread.
  - 2D DCT is now O(n²) per thread.
  - Parallelization is the key to success with this algorithm.

Testing
- Platform
  - Intel Core 2 Duo E6700 @ 2.66 GHz
  - Gigabyte GA-P35-DQ6 motherboard
  - 2 GB RAM
  - 2 NVIDIA GeForce 8600 GTS Superclocked GPUs
    - 720 MHz core clock
    - 256 MB GDDR3 memory
    - 4 multiprocessors → 32 streaming processors
  - Windows XP Professional (32-bit) w/ SP3 and NVIDIA ForceWare 178.24 drivers

Testing - Overview

Vector Test Case    CPU Version    CUDA Version
Vector: 256         3.00 ms        0.016930 ms
Vector: 512         14.67 ms       0.027778 ms
Vector: 1024        64.33 ms       0.015876 ms
Vector: 2048        246.33 ms      0.015213 ms
Vector: 4096        989.33 ms      0.015721 ms

Matrix Test Case     CPU Version        GPU Version
Matrix: 64 x 64      1,055.67 ms        0.009612 ms
Matrix: 128 x 128    16,205.33 ms       0.010277 ms
Matrix: 256 x 256    254,448.33 ms      0.009850 ms
Matrix: 512 x 512    4,007,952.00 ms    0.014130 ms

Testing – 1D DCT

Testing – 2D DCT

Future Work
- Multiple-GPU version
  - Have a dual-card setup to test this with.
  - Need to find an efficient way to split the problem between the two cards without incurring a large I/O penalty.
- Still interested in trying a Cell/BE version of the algorithm.
  - Need to improve at CBEA programming.
  - DMA and local store size are the limiting factors for this particular problem.

References
- NVIDIA CUDA Programming Guide, Version 2.1
  - http://developer.download.nvidia.com/compute/cuda/2_1/toolkit/docs/NVIDIA_CUDA_Programming_Guide_2.1.pdf
- The Discrete Cosine Transform (DCT): Theory and Application
  - http://www.egr.msu.edu/waves/people/Ali_files/DCT_TR802.pdf
- CDA 6938 Lecture Notes and Slides
