1
Fast Fourier Transform
CS 498LVK Hassan Jafri
2
Overview
An FFT is an efficient algorithm to compute the Discrete Fourier Transform (DFT) and its inverse.
The complexity of direct computation of the DFT is O(n^2).
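To make the O(n^2) cost concrete, here is a minimal Python sketch of the direct DFT and its inverse. The function names `dft` and `idft` and the sign convention (negative exponent for the forward transform, 1/n scaling on the inverse) are assumptions for illustration; the slides do not fix a convention.

```python
import cmath


def dft(x):
    """Direct DFT: X[k] = sum_j x[j] * exp(-2*pi*i*j*k/n).

    Each of the n outputs sums n terms, hence O(n^2) operations.
    Assumed convention: negative exponent for the forward transform.
    """
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def idft(X):
    """Inverse DFT: positive exponent and 1/n scaling (assumed convention)."""
    n = len(X)
    return [
        sum(X[k] * cmath.exp(2j * cmath.pi * j * k / n) for k in range(n)) / n
        for j in range(n)
    ]
```

Applying `idft(dft(x))` should return the original samples up to floating-point rounding.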
3
FFT Algorithms
FFT algorithms reduce the complexity to O(n log n).
Examples: Radix-2, Radix-4, Radix-8, etc.
However, these algorithms are not cache friendly.
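As an illustration of how a radix-2 algorithm reaches O(n log n), the following Python sketch splits the input into even- and odd-indexed halves and combines the two half-length transforms. This is the textbook recursive decimation-in-time form, not code from the slides; the name `fft_radix2` and the forward sign convention are assumptions. The stride-2 slicing at every level also hints at why such algorithms access memory in cache-unfriendly patterns for large n.

```python
import cmath


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT (illustrative sketch).

    Requires len(x) to be a power of two. Each level does O(n) work and
    there are log2(n) levels, giving O(n log n) overall.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])  # transform of even-indexed samples
    odd = fft_radix2(x[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle factor
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```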
4
The Matrix Algorithm
The Matrix Fourier Algorithm (4-step algorithm) has better cache locality.
It works for composite data lengths.
For input size n = R x C, consider the input array as an R x C matrix.
5
The Matrix Algorithm
THE ALGORITHM
1. Apply a (length R) FFT on each column
2. Multiply each matrix element (index r, c) by the twiddle factor
3. Apply a (length C) transform on each row
4. Transpose the matrix
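The four steps can be sketched in Python as follows. The assumptions here go beyond the slides: the input is taken as row-major (element (r, c) at index r*C + c), the twiddle factor for element (r, c) is exp(-2*pi*i*r*c/n), and a plain O(n^2) `dft` helper stands in for the short column and row transforms. The names `dft` and `four_step_fft` are hypothetical.

```python
import cmath


def dft(x):
    """Plain DFT used for the short transforms (stand-in for any FFT)."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def four_step_fft(x, R, C):
    """Matrix Fourier (4-step) algorithm for n = R * C, row-major layout."""
    n = R * C
    if len(x) != n:
        raise ValueError("len(x) must equal R * C")
    # View the input as an R x C matrix.
    m = [list(x[r * C:(r + 1) * C]) for r in range(R)]
    # Step 1: length-R transform on each column.
    for c in range(C):
        col = dft([m[r][c] for r in range(R)])
        for r in range(R):
            m[r][c] = col[r]
    # Step 2: multiply element (r, c) by the twiddle factor (assumed form).
    for r in range(R):
        for c in range(C):
            m[r][c] *= cmath.exp(-2j * cmath.pi * r * c / n)
    # Step 3: length-C transform on each row.
    m = [dft(row) for row in m]
    # Step 4: transpose, then flatten; output index is c * R + r.
    return [m[r][c] for c in range(C) for r in range(R)]
```

For example, `four_step_fft(list(range(12)), 3, 4)` should agree with `dft(list(range(12)))` up to rounding. Each short transform touches only R or C elements, which is the source of the improved locality when those fit in cache.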
6
MFA with Slight Variation
n1 simultaneous n2-point multirow FFTs with twiddle factor multiplication
n2 individual n1-point multicolumn FFTs
Transpose
7
The Code
subroutine parallel_fft(A, W, U, N)
  double complex A(*), W(*), U(*)
  if (N .LE. CACHESIZE) then
    CALL in_cache_fft(A, W, U, N)
    return
  end if
8
Step 1
!$OMP PARALLEL
!$OMP DO
  do I=1, N/2
    W(I) = A(I) + A(I+N/2)
    W(I+N/2) = (A(I)-A(I+N/2)) * U(I)
  end do
9
Step 2
!$OMP DO
  do J=1, 2
    call rec_fft(W((J-1)*(N/2)+1), A((J-1)*(N/2)+1), U(N/2+1), N/2)
  end do
10
Step 3
!$OMP DO
  do I=1, N/2
    A(2*I-1) = W(I)
    A(2*I) = W(I+N/2)
  end do
!$OMP END PARALLEL
  return
end
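The three steps above follow a decimation-in-frequency pattern: butterflies with a twiddle multiply, two independent half-length transforms, then an interleave of the results. The serial Python sketch below mirrors that structure so it can be checked on small inputs. It is not a translation of the Fortran: there is no OpenMP, no cache-size cutoff, and the twiddle table `U` is replaced by computing exp(-2*pi*i*k/n) directly, which is an assumption about what `U` holds. The name `dif_fft` is hypothetical.

```python
import cmath


def dif_fft(a):
    """Serial decimation-in-frequency FFT mirroring Steps 1-3 (sketch)."""
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(a)
    half = n // 2
    # Step 1: butterflies; the difference half is scaled by a twiddle
    # factor (assumed to be what the table U supplies in the slides).
    w = [0j] * n
    for i in range(half):
        w[i] = a[i] + a[i + half]
        w[i + half] = (a[i] - a[i + half]) * cmath.exp(-2j * cmath.pi * i / n)
    # Step 2: two independent half-length transforms (parallel over J
    # in the slides, sequential here).
    lo = dif_fft(w[:half])
    hi = dif_fft(w[half:])
    # Step 3: interleave; the first half supplies even-indexed outputs,
    # the second half supplies odd-indexed outputs.
    out = [0j] * n
    out[0::2] = lo
    out[1::2] = hi
    return out
```

The two recursive calls in Step 2 share no data, which is what allows the slides' version to run them under an OpenMP worksharing loop.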
11
For Reference
Swarztrauber, P.N.: Multiprocessor FFTs. Parallel Computing 5 (1987) 197–210
Cochrane, W.T., Cooley, J.W., Favin, D.L., Helms, H.D., Kaenel, R.A., Lang, W.W., Maling, Jr., G.C., Nelson, D.E., Rader, C.M., Welch, P.D.: What is the fast Fourier transform? IEEE Trans. Audio Electroacoust. 15 (1967) 45–55
Swarztrauber, P.N.: FFT algorithms for vector computers. Parallel Computing 1 (1984) 45–63
Frigo, M., Johnson, S.G.: Fftw. (
Cooley, J.W., Tukey, J.W.: An algorithm for the machine calculation of complex Fourier series. Math. Comput. 19 (1965) 297–301
Takahashi, D., Sato, M., Boku, T.: An OpenMP Implementation of Parallel FFT and Its Performance on IA-64 Processors. WOMPAT 2003:
Wadleigh, K.R.: High performance FFT algorithms for cache-coherent multiprocessors. The International Journal of High Performance Computing Applications 13 (1999) 163–171
Takahashi, D.: A blocking algorithm for parallel 1-D FFT on shared-memory parallel computers. In: Proc. 6th International Conference on Applied Parallel Computing (PARA 2002). Volume 2367 of Lecture Notes in Computer Science, Springer-Verlag (2002) 380–389