Fast Fourier Transform
Overview
An FFT is an efficient algorithm for computing the Discrete Fourier Transform (DFT) and its inverse.
Direct computation of the DFT has complexity O(n^2).
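As a point of reference for the O(n^2) cost, here is a minimal Python sketch of the direct DFT and its inverse. The function names `dft_direct` and `idft_direct` are illustrative, and the sign convention (negative exponent for the forward transform, 1/n scaling on the inverse) is an assumption; the slides do not fix one.

```python
import cmath

def dft_direct(x):
    """Direct DFT: X[k] = sum_j x[j] * exp(-2*pi*i*j*k/n). Costs O(n^2)."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]

def idft_direct(X):
    """Inverse DFT under the same convention: positive exponent, scaled by 1/n."""
    n = len(X)
    return [
        sum(X[k] * cmath.exp(2j * cmath.pi * j * k / n) for k in range(n)) / n
        for j in range(n)
    ]
```

Each of the n outputs is a sum over all n inputs, which is where the n^2 operation count comes from.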
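FFT Algorithms
FFT algorithms reduce the complexity to O(n log n).
Examples: radix-2, radix-4, radix-8, etc.
However, these algorithms are not cache friendly: once the data no longer fits in cache, their large-stride passes over the array cause frequent cache misses.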
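To make the O(n log n) recursion concrete, here is a minimal Python sketch of a radix-2 FFT (decimation in time). It is one textbook variant, not necessarily the one the slides have in mind; the name `fft_radix2` and the forward sign convention are assumptions.

```python
import cmath

def fft_radix2(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])   # transform of even-indexed samples
    odd = fft_radix2(x[1::2])    # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle factor times odd part
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```

The recursion has log2(n) levels with O(n) work per level. The slices `x[0::2]` and `x[1::2]` also show the strided access pattern that hurts cache behaviour for large n.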
The Matrix Algorithm
The Matrix Fourier Algorithm (MFA, also called the 4-step algorithm) has better cache locality.
It works for composite data lengths.
For an input of size n = R x C, treat the input array as an R x C matrix.
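A small Python sketch of what "treat the input array as an R x C matrix" means for a flat array. It assumes row-major storage, so element (r, c) lives at index r*C + c; the slides do not state the layout, and Fortran arrays are column-major, so this is an illustrative choice only. The helper names `row` and `column` are hypothetical.

```python
def row(x, R, C, r):
    """Row r of the R x C row-major view: C contiguous elements."""
    return x[r * C:(r + 1) * C]

def column(x, R, C, c):
    """Column c of the R x C row-major view: R elements, C apart."""
    return x[c::C]
```

For example, with `x = list(range(12))`, R = 3 and C = 4, `row(x, 3, 4, 2)` is `[8, 9, 10, 11]` and `column(x, 3, 4, 1)` is `[1, 5, 9]`.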
The Matrix Algorithm
The algorithm:
1. Apply a length-R FFT to each column.
2. Multiply each matrix element (index r, c) by the twiddle factor exp(±2 pi i r c / n), with the sign matching that of the transform.
3. Apply a length-C FFT to each row.
4. Transpose the matrix.
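The four steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions: row-major storage (element (r, c) at index r*C + c), the forward sign convention exp(-2 pi i r c / n), and a naive `dft` helper standing in for the short length-R and length-C FFTs. The names `dft` and `matrix_fft` are hypothetical.

```python
import cmath

def dft(x):
    """Naive DFT, used here only as a stand-in for the short FFTs."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def matrix_fft(x, R, C):
    """Forward transform of x (length R*C) via the 4-step matrix algorithm."""
    n = R * C
    if len(x) != n:
        raise ValueError("len(x) must equal R * C")
    a = list(x)
    # Step 1: length-R transform on each column (elements c, c+C, c+2C, ...).
    for c in range(C):
        a[c::C] = dft(a[c::C])
    # Step 2: multiply element (r, c) by the twiddle factor exp(-2*pi*i*r*c/n).
    for r in range(R):
        for c in range(C):
            a[r * C + c] *= cmath.exp(-2j * cmath.pi * r * c / n)
    # Step 3: length-C transform on each row (C contiguous elements).
    for r in range(R):
        a[r * C:(r + 1) * C] = dft(a[r * C:(r + 1) * C])
    # Step 4: transpose R x C -> C x R, so output k = c*R + r takes element (r, c).
    return [a[r * C + c] for c in range(C) for r in range(R)]
```

The result should agree with a direct length-n DFT of the same input, for example `matrix_fft(x, 3, 4)` against `dft(x)` with `len(x) == 12`. Only the short row and column transforms touch the data, which is what gives the algorithm its locality when each row or column fits in cache.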
MFA with Slight Variation
1. n1 simultaneous n2-point multirow FFTs with twiddle factor multiplication
2. n2 individual n1-point multicolumn FFTs
3. Transpose
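The Code
    subroutine parallel_fft(A, W, U, N)
    ! A: data (holds the result on return), W: work array, U: twiddle factors, N: length
    double complex A(*), W(*), U(*)
    ! Small problems: use the in-cache FFT directly
    if (N .LE. CACHESIZE) then
      call in_cache_fft(A, W, U, N)
      return
    end if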
Step 1
    ! Butterflies: sums go to the lower half of W, twiddled differences to the upper half
    !$OMP PARALLEL
    !$OMP DO
    do I = 1, N/2
      W(I) = A(I) + A(I+N/2)
      W(I+N/2) = (A(I) - A(I+N/2)) * U(I)
    end do
Step 2
    ! Transform both halves of W, using A as work space; the two calls are independent
    !$OMP DO
    do J = 1, 2
      call rec_fft(W((J-1)*(N/2)+1), A((J-1)*(N/2)+1), U(N/2+1), N/2)
    end do
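Step 3
    ! Interleave: the first half of W fills the odd positions of A, the second half the even positions
    !$OMP DO
    do I = 1, N/2
      A(2*I-1) = W(I)
      A(2*I) = W(I+N/2)
    end do
    !$OMP END PARALLEL
    return
    end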
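The three steps of the subroutine can be checked with a short serial sketch in Python. It assumes that U(I) holds the forward twiddle factor exp(-2 pi i (I-1)/N), which the slides do not state. It recurses down to length 1 instead of switching to an in-cache FFT, and it leaves out the OpenMP parallelism. The name `dif_fft` is hypothetical.

```python
import cmath

def dif_fft(a):
    """Serial sketch of the split / recurse / interleave scheme; len(a) must be a power of two."""
    n = len(a)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(a)
    half = n // 2
    # Step 1: butterflies. Sums form the lower half, twiddled differences the upper half.
    lo = [a[i] + a[i + half] for i in range(half)]
    hi = [(a[i] - a[i + half]) * cmath.exp(-2j * cmath.pi * i / n)  # assumed U(I)
          for i in range(half)]
    # Step 2: transform each half independently (the two calls run in parallel in the Fortran).
    even = dif_fft(lo)
    odd = dif_fft(hi)
    # Step 3: interleave. 0-based even slots get the first half, odd slots the second.
    out = [0j] * n
    out[0::2] = even
    out[1::2] = odd
    return out
```

Under these assumptions the output matches a direct DFT of the input in natural order, which is why Step 3 interleaves the two half-length results rather than concatenating them.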
