Fast Fourier Transform

1 Fast Fourier Transform
Algorithms in Action: Fast Fourier Transform. Haim Kaplan, Uri Zwick, Tel Aviv University, March 2016. Last updated: March 6, 2018.

2 Discrete Fourier Transform (DFT)
A very special linear transformation:
$$\begin{pmatrix} y_0 \\ y_1 \\ \vdots \\ y_{n-1} \end{pmatrix} = \begin{pmatrix} & \vdots & \\ \cdots & \omega_n^{jk} & \cdots \\ & \vdots & \end{pmatrix} \begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{pmatrix}$$
($j$ – row index, $k$ – column index), that is, $y_j = \sum_{k=0}^{n-1} \omega_n^{jk} x_k$, where $\omega = \omega_n = e^{2\pi i/n} = \cos\frac{2\pi}{n} + i \sin\frac{2\pi}{n}$ and $i = \sqrt{-1}$.
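The definition can be tried out directly. The following is a minimal sketch of the naive $\Theta(n^2)$ computation; the function name `dft` is an illustrative choice, not something from the slides. It uses the slides' sign convention $\omega_n = e^{+2\pi i/n}$; many libraries use $e^{-2\pi i/n}$ for the forward transform, so their outputs are the complex conjugates of these for real inputs.

```python
import cmath

def dft(x):
    """Naive DFT: y_j = sum_k w^(j*k) * x_k with w = e^(2*pi*i/n)."""
    n = len(x)
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(w ** (j * k) * x[k] for k in range(n)) for j in range(n)]

# For n = 4 we have w = i, so the matrix entries are powers of i.
y = dft([1, 2, 3, 4])
```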

3 Complex roots of unity ($n=8$)
$\omega = \omega_8 = e^{2\pi i/8} = e^{\pi i/4} = \frac{1+i}{\sqrt{2}}$, $\omega^2 = i$, $\omega^3 = e^{3\pi i/4} = \frac{-1+i}{\sqrt{2}}$, $\omega^4 = -1$, $\omega^5$, $\omega^6 = -i$, $\omega^7$, $\omega^0 = \omega^8 = 1$.

4 Discrete Fourier Transform (DFT)
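The case $n=4$:
$$\begin{pmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & \omega & \omega^2 & \omega^3 \\ 1 & \omega^2 & 1 & \omega^2 \\ 1 & \omega^3 & \omega^2 & \omega \end{pmatrix} \begin{pmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{pmatrix}, \qquad \omega = \omega_4 = e^{2\pi i/4} = i.$$
In general: $\omega_n^k = \omega_n^{k \bmod n}$.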

5 DFT as polynomial evaluation
Evaluating a polynomial at $1, \omega, \omega^2, \ldots, \omega^{n-1}$. The "$z$-transform" of $\mathbf{x}$ is $X(z) = \sum_{k=0}^{n-1} x_k z^k = x_0 + x_1 z + \ldots + x_{n-1} z^{n-1}$, and $y_j = \sum_{k=0}^{n-1} \omega_n^{jk} x_k = X(\omega_n^j)$, for $j = 0, 1, \ldots, n-1$.

6 Fast Fourier Transform (FFT)
FFT is an algorithm for computing DFT. Naïve computation of DFT requires $\Theta(n^2)$ time. FFT computes DFT in $\Theta(n \log n)$ time. Developed by Cooley and Tukey in 1965, but similar ideas were used much earlier, e.g., by Runge and König in 1924 and others. We assume that $n = 2^k$.

7 Applications of the FFT
Digital signal processing: transforming signals from the time domain to the frequency domain. Computing convolutions: multiplication of polynomials, multiplication of large integers, string matching problems. Quantum computing: used in Shor's integer factorization algorithm. ⋮

8 Sampling $\sin(9 \cdot 2\pi x)$ (figure). Sampling rate = 32 Hz

9 Sampling $\sin(9 \cdot 2\pi x)$ (figure). Sampling rate = 32 Hz

10 $\sin(9 \cdot 2\pi x)$ and its spectrum
Spectrum = absolute value of the Fourier coefficients. In addition to the spectrum, we also have the phase. 9 (= 9 Hz) is right. What is 23? $23 = 32 - 9$!

11 Aliasing example: $\sin(5 \cdot 2\pi x)$ vs. $-\sin(11 \cdot 2\pi x)$
Same samples at 16 Hz. Since $\sin(2\pi j - x) = -\sin x$ for integer $j$, we get $\sin\big((n-f) \cdot 2\pi j/n\big) = -\sin(f \cdot 2\pi j/n)$.
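The aliasing identity is easy to check numerically. A small sketch, assuming samples are taken at the points $j/16$ for $j = 0, \ldots, 15$ (the helper name `samples` is made up for illustration):

```python
import math

def samples(freq, sign, rate=16):
    """Sample sign * sin(freq * 2*pi*x) at x = j / rate, j = 0..rate-1."""
    return [sign * math.sin(freq * 2 * math.pi * j / rate) for j in range(rate)]

a = samples(5, +1)    # sin(5 * 2*pi*x)
b = samples(11, -1)   # -sin(11 * 2*pi*x)
max_diff = max(abs(u - v) for u, v in zip(a, b))  # tiny: the sample sets coincide
```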

12 The Sampling Theorem (Nyquist, Shannon, …)
"If a real continuous signal $x(t)$ contains no frequencies higher than $B$ Hz, then $x(t)$ is uniquely determined by its sampled version $x_j = x(j/F)$, where $j = \ldots, -1, 0, 1, \ldots$, at frequency $F$ Hz, provided that $F \ge 2B$." For information only. Not part of this course. We only consider finite discrete "signals".

13 "Symmetry" of DFT for real signals
Lemma: If $\mathbf{x} = (x_0, x_1, \ldots, x_{n-1}) \in \mathbb{R}^n$, $\mathbf{y} = (y_0, y_1, \ldots, y_{n-1}) \in \mathbb{C}^n$ and $\mathbf{y} = \mathrm{DFT}(\mathbf{x})$, then $y_{n-j} = y_j^*$, for $j = 1, 2, \ldots, n-1$. Complex conjugates: if $z = x + iy$ then $z^* = x - iy$. Absolute values: if $z = x + iy$ then $|z|^2 = z z^* = x^2 + y^2$. Exercise: Prove the lemma. For real inputs, the first half of the DFT contains all the information.

14 Frequencies and their relative contribution correctly identified!
$\sin(9 \cdot 2\pi x)\ \sin(2 \cdot 2\pi x) + 2\cos(2\pi x)$ (figure). As we shall soon see, from the Fourier coefficients, not just their absolute values, we can reconstruct the original signal.

15 $\sin(9.5 \cdot 2\pi x)$
Basis vectors correspond to integer frequencies. 9.5 Hz is a non-trivial combination of basis vectors. See below.

16 For more information, take a course on digital signal processing.

17 Decomposing the DFT (I)
Goal: Compute a DFT of even size $n$ by computing two DFTs of size $n/2$. Split $\mathbf{x}$ into even and odd parts:
$\mathbf{x} = (x_0, x_1, \ldots, x_{n-1})$, $X(z) = \sum_{j=0}^{n-1} x_j z^j$;
$\mathbf{x}^{(0)} = (x_0, x_2, \ldots, x_{n-2})$, $X_0(z) = \sum_{j=0}^{n/2-1} x_{2j} z^j$;
$\mathbf{x}^{(1)} = (x_1, x_3, \ldots, x_{n-1})$, $X_1(z) = \sum_{j=0}^{n/2-1} x_{2j+1} z^j$.
Then $X(z) = X_0(z^2) + z X_1(z^2)$.

18 Decomposing the DFT (II)
$X(z) = X_0(z^2) + z X_1(z^2)$. We need to evaluate $X(z)$ at $\omega_n^0, \omega_n^1, \ldots, \omega_n^{n-1}$. To do that we need to evaluate $X_0(z)$ and $X_1(z)$ at $\omega_n^0, \omega_n^2, \ldots, \omega_n^{2(n-1)}$. But, since $\omega_n^2 = \omega_{n/2}$, these $n$ points are exactly $\omega_{n/2}^0, \omega_{n/2}^1, \ldots, \omega_{n/2}^{n/2-1}, \omega_{n/2}^0, \omega_{n/2}^1, \ldots, \omega_{n/2}^{n/2-1}$. Thus, we only need to compute $\mathrm{DFT}(\mathbf{x}^{(0)})$ and $\mathrm{DFT}(\mathbf{x}^{(1)})$, use each computed number twice, and multiply the values of $\mathrm{DFT}(\mathbf{x}^{(1)})$ by appropriate powers of $\omega_n$!

19 FFT – recursive version
FFT(x_0, x_1, …, x_{n−1}):
    if n = 2: return (x_0 + x_1, x_0 − x_1)
    (a_0, a_1, …, a_{n/2−1}) ← FFT(x_0, x_2, …, x_{n−2})
    (b_0, b_1, …, b_{n/2−1}) ← FFT(x_1, x_3, …, x_{n−1})
    for j ← 0 to n/2 − 1:
        y_j ← a_j + ω_n^j b_j
        y_{n/2+j} ← a_j − ω_n^j b_j    // ω_n^{n/2+j} = −ω_n^j
    return (y_0, y_1, …, y_{n−1})
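As a hedged illustration, the recursive procedure can be transcribed into Python roughly as follows. This sketch assumes the input length is a power of 2, keeps the slides' convention $\omega_n = e^{2\pi i/n}$, and stops the recursion at $n = 1$ rather than $n = 2$, which gives the same results.

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of 2."""
    n = len(x)
    if n == 1:
        return list(x)
    a = fft(x[0::2])  # transform of the even-indexed entries
    b = fft(x[1::2])  # transform of the odd-indexed entries
    y = [0j] * n
    for j in range(n // 2):
        t = cmath.exp(2j * cmath.pi * j / n) * b[j]  # twiddle factor times b_j
        y[j] = a[j] + t
        y[j + n // 2] = a[j] - t  # uses w^(n/2 + j) = -w^j
    return y

y = fft([1, 2, 3, 4])
```

Each level of the recursion does a linear amount of work on top of two half-size calls, which is where the $\Theta(n \log n)$ bound comes from.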

20 FFT(x_0, x_1, …, x_{n−1}): // Slightly optimized
    if n = 2: return (x_0 + x_1, x_0 − x_1)
    (a_0, a_1, …, a_{n/2−1}) ← FFT(x_0, x_2, …, x_{n−2})
    (b_0, b_1, …, b_{n/2−1}) ← FFT(x_1, x_3, …, x_{n−1})
    ω ← 1
    for j ← 0 to n/2 − 1:
        t ← ω b_j
        y_j ← a_j + t
        y_{n/2+j} ← a_j − t
        ω ← ω ω_n    // Can be precomputed
    return (y_0, y_1, …, y_{n−1})

21 Complexity of the FFT
$T(n)$ – cost of an FFT of size $n$: $T(n) = 2T(n/2) + O(n)$, so $T(n) = O(n \log n)$.
$A(n)$ – number of additions/subtractions in $\mathrm{FFT}_n$; $M(n)$ – number of (complex) multiplications in $\mathrm{FFT}_n$.
$A(2) = 2$, $M(2) = 0$, $A(n) = 2A(n/2) + n$, $M(n) = 2M(n/2) + n/2$.
Hence $M(n) = \frac{n}{2} \log_2 \frac{n}{2}$ and $A(n) = n \log_2 n$.
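The closed forms can be checked against the recurrences. The sketch below (the function name `counts` is hypothetical) simply unrolls $A(n)$ and $M(n)$ for powers of 2 and compares them with $n \log_2 n$ and $\frac{n}{2}\log_2\frac{n}{2}$.

```python
def counts(n):
    """Return (A(n), M(n)) from the recurrences, for n a power of 2, n >= 2."""
    if n == 2:
        return 2, 0
    a, m = counts(n // 2)
    return 2 * a + n, 2 * m + n // 2

checked = []
for k in range(1, 11):
    n = 2 ** k
    a, m = counts(n)
    # log2(n) = k and log2(n/2) = k - 1
    checked.append(a == n * k and m == (n // 2) * (k - 1))
```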

22 A butterfly
[Figure: a butterfly with inputs $a$ and $b$ and "twiddle factor" $\omega$. The input $b$ is multiplied by $\omega$; the two outputs are $a + \omega b$ and $a - \omega b$.]

23 An FFT circuit
[Figure: an FFT circuit for $n = 8$. The inputs $x_0, x_2, x_4, x_6$ and $x_1, x_3, x_5, x_7$ feed two size-4 transforms $F_4$; a final layer of butterflies with twiddle factors $\omega_8^0, \omega_8^1, \omega_8^2, \omega_8^3$ produces $y_0, \ldots, y_7$.] Input permuted!

24 An FFT circuit
[Figure: each $F_4$ is expanded into two size-2 transforms $F_2$ followed by butterflies with twiddle factors $\omega_4^0, \omega_4^1$; the last layer uses $\omega_8^0, \omega_8^1, \omega_8^2, \omega_8^3$ and produces $y_0, \ldots, y_7$. The input order is now $x_0, x_4, x_2, x_6, x_1, x_5, x_3, x_7$.] Input further permuted!

25 An FFT circuit
[Figure: the fully expanded circuit for $n = 8$: three layers of butterflies with twiddle factors $\omega_2^0$; $\omega_4^0, \omega_4^1$; and $\omega_8^0, \omega_8^1, \omega_8^2, \omega_8^3$, mapping the permuted inputs to $y_0, \ldots, y_7$.] Input permuted!

26 An FFT circuit: bit-reversal permutation
[Figure: the same circuit, with the inputs $x_0, x_1, \ldots, x_7$ first routed through the bit-reversal permutation into the order $x_0, x_4, x_2, x_6, x_1, x_5, x_3, x_7$; butterfly layers with twiddle factors $\omega_2^0$; $\omega_4^0, \omega_4^1$; $\omega_8^0, \ldots, \omega_8^3$; outputs $y_0, \ldots, y_7$.]

27 An FFT circuit: bit-reversal permutation
[Figure: the circuit of the previous slide, showing the twiddle factors $\omega_2^0$ and $\omega_4^0, \omega_4^1$.]
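The input order of the circuit can be generated by reversing the binary representation of each index. A minimal sketch, assuming $n$ is a power of 2 (the function name is an illustrative choice):

```python
def bit_reverse_permutation(n):
    """Return [rev(0), rev(1), ..., rev(n-1)], where rev reverses log2(n) bits."""
    bits = n.bit_length() - 1  # number of bits per index when n is a power of 2
    return [int(format(i, f"0{bits}b")[::-1], 2) for i in range(n)]

order = bit_reverse_permutation(8)  # position p of the circuit reads x[order[p]]
```

For $n = 8$, index 1 is `001` in binary, which reverses to `100`, so the second input wire carries $x_4$.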

28 FFT and Algorithm Engineering
In real life, constant factors matter. A tremendous amount of work was invested in optimizing the performance of FFT algorithms on specific architectures. The algorithm we saw is a radix-2 FFT. Radix-4 and varying radices work better in practice. A good FFT implementation should use cache and memory cleverly, and use parallelism if possible.

29 Decomposing the DFT (I)
$\omega_n^{j \cdot 2k} = \omega_{n/2}^{jk}$, $\quad \omega_n^{j \cdot (2k+1)} = \omega_n^j \cdot \omega_{n/2}^{jk}$, $\quad \omega_n^{(\frac{n}{2}+j) \cdot 2k} = \omega_{n/2}^{jk}$, $\quad \omega_n^{(\frac{n}{2}+j) \cdot (2k+1)} = -\omega_n^j \cdot \omega_{n/2}^{jk}$. This gives the algorithm we have seen.

30 Decomposing the DFT (II)
$\omega_n^{2j \cdot k} = \omega_{n/2}^{jk}$, $\quad \omega_n^{(2j+1) \cdot k} = \omega_{n/2}^{jk} \cdot \omega_n^k$, $\quad \omega_n^{2j \cdot (\frac{n}{2}+k)} = \omega_{n/2}^{jk}$, $\quad \omega_n^{(2j+1) \cdot (\frac{n}{2}+k)} = \omega_{n/2}^{jk} \cdot (-\omega_n^k)$. This gives an alternative algorithm.

31 The Inverse DFT
The inverse DFT is very similar to the DFT:
$$\begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{pmatrix} = \frac{1}{n} \begin{pmatrix} & \vdots & \\ \cdots & \omega_n^{-jk} & \cdots \\ & \vdots & \end{pmatrix} \begin{pmatrix} y_0 \\ y_1 \\ \vdots \\ y_{n-1} \end{pmatrix}$$
To prove it, we need to show that $\sum_{\ell=0}^{n-1} \omega_n^{j\ell} \omega_n^{-\ell k} = n$ if $j = k$, and $0$ otherwise. (Recall that if $C = AB$, then $c_{j,k} = \sum_{\ell=0}^{n-1} a_{j,\ell} b_{\ell,k}$.)

32 The Inverse DFT
Claim: $\sum_{\ell=0}^{n-1} \omega_n^{j\ell} \omega_n^{-\ell k} = n$ if $j = k$, and $0$ otherwise.
If $j = k$, then $\omega_n^{j\ell} \omega_n^{-\ell k} = 1$, so the claim is obvious. If $j \ne k$, and $0 \le j, k < n$, then:
$$\sum_{\ell=0}^{n-1} \omega_n^{j\ell} \omega_n^{-\ell k} = \sum_{\ell=0}^{n-1} \omega_n^{(j-k)\ell} = \frac{(\omega_n^{j-k})^n - 1}{\omega_n^{j-k} - 1} = 0,$$
as $(\omega_n^{j-k})^n = (\omega_n^n)^{j-k} = 1$, while $\omega_n^{j-k} \ne 1$. (Recall that if $a \ne 1$, then $\sum_{\ell=0}^{n-1} a^\ell = \frac{a^n - 1}{a - 1}$.)
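A quick numerical sanity check of the inversion formula, as a sketch: the helper `dft` takes a hypothetical `sign` parameter so that the same naive routine serves for both the matrix $(\omega_n^{jk})$ and the matrix $(\omega_n^{-jk})$.

```python
import cmath

def dft(x, sign=+1):
    """Naive transform with matrix entries w^(sign*j*k), w = e^(2*pi*i/n)."""
    n = len(x)
    w = cmath.exp(sign * 2j * cmath.pi / n)
    return [sum(w ** (j * k) * x[k] for k in range(n)) for j in range(n)]

def inverse_dft(y):
    """Apply the matrix (w^(-j*k)) and divide by n."""
    n = len(y)
    return [v / n for v in dft(y, sign=-1)]

x = [1, 2, 3, 4, 5, 6]
roundtrip = inverse_dft(dft(x))  # should reproduce x up to rounding error
```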

33 $\mathrm{DFT}^{-1}$ as polynomial interpolation
$\mathrm{DFT}(\mathbf{x})$ evaluates the polynomial $X(z) = \sum_{j=0}^{n-1} x_j z^j$ corresponding to $\mathbf{x}$ at the points $1, \omega_n, \omega_n^2, \ldots, \omega_n^{n-1}$. $\mathrm{DFT}^{-1}(\mathbf{y})$ interpolates the coefficients of a polynomial $X(z) = \sum_{j=0}^{n-1} x_j z^j$ such that $X(\omega_n^j) = y_j$, $j = 0, \ldots, n-1$. As $\mathrm{DFT}$ and $\mathrm{DFT}^{-1}$ are inverses of each other, the interpolation polynomial is unique. Interpolation Theorem: For any sequence $\alpha_0, \alpha_1, \ldots, \alpha_{n-1}$ of distinct numbers, and any sequence $y_0, y_1, \ldots, y_{n-1}$, there is a unique polynomial $f(z) = \sum_{j=0}^{n-1} x_j z^j$ of degree less than $n$ such that $f(\alpha_j) = y_j$, for $j = 0, 1, \ldots, n-1$.

34 Change of basis
The standard basis of $\mathbb{C}^n$ is $\mathbf{e}_0, \mathbf{e}_1, \ldots, \mathbf{e}_{n-1}$, and $\mathbf{x} = (x_0, x_1, \ldots, x_{n-1})^T = \sum_{k=0}^{n-1} x_k \mathbf{e}_k$. Let $\mathbf{b}_0, \mathbf{b}_1, \ldots, \mathbf{b}_{n-1} \in \mathbb{C}^n$ be a basis of $\mathbb{C}^n$, i.e., a sequence of $n$ linearly independent vectors. Then $\mathbf{x} = \sum_{k=0}^{n-1} x'_k \mathbf{b}_k$, for some $x'_0, x'_1, \ldots, x'_{n-1}$ that can be obtained by solving a system of linear equations:
$$\begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{pmatrix} = \begin{pmatrix} | & | & & | \\ \mathbf{b}_0 & \mathbf{b}_1 & \cdots & \mathbf{b}_{n-1} \\ | & | & & | \end{pmatrix} \begin{pmatrix} x'_0 \\ x'_1 \\ \vdots \\ x'_{n-1} \end{pmatrix}$$

35 Orthonormal basis
A basis $\mathbf{b}_0, \mathbf{b}_1, \ldots, \mathbf{b}_{n-1} \in \mathbb{C}^n$ is orthonormal if $\langle \mathbf{b}_j, \mathbf{b}_k \rangle = \mathbf{b}_j^* \mathbf{b}_k = 1$ if $j = k$, and $0$ otherwise. Here, for a column vector $\mathbf{x} = (x_0, x_1, \ldots, x_{n-1})^T$, the conjugate transpose is the row vector $\mathbf{x}^* = [x_0^* \ x_1^* \ \ldots \ x_{n-1}^*]$, where for $x = a + ib$ we have $x^* = a - ib$.

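36 Orthonormal basis
If $\mathbf{b}_0, \mathbf{b}_1, \ldots, \mathbf{b}_{n-1} \in \mathbb{C}^n$ is orthonormal then the inverse of the matrix whose columns are $\mathbf{b}_0, \ldots, \mathbf{b}_{n-1}$ is the matrix whose rows are $\mathbf{b}_0^*, \ldots, \mathbf{b}_{n-1}^*$:
$$\begin{pmatrix} | & | & & | \\ \mathbf{b}_0 & \mathbf{b}_1 & \cdots & \mathbf{b}_{n-1} \\ | & | & & | \end{pmatrix}^{-1} = \begin{pmatrix} \text{—}\ \mathbf{b}_0^*\ \text{—} \\ \text{—}\ \mathbf{b}_1^*\ \text{—} \\ \vdots \\ \text{—}\ \mathbf{b}_{n-1}^*\ \text{—} \end{pmatrix}, \qquad \begin{pmatrix} x'_0 \\ x'_1 \\ \vdots \\ x'_{n-1} \end{pmatrix} = \begin{pmatrix} \text{—}\ \mathbf{b}_0^*\ \text{—} \\ \text{—}\ \mathbf{b}_1^*\ \text{—} \\ \vdots \\ \text{—}\ \mathbf{b}_{n-1}^*\ \text{—} \end{pmatrix} \begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{pmatrix}$$
Thus $x'_i = \langle \mathbf{b}_i, \mathbf{x} \rangle = \mathbf{b}_i^* \mathbf{x}$.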

37 The Fourier basis
$\mathbf{f}_j = \frac{1}{\sqrt{n}} \big(1, \omega_n^{-j}, \omega_n^{-2j}, \ldots, \omega_n^{-(n-1)j}\big)^T$, and $\mathbf{f}_j^* \mathbf{f}_k = 1$ if $j = k$, and $0$ otherwise. The Fourier basis is orthonormal. The DFT performs a change of basis, from the standard basis to the Fourier basis. (If we multiply the result by $\frac{1}{\sqrt{n}}$.) The minus signs can be removed from the definition of the basis vectors $\mathbf{f}_j$ and moved into the DFT matrix.

38 Why did the "signal processing" examples work?
Exercise: Let $\mathbf{x} = \big(f(0), f(\tfrac{1}{n}), \ldots, f(\tfrac{n-1}{n})\big)$, where $f(x) = \sin(9 \cdot 2\pi x)\ \sin(2 \cdot 2\pi x) + 2\cos(2\pi x)$. What is $\mathrm{DFT}(\mathbf{x})$? Hint: No complicated calculations are required. Use the fact that $\sin x = \frac{1}{2i}\big(e^{ix} - e^{-ix}\big)$, and similar relations. Note: The values shown on slide 14 are normalized absolute values.

39 A butterfly and its inverse
[Figure: a butterfly with twiddle factor $\omega$ maps $(a, b)$ to $c = a + \omega b$, $d = a - \omega b$. The inverse butterfly, with twiddle factor $\omega^{-1}$, maps $(c, d)$ to $c + d = 2a$ and $\omega^{-1}(c - d) = 2b$.] To compute $\mathrm{FFT}^{-1}$ we can also run the $\mathrm{FFT}$ network backwards. This also gives the alternative $\mathrm{FFT}$ network.

40 Convolution
$\mathbf{x} = (x_0, x_1, \ldots, x_{n-1})$, $\mathbf{y} = (y_0, y_1, \ldots, y_{n-1})$, $\mathbf{z} = \mathbf{x} * \mathbf{y} = (z_0, z_1, \ldots, z_{2n-1})$, where
$$z_k = \sum_{i+j=k} x_i y_j = \sum_i x_i y_{k-i}, \qquad \max\{0, k-n+1\} \le i \le \min\{k, n-1\}.$$
Naturally extends to $\mathbf{x}$ and $\mathbf{y}$ having different lengths.

41 Convolution
$\mathbf{z} = \mathbf{x} * \mathbf{y}$, $z_k = \sum_{i+j=k} x_i y_j$. Example: $n = 4$.
$z_0 = x_0 y_0$
$z_1 = x_0 y_1 + x_1 y_0$
$z_2 = x_0 y_2 + x_1 y_1 + x_2 y_0$
$z_3 = x_0 y_3 + x_1 y_2 + x_2 y_1 + x_3 y_0$
$z_4 = x_1 y_3 + x_2 y_2 + x_3 y_1$
$z_5 = x_2 y_3 + x_3 y_2$
$z_6 = x_3 y_3$
For convenience, $z_7 = 0$.

42 Convolution
[Figure: $x_0\ x_1\ x_2\ x_3$ aligned against the reversed sequence $y_3\ y_2\ y_1\ y_0$.] Reverse $\mathbf{y}$. Shift $\mathbf{y}$ to the starting position. Compute the products for each alignment.

43 Convolution
[Figure: $x_0\ x_1\ x_2\ x_3$ against the sliding reversed sequence $y_3\ y_2\ y_1\ y_0$.]
$z_0 = x_0 y_0$
$z_1 = x_0 y_1 + x_1 y_0$
$z_2 = x_0 y_2 + x_1 y_1 + x_2 y_0$
$z_3 = x_0 y_3 + x_1 y_2 + x_2 y_1 + x_3 y_0$
$z_4 = x_1 y_3 + x_2 y_2 + x_3 y_1$
$z_5 = x_2 y_3 + x_3 y_2$
$z_6 = x_3 y_3$

44 Convolution and polynomial multiplication
$A(x) = \sum_{j=0}^{n-1} a_j x^j$, $B(x) = \sum_{k=0}^{n-1} b_k x^k$,
$$C(x) = A(x) B(x) = \Big(\sum_{j=0}^{n-1} a_j x^j\Big) \Big(\sum_{k=0}^{n-1} b_k x^k\Big) = \sum_{i=0}^{2n-2} \Big(\sum_{j+k=i} a_j b_k\Big) x^i = \sum_{i=0}^{2n-1} c_i x^i,$$
with $c_{2n-1} = 0$. Hence $\mathbf{c} = \mathbf{a} * \mathbf{b}$.

45 Cyclic Convolution
$\mathbf{x} = (x_0, x_1, \ldots, x_{n-1})$, $\mathbf{y} = (y_0, y_1, \ldots, y_{n-1})$, $\mathbf{z} = \mathbf{x} \circledast \mathbf{y} = (z_0, z_1, \ldots, z_{n-1})$, where
$$z_k = \sum_{i+j \equiv k \ (\mathrm{mod}\ n)} x_i y_j = \sum_{i+j=k} x_i y_j + \sum_{i+j=n+k} x_i y_j.$$
Example: $n = 4$.
$z_0 = x_0 y_0 + x_1 y_3 + x_2 y_2 + x_3 y_1$
$z_1 = x_0 y_1 + x_1 y_0 + x_2 y_3 + x_3 y_2$
$z_2 = x_0 y_2 + x_1 y_1 + x_2 y_0 + x_3 y_3$
$z_3 = x_0 y_3 + x_1 y_2 + x_2 y_1 + x_3 y_0$

46 Convolution → Cyclic Convolution
Cyclic convolutions can be computed using $\mathrm{FFT}$ and $\mathrm{FFT}^{-1}$. Convolutions can be reduced to cyclic convolutions by padding with $n$ zeros: $\mathbf{x}' = (x_0, x_1, \ldots, x_{n-1}, 0, 0, \ldots, 0)$, $\mathbf{y}' = (y_0, y_1, \ldots, y_{n-1}, 0, 0, \ldots, 0)$, and then $\mathbf{x} * \mathbf{y} = \mathbf{x}' \circledast \mathbf{y}'$.
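The padding reduction can be illustrated with direct quadratic-time implementations of both definitions. This is only a sketch for checking the identity; the function names are illustrative, and no FFT is involved here.

```python
def convolve(x, y):
    """Ordinary convolution: z_k = sum over i + j = k of x_i * y_j."""
    z = [0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            z[i + j] += xi * yj
    return z

def cyclic_convolve(x, y):
    """Cyclic convolution of two equal-length lists: indices add modulo n."""
    n = len(x)
    z = [0] * n
    for i in range(n):
        for j in range(n):
            z[(i + j) % n] += x[i] * y[j]
    return z

x, y = [1, 2, 3, 4], [5, 6, 7, 8]
n = len(x)
padded = cyclic_convolve(x + [0] * n, y + [0] * n)  # length 2n, last entry is 0
```

With $n$ zeros appended, no index sum $i + j$ of two nonzero entries reaches $2n$, so nothing wraps around.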

47 The Convolution Theorem
$$\mathbf{x} \circledast \mathbf{y} = \mathrm{DFT}^{-1}\big(\mathrm{DFT}(\mathbf{x}) \bullet \mathrm{DFT}(\mathbf{y})\big)$$
($\circledast$ – cyclic convolution, $\bullet$ – point-wise multiplication.) Proof idea: Let $XY(z) = \sum_{i=0}^{n-1} \big(\sum_{j+k \equiv i} x_j y_k\big) z^i$ be the polynomial corresponding to $\mathbf{x} \circledast \mathbf{y}$. We show that $XY(\omega_n^\ell) = X(\omega_n^\ell)\, Y(\omega_n^\ell)$, for every $\ell = 0, 1, \ldots, n-1$.

48 Proof of Convolution Theorem
Let $\omega = \omega_n$. Then
$$X(\omega^\ell)\, Y(\omega^\ell) = \Big(\sum_{j=0}^{n-1} x_j \omega^{\ell j}\Big) \Big(\sum_{k=0}^{n-1} y_k \omega^{\ell k}\Big) = \sum_{i=0}^{2n-2} \Big(\sum_{j+k=i} x_j y_k\Big) \omega^{\ell i} = \sum_{i=0}^{n-1} \Big(\sum_{j+k \equiv i} x_j y_k\Big) \omega^{\ell i} = XY(\omega^\ell),$$
as $\omega^{\ell(n+i)} = \omega^{\ell i}$. The claim follows as the interpolation polynomial is unique. Uniqueness follows from the fact that DFT is invertible. (BTW, $XY(z) = \sum_{i=0}^{n-1} \big(\sum_{j+k \equiv i} x_j y_k\big) z^i = X(z)\, Y(z) \pmod{z^n - 1}$.)
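The theorem translates into a short procedure: transform, multiply point-wise, transform back. A sketch under the assumption that $n$ is a power of 2; the recursive `fft` helper with a hypothetical `sign` parameter doubles as the unnormalized inverse.

```python
import cmath

def fft(x, sign=+1):
    """Radix-2 FFT with w = e^(sign*2*pi*i/n); len(x) must be a power of 2."""
    n = len(x)
    if n == 1:
        return list(x)
    a, b = fft(x[0::2], sign), fft(x[1::2], sign)
    t = [cmath.exp(sign * 2j * cmath.pi * j / n) * b[j] for j in range(n // 2)]
    return [a[j] + t[j] for j in range(n // 2)] + [a[j] - t[j] for j in range(n // 2)]

def cyclic_convolve_fft(x, y):
    """Cyclic convolution as DFT^-1(DFT(x) . DFT(y))."""
    n = len(x)
    product = [u * v for u, v in zip(fft(x), fft(y))]  # point-wise multiplication
    return [v / n for v in fft(product, sign=-1)]      # inverse DFT, scaled by 1/n

z = cyclic_convolve_fft([1, 2, 3, 4], [5, 6, 7, 8])
```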

49 The Chirp Transform
Let $z$ be an arbitrary (but fixed) complex number. The Chirp transform of $\mathbf{x} \in \mathbb{C}^n$, w.r.t. $z$, is:
$$y_k = \sum_{j=0}^{n-1} x_j z^{jk} = X(z^k), \qquad k = 0, 1, \ldots, n-1.$$
Exercise: Show that the Chirp transform, for any $z \in \mathbb{C}$, can be computed in $O(n \log n)$ time. Hint: use the relation
$$y_k = z^{k^2/2} \sum_{j=0}^{n-1} \big(x_j z^{j^2/2}\big) z^{-(k-j)^2/2}, \qquad k = 0, 1, \ldots, n-1.$$
Exercise: Show that $\mathrm{DFT}$ of size $n$ can be computed in $O(n \log n)$ time for every $n$, not necessarily a power of 2.

50 Polynomial arithmetic
Let $A(x) = \sum_{j=0}^{n-1} a_j x^j$ and $B(x) = \sum_{k=0}^{n-1} b_k x^k$ be two polynomials of degree less than $n$, with real or complex coefficients. The coefficients of $A(x) + B(x)$ can be easily computed using $n$ additions. A naïve computation of the coefficients of $A(x) B(x)$ requires $\Theta(n^2)$ operations. Using FFT we can compute the coefficients of $A(x) B(x)$ using only $\Theta(n \log n)$ operations!

51 Karatsuba's algorithm
When $n$ is only moderately large, the following polynomial multiplication algorithm works better in practice.
$A(x) = A_0(x) + x^{n/2} A_1(x)$, $B(x) = B_0(x) + x^{n/2} B_1(x)$
$C_0(x) = A_0(x) B_0(x)$
$C_1(x) = \big(A_0(x) + A_1(x)\big)\big(B_0(x) + B_1(x)\big)$
$C_2(x) = A_1(x) B_1(x)$
$A(x) B(x) = C_0(x) + x^{n/2}\big(C_1(x) - C_0(x) - C_2(x)\big) + x^n C_2(x)$
$T(n) = 3T(n/2) + O(n)$, so $T(n) = O(n^{\log_2 3}) = O(n^{1.59})$.
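A sketch of the three-multiplication recursion on coefficient lists, assuming both inputs have the same length $n$, a power of 2, with the lowest-degree coefficient first (the helper names are illustrative):

```python
def poly_add(a, b):
    """Coefficient-wise sum of two equal-length coefficient lists."""
    return [u + v for u, v in zip(a, b)]

def karatsuba(a, b):
    """Multiply two length-n coefficient lists; returns 2n - 1 coefficients."""
    n = len(a)
    if n == 1:
        return [a[0] * b[0]]
    h = n // 2
    a0, a1, b0, b1 = a[:h], a[h:], b[:h], b[h:]
    c0 = karatsuba(a0, b0)
    c2 = karatsuba(a1, b1)
    c1 = karatsuba(poly_add(a0, a1), poly_add(b0, b1))
    middle = [u - v - w for u, v, w in zip(c1, c0, c2)]  # C1 - C0 - C2
    result = [0] * (2 * n - 1)
    for i in range(len(c0)):
        result[i] += c0[i]          # C0
        result[i + h] += middle[i]  # x^(n/2) * (C1 - C0 - C2)
        result[i + n] += c2[i]      # x^n * C2
    return result

product = karatsuba([1, 2, 3, 4], [5, 6, 7, 8])
```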

52 Numerical issues
So far, we assumed that all arithmetical operations are exact. This is not a realistic assumption, as $\omega_n$ is usually irrational. The FFT algorithm is well-behaved numerically. The errors introduced if all operations are done using floating-point arithmetic are relatively small. In signal processing applications small errors are acceptable.

53 Integer Polynomial Multiplication
We now want to add and multiply polynomials with integer coefficients. We want an exact result. If we use high enough precision, we can use $\mathrm{FFT}$ and $\mathrm{FFT}^{-1}$ and round the results obtained to the nearest integers. To multiply two polynomials of degree at most $n$ with integer coefficients of absolute value at most $n$, $O(\log n)$ bits of precision are enough. (Proof omitted.)
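A sketch of the round-to-nearest idea using Python's double-precision complex numbers. Whether that precision suffices depends on the input sizes; the small example here is well within range. The `fft` helper and its `sign` parameter are illustrative assumptions.

```python
import cmath

def fft(x, sign=+1):
    """Radix-2 FFT with w = e^(sign*2*pi*i/n); len(x) must be a power of 2."""
    n = len(x)
    if n == 1:
        return list(x)
    a, b = fft(x[0::2], sign), fft(x[1::2], sign)
    t = [cmath.exp(sign * 2j * cmath.pi * j / n) * b[j] for j in range(n // 2)]
    return [a[j] + t[j] for j in range(n // 2)] + [a[j] - t[j] for j in range(n // 2)]

def multiply_int_polys(a, b):
    """Multiply integer coefficient lists via FFT, rounding to nearest integers."""
    out_len = len(a) + len(b) - 1
    size = 1
    while size < out_len:  # pad to a power of 2 that avoids wrap-around
        size *= 2
    fa = fft(a + [0] * (size - len(a)))
    fb = fft(b + [0] * (size - len(b)))
    prod = fft([u * v for u, v in zip(fa, fb)], sign=-1)
    return [round(v.real / size) for v in prod[:out_len]]

coeffs = multiply_int_polys([1, 2, 3, 4], [5, 6, 7, 8])
```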

54 Integer Multiplication
There are practical applications, e.g., cryptography, that require multiplying very large integers. The naïve method for multiplying two $n$-bit numbers requires $\Theta(n^2)$ bit operations. Can we use FFTs to obtain a faster integer multiplication algorithm/circuit? Yes, as integer multiplication can be reduced to polynomial multiplication.

55 Schönhage-Strassen's algorithm
Basic idea: $\mathbf{x} = (x_{n-1} \ldots x_1 x_0)_2 = \sum_{i=0}^{n-1} x_i 2^i = X(2)$ and $\mathbf{y} = (y_{n-1} \ldots y_1 y_0)_2 = \sum_{i=0}^{n-1} y_i 2^i = Y(2)$. Compute $Z(t) = X(t)\, Y(t)$ (polynomial multiplication). Then $\mathbf{x} \cdot \mathbf{y} = \mathbf{z} = \sum_{i=0}^{2n-1} z_i 2^i = Z(2)$. We are not done yet, as the $z_i$ are not binary. But, as $0 \le z_i \le n$, this is not a serious problem. Some clever tricks are used to speed up the algorithm. The first trick is to use base $n = 2^k$ rather than 2.
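The basic idea can be sketched as follows. Here a schoolbook convolution stands in for the FFT-based polynomial multiplication, and the non-binary coefficients $z_i$ are turned back into bits by carry propagation. The function names are hypothetical.

```python
def to_bits(v):
    """Little-endian list of the bits of a non-negative integer."""
    return [int(c) for c in bin(v)[:1:-1]]

def multiply(u, v):
    """Multiply integers by multiplying their bit polynomials X(t) * Y(t)."""
    x, y = to_bits(u), to_bits(v)
    z = [0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):          # z = x * y (convolution); an FFT-based
        for j, yj in enumerate(y):      # routine could replace this double loop
            z[i + j] += xi * yj
    bits, carry = [], 0
    for zi in z:                        # evaluate Z(2): propagate carries
        carry += zi
        bits.append(carry & 1)
        carry >>= 1
    while carry:
        bits.append(carry & 1)
        carry >>= 1
    return sum(bit << i for i, bit in enumerate(bits))

result = multiply(12345, 6789)
```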

56 Schönhage-Strassen's algorithm
Let $k = \log n$, so that $n = 2^k$. Split each $n$-bit number into $n/k$ blocks of $k$ bits each, i.e., into digits in base $n$:
$\mathbf{x} = (x_{n/k-1}\ x_{n/k-2} \ldots x_1\ x_0)_n = \sum_{i=0}^{n/k-1} x_i 2^{ki} = X(2^k)$,
$\mathbf{y} = (y_{n/k-1}\ y_{n/k-2} \ldots y_1\ y_0)_n = \sum_{i=0}^{n/k-1} y_i 2^{ki} = Y(2^k)$.
Compute $Z(t) = X(t)\, Y(t)$ (polynomial multiplication). Then $\mathbf{x} \cdot \mathbf{y} = \mathbf{z} = (z_{2n/k-1} \ldots z_1\ z_0)_n = \sum_{i=0}^{2n/k-1} z_i 2^{ki} = Z(2^k)$.

57 Schönhage-Strassen's algorithm
$\mathbf{x} = \sum_{i=0}^{n/k-1} x_i 2^{ki}$, $\mathbf{y} = \sum_{i=0}^{n/k-1} y_i 2^{ki}$, $\mathbf{z} = \sum_{i=0}^{2n/k-1} z_i 2^{ki}$, where $z_i = \sum_{j+\ell=i} x_j y_\ell$. As $0 \le x_j, y_\ell < 2^k = n$, we get $0 \le z_i < \frac{n}{\log n} \cdot n^2 < n^3$. Each $z_i$ is a 3-digit number in base $n$. We can thus pack all the $z_i$ into 3 long integers. Adding these 3 integers gives us the final answer.

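58 Schönhage-Strassen's algorithm
The final step. [Figure: each $z_i$ occupies $3k = 3\log n$ bits and consecutive $z_i$ are offset by $k = \log n$ bits; the three long integers are formed from $\ldots z_3\, z_0$, from $\ldots z_4\, z_1$, and from $\ldots z_5\, z_2$.] Adding 3 $2n$-bit numbers can be easily done using $O(n)$ bit operations.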

59 Schönhage-Strassen's algorithm
To multiply two $n$-bit numbers, we compute two $\mathrm{FFT}$s and one $\mathrm{FFT}^{-1}$ of size $n/k = n/\log n$. Each input number is an integer between $0$ and $n-1$. Each output number is an integer between $0$ and $n^3 - 1$. It is enough to perform all arithmetical operations using a precision of $O(\log n)$ bits. (Stated without proof.) Let $M(n)$ be the total number of bit operations performed. Then
$$M(n) = O\Big(\frac{n}{\log n} \log \frac{n}{\log n}\Big) \times M(O(\log n)) = O\big(n\, M(O(\log n))\big),$$
where the first factor is the number of arithmetical operations in an $\mathrm{FFT}$ of size $n/\log n$ and the second factor is the number of bit operations per arithmetical operation.

60 Schönhage-Strassen's algorithm
Starting from $M(n) = O(n^2)$ and repeatedly applying $M(n) = O\big(n\, M(O(\log n))\big)$:
$M(n) = O\big(n (\log n)^2\big)$
$M(n) = O\big(n \log n\, (\log\log n)^2\big)$
$M(n) = O\big(n \log n\, (\log\log n)\, (\log\log\log n)^2\big)$
⋮

61 Integer Multiplication
[Schönhage-Strassen (1971)] obtained an improved version of their algorithm with a running time of $M(n) = O(n \log n\, (\log\log n))$. Improvement obtained by performing the $\mathrm{FFT}$s in a suitable integer ring in which $\omega = 2$ is a primitive root of unity. Multiplications by powers of $\omega$ are very cheap! No numerical issues! [Fürer (2007)] and [De-Kurur-Saha-Saptharishi (2008)] improved the running time to $M(n) = O\big(n \log n\; 2^{O(\log^* n)}\big)$.

62 String Matching
Text: abraabracadabracadabraabara. Pattern: abracadabra.
Given a text of length $n$ and a pattern of length $m$, find all occurrences of the pattern in the text. The naïve algorithm runs in $O(mn)$ time. Several classical algorithms run in $O(m+n)$ time. [Knuth-Morris-Pratt (1977)] [Boyer-Moore (1977)]

63 More String Matching Problems
Text: abraabracadabracadabraabara. Pattern: abracadabra.
Count the number of matches/mismatches in each alignment of the pattern with the text. Find all alignments with at most $k$ mismatches. Allow a wildcard ("don't care") ($*$) that matches any (single) symbol in the pattern and/or text. "Traditional" string matching techniques are not so efficient for these extensions.

64 (Cross-)Correlation
[Figure: $x_0\ x_1\ x_2\ x_3$ against the sliding sequence $y_0\ y_1\ y_2\ y_3$.]
$z_{-3} = x_0 y_3$
$z_{-2} = x_0 y_2 + x_1 y_3$
$z_{-1} = x_0 y_1 + x_1 y_2 + x_2 y_3$
$z_0 = x_0 y_0 + x_1 y_1 + x_2 y_2 + x_3 y_3$
$z_1 = x_1 y_0 + x_2 y_1 + x_3 y_2$
$z_2 = x_2 y_0 + x_3 y_1$
$z_3 = x_3 y_0$

65 (Cross-)Correlation
A convolution without the initial reversal, with a shift of indices:
$$z_k = \sum_i x_i y_{i-k} = \sum_j x_{j+k} y_j = (\mathbf{x} * \mathbf{y}^R)_{k+m-1}.$$
If $\mathbf{x}$ is of length $n$ and $\mathbf{y}$ of length $m$, where $m \le n$, then $k = 1-m, \ldots, n-1$. Sometimes, only the values $k = 0, \ldots, n-m$, corresponding to a full overlap of $\mathbf{x}$ with a shift of $\mathbf{y}$, are of interest. The correlation of two vectors of length $n$ can be computed in $O(n \log n)$ time. Exercise: The correlation of two vectors of length $n$ and $m$, where $m \le n$, can be computed in $O(n \log m)$ time.
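The index shift in $z_k = (\mathbf{x} * \mathbf{y}^R)_{k+m-1}$ can be made concrete with a small sketch. A direct convolution is used for clarity; the returned dictionary keyed by $k$ is an illustrative choice.

```python
def convolve(x, y):
    """Ordinary convolution: z_k = sum over i + j = k of x_i * y_j."""
    z = [0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            z[i + j] += xi * yj
    return z

def correlate(x, y):
    """Return {k: z_k} for k = 1-m .. n-1, where z_k = sum_j x[j+k] * y[j]."""
    m = len(y)
    conv = convolve(x, y[::-1])  # convolve x with the reversal of y
    return {k: conv[k + m - 1] for k in range(1 - m, len(x))}

z = correlate([1, 2, 3, 4], [5, 6, 7, 8])
```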

66 Counting mismatches [Fischer-Paterson (1974)]
Let $\Sigma$ be the alphabet of the pattern and text. We may assume that $|\Sigma| \le m+1$. (Why?) For every $a \in \Sigma$ create two Boolean strings: $P_a[j] = 1$ iff $P[j] = a$, and $T_a[i] = 1$ iff $T[i] \ne a$. Correlation of $P_a$ and $T_a$ counts mismatches involving $a$. Summing over all $a \in \Sigma$ we get the total no. of mismatches. Complexity: $O(|\Sigma|\, n \log m)$ word operations. (Each word assumed to hold $\Theta(\log n)$ bits.) Fast only if $|\Sigma|$ is small.
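A sketch of the reduction, with the correlation of the two Boolean strings computed directly in $O(nm)$ time per symbol for readability; an FFT-based correlation would be substituted to reach the stated bound. The function name is hypothetical.

```python
def count_mismatches(text, pattern):
    """For each alignment k, count positions j with pattern[j] != text[k + j]."""
    n, m = len(text), len(pattern)
    total = [0] * (n - m + 1)
    for a in set(pattern):  # symbols absent from the pattern contribute nothing
        p_a = [1 if c == a else 0 for c in pattern]  # P_a[j] = 1 iff P[j] = a
        t_a = [1 if c != a else 0 for c in text]     # T_a[i] = 1 iff T[i] != a
        for k in range(n - m + 1):
            # correlation of P_a and T_a at shift k: mismatches involving a
            total[k] += sum(t_a[k + j] * p_a[j] for j in range(m))
    return total

mismatches = count_mismatches("abcab", "ab")
```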

67 Counting mismatches with wildcards [Fischer-Paterson (1974)]
For every $a \in \Sigma$ create two Boolean strings: $P_a[j] = 1$ iff $P[j] = a$, and $T_a[i] = 1$ iff $T[i] \ne a$ and $T[i] \ne *$. Complexity: $O(|\Sigma|\, n \log m)$ word operations. If we only want to find exact matches, replace each character $a \in \Sigma$ by a $\log_2 |\Sigma|$ bit string. Complexity drops to $O(\log |\Sigma| \cdot n \log m)$. Can we get rid of the dependence on $|\Sigma|$?

68 $L_2$-matching [Lipsky-Porat (2011)]
Standard string matching uses the Hamming distance. Two characters either match or they do not. $a$ is not closer to $b$ than to $z$. Suppose that each "character" is a real number. We want to find approximate matches. For each $k = 0, 1, \ldots, n-m$ we want to compute $d_k = \sum_{j=0}^{m-1} (p_j - t_{k+j})^2$. $L_2$-distance: $\|\mathbf{x} - \mathbf{y}\|_2^2 = \sum_{j=0}^{m-1} (x_j - y_j)^2$.

69 $L_2$-matching [Lipsky-Porat (2011)]
$$\sum_{j=0}^{m-1} (p_j - t_{k+j})^2 = \sum_{j=0}^{m-1} p_j^2 - 2 \sum_{j=0}^{m-1} p_j t_{k+j} + \sum_{j=0}^{m-1} t_{k+j}^2$$
First term: constant, $O(m)$ time. Second term: correlation, $O(n \log m)$ time. Third term: easy in $O(n)$ time. $L_2$-matching can be computed in $O(n \log m)$ time.
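The three-term split can be sketched as below. The correlation term is computed directly here for readability, so this version runs in $O(nm)$ time; the constant term and the prefix-sum term follow the decomposition as stated. The function name is an illustrative assumption.

```python
def l2_distances(text, pattern):
    """d_k = sum_j (pattern[j] - text[k + j])^2 for k = 0 .. n - m."""
    n, m = len(text), len(pattern)
    p_sq = sum(p * p for p in pattern)     # first term: constant, O(m)
    prefix = [0]
    for t in text:                          # prefix sums of t_i^2, O(n)
        prefix.append(prefix[-1] + t * t)
    distances = []
    for k in range(n - m + 1):
        corr = sum(pattern[j] * text[k + j] for j in range(m))  # correlation term
        window_sq = prefix[k + m] - prefix[k]                   # third term
        distances.append(p_sq - 2 * corr + window_sq)
    return distances

d = l2_distances([1, 2, 3, 4], [2, 3])
```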

70 Exact matches with wildcards
[Clifford-Clifford (2007)] Replace each character by a positive integer. Replace the wildcard by 0. For each $k = 0, 1, \ldots, n-m$ compute
$$d_k = \sum_{j=0}^{m-1} p_j\, t_{k+j}\, (p_j - t_{k+j})^2.$$
There is an exact match at position $k$ iff $d_k = 0$.

71 Exact matches with wildcards
[Clifford-Clifford (2007)]
$$d_k = \sum_{j=0}^{m-1} p_j\, t_{k+j}\, (p_j - t_{k+j})^2 = \sum_{j=0}^{m-1} p_j^3\, t_{k+j} - 2 \sum_{j=0}^{m-1} p_j^2\, t_{k+j}^2 + \sum_{j=0}^{m-1} p_j\, t_{k+j}^3$$
Compute three correlations of appropriate sequences in $O(n \log m)$ time. Running time is independent of $|\Sigma|$! This assumes that each character fits in a $\Theta(\log n)$-bit word and that operations on such words take constant time.
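A sketch of the three-correlation test. Characters are mapped to their code points (assumed positive), the wildcard `*` to 0, and the correlations are computed directly rather than by FFT. The helper names are hypothetical.

```python
def correlate(x, y, count):
    """z_k = sum_j x[k + j] * y[j] for k = 0 .. count - 1 (direct computation)."""
    return [sum(x[k + j] * y[j] for j in range(len(y))) for k in range(count)]

def wildcard_matches(text, pattern, wildcard="*"):
    """Positions k where pattern matches text, '*' matching any single symbol."""
    def encode(s):
        return [0 if c == wildcard else ord(c) for c in s]
    t, p = encode(text), encode(pattern)
    count = len(t) - len(p) + 1
    c1 = correlate(t, [v ** 3 for v in p], count)                   # sum p^3 t
    c2 = correlate([v ** 2 for v in t], [v ** 2 for v in p], count) # sum p^2 t^2
    c3 = correlate([v ** 3 for v in t], p, count)                   # sum p t^3
    return [k for k in range(count) if c1[k] - 2 * c2[k] + c3[k] == 0]

positions = wildcard_matches("abracadabra", "a*ra")
```

Every term $p_j t_{k+j} (p_j - t_{k+j})^2$ is non-negative, so the sum is zero exactly when each position either matches or involves a wildcard.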

72 Bonus material
Not covered in class this term. "Careful. We don't want to learn from this." (Calvin in Bill Watterson's "Calvin and Hobbes")

73 Continuous Fourier Transform
If $f: \mathbb{R} \to \mathbb{C}$, then its Fourier transform $\hat{f}: \mathbb{R} \to \mathbb{C}$ is:
$$\hat{f}(y) = \int_{-\infty}^{\infty} f(x)\, e^{-2\pi i x y}\, dx, \qquad f(x) = \int_{-\infty}^{\infty} \hat{f}(y)\, e^{2\pi i x y}\, dy.$$
(Some conditions apply.)

74 Fourier series
If $f: [-\pi, \pi] \to \mathbb{C}$, then its Fourier series is:
$$f(x) = \sum_{k=-\infty}^{\infty} c_k e^{ikx}, \quad \text{where} \quad c_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\, e^{-ikx}\, dx.$$
(Some conditions apply.)

75 Polynomial interpolation
Let $x_1, x_2, \ldots, x_n \in \mathbb{F}$ be distinct and let $y_1, y_2, \ldots, y_n \in \mathbb{F}$ (not necessarily distinct). There is a unique polynomial $p(x) = \sum_{i=0}^{n-1} a_i x^i$ such that $p(x_1) = y_1$, $p(x_2) = y_2$, …, $p(x_n) = y_n$, which can be found by solving the linear equations:
$$\begin{pmatrix} 1 & x_1 & \ldots & x_1^{n-1} \\ 1 & x_2 & \ldots & x_2^{n-1} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & \ldots & x_n^{n-1} \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_{n-1} \end{pmatrix} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}$$
A solution exists and is unique because the matrix, known as a Vandermonde matrix, is non-singular.

76 Vandermonde Determinant

77 Lagrange formula
Let $x_1, x_2, \ldots, x_n \in \mathbb{F}$ be distinct and let $y_1, y_2, \ldots, y_n \in \mathbb{F}$ (not necessarily distinct). The unique polynomial $p(x) = \sum_{i=0}^{n-1} a_i x^i$ such that $p(x_1) = y_1$, $p(x_2) = y_2$, …, $p(x_n) = y_n$ can be obtained as follows:
$$p(x) = \sum_{k=1}^{n} y_k \frac{\prod_{j \ne k} (x - x_j)}{\prod_{j \ne k} (x_k - x_j)}$$
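The Lagrange formula can be evaluated directly. A minimal sketch using exact rational arithmetic from the standard library; the function name and the sample points are illustrative.

```python
from fractions import Fraction

def lagrange_eval(xs, ys, x):
    """Evaluate at x the interpolating polynomial through the points (xs[k], ys[k])."""
    total = Fraction(0)
    for k, (xk, yk) in enumerate(zip(xs, ys)):
        term = Fraction(yk)
        for j, xj in enumerate(xs):
            if j != k:
                term *= Fraction(x - xj, xk - xj)  # factor (x - x_j) / (x_k - x_j)
        total += term
    return total

# The points (0, 1), (1, 3), (2, 7) lie on p(x) = x^2 + x + 1.
value = lagrange_eval([0, 1, 2], [1, 3, 7], 3)
```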

78 $\mathrm{FFT}$ decomposition
Suppose that $n = n_1 n_2$. To compute an $\mathrm{FFT}$ of $n$ numbers: Input the numbers row by row into an $n_1 \times n_2$ matrix. Do an $\mathrm{FFT}$ of dimension $n_1$ on each column. Multiply the entry in row $i$ of the $j$-th column by $\omega_n^{ij}$. Do an $\mathrm{FFT}$ of dimension $n_2$ on each row. Output the numbers in the matrix column by column. In the standard algorithm, $n_1 = n/2$ and $n_2 = 2$.

79 Rader's $\mathrm{FFT}$ algorithm
When $n$ is prime, $\mathrm{DFT}$ reduces to cyclic convolution. Let $g$ be a generator of $\mathbb{Z}_n^*$. Then $1, g, g^2, \ldots, g^{n-2}$ and $1, g^{-1}, g^{-2}, \ldots, g^{-(n-2)}$, computed mod $n$, are permutations of $1, 2, \ldots, n-1$.
$$y_j = \sum_{k=0}^{n-1} \omega_n^{jk} x_k = x_0 + \sum_{k=0}^{n-2} \omega^{j g^{-k}} x_{g^{-k}}, \qquad j = 0, 1, \ldots, n-1$$
$$y_{g^j} = x_0 + \sum_{k=0}^{n-2} \omega^{g^j g^{-k}} x_{g^{-k}} = x_0 + \sum_{k=0}^{n-2} \omega^{g^{j-k}} x_{g^{-k}}, \qquad j = 0, 1, \ldots, n-2$$
With $x'_j = x_{g^{-j}}$, $y'_j = y_{g^j} - x_0$, $w_j = \omega^{g^j}$, for $j = 0, 1, \ldots, n-2$, we get $\mathbf{y}' = \mathbf{w} \circledast \mathbf{x}'$.

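80 Rader's FFT algorithm
Example: $n = 7$, $g = 3$ (without first row and column)
\[
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \end{pmatrix}
=
\begin{pmatrix}
\omega^1 & \omega^2 & \omega^3 & \omega^4 & \omega^5 & \omega^6 \\
\omega^2 & \omega^4 & \omega^6 & \omega^1 & \omega^3 & \omega^5 \\
\omega^3 & \omega^6 & \omega^2 & \omega^5 & \omega^1 & \omega^4 \\
\omega^4 & \omega^1 & \omega^5 & \omega^2 & \omega^6 & \omega^3 \\
\omega^5 & \omega^3 & \omega^1 & \omega^6 & \omega^4 & \omega^2 \\
\omega^6 & \omega^5 & \omega^4 & \omega^3 & \omega^2 & \omega^1
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \end{pmatrix}
\]
Reordering the rows by $g^j$ and the columns by $g^{-k}$ gives a circulant matrix:
\[
\begin{pmatrix} y_1 \\ y_3 \\ y_2 \\ y_6 \\ y_4 \\ y_5 \end{pmatrix}
=
\begin{pmatrix}
\omega^1 & \omega^5 & \omega^4 & \omega^6 & \omega^2 & \omega^3 \\
\omega^3 & \omega^1 & \omega^5 & \omega^4 & \omega^6 & \omega^2 \\
\omega^2 & \omega^3 & \omega^1 & \omega^5 & \omega^4 & \omega^6 \\
\omega^6 & \omega^2 & \omega^3 & \omega^1 & \omega^5 & \omega^4 \\
\omega^4 & \omega^6 & \omega^2 & \omega^3 & \omega^1 & \omega^5 \\
\omega^5 & \omega^4 & \omega^6 & \omega^2 & \omega^3 & \omega^1
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_5 \\ x_4 \\ x_6 \\ x_2 \\ x_3 \end{pmatrix}
\]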
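The reduction to a cyclic convolution can be sketched as follows. This is only an illustration under stated assumptions: the cyclic convolution is computed naively in quadratic time rather than by padded FFTs, `g` is assumed to be a generator of $\mathbb{Z}_n^*$, and the names `rader_dft` and `naive_dft` are hypothetical.

```python
import cmath

def naive_dft(x):
    """Reference DFT: y_j = sum_k w^(jk) x_k with w = e^(2*pi*i/n)."""
    n = len(x)
    return [sum(cmath.exp(2j * cmath.pi * j * k / n) * x[k] for k in range(n))
            for j in range(n)]

def rader_dft(x, g):
    """DFT of prime length n via a cyclic convolution of length n-1."""
    n = len(x)
    w = cmath.exp(2j * cmath.pi / n)
    g_inv = pow(g, -1, n)                             # modular inverse (Python 3.8+)
    pw = [pow(g, j, n) for j in range(n - 1)]         # g^j mod n
    ipw = [pow(g_inv, j, n) for j in range(n - 1)]    # g^(-j) mod n
    xp = [x[ipw[k]] for k in range(n - 1)]            # x'_k = x_{g^-k}
    wv = [w ** pw[j] for j in range(n - 1)]           # w_j = omega^(g^j)
    # y'_j = sum_k w_{(j-k) mod (n-1)} x'_k, a cyclic convolution (naive here).
    yp = [sum(wv[(j - k) % (n - 1)] * xp[k] for k in range(n - 1))
          for j in range(n - 1)]
    y = [0] * n
    y[0] = sum(x)                                     # row 0 of the DFT matrix is all ones
    for j in range(n - 1):
        y[pw[j]] = x[0] + yp[j]                       # y_{g^j} = x_0 + y'_j
    return y
```

With `n = 7` and `g = 3`, the index lists `pw` and `ipw` are `[1, 3, 2, 6, 4, 5]` and `[1, 5, 4, 6, 2, 3]`, matching the row and column orders of the example.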

81 Rader's FFT algorithm
When $n$ is prime, DFT reduces to cyclic convolution. Cyclic convolution can be computed using FFTs of a larger size, e.g., a power of 2, by padding. We thus get an $O(n \log n)$ algorithm for computing a DFT of size $n$ when $n$ is prime. When $n$ is composite, we can decompose the problem. The end result is an $O(n \log n)$ algorithm for computing a DFT for any $n$. When $n = 2^k$, the algorithm is most efficient.

82 Negative Cyclic Convolution
Non-cyclic convolution:
\[ z_k = \sum_{i+j=k} x_i y_j \]
Cyclic convolution:
\[ z_k = \sum_{i+j \equiv k \ (\mathrm{mod}\ n)} x_i y_j = \sum_{i+j=k} x_i y_j + \sum_{i+j=n+k} x_i y_j \]
Negative cyclic convolution:
\[ z_k = \sum_{i+j=k} x_i y_j - \sum_{i+j=n+k} x_i y_j \]

83 Negative Cyclic Convolution
Polynomial multiplication modulo $x^n + 1$. A naïve way of computing the negative cyclic convolution is to first compute the non-cyclic convolution. Let $\omega$ be an $n$-th primitive root of unity. Let $\psi$ be such that $\psi^2 = \omega$.
\[ \boldsymbol{\psi} = (1, \psi, \ldots, \psi^{n-1}), \qquad \boldsymbol{\psi}^{-1} = (1, \psi^{-1}, \ldots, \psi^{-(n-1)}) \]
The negative cyclic convolution of $\mathbf{x}$ and $\mathbf{y}$ is:
\[ \boldsymbol{\psi}^{-1} \cdot \mathrm{DFT}^{-1}\bigl( \mathrm{DFT}(\boldsymbol{\psi} \cdot \mathbf{x}) \cdot \mathrm{DFT}(\boldsymbol{\psi} \cdot \mathbf{y}) \bigr) \]
where $\cdot$ denotes the componentwise product. This saves a factor of 2 and plays an important role in the modular Schönhage-Strassen algorithm.
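A small numerical sketch of the weighting trick over the complex numbers. It assumes $\psi = e^{\pi i/n}$, so that $\psi^2 = \omega$ and $\psi^n = -1$, and uses naive DFTs for brevity. The function names are illustrative.

```python
import cmath

def dft(x, sign=1):
    """Naive DFT with w = e^(sign*2*pi*i/n); sign=-1 gives the unnormalized inverse."""
    n = len(x)
    return [sum(cmath.exp(sign * 2j * cmath.pi * j * k / n) * x[k] for k in range(n))
            for j in range(n)]

def inverse_dft(y):
    n = len(y)
    return [v / n for v in dft(y, sign=-1)]

def negacyclic_direct(x, y):
    """z_k = sum_{i+j=k} x_i y_j - sum_{i+j=n+k} x_i y_j, straight from the definition."""
    n = len(x)
    z = [0] * n
    for i in range(n):
        for j in range(n):
            if i + j < n:
                z[i + j] += x[i] * y[j]
            else:
                z[i + j - n] -= x[i] * y[j]
    return z

def negacyclic_weighted(x, y):
    """Same result via psi-weighting and a length-n cyclic convolution."""
    n = len(x)
    psi = cmath.exp(1j * cmath.pi / n)          # psi^2 = omega, psi^n = -1
    fx = dft([psi ** k * x[k] for k in range(n)])
    fy = dft([psi ** k * y[k] for k in range(n)])
    z = inverse_dft([a * b for a, b in zip(fx, fy)])
    return [psi ** (-k) * z[k] for k in range(n)]
```

For `x = [1, 2, 3, 4]` and `y = [5, 6, 7, 8]` both functions give the coefficients of the product modulo $x^4 + 1$, up to rounding in the weighted version. The weighted version only needs transforms of length $n$ instead of $2n$.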

84 DFT and FFT in rings
To define DFT and $\mathrm{DFT}^{-1}$ over a ring we need a primitive $n$-th root of unity $\omega$. If DFT and $\mathrm{DFT}^{-1}$ are well defined, then the FFT algorithm can be used to compute them. An element $\omega$ is a primitive $n$-th root of unity in $R$ iff:
(i) $\omega^n = 1$,
(ii) $\sum_{j=0}^{n-1} \omega^{jk} = 0$, for $k = 1, \ldots, n-1$,
(iii) $n$ has an inverse in $R$.
In $\mathbb{C}$, $\omega_n = e^{2\pi i/n}$ is a primitive $n$-th root of unity. So is $\omega_n^j = e^{2\pi i j/n}$, if $j$ is relatively prime to $n$. In $\mathbb{R}$, there is such a root only for $n = 2$… Are there other useful rings in which DFT can be performed?

85 Rings
A (commutative) ring $(R, \cdot, +)$ is a set $R$ with two binary operations $\cdot, + : R \times R \to R$ such that:
$(R, +)$ is an abelian group:
$\forall a, b \in R .\ a + b = b + a$
$\forall a, b, c \in R .\ a + (b + c) = (a + b) + c$
$\exists\, 0 \in R .\ \forall a \in R .\ a + 0 = a$
$\forall a \in R .\ \exists\, {-a} \in R .\ a + (-a) = 0$
$(R, \cdot)$ is a commutative monoid:
$\forall a, b \in R .\ a \cdot b = b \cdot a$
$\forall a, b, c \in R .\ a \cdot (b \cdot c) = (a \cdot b) \cdot c$
$\exists\, 1 \in R .\ \forall a \in R .\ a \cdot 1 = a$
Distributive law:
$\forall a, b, c \in R .\ a \cdot (b + c) = a \cdot b + a \cdot c$

86 Rings and Fields
A (commutative) ring $(R, \cdot, +)$ is a field if there are also multiplicative inverses: $(R \setminus \{0\}, \cdot)$ is an abelian group, not just a monoid:
$\forall a \in R \setminus \{0\} .\ \exists\, a^{-1} \in R .\ a \cdot a^{-1} = 1$
Examples:
$(\mathbb{N}, \cdot, +)$ - The natural numbers do not form a ring.
$(\mathbb{Z}, \cdot, +)$ - The integers form a ring, but not a field.
$(\mathbb{R}, \cdot, +)$ - The real numbers form a field.
$(\mathbb{C}, \cdot, +)$ - The complex numbers form a field.
More examples:
$(\mathbb{Z}_m, \cdot, +)$ - The integers modulo $m$ (see next slide).
$(R[x], \cdot, +)$ - The ring of polynomials (in $x$) over a ring.
…

87 Modular arithmetic
$(\mathbb{Z}_m = \{0, 1, \ldots, m-1\}, \cdot, +)$
Addition and multiplication are performed modulo $m$. For example, if $m = 12$, then $7 + 6 = 1$ and $4 \cdot 3 = 0$.
Lemma: $\mathbb{Z}_m$ is a ring, for every integer $m$.
Theorem: $\mathbb{Z}_m$ is a field if and only if $m$ is prime. For example, if $m = 17$, then $5^{-1} = 7$. (Why?)
Addition and multiplication modulo $m$ can be performed using addition, multiplication and division of numbers up to $m^2$.

88 Generators of prime fields
Theorem: If $p$ is prime, then $\mathbb{Z}_p$ has a generator, i.e., an element $g$ such that $g^{p-1} = 1$ but $g^i \ne 1$, for $i = 1, 2, \ldots, p-2$.
(Fermat's Little Theorem: If $p$ is prime, then for every $a \ne 0$ in $\mathbb{Z}_p$ we have $a^{p-1} = 1$.)
2 is not a generator of $\mathbb{Z}_{17}$, as $2^8 \equiv 1 \pmod{17}$. But $g = 3$ is a generator of $\mathbb{Z}_{17}$: $g^i$, for $i = 1, \ldots, 16$, evaluates to 3, 9, 10, 13, 5, 15, 11, 16, 14, 8, 7, 4, 12, 2, 6, 1.
Lemma: If $p$ is prime and $n \mid p-1$ and $k = (p-1)/n$, then $\omega = g^k$ is a primitive $n$-th root of unity in $\mathbb{Z}_p$.
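The statements about $\mathbb{Z}_{17}$ are easy to check with Python's built-in three-argument `pow`. The helper names `is_generator` and `primitive_root_of_unity` are illustrative, and the brute-force generator test is only sensible for small primes.

```python
def is_generator(g, p):
    """True if g^1, ..., g^(p-1) mod p hit every nonzero residue (p assumed prime)."""
    return len({pow(g, i, p) for i in range(1, p)}) == p - 1

def primitive_root_of_unity(g, p, n):
    """omega = g^((p-1)/n) mod p, assuming p is prime, g a generator, and n | p-1."""
    assert (p - 1) % n == 0
    return pow(g, (p - 1) // n, p)

p = 17
powers_of_3 = [pow(3, i, p) for i in range(1, p)]   # 3, 9, 10, 13, ..., 6, 1
omega = primitive_root_of_unity(3, p, 8)            # 3^2 mod 17
inverse_of_5 = pow(5, -1, p)                        # modular inverse (Python 3.8+)
```

Here `omega` is a primitive 8th root of unity in $\mathbb{Z}_{17}$: its 8th power is 1 while its 4th power is $-1 \equiv 16$.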

89 FFT in prime fields
Example: Multiply two integer polynomials of degree $< 512$. We need to compute FFT and $\mathrm{FFT}^{-1}$ with $n = 1024$. Find a prime $p$ such that $1024 \mid p-1$. For example, $p = 12 \cdot 1024 + 1 = 12{,}289$. But we will get the coefficients modulo 12,289… Suppose the input coefficients are in the range $0, 1, \ldots, 1023$. The output coefficients are then at most $512 \cdot 1023^2 < 2^{29}$. Find a prime $p$ larger than this bound such that $1024 \mid p-1$. For example, $p = 1{,}048{,}584 \cdot 1024 + 1 = 1{,}073{,}750{,}017$. (Still fits in one 32-bit word.) We can take $g = 5$, and $\omega = g^{(p-1)/1024} = 381{,}780{,}781$.
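A minimal sketch of polynomial multiplication with an FFT over $\mathbb{Z}_p$. It uses the smaller prime $p = 12{,}289$ from the example and finds a root of unity by brute-force search instead of a known generator; the names `ntt`, `find_root`, and `poly_mul_mod_p` are illustrative. The result is only correct as an integer product when all true coefficients are below $p$.

```python
def ntt(a, omega, p):
    """Recursive radix-2 FFT over Z_p. len(a) must be a power of 2 and
    omega a primitive len(a)-th root of unity modulo the prime p."""
    n = len(a)
    if n == 1:
        return a[:]
    even = ntt(a[0::2], omega * omega % p, p)
    odd = ntt(a[1::2], omega * omega % p, p)
    out = [0] * n
    w = 1
    for j in range(n // 2):
        t = w * odd[j] % p
        out[j] = (even[j] + t) % p
        out[j + n // 2] = (even[j] - t) % p     # uses omega^(n/2) = -1
        w = w * omega % p
    return out

def find_root(n, p):
    """Search for a primitive n-th root of unity mod a prime p, where n >= 2 is a
    power of 2 dividing p-1: omega = c^((p-1)/n) works when omega^(n/2) = -1."""
    for c in range(2, p):
        omega = pow(c, (p - 1) // n, p)
        if pow(omega, n // 2, p) == p - 1:
            return omega
    raise ValueError("no primitive root of unity found")

def poly_mul_mod_p(f, g, p):
    """Coefficients of f*g reduced mod p, via FFT, pointwise product, inverse FFT."""
    n = 2
    while n < len(f) + len(g) - 1:
        n *= 2
    omega = find_root(n, p)
    fa = ntt(f + [0] * (n - len(f)), omega, p)
    ga = ntt(g + [0] * (n - len(g)), omega, p)
    prod = [x * y % p for x, y in zip(fa, ga)]
    inv = ntt(prod, pow(omega, -1, p), p)       # inverse transform uses omega^-1
    n_inv = pow(n, -1, p)                       # and a final division by n
    return [v * n_inv % p for v in inv][:len(f) + len(g) - 1]
```

For instance, `poly_mul_mod_p([1, 2, 3], [4, 5], 12289)` returns the coefficients of $(1 + 2x + 3x^2)(4 + 5x)$, since they are all smaller than the modulus.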

90 FFT in prime fields
Example: Multiply two integer polynomials of degree $< 512$. What do we gain by working modulo 1,073,750,017 instead of working over the complex numbers? Modular arithmetic may be more "elegant". We don't have to worry about numerical errors. But modular arithmetic is not necessarily faster than working with floating point complex numbers. We need to find appropriate prime numbers and generators. The prime number theorem: The number of prime numbers less than $n$ is about $n / \ln n$.

91 FFT using modular arithmetic
To support DFT and FFT a ring does not have to be a field. The main advantage of using modular arithmetic comes from choosing rings with very special primitive roots of unity.
Lemma: Let $n$ and $\omega$ be positive powers of 2. Then $\omega$ is a primitive $n$-th root of unity in $\mathbb{Z}_m$, where $m = \omega^{n/2} + 1$.
$\omega$ is a power of 2. $m$ is one more than a power of 2. Multiplications by $\omega^k$ or $\omega^{-k} = \omega^{n-k}$ are just shifts! Mod $m$ is a very simple operation:
\[ m - 1 = \omega^{n/2} = 2^b, \quad b = (n/2) \log \omega, \qquad x = x_1 2^b + x_0 \ \Longrightarrow\ x \bmod m = x_0 - x_1 \]
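A short sketch of why arithmetic modulo $m = 2^b + 1$ is cheap: since $2^b \equiv -1 \pmod m$, a number can be reduced by splitting it into $b$-bit chunks and adding them with alternating signs. The function name `mod_fermat` and the choice $n = 8$, $\omega = 4$, $m = 257$ are illustrative assumptions.

```python
def mod_fermat(x, b):
    """Reduce a nonnegative integer x modulo m = 2^b + 1 using shifts and masks."""
    m = (1 << b) + 1
    mask = (1 << b) - 1
    r, sign = 0, 1
    while x:
        r += sign * (x & mask)   # b-bit chunks alternate in sign because 2^b = -1 (mod m)
        sign = -sign
        x >>= b
    return r % m                 # r is small; this only folds it into [0, m)

def mul_by_omega_power(x, k, s, b):
    """Multiply x by omega^k modulo 2^b + 1, where omega = 2^s: a left shift by s*k bits."""
    return mod_fermat(x << (s * k), b)

# Lemma instance: n = 8, omega = 4 = 2^2, so m = 4^4 + 1 = 257 and b = (8/2) * 2 = 8.
n, s, b = 8, 2, 8
m = (1 << b) + 1
```

With these values, `mul_by_omega_power(x, k, s, b)` agrees with `x * 4**k % 257`, and no general multiplication or division by `m` is needed apart from the final small correction.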

92 Schönhage-Strassen's algorithm
FFT performs $O(n \log n)$ arithmetical operations. However, they are all either additions or multiplications by $\omega^k$. To compute a convolution, we only need $n$ multiplications, other than multiplications by $\omega^k$. Break two $n$-bit integers into $n_1$ blocks of $n_2$ bits each.
\[ M(n) = O(n_1 \log n_1) \times M(O(n_2)) \]
If multiplications by $\omega^k$ are essentially additions, then
\[ M(n) = O(n_1 \log n_1)\, O(n_2) + n_1\, M(O(n_2)) \]
There are some technical problems to overcome. We have to choose $n_1 = \sqrt{n}$ rather than $n_1 = n / \log n$. The end result is an integer multiplication algorithm that performs only $O(n \log n \log\log n)$ bit operations.

93 Modular Schönhage-Strassen
Let $u, v$ be two $n$-bit integers. The algorithm computes $uv \bmod (2^n + 1)$.
\[ n = 2^k, \quad b = 2^{k/2}, \quad l = n/b \]
Break $u, v$ into $b$ $l$-bit blocks. Note that $b \mid l$.
\[ u = u_{b-1} 2^{(b-1)l} + \ldots + u_1 2^l + u_0 \]
\[ v = v_{b-1} 2^{(b-1)l} + \ldots + v_1 2^l + v_0 \]
\[ uv = y_{2b-2} 2^{(2b-2)l} + \ldots + y_1 2^l + y_0 \]
\[ uv \bmod (2^n + 1) = w_{b-1} 2^{(b-1)l} + \ldots + w_1 2^l + w_0 \]
$w_i = y_i - y_{b+i}$ is the negative cyclic convolution of the blocks of $u$ and $v$.
\[ -(b-1-i)\, 2^{2l} < w_i < (i+1)\, 2^{2l} \]
Enough to compute each $w_i$ modulo $2^{2l} + 1$ and modulo $b$.
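The block identity can be checked directly on small inputs. The sketch below computes the negative cyclic convolution of the blocks naively, so it illustrates only the structure $uv \bmod (2^n + 1) = \sum_i w_i 2^{il}$ and not the fast algorithm. All function names are illustrative, and $k$ is rounded down when halved.

```python
def split_blocks(u, b, l):
    """Split u into b blocks of l bits each, least significant block first."""
    mask = (1 << l) - 1
    return [(u >> (i * l)) & mask for i in range(b)]

def negacyclic(us, vs):
    """w_i = y_i - y_{b+i}: negative cyclic convolution of two length-b sequences."""
    b = len(us)
    w = [0] * b
    for i in range(b):
        for j in range(b):
            if i + j < b:
                w[i + j] += us[i] * vs[j]
            else:
                w[i + j - b] -= us[i] * vs[j]   # 2^(b*l) = 2^n = -1 (mod 2^n + 1)
    return w

def mul_mod_fermat(u, v, k):
    """u*v mod (2^n + 1) for n = 2^k and 0 <= u, v < 2^n, via the blocks' convolution."""
    n = 1 << k
    b = 1 << (k // 2)
    l = n // b
    w = negacyclic(split_blocks(u, b, l), split_blocks(v, b, l))
    return sum(wi << (i * l) for i, wi in enumerate(w)) % ((1 << n) + 1)
```

For example, with `k = 4` (so `n = 16`, `b = 4`, `l = 4`), `mul_mod_fermat(0xBEEF, 0x1234, 4)` equals `(0xBEEF * 0x1234) % (2**16 + 1)`.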

94 Modular Schönhage-Strassen
$w_i = y_i - y_{b+i}$ is the negative cyclic convolution of the blocks of $u$ and $v$. Enough to compute each $w_i$ modulo $2^{2l} + 1$ and modulo $b$.
To compute $w_i \bmod b$, pack the $u_i, v_i$ into two large integers, with some padding between consecutive digits, and perform one integer product using Karatsuba's algorithm. As $b = O(\sqrt{n})$, the cost of this step is $O(n)$.
To compute $w_i \bmod (2^{2l} + 1)$, compute a negative cyclic convolution modulo $m = 2^{2l} + 1$ using FFT and $\mathrm{FFT}^{-1}$.
\[ \omega = 2^{4l/b}, \quad \omega^{b/2} = 2^{2l} \equiv -1, \quad \omega^b \equiv 1 \pmod{m} \]
$\omega$ is a $b$-th primitive root of unity in $\mathbb{Z}_m$.

95 Modular Schönhage-Strassen
\[ M(n) \le c\, n \log n + b\, M(2l) \]
\[ M'(n) = M(n)/n \]
\[ M'(n) \le c \log n + 2 M'(4\sqrt{n}) \]
\[ n_0 = n, \quad n_k = 4\sqrt{n_{k-1}} \ \Longrightarrow\ n_k \le 16\, n^{2^{-k}} \]
\[ M'(n) \le c\,(\log n + 2 \log n_1 + 4 \log n_2 + \ldots) \]
\[ M'(n) \le c \sum_{k=0}^{\log\log n - 1} 2^k \log n_k \approx c \sum_{k=0}^{\log\log n - 1} \log n = c \log n \log\log n \]
\[ M(n) \approx c\, n \log n \log\log n \]

