Presentation is loading. Please wait.

Presentation is loading. Please wait.

September 4, 1997 Applied Symbolic Computation (CS 300) Fast Polynomial and Integer Multiplication Jeremy R. Johnson.

Similar presentations


Presentation on theme: "September 4, 1997 Applied Symbolic Computation (CS 300) Fast Polynomial and Integer Multiplication Jeremy R. Johnson."— Presentation transcript:

1 September 4, 1997 Applied Symbolic Computation (CS 300) Fast Polynomial and Integer Multiplication Jeremy R. Johnson

2 September 4, 1997 Introduction Objective: To obtain fast algorithms for polynomial and integer multiplication based on the FFT. In order to do this we will compute the FFT over a finite field. The existence of FFTs over Zp is related to the primitive element theorem and the prime number theorem. Polynomial multiplication using interpolation Feasibility of mod p FFTs Fast polynomial multiplication Fast integer multiplication (3 primes algorithm) References: Lipson, Cormen et al.

3 Polynomial Multiplication using Interpolation
Compute C(x) = A(x)B(x), where degree(A(x)) = m, and degree(B(x)) = n. Degree(C(x)) = m+n, and C(x) is uniquely determined by its value at m+n+1 distinct points. [Evaluation] Compute A(i) and B(i) for distinct i, i=0,…,m+n. [Pointwise Product] Compute C(i) = A(i)*B(i) for i=0,…,m+n. [Interpolation] Compute the coefficients of C(x) = cnxm+n + … + c1x +c0 from the points C(i) = A(i)*B(i) for i=0,…,m+n.

4 Primitive Element Theorem
Theorem. Let F be a finite field with q = pk elements. Let F* be the q-1 non-zero elements of F. Then F* = <> = {1, , 2, …, q-2} for some   F*.  is called a primitive element. In particular there exist a primitive element for Zp for all prime p. E.G. (Z5)* = {1, 2,22=4,23=3} (Z17)* = {1, 3, 32 = 9, 33 = 10, 34 = 13, 35 = 5, 36 = 15, 37 = 11, 38 = 16, 39 = 14, 310 = 8, 311 = 7, 312 = 4, 313 = 12, 314 = 2, 315 = 6}

5 Modular Discrete Fourier Transform
The n-point DFT is defined over Zp if there is a primitive n-th root of unity in Zp (same is true for any finite field) Let  be a primitive nth root of unity.

6 Example

7 Evaluation Utilizing Symmetry
The cost of evaluation at two points can be reduced if one is the negative of the other. Let A(x) = A1(x2)x + A0(x2), where the coefficients of A1(x) are the odd coefficients of A(x) and the coefficients of A0(x) are the even coefficients of A(x) Since (-)2 = 2, A0( 2) = A0(- 2) and A1( 2) = A1(- 2) A() = A0( 2) +  A1( 2) A(-) = A0( 2) -  A1( 2) Example A(x) = a5x5 + a4x4 + a3x3 + a2x2 + a1x + a0 = A1(x2)x + A0(x2) A1(x) = a5x2 + a3x + a1 A0(x) = a4x2 + a2x + a0

8 Properties of Roots of Unity
Lemma: Let n = 2m, and  be a primitive nth root of unity. Then 2 is a primitive mth root of unity. Lemma: Let n = 2m, and  be a primitive nth root of unity. Then m = -1 and m+k = -k. Therefore, F2m is a Vandermonde matrix where half the points are negatives of the other half. Thus, we can utilize the previous factorization to compute the DFT. Moreover, if n=2k this property can be used recursively.

9 FFT Input: N = 2k, a = (a0,a1,…,aN-1) = a0+a1x + … + aN-1xN-1 Output: A = (A0,A1,…,AN-1) with Ak = a(k) A  FFT(N,a,) if N = 1 then A := a; else n := N/2; b := (a0,a2,…,a2(n-1)); # a(x) = xc(x2) + b(x2) c := (a1,a3,…,a2n-1); B := FFT(n,b, 2); # Bk = b(k2) C := FFT(n,c, 2); # Ck = c(k2) for k from 0 to n-1 do Ak := Bk + k × Ck; # Ak = b(k2) + kc(k2) = a(k) An+k := Bk - k × Ck; # An+k = b(k2) - kc(k2) = a(-k) = a(k+n) end do; end if; end proc;

10 Computing Time Let T(n), n=2k, be the computing time of the FFT then
September 4, 1997 Computing Time Let T(n), n=2k, be the computing time of the FFT then T(n) = 2T(n/2) + (n) T(n) = (nlogn) Master Theorem. Assume T(n) = aT(n/b) + (n) T(n) = (n) [a < b] T(n) = (nlog(n)) [a = b] T(n) = (nlogb(a)) [a > b]

11 Polynomial Multiplication using Interpolation
Compute C(x) = A(x)B(x), where degree(A(x)) = m, and degree(B(x)) = n. Degree(C(x)) = m+n, and C(x) is uniquely determined by its value at m+n+1 distinct points. [Evaluation] Compute A(i) and B(i) for distinct i, i=0,…,m+n. [Pointwise Product] Compute C(i) = A(i)*B(i) for i=0,…,m+n. [Interpolation] Compute the coefficients of C(x) = cnxm+n + … + c1x +c0 from the points C(i) = A(i)*B(i) for i=0,…,m+n.

12 Matrix Version of Polynomial Evaluation
September 4, 1997 Matrix Version of Polynomial Evaluation Evaluate A(x) = a3x3 + a2x2 + a1x + a0 at the points , , ,  eval

13 Matrix Interpretation of Interpolation
September 4, 1997 Matrix Interpretation of Interpolation Let A(x) = anxn + … + a1x +a0 be a polynomial of degree n. Determine the (n+1) coefficients an,…,a1,a0 from the (n+1) values A(0),…,A(n) interpolate

14 V(0,…, n) is non-singular when 0,…, n are distinct.
September 4, 1997 Vandermonde Matrix V(0,…, n) is non-singular when 0,…, n are distinct.

15 Inverse DFT

16 Example

17 Feasibility of mod p FFTs
Theorem: Zp has a primitive Nth root of unity iff N|(p-1) Proof. By the primitive element theorem there exist an element  of order (p-1) Zp. If p-1 = qN, then q is an Nth root of unity. To compute a mod p FFT of size 2m, we must find p = 2e k + 1 (k odd), where e  m. Theorem. Let a and b be relatively prime integers. The number of primes  x in the arithmetic progression ak + b (k=1,2,…) is approximately (somewhat greater) (x/log x)/(a)

18 Fast Polynomial Multiplication
Compute C(x) = a(x)b(x), where degree(a(x)) = m, and degree(b(x)) = n. Degree(c(x)) = m+n, and c(x) is uniquely determined by its value at m+n+1 distinct points. Let N = 2k  m+n+1. [Fourier Evaluation] Compute FFT(N,a(x),,A); FFT(N,b(x),,B). [Pointwise Product] Compute Ck = 1/N Ak * Bk, k=0,…,N-1. [Fourier Interpolation] Compute FFT(N,C,-1,c(x)).

19 Fast Integer Multiplication
Let A = (an-1,…,a1,a0 ) = an-1n-1 + … + a1 + a0 B = (bn-1,…,b1,b0 ) = bn-1n-1 + … + b1 + b0 C = AB = c() = a()b(), where a(x) = an-1xn-1 + … + a1x + a0, b(x) = bn-1xn-1 + … + b1x + b0, and c(x) = a(x)b(x). Idea: Compute a(x)b(x) using FFT-based polynomial multiplication and then evaluate the result at . Computation will be performed mod p for several word sized “Fourier” primes and the Chinese Remainder Theorem will be used to recover the integer product.

20 Three Primes Algorithm
Compute C = AB, where length(A) = m, and length(B) = n. Let a(x) and b(x) be the polynomials whose coefficients are the digits of A and B respectively The algorithm requires K “Fourier primes” p = 2e k + 1 for sufficiently large e [Polynomial multiplication] Compute ci(x) = a(x)b(x) mod pi for i=1,…,K using FFT-based polynomial multiplication. [CRT] Compute c(x)  ci(x) (mod pi) i=1,…,K. [Evaluation at radix] C = c().

21 Analysis of Three Primes Algorithm
Determine K Since the kth coefficient of c(x), , the coefficients of c(x) are bounded by n2 Therefore, we need the product p1 …pK  n2 If we choose pi > , then this is true if K  n2 Assuming n <  [ is typically wordsize - for 32-bit words,   109], only 3 primes are required Theorem. Assume that mod p operations can be performed in O(1) time. Then the 3-primes algorithm can multiply two n-digit numbers in time O(nlogn) provided: n <  n  2E-1, where three Fourier primes p = 2ek + 1 (p > ) can be found with e  E (need to perform the FFT of size 2n)

22 Limitations of 3 Primes Algorithms
If we choose the primes to be wordsize for 32-bit words  < pi < W = 231-1  = 109 n  2E-1 = 223 8.38  106


Download ppt "September 4, 1997 Applied Symbolic Computation (CS 300) Fast Polynomial and Integer Multiplication Jeremy R. Johnson."

Similar presentations


Ads by Google