September 4, 1997 Applied Symbolic Computation (CS 300) Fast Polynomial and Integer Multiplication Jeremy R. Johnson.

September 4, 1997 Applied Symbolic Computation (CS 300) Fast Polynomial and Integer Multiplication Jeremy R. Johnson

September 4, 1997 Introduction Objective: To obtain fast algorithms for polynomial and integer multiplication based on the FFT. In order to do this we will compute the FFT over a finite field. The existence of FFTs over Zp is related to the primitive element theorem and the prime number theorem. Polynomial multiplication using interpolation Feasibility of mod p FFTs Fast polynomial multiplication Fast integer multiplication (3 primes algorithm) References: Lipson, Cormen et al.

Polynomial Multiplication using Interpolation
Compute C(x) = A(x)B(x), where degree(A(x)) = m, and degree(B(x)) = n. Degree(C(x)) = m+n, and C(x) is uniquely determined by its value at m+n+1 distinct points. [Evaluation] Compute A(i) and B(i) for distinct i, i=0,…,m+n. [Pointwise Product] Compute C(i) = A(i)*B(i) for i=0,…,m+n. [Interpolation] Compute the coefficients of C(x) = cnxm+n + … + c1x +c0 from the points C(i) = A(i)*B(i) for i=0,…,m+n.

Primitive Element Theorem
Theorem. Let F be a finite field with q = pk elements. Let F* be the q-1 non-zero elements of F. Then F* = <> = {1, , 2, …, q-2} for some   F*.  is called a primitive element. In particular there exist a primitive element for Zp for all prime p. E.G. (Z5)* = {1, 2,22=4,23=3} (Z17)* = {1, 3, 32 = 9, 33 = 10, 34 = 13, 35 = 5, 36 = 15, 37 = 11, 38 = 16, 39 = 14, 310 = 8, 311 = 7, 312 = 4, 313 = 12, 314 = 2, 315 = 6}

Modular Discrete Fourier Transform
The n-point DFT is defined over Zp if there is a primitive n-th root of unity in Zp (same is true for any finite field) Let  be a primitive nth root of unity.

Example

Evaluation Utilizing Symmetry
The cost of evaluation at two points can be reduced if one is the negative of the other. Let A(x) = A1(x2)x + A0(x2), where the coefficients of A1(x) are the odd coefficients of A(x) and the coefficients of A0(x) are the even coefficients of A(x) Since (-)2 = 2, A0( 2) = A0(- 2) and A1( 2) = A1(- 2) A() = A0( 2) +  A1( 2) A(-) = A0( 2) -  A1( 2) Example A(x) = a5x5 + a4x4 + a3x3 + a2x2 + a1x + a0 = A1(x2)x + A0(x2) A1(x) = a5x2 + a3x + a1 A0(x) = a4x2 + a2x + a0 Applied Symbolic Computation

Properties of Roots of Unity
Lemma: -1= . Lemma: Let n = 2m, and  be a primitive nth root of unity. Then 2 is a primitive mth root of unity. Lemma: Let n = 2m, and  be a primitive nth root of unity. Then m = -1 and m+k = -k. Therefore, F2m is a Vandermonde matrix where half the points are negatives of the other half. Thus, we can utilize the previous factorization to compute the DFT. Moreover, if n=2k this property can be used recursively. Applied Symbolic Computation

Applied Symbolic Computation
FFT Input: N = 2k, a = (a0,a1,…,aN-1) = a0+a1x + … + aN-1xN-1 Output: A = (A0,A1,…,AN-1) with Ak = a(k) A  FFT(N,a,) if N = 1 then A := a; else n := N/2; b := (a0,a2,…,a2(n-1)); # a(x) = xc(x2) + b(x2) c := (a1,a3,…,a2n-1); B := FFT(n,b, 2); # Bk = b(k2) C := FFT(n,c, 2); # Ck = c(k2) for k from 0 to n-1 do Ak := Bk + k × Ck; # Ak = b(k2) + kc(k2) = a(k) An+k := Bk - k × Ck; # An+k = b(k2) - kc(k2) = a(-k) = a(k+n) end do; end if; end proc; Applied Symbolic Computation

Fast Fourier Transform
Assume that n = 2m, then F 𝟐𝒎 = F 𝟐 ⨂ I 𝒎 I 𝒎 ⨁ W 𝒎 I 𝟐 ⨂ F 𝒎 L 𝟐 𝟐𝒎 Where I 𝒎 is the identity matrix, L 𝟐 𝟐𝒎 is the permutation that gathers the input at stride 2 𝑨⨂𝑩= 𝑨 𝒊𝒋 𝑩 , 𝑨⨁𝑩=diag(𝑨,𝑩), W 𝒎 =diag 𝟏,𝝎,…, 𝝎 𝒎−𝟏 Let T(n), n=2k, be the computing time of the FFT then T(n) = 2T(n/2) + (n) T(n) = (nlogn)

FFT Factorization over Z5

Inverse DFT

Example

Feasibility of mod p FFTs
Theorem: Zp has a primitive Nth root of unity iff N|(p-1) Proof. By the primitive element theorem there exist an element  of order (p-1) Zp. If p-1 = qN, then q is an Nth root of unity. To compute a mod p FFT of size 2m, we must find p = 2e k + 1 (k odd), where e  m. Theorem. Let a and b be relatively prime integers. The number of primes  x in the arithmetic progression ak + b (k=1,2,…) is approximately (somewhat greater) (x/log x)/(a)

Fast Polynomial Multiplication
Compute C(x) = a(x)b(x), where degree(a(x)) = m, and degree(b(x)) = n. Degree(c(x)) = m+n, and c(x) is uniquely determined by its value at m+n+1 distinct points. Let N  m+n+1. [Fourier Evaluation] Compute FFT(N,a(x),,A); FFT(N,b(x),,B). [Pointwise Product] Compute Ck = 1/N Ak * Bk, k=0,…,N-1. [Fourier Interpolation] Compute FFT(N,C,-1,c(x)).

Fast Modular and Integral Polynomial Multiplication
If Zp has a primitive Nth root of unity then the previous algorithm works fine. If Zp does not have a primitive Nth root of unity, we can either compute in Zq[x] where Zq has an Nth root of unity and is sufficiently large q or we can compute in a larger field that contains an Nth root of unity. In Z[x] use a set of primes p1,…,pt that have an Nth root of unity with p1 * … * pt  size of the resulting integral coefficients (this can easily be computed from the input polynomials) and then use the CRT

Fast Integer Multiplication
Let A = (an-1,…,a1,a0 ) = an-1n-1 + … + a1 + a0 B = (bn-1,…,b1,b0 ) = bn-1n-1 + … + b1 + b0 C = AB = c() = a()b(), where a(x) = an-1xn-1 + … + a1x + a0, b(x) = bn-1xn-1 + … + b1x + b0, and c(x) = a(x)b(x). Idea: Compute a(x)b(x) using FFT-based polynomial multiplication and then evaluate the result at . Computation will be performed mod p for several word sized “Fourier” primes and the Chinese Remainder Theorem will be used to recover the integer product.

Three Primes Algorithm
Compute C = AB, where length(A) = m, and length(B) = n. Let a(x) and b(x) be the polynomials whose coefficients are the digits of A and B respectively The algorithm requires K “Fourier primes” p = 2e k + 1 for sufficiently large e [Polynomial multiplication] Compute ci(x) = a(x)b(x) mod pi for i=1,…,K using FFT-based polynomial multiplication. [CRT] Compute c(x)  ci(x) (mod pi) i=1,…,K. [Evaluation at radix] C = c().

Analysis of Three Primes Algorithm
Determine K Since the kth coefficient of c(x), , the coefficients of c(x) are bounded by n2 Therefore, we need the product p1 …pK  n2 If we choose pi > , then this is true if K  n2 Assuming n <  [ is typically wordsize - for 32-bit words,   109], only 3 primes are required Theorem. Assume that mod p operations can be performed in O(1) time. Then the 3-primes algorithm can multiply two n-digit numbers in time O(nlogn) provided: n <  n  2E-1, where three Fourier primes p = 2ek + 1 (p > ) can be found with e  E (need to perform the FFT of size 2n)

Limitations of 3 Primes Algorithms
If we choose the primes to be wordsize for 32-bit words  < pi < W = 231-1  = 109 n  2E-1 = 223 8.38  106

September 4, 1997 Applied Symbolic Computation (CS 300) Fast Polynomial and Integer Multiplication Jeremy R. Johnson.

Similar presentations

Presentation on theme: "September 4, 1997 Applied Symbolic Computation (CS 300) Fast Polynomial and Integer Multiplication Jeremy R. Johnson."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

September 4, 1997 Applied Symbolic Computation (CS 300) Fast Polynomial and Integer Multiplication Jeremy R. Johnson.

Similar presentations

Presentation on theme: "September 4, 1997 Applied Symbolic Computation (CS 300) Fast Polynomial and Integer Multiplication Jeremy R. Johnson."— Presentation transcript:

Similar presentations

About project

Feedback