September 4, 1997 Applied Symbolic Computation (CS 300) Fast Polynomial and Integer Multiplication Jeremy R. Johnson.

Slides:



Advertisements
Similar presentations
Fast Fourier Transform for speeding up the multiplication of polynomials an Algorithm Visualization Alexandru Cioaca.
Advertisements

Fast Fourier Transform Lecture 6 Spoken Language Processing Prof. Andrew Rosenberg.
FFT1 The Fast Fourier Transform. FFT2 Outline and Reading Polynomial Multiplication Problem Primitive Roots of Unity (§10.4.1) The Discrete Fourier Transform.
CSE 421 Algorithms Richard Anderson Lecture 15 Fast Fourier Transform.
FFT1 The Fast Fourier Transform by Jorge M. Trabal.
Princeton University COS 423 Theory of Algorithms Spring 2002 Kevin Wayne Fast Fourier Transform Jean Baptiste Joseph Fourier ( ) These lecture.
Reconfigurable Computing S. Reda, Brown University Reconfigurable Computing (EN2911X, Fall07) Lecture 16: Application-Driven Hardware Acceleration (1/4)
CSE 421 Algorithms Richard Anderson Lecture 13 Divide and Conquer.
Introduction to Algorithms
The Fourier series A large class of phenomena can be described as periodic in nature: waves, sounds, light, radio, water waves etc. It is natural to attempt.
Fast Fourier Transform Irina Bobkova. Overview I. Polynomials II. The DFT and FFT III. Efficient implementations IV. Some problems.
1 Chapter 5 Divide and Conquer Slides by Kevin Wayne. Copyright © 2005 Pearson-Addison Wesley. All rights reserved.
1 Fingerprinting techniques. 2 Is X equal to Y? = ? = ?
FFT1 The Fast Fourier Transform. FFT2 Outline and Reading Polynomial Multiplication Problem Primitive Roots of Unity (§10.4.1) The Discrete Fourier Transform.
5.6 Convolution and FFT. 2 Fast Fourier Transform: Applications Applications. n Optics, acoustics, quantum physics, telecommunications, control systems,
The Fast Fourier Transform
Monoids, Groups, Rings, Fields
Advanced Algebraic Algorithms on Integers and Polynomials Prepared by John Reif, Ph.D. Analysis of Algorithms.
Basic Number Theory Divisibility Let a,b be integers with a≠0. if there exists an integer k such that b=ka, we say a divides b which is denoted by a|b.
Chapter 4 – Finite Fields
Karatsuba’s Algorithm for Integer Multiplication
Applied Symbolic Computation1 Applied Symbolic Computation (CS 300) Karatsuba’s Algorithm for Integer Multiplication Jeremy R. Johnson.
The Fast Fourier Transform and Applications to Multiplication
Chinese Remainder Theorem Dec 29 Picture from ………………………
1 Network and Computer Security (CS 475) Modular Arithmetic and the RSA Public Key Cryptosystem Jeremy R. Johnson.
Network and Computer Security (CS 475) Modular Arithmetic
1 Fast Polynomial and Integer Multiplication Jeremy R. Johnson.
Applied Symbolic Computation1 Applied Symbolic Computation (CS 567) The Fast Fourier Transform (FFT) and Convolution Jeremy R. Johnson TexPoint fonts used.
May 9, 2001Applied Symbolic Computation1 Applied Symbolic Computation (CS 680/480) Lecture 6: Multiplication, Interpolation, and the Chinese Remainder.
Number-Theoretic Algorithms
Chapter 2 Divide-and-Conquer algorithms
B504/I538: Introduction to Cryptography
Topic 12: Number Theory Basics (2)
DIGITAL SIGNAL PROCESSING ELECTRONICS
Chapter 2 Divide-and-Conquer algorithms
Advanced Algorithms Analysis and Design
Number-Theoretic Algorithms (UNIT-4)
Lecture 16 Fast Fourier Transform
CMSC Discrete Structures
Data Structures and Algorithms (AT70. 02) Comp. Sc. and Inf. Mgmt
Polynomial + Fast Fourier Transform
Applied Symbolic Computation
September 4, 1997 Applied Symbolic Computation (CS 300) Fast Polynomial and Integer Multiplication Jeremy R. Johnson.
Applied Symbolic Computation
Applied Symbolic Computation (CS 300) Modular Arithmetic
Polynomials and the FFT(UNIT-3)
Applied Symbolic Computation
DFT and FFT By using the complex roots of unity, we can evaluate and interpolate a polynomial in O(n lg n) An example, here are the solutions to 8 =
Enough Mathematical Appetizers!
The Fast Fourier Transform
Advanced Algorithms Analysis and Design
Applied Symbolic Computation (CS 300) Modular Arithmetic
Systems Architecture I
Applied Symbolic Computation
Applied Symbolic Computation
September 4, 1997 Applied Symbolic Computation (CS 567) Fast Polynomial and Integer Multiplication Jeremy R. Johnson.
Applied Symbolic Computation (CS 300) Modular Arithmetic
Chapter 5 Divide and Conquer
Applied Symbolic Computation
Algorithmic Number Theory and Cryptography (CS 303) Modular Arithmetic
Richard Anderson Lecture 14 Inversions, Multiplication, FFT
Applied Symbolic Computation
Applied Symbolic Computation (CS 300) Modular Arithmetic
Applied Symbolic Computation (CS 300) Modular Arithmetic
The Fast Fourier Transform
Applied Symbolic Computation (CS 300) Modular Arithmetic
Patrick Lee 12 July 2003 (updated on 13 July 2003)
Clements MAΘ October 30th, 2014
Applied Symbolic Computation
Fast Polynomial and Integer Multiplication
Presentation transcript:

September 4, 1997 Applied Symbolic Computation (CS 300) Fast Polynomial and Integer Multiplication Jeremy R. Johnson

September 4, 1997 Introduction Objective: To obtain fast algorithms for polynomial and integer multiplication based on the FFT. In order to do this we will compute the FFT over a finite field. The existence of FFTs over Zp is related to the primitive element theorem and the prime number theorem. Polynomial multiplication using interpolation Feasibility of mod p FFTs Fast polynomial multiplication Fast integer multiplication (3 primes algorithm) References: Lipson, Cormen et al.

Polynomial Multiplication using Interpolation Compute C(x) = A(x)B(x), where degree(A(x)) = m, and degree(B(x)) = n. Degree(C(x)) = m+n, and C(x) is uniquely determined by its value at m+n+1 distinct points. [Evaluation] Compute A(i) and B(i) for distinct i, i=0,…,m+n. [Pointwise Product] Compute C(i) = A(i)*B(i) for i=0,…,m+n. [Interpolation] Compute the coefficients of C(x) = cnxm+n + … + c1x +c0 from the points C(i) = A(i)*B(i) for i=0,…,m+n.

Primitive Element Theorem Theorem. Let F be a finite field with q = pk elements. Let F* be the q-1 non-zero elements of F. Then F* = <> = {1, , 2, …, q-2} for some   F*.  is called a primitive element. In particular there exist a primitive element for Zp for all prime p. E.G. (Z5)* = {1, 2,22=4,23=3} (Z17)* = {1, 3, 32 = 9, 33 = 10, 34 = 13, 35 = 5, 36 = 15, 37 = 11, 38 = 16, 39 = 14, 310 = 8, 311 = 7, 312 = 4, 313 = 12, 314 = 2, 315 = 6}

Modular Discrete Fourier Transform The n-point DFT is defined over Zp if there is a primitive n-th root of unity in Zp (same is true for any finite field) Let  be a primitive nth root of unity.

Example

Evaluation Utilizing Symmetry The cost of evaluation at two points can be reduced if one is the negative of the other. Let A(x) = A1(x2)x + A0(x2), where the coefficients of A1(x) are the odd coefficients of A(x) and the coefficients of A0(x) are the even coefficients of A(x) Since (-)2 = 2, A0( 2) = A0(- 2) and A1( 2) = A1(- 2) A() = A0( 2) +  A1( 2) A(-) = A0( 2) -  A1( 2) Example A(x) = a5x5 + a4x4 + a3x3 + a2x2 + a1x + a0 = A1(x2)x + A0(x2) A1(x) = a5x2 + a3x + a1 A0(x) = a4x2 + a2x + a0

Properties of Roots of Unity Lemma: Let n = 2m, and  be a primitive nth root of unity. Then 2 is a primitive mth root of unity. Lemma: Let n = 2m, and  be a primitive nth root of unity. Then m = -1 and m+k = -k. Therefore, F2m is a Vandermonde matrix where half the points are negatives of the other half. Thus, we can utilize the previous factorization to compute the DFT. Moreover, if n=2k this property can be used recursively.

FFT Input: N = 2k, a = (a0,a1,…,aN-1) = a0+a1x + … + aN-1xN-1 Output: A = (A0,A1,…,AN-1) with Ak = a(k) A  FFT(N,a,) if N = 1 then A := a; else n := N/2; b := (a0,a2,…,a2(n-1)); # a(x) = xc(x2) + b(x2) c := (a1,a3,…,a2n-1); B := FFT(n,b, 2); # Bk = b(k2) C := FFT(n,c, 2); # Ck = c(k2) for k from 0 to n-1 do Ak := Bk + k × Ck; # Ak = b(k2) + kc(k2) = a(k) An+k := Bk - k × Ck; # An+k = b(k2) - kc(k2) = a(-k) = a(k+n) end do; end if; end proc;

Computing Time Let T(n), n=2k, be the computing time of the FFT then September 4, 1997 Computing Time Let T(n), n=2k, be the computing time of the FFT then T(n) = 2T(n/2) + (n) T(n) = (nlogn) Master Theorem. Assume T(n) = aT(n/b) + (n) T(n) = (n) [a < b] T(n) = (nlog(n)) [a = b] T(n) = (nlogb(a)) [a > b]

Polynomial Multiplication using Interpolation Compute C(x) = A(x)B(x), where degree(A(x)) = m, and degree(B(x)) = n. Degree(C(x)) = m+n, and C(x) is uniquely determined by its value at m+n+1 distinct points. [Evaluation] Compute A(i) and B(i) for distinct i, i=0,…,m+n. [Pointwise Product] Compute C(i) = A(i)*B(i) for i=0,…,m+n. [Interpolation] Compute the coefficients of C(x) = cnxm+n + … + c1x +c0 from the points C(i) = A(i)*B(i) for i=0,…,m+n.

Matrix Version of Polynomial Evaluation September 4, 1997 Matrix Version of Polynomial Evaluation Evaluate A(x) = a3x3 + a2x2 + a1x + a0 at the points , , ,  eval

Matrix Interpretation of Interpolation September 4, 1997 Matrix Interpretation of Interpolation Let A(x) = anxn + … + a1x +a0 be a polynomial of degree n. Determine the (n+1) coefficients an,…,a1,a0 from the (n+1) values A(0),…,A(n) interpolate

V(0,…, n) is non-singular when 0,…, n are distinct. September 4, 1997 Vandermonde Matrix V(0,…, n) is non-singular when 0,…, n are distinct.

Inverse DFT

Example

Feasibility of mod p FFTs Theorem: Zp has a primitive Nth root of unity iff N|(p-1) Proof. By the primitive element theorem there exist an element  of order (p-1) Zp. If p-1 = qN, then q is an Nth root of unity. To compute a mod p FFT of size 2m, we must find p = 2e k + 1 (k odd), where e  m. Theorem. Let a and b be relatively prime integers. The number of primes  x in the arithmetic progression ak + b (k=1,2,…) is approximately (somewhat greater) (x/log x)/(a)

Fast Polynomial Multiplication Compute C(x) = a(x)b(x), where degree(a(x)) = m, and degree(b(x)) = n. Degree(c(x)) = m+n, and c(x) is uniquely determined by its value at m+n+1 distinct points. Let N = 2k  m+n+1. [Fourier Evaluation] Compute FFT(N,a(x),,A); FFT(N,b(x),,B). [Pointwise Product] Compute Ck = 1/N Ak * Bk, k=0,…,N-1. [Fourier Interpolation] Compute FFT(N,C,-1,c(x)).

Fast Integer Multiplication Let A = (an-1,…,a1,a0 ) = an-1n-1 + … + a1 + a0 B = (bn-1,…,b1,b0 ) = bn-1n-1 + … + b1 + b0 C = AB = c() = a()b(), where a(x) = an-1xn-1 + … + a1x + a0, b(x) = bn-1xn-1 + … + b1x + b0, and c(x) = a(x)b(x). Idea: Compute a(x)b(x) using FFT-based polynomial multiplication and then evaluate the result at . Computation will be performed mod p for several word sized “Fourier” primes and the Chinese Remainder Theorem will be used to recover the integer product.

Three Primes Algorithm Compute C = AB, where length(A) = m, and length(B) = n. Let a(x) and b(x) be the polynomials whose coefficients are the digits of A and B respectively The algorithm requires K “Fourier primes” p = 2e k + 1 for sufficiently large e [Polynomial multiplication] Compute ci(x) = a(x)b(x) mod pi for i=1,…,K using FFT-based polynomial multiplication. [CRT] Compute c(x)  ci(x) (mod pi) i=1,…,K. [Evaluation at radix] C = c().

Analysis of Three Primes Algorithm Determine K Since the kth coefficient of c(x), , the coefficients of c(x) are bounded by n2 Therefore, we need the product p1 …pK  n2 If we choose pi > , then this is true if K  n2 Assuming n <  [ is typically wordsize - for 32-bit words,   109], only 3 primes are required Theorem. Assume that mod p operations can be performed in O(1) time. Then the 3-primes algorithm can multiply two n-digit numbers in time O(nlogn) provided: n <  n  2E-1, where three Fourier primes p = 2ek + 1 (p > ) can be found with e  E (need to perform the FFT of size 2n)

Limitations of 3 Primes Algorithms If we choose the primes to be wordsize for 32-bit words  < pi < W = 231-1  = 109 n  2E-1 = 223 8.38  106