1 How to Multiply Slides by Kevin Wayne. Copyright © 2005 Pearson-Addison Wesley. All rights reserved. integers, matrices, and polynomials.

Slides:

Advertisements

Similar presentations

Digital Kommunikationselektronik TNE027 Lecture 5 1 Fourier Transforms Discrete Fourier Transform (DFT) Algorithms Fast Fourier Transform (FFT) Algorithms.

Advertisements

LECTURE Copyright  1998, Texas Instruments Incorporated All Rights Reserved Use of Frequency Domain Telecommunication Channel |A| f fcfc Frequency.

Advanced Algorithms Piyush Kumar (Lecture 11: Div & Conquer) Welcome to COT5405 Source: Kevin Wayne, Harold Prokop.

DIVIDE AND CONQUER. 2 Algorithmic Paradigms Greedy. Build up a solution incrementally, myopically optimizing some local criterion. Divide-and-conquer.

Dr. Deshi Ye Divide-and-conquer Dr. Deshi Ye

1 Chapter 5 Divide and Conquer Slides by Kevin Wayne. Copyright © 2005 Pearson-Addison Wesley. All rights reserved.

Grade School Revisited: How To Multiply Two Numbers Great Theoretical Ideas In Computer Science Victor Adamchik Danny Sleator CS Spring 2010 Lecture.

1 Divide-and-Conquer Divide-and-conquer. n Break up problem into several parts. n Solve each part recursively. n Combine solutions to sub-problems into.

Fast Fourier Transform Lecture 6 Spoken Language Processing Prof. Andrew Rosenberg.

FFT1 The Fast Fourier Transform. FFT2 Outline and Reading Polynomial Multiplication Problem Primitive Roots of Unity (§10.4.1) The Discrete Fourier Transform.

Richard Fateman CS 282 Lecture 31 Operations, representations Lecture 3.

Richard Fateman CS 282 Lecture 101 The Finite-Field FFT Lecture 10.

Princeton University COS 423 Theory of Algorithms Spring 2002 Kevin Wayne Reductions Some of these lecture slides are adapted from CLRS Chapter 31.5 and.

FFT1 The Fast Fourier Transform by Jorge M. Trabal.

Princeton University COS 423 Theory of Algorithms Spring 2002 Kevin Wayne Fast Fourier Transform Jean Baptiste Joseph Fourier ( ) These lecture.

Reconfigurable Computing S. Reda, Brown University Reconfigurable Computing (EN2911X, Fall07) Lecture 16: Application-Driven Hardware Acceleration (1/4)

CSE 421 Algorithms Richard Anderson Lecture 13 Divide and Conquer.

Introduction to Algorithms

The Fourier series A large class of phenomena can be described as periodic in nature: waves, sounds, light, radio, water waves etc. It is natural to attempt.

11/26/02CSE FFT,etc CSE Algorithms Polynomial Representations, Fourier Transfer, and other goodies. (Chapters 28-30)

Fast Fourier Transform Irina Bobkova. Overview I. Polynomials II. The DFT and FFT III. Efficient implementations IV. Some problems.

3.4/3.5 The Integers and Division/ Primes and Greatest Common Divisors Let each of a and b be integers. We say that a divides b, in symbols a | b, provided.

Prof. Swarat Chaudhuri COMP 482: Design and Analysis of Algorithms Spring 2013 Lecture 12.

1 Chapter 5 Divide and Conquer Slides by Kevin Wayne. Copyright © 2005 Pearson-Addison Wesley. All rights reserved.

The Discrete Fourier Transform. The Fourier Transform “The Fourier transform is a mathematical operation with many applications in physics and engineering.

Mathematics Review Exponents Logarithms Series Modular arithmetic Proofs.

1 Chapter 5 Divide and Conquer Slides by Kevin Wayne. Copyright © 2005 Pearson-Addison Wesley. All rights reserved.

CS 6068 Parallel Computing Fall 2013 Lecture 10 – Nov 18 The Parallel FFT Prof. Fred Office Hours: MWF.

FFT1 The Fast Fourier Transform. FFT2 Outline and Reading Polynomial Multiplication Problem Primitive Roots of Unity (§10.4.1) The Discrete Fourier Transform.

Piyush Kumar (Lecture 2 Div & Conquer)

5.6 Convolution and FFT. 2 Fast Fourier Transform: Applications Applications. n Optics, acoustics, quantum physics, telecommunications, control systems,

The Fast Fourier Transform

Advanced Algebraic Algorithms on Integers and Polynomials Prepared by John Reif, Ph.D. Analysis of Algorithms.

Karatsuba’s Algorithm for Integer Multiplication

Applied Symbolic Computation1 Applied Symbolic Computation (CS 300) Karatsuba’s Algorithm for Integer Multiplication Jeremy R. Johnson.

The Fast Fourier Transform and Applications to Multiplication

Advanced Algorithms Piyush Kumar (Lecture 11: Div & Conquer) Welcome to COT5405 Source: Kevin Wayne, Harold Prokop.

1 Fast Polynomial and Integer Multiplication Jeremy R. Johnson.

1 Chapter 4-2 Divide and Conquer Slides by Kevin Wayne. Copyright © 2005 Pearson-Addison Wesley. All rights reserved.

Divide and Conquer Andreas Klappenecker [based on slides by Prof. Welch]

1 How to Multiply Slides by Kevin Wayne. Copyright © 2005 Pearson-Addison Wesley. All rights reserved. integers, matrices, and polynomials.

The Discrete Fourier Transform

Divide and Conquer Faculty Name: Ruhi Fatima Topics Covered Divide and Conquer Matrix multiplication Recurrence.

Applied Symbolic Computation1 Applied Symbolic Computation (CS 567) The Fast Fourier Transform (FFT) and Convolution Jeremy R. Johnson TexPoint fonts used.

1 Chapter 5 Divide and Conquer Slides by Kevin Wayne. Copyright © 2005 Pearson-Addison Wesley. All rights reserved.

CSCI-256 Data Structures & Algorithm Analysis Lecture Note: Some slides by Kevin Wayne. Copyright © 2005 Pearson-Addison Wesley. All rights reserved. 16.

May 9, 2001Applied Symbolic Computation1 Applied Symbolic Computation (CS 680/480) Lecture 6: Multiplication, Interpolation, and the Chinese Remainder.

Chapter 2 Divide-and-Conquer algorithms

DIGITAL SIGNAL PROCESSING ELECTRONICS

Chapter 2 Divide-and-Conquer algorithms

Data Structures and Algorithms (AT70. 02) Comp. Sc. and Inf. Mgmt

Polynomial + Fast Fourier Transform

Great Theoretical Ideas in Computer Science

Applied Symbolic Computation

September 4, 1997 Applied Symbolic Computation (CS 300) Fast Polynomial and Integer Multiplication Jeremy R. Johnson.

Applied Symbolic Computation

Divide-and-Conquer Divide-and-conquer.

September 4, 1997 Applied Symbolic Computation (CS 300) Fast Polynomial and Integer Multiplication Jeremy R. Johnson.

The Fast Fourier Transform

Advanced Algorithms Analysis and Design

Applied Symbolic Computation

September 4, 1997 Applied Symbolic Computation (CS 567) Fast Polynomial and Integer Multiplication Jeremy R. Johnson.

Chapter 5 Divide and Conquer

Applied Symbolic Computation

Richard Anderson Lecture 14 Inversions, Multiplication, FFT

Applied Symbolic Computation

The Fast Fourier Transform

CSCI 256 Data Structures and Algorithm Analysis Lecture 12

Applied Symbolic Computation

Fast Polynomial and Integer Multiplication

Presentation transcript:

1 How to Multiply Slides by Kevin Wayne. Copyright © 2005 Pearson-Addison Wesley. All rights reserved. integers, matrices, and polynomials

2 Complex Multiplication Complex multiplication. (a + bi) (c + di) = x + yi. Grade-school. x = ac - bd, y = bc + ad. Q. Is it possible to do with fewer multiplications? 4 multiplications, 2 additions

3 Complex Multiplication Complex multiplication. (a + bi) (c + di) = x + yi. Grade-school. x = ac - bd, y = bc + ad. Q. Is it possible to do with fewer multiplications? A. Yes. [Gauss] x = ac - bd, y = (a + b) (c + d) - ac - bd. Remark. Improvement if no hardware multiply. 4 multiplications, 2 additions 3 multiplications, 5 additions

5.5 Integer Multiplication

5 Addition. Given two n -bit integers a and b, compute a + b. Grade-school.  (n) bit operations. Remark. Grade-school addition algorithm is optimal. Integer Addition

6 Integer Multiplication Multiplication. Given two n -bit integers a and b, compute a  b. Grade-school.  (n 2 ) bit operations. Q. Is grade-school multiplication algorithm optimal? 

7 To multiply two n -bit integers a and b : Multiply four ½ n -bit integers, recursively. n Add and shift to obtain result. Ex. Divide-and-Conquer Multiplication: Warmup a = b = a1a1 a0a0 b1b1 b0b0

8 Recursion Tree n 4(n/2) 16(n/4) 4 k (n / 2 k ) 4 lg n (1) T(n) T(n/2) T(n/4) T(2) T(n / 2 k ) T(n/4) T(n/2) T(n/4) T(n/2) T(n/4)... T(n/2)...

9 To multiply two n -bit integers a and b : Add two ½ n bit integers. Multiply three ½ n -bit integers, recursively. n Add, subtract, and shift to obtain result. Karatsuba Multiplication

10 To multiply two n -bit integers a and b : Add two ½ n bit integers. Multiply three ½ n -bit integers, recursively. n Add, subtract, and shift to obtain result. Theorem. [Karatsuba-Ofman 1962] Can multiply two n -bit integers in O(n ) bit operations. Karatsuba Multiplication

11 Karatsuba: Recursion Tree n 3(n/2) 9(n/4) T(n) T(n/2) T(n/4) T(n/2) T(n/4) T(n/2) T(n/4) 3 lg n (1) T(2) T(n / 2 k )... 3 k (n / 2 k )

12 Integer division. Given two n -bit (or less) integers s and t, compute quotient q = s / t and remainder r = s mod t. Fact. Complexity of integer division is same as integer multiplication. To compute quotient q : Approximate x = 1 / t using Newton's method: After log n iterations, either q =  s x  or q =  s x . Fast Integer Division Too (!) using fast multiplication

Matrix Multiplication

14 Dot product. Given two length n vectors a and b, compute c = a  b. Grade-school.  (n) arithmetic operations. Remark. Grade-school dot product algorithm is optimal. Dot Product

15 Matrix multiplication. Given two n -by- n matrices A and B, compute C = AB. Grade-school.  (n 3 ) arithmetic operations. Q. Is grade-school matrix multiplication algorithm optimal? Matrix Multiplication

16 Block Matrix Multiplication C 11 A 11 A 12 B 11

17 Matrix Multiplication: Warmup To multiply two n -by- n matrices A and B : Divide: partition A and B into ½ n -by-½ n blocks. Conquer: multiply 8 pairs of ½ n -by-½ n matrices, recursively. Combine: add appropriate products using 4 matrix additions.

18 Fast Matrix Multiplication Key idea. multiply 2 -by- 2 blocks with only 7 multiplications. 7 multiplications. 18 = additions and subtractions.

19 Fast Matrix Multiplication To multiply two n -by- n matrices A and B : [Strassen 1969] Divide: partition A and B into ½ n -by-½ n blocks. Compute: 14 ½ n -by-½ n matrices via 10 matrix additions. Conquer: multiply 7 pairs of ½ n -by-½ n matrices, recursively. Combine: 7 products into 4 terms using 8 matrix additions. Analysis. Assume n is a power of 2. T(n) = # arithmetic operations.

20 Fast Matrix Multiplication: Practice Implementation issues. n Sparsity. n Caching effects. n Numerical stability. n Odd matrix dimensions. Crossover to classical algorithm around n = 128. Common misperception. “Strassen is only a theoretical curiosity.” Apple reports 8 x speedup on G4 Velocity Engine when n  2,500. n Range of instances where it's useful is a subject of controversy. Remark. Can "Strassenize" Ax = b, determinant, eigenvalues, SVD, ….

21 Begun, the decimal wars have. [Pan, Bini et al, Schönhage, …] Fast Matrix Multiplication: Theory Q. Multiply two 2 -by- 2 matrices with 7 scalar multiplications? A. Yes! [Strassen 1969] Q. Multiply two 2 -by- 2 matrices with 6 scalar multiplications? A. Impossible. [Hopcroft and Kerr 1971] Q. Two 3 -by- 3 matrices with 21 scalar multiplications? A. Also impossible. Two 48 -by- 48 matrices with 47,217 scalar multiplications. December, January, A year later. Two 20 -by- 20 matrices with 4,460 scalar multiplications.

22 Fast Matrix Multiplication: Theory Best known. O(n ) [Coppersmith-Winograd, 1987] Conjecture. O(n 2+  ) for any  > 0. Caveat. Theoretical improvements to Strassen are progressively less practical.

5.6 Convolution and FFT

24 Fourier Analysis Fourier theorem. [Fourier, Dirichlet, Riemann] Any periodic function can be expressed as the sum of a series of sinusoids. sufficiently smooth t N = 1N = 5N = 10N = 100

25 Euler's Identity Sinusoids. Sum of sine an cosines. Sinusoids. Sum of complex exponentials. e ix = cos x + i sin x Euler's identity

26 Time Domain vs. Frequency Domain Signal. [touch tone button 1] Time domain. Frequency domain. Reference: Cleve Moler, Numerical Computing with MATLAB frequency (Hz) amplitude 0.5 time (seconds) sound pressure

27 Time Domain vs. Frequency Domain Signal. [recording, 8192 samples per second] Magnitude of discrete Fourier transform. Reference: Cleve Moler, Numerical Computing with MATLAB

28 Fast Fourier Transform FFT. Fast way to convert between time-domain and frequency-domain. Alternate viewpoint. Fast way to multiply and evaluate polynomials. If you speed up any nontrivial algorithm by a factor of a million or so the world will beat a path towards finding useful applications for it. -Numerical Recipes we take this approach

29 Fast Fourier Transform: Applications Applications. n Optics, acoustics, quantum physics, telecommunications, radar, control systems, signal processing, speech recognition, data compression, image processing, seismology, mass spectrometry… n Digital media. [DVD, JPEG, MP3, H.264] n Medical diagnostics. [MRI, CT, PET scans, ultrasound] n Numerical solutions to Poisson's equation. n Shor's quantum factoring algorithm. n … The FFT is one of the truly great computational developments of [the 20th] century. It has changed the face of science and engineering so much that it is not an exaggeration to say that life as we know it would be very different without the FFT. -Charles van Loan

30 Fast Fourier Transform: Brief History Gauss (1805, 1866). Analyzed periodic motion of asteroid Ceres. Runge-König (1924). Laid theoretical groundwork. Danielson-Lanczos (1942). Efficient algorithm, x-ray crystallography. Cooley-Tukey (1965). Monitoring nuclear tests in Soviet Union and tracking submarines. Rediscovered and popularized FFT. Importance not fully realized until advent of digital computers.

31 Polynomials: Coefficient Representation Polynomial. [coefficient representation] Add. O(n) arithmetic operations. Evaluate. O(n) using Horner's method. Multiply (convolve). O(n 2 ) using brute force.

32 A Modest PhD Dissertation Title "New Proof of the Theorem That Every Algebraic Rational Integral Function In One Variable can be Resolved into Real Factors of the First or the Second Degree." - PhD dissertation, 1799 the University of Helmstedt

33 Polynomials: Point-Value Representation Fundamental theorem of algebra. [Gauss, PhD thesis] A degree n polynomial with complex coefficients has exactly n complex roots. Corollary. A degree n-1 polynomial A(x) is uniquely specified by its evaluation at n distinct values of x. x y xjxj y j = A(x j )

34 Polynomials: Point-Value Representation Polynomial. [point-value representation] Add. O(n) arithmetic operations. Multiply (convolve). O(n), but need 2n-1 points. Evaluate. O(n 2 ) using Lagrange's formula.

35 Converting Between Two Polynomial Representations Tradeoff. Fast evaluation or fast multiplication. We want both! Goal. Efficient conversion between two representations  all ops fast. coefficient representation O(n2)O(n2) multiply O(n)O(n) evaluate point-value O(n)O(n)O(n2)O(n2) coefficient representation point-value representation

36 Converting Between Two Representations: Brute Force Coefficient  point-value. Given a polynomial a 0 + a 1 x a n-1 x n-1, evaluate it at n distinct points x 0,..., x n-1. Running time. O(n 2 ) for matrix-vector multiply (or n Horner's).

37 Converting Between Two Representations: Brute Force Point-value  coefficient. Given n distinct points x 0,..., x n-1 and values y 0,..., y n-1, find unique polynomial a 0 + a 1 x a n-1 x n-1, that has given values at given points. Running time. O(n 3 ) for Gaussian elimination.  Vandermonde matrix is invertible iff x i distinct or O(n ) via fast matrix multiplication

38 Divide-and-Conquer Decimation in frequency. Break up polynomial into low and high powers. n A(x) = a 0 + a 1 x + a 2 x 2 + a 3 x 3 + a 4 x 4 + a 5 x 5 + a 6 x 6 + a 7 x 7. n A low (x) = a 0 + a 1 x + a 2 x 2 + a 3 x 3. n A high (x) = a 4 + a 5 x + a 6 x 2 + a 7 x 3. n A(x) = A low (x) + x 4 A high (x). Decimation in time. Break polynomial up into even and odd powers. n A(x) = a 0 + a 1 x + a 2 x 2 + a 3 x 3 + a 4 x 4 + a 5 x 5 + a 6 x 6 + a 7 x 7. n A even (x) = a 0 + a 2 x + a 4 x 2 + a 6 x 3. n A odd (x) = a 1 + a 3 x + a 5 x 2 + a 7 x 3. n A(x) = A even (x 2 ) + x A odd (x 2 ).

39 Coefficient to Point-Value Representation: Intuition Coefficient  point-value. Given a polynomial a 0 + a 1 x a n-1 x n-1, evaluate it at n distinct points x 0,..., x n-1. Divide. Break polynomial up into even and odd powers. n A(x) = a 0 + a 1 x + a 2 x 2 + a 3 x 3 + a 4 x 4 + a 5 x 5 + a 6 x 6 + a 7 x 7. n A even (x) = a 0 + a 2 x + a 4 x 2 + a 6 x 3. n A odd (x) = a 1 + a 3 x + a 5 x 2 + a 7 x 3. n A(x) = A even (x 2 ) + x A odd (x 2 ). n A(-x) = A even (x 2 ) - x A odd (x 2 ). Intuition. Choose two points to be ±1. n A( 1) = A even (1) + 1 A odd (1). n A(-1) = A even (1) - 1 A odd (1). Can evaluate polynomial of degree  n at 2 points by evaluating two polynomials of degree  ½ n at 1 point. we get to choose which ones!

40 Coefficient to Point-Value Representation: Intuition Coefficient  point-value. Given a polynomial a 0 + a 1 x a n-1 x n-1, evaluate it at n distinct points x 0,..., x n-1. Divide. Break polynomial up into even and odd powers. n A(x) = a 0 + a 1 x + a 2 x 2 + a 3 x 3 + a 4 x 4 + a 5 x 5 + a 6 x 6 + a 7 x 7. n A even (x) = a 0 + a 2 x + a 4 x 2 + a 6 x 3. n A odd (x) = a 1 + a 3 x + a 5 x 2 + a 7 x 3. n A(x) = A even (x 2 ) + x A odd (x 2 ). n A(-x) = A even (x 2 ) - x A odd (x 2 ). Intuition. Choose four complex points to be ±1, ±i. n A(1) = A even (1) + 1 A odd (1). n A(-1) = A even (1) - 1 A odd (1). n A( i ) = A even (-1) + i A odd (-1). n A( -i ) = A even (-1) - i A odd (-1). Can evaluate polynomial of degree  n at 4 points by evaluating two polynomials of degree  ½ n at 2 points. we get to choose which ones!

41 Discrete Fourier Transform Coefficient  point-value. Given a polynomial a 0 + a 1 x a n-1 x n-1, evaluate it at n distinct points x 0,..., x n-1. Key idea. Choose x k =  k where  is principal n th root of unity. DFTFourier matrix F n

42 Roots of Unity Def. An n th root of unity is a complex number x such that x n = 1. Fact. The n th roots of unity are:  0,  1, …,  n-1 where  = e 2  i / n. Pf. (  k ) n = (e 2  i k / n ) n = (e  i ) 2k = (-1) 2k = 1. Fact. The ½ n th roots of unity are: 0, 1, …, n/2-1 where =  2 = e 4  i / n.  0 = 0 = 1 11  2 = 1 = i 33  4 = 2 = -1 55  6 = 3 = -i 77 n = 8

43 Fast Fourier Transform Goal. Evaluate a degree n-1 polynomial A(x) = a a n-1 x n-1 at its n th roots of unity:  0,  1, …,  n-1. Divide. Break up polynomial into even and odd powers. n A even (x) = a 0 + a 2 x + a 4 x 2 + … + a n-2 x n/ n A odd (x) = a 1 + a 3 x + a 5 x 2 + … + a n-1 x n/ n A(x) = A even (x 2 ) + x A odd (x 2 ). Conquer. Evaluate A even (x) and A odd (x) at the ½ n th roots of unity: 0, 1, …, n/2-1. Combine. n A(  k ) = A even ( k ) +  k A odd ( k ), 0  k < n/2 A(  k+ ½n ) = A even ( k ) –  k A odd ( k ), 0  k < n/2  k+ ½n = -  k k = (  k ) 2 k = (  k + ½n ) 2

44 fft(n, a 0,a 1,…,a n-1 ) { if (n == 1) return a 0 (e 0,e 1,…,e n/2-1 )  FFT(n/2, a 0,a 2,a 4,…,a n-2 ) (d 0,d 1,…,d n/2-1 )  FFT(n/2, a 1,a 3,a 5,…,a n-1 ) for k = 0 to n/2 - 1 {  k  e 2  ik/n y k+n/2  e k +  k d k y k+n/2  e k -  k d k } return (y 0,y 1,…,y n-1 ) } FFT Algorithm

45 FFT Summary Theorem. FFT algorithm evaluates a degree n-1 polynomial at each of the n th roots of unity in O(n log n) steps. Running time. O(n log n) coefficient representation point-value representation ??? assumes n is a power of 2

46 Recursion Tree a 0, a 1, a 2, a 3, a 4, a 5, a 6, a 7 a 1, a 3, a 5, a 7 a 0, a 2, a 4, a 6 a 3, a 7 a 1, a 5 a 0, a 4 a 2, a 6 a0a0 a4a4 a2a2 a6a6 a1a1 a5a5 a3a3 a7a7 "bit-reversed" order perfect shuffle

47 Inverse Discrete Fourier Transform Point-value  coefficient. Given n distinct points x 0,..., x n-1 and values y 0,..., y n-1, find unique polynomial a 0 + a 1 x a n-1 x n-1, that has given values at given points. Inverse DFT Fourier matrix inverse (F n ) -1

48 Claim. Inverse of Fourier matrix F n is given by following formula. Consequence. To compute inverse FFT, apply same algorithm but use  -1 = e -2  i / n as principal n th root of unity (and divide by n ). Inverse DFT

49 Inverse FFT: Proof of Correctness Claim. F n and G n are inverses. Pf. Summation lemma. Let  be a principal n th root of unity. Then Pf. If k is a multiple of n then  k = 1  series sums to n. Each n th root of unity  k is a root of x n - 1 = (x - 1) (1 + x + x x n-1 ). if  k  1 we have: 1 +  k +  k(2) + … +  k(n-1) = 0  series sums to 0. ▪ summation lemma

50 Inverse FFT: Algorithm ifft(n, a 0,a 1,…,a n-1 ) { if (n == 1) return a 0 (e 0,e 1,…,e n/2-1 )  FFT(n/2, a 0,a 2,a 4,…,a n-2 ) (d 0,d 1,…,d n/2-1 )  FFT(n/2, a 1,a 3,a 5,…,a n-1 ) for k = 0 to n/2 - 1 {  k  e -2  ik/n y k+n/2  (e k +  k d k ) / n y k+n/2  (e k -  k d k ) / n } return (y 0,y 1,…,y n-1 ) }

51 Inverse FFT Summary Theorem. Inverse FFT algorithm interpolates a degree n-1 polynomial given values at each of the n th roots of unity in O(n log n) steps. assumes n is a power of 2 O(n log n) coefficient representation O(n log n) point-value representation

52 Polynomial Multiplication Theorem. Can multiply two degree n-1 polynomials in O(n log n) steps. O(n)O(n) point-value multiplication O(n log n) 2 FFTs inverse FFT O(n log n) coefficient representation pad with 0 s to make n a power of 2

53 FFT in Practice ?

54 FFT in Practice Fastest Fourier transform in the West. [Frigo and Johnson] n Optimized C library. n Features: DFT, DCT, real, complex, any size, any dimension. n Won 1999 Wilkinson Prize for Numerical Software. n Portable, competitive with vendor-tuned code. Implementation details. n Instead of executing predetermined algorithm, it evaluates your hardware and uses a special-purpose compiler to generate an optimized algorithm catered to "shape" of the problem. n Core algorithm is nonrecursive version of Cooley-Tukey. O(n log n), even for prime sizes. Reference:

55 Integer Multiplication, Redux Integer multiplication. Given two n bit integers a = a n-1 … a 1 a 0 and b = b n-1 … b 1 b 0, compute their product a  b. Convolution algorithm. n Form two polynomials. Note: a = A(2), b = B(2). Compute C(x) = A(x)  B(x). Evaluate C(2) = a  b. Running time: O(n log n) complex arithmetic operations. Theory. [Schönhage-Strassen 1971] O(n log n log log n) bit operations. Theory. [Fürer 2007] O(n log n 2 O(log *n) ) bit operations.

56 Integer Multiplication, Redux Integer multiplication. Given two n bit integers a = a n-1 … a 1 a 0 and b = b n-1 … b 1 b 0, compute their product a  b. Practice. [GNU Multiple Precision Arithmetic Library] It uses brute force, Karatsuba, and FFT, depending on the size of n. "the fastest bignum library on the planet"

57 Integer Arithmetic Fundamental open question. What is complexity of arithmetic? addition Operation O(n)O(n) Upper Bound (n)(n) Lower Bound multiplication O(n log n 2 O(log*n) ) (n)(n) division O(n log n 2 O(log*n) ) (n)(n)

58 Factoring Factoring. Given an n -bit integer, find its prime factorization = 47 × = = × RSA-704 ($30,000 prize if you can factor) a disproof of Mersenne's conjecture that is prime

59 Factoring and RSA Primality. Given an n -bit integer, is it prime? Factoring. Given an n -bit integer, find its prime factorization. Significance. Efficient primality testing  can implement RSA. Significance. Efficient factoring  can break RSA. Theorem. [AKS 2002] Poly-time algorithm for primality testing.

60 Shor's Algorithm Shor's algorithm. Can factor an n -bit integer in O(n 3 ) time on a quantum computer. Ramification. At least one of the following is wrong: n RSA is secure. n Textbook quantum mechanics. n Extending Church-Turing thesis. algorithm uses quantum QFT !

61 Shor's Factoring Algorithm Period finding. Theorem. [Euler] Let p and q be prime, and let N = p q. Then, the following sequence repeats with a period divisible by (p-1) (q-1) : Consequence. If we can learn something about the period of the sequence, we can learn something about the divisors of (p-1) (q-1). by using random values of x, we get the divisors of (p-1) (q-1), and from this, can get the divisors of N = p q … 2 i … 2 i mod … 2 i mod 21 period = 4 period = 6 x mod N, x 2 mod N, x 3 mod N, x 4 mod N, …

Extra Slides

63 Fourier Matrix Decomposition