Recent Developments in the Sparse Fourier Transform


Recent Developments in the Sparse Fourier Transform Piotr Indyk MIT

Basics
Overview of Sparse Fourier Transforms: faster Fourier transforms for signals with sparse spectra
Elena will help me with that (thanks, Elena!)
Based on lectures 1–6 from my course "Algorithms and Signal Processing", available at https://stellar.mit.edu/S/course/6/fa14/6.893/materials.html
Will try to compress 6 × 80 minutes into 4 hours, and will try to minimize the distortion :)
Rough plan:
- Overview
- Algorithms for signals with one non-zero/large Fourier coefficient
- Average-case algorithms
- Worst-case algorithms

Sparsity
Signal a = a_0, …, a_{n-1}
Sparse approximation: approximate a by a' parametrized by k coefficients, k << n
Applications: compression, denoising, machine learning, …

Sparse approximations – transform coding
Compute Ta, where T is an n×n orthonormal matrix
Compute H_k(Ta) by keeping only the k largest* coefficients of Ta
T^{-1}(H_k(Ta)) is a k-sparse approximation to a w.r.t. T
*In magnitude
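
As a concrete illustration, here is a minimal numpy sketch of transform coding with T = DFT (the function name k_sparse_approx is mine, not from the slides):

```python
import numpy as np

def k_sparse_approx(a, k):
    """Transform coding with T = DFT: keep the k largest-magnitude
    coefficients of Ta (hard thresholding H_k), zero the rest, invert."""
    ahat = np.fft.fft(a)
    small = np.argsort(np.abs(ahat))[:-k]   # indices of all but the top k
    ahat[small] = 0
    return np.fft.ifft(ahat)

# A signal with only two frequencies is captured exactly by k = 4
# coefficients (each real sinusoid occupies two conjugate bins).
n = 256
t = np.arange(n)
a = np.cos(2 * np.pi * 3 * t / n) + 0.5 * np.sin(2 * np.pi * 10 * t / n)
approx = k_sparse_approx(a, 4)
print(np.max(np.abs(a - approx)))           # tiny (numerical error only)
```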

Transform coding: examples
Fourier transform + thresholding + inverse FT
Wavelet transform + thresholding + inverse WT
Sparsity k = 0.1·n

(Discrete) Fourier Transform
Input (time domain): a = a_0, …, a_{n-1}; today: n = 2^l
Output (frequency domain): â = â_0, …, â_{n-1}, where â_u = (1/n) Σ_j a_j e^{-2πi·uj/n}
Notation: ω = ω_n = e^{2πi/n} is the n-th root of unity
Then â_u = (1/n) Σ_j a_j ω^{-uj}
Matrix notation: â = F a, where F_{uj} = (1/n) ω^{-uj} and (F^{-1})_{ju} = ω^{uj}
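
The definition can be checked directly against numpy's FFT (note that np.fft.fft uses the same sign convention but omits the 1/n factor):

```python
import numpy as np

n = 8
j = np.arange(n)
omega = np.exp(2j * np.pi / n)               # n-th root of unity
F = omega ** (-np.outer(j, j)) / n           # F[u, j] = (1/n) * omega^(-u*j)
Finv = omega ** np.outer(j, j)               # Finv[j, u] = omega^(u*j)

a = np.random.randn(n)
ahat = F @ a
assert np.allclose(ahat, np.fft.fft(a) / n)  # same transform, up to the 1/n
assert np.allclose(Finv @ ahat, a)           # inverse transform recovers a
```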

Computing â = F a
Matrix-vector multiplication: O(n²) time
Fast Fourier Transform: computes â from a (and vice versa) in O(n log n) time
One can then sort the coefficients and select the top k of them, for O(n log n) time overall … to compute just k coefficients
Sparse Fourier Transform: directly computes the k largest coefficients of â (approximately – formal definition in a moment)
Running time: as low as O(k log n) (more details in a moment)
Sub-linear time – the algorithm cannot even read the whole input

Sparse Fourier Transform (k = 1)
Warmup: â is exactly 1-sparse, i.e., â_{u'} = 0 for all u' ≠ u, for some u
I.e., the signal is "pure" – only one frequency is present
Need to find u and â_u

Two-point sampling
We have a_j = â_u ω^{uj}
Sample a_0, a_1
Observe: a_0 = â_u and a_1 = â_u ω^u, so a_1/a_0 = ω^u = e^{2πi·u/n}
Can read u from the angle of ω^u
Constant-time algorithm!
Unfortunately, it relies heavily on the signal being pure
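
A sketch of the two-point algorithm (recover_pure_frequency is a hypothetical name; assumes an exactly 1-sparse spectrum):

```python
import numpy as np

def recover_pure_frequency(a0, a1, n):
    """a1 / a0 = omega^u = e^(2*pi*i*u/n), so u is the angle of the
    ratio, rescaled to an integer mod n."""
    u = int(round(np.angle(a1 / a0) * n / (2 * np.pi)))
    return u % n

n = 1024
u_true = 337
omega = np.exp(2j * np.pi / n)
ahat_u = 2.5 - 1.3j                          # arbitrary nonzero coefficient
a0 = ahat_u                                  # a_0 = ahat_u * omega^0
a1 = ahat_u * omega ** u_true                # a_1 = ahat_u * omega^u
print(recover_pure_frequency(a0, a1, n))     # 337
```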

What if â is not quite 1-sparse?
Ideally, we would like to find the 1-sparse â* that contains the largest entry of â
We will need to allow approximation: compute a 1-sparse â' such that ||â - â'||_2 ≤ C ||â - â*||_2, where C is the approximation factor
This is the L2/L2 guarantee
The definition naturally extends to general sparsity k

Interlude: history of the Sparse Fourier Transform
Hadamard transform: [Goldreich-Levin'89, Kushilevitz-Mansour'91]
Fourier transform:
- Mansour'92: poly(k, log n), assuming the coefficients are integers from -poly(n)…poly(n)
- Gilbert-Guha-Indyk-Muthukrishnan-Strauss'02: k² poly(log n)
- Gilbert-Muthukrishnan-Strauss'04: k poly(log n)
- Hassanieh-Indyk-Katabi-Price'12: k log n log(n/k) (or k log n for signals that are exactly k-sparse)
- Also: Akavia-Goldwasser-Safra'03, Iwen'08, Akavia'10, …
Many papers since 2012 (see later lectures)

(Discrete) Fourier Transform
Input (time domain): a = a_0, …, a_{n-1}; today: n = 2^l
Output (frequency domain): â = â_0, …, â_{n-1}, where â_u = (1/n) Σ_j a_j e^{-2πi·uj/n}
Notation: ω = ω_n = e^{2πi/n} = cos(2π/n) + i·sin(2π/n) is the n-th root of unity
Then â_u = (1/n) Σ_j a_j ω^{-uj}
Matrix notation: â = F a, where F_{uj} = (1/n) ω^{-uj} and (F^{-1})_{ju} = ω^{uj}

Back to k = 1
Compute a 1-sparse â' such that ||â - â'||_2 ≤ C ||â - â*||_2
Note that the guarantee is meaningful only for signals where C ||â - â*||_2 < ||â||_2; otherwise, we can just report â' = 0
So we will assume that Σ_{u'≠u} |â_{u'}|² < ε |â_u|² holds for some u and some small ε = ε(C)
We will describe the algorithm for the exactly 1-sparse case first, and deal with the epsilons later
We will find u bit by bit (a method introduced in GGIMS'02 and GMS'04)

Bit 0: compute u mod 2
We assumed a_j = â_u ω^{uj}
Suppose u = 2v + b; we want b
Sample: a_0 = â_u, and a_{n/2} = â_u ω^{u·n/2} = â_u ω^{(2v+b)·n/2} = â_u (-1)^b
Test: b = 0 iff |a_0 - a_{n/2}| < |a_0 + a_{n/2}|
Actual test: b = 0 iff |a_r - a_{n/2+r}| < |a_r + a_{n/2+r}|, where r is chosen uniformly at random from 0…n-1 (now a_r = â_u ω^{ur} and a_{n/2+r} = â_u (-1)^b ω^{ur})
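
A sketch of the randomized bit-0 test (assumes an exactly 1-sparse signal given as a sample oracle; the names are mine):

```python
import numpy as np

def bit0(sample, n, rng):
    """a_{n/2+r} = ahat_u * (-1)^b * omega^(u*r): the two samples agree
    up to a sign (-1)^b, so compare |a_r - a_{n/2+r}| and |a_r + a_{n/2+r}|."""
    r = rng.integers(n)
    ar, ar2 = sample(r), sample(n // 2 + r)
    return 0 if abs(ar - ar2) < abs(ar + ar2) else 1

n = 1024
u = 723                                      # odd, so u mod 2 = 1
omega = np.exp(2j * np.pi / n)
sample = lambda j: (1.7 + 0.4j) * omega ** (u * j)   # pure signal
rng = np.random.default_rng(0)
print(bit0(sample, n, rng))                  # 1
```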

Bit 1
Can pretend that b = u mod 2 = 0. Why?
Claim ("time shift" theorem; follows directly from the definition of the FT): if a'_j = a_j ω^{-bj}, then â'_u = â_{u+b}, i.e., the heavy coefficient moves from frequency u to frequency u - b
If b = 1, we replace a by a'; the low bit of the heavy frequency is now 0 and its higher bits are unchanged
Since u mod 2 = 0, we have u = 2v, and therefore a_j = â_u ω^{uj} = â_{2v} ω^{2vj} = â'_v ω_{n/2}^{vj} for the frequency vector â'_v = â_{2v}, v = 0…n/2-1
So a_0, …, a_{n/2-1} are time samples of â'_0, …, â'_{n/2-1}, and â'_v is the large coefficient of â'
Bit 1 of u is bit 0 of v, so we can compute it by sampling a_0, …, a_{n/2-1} as before
I.e., v = 0 mod 2 iff |a_r - a_{n/4+r}| < |a_r + a_{n/4+r}|
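
The time-shift claim is easy to sanity-check numerically (here with numpy's FFT, rescaled to the slides' 1/n normalization):

```python
import numpy as np

# If a'_j = a_j * omega^(-b*j), then ahat'_u = ahat_{u+b}: demodulating
# in time shifts the whole spectrum down by b.
n = 16
b = 1
j = np.arange(n)
omega = np.exp(2j * np.pi / n)
a = np.random.randn(n) + 1j * np.random.randn(n)
a_shift = a * omega ** (-b * j)
ahat = np.fft.fft(a) / n
ahat_shift = np.fft.fft(a_shift) / n
assert np.allclose(ahat_shift, np.roll(ahat, -b))   # ahat'_u = ahat_{u+b}
```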

Bit reading recap
Bit b_0: test if |a_r - a_{n/2+r}| < |a_r + a_{n/2+r}|
Bit b_1: test if |a_r ω^{-r b_0} - a_{n/4+r} ω^{-(n/4+r) b_0}| < |a_r ω^{-r b_0} + a_{n/4+r} ω^{-(n/4+r) b_0}|
Bit b_2: test if |a_r ω^{-r(b_0+2b_1)} - a_{n/8+r} ω^{-(n/8+r)(b_0+2b_1)}| < |a_r ω^{-r(b_0+2b_1)} + a_{n/8+r} ω^{-(n/8+r)(b_0+2b_1)}|
…
Altogether, we use O(log n) samples to identify u
This gives an O(log n) time algorithm in the exactly 1-sparse case
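
Putting the bit tests together, a sketch of the full O(log n)-sample recovery for the exactly 1-sparse case (function and variable names are mine; exponents are reduced mod n for numerical accuracy):

```python
import numpy as np

def find_frequency(sample, n, rng):
    """Recover u bit by bit: demodulate by the low bits found so far,
    then a sample pair at offset n/2^(i+1) reveals bit i of u."""
    omega = np.exp(2j * np.pi / n)
    u = 0
    for i in range(n.bit_length() - 1):      # log2(n) bits, 2 samples each
        off = n >> (i + 1)
        r = rng.integers(n)
        ar = sample(r) * omega ** (-(u * r) % n)
        ar2 = sample(r + off) * omega ** (-(u * (r + off)) % n)
        u += (0 if abs(ar - ar2) < abs(ar + ar2) else 1) << i
    return u

n = 1024
u_true = 723
omega = np.exp(2j * np.pi / n)
sample = lambda j: (0.8 + 0.6j) * omega ** ((u_true * j) % n)
rng = np.random.default_rng(1)
print(find_frequency(sample, n, rng))        # 723
```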

Dealing with noise
We now have a_j = â_u ω^{uj} + Σ_{u'≠u} â_{u'} ω^{u'j} = â_u ω^{uj} + μ_j
Observe that Σ_j |μ_j|² = n Σ_{u'≠u} |â_{u'}|² (Parseval)
Therefore, if we choose r uniformly at random from 0…n-1, we have E_r[|μ_r|²] = Σ_{u'≠u} |â_{u'}|²
Since we assumed Σ_{u'≠u} |â_{u'}|² < ε |â_u|², it follows that E[|μ_r|²] < ε |â_u|²
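
The expectation identity is Parseval's theorem; here is a quick numerical check (the noise level and names are chosen for illustration):

```python
import numpy as np

n = 256
u = 40
rng = np.random.default_rng(0)
ahat = 0.01 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
ahat[u] = 5.0                                # one dominant coefficient
j = np.arange(n)
omega = np.exp(2j * np.pi / n)
a = n * np.fft.ifft(ahat)                    # a_j = sum_u ahat_u omega^(u*j)
mu = a - ahat[u] * omega ** (u * j)          # the noise term mu_j
lhs = np.mean(np.abs(mu) ** 2)               # E_r[|mu_r|^2] for uniform r
rhs = np.sum(np.abs(ahat) ** 2) - np.abs(ahat[u]) ** 2
assert np.allclose(lhs, rhs)                 # Parseval
```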

Dealing with noise II
Recall that a_j = â_u ω^{uj} + μ_j with E[|μ_r|²] < ε |â_u|²
Claim: for all values of μ_r, μ_{n/2+r} satisfying 2(|μ_r| + |μ_{n/2+r}|) < |â_u|, the outcome of the comparison |a_r - a_{n/2+r}| < |a_r + a_{n/2+r}| is the same as in the noiseless case
From Markov's inequality, Pr_r[|μ_r| > |â_u|/4] = Pr_r[|μ_r|² > |â_u|²/16] ≤ 16ε
The same probability bound holds for μ_{n/2+r}
Corollary: if 32ε < 1/3, then a bit test is correct with probability > 2/3

Finding the heavy hitter: algorithm and analysis
For each bit b_i, make O(log log n) independent tests and use a majority vote
From the Chernoff bound, the probability that b_i is estimated incorrectly is < 1/(4 log n)
By a union bound over the log n bits, the probability that the coordinate u is estimated incorrectly is at most (log n) · 1/(4 log n) = 1/4
Total complexity: O(log n · log log n)
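
A sketch combining the bit tests with the majority vote (parameters such as reps = 15 are illustrative, standing in for the O(log log n) repetitions; names are mine):

```python
import numpy as np

def find_frequency_noisy(sample, n, rng, reps=15):
    """Bit-by-bit recovery where each bit is decided by a majority vote
    over `reps` random tests, tolerating small non-sparse noise."""
    omega = np.exp(2j * np.pi / n)
    u = 0
    for i in range(n.bit_length() - 1):
        off = n >> (i + 1)
        votes = 0
        for _ in range(reps):
            r = int(rng.integers(n))
            ar = sample(r) * omega ** (-(u * r) % n)
            ar2 = sample(r + off) * omega ** (-(u * (r + off)) % n)
            votes += 0 if abs(ar - ar2) < abs(ar + ar2) else 1
        u += (votes * 2 > reps) << i           # majority vote on bit i
    return u

n = 1024
u_true = 723
rng = np.random.default_rng(2)
ahat = (0.001 * rng.standard_normal(n)).astype(complex)  # small noise everywhere
ahat[u_true] = 3.0                            # one heavy coefficient
a = n * np.fft.ifft(ahat)                     # time-domain signal
sample = lambda j: a[j % n]
print(find_frequency_noisy(sample, n, rng))   # 723
```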

Estimating the value of the heavy hitter
Recall that a_j = â_u ω^{uj} + Σ_{u'≠u} â_{u'} ω^{u'j} = â_u ω^{uj} + μ_j, where E[|μ_r|²] = Σ_{u'≠u} |â_{u'}|² = ||â - â*||_2²
By Markov's inequality, Pr_r[|μ_r| > 2||â - â*||_2] ≤ 1/4
Otherwise, the estimate â'_u = a_r ω^{-ur} satisfies |â'_u - â_u| ≤ 2||â - â*||_2
If we set â'_{u'} = 0 for all u' ≠ u, then ||â - â'||_2 ≤ |â'_u - â_u| + ||â - â*||_2 ≤ 3||â - â*||_2
So â' satisfies the L2/L2 guarantee with C = 3
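
The estimation step and its Markov guarantee can be checked numerically; here the bound Pr_r[|μ_r| > 2||â - â*||_2] ≤ 1/4 is verified over all n choices of r (the noise model is illustrative):

```python
import numpy as np

# Estimate the heavy coefficient: a_r * omega^(-u*r) = ahat_u + mu_r * omega^(-u*r),
# so the estimation error is |mu_r|, which Markov keeps below 2*||ahat - ahat*||_2
# for at least 3/4 of the choices of r.
n = 1024
u = 723
rng = np.random.default_rng(3)
ahat = (0.001 * rng.standard_normal(n)).astype(complex)
ahat[u] = 3.0 + 1.0j
a = n * np.fft.ifft(ahat)                     # a_j = sum_u ahat_u omega^(u*j)
omega = np.exp(2j * np.pi / n)
j = np.arange(n)
errors = np.abs(a * omega ** (-u * j) - ahat[u])   # |estimate - ahat_u| per r
tail = np.sqrt(np.sum(np.abs(ahat) ** 2) - np.abs(ahat[u]) ** 2)
print(np.mean(errors <= 2 * tail) >= 0.75)    # True, by Markov's inequality
```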