
1 Audiovisual Communications Laboratory. Sparse Sampling: Variations on a Theme by Shannon. Martin Vetterli, EPFL & UC Berkeley. Joint work with T. Blu (CUHK), L. Coulot, A. Hormati (EPFL), P. L. Dragotti (ICL) and P. Marziliano (NTU). Polytechnique, Fall 2008

2 Acknowledgements
The organizers! Sponsors
– NSF Switzerland: thank you!
Collaborations
– C. Lee, B. Aryan, D. Julian, Qualcomm
– J. Kusuma, MIT/Schlumberger
– I. Maravic, Deutsche Bank
– P. Prandoni, Quividi
Discussions and interactions
– V. Goyal, MIT
– M. Unser, EPFL

3 Outline
1. Introduction: Shannon and beyond…
2. Signals with finite rate of innovation: shift-invariant spaces and Poisson spikes; sparsity in CT and DT
3. Sampling signals at their rate of innovation: the basic set up (periodic set of Diracs, sampled at Occam's rate); variations on the theme (Gaussians, splines and Strang-Fix)
4. The noisy case: total least squares and Cadzow; lower bounds on SNR (Cramer-Rao and uncertainty principles)
5. Sparse sampling, compressed sensing, and compressive sampling: sparsity, fat matrices, and frames; norms and algorithms
6. Applications and technology: wideband communications, UWB, superresolution; technology transfer, or surfing in San Diego
7. Discussion and conclusions

4 The Question!
You are given a class of objects: a class of functions (e.g. BL).
You have a sampling device, as usual, to acquire the real world:
– smoothing kernel or lowpass filter
– regular, uniform sampling
That is, the workhorse of sampling!
Obvious question: when does a minimum number of samples uniquely specify the function?

5 The Question!
About the observation kernel:
– Given by nature: diffusion equation, Green's function. Ex: sensor networks
– Given by the set up: designed by somebody else, it is out there. Ex: Hubble telescope
– Given by design: pick the best kernel. Ex: engineered systems, but constraints
About the sampling rate:
– Given by problem. Ex: sensor networks
– Given by design: usually, as low as possible. Ex: digital UWB

6 1. Shannon and Beyond…
1948 foundation paper: "If a function x(t) contains no frequencies higher than W cps, it is completely determined by giving its ordinates at a series of points spaced 1/(2W) seconds apart." [If approx. T long and W wide, 2TW numbers specify the function.]
It is a representation theorem: {sinc(t − n)}, n ∈ Z, is an orthogonal basis for BL[−π, π], and any x(t) ∈ BL[−π, π] can be written as
x(t) = Σ_{n∈Z} x(n) sinc(t − n).
Note: slow convergence of the series; Shannon bandwidth; BL is sufficient, not necessary.
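To make the representation concrete, here is a minimal numpy sketch of (truncated) sinc reconstruction; the function name and sample layout are my own choices, not from the talk. Truncating the series illustrates the slow convergence noted above.

```python
import numpy as np

def sinc_interp(samples, T, t):
    """Shannon reconstruction sketch: x(t) = sum_n x(nT) sinc((t - nT)/T),
    truncated to the available samples x(0), x(T), ..., x((N-1)T)."""
    n = np.arange(len(samples))
    # np.sinc is the normalized sinc: sinc(u) = sin(pi*u) / (pi*u)
    return np.array([np.sum(samples * np.sinc((ti - n * T) / T)) for ti in t])
```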

7 A Sampling Nightmare…
Shannon, BL case: 1/T degrees of freedom per unit time.
But: a single discontinuity, and no more sampling theorem…
Are there other signals with a finite number of degrees of freedom per unit of time that allow exact sampling results?

8 Examples of non-bandlimited signals

9 1. Shannon and Beyond…
Is there a sampling theory beyond Shannon?
– Shannon: bandlimitedness is sufficient but not necessary
– Shannon bandwidth: dimension of subspace
– Shift-invariant spaces, popular with wavelets etc.
What about:
– piecewise constant signals (CDMA)
– pulse position modulation (UWB)
– images with sharp edges
Thus, develop a sampling theory for classes of non-bandlimited but sparse signals!

10 Related work
Spectral estimation, Fourier analysis, sampling
– Prony, 1795
– classic high-resolution spectral estimation (Stoica/Moses)
– uncertainty principle (Donoho and Stark, 1989)
– spectrum-blind sampling (Bresler et al, 1996)
Error correction coding
– errors are sparse… Reed-Solomon, Berlekamp-Massey
Randomized algorithms for fast estimation
– faster Fourier transforms, Gilbert et al, 2005
– sketches of databases, Indyk et al
Compressed sensing, compressive sampling
– Donoho et al, 2006
– Candes et al, 2006

11 Outline
1. Introduction
2. Signals with finite rate of innovation: band-limited spaces; shift-invariant spaces; Poisson spikes; rate of innovation ρ; the set up and the key questions
3. Sampling signals at their rate of innovation
4. The noisy case
5. Sparse sampling, compressed sensing, and compressive sampling
6. Applications and technology
7. Discussion and conclusions

12 2. Sparsity and Signals with Finite Rate of Innovation
Sparsity:
– CT: parametric class with a finite number of degrees of freedom (e.g. K Diracs, or 2K degrees of freedom)
– DT: N-dimensional vector x, and its ℓ0 norm K = ||x||_0. Then K/N indicates sparsity
Compressibility: an object expressed in an ortho-basis has a fast-decaying NLA error. For example, decay of the ordered wavelet coefficients in O(k^{−α}), α > 1.
ρ: rate of innovation, or degrees of freedom per unit of time. Call C_T the number of degrees of freedom in the interval [−T/2, T/2]; then
ρ = lim_{T→∞} C_T / T.
Note: relation to the information rate in Shannon-48.

13 2. Signals with Finite Rate of Innovation
1. Bandlimited spaces: assume x(t) is bandlimited to [−B/2, B/2]. Then x(t) is exactly defined by its samples {x_k}, k ∈ Z, spaced T = 1/B secs apart. We call this the rate of innovation: ρ = 1/T = B.
2. Shift-invariant spaces: {φ(t − kT)}, k ∈ Z, is an ortho basis for S.
In both cases: ρ = 1/T.

14 2. Signals with Finite Rate of Innovation
3. Poisson process: a set of Dirac pulses, with |t_k − t_{k+1}| exponentially distributed.
Innovations: the times {t_k}; the rate of innovation ρ is the average # of Diracs per sec. Call C_T the number of Diracs in the interval [−T/2, T/2]; then ρ = lim_{T→∞} C_T / T.
Poisson process: the average interarrival time is 1/λ, thus ρ = λ.
Weighted Diracs: x(t) = Σ_k x_k δ(t − t_k), with both weights and locations free, so ρ = 2λ.
Obvious question: is there a sampling theorem at ρ samples per sec? Necessity: obvious! Sufficiency?

15 2. Signals with Finite Rate of Innovation
The set up: y_n = ⟨x(t), φ(t/T − n)⟩, where x(t) is the signal, φ(t) the sampling kernel, y(t) the filtering of x(t), and y_n the samples.
For a sparse input, like a sum of Diracs (in DT and CT):
– When is there a one-to-one map y_n ↔ x(t)?
– Efficient algorithm?
– Stable reconstruction?
– Robustness to noise?
– Optimality of recovery?

16 Outline
1. Introduction
2. Signals with finite rate of innovation
3. Sampling signals at their rate of innovation
– The basic set up: periodic set of Diracs, sampled at their rate of innovation
– A representation theorem [VMB:02]
– Toeplitz system; annihilating filter and root finding; Vandermonde system
– Total least squares and SVD
– Fourier case, Dirichlet versus sinc; Gaussian case; spline case and Strang-Fix
4. The noisy case
5. Sparse sampling, compressed sensing, and compressive sampling
6. Applications and technology
7. Discussion and conclusions

17 An exemplary case and overview
K Diracs on the interval: 2K degrees of freedom. Periodic case:
x(t) = Σ_{n∈Z} Σ_{k=1}^{K} x_k δ(t − t_k − nτ).
Key: the Fourier series is a weighted sum of K exponentials.
Result: taking 2K+1 samples from a lowpass version of bandwidth (2K+1)/τ allows to perfectly recover x(t).
Method: Yule-Walker, annihilating filter, Vandermonde system.

18 A Representation Theorem [VMB:02]
Theorem: consider a periodic stream x(t) of K Diracs, of period τ, with weights {x_k} and locations {t_k}. Take a sampling kernel φ(t) of bandwidth B, with Bτ an odd integer > 2K (a sinc, or a Dirichlet kernel). Then the set of samples y_n = ⟨x(t), φ(t − nT)⟩, with T = τ/N and N ≥ Bτ, is a sufficient characterization of x(t).

19 Note: Linear versus Non-Linear Problem
The problem is non-linear in the {t_k}, and linear in the {x_k} given the {t_k}.
Consider two such streams of K Diracs, of period τ, with weights and locations {x_k, t_k} and {x'_k, t'_k}, respectively. Their sum is in general a stream with 2K Diracs. But, given a set of locations {t_k}, the problem is linear in {x_k}.
The key to the solution: separability of the non-linear problem from the linear one.
Note: finding locations is a key task in estimation/retrieval of sparse signals, but also in registration, feature detection, etc.

20 Proof
The signal is periodic, so consider its Fourier series:
x(t) = Σ_{m∈Z} X[m] e^{j2πmt/τ}.
1. The samples y_n are a sufficient characterization of the central 2K+1 Fourier series coefficients X[−K], …, X[K]. Use the sampling theorem, or graphically: the lowpass kernel retains only those 2K+1 coefficients, which the samples then determine.

21 Proof
2. The Fourier series coefficients X[m] = (1/τ) Σ_{k=1}^{K} x_k u_k^m, with u_k = e^{−j2π t_k/τ}, form a linear combination of K complex exponentials. These can be killed using a filter which puts zeros at the locations of the exponentials:
H(z) = Σ_{i=0}^{K} h_i z^{−i} = Π_{k=1}^{K} (1 − u_k z^{−1}).
This filter satisfies (h ∗ X)[m] = Σ_{i=0}^{K} h_i X[m−i] = 0, because each exponential contributes u_k^m H(u_k) = 0, and it is thus called the annihilating filter. The filter has K+1 coefficients, but h_0 = 1; thus the K degrees of freedom {u_k}, k = 1…K, specify the locations.

22 Proof
3. To find the coefficients of the annihilating filter, we solve the convolution equations (h ∗ X)[m] = 0 for m = 1, …, K, with h_0 = 1, which leads to a K by K Toeplitz system in (h_1, …, h_K).
4. Given the coefficients {1, h_1, h_2, …, h_K}, we get the {t_k}'s by factorization of H(z) = Π_{k=1}^{K} (1 − u_k z^{−1}), where u_k = e^{−j2π t_k/τ}, i.e. t_k = −(τ/2π) · arg(u_k).

23 Proof
5. To find the coefficients {x_k}, we have a linear problem: given the {t_k}'s, or equivalently the {u_k}'s, the Fourier series is given by X[m] = (1/τ) Σ_{k=1}^{K} x_k u_k^m. This can be written as a Vandermonde system, which always has a solution for distinct {u_k}'s.
This completes the proof of unicity, using Bτ ≥ 2K samples. Since Bτ is odd, we need 2K+1 samples, which is as close as it gets to Occam's razor! Also, B > 2K/τ = ρ.

24 Notes on Proof
The procedure is constructive, and leads to an algorithm (see the sketch below):
1. Take 2K+1 samples y_n from the Dirichlet kernel output
2. Compute the DFT to obtain the Fourier series coefficients −K…K
3. Solve a K by K Toeplitz system of equations to get H(z)
4. Find the roots of H(z) by factorization, to get the u_k and t_k
5. Solve a K by K Vandermonde system of equations to get the x_k
The complexity is:
1. Analog-to-digital converter
2. K log K
3. K^2
4. K^3 (can be accelerated)
5. K^2
Or polynomial in K! Note: for a size-N vector with K Diracs, O(K^3) complexity, noiseless.
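The following numpy sketch implements steps 3–5 in the noiseless case. The function name, the array layout (X[K+m] holds Fourier coefficient m, for m = −K…K), and the normalization X[m] = (1/τ) Σ_k x_k u_k^m are assumptions consistent with the proof above, not code from the talk.

```python
import numpy as np

def recover_diracs(X, K, tau):
    """Recover K Dirac locations and weights from the 2K+1 central
    Fourier coefficients X[m], m = -K..K (noiseless sketch)."""
    # Step 3: K x K Toeplitz system sum_{i=0}^{K} h_i X[m-i] = 0,
    # m = 1..K, with h_0 = 1 (array index K+m holds coefficient m).
    A = np.array([[X[K + m - i] for i in range(1, K + 1)]
                  for m in range(1, K + 1)])
    b = -np.array([X[K + m] for m in range(1, K + 1)])
    h = np.linalg.solve(A, b)                      # h_1 .. h_K
    # Step 4: the roots of z^K + h_1 z^(K-1) + ... + h_K are the u_k,
    # and u_k = exp(-j 2 pi t_k / tau) gives the locations.
    u = np.roots(np.concatenate(([1.0], h)))
    t = np.mod(-np.angle(u) * tau / (2 * np.pi), tau)
    # Step 5: Vandermonde system X[m] = (1/tau) sum_k x_k u_k^m, m = 0..K-1.
    V = np.vander(u, K, increasing=True).T         # V[m, k] = u_k^m
    x = tau * np.linalg.solve(V, X[K:2 * K])       # weights (complex, near-real)
    return t, x
```

With exact coefficients this recovers the {t_k, x_k} up to numerical precision; in the noisy case, step 3 is replaced by the TLS and Cadzow procedures below.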

25 More on annihilation: Rectangular systems
What if we have more than 2K+1 Fourier coefficients? What if the filter has length L > K? As long as H(z) has the {u_k} as roots, it will satisfy (h ∗ X)[m] = 0, and conversely, if it satisfies the above, it has the {u_k} as roots.
This leads to a rectangular system of size 2M−L+1 by L+1,
A H = 0,
where M = ⌊Bτ/2⌋, A is the Toeplitz matrix with entries A[m, i] = X[m − i], and H = [h_0, h_1, …, h_L]^T. It can be solved using SVD methods (see the noisy case…).

26 Generalizations [VMB:02]
For the class of periodic FRI signals, which includes
– sequences of Diracs
– non-uniform or free-knot splines
– piecewise polynomials
there are sampling schemes with sampling at the rate of innovation, with perfect recovery and polynomial complexity.
Variations: finite length, 2D, local kernels, etc.

27 Generalizations [VMB:02]
[Figures: sampling kernels and reconstructions for the sinc, Gaussian and spline cases]

28 Generalizations [DVB:07]
The return of Strang-Fix! Local, polynomial-complexity reconstruction, for Diracs and piecewise polynomials. Pure discrete-time processing!

29 Generalizations: 2D extensions [MV:06]
Note: true 2D processing!

30 Outline
1. Introduction
2. Signals with finite rate of innovation
3. Sampling signals at their rate of innovation
4. The noisy case
– Total least squares
– Cadzow, or model-based denoising
– Lower bounds on SNR: Cramer-Rao
– Uncertainty principles on location and amplitude of Diracs
– Algorithms
– Examples
5. Sparse sampling, compressed sensing, and compressive sampling
6. Applications and technology
7. Discussion and conclusions

31 The noisy case…
Acquisition in the noisy case: the samples are corrupted both by "analog" noise acting before acquisition (e.g. communication noise on a channel) and by digital noise due to acquisition (ADC, etc.).
Example: ultrawideband (UWB) communication…

32 The noisy case: the non-linear estimation problem
Data: y_n = ⟨x(t), φ(t − nT)⟩ + ε_n, n = 0, …, N−1, where T = τ/N, φ(t) is the Dirichlet kernel, and N/(ρτ) is the redundancy.
Estimation: find the set {x_k, t_k}, k = 1, …, K, such that the squared error Σ_n |y_n − ⟨x̂(t), φ(t − nT)⟩|^2 is minimized.
This is a non-linear estimation problem in the {t_k}, and a linear estimation problem in the {x_k} given the {t_k}.

33 Total least squares solution (medium to high SNR)
The annihilation equation A H = 0 can only be approximately satisfied on noisy data. Instead, minimize ||A H|| under the constraint ||H|| = 1. Pick L = K for minimal size and unique locations.
Method: SVD of A, or eigen-decomposition of A^T A. That is, with A = U S V^T, H is the last column of V (the right singular vector of the smallest singular value). From H, retrieve the {t_k} by factorization.
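A minimal sketch of the TLS step, assuming the noisy coefficients X[m], m = −M…M, are stored in an array of length 2M+1; the layout and function name are mine, not from the talk.

```python
import numpy as np
from scipy.linalg import toeplitz

def tls_annihilator(X, K):
    """TLS annihilating filter (sketch): minimize ||A h|| s.t. ||h|| = 1,
    with A the Toeplitz annihilation matrix built from noisy X, L = K."""
    A = toeplitz(X[K:], X[K::-1])   # A[m, i] = X_arr[K + m - i], filter length K+1
    _, _, Vh = np.linalg.svd(A)
    h = Vh[-1].conj()               # right singular vector of the smallest singular value
    return h / h[0]                 # normalize so that h_0 = 1
```

The roots of the resulting H(z) then give the {u_k} and the {t_k} exactly as in the noiseless case.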

34 Low SNR: denoising using Cadzow's iterated algorithm
For small SNRs, the TLS method can fail. Use a longer filter, of size L > K, but use the knowledge that the matrix A is still only of rank K. Typically, use L = ⌊Bτ/2⌋, or of order N/2. Then:
– SVD of A = U S V^T
– Put the L+1−K smallest diagonal coefficients of S to zero, leading to S'
– New matrix A' = U S' V^T
– This matrix is not Toeplitz anymore; make it so by averaging along the diagonals
– This is a denoised version of the sequence, still fitting the model
– Iterate until a convergence criterion is met
Note: structured matrix approximation, best in Frobenius norm.
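A sketch of the Cadzow iteration under the same array layout as above; a fixed iteration count stands in for the convergence criterion.

```python
import numpy as np
from scipy.linalg import toeplitz

def cadzow_denoise(X, K, L, n_iter=20):
    """Cadzow denoising (sketch): alternate rank-K truncation and
    Toeplitz averaging on the annihilation matrix built from X."""
    X = np.asarray(X, dtype=complex).copy()
    for _ in range(n_iter):
        A = toeplitz(X[L:], X[L::-1])    # (2M-L+1) x (L+1), A[r, c] = X_arr[L + r - c]
        U, s, Vh = np.linalg.svd(A, full_matrices=False)
        s[K:] = 0.0                      # zero the L+1-K smallest singular values
        A = (U * s) @ Vh                 # best rank-K approximation (Frobenius norm)
        for m in range(len(X)):          # restore Toeplitz form: average each diagonal
            X[m] = np.diagonal(A, offset=L - m).mean()
    return X                             # denoised coefficients, still fitting the model
```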

35 Example 1
7 Diracs in 5 dB SNR, 71 samples.
[Figures: original and noisy version; original and retrieved Diracs]

36 Example 2
100 Diracs in 20 dB SNR, 1001 samples.
[Figures: noisy version; original and retrieved Diracs]

37 CRB for the 2-Dirac case: Experiment
Find [x_1, t_1, x_2, t_2] from 21 noisy samples [y_1, y_2, …, y_21].
The CRB is reached down to an SNR of 5 dB; thus we have an optimal retrieval method with a polynomial algorithm.
Note: SNR is not a good measure here; PSNR is more appropriate.

38 Outline
1. Introduction
2. Signals with finite rate of innovation
3. Sampling signals at their rate of innovation
4. The noisy case
5. Sparse sampling, compressed sensing, and compressive sampling
– Discrete, finite-size set up: sparse vectors
– Fat matrices and frames
– Unicity and conditioning
– Reconstruction algorithms
– Compressed sensing and compressive sampling
6. Applications and technology
7. Discussion and conclusions

39 Sparsity revisited…
Consider a discrete-time, finite-dimensional set up.
Model: the world is discrete, finite-dimensional of size N; x ∈ R^N, but ||x||_0 = K << N: x is K-sparse.
Method: take M measurements y = F x, where K < M << N, and the measurement matrix F is a fat matrix of size M x N.

40 Geometry of the problem
This is a vastly under-determined system… but we know x is sparse!
– F is a frame matrix, of size M by N, M << N
– Key issue: the (N choose K) submatrices obtained by choosing K out of the N columns of F
– Consider the set of submatrices {F_k}, k = 1…(N choose K)
– Calculate the projection of y onto the range of F_k
– If y lies in the range of F_k, we have a possible solution
– In general, choose the k minimizing the residual ||y − P_{F_k} y|| (see the sketch below)
– Note: this is hopeless in general ;)
Necessary conditions (for most inputs, or with prob. 1):
– M > K
– all F_k must have rank K
– all ranges of the F_k must be different
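A brute-force sketch of this combinatorial decoder (function and variable names are mine); it makes the "hopeless in general" point concrete, since it visits all (N choose K) subsets.

```python
import numpy as np
from itertools import combinations

def exhaustive_sparse_solve(F, y, K):
    """Try every K-column submatrix F_k and keep the one whose
    range is closest to y (brute-force K-sparse recovery sketch)."""
    M, N = F.shape
    best_err, best_x = np.inf, None
    for cols in combinations(range(N), K):
        Fk = F[:, cols]
        xk = np.linalg.lstsq(Fk, y, rcond=None)[0]  # least-squares fit on range(F_k)
        err = np.linalg.norm(y - Fk @ xk)           # residual ||y - P_{F_k} y||
        if err < best_err:
            best_err = err
            best_x = np.zeros(N, dtype=xk.dtype)
            best_x[list(cols)] = xk
    return best_x, best_err
```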

41 Geometry of the problem
Easy solution… Vandermonde matrices!
– Each submatrix F_k is of rank K, and each spans a different subspace
– Particular case: the Fourier matrix or DFT matrix, or a slice of it

42 Geometry of the problem
But are these good matrices?
– The Fourier case is very relevant for FRI
– Study the conditioning of the K-column submatrices of a slice of the Fourier matrix
– [Figure: condition number for various (N, M, K)]
– Worst case: the first K columns; still, as M → N, the submatrices become orthogonal
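A quick numeric check of this claim; the sizes below are illustrative only. It computes the condition number of every K-column slice of an M x N partial DFT matrix.

```python
import numpy as np
from itertools import combinations

N, M, K = 16, 8, 3
# M consecutive rows of the N x N DFT matrix: a "slice" of the Fourier matrix.
F = np.exp(-2j * np.pi * np.outer(np.arange(M), np.arange(N)) / N) / np.sqrt(M)
conds = {c: np.linalg.cond(F[:, list(c)]) for c in combinations(range(N), K)}
print(max(conds.values()))   # worst case: adjacent columns, e.g. the first K
print(min(conds.values()))   # well-spread columns are nearly orthogonal
```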

43 The power of random matrices! Donoho, Candes et al
Set up: x ∈ R^N, but ||x||_0 = K << N: x is K-sparse; F of size M by N, a measurement matrix with random entries (Gaussian, Bernoulli).
– Pick M = O(K log(N/K))
– With high probability, this matrix is good!
Condition on the matrix F: the uniform uncertainty principle, or restricted isometry property. All K-sparse vectors x satisfy an approximate norm conservation, (1−δ)||x||^2 ≤ ||Fx||^2 ≤ (1+δ)||x||^2.
Reconstruction method: solve the linear program min ||x||_1 under the constraint F x = y (see the sketch below).
Strong result: l_1 minimization finds, with high probability, the sparse solution; that is, the l_0 and l_1 problems have the same solution.
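A sketch of the l_1 reconstruction as a linear program, using the standard split into positive and negative parts; the function and variable names are mine.

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(F, y):
    """min ||x||_1 subject to F x = y, posed as an LP (sketch).
    Split x = xp - xn with xp, xn >= 0, and minimize sum(xp) + sum(xn)."""
    M, N = F.shape
    c = np.ones(2 * N)                # objective: the l1 norm of x
    A_eq = np.hstack([F, -F])         # equality constraint: F xp - F xn = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=[(0, None)] * (2 * N))
    return res.x[:N] - res.x[N:]
```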

44 Relation to compressed sensing: Differences
Sparse sampling of signal innovations:
+ continuous or discrete, infinite or finite dimensional
+ lower bounds (CRB), provably optimal reconstruction
+ close to "real" sampling, deterministic
– not universal, designer matrices
Compressed sensing:
+ universal and more general
± probabilistic, can be complex
– discrete, redundant
The real game is in the space of frame measurement matrices F:
– Best matrix? (constrained Grassmannian manifold)
– Tradeoff for M: 2K, K log N, K log^2 N
– Complexity (measurement, reconstruction)

45 Outline
1. Introduction
2. Signals with finite rate of innovation
3. Sampling signals at their rate of innovation
4. The noisy case
5. Sparse sampling, compressed sensing, and compressive sampling
6. Applications and technology
– ODEs and FRI acquisition
– Wideband communications, ranging, channel estimation
– EEG
– Shot noise
– Superresolution
– OCT
– A technology transfer story
7. Discussion and conclusions

46 Applications: The proof of the pie is in …
1. Look for sparsity!
– In either domain, in any basis
– Different singularities
– Model is key
– Physics might be at play
2. Choose operating point, algorithm
– Low noise, high noise?
– Model precise or approximate?
– Choice of kernel?
– Algorithmic complexity? Iterative?
3. It is not a black box…
– It is a toolbox
– Handcraft the solution
– No free lunch!

47 Applications: Superresolution [Dragotti et al]
Actual acquisition with a digital camera (Nikon D70):
– registration using FRI, with PSF and moments
– captured images: 60 images of 48 by 48 pixels
– super-resolved image: 480 by 480 pixels

48 Applications: Optical Coherence Tomography [Blu et al]
Range finding in interferometric images:
– backscattered waves and their interference with a reference wave
– depth is in the phase, refraction index in the amplitude
– the signal is a combination of Gabor functions

49 From Theorems to Patents to Products ;)
Theorem 1 -> Patent 1; {Patents 1..N} -> Patent troll? No: real technology transfer!

50 From Theorems to Patents to Products ;)
Real technology transfer!

51 Conclusions and Outlook
Sampling at the rate of innovation:
– Cool!
– Sharp theorems
– Robust algorithms
– Provable optimality over wide SNR ranges
Many actual and potential applications:
– Fit the model (needs some work)
– Apply the "right" algorithm
– Catch the essence!
Still a number of good questions open, from the fundamental to the algorithmic and the applications.
A slogan: sort the wheat from the chaff… at Occam's rate!

52 Publications
Special issue on Compressive Sampling:
– R. Baraniuk, E. Candes, R. Nowak and M. Vetterli (Eds.), IEEE Signal Processing Magazine, March 2008. 10 papers: overview of the theory and practice of sparse sampling, compressed sensing and compressive sampling (CS).
Basic paper:
– M. Vetterli, P. Marziliano and T. Blu, "Sampling Signals with Finite Rate of Innovation," IEEE Transactions on Signal Processing, June 2002.
Main paper, with comprehensive review:
– T. Blu, P. L. Dragotti, M. Vetterli, P. Marziliano and L. Coulot, "Sparse Sampling of Signal Innovations: Theory, Algorithms and Performance Bounds," IEEE Signal Processing Magazine, Special Issue on Compressive Sampling, March 2008.

