Sudocodes: Fast Measurement and Reconstruction of Sparse Signals
Shriram Sarvotham, Dror Baron, Richard Baraniuk
ECE Department, Rice University
dsp.rice.edu/cs
Motivation: coding of sparse data
- Streaming in CDNs, distributed storage systems
- Delivery of content that has a sparse representation (e.g., thresholded DCT/wavelet coefficients in JPEG/JPEG2000)
- Distributed coding of sparse data
Sparse signal acquisition
- Consider x of length N that contains only K << N nonzero coefficients
- Are there efficient ways to measure and recover x?
- Traditional DSP approach: acquisition obtains N measurements; sparsity is exploited only in the processing stage
- New Compressed Sensing (CS) approach: acquisition obtains just M << N measurements; sparsity is exploited during signal acquisition [Candes et al.; Donoho]
Compressive Sampling
- Signal x is K-sparse in some basis/dictionary; WLOG assume sparsity in the space domain
- Measure the signal via a few linear projections: y = Phi x, where y holds the M measurements and x is the sparse signal with K nonzero entries
- Random Gaussian measurements Phi will work!
CS miracle: L1 reconstruction
- Find the solution with smallest L1 norm: min ||x||_1 subject to y = Phi x [Candes et al.; Donoho]
- If M = cK log(N/K) (for a suitable constant c), then perfect reconstruction with high probability
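As an illustration of this L1 step, here is a minimal basis-pursuit sketch using a generic LP solver; the function name and the use of scipy are my choices, not the authors' implementation.

```python
# Minimal basis-pursuit sketch: min ||x||_1 subject to Phi @ x = y,
# via the standard split x = u - v with u, v >= 0.
import numpy as np
from scipy.optimize import linprog

def l1_reconstruct(Phi, y):
    M, N = Phi.shape
    c = np.ones(2 * N)                       # objective: sum(u) + sum(v) = ||x||_1
    A_eq = np.hstack([Phi, -Phi])            # equality constraint: Phi @ (u - v) = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    uv = res.x
    return uv[:N] - uv[N:]                   # recover x = u - v
```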
CS miracle: L1 reconstruction
- Performance: efficient encoding (M = O(K log(N/K)) measurements) and polynomial-complexity reconstruction (linear programming, roughly O(N^3))
CS miracle: L1 reconstruction
- But L1 reconstruction is still impractical for many applications
- Reconstruction times:
  N = 1,000      t = 10 seconds
  N = 10,000     t = 3 hours
  N = 100,000    t ~ months
Why is reconstruction expensive?
- Culprit: the dense, unstructured measurement matrix Phi
Fast CS reconstruction
- Sudocode measurement matrix Phi is sparse
- Only 0/1 entries in Phi
- Each row of Phi contains L randomly placed 1's (a generation sketch follows)
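A sketch of how such a matrix could be drawn; the dense storage and the function name are illustrative choices (a real implementation would use a sparse format).

```python
# Draw an M x N sudocode-style matrix: each row has exactly L ones
# in randomly chosen columns and zeros elsewhere.
import numpy as np

def sudocode_matrix(M, N, L, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    Phi = np.zeros((M, N), dtype=np.uint8)
    for i in range(M):
        cols = rng.choice(N, size=L, replace=False)   # L distinct positions per row
        Phi[i, cols] = 1
    return Phi
```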
Fast CS reconstruction: sudocode performance
- Efficient encoding: M = O(K log N) measurements
- Sublinear-complexity reconstruction
- Encouraging numerical results: N = 100,000, K = 1,000, M = 5,132, t = 5.47 seconds
Signal model
- Signal x is strictly K-sparse
- Every nonzero coefficient is drawn from a continuous distribution, so each nonzero value is unique almost surely (a generator sketch follows)
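An illustrative generator for this signal model (names and the Gaussian choice are mine):

```python
# Strictly K-sparse signal: exactly K nonzeros, each drawn from a continuous
# (here Gaussian) distribution, so all nonzero values are distinct almost surely.
import numpy as np

def sparse_signal(N, K, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    x = np.zeros(N)
    support = rng.choice(N, size=K, replace=False)
    x[support] = rng.standard_normal(K)
    return x
```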
Sudocode reconstruction
- Process each measurement y_i in succession
- Each y_i can recover some x_j's
Sudocode reconstruction
- Like sudoku puzzles!
Case 1: Zero measurement
- If y_i = 0, then all coefficients in the support of row i are zero
- Can resolve up to L coefficients at once
- Reduces the size of the problem (see the sketch below)
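A minimal sketch of the zero-measurement rule; variable names are illustrative, and `x_hat` is assumed to be a dict from index to resolved value.

```python
def zero_rule(row_support, y_i, x_hat):
    """Case 1: if measurement y_i is exactly zero, every coefficient that this row
    touches must be zero (nonzero coefficients are nonzero with probability one)."""
    if y_i == 0:
        for j in row_support:
            x_hat[j] = 0.0        # resolves up to L coefficients at once
        return True               # progress made
    return False
```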
Case 2: Support set of size 1
- If row i touches exactly one unresolved coefficient x_j, then y_i trivially resolves x_j (see the sketch below)
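A corresponding sketch of the singleton rule, assuming y_i has already been re-explained (contributions of previously resolved coefficients subtracted):

```python
def singleton_rule(row_support, y_i, x_hat):
    """Case 2: if only one coefficient in the row is still unresolved,
    the re-explained measurement value reveals it directly."""
    unresolved = [j for j in row_support if j not in x_hat]
    if len(unresolved) == 1:
        x_hat[unresolved[0]] = y_i    # y_i equals that single remaining coefficient
        return True
    return False
```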
Case 3: Matching measurements
- If two measurements match (y_i = y_j, both nonzero), the matches originate almost surely from the same support coefficients
- Coefficients in the disjoint support are zero
- The common support of the two rows contains the nonzeros (see the sketch below)
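A sketch of the matching rule; `support_i` and `support_j` are the sets of unresolved indices touched by the two rows, and both measurements are assumed re-explained.

```python
def matching_rule(support_i, support_j, y_i, y_j, x_hat):
    """Case 3: equal nonzero measurements originate (almost surely) from the
    same coefficients, so everything outside the common support is zero."""
    if y_i == y_j and y_i != 0:
        common = support_i & support_j
        for k in (support_i | support_j) - common:    # disjoint support
            x_hat[k] = 0.0
        return True
    return False
```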
Trigger of revelations
- Recovery of a coefficient x_j can trigger more revelations
- An avalanche of coefficient revelations (see the decoding-loop sketch below)
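To show how the avalanche plays out, here is a toy end-to-end decoding loop that applies the zero-measurement and singleton rules and re-explains measurements after every recovery. The numerical tolerance and the omission of the matching rule are simplifications of my own, not the authors' decoder.

```python
import numpy as np

def sudocode_decode(Phi, y, tol=1e-9, max_passes=100):
    """Toy iterative decoder: apply Case 1 and Case 2, subtract each resolved
    coefficient from all measurements that contain it, and repeat until no
    further progress (Case 3 matching omitted for brevity)."""
    M, N = Phi.shape
    supports = [set(np.flatnonzero(Phi[i])) for i in range(M)]  # unresolved indices per row
    residual = y.astype(float).copy()                           # unexplained part of each y_i
    x_hat = np.full(N, np.nan)                                  # NaN = still unknown

    def resolve(j, value):
        x_hat[j] = value
        for i in range(M):                    # re-explain every measurement touching j
            if j in supports[i]:
                supports[i].discard(j)
                residual[i] -= value

    for _ in range(max_passes):
        progress = False
        for i in range(M):
            if not supports[i]:
                continue
            if abs(residual[i]) < tol:        # Case 1: (numerically) zero measurement
                for j in list(supports[i]):
                    resolve(j, 0.0)
                progress = True
            elif len(supports[i]) == 1:       # Case 2: single unresolved coefficient
                (j,) = supports[i]
                resolve(j, residual[i])
                progress = True
        if not progress:
            break
    return x_hat
```

With the generators sketched earlier (sudocode_matrix, sparse_signal), this toy loop typically recovers most coefficients when M and L are chosen generously.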
Auxiliary data structures
- Bottleneck: search for matching measurements
- With a binary search tree, a match lookup takes O(log M) time
- Re-explaining measurements requires additional data structures mapping each coefficient to the measurements that contain it (see the sketch below)
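For the match search, the slide mentions a binary search tree; as an illustrative alternative, a hash map keyed by (re-explained) measurement value gives expected constant-time grouping. This substitutes a hash map for the BST and is only a sketch.

```python
from collections import defaultdict

def find_matches(residual):
    """Group measurement indices by value; any group of size >= 2 is a set of
    matching measurements that can feed Case 3. Zero measurements are skipped
    because Case 1 already handles them."""
    groups = defaultdict(list)
    for i, value in enumerate(residual):
        if value != 0:
            groups[value].append(i)
    return [idx for idx in groups.values() if len(idx) >= 2]
```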
Design of the sudocode measurement matrix: choice of L
- Set L based on N and K
- For large N, L scales as N/K
Number of measurements
- Theorem: with L chosen as above, the decoder requires M = O(K log N) measurements to exactly reconstruct all K nonzero coefficients
- Proof sketch:
Choice of L
[Plot: choice of L for a given N and K; example shown for K = 0.02N]
Choice of L
- Numerical evidence also suggests L = O(N/K)
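A back-of-the-envelope check (mine, not from the slides) of why L on the order of N/K is natural: a row with L random ones hits about L*K/N nonzeros in expectation, so L close to N/K makes each measurement touch roughly one nonzero, which keeps Cases 1 and 2 frequent.

```python
# Expected number of nonzero coefficients hit by one measurement row.
N, K = 100_000, 1_000
L = N // K                              # L = O(N/K), as suggested by the numerical evidence
expected_hits = L * K / N               # about 1 nonzero per measurement on average
print(L, expected_hits)                 # 100, 1.0
```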
Related work
- [Cormode, Muthukrishnan]: CS scheme based on group testing
- [Gilbert et al.]: Chaining Pursuit, a CS scheme based on group testing and iteration; works best for super-sparse signals
Performance comparison
                        Chaining Pursuit           Sudocodes
N=10,000   K=10         M=5,915    t=0.16 sec      M=461     t=0.14 sec
           K=100        M=90,013   t=2.43 sec      M=803     t=0.37 sec
N=100,000               M=17,398   t=1.13 sec      M=931     t=1.09 sec
           K=1000       M>10^6     t>30 sec        M=5,132   t=5.47 sec
Sudocode applications
- Erasure codes in P2P and distributed file storage
- Streaming compressed digital content
- Thresholded DCT/wavelet coefficients for sudocoding
- Partial reconstruction of signals (e.g., detection)
Ongoing work
- Statistical dependencies between nonzero coefficients
- Adaptive linear projections
- Noisy measurements
Erasure coding
Conclusions
- Sudocodes for CS
- Key idea: use a sparse 0/1 measurement matrix Phi
- Highly efficient encoding, low-complexity reconstruction
- Applications to erasure codes, P2P networks
Number of measurements: Phase 1
- Theorem: with L chosen as above, Phase 1 requires M_1 measurements to exactly reconstruct all but the few coefficients left to Phase 2
- Proof sketch:
Two-phase decoding
- Phase 1: decode most of the coefficients with sudocode measurements
- Phase 2: decode the remaining coefficients
- Why? When most coefficients are already decoded, Phase 2 saves a factor in the number of measurements
Phase 2 measurements and decoding
- The Phase 2 matrix Phi2 is non-sparse (dense)
- Resolve the remaining coefficients by inverting the corresponding sub-matrix of Phi2
- Phase 2 complexity is cubic in the number of remaining coefficients; key: choose how many coefficients are left to Phase 2 so that its complexity stays small (see the sketch below)
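A sketch of the Phase 2 step under my reading of the slide: `Phi2` is the dense Phase 2 matrix, `remaining` lists the still-unresolved indices, and the sub-matrix of Phi2 restricted to those columns is inverted via a least-squares solve. The names and the use of lstsq are assumptions, not the authors' code.

```python
import numpy as np

def phase2_decode(Phi2, y2, x_hat, remaining):
    """Resolve the remaining coefficients from dense Phase 2 measurements:
    subtract the contribution of already-decoded coefficients, then invert
    the small sub-matrix of Phi2 restricted to the remaining columns."""
    N = len(x_hat)
    remaining = np.asarray(remaining)
    known = np.setdiff1d(np.arange(N), remaining)
    rhs = y2 - Phi2[:, known] @ x_hat[known]            # explain away resolved coefficients
    sub = Phi2[:, remaining]                            # small dense sub-matrix
    vals, *_ = np.linalg.lstsq(sub, rhs, rcond=None)    # invert (least squares) the sub-matrix
    out = x_hat.copy()
    out[remaining] = vals
    return out
```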
Compressive Sampling
- Signal x is K-sparse in some basis/dictionary; WLOG assume sparsity in the space domain
- Measure the signal via a few linear projections y = Phi x
- Random sparse measurements Phi will work!