Compressed Sensing in Multimedia Coding/Processing Trac D. Tran ECE Department The Johns Hopkins University Baltimore, MD 21218
Outline. Compressed Sensing: Quick Overview: motivations and toy examples; incoherent bases and the Restricted Isometry Property; decoding strategy (L0 versus L1 versus L2); Basis Pursuit and Matching Pursuit. Compressed Sensing in Image/Video Processing: 2D Separable Measurement Ensemble (SME); face recognition; distributed compressed video sensing (DISCOS); layered compressed sensing for robust video transmission.
Compressed Sensing History. Emmanuel Candès and Terence Tao, "Decoding by linear programming," IEEE Trans. on Information Theory, 51(12), pp. 4203-4215, Dec. 2005. Emmanuel Candès, Justin Romberg, and Terence Tao, "Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information," IEEE Trans. on Information Theory, 52(2), pp. 489-509, Feb. 2006. David Donoho, "Compressed sensing," IEEE Trans. on Information Theory, 52(4), pp. 1289-1306, Apr. 2006. Emmanuel Candès and Michael Wakin, "An introduction to compressive sampling," IEEE Signal Processing Magazine, 25(2), pp. 21-30, Mar. 2008.
Traditional Data Acquisition: Sampling. Shannon Sampling Theorem: for a band-limited signal x(t) with maximum frequency f_max to be reconstructed perfectly, it must be sampled at a rate f_s >= 2 f_max (the Nyquist rate).
Traditional Compression Paradigm. Sample, then compress, then transmit/store, then receive and decompress (MP3, JPEG, JPEG2000, MPEG...). Acquire all N samples, then keep only the largest transform coefficients. Sample first and then worry about compression later!
Sparse Signals. A signal is expanded over basis functions, and only the largest transform coefficients matter. Digital signals in practice are often sparse: audio (MP3, AAC, ~10:1 compression), images (JPEG, JPEG2000, ~20:1 compression), video sequences (MPEG-2, MPEG-4, ~40:1 compression).
Sparse Signals II. An N-pixel image x can be written as x = Ψα, where the columns of Ψ are basis functions and α collects the transform coefficients. If α has only K nonzero entries, x is a K-sparse signal in the Ψ domain even though x itself lives in a non-sparse domain.
Definition & Notation. N = the length of the signal x. K = the sparsity level of x (x is called K-sparse). M = the number of measurements (samples) taken at the encoder.
Compressed Sensing Framework. Encoding: obtain M compressed measurements y from a linear projection onto an incoherent basis, y = Φx, where the sensing matrix Φ is M x N with M << N. Decoding: reconstruct x from the measurements y via nonlinear optimization with a sparsity prior (x, in its sparsifying domain, has only K nonzero entries).
At Encoder: Signal Sensing. Each measurement in y contains a little information about every sample of x. y is not sparse; it looks i.i.d. Random projection works well! The sensing and sparsifying matrices must be incoherent.
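The encoding step above can be sketched in a few lines of NumPy. This is a minimal sketch with arbitrary sizes (N = 256, M = 64, K = 8, all assumptions for the demo): a K-sparse signal is sensed by a random Gaussian matrix, and the resulting measurement vector is dense even though the signal is sparse.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, K = 256, 64, 8

# Build a K-sparse signal x (sparse in the identity basis for simplicity)
x = np.zeros(N)
support = rng.choice(N, K, replace=False)
x[support] = rng.standard_normal(K)

# Random Gaussian sensing matrix, incoherent with the identity basis w.h.p.
Phi = rng.standard_normal((M, N)) / np.sqrt(M)

# Each measurement is a random weighted sum of ALL samples of x
y = Phi @ x
```

Every entry of `y` mixes contributions from every nonzero sample of `x`, which is why `y` itself looks like i.i.d. noise rather than a sparse vector.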
At Decoder: Signal Reconstruction. Recover x from the set of measurements y. Without the sparseness assumption, the problem is ill-posed. With the sparseness assumption, the L0-norm minimization problem is well-posed but computationally intractable. With the sparseness assumption, the L1-norm minimization can be solved via linear programming: Basis Pursuit!
Incoherent Bases: Definition. Suppose the signal is sparse in an orthonormal transform domain Ψ, and we take M measurements with an orthonormal sensing matrix Φ. Definition: the coherence between Φ and Ψ is μ(Φ, Ψ) = sqrt(N) · max_{i,j} |<φ_i, ψ_j>|, the largest correlation between any row of Φ and any column of Ψ.
Incoherent Bases: Properties. Bound on the coherence: 1 <= μ(Φ, Ψ) <= sqrt(N). When μ is small, we call the two bases incoherent. Intuition: when the two bases are incoherent, the entries of the matrix ΦΨ are spread out, so each measurement contains more information about the signal; we hope for a small μ. Some pairs of incoherent bases: the DFT and the identity matrix (μ = 1, maximally incoherent); a Gaussian (or Bernoulli) random matrix and any other basis.
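The coherence bound above is easy to check numerically. A minimal sketch (the `coherence` helper is a name introduced here for illustration): for the orthonormal DFT and identity bases, the maximal inner product is 1/sqrt(N), so μ = sqrt(N) · 1/sqrt(N) = 1, the smallest possible value.

```python
import numpy as np

N = 16
F = np.fft.fft(np.eye(N)) / np.sqrt(N)   # orthonormal DFT basis (rows)
I = np.eye(N)                            # identity sparsifying basis (columns)

def coherence(Phi, Psi):
    # mu(Phi, Psi) = sqrt(N) * max_{i,j} |<phi_i, psi_j>| for orthonormal bases
    G = np.abs(Phi.conj() @ Psi)
    return np.sqrt(Phi.shape[1]) * G.max()

mu = coherence(F, I)
```

Since every DFT row has entries of identical magnitude 1/sqrt(N), `mu` comes out exactly 1: each frequency measurement touches every pixel equally.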
Universality of Incoherent Bases. A random Gaussian white-noise basis is incoherent with any fixed orthonormal basis with high probability. If the signal is sparse in frequency, the sparsifying matrix Ψ is the DFT matrix. The product ΦΨ of a Gaussian matrix with any orthonormal basis is still Gaussian white noise!
Restricted Isometry Property. A sufficient condition for exact recovery: all sub-matrices formed from K columns of Φ are nearly orthogonal, i.e., (1 - δ_K) ||x||_2^2 <= ||Φx||_2^2 <= (1 + δ_K) ||x||_2^2 for every vector x with K non-zero entries.
L0- and L1-norm Reconstruction. L0-norm reconstruction takes advantage of the sparsity prior: min ||x||_0 subject to y = Φx finds the sparsest solution, but the problem requires combinatorial search and exhaustive computation. L1-norm reconstruction, the compressed sensing framework, solves min ||x||_1 subject to y = Φx. This is a convex optimization problem that can be solved with linear programming, and it also finds the sparsest x, which turns out to be the exact solution.
L2-norm Reconstruction: the classical approach, min ||x||_2 subject to y = Φx, has the closed-form solution x* = Φ^T (Φ Φ^T)^{-1} y. We find the x with the smallest energy. Unfortunately, this method almost never finds the sparsest, correct answer.
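The closed-form minimum-energy solution above can be evaluated directly, and the demo below (sizes and support chosen arbitrarily for illustration) shows its failure mode: the L2 solution satisfies the measurements exactly, yet spreads energy over far more than K entries.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N, K = 10, 30, 2
A = rng.standard_normal((M, N))

x0 = np.zeros(N)
x0[[3, 17]] = [1.0, -2.0]        # the true 2-sparse signal
y = A @ x0

# Closed-form minimum-L2-norm solution: x = A^T (A A^T)^{-1} y
x_l2 = A.T @ np.linalg.solve(A @ A.T, y)
```

`x_l2` is feasible (it reproduces `y` exactly) but dense: minimizing energy favors many small entries over a few large ones, which is precisely why L2 is the wrong prior for sparse signals.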
L1-Minimization. Problem: min ||x||_1 subject to y = Φx. Writing x = u - v with u, v >= 0 turns this into a standard LP. Many techniques are available: simplex, primal-dual interior-point, log-barrier...
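The u - v splitting described above can be handed to any LP solver. A minimal sketch assuming SciPy is available (problem sizes are arbitrary): minimize 1ᵀu + 1ᵀv subject to [A -A][u; v] = y with u, v >= 0, then read off x = u - v.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)
M, N = 12, 24
A = rng.standard_normal((M, N))
x0 = np.zeros(N)
x0[[2, 9, 20]] = [1.5, -1.0, 0.7]   # 3-sparse ground truth
y = A @ x0

# min ||x||_1  s.t. Ax = y, with x = u - v and u, v >= 0:
#   min 1^T u + 1^T v   s.t.  [A  -A][u; v] = y,  u, v >= 0
c = np.ones(2 * N)
A_eq = np.hstack([A, -A])
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None))
x_l1 = res.x[:N] - res.x[N:]
```

The recovered `x_l1` is feasible and its L1 norm can be no larger than that of the true sparse `x0`, since `x0` is itself a feasible point of the LP.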
Why Is L1 Better Than L2? The constraint line y = Φx intersects the L2 ball (a circle) at a non-sparse point: a bad point. The same line intersects the L1 ball (a diamond) at a sparse point on a coordinate axis: a unique and exact solution.
CS Reconstruction: Matching Pursuit. Problem: recover x from y = Ax. Besides Basis Pursuit there are greedy pursuits, i.e., iterative algorithms: at each iteration, try to identify the columns of A (atoms) that are associated with the non-zero entries of x, the significant atoms.
Matching Pursuit. MP: at each iteration, MP attempts to identify the most significant atom; after K iterations, MP will hopefully identify the signal! 1. t = 1; set the residual vector r_0 = y and the selected index set Λ_0 = ∅. 2. Find the index λ_t yielding the maximal correlation with the residue: λ_t = argmax_j |<r_{t-1}, a_j>|. 3. Augment the selected index set: Λ_t = Λ_{t-1} ∪ {λ_t}. 4. Update the residue: r_t = r_{t-1} - <r_{t-1}, a_{λ_t}> a_{λ_t}. 5. t = t + 1, and stop when t = K.
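The numbered steps above translate almost line-for-line into NumPy. A minimal sketch assuming unit-norm atoms; the 8x8 orthonormal dictionary (built via QR purely for the demo) guarantees that K iterations recover the signal exactly.

```python
import numpy as np

def matching_pursuit(A, y, K):
    """Greedy MP: pick the atom most correlated with the residue each round."""
    x = np.zeros(A.shape[1])
    r = y.copy()
    for _ in range(K):
        c = A.T @ r                    # correlations with every atom
        j = int(np.argmax(np.abs(c)))  # most significant atom
        x[j] += c[j]                   # atoms assumed unit-norm
        r -= c[j] * A[:, j]            # update the residue
    return x

rng = np.random.default_rng(3)
# Orthonormal dictionary => MP recovers a K-sparse signal in exactly K steps
A, _ = np.linalg.qr(rng.standard_normal((8, 8)))
x0 = np.zeros(8)
x0[[1, 5]] = [2.0, -1.0]
y = A @ x0
x_hat = matching_pursuit(A, y, K=2)
```

With a general (non-orthogonal) dictionary the same atom can be revisited and the residue only shrinks gradually, which is what motivates the orthogonalized variant on the next slide.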
Orthogonal Matching Pursuit. OMP guarantees that the residue is orthogonal to all previously chosen atoms, so no atom is selected twice! 1. t = 1; set the residual vector r_0 = y and the index set Λ_0 = ∅. 2. Find the index λ_t that yields the maximal correlation with the residue. 3. Augment Λ_t = Λ_{t-1} ∪ {λ_t}. 4. Find the new signal estimate by solving the least-squares problem over the selected atoms. 5. Set the new residual: r_t = y - A_{Λ_t} x_t. 6. t = t + 1, and stop when t = K.
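The least-squares re-fit in Step 4 is what distinguishes OMP from MP; a minimal sketch (again using an orthonormal demo dictionary so that exact recovery is guaranteed):

```python
import numpy as np

def omp(A, y, K):
    """OMP: re-fit ALL chosen atoms by least squares at every iteration."""
    idx = []
    r = y.copy()
    for _ in range(K):
        j = int(np.argmax(np.abs(A.T @ r)))   # atom most correlated with residue
        idx.append(j)
        coef, *_ = np.linalg.lstsq(A[:, idx], y, rcond=None)
        r = y - A[:, idx] @ coef              # residue orthogonal to chosen atoms
    x = np.zeros(A.shape[1])
    x[idx] = coef
    return x

rng = np.random.default_rng(4)
A, _ = np.linalg.qr(rng.standard_normal((8, 8)))  # orthonormal demo dictionary
x0 = np.zeros(8)
x0[[0, 6]] = [1.0, 3.0]
y = A @ x0
x_hat = omp(A, y, K=2)
```

Because the residue is orthogonal to every selected column, the correlations of already-chosen atoms are exactly zero at the next iteration, so no atom is ever selected twice.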
Subspace Pursuit. SP pursues the entire subspace that the signal lives in at every iteration and adds a backtracking mechanism! 1. Initialization: select the K atoms most correlated with y. 2. Selected set: merge the current set with the K atoms most correlated with the residue, then backtrack to the best K. 3. Signal estimate: least-squares fit on the selected set. 4. Residue: subtract the estimate from y. 5. Go to Step 2; stop when the residue energy no longer decreases.
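The merge-then-backtrack loop above can be sketched compactly; this is a minimal sketch (iteration cap and tolerances are arbitrary choices, and the orthonormal demo dictionary makes the first estimate already exact):

```python
import numpy as np

def subspace_pursuit(A, y, K, max_iter=10):
    """SP: keep a K-atom candidate set, refine it with backtracking."""
    # Initialization: the K atoms most correlated with y
    T = np.argsort(np.abs(A.T @ y))[-K:]
    x_T, *_ = np.linalg.lstsq(A[:, T], y, rcond=None)
    r = y - A[:, T] @ x_T
    for _ in range(max_iter):
        if np.linalg.norm(r) < 1e-10:
            break
        # Merge with K new candidates, then backtrack to the best K
        cand = np.union1d(T, np.argsort(np.abs(A.T @ r))[-K:])
        b, *_ = np.linalg.lstsq(A[:, cand], y, rcond=None)
        T = cand[np.argsort(np.abs(b))[-K:]]
        x_T, *_ = np.linalg.lstsq(A[:, T], y, rcond=None)
        r_new = y - A[:, T] @ x_T
        if np.linalg.norm(r_new) >= np.linalg.norm(r):
            break                      # residue energy stopped decreasing
        r = r_new
    x = np.zeros(A.shape[1])
    x[T] = x_T
    return x

rng = np.random.default_rng(3)
A, _ = np.linalg.qr(rng.standard_normal((8, 8)))
x0 = np.zeros(8)
x0[[1, 5]] = [2.0, -1.0]
y = A @ x0
x_sp = subspace_pursuit(A, y, K=2)
```

Unlike MP/OMP, a wrongly chosen atom can be evicted in a later iteration, which is the backtracking advantage the slide refers to.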
How Many Measurements Are Enough? Theorem (Candès, Romberg, and Tao): suppose x has support T, and the M rows of the matrix F are selected uniformly at random from the N rows of the N x N DFT matrix. Then if M obeys M >= C · |T| · log N, minimizing the L1 norm recovers x exactly with extremely high probability. In practice, M on the order of a few times K is enough to perfectly recover x.
CS in Multimedia Coding/Processing. Practical compressed sensing and sparse signal processing: one-pixel camera; 2D separable measurement ensemble for image/video; face recognition; speech recognition; distributed compressed video sensing; layered compressed-sensing robust video transmission; video denoising; video super-resolution; multiple description coding; MRI applications.
One-Pixel Compressed Sensing Camera Courtesy of Richard Baraniuk & Kevin Kelly @ Rice
CS Analog-to-Digital Converter
Modulated Wideband Converter
Common Sensing Limitations. Treating every source signal as a 1D signal (performing the sensing operation on a vectorized signal) significantly increases complexity at both the encoder and the decoder, and it is inappropriate for some compressive imaging applications such as compressive image sensors: the physical structure of image sensor arrays is 2D; dense sensing matrices are costly to implement due to wide dynamic-range issues; and block-diagonal sensing matrices result in low performance due to incoherence degradation.
2D Separable Measurement Ensembles. Y = D1 F1 P1 S1 X S2 P2 F2 D2, where S1, S2 randomly flip the signs of rows and columns (entries in {-1, 0, 1}), P1, P2 randomly permute rows and columns, F1, F2 are block-diagonal fast transforms (WHT), and D1, D2 randomly subsample rows and columns.
2D Separable Measurement Ensembles. In the algorithm: randomly flip the signs of rows and columns of the source image; randomly permute rows and columns of the sign-flipped image; transform the randomized image by a Walsh-Hadamard block-diagonal matrix; randomly subsample rows and columns of the transformed image. All operations are performed on the rows and columns of the source image separately.
Underlying Principles. Preprocess the source image before subsampling its rows and columns: the preprocessing spreads out the energy along rows and columns, which guarantees (with high probability) energy preservation for a subset of measurements (a submatrix) when coupled with a suitable scale factor.
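The separable pipeline can be sketched in NumPy. This is a minimal sketch under simplifying assumptions: a single full-frame orthonormal WHT stands in for the block-diagonal transform, and the frame size and subsampling rate are arbitrary. Note that all operators act on rows and columns separately, never on the vectorized image.

```python
import numpy as np

def hadamard(n):
    # Sylvester construction of the Hadamard matrix, n a power of two
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

rng = np.random.default_rng(5)
n = 32
X = rng.standard_normal((n, n))     # stand-in for an image block

H = hadamard(n) / np.sqrt(n)        # orthonormal WHT

# Preprocessing: sign-flip rows/columns, permute rows/columns, transform
d1 = rng.choice([-1.0, 1.0], n)
d2 = rng.choice([-1.0, 1.0], n)
p1, p2 = rng.permutation(n), rng.permutation(n)
Z = H @ ((d1[:, None] * X * d2[None, :])[p1][:, p2]) @ H.T

# Measurements: random subsampling of rows and columns
m = n // 2
rows = rng.choice(n, m, replace=False)
cols = rng.choice(n, m, replace=False)
Y = Z[np.ix_(rows, cols)]
```

Because sign flips, permutations, and the orthonormal WHT are all energy-preserving, the transformed frame `Z` has exactly the energy of `X`, spread evenly, which is why a random submatrix of it captures a proportional share of the energy.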
Performance Comparison. 512x512 Lena; 1D non-separable ME (SRM [Do]) versus 2D separable ME using block size 32x32. Reconstruction from 25% measurements with the GPSR algorithm [Figueiredo]: (a) SRM, PSNR = 29.4 dB; (b) 2D-SME, PSNR = 28 dB.
Performance Comparison. 1024x1024 Man; 1D non-separable ME (SRM [Do]) versus 2D separable ME using block size 32x32. Reconstruction from 35% measurements with the GPSR algorithm [Figueiredo]: (a) SRM, PSNR = 29.3 dB; (b) 2D-SME, PSNR = 28 dB.
Application in Face Recognition Face-subspace model: faces under varying lighting and expression lie on a subspace A new test sample y of object i approximately lies in the linear span of the training samples associated with i The test sample y is then a sparse linear combination of all training samples Sparse representation encodes the membership i of y John Wright, Allen Y. Yang, Arvind Ganesh, S. Shankar Sastry, and Yi Ma, “Robust Face Recognition via Sparse Representation”, IEEE Trans. PAMI, Feb. 2009
Sparse Classification. The classification problem can be solved by min ||x||_1 subject to y = Ax, provided the solution is sparse enough. The test data y is then classified based on the residual r_i(y) = ||y - A δ_i(x)||_2, where δ_i(x) keeps only the entries of x that are associated with class i (all other entries set to zero).
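The residual rule above is easy to demonstrate in isolation. A minimal sketch with toy data (class counts, dictionary sizes, and the known sparse coefficient vector are all illustrative assumptions; a real system would obtain `x` from an L1 solver): the class whose retained coefficients best explain y wins.

```python
import numpy as np

rng = np.random.default_rng(6)
M, per_class, C = 20, 5, 3
# Dictionary columns grouped by class: A = [A_0 | A_1 | A_2]
A = rng.standard_normal((M, per_class * C))

# Test sample built from class 1, with a known sparse coefficient vector
x = np.zeros(per_class * C)
x[per_class + 1] = 1.0
x[per_class + 3] = -0.5
y = A @ x

def classify(A, x, y, per_class, C):
    # delta_i(x): keep only class i's coefficients, then measure the residual
    residuals = []
    for i in range(C):
        d = np.zeros_like(x)
        sl = slice(i * per_class, (i + 1) * per_class)
        d[sl] = x[sl]
        residuals.append(np.linalg.norm(y - A @ d))
    return int(np.argmin(residuals)), residuals

label, res = classify(A, x, y, per_class, C)
```

Since y was synthesized entirely from class-1 atoms, the class-1 residual is zero while the others equal the full energy of y, so the argmin picks class 1.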
Example I. (Figure: recovered sparse coefficients and per-class residual energy.)
Robustness to Corruption. The test image y can be corrupted or occluded. If the occlusion e covers less than ~50% of the image, the sparsest solution to y = B w is the true generator w_0. If w_0 is sparse enough, it can be found by L1 minimization, and y is classified based on the residual.
Example II Set-Up Top left: face image is occluded by a disguise Bottom left: 50% of random pixels replaced by random values Right: training images
Example II Result Test images: sum of a sparse linear combination of the training images and a sparse error due to occlusion or corruption
Robust Local Face Recognition. In the previous scheme, faces must be aligned; it is not robust to registration errors. Local sparsity model: a block y_b in the test data y of object i approximately lies in the linear span of the neighboring blocks in the training samples associated with i, so y_b is a sparse linear combination of neighboring blocks in all training samples. (Illustration: a block in the test data and its neighboring blocks in the training data; only one training sample is shown.)
Illustration of Our Approach. Select a block from the misaligned test data; its neighboring blocks in each training sample generate the atoms of the dictionary.
Our Classification. Solve the sparse recovery problem to find the sparse coefficient, then compute the residuals and determine the class.
Example I: Translation Aligned data 32x28 Shifted by 5 pixels in each direction Class 38Class 24Class 27 Randomly choose 16x16 blocks four times Final classification result: Class 27 (correct)
Example II: Rotation Aligned data 32x28 Class 19Class 27 Class 29 Rotated by 10 degrees Randomly choose 16x16 blocks four times Final classification result: Class 27 (correct)
Example III: Random Corruption Original data 32x28 Class 27Class 25Class 27 Final classification result: Class 27 (correct) 40% of the pixels are randomly corrupted Randomly choose 16x16 blocks four times
Inter-frame Video Coding. Examples: the video compression standards MPEG/H.26x. High complexity at the encoder and low complexity at the decoder: an inter-frame encoder sends X to an inter-frame decoder, which uses side information to output X'.
Distributed Video Coding (DVC). Examples: PRISM at Berkeley, turbo coding with a feedback channel at Stanford, etc.; all based on Wyner-Ziv coding techniques. Low complexity at the encoder and high complexity at the decoder: an intra-frame encoder sends X to an inter-frame decoder, which uses side information to output X'.
Low-Complexity Video Coding and Decoding. Conventional MPEG/H.26x pairs inter-frame coding with MPEG/H.26x decoding; distributed video coding (DVC) pairs intra-frame coding with inter-frame decoding.
Distributed Compressed Video Sensing: The Encoder. Intra-code key frames periodically using conventional video compression standards (MPEG/H.26x). Acquire local block-based and global frame-based measurements of the CS frames. The input video is split into key frames (MPEG/H.26x intra-coding) and CS frames (block-based and frame-based measurement acquisition), and everything is transmitted to the decoder.
Distributed Compressed Video Sensing: The Decoder. Decode the key frames using conventional image/video compression standards. Perform sparsity-constrained block prediction for motion estimation and compensation. Perform sparse recovery with decoder side information for prediction-error reconstruction. Add the reconstructed prediction error to the block-based prediction frame for the final frame reconstruction. (Pipeline: decoded key frames feed optimal block-based prediction using the inter-frame sparsity model; the side-information measurements are generated and subtracted from the received local block and global frame measurements; sparse recovery then reconstructs the prediction error.)
Distributed Compressed Video Sensing: end-to-end diagram combining the encoder (key-frame MPEG/H.26x intra-coding plus block-based and frame-based measurement acquisition in the analog domain) and the decoder (block prediction using the inter-frame sparsity model, measurement generation and subtraction, and sparse recovery of the prediction error with side information).
Inter-frame Sparsity Model. A block x_B in a CS frame can be sparsely represented as a linear combination of a few temporally neighboring macro-blocks in nearby I-frames: x_B = D_B α_B, where the dictionary D_B collects the neighboring macro-blocks and α_B has few non-zero entries. This is a generalized model of block motion.
Inter-frame Sparsity Model: half-pixel motion compensation. A half-pixel-interpolated block is itself a linear combination of full-pixel blocks (e.g., of the neighbors b1, b2, b3, b4), so it is covered by the same model x_B = D_B α_B.
Sparsity-Constrained Block Prediction. Find the block that has the sparsest representation in a dictionary of temporally neighboring blocks: α*_B = argmin ||α_B||_1 subject to y_B = Φ_B D_B α_B, where y_B are the received local block measurements, Φ_B is the block sensing matrix, and D_B is the dictionary of temporal neighboring blocks. The block prediction is x*_B = D_B α*_B. This is a generalized prediction algorithm covering both full-pixel and sub-pixel best-matching-block search.
Sparse Recovery with Decoder SI. The prediction error is often very sparse, so it can be recovered with higher accuracy: generate measurements from the frame prediction (the side information), subtract them from the received measurements, and run sparse recovery on the difference to reconstruct the prediction error.
Simulation Results. Performance comparison between DISCOS and CS-based intra-coding and intra-decoding (baseline). Reconstruction of frame 41 from 25% measurements: baseline 27.9 dB, DISCOS 38.7 dB.
Simulation Results. Performance comparison between DISCOS and the baseline. Reconstruction of frame 21 from 25% measurements: baseline 24.3 dB, DISCOS 32.9 dB.
Error-Resilient Data Transmission. An enhancement-layer encoder/decoder pair protects data sent over a packet-loss channel. Wyner-Ziv-based approaches: forward error correction, systematic lossy error protection (Stanford), layered Wyner-Ziv video coding (Texas A&M), and PRISM (Berkeley); versus the compressive sensing approach.
Previous Approaches. FEC employs well-known channel codes (Reed-Solomon, turbo codes, LDPC, etc.); the decoded video quality degrades significantly when the packet-loss rate exceeds the error-correction capacity of the channel code (the cliff effect); all are based on coding techniques over a finite field. Recent approaches: Wyner-Ziv-based techniques such as SLEP (Stanford) and layered Wyner-Ziv video coding (Texas A&M), and distributed video coding to mitigate the error propagation of predictive video coding, such as PRISM (Berkeley).
Compressive Sensing Approach. Borrow principles from compressive sensing to effectively mitigate the cliff effect, thanks to the soft-decoding feature of the sparse recovery algorithm. This eliminates the all-or-nothing behavior of coding techniques over a finite field: it is a new channel-coding technique over the REAL field.
Principle of Block Motion Estimation. Partition the current video frame into small non-overlapping blocks called macro-blocks (MBs). For each block, find the motion vector (displacement) within a search window of the reference frame that minimizes a pre-defined mismatch error. For each block, the motion vector and the prediction error (residue) are encoded.
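Full-search block matching as described above fits in a short function. A minimal sketch using the sum of absolute differences (SAD) as the mismatch error, with frame size, block size, and search range chosen arbitrarily for the demo:

```python
import numpy as np

def full_search_bme(ref, cur, top, left, bsize, search):
    """Exhaustive block matching: minimize SAD over a +/- search window."""
    block = cur[top:top + bsize, left:left + bsize]
    best, best_mv = np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            r, c = top + dy, left + dx
            if r < 0 or c < 0 or r + bsize > ref.shape[0] or c + bsize > ref.shape[1]:
                continue                      # candidate falls outside the frame
            sad = np.abs(ref[r:r + bsize, c:c + bsize] - block).sum()
            if sad < best:
                best, best_mv = sad, (dy, dx)
    return best_mv, best

rng = np.random.default_rng(7)
ref = rng.standard_normal((32, 32))
# Current frame: reference shifted down by 2 pixels and left by 1 pixel
cur = np.roll(np.roll(ref, 2, axis=0), -1, axis=1)
mv, sad = full_search_bme(ref, cur, top=8, left=8, bsize=8, search=4)
```

For the interior block tested, the search finds the displacement (-2, +1) back into the reference frame with zero mismatch, i.e., the exact inverse of the applied shift.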
BME/BMC: Example I. (Figure: previous frame, current frame, motion vector field, frame difference, and motion-compensated difference.)
BME/BMC: Example II. (Figure: previous frame, current frame, motion vector field, frame difference, and motion-compensated difference.)
Compressive Sensing Overview. Measurement vector y, sensing matrix Φ, sparsifying matrix Ψ, input signal x, sparse transform coefficients α. Sparse recovery: α* = argmin ||α||_1 subject to y = ΦΨα, then x* = Ψα*.
Layered Compressive Sensing Video Codec. At the encoder, MPEG/H.26x encoding produces the quantized transformed prediction error plus motion vectors and mode decisions, which are entropy-coded (E) and sent over the packet-loss channel; in parallel, measurement acquisition on the prediction error, rounding (R), and entropy coding produce the enhancement layer. At the receiver, entropy decoding (E^-1), measurement generation from the side information, sparse recovery with decoder side information, and MPEG/H.26x decoding reconstruct the video.
Base Layer Coding. Conventionally encoded by the video compression standards MPEG/H.26x. Slices of a prediction-error frame are entropy-coded and packetized before being transmitted over error-prone channels without any error-correcting code.
Enhancement Layer Coding. Measurements are acquired across the slices of a prediction-error frame, rounded to integers, entropy-coded, and sent to the decoder (along with the motion vectors and mode decisions).
LACOS Decoder. Entropy-decode the corrupted base layer (regarded as side information). Feed the SI and the cross-slice measurements received from the enhancement layer into sparse recovery with decoder SI to recover the lost slices/packets. Add the recovered slices/packets back to the corrupted base layer for the final reconstruction of the prediction-error frames. Feed the reconstructed prediction-error frames into a regular MPEG/H.26x decoder for the final video frame reconstruction.
Example of Coding & Decoding. Encoder: y = Φx. Decoder side information: y_SI = Φx_SI. Observed difference: u = y - y_SI = Φ(x - x_SI) = Φv, where v = x - x_SI is sparse. Recover v* = argmin ||v||_1 subject to u = Φv, then reconstruct x = x_SI + v*.
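The decoding step above hinges entirely on the linearity of the measurements; a minimal NumPy sketch (vector lengths and the corrupted indices are arbitrary demo choices) verifies that subtracting the side-information measurements yields exactly the measurements of the sparse difference:

```python
import numpy as np

rng = np.random.default_rng(8)
M, N = 16, 64
Phi = rng.standard_normal((M, N))

x = rng.standard_normal(N)                 # true prediction-error frame (vectorized)
x_si = x.copy()
x_si[[5, 40]] += rng.standard_normal(2)    # side info differs in a few entries

y = Phi @ x                                # measurements received from the encoder
y_si = Phi @ x_si                          # measurements generated from the SI

u = y - y_si                               # = Phi @ (x - x_si) by linearity
v = x - x_si                               # sparse difference to be recovered
```

Because `v` has only a handful of nonzero entries, `u = Phi @ v` is a standard compressed-sensing problem, and any sparse recovery routine (e.g., the pursuit algorithms earlier in the deck) can reconstruct `v` and hence `x = x_si + v`.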
Sparse Recovery Algorithm for LACOS: the Sparsity Adaptive Matching Pursuit algorithm (SAMP) (D. & Tran). It follows the divide-and-conquer principle, estimating the sparsity K and the true support set stage by stage. At each stage, a fixed-size finalist of significant non-zero coefficients is iteratively refined via the Final Test; when the energy of the current residue exceeds that of the previous iteration's residue, the algorithm shifts to a new stage and expands the finalist by a step size s. This gives an optimal performance guarantee without prior knowledge of the sparsity K. (Pipeline: the Preliminary Test forms a candidate set C_k from the previous finalist F_{k-1} and residue r_{k-1}; the Final Test refines the finalist F_k; the residue r_k is updated; both |C_k| and |F_k| are adaptive.)
Simulation Results. Performance comparison of LACOS, FEC, and error concealment on the Football sequence; the base layer is encoded at 2.97 Mbps. Reconstruction of frame 27 with 13.3% packet loss: FEC 29 dB, LACOS 30.7 dB.
Simulation Results. Performance comparison of LACOS, FEC, and error concealment on the CIF sequence Mobile; the base layer is encoded at 3.79 Mbps. Reconstruction of frame 34 with 13.9% packet loss: FEC 31.3 dB, LACOS 33 dB.
Some Remarks. WZ-based approaches (e.g., FEC) work perfectly when the packet-loss rate is lower than the error-correction capacity of the channel code, but fall back on error concealment when the channel decoder fails, which results in low performance (the cliff effect). LACOS retains the soft-decoding feature of the sparse recovery algorithm, so it mitigates the cliff effect effectively: the decoded video quality degrades gradually as the amount of packet loss increases. Its efficient sensing and fast recovery enable it to work well in real-time scenarios.
Video De-noising. A noisy block x_B in a non-key frame is modeled from a clean key frame as x_B = D_B α_B + e_B, where D_B is a dictionary of neighboring blocks from the clean key frame and e_B is noise.
Video Denoising: Sparse Model. x_B = D_B α_B + e_B = [D_B | I] [α_B; e_B] = W_B β_B, where I is the identity matrix and x_B the noisy block. Sparsity-constrained block prediction: β*_B = argmin ||β_B||_1 subject to x_B = W_B β_B, and the denoised block is x*_B = D_B α*_B.
Video Super-Resolution. Given high-resolution key frames and noisy low-resolution non-key frames, reconstruct the unobservable high-resolution non-key frame. Typical relationship between LR and HR patches: x_B = S_B H_B y_B + e_B, where y_B is the HR patch, H_B is a blurring operator, S_B is a subsampling operator, and e_B is noise.
Video Super-Resolution: Sparse Model. x_B = S_B H_B y_B + e_B = S_B H_B D_B α_B + e_B = [S_B H_B D_B | I] [α_B; e_B] = W_B β_B, where D_B is a dictionary of neighboring blocks. Sparsity-constrained block prediction: β*_B = argmin ||β_B||_1 subject to x_B = W_B β_B, and the high-resolution block approximation is y*_B = D_B α*_B.
Blocking-Effect Elimination. Average multiple approximations obtained from shifted, overlapping grids.
Compressed Sensing in Medical Imaging. Goal so far: achieve faster MR imaging while maintaining reconstruction quality. Methods: undersample the discrete Fourier space using pseudo-random patterns, then reconstruct using L1 minimization (Lustig) or homotopic L0 minimization (Trzasko).
Sparsity of MR Images. Brain MR images and dynamic cardiac MR images are sparse in the wavelet and DCT domains, not in the spatial or finite-difference domains, and can be reconstructed with good quality from 5-10% of the coefficients. Angiogram images are sparse in both the finite-difference domain and the spatial domain (the edges of blood vessels occupy only about 5% of the space), which allows very good compressed sensing performance.
Sampling Methods. Use smooth k-space trajectories (Cartesian, radial, or spiral scanning) with full sampling in each read-out. Undersample by: Cartesian grid, undersampling in the phase-encoding direction (uniformly or non-uniformly); radial, angular undersampling (using fewer angles); spiral, using fewer spirals and randomly perturbing the spiral trajectories.
Sampling Patterns: Spiral & Radial. Spiral scanning with uniform density, varying density, and perturbed spiral trajectories. A new algorithm (FOCUSS) allows reconstruction without angular aliasing artifacts.
Reconstruction Methods. Lustig's: L1 minimization with the non-linear conjugate gradient method. Trzasko's: homotopic L0 minimization.
Reconstruction Results (2DFT). Multi-slice 2DFT fast spin echo, CS at 2.4x acceleration.
Results (3DFT). Contrast-enhanced 3D angiography reconstruction results as a function of acceleration. Left column: acceleration by LR; note the diffused boundaries with acceleration. Middle column: ZF-w/dc reconstruction; note the increase of apparent noise with acceleration. Right column: CS reconstruction with a TV penalty from randomly undersampled k-space.
Results: Radial Scan, FOCUSS Reconstruction. Reconstruction results from a full scan with uniform angular sampling between 0° and 360°. Row 1: reference reconstruction from 190 views. Row 2: reconstruction from 51 views using LINFBP. Row 3: reconstruction from 51 views using CG-ALONE. Row 4: reconstruction from 51 views using PR-FOCUSS.
Results: Spiral. (a) Sagittal T2-weighted image of the spine; (b) simulated k-space trajectory (multishot Cartesian spiral, 83% undersampling); (c) minimum-energy solution via zero-filling; (d) reconstruction by L1 minimization; (e) reconstruction by homotopic L0 minimization using ρ(|∇u|, ε) = |∇u| / (|∇u| + ε); (f) line profile across C6; (g-j) enlargements of (a, c-e), respectively.
Conclusion. Compressed sensing: a different paradigm for data acquisition; sample less and compute more; simple encoding, with most of the computation at the decoder; exploits a priori signal sparsity; universality and robustness. Compressed sensing applications for multimedia: 2D separable measurement ensemble for image/video; face/speech recognition; distributed compressed video sensing; layered robust video transmission; image/video de-noising and concealment; MRI applications.
References. http://www.dsp.ece.rice.edu/cs/ and http://nuit-blanche.blogspot.com/search/label/compressed%20sensing