On Missing Data Prediction using Sparse Signal Models: A Comparison of Atomic Decompositions with Iterated Denoising. Onur G. Guleryuz, DoCoMo USA Labs.

On Missing Data Prediction using Sparse Signal Models: A Comparison of Atomic Decompositions with Iterated Denoising. Onur G. Guleryuz, DoCoMo USA Labs, San Jose, CA (google: onur guleryuz). (Please view in full screen mode. The presentation tries to squeeze in too much; please feel free to e-mail me any questions you may have.)

Overview: Problem statement: prediction of missing data. Formulation as a sparse linear expansion over an overcomplete basis. AD (l1 regularized) and ID formulations. Short simulation results (l1 regularized). Why ID is better than AD. Adaptive predictors on general data: all methods are mathematically the same; the key issues are basis selection and utilizing what you have effectively.
Mini FAQ:
1. Is ID the same as l1-regularized AD? No.
2. Is ID the same as l1, except implemented iteratively? No.
3. Are predictors that yield the sparsest set of expansion coefficients the best? No, predictors that yield the smallest mse are the best.
4. On images, look for performance over large missing chunks (with edges).
Some results are from: Ivan W. Selesnick, Richard Van Slyke, and Onur G. Guleryuz, “Pixel Recovery via l1 Minimization in the Wavelet Domain,” Proc. IEEE Int'l Conf. on Image Proc. (ICIP 2004), Singapore, Oct. 2004. (Some software available at my webpage.) Pretty ID pictures: Onur G. Guleryuz, “Nonlinear Approximation Based Image Recovery Using Adaptive Sparse Reconstructions and Iterated Denoising: Part II – Adaptive Algorithms,” IEEE Tr. on IP, to appear.

Problem Statement. Original image x: available pixels plus lost region pixels (assume zero mean). P: the available data projection (“mask”), which keeps the available pixels and zeroes the lost region. Goal: derive the prediction y.
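To make the masking setup concrete, here is a minimal numpy sketch (the sizes, the lost-region location, and all variable names are illustrative assumptions, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

N = 64                        # pixels in the vectorized image x
x = rng.standard_normal(N)
x -= x.mean()                 # assume zero mean, as in the slides

mask = np.ones(N)
mask[24:40] = 0.0             # a lost region of 16 pixels
P = np.diag(mask)             # P: available data projection ("mask")

x_available = P @ x           # all the predictor gets to see
# Goal: derive a prediction y with P @ y = P @ x on the available pixels.
```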

[Figure: signal space picture. The prediction can be viewed as the original signal plus noise, i.e., a noisy signal where the noise is correlated with the data, attacked via “type 1” iterations.]

Recipe for y:
1. Take an NxM matrix H whose columns form an overcomplete basis (M > N).
2. Write y in terms of the basis: y = Hc.
3. Find “sparse” expansion coefficients c (AD vs. ID).
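A sketch of the recipe with a concrete two-times overcomplete dictionary (M = 2N), built here from identity and DCT atoms; the particular H is an illustrative assumption:

```python
import numpy as np
from scipy.fft import idct

N = 64
# 1. N x M matrix of overcomplete basis vectors, M = 2N.
dct_atoms = idct(np.eye(N), norm="ortho", axis=0)   # columns = DCT basis
H = np.hstack([np.eye(N), dct_atoms])               # N x 2N

# 2. Write y in terms of the basis: y = H c (c is non-unique since M > N).
c = np.zeros(2 * N)
c[[3, N + 7]] = [1.0, -0.5]     # a "sparse" coefficient vector
y = H @ c

# 3. AD and ID differ in how they pick sparse c consistent with the data.
```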

Any y has to be sparse: H has a null space of dimension M − N, so any y admits a sparse set of expansion coefficients. Onur’s trivial sparsity theorem: for every estimation algorithm there is an equivalent basis in which its estimates are sparse.
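The theorem is trivial in the following sense: any estimate whatsoever is 1-sparse in a basis that contains it as an atom. A toy numeric illustration (the construction is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
y = rng.standard_normal(8)        # any estimate, from any algorithm

# Build an orthonormal basis whose first atom is y itself.
first = y / np.linalg.norm(y)
Q, _ = np.linalg.qr(np.column_stack([first, rng.standard_normal((8, 7))]))

c = Q.T @ y                       # expansion coefficients of y in this basis
print(np.round(c, 10))            # only c[0] is nonzero: y is 1-sparse here
```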

Who cares about y, what about the original x? If successful prediction is possible, i.e., if ||y − x|| is small, then x also has to be ~sparse. 1. Predictable ⇒ sparse. 2. Sparsity of x is a necessary leap of faith to make in estimation. Caveat: Any estimator is putting up a sparse y. Assuming x is sparse, the estimator that wins is the one that matches the sparsity “correctly”! Putting up sparse estimates is not the issue; putting up estimates that minimize mse is. Can we be proud of the formulation? Not really. It is honest, but ambitious.

Getting to the heart of the matter. AD: find the expansion coefficients that minimize the l1 norm of the expansion coefficients (regularization), subject to the available data constraint: min ||c||_1 subject to P(Hc) = Px. A sketch follows.
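A minimal sketch of this AD problem as a linear program (the standard split c = u - v with u, v >= 0 turns the l1 objective into a linear one; the sizes, the mask, and the use of scipy's linprog are illustrative assumptions):

```python
import numpy as np
from scipy.fft import idct
from scipy.optimize import linprog

N = 16
H = np.hstack([np.eye(N), idct(np.eye(N), norm="ortho", axis=0)])  # N x 2N
M = H.shape[1]

rng = np.random.default_rng(1)
x = rng.standard_normal(N)
keep = np.ones(N, dtype=bool)
keep[5:9] = False                  # lost pixels

# min ||c||_1  subject to  (H c) = x on the available pixels.
# LP form: c = u - v with u, v >= 0; minimize 1^T (u + v).
A_eq = np.hstack([H[keep], -H[keep]])
res = linprog(np.ones(2 * M), A_eq=A_eq, b_eq=x[keep],
              bounds=(0, None), method="highs")
c_hat = res.x[:M] - res.x[M:]
y = H @ c_hat                      # AD prediction; matches x on available data
```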

AD with Significance Sets: minimize the size of the significance set S subject to y = Hc with c_i = 0 for i outside S, and Py = Px. Finds the sparsest (the most predictable) signal consistent with the available data.

Iterated Denoising with Insignificant Sets: force the coefficients in an insignificant set to zero, subject to the available data constraint. (Once the insignificant set is determined, ID uses well defined denoising operators to construct mathematically sound equations.) Progressions: pick a decaying threshold sequence T_1 >= T_2 >= ... Recipe for using your transform based image denoiser (to justify progressions, think decaying coefficients): 1. Denoise at the current threshold and re-impose the available data. 2. Reduce the threshold and repeat, as sketched below.
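A minimal 1-D sketch of this recipe with a DCT denoiser and a decaying threshold schedule (the signal, schedule, and iteration counts are illustrative assumptions; the slides' experiments use overcomplete 2-D DCTs and more careful denoisers):

```python
import numpy as np
from scipy.fft import dct, idct

def iterated_denoising(x, keep, thresholds, inner_iters=5):
    """Fill x[~keep] by hard-thresholding DCT coefficients with a decaying
    threshold, re-imposing the available data after every pass."""
    y = x.copy()
    y[~keep] = 0.0                        # initial estimate for the lost region
    for T in thresholds:                  # progressions: T1 >= T2 >= ...
        for _ in range(inner_iters):      # "type 1" iterations at fixed T
            c = dct(y, norm="ortho")
            c[np.abs(c) < T] = 0.0        # zero the insignificant set
            y = idct(c, norm="ortho")     # denoised estimate
            y[keep] = x[keep]             # enforce the available data
    return y

t = np.linspace(0.0, 1.0, 128)
x = np.cos(2 * np.pi * 3 * t) + 0.3 * np.cos(2 * np.pi * 11 * t)
keep = np.ones(128, dtype=bool)
keep[50:70] = False                       # a missing chunk
y = iterated_denoising(x, keep, thresholds=np.geomspace(1.0, 0.01, 20))
```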

Mini Formulation Comparison. AD: min ||c||_1 subject to the available data constraint. ID (no progression): zero an insignificant coefficient set, subject to the same constraint. If H is orthonormal the two formulations come close. The important thing is how you determine the sets/sparsity (ID: robust DSP; AD: sparsest). ID uses progressions, and progressions change everything!

Simulation Comparison: l1 (AD) vs. ID. H: two times expansive (M = 2N), real, isotropic, dual-tree DWT. Real part of: N. G. Kingsbury, “Complex wavelets for shift invariant analysis and filtering of signals,” Appl. Comput. Harmon. Anal., 10(3):234–253, May 2001. ID (no layering and no selective thresholding) vs. AD: min ||c||_1 subject to the available data constraint, as in [1]. 1. D. Donoho, M. Elad, and V. Temlyakov, “Stable Recovery of Sparse Overcomplete Representations in the Presence of Noise.”

Simulation Results (the l1 results are doctored!)

Problems in l1? Yes and no. What is wrong with AD? I will argue that even if we used an “l0 solver”, ID would in general prevail. Specific issues with l1. How to fix the problems with l1 based AD. How to do better. So let’s assume we can solve the l0 problem...

Bottom Up (AD) vs. Top Down (ID). Prediction as signal construction: AD is a builder that tries to accomplish constructions using as few bricks as possible; it requires a very good basis. ID is a sculptor that removes portions that do not belong in the final construction, using as many carving steps as needed; it requires good denoising. The application is not compression! (“Where will the probe hit the meteor?”, “What is the value of S&P 500 tomorrow?”)

Significance vs. Insignificance: the Sherlock Holmes Principle. Both ID and AD do well with a very good basis, but ID can also use unintuitive bases for sophisticated results. E.g.: ID can use the “unsophisticated”, “singularity unfriendly” DCT basis to recover singularities. AD cannot! Secret: DCTs are not great on singularities, but they are very good on everything else, so they are very good at eliminating non-singularities. “How often have I said to you that when you have eliminated the impossible, whatever remains, however improbable, must be the truth?” (Sherlock Holmes, in “The Sign of the Four”.) ID is more robust to basis selection compared to AD (it can secretly violate coherency restrictions). You can add to the AD dictionary, but l1 solvers won’t be able to handle it.

The Sherlock Holmes Principle using overcomplete DCTs for elimination. Predicting missing edge pixels (basis: DCT 16x16). Predicting missing wavelet coefficients over edges (basis: DCT 8x8): Onur G. Guleryuz, “Predicting Wavelet Coefficients Over Edges Using Estimates Based on Nonlinear Approximants,” Proc. Data Compression Conference, IEEE DCC-04, April 2004. Do not abandon isotropic *lets; use a framework that can extract the most mileage from the chosen basis (“sparsest”).

Progressions: “annealing” progressions (think decaying coefficients), T_1 >= T_2 >= ... [Figure: recovery with the DCT 16x16 basis at the single best threshold vs. iterations of simple “type 1” denoising.] Progressions generate up to tens of dBs. If the data were very sparse with respect to H, and if we were solving a convex problem, why should progressions matter? Modeling assumptions…

Sparse Modeling Generates Non-Convex Problems. [Figure: a “two pixel” image shown in pixel coordinates and in transform coordinates; the available pixel constraint is a line, and it passes through several equally sparse solutions. A more skeptical picture.]
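The non-convexity is easy to exhibit numerically. In this toy “two pixel” example (the dictionary and values are illustrative assumptions), two consistent reconstructions are equally sparse while their average is less sparse, so the set of sparsest consistent solutions cannot be convex:

```python
import numpy as np

# Two-pixel image; pixel 0 is available (value 1), pixel 1 is missing.
H = np.array([[1.0, 0.0, 1.0 / np.sqrt(2.0)],
              [0.0, 1.0, 1.0 / np.sqrt(2.0)]])   # 2 x 3 overcomplete basis

c_a = np.array([1.0, 0.0, 0.0])             # reconstructs y = (1, 0)
c_b = np.array([0.0, 0.0, np.sqrt(2.0)])    # reconstructs y = (1, 1)

for c in (c_a, c_b, 0.5 * (c_a + c_b)):
    y = H @ c
    print(y, "constraint ok:", np.isclose(y[0], 1.0),
          "nonzeros:", np.count_nonzero(c))
# c_a and c_b are both 1-sparse and consistent; their average is 2-sparse.
```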

How does this affect some “AD solvers”, i.e., l1? [Figure: geometry of the l1 ball against the available pixel constraint; Case 1, Case 2, Case 3 (linear/quadratic program, …). In Case 3 the solution is not sparse!]

Case 3: the magic is gone… You now have to argue: “under an i.i.d. Laplacian model for the joint probability of the expansion coefficients, l1 norm minimization …”

Problems with the l1 norm I. What about all the optimality/sparsest results? Results such as D. Donoho et al., “Stable Recovery of Sparse Overcomplete Representations in the Presence of Noise” are very impressive, but they are closely tied to H providing the sparsest decomposition for x. Not every problem has this structure. There are worst-case noise robustness results, but here the noise is overwhelming: modeling error plus error due to missing data.

Problems with the l1 norm II: min ||c||_1 subject to P(Hc) = Px. H may be a “nice”, “decoherent” basis, but the effective basis PH (due to masking) is “not nice” and may become very “coherent” (the problem is due to P).

Example: H orthonormal, coherency = 0. After masking, the coherency blows up, with normalized coherency = 1 (the worst possible). The optimal solution sometimes tries to make the coefficients of scaling functions zero.
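A two-atom numeric check of this coherency blow-up (the 2x2 Haar-type basis and the mask are illustrative assumptions):

```python
import numpy as np

H = np.array([[1.0,  1.0],
              [1.0, -1.0]]) / np.sqrt(2.0)   # orthonormal: coherency = 0

def normalized_coherency(A):
    # Largest |inner product| between distinct, normalized columns.
    G = A / np.linalg.norm(A, axis=0, keepdims=True)
    M = np.abs(G.T @ G)
    np.fill_diagonal(M, 0.0)
    return M.max()

P = np.diag([1.0, 0.0])                      # mask out the second pixel
print(normalized_coherency(H))               # 0.0
print(normalized_coherency(P @ H))           # 1.0, the worst possible
```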

Possible fix using progressions: solve a sequence of l1 problems, min ||c||_1 subject to a data constraint relaxed by a threshold T, enforcing the available data and reducing T after each solve. If you pick a large T, maybe you can pretend the first one is a convex problem. But this is not an l1 problem! No single l1 solution will generate the final answer, and after the first few solutions you may start hitting the l1 issues above.

The fix is ID! Hard vs. soft thresholding: you can do soft thresholding, “block descent”, or Daubechies, Defrise, De Mol, “An iterative thresholding algorithm for linear inverse problems with a sparsity constraint”; Figueiredo and Nowak, “An EM Algorithm for Wavelet-Based Image Restoration”. There are many “denoising” techniques that discover the “true” sparsity. Pick the technique that is cross-correlation robust. Experience suggests: hard thresholding >> soft thresholding.
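For reference, a minimal sketch of the iterative soft-thresholding family cited above (Daubechies-Defrise-De Mol style), applied to the masked problem min ||P(Hc) - Px||^2 + lam*||c||_1; the step size, iteration count, and all names are illustrative assumptions:

```python
import numpy as np
from scipy.fft import idct

def ista_masked(x, keep, H, lam, n_iter=500):
    """Iterative soft thresholding for min ||P(Hc) - Px||^2 + lam ||c||_1."""
    A = H[keep]                               # effective (masked) dictionary PH
    L = np.linalg.norm(A, 2) ** 2             # Lipschitz constant of the grad
    c = np.zeros(H.shape[1])
    for _ in range(n_iter):
        z = c - A.T @ (A @ c - x[keep]) / L   # gradient step on the data term
        c = np.sign(z) * np.maximum(np.abs(z) - lam / (2 * L), 0.0)  # soft thr.
    return H @ c

N = 32
H = np.hstack([np.eye(N), idct(np.eye(N), norm="ortho", axis=0)])
rng = np.random.default_rng(3)
x = rng.standard_normal(N)
keep = np.ones(N, dtype=bool)
keep[10:16] = False
y = ista_masked(x, keep, H, lam=0.1)
```

Replacing the soft-threshold update with hard thresholding of z gives the variant that, per the conclusion slide, tends to be more cross-correlation robust.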

Conclusion. Smallest mse is not necessarily = sparsest: somebody putting up really bad estimates may be very sparse (sparser than us) with respect to some basis. Good denoisers should be cross-correlation robust (hard thresholding tends to beat soft). How many iterations you do within each l1_recons() or denoising_recons() is not very important. Progressions! Will l1 generate sparse results? In the sense of the trivial sparsity theorem, of course! (Sparsity may not be in terms of your intended basis :). Please check the assumptions for your problem! (To see its limitations, go ahead and solve the real l1 problem, with or without masking setups, you can even cheat on T, and compare to ID.) The trivial sparsity theorem is true. The prediction problem is all about the basis. ID simply allows the construction of a sophisticated, signal-adaptive basis by starting with a simple dictionary!