Learning Measurement Matrices for Redundant Dictionaries. Richard Baraniuk (Rice University), Chinmay Hegde (MIT), Aswin Sankaranarayanan (CMU)
Sparse Recovery. Sparsity rocks, etc. The previous talk focused mainly on signal inference (e.g., classification, nearest-neighbor search); this talk focuses on signal recovery.
Compressive Sensing. Sensing via randomized dimensionality reduction: acquire $y = \Phi x$, where $\Phi \in \mathbb{R}^{M \times N}$ ($M \ll N$) holds the random measurements and $x$ is a sparse signal with $K$ nonzero entries. Recovery: solve an ill-posed inverse problem by exploiting the geometric structure of sparse/compressible signals.
General Sparsifying Bases. Gaussian measurements are incoherent with any fixed orthonormal basis $\Psi$ (with high probability): write $x = \Psi\alpha$ with $\alpha$ sparse and sense $y = \Phi\Psi\alpha$. Ex: frequency domain.
Sparse Modeling: Approach 1. Step 1: Choose a signal model with structure –e.g., bandlimited, smooth with r vanishing moments, etc. Step 2: Analytically design a sparsifying basis/frame that exploits this structure –e.g., DCT, wavelets, Gabor, etc. [Figure: example DCT, wavelet, and Gabor atoms, and a "?"]
Sparse Modeling: Approach 2. Learn the sparsifying basis/frame from training data. Problem formulation: given a large number of training signals, design a dictionary D that simultaneously sparsifies the training data. This is called sparse coding / dictionary learning.
Dictionaries. Dictionary: an $N \times Q$ matrix $D$ whose columns are used as basis functions for the data. Convention: assume the columns are unit-norm. $Q > N$, i.e., more columns than rows, so the dictionary is redundant / overcomplete.
Dictionary Learning. Rich vein of theoretical and algorithmic work: Olshausen and Field ['97], Lewicki and Sejnowski ['00], Elad ['06], Sapiro ['08]. Typical formulation: given training data, solve jointly for the dictionary and the sparse codes (see the formulation below). Several efficient algorithms exist, e.g., K-SVD.
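In the notation standard to the K-SVD literature, the formulation is typically written as the following nonconvex program (the symbols $T$, $K$, and $a_t$ are our labels for the number of training signals, the sparsity level, and the sparse codes):

$$\min_{D,\,\{a_t\}} \; \sum_{t=1}^{T} \|x_t - D a_t\|_2^2 \quad \text{s.t.} \quad \|a_t\|_0 \le K \;\;\forall t, \qquad \|d_j\|_2 = 1 \;\;\forall j.$$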
Dictionary Learning. Successfully applied to denoising, deblurring, inpainting, demosaicking, super-resolution, ... –State-of-the-art results in many of these problems. [Image example: Aharon and Elad '06]
Dictionary Coherence. Suppose that the learned dictionary is normalized to have unit $\ell_2$-norm columns: $\|d_i\|_2 = 1$. The mutual coherence of $D$ is defined as $\mu(D) = \max_{i \neq j} |\langle d_i, d_j \rangle|$. Geometrically, $\mu(D)$ represents the cosine of the minimum angle between the columns of $D$; smaller is better. Crucial parameter in analysis as well as practice (line of work starting with Tropp ['04]).
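The definition translates directly into a few lines of NumPy (a minimal sketch; the function name is ours):

```python
import numpy as np

def mutual_coherence(D):
    """Mutual coherence: the largest |<d_i, d_j>| over distinct columns i != j."""
    Dn = D / np.linalg.norm(D, axis=0)   # enforce unit-norm columns
    G = np.abs(Dn.T @ Dn)                # absolute Gram matrix
    np.fill_diagonal(G, 0)               # ignore the trivial i == j entries
    return G.max()

D = np.random.randn(64, 128)
print(mutual_coherence(D))               # coherence of a random 64x128 dictionary
```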
Dictionaries and CS. Can extend CS to work with non-orthonormal, redundant dictionaries: sense $y = \Phi D a$. The coherence of $\Phi D$ determines recovery success; see Rauhut et al. ['08], Candes et al. ['10]. Fortunately, a random $\Phi$ guarantees low coherence of $\Phi D$ (with high probability). "Holographic basis."
Geometric Intuition. Columns of D: points on the unit sphere. Coherence: cosine of the minimum angle between the vectors. J-L Lemma: random projections approximately preserve angles between vectors (a quick numerical check below).
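A small sketch of that intuition (sizes and seed are arbitrary): the angle between two vectors barely changes under an i.i.d. Gaussian projection.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 256, 64
u, v = rng.standard_normal(N), rng.standard_normal(N)
Phi = rng.standard_normal((M, N)) / np.sqrt(M)   # random Gaussian projection

def cos_angle(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# The two cosines agree closely, with high probability over Phi.
print(cos_angle(u, v), cos_angle(Phi @ u, Phi @ v))
```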
Q: Can we do better than random projections for dictionary-based CS? Q restated: for a given dictionary D, find the best CS measurement matrix $\Phi$.
Optimization Approach. Assume that a good dictionary D has been provided. Goal: learn the best $\Phi$ for this particular D. As before, we want the "shortest" matrix $\Phi$ (fewest rows $M$) such that the coherence of $\Phi D$ is at most some parameter $\alpha$. To avoid degeneracies caused by a simple scaling, we also require that $\Phi$ does not shrink the columns of $D$ much: $\|\Phi d_i\|_2^2 \ge 1 - \delta$.
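One plausible way to write this design problem (the symbols $\alpha$ for the coherence target and $\delta$ for the allowed shrinkage are ours):

$$\min_{\Phi \in \mathbb{R}^{M \times N}} \; M \quad \text{s.t.} \quad |\langle \Phi d_i, \Phi d_j \rangle| \le \alpha \;\; (i \neq j), \qquad \|\Phi d_i\|_2^2 \ge 1 - \delta \;\; \forall i.$$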
A NuMax-like Framework. Convert the quadratic constraints in $\Phi$ into linear constraints in $P = \Phi^{\mathsf T}\Phi$ (via the "lifting trick"). Use a nuclear-norm relaxation of the rank. Simplified problem below.
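A sketch of the resulting convex program: each quadratic constraint $\langle \Phi d_i, \Phi d_j \rangle = d_i^{\mathsf T} P d_j$ is linear in the lifted variable $P \succeq 0$, and $\mathrm{rank}(P) = M$ is relaxed to the nuclear norm (which equals the trace for PSD matrices):

$$\min_{P \succeq 0} \; \|P\|_* \quad \text{s.t.} \quad |d_i^{\mathsf T} P d_j| \le \alpha \;\; (i \neq j), \qquad d_i^{\mathsf T} P d_i \ge 1 - \delta \;\; \forall i.$$

$\Phi$ is then recovered from an eigendecomposition of the optimal $P$, with $M = \mathrm{rank}(P)$.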
Alternating Direction Method of Multipliers (ADMM):
- solve for P using spectral thresholding
- solve for L using least-squares
- solve for q using "squishing" (projection onto the constraint box)
Convergence rate depends on the size of the dictionary (since #constraints = $O(Q^2)$). Algorithm: "NuMax-Dict" [HSYB12]. A toy sketch of these updates follows.
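Below is a toy numerical sketch of the three updates, under our own simplified splitting $\min \|P\|_*$ s.t. $P = L$, $A(L) = q$, $q \in$ box; the penalty parameters, initialization, and fixed iteration count are placeholders, not the tuned choices of [HSYB12].

```python
import numpy as np

def numax_dict_sketch(D, alpha=0.5, delta=0.1, rho1=1.0, rho2=1.0, n_iters=300):
    """Hedged NuMax-Dict-style ADMM sketch (simplified splitting, toy sizes)."""
    N, Q = D.shape
    pairs = [(i, j) for i in range(Q) for j in range(Q)]
    # Row (i, j) of A satisfies  A @ vec(L) = d_i^T L d_j  (a rank-1 constraint).
    A = np.stack([np.outer(D[:, i], D[:, j]).ravel() for i, j in pairs])
    # Box: |d_i^T P d_j| <= alpha off-diagonal (coherence);
    # d_i^T P d_i in [1 - delta, 1 + delta] on it (don't shrink columns).
    lo = np.array([1 - delta if i == j else -alpha for i, j in pairs])
    hi = np.array([1 + delta if i == j else alpha for i, j in pairs])

    P, L = np.eye(N), np.eye(N)
    U = np.zeros((N, N))                     # scaled dual for P = L
    v = np.zeros(len(pairs))                 # scaled dual for A(L) = q
    q = np.clip(A @ L.ravel(), lo, hi)
    # Normal matrix for the L least-squares step (fine at toy sizes;
    # use a conjugate-gradient solver at scale).
    Minv = np.linalg.inv(rho1 * np.eye(N * N) + rho2 * (A.T @ A))

    for _ in range(n_iters):
        # P-step: singular-value soft-thresholding (the SVD handles a
        # possibly non-symmetric iterate; cf. the asymmetric constraints).
        W, s, Vt = np.linalg.svd(L - U)
        P = (W * np.maximum(s - 1.0 / rho1, 0.0)) @ Vt
        # L-step: least squares tying P to the constraint values q.
        rhs = rho1 * (P + U).ravel() + rho2 * (A.T @ (q - v))
        L = (Minv @ rhs).reshape(N, N)
        AL = A @ L.ravel()
        # q-step: "squishing" = projection onto the box constraints.
        q = np.clip(AL + v, lo, hi)
        # Dual ascent.
        U += P - L
        v += AL - q
    return P

# Usage: small random dictionary; recover Phi from the symmetrized P.
D = np.random.randn(16, 32)
D /= np.linalg.norm(D, axis=0)
P = numax_dict_sketch(D)
evals, evecs = np.linalg.eigh((P + P.T) / 2)
keep = evals > 1e-3
Phi = np.sqrt(evals[keep])[:, None] * evecs[:, keep].T
print("rows of Phi (M):", Phi.shape[0])
```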
NuMax vs. NuMax-Dict. Same intuition, trick, algorithm, etc.; the key enabler is that coherence is intrinsically a quadratic function of the data. Key difference: the (linearized) constraints are no longer symmetric –We have constraints of the form $d_i^{\mathsf T} P d_j$ with $i \neq j$ –This might result in intermediate P estimates having complex eigenvalues, so the notion of spectral thresholding needs to be slightly modified (e.g., threshold singular values rather than eigenvalues).
Experimental Results
Expt 1: Synthetic Dictionary. Generic dictionary: random with unit-norm columns; dictionary size 64x128. We construct different measurement matrices: random; NuMax-Dict; the algorithm of Elad ['06]; the algorithm of Duarte-Carvajalino and Sapiro ['08]. We generate K=3 sparse signals with Gaussian amplitudes and add 30 dB measurement noise. Recovery uses OMP. We measure recovery SNR and plot it as a function of M. (A minimal version of this pipeline is sketched below.)
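A self-contained sketch of the pipeline, not the authors' code: the OMP routine is bare-bones, and the random $\Phi$ stands in for the four matrices compared on the slide.

```python
import numpy as np

rng = np.random.default_rng(0)

def omp(A, y, k):
    """Bare-bones OMP: greedy atom selection by normalized correlation,
    followed by least squares on the selected columns."""
    norms = np.linalg.norm(A, axis=0)
    resid, idx = y.copy(), []
    for _ in range(k):
        idx.append(int(np.argmax(np.abs(A.T @ resid) / norms)))
        coef, *_ = np.linalg.lstsq(A[:, idx], y, rcond=None)
        resid = y - A[:, idx] @ coef
    a = np.zeros(A.shape[1])
    a[idx] = coef
    return a

N, Q, K, M, snr_db = 64, 128, 3, 24, 30          # sweep M to reproduce the plot
D = rng.standard_normal((N, Q))
D /= np.linalg.norm(D, axis=0)                    # random dictionary, unit-norm columns
Phi = rng.standard_normal((M, N)) / np.sqrt(M)    # random baseline; swap in NuMax-Dict etc.

snrs = []
for _ in range(200):
    a = np.zeros(Q)
    a[rng.choice(Q, K, replace=False)] = rng.standard_normal(K)  # K-sparse, Gaussian amps
    x = D @ a
    y = Phi @ x
    n = rng.standard_normal(M)
    y = y + n * (np.linalg.norm(y) / np.linalg.norm(n)) * 10 ** (-snr_db / 20)
    x_hat = D @ omp(Phi @ D, y, K)
    snrs.append(20 * np.log10(np.linalg.norm(x) / np.linalg.norm(x - x_hat)))
print(f"mean recovery SNR at M={M}: {np.mean(snrs):.1f} dB")
```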
Expt 1: Synthetic Dictionary. [Figure: recovery SNR vs. number of measurements M for the four measurement matrices]
Expt 2: Practical Dictionaries. A 2x overcomplete DCT dictionary (same parameters as Expt 1), and a 2x overcomplete dictionary learned via K-SVD on 8x8 patches of a real-world image (Barbara). Recovery uses OMP.
Analysis. The exact problem seems to be hard to analyze. But, as in NuMax, we can provide analytical bounds in the special case where the measurement matrix is further constrained to be orthonormal.
Orthogonal Sensing of Dictionary-Sparse Signals. Given a dictionary D, find the orthonormal measurement matrix $\Phi$ that provides the best possible coherence $\mu(\Phi D)$. From a geometric perspective, ortho-projections cannot improve coherence, so necessarily $\mu(\Phi D) \ge \mu(D)$.
Semidefinite Relaxation. The usual trick: lifting and trace-norm relaxation (see below).
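A plausible form of that relaxation (our reconstruction, not necessarily the exact program from the talk): for orthonormal $\Phi$, $P = \Phi^{\mathsf T}\Phi$ is a rank-$M$ orthogonal projector ($P = P^{\mathsf T}$, $P^2 = P$, $\operatorname{trace}(P) = M$); relaxing the projector constraint to $0 \preceq P \preceq I$ gives an SDP, where $\mu$ is the coherence target and the renormalization of the columns of $\Phi D$ is ignored for simplicity:

$$\min_{P,\,\mu} \; \mu \quad \text{s.t.} \quad |d_i^{\mathsf T} P d_j| \le \mu \;\;(i \neq j), \qquad \operatorname{trace}(P) = M, \qquad 0 \preceq P \preceq I.$$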
Theoretical Result. Theorem: For any given redundant dictionary D, denote its mutual coherence by $\mu(D)$, and let $\mu^*$ denote the optimum of the (nonconvex) problem over rank-M orthonormal measurement matrices. Then there exists a method to produce a rank-2M orthonormal matrix $\Phi$ such that the coherence of $\Phi D$ is at most $\mu^*$, i.e., $\mu(\Phi D) \le \mu^*$. We can obtain close to optimal performance, but pay a price of a factor of 2 in the number of measurements.
Conclusions. NuMax-Dict performance is comparable to the best existing algorithms. Principled convex optimization framework. Efficient ADMM-type algorithm that exploits the rank-1 structure of the problem. Upshot: possible to incorporate other structure into the measurement matrix, such as positivity, sparsity, etc.
Open Question. The above framework assumes a two-step approach: first construct a redundant dictionary (analytically or from data), then construct a measurement matrix for it. Given a large amount of training data, how can we efficiently solve jointly for both the dictionary and the sensing matrix? (An approach was introduced in Duarte-Carvajalino and Sapiro ['08].)