Slide 1: Sparse & Redundant Representations by Iterative-Shrinkage Algorithms
Michael Elad, The Computer Science Department, The Technion – Israel Institute of Technology, Haifa 32000, Israel.
* Joint work with Boaz Matalon, Joseph Shtok, and Michael Zibulevsky.
Slide 2: Today's Talk
Today's talk is about the minimization of the function

    f(α) = ½‖y − Dα‖₂² + λ·ρ(α)

by iterative-shrinkage algorithms. We will discuss:
- Why is this minimization task important?
- Which applications could benefit from this minimization?
- How can it be minimized effectively?
- What iterative-shrinkage methods are out there?
Slide 3: Agenda
1. Motivating the Minimization of f(α) — describing various applications that need this minimization
2. Some Motivating Facts — general-purpose optimization tools, and the unitary case
3. Iterative-Shrinkage Algorithms — we describe five versions of those in detail
4. Some Results — image deblurring results
5. Conclusions
Slide 4: Let's Start with Image Denoising
We wish to remove additive noise: y = x + v, where y are the given measurements and x is the unknown to be recovered. Many of the existing image denoising algorithms are related to the minimization of an energy function of the form

    f(x) = ½‖y − x‖₂² + λ·Pr(x),

where the first term is the likelihood (relation to the measurements) and the second is the prior, or regularization. We will use a sparse & redundant representation prior.
Slide 5: Our MAP Energy Function
We assume that x is created by the model M: x = Dα, where α is a sparse & redundant representation and D is a known dictionary. This leads to

    f(α) = ½‖y − Dα‖₂² + λ·ρ(α).

This MAP denoising algorithm is known as Basis Pursuit Denoising [Chen, Donoho & Saunders 1995]. The term ρ(α) measures the sparsity of the solution:
- The L0-norm (‖α‖₀) leads to a non-smooth & non-convex problem.
- The Lp-norm (‖α‖ₚᵖ) with 0 < p ≤ 1 is often found to be equivalent.
- Many other ADDITIVE sparsity measures are possible.
This is our problem!
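To make the objective concrete, here is a minimal numpy sketch of this energy with the L1 measure, ρ(α) = Σ|αⱼ|; the names D, y and lam are illustrative placeholders, not tied to any specific experiment:

```python
import numpy as np

def bpdn_objective(alpha, D, y, lam):
    """Basis Pursuit Denoising energy with rho = L1:
    f(alpha) = 0.5 * ||y - D @ alpha||_2^2 + lam * ||alpha||_1."""
    residual = y - D @ alpha
    return 0.5 * residual @ residual + lam * np.abs(alpha).sum()
```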
Slide 6: General (Linear) Inverse Problems
Assume that x is known to emerge from the model M as before, i.e., x = Dα. Suppose we observe y = Hx + v, a "blurred" and noisy version of x. How could we recover x? A MAP estimator leads to

    f(α) = ½‖y − HDα‖₂² + λ·ρ(α).

This is our problem!
Slide 7: Inverse Problems of Interest
- De-noising
- De-blurring
- In-painting
- De-mosaicing
- Tomography
- Image scale-up & super-resolution
- And more ...
In all of these, this is our problem!
Slide 8: Signal Separation
Given a mixture z = x₁ + x₂ + v of two sources, x₁ = D₁α₁ and x₂ = D₂α₂ (emerging from the models M₁ and M₂), plus white Gaussian noise v, we desire to separate it into its ingredients. Written differently, z = [D₁ D₂][α₁; α₂] + v. Thus, solving this problem using MAP leads to the Morphological Component Analysis (MCA) [Starck, Elad & Donoho 2005]:

    f(α₁, α₂) = ½‖z − D₁α₁ − D₂α₂‖₂² + λ·ρ(α₁) + λ·ρ(α₂).

This is our problem!
Slide 9: Compressed Sensing [Candès et al. 2006], [Donoho 2006]
In compressed sensing we compress the signal x by exploiting its origin (x = Dα with a sparse α). This is done by p ≪ n random projections: y = Px + v, with P of size p×n. The core idea: y holds all the information about the original signal x, even though p ≪ n. Reconstruction? Use MAP again and solve

    f(α) = ½‖y − PDα‖₂² + λ·ρ(α).

This is our problem!
Slide 10: Brief Summary #1
The minimization of the function

    f(α) = ½‖y − Dα‖₂² + λ·ρ(α)

is a worthy task, serving many and varied applications. So, how should this be done when the dimensions involved are huge (thousands to many millions)?
Slide 11: Agenda — Part 2: Some Motivating Facts (general-purpose optimization tools, and the unitary case)
[Diagram: apply wavelet transform → LUT → apply inverse wavelet transform — a preview of the unitary-case solver.]
Slide 12: Is There a Problem?
The first thought: with all the existing knowledge in optimization, we could surely find a solution. Methods to consider:
- (Normalized) Steepest Descent — compute the gradient and follow its path.
- Conjugate Gradient — use the gradient and the previous update direction, combined by a preset formula.
- Pre-conditioned SD — weight the gradient by the inverse of the Hessian's diagonal.
- Truncated Newton — use the gradient and Hessian to define a linear system, and solve it approximately by a set of CG steps.
- Interior-Point algorithms — separate the unknowns into positive and negative entries, and use both the primal and the dual problems, plus a barrier for forcing positivity.
Slide 13: General-Purpose Software?
So, simply download one of the many general-purpose packages: L1-Magic (interior-point solver), SparseLab (interior-point solver), MOSEK (various tools), the Matlab Optimization Toolbox (various tools), ...
A problem: general-purpose software packages (algorithms) typically perform poorly on our task. Possible reasons:
- The fact that the solution is expected to be sparse (or nearly so) is not exploited by such algorithms.
- The Hessian of f(α) tends to be highly ill-conditioned near the (sparse) solution.
[Figure: the Hessian near the solution, showing relatively small entries for the non-zero entries in the solution.]
So, are we stuck? Is this problem really that complicated?
Slide 14: Consider the Unitary Case (DDᴴ = I)
Since the L2-norm is unitarily invariant, ‖y − Dα‖₂² = ‖Dᴴy − α‖₂². Defining β = Dᴴy, we get

    f(α) = ½‖β − α‖₂² + λ·ρ(α) = Σⱼ [ ½(βⱼ − αⱼ)² + λ·ρ(αⱼ) ].

We got a separable set of m identical 1D optimization problems.
Slide 15: The 1D Task
We need to solve the following 1D problem:

    α_opt = argmin_α g(α) = ½(α − β)² + λ·ρ(α).

Its solution, as a function of β, can be tabulated as a look-up table: α_opt = S_{ρ,λ}(β). Such a LUT can be built for ANY sparsity measure ρ(α), including non-convex and non-smooth ones (e.g., the L0-norm), giving in all cases the GLOBAL minimizer of g(α).
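Such a LUT is easy to build numerically by brute force. The sketch below (a hypothetical helper, not from the talk) evaluates g(α) on a dense grid of α values and records the global minimizer for each β; since ρ is only evaluated pointwise, non-convex and non-smooth measures pose no difficulty:

```python
import numpy as np

def build_shrinkage_lut(rho, lam, betas, alpha_grid):
    """Tabulate S_{rho,lam}(beta) = argmin_a 0.5*(a - beta)**2 + lam*rho(a)
    by exhaustive search over alpha_grid -- finds the GLOBAL scalar
    minimizer even for non-convex, non-smooth rho."""
    penalty = lam * rho(alpha_grid)          # rho evaluated once, pointwise
    lut = np.empty_like(betas)
    for i, beta in enumerate(betas):
        lut[i] = alpha_grid[np.argmin(0.5 * (alpha_grid - beta) ** 2 + penalty)]
    return lut

# Example: the (non-convex) L0 measure rho(a) = [a != 0] yields a hard threshold.
betas = np.linspace(-3.0, 3.0, 601)
lut = build_shrinkage_lut(lambda a: (a != 0).astype(float), 1.0,
                          betas, np.linspace(-3.0, 3.0, 6001))
```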
Slide 16: The Unitary Case — A Summary
Minimizing f(α) = ½‖y − Dα‖₂² + λ·ρ(α) is done in three steps: multiply y by Dᴴ, apply the scalar LUT S_{ρ,λ} entry by entry, and (to return to the signal domain) multiply by D. DONE! The obtained solution is the GLOBAL minimizer of f(α), even if f(α) is non-convex.
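For ρ = L1 the LUT has the familiar closed form (the soft threshold), and the whole unitary-case solver is a few lines. A minimal sketch, assuming D is unitary:

```python
import numpy as np

def soft(beta, t):
    """Closed-form shrinkage S_{rho,t} for rho = L1 (soft thresholding)."""
    return np.sign(beta) * np.maximum(np.abs(beta) - t, 0.0)

def unitary_minimizer(D, y, lam):
    """Global minimizer of f(alpha) when D is unitary:
    analyze (beta = D^H y), shrink each entry, done."""
    return soft(D.conj().T @ y, lam)
```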
Slide 17: Brief Summary #2
The minimization of f(α) = ½‖y − Dα‖₂² + λ·ρ(α) leads to two very contradictory observations:
1. The problem is quite hard — classic optimization tools find it difficult.
2. The problem is trivial for the case of a unitary D.
How can we enjoy this simplicity in the general case?
Slide 18: Agenda — Part 3: Iterative-Shrinkage Algorithms (we describe several versions of those in detail)
Slide 19: Let's Start with Something Simple
What if our problem has a two-ortho dictionary, D = [Ψ, Φ] with Ψ and Φ unitary? Sardy, Bruce and Zheng suggested a Block-Coordinate-Descent (BCD) approach to handle this problem. The key idea is to alternate between two steps: fix α₂ and update α₁, then fix α₁ and update α₂ — each is an ortho problem for which we have a closed-form solution.
Slide 20: Here are the Details
Fixing α₂, the problem in α₁ becomes min over α₁ of ½‖(y − Φα₂) − Ψα₁‖₂² + λ·ρ(α₁) — a unitary problem, solved in closed form by α₁ = S_{ρ,λ}(Ψᴴ(y − Φα₂)); and similarly, α₂ = S_{ρ,λ}(Φᴴ(y − Ψα₁)).
Slide 21: The BCR Algorithm
Iterate the following two steps:

    α₁ ← S_{ρ,λ}(Ψᴴ(y − Φα₂))
    α₂ ← S_{ρ,λ}(Φᴴ(y − Ψα₁))

This algorithm has been shown to converge to a local minimum of the original problem. If the chosen ρ(α) is convex, this implies convergence to the global optimum.
Slide 22: The BCR Algorithm
Let's consider a somewhat different (and slower) scheme, in which both halves are updated from the previous iterate:

    α₁ᵏ⁺¹ = S_{ρ,λ}(Ψᴴ(y − Φα₂ᵏ)),    α₂ᵏ⁺¹ = S_{ρ,λ}(Φᴴ(y − Ψα₁ᵏ)).

A little bit of algebra enables merging these two equations into

    αᵏ⁺¹ = S_{ρ,λ}(Dᴴ(y − Dαᵏ) + αᵏ).

The beauty of this iterative formula is that it does not rely on the splitting of D into the two ortho parts. Could this step be true for general dictionaries D?
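The merge is easy to verify numerically. In the sketch below (illustrative only: random orthogonal Ψ and Φ, ρ = L1), both halves are updated from the previous iterate, and the merged formula on the stacked vector reproduces them:

```python
import numpy as np

soft = lambda b, t: np.sign(b) * np.maximum(np.abs(b) - t, 0.0)

n, lam = 16, 0.1
rng = np.random.default_rng(0)
Psi, _ = np.linalg.qr(rng.standard_normal((n, n)))   # two random
Phi, _ = np.linalg.qr(rng.standard_normal((n, n)))   # orthogonal parts
D = np.hstack([Psi, Phi])
y = rng.standard_normal(n)
a1, a2 = rng.standard_normal(n), rng.standard_normal(n)

# One pair of BCR-style updates, both computed from the previous iterate:
next_a1 = soft(Psi.T @ (y - Phi @ a2), lam)
next_a2 = soft(Phi.T @ (y - Psi @ a1), lam)

# The merged step on the stacked vector alpha = [a1; a2]:
alpha = np.concatenate([a1, a2])
merged = soft(D.T @ (y - D @ alpha) + alpha, lam)

assert np.allclose(merged, np.concatenate([next_a1, next_a2]))
```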
Slide 23: Iterative-Shrinkage Algorithms?
We will present THE PRINCIPLES of three leading methods:
- Bound-Optimization and EM [Figueiredo & Nowak '03],
- The Separable Surrogate Functions (SSF) algorithm [Daubechies, Defrise & De Mol '04],
- The Parallel-Coordinate-Descent (PCD) algorithm [Elad '05], [Matalon et al. '06].
There are other such methods: IRLS-based, StOMP, FISTA, ADMM and augmented-Lagrangian methods, split Bregman, ...
Common to all is a set of operations in every iteration that includes: (i) multiplication by D, (ii) multiplication by Dᴴ, and (iii) a scalar shrinkage on the solution, S_{ρ,λ}(α). Some of these algorithms are a direct generalization of the unitary case — their first iteration is the unitary solver we have seen.
Slide 24: 1. The Proximal-Point Method
Aim: minimize f(α) — suppose it is found to be too hard. Define a surrogate function g(α, α₀) = f(α) + dist(α, α₀), using a general (uni-modal, non-negative) distance function. Then the iteration

    αᵏ⁺¹ = argmin_α g(α, αᵏ),    starting from some α⁰,

necessarily converges to a local minimum of f(α) [Rockafellar '76]. Comments: (i) Is the minimization of g(α, α₀) easier? It had better be! (ii) It looks like it will slow down convergence. Really?
Slide 25: The Proposed Surrogate Functions
Our original function is f(α) = ½‖y − Dα‖₂² + λ·ρ(α). The distance to use, proposed by [Daubechies, Defrise & De Mol '04], is

    dist(α, α₀) = (c/2)‖α − α₀‖₂² − ½‖Dα − Dα₀‖₂²,

which requires c > ‖DᴴD‖₂ (the largest eigenvalue) to be non-negative. The beauty of this choice: the coupling term αᴴDᴴDα vanishes, so g(α, α₀) is a separable sum of m 1D problems. Thus we have a closed-form solution by THE SAME SHRINKAGE!
Slide 26: Detailed Derivation
Expanding g(α, α₀) = ½‖y − Dα‖₂² + λ·ρ(α) + (c/2)‖α − α₀‖₂² − ½‖Dα − Dα₀‖₂², the quadratic coupling terms αᴴDᴴDα cancel, and up to constants (in α) we are left with

    g(α, α₀) = (c/2)‖α − β₀‖₂² + λ·ρ(α) + const,    where β₀ = α₀ + (1/c)·Dᴴ(y − Dα₀).
Slide 27: Detailed Derivation (cont.)
Minimizing (c/2)‖α − β₀‖₂² + λ·ρ(α) is a classic format for which we know that the solution is plain shrinkage: α = S_{ρ,λ/c}(β₀). Observe that this formula is nothing but a gradient-descent step over the quadratic term, followed by a projection onto the sparsity penalty.
Slide 28: The Proposed Surrogate Functions (cont.)
To summarize: the minimization of g(α, αᵏ) is done in closed form by shrinkage, applied to the vector βᵏ = αᵏ + (1/c)·Dᴴ(y − Dαᵏ), and this generates the solution αᵏ⁺¹ of the next iteration.
Slide 29: The Resulting SSF Algorithm
While the unitary-case solution is given by α̂ = S_{ρ,λ}(Dᴴy), the general case, by SSF, requires iterating

    αᵏ⁺¹ = S_{ρ,λ/c}( αᵏ + (1/c)·Dᴴ(y − Dαᵏ) ).
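In code, each SSF iteration costs one multiplication by D, one by Dᴴ, and a scalar shrinkage. A minimal sketch with ρ = L1, assuming c is chosen above the largest eigenvalue of DᴴD:

```python
import numpy as np

soft = lambda b, t: np.sign(b) * np.maximum(np.abs(b) - t, 0.0)

def ssf(D, y, lam, c, n_iters=100):
    """SSF iteration: alpha <- S_{lam/c}(alpha + D^H (y - D alpha) / c).
    Requires c > ||D||_2^2 so the surrogate distance is non-negative.
    With alpha_0 = 0, the first iterate is the (scaled) unitary solver."""
    alpha = np.zeros(D.shape[1])
    for _ in range(n_iters):
        alpha = soft(alpha + D.conj().T @ (y - D @ alpha) / c, lam / c)
    return alpha
```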
Slide 30: A Word about Convergence Speed
We are minimizing the surrogate functional g(α, α₀) = f(α) + (c/2)‖α − α₀‖₂² − ½‖Dα − Dα₀‖₂². The effect of the extra distance term is to slow down the convergence, because α is not permitted to be "far" from α₀. However, the added distance is not isotropic — it penalizes the optimization in some directions while being almost transparent in others. For example, assume that c = 1 and that α − α₀ is sparse. Then, due to the RIP, ‖α − α₀‖₂² ≈ ‖Dα − Dα₀‖₂², and the two terms of the added distance nearly cancel. Thus, for sparse updates, there is a "tunneling" effect.
Slide 31: 2. The Bound-Optimization Technique
Aim: minimize f(α) — suppose it is found to be too hard. Define a function Q(α, α₀) that satisfies the following conditions:
- Q(α₀, α₀) = f(α₀) — the surrogate touches f at α₀;
- Q(α, α₀) ≥ f(α) for all α — the surrogate upper-bounds f everywhere.
Then the iteration αᵏ⁺¹ = argmin_α Q(α, αᵏ) necessarily converges to a local minimum of f(α) [Hunter & Lange (review) '04]. Regarding this method:
- The above is closely related to the EM algorithm [Neal & Hinton '98].
- Figueiredo & Nowak's method ('03) uses the BO idea to minimize f(α), with the VERY SAME surrogate functions we saw before.
Slide 32: 3. Start with Coordinate Descent
We aim to minimize f(α) = ½‖y − Dα‖₂² + λ·ρ(α). First, consider the Coordinate Descent (CD) algorithm: update one entry αⱼ at a time, keeping all the others fixed. This is a 1D minimization problem,

    min over αⱼ of ½‖eⱼ − dⱼαⱼ‖₂² + λ·ρ(αⱼ),    where eⱼ = y − Σ_{i≠j} dᵢαᵢ,

and it has a closed-form solution, using a simple SHRINKAGE as before, applied to the scalar dⱼᴴeⱼ/‖dⱼ‖₂². A sketch of one CD sweep follows.
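Here is a minimal sketch of one full CD sweep with ρ = L1 (real-valued data assumed). Updating the residual in place keeps the cost of each coordinate update at one inner product with its atom:

```python
import numpy as np

soft = lambda b, t: np.sign(b) * np.maximum(np.abs(b) - t, 0.0)

def cd_sweep(D, y, lam, alpha):
    """One Coordinate Descent pass: each 1D problem is solved exactly by
    shrinking the scalar d_j^T e_j / ||d_j||^2, where e_j is the residual
    with atom j's own contribution put back."""
    residual = y - D @ alpha
    for j in range(D.shape[1]):
        d = D[:, j]
        norm2 = d @ d
        beta = (d @ residual) / norm2 + alpha[j]
        new_aj = soft(beta, lam / norm2)
        residual += d * (alpha[j] - new_aj)   # keep the residual consistent
        alpha[j] = new_aj
    return alpha
```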
Slide 33: Detailed Derivation
Expanding the 1D problem, ½‖eⱼ − dⱼαⱼ‖₂² + λ·ρ(αⱼ) = (‖dⱼ‖₂²/2)·(αⱼ − dⱼᴴeⱼ/‖dⱼ‖₂²)² + λ·ρ(αⱼ) + const — this is a classic shrinkage format, solved by αⱼ = S_{ρ, λ/‖dⱼ‖₂²}(dⱼᴴeⱼ/‖dⱼ‖₂²).
Slide 34: Parallel Coordinate Descent (PCD)
From the current solution αᵏ in the m-dimensional space, the previous CD algorithm yields m descent directions, one per coordinate. We take the SUM of these m descent directions as the update step. Since this sum is no longer guaranteed to descend with a unit step, a line search is mandatory. This leads to

    αᵏ⁺¹ = αᵏ + μ·( S_{ρ,λQ}( Q·Dᴴ(y − Dαᵏ) + αᵏ ) − αᵏ ),

where Q = diag(DᴴD)⁻¹, the shrinkage thresholds are λ·qⱼ per entry, and μ is found by line search.
Slide 35: The PCD Algorithm [Elad '05], [Matalon, Elad & Zibulevsky '06]

    αᵏ⁺¹ = αᵏ + μ·( S_{ρ,λQ}( Q·Dᴴ(y − Dαᵏ) + αᵏ ) − αᵏ ),

where Q = diag(DᴴD)⁻¹ and μ represents a line search (LS). Note: Q can be computed quite easily off-line, and its storage is just like storing the vector αᵏ.
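A minimal PCD sketch with ρ = L1; a crude grid over candidate step sizes stands in for a proper line search (a careful implementation evaluates f along the line from the cached product D·v, at no extra multiplications by D):

```python
import numpy as np

soft = lambda b, t: np.sign(b) * np.maximum(np.abs(b) - t, 0.0)

def pcd(D, y, lam, n_iters=100):
    """Parallel Coordinate Descent: take all m scalar CD updates at once,
    then line-search along the joint direction. Q = diag(D^T D)^{-1}."""
    f = lambda a: 0.5 * np.sum((y - D @ a) ** 2) + lam * np.abs(a).sum()
    q = 1.0 / np.sum(D * D, axis=0)            # diagonal of Q, computed off-line
    alpha = np.zeros(D.shape[1])
    for _ in range(n_iters):
        target = soft(q * (D.T @ (y - D @ alpha)) + alpha, lam * q)
        v = target - alpha                     # sum of the m CD directions
        mu = min(np.linspace(0.1, 2.0, 20), key=lambda m: f(alpha + m * v))
        alpha = alpha + mu * v
    return alpha
```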
Slide 36: Algorithms' Speed-Up
Given the update direction v (for example, in SSF, v = S_{ρ,λ/c}(αᵏ + (1/c)Dᴴ(y − Dαᵏ)) − αᵏ), there are three options:
- Option 1 — use v as is: αᵏ⁺¹ = αᵏ + v.
- Option 2 — use a line search along v: αᵏ⁺¹ = αᵏ + μv.
- Option 3 — use Sequential Subspace OPtimization (SESOP): optimize over the subspace spanned by v and the last few update directions.
Surprising as it may sound, these very effective acceleration methods can be implemented with no additional "cost" (i.e., multiplications by D or Dᵀ) [Zibulevsky & Narkis '05], [Elad, Matalon & Zibulevsky '07]. A sketch of Option 2 follows.
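For instance, Option 2 wraps the plain SSF step in a line search. A sketch (ρ = L1 again; the naive grid search below stands in for an exact LS — the point is only to show where μ enters):

```python
import numpy as np

soft = lambda b, t: np.sign(b) * np.maximum(np.abs(b) - t, 0.0)

def ssf_ls(D, y, lam, c, n_iters=100):
    """SSF with a line search (Option 2): the plain SSF update defines a
    direction v, and mu is chosen to minimize f along it."""
    f = lambda a: 0.5 * np.sum((y - D @ a) ** 2) + lam * np.abs(a).sum()
    alpha = np.zeros(D.shape[1])
    for _ in range(n_iters):
        v = soft(alpha + D.T @ (y - D @ alpha) / c, lam / c) - alpha
        mu = min(np.linspace(0.1, 3.0, 30), key=lambda m: f(alpha + m * v))
        alpha = alpha + mu * v
    return alpha
```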
Slide 37: Brief Summary #3
For an effective minimization of the function f(α) = ½‖y − Dα‖₂² + λ·ρ(α), we saw several iterative-shrinkage algorithms, built using:
1. The Proximal-Point Method
2. Bound Optimization
3. Parallel Coordinate Descent
4. Iterative Reweighted LS
5. Fixed-Point Iteration
6. Greedy Algorithms
How are they performing?
Slide 38: Agenda — Part 4: Some Results (image deblurring results)
Slide 39: A Deblurring Experiment
The original image is blurred by a 15×15 kernel and contaminated by white Gaussian noise v with σ² = 2, producing the measured image.
Slide 40: Penalty Function — More Details

    f(α) = ½‖y − HDα‖₂² + λ·ρ_S(α)

- y: a given blurred and noisy image of size 256×256.
- D: a 2D undecimated Haar wavelet transform, 3 resolution layers, 7:1 redundancy.
- ρ_S: an approximation of L1 with slight (S = 0.01) smoothing.
- H: the blur operator.
Note: this experiment is similar (but not equivalent) to one of the tests done in [Figueiredo & Nowak '05], which leads to state-of-the-art results.
Slide 41: The Dictionary — Undecimated Haar
Per level, the analysis applies a horizontal low-pass filter [+0.5, +0.5] or high-pass filter [+0.5, −0.5], followed by the corresponding vertical filters, with no decimation. Starting from the image, level 1 produces the bands LL1, LH1, HL1 and HH1; the LL1 band is filtered again to produce LL2, LH2, HL2 and HH2; and once more for LL3, LH3, HL3 and HH3. This cascade is actually implementing the multiplication by the matrix Dᵀ. A sketch of one analysis level follows.
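An illustrative sketch of one analysis level (scipy's correlate1d stands in for the filtering, and the cyclic boundary handling mode='wrap' is an assumption, not the talk's stated choice):

```python
import numpy as np
from scipy.ndimage import correlate1d

def undecimated_haar_level(image):
    """One level of the undecimated 2D Haar analysis: low-pass [0.5, 0.5]
    and high-pass [0.5, -0.5] along each axis, no decimation, producing
    four full-size bands. Cascading on LL implements D^T."""
    lp, hp = np.array([0.5, 0.5]), np.array([0.5, -0.5])
    filt = lambda x, h, axis: correlate1d(x, h, axis=axis, mode='wrap')
    low_h, high_h = filt(image, lp, 1), filt(image, hp, 1)   # horizontal pass
    return {'LL': filt(low_h, lp, 0), 'LH': filt(low_h, hp, 0),
            'HL': filt(high_h, lp, 0), 'HH': filt(high_h, hp, 0)}
```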
Slide 42: The Dictionary — Example Atoms (figure only: example atoms of the undecimated Haar dictionary)
Slide 43: The Function (figure only: a plot of the smoothed penalty ρ_S)
Slide 44: The Function — A Closer Look (figure only)
Slide 45: The Function — The Shrinkage
For this smoothed measure there is an analytic expression for the shrinkage S_{ρ,λ} (shown as a plot).
Slide 46: The Results — The Function Value
[Plot: f(α) − f_min (log scale) vs. iterations/computations, for SSF, SSF-LS and SSF-SESOP-5.]
Slide 47: The Results — The Function Value (cont.)
[Plot: f(α) − f_min vs. iterations/computations, now adding PCD-LS and PCD-SESOP-5.]
Comment: both SSF and PCD (and their accelerated versions) are provably converging to the minima of f(α).
Slide 48: The Results — ISNR
[Plot: ISNR [dB] vs. iterations/computations for SSF, SSF-LS and SSF-SESOP-5, reaching 6.41 dB.]
Slide 49: The Results — ISNR (cont.)
[Plot: ISNR [dB] vs. iterations/computations, adding PCD-LS and PCD-SESOP-5, reaching 7.03 dB.]
Slide 50: Visual Results
PCD-SESOP-5 restorations (original, measured, and restored images shown per iteration), with the ISNR climbing from −16.77 dB at iteration 0 to 0.07 dB (iteration 1), 2.47 dB (2), 4.18 dB (3), 4.97 dB (4), 5.59 dB (5), 6.22 dB (6), 6.65 dB (7), 6.68 dB (8), 6.94 dB (12), and 7.03 dB at iteration 19.
Slide 51: Agenda — Part 5: Conclusions
Slide 52: Conclusions — The Bottom Line
If your work leads you to the need to minimize f(α) = ½‖y − Dα‖₂² + λ·ρ(α), then:
- We recommend you use an iterative-shrinkage algorithm.
- SSF and PCD are preferred: both provably converge to the (local) minima of f(α), and their performance is very good, reaching a reasonable result in few iterations.
- Use the SESOP acceleration — it is very effective, at hardly any cost.
- There is room for more work on various aspects of these algorithms — see the accompanying paper.