Slide 1: Sparse & Redundant Representations by Iterative-Shrinkage Algorithms
Michael Elad, The Computer Science Department, The Technion – Israel Institute of Technology, Haifa 32000, Israel.
* Joint work with Boaz Matalon, Joseph Shtok, and Michael Zibulevsky.
Slide 2: Today's Talk
Today's talk is about the minimization of the function

    f(α) = ½‖y − Dα‖₂² + λ·ρ(α)

by iterative-shrinkage algorithms. We will discuss:
- Why is this minimization task important?
- Which applications could benefit from this minimization?
- How can it be minimized effectively?
- What iterative-shrinkage methods are out there?
Slide 3: Agenda
1. Motivating the Minimization of f(α) — describing various applications that need this minimization
2. Some Motivating Facts — general-purpose optimization tools, and the unitary case
3. Iterative-Shrinkage Algorithms — we describe five versions of those in detail
4. Some Results — image deblurring results
5. Conclusions
Slide 4: Let's Start with Image Denoising
We wish to remove additive noise: y = x + v, where y are the given measurements and x is the unknown to be recovered. Many of the existing image denoising algorithms are related to the minimization of an energy function of the form

    f(x) = ½‖y − x‖₂² + λ·Pr(x),

where the first term is the likelihood (relation to the measurements) and the second is the prior, or regularization. We will use a sparse & redundant representation prior.
Slide 5: Our MAP Energy Function
We assume that x is created by the model M: x = Dα, where α is a sparse & redundant representation and D is a known dictionary. This leads to

    f(α) = ½‖y − Dα‖₂² + λ·ρ(α).

This MAP denoising algorithm is known as Basis Pursuit Denoising [Chen, Donoho & Saunders 1995]. The term ρ(α) measures the sparsity of the solution:
- The L0-norm (‖α‖₀) leads to a non-smooth & non-convex problem.
- The Lp-norm (‖α‖ₚᵖ) with 0 < p ≤ 1 is often found to be equivalent.
- Many other ADDITIVE sparsity measures are possible.
This is our problem!
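To make the objective concrete, here is a minimal numpy sketch of this energy with the L1 measure, ρ(α) = Σ|αⱼ|; the names D, y and lam are illustrative placeholders, not tied to any specific experiment:

```python
import numpy as np

def bpdn_objective(alpha, D, y, lam):
    """Basis Pursuit Denoising energy with rho = L1:
    f(alpha) = 0.5 * ||y - D @ alpha||_2^2 + lam * ||alpha||_1."""
    residual = y - D @ alpha
    return 0.5 * residual @ residual + lam * np.abs(alpha).sum()
```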
Slide 6: General (Linear) Inverse Problems
Assume that x is known to emerge from the model M as before, i.e., x = Dα. Suppose we observe y = Hx + v, a "blurred" and noisy version of x. How could we recover x? A MAP estimator leads to

    f(α) = ½‖y − HDα‖₂² + λ·ρ(α).

This is our problem!
Slide 7: Inverse Problems of Interest
- De-noising
- De-blurring
- In-painting
- De-mosaicing
- Tomography
- Image scale-up & super-resolution
- And more ...
In all of these, this is our problem!
Slide 8: Signal Separation
Given a mixture z = x₁ + x₂ + v of two sources, x₁ = D₁α₁ and x₂ = D₂α₂ (emerging from the models M₁ and M₂), plus white Gaussian noise v, we desire to separate it into its ingredients. Written differently, z = [D₁ D₂][α₁; α₂] + v. Thus, solving this problem using MAP leads to the Morphological Component Analysis (MCA) [Starck, Elad & Donoho 2005]:

    f(α₁, α₂) = ½‖z − D₁α₁ − D₂α₂‖₂² + λ·ρ(α₁) + λ·ρ(α₂).

This is our problem!
Slide 9: Compressed Sensing [Candès et al. 2006], [Donoho 2006]
In compressed sensing we compress the signal x by exploiting its origin (x = Dα with a sparse α). This is done by p ≪ n random projections: y = Px + v, with P of size p×n. The core idea: y holds all the information about the original signal x, even though p ≪ n. Reconstruction? Use MAP again and solve

    f(α) = ½‖y − PDα‖₂² + λ·ρ(α).

This is our problem!
Slide 10: Brief Summary #1
The minimization of the function

    f(α) = ½‖y − Dα‖₂² + λ·ρ(α)

is a worthy task, serving many and varied applications. So, how should this be done when the dimensions involved are huge (thousands to many millions)?
Slide 11: Agenda — Part 2: Some Motivating Facts (general-purpose optimization tools, and the unitary case)
[Diagram: apply wavelet transform → LUT → apply inverse wavelet transform — a preview of the unitary-case solver.]
Slide 12: Is There a Problem?
The first thought: with all the existing knowledge in optimization, we could surely find a solution. Methods to consider:
- (Normalized) Steepest Descent — compute the gradient and follow its path.
- Conjugate Gradient — use the gradient and the previous update direction, combined by a preset formula.
- Pre-conditioned SD — weight the gradient by the inverse of the Hessian's diagonal.
- Truncated Newton — use the gradient and Hessian to define a linear system, and solve it approximately by a set of CG steps.
- Interior-Point algorithms — separate the unknowns into positive and negative entries, and use both the primal and the dual problems, plus a barrier for forcing positivity.
Slide 13: General-Purpose Software?
So, simply download one of the many general-purpose packages: L1-Magic (interior-point solver), SparseLab (interior-point solver), MOSEK (various tools), the Matlab Optimization Toolbox (various tools), ...
A problem: general-purpose software packages (algorithms) typically perform poorly on our task. Possible reasons:
- The fact that the solution is expected to be sparse (or nearly so) is not exploited by such algorithms.
- The Hessian of f(α) tends to be highly ill-conditioned near the (sparse) solution.
[Figure: the Hessian near the solution, showing relatively small entries for the non-zero entries in the solution.]
So, are we stuck? Is this problem really that complicated?
Slide 14: Consider the Unitary Case (DDᴴ = I)
Since the L2-norm is unitarily invariant, ‖y − Dα‖₂² = ‖Dᴴy − α‖₂². Defining β = Dᴴy, we get

    f(α) = ½‖β − α‖₂² + λ·ρ(α) = Σⱼ [ ½(βⱼ − αⱼ)² + λ·ρ(αⱼ) ].

We got a separable set of m identical 1D optimization problems.
Slide 15: The 1D Task
We need to solve the following 1D problem:

    α_opt = argmin_α g(α) = ½(α − β)² + λ·ρ(α).

Its solution, as a function of β, can be tabulated as a look-up table: α_opt = S_{ρ,λ}(β). Such a LUT can be built for ANY sparsity measure ρ(α), including non-convex and non-smooth ones (e.g., the L0-norm), giving in all cases the GLOBAL minimizer of g(α).
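Such a LUT is easy to build numerically by brute force. The sketch below (a hypothetical helper, not from the talk) evaluates g(α) on a dense grid of α values and records the global minimizer for each β; since ρ is only evaluated pointwise, non-convex and non-smooth measures pose no difficulty:

```python
import numpy as np

def build_shrinkage_lut(rho, lam, betas, alpha_grid):
    """Tabulate S_{rho,lam}(beta) = argmin_a 0.5*(a - beta)**2 + lam*rho(a)
    by exhaustive search over alpha_grid -- finds the GLOBAL scalar
    minimizer even for non-convex, non-smooth rho."""
    penalty = lam * rho(alpha_grid)          # rho evaluated once, pointwise
    lut = np.empty_like(betas)
    for i, beta in enumerate(betas):
        lut[i] = alpha_grid[np.argmin(0.5 * (alpha_grid - beta) ** 2 + penalty)]
    return lut

# Example: the (non-convex) L0 measure rho(a) = [a != 0] yields a hard threshold.
betas = np.linspace(-3.0, 3.0, 601)
lut = build_shrinkage_lut(lambda a: (a != 0).astype(float), 1.0,
                          betas, np.linspace(-3.0, 3.0, 6001))
```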
Slide 16: The Unitary Case — A Summary
Minimizing f(α) = ½‖y − Dα‖₂² + λ·ρ(α) is done in three steps: multiply y by Dᴴ, apply the scalar LUT S_{ρ,λ} entry by entry, and (to return to the signal domain) multiply by D. DONE! The obtained solution is the GLOBAL minimizer of f(α), even if f(α) is non-convex.
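For ρ = L1 the LUT has the familiar closed form (the soft threshold), and the whole unitary-case solver is a few lines. A minimal sketch, assuming D is unitary:

```python
import numpy as np

def soft(beta, t):
    """Closed-form shrinkage S_{rho,t} for rho = L1 (soft thresholding)."""
    return np.sign(beta) * np.maximum(np.abs(beta) - t, 0.0)

def unitary_minimizer(D, y, lam):
    """Global minimizer of f(alpha) when D is unitary:
    analyze (beta = D^H y), shrink each entry, done."""
    return soft(D.conj().T @ y, lam)
```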
Slide 17: Brief Summary #2
The minimization of f(α) = ½‖y − Dα‖₂² + λ·ρ(α) leads to two very contradictory observations:
1. The problem is quite hard — classic optimization tools find it difficult.
2. The problem is trivial for the case of a unitary D.
How can we enjoy this simplicity in the general case?
Slide 18: Agenda — Part 3: Iterative-Shrinkage Algorithms (we describe several versions of those in detail)
Slide 19: Let's Start with Something Simple
What if our problem has a two-ortho dictionary, D = [Ψ, Φ] with Ψ and Φ unitary? Sardy, Bruce and Zheng suggested a Block-Coordinate-Descent (BCD) approach to handle this problem. The key idea is to alternate between two steps: fix α₂ and update α₁, then fix α₁ and update α₂ — each is an ortho problem for which we have a closed-form solution.
Slide 20: Here are the Details
Fixing α₂, the problem in α₁ becomes min over α₁ of ½‖(y − Φα₂) − Ψα₁‖₂² + λ·ρ(α₁) — a unitary problem, solved in closed form by α₁ = S_{ρ,λ}(Ψᴴ(y − Φα₂)); and similarly, α₂ = S_{ρ,λ}(Φᴴ(y − Ψα₁)).
Slide 21: The BCR Algorithm
Iterate the following two steps:

    α₁ ← S_{ρ,λ}(Ψᴴ(y − Φα₂))
    α₂ ← S_{ρ,λ}(Φᴴ(y − Ψα₁))

This algorithm has been shown to converge to a local minimum of the original problem. If the chosen ρ(α) is convex, this implies convergence to the global optimum.
Slide 22: The BCR Algorithm
Let's consider a somewhat different (and slower) scheme, in which both halves are updated from the previous iterate:

    α₁ᵏ⁺¹ = S_{ρ,λ}(Ψᴴ(y − Φα₂ᵏ)),    α₂ᵏ⁺¹ = S_{ρ,λ}(Φᴴ(y − Ψα₁ᵏ)).

A little bit of algebra enables merging these two equations into

    αᵏ⁺¹ = S_{ρ,λ}(Dᴴ(y − Dαᵏ) + αᵏ).

The beauty of this iterative formula is that it does not rely on the splitting of D into the two ortho parts. Could this step be true for general dictionaries D?
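The merge is easy to verify numerically. In the sketch below (illustrative only: random orthogonal Ψ and Φ, ρ = L1), both halves are updated from the previous iterate, and the merged formula on the stacked vector reproduces them:

```python
import numpy as np

soft = lambda b, t: np.sign(b) * np.maximum(np.abs(b) - t, 0.0)

n, lam = 16, 0.1
rng = np.random.default_rng(0)
Psi, _ = np.linalg.qr(rng.standard_normal((n, n)))   # two random
Phi, _ = np.linalg.qr(rng.standard_normal((n, n)))   # orthogonal parts
D = np.hstack([Psi, Phi])
y = rng.standard_normal(n)
a1, a2 = rng.standard_normal(n), rng.standard_normal(n)

# One pair of BCR-style updates, both computed from the previous iterate:
next_a1 = soft(Psi.T @ (y - Phi @ a2), lam)
next_a2 = soft(Phi.T @ (y - Psi @ a1), lam)

# The merged step on the stacked vector alpha = [a1; a2]:
alpha = np.concatenate([a1, a2])
merged = soft(D.T @ (y - D @ alpha) + alpha, lam)

assert np.allclose(merged, np.concatenate([next_a1, next_a2]))
```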
Slide 23: Iterative-Shrinkage Algorithms?
We will present THE PRINCIPLES of three leading methods:
- Bound-Optimization and EM [Figueiredo & Nowak '03],
- The Separable Surrogate Functions (SSF) algorithm [Daubechies, Defrise & De Mol '04],
- The Parallel-Coordinate-Descent (PCD) algorithm [Elad '05], [Matalon et al. '06].
There are other such methods: IRLS-based, StOMP, FISTA, ADMM and augmented-Lagrangian methods, split Bregman, ...
Common to all is a set of operations in every iteration that includes: (i) multiplication by D, (ii) multiplication by Dᴴ, and (iii) a scalar shrinkage on the solution, S_{ρ,λ}(α). Some of these algorithms are a direct generalization of the unitary case — their first iteration is the unitary solver we have seen.
Slide 24: 1. The Proximal-Point Method
Aim: minimize f(α) — suppose it is found to be too hard. Define a surrogate function g(α, α₀) = f(α) + dist(α, α₀), using a general (uni-modal, non-negative) distance function. Then the iteration

    αᵏ⁺¹ = argmin_α g(α, αᵏ),    starting from some α⁰,

necessarily converges to a local minimum of f(α) [Rockafellar '76]. Comments: (i) Is the minimization of g(α, α₀) easier? It had better be! (ii) It looks like it will slow down convergence. Really?
Slide 25: The Proposed Surrogate Functions
Our original function is f(α) = ½‖y − Dα‖₂² + λ·ρ(α). The distance to use, proposed by [Daubechies, Defrise & De Mol '04], is

    dist(α, α₀) = (c/2)‖α − α₀‖₂² − ½‖Dα − Dα₀‖₂²,

which requires c > ‖DᴴD‖₂ (the largest eigenvalue) to be non-negative. The beauty of this choice: the coupling term αᴴDᴴDα vanishes, so g(α, α₀) is a separable sum of m 1D problems. Thus we have a closed-form solution by THE SAME SHRINKAGE!
Slide 26: Detailed Derivation
Expanding g(α, α₀) = ½‖y − Dα‖₂² + λ·ρ(α) + (c/2)‖α − α₀‖₂² − ½‖Dα − Dα₀‖₂², the quadratic coupling terms αᴴDᴴDα cancel, and up to constants (in α) we are left with

    g(α, α₀) = (c/2)‖α − β₀‖₂² + λ·ρ(α) + const,    where β₀ = α₀ + (1/c)·Dᴴ(y − Dα₀).
Slide 27: Detailed Derivation (cont.)
Minimizing (c/2)‖α − β₀‖₂² + λ·ρ(α) is a classic format for which we know that the solution is plain shrinkage: α = S_{ρ,λ/c}(β₀). Observe that this formula is nothing but a gradient-descent step over the quadratic term, followed by a projection onto the sparsity penalty.
Slide 28: The Proposed Surrogate Functions (cont.)
To summarize: the minimization of g(α, αᵏ) is done in closed form by shrinkage, applied to the vector βᵏ = αᵏ + (1/c)·Dᴴ(y − Dαᵏ), and this generates the solution αᵏ⁺¹ of the next iteration.
Slide 29: The Resulting SSF Algorithm
While the unitary-case solution is given by α̂ = S_{ρ,λ}(Dᴴy), the general case, by SSF, requires iterating

    αᵏ⁺¹ = S_{ρ,λ/c}( αᵏ + (1/c)·Dᴴ(y − Dαᵏ) ).
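In code, each SSF iteration costs one multiplication by D, one by Dᴴ, and a scalar shrinkage. A minimal sketch with ρ = L1, assuming c is chosen above the largest eigenvalue of DᴴD:

```python
import numpy as np

soft = lambda b, t: np.sign(b) * np.maximum(np.abs(b) - t, 0.0)

def ssf(D, y, lam, c, n_iters=100):
    """SSF iteration: alpha <- S_{lam/c}(alpha + D^H (y - D alpha) / c).
    Requires c > ||D||_2^2 so the surrogate distance is non-negative.
    With alpha_0 = 0, the first iterate is the (scaled) unitary solver."""
    alpha = np.zeros(D.shape[1])
    for _ in range(n_iters):
        alpha = soft(alpha + D.conj().T @ (y - D @ alpha) / c, lam / c)
    return alpha
```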
Slide 30: A Word about Convergence Speed
We are minimizing the surrogate functional g(α, α₀) = f(α) + (c/2)‖α − α₀‖₂² − ½‖Dα − Dα₀‖₂². The effect of the extra distance term is to slow down the convergence, because α is not permitted to be "far" from α₀. However, the added distance is not isotropic — it penalizes the optimization in some directions while being almost transparent in others. For example, assume that c = 1 and that α − α₀ is sparse. Then, due to the RIP, ‖α − α₀‖₂² ≈ ‖Dα − Dα₀‖₂², and the two terms of the added distance nearly cancel. Thus, for sparse updates, there is a "tunneling" effect.
Slide 31: 2. The Bound-Optimization Technique
Aim: minimize f(α) — suppose it is found to be too hard. Define a function Q(α, α₀) that satisfies the following conditions:
- Q(α₀, α₀) = f(α₀) — the surrogate touches f at α₀;
- Q(α, α₀) ≥ f(α) for all α — the surrogate upper-bounds f everywhere.
Then the iteration αᵏ⁺¹ = argmin_α Q(α, αᵏ) necessarily converges to a local minimum of f(α) [Hunter & Lange (review) '04]. Regarding this method:
- The above is closely related to the EM algorithm [Neal & Hinton '98].
- Figueiredo & Nowak's method ('03) uses the BO idea to minimize f(α), with the VERY SAME surrogate functions we saw before.
Slide 32: 3. Start with Coordinate Descent
We aim to minimize f(α) = ½‖y − Dα‖₂² + λ·ρ(α). First, consider the Coordinate Descent (CD) algorithm: update one entry αⱼ at a time, keeping all the others fixed. This is a 1D minimization problem,

    min over αⱼ of ½‖eⱼ − dⱼαⱼ‖₂² + λ·ρ(αⱼ),    where eⱼ = y − Σ_{i≠j} dᵢαᵢ,

and it has a closed-form solution, using a simple SHRINKAGE as before, applied to the scalar dⱼᴴeⱼ/‖dⱼ‖₂². A sketch of one CD sweep follows.
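Here is a minimal sketch of one full CD sweep with ρ = L1 (real-valued data assumed). Updating the residual in place keeps the cost of each coordinate update at one inner product with its atom:

```python
import numpy as np

soft = lambda b, t: np.sign(b) * np.maximum(np.abs(b) - t, 0.0)

def cd_sweep(D, y, lam, alpha):
    """One Coordinate Descent pass: each 1D problem is solved exactly by
    shrinking the scalar d_j^T e_j / ||d_j||^2, where e_j is the residual
    with atom j's own contribution put back."""
    residual = y - D @ alpha
    for j in range(D.shape[1]):
        d = D[:, j]
        norm2 = d @ d
        beta = (d @ residual) / norm2 + alpha[j]
        new_aj = soft(beta, lam / norm2)
        residual += d * (alpha[j] - new_aj)   # keep the residual consistent
        alpha[j] = new_aj
    return alpha
```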
Slide 33: Detailed Derivation
Expanding the 1D problem, ½‖eⱼ − dⱼαⱼ‖₂² + λ·ρ(αⱼ) = (‖dⱼ‖₂²/2)·(αⱼ − dⱼᴴeⱼ/‖dⱼ‖₂²)² + λ·ρ(αⱼ) + const — this is a classic shrinkage format, solved by αⱼ = S_{ρ, λ/‖dⱼ‖₂²}(dⱼᴴeⱼ/‖dⱼ‖₂²).
Slide 34: Parallel Coordinate Descent (PCD)
From the current solution αᵏ in the m-dimensional space, the previous CD algorithm yields m descent directions, one per coordinate. We take the SUM of these m descent directions as the update step. Since this sum is no longer guaranteed to descend with a unit step, a line search is mandatory. This leads to

    αᵏ⁺¹ = αᵏ + μ·( S_{ρ,λQ}( Q·Dᴴ(y − Dαᵏ) + αᵏ ) − αᵏ ),

where Q = diag(DᴴD)⁻¹, the shrinkage thresholds are λ·qⱼ per entry, and μ is found by line search.
Slide 35: The PCD Algorithm [Elad '05], [Matalon, Elad & Zibulevsky '06]

    αᵏ⁺¹ = αᵏ + μ·( S_{ρ,λQ}( Q·Dᴴ(y − Dαᵏ) + αᵏ ) − αᵏ ),

where Q = diag(DᴴD)⁻¹ and μ represents a line search (LS). Note: Q can be computed quite easily off-line, and its storage is just like storing the vector αᵏ.
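A minimal PCD sketch with ρ = L1; a crude grid over candidate step sizes stands in for a proper line search (a careful implementation evaluates f along the line from the cached product D·v, at no extra multiplications by D):

```python
import numpy as np

soft = lambda b, t: np.sign(b) * np.maximum(np.abs(b) - t, 0.0)

def pcd(D, y, lam, n_iters=100):
    """Parallel Coordinate Descent: take all m scalar CD updates at once,
    then line-search along the joint direction. Q = diag(D^T D)^{-1}."""
    f = lambda a: 0.5 * np.sum((y - D @ a) ** 2) + lam * np.abs(a).sum()
    q = 1.0 / np.sum(D * D, axis=0)            # diagonal of Q, computed off-line
    alpha = np.zeros(D.shape[1])
    for _ in range(n_iters):
        target = soft(q * (D.T @ (y - D @ alpha)) + alpha, lam * q)
        v = target - alpha                     # sum of the m CD directions
        mu = min(np.linspace(0.1, 2.0, 20), key=lambda m: f(alpha + m * v))
        alpha = alpha + mu * v
    return alpha
```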
Slide 36: Algorithms' Speed-Up
Given the update direction v (for example, in SSF, v = S_{ρ,λ/c}(αᵏ + (1/c)Dᴴ(y − Dαᵏ)) − αᵏ), there are three options:
- Option 1 — use v as is: αᵏ⁺¹ = αᵏ + v.
- Option 2 — use a line search along v: αᵏ⁺¹ = αᵏ + μv.
- Option 3 — use Sequential Subspace OPtimization (SESOP): optimize over the subspace spanned by v and the last few update directions.
Surprising as it may sound, these very effective acceleration methods can be implemented with no additional "cost" (i.e., multiplications by D or Dᵀ) [Zibulevsky & Narkis '05], [Elad, Matalon & Zibulevsky '07]. A sketch of Option 2 follows.
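For instance, Option 2 wraps the plain SSF step in a line search. A sketch (ρ = L1 again; the naive grid search below stands in for an exact LS — the point is only to show where μ enters):

```python
import numpy as np

soft = lambda b, t: np.sign(b) * np.maximum(np.abs(b) - t, 0.0)

def ssf_ls(D, y, lam, c, n_iters=100):
    """SSF with a line search (Option 2): the plain SSF update defines a
    direction v, and mu is chosen to minimize f along it."""
    f = lambda a: 0.5 * np.sum((y - D @ a) ** 2) + lam * np.abs(a).sum()
    alpha = np.zeros(D.shape[1])
    for _ in range(n_iters):
        v = soft(alpha + D.T @ (y - D @ alpha) / c, lam / c) - alpha
        mu = min(np.linspace(0.1, 3.0, 30), key=lambda m: f(alpha + m * v))
        alpha = alpha + mu * v
    return alpha
```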
Slide 37: Brief Summary #3
For an effective minimization of the function f(α) = ½‖y − Dα‖₂² + λ·ρ(α), we saw several iterative-shrinkage algorithms, built using:
1. The Proximal-Point Method
2. Bound Optimization
3. Parallel Coordinate Descent
4. Iterative Reweighted LS
5. Fixed-Point Iteration
6. Greedy Algorithms
How are they performing?
Slide 38: Agenda — Part 4: Some Results (image deblurring results)
Slide 39: A Deblurring Experiment
The original image is blurred by a 15×15 kernel and contaminated by white Gaussian noise v with σ² = 2, producing the measured image.
Slide 40: Penalty Function — More Details

    f(α) = ½‖y − HDα‖₂² + λ·ρ_S(α)

- y: a given blurred and noisy image of size 256×256.
- D: a 2D undecimated Haar wavelet transform, 3 resolution layers, 7:1 redundancy.
- ρ_S: an approximation of L1 with slight (S = 0.01) smoothing.
- H: the blur operator.
Note: this experiment is similar (but not equivalent) to one of the tests done in [Figueiredo & Nowak '05], which leads to state-of-the-art results.
Slide 41: The Dictionary — Undecimated Haar
Per level, the analysis applies a horizontal low-pass filter [+0.5, +0.5] or high-pass filter [+0.5, −0.5], followed by the corresponding vertical filters, with no decimation. Starting from the image, level 1 produces the bands LL1, LH1, HL1 and HH1; the LL1 band is filtered again to produce LL2, LH2, HL2 and HH2; and once more for LL3, LH3, HL3 and HH3. This cascade is actually implementing the multiplication by the matrix Dᵀ. A sketch of one analysis level follows.
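An illustrative sketch of one analysis level (scipy's correlate1d stands in for the filtering, and the cyclic boundary handling mode='wrap' is an assumption, not the talk's stated choice):

```python
import numpy as np
from scipy.ndimage import correlate1d

def undecimated_haar_level(image):
    """One level of the undecimated 2D Haar analysis: low-pass [0.5, 0.5]
    and high-pass [0.5, -0.5] along each axis, no decimation, producing
    four full-size bands. Cascading on LL implements D^T."""
    lp, hp = np.array([0.5, 0.5]), np.array([0.5, -0.5])
    filt = lambda x, h, axis: correlate1d(x, h, axis=axis, mode='wrap')
    low_h, high_h = filt(image, lp, 1), filt(image, hp, 1)   # horizontal pass
    return {'LL': filt(low_h, lp, 0), 'LH': filt(low_h, hp, 0),
            'HL': filt(high_h, lp, 0), 'HH': filt(high_h, hp, 0)}
```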
Slide 42: The Dictionary — Example Atoms (figure only: example atoms of the undecimated Haar dictionary)
Slide 43: The Function (figure only: a plot of the smoothed penalty ρ_S)
Slide 44: The Function — A Closer Look (figure only)
Slide 45: The Function — The Shrinkage
For this smoothed measure there is an analytic expression for the shrinkage S_{ρ,λ} (shown as a plot).
Slide 46: The Results — The Function Value
[Plot: f(α) − f_min (log scale) vs. iterations/computations, for SSF, SSF-LS and SSF-SESOP-5.]
Slide 47: The Results — The Function Value (cont.)
[Plot: f(α) − f_min vs. iterations/computations, now adding PCD-LS and PCD-SESOP-5.]
Comment: both SSF and PCD (and their accelerated versions) are provably converging to the minima of f(α).
Slide 48: The Results — ISNR
[Plot: ISNR [dB] vs. iterations/computations for SSF, SSF-LS and SSF-SESOP-5, reaching 6.41 dB.]
Slide 49: The Results — ISNR (cont.)
[Plot: ISNR [dB] vs. iterations/computations, adding PCD-LS and PCD-SESOP-5, reaching 7.03 dB.]
Slide 50: Visual Results
PCD-SESOP-5 restorations (original, measured, and restored images shown per iteration), with the ISNR climbing from −16.77 dB at iteration 0 to 0.07 dB (iteration 1), 2.47 dB (2), 4.18 dB (3), 4.97 dB (4), 5.59 dB (5), 6.22 dB (6), 6.65 dB (7), 6.68 dB (8), 6.94 dB (12), and 7.03 dB at iteration 19.
Slide 51: Agenda — Part 5: Conclusions
Slide 52: Conclusions — The Bottom Line
If your work leads you to the need to minimize f(α) = ½‖y − Dα‖₂² + λ·ρ(α), then:
- We recommend you use an iterative-shrinkage algorithm.
- SSF and PCD are preferred: both provably converge to the (local) minima of f(α), and their performance is very good, reaching a reasonable result in few iterations.
- Use the SESOP acceleration — it is very effective, at hardly any cost.
- There is room for more work on various aspects of these algorithms — see the accompanying paper.