
1 Eran Treister and Irad Yavneh Computer Science, Technion (with thanks to Michael Elad)

2

3 Example: Image Denoising. Noisy signal y = signal f + additive noise v, i.e., y = f + v; denoising aims to recover f from y.

4 Example: Image Denoising. Many denoising algorithms minimize a functional of the form f̂ = argmin_f ½||f − y||² + μ·G(f): the first term measures the relation to the measurement y = f + v, and G(f) is a regularization term encoding the prior.

5 Sparse Representation Modeling: f = A x, where A is a dictionary (matrix) and x is a sparse representation. The signal f is represented by only a few columns of A. The matrix A is redundant (# columns > # rows).

6 Sparse Representation Modeling: f = A_S x_S, where x_S is the support sub-vector and A_S is the support sub-matrix. The support is the set of columns that comprise the signal: S = supp{x} = {i : x_i ≠ 0}.
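
To make the notation concrete, here is a minimal NumPy sketch of a redundant dictionary, a sparse representation, and its support (the sizes and random seed are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 64, 256, 5              # rows, columns (redundant: m > n), sparsity

A = rng.standard_normal((n, m))   # dictionary (matrix)
x = np.zeros(m)                   # sparse representation
S = rng.choice(m, size=k, replace=False)
x[S] = rng.standard_normal(k)

f = A @ x                         # signal built from only a few columns of A

support = np.flatnonzero(x)       # S = supp{x} = {i : x_i != 0}
assert np.allclose(f, A[:, support] @ x[support])   # f = A_S x_S
```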

7 Denoising by sparse representation: y = A x + v (additive noise). Given the noisy signal y, reconstruct the clean signal f = A x.

8 Applications: de-noising, de-blurring, in-painting, de-mosaicing, computed tomography, image scale-up & super-resolution, and more.

9 Formulation 1. The straightforward way to formulate sparse representation is by constrained minimization: min_x ||x||_0 subject to ||A x − y||_2 ≤ ε. The problem is not convex and may have many local minima; the solution is approximated by "greedy algorithms".
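
The slide does not name a particular greedy algorithm; orthogonal matching pursuit (OMP) is one common choice, so the following is only an illustrative sketch under that assumption:

```python
import numpy as np

def omp(A, y, k):
    """Greedy sketch: grow the support one column at a time."""
    n, m = A.shape
    S, r = [], y.copy()
    for _ in range(k):
        i = int(np.argmax(np.abs(A.T @ r)))               # most correlated column
        if i not in S:
            S.append(i)
        xs, *_ = np.linalg.lstsq(A[:, S], y, rcond=None)  # refit on the support
        r = y - A[:, S] @ xs                              # update the residual
    x = np.zeros(m)
    x[S] = xs
    return x
```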

10 Formulation 2: Basis Pursuit. A relaxation of the previous problem: min_x ||x||_1 subject to ||A x − y||_2 ≤ ε. ||x||_1 is the ℓ1 norm; the minimizer x̂ is typically sparse. The problem is convex and has a convex set of "equivalent" solutions.

11 Alternative Formulation 2: ℓ1-penalized least squares, F(x) = ½||A x − y||_2² + μ||x||_1. F(x) is convex, and a bigger μ yields a sparser minimizer. The gradient is discontinuous wherever x_i = 0, so general-purpose optimization tools struggle.
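
As a concrete reference, a short sketch of this objective and of the scalar soft-shrinkage operator that iterated shrinkage methods are built on (function names are mine):

```python
import numpy as np

def F(x, A, y, mu):
    """l1-penalized least squares: F(x) = 0.5*||Ax - y||^2 + mu*||x||_1."""
    r = A @ x - y
    return 0.5 * (r @ r) + mu * np.abs(x).sum()

def shrink(z, t):
    """Soft shrinkage: the exact minimizer of 0.5*(u - z)^2 + t*|u| over u."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)
```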

12 Iterated Shrinkage Methods: Bound Optimization and EM [Figueiredo & Nowak '03]; Surrogate-Separable-Function (SSF) [Daubechies et al. '04]; Parallel-Coordinate-Descent (PCD) [Elad '05], [Matalon et al. '06]; IRLS-based algorithm [Adeyemi & Davies '06]; Gradient Projection Sparse Reconstruction (GPSR) [Figueiredo et al. '07]; Sparse Reconstruction by Separable Approximation (SpaRSA) [Wright et al. '09].

13 Iterated Shrinkage. Coordinate Descent (CD): updates each scalar variable in turn so as to minimize the objective. Parallel Coordinate Descent (PCD): applies the CD update simultaneously to all variables; based on the projection of the residual, Aᵀ(Ax − y). PCD + (non-linear) Conjugate Gradient (CG-PCD) [Zibulevsky & Elad '10]: uses two consecutive PCD steps to calculate the next one.
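
A rough sketch of one CD sweep and of the PCD direction, assuming unit-norm columns of A; this follows the general description above, not the authors' exact implementation:

```python
import numpy as np

def shrink(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def cd_pass(A, y, x, mu):
    """One Coordinate Descent sweep (assumes unit-norm columns)."""
    r = y - A @ x
    for i in range(A.shape[1]):
        a = A[:, i]
        xi_new = shrink(x[i] + a @ r, mu)   # exact 1-D minimizer of F
        r += a * (x[i] - xi_new)            # keep the residual consistent
        x[i] = xi_new
    return x

def pcd_direction(A, y, x, mu):
    """Parallel Coordinate Descent: the CD update applied to all variables at
    once; the result is used as a search direction (a line search follows)."""
    r = y - A @ x
    return shrink(x + A.T @ r, mu) - x
```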

14

15 The main idea: use existing iterated shrinkage methods (as "relaxations"), and improve the current approximation by using a reduced (lower-level) dictionary.

16 The main idea: reducing the dimension of A. The solution is sparse – most columns will not end up in the support! At each stage, many columns are highly unlikely to contribute to the minimizer; such columns can be temporarily dropped, resulting in a smaller problem.

17 Reducing the dimension of A. C: the lower-level subset of columns.

18 Reducing the problem. Fine-level problem: min_x F(x) = ½||A x − y||_2² + μ||x||_1. Assume we have a prolongation P that satisfies x = P x_c. Substituting for x yields the lower-level problem min_{x_c} ½||A P x_c − y||_2² + μ||P x_c||_1; when P simply zero-pads the coarse variables into the columns of C, this is the same ℓ1 problem over the smaller dictionary A_C = A P.
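
If the prolongation is taken as simple zero-padding of the coarse variables into the chosen columns C (my reading of the construction, stated here as an assumption), the reduced problem is just the same objective over a sub-dictionary:

```python
import numpy as np

def reduced_problem(A, y, x, C):
    """Restrict the l1 least-squares problem to the column subset C.
    The prolongation P zero-pads a coarse vector x_c back into R^m."""
    A_c = A[:, C]              # lower-level dictionary A_C = A P
    x_c = x[C]                 # coarse initial guess (restriction of x)

    def prolong(x_c):
        x_fine = np.zeros(A.shape[1])
        x_fine[C] = x_c
        return x_fine

    return A_c, x_c, prolong
```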

19

20 The choice of C: likelihood to enter the support. The residual is defined by r = y − A x. A column is likely to enter the support if it has a high inner product with r (a greedy approach). "Likely" columns: the columns that are currently not in the support and have the largest likelihood.

21 Lower-level dictionary: choosing m_c = m/2 columns.
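
A sketch of this greedy selection: keep the current support and fill the remaining slots with the columns most correlated with the residual (tie-breaking and the exact handling of |C| are my own assumptions):

```python
import numpy as np

def choose_C(A, y, x, m_c=None):
    """Pick the lower-level column subset C (roughly m/2 columns by default)."""
    m = A.shape[1]
    if m_c is None:
        m_c = m // 2
    r = y - A @ x                        # residual
    likelihood = np.abs(A.T @ r)         # greedy score per column
    S = np.flatnonzero(x)                # the current support always enters C
    likelihood[S] = np.inf
    C = np.argsort(-likelihood)[:max(m_c, S.size)]
    return np.sort(C)
```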

22 The multilevel cycle: repeated iteratively until convergence.
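
Putting the pieces together, a schematic cycle in the spirit of the slides, reusing the hypothetical helpers sketched above (choose_C, reduced_problem) together with some iterated-shrinkage relaxation; this is a sketch, not the authors' algorithm verbatim:

```python
def ml_cycle(A, y, x, mu, relax, nu=1, min_cols=32):
    """One multilevel cycle: recurse on a reduced dictionary, then relax.
    Assumes choose_C, reduced_problem and relax(A, y, x, mu) as sketched above."""
    m = A.shape[1]
    if m > min_cols:
        C = choose_C(A, y, x)                         # columns likely in the support
        A_c, x_c, prolong = reduced_problem(A, y, x, C)
        x_c = ml_cycle(A_c, y, x_c, mu, relax, nu, min_cols)
        x = prolong(x_c)                              # bring the coarse result back
    for _ in range(nu):                               # fine-level relaxation(s)
        x = relax(A, y, x, mu)                        # e.g. a CD or PCD sweep
    return x
```

In use, such a cycle would be repeated until the stopping criterion is met.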

23 Theoretical Properties: inter-level correspondence; direct solution (two-level).

24 Theoretical Properties: no stagnation (two-level); complementary roles.

25 Theoretical Properties: monotonicity; convergence, assuming that Relax(x) reduces F(x) proportionally to the square of its gradient.

26 Theoretical Properties: C-selection guarantee. Assume the columns of A are normalized; x is the current approximation, x̂ is the solution, and C is chosen using x with |C| > |supp{x}|.

27 Initialization. When starting with a zero initial guess, relaxations tend to initially generate supports that are too rich, which might hamper the V-cycle's efficiency. We therefore adopt a "Full Multigrid" (FMG) algorithm.

28 Numerical Results: synthetic denoising experiment. Experiments with various dictionaries A ∈ R^{n×m}, n = 1024, m = 4096. The initial support S is randomly chosen, of size 0.1n; x_S is a random vector ~ N(0, I); f = A_S x_S. Noise is added: v ~ N(0, σ²I) with σ = 0.02, and y = f + v.
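
The synthetic data described here could be generated roughly as follows (the seed and the column normalization are my assumptions; everything else mirrors the listed parameters):

```python
import numpy as np

rng = np.random.default_rng(42)
n, m = 1024, 4096
sigma = 0.02

A = rng.standard_normal((n, m))                       # Experiment 1 dictionary
A /= np.linalg.norm(A, axis=0)                        # normalized columns (assumed)

S = rng.choice(m, size=int(0.1 * n), replace=False)   # |S| = 0.1 n
x = np.zeros(m)
x[S] = rng.standard_normal(S.size)                    # x_S ~ N(0, I)

f = A @ x                                             # clean signal f = A_S x_S
v = sigma * rng.standard_normal(n)                    # v ~ N(0, sigma^2 I)
y = f + v                                             # noisy measurement
```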

29 Numerical Results. Stopping criterion as in [Loris 2009]. One-level methods: CD+ (CD with linesearch); PCD; CG (non-linear CG with PCD [Zibulevsky & Elad 2010]); SpaRSA [Wright et al. '09]. ML denotes the multilevel method: ML-CD is the multilevel framework with CD+ as the shrinkage iteration, and ML-CG is the multilevel framework with CG as the shrinkage iteration.

30 Experiment 1: Random Normal. A is a random dense n×m matrix with A_{i,j} ~ N(0,1).

31 Experiment 2: Random ±1.

32 Experiment 3: ill-conditioned. A is a random dense n×m matrix with A_{i,j} ~ N(0,1); its singular values are manipulated so that A becomes ill-conditioned [Loris 2009, Zibulevsky & Elad 2010].

33 Experiment 3: ill-conditioned (continued).

34 Experiment 4: Similar columns. A = [B|C], where B is random and C is a perturbed rank-1 matrix.

35 Conclusions & Future Work. A new multilevel approach was developed; it exploits the sparsity of the solution and accelerates existing iterated shrinkage methods. Future work: improvements (faster lowest-level solutions, more suitable iterated shrinkage schemes); handling non-sparse solutions (different priors); a multilevel method for fast-operator dictionaries.

36 Next step: Covariance Selection. Given a few random vectors x_1, …, x_K drawn from a Gaussian distribution, we wish to estimate the inverse of the covariance, Σ⁻¹, assuming it is sparse. From probability theory, (Σ⁻¹)_{ij} = 0 exactly when x_i and x_j are conditionally independent given the remaining variables.

37 Problem formulation. Maximum likelihood (ML) estimation: we maximize the Gaussian likelihood of the samples. Likeliest mean: μ̂ = (1/K) Σ_k x_k. Likeliest covariance: Σ̂ = (1/K) Σ_k (x_k − μ̂)(x_k − μ̂)ᵀ.
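
A minimal sketch of these two estimates (the data layout, one sample per column, is my convention):

```python
import numpy as np

def ml_estimates(X):
    """X: n x K matrix holding K sample vectors; return the ML mean and covariance."""
    K = X.shape[1]
    mu = X.mean(axis=1, keepdims=True)        # likeliest mean
    Sigma_hat = (X - mu) @ (X - mu).T / K     # likeliest covariance
    return mu.ravel(), Sigma_hat
```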

38 Problem formulation. Setting the gradient of J to zero yields Θ = Σ̂⁻¹, i.e., the inverse of the sample covariance. However, K << n, so the samples span a low-rank subspace and Σ̂ is singular: the problem is ill-posed. Introducing ℓ1 regularization, min_Θ −log det(Θ) + tr(Σ̂ Θ) + λ||Θ||_1 with λ > 0: the minimizer is sparse and positive definite.
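
For reference, a sketch of this regularized objective (the form commonly known as the graphical-lasso objective; the authors' exact formulation may differ in details such as scaling):

```python
import numpy as np

def neg_log_likelihood(Theta, Sigma_hat):
    """J(Theta) = -log det(Theta) + tr(Sigma_hat @ Theta), up to constants."""
    sign, logdet = np.linalg.slogdet(Theta)
    assert sign > 0, "Theta must be positive definite"
    return -logdet + np.trace(Sigma_hat @ Theta)

def objective(Theta, Sigma_hat, lam):
    """l1-regularized maximum likelihood: J(Theta) + lambda * ||Theta||_1."""
    return neg_log_likelihood(Theta, Sigma_hat) + lam * np.abs(Theta).sum()
```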

39 Our direction. Via Newton steps we can obtain a series of ℓ1-regularized least-squares problems; only a few steps are needed. Current "state of the art": CD on the Newton steps. Formulating a step costs O(n³). If supp(Θ) is restricted to O(m), each Newton problem can be solved in O(mn · #it). First few steps: O(n³ · #it); last few steps: O(n² · #it). Our direction: ML-CD with linesearch, making all steps O(n² · #it) with fewer iterations; hopefully, quasi-Newton steps with formulation in O(n²).

