SUPER RESOLUTION USING NEURAL NETS – Hila Levi & Eran Amar, Weizmann Institute, 2016
Lecture Outline ■ Introduction ■ Previous work ■ NN #1 – Dong et al., pure learning ■ NN #2 – Wang et al., domain-knowledge integration ■ NN #3 – Kim et al., pure learning ■ Conclusions
Introduction – Super resolution ■ Goal: obtaining a high-resolution (HR) image from a low-resolution (LR) input image ■ An ill-posed problem ■ Motivation – overcoming the inherent resolution limits of low-cost imaging sensors and compressed images, and making better use of high-resolution displays
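As background (a standard formulation, not shown explicitly on the slide), the LR image is commonly modeled as a blurred, downsampled, noisy version of the HR image:

```latex
y = (x * k)\downarrow_s + n
```

where y is the observed LR image, x the unknown HR image, k a blur kernel, ↓s downsampling by factor s, and n noise; SR tries to invert this many-to-one mapping, which is why the problem is ill posed.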
Introduction – Neural Networks ■ An old machine learning technique (first work: 1943) ■ Widely used since 2012 (AlexNet) ■ Mostly on high-level vision tasks (classification, detection, segmentation)
Previous work – Single Image Super-resolution ■ Interpolation based – Bilinear, Bicubic, Splines ■ Reconstruction based – exploiting natural image priors ■ Example based – Using an external database (millions of HR + LR patch pairs) – Using redundancy in the image itself, at different locations and across different scales – Using a predefined dictionary – Sparse Coding ■ NN – Image denoising – Image super-resolution
Previous work - Interpolation based ■ Results: overly smoothed edges and ringing artifacts
Previous work - Reconstruction based ■ Limited to small magnification factors ■ L1 minimization + regularization based on a bilateral prior ■ Gradient profile prior
Previous work - Example based - External DB ■ [*] W. T. Freeman et al., Example-based super-resolution, IEEE Computer Graphics and Applications, 2002
Previous work - Example based - using the image ■ Use patch recurrence within and across scales of a single image ■ [*] D. Glasner, S. Bagon, and M. Irani, Super-resolution from a single image, ICCV 2009
Previous work - Example based - using the image ■ [*] J. Huang et al., Single Image Super-resolution from Transformed Self-Exemplars, CVPR 2015
Previous work - Example based - Sparse Coding ■ Suppose we have jointly pre-trained HR and LR dictionaries ■ Typical numbers: 512 atoms of 9x9 patches ■ [*] J. Yang et al., Image SR via sparse representation, IEEE TIP 2010
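The underlying assumption (following Yang et al.): an LR patch y and its HR counterpart x share the same sparse code α over the jointly trained dictionary pair (D_L, D_H), so recovering α from the LR patch suffices to reconstruct the HR patch:

```latex
\alpha^\ast = \arg\min_{\alpha} \tfrac{1}{2}\|D_L \alpha - y\|_2^2 + \lambda \|\alpha\|_1,
\qquad \hat{x} = D_H \, \alpha^\ast
```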
Previous work - Example based - SC - Sparse code extraction ■ We want to solve min_α ||α||_0 s.t. ||D_L α − y||_2 ≤ ε – an NP-hard problem ■ Instead we solve the relaxation min_α ½||D_L α − y||_2² + λ||α||_1 – still sparsity-encouraging – a well-known problem with an iterative solution: ISTA (Iterative Shrinkage and Thresholding Algorithm)
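A minimal numpy sketch of ISTA for this L1-relaxed problem (illustrative only; the dictionary D, signal y, and parameter values are placeholders):

```python
import numpy as np

def soft_threshold(v, t):
    """Soft-thresholding: the proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(D, y, lam=0.1, n_iters=100):
    """Minimize 0.5 * ||D a - y||_2^2 + lam * ||a||_1 by iterative shrinkage."""
    L = np.linalg.norm(D, 2) ** 2       # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iters):
        grad = D.T @ (D @ a - y)        # gradient of the smooth data term
        a = soft_threshold(a - grad / L, lam / L)
    return a
```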
Previous work – Single Image Super-resolution ■ Interpolation based ■ Reconstruction based ■ Example based ■ NN – Image denoising: MLP (2012), Image Denoising and Inpainting (2012) – Image super-resolution: Dong et al. (2014), Wang et al. (2015), Kim et al. (2015)
Metrics ■ MSE (Mean Squared Error) ■ PSNR (Peak Signal-to-Noise Ratio) – Common in image restoration tasks – Only partially related to perceptual quality ■ SSIM (Structural SIMilarity index)
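For concreteness, PSNR is computed directly from MSE; a small sketch (assuming 8-bit images with peak value 255):

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB: 10 * log10(peak^2 / MSE)."""
    diff = reference.astype(np.float64) - estimate.astype(np.float64)
    mse = np.mean(diff ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```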
Databases ■ Training – 91 images – ImageNet ■ Testing – Set5 – 5 images (factors 2, 3, 4) – Set14 – 14 images (factor 3) – Urban100 – BSD100 – ImageNet
NN #1 – (SR-CNN) ■ Based on two articles: – C. Dong et al., Learning a deep convolutional network for image SR, ECCV 2014 – C. Dong et al., Image SR using deep convolutional networks, TPAMI 2015
#1 - Contribution of the work
#1 - Super Resolution Pipeline ■ Patch extraction and representation – extract overlapping patches and represent each patch by a set of pre-trained bases (PCA, DCT, Haar) ■ Non-linear mapping – each LR vector is conceptually mapped to an HR vector ■ Reconstruction – aggregate the HR vectors to generate the HR image
#1 - Relationship to convolutional neural network – mapping the SR creation stages to SR-CNN layers (sketched in code below): ■ Patch extraction and representation → a convolutional layer with n1 filters applied to the input image ■ Non-linear mapping → one (or more) convolutional layers with a nonlinear activation ■ Reconstruction → a linear convolution on the n2 feature maps
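A minimal PyTorch sketch of this three-layer correspondence (the 9-5-5 filter sizes and 64/32 widths follow the variant quoted later in the talk; treat this as an illustration, not the authors' code):

```python
import torch.nn as nn

class SRCNN(nn.Module):
    """Three-layer SRCNN (9-5-5); input is the bicubic-upscaled LR luminance channel."""
    def __init__(self, n1=64, n2=32):
        super().__init__()
        self.extract = nn.Conv2d(1, n1, kernel_size=9, padding=4)   # patch extraction / representation
        self.mapping = nn.Conv2d(n1, n2, kernel_size=5, padding=2)  # non-linear mapping
        self.rebuild = nn.Conv2d(n2, 1, kernel_size=5, padding=2)   # reconstruction
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.relu(self.extract(x))
        x = self.relu(self.mapping(x))
        return self.rebuild(x)          # no activation: linear aggregation of HR patches
```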
#1 - Relationship to the sparse coding based methods – mapping the sparse coding pipeline to SRCNN: ■ Extract an LR patch and project it onto an LR dictionary of size n1 → applying n1 linear filters on the input image ■ Sparse coding solver, transforming to an HR sparse code of size n2 → non-linear mapping ■ Project onto the HR dictionary and average the HR patches → linear convolution on the n2 feature maps
#1 - Pros & Cons ■ Pros – End-to-end optimization scheme – Very flexible, standard building blocks ■ Cons – A very simple network with only 3 layers: limited expressive power – Unlike sparse coding techniques, which rely on sparsity, there is no mechanism to ensure sparsity
#1 - Training ■ Loss function - MSE ■ Stochastic gradient descent ■ Training data: – 91 images, a total of 24,800 patches (33x33) – ~396,000 images (ImageNet), over 5 million patches ■ Testing – Set5 – Set14 – BSD200 ■ Data creation: – Image blurring – Subsampling – Bicubic interpolation (see the sketch below)
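A sketch of the LR/HR pair creation described above: blur, subsample, then interpolate back up to the original size (the Gaussian kernel width is an assumption, and scipy's cubic spline zoom stands in for bicubic interpolation):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def make_lr_input(hr, scale=3, blur_sigma=1.0):
    """Create the bicubic-upscaled LR input for an HR image (2-D array,
    dimensions assumed divisible by `scale`)."""
    blurred = gaussian_filter(hr.astype(np.float64), sigma=blur_sigma)  # image blurring
    lr = blurred[::scale, ::scale]                                      # subsampling
    return zoom(lr, scale, order=3)                                     # cubic upscale to HR size
```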
#1 - Training Data ■ More training data leads to better results ■ The effect of big data is less pronounced than in high-level-vision problems – maybe because of the relatively small network
#1 - Net Parameters - Filter Size, Net width
#1 - Net Parameters - deeper structure ■ The performance was worse than before ■ No convergence after a week of training
#1 - Comparison with existing methods ■ SR-CNN 9-5-5 (~60K params) trained on ImageNet
#1 - Conclusions ■ Nice results ■ A pure learning scheme with the simplest architecture there is ■ The comparison to the sparse coding algorithm is not really needed ■ Exhibited poor convergence behavior that is no longer typical of classification CNNs – probably because no regularization or stabilization was used: – no dropout – no batch normalization – no proper initialization – no pre-processing
NN #2 – (SCN) ■ Based on: – Z. Wang et al., Deep networks for image super-resolution with sparse prior, ICCV 2015
#2 - Contribution of the work ■ Combine the domain expertise of sparse coding and the merits of deep learning to achieve better SR performance with faster training and a smaller model size ■ Use network cascading for large and arbitrary scaling factors ■ Conduct a subjective evaluation of several recent state-of-the-art methods
#2 - Reminder – Sparse coding
#2 - Network implementation of sparse coding ■ Based on: – K. Gregor and Y. LeCun, Learning fast approximations of sparse coding, ICML 2010 ■ Main idea: – ISTA can easily be implemented as a recurrent network – With added end-to-end learning (LISTA), a good approximation is obtained within a fixed (and small) number of recurrent stages
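A minimal PyTorch sketch of an unfolded LISTA encoder (learned matrices W and S plus learned thresholds replace ISTA's fixed quantities; the dimensions and unfolding depth here are placeholders):

```python
import torch
import torch.nn as nn

class LISTA(nn.Module):
    """Unfolded ISTA with learned weights (after Gregor & LeCun, 2010)."""
    def __init__(self, input_dim, code_dim, n_steps=3):
        super().__init__()
        self.W = nn.Linear(input_dim, code_dim, bias=False)       # plays the role of D^T / L
        self.S = nn.Linear(code_dim, code_dim, bias=False)        # plays the role of I - D^T D / L
        self.theta = nn.Parameter(torch.full((code_dim,), 0.1))   # learned shrinkage thresholds
        self.n_steps = n_steps

    def soft(self, z):
        return torch.sign(z) * torch.relu(torch.abs(z) - self.theta)

    def forward(self, y):
        b = self.W(y)
        z = self.soft(b)
        for _ in range(self.n_steps):   # fixed, small number of recurrent stages
            z = self.soft(b + self.S(z))
        return z
```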
#2 - Sparse coding based Network ■ Now: – Sparse coding solver → a learnable block – Patch-wise processing → convolutional layers – Combining it all together
#2 - Advantages over previous models ■ The architecture follows the SC-based SR method exactly, enabling end-to-end training ■ The network's blocks are meaningful – they can be initialized from previously learned dictionaries and further improved by training
#2 - Network cascade ■ SCN – a separate model must be trained for each scaling factor ■ CSCN – a cascade of SCNs, enabling large and arbitrary scaling factors (see the sketch below)
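A sketch of the cascading idea, assuming a trained x2 model is applied repeatedly and a bicubic resize handles any non-power-of-two remainder (sr_x2 and resize are hypothetical callables, not part of the paper's code):

```python
def cascade_sr(image, target_scale, sr_x2, resize):
    """Apply a x2 SR model repeatedly, then resize to the exact target scale.

    sr_x2:  a trained x2 SCN model (callable)       -- assumed available
    resize: bicubic resize by a given scale factor  -- assumed available
    """
    scale = 1.0
    while scale * 2 <= target_scale:    # e.g. target 4 -> two x2 passes
        image = sr_x2(image)
        scale *= 2
    if scale < target_scale:            # arbitrary factors: finish with bicubic
        image = resize(image, target_scale / scale)
    return image
```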
#2 - Training ■ Loss function - MSE ■ Stochastic gradient descent ■ Training data: – 91 images, a total of 24,800 patches (33x33), with data augmentation ■ Testing – Set5 – Set14 – BSD100 ■ Data creation: – Image blurring – Subsampling – Bicubic interpolation
#2 - Comparison with existing methods
#2 - Subjective evaluation
#2 - Conclusions
NN #3 – (VDSR) ■ Based on: – J. Kim et al., Accurate Image Super-Resolution Using Very Deep Convolutional Networks, arXiv:1511.04587, 2015
#3 - Contribution of the work ■ Highly accurate SR method based on a very deep convolutional network ■ Boosting convergence rate using residual learning and gradient clipping ■ Extension to multi-scale SR
#3 - Proposed network ■ Inspired by the VGG net (Simonyan & Zisserman) ■ 20 layers, each producing 64 feature maps with 3x3 filters
#3 - The deeper – the better ■ A deeper network yields: – A larger receptive field per output pixel (more information) – More expressive functions
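The receptive-field claim can be made concrete (standard arithmetic, not spelled out on the slide): each 3x3 layer grows the receptive field by 2 pixels per side, so for a depth of D = 20,

```latex
r = 2D + 1 = 2 \cdot 20 + 1 = 41
\quad\Longrightarrow\quad 41 \times 41 \text{ pixels of context per output pixel.}
```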
#3 - Residual Learning ■ Much faster convergence ■ Superior performance
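A minimal PyTorch sketch combining both ideas: 20 layers of 64 3x3 filters, with the network predicting only the residual, which is added back to its bicubic-upscaled input (an illustration, not the authors' implementation):

```python
import torch.nn as nn

class VDSR(nn.Module):
    """20 conv layers of 64 3x3 filters; the net predicts only the residual."""
    def __init__(self, depth=20, width=64):
        super().__init__()
        layers = [nn.Conv2d(1, width, 3, padding=1), nn.ReLU()]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(width, width, 3, padding=1), nn.ReLU()]
        layers += [nn.Conv2d(width, 1, 3, padding=1)]   # linear output layer
        self.body = nn.Sequential(*layers)

    def forward(self, x):               # x: bicubic-upscaled LR image
        return x + self.body(x)         # residual learning: add the input back
```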
#3 - Training ■ Loss function - MSE ■ Stochastic gradient descent ■ Training data: – 291 images, with data augmentation ■ Testing – Set5 – Set14 – B100 – Urban100 ■ Data creation: – Image blurring – Subsampling – Bicubic interpolation
#3 - Comparison with existing methods
#3 - Conclusions ■ Better results than the former methods ■ Elegantly organized work; the residual prediction is a nice idea ■ Can easily be applied to other image restoration domains
Just another last slide – Domain Expertise vs. End-to-End Optimization. How can neural networks be utilized for algorithmic challenges? One possible approach is to combine the network with existing well-engineered algorithms ("physically", or through better initialization). The other is a "pure" learning approach that treats the NN as a "black box": build a network with some (possibly customized) architecture and let it optimize its parameters jointly, end to end. This talk discussed these two approaches for the task of image restoration.
References ■ A. Beck and M. Teboulle, A fast iterative shrinkage-thresholding algorithm with application to wavelet-based image deblurring, ICASSP 2009 ■ D. Glasner, S. Bagon, and M. Irani, Super-resolution from a single image, ICCV 2009 ■ K. Gregor and Y. LeCun, Learning fast approximations of sparse coding, ICML 2010 ■ J. Yang et al., Image SR via sparse representation, IEEE TIP 2010 ■ C. Dong et al., Learning a deep convolutional network for image SR, ECCV 2014 ■ C. Dong et al., Image SR using deep convolutional networks, TPAMI 2015 ■ J. Kim et al., Accurate Image Super-Resolution Using Very Deep Convolutional Networks, arXiv:1511.04587, 2015 ■ Z. Wang et al., Deep networks for image super-resolution with sparse prior, ICCV 2015
Any questions?