
1 SUPER RESOLUTION USING NEURAL NETS Hila Levi & Eran Amar, Weizmann Institute, 2016

2 Lecture Outline ■ Introduction ■ Previous work ■ NN #1 - Dong et al, pure learning ■ NN #2 - Wang et al, integrating domain knowledge ■ NN #3 - Kim et al, pure learning ■ Conclusions

3 Introduction – Super resolution ■ Goal: obtain a high-resolution (HR) image from a low-resolution (LR) input image ■ An ill-posed problem ■ Motivation – overcoming the inherent resolution limitations of low-cost imaging sensors and compressed images, and allowing better utilization of high-resolution displays

4 Introduction – Neural Networks ■ An old machine learning approach (first work in 1943) ■ Widely used since 2012 (AlexNet) ■ Mostly applied to high-level vision tasks (classification, detection, segmentation)

5 Previous work – Single Image Super-resolution ■ Interpolation based – Bilinear, Bicubic, Splines ■ Reconstruction based – exploiting natural image priors ■ Example based – Using an external database (millions of HR + LR patch pairs) – Using redundancy within the image itself, at different locations and across different scales – Using a predefined dictionary – the Sparse Coding algorithm ■ NN – Image denoising – Image super-resolution

6 Previous work - Interpolation based ■ Results in overly smoothed edges and ringing artifacts
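For the interpolation-based baseline, upscaling reduces to a single bicubic resampling call; a minimal Pillow sketch (the file names and the factor-3 choice are illustrative, not from the slides):

```python
from PIL import Image

# Bicubic upscaling as the interpolation-based baseline (factor 3 as an example).
lr = Image.open("input_lr.png")                                  # hypothetical input file
hr = lr.resize((lr.width * 3, lr.height * 3), Image.BICUBIC)     # single bicubic resampling step
hr.save("output_bicubic_x3.png")
```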

7 Previous work – Single Image Super-resolution ■ Interpolation based – Bilinear, Bicubic, Splines ■ Reconstruction based – exploiting natural image priors ■ Example based – Using an external database (millions of HR + LR patch pairs) – Using redundancy within the image itself, at different locations and across different scales – Using a predefined dictionary – the Sparse Coding algorithm ■ NN – Image denoising – Image super-resolution

8 Previous work - Reconstruction based ■ Limited to small magnification factors ■ L1 minimization + regularization based on bilateral prior ■ Gradient profile prior

9 Previous work – Single Image Super-resolution ■ Interpolation based – Bilinear, Bicubic, Splines ■ Reconstruction based – exploiting natural image priors ■ Example based – Using an external database (lots of HR + LR patch pairs) – Using redundancy within the image itself, at different locations and across different scales – Using a predefined dictionary – the Sparse Coding algorithm ■ NN – Image denoising – Image super-resolution

10 Previous work - Example based - External DB ■ [*] W. T. Freeman et al, Example based super-resolution, 2002

11 Previous work – Single Image Super-resolution ■ Interpolation based – Bilinear, Bicubic, Splines ■ Reconstruction based – exploiting natural image priors ■ Example based – Using an external database (millions of HR + LR patch pairs) – Using redundancy within the image itself, at different locations and across different scales – Using a predefined dictionary – the Sparse Coding algorithm ■ NN – Image denoising – Image super-resolution

12 Previous work - Example based - using the image ■ Use patch recurrence within and across scales of a single image: ■ [*] D. Glasner, S. Bagon, and M. Irani. Super-resolution from a single image. In ICCV, 2009

13 Previous work - Example based - using the image ■ [*] J. Huang et al, Single Image Super-resolution from Transformed Self-Exemplars, 2015

14 Previous work – Single Image Super-resolution ■ Interpolation based – Bilinear, Bicubic, Splines ■ Reconstruction based – exploiting natural image priors ■ Example based – Using an external database (millions of HR + LR patch pairs) – Using redundancy within the image itself, at different locations and across different scales – Using a predefined dictionary – the Sparse Coding algorithm ■ NN – Image denoising – Image super-resolution

15 Previous work - Example based - Sparse Coding ■ Suppose we have jointly pre-trained HR + LR dictionaries ■ Typical numbers – a dictionary of 512 atoms, 9x9 patches ■ [*] J. Yang et al, Image SR via sparse representation, IEEE TIP 2010

16–19 (figure-only slides)

20 Previous work - Example based - SC - Sparse code extraction ■ We want to solve min_α ‖y − D_l·α‖² + λ‖α‖₀ – an NP-hard problem ■ Instead we solve the relaxed problem min_α ‖y − D_l·α‖² + λ‖α‖₁ – still sparsity-encouraging ■ A well-known problem with an iterative solution – ISTA (Iterative Shrinkage and Thresholding Algorithm)
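To make the ISTA step concrete, here is a minimal NumPy sketch of sparse code extraction for an LR patch, followed by reconstruction with the HR dictionary; the names D_l, D_h, lr_patch and hr_patch_shape are hypothetical, and the regularization weight and iteration count are illustrative rather than values from the papers.

```python
import numpy as np

def soft_threshold(x, t):
    """Element-wise soft-thresholding (the proximal operator of the L1 norm)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(y, D, lam=0.1, n_iter=100):
    """Solve min_a 0.5*||y - D a||^2 + lam*||a||_1 by iterative shrinkage/thresholding."""
    L = np.linalg.norm(D, 2) ** 2            # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - y)             # gradient of the smooth data term
        a = soft_threshold(a - grad / L, lam / L)
    return a

# Hypothetical usage for SR: encode an LR patch with the LR dictionary D_l,
# then reconstruct the HR patch with the jointly trained HR dictionary D_h.
# alpha = ista(lr_patch.ravel(), D_l)
# hr_patch = (D_h @ alpha).reshape(hr_patch_shape)
```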

21 Previous work – Single Image Super-resolution ■ Interpolation based ■ Reconstruction based ■ Example based ■ NN – Image denoising: MLP (2012), Image Denoising and Inpainting (2012) – Image super-resolution: Dong et al (2014), Wang et al (2015), Kim et al (2015)

22 Previous work – Single Image Super-resolution ■ Interpolation based ■ Reconstruction based ■ Example based ■ NN – Image denoising: MLP (2012), Image Denoising and Inpainting (2012) – Image super-resolution: Dong et al (2014), Wang et al (2015), Kim et al (2015)

23 Metrics ■ MSE (Mean Square Error) ■ PSNR (Peak Signal-to-Noise Ratio) – Common in image restoration tasks – Only partially correlated with perceptual quality ■ SSIM (Structural SIMilarity index)
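As an illustration of how PSNR follows directly from MSE, a small sketch assuming 8-bit images (peak value 255):

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB: 10 * log10(peak^2 / MSE)."""
    mse = np.mean((reference.astype(np.float64) - estimate.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```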

24 Databases ■ Training – 91 images for training – ImageNet ■ Testing – Set5 – 5 images (factors 2, 3, 4) – Set14 – 14 images (factor 3) – Urban100 – BSD100 – ImageNet

25 NN #1 – (SR-CNN) ■ Based on two articles: – C. Dong et al, Learning a deep convolutional network for image SR, ECCV 2014 – C. Dong et al, Image SR using deep convolutional networks, TPAMI 2015

26 #1 - Contribution of the work

27 #1 - Super Resolution Pipeline ■ Patch extraction and representation – extracts overlapping patches and represents each patch by a set of pre-trained bases (PCA, DCT, Haar) ■ Non-linear mapping – each LR vector is conceptually mapped to an HR vector ■ Reconstruction – aggregates the HR vectors to generate the HR image
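A minimal PyTorch sketch of this three-stage pipeline expressed as three convolutions, using the 9-1-5 filter sizes and n1=64, n2=32 widths reported for the basic SR-CNN; the 'same' padding is my own simplification (the original trains on valid convolutions), and the input is assumed to be the bicubic-upscaled luminance channel.

```python
import torch
import torch.nn as nn

class SRCNN(nn.Module):
    """Three-layer SR-CNN sketch (9-1-5 setting): input is the bicubic-upscaled
    LR luminance channel, output is the restored HR channel."""
    def __init__(self, n1=64, n2=32):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, n1, kernel_size=9, padding=4),   # patch extraction and representation
            nn.ReLU(inplace=True),
            nn.Conv2d(n1, n2, kernel_size=1),              # non-linear mapping
            nn.ReLU(inplace=True),
            nn.Conv2d(n2, 1, kernel_size=5, padding=2),    # reconstruction
        )

    def forward(self, x):
        return self.features(x)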

28–30 #1 - Relationship to a convolutional neural network – SR creation stages vs. SR-CNN layers ■ Patch extraction and representation ↔ a convolutional layer with n1 filters applied to the input image ■ Non-linear mapping ↔ one (or more) convolutional layers with a nonlinear activation ■ Reconstruction ↔ a linear convolution on the n2 feature maps

31 #1 - Relationship to the sparse coding based methods – sparse coding steps vs. SR-CNN layers ■ Extract an LR patch and project it onto an LR dictionary of size n1 ↔ applying n1 linear filters to the input image ■ Sparse coding solver, transforming to an HR sparse code of size n2 ↔ non-linear mapping ■ Project onto the HR dictionary and average overlapping HR patches ↔ a linear convolution on the n2 feature maps

32 #1 - Pros & Cons ■ Pros – End-to-end optimization scheme – Very flexible, standard building blocks ■ Cons – A very simple network with only 3 layers – limited expressive power – Unlike sparse coding techniques, which rely on sparsity, there is no mechanism here to ensure sparsity

33 #1 - Training ■ Loss function – MSE ■ Stochastic gradient descent ■ Training data: – 91 images, a total of 24,800 patches (33x33) – 396,000 images (ImageNet), 5 million patches ■ Testing – Set5 – Set14 – BSD200 ■ Data creation: – Image blurring – Subsampling – Bicubic interpolation
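A sketch of the data-creation step with Pillow, assuming a Gaussian blur (the kernel and radius are my assumption) followed by subsampling and bicubic interpolation back to the HR size, as listed above:

```python
import numpy as np
from PIL import Image, ImageFilter

def make_training_pair(hr_img: Image.Image, scale: int = 3, blur_radius: float = 1.0):
    """Build a (bicubic-upscaled LR, HR) training pair: blur, subsample by the
    scale factor, then bicubic-interpolate back to the HR size."""
    w, h = hr_img.size
    w, h = w - w % scale, h - h % scale            # make dimensions divisible by the scale
    hr = hr_img.crop((0, 0, w, h))
    lr = hr.filter(ImageFilter.GaussianBlur(blur_radius))
    lr = lr.resize((w // scale, h // scale), Image.BICUBIC)
    lr = lr.resize((w, h), Image.BICUBIC)          # network input has the same size as the HR target
    return np.asarray(lr), np.asarray(hr)
```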

34 #1 - Training Data ■ More training data leads to better results ■ The effect of big data is less pronounced than in high-level vision problems – perhaps because of the relatively small network

35 #1 - Net Parameters - Filter Size, Net width

36 #1 - Net Parameters - deeper structure ■ The performance was worse than before ■ No convergence after a week of training

37 #1 - Comparison with existing methods ■ SR-CNN 9-5-5 (~60K params) trained on ImageNet

38–39 (figure-only slides)

40 #1 - Conclusions ■ Nice results ■ A pure learning scheme with the simplest possible architecture ■ The comparison to the sparse coding algorithm is not strictly needed ■ Exhibited poor convergence properties, which are no longer common in classification CNNs – probably because no regularization is used: – no dropout – no batch normalization – no proper initialization – no pre-processing

41 NN #2 – (SCN) ■ Based on: – Z. Wang, et al, Deep networks for image super-resolution with sparse prior. ICCV, 2015

42 #2 - Contribution of the work ■ Combine the domain expertise of sparse coding and the merits of deep learning to achieve better SR performance with faster training and a smaller model size ■ Use network cascading for large and arbitrary scaling factors ■ Conduct a subjective evaluation on several recent state-of-the-art methods

43 #2 - Reminder – Sparse coding

44 #2 - Network implementation of sparse coding ■ Based on: – K. Gregor and Y. LeCun, Learning fast approximations of sparse coding, ICML 2010 ■ Main idea: – ISTA can be easily implemented as a recurrent network – By adding end-to-end learning (LISTA), a good approximation can be obtained within a fixed (and small) number of recurrent stages
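A minimal PyTorch sketch of the LISTA idea: a fixed number of unrolled shrinkage stages whose linear operators and thresholds are learned end-to-end. The initialization and number of stages are illustrative; in the SCN the same computation is realized with convolutional layers rather than the fully connected ones used here.

```python
import torch
import torch.nn as nn

class LISTA(nn.Module):
    """Learned ISTA (Gregor & LeCun, 2010): unrolled ISTA with a small, fixed
    number of stages whose matrices and thresholds are trained end-to-end."""
    def __init__(self, input_dim, code_dim, n_stages=3):
        super().__init__()
        self.W = nn.Linear(input_dim, code_dim, bias=False)       # plays the role of (1/L) D^T
        self.S = nn.Linear(code_dim, code_dim, bias=False)        # plays the role of I - (1/L) D^T D
        self.theta = nn.Parameter(torch.full((code_dim,), 0.1))   # learned soft-threshold (illustrative init)
        self.n_stages = n_stages

    def _shrink(self, x):
        # Soft-thresholding with a learned, per-coefficient threshold.
        return torch.sign(x) * torch.relu(torch.abs(x) - self.theta)

    def forward(self, y):
        b = self.W(y)
        z = self._shrink(b)
        for _ in range(self.n_stages - 1):
            z = self._shrink(b + self.S(z))
        return z
```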

45 #2 - Sparse coding based network ■ Now: – Sparse coding solver → a learnable block – Patch-wise processing → convolutional layers – Combining it all together

46 #2 - Advantages over previous models ■ The architecture follows exactly the SC-based SR method, enabling end-to-end training ■ The network's blocks are meaningful – they can be initialized from previously learned dictionaries and further improved by training

47 #2 - Network cascade ■ SCN – a separate model needs to be trained for each scaling factor ■ CSCN – a cascade of SCNs handles large and arbitrary factors
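One plausible way to realize the cascade (a sketch of the idea, not necessarily the exact procedure of the paper): repeatedly apply a x2 model until the accumulated factor reaches the target, then fix the remaining fractional factor by bicubic resizing. sr_x2 and resize_to are hypothetical callables, and the image is assumed to be an H x W (x C) NumPy array.

```python
import math

def cascade_upscale(img, target_scale, sr_x2, resize_to):
    """Reach a large or arbitrary scaling factor by cascading a x2 SR model,
    then resizing (bicubic) to the exact target size."""
    h, w = img.shape[:2]
    stages = math.ceil(math.log2(target_scale))        # number of x2 stages needed
    for _ in range(stages):
        img = sr_x2(img)                               # one SCN stage trained for factor 2
    target_size = (round(h * target_scale), round(w * target_scale))
    return resize_to(img, target_size)                 # final fractional adjustment
```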

48 #2 - Training ■ Loss function – MSE ■ Stochastic gradient descent ■ Training data: – 91 images, a total of 24,800 patches (33x33), data augmentation ■ Testing – Set5 – Set14 – BSD100 ■ Data creation: – Image blurring – Subsampling – Bicubic interpolation

49 #2 - Comparison with existing methods

50 (figure-only slide)

51 #2 - Subjective evaluation

52 #2 - Conclusions

53 NN #3 – (VDSR) ■ Based on: – J. Kim et al, Accurate Image Super-Resolution Using Very Deep Convolutional Networks, arXiv:1511.04587, 2015

54 #3 - Contribution of the work ■ Highly accurate SR method based on a very deep convolutional network ■ Boosting convergence rate using residual learning and gradient clipping ■ Extension to multi-scale SR

55 #3 - Proposed network ■ Inspired by the VGG net (Simonyan & Zisserman) ■ 20 layers, each with 64 feature maps and 3x3 filters
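A minimal PyTorch sketch of this architecture, including the residual connection discussed on the next slides; the depth and width follow the 20-layer / 64-channel setting, while the padding and single-channel input are my simplifications.

```python
import torch
import torch.nn as nn

class VDSR(nn.Module):
    """Sketch of the VDSR idea: a deep stack of 3x3 conv layers with 64 feature maps
    that predicts the residual between the interpolated LR input and the HR target."""
    def __init__(self, depth=20, channels=64):
        super().__init__()
        layers = [nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(channels, 1, 3, padding=1)]   # predicts the residual image
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return x + self.body(x)   # residual learning: add the prediction back to the input
```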

56 #3 - The deeper – the better ■ A deeper network gives: – A larger receptive field per output pixel (more context information); with 3x3 filters, D layers yield a (2D+1)x(2D+1) receptive field, so 20 layers cover 41x41 pixels – More expressive functions

57 #3 - Residual Learning ■ The network predicts only the residual between the interpolated LR input and the HR target, and the input is added back at the output ■ Much faster convergence ■ Superior performance

58 #3 - Training ■ Loss function – MSE ■ Stochastic gradient descent ■ Training data: – 291 images, data augmentation ■ Testing – Set5 – Set14 – B100 – Urban100 ■ Data creation: – Image blurring – Subsampling – Bicubic interpolation
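A sketch of one training step with MSE loss, SGD and gradient clipping (mentioned on the contributions slide); the clip threshold here is an arbitrary placeholder, and VDSR's adjustable clipping, which scales the clip range with the learning rate, is not reproduced exactly.

```python
import torch
import torch.nn.functional as F

def train_step(model, optimizer, lr_batch, hr_batch, clip_value=0.4):
    """One SGD step: MSE loss between the network output and the HR target,
    with gradient clipping to keep the very deep network stable."""
    optimizer.zero_grad()
    loss = F.mse_loss(model(lr_batch), hr_batch)
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), clip_value)  # clip before the update
    optimizer.step()
    return loss.item()
```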

59 #3 - Comparison with existing methods

60–61 (figure-only slides)

62 #3 - Conclusions ■ Better results than the former methods ■ An elegantly organized work; the residual prediction is a nice idea ■ Can be easily applied to other image restoration domains

63 Domain Expertise vs. End-to-End Optimization – just one last slide: how can neural networks be utilized for algorithmic challenges? One possible approach is to combine the network with existing well-engineered algorithms ("physically" or through better initialization). On the other hand, there is a "pure" learning approach that treats the NN as a "black box": build a network with some (possibly customized) architecture and let it optimize its parameters jointly in an end-to-end manner. In this talk we discussed these two approaches for the task of image restoration.

64 references ■ A. Beck and M. Teboulle, A fast iterative shrinkage-thresholding algorithm with application to wavelet based image deblurring, ICASSP 2009 ■ D. Glasner, S. Bagon, and M. Irani. Super-resolution from a single image. In ICCV, 2009 ■ K. Gregor and Y. LeCun, Learning fast approximations of sparse coding, ICML 2010 ■ J. Yang et al, Image SR via sparse representation, IEEE TIP 2010 ■ C. Dong et al, Learning a deep convolutional network for image SR, ECCV 2014 ■ C. Dong et al, Image SR using deep convolutional networks, TPAMI 2015 ■ J. Kim et al, Accurate Image Super-Resolution Using Very Deep Convolutional Networks, arXiv:1511.04587, 2015 ■ Z. Wang, et al, Deep networks for image super-resolution with sparse prior. ICCV, 2015

65 Any questions?

