A Neural Approach to Blind Motion Deblurring


A Neural Approach to Blind Motion Deblurring
Ayan Chakrabarti, ECCV 2016

Introduction
- A new approach for blind deconvolution of natural images degraded by arbitrary motion blur kernels caused by camera shake.
- The network is trained to output the complex Fourier coefficients of a deconvolution filter to be applied to the input patch.
- A multi-resolution frequency decomposition encodes the input patch, and the connectivity of the initial network layers is limited based on locality in frequency, which significantly reduces the number of weights to be learned during training.
- The trained network is applied independently to every overlapping patch in the input image, and its outputs are composed to form an initial estimate of the latent sharp image.
- Despite reasoning with patches independently and not sharing information about a common global motion kernel, this procedure by itself performs surprisingly well.

Patch-wise Neural Deconvolution
Blur model: $y[n] = (x * k)[n] + \varepsilon[n]$, with $k[n] \ge 0$ and $\sum_n k[n] = 1$, where $y[n]$ is the observed image, $k$ the blur kernel, and $\varepsilon[n]$ i.i.d. Gaussian noise.
Goal: design a network to recover the sharp intensity values $x_p = \{x[n] : n \in p\}$ of a patch $p$, given as input a larger patch $y_{p^+} = \{y[n] : n \in p^+\}$ from the observed image, where $p \subset p^+$.
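As a toy illustration of this blur model (not the authors' code), the sketch below blurs a patch with a hypothetical non-negative, unit-sum kernel and adds i.i.d. Gaussian noise; the 3 × 3 box kernel and the noise level are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def blur(x, k, noise_sigma=0.01):
    """Apply the blur model y[n] = (x * k)[n] + eps[n] ('same'-size output)."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)), mode="edge")
    y = np.zeros_like(x)
    for i in range(kh):            # accumulate shifted, kernel-weighted copies
        for j in range(kw):
            y += k[i, j] * xp[i:i + x.shape[0], j:j + x.shape[1]]
    return y + noise_sigma * rng.standard_normal(x.shape)

# hypothetical 3x3 box kernel: non-negative and unit-sum, as the model requires
k = np.ones((3, 3)) / 9.0
x = rng.random((65, 65))           # toy "sharp" patch
y = blur(x, k)
```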

Restoration by Predicting Deconvolution Filter Coefficients
The network outputs the discrete Fourier transform (DFT) coefficients $G_{p^+}[z] \in \mathbb{C}$ of a deconvolution filter. Given the DFT $Y_{p^+}[z]$ of the input patch, the restored output patch is
$X_{p^+}[z] = G_{p^+}[z] \cdot Y_{p^+}[z]$   (2)
The estimate $\hat{x}_p[n]$ of the sharp image patch is computed by taking the inverse DFT (IDFT) of $X_{p^+}[z]$ and then cropping out the central patch $p \subset p^+$.
Note: $G_{p^+}[z] = G^*_{p^+}[-z]$ and $G_{p^+}[0] = 1$, so the network only needs to output $(|p^+| - 1)/2$ unique complex numbers to characterize $G_{p^+}[z]$, where $|p^+|$ is the number of pixels in $p^+$.
The loss function for the network is the mean squared error
$L(\hat{x}_p, x_p) = \frac{1}{|p|} \sum_{n \in p} \left( \hat{x}_p[n] - x_p[n] \right)^2$   (3)
Note: both the IDFT and the filtering in (2) are linear operations, so it is trivial to back-propagate the gradients of (3) to the outputs $G_{p^+}[z]$, and from there to all layers within the network.
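The restoration step of Eq. (2) amounts to an elementwise product in the Fourier domain followed by an inverse DFT and a central crop. A minimal sketch, using an identity filter as a stand-in for the network's predicted $G_{p^+}[z]$ (patch and crop sizes follow the 65/33 split used later in the deck):

```python
import numpy as np

def restore_patch(y_plus, G, crop=16):
    """Apply a deconvolution filter G in the Fourier domain (Eq. 2),
    then crop the central patch. G would come from the trained network."""
    Y = np.fft.fft2(y_plus)
    X = G * Y                                # Eq. (2): elementwise product
    x_plus = np.real(np.fft.ifft2(X))        # IDFT back to the spatial domain
    return x_plus[crop:-crop, crop:-crop]    # central 33x33 of a 65x65 patch

# identity filter as a placeholder prediction (all-pass, G = 1 everywhere)
y_plus = np.random.default_rng(1).random((65, 65))
x_hat = restore_patch(y_plus, np.ones((65, 65)))
```

With the identity filter the round trip simply reproduces the central crop of the input, which makes the plumbing easy to check.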

Network Architecture
[Figure: multi-resolution frequency decomposition of the 65 × 65 input patch into four bands: a low-pass band L ($|z| \le 4$) and bands $B_1$, $B_2$, H ($4 < \max|z| \le 8$), computed at successively down-sampled resolutions (65 × 65, 33 × 33, 17 × 17).]
The total number of coefficients in the four bands is lower than the size of the input patch.

Training
- Extract sharp image patches from images in the Pascal VOC 2012 dataset
- Blur them with synthetically generated kernels:
  - Randomly sample six points in a limited-size grid (an equal number of kernels is generated from grid sizes of 8 × 8, 16 × 16, and 24 × 24)
  - Fit a spline through these points
  - Set the kernel value at each pixel on this spline to a sample from $\mathcal{N}(1, 1/4)$
  - Clip these values to be positive and normalize the kernel to unit sum
  - Fix a "canonical" translation by centering each kernel so that its center of mass (weighted by kernel values) lies at the center of the window
- Add Gaussian noise with standard deviation of 1%
- Separate training and validation sets use different sharp patches and randomly generated kernels: about 520,000 and 3,000 image patches, and 100,000 and 3,000 kernels, respectively
- To minimize disk access, the entire set of sharp patches and kernels is loaded into memory; training (validation) data is generated on the fly by selecting random pairs of patches and kernels and convolving the two to create the input patch
- Rotated and mirrored versions of the sharp patches are also used
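A rough sketch of the kernel-generation recipe above, with two simplifications: a polyline stands in for the spline fit, and the canonical centering step is omitted. The grid size and point count follow the slide; everything else is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_kernel(grid=16, n_points=6):
    """Trace a trajectory through random control points, weight it by
    N(1, 1/4) samples clipped to be positive, and normalize to unit sum."""
    k = np.zeros((grid, grid))
    pts = rng.uniform(0, grid - 1, size=(n_points, 2))
    for (r0, c0), (r1, c1) in zip(pts[:-1], pts[1:]):
        for t in np.linspace(0.0, 1.0, 4 * grid):   # dense samples per segment
            r = int(round(r0 + t * (r1 - r0)))
            c = int(round(c0 + t * (c1 - c0)))
            if k[r, c] == 0.0:
                # N(1, 1/4) means variance 1/4, i.e. std 0.5; clip to positive
                k[r, c] = max(rng.normal(1.0, 0.5), 1e-3)
    return k / k.sum()

k = random_kernel()
```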

Whole Image Restoration
Initial estimate $\bar{x}_N[n]$: given an observed blurry image $y[n]$, consider all overlapping patches $y_{p^+}$ in the image and use the trained network to compute estimates $\hat{x}_p$ of their latent sharp versions. Combine these restored patches into an initial estimate $\bar{x}_N[n]$ of the sharp image by setting $\bar{x}_N[n]$ to the average of its estimates $\hat{x}_p[n]$ from all patches $p \ni n$, using a Hanning window to weight the contributions of different patches.
Note: this does not yet exploit the fact that the entire image has been blurred by the same motion kernel.
Estimation of the global kernel: minimize
$k_\lambda = \arg\min_k \sum_i \left\| k * (f_i * \bar{x}_N) - (f_i * y) \right\|^2 + \lambda \sum_n |k[n]|$
where the $f_i[n]$ are various derivative filters (first- and second-order derivatives at 8 orientations). The problem is solved in the Fourier domain using half-quadratic splitting; each kernel estimate $k_\lambda[n]$ is clipped to be positive, very small or isolated values are set to zero, and the result is normalized to unit sum.
Given this estimate of the global kernel, EPLL (a state-of-the-art non-blind deconvolution algorithm) is used to deconvolve $y[n]$ and arrive at the final estimate of the sharp image $x[n]$.
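The composition step can be sketched as a Hanning-weighted average of overlapping restored patches; patch size and stride below are illustrative, and the toy example uses constant patches so the result is easy to verify.

```python
import numpy as np

def compose_patches(patches, coords, image_shape, psize=33):
    """Average overlapping restored patches into one image, weighting each
    patch with a 2D Hanning window (a sketch of the composition step)."""
    w = np.outer(np.hanning(psize), np.hanning(psize)) + 1e-8  # avoid zeros
    acc = np.zeros(image_shape)    # weighted sum of patch values
    wsum = np.zeros(image_shape)   # sum of weights per pixel
    for patch, (r, c) in zip(patches, coords):
        acc[r:r + psize, c:c + psize] += w * patch
        wsum[r:r + psize, c:c + psize] += w
    return acc / np.maximum(wsum, 1e-8)

# toy check: constant patches on a dense grid should reproduce a constant image
coords = [(r, c) for r in range(0, 32, 5) for c in range(0, 32, 5)]
patches = [np.full((33, 33), 2.0) for _ in coords]
x_bar = compose_patches(patches, coords, (64, 64))
```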

Experiment
640 blurred images generated from 80 high-quality natural images and 8 real motion blur kernels acquired by Levin et al.
Per-patch restoration quality differs depending on the image content.

End-to-End Learning for Image Burst Deblurring
Patrick Wieschollek, Bernhard Schölkopf, Hendrik P. A. Lensch, and Michael Hirsch

An end-to-end trained neural network approach to multi-frame blind deconvolution for motion blur due to camera shake.
Problem: given a burst of observed color images $y_1, y_2, \ldots, y_N$ capturing the same scene $x$, with
$y_t = k_t * x + \varepsilon_t$
predict a single latent sharp image $\hat{x}$ through a deep neural network: $\hat{x} = \pi_\theta(y_1, y_2, \ldots, y_N)$.
The network operates on a patch-by-patch basis; all predicted patches are recomposed into the final prediction $\hat{x}$ by averaging the predicted pixel values.
The learning parameters $\theta$ are optimized by directly minimizing the objective $\left\| \pi_\theta(y_1, y_2, \ldots, y_N) - x \right\|^2$.

Network Architecture
Three stages:
- Frequency band analysis with Fourier coefficient prediction
- Deconvolution
- Image fusion
[Figure: the first two stages of the proposed system.]

Frequency band analysis
A 1 × 1 × N (N = burst length) fully connected network, sharing weights across frequency coefficients $f_{ij}$, adjusts the extracted Fourier coefficients.
Deconvolution
The resulting bands $b_1', b_2', b_3', b_4'$ are merged pairwise using fully connected layers with ReLU activations to reduce dimensionality. The resulting 4096-dimensional feature encoding is then fed through several fully connected layers producing a 4225-dimensional (65 × 65) prediction of the filter coefficients of the deconvolution kernel. Applying the deconvolution kernel predicts a sharp patch $\hat{x}$ of size 33 × 33 from each input sequence of patches. This is implemented as a multiplication of the predicted Wiener filter with the Fourier transform of the input patch.
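The filtering itself is just a multiplication in the Fourier domain. As an illustrative stand-in (the network predicts the filter coefficients directly; here a classical Wiener filter is built from a hypothetical known kernel spectrum and an assumed noise-to-signal constant):

```python
import numpy as np

def wiener_apply(y_patch, k_freq, nsr=1e-2):
    """Build a Wiener filter from a kernel spectrum k_freq and apply it
    to a patch as a pointwise product in the Fourier domain."""
    G = np.conj(k_freq) / (np.abs(k_freq) ** 2 + nsr)  # Wiener filter
    X = G * np.fft.fft2(y_patch)
    return np.real(np.fft.ifft2(X))

rng = np.random.default_rng(2)
y_patch = rng.random((65, 65))
# hypothetical blur kernel: 3x3 box, zero-padded to the patch size
k_freq = np.fft.fft2(np.pad(np.ones((3, 3)) / 9.0, ((0, 62), (0, 62))))
x_hat = wiener_apply(y_patch, k_freq)
```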

Image fusion
Fuse all available sharp patches $y_1, y_2, \ldots, y_N$ by adopting the FBA (Fourier Burst Accumulation) approach as a neural network component with learnable weights.
[Delbracio, M., Sapiro, G.: Burst deblurring: removing camera shake through Fourier Burst Accumulation. CVPR 2015.]
Fact: the magnitude of a Fourier coefficient depends on the degree of sharpness of the burst image it belongs to. FBA therefore merges the coefficients of a burst with a weighted average that favors large-magnitude (i.e. sharp) coefficients.
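A minimal numpy sketch of the FBA aggregation rule: each frame's Fourier coefficient is weighted by its magnitude raised to a power `p` (a free parameter here; in the network component these weights are learnable).

```python
import numpy as np

def fba(patches, p=11):
    """Fourier Burst Accumulation: average the Fourier coefficients of a
    burst, weighting each frame by |coefficient|^p so that sharper frames
    (larger magnitudes) dominate the fused spectrum."""
    F = np.stack([np.fft.fft2(x) for x in patches])       # (N, H, W) spectra
    w = np.abs(F) ** p
    w /= np.maximum(w.sum(axis=0, keepdims=True), 1e-12)  # normalize over burst
    return np.real(np.fft.ifft2((w * F).sum(axis=0)))

burst = [np.random.default_rng(i).random((33, 33)) for i in range(4)]
fused = fba(burst)
```

A sanity check on the design: fusing N identical frames gives each a weight of 1/N per coefficient, so the fused result equals the input frame.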

Up to now, for a burst of images:
- Frequency band analysis: adjust the extracted Fourier coefficients
- Deconvolution kernel prediction: learn the proper deconvolution filter
- Wiener filtering: a simple multiplication of the input patch spectrum with the filter
Result: a burst of deblurred image patches in the frequency domain
Training
- Artificially generated data set obtained by applying synthetic blur kernels to patches extracted from the MS COCO data set
- 542,217 sharp patches for training and validation
- Input bursts of 14 blurry images generated on the fly by applying synthetic blur kernels of sizes 17 × 17 and 7 × 7 pixels to the ground-truth patches
- Data augmentation by rotation and mirroring
- Zero-mean Gaussian noise with variance 0.1 added

Deployment
- Feed input patches of size 65 × 65 into the neural network with stride 5
- Average the predictions with a 2D Hanning window
- Post-process the color channels to avoid color desaturation

Experimental result