A Neural Approach to Blind Motion Deblurring


1 A Neural Approach to Blind Motion Deblurring
Ayan Chakrabarti, ECCV 2016

2 Introduction
- A new approach for blind deconvolution of natural images degraded by arbitrary motion-blur kernels due to camera shake.
- A neural network is trained to output the complex Fourier coefficients of a deconvolution filter to be applied to its input patch.
- A multi-resolution frequency decomposition encodes the input patch, and the connectivity of the initial network layers is limited based on locality in frequency, which leads to a significant reduction in the number of weights to be learned during training.
- The trained network is applied independently to every overlapping patch in the input image, and its outputs are composed to form an initial estimate of the latent sharp image.
- Despite reasoning about patches independently, without sharing information about a common global motion kernel, this procedure by itself performs surprisingly well.

3 Patch-wise Neural Deconvolution
Blur model: y[n] = (x * k)[n] + Ξ΅[n], with k[n] β‰₯ 0 and Ξ£_n k[n] = 1
- y[n]: observed image
- k: blur kernel
- Ξ΅[n]: i.i.d. Gaussian noise
Goal: design a network that recovers the sharp intensity values x_p = {x[n] : n ∈ p} of a patch p, given as input a larger patch y_{p+} = {y[n] : n ∈ p+} from the observed image, where p βŠ‚ p+.
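The observation model above is easy to simulate; a minimal numpy/scipy sketch (function and parameter names are illustrative, not from the paper):

```python
import numpy as np
from scipy.signal import fftconvolve

def blur(x, k, noise_std=0.01, rng=None):
    """Simulate y[n] = (x * k)[n] + eps[n], with k[n] >= 0, sum_n k[n] = 1."""
    k = np.clip(k, 0.0, None)
    k = k / k.sum()                      # enforce the unit-sum constraint
    y = fftconvolve(x, k, mode="same")   # linear convolution cropped to input size
    rng = np.random.default_rng() if rng is None else rng
    return y + noise_std * rng.standard_normal(x.shape)
```

With `noise_std=0` and a centered delta kernel, `blur` returns the input unchanged, which is a quick sanity check on the model.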

4 Restoration by Predicting Deconvolution Filter Coefficients
The network outputs the complex discrete Fourier transform (DFT) coefficients G_{p+}[z] ∈ β„‚ of a deconvolution filter. Given the DFT Y_{p+}[z] of the input patch, the restored output patch is
  XΜ‚_{p+}[z] = G_{p+}[z] Β· Y_{p+}[z]   (2)
The estimate xΜ‚_p[n] of the sharp image patch is computed by taking the inverse discrete Fourier transform (IDFT) of XΜ‚_{p+}[z] and cropping out the central patch p βŠ‚ p+.
Note: since the filter is real-valued, G_{p+}[z] = G*_{p+}[βˆ’z], and G_{p+}[0] = 1. The network therefore only needs to output (|p+| βˆ’ 1)/2 unique complex numbers to characterize G_{p+}[z], where |p+| is the number of pixels in p+.
The loss function for the network is the mean squared error
  L(xΜ‚_p, x_p) = (1/|p|) Ξ£_{n∈p} (xΜ‚_p[n] βˆ’ x_p[n])Β²   (3)
Note: both the IDFT and the filtering in (2) are linear operations, so it is trivial to back-propagate the gradients of (3) to the outputs G_{p+}[z], and subsequently to all layers within the network.
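The restoration step amounts to a pointwise multiplication in the Fourier domain followed by a central crop; a hedged numpy sketch (patch sizes and names are illustrative):

```python
import numpy as np

def restore_patch(y_plus, G, crop):
    """Eq. (2): X[z] = G[z] * Y[z]; then IDFT and crop the central
    crop x crop region (p inside p+)."""
    Y = np.fft.fft2(y_plus)
    x_full = np.real(np.fft.ifft2(G * Y))
    c = (y_plus.shape[0] - crop) // 2
    return x_full[c:c + crop, c:c + crop]

def mse_loss(x_hat, x):
    """Eq. (3): mean squared error over the output patch."""
    return np.mean((x_hat - x) ** 2)
```

An all-ones `G` is the identity filter, so the function then simply returns the central crop of the input patch.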

5 Network Architecture
(Architecture figure: the 65Γ—65 input patch is successively down-sampled to 33Γ—33 and 17Γ—17, and the DFT coefficients at each resolution are grouped into four frequency bands: L: |z| ≀ 4; B1 and B2: 4 < max|z| ≀ 8 at the intermediate resolutions; H at the full resolution.)
The total number of coefficients in the four bands is lower than the size of the input patch.

6 Training
- Sharp image patches are extracted from images in the Pascal VOC 2012 dataset and blurred with synthetically generated kernels:
  - Randomly sample six points in a limited-size grid (an equal number of kernels is generated from grid sizes of 8Γ—8, 16Γ—16, and 24Γ—24)
  - Fit a spline through these points
  - Set the kernel value at each pixel on this spline to a value sampled from N(1, 1/4)
  - Clip those values to be positive, and normalize the kernel to unit sum
  - Add Gaussian noise with a standard deviation of 1%
  - Bring the kernels to a "canonical" translation by centering them so that each kernel's center of mass (weighted by kernel values) is at the center of the window
- Separate training and validation sets use different sharp patches and randomly generated kernels: about 520,000 / 3,000 image patches and 100,000 / 3,000 kernels for the training and validation sets, respectively
- To minimize disk access, the entire set of sharp patches and kernels is loaded into memory. Training (validation) data is generated on the fly by selecting random pairs of patches and kernels and convolving the two to create the input patch. Rotated and mirrored versions of the sharp patches are also used.
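The kernel-generation recipe above can be sketched roughly as follows (the grid/window sizes and the spline rasterization details are illustrative assumptions, not the authors' code):

```python
import numpy as np
from scipy.interpolate import splprep, splev
from scipy.ndimage import shift as com_shift

def random_motion_kernel(grid=16, size=25, rng=None):
    """Six random control points -> spline -> values from N(1, 1/4)
    -> clip -> unit sum -> center at the kernel's center of mass."""
    rng = np.random.default_rng() if rng is None else rng
    pts = rng.uniform(0, grid, size=(2, 6))
    tck, _ = splprep(pts, s=0)                   # spline through the 6 points
    xs, ys = splev(np.linspace(0, 1, 20 * size), tck)
    k = np.zeros((size, size))
    for xq, yq in zip(xs, ys):
        i, j = int(round(yq)), int(round(xq))
        if 0 <= i < size and 0 <= j < size:
            k[i, j] = rng.normal(1.0, 0.5)       # N(1, 1/4): std = 0.5
    k = np.clip(k, 0.0, None)
    if k.sum() == 0.0:                           # degenerate draw: fall back to a delta
        k[size // 2, size // 2] = 1.0
    k /= k.sum()
    ii, jj = np.mgrid[:size, :size]
    com = ((k * ii).sum(), (k * jj).sum())       # value-weighted center of mass
    k = com_shift(k, (size // 2 - com[0], size // 2 - com[1]), order=1)
    k = np.clip(k, 0.0, None)
    return k / k.sum()
```

Each call yields a nonnegative, unit-sum kernel whose mass is centered in the window, matching the "canonical translation" step.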

7 Whole Image Restoration
Initial estimate xΜ„_N[n]
- Given an observed blurry image y[n], consider all overlapping patches y_{p+} in the image, and use the trained network to compute estimates xΜ‚_p of their latent sharp versions.
- Combine these restored patches into an initial estimate xΜ„_N[n] of the sharp image by setting xΜ„_N[n] to the average of its estimates xΜ‚_p[n] from all patches p βˆ‹ n, using a Hanning window to weight the contributions from different patches.
- Note: this does not yet exploit the fact that the entire image has been blurred by the same motion kernel.
Estimation of the global kernel
- Minimize kΜ‚_Ξ» = arg min_k Ξ£_i β€–k βˆ— (f_i βˆ— xΜ„_N) βˆ’ (f_i βˆ— y)β€–Β² + Ξ» Ξ£_n k[n], where the f_i[n] are various derivative filters (first- and second-order derivatives at 8 orientations).
- Solve the problem in the Fourier domain using half-quadratic splitting.
- Clip each kernel estimate kΜ‚_Ξ»[n] to be positive, set very small or isolated values to zero, and normalize the result to unit sum.
- Given this estimate of the global kernel, use EPLL, a state-of-the-art non-blind deconvolution algorithm, to deconvolve y[n] and arrive at the final estimate of the sharp image xΜ‚[n].
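Composing the overlapping per-patch outputs with Hanning weighting can be sketched as (a simplified illustration with hypothetical names, not the authors' code):

```python
import numpy as np

def compose_patches(patches, positions, out_shape):
    """Average overlapping restored patches into an initial estimate,
    weighting each patch's contribution with a 2-D Hanning window."""
    ps = patches[0].shape[0]
    w1 = np.hanning(ps + 2)[1:-1]        # crop endpoints: strictly positive window
    win = np.outer(w1, w1)
    num = np.zeros(out_shape)
    den = np.zeros(out_shape)
    for patch, (i, j) in zip(patches, positions):
        num[i:i + ps, j:j + ps] += win * patch
        den[i:i + ps, j:j + ps] += win
    return num / np.maximum(den, 1e-12)
```

The window down-weights patch borders, so seams between neighboring patches are blended rather than averaged with equal weight.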

8 Experiment
- 640 blurred images generated from 80 high-quality natural images and 8 real motion-blur kernels acquired by Levin et al.
- Per-patch restoration quality differs depending on the image content.

9

10 End-to-End Learning for Image Burst Deblurring
Patrick Wieschollek, Bernhard SchΓΆlkopf, Hendrik P.A. Lensch, and Michael Hirsch

11 An end-to-end trained neural network approach to multi-frame blind deconvolution for motion blur due to camera shake.
Problem: given a burst of observed color images Y_1, Y_2, …, Y_N capturing the same scene X, with
  Y_t = k_t βˆ— X + Ξ΅_t
predict a latent single sharp image xΜ‚ through a deep neural network architecture, i.e. xΜ‚ = Ο€_ΞΈ(y_1, y_2, …, y_N).
- The network operates on a patch-by-patch basis.
- All predicted patches are recomposed into the final prediction xΜ‚ by averaging the predicted pixel values.
- The learning parameters ΞΈ are optimized by directly minimizing the objective β€–Ο€_ΞΈ(y_1, y_2, …, y_N) βˆ’ xβ€–Β².

12 Network Architecture
Three stages:
1. Frequency-band analysis with Fourier-coefficient prediction
2. Deconvolution
3. Image fusion
(Figure: the first two stages of the proposed system.)

13 Frequency band analysis
A 1Γ—1Γ—N (N = burst length) fully connected network, sharing weights across frequency coefficients f_ij, adjusts the extracted Fourier coefficients.
Deconvolution
- Pairwise merging of the resulting bands b1', b2', b3', b4' using fully connected layers with ReLU activation units to reduce the dimensionality.
- The resulting 4096-dimensional feature encoding is fed through several fully connected layers producing a 4225-dimensional (= 65Β²) prediction of the filter coefficients of the deconvolution kernel.
- Applying the deconvolution kernel predicts a sharp patch xΜ‚ of size 33Γ—33 from each input sequence of patches. This is implemented as a multiplication of the predicted Wiener filter with the Fourier transform of the input patch.
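For reference, the classical Wiener filter that such a predictor approximates has a closed form; a minimal sketch (the paper predicts the coefficients with a network, rather than computing them from a known kernel as done here):

```python
import numpy as np

def wiener_filter(K, nsr=1e-2):
    """Closed-form Wiener filter for a known kernel spectrum K:
    G[z] = conj(K[z]) / (|K[z]|^2 + nsr), nsr = noise-to-signal ratio."""
    return np.conj(K) / (np.abs(K) ** 2 + nsr)

def apply_filter(y_patch, G):
    """Deconvolution as a pointwise multiplication in the Fourier domain."""
    return np.real(np.fft.ifft2(G * np.fft.fft2(y_patch)))
```

With a delta kernel (spectrum of all ones) and `nsr=0`, the filter is the identity and the patch passes through unchanged.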

14 Image fusion
Fuse all available sharp patches yΜ‚_1, yΜ‚_2, …, yΜ‚_N by adopting the FBA (Fourier Burst Accumulation) approach as a neural network component with learnable weights.
[Delbracio, M., Sapiro, G.: Burst deblurring: Removing camera shake through Fourier burst accumulation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society (2015)]
Fact: the magnitude of a Fourier coefficient depends on the degree of sharpness of the image in the burst to which it belongs. FBA therefore merges the coefficients of a burst of images by taking a weighted average.
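The FBA weighted average can be sketched in plain numpy as follows (this simplified version omits the learnable weights of the paper's FBA layer and the magnitude smoothing of the original FBA method):

```python
import numpy as np

def fba(burst, p=11):
    """Fourier Burst Accumulation: weight each frame's Fourier
    coefficients by |Y_i|^p, so sharper frames (larger spectral
    magnitudes) dominate each frequency of the fused result."""
    Ys = np.stack([np.fft.fft2(y) for y in burst])   # shape (N, H, W)
    mag = np.abs(Ys) ** p
    s = mag.sum(axis=0, keepdims=True)
    # normalized weights; fall back to a plain average where all magnitudes vanish
    w = np.where(s > 0, mag / np.where(s > 0, s, 1.0), 1.0 / len(Ys))
    return np.real(np.fft.ifft2((w * Ys).sum(axis=0)))
```

A burst of identical frames gives uniform weights, so the fusion reduces to the input image itself, which is a useful correctness check.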

15 Up to now: for a burst of images
- Frequency-band analysis: adjust the extracted Fourier coefficients
- Deconvolution kernel prediction: learn the proper deconvolution filter
- Wiener filtering: just a multiplication of the input patch and the filter
- Result: a burst of deblurred image patches in the frequency domain
Training
- An artificially generated data set: sharp patches for training and validation are extracted from the MS COCO data set
- Input bursts of 14 blurry images are generated on the fly by applying synthetic blur kernels of sizes 17Γ—17 and 7Γ—7 pixels to the ground-truth patches
- Data augmentation by rotating and mirroring
- Zero-mean Gaussian noise with variance 0.1 is added

16 Deployment
- Feed input patches of size 65Γ—65 into the neural network with a stride of 5
- Average the predictions with a 2D Hanning window
- Post-process the color channels to avoid color desaturation

17 Experimental results

