Image Inpainting Using Pre-trained Classification CNN


1 Image Inpainting Using Pre-trained Classification CNN
By Yaniv Kerzhner & Adar Elad. Supervisor: Yaniv Romano

2 Achievements We wrote an academic paper, which we will soon submit to a conference.

3 Non-Blind Image Inpainting
The process of reconstructing lost parts of a given image. A platform for interesting applications: removal of undesired objects, replacing one face with another, super-resolution (a special case), compression, and more…

4 Project Scope Goal: solving the inpainting problem using Convolutional Neural Networks (CNNs). Starting point: handwritten digits; later: facial images. Background on CNNs: define an architecture (Convolution – ReLU – … – Fully-connected – Soft-max) and minimize a loss function over (example, label) pairs using backpropagation and gradient descent. We use the MatConvNet environment, and then present our approach to tackling the inpainting problem.

5 Neural Networks (CNN) We take an image, pass it through a series of convolutional, nonlinear, pooling (down-sampling), and fully-connected layers, and get an output. That output can be a single class or a probability over the classes that best describes the image. CNN for digit classification. The process of learning the parameters is called backpropagation: feed an image to the network, compare the output to the desired class, and pay a loss in case of error. Update the parameters by minimizing the loss. This requires access to the gradient of the loss as a function of the parameters.
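The classification-plus-loss step above can be sketched in a few lines. This is a minimal numpy illustration, assuming a single fully-connected softmax layer as a stand-in for the full convolutional network; the function names are hypothetical, not from the project's MatConvNet code:

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax.
    e = np.exp(z - z.max())
    return e / e.sum()

def classify_and_loss(x, W, b, label):
    """Forward pass of a one-layer classifier plus cross-entropy loss.

    x     : flattened image, shape (d,)
    W, b  : weights (10, d) and biases (10,)
    label : desired digit 0..9
    Returns (class probabilities, loss, gradient of loss w.r.t. x).
    """
    probs = softmax(W @ x + b)
    loss = -np.log(probs[label])
    # Back-propagation through softmax + cross-entropy:
    dz = probs.copy()
    dz[label] -= 1.0          # d(loss)/d(logits)
    dx = W.T @ dz             # d(loss)/d(input pixels)
    return probs, loss, dx

rng = np.random.default_rng(0)
x = rng.standard_normal(784)          # a 28x28 "digit", flattened
W = rng.standard_normal((10, 784)) * 0.01
b = np.zeros(10)
probs, loss, dx = classify_and_loss(x, W, b, label=3)
```

During training the same backward pass yields gradients with respect to W and b; here we also keep the gradient with respect to the pixels, which is exactly what the inpainting approach later exploits.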

6 Back-Prop (CNN) Feed Forward: [diagram: the image X is fed through the network, which outputs a probability for each digit 0..9 — X's classification]

7 Back-Prop (CNN) [diagram: CNN(X) is compared with the expected classification (100% on the true label, 0% elsewhere) to create the Loss; the derivative of the penalty is then propagated back through the network]

8 Background: Inpainting
The straightforward solution: given a set of training pairs (original and corrupted images), train a network in a supervised fashion. [J. Xie et al. ('12)] [K. Rolf et al. ('14)]

9 Project Objectives The straightforward solution
Given a set of training pairs (original and corrupted images), train a network in a supervised fashion. In contrast, our project offers an alternative way to inpaint: can we leverage a network that was trained for classification to solve the inpainting problem? We offer a novel approach to hallucinate missing data in images.

10 Pre-trained Network (CNN)
Block diagram: the corrupted image Y, its label, and the mask M enter a pre-trained network (CNN), and we solve min_X E(Y, CNN(X), label) over the image X. What is the E(·) that will lead to the hoped-for result?
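The block diagram amounts to gradient descent on the image pixels rather than on the network weights. A minimal sketch, assuming a generic differentiable energy supplied as a gradient function (`grad_E` here is a hypothetical stand-in, not the project's actual cost):

```python
import numpy as np

def inpaint_by_descent(y, mask, grad_E, steps=200, lr=0.1):
    """Minimize E over the masked pixels only, keeping known pixels fixed.

    y      : corrupted image (known pixels hold their values)
    mask   : boolean array, True where pixels are missing
    grad_E : function returning dE/dX for the current estimate X
    """
    x = y.copy()
    for _ in range(steps):
        g = grad_E(x)
        x[mask] -= lr * g[mask]   # update only the unknown pixels
    return x

# Toy energy: pull the missing pixels toward 0.5, a stand-in for the
# real classification + smoothness gradient described later.
y = np.zeros((8, 8))
mask = np.zeros((8, 8), dtype=bool)
mask[2:6, 2:6] = True
x_hat = inpaint_by_descent(y, mask, grad_E=lambda x: 2.0 * (x - 0.5))
```

Restricting the update to the masked pixels guarantees the known part of the image is never altered, whatever E is.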

11 Defining the Cost Function
The cost is a linear combination of several penalties, each of them representing a different force, which together lead to a high-quality restoration: the result should be recognizable (classified correctly), and it should be smooth.

12 The Penalties (Mathematically)
Should be classified correctly: a label penalty measuring the network's error on the desired class. Smoothness (artifact removal): a TV (total variation) term built from the horizontal and vertical derivatives, restricted by the mask to the missing region.
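The smoothness term can be written down concretely. A minimal sketch of a TV-style penalty over the masked region, assuming the squared (rather than absolute) form of the derivatives; the exact variant used in the project may differ:

```python
import numpy as np

def tv_penalty(x, mask):
    """TV-style smoothness penalty restricted to the masked region.

    Sums squared horizontal and vertical finite differences that touch
    the hole, so smooth fills are cheap and abrupt artifacts expensive.
    """
    dh = x[:, 1:] - x[:, :-1]          # horizontal derivative
    dv = x[1:, :] - x[:-1, :]          # vertical derivative
    mh = mask[:, 1:] | mask[:, :-1]    # differences touching the hole
    mv = mask[1:, :] | mask[:-1, :]
    return (dh[mh] ** 2).sum() + (dv[mv] ** 2).sum()

flat = np.full((4, 4), 0.7)            # perfectly smooth image
noisy = flat + np.array([[0, 1], [1, 0]]).repeat(2, 0).repeat(2, 1)
hole = np.ones((4, 4), dtype=bool)     # penalize everywhere, for the demo
```

A constant image incurs zero penalty, while a blocky fill is penalized at every block edge.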

13 Minimizing the Cost The gradient of E is the sum of the gradients of each penalty. P_smooth is a known and commonly used term. The interesting penalty is P_label: computing its gradient is similar to back-propagation!
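Since E is a weighted sum of penalties, its gradient is the same weighted sum of their gradients. A minimal sketch, with the gradient of a squared-difference smoothness term written out and `grad_label` left as a hypothetical placeholder for the back-propagated label gradient:

```python
import numpy as np

def grad_smooth(x):
    # Gradient of the sum of squared horizontal and vertical
    # differences (a discrete Laplacian, up to a factor).
    g = np.zeros_like(x)
    g[:, 1:] += 2 * (x[:, 1:] - x[:, :-1])
    g[:, :-1] -= 2 * (x[:, 1:] - x[:, :-1])
    g[1:, :] += 2 * (x[1:, :] - x[:-1, :])
    g[:-1, :] -= 2 * (x[1:, :] - x[:-1, :])
    return g

def grad_total(x, grad_label, lam=0.1):
    # E = P_label + lam * P_smooth  =>  dE/dX is the same weighted sum.
    return grad_label(x) + lam * grad_smooth(x)

x = np.arange(16.0).reshape(4, 4)
g = grad_total(x, grad_label=lambda x: np.zeros_like(x))
```

Each finite difference contributes equal and opposite amounts to its two pixels, so the smoothness gradient always sums to zero over the image.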

14 Calculating P_label [diagram: feed forward — the current estimate X is passed through the network, yielding a probability for each digit 0..9, X's classification]

15 Minimizing the Cost [diagram: CNN(X) is compared with the expected classification (100% on the desired label, 0% elsewhere) to create the penalty; the derivative of the penalty is then taken back through the network to the image pixels]

16 Minimizing the Cost [diagram: after the gradient steps on the image, CNN(X) assigns a higher probability to the desired label — a better classification]

17 Initialization (1) Since we use the gradient descent method, there is uncertainty about which minimum we will reach. To improve our chances of converging to a meaningful minimum that relates to our inpainting task, we should initialize the missing parts wisely.

18 Initialization (2) The initialization methods we have explored include: (1) completing the missing parts with the image after reduction and enlargement — in this way we create a diffusion of the hole's boundaries into the missing parts (this initialization worked best for digit inpainting); (2) completing the missing parts with the average image of the desired classification — this gives the algorithm a good starting guess. [figure: the two initialization strategies, shown next to the corrupted image to inpaint]
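Initialization (1) can be sketched as block-average downscaling followed by upscaling, so the hole's boundary values bleed into the missing region. This is a minimal numpy sketch under that assumption; the project may use a different resampling method:

```python
import numpy as np

def diffuse_init(y, mask, factor=4):
    """Initialization (1): shrink the image by block-averaging and blow
    it back up, diffusing the hole's boundaries into the missing
    region; the blurred result is pasted into the hole only."""
    h, w = y.shape
    small = y.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))
    blurred = small.repeat(factor, axis=0).repeat(factor, axis=1)
    x0 = y.copy()
    x0[mask] = blurred[mask]   # known pixels keep their exact values
    return x0

# Corrupted image: a smooth ramp with a zeroed-out hole in the middle.
y = np.linspace(0.0, 1.0, 64).reshape(8, 8)
mask = np.zeros((8, 8), dtype=bool)
mask[3:5, 3:5] = True
y[mask] = 0.0
x0 = diffuse_init(y, mask)
```

The hole is filled with local averages of its surroundings while every known pixel is left untouched.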

19 Failure Cases: Past We noticed that for some images the algorithm does not successfully fill the hole: the network does not memorize the small pieces of the image content. That is, the network cannot add content to the corrupted regions because it has not assimilated the visual characteristics of each label.

20 Failure Cases: Old Solution
The problem: we noticed that for some images the algorithm does not successfully fill the hole. Solution: instead of forcing the image to be classified by the network, we force the image to be close to the representation of another image of the same class from the database. [A. Vedaldi et al. ('16)] [L. Gatys et al. ('15)]

21 Results – Post solution
The first row displays the corrupted images; the second row shows the inpainting results. These examples are taken from the ImageNet and MNIST datasets.

22 Past Suggestions These are the main ideas and comments we received after the mid-semester presentation: (1) train the network on small parts of images; (2) inpaint by all labels and take the smallest energy as the best inpainting. These comments made us rethink how to improve the results. We also reviewed the literature and found similar elements between these recommendations and the works we read.

23 Related Work Interesting work has been done recently on solving the inpainting problem with neural network approaches. Here are some examples:
N. Cai et al., "Blind inpainting using the fully convolutional neural network", Springer, 2015.
J. Xie, L. Xu, and E. Chen, "Image denoising and inpainting with deep neural networks", NIPS, 2012.
D. Pathak, P. Krähenbühl, J. Donahue, T. Darrell, and A. Efros, "Context Encoders: Feature Learning by Inpainting", 2016.
R. A. Yeh, C. Chen, T. Y. Lim, A. G. Schwing, M. Hasegawa-Johnson, and M. N. Do, "Semantic Image Inpainting with Deep Generative Models", 2017.

24 Results – MNIST dataset
The images in the first row are the original images from the database. The images in the second row are the original images above with different types of holes. The images in the third row are the results of the inpainting.

25 Results - Yale B dataset
The images in the top row are the originals from the database, the second row shows the corrupted images, and the bottom row displays the inpainting results.

26 Special Cases (1) - Visualization
In this section we present and discuss some special and interesting experiments involving our inpainting algorithm. These experiments do not involve changes to the network or to the algorithm. A special case of image inpainting, where all the data is lost, is called illusion.

27 Special Cases (2) – Inpainting by all labels
We tried to inpaint a given image as every label and, at the end, defined the best inpainting result to be the one that led to the lowest energy. The right-most image is the corrupted image and the others are the best inpainting results. To our surprise, when we performed this experiment, we received the best inpainting (according to the criterion we defined) for a different label than the original one (as shown in the figure).
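The all-labels experiment is a simple argmin over the per-label energies. A minimal sketch, where `inpaint` and `energy` are hypothetical stand-ins for the project's optimizer and cost function:

```python
import numpy as np

def best_label_inpainting(y, mask, inpaint, energy, n_labels=10):
    """Run the inpainting once per label and keep the result whose
    final energy is lowest, as described on this slide."""
    best = None
    for label in range(n_labels):
        x = inpaint(y, mask, label)
        e = energy(x, label)
        if best is None or e < best[0]:
            best = (e, label, x)
    return best  # (energy, winning label, inpainted image)

# Toy demo: "inpainting" fills the hole with the label value, and the
# "energy" prefers fills close to 3, so label 3 should win.
y = np.zeros((4, 4))
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
fill = lambda img, m, lab: np.where(m, float(lab), img)
cost = lambda x, lab: float(((x[mask] - 3.0) ** 2).sum())
e_best, label_best, x_best = best_label_inpainting(y, mask, fill, cost)
```

Note that the winning label is whatever minimizes the energy, which is exactly why the experiment can prefer a label other than the original one.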

28 Special Cases (3) - illumination
We noticed during the experiments with the Yale Extended B database that many images were taken in the dark. The question then arises: could these images be illuminated with the assistance of our algorithm? We discovered that it is possible, by defining the missing parts of the image as all the pixels that fall below a specific threshold. One can see that all the details of the original image remain after the inpainting; that is, all we did was add information to the dark parts.
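Building the mask for the illumination experiment is a one-line thresholding operation. A minimal sketch, assuming pixel intensities normalized to [0, 1] (the threshold value here is illustrative):

```python
import numpy as np

def dark_mask(image, threshold):
    """Treat every pixel darker than the threshold as 'missing', so the
    inpainting algorithm is asked to hallucinate the dark regions."""
    return image < threshold

img = np.array([[0.05, 0.80],
                [0.90, 0.02]])
mask = dark_mask(img, threshold=0.1)
```

The resulting boolean mask is then passed to the same optimizer as a hole mask, with no other changes to the algorithm.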

29 Conclusion Our solution is not based on a network that has been taught to inpaint. Instead, we present the concept of using a classification-oriented network to solve a completely different problem based on the data it has learned.

30 An alternative – modify P_label
Instead of forcing the image to be classified by the network, we force the image to be close to the representation of another image of the same class from the database (matching activations at a deep layer, e.g. layer 8, or layer n in general).
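The modified penalty compares deep representations rather than class scores. A minimal sketch, where a fixed random linear map stands in for the truncated network (`features` is hypothetical; the project uses the activations of a deep layer of the pre-trained CNN):

```python
import numpy as np

def feature_penalty(x, x_ref, features):
    """P_feature: squared distance between the network representation
    of the current estimate and that of a same-class reference image."""
    fx = features(x)
    fr = features(x_ref)
    return float(((fx - fr) ** 2).sum())

# Toy 'layer': a fixed random linear map playing the role of the
# truncated classification network.
rng = np.random.default_rng(0)
A = rng.standard_normal((16, 64))
feat = lambda img: A @ img.ravel()
x = rng.standard_normal((8, 8))
p_same = feature_penalty(x, x, feat)
p_diff = feature_penalty(x, rng.standard_normal((8, 8)), feat)
```

The penalty vanishes exactly when the representations match and grows as they diverge, giving a smoother target than a hard classification constraint.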

31 Result – P_feature [figure: the holed image, the reference image and its representation, and the results after the old algorithm vs. after the modified algorithm]

