Convolutional Neural Networks for Direct Text Deblurring

Slides:

Advertisements

Similar presentations

Removing blur due to camera shake from images. William T. Freeman Joint work with Rob Fergus, Anat Levin, Yair Weiss, Fredo Durand, Aaron Hertzman, Sam.

Advertisements

Classification spotlights

Lecture 5: CNN: Regularization

ImageNet Classification with Deep Convolutional Neural Networks

Patch-based Image Deconvolution via Joint Modeling of Sparse Priors Chao Jia and Brian L. Evans The University of Texas at Austin 12 Sep

Unnatural L 0 Representation for Natural Image Deblurring Speaker: Wei-Sheng Lai Date: 2013/04/26.

Fast Removal of Non-uniform Camera Shake By Michael Hirsch, Christian J. Schuler, Stefan Harmeling and Bernhard Scholkopf Max Planck Institute for Intelligent.

Rob Fergus Courant Institute of Mathematical Sciences New York University A Variational Approach to Blind Image Deconvolution.

04/12/10SIAM Imaging Science Superresolution and Blind Deconvolution of Images and Video Institute of Information Theory and Automation Academy of.

Our output Blur kernel. Close-up of child Our output Original photograph.

Spatial Pyramid Pooling in Deep Convolutional

1 Patch Complexity, Finite Pixel Correlations and Optimal Denoising Anat Levin, Boaz Nadler, Fredo Durand and Bill Freeman Weizmann Institute, MIT CSAIL.

Chapter 7 Case Study 1: Image Deconvolution. Different Types of Image Blur Defocus blur --- Depth of field effects Scene motion --- Objects in the scene.

Yu-Wing Tai, Hao Du, Michael S. Brown, Stephen Lin CVPR’08 (Longer Version in Revision at IEEE Trans PAMI) Google Search: Video Deblurring Spatially Varying.

Vincent DeVito Computer Systems Lab The goal of my project is to take an image input, artificially blur it using a known blur kernel, then.

Deep Convolutional Nets

Learning Features and Parts for Fine-Grained Recognition Authors: Jonathan Krause, Timnit Gebru, Jia Deng, Li-Jia Li, Li Fei-Fei ICPR, 2014 Presented by:

Ghent University Pattern recognition with CNNs as reservoirs David Verstraeten 1 – Samuel Xavier de Souza 2 – Benjamin Schrauwen 1 Johan Suykens 2 – Dirk.

Vincent DeVito Computer Systems Lab The goal of my project is to take an image input, artificially blur it using a known blur kernel, then.

Convolutional Neural Network

Philipp Gysel ECE Department University of California, Davis

Gaussian Conditional Random Field Network for Semantic Segmentation

Deep Learning Overview Sources: workshop-tutorial-final.pdf

Lecture 4b Data augmentation for CNN training

Understanding Convolutional Neural Networks for Object Recognition

Yann LeCun Other Methods and Applications of Deep Learning Yann Le Cun The Courant Institute of Mathematical Sciences New York University

Learning to Compare Image Patches via Convolutional Neural Networks SERGEY ZAGORUYKO & NIKOS KOMODAKIS.

Cancer Metastases Classification in Histological Whole Slide Images

Deeply-Recursive Convolutional Network for Image Super-Resolution

Deep Learning for Dual-Energy X-Ray

Learning to Compare Image Patches via Convolutional Neural Networks

Analysis of Sparse Convolutional Neural Networks

CS 388: Natural Language Processing: LSTM Recurrent Neural Networks

CS 6501: 3D Reconstruction and Understanding Convolutional Neural Networks Connelly Barnes.

Computer Science and Engineering, Seoul National University

Convolutional Neural Fabrics by Shreyas Saxena, Jakob Verbeek

Image Deblurring and noise reduction in python

A Neural Approach to Blind Motion Deblurring

Deconvolution , , Computational Photography

Part-Based Room Categorization for Household Service Robots

Ajita Rattani and Reza Derakhshani,

Training Techniques for Deep Neural Networks

Efficient Deep Model for Monocular Road Segmentation

Policy Compression for MDPs

Image Deblurring Using Dark Channel Prior

CS6890 Deep Learning Weizhen Cai

Adversarially Tuned Scene Generation

A brief introduction to neural network

Computer Vision James Hays

Introduction to Neural Networks

Deep CNN of JPEG 2000 電信所R 林俊廷.

Counting in Dense Crowds using Deep Learning

Convolutional Neural Networks

Deep Learning Hierarchical Representations for Image Steganalysis

Dog/Cat Classifier Christina Stiff.

Very Deep Convolutional Networks for Large-Scale Image Recognition

Smart Robots, Drones, IoT

Single Image Rolling Shutter Distortion Correction

Visualizing and Understanding Convolutional Networks

RCNN, Fast-RCNN, Faster-RCNN

Lecture 2: Image filtering

Course Recap and What’s Next?

Department of Computer Science Ben-Gurion University of the Negev

Deep Object Co-Segmentation

Deep Learning Libraries

CS295: Modern Systems: Application Case Study Neural Network Accelerator Sang-Woo Jun Spring 2019 Many slides adapted from Hyoukjun Kwon‘s Gatech “Designing.

VERY DEEP CONVOLUTIONAL NETWORKS FOR LARGE-SCALE IMAGE RECOGNITION

Deblurring Shaken and Partially Saturated Images

Example of training and deployment of deep convolutional neural networks. Example of training and deployment of deep convolutional neural networks. During.

Presented By: Firas Gerges (fg92)

Presentation transcript:

Convolutional Neural Networks for Direct Text Deblurring Michal Hradiš, Jan Kotera, Pavel Zemčík, Filip Šroubek {ihradis, zemcik}@fit.vutbr.cz {kotera, sroubekf}@utia.cas.cz

Blind deconvolution with CNNs

Blind deconvolution with CNNs Predict sharp image directly from a blurry input. Why? Hard problem – can CNNs do it? Improve blind-deconvolution methods. Mobile devices: improve readability and legibility, preprocessing for OCR Pre-treaining for OCR networks Teaser and very short overview. Take blurry image run it through a CNN – get a high quality image Why? Scientifically important by itself Direct applications: human readability, OCR preprocessing, OCR pretreaining

Classical blind deconvolution 𝒙= 𝐚𝐫𝐠𝐦𝐢𝐧 𝒚 −𝒙 ∗ 𝒌 +𝒓 𝒙 𝒙= 𝐚𝐫𝐠𝐦𝐢𝐧 − ∗ +𝒓 Shortly introduce blind deconvolution Why is the classical approach problematic inverting the imaging process Weak priors – priors and imaging model separate JPEG compression, RGB color mask, saturation, motion blur, rolling shutter, processing inside a device, …

Classical blind deconvolution 𝒙= 𝐚𝐫𝐠𝐦𝐢𝐧 𝒚 −𝒙 ∗ 𝒌 +𝒓 𝒙 𝒙= 𝐚𝐫𝐠𝐦𝐢𝐧 − ∗ +𝒓 Hard to model and invert space-variant blur, structured noise, saturation, color aberration, JPEG compression, gamma Weak image priors Shortly introduce blind deconvolution Why is the classical approach problematic inverting the imaging process Weak priors – priors and imaging model separate JPEG compression, RGB color mask, saturation, motion blur, rolling shutter, processing inside a device, …

Learn deconvolution function from data Simulate space-variant blur, noise, saturation, color aberration, JPEG compression, gamma Update 𝒙=𝒇(𝒚, 𝜽) 𝒙= 𝐚𝐫𝐠𝐦𝐢𝐧 𝒚 −𝒙 ∗ 𝒌 +𝒓 𝒙

CNN in image restoration Denoising Jain and Seung: Natural Image Denoising with Convolutional Networks. NIPS 2009. Structured noise removal Eigen et al.: Restoring an Image Taken through a Window Covered with Dirt or Rain. ICCV, 2013. Non-blind deconvolution Li Xu et al.: Deep Convolutional Neural Network for Image Deconvolution. NIPS 2014. Super resolution Dong et al.: Learning a Deep Convolutional Network for Image Super-Resolution. ECCV 2014. Blind deconvolution (sub-task) Schuler et al.: Learning to Deblur. CoRR, abs/1406.7, 2014.

Our CNN Based on (Krizhevsky et al., 2012) and many others Only convolutions and ReLU Fully convolutional networks No pooling = output has the same resolution as input

Our CNN Based on (Krizhevsky et al., 2012) and many others Only convolutions and ReLU Fully convolutional networks No pooling = output has the same resolution as input

Stochastic Gradient Descent with momentum Training Stochastic Gradient Descent with momentum Objective = Mean Squared Error Random weight initialization Batches of 96 16x16 sharp patches Good balance between efficiency and variability Blurred patches are 50x50–70x70 depending on the spatial support of the network Using modified Caffe framework

50,000 + 2,000 pdf files from CiteSeerX Data 50,000 + 2,000 pdf files from CiteSeerX 200,000 + 8,000 rendered pages at 120-150 DPI 3M + 35,000 image regions Small projective transformations (virtual camera rotations) Out-of-focus blur (disc) Camera shake blur (random walk) Gaussian noise

Data

Results

Large CNN

Results - baselines 100 regions each with single random blur kernel 200x200 px – PSNR evaluated on central 160x160 px 2 blind and 2 non-blind baselines – optimized for each noise level

Results - examples Blurred CNN-L15 Pan non-blind Pan blind

Our results on real images Photo CNN Photo CNN

Our results with non-uniform blur

Our results with non-uniform blur

Computational efficiency L15-CNN has ~2.3M weights (9MB) ~ 2.3M multiply and accumulate per pixel ~ 4 Tflop for 1 Mpx image = ~ 2s on high-end GPU 5s in our sub-optimal implementation 1 Mpx image needs 1.3 GB for activations of the largest layers (320 channels) split the image if this is an issue

More efficient CNN structure Train on real images Future work Other data domains License plates Specific object categories (cars) More efficient CNN structure e.g. Long et al.: Fully Convolutional Networks for Semantic Segmentation Train on real images Dataset Shift invariant objective function?

Non-rigid registration New Dataset >2,000 documents/pages, >15,000 photos, 25 mobile devices Marker-based page localization Non-rigid registration with template Rectification

Thanks' for your attention This research is supported by the ARTEMIS joint undertaking under grant agreement no 641439 (ALMARVI), GACR agency project GA13-29225S. Jan Kotera was also supported by GA UK project 938213/2013, Faculty of Mathematics and Physics, Charles University in Prague.