Convolutional Neural Networks for Direct Text Deblurring


Michal Hradiš, Jan Kotera, Pavel Zemčík, Filip Šroubek
{ihradis, zemcik}@fit.vutbr.cz, {kotera, sroubekf}@utia.cas.cz

Blind deconvolution with CNNs
- Predict the sharp image directly from a blurry input.
- Why?
- Hard problem: can CNNs do it?
- Improve blind-deconvolution methods.
- Mobile devices: improve readability and legibility, preprocessing for OCR.
- Pre-training for OCR networks.
Notes: Teaser and very short overview. Take a blurry image, run it through a CNN, and get a high-quality image. Why? It is scientifically important by itself, and it has direct applications: human readability, OCR preprocessing, OCR pre-training.

Classical blind deconvolution
$\hat{x} = \arg\min_{x} \|y - x * k\|_2^2 + r(x)$
(y: observed blurry image, x: sharp image, k: blur kernel, r: image prior)
- Hard to model and invert: space-variant blur, structured noise, saturation, color aberration, JPEG compression, gamma.
- Weak image priors.
Notes: Briefly introduce blind deconvolution. Why is the classical approach problematic? It inverts the imaging process and relies on weak priors, with the priors and the imaging model kept separate. Real images involve JPEG compression, RGB color mask, saturation, motion blur, rolling shutter, processing inside a device, …

Learn deconvolution function from data
- Simulate space-variant blur, noise, saturation, color aberration, JPEG compression, gamma.
- Replace the optimization $\hat{x} = \arg\min_{x} \|y - x * k\|_2^2 + r(x)$ with a learned function $\hat{x} = f(y, \theta)$.
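One way to write the learning problem behind f(y, θ) is sketched below. It assumes a training set of N simulated pairs (y_i, x_i) of blurry and sharp patches and uses the mean squared error objective named on the training slide. The symbols N, i and θ* are introduced here for illustration and do not appear in the slides.

```latex
\theta^{*} = \arg\min_{\theta} \frac{1}{N} \sum_{i=1}^{N} \left\| f(y_i, \theta) - x_i \right\|_2^2,
\qquad
\hat{x} = f(y, \theta^{*})
```

Under this reading, the blur kernel k and the prior r(x) never appear explicitly: whatever the network needs to know about them has to be absorbed into θ during training on simulated degradations.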

CNN in image restoration
- Denoising: Jain and Seung: Natural Image Denoising with Convolutional Networks. NIPS 2009.
- Structured noise removal: Eigen et al.: Restoring an Image Taken through a Window Covered with Dirt or Rain. ICCV 2013.
- Non-blind deconvolution: Li Xu et al.: Deep Convolutional Neural Network for Image Deconvolution. NIPS 2014.
- Super resolution: Dong et al.: Learning a Deep Convolutional Network for Image Super-Resolution. ECCV 2014.
- Blind deconvolution (sub-task): Schuler et al.: Learning to Deblur. CoRR, abs/1406.7, 2014.

Our CNN
- Based on (Krizhevsky et al., 2012) and many others
- Only convolutions and ReLU
- Fully convolutional network
- No pooling = output has the same resolution as input
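The structure described above (only convolutions and ReLU, no pooling) can be illustrated with a minimal single-channel sketch in plain Python. It is not the authors' network: the zero padding, the linear last layer, the function names and the kernel values are all assumptions chosen to show that the output keeps the input resolution.

```python
def conv2d_same(image, kernel):
    """Zero-padded 2D correlation; output has the same height and width as the input.

    Assumes odd kernel sizes so that the kernel has a well-defined center.
    """
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for u in range(kh):
                for v in range(kw):
                    r, c = i + u - ph, j + v - pw
                    if 0 <= r < h and 0 <= c < w:  # pixels outside the image count as zero
                        acc += image[r][c] * kernel[u][v]
            out[i][j] = acc
    return out


def relu(feature_map):
    """Element-wise max(0, v)."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def tiny_fcn(image, kernels):
    """Stack of convolution layers with ReLU in between (assumption: last layer is linear)."""
    x = image
    for index, kernel in enumerate(kernels):
        x = conv2d_same(x, kernel)
        if index < len(kernels) - 1:
            x = relu(x)
    return x


# Hypothetical example: a 5x7 image with a single bright pixel, a 3x3 averaging
# layer followed by a 1x1 scaling layer.
example_image = [[0.0] * 7 for _ in range(5)]
example_image[2][3] = 1.0
example_kernels = [[[1.0 / 9.0] * 3 for _ in range(3)], [[2.0]]]
example_output = tiny_fcn(example_image, example_kernels)
print(len(example_output), len(example_output[0]))
```

Because there is no pooling or striding, the same stack can be applied to an image of any size, which is what makes the network fully convolutional.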

Training
- Stochastic gradient descent with momentum
- Objective = mean squared error
- Random weight initialization
- Batches of 96 sharp patches of 16x16 px: good balance between efficiency and variability
- Blurred patches are 50x50 to 70x70 px, depending on the spatial support of the network
- Using a modified Caffe framework
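The relation between the 16x16 sharp patches and the larger blurred patches can be sketched as follows, assuming stride-1 convolutions without padding, where each layer with a k x k filter shrinks the patch by k - 1 pixels per dimension. The filter-size lists are hypothetical and were picked only to reproduce the two ends of the 50 to 70 px range; they are not the architectures from the talk.

```python
def input_patch_size(output_size, filter_sizes):
    """Input side length needed to produce `output_size` pixels.

    Assumes stride-1 convolutions with no padding, so a k x k filter removes
    k - 1 pixels per dimension.
    """
    return output_size + sum(k - 1 for k in filter_sizes)


# Hypothetical filter sizes (illustration only).
shallow_filters = [19, 1, 3, 5, 5, 7]
deep_filters = [19, 1, 3, 5, 5, 7, 7, 7, 9]

size_shallow = input_patch_size(16, shallow_filters)
size_deep = input_patch_size(16, deep_filters)
print(size_shallow, size_deep)
```

The sum of k - 1 over all layers is the extra spatial support of the network, which is why networks with larger or more filters need larger blurred patches for the same 16x16 target.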

Data
- 50,000 + 2,000 PDF files from CiteSeerX
- 200,000 + 8,000 rendered pages at 120-150 DPI
- 3M + 35,000 image regions
- Small projective transformations (virtual camera rotations)
- Out-of-focus blur (disc)
- Camera shake blur (random walk)
- Gaussian noise
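The degradations listed above can be sketched with simple stand-ins in plain Python: a disc kernel for out-of-focus blur, a kernel traced by a random walk for camera shake, and additive Gaussian noise. The function names, the four-neighbor walk and all parameter values are assumptions for illustration; the slides do not specify how the authors generated their kernels.

```python
import random


def normalize(kernel):
    """Scale a kernel so that its entries sum to 1 (preserves image brightness)."""
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


def disc_kernel(radius):
    """Out-of-focus blur stand-in: uniform disc of the given integer radius."""
    size = 2 * radius + 1
    c = radius
    kernel = [[1.0 if (i - c) ** 2 + (j - c) ** 2 <= radius ** 2 else 0.0
               for j in range(size)] for i in range(size)]
    return normalize(kernel)


def random_walk_kernel(size, steps, rng):
    """Camera-shake stand-in: accumulate the path of a 4-neighbor random walk."""
    kernel = [[0.0] * size for _ in range(size)]
    r = c = size // 2
    for _ in range(steps):
        kernel[r][c] += 1.0
        dr, dc = rng.choice([(-1, 0), (1, 0), (0, -1), (0, 1)])
        r = min(max(r + dr, 0), size - 1)  # keep the walk inside the kernel
        c = min(max(c + dc, 0), size - 1)
    return normalize(kernel)


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma to every pixel."""
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in image]


rng = random.Random(0)  # fixed seed so the example is repeatable
defocus = disc_kernel(2)
shake = random_walk_kernel(size=9, steps=30, rng=rng)
noisy = add_gaussian_noise([[0.5] * 4 for _ in range(4)], sigma=0.01, rng=rng)
```

Convolving a rendered page region with such a kernel and then adding noise gives one blurred training input, with the rendered region itself as the sharp target.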

Data

Results

Large CNN

Results - baselines
- 100 regions, each with a single random blur kernel
- Regions of 200x200 px, PSNR evaluated on the central 160x160 px
- 2 blind and 2 non-blind baselines, optimized for each noise level
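The evaluation protocol above (PSNR on the central 160x160 px of a 200x200 px region) can be sketched as below. Pixel values in [0, 1] and the helper names are assumptions. Cropping the border keeps boundary artifacts of the deconvolution out of the score.

```python
import math


def central_crop(image, size):
    """Return the centered size x size window of a 2D list."""
    h, w = len(image), len(image[0])
    top, left = (h - size) // 2, (w - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB (assumption: pixel range is [0, peak])."""
    squared_error = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        for ref_value, est_value in zip(ref_row, est_row):
            squared_error += (ref_value - est_value) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0.0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


def central_psnr(reference, estimate, crop=160, peak=1.0):
    """PSNR restricted to the central crop x crop window of both images."""
    return psnr(central_crop(reference, crop), central_crop(estimate, crop), peak)
```

For a 200x200 input, `central_crop(image, 160)` discards a 20 px border on every side.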

Results - examples (panels: Blurred, CNN-L15, Pan non-blind, Pan blind)

Our results on real images (panels: Photo, CNN)

Our results with non-uniform blur

Computational efficiency
- L15-CNN has ~2.3M weights (9 MB)
- ~2.3M multiply-accumulate operations per pixel
- ~4 Tflop for a 1 Mpx image = ~2 s on a high-end GPU
- 5 s in our sub-optimal implementation
- A 1 Mpx image needs 1.3 GB for the activations of the largest layers (320 channels); split the image if this is an issue
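The figures above can be checked with back-of-the-envelope arithmetic. The sketch assumes 32-bit floats (4 bytes per value) and counts one multiply-accumulate as two floating-point operations. Both are assumptions not stated on the slide, and the function names are illustrative.

```python
BYTES_PER_FLOAT = 4  # assumption: float32 weights and activations


def weights_megabytes(num_weights):
    """Storage for the weights in MB."""
    return num_weights * BYTES_PER_FLOAT / 1e6


def total_flops(macs_per_pixel, num_pixels):
    """Floating-point operations, counting one multiply-accumulate as 2 flops."""
    return 2 * macs_per_pixel * num_pixels


def activation_gigabytes(channels, num_pixels):
    """Memory for one layer's activations at full image resolution, in GB."""
    return channels * num_pixels * BYTES_PER_FLOAT / 1e9


weight_mb = weights_megabytes(2.3e6)          # about 9.2 MB
flops = total_flops(2.3e6, 1e6)               # about 4.6e12, i.e. roughly 4 Tflop
activation_gb = activation_gigabytes(320, 1e6)  # about 1.28 GB
print(weight_mb, flops, activation_gb)
```

Under these assumptions the numbers land close to the rounded values on the slide, and the activation estimate shows why splitting a large image into tiles reduces memory roughly in proportion to the tile area.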

Future work
- Other data domains: license plates, specific object categories (cars)
- More efficient CNN structure, e.g. Long et al.: Fully Convolutional Networks for Semantic Segmentation
- Train on real images: dataset, shift-invariant objective function?

New dataset
- >2,000 documents/pages, >15,000 photos, 25 mobile devices
- Marker-based page localization
- Non-rigid registration with template
- Rectification

Thanks for your attention
This research is supported by the ARTEMIS joint undertaking under grant agreement no. 641439 (ALMARVI) and by GACR agency project GA13-29225S. Jan Kotera was also supported by GA UK project 938213/2013, Faculty of Mathematics and Physics, Charles University in Prague.