Compressive Image Recovery using Recurrent Generative Model
Akshat Dave, Anil Kumar Vadathya, Kaushik Mitra
Computational Imaging Lab, IIT Madras, India
Single Pixel Camera
The single pixel camera (SPC, Rice DSP Lab) uses a single photon detector to acquire a series of random projections of the scene:
$y = \Phi x$, where $x \in \mathbb{R}^N$ is the vectorized $\sqrt{N} \times \sqrt{N}$ scene, $\Phi \in \mathbb{R}^{M \times N}$ is the measurement matrix, and $y \in \mathbb{R}^M$ are the measurements.
Since $M \ll N$, solving for $x$ is ill posed, so priors are needed: $\arg\min_x \psi(x)$ s.t. $y = \Phi x$.
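As a concrete illustration, a minimal sketch of the SPC forward model, assuming a random Gaussian measurement matrix and a synthetic scene (both stand-ins, not the hardware's actual DMD patterns):

```python
import numpy as np

# Minimal sketch of SPC sensing: a random measurement matrix Phi
# compresses an N-pixel image into M << N scalar measurements.
rng = np.random.default_rng(0)

n_side = 32                    # sqrt(N): image is n_side x n_side
N = n_side * n_side
M = int(0.25 * N)              # 25% measurement rate

x = rng.random(N)              # stand-in for the vectorized scene
Phi = rng.standard_normal((M, N)) / np.sqrt(M)   # random projections
y = Phi @ x                    # y in R^M; with M << N, recovery is ill posed
```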
Analytic Priors for SPC Reconstruction
Traditional solutions pose recovery as $\arg\min_x \|\psi(x)\|_1$ s.t. $y = \Phi x$, using total variation (TVAL3) and other sparsity-based priors.
Challenge: reconstruction quality degrades at low measurement rates (compare 40% vs. 10% reconstructions).
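For reference, a generic ISTA sketch of sparsity-based recovery; it solves an unconstrained $\ell_1$ variant of the problem above and is not the TVAL3 algorithm itself:

```python
import numpy as np

def ista(y, Phi, lam=0.05, step=None, iters=200):
    """Generic ISTA for argmin_x 0.5*||y - Phi x||^2 + lam*||x||_1.
    A stand-in for analytic sparsity priors, not the TVAL3 solver."""
    M, N = Phi.shape
    if step is None:
        step = 1.0 / np.linalg.norm(Phi, 2) ** 2   # 1/L, L = ||Phi||_2^2
    x = np.zeros(N)
    for _ in range(iters):
        grad = Phi.T @ (Phi @ x - y)               # gradient of the data term
        z = x - step * grad
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return x
```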
Deep Learning for Compressive Imaging
ReconNet by Kulkarni et al. (CVPR 2016); DR2-Net by Yao et al.
These discriminative networks are task specific (e.g., trained for a fixed M.R. of 10%) and do not account for global multiplexing.
Solution
Analytic priors suffer at low measurement rates; deep discriminative models are task specific and do not account for global multiplexing.
Our solution: use data-driven priors based on deep learning. Specifically, we use a recurrent generative model, RIDE (Theis et al., NIPS 2015).
Given measurements $y = Ax + n$, we recover $x$ by MAP estimation: $\arg\max_x p(x \mid y)$, where $p(x \mid y) \propto p(y \mid x)\, p(x)$ (likelihood $\times$ prior).
Example reconstruction: analytic prior 24.70 dB vs. data-driven prior 29.68 dB.
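In code, the MAP objective splits into the two terms above; a minimal sketch assuming Gaussian measurement noise, with log_prior a hypothetical callable standing in for any learned image prior (e.g., RIDE):

```python
import numpy as np

# MAP objective: log p(x|y) = log p(y|x) + log p(x) + const.
# log_prior is a placeholder for a data-driven prior, not RIDE itself.
def log_posterior(x, y, A, log_prior, sigma=0.1):
    log_likelihood = -np.sum((y - A @ x) ** 2) / (2 * sigma ** 2)
    return log_likelihood + log_prior(x)
```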
Deep Generative Models
Autoregressive models: factorize $p(x) = \prod_i p(x_i \mid x_{<i})$; e.g., generating the sentence "<START> It was raining" token by token via $p(x_1)$, $p(x_2 \mid x_1)$, $p(x_3 \mid x_{1:2})$, and so on.
GANs: a generative neural net with parameters $\theta$ maps unit-Gaussian noise $z$ to a generated distribution $\hat{p}(x)$; the loss drives $\hat{p}(x)$ toward the true data distribution $p(x)$.
VAEs: an encoder maps an image $I$ to a code distribution $q(z \mid x)$, a decoder reconstructs $\hat{I}$ from the code, and training penalizes $KL(q(z \mid x), p(z))$ against the prior $p(z)$.
Autoregressive Models
Factorize the distribution: $p(\mathbf{x}) = \prod_{ij} p(x_{ij} \mid x_{<ij})$. Each factor $p(x_{ij} \mid x_{<ij})$ is a sequence-prediction problem. (Figure courtesy: Oord et al.)
Training minimizes the negative log-likelihood: $-\log p(\mathbf{x}) = -\sum_{ij} \log p(x_{ij} \mid x_{<ij}, \theta_{ij}) = -\sum_{ij} \log p(x_{ij} \mid x_{<ij}, \theta)$ with shared parameters $\theta$, i.e., $\arg\min_\theta -\sum_n \log p(\mathbf{x}^{(n)}; \theta)$ over training images.
Recent works: RIDE (Theis et al.), PixelRNN/PixelCNN (Oord et al.), PixelCNN++ (Salimans et al.).
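To make the NLL concrete, a toy sketch assuming each conditional is Gaussian with a mean predicted from the causal neighborhood; neighbor_mean is a hypothetical stand-in for the learned conditional model:

```python
import numpy as np

# Toy autoregressive NLL: -log p(x) = -sum_ij log p(x_ij | x_<ij),
# with each conditional modeled as a Gaussian around a predicted mean.
def autoregressive_nll(img, predict_mean, sigma=0.1):
    H, W = img.shape
    nll = 0.0
    for i in range(H):
        for j in range(W):
            mu = predict_mean(img, i, j)      # must depend on x_<ij only
            nll += 0.5 * ((img[i, j] - mu) / sigma) ** 2 \
                   + np.log(sigma * np.sqrt(2 * np.pi))
    return nll

# Trivial causal predictor: mean of left and top neighbors (zero padded).
def neighbor_mean(img, i, j):
    left = img[i, j - 1] if j > 0 else 0.0
    top = img[i - 1, j] if i > 0 else 0.0
    return 0.5 * (left + top)
```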
Autoregressive Models: Why RIDE?
Among these models (RIDE by Theis et al., PixelRNN/PixelCNN by Oord et al., PixelCNN++ by Salimans et al.), we choose RIDE because it models a continuous distribution over pixel intensities and is simple and easy to train.
Deep Autoregressive Models
Recurrent Image Density Estimator (RIDE), Theis et al. 2015. (Pictures courtesy: Theis et al.)
$p(\mathbf{x}) = \prod_{ij} p(x_{ij} \mid x_{<ij})$. In the Markovian case, each factor is a mixture over components $c$: $p(x_{ij} \mid x_{<ij}, \theta) = \sum_c p(c \mid x_{<ij})\, p(x_{ij} \mid c, x_{<ij}, \theta)$.
RIDE replaces the fixed causal neighborhood with a recurrent neural network whose hidden state summarizes all previous pixels: $h_{ij} = f(x_{<ij}, h_{i-1,j}, h_{i,j-1})$.
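A toy illustration of the 2D recurrence above, using a plain tanh cell with random placeholder weights; RIDE's actual cell is a spatial LSTM, which this sketch is not:

```python
import numpy as np

# Toy 2D recurrence h_ij = f(x_<ij, h_{i-1,j}, h_{i,j-1}).
def spatial_recurrence(img, d=16, seed=0):
    rng = np.random.default_rng(seed)
    Wx = rng.standard_normal(d) * 0.1             # input weights
    Wu = rng.standard_normal((d, d)) * 0.1        # weights for state from above
    Wl = rng.standard_normal((d, d)) * 0.1        # weights for state from the left
    H, W = img.shape
    h = np.zeros((H + 1, W + 1, d))               # zero state on the border
    for i in range(H):
        for j in range(W):
            x_prev = img[i, j - 1] if j > 0 else 0.0   # causal input from x_<ij
            h[i + 1, j + 1] = np.tanh(Wx * x_prev
                                      + h[i, j + 1] @ Wu
                                      + h[i + 1, j] @ Wl)
    return h[1:, 1:]                              # h[i, j] summarizes x_<ij
```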
Deep Autoregressive Models
Samples from RIDE trained on CIFAR data (Theis et al., 2015). (Picture courtesy: Theis et al.)
RIDE for SPC Reconstruction
Instead of the analytic formulation $\arg\min_x \|\psi(x)\|_1$ s.t. $y = \Phi x$, we use RIDE as an image prior: $\arg\max_x p(x)$ s.t. $y = \Phi x$. The prior is data driven, recurrent (so it can handle global multiplexing), and flexible, unlike discriminative models.
With RIDE as the prior, each iteration $t$ alternates a gradient-ascent step with a projection onto the measurement constraint:
Gradient ascent: $x^t = x^{t-1} + \eta\, \nabla_{x^{t-1}} p(x)$
Projection: $x^t = x^t - \Phi^T (\Phi \Phi^T)^{-1} (\Phi x^t - y)$
(Example reconstruction shown at 30% M.R.)
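Putting the two steps together, a minimal sketch of the reconstruction loop; grad_log_prior is a placeholder for the gradient supplied by RIDE (in practice one ascends $\log p(x)$, whose gradient points in the same direction as $\nabla p(x)$ up to a positive scaling):

```python
import numpy as np

# Sketch of the loop: gradient ascent on the prior, then projection
# onto the affine set {x : Phi x = y}.
def spc_reconstruct(y, Phi, grad_log_prior, eta=1e-3, iters=100):
    # Precompute the projector onto the measurement constraint.
    PhiT_pinv = Phi.T @ np.linalg.inv(Phi @ Phi.T)
    x = PhiT_pinv @ y                         # least-norm initialization
    for _ in range(iters):
        x = x + eta * grad_log_prior(x)       # ascend the prior
        x = x - PhiT_pinv @ (Phi @ x - y)     # project back onto Phi x = y
    return x
```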
RIDE for Image Inpainting
With 80% of the pixels missing: multiscale KSVD (Mairal et al.) recovers 21.21 dB; RIDE recovers 22.07 dB.
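Inpainting fits the same framework if the measurement operator simply selects the observed pixels; a sketch of that selection matrix, reusable with the hypothetical spc_reconstruct loop above:

```python
import numpy as np

# Inpainting as a special case of y = Phi x: Phi has one one-hot row
# per observed pixel, so the same prior-driven loop applies.
def inpainting_operator(mask):
    """mask: boolean array over the image, True where the pixel is observed."""
    N = mask.size
    idx = np.flatnonzero(mask.ravel())
    Phi = np.zeros((idx.size, N))
    Phi[np.arange(idx.size), idx] = 1.0
    return Phi
```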
RIDE for SPC Reconstruction: Comparisons
$128 \times 128$ image at 15% measurements: D-AMP 26.76 dB, TVAL3 24.70 dB, ours 29.68 dB.
RIDE for SPC Reconstruction
Quantitative results: five 128x128 images from the BSDS test set (PSNR in dB / SSIM).

Method   M.R. 40%       M.R. 30%       M.R. 25%       M.R. 15%
         PSNR   SSIM    PSNR   SSIM    PSNR   SSIM    PSNR   SSIM
TVAL3    29.70  0.833   28.68  0.793   27.73  0.759   25.58  0.670
D-AMP    32.54  0.848   29.95  0.800   28.26  0.760   24.02  0.615
Ours     33.71  0.903   31.91  0.862   30.71  0.830   27.11  0.704
RIDE for SPC Reconstruction
Reconstruction at 30% M.R. (PSNR, SSIM): D-AMP 32.21 dB, 0.929; TVAL3 26.16 dB, 0.784; ours 33.82 dB, 0.935.
RIDE for SPC Reconstruction
Original ($256 \times 256$) and reconstructions at 30%, 10%, and 7% M.R.
Real SPC Reconstructions @ 30% M.R.
Compared against the reconstruction from full measurements (PSNR, SSIM): D-AMP 32.31 dB, 0.876; TVAL3 32.95 dB, 0.883; ours 35.87 dB, 0.908.
Real SPC Reconstructions @ 15% M.R.
Compared against the reconstruction from full measurements (PSNR, SSIM): D-AMP 28.34 dB, 0.760; TVAL3 24.68 dB, 0.687; ours 31.12 dB, 0.813.
Summary
Analytic priors suffer at low measurement rates; deep discriminative models are task specific and do not handle global multiplexing.
For SPC reconstruction, we use RIDE as an image prior: data driven, recurrent, and flexible.
On the downside, its sequential nature increases computational cost.
Check arXiv for more details: https://arxiv.org/abs/1612.04229
Code is available on GitHub: https://adaveiitm.github.io/research.html
END