Unrolling: A principled method to develop deep neural networks


Unrolling: A principled method to develop deep neural networks
Chris Metzler, Ali Mousavi, Richard Baraniuk (Rice University)

I have about 30 minutes to present. Hopefully Maarten's talk has motivated the use of deep learning for solving inverse problems. In this talk I'll describe a process by which we can turn general iterative reconstruction algorithms into deep neural networks.

Solving Imaging Inverse Problems
Traditional methods (e.g., ADMM) have well-understood behavior (each iteration refines an estimate), come with convergence guarantees, and use interpretable priors.
Deep neural networks (DNNs) are easy to apply and demonstrate state-of-the-art performance on a variety of imaging tasks, e.g., superhuman performance at classifying the breeds of dogs [1]. However, they are black-box methods that learn complex functions: what happens between layers is an open research question, and their priors are learned from training data.
[1] Chilimbi et al. "Project Adam: Building an Efficient and Scalable Deep Learning Training System." OSDI. Vol. 14. 2014.

Solving Imaging Inverse Problems
Traditional methods (e.g., ADMM): well-understood behavior (each iteration refines an estimate), convergence guarantees, interpretable priors.
DNNs: easy to apply; black-box methods that learn complex functions; what happens between layers is an open research question; priors learned from training data; state-of-the-art performance on a variety of imaging tasks.
Is there a way to combine the strengths of both?

This talk
Describe unrolling, a process to turn an iterative algorithm into a deep neural network that can use training data, is interpretable, and maintains convergence and performance guarantees.
Apply unrolling to the denoising-based AMP algorithm to solve the compressive imaging problem.

Unrolling an algorithm

Example: Iterative Shrinkage/Thresholding
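For concreteness, here is a minimal sketch of the classical ISTA iteration of the kind the following slides unroll. It solves min_x 0.5*||y - Ax||^2 + lam*||x||_1; the step size, regularization weight, and iteration count are illustrative assumptions, not values from the talk.

```python
import numpy as np

def soft_threshold(z, tau):
    """Elementwise soft-thresholding: the proximal operator of the l1 norm."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def ista(y, A, lam=0.1, n_iters=100):
    """ISTA for min_x 0.5*||y - A x||^2 + lam*||x||_1."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1/L, L = largest eigenvalue of A^T A
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        # Gradient step on the data-fidelity term followed by shrinkage.
        x = soft_threshold(x + step * A.T @ (y - A @ x), lam * step)
    return x
```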

The unrolling process [Gregor and LeCun 2010, Kamilov and Mansour 2015, Borgerding and Schniter 2016]

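A rough sketch of what unrolling ISTA into a fixed-depth network looks like, in the spirit of LISTA [Gregor and LeCun 2010]: each iteration becomes a layer with its own learnable matrices and soft-threshold, initialized from the classical update. Written in PyTorch; the depth, initialization, and parameterization are illustrative assumptions, not the exact construction in the references.

```python
import torch
import torch.nn as nn

class UnrolledISTA(nn.Module):
    """Each layer is one ISTA iteration with its own learnable W, S, and threshold."""
    def __init__(self, A, n_layers=10):                      # A: (m, n) float tensor
        super().__init__()
        m, n = A.shape
        step = 1.0 / torch.linalg.matrix_norm(A, 2) ** 2     # classical ISTA step size
        # Initialize every layer from the classical update, then let training adapt them.
        self.W = nn.ParameterList([nn.Parameter(step * A.T.clone()) for _ in range(n_layers)])
        self.S = nn.ParameterList([nn.Parameter(torch.eye(n) - step * (A.T @ A)) for _ in range(n_layers)])
        self.theta = nn.ParameterList([nn.Parameter(torch.tensor(0.01)) for _ in range(n_layers)])

    def forward(self, y):                                    # y: (batch, m)
        x = y.new_zeros(y.shape[0], self.S[0].shape[0])
        for W, S, theta in zip(self.W, self.S, self.theta):
            z = x @ S.T + y @ W.T                            # the (learned) gradient step
            x = torch.sign(z) * torch.relu(z.abs() - theta)  # the (learned) soft threshold
        return x
```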
Training the Network
The network has free parameters; A and A^H can also be treated as parameters to learn. Feed training data, i.e., (x, y) pairs, through the network, calculate errors, and backpropagate the error gradients. [Gregor and LeCun 2010, Kamilov and Mansour 2015, Borgerding and Schniter 2016]
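Continuing the sketch above, training could look like the following; `train_loader`, the epoch count, and the learning rate are hypothetical placeholders, not details from the talk.

```python
import torch
import torch.nn as nn

net = UnrolledISTA(A, n_layers=10)             # A: the known measurement matrix
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(50):
    for x_true, y in train_loader:             # (x, y) training pairs
        x_hat = net(y)                         # forward pass through the unrolled layers
        loss = loss_fn(x_hat, x_true)          # reconstruction error
        optimizer.zero_grad()
        loss.backward()                        # backpropagate error gradients
        optimizer.step()                       # update per-layer W, S, thresholds (and, if desired, A / A^H)
```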

Performance [Borgerding and Schniter 2016]

Unrolling Denoising-based Approximate Message Passing

The Compressive Imaging Problem
Sensor Array (y), Target (x). Every measurement has a cost: $, time, bandwidth, etc. Compressed sensing enables us to recover x with fewer measurements.
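In symbols (my paraphrase, not text from the slide), this is the standard compressed-sensing measurement model:

```latex
y = A x_0 + w, \qquad A \in \mathbb{R}^{m \times n}, \quad m \ll n
```

where x_0 is the image, A is the measurement matrix, and w is noise; because there are far fewer measurements than unknowns, recovery hinges on prior structure in x_0.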

Compressive Imaging Applications
Higher-resolution synthetic aperture radar; higher-resolution seismic imaging; faster medical imaging; low-cost infrared/hyperspectral imaging; low-cost high-speed imaging (example image: a 3000 rpm drill measured at 25 fps). [Baraniuk 2007; Veeraraghavan et al. 2011]

Compressive Imaging Problem
We know that x_0 is a natural image, and thus belongs to the set C of natural images. This set is much smaller than all of R^n, and thus hopefully has a very small intersection with the set of possible solutions. Ideally it intersects the solution set at only a single point, x_0. (Figure: the set of all natural images.)

Traditional Methods vs DNNs
Traditional methods (D-AMP): well-understood behavior, recovery guarantees, noise sensitivity analysis; more accurate but slower; use denoisers to impose priors.
DNNs (Ali's talk): black box (why does it work? when will it fail?); less accurate but much faster; learn priors from training data.
Learned D-AMP gets the strengths of both.

Denoising-based AMP [Metzler et al. 2016]
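A minimal NumPy sketch of the D-AMP iteration from [Metzler et al. 2016]: the denoiser plays the role of the prior, and an Onsager correction term (estimated here with a Monte Carlo divergence) keeps the effective noise at each iteration approximately Gaussian. The denoiser is passed in as a black box with an assumed signature denoise(x, sigma); the iteration count is illustrative.

```python
import numpy as np

def damp(y, A, denoise, n_iters=30):
    """Denoising-based AMP: x <- D_sigma(x + A^H z), with an Onsager-corrected residual z."""
    m, n = A.shape
    x = np.zeros(n)
    z = np.array(y, dtype=float)
    for _ in range(n_iters):
        sigma = np.linalg.norm(z) / np.sqrt(m)        # effective noise level of the pseudo-data
        r = x + A.conj().T @ z                        # pseudo-data: x plus ~Gaussian noise
        x_new = denoise(r, sigma)
        # Monte Carlo estimate of the denoiser's divergence (for the Onsager term).
        eps = sigma / 1000.0 + 1e-12
        eta = np.random.randn(n)
        div = eta @ (denoise(r + eps * eta, sigma) - x_new) / eps
        z = y - A @ x_new + (div / m) * z             # residual with Onsager correction
        x = x_new
    return x
```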

Neural Network (NN) Recovery Methods [Mousavi and Baraniuk 2017]

Our contribution
Unroll D-AMP to form a NN: Learned D-AMP. Efficiently train a 200-layer network. Demonstrate that the proposed algorithm is fast, flexible, and effective: >10x faster than D-AMP, handles arbitrary right-rotationally-invariant matrices, and achieves state-of-the-art recovery accuracy.

Unroll D-AMP

Unroll D-AMP
We need a denoiser that can be trained.

CNN-based Denoiser
Deep convolutional networks are now state-of-the-art image denoisers [Zhang et al. 2016]
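A rough PyTorch sketch of a DnCNN-style residual denoiser in the spirit of [Zhang et al. 2016]; the depth, width, and grayscale input are assumptions for illustration, not the exact network used in Learned D-AMP.

```python
import torch.nn as nn

class DnCNNDenoiser(nn.Module):
    """The network predicts the noise; the denoised image is the input minus that prediction."""
    def __init__(self, depth=17, channels=64):
        super().__init__()
        layers = [nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(channels, channels, 3, padding=1),
                       nn.BatchNorm2d(channels),
                       nn.ReLU(inplace=True)]
        layers.append(nn.Conv2d(channels, 1, 3, padding=1))
        self.body = nn.Sequential(*layers)

    def forward(self, noisy):            # noisy: (batch, 1, H, W)
        return noisy - self.body(noisy)  # residual learning
```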

Learned D-AMP
Place a DNN-based denoiser inside unrolled D-AMP to form Learned D-AMP. We need to train this huge network.

Training
The challenge: the network is 200 layers deep and has over 6 million free parameters. The solution: [1] proved that for D-AMP, layer-by-layer training is optimal.
[1] Metzler et al. "From denoising to compressed sensing." IEEE Transactions on Information Theory 62.9 (2016): 5117-5144.

Training Details
400 training images were broken into ~300,000 50x50 patches. Patches were randomly flipped and/or rotated during training. 10 identical networks were trained to remove additive white Gaussian noise at 10 different noise levels. Training was stopped when the validation error stopped improving, generally after 10 to 20 epochs. Each network was trained on a 3584-core Titan X GPU for between 3 and 5 hours.
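A small NumPy sketch of the kind of patch augmentation the slide describes (random flips/rotations plus additive white Gaussian noise at the level assigned to one of the 10 networks); the function name and interface are hypothetical.

```python
import numpy as np

def augment_patch(patch, sigma, rng=None):
    """Return an (input, target) pair for denoiser training from one 50x50 patch."""
    rng = rng or np.random.default_rng()
    if rng.random() < 0.5:
        patch = np.fliplr(patch)                              # random horizontal flip
    patch = np.rot90(patch, k=int(rng.integers(4)))           # random 0/90/180/270 degree rotation
    noisy = patch + sigma * rng.standard_normal(patch.shape)  # AWGN at this network's noise level
    return noisy, patch
```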

High-Res, 10% Gaussian Matrix: BM3D-AMP (199 sec) vs. Learned D-AMP (62 sec)

Performance: Gaussian Measurements 1 dB Better than BM3D-AMP

Runtime: Gaussian Measurements >10x Faster than BM3D-AMP

Computational Complexity
Computational complexity is now dominated by the matrix multiplies.

Performance: Coded Diffraction Measurements 2 dB Better than BM3D-AMP

Runtime: Coded Diffraction Measurements >17x Faster than BM3D-AMP (at 128x128)

High-Res, 5% Coded Diffraction Matrix: TVAL3 (6.85 sec) vs. Learned D-AMP (1.22 sec)

High-Res, 5% Coded Diffraction Matrix: BM3D-AMP (75.04 sec) vs. Learned D-AMP (1.22 sec)

Summary
Unrolling turns an iterative algorithm into a deep neural net, illustrated here with D-AMP → Learned D-AMP. Learned D-AMP is fast, flexible, and effective: >10x faster than D-AMP, handles arbitrary right-rotationally-invariant matrices, and achieves state-of-the-art recovery accuracy.

Acknowledgments NSF GRFP