CS6890 Deep Learning Weizhen Cai

Presentation transcript:

Image Segmentation and Generation: Upsampling through Transposed Convolution and Max Unpooling. Part II: Applications. CS6890 Deep Learning, Weizhen Cai, 04-10-2018

Outline
Two applications of transposed convolution and max unpooling in neural networks:
Visualization of feature maps
Image segmentation (FCN, SegNet)
Conclusion
Transposed convolution and max unpooling are usually applied within neural networks.

Applications: Visualization of the feature maps
The deconvnet maps activations at high layers back to the input pixel space, showing the input patterns that caused a given activation in the feature maps.
Transposed convolution was originally proposed as a method of unsupervised learning ([Zeiler11]). Here it is used with no learning and no inference.
A deconvnet uses the same operations as a CNN, but in reverse: unpool the feature maps, then convolve the unpooled maps.
[Zeiler11] M. Zeiler, G. Taylor, and R. Fergus: Adaptive Deconvolutional Networks for Mid and High Level Feature Learning. ICCV 2011

Applications: Visualization of the feature maps
The feature activations are visualized on the ImageNet validation set after training is complete.
The convnet uses max pooling; the deconvnet uses max unpooling.
The transposed convolution uses the same filters as the convnet, but in reversed form: each filter is transposed (flipped horizontally and vertically).
Figure from: http://www.mdpi.com/1099-4300/19/6/242/htm

Applications: Visualization of the feature maps (deconvnet model)
Input: feature maps coming from the convnet, one at a time.
Max unpooling.
Rectification: a non-linearity (e.g. ReLU).
Filtering: uses the transposed filters of the convnet (flipped horizontally and vertically).
Output: images. Visualized filters/kernels (raw weights).
Left figure modified from: http://www.aisociety.kr/prml/PRMLSS_2015_deconvolution.pdf
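The filtering step can be illustrated with a minimal pure-Python sketch. It assumes a single channel, stride 1, and hypothetical helper names (`flip_kernel`, `transposed_conv2d`, `full_correlate2d`) that do not come from the slides. It checks that scattering a kernel from every input position (a transposed convolution) gives the same result as sliding the horizontally and vertically flipped kernel over the zero-padded input.

```python
def flip_kernel(k):
    """Flip a 2D kernel horizontally and vertically (rotate by 180 degrees)."""
    return [row[::-1] for row in k[::-1]]


def transposed_conv2d(x, k):
    """Stride-1 transposed convolution: each input value scatters a scaled copy of k."""
    h, w = len(x), len(x[0])
    kh, kw = len(k), len(k[0])
    out = [[0] * (w + kw - 1) for _ in range(h + kh - 1)]
    for i in range(h):
        for j in range(w):
            for a in range(kh):
                for b in range(kw):
                    out[i + a][j + b] += x[i][j] * k[a][b]
    return out


def full_correlate2d(x, k):
    """Slide k over x zero-padded by (kh - 1, kw - 1), as a conv layer would."""
    h, w = len(x), len(x[0])
    kh, kw = len(k), len(k[0])
    ph, pw = kh - 1, kw - 1
    padded = [[0] * (w + 2 * pw) for _ in range(h + 2 * ph)]
    for i in range(h):
        for j in range(w):
            padded[i + ph][j + pw] = x[i][j]
    return [
        [
            sum(padded[i + a][j + b] * k[a][b] for a in range(kh) for b in range(kw))
            for j in range(w + kw - 1)
        ]
        for i in range(h + kh - 1)
    ]


# Illustrative values only (not taken from the slides).
x = [[1, 2], [3, 4]]
k = [[1, 0], [2, 3]]
same = transposed_conv2d(x, k) == full_correlate2d(x, flip_kernel(k))
print(same)
```

A real deconvnet works on multi-channel feature maps and also swaps the input and output channel axes of the filter bank; this sketch leaves that out.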

Applications: Visualization of the feature maps
Results of visualizing a fully trained model with the deconvnet described earlier.
Layer 1: edges and colors. Layer 2: corners.
The visualization displays which part of the image a given feature map focuses on. It shows the reconstructed patterns from the validation set that cause the top 9 highest activations in a given feature map. For each feature map, the corresponding image patches are also shown.
Figure from: https://zhuanlan.zhihu.com/p/24833574

Applications: Visualization of the feature maps
Results of visualization with the deconvnet.
Layer 3: similar structures consisting of lines.
Figure from: https://zhuanlan.zhihu.com/p/24833574

Applications: Visualization of the feature maps
Results of visualizing a fully trained model with the deconvnet described earlier.
Layers 4 and 5: more of the information used for classification.
Figure from: https://zhuanlan.zhihu.com/p/24833574

Applications: Visualization of the feature maps
Visualization of the different layers during training.
Figure from: https://zhuanlan.zhihu.com/p/24833574

Applications: Image segmentation (FCN)
Brief introduction to FCN (here, an FCN modified from AlexNet).
Input: an image of arbitrary size.
The fully connected layers of a classification network are converted to convolutional layers, followed by 8x, 16x, or 32x upsampling.
Output: the same size as the input, with a pixel-level prediction.
Figure from: Long, J., Shelhamer, E., & Darrell, T. (2014). Fully Convolutional Networks for Semantic Segmentation. arXiv:1411.4038 [cs]. Retrieved from http://arxiv.org/abs/1411.4038

Applications: Image segmentation (FCN)
Conversion from CNN to FCN: convolutionalization (e.g. AlexNet).
A fully connected layer can also be viewed as a convolution with kernels that cover its entire input region.
Fine-tuning using AlexNet. Output: 1000 heatmaps, one per ImageNet class.
Figure: Transforming fully connected layers into convolution layers enables a classification net to output a heatmap. Figure from: https://arxiv.org/pdf/1411.4038.pdf
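The equivalence between a fully connected layer and a convolution can be checked with a small pure-Python sketch. It assumes one input channel and a single output unit; the function names and numbers are hypothetical and chosen only for illustration. On an input the size of the weight matrix, the two give the same value. On a larger input, the convolution produces a grid of values, which is the heatmap.

```python
def fully_connected(x, weight, bias):
    """One FC output unit: dot product of the whole input with `weight`, plus bias."""
    return sum(
        x[i][j] * weight[i][j] for i in range(len(x)) for j in range(len(x[0]))
    ) + bias


def conv2d_valid(x, kernel, bias):
    """'Valid' sliding-window filtering of x with `kernel`, plus bias."""
    h, w = len(x), len(x[0])
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(x[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
            + bias
            for j in range(w - kw + 1)
        ]
        for i in range(h - kh + 1)
    ]


# The same weights are used as FC weights and as a 2x2 convolution kernel.
weight = [[1, -1], [2, 0]]
bias = 1

# Input of the size the classifier was built for: both views agree.
small = [[1, 2], [3, 4]]
fc_out = fully_connected(small, weight, bias)
conv_out = conv2d_valid(small, weight, bias)

# A larger input: the convolutional view yields one score per window position.
large = [[1, 2, 0], [3, 4, 1], [0, 1, 2]]
heatmap = conv2d_valid(large, weight, bias)
print(fc_out, conv_out, heatmap)
```

The top-left heatmap entry equals the FC output on the top-left 2 x 2 window, which is why the converted network can reuse the classifier's weights unchanged.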

Applications: Image segmentation (FCN)
Conversion from CNN to FCN (e.g. AlexNet):
Convert all the FC layers to convolution layers.
Use transposed convolution layers to map the coarse activations back to a resolution that is meaningful relative to the image size. Intuitively, the activation map is scaled up to the size of the input image.
Figure from: Long, J., Shelhamer, E., & Darrell, T. (2014). Fully Convolutional Networks for Semantic Segmentation. arXiv:1411.4038 [cs]. Retrieved from http://arxiv.org/abs/1411.4038

Applications: Image segmentation (FCN)
Upsampling and the "skip" architecture.
Upsampling with factor f is convolution with a fractional input stride of 1/f. For an integer f, this can be done by deconvolution (backwards convolution) with an output stride of f, which simply reverses the forward and backward passes of convolution.
Using AlexNet as the example.
Higher layers: deep, global, coarse (what). Lower layers: shallow, local, fine (where).
Figures from: https://blog.csdn.net/taigw/article/details/51401448
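Upsampling by deconvolution with an output stride of f can be sketched in one dimension with pure Python. The function name `transposed_conv1d` and the numbers are assumptions for illustration. The kernel [0.5, 1, 0.5] is the 1D linear-interpolation kernel for f = 2, chosen here because the FCN paper uses bilinear interpolation for its upsampling filters.

```python
def transposed_conv1d(x, kernel, stride):
    """1D transposed convolution: input i adds x[i] * kernel at offset i * stride."""
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, value in enumerate(x):
        for a, k in enumerate(kernel):
            out[i * stride + a] += value * k
    return out


# Illustrative coarse signal, upsampled by a factor of 2.
coarse = [1.0, 2.0, 3.0]
linear_kernel = [0.5, 1.0, 0.5]
fine = transposed_conv1d(coarse, linear_kernel, stride=2)
print(fine)
```

The interior of `fine` holds the original samples with their midpoints in between, and the first and last entries are border effects that a real network would crop. In this sketch the output length is (n - 1) * f + k for n inputs and a kernel of length k.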

Applications: Image segmentation (FCN)
Advantages: accepts input images of arbitrary size; efficient, because it trains on entire images instead of patches.
Disadvantages: not sensitive to details; no spatial regularization; no learning in this paper, so it does not count as transposed convolution in the strict sense.
Using AlexNet as the example.

Applications: Image segmentation (SegNet)
The decoder uses pooling indices computed in the max-pooling step of the corresponding encoder to perform non-linear upsampling.
Each maximum stays at its original location; all other values are set to 0.
Figure from: https://arxiv.org/pdf/1511.00561.pdf
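The index-based upsampling can be sketched in pure Python as follows. It assumes a single channel, non-overlapping windows, and input dimensions divisible by the window size; the function names are hypothetical. The encoder side records where each maximum was found, and the decoder side writes each pooled value back to that location and leaves zeros elsewhere.

```python
def max_pool_with_indices(x, size=2):
    """Non-overlapping max pooling that also records the (row, col) of each maximum."""
    pooled, indices = [], []
    for i in range(0, len(x), size):
        pooled_row, index_row = [], []
        for j in range(0, len(x[0]), size):
            window = [(a, b) for a in range(i, i + size) for b in range(j, j + size)]
            a, b = max(window, key=lambda pos: x[pos[0]][pos[1]])
            pooled_row.append(x[a][b])
            index_row.append((a, b))
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices


def max_unpool(pooled, indices, height, width):
    """Place each pooled value at its recorded location; every other entry is 0."""
    out = [[0] * width for _ in range(height)]
    for pooled_row, index_row in zip(pooled, indices):
        for value, (a, b) in zip(pooled_row, index_row):
            out[a][b] = value
    return out


# Illustrative 4x4 feature map (values are made up).
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [6, 2, 3, 1],
]
pooled, indices = max_pool_with_indices(feature_map)
unpooled = max_unpool(pooled, indices, 4, 4)
print(pooled)
print(unpooled)
```

Storing only the indices is cheaper than storing the full encoder feature maps, and the sparse unpooled map is then densified by the decoder's convolution layers.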

Conclusions
Structured prediction: semantic segmentation.
Visualization of features.
Training a transposed CNN requires a large number of parameters, and therefore a large dataset and large GPU memory.
These operations have a lot of potential and many applications.

Acknowledgment: Dr. Jundong Liu, Zhewei Wang, Nidal Abuhajar, Tao Sun, Dr. Razvan Bunescu

Additional Applications: Generation
Object generation (e.g. the generator in a GAN).
A. Dosovitskiy, J. T. Springenberg and T. Brox. Learning to generate chairs with convolutional neural networks. CVPR 2015
Deconvolution: 5 * 5 filters.
Fixed unpooling: assumes the maximum value is always located at the top-left corner of each block (e.g. 2 * 2 => 4 * 4).
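Fixed unpooling can be sketched in a few lines of pure Python. The function name `fixed_unpool` and the input values are hypothetical. Unlike index-based max unpooling, no locations are stored: each value is simply written to the top-left corner of its block, and the rest of the block is zero.

```python
def fixed_unpool(x, factor=2):
    """Upsample by `factor`, placing each value at the top-left corner of its block."""
    h, w = len(x), len(x[0])
    out = [[0] * (w * factor) for _ in range(h * factor)]
    for i in range(h):
        for j in range(w):
            out[i * factor][j * factor] = x[i][j]
    return out


# Illustrative 2x2 feature map expanded to 4x4.
small_map = [[1, 2], [3, 4]]
large_map = fixed_unpool(small_map)
for row in large_map:
    print(row)
```

A generator has no encoder pass to supply pooling indices, which is why a fixed position has to be assumed; the following convolution fills in the zeros.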