1
Image Segmentation and Generation: Upsampling through Transposed Convolution and Max Unpooling
Part II: Applications
CS6890 Deep Learning
Weizhen Cai
04-10-2018
2
Outline
Two applications of transposed convolution and max unpooling in neural networks:
Visualization of feature maps
Image segmentation: FCN, SegNet
Conclusion
Transposed convolution and max unpooling are usually applied within a neural network.
3
Applications Visualization of the feature maps
Mapping activations at higher layers back to the input pixel space
Showing the patterns that caused a given activation in the feature maps
Transposed convolution: originally proposed as a way of unsupervised learning ([Zeiler11])
No learning, no inference
Same operations as CNNs, but in reverse: unpool the feature maps, then convolve the unpooled maps
[Zeiler11] M. Zeiler, G. Taylor, and R. Fergus: Adaptive Deconvolutional Networks for Mid and High Level Feature Learning. ICCV 2011
4
Applications Visualization of the feature maps
Visualization of the feature activations on the ImageNet validation set after training is complete.
Figure labels: deconvnet (max unpooling), convnet (max pooling)
The transposed convolution uses the same filters as the convnet, but in reversed form: the transposed filter (flipped horizontally and vertically).
Figure comes from:
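The "transposed filter (flipped horizontally and vertically)" remark can be checked numerically. The sketch below is only an illustration, assuming a single channel and stride 1; the function names are hypothetical and not taken from the slides or from any library. It computes a transposed convolution by scattering each input value through the kernel, and compares it with ordinary convnet filtering (cross-correlation) using the flipped kernel and full zero padding.

```python
# Illustrative sketch; all names here are hypothetical helpers.

def conv_transpose2d(x, k):
    """Stride-1 transposed convolution: scatter each input value, weighted by k."""
    h, w = len(x), len(x[0])
    kh, kw = len(k), len(k[0])
    out = [[0] * (w + kw - 1) for _ in range(h + kh - 1)]
    for i in range(h):
        for j in range(w):
            for a in range(kh):
                for b in range(kw):
                    out[i + a][j + b] += x[i][j] * k[a][b]
    return out


def flip(k):
    """Flip a kernel vertically and horizontally."""
    return [row[::-1] for row in k[::-1]]


def correlate_full(x, k):
    """Convnet-style filtering (cross-correlation) with full zero padding."""
    h, w = len(x), len(x[0])
    kh, kw = len(k), len(k[0])
    padded = [[0] * (w + 2 * (kw - 1)) for _ in range(h + 2 * (kh - 1))]
    for i in range(h):
        for j in range(w):
            padded[i + kh - 1][j + kw - 1] = x[i][j]
    return [
        [
            sum(padded[i + a][j + b] * k[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(w + kw - 1)
        ]
        for i in range(h + kh - 1)
    ]


x = [[1, 2], [3, 4]]   # a tiny feature map
k = [[1, 0], [2, 3]]   # a tiny filter
scattered = conv_transpose2d(x, k)
filtered = correlate_full(x, flip(k))
print(scattered == filtered)  # the two routes agree
```

Both routes produce the same 3 x 3 map, which is why a deconvnet can reuse the convnet's filters in flipped form instead of learning new ones.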
5
Applications Visualization of the feature maps
Deconvnet model
Input: feature maps coming from the convnet, one at a time
Max unpooling
Rectification: non-linearity (e.g. ReLU)
Filtering: using the transposed filters of the convnet (flipped horizontally and vertically)
Output images
Visualized filters/kernels (raw weights)
Left figure modified from source:
6
Applications Visualization of the feature maps
Results of visualizing a fully trained model with the deconvnet described earlier.
Layer 1: edges, colors
Layer 2: corners
Displaying which part of the image a given feature map focuses on.
Reconstructed patterns from the validation set that cause the top 9 highest activations in a given feature map. For each feature map, the corresponding image patches are also shown.
Figure coming from:
7
Applications Visualization of the feature maps
Results of visualization with the deconvnet.
Layer 3: similar structures composed of lines
Figure coming from:
8
Applications Visualization of the feature maps
Results of visualizing a fully trained model with the deconvnet described earlier.
Layers 4 & 5: more information used for classification
Figure coming from:
9
Applications Visualization of the feature maps
Visualization of the different layers during training
Figure coming from:
10
Applications Image segmentation (FCN)
Brief introduction of FCN (FCN modified from AlexNet)
Input: image of arbitrary size
Converting the fully connected layers of a classifier neural network to convolutional layers + upsampling 8x, 16x, 32x
Output: same size as input, pixel-level prediction
Figure coming from Long, J., Shelhamer, E., & Darrell, T. (2014). Fully Convolutional Networks for Semantic Segmentation. arXiv: [cs]. Retrieved from
11
Applications Image segmentation (FCN)
Conversion from CNN to FCN: convolutionalization (e.g. AlexNet)
A fully connected layer can also be viewed as a convolution with kernels that cover its entire input region.
Fine-tuning using AlexNet. Output: 1000 heatmaps.
Figure: Transforming fully connected layers into convolution layers enables a classification net to output a heatmap.
Figure comes from:
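A small numeric check of the claim that a fully connected layer is a convolution whose kernel covers the entire input region. This is a minimal sketch under simplifying assumptions (one channel, one output unit, made-up weights); the helper names are hypothetical. On an input of the original size the convolution returns a single score identical to the dense layer, and on a larger input the same weights return a grid of scores, i.e. a heatmap.

```python
# Illustrative sketch; weights and helper names are assumptions.

def fully_connected(x, weight, bias):
    """One output unit of a dense layer over an H x W input."""
    return sum(weight[i][j] * x[i][j]
               for i in range(len(x)) for j in range(len(x[0]))) + bias


def conv_valid(x, kernel, bias):
    """Stride-1 'valid' cross-correlation of x with kernel, plus bias."""
    h, w = len(x), len(x[0])
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(kernel[a][b] * x[i + a][j + b]
                for a in range(kh) for b in range(kw)) + bias
            for j in range(w - kw + 1)
        ]
        for i in range(h - kh + 1)
    ]


weight = [[1, -1], [2, 0]]  # hypothetical weights for one class score
bias = 1

small = [[1, 2], [3, 4]]                   # input size the classifier expects
large = [[1, 2, 0], [3, 4, 1], [0, 2, 5]]  # a larger input

score = fully_connected(small, weight, bias)
heatmap_small = conv_valid(small, weight, bias)  # 1 x 1 map, same value as score
heatmap_large = conv_valid(large, weight, bias)  # 2 x 2 map of scores
print(score, heatmap_small, heatmap_large)
```

With 1000 output units instead of one, the same construction would yield 1000 heatmaps, one per class.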
12
Applications Image segmentation (FCN)
Conversion from CNN to FCN (e.g. AlexNet)
Convert all the FC layers to convolution layers.
Use a transposed convolution layer to map the coarse activations back to positions that are meaningful relative to the image size. Intuitively, we are just scaling the activation map up to the size of the image.
Figure coming from Long, J., Shelhamer, E., & Darrell, T. (2014). Fully Convolutional Networks for Semantic Segmentation. arXiv: [cs]. Retrieved from
13
Applications Image segmentation (FCN)
Upsampling and the “skip” architecture
Upsampling with factor f is convolution with a fractional input stride of 1/f, i.e. deconvolution (backwards convolution) with an output stride of f. It simply reverses the forward and backward passes of convolution.
Using AlexNet as example.
Higher layers: deep, global, coarse, "what"
Lower layers: shallow, local, fine, "where"
Figures come from:
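To make "deconvolution with an output stride of f" concrete, here is a one-dimensional sketch with f = 2. It is an illustration only: the function name and both kernels are assumptions chosen for the example, not the filters used in the FCN paper. Each input value is written into the output at steps of `stride`, weighted by the kernel, and overlapping contributions are summed.

```python
# Illustrative sketch; names and kernels are assumptions.

def conv_transpose1d(x, kernel, stride):
    """1-D transposed convolution: place a scaled copy of the kernel
    every `stride` output positions and sum the overlaps."""
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, value in enumerate(x):
        for a, weight in enumerate(kernel):
            out[i * stride + a] += value * weight
    return out


x = [1.0, 2.0, 3.0]

# A box kernel of width 2 repeats every sample (nearest-neighbour upsampling).
nearest = conv_transpose1d(x, [1.0, 1.0], stride=2)

# A triangular kernel interpolates linearly between neighbouring samples.
linear = conv_transpose1d(x, [0.5, 1.0, 0.5], stride=2)

print(nearest)  # [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
print(linear)   # [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 1.5]
```

The interior of `linear` (1.0, 1.5, 2.0, 2.5, 3.0) is the linear interpolation of the input; the first and last entries are border effects that a real layer would crop or pad away. Because the kernel is just a set of weights, it could also be learned instead of fixed.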
14
Applications Image segmentation (FCN)
Advantages
Arbitrary input image size
High efficiency: trains on the entire image instead of patches
Disadvantages
Not sensitive to details
No spatial regularization
No learning in this paper; does not count as strict transposed convolution
Using AlexNet as example.
15
Applications Image segmentation (SegNet)
The decoder uses pooling indices computed in the max-pooling step of the corresponding encoder to perform non-linear upsampling.
Maximum values are restored to their original locations. All remaining values are 0.
Figure comes from:
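The index-based upsampling used by the SegNet decoder can be sketched in a few lines. This is a minimal illustration, assuming non-overlapping 2 x 2 pooling on a single channel whose height and width are divisible by the pool size; the function names are hypothetical. The encoder side records where each maximum came from, and the decoder side writes each pooled value back to that location and fills everything else with zeros.

```python
# Illustrative sketch; helper names are assumptions.

def max_pool_with_indices(x, size=2):
    """Non-overlapping max pooling that also returns the (row, col) of each max."""
    pooled, indices = [], []
    for i in range(0, len(x), size):
        pooled_row, index_row = [], []
        for j in range(0, len(x[0]), size):
            window = [(a, b) for a in range(i, i + size)
                      for b in range(j, j + size)]
            a, b = max(window, key=lambda p: x[p[0]][p[1]])
            pooled_row.append(x[a][b])
            index_row.append((a, b))
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices


def max_unpool(pooled, indices, height, width):
    """Put each pooled value back at its recorded location; zeros elsewhere."""
    out = [[0] * width for _ in range(height)]
    for pooled_row, index_row in zip(pooled, indices):
        for value, (a, b) in zip(pooled_row, index_row):
            out[a][b] = value
    return out


x = [[1, 2, 0, 1],
     [3, 4, 5, 0],
     [0, 1, 2, 9],
     [7, 0, 3, 4]]

pooled, indices = max_pool_with_indices(x)
unpooled = max_unpool(pooled, indices, height=4, width=4)
print(pooled)    # [[4, 5], [7, 9]]
print(unpooled)  # maxima back in place, zeros elsewhere
```

The unpooled map is sparse, which is why the decoder follows it with convolution layers that densify the result.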
16
Conclusions
Structured prediction: semantic segmentation
Visualization of features
Training a transposed CNN requires many parameters, a large dataset, and large GPU memory
Has a lot of potential and many applications
17
Acknowledgment Dr. Jundong Liu Zhewei Wang Nidal Abuhajar Tao Sun
Dr. Razvan Bunescu
19
Additional Applications
Object generation (generator in GAN)
Deconvolution: 5 * 5
Fixed unpooling: assumes the maximum value is always located at the top-left corner (e.g. 2 * 2 => 4 * 4)
A. Dosovitskiy, J. T. Springenberg and T. Brox. Learning to generate chairs with convolutional neural networks. CVPR 2015
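Fixed unpooling needs no stored indices, because the position of each value is assumed in advance. The sketch below is an illustration with a hypothetical function name, assuming the top-left convention and a factor of 2: every input value becomes the top-left entry of its own 2 x 2 block, and the rest of the block is zero, so a 2 x 2 map becomes a 4 x 4 map.

```python
# Illustrative sketch; the function name is an assumption.

def fixed_unpool(x, size=2):
    """Upsample by placing each value at the top-left of a size x size block."""
    h, w = len(x), len(x[0])
    out = [[0] * (w * size) for _ in range(h * size)]
    for i in range(h):
        for j in range(w):
            out[i * size][j * size] = x[i][j]
    return out


upsampled = fixed_unpool([[1, 2], [3, 4]])
for row in upsampled:
    print(row)
# [1, 0, 2, 0]
# [0, 0, 0, 0]
# [3, 0, 4, 0]
# [0, 0, 0, 0]
```

In a generator, a convolution such as the 5 * 5 one mentioned above would then fill in the zeros.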