Combining CNN with RNN for scene labeling (segmentation)

Presentation transcript:

Combining CNN with RNN for scene labeling (segmentation) Tao Zeng

Scene labeling problem
Image classification: assign each image to one of K classes. Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012.
Image segmentation: classify each pixel, that is, label every pixel in the image with the object class it belongs to. Chen, Liang-Chieh, et al. "Semantic image segmentation with deep convolutional nets and fully connected crfs." arXiv preprint arXiv:1412.7062 (2014).

Scene labeling problem: spatiotemporal segmentation
Labeling every pixel in the image with the object class it belongs to. (Figure axes: X, t.)
Challenge: solving segmentation and recognition simultaneously.
Seguin, Guillaume, et al. "Instance-level video segmentation from object tracks." (2016).

Patch-wise training and patch-wise prediction
Slow due to redundant computation: the patches around neighboring pixels overlap almost entirely, so most convolutions are repeated.
Li, Hongsheng, Rui Zhao, and Xiaogang Wang. "Highly efficient forward and backward propagation of convolutional neural networks for pixelwise classification." arXiv preprint arXiv:1412.4526 (2014).
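
To make the redundancy concrete, here is a rough multiplication count for a single k x k convolution layer. The image, patch, and kernel sizes are illustrative assumptions, not figures from Li et al., and a real network stacks many layers, so treat this as a back-of-the-envelope sketch only.

```python
def patchwise_mults(height, width, patch, kernel):
    """Multiplications when one patch per pixel is convolved independently."""
    out_side = patch - kernel + 1            # 'valid' convolution output side per patch
    per_patch = out_side * out_side * kernel * kernel
    return height * width * per_patch        # one patch centered on every pixel


def whole_image_mults(height, width, kernel):
    """Multiplications when the layer runs once over the whole image ('same' padding)."""
    return height * width * kernel * kernel


# Hypothetical sizes chosen only for illustration.
H, W, P, K = 256, 256, 32, 5
patch_cost = patchwise_mults(H, W, P, K)
image_cost = whole_image_mults(H, W, K)
print(patch_cost // image_cost)  # ratio equals (P - K + 1) ** 2
```

Under these assumptions the patch-wise scheme does (P - K + 1)^2 times the work for this one layer, which is the overlap that whole-image evaluation avoids.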

Fully convolutional networks
Long, Jonathan, Evan Shelhamer, and Trevor Darrell. "Fully convolutional networks for semantic segmentation." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.

Motivation
A CNN has no explicit mechanism to modulate features with context (a role a CRF is often added to fill).
Need to model the relationship between labels.

Contextual information is important

Idea of recurrent CNN
Providing feedback from the output into the input allows the network to model label dependencies and to correct its own previous predictions.
This ensures object coherence in scene labeling.
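
A toy illustration of the feedback idea, not the model from the slides: a hand-written update stands in for the learned CNN, and a 1D row of pixels stands in for the image. The function names, the mixing weight, and the scores are all assumptions made for this sketch. Each pass sees the original per-pixel evidence plus the previous pass's predictions for the neighboring pixels.

```python
def refine(unary, steps, mix=0.5):
    """Feed the previous per-pixel scores back in as context for the next pass."""
    scores = list(unary)                       # pass 0: no feedback yet
    n = len(unary)
    for _ in range(steps):
        prev = scores
        scores = []
        for i in range(n):
            left = prev[i - 1] if i > 0 else prev[i]
            right = prev[i + 1] if i < n - 1 else prev[i]
            context = (left + right) / 2       # previous predictions of the neighbors
            scores.append(mix * unary[i] + (1 - mix) * context)
    return scores


def to_labels(scores, threshold=0.5):
    """Binary label per pixel: 1 if the class score exceeds the threshold."""
    return [1 if s > threshold else 0 for s in scores]


# Hypothetical scores for one class along a row of pixels; pixel 2 is a noisy outlier.
unary = [0.9, 0.8, 0.4, 0.9, 0.85]
before = to_labels(refine(unary, steps=0))     # labels from local evidence only
after = to_labels(refine(unary, steps=3))      # labels after three feedback passes
print(before, after)
```

In this toy setting the isolated outlier is relabeled once its neighbors' earlier predictions are fed back in, which is the kind of label coherence the slide describes.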

Recurrent CNN: two models (Model 1, Model 2)
Liang, Ming, and Xiaolin Hu. "Recurrent convolutional neural network for object recognition." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
Pinheiro, Pedro HO, and Ronan Collobert. "Recurrent Convolutional Neural Networks for Scene Labeling." ICML. 2014.

[Figure: a recurrent convolutional layer (RCL) with weight sharing compared ("=?") with a residual network block (weight layer, input X, "+" addition).]

The model performs local feature extraction and context integration simultaneously in each parameterized layer. It therefore fits scene labeling particularly well, because both local and global information are critical for determining the label of a pixel in an image.
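
A minimal 1D sketch of a recurrent convolutional layer in the spirit of Liang and Hu, assuming the update x(t) = relu(w_f * u + w_r * x(t-1)) with the same weights reused at every step. The 1D setting, the kernel values, and the helper names are assumptions for illustration. It shows how the context seen by one unit widens with each recurrent step while the local feed-forward term stays fixed.

```python
def conv1d_same(x, w):
    """Zero-padded 1D cross-correlation that keeps the input length."""
    half = len(w) // 2
    n = len(x)
    return [
        sum(w[j] * x[i + j - half] for j in range(len(w)) if 0 <= i + j - half < n)
        for i in range(n)
    ]


def rcl(u, w_f, w_r, steps):
    """Recurrent convolutional layer: shared w_f and w_r across all steps."""
    feed_forward = conv1d_same(u, w_f)               # local feature extraction
    x = [max(0.0, v) for v in feed_forward]          # t = 0: feed-forward only
    for _ in range(steps):
        recurrent = conv1d_same(x, w_r)              # context from neighboring units
        x = [max(0.0, a + b) for a, b in zip(feed_forward, recurrent)]
    return x


def support(x):
    """Number of positions with a nonzero response."""
    return sum(1 for v in x if v != 0.0)


# Hypothetical input: a single impulse in the middle of a 9-pixel row.
u = [0.0] * 4 + [1.0] + [0.0] * 4
w_f = [0.0, 1.0, 0.0]     # assumed feed-forward kernel (identity)
w_r = [0.5, 0.5, 0.5]     # assumed recurrent kernel
widths = [support(rcl(u, w_f, w_r, t)) for t in range(4)]
print(widths)
```

With a width-3 recurrent kernel the region influenced by the impulse grows by one pixel on each side per step, so more steps integrate more context without adding parameters.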

Spatiotemporal sequence prediction

Problem
Goal: predicting the future rainfall intensity in a local region over a relatively short period of time.
Setting: an M x N 2D grid with p measurements per grid cell.
Task: predict the most likely length-K future sequence given the previous J observations.
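
Written out in symbols, the task can be stated as below. The notation is an assumption layered on the slide's M, N, p, J, and K: each observation at time t is taken to be a tensor X_t with p measurements on the M x N grid.

```latex
\hat{\mathcal{X}}_{t+1}, \dots, \hat{\mathcal{X}}_{t+K}
  = \operatorname*{arg\,max}_{\mathcal{X}_{t+1}, \dots, \mathcal{X}_{t+K}}
    p\left(\mathcal{X}_{t+1}, \dots, \mathcal{X}_{t+K} \,\middle|\,
           \mathcal{X}_{t-J+1}, \dots, \mathcal{X}_{t}\right),
\qquad \mathcal{X}_t \in \mathbb{R}^{p \times M \times N}
```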

From 1D LSTM to 2D convolutional LSTM
Donahue, Jeffrey, et al. "Long-term recurrent convolutional networks for visual recognition and description." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
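
One common way to write the step from a vector LSTM to a convolutional one is to replace the matrix products with convolutions, so that inputs, hidden states, and cell states are all 2D feature maps. The equations below are a simplified sketch (peephole terms omitted), and the symbols are assumptions not taken from the slide: * denotes convolution, the circle denotes the element-wise product, and sigma is the logistic sigmoid.

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} * X_t + W_{hi} * H_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} * X_t + W_{hf} * H_{t-1} + b_f\right) \\
o_t &= \sigma\left(W_{xo} * X_t + W_{ho} * H_{t-1} + b_o\right) \\
C_t &= f_t \circ C_{t-1} + i_t \circ \tanh\left(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c\right) \\
H_t &= o_t \circ \tanh\left(C_t\right)
\end{aligned}
```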

Combining Fully Convolutional and Recurrent Neural networks for 3D Biomedical Image Segmentation

Problem
3D biomedical images are often anisotropic: high resolution in the x-y plane but low resolution along the z axis.
Previous methods: 2D convolution on each slice, followed by concatenating the results into 3D; 3D convolution; 2D-3D hybrid.
Approach: combine an FCN (U-Net) and an LSTM (BDC-LSTM) to exploit intra-slice and inter-slice contexts.

U-Net: Biomedical image segmentation

kU-Net

CLSTM

Nz slices --> kU-Net --> 64 x Nx x Ny feature map f2dZ --> BDC-LSTM --> f3dZ --> softmax --> prob
kU-Net and BDC-LSTM are trained separately (decoupled) because of GPU memory and context considerations.
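
The bi-directional part of this pipeline can be sketched with a toy stand-in. Everything below is an assumption made for illustration: one scalar per slice replaces the 64 x Nx x Ny feature map, and a simple linear recurrence replaces each convolutional LSTM direction. The point is only that each slice's output draws on slices both before and after it along z.

```python
def scan(features, decay=0.5):
    """Toy recurrence h_z = decay * h_(z-1) + f_z, standing in for one CLSTM direction."""
    h, out = 0.0, []
    for f in features:
        h = decay * h + f
        out.append(h)
    return out


def bidirectional(features):
    """Run the recurrence in both z directions and keep both results per slice."""
    forward = scan(features)                    # context from slices 0..z
    backward = scan(features[::-1])[::-1]       # context from slices z..Nz-1
    return list(zip(forward, backward))


# Hypothetical per-slice features (the role of f2dZ): only the first slice carries signal.
f2d = [1.0, 0.0, 0.0]
f3d = bidirectional(f2d)                        # the role of f3dZ
print(f3d)
```

Here the signal in the first slice reaches the later slices through the forward pass only, while the backward pass would carry information from later slices toward earlier ones.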

Thank you! Questions?