Learning Hierarchical Features for Scene Labeling, by Clément Farabet, Camille Couprie, Laurent Najman, and Yann LeCun. Slides by Dong Nie.

Presentation transcript:

Outline: (1) Background/Motivation; (2) Multiscale CNN for feature representation and initial classification; (3) Postprocessing with graph-based classification: majority over superpixel regions, CRF over superpixels, optimal cover of purity tree; (4) Experimental results; (5) Discussion.

Scene parsing/labeling: definition. Scene parsing means labeling each pixel in the image with the category of the object to which it belongs. It is an important step toward image understanding.

Questions for scene parsing: How can we produce a good internal representation of the visual information? How can we use contextual information to ensure the self-consistency of the interpretation? Or can scene parsing be done end to end?

Scene parsing: conventional methods. Most scene parsing methods are based on graph models. A presegmentation step produces superpixels or segment candidates, and CRFs/MRFs ensure the consistency of the labeling. (Figure: example labeling with the classes tree, sky, road, field, car, building, window, and unlabeled.)

Proposed method: scene parsing. The architecture of this system relies on two main components: a multiscale deep feature representation, and graph-model-based classification (superpixels, CRF over superpixels, or multilevel cut with purity tree).

Proposed method CRF

Multiscale feature representation for scene parsing: Good internal representations are hierarchical. CNNs are capable of learning such hierarchies of features. A multiscale strategy is adopted to combine short-range and long-range information.

Multiscale CNN for scene parsing

Multiscale CNN for feature representation
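The multiscale idea can be sketched in a few lines. This is an illustrative sketch only, not the network from the paper: a single 3x3 filter stands in for the shared-weight CNN, 2x2 averaging stands in for the image pyramid, and all function names (downsample2, conv3x3, upsample_to, multiscale_features) are hypothetical. The same filter is applied at every scale, the responses are upsampled back to the input size, and each pixel receives one feature per scale, so coarse scales contribute long-range context.

```python
def downsample2(img):
    """Halve the resolution by averaging each 2x2 block (assumed pyramid step)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)] for y in range(h)]

def conv3x3(img, kernel):
    """Zero-padded 3x3 filtering; a stand-in for the shared-weight CNN."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = acc
    return out

def upsample_to(fmap, h, w):
    """Nearest-neighbour upsampling of a feature map to h x w."""
    fh, fw = len(fmap), len(fmap[0])
    return [[fmap[y * fh // h][x * fw // w] for x in range(w)] for y in range(h)]

def multiscale_features(img, kernel, n_scales=3):
    """Apply the same filter at each scale and stack the results per pixel."""
    h, w = len(img), len(img[0])
    maps, level = [], img
    for _ in range(n_scales):
        maps.append(upsample_to(conv3x3(level, kernel), h, w))
        level = downsample2(level)
    return [[[m[y][x] for m in maps] for x in range(w)] for y in range(h)]

# Toy 8x8 image and a box filter; both are made up for illustration.
image = [[float((x + y) % 4) for x in range(8)] for y in range(8)]
box = [[1 / 9.0] * 3 for _ in range(3)]
feats = multiscale_features(image, box)  # feats[y][x] holds one value per scale
```

Sharing the filter across scales keeps the number of parameters fixed while the effective receptive field grows with each coarser level.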

Superpixel methods: Superpixel generation falls into two groups. Graph-based methods: graph-based segmentation by Felzenszwalb et al., Ncut (normalized cut) by Shi et al., superpixel lattice by Moore et al., entropy-based by Liu et al. Gradient-descent-based methods: watersheds by Vincent et al., mean shift by Comaniciu et al., quick shift by Vedaldi et al., Turbopixels by Levinshtein et al., SLIC by Achanta et al.

Superpixels: Pixel-wise prediction may produce noisy labels. We can reduce this noise by assigning a single label to each local region of similar color intensity (Felzenszwalb et al., IJCV 2004).

Superpixel labeling

Majority over superpixel regions
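A minimal sketch of the majority step, assuming hard per-pixel labels and a precomputed superpixel map (the function name majority_over_superpixels and the toy data are hypothetical). A real system may instead aggregate per-class scores within each superpixel before taking the best class; the hard vote below is only the simplest variant.

```python
from collections import Counter

def majority_over_superpixels(pixel_labels, superpixel_ids):
    """Relabel every pixel with the most frequent label in its superpixel."""
    votes = {}
    for label_row, sp_row in zip(pixel_labels, superpixel_ids):
        for label, sp in zip(label_row, sp_row):
            votes.setdefault(sp, Counter())[label] += 1
    winner = {sp: counts.most_common(1)[0][0] for sp, counts in votes.items()}
    return [[winner[sp] for sp in sp_row] for sp_row in superpixel_ids]

# Toy example: one noisy "tree" pixel inside a superpixel that is mostly "sky".
pixel_labels = [["sky", "sky", "road"],
                ["sky", "tree", "road"],
                ["road", "road", "road"]]
superpixel_ids = [[0, 0, 1],
                  [0, 0, 1],
                  [1, 1, 1]]
smoothed = majority_over_superpixels(pixel_labels, superpixel_ids)
```

In this example the isolated "tree" prediction is overwritten by "sky", which is the noise-removal effect the superpixel step is meant to provide.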

CRF in image labeling: Let G = (S, E) be a graph. Then (X, L) is said to be a conditional random field (CRF) if, when conditioned on X, the random variables L_i obey the Markov property with respect to the graph: P(L_i | X, L_{S-{i}}) = P(L_i | X, L_{N_i}), where S-{i} is the set of all sites in the graph except site i, and N_i is the set of neighbors of site i in G. (Figure: MRF versus CRF.)

CRF over superpixels: The superpixel strategy only gives a local assignment and does not involve a global understanding of the scene. This paper uses a CRF to impose consistency and coherency.

CRF over superpixels
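To make the role of the CRF concrete, here is a small sketch under stated assumptions. The energy form (a unary cost per superpixel plus a weighted Potts penalty on neighboring superpixels with different labels) is a common choice and is assumed here, not taken from the slides. The solver is iterated conditional modes (ICM), a simple stand-in rather than the inference method used in the paper. The names crf_energy, icm, gamma, and the toy costs are all hypothetical.

```python
def crf_energy(labels, unary, edges, gamma):
    """Unary costs plus gamma * weight for every edge whose endpoints disagree."""
    energy = sum(unary[i][labels[i]] for i in range(len(labels)))
    energy += gamma * sum(w for i, j, w in edges if labels[i] != labels[j])
    return energy

def icm(unary, edges, gamma, n_iters=10):
    """Greedy coordinate descent: update one node at a time to its cheapest label."""
    n, k = len(unary), len(unary[0])
    # Start from the label with the lowest unary cost at each node.
    labels = [min(range(k), key=lambda a: unary[i][a]) for i in range(n)]
    neighbors = {i: [] for i in range(n)}
    for i, j, w in edges:
        neighbors[i].append((j, w))
        neighbors[j].append((i, w))
    for _ in range(n_iters):
        changed = False
        for i in range(n):
            def local_cost(a):
                disagree = sum(w for j, w in neighbors[i] if labels[j] != a)
                return unary[i][a] + gamma * disagree
            best = min(range(k), key=local_cost)
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels

# Three superpixels in a chain, two classes; the middle one is ambiguous.
unary = [[0.1, 0.9], [0.55, 0.45], [0.1, 0.9]]
edges = [(0, 1, 1.0), (1, 2, 1.0)]
labels = icm(unary, edges, gamma=0.5)
```

The middle superpixel slightly prefers class 1 on its own, but agreeing with both neighbors lowers the total energy, so the smoothed labeling assigns class 0 to all three.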

Why optimal cover of purity tree? The observation-level problem: an object, or object part, can be easily classified once it is segmented at the right level. The previous two strategies are based on an arbitrary segmentation of the image. The proposed optimal cover of the purity tree analyzes a family of segmentations and automatically discovers the best observation level for each pixel in the image.

Hierarchical segmentations: The set of components can be very large, so this paper adopts hierarchical segmentations to reduce the number of components considered for each pixel. Hierarchical segmentations are generated by the methods described in [1], [2], which transform the output of any contour detector into a hierarchical region tree. [1] Contour Detection and Hierarchical Image Segmentation. [2] Geodesic Saliency of Watershed Contours and Hierarchical Segmentation.

Hierarchical segmentations

Component cover: The component cover is represented with a tree.

How to compute purity/Producing confidence cost
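One natural way to turn a predicted class distribution into a purity or confidence cost is its entropy: a component dominated by a single class gets a low cost, and a mixed component gets a high cost. The sketch below assumes this choice; the function name entropy_cost and the example distributions are hypothetical.

```python
import math

def entropy_cost(class_probs):
    """Shannon entropy of a class distribution; lower means purer."""
    return -sum(p * math.log(p) for p in class_probs if p > 0)

pure_cost = entropy_cost([0.9, 0.05, 0.05])   # component dominated by one class
mixed_cost = entropy_cost([0.4, 0.3, 0.3])    # component mixing several classes
```

With this cost, a component whose pixels are predicted to be almost entirely one class is preferred over a component that straddles several objects.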

Optimal Purity Cover

Optimal cover of purity tree
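The selection of the best observation level can be sketched as a walk up the segmentation tree. This is an illustrative sketch that assumes every tree node already carries a purity cost and that each leaf simply picks the lowest-cost component on its path to the root. The tree layout, the costs, and the name optimal_cover are hypothetical.

```python
def optimal_cover(parent, cost, leaves):
    """For each leaf, return the ancestor (or the leaf itself) with minimal cost."""
    chosen = {}
    for leaf in leaves:
        best = node = leaf
        while parent[node] is not None:
            node = parent[node]
            if cost[node] < cost[best]:
                best = node
        chosen[leaf] = best
    return chosen

# Toy tree: root 0 has children 1 and 2; node 1 has leaves 3 and 4,
# node 2 has leaves 5 and 6. Costs are made-up purity costs.
parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 2}
cost = {0: 0.9, 1: 0.2, 2: 0.8, 3: 0.5, 4: 0.6, 5: 0.1, 6: 0.3}
chosen = optimal_cover(parent, cost, leaves=[3, 4, 5, 6])
cover = sorted(set(chosen.values()))
```

Here leaves 3 and 4 are best explained by their shared parent, while leaves 5 and 6 are purer on their own, so the cover mixes components from different levels of the hierarchy.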

Proposed method revisit

Scene parsing performance: Stanford Background dataset [Gould 2009], 8 categories.

Scene parsing performance: SIFT Flow dataset [Liu 2009], 33 categories.

Scene parsing performance: Barcelona dataset [Tighe 2010], 170 categories.

Scene parsing: Stanford dataset

Scene parsing: SIFT flow dataset

Scene parsing: real time From url:

Discussion: A wide contextual window is critical to the quality of scene parsing. When a wide context is used, the role of postprocessing is greatly reduced.

Discussion: Highly complicated postprocessing schemes do not seem to improve the results significantly over simple schemes.

Discussion: The proposed feed-forward pixel labeling system is dramatically faster.

Thank you