Nonparametric Semantic Segmentation

Nonparametric Semantic Segmentation Tackgeun You

Motivation
Semantic segmentation? Object detection + segmentation.
Nonparametric semantic segmentation? Scalable with the number of object categories.
References
Conferences:
(CVPR 2009) Nonparametric Scene Parsing: Label Transfer via Dense Scene Alignment
(ECCV 2010) SuperParsing: Scalable Nonparametric Image Parsing with Superpixels
Journals:
(TPAMI 2011) Nonparametric Scene Parsing via Label Transfer
(IJCV 2012) SuperParsing: Scalable Nonparametric Image Parsing with Superpixels

0. “Recognition by Matching” (Nonparametric Scene Parsing via Label Transfer)
Pipeline: query image → scene retrieval from the database of labeled images → retrieved set with labels → transfer of ground-truth labels by matching SIFT → transferred labels for the retrieved set → merging & shape constraint → final labeling.

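1. Scene Retrieval (Nonparametric Scene Parsing via Label Transfer)
Query image → scene retrieval from the database of labeled images → retrieved set with labels.
Retrieve the {k-NN} ∩ {ε-NN} images in global feature space, where the ε-NN set holds the images within distance $(1+\epsilon)\, d_{\min}$ of the query.
Example: k = 5, ε = 1.
Note: superpixel approaches retrieve with k-NN only; this slide explains why an ε-NN criterion is added here. The image shows that retrieving with both criteria works better.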

1. Scene Retrieval, continued (Nonparametric Scene Parsing via Label Transfer)
Get the $\langle k, \epsilon \rangle$-NN images in global feature space. Paper: k = 150, ε = 5.
Global features $\in \mathbb{R}^{5160}$: GIST (960) + spatial pyramid (4200).
Then keep $M \le k$ images, ranked by the energy that SIFT flow achieves on them.
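
The ⟨k, ε⟩-NN rule above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's code: the function name `k_eps_nearest`, the Euclidean distance, and the toy 2-D features are all hypothetical stand-ins for the 5160-dimensional global descriptors.

```python
import math

def k_eps_nearest(query, database, k, eps):
    """Indices of database vectors that are among the k nearest to `query`
    AND lie within (1 + eps) * d_min of it (d_min = nearest distance)."""
    if not database:
        return []
    dists = [math.dist(query, feat) for feat in database]
    order = sorted(range(len(database)), key=lambda i: dists[i])
    radius = (1.0 + eps) * dists[order[0]]
    # Intersect the k-NN list with the eps-NN ball.
    return [i for i in order[:k] if dists[i] <= radius]

# Toy 2-D "global features" (hypothetical data, not real GIST descriptors).
database = [(1.0, 0.0), (0.0, 1.5), (3.0, 0.0), (0.0, 1.9), (5.0, 5.0)]
retrieved = k_eps_nearest((0.0, 0.0), database, k=4, eps=1.0)
print(retrieved)
```

With k = 4 alone, the image at distance 3.0 would be retrieved; the ε criterion drops it because it lies outside twice the nearest distance.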

2-1. Dense Matching (Nonparametric Scene Parsing via Label Transfer)
Dense correspondence between images.

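2-2. Optical Flow (Nonparametric Scene Parsing via Label Transfer)
Brightness constancy assumption:
\[ I(x+\Delta x,\; y+\Delta y,\; t) = I(x,\, y,\, t-1) \]
Linearized by a first-order Taylor expansion:
\[ I(x+\Delta x,\; y+\Delta y,\; t) \approx I(x, y, t) + \nabla_x I \cdot \Delta x + \nabla_y I \cdot \Delta y \]
With $\nabla_t I = I(x,y,t) - I(x,y,t-1)$, each point $\mathbf{p}$ gives one equation in two unknowns:
\[ \begin{bmatrix} \nabla_x I(\mathbf{p}) & \nabla_y I(\mathbf{p}) \end{bmatrix} \begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix} \approx -\nabla_t I(\mathbf{p}) \]
Aperture problem → use multiple points (an image patch):
\[ \underbrace{\begin{bmatrix} \nabla_x I(\mathbf{p}_1) & \nabla_y I(\mathbf{p}_1) \\ \vdots & \vdots \\ \nabla_x I(\mathbf{p}_n) & \nabla_y I(\mathbf{p}_n) \end{bmatrix}}_{\mathbf{A}} \underbrace{\begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix}}_{\mathbf{w}} \approx \underbrace{\begin{bmatrix} -\nabla_t I(\mathbf{p}_1) \\ \vdots \\ -\nabla_t I(\mathbf{p}_n) \end{bmatrix}}_{\mathbf{b}} \]
Least-squares method:
\[ \mathbf{w} = \arg\min_{\mathbf{w}} \|\mathbf{A}\mathbf{w} - \mathbf{b}\|^2 = (\mathbf{A}^T\mathbf{A})^{-1}\mathbf{A}^T\mathbf{b} \]
Lucas-Kanade method:
\[ \begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix} = \begin{bmatrix} \sum_i \nabla_x I(\mathbf{p}_i)^2 & \sum_i \nabla_x I(\mathbf{p}_i)\,\nabla_y I(\mathbf{p}_i) \\ \sum_i \nabla_x I(\mathbf{p}_i)\,\nabla_y I(\mathbf{p}_i) & \sum_i \nabla_y I(\mathbf{p}_i)^2 \end{bmatrix}^{-1} \begin{bmatrix} -\sum_i \nabla_x I(\mathbf{p}_i)\,\nabla_t I(\mathbf{p}_i) \\ -\sum_i \nabla_y I(\mathbf{p}_i)\,\nabla_t I(\mathbf{p}_i) \end{bmatrix} \]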
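
As a rough illustration of the Lucas-Kanade step, the sketch below solves the 2×2 normal equations $(\mathbf{A}^T\mathbf{A})\mathbf{w} = \mathbf{A}^T\mathbf{b}$ with $\mathbf{b} = -\nabla_t I$ for one patch. The function name `lucas_kanade_step` and the synthetic gradients are assumptions made for this example; real use would compute the gradients from two image frames.

```python
def lucas_kanade_step(ix, iy, it):
    """Least-squares flow (dx, dy) for one patch.
    ix, iy, it: spatial and temporal gradients at the patch's n points."""
    sxx = sum(gx * gx for gx in ix)
    sxy = sum(gx * gy for gx, gy in zip(ix, iy))
    syy = sum(gy * gy for gy in iy)
    # Right-hand side A^T b with b = -It.
    bx = -sum(gx * gt for gx, gt in zip(ix, it))
    by = -sum(gy * gt for gy, gt in zip(iy, it))
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        # All gradients are parallel: the aperture problem.
        raise ValueError("A^T A is singular; flow is not uniquely determined")
    dx = (syy * bx - sxy * by) / det
    dy = (sxx * by - sxy * bx) / det
    return dx, dy

# Synthetic gradients consistent with a true flow of (0.5, -0.25).
true_dx, true_dy = 0.5, -0.25
ix = [1.0, 0.0, 1.0, 2.0]
iy = [0.0, 1.0, 1.0, -1.0]
it = [-(gx * true_dx + gy * true_dy) for gx, gy in zip(ix, iy)]
dx, dy = lucas_kanade_step(ix, iy, it)
```

When every gradient in the patch points the same way, the determinant vanishes and the function raises, which is the aperture problem in algebraic form.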

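2-3. SIFT flow (Nonparametric Scene Parsing via Label Transfer)
Match points by SIFT descriptors, not by image patches. The flow at pixel $\mathbf{p}$ is $\mathbf{w}(\mathbf{p}) = (u(\mathbf{p}), v(\mathbf{p}))$.
Find the $\mathbf{w}$ that minimizes $E(\mathbf{w})$:
\[ E(\mathbf{w}) = \sum_{\mathbf{p}} \min\big(\|s_1(\mathbf{p}) - s_2(\mathbf{p} + \mathbf{w}(\mathbf{p}))\|_1,\; t\big) + \sum_{\mathbf{p}} \eta\,\big(|u(\mathbf{p})| + |v(\mathbf{p})|\big) + \sum_{(\mathbf{p},\mathbf{q}) \in \varepsilon} \Big[\min\big(\lambda\,|u(\mathbf{p}) - u(\mathbf{q})|,\; d\big) + \min\big(\lambda\,|v(\mathbf{p}) - v(\mathbf{q})|,\; d\big)\Big] \]
Data term: match SIFT descriptors (cf. optical flow: $\mathbf{w} = \arg\min_{\mathbf{w}} \|\mathbf{A}\mathbf{w} - \mathbf{b}\|^2$).
Small-displacement term: where there is no information, keep $\mathbf{w}(\mathbf{p})$ small.
Smoothness term: make adjacent flow vectors similar.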
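
To make the three terms concrete, here is a small sketch that evaluates the SIFT flow energy for a given integer flow field. It only scores a flow; it does not optimize it. The function name, the nested-list layout, the 4-connected neighbourhood, and the clamping of out-of-range targets to the image border are all assumptions for this example.

```python
def sift_flow_energy(s1, s2, u, v, t, eta, lam, d):
    """Energy of flow (u, v) between descriptor grids s1 and s2.
    s1[y][x], s2[y][x]: descriptor tuples; u[y][x], v[y][x]: integer flow."""
    h, w = len(s1), len(s1[0])
    energy = 0.0
    for y in range(h):
        for x in range(w):
            # Data term: truncated L1 distance between matched descriptors
            # (target clamped to the image border, an assumption).
            ty = min(max(y + v[y][x], 0), h - 1)
            tx = min(max(x + u[y][x], 0), w - 1)
            l1 = sum(abs(a - b) for a, b in zip(s1[y][x], s2[ty][tx]))
            energy += min(l1, t)
            # Small-displacement term.
            energy += eta * (abs(u[y][x]) + abs(v[y][x]))
            # Smoothness term over right and down neighbours (each edge once).
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny < h and nx < w:
                    energy += min(lam * abs(u[y][x] - u[ny][nx]), d)
                    energy += min(lam * abs(v[y][x] - v[ny][nx]), d)
    return energy

# Toy 1x2 "images" with 2-D descriptors (hypothetical data).
s1 = [[(1, 2), (3, 4)]]
s2 = [[(9, 9), (1, 2)]]
zero = [[0, 0]]
e_zero = sift_flow_energy(s1, s2, zero, zero, t=10, eta=0.5, lam=2, d=3)
e_shift = sift_flow_energy(s1, s2, [[1, 0]], zero, t=10, eta=0.5, lam=2, d=3)
```

Shifting the first pixel by one column finds a perfect descriptor match, so the data term drops by more than the displacement and smoothness penalties add.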

3. “Dense” Scene Alignment (Nonparametric Scene Parsing via Label Transfer)
[Figure: dense scene alignment example; panel (k): ground truth]

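4. Label Transfer (Nonparametric Scene Parsing via Label Transfer)
Pipeline: query image/SIFT and retrieved image/SIFT/label → SIFT flow → transferred image/SIFT/label → resulting query label.
Find the $\mathbf{c}$ that minimizes $J(\mathbf{c})$:
\[ J(\mathbf{c}) = \sum_{\mathbf{p}} E_L\big(\mathbf{c}(\mathbf{p});\, s, \{s'_i\}_{i=1:M}\big) + \alpha \sum_{\mathbf{p}} E_P\big(\mathbf{c}(\mathbf{p})\big) + \beta \sum_{(\mathbf{p},\mathbf{q}) \in \varepsilon} E_s\big(\mathbf{c}(\mathbf{p}), \mathbf{c}(\mathbf{q});\, I\big) + \log Z \]
Likelihood:
\[ E_L\big(\mathbf{c}(\mathbf{p}) = l;\, s, \{s'_i\}\big) = \begin{cases} \min_{i \in \Omega_{\mathbf{p},l}} \|s(\mathbf{p}) - s_i(\mathbf{p} + \mathbf{w}(\mathbf{p}))\|, & \Omega_{\mathbf{p},l} \neq \emptyset \\ \tau, & \Omega_{\mathbf{p},l} = \emptyset \end{cases} \]
\[ \Omega_{\mathbf{p},l} = \{\, i \;;\; c_i(\mathbf{p} + \mathbf{w}(\mathbf{p})) = l \,\}, \quad l = 1, \cdots, L \]
Prior:
\[ E_P\big(\mathbf{c}(\mathbf{p}) = l\big) = -\log \mathrm{hist}_l(\mathbf{p}) \]
Smoothness:
\[ E_s\big(\mathbf{c}(\mathbf{p}), \mathbf{c}(\mathbf{q});\, I\big) = \delta[\mathbf{c}(\mathbf{p}) \neq \mathbf{c}(\mathbf{q})]\; \frac{\xi + e^{-\gamma \|I(\mathbf{p}) - I(\mathbf{q})\|^2}}{\xi + 1} \]
Here $\delta[x]$ is 1 when its argument holds ($c_i \neq c_j$) and 0 otherwise ($c_i = c_j$).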
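
The likelihood and smoothness terms can be sketched per pixel as follows. This is an illustrative sketch only: the function names, the use of Euclidean distance for the unspecified norm, and the toy descriptors are assumptions, and the exponent is taken to be negative so that the penalty decays across strong image edges.

```python
import math

def likelihood_energy(s_p, warped, label, tau):
    """E_L(c(p) = label): smallest descriptor distance among retrieved images
    whose warped label at p equals `label`; tau if none of them votes for it.
    warped: list of (descriptor, label) pairs, one per retrieved image."""
    dists = [math.dist(s_p, s_i) for s_i, c_i in warped if c_i == label]
    return min(dists) if dists else tau

def smoothness_energy(c_p, c_q, i_p, i_q, xi, gamma):
    """E_s: zero for equal labels, contrast-sensitive penalty otherwise."""
    if c_p == c_q:
        return 0.0
    diff2 = sum((a - b) ** 2 for a, b in zip(i_p, i_q))
    return (xi + math.exp(-gamma * diff2)) / (xi + 1.0)

# Hypothetical warped candidates at one pixel from M = 3 retrieved images.
warped = [((0.0, 0.0), "sky"), ((3.0, 4.0), "sky"), ((1.0, 0.0), "tree")]
e_sky = likelihood_energy((0.0, 0.0), warped, "sky", tau=10.0)
e_tree = likelihood_energy((0.0, 0.0), warped, "tree", tau=10.0)
e_road = likelihood_energy((0.0, 0.0), warped, "road", tau=10.0)
flat = smoothness_energy("sky", "tree", (0.2, 0.2, 0.2), (0.2, 0.2, 0.2), xi=0.5, gamma=2.0)
edge = smoothness_energy("sky", "tree", (0.0, 0.0, 0.0), (1.0, 1.0, 1.0), xi=0.5, gamma=2.0)
```

A label that no retrieved image proposes at the pixel falls back to the constant τ, and a label change costs less across a strong colour edge than inside a flat region.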

0. Overview (SuperParsing: Scalable Nonparametric Image Parsing with Superpixels)
Pipeline: query image → nearest scene retrieval from the database of labeled images → retrieved set with labels → superpixel-wise label transfer by visual similarity → per-class likelihoods → merging & shape constraint → final labeling.

1. Scene Retrieval (SuperParsing: Scalable Nonparametric Image Parsing with Superpixels)
Query image → nearest scene retrieval from the database of labeled images → retrieved set.
Retrieve the nearest 200 images in global feature space $\in \mathbb{R}^{5952}$: GIST (960), spatial pyramid (4200), tiny image (768), color histogram (24).

2. Local Superpixel Likelihood (SuperParsing: Scalable Nonparametric Image Parsing with Superpixels)
Output of this step: per-class likelihoods.
The likelihood ratio for class $c$:
\[ L(s_i, c) = \frac{P(s_i \mid c)}{P(s_i \mid \bar{c})} = \prod_k \frac{P(f_i^k \mid c)}{P(f_i^k \mid \bar{c})} \]
Nonparametric density estimation (count the number of samples):
\[ \frac{P(f_i^k \mid c)}{P(f_i^k \mid \bar{c})} = \frac{n(c, \mathcal{N}_i^k)/n(c, \mathcal{D})}{n(\bar{c}, \mathcal{N}_i^k)/n(\bar{c}, \mathcal{D})} = \frac{n(c, \mathcal{N}_i^k)}{n(\bar{c}, \mathcal{N}_i^k)} \times \frac{n(\bar{c}, \mathcal{D})}{n(c, \mathcal{D})} \]
$\mathcal{N}_i^k$: all superpixels in the retrieval set whose $k$-th feature distance from $f_i^k$ is below the threshold $t_k$.
$n(c, \mathcal{D})$: the number of superpixels with class $c$ in set $\mathcal{D}$.
$\mathcal{D}$: all superpixels in the training set.
$\bar{c}$: all classes other than $c$.
Superpixel features (1741 dimensions): shape (67), location (65), texture/SIFT (800), color (105), appearance (704).
Notes: list the individual features and consider why so many are used. If any single feature differs strongly, a separating plane is likely to exist, so having many features probably helps linear separability. Also consider what nonparametric estimation means and what characterizes it.
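
The counting estimate can be sketched as below. This is a minimal sketch, not the authors' implementation: the function names are hypothetical, and the additive `smooth` constant on the neighbour counts is an assumption introduced here only to avoid division by zero when no neighbour has class $\bar{c}$.

```python
import math

def feature_likelihood_ratio(neighbor_labels, dataset_labels, c, smooth=1.0):
    """n(c, N) / n(c_bar, N) * n(c_bar, D) / n(c, D) for one feature k.
    neighbor_labels: labels of superpixels in N_i^k; dataset_labels: labels in D."""
    n_c_n = sum(1 for label in neighbor_labels if label == c)
    n_cbar_n = len(neighbor_labels) - n_c_n
    n_c_d = sum(1 for label in dataset_labels if label == c)
    n_cbar_d = len(dataset_labels) - n_c_d
    if n_c_d == 0:
        raise ValueError("class never occurs in the training set")
    # `smooth` is an assumed additive constant, not taken from the slides.
    return (n_c_n + smooth) / (n_cbar_n + smooth) * (n_cbar_d / n_c_d)

def log_likelihood_ratio(per_feature_neighbors, dataset_labels, c):
    """log L(s_i, c): sum of per-feature log ratios (product in log space)."""
    return sum(math.log(feature_likelihood_ratio(n, dataset_labels, c))
               for n in per_feature_neighbors)

# Hypothetical training set: 20 sky and 80 tree superpixels.
dataset = ["sky"] * 20 + ["tree"] * 80
neighbors = ["sky"] * 3 + ["tree"]          # N_i^k for one feature
ratio = feature_likelihood_ratio(neighbors, dataset, "sky")
log_l = log_likelihood_ratio([neighbors, neighbors], dataset, "sky")
```

The second factor corrects for class frequency: sky is rare in the training set, so three sky neighbours out of four count for more than the raw 3:1 vote.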

3. Contextual Inference (SuperParsing: Scalable Nonparametric Image Parsing with Superpixels)
Many state-of-the-art approaches encode such constraints with the help of CRF models; however, CRFs tend to be very costly in terms of both learning and inference.
Markov random field:
Given an undirected graph $G = (V, E)$, each vertex $v_i \in V$ corresponds to a random variable $X_i$.
The variables satisfy the local Markov property.
Given the energy, the distribution maximizes entropy.
Clique factorization (a clique is a set of vertices in which every two vertices are connected by an edge):
\[ P(X_1 = x_1, \cdots, X_n = x_n) = \prod_{C \in cl(G)} \phi_C(x_C) \]
Find the $\mathbf{c}$ that minimizes $J(\mathbf{c})$:
\[ J(\mathbf{c}) = \sum_{s_i \in S} E_L(s_i, c_i) + \lambda \sum_{(s_i, s_j) \in A} E_S(c_i, c_j) \]
Likelihood: $E_L(s_i, c_i) = -w_i \log L(s_i, c_i)$
Smoothing (edge): $E_S(c_i, c_j) = -\log P(c_i, c_j) \times \delta[c_i \neq c_j]$
$\mathbf{c}$: a vector of labels for the superpixels. $S$: the set of superpixels. $A$: the set of pairs of adjacent superpixels. $\delta[x]$ is 1 when $c_i \neq c_j$ and 0 when $c_i = c_j$.
Notes: this slide shows the problem being defined as a Markov random field. The data term says that a label with high likelihood will be selected; on its own this is the same as picking the maximum-likelihood label. The smoothing term says that equal labels carry no penalty and different labels carry a penalty.
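
A small sketch of the energy $J(\mathbf{c})$ and one way to lower it is given below. Note the swapped-in technique: the deck's comparison slide lists graph cut as the SuperParsing optimizer, while this example uses iterated conditional modes (ICM) because it fits in a few lines. The Potts penalty standing in for $-\log P(c_i, c_j)\,\delta[c_i \neq c_j]$ and all names and numbers are assumptions.

```python
def mrf_energy(labels, unary, edges, pairwise, lam):
    """J(c) = sum_i E_L(s_i, c_i) + lam * sum_(i,j) E_S(c_i, c_j)."""
    data = sum(unary[i][labels[i]] for i in range(len(labels)))
    smooth = sum(pairwise(labels[i], labels[j]) for i, j in edges)
    return data + lam * smooth

def icm(unary, edges, pairwise, lam, sweeps=10):
    """Greedy coordinate descent on J(c); `pairwise` must be symmetric.
    ICM is a stand-in here, not the graph-cut solver named in the slides."""
    n = len(unary)
    labels = [min(u, key=u.get) for u in unary]   # start from the data term
    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    for _ in range(sweeps):
        changed = False
        for i in range(n):
            def local(c):
                return unary[i][c] + lam * sum(
                    pairwise(c, labels[j]) for j in neighbors[i])
            best = min(unary[i], key=local)
            if local(best) < local(labels[i]):
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels

def potts(a, b):
    # Assumed stand-in for -log P(c_i, c_j) * delta[c_i != c_j].
    return 0.0 if a == b else 1.0

# Three superpixels in a chain; unary[i][label] plays the role of E_L.
unary = [{"sky": 0.1, "tree": 2.0},
         {"sky": 1.0, "tree": 0.9},
         {"sky": 0.2, "tree": 2.0}]
edges = [(0, 1), (1, 2)]
initial = [min(u, key=u.get) for u in unary]
result = icm(unary, edges, potts, lam=0.5)
```

The middle superpixel weakly prefers "tree" on its own, but the smoothing term makes agreeing with both "sky" neighbours cheaper overall.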

4. Extend to Geometric Classes (SuperParsing: Scalable Nonparametric Image Parsing with Superpixels)
Many state-of-the-art approaches encode such constraints with the help of CRF models; however, CRFs tend to be very costly in terms of both learning and inference.
Semantic ↔ geometric class matching: semantic labels such as $c_{sky}$, $c_{ground}$, $c_{building}$ are paired with geometric labels such as $g_{vertical}$, $g_{horizontal}$.
The MRF energy function is extended to cover semantic and geometric classes jointly.
Find the $\mathbf{c}, \mathbf{g}$ that minimize $H(\mathbf{c}, \mathbf{g})$:
\[ H(\mathbf{c}, \mathbf{g}) = J(\mathbf{c}) + J(\mathbf{g}) + \mu \sum_{s_i \in SP} \psi(c_i, g_i) \]
The last term enforces coherence of each pair $(c_i, g_i)$; $SP$ is the set of superpixels.
Note: "clique" is pronounced "click".
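
As a rough sketch of the joint objective, the snippet below adds a coherence penalty to two already-computed energies. The slides do not define $\psi$; taking it as 0 for a compatible (semantic, geometric) pair and 1 otherwise is an assumption, and the compatibility set and all numbers are hypothetical.

```python
def joint_energy(j_semantic, j_geometric, sem_labels, geo_labels, compatible, mu):
    """H(c, g) = J(c) + J(g) + mu * sum_i psi(c_i, g_i).
    psi is assumed to be 0 for compatible pairs and 1 otherwise."""
    incoherent = sum(0 if (c, g) in compatible else 1
                     for c, g in zip(sem_labels, geo_labels))
    return j_semantic + j_geometric + mu * incoherent

# Hypothetical compatibility set between semantic and geometric labels.
compatible = {("sky", "sky"), ("building", "vertical"), ("road", "horizontal")}
sem = ["sky", "building", "road"]
h_bad = joint_energy(1.0, 2.0, sem, ["sky", "vertical", "vertical"], compatible, mu=0.5)
h_good = joint_energy(1.0, 2.0, sem, ["sky", "vertical", "horizontal"], compatible, mu=0.5)
```

Labeling the road superpixel as "vertical" costs one unit of $\mu$, so the coherent labeling has the lower joint energy when the two per-task energies are equal.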

Comparison of NSS algorithms
| | SIFT Flow | SuperParsing |
| Scene retrieval | $\langle k, \epsilon \rangle$-NN; GIST + spatial pyramid of HoG; SIFT flow scoring | k-NN, 200 images; GIST + spatial pyramid of SIFT + tiny images + color histogram |
| Prior | Sum of all labels in the dataset | Not used |
| Likelihood | Labels transferred by SIFT flow | Superpixel-wise nonparametric density estimation with several features |
| Contextual information | | (Semantic, geometric) pair |
| MRF energy function | $E(\mathbf{c}) = E_L(\mathbf{c}) + E_P(\mathbf{c}) + E_S(\mathbf{c})$ | $E(\mathbf{c}, \mathbf{g}) = E_L(c_i) + E_S(c_i) + J(\mathbf{g}) + \psi(\mathbf{c}, \mathbf{g})$ |
| Optimization | Belief propagation | Graph cut |

Discussion - Experiments
Datasets: SIFT Flow, Barcelona, LMO, Polo, VOC 2011
Dense Scene Alignment by SIFT flow (2009): 74.75, N/A, 74.8, 89.8 (2011)
SuperParsing: Local Labeling (2010): 73.2, 62.5
SuperParsing: MRF (2010): 76.3, 66.6
SuperParsing: Joint semantic/geometric (2010): 76.9, 66.9, 87.9
Semantic Segmentation using Regions and Parts (2012): 40.8
Semantic Segmentation with Second-order Pooling (2012): 47.6
Tensor-based High-order Semantic Relation Transfer for Semantic Scene Segmentation (2013): 77.1, 94.2
Finding Things: Image Parsing with Regions and Per-Exemplar Detectors (2013): 78.6
Learning Hierarchical Features for Scene Labeling (2013): 78.5, 67.8
Rich feature hierarchies for accurate object detection and semantic segmentation (2014): 47.9

Discussion - Datasets

Conclusion
Nonparametric approaches
Dataset: concentration on specific classes.
How to design the energy function in the MRF: shape smoothing filter.
Additional information can be added: geometric context gives a performance enhancement of 1 ~ 2%.

Semantic Segmentation using SO-NMF
[Figure labels: Human, Tree, …, N-times]
Note: could a sparsity constraint be applied here as well?

Future Works
Other methods: deep learning features, 2nd-order pooling, hierarchical inference.
Detecting objects: deep learning features.
Reflect context information.
Experiments on large datasets.

Thank you for listening.

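3-4. Optimization of SIFT flow (Nonparametric Scene Parsing via Label Transfer)
Match points by SIFT descriptors, not by image patches.
Find the $\mathbf{w}$ that minimizes $E(\mathbf{w})$:
\[ E(\mathbf{w}) = \sum_{\mathbf{p}} \min\big(\|s_1(\mathbf{p}) - s_2(\mathbf{p} + \mathbf{w}(\mathbf{p}))\|_1,\; t\big) + \sum_{\mathbf{p}} \eta\,\big(|u(\mathbf{p})| + |v(\mathbf{p})|\big) + \sum_{(\mathbf{p},\mathbf{q}) \in \varepsilon} \Big[\min\big(\lambda\,|u(\mathbf{p}) - u(\mathbf{q})|,\; d\big) + \min\big(\lambda\,|v(\mathbf{p}) - v(\mathbf{q})|,\; d\big)\Big] \]
If the image has $h^2$ pixels:
Dual-layer loopy belief propagation: $O(h^8) \rightarrow O(h^4)$
Coarse-to-fine matching scheme: $O(h^4) \rightarrow O(h^2 \log h)$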

2-2. Used Features (SuperParsing: Scalable Nonparametric Image Parsing with Superpixels)
Superpixel features (1741 dimensions), used to compute the per-class likelihoods: shape (67), location (65), texture/SIFT (800), color (105), appearance (704).
Notes: list the individual features and consider why so many are used. If any single feature differs strongly, a separating plane is likely to exist, so having many features probably helps linear separability. Also consider what nonparametric estimation means and what characterizes it.

Discussion – Scene Retrieval