1 Segmentation with Scene and Sub-Scene Categories Joseph Djugash Input Image Scene/Sub-Scene Classification Segmentation
2 Problem Statement Goal: Accurate segmentation of salient objects/regions in any image. Problems/Issues: What is a salient object(s)? How do we identify the presence of these objects? What metric/feature/cue do we use? Is this consistent over all images?
3 Outline Normalized Cut Learning segmentation Methods The Naïve Approach Affinity Matching Minimizing Segmentation Error Discussion of Results
4 Bottom-Up Segmentation Normalized Cuts, Mean Shift, etc. Drawbacks Parameters Number of Clusters/Segments Cluster Size (Sigma values) Performance drops when dealing with a wide variety of images 1. Jianbo Shi; Malik, J. Normalized cuts and image segmentation. PAMI (2000) 2. D. Comaniciu, P. Meer. Mean Shift: A Robust Approach toward Feature Space Analysis. PAMI (2002)
5 Class Specific Segmentation Image patches from training images are fit to the input image Patch consistency checks correct inaccurate matching and segmentation Only known object classes can be segmented correctly Large database required to encompass all possible objects 3. Eran Borenstein, Shimon Ullman. Class-Specific, Top-Down Segmentation. ECCV (2002)
6 Outline Normalized Cut Learning segmentation Methods The Naïve Approach Affinity Matching Minimizing Segmentation Error Discussion of Results
7 The Naïve Approach Scene Categories Parameter Learning Parameter Look-Up using Bag- Of-Words Scene Categories Training Phase Image Segmentation Testing Phase Learnt Parameters: Cluster Range {min & max # of clusters} Sigma Values {controls size of cluster/segment} The parameters are learnt using a supervised reinforcement learning algorithm (Feedback +/ o /–)
8 Results: The Naïve Approach
9 Outline Normalized Cut Learning segmentation Methods The Naïve Approach Affinity Matching Minimizing Segmentation Error Discussion of Results
10 Affinity Matching Parameter Learning Nearest Neighbor Classification& Parameter Look-Up Training Phase Image Segmentation Testing Phase Parameters Learnt using reinforcement learning Nearest Neighbor Matching performed on a set of stored Affinity matrices Query images are classified and learnt parameters are used to Segments Image
11 Results: Affinity Matching
12 Outline Normalized Cut Learning segmentation Methods The Naïve Approach Affinity Matching Minimizing Segmentation Error Discussion of Results
13 Minimizing Segmentation Error LabelMe Data Gist Features Kmeans Clustering Image Segmentation Training Phase Parameter Learning Minimizing Segmentation Cost to Human Labeled Data Parameter Learning Minimizing Segmentation Cost to Human Labeled Data Segmentation Error J = i ( J N + J S + J O ) Similarity to Non-Objects J N = j W( px In, px Out ) Cluster Cost J S = j e ( γ *|#Seg – #Obj|) i є Images in Cluster j є Labeled Objects Testing Phase Dissimilarity within Objects J N = j e ( ) px In W( px In, px In ) 4. A. Torralba, K. P. Murphy, W. T. Freeman and M. A. Rubin. Context-based vision system for place and object recognition. AIM (2003) Object Pixels
14 Results: Minimizing Segmentation Error
15 Results: Minimizing Segmentation Error
16 Discussion The Naïve Approach Uses intuitive categories that humans can identify with Human feedback can be unreliable and inconsistent Broad category labels lead to segmentation inaccuracy Affinity Matching Affinity matrix provides a good feature space for segmentation Closeness in the affinity matrix might not necessarily imply similarity in segmentation parameters Minimizing Segmentation Error A qualitative measure for evaluating the validity of a segmentation Human segmentation should try to realize image cues NCut segmentation prefers edge contours in the image
17 Questions?
18 Outline – Detailed The Naïve Approach Reinforcement Learning Bag-of-words scene classification Affinity Matching NN classification on Affinity Matrix Learning similar image structure Human Segmentation & Minimizing Cost LabelMe Data Gist Features
19
20 Learning Scene Categories – Bag-of-Words Scene Categories learnt from clustering on image codewords Feature Detection and Representation can be done using various Image Cues Codewords stored in the database need to cover a wide spectrum 3. Fei-Fei and P. Perona. A Bayesian hierarchical model for learning natural scene categories. CVPR (2005)