Foreground Focus: Finding Meaningful Features in Unlabeled Images Yong Jae Lee and Kristen Grauman University of Texas at Austin.

Supervised learning methods yield good recognition performance in practice. But…
– Supervision is expensive: collecting training examples, labeling, segmentation, etc.
– Supervision has bias: the variability of the target data may not be captured (i.e., the result is not general enough)
We propose an unsupervised foreground detection and category learning method based on image clustering.

Related Work
Unsupervised Category Discovery
– Topic models (pLSA, LDA): Fergus et al., Sivic et al., Quelhas et al., ICCV 2005; Fei-Fei & Perona, CVPR 2005; Liu & Chen, ICCV 2007
– Image clustering: Grauman & Darrell, CVPR 2006; Dueck & Frey, ICCV 2007
– Image clustering with localization: Kim et al., CVPR 2008
Supervised Feature Selection / Part Discovery
– Discriminative feature selection: Dorko & Schmid, ICCV 2003; Quack et al., ICCV 2007
– Weakly supervised learning: Weber et al., ECCV 2000; Fergus et al., CVPR 2003; Chum & Zisserman, CVPR 2007…
– Query expansion: Chum et al., ICCV 2007

Problem
[Figure: clusters formed from full image matches]

Mutual Relationship between Foreground Features and Clusters
If we have only foreground features, we can form good clusters…
[Figure panels: clusters formed from full image matches; clusters formed from foreground matches]

Mutual Relationship between Foreground Features and Clusters
If we have good clusters, we can detect the foreground…
If we have only foreground features, we can form good clusters…

Our Approach
Unsupervised task that iteratively seeks the mutual support between discovered objects and their defining features:
– Update clusters based on weighted semi-local feature matches
– Refine feature weights given the current clusters
[Plot: feature weights versus feature index]

Sets of local features
$X = \{(f_1(X), w_1), (f_2(X), w_2), \ldots, (f_n(X), w_n)\}$
$Y = \{(f_1(Y), w_1), (f_2(Y), w_2), \ldots, (f_m(Y), w_m)\}$

Optimal Partial Matching
$X = \{(f_1(X), w_1), (f_2(X), w_2), \ldots, (f_n(X), w_n)\}$
$Y = \{(f_1(Y), w_1), (f_2(Y), w_2), \ldots, (f_m(Y), w_m)\}$
Earth Mover's Distance [Rubner et al., IJCV 2000], defined in terms of:
– flows: scalars giving the amount of weight mapped from $f_i(X)$ to $f_j(Y)$
– $f_i(X)$, $f_j(Y)$: features from sets $X$ and $Y$
– $D(f_i(X), f_j(Y))$: distance between the descriptors
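For reference, the Earth Mover's Distance of Rubner et al. is commonly written as below. The flow symbol $F_{ij}$ and the superscripted weights $w_i^{(X)}$, $w_j^{(Y)}$ are notation assumed here for clarity, and the deck may have used an unnormalized variant.

```latex
\mathrm{EMD}(X, Y) =
  \frac{\sum_{i=1}^{n} \sum_{j=1}^{m} F_{ij}\, D\big(f_i(X), f_j(Y)\big)}
       {\sum_{i=1}^{n} \sum_{j=1}^{m} F_{ij}},
\qquad
F = \arg\min_{F'} \sum_{i=1}^{n} \sum_{j=1}^{m} F'_{ij}\, D\big(f_i(X), f_j(Y)\big)

\text{subject to}\quad
F_{ij} \ge 0,\quad
\sum_{j=1}^{m} F_{ij} \le w_i^{(X)},\quad
\sum_{i=1}^{n} F_{ij} \le w_j^{(Y)},\quad
\sum_{i=1}^{n} \sum_{j=1}^{m} F_{ij} = \min\Big(\sum_{i=1}^{n} w_i^{(X)},\ \sum_{j=1}^{m} w_j^{(Y)}\Big)
```

The last constraint is what makes the match partial: only as much weight as the lighter set holds has to be moved.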

Feature Contribution to Match
[Figure: features f_1(X), f_2(X), f_3(X) of image X matched to features f_1(Y), f_2(Y) of image Y, with ground distance D(f_i(X), f_j(Y)); bar plot of contribution to match versus feature index]
Weight computation is influenced by both the flow (amount of mass transferred) and the distance between the matching features:
Contribution = weight / distance
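The rule "Contribution = weight / distance" can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the flow matrix has already been produced by an EMD solver, and the function name and the `eps` guard against zero distances are hypothetical.

```python
def feature_contributions(flow, dist, eps=1e-9):
    """Return one contribution score per feature of X.

    flow[i][j]: weight moved from feature i of X to feature j of Y
                (assumed to come from an EMD solver, not computed here).
    dist[i][j]: ground distance between the two descriptors.
    eps:        hypothetical guard against division by zero.
    """
    scores = []
    for flow_row, dist_row in zip(flow, dist):
        # Sum flow / distance over every feature of Y that receives mass.
        score = sum(f / (d + eps) for f, d in zip(flow_row, dist_row) if f > 0)
        scores.append(score)
    return scores


# Three features in X, two in Y; the third X feature is left unmatched.
flow = [[0.5, 0.0], [0.0, 0.25], [0.0, 0.0]]
dist = [[0.5, 2.0], [1.5, 1.0], [3.0, 4.0]]
scores = feature_contributions(flow, dist)
```

A feature that sends a lot of mass over a short distance scores high, while an unmatched feature scores zero.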

Computing Feature Weights
[Plots: contribution to match versus feature index; resulting new feature weights]
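One way the weight update could look is sketched below. The deck states only that contributions yield new weights and that a normalization keeps the original weight; averaging the contributions over the other images in the cluster is an assumption made here, and `update_weights` is a hypothetical name.

```python
def update_weights(old_weights, contributions_per_match):
    """Sketch of a feature weight update for one image.

    old_weights: current weights of the image's features.
    contributions_per_match: one list of per-feature contributions for each
        other image in the same cluster (assumed input layout).
    """
    n = len(old_weights)
    if not contributions_per_match:
        return list(old_weights)
    n_matches = len(contributions_per_match)
    # Assumption: average each feature's contribution over all matches.
    mean = [sum(c[i] for c in contributions_per_match) / n_matches
            for i in range(n)]
    total = sum(mean)
    if total == 0:
        return list(old_weights)
    # Rescale so the image keeps its original total weight.
    scale = sum(old_weights) / total
    return [m * scale for m in mean]


old = [1.0, 1.0, 1.0]
contribs = [[1.0, 0.25, 0.0], [3.0, 0.75, 0.0]]
new = update_weights(old, contribs)
```

In this toy case the weight shifts toward the first feature while the total stays at 3.0.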

Computing Image Similarity
Matching features have high weights and high similarity → high contribution to match score

Computing Image Similarity
Matching features have low weights and low similarity → low (negligible) contribution to match score

Computing Image Similarity
One matching feature has a low weight and the other a high weight, with high similarity → low contribution to match score. The amount of weight that is matched is always the smaller of the two feature weights.
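The three cases can be checked with a small sketch. It simplifies the matching to a single feature pair (a full partial match may split one feature's weight across several partners), treats high similarity as a small distance, and uses a hypothetical `pair_contribution` helper with an assumed `eps` guard.

```python
def pair_contribution(w_x, w_y, distance, eps=1e-9):
    """Contribution of one matched feature pair to the match score."""
    # The matched weight is always the smaller of the two feature weights.
    matched = min(w_x, w_y)
    return matched / (distance + eps)


cases = {
    "high weights, high similarity": pair_contribution(0.9, 0.8, 0.1),
    "low weights, low similarity": pair_contribution(0.1, 0.1, 2.0),
    "mixed weights, high similarity": pair_contribution(0.1, 0.9, 0.1),
}
```

The mixed case stays far below the high/high case because the low-weight feature caps how much mass can be matched.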

Forming Clusters

Compute Pair-wise Partial Matching Image Similarities

Forming Clusters Normalized Cuts Clustering
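To make the clustering objective concrete, the sketch below evaluates the two-way normalized cut value of Shi and Malik for a given partition of a pairwise affinity matrix. It only scores a partition; a real Normalized Cuts implementation finds the partition through an eigenvector problem, which is omitted here. The function name and the toy affinities are illustrative assumptions.

```python
def ncut_value(affinity, group_a):
    """Two-way normalized cut value for a partition of the images.

    affinity: symmetric matrix of pairwise image similarities.
    group_a:  indices placed in the first group; the rest form the second.
    Both groups must be non-empty.
    """
    n = len(affinity)
    a = set(group_a)
    b = set(range(n)) - a
    # Total affinity crossing the partition.
    cut = sum(affinity[i][j] for i in a for j in b)
    # Total affinity from each group to all nodes.
    assoc_a = sum(affinity[i][j] for i in a for j in range(n))
    assoc_b = sum(affinity[i][j] for i in b for j in range(n))
    return cut / assoc_a + cut / assoc_b


# Toy affinities: images 0 and 1 are similar, as are images 2 and 3.
affinity = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
good = ncut_value(affinity, [0, 1])
bad = ncut_value(affinity, [0, 2])
```

The partition that keeps similar images together has the lower value, which is what the clustering step seeks.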

Mutual Relationship between Foreground Features and Clusters
If we have good clusters, we can detect the foreground…
If we have only foreground features, we can form good clusters…
Now we have the pieces to do both…

Cluster and Feature Weight Refinement: Iteration 1
Images as Local Feature Sets → Pair-wise Partial Matching → Normalized Cuts Clustering → Initial Set of Clusters
[Plot: feature weights versus feature index]

Cluster and Feature Weight Refinement: Iteration 1
Compute Feature Weights → New Feature Weights
[Plot: feature weights versus feature index]

Cluster and Feature Weight Refinement: Iteration 2
Images as Local Feature Sets with New Weights → Pair-wise Partial Matching (Noticeable Change in Matching) → Normalized Cuts Clustering
[Plot: feature weights versus feature index]

Cluster and Feature Weight Refinement: Iteration 2
New Set of Clusters → Compute Feature Weights → New Feature Weights
[Plot: feature weights versus feature index]

Cluster and Feature Weight Refinement: Iteration 3
Pair-wise Partial Matching + Normalized Cuts → Final Set of Clusters, New Feature Weights
[Plot: feature weights versus feature index]
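The alternation shown across these iterations can be sketched as a short loop. Everything here is a skeleton under stated assumptions: `match`, `cluster`, and `reweight` are hypothetical callables standing in for partial matching, Normalized Cuts, and the weight update, and `n_iters` is an assumed fixed iteration count (the slides show three).

```python
def foreground_focus(images, n_clusters, n_iters, match, cluster, reweight):
    """Alternate between clustering and feature reweighting.

    images:   list of feature sets, each a list of (descriptor, weight) pairs.
    match:    hypothetical function (set_a, set_b) -> similarity score.
    cluster:  hypothetical function (similarity_matrix, k) -> cluster labels.
    reweight: hypothetical function (images, labels) -> reweighted images.
    """
    labels = None
    for _ in range(n_iters):
        n = len(images)
        # Pairwise partial-match similarities under the current weights.
        sim = [[match(images[i], images[j]) for j in range(n)]
               for i in range(n)]
        # Group images using the weighted similarities.
        labels = cluster(sim, n_clusters)
        # Refine feature weights given the current clusters.
        images = reweight(images, labels)
    return labels, images
```

Passing the three steps in as functions keeps the loop independent of any particular matcher or clustering routine.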

Local features may not produce good matches…
Semi-local features: Lazebnik et al., BMVC 2004; Sivic & Zisserman, CVPR 2004; Agarwal & Triggs, ECCV 2006; Pantofaru et al., Beyond Patches Workshop 2006; Quack et al., ICCV 2007
[Figure panels: local features; our proximity distribution descriptor]

Experiments
Goals:
1) Unsupervised foreground discovery
2) Unsupervised category discovery
3) Comparison with related methods
Datasets: Caltech-101, Microsoft Research Cambridge, Caltech-4
Semi-local features: densely sampled SIFT, DoG SIFT, Hessian-Affine SIFT
Number of clusters: number of classes

Quality of Foreground Detection
Object categories with the highest clutter were chosen:
– Two supervised classifiers were built: 1) trained on all features, 2) trained on foreground features only
– Categories were ranked by how much segmentation helped supervised classification

Quality of Foreground Detection
10-class subset: highly weighted features

Quality of Clusters Formed
Cluster quality for the 4-class and 10-class sets of Caltech-101
Quality measure: F-measure
Black dotted lines indicate the best possible quality, i.e., what could be obtained if the ground-truth segmentation were known
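The slide does not say which F-measure variant was used. A common clustering F-measure, sketched below as an assumption, scores each true class by its best-matching cluster and weights the result by class size.

```python
from collections import Counter


def clustering_f_measure(true_labels, cluster_labels):
    """Class-size-weighted best-match F-measure (one common variant)."""
    n = len(true_labels)
    class_sizes = Counter(true_labels)
    cluster_sizes = Counter(cluster_labels)
    overlap = Counter(zip(true_labels, cluster_labels))
    total = 0.0
    for cls, n_cls in class_sizes.items():
        best = 0.0
        for clu, n_clu in cluster_sizes.items():
            n_both = overlap[(cls, clu)]
            if n_both == 0:
                continue
            precision = n_both / n_clu
            recall = n_both / n_cls
            best = max(best, 2 * precision * recall / (precision + recall))
        # Weight each class by its share of the dataset.
        total += (n_cls / n) * best
    return total


perfect = clustering_f_measure(["a", "a", "b", "b"], [0, 0, 1, 1])
mixed = clustering_f_measure(["a", "a", "b", "b"], [0, 0, 0, 1])
```

A perfect clustering scores 1.0, and merging one image into the wrong cluster lowers the score.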

Comparison with clustering methods
– Affinity Propagation: message-passing algorithm that identifies good exemplars by propagating non-metric affinities [Dueck & Frey, ICCV 2007]
– Partial Match Clusters: forms groups with partial-match spectral clustering but does not iteratively improve foreground feature weights and cluster assignments [Grauman & Darrell, CVPR 2006]
Caltech-4 dataset (N=3188): 10 runs with 400 randomly selected images
Caltech-101 subsets: 7-class (N=441) and 20-class (N=1230)

Comparison with topic models
Comparison of accuracy of foreground discovery
Positive class: Caltech motorcycle class (826 images)
Negative class: Caltech background class (900 images)
Foreground detection rate: threshold varied among the top 20% most confident features
[1] Correspondence-based pLSA variant [Liu & Chen, ICCV 2007]
[2] pLSA with spatial information [Liu & Chen, CVPR Workshop, 2006]

Assumptions and Limitations
– The pattern must have support among multiple examples in the dataset
– Some support must be detected in the initial iteration
– Background can be consistently reoccurring: introduce semi-supervision

Contributions
– Unsupervised foreground feature selection from unlabeled images
– Automatic object category learning
– Mutual reinforcement of foreground and category discovery benefits both
– Novel semi-local descriptor

Future Work
– Incremental updates to the unlabeled dataset
– Extension to multi-label cluster assignments
– Automatic model selection: k
– Automatically construct summaries of unstructured image collections

Questions?

Quality of Foreground Detection and Clusters Formed
Microsoft Research Cambridge (MSRC) v1 dataset

Proximity Distribution Descriptor
p: base feature
Ellipses denote features, their patterns indicate the visual word types, and numbers indicate the rank order of spatial proximity to the base feature
Motivated by Proximity Distribution Kernels [Ling & Soatto, ICCV 2007]
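A descriptor of this kind might be built as in the sketch below. It is an interpretation of the figure description, not the authors' definition: it assumes the descriptor is a cumulative count of visual words over the base feature's spatially nearest neighbors, ordered by rank, and `proximity_descriptor`, `n_words`, and `max_rank` are hypothetical names.

```python
import math


def proximity_descriptor(base_xy, neighbors, n_words, max_rank):
    """Cumulative visual-word counts by spatial rank around a base feature.

    base_xy:   (x, y) position of the base feature p.
    neighbors: list of ((x, y), word_id) for the other features.
    Returns a list of rows; row r counts the words among the r + 1
    features nearest to the base feature.
    """
    # Rank neighbors by spatial distance to the base feature.
    ranked = sorted(neighbors, key=lambda nb: math.dist(base_xy, nb[0]))
    hist = []
    counts = [0] * n_words
    for _, word in ranked[:max_rank]:
        counts[word] += 1
        hist.append(list(counts))
    return hist


neighbors = [((3, 0), 1), ((1, 0), 0), ((2, 0), 1)]
desc = proximity_descriptor((0, 0), neighbors, n_words=2, max_rank=2)
```

Because each row accumulates the rows before it, the descriptor records which visual words appear near the base feature and roughly how near they are.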

Computing Feature Weights
[Plot: new feature weights]
Normalization to keep original weight

Highly weighted features [Face, Dalmatian, Hedgehog, Okapi]

[9] Affinity Propagation [Dueck & Frey, ICCV 2007]
FF-Dense: Foreground Focus with semi-local descriptors (dense SIFT base features)
FF-Sparse: Foreground Focus with semi-local descriptors (DoG SIFT base features)
FF-SIFT: Foreground Focus with DoG SIFT features
