Unsupervised Detection of Regions of Interest Using Iterative Link Analysis. Gunhee Kim (1), Antonio Torralba (2). (1) SCS, CMU; (2) CSAIL, MIT. Neural Information Processing Systems 2009, November 30, 2009.
Unsupervised Detection of ROIs: from a set of images… to rectangular Regions of Interest.
Why Is ROI Detection Useful? Scene recognition [Quattoni & Torralba, CVPR09]; training for recognition [Bosch et al., ICCV07]; Flickr Notes.
Alternating Optimization: One of the most widely used heuristics for iterative optimization. Optimizing over two sets of variables jointly is not easy, but optimizing one set while the other is held fixed is affordable.
Example – Iterative Closest Point Algorithm [Besl & McKay, 1992]. Goal: find correspondences between two point clouds. Transformation: estimate the transformation parameters. Correspondences: associate points by the nearest-neighbor (NN) criterion.
Example – K-means. Goal: clustering. After initialization, alternate two steps. Cluster membership: assign each point to its nearest cluster center. Cluster centers: take the mean of each cluster's members. (Pictures from Bishop’s book.)
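As a rough illustration of the alternation on this slide, here is a minimal pure-Python K-means sketch. The function names (`nearest`, `kmeans`), the stopping rule, and the toy points are assumptions made for the example; they do not come from the talk.

```python
def nearest(point, centers):
    """Index of the center closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centers[k])),
    )


def kmeans(points, centers, max_iters=100):
    """Alternate the two K-means steps until the centers stop moving."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iters):
        # Step 1 (cluster membership): centers fixed, update assignments.
        labels = [nearest(pt, centers) for pt in points]
        # Step 2 (cluster centers): assignments fixed, update centers.
        new_centers = []
        for k, old in enumerate(centers):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(old)  # empty cluster: leave its center in place
        if new_centers == centers:
            break
        centers = new_centers
    # Final assignment against the final centers.
    labels = [nearest(pt, centers) for pt in points]
    return centers, labels


# Toy data (hypothetical): two well-separated groups of 2-D points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers, labels = kmeans(points, centers=[(0.0, 0.0), (10.0, 10.0)])
```

Each step only ever optimizes one set of variables (assignments or centers) while the other is fixed, which is the pattern the ROI method borrows.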
Unsupervised Detection of ROIs. Goal: find the best ROIs in each image of the dataset. Refine ROIs (detection or localization): where is the butterfly? Find exemplars (modeling or ranking exemplars): what are the exemplars?
Our Approach: Inspired by alternating optimization and based on link analysis of a hypothesis network. Find exemplars = central and diverse hubs. Refine ROIs = highly ranked hypotheses in each image with respect to the exemplars. Easy, fast and dynamic: –A simple heuristic keeps computation linear in the dataset size. –Ex.: 4.5 hours for 200k images with a naïve MATLAB implementation.
ROI Candidates and Description: For each image, define a set of ROI hypotheses. –At least one of the hypotheses should be a good ROI. Description: spatial pyramids of visual words and HOG. Similarity measure: cosine similarity. [Figure: an image, its 15 segments, and 43 ROI hypotheses; feature panels: visual words, edge gradient.]
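The similarity measure named on this slide can be sketched as follows. Descriptors are assumed here to be plain lists of non-negative histogram counts, and returning 0.0 for an all-zero descriptor is a convention chosen for this example, not something the talk specifies.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length descriptor vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumed convention for empty descriptors
    return dot / (norm_u * norm_v)


# Hypothetical descriptors: the second is a scaled copy of the first.
same_direction = cosine_similarity([1, 2, 3], [2, 4, 6])
orthogonal = cosine_similarity([1, 0], [0, 1])
```

Because the measure ignores vector length, two ROI hypotheses with the same histogram shape but different pixel counts score as highly similar.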
Algorithm – Input: an image set and its ROI hypothesis set.
Algorithm – Initialization: best ROI = the image itself!
Algorithm – Initialization: Initialization is essential for success! Why is starting from the whole image a feasible idea for Web images? –Most pictures are taken from a canonical view, so the object of interest sits near the center and occupies a significant part of the image. –Given a similarity network over a sufficiently large number of images, democratic voting reveals the most dominant visual information as hubs [Kim et al. 08]. [Figure: examples of top-ranked images.]
Algorithm – First Hub Seeking: generate a similarity network and find a hub set.
Algorithm – First ROI Refinement: bipartite graph between the hub set and all ROI hypotheses of an image.
Algorithm – Second Hub Seeking: keep iterating…
Hub Seeking with Centrality & Diversity: a mean-shift-like hub seeking algorithm (Mean Shift [Comaniciu and Meer, PAMI 2002]). [Figure: K-NN graph G(t); K-NN similarity matrix; PageRank vector. Degree distribution ~ PageRank vector.]
Hub Seeking with Centrality & Diversity: a mean-shift-like hub seeking algorithm. Max P-value! Fixed-radius window (as in Mean Shift) = max. reachable probability d (= 0.1).
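The centrality half of hub seeking (a PageRank vector over a K-NN similarity graph) might look roughly like the sketch below. The damping factor of 0.85, the iteration count, and the toy similarity matrix are assumptions; the mean-shift-like diversity step from the slides is not reproduced here.

```python
def knn_graph(sim, k):
    """Keep, for each node, only the edges to its k most similar other nodes."""
    n = len(sim)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: sim[i][j],
            reverse=True,
        )
        for j in others[:k]:
            W[i][j] = sim[i][j]
    return W


def pagerank(W, damping=0.85, iters=100):
    """Power iteration on the row-normalized weight matrix W."""
    n = len(W)
    row_sums = [sum(row) for row in W]
    p = [1.0 / n] * n
    for _ in range(iters):
        nxt = [(1.0 - damping) / n] * n
        for i in range(n):
            for j in range(n):
                # Rows with no outgoing weight spread their mass uniformly.
                share = W[i][j] / row_sums[i] if row_sums[i] > 0 else 1.0 / n
                nxt[j] += damping * p[i] * share
        p = nxt
    return p


# Hypothetical similarities: node 0 resembles every other node.
sim = [
    [1.0, 0.9, 0.9, 0.9],
    [0.9, 1.0, 0.1, 0.1],
    [0.9, 0.1, 1.0, 0.1],
    [0.9, 0.1, 0.1, 1.0],
]
ranks = pagerank(knn_graph(sim, k=1))
top_hub = max(range(len(ranks)), key=ranks.__getitem__)
```

In this toy graph every other node points at node 0, so it collects the largest share of PageRank mass and would be the first hub candidate.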
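ROI Refinement: augmented bipartite graph over the ROI hypotheses of an image and the hub set, with blocks αW_i, (1-α)W_o and W_o^T. Compute the PageRank vector over this graph; the refined ROI is the argmax over the ROI hypotheses i.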
ROI Refinement: What does α do? [Figure: the augmented graph with blocks αW_i, (1-α)W_o and W_o^T, shown for α = 0 (only W_o and W_o^T remain) and α = 0.1.]
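One possible reading of the refinement step is sketched below. The block layout [[αW_i, (1-α)W_o], [W_o^T, 0]], the damping factor, and the toy matrices are assumptions reconstructed from the block labels on the slides, so treat this as an illustration rather than the authors' exact formulation.

```python
def pagerank(W, damping=0.85, iters=200):
    """Power iteration on the row-normalized weight matrix W."""
    n = len(W)
    row_sums = [sum(row) for row in W]
    p = [1.0 / n] * n
    for _ in range(iters):
        nxt = [(1.0 - damping) / n] * n
        for i in range(n):
            for j in range(n):
                # Rows with no outgoing weight spread their mass uniformly.
                share = W[i][j] / row_sums[i] if row_sums[i] > 0 else 1.0 / n
                nxt[j] += damping * p[i] * share
        p = nxt
    return p


def refine_roi(W_i, W_o, alpha=0.1):
    """Rank one image's ROI hypotheses against the hub set.

    W_i: m x m similarities among the image's hypotheses.
    W_o: m x h similarities between hypotheses and hubs.
    Assumed layout: [[alpha * W_i, (1 - alpha) * W_o], [W_o^T, 0]].
    """
    m, h = len(W_o), len(W_o[0])
    top = [
        [alpha * W_i[r][c] for c in range(m)]
        + [(1.0 - alpha) * W_o[r][c] for c in range(h)]
        for r in range(m)
    ]
    bottom = [[W_o[r][c] for r in range(m)] + [0.0] * h for c in range(h)]
    scores = pagerank(top + bottom)[:m]  # keep only the hypothesis entries
    best = max(range(m), key=scores.__getitem__)
    return best, scores


# Hypothetical image with 3 hypotheses and 2 hubs; hypothesis 1 matches the hubs best.
W_i = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]
W_o = [[0.1, 0.1], [0.9, 0.8], [0.2, 0.1]]
best, scores = refine_roi(W_i, W_o, alpha=0.1)
best_alpha0, _ = refine_roi(W_i, W_o, alpha=0.0)
```

With α = 0 the hypotheses are ranked only through their links to the hubs; a small positive α additionally lets hypotheses within the same image vote for one another.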
Example - ROI Refinement. [Figure: ROI refinement over iterations T = 0 through T = 7.]
Scalability Setting: Bottleneck: quadratic computation to generate the similarity matrix of the selected ROIs. If the dataset is too large: –Run the algorithm on N images at a time (N = 10,000). –Re-use x % of the previously high-ranked images. [Figure: the dataset split into chunks of N images.]
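A rough sketch of this chunking scheme follows. The callable `rank_batch`, the parameter `reuse_percent` (standing in for the slide's unspecified x), and the choice to add carried-over images on top of each chunk of N are all assumptions for illustration.

```python
def run_in_batches(images, rank_batch, batch_size=10_000, reuse_percent=10):
    """Process a large image list chunk by chunk.

    rank_batch: hypothetical callable that runs the algorithm on one batch
                and returns that batch's images ordered best first.
    reuse_percent: share of each batch's top-ranked images carried into
                   the next batch (the slide leaves this value as x %).
    """
    carried = []
    results = []
    for start in range(0, len(images), batch_size):
        batch = carried + images[start:start + batch_size]
        ranked = rank_batch(batch)
        results.append(ranked)
        keep = len(ranked) * reuse_percent // 100
        carried = ranked[:keep]
    return results


# Toy stand-in: "images" are integers and a larger value means a higher rank.
results = run_in_batches(
    list(range(10)),
    rank_batch=lambda batch: sorted(batch, reverse=True),
    batch_size=5,
    reuse_percent=20,
)
```

Since each batch has bounded size, the quadratic similarity-matrix cost is paid per batch, and the total cost grows linearly with the number of batches.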
Experiments: Performance Test: –PASCAL VOC 2006 dataset –Weakly supervised (1) and unsupervised (2). Scalability Test: –Five objects: {butterfly+insect (69,990), classic+car (265,731), motorcycle+bike (106,590), sunflower (165,235), giraffe+zoo (53,620)} –Weakly supervised (1). (1) The input image set consists of a single object type (only localization is required). (2) The input image set consists of multiple object types (localization and clustering are required).
Performance Tests: Weakly Supervised Localization (PR curves). [Russell et al. CVPR 2006] seg discovery/index.html. X-axis: recall; Y-axis: precision.
Performance Tests: Unsupervised Classification & Localization. PR curves: X-axis recall, Y-axis precision. ROC curves: X-axis FP rate, Y-axis TP rate.
Scalability Tests: Weakly Supervised Localization. X-axis: recall; Y-axis: precision.
Perturbation Tests: Robustness of each image's ROI detection against random network formation. –100 random sets of 200 images each. Measure: entropy of the frequency histogram over ROI hypotheses. [Figure: the dataset and an image of interest; histogram with X-axis: ROI hypotheses, Y-axis: frequencies.]
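The entropy formula on this slide did not survive extraction. The sketch below assumes the usual Shannon entropy of the normalized histogram of how often each ROI hypothesis was selected across the random sets; the base-2 logarithm and the example counts are assumptions.

```python
import math


def selection_entropy(frequencies):
    """Shannon entropy (in bits) of a histogram of selection counts."""
    total = sum(frequencies)
    probs = [f / total for f in frequencies if f > 0]
    return -sum(p * math.log2(p) for p in probs)


# Hypothetical counts over 100 random sets for one image of interest.
stable = selection_entropy([100, 0, 0, 0])       # same ROI chosen every time
split = selection_entropy([50, 50, 0, 0])        # two ROIs chosen equally often
unstable = selection_entropy([25, 25, 25, 25])   # choice looks random
```

Under this reading, low entropy means the same hypothesis wins regardless of which other images end up in the network, which is the robustness the test is after.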
Localization Examples
Conclusion: Unsupervised ROI detection based on alternating optimization. Simple and fast. Competitive performance on PASCAL 06. Scalability test with more than 200K Flickr images. Critique: analysis of convexity, convergence, sensitivity to initialization, and quality of solution.
Algorithm