Beyond Sliding Windows: Object Localization by Efficient Subwindow Search. Winner of the best paper prize at CVPR 2008
Image Classification
Support Vector Machines An SVM constructs a separating hyperplane in a multi-dimensional feature space, one that maximizes the margin between the two classes of training data, positive and negative examples.
SVM: Localization problem An SVM answers ‘Yes’ or ‘No’ to whether an object belongs to the classifier’s object class and also returns a confidence score. It cannot say where the object is located in the image or at what scale.
SVM Object Localization Methods Exhaustive Search. For an n x n image the complexity is O(n^4). Sliding Window Approach.
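The O(n^4) figure can be checked by counting axis-aligned subwindows directly. The sketch below is only an illustration; the function names are hypothetical and not taken from the slides.

```python
def count_rectangles(n):
    """Number of axis-aligned subwindows of an n x n pixel grid."""
    # Choose top <= bottom and left <= right independently:
    # n * (n + 1) / 2 choices per axis, so the product grows like n^4 / 4.
    per_axis = n * (n + 1) // 2
    return per_axis ** 2


def count_by_enumeration(n):
    """Brute-force count, useful only for small n."""
    return sum(
        1
        for top in range(n)
        for bottom in range(top, n)
        for left in range(n)
        for right in range(left, n)
    )


small = count_rectangles(4)
```

Because the count grows with the fourth power of the image side, scoring every subwindow with the classifier quickly becomes impractical, which is the motivation for the search scheme on the following slides.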
Branch-and-Bound Scheme Branching: dividing the space of candidate rectangles into subspaces. Bounding: pruning subspaces whose highest possible score is lower than a score guaranteed in some other subspace.
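A minimal best-first sketch of this scheme, assuming a set of rectangles is stored as four integer intervals (top, bottom, left, right) and that the caller supplies an upper-bound function. The quality function here is a toy (negative L1 distance to a hypothetical target rectangle), chosen only because its bound is easy to verify; it is not the classifier score from the talk.

```python
import heapq


def branch_and_bound(initial, bound):
    """Best-first search over sets of rectangles.

    `initial` is a tuple of four (lo, hi) intervals for the top, bottom,
    left and right coordinates. `bound(region)` must never underestimate
    the best quality in the region and must be exact for a single rectangle.
    """
    heap = [(-bound(initial), initial)]  # max-heap via negated scores
    while heap:
        neg_score, region = heapq.heappop(heap)
        widths = [hi - lo for lo, hi in region]
        widest = max(range(4), key=widths.__getitem__)
        if widths[widest] == 0:
            # A single rectangle with the highest bound: nothing can beat it.
            return tuple(lo for lo, _ in region), -neg_score
        # Branching: split the widest interval into two halves.
        lo, hi = region[widest]
        mid = (lo + hi) // 2
        for part in ((lo, mid), (mid + 1, hi)):
            child = region[:widest] + (part,) + region[widest + 1:]
            # Bounding: children with low bounds sink in the queue
            # and are never expanded.
            heapq.heappush(heap, (-bound(child), child))


TARGET = (3, 9, 2, 7)  # hypothetical best rectangle (top, bottom, left, right)


def toy_bound(region):
    # Toy quality: negative L1 distance to TARGET. The closest value in
    # each interval gives the best achievable score in the region.
    return -sum(max(0, lo - t, t - hi) for (lo, hi), t in zip(region, TARGET))


best, score = branch_and_bound(((0, 15),) * 4, toy_bound)
```

Subspaces are never deleted explicitly; the priority queue simply never reaches those whose bound is below the score of the returned rectangle.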
Bounding function To use branch-and-bound for a given quality function f, we need to define an upper bound function over sets of candidate rectangles.
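The requirements usually placed on such a bound can be written as two conditions. The notation is an assumption, since the slide's formulas did not survive: \hat{f} denotes the bound and \mathcal{R} a set of rectangles.

```latex
% (i) the bound never underestimates the best score in the set
\hat{f}(\mathcal{R}) \;\ge\; \max_{R \in \mathcal{R}} f(R)
% (ii) the bound is exact when the set contains a single rectangle
\hat{f}(\mathcal{R}) \;=\; f(R) \quad \text{if } \mathcal{R} = \{R\}
```

Condition (i) makes pruning safe, and condition (ii) guarantees that the search ends on a rectangle whose score is the true optimum.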
Example I. Bag of visual words SVM For every image: extract SIFT image descriptors; quantize the descriptors using a K-entry codebook of descriptors; represent the image by a histogram of codebook entry occurrences. Every image is thus coded as a one-dimensional vector h of length K, where K is the number of codebook ‘words’.
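The quantize-and-count steps can be sketched as follows. Descriptor extraction is left out: the 2-D tuples below are toy stand-ins for 128-dimensional SIFT descriptors, and the function names and codebook values are hypothetical.

```python
def quantize(descriptor, codebook):
    """Index of the nearest codebook entry (squared Euclidean distance)."""
    def dist2(entry):
        return sum((d - e) ** 2 for d, e in zip(descriptor, entry))
    return min(range(len(codebook)), key=lambda k: dist2(codebook[k]))


def bow_histogram(descriptors, codebook):
    """Histogram h of length K counting occurrences of each codebook word."""
    h = [0] * len(codebook)
    for d in descriptors:
        h[quantize(d, codebook)] += 1
    return h


# Toy data: a K = 3 codebook and four descriptors from one image.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 0.1), (0.0, 0.8), (1.1, -0.1)]
h = bow_histogram(descriptors, codebook)
```

The entries of h always sum to the number of feature points, which is what later allows the classifier score to be split into one term per point.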
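Example I. Bounding function SVM decision function: $f(I) = \beta + \sum_i \alpha_i \langle h, h^i \rangle$, where $h^i$ are the histograms of the training images. We can express it as a sum of per-point contributions, $f(I) = \beta + \sum_j w_{c_j}$, with weights $w_j = \sum_i \alpha_i h^i_j$, where $c_j$ is the codebook index of the j-th feature point. If we denote by $R_{max}$ the largest rectangle and by $R_{min}$ the smallest rectangle contained in a parameter region $\mathcal{R}$, then $\hat{f}(\mathcal{R}) = f^+(R_{max}) + f^-(R_{min})$, where $f^+$ sums only the positive weights and $f^-$ only the negative weights.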
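A sketch of such a bound, assuming each feature point already carries its per-point weight and a region is given as four (lo, hi) intervals for top, bottom, left and right. The bias term is omitted, and all names and data are illustrative assumptions.

```python
def inside(point, rect):
    x, y = point
    top, bottom, left, right = rect
    return left <= x <= right and top <= y <= bottom


def quality(points, weights, rect):
    """Score of one rectangle: sum of the weights of the points inside it."""
    return sum(w for p, w in zip(points, weights) if inside(p, rect))


def upper_bound(points, weights, region):
    """Positive weights inside R_max plus negative weights inside R_min."""
    (t_lo, t_hi), (b_lo, b_hi), (l_lo, l_hi), (r_lo, r_hi) = region
    r_max = (t_lo, b_hi, l_lo, r_hi)  # union of all rectangles in the region
    r_min = (t_hi, b_lo, l_hi, r_lo)  # intersection (empty if t_hi > b_lo etc.)
    pos = sum(w for p, w in zip(points, weights) if w > 0 and inside(p, r_max))
    neg = sum(w for p, w in zip(points, weights) if w < 0 and inside(p, r_min))
    return pos + neg


# Toy data: four feature points (x, y) with hypothetical per-point weights.
points = [(1, 1), (2, 2), (5, 5), (8, 8)]
weights = [2, -1, 3, -2]
region = ((0, 2), (4, 9), (0, 2), (4, 9))
bound = upper_bound(points, weights, region)
```

No rectangle in the region can collect more positive weight than R_max contains, and every rectangle must at least collect the negative weight inside R_min, so the sum cannot underestimate the best score.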
Example I. Bounding function For example, for an image with 5 feature points coded by a 5-word codebook, h = [2, 0, 1, 0, 2]. Then the score decomposes into a sum of per-point terms, due to the linearity of the scalar product.
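The decomposition can be verified numerically for this histogram. The two support-vector histograms and their coefficients below are made-up values for illustration; only h comes from the slide.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


h = [2, 0, 1, 0, 2]           # histogram from the slide: 5 points, 5 words
point_words = [0, 0, 2, 4, 4]  # codebook index of each feature point (0-based)

# Hypothetical linear SVM with two support vectors.
alphas = [2, -1]
support = [[1, 0, 3, 1, 0], [0, 2, 1, 0, 2]]

# Kernel form: sum_i alpha_i * <h, h^i>
kernel_form = sum(a * dot(h, s) for a, s in zip(alphas, support))

# Linearity: fold the support vectors into one weight per codebook word.
w = [sum(a * s[j] for a, s in zip(alphas, support)) for j in range(len(h))]
linear_form = dot(h, w)

# Per-point form: each feature point contributes the weight of its word.
per_point = sum(w[c] for c in point_words)
```

All three expressions give the same score, so removing a feature point from the window changes the score by exactly that point's weight.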
Example II. Spatial Pyramid Kernel SVM
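Similarity function: compares the histograms of corresponding pyramid bins. Decision function: $f(I) = \beta + \sum_{l=1}^{L} \sum_{i,j=1}^{l} \sum_{k=1}^{N} \alpha_k^{l,(i,j)} \langle h_{l,(i,j)}, h^k_{l,(i,j)} \rangle$, where l = 1..L runs over every pyramid level; i = 1..l, j = 1..l over every bin in the pyramid; and k = 1..N over every image in the training set.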
Example II. Bounding function Convert the decision function to a sum of per-point contributions. The upper bound for f is obtained by summing the bounds for all levels and cells.
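The per-point form for the pyramid case can be sketched for a single rectangle as follows. This covers only the evaluation of the score, not the bound; the layout of `weights` (one list of per-word weights for each level and cell) and all values are assumptions made for illustration.

```python
def pyramid_score(points, weights, rect, levels, bias=0):
    """Sum of per-point contributions over all pyramid levels.

    `points` holds (x, y, word) triples; `weights[(l, i, j)][word]` is the
    weight of a codebook word in cell (i, j) of the l x l grid at level l.
    """
    top, bottom, left, right = rect
    height = bottom - top + 1
    width = right - left + 1
    score = bias
    for x, y, word in points:
        if not (left <= x <= right and top <= y <= bottom):
            continue  # points outside the rectangle contribute nothing
        for l in range(1, levels + 1):
            i = (y - top) * l // height   # row of the cell at level l
            j = (x - left) * l // width   # column of the cell at level l
            score += weights[(l, i, j)][word]
    return score


# Toy setup: L = 2 levels, K = 2 codebook words, a 4 x 4 rectangle.
weights = {
    (1, 0, 0): [1, -1],
    (2, 0, 0): [2, 0], (2, 0, 1): [0, 3],
    (2, 1, 0): [0, 0], (2, 1, 1): [-4, 0],
}
points = [(0, 0, 0), (3, 0, 1), (3, 3, 0)]
total = pyramid_score(points, weights, (0, 3, 0, 3), levels=2)
```

Each (level, cell) pair behaves like a separate bag-of-words score over its own subrectangle, which is why a bound for the whole function can be assembled from one bound per cell.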
Possible Applications Retrieve images from databases based on queries that match only part of an image.
Experiments
Precision – Recall curves