Published by Ellen Jordan. Modified over 9 years ago.
1
BEYOND SLIDING WINDOWS: Object Localization by Efficient Subwindow Search
Christoph H. Lampert, Matthew B. Blaschko, and Thomas Hofmann
2
MOTIVATIONS
To localize the object without exhaustive search
  observation: often, only a small portion of the image contains the object of interest
To find a global optimum in a huge search space
Object detection and retrieval
3
CONTRIBUTIONS
Efficient (n^2 vs. n^4)
  an n x n image contains on the order of n^4 rectangles:
  n x n possible centers, n possible choices for the width and n for the height
Optimal
Versatile
  arbitrary objects vs. simple parametric objects in line drawings [4]
  flexible in the choice of the cost function vs. L2 distance [13]
Challenge
  to find optimal and tight bounds
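The n^4 figure on this slide is an order-of-magnitude count. As a minimal sketch, the snippet below enumerates every axis-aligned rectangle in a small n x n pixel grid and compares the result with the closed form (n(n+1)/2)^2, which grows like n^4/4. The function names are made up for this illustration.

```python
def count_rectangles_brute_force(n):
    """Count rectangles by enumerating (top, bottom, left, right) with top <= bottom, left <= right."""
    count = 0
    for top in range(n):
        for bottom in range(top, n):
            for left in range(n):
                for right in range(left, n):
                    count += 1
    return count


def count_rectangles_closed_form(n):
    """Each axis contributes n * (n + 1) / 2 ordered (start, end) pairs."""
    pairs_per_axis = n * (n + 1) // 2
    return pairs_per_axis * pairs_per_axis


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        print(n, count_rectangles_brute_force(n), count_rectangles_closed_form(n), n ** 4)
```

The exact count stays below n^4 by a constant factor of roughly four, so the slide's O(n^4) statement for exhaustive search is unaffected.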
4
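BRANCH AND BOUND
first proposed by A. H. Land and A. G. Doig in 1960 for discrete programming
a "divide and conquer" approach to optimizing some cost function f(x) over a set S
recursively branching & bounding
  branching: split S into subsets S_i so that min f(x) over S equals min_i v_i, where v_i is the minimum of f(x) within S_i
  bounding: compute the lower & upper bounds of f(x) within S_i
  pruning: discard any S_i whose lower bound is above the best upper bound found so far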
5
METHODOLOGY
Cost function
Parameter space
Bounds
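The formulas for this slide did not survive in the transcript. The block below is a sketch of how the three ingredients are usually written for subwindow search. The notation (y for a rectangle, Y for a set of rectangles, \hat{f} for the bound) is assumed here, not taken from the slide.

```latex
% Cost function: a quality score f, maximized over all rectangles y inside the image I
\[
  y_{\mathrm{obj}} = \operatorname*{argmax}_{y \subseteq I} f(y),
  \qquad y = (t, b, l, r) \ \text{(top, bottom, left, right)}
\]

% Parameter space: sets of rectangles described by one interval per coordinate
\[
  Y = T \times B \times L \times R,
  \qquad T = [t_{\mathrm{lo}}, t_{\mathrm{hi}}],\ \text{and likewise for } B,\ L,\ R
\]

% Bounds: an upper bound that becomes exact on single rectangles
\[
  \hat{f}(Y) \ \ge\ \max_{y \in Y} f(y),
  \qquad
  \hat{f}(Y) = f(y) \ \text{ if } Y = \{y\}
\]
```

The second bound condition is what lets the search stop as soon as the best remaining set contains a single rectangle.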
6
BEST FIRST
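The slide body is missing from the transcript, so the following is only a minimal sketch of best-first branch and bound over rectangle sets, not the authors' implementation. It assumes the quality of a rectangle is the sum of per-pixel scores inside it, and it treats a rectangle with top > bottom or left > right as empty with score 0. The names `ess`, `upper_bound` and `box_sum` are made up for this illustration, and box sums are computed naively for clarity.

```python
import heapq

import numpy as np


def box_sum(a, top, bottom, left, right):
    """Sum of a over rows top..bottom and columns left..right (inclusive); 0 if empty."""
    if top > bottom or left > right:
        return 0.0
    return float(a[top:bottom + 1, left:right + 1].sum())


def upper_bound(pos, neg, box_set):
    """Positive scores of the largest rectangle plus negative scores of the smallest one."""
    (t_lo, t_hi), (b_lo, b_hi), (l_lo, l_hi), (r_lo, r_hi) = box_set
    largest = box_sum(pos, t_lo, b_hi, l_lo, r_hi)
    smallest = box_sum(neg, t_hi, b_lo, l_hi, r_lo)
    return largest + smallest


def ess(scores):
    """Return (best score, (top, bottom, left, right)) for a 2-D array of pixel scores."""
    scores = np.asarray(scores, dtype=float)
    pos = np.maximum(scores, 0.0)  # positive contributions only
    neg = np.minimum(scores, 0.0)  # negative contributions only
    h, w = scores.shape
    start = ((0, h - 1), (0, h - 1), (0, w - 1), (0, w - 1))
    # heapq is a min-heap, so bounds are stored negated to pop the largest first.
    heap = [(-upper_bound(pos, neg, start), start)]
    while heap:
        neg_bound, box_set = heapq.heappop(heap)
        widths = [hi - lo for lo, hi in box_set]
        k = max(range(4), key=widths.__getitem__)  # widest coordinate interval
        if widths[k] == 0:
            # A single rectangle: its bound equals its score, and no other set can beat it.
            top, bottom, left, right = (lo for lo, _ in box_set)
            return -neg_bound, (top, bottom, left, right)
        lo, hi = box_set[k]
        mid = (lo + hi) // 2
        for part in ((lo, mid), (mid + 1, hi)):
            child = box_set[:k] + (part,) + box_set[k + 1:]
            heapq.heappush(heap, (-upper_bound(pos, neg, child), child))


if __name__ == "__main__":
    demo = np.array([[-1, -1, -1, -1],
                     [-1, 2, 3, -1],
                     [-1, 1, -4, -1],
                     [-1, -1, -1, -1]])
    print(ess(demo))
```

Because the set with the highest bound is always expanded first, the first single rectangle popped from the queue is a global maximizer. If every score is negative, this sketch returns an empty rectangle with score 0.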
7
BOUNDING I
a bag of visual words for non-rigid objects
  histograms of SIFT prototypes
  SVM decision function
bounds
  take the maximal amount of positive (+) and the minimal amount of negative (–) contributions
  integral images make each evaluation O(1)
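A minimal sketch of how this bound can be evaluated in constant time follows. It assumes a linear SVM whose score for a rectangle is the sum of per-word weights of the feature points inside it, with the bias omitted. Inclusive pixel coordinates and all function names are choices made for this illustration.

```python
import numpy as np


def word_score_map(height, width, ys, xs, words, weights):
    """Accumulate the weight of each feature point's visual word at its pixel."""
    score_map = np.zeros((height, width))
    point_weights = np.asarray(weights, dtype=float)[np.asarray(words)]
    np.add.at(score_map, (np.asarray(ys), np.asarray(xs)), point_weights)
    return score_map


def integral_image(a):
    """Zero-padded summed-area table: ii[i, j] is the sum of a[:i, :j]."""
    a = np.asarray(a, dtype=float)
    ii = np.zeros((a.shape[0] + 1, a.shape[1] + 1))
    ii[1:, 1:] = a.cumsum(axis=0).cumsum(axis=1)
    return ii


def box_sum(ii, top, bottom, left, right):
    """Sum over rows top..bottom and columns left..right (inclusive) with four lookups."""
    if top > bottom or left > right:
        return 0.0  # empty rectangle
    return float(ii[bottom + 1, right + 1] - ii[top, right + 1]
                 - ii[bottom + 1, left] + ii[top, left])


def bow_upper_bound(ii_pos, ii_neg, tops, bottoms, lefts, rights):
    """Upper bound for a rectangle set given as (lo, hi) intervals per coordinate."""
    # All positive weights that any rectangle in the set could contain ...
    largest = box_sum(ii_pos, tops[0], bottoms[1], lefts[0], rights[1])
    # ... plus the negative weights that every rectangle in the set must contain.
    smallest = box_sum(ii_neg, tops[1], bottoms[0], lefts[1], rights[0])
    return largest + smallest


if __name__ == "__main__":
    weights = [-1.0, 2.0, -0.5]  # hypothetical per-word SVM weights
    score_map = word_score_map(3, 4, ys=[0, 1, 1, 2], xs=[0, 1, 2, 2],
                               words=[0, 1, 1, 2], weights=weights)
    ii_pos = integral_image(np.maximum(score_map, 0.0))
    ii_neg = integral_image(np.minimum(score_map, 0.0))
    print(bow_upper_bound(ii_pos, ii_neg, (0, 1), (1, 2), (0, 1), (2, 3)))
```

Keeping two integral images, one for positive and one for negative weights, means each bound costs eight table lookups regardless of rectangle size.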
8
RESULTS
PASCAL VOC 2006
  5,304 images with 9,507 objects from 10 categories
  1000 visual words from 50,000 SURF descriptors
  claim a match when > 50% overlap between the detected bounding box and the ground truth
PASCAL VOC 2007
  9,963 images with 24,640 objects
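The 50% overlap test can be sketched as below. It assumes the usual PASCAL VOC definition of overlap, the area of the intersection divided by the area of the union, and boxes given as (left, top, right, bottom) in continuous coordinates. The function names are made up for this illustration.

```python
def overlap_ratio(box_a, box_b):
    """Intersection area divided by union area for two (left, top, right, bottom) boxes."""
    left = max(box_a[0], box_b[0])
    top = max(box_a[1], box_b[1])
    right = min(box_a[2], box_b[2])
    bottom = min(box_a[3], box_b[3])
    intersection = max(0.0, right - left) * max(0.0, bottom - top)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def is_match(detected, ground_truth, threshold=0.5):
    """A detection counts as correct when the overlap exceeds the threshold."""
    return overlap_ratio(detected, ground_truth) > threshold


if __name__ == "__main__":
    ground_truth = (0, 0, 10, 10)
    print(is_match((2, 0, 12, 10), ground_truth))  # shifted by 2 pixels
    print(is_match((5, 0, 15, 10), ground_truth))  # shifted by 5 pixels
```

Note that a box shifted by half its width overlaps the ground truth by only one third under this measure, so the criterion is stricter than it may first appear.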
9
RESULTS
10
EVALUATION
11
SPEED
40 ms per image on a 2.4 GHz PC
12
BOUNDING II
spatial pyramid for rigid objects
  histograms with spatial information
  extensions with ESS (fine-grained pyramids)
  SVM decision function
13
RESULTS
UIUC Car database (side-view, one car per image)
  1050 training images (550 positive)
  277 test images (170 single scale + 107 multi scale)
  1000 visual words from 50,000 SURF descriptors
15
IMAGE PART RETRIEVAL
query-by-example
localized similarity measure
bounds
16
RESULTS
10,143 keyframes of a movie
return the 100 most relevant images for a query
2 s per returned image
17
CONCLUSIONS
high speed with global optimum
can be extended to multi-detections, other shapes, different cost functions