1
Bundling Features for Large Scale Partial-Duplicate Web Image Search Zhong Wu ∗, Qifa Ke, Michael Isard, and Jian Sun CVPR 2009
2
Outline Introduction Bundled features Image Retrieval using bundled feature Experiments and results Conclusion
3
Outline Introduction Bundled features Image Retrieval using bundled feature Experiments and results Conclusion
4
Target Given a query image, locate its near- and partial-duplicate images in a large corpus of web images.
5
Unlike object-based image retrieval
6
State-of-the-art Visual words (quantization) & scalable textual index retrieval schemes Post-processing – Geometric verification Bundled feature – Weak geometric verification Bundled feature = SIFT + MSER
7
Outline Introduction Bundled features Image Retrieval using bundled feature Experiments and results Conclusion
8
MSER Maximally Stable Extremal Region
9
MSER
10
Bundled features
11
Discriminative power Increase discriminative power – Feature region size – Feature dimensionality Drawbacks – Less repeatable – Lower localization accuracy – Sensitive to occlusion and to photometric and geometric changes
12
Matching bundled features
13
Bundled features
14
Advantage More discriminative Allowed to have large overlap error – Partially match Robust – Occlusion – Geometric changes – …etc
15
Outline Introduction Bundled features Image Retrieval using bundled feature Experiments and results Conclusion
16
Feature quantization Hierarchical k-means – One million visual words from 50K training images
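To illustrate how hierarchical k-means maps a descriptor to a visual word, here is a minimal sketch on toy 1-D vectors. It is not the authors' implementation: the names `VocabNode`, `kmeans`, and `quantize`, the branching factor, the depth, and the fixed random seed are all assumptions chosen for illustration, and real SIFT descriptors are 128-dimensional.

```python
import random
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centers: Sequence[Vector], v: Vector) -> int:
    """Index of the center closest to v."""
    return min(range(len(centers)), key=lambda i: sq_dist(centers[i], v))


def kmeans(points: Sequence[Vector], k: int, iters: int = 10, seed: int = 0) -> List[Vector]:
    """Plain Lloyd iterations; initial centers are sampled from the points."""
    centers = random.Random(seed).sample(list(points), k)
    for _ in range(iters):
        clusters: List[List[Vector]] = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(centers, p)].append(p)
        # Recompute each center as its cluster mean; keep the old one if empty.
        centers = [
            tuple(sum(dim) / len(cl) for dim in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return centers


class VocabNode:
    """One level of the tree: k centers, each with its own subtree."""

    def __init__(self, points: Sequence[Vector], branch: int, depth: int) -> None:
        self.centers: List[Vector] = []
        self.children: List["VocabNode"] = []
        if depth > 0 and len(points) >= branch:
            self.centers = kmeans(points, branch)
            groups: List[List[Vector]] = [[] for _ in range(branch)]
            for p in points:
                groups[nearest(self.centers, p)].append(p)
            self.children = [VocabNode(g, branch, depth - 1) for g in groups]

    def quantize(self, v: Vector) -> Tuple[int, ...]:
        """Descend to a leaf; the path of branch indices acts as the visual word."""
        path: Tuple[int, ...] = ()
        node = self
        while node.children:
            i = nearest(node.centers, v)
            path += (i,)
            node = node.children[i]
        return path


# Toy data: two well-separated groups of 1-D "descriptors".
training = [(0.0,), (0.1,), (10.0,), (10.1,)]
tree = VocabNode(training, branch=2, depth=1)
word_low = tree.quantize((0.05,))
word_high = tree.quantize((10.05,))
```

With branching factor b and depth d, such a tree has up to b^d leaves, so a large vocabulary needs only d small nearest-center searches per descriptor.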
17
Feature quantization K-D tree – pointList = [(2,3), (5,4), (9,6), (4,7), (8,1), (7,2)]
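The point list above can be turned into a k-d tree with a short recursive builder. This is a sketch that assumes the common construction (split at the median, cycling through the axes by depth); the slide does not say which variant was used, and `Node` and `build_kdtree` are illustrative names.

```python
from typing import NamedTuple, Optional, Sequence, Tuple

Point = Tuple[int, int]


class Node(NamedTuple):
    point: Point
    left: Optional["Node"]
    right: Optional["Node"]


def build_kdtree(points: Sequence[Point], depth: int = 0) -> Optional[Node]:
    """Recursively split on the median, alternating between x and y."""
    if not points:
        return None
    axis = depth % 2  # 0 = x, 1 = y
    ordered = sorted(points, key=lambda p: p[axis])
    median = len(ordered) // 2
    return Node(
        point=ordered[median],
        left=build_kdtree(ordered[:median], depth + 1),
        right=build_kdtree(ordered[median + 1:], depth + 1),
    )


point_list = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(point_list)
print(tree.point, tree.left.point, tree.right.point)  # (7, 2) (5, 4) (9, 6)
```

The root splits on x at (7, 2); its children split on y at (5, 4) and (9, 6).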
18
Matching bundled features
19
Matching bundled features
20
Inverted-file index Documents – T0 = "it is what it is" – T1 = "what is it" – T2 = "it is a banana" Index – "a": {2} – "banana": {2} – "is": {0, 1, 2} – "it": {0, 1, 2} – "what": {0, 1}
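The index on this slide can be reproduced in a few lines. The sketch below uses whitespace tokenization and illustrative function names (`build_inverted_index`, `search_all`); in image retrieval the "words" would be visual-word IDs and the "documents" images.

```python
from collections import defaultdict
from typing import Dict, List, Set

documents = ["it is what it is", "what is it", "it is a banana"]


def build_inverted_index(docs: List[str]) -> Dict[str, Set[int]]:
    """Map each word to the set of document IDs that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for word in text.split():
            index[word].add(doc_id)
    return dict(index)


def search_all(index: Dict[str, Set[int]], words: List[str]) -> Set[int]:
    """Return the documents that contain every query word."""
    postings = [index.get(w, set()) for w in words]
    return set.intersection(*postings) if postings else set()


index = build_inverted_index(documents)
print(sorted(index["what"]))                           # [0, 1]
print(sorted(search_all(index, ["what", "is", "it"])))  # [0, 1]
```

A query only touches the posting lists of its own words, which is what keeps lookup cheap on a large corpus.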
21
Indexing and retrieval Support – 512 bundled features per image – 32 visual words per bundled feature
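These limits are powers of two (512 = 2^9, 32 = 2^5), which suggests fixed-width bit fields in each index entry. The sketch below shows one way such fields could be packed; the slide does not give the actual layout, so the field widths, their order, and the names `pack` and `unpack` are assumptions.

```python
BUNDLE_ID_BITS = 9  # assumed: 2**9 = 512 bundled features per image
ORDER_BITS = 5      # assumed: 2**5 = 32 visual words per bundled feature


def pack(bundle_id: int, order: int) -> int:
    """Pack a bundle ID and a within-bundle position into one integer."""
    if not 0 <= bundle_id < (1 << BUNDLE_ID_BITS):
        raise ValueError("bundle_id out of range")
    if not 0 <= order < (1 << ORDER_BITS):
        raise ValueError("order out of range")
    return (bundle_id << ORDER_BITS) | order


def unpack(packed: int) -> tuple:
    """Recover (bundle_id, order) from a packed integer."""
    return packed >> ORDER_BITS, packed & ((1 << ORDER_BITS) - 1)


entry = pack(300, 17)
print(unpack(entry))  # (300, 17)
```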
22
Indexing and retrieval Voting
23
Indexing and retrieval tf – 100 words in a document, ‘a’ appears 3 times – 0.03 (3/100) idf – 1,000 documents contain ‘a’, total number of documents 10,000,000 – 9.21 (ln(10,000,000 / 1,000)) tf-idf = 0.28 (0.03 * 9.21)
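The arithmetic on this slide can be checked directly. The sketch uses the plain definitions shown here (term count over document length, natural log of the document ratio); other tf-idf variants add smoothing, which is not assumed.

```python
import math


def tf(term_count: int, doc_length: int) -> float:
    """Term frequency: share of the document taken up by the term."""
    return term_count / doc_length


def idf(num_docs: int, docs_with_term: int) -> float:
    """Inverse document frequency with a natural logarithm."""
    return math.log(num_docs / docs_with_term)


tf_a = tf(3, 100)
idf_a = idf(10_000_000, 1_000)
tfidf_a = tf_a * idf_a
print(round(tf_a, 2), round(idf_a, 2), round(tfidf_a, 2))  # 0.03 9.21 0.28
```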
24
Outline Introduction Bundled features Image Retrieval using bundled feature Experiments and results Conclusion
25
Dataset Basic dataset – One million images most frequently clicked in a popular commercial image-search engine – Smaller subsets: 50K, 200K, 500K Ground truth – 780 manually labeled partial-duplicate web images from 19 groups – Evaluation dataset = basic dataset + ground truth Query – 150 images from ground truth
26
mAP Mean average precision EX: – Two query images, A and B – A has 4 duplicate images – B has 5 duplicate images – Retrieval ranks for A: 1, 2, 4, 7 – Retrieval ranks for B: 1, 3, 5 (two duplicates not retrieved) – Average precision A = (1/1+2/2+3/4+4/7)/4=0.83 – Average precision B = (1/1+2/3+3/5+0+0)/5=0.45 – mAP= (0.83+0.45)/2=0.64
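The worked example can be reproduced with a small function. It assumes the usual convention that duplicates which are never retrieved contribute a precision of 0, so the sum is divided by the total number of duplicates; `average_precision` is an illustrative name.

```python
from typing import List


def average_precision(hit_ranks: List[int], num_relevant: int) -> float:
    """AP from the 1-based ranks at which relevant images were retrieved."""
    hits = sorted(hit_ranks)
    # Precision at the i-th hit is (i hits so far) / (rank of that hit).
    precisions = [(i + 1) / rank for i, rank in enumerate(hits)]
    return sum(precisions) / num_relevant


ap_a = average_precision([1, 2, 4, 7], num_relevant=4)
ap_b = average_precision([1, 3, 5], num_relevant=5)
mean_ap = (ap_a + ap_b) / 2
print(round(ap_a, 2), round(ap_b, 2), round(mean_ap, 2))  # 0.83 0.45 0.64
```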
27
Evaluation Baseline – Bag-of-features approach with soft assignment[13] [13] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Lost in quantization: Improving particular object retrieval in large scale image databases. In CVPR, 2008.
28
Evaluation Compare(HE) – Enhance the baseline with Hamming embedding [3] by adding a 24-bit Hamming code to filter out target features. [3] H. Jegou, M. Douze, and C. Schmid. Hamming embedding and weak geometric consistency for large scale image search. In ECCV, 2008.
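The filtering step amounts to a Hamming-distance test between binary codes. The sketch below shows only that test; how the 24-bit codes are computed is described in [3] and is not reproduced here, and the threshold `max_distance`, the function names, and the sample codes are assumptions made for illustration.

```python
from typing import List, Tuple


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def filter_by_hamming(
    query_code: int,
    candidates: List[Tuple[str, int]],
    max_distance: int,
) -> List[str]:
    """Keep candidates whose code is within max_distance of the query code."""
    return [
        image_id
        for image_id, code in candidates
        if hamming_distance(query_code, code) <= max_distance
    ]


# Hypothetical 24-bit codes for features that share one visual word.
query = 0xABCDEF
candidates = [("img1", 0xABCDEF), ("img2", 0xABCDEE), ("img3", 0x543210)]
print(filter_by_hamming(query, candidates, max_distance=4))  # ['img1', 'img2']
```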
29
Evaluation Baseline 0.35 to Bundled(mem) 0.40, a 14% improvement Baseline 0.35 to Bundled 0.49, a 40% improvement Baseline 0.35 to Bundled+HE 0.52, a 49% improvement
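The percentages on this slide are relative gains over the baseline score, which a short check confirms. The function name `relative_improvement` is illustrative.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Relative gain over the baseline, in percent."""
    return (improved - baseline) / baseline * 100


baseline = 0.35
scores = [("Bundled(mem)", 0.40), ("Bundled", 0.49), ("Bundled+HE", 0.52)]
for name, score in scores:
    print(f"{name}: {relative_improvement(baseline, score):.0f}%")
```

The same formula gives the re-ranking and sample-result percentages reported on the later slides.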
30
Evaluation Compare(Re-ranking) – Full geometric verification, RANSAC for top 300 candidate images
31
Evaluation Baseline+re-rank 0.50 to Bundled+re-rank 0.62 a 24% improvement Baseline 0.35 to Bundled+re-rank 0.62 a 77% improvement
32
Evaluation Trade-off Run time – A single CPU on a 3.0 GHz Core Duo desktop with 16 GB of memory
33
Sample results AP from 0.51 to 0.74 a 45% improvement
34
Sample results
36
Outline Introduction Bundled features Image Retrieval using bundled feature Experiments and results Conclusion
37
Bundled features for large scale partial-duplicate web image search. Bundled features property – More discriminative than individual SIFT features – Simple and robust geometric constraints – Partially match two groups of SIFT features Advantage – Robustness to occlusion, photometric and geometric changes