Bundling Features for Large Scale Partial-Duplicate Web Image Search Zhong Wu ∗, Qifa Ke, Michael Isard, and Jian Sun CVPR 2009.

Outline Introduction Bundled features Image Retrieval using bundled feature Experiments and results Conclusion

Target Given a query image, locate its near- and partial-duplicate images in a large corpus of web images.

Unlike object-based image retrieval

State-of-the-art Visual word (quantization) & scalable textual index retrieval schemes Post-processing – Geometric verification Bundled feature – Weak geometric verification Bundled feature = SIFT + MSER

MSER Maximally Stable Extremal Region

Bundled features

Discriminative power Increase discriminative power – Feature region size – Feature dimensionality Drawbacks – Less repeatable – Lower localization accuracy – Sensitive to occlusion and to photometric and geometric changes

Matching bundled features

Bundled features

Advantages More discriminative Tolerates large overlap error – Partial matching Robust to – Occlusion – Geometric changes – etc.

Feature quantization Hierarchical k-means – One million visual words from 50K training images

Feature quantization K-D tree – pointList = [(2,3), (5,4), (9,6), (4,7), (8,1), (7,2)]
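The slide gives only the point list. As a minimal sketch of how a k-d tree could be built from it, the code below splits at the median and cycles through the axes by depth. The median rule and the names `Node` and `build_kdtree` are assumptions for illustration, not details taken from the slides.

```python
from collections import namedtuple

# A tree node: the splitting point plus the two subtrees.
Node = namedtuple("Node", ["point", "left", "right"])


def build_kdtree(points, depth=0):
    """Recursively build a k-d tree by splitting at the median point."""
    if not points:
        return None
    # Cycle through the axes: x at depth 0, y at depth 1, x at depth 2, ...
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Node(
        point=ordered[mid],
        left=build_kdtree(ordered[:mid], depth + 1),
        right=build_kdtree(ordered[mid + 1:], depth + 1),
    )


point_list = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(point_list)
print(tree.point)        # root of the tree, split on x
print(tree.left.point)   # left child, split on y
print(tree.right.point)  # right child, split on y
```

With this splitting rule the root is the median point along x, and each child is the median of its half along y.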

Matching bundled features

Inverted-file index Documents – T0 = "it is what it is" – T1 = "what is it" – T2 = "it is a banana" Index – "a": {2} – "banana": {2} – "is": {0, 1, 2} – "it": {0, 1, 2} – "what": {0, 1}
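The text example can be reproduced with a few lines of Python. This is a minimal sketch: `build_inverted_index` and `search_all` are hypothetical helper names, and splitting on whitespace stands in for the visual-word quantization used for images.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(documents):
        for word in text.split():
            index[word].add(doc_id)
    return dict(index)


def search_all(index, words):
    """Return the ids of documents that contain every query word."""
    postings = [index.get(word, set()) for word in words]
    return set.intersection(*postings) if postings else set()


documents = ["it is what it is", "what is it", "it is a banana"]
index = build_inverted_index(documents)

for word in sorted(index):
    print(word, sorted(index[word]))
print(sorted(search_all(index, ["what", "is", "it"])))
```

A query only touches the posting lists of its own words, which is what keeps this kind of index scalable.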

Indexing and retrieval Supports – up to 512 bundled features per image – up to 32 visual words per bundled feature

Indexing and retrieval Voting

Indexing and retrieval tf – A document contains 100 words and ‘a’ occurs 3 times – tf = 3/100 = 0.03 idf – 1,000 documents contain ‘a’ out of 10,000,000 documents in total – idf = ln(10,000,000 / 1,000) = 9.21 tf-idf = 0.03 × 9.21 = 0.28
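The arithmetic in this example can be checked with a short script. The function names `tf` and `idf` are illustrative. The natural logarithm is used because that is what the slide's 9.21 corresponds to.

```python
import math


def tf(term_count, doc_length):
    """Term frequency: occurrences of the term divided by document length."""
    return term_count / doc_length


def idf(total_docs, docs_with_term):
    """Inverse document frequency with the natural logarithm."""
    return math.log(total_docs / docs_with_term)


tf_a = tf(3, 100)                 # 'a' occurs 3 times among 100 words
idf_a = idf(10_000_000, 1_000)    # 1,000 of 10,000,000 documents contain 'a'
tfidf_a = tf_a * idf_a

print(round(tf_a, 2), round(idf_a, 2), round(tfidf_a, 2))
```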

Dataset Basic dataset – One million images most frequently clicked in a popular commercial image-search engine – (50K, 200K, 500K) Ground truth – 780 manually labeled partial-duplicate web images from 19 groups – Evaluation dataset = basic dataset + ground truth Query – 150 images from ground truth

mAP Mean average precision Example: – Two query images, A and B – A has 4 duplicate images – B has 5 duplicate images – Retrieval ranks for A: 1, 2, 4, 7 – Retrieval ranks for B: 1, 3, 5 (two duplicates not retrieved) – Average precision A = (1/1 + 2/2 + 3/4 + 4/7)/4 = 0.83 – Average precision B = (1/1 + 2/3 + 3/5 + 0 + 0)/5 = 0.45 – mAP = (0.83 + 0.45)/2 = 0.64
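The same numbers can be reproduced in code. This is a minimal sketch in which `average_precision` is an illustrative helper. It divides by the total number of duplicates of the query, so duplicates that are never retrieved contribute zero.

```python
def average_precision(hit_ranks, num_relevant):
    """Average the precision at each hit rank over all relevant images."""
    hits = sorted(hit_ranks)
    # Precision at the i-th hit is (number of hits so far) / (rank of that hit).
    precision_sum = sum((i + 1) / rank for i, rank in enumerate(hits))
    return precision_sum / num_relevant


ap_a = average_precision([1, 2, 4, 7], num_relevant=4)
ap_b = average_precision([1, 3, 5], num_relevant=5)
mean_ap = (ap_a + ap_b) / 2

print(round(ap_a, 2), round(ap_b, 2), round(mean_ap, 2))
```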

Evaluation Baseline – Bag-of-features approach with soft assignment[13] [13] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Lost in quantization: Improving particular object retrieval in large scale image databases. In CVPR, 2008.

Evaluation Compare (HE) – Enhance the baseline with Hamming embedding [3] by adding a 24-bit Hamming code to filter out target features. [3] H. Jegou, M. Douze, and C. Schmid. Hamming embedding and weak geometric consistency for large scale image search. In ECCV, 2008.
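As a rough illustration of the filtering step, the sketch below compares 24-bit codes stored as integers and drops candidates whose Hamming distance to the query code exceeds a threshold. The threshold of 6, the example codes, and the names `hamming_distance` and `filter_by_hamming` are assumptions for illustration. The slides do not state the threshold or how the codes are computed.

```python
def hamming_distance(code_a, code_b):
    """Count the bit positions in which two integer codes differ."""
    return bin(code_a ^ code_b).count("1")


def filter_by_hamming(query_code, candidates, max_distance):
    """Keep (image_id, code) pairs within max_distance of the query code."""
    return [
        (image_id, code)
        for image_id, code in candidates
        if hamming_distance(query_code, code) <= max_distance
    ]


# Hypothetical 24-bit codes for features quantized to the same visual word.
query_code = 0b101010101010101010101010
candidates = [
    ("img_a", query_code ^ 0b111),      # differs in 3 bits
    ("img_b", query_code ^ 0xFFF000),   # differs in 12 bits
]

kept = filter_by_hamming(query_code, candidates, max_distance=6)  # assumed threshold
print([image_id for image_id, _ in kept])
```

Only candidates whose code is close to the query code survive, which removes matches that agree on the visual word alone.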

Evaluation Baseline 0.35 to Bundled (mem) 0.40, a 14% improvement Baseline 0.35 to Bundled 0.49, a 40% improvement Baseline 0.35 to Bundled+HE 0.52, a 49% improvement
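The percentages are relative gains in mAP over the baseline. The snippet below recomputes them from the mAP values on this slide. `relative_improvement` is an illustrative helper name.

```python
def relative_improvement(baseline, improved):
    """Relative gain over the baseline, in percent."""
    return (improved - baseline) / baseline * 100


baseline_map = 0.35
results = {"Bundled (mem)": 0.40, "Bundled": 0.49, "Bundled+HE": 0.52}

gains = {
    name: round(relative_improvement(baseline_map, value))
    for name, value in results.items()
}
for name, gain in gains.items():
    print(f"{name}: {gain}% improvement")
```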

Evaluation Compare(Re-ranking) – Full geometric verification, RANSAC for top 300 candidate images

Evaluation Baseline+re-rank 0.50 to Bundled+re-rank 0.62 a 24% improvement Baseline 0.35 to Bundled+re-rank 0.62 a 77% improvement

Evaluation Trade-off Run time – measured on a single CPU of a 3.0 GHz Core Duo desktop with 16 GB of memory

Sample results AP from 0.51 to 0.74 a 45% improvement

Sample results

Bundled features for large scale partial-duplicate web image search. Bundled features property – More discriminative than individual SIFT features. – Simple and robust geometric constraints – Partially match two groups of SIFT features Advantage – Robustness to occlusion, photometric and geometric changes
