1
Image Retrieval with Geometry-Preserving Visual Phrases
Yimeng Zhang, Zhaoyin Jia and Tsuhan Chen
Cornell University
2
Similar Image Retrieval
[Figure: … image database, ranked relevant images]
3
Bag-of-Visual-Word (BoW)
Images are represented as histograms of visual words (length: dictionary size) …
Similarity of two images: cosine similarity of their histograms
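A minimal sketch of the BoW similarity described on this slide, assuming each image has already been quantized into a list of integer visual-word IDs. The function names `bow_histogram` and `cosine_similarity` are illustrative and not taken from the talk.

```python
import math
from collections import Counter


def bow_histogram(word_ids):
    """Count how often each visual word occurs in one image."""
    return Counter(word_ids)


def cosine_similarity(h1, h2):
    """Cosine similarity of two sparse word histograms."""
    dot = sum(count * h2[word] for word, count in h1.items() if word in h2)
    norm1 = math.sqrt(sum(c * c for c in h1.values()))
    norm2 = math.sqrt(sum(c * c for c in h2.values()))
    if norm1 == 0 or norm2 == 0:
        return 0.0  # an image without words matches nothing
    return dot / (norm1 * norm2)


# Toy images given as lists of visual-word IDs (hypothetical data).
image_a = bow_histogram([1, 1, 2])
image_b = bow_histogram([1, 2, 2])
similarity = cosine_similarity(image_a, image_b)
```

Storing the histograms sparsely keeps the cost proportional to the number of words present in an image rather than to the dictionary size.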
4
Geometry-preserving Visual Phrases
Length-k phrase: k words in a certain spatial layout
…
Bag of Phrases (BoP), shown here for length-2 phrases
5
Phrases vs. Words
[Figure: matches for a relevant and an irrelevant image pair, using single words and length-2 phrases.]
6
Previous Works
7
Geometry Verification
Searching step with BoW … post-processing (geometry verification)
Post-processing encodes spatial information, but only on the top-ranked images
8
Modeling relationship between words
Co-occurrences in the entire image [L. Torresani, et al, CVPR 2009]: no spatial information
Phrases in local neighborhoods [J. Yuan et al, CVPR07][Z. Wu et al., CVPR10][C.L.Zitnick, Tech.Report 07]: no long-range interactions, weak geometry
Selecting a subset of phrases [J. Yuan et al, CVPR07]: discards a large portion of phrases
… (length-2 phrase)
Previous works: reduce the number of phrases, because the dimension is exponential in the number of words in a phrase
Our work: all phrases, linear computation time
9
Approach
10
Overview
1. Similarity measure: from BoW to BoP [Zhang and Chen, 09]
2. Large-scale retrieval (this paper): inverted files and min-hash, for both BoW and BoP
11
Co-occurring Phrases
Only the translation difference is considered [Zhang and Chen, 09]
[Figure: two images sharing the words A, B, C and D.]
12
Co-occurring Phrase Algorithm
[Figure: corresponding words (A to F) of two images are voted into an offset space; axis ticks run from -4 to 3.]
# of co-occurring length-2 phrases: … + 1 + 1 = 5
[Zhang and Chen, 09]
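The offset-space idea on this slide can be sketched as follows. This is a reading of the slide rather than the authors' code: it assumes each word occurs at most once per image, that only translation is considered, and that every group of k correspondences landing in the same offset cell counts as one co-occurring length-k phrase. The names `offset_space` and `count_cooccurring_phrases`, the `bin_size` parameter and the toy coordinates are all hypothetical.

```python
from collections import Counter
from math import comb


def offset_space(feats_a, feats_b, bin_size=1):
    """Vote each pair of corresponding words into a quantized offset cell.

    feats_a, feats_b: dict mapping word id -> (x, y) location in the image.
    """
    votes = Counter()
    for word, (xa, ya) in feats_a.items():
        if word in feats_b:
            xb, yb = feats_b[word]
            cell = ((xa - xb) // bin_size, (ya - yb) // bin_size)
            votes[cell] += 1
    return votes


def count_cooccurring_phrases(votes, k=2):
    """n words sharing one offset form C(n, k) length-k phrases."""
    return sum(comb(n, k) for n in votes.values())


# Toy example: A, B, C share one offset; D, E share another.
image_1 = {"A": (0, 0), "B": (2, 1), "C": (1, 3), "D": (4, 4), "E": (5, 2)}
image_2 = {"A": (1, 0), "B": (3, 1), "C": (2, 3), "D": (2, 5), "E": (3, 3)}
votes = offset_space(image_1, image_2)
length_2 = count_cooccurring_phrases(votes, k=2)
length_3 = count_cooccurring_phrases(votes, k=3)
```

The loop visits each corresponding pair once, so the cost grows with the number of correspondences and not with the number of possible phrases.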
13
Relation with the feature vector
… … same as BoW
Number of co-occurring length-k phrases = inner product of the feature vectors
M: number of corresponding pairs; in practice, linear in the number of local features
14
Inverted Index with BoW
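Avoid comparing the query with every image
[Figure: an inverted index maps each word to the IDs of the images that contain it; each matching entry adds +1 to that image's score in a score table (Image ID: I1, I2, …, In; Score).]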
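A small sketch of the score-table voting on this slide, assuming images are given as collections of visual-word IDs and that each shared word contributes +1, as in the figure. The names `build_inverted_index` and `score_images` are illustrative, and no normalization of the scores is attempted.

```python
from collections import defaultdict


def build_inverted_index(database):
    """Map each word to the set of images that contain it.

    database: dict mapping image id -> iterable of word ids.
    """
    index = defaultdict(set)
    for image_id, words in database.items():
        for word in words:
            index[word].add(image_id)
    return index


def score_images(index, query_words):
    """Add +1 to an image's score for every query word it contains."""
    scores = defaultdict(int)
    for word in set(query_words):
        for image_id in index.get(word, ()):
            scores[image_id] += 1
    return dict(scores)


# Hypothetical three-image database.
database = {"I1": [1, 2, 3], "I2": [3, 4], "I3": [5]}
index = build_inverted_index(database)
scores = score_images(index, [1, 3, 4])
```

Only images that share at least one word with the query are ever touched, which is what avoids the comparison against every database image.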
15
Inverted Index with Word Location
[Figure: each inverted index entry also stores the location of the word in the image.]
Assuming the same word occurs only once in the same image, memory usage is the same as for BoW
16
Compute the Offset Space
BoW: score table (Image ID: I1, I2, …, In; Score)
BoP: compute the number of co-occurring phrases by computing the offset space for each image I1, I2, …, In
17
Inverted Files with Phrases
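[Figure: for a query word wi, the inverted index lists the images containing it (I1, I5, I8, I10, …); each entry adds +1 to one cell of that image's offset space, with cells such as (-1,-1), (0,-1), (1,-1), (-1,0), (0,0), (1,0), (0,1), ….]
Size of blue box …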
18
Final Score
[Figure: the vote counts in the offset space of each image I1, I2, …, In are turned into final similarity scores, stored in a score table (Image ID: I1, I2, …, In; Score).]
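The inverted-file slides above can be combined into one hedged sketch: index entries carry the word location, each query word votes into the offset space of every image that contains it, and each offset space is then reduced to a score. It assumes one occurrence of a word per image and unquantized integer offsets, and it uses the unnormalized count of length-k phrases as the score, which may differ from the final score used in the paper. The names `build_location_index` and `bop_scores` are hypothetical.

```python
from collections import Counter, defaultdict
from math import comb


def build_location_index(database):
    """Inverted index whose entries store (image id, x, y) per word.

    database: dict mapping image id -> {word id: (x, y)}.
    """
    index = defaultdict(list)
    for image_id, feats in database.items():
        for word, (x, y) in feats.items():
            index[word].append((image_id, x, y))
    return index


def bop_scores(index, query_feats, k=2):
    """Vote into one offset space per image, then count length-k phrases."""
    offset_spaces = defaultdict(Counter)
    for word, (qx, qy) in query_feats.items():
        for image_id, x, y in index.get(word, ()):
            offset_spaces[image_id][(qx - x, qy - y)] += 1  # the "+1" vote
    return {
        image_id: sum(comb(n, k) for n in votes.values())
        for image_id, votes in offset_spaces.items()
    }


# Hypothetical database: I1 is a shifted copy of the query, I2 is not.
database = {
    "I1": {1: (0, 0), 2: (1, 1), 3: (2, 0)},
    "I2": {1: (5, 5), 2: (0, 3)},
}
index = build_location_index(database)
query = {1: (1, 0), 2: (2, 1), 3: (3, 0)}
scores = bop_scores(index, query)
```

With k=1 the score reduces to the plain BoW vote count, which makes the relation between the two schemes easy to check on toy data.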
19
Overview
[Figure: overview diagram with BoW and BoP, each combined with inverted files and min-hash.]
Min-hash has been proposed to improve on inverted files: lower storage and time complexity
20
Min-hash with BoW
Probability of a min-hash collision (same word) between images I and I’ = image similarity
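To illustrate the collision property on this slide, the sketch below estimates the similarity of two word sets from the fraction of min-hash collisions. It assumes the image similarity is the set overlap (Jaccard index) of the two word sets, which is the quantity standard min-hash estimates. The keyed `blake2b` hash stands in for independent random hash functions, and all names (`word_hash`, `minhash_sketch`, `collision_rate`, `jaccard`) are illustrative.

```python
import hashlib


def word_hash(word, i):
    """Deterministic pseudo-random value of a word under the i-th hash."""
    data = f"{i}:{word}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_sketch(words, num_hashes):
    """For each hash function, keep the smallest hash value over the set."""
    return [min(word_hash(w, i) for w in words) for i in range(num_hashes)]


def collision_rate(sketch_a, sketch_b):
    """Fraction of hash functions on which both images pick the same word."""
    matches = sum(a == b for a, b in zip(sketch_a, sketch_b))
    return matches / len(sketch_a)


def jaccard(a, b):
    """Exact set overlap, used here only to check the estimate."""
    return len(a & b) / len(a | b)


# Hypothetical word sets with an overlap of 30 out of 90 words.
words_i = set(range(0, 60))
words_j = set(range(30, 90))
estimate = collision_rate(minhash_sketch(words_i, 500), minhash_sketch(words_j, 500))
exact = jaccard(words_i, words_j)
```

The estimate is noisy and tightens as the number of hash functions grows; the sketches are compared without touching the original word sets, which is where the storage saving comes from.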
21
Min-hash with Phrases
[Figure: offset space between images I and I’; axis ticks run from -4 to 3.]
Probability of k min-hash collisions with consistent geometry … (details are in the paper)
22
Other Invariances
Add dimensions to the offset space; this increases the memory usage [Zhang and Chen, 10]
[Figure: Image I and Image I’.]
23
Variant Matching
Local histogram matching
24
Evaluation
BoW + inverted index vs. BoP + inverted index
BoW + min-hash vs. BoP + min-hash
Post-processing methods: complementary to our work
25
Experiments – Inverted Index
Oxford 5K dataset (55 queries) [Philbin, J. et al. 07]
1M Flickr distractors
26
Example Precision-recall curve
[Figure: example precision-recall curves for BoP and BoW; axes: recall, precision.]
BoP: higher precision at lower recall
27
Comparison
Mean average precision (mAP): mean of the AP over the 55 queries
[Figure: mAP of BoW, BoW+RANSAC, BoP and BoP+RANSAC.]
BoP outperforms BoW (similar computation)
BoP outperforms BoW+RANSAC (which is 10 times slower on the top 150 images)
Larger improvement with smaller vocabulary sizes
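For reference, the metric named on this slide can be sketched as below. This uses the common non-interpolated definition of average precision; the official evaluation protocol of the Oxford benchmark may differ in details (for example in how it treats ambiguous images), so the code is an assumption about the metric, not a reproduction of the evaluation script. The function names are illustrative.

```python
def average_precision(ranked_ids, relevant_ids):
    """Mean of the precision values at the rank of each relevant image."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, image_id in enumerate(ranked_ids, start=1):
        if image_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(results):
    """results: list of (ranked ids, relevant ids), one pair per query."""
    aps = [average_precision(ranked, relevant) for ranked, relevant in results]
    return sum(aps) / len(aps) if aps else 0.0


# Two hypothetical queries.
results = [
    (["a", "x", "b"], {"a", "b"}),  # AP = (1/1 + 2/3) / 2
    (["x", "a"], {"a"}),            # AP = 1/2
]
map_score = mean_average_precision(results)
```

Relevant images that never appear in the ranked list contribute zero precision, so truncating the ranking lowers the AP.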
28
Computational Complexity (+Flickr 1M dataset)
[Figure: x-axis: number of images (K).]
Method        Memory   Runtime: quantization   Runtime: search
BoW           8.1G     0.89s                   0.137s
BoP           8.5G                             0.215s
BoW+RANSAC    -                                4.137s
RANSAC: 4s on the top 300 images
29
Experiment – Min-hash
University of Kentucky dataset
Min-hash with BoW: [O. Chum et al., BMVC08]
30
Conclusion
Encodes more spatial information into BoW
Can be applied to all images in the database at the searching step
Same computational complexity as BoW
Better retrieval precision than BoW+RANSAC