Accounting for the relative importance of objects in image retrieval
Sung Ju Hwang and Kristen Grauman
University of Texas at Austin
What can image tags tell us?
Images tagged with keywords clearly tell us which objects to search for, e.g.: Dog, Black lab, Jasper, Sofa, Self, Living room, Fedora, Explore #24.
Tagged images are becoming increasingly common as community photo-sharing sites such as Flickr grow in popularity.
Content-based image retrieval
Most previous work on tagged images has tried to find correspondences between nouns and objects, such as matching image blobs to labels, or faces to names.
Retrieving images with similar visual features
Visual features do not always correspond to a semantic object or concept.
Proximity in visual feature space does not imply that two images are semantically similar.
Semantic retrieval of images
By mapping each image to a semantic space using its labels, we can associate visual features with semantics.
However, this semantic space does not know which objects are more important than others: images tagged {Tree, Grass, Cow} and {Tree, Grass, Train} look equally similar even though their key objects differ.
Our idea: learning the semantic space with object importance
People expect retrieved images to share semantics with the query image. What is the point of finding images with a similar background? Retrieval should find images that contain similar main objects.
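One intuition behind this idea is that people tend to name important objects earlier in a tag list. For illustration only, here is a toy encoding that weights each tag by its rank; the function name and the 1/(rank+1) decay are assumptions for this sketch, not the paper's actual feature design.

```python
import numpy as np

def importance_weighted_tag_vector(tag_list, vocab):
    """Encode a tag list so that earlier (presumably more important)
    tags receive larger weights. Hypothetical encoding for illustration.
    """
    v = np.zeros(len(vocab))
    index = {t: i for i, t in enumerate(vocab)}
    for rank, tag in enumerate(tag_list):
        if tag in index:
            # Weight decays with rank: tags named first count most.
            v[index[tag]] = 1.0 / (rank + 1)
    return v

# Example: "cow" dominates when listed first.
vocab = ["cow", "grass", "train", "tree"]
print(importance_weighted_tag_vector(["cow", "tree", "grass"], vocab))
```

Under such an encoding, {Cow, Tree, Grass} and {Tree, Grass, Cow} map to different vectors even though they contain the same words.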
Related work
Hardoon et al.: kernel canonical correlation analysis for learning a joint image-text space.
Kernel Canonical Correlation Analysis
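KCCA learns projections of two views (here, visual features and tag features) into a common space where their correlation is maximized. Below is a minimal sketch of regularized KCCA in the style of Hardoon et al., solved as a generalized eigenvalue problem; the regularization constant `reg` and this particular block formulation are implementation assumptions, not details taken from the slides.

```python
import numpy as np
from scipy.linalg import eigh

def center_kernel(K):
    """Center a kernel matrix in feature space."""
    n = K.shape[0]
    one = np.ones((n, n)) / n
    return K - one @ K - K @ one + one @ K @ one

def kcca(Kx, Ky, reg=0.1, n_components=2):
    """Regularized kernel CCA for two centered kernel matrices
    Kx, Ky of shape (n, n). Returns projection coefficients
    (alpha, beta) and the corresponding correlations.
    """
    n = Kx.shape[0]
    Z = np.zeros((n, n))
    # Off-diagonal blocks couple the two views; the diagonal
    # blocks of B regularize against trivial perfect correlations.
    A = np.block([[Z, Kx @ Ky],
                  [Ky @ Kx, Z]])
    B = np.block([[Kx @ Kx + reg * np.eye(n), Z],
                  [Z, Ky @ Ky + reg * np.eye(n)]])
    vals, vecs = eigh(A, B)
    # Largest eigenvalues correspond to the strongest correlations.
    order = np.argsort(vals)[::-1][:n_components]
    return vecs[:n, order], vecs[n:, order], vals[order]
```

A new image is then projected by evaluating its kernel values against the training set and multiplying by `alpha`; a tag list is projected analogously with `beta`.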
Performance evaluation
Normalized Discounted Cumulative Gain (NDCG) at top k:
- A good match at an earlier rank contributes more to the score.
- A perfect ranking scores 1.
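Concretely, with the standard logarithmic discount, DCG@k = sum_{i=1..k} rel_i / log2(i+1), and NDCG@k = DCG@k / IDCG@k, where IDCG@k is the DCG of the ideal (best possible) ranking. The sketch below uses this common variant; the paper's exact reward function may differ.

```python
import numpy as np

def ndcg_at_k(relevance, k):
    """NDCG at rank k for relevance scores listed in ranked order.
    Earlier ranks are discounted less; a perfect ranking scores 1.
    """
    rel = np.asarray(relevance, dtype=float)[:k]
    # Discount 1/log2(i+1) for ranks i = 1..k.
    discounts = 1.0 / np.log2(np.arange(2, rel.size + 2))
    dcg = np.sum(rel * discounts)
    # Ideal ranking: all relevance scores sorted descending.
    ideal = np.sort(np.asarray(relevance, dtype=float))[::-1][:k]
    idcg = np.sum(ideal * discounts[:ideal.size])
    return dcg / idcg if idcg > 0 else 0.0
```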
Image-to-image retrieval results
Retrieval performance is measured in two ways: by the similarity of the ground-truth bounding-box labels, and by the similarity of the tag lists.
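As a concrete example of the second measure, one simple way to score how well a retrieved image's tags match the query's is cosine similarity between bag-of-tags vectors. This function is an illustrative stand-in; the paper's exact ground-truth similarity measure may be defined differently.

```python
import numpy as np

def tag_list_similarity(tags_a, tags_b, vocab):
    """Cosine similarity between bag-of-tags count vectors.
    Hypothetical stand-in for the ground-truth relevance score."""
    va = np.array([tags_a.count(t) for t in vocab], dtype=float)
    vb = np.array([tags_b.count(t) for t in vocab], dtype=float)
    denom = np.linalg.norm(va) * np.linalg.norm(vb)
    return float(va @ vb / denom) if denom > 0 else 0.0
```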
Retrieval results
More retrieval results
Tag-to-image retrieval results
Our method achieves 20% better accuracy than the word+visual baseline.
Image-to-tag auto-annotation results (dataset: PASCAL VOC 2007)

Method        K=1      K=3      K=5      K=10
Visual-only   0.0826   0.1765   0.2022   0.2095
Word+Visual   0.0818   0.1712   0.1992   0.2097
Ours          0.0901   0.1936   0.2230   0.2335
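For context, a simple way to produce such annotations is to project the query image into the learned semantic space and transfer the most frequent tags among its nearest training neighbors. The sketch below is one plausible scheme (the function name, k-NN voting rule, and parameters are assumptions), not necessarily the exact method behind the numbers above.

```python
import numpy as np

def annotate(query_proj, train_proj, train_tags, num_tags=5, knn=10):
    """Transfer tags from the query's nearest neighbors in the
    learned semantic space. Hypothetical k-NN tag-transfer scheme.

    query_proj : (d,) projection of the query image
    train_proj : (n, d) projections of the training images
    train_tags : list of n tag lists, one per training image
    """
    dists = np.linalg.norm(train_proj - query_proj, axis=1)
    neighbors = np.argsort(dists)[:knn]
    scores = {}
    for i in neighbors:
        for t in train_tags[i]:
            # Each neighbor votes for its tags.
            scores[t] = scores.get(t, 0) + 1
    return sorted(scores, key=scores.get, reverse=True)[:num_tags]
```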
Conclusion
Learning a semantic space that accounts for the relative importance of objects yields better image-to-image, tag-to-image, and image-to-tag retrieval than visual-only and word+visual baselines.