Glasgow 02/02/04 NN^k networks for content-based image retrieval Daniel Heesch
Overview
- CBIR – a problem of image understanding?
- Approaches to feature weighting
- The NN^k technique for retrieval
- Getting connected: NN^k networks for browsing
- Future work
Challenges: Semantic gap
- What is the relationship between low-level features and image meaning?
Challenges: Image polysemy
- One image, multiple meanings
Feature Weights
Approaches to feature weighting (1): post-retrieval relevance feedback
- Effectiveness relies on a good first retrieval result
- Useful for fine-tuning weights
Approaches to feature weighting (2): SVM metaclassifier (Yavlinsky et al., ICASSP 2004)
- Given a set of queries and ground truth, for each query: sample at random m positive and negative images, and for each sampled image build a score vector consisting of the feature-specific similarities between that image and the query
- Use an SVM to obtain a hyperplane that separates positive and negative score vectors with least error
- For an unseen image, compute similarity to the query as the distance of its score vector to the hyperplane
Approaches to feature weighting (2)
- The signed distance of a vector to a hyperplane is, up to a constant offset and scale, a weighted sum of the vector’s components, which is just the aggregation formula shown previously
- The hyperplane represents a set of feature weights that maximise the expected mean average precision
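A minimal sketch of why ranking by hyperplane distance amounts to ranking by a weighted sum of feature-specific similarities. The normal vector, offset, and score vectors below are hypothetical illustration values, not results from Yavlinsky et al.; in practice a trained SVM would supply the hyperplane.

```python
import math

def weighted_sum(score_vec, weights):
    """Aggregate feature-specific similarities into a single score."""
    return sum(w * s for w, s in zip(weights, score_vec))

def signed_distance(score_vec, normal, offset):
    """Signed distance of score_vec from the hyperplane normal . x + offset = 0."""
    norm = math.sqrt(sum(w * w for w in normal))
    return (weighted_sum(score_vec, normal) + offset) / norm

# Hypothetical hyperplane for two features (assumed values, not learned here).
normal = [3.0, 4.0]
offset = -2.0

# Hypothetical score vectors: feature-specific similarities to the query.
candidates = {"img_a": [0.9, 0.8], "img_b": [0.2, 0.4], "img_c": [0.5, 0.1]}

# Subtracting a constant and dividing by a positive norm preserves the order,
# so both rankings agree.
by_distance = sorted(
    candidates,
    key=lambda i: signed_distance(candidates[i], normal, offset),
    reverse=True,
)
by_weighted_sum = sorted(
    candidates, key=lambda i: weighted_sum(candidates[i], normal), reverse=True
)
```

The components of the normal vector play the role of the feature weights; the offset and the normalisation shift and scale every score equally, so they do not change the ranking.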
Approaches to feature weighting (2)
- On average, ~300 score vectors are needed to establish near-optimal weights for a subset of the Corel collection (6192 images)
- No query-specific weight optimization
Approaches to feature weighting (3): query-specific optimization (Aggarwal et al., IEEE Trans. Multimedia 2002)
- Modify the query representation along each feature axis, regenerate the modified query, and ask the user whether the new query image is still relevant
- Interesting idea, but limited applicability in practice
NN^k retrieval
The idea of NN^k – a two-step approach
1. Retrieve with all possible weight sets -> returns a set of images (the NN^k), each associated with a particular weight set
2. Retrieve with the weight sets associated with the relevant images the user selects
The first retrieval step: finding the NN^k
- For each feature combination w, determine the nearest neighbour of the query
- Record for each nearest neighbour the proportion of w for which it came top, as well as the average of these w
- NN stands for nearest neighbour, k for the dimensionality of the weight space (= length of the weight vector w)
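The first step can be sketched as follows, assuming the weight space is sampled on a regular grid over non-negative weights that sum to 1 and that feature-specific distances to the query are already available. The function names, the `steps` parameter, and the example distances are hypothetical.

```python
from itertools import product

def simplex_grid(k, steps):
    """Yield weight vectors with k non-negative components, each a multiple
    of 1/steps, that sum to 1."""
    for combo in product(range(steps + 1), repeat=k - 1):
        if sum(combo) <= steps:
            last = steps - sum(combo)
            yield tuple(c / steps for c in combo) + (last / steps,)

def nn_k(query_dists, steps=10):
    """query_dists[image] is a list of k feature-specific distances to the query.
    Returns {image: (proportion of weight sets won, average winning weight set)}."""
    k = len(next(iter(query_dists.values())))
    wins = {}
    total = 0
    for w in simplex_grid(k, steps):
        total += 1
        # Nearest neighbour under this weight set: smallest weighted distance.
        best = min(
            query_dists,
            key=lambda img: sum(wi * di for wi, di in zip(w, query_dists[img])),
        )
        wins.setdefault(best, []).append(w)
    return {
        img: (len(ws) / total, tuple(sum(col) / len(ws) for col in zip(*ws)))
        for img, ws in wins.items()
    }

# Hypothetical distances for two features: "a" is close under feature 1,
# "b" under feature 2, and "c" is mediocre under both.
example = {"a": [0.1, 0.9], "b": [0.9, 0.1], "c": [0.6, 0.6]}
result = nn_k(example, steps=10)
```

In this example only "a" and "b" are returned: "c" is never the nearest neighbour for any sampled weight set, so it does not belong to the NN^k of the query.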
The first retrieval step: finding the NN^k
[Figure: weight space spanned by the features F1, F2 and F3, each axis marked at 1]
- With a fixed number of grid points per dimension, time complexity is exponential in the number of features (k)
- Useful theorem: if the top-ranked image is the same for two weight sets w1 and w2 that differ in only two components, then this image is top-ranked for all convex combinations of w1 and w2
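The theorem follows from the aggregated distance being linear in the weights: the score under a mixture of two weight sets is the same mixture of the two scores, so an image that wins at both endpoints also wins everywhere on the segment between them. A small numerical check, using hypothetical distances and weight sets:

```python
def score(w, dists):
    """Weighted sum of feature-specific distances."""
    return sum(wi * di for wi, di in zip(w, dists))

def top_image(w, query_dists):
    """Image with the smallest aggregated distance under weight set w."""
    return min(query_dists, key=lambda img: score(w, query_dists[img]))

def mix(w1, w2, lam):
    """Combination lam * w1 + (1 - lam) * w2, with lam in [0, 1]."""
    return tuple(lam * a + (1 - lam) * b for a, b in zip(w1, w2))

# Hypothetical feature-specific distances of three images to a query.
query_dists = {
    "a": [0.2, 0.3, 0.5],
    "b": [0.6, 0.1, 0.4],
    "c": [0.4, 0.6, 0.1],
}
w1 = (0.6, 0.2, 0.2)
w2 = (0.4, 0.4, 0.2)  # differs from w1 in two components only

# "a" is top-ranked at both endpoints ...
endpoints = (top_image(w1, query_dists), top_image(w2, query_dists))
# ... and therefore at every sampled point in between.
between = {top_image(mix(w1, w2, i / 20), query_dists) for i in range(21)}
```

This is what allows the grid search to skip the weight sets lying between two sampled points that already agree on the top-ranked image.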
Visualization of NN^k
The second retrieval step:
- Retrieve with each weight set in turn
- Merge ranked lists
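A sketch of the second step. The slides do not say how the ranked lists are merged, so the round-robin interleaving below is an assumption chosen for simplicity; the function names and example data are likewise hypothetical.

```python
def rank(query_dists, w):
    """Rank images by weighted distance to the query under weight set w."""
    return sorted(
        query_dists,
        key=lambda img: sum(wi * di for wi, di in zip(w, query_dists[img])),
    )

def merge_ranked(lists):
    """Interleave ranked lists round-robin, keeping each image only once.
    (Assumed merging rule; other schemes such as best-rank fusion would also fit.)"""
    merged, seen = [], set()
    for position in range(max(len(lst) for lst in lists)):
        for lst in lists:
            if position < len(lst) and lst[position] not in seen:
                seen.add(lst[position])
                merged.append(lst[position])
    return merged

# Hypothetical feature-specific distances to the query (two features).
query_dists = {"a": [0.1, 0.9], "b": [0.9, 0.1], "c": [0.3, 0.4]}

# Weight sets associated with the images the user marked as relevant.
selected_weights = [(1.0, 0.0), (0.0, 1.0)]

ranked_lists = [rank(query_dists, w) for w in selected_weights]
merged = merge_ranked(ranked_lists)
```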
Performance evaluation: comparison of NN^k with two post-retrieval methods for weight update
1. Our own: minimize [objective not recoverable from the slide]
2. Rui’s method (Rui et al., 2002)
- Data: Corel Gallery 380,000 package
- Given a subset of images, treat each image as a query in turn and retrieve from the rest
- For relevance feedback (RF): retrieve with equal weights, gather relevance data and retrieve with the new weight set
- For NN^k: determine the NN^k, gather relevance data and retrieve with the new weight sets
- Determine mean average precision (MAP) after the second retrieval
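For reference, a minimal sketch of mean average precision using the standard definition (precision averaged over the ranks at which relevant images appear, then averaged over queries). The ranked lists and relevance sets are hypothetical.

```python
def average_precision(ranked, relevant):
    """Average of precision@i over the ranks i at which a relevant image occurs.
    Relevant images that are never retrieved contribute zero."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for i, img in enumerate(ranked, start=1):
        if img in relevant:
            hits += 1
            precision_sum += hits / i
    return precision_sum / len(relevant)

def mean_average_precision(runs):
    """runs is a list of (ranked list, set of relevant images), one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Two hypothetical queries.
runs = [
    (["a", "x", "b", "y"], {"a", "b"}),  # AP = (1/1 + 2/3) / 2
    (["x", "a"], {"a"}),                 # AP = 1/2
]
map_score = mean_average_precision(runs)
```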
NN^k networks
Network construction
- Vertices represent images
- An arc is established from image X to image Y iff there exists at least one instantiation of the weight vector w for which Y is the nearest neighbour of X
- Record for each nearest neighbour the proportion of w for which it came top -> edge weight, a measure of similarity
- Storage: for each image, its nearest neighbours and their frequencies
Rationale
- Exposure of semantic richness
- The user decides which image meaning is the correct one
- The network is precomputed -> interactive
- Supports search without query formulation
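The construction can be sketched as below, assuming pairwise feature-specific distances are available and the weight space is sampled on a regular grid. The data layout, function names, and the `steps` parameter are hypothetical.

```python
from itertools import product

def weight_grid(k, steps):
    """Weight vectors with k non-negative multiples of 1/steps that sum to 1."""
    for combo in product(range(steps + 1), repeat=k - 1):
        if sum(combo) <= steps:
            yield tuple(c / steps for c in combo) + ((steps - sum(combo)) / steps,)

def build_network(pair_dists, steps=10):
    """pair_dists[x][y] is a list of k feature-specific distances between x and y.
    Returns {x: {y: proportion of weight sets for which y is x's nearest neighbour}}."""
    network = {}
    for x, others in pair_dists.items():
        k = len(next(iter(others.values())))
        grid = list(weight_grid(k, steps))
        counts = {}
        for w in grid:
            nearest = min(
                others,
                key=lambda y: sum(wi * di for wi, di in zip(w, others[y])),
            )
            counts[nearest] = counts.get(nearest, 0) + 1
        # Arc x -> y exists iff y won at least once; its weight is the win frequency.
        network[x] = {y: c / len(grid) for y, c in counts.items()}
    return network

# Hypothetical distances between three images under two features.
pair_dists = {
    "a": {"b": [0.1, 0.9], "c": [0.9, 0.1]},
    "b": {"a": [0.1, 0.9], "c": [0.5, 0.5]},
    "c": {"a": [0.9, 0.1], "b": [0.5, 0.5]},
}
network = build_network(pair_dists, steps=10)
```

Only each image's nearest neighbours and their frequencies are stored, which matches the storage scheme described on the slide. Note that the arcs are directed: Y being a nearest neighbour of X does not imply the reverse.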
Graph topology: small-world properties
- Small average distance between any two vertices (three nodes for images)
- High clustering coefficient: an image's neighbours are likely to be neighbours themselves

              Corel      Sketches   Video
k             5          4          7
Vertices      6,[...]    [...]      [...],318
Edges         150,776    1,822      1,253,076
C(G)
C_rand(G)
Dist
Dist_rand
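The two small-world statistics can be computed as in the sketch below. It treats the network as an undirected, unweighted graph stored as adjacency sets, which is a simplifying assumption; the example graph is hypothetical.

```python
from collections import deque

def clustering_coefficient(adj):
    """Mean, over all vertices, of the fraction of neighbour pairs that are
    themselves linked. Vertices with fewer than two neighbours count as 0."""
    total = 0.0
    for v, nbrs in adj.items():
        nbrs = list(nbrs)
        n = len(nbrs)
        if n < 2:
            continue
        links = sum(
            1
            for i in range(n)
            for j in range(i + 1, n)
            if nbrs[j] in adj[nbrs[i]]
        )
        total += 2 * links / (n * (n - 1))
    return total / len(adj)

def average_distance(adj):
    """Mean shortest-path length over all ordered pairs of connected vertices."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

# Hypothetical undirected graph: a triangle a-b-c with a pendant vertex d.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
c_g = clustering_coefficient(adj)
dist_g = average_distance(adj)
```

A small-world graph combines a clustering coefficient well above that of a random graph of the same size with a comparably small average distance, which is what the C(G) versus C_rand(G) and Dist versus Dist_rand rows compare.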
Graph topology: scale-freeness
- The degree distribution follows a power law
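One rough way to check for a power law is to fit a straight line to the degree histogram on log-log axes; the slope estimates the negative exponent. This least-squares fit is a crude illustration rather than a rigorous estimator, and the degree sequence below is synthetic.

```python
import math
from collections import Counter

def loglog_slope(degrees):
    """Least-squares slope of log(count) against log(degree).
    Degrees must be >= 1 and at least two distinct degrees must occur."""
    hist = Counter(degrees)
    xs = [math.log(d) for d in hist]
    ys = [math.log(c) for c in hist.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance

# Synthetic degree sequence with count(d) proportional to d ** -2:
# 64 vertices of degree 1, 16 of degree 2, 4 of degree 4, 1 of degree 8.
degrees = [1] * 64 + [2] * 16 + [4] * 4 + [8] * 1
slope = loglog_slope(degrees)
```

For this synthetic sequence the fitted slope is -2, i.e. an exponent of 2.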
Image browsing
- Initial display: retrieval result using search-by-example OR cluster display using the Markov Cluster (MCL) algorithm (van Dongen, 2000)
- Clicking on an image displays all adjacent vertices in the network
- Distance is inversely proportional to edge weight
Evaluation of NN^k networks: TRECVID 2003
- Search collection: keyframes from news videos
- 24 topics: example images + text
- Four system variants:
  1. Search + Relevance Feedback + Browsing
  2. Search + Relevance Feedback
  3. Search + Browsing
  4. Browsing
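A browsing step can be sketched as a lookup in the precomputed network. The network layout, the `scale` parameter, and the example weights are hypothetical; the only property taken from the slide is that distance is inversely proportional to edge weight.

```python
def neighbours_for_display(network, image, scale=1.0):
    """Return (neighbour, distance) pairs for the clicked image, nearest first.
    Distance is scale / edge weight, so strongly linked images appear closest."""
    return sorted(
        ((nbr, scale / weight) for nbr, weight in network[image].items()),
        key=lambda pair: pair[1],
    )

# Hypothetical fragment of a network: arcs from "a" with their edge weights.
network = {"a": {"b": 0.5, "c": 0.25}}
layout = neighbours_for_display(network, "a")
```

Because the network is precomputed, each click needs only this lookup, which is what makes the browsing interactive.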
Future work
User interaction
- What can we infer from the history of images in the user's search path?
- The location of the target?
- Changing information needs?
Network construction and analysis
- Hierarchical sampling of points in weight space
- Incremental update of the network while preserving small-world properties
- Optimal network structures
Thanks