Self-Organizing Maps for Content-Based Image Database Retrieval Authors: E. Oja, J. Laaksonen, M. Koskela, S. Brandt Presented by: Nemanja Petrovic
Motivation Large digital image and video libraries Problem: Retrieving and browsing Similar problem – text document mining: WEBSOM
PicSOM System for content-based image retrieval SOM is used to organize images into map units in a two-dimensional grid Similar images are located near each other
Features Spatial location Color Texture Shape
Color RGB representation 15-dimensional feature vector
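One way to arrive at a 15-dimensional color vector is to average R, G and B over five image zones (3 x 5 = 15). The sketch below assumes that reading; the zone layout (four quadrants plus a central block) and the function names are illustrative assumptions, not taken from the slides.

```python
def mean_rgb(pixels):
    """Average (R, G, B) over a non-empty list of RGB tuples."""
    n = len(pixels)
    return [sum(p[ch] for p in pixels) / n for ch in range(3)]


def color_feature(image):
    """Return 15 values: mean RGB in 5 zones.

    image: list of rows, each row a list of (R, G, B) tuples, at least 2 x 2.
    The zone layout (4 quadrants + central block) is an assumption.
    """
    h, w = len(image), len(image[0])

    def region(r0, r1, c0, c1):
        return [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]

    zones = [
        region(0, h // 2, 0, w // 2),                    # top-left quadrant
        region(0, h // 2, w // 2, w),                    # top-right quadrant
        region(h // 2, h, 0, w // 2),                    # bottom-left quadrant
        region(h // 2, h, w // 2, w),                    # bottom-right quadrant
        region(h // 4, h - h // 4, w // 4, w - w // 4),  # central block
    ]
    feature = []
    for zone in zones:
        feature.extend(mean_rgb(zone))
    return feature
```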
Texture Y values of YIQ representation Probability that pixel in 8-neighborhood is brighter than central one 40-dimensional feature vector
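The texture measure described on this slide can be sketched as follows: convert each pixel to the Y (luminance) component of YIQ, then, for each of the 8 neighbor directions, estimate the probability that the neighbor is brighter than the central pixel. This gives 8 values per region; applying it to five image zones would give 40, which is an assumption about how the 40 dimensions arise. Function names are illustrative.

```python
# The 8 neighbor offsets (row, column) around a central pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def luminance(rgb):
    """Y component of the YIQ representation."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def texture_feature(image):
    """Return 8 values, one per neighbor direction: the fraction of interior
    pixels whose neighbor in that direction is brighter than the pixel itself.

    image: list of rows of (R, G, B) tuples, at least 3 x 3.
    """
    y = [[luminance(p) for p in row] for row in image]
    h, w = len(y), len(y[0])
    counts = [0] * 8
    total = (h - 2) * (w - 2)  # border pixels lack a full 8-neighborhood
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            for i, (dr, dc) in enumerate(OFFSETS):
                if y[r + dr][c + dc] > y[r][c]:
                    counts[i] += 1
    return [n / total for n in counts]
```

For an image whose brightness increases from left to right, only the three directions pointing right get probability 1; all others get 0.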
Shape Edge extraction (Sobel masks): histograms for 8 directions, 40-dimensional feature vector; co-occurrence matrix of edge elements, 320-dimensional feature vector. Fourier transform in orthogonal coordinates and in polar coordinates: 512-dimensional feature vector each
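The edge-direction histogram can be illustrated with a small sketch: apply the two Sobel masks to a grayscale image, keep pixels with a sufficiently strong gradient, and count gradient directions in 8 bins of 45 degrees. The threshold value, the binning by gradient direction, and the function name are assumptions made for illustration; 8 bins over five zones would give the 40 dimensions quoted above, which is likewise an assumption.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def edge_direction_histogram(gray, threshold=1.0):
    """Return a normalized 8-bin histogram of Sobel gradient directions.

    gray: list of rows of numeric intensities, at least 3 x 3.
    threshold: minimum gradient magnitude for a pixel to count as an edge
               (hypothetical default).
    """
    h, w = len(gray), len(gray[0])
    hist = [0] * 8
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = sum(SOBEL_X[i][j] * gray[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            gy = sum(SOBEL_Y[i][j] * gray[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            if math.hypot(gx, gy) < threshold:
                continue  # not an edge pixel
            angle = math.atan2(gy, gx) % (2 * math.pi)
            # Quantize to the nearest multiple of 45 degrees.
            hist[int(round(angle / (math.pi / 4))) % 8] += 1
    total = sum(hist)
    return [v / total for v in hist] if total else [0.0] * 8
```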
Tree Structured SOM Layer 0 Layer 1 Layer 2
Training
Tree Structured SOM Cluster-specific submaps, with the map hierarchy reflecting the hierarchy found in the input signals; speedup of the training process; efficient searching through the layers of a trained TS-SOM
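The efficiency of layered search can be seen in a minimal sketch: the best-matching unit is found on the coarsest layer, and on each finer layer only the children of the previous winner are compared. This is a simplification (a full TS-SOM search may also examine units adjacent to those children), and the data layout, the `branch` parameter, and the function names are assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ts_som_search(layers, x, branch=4):
    """Return (row, col) of the best-matching unit on the finest layer.

    layers: list of square grids, coarsest first; each grid is a list of rows
            of weight vectors and is `branch` times larger per side than the
            previous one (e.g. 4 x 4, 16 x 16, 64 x 64 with branch=4).
    x: feature vector to look up.
    """
    top = layers[0]
    # Every unit of the coarsest layer is a candidate.
    candidates = [(r, c) for r in range(len(top)) for c in range(len(top[0]))]
    best = None
    for grid in layers:
        best = min(candidates, key=lambda rc: sq_dist(grid[rc[0]][rc[1]], x))
        # On the next layer, only the children of the winner are searched.
        r0, c0 = best[0] * branch, best[1] * branch
        candidates = [(r0 + dr, c0 + dc)
                      for dr in range(branch) for dc in range(branch)]
    return best
```

With three layers and branch=4, each lookup compares 16 units per layer (48 in total) instead of all 4096 units of a 64 x 64 map.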
PicSOM www.cis.hut.fi/picsom/ Database: 4350 images Three SOM layers: 4 x 4 16 x 16 64 x 64
Interactive Searching For each layer, maps are created. Images presented to the user are marked on the maps according to the user's response: positively if chosen, negatively if rejected. The maps are convolved with low-pass filter masks. The best candidates, those not yet presented that have large values on the map, are shown to the user in the next iteration
Figure: presented images on the map, selected marked 1 and rejected marked -1; value scale 1, 0.5, -0.5, -1
Interactive Searching iteration K iteration K+1 iteration K+2
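One iteration of the interactive search can be sketched as three steps: mark selected and rejected units with +1 and -1, smooth the map with a low-pass mask, and pick the highest-valued units that have not been shown yet. The 3 x 3 mask, the use of map units as stand-ins for the images mapped to them, and all function names are assumptions made for illustration.

```python
# Hypothetical symmetric low-pass mask; the slides do not specify one.
MASK = [[0.25, 0.5, 0.25],
        [0.5, 1.0, 0.5],
        [0.25, 0.5, 0.25]]


def mark_map(size, selected, rejected):
    """Build a size x size map with +1 at selected and -1 at rejected units."""
    grid = [[0.0] * size for _ in range(size)]
    for r, c in selected:
        grid[r][c] += 1.0
    for r, c in rejected:
        grid[r][c] -= 1.0
    return grid


def low_pass(grid, mask=MASK):
    """Convolve the map with a symmetric mask, treating outside cells as 0."""
    size, k = len(grid), len(mask) // 2
    out = [[0.0] * size for _ in range(size)]
    for r in range(size):
        for c in range(size):
            acc = 0.0
            for i, row in enumerate(mask):
                for j, m in enumerate(row):
                    rr, cc = r + i - k, c + j - k
                    if 0 <= rr < size and 0 <= cc < size:
                        acc += m * grid[rr][cc]
            out[r][c] = acc
    return out


def best_unseen(scores, seen, count):
    """Return the `count` highest-scoring units not in `seen`."""
    units = [(scores[r][c], (r, c))
             for r in range(len(scores)) for c in range(len(scores[0]))
             if (r, c) not in seen]
    units.sort(key=lambda t: t[0], reverse=True)
    return [rc for _, rc in units[:count]]
```

Repeating these steps with the growing set of seen units gives the sequence of iterations K, K+1, K+2.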
Using multiple TS-SOMs Layer 1 Layer 2 Layer 3
Quantitative results Database D: N images; class C: Nc images; a priori probability = Nc/N. Measure the distances between an image K from class C and all other images in D\{K}, and order the results: 10.1 8.4 8.1 7.4 5.3 4.2 … Label each position of the ordered sequence 1 if the image is in the same class as K, otherwise 0 (denote this the hI sequence): 1 1 0 0 1 0 … Optimal result: First Nc labels will be 1
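The labelling procedure can be sketched as below. The sketch assumes squared Euclidean distance between feature vectors and ordering from nearest to farthest; neither the distance measure nor the sort direction is stated on the slide, and the function names are illustrative.

```python
def a_priori_probability(classes, target):
    """Nc / N: fraction of database images that belong to class `target`."""
    return sum(1 for c in classes if c == target) / len(classes)


def relevance_labels(query, features, classes):
    """Return the 0/1 label sequence for query image index `query`.

    features: list of feature vectors, one per database image.
    classes:  list of class labels, aligned with `features`.
    All other images are ordered by distance to the query (nearest first,
    an assumption) and labelled 1 if they share the query's class, else 0.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    others = [i for i in range(len(features)) if i != query]  # D \ {K}
    others.sort(key=lambda i: sq_dist(features[i], features[query]))
    return [1 if classes[i] == classes[query] else 0 for i in others]
```

A feature that separates the classes well produces a label sequence with the ones concentrated at the front.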
Quantitative results observed probability that given image K has an image from the same class at position I weighted observed probability
Quantitative results for the first k retrieved images
Conclusion Preliminary results show potential of proposed method Remaining problem: How to measure performance?
Thank you! Questions?