Content-Based Image Retrieval using the EMD algorithm Igal Ioffe George Leifman Supervisor: Doron Shaked Winter-Spring 2000 Technion - Israel Institute of Technology Department of Electrical Engineering The Vision Research and Image Science Laboratory
Project Goal Color Image DB, Source Image, Similar Images Estimate the similarity between pairs of images Order the database images by their similarity to the query (source) image
System Overview Distance Similar images DB Image features Query process Query Image
Overview:Images & Histograms
Overview: Distance Minkowski-form distance (L2) EMD: Earth Mover's Distance
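For reference, the Minkowski-form L2 distance compares two histograms bin by bin. The sketch below assumes both histograms use the same bins in the same order; the function name `l2Distance` is hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Euclidean (L2) distance between two histograms with identical bin layout.
double l2Distance(const std::vector<double>& h1, const std::vector<double>& h2) {
    if (h1.size() != h2.size()) {
        throw std::invalid_argument("histograms must have the same number of bins");
    }
    double sumOfSquares = 0.0;
    for (std::size_t i = 0; i < h1.size(); ++i) {
        const double diff = h1[i] - h2[i];
        sumOfSquares += diff * diff;
    }
    return std::sqrt(sumOfSquares);
}
```

Because only matching bins are compared, a small color shift that moves mass into a neighboring bin is penalized as heavily as a large shift. EMD avoids this by taking the ground distance between bins into account.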
Overview: Quantization Summarizes the image content Reduces the high computational complexity Original Image (20154 colors) Quantized Image (15 colors) Quantized Image (5 colors)
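Once a small palette has been chosen, the image can be summarized by the fraction of pixels that falls on each palette color. The sketch below assumes 8-bit RGB pixels and nearest-color assignment by squared RGB distance; `Rgb`, `nearestPaletteIndex` and `paletteHistogram` are hypothetical names, and the palette may come from any quantizer.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

using Rgb = std::array<std::uint8_t, 3>;

// Index of the palette color closest to the pixel (squared RGB distance).
std::size_t nearestPaletteIndex(const Rgb& pixel, const std::vector<Rgb>& palette) {
    std::size_t best = 0;
    long bestDist = std::numeric_limits<long>::max();
    for (std::size_t k = 0; k < palette.size(); ++k) {
        long dist = 0;
        for (std::size_t c = 0; c < 3; ++c) {
            const long d = static_cast<long>(pixel[c]) - static_cast<long>(palette[k][c]);
            dist += d * d;
        }
        if (dist < bestDist) {
            bestDist = dist;
            best = k;
        }
    }
    return best;
}

// Normalized histogram: hist[k] is the fraction of pixels mapped to palette[k].
std::vector<double> paletteHistogram(const std::vector<Rgb>& pixels,
                                     const std::vector<Rgb>& palette) {
    std::vector<double> hist(palette.size(), 0.0);
    if (pixels.empty() || palette.empty()) return hist;
    for (const Rgb& p : pixels) {
        hist[nearestPaletteIndex(p, palette)] += 1.0;
    }
    for (double& h : hist) {
        h /= static_cast<double>(pixels.size());
    }
    return hist;
}
```

With 5 or 15 palette colors the resulting histogram has 5 or 15 bins, which is what keeps the later distance computation cheap.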
Research Issues Color Quantization algorithms Quad Tree clustering Different color spaces EMD - Earth Mover's Distance algorithm
Median Cut vs. Maximum Diversity Maximum Diversity performs better than Median Cut for a small number of colors (<10) [Figure: Median Cut and Maximum Diversity results for 3 colors and 2 colors]
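As a reminder of what Median Cut does, here is a minimal sketch of one common formulation: repeatedly take the box of pixels with the widest channel range, split it at the median of that channel, and use each box's mean color as a palette entry. The project's own implementation may differ in the split and averaging rules; `Rgb`, `Box`, `channelRange` and `medianCutPalette` are hypothetical names. Maximum Diversity is not shown.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Rgb = std::array<std::uint8_t, 3>;
using Box = std::pair<std::size_t, std::size_t>;  // half-open pixel range [first, second)

// Value range (max - min) of one channel over the pixels of a box.
int channelRange(const std::vector<Rgb>& px, const Box& box, std::size_t ch) {
    int lo = 255, hi = 0;
    for (std::size_t i = box.first; i < box.second; ++i) {
        lo = std::min(lo, static_cast<int>(px[i][ch]));
        hi = std::max(hi, static_cast<int>(px[i][ch]));
    }
    return hi - lo;
}

// Builds a palette of at most maxColors entries. Pixels are taken by value
// because the algorithm reorders them.
std::vector<Rgb> medianCutPalette(std::vector<Rgb> pixels, std::size_t maxColors) {
    std::vector<Rgb> palette;
    if (pixels.empty() || maxColors == 0) return palette;

    std::vector<Box> boxes(1, Box(0, pixels.size()));
    while (boxes.size() < maxColors) {
        // Find the box and channel with the widest color range.
        std::size_t bestBox = 0, bestCh = 0;
        int bestRange = 0;
        for (std::size_t b = 0; b < boxes.size(); ++b) {
            for (std::size_t ch = 0; ch < 3; ++ch) {
                const int r = channelRange(pixels, boxes[b], ch);
                if (r > bestRange) { bestRange = r; bestBox = b; bestCh = ch; }
            }
        }
        if (bestRange == 0) break;  // every box already holds a single color

        // Split that box at the median of the chosen channel.
        const Box box = boxes[bestBox];
        const std::size_t mid = box.first + (box.second - box.first) / 2;
        auto at = [&pixels](std::size_t i) {
            return pixels.begin() + static_cast<std::ptrdiff_t>(i);
        };
        std::nth_element(at(box.first), at(mid), at(box.second),
                         [bestCh](const Rgb& a, const Rgb& b) {
                             return a[bestCh] < b[bestCh];
                         });
        boxes[bestBox] = Box(box.first, mid);
        boxes.push_back(Box(mid, box.second));
    }

    // One palette entry per box: the mean color of its pixels (rounded down).
    for (const Box& box : boxes) {
        unsigned long sum[3] = {0, 0, 0};
        for (std::size_t i = box.first; i < box.second; ++i) {
            for (std::size_t ch = 0; ch < 3; ++ch) sum[ch] += pixels[i][ch];
        }
        const unsigned long n = static_cast<unsigned long>(box.second - box.first);
        Rgb mean;
        for (std::size_t ch = 0; ch < 3; ++ch) {
            mean[ch] = static_cast<std::uint8_t>(sum[ch] / n);
        }
        palette.push_back(mean);
    }
    return palette;
}
```

Because every split halves a box by pixel count, large uniform regions tend to consume palette entries. That is one plausible reason for a diversity-driven method to do better when very few colors are allowed.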
Problems with Histogram
Quad Tree Clustering Recursive cluster definition Dynamic stop constraints
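The recursive definition can be sketched on a map of quantized color labels: a block becomes a cluster if it satisfies a stop constraint, otherwise it is split into four quadrants that are processed the same way. The stop constraints used here (all labels equal, or a minimum block size reached) are assumptions for illustration; the slides do not specify the project's dynamic constraints. `Block`, `labelAt`, `isHomogeneous`, `dominantLabel` and `quadTreeSplit` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <vector>

// A rectangular cluster and the quantized color label that represents it.
struct Block {
    int x, y, width, height;
    int label;
};

// Label of pixel (x, y) in a row-major label map.
int labelAt(const std::vector<int>& labels, int imageWidth, int x, int y) {
    return labels[static_cast<std::size_t>(y) * static_cast<std::size_t>(imageWidth) +
                  static_cast<std::size_t>(x)];
}

// True if every pixel in the block carries the same label.
bool isHomogeneous(const std::vector<int>& labels, int imageWidth,
                   int x, int y, int w, int h) {
    const int first = labelAt(labels, imageWidth, x, y);
    for (int yy = y; yy < y + h; ++yy)
        for (int xx = x; xx < x + w; ++xx)
            if (labelAt(labels, imageWidth, xx, yy) != first) return false;
    return true;
}

// Most frequent label inside the block.
int dominantLabel(const std::vector<int>& labels, int imageWidth, const Block& b) {
    std::map<int, int> counts;
    int best = labelAt(labels, imageWidth, b.x, b.y);
    int bestCount = 0;
    for (int yy = b.y; yy < b.y + b.height; ++yy)
        for (int xx = b.x; xx < b.x + b.width; ++xx) {
            const int l = labelAt(labels, imageWidth, xx, yy);
            const int c = ++counts[l];
            if (c > bestCount) { bestCount = c; best = l; }
        }
    return best;
}

// Recursively splits the block into quadrants until a stop constraint holds.
void quadTreeSplit(const std::vector<int>& labels, int imageWidth,
                   int x, int y, int w, int h, int minSize, std::vector<Block>& out) {
    if (w <= 0 || h <= 0) return;
    minSize = std::max(minSize, 1);
    // Assumed stop constraints: block too small to split, or already uniform.
    if (w <= minSize || h <= minSize || isHomogeneous(labels, imageWidth, x, y, w, h)) {
        Block b{x, y, w, h, 0};
        b.label = dominantLabel(labels, imageWidth, b);
        out.push_back(b);
        return;
    }
    const int w1 = w / 2, h1 = h / 2;
    quadTreeSplit(labels, imageWidth, x,      y,      w1,     h1,     minSize, out);
    quadTreeSplit(labels, imageWidth, x + w1, y,      w - w1, h1,     minSize, out);
    quadTreeSplit(labels, imageWidth, x,      y + h1, w1,     h - h1, minSize, out);
    quadTreeSplit(labels, imageWidth, x + w1, y + h1, w - w1, h - h1, minSize, out);
}
```

The resulting blocks carry position and size as well as color, which is the spatial information a plain histogram discards.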
Q.Tree Clustering Examples
Color Spaces RGB color space: each pixel is represented as a linear combination of red, green and blue CIE LAB color space: closer to the human visual system
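Moving from RGB to CIE LAB requires an intermediate XYZ step. The sketch below assumes the input is 8-bit sRGB and uses the D65 reference white; the slides do not state which RGB primaries or white point the project used, so both are assumptions. `Lab`, `srgbToLinear`, `labF` and `rgbToLab` are hypothetical names.

```cpp
#include <cmath>
#include <cstdint>

struct Lab {
    double L, a, b;
};

// Undo the sRGB transfer curve: 8-bit value -> linear intensity in [0, 1].
double srgbToLinear(std::uint8_t v) {
    const double c = v / 255.0;
    return c <= 0.04045 ? c / 12.92 : std::pow((c + 0.055) / 1.055, 2.4);
}

// Nonlinearity of the CIE LAB definition.
double labF(double t) {
    const double delta = 6.0 / 29.0;
    return t > delta * delta * delta ? std::cbrt(t)
                                     : t / (3.0 * delta * delta) + 4.0 / 29.0;
}

// Assumed pipeline: sRGB -> linear RGB -> XYZ (D65) -> CIE LAB.
Lab rgbToLab(std::uint8_t r8, std::uint8_t g8, std::uint8_t b8) {
    const double r = srgbToLinear(r8);
    const double g = srgbToLinear(g8);
    const double b = srgbToLinear(b8);

    // Linear sRGB to XYZ, D65 white point.
    const double x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b;
    const double y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b;
    const double z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b;

    // Normalize by the D65 reference white (Xn, Yn, Zn).
    const double fx = labF(x / 0.95047);
    const double fy = labF(y / 1.00000);
    const double fz = labF(z / 1.08883);

    return Lab{116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)};
}
```

In LAB, Euclidean distance between two colors tracks perceived difference more closely than it does in RGB, which makes it a natural ground distance between palette colors.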
EMD Bipartite network flow problem Can be formalized as the well-known transportation problem from linear programming Objective: minimize the total transportation cost Efficient and fast Simplex-based solutions
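For reference, the transportation problem behind EMD can be written as follows. This is the standard formulation, with notation chosen here rather than taken from the slides: image P has m clusters with weights w_{p_i}, image Q has n clusters with weights w_{q_j}, d_{ij} is the ground distance between cluster i of P and cluster j of Q, and f_{ij} is the flow moved between them.

```latex
\[
\begin{aligned}
\min_{f_{ij}} \quad & \sum_{i=1}^{m}\sum_{j=1}^{n} f_{ij}\, d_{ij} \\
\text{subject to} \quad
  & f_{ij} \ge 0, && 1 \le i \le m,\ 1 \le j \le n, \\
  & \sum_{j=1}^{n} f_{ij} \le w_{p_i}, && 1 \le i \le m, \\
  & \sum_{i=1}^{m} f_{ij} \le w_{q_j}, && 1 \le j \le n, \\
  & \sum_{i=1}^{m}\sum_{j=1}^{n} f_{ij}
    = \min\Bigl(\sum_{i=1}^{m} w_{p_i},\ \sum_{j=1}^{n} w_{q_j}\Bigr).
\end{aligned}
\]

\[
\mathrm{EMD}(P, Q) =
  \frac{\sum_{i=1}^{m}\sum_{j=1}^{n} f^{*}_{ij}\, d_{ij}}
       {\sum_{i=1}^{m}\sum_{j=1}^{n} f^{*}_{ij}}
\]
```

Here f^{*} denotes the optimal flow. Dividing by the total flow normalizes the cost, so that images with different total weights remain comparable.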
Principal Block Scheme (a) Color Image Database (b) Preprocess each image (c) Store properties of each image in file (d) Start database navigation
DB Creator demo
DB Navigator demo
Results
Why Visual C++? Graphical, user-friendly interface Faster than Matlab C++ Object-Oriented Design Patterns Usage of MFC: an effective and convenient way to manipulate large database structures, information reordering and querying (files, strings, arrays, etc.)
Code Optimizations Effective Cache Usage Decreasing data dependencies in out-of-order execution Loop Unrolling Using Multi-Threading to achieve performance gain on Multi-Processor systems
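The multi-threading item can be illustrated with a reduction split across worker threads. This sketch uses the standard `std::thread` rather than whatever threading API the project used, and `parallelSum` is a hypothetical name. Each worker writes only its own slot of `partial`, so no locking is needed.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Sums the data by giving each thread one contiguous chunk.
double parallelSum(const std::vector<double>& data, unsigned numThreads) {
    numThreads = std::max(1u, numThreads);
    std::vector<double> partial(numThreads, 0.0);
    std::vector<std::thread> workers;

    // Chunk size rounded up so that all elements are covered.
    const std::size_t chunk = (data.size() + numThreads - 1) / numThreads;
    for (unsigned t = 0; t < numThreads; ++t) {
        const std::size_t begin = std::min(data.size(), static_cast<std::size_t>(t) * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        workers.emplace_back([&data, &partial, t, begin, end] {
            partial[t] = std::accumulate(
                data.begin() + static_cast<std::ptrdiff_t>(begin),
                data.begin() + static_cast<std::ptrdiff_t>(end), 0.0);
        });
    }
    for (std::thread& w : workers) {
        w.join();
    }
    // Combine the per-thread results on the calling thread.
    return std::accumulate(partial.begin(), partial.end(), 0.0);
}
```

The same pattern applies to a query: distances from the query image to disjoint slices of the database are independent, so each slice can be processed by its own thread.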
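Code Optimizations - Examples
Effective cache usage: keep the fields touched during list traversal (key, next) adjacent
Before: struct Rec { Key key; Data data; Rec *next; };
After: struct Rec { Key key; Rec *next; Data data; };
Decreasing data dependencies by loop unrolling (assumes N is even)
Before: for (i = 0; i < N; i++) { acc += a[i]; }
After: for (i = 0; i < N; i += 2) { acc1 += a[i]; acc2 += a[i+1]; } acc = acc1 + acc2;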
Conclusions EMD captures the perceptual similarity and dissimilarity of images well Using both the color histogram and the image cluster map improves the results compared with the histogram alone No single color space is preferable, but combining color spaces leads to better results
Issues For Further Research Including texture properties in the image description Testing the application on very large image databases (> images) Handling various image transformations, e.g. partial image, scaling, rotation More advanced image feature combinations, including color, texture and position