1 Faculty of Information Technology Generic Fourier Descriptor for Shape-based Image Retrieval Dengsheng Zhang, Guojun Lu Gippsland School of Comp. & Info Tech Monash University Churchill, VIC 3842 Australia
2 Outline
Motivations
Problems
Generic Fourier Descriptor (GFD)
Experimental Results
Conclusions
3 Motivations
Content-based Image Retrieval
–Image description is important for image searching
–Image description constitutes one of the key parts of MPEG-7
–Shape is an important image feature along with color and texture
Effective and Efficient Shape Descriptor
–good retrieval accuracy, compact features, general application, low computation complexity, robust retrieval performance and hierarchical coarse-to-fine representation
4 Fourier Descriptor
Obtained by applying the Fourier transform to a shape signature, such as the central distance function r(t).
[Figure: shapes with no contour; shapes with the same contour and different content]
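A minimal sketch of a contour-based Fourier descriptor built from the central distance signature r(t). The function name `central_distance_fd`, the parameter `n_coeffs`, the use of the mean of the boundary points as the centre, and the division by the DC term for scale invariance are assumptions made for illustration, not details taken from the slides.

```python
import cmath
import math


def central_distance_fd(contour, n_coeffs=8):
    """Fourier descriptor of a closed contour given as a list of (x, y) points."""
    n = len(contour)
    # Centre of the boundary samples (assumed here to stand in for the shape centroid).
    xc = sum(x for x, _ in contour) / n
    yc = sum(y for _, y in contour) / n
    # Shape signature r(t): distance from each boundary point to the centre.
    r = [math.hypot(x - xc, y - yc) for x, y in contour]
    # Magnitudes of the 1-D DFT of r(t); taking magnitudes discards the
    # dependence on the starting point of the boundary trace.
    mags = []
    for k in range(n_coeffs + 1):
        a = sum(r[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(a) / n)
    # Divide by the DC term so that uniform scaling leaves the features unchanged.
    return [m / mags[0] for m in mags[1:]]
```

Because the signature is traced along the boundary, this kind of descriptor needs a contour to exist and cannot distinguish shapes that share a contour but differ in interior content, which is the problem the slide illustrates.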
5 Zernike Moments
Acquired by applying the Zernike moment transform on a shape region in polar space.
–Complex form
–Does not allow finer resolution in the radial direction
–Creates a number of repetitions in each order of moment
–Shape must be normalized into a unit disk
6 Generic Fourier Descriptor
Polar Transform
–For an input image f(x, y), it is first transformed into polar image f(r, θ)
–Find R = max{ r(θ) }
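The polar transform step can be sketched as follows. This is an illustration under stated assumptions: the image is a binary list of rows, the origin is the shape centroid (the deck later states that the centroid is used as origin), and the helper name `to_polar` is hypothetical.

```python
import math


def to_polar(image):
    """Return (centroid, polar points, R) for a binary image given as rows of 0/1."""
    # Coordinates of all shape pixels.
    pts = [(x, y) for y, row in enumerate(image) for x, v in enumerate(row) if v]
    # Shape centroid, used as the origin of the polar space (assumption).
    xc = sum(x for x, _ in pts) / len(pts)
    yc = sum(y for _, y in pts) / len(pts)
    # (r, theta) for every shape pixel, with theta wrapped into [0, 2*pi).
    polar = [
        (math.hypot(x - xc, y - yc), math.atan2(y - yc, x - xc) % (2 * math.pi))
        for x, y in pts
    ]
    # R is the largest radius reached by the shape.
    big_r = max(r for r, _ in polar)
    return (xc, yc), polar, big_r
```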
7 Generic Fourier Descriptor-II
Polar Raster Sampling
[Figure: polar grid; polar image; polar raster sampled image in Cartesian space]
8 Generic Fourier Descriptor-III
Polar raster sampling
[Figure: binary polar raster sampled shape images]
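One way polar raster sampling could be implemented is sketched below. The slides show this step only as figures, so the details here are assumptions: nearest-pixel lookup, radii spaced evenly from 0 up to (but not including) the maximum radius, and the hypothetical names `polar_raster_sample`, `n_r` and `n_t` for the function and the radial and angular sample counts.

```python
import math


def polar_raster_sample(image, n_r, n_t):
    """Sample a binary image (rows of 0/1) on an n_r x n_t polar grid."""
    pts = [(x, y) for y, row in enumerate(image) for x, v in enumerate(row) if v]
    xc = sum(x for x, _ in pts) / len(pts)
    yc = sum(y for _, y in pts) / len(pts)
    r_max = max(math.hypot(x - xc, y - yc) for x, y in pts)
    height, width = len(image), len(image[0])
    # Rows index radius, columns index angle: a rectangular image in (r, theta).
    grid = [[0] * n_t for _ in range(n_r)]
    for a in range(n_r):
        radius = r_max * a / n_r
        for i in range(n_t):
            theta = 2 * math.pi * i / n_t
            # Nearest-pixel lookup (assumption; interpolation is also possible).
            x = int(round(xc + radius * math.cos(theta)))
            y = int(round(yc + radius * math.sin(theta)))
            if 0 <= x < width and 0 <= y < height:
                grid[a][i] = image[y][x]
    return grid
```

With this layout, rotating the shape about its centroid by a multiple of the angular step shifts the columns of the grid circularly.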
9 Generic Fourier Descriptor-IV
2-D Fourier transform on the polar raster sampled image f(r, θ), where 0 ≤ r < R and θ_i = i(2π/T) (0 ≤ i < T); 0 ≤ ρ < R, 0 ≤ φ < T. R and T are the radial frequency resolution and angular frequency resolution respectively. The normalized Fourier coefficients are the GFD.
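A minimal sketch of a 2-D Fourier transform over a polar raster sampled grid, evaluated directly for the low-frequency coefficients only. It assumes the standard discrete Fourier form with a negative exponent (the sign does not affect the magnitudes), treats the grid rows as radial samples and the columns as angular samples, and uses the hypothetical names `polar_fourier`, `m` and `n` for the function and the number of radial and angular frequencies kept.

```python
import cmath
import math


def polar_fourier(grid, m, n):
    """Evaluate PF(rho, phi) for 0 <= rho < m and 0 <= phi < n on a polar grid."""
    n_r, n_t = len(grid), len(grid[0])
    pf = [[0j] * n for _ in range(m)]
    for rho in range(m):
        for phi in range(n):
            total = 0j
            for a in range(n_r):          # radial sample index
                for i in range(n_t):      # angular sample index
                    if grid[a][i]:
                        phase = a * rho / n_r + i * phi / n_t
                        total += grid[a][i] * cmath.exp(-2j * math.pi * phase)
            pf[rho][phi] = total
    return pf
```

A circular shift of the grid along the angular axis changes only the phase of each coefficient, so the magnitudes |PF(ρ, φ)| stay the same.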
10 Generic Fourier Descriptor-V
Rotation invariant
[Figure: polar raster sampled images and their Fourier spectra (PF)]
11 Generic Fourier Descriptor-VI
Translation invariance comes from using the shape centroid as origin.
Scale normalization is applied to the Fourier coefficients.
Because f(x, y) is real, only a quarter of the transformed coefficients are distinct.
The first 36 coefficients are selected as the shape descriptor.
The similarity between two shapes is measured by the city block distance between the two sets of GFDs.
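The feature selection and matching steps can be sketched as below. The normalization shown (DC magnitude divided by an area term, every other magnitude divided by the DC magnitude) is an assumed scheme, and the 4 x 9 split of the 36 coefficients in the usage note is likewise an assumption. The names `gfd_from_pf`, `city_block` and the parameter `area` are hypothetical.

```python
def gfd_from_pf(pf_mag, area):
    """Flatten an m x n table of |PF| magnitudes into a normalized feature vector."""
    dc = pf_mag[0][0]
    # First feature: DC magnitude relative to an area term (assumed normalization).
    feats = [dc / area]
    for rho, row in enumerate(pf_mag):
        for phi, value in enumerate(row):
            if rho == 0 and phi == 0:
                continue
            # Remaining features: magnitudes relative to the DC magnitude.
            feats.append(value / dc)
    return feats


def city_block(a, b):
    """City block (L1) distance between two feature vectors of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))
```

With a 4 x 9 table of magnitudes, `gfd_from_pf` returns 36 features, and two shapes are compared with `city_block(gfd_a, gfd_b)`; a smaller value means more similar shapes.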
12 Experiment
Datasets
–The MPEG-7 region shape database (CE-2) has been tested. CE-2 has been organized by MPEG-7 into six datasets to test a shape descriptor's behavior under different distortions.
–Set A1 is for the test of scale invariance. 100 shapes in Set A1 have been classified into 20 groups, which are designated as queries.
–Set A2 is for the test of rotation invariance. 140 shapes in Set A2 have been classified into 20 groups, which are designated as queries.
–Set A3 is for the test of rotation/scaling invariance.
–Set A4 is for the test of robustness to perspective transform. 330 shapes in Set A4 have been classified into 30 groups, which are designated as queries.
–Set B consists of 2811 shapes from the whole database; it is for the subjective test. 682 shapes in Set B have been manually classified into 10 groups by MPEG-7.
–For the whole database, 651 shapes have been classified into 31 groups, which can be used as queries.
13 Performance Measurement
Precision-Recall
For each query, the precision of the retrieval at each level of recall is obtained. The reported precision is the average of the precision over all query retrievals.
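The averaging described above can be sketched as follows. This assumes precision is read at the rank where each recall level is first reached, and that every query has at least one relevant item; the function names and the `(ranked, relevant)` query format are hypothetical.

```python
def precision_at_recall_levels(ranked, relevant, levels):
    """Precision at the first rank where each recall level is reached."""
    relevant = set(relevant)
    curve = []
    for level in levels:
        hits = 0
        precision = 0.0  # stays 0.0 if the recall level is never reached
        for rank, item in enumerate(ranked, start=1):
            if item in relevant:
                hits += 1
                if hits / len(relevant) >= level:
                    precision = hits / rank
                    break
        curve.append(precision)
    return curve


def average_precision_curve(queries, levels):
    """Average the per-query precision at each recall level over all queries."""
    curves = [precision_at_recall_levels(r, rel, levels) for r, rel in queries]
    return [sum(c[i] for c in curves) / len(curves) for i in range(len(levels))]
```

Here `queries` is a list of `(ranked_results, relevant_items)` pairs, one per query shape, and `levels` is a list of recall values such as `[0.1, 0.2, ..., 1.0]`.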
14 Results
Average Precision-Recall on Set A1 and A2
[Figure panels: Scale Invariance Test; Rotation Invariance Test]
15 Results
Average Precision-Recall on Set A4 and CE-2
[Figure panels: Perspective Invariance Test; General Invariance Test]
16 Average Precision-Recall on Set B (Subjective Test)
[Table columns: Class; No. of shapes; GFD (%); ZMD (%); final row: Average]
17 Results
20 Conclusions
A new shape descriptor, the generic Fourier descriptor (GFD), has been proposed.
It has been tested on the MPEG-7 region shape database.
Comparisons have been made between GFD and the MPEG-7 shape descriptor ZMD.
Compared with ZMD, GFD has four advantages:
–it captures spectral features in both radial and circular directions;
–it is simpler to compute;
–it is more robust and perceptually meaningful;
–the physical meaning of each feature is clearer.
The proposed GFD satisfies all six requirements set by MPEG-7 for shape representation.