1
Image Retrieval Based on the Wavelet Features of Interest
Te-Wei Chiang, Tienwei Tsai, and Yo-Ping Huang 2006/10/10 Mr. Chairman (Madam Chairman), Ladies and Gentlemen: (Or: Thank you, Mr. Chairman (Madam Chairman). Ladies and Gentlemen.) I'm pleased to have you all here today. My name is XXX. I'm here today to present a paper titled "An Image Retrieval Approach Based on Dominant Wavelet Features".
2
Outline 1. Introduction 2. Proposed Image Retrieval System
3. Experimental Results 4. Conclusions The outline of this presentation is as follows: First, I'll start with the introduction. Second, I'll introduce the proposed image retrieval system and each of its modules, including the color space transformation, the feature extraction, and the similarity measurement. Next, I'll show the experimental results. Finally, I'll summarize our approach and talk about future work.
3
1. Introduction Two approaches for image retrieval:
query-by-text (QBT): annotation-based image retrieval (ABIR) query-by-example (QBE): content-based image retrieval (CBIR) Standard CBIR techniques can find only the images that exactly match the user query. With the spread of the Internet and the growth of multimedia applications, the demand for image retrieval has increased. Basically, image retrieval procedures can be roughly divided into two approaches: query-by-text (QBT) and query-by-example (QBE). In QBT, queries are texts and targets are images; in QBE, queries are images and targets are images. For practicality, images in QBT retrieval are often annotated with words. When images are sought using these annotations, such retrieval is known as annotation-based image retrieval (ABIR). ABIR has the following drawbacks. First, manual image annotation is time-consuming and therefore costly. Second, human annotation is subjective. Furthermore, some images cannot be annotated because it is difficult to describe their content with words. On the other hand, annotations are not necessary in a QBE setting, although they can be used. The retrieval is carried out according to the image contents. Such retrieval is known as content-based image retrieval (CBIR) [5]. CBIR has become popular as a means of retrieving the desired images automatically. However, standard CBIR techniques can find only the images that exactly match the user query. The retrieval result will be unsatisfactory if the query form cannot easily represent the content. To overcome such problems, we propose an information retrieval method that allows users to conduct a query by transforming human cognition into numerical data based on their impressions of or opinions on the target.
4
Transform type feature extraction techniques
In QBE, the retrieval of images is basically done via the similarity between the query image and all candidates in the image database. Euclidean distance Transform type feature extraction techniques Wavelet, Walsh, Fourier, 2-D moment, DCT, and Karhunen-Loeve. In our approach, the wavelet transform is used to extract low-level texture features. In QBE, the retrieval of images is basically done via the similarity between the query image and all candidates in the image database. To evaluate the similarity between two images, one of the simplest ways is to calculate the Euclidean distance between the feature vectors representing the two images. To obtain the feature vector of an image, some transform type feature extraction technique can be applied, such as the wavelet, Walsh, Fourier, 2-D moment, DCT, or Karhunen-Loeve transform. In our approach, the wavelet transform is used to extract low-level texture features. Owing to its energy-compaction property, much of the signal energy tends to lie in the low-frequency (approximation) subbands.
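As a minimal sketch of this similarity computation (the function name and the use of NumPy are our own choices, not taken from the paper), the Euclidean distance between two feature vectors can be computed as:

```python
import numpy as np

def euclidean_distance(feat_q, feat_x):
    """Euclidean distance between two equally sized feature vectors."""
    feat_q = np.asarray(feat_q, dtype=float)
    feat_x = np.asarray(feat_x, dtype=float)
    return float(np.sqrt(np.sum((feat_q - feat_x) ** 2)))
```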
5
In this paper, we focus on the QBE approach
In this paper, we focus on the QBE approach. The user gives an example image similar to the one he/she is looking for. Finally, the images in the database with the smallest distance to the query image are returned, ranked according to their similarity.
6
System Architecture This system consists of two major modules:
the feature extraction module the similarity measuring module. In the image database establishing phase: each image is first transformed from the standard RGB color space to the YUV space; then each component (i.e., Y, U, and V) of the image is further transformed to the wavelet domain. In the image retrieving phase: the similarity measuring module compares the most significant wavelet coefficients of the Y, U, and V components of the query image with those of the images in the database and finds good matches. To benefit from user-machine interaction, a GUI is developed, allowing users to adjust weights for each feature according to their preferences. This system consists of two major modules: 1) the feature extraction module and 2) the similarity measuring module. To check whether one image is similar to another, we have to compare the features of the two images. Therefore, the feature extraction module plays an important role in both the image database establishing phase and the image retrieving phase. In the image database establishing phase, each image is first transformed from the standard RGB color space to the YUV space; then each component (i.e., Y, U, and V) of the image is further transformed to the wavelet domain. In the image retrieving phase, after the features (i.e., the most significant wavelet coefficients) of the query image have been extracted by the feature extraction module, the similarity measuring module compares the most significant wavelet coefficients of the Y, U, and V components of the query image with those of the images in the database and finds good matches. In addition, to benefit from user-machine interaction, a GUI is developed, allowing users to adjust the weights for each feature according to their preferences.
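A high-level sketch of these two phases follows (the helpers extract_features and distance are placeholders for the color transformation, wavelet decomposition, and distance computation described on the following slides; the function names and dictionary layout are our assumptions):

```python
def build_database(images, extract_features):
    """Database establishing phase: extract and store the wavelet features of every image.

    'images' maps an image name to its RGB array; 'extract_features' stands in for the
    RGB -> YUV -> DWT feature extraction sketched on later slides.
    """
    return {name: extract_features(img) for name, img in images.items()}

def retrieve(query_image, database, extract_features, distance, top_k=10):
    """Retrieving phase: extract the query's features and rank database images by distance."""
    q_feats = extract_features(query_image)
    scored = sorted((distance(q_feats, feats), name) for name, feats in database.items())
    return scored[:top_k]
```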
7
Feature Extraction Features are functions of the measurements performed on a class of objects (or patterns) that enable that class to be distinguished from other classes in the same general category. Color Space Transformation RGB (Red, Green, and Blue) -> YUV (luminance and chrominance channels) Features are functions of the measurements performed on a class of objects (or patterns) that enable that class to be distinguished from other classes in the same general category. Basically, we have to extract distinguishable and reliable features from the images. Before the feature extraction process, the images have to be converted to the desired color space. Then each color component of the image is further transformed to the wavelet domain to constitute the main features of the original images.
8
YUV color space YUV is based on the Y primary and chrominance.
The Y primary was specifically designed to follow the luminous efficiency function of the human eye. Chrominance is the difference between a color and a reference white at the same luminance. The following equations are used to convert from the RGB to the YUV space: Y(x, y) = 0.299 R(x, y) + 0.587 G(x, y) + 0.114 B(x, y), U(x, y) = 0.492 (B(x, y) - Y(x, y)), and V(x, y) = 0.877 (R(x, y) - Y(x, y)). The YUV color space was originally used for PAL (the European analog video "standard"). Basically, the Y, U, and V components of an image can be regarded as the luminance, the blue chrominance, and the red chrominance, respectively. After converting from RGB to YUV, the features of each image can be extracted by the DWT.
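As a small illustration (a sketch only; the function name is ours, and the coefficients are the standard PAL ones given above), the per-pixel conversion can be written as:

```python
import numpy as np

def rgb_to_yuv(rgb):
    """Convert an H x W x 3 RGB image to its Y, U, and V planes (standard PAL coefficients)."""
    rgb = np.asarray(rgb, dtype=float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luminance
    u = 0.492 * (b - y)                    # blue chrominance
    v = 0.877 * (r - y)                    # red chrominance
    return y, u, v
```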
9
This figure gives a sample image to illustrate the RGB and YUV color spaces.
There are three objects in this figure, namely, a red apple, a green lemon, and a blue hat. It can be seen that discriminating the R, G, and B colors in the RGB color space is quite difficult, even though the brightness of each object is higher in its dominant color component; for instance, the brightness of the apple is higher in the R channel (see Figure 1(b)). On the other hand, the blue and red colors can be identified easily from the U and V channels (see Figure 1(f)-(g)). After converting from RGB to YUV, the features of each image can be extracted by the DWT.
10
Discrete Wavelet Transform
Mallat's pyramid algorithm An image can be decomposed into its wavelet coefficients by using Mallat's pyramid algorithm. At each decomposition level, the algorithm applies low-pass and high-pass filters along the rows and columns of the current approximation, followed by downsampling by two, yielding one approximation subband (LL) and three detail subbands (LH, HL, and HH).
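As an illustrative sketch, such a decomposition can be obtained with the PyWavelets library (the paper does not name a library or a mother wavelet, so both the library and the Haar wavelet below are our assumptions):

```python
import numpy as np
import pywt  # PyWavelets; the library choice is an assumption, not stated in the paper

def wavelet_subbands(channel, level=3, wavelet="haar"):
    """Decompose one color component and return its coarsest-level subbands.

    Returns LL(level) plus the (LH, HL, HH) detail subbands at that level.
    The Haar wavelet is only a placeholder choice.
    """
    coeffs = pywt.wavedec2(np.asarray(channel, dtype=float), wavelet, level=level)
    ll = coeffs[0]          # approximation subband LL(level)
    lh, hl, hh = coeffs[1]  # horizontal, vertical, and diagonal details at the coarsest level
    return ll, lh, hl, hh
```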
11
The wavelet decomposition is illustrated in this figure.
LL(0) is the original image. The outputs of the high-pass filters, LH(1), HL(1), and HH(1), are three subimages with the same size as the low-pass subimage LL(1); they present different image details in different directions, i.e., horizontal details, vertical details, and diagonal details. LL(1) corresponds to the approximation at the first resolution level, LL(2) corresponds to the approximation at the second resolution level, and so on.
12
Figure 3 shows a sample image to illustrate the wavelet decomposition.
From this figure, we can see that the horizontal, vertical, and diagonal details of the image are extracted into the LH, HL, and HH subimages, respectively; moreover, the resolutions of LL(1), LH(1), and HH(1) are finer than those of LL(2), LH(2), and HH(2); in other words, LL(2), LH(2), and HH(2) are abstractions of LL(1), LH(1), and HH(1). Therefore, by the wavelet decomposition, we can obtain not only the directional features of the images, but also features at various resolution levels.
13
Similarity Measurement
In our experimental system, we define a measure called the sum of squared differences (SSD) to indicate the degree of distance (or dissimilarity). The distance between Q and Xn under the Y component and the LL(k) subband can be defined as given below. A distance (or similarity) measure is a way of ordering the features from a specific query point. The retrieval performance of a feature can be significantly affected by the distance measure used. Ideally, we want to use a distance measure and feature combination that gives the best retrieval performance for the collection being queried. In our experimental system, we define a measure called the sum of squared differences (SSD) to indicate the degree of distance (or dissimilarity). Since the lowest-frequency wavelet subband, LL(k), is the most significant subband of the kth-level wavelet decomposition, we can retrieve the desired image(s) by comparing the LL(k) subband of the candidate images with that of the query image. Assume that q(i, j) and xn(i, j) represent the wavelet coefficients of the query image Q and the image Xn under LL(k), respectively. Then the distance between Q and Xn under the LL(k) subband can be defined as D_LL(k)(Q, Xn) = sum over all (i, j) of (q(i, j) - xn(i, j))^2, i.e., the sum of the squared coefficient differences over the subband.
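A direct sketch of this measure (assuming the LL(k) coefficients are available as NumPy arrays of equal size; the function name is illustrative):

```python
import numpy as np

def ssd(coeffs_q, coeffs_x):
    """Sum of squared differences between two equally sized coefficient arrays, e.g. LL(k)."""
    diff = np.asarray(coeffs_q, dtype=float) - np.asarray(coeffs_x, dtype=float)
    return float(np.sum(diff ** 2))
```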
14
The distance between Q and Xn under the component Y can be defined as the weighted combination of the distances under LL(k), LH(k), HL(k), and HH(k): D_Y(Q, Xn) = w_LL D_LL(k)(Q, Xn) + w_LH D_LH(k)(Q, Xn) + w_HL D_HL(k)(Q, Xn) + w_HH D_HH(k)(Q, Xn). Based on the observation that the subimages LL(1), LL(2), ..., LL(K) correspond to different resolution levels of the original image, the retrieval accuracy may be improved by considering these subimages at the same time. Therefore, the distance between Q and Xn can be modified into a weighted combination of these subimages, i.e., the sum over k of wk D_LL(k)(Q, Xn), where wk is the weight of the distance under the kth resolution level.
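A sketch of this weighted combination for one color component (reusing the hypothetical ssd helper sketched above; the subband ordering and function name are our own):

```python
def component_distance(q_subbands, x_subbands, weights):
    """Weighted sum of subband SSDs for one color component.

    q_subbands / x_subbands: matching sequences of coefficient arrays
    (e.g. [LL(3), LH(3), HL(3), HH(3)]); weights: one weight per subband.
    """
    return sum(w * ssd(q, x) for w, q, x in zip(weights, q_subbands, x_subbands))
```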
15
Then, the overall distance between Q and Xn can be defined as :
Likewise, the distances between Q and Xn under the components U and V can be defined. Then, the overall distance between Q and Xn can be defined as: Likewise, the distances between Q and Xn under the components U and V, i.e., D_U(Q, Xn) and D_V(Q, Xn), can be defined. Then, the overall distance between Q and Xn can be defined as D(Q, Xn) = wY D_Y(Q, Xn) + wU D_U(Q, Xn) + wV D_V(Q, Xn). In our system, the wavelet decomposition level for each component of the YUV image is 3; that is to say, the value of K is 3. Since the Y component is the most significant one among the Y, U, and V components, we keep more features for the Y component than for the U or V component. Thus, the number of features derived from the Y component of an image is 4, i.e., YLL(3), YLH(3), YHL(3), and YHH(3). Corresponding to these four features are four weights. As for the U and V components, only the two most significant features, i.e., ULL(3) and VLL(3), are used in our system; the corresponding two weights are wU and wV, which can be assigned through the GUI of our image retrieval system.
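Putting the pieces together, one possible sketch of the overall distance and the ranking step (the feature-dictionary layout, helper names, and weight handling are our assumptions, reusing the hypothetical ssd and component_distance helpers above):

```python
def overall_distance(q_feats, x_feats, w_y, w_u, w_v, y_subband_weights):
    """Overall distance: weighted Y-component distance plus weighted U and V LL(3) distances."""
    d_y = component_distance(q_feats["Y"], x_feats["Y"], y_subband_weights)  # YLL/YLH/YHL/YHH at level 3
    d_u = ssd(q_feats["U_LL"], x_feats["U_LL"])                              # ULL(3)
    d_v = ssd(q_feats["V_LL"], x_feats["V_LL"])                              # VLL(3)
    return w_y * d_y + w_u * d_u + w_v * d_v

def rank_database(q_feats, database, w_y, w_u, w_v, y_subband_weights, top_k=10):
    """Return the top_k database images in ascending order of distance to the query."""
    scored = [(overall_distance(q_feats, feats, w_y, w_u, w_v, y_subband_weights), name)
              for name, feats in database.items()]
    return sorted(scored)[:top_k]
```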
16
5. Experimental Results 1000 images downloaded from the WBIIS database are used to demonstrate the effectiveness of our system. The images are mostly photographic and have various contents, such as natural scenes, animals, insects, buildings, people, and so on. In this preliminary experiment, 1000 images downloaded from the WBIIS database are used to demonstrate the effectiveness of our system. The images are mostly photographic and have various contents, such as natural scenes, animals, insects, buildings, people, and so on. To verify the effectiveness of our approach, we conducted a series of experiments using different types of feature sets, mainly derived from the third wavelet decomposition level.
17
In the first experiment, a butterfly image of size 85×128 (as shown in Figure 4(a)) is used as the query image. Figure 4(b) shows the retrieved results using the Y-component approximation at level 3, i.e., YLL(3), as the main feature; in other words, wY and the weight for the LL(3) subband are set to 1, and the other weights are set to 0. The retrieved results are the top ten in similarity, where the items are ranked in ascending order of distance to the query image, from left to right and then from top to bottom. Figure 4(c)-(e) shows the retrieved results based on combinations of features, using the same query image; for example, Figure 4(c) shows the retrieved results based on YLL(3) and ULL(3). It can be seen that, as shown in Figure 4(b), using YLL(3) as the main feature is not good enough. Even though most of the retrieved images are similar to the query image from the viewpoint of shape, some of them are not similar from the viewpoint of color tone; e.g., the 2nd retrieved image (ID=895) is pink, whereas the butterfly in the query image is dark yellow. To avoid this inconsistency of color tone, we can emphasize the weights of the red and blue chrominance to exploit their ability to distinguish color tones. It is observed that the best results are obtained by using YLL(3), ULL(3), and VLL(3) as the main features.
18
In the next experiment, we use a scene of mountains at nightfall as the query image, as shown in Figure 5(a). Figure 5(b)-(e) shows the retrieved results based on the horizontal details, vertical details, and diagonal details at wavelet level 3, respectively. It can be seen that, for this horizontally textured query image, the retrieved results based on the vertical details or diagonal details are better than the results based on the horizontal details. This result falls short of our expectations. In fact, it can be explained from the viewpoint of screening. From Figure 5(c), we can see that there are some vertical components in the 4th, 5th, and 7th ranked images, which make them quite different from the query image; to filter out these undesirable images, we can simply set YHL(3) (vertical details) as the main feature, so that the system will focus on comparing the vertical components between the query and candidate images. The results of using YHL(3) as the main feature are shown in Figure 5(d).
19
In the next experiment, we use the blastoff of a rocket as the query image (see Figure 6(a)).
The retrieved results based on various combinations of features are shown in Figure 6(b)-(h), among which Figure 6(b)-(e) shows the retrieved results based on the horizontal details, vertical details, and diagonal details of the Y component at wavelet level 3, respectively. It can be seen that, when different wavelet features are used as the main feature, the retrieved results are quite different. From another point of view, the wavelet analysis acts like a filter on the image, with each wavelet level corresponding to a different frequency band in the original image. Therefore, the frequency band and direction that characterize the content of interest can serve as a guideline for determining which wavelet level and which directional details should be used in the query. Among the retrieved results in Figure 6(b)-(e), the results in Figure 6(b) are the best, in which a rocket image is retrieved in the first rank. To further improve the retrieved results, we combine the ULL(3) and/or VLL(3) features with YLL(3) in order to exclude the candidate images whose color tones are quite different from that of the query image; the results are shown in Figure 6(f)-(h), in which two rocket images are obtained.
20
6. Conclusions In this paper, a content-based image retrieval method based on the DWT is proposed. To achieve QBE, the system compares the most significant wavelet coefficients of the Y, U, and V components of the query image with those of the images in the database and finds good matches with the help of the users' cognition. Since there is no single feature capable of covering all aspects of an image, the discrimination performance is highly dependent on the selection of features and the images involved. Since several features are used simultaneously, it is necessary to integrate the similarity scores resulting from the matching processes.
21
Future Works For each type of feature, we will continue investigating and improving its ability to describe the image and its performance in similarity measuring. Since only a preliminary experiment has been conducted to test our approach, much work remains to improve this system.
22
Thank You !!! My presentation is over. Thank you very much for your attention.