An Efficient Image Retrieval System Based on Designated Wavelet Features
Presented by Te-Wei Chiang, June 2006

Mr. Chairman (Madam Chairman), Ladies and Gentlemen: (or: Thank you, Mr. Chairman (Madam Chairman). Ladies and Gentlemen.) I'm pleased to have you all here today. My name is XXX. I'm here today to present a paper titled "An Image Retrieval Approach Based on Dominant Wavelet Features".
Outline
1. Introduction
2. Feature Extraction
3. Similarity Measurement
4. Screening Scheme
5. Experimental Results
6. Conclusions

The outline of this presentation is as follows. First, I'll start with an introduction to image retrieval. Second, I'll describe the feature extraction method. Third, I'll introduce the similarity measurement. Next, I'll present the proposed screening scheme and show the experimental results. Finally, I'll summarize our approach and talk about future work.
1. Introduction

Two approaches for image retrieval:
query-by-text (QBT): annotation-based image retrieval (ABIR)
query-by-example (QBE): content-based image retrieval (CBIR)
Standard CBIR techniques can only find the images that exactly match the user query.

With the spread of the Internet and the growth of multimedia applications, the demand for image retrieval has increased. Image retrieval procedures can be roughly divided into two approaches: query-by-text (QBT) and query-by-example (QBE). In QBT, queries are texts and targets are images; in QBE, both queries and targets are images. In practice, images in QBT retrieval are often annotated with words; when images are sought using these annotations, the retrieval is known as annotation-based image retrieval (ABIR). ABIR has the following drawbacks. First, manual image annotation is time-consuming and therefore costly. Second, human annotation is subjective. Furthermore, some images cannot be annotated at all, because their content is difficult to describe in words. In a QBE setting, on the other hand, annotations are not necessary (although they can be used): retrieval is carried out according to the image contents. Such retrieval is known as content-based image retrieval (CBIR) [5], which has become popular as a way to retrieve the desired images automatically. However, standard CBIR techniques can only find the images that exactly match the user query, and the result will be unsatisfactory if the query form cannot easily represent the content. To overcome these problems, we propose a retrieval method that allows users to pose a query by transforming human cognition (their impressions of or opinions on the target) into numerical data.
Transform-type feature extraction techniques

In QBE, images are retrieved according to the similarity between the query image and the candidates in the image database. One of the simplest ways to evaluate the similarity between two images is to compute the Euclidean distance between the feature vectors representing them. To obtain the feature vector of an image, a transform-type feature extraction technique can be applied, such as the wavelet, Walsh, Fourier, 2-D moment, DCT, or Karhunen-Loeve transform. In our approach, the wavelet transform is used to extract low-level texture features; thanks to its energy-compaction property, much of the signal energy tends to lie in the low-frequency subbands.
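As a minimal sketch of this similarity computation (the function name is illustrative, not from the paper), the Euclidean distance between two feature vectors can be computed as:

```python
import numpy as np

def euclidean_distance(f1, f2):
    """Euclidean distance between two feature vectors."""
    f1 = np.asarray(f1, dtype=float)
    f2 = np.asarray(f2, dtype=float)
    return float(np.sqrt(np.sum((f1 - f2) ** 2)))

# Example: component differences are 3 and 4, so the distance is 5.
d = euclidean_distance([1.0, 2.0, 3.0], [4.0, 6.0, 3.0])  # 5.0
```

The smaller the distance, the more similar the two images are considered to be.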
Related Works

Content-based image retrieval is a technology for finding images similar to a query based only on the pixel representation. However, querying on raw pixel information is quite time-consuming, because it requires describing the location and intensity of every pixel. Therefore, choosing a suitable color space and reducing the amount of data to be computed are critical problems in image retrieval. Some systems employ color histograms, but histogram measures depend only on summations of identical pixel values and do not incorporate orientation and position. In other words, a histogram is only the statistical distribution of the colors and loses the local information of the image. We therefore propose an image retrieval scheme that works in the transform domain of the images, which reduces the data while retaining local information.
In this paper, we focus on the QBE approach. The user gives an example image similar to the one he or she is looking for. The images in the database with the smallest distances to the query image are then returned, ranked according to their similarity.
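The QBE retrieval loop described above can be sketched as follows (a simplified illustration, assuming feature vectors have already been extracted; names are hypothetical):

```python
import numpy as np

def rank_by_similarity(query_vec, db_vecs):
    """Return database indices sorted by ascending Euclidean distance
    to the query feature vector (most similar first)."""
    q = np.asarray(query_vec, dtype=float)
    dists = [float(np.linalg.norm(np.asarray(v, dtype=float) - q))
             for v in db_vecs]
    return sorted(range(len(dists)), key=lambda i: dists[i])

db = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
order = rank_by_similarity([5.0, 4.0], db)  # index 2 is closest
```

In a real system, only the top-ranked images would be displayed to the user.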
2. Feature Extraction

Features are functions of the measurements performed on a class of objects (or patterns) that enable that class to be distinguished from other classes in the same general category.

Color space transformation: RGB (red, green, blue) -> YUV (luminance and chrominance channels)

Basically, we have to extract distinguishable and reliable features from the images. Before feature extraction, the images are converted to the desired color space. The Y components of the images, which correspond to gray-level images, are then further transformed to the wavelet domain to constitute the main features of the original images.
YUV color space

Originally used for PAL (the European analog video standard), YUV is based on the CIE Y primary plus chrominance. The Y primary was specifically designed to follow the luminous efficiency function of the human eye. Chrominance is the difference between a color and a reference white at the same luminance. The following equations (with the standard coefficients) convert from RGB to YUV:

Y(x, y) = 0.299 R(x, y) + 0.587 G(x, y) + 0.114 B(x, y),
U(x, y) = 0.492 (B(x, y) - Y(x, y)), and
V(x, y) = 0.877 (R(x, y) - Y(x, y)).

The Y, U, and V components of an image can be regarded as the luminance, the blue chrominance, and the red chrominance, respectively. After converting from RGB to YUV, the features of each image are extracted by the DWT.
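A minimal sketch of this color-space conversion, using the standard coefficients above (the function name is illustrative):

```python
import numpy as np

def rgb_to_yuv(rgb):
    """Convert an H x W x 3 RGB image to YUV using the standard
    coefficients (Y from luminous efficiency, U/V as scaled B-Y, R-Y)."""
    rgb = np.asarray(rgb, dtype=float)
    R, G, B = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    Y = 0.299 * R + 0.587 * G + 0.114 * B
    U = 0.492 * (B - Y)
    V = 0.877 * (R - Y)
    return np.stack([Y, U, V], axis=-1)

img = np.full((2, 2, 3), 255.0)  # a pure white 2x2 patch
yuv = rgb_to_yuv(img)            # white: Y = 255, U = V = 0
```

Note that the Y coefficients sum to 1, so a gray pixel keeps its intensity and has zero chrominance.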
Wavelet Transform: Mallat's pyramid algorithm

An image can be decomposed into its wavelet coefficients by using Mallat's pyramid algorithm, which repeatedly applies a low-pass and a high-pass filter along the rows and columns of the image, downsampling by two after each filtering step.
The wavelet decomposition is illustrated in Figure 2. LL(0) is the original image. The outputs of the high-pass filters, LH(1), HL(1), and HH(1), are three subimages of the same size as the low-pass subimage LL(1), each presenting image details in a different direction. Figure 3 shows a sample image illustrating the wavelet decomposition. After decomposition, the image energy is distributed over different subbands, each of which keeps a specific frequency component; in other words, each subband image contains one feature. Intuitively, features in different subbands can be distinguished more easily than in the original image.
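One level of this decomposition can be sketched with the Haar wavelet, the simplest case (a minimal illustration assuming an even-sized image; which detail band is called "horizontal" vs. "vertical" varies by convention):

```python
import numpy as np

def haar_dwt2(img):
    """One level of 2-D Haar wavelet decomposition into LL, LH, HL, HH.

    Each 2x2 block of pixels is combined into one coefficient per subband;
    dividing by 4 keeps the LL band on the same scale as the input."""
    img = np.asarray(img, dtype=float)
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    LL = (a + b + c + d) / 4.0  # approximation (low frequency)
    LH = (a + b - c - d) / 4.0  # details along one direction
    HL = (a - b + c - d) / 4.0  # details along the other direction
    HH = (a - b - c + d) / 4.0  # diagonal details
    return LL, LH, HL, HH

flat = np.full((4, 4), 10.0)
LL, LH, HL, HH = haar_dwt2(flat)  # a constant image has zero details
```

Applying `haar_dwt2` again to `LL` yields the next level of the pyramid, which is exactly how LL(1), LL(2), ... are obtained.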
3. Similarity Measurement

A distance (or similarity) measure is a way of ordering features relative to a specific query point. The retrieval performance of a feature can be significantly affected by the distance measure used; ideally, we want the distance measure and feature combination that gives the best retrieval performance for the collection being queried. In our experimental system, we define a measure called the sum of squared differences (SSD) to indicate the degree of distance (or dissimilarity). Since the lowest-frequency wavelet subband, LL(k), is the most significant subband of a kth-level wavelet decomposition, we can retrieve the desired image(s) by comparing the LL(k) subband of each candidate image with that of the query image. Let q(i, j) and xn(i, j) denote the wavelet coefficients of the query image Q and of image Xn under LL(k), respectively. The distance between Q and Xn under the LL(k) subband can then be defined as

    D_LL(k)(Q, Xn) = sum_i sum_j ( q(i, j) - xn(i, j) )^2.
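A minimal sketch of this SSD distance over a subband's coefficient array (names are illustrative):

```python
import numpy as np

def ssd(coeffs_q, coeffs_x):
    """Sum of squared differences between two subband coefficient arrays
    of the same shape (e.g. the LL(k) subbands of two images)."""
    cq = np.asarray(coeffs_q, dtype=float)
    cx = np.asarray(coeffs_x, dtype=float)
    return float(np.sum((cq - cx) ** 2))

# Example: coefficient differences are 0, 2, 0, -3, so SSD = 4 + 9 = 13.
d = ssd([[1.0, 2.0], [3.0, 4.0]],
        [[1.0, 0.0], [3.0, 7.0]])
```

Unlike the Euclidean distance, the SSD omits the square root, which does not change the ranking of candidates.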
Based on the observation that the subimages LL(1), LL(2), ..., LL(k) correspond to different resolution levels of the original image, retrieval accuracy may be improved by considering several subimages (LL(k), LH(k), HL(k), HH(k)) at the same time. The distance between Q and Xn can therefore be modified to a weighted combination of the per-level subband distances:

    D(Q, Xn) = sum_k w_k * D_LL(k)(Q, Xn),    (9)

where w_k is the weight of the distance at the kth resolution level.
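The weighted combination in Eq. (9) can be sketched as follows (a simplified illustration with hypothetical names, using per-level LL subbands):

```python
import numpy as np

def weighted_distance(query_levels, cand_levels, weights):
    """Weighted combination of per-level SSD distances, in the style of
    Eq. (9): query_levels / cand_levels are lists of subband coefficient
    arrays, one per resolution level; weights are the user-set w_k."""
    total = 0.0
    for wk, ql, cl in zip(weights, query_levels, cand_levels):
        diff = np.asarray(ql, dtype=float) - np.asarray(cl, dtype=float)
        total += wk * float(np.sum(diff ** 2))
    return total

q = [np.array([[1.0]]), np.array([[2.0]])]
x = [np.array([[0.0]]), np.array([[4.0]])]
d = weighted_distance(q, x, weights=[1.0, 0.5])  # 1*1 + 0.5*4 = 3.0
```

Setting one weight to 1 and the rest to 0 recovers the single-subband distance of the previous slide.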
4. Screening Scheme

To speed up the retrieval process, a screening technique can be applied before retrieval. The idea underlying screening is to eliminate the obviously unqualified candidates via a certain criterion. In the database-establishing phase, the features of each image are extracted by the DWT. In the image-retrieving phase, when an example image is queried, the distances between the feature vector of the query image and those of all images in the database are calculated. Candidates whose distance to the query image exceeds a predefined threshold are then pruned, and the remaining images are regarded as the qualified candidates.
4.1 Screening Based on Overall Distances

To filter out images that are dissimilar to the query image Q in terms of the overall distance, we devise an average distance threshold, theta_a, for pruning:

    theta_a = (a / N) * sum_{n=1..N} D(Q, Xn),

where N is the number of images in the image database and a is a parameter used to adjust the pruning rate. We then obtain the following screening rule to eliminate dissimilar candidate images.

Screening rule R1: screening via the average distance threshold theta_a. The distance D(Q, Xn) is compared with theta_a; if D(Q, Xn) is larger than theta_a, image Xn is excluded from the set of candidate images for Q.
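Screening rule R1 can be sketched as follows (a minimal illustration; the threshold form, a scaled mean of all distances, is assumed from the description above):

```python
import numpy as np

def screen_r1(distances, a=0.7):
    """Screening rule R1: keep only candidates whose overall distance
    stays within the average-distance threshold theta_a = a * mean."""
    distances = np.asarray(distances, dtype=float)
    theta_a = a * distances.mean()
    return [i for i, d in enumerate(distances) if d <= theta_a]

# Example: mean distance is 4, so theta_a = 2.8 and the candidate
# with distance 9 is pruned.
kept = screen_r1([1.0, 2.0, 9.0], a=0.7)
```

Smaller values of a prune more aggressively; a controls the trade-off between speed and the risk of discarding a relevant image.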
4.2 Screening Based on Subband Distances

Similarly, to filter out images that are dissimilar to the query image Q in terms of the subband distances, we devise an average subband distance threshold for each subband, e.g. for LL(k):

    theta_LL(k) = (b / N) * sum_{n=1..N} D_LL(k)(Q, Xn),

where N is the number of images in the image database and b is a parameter used to adjust the pruning rate. The thresholds theta_LH(k), theta_HL(k), and theta_HH(k) are obtained in the same way. The following screening rule can then be used to eliminate dissimilar candidate images.

Screening rule R2: screening via the average subband distance thresholds theta_LL(k), theta_LH(k), theta_HL(k), and theta_HH(k). Each subband distance is compared with the corresponding threshold; if any subband distance is larger than its threshold, image Xn is excluded from the set of candidate images for Q.
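Screening rule R2 can be sketched as follows (a minimal illustration; as in R1, the per-subband threshold is assumed to be a scaled mean over the candidates):

```python
import numpy as np

def screen_r2(subband_dists, b=1.0):
    """Screening rule R2: a candidate survives only if *every* subband
    distance is within its average-subband-distance threshold.

    subband_dists is an N x 4 array of per-candidate distances under
    LL, LH, HL, HH; each threshold is b times that subband's mean."""
    D = np.asarray(subband_dists, dtype=float)
    thresholds = b * D.mean(axis=0)       # one threshold per subband
    ok = np.all(D <= thresholds, axis=1)  # failing any subband prunes
    return [i for i, keep in enumerate(ok) if keep]

D = [[1.0, 1.0, 1.0, 1.0],   # within every threshold -> kept
     [3.0, 1.0, 1.0, 1.0],   # too far in the LL subband -> pruned
     [1.0, 1.0, 1.0, 4.0]]   # too far in the HH subband -> pruned
kept = screen_r2(D, b=1.0)
```

Because R2 checks each subband separately, it can prune a candidate whose overall distance is small but which differs strongly from the query in one directional detail band.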
5. Experimental Results

In this preliminary experiment, 1000 images downloaded from the WBIIS [10] database are used to demonstrate the effectiveness of our system. The user can query with an external image or with an image from the database. The difference between these two options is that the features of an external image must be extracted at query time, whereas for an image already in the database the features have been extracted in advance and stored alongside the image; in that case, only the index of the image in the database is transferred to the server. In both cases, the features used are imposed by the database selected at the start. In this experiment, we use only images from the database as query images. To evaluate the retrieval efficiency of the proposed method, we use the precision rate as the performance measure:

    precision rate = Rr / Tr,    (10)

where Rr is the number of relevant retrieved items and Tr is the total number of retrieved items.
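Eq. (10) is straightforward to compute; a minimal sketch (names are illustrative):

```python
def precision_rate(relevant_retrieved, total_retrieved):
    """Precision rate of Eq. (10): Rr / Tr."""
    if total_retrieved == 0:
        return 0.0  # no retrieved items: define precision as 0
    return relevant_retrieved / total_retrieved

# Example: 7 of the top 10 retrieved images are relevant.
p = precision_rate(7, 10)  # 0.7
```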
The images are mostly photographic and have various contents, such as natural scenes, animals, insects, buildings, people, and so on. To verify the effectiveness of our approach, we conducted a series of experiments using the proposed screening scheme and different types of feature sets. In the first experiment, a sky scene (shown in Figure 3) is used as the query image. The size of the query image is 85 x 128, and the images of the same size in the database were taken as the initial candidates; for this query image, 407 of the 1000 images served as initial candidates. The retrieved results are the top ten in similarity, ranked in ascending order of distance to the query image, from left to right and then from top to bottom. Figure 3 shows the GUI of our system and the results retrieved using the level-3 approximations, i.e. LL(3), as the main feature, without the screening scheme. The lower-right frames in the figure contain scrollbars by which users can adjust the weight of each feature at each wavelet level; the weights range from 0 to 1. In this case, the weight of the LL(3) feature is set to 1 and the others are set to 0.
Figure 4 shows the results retrieved with the same query image using screening rules R1 and R2, based on the approximations at wavelet level 3. With screening rule R1 (a = 0.7), the number of candidate images is reduced from 407 to 27; the retrieved results, shown in Figure 4(b), are the same as those obtained without screening. Because the screening step reduces the subsequent effort required to sort the candidate images, it improves efficiency. With screening rules R1 (a = 0.7) and R2 (b = 1), the number of candidate images is further reduced from 27 to 14; the retrieved results are shown in Figure 4(c). Besides shrinking the candidate set, applying rule R2 also improves the retrieved results: the top four candidates are closely related to the query image, because images that are dissimilar to the query in some wavelet subband have been filtered out. Screening with R1 and R2 is therefore not only efficient but also effective.
In the next experiment, a sunset scene is used as the query image, as shown in Figure 5(a). The size of the image is 85 x 128, and 407 of the 1000 images of the same size in the database are the initial candidates. Figure 5(b) shows the results retrieved using the level-3 approximations without screening, and Figure 5(c) shows the results for the same wavelet feature with screening rules R1 and R2. To illustrate the effect of different wavelet features, Figures 5(d)-(f) show the results based on the horizontal, vertical, and diagonal details at wavelet level 3, respectively, with screening rules R1 and R2.
Using different wavelet features as the main feature thus yields different retrieved results. Figure 5(g) shows the results retrieved using the combination of horizontal and diagonal details as the main features. In effect, the wavelet analysis acts as a filter bank on the image, with each wavelet level corresponding to a different frequency band of the original image. Therefore, if the user is interested in a particular frequency band or direction, this correspondence can serve as a guideline for deciding which wavelet level and which directional details to use in the query.
6. Conclusions

In this paper, we propose a CBIR method that benefits from the strengths of the DWT in multiresolution analysis and spatial-frequency localization. Each image is first transformed to the YUV color space; the Y component is then further transformed to extract four types of features at each resolution level: approximations, horizontal details, vertical details, and diagonal details. Our CBIR system also provides a GUI with which users can adjust the weight of each wavelet feature according to their expectations, so that they can retrieve the desired images from the image database via the query image and the interactive GUI. For efficiency, this paper also proposes a screening scheme, which compares the wavelet subbands of the query image with those of the candidate images in the database and filters out candidates that are not similar to the query image in a certain subband.
Future Work

Since only a preliminary experiment has been made to test our approach, much work remains to improve this system. For each type of feature, we will continue investigating and improving its ability to describe the image and its performance in similarity measurement. Since several features may be used simultaneously, it is also necessary to develop a scheme that integrates the similarity scores resulting from the individual matching processes.
Thank You!

My presentation is over. Thank you very much for your attention.