Optimizing Learning with SVM Constraint for Content-Based Image Retrieval*
Steven C.H. Hoi
1 March 2004
*Note: The copyright of the presentation material is held by the authors.
Outline
- Introduction
- Related Work
- Optimizing Learning with SVM Constraint
- Experimental Results
- Discussion and Future Work
- Conclusions
Introduction
- In CBIR, there is a gap between high-level semantics and the low-level features computed by machines.
- Relevance feedback was proposed as a natural way to learn the associations between human perception and the low-level features.
- In a CBIR system, users are asked to provide relevance judgments on the query results. Based on the user's feedback, the system refines its retrieval results round by round.
- Difficulties in relevance feedback: high-dimensional feature spaces and small training samples.
Related Work
Major techniques for relevance feedback:
- Query-point movement (Rocchio's formula): move the ideal query point toward the positive examples and away from the negative examples.
- Re-weighting (MARS): axis re-weighting, where the inverse of the standard deviation of a feature, say the j-th feature, is used as the weight for the corresponding axis (see the sketch below).
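As an illustration of these two classical techniques, here is a minimal NumPy sketch; the function names and the mixing coefficients (alpha, beta, gamma) are our own illustrative choices, not values from the original papers.

```python
import numpy as np

def rocchio_update(q, positives, negatives, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio-style query-point movement: pull the query toward the mean
    of the positive examples and push it away from the negatives."""
    q_new = alpha * q
    if len(positives) > 0:
        q_new += beta * np.mean(positives, axis=0)
    if len(negatives) > 0:
        q_new -= gamma * np.mean(negatives, axis=0)
    return q_new

def mars_axis_weights(positives, eps=1e-8):
    """MARS-style re-weighting: the inverse standard deviation of each
    feature over the positive examples becomes the weight for that axis."""
    std = np.std(positives, axis=0)
    return 1.0 / (std + eps)  # eps guards against zero-variance features
```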
- Parameter-optimization methods (MindReader [2], Rui'00 [3]): minimize the total distance of the positive examples to the ideal query point, based on the generalized ellipsoid distance (sketched below).
- Kernel-based classification techniques: SVMs, Boosting, etc.
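The generalized ellipsoid distance these methods build on is d(x, q) = (x - q)^T W (x - q) for a real symmetric positive-definite W; a minimal sketch:

```python
import numpy as np

def ellipsoid_distance(x, q, W):
    """Generalized ellipsoid distance (x - q)^T W (x - q), where W is a
    real symmetric positive-definite matrix learned from feedback."""
    d = x - q
    return float(d @ W @ d)
```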
SVM
Advantages:
- Sound theoretical background
- Minimizes structural risk rather than empirical risk
- Excellent classification performance
Basic theory: learning the boundary with an SVM (figure).
Considering the soft margin, the SVM takes the form

    \min_{w, b, \xi} \; \frac{1}{2}\|w\|^2 + C \sum_{n} \xi_n
    \quad \text{s.t.} \quad y_n (w^T \phi(x_n) + b) \ge 1 - \xi_n, \;\; \xi_n \ge 0.

The optimization problem can be solved by introducing Lagrange multipliers. The derived decision function is

    f(x) = \mathrm{sgn}\Big( \sum_{n} \alpha_n y_n K(x_n, x) + b \Big),

and the distance function typically used by SVMs to measure similarity for image retrieval is the signed distance from the boundary,

    d(x) = \sum_{n} \alpha_n y_n K(x_n, x) + b.
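A minimal scikit-learn sketch of this distance-from-boundary ranking; this is our own illustration (the original work predates scikit-learn), and the C and gamma values are placeholders.

```python
import numpy as np
from sklearn.svm import SVC

def svm_distance_ranking(X_pos, X_neg, X_db, C=1.0, gamma=0.1):
    """Train an RBF SVM on the feedback samples and rank database images
    by their signed distance from the boundary (larger = more relevant)."""
    X = np.vstack([X_pos, X_neg])
    y = np.hstack([np.ones(len(X_pos)), -np.ones(len(X_neg))])
    svm = SVC(kernel="rbf", C=C, gamma=gamma).fit(X, y)
    scores = svm.decision_function(X_db)  # d(x) = sum_n alpha_n y_n K(x_n, x) + b
    return np.argsort(-scores), scores    # indices sorted by descending relevance
```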
Limitations of SVMs
- Inaccurate boundary for few samples: in the initial rounds of relevance feedback, the number of training samples is small, so SVMs cannot accurately capture the data distribution.
- Ranking problem: SVMs were originally designed for classification; simply taking the distance from the SVM boundary as the distance function may not be effective enough to describe the data.
Solutions
A heuristic approach
- Guo et al. [4] suggested a simple approach to embed the Euclidean distance in SVM learning: samples inside the SVM boundary are measured by the Euclidean distance, while samples outside the boundary are evaluated by their distance from the SVM boundary (see the sketch below).
- However, this heuristic, based on the plain Euclidean distance, is not flexible or powerful enough to describe the data distribution. Moreover, it lacks a systematic mathematical formulation.
An optimal learning approach
- To best learn the similarity, we suggest a novel systematic scheme: optimal learning with an SVM constraint. Our approach optimally learns the similarity for image retrieval with the generalized ellipsoid distance.
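A minimal sketch of that heuristic, reusing a trained svm as above; the offset used to keep outside samples ranked behind all inside samples is our own reading of the scheme, not a formula from the paper.

```python
import numpy as np

def heuristic_distance(svm, q, X_db):
    """Guo-style hybrid ranking: Euclidean distance to the query for
    samples inside the SVM boundary (decision value >= 0), and
    distance-from-boundary plus an offset for samples outside, so they
    always rank behind the inside samples."""
    f = svm.decision_function(X_db)
    euclid = np.linalg.norm(X_db - q, axis=1)
    inside = f >= 0
    offset = euclid[inside].max() if inside.any() else 0.0
    return np.where(inside, euclid, offset + np.abs(f))
```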
Optimizing Learning with SVM Constraint
Basic idea (see the skeleton below):
- The training samples are first learned by an SVM to form a boundary separating the positive samples from the negative ones.
- We then learn the optimal similarity metric from the positive samples with the generalized ellipsoid distance, constrained by the SVM boundary.
Comparison of the proposed method with previous approaches (figure).
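A self-contained skeleton of the two-stage scheme, shown for a single feature vector for brevity (the talk learns one metric W_i per feature); the exponential mapping from decision values to goodness weights and the small ridge term are illustrative assumptions, and the closed-form solutions used here are detailed on the following slides.

```python
import numpy as np
from sklearn.svm import SVC

def two_stage_metric(X_pos, X_neg):
    """Stage 1: learn an SVM boundary from the feedback samples.
    Stage 2: learn an ellipsoid metric (query point q, matrix W) from the
    positive samples, each weighted by its SVM decision value."""
    X = np.vstack([X_pos, X_neg])
    y = np.hstack([np.ones(len(X_pos)), -np.ones(len(X_neg))])
    svm = SVC(kernel="rbf").fit(X, y)                # stage 1: boundary
    f = svm.decision_function(X_pos)
    v = np.exp(f - f.max())                          # goodness weights (illustrative mapping)
    v /= v.sum()
    q = v @ X_pos                                    # goodness-weighted ideal query point
    D = X_pos - q
    C = (v[:, None] * D).T @ D + 1e-6 * np.eye(X_pos.shape[1])  # weighted covariance + ridge
    W = np.linalg.det(C) ** (1.0 / C.shape[0]) * np.linalg.inv(C)  # det(W) = 1
    return svm, q, W
```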
Problem formulation and notations
- N denotes the number of relevant samples.
- M denotes the number of features.
- Given an image x_n, we use x_{ni} = [x_{ni1}, ..., x_{niL_i}] to represent its i-th feature vector, where L_i is the length of the i-th feature vector.
- Let q_i = [q_{i1}, ..., q_{iL_i}] denote the ideal query vector for the i-th feature.
- The generalized ellipsoid distance is described by a real symmetric full matrix W_i.
Optimization target

    \min_{\{q_i, W_i, u\}} \; J = \sum_{i=1}^{M} u_i \sum_{n=1}^{N} v_n (x_{ni} - q_i)^T W_i (x_{ni} - q_i)
    \quad \text{s.t.} \quad \det(W_i) = 1, \;\; \sum_{i=1}^{M} \frac{1}{u_i} = 1,

where u holds the feature weights and v holds the goodness values of the positive samples, weighting the importance of each sample.
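A direct NumPy evaluation of this objective, under the reconstruction above:

```python
import numpy as np

def objective_J(features, queries, Ws, u, v):
    """J = sum_i u_i * sum_n v_n * (x_ni - q_i)^T W_i (x_ni - q_i).
    features[i] is an (N, L_i) matrix holding the i-th feature vector of
    all N positive samples; queries, Ws, u are per-feature parameters."""
    J = 0.0
    for X_i, q_i, W_i, u_i in zip(features, queries, Ws, u):
        D = X_i - q_i                             # (N, L_i) residuals
        g = np.einsum("nl,lk,nk->n", D, W_i, D)   # per-sample ellipsoid distances
        J += u_i * np.dot(v, g)
    return J
```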
To fuse the optimal learning model with SVM learning, we set the goodness value v(x) according to the SVM distance in each iteration. By introducing Lagrange multipliers, we can solve the above optimization problem; only the major conclusions are presented here. The optimal solution for the ideal query point is the goodness-weighted mean of the positive samples,

    q_i^* = \frac{\sum_{n=1}^{N} v_n x_{ni}}{\sum_{n=1}^{N} v_n}.
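A sketch of this step. The talk only states that v(x) is tied to the SVM distance; the exponential mapping below is an illustrative assumption, while the weighted mean implements the closed-form q_i^* above.

```python
import numpy as np

def goodness_from_svm(svm, X_pos):
    """Goodness values v(x) tied to the SVM distance: samples deeper inside
    the positive side of the boundary get larger weight (illustrative
    exponential mapping, normalized to sum to one)."""
    f = svm.decision_function(X_pos)
    v = np.exp(f - f.max())   # positive, monotone in f, numerically stable
    return v / v.sum()

def optimal_query_point(X_i, v):
    """q_i* = (sum_n v_n x_ni) / (sum_n v_n): the goodness-weighted mean."""
    return (v[:, None] * X_i).sum(axis=0) / v.sum()
```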
The optimal solution for the distance matrix is

    W_i^* = \big(\det C_i\big)^{1/L_i} \, C_i^{-1},

where C_i = \frac{\sum_n v_n (x_{ni} - q_i)(x_{ni} - q_i)^T}{\sum_n v_n} is the goodness-weighted covariance matrix of the i-th feature. The optimal solution for the feature-weight vector u is

    u_i^* = \frac{\sum_{j=1}^{M} \sqrt{f_j}}{\sqrt{f_i}}, \quad \text{where} \;\; f_i = \sum_{n=1}^{N} v_n (x_{ni} - q_i)^T W_i (x_{ni} - q_i).
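NumPy sketches of these closed-form solutions; the small ridge term guarding against a singular covariance (likely when samples are few) is our own numerical safeguard.

```python
import numpy as np

def optimal_W(X_i, q_i, v, ridge=1e-6):
    """W_i* = det(C_i)^(1/L_i) * C_i^{-1}, with C_i the goodness-weighted
    covariance of the i-th feature; det(W_i*) = 1 by construction."""
    D = X_i - q_i
    C = (v[:, None] * D).T @ D / v.sum()
    C += ridge * np.eye(C.shape[0])  # numerical safeguard (our addition)
    L_i = C.shape[0]
    return np.linalg.det(C) ** (1.0 / L_i) * np.linalg.inv(C)

def optimal_u(f):
    """u_i* = (sum_j sqrt(f_j)) / sqrt(f_i), which satisfies the
    constraint sum_i 1/u_i = 1."""
    s = np.sqrt(np.asarray(f, dtype=float))
    return s.sum() / s
```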
Distance measure metric. Samples inside the SVM boundary are ranked by the learned generalized ellipsoid distance; samples outside the boundary are ranked by MaxDis plus their distance from the boundary, so that they always fall behind the inside samples, where MaxDis is the maximum generalized ellipsoid distance attained by the samples inside the boundary.
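Putting the pieces together, a sketch of the combined ranking metric under the reconstruction above:

```python
import numpy as np

def combined_distance(svm, X_db, features_db, queries, Ws, u):
    """Rank samples inside the SVM boundary by the learned generalized
    ellipsoid distance, and samples outside by MaxDis + |SVM distance|,
    so outside samples always fall behind inside ones."""
    f = svm.decision_function(X_db)
    ell = np.zeros(len(X_db))  # overall distance: sum_i u_i * per-feature ellipsoid distance
    for X_i, q_i, W_i, u_i in zip(features_db, queries, Ws, u):
        D = X_i - q_i
        ell += u_i * np.einsum("nl,lk,nk->n", D, W_i, D)
    inside = f >= 0
    max_dis = ell[inside].max() if inside.any() else 0.0
    return np.where(inside, ell, max_dis + np.abs(f))
```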
Experimental Results
Datasets
- Natural images selected from COREL CDs
- Two datasets: 20-Category and 50-Category
- Each category contains 100 images.
Feature representation
- 9-dimensional color moments
- 18-dimensional edge-direction histogram
- 9-dimensional wavelet-based texture
Experimental settings
- Radial basis function kernel for the SVMs
- To enable objective evaluation, we fix the learning parameters for all compared algorithms (the evaluation protocol is sketched below).
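For concreteness, a sketch of the standard protocol behind the precision figures on the next slides; the query simulation details are our assumption.

```python
import numpy as np

def average_top_k_precision(rankings, labels, query_labels, k=20):
    """Average precision on the top-k returned images: the fraction of the
    top-k results sharing the query's category, averaged over all queries.
    rankings[q] holds database indices sorted by ascending distance."""
    precisions = []
    for ranking, q_label in zip(rankings, query_labels):
        top_k = ranking[:k]
        precisions.append(np.mean(labels[top_k] == q_label))
    return float(np.mean(precisions))
```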
Evaluation on the 20-Category dataset: average precision on Top-20 (figure).
Evaluation on the 50-Category dataset (figure).
Computational Complexity and Empirical Time Cost
Discussion and Future Work
Feature set selection
- Our SVM learning scheme does not consider feature set selection during boundary learning.
- Retrieval performance could be further improved by incorporating feature set selection into the SVM learning task.
- The efficiency problem may also be addressed in future work.
Conclusions
In this talk, we presented a novel scheme for the relevance feedback task in image retrieval: optimal learning with an SVM constraint. Our scheme not only exploits the strength of SVMs in learning the boundary in a high-dimensional feature space, but also exploits the hidden similarity structure within the SVM boundary. Preliminary experimental results show that our systematically formulated approach outperforms previous methods.
References
[1] Chu-Hong Hoi and Michael R. Lyu. Optimizing Learning with SVM Constraint for Content-Based Image Retrieval. Technical Report, Department of Computer Science and Engineering, The Chinese University of Hong Kong, March 2004.
[2] Yoshiharu Ishikawa, Ravishankar Subramanya, and Christos Faloutsos. MindReader: Querying Databases through Multiple Examples. Proc. 24th Int. Conf. on Very Large Data Bases (VLDB), 1998.
[3] Y. Rui and T.S. Huang. Optimizing Learning in Image Retrieval. IEEE Conf. on CVPR, June 2000.
[4] G.D. Guo, A.K. Jain, W.Y. Ma, and H.J. Zhang. Learning Similarity Measure for Natural Image Retrieval with Relevance Feedback. IEEE Trans. on Neural Networks, vol. 13, no. 4, pp. 811-820, July 2002.