An Automatic Method for Selecting the Parameter of the RBF Kernel Function to Support Vector Machines
Cheng-Hsuan Li, Chin-Teng Lin, Bor-Chen Kuo, Hui-Shan Chu

Presentation transcript:

An Automatic Method for Selecting the Parameter of the RBF Kernel Function to Support Vector Machines
Cheng-Hsuan Li 1,2, Chin-Teng Lin 1, Bor-Chen Kuo 2, Hui-Shan Chu 2
1 Institute of Electrical Control Engineering, National Chiao Tung University, Hsinchu, Taiwan, R.O.C.
2 Graduate Institute of Educational Measurement and Statistics, National Taichung University, Taichung, Taiwan, R.O.C.

Motivation
Many papers in TGRS use k-fold cross-validation to find a proper parameter for the RBF kernel function, for example:
B. Waske, S. van der Linden, J. A. Benediktsson, A. Rabe, and P. Hostert, "Sensitivity of support vector machines to random feature selection in classification of hyperspectral data," IEEE Trans. Geosci. Remote Sens., vol. 48, no. 7, July 2010.
J. Chen, C. Wang, and R. Wang, "Using stacked generalization to combine SVMs in magnitude and shape feature spaces for classification of hyperspectral data," IEEE Trans. Geosci. Remote Sens., vol. 47, no. 7, July 2009.
L. Bruzzone and M. Marconcini, "Toward the automatic updating of land-cover maps by a domain-adaptation SVM classifier and a circular validation strategy," IEEE Trans. Geosci. Remote Sens., vol. 47, no. 4, April 2009.
T. V. Bandos, L. Bruzzone, and G. Camps-Valls, "Classification of hyperspectral images with regularized linear discriminant analysis," IEEE Trans. Geosci. Remote Sens., vol. 47, no. 3, March 2009.
G. Camps-Valls, L. Gomez-Chova, J. Munoz-Mari, J. L. Rojo-Alvarez, and M. Martinez-Ramon, "Kernel-based framework for multitemporal and multisource remote sensing data classification and change detection," IEEE Trans. Geosci. Remote Sens., vol. 46, no. 6, June 2008.
among others. Nevertheless, cross-validation is time-consuming.
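For reference, here is a minimal sketch of that cross-validation baseline using scikit-learn. This is an assumed implementation for illustration only (the cited papers do not prescribe one), and note that scikit-learn parameterizes the RBF kernel as K(x, y) = exp(-gamma * ||x - y||^2) rather than with a sigma. Every candidate value costs k SVM trainings, which is exactly the expense the proposed method avoids.

```python
# Hypothetical k-fold CV baseline for choosing the RBF parameter.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # stand-in data, not the hyperspectral sets

# 13 candidate gamma values; each one is evaluated with 5 SVM trainings.
param_grid = {"gamma": np.logspace(-4, 2, 13)}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)
print("best gamma:", search.best_params_["gamma"])
```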

Kernel Method
Samples in the same class should be mapped into the same area, and samples in different classes should be mapped into different areas.

The Properties of the RBF Kernel
In the feature space induced by the RBF kernel, every sample has norm one, and all pairwise inner products are positive. Hence, the samples are mapped onto the surface of a hypersphere.
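This follows directly from K(x, x) = 1 and K(x, y) > 0. A small numerical check, assuming the standard parameterization K(x, y) = exp(-||x - y||^2 / (2σ^2)) with an arbitrary σ:

```python
# Numerical check: under the RBF kernel, every mapped sample has unit norm
# (K(x, x) = 1) and all pairwise kernel values lie in (0, 1].
import numpy as np

def rbf(x, y, sigma=1.0):
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))  # five random 3-dimensional samples

for x in X:
    assert np.isclose(rbf(x, x), 1.0)  # norm of phi(x) is sqrt(K(x, x)) = 1
print([round(rbf(X[0], xj), 4) for xj in X])  # pairwise values, all in (0, 1]
```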

The Properties of the RBF Kernel
Since every mapped sample has norm one, the values of the RBF kernel function are exactly the cosines of the angles between mapped samples; hence they indicate the similarities between samples.

The Properties of the RBF Kernel
If the cosine value of two samples is close to 1, they are similar in the feature space; if it is close to 0, they are dissimilar.

The Ideal Distribution
For example, suppose there are three classes, and x_i^j denotes the i-th sample in class j, j = 1, 2, 3. We want to find a feature space in which same-class samples cluster together and different classes lie far apart, as in the following figures.

The Ideal Distribution
The values of the RBF kernel function should be close to 1 if the samples are in the same class, that is, K(x_i, x_j) = exp(-||x_i - x_j||^2 / (2σ^2)) ≈ 1 whenever x_i and x_j share a label. The values should be close to 0 if the samples are in different classes, that is, K(x_i, x_j) ≈ 0 whenever the labels differ.

An Example of Varying the Parameter
[A sequence of figures, omitted in this transcript, shows vectors in the original space and the corresponding vectors in the feature space as the parameter grows from small to large: with a small σ all kernel values approach 0 and the mapped samples spread apart, while with a large σ all kernel values approach 1 and the mapped samples collapse together.]
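A sketch of the behavior those figures illustrated, on a made-up pair of points (not the authors' example):

```python
# How the kernel value between one fixed pair of points moves from 0 to 1
# as sigma grows, using K(x, y) = exp(-||x - y||^2 / (2 * sigma**2)).
import numpy as np

x = np.array([1.0, 0.0])
y = np.array([3.0, 4.0])  # squared distance ||x - y||^2 = 20

for sigma in [0.5, 1.0, 2.0, 5.0, 20.0]:
    k = np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))
    print(f"sigma = {sigma:5.1f}  ->  K(x, y) = {k:.4f}")
# Small sigma drives K toward 0 (everything looks dissimilar);
# large sigma drives K toward 1 (everything looks alike).
```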

Proposed Criteria
The mean of the RBF kernel values over pairs of samples in the same class should be close to 1, i.e.,
w(σ) = mean{ K(x_i, x_j) : x_i and x_j in the same class, i ≠ j } ≈ 1.

Proposed Criteria
The mean of the RBF kernel values over pairs of samples in different classes should be close to 0, i.e.,
b(σ) = mean{ K(x_i, x_j) : x_i and x_j in different classes } ≈ 0.

Proposed Criteria
We want to find a parameter σ such that w(σ) ≈ 1 and b(σ) ≈ 0. This means that 1 - w(σ) ≈ 0 and b(σ) ≈ 0. The proposed criterion is defined as
J(σ) = (1 - w(σ)) + b(σ),
which is minimized over σ.
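A minimal sketch of this criterion as reconstructed above; the exact pair weighting in the paper may differ, and the two-blob data here is only a stand-in:

```python
# J(sigma) = (1 - w(sigma)) + b(sigma): w averages RBF kernel values over
# same-class pairs, b over different-class pairs.
import numpy as np

def criterion(X, y, sigma):
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)  # pairwise ||.||^2
    K = np.exp(-d2 / (2.0 * sigma ** 2))
    same = (y[:, None] == y[None, :]) & ~np.eye(len(y), dtype=bool)
    w = K[same].mean()                      # should approach 1
    b = K[y[:, None] != y[None, :]].mean()  # should approach 0
    return (1.0 - w) + b

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(4, 1, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
for s in [0.1, 1.0, 3.0, 10.0]:
    print(f"sigma = {s:5.1f}  J = {criterion(X, y, s):.4f}")
```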

J(σ) versus σ (RBF Kernel)
[Figure omitted: the curve of J(σ) against σ, obtained from the Indian Pine Site dataset.]

σ versus Accuracies (RBF Kernel)
[Figure omitted: overall accuracy and kappa accuracy plotted against σ.]

Observations
J(σ) has a global minimum. From the figure "J(σ) versus σ," one can see that the values near the minimum occur for σ in the range (3500, 4000). From the figure "σ versus accuracies," one can see that the values near the highest overall accuracy or kappa accuracy also occur for σ in the range (3500, 4500). The σ obtained by our proposed method is thus close to the σ that achieves the highest accuracy. Therefore, the minimizer of J(σ) is a good estimator of the best σ.

How to Find the Best σ
w(σ) and b(σ) are differentiable with respect to σ, so J(σ) is also differentiable with respect to σ. The gradient descent method is used to find the minimizer.
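A sketch of that search under the same assumptions as the previous snippet; a central-difference gradient stands in for the analytic derivative here, purely to keep the illustration short:

```python
# Plain gradient descent on J(sigma); criterion() matches the earlier sketch.
import numpy as np

def criterion(X, y, sigma):
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-d2 / (2.0 * sigma ** 2))
    same = (y[:, None] == y[None, :]) & ~np.eye(len(y), dtype=bool)
    return (1.0 - K[same].mean()) + K[y[:, None] != y[None, :]].mean()

def find_sigma(X, y, sigma0=1.0, lr=0.5, steps=200, eps=1e-4):
    sigma = sigma0
    for _ in range(steps):
        # Numerical dJ/dsigma by central differences.
        g = (criterion(X, y, sigma + eps) - criterion(X, y, sigma - eps)) / (2 * eps)
        sigma = max(sigma - lr * g, 1e-6)  # gradient step; keep sigma positive
    return sigma

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(4, 1, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
print("estimated sigma:", round(find_sigma(X, y), 3))
```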

Experiment Data Sets

Data set         Indian Pine Site             Washington, DC Mall
No. of bands     220                          191
No. of classes   9                            7
Categories       Soybeans-min,                Roof, Road, Path, Grass,
                 Soybeans-notill,             Tree, Water, Shadow
                 Soybeans-clean, Corn-min,
                 Grass/Pasture, Grass/Tree,
                 Corn-notill, Hay-windrowed,
                 Woods

Experiment Results
Table 1. Overall and Kappa Accuracies on the Indian Pine Site Dataset
Columns: method, CPU time (sec), overall accuracy, overall kappa accuracy; CV and OP are compared at several training-set sizes (n_i = 20, ...). The numeric entries were not preserved in this transcript.
n_i: the number of samples in class i. OP: our proposed method. CV: 5-fold cross-validation over a candidate parameter set.

Experiment Results
Table 2. Overall and Kappa Accuracies on the Washington, DC Mall Dataset
Columns: method, CPU time (sec), accuracy, kappa accuracy; CV and OP are compared at several training-set sizes (n_i = 20, ...). The numeric entries were not preserved in this transcript.
n_i: the number of samples in class i. OP: our proposed method. CV: 5-fold cross-validation over a candidate parameter set.

Conclusion
In this paper, an automatic method for selecting the parameter of the RBF kernel was proposed, and it was compared experimentally with k-fold cross-validation. The results on two hyperspectral images show that the computational cost of the proposed method is roughly nine times lower. In future work, we will apply the proposed method to other kernel-based algorithms and develop similar procedures for other kernel functions.

Thanks for your attention
Cheng-Hsuan Li, Chin-Teng Lin, Bor-Chen Kuo, Hui-Shan Chu

Proposed Criteria (Appendix)
[Derivation slide; the equations following "Note that" and "Hence" were not preserved in this transcript.]

Experimental Results
Table 3. Overall and Kappa Accuracies on UCI Datasets
Columns: dataset, method, CPU time (sec), overall accuracy, overall kappa accuracy. For each of Ionosphere, Monks, Pima, Liver, WDBC, and Iris, CV and OP are compared. The numeric entries were not preserved in this transcript.
