GA-Based Feature Selection and Parameter Optimization for Support Vector Machine. Cheng-Lung Huang, Chieh-Jen Wang. Expert Systems with Applications, Volume 31, Issue 2, 2006.

Presentation transcript:

GA-Based Feature Selection and Parameter Optimization for Support Vector Machine. Cheng-Lung Huang, Chieh-Jen Wang. Expert Systems with Applications, Volume 31, Issue 2, 2006.

Outline
INTRODUCTION
BACKGROUND KNOWLEDGE
GA-BASED FEATURE SELECTION AND PARAMETER OPTIMIZATION
NUMERICAL ILLUSTRATIONS
CONCLUSION

Introduction
Support vector machines (SVM) were first suggested by Vapnik (1995) for classification. SVM classifies data with different class labels by determining a set of support vectors that outline a hyperplane in the feature space. A kernel function transforms the training data vectors into the feature space. SVM has been used in a range of problems including pattern recognition (Pontil and Verri, 1998), bioinformatics (Brown et al., 1999), and text categorization (Joachims, 1998).

Problems
When using SVM, we confront two problems:
How to set the best parameters for SVM?
How to choose the input features for SVM?

Feature Selection
Feature selection is used to identify a strongly predictive subset of the fields in a database and to reduce the number of fields presented to the mining process. It affects several aspects of pattern classification:
1. The accuracy of the learned classifier
2. The time needed to learn a classification function
3. The number of examples needed for learning
4. The cost associated with the features

SVM Parameters Setting
Proper parameter settings can improve the classification accuracy of SVM. The parameters that should be optimized include the penalty parameter C and the parameters of the chosen kernel function (e.g., γ for the RBF kernel). A grid search is one way to find the best C and γ, but it is time-consuming and does not always perform well.
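As a baseline for comparison, the grid search over C and γ can be sketched with scikit-learn. This is an illustrative sketch, not the paper's exact setup: the dataset (Iris) and the grid ranges are assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Exhaustive grid over the penalty parameter C and the RBF width gamma.
param_grid = {
    "C": [2 ** k for k in range(-5, 16, 2)],
    "gamma": [2 ** k for k in range(-15, 4, 2)],
}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

Every (C, γ) pair is trained and cross-validated, which is why the cost grows quadratically with the grid resolution.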

Research Purposes
The objective of this research is to optimize the SVM parameters and the feature subset simultaneously, without degrading the classification accuracy of SVM. Genetic algorithms (GA) have the potential to generate both the feature subset and the SVM parameters at the same time.

An Overview of This Paper

Support Vector Machine (SVM)
The support vector machine (SVM) is a technique for data classification first suggested by Vapnik (1995). SVM uses a separating hyperplane to distinguish data belonging to two or more different classes, addressing the classification problem in data mining.

Separating Hyperplane

Slack Variable

Penalty Parameter
A slack variable ξ_i accounts for the cost of overlapping (misclassification) errors. Consequently, the objective function must be revised with the penalty parameter C:

\min_{w, b, \xi} \; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{subject to} \quad y_i (w^{T} x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0

Non-Linear Classifier

Kernel Function
Polynomial: K(x_i, x_j) = (\gamma \, x_i^{T} x_j + r)^{d}
RBF: K(x_i, x_j) = \exp(-\gamma \, \|x_i - x_j\|^2)
Sigmoid: K(x_i, x_j) = \tanh(\gamma \, x_i^{T} x_j + r)
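The three standard kernels can be written directly in NumPy; the default values for γ, r, and d below are illustrative, not the paper's settings.

```python
import numpy as np

def polynomial_kernel(x, y, gamma=1.0, r=0.0, d=3):
    """K(x, y) = (gamma * x.y + r)^d"""
    return (gamma * np.dot(x, y) + r) ** d

def rbf_kernel(x, y, gamma=1.0):
    """K(x, y) = exp(-gamma * ||x - y||^2)"""
    return np.exp(-gamma * np.sum((x - y) ** 2))

def sigmoid_kernel(x, y, gamma=1.0, r=0.0):
    """K(x, y) = tanh(gamma * x.y + r)"""
    return np.tanh(gamma * np.dot(x, y) + r)

x = np.array([1.0, 0.0])
y = np.array([1.0, 0.0])
print(rbf_kernel(x, y))  # identical points -> 1.0
```

The RBF kernel is the one the paper optimizes, since it has a single kernel parameter γ alongside the penalty C.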

Genetic Algorithm
Genetic algorithms (GA) are a general adaptive optimization search methodology based on a direct analogy to Darwinian natural selection.

Wrapper Model of Feature Selection

Chromosome Design
The chromosome comprises three parts: the bits representing the value of parameter C, the bits representing the value of parameter γ, and the bits representing the selected features.

Genotype to Phenotype
The bit strings for parameters C and γ are genotypes that must be transformed into phenotypes:

value = P_min + (P_max − P_min) × D / (2^L − 1)

where value is the phenotype value, P_min and P_max are the user-defined minimum and maximum of the parameter, D is the decimal value of the bit string, and L is the length of the bit string.
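The genotype-to-phenotype decoding described above can be sketched as a small function (the function name is illustrative):

```python
def decode(bits, p_min, p_max):
    """Map a bit string (genotype) to a real parameter value (phenotype).

    value = p_min + (p_max - p_min) * D / (2**L - 1),
    where D is the decimal value of the bit string and L is its length.
    """
    d = int(bits, 2)   # decimal value of the bit string
    l = len(bits)      # length of the bit string
    return p_min + (p_max - p_min) * d / (2 ** l - 1)

print(decode("0000", 0.0, 10.0))  # -> 0.0 (all-zero string maps to P_min)
print(decode("1111", 0.0, 10.0))  # -> 10.0 (all-one string maps to P_max)
```

The mapping is linear, so the resolution of the search over each parameter is (P_max − P_min) / (2^L − 1); longer bit strings give a finer search.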

Fitness Function Design
fitness = W_A × SVM_accuracy + W_F × (Σ_i C_i × F_i)^(−1)

W_A: weight of the SVM classification accuracy
SVM_accuracy: SVM classification accuracy
W_F: weight of the features
C_i: cost of feature i
F_i: 1 if feature i is selected, 0 otherwise
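A minimal sketch of a fitness of this shape, combining accuracy with an inverse feature-cost term. The default weights and the guard for an empty subset are assumptions for illustration, not the paper's constants:

```python
def fitness(svm_accuracy, selected, costs, w_a=0.8, w_f=0.2):
    """Reward high classification accuracy and a cheap (small) feature subset.

    selected[i] is 1 if feature i is chosen, 0 otherwise; costs[i] is the
    cost of feature i.  The inverse of the total selected-feature cost means
    fewer/cheaper features raise the fitness.
    """
    total_cost = sum(c * f for c, f in zip(costs, selected))
    # Guard against an empty subset (assumption: treat its penalty term as 1).
    return w_a * svm_accuracy + w_f * (1.0 / total_cost if total_cost else 1.0)

# Same accuracy, fewer selected features -> higher fitness.
print(fitness(0.9, [1, 0, 1], [1.0, 1.0, 1.0]))  # -> 0.82
print(fitness(0.9, [1, 0, 0], [1.0, 1.0, 1.0]))  # -> 0.92
```

In the wrapper setting, `svm_accuracy` would come from training and cross-validating an SVM on the subset encoded by the chromosome.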

System Flows for GA-based SVM
(1) Data preprocessing: scaling
(2) Converting genotype to phenotype
(3) Selecting the feature subset
(4) Fitness evaluation
(5) Checking the termination criteria
(6) Genetic operations

Figure of System Flows

Experimental Datasets
Eleven real-world datasets from the UCI repository were used: German (credit card), Australian (credit card), Pima-Indian diabetes, Heart disease (Statlog project), Breast cancer (Wisconsin), Contraceptive Method Choice, Ionosphere, Iris, Sonar, Statlog project: vehicle, and Vowel. For each dataset, the table lists the number of classes, the number of instances, and the numbers of nominal, numeric, and total features.

Experiments Description
To guarantee that the results are valid and can be generalized to new data, k-fold cross-validation was used. This study used k = 10: all of the data are divided into ten parts, each of which takes a turn as the testing data set.
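The 10-fold scheme can be sketched with scikit-learn; the dataset (Iris) and the SVM settings here are illustrative stand-ins for the paper's datasets and optimized parameters:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# cv=10: the data are split into ten parts; each part takes a turn
# as the test set while the other nine are used for training.
scores = cross_val_score(SVC(kernel="rbf", C=1.0, gamma="scale"), X, y, cv=10)
print(len(scores), round(scores.mean(), 3))
```

The reported accuracy is the mean over the ten folds, which smooths out the luck of any single train/test split.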

Accuracy Calculation
For datasets with a binary target, accuracy is reported as the positive hit rate (sensitivity), the negative hit rate (specificity), and the overall hit rate. For multi-class datasets, accuracy is reported only as the average hit rate.

Accuracy Calculation
Sensitivity is the proportion of positive cases that are classified as positive: P(T+|D+) = TP / (TP + FN).
Specificity is the proportion of negative cases that are classified as negative: P(T−|D−) = TN / (TN + FP).
The overall hit rate is the overall accuracy: (TP + TN) / (TP + FP + FN + TN).

                          Target (or Disease)
                          +                     −
Predicted (or Test)   +   True Positive (TP)    False Positive (FP)
                      −   False Negative (FN)   True Negative (TN)
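The three rates follow directly from the confusion-matrix counts (the function name and example counts below are illustrative):

```python
def rates(tp, fp, fn, tn):
    """Sensitivity, specificity, and overall hit rate from confusion counts."""
    sensitivity = tp / (tp + fn)               # positive hit rate
    specificity = tn / (tn + fp)               # negative hit rate
    overall = (tp + tn) / (tp + fp + fn + tn)  # overall accuracy
    return sensitivity, specificity, overall

# Example: 40 TP, 10 FP, 5 FN, 45 TN out of 100 cases.
print(rates(40, 10, 5, 45))  # sensitivity ≈ 0.889, specificity ≈ 0.818, overall = 0.85
```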

Accuracy Calculation
The SVM_accuracy term in the fitness function is measured by:
Sensitivity × Specificity for datasets with two classes (positive or negative)
The overall hit rate for datasets with multiple classes

GA Parameter Setting
Chromosome representation: binary code
Population size: 500
Crossover rate: 0.7, one-point crossover
Mutation rate: 0.02
Roulette wheel selection
Elitism replacement

W_A and W_F
The weights W_A and W_F influence the experimental result through the fitness function: the higher W_A, the higher the classification accuracy; the higher W_F, the smaller the number of selected features.

Curve Diagram for Fold #4 of the German Dataset

Experimental Results for German Dataset

Results Summary (GA-based approach vs. grid search)
For each dataset, the table reports the number of original features, the average number of GA-selected features, and the average positive, negative, and overall hit rates of both approaches, together with the p-value of a Wilcoxon test (an asterisk marks a statistically significant difference; positive/negative hit rates are N/A for multi-class datasets). Original vs. GA-selected features: German 24 → 13, Australian 14 → 3, diabetes 8 → 3.7, Heart disease 13 → 5.4, breast cancer 10 → 1, Contraceptive 9 → 5.4, ionosphere 34 → 6, iris 4 → 1, sonar 60 → 15, vehicle 18 → 9.2, Vowel 13 → 7.8.

ROC curve for fold #4 of German Credit Dataset

Average AUC for Datasets
Average AUC under the GA-based approach and the grid algorithm for each dataset: German, Australian, diabetes, Heart disease, breast cancer, Contraceptive, ionosphere, iris, sonar, vehicle, and Vowel.

Conclusion
We proposed a GA-based strategy to select the feature subset and to set the parameters for SVM classification. We conducted experiments to evaluate the classification accuracy of the proposed GA-based approach with the RBF kernel against the grid search method on 11 real-world datasets from the UCI repository. Overall, compared with grid search, the proposed GA-based approach achieves good accuracy with fewer features.

Thank You Q & A