Improving Support Vector Machine through Parameter Optimized Rujiang Bai, Junhua Liao Shandong University of Technology Library Zibo 255049, China { brj,

Presentation transcript:


Outline
Motivation and Background
RGSC (Rough sets and Genetic algorithms for SVM Classifier)
Experiment
Conclusion

Motivation and Background
Three problems:
1. The high dimensionality of the input feature vectors slows down classification.
2. The kernel parameter settings used when training the SVM affect classification accuracy.
3. Feature selection also affects classification accuracy.

Motivation and Background
Goal: The objective of this work is to reduce the dimension of the feature vectors and to optimize the parameters, so as to improve SVM classification accuracy and speed.
Method: To improve classification speed, we use rough set theory to reduce the feature vector space. To improve classification accuracy, we present a genetic algorithm approach for feature selection and parameter optimization.

RGSC
(1) Preprocessing. Preprocessing includes removing HTML tags, segmenting words, and constructing the Vector Space Model.
(2) Feature reduction by rough sets. Our objective is to find a reduction with a minimal number of attributes, as described in Alg. 1.
(3) Converting genotype to phenotype. This step converts each parameter and feature chromosome from its genotype into a phenotype.
(4) Feature subset. After the genetic operations and the conversion of each feature subset chromosome from genotype to phenotype, a feature subset can be determined.

RGSC
(5) Fitness evaluation. For each chromosome representing C, γ, and the selected features, the training dataset is used to train the SVM classifier, while the testing dataset is used to calculate classification accuracy. Once the classification accuracy is obtained, each chromosome is evaluated by the fitness function, formula (8).

RGSC
(6) Termination criteria. When the termination criteria are satisfied, the process ends; otherwise, we proceed with the next generation.
(7) Genetic operation. In this step, the system searches for better solutions by genetic operations, including selection, crossover, mutation, and replacement.
(8) Input the preprocessed data sets into the obtained optimized SVM classifier.
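The loop formed by steps (5) to (7) can be sketched as a small generational genetic algorithm. This is only an illustration under stated assumptions: the slides do not give the selection scheme, rates, population size, or stopping rule, so tournament selection, one-point crossover, bit-flip mutation, elitist replacement, and a fixed generation budget are all choices made here, and `run_ga` is a hypothetical name. The `fitness` argument stands in for the SVM-based evaluation of step (5).

```python
import random

def run_ga(fitness, n_bits, pop_size=20, generations=30,
           crossover_rate=0.8, mutation_rate=0.02, seed=0):
    """Minimal generational GA over bit strings (illustrative sketch only).

    All rates, sizes, and operators are assumptions, not values from the slides.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)

    def tournament(population):
        # Selection: the fitter of two randomly drawn chromosomes wins.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):  # termination: fixed generation budget
        children = [best[:]]      # replacement with elitism: keep the best so far
        while len(children) < pop_size:
            p1, p2 = tournament(pop), tournament(pop)
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n_bits - 1)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Mutation: flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best
```

For a quick check, `run_ga(sum, 16)` maximizes the number of 1 bits in a 16-bit string; in RGSC the fitness would instead train and test an SVM for each chromosome, which is far more expensive per evaluation.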

RGSC
Algorithm for RST-based feature reduction

Algorithm: Rough Sets Attribute Reduction algorithm
Input: a decision table T.
Output: a reduction of T, denoted as Redu.
1. Construct the binary discernibility matrix M of T.
2. Delete the rows of M that are all 0's; set Redu = ∅. /* delete pairs of inconsistent objects */
3. While M is not empty:
4. (1) Select the attribute ci in M with the highest discernibility degree (if several attributes (j = 1, 2, …, m) share the highest discernibility degree, choose one of them at random).
5. (2) Redu ← Redu ∪ {ci}.
6. (3) Remove from M the rows that have "1" in column ci.
7. (4) Remove column ci from M.
End while. /* the following steps remove redundant attributes from Redu */
8. Suppose that Redu contains k attributes, sorted in the order in which they entered Redu, from the first attribute chosen to the last one chosen.
9. Get the binary discernibility matrix MR of the decision table TR.
10. Delete the rows of MR that are all 0's.
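Steps 1 to 7 of the algorithm above can be sketched as follows. This is a hedged illustration, not the authors' implementation: `greedy_reduct` is a hypothetical name, the "discernibility degree" of an attribute is assumed to be the number of 1's in its column, and ties are broken by lowest column index rather than at random so that the result is reproducible. The redundancy-removal phase (steps 8 onward) is incomplete in the transcript and is not implemented here.

```python
from itertools import combinations

def greedy_reduct(rows, decisions):
    """Greedy attribute reduction on a binary discernibility matrix.

    rows      -- list of tuples of condition-attribute values, one per object
    decisions -- list of decision values, one per object
    Returns the chosen attribute indices in the order they were selected.
    """
    n_attrs = len(rows[0])
    # One matrix row per pair of objects with different decisions;
    # entry is 1 where the two objects differ on that attribute.
    matrix = [
        [int(rows[i][a] != rows[j][a]) for a in range(n_attrs)]
        for i, j in combinations(range(len(rows)), 2)
        if decisions[i] != decisions[j]
    ]
    # All-zero rows come from inconsistent object pairs; drop them.
    matrix = [r for r in matrix if any(r)]

    redu = []
    while matrix:
        # Assumed discernibility degree: count of 1's in each column.
        counts = [sum(r[a] for r in matrix) for a in range(n_attrs)]
        best = max(range(n_attrs), key=lambda a: counts[a])  # first max on ties
        redu.append(best)
        # Dropping rows with a 1 in the chosen column leaves that column
        # all-zero, which has the same effect as removing the column.
        matrix = [r for r in matrix if r[best] == 0]
    return redu
```

For example, with objects `[(1, 0, 0), (1, 1, 0), (0, 1, 1), (0, 0, 1)]` and decisions `[0, 1, 1, 0]`, the decision depends only on the second attribute, so the sketch selects index 1 alone.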

RGSC
Chromosome design
To implement our proposed approach, this research used the RBF kernel function for the SVM classifier. The chromosome comprises three parts: C, γ, and the feature mask.
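A three-part chromosome of this kind could be decoded from genotype to phenotype roughly as shown below. The slides do not give the bit lengths, parameter ranges, or decoding rule, so everything here is an assumption: 8 bits each for C and γ, linear scaling of the decoded integer into a range, and the ranges themselves are placeholders. `decode` is a hypothetical helper name.

```python
def decode(chromosome, n_c=8, n_gamma=8,
           c_range=(0.01, 100.0), gamma_range=(0.0001, 10.0)):
    """Split a bit-list chromosome into (C, gamma, feature_mask).

    Bit lengths and value ranges are illustrative assumptions.
    """
    def scale(bits, lo, hi):
        # Read the bits as an unsigned integer, then map it linearly to [lo, hi].
        d = int("".join(str(b) for b in bits), 2)
        return lo + (hi - lo) * d / (2 ** len(bits) - 1)

    c = scale(chromosome[:n_c], *c_range)
    gamma = scale(chromosome[n_c:n_c + n_gamma], *gamma_range)
    mask = chromosome[n_c + n_gamma:]  # one bit per candidate feature
    return c, gamma, mask
```

With these assumed settings, a chromosome whose C bits are all 0 and whose γ bits are all 1 decodes to the lower bound of the C range and the upper bound of the γ range, and the remaining bits are returned unchanged as the feature mask.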

RGSC
Fitness function
WA: weight for SVM classification accuracy
SVM_accuracy: SVM classification accuracy
WF: weight for the number of features
Ci: cost of feature i
Fi: '1' represents that feature i is selected; '0' represents that feature i is not selected
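The formula itself did not survive in the transcript, only its symbol definitions. A form commonly used in GA-based SVM feature selection with exactly these symbols is fitness = WA × SVM_accuracy + WF × (Σ Ci × Fi)^(-1), and the sketch below assumes that form; it may differ from the authors' formula (8). The default weights and the handling of an empty feature mask are also assumptions, and `fitness` is a hypothetical function name.

```python
def fitness(svm_accuracy, costs, mask, w_a=0.8, w_f=0.2):
    """Assumed fitness: w_a * accuracy + w_f / (total cost of selected features).

    svm_accuracy -- classification accuracy as a fraction in [0, 1]
    costs        -- Ci, the cost of each feature
    mask         -- Fi, 1 if feature i is selected, else 0
    The weights 0.8 and 0.2 are placeholders, not values from the slides.
    """
    total_cost = sum(c * f for c, f in zip(costs, mask))
    if total_cost == 0:
        # No feature selected: treated here as the worst possible chromosome.
        return 0.0
    return w_a * svm_accuracy + w_f / total_cost
```

Under this assumed form, a chromosome is rewarded both for higher accuracy and for a smaller (cheaper) feature subset, with WA and WF setting the trade-off between the two.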

Experiments
Experiment environment
Our implementation was carried out in the YALE (Yet Another Learning Environment) 3.3 development environment (available at: ). Feature reduction by Rough Set Theory was carried out in ROSETTA (you can download it from ). The empirical evaluation was performed on an Intel Pentium IV CPU running at 3.0 GHz with 1 GB of RAM.

Conclusion
In this paper, we have proposed a document classification method using an SVM based on Rough Set Theory and Genetic Algorithms. The feature vectors are reduced by Rough Set Theory. Features are then selected and parameters optimized by Genetic Algorithms. The experimental results show that the proposed RGSC yields the best result of the three methods compared.

Thank you!