Hybrid Ant Colony Optimization-Support Vector Machine using Weighted Ranking for Feature Selection and Classification

[Slide figure: a dataset with all features F1-F7, including noisy features, versus a dataset with only the selected features.] When the dimension of the dataset is high, predictive performance decreases and computation time grows; selecting features and discarding the noisy ones lowers the dimension, increases predictive performance, and reduces computation time.

Feature selection algorithms fall into two families:

• Filters evaluate features using heuristic information (e.g., statistical correlation). They are fast but inaccurate.
• Wrappers evaluate features using a learning algorithm (e.g., Ant Colony Optimization, Genetic Algorithms). They are slow but accurate.

• The proposed method uses a hybrid filter-wrapper approach.
• Wrapper method: Ant Colony Optimization.
• Filter method: weighted ranking of features.

Ant Colony Optimization (ACO)

• ACO is an iterative process used for solving combinatorial optimization problems.
• It is inspired by the foraging behavior of real ants and their inherent ability to find the shortest path from a food source to their nest by depositing and following pheromone trails.

• Features are equivalent to cities in the Traveling Salesman Problem (TSP).
• In contrast to TSP, each ant conducts only a partial tour: it visits a subset of the features rather than all of them.
• ACO depends on the heuristic information available for each feature and on the experience gained by the ants in previous iterations.

Filter approach: each feature is assigned a weighted ranking score that combines its Information Gain score (IG), Chi-Square score (CS), and CFS score with weights w1, w2, and w3. The weighted rank (WR) of feature f is calculated as:

WR_f = w1 * IG_f + w2 * CS_f + w3 * CFS_f
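A minimal sketch of this step, assuming the three filter scores are first min-max scaled to a common [0, 1] range (a normalization the slides do not state):

```python
import numpy as np

def weighted_rank(ig, cs, cfs, w1=0.3, w2=0.3, w3=0.3):
    """Combine the three filter scores into one weighted rank per feature.

    ig, cs, cfs: 1-D arrays with one Information Gain, Chi-Square and
    CFS score per feature. Min-max scaling before combining is an
    assumption, not something stated on the slides.
    """
    def scale(s):
        s = np.asarray(s, dtype=float)
        span = s.max() - s.min()
        return (s - s.min()) / span if span > 0 else np.zeros_like(s)

    return w1 * scale(ig) + w2 * scale(cs) + w3 * scale(cfs)

# Features ranked best-first by their combined score:
# ranking = np.argsort(-weighted_rank(ig_scores, chi2_scores, cfs_scores))
```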

Proposed Algorithm

1. Rank the features based on the weighted summation of their IG, CS, and CFS scores.
2. Initialize the pheromone value τ0 on all links, the number of ants, and the other algorithm parameters.
3. Each ant selects its first feature using a scoring system in 90% of the cases; in the remaining 10%, the first feature is selected randomly.

4. To select each subsequent feature, choose between exploitation and exploration (a sketch of this rule follows the list).
5. Continue the partial tour in the same manner until the a priori fixed subset size is reached; the other ants are deputed to conduct their own partial tours likewise.
6. After all ants complete their tours, evaluate the quality of the tours using SVM.

7. Increase the pheromone concentration on the links of the best ant, proportional to the quality of its tour (the accuracy of the classifier).
8. Decrease the pheromone concentration on all links visited by the remaining ants.
9. Repeat steps 3 to 8 for a fixed number of iterations.
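The slides name the exploitation/exploration choice (steps 4-5) and the pheromone rules (steps 7-8) but not their exact formulas. The sketch below fills the gaps with the classic Ant Colony System pseudo-random proportional rule and simple multiplicative updates; both concrete formulas are assumptions, with parameter values taken from the parameter table further down:

```python
import numpy as np

rng = np.random.default_rng(0)

def select_next(current, visited, tau, eta, q0=0.7, beta=1.0):
    """Choose the ant's next feature from `current` (steps 4-5).

    tau: pheromone matrix over feature-to-feature links; eta: per-feature
    heuristic scores (e.g. the weighted ranks WR_f). With probability q0
    the best link is exploited; otherwise the next feature is sampled in
    proportion to tau * eta**beta (exploration).
    """
    candidates = [f for f in range(len(eta)) if f not in visited]
    attract = np.array([tau[current, f] * eta[f] ** beta for f in candidates])
    if rng.random() < q0:                        # exploitation
        return candidates[int(np.argmax(attract))]
    return candidates[rng.choice(len(candidates), p=attract / attract.sum())]

def update_pheromones(tau, tours, best, best_acc, rho=0.98, phi=0.25):
    """Steps 7-8: reward the best ant's links, decay everyone else's."""
    for tour in tours:
        gain = 1.0 + phi * best_acc if tour is best else rho
        for a, b in zip(tour, tour[1:]):
            tau[a, b] *= gain
            tau[b, a] = tau[a, b]    # treat links as undirected
    return tau
```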

• A scoring system has been developed for the selection of the first feature. It depends on:
• The number of links from a feature whose pheromone value exceeds a threshold.
• The sum of the pheromone values on all links starting from that feature.
• The fitness value of the feature itself.
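The slides list these three ingredients but not how they are combined; the equal-weight sum of min-max scaled terms below is purely illustrative:

```python
import numpy as np

def first_feature_score(tau, fitness, threshold):
    """Score every feature as a candidate first feature (illustrative mix).

    Per feature f this combines: the number of links from f whose
    pheromone exceeds `threshold`, the total pheromone on links starting
    at f, and f's own fitness value. Equal weighting is an assumption.
    """
    def scale(s):
        s = np.asarray(s, dtype=float)
        span = s.max() - s.min()
        return (s - s.min()) / span if span > 0 else np.zeros_like(s)

    strong_links = (tau > threshold).sum(axis=1)
    pheromone_sum = tau.sum(axis=1)
    return (scale(strong_links) + scale(pheromone_sum) + scale(fitness)) / 3.0
```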

• The subset size is decreased progressively from 90% of the original feature-set size down to 10%.
• The subset size that gives the maximum accuracy is selected.
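The outer loop over candidate subset sizes might look like this, where `run_aco(k)` is a hypothetical wrapper that runs the ACO search for subsets of size k and returns the best subset found together with its cross-validated accuracy:

```python
def sweep_subset_sizes(n_features, run_aco):
    """Try subset sizes from 90% down to 10% of the full feature set."""
    best_acc, best_subset = 0.0, None
    for frac in (0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1):
        k = max(1, round(frac * n_features))
        subset, acc = run_aco(k)
        if acc > best_acc:
            best_acc, best_subset = acc, subset
    return best_subset, best_acc
```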

• To evaluate candidate solutions, an SVM learning algorithm was used as the wrapper.
• After constructing a solution, each ant passes its feature subset to the classifier and receives the resulting accuracy.
• This accuracy is used as the fitness function for selecting the best ant of the iteration. Classifier accuracy was evaluated using 10-fold cross-validation.
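With scikit-learn (an assumed toolchain; the slides do not name one), the wrapper's fitness function takes a few lines; the RBF kernel and C = 100 are borrowed from the breast-cancer case study later in the presentation:

```python
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def fitness(X, y, subset):
    """10-fold cross-validated SVM accuracy on the chosen feature subset."""
    clf = SVC(kernel="rbf", C=100)
    return cross_val_score(clf, X[:, subset], y, cv=10).mean()
```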

• Any good classifier can be employed as the wrapper.
• Any combination of good filter methods can be used within the weighted scoring of features.

• Drawback: the algorithm is time consuming.
• Solution: a parallel implementation of the algorithm.
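Each ant's SVM evaluation is independent of the others, so one natural parallelization is to score the ants' subsets in worker processes. A sketch with the standard-library pool, reusing the `fitness` function from the previous sketch:

```python
from functools import partial
from multiprocessing import Pool

def evaluate_ants(X, y, subsets, processes=4):
    """Score each ant's feature subset in a separate worker process."""
    with Pool(processes) as pool:   # requires a __main__ guard on Windows
        return pool.map(partial(fitness, X, y), subsets)
```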

Parameter settings:

Parameter                              Value
Number of ants                         100
Number of iterations                   50
Exploitation probability factor (q0)   0.7
Pheromone update strength (φ)          0.25
Pheromone decay parameter (ρ)          0.98
Pheromone importance factor (β)        1
Information Gain weight (w1)           0.3
Chi-Square weight (w2)                 0.3
CFS weight (w3)                        0.3
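For reference, the same settings as a configuration dict that could be passed into the sketches above (the key names are my own):

```python
ACO_PARAMS = {
    "n_ants": 100,
    "n_iterations": 50,
    "q0": 0.7,    # exploitation probability factor
    "phi": 0.25,  # pheromone update strength
    "rho": 0.98,  # pheromone decay parameter
    "beta": 1.0,  # pheromone importance factor
    "w1": 0.3,    # Information Gain weight
    "w2": 0.3,    # Chi-Square weight
    "w3": 0.3,    # CFS weight
}
```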

• To test the proposed algorithm, six datasets were obtained from the UCI (University of California, Irvine) Machine Learning Repository.

Dataset                   Features   Classes   Label type   Instances
Wisconsin Breast Cancer   10         2         REAL         683
Hepatitis                 19         2         REAL         155
Lung Cancer               56         3         REAL         32
Splice                    60         2         REAL         1000
Bupa Liver Disorder       6          2         REAL         345
Statlog Heart             13         2         REAL         270

• The classifier accuracy was calculated for each dataset with and without feature selection and the results were compared (the accuracy values themselves are not preserved in this transcript):

Dataset                   Original features   Features in subset   % Accuracy without FS   % Accuracy with FS
Wisconsin Breast Cancer   10                  5
Hepatitis                 19                  3
Lung Cancer               56                  2
Splice                    60                  16
Bupa Liver Disorder       6                   2
Statlog Heart             13                  5

• The selected feature subsets for all the datasets:

Dataset                   Selected features
Wisconsin Breast Cancer   2, 3, 4, 7, 9
Hepatitis                 12, 13, 14
Lung Cancer               9, 14
Splice                    15, 16, 18, 19, 22, 23, 25, 26, 28, 29, 30, 31, 32, 33, 34, 35
Bupa Liver Disorder       1, 2
Statlog Heart             3, 7, 9, 12, 13

• Breast cancer is a malignant tumor that develops from breast cells.
• It is one of the most common causes of cancer death among women.
• To reduce deaths, early diagnosis that distinguishes benign from malignant tumors is important.

• Biopsy is expensive and invasive.
• Several machine learning techniques have been used successfully to predict breast cancer.
• Objective: assign patients to the benign or the malignant class.

• The dataset was obtained from the UCI Machine Learning Repository.
• It consists of 10 features and 699 instances.
• The features describe characteristics of the cell nuclei present in the breast-mass image.
• An RBF kernel was found to outperform the polynomial and linear kernels for this task, with C = 100.
• A feature subset of five features obtained the best accuracy.
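Putting the case-study pieces together: a sketch of the final model, assuming the slide's 1-indexed feature numbers (the subset 2, 3, 4, 7, 9 from the earlier table) map directly onto 0-indexed columns of a feature matrix X that is already loaded along with the labels y:

```python
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Slide features 2, 3, 4, 7, 9 as 0-indexed columns (an assumption
# about how the slide numbers the matrix columns).
SELECTED = [1, 2, 3, 6, 8]

def evaluate_wbc(X, y):
    """10-fold CV accuracy of the RBF SVM (C=100) on the selected features."""
    clf = SVC(kernel="rbf", C=100)
    return cross_val_score(clf, X[:, SELECTED], y, cv=10).mean()
```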

#    Feature                       Domain
1    Sample code number            ID number
2    Clump Thickness               1 - 10
3    Uniformity of Cell Size       1 - 10
4    Uniformity of Cell Shape      1 - 10
5    Marginal Adhesion             1 - 10
6    Single Epithelial Cell Size   1 - 10
7    Bare Nuclei                   1 - 10
8    Bland Chromatin               1 - 10
9    Normal Nucleoli               1 - 10
10   Mitoses                       1 - 10
11   Class                         2 for benign, 4 for malignant

Comparison with other algorithms (accuracy values missing below were not preserved in this transcript):

Algorithm     Accuracy (%)
NB+CHI
NB+GAIN
NB+RelieF
NB+CFS
RBF+GA
KNN+CHIWSS
NN+CHIWSS     74.23
SVM+CHIWSS    76.29
ACO-SVM+WR

• A hybrid filter-wrapper technique.
• Reduces the complexity of machine learning algorithms.
• Easy to implement.
• Increases predictive performance.
• Picks up the correlations between features.
• Flexible: any good classifier or combination of filter methods can be plugged in.