A Genetic Algorithms Approach to the Feature Subset Selection Problem
Hasan Doğu TAŞKIRAN
CS 550 – Machine Learning Workshop
Department of Computer Engineering, Bilkent University
May 16, 2005
Outline
- Motivation
- Neural Networks
- Feature Subset Selection
- Genetic Algorithms
- Methodology
- Experiments and Results
- Conclusions and Future Work
Motivation
- It is not unusual to find problems involving hundreds of features.
- Beyond a point, the inclusion of additional features leads to worse rather than better performance.
- We must differentiate between features that contribute new information and those that do not.
- Many current techniques, such as PCA and LDA, involve linear transformations to lower-dimensional spaces.
- A multi-objective genetic algorithm is needed to:
  - Reduce the cost
  - Increase the accuracy (if applicable)
Neural Networks
- An information processing paradigm inspired by the way biological nervous systems process information.
- A large number of highly interconnected processing elements (neurons) working in unison to solve specific problems.
- They are configured for a specific application through a learning process: adjustments to the synaptic connections that exist between the neurons.
Neural Networks
- The network may become extremely complex if the number of features used for classification grows too large.
- If the network becomes too complex, then:
  - Size increases
  - Training time increases
  - Training set size increases
  - Classification time increases
- Some optimization methods, such as node pruning techniques, exist for classification using ANNs.
Feature Subset Selection
- Reduce the number of features used in classification while maintaining acceptable classification accuracy.
- Has a considerable impact on the effectiveness of the resulting classification.
- Computational complexity is reduced, since there is a smaller number of inputs.
- Accuracy increases when the removed features would otherwise hinder the classification process.
- Can be seen as a special case of binary feature weighting, as illustrated below.
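As a concrete illustration of binary feature weighting (a sketch we add here, not taken from the slides), a 0/1 chromosome acts as a column mask on the feature matrix:

    # Illustrative sketch: a binary chromosome selects which feature
    # columns are fed to the classifier. Data here is random filler.
    import numpy as np

    X = np.random.rand(5, 8)                                # 5 samples, 8 hypothetical features
    mask = np.array([1, 0, 1, 1, 0, 0, 1, 0], dtype=bool)   # chromosome: keep features 0, 2, 3, 6
    X_selected = X[:, mask]                                  # reduced input for training
    print(X_selected.shape)                                  # (5, 4)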
Genetic Algorithms
- A family of computational models inspired by evolution.
- GAs are parallel iterative optimizers that have been successfully applied to a broad spectrum of optimization problems.
- They work by applying selection, mutation, and recombination to a population of competing problem solutions.
- A directed search rather than an exhaustive search.
Genetic Algorithms
- Given enough time and a well-bounded problem, a genetic algorithm can find a global optimum.
- The performance of a genetic algorithm depends on a number of factors, including:
  - The choice of genetic representation and operators
  - The fitness function
  - The details of the fitness-dependent selection procedure
  - Various user-determined parameters, such as population size
- It is all about representation and fitness…
Methodology
- Represent the feature subsets as binary strings where:
  - A value of 1 represents the inclusion of a particular feature in the training process
  - A value of 0 represents its absence
- The genetic algorithm operates on a pool of such binary strings.
- For each binary string, we train a new neural network with the selected features as input nodes to evaluate the fitness of the resulting subset (a sketch of this step follows below).
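A minimal Python sketch of this evaluation step. The slides used the Matlab Neural Network and Genetic Algorithm toolboxes; scikit-learn's MLPClassifier, the variable names, and the max_iter setting here are our stand-in assumptions:

    # Hypothetical sketch of the per-chromosome evaluation: each 240-bit
    # chromosome masks the pixel features, and a fresh network is trained
    # on only the selected features.
    import numpy as np
    from sklearn.neural_network import MLPClassifier

    N_FEATURES = 240   # 15 x 16 pixel images
    POP_SIZE = 50

    rng = np.random.default_rng(0)
    population = rng.integers(0, 2, size=(POP_SIZE, N_FEATURES))  # pool of binary strings

    def evaluate(chromosome, X_train, y_train, X_test, y_test):
        """Train a new network on the selected features; return e(x) and s(x)."""
        mask = chromosome.astype(bool)
        if not mask.any():                   # degenerate all-zero subset
            return 1.0, 0.0
        net = MLPClassifier(hidden_layer_sizes=(10,), activation='logistic',
                            max_iter=300)    # stand-in for the slides' 10-neuron ANN
        net.fit(X_train[:, mask], y_train)
        e = 1.0 - net.score(X_test[:, mask], y_test)   # error value, 0 <= e(x) <= 1
        s = mask.sum() / N_FEATURES                    # cost value, 0 <= s(x) <= 1
        return e, s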
Methodology
- As a result of the training we obtain an error value e(x), where 0 ≤ e(x) ≤ 1.
- A cost function s(x) for the network is also obtained, where again 0 ≤ s(x) ≤ 1.
- After training, the fitness of the feature subset is obtained by combining e(x) and s(x).
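The slide's formula image did not survive the transcript, so its exact form is unknown. A common multi-objective choice, given here only as an assumption, penalizes both error and cost with a user-chosen weight α:

    % Assumed form only; the slide's actual formula is not recoverable.
    \mathrm{fitness}(x) = 1 - \bigl( \alpha \, e(x) + (1 - \alpha) \, s(x) \bigr),
    \qquad 0 \le \alpha \le 1

With α close to 1 the search favors accuracy alone; smaller α presses harder on reducing the number of selected features.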
Experiments and Results
- We conducted an experiment demonstrating our approach on a handwritten digit recognition problem.
- We implemented our methodology using the Matlab Neural Network and Genetic Algorithm toolboxes.
- We used the UCI handwritten digits database in our experiments.
- This database includes 200 samples of each digit (2000 samples in total).
- The digits are represented as 15 x 16 images.
Experiments and Results
- We randomly chose 100 samples of each digit for the training set and used the remaining 100 of each digit for testing our networks, in order to obtain the necessary e(x) and s(x) values.
- We decided to use the pixels themselves as features, so we have 240 features to evaluate.
- We create a pool of feature subsets represented as 240-bit strings, where 1s represent the inclusion of the associated pixel value while training the network and 0s represent its absence.
- For each binary string in the pool we create a new feed-forward backpropagation ANN with one hidden layer of 10 neurons. We used log-sigmoid transfer functions and trained with gradient descent with momentum and adaptive learning-rate backpropagation (the slowest training function in Matlab, namely 'traingdx').
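The per-class 100/100 split could be done as in the following sketch (a hypothetical helper; loading X and y from the UCI database is dataset-specific and not shown):

    # Sketch of the per-class train/test split described above.
    import numpy as np

    def split_per_class(X, y, n_train=100, seed=0):
        """Randomly take n_train samples of each digit for training, rest for test."""
        rng = np.random.default_rng(seed)
        train_idx, test_idx = [], []
        for digit in np.unique(y):
            idx = rng.permutation(np.where(y == digit)[0])
            train_idx.extend(idx[:n_train])
            test_idx.extend(idx[n_train:])
        return X[train_idx], y[train_idx], X[test_idx], y[test_idx]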
Experiments and Results
The parameters for our GA are:
- Population size: 50
- Number of generations: 100
- Probability of crossover: 0.6
- Probability of mutation: 0.001
- Elite count: 2
- Type of mutation: Uniform
- Type of selection: Rank-based
- Stall generations limit: 10
- Stall time limit: Infinite
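One generation under these settings might look like the sketch below. The crossover type is not stated on the slide, so single-point crossover is our assumption; `fitnesses` comes from an evaluate() routine like the one sketched earlier (higher fitness is better):

    # Hedged sketch of a single GA generation: rank-based selection,
    # single-point crossover at p = 0.6 (assumed type), uniform bit-flip
    # mutation at p = 0.001, and an elite count of 2.
    import numpy as np

    P_CROSSOVER, P_MUTATION, ELITE_COUNT = 0.6, 0.001, 2
    rng = np.random.default_rng(1)

    def next_generation(population, fitnesses):
        order = np.argsort(fitnesses)[::-1]              # sort chromosomes best-first
        pop = population[order]
        ranks = np.arange(len(pop), 0, -1)               # rank-based selection weights
        probs = ranks / ranks.sum()
        children = [pop[k].copy() for k in range(ELITE_COUNT)]  # elitism: carry best 2 over
        while len(children) < len(pop):
            i, j = rng.choice(len(pop), size=2, p=probs)
            a, b = pop[i].copy(), pop[j].copy()
            if rng.random() < P_CROSSOVER:               # single-point crossover
                cut = int(rng.integers(1, len(a)))
                a[cut:], b[cut:] = b[cut:].copy(), a[cut:].copy()
            for child in (a, b):
                flips = rng.random(len(child)) < P_MUTATION  # uniform bit-flip mutation
                child[flips] ^= 1
                children.append(child)
        return np.array(children[:len(pop)])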
Experiments and Results
(figure-only slide)
Experiments and Results

    Accuracy                                     Training Dataset    Test Dataset
    Full Feature Set (240 features, s(x) = 1.00)       99.7%            89.9%
    Optimal Subset (53 features, s(x) = 0.221)         99.4%            90.4%
Conclusions
- The proposed methodology succeeds in reducing the complexity of the feature set used by the ANN classifier.
- Genetic algorithms offer an attractive approach to solving the feature subset selection problem.
- This methodology finds application in the cost-sensitive design of classifiers for tasks such as medical diagnosis and computer vision.
- Other application areas include automated data mining and knowledge discovery from datasets with an abundance of irrelevant or redundant features.
- The GA-based approach to feature subset selection does not rely on the monotonicity assumptions used in traditional approaches to feature subset selection.
Future Work
- Further analysis is still needed to improve the results obtained using GAs.
- Performance improvements and trials on other datasets may be included.
- Performance improvements should also be made to the genetic algorithms themselves.
- Another line of analysis concerns the fitness evaluation function, where other fitness functions may be tried.
- The approach may be tried in the semi-supervised learning case.
Thanks for Listening…
Questions?