Managing Population Diversity Through the Use of Weighted Objectives and Modified Dominance: An Example from Data Mining
  Alan P. Reynolds and Beatriz de la Iglesia
  School of Computing Sciences, University of East Anglia, Norwich, Norfolk, UK
  Presented by: Dr Beatriz de la Iglesia
  MCDM 2007
Overview
  - The context: multi-objective predictive classification
  - The problem: loss of diversity
  - Managing diversity using three objectives
  - Managing diversity using adaptive objectives
  - Managing diversity by modifying the dominance relation
  - Conclusions and further research
  Acknowledgement: Work supported by EPSRC Grant GR/T04298
Multi-Objective (MO) Predictive Classification
  - NSGA II is applied to optimize rules in the form of expression trees.
  - Internal nodes are Boolean operators (‘AND’, ‘OR’); leaf nodes contain Attribute Tests (ATs) (e.g. sex=male, color red, age 29).
  - Expression trees are used to predict the class in binary classification problems.
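To make the rule representation concrete, here is a minimal sketch of an expression tree with Boolean internal nodes and attribute tests at the leaves. The class names, the dict-based record format, and the comparison operators used in the example rule are assumptions made for illustration; the slides do not specify them.

```python
from dataclasses import dataclass
from typing import Callable, Union

Record = dict  # hypothetical record format: attribute name -> value


@dataclass
class AttributeTest:
    """Leaf node: a single test on one attribute."""
    attribute: str
    test: Callable[[object], bool]

    def evaluate(self, record: Record) -> bool:
        return self.test(record[self.attribute])


@dataclass
class BoolNode:
    """Internal node: 'AND' or 'OR' applied to two subtrees."""
    op: str
    left: "Rule"
    right: "Rule"

    def evaluate(self, record: Record) -> bool:
        if self.op == "AND":
            return self.left.evaluate(record) and self.right.evaluate(record)
        if self.op == "OR":
            return self.left.evaluate(record) or self.right.evaluate(record)
        raise ValueError(f"unknown operator: {self.op}")


Rule = Union[AttributeTest, BoolNode]

# Illustrative rule: sex = male AND (age > 29 OR color != red).
# The operators '>' and '!=' are chosen for the example only.
rule = BoolNode(
    "AND",
    AttributeTest("sex", lambda v: v == "male"),
    BoolNode(
        "OR",
        AttributeTest("age", lambda v: v > 29),
        AttributeTest("color", lambda v: v != "red"),
    ),
)
```

A record is predicted to belong to the positive class when `rule.evaluate(record)` returns `True`.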
Objectives for MO approach
  - Minimizing misclassification costs:
      - simple error rate;
      - balanced error rate;
      - any other measure of overall misclassification cost.
  - Minimizing rule complexity:
      - simple count of ATs or other measures of complexity;
      - to encourage the production of rules that are easily understood by the client;
      - to reduce the chance of overfitting the data.
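The two named cost measures can be sketched from confusion-matrix counts as follows. The function names are hypothetical, and the balanced error rate uses the usual definition (the mean of the false negative rate and the false positive rate), which the slides do not spell out.

```python
def simple_error_rate(tp: int, fp: int, tn: int, fn: int) -> float:
    """Fraction of all records that are misclassified."""
    return (fp + fn) / (tp + fp + tn + fn)


def balanced_error_rate(tp: int, fp: int, tn: int, fn: int) -> float:
    """Mean of the per-class error rates (usual definition, assumed here)."""
    false_negative_rate = fn / (tp + fn)
    false_positive_rate = fp / (fp + tn)
    return 0.5 * (false_negative_rate + false_positive_rate)
```

On an imbalanced sample with 10 positives and 90 negatives, a rule that misses half the positives and makes no false alarms has a simple error rate of 0.05 but a balanced error rate of 0.25, which is why the choice of cost measure matters.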
Applying NSGA II
  - Initialization: population initialized with random balanced trees of depth two.
  - Crossover: subtree crossover, which selects two nodes at random and swaps the associated subtrees.
  - Mutation: applied whenever crossover is not applied:
      - 50% chance that an AT is mutated;
      - 25% chance that a random AT and its parent node are removed;
      - 25% chance that a random AT and parent node are inserted.
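The operators above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: trees are encoded as nested tuples (a hypothetical encoding), crossover is assumed to pick one node in each parent, and all function names are invented for the example.

```python
import random

# Hypothetical encoding: a leaf is a string naming an attribute test;
# an internal node is a tuple (operator, left_subtree, right_subtree).


def node_paths(tree, prefix=()):
    """Return the path (tuple of child indices, 1 or 2) to every node."""
    paths = [prefix]
    if isinstance(tree, tuple):
        paths += node_paths(tree[1], prefix + (1,))
        paths += node_paths(tree[2], prefix + (2,))
    return paths


def get_subtree(tree, path):
    for index in path:
        tree = tree[index]
    return tree


def replace_subtree(tree, path, new):
    """Return a copy of `tree` with the node at `path` replaced by `new`."""
    if not path:
        return new
    children = list(tree)
    children[path[0]] = replace_subtree(tree[path[0]], path[1:], new)
    return tuple(children)


def subtree_crossover(a, b, rng):
    """Pick one random node in each parent and swap the subtrees."""
    path_a = rng.choice(node_paths(a))
    path_b = rng.choice(node_paths(b))
    sub_a, sub_b = get_subtree(a, path_a), get_subtree(b, path_b)
    return replace_subtree(a, path_a, sub_b), replace_subtree(b, path_b, sub_a)


def choose_mutation(rng):
    """Pick a mutation type with the 50% / 25% / 25% split from the slide."""
    u = rng.random()
    if u < 0.50:
        return "mutate_at"
    if u < 0.75:
        return "remove_at_and_parent"
    return "insert_at_and_parent"


def leaves(tree):
    """List the attribute tests of a tree, left to right."""
    if isinstance(tree, tuple):
        return leaves(tree[1]) + leaves(tree[2])
    return [tree]
```

Because subtrees are exchanged rather than copied, the two children together contain exactly the attribute tests of the two parents.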
Rule evaluation
  - Rule simplification: before each rule is evaluated, an effort is made to simplify it.
  - Rule reduction: if, after simplification, the rule is still larger than a preset maximum size (20 ATs in the experiments reported here), ATs and their parent nodes are removed at random until the rule meets the size constraint.
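Rule reduction can be sketched as below. The nested-tuple tree encoding and the function names are assumptions for illustration, and the sketch assumes that removing an AT together with its parent means the AT's sibling subtree takes the parent's place.

```python
import random

# Hypothetical encoding: a leaf is a string naming an attribute test;
# an internal node is a tuple (operator, left_subtree, right_subtree).


def count_ats(tree):
    """Rule size: the number of attribute tests (leaves)."""
    if isinstance(tree, tuple):
        return count_ats(tree[1]) + count_ats(tree[2])
    return 1


def leaf_paths(tree, prefix=()):
    """Return the path (tuple of child indices, 1 or 2) to every leaf."""
    if isinstance(tree, tuple):
        return leaf_paths(tree[1], prefix + (1,)) + leaf_paths(tree[2], prefix + (2,))
    return [prefix]


def remove_leaf(tree, path):
    """Remove the leaf at `path` and its parent; the sibling replaces the parent."""
    if len(path) == 1:
        return tree[3 - path[0]]  # sibling index: 2 if the leaf is child 1, else 1
    children = list(tree)
    children[path[0]] = remove_leaf(tree[path[0]], path[1:])
    return tuple(children)


def reduce_rule(tree, max_ats=20, rng=random):
    """Remove random ATs until the rule has at most `max_ats` of them."""
    if max_ats < 1:
        raise ValueError("max_ats must be at least 1")
    while count_ats(tree) > max_ats:
        tree = remove_leaf(tree, rng.choice(leaf_paths(tree)))
    return tree
```

Each removal lowers the AT count by exactly one, so the loop stops as soon as the size constraint is met.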
Using two objectives
  - Validation performance for the UCI ML repository Adult dataset.
  - Population of 100 and crossover rate of 30%.
  - Algorithm run 30 times; error bars give the standard deviation.
  - 14.45% error rate on test data (STD 0.12%)
  - 15.98% error rate on test data (STD 0.70%)
Loss of population diversity
  - Analysis of performance reveals a loss of population diversity (genetic material) early in the search.
  - Information entropy is used as a measure of population diversity.
  - Entropy is 0 if all solutions in the population are the same.
  - If all solutions are unique, entropy is at its maximum (log N for a population of N solutions, under the standard Shannon definition).
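A minimal sketch of an entropy-based diversity measure follows. It assumes Shannon entropy over the frequencies of distinct solutions, a base-2 logarithm, and that each solution has a hashable canonical form; the slides do not state any of these details.

```python
import math
from collections import Counter


def population_entropy(population, base=2.0):
    """Shannon entropy of the distribution of distinct solutions.

    `population` is a sequence of hashable solution encodings.
    The logarithm base is an assumption, not taken from the slides.
    """
    n = len(population)
    counts = Counter(population)
    # p * log(1 / p) summed over distinct solutions, with p = count / n
    return sum((c / n) * math.log(n / c, base) for c in counts.values())
```

With these assumptions, a population of 100 identical rules has entropy 0, and a population of 100 distinct rules has entropy log2(100), roughly 6.64.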
Population entropy [figure]
Population Diversity in MO GAs
  - Early in the search, few solutions are non-dominated, so algorithms like NSGA II and SPEA 2 become too elitist.
  - In other MO scenarios this may also be a problem, especially early in the search, whenever objectives are correlated.
  - Solution: artificially increase the fraction of non-dominated solutions early in the search to counteract the loss of population diversity.
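The link between correlated objectives and a small non-dominated set can be illustrated with a short sketch. It uses standard Pareto dominance for minimisation; the function names and the toy objective vectors are invented for the example.

```python
def dominates(a, b):
    """Standard Pareto dominance for minimisation."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_fraction(points):
    """Fraction of points not dominated by any other point."""
    front = [p for p in points if not any(dominates(q, p) for q in points)]
    return len(front) / len(points)


# Toy objective vectors: perfectly correlated v. perfectly conflicting.
correlated = [(1, 1), (2, 2), (3, 3), (4, 4)]
conflicting = [(1, 4), (2, 3), (3, 2), (4, 1)]
```

In the correlated set only one point in four is non-dominated, whereas every point in the conflicting set is non-dominated. An elitist algorithm applied to the first set concentrates selection on very few solutions.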
Using three objectives
  - If the client is unsure how to calculate misclassification costs, it is tempting to optimise rule complexity, number of False Positives (FP) and number of False Negatives (FN).
  - In practice, the client usually has a rough idea; this knowledge can be used to improve the search.
  - Objectives, with weight λ:
      - rule complexity;
      - λ FP + (1 − λ) FN;
      - (1 − λ) FP + λ FN.
  [Figure: dominated area in the plane of false positives v. false negatives at constant rule complexity, 3 basic objectives v. 3 objectives with λ = 0.8]
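The effect of the weight on dominance can be sketched as follows. The symbol for the weight did not survive in the slide text, so the name `lam` is a placeholder; the function names and the two example rules are likewise invented for illustration.

```python
def weighted_objectives(complexity, fp, fn, lam):
    """Three objectives to minimise; `lam` is the objective weight."""
    return (
        complexity,
        lam * fp + (1 - lam) * fn,
        (1 - lam) * fp + lam * fn,
    )


def dominates(a, b):
    """Standard Pareto dominance for minimisation."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


# Two hypothetical rules of equal complexity.
rule_a = dict(complexity=3, fp=10, fn=30)
rule_b = dict(complexity=3, fp=12, fn=20)


def b_dominates_a(lam):
    return dominates(
        weighted_objectives(**rule_b, lam=lam),
        weighted_objectives(**rule_a, lam=lam),
    )
```

With a weight of 0 the objectives reduce to complexity, FN and FP, and neither rule dominates the other because rule A has fewer false positives. With a weight of 0.5 both combined objectives equal half the total error count, so rule B (32 errors) dominates rule A (40 errors). Raising the weight towards 0.5 therefore shrinks the non-dominated set.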
Optimizing 3 objectives - 200 generations
  - Results after 200 generations, scaled with respect to the objective weight λ = 0.5.
  - λ = 0 is equivalent to optimizing 3 objectives (complexity, FP, FN).
  - λ = 0.5 is equivalent to optimizing 2 objectives (complexity, simple error rate).
Optimizing 3 objectives - 2000 generations
  - Results after 2000 generations, scaled with respect to the objective weight λ = 0.5, including 95% confidence intervals.
  - The two-objective approach outperforms the others, except for λ = 0.4.
Population Diversity
  - Three objectives reduce the loss of diversity early in the search.
  - The three-objective search does not find the best large rules under error rate and rule simplicity.
  - Solution: a combination approach should be used.
Adaptive objectives (2000 gen)
  - Initialise the objective weight λ = 0.
  - Measure the entropy of the population at each iteration: increase λ by 0.01, up to a maximum of 0.5, whenever the entropy is greater than 5; decrease λ by 0.01, down to a minimum of 0, whenever the entropy is less than 3.
  [Figure: mean error rates (scaled) on training data using 2000 generations]
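The update rule above amounts to a simple feedback controller, sketched here. The function and parameter names are hypothetical; the step size, bounds and entropy thresholds are the values quoted on the slide.

```python
def update_weight(lam, entropy, step=0.01, lam_max=0.5, high=5.0, low=3.0):
    """Adjust the objective weight once per iteration from population entropy."""
    if entropy > high:
        # Diversity is plentiful: move towards the two-objective setting.
        lam = min(lam_max, lam + step)
    elif entropy < low:
        # Diversity is scarce: move back towards three separate objectives.
        lam = max(0.0, lam - step)
    return lam


# Usage: start at 0 and apply the rule once per generation.
lam = 0.0
for observed_entropy in [6.0] * 60:  # hypothetical entropy readings
    lam = update_weight(lam, observed_entropy)
```

The weight is left unchanged while the entropy lies between the two thresholds, which keeps it from oscillating on small fluctuations in diversity.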
Adaptive objectives (200 gen)
  - 15.98% error rate on test data (STD 0.70%)
  - 15.00% error rate on test data (STD 0.25%)
Modifying the dominance relation
  - Breaking one objective into two or more parts is not always easy.
  - An alternative is to use linear combinations of objectives as before, but where one of the weights is negative. For example, minimize:
      - 1.1 * complexity - 0.1 * cost
      - 1.1 * cost - 0.1 * complexity
  - Problem: all 1-AT rules become non-dominated.
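The stated problem can be checked on a small example. The rules below are hypothetical (complexity, cost) pairs with cost taken to be an error rate, which is an assumption; the function names are invented for the sketch.

```python
def dominates(a, b):
    """Standard Pareto dominance for minimisation."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def transformed(complexity, cost):
    """The two linear combinations from the slide, both to be minimised."""
    return (1.1 * complexity - 0.1 * cost, 1.1 * cost - 0.1 * complexity)


# Hypothetical rules as (complexity in ATs, cost as an error rate).
rules = [(1, 0.30), (1, 0.25), (3, 0.20)]

plain_front = [p for p in rules if not any(dominates(q, p) for q in rules)]
new_front = [
    p for p in rules
    if not any(dominates(transformed(*q), transformed(*p)) for q in rules)
]
```

Under plain dominance the rule (1, 0.25) dominates (1, 0.30). Under the transformed objectives the worse 1-AT rule has a slightly lower first objective because of the negative cost weight, so it is no longer dominated and all three rules end up in `new_front`.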
Modifying dominance relation
  - An alternative is to modify the dominance relation by allowing a certain amount of leeway: r1 dominates r2 iff [formula missing].
  - This is the opposite of ε-dominance, which allows a solution to dominate more of the objective space in order to control the spread of the Pareto front or external archive.
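Since the exact condition is not available here, the following is only one possible form of a dominance relation with leeway, not the authors' definition. It assumes an additive leeway applied to the cost objective: r1 must beat r2 on cost by at least `leeway` to dominate it. The function name, the choice of objective and the additive form are all assumptions.

```python
def dominates_with_leeway(r1, r2, leeway=0.0):
    """Hypothetical leeway dominance on (complexity, cost) pairs.

    r1 dominates r2 only if it is no more complex and its cost is
    lower by at least `leeway`. With leeway = 0 this is standard
    Pareto dominance for minimisation.
    """
    c1, k1 = r1
    c2, k2 = r2
    no_worse = c1 <= c2 and k1 + leeway <= k2
    strictly_better = c1 < c2 or k1 + leeway < k2
    return no_worse and strictly_better


small_rule = (2, 0.20)  # hypothetical (complexity, cost)
large_rule = (5, 0.21)
```

Under standard dominance the small rule dominates the large one. With a leeway of 0.02 it no longer does, so the large rule survives in the non-dominated set. Each solution dominates less of the objective space, which is the opposite of the effect of ε-dominance.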
Modifying dominance
  - Results are similar to those obtained when using three objectives in terms of:
      - solution quality;
      - effect on population diversity.
  - Results also suggest an adaptive scheme whereby more leeway is given early in the search and this is reduced later.
  - Adaptive leeway also provides improvements for both 200 and 2000 generations, although less marked than when using three adaptive objectives.
Conclusions
  - Expression tree optimisation: loss of population diversity early in the search process, as a small rule with low cost dominates the population.
  - This can also be expected in MO problems with few positively correlated objectives.
  - Two methods for maintaining population diversity:
      - split an objective into component parts and use linear combinations of the new objectives;
      - modify the dominance relation to give leeway in one objective.
  - The parameters of either method can be adjusted by a feedback control mechanism that depends on the diversity in the population.
Further research
  - Further experimentation required:
      - crossover, population size;
      - feedback mechanism;
      - other measures of population diversity.
  - Effects of diversity loss in other problem domains should be studied.