CS321 HS 2009: Evolutionary Computation II, L. Yamamoto, 19 Nov
CS321 HS 2009: Autonomic Computer Systems
Evolutionary Computation II
November 19, 2009
Lidia Yamamoto, University of Basel

Overview
Evolutionary Computation, Part II
  Representations
  Performance considerations
  Dynamic environments
  Summary

Performance Issues: Optimizing the Optimization
What makes evolutionary computation difficult?
  difficult = computationally expensive (in terms of time, memory)
From [Weise2009]:
  Premature convergence
  Ruggedness
  Lack of causality
  Deceptiveness
  Neutrality
  Epistasis
  Noise
  Overfitting
  Oversimplification
  Multi-objectivity
  Dynamic environments
  No Free Lunch

Metrics
  Diversity
  Causality
  Neutrality
  Evolvability

Exploitation vs. Exploration
Crucial to heuristic optimization: strike a good balance between exploration and exploitation
  Exploration: creation of novel solutions able to explore yet unknown regions of the search space
  Exploitation: make best use of existing good solutions, and build upon them for the construction of new ones
Too much exploration: lose focus, wander randomly through the search space: can't improve
Too much exploitation: stick to a small area near current (perhaps poor) solutions, don't look around: can't improve either

Premature Convergence
Convergence: an optimization algorithm converges when it no longer produces new solutions, or keeps producing a very reduced subset of solutions
Premature convergence: the algorithm converges to a local optimum and can't improve from there (unable to explore other regions of the search space)
Typical in multimodal fitness landscapes
  multimodal function: has several maxima or minima
  multimodal fitness landscape: has several (local or global) optima

Premature Convergence
Example:
[Figure: two surface plots (axes x, y, z) linked by an optimization run: initial population, then prematurely converged population]

Premature Convergence
Typically caused by loss of diversity
Diversity: measure of the amount of variety, i.e. the number of different solutions in the population, and how different they are (distance between alternative solutions)
Loss of diversity: after the population converges, it becomes very uniform (all solutions resemble the best one). Causes:
  too strong selective pressure towards the best solution
  too much exploitation of existing building blocks from the current population (e.g. by recombining them, or mutating them only slightly)

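The two aspects of diversity named above (how many different solutions, and how far apart they are) can be measured directly. The sketch below is only an illustration: it assumes bit-string genotypes and Hamming distance, neither of which is prescribed by the slides, and the function names and sample populations are made up for this example.

```python
from itertools import combinations

def hamming(a, b):
    """Number of positions at which two equal-length genotypes differ."""
    return sum(x != y for x, y in zip(a, b))

def diversity(population):
    """Return (number of distinct genotypes, mean pairwise Hamming distance)."""
    distinct = len(set(population))
    pairs = list(combinations(population, 2))
    if not pairs:
        return distinct, 0.0
    mean_distance = sum(hamming(a, b) for a, b in pairs) / len(pairs)
    return distinct, mean_distance

# Hypothetical populations: a spread-out one and a nearly uniform one.
initial = ["0000", "1111", "0101", "1010"]
converged = ["1111", "1111", "1110", "1111"]

print(diversity(initial))
print(diversity(converged))  # (2, 0.5)
```

A converged population shows up as a small number of distinct genotypes and a mean distance close to zero.
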
Premature Convergence
Fighting premature convergence:
  Restart from scratch (as a last resort)
  Maintain diversity (but may slow down optimization)
    – Decrease selection pressure
    – Random immigrants: insert new random individuals periodically
    – Penalize similarity [Miller1996]:
        Crowding: similar individuals are more likely to die to make room for new ones
        Sharing: similar individuals “share” fitness (fitness gets reduced in proportion to the number of similar individuals)

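Sharing can be sketched as follows. This uses one common textbook formulation (a triangular sharing function and a niche count), which is an assumption here and not necessarily the dynamic niche sharing variant of [Miller1996]. Real-valued genotypes, the niche radius `sigma`, and all names are hypothetical choices for the example.

```python
def sharing(distance, sigma, alpha=1.0):
    """Sharing function: 1 at distance 0, falling to 0 at the niche radius sigma."""
    if distance >= sigma:
        return 0.0
    return 1.0 - (distance / sigma) ** alpha

def shared_fitness(population, raw_fitness, sigma):
    """Divide each raw fitness by its niche count (sum of sharing values).

    The niche count is at least 1 because every individual is at
    distance 0 from itself.
    """
    result = []
    for x in population:
        niche_count = sum(sharing(abs(x - y), sigma) for y in population)
        result.append(raw_fitness(x) / niche_count)
    return result

# Hypothetical population: three individuals crowded on one peak, one alone.
population = [1.0, 1.0, 1.0, 5.0]
print(shared_fitness(population, lambda x: 10.0, sigma=1.0))
```

With equal raw fitness, each of the three crowded individuals keeps a third of its fitness while the isolated one keeps all of it, so selection is pushed towards less populated regions.
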
Ruggedness
[Figure: two surface plots (axes x, y, z): multimodal fitness landscape and rugged fitness landscape]
Rugged fitness landscape: multimodal with steep ascents and descents; the optimization algorithm has trouble finding reliable gradient information to follow

Ruggedness
A typical cause of ruggedness: weak causality
  Strong causality: small changes in the genotype lead to small changes in fitness (ideal)
  Weak causality: a small change in the genotype may lead to a large or unpredictable change in fitness
    – a small mutation may convert a very good solution into a very bad one, and vice versa
    – optimization becomes erratic, may still work but very slowly
Mitigating the effects of ruggedness:
  Large populations, high diversity
  Change the genotype representation for a smoother genotype-phenotype-fitness map

Deceptiveness
The gradient leads the optimizer away from the optimum
Consequence: the optimizer may perform worse than a random walk
No effective countermeasures
Palliative solutions: large populations, high diversity, increase causality by grouping related genes
[Figure: plot of f(x) against x with the global optimum marked]

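A standard toy example of deceptiveness is a "trap" function on bit strings. It is not taken from the slides; it is used here only to illustrate a gradient that points away from the optimum. The greedy hill climber and all names are hypothetical.

```python
def trap(bits):
    """Deceptive trap: the optimum is all ones, but every other point
    scores higher the more zeros it contains."""
    n = len(bits)
    ones = sum(bits)
    return n if ones == n else n - 1 - ones

def hill_climb(bits, fitness):
    """Greedy single-bit-flip ascent until no neighbor is strictly better."""
    bits = list(bits)
    while True:
        best, best_fit = None, fitness(bits)
        for i in range(len(bits)):
            neighbor = bits[:i] + [1 - bits[i]] + bits[i + 1:]
            f = fitness(neighbor)
            if f > best_fit:
                best, best_fit = neighbor, f
        if best is None:
            return bits
        bits = best

start = [1, 1, 0, 1, 0]
end = hill_climb(start, trap)
print(end, trap(end))   # [0, 0, 0, 0, 0] 4
print(trap([1] * 5))    # 5
```

Starting with three of five bits already correct, the climber removes them one by one and ends at all zeros, the point farthest from the global optimum.
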
Neutrality
A neutral change (e.g. neutral mutation) is a transformation in the genotype that produces no change in fitness
Degree of neutrality:
  of a genotype: fraction of neutral results among all possible (1-step) changes that can be applied to it
  of a region of the search space: average neutrality of the genotypes within this region

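Both definitions translate directly into code. The sketch below assumes bit-tuple genotypes and takes single-bit flips as the "1-step changes"; the example fitness function, in which only the first two genes matter, is hypothetical.

```python
def degree_of_neutrality(genotype, fitness):
    """Fraction of single-bit flips that leave the fitness unchanged."""
    base = fitness(genotype)
    neutral = 0
    for i in range(len(genotype)):
        mutant = genotype[:i] + (1 - genotype[i],) + genotype[i + 1:]
        if fitness(mutant) == base:
            neutral += 1
    return neutral / len(genotype)

def region_neutrality(genotypes, fitness):
    """Average degree of neutrality over the genotypes of a region."""
    return sum(degree_of_neutrality(g, fitness) for g in genotypes) / len(genotypes)

def first_two_genes(genotype):
    """Hypothetical fitness: only the first two genes have any effect."""
    return genotype[0] + genotype[1]

print(degree_of_neutrality((1, 0, 1, 1), first_two_genes))  # 0.5
```
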
Neutrality
Example:
[Figure: plot of f(x) against x showing neutral genotypes, a neutral region, and neutral changes (e.g. mutation)]

Neutrality and Evolvability
Evolvability:
  in biology: ability to generate heritable and selectable phenotypic variation
  in optimization: ability to produce new, fitter solutions
Neutrality has positive and negative influences on evolvability:
  positive: it may help to avoid “death valleys” of poor solutions: neutral changes accumulate, until enough changes result in a beneficial outcome
    – punctuated equilibria in biology: long periods of stasis, followed by short periods of rapid phenotypic evolution
  negative: it may slow down convergence: within the neutral region, the algorithm has no hint about how to make progress

Neutrality Bridges
[Figure from [Weise2009], three panels: premature convergence, small neutral bridge, wide neutral bridge]

Overfitting
Overfitting: emergence of an overly complicated solution that tries to fit as much of the training data as possible
Typical cause: noise in the measured data used as training set
Example, in symbolic regression:
[Figure: three plots of f(x) against x: original function, measured data (with noise), overfitted result]

Overfitting
Consequence: loss of generality: the solution generated is too specific to the training data (it includes the noise as part of the solution)
Generality: a solution is general if it is valid not only for the training samples, but also for all the different inputs that it should face
Countermeasures:
  favor simpler solutions
  use larger and randomized training subsets, with repeated testing

Oversimplification
Opposite of overfitting: solutions obtained are too simple
Causes:
  Incomplete training set, not sufficiently representative of the problem to be solved
  Premature convergence due to ruggedness, deceptiveness
Solution: careful analysis of the problem space and design of the solution representation
[Figure: three plots of f(x) against x: original function, measured data, oversimplified result]

No Free Lunch Theorem
Wolpert and Macready, 1997: No Free Lunch (NFL) Theorem(s)
  averaged over all problems, all search algorithms have the same performance
  or: if an algorithm performs well for a certain category of problems, it must perform poorly for other problems
Performance improvements often rely on more knowledge about the problem domain (e.g. assume strong causality, or a certain degree of ruggedness)

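For reference, the first of the NFL theorems is usually written as below. The notation is that of Wolpert and Macready and does not appear on the slides: f ranges over all cost functions on a finite search space, a_1 and a_2 are any two search algorithms, m is the number of distinct points evaluated, and d_m^y is the sequence of cost values observed at those points.

```latex
\sum_{f} P\left(d_m^y \mid f, m, a_1\right) \;=\; \sum_{f} P\left(d_m^y \mid f, m, a_2\right)
```

Any performance measure that depends only on the observed cost values therefore has the same average over all f for both algorithms.
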
Other Issues
Epistasis and Pleiotropy
  Epistasis: interaction between different genes
  Pleiotropy: a single gene influences multiple traits
  In GP: one gene (e.g. program segment) influences other genes (e.g. code executed afterwards): a mutation may have a cascade effect, leading to weak causality
Multi-Objective Optimization
  multiple, possibly contradictory objectives to be pursued simultaneously
  must find a balance among them: notion of “better” replaced by a notion of “dominant” solution

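The notion of a dominant solution can be made concrete with the usual Pareto dominance test. The sketch assumes that all objectives are to be minimized; the sample points, read as (cost, delay) pairs, are invented for the example.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (cost, delay) trade-offs.
candidates = [(1, 5), (2, 2), (3, 3), (5, 1)]
print(pareto_front(candidates))  # [(1, 5), (2, 2), (5, 1)]
```

Here (3, 3) is dropped because (2, 2) is better in both objectives, while the remaining three points are mutually incomparable trade-offs.
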
Optimization in Dynamic Environments
Motivation:
  dynamic applications: continuously changing environment
    – delivery scheduling, vehicle routing, greenhouse control...
  autonomic environments:
    – detect and respond to changes, continuous self-optimization
Dynamic Optimization: the algorithm should continuously track the optimum in the presence of dynamic changes and uncertainties
  keep performance under (small) changes
  adjust quickly to changes

Optimization in Dynamic Environments
Challenges: change and uncertainty
  noise or errors in fitness function calculation or approximation
  changes in environmental parameters (e.g. in a wireless network: number of nodes, weather conditions or obstacles that may affect transmissions)
  change in desired optimum, i.e. change in fitness function
Re-optimizing (starting from scratch) is expensive
Crucial to keep diversity: if the optimum changes, the population must be able to re-adapt, which requires diversity in the population

Optimization in Dynamic Environments
In a dynamic environment, convergence to a given optimum is a problem: how to readapt to a new optimum?
Solutions:
  Restart from scratch (last resort if changes are too severe)
  Recreate diversity after change: randomization, e.g. hypermutation (but: may destroy previous information)
  Maintain diversity: e.g. random immigrants, sentinels
    – random immigrants: insert new random individuals periodically
    – sentinels: keep some individuals at fixed locations
    – but: slows down convergence

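Random immigrants and hypermutation can be sketched as two small operators. The slides do not say which individuals the immigrants replace or how strong hypermutation should be. Replacing the worst individuals, the bit-string encoding, and the rates below are assumptions made for illustration.

```python
import random

def random_individual(length, rng):
    """A fresh random bit-string genotype."""
    return [rng.randint(0, 1) for _ in range(length)]

def random_immigrants(population, fitness, fraction, rng):
    """Replace the worst `fraction` of the population with random individuals
    (assumption: immigrants displace the least fit)."""
    n_new = int(len(population) * fraction)
    if n_new == 0:
        return list(population)
    survivors = sorted(population, key=fitness, reverse=True)[:len(population) - n_new]
    length = len(population[0])
    return survivors + [random_individual(length, rng) for _ in range(n_new)]

def mutate(individual, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in individual]

def hypermutation(population, rng, rate=0.3):
    """After a detected change, mutate everyone at a much higher rate than usual."""
    return [mutate(ind, rate, rng) for ind in population]

rng = random.Random(0)
population = [[1] * 8 for _ in range(10)]               # fully converged
population = random_immigrants(population, sum, 0.2, rng)  # applied periodically
population = hypermutation(population, rng)             # applied once after a change
```

Random immigrants pay a small cost in every period, whereas hypermutation costs nothing until a change is detected but then disturbs the whole population.
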
Optimization in Dynamic Environments
Solutions (cont.):
  Memory-enhanced algorithms: "remember" previous optima, in case they come back
    – implicit memory: redundant genetic representation (e.g. diploid)
    – explicit memory: explicitly store and retrieve information from memory
        when the problem changes: retrieve a suitable solution from memory
        more successful overall than implicit memory [Jin2005]
    – both are only useful in combination with diversity keeping
        if there is no diversity in memory, then the memory is not so useful

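An explicit memory can be as simple as a bounded store of past good solutions that is re-evaluated whenever the problem changes. The class below is a minimal sketch under that assumption. The oldest-first replacement policy, the class name, and the alternating-target example are hypothetical and not taken from [Jin2005].

```python
class ExplicitMemory:
    """Bounded store of previously good solutions."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []

    def store(self, individual):
        """Remember an individual; drop the oldest entry when full."""
        if individual in self.items:
            return  # storing duplicates would reduce diversity in memory
        if len(self.items) >= self.capacity:
            self.items.pop(0)
        self.items.append(individual)

    def retrieve(self, fitness):
        """Return the stored individual that is best under the current fitness."""
        if not self.items:
            return None
        return max(self.items, key=fitness)

# Hypothetical environment whose optimum alternates between x = 2 and x = 8.
memory = ExplicitMemory(capacity=4)
memory.store(2.0)   # best solution found while the optimum was at 2
memory.store(8.0)   # best solution found while the optimum was at 8
print(memory.retrieve(lambda x: -abs(x - 8.0)))  # 8.0
```

Skipping duplicates in `store` reflects the point above: a memory full of near-identical entries has little to offer when the optimum moves.
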
Optimization in Dynamic Environments
Solutions (cont.):
  Multi-population approaches: different subpopulations on different peaks, with memory of local optima
    – example of memory combined with diversity
    – approaches:
        self-organizing scouts [Branke2000]
        multi-national GA [Ursem2000]
  Anticipation and prediction [Bosman2005]
    – the system tries to predict future consequences of current decisions
    – estimate expected values given a probability distribution

Summary
Solving problems with evolutionary computation involves a number of design choices:
  Genotype representation for candidate solutions:
    – string, tree, graph, multiset (chemistry),...
  Phenotype representation:
    – same as genotype?
    – or indirect encoding (e.g. grammatical evolution) with genotype-phenotype map?
  Choice of reproduction, variation, fitness evaluation and selection mechanisms
    – strike a balance between exploration and exploitation
  Performance considerations

Summary
Performance considerations:
  prevent premature convergence
  keep diversity (especially in multimodal landscapes and dynamic environments)
  face and exploit neutrality
  deal with noisy fitness (e.g. in dynamic environments, avoid overfitting)
Not covered:
  co-evolution: different species (tasks) interact, have an impact on each other’s evolution
    – competitive relation, e.g. host-parasite
    – cooperative relation, e.g. symbiosis

References
[Weise2009] T. Weise, M. Zapf, R. Chiong, and A. J. Nebro. “Why Is Optimization Difficult?” Nature-Inspired Algorithms for Optimisation, Studies in Computational Intelligence, volume 193, chapter 11, pages 1–50. Springer,
[Miller1996] B. L. Miller, M. J. Shaw. “Genetic algorithms with dynamic niche sharing for multimodal function optimization”. Proc. IEEE International Conference on Evolutionary Computation, Nagoya, Japan, May
[Jin2005] Y. Jin and J. Branke. “Evolutionary Optimization in Uncertain Environments - A Survey”. IEEE Transactions on Evolutionary Computation, 9(3):303–317, Jun