Experimentation with Evolutionary Computing A.E. Eiben Free University Amsterdam with special thanks to Ben Paechter.

Slides:

Advertisements

Similar presentations

Genetic Algorithms Chapter 3. A.E. Eiben and J.E. Smith, Introduction to Evolutionary Computing Genetic Algorithms GA Quick Overview Developed: USA in.

Advertisements

1 An Adaptive GA for Multi Objective Flexible Manufacturing Systems A. Younes, H. Ghenniwa, S. Areibi uoguelph.ca.

Parameter control Chapter 8. A.E. Eiben and J.E. Smith, Introduction to Evolutionary Computing Parameter Control in EAs 2 Motivation 1 An EA has many.

Parameter Control A.E. Eiben and J.E. Smith, Introduction to Evolutionary Computing Chapter 8.

G. Alonso, D. Kossmann Systems Group

EvoNet Flying Circus Introduction to Evolutionary Computation Brought to you by (insert your name) The EvoNet Training Committee The EvoNet Flying Circus.

Evolutionary Computing A Practical Introduction Presented by Ben Paechter Napier University with thanks to the EvoNet Training Committee and its “Flying.

Online Performance Auditing Using Hot Optimizations Without Getting Burned Jeremy Lau (UCSD, IBM) Matthew Arnold (IBM) Michael Hind (IBM) Brad Calder (UCSD)

Chapter 7 Sampling Distributions

1 Lecture 8: Genetic Algorithms Contents : Miming nature The steps of the algorithm –Coosing parents –Reproduction –Mutation Deeper in GA –Stochastic Universal.

Estimation of Distribution Algorithms Ata Kaban School of Computer Science The University of Birmingham.

Working with Evolutionary Algorithms Chapter Issues considered Experiment design Algorithm design Test problems Measurements and statistics Some.

Evolutionary Computational Intelligence Lecture 10a: Surrogate Assisted Ferrante Neri University of Jyväskylä.

EvoNet Flying Circus Introduction to Evolutionary Computation Brought to you by (insert your name) The EvoNet Training Committee The EvoNet Flying Circus.

Improved Crowd Control Utilizing a Distributed Genetic Algorithm John Chaloupek December 3 rd, 2003.

A new crossover technique in Genetic Programming Janet Clegg Intelligent Systems Group Electronics Department.

Lecture 24: Thurs. Dec. 4 Extra sum of squares F-tests (10.3) R-squared statistic (10.4.1) Residual plots (11.2) Influential observations (11.3,

CMP3265 – Professional Issues and Research Methods Research Proposals: n Aims and objectives, Method, Evaluation n Yesterday – aims and objectives: clear,

Evolutionary Computation Application Peter Andras peter.andras/lectures.

How to Build an Evolutionary Algorithm

Experimental Evaluation

Applying Multi-Criteria Optimisation to Develop Cognitive Models Peter Lane University of Hertfordshire Fernand Gobet Brunel University.

IE 594 : Research Methodology – Discrete Event Simulation David S. Kim Spring 2009.

Constraint handling A.E. Eiben and J.E. Smith, Introduction to Evolutionary Computing Chapter 14.

Genetic Algorithms: A Tutorial

Water Contamination Detection – Methodology and Empirical Results IPN-ISRAEL WATER WEEK (I 2 W 2 ) Eyal Brill Holon institute of Technology, Faculty of.

Prepared by Barış GÖKÇE 1.  Search Methods  Evolutionary Algorithms (EA)  Characteristics of EAs  Genetic Programming (GP)  Evolutionary Programming.

Genetic Algorithm.

Project.  Topic should be: Clear and specific Practical and meaningful, this means the results of your research must have some implications in real life.

1 Paper Review for ENGG6140 Memetic Algorithms By: Jin Zeng Shaun Wang School of Engineering University of Guelph Mar. 18, 2002.

Evaluation of software engineering. Software engineering research : Research in SE aims to achieve two main goals: 1) To increase the knowledge about.

SOFT COMPUTING (Optimization Techniques using GA) Dr. N.Uma Maheswari Professor/CSE PSNA CET.

Introduction: Why statistics? Petter Mostad

Evolution Strategies Evolutionary Programming Genetic Programming Michael J. Watts

Lecture 8: 24/5/1435 Genetic Algorithms Lecturer/ Kawther Abas 363CS – Artificial Intelligence.

© 2008 McGraw-Hill Higher Education The Statistical Imagination Chapter 10. Hypothesis Testing II: Single-Sample Hypothesis Tests: Establishing the Representativeness.

Genetic Algorithms Michael J. Watts

What is Genetic Programming? Genetic programming is a model of programming which uses the ideas (and some of the terminology) of biological evolution to.

L ECTURE 3 Notes are modified from EvoNet Flying Circus slides. Found at: www2.cs.uh.edu/~ceick/ai/EC1.ppt.

Applying Genetic Algorithm to the Knapsack Problem Qi Su ECE 539 Spring 2001 Course Project.

1 “Genetic Algorithms are good at taking large, potentially huge search spaces and navigating them, looking for optimal combinations of things, solutions.

GENETIC ALGORITHM A biologically inspired model of intelligence and the principles of biological evolution are applied to find solutions to difficult problems.

Working with Evolutionary Algorithms Chapter 14. A.E. Eiben and J.E. Smith, Introduction to Evolutionary Computing Working with EAs 2 Issues considered.

Basic Business Statistics, 10e © 2006 Prentice-Hall, Inc.. Chap 7-1 Chapter 7 Sampling Distributions Basic Business Statistics.

Statistics : Statistical Inference Krishna.V.Palem Kenneth and Audrey Kennedy Professor of Computing Department of Computer Science, Rice University 1.

AP Statistics Chapter 24 Comparing Means.

Simulation & Confidence Intervals COMP5416 Advanced Network Technologies.

Evolutionary Programming

1 OUTPUT ANALYSIS FOR SIMULATIONS. 2 Introduction Analysis of One System Terminating vs. Steady-State Simulations Analysis of Terminating Simulations.

Presenting and Analysing your Data CSCI 6620 Spring 2014 Thesis Projects: Chapter 10 CSCI 6620 Spring 2014 Thesis Projects: Chapter 10.

Experimentation in Computer Science (Part 2). Experimentation in Software Engineering --- Outline  Empirical Strategies  Measurement  Experiment Process.

1 Running Experiments for Your Term Projects Dana S. Nau CMSC 722, AI Planning University of Maryland Lecture slides for Automated Planning: Theory and.

Optimization Problems

© Copyright McGraw-Hill 2004

Advanced MMBR Conjoint analysis (1). Advanced Methods and Models in Behavioral Research Conjoint analysis -> Multi-level models You have to understand:

D Nagesh Kumar, IIScOptimization Methods: M8L5 1 Advanced Topics in Optimization Evolutionary Algorithms for Optimization and Search.

Selection and Recombination Temi avanzati di Intelligenza Artificiale - Lecture 4 Prof. Vincenzo Cutello Department of Mathematics and Computer Science.

Evolutionary Programming A.E. Eiben and J.E. Smith, Introduction to Evolutionary Computing Chapter 5.

Working with Evolutionary Algorithms Chapter Issues considered Experiment design Algorithm design Test problems Measurements and statistics Some.

EVOLUTIONARY SYSTEMS AND GENETIC ALGORITHMS NAME: AKSHITKUMAR PATEL STUDENT ID: GRAD POSITION PAPER.

Genetic Algorithms An Evolutionary Approach to Problem Solving.

 Presented By: Abdul Aziz Ghazi  Roll No:  Presented to: Sir Harris.

Daniil Chivilikhin and Vladimir Ulyantsev

Who cares about implementation and precision?

Advanced Artificial Intelligence Evolutionary Search Algorithm

Professor S K Dubey,VSM Amity School of Business

Working with Evolutionary Algorithms

Putting It All Together: Which Method Do I Use?

Genetic Algorithms Chapter 3.

Presentation transcript:

Experimentation with Evolutionary Computing A.E. Eiben Free University Amsterdam with special thanks to Ben Paechter

A.E. Eiben, Experimentation with EC2 EvoNet Summer School 2002 Issues considered Experiment design Algorithm design Test problems Measurements and statistics Some tips and summary

A.E. Eiben, Experimentation with EC3 EvoNet Summer School 2002 Experimentation Has a goal or goals Involves algorithm design and implementation Needs problem(s) to run the algorithm(s) on Amounts to running the algorithm(s) on the problem(s) Delivers measurement data, the results Is concluded with evaluating the results in the light of the given goal(s) Is often documented (see tutorial on paper writing)

A.E. Eiben, Experimentation with EC4 EvoNet Summer School 2002 Goals for experimentation Get a good solution for a given problem Show that EC is applicable in a (new) problem domain Show that my_EA is better than benchmark_EA Show that EAs outperform traditional algorithms (sic!) Find best setup for parameters of a given algorithm Understand algorithm behavior (e.g. pop dynamics) See how an EA scales-up with problem size See how performance is influenced by parameters of the problem of the algorithm …

A.E. Eiben, Experimentation with EC5 EvoNet Summer School 2002 Example: Production Perspective Optimising Internet shopping delivery route Different destinations each day Limited time to run algorithm each day Must always be reasonably good route in limited time

A.E. Eiben, Experimentation with EC6 EvoNet Summer School 2002 Example: Design Perspective Optimising spending on improvements to national road network –Total cost: billions of Euro –Computing costs negligible –Six months to run algorithm on hundreds computers –Many runs possible –Must produce very good result just once

A.E. Eiben, Experimentation with EC7 EvoNet Summer School 2002 Perspectives of goals Design perspective: find a very good solution at least once Production perspective: find a good solution at almost every run also Publication perspective: must meet scientific standards (huh?) Application perspective: good enough is good enough (verification!) These perspectives have very different implications on evaluating the results (yet often left implicit)

A.E. Eiben, Experimentation with EC8 EvoNet Summer School 2002 Algorithm design Design a representation Design a way of mapping a genotype to a phenotype Design a way of evaluating an individual Design suitable mutation operator(s) Design suitable recombination operator(s) Decide how to select individuals to be parents Decide how to select individuals for the next generation (how to manage the population) Decide how to start: initialisation method Decide how to stop: termination criterion

A.E. Eiben, Experimentation with EC9 EvoNet Summer School 2002 Algorithm design (cont’d) For a detailed treatment see Ben Paechter’s lecture from the 2001 Summer School:

A.E. Eiben, Experimentation with EC10 EvoNet Summer School 2002 Test problems 5 DeJong functions 25 “hard” objective functions Frequently encountered or otherwise important variants of given practical problem Selection from recognized benchmark problem repository (“challenging” by being NP--- ?!) Problem instances made by random generator Choice has severe implications on generalizability and scope of the results

A.E. Eiben, Experimentation with EC11 EvoNet Summer School 2002 Bad example I invented “tricky mutation” Showed that it is a good idea by: Running standard (?) GA and tricky GA On 10 objective functions from the literature Finding tricky GA better on 7, equal on 1, worse on 2 cases I wrote it down in a paper And it got published! Q: what did I learned from this experience? Q: is this good work?

A.E. Eiben, Experimentation with EC12 EvoNet Summer School 2002 Bad example (cont’d) What did I (my readers) did not learn: How relevant are these results (test functions)? What is the scope of claims about the superiority of the tricky GA? Is there a property distinguishing the 7 good and the 2 bad functions? Are my results generalizable? (Is the tricky GA applicable for other problems? Which ones?)

A.E. Eiben, Experimentation with EC13 EvoNet Summer School 2002 Getting Problem Instances 1 Testing on real data Advantages: Results could be considered as very relevant viewed from the application domain (data supplier) Disadvantages Can be over-complicated Can be few available sets of real data May be commercial sensitive – difficult to publish and to allow others to compare Results are hard to generalize

A.E. Eiben, Experimentation with EC14 EvoNet Summer School 2002 Getting Problem Instances 2 Standard data sets in problem repositories, e.g.: OR-Library UCI Machine Learning Repository Advantage: Well-chosen problems and instances (hopefully) Much other work on these  results comparable Disadvantage: Not real – might miss crucial aspect Algorithms get tuned for popular test suites

A.E. Eiben, Experimentation with EC15 EvoNet Summer School 2002 Getting Problem Instances 3 Problem instance generators produce simulated data for given parameters, e.g.: GA/EA Repository of Test Problem Generators Advantage: Allow very systematic comparisons for they can produce many instances with the same characteristics enable gradual traversion of a range of characteristics (hardness) Can be shared allowing comparisons with other researchers Disadvantage Not real – might miss crucial aspect Given generator might have hidden bias

A.E. Eiben, Experimentation with EC16 EvoNet Summer School 2002 Basic rules of experimentation EAs are stochastic  never draw any conclusion from a single run perform sufficient number of independent runs use statistical measures (averages, standard deviations) use statistical tests to assess reliability of conclusions EA experimentation is about comparison  always do a fair competition use the same amount of resources for the competitors try different comp. limits (to coop with turtle/hare effect) use the same performance measures

A.E. Eiben, Experimentation with EC17 EvoNet Summer School 2002 Things to Measure Many different ways. Examples: Average result in given time Average time for given result Proportion of runs within % of target Best result over n runs Amount of computing required to reach target in given time with % confidence …

A.E. Eiben, Experimentation with EC18 EvoNet Summer School 2002 What time units do we use? Elapsed time? Depends on computer, network, etc… CPU Time? Depends on skill of programmer, implementation, etc… Generations? Difficult to compare when parameters like population size change Evaluations? Evaluation time could depend on algorithm, e.g. direct vs. indirect representation

A.E. Eiben, Experimentation with EC19 EvoNet Summer School 2002 Measures Performance measures (off-line) Efficiency (alg. speed) CPU time No. of steps, i.e., generated points in the search space Effectivity (alg. quality) Success rate Solution quality at termination “Working” measures (on-line) Population distribution (genotypic) Fitness distribution (phenotypic) Improvements per time unit or per genetic operator …

A.E. Eiben, Experimentation with EC20 EvoNet Summer School 2002 Performance measures No. of generated points in the search space = no. of fitness evaluations (don’t use no. of generations!) AES: average no. of evaluations to solution SR: success rate = % of runs finding a solution (individual with acceptabe quality / fitness) MBF: mean best fitness at termination, i.e., best per run, mean over a set of runs SR  MBF Low SR, high MBF: good approximizer (more time helps?) High SR, low MBF: “Murphy” algorithm

A.E. Eiben, Experimentation with EC21 EvoNet Summer School 2002 Fair experiments Basic rule: use the same computational limit for each competitor Allow each EA the same no. of evaluations, but Beware of hidden labour, e.g. in heuristic mutation operators Beware of possibly fewer evaluations by smart operators EA vs. heuristic: allow the same no. of steps: Defining “step” is crucial, might imply bias! Scale-up comparisons eliminate this bias

A.E. Eiben, Experimentation with EC22 EvoNet Summer School 2002 Example: off-line performance measure evaluation Which algorith is better? Why? When?

A.E. Eiben, Experimentation with EC23 EvoNet Summer School 2002 Example: on-line performance measure evaluation Populations mean (best) fitness Which algorith is better? Why? When? Algorithm B Algorithm A

A.E. Eiben, Experimentation with EC24 EvoNet Summer School 2002 Example: averaging on-line measures time Run 2 Run 1 average Averaging can “choke” interesting onformation

A.E. Eiben, Experimentation with EC25 EvoNet Summer School 2002 Example: overlaying on-line measures time Overlay of curves can lead to very “cloudy” figures

A.E. Eiben, Experimentation with EC26 EvoNet Summer School 2002 Statistical Comparisons and Significance Algorithms are stochastic Results have element of “luck” Sometimes can get away with less rigour – e.g. parameter tuning For scientific papers where a claim is made: “Newbie recombination is better ran uniform crossover”, need to show statistical significance of comparisons

A.E. Eiben, Experimentation with EC27 EvoNet Summer School 2002 Example Is the new method better?

A.E. Eiben, Experimentation with EC28 EvoNet Summer School 2002 Example (cont’d) Standard deviations supply additional info T-test (and alike) indicate the chance that the values came from the same underlying distribution (difference is due to random effetcs) E.g. with 7% chance in this example.

A.E. Eiben, Experimentation with EC29 EvoNet Summer School 2002 Statistical tests T-test assummes: Data taken from continuous interval or close approximation Normal distribution Similar variances for too few data points Similar sized groups of data points Other tests: Wilcoxon – preferred to t-test where numbers are small or distribution is not known. F-test – tests if two samples have different variances.

A.E. Eiben, Experimentation with EC30 EvoNet Summer School 2002 Statistical Resources Microsoft Excel

A.E. Eiben, Experimentation with EC31 EvoNet Summer School 2002 Better example: problem setting I invented myEA for problem X Looked and found 3 other EAs and a traditional benchmark heuristic for problem X in the literature Asked myself when and why is myEA better

A.E. Eiben, Experimentation with EC32 EvoNet Summer School 2002 Better example: experiments Found/made problem instance generator for problem X with 2 parameters: n (problem size) k (some problem specific indicator) Selected 5 values for k and 5 values for n Generated 100 problem instances for all combinations Executed all alg’s on each instance 100 times (benchmark was also stochastic) Recorded AES, SR, MBF values w/ same comp. limit (AES for benchmark?) Put my program code and the instances on the Web

A.E. Eiben, Experimentation with EC33 EvoNet Summer School 2002 Better example: evaluation Arranged results “in 3D” (n,k) + performance (with special attention to the effect of n, as for scale-up) Assessed statistical significance of results Found the niche for my_EA: Weak in … cases, strong in cases, comparable otherwise Thereby I answered the “when question” Analyzed the specific features and the niches of each algorithm thus answering the “why question” Learned a lot about problem X and its solvers Achieved generalizable results, or at least claims with well-identified scope based on solid data Facilitated reproducing my results  further research

A.E. Eiben, Experimentation with EC34 EvoNet Summer School 2002 Some tips Be organized Decide what you want Define appropriate measures Choose test problems carefully Make an experiment plan (estimate time when possible) Perform sufficient number of runs Keep all experimental data (never throw away anything) Use good statistics (“standard” tools from Web, MS) Present results well (figures, graphs, tables, …) Watch the scope of your claims Aim at generalizable results Publish code for reproducibility of results (if applicable)

A.E. Eiben, Experimentation with EC35 EvoNet Summer School 2002 Summary Experimental methodology in EC is weak Lack of strong selection pressure for publications Laziness (seniors), copycat behavior (novices) Not much learning from other fields actively using better methodology, e.g., machine learning (training-test instances) social sciences! (statistics) Not much effort into better methodologies better test suites reproducible results (code standardization) Much room for improvement: do it!