Evolution of descent directions
Alejandro Sierra, Escuela Politécnica Superior, Universidad Autónoma de Madrid
Iván Santibáñez Koref, Bionik und Evolutionstechnik, Technical University of Berlin
Outline
- Estimation of distribution algorithms (EDA)
- A naive EDA
- Beyond the naive EDA: IDEA, MBOA, CMA-ES
- Classical optimization algorithms
- Evolution of descent directions (ED²)
Estimation of distribution algorithms
An EDA is an optimization algorithm that samples from a probability density function (pdf). The pdf is updated in an evolutionary way: a population of samples is drawn, and the best representatives are used to update the parameters of the pdf.
A naive EDA
Initialization of each Normal pdf: random means, standard deviations = 1.
Repeat until a good solution is found:
- Take λ samples from the product of the univariate Normal pdfs.
- Recalculate the means and deviations from the μ best samples (μ << λ).
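The naive EDA can be sketched as follows. The population sizes, initialization range, generation count, and the sphere test function are all illustrative choices, not settings from the talk:

```python
import numpy as np

def naive_eda(f, dim, lam=50, mu=10, generations=200, seed=0):
    """Naive EDA: one independent Normal pdf per variable."""
    rng = np.random.default_rng(seed)
    means = rng.uniform(-2, 2, dim)  # random initial means (range is illustrative)
    stds = np.ones(dim)              # initial standard deviations = 1
    for _ in range(generations):
        # Take lambda samples from the product of univariate Normals.
        pop = rng.normal(means, stds, size=(lam, dim))
        # Keep the mu best samples (mu << lambda).
        elite = pop[np.argsort([f(x) for x in pop])[:mu]]
        # Recalculate means and deviations from the elite.
        means = elite.mean(axis=0)
        stds = elite.std(axis=0) + 1e-12  # tiny floor avoids total collapse
    return means

# Example: minimize the separable sphere function.
sol = naive_eda(lambda x: float(np.sum(x**2)), dim=5)
```

Because each variable has its own independent Normal, this sketch works well on separable problems like the sphere but cannot model dependencies between variables, which is exactly the limitation discussed next.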
Beyond the naive EDA
Two ways out:
- Use a full multidimensional Normal distribution (CMA-ES).
- Use Bayesian networks to learn more complex joint probability relationships (IDEA, MBOA).
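A quick numerical check of why the full multivariate Normal matters (the 0.9 correlation value and sample size are just an illustration): a product of univariate Normals always produces uncorrelated samples, while a full covariance matrix, as used by CMA-ES, can capture dependencies between variables.

```python
import numpy as np

rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.9],
                [0.9, 1.0]])  # strong correlation between the two variables
# Full multivariate Normal: samples reproduce the correlation.
full = rng.multivariate_normal(np.zeros(2), cov, size=10000)
# Product of independent univariate Normals: no correlation possible.
naive = rng.normal(0.0, 1.0, size=(10000, 2))

r_full = np.corrcoef(full.T)[0, 1]   # close to 0.9
r_naive = np.corrcoef(naive.T)[0, 1] # close to 0
```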
IDEA and MBOA have very heavy learning procedures. We would like to keep the naive approach without giving up variable dependencies, taking classical minimization algorithms as inspiration.
Classical optimization algorithms
Classical minimization of a function f(x):
1. Generate a random point x.
2. Generate a random direction v.
3. Run a line minimization algorithm to find the step λ_v along v.
4. Update x.
5. Update v and go back to step 3.
[Figures: f(x) with the initial point, and the point found by line minimization.]
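A minimal sketch of this loop, assuming a golden-section search as the line minimization algorithm and the sphere function as a stand-in objective (both are illustrative choices, not taken from the talk):

```python
import numpy as np

def line_minimize(f, x, v, lo=-5.0, hi=5.0, iters=60):
    """Golden-section search for the step t minimizing f(x + t*v).
    Assumes f is unimodal along the line within [lo, hi]."""
    phi = (np.sqrt(5.0) - 1) / 2
    a, b = lo, hi
    c, d = b - phi * (b - a), a + phi * (b - a)
    for _ in range(iters):
        if f(x + c * v) < f(x + d * v):
            b, d = d, c                  # minimum lies in [a, d]
            c = b - phi * (b - a)
        else:
            a, c = c, d                  # minimum lies in [c, b]
            d = a + phi * (b - a)
    return (a + b) / 2

def classical_minimize(f, dim, steps=200, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, dim)          # 1. generate a random point x
    for _ in range(steps):
        v = rng.normal(size=dim)         # 2./5. generate (update) a direction v
        v /= np.linalg.norm(v)
        t = line_minimize(f, x, v)       # 3. line minimization gives the step
        x = x + t * v                    # 4. update x
    return x

x_min = classical_minimize(lambda x: float(np.sum(x**2)), dim=5)
```

Picking each new direction at random, as above, is the simplest choice for step 5; smarter classical schemes (e.g. conjugate directions) reuse information from previous line minimizations.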
ED²: the naive EDA of descent directions
Directions are sampled from a factorized probability density function. Each direction is a model of the correlation between variables, so a product of Normal distributions is enough.
ED²: the algorithm
The initial Normal distributions are randomly generated and the best point is initialized. Then the following steps are repeated:
- Each direction is used to improve the best point by interpolation.
- The fitness of a direction is the drop in the objective value it produces.
- The pdfs are updated from the best directions, and new directions are sampled from the product of Normal distributions.
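A toy sketch of these steps, assuming one parabolic-interpolation step through t = -1, 0, 1 as the "improvement by interpolation" and an ad hoc floor on the deviations to keep direction diversity; all constants and the sphere objective are illustrative, not the authors' settings:

```python
import numpy as np

def ed2(f, dim, lam=20, mu=5, generations=100, seed=0):
    """Sketch of ED²: a naive per-component EDA over descent directions."""
    rng = np.random.default_rng(seed)
    d_means = rng.normal(size=dim)   # randomly generated initial Normals
    d_stds = np.ones(dim)
    x = rng.uniform(-5, 5, dim)      # initialize the best point
    fx = f(x)
    for _ in range(generations):
        dirs = rng.normal(d_means, d_stds, size=(lam, dim))
        drops = np.empty(lam)
        for i, v in enumerate(dirs):
            # Improve the best point by parabolic interpolation
            # through the three points t = -1, 0, 1 along v.
            fm, fp = f(x - v), f(x + v)
            denom = fm - 2 * fx + fp             # curvature along v
            t = 0.5 * (fm - fp) / denom if denom > 1e-12 else 0.0
            cand = x + t * v
            fc = f(cand)
            drops[i] = fx - fc                   # fitness = drop in f
            if fc < fx:                          # keep the improvement
                x, fx = cand, fc
        # Update the pdfs from the mu best directions; the floor on the
        # deviations (an ad hoc choice) keeps some direction diversity.
        elite = dirs[np.argsort(-drops)[:mu]]
        d_means = elite.mean(axis=0)
        d_stds = np.maximum(elite.std(axis=0), 0.3)
    return x, fx

x_best, f_best = ed2(lambda x: float(np.sum(x**2)), dim=5)
```

Note that only the directions evolve; the best point is improved greedily, and the direction pdf gradually concentrates on directions that produce large drops.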
32
ED2: Results for the cigar function FunctionCMA-ESIDEAMBOAED 2 Cigar1 (3840)4.6122.2 Rotated cigar1 (3840)3821003.7 CMA-ES takes 3840 function evaluations till reaching f(x)=10 -10 IDEA takes 4.6 times more evaluations
33
Conclusions ED 2 : Evolution of descent directions Sampling of directions from a product of normal distributions ED 2 is very fast Future work: More complex line minimization algorithms