Engineering Optimization
Concepts and Applications. Fred van Keulen, Matthijs Langelaar. CLA H21.1
Summary: single-variable methods
0th order: bracketing plus dichotomous, Fibonacci, or golden-ratio sectioning; quadratic interpolation. 1st order: cubic interpolation, bisection method, secant method. 2nd order: Newton's method. And many, many more! In practice, additional "tricks" are needed to deal with multimodality, strong fluctuations, round-off errors, and divergence.
Unconstrained optimization algorithms
Methods are classified as single-variable or multiple-variable, and by order: 0th-order (direct search) methods, and 1st- and 2nd-order (descent) methods.
Test functions
The performance of algorithms is compared through mathematical convergence proofs and through performance on benchmark problems (test functions). Example: Rosenbrock's function ("banana function"), f(x1, x2) = 100 (x2 - x1^2)^2 + (1 - x1)^2, with optimum at (1, 1).
Test functions (2)
A quadratic function with optimum at (1, 3), and a function with many local optima with optimum at (0, 0).
Random methods
Random jumping method (random search): generate random points and keep the best one. Random walk method: generate random unit direction vectors, walk to the new point if it is better, and decrease the step size after N steps.
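A minimal sketch of the random jumping method in Python; the function and the box bounds are illustrative choices, not from the slides:

```python
import random

def random_jumping(f, bounds, n_points=1000, seed=0):
    """Random jumping: sample uniform random points in the box
    `bounds` and keep the best one found."""
    rng = random.Random(seed)
    best_x, best_f = None, float("inf")
    for _ in range(n_points):
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        fx = f(x)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f
```

The random walk variant differs only in that each candidate is generated as a step from the current point rather than sampled over the whole box.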
Simulated annealing (Metropolis algorithm)
A random method inspired by a natural process: annealing. Metal or glass is heated to relieve stresses, then cooled in a controlled way to a state of stable equilibrium with minimal internal stresses. The probability of an internal energy change follows Boltzmann's probability distribution function, where T is the temperature and k is the Boltzmann constant. Note that some chance of an energy increase exists! Simulated annealing is based on this probability concept. The algorithm is named after Nicholas Metropolis.
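The Boltzmann probability referred to above, for an internal energy change ΔE; this is the standard form, reconstructed because the slide's own equation was lost in extraction:

```latex
P(\Delta E) = e^{-\Delta E / (k T)}
```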
Simulated annealing algorithm
1. Set a starting "temperature" T, pick a starting design x, and obtain f(x).
2. Randomly generate a new design y close to x, and obtain f(y).
3. Accept the new design if it is better. If it is worse, generate a random number r in [0, 1], and accept the new design when r < e^(-(f(y) - f(x))/T).
4. Stop if the design has not changed in several steps; otherwise, update (reduce) the temperature and return to step 2.
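The steps above can be sketched as follows; the parameter values and the geometric cooling schedule are illustrative choices, not prescribed by the slides:

```python
import math
import random

def simulated_annealing(f, x0, T0=1.0, cooling=0.95, step=0.5,
                        n_iter=2000, seed=0):
    """SA sketch: random neighbor, Metropolis acceptance test,
    geometric cooling."""
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    best_x, best_f = list(x), fx
    T = T0
    for _ in range(n_iter):
        # step 2: random new design y close to x
        y = [xi + rng.uniform(-step, step) for xi in x]
        fy = f(y)
        # step 3: accept if better; if worse, accept with prob e^(-df/T)
        df = fy - fx
        if df <= 0 or rng.random() < math.exp(-df / T):
            x, fx = y, fy
            if fx < best_f:
                best_x, best_f = list(x), fx
        T *= cooling                     # step 4: reduce temperature
    return best_x, best_f
```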
Simulated annealing (3)
As the temperature reduces, the exponent -(f(y) - f(x))/T of a bad step is negative and becomes increasingly negative, so the probability of accepting that step reduces as well. Accepting bad steps (energy increases) is likely in the initial phase, but less likely at the end. At temperature zero only improvements are accepted, as in the basic random jumping method. Variants: taking several steps before the acceptance test, different cooling schemes, …
Random methods: properties
Very robust: they also work for discontinuous or nondifferentiable functions, and they can find the global minimum. A last resort when all else fails. Simulated annealing is known to perform well on several hard problems (e.g. the traveling salesman problem). Quite inefficient, but usable in an initial stage to determine a promising starting point. Drawback: results are not repeatable, because of the random factor (unless the same pseudo-random number sequence is reused).
Cyclic coordinate search
Search alternately in each coordinate direction, performing a single-variable optimization (line search) along each direction. Because the directions are fixed, convergence can be slow.
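A sketch of cyclic coordinate search; the inner 1-D minimizer here is a crude sampling stand-in for a proper line search:

```python
def cyclic_coordinate_search(f, x0, n_cycles=20):
    """Cyclic coordinate search: repeat a 1-D line search along each
    coordinate direction in turn.  The inner minimizer samples steps
    of 0.01 in [-2, 2] (illustrative only)."""
    x = list(x0)
    for _ in range(n_cycles):
        for i in range(len(x)):
            steps = [k * 0.01 for k in range(-200, 201)]
            # 1-D search: best step along coordinate direction i
            best = min(steps, key=lambda t: f(x[:i] + [x[i] + t] + x[i + 1:]))
            x[i] += best
    return x
```

On a separable quadratic this converges in a couple of cycles; the slide's point is that on functions with strongly coupled variables the fixed axis directions force many short zig-zag cycles.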
Powell's Conjugate Directions method
Adjusting the search directions improves convergence. Idea: after each cycle of line searches, discard the first search direction and append the combined direction of the cycle (the sum of the steps in cycle i) to the end of the direction list for cycle i + 1. Theoretically guaranteed to converge in n cycles for quadratic functions!
Nelder and Mead Simplex method
A simplex is a figure of n + 1 points in R^n. Gradually move toward the minimum by reflecting the worst point of the simplex (figure: simplex vertices with f = 10, 7, 5). For better performance: expansion/contraction and other tricks.
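The reflection step can be sketched as follows; this is reflection only, without the expansion/contraction tricks the slide mentions:

```python
def simplex_reflect_step(f, simplex):
    """One reflection step of a Nelder-Mead-style simplex method:
    reflect the worst vertex through the centroid of the others."""
    simplex = sorted(simplex, key=f)                # best point first
    worst = simplex[-1]
    n = len(worst)
    # centroid of all points except the worst
    centroid = [sum(p[i] for p in simplex[:-1]) / n for i in range(n)]
    # reflect the worst point through the centroid
    reflected = [2 * c - w for c, w in zip(centroid, worst)]
    if f(reflected) < f(worst):
        simplex[-1] = reflected
    return simplex
```

Repeating this step walks the simplex downhill; reflection alone can stagnate, which is why the full method adds expansion, contraction, and shrinking.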
Biologically inspired methods
Popular sources of inspiration for algorithms are biological processes: genetic algorithms / evolutionary optimization, particle swarms / flocks, and ant colony methods (mainly used for discrete problems). These methods typically make use of a population (a collection of designs), are computationally intensive, and have a stochastic nature with global optimization properties.
Genetic algorithms
Based on Darwin's theory of evolution: survival of the fittest, with the objective as fitness function. Designs are encoded in chromosome-like strings (genes), e.g. binary strings encoding the design variables x1 and x2.
GA flowchart
Create an initial population; evaluate the fitness of all individuals; test the termination criteria; select individuals for reproduction; create a new population through reproduction, crossover, and mutation; then repeat, or quit. Termination criteria can be a fixed number of generations, a required fitness level being reached, or no more improvement in X generations.
GA population operators
Reproduction: exact copy/copies of an individual. Crossover: randomly exchange genes of different parents; many variants exist (how many genes, parents, children, …). Mutation: randomly flip some bits of a gene string; used sparingly, but important to explore new designs.
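The crossover and mutation operators can be sketched for bit strings; single-point crossover is one of the many possible variants the slide mentions:

```python
import random

def crossover(parent1, parent2, rng):
    """Single-point crossover: exchange gene segments of two parents."""
    point = rng.randrange(1, len(parent1))    # random crossover point
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

def mutate(genes, rate, rng):
    """Mutation: flip each bit with a small probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in genes]
```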
Population operators
(Figure: single-point crossover of parent 1 and parent 2 produces child 1 and child 2; mutation flips individual bits of a child's string.)
Particle swarms / flocks
No genes and no reproduction, but a population that travels through the design space. Derived from simulations of flocks and schools in nature. Individuals tend to follow the individual with the best fitness value, but also determine their own paths. Some randomness is added to give exploration properties (a "craziness" parameter).
PSO algorithm
PSO = particle swarm optimization. Initialize the locations and speeds of the individuals (randomly); evaluate fitness; update the best scores, individual (y) and overall (Y); then update velocity and position. The velocity update uses random numbers between 0 and 1, and weighting factors that control "social behavior" vs. "individual behavior".
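One iteration of the update can be sketched as below; the inertia weight w and the coefficients c1 and c2 are illustrative values, not taken from the slides:

```python
import random

def pso_step(positions, velocities, personal_best, global_best, f,
             w=0.7, c1=1.5, c2=1.5, rng=None):
    """One particle swarm iteration (sketch).  w is the inertia weight;
    c1 and c2 trade off "individual" vs "social" behavior; r1 and r2
    are random numbers between 0 and 1."""
    rng = rng or random.Random(0)
    for i, (x, v) in enumerate(zip(positions, velocities)):
        for d in range(len(x)):
            r1, r2 = rng.random(), rng.random()
            v[d] = (w * v[d]
                    + c1 * r1 * (personal_best[i][d] - x[d])   # own best
                    + c2 * r2 * (global_best[d] - x[d]))       # swarm best
            x[d] += v[d]
        if f(x) < f(personal_best[i]):          # update best scores
            personal_best[i] = list(x)
            if f(x) < f(global_best):
                global_best = list(x)
    return global_best
```

Positions and velocities are updated in place; the overall best design is returned.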
Summary: 0th-order methods
Nelder-Mead beats Powell in most cases. Robust: most can deal with discontinuities etc. Less attractive for many design variables (> 10). Stochastic techniques are computationally expensive, but have global optimization properties and are versatile. Population-based algorithms benefit from parallel computing.
Unconstrained optimization algorithms
Single-variable methods; multiple-variable methods: 0th order, 1st order, 2nd order.
Steepest descent method
Move in the direction of the largest decrease in f. From the first-order Taylor expansion, the best direction is the negative gradient, -∇f. (Figure: contour example with f = 7.2, f = 1.9, f = 0.044.) With a fixed step size divergence can occur! Remedy: line search.
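A sketch of steepest descent with the slide's remedy, a line search; here a simple backtracking (sufficient-decrease) search with illustrative constants:

```python
def steepest_descent(f, grad_f, x0, n_iter=200):
    """Steepest descent with a backtracking line search."""
    x = list(x0)
    for _ in range(n_iter):
        g = grad_f(x)
        d = [-gi for gi in g]              # steepest descent direction
        g_norm2 = sum(gi * gi for gi in g)
        if g_norm2 < 1e-20:
            break                          # gradient ~ zero: done
        # backtrack: halve alpha until a sufficient decrease is achieved
        alpha, fx = 1.0, f(x)
        while (alpha > 1e-12 and
               f([xi + alpha * di for xi, di in zip(x, d)])
               > fx - 0.5 * alpha * g_norm2):
            alpha *= 0.5
        x = [xi + alpha * di for xi, di in zip(x, d)]
    return x
```

On an ill-conditioned quadratic such as x^2 + 10 y^2 this still exhibits the zig-zag behavior of the next slide, but the line search prevents divergence.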
Steepest descent convergence
Zig-zag convergence behavior (figure).
Effect of scaling
Scaling the variables helps a lot! (Figure: contours in the original coordinates x1, x2 and in scaled coordinates y1, y2.) The ideal scaling is hard to determine, since it requires Hessian information.
Fletcher-Reeves conjugate gradient method
Based on building a set of conjugate directions, combined with line searches. Conjugate directions guarantee convergence in N steps for quadratic problems (recall Powell: N cycles of N line searches).
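Conjugacy is defined with respect to the Hessian A of the quadratic problem; two directions are conjugate when:

```latex
\mathbf{d}_i^{T} \mathbf{A}\, \mathbf{d}_j = 0 \qquad (i \neq j)
```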
Fletcher-Reeves conjugate gradient method (2)
A set of N directions conjugate with respect to the Hessian A (special case: orthogonal directions, the eigenvectors of A). Property: searching along conjugate directions yields the optimum of a quadratic function in N steps (or less). The method was actually invented for solving linear systems of equations derived from quadratic problems, but it is used (successfully) outside that domain.
Conjugate directions
Express the step to the optimum in conjugate coordinates b_i along the directions d_i. In an optimization process with a line search along each d_i, the step lengths found by the line searches are in fact these conjugate coordinates.
Conjugate directions (2)
Optimization by line searches along conjugate directions therefore converges in N steps (or less): using the conjugacy property, the line-search step lengths coincide with the conjugate coordinates.
But … how to obtain conjugate directions?
How can conjugate directions be generated with only gradient information? Start with the steepest descent direction and perform a line search; after the line search, the new gradient is orthogonal to the previous step. Then look for a new direction of the form d2 = -∇f(x2) + γ1 d1: a descent step plus a multiple of the previous direction. This form is simply a choice. (Figure: contours f = c and f = c + 1.)
Conjugate directions (3)
The condition for a conjugate direction, d2^T A d1 = 0, determines γ1. But, in general, A is unknown! Remedy: use gradients. Along the line-search step of a quadratic function, the gradient difference satisfies g2 - g1 = A (x2 - x1) = α1 A d1, so A appears only through gradient differences.
Eliminating A (cont.)
The result is an update formula for γ that uses only gradients; this is all to get rid of that A. Showing that all conjugate directions generated in this way are indeed conjugate is possible, but the proof is quite lengthy; it is given in Chong, p. 156.
Why that last step?
The Fletcher-Reeves formula follows from the Polak-Ribière version: for exact line searches on a quadratic function, successive gradients are orthogonal (g_{k+1}^T g_k = 0), so the numerator g_{k+1}^T (g_{k+1} - g_k) reduces to g_{k+1}^T g_{k+1}.
Three CG variants
For general non-quadratic problems, three variants exist that are equivalent in the quadratic case: Hestenes-Stiefel, Polak-Ribière, and Fletcher-Reeves. Polak-Ribière is generally best in most cases.
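The slide's equations were lost in extraction; the standard forms of the three update coefficients, with g_k = ∇f(x_k) and search directions d_k, are (Hestenes-Stiefel, Polak-Ribière, Fletcher-Reeves, in that order):

```latex
\beta_k^{\mathrm{HS}} = \frac{\mathbf{g}_{k+1}^{T}(\mathbf{g}_{k+1}-\mathbf{g}_{k})}
                             {\mathbf{d}_{k}^{T}(\mathbf{g}_{k+1}-\mathbf{g}_{k})}
\qquad
\beta_k^{\mathrm{PR}} = \frac{\mathbf{g}_{k+1}^{T}(\mathbf{g}_{k+1}-\mathbf{g}_{k})}
                             {\mathbf{g}_{k}^{T}\mathbf{g}_{k}}
\qquad
\beta_k^{\mathrm{FR}} = \frac{\mathbf{g}_{k+1}^{T}\mathbf{g}_{k+1}}
                             {\mathbf{g}_{k}^{T}\mathbf{g}_{k}}
```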
CG practical
1. Start with an arbitrary x1.
2. Set the first search direction to the steepest descent direction.
3. Perform a line search to find the next point.
4. Compute the next search direction from the new gradient and the previous direction.
5. Repeat from step 3; restart every (n + 1) steps, using step 2.
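The steps above can be sketched with the Fletcher-Reeves update; the backtracking line search and the descent-direction safeguard are illustrative implementation choices:

```python
def conjugate_gradient(f, grad_f, x0, n_iter=200):
    """Nonlinear CG sketch: Fletcher-Reeves update, backtracking line
    search, restart with steepest descent every n + 1 steps."""
    n = len(x0)
    x = list(x0)
    g = grad_f(x)
    d = [-gi for gi in g]                     # step 2: steepest descent
    for k in range(n_iter):
        if sum(gi * gi for gi in g) < 1e-18:
            break
        slope = sum(gi * di for gi, di in zip(g, d))
        if slope >= 0:                        # safeguard: restart if not descent
            d = [-gi for gi in g]
            slope = -sum(gi * gi for gi in g)
        # step 3: backtracking line search along d
        alpha, fx = 1.0, f(x)
        while (alpha > 1e-14 and
               f([xi + alpha * di for xi, di in zip(x, d)])
               > fx + 0.5 * alpha * slope):
            alpha *= 0.5
        x = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = grad_f(x)
        # step 4: Fletcher-Reeves coefficient; step 5: periodic restart
        if (k + 1) % (n + 1) == 0:
            beta = 0.0
        else:
            beta = sum(a * a for a in g_new) / sum(a * a for a in g)
        d = [-a + beta * di for a, di in zip(g_new, d)]
        g = g_new
    return x
```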
CG properties
Theoretically, CG converges in N steps or less for quadratic functions. In practice, non-quadratic functions, finite line-search accuracy, and round-off errors lead to slower convergence (more than N steps); in fact, Newton and quasi-Newton methods perform better. After N steps, or upon bad convergence, apply a restart procedure, etc.
Application to mechanics (FE)
Structural mechanics: the potential energy is a quadratic function, and equilibrium corresponds to its minimum! Each CG iteration and line search requires only simple operations at the element level. Attractive for large N!
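The quadratic function in question is the potential energy of the discretized structure; minimizing it reproduces the equilibrium equations (K the stiffness matrix, u the displacements, f the load vector):

```latex
\Pi(\mathbf{u}) = \tfrac{1}{2}\,\mathbf{u}^{T}\mathbf{K}\,\mathbf{u} - \mathbf{u}^{T}\mathbf{f},
\qquad
\nabla \Pi = \mathbf{K}\mathbf{u} - \mathbf{f} = \mathbf{0}
```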