Optimization Methods
Aleksey Minin, Saint Petersburg State University
Student of the ACOPhys (Applied and Computational Physics) master program, 10th semester
Joint Advanced Students School
What is optimization?
Contents:
1. Applications of optimization
2. Global optimization
3. Local optimization
4. Discrete optimization
5. Constrained optimization
6. A real application: the Bounded Derivative Network
Applications of optimization
- Advanced engineering design
- Biotechnology
- Data analysis
- Environmental management
- Financial planning
- Process control
- Scientific modeling, etc.
Global or local?
Optimization splits into global optimization (an overview follows) and local optimization (an overview and an implementation follow).
What is global optimization?
The objective of global optimization is to find the globally best solution of (possibly nonlinear) models, in the (possible or known) presence of multiple local optima.
Global optimization
- Branch and bound
- Evolutionary algorithms
- Simulated annealing
- Tree annealing
- Tabu search
Branch and bound
- Root problem: consider the original problem with the complete feasible region.
- Apply the lower-bounding and upper-bounding procedures.
- If the bounds match, a solution has been found and the procedure terminates.
- Otherwise, the feasible region is divided into two or more regions, each a strict subregion of the original, and the procedure is repeated on them.
Branch and bound: example
Four scientists are ready to carry out experiments, but the quality of each experiment varies depending on which scientist performs it, according to the following table:

Type of experiment   Scientist 1   Scientist 2   Scientist 3   Scientist 4
A                    0.90          0.80          0.90          0.85
B                    0.70          0.60          0.80          0.70
C                    0.85          0.70          0.85          0.80
D                    0.75          0.70          0.75          0.70
Branch and bound: example (continued)
[Slides 10-15 show the branch-and-bound search tree being expanded step by step: the root relaxation AAAA with bound 0.55; its children A (ADCC, 0.42), B (BAAA, 0.42), C (CAAA, 0.52) and D (DAAA, 0.45); expansion of the most promising node C into CABD (0.38), CBAA (0.39) and CDAA (0.45); and further expansion yielding CBAD (0.37) and CDBA (0.40).]
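To make the procedure concrete, here is a minimal Python sketch of branch and bound applied to the scientist-assignment example above. It assumes the goal is to maximize the product of the four qualities with each experiment type used at most once, and bounds a partial assignment by the product of the fixed qualities times each remaining scientist's best still-available quality (which reproduces the 0.55 root bound). The function names are illustrative, not part of the original talk.

```python
# Quality of each experiment type (rows A-D) for each scientist (columns 1-4),
# taken from the table above.
quality = {
    'A': [0.90, 0.80, 0.90, 0.85],
    'B': [0.70, 0.60, 0.80, 0.70],
    'C': [0.85, 0.70, 0.85, 0.80],
    'D': [0.75, 0.70, 0.75, 0.70],
}
types = list(quality)

def upper_bound(assignment):
    """Optimistic bound: product of already-assigned qualities times the best
    still-available quality for every scientist not yet assigned."""
    bound = 1.0
    used = set(assignment)
    for i in range(4):                      # scientist index
        if i < len(assignment):
            bound *= quality[assignment[i]][i]
        else:
            bound *= max(quality[t][i] for t in types if t not in used)
    return bound

def branch_and_bound():
    best_value, best_assignment = 0.0, None
    stack = [()]                            # depth-first search over partial assignments
    while stack:
        partial = stack.pop()
        if upper_bound(partial) <= best_value:
            continue                        # prune: cannot beat the incumbent
        if len(partial) == 4:               # complete assignment: bound equals exact value
            best_value, best_assignment = upper_bound(partial), partial
            continue
        for t in types:
            if t not in partial:
                stack.append(partial + (t,))
    return best_assignment, best_value

# Prints ('C', 'D', 'B', 'A') with product ~ 0.40, matching the final tree above.
print(branch_and_bound())
```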
Branch and bound
Advantages: very general; easy to program; fast for small tasks.
Disadvantages: bad for big tasks; becomes difficult as the dimension grows.
Evolutionary algorithms
Step 1: Initialize the population.
Step 2: Evaluate the initial population.
Repeat:
  - Perform competitive selection.
  - Apply genetic operators to generate new solutions.
  - Evaluate the solutions in the population.
Until some convergence criterion is satisfied (a minimal sketch of this loop is given below).
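A minimal, self-contained Python sketch of this loop; the objective, population size, and genetic operators are illustrative choices, not taken from the talk.

```python
import random

def evolve(fitness, dim=2, pop_size=30, generations=100,
           mutation_scale=0.3, bounds=(-5.0, 5.0)):
    """Minimal evolutionary algorithm following the steps above: initialize,
    evaluate, then select / recombine / mutate / evaluate until the
    generation budget (the convergence criterion here) runs out."""
    lo, hi = bounds
    # Step 1: initialize the population with random candidate solutions.
    pop = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluation: smaller fitness value = better (minimization).
        pop.sort(key=fitness)
        parents = pop[: pop_size // 2]          # competitive (truncation) selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = random.sample(parents, 2)
            # Genetic operators: uniform crossover followed by Gaussian mutation.
            child = [random.choice(pair) + random.gauss(0, mutation_scale)
                     for pair in zip(a, b)]
            children.append([min(hi, max(lo, x)) for x in child])
        pop = parents + children
    return min(pop, key=fitness)

# Example: minimize a simple quadratic "sphere" function.
best = evolve(lambda x: sum(xi * xi for xi in x))
print(best)
```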
Evolutionary algorithms
Advantages: starts from a whole population; works well on "noisy" functions; does not get stuck in local optima.
Disadvantages: time-consuming for big problems; not stable.
Simulated annealing
1. Start with an initial temperature T and the energy E of the current solution.
2. Apply a small perturbation to the current solution and compute dE.
3. If dE < 0, accept the new solution; if dE > 0, accept it with probability exp(-dE/T).
4. Decrease T.
5. Repeat until a good solution is found; when T reaches 0, the solution is found.
A minimal sketch of this loop is given below.
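A minimal Python sketch of the loop above, assuming a geometric cooling schedule and a uniform random perturbation (both are illustrative choices).

```python
import math
import random

def simulated_annealing(energy, x0, T0=1.0, cooling=0.995, T_min=1e-4, step=0.1):
    """Minimal simulated-annealing sketch: perturb, compute dE, accept downhill
    moves always and uphill moves with probability exp(-dE/T), then decrease T
    until it is (almost) zero."""
    x, T = x0, T0
    best = x
    while T > T_min:
        # Apply a small random perturbation to the current solution.
        candidate = [xi + random.uniform(-step, step) for xi in x]
        dE = energy(candidate) - energy(x)
        # Accept if the energy decreases, or with probability exp(-dE/T) otherwise.
        if dE < 0 or random.random() < math.exp(-dE / T):
            x = candidate
            if energy(x) < energy(best):
                best = x
        T *= cooling                        # decrease the temperature
    return best

# Example: minimize a 1-D double-well energy, starting from x = 2.
print(simulated_annealing(lambda x: (x[0] ** 2 - 1) ** 2, [2.0]))
```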
Simulated annealing: results [figure]
Simulated annealing
Advantages: clear physical meaning; easy to program; good for high-dimensional tasks.
Disadvantages: how should T be chosen?; how should dT be defined?; depends heavily on the initial point.
Tree annealing (developed by Bilbro and Snyder, 1991)
1. Randomly choose an initial point x over the search interval S0.
2. Randomly travel down the tree to an arbitrary terminal node i, and generate a candidate point y over the subspace defined by Si.
3. If f(y) < f(x), replace x with y and go to step 5.
4. Compute P = exp(-(f(y) - f(x))/T). If P > R, where R is a random number uniformly distributed between 0 and 1, replace x with y.
5. If y replaced x, decrease T slightly and update the tree; repeat from step 2 until T < Tmin.
Tree annealing (Bilbro and Snyder, 1991)
Advantages: good for non-convex functions; handles continuous variables.
Disadvantages: no guarantee of convergence; slow convergence; gives only a region containing the solution.
Swarm intelligence [figure]
Tabu search
Select a current node at random; the current node becomes the best node.
Repeat:
  - Select the new node with the lowest distance (cost) in the neighborhood of the current node that is not on the tabu list.
  - The new node becomes the current node.
  - If evalf(current node) < evalf(best node), the current node becomes the best node.
Until some counter reaches its limit.
A minimal sketch of this loop is given below.
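A minimal Python sketch of this pseudocode with a fixed-length tabu list; the toy graph, neighborhood, and cost function are made up for the example and are not part of the talk.

```python
import random

def tabu_search(neighbors, evalf, nodes, tabu_size=5, max_steps=50):
    """Minimal tabu-search sketch: move to the best non-tabu neighbor of the
    current node, remember the best node seen, and keep recently visited
    nodes on a fixed-length tabu list."""
    current = random.choice(nodes)          # select the current node at random
    best = current                          # current node becomes best node
    tabu = [current]
    for _ in range(max_steps):              # "until some counter reaches its limit"
        candidates = [n for n in neighbors(current) if n not in tabu]
        if not candidates:
            break
        # Select the neighbor with the lowest evaluation (distance/cost).
        current = min(candidates, key=evalf)
        tabu.append(current)
        tabu = tabu[-tabu_size:]            # forget the oldest tabu entries
        if evalf(current) < evalf(best):
            best = current                  # current node becomes best node
    return best

# Example: nodes 0..9 on a line, cost = distance to node 7, neighbors = +/- 1.
nodes = list(range(10))
result = tabu_search(lambda n: [m for m in (n - 1, n + 1) if m in nodes],
                     lambda n: abs(n - 7), nodes)
print(result)
```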
Tabu search implementation
[Slides 26-34 illustrate the search on a small graph: new nodes (1 through 11) are generated and visited step by step, while recently visited nodes are kept on the tabu list so they are not revisited.]
Tabu search
Advantages: an artificial-intelligence concept; good convergence; does not get stuck.
Disadvantages: too many parameters; loops are possible.
What is local optimization?
The term LOCAL refers both to the fact that only information about the function from the neighborhood of the current approximation is used to update the approximation, and to the fact that we usually expect such methods to converge to whatever local extremum is closest to the starting approximation. The global structure of the objective function is unknown to a local method.
Local optimization
Unconstrained optimization: gradient descent, conjugate gradients, BFGS, Gauss-Newton, Levenberg-Marquardt.
Constrained optimization: simplex, SQP, interior point.
Gradient descent
Consider F(x), with F(x) and F'(x) defined in some neighborhood of a point a. F increases fastest if one goes from a in the direction of the gradient of F at a; hence, to minimize F, one steps in the opposite direction:
  x_{n+1} = x_n - gamma_n * grad F(x_n),  with a small step size gamma_n > 0.
Gradient descent
Therefore we obtain a monotone sequence F(x_0) >= F(x_1) >= ... >= F(x_n), which descends toward a local minimum. A minimal sketch is given below.
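A minimal Python sketch of the update x_{n+1} = x_n - gamma * grad F(x_n), using an illustrative quadratic objective and a fixed step size (not taken from the talk).

```python
def gradient_descent(grad, x0, gamma=0.1, steps=100):
    """Minimal gradient-descent sketch: repeatedly step along the negative
    gradient, so that F(x_0) >= F(x_1) >= ... toward a local minimum."""
    x = list(x0)
    for _ in range(steps):
        g = grad(x)
        x = [xi - gamma * gi for xi, gi in zip(x, g)]
    return x

# Example: minimize F(x, y) = (x - 1)^2 + 2*(y + 3)^2 via its gradient.
grad_F = lambda x: [2 * (x[0] - 1), 4 * (x[1] + 3)]
print(gradient_descent(grad_F, [0.0, 0.0]))   # converges toward (1, -3)
```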
Quasi-Newton methods
These methods build up curvature information at each iteration to formulate a quadratic model problem of the form
  min_x  (1/2) x^T H x + c^T x + b,
where the Hessian matrix H is a positive definite symmetric matrix, c is a constant vector, and b is a constant. The optimal solution of this model occurs where the partial derivatives with respect to x go to zero:
  H x* + c = 0,  i.e.  x* = -H^{-1} c.
Quasi-Newton methods [formula figure]
BFGS algorithm
1. Obtain the search direction s_k by solving B_k s_k = -grad f(x_k), where B_k is the current Hessian approximation.
2. Perform a line search to find the optimal step size alpha_k in the direction found in the first step, then update x_{k+1} = x_k + alpha_k s_k.
BFGS algorithm (continued) [formula figure]
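In practice one rarely implements BFGS by hand; assuming SciPy is available, a quasi-Newton run on a standard test function might look like the following sketch (the test function is illustrative, not from the talk).

```python
import numpy as np
from scipy.optimize import minimize

# Rosenbrock test function and its gradient.
def f(x):
    return (1 - x[0]) ** 2 + 100 * (x[1] - x[0] ** 2) ** 2

def grad_f(x):
    return np.array([
        -2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0] ** 2),
        200 * (x[1] - x[0] ** 2),
    ])

# BFGS builds up the curvature (Hessian) information from gradients only.
result = minimize(f, x0=np.array([-1.2, 1.0]), jac=grad_f, method='BFGS')
print(result.x)   # should approach the minimizer (1, 1)
```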
Gauss-Newton algorithm
Given m functions f_1, f_2, ..., f_m of n parameters p_1, p_2, ..., p_n (m > n), we want to minimize the sum
  S(p) = sum_{i=1}^{m} f_i(p)^2,
which leads to the iteration p_{k+1} = p_k - (J^T J)^{-1} J^T f(p_k), where J is the Jacobian of f at p_k. The matrix inverse is never computed explicitly in practice: instead of the above formula for p_{k+1}, the increment delta_k = p_{k+1} - p_k is obtained by solving the linear system (J^T J) delta_k = -J^T f(p_k).
Gauss-Newton algorithm [figure]
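A minimal Python/NumPy sketch of the Gauss-Newton iteration that solves the normal equations instead of forming the matrix inverse; the exponential-fitting data and model are made up for the example.

```python
import numpy as np

def gauss_newton(residuals, jacobian, p0, iterations=20):
    """Minimal Gauss-Newton sketch: at each step solve the normal equations
    (J^T J) delta = -J^T f(p) rather than computing the inverse explicitly."""
    p = np.asarray(p0, dtype=float)
    for _ in range(iterations):
        f = residuals(p)
        J = jacobian(p)
        delta = np.linalg.solve(J.T @ J, -J.T @ f)
        p = p + delta
    return p

# Example: fit y = a * exp(b * t) to a few (t, y) points; parameters p = (a, b).
t = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([2.0, 2.7, 3.6, 4.9])
res = lambda p: p[0] * np.exp(p[1] * t) - y
jac = lambda p: np.column_stack([np.exp(p[1] * t), p[0] * t * np.exp(p[1] * t)])
print(gauss_newton(res, jac, [1.0, 0.3]))   # approaches roughly (2, 0.3)
```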
Levenberg-Marquardt
This is an iterative procedure; the initial guess may be taken as p^T = (1, 1, ..., 1). At each step p is replaced by p + q. At the minimum of the sum of squares S the gradient with respect to q is zero; linearizing f(p + q) ~ f(p) + J q and differentiating the square of the right-hand side gives
  (J^T J) q = -J^T f(p).   (*)
The key to the LMA is to replace (*) with the 'damped version'
  (J^T J + lambda I) q = -J^T f(p),  lambda > 0.
Levenberg-Marquardt [figure]
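A minimal Python/NumPy sketch of the damped iteration, reusing the illustrative exponential-fitting example from the Gauss-Newton sketch; the simple halving/doubling rule for lambda is one common heuristic, not necessarily the one used in the talk.

```python
import numpy as np

def levenberg_marquardt(residuals, jacobian, p0, lam=1e-2, iterations=50):
    """Minimal LM sketch: solve the damped normal equations
    (J^T J + lam*I) q = -J^T f(p), increasing lam when a step fails and
    decreasing it when the sum of squares improves."""
    p = np.asarray(p0, dtype=float)
    cost = lambda p: float(np.sum(residuals(p) ** 2))
    for _ in range(iterations):
        f, J = residuals(p), jacobian(p)
        A = J.T @ J + lam * np.eye(len(p))
        q = np.linalg.solve(A, -J.T @ f)
        if cost(p + q) < cost(p):
            p, lam = p + q, lam * 0.5       # accept the step, trust the model more
        else:
            lam *= 2.0                      # reject the step, increase the damping
    return p

# Same exponential-fit example as in the Gauss-Newton sketch.
t = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([2.0, 2.7, 3.6, 4.9])
res = lambda p: p[0] * np.exp(p[1] * t) - y
jac = lambda p: np.column_stack([np.exp(p[1] * t), p[0] * t * np.exp(p[1] * t)])
print(levenberg_marquardt(res, jac, [1.0, 0.3]))
```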
SQP – constrained minimization: reformulation of the constrained problem [formula figure]
SQP – constrained minimization
The principal idea is the formulation of a QP sub-problem based on a quadratic approximation of the Lagrangian function
  L(x, lambda) = f(x) + sum_i lambda_i g_i(x).
SQP – constrained minimization: updating the Hessian matrix [formula figure]
SQP – constrained minimization: updating the Hessian matrix
- The Hessian approximation should remain positive definite; this requires q_k^T s_k > 0 at each update.
- If q_k^T s_k < 0, then q_k is modified on an element-by-element basis.
- The aim is to distort the elements of q_k that lead to a positive definite update as little as possible.
- The most negative element of the element-wise product of q_k and s_k is repeatedly halved.
- This is repeated until q_k^T s_k > 10^-5.
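For a concrete constrained example, SciPy's SLSQP solver (a sequential least-squares QP method in the SQP family) can be used; the objective and constraints below are illustrative, not from the talk.

```python
import numpy as np
from scipy.optimize import minimize

# Minimize f(x, y) = (x - 2)^2 + (y - 1)^2 subject to x + y = 1 (equality)
# and x >= 0 (inequality, written as g(x) >= 0 in SciPy's convention).
objective = lambda x: (x[0] - 2) ** 2 + (x[1] - 1) ** 2
constraints = [
    {'type': 'eq',   'fun': lambda x: x[0] + x[1] - 1},
    {'type': 'ineq', 'fun': lambda x: x[0]},
]

result = minimize(objective, x0=np.zeros(2), method='SLSQP',
                  constraints=constraints)
print(result.x)   # the constrained minimizer, here (1, 0)
```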
Neural net analysis: what is a neuron?
A typical formal neuron performs an elementary operation: it weights the input values with locally stored weights and applies a nonlinear transformation to their sum. In other words, a neuron applies a nonlinear operation to a linear combination of its inputs.
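A minimal Python sketch of such a formal neuron, using a sigmoid as the (illustrative) nonlinearity.

```python
import math

def neuron(inputs, weights, bias=0.0):
    """A formal neuron: a nonlinear function applied to a linear
    combination of the inputs, y = phi(sum_i w_i * x_i + b)."""
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-s))       # sigmoid nonlinearity

print(neuron([0.5, -1.0, 2.0], [0.8, 0.3, -0.1]))
```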
Neural net analysis: what is training?
W is the set of synaptic weights and E(W) is the error function; training means minimizing E(W) with respect to W. What kind of optimization should we choose?
Neural network of arbitrary architecture; error back-propagation [network diagrams]
How to optimize?
- Objective function: the empirical error (it should decrease).
- Parameters to optimize: the weights.
- Constraints: equalities (or inequalities) on the weights, if they exist.
Neural net analysis with constrained and unconstrained minimization
NB: for unconstrained optimization I applied the Levenberg-Marquardt method; for the constrained case I applied the SQP method. A toy illustration of this setup is sketched below.
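This is only a toy illustration of the setup above (empirical error as the objective, weights as the parameters, an equality constraint on the weights), not the Bounded Derivative Network from the talk; it uses SciPy's BFGS for the unconstrained case and the SQP-type SLSQP solver for the constrained one, since a Levenberg-Marquardt routine for this exact setting is not shown here.

```python
import numpy as np
from scipy.optimize import minimize

# Toy data: learn y ~ w1*x1 + w2*x2 with a single linear "neuron".
X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [2.0, 1.0]])
y = np.array([1.0, 2.0, 3.0, 5.0])

# Objective: the empirical (mean squared) error as a function of the weights.
def empirical_error(w):
    return float(np.mean((X @ w - y) ** 2))

# Unconstrained case: any gradient-based method (here BFGS for simplicity).
unconstrained = minimize(empirical_error, x0=np.zeros(2), method='BFGS')

# Constrained case: e.g. require w1 + w2 = 2, handled by the SQP-type solver SLSQP.
constrained = minimize(empirical_error, x0=np.zeros(2), method='SLSQP',
                       constraints=[{'type': 'eq', 'fun': lambda w: w[0] + w[1] - 2}])

print(unconstrained.x, constrained.x)
```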
Thank you for your attention.