Derivative-free Methods using Linesearch Techniques Stefano Lucidi
Joint works with L. Grippo (the father of the linesearch approach), P. Tseng, M. Sciandrone, G. Liuzzi, F. Lampariello, V. Piccialli, F. Rinaldi, G. Fasano (in order of appearance in this research activity)
PROBLEM DEFINITION: minimize f(x), where the first-order derivatives of f are not available
MOTIVATIONS: In many engineering problems the objective and constraint function values are obtained by direct measurements or by complex simulation programs, so first-order derivatives can often be neither explicitly calculated nor approximated
MOTIVATIONS: In fact:
- the mathematical representations of the objective function and the constraints are not available
- the source codes of the programs are not available
- the values of the objective function and the constraints can be affected by the presence of noise
- the evaluations of the objective function and the constraints can be very expensive
MOTIVATIONS: the mathematical representations of the objective function and the constraints are not available, hence the first-order derivatives of the objective function and the constraints cannot be computed analytically
MOTIVATIONS: the source codes of the programs are not available, hence automatic differentiation techniques cannot be applied
MOTIVATIONS: the evaluations of the objective function and the constraints can be very expensive, hence finite-difference approximations can be too expensive (they need at least n function evaluations)
MOTIVATIONS: the values of the objective function and the constraints can be affected by the presence of noise, hence finite-difference approximations can produce very wrong estimates of the first-order derivatives
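The last point can be made concrete with a small sketch. Here the noise is an assumed, deliberately worst-case deterministic ±sigma perturbation (standing in for zero-mean measurement noise, so the run is reproducible); all names are illustrative. The forward-difference error grows like 2·sigma/h, so shrinking the stepsize h eventually destroys the derivative estimate:

```python
import itertools

# Deterministic worst-case "noise": alternates +sigma / -sigma on each
# evaluation; an assumed stand-in for zero-mean measurement noise.
_flip = itertools.cycle((1.0, -1.0))

def noisy_f(x, sigma=1e-4):
    # smooth model f(x) = x^2 plus bounded measurement noise
    return x * x + sigma * next(_flip)

def forward_diff(f, x, h):
    # standard forward-difference estimate of f'(x)
    return (f(x + h) - f(x)) / h

# True derivative at x = 1 is 2.  The noise contributes an error of
# about 2*sigma/h, which dominates once h is small:
for h in (1e-1, 1e-4, 1e-7):
    print(f"h = {h:g}: estimate = {forward_diff(noisy_f, 1.0, h):.6g}")
```

With h = 0.1 the estimate is close to 2; with h = 1e-7 the noise term 2·sigma/h is of order 10³ and the estimate is useless.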
NUMERICAL EXPERIENCE: we considered 41 box-constrained standard test problems and perturbed them by adding to the objective function a Gaussian distributed random number with zero mean and given variance
NUMERICAL EXPERIENCE: we compared the number of failures of two codes: DF_box (derivative-free method) and E04UCF (NAG subroutine using finite-difference gradients)
GLOBALLY CONVERGENT DF METHODS
Direct search methods use only function values:
- pattern search methods, where the function is evaluated on specified geometric patterns
- line search methods, which use one-dimensional minimization along suitable search directions
Modelling methods approximate the functions by suitable models which are progressively built and updated
UNCONSTRAINED MINIMIZATION PROBLEMS: minimize f(x), where the gradient of f is not available and the level set of f at the starting point is compact
THE ROLE OF THE GRADIENT: the gradient characterizes accurately the local behaviour of f; it allows us to determine an "efficient" descent direction and a "good" step length along that direction
THE ROLE OF THE GRADIENT: the scalar product of the gradient with a direction is the directional derivative of f along that direction; the gradient provides the rates of change of f along the 2n coordinate directions and hence characterizes accurately the local behaviour of f
HOW TO OVERCOME THE LACK OF GRADIENT: the local behaviour of f along a suitable set of directions should be indicative of the whole local behaviour of f; such a set of directions can be associated with each iterate
ASSUMPTION D: given the sequence of iterates, the associated bounded sequences of search directions satisfy a suitable spanning condition
EXAMPLES OF SETS OF DIRECTIONS: the directions are linearly independent and bounded
EXAMPLES OF SETS OF DIRECTIONS (Lewis, Torczon): the directions are bounded
EXAMPLES OF SETS OF DIRECTIONS
UNCONSTRAINED MINIMIZATION PROBLEMS Assumption D ensures that, by performing finer and finer samplings of f along the directions, it is possible:
- either to realize that the point is a good approximation of a stationary point of f
- or to find a point where f is decreased
GLOBAL CONVERGENCE By Assumption D we have:
GLOBAL CONVERGENCE By using directions satisfying Assumption D it is possible to characterize the global convergence of a sequence of points by means of the existence of suitable sequences of failures in decreasing the objective function along the directions
PROPOSITION Let the sequence of iterates and the sets of directions be such that:
- the directions satisfy Assumption D
- there exist sequences of points and scalars satisfying suitable sampling conditions
GLOBAL CONVERGENCE the sampling of f along all the directions can be distributed across the iterations of the algorithm; it is not necessary to perform at each point a sampling of f along all the directions; the Proposition characterizes, in some sense, the requirements on the acceptable samplings of f along the directions that guarantee global convergence
GLOBAL CONVERGENCE The use of directions satisfying Assumption D, and the production of sequences of points satisfying the hypotheses of the Proposition, are the common elements of all the globally convergent direct search methods. The direct search methods can be divided into:
- pattern search methods
- line search methods
PATTERN SEARCH METHODS
Cons: all the points produced must lie in a suitable lattice; this implies
- additional assumptions on the search directions
- restrictions on the choices of the steplengths
Pros: they require only that the new point produces a simple decrease of the objective function (in the line search methods the new point must guarantee a "sufficient" decrease; on the other hand, the line search methods need no additional requirements with respect to Assumption D and the assumptions of the Proposition)
LINESEARCH TECHNIQUES
ALGORITHM DF
STEP 1: compute a set of directions satisfying Assumption D
STEP 2: minimization of f along the directions
STEP 3: compute the new point and set k = k+1
STEP 2 The aim of this step is:
- to detect the "promising" directions, i.e. the directions along which the function decreases "sufficiently"
- to compute steplengths along these directions which guarantee both a "sufficient" decrease of the function and a "sufficient" move away from the previous point
LINESEARCH TECHNIQUE
STEP 2 The value of the initial step along the i-th direction derives from the linesearch performed along the i-th direction at the previous iteration. If the set of search directions does not depend on the iteration, this scalar should be representative of the behaviour of the objective function along the i-th direction
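The linesearch technique described above can be sketched as follows. This is a minimal illustration in the spirit of a derivative-free linesearch with a sufficient-decrease test proportional to the squared steplength and an expansion phase; the parameter names (gamma, delta, max_expansions) and the exact acceptance test are illustrative assumptions, not the authors' precise scheme:

```python
def df_linesearch(f, x, d, alpha0, gamma=1e-6, delta=0.5, max_expansions=50):
    """Derivative-free linesearch sketch.

    A trial step alpha is accepted only if it yields the sufficient decrease
        f(x + alpha * d) <= f(x) - gamma * alpha**2,
    and an accepted step is then expanded (alpha -> alpha / delta)
    while the condition keeps holding.
    Returns the accepted steplength (0.0 if the initial step fails).
    """
    fx = f(x)
    alpha = alpha0
    if f([xi + alpha * di for xi, di in zip(x, d)]) > fx - gamma * alpha ** 2:
        return 0.0  # the direction is not "promising" at this sampling scale
    for _ in range(max_expansions):
        new_alpha = alpha / delta
        if f([xi + new_alpha * di for xi, di in zip(x, d)]) > fx - gamma * new_alpha ** 2:
            break  # further expansion no longer gives sufficient decrease
        alpha = new_alpha
    return alpha

# usage on a simple quadratic: move from x = [5] along d = [-1]
quad = lambda v: v[0] ** 2
step = df_linesearch(quad, [5.0], [-1.0], alpha0=1.0)
```

On this quadratic the step expands 1 → 2 → 4 → 8 and stops when the next expansion overshoots the minimizer, illustrating how the accepted steplength adapts to the behaviour of f along the direction.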
STEP 3 Find a new point that improves on the best point produced by the linesearches, otherwise keep that point; set k = k+1 and go to Step 1. At Step 3, every approximation technique can be used to produce a new better point
GLOBAL CONVERGENCE THEOREM Let the sequence of points produced by the DF Algorithm be given; then it admits an accumulation point, and every accumulation point is a stationary point of the objective function
LINEARLY CONSTRAINED MINIMIZATION PROBLEMS (LCP): minimize f(x) subject to linear constraints; the gradient of f is not available; the feasible set is compact
LINEARLY CONSTRAINED MINIMIZATION PROBLEMS Given a feasible point, it is possible to define the set of the indices of the active constraints and the set of the feasible directions
LINEARLY CONSTRAINED MINIMIZATION PROBLEMS A feasible point is a stationary point for Problem (LCP) if and only if the directional derivative of f is nonnegative along every feasible direction at that point
LINEARLY CONSTRAINED MINIMIZATION PROBLEMS Given an iterate, it is possible to define an estimate of the set of the indices of the active constraints and an estimate of the set of the feasible directions; this estimate has good properties which allow us to define globally convergent algorithms
ASSUMPTION D2 (an example): given an iterate, the set of directions spans the estimated set of feasible directions and is uniformly bounded
ALGORITHM DFL
STEP 1: compute a set of directions satisfying Assumption D2
STEP 2: minimization of f along the directions
STEP 3: compute the new point and set k = k+1
GLOBAL CONVERGENCE THEOREM Let the sequence of points produced by the DFL Algorithm be given; then it admits an accumulation point, and every accumulation point is a stationary point for Problem (LCP)
BOX CONSTRAINED MINIMIZATION PROBLEMS (BCP): the gradient of f is not available; the feasible box is compact; the set of coordinate directions and their opposites satisfies Assumption D2
NONLINEARLY CONSTRAINED MINIMIZATION PROBLEMS (NCP): the first-order derivatives of the objective function and of the constraints are not available
NONLINEARLY CONSTRAINED MINIMIZATION PROBLEMS Given a point, we define:
NONLINEARLY CONSTRAINED MINIMIZATION PROBLEMS
ASSUMPTION A1: the set is compact
ASSUMPTION A2: for every point of the set there exists a vector such that a suitable regularity condition holds
Assumption A1 ensures boundedness of the iterates; Assumption A2 ensures existence and boundedness of the Lagrange multipliers
NONLINEARLY CONSTRAINED MINIMIZATION PROBLEMS We consider the following continuously differentiable penalty function, depending on a penalty parameter:
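Since the exact formula is not reproduced on the slide, the sketch below uses a standard continuously differentiable exterior penalty, P(x; eps) = f(x) + (1/eps) · Σ max(0, g_i(x))², for constraints g_i(x) ≤ 0; the squared max term is C¹, but the exact exponents and scaling used by the authors may differ:

```python
def penalty(f, constraints, x, eps):
    """Exterior penalty sketch for constraints g_i(x) <= 0.

    P(x; eps) = f(x) + (1/eps) * sum_i max(0, g_i(x))**2
    Smaller eps penalizes constraint violations more heavily.
    (Illustrative form; not necessarily the authors' exact penalty.)
    """
    violation = sum(max(0.0, g(x)) ** 2 for g in constraints)
    return f(x) + violation / eps

# usage: minimize x subject to x >= 1, written as g(x) = 1 - x <= 0
obj = lambda v: v[0]
cons = [lambda v: 1.0 - v[0]]
```

At an infeasible point such as x = 0 the penalty term dominates, while at feasible points P coincides with f; this is what lets a derivative-free method for unconstrained problems be applied to the penalized problem.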
ALGORITHM DFN
STEP 1: compute a set of directions satisfying Assumption D2
STEP 2: minimization of the penalty function along the directions
STEP 3: compute the new point and set k = k+1
new STEP 3 Find a new point satisfying the acceptance condition, otherwise keep the current point; if the stationarity test is satisfied, reduce the penalty parameter, otherwise leave it unchanged; then set k = k+1 and go to Step 1
new STEP 3 The penalty parameter is reduced whenever a better approximation of a stationary point of the penalty function has been obtained; the steplengths produced by the linesearches can be viewed as a stationarity measure
GLOBAL CONVERGENCE THEOREM Let the sequence of points produced by the DFN Algorithm be given; then it admits an accumulation point which is a stationary point for Problem (NCP)
MIXED NONLINEAR MINIMIZATION PROBLEMS (MNCP) We define the number of discrete variables and the number of continuous variables
MIXED NONLINEAR MINIMIZATION PROBLEMS A point is a stationary point of Problem (MNCP) if there exists a vector of multipliers such that:
ALGORITHM MDFN
STEP 1: compute the search directions
STEP 2: mixed minimization of the objective along the directions: if the variable is continuous, perform a continuous linesearch along the corresponding direction; if the variable is discrete, perform a discrete linesearch along the corresponding direction
STEP 3: compute the new point and set k = k+1
Continuous linesearch The continuous linesearch of MDFN is the linesearch of DFN; it produces the new trial point
LINESEARCH TECHNIQUE
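The discrete linesearch for the integer variables can be sketched as follows. Steps stay on the integer lattice; the acceptance threshold nu, the doubling expansion, and the cap max_expansions are illustrative assumptions rather than the authors' exact rules:

```python
def discrete_linesearch(f, x, d, beta0=1, nu=1e-6, max_expansions=30):
    """Discrete linesearch sketch for integer variables.

    An integer step beta along an integer direction d is accepted if
    f(x + beta * d) <= f(x) - nu, and the accepted step is doubled
    while the condition keeps holding, so iterates remain integer.
    Returns the accepted integer step (0 on immediate failure).
    """
    fx = f(x)
    beta = beta0
    if f([xi + beta * di for xi, di in zip(x, d)]) > fx - nu:
        return 0  # no sufficient decrease with the initial integer step
    for _ in range(max_expansions):
        new_beta = 2 * beta
        if f([xi + new_beta * di for xi, di in zip(x, d)]) > fx - nu:
            break  # doubling the step no longer decreases f enough
        beta = new_beta
    return beta

# usage: f has an integer minimizer at x = 7; search from 0 along +1
g = lambda v: (v[0] - 7) ** 2
step = discrete_linesearch(g, [0], [1])
```

The step expands 1 → 2 → 4 → 8 and stops when doubling again would overshoot, mirroring the expansion phase of the continuous linesearch while respecting integrality.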
MIXED NONLINEAR MINIMIZATION PROBLEMS ASSUMPTION A3: either the nonlinear constraint functions do not depend on the integer variables, or they are such that every accumulation point of the sequence produced by the algorithm satisfies:
GLOBAL CONVERGENCE THEOREM Let the sequence of points produced by the MDFN Algorithm be given; then it admits an accumulation point which is a stationary point for Problem (MNCP)
MIXED NONLINEAR MINIMIZATION PROBLEMS More complex (and expensive) derivative-free algorithms allow us:
- to determine "better" stationary points
- to tackle "more difficult" mixed nonlinear optimization problems
MIXED NONLINEAR MINIMIZATION PROBLEMS To determine "better" stationary points for Problem (MNCP): points that satisfy the KKT conditions w.r.t. the continuous variables
To tackle "more difficult" mixed nonlinear optimization problems, three different sets of variables are considered:
- continuous variables
- general discrete variables
- discrete dimensional variables
Discrete dimensional variables z: vector of discrete variables which determine the number of continuous and discrete variables
HARD MIXED NONLINEAR MINIMIZATION PROBLEMS (Hard-MNCP) The feasible set of y depends on the dimensional variables z; the feasible set of x depends on the discrete variables y and on the dimensional variables z
NONSMOOTH MINIMIZATION PROBLEMS
For nonsmooth functions the cone of descent directions can be made arbitrarily narrow
NONSMOOTH MINIMIZATION PROBLEMS Possible approaches:
- smoothing techniques
- "larger" sets of search directions
NONSMOOTH MINIMIZATION PROBLEMS smoothing techniques
NONSMOOTH MINIMIZATION PROBLEMS
ALGORITHM DFN
STEP 1: compute a set of directions satisfying Assumption D2
STEP 2: minimization of the smoothed function along the directions
STEP 3: compute the new point and set k = k+1
new STEP 3 Find a new point satisfying the acceptance condition, otherwise keep the current point; if the stationarity test is satisfied, reduce the smoothing parameter; then set k = k+1 and go to Step 1
new STEP 3 The smoothing parameter is reduced whenever a better approximation of a stationary point of the penalty function has been obtained; the steplengths produced by the linesearches can be viewed as a stationarity measure
GLOBAL CONVERGENCE THEOREM Let the sequence of points produced by the Algorithm be given; then it admits an accumulation point which is a stationary point for the MinMax Problem
NONSMOOTH MINIMIZATION PROBLEMS "Larger" set of search directions: the functions of Problem (NCP) are assumed to be locally Lipschitz-continuous
NONSMOOTH MINIMIZATION PROBLEMS We consider the following nonsmooth penalty function, depending on a penalty parameter:
ASSUMPTION A1: the set is compact
ASSUMPTION A2: for every point of the set there exists a vector such that:
NONSMOOTH MINIMIZATION PROBLEMS
NONSMOOTH MINIMIZATION PROBLEMS It is possible to define algorithms globally convergent towards stationary points (in the Clarke sense) by assuming that the algorithms use a set of search directions which is asymptotically dense in the unit sphere
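One common way to obtain a sequence of directions dense in the unit sphere is to normalize Gaussian random vectors, which are uniformly distributed on the sphere (so the sampled sequence is dense with probability one); deterministic dense sequences are also used in practice. This sketch only illustrates the sampling, not a full algorithm:

```python
import math
import random

def random_unit_direction(n, rng):
    """Sample a direction uniformly on the unit sphere in R^n.

    A standard Gaussian vector divided by its norm is uniform on the
    sphere; repeated sampling yields a direction sequence dense in the
    unit sphere with probability one (illustrative sketch).
    """
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

# usage: draw a few unit directions in R^3
rng = random.Random(1)
directions = [random_unit_direction(3, rng) for _ in range(5)]
```

Each sampled vector has unit Euclidean norm, so the set of directions explored grows to cover every direction of the sphere as the iterations proceed.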
NONSMOOTH MINIMIZATION PROBLEMS Multiobjective optimization problems (work in progress): locally Lipschitz-continuous objective functions
NONSMOOTH MINIMIZATION PROBLEMS Bilevel optimization problems (work in progress)
Our DF-codes are available at:
Thank you for your attention
Optimal Design of a Magnetic Resonance apparatus [figure: n. rings = 6; n. magnets = 3 and n. magnets = 4; half magnet]
Design Variables Positions x1, …, x6 of the rings along the X-axis; angular positions of each row of small magnets
Design Variables Offsets b1, …, b4 of the 4 outermost rings w.r.t. the 2 innermost ones; radius of the magnets (integer values)
Objective Function The magnetic field should be as uniform as possible and directed along the Z axis; the objective function measures the non-uniformity of the magnetic field within a specified target region
Starting point (commercial devices): nr = 5, nm = 3, r = 22, f = 51 ppm. Final point: nr = 7, nm = 3, r = 27, f = 18 ppm
Magnetic Resonance Results Behaviour of the magnetic field on the ZY plane: 51 ppm configuration vs. 18 ppm configuration