CHAPTER 8: ANNEALING-TYPE ALGORITHMS

Organization of chapter in ISSO:
–Introduction to simulated annealing
–Simulated annealing algorithm
  –Basic algorithm with noise-free loss measurements
  –With noisy loss measurements
–Numerical examples
  –Traveling salesperson problem
  –Continuous problems with single and multiple minima
–Annealing algorithms based on stochastic approximation with injected "noise"
–Convergence theory

Slides for Introduction to Stochastic Search and Optimization (ISSO) by J. C. Spall
8-2 Background on Simulated Annealing

Continues in spirit of Chaps. 2, 6, and 7 in working with only loss measurements (no direct gradients)
Simulated annealing (SAN) based on analogies to cooling (annealing) of physical substances
–Optimal θ analogous to minimum-energy state
Primarily designed to be global optimization method
Based on probabilistic criterion for accepting increased loss value during search process
–Metropolis criterion
–Allows for temporary increase in loss as means of reaching global minimum
Some convergence theory possible (e.g., Hajek, 1988, for discrete Θ [see p. 213 of ISSO]; Sect. 8.6 in ISSO for continuous Θ)
8-3 Metropolis Criterion

In iterative process, suppose we have current value θ_curr and candidate new value θ_new. Should we accept θ_new if θ_new is worse than θ_curr (i.e., has higher loss value)?
Metropolis criterion (from famous 1953 paper of Metropolis et al.) gives probability of accepting new value (c_b is constant and T is "temperature"; set c_b = 1 without loss of generality):

  P(accept θ_new) = exp(−[L(θ_new) − L(θ_curr)] / (c_b T))   when L(θ_new) > L(θ_curr)

Repeated application of Metropolis criterion (iteration to iteration) provides for convergence of SAN to global minimum
–Markov chain theory applies for discrete Θ; stochastic approximation for continuous Θ
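The acceptance rule above is a one-liner in code. A minimal sketch (the helper name `metropolis_accept` is illustrative, not from ISSO); note that improvements are always accepted, and worsenings are accepted with the Metropolis probability via a single uniform draw:

```python
import math
import random

def metropolis_accept(loss_curr, loss_new, T, c_b=1.0, rng=random):
    """Metropolis criterion: always accept an improvement; accept a
    worsening with probability exp(-(L_new - L_curr) / (c_b * T))."""
    if loss_new <= loss_curr:
        return True
    prob = math.exp(-(loss_new - loss_curr) / (c_b * T))
    return rng.random() < prob
```

As T shrinks toward 0 the rule degenerates to pure descent (worsenings almost never accepted); at large T nearly every candidate is accepted, which is what lets early iterations escape local minima.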
8-4 SAN Algorithm with Noise-Free Loss Measurements

Step 0 (initialization): Set initial temperature T and current parameter θ_curr; determine L(θ_curr).
Step 1 (candidate value): Randomly determine new value θ_new and determine L(θ_new).
Step 2 (compare L values): If L(θ_new) < L(θ_curr), accept θ_new. Alternatively, if L(θ_new) ≥ L(θ_curr), accept θ_new with probability given by Metropolis criterion (implemented via Monte Carlo sampling scheme); otherwise keep θ_curr.
Step 3 (iterate at fixed temperature): Repeat Steps 1 and 2 until T is changed.
Step 4 (decrease temperature): Lower T according to the annealing schedule and return to Step 1. Continue till effective convergence.
8-5 SAN Algorithm with Noisy Loss Measurements

As with random search (Chap. 2 of ISSO), standard SAN not designed for noisy measurements y = L + ε
However, SAN sometimes used with noisy measurements
Standard approach is to form average of loss measurements at each θ in search process
Alternative is to use threshold idea of Sect. 2.3 of ISSO
–Only accept new θ value if noisy loss value is sufficiently bigger or smaller than current noisy loss
Can use one-sided Chebyshev inequality to characterize likelihood of error at each iteration under general noise distribution
Very limited convergence theory for SAN with noisy measurements
8-6 Traveling Salesperson Problem (TSP)

TSP is famous discrete optimization problem
Many successful uses of SAN with TSP
Basic problem is to find best way for salesperson to hit every city in territory once and only once
–Setting arises in many problems of optimization on networks (communications, transportation, etc.)
If tour involves n cities, there are (n−1)!/2 possible solutions
–Extremely rapid growth in solution space as n increases
–Problem is "NP hard"
Perturbations in SAN steps based on three operations on network: inversion, translation, and switching
–Depicted below
8-7 TSP: Standard Search Operations Applied to 8-City Tour

[Figure] Inversion reverses order 2-3-4-5; translation removes section 2-3-4-5 and places it between 6-7; switching interchanges order of 2 and 5.
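The three tour perturbations in the figure are straightforward list manipulations. A sketch (tours are city lists; indices i, j, k pick the affected positions — the function signatures are illustrative):

```python
def inversion(tour, i, j):
    """Reverse the segment tour[i:j+1] (e.g., 2-3-4-5 -> 5-4-3-2)."""
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]

def translation(tour, i, j, k):
    """Remove segment tour[i:j+1] and reinsert it after position k
    of the remaining tour."""
    seg, rest = tour[i:j + 1], tour[:i] + tour[j + 1:]
    return rest[:k] + seg + rest[k:]

def switching(tour, i, j):
    """Interchange the cities at positions i and j."""
    t = list(tour)
    t[i], t[j] = t[j], t[i]
    return t
```

On the 8-city tour 1-2-3-4-5-6-7-8, inverting positions 1–4 reverses 2-3-4-5, translating that segment after city 6 places it between 6 and 7, and switching positions 1 and 4 swaps cities 2 and 5 — matching the figure's description.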
8-8 TSP (cont'd)

[Figure] Solution to Trivial 4-City Problem where Cost/Link = Distance (Related to Exercise 8.5 in ISSO)
8-9 Some Numerical Results for SAN

Section 8.3 of ISSO reports on three examples for SAN
–Small-scale TSP
–Problem with no local minima other than global minimum
–Problem with multiple local minima
All examples based on stepwise temperature decay in basic SAN steps above and noise-free loss measurements
All SAN runs require algorithm tuning to pick:
–Initial T
–Number of iterations at fixed T
–Choice of decay factor λ, 0 < λ < 1, representing amount of reduction in each temperature decay
–Method for generating candidate θ_new
Brief descriptions follow on slides below…
8-10 Small-Scale TSP (Example 8.1 in ISSO)

10-city tour (very small by industrial standards)
–Know by enumeration that minimum cost of tour = 440
Randomly chose inversion, translation, or switching at each iteration
–Tuning required to choose "good" probabilities of selecting these operators
8 of 10 SAN runs find minimum-cost tour
–Sample mean cost of initial tour is 700; sample mean of final tour is 444
Essential to success is adequate use of inversion operator; 0 of 10 SAN runs find optimal tour if probability of inversion is 0.50
SAN successfully used in much larger TSPs
–E.g., seminal 1983 (!) Kirkpatrick et al. paper in Science considers TSP with 400 cities
8-11 Comparison of SAN and Two Random Search Algorithms (Example 8.2 in ISSO)

Considered very simple p = 2 "quartic loss" seen earlier
Function has single global minimum; no local minima
Table below gives sample mean terminal loss value, where initial loss = 4.00 and L(θ*) = 0
SAN performs well, but random search even better in this problem
8-12 Evaluation of SAN in Problem with Multiple Local Minima (Example 8.3 in ISSO)

Many numerical studies in literature showing favorable results for SAN
Loss function in study of Brooks and Morgan (1995) defined with θ = [t1, t2]^T and Θ = [−1, 1]^2
Function has many local minima with a unique global minimum
Study compares quasi-Newton method and SAN
–"Apples vs. oranges" (gradient-based vs. non-gradient-based)
20% of quasi-Newton runs and 100% of SAN runs ended near θ* (random initial conditions)
8-13 Global Optimization via Annealing of Stochastic Approximation

SAN not only way annealing used for global optimization
With appropriate annealing, stochastic approximation (SA) can be used in global optimization
Standard approach is to inject Gaussian noise on r.h.s. of SA recursion:

  θ̂_{k+1} = θ̂_k − a_k G_k(θ̂_k) + b_k w_k,   (*)

where G_k is direct gradient measurement (Chap. 5) or gradient approximation (FDSA or SPSA), b_k → 0 (the "annealing"), and w_k ~ N(0, I_{p×p})
Injected noise w_k generated by Monte Carlo
Eqn. (*) has theoretical basis for formal convergence (Sect. 8.4 of ISSO)
8-14 Global Optimization via Annealing of Stochastic Approximation (cont'd)

Careful selection of a_k and b_k required to achieve global convergence
Stochastic rate of convergence is slow:
–O(1/√(log k)) when a_k = a/(k+1)
–O(1/√(log log k)) when a_k = a/(k+1)^α, α < 1
Above slow rates are price to be paid for global convergence
SPSA without injected randomness (i.e., b_k = 0) is global optimizer under certain conditions
–Much faster convergence rate O(1/k^{γ/2}) (0 < γ ≤ 2/3)
8-15 Ratio of Asymptotic Estimation Errors with and without Injected Randomness (b_k > 0 and b_k = 0, resp.)

[Figure: ratio of estimation errors plotted against number of iterations k, 10^3 to 10^6]