Local Search and Optimization


Local Search and Optimization. CS 2710, ISSP 2610. Chapter 3, Part 3: Heuristic Search; Chapter 4: Local Search and Optimization.

Beam Search. A cheap, unpredictable search. For problems with many solutions, it may be worthwhile to discard unpromising paths. Beam search is greedy best-first search that keeps only a fixed number of nodes on the fringe.

Beam Search

def beamSearch(fringe, beamwidth):
    while len(fringe) > 0:
        cur = fringe[0]
        fringe = fringe[1:]
        if goalp(cur):
            return cur
        newnodes = makeNodes(cur, successors(cur))
        for s in newnodes:
            fringe = insertByH(s, fringe)
        fringe = fringe[:beamwidth]
    return []

Beam Search. Optimal? Complete? Hardly! Space? O(b) (it only generates the successors of the current node). Often useful in practice.

Creating Heuristics

Combining Heuristics If you have lots of heuristics and none dominates the others and all are admissible… Use them all! H(n) = max(h1(n), …, hm(n))
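As a minimal sketch of this rule (the heuristic functions h1 and h2 below are toy stand-ins, not from the slides):

```python
# Sketch: combine several admissible heuristics by taking their pointwise max.
# h1 and h2 are illustrative placeholders for real admissible heuristics.

def h1(n):
    return n // 2      # toy admissible estimate

def h2(n):
    return n // 3      # another toy admissible estimate

def combined_h(n, heuristics=(h1, h2)):
    """Max of admissible heuristics is admissible and at least as informed as each."""
    return max(h(n) for h in heuristics)

print(combined_h(12))  # max(6, 4) = 6
```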

Relaxed Heuristic. A relaxed problem is a problem with fewer restrictions on the actions. The cost of an optimal solution to a relaxed problem is an admissible heuristic for the original problem.

Relaxed Problems. Exact solutions to different (relaxed) problems. H1 (# of misplaced tiles) is perfectly accurate if a tile could move to any square. H2 (sum of Manhattan distances) is perfectly accurate if a tile could move 1 square in any direction. The cost of an optimal solution to a relaxed problem is an admissible heuristic for the original problem: the optimal solution of the original problem is, by definition, also a solution of the relaxed problem, and therefore must cost at least as much as the optimal solution of the relaxed problem.
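A minimal sketch of these two relaxed-problem heuristics for the 8-puzzle (the tuple state representation and the goal layout are assumptions made for illustration):

```python
# Sketch (assumed representation): a state is a tuple of 9 entries, 0 = blank,
# goal is (1, 2, ..., 8, 0). Each heuristic solves a relaxed version exactly.

GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)

def h1_misplaced(state, goal=GOAL):
    # Relaxation: a tile may jump to any square -> count tiles out of place.
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

def h2_manhattan(state, goal=GOAL):
    # Relaxation: a tile may move one square in any direction, ignoring occupancy.
    dist = 0
    for i, tile in enumerate(state):
        if tile == 0:
            continue
        j = goal.index(tile)
        dist += abs(i // 3 - j // 3) + abs(i % 3 - j % 3)
    return dist

s = (1, 2, 3, 4, 5, 6, 0, 7, 8)
print(h1_misplaced(s), h2_manhattan(s))  # 2 2
```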

Relaxed Problems. If the problem is defined formally as a set of constraints, relaxed problems can be generated automatically. Absolver (Prieditis, 1993) generates heuristics using relaxed problems and other techniques; it discovered a better heuristic for the 8-puzzle and the first useful heuristic for Rubik's cube. Note that, depending on the problem, the heuristic can sometimes be derived directly; otherwise you might have to compute h by running a breadth-first search on the relaxed problem.

Systematic Relaxation. Precondition list: a conjunction of predicates that must hold true before the action can be applied. Add list: a list of predicates that are to be added to the description of the world state as a result of applying the action. Delete list: a list of predicates that are no longer true once the action is applied and should therefore be deleted from the state description. Primitive predicates: ON(x, y): tile x is on cell y. CLEAR(y): cell y is clear of tiles. ADJ(y, z): cell y is adjacent to cell z.

Here is the full definition of a move for the n-puzzle. Move(x, y, z): precondition list ON(x, y), CLEAR(z), ADJ(y, z); add list ON(x, z), CLEAR(y); delete list ON(x, y), CLEAR(z).
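The Move action above can be written down as a STRIPS-style record, with relaxation implemented simply by dropping preconditions. This is a hedged sketch: the dictionary layout and the relax helper are illustrative choices, not part of the slides.

```python
# Sketch: the n-puzzle Move action as a STRIPS-style record; relaxing the
# problem means dropping predicates from the precondition list.
MOVE = {
    "pre":    [("ON", "x", "y"), ("CLEAR", "z"), ("ADJ", "y", "z")],
    "add":    [("ON", "x", "z"), ("CLEAR", "y")],
    "delete": [("ON", "x", "y"), ("CLEAR", "z")],
}

def relax(action, dropped):
    """Return a copy of the action without the named precondition predicates."""
    return {**action, "pre": [p for p in action["pre"] if p[0] not in dropped]}

manhattan_move = relax(MOVE, {"CLEAR"})          # yields Manhattan distance
misplaced_move = relax(MOVE, {"CLEAR", "ADJ"})   # yields "# tiles out of place"
print([p[0] for p in manhattan_move["pre"]])     # ['ON', 'ADJ']
```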

(1) Removing CLEAR(z) and ADJ(y, z) gives "# tiles out of place". In the pictured 15-puzzle state, the misplaced-tiles distance is 1+1=2 moves. [Figure: 15-puzzle board]

(2) Removing CLEAR(z) gives "Manhattan distance". In the pictured state, the Manhattan distance is 6+3=9 moves. [Figure: 15-puzzle board]

Pattern Database Heuristics The idea behind pattern database heuristics is to store exact solution costs for every possible sub-problem instance.

Solve part of the problem, ignoring the other tiles. [Figure: three boards showing only tiles 3, 7, 11, 12, 13, 14, 15]

Pattern Databases. The optimal solution cost of the subproblem is <= the optimal solution cost of the full problem. Run exhaustive search to find optimal solutions for every possible configuration of tiles 3, 7, 11, 12, 13, 14, 15, and store the results. Do the same for the other tiles (say, in two 4-tile subsets). Do this once, before any problem solving is performed. Expensive, but it can be worth it if the search will be applied to many problem instances (deployed).

Pattern Databases Recall, in our example, we have three subproblems (subsets of 7, 4, and 4 tiles) State S has specific configurations of those subsets h(s)?

h(s)? Look up the exact costs for s's configurations of the 7, 4, and 4 tiles in the database, and take the max! The max of a set of admissible heuristics is admissible.

What if it isn’t feasible to have entries for all possibilities? Take the max of the exact costs we do have and the Manhattan distance for those we don’t.
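A sketch of that lookup-with-fallback scheme (the database contents, pattern keys, and the stand-in Manhattan function below are illustrative assumptions, not real precomputed values):

```python
# Sketch: use exact pattern-database costs where available, and fall back to
# Manhattan distance otherwise; the max of admissible estimates is admissible.

def pattern_key(state, pattern_tiles):
    # Positions of the pattern's tiles in the state, ignoring everything else.
    return tuple(state.index(t) for t in pattern_tiles)

def pdb_heuristic(state, databases, manhattan):
    """databases: list of (pattern_tiles, {key: exact_cost}) pairs."""
    estimates = [manhattan(state)]
    for tiles, table in databases:
        key = pattern_key(state, tiles)
        if key in table:
            estimates.append(table[key])   # exact cost for this sub-configuration
    return max(estimates)

# Toy example: a 3-cell "puzzle", one pattern over tiles 1 and 2.
toy_db = [((1, 2), {(1, 0): 7})]
state = (2, 1, 0)
print(pdb_heuristic(state, toy_db, manhattan=lambda s: 3))  # max(3, 7) = 7
```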

Sums of admissible heuristics. We would like to take the sum rather than the max, since the result is more informed. In general, adding two admissible heuristics might not be admissible: for example, moves that solve one subproblem might also help another subproblem. But we can choose patterns that are disjoint, so we can sum them.

Disjoint Pattern Database Heuristics. Patterns that have no tiles in common (as in our example). When calculating costs for a pattern, only count moves of the tiles in the pattern. Add together the heuristic values for the individual patterns; the sum is admissible and more informed than taking the max.

Examples for Disjoint Pattern Database Heuristics. [Figure: 15-puzzle board split into a red and a blue tile pattern] 20 moves are needed to solve the red tiles and 25 moves to solve the blue tiles, so the overall heuristic is the sum, 20+25=45 moves.

A trivial example of disjoint pattern database heuristics is Manhattan distance, if we view every single tile as its own pattern database. [Figure: 15-puzzle board] The overall heuristic is the sum of the Manhattan distances of the individual tiles, 39 moves here.

For your interest: P. Haslum, A. Botea, M. Helmert, B. Bonet and S. Koenig. Domain-Independent Construction of Pattern Database Heuristics for Cost-Optimal Planning. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pages 1007-1012, 2007. http://idm-lab.org/bib/abstracts/Koen07g.html

Linear Conflict Heuristic Function. Def. (linear conflict): two tiles tj and tk are in a linear conflict if tj and tk are in the same line, the goal positions of tj and tk are both in that line, tj is to the right of tk, and the goal position of tj is to the left of the goal position of tk.

Linear Conflict Example. [Animation: a row contains tiles 3 and 1, whose goal positions in that row are in the opposite order; the Manhattan distance is 2+2=4 moves, but one tile must also step out of the row to let the other pass.] Add a penalty for each linear conflict.
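A hedged sketch of counting row conflicts like the one in the example (the tuple representation, the toy 2x2 board, and the goal layout are illustrative assumptions; columns are handled symmetrically and are omitted here):

```python
# Sketch: count row linear conflicts; each conflict adds a penalty of 2 moves
# on top of Manhattan distance. States are tuples with 0 as the blank.

def row_conflicts(state, goal, width):
    conflicts = 0
    for row in range(len(state) // width):
        cells = state[row * width:(row + 1) * width]
        for i in range(width):
            for j in range(i + 1, width):
                a, b = cells[i], cells[j]
                if a == 0 or b == 0:
                    continue
                ga, gb = goal.index(a), goal.index(b)
                # Both goals lie in this row, but a (left of b) belongs right of b.
                if ga // width == row and gb // width == row and ga > gb:
                    conflicts += 1
    return conflicts

# 2x2 toy board in the spirit of the slides: the top row holds tiles 3 and 1,
# whose goal order within that row is reversed.
goal = (1, 3, 2, 0)   # illustrative goal layout
state = (3, 1, 2, 0)
print(row_conflicts(state, goal, width=2))  # 1 conflict -> penalty of 2 moves
```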

Other Sources of Heuristics. Ad hoc, informal rules of thumb (guesswork). Approximate solutions to problems (from an algorithms course). Learning from experience (solving lots of 8-puzzles): each optimal solution provides learning examples (node, actual cost to goal). Learn a heuristic function, e.g. h(n) = c1*x1(n) + c2*x2(n), where x1 = # misplaced tiles and x2 = # adjacent tile pairs that are also adjacent in the goal state; c1 and c2 are learned (best fit to the training data).

Search Types Backtracking state-space search Local Search and Optimization Constraint satisfaction search Adversarial search

Local Search and Optimization. Previous searches keep paths in memory and remember alternatives so the search can backtrack; the solution is a path to a goal. The path may be irrelevant if only the final configuration is needed (8-queens, IC design, network optimization, …).

Local Search Use a single current state and move only to neighbors. Use little space Can find reasonable solutions in large or infinite (continuous) state spaces for which the other algorithms are not suitable Iterative improvement: start with a complete configuration and make modifications to improve it

Optimization Local search is often suitable for optimization problems. Search for best state by optimizing an objective function.

Visualization States are laid out in a landscape Height corresponds to the objective function value Move around the landscape to find the highest (or lowest) peak Only keep track of the current states and immediate neighbors

Local Search Algorithms. Two strategies for choosing the state to visit next: hill climbing and simulated annealing. Then, an extension to multiple current states: genetic algorithms.

Hill Climbing (Greedy Local Search). Generate nearby successor states of the current state, pick the best one, and replace the current state with it. Loop.
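That loop can be sketched as a minimal hill climber. The 1-D integer landscape and the single-peak objective below are toy stand-ins; real problems supply their own neighbors() and value().

```python
# Minimal hill-climbing sketch: repeatedly move to the best neighbor until
# no neighbor improves on the current state.

def hill_climb(start, value, neighbors):
    current = start
    while True:
        best = max(neighbors(current), key=value, default=current)
        if value(best) <= value(current):
            return current      # no neighbor improves: a (possibly local) maximum
        current = best

value = lambda x: -(x - 5) ** 2          # toy objective with a single peak at x = 5
neighbors = lambda x: [x - 1, x + 1]
print(hill_climb(0, value, neighbors))   # climbs to 5
```

Note that on a landscape with several peaks this same loop would stop at whichever local maximum it reaches first, which is exactly the failure mode discussed below.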

Hill-climbing search problems. Local maximum: a peak that is lower than the highest peak, so a bad solution is returned. Plateau: the evaluation function is flat, resulting in a random walk. Ridge: a sequence of local maxima; the slopes rise very gently toward a peak, so the search may oscillate from side to side (see the picture on page 114, which illustrates this well). [Figure: local maximum, plateau, ridge]

Random-restart hill climbing. Start different hill-climbing searches from random starting positions, stopping when a goal is found, and save the best result from any search so far. If all states have equal probability of being generated, it is complete with probability approaching 1 (a goal state will eventually be generated). Finding an optimal solution becomes a question of running a sufficient number of restarts. Surprisingly effective if there aren’t too many local maxima or plateaux. There are many variants of hill climbing.

Simulated Annealing. Based on a metallurgical metaphor: annealing is the process used to temper or harden metals and glass by heating them to a high temperature and then gradually cooling them. Start with the temperature set very high and slowly reduce it. Run hill climbing with the twist that you can occasionally replace the current state with a worse state, based on the current temperature and on how much worse the new state is.

Simulated Annealing. Like the cooling metal, the algorithm makes lots of moves at the start and then gradually slows down.

Simulated Annealing, more formally: generate a random new neighbor from the current state. If it’s better, take it. If it’s worse, take it with some probability that depends on the temperature and on the delta between the new and old states.

Simulated annealing. The probability of accepting a move decreases with the amount ΔE by which the evaluation is worsened. A second parameter, T, also determines the probability: high T allows more bad moves, while T close to zero results in few or no bad moves. The schedule input determines the value of T as a function of how many cycles have been completed.

function Simulated-Annealing(problem, schedule) returns a solution state
    inputs: problem, a problem
            schedule, a mapping from time to “temperature”
    current ← Make-Node(Initial-State[problem])
    for t ← 1 to ∞ do
        T ← schedule[t]
        if T = 0 then return current
        next ← a randomly selected successor of current
        ΔE ← Value[next] – Value[current]
        if ΔE > 0 then current ← next
        else current ← next only with probability e^(ΔE/T)
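A runnable Python rendering of that pseudocode, as a hedged sketch: the toy objective, the ±1 neighbor function, and the linear cooling schedule are illustrative parameter choices, not prescribed by the slides.

```python
# Sketch: simulated annealing maximizing value(x) = -(x-3)^2 over the integers.
import math
import random

def simulated_annealing(start, value, neighbor, schedule):
    current = start
    t = 1
    while True:
        T = schedule(t)
        if T <= 0:
            return current
        nxt = neighbor(current)
        delta = value(nxt) - value(current)
        # Always accept improvements; accept worse moves with prob e^(delta/T).
        if delta > 0 or random.random() < math.exp(delta / T):
            current = nxt
        t += 1

random.seed(0)
result = simulated_annealing(
    start=-10,
    value=lambda x: -(x - 3) ** 2,               # single peak at x = 3
    neighbor=lambda x: x + random.choice([-1, 1]),
    schedule=lambda t: max(0.0, 2.0 * (1 - t / 2000)),  # linear cooling to 0
)
print(result)
```

As T approaches zero the acceptance probability for worse moves vanishes, so the run ends as pure hill climbing near the peak.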

Intuitions Hill-climbing is incomplete Pure random walk, keeping track of the best state found so far, is complete but very inefficient Combine the ideas: add some randomness to hill-climbing to allow the possibility of escape from a local optimum

Intuitions. The algorithm wanders around during the early parts of the search, hopefully toward a good general region of the state space. Toward the end, the algorithm does a more focused search, making few bad moves.

Theoretical Completeness. There is a proof that if the schedule lowers T slowly enough, simulated annealing will find a global optimum with probability approaching 1. In practice, that may take far too many iterations. Even so, SA can be effective at finding good solutions.

Local Beam Search. Keep track of k states rather than just one, as in hill climbing. In comparison to the beam search we saw earlier, this algorithm is state-based rather than node-based.

Local Beam Search. Begins with k randomly generated states. At each step, all successors of all k states are generated. If any one is a goal, the algorithm halts; otherwise, it selects the best k successors from the complete list and repeats.
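Those steps can be sketched as follows; the integer states, single-peak objective, and exact-peak goal test are toy assumptions for illustration.

```python
# Sketch: local beam search keeping the best k states overall at each step.

def local_beam_search(starts, k, value, successors, is_goal, max_iters=100):
    states = list(starts)
    for _ in range(max_iters):
        pool = [s for st in states for s in successors(st)]  # all successors
        for s in pool:
            if is_goal(s):
                return s
        states = sorted(pool, key=value, reverse=True)[:k]   # keep best k
    return max(states, key=value)

print(local_beam_search(
    [0, 20], k=2,
    value=lambda x: -abs(x - 7),          # toy objective peaking at 7
    successors=lambda x: [x - 1, x + 1],
    is_goal=lambda x: x == 7,
))  # 7
```

Because the best k are chosen from the pooled successor list, both beams quickly collapse onto the more promising slope, which is the concentration behavior the next slide discusses.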

Local Beam Search Successors can become concentrated in a small part of state space Stochastic beam search: choose k successors, with probability of choosing a given successor increasing with value Like natural selection: successors (offspring) of a state (organism) populate the next generation according to its value (fitness)

Genetic Algorithms Variant of stochastic beam search Combine two parent states to generate successors

function GA(pop, fitness-fn)
    repeat
        new-pop = {}
        for i from 1 to size(pop):
            x = rand-sel(pop, fitness-fn)
            y = rand-sel(pop, fitness-fn)
            child = reproduce(x, y)
            if (small rand prob):
                child = mutate(child)
            add child to new-pop
        pop = new-pop
    until an individual is fit enough, or out of time
    return the best individual in pop, according to fitness-fn

function reproduce(x, y)
    n = len(x)
    c = random number from 1 to n
    return append(substr(x, 1, c), substr(y, c+1, n))

Example: n-queens Put n queens on an n × n board with no two queens on the same row, column, or diagonal

Genetic Algorithms Notes. Representation of individuals: the classic approach is a string over a finite alphabet, with each element in the string called a gene; usually binary, instead of AGTC as in real DNA. Selection strategy: random, with selection probability proportional to fitness; selection is done with replacement, so a very fit individual may reproduce several times. Reproduction: random pairing of selected individuals, with random selection of cross-over points; each gene can be altered by a random mutation.
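Putting the pseudocode and these notes together for the n-queens example gives the following hedged sketch. The board size, population size, mutation rate, and generation limit are illustrative parameter choices; an individual is a tuple of column positions (one queen per row) rather than a binary string, and fitness counts non-attacking pairs.

```python
# Sketch: a minimal genetic algorithm for n-queens (n = 6 here).
import random

N = 6
MAX_PAIRS = N * (N - 1) // 2   # 15 non-attacking pairs in a perfect solution

def fitness(ind):
    attacks = sum(1 for i in range(N) for j in range(i + 1, N)
                  if ind[i] == ind[j] or abs(ind[i] - ind[j]) == j - i)
    return MAX_PAIRS - attacks

def reproduce(x, y):
    c = random.randrange(1, N)                       # cross-over point
    return x[:c] + y[c:]

def mutate(ind):
    i = random.randrange(N)                          # alter one random gene
    return ind[:i] + (random.randrange(N),) + ind[i + 1:]

def select(pop):
    # Fitness-proportional selection, with replacement.
    return random.choices(pop, weights=[fitness(p) for p in pop], k=1)[0]

def ga(pop_size=60, generations=200, p_mut=0.2):
    pop = [tuple(random.randrange(N) for _ in range(N)) for _ in range(pop_size)]
    for _ in range(generations):
        best = max(pop, key=fitness)
        if fitness(best) == MAX_PAIRS:               # fit enough: all pairs safe
            return best
        new_pop = []
        for _ in range(pop_size):
            child = reproduce(select(pop), select(pop))
            if random.random() < p_mut:              # small mutation probability
                child = mutate(child)
            new_pop.append(child)
        pop = new_pop
    return max(pop, key=fitness)                     # out of time: return best

random.seed(1)
solution = ga()
print(solution, fitness(solution))
```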

Genetic Algorithms. When to use them? Genetic algorithms are easy to apply. Results can be good on some problems but bad on others. Genetic algorithms are not well understood.