Origin of Heuristic Functions


Chapter 6: Origin of Heuristic Functions

Heuristic from Relaxed Models A heuristic function returns the exact cost of reaching a goal in a simplified or relaxed version of the original problem. This means that we remove some of the constraints of the problem. Removing constraints = adding edges

Example: Road navigation. A good heuristic is the straight-line (air) distance. We remove the constraint that we must move along roads and allow movement in a straight line between any two points. This is a relaxation of the original problem: in effect, we added the edges of the complete graph.

Example 2: The TSP. We can describe the problem as a graph with three constraints: 1) the tour covers all the cities; 2) every node has degree two (one edge entering the node and one edge leaving it); 3) the graph is connected. If we remove constraint 2, we get a spanning graph, and the optimal solution to this relaxed problem is a minimum spanning tree (MST). If we remove constraint 3, the graph need not be connected, and the optimal solution is the solution to the assignment problem.
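The MST relaxation can be made concrete. The following is an illustrative Python sketch (not from the slides): Prim's algorithm over an explicit distance matrix computes the MST cost, which is an admissible lower bound on the tour cost, since deleting any one edge of a tour leaves a spanning tree.

```python
import heapq

def mst_cost(dist):
    """Prim's algorithm: cost of a minimum spanning tree.

    dist[i][j] is the edge weight between cities i and j.
    The MST cost is an admissible heuristic (lower bound) for the TSP,
    because removing one edge from any tour yields a spanning tree.
    """
    n = len(dist)
    in_tree = [False] * n
    total = 0.0
    heap = [(0.0, 0)]          # (cost of cheapest known edge, city)
    remaining = n
    while remaining:
        cost, u = heapq.heappop(heap)
        if in_tree[u]:
            continue
        in_tree[u] = True
        total += cost
        remaining -= 1
        for v in range(n):
            if not in_tree[v]:
                heapq.heappush(heap, (dist[u][v], v))
    return total
```

For three cities with pairwise distances 1, 2, and 4, the MST cost is 3, while any tour costs 7; the bound is loose but never overestimates.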

Example 3: Tile puzzle problems. One of the constraints in this problem is that a tile can only slide into the position occupied by the blank. If we remove this constraint, we allow any tile to move to any horizontally or vertically adjacent position. The cost of solving each tile in this relaxed problem is exactly the Manhattan distance to its goal location.
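The resulting heuristic is short to implement. An illustrative sketch (states are assumed to be tuples listing the tile in each position, row by row, with 0 for the blank):

```python
def manhattan(state, goal, width=4):
    """Sum of Manhattan distances of all tiles to their goal locations.

    Admissible because in the relaxed problem each tile moves one
    adjacent position per move, so a tile whose position differs from
    its goal by k rows+columns needs at least k moves.
    """
    where = {tile: i for i, tile in enumerate(goal)}  # tile -> goal index
    total = 0
    for pos, tile in enumerate(state):
        if tile == 0:          # the blank contributes nothing
            continue
        g = where[tile]
        total += abs(pos // width - g // width) + abs(pos % width - g % width)
    return total
```

For the 8-puzzle (width 3), a state one slide away from the goal gets value 1.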

The STRIPS Problem formulation We would like to derive such heuristics automatically. In order to do that we need a formal description language that is richer than the problem space graph. One such language is called STRIPS. In this language we have predicates and operators. Let’s see a STRIPS representation of the Eight Puzzle Problem

STRIPS - Eight Puzzle Example. Predicates: On(x,y) = tile x is in location y; Clear(z) = location z is clear; Adj(y,z) = location y is adjacent to location z. Operator: Move(x,y,z) = move tile x from location y to location z. In the language, each operator has: a precondition list (to execute Move(x,y,z) we must have On(x,y), Clear(z), and Adj(y,z)); an add list (predicates that were false before the operator and are true after it is executed); and a delete list (a subset of the preconditions that are no longer true after the operator is executed).

STRIPS - Eight Puzzle Example Now in order to construct a simplified or relaxed problem we only have to remove some of the preconditions. For example - by removing Clear(z) we allow tiles to move to adjacent locations. In general, the hard part is to identify which relaxed problems have the property that their exact solution can be efficiently computed.
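The precondition/add/delete machinery, and relaxation by dropping a precondition, can be sketched as follows (illustrative Python, not a standard planning library; predicate strings like "On(x,y)" stand in for ground atoms):

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Operator:
    name: str
    preconditions: frozenset   # predicates required before applying
    add_list: frozenset        # predicates made true by the operator
    delete_list: frozenset     # predicates made false by the operator

def applicable(op, state):
    return op.preconditions <= state

def apply_op(op, state):
    return (state - op.delete_list) | op.add_list

def relax(op, dropped):
    """Relaxed operator: remove one precondition (and its delete-list copy)."""
    return replace(op,
                   preconditions=op.preconditions - {dropped},
                   delete_list=op.delete_list - {dropped})

# Move tile x from location y to adjacent location z.
move = Operator(
    name="move(x,y,z)",
    preconditions=frozenset({"On(x,y)", "Clear(z)", "Adj(y,z)"}),
    add_list=frozenset({"On(x,z)", "Clear(y)"}),
    delete_list=frozenset({"On(x,y)", "Clear(z)"}),
)

state = frozenset({"On(x,y)", "Adj(y,z)"})          # location z is NOT clear
assert not applicable(move, state)                  # blocked in the original
assert applicable(relax(move, "Clear(z)"), state)   # allowed after relaxing
```

Dropping Clear(z) is exactly the relaxation that yields the Manhattan distance for the tile puzzles.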

Admissibility and Consistency. The heuristics derived by this method are both admissible and consistent. Note: the cost in the simplified graph should be as close as possible to the cost in the original graph. Admissibility holds because the lowest-cost path in the simplified graph has an equal or lower cost than the lowest-cost path in the original graph. Consistency holds because h(n) is the actual optimal cost of reaching a goal in the graph of the relaxed problem, so for every neighbor n' of n: h(n) ≤ c(n,n') + h(n').

Method 2: Pattern databases. A different method of abstracting and relaxing the problem to obtain a simplified problem. Invented in 1996 by Culberson & Schaeffer.

Optimal path search algorithms. For small graphs that are provided explicitly, we use algorithms such as Dijkstra's shortest path, Bellman-Ford, or Floyd-Warshall, with complexity polynomial in the number of nodes (e.g., O(n^2) for Dijkstra with an array). For very large graphs, which are defined implicitly, we use the A* algorithm, a best-first search algorithm.

Best-first search schema. Best-first search sorts all generated nodes in an open list and chooses the node with the best cost value for expansion. generate(x): insert x into the open list. expand(x): delete x from the open list and generate its children. Best-first search depends on its cost (heuristic) function; different functions cause it to expand different nodes.

Best-first search: cost functions. g(x): the real distance from the initial state to x. h(x): the estimated remaining distance from x to the goal state (examples: air distance, Manhattan distance). Different combinations of g and h yield different algorithms: f(x) = level(x) gives breadth-first search; f(x) = g(x) gives Dijkstra's algorithm; f(x) = h(x) gives pure heuristic search (PHS); f(x) = g(x) + h(x) gives the A* algorithm (1968).
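This "one schema, many cost functions" view can be captured in a single generic routine. An illustrative sketch (the `neighbors` and `f` callables are assumptions of this example; optimality naturally depends on which f is plugged in):

```python
import heapq
import itertools

def best_first(start, goal, neighbors, f):
    """Generic best-first search over an implicitly defined graph.

    neighbors(state) yields (successor, edge_cost) pairs.
    f(state, g) ranks nodes on the open list. Plugging in different f
    gives different algorithms:
        f = lambda s, g: g           -> Dijkstra
        f = lambda s, g: h(s)        -> pure heuristic search
        f = lambda s, g: g + h(s)    -> A*
    Returns the cost g at which the goal is first expanded, else None.
    """
    counter = itertools.count()                 # tie-breaker for the heap
    open_list = [(f(start, 0), next(counter), start, 0)]
    closed = {}                                 # state -> best expanded g
    while open_list:
        _, _, state, g = heapq.heappop(open_list)
        if state in closed and closed[state] <= g:
            continue                            # already expanded cheaper
        closed[state] = g
        if state == goal:
            return g
        for succ, cost in neighbors(state):
            g2 = g + cost
            heapq.heappush(open_list, (f(succ, g2), next(counter), succ, g2))
    return None
```

With f(s, g) = g this behaves as Dijkstra: on the graph a-b (cost 1), b-c (cost 1), a-c (cost 4), the path a to c is found with cost 2.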

A* (and IDA*). A* is a best-first search algorithm that uses f(n) = g(n) + h(n) as its cost function. f(n) in A* is an estimate of the cost of the shortest path to the goal via n. A* is admissible, complete, and optimally efficient [Pearl 84]. Result: any other optimal search algorithm will expand at least all the nodes expanded by A*.

Domains. The 15 puzzle has about 10^13 states; it was first solved optimally by [Korf 85] with IDA* and the Manhattan distance, taking 53 seconds. The 24 puzzle has about 10^24 states; it was first solved by [Korf 96], taking two days.

Domains Rubik’s cube 10^19 states First solved by [Korf 97] Takes 2 days to solve

(n,k) Pancake Puzzle. An array of N tokens (pancakes). Operators: the first k consecutive tokens can be reversed, for any k. The 17-pancake version has about 10^13 states; the 20-pancake version has about 10^18 states.

(n,k) Top Spin Puzzle n tokens arranged in a ring States: any possible permutation of the tokens Operators: Any k consecutive tokens can be reversed The (17,4) version has 10^13 states The (20,4) version has 10^18 states

4-peg Towers of Hanoi (TOH4). Harder than the well-known 3-peg Towers of Hanoi. There is a conjecture about the length of the optimal path, but it has not been proven. The state space for k disks is of size 4^k.

How to improve search? Enhanced algorithms: perimeter search [Dillenburg and Nelson 95], RBFS [Korf 93], frontier search [Korf and Zhang 2003], breadth-first heuristic search [Zhou and Hansen 04]. They all try to explore the search tree better. Better heuristics: more parts of the search tree will be pruned.

Better heuristics. In the third millennium we have very large memories, so we can build large tables. Enhanced algorithms use them for large open lists or transposition tables, which store nodes explicitly. A more intelligent way is to store general knowledge, and we can do this with heuristics.

Subproblems - Abstractions. Many problems can be abstracted into subproblems that must also be solved. A solution to the subproblem is a lower bound on the cost of solving the entire problem. Example: Rubik's cube [Korf 97]. Problem: the 3x3x3 Rubik's cube. Subproblem: the 2x2x2 corner cubies.

Pattern database heuristics. A pattern database (PDB) is a lookup table that stores solutions to all configurations of the subproblem (the patterns). This PDB is used as a heuristic during the search. A mapping (projection) takes the search space (e.g., 10^19 states for Rubik's cube) onto the much smaller pattern space (e.g., 88 million states for the corner cubies).
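A minimal sketch of how such a table can be built (illustrative Python, not the authors' code): a breadth-first search outward from the projected goal enumerates the entire pattern space. Non-pattern tiles are mapped to a "don't care" symbol, and, in the Culberson & Schaeffer style, every move is counted, so this particular database is non-additive.

```python
from collections import deque

def build_pdb(goal, pattern, width=3):
    """Build a sliding-tile pattern database by BFS over the pattern space.

    Non-pattern tiles become '*', so many real states project onto one
    abstract state. The table maps each abstract state to its exact
    distance to the projected goal, an admissible heuristic for the
    original puzzle.
    """
    def project(state):
        return tuple(t if t in pattern or t == 0 else '*' for t in state)

    def successors(state):
        rows = len(state) // width
        b = state.index(0)                     # blank position
        r, c = divmod(b, width)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < width:
                s = list(state)
                n = nr * width + nc
                s[b], s[n] = s[n], s[b]        # slide the neighbor tile
                yield tuple(s)

    start = project(goal)
    pdb = {start: 0}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        for t in successors(s):
            if t not in pdb:
                pdb[t] = pdb[s] + 1
                queue.append(t)
    return pdb, project
```

For the 8-puzzle with pattern tiles {1, 2, 3}, the pattern space has 9 * 8 * 7 * 6 = 3024 abstract states, and a lookup is just `pdb[project(state)]`.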

Non-additive pattern databases. The fringe pattern database [Culberson & Schaeffer 1996] has only 259 million states, yet improves over the Manhattan distance by a factor of about 100.

Example - 15 puzzle. How many moves do we need to move tiles 2, 3, 6, and 7 from locations 8, 12, 13, and 14 to their goal locations? The answer is stored in PDB[8][12][13][14] = 18.

Disjoint Additive PDBs (DADB). If you have many PDBs, you can take their maximum. But values of disjoint pattern databases can be added and the sum is still admissible [Korf & Felner: AIJ-02; Felner, Korf & Hanan: JAIR-04]. Additivity can be applied if the cost of a subproblem is composed only of costs of objects from the corresponding pattern.
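The additive lookup itself is just a sum. An illustrative sketch, assuming each database in the partition was built counting only moves of its own pattern tiles (the `pdbs` list of (table, projection) pairs below is hypothetical):

```python
def additive_h(state, pdbs):
    """Sum lookups from disjoint pattern databases.

    Each entry of `pdbs` is a (table, projection) pair. Because the
    patterns are disjoint and each table counts only moves of its own
    tiles, no real move is counted twice, so the sum stays admissible.
    """
    return sum(table[project(state)] for table, project in pdbs)
```

With two toy tables returning 18 and 3 for some state, the combined heuristic is 21.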

DADB: Tile puzzle results.

15 puzzle:
Heuristic    Value   Nodes        Time     Memory
Breadth-FS   -       ~10^13       28 days  3 terabytes  [Korf, AAAI 2005]
Manhattan    36.942  401,189,630  53.424   -
5-5-5        41.562  3,090,405    0.541    3,145
6-6-3        42.924  617,555      0.163    33,554
7-8          45.632  36,710       0.034    576,575

24 puzzle:
Heuristic    Nodes            Time     Memory
6-6-6-6      360,892,479,671  2 days   242,000

Heuristics for the TOH Infinite peg heuristic (INP): Each disk moves to its own temporary peg. Additive pattern databases [Felner, Korf & Hanan, JAIR-04]

Additive PDBs for TOH4. Partition the disks into disjoint sets. Store the cost of the complete pattern space of each set in a pattern database. Add the values from these PDBs to obtain the heuristic value. The n-disk problem contains 4^n states. The largest database we stored was for 14 disks, which needed 4^14 entries = 256 MB.

Duality  Motivation What is the relation between state S and states S1 and S2? S 3 1 4 2 5 S1 5 2 4 1 3 Geometrical symmetries!! Reversing and/or rotating S2 1 4 2 5 3 And what is the relation between S and Sd?? Sd 2 4 1 3 5

Symmetries in PDBs. Symmetric lookups: in the tile puzzles, reflect the tiles about the main diagonal; in Rubik's cube, rotate the cube. We can take the maximum among the different lookups. These are all geometrical symmetries; we suggest a new type of symmetry.

Duality: definition 1. Let S be a state, and let Π be the permutation such that Π(S) = G. Define the dual state S^d = Π(G); equivalently, Π^-1(S^d) = G. Consequence: the length of the optimal path from S to G and from S^d to G is identical, so an admissible heuristic for S is also admissible for S^d.

Regular and dual representations. Regular representation of a problem: variables are objects (tiles, cubies, etc.), values are locations. Dual representation: variables are locations, values are objects.

Duality: definition 2. For a state S, we flip the roles of variables and objects. Assume the vector <3,1,4,2>. Regular representation: <3,1,4,2>. Dual representation: <2,4,1,3>.
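Definition 2 amounts to inverting the permutation. A small illustrative sketch (states assumed to be tuples with objects numbered from 1):

```python
def dual(state):
    """Dual (inverse) permutation: flip the roles of objects and locations.

    state[i] = j means object j sits at location i (1-based numbering);
    in the dual state, location j holds object i.
    """
    d = [0] * len(state)
    for loc, obj in enumerate(state):
        d[obj - 1] = loc + 1
    return tuple(d)
```

The dual of <3,1,4,2> is <2,4,1,3>, and taking the dual twice returns the original state.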

Duality Claim: Definition 1 and definition 2 are equivalent Proof: Assume that in S, object j is in location i and that Π(i)=j. Applying Π for the first time (on S) will move object j to location j. Applying Π for the second time (on G) will move object i to location j

Using duality. Dual lookup: we can take the heuristic of the dual state and use it for the regular state; in particular, we can perform a PDB lookup for the dual state. Dual search: a novel search algorithm that can be constructed from any known search algorithm.

Dual Search. When the search arrives at a state S, we also look at its dual state S^d. We might consider jumping and continuing the search from S^d towards the goal. This is a novel version of bidirectional search.

Example. (a) No jumps: traditional search from S towards G. (b) One jump: after jumping to the dual state, the search continues towards G bidirectionally. Construction of the solution path is possible by applying the usual backtracking with some simple modifications.

When to jump. At every node, a decision should be made whether to continue the search from S or to jump to S^d. Jumping policies: JIL (jump if larger); JOR (jump only at the root); J15 and J24 (special jumping policies for the 15 and 24 tile puzzles).

Experimental results: Rubik's cube, 7-edge PDB, 1000 problem instances.

Heuristic   Search  Policy  Nodes       Time
r           IDA*    -       90,930,662  28.18
d           IDA*    -       8,315,116   3.24
max(r,d)    IDA*    -       2,997,539   1.34
max(r,d)    DIDA*   JIL     2,697,087   1.16
max(r,d)    DIDA*   JOR     2,464,685   1.02

Experimental results: 16-pancake problem, 9-token PDB, 100 problem instances.

Heuristic   Search  Policy  Nodes            Time
r           IDA*    -       342,308,368,717  284,054
d           IDA*    -       14,387,002,121   12,485
max(r,d)    IDA*    -       2,478,269,076    3,086
max(r,d)    DIDA*   JIL     260,506,693      362

Experimental results: 15 puzzle, 7-8 tile PDB, 1000 problem instances from [Korf & Felner 2002].

Heuristic        Search  Policy  Value  Nodes    Time
r                IDA*    -       44.75  136,289  0.081
max(r,r*)        IDA*    -       45.63  36,710   0.034
max(r,r*,d,d*)   IDA*    -       46.12  18,601   0.022
max(r,r*,d,d*)   DIDA*   J15     -      13,687   0.018

Experimental results: 24 puzzle, 6-6-6-6 tile PDB, 50 problem instances from [Korf & Felner 2002].

Heuristic        Search  Policy  Nodes
max(r,r*)        IDA*    -       43,454,810,045
max(r,r*,d,d*)   IDA*    -       13,549,943,868
max(r,d*)        DIDA*   J24     8,248,769,713
max(r,r*,d,d*)   DIDA*   J24     3,948,614,947

Conclusions. Duality in search spaces. Two ways to use duality: 1) the dual heuristic; 2) the dual search. Both improve performance.

Inconsistent heuristics. Inconsistency sounds negative: "It is hard to concoct heuristics that are admissible but are inconsistent" [AI book, Russell and Norvig 2005]; "Almost all admissible heuristics are consistent" [Korf, AAAI-2000].

A*. Best-first search with a cost function of f(n) = g(n) + h(n). g(n): the actual distance from the initial state to n. h(n): the estimated remaining distance from n to the goal state. h(n) is admissible if it always underestimates the true cost. Examples of h: air distance, Manhattan distance.

Attributes of A*. If h(n) is admissible, then A* is guaranteed to return an optimal (shortest) solution. A* is admissible, complete, and optimally efficient [Pearl & Dechter 84]. However, A* is exponential in memory!

IDA*. IDA* [Korf 1985] is a linear-space version of A*: a series of DFS calls with increasing thresholds. Since 1985 it has been the default algorithm for many problems. Note that IDA* re-expands duplicate nodes.

Consistent heuristics. A heuristic is consistent if for every two nodes n and m, |h(n) - h(m)| ≤ dist(n,m); equivalently, h(n) ≤ c(n,m) + h(m). Intuition: h cannot change by more than the change in g.

Inconsistent heuristics. A heuristic is inconsistent if for some two nodes n and m, |h(n) - h(m)| > dist(n,m). Example: a parent with g=5, h=5, f=10 and, across an edge of cost 1, a child with g=6, h=2, f=8. The child is inconsistent with its parent.

Pathmax. The pathmax (PMX) method corrects inconsistent heuristics [Mero 84, Martelli 77]: a child inherits its parent's f-value if it is larger. In the example, the child with g=6 and h=2 inherits the parent's f=10, so its h is effectively raised from 2 to 4 and its f from 8 to 10.
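The rule is a one-liner. An illustrative sketch using the hypothetical parent/child values from the example:

```python
def pathmax_f(parent_f, child_g, child_h):
    """Pathmax: the child's f-value is raised to its parent's when larger.

    With an admissible h, f-values along an optimal path never decrease,
    so taking the max with the parent's f keeps admissibility while
    making f monotone non-decreasing along the path.
    """
    return max(parent_f, child_g + child_h)
```

With the slide's numbers (parent f=10, child g=6, h=2), the child's f is raised from 8 to 10.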

Reopening of nodes with A*. With an inconsistent heuristic, A* may re-expand closed nodes: in the example graph, node d is expanded twice.

In the context of A*, inconsistency was considered a bad attribute.

Inconsistency and IDA*. Node re-opening is not a problem with IDA*, because each path to a node is examined anyway. There is no overhead for inconsistent heuristics.

Bidirectional pathmax (BPMX). With BPMX, h-values are propagated in both directions, decreasing by the edge cost along each edge. In the example, a child with h=5 raises its parent's h from 2 to max(2, 5-1) = 4, and the parent in turn raises its other child's h from 1 to 3. If the IDA* threshold is 2, then with BPMX the right child will not even be generated.
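One propagation step of this rule can be sketched as follows (illustrative Python; the single-step, unit-cost framing is an assumption of this example, not the authors' implementation):

```python
def bpmx(h_parent, h_children, cost=1):
    """One BPMX step between a node and its children (uniform edge cost).

    A large h-value anywhere can be propagated to its neighbors, shrinking
    by the edge cost: for any neighbor m of n, h(n) >= h(m) - c(n, m).
    Returns the updated (h_parent, h_children).
    """
    # child -> parent propagation
    h_p = max([h_parent] + [h - cost for h in h_children])
    # parent -> children propagation
    h_c = [max(h, h_p - cost) for h in h_children]
    return h_p, h_c
```

With the slide's numbers (parent h=2, children h=5 and h=1), the parent rises to 4 and the weak child to 3, so a threshold of 2 prunes it.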

Achieving inconsistent heuristics Random selection of heuristics Dual evaluations [Zahavi et al 2006] Compressed pattern databases [Felner et al 2004]

More than one heuristic. A number of different PDBs. Symmetric lookups: in the tile puzzles, reflect the tiles about the main diagonal; in Rubik's cube, rotate the cube.

Using K heuristics. Taking the maximum of K heuristics is admissible, consistent, and better than each of them. Drawbacks: the overhead of K heuristic lookups, and diminishing returns. Alternatively, we can randomize which of the K heuristics to consult. This is inconsistent, but has benefits: only one lookup, and BPMX can be activated.

Duality [Zahavi et al. AAAI-2006]. Let S be a state, and let Π be the permutation such that Π(S) = G. Define the dual state S^d = Π(G). Requirement: any sequence of operators applicable to S is also applicable to G. If the operators are invertible with the same cost, then an admissible heuristic for S is also admissible for S^d.

Achieving inconsistent heuristics. Dual evaluations are inconsistent [Zahavi et al. AAAI-2006], and so are compressed pattern databases. In general, any partial heuristic is inconsistent.

Inconsistency and duality. Example with regular states A, B, C and their dual states: the regular lookup for A, A^d, B, and B^d is PDB[1,2,3,4,5] = 0; the regular lookup for C is PDB[1,2,3,7,6] = 1; the dual lookup for C is PDB[1,2,3,8,9] = 2.

Consistent vs. inconsistent heuristics. Assume a threshold of 5 and compare the search trees pruned with a consistent heuristic versus an inconsistent one.

Heuristic value distribution Notice that all these heuristics have the same average value

Rubik's cube results: 7-edge PDB over 1000 instances of depth 14.

Heuristic     Lookups  Nodes       Time
Regular       1        90,930,662  28.18
Dual          1        19,653,386  7.38
Dual+BPMX     1        8,315,116   3.24
Random        1        9,652,138   3.30
Random+BPMX   1        3,828,138   1.25
Maximum       2        13,380,154  7.85
Maximum       4        10,574,180  11.60

Tile puzzle results: 7-8 additive PDB over 1000 instances.

Heuristic           Lookups  Nodes    Time
Regular             1        136,289  0.081
Dual+BPMX           1        247,299  0.139
Random+BPMX         1        44,829   0.029
Regular+reflected   2        36,130   0.034
2 Random+BPMX       2        26,862   0.025
3 Random+BPMX       3        21,425   0.026
All 4               4        18,601   0.022