Neural Heuristics For Problem Solving: Using ANNs to Develop Heuristics for the 8-Puzzle by Bambridge E. Peterson
What is a problem? (informal)
A question to be answered
A paradox to be resolved
An obstacle to be overcome
A goal to be achieved
A crisis to be averted
A challenge to be met
What is a problem? (formal)
Formulate the problem as a graph search:
1. Initial state (question) and goal state (answer)
2. Actions: the allowable actions for a given state
3. Transition function T(S, A): given a state S and an action A, returns the resulting state S' when A is performed in S
4. Goal test: function that tests whether we have reached the goal
5. Path-cost function: keeps track of the path cost
(from Artificial Intelligence: A Modern Approach, 3rd Edition, by Russell and Norvig)
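As a rough illustration, the five components can be written down for the 8-puzzle. This is a minimal sketch, not the code behind these slides: the tuple layout (row-major, 9 for the blank), the action names, and the function names are all assumptions made for the example.

```python
from typing import List, Tuple

State = Tuple[int, ...]            # 9 entries, row-major; 9 marks the blank (assumed layout)
GOAL: State = (1, 2, 3, 4, 5, 6, 7, 8, 9)

# Actions are phrased as moves of the blank: (name, row delta, column delta).
MOVES = (("up", -1, 0), ("down", 1, 0), ("left", 0, -1), ("right", 0, 1))


def actions(state: State) -> List[str]:
    """Allowable actions: directions the blank can move without leaving the grid."""
    row, col = divmod(state.index(9), 3)
    return [name for name, dr, dc in MOVES
            if 0 <= row + dr < 3 and 0 <= col + dc < 3]


def transition(state: State, action: str) -> State:
    """T(S, A): swap the blank with its neighbour in the given direction.

    The action is assumed to come from actions(state); it is not re-validated.
    """
    dr, dc = next((dr, dc) for name, dr, dc in MOVES if name == action)
    blank = state.index(9)
    row, col = divmod(blank, 3)
    target = (row + dr) * 3 + (col + dc)
    tiles = list(state)
    tiles[blank], tiles[target] = tiles[target], tiles[blank]
    return tuple(tiles)


def goal_test(state: State) -> bool:
    """True when every tile is in its goal position."""
    return state == GOAL


def path_cost(cost_so_far: int) -> int:
    """Every move costs 1, so the path cost is the number of moves made."""
    return cost_so_far + 1
```

With the blank in the bottom-right corner, `actions(GOAL)` offers only `"up"` and `"left"`, and `transition(GOAL, "up")` slides tile 6 down into the corner.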
Graph Search
(Figure: search graph with start node S and goal node G)
Idea:
Use an explored set to keep track of expanded nodes
Use a frontier to store successor nodes still to be expanded
Many search algorithms differ in how they store nodes in the frontier
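The explored-set and frontier idea can be sketched in a few lines. This is an illustrative version only: the function name, the `fifo` switch, and the toy graph are assumptions for the example. Switching the frontier from a queue to a stack is enough to turn breadth-first search into depth-first search.

```python
from collections import deque
from typing import Dict, Hashable, List, Optional


def graph_search(graph: Dict[Hashable, List[Hashable]], start: Hashable,
                 goal: Hashable, fifo: bool = True) -> Optional[list]:
    """Return a path from start to goal, or None if there is none.

    fifo=True pops the oldest frontier entry (breadth-first);
    fifo=False pops the newest one (depth-first).
    """
    frontier = deque([[start]])      # each entry is a path ending in a frontier node
    explored = set()                 # nodes that have already been expanded
    while frontier:
        path = frontier.popleft() if fifo else frontier.pop()
        node = path[-1]
        if node == goal:
            return path
        if node in explored:
            continue                 # reached earlier by another path
        explored.add(node)
        for successor in graph.get(node, []):
            if successor not in explored:
                frontier.append(path + [successor])
    return None


# Hypothetical toy graph with a start node S and a goal node G.
toy_graph = {"S": ["B", "A"], "A": ["C"], "B": ["G"], "C": ["G"], "G": []}
```

On the toy graph, `graph_search(toy_graph, "S", "G")` returns the two-step path through B, while `fifo=False` dives through A and C first and returns the longer path.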
Graph Search: Some Examples
Breadth-first search
Depth-first search
Iterative deepening
Uniform-cost search
Greedy best-first search
A*
Iterative-deepening A*
Graph Search: A* search
Order the frontier as a priority queue using the cost function f(n) = g(n) + h(n)
g(n): path cost to reach node n
h(n): the heuristic function, the estimated distance from n to the goal
A* is optimal if h(n) is admissible and consistent (consistency implies admissibility, so for graph search consistency is the condition that matters)
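A compact A* over a weighted graph shows how f(n) = g(n) + h(n) orders the frontier. This is a sketch under assumptions: the graph format (a dict of successor-to-cost dicts), the tie-breaking counter, and the toy data are made up for illustration.

```python
import heapq
import itertools
from typing import Callable, Dict, Hashable, Optional, Tuple


def a_star(graph: Dict[Hashable, Dict[Hashable, float]], start: Hashable,
           goal: Hashable, h: Callable[[Hashable], float]
           ) -> Optional[Tuple[float, list]]:
    """Return (cost, path) for a cheapest path, or None if the goal is unreachable."""
    tie = itertools.count()                     # breaks ties between equal f values
    frontier = [(h(start), next(tie), 0.0, [start])]   # entries: (f, tie, g, path)
    best_g = {start: 0.0}                       # cheapest known g(n) per node
    explored = set()
    while frontier:
        _, _, g, path = heapq.heappop(frontier)
        node = path[-1]
        if node == goal:
            return g, path
        if node in explored:
            continue
        explored.add(node)
        for successor, step_cost in graph[node].items():
            new_g = g + step_cost
            if successor not in explored and new_g < best_g.get(successor, float("inf")):
                best_g[successor] = new_g
                f = new_g + h(successor)        # f(n) = g(n) + h(n)
                heapq.heappush(frontier, (f, next(tie), new_g, path + [successor]))
    return None


# Hypothetical weighted graph and a consistent heuristic table for it.
weighted_graph = {"S": {"A": 1, "B": 4}, "A": {"B": 2, "G": 6}, "B": {"G": 2}, "G": {}}
h_table = {"S": 4, "A": 3, "B": 2, "G": 0}
```

Calling `a_star(weighted_graph, "S", "G", h_table.__getitem__)` finds the path S, A, B, G with cost 5, even though the direct edges S to B and A to G look shorter in hops.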
Heuristics in Graph Search
What is a heuristic?
o A general rule of thumb for solving a problem, usually developed through experience
What is an admissible heuristic?
o A heuristic that never overestimates the path cost to the goal
What is a consistent heuristic?
o One whose estimate never drops by more than the step cost: h(n) <= c(n, a, n') + h(n') for every successor n' of n, so f(n) never decreases along a path (monotone)
Why use heuristics?
o Brute-force search is slow when the state space is large
o A good heuristic reduces the number of nodes that must be explored
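On a small explicit graph the consistency condition can be checked edge by edge. The sketch below assumes a dict-of-dicts graph and a heuristic lookup table; both tables and the function name are hypothetical. The second table is admissible but not consistent, which shows the two properties are not the same.

```python
from typing import Dict, Hashable


def is_consistent(graph: Dict[Hashable, Dict[Hashable, float]],
                  h: Dict[Hashable, float], goal: Hashable) -> bool:
    """Check h(goal) == 0 and h(n) <= c(n, n') + h(n') on every edge."""
    if h[goal] != 0:
        return False
    return all(h[node] <= step_cost + h[successor]
               for node, edges in graph.items()
               for successor, step_cost in edges.items())


# Hypothetical example: true costs to G are S=5, A=4, B=2.
toy_graph = {"S": {"A": 1, "B": 4}, "A": {"B": 2, "G": 6}, "B": {"G": 2}, "G": {}}
consistent_h = {"S": 4, "A": 3, "B": 2, "G": 0}
# Never overestimates, but h(S) = 4 > c(S, A) + h(A) = 1 + 1.
admissible_only_h = {"S": 4, "A": 1, "B": 2, "G": 0}
```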
N-Puzzle
Board has n = i^2 cells for a positive integer i (the 8-puzzle has i = 3, n = 9)
Sliding-block puzzle on an i x i grid: n - 1 tiles and 1 ‘blank space’
Starts in a random state
Can move one tile at a time, exchanging places with the ‘blank’ space
Tiles can only move up, down, left or right
(Figure: 8-puzzle example)
Goal state is the numbers 1 through n in order, left to right, top to bottom (with n standing for the blank)
N-Puzzle Heuristics
Why use heuristics? The N-puzzle is a good example:
8-puzzle: 9!/2 = 181,440 total states
15-puzzle: 16!/2, approximately 10.5 trillion (1.05 x 10^13) states
24-puzzle: 25!/2, approximately 7.76 x 10^24 states
o Have fun with brute-force search in this state space
Something more ‘clever’ than the brute-force approach is needed...
Manhattan distance: sum, over all tiles, of the city-block distance between a tile's current position and its position in the goal state
Misplaced tiles: total number of tiles not in their goal-state position
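Both heuristics are short to write down. The sketch below assumes a row-major 9-element state with 9 as the blank, and assumes the blank is left out of both counts (the usual convention); the function names are illustrative.

```python
from typing import Sequence

BLANK = 9                                   # 9 stands for the blank space
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 9)


def manhattan_distance(state: Sequence[int], goal: Sequence[int] = GOAL,
                       width: int = 3) -> int:
    """Sum of city-block distances of every tile from its goal position."""
    total = 0
    for index, tile in enumerate(state):
        if tile == BLANK:
            continue                        # assumption: the blank is not counted
        goal_index = goal.index(tile)
        total += abs(index // width - goal_index // width)   # row distance
        total += abs(index % width - goal_index % width)     # column distance
    return total


def misplaced_tiles(state: Sequence[int], goal: Sequence[int] = GOAL) -> int:
    """Number of tiles (blank excluded) that are not in their goal position."""
    return sum(1 for tile, target in zip(state, goal)
               if tile != BLANK and tile != target)
```

For the state (3, 8, 2, 4, 5, 6, 1, 7, 9), the Manhattan distance is 8 and the misplaced-tile count is 5; the first is never smaller than the second, since each misplaced tile is at least one step from home.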
Symbolic vs. Subsymbolic
Symbolic:
1. Symbols plus rules for their arrangement in space and transformation in time (syntax): a general definition of language
2. Infinitely many meaningful arrangements can be generated from a finite set of symbols
3. Natural languages
4. Formal languages
Manhattan distance is a symbolic heuristic
Subsymbolic:
1. Connectionist
2. Parallel distributed processing
3. Simultaneous processing among multiple parallel channels
Can we use machine learning to develop heuristics? Subsymbolic heuristics, aka “Neural Heuristics”...
So the goal is to develop a ‘better’ heuristic for the 8-puzzle...
Generating Training Data
Generated 20,000 solved instances of the 8-puzzle
Used Python to generate states and solve them with the A* algorithm
Stored the instances in MongoDB as well as in a .txt file for processing in Octave
Note: a puzzle can be represented internally as a vector, e.g. (3, 8, 2, 4, 5, 6, 1, 7, 9), using 9 to represent the blank space. Obviously only certain operations can be performed...
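One way to produce states that are guaranteed solvable is a random walk backwards from the goal. The slides do not say how the states were actually generated, so this is only a sketch of one possible approach; every name in it is hypothetical. The parity check relies on a known fact for 3 x 3 boards: a state is solvable exactly when its tile inversion count is even.

```python
import random
from typing import List, Tuple

State = Tuple[int, ...]
GOAL: State = (1, 2, 3, 4, 5, 6, 7, 8, 9)
BLANK = 9


def neighbours(state: State) -> List[State]:
    """All states reachable by sliding one tile into the blank."""
    blank = state.index(BLANK)
    row, col = divmod(blank, 3)
    result = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < 3 and 0 <= c < 3:
            tiles = list(state)
            tiles[blank], tiles[r * 3 + c] = tiles[r * 3 + c], tiles[blank]
            result.append(tuple(tiles))
    return result


def random_solvable_state(steps: int, rng: random.Random) -> State:
    """Scramble the goal with `steps` random legal moves (assumed generation scheme)."""
    state = GOAL
    for _ in range(steps):
        state = rng.choice(neighbours(state))
    return state


def is_solvable(state: State) -> bool:
    """3 x 3 board: solvable iff the number of tile inversions is even."""
    tiles = [t for t in state if t != BLANK]
    inversions = sum(1 for i in range(len(tiles))
                     for j in range(i + 1, len(tiles)) if tiles[i] > tiles[j])
    return inversions % 2 == 0
```

Each generated state would then be solved with A* and stored together with its statistics; a random walk does not sample states uniformly, which may or may not matter for training.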
Training Data Fields
1. State
2. # states explored
3. # nodes added to frontier
4. MD heuristic
5. Path cost
6. Time (on my machine), in microseconds
Example: state 8, 7, 1, 2, 9, 6, 3, 4, ...
General statistics
(Table: MIN, MAX, MEAN and STD for each of Path-cost, MD, Frontier, Explored and Time)
Neural Heuristics
The idea...
Train various MLP networks with backpropagation
Goal is approximation (regression)
Train the network with different targets:
o the optimal solution cost
o the difference between the optimal solution cost and the Manhattan distance of the state
o perhaps another...
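The two targets are easy to derive from a solved instance. In the sketch below the function names and dictionary keys are illustrative, and adding the prediction back onto the Manhattan distance at query time is an assumption about how a residual target would be used, not something the slides spell out.

```python
from typing import Dict


def training_targets(path_cost: int, manhattan: int) -> Dict[str, int]:
    """Return the two regression targets for one solved instance."""
    return {
        "optimal": path_cost,                  # target 1: optimal solution cost
        "residual": path_cost - manhattan,     # target 2: optimal cost minus MD
    }


def heuristic_from_residual(manhattan: int, predicted_residual: float) -> float:
    """Assumed use at search time: add the predicted residual back onto MD."""
    return manhattan + predicted_residual
```

Because the Manhattan distance is admissible, the residual target is never negative: an instance with optimal cost 22 and Manhattan distance 14 yields targets 22 and 8.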
Neural Network Input
The 9-element input state S was transformed into an 81-element vector of 1’s and 0’s: bit 9 x k + t equals 1 if and only if S[k] = t
Examples (3-element case, so bit 3 x k + t):
[2, 1, 3] = [0 1 0, 1 0 0, 0 0 1]
[3, 2, 1] = [0 0 1, 0 1 0, 1 0 0]
Tried this because of the following paper: Likely-Admissible and Sub-Symbolic Heuristics
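The encoding is a one-hot code per board position. A minimal sketch, assuming positions k are counted from 0 and tile values t from 1 (the slide does not state the index base), with a hypothetical function name:

```python
from typing import List, Sequence


def encode_state(state: Sequence[int], size: int = 9) -> List[int]:
    """Return size * size bits; the bit for (k, t) is 1 iff state[k] == t.

    Assumed indexing: k is 0-based, t is 1-based, so the bit sits at
    list index size * k + (t - 1).
    """
    bits = [0] * (size * size)
    for k, t in enumerate(state):
        bits[size * k + (t - 1)] = 1
    return bits
```

With `size=3`, `encode_state([2, 1, 3], size=3)` sets exactly one bit in each group of three. For a full 8-puzzle state the result has 81 entries, 9 of which are 1.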
Neural Networks (cont.)
# hidden units: 5, 10 and 15 (one hidden layer)
Learning rate set at 0.1
Momentum 0.8
Number of epochs: [value missing], 64 samples per epoch
Used tanh activation function for the hidden layer
Sigmoid activation function on the output
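To make the architecture concrete, here is a forward pass and a single momentum update in plain Python. It is a sketch, not the Octave training code: the weight layout, the initialisation range, and all function names are assumptions, and the momentum rule shown is the classical form.

```python
import math
import random
from typing import List, Sequence, Tuple


def init_layer(n_in: int, n_out: int, rng: random.Random) -> List[List[float]]:
    """Small random weights; each row is one unit's weights plus a trailing bias."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)] for _ in range(n_out)]


def forward(x: Sequence[float], hidden_w: List[List[float]],
            output_w: List[List[float]]) -> float:
    """Inputs -> tanh hidden layer -> one sigmoid output, which lies in (0, 1)."""
    hidden = [math.tanh(sum(w * v for w, v in zip(row[:-1], x)) + row[-1])
              for row in hidden_w]
    out_row = output_w[0]
    z = sum(w * v for w, v in zip(out_row[:-1], hidden)) + out_row[-1]
    return 1.0 / (1.0 + math.exp(-z))


def momentum_step(weight: float, velocity: float, gradient: float,
                  learning_rate: float = 0.1, momentum: float = 0.8
                  ) -> Tuple[float, float]:
    """Classical momentum: keep a fraction of the old step, add the new gradient step."""
    velocity = momentum * velocity - learning_rate * gradient
    return weight + velocity, velocity
```

A sigmoid output stays strictly between 0 and 1, so this sketch implicitly assumes the regression targets are scaled into that range before training; the slides do not say whether that was done.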
Neural Networks (cont.)
13,000 samples used for the training set
2,000 samples for tuning
3,000 for testing the results of the trained MLP
3,000 for ‘official’ testing in Python using A*
Saved weights in a .txt file
Tested in Python using NumPy
Preliminary Results
A bit disappointing so far...
For the 3,000 remaining testing samples, I compared the stats between the Manhattan distance heuristic and the various neural heuristics developed in training
(Table: MIN, MAX, MEAN and STD for each heuristic: MD, h*, h*_md, h*_md_avg)
h*: heuristic developed with the optimal path cost as target
h*_md: heuristic developed with the optimal path cost minus the Manhattan distance as target
h*_md_avg: mean of the two heuristics above
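The comparison could be computed along these lines. The sample values below are made up purely to exercise the code, the function names are hypothetical, and using the sample standard deviation (n - 1 in the denominator, which is also Octave's default for `std`) is an assumption about how STD was measured.

```python
import statistics
from typing import Dict, Sequence


def averaged_heuristic(h_star: float, h_star_md: float) -> float:
    """h*_md_avg: mean of the two neural heuristic values for one state."""
    return (h_star + h_star_md) / 2.0


def summarize(values: Sequence[float]) -> Dict[str, float]:
    """MIN, MAX, MEAN and STD of one column of the comparison table."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),    # sample standard deviation (assumed)
    }


# Made-up heuristic values for eight states, for illustration only.
example_values = [2, 4, 4, 4, 5, 5, 7, 9]
```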
Preliminary Results (cont.)
Examples…
Using the MD heuristic, it takes less than 1 second to solve 10 8-puzzle examples. The average number of nodes explored for these examples is 963, with 1,508 nodes added to the frontier
For the same puzzles, using h*, it took over 2 minutes to solve them, with an average of 27,000 nodes explored and added to the frontier
Something isn’t right here...
Next Up
Double-check the code for errors
Try a 9-h-1 topology, using just the state as input without the transformation into a bit vector
SVM: give a Support Vector Machine a crack at it
Discuss with Professor Hu
Still a week left!
Questions?