Playing Tic-Tac-Toe with Neural Networks Zachary McNellis CPSC 4820
What is a robot? Sense, think, act
Sense, Think, Act
Robotics technology consists of mechanisms that can:
  Sense – Feedback devices (sensors) record information about the environment.
  Think – The information is processed in some way, simple or complex.
  Act – The most visible part of a robot. It can be anything from outputting a value to making the robot walk.
The electronic signals produced by sensing and thinking control whatever the robot is designed to do, such as lifting a sick person, making a facial expression, or driving the motors that let it navigate around an obstacle.
Creating a tic-tac-toe engine Board representation What move to make? Win, Lose, Draw
Board Representation
  Board string: 3 0 2 1 1 1 1 1 1 1 1 1
    First three fields: dimension, player 1, player 2
    Remaining fields: the 9 board positions
  Cell values: player 1 = 0, player 2 = 2, empty = 1
  Positions labeled 1-9
  [Figure: 3x3 grid showing the position labels]
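A minimal sketch of how a board string like the one on this slide could be parsed. The `Board` class and the function names are hypothetical, and the cell codes (player 1 = 0, player 2 = 2, empty = 1) are read off the slide's legend. The two fields after the dimension are kept as-is, since the slide only labels them "player 1, player 2".

```python
from dataclasses import dataclass

EMPTY = 1  # cell codes from the slide: player 1 = 0, player 2 = 2, empty = 1


@dataclass
class Board:
    dimension: int
    header: tuple  # the two fields labeled "player 1, player 2" on the slide
    cells: list    # dimension * dimension cell codes, in position order


def parse_board(text):
    """Parse a board string such as '3 0 2 1 1 1 1 1 1 1 1 1'."""
    values = [int(token) for token in text.split()]
    dimension = values[0]
    cells = values[3:]
    if len(cells) != dimension * dimension:
        raise ValueError("expected %d cells, got %d"
                         % (dimension * dimension, len(cells)))
    return Board(dimension, (values[1], values[2]), cells)


def empty_positions(board):
    """Return the 1-based positions that are still empty."""
    return [i + 1 for i, cell in enumerate(board.cells) if cell == EMPTY]


board = parse_board("3 0 2 1 1 1 1 1 1 1 1 1")
print(empty_positions(board))
```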
What Move to Make?
What does a tic-tac-toe engine do?
  Input: board state
    Ex. “3 0 2 1 1 1 1 1 1 1 1 1 | ./my_engine”
  Output: next move
    Avoid collisions (never play an occupied position)
    Ex. “5”
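The input/output contract above could be sketched as follows. This is an illustration only: `legal_moves` and `run_engine` are hypothetical names, the engine is assumed to read the board string from standard input, and "first empty position" is a placeholder policy, not one of the engines from this talk.

```python
import io

EMPTY = 1  # assumed cell code for an empty position, per the board slide


def legal_moves(board_text):
    """Return the 1-based positions that can be played without a collision."""
    values = [int(token) for token in board_text.split()]
    cells = values[3:]  # skip the dimension and the two player fields
    return [i + 1 for i, cell in enumerate(cells) if cell == EMPTY]


def run_engine(stdin, stdout):
    """Read a board state from stdin and write the next move to stdout."""
    moves = legal_moves(stdin.read())
    if not moves:
        raise ValueError("board is full")
    stdout.write("%d\n" % moves[0])  # placeholder policy: first empty position


# Demonstration with in-memory streams instead of a real pipe.
out = io.StringIO()
run_engine(io.StringIO("3 0 2 1 1 1 1 0 1 1 1 1"), out)
print(out.getvalue().strip())
```

A real engine would call `run_engine(sys.stdin, sys.stdout)` so that it can sit at the end of a shell pipe.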
Win, Lose, or Draw
“playtictactoe.py”
  Input
    Number of games
    Engine 1
    Engine 2
  Output
    Game progression
    Player 1 wins ___ times
    Draws ___ times
    Player 2 wins ___ times
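A rough sketch of the tallying such a referee script performs. This is not the actual “playtictactoe.py”: engines are modeled here as plain Python functions rather than external programs, player 1 is assumed to move first, and all names are hypothetical. The eight winning lines are the same whether positions 1-9 are numbered across rows or down columns.

```python
import random

P1, EMPTY, P2 = 0, 1, 2  # cell codes assumed from the board slide
LINES = [(1, 2, 3), (4, 5, 6), (7, 8, 9), (1, 4, 7),
         (2, 5, 8), (3, 6, 9), (1, 5, 9), (3, 5, 7)]


def winner(cells):
    """Return the mark that owns a complete line, or None."""
    for a, b, c in LINES:
        mark = cells[a - 1]
        if mark != EMPTY and mark == cells[b - 1] == cells[c - 1]:
            return mark
    return None


def play_game(engine1, engine2):
    """Play one game; return P1, P2, or None for a draw."""
    cells = [EMPTY] * 9
    for turn in range(9):
        mark, engine = ((P1, engine1), (P2, engine2))[turn % 2]
        move = engine(cells, mark)
        if cells[move - 1] != EMPTY:
            raise ValueError("collision at position %d" % move)
        cells[move - 1] = mark
        if winner(cells) is not None:
            return mark
    return None


def play_match(engine1, engine2, games):
    """Play several games and count wins and draws."""
    tally = {"player1": 0, "draw": 0, "player2": 0}
    for _ in range(games):
        result = play_game(engine1, engine2)
        if result is None:
            tally["draw"] += 1
        elif result == P1:
            tally["player1"] += 1
        else:
            tally["player2"] += 1
    return tally


def random_engine(cells, mark):
    return random.choice([i + 1 for i, c in enumerate(cells) if c == EMPTY])


print(play_match(random_engine, random_engine, 100))
```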
1. Random Engine Implementation details Results summary
Implementation Details
  Java
    Slow execution
  Internal representation of board state
    x---ox--o
      x: player 1
      o: player 2
      -: empty position
    2-dimensional array
  Polymorphism to easily allow different engine implementations
    Player player = new RandomPlayer(board, turn);
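The polymorphism idea can be illustrated in Python (the engine itself was written in Java, so this is a translation of the idea, not the original code). The class layout and the `next_move` method name are assumptions. The board is kept as the nine-character string shown on the slide rather than a 2-dimensional array.

```python
import random
from abc import ABC, abstractmethod


class Player(ABC):
    """Common interface so engines can be swapped without changing callers."""

    def __init__(self, board, turn):
        self.board = board  # nine-character string such as "x---ox--o"
        self.turn = turn    # "x" for player 1, "o" for player 2

    @abstractmethod
    def next_move(self):
        """Return the 1-based position to play."""


class RandomPlayer(Player):
    def next_move(self):
        # Only empty positions are candidates, which avoids collisions.
        empty = [i + 1 for i, c in enumerate(self.board) if c == "-"]
        return random.choice(empty)


player = RandomPlayer("x---ox--o", "x")
print(player.next_move())
```

A different engine would subclass `Player` and override `next_move`; the calling code stays the same.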
Results Summary
  random_engine vs random_engine
    Player 1 wins 49 times
    Draws 13 times
    Player 2 wins 38 times
  Player 1 and player 2 win a roughly similar number of games (49 vs. 38)
2. “Smart” Engine Implementation details Case based reasoning Results summary
Implementation Details
  Java
  Rules were simple and came from hands-on experience
    IF able to get 3 in a row, play winning position
    ELSE IF able to block opponent, play blocking position
    ELSE IF empty, play edge position
    ELSE play random position
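The rule chain above might look like this in Python (the original was Java). Two readings are assumed here and may differ from the real engine: "IF empty, play edge position" is taken to mean "if an edge position is empty", and "edge" is taken to mean the middle of each side (positions 2, 4, 6, 8). Function names are hypothetical.

```python
import random

EMPTY = "-"
LINES = [(1, 2, 3), (4, 5, 6), (7, 8, 9), (1, 4, 7),
         (2, 5, 8), (3, 6, 9), (1, 5, 9), (3, 5, 7)]
EDGES = (2, 4, 6, 8)  # assumption: "edge" means the middle of each side


def completing_move(board, mark):
    """Return a position that gives `mark` three in a row, or None."""
    for line in LINES:
        marks = [board[p - 1] for p in line]
        if marks.count(mark) == 2 and marks.count(EMPTY) == 1:
            return line[marks.index(EMPTY)]
    return None


def smart_move(board, me, opponent):
    move = completing_move(board, me)            # 1. win if possible
    if move is None:
        move = completing_move(board, opponent)  # 2. otherwise block
    if move is None:
        free_edges = [p for p in EDGES if board[p - 1] == EMPTY]
        if free_edges:
            move = random.choice(free_edges)     # 3. otherwise take an edge
    if move is None:
        empty = [i + 1 for i, c in enumerate(board) if c == EMPTY]
        move = random.choice(empty)              # 4. otherwise play randomly
    return move


print(smart_move("xx-oo----", "x", "o"))
```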
Case-Based Reasoning
  Work backward from an unknown engine's behavior to infer the “rules” governing it
  Steps
    Retrieve
    Reuse
    Revise
    Retain
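One way to picture the four steps, as a hedged sketch: a case base of (board, move) pairs observed from some engine, with similarity measured as the number of matching cells. The similarity measure, the fallback used in the revise step, and all function names are assumptions chosen for illustration.

```python
def similarity(board_a, board_b):
    """Count the positions at which two board strings agree."""
    return sum(1 for a, b in zip(board_a, board_b) if a == b)


def retrieve(case_base, board):
    """Retrieve: find the stored case most similar to the current board."""
    return max(case_base, key=lambda case: similarity(case[0], board))


def reuse_and_revise(case, board):
    """Reuse the stored move; revise it if that position is occupied."""
    move = case[1]
    if board[move - 1] != "-":
        move = board.index("-") + 1  # fallback: first empty position
    return move


def retain(case_base, board, move):
    """Retain: store the new situation and the move chosen for it."""
    case_base.append((board, move))


case_base = [("xx-------", 3), ("---------", 2)]
board = "xx--o----"
move = reuse_and_revise(retrieve(case_base, board), board)
retain(case_base, board, move)
print(move, len(case_base))
```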
Results Summary
  random_engine vs smart_engine
    Player 1 wins 8 times
    Draws 29 times
    Player 2 wins 63 times
  smart_engine vs smart_engine
    Player 1 wins 59 times
    Draws 10 times
    Player 2 wins 31 times
3. Neural Network Engine Neural network overview Implementation details Results summary
Neural Network Overview
  Provides the ability to “learn” how to do tasks from training data
  Each layer applies a linear step (weighted sum) followed by a nonlinear step (activation function)
  Training produces a set of weights that maps training inputs to training outputs
  The learning rate controls how far the weights move on each update
  The goal is a set of weights with an error of 0, at which every training input maps exactly to its output
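To make the linear step, nonlinear step, and learning rate concrete, here is a minimal single-neuron sketch. It is not the trainer from this talk: the sigmoid activation, the squared-error update, the OR training data, and every name are assumptions chosen to keep the example small.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, inputs):
    z = sum(w * x for w, x in zip(weights, inputs)) + bias  # linear step
    return sigmoid(z)                                       # nonlinear step


def train(samples, learning_rate=0.5, epochs=5000):
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            output = predict(weights, bias, inputs)
            # Negative gradient of the squared error w.r.t. the weighted sum.
            delta = (target - output) * output * (1.0 - output)
            # The learning rate scales how far each weight moves.
            weights = [w + learning_rate * delta * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * delta
    return weights, bias


# Toy training set: logical OR.
samples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias = train(samples)
for inputs, target in samples:
    print(inputs, target, round(predict(weights, bias, inputs), 3))
```

In practice the error approaches 0 without reaching it exactly, so training usually stops after a fixed number of epochs or once the error falls below a threshold.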
Implementation Details
  Goal: train a neural network on data produced by the previous “smart engine”
    Input: state of the board
    Output: next move
  Neural network trainer
    Python
    Lets the user pass in parameters such as learning rate, bias, and the input, output, and weight files
    15 pairs of inputs and outputs used
    Convergence was difficult
  Neural network engine
    Uses the set of weights produced by the trainer to generate the “next move”
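The slide does not say how moves were encoded for the network. One plausible scheme, shown purely as an assumption, is a one-hot target vector per move, with the engine picking the highest-scoring empty position from the network's nine outputs. The function names and the sample output values are made up for illustration.

```python
EMPTY = 1  # assumed cell code for an empty position, per the board slide


def encode_move(move, size=9):
    """One-hot training target: 1.0 at the chosen position, 0.0 elsewhere."""
    return [1.0 if i == move - 1 else 0.0 for i in range(size)]


def decode_move(outputs, cells):
    """Pick the highest-scoring position that is still empty."""
    candidates = [i for i, cell in enumerate(cells) if cell == EMPTY]
    if not candidates:
        raise ValueError("no empty position left")
    best = max(candidates, key=lambda i: outputs[i])
    return best + 1


cells = [0, 1, 1, 1, 2, 1, 1, 1, 1]  # positions 1 and 5 are occupied
outputs = [0.9, 0.1, 0.2, 0.1, 0.8, 0.3, 0.1, 0.1, 0.1]  # made-up scores
print(decode_move(outputs, cells))
```

Masking out occupied positions keeps the engine from producing a collision even when the network scores an occupied position highest.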
Results Summary
  neural_engine vs smart_engine
    Player 1 wins 38 times
    Draws 11 times
    Player 2 wins 51 times
  neural_engine vs random_engine
    Player 1 wins 56 times
    Draws 12 times
    Player 2 wins 32 times
4. PyBrain Engine PyBrain neural network library Implementation details Results summary
Implementation Details
  Goal: implement the same neural network engine using training weights produced by an external library
  PyBrain
    Python-Based Reinforcement Learning, Artificial Intelligence and Neural Network Library
    http://pybrain.org/
  Used the backpropagation method of training
    Optimizes the error by minimizing a loss function
    Gives a higher chance of convergence for larger data sets
    25 input/output pairs, compared to 15
Results Summary
  smart_engine vs neurallib_engine
    Player 1 wins 51 times
    Draws 6 times
    Player 2 wins 43 times
  random_engine vs neurallib_engine
    Player 1 wins 30 times
    Draws 16 times
    Player 2 wins 54 times
(5?) Self-Organizing Maps
  Another type of neural network
  Uses weights in a different way
  Weights now belong to nodes instead of connections
  Useful for identifying what the inputs should be
  Weights are updated based on geography: the winning node and its neighbors on the map move toward the input
  Useful for pattern completion
  Could be used in a tic-tac-toe engine to determine whether a given board state is valid
You may already be aware of supervised training techniques such as backpropagation, where the training data consists of vector pairs: an input vector and a target vector. With this approach, an input vector is presented to the network (typically a multilayer feedforward network) and the output is compared with the target vector. If they differ, the weights of the network are altered slightly to reduce the error in the output. This is repeated many times and with many sets of vector pairs until the network gives the desired output. Training a SOM, however, requires no target vector. A SOM learns to classify the training data without any external supervision whatsoever.
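A single SOM training step can be sketched as follows, assuming a one-dimensional map, squared Euclidean distance, and a Gaussian neighborhood. These choices and the function names are illustrative assumptions. Note that no target vector appears anywhere: only the input drives the update.

```python
import math


def best_matching_unit(nodes, x):
    """Index of the node whose weight vector is closest to the input."""
    return min(range(len(nodes)),
               key=lambda i: sum((w - v) ** 2 for w, v in zip(nodes[i], x)))


def train_step(nodes, x, learning_rate, radius):
    """Pull the winning node and its map neighbors toward the input."""
    bmu = best_matching_unit(nodes, x)
    for i, node in enumerate(nodes):
        # Influence decays with distance from the winner on the map.
        influence = math.exp(-((i - bmu) ** 2) / (2.0 * radius ** 2))
        for k in range(len(node)):
            node[k] += learning_rate * influence * (x[k] - node[k])
    return bmu


nodes = [[0.0, 0.0], [1.0, 1.0]]
winner = train_step(nodes, [0.9, 0.9], learning_rate=0.5, radius=1.0)
print(winner, nodes)
```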
Now I’ll show a demonstration of running the programs I’ve been discussing