2002/11/15 Game Programming Workshop
A Neural Network for Evaluating King Danger in Shogi
Reijer Grimbergen
Department of Information Science, Saga University

Outline
  Neural networks in games
  Designing a neural network for king danger
    King danger features
    A perceptron for king danger
  Results
    Learning behavior
    Self-play experiment
  Conclusions and future work

Neural Networks in Games
  Success in backgammon
    Neurogammon
    TD-gammon
  Not so successful in chess, Go and shogi
  Neural network issues
    Black box: how to locate errors?
    How to decide which features to update?
  In games, TD-learning is used more often

Neural Networks in Games
  Neural networks can be useful if
    1. The set of features is fixed
    2. There is a well-defined set of examples
  Difficult for a general evaluation function
  Possible for certain evaluation function components
    King danger

Design of the Neural Network
  Why king danger?
    King danger is a major component of the evaluation function in shogi
    Most king danger features are well understood
      Attacked squares near the king
      Discovered attacks
      Pins
    A well-defined set of examples can be constructed
      Positive examples: Tsume shogi
      Negative examples: Middle game positions

Design of the Neural Network
  King danger features
    161 king danger features
      Features for the eight squares adjacent to the king (64)
        Board edge, occupied or not, square control
      Pieces in hand (45)
        Which pieces and number of pieces in hand
      Open lines to the king (32)
        Open rank, file or diagonal of different lengths
      Discovered attack (8)
        Discovered attack on the diagonals, rank or file
      Potential knight attack (4)
      Pinned pieces (8)
        Pinned piece on the diagonals, rank or file
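
The six feature groups above add up to the 161 inputs. As a minimal sketch, the layout of such an input vector could be written down as follows. The group names and the order of the groups within the vector are assumptions made for illustration; the slides give only the group sizes.

```python
# Feature group sizes as listed on the slide. The identifiers and the
# ordering of the groups inside the input vector are hypothetical.
FEATURE_GROUPS = [
    ("adjacent_squares", 64),
    ("pieces_in_hand", 45),
    ("open_lines", 32),
    ("discovered_attack", 8),
    ("potential_knight_attack", 4),
    ("pinned_pieces", 8),
]


def group_offsets(groups):
    """Return {name: (start, end)} half-open index ranges for each group."""
    offsets = {}
    start = 0
    for name, size in groups:
        offsets[name] = (start, start + size)
        start += size
    return offsets


OFFSETS = group_offsets(FEATURE_GROUPS)
NUM_FEATURES = sum(size for _, size in FEATURE_GROUPS)
```

With this layout, `NUM_FEATURES` is 161 and each group occupies one contiguous index range, which makes it easy to refer to a whole group when inspecting or constraining weights.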

Design of the Neural Network
  A perceptron for king danger
    The value of the output unit is the sum of the weights of the active input units
    King in danger: value above a certain threshold
    Wrong evaluation
      Positive example but value below threshold
      Negative example but value above threshold
      Update weights of the active features
    Split examples into a training set and a test set
    Epochs: number of update cycles of the examples in the training set
  [Figure: perceptron with input units I1, I2, ..., I161 connected through weights W1, W2, ..., W161 to a single output unit O]
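
The perceptron described above can be sketched in a few lines. The slide states only that the weights of the active features are updated after a wrong evaluation; the update used here (adding or subtracting a fixed learning-rate step) is an assumption, as are all function and parameter names.

```python
def output_value(weights, active):
    """Sum of the weights of the active input units."""
    return sum(weights[i] for i in active)


def predicts_danger(weights, active, threshold):
    """King is judged to be in danger if the output exceeds the threshold."""
    return output_value(weights, active) > threshold


def update(weights, active, is_positive, threshold, learning_rate):
    """Update the active weights in place after a wrong evaluation.

    Assumed rule: raise the active weights by learning_rate for a missed
    positive example, lower them for a negative example judged dangerous.
    Returns True if the weights were changed.
    """
    if predicts_danger(weights, active, threshold) == is_positive:
        return False  # correct evaluation, nothing to update
    step = learning_rate if is_positive else -learning_rate
    for i in active:
        weights[i] += step
    return True


def train(weights, training_set, threshold, learning_rate, epochs):
    """Run the given number of update cycles over the training set."""
    for _ in range(epochs):
        for active, is_positive in training_set:
            update(weights, active, is_positive, threshold, learning_rate)
    return weights
```

Each example is represented as a pair `(active, is_positive)`, where `active` lists the indices of the features that are on in that position. Only those weights are read or changed, which matches the description of the output as a sum over active input units.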

Design of the Neural Network
  Adding constraints
    Unconstrained updating can make any feature weight positive or negative
      Features that should never have a negative weight can become negative, and vice versa
      Add constraints to avoid this problem
    Always negative weight
      Defender controls adjacent square
      Attacker has no piece of a given piece type in hand
    Always positive weight
      Attacker has at least one piece of a given piece type in hand
      Discovered attack
      Potential knight attack
      Pinned piece
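
One simple way to enforce sign constraints like these is to clamp the constrained weights after every update. The slides do not say how the constraints were implemented, so the clamping to zero below is an assumption, and the index sets passed in are hypothetical.

```python
def apply_sign_constraints(weights, always_positive, always_negative):
    """Clamp constrained weights in place so they keep the required sign.

    always_positive / always_negative are collections of feature indices.
    Clamping to 0.0 is an assumed enforcement method, not taken from the
    slides.
    """
    for i in always_positive:
        if weights[i] < 0.0:
            weights[i] = 0.0
    for i in always_negative:
        if weights[i] > 0.0:
            weights[i] = 0.0
    return weights
```

Calling this after each weight update keeps, for example, a discovered-attack weight from drifting below zero, while leaving unconstrained weights free to take either sign.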

Results
  Learning behavior
    Example set
      500 positive examples and 500 negative examples
    Initial weights assigned randomly
    King danger threshold
      Output value is higher than 0.8
    Examples divided into a training set and test set
      Training set: increased from 1 to 999 examples
    Epochs: 5000, 10000, 15000, 20000, 25000,
    Learning rate: 0.01 and 0.001
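
The experimental setup above amounts to splitting the 1000 examples into a training part of a chosen size and a test part, then measuring the fraction of correct predictions. A minimal sketch follows; the random split, the seed, and all names are assumptions, since the slides do not describe how examples were assigned to the two sets.

```python
import random


def split_examples(examples, train_size, seed=0):
    """Shuffle a copy of the examples and split it into (training, test)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:train_size], shuffled[train_size:]


def accuracy(weights, examples, threshold=0.8):
    """Fraction of examples evaluated correctly.

    Each example is (active_indices, is_positive); the king is judged to
    be in danger when the summed active weights exceed the threshold.
    """
    if not examples:
        return 0.0
    correct = 0
    for active, is_positive in examples:
        value = sum(weights[i] for i in active)
        if (value > threshold) == is_positive:
            correct += 1
    return correct / len(examples)
```

Plotting `accuracy` on the test part against `train_size` for each combination of epochs and learning rate would give a learning curve of the kind discussed on the following slides.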

Learning behavior
  [Figure: ideal learning curve]

Learning behavior
  [Figure: typical king danger learning curve]

Learning behavior
  [Figure: example set problem]

Results
  Self-play experiment
    P-SPEAR
      Uses a trained perceptron for king danger evaluation
      Used the perceptron trained with epochs and learning rate
    A number of 20-game matches were played between SPEAR and P-SPEAR
    Tuning was difficult
    Best result: P-SPEAR loses 12-8

Conclusions and Future Work
  Conclusions
    A perceptron can produce a high percentage of correct predictions of king danger
    Using a perceptron for king danger can lead to performance comparable to a hand-tuned evaluation function
  Future work
    Further analysis of king danger features
    Adding more constraints to feature weights
    Analyze the example set
    Tuning the output value with the other components of the evaluation function
    Investigate neural networks with hidden layers