1
Honte, a Go-Playing Program Using Neural Nets
Frederik Dahl
2
Combined approach
- Supervised learning: shape evaluation
- Reinforcement learning: group safety, territory
- Heuristic evaluation: influence
- Search: capture, connectivity, life and death
3
Architecture
4
Shape evaluation: Multilayer perceptron
- 190 inputs
  - Receptive field of radius 3
  - Distance to edge
  - Liberties
  - Captured stones
- 50 hidden nodes
- Single output: will an expert play here?
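The network on this slide can be sketched as a plain forward pass. Only the layer sizes (190 inputs, 50 hidden nodes, one output) come from the slide. The tanh hidden units, the sigmoid output, the weight initialisation, and the names `init_network` and `shape_eval` are assumptions made for illustration, and the encoding of the 190 input features is not specified here.

```python
import math
import random

N_INPUTS, N_HIDDEN = 190, 50  # layer sizes taken from the slide


def init_network(seed=0):
    """Small random weights; the initialisation scheme is an assumption."""
    rng = random.Random(seed)
    # Each weight row carries one extra entry at the end for the bias.
    w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(N_INPUTS + 1)]
                for _ in range(N_HIDDEN)]
    w_out = [rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN + 1)]
    return w_hidden, w_out


def shape_eval(features, network):
    """Return a score in (0, 1); higher means 'more like an expert move'."""
    if len(features) != N_INPUTS:
        raise ValueError("expected 190 input features")
    w_hidden, w_out = network
    # zip stops after the 190 feature weights, so w[-1] is the bias term.
    hidden = [math.tanh(w[-1] + sum(wi * x for wi, x in zip(w, features)))
              for w in w_hidden]
    z = w_out[-1] + sum(wi * h for wi, h in zip(w_out, hidden))
    return 1.0 / (1.0 + math.exp(-z))  # assumed sigmoid output
```

A candidate move would be scored with `shape_eval(features, init_network())`, where `features` is the 190-element description of the move's local neighbourhood.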
5
Shape evaluation: Training and performance
- Trained on 400 expert games
- Expert move used as positive example (+1)
- Random legal move as negative example (0)
- Error backpropagation: error = target - eval
- Performance measured by treating prediction as evaluation function: what percentage of legal moves are ranked below the expert move?
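The performance measure and the output error can be written down in a few lines. This is a minimal sketch: the function names are hypothetical, and excluding the expert move itself from the denominator is an assumption, since the slide does not say how the percentage is normalised.

```python
def output_error(target, evaluation):
    """Error fed to backpropagation, as stated on the slide: target - eval."""
    return target - evaluation


def percent_ranked_below(scores, expert_move):
    """Percentage of the other legal moves that score below the expert move.

    scores: dict mapping each legal move to the network's evaluation.
    """
    others = [m for m in scores if m != expert_move]
    if not others:
        return 100.0  # the expert move was the only legal move
    below = sum(1 for m in others if scores[m] < scores[expert_move])
    return 100.0 * below / len(others)
```

For example, if the expert move scores 0.9 and the three other legal moves score 0.1, 0.5 and 0.95, two of the three are ranked below it.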
6
Shape evaluation: Results
7
Local search
- Selective search for local goals: capture, connectivity, life and death
- Only considers moves suggested by shape evaluating network
- Deep and narrow search
- Captures common-sense knowledge
8
Group safety evaluation: Multilayer perceptron
- Groups defined by connectable blocks
- 13 inputs
  - Number of stones in group
  - Number of liberties in group
  - Number of proven eyes
  - Average opponent influence over liberties
- 20 hidden nodes
- 1 output: probability of group survival
9
Group safety evaluation: Temporal difference learning
- Trained by self-play
- Reward signal for the group is the average final safety of stones (0 = captured, 1 = survived)
- TD(0) is used, replaying games backwards
- Very simple idea: error = eval(next) - eval(now)
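The backward replay can be illustrated with a lookup table standing in for the network. This is a sketch under stated assumptions, not Honte's training code: the table, the default value of 0.5, the learning rate, and the use of the already-updated successor value as eval(next) are all choices made here for illustration.

```python
def td0_backward(values, states, final_safety, lr=0.1):
    """Replay one game backwards, updating `values` in place.

    values:       dict from a position key to the estimated survival probability
                  (a table replaces the multilayer perceptron in this sketch)
    states:       position keys for one group, in the order they were played
    final_safety: reward at the end of the game (0 = captured, 1 = survived)
    """
    target = final_safety  # the last position is trained towards the reward
    for state in reversed(states):
        now = values.get(state, 0.5)   # assumed neutral starting estimate
        error = target - now           # error = eval(next) - eval(now)
        values[state] = now + lr * error
        target = values[state]         # becomes eval(next) for the earlier position
    return values
```

Replaying backwards means the final outcome reaches every earlier position within a single pass over the game, rather than moving back one step per game.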
10
Influence evaluation
- Consider random walks from an intersection
- How likely to end up at a black or white stone?
- Can also take account of group safety estimates
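The random-walk idea can be estimated by simple sampling. The following sketch assumes a board given as a list of strings, uniformly random steps to adjacent intersections, and a cap on walk length; none of these details are given on the slide, and the group safety weighting mentioned above is left out.

```python
import random


def influence(board, start, walks=1000, max_steps=30, seed=0):
    """Estimate how often a random walk from `start` first reaches each colour.

    board: list of equal-length strings using 'B', 'W' and '.' (assumed encoding)
    start: (row, col) of the intersection being evaluated
    Returns (black_fraction, white_fraction); walks that reach no stone within
    max_steps count for neither side.
    """
    rng = random.Random(seed)
    rows, cols = len(board), len(board[0])
    hits = {"B": 0, "W": 0}
    for _walk in range(walks):
        r, c = start
        for _step in range(max_steps):
            if board[r][c] in hits:        # the walk stops at the first stone
                hits[board[r][c]] += 1
                break
            neighbours = [(r + dr, c + dc)
                          for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                          if 0 <= r + dr < rows and 0 <= c + dc < cols]
            r, c = rng.choice(neighbours)
    return hits["B"] / walks, hits["W"] / walks
```

Calling `influence(["B..", "...", "..W"], (1, 1))` returns the estimated black and white influence at the centre point of a 3x3 board.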
11
Territory evaluation
- Another multilayer perceptron
- 4 inputs
  - Revised influence (for both sides)
  - Distance from edge
- 10 hidden nodes
- 1 output: predicted territory value
- Trained by TD(0) using eventual territory value as reward signal
12
Playing strength
- Playing 19x19 Go
  - Approximately even against Handtalk 97-06e
  - Wins more than 50% against Ego 1.0
- Weaknesses
  - Confuses group safety with group strength
  - Has no concept of the aji of a group
13
Recent work
- New version of WinHonte 1.03
- Neural net to evaluate sente/gote
- Trial version available online!
14
Conclusions
- Go knowledge can be learned
- Combining different forms of knowledge can be a good idea
- Multilayer perceptrons provide a flexible representation
- Local search can be used effectively as input features for learning