Machine Learning for Go

Jung-Yun Lo
Dept. of Computer Science and Information Engineering, National Dong Hwa University

Outline
- A survey of the application of machine learning to the game of Go
- A learning architecture for the game of Go

A survey
Some possible directions of research:
- Global approaches
- Learning in search
- Learning in the endgame
- Learning in the opening
- The representation language

Global approaches
- Learn a function that maps a board position and a move to a reward
- On large boards, a more specific method should probably be used for each subproblem of the game

Learning in search
- Candidate move ordering
- Temperature (high → deeper search)
- Temperature (low → stop search)
- Leaf node static evaluation
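The places where learning can plug into search can be illustrated with a minimal sketch. It is not Go: it uses a toy subtraction game (remove 1, 2 or 3 counters; the player who cannot move loses), and `learned_order` and `static_eval` are hand-written stand-ins for functions that would be learned. A fixed depth limit replaces the temperature criterion, which is not modeled here. All names are hypothetical.

```python
def legal_moves(pile):
    """Toy game: remove 1, 2 or 3 counters; whoever cannot move loses."""
    return [k for k in (1, 2, 3) if k <= pile]


def static_eval(pile):
    """Stand-in for a learned leaf evaluation (0.0 means 'unknown')."""
    return 0.0


def learned_order(pile, move):
    """Stand-in for a learned move-ordering score (higher is tried first)."""
    return 1.0 if (pile - move) % 4 == 0 else 0.0


def no_order(pile, move):
    """Baseline: every move gets the same score, so the order is unchanged."""
    return 0.0


def negamax(pile, depth, alpha, beta, order_score, stats):
    """Alpha-beta search in negamax form; returns a value in [-1, 1]."""
    stats["nodes"] += 1
    moves = legal_moves(pile)
    if not moves:
        return -1.0  # the player to move has no move and loses
    if depth == 0:
        return static_eval(pile)  # leaf node static evaluation
    # Candidate move ordering: try the most promising moves first.
    moves.sort(key=lambda m: order_score(pile, m), reverse=True)
    best = -1.0
    for move in moves:
        value = -negamax(pile - move, depth - 1, -beta, -alpha,
                         order_score, stats)
        best = max(best, value)
        alpha = max(alpha, value)
        if alpha >= beta:
            break  # cutoff: the remaining moves cannot change the result
    return best


def search(pile, order_score, depth=10):
    """Return (value for the player to move, number of nodes visited)."""
    stats = {"nodes": 0}
    value = negamax(pile, depth, -1.0, 1.0, order_score, stats)
    return value, stats["nodes"]


if __name__ == "__main__":
    print(search(5, learned_order))
    print(search(5, no_order))
```

Both calls find the same value, but the ordered search visits fewer nodes because good moves trigger cutoffs earlier.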

Learning in the endgame
- Each local endgame position is evaluated, then the whole game is considered as a sum of games
- Decomposition search

Learning in the opening
- Hard to quantify
- Using joseki
- Depends on the surrounding situation
- Learning global rules for opening moves

The representation language
- Difficult to express higher-level concepts such as liberty, atari, ladder and eye
- Making the representation language more expressive

block(BlockID, Color, Size, LibertyCount)
board(X, Y, GroupID)
adjacent(BlockID1, BlockID2)
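As a rough illustration of how facts of the form block(BlockID, Color, Size, LibertyCount) might be derived from a raw board, the sketch below flood-fills connected stones of one color and counts the empty points next to them. The board encoding ('X' for black, 'O' for white, '.' for empty) and the function names are assumptions made for this example.

```python
from collections import deque


def find_blocks(rows):
    """Return one dict per block with its color, stones and liberties."""
    height, width = len(rows), len(rows[0])
    seen = set()
    blocks = []
    for y in range(height):
        for x in range(width):
            color = rows[y][x]
            if color == "." or (x, y) in seen:
                continue
            stones, liberties = set(), set()
            queue = deque([(x, y)])
            seen.add((x, y))
            while queue:
                cx, cy = queue.popleft()
                stones.add((cx, cy))
                for nx, ny in ((cx + 1, cy), (cx - 1, cy),
                               (cx, cy + 1), (cx, cy - 1)):
                    if not (0 <= nx < width and 0 <= ny < height):
                        continue
                    if rows[ny][nx] == ".":
                        liberties.add((nx, ny))  # adjacent empty point
                    elif rows[ny][nx] == color and (nx, ny) not in seen:
                        seen.add((nx, ny))
                        queue.append((nx, ny))
            blocks.append({"id": len(blocks), "color": color,
                           "stones": stones, "liberties": liberties})
    return blocks


def block_facts(rows):
    """Return (BlockID, Color, Size, LibertyCount) tuples."""
    return [(b["id"], b["color"], len(b["stones"]), len(b["liberties"]))
            for b in find_blocks(rows)]


EXAMPLE = [
    ".XO.",
    ".XO.",
    "..O.",
    "....",
]

if __name__ == "__main__":
    for fact in block_facts(EXAMPLE):
        print("block%s" % (fact,))
```

With such facts available, a concept like atari reduces to a test on LibertyCount (a block with exactly one liberty).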

The Common Fate Graph (Enzenberger, 1996)

[Figure: projection of board positions onto common fate graphs]

Loss of information in the CFG
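A minimal sketch of how a common fate graph might be built: connected stones of the same color are merged into a single node, and nodes are linked when any of their points are adjacent on the board. Keeping every empty point as its own node, the board encoding ('X', 'O', '.') and the function names are assumptions made for this example.

```python
def neighbors(x, y, width, height):
    """Yield the on-board points directly adjacent to (x, y)."""
    for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        if 0 <= nx < width and 0 <= ny < height:
            yield nx, ny


def common_fate_graph(rows):
    """Return (colors, edges): node colors by id and undirected edges."""
    height, width = len(rows), len(rows[0])
    node_of = {}  # (x, y) -> node id
    colors = []   # node id -> 'X', 'O' or '.'
    for y in range(height):
        for x in range(width):
            if (x, y) in node_of:
                continue
            color = rows[y][x]
            node_id = len(colors)
            colors.append(color)
            node_of[(x, y)] = node_id
            if color == ".":
                continue  # assumption: empty points are never merged
            stack = [(x, y)]
            while stack:  # merge all connected stones of this color
                cx, cy = stack.pop()
                for nx, ny in neighbors(cx, cy, width, height):
                    if rows[ny][nx] == color and (nx, ny) not in node_of:
                        node_of[(nx, ny)] = node_id
                        stack.append((nx, ny))
    edges = set()
    for (x, y), a in node_of.items():
        for nx, ny in neighbors(x, y, width, height):
            b = node_of[(nx, ny)]
            if a != b:
                edges.add((min(a, b), max(a, b)))
    return colors, edges


if __name__ == "__main__":
    colors, edges = common_fate_graph(["XO.", "X.."])
    print(colors)
    print(sorted(edges))
```

The returned graph records neither the size nor the shape of a block, which is one way to see the information that the projection discards.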

Conclusions
- Learning results are promising, but the field is nearly unexplored and offers many opportunities for research

A learning architecture for the game of Go
- Combinatorial game theory
- The HUGO architecture
- Three components of HUGO
  - Choice of subgames
  - Initiative engine
  - Computing game values

Combinatorial game theory
- G = {F|O}
- F: the set of options that player Friend can reach with one legal move
- O: the same for player Opponent
- Two possible values:
  - W: win for Friend
  - L: loss for Friend

Combinatorial game theory
4 possible outcomes for a combinatorial game: WW, WL, LL, and LW
- WW: won by Friend, irrespective of who moves first
- WL: an unsettled game, won by the player who moves first
- LL: lost for Friend, even if Friend moves first
- LW: the player who moves first loses the game
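The four outcome classes can be computed from a game tree with a short recursive sketch. It assumes a game is either a terminal value ('W' or 'L') or a pair (F, O) of option lists, that players alternate moves within the game, and that a player with no options loses. This encoding and the function names are assumptions made for the example.

```python
def friend_wins(game, friend_to_move):
    """Return True if Friend wins `game` with best play by both sides."""
    if game in ("W", "L"):
        return game == "W"
    friend_options, opponent_options = game
    if friend_to_move:
        # Friend needs at least one option that still wins.
        return any(friend_wins(g, False) for g in friend_options)
    # Opponent needs only one option that beats Friend.
    return all(friend_wins(g, True) for g in opponent_options)


def outcome(game):
    """Return 'WW', 'WL', 'LL' or 'LW'.

    First letter: result for Friend when Friend moves first.
    Second letter: result for Friend when Opponent moves first.
    """
    first = "W" if friend_wins(game, True) else "L"
    second = "W" if friend_wins(game, False) else "L"
    return first + second


if __name__ == "__main__":
    print(outcome((["W"], ["L"])))  # G = {W|L}
    print(outcome((["L"], ["W"])))  # G = {L|W}
```

Here `(["W"], ["L"])` encodes G = {W|L}: whoever moves first reaches a favorable terminal value, so the game is unsettled.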

The HUGO architecture
- Can be applied to any 2-player, deterministic, full-information, partizan, combinatorial game

3 components of HUGO
- Choice of subgames
  - Select a collection of well-defined subgames
  - Ensure high discriminative ability
- Initiative engine
  - Find the move that yields the most points
  - Prefer holding the initiative
- Computing game values
  - Compute the game-theoretic value of a particular game

Future work
- Study more references on machine learning

Reference
- Jan Ramon, Hendrik Blockeel (Katholieke Univ. Leuven), A Survey of the Application of Machine Learning to the Game of Go
- A.B. Meijer, H. Koppelaar (Delft Univ. of Tech.), A Learning Architecture for the Game of Go
- Thore Graepel, Computer Go and Machine Learning
