1
Monte Carlo Go Has a Way to Go
Haruhiro Yoshimoto (*1), Kazuki Yoshizoe (*1), Tomoyuki Kaneko (*1), Akihiro Kishimoto (*2), Kenjiro Taura (*1)
(*1) University of Tokyo, (*2) Future University Hakodate
Adapted from the slides presented at AAAI 2006
2
Games in AI
Ideal test bed for AI research
–Clear results
–Clear motivation
–Good challenge
Success in search-based approach
–chess (1997, Deep Blue)
–and others
Not successful in the game of Go
–Go is to Chess as Poetry is to Double-entry accounting
–It goes to the core of artificial intelligence, which involves the study of learning and decision-making, strategic thinking, knowledge representation, pattern recognition and, perhaps most intriguingly, intuition
3
The game of Go
A 4,000-year-old board game from China
Standard size 19×19
Two players, Black and White, place stones in turns
Stones cannot be moved, but can be captured and taken off the board
The player with the larger territory wins
4
Terminology of Go
Block - connected stones of the same color
Liberty - empty intersection adjacent to a block
Captured - when a block has no liberty available
Eye - surrounded region providing one or more safe liberties
5
Playing Strength
A $1.2M prize was offered for beating a professional with no handicap (now expired)
Handtalk claimed $7,700 in 1997 for winning an 11-stone handicap match against a master 8-9 years old
6
Difficulties in Computer Go
Large search space
–the game becomes progressively more complex, at least for the first 100 ply

                   Chess    Go
Board size         8×8      19×19
Depth              ~80      ~300
Branching factor   35       235
Search space       10^40    10^170
7
Difficulties in Computer Go
Lack of good evaluation function
–a material advantage does not mean a simple way to victory, and may just mean that short-term gain has been given priority
–around 150-250 legal moves, of which usually fewer than 50 (sometimes fewer than 10) are acceptable, but computers have a hard time distinguishing them
Very high degree of pattern recognition involved in human capacity to play well
8
Why Monte Carlo Go?
Success in other domains
–Bridge [Ginsberg:1999], Poker [Billings et al.:2002]
Reasonable position evaluation based on sampling
–search space from O(b^d) to O(Nbd) (b: branching factor, d: depth, N: number of random games)
Easy to parallelize
Can win against search-based approach
–Crazy Stone won the 11th Computer Olympiad in 9x9 Go
–MoGo: winner of the 19th and 20th KGS 9x9 tournaments, rated highest on CGOS
Replace evaluation function by random sampling [Brugmann:1993], [Bouzy:2003]
9
Basic idea of Monte Carlo Go
Generate next moves by 1-ply search
Play a number of random games and compute the expected score
Choose the move with the maximal score
The only domain-dependent knowledge is the eye (no eye filling)
10
Terminal Position of Go
Larger territory wins
Territory = surrounded area + stones
▲ Black’s territory is 36 points
× White’s territory is 45 points
White wins by 9 points
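The scoring rule on this slide (territory = surrounded area + stones) can be sketched with a flood fill over the empty regions. This is only an illustration: the board encoding (`'B'`, `'W'`, `'.'`), the function name `area_score`, and the 5×5 position are assumptions made for the example, not taken from the slides, and the sketch assumes that dead stones have already been removed and ignores komi.

```python
from collections import deque

def area_score(board):
    """Return (black, white) where each score = own stones + surrounded empty points.

    board: list of equal-length strings using 'B', 'W' and '.' (hypothetical encoding).
    """
    rows, cols = len(board), len(board[0])
    score = {"B": 0, "W": 0}
    seen = set()
    for r in range(rows):
        for c in range(cols):
            cell = board[r][c]
            if cell in score:
                score[cell] += 1          # every stone counts one point
            elif (r, c) not in seen:
                # Flood-fill one empty region and record which colors border it.
                size, borders = 0, set()
                queue = deque([(r, c)])
                seen.add((r, c))
                while queue:
                    y, x = queue.popleft()
                    size += 1
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < rows and 0 <= nx < cols:
                            neighbor = board[ny][nx]
                            if neighbor != ".":
                                borders.add(neighbor)
                            elif (ny, nx) not in seen:
                                seen.add((ny, nx))
                                queue.append((ny, nx))
                # A region counts only if a single color surrounds it.
                if len(borders) == 1:
                    score[borders.pop()] += size
    return score["B"], score["W"]

# Hypothetical 5x5 terminal position, not the one shown on the slide.
position = [
    ".B.W.",
    ".BWW.",
    "BBW..",
    ".BW..",
    ".BW..",
]
black, white = area_score(position)
print(f"Black {black}, White {white}, margin {abs(black - white)}")
```

In this position the point between the two walls touches both colors and is counted for neither side.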
11
Example
Play many sample games
–Each player plays randomly
Compute average points for each move
Select the move that has the highest average
[Figure: after move A the rest of the game is played randomly; one sample game ends in a 9-point win for Black, another in a 5-point win for Black]
move A: (5 + 9) / 2 = 7 points
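A minimal sketch of the selection loop described on the last slides: try each move once, play a number of random games after it, and keep the move with the highest average score. The function `monte_carlo_move` and its callback parameters are hypothetical names introduced here. The demo replaces real random playouts with scripted scores so that move A reproduces the (5 + 9) / 2 = 7 example, and the scores for move B are invented for illustration.

```python
import random

def monte_carlo_move(state, legal_moves, play, playout_score, n_samples, rng):
    """1-ply Monte Carlo: average the playout scores of every candidate move.

    legal_moves(state) -> list of moves
    play(state, move)  -> position after the move
    playout_score(position, rng) -> final score of one random game
    Returns (best_move, averages).
    """
    averages = {}
    for move in legal_moves(state):
        child = play(state, move)
        total = sum(playout_score(child, rng) for _ in range(n_samples))
        averages[move] = total / n_samples
    best = max(averages, key=averages.get)
    return best, averages

# Scripted stand-in for random games: A yields 5 and 9 as on the slide,
# B's scores are hypothetical.
scripted = {"A": iter([5, 9]), "B": iter([-3, 7])}
best, averages = monte_carlo_move(
    state=None,
    legal_moves=lambda state: ["A", "B"],
    play=lambda state, move: move,
    playout_score=lambda child, rng: next(scripted[child]),
    n_samples=2,
    rng=random.Random(0),
)
print(best, averages)
```

In a real program `playout_score` would play uniformly random legal moves (without filling eyes) until the game ends and return the final score.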
12
Monte Carlo Go and Sample Size
Can reduce statistical errors with additional samples
Relationships between sample size and strength are not yet investigated
–Sampling error ~ 1/√N
–N: # of random games
Diminishing returns must appear
[Figure: Monte Carlo with 1000 sample games is stronger than Monte Carlo with 100 sample games]
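The diminishing returns follow from the usual behavior of a sample mean, whose standard error shrinks as 1/√N. The short sketch below only evaluates that formula. The function name `sampling_error` and the 20-point standard deviation of a single random game are assumptions chosen for illustration, not measured values.

```python
import math

def sampling_error(score_std, n_games):
    """Standard error of the average score over n_games independent random games."""
    if n_games <= 0:
        raise ValueError("n_games must be positive")
    return score_std / math.sqrt(n_games)

# Hypothetical spread of 20 points per random game.
for n in (100, 400, 1000):
    print(n, round(sampling_error(20.0, n), 2))
```

Under this assumption, quadrupling the number of games from 100 to 400 only halves the error (2.0 to 1.0 points), and 1000 games still leave roughly 0.63 points.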
13
Our Monte Carlo Go Implementation
basic Monte Carlo Go
atari-50 enhancement: utilization of simple Go knowledge in move selection
progressive pruning [Bouzy 2003]: statistical move pruning in simulations
14
Atari-50 Enhancement
Basic Monte Carlo: assign uniform probability to each move in a sample game (no eye filling)
Atari-50: higher probability for capture moves
–Capture is “mostly” a good move
–50%
[Figure: Move A captures black stones]
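One possible reading of the 50% figure, sketched below, is that whenever a capturing move exists it is played with probability 0.5, and otherwise a move is drawn uniformly as in basic Monte Carlo. That interpretation, the function name `atari50_move`, and the move labels are assumptions made for this example. The slides do not spell out the exact rule, and `legal_moves` is assumed to already exclude eye-filling moves.

```python
import random

def atari50_move(legal_moves, capture_moves, rng, capture_prob=0.5):
    """Pick a playout move, favoring captures (assumed reading of Atari-50).

    legal_moves: list of candidate moves (eye-filling moves already removed)
    capture_moves: set of moves that capture opponent stones
    """
    captures = [move for move in legal_moves if move in capture_moves]
    if captures and rng.random() < capture_prob:
        return rng.choice(captures)   # biased branch: play a capture
    return rng.choice(legal_moves)    # basic Monte Carlo: uniform choice

rng = random.Random(0)
legal = ["A", "B", "C", "D"]          # hypothetical position; only A captures
picks = [atari50_move(legal, {"A"}, rng) for _ in range(10_000)]
share_a = picks.count("A") / len(picks)
print(f"capture move A chosen in {share_a:.1%} of draws")
```

With four legal moves and one capture, A is expected in about 62.5% of draws (0.5 + 0.5 × 1/4), compared with 25% under uniform selection.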
15
Progressive Pruning [Bouzy2003]
Try sampling with smaller sample size
Prune statistically inferior moves
Can assign more sample games to promising moves
[Figure: score of each move]
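The idea can be sketched as batch sampling with confidence intervals: after each batch, a move is dropped when even its optimistic bound lies below the pessimistic bound of the current best move. This is a simplified variant written for illustration, and the exact rule and constants in [Bouzy2003] may differ. The function `progressive_pruning`, the interval width of 2 standard errors, the batch size, and the Gaussian stand-in for random games are all assumptions.

```python
import math
import random
import statistics

def progressive_pruning(moves, sample_score, rng, batch=100, max_rounds=10, width=2.0):
    """Sample surviving moves in batches and drop statistically inferior ones.

    Returns (best_move, surviving_moves, samples_used_per_move).
    """
    samples = {move: [] for move in moves}
    alive = list(moves)
    for _ in range(max_rounds):
        # Spend the next batch of sample games only on moves still alive.
        for move in alive:
            samples[move].extend(sample_score(move, rng) for _ in range(batch))
        # Interval per move: mean +/- width * standard error of the mean.
        interval = {}
        for move in alive:
            xs = samples[move]
            mean = statistics.mean(xs)
            half = width * statistics.stdev(xs) / math.sqrt(len(xs))
            interval[move] = (mean - half, mean + half)
        best_lower = max(low for low, _ in interval.values())
        # Keep a move only if its upper bound reaches the best lower bound.
        alive = [move for move in alive if interval[move][1] >= best_lower]
        if len(alive) == 1:
            break
    best = max(alive, key=lambda move: statistics.mean(samples[move]))
    used = {move: len(xs) for move, xs in samples.items()}
    return best, alive, used

# Hypothetical true scores; each "random game" is the true score plus noise.
true_score = {"a": 0.0, "b": 5.0, "c": 30.0}
noisy_game = lambda move, rng: rng.gauss(true_score[move], 10.0)
best, alive, used = progressive_pruning(list(true_score), noisy_game, random.Random(0))
print(best, alive, used)
```

When the leading moves are closer together than in this toy setup, several of them survive the first round and the later batches go only to those survivors, which is how more sample games end up on promising moves.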
16
Experimental Design
Machine
–Intel Xeon Dual CPU at 2.40 GHz with 2 GB memory
–Use 64 PCs (128 processors) connected by 1GB/s network
Three versions of programs
–BASIC: Basic Monte Carlo Go
–ATARI: BASIC + Atari-50 enhancement
–ATARIPP: ATARI + Progressive Pruning
Experiments
–200 self-play games
–Analysis of decision quality from 58 professional games
17
Diminishing Returns
4*N samples vs N samples for each move
18
Additional enhancements and Winning Percentage
19
Decision Quality of Each Move
[Figure: 3×3 board (columns a-c, rows 1-3) showing the evaluation score of each move according to the “Oracle” (64 million sample games); the scores read 15, 30, 25, 20, 17, 10, 7, 21, 12, with 30 at 2b and 15 at 2c]
Selected moves for 100-sample-game Monte Carlo Go: 2b -> 9 times, 2c -> 1 time
Average error of one move is ((30 - 30) * 9 + (30 - 15) * 1) / 10 = 1.5 points
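The error measure on this slide can be written as a small function: for each selected move, take the gap between the best oracle score and that move's oracle score, then average over all selections. The function name `average_error` and the dictionary layout are assumptions for this sketch, and only the two oracle scores used in the slide's arithmetic (30 for 2b, 15 for 2c) are included.

```python
def average_error(oracle_scores, selections):
    """Mean loss versus the oracle's best move.

    oracle_scores: move -> oracle evaluation score
    selections:    move -> number of times the program selected it
    """
    best = max(oracle_scores.values())
    trials = sum(selections.values())
    lost = sum((best - oracle_scores[move]) * count
               for move, count in selections.items())
    return lost / trials

# Values from the slide: 2b (score 30) chosen 9 times, 2c (score 15) once.
error = average_error({"2b": 30, "2c": 15}, {"2b": 9, "2c": 1})
print(error)  # 1.5
```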
20
Decision Quality of Each Move (Basic)
21
Decision Quality of Each Move (with Atari-50 Enhancement)
22
Summary of Experimental Results
Additional enhancements improve strength of Monte Carlo Go
Returns diminish eventually
Additional enhancements reach diminishing returns more quickly
Need to collect more samples in the early stage of a 9x9 Go game
23
Conclusions and Future Work
Conclusions
–Additional samples achieve only small improvements
  Not like search algorithms, e.g. in chess
–Good at strategy, not tactics
  Blunders due to lack of domain knowledge
–Easy to evaluate
–Easy to parallelize
–The way for Monte Carlo Go to go
  Small numbers of sample games with many enhancements will be promising
Future Work
–Adjust probability with pattern matching
–Learning
–Search + Monte Carlo Go
  MoGo (exploration-exploitation in the search tree using UCT)
–Scale to 19×19
24
Reference:
Go wiki http://en.wikipedia.org/wiki/Go_(board_game)
Gnu Go http://www.gnu.org/software/gnugo/
KGS Go Server http://www.gokgs.com
CGOS 9x9 Computer Go Server http://cgos.boardspace.net
Questions?