Combining Tactical Search and Monte-Carlo in the Game of Go
by Tristan Cazenave & Bernard Helmstetter
Presenter: Ling Zhao, University of Alberta, November 1, 2005

Outline
- Monte-Carlo Go
- Motivations
- Tactical search
- Gather statistics
- Combining search and Monte-Carlo
- Experimental results

Monte-Carlo Go
- Invented in 1993 by Bruegmann, using simulated annealing.
- Based on Abramson's expected-outcome model (1990).
- Achieved moderate success on 9x9 boards.

Basic idea
- Play a number of random games.
- Choose a move by 1-ply search, maximizing the expected score.
- The only domain-dependent knowledge is the notion of an eye (random play never fills its own eyes).
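The 1-ply Monte-Carlo loop above can be sketched in Python. A full Go engine is out of scope here, so this illustrative sketch uses a toy subtraction game (take 1 or 2 objects; whoever takes the last one wins) as a stand-in; the function names and the toy game are my own, not from the paper.

```python
import random

def random_playout(n, rng):
    """Finish the toy game (take 1 or 2 from n; taking the last object
    wins) with uniformly random moves. Returns +1 if the player to move
    at the start of the playout wins, -1 otherwise."""
    to_move = 0  # 0 = the player whose turn it is when the playout starts
    while n > 0:
        n -= rng.choice([1, 2]) if n >= 2 else 1
        if n == 0:
            return 1 if to_move == 0 else -1
        to_move = 1 - to_move

def choose_move(n, n_games=2000, seed=0):
    """1-ply Monte-Carlo search: for each legal first move, average the
    scores of random playouts from the resulting position (negated,
    since the opponent moves next), and pick the best move."""
    rng = random.Random(seed)
    best_move, best_mean = None, float("-inf")
    for move in (m for m in (1, 2) if m <= n):
        if move == n:
            return move  # immediate win, no sampling needed
        total = sum(-random_playout(n - move, rng) for _ in range(n_games))
        mean = total / n_games
        if mean > best_mean:
            best_move, best_mean = move, mean
    return best_move
```

From n = 4, taking 1 leaves the opponent in the weaker position, and the playout averages reliably detect this; in real Monte-Carlo Go the playouts are full random games and the score is the final territory count.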

Weaknesses
- Limited scalability, due to the large amount of computation required.
- Blunders, due to the lack of tactical knowledge.

Framework
- Tactical search: capture, connection, eye, life and death.
- Play random games and gather statistics for goals.
- Goal evaluation: the mean score of the games where the goal is achieved, minus the mean score of the games where it fails.
- Pick the move associated with the best goal.
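The goal-evaluation step reduces to simple arithmetic over the playout statistics. A minimal sketch, assuming each goal's outcomes are recorded as (achieved, final_score) pairs (this representation is my assumption, not the paper's):

```python
def goal_value(results):
    """Evaluate a goal from playout statistics. `results` is a list of
    (achieved, final_score) pairs, one per random game in which the goal
    was at stake. The goal's value is the mean score of the games where
    it was achieved minus the mean score of the games where it failed."""
    achieved = [score for ok, score in results if ok]
    failed = [score for ok, score in results if not ok]
    if not achieved or not failed:
        return 0.0  # assumption: treat a one-sided sample as neutral
    return sum(achieved) / len(achieved) - sum(failed) / len(failed)
```

A large positive value means achieving the goal correlates strongly with a better final score, so the goal is worth pursuing.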

Tactical search
- Capture search: for any string, determine whether it can be captured or saved.
- Connection search: for any two strings, determine whether they can be connected.
- Empty connection search: determine whether a string can be connected to an empty point.
- Eye search: determine whether an eye can be made on an empty point or its neighbors.
- Life and death search: uses generalized widening for groups of strings.

Statistics on random games
- Compute the mean score of the random games where a goal is achieved, and the mean of those where it fails.
- Two new goals for intersections:
  1. Playing first on an intersection.
  2. Owning an intersection at the end of a game.

Selecting problems
- Strings that cannot be disconnected are merged into groups.
- Select the simplest problem for each goal, to avoid over-estimating goals.

Gather statistics
- Play random games.
- For each selected goal, compute the mean score of the games where it succeeds and the mean of those where it fails.
- Score of a life problem for a string: the mean score of the games where an intersection of the string keeps its color.

Choose a move
- Find the goal with the maximum difference between the two mean scores.
- Choose the move associated with that goal.
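The final selection step can be sketched as a lookup over the per-goal statistics. The goal names, the move representation, and the two dictionaries below are hypothetical placeholders for illustration:

```python
def select_best_goal_move(goal_means, goal_to_move):
    """Pick the move associated with the best goal. `goal_means` maps
    each goal to (mean_when_achieved, mean_when_failed); `goal_to_move`
    maps each goal to the move that works toward it. The best goal is
    the one with the largest difference between the two means."""
    best_goal = max(goal_means, key=lambda g: goal_means[g][0] - goal_means[g][1])
    return goal_to_move[best_goal]

# Hypothetical example: capturing string A swings the score more than
# connecting strings B and C, so its move is chosen.
means = {"capture_A": (30.0, -5.0), "connect_BC": (12.0, 4.0)}
moves = {"capture_A": (3, 4), "connect_BC": (5, 5)}
```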

Why is it useful?
- Provides a high-level classification of points on the board.
- Successfully incorporates tactical search into Monte-Carlo evaluation.

Positive and negative goals
- Positive goals: high confidence in the search results.
- Negative goals: lower confidence.
- Example: saving a string (a string is considered safe when it has more than 4 liberties).
- This fixes over-estimation.

Experimental results
- New enhancement vs. standard Monte-Carlo, each playing 10,000 random games to choose a move, on 20 9x9 games. Result: 52.1 (±34.2).
- With the first playing 1,000 games and the second 10,000 games per move: result 24.6 (±40).

Experimental results (cont'd)
- New enhancement vs. Golois.
- Both use the same tactical search; Golois additionally uses global search and hand-tuned heuristics.
- 40 games were played. Result: 26 points.

Conclusions
- A creative idea for incorporating tactical search into Monte-Carlo.
- A nice extension of the authors' previous work.
- The experimental results are good.
- The program should also be tested against the strongest programs.