AlphaGo from Google DeepMind beat human grandmasters in 2016

Presentation transcript:

AlphaGo from Google DeepMind beat human grandmasters in 2016
  uses Monte Carlo game tree search
  Science "Breakthrough of the Year" runner-up
  Silver et al. (2016). Nature, 529:484–489.

Go - the game
  19x19 board, black vs. white stones
  images from https://en.wikipedia.org/wiki/Go_(game)
  game-tree size, with branching factor b and depth d:
    chess: O(b^d) ~ 35^80
    Go: O(b^d) ~ 250^150
  Rule 1: Every stone remaining on the board must have at least one open "point" ("liberty") directly adjacent (up, down, left, or right), or must be part of a connected group that has at least one such open point next to it. Stones or groups of stones which lose their last liberty are removed from the board.
  Strategies:
    Connection: Keeping one's own stones connected means that fewer groups need to make living shape, and one has fewer groups to defend.
    Stay alive: The simplest way to stay alive is to establish a foothold in the corner or along one of the sides. At a minimum, a group must have two eyes (separate open points) to be "alive".
  [Figure labels: liberties, capture]
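Rule 1 can be made concrete with a flood fill over a connected group. The following is a minimal sketch, not code from the slides: the board encoding (a list of strings using 'B', 'W' and '.') and the function names `group_and_liberties` and `is_captured` are assumptions made for illustration.

```python
from typing import List, Set, Tuple

Point = Tuple[int, int]  # (row, column)


def group_and_liberties(board: List[str], start: Point) -> Tuple[Set[Point], Set[Point]]:
    """Flood-fill the group containing `start`; return (stones, liberties).

    Assumed encoding: `board` is a list of equal-length strings where
    'B' is black, 'W' is white and '.' is an empty point.
    """
    rows, cols = len(board), len(board[0])
    color = board[start[0]][start[1]]
    if color == ".":
        raise ValueError("start must be a stone, not an empty point")
    stones: Set[Point] = {start}
    liberties: Set[Point] = set()
    stack = [start]
    while stack:
        r, c = stack.pop()
        # Only up, down, left, right count as adjacent (no diagonals).
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            cell = board[nr][nc]
            if cell == ".":
                liberties.add((nr, nc))
            elif cell == color and (nr, nc) not in stones:
                stones.add((nr, nc))
                stack.append((nr, nc))
    return stones, liberties


def is_captured(board: List[str], start: Point) -> bool:
    """A group with no liberties left is removed from the board."""
    return len(group_and_liberties(board, start)[1]) == 0
```

On a small board, a single black stone surrounded on all four sides by white stones has no liberties and counts as captured, while two connected black stones on the bottom edge share the open points around them as one group.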

AlphaGo implementation
  uses deep networks (13 layers) to represent the "value function" and the "policy function"
    these are for Reinforcement Learning
    learn the value of good moves or positions by whether they lead to wins (discounted rewards)
  performs Monte Carlo game search
    explores the state space, like minimax
    random "rollouts"
    simulates probable plays by the opponent according to the policy function
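The idea of scoring moves by random rollouts can be illustrated on a much smaller game. This is a hedged sketch of flat Monte Carlo evaluation, not AlphaGo's tree search: the toy game (a pile of stones, take 1 to 3, taking the last stone wins), the uniform random policy standing in for a learned policy function, and all function names are assumptions for illustration.

```python
import random
from typing import Callable, Dict, List

MAX_TAKE = 3  # assumed toy rule: take 1 to 3 stones; taking the last stone wins

Policy = Callable[[int, random.Random], int]


def legal_moves(pile: int) -> List[int]:
    return list(range(1, min(MAX_TAKE, pile) + 1))


def uniform_policy(pile: int, rng: random.Random) -> int:
    """Stand-in for a learned policy function: picks a legal move at random."""
    return rng.choice(legal_moves(pile))


def rollout(pile: int, to_move: int, policy: Policy, rng: random.Random) -> int:
    """Play the game out with `policy` for both sides; return the winner (0 or 1)."""
    while True:
        pile -= policy(pile, rng)
        if pile == 0:
            return to_move  # whoever took the last stone wins
        to_move = 1 - to_move


def evaluate_moves(pile: int, n_rollouts: int = 2000, seed: int = 0,
                   policy: Policy = uniform_policy) -> Dict[int, float]:
    """Estimate player 0's win rate for each legal move from random rollouts."""
    rng = random.Random(seed)
    values: Dict[int, float] = {}
    for move in legal_moves(pile):
        remaining = pile - move
        if remaining == 0:
            values[move] = 1.0  # immediate win, no simulation needed
            continue
        # After our move the opponent (player 1) is to move.
        wins = sum(rollout(remaining, 1, policy, rng) == 0 for _ in range(n_rollouts))
        values[move] = wins / n_rollouts
    return values


def best_move(pile: int, **kwargs) -> int:
    values = evaluate_moves(pile, **kwargs)
    return max(values, key=values.get)
```

With a pile of 5, taking one stone leaves the opponent in the worst position, and the rollout estimates reflect that even though every simulated game is played randomly. A stronger rollout policy, or a tree that reuses statistics across simulations, would sharpen the estimates further.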

AlphaGo implementation
  hardware: 1920 CPUs, 280 GPUs
  training of networks:
    phase 1: supervised learning from a database of 30 million human moves
    phase 2: play against self using reinforcement learning
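The two training phases can be sketched on a toy scale. The code below is an illustration under strong assumptions, not the AlphaGo training procedure: a table of softmax logits replaces the deep network, the game is a hypothetical pile game (take 1 to 3 stones, taking the last stone wins), the self-play update is plain REINFORCE with a +1/-1 outcome reward, and the class and function names are invented for this example.

```python
import math
import random
from typing import Dict, List, Tuple

MAX_TAKE = 3  # assumed toy rule: take 1 to 3 stones; taking the last stone wins


def legal_moves(pile: int) -> List[int]:
    return list(range(1, min(MAX_TAKE, pile) + 1))


class TabularPolicy:
    """Softmax policy with one logit per (pile, move); stands in for a policy network."""

    def __init__(self, max_pile: int) -> None:
        self.logits: Dict[int, Dict[int, float]] = {
            p: {m: 0.0 for m in legal_moves(p)} for p in range(1, max_pile + 1)
        }

    def probs(self, pile: int) -> Dict[int, float]:
        row = self.logits[pile]
        top = max(row.values())  # subtract the max for numerical stability
        exps = {m: math.exp(v - top) for m, v in row.items()}
        total = sum(exps.values())
        return {m: e / total for m, e in exps.items()}

    def sample(self, pile: int, rng: random.Random) -> int:
        p = self.probs(pile)
        moves = list(p)
        return rng.choices(moves, weights=[p[m] for m in moves])[0]

    def nudge(self, pile: int, move: int, step: float) -> None:
        """Move the logits by step * gradient of log pi(move | pile)."""
        p = self.probs(pile)
        for m in self.logits[pile]:
            self.logits[pile][m] += step * ((1.0 if m == move else 0.0) - p[m])


def supervised_phase(policy: TabularPolicy, dataset: List[Tuple[int, int]],
                     lr: float = 0.5, epochs: int = 20) -> None:
    """Phase 1: raise the likelihood of recorded (pile, move) pairs."""
    for _ in range(epochs):
        for pile, move in dataset:
            policy.nudge(pile, move, lr)


def self_play_phase(policy: TabularPolicy, start_pile: int, games: int = 2000,
                    lr: float = 0.1, seed: int = 0) -> None:
    """Phase 2: REINFORCE on games the policy plays against itself."""
    rng = random.Random(seed)
    for _ in range(games):
        pile, player = start_pile, 0
        history: List[Tuple[int, int, int]] = []  # (player, pile seen, move made)
        while pile > 0:
            move = policy.sample(pile, rng)
            history.append((player, pile, move))
            pile -= move
            player = 1 - player
        winner = history[-1][0]  # the player who took the last stone
        for who, seen_pile, move in history:
            reward = 1.0 if who == winner else -1.0
            policy.nudge(seen_pile, move, lr * reward)
```

A possible usage, with a tiny hand-written list standing in for the database of human moves:

```python
policy = TabularPolicy(max_pile=10)
supervised_phase(policy, dataset=[(5, 1), (6, 2), (7, 3)])
self_play_phase(policy, start_pile=10)
print(policy.probs(3))  # move probabilities when three stones remain
```

Here the policy plays both sides with its current parameters, which is the simplest form of self-play; the winner's moves are reinforced and the loser's are discouraged.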

match against Tang Weixing (2016)