General Game Playing (GGP)
Marco Adelfio
CMSC 828N – Spring 2009

Classic Game Playing AI
- Deep Blue (chess)
- TD-Gammon (backgammon)
- Poki (poker)

General Game Playing AI
- GGP Agent (a single agent for all games)

General Game Playing
GGP Goals:
- Create systems to play arbitrary games (given formal game definitions)
- Eliminate game-specific strategies
- Emphasize generic strategy formulation
Competition created by the Stanford Logic Group:
- Hosted during the AAAI conference since 2005
- $10,000 Grand Prize

General Game Playing
Questions:
- What additional challenges arise for GGP agents?
- How should a GGP agent evaluate game states?
- Can a GGP agent transfer knowledge between games?

General Game Playing
- Finitely many players and states
- Game play is controlled by a Game Manager over a network
- Players act synchronously (noops allowed)
- Time limits are enforced
A basic agent must (see the interface sketch below):
- Understand the rule specification
- Respond to game states with legal actions
- Recognize a terminal state and its payoffs
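Concretely, once the rules are parsed, a player needs only a handful of queries over the game description. A minimal sketch of that interface in Python (an illustrative interface, not code from the talk):

    from abc import ABC, abstractmethod

    class Game(ABC):
        """Queries a GGP agent must answer from the rule specification alone.
        Illustrative interface; real players derive these from the GDL rules."""

        @abstractmethod
        def initial_state(self):
            """The initial state defined by the (init ...) facts."""

        @abstractmethod
        def legal_moves(self, state, role):
            """All legal actions for `role` in `state` (a noop if it cannot act)."""

        @abstractmethod
        def next_state(self, state, joint_move):
            """Apply one action per role simultaneously; return the new state."""

        @abstractmethod
        def is_terminal(self, state):
            """True if `state` satisfies the terminal relation."""

        @abstractmethod
        def goal(self, state, role):
            """Payoff for `role` in a terminal state (0-100 in GGP)."""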

Game Definition Language
A game definition must logically define:
- The set of states in the game
- The legal actions for each player in a given game state
- The transition function
- The initial state
- The terminal states and their payoffs

Game Definition Language - Example
This excerpt is from a Tic-Tac-Toe description; terms prefixed with ? are variables.

    (role p1)
    (role p2)

    (init (cell 1 1 b))
    (init (cell 1 2 b))
    …
    (init (control p1))
    …
    (<= (legal ?w (mark ?x ?y))
        (true (cell ?x ?y b))
        (true (control ?w)))
    …
    (<= (next (cell ?m ?n x))
        (does p1 (mark ?m ?n))
        (true (cell ?m ?n b)))
    …
    (<= (row ?m ?x)
        (true (cell ?m 1 ?x))
        (true (cell ?m 2 ?x))
        (true (cell ?m 3 ?x)))
    …
    (<= (line ?x) (row ?m ?x))
    (<= (line ?x) (column ?m ?x))
    (<= (line ?x) (diagonal ?x))
    …
    (<= (goal p1 100) (line x))
    (<= (goal p1 0) (line o))
    …
    (<= terminal (line x))

Game Communication

    Game Manager Message                          Game Player Response
    (START MATCH.435 WHITE description 90 30)    READY
    (PLAY MATCH.435 (NIL NIL))                    (MARK 2 2)
    (PLAY MATCH.435 ((MARK 2 2) NOOP))            NOOP
    (PLAY MATCH.435 (NOOP (MARK 1 3)))            (MARK 1 2)
    (PLAY MATCH.435 ((MARK 1 2) NOOP))            NOOP
    ...
    (STOP MATCH.435 ((MARK 3 3) NOOP))            DONE
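A player is essentially a server that answers these messages in turn. A minimal sketch of the dispatch loop, assuming the Game interface sketched earlier; parse_gdl and pick_move are hypothetical helpers, and messages are assumed to arrive already parsed into nested lists:

    def handle_message(msg, session):
        """Answer one parsed Game Manager message, e.g.
        ['PLAY', 'MATCH.435', [['MARK', '2', '2'], 'NOOP']].
        Sketch only: parse_gdl and pick_move are hypothetical."""
        head = msg[0]
        if head == "START":
            # (START <match> <role> <rules> <startclock> <playclock>)
            session.role = msg[2]
            session.game = parse_gdl(msg[3])
            session.state = session.game.initial_state()
            return "READY"
        if head == "PLAY":
            # msg[2] is the previous joint move, or (NIL NIL) on the first turn
            prev = msg[2]
            if prev != ["NIL", "NIL"]:
                session.state = session.game.next_state(session.state, prev)
            return pick_move(session.game, session.state, session.role)
        if head == "STOP":
            return "DONE"   # match over; final payoffs follow from goal()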

General Game Playing
Design Challenges:
- Indeterminacy
- Size
- Multi-game Commonalities
- Opponent Recognition

AAAI Competition – Past Winners
- 2005: ClunePlayer (UCLA)
- 2006: FluxPlayer (Technical University of Dresden)
- 2007: CADIA (Reykjavik University)
- 2008: CADIA (Reykjavik University)

Agent 1: ClunePlayer
Approach: Minimax
Problem: Needs to assign values to intermediate game states in arbitrary games.
Solution:
1. Calculate a vector of generic features at each node
2. Simulate games to determine which features are "stable" and correlated with either payoff or control
3. When running minimax, use a combination of those scores as the evaluation heuristic (see the sketch below)
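A minimal sketch of the resulting search: depth-limited minimax that falls back on the learned evaluation at the cutoff. It assumes the illustrative Game interface above and a two-player, turn-taking game where the idle player noops; ClunePlayer's actual feature construction is considerably more involved:

    def minimax(game, state, role, opponent, depth, evaluate):
        """Depth-limited minimax. `evaluate` is the learned heuristic that
        combines the stable, payoff-correlated features (illustrative).
        Joint moves are (role_move, opponent_move) pairs."""
        if game.is_terminal(state):
            return game.goal(state, role)      # exact payoff at true leaves
        if depth == 0:
            return evaluate(state, role)       # heuristic value at the cutoff
        my_moves = game.legal_moves(state, role)
        if my_moves != ["noop"]:               # max layer: our turn to act
            return max(minimax(game, game.next_state(state, (m, "noop")),
                               role, opponent, depth - 1, evaluate)
                       for m in my_moves)
        return min(minimax(game, game.next_state(state, ("noop", m)),
                           role, opponent, depth - 1, evaluate)   # min layer
                   for m in game.legal_moves(state, opponent))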

Agent 2: CADIA-Player
Approach: UCT (a variant of Monte Carlo simulation)
Monte Carlo:
- Pick random actions for each player to descend the tree
- After reaching a terminal state, update the expected payoff Q(s,a) for each visited state s and action a (see the rollout sketch below)
- Introduces an explore/exploit tradeoff
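One rollout of the plain Monte Carlo scheme, with the running-average update of Q(s,a), might look like this; it again assumes the illustrative Game interface, hashable states, and single-role moves for brevity:

    import random
    from collections import defaultdict

    Q = defaultdict(float)   # Q[(s, a)]: expected-payoff estimate
    N = defaultdict(int)     # N[(s, a)]: visit count

    def rollout(game, state, role):
        """Play one random game to the end, then back the payoff up into Q."""
        visited = []
        while not game.is_terminal(state):
            action = random.choice(game.legal_moves(state, role))
            visited.append((state, action))
            state = game.next_state(state, action)
        payoff = game.goal(state, role)
        for s, a in visited:                 # incremental running average
            N[(s, a)] += 1
            Q[(s, a)] += (payoff - Q[(s, a)]) / N[(s, a)]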

Agent 2: CADIA-Player
UCT (Upper Confidence bound for Trees):
- Balances exploration and exploitation
- Gives a "bonus" to less-travelled paths (the selection rule is shown below)
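The standard UCT selection rule (the slide alludes to it without spelling it out) picks, at state s, the action maximizing the payoff estimate plus an exploration bonus that shrinks the more often an action has been tried:

    a^{*} = \arg\max_{a \in A(s)} \left[ Q(s,a) + C \sqrt{\frac{\ln N(s)}{N(s,a)}} \right]

where N(s) is the visit count of s, N(s,a) is the number of times a was tried there, and C is a tunable exploration constant.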

Agent 3: UTexas LARG
Approach: Knowledge Transfer
- Uses lessons from past games to improve play in new games
- War Games!
- Determines whether a new game is isomorphic or similar to a previous game
- If so, transfers the estimated rewards

Summary
- General Game Playing introduces a different set of challenges than designing game-specific AI
- The biggest challenge is evaluating states in a novel game
- A better understanding of general strategy formation has many applications

References
- GGP Website:
- Hilmar Finnsson. CADIA-Player: A General Game Playing Agent. MSc Thesis, School of Computer Science, Reykjavik University.
- Kuhlmann, Gregory and Peter Stone. Graph-Based Domain Mapping for Transfer Learning in General Games. Lecture Notes in Computer Science, Volume 4701/2007.