The Implementation of Machine Learning in the Game of Checkers
Billy Melicher, Computer Systems Lab, 2008-09

Abstract
- Machine learning uses past information to predict future states.
- It can be used in any situation where the past will predict the future.
- A learning program will adapt to new situations.

Introduction
- Checkers is used here as a testbed for exploring machine learning.
- Checkers has many tactical aspects that make it a good subject of study.

Background
- Minimax
- Heuristics
- Learning

Minimax
- A method of adversarial search.
- Every pattern (board) can be given a fitness value (heuristic).
- Each player chooses, from the moves available to them, the outcome that is best for them.
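The choice rule above can be sketched as a short recursion. This is a minimal illustration, not the paper's implementation: the game tree is a nested list whose leaves are fitness values.

```python
def minimax(node, maximizing):
    # Leaves are fitness values; internal nodes are lists of child positions.
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    # Each player picks the outcome best for them from their choices.
    return max(values) if maximizing else min(values)

# A 2-ply toy tree: the maximizer moves first, the minimizer replies.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
minimax(tree, True)  # 3: the best the maximizer can guarantee
```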

Minimax
[Minimax tree diagram; chart from Wikipedia]

Minimax
- The game tree has an exponential growth rate.
- The search can therefore only evaluate a limited number of moves into the future; this depth is called the ply.
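The exponential growth is easy to see numerically: with an average branching factor b, searching d ply visits roughly b^d positions. The branching factor below is a hypothetical round number, not a measured value for checkers.

```python
# Node counts for a hypothetical average branching factor of 8.
branching = 8
for ply in (2, 4, 6, 8):
    # Roughly branching**ply positions must be evaluated at this depth.
    print(ply, branching ** ply)
```

Each added ply multiplies the work by the branching factor, which is why a fixed depth limit is unavoidable.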

Heuristic
- A heuristic predicts the outcome of a board: the higher the fitness value, the better the expected outcome.
- Heuristics are not perfect.
- Creating one requires expertise in the situation being modeled.

Heuristics
- H(s) = c0F0(s) + c1F1(s) + ... + cnFn(s), where H(s) is the heuristic value of board s, each Fi is a feature term, and each ci is its weight.
- The heuristic has many different terms. In checkers, terms could include:
  - Number of checkers
  - Number of kings
  - Number of checkers on an edge
  - How far the checkers have advanced on the board
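The weighted-sum form H(s) = c0F0(s) + ... + cnFn(s) maps directly onto code. The feature names below are illustrative stand-ins, not the features the paper actually used.

```python
def heuristic(board, weights, features):
    # H(s) = c0*F0(s) + c1*F1(s) + ... + cn*Fn(s)
    return sum(c * f(board) for c, f in zip(weights, features))

# Hypothetical checkers feature terms, as signed differences:
features = [
    lambda b: b["my_checkers"] - b["opp_checkers"],
    lambda b: b["my_kings"] - b["opp_kings"],
]
weights = [1.0, 1.5]  # kings weighted higher than ordinary checkers

board = {"my_checkers": 8, "opp_checkers": 7, "my_kings": 2, "opp_kings": 1}
heuristic(board, weights, features)  # 1.0*1 + 1.5*1 = 2.5
```

Learning by generalization then amounts to adjusting the weights list while the feature functions stay fixed.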

Learning by Rote
- Stores every game played and records the moves made from each board.
- Relates the moves made from a particular board to the outcome of the game.
- The program becomes more likely to make moves that resulted in a win and less likely to make moves that resulted in a loss.
- Works well in the endgame, but not as well in the midgame.
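A rote store can be sketched as a table of win/loss tallies per (board, move) pair. The structure and names here are an assumption for illustration, not the paper's data layout.

```python
from collections import defaultdict

# Hypothetical rote-learning store: per-board, per-move win/loss tallies.
rote = defaultdict(lambda: defaultdict(lambda: {"wins": 0, "losses": 0}))

def record_game(positions_and_moves, won):
    # Relate every move made during the game to its final outcome.
    for board, move in positions_and_moves:
        entry = rote[board][move]
        entry["wins" if won else "losses"] += 1

def preference(board, move):
    # Fraction of past games won after this move; unseen moves stay neutral.
    e = rote[board][move]
    played = e["wins"] + e["losses"]
    return e["wins"] / played if played else 0.5

record_game([("start", "a"), ("mid", "b")], won=True)
record_game([("start", "a")], won=False)
preference("start", "a")  # 0.5 after one win and one loss
```

The memory cost grows with every distinct board seen, which is why the slides note that rote learning needs a large data set and lots of memory.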

How I Store Data
- Each checkerboard is converted into a 32-digit base-5 number, where each digit corresponds to a playable square and its value encodes what occupies that square.
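The packing can be sketched as follows. The five per-square codes are an assumption (the paper does not give them): 0 empty, 1 red man, 2 red king, 3 black man, 4 black king.

```python
def encode(squares):
    # Pack 32 playable squares, each in one of 5 states, into one integer.
    assert len(squares) == 32
    n = 0
    for s in squares:
        n = n * 5 + s
    return n

def decode(n):
    # Recover the 32 base-5 digits, most significant square first.
    squares = []
    for _ in range(32):
        n, s = divmod(n, 5)
        squares.append(s)
    return squares[::-1]

# Hypothetical opening layout: 12 red men, 8 empty squares, 12 black men.
board = [1] * 12 + [0] * 8 + [3] * 12
decode(encode(board)) == board  # the encoding round-trips
```

A single integer key like this makes the rote-learning table compact and fast to look up.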

Learning by Generalization
- Uses a heuristic function to guide moves.
- Adjusts the heuristic function after each game based on the outcome.
- Works well in the midgame, but not as well in the early game and endgame.
- Requires identifying the features that affect the game.

Development
- Minimax algorithm with alpha-beta pruning
- Both learning by rote and learning by generalization
- Temporal difference learning
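Alpha-beta pruning returns the same value as plain minimax while skipping branches that cannot affect the final choice. A minimal sketch over the same nested-list trees, not the paper's implementation:

```python
def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf")):
    # alpha: best value the maximizer can already guarantee;
    # beta:  best value the minimizer can already guarantee.
    if not isinstance(node, list):
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # beta cutoff: the minimizer will avoid this branch
        return value
    value = float("inf")
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:
            break  # alpha cutoff: the maximizer will avoid this branch
    return value

alphabeta([[3, 12, 8], [2, 4, 6], [14, 5, 2]], True)  # 3, same as minimax
```

With good move ordering the pruning roughly doubles the reachable ply for the same amount of work.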

Temporal Difference Learning
- Adjusts the heuristic based on the difference between the heuristic's value at one time and its value at a later time.
- The equilibrium moves toward the ideal function.
- Update rule: U(s) <- U(s) + α(R(s) + γU(s') - U(s))
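The update rule in code form, as a minimal sketch with hypothetical state names and made-up numbers:

```python
def td_update(U, s, s_next, reward, alpha, gamma):
    # U(s) <- U(s) + alpha * (R(s) + gamma * U(s') - U(s))
    U[s] += alpha * (reward + gamma * U[s_next] - U[s])

# Hypothetical utilities: a win at the terminal state is worth 1.0.
U = {"mid": 0.0, "end": 1.0}
td_update(U, "mid", "end", reward=0.0, alpha=0.5, gamma=1.0)
U["mid"]  # 0.5: the earlier estimate moves toward the later, better-informed one
```

Repeated updates pull every state's estimate toward the value implied by what followed it, which is the sense in which the weights reach an equilibrium.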

Temporal Difference Learning
- There is no proof that a prediction made closer to the end of the game is better, but common sense suggests it is.
- Changes the heuristic so that it better predicts the value of all boards, by adjusting the heuristic's weights.

Alpha Value
- The alpha value scales down changes to the heuristic based on how much data has been seen, giving diminishing returns.
- This is necessary to ensure that rare occurrences do not change the heuristic too much.

Development
- Equation for learning, applied to each weight: w = (previous - current)(previous + current/2)
- Equation for the alpha value: a = 50/(49 + n)
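The alpha schedule a = 50/(49 + n) starts at 1 and decays as the game count n grows, which is exactly the diminishing-returns behavior described on the previous slide:

```python
def alpha(n):
    # Learning-rate schedule from the slides: a = 50 / (49 + n),
    # where n is the number of games (data points) seen so far.
    return 50 / (49 + n)

alpha(1), alpha(51), alpha(951)  # 1.0, 0.5, 0.05
```

Early games move the weights freely; after hundreds of games each new result nudges them only slightly, so a single unusual game cannot wreck a settled weight.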

Results
- The value of each weight reaches an equilibrium and changes to reflect what the program has learned.
- Occasionally requires programmer intervention when a weight reaches a false equilibrium.

Results

- Learning by rote requires a large data set and large amounts of memory.
- It is necessary for determining the alpha value in temporal difference learning.

Conclusions
- Temporal difference learning is a good way to find the equilibrium weights, though it sometimes requires intervention.
- It does not require much memory.
- Substantial learning could be achieved with relatively few runs.
- Learning did not require the program to know strategies, but it does require the program to play toward a win.