Applied machine learning in game theory Dmitrijs Rutko Faculty of Computing University of Latvia Joint Estonian-Latvian Theory Days at Rakari, 2010.

Topic outline
- Game theory
- Game tree search
- Fuzzy approach
- Machine learning
- Heuristics
- Neural networks
- Adaptive / reinforcement learning
- Card games

Research overview
- Deterministic / stochastic games
- Perfect / imperfect information games

Finite zero-sum games
                       deterministic                                 chance
perfect information    chess, checkers, go, othello                  backgammon, monopoly, roulette
imperfect information  battleship, kriegspiel, rock-paper-scissors   bridge, poker, scrabble

Game trees

Classical algorithms
- MiniMax: $O(w^d)$
- Alpha-Beta: $O(w^{d/2})$
[Figure: game tree with max, min, max levels; leaves marked √ or Χ]
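The difference between the two algorithms can be sketched in a few lines. This is a minimal illustration, not the code behind the slides: the nested-list tree encoding, the function names and the example tree are all assumptions made for the sketch.

```python
import math

def minimax(node, maximizing=True):
    """Plain minimax: visits every leaf.
    A node is either a number (leaf) or a list of child nodes."""
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True, visited=None):
    """Alpha-beta: same result as minimax, but stops expanding a node
    as soon as its value can no longer affect the root (alpha >= beta)."""
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)  # record evaluated leaves for illustration
        return node
    value = -math.inf if maximizing else math.inf
    for child in node:
        score = alphabeta(child, alpha, beta, not maximizing, visited)
        if maximizing:
            value = max(value, score)
            alpha = max(alpha, value)
        else:
            value = min(value, score)
            beta = min(beta, value)
        if alpha >= beta:
            break  # cut-off: remaining children are skipped
    return value

# Hypothetical example tree: max root, three min nodes, six leaves.
tree = [[3, 5], [2, 9], [0, 7]]
seen = []
best = alphabeta(tree, visited=seen)
```

On this example both functions agree on the root value, while `seen` holds only the leaves that alpha-beta actually evaluated; the leaves 9 and 7 are cut off.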

Advanced search techniques
- Transposition tables: time efficiency at a high cost of space
- PVS
- Negascout
- NegaC*
- SSS* / DUAL*
- MTD(f)
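As an illustration of the last technique in the list, MTD(f) converges on the minimax value through repeated null-window alpha-beta calls. The sketch below assumes integer leaf values and a nested-list tree (both assumptions of this example), and it omits the transposition table that practical implementations rely on to avoid re-searching the same nodes.

```python
import math

def alphabeta(node, alpha, beta, maximizing=True):
    """Fail-soft alpha-beta over a nested-list tree (leaf = integer)."""
    if not isinstance(node, list):
        return node
    value = -math.inf if maximizing else math.inf
    for child in node:
        score = alphabeta(child, alpha, beta, not maximizing)
        if maximizing:
            value = max(value, score)
            alpha = max(alpha, value)
        else:
            value = min(value, score)
            beta = min(beta, value)
        if alpha >= beta:
            break
    return value

def mtdf(root, first_guess=0):
    """Narrow [lower, upper] with null-window searches until they meet."""
    g = first_guess
    lower, upper = -math.inf, math.inf
    while lower < upper:
        beta = g + 1 if g == lower else g
        g = alphabeta(root, beta - 1, beta)  # null window (beta - 1, beta)
        if g < beta:
            upper = g  # search failed low: g is an upper bound
        else:
            lower = g  # search failed high: g is a lower bound
    return g

# Hypothetical example tree with a max root.
value = mtdf([[3, 5], [2, 9], [0, 7]], first_guess=0)
```

A first guess close to the true value reduces the number of null-window passes, which is why the guess is usually taken from a previous, shallower search.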

Fuzzy approach
- $O(w^{d/2})$
- More cut-offs
[Figure: game tree with max, min, max levels; sub-trees labeled "<5" or "≥5"; leaves marked √ or Χ]

Geometric interpretation
1) X2: successful separation
2) X1 or X3: reduced search window (α = X1, β = X3)
[Figure: number line from α to β with the test points X1, X2 and X3]
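The two cases above can be sketched as a loop that keeps testing sub-trees against a separation value. This is a hedged reconstruction, not the author's implementation: the function names, the nested-list tree, the integer leaf values and the rule for choosing the next test value are assumptions of this sketch. It also assumes that alpha is no greater than every child value and beta is greater than every child value.

```python
import math

def alphabeta(node, alpha, beta, maximizing):
    """Fail-soft alpha-beta over a nested-list tree (leaf = integer)."""
    if not isinstance(node, list):
        return node
    value = -math.inf if maximizing else math.inf
    for child in node:
        score = alphabeta(child, alpha, beta, not maximizing)
        if maximizing:
            value = max(value, score)
            alpha = max(alpha, value)
        else:
            value = min(value, score)
            beta = min(beta, value)
        if alpha >= beta:
            break
    return value

def best_node_search(children, alpha, beta):
    """Return the index of the best child of a max root.
    Each child is a min node; only 'is this sub-tree >= test?' is asked."""
    candidates = list(range(len(children)))
    while beta - alpha >= 2 and len(candidates) > 1:
        n = len(candidates)
        # Assumed guess rule: always lies strictly between alpha and beta.
        test = alpha + (beta - alpha) * (n - 1) // n
        better = [i for i in candidates
                  if alphabeta(children[i], test - 1, test, False) >= test]
        if better:
            # Several sub-trees reach the test value: raise alpha and keep
            # only those. Exactly one means a successful separation.
            candidates, alpha = better, test
        else:
            # No sub-tree reaches the test value: lower beta.
            beta = test
    return candidates[0]

# Hypothetical example: child values are 3, 2 and 0, so index 0 is best.
best_index = best_node_search([[3, 5], [2, 9], [0, 7]], alpha=0, beta=10)
```

The search returns the best move without computing its exact minimax value, which is where the additional cut-offs come from.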

BNS enhancement through self-training
Traditional statistical approach
Minimax value  Tree count
25             1
26             5
27             11
28             38
29             124
30             206
31             252
32             189
33             111
34             42
35             14
36             7
Total          1000

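Two-dimensional game sub-tree distribution
    23  24  25  26  27  28  29  30  31  32  33  34  35  36  Tree count
23   0                                                               0
24   0   0                                                           0
25   0   1   0                                                       1
26   0   0   2   3                                                   5
27   0   0   5   3   3                                              11
28   01012 13 (cell boundaries not recoverable)                     38
29   0   0   2  10  35  43  34                                     124
30   1   2   6   9  26  58  71  33                                 206
31   0   0   6  10  27  41  78  57  33                             252
32   0   1   3  13  17  30  32  41  38  14                         189
33   0   0   1   2   8  12  26  28  21  11   2                     111
34   0   0   0   1   3   5  13   8   6   2   2   2                  42
35   0   0   0   0   0   2   4   3   2   3   0   0   0              14
36   0   0   0   0   0   0   1   2   2   1   1   0   0   0           7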

Statistical sub-tree separation
Separation value  Tree count
23                0
24                1
25                6
26                30
27                88
28                208
29                374
30                509
31                475
32                325
33                167
34                61
35                21
36                7
Total             2272

Experimental results. 2-width trees

Experimental results. 3-width trees

Future research directions in game tree search
- Multi-dimensional self-training
- Wider trees
- Real domain games

Games with element of chance

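Expectiminimax algorithm
\[
\mathrm{Expectiminimax}(n) =
\begin{cases}
\mathrm{Utility}(n) & \text{if } n \text{ is a terminal state} \\
\max_{s \in \mathrm{Successors}(n)} \mathrm{Expectiminimax}(s) & \text{if } n \text{ is a max node} \\
\min_{s \in \mathrm{Successors}(n)} \mathrm{Expectiminimax}(s) & \text{if } n \text{ is a min node} \\
\sum_{s \in \mathrm{Successors}(n)} P(s) \cdot \mathrm{Expectiminimax}(s) & \text{if } n \text{ is a chance node}
\end{cases}
\]
Complexity: $O(w^d c^d)$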
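The four cases of the definition map directly onto a recursive function. A minimal sketch follows; the tuple-based node encoding and the example tree are assumptions made for illustration only.

```python
def expectiminimax(node):
    """Evaluate a game tree with max, min and chance nodes.

    Assumed encoding for this sketch:
      number                            -> terminal state (its utility)
      ("max", [child, ...])             -> max node
      ("min", [child, ...])             -> min node
      ("chance", [(prob, child), ...])  -> chance node
    """
    if not isinstance(node, tuple):
        return node  # terminal state: Utility(n)
    kind, successors = node
    if kind == "max":
        return max(expectiminimax(s) for s in successors)
    if kind == "min":
        return min(expectiminimax(s) for s in successors)
    if kind == "chance":
        # Probability-weighted average over the chance outcomes.
        return sum(p * expectiminimax(s) for p, s in successors)
    raise ValueError(f"unknown node kind: {kind}")

# Hypothetical example: the max player chooses between two chance nodes.
tree = ("max", [
    ("chance", [(0.5, ("min", [4, 8])), (0.5, ("min", [2, 6]))]),
    ("chance", [(0.25, 10), (0.75, 0)]),
])
value = expectiminimax(tree)
```

In this example the first chance node is worth 0.5 * 4 + 0.5 * 2 = 3.0 and the second 0.25 * 10 = 2.5, so the max player picks the first.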

Performance in Backgammon
Source: "*-Minimax Performance in Backgammon", Thomas Hauk, Michael Buro, and Jonathan Schaeffer

Backgammon evaluation methods
- Static: pip count
- Heuristic: key points
- Neural networks
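The static method is the simplest of the three: a pip count is the total number of points a player's checkers still have to travel before bearing off. The sketch below assumes a board given as a mapping from point number (1 to 24, counted as distance from bearing off, with 25 for the bar) to checker count; the encoding and function names are hypothetical.

```python
def pip_count(checkers):
    """Sum of distance-to-bear-off over all of one player's checkers."""
    return sum(point * count for point, count in checkers.items())

def static_eval(own, opponent):
    """Positive when the player to evaluate is ahead in the race."""
    return pip_count(opponent) - pip_count(own)

# Standard starting position for one player:
# 2 checkers on the 24-point, 5 on the 13-point, 3 on the 8-point, 5 on the 6-point.
start = {24: 2, 13: 5, 8: 3, 6: 5}
start_pips = pip_count(start)
```

Both players start with a pip count of 167, so the static evaluation of the opening position is 0.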

Temporal difference (TD) learning
- Reinforcement learning
- Prediction method
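As a prediction method, TD learning moves the value estimate of a state toward the reward plus the estimate of the next state. The following is a tabular TD(0) sketch under stated assumptions: the slides pair TD learning with a neural network evaluator, while this example uses a plain lookup table, and the state names, step size and episode are hypothetical.

```python
def td0_update(values, state, reward, next_state, alpha=0.5, gamma=1.0):
    """One TD(0) step: V(s) += alpha * (r + gamma * V(s') - V(s)).
    States missing from `values` (e.g. terminal states) count as 0."""
    target = reward + gamma * values.get(next_state, 0.0)
    values[state] += alpha * (target - values[state])

# Hypothetical two-step episode: A -> B -> END, with reward 1 on the last move.
values = {"A": 0.0, "B": 0.0}
episode = [("A", 0.0, "B"), ("B", 1.0, "END")]

for _ in range(2):  # replay the same episode twice
    for state, reward, next_state in episode:
        td0_update(values, state, reward, next_state)
```

After the first pass only B has learned anything about the final reward; the second pass propagates part of that estimate back to A, which is the bootstrapping that distinguishes TD methods from waiting for the final outcome.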

Experimental setup
- Multi-layer perceptron
- Representation encoding:
  - Raw data (27 inputs)
  - Unary (157 inputs)
  - Extended unary (201 inputs)
  - Binary (201 inputs)
- Training game series: 400 000 games
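The difference between a unary and a binary input encoding can be shown for a single checker count. This is only an illustration of the general idea: the slides do not give the exact input layouts behind the 157 and 201 input figures, so the widths and function names here are hypothetical.

```python
def unary_encode(count, width):
    """Thermometer code: the first `count` inputs are 1, the rest 0."""
    return [1 if i < count else 0 for i in range(width)]

def binary_encode(count, width):
    """Fixed-width binary code, most significant bit first."""
    return [(count >> bit) & 1 for bit in reversed(range(width))]

# Three checkers on one point, encoded both ways (widths are assumptions).
unary_inputs = unary_encode(3, 5)
binary_inputs = binary_encode(3, 4)
```

A unary code spends more inputs per point but changes by one input when one checker is added, whereas a binary code is compact but can flip several inputs at once.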

Learning results

Program “DM Backgammon”

Artificial Intelligence and Poker*
* Joint work with Annija Rupeneite
AI problems             Poker problems
Imperfect information   Hidden cards
Multiple agents         Multiple human players
Risk management         Bet strategy and outcome
Agent modeling          Opponent(s) modeling
Misleading information  Bluffing
Unreliable information  Taking bluffing into account

Questions? dim_rut@inbox.lv
