Download presentation
Presentation is loading. Please wait.
Published byPiers Manning Modified over 9 years ago
1
Applied machine learning in game theory Dmitrijs Rutko Faculty of Computing University of Latvia Joint Estonian-Latvian Theory Days at Rakari, 2010
2
Topic outline Game theory Game Tree Search Fuzzy approach Machine learning Heuristics Neural networks Adaptive / Reinforcement learning Card games
3
Research overview Deterministic / stochastic games Perfect / imperfect information games
4
Finite zero-sum games deterministicchance perfect informationchess, checkers, go, othello backgammon, monopoly, roulette imperfect information battleship, kriegspiel, rock- paper-scissors bridge, poker, scrabble
5
Topic outline Game theory Game Tree Search Fuzzy approach Machine learning Heuristics Neural networks Adaptive / Reinforcement learning Card games
6
Game trees
7
Classical algorithms MiniMax O(w d ) Alpha-Beta O(w d/2 ) 1274368954 2 789 28 8 √√√ΧΧ√√√ΧΧ max min max
8
Advanced search techniques Transposition tables Time efficiency / high cost of space PVS Negascout NegaC* SSS* / DUAL* MTD(f)
9
Fuzzy approach O(w d/2 ) More cut-offs 1274368954 <5<5 ?≥5 <5<5 √√ΧΧΧ√Χ√ΧΧ max min max
10
Geometric interpretation 1) X 2 - successful separation 2) X 1 or X 3 - reduced search window αβ 28 X2X2 X1X1 X3X3 α = X 1 β = X 3
11
BNS enhancement through self- training Traditional statistical approach Minimax value Tree count 251 265 2711 2838 29124 30206 31252 32189 33111 3442 3514 367 1000
12
Two dimensional game sub-tree distribution 2324252627282930313233343536 Tree count 2300 24000 250101 2600235 270053311 2801012 1338 2900210354334124 30126926587133206 31006102741785733252 3201313173032413814189 330012812262821112111 34000135138622242 35000002432300014 36000000122110007
13
Statistical sub-tree separation Separation value Tree count 230 241 256 2630 2788 28208 29374 30509 31475 32325 33167 3461 3521 367 2272
14
Experimental results. 2-width trees
15
Experimental results. 3-width trees
16
Future research directions in game tree search Multi-dimensional self-training Wider trees Real domain games
17
Topic outline Game theory Game Tree Search Fuzzy approach Machine learning Heuristics Neural networks Adaptive / Reinforcement learning Card games
18
Games with element of chance
19
Expectiminimax algorithm Expectiminimax(n) = Utility(n) If n is a terminal state Max s Successors(n) Expectiminimax(s) if n is a max node Min s Successors(n) Expectiminimax(s) if n is a min node s Successors(n) P(s) * Expectiminimax(s) if n is a chance node O(w d c d )
20
Perfomance in Backgammon *-Minimax Performance in Backgammon, Thomas Hauk, Michael Buro, and Jonathan Schaeer
21
Backgammon Evaluation methods Static – pip count Heuristic – key points Neural Networks
22
Temporal difference (TD) learning Reinforcement learning Prediction method
23
Experimental setup Multi-layer perceptron Representation encoding Raw data (27 inputs) Unary (157 inputs) Extended unary (201 inputs) Binary (201 input) Training game series – 400 000 games
24
Learning results
25
Program “DM Backgammon”
26
Topic outline Game theory Game Tree Search Fuzzy approach Machine learning Heuristics Neural networks Adaptive / Reinforcement learning Card games
27
Artificial Intelligence and Poker* * Joint work with Annija Rupeneite AI ProblemsPoker problems Imperfect informationHidden cards Multiple agentsMultiple human players Risk managementBet strategy and outcome Agent modelingOpponent(s) modeling Misleading informationBluffing Unreliable informationTaking bluffing into account
28
Questions ? dim_rut@inbox.lv
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.