1
Presenter: Robert Holte
2
Helping the world understand games and the people who play them* and make informed decisions.
* Potential beneficiaries: commercial games companies and their customers.
3
Multi-billion dollar industry, with considerable Canadian activity
U. of A. has one of the best AI & Games research groups in the world
Games are good testbeds for A.I. research
Machine learning has a key role to play:
  Opponent/user modelling
  Massive datasets (e.g. play logs)
Challenging problems for machine learning
  Opponent modelling: very short time frame, weak data
  Massive datasets: large number of low-level features
  Active learning opportunities
  Human element in the overall system
4
1. Gameplay Analysis (ongoing)
2. Poker (ongoing, poster)
3. Counter-Strike Log Analysis (new, poster)
4. Go (ongoing, poster)
5. General Game Playing (new)
6. Threat Modelling (complete, poster)
5
AICML PIs: M. Bowling, R. Holte, J. Schaeffer
8 Software developers
3 Postdoctoral Fellows
14 Grad students
6
Electronic Arts
BioWare
BioTools
3 UofA CS profs
7
Grants
  $490K over 3 years, NSERC strategic grant
  $10K/year BioWare gift
  Portion of Jonathan Schaeffer's iCORE chair
In-kind
  Neverwinter Nights source code (BioWare)
  FIFA 2004 source code (EA) with our gameplay analysis hooks installed at their expense
  BioTools support of competitions we organize
8
IJCAI '03 best paper award
Winner of AAAI '06 poker-bot competitions, competitive with top human players
World's first man-versus-machine poker match
Currently world's best 9x9 Go program, competitive with very good humans (Scientific American article)
Electronic Arts interest in gameplay analysis
GDC paper
HQP to EA, BioWare, BioTools, Invidi, Google, Yahoo!
9
Technical Details
10
Large game tree (10^18)
Stochastic element
Variable number of players (2–10)
Imperfect information (during play, and after)
Aim is to maximize winnings, not just win
The last two make it essential to discover and exploit the opponent's weaknesses
11
Rule-based ("expert system"): Loki
Search-based: Poki
Game-theoretic: PsOpti and others
Opponent modelling
  Vexbot
  PDF cutting
  Parameter Estimation (Bayesian)
  Strategy Value estimation ("experts")
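The slide names Bayesian parameter estimation without giving details. As a minimal sketch of the general idea only, the following keeps a Beta belief over a single opponent parameter, here the probability that the opponent bets with its weakest hand. The class name `BetaEstimate`, the uniform prior, and the observation list are all illustrative assumptions, not taken from the presentation.

```python
from dataclasses import dataclass


@dataclass
class BetaEstimate:
    """Beta(alpha, beta) belief over the probability of an opponent action."""
    alpha: float = 1.0  # prior pseudo-count: action taken (assumed uniform prior)
    beta: float = 1.0   # prior pseudo-count: action not taken

    def update(self, took_action: bool) -> None:
        # Conjugate update: each observation adds one to the matching count.
        if took_action:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def mean(self) -> float:
        # Posterior mean of the action probability.
        return self.alpha / (self.alpha + self.beta)


# Hypothetical observations: did the opponent bet when holding the weakest card?
observations = [True, False, False, True, False, False, False, False]

belief = BetaEstimate()
for took_action in observations:
    belief.update(took_action)

print(f"posterior mean bet frequency: {belief.mean:.2f}")
```

With few observed hands the prior still carries weight, which is one reason short matches make opponent modelling hard.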
12
Nash Equilibrium of an abstract poker game
Bluffing, slow play, etc. fall out from the mathematics.
Best paper award at IJCAI '03
Won the AAAI '06 poker-bot competitions
Has held its own against 2 world-class humans
13
DIVAT: an unbiased, low-variance estimator of winnings
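The slide does not describe how DIVAT is constructed. The sketch below is not DIVAT; it only illustrates the general variance-reduction idea that an "unbiased, low-variance" estimator relies on: subtracting a zero-mean luck term from the raw winnings leaves the mean unchanged while shrinking the spread. The toy model (winnings = skill edge + luck + noise), the function name `simulate`, and every number in it are assumptions made for illustration.

```python
import random
import statistics


def simulate(n_hands, skill_edge=0.05, seed=0):
    """Return (raw, corrected) per-hand winnings for a toy model.

    Assumed model: winnings = skill_edge + luck + noise, where `luck` is a
    zero-mean term that can be computed after the hand (e.g. from the cards).
    """
    rng = random.Random(seed)
    raw, corrected = [], []
    for _ in range(n_hands):
        luck = rng.gauss(0.0, 5.0)    # zero-mean effect of the deal
        noise = rng.gauss(0.0, 1.0)   # randomness the luck term cannot explain
        winnings = skill_edge + luck + noise
        raw.append(winnings)
        # Subtracting a zero-mean term keeps the estimator unbiased.
        corrected.append(winnings - luck)
    return raw, corrected


raw, corrected = simulate(10_000)
print("raw:       mean", statistics.mean(raw), "variance", statistics.pvariance(raw))
print("corrected: mean", statistics.mean(corrected), "variance", statistics.pvariance(corrected))
```

Both averages estimate the same skill edge, but the corrected one needs far fewer hands to do so reliably.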
14
The equilibrium strategy for the highly abstract game is far from perfect.
No opponent modelling.
Nash equilibrium not the best strategy:
  Non-adaptive
  Defensive
  Even the best humans have weaknesses that should be exploited
15
Short time to learn and exploit model (< 200 hands).
Want to simultaneously:
  Collect information about the opponent
  Use the information to get higher payoff
  Not "pay" too much for the information
  Not be exploitable ourselves
Imperfect information, even after hand finishes
High variance
  chance in the game (the shuffled deck)
  stochastic opponent strategies
Properties of the opponent … (next slide)
16
We assume a "smart" opponent: it has exploitable weaknesses but does not make outright errors
  plays a non-equilibrium strategy
  does not play a dominated strategy
Opponent's strategy is non-stationary
  changes during the game
  may be modelling me to exploit my weaknesses
17
In Kuhn poker against exploitable, stationary opponents …
Convergence to best-response is slow.
Opponent modelling is superior to a static Nash equilibrium strategy.
  often produces positive expected value
  robust to game length (50-400) and opponent type
Bad initial estimates of P2's parameters overcome in 25-50 hands.
"Aggressive" exploration strategies slightly superior to "safe" exploration strategies.
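To make the comparison between a static equilibrium strategy and a best response concrete, here is a minimal sketch that computes exact expected values in Kuhn poker (three cards, one-chip ante, one-chip bet). The equilibrium strategies used are the standard ones for this game; the "weak" P2, which bluffs with the Jack 80% of the time and calls with the Queen 90% of the time, is a hypothetical opponent chosen for illustration, as are all function and variable names.

```python
from fractions import Fraction
from itertools import permutations, product

J, Q, K = 0, 1, 2  # card ranks; strategies are tuples indexed by card


def expected_value(p1_bet, p1_call, p2_bet, p2_call):
    """Expected winnings of P1 per hand.

    p1_bet:  P(P1 bets first)            p2_bet:  P(P2 bets after a check)
    p1_call: P(P1 calls after check-bet) p2_call: P(P2 calls a bet)
    """
    total = Fraction(0)
    for c1, c2 in permutations((J, Q, K), 2):  # six equally likely deals
        sign = 1 if c1 > c2 else -1            # +1 if P1 wins a showdown
        # P1 bets: P2 calls (showdown for 2) or folds (P1 wins the ante).
        bet_ev = p2_call[c2] * 2 * sign + (1 - p2_call[c2]) * 1
        # P1 checks and faces a bet: call (showdown for 2) or fold (lose ante).
        facing_bet_ev = p1_call[c1] * 2 * sign + (1 - p1_call[c1]) * (-1)
        # P1 checks: P2 bets, or checks back (showdown for 1).
        check_ev = p2_bet[c2] * facing_bet_ev + (1 - p2_bet[c2]) * sign
        total += p1_bet[c1] * bet_ev + (1 - p1_bet[c1]) * check_ev
    return total / 6


def best_response_value(p2_bet, p2_call):
    """Best P1 value against a fixed P2, found over all 64 pure P1 strategies."""
    pure = list(product((0, 1), repeat=3))
    return max(expected_value(b, c, p2_bet, p2_call) for b in pure for c in pure)


third = Fraction(1, 3)
# One equilibrium strategy for P1 (never bets first; calls Q 1/3, K always).
NASH_P1_BET, NASH_P1_CALL = (0, 0, 0), (0, third, 1)
# Equilibrium strategy for P2.
NASH_P2_BET, NASH_P2_CALL = (third, 0, 1), (0, third, 1)
# Hypothetical exploitable P2: bluffs and calls too often.
WEAK_P2_BET, WEAK_P2_CALL = (Fraction(4, 5), 0, 1), (0, Fraction(9, 10), 1)

nash_vs_nash = expected_value(NASH_P1_BET, NASH_P1_CALL, NASH_P2_BET, NASH_P2_CALL)
nash_vs_weak = expected_value(NASH_P1_BET, NASH_P1_CALL, WEAK_P2_BET, WEAK_P2_CALL)
best_vs_weak = best_response_value(WEAK_P2_BET, WEAK_P2_CALL)

print("equilibrium vs equilibrium:", nash_vs_nash)
print("equilibrium vs weak P2:    ", nash_vs_weak)
print("best response vs weak P2:  ", best_vs_weak)
```

Under these assumptions the equilibrium strategy earns the same -1/18 per hand against the weak opponent as against an equilibrium one, while the best response turns the game positive for P1. The sketch assumes P2's parameters are known; estimating them within a short match is the hard part the slide refers to.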
18
Improved Algorithms for Information-Gathering and Modelling
Scaling up
Non-stationary Opponents
Other poker variants: no-limit, multi-player
19
Introduction
20
How to test if game software behaves as intended by the designer?
21
22
Corner kicks to the coloured areas score. This was discovered by our SAGA-ML system.
23
[Diagram; labels: Sampling, Machine Learning, rules, behaviour, control]
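The diagram's labels suggest a loop in which the game is sampled and a learner summarizes the observed behaviour as rules. The slides do not say how SAGA-ML does this, so the following is only a toy sketch of that general pattern: sample kick targets, query a black-box game, and summarize the scoring samples as one axis-aligned rule. The function `toy_game`, its hidden scoring region, and the bounding-box learner are all hypothetical stand-ins.

```python
import random


def toy_game(x, y):
    """Hypothetical black box: a kick aimed at (x, y) scores in a hidden region."""
    return 0.6 <= x <= 0.9 and 0.2 <= y <= 0.5


def sample_and_learn(n_samples=2000, seed=0):
    """Sample targets uniformly and return a rule (x_lo, x_hi, y_lo, y_hi).

    The rule is the tightest axis-aligned box around the scoring samples,
    or None if no sampled kick scored.
    """
    rng = random.Random(seed)
    scoring = []
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if toy_game(x, y):
            scoring.append((x, y))
    if not scoring:
        return None
    xs, ys = zip(*scoring)
    return (min(xs), max(xs), min(ys), max(ys))


rule = sample_and_learn()
if rule is not None:
    x_lo, x_hi, y_lo, y_hi = rule
    print(f"kicks score when {x_lo:.2f} <= x <= {x_hi:.2f} and {y_lo:.2f} <= y <= {y_hi:.2f}")
```

A rule in this form is something a designer can read and compare against the intended behaviour, which is the point of reporting rules rather than raw samples.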