COMP-4640: Intelligent & Interactive Systems Game Playing

A game can be formally defined as a search problem with:
- an initial state
- a set of operators (actions or moves)
- a terminal test
- a utility (payoff) function
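As a rough illustration, these four components can be captured in a small interface. This is only a sketch; the class and method names (Game, initial_state, actions, result, is_terminal, utility) are assumptions for illustration, not anything given on the slides.

```python
# A minimal sketch of the formal definition above; names are illustrative.
from abc import ABC, abstractmethod

class Game(ABC):
    @abstractmethod
    def initial_state(self):
        """The initial state of the game."""

    @abstractmethod
    def actions(self, state):
        """The operators: legal moves available in this state."""

    @abstractmethod
    def result(self, state, action):
        """The state that results from applying an action."""

    @abstractmethod
    def is_terminal(self, state):
        """The terminal test: has the game ended?"""

    @abstractmethod
    def utility(self, state, player):
        """The payoff for `player` in a terminal state (e.g. +1, -1, 0)."""
```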
COMP-4640: Intelligent & Interactive Systems Game Playing

1. Multi-agent environment
   - Multi-player games involve planning and acting in environments populated by other active agents.
   - Agents use a sense/plan/act architecture that does not plan too far into the unpredictable future.
   - With proper information, however, an agent can construct plans that consider the effects of the other agents' actions.
   - In AI we will consider a special case of games: deterministic, turn-taking, two-player, zero-sum games of perfect information.
2. Zero-sum games
   - Either one player wins (and the other loses), or a draw results.
   - +1 win, -1 loss, 0 draw.
3. The agents' utility functions make the games adversarial.
COMP-4640: Intelligent & Interactive Systems Game Playing

Multi-agent environment: Robot Soccer
Game tree (2-player, deterministic, turns)
COMP-4640: Intelligent & Interactive Systems Game Playing

The Minimax Algorithm
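The algorithm itself appeared as a figure on the original slides. What follows is a Python sketch of standard minimax over the Game interface sketched earlier, offered as a reconstruction rather than a transcription of the slide.

```python
# Plain minimax: MAX chooses the action whose resulting state has the
# highest minimax value, assuming MIN plays optimally in reply.
def minimax_decision(game, state, player):
    return max(game.actions(state),
               key=lambda a: min_value(game, game.result(state, a), player))

def max_value(game, state, player):
    if game.is_terminal(state):
        return game.utility(state, player)
    return max(min_value(game, game.result(state, a), player)
               for a in game.actions(state))

def min_value(game, state, player):
    if game.is_terminal(state):
        return game.utility(state, player)
    return min(max_value(game, game.result(state, a), player)
               for a in game.actions(state))
```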
COMP-4640: Intelligent & Interactive Systems Game Playing

The evaluation function:
- Must value terminal states (goal states) the same way as the utility function
- Must be of reasonable complexity so that it can be computed quickly (a trade-off between accuracy and time)
- Should be accurate

The performance of the game-playing system depends on the accuracy ("goodness") of the evaluation function.
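For concreteness, one common shape for an evaluation function is a weighted sum of features. The sketch below uses chess material counts; the piece values and the state.pieces() accessor are illustrative assumptions, not part of the course material.

```python
# One common form of evaluation function: a weighted linear sum of features.
# The features and weights here (chess material values) are illustrative.
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}

def evaluate(state, player):
    """Estimate the expected utility of a non-terminal position for `player`."""
    score = 0
    for piece, owner in state.pieces():   # assumed accessor on the state object
        value = PIECE_VALUES.get(piece, 0)
        score += value if owner == player else -value
    return score
```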
COMP-4640: Intelligent & Interactive Systems Game Playing

One problem with using minimax is that it may not be feasible to search the whole game tree for a minimax decision (move or action).

Using depth-limited search can speed up the minimax decision process, but instead of the utility function one would need to construct an evaluation function. This evaluation function provides an estimate of the expected utility of a game position.
COMP-4640: Intelligent & Interactive Systems Game Playing

Properties of minimax:
- Complete? Yes (if the tree is finite)
- Optimal? Yes (against an optimal opponent)
- Time complexity? O(b^m)
- Space complexity? O(bm) (depth-first exploration)

For chess, b ≈ 35 and m ≈ 100 for "reasonable" games, so an exact solution is completely infeasible.
COMP-4640: Intelligent & Interactive Systems Game Playing

Once we have developed a good evaluation function, we must also consider:
- The depth limit
- The horizon problem
  - Difficult to eliminate
  - Arises when the program is facing a move by the opponent that causes serious damage and is ultimately unavoidable
  - Stalling moves push the damaging move over the horizon, to a depth where it can't be detected
COMP-4640: Intelligent & Interactive Systems Game Playing

Once we have an evaluation function and a depth limit, we can re-apply minimax search. However, even with a depth limit minimax may still be inefficient: it will expand nodes that need not be searched. By making the search method more efficient, we will be able to search deeper levels of the game tree.
COMP-4640: Intelligent & Interactive Systems Game Playing: Alpha-Beta Pruning

1. Search below a MIN node may be alpha-pruned if its beta value is less than the alpha value of some MAX ancestor.
2. Search below a MAX node may be beta-pruned if its alpha value is greater than the beta value of some MIN ancestor.
Alpha-Beta Pruning (αβ pruning): Rules of Thumb
- α is the highest max value found so far
- β is the lowest min value found so far
- If Min is on top: alpha prune
- If Max is on top: beta prune
- You will only have alpha prunes at a Min level
- You will only have beta prunes at a Max level
- See the detailed algorithm on p. 167
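The sketch below adds alpha-beta pruning to the minimax sketch from earlier, corresponding to the rules of thumb above. It is a standard reconstruction in Python, not the textbook's exact pseudocode.

```python
# Alpha-beta pruning: alpha = best value found so far for MAX along the path,
# beta = best (lowest) value found so far for MIN along the path.
import math

def alpha_beta_decision(game, state, player):
    best_action, best_value = None, -math.inf
    for a in game.actions(state):
        v = ab_min_value(game, game.result(state, a), player, -math.inf, math.inf)
        if v > best_value:
            best_action, best_value = a, v
    return best_action

def ab_max_value(game, state, player, alpha, beta):
    if game.is_terminal(state):
        return game.utility(state, player)
    v = -math.inf
    for a in game.actions(state):
        v = max(v, ab_min_value(game, game.result(state, a), player, alpha, beta))
        if v >= beta:        # beta prune: a MIN ancestor will never allow this
            return v
        alpha = max(alpha, v)
    return v

def ab_min_value(game, state, player, alpha, beta):
    if game.is_terminal(state):
        return game.utility(state, player)
    v = math.inf
    for a in game.actions(state):
        v = min(v, ab_max_value(game, game.result(state, a), player, alpha, beta))
        if v <= alpha:       # alpha prune: a MAX ancestor will never allow this
            return v
        beta = min(beta, v)
    return v
```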
Worked example (from the slides): alpha-beta pruning is applied step by step to sample game trees, updating the α and β values at each node and pruning the branches the rules above allow.
COMP-4640: Intelligent & Interactive Systems Game of Chance: Expecti-minimax
- Initial values of the leaves indicate the board state
- Use the percentage chance based upon the roll for the first calculated value
- Min eval f(n) selects the Max value
- The second roll uses a different assigned percentage chance
- Max eval f(n) selects the Max value
Worked example (from the slides): the expectiminimax values are filled in step by step; for instance, one chance node evaluates to 3 * 1.0 = 3 and another to 0 * 0.67 + 6 * 0.33 ≈ 2, and the Min and Max levels then choose among these probability-weighted values.
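A Python sketch of expectiminimax along the lines described above: chance nodes take the probability-weighted average of their children, while Max and Min nodes behave as in ordinary minimax. The to_move and chance_outcomes accessors are assumed names for illustration, not part of the slides.

```python
# Expectiminimax: chance nodes average over dice outcomes weighted by their
# probabilities (e.g. 3*1.0 = 3 and 0*0.67 + 6*0.33 ≈ 2 in the example above).
# The "CHANCE" convention and the chance_outcomes accessor are assumptions.
def expectiminimax(game, state, player):
    if game.is_terminal(state):
        return game.utility(state, player)
    if game.to_move(state) == "CHANCE":
        # Expected value over all possible rolls.
        return sum(prob * expectiminimax(game, game.result(state, outcome), player)
                   for outcome, prob in game.chance_outcomes(state))
    values = [expectiminimax(game, game.result(state, a), player)
              for a in game.actions(state)]
    return max(values) if game.to_move(state) == player else min(values)
```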
Cutting off search

MinimaxCutoff is identical to MinimaxValue except:
1. Terminal? is replaced by Cutoff?
2. Utility is replaced by Eval

Does it work in practice? With b^m = 10^6 and b = 35, m ≈ 4, so a 4-ply lookahead is a hopeless chess player!
- 4-ply ≈ human novice
- 8-ply ≈ typical PC, human master
- 12-ply ≈ Deep Blue, Kasparov
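A minimal sketch of the Cutoff?/Eval substitution just described, assuming the evaluate() function sketched earlier; the fixed-depth cutoff convention here is an assumption for illustration.

```python
# Depth-limited minimax: Terminal? becomes Cutoff? (terminal state OR depth
# limit reached) and Utility becomes Eval, as described above.
def cutoff_value(game, state, player, depth, limit, maximizing):
    if game.is_terminal(state):
        return game.utility(state, player)   # true utility at real leaves
    if depth >= limit:
        return evaluate(state, player)       # Eval replaces Utility at the cutoff
    values = [cutoff_value(game, game.result(state, a), player,
                           depth + 1, limit, not maximizing)
              for a in game.actions(state)]
    return max(values) if maximizing else min(values)
```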
COMP-4640: Deterministic games in practice

Checkers: Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994. It used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions.

Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searches 200 million positions per second, uses a very sophisticated evaluation function, and uses undisclosed methods for extending some lines of search up to 40 ply.

Othello: human champions refuse to compete against computers, who are too good.

Go: human champions refuse to compete against computers, who are too bad. In Go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.
http://www.research.ibm.com/deepblue/