1
Chapter 6: Game Search (Adversarial Search)
2
Characteristics of game search
[Figure: an original board state branching into the new board states reachable by one move.]
- Exhaustive search is almost impossible ==> the branching factor and the search depth are usually far too large (e.g., Go: on the order of (19 × 19)! move sequences; chess: Deep Blue?).
- Static evaluation score ==> a measure of board quality.
- Maximizing player ==> hoping to win (me); minimizing player ==> hoping I lose (the opponent).
- Game tree ==> a semantic tree whose nodes are board configurations and whose branches are moves.
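The last point can be made concrete with a small data structure. This is only a sketch with hypothetical names (GameNode, static_eval), not code from the slides:

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class GameNode:
    """One node of a game tree: a board configuration plus the moves out of it."""
    board: object                     # board configuration (representation is game-specific)
    maximizing: bool                  # True if it is the maximizing player's turn
    children: List["GameNode"] = field(default_factory=list)   # one child per legal move
    score: Optional[float] = None     # static evaluation score (board quality), set at the leaves

def static_eval(board) -> float:
    """Placeholder for a game-specific static evaluation of board quality."""
    raise NotImplementedError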
3
Minimax Game Search
Idea: take the maximum score at a maximizing level (my turn); take the minimum score at a minimizing level (the opponent's turn).
[Figure: a small tree whose levels alternate maximizing level ("me?"), minimizing level ("the opponent?"), maximizing level. The leaf values 2, 7, 1, 8 back up to 2 and 1 at the minimizing level and to 2 at the root.]
"This move guarantees the best outcome."
4
Minimax search example
- The maximizer A searches so as to maximize the evaluation-function value.
- A's choice once the intention of the minimizer B is taken into account.
[Figure: A's moves lead to c1 (f = 0.8), c2 (f = 0.3), c3 (f = -0.2). At the minimizer (B) level, c1's children are c11 (f = 0.9), c12 (f = 0.1), c13 (f = -0.6); c2's children are c21 (f = 0.1), c22 (f = -0.7); c3's children are c31 (f = -0.1), c32 (f = -0.3).]
5
Minimax Algorithm

function MINIMAX(state) returns an action
    inputs: state, current state in game
    v = MAX-VALUE(state)
    return the action corresponding with value v

function MAX-VALUE(state) returns a utility value
    if TERMINAL-TEST(state) then return UTILITY(state)
    v = -∞
    for a, s in SUCCESSORS(state) do
        v = MAX(v, MIN-VALUE(s))
    return v

function MIN-VALUE(state) returns a utility value
    if TERMINAL-TEST(state) then return UTILITY(state)
    v = +∞
    for a, s in SUCCESSORS(state) do
        v = MIN(v, MAX-VALUE(s))
    return v
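A direct Python transcription of this pseudocode, assuming the game supplies the helpers terminal_test, utility, and successors (hypothetical names standing in for TERMINAL-TEST, UTILITY, and SUCCESSORS):

import math

def minimax_decision(state, terminal_test, utility, successors):
    """Return the action whose backed-up MIN-VALUE is largest."""
    best_action, best_value = None, -math.inf
    for action, next_state in successors(state):
        value = min_value(next_state, terminal_test, utility, successors)
        if value > best_value:
            best_action, best_value = action, value
    return best_action

def max_value(state, terminal_test, utility, successors):
    if terminal_test(state):
        return utility(state)
    v = -math.inf
    for _, s in successors(state):      # successors yields (action, resulting state) pairs
        v = max(v, min_value(s, terminal_test, utility, successors))
    return v

def min_value(state, terminal_test, utility, successors):
    if terminal_test(state):
        return utility(state)
    v = math.inf
    for _, s in successors(state):
        v = min(v, max_value(s, terminal_test, utility, successors))
    return v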
6
Minimax Example
[Figure: a MAX root over three MIN nodes with leaf values 3, 12, 8 | 2, 4, 6 | 14, 5, 2. One node shape in the figure marks MAX nodes, the other MIN nodes.]
8
Minimax Example (continued)
[Figure: the three MIN nodes back up the values 3, 2, and 2 from their leaves 3, 12, 8 | 2, 4, 6 | 14, 5, 2.]
9
Minimax Example (continued)
[Figure: the MAX root backs up the value 3 = max(3, 2, 2).]
10
Tic-Tac-Toe
Tic-tac-toe, also called noughts and crosses (in the British Commonwealth countries) and X's and O's (in the Republic of Ireland), is a pencil-and-paper game for two players, X and O, who take turns marking the spaces of a 3×3 grid. The X player usually goes first. The player who succeeds in placing three of their marks in a horizontal, vertical, or diagonal row wins the game. The following example game is won by the first player, X:
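The winning condition is simple enough to state in code. Below is a minimal sketch (not from the slides) that assumes the board is a 3×3 list of lists whose cells hold 'X', 'O', or None:

def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None.
    board is a 3x3 list of lists containing 'X', 'O', or None."""
    lines = []
    lines.extend(board)                                                 # the three rows
    lines.extend([[board[r][c] for r in range(3)] for c in range(3)])   # the three columns
    lines.append([board[i][i] for i in range(3)])                       # main diagonal
    lines.append([board[i][2 - i] for i in range(3)])                   # anti-diagonal
    for line in lines:
        if line[0] is not None and line.count(line[0]) == 3:
            return line[0]
    return None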
13
Save time
14
Game tree (2-player): how do we search this tree to find the optimal move?
15
Applying minimax to Tic-Tac-Toe: the static heuristic evaluation function
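The slide does not spell the function out in text, so here is a sketch of the static evaluation commonly used for tic-tac-toe (the number of lines still open to MAX minus the number still open to MIN); it may differ from the exact function on the slide:

def open_lines_eval(board, me='X', opp='O'):
    """Heuristic board quality for a non-terminal position:
    (# rows/columns/diagonals with no opponent mark) - (# with no 'me' mark)."""
    lines = []
    lines.extend(board)                                                 # rows
    lines.extend([[board[r][c] for r in range(3)] for c in range(3)])   # columns
    lines.append([board[i][i] for i in range(3)])                       # main diagonal
    lines.append([board[i][2 - i] for i in range(3)])                   # anti-diagonal
    my_open = sum(1 for line in lines if opp not in line)
    opp_open = sum(1 for line in lines if me not in line)
    return my_open - opp_open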
17
α-β Pruning
- Idea: reduce the search space! (plain minimax blows up exponentially)
- Principle: "If you have an idea that is surely bad, do not take the time to see how truly bad it is."
[Figure: a MAX root over two MIN nodes. The first MIN node's leaves 2 and 7 give it the value 2, so the root is worth ≥ 2; the second MIN node's first leaf 1 makes that node worth ≤ 1, so its remaining leaves are cut off (an α-cut).]
18
Alpha-beta pruning
- A game-search method that uses the minimum value attainable at a maximizing node (alpha) and the maximum value attainable at a minimizing node (beta).
- The search itself proceeds basically as depth-first search (DFS).
[Figure: root c0 with α = 0.2; its children are c1 (f = 0.2, with children c11 f = 0.2 and c12 f = 0.7) and c2 (f = -0.1, with children c21 f = -0.1, c22, c23). Once the evaluation value -0.1 of c21 is backed up to c2, the remaining nodes (c22, c23) need not be searched: an α-cut.]
19
Tic-Tac-Toe example with alpha-beta pruning: backing up values
20
α-β Procedure
- α never decreases (initially -∞); β never increases (initially +∞).
- Search rules:
  1. α-cutoff ==> cut below any minimizing node whose β ≤ the α of a maximizing ancestor.
  2. β-cutoff ==> cut below any maximizing node whose α ≥ the β of a minimizing ancestor.
21
Example
[Figure: a max / min / max / min tree with leaf values 90, 89, 100, 99, 60, 59, 75, 74; two cutoffs are marked during the search.]
22
Alpha-Beta Pruning Algorithm

function ALPHA-BETA(state) returns an action
    inputs: state, current state in game
    v = MAX-VALUE(state, -∞, +∞)
    return the action corresponding with value v

function MAX-VALUE(state, α, β) returns a utility value
    inputs: state, current state in game
            α, the value of the best alternative for MAX along the path to state
            β, the value of the best alternative for MIN along the path to state
    if TERMINAL-TEST(state) then return UTILITY(state)
    v = -∞
    for a, s in SUCCESSORS(state) do
        v = MAX(v, MIN-VALUE(s, α, β))
        if v >= β then return v
        α = MAX(α, v)
    return v
23
Alpha-Beta Pruning Algorithm (continued)

function MIN-VALUE(state, α, β) returns a utility value
    inputs: state, current state in game
            α, the value of the best alternative for MAX along the path to state
            β, the value of the best alternative for MIN along the path to state
    if TERMINAL-TEST(state) then return UTILITY(state)
    v = +∞
    for a, s in SUCCESSORS(state) do
        v = MIN(v, MAX-VALUE(s, α, β))
        if v <= α then return v
        β = MIN(β, v)
    return v
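A Python transcription of the two slides above, using the same hypothetical game-supplied helpers as the minimax sketch; the pruning tests mirror "if v >= β" and "if v <= α":

import math

def alpha_beta_decision(state, terminal_test, utility, successors):
    """Return the action with the highest backed-up value, pruning with alpha-beta."""
    best_action, best_value = None, -math.inf
    for action, next_state in successors(state):
        value = ab_min_value(next_state, best_value, math.inf,
                             terminal_test, utility, successors)
        if value > best_value:
            best_action, best_value = action, value
    return best_action

def ab_max_value(state, alpha, beta, terminal_test, utility, successors):
    if terminal_test(state):
        return utility(state)
    v = -math.inf
    for _, s in successors(state):
        v = max(v, ab_min_value(s, alpha, beta, terminal_test, utility, successors))
        if v >= beta:                 # beta cutoff: MIN above will never let the game get here
            return v
        alpha = max(alpha, v)
    return v

def ab_min_value(state, alpha, beta, terminal_test, utility, successors):
    if terminal_test(state):
        return utility(state)
    v = math.inf
    for _, s in successors(state):
        v = min(v, ab_max_value(s, alpha, beta, terminal_test, utility, successors))
        if v <= alpha:                # alpha cutoff: MAX above already has something better
            return v
        beta = min(beta, v)
    return v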
24
Alpha-Beta Pruning Example
α and β take their initial values α = -∞, β = +∞ at the MAX root and are passed down to its first MIN child (which therefore also starts with α = -∞, β = +∞). In the figure, one node shape marks MAX nodes and the other MIN nodes.
25
Alpha-Beta Pruning Example (continued)
The first leaf value is 3. The MIN node updates α, β based on its children: α = -∞, β = 3.
26
Alpha-Beta Pruning Example (continued)
The first MIN node now carries the interval [-∞, 3]; the root still carries [-∞, +∞].
27
Alpha-Beta Pruning Example (continued)
The next leaf is 12. The MIN node updates α, β based on its children: no change (α = -∞, β = 3).
28
Alpha-Beta Pruning Example (continued)
After the last leaf, 8, the first MIN node's leaves are 3, 12, 8 and 3 is returned as its node value. The MAX root updates α, β based on its children: its interval becomes [3, +∞].
29
Alpha-Beta Pruning Example (continued)
Current state of the tree: first MIN node [-∞, 3] with leaves 3, 12, 8; root [3, +∞].
30
Alpha-Beta Pruning Example (continued)
α = 3, β = +∞ are passed down from the root to the second MIN node.
31
Alpha-Beta Pruning Example (continued)
The second MIN node's first leaf is 2, so MIN updates β: α = 3, β = 2. Now α ≥ β, so its remaining children are pruned (marked X X in the figure).
32
Alpha-Beta Pruning Example (continued)
2 is returned as the second MIN node's value. The MAX root updates α, β based on its children: no change.
33
Alpha-Beta Pruning Example (continued)
α = 3, β = +∞ are passed down from the root to the third MIN node.
34
Alpha-Beta Pruning Example (continued)
The third MIN node's first leaf is 14, so MIN updates β: α = 3, β = 14.
35
Alpha-Beta Pruning Example (continued)
The next leaf is 5, so MIN updates β: α = 3, β = 5.
36
Alpha-Beta Pruning Example (continued)
The last leaf is 2, so MIN updates β: α = 3, β = 2.
37
Alpha-Beta Pruning Example (continued)
2 is returned as the third MIN node's value.
38
Alpha-Beta Pruning Example (continued)
The MAX root updates α, β based on its children: no change. The root's value remains 3.
40
Example: which nodes can be pruned?
[Figure: a Max / Min / Max tree with leaf values 3, 4, 1, 2, 7, 8, 5, 6.]
41
Answer to the Example: which nodes can be pruned?
Answer: NONE! The most favorable nodes for both players are explored last (i.e., in the diagram they are on the right-hand side).
[Figure: the same tree with leaf values 3, 4, 1, 2, 7, 8, 5, 6.]
42
Second Example (the exact mirror image of the first example): which nodes can be pruned?
[Figure: the same tree structure with leaf values 6, 5, 8, 7, 2, 1, 3, 4.]
43
Answer to the Second Example (the exact mirror image of the first example): which nodes can be pruned?
Answer: LOTS! The most favorable nodes for both players are explored first (i.e., in the diagram they are on the left-hand side).
[Figure: the same tree with leaf values 6, 5, 8, 7, 2, 1, 3, 4.]
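These two answers can be checked mechanically: run alpha-beta over the same leaves in both orders and count how many leaves are actually evaluated. The nested-list trees below are my reading of the diagrams (a MAX root, two MIN children, four MAX grandchildren, eight leaves); the helper is a self-contained sketch, not the slides' code:

import math

def ab(node, alpha, beta, maximizing, visited):
    """Alpha-beta over a tree given as nested lists; leaves are plain numbers.
    visited collects the leaves that actually get evaluated."""
    if not isinstance(node, list):
        visited.append(node)
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, ab(child, alpha, beta, False, visited))
            if v >= beta:
                return v
            alpha = max(alpha, v)
        return v
    v = math.inf
    for child in node:
        v = min(v, ab(child, alpha, beta, True, visited))
        if v <= alpha:
            return v
        beta = min(beta, v)
    return v

first = [[[3, 4], [1, 2]], [[7, 8], [5, 6]]]    # best moves explored last
second = [[[6, 5], [8, 7]], [[2, 1], [3, 4]]]   # mirror image: best moves explored first

for name, tree in [("first", first), ("second", second)]:
    seen = []
    value = ab(tree, -math.inf, math.inf, True, seen)
    print(name, "root value:", value, "leaves evaluated:", len(seen))

Tracing this by hand, the first ordering evaluates all eight leaves, while the mirrored ordering reaches the same root value after only five.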
46
Iterative deepening
With a time limit, the search is unlikely to reach the goal, so we must approximate.
47
Iterative (Progressive) Deepening
- In real games there is usually a time limit T on making a move.
- How do we take this into account? With alpha-beta we cannot use "partial" results with any confidence unless the full breadth of the tree has been searched.
- We could be conservative and set a depth limit that guarantees we will find a move in time < T; the disadvantage is that we may finish early and could have done more search.
- In practice, iterative deepening search (IDS) is used: IDS runs depth-first search with an increasing depth limit, and when the clock runs out we use the solution found at the previous depth limit (see the sketch below).
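A minimal sketch of that time-limited loop; depth_limited_best_move is a hypothetical helper standing in for a depth-limited (alpha-beta) search that returns None if the deadline interrupts it:

import time

def iterative_deepening_move(state, time_limit, depth_limited_best_move):
    """Search with increasing depth limits until time runs out, and return the
    best move found at the deepest fully completed depth."""
    deadline = time.monotonic() + time_limit
    best_move = None
    depth = 1
    while time.monotonic() < deadline:
        move = depth_limited_best_move(state, depth, deadline)
        if move is None:        # this depth was cut off by the deadline; discard it
            break
        best_move = move        # keep the result of the last completed depth
        depth += 1
    return best_move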
48
Iterative deepening
49
Iterative deepening search with depth limit l = 0
50
Iterative deepening search with depth limit l = 1
51
Iterative deepening search with depth limit l = 2
52
Iterative deepening search with depth limit l = 3
54
Heuristic continuation: fighting the horizon effect