Chapter 6. Instructor: Miss Mahreen Nasir Butt.
Outline: Games; Optimal decisions; Minimax algorithm; α-β pruning; Imperfect, real-time decisions.


Games
Multi-agent environments: any given agent must consider the actions of other agents and how they affect its own welfare.
The unpredictability of these other agents can introduce many possible contingencies.
Environments can be competitive or cooperative.
Competitive environments, in which the agents' goals are in conflict, require adversarial search. These problems are called games.

Games
In game theory (economics), any multi-agent environment, cooperative or competitive, is a game provided that the impact of each agent on the others is significant.
AI games are a specialized kind: deterministic, turn-taking, two-player, zero-sum games of perfect information.
In our terminology: deterministic, fully observable environments with two agents whose actions alternate, and in which the utility values at the end of the game are always equal and opposite (+1 and -1).

Optimal Decisions in Games
Consider games with two players (MAX and MIN).
Initial state: the board position, plus which player is to move.
Successor function: returns a list of (move, state) pairs, each giving a legal move and the resulting state.
Terminal test: determines whether the game is over (at terminal states).
Utility function (also called objective function or payoff function): a numeric value for the terminal states, e.g. (+1, -1) or (+192, -192).
We are not looking for a path, only the next move to make (one that hopefully leads to a winning state).
Our best move depends on what the other player does.
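
The four components listed above can be sketched for tic-tac-toe roughly as follows. This is a minimal illustration, not the course's own code: the board encoding (a tuple of 9 cells), the function names, and the utility of 0 for a draw are all assumptions made for this example.

```python
from typing import List, Optional, Tuple

Board = Tuple[str, ...]      # assumed encoding: 9 cells, each "X", "O", or " "
State = Tuple[Board, str]    # (board, player to move)

# The eight winning lines of the 3x3 board, as cell indices.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def initial_state() -> State:
    """Initial state: empty board, X (playing MAX) to move."""
    return (" ",) * 9, "X"


def successors(state: State) -> List[Tuple[int, State]]:
    """Successor function: list of (move, resulting state) pairs."""
    board, player = state
    other = "O" if player == "X" else "X"
    result = []
    for i, cell in enumerate(board):
        if cell == " ":
            new_board = board[:i] + (player,) + board[i + 1:]
            result.append((i, (new_board, other)))
    return result


def winner(board: Board) -> Optional[str]:
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def terminal_test(board: Board) -> bool:
    """Terminal test: someone has won, or the board is full."""
    return winner(board) is not None or " " not in board


def utility(board: Board) -> int:
    """Utility for MAX: +1 if X wins, -1 if O wins, 0 for a draw (assumed)."""
    w = winner(board)
    return 1 if w == "X" else -1 if w == "O" else 0


print(len(successors(initial_state())))   # 9 legal first moves
```

Returning (move, state) pairs keeps the move that led to each state, which is what the player ultimately needs, since only the next move is wanted rather than a whole path.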

Game Tree
The root node represents the configuration of the board at which a decision must be made.
The root is labeled a "MAX" node if it is my turn; otherwise it is labeled a "MIN" node (your turn).
Each level of the tree has nodes that are all MAX or all MIN.

Game Trees
The root of the tree is the initial state.
The next level is all of MAX's moves.
The next level is all of MIN's moves, and so on.
Example: Tic-Tac-Toe
Root has 9 blank squares (MAX).
Level 1 has 8 blank squares (MIN).
Level 2 has 7 blank squares (MAX), and so on.
Utility function: a win for X is +1, a win for O is -1.

Tic-tac-toe: game tree (2-player, deterministic, turns) [figure]

Optimal Strategies
In a normal search problem, the optimal solution would be a sequence of moves leading to a goal state, i.e. a terminal state that is a win.
In a game, MIN has something to say about it, so MAX must find a contingent strategy, which specifies MAX's move in the initial state, then MAX's moves in the states resulting from every possible response by MIN.

Optimal strategies (continued)
The strategy then specifies MAX's moves in the states resulting from every possible response by MIN to those moves, and so on.
An optimal strategy leads to outcomes at least as good as any other strategy when one is playing an infallible opponent.

Minimax Strategy
Basic idea: choose the move with the highest minimax value, i.e. the best achievable payoff against best play.
Choose moves that will lead to a win, even though MIN is trying to block.
MAX's goal: get to 1. MIN's goal: get to -1.
Minimax value of a node N (backed-up value):
If N is terminal, use the utility value.
If N is a MAX node, take the max of its successors' values.
If N is a MIN node, take the min of its successors' values.

Minimax
Perfect play for deterministic games.
Idea: choose the move to the position with the highest minimax value, i.e. the best achievable payoff against best play.
E.g., a 2-ply game with root node A and successor nodes B, C, D. [figure]

Minimax value
Given a game tree, the optimal strategy can be determined by examining the minimax value of each node, MINIMAX-VALUE(n).
The minimax value of a node is the utility (for MAX) of being in the corresponding state, assuming that both players play optimally from there to the end of the game.
Given a choice, MAX prefers to move to a state of maximum value, whereas MIN prefers a state of minimum value.

Minimax algorithm [figure]

Minimax
MINIMAX-VALUE(root) = max(min(3,12,8), min(2,4,6), min(14,5,2)) = max(3,2,2) = 3
The algorithm first recurses down to the bottom-left nodes of the tree and uses the utility function on them to discover that their values are 3, 12, and 8.

Minimax
Then it takes the minimum of these values, 3, and returns it as the backed-up value of node B. A similar process is followed for the other nodes.
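
The backed-up values in this example can be reproduced with a short recursive sketch. The tree encoding (nested lists with numbers as terminal utilities) and the function name `minimax_value` are assumptions chosen for illustration; the leaf values are the ones from the slides.

```python
def minimax_value(node, is_max):
    """Return the minimax value of a node.

    A number is a terminal node (its utility); a list is an internal node
    whose children alternate between MAX and MIN levels.
    """
    if isinstance(node, (int, float)):
        return node                      # terminal: use the utility value
    values = [minimax_value(child, not is_max) for child in node]
    return max(values) if is_max else min(values)


# Root A is a MAX node; B, C, D are MIN nodes with three leaves each.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
names = ["B", "C", "D"]

root_value = minimax_value(tree, True)
# MAX picks the successor with the highest backed-up value.
best_index = max(range(len(tree)), key=lambda i: minimax_value(tree[i], False))
best_move = names[best_index]

print(root_value, best_move)   # 3 B
```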

Properties of minimax
Complete? Yes (if the tree is finite).
Optimal? Yes (against an optimal opponent).
Time complexity? O(b^m), where b is the branching factor and m is the maximum depth of the tree.
Space complexity? O(bm) (depth-first exploration).
For chess, b ≈ 35 and m ≈ 100 for "reasonable" games, so an exact solution is completely infeasible.
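
To get a feel for why an exact solution is infeasible, the chess figures can be plugged in directly. The sketch below assumes the time cost grows like b raised to the power m (branching factor b, depth m) and simply counts the digits of that number; the variable names are illustrative.

```python
import math

b, m = 35, 100                 # approximate chess figures from the slide

leaves = b ** m                # Python integers are exact at any size
digits = len(str(leaves))      # number of decimal digits
exponent = m * math.log10(b)   # leaves is roughly 10 ** exponent

print(digits)                  # 155
print(round(exponent, 1))      # 154.4, i.e. about 10^154 nodes
```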

The minimax algorithm: problems
Problem with minimax search: the number of game states it has to examine is exponential in the number of moves.
Unfortunately, the exponent can't be eliminated, but it can be cut in half.

α-β pruning
It is possible to compute the correct minimax decision without looking at every node in the game tree.
Alpha-beta pruning allows us to eliminate large parts of the tree from consideration without influencing the final decision.

α-β pruning
MINIMAX-VALUE(root) = max(min(3,12,8), min(2,x,y), min(14,5,2))
= max(3, min(2,x,y), 2)
= max(3, z, 2), where z ≤ 2
= 3
The result does not depend on x and y, the values of the two unexamined successors of C.

α-β pruning example
It can be inferred that the value at the root is at least 3, because MAX has a choice worth 3 (node B).

α-β pruning example
The first successor of C is worth 2, so C, a MIN node, is worth at most 2, which is worse for MAX than the 3 already available at B. Therefore, there is no point in looking at the other successors of C.

α-β pruning example
The first successor of D is worth 14, so D is worth at most 14. This is still higher than MAX's best alternative (i.e., 3), so D's other successors are explored.

α-β pruning example
The second successor of D is worth 5, so D is worth at most 5. This is still higher than 3, so the exploration continues.

α-β pruning example
MAX's decision at the root is to move to B, giving a value of 3.
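
The walk through this example can be traced with a small alpha-beta sketch that records which leaves are actually examined. The nested-list tree encoding, the function name `alphabeta`, and the `visited` list are assumptions made for illustration; the leaf values are those of the example (with 4 and 6 standing in for the unexamined successors of C).

```python
import math


def alphabeta(node, is_max, alpha, beta, visited):
    """Minimax value of node with alpha-beta pruning.

    alpha: best value found so far for MAX along the path.
    beta:  best value found so far for MIN along the path.
    visited collects every leaf whose utility is actually evaluated.
    """
    if isinstance(node, (int, float)):
        visited.append(node)
        return node
    if is_max:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)
            if alpha >= beta:      # MIN above would never allow this branch
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)
        if beta <= alpha:          # MAX above already has something better
            break
    return value


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]   # successors B, C, D of the root
visited = []
root_value = alphabeta(tree, True, -math.inf, math.inf, visited)

print(root_value)   # 3
print(visited)      # [3, 12, 8, 2, 14, 5, 2]
```

The leaves 4 and 6 under C never appear in `visited`: once C's first leaf (2) is seen, C cannot beat the 3 already available at B, so the rest of C is skipped.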

Why is it called α-β?
α = the value of the best (i.e., highest-value) choice found so far at any choice point along the path for MAX.
β = the value of the best (i.e., lowest-value) choice found so far at any choice point along the path for MIN.
If v is worse than α, MAX will avoid it, so that branch can be pruned.


Properties of α-β
Pruning does not affect the final result.
Good move ordering improves the effectiveness of pruning.
With "perfect ordering," time complexity = O(b^(m/2)), which doubles the depth of search that is feasible in the same time.
A simple example of the value of reasoning about which computations are relevant (a form of metareasoning).
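
The effect of move ordering can be seen on the same small example tree by counting leaf evaluations under two orderings of D's successors. This is only an illustrative sketch: the function names, the nested-list encoding, and the reordered tree are assumptions for the example, and a three-branch tree is far too small to show the full b^(m/2) behaviour.

```python
import math


def alphabeta_count(node, is_max, alpha, beta, counter):
    """Alpha-beta value of node; counter[0] counts evaluated leaves."""
    if isinstance(node, (int, float)):
        counter[0] += 1
        return node
    best = -math.inf if is_max else math.inf
    for child in node:
        v = alphabeta_count(child, not is_max, alpha, beta, counter)
        if is_max:
            best = max(best, v)
            alpha = max(alpha, best)
        else:
            best = min(best, v)
            beta = min(beta, best)
        if alpha >= beta:          # remaining children cannot matter
            break
    return best


def run(tree):
    """Return (root value, number of leaves evaluated)."""
    counter = [0]
    value = alphabeta_count(tree, True, -math.inf, math.inf, counter)
    return value, counter[0]


slide_order = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]    # D's worst leaf seen last
better_order = [[3, 12, 8], [2, 4, 6], [2, 5, 14]]   # D's worst leaf seen first

print(run(slide_order))    # (3, 7)
print(run(better_order))   # (3, 5)
```

Both orderings return the same root value, 3, but examining D's lowest-valued successor first lets D be cut off after a single leaf.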


MinMax and AlphaBeta Pruning
MinMax searches the entire tree, even if in some cases the rest can be ignored.
In general, stop evaluating a move as soon as it is found to be worse than a previously examined move.
Since playing that move does not benefit the player, it need not be evaluated any further.
This saves processing time without affecting the final result.
