Game Tree Search II

Kansas State University, Department of Computing and Information Sciences
CIS 730: Introduction to Artificial Intelligence
Lecture 9 of 42 – Wednesday, 14 September 2005
William H. Hsu
Department of Computing and Information Sciences, KSU
http://www.kddresearch.org
http://www.cis.ksu.edu/~bhsu
Reading: Chapter 6, Russell and Norvig 2e
Lecture Outline

Today's Reading
– Sections 6.1-6.4, Russell and Norvig 2e
– Recommended references: Rich and Knight; Winston
Reading for Next Class: Sections 6.5-6.8, Russell and Norvig
Games as Search Problems
– Frameworks: two-player, multi-player; zero-sum; perfect information
– Minimax algorithm: perfect decisions; imperfect decisions (based upon a static evaluation function)
– Issues: quiescence; horizon effect
– Need for pruning
Next Lecture: Alpha-Beta Pruning, Expectiminimax, Current "Hot" Problems
Next Week: Knowledge Representation – Logics and Production Systems
Overview

Perfect Play
– General framework(s)
– What could an agent do with perfect information?
Resource Limits
– Search ply
– Static evaluation: from heuristic search to heuristic game tree search
– Examples: tic-tac-toe, Connect Four, checkers, connect-five / Go-Moku / wǔzǐqí; chess, Go
Games with Uncertainty
– Explicit: games of chance (e.g., backgammon, Monopoly)
– Implicit: see project suggestions!

Adapted from slides by S. Russell, UC Berkeley
Minimax Algorithm: Decision and Evaluation

See Figure 5.3, p. 126, Russell and Norvig: the minimax decision procedure and the recursive computation of minimax values.

Adapted from slides by S. Russell, UC Berkeley
Properties of Minimax

Complete?
– Yes, provided the following are finite:
  Number of possible legal moves (generative breadth of tree)
  "Length of game" (depth of tree)
– Perfect vs. imperfect information?
  Q: What search is perfect minimax analogous to?
  A: Bottom-up breadth-first search
Optimal?
– Yes, provided perfect information (evaluation function) and an optimal opponent!
– Otherwise, guaranteed only if the evaluation function is correct
Time Complexity?
– Depth of tree: m; legal moves at each point: b
– O(b^m) – NB: m ≈ 100, b ≈ 35 for chess!
Space Complexity?
– O(bm) – why? Depth-first exploration stores only the current path plus the unexpanded siblings at each of its m levels.

Adapted from slides by S. Russell, UC Berkeley
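The exhaustive depth-first computation behind these complexity bounds can be sketched in a few lines. The explicit game tree and function names below are illustrative stand-ins, not material from the lecture: internal nodes name their children, and leaves carry utility values for MAX.

```python
# Minimal minimax sketch over an explicit game tree (illustrative, not from
# the slides). Depth-first: O(b^m) time, O(bm) space for branching b, depth m.
TREE = {
    "A": ["B", "C"],   # root (MAX) with two MIN children
    "B": [3, 12],      # leaf utilities for MAX
    "C": [2, 4],
}

def minimax(node, maximizing=True):
    """Return the minimax value of `node` by exhaustive recursion."""
    if not isinstance(node, str):        # leaf: utility value
        return node
    children = (minimax(child, not maximizing) for child in TREE[node])
    return max(children) if maximizing else min(children)

print(minimax("A"))  # MAX picks the branch whose MIN value is largest: 3
```

Here MIN node B is worth min(3, 12) = 3 and C is worth min(2, 4) = 2, so MAX chooses B.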
Review: Alpha-Beta (α-β) Pruning Example

Figure 5.6, p. 131, Russell and Norvig: a MAX root over three MIN nodes.
– First MIN node: leaves 3, 12, 8 give it value 3, so the root's bound becomes ≥ 3.
– Second MIN node: its first leaf is 2, so its value is ≤ 2 < 3 and its remaining leaves are pruned.
– Third MIN node: leaves 14, 5, 2 give successive bounds ≤ 14, ≤ 5, then value 2.
– Root value: 3.
Exercise: what are the α, β values at each node?

Adapted from slides by S. Russell, UC Berkeley
Alpha-Beta (α-β) Pruning: Modified Minimax Algorithm

(Slide presents the alpha-beta search algorithm: minimax augmented with a bound α, the best value found so far for MAX along the current path, and β, the best value found so far for MIN; a branch is cut off as soon as α ≥ β.)

Adapted from slides by S. Russell, UC Berkeley
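The modification can be sketched over the same kind of explicit tree used above; the tree (matching the leaf values of the worked example) and the function names are illustrative, not the book's pseudocode.

```python
# Minimal alpha-beta sketch (illustrative, not from the slides). Computes the
# same value as plain minimax, but skips branches that cannot affect the root.
TREE = {
    "A": ["B", "C", "D"],   # MAX root over three MIN nodes
    "B": [3, 12, 8],
    "C": [2, 4, 6],         # pruned after its first leaf (2 < alpha = 3)
    "D": [14, 5, 2],
}

def alphabeta(node, alpha=float("-inf"), beta=float("inf"), maximizing=True):
    if not isinstance(node, str):            # leaf: utility value
        return node
    if maximizing:
        value = float("-inf")
        for child in TREE[node]:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:                # beta cutoff: MIN avoids this branch
                break
        return value
    else:
        value = float("inf")
        for child in TREE[node]:
            value = min(value, alphabeta(child, alpha, beta, True))
            beta = min(beta, value)
            if alpha >= beta:                # alpha cutoff: MAX avoids this branch
                break
        return value

print(alphabeta("A"))  # same root value as plain minimax: 3
```

With good move ordering this prunes enough to roughly double the searchable depth, which is the motivation for the "need for pruning" bullet above.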
Digression: Learning Evaluation Functions

Learning = Improving with Experience at Some Task
– Improve over task T,
– with respect to performance measure P,
– based on experience E.
Example: Learning to Play Checkers
– T: play games of checkers
– P: percent of games won in world tournament
– E: opportunity to play against self
Refining the Problem Specification: Issues
– What experience?
– What exactly should be learned?
– How shall it be represented?
– What specific algorithm to learn it?
Defining the Problem Milieu
– Performance element: how shall the results of learning be applied?
– How shall the performance element be evaluated? The learning system?
Example: Learning to Play Checkers

Type of Training Experience
– Direct or indirect?
– Teacher or not?
– Knowledge about the game (e.g., openings/endgames)?
Problem: Is the Training Experience Representative (of the Performance Goal)?
Software Design
– Assumptions of the learning system: a legal move generator exists
– Software requirements: generator, evaluator(s), parametric target function
Choosing a Target Function
– ChooseMove: Board → Move (action selection function, or policy)
– V: Board → ℝ (board evaluation function)
– Ideal target V; approximated target V̂
– Goal of learning process: operational description (approximation) of V
A Target Function for Learning to Play Checkers

Possible Definition
– If b is a final board state that is won, then V(b) = 100
– If b is a final board state that is lost, then V(b) = -100
– If b is a final board state that is drawn, then V(b) = 0
– If b is not a final board state, then V(b) = V(b′), where b′ is the best final board state that can be achieved starting from b and playing optimally until the end of the game
– Correct values, but not operational
Choosing a Representation for the Target Function
– Collection of rules? Neural network? Polynomial function (e.g., linear or quadratic combination) of board features? Other?
A Representation for the Learned Function
– V̂(b) = w₀ + w₁·bp(b) + w₂·rp(b) + w₃·bk(b) + w₄·rk(b) + w₅·bt(b) + w₆·rt(b)
– bp/rp = number of black/red pieces; bk/rk = number of black/red kings; bt/rt = number of black/red pieces threatened (can be taken on next turn)
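The linear representation above is easy to sketch directly. The feature names follow the slide; the particular board position and weight values below are illustrative inventions.

```python
# Linear board-evaluation sketch for checkers using the slide's six features
# (bp, rp, bk, rk, bt, rt); the sample board and weights are illustrative.
FEATURES = ["bp", "rp", "bk", "rk", "bt", "rt"]

def evaluate(board_features, weights):
    """V_hat(b) = w0 + sum_i w_i * f_i(b): weighted sum of board features."""
    value = weights[0]                          # w0: constant bias term
    for w, name in zip(weights[1:], FEATURES):
        value += w * board_features[name]
    return value

# Example: 8 black vs. 7 red pieces, one black king, two red pieces threatened.
b = {"bp": 8, "rp": 7, "bk": 1, "rk": 0, "bt": 0, "rt": 2}
w = [0.0, 1.0, -1.0, 2.0, -2.0, -0.5, 0.5]      # illustrative weights
print(evaluate(b, w))                           # 8 - 7 + 2 + 1 = 4.0
```

This is operational in exactly the sense the slide demands: it can be computed for any board b, unlike the ideal V defined via optimal play to the end of the game.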
A Training Procedure for Learning to Play Checkers

Obtaining Training Examples
– V(b): the target function
– V̂(b): the learned function
– Vtrain(b): the training value
One Rule for Estimating Training Values
– Vtrain(b) ← V̂(Successor(b))
Choosing a Weight Tuning Rule
– Least Mean Square (LMS) weight update rule:
  REPEAT
    Select a training example b at random
    Compute error(b) = Vtrain(b) − V̂(b)
    For each board feature fᵢ, update weight wᵢ ← wᵢ + c · fᵢ · error(b)
  where c is a small, constant factor to adjust the learning rate
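The LMS rule above can be sketched for the linear evaluator. The feature vector, training value, and learning rate below are illustrative; only the update rule wᵢ ← wᵢ + c·fᵢ·error(b) comes from the slide.

```python
# LMS weight-update sketch for the linear checkers evaluator; the sample
# features, training value, and learning rate c are illustrative.
def predict(weights, features):
    """V_hat(b) = w0 + sum_i w_i * f_i(b)."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], features))

def lms_update(weights, features, v_train, c=0.01):
    """One LMS step: nudge each weight in proportion to its feature and the error."""
    error = v_train - predict(weights, features)
    weights[0] += c * error                  # bias term uses f_0 = 1
    for i, f in enumerate(features, start=1):
        weights[i] += c * f * error
    return error

weights = [0.0] * 7                          # w0..w6, all zero to start
features = [8, 7, 1, 0, 0, 2]                # bp, rp, bk, rk, bt, rt
for _ in range(200):                         # repeated updates shrink the error
    lms_update(weights, features, v_train=4.0)
print(round(predict(weights, features), 2))  # close to the training value 4.0
```

Repeating the update on the same example drives V̂(b) toward Vtrain(b); in the full procedure, examples are drawn at random from self-play games.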
Design Choices for Learning to Play Checkers

Determine Type of Training Experience
– Games against experts / games against self / table of correct moves
Determine Target Function
– Board → value vs. Board → move
Determine Representation of Learned Function
– Polynomial / linear function of six features / artificial neural network
Determine Learning Algorithm
– Gradient descent / linear programming
→ Completed design
Knowledge Bases

Adapted from slides by S. Russell, UC Berkeley
Simple Knowledge-Based Agent

Figure 6.1, p. 152, Russell and Norvig: the generic knowledge-based agent, which TELLs its knowledge base each percept, ASKs the KB what action to take, TELLs the KB the action chosen, and returns it.

Adapted from slides by S. Russell, UC Berkeley
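The TELL-ASK loop of the figure can be sketched as follows. The `KB` class and sentence format here are toy stand-ins (exact-match lookup rather than logical inference), not the book's implementation; the percepts and the glitter-grab rule are illustrative.

```python
# Sketch of the generic knowledge-based agent loop (TELL percept, ASK for an
# action, TELL the action). The KB here is a toy: a set of sentence strings
# with exact-match ASK, standing in for a real inference engine.
class KB:
    def __init__(self):
        self.sentences = set()
    def tell(self, sentence):
        self.sentences.add(sentence)
    def ask(self, query):
        return query in self.sentences

def kb_agent(kb, percept, t):
    """One step of the knowledge-based agent at time t."""
    kb.tell(f"percept({percept}, {t})")
    # A real agent would derive the best action by inference; this illustrative
    # shortcut checks a single stored fact about the current percept.
    action = "grab" if kb.ask(f"percept(glitter, {t})") else "forward"
    kb.tell(f"action({action}, {t})")
    return action

kb = KB()
print(kb_agent(kb, "glitter", 0))  # -> grab
print(kb_agent(kb, "breeze", 1))   # -> forward
```

The point of the architecture is that the agent program stays fixed while behavior changes by TELLing the KB new knowledge.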
Summary Points

Introduction to Games as Search Problems
– Frameworks: two-player versus multi-player; zero-sum versus cooperative; perfect information versus partially observable (hidden state)
– Concepts: utility and representations (e.g., static evaluation function); reinforcements (a possible role for machine learning); game tree
Family of Algorithms for Game Trees: Minimax
– Propagation of credit
– Imperfect decisions
– Issues: quiescence; horizon effect
– Need for pruning
Terminology

Game Graph Search
– Frameworks: two-player versus multi-player; zero-sum versus cooperative; perfect information versus partially observable (hidden state)
– Concepts: utility and representations (e.g., static evaluation function); reinforcements (a possible role for machine learning); game tree (node/move correspondence, search ply)
Family of Algorithms for Game Trees: Minimax
– Propagation of credit
– Imperfect decisions
– Issues: quiescence; horizon effect
– Need for (alpha-beta) pruning