Todd W. Neller, Gettysburg College


An Introduction to Monte Carlo Techniques in Artificial Intelligence - Part I
Todd W. Neller, Gettysburg College

Monte Carlo (MC) Techniques in AI

- General: Monte Carlo simulation for probabilistic estimation
- Machine Learning: Monte Carlo reinforcement learning
- Uncertain Reasoning: Bayesian network reasoning with the Markov Chain Monte Carlo method
- Robotics: Monte Carlo localization
- Search: Monte Carlo tree search
- Game Theory: Monte Carlo regret-based techniques

Monte Carlo Simulation

Repeated sampling of stochastic simulations to estimate system properties.

Recommended readings:
- Wikipedia article on Monte Carlo methods
- Paul J. Nahin's Digital Dice: Computational Solutions to Practical Probability Problems is a great source of MC simulation exercises.
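As a minimal illustration of the idea (a sketch with our own class and method names, not code from the talk), here is a Monte Carlo estimate of the probability that two dice sum to 7, whose exact value is 1/6:

```java
import java.util.Random;

/** Minimal Monte Carlo simulation: estimate P(two d6 sum to 7). */
public class TwoDiceSim {
    public static double estimateProbSeven(int trials, long seed) {
        Random rng = new Random(seed);   // fixed seed for reproducibility
        int hits = 0;
        for (int i = 0; i < trials; i++) {
            int sum = (rng.nextInt(6) + 1) + (rng.nextInt(6) + 1);
            if (sum == 7) hits++;
        }
        return (double) hits / trials;   // converges to 1/6 ≈ 0.1667
    }

    public static void main(String[] args) {
        System.out.println("P(sum == 7) ≈ " + estimateProbSeven(1_000_000, 42));
    }
}
```

With a million trials, the standard error is well under 0.001, so the estimate reliably lands near 1/6.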

Why MC Simulation?

Nahin's motivating philosophical theme:
1. No matter how smart you are, there will always be probabilistic problems that are too hard for you to solve analytically.
2. Despite (1), if you know a good scientific programming language that incorporates a random number generator (and if it is good, it will), you may still be able to get numerical answers to those "too hard" problems.

Problem-Solving Approach

1. Program a single simulation with enough printed output to convince you of the correctness of your model.
2. Add your statistical measure of interest and test its correctness as well.
3. Remove printing from the code.
4. Wrap the code in a loop of many iterations.
5. Add printing to summarize the analysis of the collected statistical data.
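The steps above can be sketched in Java on a toy question of our own choosing (all names are illustrative): the expected number of d6 rolls until a 6 appears, which is exactly 6 for a geometric distribution with p = 1/6.

```java
import java.util.Random;

/** Steps 1-5 applied to a toy question: expected d6 rolls until a 6 appears. */
public class RollsUntilSix {
    static final Random RNG = new Random(0);

    // Steps 1-3: one simulation returning the statistic of interest
    // (development-time printing has already been removed).
    static int simulateOnce() {
        int rolls = 0;
        int roll;
        do {
            roll = RNG.nextInt(6) + 1;
            rolls++;
        } while (roll != 6);
        return rolls;
    }

    // Steps 4-5: wrap in a loop of many iterations and summarize.
    public static double averageRolls(int iterations) {
        long total = 0;
        for (int i = 0; i < iterations; i++) total += simulateOnce();
        return (double) total / iterations;  // approaches E = 6
    }

    public static void main(String[] args) {
        System.out.println("Average rolls until a 6: " + averageRolls(1_000_000));
    }
}
```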

Game AI Exercises

- Yahtzee: probability of getting a Yahtzee (5 of a kind) in 3 rolls of 5 dice
- Pig:
  - probability of turn outcomes of the "hold at 20" policy
  - expected number of turns in solitaire play
  - first-player advantage assuming the "hold at 20" policy
- Risk: attack rollouts with varying numbers of attackers and defenders
- Limitations of MC Simulation: probability of rolling all 1s with n dice
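As one illustration of these exercises (a sketch with our own naming, not the course solution code), we can estimate the probability that a "hold at 20" Pig turn busts, i.e., a 1 is rolled before the turn total reaches 20:

```java
import java.util.Random;

/** MC estimate of one "hold at 20" Pig turn outcome: the probability of
    rolling a 1 (scoring nothing) before banking a turn total of 20+. */
public class PigHoldAt20 {
    public static double bustProbability(int trials, long seed) {
        Random rng = new Random(seed);
        int busts = 0;
        for (int i = 0; i < trials; i++) {
            int turnTotal = 0;
            while (turnTotal < 20) {
                int roll = rng.nextInt(6) + 1;
                if (roll == 1) { busts++; break; }  // a 1 ends the turn with 0
                turnTotal += roll;                  // otherwise accumulate
            }
        }
        return (double) busts / trials;
    }

    public static void main(String[] args) {
        System.out.println("P(bust | hold at 20) ≈ " + bustProbability(1_000_000, 1));
    }
}
```

The same skeleton adapts to the other bullets; for the "all 1s with n dice" limitation, note that for even moderate n the event probability (1/6)^n is so small that no feasible number of samples observes it, so the MC estimate degenerates to 0.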

MC Reinforcement Learning

Learn essential Reinforcement Learning (RL) terminology from a variety of sources:
- Sutton, R.S. and Barto, A.G. Reinforcement Learning: An Introduction, Chapter 3
- Kaelbling, L.P., Littman, M.L., and Moore, A.W. Reinforcement learning: a survey, Sections 1 and 3.1
- Russell, S. and Norvig, P. Artificial Intelligence: A Modern Approach, 3rd ed., Section 17.1

Read specifically about MC RL:
- Sutton, R.S. and Barto, A.G. Reinforcement Learning: An Introduction, Chapter 5

Approach N

Since learning is best through experience, we suggest implementing Sutton and Barto's MC RL algorithms with a single running problem: Approach N.

- Originally designed as the simplest "Jeopardy approach game" [Neller & Presser 2005] prototype.
- 2 players and a single standard 6-sided die (d6).
- Goal: approach a total of n without exceeding it.
- The 1st player rolls the die repeatedly until they either (1) "hold" with a roll sum <= n, or (2) exceed n and lose.
- If the 1st player holds at exactly n, they win immediately.
- Otherwise, the 2nd player rolls to exceed the 1st player's total without exceeding n, winning or losing accordingly.
- Only the 1st player has a choice of play policy.
- For n >= 10, the game is nearly fair.
- Sample solution output is given for n = 10, but students may be assigned different n.
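A single game under a threshold policy ("roll until the sum reaches s, then hold") might be simulated as follows; this is our own sketch of the rules above, with illustrative names:

```java
import java.util.Random;

/** One game of Approach n under a hold-at-s policy for player 1. */
public class ApproachN {
    public static boolean playerOneWins(int n, int s, Random rng) {
        int p1 = 0;
        while (p1 < s) p1 += rng.nextInt(6) + 1;   // player 1 rolls to threshold s
        if (p1 > n) return false;                  // exceeded n: player 1 loses
        if (p1 == n) return true;                  // held at exactly n: immediate win
        int p2 = 0;
        while (p2 <= p1) p2 += rng.nextInt(6) + 1; // player 2 must exceed p1's total
        return p2 > n;                             // player 2 busts -> player 1 wins
    }

    public static double winProbability(int n, int s, int trials, long seed) {
        Random rng = new Random(seed);
        int wins = 0;
        for (int i = 0; i < trials; i++)
            if (playerOneWins(n, s, rng)) wins++;
        return (double) wins / trials;
    }

    public static void main(String[] args) {
        System.out.println("P1 win prob (n=10, hold at 9): "
                + winProbability(10, 9, 500_000, 7));
    }
}
```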

MC RL Approach N Exercises

- Comparative MC simulation: simulate games with the 1st player holding at sum s for each s in [n - 5, n]. Which s optimizes 1st-player wins?
- First-visit MC method for policy evaluation
- MC control with exploring starts (MCES)
- Epsilon-soft on-policy MC control
- Off-policy MC control
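The first exercise might be sketched like this (self-contained, with our own naming; the reported best threshold is whatever the MC estimates favor for the given n):

```java
import java.util.Random;

/** Comparative MC simulation for Approach n: estimate player 1's win
    probability for each hold threshold s in [n-5, n], report the best. */
public class CompareHoldValues {
    static boolean playerOneWins(int n, int s, Random rng) {
        int p1 = 0;
        while (p1 < s) p1 += rng.nextInt(6) + 1;   // player 1 rolls to threshold s
        if (p1 > n) return false;                  // busted
        if (p1 == n) return true;                  // exact n: immediate win
        int p2 = 0;
        while (p2 <= p1) p2 += rng.nextInt(6) + 1; // player 2 must exceed p1's total
        return p2 > n;                             // player 2 busts -> player 1 wins
    }

    public static int bestHoldValue(int n, int trials, long seed) {
        Random rng = new Random(seed);
        int bestS = n - 5;
        double bestP = -1.0;
        for (int s = n - 5; s <= n; s++) {
            int wins = 0;
            for (int i = 0; i < trials; i++)
                if (playerOneWins(n, s, rng)) wins++;
            double p = (double) wins / trials;
            System.out.printf("s = %d: estimated P1 win prob = %.4f%n", s, p);
            if (p > bestP) { bestP = p; bestS = s; }
        }
        return bestS;
    }

    public static void main(String[] args) {
        System.out.println("Best hold value for n = 10: "
                + bestHoldValue(10, 500_000, 3));
    }
}
```

The remaining exercises replace this brute-force comparison with Sutton and Barto's MC RL algorithms, which learn state values (and eventually the policy) from the same simulated episodes.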

Further MC RL Game AI Exercises

- Hog Solitaire: each turn, roll some chosen number of dice. Score only rolls with no 1s. How many dice should be rolled so as to minimize the expected number of turns to reach a goal score?
- Pig Solitaire: as above, but with individual die rolls and the option to hold and score at any time.
- Yahtzee or Chance: assuming an option to score a Yahtzee (5 of a kind, 50 pts.) or Chance (sum of dice) in 3 rolls, which dice should be rerolled in any given situation?
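The Hog Solitaire exercise can be attacked first with plain MC simulation before any RL machinery; the sketch below (our own names, and a goal score of 100 chosen for illustration) estimates expected turns for a fixed number of dice k:

```java
import java.util.Random;

/** Hog Solitaire MC sketch: each turn, roll k dice; a turn containing any 1
    scores 0, otherwise it scores the dice sum. Estimate expected turns to goal. */
public class HogSolitaire {
    public static double expectedTurns(int k, int goal, int games, long seed) {
        Random rng = new Random(seed);
        long totalTurns = 0;
        for (int g = 0; g < games; g++) {
            int score = 0, turns = 0;
            while (score < goal) {
                int sum = 0;
                boolean hasOne = false;
                for (int d = 0; d < k; d++) {
                    int roll = rng.nextInt(6) + 1;
                    if (roll == 1) hasOne = true;
                    sum += roll;
                }
                if (!hasOne) score += sum;  // a 1 anywhere voids the turn
                turns++;
            }
            totalTurns += turns;
        }
        return (double) totalTurns / games;
    }

    public static void main(String[] args) {
        for (int k = 1; k <= 8; k++)
            System.out.printf("k = %d dice: expected turns to 100 ≈ %.2f%n",
                              k, expectedTurns(k, 100, 50_000, 11));
    }
}
```

Comparing the estimates across k answers the exercise's question directly; the expected score of a turn with k dice is (5/6)^k · 4k, which the simulation should mirror.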

Conclusion

Deep knowledge comes best from playful experience.

"One must learn by doing the thing; for though you think you know it, you have no certainty, until you try." – Sophocles
"Play is our brain's favorite way of learning." – Diane Ackerman

We have provided novel, fun Game AI exercises that:
- cover essentials in MC simulation and MC RL
- range from CS1-level to advanced AI exercises
- have Java solutions available to instructors
- suggest many starting points for undergraduate research projects