Project Ideas Apply and/or extend ideas from class to a non-trivial application/problem. Sometimes this may involve doing a project that focuses on evaluation.

Presentation transcript:

1 Project Ideas

2 Algorithmic Evaluations/Comparisons
- Compare variants of (nested) policy rollout using different bandit algorithms
- Compare some variants of Monte-Carlo tree search
- Implement an algorithm from the literature and attempt to replicate results, e.g.
  - Forward Search Sparse Sampling (a type of Monte-Carlo tree search algorithm)
  - Anytime AO*
  - Least-Squares Policy Iteration
- I could give other pointers depending on interests
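To make the first comparison concrete, here is a minimal sketch of single-level policy rollout in which the choice of root action is treated as a bandit problem, so that two allocation rules (uniform and UCB1) can be compared under the same rollout budget. The toy chain domain, the random base policy, the function names, and all constants are assumptions made for illustration; they are not taken from the course framework.

```python
import math
import random

GOAL, START = 10, 5      # hypothetical toy chain: states 0..10, start in the middle
ACTIONS = (-1, +1)       # move left or right

def step(state, action, rng):
    """Toy simulator (an assumption): the move is reversed with probability 0.2."""
    move = action if rng.random() < 0.8 else -action
    nxt = state + move
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt in (0, GOAL)

def base_policy(state, rng):
    """Base policy to be improved by rollout: act uniformly at random."""
    return rng.choice(ACTIONS)

def rollout(state, action, rng, horizon=100):
    """Return of taking `action` once, then following the base policy."""
    total = 0.0
    for _ in range(horizon):
        state, reward, done = step(state, action, rng)
        total += reward
        if done:
            break
        action = base_policy(state, rng)
    return total

def uniform_bandit(pull, n_arms, budget):
    """Spend the budget round-robin over the arms."""
    totals, counts = [0.0] * n_arms, [0] * n_arms
    for t in range(budget):
        arm = t % n_arms
        totals[arm] += pull(arm)
        counts[arm] += 1
    return totals, counts

def ucb1_bandit(pull, n_arms, budget, c=math.sqrt(2)):
    """Pull each arm once, then pick the arm with the highest UCB1 index."""
    totals, counts = [0.0] * n_arms, [0] * n_arms
    for t in range(budget):
        if t < n_arms:
            arm = t
        else:
            arm = max(range(n_arms), key=lambda a: totals[a] / counts[a]
                      + c * math.sqrt(math.log(t) / counts[a]))
        totals[arm] += pull(arm)
        counts[arm] += 1
    return totals, counts

def rollout_policy(state, bandit, rng, budget=200):
    """Choose the action whose rollouts have the best empirical mean return."""
    totals, counts = bandit(lambda arm: rollout(state, ACTIONS[arm], rng),
                            len(ACTIONS), budget)
    pulled = [a for a in range(len(ACTIONS)) if counts[a] > 0]
    return ACTIONS[max(pulled, key=lambda a: totals[a] / counts[a])]

def accuracy(bandit, trials=100, seed=0):
    """Fraction of decisions at START that pick +1, the better action here."""
    rng = random.Random(seed)
    hits = sum(rollout_policy(START, bandit, rng) == +1 for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    print("uniform:", accuracy(uniform_bandit))
    print("UCB1:   ", accuracy(ucb1_bandit))
```

A project along these lines would swap in other bandit rules, vary the budget, and nest the procedure by using `rollout_policy` itself as the base policy. On a two-action problem this easy the two rules will likely perform similarly, so a more interesting domain is needed to separate them.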

3 Algorithmic Comparisons
- Compare some reinforcement learning algorithms across some interesting problems
  - E.g., compare TD-based vs. policy-gradient-based methods
- You could use the domains I have in the Java framework for evaluation
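As one possible shape for such a comparison, the sketch below trains a TD-based learner (tabular Q-learning) and a policy-gradient learner (REINFORCE with a tabular softmax policy and no baseline) on the same small domain and scores both with the same evaluation routine. The chain domain, the hyperparameters, and the function names are illustrative assumptions. It is written in Python rather than against the Java framework mentioned above, whose API is not shown here.

```python
import math
import random

N, START, MAX_STEPS = 6, 3, 100   # hypothetical chain: states 0..6, start in the middle
GAMMA = 0.9                       # discount factor (an assumption)

def step(state, action, rng):
    """Toy simulator (an assumption): action 0 = left, 1 = right, 10% slip."""
    move = 1 if action == 1 else -1
    if rng.random() < 0.1:
        move = -move
    nxt = state + move
    return nxt, (1.0 if nxt == N else 0.0), nxt in (0, N)

def q_learning(episodes, rng, alpha=0.1, eps=0.1):
    """TD-based learner: tabular Q-learning with epsilon-greedy exploration."""
    q = [[0.0, 0.0] for _ in range(N + 1)]
    for _ in range(episodes):
        s = START
        for _ in range(MAX_STEPS):
            if rng.random() < eps or q[s][0] == q[s][1]:
                a = rng.randrange(2)              # explore, or break ties at random
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s2, r, done = step(s, a, rng)
            target = r if done else r + GAMMA * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])  # one-step TD update
            s = s2
            if done:
                break
    return q

def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]

def reinforce(episodes, rng, lr=0.2):
    """Policy-gradient learner: REINFORCE, tabular softmax policy, no baseline."""
    theta = [[0.0, 0.0] for _ in range(N + 1)]
    for _ in range(episodes):
        s, traj = START, []
        for _ in range(MAX_STEPS):
            a = 0 if rng.random() < softmax(theta[s])[0] else 1
            s2, r, done = step(s, a, rng)
            traj.append((s, a, r))
            s = s2
            if done:
                break
        g = 0.0
        for s, a, r in reversed(traj):            # discounted return from each step
            g = r + GAMMA * g
            pi = softmax(theta[s])
            for b in range(2):                    # d log pi(a|s) / d theta[s][b]
                theta[s][b] += lr * g * ((1.0 if b == a else 0.0) - pi[b])
    return theta

def evaluate(table, rng, episodes=1000):
    """Success rate of the greedy policy read off a Q-table or preference table."""
    wins = 0
    for _ in range(episodes):
        s = START
        for _ in range(MAX_STEPS):
            a = max(range(2), key=lambda b: table[s][b])  # ties go to action 0 (left)
            s, r, done = step(s, a, rng)
            if done:
                wins += r == 1.0
                break
    return wins / episodes

if __name__ == "__main__":
    rng = random.Random(0)
    print("Q-learning:", evaluate(q_learning(3000, rng), rng))
    print("REINFORCE: ", evaluate(reinforce(3000, rng), rng))
```

A fuller study would report learning curves averaged over many seeds rather than a single final score, and would tune each method's step size separately so that neither is handicapped by a shared default.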

4 Solve a Particular Problem
- Pick a challenging sequential decision-making problem
- Apply one or more of our planning/learning approaches to it and evaluate
- Problems from past projects:
  - Games
    - Tetris
    - Pokémon
    - Blokus
    - Chess
    - Backgammon
    - Othello
    - Clue
    - Space Wars (Galcon Fusion)
    - StarCraft
    - Pac-Man

5 Solve a Particular Problem
- Problems from past projects:
  - Compiler scheduling
  - Adaptive Java program optimization
  - Forest Fire Management
  - Crop Management
  - Optimizing Policies for Network Protocols
  - Controllers for Real-Time Strategy Games
    - Subproblems of the game
  - Optimizing file sharing policies
- Reinforcement learning and Monte-Carlo methods were the most commonly applied solution approaches