Project Ideas

Apply and/or extend ideas from class to a non-trivial application or problem. Sometimes this may involve doing a project that focuses on evaluation.
Presentation transcript:

1 Project Ideas

2 Algorithmic Evaluations/Comparisons
- Compare variants of (nested) policy rollout using different bandit algorithms
- Compare some variants of Monte-Carlo tree search
- Implement an algorithm from the literature and attempt to replicate results, e.g.
  - Forward Search Sparse Sampling (a type of Monte-Carlo tree search algorithm)
  - Anytime AO*
  - Least-Squares Policy Iteration
  - I could give other pointers depending on interests
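To make the first idea concrete, here is a minimal sketch of how two bandit algorithms might be swapped into the action-selection step of policy rollout. It is an illustration under stated assumptions, not the course's implementation: each arm stands for one action at the current state, and `make_rollout_simulator` is a hypothetical stand-in that returns a Bernoulli reward in place of a real "take the action, then follow the base policy" simulation. The function names, success probabilities, and budget are all invented for the example.

```python
import math
import random


def _means(sums, counts):
    """Per-arm average reward (0.0 for arms that were never pulled)."""
    return [s / n if n else 0.0 for s, n in zip(sums, counts)]


def uniform_allocation(pull, n_arms, budget):
    """Round-robin baseline: every arm gets an equal share of the budget."""
    sums, counts = [0.0] * n_arms, [0] * n_arms
    for t in range(budget):
        arm = t % n_arms
        sums[arm] += pull(arm)
        counts[arm] += 1
    return _means(sums, counts), counts


def ucb1_allocation(pull, n_arms, budget, c=math.sqrt(2)):
    """UCB1: pull each arm once, then maximize mean + c * sqrt(ln t / n)."""
    sums, counts = [0.0] * n_arms, [0] * n_arms
    for t in range(budget):
        if t < n_arms:
            arm = t  # initialization: try every arm once
        else:
            arm = max(
                range(n_arms),
                key=lambda i: sums[i] / counts[i]
                + c * math.sqrt(math.log(t) / counts[i]),
            )
        sums[arm] += pull(arm)
        counts[arm] += 1
    return _means(sums, counts), counts


def make_rollout_simulator(success_probs, rng):
    """Hypothetical stand-in for one rollout: Bernoulli return per action."""
    def pull(arm):
        return 1.0 if rng.random() < success_probs[arm] else 0.0
    return pull


def choose_action(allocator, pull, n_arms, budget):
    """Rollout policy: spend the budget, then pick the best-looking arm."""
    means, _ = allocator(pull, n_arms, budget)
    return max(range(n_arms), key=lambda a: means[a])


# Toy comparison: how often does each allocator pick the truly best action?
rng = random.Random(0)
probs = [0.1, 0.3, 0.9]  # assumed rollout success rates; action 2 is best
trials, budget = 200, 60
for name, allocator in [("uniform", uniform_allocation), ("UCB1", ucb1_allocation)]:
    pull = make_rollout_simulator(probs, rng)
    hits = sum(choose_action(allocator, pull, 3, budget) == 2 for _ in range(trials))
    print(f"{name}: picked the best action in {hits}/{trials} trials")
```

A real comparison would replace the Bernoulli simulator with rollouts in an actual domain and report results across several budgets, since the relative ranking of allocators can depend on how many simulations are available per decision.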

3 Algorithmic Comparisons
- Compare some reinforcement learning algorithms across some interesting problems
  - E.g. compare TD-based vs. policy-gradient-based methods
- You could use the domains I have in the Java framework for evaluation
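As a rough illustration of what a TD-based vs. policy-gradient comparison could look like, the sketch below runs tabular Q-learning and REINFORCE on the same problem. Everything specific here is an assumption made for the example: the six-state chain domain, the hyperparameters, and the function names are hypothetical and are not taken from the Java framework. The REINFORCE variant shown omits the per-step discount factor on the gradient, a common simplification.

```python
import math
import random

N_STATES = 6         # hypothetical chain 0..5; state 5 is the goal
ACTIONS = (-1, +1)   # index 0 moves left, index 1 moves right
GAMMA = 0.95
MAX_STEPS = 200      # cap on episode length


def step(state, action_idx):
    """Deterministic chain: reward 1 on reaching the goal, else 0."""
    nxt = min(max(state + ACTIONS[action_idx], 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def q_learning(episodes, rng, alpha=0.5, epsilon=0.1):
    """TD-based control: tabular Q-learning with epsilon-greedy exploration."""
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    lengths = []
    for _ in range(episodes):
        s, done, t = 0, False, 0
        while not done and t < MAX_STEPS:
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)  # explore, or break ties randomly
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s2, r, done = step(s, a)
            target = r if done else r + GAMMA * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s, t = s2, t + 1
        lengths.append(t)
    return q, lengths


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]


def reinforce(episodes, rng, lr=0.1):
    """Policy-gradient control: REINFORCE with a tabular softmax policy."""
    theta = [[0.0, 0.0] for _ in range(N_STATES)]
    lengths = []
    for _ in range(episodes):
        s, done, t, traj = 0, False, 0, []
        while not done and t < MAX_STEPS:
            probs = softmax(theta[s])
            a = 0 if rng.random() < probs[0] else 1
            s2, r, done = step(s, a)
            traj.append((s, a, r))
            s, t = s2, t + 1
        # Accumulate the gradient first so it is evaluated at the policy
        # that actually generated the episode, then apply it in one step.
        grad = [[0.0, 0.0] for _ in range(N_STATES)]
        g = 0.0
        for s_t, a_t, r_t in reversed(traj):
            g = r_t + GAMMA * g  # Monte-Carlo return from step t
            probs = softmax(theta[s_t])
            for b in range(2):   # gradient of log softmax w.r.t. theta[s_t][b]
                grad[s_t][b] += g * ((1.0 if b == a_t else 0.0) - probs[b])
        for s_i in range(N_STATES):
            for b in range(2):
                theta[s_i][b] += lr * grad[s_i][b]
        lengths.append(t)
    return theta, lengths


# Same problem, same episode budget, one summary metric per algorithm.
for name, learner in [("Q-learning", q_learning), ("REINFORCE", reinforce)]:
    _, lengths = learner(500, random.Random(0))
    print(f"{name}: mean episode length over last 50 episodes = "
          f"{sum(lengths[-50:]) / 50:.1f}")
```

A project-quality comparison would average over many random seeds, tune each algorithm's hyperparameters separately, and use several domains, since a single toy problem says little about how the two families compare in general.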

4 Solve a Particular Problem
- Pick a challenging sequential decision-making problem
- Apply one or more of our planning/learning approaches to it and evaluate
- Problems from past projects:
  - Games
    - Tetris
    - Pokémon
    - Blokus
    - Chess
    - Backgammon
    - Othello
    - Clue
    - Space Wars (Galcon Fusion)
    - StarCraft
    - Pac-Man

5 Solve a Particular Problem
- Problems from past projects:
  - Compiler scheduling
  - Adaptive Java program optimization
  - Forest fire management
  - Crop management
  - Optimizing policies for network protocols
  - Controllers for real-time strategy games
    - Subproblems of the game
  - Optimizing file sharing policies
- Reinforcement learning and Monte-Carlo were the most commonly applied solution approaches