Reinforcement Learning

Presentation transcript:

Reinforcement Learning: Developing a self-learning snake game using reinforcement learning and pygame.

About me
Student, pursuing my Bachelor's in Software Engineering.
Freelance software developer.
A FOSS enthusiast, currently contributing to coala.
Pythonista who loves to develop automation and machine learning projects and occasionally writes blog posts about Python.
GitHub: https://github.com/satwikkansal
LinkedIn: https://linkedin.com/in/satwikkansal
Website: http://www.satwikkansal.xyz
Blog: https://satwikkansal.wordpress.com

Do you remember these?

Contents
Quick intro to game development: common concepts.
Designing the gameplay.
Events and control, implementing game logic.
Some RL concepts: agent, state, reward, policy, MDP and a few more.
Q-learning to the rescue.
Other reinforcement learning techniques.
Self-driving car in action.
Current applications and future scope of RL.
Available open source frameworks and libraries.
The code for the workshop is available at https://github.com/satwikkansal/snakepy

Some game development concepts
Coordinates: the screen is a 2D grid plane with (0,0) in the top left.
Colors: RGB and alpha values.
Drawing: plotting pixels, Surface objects, blitting.
Rendering: animation, frame/refresh rate.
The game loop: repeatedly handle events, update the game state, and redraw the screen.

Designing the gameplay
Objects: a snake, apples, walls.
The snake eats an apple and grows 1 unit longer.
The snake dies when it hits a wall or runs over itself.
Objective: eat as many apples as possible without dying.
What happens when the snake gets killed? How do we start the game?

Code Implementation: Drawing, Displaying and Moving the game objects.

User interaction and game logic
Arrow keys move the head.
Do we want our snake to keep moving?
Detecting overlaps and collisions of the snake's head with other objects: boundaries, apples, and its own body.
Scoring.
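The movement and collision rules above can be sketched without any drawing code. This is a minimal illustration, not the workshop's implementation: the `step` function, the head-first list representation of the snake, and the direction names are all assumptions made for this example.

```python
from typing import List, Optional, Tuple

Cell = Tuple[int, int]  # (x, y), with (0, 0) in the top-left corner

# Moving "up" decreases y because the origin is at the top left.
DIRECTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def step(snake: List[Cell], direction: str, apple: Cell,
         width: int, height: int) -> Tuple[Optional[List[Cell]], bool]:
    """Advance the snake by one cell (hypothetical helper).

    The snake is a list of cells with the head first.
    Returns (new_snake, ate_apple); new_snake is None if the snake died.
    """
    dx, dy = DIRECTIONS[direction]
    head_x, head_y = snake[0]
    new_head = (head_x + dx, head_y + dy)

    # Collision with the boundary.
    if not (0 <= new_head[0] < width and 0 <= new_head[1] < height):
        return None, False

    ate_apple = new_head == apple
    # The tail cell is vacated this tick unless the snake grows.
    body = snake if ate_apple else snake[:-1]

    # Collision with the snake's own body.
    if new_head in body:
        return None, False

    return [new_head] + body, ate_apple


# Usage: a two-cell snake moving right onto an apple grows by one cell.
new_snake, ate = step([(2, 2), (1, 2)], "right", apple=(3, 2), width=5, height=5)
```

Keeping the rules in a pure function like this makes them easy to reuse later, when an agent rather than the keyboard chooses the direction.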

Code Implementation: Adding the controls and the score to make a fully functional snake game.

Okay, let’s make our dumb computer control the snake.

Code Implementation: Wait, let's add some intelligence to our agent. (Give the CPU vision, i.e. the game rules.)
Next section: Or better, let's make the CPU discover knowledge. (Make our snake learn from experience.)
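A minimal sketch of the two hand-written agents discussed so far: a "dumb" one that picks a random direction and a rule-based one that knows the game rules. The function names, the head-first snake list, and the greedy "move toward the apple" rule are assumptions for illustration, not the workshop's code.

```python
import random
from typing import List, Tuple

Cell = Tuple[int, int]  # (x, y), with (0, 0) in the top-left corner
DIRECTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def random_agent(snake: List[Cell], apple: Cell, width: int, height: int,
                 rng=random) -> str:
    """The 'dumb' agent: ignores the game state entirely."""
    return rng.choice(list(DIRECTIONS))


def safe_moves(snake: List[Cell], width: int, height: int) -> List[str]:
    """Directions that hit neither a wall nor the body.

    Simplification: the tail cell is treated as free because it normally
    moves away on the same tick.
    """
    head_x, head_y = snake[0]
    moves = []
    for name, (dx, dy) in DIRECTIONS.items():
        x, y = head_x + dx, head_y + dy
        inside = 0 <= x < width and 0 <= y < height
        if inside and (x, y) not in snake[:-1]:
            moves.append(name)
    return moves


def rule_based_agent(snake: List[Cell], apple: Cell, width: int, height: int) -> str:
    """Pick the safe move that ends closest to the apple (Manhattan distance)."""
    moves = safe_moves(snake, width, height)
    if not moves:
        return "up"  # no safe move left; any direction loses

    def distance_after(move: str) -> int:
        dx, dy = DIRECTIONS[move]
        return abs(snake[0][0] + dx - apple[0]) + abs(snake[0][1] + dy - apple[1])

    return min(moves, key=distance_after)
```

The rule-based agent plays reasonably well, but every rule had to be written by hand. That is the limitation that learning from experience is meant to remove.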

Time to introduce Reinforcement Learning!

A few things to know
State, history and episode
Action
Reward
Policy, value function, and model
Environment
Agent
Markov states and MDP
Long story short: the environment is everything that surrounds the agent. A state represents the situation of the agent at a particular time in the environment. The agent performs an action to transition from one state to another and may receive a reward in return. The policy is the strategy for choosing an action given a state, and the agent tries to choose a policy that maximizes the expected cumulative reward.
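To make "cumulative reward" concrete, here is a small sketch that sums the rewards of one episode. The discount factor `gamma` is an assumption not mentioned on the slide: it is the usual way to weight near-term rewards more heavily than distant ones, and `gamma=1.0` gives the plain sum.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float = 0.9) -> float:
    """Return r0 + gamma*r1 + gamma^2*r2 + ... for one episode."""
    total = 0.0
    # Work backwards so each step is one multiply and one add.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Usage: three steps, each with reward 1, discounted by 0.5 per step.
episode_return = discounted_return([1, 1, 1], gamma=0.5)  # 1 + 0.5 + 0.25 = 1.75
```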

Implementation: Refactoring the game’s code

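Q-learning to the rescue!
Popular, simple, model-free RL technique (a model of the environment is not required).
Can find an optimal action-selection policy for any finite MDP, given enough exploration and a suitably decaying learning rate.
Learns the action-value function Q(s, a): the expected cumulative reward of taking action a in state s and acting optimally afterwards.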
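The core of tabular Q-learning is a one-line update after every step. The sketch below assumes a dictionary-based Q-table keyed by (state, action); the function name and the default values of the learning rate `alpha` and discount factor `gamma` are illustrative choices, not values from the workshop.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9, done=False):
    """Move Q(state, action) toward reward + gamma * max_a Q(next_state, a)."""
    # A terminal state has no future reward.
    best_next = 0.0 if done else max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Usage: unseen (state, action) pairs default to 0.0.
q = defaultdict(float)
q[("s1", "right")] = 2.0
q_learning_update(q, "s0", "up", reward=1.0, next_state="s1",
                  actions=["up", "right"], alpha=0.5)
```

The update never consults a model of the environment. It only needs the observed transition (state, action, reward, next state), which is what "model-free" means here.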

Code Implementation: Using Q-learning to choose actions for the agent.
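As a self-contained illustration of how a Q-table drives action selection, the sketch below trains an agent on a toy five-cell corridor rather than the snake game. The corridor environment, the epsilon-greedy rule, and all hyperparameter values are assumptions chosen to keep the example short.

```python
import random
from collections import defaultdict

ACTIONS = ["left", "right"]
GOAL = 4  # corridor cells are 0..4; reaching cell 4 ends the episode


def env_step(state, action):
    """Toy environment: move one cell; reward 1.0 only on reaching the goal."""
    next_state = max(0, state - 1) if action == "left" else state + 1
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[(state, a)] for a in ACTIONS)
    # Break ties randomly so the untrained agent does not favor one action.
    return rng.choice([a for a in ACTIONS if q[(state, a)] == best])


def train(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done = env_step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


def greedy_policy(q):
    """Best known action for every non-terminal cell."""
    return [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)]


policy = greedy_policy(train())  # the learned policy heads right, toward the goal
```

The snake agent follows the same loop; only the state encoding, the action set, and the reward signal change.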

Our agent in action. Note: currently our rules don't penalize the snake for running over itself.

Possible improvements to our agent
Optimizing the state space.
Adding time-based rewards.
Balancing the exploration vs. exploitation tradeoff.
Optimizing the hyperparameters using techniques like grid search or genetic algorithms.
Using state-of-the-art RL techniques.

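Other interesting techniques
SARSA: an on-policy relative of Q-learning. Its update uses the value of the action the agent actually takes next (typically chosen epsilon-greedily, i.e. randomly with a predefined probability) instead of the best next action. Because it avoids the max over actions, each update can be cheaper when the number of actions is large.
Deep Q-Networks (DQN): combine RL with deep neural networks such as CNNs. A network approximates the non-linear action-value function and is trained on past transitions sampled from an experience replay buffer.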
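The difference between SARSA and Q-learning fits in a single line of the update. A minimal sketch, assuming a dictionary Q-table keyed by (state, action) and illustrative default hyperparameters:

```python
from collections import defaultdict


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9, done=False):
    """SARSA: bootstrap from the action the agent will actually take next.

    Q-learning would use max over all actions in next_state instead of
    q[(next_state, next_action)].
    """
    next_value = 0.0 if done else q[(next_state, next_action)]
    td_target = reward + gamma * next_value
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Usage: the agent has already chosen "left" as its next action, so the
# higher value of "right" in s1 does not enter the update.
q = defaultdict(float)
q[("s1", "left")] = 1.0
q[("s1", "right")] = 2.0
sarsa_update(q, "s0", "up", reward=1.0, next_state="s1",
             next_action="left", alpha=0.5)
```

The name comes from the five values the update consumes: state, action, reward, next state, next action.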

The self-driving car simulation design
State: Car on left, right, ahead? Traffic light green or red? Next waypoint (from GPS).
Actions: Steer left, steer right, accelerate, brake.
Rewards: Violating the traffic laws. Hitting the obstacles. Reaching the destination. Time taken to reach the destination (any thoughts on this?).
Code sample available at: https://github.com/satwikkansal/smartcab
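One possible way to encode this design in code is sketched below. Everything here is hypothetical and is not taken from the smartcab repository: the field names, the waypoint values, and especially the reward magnitudes are placeholders that would need tuning.

```python
from typing import NamedTuple


class State(NamedTuple):
    """What the car senses at one time step (hypothetical encoding)."""
    car_left: bool
    car_right: bool
    car_ahead: bool
    light: str          # "green" or "red"
    next_waypoint: str  # e.g. "left", "right" or "forward"


ACTIONS = ("steer_left", "steer_right", "accelerate", "brake")


def reward(violated_law: bool, hit_obstacle: bool, reached_destination: bool,
           step_penalty: float = -0.1) -> float:
    """Combine the reward sources listed above; all magnitudes are placeholders."""
    total = step_penalty          # small cost per step discourages dawdling
    if violated_law:
        total -= 5.0
    if hit_obstacle:
        total -= 10.0
    if reached_destination:
        total += 20.0
    return total


# Usage: a NamedTuple is hashable, so it can index a Q-table directly.
q_table = {}
state = State(car_left=False, car_right=True, car_ahead=False,
              light="red", next_waypoint="forward")
q_table[(state, "brake")] = 0.0
```

A per-step penalty is one answer to the question about time taken: the agent pays a little for every step, so shorter routes earn more, but too large a penalty can push it toward breaking traffic rules to save time.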

Applications of reinforcement learning
Playing games like chess (the reward is not instantaneous; feedback is delayed).
Managing portfolios and finances (the reward here is the money).
Robotics (humanoid robots).
Manufacturing and inventory management.
General AI agents: agents that can perform multiple tasks with a single algorithm, for example an agent playing all the Atari games.

Open source frameworks and libraries for RL
OpenAI Gym - A toolkit for developing and comparing reinforcement learning algorithms.
OpenAI Universe - A software platform for measuring and training an AI's general intelligence across the world's supply of games, websites and other applications.
DeepMind Lab - A customisable 3D platform for agent-based AI research.

Some nice links
YouTube lectures and tutorials:
UCL course on RL by D. Silver - http://bit.ly/RL-UCL
Sentdex pygame tutorial - http://bit.ly/sentdex-pygame
Python code samples:
Reinforcement Learning: An Introduction - http://bit.ly/RL-intro-Python
Online demo:
ConvNetJS - http://bit.ly/convnetjs