Using Hierarchical Reinforcement Learning to Solve a Problem with Multiple Conflicting Sub-problems
By: Stephen Robertson
Supervisor: Phil Sterne
Presentation Outline
Project Motivation
Project Aim
Progress so far
The Gridworld Problem
Flat Reinforcement Learning Implementation
Results
Still to do
Project Motivation
Reinforcement Learning is an attractive form of machine learning, but because of the curse of dimensionality it becomes inefficient on complex problems
Hierarchical Reinforcement Learning is a method for dealing with this curse of dimensionality
Project Aim
Apply various Hierarchical Reinforcement Learning algorithms to a complex gridworld problem
Compare the algorithms to each other and to flat Reinforcement Learning
Progress
Gridworld implemented in Java
Flat Reinforcement Learning implemented on a 6x6 gridworld in Java
Feudal Reinforcement Learning currently being implemented
Rules of the gridworld
Possible actions: Left, Right, Up, Down and Rest
Collecting food and drink increases nourishment and hydration
Landing on the tree gives the explorer wood with which it can repair its shelter
Rules of the gridworld
Resting in a repaired shelter increases health
Landing on the lion decreases health
With time, nourishment, hydration, health and shelter condition all gradually decrease (see the state sketch below)
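Not in the original slides: a minimal Java sketch of how the explorer's state and per-step decay described above could be represented. The names (GridworldSketch, Explorer, Action, step) and all numeric amounts are illustrative assumptions, not the project's actual implementation.

```java
import java.util.Random;

// A minimal sketch of the explorer's state and per-step dynamics in the
// gridworld. Names and numeric amounts here are assumptions for illustration.
public class GridworldSketch {
    enum Action { LEFT, RIGHT, UP, DOWN, REST }

    // Explorer state: grid position, the four gradually decaying quantities,
    // and whether wood is currently being carried.
    static class Explorer {
        int x, y;
        int nourishment = 3, hydration = 3, health = 3, shelterCondition = 3;
        boolean carryingWood = false;
    }

    // One time step: apply the chosen action, then the gradual decay of
    // nourishment and hydration (health and shelter condition would decay
    // on their own, slower schedule -- an assumption of this sketch).
    static void step(Explorer e, Action a, int gridSize) {
        switch (a) {
            case LEFT:  e.x = Math.max(0, e.x - 1); break;
            case RIGHT: e.x = Math.min(gridSize - 1, e.x + 1); break;
            case UP:    e.y = Math.max(0, e.y - 1); break;
            case DOWN:  e.y = Math.min(gridSize - 1, e.y + 1); break;
            case REST:  break; // resting in a repaired shelter would raise health
        }
        e.nourishment = Math.max(0, e.nourishment - 1);
        e.hydration = Math.max(0, e.hydration - 1);
    }

    public static void main(String[] args) {
        Explorer explorer = new Explorer();
        Random rng = new Random(0);
        Action[] actions = Action.values();
        for (int t = 0; t < 5; t++) {
            step(explorer, actions[rng.nextInt(actions.length)], 6);
            System.out.printf("t=%d pos=(%d,%d) nourishment=%d hydration=%d%n",
                    t, explorer.x, explorer.y, explorer.nourishment, explorer.hydration);
        }
    }
}
```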
Flat Reinforcement Learning
SARSA with eligibility traces was used
To get Flat Reinforcement Learning working at all I needed to simplify the task a bit:
6x6 gridworld
Nourishment, hydration, health and shelter condition each discretised to 4 levels
Total states: 6 x 6 x 4 x 4 x 4 x 4 x 2 = 18432
Manageable (a SARSA(λ) update sketch follows below)
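As a companion to the slide above, here is a minimal tabular SARSA(λ) sketch with replacing eligibility traces over the flattened 18432-state space. The class name, learning constants (ALPHA, GAMMA, LAMBDA, EPSILON) and the full sweep over the trace table are assumptions for illustration; the project's actual Java code may differ, for instance by keeping the traces sparse.

```java
import java.util.Random;

// A minimal tabular SARSA(lambda) sketch with replacing eligibility traces.
// State/action counts follow the slide; the learning constants are assumptions.
public class SarsaLambdaSketch {
    static final int NUM_STATES = 18432;   // 6 x 6 x 4 x 4 x 4 x 4 x 2
    static final int NUM_ACTIONS = 5;      // left, right, up, down, rest
    static final double ALPHA = 0.1, GAMMA = 0.95, LAMBDA = 0.9, EPSILON = 0.1;

    final double[][] q = new double[NUM_STATES][NUM_ACTIONS]; // action values
    final double[][] e = new double[NUM_STATES][NUM_ACTIONS]; // eligibility traces
    final Random rng = new Random(0);

    // Epsilon-greedy action selection over the current Q estimates.
    int selectAction(int state) {
        if (rng.nextDouble() < EPSILON) return rng.nextInt(NUM_ACTIONS);
        int best = 0;
        for (int a = 1; a < NUM_ACTIONS; a++) if (q[state][a] > q[state][best]) best = a;
        return best;
    }

    // One SARSA(lambda) step: TD error from the on-policy next action,
    // propagated to all state-action pairs via their eligibility traces.
    void update(int s, int a, double reward, int sNext, int aNext) {
        double delta = reward + GAMMA * q[sNext][aNext] - q[s][a];
        e[s][a] = 1.0;                          // replacing trace
        for (int i = 0; i < NUM_STATES; i++) {
            for (int j = 0; j < NUM_ACTIONS; j++) {
                q[i][j] += ALPHA * delta * e[i][j];
                e[i][j] *= GAMMA * LAMBDA;      // decay every trace
            }
        }
    }

    public static void main(String[] args) {
        SarsaLambdaSketch agent = new SarsaLambdaSketch();
        int s = 0, a = agent.selectAction(s);
        agent.update(s, a, 1.0, 1, agent.selectAction(1)); // dummy transition
        System.out.println("Q(0, " + a + ") = " + agent.q[0][a]);
    }
}
```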
Results
Still to do
Finish implementing Feudal Reinforcement Learning
Implement Phil’s interpretation of Feudal Reinforcement Learning
Implement MaxQ hierarchical reinforcement learning
And perhaps others…
Compare them
Questions?