Motivated Learning in Autonomous Systems
Pawel Raif, Silesian University of Technology, Poland
Janusz A. Starzyk, Ohio University, USA
IJCNN, International Joint Conference on Neural Networks, San Jose, 2011

Outline
- Reinforcement Learning (RL)
- Goal Creation System (GCS): a self-organizing, pain-based network
- Motivated Learning (ML) as a combination of RL + GCS
- Simulation results
- Possible applications of ML

Machine Learning Methods
- Machine learning comprises supervised learning, unsupervised learning, corrective learning, and reinforcement learning.
- Problems in "real world" applications, such as autonomous systems: the "curse of dimensionality" and the lack of motivation for development.
- Proposed extensions: hierarchical RL and intrinsic motivation, pursued via a "top-down approach" or a "bottom-up approach".

Reinforcement Learning
Learning through interaction with the environment.
(Figure: the agent-environment loop; the agent emits an action and receives the resulting state s and reward r from the environment.)
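To make the loop concrete, here is a minimal sketch of tabular Q-learning, the kind of RL algorithm used for the value functions later in the talk. The toy chain environment and all parameter values are illustrative assumptions, not the experimental setup from the paper.

```python
# Minimal tabular Q-learning sketch of the agent-environment loop:
# observe state s, pick action a, receive reward r and next state s'.
# The toy chain environment and all parameters are illustrative
# assumptions, not the experimental setup from the paper.
import random
from collections import defaultdict

N_STATES = 5                              # short chain: states 0..4
ACTIONS = [-1, +1]                        # step left or right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1     # learning rate, discount, exploration

Q = defaultdict(float)                    # Q[(state, action)] -> value estimate

def step(state, action):
    """Toy environment: reward 1.0 for reaching the right end, else 0."""
    nxt = max(0, min(N_STATES - 1, state + action))
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

def choose(state):
    """Epsilon-greedy action selection with random tie-breaking."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    best = max(Q[(state, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if Q[(state, a)] == best])

for episode in range(200):
    s = 0
    while s != N_STATES - 1:
        a = choose(s)
        s2, r = step(s, a)
        # One-step Q-learning update:
        # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        best_next = max(Q[(s2, a2)] for a2 in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = s2
```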

Motivated Learning
ML combines an internal goal creation system (GCS) with reinforcement learning (RL).
- Motivated learning (ML) is need-based motivation, goal creation, and learning in an embodied agent.
- The agent creates a hierarchy of goals based on primitive need signals.
- It receives internal rewards for satisfying its goals (both primitive and abstract).
- ML applies to embodied intelligence (EI) working in a hostile environment.

Motivated Learning – the main idea: intrinsic motivations created by learning machines.
(Figure: ML block diagram; a goal creation (GC) module turns state and reward signals into goals (motivations) that drive the actions of the RL module.)

How to motivate a machine? An intelligent agent learns how to survive in a hostile environment. We suggest that the hostility of the environment is the most effective motivational factor.

Assumptions
1. The ML agent is independent: it can act autonomously in its environment and is able to choose its own way of development.
2. The ML agent's interface to the environment is the same as an RL agent's.
3. The environment is hostile to the agent.
4. Hostility may be active or passive (depleted resources).
5. The environment is fully observable.

Goal Creation System
Neural, self-organizing, pain-based structures.
Goal creation scheme:
- a primitive pain is directly sensed;
- an abstract pain is introduced while solving a lower-level pain;
- a thresholded, curiosity-based pain.
Motivations and selection of a goal:
- motivations act like desires in a BDI agent;
- a winner-take-all (WTA) competition selects the motivation, and another WTA selects the goal.
(Figure: pain-based network with pain nodes P, sensory inputs S, bias nodes B, motivations M, learned weights, and a WTA layer.)
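A rough sketch of the two-stage WTA selection described above, assuming hypothetical pain names, goal names, and learned values (none of these identifiers come from the paper):

```python
# Two-stage winner-take-all (WTA) selection: the dominant pain becomes
# the active motivation, then a second WTA picks a goal for that pain.
# All names, values, and the pain-to-goal mapping are hypothetical.

pains = {"hunger": 0.8, "no_food_supplies": 0.5, "no_money": 0.3}

# Hypothetical mapping from pains to goals that can reduce them; abstract
# pains appear when solving a lower-level pain depletes a resource.
goals_for_pain = {
    "hunger": ["eat_food"],
    "no_food_supplies": ["buy_food_at_grocery", "grow_food"],
    "no_money": ["withdraw_at_bank"],
}
goal_value = {"eat_food": 1.0, "buy_food_at_grocery": 0.9,
              "grow_food": 0.4, "withdraw_at_bank": 0.7}  # learned estimates

def wta(scores):
    """Winner-take-all competition: the largest signal wins."""
    return max(scores, key=scores.get)

motivation = wta(pains)                                             # first WTA
goal = wta({g: goal_value[g] for g in goals_for_pain[motivation]})  # second WTA
print(f"dominant pain: {motivation} -> selected goal: {goal}")
```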

Internal goals: a simple linear hierarchy between different goals.
Hierarchy of resources (and possible agent's goals), from the least abstract (Food) to the most abstract (Office):

Sensor    | Motor    | Increases     | Decreases
Food      | Eat      | Sugar level   | Food supplies
Grocery   | Buy      | Food supplies | Money amount
Bank      | Withdraw | Money amount  | Bank account
Office    | Work     | Bank account  | Working possibilities

Resources are distributed all over the "grid world".
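The table can be encoded directly as a dependency chain; one possible (assumed) representation:

```python
# The linear resource hierarchy from the table, encoded as a dependency
# chain: each motor action increases one quantity while depleting the
# resource one level above it. Field names are assumptions for the sketch.
from collections import namedtuple

Link = namedtuple("Link", "sensor motor increases decreases")

HIERARCHY = [  # least abstract first, most abstract last
    Link("Food",    "Eat",      "Sugar level",   "Food supplies"),
    Link("Grocery", "Buy",      "Food supplies", "Money amount"),
    Link("Bank",    "Withdraw", "Money amount",  "Bank account"),
    Link("Office",  "Work",     "Bank account",  "Working possibilities"),
]

# Walking the chain shows why satisfying the primitive need (eating)
# eventually motivates the most abstract goal (working at the office).
for link in HIERARCHY:
    print(f"{link.motor} at {link.sensor}: raises {link.increases}, "
          f"depletes {link.decreases}")
```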

Modified "grid world"
This environment is complex, dynamically changing, and fully observable. The agent must locate resources and learn how to utilize them.
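A toy version of such a grid world, under assumed sizes and depletion rules, might look like this:

```python
# Toy "modified grid world": resources are scattered on a grid and are
# depleted by use, so the environment changes dynamically. Grid size,
# placement, and the depletion rule are illustrative assumptions.
import random

SIZE = 8
grid = {}  # (x, y) -> [resource name, remaining amount]
cells = random.sample([(x, y) for x in range(SIZE) for y in range(SIZE)], 4)
for cell, res in zip(cells, ["Food", "Grocery", "Bank", "Office"]):
    grid[cell] = [res, 5.0]              # each resource starts with 5 units

def use_resource(cell, amount=1.0):
    """Consume a resource; it vanishes once depleted (passive hostility)."""
    if cell not in grid:
        return None                      # nothing here (any more)
    grid[cell][1] -= amount
    if grid[cell][1] <= 0:
        del grid[cell]                   # depleted: the agent must adapt
    return grid.get(cell)
```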

Environment
Internal need signals meet the perception of resources: resources present in the environment can be used to satisfy the agent's needs, while a subjective sense of a "lack of resources" signals unmet needs. By discovering useful resources and their dependencies, the learned hierarchy of internal goals comes to express the environment's complexity. Resources are distributed all over the "grid world".

Relationships between internal goals
Relationships between internal goals do not have to form a linear hierarchy; they may constitute a tree structure or a complex network of resource dependencies. By discovering subsequent resources and their dependencies, the complexity of the internal goal network grows, BUT each system may have unique experiences (reflecting its personal history of development); a growth sketch follows below.
(Figure: designer-specified needs (need1, need2, need3) linked to top-level resources.)
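A minimal sketch of such growth, with hypothetical need and resource names:

```python
# Sketch of a goal network growing beyond a linear chain: discovering
# that an action on a resource relieves a pain adds an edge and creates
# a new abstract pain ("lack of that resource") below it. All names and
# the discovery order are hypothetical.
goal_network = {"need:energy": []}       # a designer-specified primitive need

def discover_dependency(pain, resource, action):
    """Record that `action` on `resource` relieves `pain`; introduce the
    corresponding abstract pain as a new node in the network."""
    abstract_pain = f"lack:{resource}"
    goal_network.setdefault(pain, []).append((action, resource, abstract_pain))
    goal_network.setdefault(abstract_pain, [])
    return abstract_pain

p = discover_dependency("need:energy", "food", "eat")  # primitive level
discover_dependency(p, "grocery", "buy")               # first abstract level
discover_dependency(p, "garden", "grow")               # a branch: tree, not chain
print(goal_network)
```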

Experiment that combines ML & RL
Every resource discovered by the agent becomes a potential goal and is assigned a value function "level". The goal creation system establishes new goals and switches the agent's activity between them, while the RL algorithm learns the value functions on the different levels (see the sketch below).
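One way to read this architecture, as a hedged sketch: a GCS stand-in chooses the active goal, and each goal owns its own tabular Q-learner. The function signatures and the pain-driven selection rule are assumptions for illustration.

```python
# Sketch of the combined ML + RL experiment: the goal creation system
# (GCS) switches the active goal, and each goal has its own value
# function ("level") learned by standard Q-learning. The selection rule
# and all signatures are illustrative assumptions.
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.9
q_tables = defaultdict(lambda: defaultdict(float))   # goal -> Q[(s, a)]

def select_goal(pains):
    """GCS stand-in: the dominant pain selects the active goal."""
    return max(pains, key=pains.get)

def q_update(goal, s, a, r, s2, actions):
    """One-step Q-learning update on the active goal's own table."""
    Q = q_tables[goal]
    best_next = max(Q[(s2, a2)] for a2 in actions)
    Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])

# Usage: each control step, re-run goal selection (another pain may now
# dominate, which is how the agent "abandons" a goal mid-run), then
# learn only on the value function attached to the current goal.
pains = {"hunger": 0.9, "no_supplies": 0.4}
goal = select_goal(pains)                            # -> "hunger"
q_update(goal, s=0, a="eat", r=1.0, s2=1, actions=["eat", "move"])
```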

Experiment Results
(Figure: switching between goals at the beginning of learning and at the end.) Initially the agent uses many iterations to reach a goal (red dots), and sometimes abandons a goal when another pain dominates. The final runs are shorter and more successful.

Experiment Results: comparing primitive pain levels of RL & ML
(Figure: moving average of the primitive pain signal.) Initially the RL agent learns better, but its performance deteriorates as the resources are depleted.

Experiment Results
Effectiveness in terms of cumulative reward, with the reward determined by the designer of the experiment. (Figure: cumulative reward over time.)

Reinforcement Learning vs. Motivated Learning

Reinforcement Learning                     | Motivated Learning
Single value function (various objectives) | Multiple value functions (one for each goal)
Measurable rewards                         | Internal rewards
Predictable                                | Unpredictable
Objectives set by designer                 | Sets its own objectives
Maximizes the reward; potentially unstable | Solves a minimax problem; always stable
Learning effort increases with complexity  | Learns better than RL in complex environments
Always active                              | Acts when needed

Conclusions
- The motivated learning method, based on a goal creation system, can improve the learning of autonomous agents in a special class of problems.
- ML is especially useful in complex, dynamic environments, where it works according to a learned hierarchy of goals.
- Individual goals use well-known reinforcement learning algorithms to learn their corresponding value functions.
- ML builds internal representations of useful environment percepts through interaction with the environment.
- ML switches the machine's attention and sets intended goals, becoming an important mechanism for a cognitive system.

"The real danger is not that computers will begin to think like man, but that man will begin to think like computers." – Sydney J. Harris

References
- J.A. Starzyk, J.T. Graham, P. Raif, and A.-H. Tan, "Motivated Learning for the Development of Autonomous Systems," Cognitive Systems Research, special issue on Computational Modeling and Application of Cognitive Systems, 12 January.
- J.A. Starzyk, P. Raif, and A.-H. Tan, "Motivated Learning as an Extension of Reinforcement Learning," Fourth International Conference on Cognitive Systems (CogSys 2010), ETH Zurich, January.
- J.A. Starzyk and P. Raif, "Motivated Learning Based on Goal Creation in Cognitive Systems," Thirteenth International Conference on Cognitive and Neural Systems, Boston University, May.
- J.A. Starzyk, "Motivation in Embodied Intelligence," in Frontiers in Robotics, Automation and Control, I-Tech Education and Publishing, Oct. 2008, pp.