Flexible Shaping: How learning in small steps helps. Hierarchical Organization of Behavior, NIPS 2007. Kai Krueger and Peter Dayan, Gatsby Computational Neuroscience Unit.

Presentation transcript:

Flexible Shaping: How learning in small steps helps
Hierarchical Organization of Behavior, NIPS 2007
Kai Krueger and Peter Dayan, Gatsby Computational Neuroscience Unit

Outline
- Introduction: learning may require external guidance; shaping as a concept
- Set-up: 12-AX task, LSTM network, shaping procedure
- Results: simple shaping; when does shaping help most?; flexibility to adapt to variations
- Conclusions, issues and future work: rules and habits

Introduction
- Learning essential for flexibility: trial and error, or external guidance
- External guidance: one-shot teaching by verbal explanation of abstract rules; imitation; shaping
- Guidance critical for complex behavior: branching, working memory, rapid changes

Shaping
- A method of successive approximations (Skinner 1938)
- Key features: external alteration of reward contingencies; withdrawal of intermittent rewards
- Creates behavioral units, e.g. lever pressing by a rat
- Separates time scales / branching points by providing separate stages in shaping
- Used ubiquitously (and implicitly) in animal experiments

12-AX task: demo (the target rule is sketched below)
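
Since the demo itself cannot be reproduced in a transcript, here is a minimal Python sketch of the 12-AX target rule: the digits 1/2 set the outer-loop context, and X (after A, under 1) or Y (after B, under 2) are targets. The symbol handling and the function name are illustrative assumptions, not code from the paper.

```python
# Minimal sketch of the 12-AX target rule (assumed symbol set: 1, 2, A, B, C, X, Y, Z).
def target_responses(stream):
    """Return the correct L/R response for each symbol in the stream."""
    last_digit = None    # outer-loop context: the most recent 1 or 2
    prev_letter = None   # inner-loop (1-back) context: the most recent letter
    responses = []
    for s in stream:
        if s in ("1", "2"):
            last_digit = s
            responses.append("L")
            continue
        if s == "X" and prev_letter == "A" and last_digit == "1":
            responses.append("R")   # target under the 1-rule
        elif s == "Y" and prev_letter == "B" and last_digit == "2":
            responses.append("R")   # target under the 2-rule
        else:
            responses.append("L")   # everything else is a non-target
        prev_letter = s
    return responses

# 1 A X is a target, the same A X under a 2 is not, B Y under a 2 is.
print(target_responses(list("1AX2AXBY")))  # ['L', 'L', 'R', 'L', 'L', 'L', 'L', 'R']
```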

LSTM network
- Long Short-Term Memory (Hochreiter and Schmidhuber 1997): a 3-layer recurrent neural network
- Provides built-in mechanisms for working memory and gating (input, output and forget)
- Abstract, over-simplified model of PFC; a basis to motivate PBWM (O'Reilly et al.)
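
To make the gating machinery concrete, below is a minimal NumPy sketch of one LSTM memory cell with input, forget and output gates. The weight shapes, the absence of bias terms, and the initialisation are simplifications of my own; they are not the network configuration used in the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMCell:
    """One LSTM memory cell: gated writing, forgetting and reading of a cell state."""
    def __init__(self, n_in, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        shape = (n_hidden, n_in + n_hidden)
        # One weight matrix per gate plus one for the candidate cell input.
        self.W_i, self.W_f, self.W_o, self.W_c = (
            0.1 * rng.standard_normal(shape) for _ in range(4))
        self.h = np.zeros(n_hidden)   # hidden state (the gated cell output)
        self.c = np.zeros(n_hidden)   # cell state (the persistent "working memory")

    def step(self, x):
        z = np.concatenate([x, self.h])
        i = sigmoid(self.W_i @ z)     # input gate: how much to write
        f = sigmoid(self.W_f @ z)     # forget gate: how much old memory to keep
        o = sigmoid(self.W_o @ z)     # output gate: how much to expose downstream
        self.c = f * self.c + i * np.tanh(self.W_c @ z)
        self.h = o * np.tanh(self.c)
        return self.h
```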

Shaping procedure
- Teach 12-AX as successive approximations
- Separate WM timescales: long (1 / 2), short (AX / BY)
- Learning in 7 stages; last stage: the full 12-AX
- Resource allocation currently done by hand: each stage learned into a new memory block, all other memory blocks disabled; provides separation / no interference
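
A hedged sketch of the hand-crafted resource allocation described above: each shaping stage trains into a fresh memory block while the blocks learned in earlier stages are kept fixed (the slide's "disabled"), which is what provides the separation between stages. The names `make_block`, `train_until_criterion` and `freeze` are hypothetical placeholders, not an API from the paper.

```python
def shape(stages, make_block, train_until_criterion):
    """Train one new memory block per shaping stage; earlier blocks stay fixed."""
    blocks = []
    for stage_task in stages:              # e.g. 7 stages, the last being the full 12-AX
        new_block = make_block()           # allocate a fresh LSTM memory block by hand
        for b in blocks:
            b.freeze()                     # previously learned blocks are not updated
        train_until_criterion(new_block, blocks, stage_task)
        blocks.append(new_block)
    return blocks                          # the assembled network for the full task
```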

Simple shaping
- Improvement in learning times: 8-fold decrease (only final stage); significantly better (including complete training); median: 13 epochs, min: 8 epochs
- Need the 4 stages of shaping for 1 and 2
- High variance in shaping times

What makes shaping work
- Robustness to additional structure: irrelevant experience; related and unrelated tasks / inputs
- Resource allocation: interference between tasks => no benefits

Shaping: when is it useful?
- Can shaping prevent the scaling of learning time with task complexity?
- One aspect of complexity is temporal credit assignment: increasing the outer loop length => higher temporal complexity
- Results: training time still increases, but scales much more slowly; increasing complexity => shaping more important

Rule abstraction
- Rule abstraction: flexibility to cope with a change in statistics
- Train on the base 12-AX task (loop length 4)
- Test with variations: loop lengths 4, 5, 12, 40; learning disabled
- Should perform perfectly: the abstract rules have not changed
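
A sketch of the rule-abstraction protocol implied by this slide: train at loop length 4, switch learning off, and re-evaluate at longer loop lengths. `train`, `evaluate` and `make_12ax_task` are hypothetical helpers standing in for the paper's set-up.

```python
def rule_abstraction_test(network, train, evaluate, make_12ax_task):
    """Train on loop length 4, then test at longer loops with learning disabled."""
    train(network, make_12ax_task(loop_length=4))
    network.learning_enabled = False       # weights frozen at test time
    results = {}
    for loop_length in (4, 5, 12, 40):     # only the loop-length statistics change
        results[loop_length] = evaluate(network, make_12ax_task(loop_length=loop_length))
    return results                         # unchanged abstract rules => should stay near-perfect
```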

Generalisation
- Generalisation: cope with as-yet-unseen data (inner-loop combinations)
- Train the 12-AX task without AZ and CX; test performance on the full task
- Only 7 combinations seen in training: only one valid generalisation?
- Mixed results: differences in emphasis (1-back / 0-back); overall, shaping still better
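
A sketch of the hold-out behind the generalisation test: the A-Z and C-X inner-loop combinations never appear during training, leaving 7 of the 9 cue-probe pairs. The sampling helper is an illustrative assumption, not the paper's stimulus generator.

```python
import numpy as np

HELD_OUT = {("A", "Z"), ("C", "X")}   # inner-loop pairs never seen during training

def sample_inner_pairs(n, training=True, rng=None):
    """Sample n inner-loop cue-probe pairs, excluding held-out pairs for training."""
    rng = rng or np.random.default_rng(0)
    pairs = [(c, p) for c in "ABC" for p in "XYZ"]        # 9 combinations in total
    if training:
        pairs = [p for p in pairs if p not in HELD_OUT]   # 7 remain for training
    return [pairs[i] for i in rng.integers(len(pairs), size=n)]
```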

Reversal learning
- Reverse the stimulus-rule association; shape all components needed
- Repeatedly reverse (after 500 epochs) and assess the learning of reversals; identify flexibility to perform reversals
- Unshaped: mostly fails; shaped: succeeds more often
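
A sketch of the reversal protocol described here: the stimulus-rule association is reversed repeatedly, with 500 epochs of retraining after each reversal. `swap_rule_mapping` and `train_epochs` are hypothetical placeholders, and the number of reversals is an arbitrary choice for illustration.

```python
def reversal_schedule(network, task, train_epochs, swap_rule_mapping,
                      n_reversals=10, epochs_per_reversal=500):
    """Repeatedly reverse the stimulus-rule association and retrain."""
    accuracies = []
    for _ in range(n_reversals):
        swap_rule_mapping(task)            # e.g. the 1-rule and the 2-rule swap roles
        accuracies.append(train_epochs(network, task, epochs_per_reversal))
    return accuracies                      # shaped networks recover more often than unshaped ones
```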

Conclusions
- Shaping works: it reduces learning times
- Helps learning over long time delays by separating the time scales of actions and recombining behavioral units into sequences
- Improves abstraction and separation; increases flexibility to reversals
- Take-home message: sequential and transfer learning need to be taken more into account when looking at learning architectures
- Still issues to solve, though

Limitations
- Resource allocation: the prime computational issue; done by hand (homunculus); ideas to automate: compute responsibilities as in MOSAIC
- Experimental data: no published data on learning 12-AX; interesting manipulations: loop length, target frequency, …; natural grouping of the alphabet

Future work
- Still based on habitual learning => no instant reprogramming
- Need additional mechanisms: more explicit rules, variable substitution
- Bilinear rules framework (Dayan 2007): recall, match, execute
- Close interaction between habitual and rule-based learning: rules supervise habit learning; habits form the basis of rule execution
- Results in a task grammar?

Questions? Thank you

Shaping in computational work
- Surprisingly few implementations; instead: elaborate learning frameworks to learn in one go
- Learning grammars: simple examples first (Elman 1993); self-simplification by network limitations
- Robotics: learning in a grid world (Singh 1992); (Dorigo and Colombetti 1998); robot vs. animal shaping (Savage 1998)

Shaping procedure
- Remembering 1 and 2: anything following a 1 => L, anything following a 2 => R; progressively increase the loop length
- 1-back memory for A / B: unconditional CPT-AX; 2 separate stages for A and B

LSTM network

Examples of learning in LSTM

Shaping at work

Rule abstraction

Reversal learning continued

Alternative shaping paths
- Not all shaping paths have equal properties
- Trivial shaping path: each stage uses the full 12-AX rules, increasing the loop length from 1 through 4
- Similar training time, but bad generalisation

Rules
- Individual rules: recall, match, execute