Intrinsically Motivated Collective Motion
Henry Charlesworth (H.Charlesworth@warwick.ac.uk)
Supervisor: Professor Matthew Turner

Introduction

It has been suggested that in many situations a sensible principle to follow is to make decisions so as to maximize the number of choices that are available to you in the future, i.e. to keep your options open as much as possible. This is an example of an intrinsic motivation for behaviour: it offers an incentive to act even without a specific task to complete or an immediate external reward to gain. The idea is that this kind of behaviour does not help solve any one particular problem, but can be beneficial across a wide range of possible scenarios. One attempt to formalize this idea is the "empowerment" framework. Defined in the language of information theory, empowerment is essentially a measure of how much information an agent could potentially inject into its environment and then detect with its own sensors at a later time; it quantifies how much influence or control an agent has over its future states. For discrete sets of actions and sensor states in a deterministic environment (where each action leads to a perfectly predictable outcome), it simply reduces to the logarithm of the number of unique sensor states which can be accessed at some fixed time in the future. Our idea was to apply this to a group of agents equipped with simple visual sensors and study the resultant motion of the group.

Model

Consider a group of agents of finite size. Each has a number of visual sensors that detect the angular projection of the other agents in the "flock", registering 1 if the sensor is more than half full and 0 otherwise, so that a visual state is a vector of 0s and 1s. Each agent chooses the currently available action that leads to the largest number of unique visual states in the future. Future branches in which collisions occur do not contribute to this count.
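In a deterministic setting this counting principle is simple to state in code. The sketch below is illustrative only: a single agent on a grid, with blocked cells standing in for collision branches and a coarse-grained position sensor playing the role of the visual state vector; none of these details come from the poster itself.

```python
import math
from itertools import product

# Five discrete actions (here: four moves plus "stay"); the specific
# action set and blocked cells are made up for illustration.
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0), (0, 0)]
BLOCKED = {(1, 0), (0, 2)}  # stand-in for collision branches

def sensor(state):
    # Hypothetical sensor: coarse-grains position into 2x2 cells.
    x, y = state
    return (x // 2, y // 2)

def empowerment(state, tau):
    """In a deterministic environment, empowerment reduces to the log of
    the number of unique sensor states reachable tau steps ahead."""
    reachable = set()
    for seq in product(ACTIONS, repeat=tau):
        s, collided = state, False
        for ax, ay in seq:
            s = (s[0] + ax, s[1] + ay)
            if s in BLOCKED:      # collision branches do not contribute
                collided = True
                break
        if not collided:
            reachable.add(sensor(s))
    return math.log2(len(reachable))

def best_action(state, tau):
    """Choose the currently available action that leads to the largest
    number of unique future sensor states."""
    def options_after(a):
        nxt = (state[0] + a[0], state[1] + a[1])
        return -1.0 if nxt in BLOCKED else empowerment(nxt, tau - 1)
    return max(ACTIONS, key=options_after)
```

Because the environment is deterministic, no probability distributions are needed: maximizing empowerment is exactly maximizing the count of distinct reachable sensor states.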
At each time step the agents choose one of five possible actions, including reorientations and speed changes. In effect, agents move so as to maximize the control they have over the visual states that are accessible some number of time steps τ into the future. When modelling future trajectories, each agent assumes that the others will continue to move in a straight line at speed v0. This is obviously not true in general, but it is the simplest assumption that can be made, and since the resultant flock turns out to be highly ordered and to travel with v ≈ v0, there is some level of self-consistency anyway.

Results

The order parameter φ = (1/N) |Σ_i v_i / |v_i|| measures how aligned the flock is. The average opacity is the average fraction of an agent's visual field which is filled. Velocity fluctuations about the flock mean are u_i = v_i − ⟨v⟩, with correlation function C(r) = ⟨u_i · u_j⟩ taken over pairs of agents separated by distance r.

The primary result is that this algorithm leads to a robust, highly ordered flock with sensibly regulated density and marginal opacity, i.e. the average opacity of the individuals is close to ½. All of these are features associated with real flocks of starlings, and the behaviour is remarkably robust to variations in the model parameters. Looking at the correlations in the fluctuations of the velocity around the flock mean, we find that the correlation length (the distance at which the correlation function decays to zero) scales linearly with the size of the flock. This scale-free behaviour has also been observed in real flocks of starlings.

Learning a Heuristic

Although this algorithm produces interesting collective motion with many of the features associated with real flocks of birds, it is quite complicated: each decision requires modelling a large number of future trajectories, and so is certainly not similar to any kind of calculation a real organism would be making.
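The order parameter and correlation function defined in the Results section can be evaluated directly. The sketch below is illustrative: it uses NumPy on a small synthetic flock, and the binning of C(r) by pairwise distance is an assumption about how the average over pairs is taken.

```python
import numpy as np

def order_parameter(v):
    """phi = (1/N) | sum_i v_i / |v_i| | : 1 for a perfectly aligned
    flock, near 0 for a disordered one. v is an (N, d) velocity array."""
    unit = v / np.linalg.norm(v, axis=1, keepdims=True)
    return np.linalg.norm(unit.mean(axis=0))

def velocity_correlation(pos, v, r_bins):
    """C(r) = < u_i . u_j > over pairs whose separation falls in each bin,
    with u_i = v_i - <v> the fluctuation about the flock mean velocity."""
    u = v - v.mean(axis=0)
    d = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    dots = u @ u.T
    i, j = np.triu_indices(len(pos), k=1)   # each pair counted once
    out = np.empty(len(r_bins) - 1)
    for k in range(len(out)):
        mask = (d[i, j] >= r_bins[k]) & (d[i, j] < r_bins[k + 1])
        out[k] = dots[i, j][mask].mean() if mask.any() else np.nan
    return out
```

The "correlation length" mentioned above would then be read off as the first bin distance at which C(r) crosses zero.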
We wanted to see whether it would be possible to use this model to train a heuristic that mimics this behaviour using only the information currently available to the agent. We did this by training a neural network to classify the "correct" moves made in a particular situation by the full empowerment-maximizing algorithm: the network receives the current (and previous) visual state vectors as input, and outputs an integer between 1 and 5 representing the move made by the full algorithm. In this way we were able to train a neural network that produces qualitatively and quantitatively very similar behaviour.

Conclusions

A model based on agents moving so as to maximize the control they have over their environment produces robust, highly ordered collective motion. Many of the features that emerge are associated with real flocks of starlings, including marginal opacity and scale-free correlations. We have also been able to train a neural network that reproduces similar behaviour without having to carry out complicated calculations over many possible future trajectories, demonstrating that heuristics mimicking this "empowerment maximizing" behaviour can be learned. As such, this could perhaps be a useful principle for understanding a range of real animal behaviours.

References

Klyubin A, Polani D, Nehaniv C (2005), "Empowerment: A Universal Agent-Centric Measure of Control"
Prokopenko M (ed.) (2014), "Guided Self-Organization: Inception"
Cavagna A et al. (2010), "Scale-free Correlations in Starling Flocks"
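The imitation step described in the conclusions can be sketched with a minimal classifier. Everything below is a stand-in: a one-layer softmax network in place of the poster's (unspecified) architecture, and an arbitrary fixed linear "teacher" rule generating the move labels in place of the real empowerment-maximizing algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
N_SENSORS, N_MOVES = 20, 5

# Stand-in "teacher": labels are the argmax of a fixed random linear map,
# playing the role of the move chosen by the full algorithm.
W_true = rng.normal(size=(N_SENSORS, N_MOVES))
X = rng.integers(0, 2, size=(2000, N_SENSORS)).astype(float)  # binary visual states
y = (X @ W_true).argmax(axis=1)                               # "correct" moves, 0..4

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# One-layer softmax classifier trained by full-batch gradient descent
# on the cross-entropy loss.
W = np.zeros((N_SENSORS, N_MOVES))
b = np.zeros(N_MOVES)
for _ in range(500):
    p = softmax(X @ W + b)
    p[np.arange(len(y)), y] -= 1.0     # gradient of cross-entropy w.r.t. logits
    W -= 0.1 * (X.T @ p) / len(y)
    b -= 0.1 * p.mean(axis=0)

accuracy = (softmax(X @ W + b).argmax(axis=1) == y).mean()
```

Once trained, choosing a move is a single matrix multiply rather than a search over many future trajectories, which is the point of learning the heuristic.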