Interaction settings for measuring (social) intelligence in multi-agent systems
Javier Insa-Cabrera, José Hernández-Orallo
Dep. de Sistemes Informàtics i Computació, Universitat Politècnica de València
II Workshop ReteCog INTERACTION, Zaragoza, 17-18 January, 2013

OUTLINE
1. Introduction
2. Interactive general tests
3. Some findings and caveats
4. Configurations
5. Difficulty estimation
6. Conclusions

1. INTRODUCTION
Why are tests (for machines) good methodologically?
- An intelligence test can be seen as a definition of intelligence.
  - Note that a definition of intelligence does not guarantee an intelligence test.
- Cognitive tests can be refuted by experimentation.
  - Especially those that are universal, since they must put very different kinds of subjects on the same scale.
- Cognitive tests can be used to evaluate systems and assess the progress of a discipline.
  - They will become more and more necessary in the future.
- They help us formulate new questions and address new challenges.

1. INTRODUCTION
Can we construct ‘universal’ intelligence tests?
Project anYnt (Anytime Universal Intelligence):
- Any kind of system (biological, non-biological, human).
- Any system now or in the future.
- Any moment in its development (child, adult).
- Any degree of intelligence.
- Any speed.
- Evaluation can be stopped at any time.

1. INTRODUCTION
Intelligence as a cognitive ability:
- General intelligence: capacity to perform well in any kind of environment.
- Social intelligence: ability to perform well in an environment interacting with other agents.
  - Related to, but different from, collective intelligence or emotional intelligence.
Why is social intelligence so important?
- It has been shown to be one of the distinctive traits of intelligence in humans and other animals.
  - Herrmann, Call, Hernández-Lloreda, Hare, Tomasello, “Humans have evolved specialized skills of social cognition: The cultural intelligence hypothesis”, Science, 2007.
- It shows the ability to create “mind models”.

1. INTRODUCTION
Approach:
- Tests must be universal.
- Tests must have a formal background of what we are measuring.
- Following the “tradition” of tests based on compression, Kolmogorov complexity and related ideas:
  - Turing Test enhanced with compression (Dowe and Hajek, “A non-behavioural, computational extension to the Turing Test”, ICCIMA, 1998).
  - C-tests: intelligence tests based on Kolmogorov complexity (Hernández-Orallo, “Beyond the Turing Test”, J. Logic, Language & Information, 2000).

2. INTERACTIVE GENERAL TESTS
Universal Intelligence (Legg and Hutter, “Universal intelligence: A definition of machine intelligence”, 2007):
- An interactive extension of C-tests.
- Agents are evaluated in a classical reinforcement learning setting.
[Figure: agent π interacting with environment μ through actions a_i, observations o_i and rewards r_i]
- The choice of environments is made, and the results are averaged, using a universal distribution.
- This leads to the following definition: \Upsilon(\pi) = \sum_{\mu \in E} 2^{-K(\mu)} V_\mu^\pi, where V_\mu^\pi is the expected reward of agent π in environment μ and K(μ) is the Kolmogorov complexity of μ. That is, performance over a universal distribution of environments.
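As a rough illustration of "performance over a universal distribution of environments", the sketch below weights each environment's expected reward by 2^-K. The environment names, complexities and rewards are made-up toy values, and real Kolmogorov complexity is not computable, so this only shows the shape of the aggregation.

```python
def universal_score(expected_rewards, complexities):
    """Sum of expected rewards, each weighted by 2^-K of its environment."""
    return sum(2.0 ** -complexities[env] * reward
               for env, reward in expected_rewards.items())

# Hypothetical toy data: assumed complexities (in bits) and expected rewards.
complexities = {"env_a": 1, "env_b": 2, "env_c": 3}
expected_rewards = {"env_a": 1.0, "env_b": 0.5, "env_c": 0.0}

score = universal_score(expected_rewards, complexities)
```

Simple environments dominate the score because their weights are exponentially larger.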

2. INTERACTIVE GENERAL TESTS
Anytime Intelligence Test (Hernández-Orallo and Dowe, “Measuring universal intelligence: Towards an anytime intelligence test”, Artificial Intelligence, 2010):
- An interactive setting following (Legg and Hutter 2007) which addresses:
  - Issues about the difficulty of environments.
  - The definition of discriminative environments.
  - Finite samples and (practical) finite interactions.
  - Time (speed) of agents and environments.
  - Reward aggregation, convergence issues.
  - Anytime and adaptive application.

2. INTERACTIVE GENERAL TESTS
An environment class (Hernández-Orallo, “A (hopefully) Unbiased Universal Environment Class for Measuring Intelligence of Biological and Artificial Systems”, Artificial General Intelligence, 2010):
- Spaces are defined as fully connected graphs.
- Actions are the arrows in the graphs.
- Observations are the ‘contents’ of each vertex/cell in the graph.
[Figure: example space]
- Agents can perform actions inside the space.
- Rewards: two special agents, Good (⊕) and Evil (⊖), are responsible for the rewards, which they leave as a trail.
- With regular graphs the space resembles a cellular automaton (and other computational models).
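The space described above (cells as vertices, actions as labelled arrows, observations as cell contents) can be sketched as a small data structure. The class name `Space`, the action labels and the three-cell ring are illustrative assumptions, not the environment class from the cited paper.

```python
from dataclasses import dataclass, field

@dataclass
class Space:
    # edges[cell][action] gives the destination cell of that arrow.
    edges: dict
    # contents[cell] is the set of objects/agents currently in the cell.
    contents: dict = field(default_factory=dict)

    def move(self, cell, action):
        """Follow the arrow labelled `action`; stay put if it does not exist."""
        return self.edges[cell].get(action, cell)

    def observe(self, cell):
        """Observation = sorted contents of the cell."""
        return sorted(self.contents.get(cell, set()))

# Hypothetical three-cell ring with actions "l" (left) and "r" (right).
space = Space(
    edges={0: {"r": 1, "l": 2}, 1: {"r": 2, "l": 0}, 2: {"r": 0, "l": 1}},
    contents={1: {"good"}},
)
position = space.move(0, "r")
```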

2. INTERACTIVE GENERAL TESTS
- With the test definitions and this environment class, we have been evaluating the ‘general intelligence’ of different systems.
- Experiments concluded that the test prototype is not universal (Insa-Cabrera et al., “Comparing Humans and AI agents”, Artificial General Intelligence, 2011).
  - Environments rarely contain social behaviour.
- Environment distributions should be reconsidered:
  - Darwin-Wallace distribution (Hernández-Orallo et al., “On more realistic environment distributions for defining, evaluating and developing intelligence”, Artificial General Intelligence, 2011).

2. INTERACTIVE GENERAL TESTS
Towards social tests:
- Goal: modify the setting to include some social behaviour.
  - See whether social behaviour better discriminates between humans and machines.
- How:
  - Introduce simple agents in the environments.
  - Convert the environment into a truly Multi-Agent System (MAS).
  - Examine the impact of other agents on agent performance using competitive and cooperative scenarios.

3. SOME FINDINGS AND CAVEATS
Agents compared:
- Reinforcement Learning algorithms:
  - Q-learning
  - SARSA
  - QV-learning
- Simple algorithm:
  - Random
[Figure: results when alone in the environment (only with ⊕ and ⊖)]
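For readers less familiar with the agents compared, the following sketch contrasts the standard tabular update rules of Q-learning and SARSA. The learning rate, discount factor and table layout are illustrative defaults and are not taken from the experiments reported here.

```python
from collections import defaultdict

def q_learning_update(q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action in the next state."""
    best_next = max(q[(s_next, b)] for b in actions)
    q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])

def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually taken next."""
    q[(s, a)] += alpha * (r + gamma * q[(s_next, a_next)] - q[(s, a)])

# Hypothetical tables where action "l" in state 1 is already known to be good.
q_table = defaultdict(float, {(1, "l"): 1.0})
sarsa_table = defaultdict(float, {(1, "l"): 1.0})

q_learning_update(q_table, 0, "r", 1.0, 1, ["l", "r"])
sarsa_update(sarsa_table, 0, "r", 1.0, 1, "r")
```

The two tables end up different because SARSA follows the (possibly exploratory) next action, while Q-learning assumes the greedy one.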

3. SOME FINDINGS AND CAVEATS
Competition:
- All the agents compete for rewards.
[Figure: competition results with four agents, including the random agent]
[Figure: competition results with three agents, without the random agent]

3. SOME FINDINGS AND CAVEATS
Cooperation:
- The agents receive the average of the obtained rewards.
[Figure: cooperation results with four agents, including the random agent]
[Figure: cooperation results with three agents, without the random agent]
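The cooperative scheme, in which every agent receives the average of the rewards obtained, can be sketched as follows. The agent names and reward values are hypothetical.

```python
def cooperative_rewards(individual_rewards):
    """Give every agent the average of the rewards obtained by all agents."""
    average = sum(individual_rewards.values()) / len(individual_rewards)
    return {agent: average for agent in individual_rewards}

# Hypothetical rewards obtained individually in one step.
obtained = {"q_learning": 1.0, "sarsa": 0.5, "random": 0.0}
shared = cooperative_rewards(obtained)
```

In the competitive scenario each agent would simply keep its own entry of `obtained`.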

3. SOME FINDINGS AND CAVEATS
Teams:
- Two teams (2 Q-learning vs 2 SARSA) compete for rewards.
- Competition and cooperation.

3. SOME FINDINGS AND CAVEATS
Environment complexity:
- As usual, we use an approximation to K (e.g., a Lempel-Ziv approximation).
- Compared with previous experiments without other agents, a correlation with complexity is still present, but the trends are much weaker.
[Figure: competitive scenario]
[Figure: cooperative scenario]
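One common way to approximate K in practice is the length of a compressed description. The sketch below uses Python's `zlib` (DEFLATE, which is based on LZ77) as a stand-in; the choice of compressor and the string encoding of the description are assumptions, not necessarily what was used in the experiments.

```python
import random
import zlib

def approx_complexity(description: str) -> int:
    """Bytes in the zlib-compressed description: an upper-bound proxy for K."""
    return len(zlib.compress(description.encode("utf-8"), 9))

# A highly regular description versus a pseudo-random one of the same length.
regular = "ab" * 500
rng = random.Random(0)
irregular = "".join(rng.choice("abcdefghijklmnopqrstuvwxyz") for _ in range(1000))

regular_k = approx_complexity(regular)
irregular_k = approx_complexity(irregular)
```

The regular string compresses far better, so it receives a much lower complexity estimate.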

3. SOME FINDINGS AND CAVEATS
Findings:
- The inclusion of other agents (even random ones) makes the evaluated agents perform worse.
  - The cost matrices of RL algorithms grow.
  - Algorithms should learn to deal with ‘noise’.
- Complexity increases with the inclusion of social behaviour.
  - The complexity of the environment is more closely related to the complexity of the other agents.
  - We first need to calculate the complexity (or intelligence) of the other agents included in the environment.
  - The overall complexity gets too large (which also means that its approximation is much more difficult).

3. SOME FINDINGS AND CAVEATS
We need a more minimalistic setting:
- The use of complex agents such as Q-learning (hundreds of lines of code) or a random agent makes the connection between difficulty and environment complexity (including the agents) much more intricate.
- We need to simplify the setting and consider simple agents:
  - We analyse several configurations next.
- We need to derive and analyse the difficulty in a different way:
  - We analyse several distributional approaches.

4. CONFIGURATIONS
Multi-agent environment configurations:
- We look for configurations which are minimalist.
- The behaviour is given mostly by the other agents, not by the environment.
- Agents can have simple action, perception and reward schemas.
- Simple agents may be easy to define: colliders, evaders, random, etc.

4. CONFIGURATIONS
Space and actions:
- We keep the previous configuration: a graph where the edges are the actions and the vertices are the cells.
- Not necessarily regular (as before).
Observations:
- Agents see some of the cells (e.g., adjacent cells and their content).
- Agents see who is in each cell (and not only how many).
  - This is important for social intelligence, since agents need to identify different mind models.

4. CONFIGURATIONS
Rewards:
- We consider all agents equal. There are no special agents ⊕ and ⊖ generating rewards.
- No heavens, no hells:
  - The number of rewards shared by / included in the system must be finite and remain constant.
  - There must always be a way to prevent one agent from getting all the rewards.
- We relax the ‘balancedness’ property (random agents score 0), which is difficult to ensure in general.
  - Now rewards are always positive.

4. CONFIGURATIONS
Rewards can be seen:
- As a function of the observations (adjacent cells).
  - Leads to trivial equilibria.
  - For instance, proportional to the number of agents around.
  - Complexity would depend on how the rewards propagate.
  - The proportion of rewards is difficult to control.
- As objects (units) in cells that can be eaten and later disposed of.
  - With a fixed number of rewards in the space, we always have the same “total energy”. The agents compete for this total amount.
  - The theoretical maximum and minimum are clear.
Rewards should be linked to the behaviour of the other agents, so that agents influence each other's rewards (goals) directly, which ensures that behaviour is social.

4. CONFIGURATIONS
Current configuration. Definition:
- Agents can be arranged into teams of at least one agent. The team of agent a is denoted by T(a).
- There is a fixed number of indivisible (energy) units.
- Start:
  - The number of units each agent a stores is denoted by U(a). Agents are originally empty: U(a) = 0.
  - The units are originally spread at random over the space cells. The number of units in cell c is denoted by U(c), and the number of agents in c by A(c).
- Reward rule:
  - Let A(c) = n and U(c) = m. If m ≥ n, then for each agent a in c we have U(a) ← U(a) + 1.
  - If m < n, then for each agent a in c we have U(a) ← U(a) […] U(c) / A(c).
  - At each step, each agent's reward is the sum of the units carried by all members of its team, divided by the number of agents in the team.
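A minimal sketch of the parts of this configuration that are stated unambiguously: the m ≥ n branch of the reward rule and the per-team reward. The function names are hypothetical, the removal of collected units from the cell is an assumption made to keep the total number of units constant, and the m < n branch is deliberately left as a no-op placeholder because its exact form is not given here.

```python
def collect(cell_units, agents_in_cell, stored):
    """Apply the m >= n branch to one cell and return the units left in it.

    Placeholder assumption: when m < n (or the cell is empty of agents)
    nothing changes, since that branch is not specified in this sketch.
    """
    n, m = len(agents_in_cell), cell_units
    if n == 0 or m < n:
        return cell_units
    for agent in agents_in_cell:
        stored[agent] += 1          # U(a) <- U(a) + 1
    return m - n                    # assumption: collected units leave the cell

def team_rewards(stored, team_of):
    """Each agent's reward: units carried by its team divided by team size."""
    totals, sizes = {}, {}
    for agent, team in team_of.items():
        totals[team] = totals.get(team, 0) + stored[agent]
        sizes[team] = sizes.get(team, 0) + 1
    return {agent: totals[team] / sizes[team] for agent, team in team_of.items()}

# Hypothetical example: two teams, one cell holding 3 units and 2 agents.
stored = {"a1": 0, "a2": 0, "b1": 0}
team_of = {"a1": "A", "a2": "A", "b1": "B"}
left_in_cell = collect(3, ["a1", "b1"], stored)
rewards = team_rewards(stored, team_of)
```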

4. CONFIGURATIONS
Current configuration. Properties:
- If agents are optimal, some equilibria appear.
- For suboptimal and diverse populations of agents, interesting strategies emerge.
- These strategies depend mostly on how the other agents behave.
- Capturing the behaviour of the other agents is crucial for succeeding in this game.
- Co-operation can take place, especially when using teams.

4. CONFIGURATIONS
Defining and playing with simple agents:
- Examples:
  - Random agents.
  - Colliders: go to the nearby cell with the most agents.
  - Evaders: go to the nearby cell with the fewest agents.
  - Gluttons: go to the nearby cell with the most energy.
  - Regular: follow regular patterns.
  - …
- A very simple agent description language has been designed to describe most of them.
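The simple agents listed above can each be written in a line or two. The sketch below uses plain Python functions with a hypothetical signature (neighbouring cells, agent counts per cell, energy units per cell); it is not the agent description language mentioned on the slide.

```python
import random

def collider(neighbours, agents, energy):
    """Go to the nearby cell with the most agents."""
    return max(neighbours, key=lambda cell: agents.get(cell, 0))

def evader(neighbours, agents, energy):
    """Go to the nearby cell with the fewest agents."""
    return min(neighbours, key=lambda cell: agents.get(cell, 0))

def glutton(neighbours, agents, energy):
    """Go to the nearby cell with the most energy units."""
    return max(neighbours, key=lambda cell: energy.get(cell, 0))

def random_agent(neighbours, agents, energy, rng=random):
    """Go to a nearby cell chosen uniformly at random."""
    return rng.choice(neighbours)

# Hypothetical local view: three neighbouring cells.
neighbours = [1, 2, 3]
agents = {1: 0, 2: 3, 3: 1}
energy = {1: 5, 2: 0, 3: 2}
```

With this local view, the collider moves to cell 2, while the evader and the glutton both move to cell 1. Ties are broken in favour of the first cell listed.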

5. DIFFICULTY ESTIMATION
Difficulty is not complexity:
- An environment full of very complex agents can still be very benevolent and easy.
  - The other agents may not compete for the rewards.
  - There may be shortcuts leading to very simple (and possibly non-social) good policies in very complex and chaotic situations.
- Furthermore, using the complexity K of the environment (and everything it contains), as done for non-social environments, only leads to an upper bound on the difficulty D.
  - Again, this relation is only unidirectional (a difficult environment must be complex): D is high implies K is high.
  - But with other agents, this is a very loose upper bound and is not useful as a definition or approximation of D.

5. DIFFICULTY ESTIMATION
A solution-centred view of difficulty:
- When do we say that a (social) environment is easy?
  - What are good results?
  - If there are many agents (policies) leading to good R_i, then we say that the environment is easy.
An environment (or a task) is said to be easy when simple policies get good results.
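One possible reading of "easy when simple policies get good results" is to take, as a difficulty indicator, the complexity of the simplest policy that reaches a given reward threshold. This is only an illustrative interpretation: the function name, the threshold and the sample values are assumptions.

```python
def solution_centred_difficulty(policies, threshold):
    """Smallest complexity among policies whose reward reaches `threshold`.

    `policies` is an iterable of (complexity, reward) pairs. Returns None
    if no policy in the sample is good enough.
    """
    good = [k for k, reward in policies if reward >= threshold]
    return min(good) if good else None

# Hypothetical samples: in the easy environment a simple policy already does well.
easy_env = [(2, 0.9), (10, 0.95)]
hard_env = [(2, 0.1), (10, 0.95)]

easy_difficulty = solution_centred_difficulty(easy_env, threshold=0.8)
hard_difficulty = solution_centred_difficulty(hard_env, threshold=0.8)
```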

5. DIFFICULTY ESTIMATION
Complexity on the agent's side:
- Now we need to calculate the complexity of the agents instead: K(π).
- We can parametrise a class of agents depending on their complexity.
- From here, we can calculate the distribution (and the maximum) of expected aggregated rewards for each complexity k.
- We can plot these functions of k.
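Grouping the aggregated rewards of sampled agents by their complexity k gives the kind of per-k summary described above. In this sketch the summary is reduced to the mean and maximum per k, and the sample values are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

def reward_profile(samples):
    """Map each complexity k to (mean reward, max reward) over the sample.

    `samples` is an iterable of (k, aggregated_reward) pairs.
    """
    by_k = defaultdict(list)
    for k, reward in samples:
        by_k[k].append(reward)
    return {k: (mean(rewards), max(rewards)) for k, rewards in sorted(by_k.items())}

# Hypothetical sample of agents: (complexity, aggregated reward).
samples = [(1, 0.25), (1, 0.75), (2, 1.0), (2, 0.0), (3, 0.5)]
profile = reward_profile(samples)
```

Plotting the mean and maximum in `profile` against k would give curves of the kind mentioned on the slide.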


5. DIFFICULTY ESTIMATION
Can we derive some numerical indicators?
- We may also derive some other statistical indicators for discrimination (sparseness).
- We do not want environments which are easy or difficult independently of what policy we use (all the agents score similarly).
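As one hypothetical discrimination indicator, the spread of the rewards obtained by different policies can be measured with the population standard deviation. This particular statistic is an assumption chosen for illustration; the slide does not say which indicators are used.

```python
from statistics import pstdev

def discrimination(rewards):
    """Population standard deviation of rewards across policies.

    A value near zero means all policies score similarly, so the
    environment tells us little about which policy is better.
    """
    return pstdev(rewards)

# Hypothetical rewards of several policies in two environments.
flat_env = [0.5, 0.5, 0.5]
spread_env = [0.0, 1.0]
```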

5. DIFFICULTY ESTIMATION
Can these plots and indicators be estimated?
- Instead of a difficult upper bound, which requires calculating the complexity of the environment and its components, we need to calculate the complexity of each agent in a sample: K(π_1), K(π_2), …, K(π_m).
  - Here m is usually large (much larger than n).
  - We then let them interact, always using the same role i (1 ≤ i ≤ n).
- All this is consistent with (and gives further justification to) our previous search for minimalistic environments and agents.

6. CONCLUSIONS
- The inclusion of many agents in an environment makes environments more unpredictable (as expected).
  - They are also much more difficult to analyse in terms of difficulty and discriminative power.
- Calculating the complexity of the environment is no longer a good approach for estimating difficulty, especially because the value becomes very large when other agents abound (a very loose upper bound).
- Instead, we evaluate environments by how a distribution of policies/agents performs in them. For the approximation of environment difficulty we need:
  - Minimalistic agent and environment descriptions.
  - Graphical and statistical summarisation of agent behaviour.

6. CONCLUSIONS
- Social behaviour (even a primitive one) is not just the inclusion of other agents.
  - These agents must play a role.
- With this approach, we cannot completely rule out that other optimal, but non-social, solutions may exist for some multi-agent environments, but we can have more control.
- Experimentation on the current configuration will surely detect flaws and trigger improvements.
- There are dozens of similar settings in multi-agent systems, artificial life, cognitive models, etc.
  - A complete knowledge and analysis of all of them is impossible.
  - We are open to suggestions about how ideas from those areas can be useful here (spaces, reward generation, agent description language, …).

THANK YOU!
Most especially to the other members of the anYnt project, for their joint work, ideas, material, software, experiments, patience and support:
- David L. Dowe, Computer Science and Software Engineering Dept, Monash University, Australia.
- M. Victoria Hernández-Lloreda, Dpto. de Metodología de las Ciencias del Comportamiento, UAM, Spain.
- Sergio España, DSIC, UPV, Spain.

