Slide 1: Nature-inspired agents to handle interaction in IT systems
Ann Nowé, Computational Modeling Lab, Vrije Universiteit Brussel

Slide 2: A challenging multi-agent system: routing in telecom networks
Non-stationary and state-dependent; highly distributed; communication costs; competition versus collaboration.

Slide 3: Road map
Agent technology: "Agent-based computing is probably the most important new paradigm for software development since object-oriented programming." (Computing as Interaction, [M. Luck et al.])
Agent technology from three perspectives: agents as a design metaphor, agents as simulation, and agents as a source of technologies.

Slide 4: Multi-agent learning
[Figure: several learning agents coupled to a shared environment, each selecting actions (A) and receiving state information (S) and rewards (R).]

Slide 5: Background theory
Single-agent RL in a Markovian environment: convergence to the optimal policy is guaranteed.
Multi-agent RL: convergence to a Pareto-optimal Nash equilibrium is not guaranteed.
But (Narendra and Wheeler, 1989): players in an n-person non-zero-sum game who independently use a reward-inaction update scheme with an arbitrarily small step size will always converge to one of the equilibrium points; which equilibrium point is reached depends on the initial conditions.
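As a concrete reference point, a minimal sketch of the linear reward-inaction automaton this result refers to (in its S-model form, where the update step is scaled by a reward in [0, 1]); the number of actions, the step size, and the reward scale are assumptions for illustration, not taken from the slides.

    import random

    class RewardInactionLA:
        """Linear reward-inaction learning automaton (L_R-I), S-model."""

        def __init__(self, n_actions, step_size=0.01):
            self.p = [1.0 / n_actions] * n_actions   # action probabilities
            self.step_size = step_size

        def choose(self):
            r, acc = random.random(), 0.0
            for action, prob in enumerate(self.p):
                acc += prob
                if r <= acc:
                    return action
            return len(self.p) - 1

        def update(self, action, reward):
            # Move probability mass toward the chosen action, scaled by the
            # reward in [0, 1]; a zero reward leaves the probabilities
            # unchanged (the "inaction" part of reward-inaction).
            step = self.step_size * reward
            for a in range(len(self.p)):
                if a == action:
                    self.p[a] += step * (1.0 - self.p[a])
                else:
                    self.p[a] -= step * self.p[a]

In a repeated game, each player runs one such automaton on its own action set; the Narendra and Wheeler result above says that, for small enough step sizes, the joint behaviour converges to one of the game's equilibrium points.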

Slide 6: The Homo Egualis society => coordinated exploration
Each agent's utility is its own payoff, reduced by one inequality term when agent i is doing better than another agent j and by another when i is doing worse than j.
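The payoff expression on this slide appears to be the standard Homo Egualis (inequity-aversion) utility; a reconstruction in the usual notation, where x_i is agent i's own payoff, n the number of agents, and the weights \alpha_i and \beta_i measure how much i dislikes doing worse, respectively better, than the others:

    u_i(x) = x_i - \frac{\alpha_i}{n-1} \sum_{j:\, x_j > x_i} (x_j - x_i) - \frac{\beta_i}{n-1} \sum_{j:\, x_j < x_i} (x_i - x_j)

Agents that are ahead see their utility lowered, which is what can drive the coordinated exploration mentioned on the slide.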

Slide 7: Conflicting-interest games: periodical policies
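A hedged illustration of what a periodical policy achieves in a conflicting-interest game (the specific game below is an example chosen here, not taken from the slide). In a Battle-of-the-Sexes-type game with payoff matrix

    \begin{pmatrix} (2,1) & (0,0) \\ (0,0) & (1,2) \end{pmatrix}

each pure Nash equilibrium favours one of the two players. A periodical policy that alternates between the two equilibria gives the time-averaged payoff

    \bar{u} = \tfrac{1}{2}(2,1) + \tfrac{1}{2}(1,2) = (1.5,\, 1.5),

which treats both players equally, something no single pure equilibrium does.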

Slide 8: Selfish RL agents with social rules
Coordinated exploration proceeds in two phases. During the non-communication period the agents behave as selfish RL learners. During the exploration phase, agents can exclude actions from their private action spaces, so that other Nash equilibria can be found and exploited (a sketch follows below).
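A hedged sketch of such a two-phase scheme, reusing the RewardInactionLA class from the earlier sketch; the phase length, the exclusion rule (each agent excludes the action it converged to), and the play_joint_action helper (returning one reward in [0, 1] per agent) are assumptions for illustration, not specified on the slide.

    def coordinated_exploration(agents, play_joint_action, n_phases=5, phase_length=2000):
        """agents: list of RewardInactionLA, one per player."""
        excluded = [set() for _ in agents]            # privately excluded actions
        for _ in range(n_phases):
            # Phase 1 (non-communication): plain selfish reward-inaction learning.
            for _ in range(phase_length):
                joint = [agent.choose() for agent in agents]
                rewards = play_joint_action(joint)    # one reward in [0, 1] per agent
                for agent, action, reward in zip(agents, joint, rewards):
                    agent.update(action, reward)
            # Phase 2 (exploration): each agent excludes the action it converged to,
            # so the next learning phase can settle on a different Nash equilibrium.
            for i, agent in enumerate(agents):
                best = max(range(len(agent.p)), key=lambda a: agent.p[a])
                excluded[i].add(best)
                remaining = [a for a in range(len(agent.p)) if a not in excluded[i]]
                if not remaining:                     # private action space exhausted
                    return excluded
                for a in range(len(agent.p)):         # restart on the reduced action space
                    agent.p[a] = 1.0 / len(remaining) if a in remaining else 0.0
        return excluded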

Slide 9: MAS & parallel computing
● Setting: master-slave software, coarse-granular hardware, heterogeneous nodes, communication bottleneck.
● Goal: improve parallel efficiency using learning automata (LA); every computing node has an LA.
● Reward signal: IR = 1 / blocking time; the LA learns the amount of work to request (see the sketch below).
● Results: computing time -39%, blocking time -62%.
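A hedged sketch of how such a node-level automaton could be wired up, again reusing the RewardInactionLA class above; the candidate request sizes in CHUNK_SIZES, the clipping of 1/blocking time into [0, 1], and the request_work and measure_blocking_time helpers are hypothetical, introduced only for illustration.

    CHUNK_SIZES = [64, 128, 256, 512]        # candidate amounts of work to request (assumption)
    node_la = RewardInactionLA(len(CHUNK_SIZES), step_size=0.05)

    def node_step(request_work, measure_blocking_time):
        action = node_la.choose()
        request_work(CHUNK_SIZES[action])    # ask the master for this much work
        blocking = measure_blocking_time()   # time spent waiting before the work arrives
        # IR = 1 / blocking time, clipped into [0, 1] for the reward-inaction update.
        reward = min(1.0, 1.0 / max(blocking, 1e-6))
        node_la.update(action, reward)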

Slide 10: More on LA
LA can solve MDPs: one LA per state; information is shared between states in an ant-like way.
LA games (tree-structured Markov games): n LA per state, with n the number of agents; Monte-Carlo updates.
LA for general Markov games is still a research topic.
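A hedged sketch of the "one LA per state" idea with Monte-Carlo updates, reusing RewardInactionLA once more; the environment interface (reset()/step() returning state, reward, done) and the clipping of the return into [0, 1] are assumptions, and the ant-like sharing of information between states is not modelled here.

    def run_episode(env, automata, gamma=0.95):
        """automata: dict mapping each state to its own RewardInactionLA."""
        trajectory, state, done = [], env.reset(), False
        while not done:
            action = automata[state].choose()
            next_state, reward, done = env.step(action)
            trajectory.append((state, action, reward))
            state = next_state
        # Monte-Carlo update: feed each visited automaton the discounted return
        # collected from its state onwards.
        g = 0.0
        for state, action, reward in reversed(trajectory):
            g = reward + gamma * g
            automata[state].update(action, min(1.0, max(0.0, g)))
        return g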

Slide 11: Multi-type ACO
Multiple ant colonies are used; each colony has its own type of pheromone.
Ants are attracted by their own type of pheromone but repelled by the other types.
Each colony converges to its own path, disjoint from the paths of the other colonies.
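A hedged sketch of how this attraction/repulsion could enter an ant's transition rule; the particular weighting formula and the parameters alpha, beta, gamma are assumptions chosen for illustration, not taken from the slide.

    import random

    def edge_weight(tau_own, tau_other, eta, alpha=1.0, beta=2.0, gamma=1.0):
        # Attracted by the colony's own pheromone (tau_own) and the heuristic
        # value eta, repelled by the summed pheromone of the other colonies.
        return (tau_own ** alpha) * (eta ** beta) / ((1.0 + tau_other) ** gamma)

    def choose_next_node(candidates):
        """candidates: list of (node, tau_own, tau_other, eta) for the feasible neighbours."""
        weights = [edge_weight(t_own, t_oth, eta) for _, t_own, t_oth, eta in candidates]
        r, acc = random.random() * sum(weights), 0.0
        for (node, _, _, _), weight in zip(candidates, weights):
            acc += weight
            if r <= acc:
                return node
        return candidates[-1][0]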

Slide 12: Conclusion
We focus on simple agents with simple learning rules, nature-inspired both from the social-system viewpoint and from the ant's viewpoint. A general framework to study the dynamics of these new RL techniques is needed in order to justify their use and show their robustness; only then will they find their way to real-world future IT applications.