Introduction to Collectives

Presentation transcript:

Kagan Tumer
NASA Ames Research Center
kagan@ptolemy.arc.nasa.gov
http://ic.arc.nasa.gov/~kagan
http://ic.arc.nasa.gov/projects/COIN/index.html
(Joint work with David Wolpert)

Outline
- Introduction to collectives
  - Definition / Motivation
  - A naturally occurring example
- Illustration of theory of collectives I
  - Central equation of collectives
- Interlude 1: Autonomous defects problem (Johnson and Challet)
- Illustration of theory of collectives II
  - Aristocrat utility
  - Wonderful life utility
- Interlude 2: El Farol bar problem: System equilibria and global optima
- Collective of rovers: Scientific return maximization
- Final thoughts
CDCS 2002 K. Tumer

Motivation
Most complex systems not only can be, but need to be, viewed as collectives. Examples include:
- Control of a constellation of communication satellites
- Routing data/vehicles over a communication network/highway
- Dynamic data migration over large distributed databases
- Dynamic job scheduling across a (very) large computer grid
- Coordination of rovers/submersibles on Mars/Europa
- Control of the elements of an amorphous computer/telescope
- Construction of parallel algorithms for optimization problems
- The autonomous defects problem

Collectives
A collective is:
- A (perhaps massive) set of agents;
- All of which have “personal” utilities they are trying to achieve;
- Together with a world utility function measuring the full system’s performance.
Given that the agents are good at optimizing their personal utilities, the crucial problem is an inverse problem: how should one set (and potentially update) the personal utility functions of the agents so that they “cooperate unintentionally” and optimize the world utility?

Natural Example: Human Economy
- World utility is GDP.
- Agents are the individual humans.
- Agents try to maximize their own “personal” utilities.
- Design problem: how to modify the personal utilities of the agents through incentives or regulations (e.g., tax breaks, SEC regulations against insider trading, antitrust laws) to achieve high GDP?
- Note: A. Greenspan does not tell each individual what to do.
- Economics is hamstrung by “pre-set agents”; no such restriction applies to an artificial collective.

Illustration of Theory of Collectives I

Nomenclature
- h: an agent
- z: state of all agents across all time
- z_{h,t}: state of agent h at time t
- z_{^h,t}: state of all agents other than h at time t
[Figure: worldline diagram with labels z, tn, z_{h1,t0}, z_{^h4,t0}, and z_{h4}]

Key Concepts for Collectives
- Factoredness: degree to which an agent’s personal utility is aligned with the world utility (e.g., quantifies the “if you get rich, the world benefits” concept).
- Learnability: a signal-to-noise measure. Quantifies how sensitive an agent’s personal utility function is to a change in its own state.
- Intelligence: percentage of states that would have resulted in agent h having a worse utility (e.g., an SAT-like percentile concept).
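The percentile reading of intelligence can be sketched in a few lines of Python. This is only an illustration: the function name `intelligence`, the toy utility, and the candidate states are assumptions made for the example, and the definition in the theory of collectives is stated over a probability measure rather than a finite list.

```python
def intelligence(utility, actual_state, alternative_states):
    """Fraction of alternative states that would have given a worse utility.

    Assumption: alternatives are a finite, equally weighted list.
    """
    actual = utility(actual_state)
    worse = sum(1 for s in alternative_states if utility(s) < actual)
    return worse / len(alternative_states)


def toy_utility(state):
    # Hypothetical personal utility: the closer the state is to 3, the better.
    return -abs(state - 3)


states = [0, 1, 2, 3, 4, 5]
best_score = intelligence(toy_utility, 3, states)   # state 3 beats the 5 others
worst_score = intelligence(toy_utility, 0, states)  # state 0 beats none
print(best_score, worst_score)
```

An agent that picked the best available state scores close to 1, and one that picked the worst scores 0, which is the SAT-like percentile idea.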

Central Equation of Collectives
Our ability to control the system consists of setting some parameters s (e.g., the agents' goals).
The three terms of the central equation are labeled, in order:
- Explore vs. exploit (operations research)
- Factoredness (economics)
- Learnability (machine learning)
ε_G and ε_g are the intelligences of the agents w.r.t. the world utility (G) and their personal utilities (g), respectively.
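The equation itself did not survive in the transcript. As a hedged reconstruction, based on the form given in Wolpert and Tumer's published work on collectives rather than on this slide, it can be written with the intelligence vectors for G and g as integration variables:

```latex
P(G \mid s) \;=\; \int d\vec{\epsilon}_G \; P(G \mid \vec{\epsilon}_G, s)
            \int d\vec{\epsilon}_g \; P(\vec{\epsilon}_G \mid \vec{\epsilon}_g, s)\,
            P(\vec{\epsilon}_g \mid s)
```

Read left to right under that assumption, the first factor is the explore vs. exploit term, the second is the factoredness term (how intelligence with respect to g translates into intelligence with respect to G), and the third is the learnability term (how well agents can achieve high intelligence with respect to their own g).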

Interlude 1: Autonomous defects problem (Johnson and Challet)

Autonomous Defects Problem
Given a collection of faulty devices, how does one choose the subset of those devices that, when combined with each other, gives optimal performance (Johnson & Challet)?
- a_j: distortion of component j
- n_k: action of agent k (n_k = 0 or 1)
Collective approach: identify each agent with a component.
Question: what utility should each agent try to maximize?
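To make the selection problem concrete, here is a minimal Python sketch. The world-utility formula is not preserved in the transcript, so the code assumes the aggregate distortion |sum_j n_j a_j| / sum_j n_j (lower is better); the names `aggregate_distortion` and `best_subset` and the sample distortions are hypothetical. Exhaustive search is used only because N is tiny here. It scales as 2^N, which is the motivation for letting each component act as an agent.

```python
from itertools import product


def aggregate_distortion(distortions, actions):
    """Assumed world cost: |sum of chosen distortions| / number chosen."""
    chosen = [a for a, n in zip(distortions, actions) if n == 1]
    if not chosen:
        return float("inf")  # an empty subset is treated as useless
    return abs(sum(chosen)) / len(chosen)


def best_subset(distortions):
    """Brute-force search over all 2^N action vectors (small N only)."""
    candidates = product((0, 1), repeat=len(distortions))
    return min(candidates, key=lambda n: aggregate_distortion(distortions, n))


distortions = [5, -4, 3, -3]        # hypothetical component distortions a_j
best = best_subset(distortions)     # n_k = 1 means component k is used
print(best, aggregate_distortion(distortions, best))
```

In this toy instance the two components with distortions 3 and -3 cancel exactly, so the best subset uses only those two even though every individual component is faulty.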

Figure: Autonomous Defects Problem (N=100)

Figure: Autonomous Defects Problem (N=1000)

Figure: Autonomous Defects Problem: Scaling

Illustration of Theory of Collectives II: Aristocrat utility

Personal Utility
Recall the central equation, in particular its factoredness and learnability terms.
Solve for the personal utility g that maximizes learnability, while constrained to the set of factored utilities.

Aristocrat Utility
One can solve for a factored U with maximal learnability, i.e., a U with good terms 2 and 3 (factoredness and learnability) in the central equation:
AU_h(z) = G(z) - sum over z'_h of pi(z'_h) G(z_^h, z'_h)
Intuitively, AU reflects the difference between the actual G and the average G (averaged over all actions you could take).
For simplicity, when evaluating AU here, we make the following approximation:
pi(z_h) = 1 / (number of possible actions for h)
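The "actual G minus average G" reading, with the uniform approximation for the action probabilities, can be sketched as follows. The function name `aristocrat_utility`, the representation of a joint action as a list, and the toy world utility are assumptions made for this illustration only.

```python
def aristocrat_utility(world_utility, joint_action, agent, possible_actions):
    """G(actual) minus the average of G over all actions `agent` could take.

    Assumption: each possible action has weight 1 / len(possible_actions).
    """
    actual = world_utility(joint_action)
    total = 0.0
    for alternative in possible_actions:
        counterfactual = list(joint_action)
        counterfactual[agent] = alternative   # other agents are left unchanged
        total += world_utility(counterfactual)
    return actual - total / len(possible_actions)


def spread(joint_action):
    # Hypothetical world utility: number of distinct actions in use.
    return len(set(joint_action))


actions = (0, 1, 2)
joint = [0, 0, 1]
au_agent0 = aristocrat_utility(spread, joint, 0, actions)  # duplicates agent 1
au_agent2 = aristocrat_utility(spread, joint, 2, actions)  # adds a new action
print(au_agent0, au_agent2)
```

Agent 0 does worse than its own average (AU below zero) because it duplicates another agent's choice, while agent 2 does better than its average.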

Clamping
- Clamping parameter CL_h^v: replace h’s state (taken to be a unary vector) with the constant vector v.
- Clamping creates a new “virtual” worldline.
- In general, v need not be a “legal” state for h.
- Example: four agents, three actions. Agent h2 clamps to the “average action” vector a = (.33 .33 .33).
[Figure: agent state matrix before and after clamping h2]
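A small sketch of the clamping operation for the four-agent, three-action case. The starting states below are hypothetical (the slide's matrix is not recoverable from the transcript), and `clamp` is an illustrative helper name.

```python
def clamp(state, agent, v):
    """Return a virtual copy of `state` with `agent`'s row replaced by v.

    The original state is left untouched: clamping is a virtual operation.
    """
    virtual = [row[:] for row in state]
    virtual[agent] = list(v)
    return virtual


# Rows are agents h1..h4; each row is a unary (one-hot) vector over 3 actions.
state = [
    [1, 0, 0],
    [0, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
]
average_action = [1 / 3, 1 / 3, 1 / 3]   # not a "legal" one-hot state

virtual = clamp(state, 1, average_action)  # index 1 is agent h2

# Per-action totals in the virtual world, e.g. for a congestion-style utility.
totals = [sum(column) for column in zip(*virtual)]
print(totals)
```

Because the clamped row is fractional, the virtual world contains "a third of an agent" on each action, which is why v need not be a legal state.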

Wonderful Life Utility
The Wonderful Life Utility (WLU) for h is given by:
WLU_h(z) = G(z) - G(CL_h^v(z))
- Clamping to the “null” action (v = 0) removes the player from the system (hence the name).
- Clamping to the “average” action disturbs the overall system minimally (can be viewed as an approximation to AU).
- Theorem: WLU is factored regardless of v.
- Intuitively, WLU measures the impact of agent h on the world:
  - Difference between the world as it is and the world without h
  - Difference between the world as it is and the world where h takes the average action
- WLU is a “virtual” operation. The system is not re-evolved.
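The factoredness claim can be checked numerically on a toy example: since the clamped term does not depend on h's own state, any change h makes moves WLU and G by the same amount. In this sketch the world utility `G`, the states, and the function name are hypothetical, and WLU is taken to be G of the actual state minus G of the state with h clamped to v.

```python
def wonderful_life_utility(G, state, agent, v):
    """G(state) minus G(state with `agent`'s row clamped to v)."""
    virtual = [row[:] for row in state]
    virtual[agent] = list(v)
    return G(state) - G(virtual)


def G(state):
    # Hypothetical world utility: penalize uneven load across actions.
    loads = [sum(column) for column in zip(*state)]
    return -sum(x * x for x in loads)


null_action = [0, 0, 0]                        # v = 0 removes the agent
state = [[1, 0, 0], [1, 0, 0], [0, 1, 0]]      # agents 0 and 1 collide
moved = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]      # agent 0 switches action

delta_G = G(moved) - G(state)
delta_wlu = (wonderful_life_utility(G, moved, 0, null_action)
             - wonderful_life_utility(G, state, 0, null_action))
print(delta_G, delta_wlu)
```

Both differences come out equal, so an agent that improves its WLU by changing its own action improves the world utility by exactly as much in this example.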

Interlude 2: El Farol bar problem: System equilibria and global optima

El Farol Bar Problem
Congestion game: a game where agents share the same action space, and the world utility is a function purely of how many agents take each action.
Illustrative example: Arthur’s El Farol bar problem. At each time step, each agent decides whether to attend a bar:
- If the agent attends and the bar is below capacity, the agent gets a reward.
- If the agent stays home and the bar is above capacity, the agent gets a reward.
The problem is particularly interesting because rational agents cannot all correctly predict attendance:
- If most agents predict attendance will be low and therefore attend, attendance will be high.
- If most agents predict high attendance and therefore do not attend …
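The reward rule and the self-defeating nature of shared predictions can be sketched as follows. The function name `bar_rewards`, the reward value of 1, and the choice to treat attendance exactly at capacity as "not crowded" are assumptions for this illustration; the slide does not specify them.

```python
def bar_rewards(decisions, capacity):
    """Reward 1 to agents who attend an uncrowded bar or skip a crowded one.

    decisions: list of booleans, True meaning the agent attends.
    Assumption: attendance equal to capacity counts as not crowded.
    """
    attendance = sum(decisions)
    crowded = attendance > capacity
    return [1 if go != crowded else 0 for go in decisions]


capacity = 6
everyone_goes = [True] * 10          # all predicted a quiet night
everyone_stays = [False] * 10        # all predicted a crowded night
mixed = [True] * 6 + [False] * 4     # predictions differ across agents

print(sum(bar_rewards(everyone_goes, capacity)))
print(sum(bar_rewards(everyone_stays, capacity)))
print(sum(bar_rewards(mixed, capacity)))
```

When every agent acts on the same prediction, the prediction is falsified and nobody is rewarded; only a population with differing decisions collects any reward.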

Modified El Farol Bar Problem
Each week, agents select one of seven nights to attend a bar.
Quantities defined on the slide (the formulas are not preserved in the transcript):
- Attendance for night k at week t
- Capacity of the bar (c in the later slides)
- Reward for night k at week t
- Rt: reward for week t
Further modifications:
- Each week each agent selects two nights to attend the bar.
- ...
- Each week each agent selects six nights to attend the bar.
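As a hedged sketch of how the weekly reward could be computed: the code below assumes the nightly reward x * exp(-x / c) for attendance x and capacity parameter c, and a weekly reward equal to the sum over the seven nights. That functional form follows the published bar-problem experiments by Wolpert, Wheeler, and Tumer and is not taken from this transcript; the function names and the sample choices are hypothetical.

```python
import math

NIGHTS = 7


def night_reward(attendance, c):
    """Assumed reward for one night: rises with attendance, then decays."""
    return attendance * math.exp(-attendance / c)


def weekly_reward(choices, c):
    """Sum of nightly rewards; choices[i] is the night picked by agent i."""
    attendance = [0] * NIGHTS
    for night in choices:
        attendance[night] += 1
    return sum(night_reward(x, c) for x in attendance), attendance


choices = [0, 0, 1, 3, 3, 3, 6]   # hypothetical picks for seven agents
reward, attendance = weekly_reward(choices, c=3)
print(attendance, reward)
```

Under this assumed form, a night's reward peaks when attendance equals c, so the world utility favors spreading agents so that nights are near capacity rather than empty or overcrowded.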

Personal Utility Functions
Two conventional utilities:
- Uniform Division (UD): divide each night’s total reward among all agents that attended that night (the “natural” reward).
- Team Game (TG): total world reward at time t (Rt).
Three collective-based utilities:
- WL 0: WL utility with the clamping parameter set to a vector of 0s (world utility minus “world utility without me”).
- WL 1: WL utility with the clamping parameter set to a vector of 1s (world utility minus “world utility where I attend every night”).
- WL a: WL utility with the clamping parameter set to the average action vector (world utility minus “world utility where I do what is expected of me”).
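The five utilities can be compared side by side for one agent in the attend-one-night setting. This sketch assumes a nightly reward of x * exp(-x / c) (a form from the published bar-problem experiments, not from this transcript) and an average action vector of 1/7 per night; `bar_utilities` and its arguments are hypothetical names.

```python
import math

NIGHTS = 7


def world_utility(attendance, c):
    # Assumed weekly reward: sum of x * exp(-x / c) over nights.
    return sum(x * math.exp(-x / c) for x in attendance)


def bar_utilities(choices, agent, c):
    """UD, TG, WL0, WL1, WLa for `agent`, who attends night choices[agent]."""
    attendance = [0.0] * NIGHTS
    for night in choices:
        attendance[night] += 1
    k = choices[agent]
    G = world_utility(attendance, c)

    def clamped_G(v):
        # Remove the agent's real attendance, then add the clamp vector v.
        virtual = attendance[:]
        virtual[k] -= 1
        return world_utility([x + vi for x, vi in zip(virtual, v)], c)

    return {
        "UD": math.exp(-attendance[k] / c),   # night's reward / its attendance
        "TG": G,
        "WL0": G - clamped_G([0] * NIGHTS),
        "WL1": G - clamped_G([1] * NIGHTS),
        "WLa": G - clamped_G([1 / NIGHTS] * NIGHTS),
    }


result = bar_utilities(choices=[0, 0, 0, 1], agent=0, c=3)
for name, value in result.items():
    print(name, round(value, 4))
```

Note that TG is the same number for every agent, whereas the three WL variants subtract a term that does not depend on the agent's own choice, which isolates that agent's contribution.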

Figure: Bar Problem: Utility Comparison (attend one night, 60 agents, c=3)

Figure: Typical Daily Bar Attendance (c=6; t=1000 s; number of agents = 168)

Figure: Scaling Properties (attend one night); c = 2, 3, 4, 6, 8, 10, 15, respectively

Figure: Performance vs. # of Nights to Attend; 60 agents; c = 3, 6, 8, 10, 10, 12, 15, respectively

Collectives of Rovers
- Design a collective of autonomous agents to gather scientific information (e.g., rovers on Mars, submersibles under Europa).
- Some areas have more valuable information than others.
- World utility: total importance-weighted information collected.
- Both the individual rovers and the collective need to be flexible so they can adapt to new circumstances.
- Collective-based payoff utilities result in better performance than more “natural” approaches.

World Utility
Quantities defined on the slide (the formulas for the token value function and the world utility are not preserved in the transcript):
- L: location matrix for all agents
- L_h: location matrix of agent h
- L_{h,t}^a: location matrix of agent h at time t, had it taken action a at t-1
- Q: initial token configuration
Note: agents’ payoff utilities reduce to figuring out what “L” to use.

Payoff Utilities
Utilities listed on the slide (formulas not preserved in the transcript):
- Selfish Utility
- Team Game Utility
- Collectives-Based Utility (theoretical)
- Collectives-Based Utility (practical)

Figure: Utility Comparison in Rover Domain (100 rovers on a 32x32 grid)

Figure: Scaling Properties in Rover Domain

Summary
- Given a world utility, deploying RL algorithms provides a solution to the distributed design problem. But what utilities does one use?
- The theory of collectives shows how to configure and/or update the personal utilities of the agents so that they “unintentionally cooperate” to optimize the world utility.
- Personal utilities based on collectives have been successfully applied to many domains (e.g., autonomous rovers, constellations of communication satellites, data routing, autonomous defects).
- Performance gains from using collectives-based utilities increase with the size of the problem.
- A fully fleshed science of collectives would benefit from, and have applications to, many other sciences.