A Decision-Theoretic Approach to Designing Proactive Communication in Multi-Agent Teamwork Thomas R. Ioerger, Yu Zhang, Richard Volz, John Yen (PSU-IST)

Presentation transcript:

A Decision-Theoretic Approach to Designing Proactive Communication in Multi-Agent Teamwork Thomas R. Ioerger, Yu Zhang, Richard Volz, John Yen (PSU-IST) Dept. of Computer Science Texas A&M University

2 Motivation
[Diagram labels: Agent; Multi-Agent Team]
• Agents share a large amount of knowledge about the teamwork.
• Hard-coded interactions among participants.
• High-frequency message exchange.
• Communication risk.

3 Challenging Issues in Designing Communication Protocols
• Each agent has incomplete information, which gives rise to uncertainty.
• Each agent has different problem-solving capabilities.
• Data are decentralized, and the system lacks global control.
• Excessive or unrestricted communication limits scalability.

4 Our Approach and Its Contributions
Proactive Communication
• OBPC: Reduction of communication load through OBservations.
• DIP: Dynamic estimation of the probability distribution of Information Production and need.
• DTPC: Decision-Theoretic determination of communication strategies.

5 Background
• CAST (Collaborative Agents for Simulating Teamwork)
• MALLET (Multi-Agent Logic-based Language for Encoding Teamwork)
(team-plan killwumpus (?w)
  (process
    (seq
      (agent-bind ?ca (constraint (play-role ?ca scout)))
      (DO ?ca (findwumpus ?w))
      (agent-bind ?fi (constraint ((play-role ?fi fighter)
                                   (closest-to-wumpus ?fi ?w))))
      (DO ?fi (movetowumpus ?w))
      (DO ?fi (shootwumpus ?w)))))
(ioper shootwumpus (?w)
  (pre-cond (wumpus ?w) (location ?w ?x ?y) (dead ?w false))
  (effect (dead ?w true)))

6 Overview
[Diagram labels: CAST KB; Team Structure & Teamwork Procedure; Proactive Communication (OBPC, DIP, DTPC); Optimal Communication Strategy]

7 Agent Execution Cycle
[Cycle diagram, labels in reading order: Observe (Sense) → Predict (Info. need and production) → Decide (Strategy) → Communicate (Information) → Act (Effect)]

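8 Syntax of Observability
[Grammar residue: the nonterminal names did not survive; "…" marks each gap.]
… ::= (CanSee …)* (BelieveCanSee …)*
… ::= …
… ::= … | …
… ::= (…)*
… ::= (…)
… ::= (DO … (…))
… ::= … | …
… ::= …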

9 Example Observability Rules
(CanSee ca (location ?o ?x ?y)
  (location ca ?xc ?yc) (location ?o ?x ?y) (inradius ?x ?y ?xc ?yc rca))
// The carrier can see the location property of an object.
(CanSee ca (DO ?fi (shootwumpus ?w))
  (play-role fighter ?fi) (location ca ?xc ?yc) (location ?fi ?x ?y) (adjacent ?xc ?yc ?x ?y))
// The carrier can see the shootwumpus action of a fighter.
(BelieveCanSee ca fi (location ?o ?x ?y)
  (location fi ?xi ?yi) (location ?o ?x ?y) (inradius ?x ?y ?xi ?yi rfi))
// The carrier believes the fighter is able to see the location property of an object.
(BelieveCanSee ca fi (DO ?f (shootwumpus ?w))
  (play-role fighter ?f) (≠ ?f fi) (location ca ?xc ?yc) (location fi ?xi ?yi) (location ?f ?x ?y)
  (inradius ?xi ?yi ?xc ?yc rca) (inradius ?x ?y ?xc ?yc rca) (adjacent ?x ?y ?xi ?yi))
// The carrier believes the fighter is able to see the shootwumpus action of another fighter.
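The location rules on this slide can be mimicked outside MALLET. The Python sketch below is illustrative only and is not CAST code: it assumes `inradius` means "within Euclidean distance r", and the `Entity` class and `should_proactive_tell` filter are hypothetical names. The filter shows one way observability beliefs could reduce message load: a provider stays silent about anything it believes the needer can already see.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """An agent or object with a grid location (hypothetical helper type)."""
    name: str
    x: int
    y: int


def inradius(x, y, xc, yc, r):
    # Assumption: Euclidean distance; the slides do not define inradius.
    return math.hypot(x - xc, y - yc) <= r


def can_see_location(observer, obj, radius):
    # Mirrors the first CanSee rule: obj must lie within the observer's radius.
    return inradius(obj.x, obj.y, observer.x, observer.y, radius)


def believes_can_see_location(other, obj, other_radius):
    # Mirrors the first BelieveCanSee rule, using the believer's own facts
    # about the other agent's position and sensing radius.
    return inradius(obj.x, obj.y, other.x, other.y, other_radius)


def should_proactive_tell(provider, needer, obj, r_provider, r_needer):
    # Hypothetical filter: tell only what the provider sees and believes
    # the needer cannot observe for itself.
    return (can_see_location(provider, obj, r_provider)
            and not believes_can_see_location(needer, obj, r_needer))


carrier = Entity("ca", 0, 0)
fighter = Entity("fi", 10, 0)
near_wumpus = Entity("w1", 2, 1)    # near the carrier, far from the fighter
shared_wumpus = Entity("w2", 5, 0)  # within radius 5 of both agents

tell_near = should_proactive_tell(carrier, fighter, near_wumpus, 5, 5)
tell_shared = should_proactive_tell(carrier, fighter, shared_wumpus, 5, 5)
```

Here `tell_near` is true and `tell_shared` is false, because the fighter is believed to see `w2` on its own.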

10 Proactive Communication Based on Observation
• ProactiveTell
– A provider reasons about what information it will have.
– A provider reasons about whether to deliver a piece of information once it has that information.
• ActiveAsk
– A needer reasons about what information it will need.
– A needer reasons about whether to ask for a piece of information once it needs that information.

11 Evaluation: Multi-Agent Wumpus World
• 20 wumpuses, 8 pits, and 20 piles of gold per world.
• 1 carrier and 3 fighters compose a team.
• The team goal is to kill wumpuses and get the gold without being killed.
• 5 randomly generated worlds with 20×20 cells.
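As a rough illustration of this setup, the Python sketch below generates five random 20×20 worlds with the stated object counts. The function name `generate_world`, the seeding scheme, and the rule that every object occupies its own cell are assumptions; the slides do not say how the worlds were generated.

```python
import random


def generate_world(size=20, wumpuses=20, pits=8, gold=20, seed=None):
    """Place wumpuses, pits, and gold piles on a size x size grid."""
    rng = random.Random(seed)
    cells = [(x, y) for x in range(size) for y in range(size)]
    # Assumption: no two objects share a cell, so sample without replacement.
    chosen = rng.sample(cells, wumpuses + pits + gold)
    return {
        "wumpus": chosen[:wumpuses],
        "pit": chosen[wumpuses:wumpuses + pits],
        "gold": chosen[wumpuses + pits:],
    }


# Five worlds, as in the evaluation; the seeds are arbitrary.
worlds = [generate_world(seed=s) for s in range(5)]
```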

12 Decision-Theoretic Proactive Communication
• Strategies
• Utility Function
• Cost Function
• Value Function
• Decision-Making

13 Decision-Making on Situation PA
Situation PA: Provider produces a new piece of information.
[Decision-tree diagram. Legend: a: provider; b: needer; e: end. Edge labels: a-b: ProactiveTell; a-b: Silence; b-a: Accept; b-a: Wait; b-a: Silence; b-a: ActiveAsk]

14 DM on Situation PB
Situation PB: Provider receives a request for a piece of information.
[Decision-tree diagram. Edge labels: a-b: Reply; a-b: WaitUntilNext. Leaves marked e (end).]

15 DM on Situation NA
Situation NA: Needer needs a piece of information.
[Decision-tree diagram. Legend: t: transfer; e: end. Edge labels: b-a: ActiveAsk; b-a: Silence; b-a: Wait; a-b: Reply; a-b: WaitUntilNext; a-b: Silence; a-b: ProactiveTell]

16 DM on Situation NB
Situation NB: Needer receives a piece of information.
[Decision-tree diagram. Edge label: b-a: Accept. Nodes marked t (transfer) and e (end).]

17 Utility Function
• Parameters in the utility function:
– $I$: the information about which communication occurs
– $t$: time of decision-making
– $t_1$: time at which $I$ is needed
– $t_2$: time at which the value of $I$ that is used was produced
– $SU$: situation at $t$
– $S$: strategy available in $SU$
– $M$: the set of messages involved in obtaining $I$
– $E$: environment state at $t$
$U(I, t, t_1, t_2, SU, S, M, E) = V(I, t, t_1, t_2, SU, S) - C(M)$

18 Value Function
$V(I, t, t_1, t_2, SU, S) = T(I, t, t_1, t_2, SU, S) + R(I, t, t_1, t_2, SU, S)$
where $T$ is the timeliness term and $R$ is the relevance term.

19 Timeliness Function
• Timeliness
– Whether the value of $I$ that an agent uses is produced by the time the agent needs $I$.
$d(I, t, t_1, t_2, SU, S) = \max(0,\ t_2 - t_1)$
$f_t(d(I, t, t_1, t_2, SU, S))$ s.t. $f_t(x) < f_t(y)$ if $y < x$ (that is, $f_t$ is strictly decreasing in the delay)
$T(I, t, t_1, t_2, SU, S) = f_t(d(I, t, t_1, t_2, SU, S))$

20 Relevance Function
• Relevance
– Unprocessed, most recent, important
$P(I, t, t_1, t_2, SU, S) = \Pr(I \wedge t \wedge t_1 \wedge t_2 \wedge \text{no other value for } I \text{ was produced in } \mathrm{Int}[t_1, t_2] \mid S \wedge SU)$
$f_r^I(P(I, t, t_1, t_2, SU, S))$ s.t. $f_r^I(x) < f_r^I(y)$ if $x < y$
$R(I, t, t_1, t_2, SU, S) = f_r^I(P(I, t, t_1, t_2, SU, S))$

21 Cost Function
$C(M_i) = \begin{cases} 0 & \text{if } M_i = \emptyset \\ k_1 + k_2 \times \mathrm{len}(M_i) & \text{otherwise} \end{cases}$
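The utility, value, timeliness, relevance, and cost definitions from the preceding slides can be combined into a small Python sketch. The slides only constrain $f_t$ to be decreasing and $f_r^I$ to be increasing, so the exponential and linear forms here are placeholder assumptions. The constants `k1` and `k2`, the `weight` and `scale` parameters, and the choice to sum the per-message cost over all messages in $M$ are also assumptions.

```python
import math


def delay(t1, t2):
    """d = max(0, t2 - t1): how late the value is produced relative to the need."""
    return max(0, t2 - t1)


def timeliness(d, scale=1.0):
    # Assumed form of f_t: any strictly decreasing function of d would do.
    return math.exp(-scale * d)


def relevance(p, weight=1.0):
    # Assumed form of f_r^I: any strictly increasing function of p would do.
    return weight * p


def cost(message, k1=0.1, k2=0.01):
    """C(M_i) = 0 for an empty message, else k1 + k2 * len(M_i)."""
    if not message:
        return 0.0
    return k1 + k2 * len(message)


def utility(t1, t2, p, messages):
    """U = V - C(M), with V = T + R; C(M) is taken as the sum over messages."""
    value = timeliness(delay(t1, t2)) + relevance(p)
    return value - sum(cost(m) for m in messages)


# Telling costs one message; staying silent costs nothing but may be late.
u_tell = utility(t1=5, t2=3, p=0.8, messages=["(location w1 2 1)"])
u_late = utility(t1=5, t2=8, p=0.8, messages=[])
```

With these placeholder constants, a timely value delivered by message scores higher than a free but late one. The ordering depends entirely on the chosen `k1`, `k2`, and `scale`.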

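22 Expected Utility
E(U) = [formula not recoverable]
[Table: times $t_1$ and $t_2$ for each strategy. Strategies listed: P.ProactiveTell; P.Silence; P.Reply; P.WaitUntilNext; N.ActiveAsk (if a Reply; if a WaitUntilNext); N.Silence; N.Wait (if a ProactiveTell; if a Silence); N.Accept. Cell values not recoverable apart from two "+T" entries.]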
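The slide's formula is not available here, so the Python sketch below shows only the generic decision-theoretic step: weight each outcome's utility by its probability and pick the strategy with the highest expected utility. The probabilities and utility values are invented for illustration, and `pa_strategies` is a hypothetical helper for Situation PA.

```python
def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one strategy."""
    total = sum(p for p, _ in outcomes)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * u for p, u in outcomes)


def best_strategy(strategies):
    """Return the strategy name with the highest expected utility."""
    return max(strategies, key=lambda name: expected_utility(strategies[name]))


def pa_strategies(p_need):
    # Hypothetical numbers for Situation PA (provider has just produced I).
    # p_need: assumed probability that the needer needs I before the next
    # production of I.
    return {
        # Telling pays off if I is needed; otherwise the message cost is wasted.
        "ProactiveTell": [(p_need, 1.5), (1 - p_need, -0.3)],
        # Silence is free, but the needer gets less value if I is needed.
        "Silence": [(p_need, 0.4), (1 - p_need, 0.0)],
    }


choice_when_likely = best_strategy(pa_strategies(0.7))
choice_when_unlikely = best_strategy(pa_strategies(0.1))
```

With these invented numbers the provider tells when the need is likely and stays silent when it is unlikely. That is the kind of trade-off a dynamic estimate of information need would feed.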

23 Strategies
Situation PA: provider produces I. ProactiveTell? Silence?
[Timeline diagram along t, divided at the current time into Known and Unknown. Labels: Last sent; Last not sent; Last need aware of; Unfulfilled need; Next production]

24 Strategies
Situation PB: provider receives a request for I. Reply? WaitUntilNext?
[Timeline diagram along t, divided at the current time into Known and Unknown. Labels: Last production; Next production]

25 Strategies
Situation NA: needer needs I. ActiveAsk? Wait? Silence?
[Timeline diagram along t, divided at the current time into Known and Unknown. Labels: Last I received; Most recent production; Next production]

26 Strategies
Situation NB: needer receives I. Accept.

27 Summary
Advantages of the approach: it allows agents to make intelligent choices of communication policy based on:
– frequencies: of needs, of sensing, of info. change
– costs: of messages, plus penalties for delays in action or for acting on incorrect information

28 Criteria for Applicable Domains
• There are information needs among the team.
• Agents can communicate.
• There is uncertainty in the environment.
– Stochastic properties of the teamwork process.
– Agents have incomplete/disjoint knowledge about the world.
• The team acts under critical time constraints, so proactive assistance becomes important.