Information Sharing in Large Heterogeneous Teams
Prasanna Velagapudi
Robotics Institute, Carnegie Mellon University
FRC Seminar - August 13, 2009

Presentation transcript:

Large Heterogeneous Teams
100s to 1000s of team members (robots, agents, people)
Shared goals
Must collaborate to complete complex tasks
Dynamic, uncertain environment

Scaling Teams
Far more data than can be feasibly shared
  – Amount of information exchanged often grows faster than amount of available bandwidth
Vague, incomplete knowledge of large parts of the team
  – Often not important
Shared information improves team performance

Search and Rescue
Air robots, ground robots, human operators
Each is generating information
  – Humans: classify objects and issue commands
  – Robots: explore and map area
Geometric Random Graph

Search and Rescue
Video streams (320 kbps x 24, for operators)
Decentralized evidence grid (14 kbps x 24, for all agents)
Operator control (<1 kbps x 24, for robots)
Available throughput: Θ(W·N^0.5) [Gupta 2000]
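As a rough illustration of the demand side of this slide, the per-stream rates can be added up. This is only a sketch: the multiplier 24 is taken from the slide (presumably the number of robots, which is an assumption), operator control is listed as "<1 kbps" so 1 kbps is used as an upper bound, and the names `STREAM_KBPS` and `STREAM_COUNT` are hypothetical.

```python
# Per-stream rates in kbps as listed on the slide.
# Operator control is "<1 kbps", so 1 is used here as an upper bound.
STREAM_KBPS = {"video": 320, "evidence_grid": 14, "operator_control": 1}
STREAM_COUNT = 24  # the "x 24" multiplier shown on the slide

# Aggregate demand for each kind of traffic and for the whole team.
demand_kbps = {name: rate * STREAM_COUNT for name, rate in STREAM_KBPS.items()}
total_kbps = sum(demand_kbps.values())

print(demand_kbps)
print("total demand (kbps):", total_kbps)
```

Under these assumptions video dominates the total, which fits the later point that selecting what to send matters more than trimming the smaller streams.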

Available Network Technologies
[Figure. Source: William Webb - Ofcom]

Scaling Teams
We need to deliver information efficiently
  – Get it to the agents that can make the most use of it
  – Don’t waste communication bandwidth
Key Idea: Different agents have different needs for a given piece of information

Sharing information
When information generation exceeds network capacity, there are a few options:
  – Compression/Fusion (Eliminate redundant data)
  – Structuring (Eliminate overhead costs)
  – Selection (Eliminate unimportant data)

Related work
Distributed Data Fusion
  – Channel filtering (DDF) [Makarenko 04]
  – Particle exchange [Rosencrantz 03]
Networking
  – Gossip [Haas 06], SPIN [Heinzelman 99], IDR [Liu 03]
Multiagent Coordination
  – STEAM [Tambe 97]
  – ACE-PJB-COMM [Roth 05], Reward-shaping [Williamson 09], dec-POMDP-com [Zilberstein 03]

Domain assumptions
Information generated dynamically and asynchronously
Limited bandwidth and memory
  – With respect to size of team
Significant local computing
Some predictive knowledge about other agents’ information needs
Peer-to-peer communications

Domain assumptions
[Figure: axes labeled Inconsistency, Complexity, and Communication, with a region marked "Our domains"]

Abstract Problem
Suppose we are given some metric for team performance in a domain:
  – How much information sharing complexity and communication is necessary to achieve good performance in a large team?
  – How can we characterize the effects of information sharing on performance in large teams?

A simple example
Two robots (1 static, 1 mobile) in a maze
Limited sensing radius, global communication
Team task: Get mobile robot to goal point
Team performance = battery power
  – Movement and communication use power
How useful is it to the team for the static robot to share its info with the mobile robot?

A simple example
[Figure: the mobile robot's path without information and with information]
The change in path cost is the “utility” of this information
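The idea that utility is a change in path cost can be made concrete with a small grid maze. The sketch below is an illustration under several assumptions that are not from the slides: the maze layout, a mobile robot that plans a shortest path on the walls it knows about and replans when it bumps into an unknown wall, a cost of one unit per move, and no communication cost. The names `bfs_path` and `travel_cost` are hypothetical.

```python
from collections import deque

def bfs_path(walls, rows, cols, start, goal):
    """Shortest 4-connected path avoiding `walls`, as a list of cells, or None."""
    parent = {start: None}
    frontier = deque([start])
    while frontier:
        cell = frontier.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nxt[0] < rows and 0 <= nxt[1] < cols
            if inside and nxt not in walls and nxt not in parent:
                parent[nxt] = cell
                frontier.append(nxt)
    return None

def travel_cost(true_walls, known_walls, rows, cols, start, goal):
    """Moves made by a robot that plans on known walls and replans on discovery."""
    known = set(known_walls)
    pos, steps = start, 0
    while pos != goal:
        path = bfs_path(known, rows, cols, pos, goal)
        if path is None:
            return None  # goal unreachable given what the robot knows
        nxt = path[1]
        if nxt in true_walls:
            known.add(nxt)  # unknown wall sensed: remember it and replan
        else:
            pos, steps = nxt, steps + 1
    return steps

# Assumed 3x5 maze: a wall segment in the middle row, plus one blocked cell
# in the bottom corridor that only the static robot has observed.
ROWS, COLS = 3, 5
initially_known = {(1, 1), (1, 2), (1, 3)}
static_robot_info = {(2, 2)}
true_walls = initially_known | static_robot_info
start, goal = (2, 0), (2, 4)

cost_without = travel_cost(true_walls, initially_known, ROWS, COLS, start, goal)
cost_with = travel_cost(true_walls, true_walls, ROWS, COLS, start, goal)
utility = cost_without - cost_with
print(cost_without, cost_with, utility)
```

In this toy maze the shared observation saves the detour into the blocked corridor, so the utility is the two wasted moves. A fuller model would also subtract the power spent on communication.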

Utility of Information
Utility: the change in team performance when an agent gets a piece of information
Often dependent on other information
Difficult to calculate during execution, even with complete real-time knowledge
  – Need to know final state of team

Objective
Utility: the change in team performance when an agent gets a piece of information
Communication cost: the cost of sending a piece of information to a specific agent

Objective
Maximize team performance:
[Equation lost in extraction; its annotations label a utility term, a communication term, the agents, the information source, and the dissemination tree]
In actual systems, this solution must be formed through local decisions!
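The equation on this slide did not survive, but the surviving labels (utility, communication, agents, information source, dissemination tree) suggest an objective of roughly the following shape. This is a hedged reconstruction, not the author's formula: the symbols $G$, $s$, $T$, $u$, and $c$ are all assumptions introduced here.

```latex
% Assumed notation (not from the slides):
%   G = (V, E)  team communication network
%   s           agent that generated the piece of information
%   T           dissemination tree in G rooted at s
%   u(a)        utility of the information to agent a
%   c(a, b)     cost of sending the information over link (a, b)
\[
  T^{*} \;=\; \arg\max_{T \subseteq G,\; s \in T}
  \left[\, \sum_{a \in V(T)} u(a) \;-\; \sum_{(a,b) \in E(T)} c(a,b) \right]
\]
```

Written this way the problem resembles a prize-collecting Steiner tree, with utilities as prizes and communication costs as edge weights.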

Distributions of Utility
For large amounts of information, consider the distribution of utility
  – May be conditioned on known data, or just independently sampled
Characterize domains as having specific distributions of utility
Estimate performance of various algorithms as function of this distribution

Back to the simple example
[Figure: Maze Utility Distribution, plotting frequency against utility (Δ path cost)]

Approach
Useful information sharing algorithms fall between two extremes:
  – Full knowledge/high complexity (omniscient)
  – No knowledge/low complexity (blind)
Observe performance of two extremes of information sharing algorithms
  – Learn when it is useful to use complex algorithms
  – If blind policies do well, other low complexity algorithms will also work well

Utility vs. Communication
[Figure: team utility plotted against communication cost, with labels for the distributional upper bound, the omniscient policy, the blind policy, and efficient policies]

Expected Upper Bound
Order statistic: expectation of k-th highest value over n samples
  – Computable for many common distributions
Expected best case performance
  – What values of utility would we expect to see in a team of n agents?
  – Sum of k highest order statistics
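The bound described above can be sketched numerically. The code below estimates the expected sum of the k highest of n i.i.d. utility samples by Monte Carlo, and compares it with the closed form for the exponential case, where the expected j-th highest of n Exponential(rate) samples is (1/rate) · Σ_{i=j..n} 1/i. The function names, the unit rate, and the team size are assumptions chosen for illustration.

```python
import random

def monte_carlo_top_k_sum(sample, n, k, trials=2000, seed=0):
    """Estimate E[sum of the k highest of n i.i.d. draws] by simulation."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        draws = sorted((sample(rng) for _ in range(n)), reverse=True)
        total += sum(draws[:k])
    return total / trials

def exponential_top_k_sum(n, k, rate=1.0):
    """Closed form for Exponential(rate): E[j-th highest] = (1/rate) * sum_{i=j}^{n} 1/i."""
    return sum(
        sum(1.0 / i for i in range(j, n + 1)) for j in range(1, k + 1)
    ) / rate

# Assumed setting: 100 agents, information delivered to the best 10 of them.
n_agents, k_best = 100, 10
closed_form = exponential_top_k_sum(n_agents, k_best)
estimate = monte_carlo_top_k_sum(lambda rng: rng.expovariate(1.0), n_agents, k_best)
print(round(closed_form, 3), round(estimate, 3))
```

The Monte Carlo version works for any distribution that can be sampled, which is useful when the utility distribution comes from data rather than a named family.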

Omniscient Policy
Lookahead policy
1. Assume we are given an estimate of utility for every other node (possibly with noise)
2. Exhaustively search all n-length paths from current node
3. Send information along best path
4. Repeat until TTL reaches 0
  – Approximation of best omniscient policy
  – Full exhaustive search is intractable
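The numbered steps above can be sketched as follows. This is a minimal illustration, not the implementation used in the talk: it assumes the "best" path is the one that maximizes the summed utility of agents not yet reached, that the source itself earns no utility, and that the graph is an adjacency dict. The names `best_path` and `lookahead_route` are hypothetical.

```python
def best_path(graph, utility, current, depth, visited):
    """Exhaustively search paths of `depth` hops from `current`.

    Returns (gain, nodes), where gain sums the utility of nodes on the
    path that have not been visited before.
    """
    if depth == 0 or not graph[current]:
        return 0.0, []
    best_gain, best_nodes = float("-inf"), []
    for nbr in graph[current]:
        gain = 0.0 if nbr in visited else utility[nbr]
        sub_gain, sub_nodes = best_path(
            graph, utility, nbr, depth - 1, visited | {nbr}
        )
        if gain + sub_gain > best_gain:
            best_gain, best_nodes = gain + sub_gain, [nbr] + sub_nodes
    return best_gain, best_nodes

def lookahead_route(graph, utility, source, ttl, depth):
    """Route one piece of information until its TTL (in hops) runs out."""
    visited, route, total, current = {source}, [source], 0.0, source
    while ttl > 0:
        _, path = best_path(graph, utility, current, min(depth, ttl), visited)
        if not path:
            break  # dead end: no neighbor to forward to
        for node in path:  # send along the chosen path, then replan
            if node not in visited:
                total += utility[node]
                visited.add(node)
            route.append(node)
            current = node
            ttl -= 1
    return route, total

# Assumed toy network: the high-utility agent D is two hops from the source A.
graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "E"], "D": ["B"], "E": ["C"]}
utility = {"A": 0.0, "B": 1.0, "C": 2.0, "D": 10.0, "E": 0.0}

print(lookahead_route(graph, utility, "A", ttl=2, depth=1))
print(lookahead_route(graph, utility, "A", ttl=2, depth=2))
```

In this toy network a one-hop lookahead heads toward C and never reaches D, while a two-hop lookahead finds it. The search cost grows exponentially with depth, which is why the full exhaustive search is intractable.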

Blind policies
Random: “Gossip” to randomly chosen neighbor
Random Self-Avoiding
  – Keep history of agents visited
  – O(lifetime of piece)
Random Trail
  – Keep history of links used
  – O(# of pieces/time step)
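The three blind policies can be sketched with one walk function. Several details here are assumptions rather than facts from the slides: the histories are kept in one place for simplicity (the slides do not say whether they travel with the piece or live at each agent), links are treated as undirected, and a policy falls back to a uniformly random neighbor when every option has already been used. The name `blind_walk` is hypothetical.

```python
import random

def blind_walk(graph, source, ttl, policy="random", seed=None):
    """Forward one piece of information for `ttl` hops without utility estimates.

    policy: "random" (gossip), "self_avoiding" (avoid agents already
    visited), or "trail" (avoid links already used).
    """
    rng = random.Random(seed)
    visited = {source}   # history of agents visited
    used_links = set()   # history of links used, stored as unordered pairs
    route, current = [source], source
    for _ in range(ttl):
        neighbors = graph[current]
        if not neighbors:
            break
        if policy == "self_avoiding":
            fresh = [n for n in neighbors if n not in visited]
        elif policy == "trail":
            fresh = [n for n in neighbors
                     if frozenset((current, n)) not in used_links]
        else:
            fresh = neighbors
        # Fall back to any neighbor when nothing fresh remains (an assumption).
        nxt = rng.choice(fresh or neighbors)
        used_links.add(frozenset((current, nxt)))
        visited.add(nxt)
        route.append(nxt)
        current = nxt
    return route

# Assumed test network: a ring of six agents.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(blind_walk(ring, 0, 5, policy="random", seed=1))
print(blind_walk(ring, 0, 5, policy="self_avoiding", seed=1))
print(blind_walk(ring, 0, 6, policy="trail", seed=1))
```

On the ring, the self-avoiding walk is forced to keep going in one direction and reaches every agent, while plain gossip may double back.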

Questions
How well does the lookahead policy approximate omniscient policy performance?
How wide is the performance gap between the omniscient policy and blind policies?
How does team size affect performance?
Is omniscient policy performance better because it knows where to route, or where not to route?

Experiment
Network of agents with utility sampled from distribution
Single piece of information shared each trial
Average-case performance recorded
Distributions: Normal, Exponential, Uniform
Networks: Small-Worlds (Watts-Beta), Scale-free (Preferential attachment), Lattice (2D grid), Hierarchy (Spanning tree)
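A stripped-down version of this kind of experiment might look like the sketch below. It covers only one cell of the design (a 2D lattice, exponential utilities, and the random gossip policy) and makes assumptions that are not from the slides: team utility is the summed utility of distinct agents reached other than the source, the source is chosen uniformly, and the function names and parameter values are hypothetical.

```python
import random

def lattice(side):
    """2D grid network as an adjacency dict keyed by (row, col)."""
    return {
        (r, c): [
            (r + dr, c + dc)
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= r + dr < side and 0 <= c + dc < side
        ]
        for r in range(side)
        for c in range(side)
    }

def average_team_utility(graph, ttl, trials=500, seed=0):
    """Average utility gathered by random gossip of one piece per trial."""
    rng = random.Random(seed)
    nodes = list(graph)
    total = 0.0
    for _trial in range(trials):
        # Fresh utilities every trial, sampled from Exponential(1).
        utility = {node: rng.expovariate(1.0) for node in nodes}
        source = current = rng.choice(nodes)
        reached = set()
        for _hop in range(ttl):
            if not graph[current]:
                break
            current = rng.choice(graph[current])
            if current != source:
                reached.add(current)
        total += sum(utility[node] for node in reached)
    return total / trials

grid = lattice(10)
for hops in (5, 20, 80):
    print(hops, round(average_team_utility(grid, hops), 2))
```

Swapping in other network generators, utility distributions, or forwarding policies at the marked points would fill in the rest of the experimental grid.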

Lookahead convergence
[Figure annotation, partly lost in extraction: "step lookahead: pathological case?"]

Performance Results
[Figure panels: Normal Distribution; Exponential Distribution]

Policy Performance
(Utility sampled from Exponential distribution)

Utility of knowledge
[Figure annotation: ~120 communications]

Scaling effects
The costs of maintaining utility estimates for Lookahead increase with team size, but the costs of the Random policy do not.

Noisy estimation
How does the omniscient policy degrade as its estimates of utility become noisy?
As noise increases, the omniscient policy approaches an ideal blind policy
Gaussian noise scaled by network distance: [equation lost in extraction]
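One way to model Gaussian noise scaled by network distance is sketched below. The exact noise formula from the slide is not available, so this is an assumption: the estimate that an observer holds of another agent's utility is the true utility plus zero-mean Gaussian noise whose standard deviation is `sigma` times the hop distance between them. The names `hop_distances` and `noisy_estimates` are hypothetical.

```python
import random
from collections import deque

def hop_distances(graph, source):
    """Breadth-first hop counts from `source` to every reachable node."""
    dist = {source: 0}
    frontier = deque([source])
    while frontier:
        node = frontier.popleft()
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                frontier.append(nbr)
    return dist

def noisy_estimates(graph, utility, observer, sigma, rng):
    """Observer's utility estimates; noise std dev grows linearly with distance."""
    dist = hop_distances(graph, observer)
    return {
        node: utility[node] + rng.gauss(0.0, sigma * hops)
        for node, hops in dist.items()
    }

# Assumed toy network: a chain of five agents with equal true utility.
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < 5] for i in range(5)}
true_utility = {i: 1.0 for i in chain}
print(noisy_estimates(chain, true_utility, observer=0, sigma=0.5,
                      rng=random.Random(0)))
```

Feeding these estimates to a lookahead router instead of the true utilities gives a simple way to study how routing quality degrades as `sigma` grows.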

Modeling maze navigation
[Figure: frequency plotted against utility (Δ path cost)]

Summary of Results
Omniscient policy approaches optimal routing on many graphs (not hierarchies)
Gap between omniscient and blind policies is small when:
  – Network is conducive (Small Worlds, Lattice)
  – Maintaining shared knowledge is expensive
  – Network is massive
  – Estimation of value is poor

Improving the model
Current work on validating this model
  – USARSim (Search and Rescue)
  – VBS2 (Military C2)
  – TREMOR (POMDP)
Predictive utility estimation and dynamics
Better solution for optimal policy:
  – Prize-collecting Steiner Tree [Ljubić 2007]

Conclusions
Utility distributions: a mechanism to test information sharing performance
  – Computable from real-world data
  – Can be conditional/joint/marginal to encode domain dependencies
Simple random policies: surprisingly competitive in many cases
  – No structural or computational overhead
  – No expensive maintenance of utility estimates

Outline
What we mean by large heterogeneous teams
The common assumptions in our domains
What we mean by utility → utility distributions
The experiment
The results
Conclusions
Future work/validation

We need information
Information generated all over network
Information consumed all over network
Team performance is improved by additional information
  – More data = better decisions
However, information loss degrades performance gracefully
  – Less data = alright decisions

Scalability of Large Teams
As size increases, amount of information exchanged grows faster than amount of available bandwidth
  – Constant network density: O(n)

Motivation
Large, heterogeneous teams of agents
  – 100s to 1000s of robots, agents, and people
  – Must collaborate to complete complex tasks
  – Decentralized algorithms

Motivation
Agents need to share information about objects and uncertainty in the environment to perform roles
  – Individual sensor readings unreliable
  – Used to reason about appropriate actions
  – Maintenance of mutual beliefs is key
Need effective means to propagate information
  – Agent needs for information change dynamically
  – Highly redundant data

Utility of Information
A given piece of data can improve a given agent’s performance by a certain amount
  – Need to determine which pieces are useful to deliver to which agents
  – Need to determine how a piece of information will affect team performance

Utility of Information
In our domains, we want to maximize the utility of what we are sending around while minimizing the cost of communication
There are many possible information sharing strategies; how can we estimate or predict their performance?

USARSim
In search and rescue/disaster response, network communication is very limited, while information generated must be shipped elsewhere to be processed.
Video and map information can be compressed, but compression is limited because data must be streamed to operators.
Also, as more autonomous vehicles are added, it becomes impossible for single operators to handle all the information anyway.

VBS2
In military C2, high-level decisions must be made based on available information from a large number of units.
However, military communications are especially limited, and further constrained by hierarchical organization and classification.
Can we intelligently guarantee that information will get between units and to command units?

TREMOR
Varakantham et al. present a multiagent POMDP solver that uses reward shaping to decompose joint POMDPs into local POMDPs in situations where most interaction occurs at a small number of “coordination locales”.
The reward shaping component can be described as an intelligent information sharing problem, and as such, we can create a distributed variant capable of solving much larger multi-agent POMDPs.
