Hierarchical mission control of automata with human supervision
Prof. David A. Castañon, Boston University

Problem of Interest
Coordination of heterogeneous teams to accomplish tasks in uncertain, risky environments
- Vehicles with different capabilities and resources
- Some resources are renewable (sensors), others are not
- Tasks are spatially distributed and require combinations of capabilities
- Successful completion of tasks is not guaranteed; likelihood of success depends on resources assigned
- Tasks arrive and depart randomly
- Task types may be unknown until observed
- Vehicles may fail randomly, depending on trajectories
Key aspect: real-time adaptation to events
Human supervision
- Determine task priority/value
- Modify individual vehicle task assignments when desired
- Determine specific vehicle schedules when desired

Problem Illustration

Experiment Model
Multiple robots search for and perform tasks at BU’s Mechatronics Lab.

Why Is This a Hard Problem?
Uncertain environment and dynamics
- Unknown targets
- Uncertain effectiveness of sensing and actions
Requires a highly adaptive system, anticipative of and responsive to new information
- Hedge against loss of assets, new arrivals, action failures, …
Diverse set of vehicles with multiple capabilities
- Dynamic role selection, ad hoc teaming
Dual control problem: manage both information acquisition and action
- Trade off search and sensing versus actions
- Dynamic coupling of available capabilities to achieve desired effects
Support and adapt to human control inputs
- Goals, constraints, fixed decisions
- Provide information to assess effects of changes

Classes of Algorithms
Operations Research
- Deterministic and stochastic multi-vehicle task assignment and scheduling
- Large vehicles, small tasks, limited cooperation, homogeneous activities
- No risk; uncertainty limited to new task arrivals and departures, which are independent of vehicle actions
- Search theory and sensor management
- Large-scale resource allocation and integer programming
Stochastic Control
- Control of stochastic queuing systems in communications
- Single-vehicle routing and low-level vehicle trajectory control
- Swarm control approaches with stability and performance guarantees (homogeneous vehicles)
- Approximate dynamic programming techniques
- Not focused on combinatorial optimization in general, with rare exceptions
- Model predictive control of complex stochastic systems
Artificial Intelligence / Computer Science
- Constraint satisfaction and temporal planning systems
- Non-real-time, off-line combinatorial constraint-based search
- Limited incorporation of risk/reward and information dynamics
- Behavioral control in robotics for simple tasks
- Reinforcement learning for stochastic planning in well-defined repeated environments (e.g., games)

Proposed Approach: Hierarchical Model Predictive Control
Hierarchical approach: avoid the combinatorial explosion of complexity through decomposition.
Team strategy selection: address uncertainty
- Allocate team capabilities to tasks, hedging against task type uncertainty, new task arrivals, and action success probabilities
- Simplify distribution of resources across vehicles
Team activity scheduling: address combinatorial complexity
- Allocate team activities to platforms
- Select schedules and routes
Model predictive control: re-solve the optimization in response to new information or human directives (see the sketch after this list)
- Receding horizon control
- Respond to new tasks, changes in task status, platform loss, …
- Adapt to human guidance and constraints
Requires fast algorithms for real-time control.
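
A minimal sketch of the receding-horizon loop described above, assuming hypothetical planner and environment interfaces (`select_team_strategy`, `schedule_team_activities`, `world`, `operator`) that are not part of the original material. Only the first-stage decisions are executed before the problem is re-solved on new information or operator input.

```python
def hierarchical_mpc_loop(world, operator, select_team_strategy,
                          schedule_team_activities, horizon=3):
    """Receding-horizon mission control loop (illustrative sketch only).

    Each cycle: (1) team strategy selection allocates capabilities to tasks
    while hedging against uncertainty, (2) team activity scheduling turns the
    allocation into vehicle routes, (3) only the first-stage decisions are
    executed, and the problem is re-solved when new information arrives.
    """
    while not world.mission_complete():
        # Human guidance: task values, fixed assignments, constraints.
        guidance = operator.current_guidance()

        # Upper level: stochastic allocation of capabilities to tasks.
        allocation = select_team_strategy(world.task_beliefs(),
                                          world.team_resources(),
                                          guidance, horizon)

        # Lower level: deterministic routing/scheduling of vehicles.
        schedule = schedule_team_activities(allocation,
                                            world.vehicle_states(),
                                            guidance)

        # Execute only the first stage, then re-plan (receding horizon).
        world.execute(schedule.first_stage())
        world.wait_for_events()  # new tasks, outcomes, platform losses, edits
```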

Team Strategy Selection
Stochastic dynamic programming formulation
- Multistage formulation, with outcomes observed after each stage
[Figure: resources of Types 1-4 allocated to Tasks 1, …, N over Stages 1-3, with new tasks (Task N+1, …, Task N+M) arriving at later stages]

Notation
- N tasks, indexed i = 1, …, N
- M resource types, indexed j = 1, …, M
- Assume independence of all task completion events

Example: Two-Stage, Single-Resource Problem
Define a task completion state after each stage
- Task completion state observed after each stage
Decisions are now feedback policies.
Task completion state dynamics: controlled Markov chain (see the sketch below)
- Resources assigned determine transition probabilities
- Independence of completion event outcomes decouples transition dynamics across tasks
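
A hedged reconstruction of the completion-state dynamics implied above (the symbols $x_i^k$, $u_i^k$, and $q_i$ are assumptions, not from the slides): let $x_i^k \in \{0, 1\}$ denote whether task $i$ is complete after stage $k$, and $u_i^k$ the amount of the single resource assigned to task $i$ in stage $k$.

$$
\Pr\{x_i^{k+1} = 1 \mid x_i^k = 0,\ u_i^k\} = q_i(u_i^k), \qquad
\Pr\{x_i^{k+1} = 1 \mid x_i^k = 1\} = 1,
$$

where $q_i(u)$ is the success probability given $u$ resource units; a common special case is $q_i(u) = 1 - (1 - p_i)^u$ when each unit succeeds independently with probability $p_i$. Because completion events are independent across tasks, each task evolves as its own two-state controlled Markov chain, coupled to the others only through the shared resource budget.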

Two-Stage Problem Statement
- Objective: minimize the expected value of uncompleted tasks plus expected resource use costs
- Constraints: resource limits
(A reconstructed formulation is sketched below.)
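
The slide's equations did not survive extraction; the following is a hedged reconstruction consistent with the stated objective and constraints, using assumed symbols: $V_i$ is the value of task $i$, $c$ the unit cost of the single resource, $R_1, R_2$ the stage budgets, $u_i^1$ the first-stage allocation to task $i$, and $\mu_i$ the second-stage feedback policy applied after the completion state $x^1$ is observed.

$$
\min_{u^1,\,\mu}\ \mathbb{E}\!\left[\sum_{i=1}^{N} V_i\,\mathbf{1}\{x_i^2 = 0\}
\;+\; c\sum_{i=1}^{N}\bigl(u_i^1 + \mu_i(x^1)\bigr)\right]
$$

subject to

$$
\sum_{i=1}^{N} u_i^1 \le R_1, \qquad
\sum_{i=1}^{N} \mu_i(x^1) \le R_2 \ \ \text{for every realization } x^1, \qquad
u_i^1,\ \mu_i(x^1) \in \mathbb{Z}_{\ge 0}.
$$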

Relaxed Two-Stage Problem
The original problem is a stochastic integer program
- PSPACE-complete, hard
Expand the set of admissible feedback strategies in the second stage
- Generates a lower bound on the optimal value function
- New constraint on the average number of resources used: relaxes an exponential number of constraints (one per realization) to a single constraint (sketched below)
- Simple result: all feasible strategies in the original problem are feasible in the relaxed problem
- Hence a lower bound on original performance
- Idea: select optimal strategies for the lower-bound problem
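
A hedged sketch of the relaxation, continuing the assumed notation above: the per-realization second-stage budget constraints are replaced by a single constraint in expectation,

$$
\sum_{i=1}^{N}\mu_i(x^1) \le R_2 \ \ \text{for all } x^1
\quad\Longrightarrow\quad
\mathbb{E}\!\left[\sum_{i=1}^{N}\mu_i(x^1)\right] \le R_2 .
$$

Every policy satisfying the left-hand constraints also satisfies the right-hand one, so the relaxed optimal value can only be lower, which is what makes it a valid lower bound on the original problem.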

Characterization of Optimal Strategies
Important concept: mixed local strategies
- Local strategies: feedback strategies in which the actions on a given task depend only on the state of that task
- Mixed strategy: random combination of pure strategies
- Mixed strategies may achieve better performance than pure strategies in the relaxed problem
Theorem: In the relaxed problem, for every pure strategy there is a mixed local strategy that uses the same resources and achieves the same expected performance
- Proven by construction
- Restricts the search to mixed local strategies
- Fast algorithm for computing optimal strategies using convex optimization principles (a sketch of the resulting decomposition follows)
- Can solve exactly with complexity O((M₁ + N) log N)
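
One way to see why a fast convex-optimization-based algorithm is plausible; this is a hedged sketch, not the slides' actual derivation, and the symbols $J_i$, $r_i$, $\Delta_i$, and $R$ are assumptions. Restricted to mixed local strategies, the relaxed problem separates across tasks and is coupled only by the single average-resource constraint, so dualizing that constraint with a multiplier $\lambda \ge 0$ gives

$$
\max_{\lambda \ge 0}\ \left\{ \sum_{i=1}^{N}\ \min_{\sigma_i \in \Delta_i}
\Bigl( J_i(\sigma_i) + \lambda\, r_i(\sigma_i) \Bigr) \;-\; \lambda R \right\},
$$

where $\sigma_i$ ranges over mixed local strategies for task $i$, $J_i$ is its expected uncompleted-value-plus-cost, and $r_i$ its expected resource usage. Each inner minimization involves only one task, and the outer problem is a one-dimensional concave maximization, which is consistent with the near-linear O((M₁ + N) log N) complexity quoted above.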

Comments and Extensions
The MPC approach guarantees feasibility of the approximate problem's solution with respect to the original problem
- Obtain an approximate solution, but implement only the first-stage allocations
- Re-solve the problem when new observations become available, with a receding horizon
- Fast algorithm allows for rapid computation
Main extensions:
- Multiple stages
- Multiple resource types, both renewable and non-renewable (solution is NP-hard, but can be solved approximately)
- Multiple task types, sensing and action (must sense to observe outcomes)
- New task arrivals, discovered by searching
- Unknown task types: detect presence, but must observe to determine task type
- Task departures, deadlines

Team Activity Scheduling
Inputs from team strategy selection
- Desired resources assigned to each task in the current period
- Desired resources held in reserve until future information is collected
Guidance and constraints from human operators
- Task values, selected platform-task assignments, selected task-resource assignments
Known parameters
- Vehicle locations, resources carried by each vehicle, task locations
Problem: assign resource deliveries for tasks to individual vehicles, and select the sequence of activities for each vehicle
- Deterministic multi-vehicle routing problem (VRP)
- NP-hard, with many useful approximate approaches available

Team Activity Assignment Formulation
[Slide formulation lost in extraction: a discounted-cost objective over vehicle routes, subject to customer-visit, N-vehicles-to-route, and integrality constraints; classical application: truck routing. A reconstruction in standard form follows below.]
The VRP is an NP-hard problem (traveling salesman) wrapped in an NP-hard problem (bin packing).
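
Since the slide's equations did not survive, here is a standard vehicle routing formulation of the kind the slide outlines; the symbols are assumptions, and the original may have used a discounted-cost objective and task/resource constraints in place of the classical ones. Binary variables $x_{ij}^{k}$ indicate whether vehicle $k$ travels directly from node $i$ to node $j$, with depot node $0$, customer (task) nodes $1, \ldots, N_t$, and travel costs $c_{ij}$.

$$
\min_{x}\ \sum_{k=1}^{K}\sum_{i}\sum_{j} c_{ij}\, x_{ij}^{k}
$$

subject to

$$
\sum_{k=1}^{K}\sum_{i} x_{ij}^{k} = 1 \ \ \text{for each customer } j
\qquad \text{(every task visited exactly once)},
$$
$$
\sum_{i} x_{ih}^{k} = \sum_{j} x_{hj}^{k} \ \ \text{for each node } h \text{ and vehicle } k
\qquad \text{(route continuity)},
$$
$$
\sum_{j} x_{0j}^{k} \le 1 \ \ \text{for each vehicle } k
\qquad \text{(at most one route per vehicle)},
$$
$$
x_{ij}^{k} \in \{0, 1\} \qquad \text{(integrality)},
$$

together with capacity and subtour-elimination constraints.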

Team Activity Assignment Algorithm
Candidate algorithm: tabu search (a skeleton is sketched below)
- Locally perturbs trial solutions
- Uses a “tabu” list to avoid getting trapped in local minima
- Evaluated by AFIT for UAV routing
- Fast replanning leads to rapid response to events
- Handles time-window constraints instead of precedence constraints
Significant extensions to date
- Multiple task types
- Multiple resource types
- Compound tasks involving multiple vehicles
Alternative algorithms (AFOSR-sponsored)
- Mixed-integer linear programming, J. How, MIT
- Receding horizon controller, C. Cassandras, BU
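
A minimal tabu search skeleton to make the bullets above concrete; the solution representation, neighborhood generator, tabu tenure, and cost function are illustrative assumptions, not the AFIT implementation. In the mission-control setting, a neighborhood move would typically relocate a task or resource delivery between vehicle schedules while respecting time windows.

```python
def tabu_search(initial_solution, cost, neighbors, iterations=500, tenure=20):
    """Generic tabu search skeleton (illustrative sketch, not the AFIT code).

    initial_solution: starting routes, e.g. a tuple of task sequences per vehicle
    cost:             function mapping a solution to a scalar objective
    neighbors:        function yielding (move_key, candidate_solution) pairs
    """
    current = best = initial_solution
    best_cost = cost(best)
    tabu = {}  # move_key -> iteration index until which the move is forbidden

    for it in range(iterations):
        candidates = [
            (move, cand) for move, cand in neighbors(current)
            # Tabu moves are allowed only if they beat the best known solution
            # (the standard "aspiration" criterion).
            if tabu.get(move, -1) < it or cost(cand) < best_cost
        ]
        if not candidates:
            break
        # Accept the best candidate even if it is worse than the current
        # solution; this is how tabu search escapes local minima.
        move, current = min(candidates, key=lambda mc: cost(mc[1]))
        tabu[move] = it + tenure
        current_cost = cost(current)
        if current_cost < best_cost:
            best, best_cost = current, current_cost
    return best, best_cost
```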

Comments
Algorithms are available for dynamic control of automata performing tasks in uncertain, risky environments
- Fast generation of desired courses of action
- Hedge against uncertain outcomes, adapt to new information
Operator interaction is through the value structure, plus fixed decision variables and constraints
- Allows for “micro”-management
- Very limited insight into the effects of operator inputs on automata behavior and performance
Fundamental problem for this MURI research: prediction of the course of action in the presence of uncertainty
- Not a single plan, but a contingency tree of possible actions/responses
- Hard to modify or approve

Experimental Platform for Research
Multiple robots search for and perform tasks at BU’s Mechatronics Lab
- Can provide operator control of some platforms: human-automata teams
- Can control the information displayed and the risk to each operator using video

Future Activities
- Implement research experiments involving tasks with performance uncertainty in the test facility: vary tempo, size, uncertainty, information
- Develop algorithms to interact with operators in alternative roles: supervisory control, team partners
- Extend existing algorithms to different classes of tasks: area search, task discovery, risk to platforms
- Develop algorithms to assist operators in predicting the behavior of automata teams in uncertain environments
- Collaborate with the MURI team to design and analyze experiments involving alternative structures for human-automata teams