Scalable Scheduling Policy Design for Open Soft Real-Time Systems*


Scalable Scheduling Policy Design for Open Soft Real-Time Systems*
Robert Glaubius, Terry Tidwell, Braden Sidoti, David Pilla, Justin Meden, Christopher Gill, and William D. Smart
Department of Computer Science and Engineering, Washington University, St. Louis, MO, USA
*Research supported by NSF grants CNS-0716764 (Cybertrust) and CCF-0448562 (CAREER)

Motivation
Systems are increasingly being designed to interact with the physical world. This trend offers compelling new research challenges that motivate our work. Consider, for example, the domain of mobile robotics.

Motivation
As in many other systems, resources must be shared among competing tasks. Fail-safe modes may reduce the consequences of resource-induced timing failures, but precise scheduling still matters. The physical properties of some resources motivate new models and techniques.

Motivation
For example, sharing a camera between navigation and surveying tasks in general doesn't allow efficient preemption and involves stochastically distributed durations. Other scenarios also raise scalability questions, e.g., heterogeneous real-time data transmission among multiple robots.

System Model Assumptions
Time is modeled as discrete. Separate tasks require a shared resource:
- access is mutually exclusive
- durations are independent and non-preemptive
- each task's distribution of durations can be known
- each task is always available to run
Goal: precise resource allocation among the tasks, e.g., 2:1 utilization share targets for tasks A vs. B. We need a scheduling policy that best achieves this goal over varying temporal intervals.
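A minimal sketch of these modeling assumptions in Python; the task names, duration histograms, and the 2:1 share target below are illustrative values, not taken from the talk's experiments:

```python
import random

# Each task has a known duration distribution (a histogram over discrete
# time quanta); access is mutually exclusive and non-preemptive.
TASKS = {
    "A": {1: 0.5, 2: 0.5},         # {duration in quanta: probability}
    "B": {2: 0.25, 3: 0.75},
}
TARGET = {"A": 2 / 3, "B": 1 / 3}  # 2:1 utilization share target

def dispatch(task, usage):
    """Run `task` to completion; it holds the resource for a sampled duration."""
    durations = list(TASKS[task])
    weights = [TASKS[task][d] for d in durations]
    usage[task] += random.choices(durations, weights=weights)[0]
    return usage
```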

Towards Optimal Policies
A Markov decision process (MDP) is a 4-tuple (X, A, C, T) that matches our system model well:
- X, a set of states (e.g., utilizations of 8 vs. 17 quanta)
- A, the set of actions (giving the resource to a particular task)
- C, costs associated with each state
- T, a stochastic state transition function
Solving the MDP gives a policy that maps each state to an action so as to minimize long-term expected costs. However, we need a finite set of states in order to solve it exactly.
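As a concrete illustration, here is a minimal value-iteration sketch over a truncated two-task state space. The duration histograms, discount factor, truncation bound, and the specific cost function (distance from the utilization ray) are assumptions chosen for the example, not the authors' exact construction:

```python
import itertools
import numpy as np

GAMMA = 0.95                      # discount factor (assumed)
BOUND = 12                        # truncate usage counts to keep X finite
U = np.array([1 / 3, 2 / 3])      # utilization target

DURATIONS = {                     # A: per-action duration distributions
    0: {1: 0.5, 2: 0.5},
    1: {1: 0.25, 2: 0.5, 3: 0.25},
}

def cost(x):
    """C: distance from the utilization ray (one choice satisfying the model)."""
    x = np.asarray(x, float)
    t = (x @ U) / (U @ U)
    return float(np.linalg.norm(x - t * U))

states = list(itertools.product(range(BOUND), repeat=2))  # X, truncated
V = {x: 0.0 for x in states}

def q_value(x, a):
    """Expected cost-to-go of dispatching task a in state x (T is implicit)."""
    q = 0.0
    for dur, p in DURATIONS[a].items():
        y = (min(x[0] + dur * (a == 0), BOUND - 1),
             min(x[1] + dur * (a == 1), BOUND - 1))
        q += p * (cost(y) + GAMMA * V[y])
    return q

for _ in range(200):              # value-iteration sweeps
    for x in states:
        V[x] = min(q_value(x, a) for a in DURATIONS)

policy = {x: min(DURATIONS, key=lambda a, x=x: q_value(x, a)) for x in states}
```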

Share-Aware Scheduling
System state: the cumulative resource usage of each task, e.g., state (8,17) means the first task has used 8 quanta and the second 17. Dispatching a task moves the system stochastically through the state space according to that task's duration.

Share-Aware Scheduling
The utilization target u induces a ray {tu : t ≥ 0} through the state space, e.g., u = (1/3, 2/3). Encode "goodness" relative to the share as a cost, requiring that costs grow with distance from the utilization ray.

Transition Structure
Transitions are state-independent: the relative distribution over successor states is the same in each state. Consequently, state sets parallel to the utilization ray behave similarly: costs are the same, and transitions are the same.

Repeating Structure
This repeating structure allows us to remove all but one exemplar from each equivalence class. Restricting the model to states with low costs then lets us produce good approximations to the optimal policy.

What About Scalability?
The MDP representation allows consistent approximation of the optimal scheduling policy, but the approach suffers from the curse of dimensionality. To overcome this limitation, we focus on a restricted class of scheduling policies. Examining the policies derived from the MDP-based approach provides insight into selecting appropriate policies.

Two-Task MDP Policy
Scheduling policies induce a partition of the 2-D state space, with a boundary parallel to the share target. Establishing a decision offset d identifies the partition boundary. This is sufficient in 2-D, but what about higher dimensions?

Time Horizons Suggest a Generalization
The time horizon at time t is the set Ht = {x : x1 + x2 + … + xn = t}. The key idea: we can think about the problem of scheduling at every moment by partitioning each time horizon.

Three-Task MDP Policy
(Figure: action partitions at horizons t = 10, 20, 30.) Action partitions meet along a decision ray parallel to the utilization ray, and each partition is roughly cone-shaped.

Parameterizing the Partition
Specify a decision offset at the intersection of the partitions, and anchor action vectors at the decision offset to approximate the partitions. The conic policy selects the action whose vector is best aligned with the displacement between the query state x and the decision offset.

Conic Policy Parameterization
A conic policy is parameterized by a decision offset d and action vectors a1, a2, …, an. These suffice to partition each time horizon into n conic regions, and they allow good policy parameters to be found by stochastic optimization.
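A minimal sketch of the conic selection rule; the stochastic search over the parameters d and a1, …, an is omitted, and the tie-breaking on the decision ray is an arbitrary choice:

```python
import numpy as np

def conic_action(x, d, action_vectors):
    """Pick the action whose vector best aligns with the displacement x - d."""
    v = np.asarray(x, float) - np.asarray(d, float)
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return 0                      # on the decision ray: tie-break arbitrarily
    v /= norm
    scores = [v @ (np.asarray(a, float) / np.linalg.norm(a))
              for a in action_vectors]
    return int(np.argmax(scores))     # maximize cosine alignment
```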

Stable Conic Policies
Stable conic policies are guaranteed to exist. For example, set each action vector to point opposite its corresponding vertex of the time horizon simplex (the vertices (t, 0, 0), (0, t, 0), and so on). This induces a vector field that stochastically orbits the decision ray.
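One stabilizing construction consistent with that description, sketched under the assumption that "opposite the vertex t*ei" means ai = -ei (this is a reading of the slide, not the authors' stated choice):

```python
import numpy as np

def stabilizing_action_vectors(n):
    """Point action vector a_i opposite the horizon-simplex vertex t*e_i.

    With a_i = -e_i, the conic rule dispatches the task whose usage lags
    furthest behind the decision offset, pushing the state back toward
    the decision ray.
    """
    return [-np.eye(n)[i] for i in range(n)]

# With conic_action from the previous sketch:
# a = conic_action(x, d, stabilizing_action_vectors(3))
```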

Comparing Policies
Policy found by solving the MDP (for small task sets):
- πMDP(x): chooses the action at state x per the solved MDP
Simple heuristics (for any number of tasks):
- πunderused(x): runs the most underutilized task
- πgreedy(x): minimizes immediate expected cost
Conic approach (for any number of tasks):
- πconic(x): selects the action with the best-aligned action vector
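Sketches of the two heuristics; the cost function and the duration-histogram encoding follow the earlier value-iteration sketch and are assumptions for illustration:

```python
import numpy as np

def pi_underused(x, u):
    """Run the task whose realized share lags furthest behind its target."""
    x = np.asarray(x, float)
    share = x / x.sum() if x.sum() > 0 else np.zeros_like(x)
    return int(np.argmin(share - np.asarray(u, float)))

def pi_greedy(x, durations, cost):
    """Run the task minimizing immediate expected cost after dispatch."""
    x = np.asarray(x, float)
    def expected_cost(a):
        e_a = np.eye(len(x))[a]
        return sum(p * cost(x + dur * e_a) for dur, p in durations[a].items())
    return min(durations, key=expected_cost)

# e.g., pi_greedy(x, DURATIONS, cost) with DURATIONS and cost taken from
# the value-iteration sketch above.
```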

Policy Comparison on a Four-Task Problem
Task durations: random histograms over [2, 32]; 100 iterations of parameter search. (Figure: per-policy performance, including the MDP policy.)

Policy Comparison on a Ten-Task Problem
We repeated the same experiment for 10 tasks; the MDP is omitted (intractable here).

Comparison with Varying Task Set Sizes
100 independent problems for each task set size; the MDP was tractable only in the 2- and 3-task cases.

Conclusions
We have developed new techniques for designing non-preemptive scheduling policies for tasks with stochastic resource usage durations. Conic policy performance is competitive with MDP solutions where comparison is possible, and for larger problems it improves on the available heuristic policies. Future work will focus on applying and evaluating our results in different cyber-physical systems, and on extending them further in design and verification.

For Further Information
R. Glaubius, T. Tidwell, C. Gill, and W.D. Smart, "Scheduling Policy Design for Autonomic Systems", International Journal on Autonomous and Adaptive Communications Systems, 2(3):276–296, 2009.
R. Glaubius, Scheduling Policy Design using Stochastic Dynamic Programming, Ph.D. Thesis, Washington University, St. Louis, MO, USA, 2009.
R. Glaubius, T. Tidwell, C. Gill, and W.D. Smart, "Scheduling Design and Verification for Open Soft Real-Time Systems", RTSS 2008.
R. Glaubius, T. Tidwell, C. Gill, and W.D. Smart, "Scheduling Design with Unknown Execution Time Distributions or Modes", Tech. Report WUCSE-2009-15, 2009.
T. Tidwell, R. Glaubius, C. Gill, and W.D. Smart, "Scheduling for Reliable Execution in Autonomic Systems", ATC 2008.
C. Gill, W.D. Smart, T. Tidwell, and R. Glaubius, "Scheduling as a Learned Art", OSPERT 2008.
Project web site: http://www.cse.wustl.edu/~cdgill/Cybertrust/

Questions?

Appendix: More Tasks Imply Higher Cost
Simple problem: fair-share scheduling of n deterministic tasks with unit duration. Trajectories under round-robin scheduling (a completed round wraps back to the origin):
2 tasks: E{c(x)} = 1/2. Trajectory: (0,0) → (1,0) → (1,1) ≡ (0,0). Costs: c(0,0) = 0, c(1,0) = 1.
3 tasks: E{c(x)} = 8/9. Trajectory: (0,0,0) → (1,0,0) → (1,1,0) → (1,1,1) ≡ (0,0,0). Costs: c(0,0,0) = 0, c(1,0,0) = 4/3, c(1,1,0) = 4/3.
n tasks: E{c(x)} = (n+1)(n−1)/(3n).
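A quick numerical check of the formula. The cost here is c(x) = 2·Σi (xi − mean(x))², a scaling inferred from the slide's example values (it reproduces c(1,0) = 1 and c(1,0,0) = 4/3), so treat it as an assumption:

```python
import numpy as np

def expected_rr_cost(n):
    """Average cost over one round-robin cycle of n unit-duration tasks."""
    x = np.zeros(n)
    total = 0.0
    for i in range(n):                 # visit (0,...,0), (1,0,...,0), ...
        total += 2.0 * np.sum((x - x.mean()) ** 2)
        x[i] += 1.0                    # final state (1,...,1) wraps to the origin
    return total / n

for n in range(2, 8):                  # matches (n+1)(n-1)/(3n) = (n^2-1)/(3n)
    assert np.isclose(expected_rr_cost(n), (n * n - 1) / (3 * n))
```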