Scalable Scheduling Policy Design for Open Soft Real-Time Systems*
Robert Glaubius, Terry Tidwell, Braden Sidoti, David Pilla, Justin Meden, Christopher Gill, and William D. Smart
Department of Computer Science and Engineering, Washington University, St. Louis, MO, USA
*Research supported by NSF grants CNS-0716764 (Cybertrust) and CCF-0448562 (CAREER)
Motivation
Systems are increasingly being designed to interact with the physical world.
This trend offers compelling new research challenges that motivate our work.
Consider, for example, the domain of mobile robotics.
[Photo: the lab's mobile robot, "Lewis"]
Motivation
As in many other systems, resources must be shared among competing tasks.
Fail-safe modes may reduce the consequences of resource-induced timing failures, but precise scheduling still matters.
The physical properties of some resources motivate new models and techniques.
Motivation
For example, sharing a camera between navigation and surveying tasks:
  in general doesn't allow efficient preemption
  involves stochastically distributed durations
Other scenarios also raise scalability questions, e.g., multi-robot heterogeneous real-time data transmission.
System Model Assumptions
Time is modeled as discrete.
Separate tasks require a shared resource:
  Access is mutually exclusive.
  Durations are independent and non-preemptive.
  Each task's distribution of durations can be known.
  Each task is always available to run.
Goal: precise resource allocation among the tasks, e.g., 2:1 utilization share targets for tasks A vs. B.
We need a scheduling policy that best achieves this goal over varying temporal intervals (a minimal sketch of the task model follows).
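To make these assumptions concrete, here is a minimal Python sketch of the task model. The Task class, the histogram representation of duration distributions, and the specific durations are illustrative assumptions, not the implementation behind the results.

import random

class Task:
    """A task whose resource usage duration is stochastic and non-preemptive."""
    def __init__(self, name, duration_dist):
        self.name = name
        # duration_dist maps a duration (in discrete quanta) to its probability.
        self.duration_dist = duration_dist

    def sample_duration(self):
        durations = list(self.duration_dist)
        weights = [self.duration_dist[d] for d in durations]
        return random.choices(durations, weights=weights)[0]

# Two tasks with a 2:1 utilization share target (A vs. B), as on the slide.
tasks = [Task("A", {2: 0.5, 3: 0.5}), Task("B", {1: 0.25, 4: 0.75})]
share = (2/3, 1/3)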
Towards Optimal Policies
A Markov decision process (MDP) is a 4-tuple (X, A, C, T) that matches our system model well:
  X, a set of states (e.g., utilizations of 8 vs. 17 quanta)
  A, a set of actions (giving the resource to a particular task)
  C, costs associated with each state
  T, a stochastic state transition function
Solving the MDP gives a policy that maps each state to an action so as to minimize long-term expected cost (a solver sketch follows).
However, we need a finite set of states in order to solve the MDP exactly.
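To illustrate what "solving the MDP" involves, here is a minimal value iteration sketch. It is an assumption-laden stand-in: it uses a discounted criterion (gamma < 1) rather than the long-run expected cost above, and it presumes the state set has already been made finite (e.g., by the wrapping described later).

def value_iteration(states, actions, cost, trans, gamma=0.95, eps=1e-6):
    """cost(x) -> float; trans(x, a) -> list of (successor, probability).
    Assumes every successor returned by trans is itself in states."""
    V = {x: 0.0 for x in states}
    while True:
        delta = 0.0
        for x in states:
            best = min(
                cost(x) + gamma * sum(p * V[y] for y, p in trans(x, a))
                for a in actions
            )
            delta = max(delta, abs(best - V[x]))
            V[x] = best
        if delta < eps:
            break
    # The greedy policy w.r.t. V maps each state to a cost-minimizing action.
    return {
        x: min(actions,
               key=lambda a: cost(x) + gamma * sum(p * V[y] for y, p in trans(x, a)))
        for x in states
    }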
Share-Aware Scheduling
System state: the cumulative resource usage of each task.
Dispatching a task moves the system stochastically through the state space according to that task's duration.
[Figure: state-space trajectory; example state (8,17)]
Share-Aware Scheduling
The utilization target u induces a ray {αu : α ≥ 0} through the state space.
Encode "goodness" relative to the share as a cost: require that costs grow with distance from the utilization ray (one possible cost is sketched below).
[Figure: utilization ray for u = (1/3, 2/3)]
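One cost that satisfies this requirement (an assumption; any cost that grows with distance from the ray would do) is the squared Euclidean distance from the state to its projection on the ray:

def ray_distance_cost(x, u):
    """Squared Euclidean distance from state x to the ray {alpha * u : alpha >= 0}."""
    alpha = sum(xi * ui for xi, ui in zip(x, u)) / sum(ui * ui for ui in u)
    proj = [alpha * ui for ui in u]
    return sum((xi - pi) ** 2 for xi, pi in zip(x, proj))

# States on the ray cost 0; cost grows as utilization drifts off target.
assert ray_distance_cost((1, 2), (1/3, 2/3)) < 1e-12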
Transition Structure
Transitions are state-independent: the relative distribution over successor states is the same in each state (see the sketch below).
State sets parallel to the utilization ray behave similarly: costs are the same, and transitions are the same.
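A sketch of this state-independent structure, reusing the hypothetical Task class from the system model sketch: dispatching task i always shifts the state along coordinate i by a duration drawn from task i's distribution, regardless of where the state currently is.

def successors(x, i, tasks):
    """Successor distribution when task i is dispatched in state x."""
    out = []
    for duration, p in tasks[i].duration_dist.items():
        y = list(x)
        y[i] += duration  # the displacement distribution is the same in every state
        out.append((tuple(y), p))
    return out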
Repeating Structure
The repeating structure allows us to remove all but one exemplar from each equivalence class (see the sketch below).
Restricting the model to states with low costs allows us to produce good approximations to the optimal policy.
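One concrete way to collapse the equivalence classes, assuming a rational share target u = q / p for an integer quanta vector q with positive entries and period p (e.g., q = (1, 2), p = 3 for u = (1/3, 2/3)): states x and x + q have the same cost and transition structure, so each class keeps one exemplar. This is a sketch of the idea, not the paper's exact construction.

def canonical(x, q):
    """Map x to the exemplar of its equivalence class under shifts by q."""
    k = min(xi // qi for xi, qi in zip(x, q))  # assumes q has positive entries
    return tuple(xi - k * qi for xi, qi in zip(x, q))

# (8, 17) and (9, 19) = (8, 17) + (1, 2) share the same exemplar:
assert canonical((8, 17), (1, 2)) == canonical((9, 19), (1, 2))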
What About Scalability?
The MDP representation allows consistent approximation of the optimal scheduling policy.
However, the approach suffers from the curse of dimensionality.
To overcome this limitation, we focus on a restricted class of scheduling policies.
Examining the policies derived from the MDP-based approach provides insight into selecting appropriate policies.
Two-Task MDP Policy
Scheduling policies induce a partition of the 2-D state space, with the boundary parallel to the share target.
Establish a decision offset d to identify the partition boundary (a 2-D sketch follows).
This is sufficient in 2-D, but what about higher dimensions?
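In 2-D the policy reduces to a side-of-line test against the boundary through the decision offset; a hedged sketch, assuming the boundary direction equals the share target u:

def two_task_policy(x, u, d):
    """Dispatch task 0 or task 1 based on which side of the boundary x lies on."""
    # Signed area test against the line through offset d with direction u.
    s = (x[0] - d[0]) * u[1] - (x[1] - d[1]) * u[0]
    # s > 0: task 0 is ahead of its share, so run task 1; otherwise run task 0.
    return 1 if s > 0 else 0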
Time Horizons Suggest a Generalization
Define the time horizon H_t = {x : x_1 + x_2 + … + x_n = t}.
We can think about scheduling at every moment by partitioning time horizons.
[Figure: horizons H_0 through H_4 in two and three dimensions, with the utilization ray u]
Three-Task MDP Policy
Action partitions meet along a decision ray parallel to the utilization ray.
Action partitions are roughly cone-shaped.
[Figure: MDP policy partitions on the time horizons t = 10, 20, 30]
Parameterizing the Partition
Specify a decision offset at the intersection of the partitions.
Anchor action vectors at the decision offset to approximate the partitions.
The conic policy selects the action vector best aligned with the displacement between the query state x and the decision offset.
[Figure: action vectors a1, a2, a3 anchored at the decision offset, with query state x]
Conic Policy Parameterization
Decision offset d.
Action vectors a1, a2, …, an.
These parameters suffice to partition each time horizon into n conic regions, and they allow good policy parameters to be found by stochastic optimization.
A hedged sketch of conic action selection follows.
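A minimal sketch of conic action selection, using cosine similarity between each action vector and the displacement from the decision offset; the similarity measure is an assumption, since the slides only say "best aligned".

import math

def conic_policy(x, d, action_vectors):
    """Pick the action whose vector best aligns with x - d (cosine similarity)."""
    disp = [xi - di for xi, di in zip(x, d)]
    dn = math.sqrt(sum(v * v for v in disp)) or 1.0

    def alignment(a):
        an = math.sqrt(sum(v * v for v in a)) or 1.0
        return sum(ai * vi for ai, vi in zip(a, disp)) / (an * dn)

    return max(range(len(action_vectors)),
               key=lambda i: alignment(action_vectors[i]))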
Stable Conic Policies
Stable conic policies are guaranteed to exist.
For example, set each action vector to point opposite its corresponding vertex of the time horizon (sketched below).
This induces a vector field that stochastically orbits the decision ray.
[Figure: time horizon with vertices (t, 0, 0) and (0, t, 0)]
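One assumed concrete form of "point opposite its corresponding vertex" is a_i = u - e_i, where e_i is the i-th standard basis vector; this is an illustration, not necessarily the construction used in the paper.

def stable_action_vectors(u):
    """Action vectors a_i = u - e_i, each pointing away from task i's vertex
    of the time horizon simplex (an assumed concrete choice)."""
    n = len(u)
    return [tuple(u[j] - (1.0 if j == i else 0.0) for j in range(n))
            for i in range(n)]

Dispatching task i drives the state toward vertex i; with a_i pointing away from that vertex, task i is selected when it is underused, pushing the state back toward the decision ray.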
Comparing Policies
Policy found by solving the MDP (for small task sets):
  π_MDP(x) – chooses the action at state x per the solved MDP
Simple heuristics (for any number of tasks):
  π_underused(x) – runs the most underutilized task
  π_greedy(x) – minimizes immediate expected cost
Conic approach (for any number of tasks):
  π_conic(x) – selects the action with the best-aligned action vector
(Sketches of the two heuristics follow.)
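Hedged sketches of the two baseline heuristics, reusing the successors and ray_distance_cost helpers defined in the earlier sketches:

def underused_policy(x, share):
    """Run the task whose utilization lags its share target the most."""
    t = sum(x) or 1
    return min(range(len(x)), key=lambda i: x[i] / t - share[i])

def greedy_policy(x, tasks, cost):
    """Run the task minimizing the immediate expected cost after dispatch."""
    def expected_cost(i):
        return sum(p * cost(y) for y, p in successors(x, i, tasks))
    return min(range(len(tasks)), key=expected_cost)

# e.g., greedy_policy(x, tasks, lambda y: ray_distance_cost(y, share))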
Policy Comparison on a Four-Task Problem
Task durations: random histograms over [2,32].
100 iterations of parameter search.
[Figure: performance of each policy, including the MDP solution]
Policy Comparison on a Ten-Task Problem
We repeated the same experiment for 10 tasks.
The MDP solution is omitted (intractable here).
Comparison with Varying Task Set Sizes
100 independent problems for each task set size.
The MDP was tractable only in the 2- and 3-task cases.
[Figure: policy performance versus task set size, including the MDP where tractable]
Conclusions
We have developed new techniques for designing non-preemptive scheduling policies for tasks with stochastic resource usage durations.
Conic policy performance is competitive with MDP solutions where comparison is possible, and on larger problems it improves on the available heuristic policies.
Future work will focus on applying and evaluating our results in different cyber-physical systems, and on extending them further in design and verification.
For Further Information
R. Glaubius, T. Tidwell, C. Gill, and W.D. Smart, "Scheduling Policy Design for Autonomic Systems", International Journal on Autonomous and Adaptive Communications Systems, 2(3):276-296, 2009.
R. Glaubius, Scheduling Policy Design using Stochastic Dynamic Programming, Ph.D. Thesis, Washington University, St. Louis, MO, USA, 2009.
R. Glaubius, T. Tidwell, C. Gill, and W.D. Smart, "Scheduling Design and Verification for Open Soft Real-Time Systems", RTSS 2008.
R. Glaubius, T. Tidwell, C. Gill, and W.D. Smart, "Scheduling Design with Unknown Execution Time Distributions or Modes", Tech. Report WUCSE-2009-15, 2009.
T. Tidwell, R. Glaubius, C. Gill, and W.D. Smart, "Scheduling for Reliable Execution in Autonomic Systems", ATC 2008.
C. Gill, W.D. Smart, T. Tidwell, and R. Glaubius, "Scheduling as a Learned Art", OSPERT, 2008.
Project web site: http://www.cse.wustl.edu/~cdgill/Cybertrust/
Questions?
Appendix: More Tasks Implies Higher Cost
Simple problem: fair-share scheduling of n deterministic tasks with unit duration.
Trajectories under round-robin scheduling:
2 tasks: E{c(x)} = 1/2. Trajectory: (0,0) → (1,0) → (1,1) ≡ (0,0). Costs: c(0,0) = 0; c(1,0) = 1.
3 tasks: E{c(x)} = 8/9. Trajectory: (0,0,0) → (1,0,0) → (1,1,0) → (1,1,1) ≡ (0,0,0). Costs: c(0,0,0) = 0; c(1,0,0) = 4/3; c(1,1,0) = 4/3.
n tasks: E{c(x)} = (n+1)(n-1)/(3n). (A derivation sketch follows.)
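The listed values are consistent with the cost c(x) = 2‖x − t·u‖², where t = Σ_i x_i and u = (1/n, …, 1/n); assuming that scaled squared-distance cost, the general formula follows from a short calculation over one round-robin period:

% Round robin visits x_k with k ones and n - k zeros at time t = k, so
c(x_k) = 2\left[k\Bigl(1-\tfrac{k}{n}\Bigr)^2 + (n-k)\Bigl(\tfrac{k}{n}\Bigr)^2\right]
       = \frac{2k(n-k)}{n},
\qquad
\mathbb{E}\{c(x)\} = \frac{1}{n}\sum_{k=0}^{n-1} \frac{2k(n-k)}{n}
                   = \frac{(n+1)(n-1)}{3n}.

For n = 2 this gives (0 + 1)/2 = 1/2, and for n = 3 it gives (0 + 4/3 + 4/3)/3 = 8/9, matching the cases above.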