Senior Design Project: Parallel Task Scheduling in Heterogeneous Computing Environments


Senior Design Project: Parallel Task Scheduling in Heterogeneous Computing Environments
Senior Design Students: Christopher Blandin and Dylan Machovec
Post-doctoral Scholar: Bhavesh Khemka
Faculty Advisor: H. J. Siegel
Senior Design Presentation

Outline
- motivation
- our system model
- problem statement
- existing work
- simulation details
- future work

Motivation
- High Performance Computing (HPC) is used by a wide variety of fields to solve challenging problems
  - physics simulations, the oil and gas industry, climate modeling, computational biology, computational chemistry, and many more
- improving performance increases productivity in these fields
- we plan to improve system performance by designing novel scheduling techniques
- scheduling refers to the assignment and ordering of tasks onto machines for execution

System Model – Definitions
- heterogeneity: differing execution characteristics
- homogeneity: having the same execution characteristics
- oversubscribed: more tasks arriving than the system can execute immediately

System Model – Cluster Model
- clusters have multiple homogeneous nodes
- clusters are heterogeneous with respect to each other
- nodes may have multiple multicore processors
- each node may run only one task at a given time
  - avoids interference between tasks
- task assignments are made at the node level
- a task cannot be spread across two clusters
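A minimal sketch of how this cluster model could be represented in a simulator; the names (Node, Cluster, busy_until) and the single availability field are illustrative assumptions, not the project's actual code.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    """A single node; at most one task may run on it at a time."""
    node_id: int
    cores: int               # total cores across its multicore processors
    busy_until: float = 0.0  # simulation time at which the node becomes free

@dataclass
class Cluster:
    """A cluster of homogeneous nodes; clusters differ from one another."""
    cluster_id: int
    nodes: List[Node] = field(default_factory=list)

    def idle_nodes(self, now: float) -> List[Node]:
        """Nodes that are free at the current simulation time."""
        return [n for n in self.nodes if n.busy_until <= now]
```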

System Model – Workload Characteristics
- dynamically arriving tasks
- when a task arrives, the scheduler obtains the following information:
  - arrival time
  - execution time
    - different on different clusters (because of heterogeneity)
  - number of processing cores required
  - value function
- tasks are heterogeneous
- no pre-emption
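The per-task information listed above could be captured in a record like the following sketch; all field names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Task:
    """What the scheduler sees when a task arrives (illustrative fields)."""
    task_id: int
    arrival_time: float
    exec_time: Dict[int, float]         # estimated execution time per cluster_id
    cores_required: int                 # processing cores the task needs
    value_fn: Callable[[float], float]  # maps completion time to value earned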

System Model – Value Function
- each task has a value function
  - represents the value of the task when it completes
  - may be different for each task
  - monotonically decreasing
- value functions can be fully described with four parameters
  - a constant starting value
  - after the soft deadline, the value decays linearly to a final value
  - after the hard deadline, the value drops to zero
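Written as code, a value function with these four parameters (starting value, soft deadline, final value, hard deadline; the parameter names are ours) might look like this sketch:

```python
def task_value(completion_time: float,
               start_value: float,
               soft_deadline: float,
               final_value: float,
               hard_deadline: float) -> float:
    """Piecewise value function built from the slide's four parameters.

    - constant at start_value up to the soft deadline
    - decays linearly from start_value to final_value between the soft
      and hard deadlines
    - zero after the hard deadline
    """
    if completion_time <= soft_deadline:
        return start_value
    if completion_time <= hard_deadline:
        frac = (completion_time - soft_deadline) / (hard_deadline - soft_deadline)
        return start_value + frac * (final_value - start_value)
    return 0.0
```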

Problem Statement
- we measure the performance of a scheduler in our environment as the sum of the value earned by completing tasks over a given amount of time
- goal of the heuristics: maximize the total sum of value earned over a given amount of time
  - improve the performance of HPC systems
- main contribution
  - design, simulation, and analysis of resource allocation heuristics for task scheduling
  - heterogeneous HPC system with multiple clusters
  - tasks with associated value functions with soft and hard deadlines
  - each task executes in parallel over multiple cores
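In symbols, the objective can be stated roughly as follows; the notation is ours rather than from the slides, with T the set of tasks that complete within the evaluation window, c_i the completion time of task i, and v_i its value function.

```latex
% total value earned over the evaluation window
V \;=\; \sum_{i \in T} v_i(c_i)
```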

Mapping Event
- mapping event: when task assignment decision(s) are made
- trigger a mapping event whenever:
  - a node becomes available, or
  - a task arrives
- during a mapping event, all tasks that have neither been reserved nor started execution are considered mappable
- only makes task assignments that can start now
  - a heuristic may or may not make reservations
[Figure: timeline of the nodes of cluster 1 and cluster 2 around the current time, with tasks already placed on the nodes and a set of unmapped tasks waiting to be assigned.]
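A rough sketch of the mapping-event trigger and the mappable-task set; the attributes (reserved, start_time) and the scheduler.map interface are assumptions standing in for the real simulator code.

```python
def mappable_tasks(all_tasks, now):
    """Tasks that have neither been reserved nor started executing."""
    return [t for t in all_tasks if not t.reserved and t.start_time is None]

def on_event(scheduler, all_tasks, clusters, now, event):
    """Trigger a mapping event when a node frees up or a task arrives."""
    if event.kind in ("node_available", "task_arrival"):
        candidates = mappable_tasks(all_tasks, now)
        # the heuristic only commits assignments that can start immediately;
        # it may also record reservations for tasks it cannot start yet
        scheduler.map(candidates, clusters, now)
```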

Planned Heuristics
- four planned heuristics
  - EASY Backfilling
  - FCFS with Multiple Queues
  - Max-Max Value
  - Max-Max Value-Per-Resource
- submit to the Metaheuristics International Conference (MIC 2015)
  - submission deadline: 2/6/15

Existing Work – Dr. Siegel's Group
- focuses on the utility of tasks
  - B. Khemka, R. Friese, L. D. Briceño, H. J. Siegel, A. A. Maciejewski, G. A. Koenig, C. Groer, G. Okonski, M. M. Hilton, R. Rambharos, and S. Poole, "Utility Functions and Resource Management in an Oversubscribed Heterogeneous Computing Environment," IEEE Transactions on Parallel and Distributed Systems, accepted 2014, to appear.
- another work that models stepped value functions
  - J.-K. Kim, S. Shivle, H. J. Siegel, A. A. Maciejewski, T. D. Braun, et al., "Dynamically Mapping Tasks with Priorities and Multiple Deadlines in a Heterogeneous Environment," Journal of Parallel and Distributed Computing, vol. 67, no. 2, Feb. 2007.

Existing Work
- other parallel task scheduling techniques
  - EASY Backfilling
    - D. A. Lifka, "The ANL/IBM SP Scheduling System," Proc. First Workshop on Job Scheduling Strategies for Parallel Processing, 1995.
  - G. Sabin, R. Kettimuthu, A. Rajan, and P. Sadayappan, "Scheduling of Parallel Jobs in a Heterogeneous Multi-Site Environment," Job Scheduling Strategies for Parallel Processing, 2003.

Design of Parallel Simulator for Experiments
- extends an existing serial simulator from Dr. Siegel's group
  - modified to handle the scheduling of parallel tasks
- created new modules
  - cluster class
    - has nodes within it
  - methods for obtaining parallel task information from the workload trace
  - created a sleep task object to model idle time within each machine
- developed an algorithm to locate slots for parallel tasks within the area occupied by sleep tasks
- developed a method that picks the nodes that create the best packing (i.e., create the least future restrictions)
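A sketch of the slot-location idea under stated assumptions: each node exposes the bounds of its current idle gap (the region covered by its sleep task), and the attribute names idle_from and idle_until are illustrative.

```python
def find_slot(nodes, now, num_nodes_needed, exec_time):
    """Return a set of nodes that can all start the task now, or None.

    A node qualifies when its current idle gap (sleep task) is long
    enough to hold the entire execution time starting immediately.
    """
    candidates = [n for n in nodes
                  if n.idle_from <= now and now + exec_time <= n.idle_until]
    if len(candidates) < num_nodes_needed:
        return None
    # prefer nodes whose idle gap ends soonest, so large gaps stay open
    # for future tasks (the "least future restrictions" idea)
    candidates.sort(key=lambda n: n.idle_until)
    return candidates[:num_nodes_needed]
```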

Workloads for Simulations
- will use a trace from Dr. Dror Feitelson's Parallel Workloads Archive to model the workload arrival
  - workload log from the Curie supercomputer in France (93,312 cores)
  - using the last 10 months of data
- may use Downey's model for execution time scaling
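The archive's logs, including the Curie log, are distributed in the Standard Workload Format (SWF). A minimal reader is sketched below; the field positions follow the SWF specification, while the file name in the usage comment is an assumption.

```python
def read_swf(path):
    """Read a trace in Standard Workload Format (SWF).

    Returns (submit_time, run_time, processors) tuples. Lines starting
    with ';' are SWF header comments. Field positions follow the SWF
    specification: submit time is field 2, run time field 4, and
    allocated processors field 5 (1-based).
    """
    jobs = []
    with open(path) as f:
        for line in f:
            if not line.strip() or line.startswith(";"):
                continue
            fields = line.split()
            submit = float(fields[1])
            run = float(fields[3])
            procs = int(fields[4])
            if run > 0 and procs > 0:      # skip cancelled or invalid jobs
                jobs.append((submit, run, procs))
    return jobs

# hypothetical usage: keep only roughly the last 10 months of the Curie log
# jobs = read_swf("CEA-Curie-2011-2.1-cln.swf")   # file name is illustrative
# cutoff = max(s for s, _, _ in jobs) - 10 * 30 * 24 * 3600
# recent = [j for j in jobs if j[0] >= cutoff]
```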

Future Work
- use the simulator to implement and compare the planned heuristics
- run a post-mortem analysis
  - use a genetic algorithm to find a loose upper-bound solution when the arrival time and characteristics of all tasks are known in advance
  - since scheduling is NP-hard, it is hard to quantify the performance of heuristics
  - this analysis will give us a better metric against which to compare our results
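One way such a genetic algorithm could be structured is sketched below; the chromosome encoding (a permutation giving the order in which a full-knowledge scheduler considers tasks), the operators, and all parameter values are assumptions for illustration, and fitness is assumed to schedule the tasks in the given order and return the total value earned.

```python
import random

def genetic_upper_bound(num_tasks, fitness, pop_size=50, generations=200,
                        mutation_rate=0.1, seed=0):
    """Rough GA skeleton for the post-mortem upper-bound analysis (sketch)."""
    rng = random.Random(seed)
    n = num_tasks
    pop = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        survivors = scored[:pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n)                  # one-point order crossover
            child = a[:cut] + [g for g in b if g not in a[:cut]]
            if rng.random() < mutation_rate:           # swap mutation
                i, j = rng.sample(range(n), 2)
                child[i], child[j] = child[j], child[i]
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)
```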

Thank You
Questions? Feedback?

Back-up Slides

Packing Nodes Efficiently
- whenever an assignment is to be made, all heuristics pick the nodes that create the least amount of restriction for future assignments
  - e.g., if task t8 needs 3 nodes, it will be assigned n1, n2, and n5
[Figure: timeline of nodes n1 through n5 around the current time, showing which nodes are free and where task t8 is placed.]
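One way to encode this rule, as a sketch: among the nodes that are idle now, prefer those whose idle window closes soonest, so the largest windows stay open for later tasks (idle_until is an assumed attribute). Under that assumption, the slide's example would return n1, n2, and n5 for task t8 if those are the idle nodes whose windows end earliest.

```python
def pick_nodes(idle_nodes, num_needed):
    """Choose the nodes with the shortest remaining idle window (best packing)."""
    ranked = sorted(idle_nodes, key=lambda n: n.idle_until)
    return ranked[:num_needed] if len(ranked) >= num_needed else None
```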

Heuristics – Overview
EASY Backfilling
- considers tasks in a first come first serve (FCFS) order
- makes only one reservation, for the first task that cannot fit on the idle machines
- backfills other tasks so that they do not delay the reservation
FCFS with Multiple Queues
- puts the tasks in three queues
- takes 1, 4, and 8 tasks from the large, medium, and small queues, respectively
- assigns tasks if possible, and otherwise makes the earliest reservation for them
- repeats until the queues are empty
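A minimal sketch of the EASY Backfilling loop described above; can_start_now, start, reserve_earliest, and would_delay are hypothetical helpers standing in for the simulator's real interface.

```python
def easy_backfill(queue, cluster, now):
    """Sketch of EASY Backfilling as described on the slide.

    `queue` is kept in FCFS order; only the first task that cannot fit
    receives a reservation, and later tasks are backfilled only if they
    would not delay that reservation.
    """
    reservation = None
    for task in list(queue):
        if cluster.can_start_now(task, now):
            if reservation is None or not cluster.would_delay(task, reservation, now):
                cluster.start(task, now)       # start immediately (or backfill)
                queue.remove(task)
        elif reservation is None:
            # only the first task that cannot fit gets a reservation
            reservation = cluster.reserve_earliest(task, now)
```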

Heuristics – Overview
Max-Max Value
- first phase: considering all tasks
  - determine the allocation choice that will earn it the highest value without delaying any place-holder task
  - if there are ties, pick the choice with the earlier completion time
- second phase: consider the tasks from the first phase
  - make an assignment or a place-holder for the choice that earns the highest value
  - this assignment should not start execution after the start of the earliest place-holder task
- repeat the two phases until no more tasks can be mapped
Max-Max Value-Per-Resource
- similar to Max-Max Value
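A sketch of the two-phase Max-Max Value structure; best_choice, earliest_placeholder_start, and assign_or_reserve are hypothetical helpers, and the exact tie-breaking and reservation rules are simplified.

```python
def max_max_value(mappable, cluster, now):
    """Two-phase Max-Max Value skeleton (sketch, not the project's code).

    `cluster.best_choice(task, now)` is assumed to return the allocation
    choice (nodes, start, value, ...) with the highest value that does not
    delay any place-holder task, breaking ties by earlier completion time.
    """
    remaining = list(mappable)
    while remaining:
        # phase 1: best allocation choice for each remaining task
        choices = [(t, cluster.best_choice(t, now)) for t in remaining]
        choices = [(t, c) for t, c in choices if c is not None]
        if not choices:
            break
        # phase 2: commit (or reserve) the choice that earns the highest value,
        # provided it does not start after the earliest place-holder task
        task, choice = max(choices, key=lambda tc: tc[1].value)
        if choice.start <= cluster.earliest_placeholder_start():
            cluster.assign_or_reserve(task, choice)
        remaining.remove(task)
    # Max-Max Value-Per-Resource would rank choices by value per resource
    # consumed instead of raw value (per the slide).
```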

Simulation Study
- models a real-world system environment
- experiments run on the ISTeC Cray HPC system
- uses real workload traces as inputs